In my previous post on Google’s TensorFlow, I mentioned the idea of using the library for Genetic Programming applications. In my free time, I tried to map out how such an application would run. The focus wasn’t on building the smartest way to do GP, but on exploring whether it was practically possible. One idea that stood out was to have a Graph for each candidate solution, populated with elements crossed over from the parent solutions’ Graphs. This would require a good API for copying computational elements from one Graph to another in TensorFlow. I dug around to see if such functionality was available, but couldn’t find any (at least from a thorough look through the GitHub repo).

I wrote some rudimentary code to accomplish this, and here’s a basic outline of **how it works**:

Consider the example given on TensorFlow’s Get Started page.

    import tensorflow as tf
    import numpy as np

    # Make 100 phony data points in NumPy.
    x_data = np.float32(np.random.rand(2, 100)) # Random input
    y_data = np.dot([0.100, 0.200], x_data) + 0.300

    # Construct a linear model.
    b = tf.Variable(tf.zeros([1]))
    W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0))
    y = tf.matmul(W, x_data) + b

    # Minimize the squared errors.
    loss = tf.reduce_mean(tf.square(y - y_data))
    optimizer = tf.train.GradientDescentOptimizer(0.5)
    train = optimizer.minimize(loss)

    # For initializing the variables.
    init = tf.initialize_all_variables()

    # Launch the graph
    sess = tf.Session()
    sess.run(init)

    # Fit the plane.
    for step in xrange(0, 201):
        sess.run(train)
        if step % 20 == 0:
            print step, sess.run(W), sess.run(b)

    # Learns best fit is W: [[0.100 0.200]], b: [0.300]

Let’s say you want to copy the main training element `train` to another graph, defined here:

    to_graph = tf.Graph()

1) You first decide on a namespace inside which all the copied elements will exist in `to_graph`. This is not strictly required if the Graph you are copying to is empty. But it’s important to remember that element names matter a lot in TensorFlow’s workings, so to avoid naming conflicts it’s better to define such a namespace. An element that would otherwise be called “*C*” in `to_graph` is instead called “*N/C*”, where “*N*” is the namespace string.

Let’s assume the namespace for the copied elements in our example is “CopiedOps”.

    namespace = "CopiedOps"
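The naming rule can be sketched in plain Python, without TensorFlow. The helper names `copied_name` and `copied_variable_name` are hypothetical and exist only for this illustration. The second one mirrors how the code later in this post strips the `:0`-style output suffix from a variable's name before adding the prefix.

```python
def copied_name(original_name, namespace=""):
    """Return the name a copied element would get under `namespace`."""
    if namespace == "":
        # No namespace: the copy keeps the original name.
        return original_name
    return namespace + "/" + original_name


def copied_variable_name(variable_name, namespace=""):
    """Drop the ':<index>' suffix of a variable name, then add the prefix."""
    base_name = variable_name[:variable_name.index(":")]
    return copied_name(base_name, namespace)


print(copied_name("C", "N"))
print(copied_variable_name("Variable:0", "CopiedOps"))
```

With an empty namespace both helpers return the (suffix-stripped) original name, which is why an empty target graph does not strictly need one.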

2) You then copy all the variables first, one by one, using a dedicated function `copy_variable_to_graph`. Each time, you supply the original instance, the target graph (`to_graph` in this case), the namespace, and an extra dictionary called `copied_variables`. This dictionary is useful while copying the computational nodes (`Operation` instances; we will call them ops) in the next step. Since `Variable` instances act as inputs for ops, we need a way to keep track of them for later.

I initially wanted to combine this initialization of variables with the function that copies the computational elements, but I found it really tricky to capture the appropriate `Variable` instances based on their connections to ops. Anyway, since variables are more like parameter-storing units whose values are needed frequently, it’s better to initialize them separately.

The `Variable` instances in the above example are `b` and `W`. Here’s how you would do it with my code:

    copied_variables = {}
    b1 = copy_variable_to_graph(b, to_graph, namespace, copied_variables)
    W1 = copy_variable_to_graph(W, to_graph, namespace, copied_variables)

Of course, if your code has a lot of variables, you could just store them all in a list and run the above function over each of them with a common dictionary for copied variables.

3) You then recursively copy all the computational nodes (ops, `Placeholder`s) to the other graph. Now here’s the nice thing about the method: you only need to do it for the topmost computational node. All connected inputs and Tensors are automatically taken care of!

For the above example, the `train` object constructed on line 16 is the ‘topmost’ node. So copying the whole learner is as simple as:

    train_copy = copy_to_graph(train, to_graph, copied_variables, namespace)

That’s it! The other elements, like `y`, `loss` and the ops created by `optimizer`, are automatically replicated in `to_graph`.
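The recursion behind step 3 can be illustrated on a toy graph in plain Python. This is only a sketch of the idea, not the TensorFlow code: the `Node` class and `copy_node` function are hypothetical stand-ins, and the real function further down also has to deal with Tensors, control inputs and collections.

```python
class Node(object):
    """A toy stand-in for a graph node: a name plus its input nodes."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)


def copy_node(node, copied, namespace=""):
    """Recursively copy `node` and everything it depends on.

    `copied` maps new names to already-copied nodes, so shared inputs
    are created once and reused instead of being duplicated.
    """
    new_name = namespace + "/" + node.name if namespace else node.name
    if new_name in copied:
        return copied[new_name]
    # Copy the inputs first, then build the new node on top of them.
    new_inputs = [copy_node(inp, copied, namespace) for inp in node.inputs]
    new_node = Node(new_name, new_inputs)
    copied[new_name] = new_node
    return new_node


# Toy version of the example: W and b feed y, y feeds loss, and train
# depends on loss as well as on the variables it updates.
W = Node("W")
b = Node("b")
y = Node("y", [W, b])
loss = Node("loss", [y])
train = Node("train", [loss, W, b])

copied = {}
train_copy = copy_node(train, copied, "CopiedOps")
print(sorted(copied))
```

Copying only `train` pulls in all five nodes, and the dictionary guarantees that the `W` reached through `loss` is the same object as the `W` that `train` uses directly. That is the role `copied_variables` plays in the real code.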

There’s also a helper function in case you want to find the equivalent of an element from the original graph in `to_graph`:

    loss_copy = get_copied(loss, to_graph, copied_variables, namespace)

4) You can now run the new node in `to_graph`. Remember to create a new `Session` instance linked to the graph, and to initialize all Variables. So here’s how you would go about it:

    with to_graph.as_default():
        init1 = tf.initialize_all_variables()
        sess1 = tf.Session()
        sess1.run(init1)

        for step in xrange(0, 201):
            sess1.run(train_copy)
            if step % 20 == 0:
                print step, sess1.run(W1), sess1.run(b1)

This provides an output similar to what you would get from the Get Started original example.

**The Code**

    import tensorflow as tf
    from tensorflow.python.framework import ops
    from copy import deepcopy


    def copy_variable_to_graph(org_instance, to_graph, namespace,
                               copied_variables={}):
        """
        Copies the Variable instance 'org_instance' into the graph
        'to_graph', under the given namespace.
        The dict 'copied_variables', if provided, will be updated with
        a mapping from the new variable's name to the instance.
        """

        if not isinstance(org_instance, tf.Variable):
            raise TypeError(str(org_instance) + " is not a Variable")

        # The name of the new variable
        if namespace != '':
            new_name = (namespace + '/' +
                        org_instance.name[:org_instance.name.index(':')])
        else:
            new_name = org_instance.name[:org_instance.name.index(':')]

        # Get the collections that the new instance needs to be added to.
        # The new collections will also be a part of the given namespace,
        # except the special ones required for variable initialization and
        # training.
        collections = []
        for name, collection in org_instance.graph._collections.items():
            if org_instance in collection:
                if (name == ops.GraphKeys.VARIABLES or
                    name == ops.GraphKeys.TRAINABLE_VARIABLES or
                    namespace == ''):
                    collections.append(name)
                else:
                    collections.append(namespace + '/' + name)

        # See if it's trainable.
        trainable = (org_instance in org_instance.graph.get_collection(
            ops.GraphKeys.TRAINABLE_VARIABLES))
        # Get the initial value
        with org_instance.graph.as_default():
            temp_session = tf.Session()
            init_value = temp_session.run(org_instance.initialized_value())

        # Initialize the new variable
        with to_graph.as_default():
            new_var = tf.Variable(init_value,
                                  trainable,
                                  name=new_name,
                                  collections=collections,
                                  validate_shape=False)

        # Add to the copied_variables dict
        copied_variables[new_var.name] = new_var

        return new_var


    def copy_to_graph(org_instance, to_graph, copied_variables={}, namespace=""):
        """
        Makes a copy of the Operation/Tensor instance 'org_instance'
        for the graph 'to_graph', recursively. Therefore, all required
        structures linked to org_instance will be automatically copied.
        'copied_variables' should be a dict mapping pertinent copied variable
        names to the copied instances.

        The new instances are automatically inserted into the given 'namespace'.
        If namespace='', it is inserted into the graph's global namespace.
        However, to avoid naming conflicts, it's better to provide a namespace.
        If the instance(s) happens to be a part of collection(s), they are
        added to the appropriate collections in to_graph as well.
        For example, for collection 'C' which the instance happens to be a
        part of, given a namespace 'N', the new instance will be a part of
        'N/C' in to_graph.

        Returns the corresponding instance with respect to to_graph.

        TODO: Order of insertion into collections is not preserved
        """

        # The name of the new instance
        if namespace != '':
            new_name = namespace + '/' + org_instance.name
        else:
            new_name = org_instance.name

        # If a variable by the new name already exists, return the
        # corresponding tensor that will act as an input
        if new_name in copied_variables:
            return to_graph.get_tensor_by_name(
                copied_variables[new_name].name)

        # If an instance of the same name exists, return appropriately
        try:
            already_present = to_graph.as_graph_element(new_name,
                                                        allow_tensor=True,
                                                        allow_operation=True)
            return already_present
        except Exception:
            pass

        # Get the collections that the new instance needs to be added to.
        # The new collections will also be a part of the given namespace.
        collections = []
        for name, collection in org_instance.graph._collections.items():
            if org_instance in collection:
                if namespace == '':
                    collections.append(name)
                else:
                    collections.append(namespace + '/' + name)

        # Take action based on the class of the instance

        if isinstance(org_instance, ops.Tensor):

            # If it's a Tensor, it is one of the outputs of the underlying
            # op. Therefore, copy the op itself and return the appropriate
            # output.
            op = org_instance.op
            new_op = copy_to_graph(op, to_graph, copied_variables, namespace)
            output_index = op.outputs.index(org_instance)
            new_tensor = new_op.outputs[output_index]
            # Add to collections if any
            for collection in collections:
                to_graph.add_to_collection(collection, new_tensor)

            return new_tensor

        elif isinstance(org_instance, ops.Operation):

            op = org_instance

            # If it has an original_op parameter, copy it
            if op._original_op is not None:
                new_original_op = copy_to_graph(op._original_op, to_graph,
                                                copied_variables, namespace)
            else:
                new_original_op = None

            # If it has control inputs, call this function recursively on each.
            new_control_inputs = [copy_to_graph(x, to_graph, copied_variables,
                                                namespace)
                                  for x in op.control_inputs]

            # If it has inputs, call this function recursively on each.
            new_inputs = [copy_to_graph(x, to_graph, copied_variables,
                                        namespace)
                          for x in op.inputs]

            # Make a new node_def based on that of the original.
            # An instance of tensorflow.core.framework.graph_pb2.NodeDef, it
            # stores String-based info such as name, device and type of the op.
            # Unique to every Operation instance.
            new_node_def = deepcopy(op._node_def)
            # Change the name
            new_node_def.name = new_name

            # Copy the other inputs needed for initialization
            output_types = op._output_types[:]
            input_types = op._input_types[:]

            # Make a copy of the op_def too.
            # It's unique to every _type_ of Operation.
            op_def = deepcopy(op._op_def)

            # Initialize a new Operation instance
            new_op = ops.Operation(new_node_def,
                                   to_graph,
                                   new_inputs,
                                   output_types,
                                   new_control_inputs,
                                   input_types,
                                   new_original_op,
                                   op_def)
            # Use Graph's hidden methods to add the op
            to_graph._add_op(new_op)
            to_graph._record_op_seen_by_control_dependencies(new_op)
            for device_function in reversed(to_graph._device_function_stack):
                new_op._set_device(device_function(new_op))

            return new_op

        else:
            raise TypeError("Could not copy instance: " + str(org_instance))


    def get_copied(original, graph, copied_variables={}, namespace=""):
        """
        Get a copy of the instance 'original', present in 'graph', under
        the given 'namespace'.
        'copied_variables' is a dict mapping pertinent variable names to the
        copy instances.
        """

        # The name of the copied instance
        if namespace != '':
            new_name = namespace + '/' + original.name
        else:
            new_name = original.name

        # If a variable by the name already exists, return it
        if new_name in copied_variables:
            return copied_variables[new_name]

        return graph.as_graph_element(new_name, allow_tensor=True,
                                      allow_operation=True)
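The collection-renaming rule inside `copy_variable_to_graph` is easy to miss in the code above, so here it is isolated as a small plain-Python sketch. The function name `copied_collection_names` is hypothetical, and the two strings in `SPECIAL_COLLECTIONS` are my assumption of the values behind `ops.GraphKeys.VARIABLES` and `ops.GraphKeys.TRAINABLE_VARIABLES` in the TensorFlow version used here.

```python
# Assumed string values of ops.GraphKeys.VARIABLES and
# ops.GraphKeys.TRAINABLE_VARIABLES (not read from TensorFlow itself).
SPECIAL_COLLECTIONS = ("variables", "trainable_variables")


def copied_collection_names(collection_names, namespace=""):
    """Map a variable's collection names to their names in the target graph."""
    new_names = []
    for name in collection_names:
        if namespace == "" or name in SPECIAL_COLLECTIONS:
            # Special collections keep their name so that variable
            # initialization and training can still find the variable.
            new_names.append(name)
        else:
            new_names.append(namespace + "/" + name)
    return new_names


print(copied_collection_names(["variables", "my_collection"], "CopiedOps"))
```

Note that `copy_to_graph` does not make this exception for ops and Tensors: there, every collection name gets the namespace prefix.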

Working with feeding is pretty simple too:

    >>> x = tf.placeholder("float")
    >>> a = tf.constant(3, "float")
    >>> y = tf.add(x, a)
    >>> namespace = "CopiedOps"
    >>> to_graph = tf.Graph()
    >>> copied_variables = {}
    >>> y1 = copy_to_graph(y, to_graph, copied_variables, namespace)
    >>> x1 = get_copied(x, to_graph, copied_variables, namespace)
    >>> with to_graph.as_default():
    ...     sess = tf.Session()
    ...     print sess.run(y1, feed_dict={x1: 5})
    ...
    8.0

I guess that’s all for my hacking around with TensorFlow for the week. If you intend to use this code, please note that it may not be perfect at doing what it says. I haven’t yet tried it out with all sorts of TensorFlow data structures, so be open to getting an Exception or two that you may have to fix. In fact, do drop me a comment or mail so I can make this code as foolproof as I can. Cheers!

Hey, will this code copy a trained graph, or would you need to retrain it once copied over?

Interesting! ‘Training’ a graph in TensorFlow essentially means putting the appropriate value(s) into the required Variable instances. So this code will sadly not copy the information gathered during training. However, you can hack it easily to implement this functionality. Look at line 45 in the big chunk of code (the `init_value` assignment in `copy_variable_to_graph`). In place of what’s written there (which copies the initial value of the Variable), write “init_value = temp_session.run(org_instance)”, with `temp_session` replaced by the session you trained in, since variable values live in a Session and a fresh one will not have them. This will copy the *current value* into the new Variable. With this, you can use the code to copy over trained graphs 🙂

For a certain use case, I am building more than one DNN model based on TensorFlow’s skflow library. I partition my data into minibatches and use partial_fit for fitting. After every cycle of partial_fit, I would like to copy the weights of the first n hidden layers of one TensorFlowDNNClassifier model to another TensorFlowDNNClassifier model, then continue learning/copying using partial_fit. (The topology of the first n hidden layers is identical for both models.)

I know how to retrieve weights from classifier1:

    classifier1.get_tensor_value('dnn/layer0/Linear/Matrix:0')

But I don’t know how to copy their values to classifier2!

I don’t know how to add ops into the graph to update the weights. Could you please help me do this?

Is it possible to use this code to add a new output to a pre-trained model?

Dear Joscha,

I found that your code suits another problem, where I had to deep-copy a tensor within the same graph so as to force the tensor to be recalculated after variables changed. I am about to use a modified version of your code in my thesis and would also like to publish the result under the GPL. Would you be fine with that? I would certainly credit the source the function was derived from.

Thank you!

* Sachin, not Joscha. I somehow mixed the “J” of your second name with Sachin (-;