WORKFLOW: UNITY TO TENSORFLOWJS


Reinforcement learning is exciting. It also is quite difficult. Knowing both of these things to be true, I wanted to find a way to use Unity’s ML-Agents reinforcement learning framework to train neural networks for use on the web with TensorFlow.js (TFJS).

Why do this? Specifically, why use Unity ML Agents rather than training the models in TFJS directly? After all, TFJS currently has at least two separate examples of reinforcement learning, each capable of training in the browser with TFJS directly (or, somewhat more practically, training with TensorFlow.js's Node.js backend). While there is something very exciting about the idea of training and using a reinforcement learning agent all in the browser, I had a lot of difficulty training a functional model using either of these examples. And my understanding of the intricacies of the RL algorithms in use (and how to optimize their respective hyper-parameters, etc.) is limited, so debugging these examples involved a deep dive into some pretty scary-looking code.

So I turned to Unity’s ML Agents framework, which uses the Unity 3D development environment as host to various agent-based RL models. What is most exciting for me about this framework is that:

  • it appears to have a lot of support and an exciting (open source) community developing around it,
  • not having to build your own environment makes getting up and running with agent training pretty quick,
  • it features at least two algorithms for RL training (Proximal Policy Optimization and Soft Actor-Critic), many working examples, and a backend which allows for more advanced training techniques (e.g. curriculum learning).

Because it uses TensorFlow to train models, I thought it might be possible to export a model for TFJS pretty quickly. This was not the case, but hopefully my stumbling through it will help you. What follows is how I managed to convert a neural network model trained using reinforcement learning in Unity into a TensorFlow.js model for use in the browser.

Note that the information below may well be out of date by the time you are reading it. The Unity ML Agents repo seems to be changing pretty quickly, as does the TensorFlow.js repo. Think of this as a lesson in stubbornness rather than a strict guide. If you want to skip my process and just read how to do this yourself, skip to "Attempt #2."

A Long and Rambling Process

What is a “.nn” file?


When I downloaded the latest release of the ML Agents package, I noticed that all of the example models were stored as mysterious ".nn" files. It turns out that the Unity ML Agents framework now exports trained models as ".nn" files for use with Unity's new Barracuda inference engine. All of the pre-trained example models were stored in this ".nn" format (which cannot be converted to TensorFlow.js format models). At this point, I thought this project might have reached a dead end :(, but I opened an issue on the Unity ML Agents GitHub repo anyway, asking them to implement a Barracuda-to-TensorFlow converter.

After another few hours reading through the Unity ML Agents codebase and messing about with the example scenes, however, I realized that Unity still uses TensorFlow for training. Once training is complete, it converts and exports the TensorFlow model to Barracuda ".nn" format using a conversion script. Crucially, this export step also produces a TensorFlow Frozen Graph (".pb"), a checkpoint, and various other files. These are gitignored in Unity's ML Agents repo, so in order to access them, it is necessary to train a model from scratch. After training, these files will be listed in your training output folder along with the ".nn" file:
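(For reference, training is kicked off with the mlagents-learn command. A rough sketch of what that looks like is below; the config path and run ID are placeholders, and the exact flags and output file names vary between ML-Agents releases.)

$ mlagents-learn config/trainer_config.yaml --run-id=MyFirstRun --train

# after training, the run's output folder should contain something like:
#   YourBehaviorName.nn    (the Barracuda model for Unity)
#   frozen_graph_def.pb    (the TensorFlow Frozen Graph)
#   checkpoint files and various other TensorFlow outputs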

Yay! We have TensorFlow files.

Attempt #1: Convert Frozen Model to TFJS Model

My next attempt to get a working TFJS model from Unity was to directly convert the '.pb' Frozen Graph format to TensorFlow.js format using the TensorFlow.js Converter. Unfortunately, the converter utility has deprecated the 'Frozen Model' format to focus support on the harder, better, faster, stronger SavedModel format. What is the difference between a FrozenModel and a SavedModel? I don't really know. So after another few hours of fooling about with Python virtual environments, I managed to install an earlier release of the TensorFlow.js converter (0.8.6) which still supported converting a frozen TensorFlow model to a TensorFlow.js model. Hurray!
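For what it's worth, the downgrade looked roughly like this (assuming the converter is installed via the tensorflowjs pip package; the virtual environment name here is just a placeholder):

$ python3 -m venv tfjs-converter-env
$ source tfjs-converter-env/bin/activate
$ pip install tensorflowjs==0.8.6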

Now I had to figure out what to pass as the output_node_names parameter for the tensorflowjs_converter shell command. What is in an (output node) name? At this point, I had no idea, but I had found the file in the Unity ML Agents code which contains the export_model function responsible for exporting the TensorFlow Frozen Model and the Unity Barracuda model. This function contained a line which defined target_nodes. That sounds pretty close to output_nodes, doesn't it? Printing out these target nodes seemed like a fruitful next step.


a brief aside:

I have to pause here to mention that in order to alter the ML-Agents Python code (the part of the Unity ML-Agents platform which runs training with TensorFlow), it was necessary to install the Python packages from the repo itself, rather than from PyPI (i.e. via 'pip'), which involved a whole bunch more struggling with Python virtual environments and following this guide here.
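Roughly, that development install looks something like the following (the package layout may differ between ML-Agents releases, so defer to the repo's own installation docs):

$ git clone https://github.com/Unity-Technologies/ml-agents.git
$ cd ml-agents
$ pip install -e ./ml-agents-envs
$ pip install -e ./ml-agents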


I added a print(target_nodes) line to the export_model function, ran the mlagents-learn script again and found that the target_nodes were is_continuous_control,version_number,memory_size,action_output_shape,action. Next I plugged these into the TFJS converter script with the following command:

$ tensorflowjs_converter   \
  --input_format=tf_frozen_model \
  --output_node_names=is_continuous_control,version_number,memory_size,action_output_shape,action \
  ./frozen_graph_def.pb   \
  ./web-model

At this point, I was greeted with a new and exciting error! This felt like success:

ValueError: Unsupported Ops in the model before optimization
AddV2

Some searching turned up a quick-and-dirty solution: add a --skip_op_check=SKIP_OP_CHECK flag to the converter command's parameters.
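For clarity, here is the same converter command from above with that flag added:

$ tensorflowjs_converter   \
  --input_format=tf_frozen_model \
  --output_node_names=is_continuous_control,version_number,memory_size,action_output_shape,action \
  --skip_op_check=SKIP_OP_CHECK \
  ./frozen_graph_def.pb   \
  ./web-model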

It worked! There was a new web-model folder with three files:

  • weights_manifest.json, a human-readable JSON file,
  • tensorflowjs_model.pb, a non-human-readable (binary data) file, and
  • group1-shard1of1, a non-human-readable (binary data) file.

Hurray!

But wait… How do I load this into TFJS?

Loading Frozen TFJS Model in TFJS 0.15

Because I converted a Frozen Model into a TFJS model, loading it meant using the (deprecated) tf.loadFrozenModel function, which in turn required downgrading to TFJS v < 1.0.0. This done, the following code loaded the model:

// NOTE: this function is only available in TensorFlow.js versions < 1.0.0
// https://github.com/tensorflow/tfjs/issues/1079
// https://stackoverflow.com/questions/54143715/why-cant-i-load-tensorflow-model-using-tf-loadfrozenmodel-in-tensorflowjs
let MODEL_URL = './path/to/model.pb';
let WEIGHTS_URL = './path/to/weights_manifest.json';

(async () => {
    // load the frozen model, then log it once loading completes
    const model = await tf.loadFrozenModel(MODEL_URL, WEIGHTS_URL);
    console.log(model);
})();
    

At this point, I saw the JavaScript version of an error from before: Uncaught (in promise) Error: Tensorflow Op is not supported: AddV2. Rather than continue down the rabbit hole of working with older versions of TFJS, I took another tack: exporting a TensorFlow SavedModel from Unity, which could then be converted to a TFJS model using the latest version of the TFJS converter.


Attempt #2: Exporting a SavedModel from Unity

At this point, I was trying to export a SavedModel from Unity for use with the tensorflowjs_converter. This ended up being the approach which worked. Roughly, these were the required steps to export a SavedModel from Unity:

  1. Install Unity ML Agents according to the development installation instructions (to allow changes to the Python code)
  2. In the ml-agents/mlagents/trainers/tf_policy.py script, add the following lines to the export_model function to export a TensorFlow SavedModel. Note that the required graph nodes are different for a continuous action space (i.e. the agent acts on float values) and a discrete action space (i.e. the agent acts on integer values), so you will need to uncomment the respective lines accordingly:
def export_model(self):
        """
        Exports latest saved model to .nn format for Unity embedding.
        """

        with self.graph.as_default():
            graph_def = self.graph.as_graph_def()

            # BEGINNING OF ADDED CODE:
            # To learn more about nodes of the graph, uncomment the following lines:
            # for node in graph_def.node:
            #     print("-------")
            #     print(node.name)
            #     print(node.input)
            #     print(node.attr)
 
            # Uncomment for discrete vector action space:
            # vectorInputNode = self.graph.get_tensor_by_name("vector_observation:0")
            # actionMaskInput = self.graph.get_tensor_by_name("action_masks:0")
            # actionOutputNode = self.graph.get_tensor_by_name("action:0")
            # sigs = {}
            # sigs[tf.compat.v1.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = \
            #     tf.saved_model.signature_def_utils.predict_signature_def( \
            #         {"in": vectorInputNode, "actionMask": actionMaskInput}, {"out": actionOutputNode})
            
            # Uncomment for continuous vector action space:
            # vectorInputNode = self.graph.get_tensor_by_name("vector_observation:0")
            # epsilonInputNode = self.graph.get_tensor_by_name("epsilon:0")
            # actionOutputNode = self.graph.get_tensor_by_name("action:0")
            # sigs = {}
            # sigs[tf.compat.v1.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = \
            #     tf.saved_model.signature_def_utils.predict_signature_def( \
            #         {"in": vectorInputNode,"epsilon": epsilonInputNode}, {"out": actionOutputNode})

            
            builder = tf.compat.v1.saved_model.Builder(self.model_path + "/SavedModel/")
            builder.add_meta_graph_and_variables( \
                self.sess, \
                [tf.saved_model.tag_constants.SERVING], \
                signature_def_map=sigs, \
                strip_default_attrs=True)
            builder.save()                                
            # END OF ADDED CODE
           
            target_nodes = ",".join(self._process_graph())
            
            output_graph_def = graph_util.convert_variables_to_constants(
                self.sess, graph_def, target_nodes.replace(" ", "").split(",")
            )
            frozen_graph_def_path = self.model_path + "/frozen_graph_def.pb"
            with gfile.GFile(frozen_graph_def_path, "wb") as f:
                f.write(output_graph_def.SerializeToString())
            tf2bc.convert(frozen_graph_def_path, self.model_path + ".nn")
            logger.info("Exported " + self.model_path + ".nn file")
  3. The exported SavedModel should be convertible to a web model by the latest tensorflowjs_converter (tensorflowjs version 1.4.0, currently) using the following script (run from within the SavedModel folder):
tensorflowjs_converter \
        --input_format=tf_saved_model \
        --output_format=tfjs_graph_model \
        --signature_name=serving_default \
        --saved_model_tags=serve \
        ./ \
        ./web_model
  4. You should now have a model.json file and one or more binary weight files (e.g. group1-shard1of1.bin)

another brief aside:

Unity ML Agents uses TensorBoard to monitor training progress. TensorBoard has a 'Graphs' tab which shows a visual representation of all of the inputs and outputs to each node of a graph model. This was somewhat helpful for starting to understand which nodes were necessary to attach as inputs and outputs for a SavedModel. It looks like this:

TensorBoard graph diagram.

This graph is not exported by default, but this functionality can be added to the Unity codebase. An object called tf.summary.FileWriter is responsible for outputting the summaries. This object lives inside ml-agents/mlagents/trainers/trainer.py. Inside this file, a function called write_tensorboard_text calls this object and writes the training summaries to file, to be read by TensorBoard. To include the graph in this training summary, alter the write_tensorboard_text function to include the following:

self.summary_writer.add_summary(s, self.get_step)
# ADD THE FOLLOWING LINE:
self.summary_writer.add_graph(self.policy.graph, global_step=None, graph_def=None)


The graph diagram should now be visible in TensorBoard.
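If you haven't launched TensorBoard before, it looks roughly like this (run from the directory where you started training; the summaries folder name is an assumption and may differ between ML-Agents versions):

$ tensorboard --logdir=summaries
# then open http://localhost:6006 in your browser and select the "Graphs" tab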

---


## Hosting / Running Inference on a Unity Model in TFJS:
There are many TFJS resources available online.  I found it particularly helpful to look at [ml5.js'](https://ml5js.org/) [source code](https://github.com/ml5js/ml5-library) to understand how to load and run inference on a model in TFJS.  That said, I learned a few things along the way about running inference on a Unity model specifically which are included in the code below:

```javascript
let model;
async function loadAndRunModel() {
    // Load the model generated by the tensorflowjs_converter from the SavedModel:
    let MODEL_URL = './path/to/model.json';
    model = await tf.loadGraphModel(MODEL_URL);


    // FOR CONTINUOUS ACTION SPACE MODEL:
    // Inputs for continuous action space model:
    // Length of vectorObservations should match the number of vector observations
    // in the Unity Agent's Behavior Parameters script:
    let vectorObservations = [0, 0, 0, 0]; // for an agent with four vector observations
    const vectorObservationsTensor = tf.tensor([vectorObservations]);
    // I have almost no idea what epsilon does...
    let epsilon = [0.1];
    const epsilonTensor = tf.tensor([epsilon]);
    // Run inference (note that the model expects tensors, not plain arrays):
    let continuousActions = model.predict([epsilonTensor, vectorObservationsTensor]);
    continuousActions.print();


    // FOR DISCRETE ACTION SPACE MODEL:
    // Inputs for discrete action space model (reusing vectorObservationsTensor from above):
    // ActionMask is used to disallow certain actions during certain environment states.
    // For instance, the GridWorld Agent in the Unity ML Agents examples is not allowed to move into a wall.
    // 1 for action allowed, 0 for action not allowed
    const actionMask = tf.tensor([[1, 1]]); // both actions allowed

    // To run the discrete action space model, first perform a forward pass:
    let actionProbabilities = model.predict([actionMask, vectorObservationsTensor]);

    // Then pass the 'logits' (values returned from the forward pass) through
    // tf.multinomial(logits, numSamples, seed) to sample actions for your agent.
    // The seed value is supposedly optional, but I got an especially scary error when I didn't include it:
    let numSamples = 1;
    let seedValue = 3;
    let actions = tf.multinomial(actionProbabilities, numSamples, seedValue);
    actions.print();
}

loadAndRunModel();

```

At this point, you should be able to generate inputs and get outputs from your Unity RL model in TFJS. :)! Here is an example of the 3D Balance Ball example scene in TFJS.
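One note on the "hosting" part: tf.loadGraphModel fetches model.json and its weight shards over HTTP, so the web_model folder needs to be served by a static file server alongside your page. For local testing, something like the following works (http-server is just one option, not something this workflow requires):

$ npx http-server . -p 8080
# then visit http://localhost:8080 and your page can fetch ./web_model/model.json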