Aug 22, 2016

Yes it is, and in my opinion it's very important. The TensorFlow team is actively working on a JIT (https://github.com/tensorflow/tensorflow/issues/164). I'll paste here the relevant part of the TensorFlow paper's Future Work section:

"We also have a number of concrete directions to improve the performance of TensorFlow. One such direction is our initial work on a just-in-time compiler that can take a subgraph of a TensorFlow execution, perhaps with some runtime profiling information about the typical sizes and shapes of tensors, and can generate an optimized routine for this subgraph. This compiler will understand the semantics [of the subgraph and] perform a number of optimizations such as loop fusion, blocking and tiling for locality, specialization for particular shapes and sizes, etc."

http://download.tensorflow.org/paper/whitepaper2015.pdf
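To make the quoted passage concrete, here's a minimal pure-Python sketch (not TensorFlow's actual JIT, whose details weren't public at the time) of what "loop fusion" and "specialization for particular shapes" mean for a small elementwise subgraph like `y = relu(a * x + b)`; all function names here are hypothetical:

```python
def unfused(a, x, b):
    # Three separate passes over the data, the way a naive graph
    # executor might run the ops Mul -> Add -> Relu one at a time,
    # materializing a temporary buffer after each op.
    t1 = [a * xi for xi in x]          # Mul: first pass + temporary
    t2 = [t + b for t in t1]           # Add: second pass + temporary
    return [max(0.0, t) for t in t2]   # Relu: third pass

def fused(a, x, b):
    # A JIT that sees the whole subgraph can emit a single loop,
    # eliminating the intermediate buffers and extra memory traffic.
    return [max(0.0, a * xi + b) for xi in x]

def specialize(a, b, n):
    # With runtime profiling info ("x is almost always length n"),
    # the compiler could further specialize the fused routine for
    # that shape, e.g. unrolling or preallocating for a fixed n.
    def kernel(x):
        assert len(x) == n, "shape mismatch: fall back to generic routine"
        return [max(0.0, a * xi + b) for xi in x]
    return kernel
```

Both paths compute the same values; the fused/specialized versions just do it in one pass, which is where the speedup comes from on real hardware.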

Feb 16, 2016

Model serving in production is a persistent pain point for many ML backends, and is usually done quite poorly, so this is great to see.

I'm expecting large leaps and bounds for TensorFlow itself. This improvement to surrounding infrastructure is a nice surprise, just as TensorBoard was one of the nicest "value-adds" the original library shipped with[4].

Google has ensured many high-quality people have been active as evangelists[3], helping build a strong community and answer base. While there are still gaps between what the whitepaper[1] promises and what has made it into the open-source release[2], it's coming along steadily.

My largest interests continue to be single-machine performance (a profiler for performance analysis + speedier RNN implementations) and multi-device / distributed execution. Single-machine performance got a huge bump from v0.5 to v0.6 for CNNs, eliminating one of the pain points there, so they're on their way.

I'd have expected this to lead to an integration with Google Compute Engine (TensorFlow training / prediction as a service) were it not for the conspicuous lack of GPU instances on GCE. While GPUs are usually essential for training (and could theoretically be abstracted away behind a magical GCE TF layer), there are still many situations in which you'd want access to the GPU itself, particularly as performance can be unpredictable even across similar hardware and model architectures.

[1]: http://download.tensorflow.org/paper/whitepaper2015.pdf

[2]: Extricating TensorFlow from "Google internal" must be a real challenge, given that TF's distributed training interacts with various internal infrastructure tools that lack open-source equivalents.

[3]: Shout out to @mrry who seems to have his fingers permanently poised above the keyboard - http://stackoverflow.com/users/3574081/mrry?tab=answers&sort...

[4]: I've been working on a dynamic memory network (http://arxiv.org/abs/1506.07285) implementation recently, and it's just lovely to get a near-perfect visualization of the model architecture by default - http://imgur.com/a/PbIMI