I found this paper after reading through Jeff Dean's talk at NIPS. I recommend that, too, obviously!
You can generally find a paper for each section.
You are referring to this http://learningsys.org/nips17/assets/slides/dean-nips17.pdf, right?
It didn't make the cut just because it's a keynote and not a paper, but it's nevertheless a great engineering fit. Even more so considering that Jeff is meanwhile running the entire Brain team and keeps pouring out contributions on a regular basis: https://research.google.com/pubs/jeff.html
Ok, I can understand how a bloom filter can be replaced by a neural network predictive model. You could actually train it while stuff gets added. This would make adding somewhat more expensive, but ...
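To make the Bloom-filter replacement concrete, here is a minimal sketch of the idea (my own toy construction, not code from the paper): a model acts as a pre-filter, and a small backup structure holds exactly the keys the model fails to recognize, so membership queries never return a false negative. The `toy_model` and the use of a plain Python set as the backup are stand-ins; a real design would train a classifier on the key set and back it with an actual Bloom filter.

```python
def make_learned_filter(keys, model, threshold=0.5):
    # Keys the model would miss go into the backup structure, which
    # guarantees zero false negatives by construction.
    backup = {k for k in keys if model(k) < threshold}

    def contains(key):
        # Either the model says "probably present", or the backup catches it.
        return model(key) >= threshold or key in backup

    return contains

# Toy stand-in model (an assumption, not a trained network):
# scores even numbers high, odd numbers low.
toy_model = lambda k: 0.9 if k % 2 == 0 else 0.1

keys = [2, 4, 6, 7]                 # 7 is a key the toy model would miss
lookup = make_learned_filter(keys, toy_model)

assert all(lookup(k) for k in keys)  # no false negatives, by construction
print(lookup(8))                     # True: a model false positive
```

Retraining on insert would mean rebuilding both the model and the backup set, which is where the "adding gets more expensive" trade-off shows up.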
Ah, so it appears they're advocating using neural networks as index functions into sorted arrays (hash maps are simply sorted by hash instead of by something in the data).
So what they do is take a FIXED set of data that you want fast lookups into, already sorted, and train a model over it (one architecture is a 2-layer, 32-width network with ReLU activations, but they also train staged hierarchies of models). One key detail: since the min and max errors of the model determine how far the final search has to scan, you minimize max error rather than average error.
They have the following brilliant insight: an index over a database (which gives the position of the data given the search key) is a CDF (cumulative distribution function)! Of course it is!
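The CDF framing can be sketched in a few lines (my own minimal illustration, using a plain linear fit instead of a neural network): for a sorted array, position(key) ≈ CDF(key) × N, so fit a model from key to position, record its worst-case error over the data, and at lookup time only search the window [pred − max_err, pred + max_err].

```python
from bisect import bisect_left

def fit_linear(keys):
    # Least-squares fit of key -> position over the sorted array.
    n = len(keys)
    xs, ys = keys, list(range(n))
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    predict = lambda k: int(round(slope * k + intercept))
    # Worst-case error over the data bounds the search window.
    max_err = max(abs(predict(k) - i) for i, k in enumerate(keys))
    return predict, max_err

def lookup(keys, predict, max_err, key):
    pos = predict(key)
    lo = max(0, pos - max_err)
    hi = min(len(keys), pos + max_err + 1)
    # Binary search only within the error-bounded window.
    i = bisect_left(keys, key, lo, hi)
    return i if i < len(keys) and keys[i] == key else -1

keys = sorted([3, 8, 15, 16, 23, 42, 99, 108])
predict, max_err = fit_linear(keys)
assert lookup(keys, predict, max_err, 42) == keys.index(42)
assert lookup(keys, predict, max_err, 5) == -1
```

Because max_err is computed over every key in the array, every present key is guaranteed to fall inside its search window; that is exactly why the paper optimizes the worst case rather than the average.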
And of course, this is Google. Once you have an index trained (which is a linear operation), you can translate the neural network model directly into C++ and compile it into machine instructions that don't depend on anything like TensorFlow libraries. The resulting code can be pasted into anything you want. This may run fast, but it seems less than entirely practical... although I guess you could do the same in Java far more easily and just include that code.
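To show how little a "compiled-out" model actually needs, here's a dependency-free forward pass for a tiny 2-layer ReLU network, written as a sketch (the weights are made-up placeholders, not trained values; a real emitter would dump the trained weights into code like this, whether C++, Java, or anything else):

```python
# Placeholder weights (assumptions, not trained values).
W1 = [0.01, -0.02, 0.03, 0.05]   # input key -> 4 hidden units
b1 = [0.0, 0.1, -0.1, 0.2]
W2 = [1.0, 0.5, -0.5, 2.0]       # hidden units -> predicted position
b2 = 0.0

def predict_position(key):
    # Hidden layer: ReLU(key * w + b) per unit, then a dot product
    # for the scalar output. No framework needed: just arithmetic.
    hidden = [max(0.0, key * w + b) for w, b in zip(W1, b1)]
    return sum(h * w for h, w in zip(hidden, W2)) + b2

print(predict_position(0.0))  # 0.45 with these placeholder weights
```

The whole model is two small weight arrays and a loop, which is why pasting it into a codebase with no ML dependencies is plausible at all.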