Hey HN, I'm the author of the article. If there's anything confusing, I'd be happy to help :)
The GNMT architecture is likely to keep popping up in Google's papers due to the amount of engineering that's been dedicated to making it scale well. Beyond chewing through massive machine translation datasets, from single to multiple language pairs, it's also been trained over the entirety of Reddit for a conversational modeling system.
Just this week, a new paper was released that uses the GNMT architecture to teach a machine translation system to translate between language pairs it has never been trained on, by bridging through knowledge from other language pairs. It's all pretty fascinating stuff :)