The Challenges of Deploying High-Performance NLP

  • Sarcasm
  • Phrase ambiguity
  • Slang
  • Bias
  • Regional languages

The Transformer

  1. The encoder takes the input sequence and encodes it into an intermediate representation.
  2. The decoder then takes that intermediate representation and decodes it into the desired output sequence.
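The two-step flow above can be sketched with PyTorch's built-in `nn.Transformer` module, which bundles the encoder and decoder stacks. The dimensions, batch sizes, and random inputs here are illustrative assumptions, not values from the article:

```python
import torch
import torch.nn as nn

# Minimal sketch of the encoder-decoder flow; all sizes are hypothetical.
d_model = 32
model = nn.Transformer(d_model=d_model, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       dim_feedforward=64, batch_first=True)

src = torch.rand(2, 10, d_model)  # (batch, source length, embedding dim)
tgt = torch.rand(2, 7, d_model)   # (batch, target length, embedding dim)

# The encoder consumes src into an intermediate representation; the decoder
# attends over that representation to produce the output sequence.
out = model(src, tgt)             # shape: (batch, target length, d_model)
```

In a real model the inputs would be learned token embeddings plus positional encodings rather than random tensors, and the decoder would run autoregressively at inference time.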
The transformer-model architecture. Source: Vaswani et al., 2017
A notional example of attention weights from “it” to other words. The weights from “it” to “package” and “big” should be larger than the weight from “it” to “car”.
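Those attention weights come from the scaled dot-product attention formula, softmax(QKᵀ/√dₖ)V, from Vaswani et al., 2017. A minimal NumPy sketch, using random stand-in embeddings for the four tokens (real queries, keys, and values come from learned projections):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

# Hypothetical embeddings for the tokens "it", "package", "big", "car".
rng = np.random.default_rng(0)
d_k = 8
x = rng.normal(size=(4, d_k))

out, weights = scaled_dot_product_attention(x, x, x)
# weights[0] holds the attention from "it" to each token; each row sums to 1.
```

In a trained model, `weights[0]` would concentrate on “package” and “big” rather than “car”, as in the figure.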

Transformers and other NLP Models in Production

How can Wallaroo help?

Learn More



Wallaroo enables data scientists and ML engineers to deploy enterprise-grade AI into production more simply, more quickly, and with far greater efficiency.