TensorFlow vs. PyTorch vs. Keras for NLP

Before starting a feature comparison between TensorFlow, PyTorch, and Keras, let's cover some soft, non-competitive differences between them.

Non-competitive facts:

Below, we present some differences among the three that should serve as an introduction to TensorFlow, PyTorch, and Keras. These differences aren't written in the spirit of comparing one with the other, but in the spirit of introducing the subject of our discussion in this article.

TensorFlow

  • Created by Google
  • Version 1.0 released in February 2017

PyTorch

  • Created by Facebook
  • Version 1.0 released in October 2018
  • Based on Torch, another deep learning framework, itself based on Lua

Keras

  • A high-level API that simplifies the complexity of deep learning frameworks
  • Runs on top of other deep learning APIs: TensorFlow, Theano, and CNTK
  • Not a standalone library

Competitive differences between TensorFlow, PyTorch, and Keras:

Now let's look at some more competitive facts about the three of them. We are specifically looking to do a comparative analysis of the frameworks, focusing on Natural Language Processing.

1. Types of RNNs available

When looking for a deep learning solution to an NLP problem, Recurrent Neural Networks (RNNs) are the most popular go-to architecture for developers. It therefore makes sense to compare the frameworks from this angle.

All of the frameworks under consideration have modules that let us create simple RNNs as well as their more evolved variants: Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) networks.


PyTorch

PyTorch provides two levels of classes for building such recurrent networks:

  • Multi-layer classes: nn.RNN, nn.GRU, and nn.LSTM. Objects of these classes are capable of representing deep bidirectional recurrent neural networks.
  • Cell-level classes: nn.RNNCell, nn.GRUCell, and nn.LSTMCell. Objects of these classes can represent only a single cell (again, a simple RNN, LSTM, or GRU cell) that handles one timestep of the input data.

So, the multi-layer classes act as a convenient wrapper around the cell-level classes for the times when we don't need much customization inside our neural network.

Also, making an RNN bidirectional is as simple as setting the bidirectional argument to True in the multi-layer classes!
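A minimal sketch of the two levels in PyTorch (the layer sizes and random inputs below are illustrative assumptions, not from the article):

```python
import torch
import torch.nn as nn

# Multi-layer class: a 2-layer bidirectional LSTM in a single object.
lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=2, bidirectional=True)

x = torch.randn(10, 4, 8)              # (seq_len, batch, features)
out, (h_n, c_n) = lstm(x)
print(out.shape)                       # torch.Size([10, 4, 32]) -- 2 directions * 16

# Cell-level class: one LSTM cell, stepped manually, one timestep per call.
cell = nn.LSTMCell(input_size=8, hidden_size=16)
h = torch.zeros(4, 16)
c = torch.zeros(4, 16)
for t in range(x.size(0)):
    h, c = cell(x[t], (h, c))
print(h.shape)                         # torch.Size([4, 16])
```

The manual loop is exactly what the multi-layer class hides from you, which is why the cell-level classes are the ones to reach for when you need per-timestep customization.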


TensorFlow

TensorFlow provides the tf.nn.rnn_cell module to help with our standard RNN needs.

Some of the most important classes in the tf.nn.rnn_cell module are as follows:

  • Cell-level classes, used to define a single cell of the RNN: BasicRNNCell, GRUCell, and LSTMCell
  • The MultiRNNCell class, used to stack multiple cells and create deep RNNs
  • The DropoutWrapper class, used to implement dropout regularization
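As a sketch of how these classes fit together (accessed here through tf.compat.v1 so it also runs on TensorFlow 2.x installations that still ship the legacy module; the sizes are illustrative assumptions):

```python
import numpy as np
import tensorflow as tf

tf.compat.v1.disable_eager_execution()
v1 = tf.compat.v1

# One GRU cell, wrapped with dropout, stacked twice into a deep RNN.
def make_cell():
    cell = v1.nn.rnn_cell.GRUCell(num_units=16)
    return v1.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=0.9)

stacked = v1.nn.rnn_cell.MultiRNNCell([make_cell() for _ in range(2)])

# Static graph style: declare the input as a placeholder first...
inputs = v1.placeholder(tf.float32, shape=[None, 10, 8])  # (batch, time, features)
outputs, state = v1.nn.dynamic_rnn(stacked, inputs, dtype=tf.float32)

# ...and feed it real data only inside a session.
with v1.Session() as sess:
    sess.run(v1.global_variables_initializer())
    out = sess.run(outputs, {inputs: np.zeros((4, 10, 8), np.float32)})
print(out.shape)  # (4, 10, 16)
```

Note how nothing is computed until `sess.run` — this is the static-graph workflow discussed in the ease-of-use section below.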


Keras

Below are the recurrent layers provided in the Keras library. Some of these layers are:

  • SimpleRNN: a fully-connected RNN where the output is fed back into the input
  • GRU: the Gated Recurrent Unit layer
  • LSTM: the Long Short-Term Memory layer
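A minimal sentiment-style model built from these layers might look like this under tf.keras (the shapes are illustrative assumptions):

```python
import numpy as np
import tensorflow as tf

# Fixed-length input of 10 timesteps with 8 features each.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10, 8)),
    tf.keras.layers.LSTM(32),            # could equally be SimpleRNN or GRU
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

preds = model(np.zeros((4, 10, 8), dtype="float32"))
print(preds.shape)                       # (4, 1)
```

Swapping the recurrent layer for a GRU or SimpleRNN is a one-line change, which is a good illustration of the small, well-defined interface discussed below.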

TensorFlow, PyTorch, and Keras all have built-in capabilities that allow us to create popular RNN architectures. The difference lies in their interfaces.

Keras has a simple interface with a small list of well-defined parameters, which makes the classes above easy to implement. As a high-level API on top of TensorFlow, we can say that Keras makes TensorFlow easier. While PyTorch provides a similar level of flexibility to TensorFlow, it has a much cleaner interface.

While we're on the subject, let's dive deeper into a comparative study based on the ease of use of each framework.

2. Ease of use: TensorFlow vs. PyTorch vs. Keras

TensorFlow is often criticized for its incomprehensive API. PyTorch is much friendlier and simpler to use. Overall, the PyTorch framework is more tightly integrated with the Python language and feels more native most of the time. When you write in TensorFlow, sometimes you feel that your model is behind a brick wall with several tiny holes to communicate through.

Let's discuss a few more aspects of the comparison among the three, based on their ease of use:

Static computational graphs vs. dynamic computational graphs:

This factor is especially important in NLP. TensorFlow uses static graphs for computation, while PyTorch uses dynamic computation graphs.

This means that in TensorFlow, you define the computation graph statically before a model is run. All communication with the outer world is performed via the tf.Session object and tf.placeholder, tensors that will be substituted by external data at runtime.

In PyTorch, things are far more imperative and dynamic: you can define, change, and execute nodes as you go; there are no special session interfaces or placeholders.
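A minimal sketch of this define-by-run style (the function name and sizes are illustrative assumptions):

```python
import torch

def encode(token_vectors, w):
    """Run a toy recurrence over a variable-length sequence of vectors."""
    h = torch.zeros(w.shape[0])
    for x in token_vectors:        # an ordinary Python loop builds the graph
        h = torch.tanh(x @ w + h)  # inspect h here with print() or pdb anytime
    return h

w = torch.randn(6, 6)
short = encode(torch.randn(3, 6), w)    # 3 timesteps
long_ = encode(torch.randn(11, 6), w)   # 11 timesteps: no padding, no session
print(short.shape, long_.shape)
```

Because the graph is rebuilt on every call, the two inputs of different lengths go through the same code path with no special handling.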

In RNNs with static graphs, the input sequence length stays constant. This means that if you develop a sentiment analysis model for English sentences, you must fix the sentence length to some maximum value and pad all shorter sequences with zeros. Not too convenient, right?
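That padding step looks something like this (a plain-Python sketch; the helper name is a hypothetical one, not a framework API):

```python
def pad_sequences(seqs, max_len, pad_value=0):
    """Truncate or right-pad each token-id sequence to a fixed length."""
    return [s[:max_len] + [pad_value] * max(0, max_len - len(s)) for s in seqs]

batch = [[5, 12, 7], [3, 9], [8, 1, 4, 6, 2, 11]]
print(pad_sequences(batch, max_len=5))
# [[5, 12, 7, 0, 0], [3, 9, 0, 0, 0], [8, 1, 4, 6, 2]]
```

Every sequence in the batch ends up the same length, and anything longer than the maximum is simply cut off, which is exactly the inconvenience the static-graph requirement forces on you.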


Debugging:

Since the computation graph in PyTorch is defined at runtime, you can use your favorite Python debugging tools, such as pdb, ipdb, the PyCharm debugger, or good old trusty print statements.

This is not the case with TensorFlow. You have the option to use a special tool called tfdbg, which allows you to evaluate TensorFlow expressions at runtime and browse all tensors and operations in the session scope. Of course, you won't be able to debug any Python code with it, so it will be necessary to use pdb separately.

Community size:

TensorFlow is more mature than PyTorch. It has a much larger community than PyTorch and Keras combined, and its user base is growing faster than both PyTorch's and Keras's.

So this means:

  • A larger StackOverflow community to help with your problems
  • A larger set of online study materials: blogs, videos, courses, etc.
  • Faster adoption of the latest deep learning techniques

The future of NLP:

While Recurrent Neural Networks have been the "go-to" architecture for NLP tasks for a while now, it probably won't stay this way forever. We already have the newer Transformer model, based on the attention mechanism, gaining popularity among researchers.

It is already being hailed as the new NLP standard, replacing Recurrent Neural Networks. Some commentators believe that the Transformer will become the dominant NLP deep learning architecture of 2019.

TensorFlow seems to be ahead in this race:

  • First, attention-based architectures were introduced by Google itself.
  • Second, only TensorFlow has a stable release for the Transformer architecture.

This isn't to say that PyTorch is far behind; many pre-trained Transformer models are available at Hugging Face's GitHub: https://github.com/huggingface/pytorch-transformers.

So, that's all for the comparison. But before parting ways, let me tell you about something that could make this whole conversation obsolete within a year!

TensorFlow 2.0

Google recently announced TensorFlow 2.0, and it's a game-changer!

Here's how:

  • Going forward, Keras will be the high-level API for TensorFlow, and it is being extended so you can use all the advanced features of TensorFlow directly from tf.keras. So: all of TensorFlow with Keras simplicity, at every scale, on all hardware.
  • In TensorFlow 2.0, eager execution is now the default. You can make use of graphs even in an eager context, which makes your debugging and prototyping easy, while the TensorFlow runtime takes care of performance and scaling under the hood.
  • TensorBoard integration with Keras is now a one-liner!
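Concretely, the eager default and the TensorBoard one-liner look roughly like this under TensorFlow 2.x (the log directory and the commented-out training data are illustrative assumptions):

```python
import tensorflow as tf

print(tf.executing_eagerly())        # True -- eager is the TF 2.0 default

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse")

# TensorBoard integration really is a single callback line:
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="./logs")
# model.fit(x_train, y_train, callbacks=[tensorboard_cb])
```

Compare this with the placeholder-and-session example earlier: no graph declaration, no session, and the model can be poked at interactively like any other Python object.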

So, that mitigates almost all of the complaints people have about TensorFlow, I suppose. This means TensorFlow will consolidate its position as the go-to framework for all deep learning tasks, and it's even better now!
