Guide To THiNC: A Refreshing Functional Take On Deep Learning
THiNC is a deep learning framework that makes composing, configuring and deploying models easy. It provides a flexible yet simple approach to modelling through low-level abstractions of the training loop, evaluation loop and so on. Moreover, it plays well with major deep learning frameworks like TensorFlow and PyTorch. THiNC's functional programming API is simple and elegant, and its lightweight design makes it a good option for quick prototyping and deployment of machine learning models.
Core Design Idea
Modularity is one of the most important aspects of any deep learning framework: the ability to build larger functions from smaller component functions is essential for building DL models. Providing this is not as trivial as it seems, because each block of a model must be able to run a forward pass of activations and a backward pass of gradients. Different frameworks handle this requirement in different ways. THiNC uses a callback mechanism to solve the model-composition problem: each layer's forward pass returns a backward callback along with the activation result. The following example will make things clear.
Example: let's say we want a reduce-sum layer followed by a ReLU layer in a model. THiNC builds layers similar to the following functions.
from typing import Callable, Tuple

from numpy import zeros
from thinc.types import Floats2d, Floats3d


def reduce_sum(X: Floats3d) -> Tuple[Floats2d, Callable[[Floats2d], Floats3d]]:
    Y = X.sum(axis=1)
    X_shape = X.shape

    def backprop_reduce_sum(dY: Floats2d) -> Floats3d:
        # Broadcast the incoming gradient back over the summed axis.
        dX = zeros(X_shape)
        dX += dY.reshape((dY.shape[0], 1, dY.shape[1]))
        return dX

    return Y, backprop_reduce_sum


def relu(inputs: Floats2d) -> Tuple[Floats2d, Callable[[Floats2d], Floats2d]]:
    mask = inputs >= 0

    def backprop_relu(d_outputs: Floats2d) -> Floats2d:
        return d_outputs * mask

    return inputs * mask, backprop_relu
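To see how these pieces compose, here is a minimal sketch (our own illustration, not from the original article) that chains the two layers by hand: the forward passes are called in order, and the backward callbacks are called in reverse order.

import numpy as np

# Forward pass through both layers; each call returns its own backprop callback.
X = np.random.rand(2, 3, 4).astype("float32")   # (batch, sequence, width)
hidden, backprop_sum = reduce_sum(X)             # shape (2, 4)
Y, backprop_relu = relu(hidden)                  # shape (2, 4)

# Backward pass: apply the callbacks in reverse order.
d_Y = np.ones_like(Y)          # stand-in for the gradient from the next layer
d_hidden = backprop_relu(d_Y)
d_X = backprop_sum(d_hidden)
print(d_X.shape)               # (2, 3, 4)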
Many layers of a DL model require parameters. Passing these parameters individually to the forward function along with the input quickly becomes unmanageable as the number of parameters and layers grows. To avoid this, the parameters are encapsulated in a Model class, and the model is passed as input to the forward function. Unlike TensorFlow's or PyTorch's model classes, this class is not subclassed for different kinds of layers: the same Model is used for almost every type of layer, and only the forward function that takes the model as input differentiates between layers.
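As a concrete sketch of this pattern (our own illustration, loosely following the linear-layer example in THiNC's documentation), a fully connected layer keeps its weights on the model object and reads them inside the forward function with get_param, accumulating gradients with inc_grad:

from thinc.api import Model
from thinc.types import Floats2d


def linear_forward(model: Model, X: Floats2d, is_train: bool):
    # Parameters live on the model object, not in the function signature.
    W = model.get_param("W")
    b = model.get_param("b")
    Y = model.ops.gemm(X, W, trans2=True) + b

    def backprop(dY: Floats2d) -> Floats2d:
        # Gradients are accumulated on the model as well.
        model.inc_grad("W", model.ops.gemm(dY, X, trans1=True))
        model.inc_grad("b", dY.sum(axis=0))
        return model.ops.gemm(dY, W)

    return Y, backprop


# The layer itself is just a generic Model wired to this forward function.
# A complete layer would also pass an init function that allocates W and b.
my_linear = Model("my_linear", linear_forward,
                  dims={"nO": 32, "nI": 64},
                  params={"W": None, "b": None})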
Features
As per the official documentation, THiNC offers the following features and abilities:
- Type-checking of your model definitions
- Seamless integration with your favourite DL frameworks
- A functional programming API
- Operator overloading for concise model definitions
- An integrated config system for exposing hyperparameters
- A choice of GPU or CPU backends
- Support for variable-length sequences
- Low-abstraction training loops
Let’s get started with THiNC.
Setup
!pip install thinc --pre

# To use a GPU we need to install cupy
!pip install "cupy-cuda101"
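With cupy installed, THiNC can be told to use the GPU when one is available. A minimal sketch using prefer_gpu (our addition, not part of the original setup):

from thinc.api import prefer_gpu

# Use the GPU backend if one is available, otherwise stay on the CPU.
is_gpu = prefer_gpu()
print("Running on GPU" if is_gpu else "Running on CPU")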
Classifier Model
To build a model, all the classes and functions must be imported from thinc.api. Let's build a simple FCNN with dropout to recognise the handwritten digits of the MNIST dataset.
from thinc.api import chain, Relu, Softmax

n_hidden1, n_hidden2, n_hidden3 = 64, 32, 10
dropout = 0.2

model = chain(
    Relu(nO=n_hidden1, dropout=dropout),
    Relu(nO=n_hidden2, dropout=dropout),
    Relu(nO=n_hidden3, dropout=dropout),
    Softmax(),
)
Operator Overloading
Operator overloading can be used to build complex models concisely. The FCNN above can be defined in a single expression using operator overloading.
from thinc.api import Model

with Model.define_operators({">>": chain}):
    model = (
        Relu(nO=n_hidden1, dropout=dropout)
        >> Relu(nO=n_hidden2, dropout=dropout)
        >> Relu(nO=n_hidden3, dropout=dropout)
        >> Softmax()
    )
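More than one operator can be mapped at a time. As a further illustration (not in the original article), mapping "**" to THiNC's clone combinator repeats a layer pattern without writing it out explicitly; note that this variant differs from the model above, since both hidden layers get n_hidden1 units.

from thinc.api import Model, chain, clone, Relu, Softmax

# "**" is mapped to clone, so `layer ** 2` stacks two copies of the layer.
with Model.define_operators({">>": chain, "**": clone}):
    cloned_model = Relu(nO=n_hidden1, dropout=dropout) ** 2 >> Softmax()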
Model Initialization
Input and output shapes, along with any other missing shape information in the model, can be inferred automatically by initializing the model with sample data.
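The snippets that follow assume the MNIST arrays train_X, train_Y, dev_X and dev_Y are already in memory. One way to load them (our assumption, using the ml_datasets helper package that THiNC's own tutorials rely on) is:

import ml_datasets

# Returns flattened 784-dimensional images and one-hot labels.
(train_X, train_Y), (dev_X, dev_Y) = ml_datasets.mnist()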
model.initialize(X=train_X[:5], Y=train_Y[:5])
nI = model.get_dim("nI")
nO = model.get_dim("nO")
print(f"Initialized model with input dimension nI={nI} and output dimension nO={nO}")
Training the Model
Next, we need to build a training loop. THiNC provides low-level abstractions for batching the data and shuffling it. These help in building custom training loops. The following lines are the backbone of training.
for X, Y in tqdm(batches, leave=False):
    Yh, backprop = model.begin_update(X)
    backprop(Yh - Y)
    model.finish_update(optimizer)
Here the difference between the prediction Yh and the target Y is passed to the callback as the gradient of the loss. We can add all sorts of functionality and build a generic training loop on top of these essential lines.
from tqdm import tqdm


def train_model(data, model, optimizer, n_iter, batch_size):
    (train_X, train_Y), (dev_X, dev_Y) = data
    indices = model.ops.xp.arange(train_X.shape[0], dtype="i")
    for i in range(n_iter):
        batches = model.ops.multibatch(batch_size, train_X, train_Y, shuffle=True)
        for X, Y in tqdm(batches, leave=False):
            Yh, backprop = model.begin_update(X)
            backprop(Yh - Y)
            model.finish_update(optimizer)
        # Evaluate and print progress
        correct = 0
        total = 0
        for X, Y in model.ops.multibatch(batch_size, dev_X, dev_Y):
            Yh = model.predict(X)
            correct += (Yh.argmax(axis=1) == Y.argmax(axis=1)).sum()
            total += Yh.shape[0]
        score = correct / total
        print(f" {i} {float(score):.3f}")

Now we can define an optimizer and run the training.

from thinc.api import Adam, fix_random_seed

fix_random_seed(0)
optimizer = Adam(0.001)
batch_size = 128
n_iter = 10
train_model(((train_X, train_Y), (dev_X, dev_Y)), model, optimizer, n_iter, batch_size)
# We got 96.2% validation accuracy
Compatibility with Other Frameworks
THiNC provides wrappers to integrate with TensorFlow, PyTorch and MXNet. These wrappers help a lot when porting code between frameworks: if the functionality you need already exists in a different framework, you can use those layers by wrapping them with the THiNC wrapper for that framework.
Example: Let’s implement the same FCNN as above using layers from both TensorFlow and PyTorch.
from thinc.api import PyTorchWrapper, TensorFlowWrapper, chain, Relu, Softmax

dropout = 0.2

# TensorFlow part of the model: a dense ReLU layer followed by dropout
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Sequential

tf_model = Sequential()
tf_model.add(Dense(64, activation="relu", input_shape=(784,)))
tf_model.add(Dropout(dropout))

# PyTorch part of the model: ReLU -> dropout -> linear -> ReLU
import torch
import torch.nn
import torch.nn.functional as F


class PyTorchModel(torch.nn.Module):
    def __init__(self, nO, nI, dropout):
        super(PyTorchModel, self).__init__()
        self.dropout1 = torch.nn.Dropout2d(dropout)
        self.fc1 = torch.nn.Linear(nI, nO)

    def forward(self, x):
        x = F.relu(x)
        x = self.dropout1(x)
        x = self.fc1(x)
        x = F.relu(x)
        return x


model = chain(
    TensorFlowWrapper(tf_model),
    PyTorchWrapper(PyTorchModel(32, 64, dropout)),
    Relu(nO=10, dropout=dropout),
    Softmax(),
)
model
THiNC is not yet fully mature, and it is not advisable to mix layers from multiple frameworks in a production model: each framework tends to hog the available GPU memory.
Config System
Configuration is an important aspect of product development: models need to be manageable, and we often need to expose many model components and settings as hyperparameters.
THiNC provides a convenient config system for this. We can define trees of hyperparameters using a simple INI-style structure, then resolve the configuration to get concrete functions and values to run. Let's create a config for the classifier model above.
from thinc.api import Config, registry

CONFIG = """
[hyper_params]
n_hidden1 = 64
n_hidden2 = 32
n_hidden3 = 10
dropout = 0.2

[model]
@layers = "chain.v1"

[model.*.relu1]
@layers = "Relu.v1"
nO = ${hyper_params:n_hidden1}
dropout = ${hyper_params:dropout}

[model.*.relu2]
@layers = "Relu.v1"
nO = ${hyper_params:n_hidden2}
dropout = ${hyper_params:dropout}

[model.*.relu3]
@layers = "Relu.v1"
nO = ${hyper_params:n_hidden3}
dropout = ${hyper_params:dropout}

[model.*.softmax]
@layers = "Softmax.v1"

[optimizer]
@optimizers = "Adam.v1"

[optimizer.learn_rate]
@schedules = "warmup_linear.v1"
initial_rate = 2e-5
warmup_steps = 1000
total_steps = 10000

[training]
n_iter = 10
batch_size = 128
"""

config = Config().from_str(CONFIG)
config
Here each block is delimited by [<block name>]. The THiNC registry contains the definitions of standard functions, and we can refer to one of these definitions from a block using
@<registry name> = "<function name string>"
Once the function for a block is chosen, the succeeding lines supply its argument values:
<argument name> = <value>
The hierarchy of functions is expressed by nesting block names with the "." operator:
[<blockA>]
[<blockA>.<blockB>]
Once we have the config we can resolve it and train the model.
loaded_config = registry.resolve(config)

model = loaded_config["model"]
optimizer = loaded_config["optimizer"]
n_iter = loaded_config["training"]["n_iter"]
batch_size = loaded_config["training"]["batch_size"]

model.initialize(X=train_X[:5], Y=train_Y[:5])
train_model(((train_X, train_Y), (dev_X, dev_Y)), model, optimizer, n_iter, batch_size)
This unique config system allows us to expose lots of model details to the user without having to write complex code.
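The registry is also extensible: we can register our own functions and refer to them from a config. The following sketch is purely illustrative (the factory name and its arguments are our own, not from the original article):

from thinc.api import Model, Relu, Softmax, chain, registry

# Register a custom layer factory so a config block can reference it
# with @layers = "my_mlp.v1".
@registry.layers("my_mlp.v1")
def create_my_mlp(width: int, depth: int, dropout: float) -> Model:
    hidden = [Relu(nO=width, dropout=dropout) for _ in range(depth)]
    return chain(*hidden, Softmax())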
Disadvantages
Although THiNC is a good approach to deep learning modelling, it introduces its own set of problems. The following are some of THiNC's disadvantages:
No implicit backward pass of gradients: as discussed in the design-idea section of the post, every block must return its activation output along with a callback for backpropagation. Unlike other frameworks, THiNC does not use reverse-mode automatic differentiation, so the backward pass has to be coded explicitly.
Rudimentary framework integration: THiNC's ability to integrate with other frameworks is still basic, and its low-level abstractions make it easier to make mistakes.
Conclusion
THiNC is a lightweight DL framework that makes model composition easy. Its enticing advantages, such as shape inference, concise model definitions, straightforward debugging and a powerful config system, make it a framework worth considering.