Sample Application of NeMo for Sentiment Analysis

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How do I use NeMo in a container?

Objectives
  • Apply NeMo to run sentiment analysis

NeMo

Import the NeMo image as a sqsh file and create the container from it:

The NGC catalog page for the NeMo container can be found here: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

$ enroot import docker://nvcr.io#nvidia/nemo:22.09
$ enroot create nvidia+nemo+22.09.sqsh
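
The file name passed to `enroot create` follows from the image URI given to `enroot import`. As a minimal sketch, assuming the naming pattern visible in the two commands above holds for other images (registry dropped, `/` and `:` replaced by `+`), a hypothetical helper `sqsh_name` can predict the file name for a different tag:

```python
def sqsh_name(image_uri: str) -> str:
    """Guess the .sqsh file name written for a docker:// image URI.

    Assumption: the pattern seen in this lesson applies, i.e. the registry
    part before '#' is dropped and '/' and ':' are replaced by '+'.
    """
    prefix = "docker://"
    name = image_uri[len(prefix):] if image_uri.startswith(prefix) else image_uri
    # The registry is separated from the image name by '#'.
    if "#" in name:
        name = name.split("#", 1)[1]
    return name.replace("/", "+").replace(":", "+") + ".sqsh"


print(sqsh_name("docker://nvcr.io#nvidia/nemo:22.09"))  # nvidia+nemo+22.09.sqsh
```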

Sentiment Analysis using NeMo

Here we use the sentiment analysis sample from NVIDIA.

SST-2 data:

We download the Stanford Sentiment Treebank v2 (SST-2) and preprocess it into NeMo format to produce the training and validation (dev) data:

cd $WORK
mkdir nemo && cd nemo

curl -s -O https://dl.fbaipublicfiles.com/glue/data/SST-2.zip\
 && unzip -o SST-2.zip -d ./\
 && sed 1d ./SST-2/train.tsv > ./SST-2/train_nemo_format.tsv\
 && sed 1d ./SST-2/dev.tsv > ./SST-2/dev_nemo_format.tsv
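
The `sed 1d` calls only drop the header row of each TSV file. A minimal Python sketch of the same step follows, assuming every remaining row is a sentence, a tab, and a label. The helper names `drop_header`, `label_counts`, and `convert` are hypothetical and not part of NeMo. Counting the labels gives a quick sanity check on the converted file.

```python
from collections import Counter
from pathlib import Path
from typing import Iterable, List


def drop_header(lines: Iterable[str]) -> List[str]:
    """Return all lines except the first one, like `sed 1d`."""
    return list(lines)[1:]


def label_counts(rows: Iterable[str]) -> Counter:
    """Count rows per label, taking the text after the last tab as the label."""
    counts = Counter()
    for row in rows:
        # A row without a tab raises ValueError, which flags a malformed file.
        _sentence, label = row.rstrip("\n").rsplit("\t", 1)
        counts[label] += 1
    return counts


def convert(src: Path, dst: Path) -> Counter:
    """Write src without its header line to dst and return the label counts."""
    with src.open(encoding="utf-8") as handle:
        rows = drop_header(handle)
    dst.write_text("".join(rows), encoding="utf-8")
    return label_counts(rows)
```

Calling `convert` with the path to `train.tsv` and the desired output path writes the headerless file and returns the number of rows per label.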

Request a compute node with one GPU and start the NeMo container on it (the command expects the sqsh file under $WORK/sqsh):

srun -N1 -G1 -c10 --mem=64G --time=12:00:00 --container-image $WORK/sqsh/nvidia+nemo+22.09.sqsh --container-mounts=$WORK --pty bash -i

Let’s run Sentiment Analysis using NeMo

cd $WORK/nemo/SST-2
python /workspace/nemo/examples/nlp/text_classification/text_classification_with_bert.py \
      model.dataset.num_classes=2 \
      model.dataset.max_seq_length=256 \
      model.train_ds.batch_size=64 \
      model.validation_ds.batch_size=64 \
      model.language_model.pretrained_model_name='bert-base-cased' \
      model.train_ds.file_path=train_nemo_format.tsv \
      model.validation_ds.file_path=dev_nemo_format.tsv \
      trainer.num_nodes=1 \
      trainer.max_epochs=20 \
      trainer.precision=16 \
      model.optim.name=adam \
      model.optim.lr=1e-4
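
The arguments after the script path are Hydra-style `key=value` overrides of the script's configuration. As a minimal sketch, the same command line can be assembled from a Python dictionary, which makes it easier to vary settings between runs. The helper name `build_command` is hypothetical; the script path and keys are the ones used above.

```python
import shlex
from typing import Dict, List

SCRIPT = "/workspace/nemo/examples/nlp/text_classification/text_classification_with_bert.py"

# Same overrides as in the command above; values are kept as written there.
overrides = {
    "model.dataset.num_classes": 2,
    "model.dataset.max_seq_length": 256,
    "model.train_ds.batch_size": 64,
    "model.validation_ds.batch_size": 64,
    "model.language_model.pretrained_model_name": "bert-base-cased",
    "model.train_ds.file_path": "train_nemo_format.tsv",
    "model.validation_ds.file_path": "dev_nemo_format.tsv",
    "trainer.num_nodes": 1,
    "trainer.max_epochs": 20,
    "trainer.precision": 16,
    "model.optim.name": "adam",
    "model.optim.lr": "1e-4",
}


def build_command(script: str, settings: Dict[str, object]) -> List[str]:
    """Turn a dict of settings into an argument list of key=value overrides."""
    return ["python", script] + [f"{key}={value}" for key, value in settings.items()]


cmd = build_command(SCRIPT, overrides)
print(shlex.join(cmd))  # shell-ready form of the command
```

Inside the container, the list could be passed to `subprocess.run(cmd, check=True)` to start training.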

Check GPU usage with the nvidia-smi command.
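
For a compact view while training runs, `nvidia-smi` can print selected fields as CSV. The following is a minimal sketch that reads utilization and memory use from Python. The helper names are hypothetical, and the parser assumes every field is a plain integer, so it would fail on a GPU that reports a field as not available.

```python
import shutil
import subprocess
from typing import Dict, List

QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_csv(text: str) -> List[Dict[str, int]]:
    """Parse `nvidia-smi --format=csv,noheader,nounits` output, one GPU per line."""
    gpus = []
    for line in text.strip().splitlines():
        util, used, total = (int(field) for field in line.split(","))
        gpus.append({"util_pct": util, "mem_used_mib": used, "mem_total_mib": total})
    return gpus


def query_gpus() -> List[Dict[str, int]]:
    """Run nvidia-smi and return one dict per GPU, or [] if it is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for index, gpu in enumerate(query_gpus()):
        print(f"GPU {index}: {gpu['util_pct']}% busy, "
              f"{gpu['mem_used_mib']}/{gpu['mem_total_mib']} MiB used")
```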

The output of the model training is the file text_classification_model.nemo.

Model Evaluation and Inference

from nemo.collections.nlp.models.text_classification import TextClassificationModel

# load the trained model from the .nemo file and move it to the GPU
model = TextClassificationModel.restore_from("text_classification_model.nemo")
model.to("cuda")

# define the list of queries for inference
queries = ['legendary irish writer brendan behan memoir , borstal boy',
           'demonstrates that the director of such hollywood blockbusters as patriot games can still turn out a small , personal film with an emotional wallop ', 
           'on the worst revenge-of-the-nerds clichés the filmmakers could dredge up', 
           'uneasy mishmash of styles and genres']

results = model.classifytext(queries=queries, batch_size=3, max_seq_length=512)

print('The prediction results of some sample queries with the trained model:')
for query, result in zip(queries, results):
    print(f'Query : {query}')
    print(f'Predicted label: {result}')
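
The predicted labels are class indices rather than words. In SST-2, label 0 marks a negative sentence and label 1 a positive one. The following is a minimal sketch, assuming `classifytext` returns one integer class index per query, that maps indices to names and computes accuracy against known labels. The helper names are hypothetical, and the example values are made up for illustration; they are not output from the trained model.

```python
from typing import List, Sequence

LABEL_NAMES = {0: "negative", 1: "positive"}


def to_names(predictions: Sequence[int]) -> List[str]:
    """Map SST-2 class indices to readable label names."""
    return [LABEL_NAMES[int(p)] for p in predictions]


def accuracy(predictions: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold labels differ in length")
    if not gold:
        return 0.0
    correct = sum(int(p) == int(g) for p, g in zip(predictions, gold))
    return correct / len(gold)


# Stand-in values for illustration only.
example_predictions = [1, 1, 0, 0]
example_gold = [1, 1, 0, 1]
print(to_names(example_predictions))  # ['positive', 'positive', 'negative', 'negative']
print(accuracy(example_predictions, example_gold))  # 0.75
```

To evaluate the trained model this way, the gold labels would come from the last column of the dev file and the predictions from running the model on the matching sentences.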

Key Points

  • NGC Container, NeMo, Sentiment Analysis