This lesson is being piloted (Beta version)

SMU SuperPOD 101

Introduction to SMU SuperPOD

Overview

Teaching: 10 min
Exercises: 0 min
Questions
Objectives
  • Onboarding to SMU SuperPOD

Introduction

NVIDIA DGX SuperPOD Advantage Specifications

Specification Values
Computational Ability 1,644 TFLOPS
Number of Nodes 20
CPU Cores 2,560
Total Memory 52.5 TB
Node Interconnect Bandwidth 200 Gb/s InfiniBand connections per node
Work Storage 768 TB (Shared)
Scratch Storage 750 TB (Raw)
Archival Storage N/A
Operating System Ubuntu 20.04

Specification for each compute node:

Specification Values
CPU cores 128
GPUs 8
Memory 1,910 GB
Time Limit 2 days
Home Storage 200 GB (independent from M3)
Scratch Storage Unlimited (independent from M3)
Work Storage 8 TB (shared with M3)

Command to check the configuration of all nodes:

$ sinfo --Format="PartitionName,Nodes:10,CPUs:8,Memory:12,Time:15,Features:18,Gres:14"

Storage

Note that:

Variable Path Quota Usage
${HOME} /users/${USER} 200 GB Home directory, backed up
${WORK} /work/users/${USER} 8 TB Long-term storage
${SCRATCH} /scratch/users/${USER} None Temporary scratch space
${JOB_SCRATCH} /scratch/_tmp/${USER:0:1}/${USER}/${SLURM_JOB_ID}_${SLURM_ARRAY_TASK_ID} None Per-job scratch space; ${SLURM_ARRAY_TASK_ID} is zero for standard jobs

Command to check available data from your work storage:

$ lfs quota -h -u $USERNAME /work
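You can also check how much space a given directory tree is using with du (a generic Linux command, not SuperPOD-specific):

$ du -sh ${HOME}
$ du -sh ${WORK}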

Login to SuperPOD

$ ssh username@superpod.smu.edu
$ ssh username@slogin-01.superpod.smu.edu
$ ssh username@slogin-02.superpod.smu.edu

SuperPOD uses the same module system as M3, so nearly all commands are similar.

Requesting a compute node

SuperPOD uses SLURM as its scheduler, so requesting an interactive node is no different from M3.

For example, to request a node with 1 GPU, 10 CPUs, and 128 GB of memory for 12 hours:

$ srun -N 1 -G 1 -c 10 --mem=128G --time=12:00:00 --pty $SHELL
$ srun -N 1 -G 1 -c 10 --mem=128G --time=12:00:00 --pty bash

For this on-campus workshop, a workshop queue is available (use the flag -p workshop) to speed up the process of requesting resources:

$ srun -N 1 -G 1 -c 10 --mem=64G -p workshop --time=12:00:00 --pty $SHELL

Transferring data

scp /link/fileA username@superpod.smu.edu:/users/username

or use WinSCP on a Windows machine if you don't want to use the CLI.
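For example, to copy a whole directory to your work space on SuperPOD and to copy a result file back to your local machine (the paths below are placeholders):

scp -r /path/to/project username@superpod.smu.edu:/work/users/username/
scp username@superpod.smu.edu:/work/users/username/project/results.txt .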

Working with modules

By default, only a few modules are available when running module avail:

$ module avail

------------------------------------------------------------------------- /hpc/mp/module_files/compilers -------------------------------------------------------------------------
   amd/aocc/4.1.0    gcc/11.2.0    intel/oneapi/2023.2    nvidia/nvhpc/23.7

--------------------------------------------------------------------------- /hpc/mp/module_files/apps ----------------------------------------------------------------------------
   amber/22    apptainer/1.1.9    conda    gaussian/g16c02    julia/1.9.2    lammps/may22    spack

Similar to M3, SuperPOD uses Spack as its module manager, so you can find all the modules you need after loading spack:

$ module load spack
$ module avail

------------------------------------------------------------------ /hpc/mp/spack_modules/linux-ubuntu22.04-zen2 ------------------------------------------------------------------
   aocc-4.1.0/aocl-sparse/4.0-t2kjb3u                               gcc-11.2.0/aocl-sparse/4.0-zczy7ug                          gcc-11.2.0/lz4/1.9.4-gtzsc3c
   aocc-4.1.0/autoconf-archive/2023.02.20-inwkm6b                   gcc-11.2.0/autoconf-archive/2023.02.20-r5lazua              gcc-11.2.0/lzo/2.10-x6itbky
   aocc-4.1.0/autoconf/2.69-x53b2ii                                 gcc-11.2.0/autoconf/2.69-xlmuzvq                            gcc-11.2.0/m4/1.4.19-sv4d5ah
   aocc-4.1.0/automake/1.16.5-hfcjabg                               gcc-11.2.0/automake/1.16.5-nsy2ron                          gcc-11.2.0/mbedtls/2.28.2-xvf3rc3
   aocc-4.1.0/berkeley-db/18.1.40-5po7n7c                           gcc-11.2.0/berkeley-db/18.1.40-hlnjdqn                      gcc-11.2.0/mbedtls/2.28.2-42lnomn         (D)     
   aocc-4.1.0/binutils/2.40-eivqxcw                                 gcc-11.2.0/binutils/2.40-u6hr2wz                            gcc-11.2.0/meson/1.1.0-teqdfz5
   aocc-4.1.0/bzip2/1.0.8-5ag7qmi                                   gcc-11.2.0/bison/3.8.2-tifozqf                              gcc-11.2.0/metis/5.1.0-coza6f3
   aocc-4.1.0/cmake/3.26.3-p6v5a7t                                  gcc-11.2.0/boost/1.82.0-xpmd3v6                             gcc-11.2.0/mpfr/4.2.0-meodww2
   aocc-4.1.0/diffutils/3.9-bzq7rzo                                 gcc-11.2.0/bzip2/1.0.8-qaxdt7f                              gcc-11.2.0/msgpack-c/3.1.1-d624eki
   aocc-4.1.0/expat/2.5.0-kav5ad4                                   gcc-11.2.0/cmake/3.26.3-r23mmbq                             gcc-11.2.0/nasm/2.15.05-mdqravc
   aocc-4.1.0/gdbm/1.23-6r6asdl                                     gcc-11.2.0/cmake/3.26.3-utseokk                      (D)    gcc-11.2.0/ncurses/6.4-rfw5ur5
   aocc-4.1.0/gettext/0.21.1-dmnukqt                                gcc-11.2.0/curl/8.0.1-cp7iioq                               gcc-11.2.0/neovim/0.8.3-mdppjp3
   ....

Note: Press "q" to exit the module listing.
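Once the Spack modules are visible, you can load one by the full name shown in the listing, for example (the exact version hash may differ on your system):

$ module load gcc-11.2.0/cmake/3.26.3-utseokk
$ cmake --version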

As the installation process is still ongoing, if you do not see a module that you need, please let us know so we can install it for you.

Key Points

  • SuperPOD 101


Working with Conda Environment

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to create a personal conda environment in SuperPOD

Objectives
  • Create a conda environment for AI/ML applications

2. Conda Environment

$ module load conda
$ conda env list

# conda environments:
#
base                     /hpc/mp/apps/conda

Create a conda environment for TensorFlow with GPU support

Next, let's create a conda environment for TensorFlow 2.9. Here are the steps:

(1) Request a compute node with 1 GPU

$ srun -N1 -G1 -c10 --mem=64G --time=12:00:00 --pty $SHELL

(2) Load the cuda and cudnn modules for GPU support

$ module load conda gcc
$ module load cuda
$ module load cudnn

(3) Create the TensorFlow environment with your preferred version of Python

$ conda create --prefix ~/tensorflow_2.9 python=3.8 pip -y

The conda environment named tensorflow_2.9 is created in your home directory.

(4) Activate the conda environment and install TensorFlow 2.9.1 (or your preferred TF version)

$ source activate ~/tensorflow_2.9/  
$ pip install tensorflow==2.9.1

Install ipykernel and create the kernel for the notebook:

$ pip install ipykernel
$ python3 -m ipykernel install --user --name tensorflow_2.9 --display-name TensorflowGPU29

(5) Once the installation is done, check whether the conda environment can see the GPU:

$  python
>>> import tensorflow as tf
>>> tf.config.list_physical_devices('GPU')
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Usage of the conda environment manager is no different from M3.
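For reference, a few common environment-management commands (a minimal sketch using the environment created above):

$ conda deactivate                             # leave the environment
$ conda env list                               # list available environments
$ conda env remove --prefix ~/tensorflow_2.9   # delete the environment if no longer needed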

Create a conda environment for PyTorch with GPU support

Similar to TensorFlow, one can create a conda environment for PyTorch with GPU support.

Following are steps (3) to (5) in brief, to create the environment and install PyTorch after requesting a node and loading the libraries:

$ conda create --prefix ~/pytorch_1.13 python=3.8 pip -y
$ source activate ~/pytorch_1.13
$ conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia -y
$ python
>>> import torch 
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
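If you also want this PyTorch environment available in Jupyter, you can register a kernel the same way as for TensorFlow (the kernel name and display name below are only examples):

$ pip install ipykernel
$ python3 -m ipykernel install --user --name pytorch_1.13 --display-name PytorchGPU113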

Key Points

  • Conda environment


Using NGC Container in SuperPOD

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to use NGC Container in SuperPOD?

Objectives
  • Learn how to master NGC container usage in SuperPOD

3. Using NVIDIA NGC Container in SuperPOD

What is a Container?

Docker Container

NVIDIA NGC Container

ENROOT

It is very convenient to download Docker and NGC containers to SuperPOD. Here I would like to introduce a very effective tool named enroot.

Importing docker container to SuperPOD from docker hub

$ enroot import docker://ubuntu
$ enroot create ubuntu.sqsh
$ enroot start ubuntu

# Type ls to see the contents of the container:
# ls

bin   dev  home  lib32  libx32  mnt  proc  run   srv  tmp    usr
boot  etc  lib   lib64  media   opt  root  sbin  sys  users  var
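Type exit to leave the container. You can also list and remove enroot containers when you no longer need them (a brief sketch using the container created above):

$ enroot list             # list created containers
$ enroot remove ubuntu    # delete the container filesystem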

Exercise

Go to Docker Hub, search for any container, for example lolcow, then use enroot to construct that container environment:

enroot import docker://godlovedc/lolcow
enroot create godlovedc+lolcow.sqsh
enroot start godlovedc+lolcow

image

Download Tensorflow container

image

The following tag was copied to the clipboard when selecting the 22.12-tf2 version:

nvcr.io/nvidia/tensorflow:22.12-tf2-py3
$ cd $WORK/sqsh
$ enroot import docker://nvcr.io#nvidia/tensorflow:22.12-tf2-py3

The sqsh file nvidia+tensorflow+22.12-tf2-py3.sqsh is created.

$ enroot create nvidia+tensorflow+22.12-tf2-py3.sqsh

Working with NGC container in Interactive mode:

Once the container has been imported and created in your folder on SuperPOD, you can simply activate it from the login node when requesting a compute node:

$ srun -N1 -G1 -c10 --mem=64G --time=12:00:00 --container-image $WORK/sqsh/nvidia+tensorflow+22.12-tf2-py3.sqsh --container-mounts=$WORK --pty $SHELL

Check that the GPU is enabled:

$ python
>>> import tensorflow as tf
>>> tf.config.list_physical_devices('GPU')
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Exit the container using exit command.

Working with NGC container in Batch mode

Below is a sample batch script that runs a short TensorFlow GPU check inside the container:

#!/bin/bash
#SBATCH -J Testing       # job name to display in squeue
#SBATCH -o output-%j.txt    # standard output file
#SBATCH -e error-%j.txt     # standard error file
#SBATCH -p batch -c 12 --mem=20G --gres=gpu:1     # partition, CPUs, memory, and GPU request
#SBATCH -t 1440              # maximum runtime in minutes
#SBATCH -D /link-to-your-folder/    # working directory (change to your folder)

srun --container-image=/work/users/tuev/sqsh/nvidia+tensorflow+22.12-tf2-py3.sqsh --container-mounts=$WORK python testing.py

where testing.py contains:

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))
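Save the script (for example as modelNGC_test.sh; the name is arbitrary) and submit it from the login node as usual:

$ sbatch modelNGC_test.sh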

Working with NGC container in Jupyter Lab

root@bcm-dgxa100-0001:/workspace# jupyter lab --allow-root --no-browser --ip=0.0.0.0

The following URL appears with its token:

Or copy and paste this URL:
        http://hostname:8888/?token=fd6495a28350afe11f0d0489755bc3cfd18f8893718555d2

Note that you must replace hostname with the corresponding node you are on, in this case bcm-dgxa100-0001.

Therefore, you should change the above address as follows and paste it into Firefox:

http://bcm-dgxa100-0001:8888/?token=fd6495a28350afe11f0d0489755bc3cfd18f8893718555d2

Note: you should select the default Python 3 (ipykernel) instead of any other kernel when running the container.

image

Tip: Once forwarded to Jupyter Lab, you are placed in the container's root directory. It is recommended to create a symlink to your folder so you can navigate away from it:

$ ln -s $WORK work

Key Points

  • NGC Container


Using Jupyter Lab in SuperPOD

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to use Jupyter Lab in SuperPOD?

Objectives
  • Learn the port-forwarding technique to enable Jupyter Lab

4. Jupyter Lab on SuperPOD

$ ssh -C -D 8000 username@superpod.smu.edu

The -C flag enables compression and -D enables dynamic port forwarding with SOCKS4/5 on port 8000. Feel free to change the port, and remember to set the same port in your browser.

4.1 Set up the browser to enable proxy viewing (similar for macOS/Linux as well)

4.1.1 Using Firefox as browser:

Open Firefox (my version is 104.0.2). Use the key combination Alt+T+S to open the settings tab. Scroll to the bottom and select Settings under Network Settings:

image

4.1.2 Using Chrome/Safari as browser:

Search for proxies and set a SOCKS proxy with server localhost and port 8000.

image

4.2 Test Proxy

4.2.1. Test Proxy using conda environment:

Go back to MobaXterm and log in to SuperPOD using regular SSH, then request a compute node:

$ srun -N1 -G1 -c10 --mem=64G --time=12:00:00 --pty $SHELL

Load cuda and cudnn, then activate any of your conda environments, for example tensorflow_2.9 in the home directory:

$ module load conda gcc; module load cuda; module load cudnn
$ conda activate ~/tensorflow_2.9   

Make sure JupyterLab is installed:

$ pip install jupyterlab

Next, run one of the following commands:

$ jupyter notebook --ip=0.0.0.0 --no-browser
# or
$ jupyter lab --ip=0.0.0.0 --no-browser   

The following screen appears:

image

Copy the highlighted URL into Firefox, and you will see the Jupyter Notebook forwarded like this:

image

Select the TensorflowGPU29 kernel and check the GPU device:

image

4.2.2. Test Proxy using docker container:

For a Docker container, the command needs one additional flag:

$ jupyter lab --ip=0.0.0.0 --no-browser --allow-root

You will need to replace the hostname with the name of the node you are on:

image

For example, from the previous command you would copy and paste the following line into the Firefox browser:

http://bcm-dgxa100-0016:8888/?token=daefb1c3e2754b37b6b94b619387cb3fd9710608e0152182

Troubleshooting when the notebook requests a password

In certain cases, your Jupyter Notebook may require a password. You can set the password using the command below prior to starting the Jupyter Lab instance:

$ jupyter notebook password   

If changing the password does not help, the forwarded port may have a problem. In that case you should either (as sketched below):

(1) change the default Jupyter port 8888 to another one (8889, for example), or
(2) change the local SOCKS port used when you first log in to SuperPOD (8000 in this case) to another local port (5000, for example).
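A quick sketch of both options (the port numbers here are only examples):

# option 1: run Jupyter on a different port
$ jupyter lab --ip=0.0.0.0 --no-browser --port=8889

# option 2: use a different local SOCKS port when logging in (update your browser proxy to match)
$ ssh -C -D 5000 username@superpod.smu.edu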

Key Points

  • Jupyter Lab, port forwarding


Using Batch script in SuperPOD

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to run a batch script in SuperPOD

Objectives
  • Running a batch script using a CIFAR10 template model

5. Using Batch script in SuperPOD

SuperPOD uses SLURM as its scheduler, so there is no difference in running a batch script compared to ManeFrame 3 (M3). However, there are some commands you need to pay attention to when running a batch script that uses a container.

Following are the instructions on how to run a batch script for a computer vision sample using CIFAR10 data. Here, I use a Python file called model_CNN_CIFAR10.py.

The file can be downloaded from here to your $WORK folder:

$ cd $WORK
$ wget https://raw.githubusercontent.com/SouthernMethodistUniversity/SMU_SuperPOD_101/e6315c29ca0542351b79233729708dfa16161cdf/files/model_CNN_CIFAR10.py

5.1 Running Batch script with conda environment

Prepare a batch script named modelCNN.sh with the following content:

#!/bin/bash
#SBATCH -J CNN_CIFAR10_SPOD       # job name to display in squeue
#SBATCH -t 60                     # maximum runtime in minutes
#SBATCH -c 2                      # request 2 cpus    
#SBATCH -G 1                      # request 1 A100 GPU
#SBATCH -p workshop               # request queue name workshop (optional)
#SBATCH -D /work/users/tuev       # working directory (change to your folder)
#SBATCH --mem=32gb                # request 32 GB memory
#SBATCH --mail-user tuev@smu.edu  # email address for notifications (change to yours)
#SBATCH --mail-type=end           # send an email when the job ends

module load conda gcc
module load cuda cudnn

conda activate ~/tensorflow_2.9
python model_CNN_CIFAR10.py

From the login node, submit the batch script:

$ sbatch modelCNN.sh

5.2 Running Batch script with container

Prepare a batch script named modelCNN_ngc.sh with the following content:

#!/bin/bash
#SBATCH -J CNN_CIFAR10_SPOD       # job name to display in squeue
#SBATCH -t 60                     # maximum runtime in minutes
#SBATCH -c 2                      # request 2 cpus    
#SBATCH -G 1                      # request 1 A100 GPU
#SBATCH -p workshop               # request queue name workshop (optional)
#SBATCH --mem=32gb                # request 32 GB memory
#SBATCH --mail-user tuev@smu.edu  # email address for notifications (change to yours)
#SBATCH --mail-type=end           # send an email when the job ends

srun --container-image=/work/users/tuev/sqsh/nvidia+tensorflow+22.12-tf2-py3.sqsh --container-mounts=$WORK python $WORK/model_CNN_CIFAR10.py

From the login node, submit the batch script:

$ sbatch modelCNN_ngc.sh

Key Points

  • Batch script, Computer Vision


Job queueing and control in SuperPOD

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to control jobs in SuperPOD

Objectives
  • Learn the commands for working with jobs in SLURM

The SuperPOD cluster uses the Simple Linux Utility for Resource Management system (SLURM) to manage jobs.

5b. Job Queue and Control

In SLURM there are several useful commands for checking your jobs:

Lifecycle of a Job

The life of a job begins when you submit the job to the scheduler. If accepted, it will enter the Queued state.

Thereafter, the job may move to other states, as defined below:

image

Useful Commands

Here are some basic SLURM commands for submitting, querying and deleting jobs in SuperPOD:

Command Actions
srun -N1 -G1 --pty $SHELL Submit an interactive job (reserves 1 node, 1 GPU, 1 CPU, 6 GB RAM, 1 hour walltime)
sbatch job.sh Submit the job script job.sh
sstat <job id> Check the status of the job given jobID
sstat <job id> --format=AveCPU,AvePages,AveRSS,AveVMSize,JobID Narrow some information on sstat
squeue -u <username> Check the status of all jobs submitted by given username
scontrol show job <job id> Check the detailed information for job with given job ID
scancel <job id> Delete the queued or running job given job ID

Check pending and running jobs:

$ squeue -u $USERNAME

JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
12345  workshop     bash     tuev  R      39:46      1 bcm-dgxa100-0002

The above job has JOBID=12345, which will be used below.

Check the configuration of any requested job using its JOBID:

$ scontrol show job 12345 | grep ReqTRES

ReqTRES=cpu=5,mem=30G,node=1,billing=5,gres/gpu=1

Delete any job

$ scancel 12345

Checking how your job is running on a node

When you know your working node, for example bcm-dgxa100-0001, you can log in to that compute node from the login node and check the running processes:

$ ssh bcm-dgxa100-0001
$ top -u $USERNAME
$ nvidia-smi

# or, to refresh the output every 0.2 seconds:
$ watch -n .2 nvidia-smi

Key Points

  • Job queue, control


Data Science workflow with GPUs using RAPIDS

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to install and use RAPIDS

Objectives
  • Using GPUs directly to work with data

RAPIDS

RAPIDS provides unmatched speed with familiar APIs that match the most popular PyData libraries. Built on the shoulders of giants including NVIDIA CUDA and Apache Arrow, it unlocks the speed of GPUs with code you already know.

https://rapids.ai/

Installing RAPIDS

There are several ways to install RAPIDS on HPC systems.

Using Conda Environment

This is the simplest method and works on both the M2 and SuperPOD systems. You can install it interactively. First, request a GPU node and load the corresponding module.

In M2:

$ srun -n1 --gres=gpu:1 -c2 --mem=4gb --time=12:00:00 -p gpu-dev --pty $SHELL
$ module load conda

In SuperPOD:

$ srun -n1 --gres=gpu:1 -c2 --mem=4gb --time=12:00:00 --pty $SHELL
$ module load conda

Once the necessary module has been loaded, you just need to create the conda environment and install RAPIDS. The following command gets the latest standard version from https://rapids.ai/:

$ conda create -n rapids-23.02 -c rapidsai -c conda-forge -c nvidia  rapids=23.02 python=3.10 cudatoolkit=11.8

If you want a more customized version, you can select the corresponding options and copy the command from the RAPIDS website, rapids.ai:

image

Using container

This approach works on SuperPOD only. We need to download the RAPIDS container from NGC:

$ enroot import docker://nvcr.io#nvidia/rapidsai/rapidsai:cuda11.2-runtime-centos7-py3.10
$ enroot create nvidia+rapidsai+rapidsai+cuda11.2-runtime-centos7-py3.10.sqsh

Once the container has been downloaded to your home/scratch/work directory, you can load it from the login node:

$ srun -N1 -G1 -c10 --mem=64G --time=12:00:00 --container-image $WORK/sqsh/nvidia+rapidsai+rapidsai+cuda11.2-runtime-centos7-py3.10.sqsh --container-mounts=$WORK --pty $SHELL

Your installation is done!
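Once inside the RAPIDS environment or container, a quick way to confirm that everything works is to run a tiny cuDF example on the GPU (a minimal sketch, not SuperPOD-specific):

import cudf

# build a small DataFrame in GPU memory and compute simple statistics
gdf = cudf.DataFrame({"a": [1, 2, 3, 4], "b": [10.0, 20.0, 30.0, 40.0]})
print(gdf["b"].mean())
print(gdf.describe())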

Key Points

  • NGC Container, RAPIDS, cuDF, Dask-cuDF


Sample Application of NEMO for Sentiment Analysis

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to use NeMo in a container

Objectives
  • Apply NeMo to run sentiment analysis

NeMo

Import and Create NeMo sqsh file:

The NGC container for NeMo can be found here: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

$ enroot import docker://nvcr.io#nvidia/nemo:22.09
$ enroot create nvidia+nemo+22.09.sqsh

Sentiment Analysis using NeMo

Here we use this sentiment analysis sample from NVIDIA.

SST2 data:

We download the Stanford Sentiment Treebank v2 (SST-2) dataset and preprocess it into NeMo format for the training and testing data.

cd $WORK
mkdir nemo && cd nemo

curl -s -O https://dl.fbaipublicfiles.com/glue/data/SST-2.zip\
 && unzip -o SST-2.zip -d ./\
 && sed 1d ./SST-2/train.tsv > ./train_nemo_format.tsv\
 && sed 1d ./SST-2/dev.tsv > ./dev_nemo_format.tsv &

Request a compute node with the NeMo container enabled and a GPU:

srun -N1 -G1 -c10 --mem=64G --time=12:00:00 --container-image $WORK/sqsh/nvidia+nemo+22.09.sqsh --container-mounts=$WORK --pty bash -i

Let's run the sentiment analysis using NeMo:

cd $WORK/nemo/SST-2
python /workspace/nemo/examples/nlp/text_classification/text_classification_with_bert.py \
      model.dataset.num_classes=2 \
      model.dataset.max_seq_length=256 \
      model.train_ds.batch_size=64 \
      model.validation_ds.batch_size=64 \
      model.language_model.pretrained_model_name='bert-base-cased' \
      model.train_ds.file_path=train_nemo_format.tsv \
      model.validation_ds.file_path=dev_nemo_format.tsv \
      trainer.num_nodes=1 \
      trainer.max_epochs=20 \
      trainer.precision=16 \
      model.optim.name=adam \
      model.optim.lr=1e-4

Check the GPU usage with the nvidia-smi command.

The output of the model training is text_classification_model.nemo.

Model Evaluation and Inference

from nemo.collections.nlp.models.text_classification import TextClassificationModel
model = TextClassificationModel.restore_from("text_classification_model.nemo")
model.to("cuda")

# define the list of queries for inference
queries = ['legendary irish writer brendan behan memoir , borstal boy',
           'demonstrates that the director of such hollywood blockbusters as patriot games can still turn out a small , personal film with an emotional wallop ', 
           'on the worst revenge-of-the-nerds clichés the filmmakers could dredge up', 
           'uneasy mishmash of styles and genres']

results = model.classifytext(queries=queries, batch_size=3, max_seq_length=512)

print('The prediction results of some sample queries with the trained model:')
for query, result in zip(queries, results):
    print(f'Query : {query}')
    print(f'Predicted label: {result}')

Key Points

  • NGC Container, NeMo, Sentiment Analysis


Sample Applications of MultiGPUs for Computer Vision using Horovod

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to utilize multiple GPUs in SuperPOD

Objectives
  • Apply Horovod to drive multiple GPUs using CIFAR100

Multi-GPU training using CIFAR100

Here is sample Python code that uses TensorFlow and Horovod to train on the CIFAR100 dataset:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization, Dropout
from tensorflow.keras.utils import to_categorical

from tensorflow.keras.layers import Conv2D # convolutional layers to reduce image size
from tensorflow.keras.layers import MaxPooling2D,AveragePooling2D # Max pooling layers to further reduce image size   
from tensorflow.keras.layers import Flatten # flatten data from 2D to column for Dense layer

from tensorflow.keras.datasets import cifar100
import matplotlib.pyplot as plt
# TODO: Step 1: import Horovod
import horovod.tensorflow.keras as hvd
# TODO: Step 1: initialize Horovod
hvd.init()

# TODO: Step 1: pin to a GPU
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    tf.config.experimental.set_memory_growth(gpus[hvd.local_rank()], True)
    tf.config.experimental.set_visible_devices(gpus[hvd.local_rank()], 'GPU')


def plot_acc_loss(history):
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['training', 'validation'], loc='best')
    plt.savefig("calval_hvod.png") # save the accuracy plot as a PNG file
    plt.show()

# load data
(X_train, y_train), (X_test, y_test) = cifar100.load_data()

# Normalized data to range (0, 1):
X_train, X_test = X_train/X_train.max(), X_test/X_test.max()

num_categories=100
y_train = tf.keras.utils.to_categorical(y_train,num_categories)
y_test = tf.keras.utils.to_categorical(y_test,num_categories)

model = Sequential()
model.add(Conv2D(1024, (3, 3), strides=(1, 1), activation='relu', input_shape=(32, 32, 3)))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(.1))
model.add(Conv2D(512, (3, 3), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(.1))
model.add(Conv2D(256, (3, 3), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(.1))

model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(.1))

# Output layer for the 100 CIFAR100 classes
model.add(Dense(100, activation='softmax'))

model.summary()
# create model
model.compile(optimizer='Adam', loss='categorical_crossentropy',  metrics=['accuracy'])

#Train the model
model_CNN = model.fit(X_train, y_train, epochs=40,verbose=1,
                    validation_data=(X_test, y_test))

plot_acc_loss(model_CNN)
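Note that the script above only covers Horovod initialization and GPU pinning. For the training itself to be distributed across the 8 GPUs, the remaining standard Horovod steps would normally also be applied: wrap the optimizer, broadcast the initial variables from rank 0, and restrict checkpointing and logging to rank 0. A minimal sketch of those additions (assumed, not part of the original script) is:

# scale the learning rate by the number of workers and wrap the optimizer with Horovod
opt = tf.keras.optimizers.Adam(learning_rate=0.001 * hvd.size())
opt = hvd.DistributedOptimizer(opt)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

callbacks = [
    # make sure all workers start from the same initial weights
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]
# only rank 0 writes checkpoints, to avoid file clashes
if hvd.rank() == 0:
    callbacks.append(tf.keras.callbacks.ModelCheckpoint('checkpoint-{epoch}.h5'))

model_CNN = model.fit(X_train, y_train, epochs=40,
                      verbose=1 if hvd.rank() == 0 else 0,
                      validation_data=(X_test, y_test), callbacks=callbacks)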

Using SuperPOD to run multi-GPU training

The following batch script is used to submit the training job using 8 GPUs and the NGC TensorFlow 22.02 container:

#!/bin/bash
#SBATCH -J CIFAR100M      # job name to display in squeue
#SBATCH -c 16 --mem=750G      # request 16 CPUs and 750 GB memory
#SBATCH -o output-%j.txt    # standard output file
#SBATCH -e error-%j.txt     # standard error file
#SBATCH --gres=gpu:8
#SBATCH -t 1440              # maximum runtime in minutes
#SBATCH -D /work/users/tuev/cv1/cifar100/multi
#SBATCH --exclusive
#SBATCH --mail-user tuev@smu.edu
#SBATCH --mail-type=end

srun --container-image=$WORK/sqsh/nvidia+tensorflow+22.02-tf2-py3.sqsh --container-mounts=$WORK mpirun -np 8 --allow-run-as-root --oversubscribe python /work/users/tuev/cv1/cifar100/multi/cifar100spod-hvod.py

Make sure to use nvidia-smi to check that all 8 GPUs are being used.

image

Key Points

  • NGC Container, Horovod, Computer Vision


Using YOLOv5 for object detection

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to train YOLOv5 to detect objects

Objectives
  • Download a pretrained YOLOv5 model and images, then apply YOLO to detect objects

YOLOv5

YOLOv5 🚀 is the world’s most loved vision AI, representing Ultralytics open-source research into future vision AI methods, incorporating lessons learned and best practices evolved over thousands of hours of research and development.

To download YOLO, simply go to the GitHub page and clone it to your home or work directory:

$ git clone https://github.com/ultralytics/yolov5.git

Suggestion: It is better to use the $WORK directory to store the code and data, to avoid filling up your $HOME directory.

Open a conda env or container and install requirements

Prior to training the YOLOv5 model, it is better to go to your own conda environment and install the missing libraries. For simplicity, I use the NeMo container:

$ srun -n1 --gres=gpu:1 --container-image $WORK/sqsh/nvidia+nemo+22.04.sqsh --container-mounts=$WORK --time=12:00:00 --pty $SHELL

Go to yolov5 folder and install missing library

$ cd yolov5
$ pip install -r requirements.txt 

Select Pretrained model

Refer to this table for a full comparison of models. Here let's use yolov5l6 for better performance.

image

Dataset for training:

YOLOv5 is trained using the COCO (Common Objects in Context) dataset; here we use coco128, a small subset of the larger COCO dataset consisting of its first 128 images.

The dataset is automatically downloaded when using the flag --data coco128.yaml.

Train YOLOv5

Let's train the model with an image size of 1280 pixels, a batch size of 32, and 10 epochs; the dataset is coco128 and the pretrained model is yolov5l6:

$ python train.py --img 1280 --batch 32 --epochs 10 --data coco128.yaml --weights yolov5l6.pt

Tail of the output from model training:

 Epoch    GPU_mem   box_loss   obj_loss   cls_loss  Instances       Size
        9/9      75.5G    0.02099    0.05281   0.006695        573       1280: 100%|██████████| 4/4 [00:03<00:00,  1.17it/s]
                 Class     Images  Instances          P          R      mAP50   mAP50-95: 100%|██████████| 2/2 [00:01<00:00,  1.01it/s]
                   all        128        929      0.905      0.805      0.902      0.736

10 epochs completed in 0.031 hours.
Optimizer stripped from runs/train/exp/weights/last.pt, 154.9MB
Optimizer stripped from runs/train/exp/weights/best.pt, 154.9MB

Here we see that two weight files are created by the training process, last.pt and best.pt, in the corresponding output location.

We will use the best.pt weights for model inference.

To validate the model inference, we use data from Kaggle.

The Kaggle dataset can be found here: https://www.kaggle.com/competitions/open-images-2019-object-detection/data#

Using Kaggle API, one can simply download the dataset from CLI:

kaggle competitions download -c open-images-2019-object-detection

Unzip open-images-2019-object-detection.zip to get the test folder with 100,000 images.

Inference using YOLOv5 for object detection with Kaggle data

The weights from the trained model best.pt are used:

$ python detect.py --weights runs/train/exp/weights/best.pt --img 1280 --conf 0.25 --source ../test

The model output can be found in runs/detect/exp.

Sample model result:

image

Inference using YOLOv5 for object detection with video

We can also use YOLOv5 for object detection on video. Starting from a sample video like this:

https://user-images.githubusercontent.com/43855029/222778747-b5312f6d-58c9-4f63-9233-93dfa65f8345.mp4

We run the inference with the best pretrained model using the following command:

$ python detect.py --weights runs/train/exp/weights/best.pt --source  video.mp4

The output of the inference looks like this:

detect: weights=['runs/train/exp/weights/best.pt'], source=../test/before_short.mp4, data=data/coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v7.0-56-gc0ca1d2 Python-3.8.13 torch-1.13.0a0+d0d6b1f CUDA:0 (NVIDIA A100-SXM4-80GB, 81251MiB)

Fusing layers... 
Model summary: 157 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
video 1/1 (1/120) /work/users/tuev/YOLO/test/before_short.mp4: 384x640 2 trains, 156.8ms
video 1/1 (2/120) /work/users/tuev/YOLO/test/before_short.mp4: 384x640 2 trains, 8.2ms
video 1/1 (3/120) /work/users/tuev/YOLO/test/before_short.mp4: 384x640 2 trains, 8.2ms
video 1/1 (4/120) /work/users/tuev/YOLO/test/before_short.mp4: 384x640 2 trains, 8.1ms
video 1/1 (5/120) /work/users/tuev/YOLO/test/before_short.mp4: 384x640 2 trains, 8.1ms
video 1/1 (6/120) /work/users/tuev/YOLO/test/before_short.mp4: 384x640 2 trains, 8.1ms
video 1/1 (7/120) /work/users/tuev/YOLO/test/before_short.mp4: 384x640 3 trains, 8.1ms
video 1/1 (8/120) /work/users/tuev/YOLO/test/before_short.mp4: 384x640 2 trains, 8.1ms
video 1/1 (9/120) /work/users/tuev/YOLO/test/before_short.mp4: 384x640 2 trains, 8.2ms
video 1/1 (10/120) /work/users/tuev/YOLO/test/before_short.mp4: 384x640 2 trains, 8.2ms
Speed: 0.3ms pre-process, 9.4ms inference, 2.2ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp2

and the output video is saved in the runs/detect/exp2 folder:

https://user-images.githubusercontent.com/43855029/222778650-f68c4a4f-ad51-4237-92a8-bfb0ad37cd54.mp4

Key Points

  • YOLOv5, object detection, inference


Using Transfer Learning with ResNet50

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to apply transfer learning to detect objects

Objectives
  • Apply ResNet50 model in transfer learning

The following lecture notes are based on NVIDIA's Fundamental Introduction to Deep Learning course, with different input data.

Transfer Learning

So far, we have trained accurate models on large datasets, and also downloaded a pre-trained model that we used with no training necessary. But what if we cannot find a pre-trained model that does exactly what we need, and what if we do not have a sufficiently large dataset to train a model from scratch? In this case, there is a very helpful technique we can use called transfer learning.

With transfer learning, we take a pre-trained model and retrain it on a task that has some overlap with the original training task. A good analogy for this is an artist who is skilled in one medium, such as painting, who wants to learn to practice in another medium, such as charcoal drawing. We can imagine that the skills they learned while painting would be very valuable in learning how to draw with charcoal.

As an example in deep learning, say we have a pre-trained model that is very good at recognizing different types of cars, and we want to train a model to recognize types of motorcycles. A lot of the learnings of the car model would likely be very useful, for instance the ability to recognize headlights and wheels.

Transfer learning is especially powerful when we do not have a large and varied dataset. In this case, a model trained from scratch would likely memorize the training data quickly, but not be able to generalize well to new data. With transfer learning, you can increase your chances of training an accurate and robust model on a small dataset.

Here we just use a simple TensorFlow conda environment or container:

$ srun -n1 -G1 --container-image $WORK/sqsh/nvidia+tensorflow+22.02-tf2-py3.sqsh --container-mounts=$WORK --time=12:00:00 --pty bash -i

Objective

Urban or Rural

In this example, we would like to create a model that recognizes urban and rural scenes. The data can be downloaded from here.

Download the pre-trained model

The ImageNet pre-trained models are often good choices for computer vision transfer learning, as they have learned to classify various different types of images. In doing this, they have learned to detect many different types of features that could be valuable in image recognition.

Let us start by downloading the pre-trained model. Again, this is available directly from the Keras library. As we are downloading, there is going to be an important difference. The last layer of an ImageNet model is a dense layer of 1000 units, representing the 1000 possible classes in the dataset. In our case, we want it to make a different classification: is this urban or rural? Because we want the classification to be different, we are going to remove the last layer of the model. We can do this by setting the flag include_top=False when downloading the model. After removing this top layer, we can add new layers that will yield the type of classification that we want:

from tensorflow.keras.applications.resnet50 import ResNet50
base_model = ResNet50(
    weights='imagenet',  # Load weights pre-trained on ImageNet.
    input_shape=(224, 224, 3),
    include_top=False)
    
base_model.summary()    

Freezing the Base Model

Before we add our new layers onto the pre-trained model, we should take an important step: freezing the model’s pre-trained layers. This means that when we train, we will not update the base layers from the pre-trained model. Instead we will only update the new layers that we add on the end for our new classification. We freeze the initial layers because we want to retain the learning achieved from training on the ImageNet dataset. If they were unfrozen at this stage, we would likely destroy this valuable information. There will be an option to unfreeze and train these layers later, in a process called fine-tuning.

Freezing the base layers is as simple as setting trainable on the model to False.

base_model.trainable = False

Adding new layer

We can now add the new trainable layers to the pre-trained model. They will take the features from the pre-trained layers and turn them into predictions on the new dataset. We will add two layers to the model. First will be a pooling layer like we saw in our earlier convolutional neural network. (If you want a more thorough understanding of the role of pooling layers in CNNs, please read this detailed blog post). We then need to add our final layer, which will classify urban or rural. This will be a densely connected layer with one output.

from tensorflow import keras
inputs = keras.Input(shape=(224, 224, 3))
# Separately from setting trainable on the model, we set training to False 
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
# A Dense classifier with a single unit (binary classification)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

model.summary()

Keras gives us a nice summary here, as it shows the ResNet50 pre-trained model as one unit, rather than showing all of the internal layers. It is also worth noting that we have many non-trainable parameters as we have frozen the pre-trained model.

Compile the model

As with our previous exercises, we need to compile the model with loss and metrics options. We have to make some different choices here. In previous cases we had many categories in our classification problem. As a result, we picked categorical crossentropy for the calculation of our loss. In this case we only have a binary classification problem (Urban or Rural), and so we will use binary crossentropy. Further detail about the differences between the two can be found here. We will also use binary accuracy instead of traditional accuracy.

By setting from_logits=True we inform the loss function that the output values are not normalized (e.g. with softmax).

# Important to use binary crossentropy and binary accuracy as we now have a binary classification problem
model.compile(loss=keras.losses.BinaryCrossentropy(from_logits=True), metrics=[keras.metrics.BinaryAccuracy()])

Augmenting the data

Now that we are dealing with a very small dataset, it is especially important that we augment our data. As before, we will make small modifications to the existing images, which will allow the model to see a wider variety of images to learn from. This will help it learn to recognize new pictures of Urban/Rural instead of just memorizing the pictures it trains on.

from tensorflow.keras.preprocessing.image import ImageDataGenerator
# create a data generator
datagen = ImageDataGenerator(
        samplewise_center=True,  # set each sample mean to 0
        rotation_range=10,  # randomly rotate images in the range (degrees, 0 to 180)
        zoom_range = 0.1, # Randomly zoom image 
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images
        vertical_flip=False) # we don't expect the image to be taken upside down

Loading the data

We have seen datasets in a couple different formats so far. In the MNIST exercise, we were able to download the dataset directly from within the Keras library. For the sign language dataset, the data was in CSV files. For this exercise, we are going to load images directly from folders using Keras’ flow_from_directory function. We have set up our directories to help this process go smoothly as our labels are inferred from the folder names. In the data directory, we have train and validation directories, which each have folders for images of Urban or Rural. Feel free to explore the images to get a sense of our dataset.

Note that flow_from_directory will also allow us to size our images to match the model: 224x224 pixels with 3 channels.

# load and iterate training dataset
train_it = datagen.flow_from_directory('data/train/', 
                                       target_size=(224, 224), 
                                       color_mode='rgb', 
                                       class_mode='binary', 
                                       batch_size=8)
# load and iterate validation dataset
valid_it = datagen.flow_from_directory('data/val/', 
                                      target_size=(224, 224), 
                                      color_mode='rgb', 
                                      class_mode='binary', 
                                      batch_size=8)

Training the model

Time to train our model and see how it does. Recall that when using a data generator, we have to explicitly set the number of steps_per_epoch:

model.fit(train_it, steps_per_epoch=12, validation_data=valid_it, validation_steps=4, epochs=20)

Discussion of Results

Both the training and validation accuracy should be quite high. This is a pretty awesome result! We were able to train on a small dataset, but because of the knowledge transferred from the ImageNet model, it was able to achieve high accuracy and generalize well. This means it has a very good sense of Urban and Rural.

If you saw some fluctuation in the validation accuracy, that is okay too. We have a technique for improving our model in the next section.

Fine tuning the model

Now that the new layers of the model are trained, we have the option to apply a final trick to improve the model, called fine-tuning. To do this we unfreeze the entire model, and train it again with a very small learning rate. This will cause the base pre-trained layers to take very small steps and adjust slightly, improving the model by a small amount.

Note that it is important to only do this step after the model with frozen layers has been fully trained. The untrained pooling and classification layers that we added to the model earlier were randomly initialized. This means they needed to be updated quite a lot to correctly classify the images. Through the process of backpropagation, large initial updates in the last layers would have caused potentially large updates in the pre-trained layers as well. These updates would have destroyed those important pre-trained features. However, now that those final layers are trained and have converged, any updates to the model as a whole will be much smaller (especially with a very small learning rate) and will not destroy the features of the earlier layers.

Let’s try unfreezing the pre-trained layers, and then fine tuning the model:

# Unfreeze the base model
base_model.trainable = True

# It's important to recompile your model after you make any changes
# to the `trainable` attribute of any inner layer, so that your changes
# are taken into account
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate = .00001),  # Very low learning rate
              loss=keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=[keras.metrics.BinaryAccuracy()])
model.fit(train_it, steps_per_epoch=12, validation_data=valid_it, validation_steps=4, epochs=10)

Examine the Prediction

Now that we have a well-trained model, it is time to use it to detect Urban or Rural. We can start by looking at the predictions that come from the model.

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from tensorflow.keras.preprocessing import image as image_utils
from tensorflow.keras.applications.imagenet_utils import preprocess_input

def show_image(image_path):
    image = mpimg.imread(image_path)
    plt.imshow(image)

def make_predictions(image_path):
    show_image(image_path)
    image = image_utils.load_img(image_path, target_size=(224, 224))
    image = image_utils.img_to_array(image)
    image = image.reshape(1,224,224,3)
    image = preprocess_input(image)
    preds = model.predict(image)
    return preds
make_predictions('data/val/urban/urban_20.jpeg')

image

make_predictions('data/val/rural/rural5.jpeg')

image

It looks like a negative prediction means the image is Rural and a positive prediction means it is Urban. We can use this information to differentiate between the two types of scenery.

def detect_img(image_path):
    preds = make_predictions(image_path)
    if preds[0]<0:
        print("It's Rural! So freshy")
    else:
        print("It's Urban! So developed!")
detect_img('data/val/rural/rural15.jpeg')

image

detect_img('data/val/urban/urban_40.jpeg')

image

Summary

Great work! With transfer learning, you have built a highly accurate model using a very small dataset. This can be an extremely powerful technique, and be the difference between a successful project and one that cannot get off the ground. We hope these techniques can help you out in similar situations in the future!

There is a wealth of helpful resources for transfer learning in the NVIDIA Transfer Learning Toolkit.

Key Points

  • ResNet50, object detection, transfer learning


Using Stable Diffusion with SuperPOD

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to use the Stable Diffusion model

Objectives
  • Learn how to download and install Stable Diffusion from HuggingFace

Users can now access Stable Diffusion from Hugging Face while still utilizing the power of SuperPOD's A100 GPUs to run inference on any incoming prompt. The following example uses a Stable Diffusion model from Hugging Face.

First install the diffusers library:

$ pip install diffusers --upgrade

Then run the following Python code:

from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
pipe.to("cuda")

# if using torch < 2.0
# pipe.enable_xformers_memory_efficient_attention()

prompt = "An astronaut riding a green horse"

images = pipe(prompt=prompt).images[0]
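The pipeline returns PIL images, so the result can be saved to disk for viewing (the filename below is arbitrary):

# save the generated image next to the script
images.save("astronaut_green_horse.png")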

image

Key Points

  • Stable Diffusion, Prompt, HuggingFace


Using Pre-trained model from HuggingFace

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • How to use pre-trained models already available from the Hugging Face hub

Objectives
  • To master the usage of pre-trained deep learning models from Hugging Face

Hugging Face hub

Transformers library

Model task

The screenshot below describes the model tasks from Hugging Face, which cover many different areas from computer vision to NLP, audio, and reinforcement learning:

image

Pipeline for inference

Pipeline for NLP Sentiment Analysis

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I am so excited to use the new SuperPOD from NVIDIA")

[{'label': 'POSITIVE', 'score': 0.9995261430740356}]
classifier(
    ["I am so excited to use the new SuperPOD from NVIDIA", "I hate running late"])

[{'label': 'POSITIVE', 'score': 0.9995261430740356},
 {'label': 'NEGATIVE', 'score': 0.9943193197250366}]

Pipeline Text Generation

from transformers import pipeline
generator = pipeline("text-generation")
generator("Using SMU latest HPC cluster NVIDIA SuperPOD,  you will be able to")

[{'generated_text': 'Using SMU latest HPC cluster NVIDIA SuperPOD,  you will be able to connect to other SSE nodes such as the following and use them as a HPC node:\n\n[CPU: CPU1, GIGABYTE'}]

Pipeline for Mask filling

from transformers import pipeline

unmasker = pipeline("fill-mask")
unmasker("This course will teach you all about <mask> models.", top_k=2)

[{'score': 0.19619698822498322,
  'token': 30412,
  'token_str': ' mathematical',
  'sequence': 'This course will teach you all about mathematical models.'},
 {'score': 0.04052705690264702,
  'token': 38163,
  'token_str': ' computational',
  'sequence': 'This course will teach you all about computational models.'}]

Pipeline for Named Entity Recognition

from transformers import pipeline

ner = pipeline("ner", grouped_entities=True)
ner("My name is Tue Vu and I work at SMU in Dallas")

[{'entity_group': 'PER',
  'score': 0.9868829,
  'word': 'Tue Vu',
  'start': 11,
  'end': 17},
 {'entity_group': 'ORG',
  'score': 0.9965092,
  'word': 'SMU',
  'start': 32,
  'end': 35},
 {'entity_group': 'LOC',
  'score': 0.9950755,
  'word': 'Dallas',
  'start': 39,
  'end': 45}]

Pipeline for Question Answering

from transformers import pipeline

question_answerer = pipeline("question-answering")
question_answerer(
    question="Where do I work?",
    context="My name is Tue Vu and I work at SMU in Dallas",
)

{'score': 0.3651700019836426, 'start': 32, 'end': 35, 'answer': 'SMU'}

Pipeline for Conversational

from transformers import pipeline, Conversation
converse = pipeline("conversational")

conversation_1 = Conversation("What do you think about using HPC SuperPOD")
conversation_2 = Conversation("Do you believe in God?")
converse([conversation_1, conversation_2])

Answer:

[Conversation id: 44cf473c-29f2-4b44-be6c-15352dab13a2 
 user >> What do you think about using HPC SuperPOD 
 bot >> I think it's a good idea, but I don't think it's a good idea to use it for a lot of things. ,
 Conversation id: 489d923c-f127-4847-8cde-972c77470230 
 user >> What do you do to optimize the Python workflow? 
 bot >> I believe in the power of love.]

Pipeline for Computer Vision - Image Classification

from transformers import pipeline
clf = pipeline("image-classification")

Display the image:

import urllib.request
from io import BytesIO
from PIL import Image  # needed to open the downloaded image

url = 'https://t4.ftcdn.net/jpg/02/66/72/41/360_F_266724172_Iy8gdKgMa7XmrhYYxLCxyhx6J7070Pr8.jpg'
with urllib.request.urlopen(url) as response:
    img = Image.open(BytesIO(response.read()))
img

image

Model Inference

clf(img)

[{'score': 0.49216628074645996, 'label': 'Egyptian cat'},
 {'score': 0.41306015849113464, 'label': 'tabby, tabby cat'},
 {'score': 0.050162095576524734, 'label': 'tiger cat'},
 {'score': 0.012556081637740135, 'label': 'lynx, catamount'},
 {'score': 0.00524393143132329, 'label': 'ping-pong ball'}]

Key Points

  • Hugging Face, pre-trained, pipeline