Introduction

This tutorial explains the quickstart examples and some of the core abstractions FastAI.jl is built on.

On the quickstart page, we showed how to train models on common tasks in a few lines of code:

dataset = Datasets.loadtaskdata(Datasets.datasetpath("imagenette2-160"), ImageClassificationTask)
method = ImageClassification(Datasets.getclassesclassification("imagenette2-160"), (160, 160))
dls = methoddataloaders(dataset, method, 16)
model = methodmodel(method, Models.xresnet18())
learner = Learner(model, dls, ADAM(), methodlossfn(method), ToGPU(), Metrics(accuracy))
fitonecycle!(learner, 5)

Let’s unpack each line.

Data containers

dataset = Datasets.loadtaskdata(Datasets.datasetpath("imagenette2-160"), ImageClassificationTask)
mapobs((input = FastAI.Datasets.loadfile, target = FastAI.Datasets.var"#27#32"()), DataSubset(::FastAI.Datasets.FileDataset, ::Vector{Int64}, ObsDim.Undefined())
 13394 observations)

This line downloads and loads the ImageNette image classification dataset, a small subset of ImageNet with 10 different classes. dataset is a data container that can be used to load individual observations, here of images and the corresponding labels. We can use getobs(dataset, i) to load the i-th observation and nobs to find out how many observations there are.

image, class = getobs(dataset, 1000)
class = "n02102040"
nobs(dataset)
13394

To train on a different dataset, you can replace dataset with any other data container made up of pairs of images and classes.
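Any type that implements the data container interface can be used here. The sketch below shows what a minimal custom container might look like; the struct name and file layout are made up for illustration, and the interface assumed is the `getobs`/`nobs` pair from LearnBase.jl that the output above also references:

```julia
import LearnBase: getobs, nobs

# Hypothetical container: a list of image files where the parent
# directory name encodes the class, e.g. "cats/001.jpg".
struct ImageFolderDataset
    files::Vector{String}
end

# Load the i-th observation as an (input, target) pair.
function getobs(data::ImageFolderDataset, i::Int)
    file = data.files[i]
    (input = FastAI.Datasets.loadfile(file), target = basename(dirname(file)))
end

# Report how many observations the container holds.
nobs(data::ImageFolderDataset) = length(data.files)
```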

Method

classes = Datasets.getclassesclassification("imagenette2-160")
method = ImageClassification(classes, (160, 160))
ImageClassification() with 10 classes

Here we create an ImageClassification learning method, which defines how data is processed before being fed to the model and how model outputs are turned into predictions. classes is a vector of strings naming each class, and (160, 160) is the size of the images that are input to the model.

ImageClassification is a LearningMethod, an abstraction that encapsulates the logic and configuration for training models on a specific learning task. See learning methods to find out more about how they can be used and how to create custom learning methods.
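At its core, a learning method implements encoding logic through the DLPipelines.jl interface. A rough, simplified sketch of what a custom method might look like follows; the struct, its fields, and the helper functions are illustrative only, not the exact interface:

```julia
import DLPipelines

# Illustrative only: a learning method bundles the task configuration
# needed to encode raw samples into model-ready arrays.
struct MyImageClassification <: DLPipelines.LearningMethod
    classes::Vector{String}
    size::Tuple{Int,Int}
end

# Encode a raw (input, target) sample. `context` is e.g. Training()
# or Validation(), so augmentation can differ between the two.
# `preprocess` and `onehot` are hypothetical helpers.
function DLPipelines.encode(method::MyImageClassification, context, sample)
    x = preprocess(sample.input, method.size, context)
    y = onehot(sample.target, method.classes)
    return (x, y)
end
```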

Data loaders

dls = methoddataloaders(dataset, method, 16)

Next we turn the data container into training and validation data loaders. These take care of efficiently loading batches of data (by default in parallel). The observations are already preprocessed using the information in method and then batched together. Let’s look at a single batch:

traindl, valdl = dls
(xs, ys), _ = iterate(traindl)
summary.((xs, ys))

xs is a batch of cropped and normalized images with dimensions (height, width, color channels, batch size) and ys a batch of one-hot encoded classes with dimensions (classes, batch size).
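Concretely, with 160×160 images, 10 classes, and a batch size of 16, the shapes should come out as follows (the channel count of 3 assumes RGB input):

```julia
size(xs)  # (160, 160, 3, 16): height, width, color channels, batch size
size(ys)  # (10, 16): classes, batch size
```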

Model

model = methodmodel(method, Models.xresnet18())

Now we create a Flux.jl model. methodmodel is a part of the learning method interface that knows how to smartly construct an image classification model from different backbone architectures. Here a classification head with the appropriate number of classes is stacked on a slightly modified version of the ResNet architecture.
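Conceptually, this amounts to chaining a backbone with a freshly constructed head. The sketch below is a simplification: the pooling/flattening layers and the feature size of 512 are assumptions, and in practice methodmodel infers all of this for you:

```julia
using Flux

backbone = Models.xresnet18()
# A classification head: global average pooling, flattening, and a
# Dense layer with one output per class. 512 is assumed to be the
# backbone's output feature size.
head = Chain(AdaptiveMeanPool((1, 1)), Flux.flatten, Dense(512, 10))
model = Chain(backbone, head)
```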

Learner

learner = Learner(model, dls, ADAM(), methodlossfn(method), ToGPU(), Metrics(accuracy))

Finally we bring the model and data loaders together with an optimizer and loss function in a Learner. The Learner stores all state for training the model. It also features a powerful, extensible callback system enabling checkpointing, hyperparameter scheduling, TensorBoard logging, and many other features. Here we use the ToGPU() callback so that model and batch data will be transferred to an available GPU and Metrics(accuracy) to track the classification accuracy during training.
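Additional callbacks are simply appended to the Learner's arguments, the same way ToGPU() and Metrics(accuracy) are passed above. As a hedged example, Checkpointer is a callback from FluxTraining.jl for saving the model during training; check the FluxTraining.jl documentation for its exact constructor:

```julia
# Same Learner as above, plus a checkpointing callback (assumed
# constructor signature) that writes model snapshots to a folder.
learner = Learner(model, dls, ADAM(), methodlossfn(method),
                  ToGPU(), Metrics(accuracy), Checkpointer("checkpoints/"))
```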

With that setup, training learner is dead simple:

fitonecycle!(learner, 5)