Introduction
This tutorial explains the quickstart examples and some of the core abstractions FastAI.jl is built on.
On the quickstart page, we showed how to train models on common tasks in a few lines of code:
dataset = Datasets.loadtaskdata(Datasets.datasetpath("imagenette2-160"), ImageClassificationTask)
method = ImageClassification(Datasets.getclassesclassification("imagenette2-160"), (160, 160))
dls = methoddataloaders(dataset, method, 16)
model = methodmodel(method, Models.xresnet18())
learner = Learner(model, dls, ADAM(), methodlossfn(method), ToGPU(), Metrics(accuracy))
fitonecycle!(learner, 5)
Let’s unpack each line.
Data containers
dataset = Datasets.loadtaskdata(Datasets.datasetpath("imagenette2-160"), ImageClassificationTask)
mapobs((input = FastAI.Datasets.loadfile, target = FastAI.Datasets.var"#27#32"()), DataSubset(::FastAI.Datasets.FileDataset, ::Vector{Int64}, ObsDim.Undefined())
13394 observations)
This line downloads and loads the ImageNette image classification dataset, a small subset of ImageNet with 10 different classes. dataset is a data container that can be used to load individual observations, here of images and the corresponding labels. We can use getobs(dataset, i) to load the i-th observation and nobs to find out how many observations there are.
image, class = getobs(dataset, 1000)
class = "n02102040"
nobs(dataset)
13394
To train on a different dataset, you could replace dataset with other data containers made up of pairs of images and classes.
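For example, a custom data container can be built from a folder of images with one subfolder per class. The following is only a minimal sketch, not part of the quickstart; it assumes the mapobs, Datasets.loadfile, getobs, and nobs helpers shown above and that a plain vector of file paths acts as a data container:
dir = Datasets.datasetpath("imagenette2-160")  # or any folder with one subfolder per class
paths = [joinpath(root, f) for (root, _, fs) in walkdir(dir) for f in fs if endswith(f, ".JPEG")]
labelfromfile(path) = basename(dirname(path))  # class name taken from the parent folder
customdataset = mapobs(p -> (Datasets.loadfile(p), labelfromfile(p)), paths)
image, class = getobs(customdataset, 1)        # the image is only loaded when accessed
nobs(customdataset)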
Method
classes = Datasets.getclassesclassification("imagenette2-160")
method = ImageClassification(classes, (160, 160))
ImageClassification() with 10 classes
Here we define ImageClassification, which defines how data is processed before being fed to the model and how model outputs are turned into predictions. classes is a vector of strings naming each class, and (160, 160) is the size of the images that are input to the model.
ImageClassification is a LearningMethod, an abstraction that encapsulates the logic and configuration for training models on a specific learning task. See learning methods to find out more about how they can be used and how to create custom learning methods.
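To see what a learning method does to a single observation, you can run its encoding step by hand. This is only a sketch, assuming the DLPipelines.encode function and Training() context that appear in this pipeline return the encoded (x, y) pair:
sample = getobs(dataset, 1000)                  # a raw (image, class) observation
x, y = DLPipelines.encode(method, Training(), sample)
summary(x), summary(y)                          # cropped/normalized image array and one-hot encoded class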
Data loaders
dls = methoddataloaders(dataset, method, 16)
LoadError("string", 1, MethodError(DLPipelines.encode, (ImageClassification() with 10 classes, Training(), (input = ColorTypes.RGB{FixedPointNumbers.N0f8}[RGB{N0f8}(0.51,0.451,0.329) RGB{N0f8}(0.529,0.471,0.357) … RGB{N0f8}(1.0,0.953,0.957) RGB{N0f8}(1.0,0.976,0.984); RGB{N0f8}(0.518,0.451,0.333) RGB{N0f8}(0.533,0.467,0.357) … RGB{N0f8}(1.0,0.957,0.953) RGB{N0f8}(1.0,0.984,0.984); … ; RGB{N0f8}(0.396,0.392,0.373) RGB{N0f8}(0.396,0.392,0.373) … RGB{N0f8}(0.549,0.533,0.486) RGB{N0f8}(0.545,0.529,0.482); RGB{N0f8}(0.396,0.392,0.373) RGB{N0f8}(0.4,0.396,0.376) … RGB{N0f8}(0.545,0.529,0.482) RGB{N0f8}(0.541,0.525,0.478)], target = "n03417042")), 0x000000000000767c))
Next we turn the data container into training and validation data loaders. These take care of efficiently loading batches of data (by default in parallel). The observations are already preprocessed using the information in method and then batched together. Let’s look at a single batch:
traindl, valdl = dls
(xs, ys), _ = iterate(traindl)
summary.((xs, ys))
LoadError("string", 1, UndefVarError(:dls))
xs is a batch of cropped and normalized images with dimensions (height, width, color channels, batch size), and ys is a batch of one-hot encoded classes with dimensions (classes, batch size).
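For the configuration above (160 by 160 crops, 10 classes, and a batch size of 16), the expected shapes are:
size(xs)  # (160, 160, 3, 16): height, width, color channels, batch size
size(ys)  # (10, 16): classes, batch size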
Model
model = methodmodel(method, Models.xresnet18())
LoadError("string", 1, MethodError(Flux.outdims, (BatchNorm(32, λ = relu), (80, 80)), 0x000000000000767c))
Now we create a Flux.jl model. methodmodel is part of the learning method interface that knows how to smartly construct an image classification model from different backbone architectures. Here a classification head with the appropriate number of classes is stacked on a slightly modified version of the ResNet architecture.
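As a quick sanity check (a sketch under the same shape assumptions as above), the assembled model maps a batch of images to one score per class:
ŷs = model(xs)   # size (10, 16): one score per class for each image in the batch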
Learner
learner = Learner(model, dls, ADAM(), methodlossfn(method), ToGPU(), Metrics(accuracy))
LoadError("string", 1, UndefVarError(:model))
Finally we bring the model and data loaders together with an optimizer and loss function in a Learner. The Learner stores all state for training the model. It also features a powerful, extensible callback system enabling checkpointing, hyperparameter scheduling, TensorBoard logging, and many other features. Here we use the ToGPU() callback so that the model and batch data will be transferred to an available GPU, and Metrics(accuracy) to track the classification accuracy during training.
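Additional callbacks are simply passed as further arguments to Learner. As an illustrative sketch (the Checkpointer callback and its signature are assumptions about FluxTraining.jl, not part of the quickstart), periodic model checkpointing could be added like this:
learner = Learner(model, dls, ADAM(), methodlossfn(method),
                  ToGPU(), Metrics(accuracy), Checkpointer("checkpoints/"))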
With that setup, training learner is dead simple:
fitonecycle!(learner, 5)