High-level Interface
Integrating Augmentor into an existing project should in general not require any major changes to your code. In most cases it should break down to the three basic steps outlined below. We will spend the rest of this document investigating these in more detail.
Import Augmentor into the namespace of your program.
using Augmentor
Define a (stochastic) image processing pipeline by chaining the desired operations using |> and *.
julia> pl = FlipX() * FlipY() |> Zoom(0.9:0.1:1.2) |> CropSize(64,64)
3-step Augmentor.ImmutablePipeline:
1.) Either: (50%) Flip the X axis. (50%) Flip the Y axis.
2.) Zoom by I ∈ {0.9×0.9, 1.0×1.0, 1.1×1.1, 1.2×1.2}
3.) Crop a 64×64 window around the center
Apply the pipeline to the existing image or set of images.
img_processed = augment(img_original, pl)
Depending on the complexity of your problem, you may want to iterate between steps 2 and 3 to identify an appropriate pipeline.
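As a minimal sketch of iterating between steps 2 and 3, one can apply a few candidate pipelines to the test image bundled with Augmentor and compare the results. The name candidates and the two pipelines chosen here are illustrative only.

```julia
using Augmentor

img = testpattern()  # test image bundled with Augmentor

# Two candidate pipelines to compare (illustrative choices).
candidates = (
    FlipX() |> CropSize(64, 64),
    FlipX() * FlipY() |> Zoom(0.9:0.1:1.2) |> CropSize(64, 64),
)

# Apply each candidate and keep the results for inspection.
results = [augment(img, pl) for pl in candidates]
foreach(out -> println(size(out)), results)
```

Inspecting results visually (or just their sizes, as here) is usually enough to decide whether a pipeline needs another round of adjustment.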
Defining a Pipeline
In Augmentor, a (stochastic) image-processing pipeline can be understood as a sequence of operations, for which the parameters can (but need not) be random variables. What that essentially means is that the user explicitly specifies which image operation to perform in what order. A complete list of available operations can be found at Supported Operations.
To start off with a simple example, let us assume that we want to first rotate our image(s) counter-clockwise by 14°, then crop them down to the biggest possible square, and lastly resize the image(s) to a fixed size of 64 by 64 pixels. Such a pipeline would be defined as follows:
julia> pl = Rotate(14) |> CropRatio(1) |> Resize(64,64)
3-step Augmentor.ImmutablePipeline:
1.) Rotate 14 degree
2.) Crop to 1:1 aspect ratio
3.) Resize to 64×64
Notice that in the example above there is no room for randomness. In other words, the same input image would always result in the same output image given that pipeline. If we wish for more variation, we can pass a vector of candidate values as the parameter instead of a single number.
In this subsection we will focus only on how to define a pipeline, without thinking too much about how to apply that pipeline to an actual image. The latter will be the main topic of the rest of this document.
Say we wish to adapt our pipeline such that the rotation is a little more random. More specifically, let's say we want our image to be rotated by either -10°, -5°, 5°, 10°, or not at all. Other than that change we will leave the rest of the pipeline as is.
julia> pl = Rotate([-10,-5,0,5,10]) |> CropRatio(1) |> Resize(64,64)
3-step Augmentor.ImmutablePipeline:
1.) Rotate by θ ∈ [-10, -5, 0, 5, 10] degree
2.) Crop to 1:1 aspect ratio
3.) Resize to 64×64
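To see the effect of the vector parameter, the following sketch applies such a pipeline several times to the bundled test image. Each call draws its own rotation angle, so the outputs may differ from each other, while the final Resize step keeps the output size fixed.

```julia
using Augmentor

img = testpattern()
pl = Rotate([-10, -5, 0, 5, 10]) |> CropRatio(1) |> Resize(64, 64)

# Every call to augment samples one of the five angles anew.
outs = [augment(img, pl) for _ in 1:5]

# The sizes agree even though the pixel content may not.
println(unique(size.(outs)))
```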
Variation in the parameters is only one of the two main ways to introduce randomness to our pipeline. Additionally, one can specify that an operation should be sampled randomly from a chosen set of operations. This can be accomplished using a utility operation called Either, which has its own convenience syntax.
As an example, let us assume we wish to first either mirror our image(s) horizontally, or vertically, or not at all, and then crop it down to a size of 100 by 100 pixels around the image's center. We can specify the "either" using the * operator.
julia> pl = FlipX() * FlipY() * NoOp() |> CropSize(100,100)
2-step Augmentor.ImmutablePipeline:
1.) Either: (33%) Flip the X axis. (33%) Flip the Y axis. (33%) No operation.
2.) Crop a 100×100 window around the center
It is also possible to specify the odds for such an "either". For example, we may want the NoOp to be twice as likely as either of the mirroring options.
julia> pl = (1=>FlipX()) * (1=>FlipY()) * (2=>NoOp()) |> CropSize(100,100)
2-step Augmentor.ImmutablePipeline:
1.) Either: (25%) Flip the X axis. (25%) Flip the Y axis. (50%) No operation.
2.) Crop a 100×100 window around the center
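The stated odds can be checked empirically. The sketch below applies only the "either" step to a tiny numeric array many times and counts how often the input comes back unchanged, which should happen whenever NoOp is drawn. The array and the sample count n are arbitrary choices made for illustration.

```julia
using Augmentor

# Weighted "either" on its own, without the cropping step.
pl = (1=>FlipX()) * (1=>FlipY()) * (2=>NoOp())

# Small asymmetric array, so that both flips differ from the input.
img = Float64[1 2; 3 4]

# Count how often the input is returned unchanged (i.e. NoOp was drawn).
n = 2_000
unchanged = count(_ -> augment(img, pl) == img, 1:n)
println("observed NoOp frequency: ", unchanged / n)
```

With odds of 1:1:2 the observed frequency should settle near one half as n grows.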
Now that we know how to define a pipeline, let us think about how to apply it to an image or a set of images.
The design behind operation types
The purpose of an operation is to simply serve as a "dumb placeholder" to specify the intent and parameters of the desired transformation. What that means is that a pipeline of operations can be thought of as a list of instructions (a cookbook of sorts), that Augmentor uses internally to construct the required code that implements the desired behaviour in the most efficient way it can.
The way an operation is implemented depends on the rest of the specified pipeline. For example, Augmentor knows three different ways to implement the behaviour of the operation Rotate90 and will choose the one that best coincides with the other operations of the pipeline and their concrete order.
Call the function rotl90 of Julia's base library, which makes use of the fact that a 90 degree rotation can be implemented very efficiently. While by itself this is the fastest way to compute the result, this function is "eager" and will allocate a new array. If Rotate90 is followed by another operation this may not be the best choice, since it will create a temporary image that is later discarded.
Create a SubArray of a PermutedDimsArray. This is more or less a lazy version of rotl90 that makes use of the fact that a 90 degree rotation can be described 1-to-1 using just the original pixels. By itself this strategy is slower than rotl90, but if it is followed by an operation such as Crop or CropSize it can be significantly faster. The reason for this is that it avoids the computation of unused pixels and also any allocation of temporary memory. The computation overhead per output pixel, while small, grows linearly with the number of chained operations.
Create an AffineMap using a rotation matrix that describes a 90 degree rotation around the center of the image. This will result in a lazy transformation of the original image that is further composable with other AffineMap objects. This is the slowest available strategy, unless multiple affine operations are chained together. If that is the case, then chaining the operations can be reduced to composing the tiny affine maps instead. This effectively fuses multiple operations into a single operation for which the computation overhead per output pixel remains approximately constant with respect to the number of chained operations.
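The difference between the first two strategies can be illustrated with plain Julia, independent of Augmentor. This is only a sketch of the idea using Base functions, not the code Augmentor generates internally; the matrix A stands in for an image.

```julia
A = collect(reshape(1:12, 3, 4))  # 3×4 matrix standing in for an image

# Eager strategy: rotl90 allocates a new 4×3 array right away.
eager = rotl90(A)

# Lazy strategy: a view into a permuted wrapper around A.
# No pixel is copied until an element is actually read.
lazy = view(PermutedDimsArray(A, (2, 1)), size(A, 2):-1:1, :)

# Both describe the same rotated image.
@assert lazy == eager

# Cropping the lazy version only reads the requested pixels of A.
crop = lazy[1:2, 1:2]
@assert crop == eager[1:2, 1:2]
```

The third strategy is not shown here; it relies on AffineMap from the CoordinateTransformations package.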
Loading the Example Image
Augmentor ships with a custom example image, which was specifically designed for visualizing augmentation effects. It can be accessed by calling the function testpattern(). That said, doing so explicitly should rarely be necessary in practice, because most high-level functions will default to using testpattern() if no other image is specified.
Augmentor.testpattern — Function
testpattern([T=RGBA{N0f8}]; ratio=1.0) -> Matrix{RGBA{N0f8}}
Load and return the provided 300x400 test image. Additional args and kwargs are passed to imresize.
The returned image was specifically designed to be informative about the effects of the applied augmentation operations. It is thus well suited to prototype an augmentation pipeline, because it makes it easy to see what kind of effects one can achieve with it.
using Augmentor
img = testpattern()
Augmenting an Image
Once a pipeline is constructed it can be applied to an image (i.e. AbstractArray{<:ColorTypes.Colorant}), or even just to an array of numbers (i.e. AbstractArray{<:Number}), using the function augment.
Augmentor.augment — Function
augment([img], pipeline) -> out
augment(img=>mask, pipeline) -> out
Apply the operations of the given pipeline sequentially to the given image img and return the resulting image out. For the second method, see Semantic wrappers below.
julia> img = testpattern();
julia> out = augment(img, FlipX() |> FlipY())
300×400 Array{RGBA{N0f8},2}:
[...]
The parameter img can either be a single image, or a tuple of multiple images. In case img is a tuple of images, its elements will be assumed to be conceptually connected. Consequently, all images in the tuple will take the exact same path through the pipeline, even when randomness is involved. This is useful for the purpose of image segmentation, for which the input and output are both images that need to be transformed exactly the same way.
img1 = testpattern()
img2 = Gray.(testpattern())
out1, out2 = augment((img1, img2), FlipX() |> FlipY())
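The shared random path is easiest to observe with a stochastic pipeline. In the sketch below, two identical numeric arrays are augmented together with a randomly chosen flip; if each array drew its operation independently, the two outputs would sometimes disagree. The arrays a and b are made-up inputs for illustration.

```julia
using Augmentor

a = collect(reshape(1.0:12.0, 3, 4))
b = copy(a)  # second "image", identical to the first for easy comparison

# Single-step pipeline that picks one of three operations at random.
pl = FlipX() * FlipY() * NoOp()

# Both inputs share the same random draw, so the outputs should always agree.
same = all(1:50) do _
    out1, out2 = augment((a, b), pl)
    out1 == out2
end
println(same)
```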
The parameter pipeline can be an Augmentor.Pipeline, a tuple of Augmentor.Operation, or a single Augmentor.Operation.
img = testpattern()
augment(img, FlipX() |> FlipY())
augment(img, (FlipX(), FlipY()))
augment(img, FlipX())
If img is omitted, Augmentor will use the augmentation test image provided by the function testpattern as the input image.
augment(FlipX())
Semantic wrappers
It is possible to define more flexible augmentation pipelines by wrapping the input into a semantic wrapper. Semantic wrappers determine the meaning of an input, and ensure that only appropriate operations are applied to that input.
Currently implemented semantic wrappers are:
Augmentor.Mask: Wraps a segmentation mask. Allows only spatial transformations. The convenient usage for this is augment(img => mask, pipeline).
Example
using Augmentor
using Augmentor: unwrap, Mask
img, mask = testpattern(), testpattern()
pl = Rotate90() |> GaussianBlur(3)
aug_img, aug_mask = unwrap.(augment((img, Mask(mask)), pl))
# Equivalent usage
aug_img, aug_mask = augment(img => mask, pl)
# GaussianBlur will be skipped for our `mask`
aug_mask == augment(mask, Rotate90())
# output
true
Augmentor.Mask — Type
Mask wraps a segmentation mask.
We also provide a mutating version of augment that writes the output into preallocated memory. While this function avoids allocation, it does have the caveat that the size of the output image must be known beforehand (and thus must not be random).
Augmentor.augment! — Function
augment!(out, img, pipeline) -> out
Apply the operations of the given pipeline sequentially to the image img and write the resulting image into the preallocated parameter out. For convenience out is also the function's return-value.
img = testpattern()
out = similar(img)
augment!(out, img, FlipX() |> FlipY())
The parameter img can either be a single image, or a tuple of multiple images. In case img is a tuple of images, the parameter out has to be a tuple of the same length and ordering. See augment for more information.
imgs = (testpattern(), Gray.(testpattern()))
outs = (similar(imgs[1]), similar(imgs[2]))
augment!(outs, imgs, FlipX() |> FlipY())
The parameter pipeline can be an Augmentor.Pipeline, a tuple of Augmentor.Operation, or a single Augmentor.Operation.
img = testpattern()
out = similar(img)
augment!(out, img, FlipX() |> FlipY())
augment!(out, img, (FlipX(), FlipY()))
augment!(out, img, FlipX())
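When the pipeline changes the image size, out has to be preallocated with the size of the result rather than the size of the input. The following sketch assumes that augment! accepts such an out as long as its size matches what the pipeline produces; the 100×100 input and the 64×64 crop are arbitrary illustrative values.

```julia
using Augmentor

img = rand(100, 100)
pl = FlipX() |> CropSize(64, 64)  # output size is always 64×64

# Preallocate using the pipeline's output size, not the input size.
out = similar(img, 64, 64)
augment!(out, img, pl)

# The pipeline has no random parameters, so augment yields the same result.
println(out == augment(img, pl))
```

A pipeline whose output size depends on a random parameter would not fit this pattern, since no single out could be allocated ahead of time.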
Augmenting Image Batches
In most machine learning scenarios we will want to process a whole batch of images at once, instead of a single image at a time. For this reason we provide the function augmentbatch!, which also supports multi-threading.
Augmentor.augmentbatch! — Function
augmentbatch!([resource], outs, imgs, pipeline, [obsdim]) -> outs
Apply the operations of the given pipeline to the images in imgs and write the resulting images into outs.
Both outs and imgs have to contain the same number of images. Each of these two variables can either be in the form of a higher dimensional array, or in the form of a vector of arrays for which each vector element denotes an image.
# create five example observations of size 3x3
imgs = rand(3,3,5)
# create output arrays of appropriate shape
outs = similar(imgs)
# transform the batch of images
augmentbatch!(outs, imgs, FlipX() |> FlipY())
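The vector-of-arrays form can be sketched in the same way. Here each vector element is one 3×3 observation, and the result is compared against a manual reversal of both dimensions, which is what flipping both axes amounts to. The variable expected exists only for this check.

```julia
using Augmentor

# five example observations, stored as a vector of 3x3 arrays
imgs = [rand(3, 3) for _ in 1:5]
# one preallocated output array per observation
outs = [similar(img) for img in imgs]
# transform the batch of images
augmentbatch!(outs, imgs, FlipX() |> FlipY())

# Flipping both axes is equivalent to reversing both dimensions.
expected = [reverse(reverse(img, dims=1), dims=2) for img in imgs]
println(outs == expected)
```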
If one (or both) of the two parameters outs and imgs is a higher dimensional array, then the optional parameter obsdim can be used to specify which dimension denotes the observations (defaults to ObsDim.Last()).
# create five example observations of size 3x3
imgs = rand(5,3,3)
# create output arrays of appropriate shape
outs = similar(imgs)
# transform the batch of images
augmentbatch!(outs, imgs, FlipX() |> FlipY(), ObsDim.First())
Similar to augment!, it is also allowed for outs and imgs to both be tuples of the same length. If that is the case, then each tuple element can be in any of the forms listed above. This is useful for tasks such as image segmentation, where each observation is made up of more than one image.
# create five example observations where each observation is
# made up of two conceptually linked 3x3 arrays
imgs = (rand(3,3,5), rand(3,3,5))
# create output arrays of appropriate shape
outs = similar.(imgs)
# transform the batch of images
augmentbatch!(outs, imgs, FlipX() |> FlipY())
The parameter pipeline can be an Augmentor.Pipeline, a tuple of Augmentor.Operation, or a single Augmentor.Operation.
augmentbatch!(outs, imgs, FlipX() |> FlipY())
augmentbatch!(outs, imgs, (FlipX(), FlipY()))
augmentbatch!(outs, imgs, FlipX())
The optional first parameter resource can either be CPU1() (default) or CPUThreads(). In the latter case the images will be augmented in parallel. For this to make sense, make sure that the environment variable JULIA_NUM_THREADS is set to a reasonable number so that Threads.nthreads() is greater than 1.
# transform the batch of images in parallel using multithreading
augmentbatch!(CPUThreads(), outs, imgs, FlipX() |> FlipY())
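One way to avoid hard-coding the resource is to pick it from the number of available threads at run time. This sketch assumes, as the snippets in this section do, that CPU1 and CPUThreads are in scope after using Augmentor; the variable name resource is illustrative.

```julia
using Augmentor

imgs = rand(3, 3, 5)
outs = similar(imgs)

# Fall back to the single-threaded resource when only one thread is available.
resource = Threads.nthreads() > 1 ? CPUThreads() : CPU1()
augmentbatch!(resource, outs, imgs, FlipX() |> FlipY())
```

Since the pipeline is deterministic, the result should be the same regardless of which resource is selected.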