Train and Monitor the Network
For more details on Machine Learning with CopyCat in NukeX, see https://learn.foundry.com/nuke#machine-learning-compositing-in-nuke.
Train the Network to Perform a Task
After setting up your data set of Input and Ground Truth image pairs, you're ready to train the network using machine learning to replicate the desired effect. There are a number of settings available to fine-tune your model, but they are generally trade-offs between training speed and the quality of the result from the network.
Monitor Training to Get the Best Result
As training progresses, it's important to monitor the machine learning to get the best result. You can monitor progress in three ways: using the Progress tab in CopyCat's Properties panel, examining the Viewer crop overlay and corresponding contact sheets, and connecting the optional Preview input on the CopyCat node to the source sequence.
Monitoring the Graph
As the network completes passes through the data set, you'll see the graph for the current training steadily decline toward zero on the Loss axis. You'll also notice that the Viewer displays a crop overlay showing a grid of images.
Monitoring Contact Sheets
As the machine learning completes passes through the data set, you'll see a crop overlay grid in the Contact Sheet panel in addition to the graph in the Properties panel. As training progresses, the output images change steadily from random garbage to replicas of the groundtruth images.
Contact sheets allow you to see if your data set is aligned correctly and display a visual representation of how the network behaves on certain frames.
The overlay is split into three random crop rows, each with three columns. The size of the crop square is determined by the Crop Size control and, generally, a larger area produces a better overall result. Some older GPUs may struggle with a large Crop Size, however, so you may have to reduce the area.
Each random crop row contains three images:
• input - a random sample from the Input side of the data set image pair. This is a frame from the source sequence that we'll apply the trained network to later on.
• groundtruth - the same sample area, but taken from the GroundTruth side of the data set image pair. This is what the network is learning to replicate.
• output - the same sample area, but with the effect that the network has learned so far applied to the source image.
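The idea that input and groundtruth show the same sample area can be illustrated with a short Python sketch. It is an illustration only, not CopyCat's internal implementation: the function name paired_random_crop and the use of plain nested lists as images are assumptions made for this example.

```python
import random


def paired_random_crop(input_img, groundtruth_img, crop_size, rng=None):
    """Cut the same square region out of both images of a pair.

    Images are plain 2D lists (rows of pixel values). This is a
    hypothetical helper for illustration, not part of CopyCat.
    """
    rng = rng or random.Random()
    height, width = len(input_img), len(input_img[0])
    if (len(groundtruth_img), len(groundtruth_img[0])) != (height, width):
        raise ValueError("Input and Ground Truth must have the same size")
    if crop_size > min(height, width):
        raise ValueError("crop_size is larger than the image")

    # One random top-left corner, shared by both sides of the pair.
    top = rng.randint(0, height - crop_size)
    left = rng.randint(0, width - crop_size)

    def crop(img):
        return [row[left:left + crop_size] for row in img[top:top + crop_size]]

    return crop(input_img), crop(groundtruth_img)


# Tiny 4x4 example: the ground truth is the input with every value doubled.
source = [[x + 4 * y for x in range(4)] for y in range(4)]
target = [[2 * v for v in row] for row in source]
in_crop, gt_crop = paired_random_crop(source, target, crop_size=2,
                                      rng=random.Random(0))
```

Because both crops share one corner position, every pixel in gt_crop corresponds to the pixel at the same position in in_crop. This is also why misaligned image pairs show up clearly in a contact sheet.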
In the example contact sheet, crop 1 and crop 2 show a reasonable approximation of the groundtruth in the output column.
However, crop 3 is blank. This is because the groundtruth contains no alpha mask in that particular random crop.
Training a network requires both kinds of crop, those with and those without an alpha mask in the groundtruth, to replicate the effect reliably. You don't have to watch the Viewer overlay to evaluate progress, however. The overlay is saved as a contact sheet to the Data Directory every 100 steps by default. You can change how often a contact sheet is saved using the Contact Sheet Interval control in the Properties panel.
As training progresses, .png files populate the Data Directory at the specified Contact Sheet Interval. You can see how well the network is learning by examining these images over time: the output images change steadily from random garbage in the earliest contact sheets to replicas of the groundtruth images in later ones.
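To review the saved images in the order they were written, a small Python sketch such as the following may help. It assumes only that the files are .png images in a single folder; the function name and the example path are hypothetical, and sorting uses modification time because the file naming scheme is not described here.

```python
from pathlib import Path


def list_pngs_oldest_first(data_dir):
    """Return the .png files in data_dir, sorted by modification time."""
    folder = Path(data_dir)
    if not folder.is_dir():
        return []
    return sorted(folder.glob("*.png"), key=lambda p: p.stat().st_mtime)


# Hypothetical location; replace with your own Data Directory.
for png in list_pngs_oldest_first("/path/to/data_directory"):
    print(png.name)
```

Any other .png files written to the same folder are listed too, so you may want to filter the result by name once you know how your files are named.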
Tip: If the images in the output column appear to replicate the images in the groundtruth column before the training run finishes, you can stop training early. You don't have to complete the run if you see the result you want halfway through.
Previewing the Result
CopyCat's Preview input allows you to connect the source sequence so that you can view the machine learning progression in the Viewer. The Preview allows you to see how the training is progressing on frames outside the data set because the whole source sequence is connected to CopyCat.
You don't have to watch the Viewer to evaluate progress. A preview .png file is saved to the Data Directory every 1000 steps by default. You can change how often a preview is saved using the Checkpoint Interval control in the Properties panel.
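As a rough guide to how many files a run writes with the default intervals, the arithmetic can be sketched in Python. The run length of 10000 steps is a hypothetical value, and the sketch assumes one file per interval with the first file written when the step count first reaches the interval. The exact behavior at step 0 or at the end of a run may differ.

```python
def saved_steps(total_steps, interval):
    """Steps at which a file would be written, assuming one per interval."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    return list(range(interval, total_steps + 1, interval))


total_steps = 10000  # hypothetical run length
contact_sheets = saved_steps(total_steps, 100)   # default Contact Sheet Interval
previews = saved_steps(total_steps, 1000)        # default Checkpoint Interval
print(len(contact_sheets), "contact sheets,", len(previews), "previews")
```

Under these assumptions, contact sheets accumulate ten times faster than previews, which is worth keeping in mind when choosing intervals for long runs.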
Reading the Results
You can examine the graph, contact sheets, and previews in the Data Directory, but what indicates that you might need to tweak some settings or restart the training from a checkpoint or specific weighting?
Contact sheets are designed to be used alongside graph data, so you may not need to restart training if the contact sheet results are not perfect, as long as the Step/Loss graph is trending toward zero on the Y axis.
If the results still need improvement, try the following:
• Check that you have connected the Input and Ground Truth images in the same order.
• Try adjusting the Model Size - more complex operations, such as beauty work, may require a larger Model Size.
• Try adjusting the Crop Size - higher resolution data sets may produce better models with a larger Crop Size.
• If the Step/Loss graph is trending toward zero, but the output column is still not replicating the groundtruth, try increasing the number of Epochs in the run.
• Add more image pairs to the data set - a diverse selection of images tends to produce the best results and the more frames you use, the better the results are likely to be. For example, if you're training a network to mask an object, try to pick frames that represent a wide variety of mask shapes.
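Whether the Step/Loss graph is still "trending toward zero" can also be judged numerically. The Python sketch below compares the average loss over the most recent steps with the average over the steps just before them. It is an assumption-laden illustration: the function name, the window size, and the 1% threshold are hypothetical, and it presumes you have the loss values available as a plain list, for example noted down from the graph.

```python
def loss_is_trending_down(losses, window=50, min_improvement=0.01):
    """Return True if the mean of the latest `window` losses is lower than
    the mean of the `window` losses before it by at least
    `min_improvement`, expressed as a fraction of the earlier mean."""
    if len(losses) < 2 * window:
        raise ValueError("need at least 2 * window loss values")
    previous = sum(losses[-2 * window:-window]) / window
    recent = sum(losses[-window:]) / window
    if previous <= 0:
        return False
    return (previous - recent) / previous >= min_improvement


# Synthetic data for illustration: a decaying curve and a flat one.
decaying = [1.0 / (1 + 0.05 * step) for step in range(200)]
flat = [0.2] * 200
print(loss_is_trending_down(decaying))
print(loss_is_trending_down(flat))
```

A flat result on its own does not say whether the model is good enough. It only suggests that further steps with the current settings are unlikely to change much, so the contact sheets and previews should still be checked.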
If the Preview does not look correct on frames outside the data set, even if the graph is trending downwards and the contact sheet output column looks good, there may be image differences in the source sequence that are not represented by the Input and Ground Truth images you're using to train the network. These differences include defocus, different framing, shadows covering part of the effect, and so on.
• Add more image pairs to the data set - a diverse selection of images tends to produce the best results and the more frames you use, the better the results are likely to be. For example, if you're training a network to mask an object, try to pick frames that represent a wide variety of mask shapes.
• Try faking the effect on image pairs already in the data set. This can be particularly effective in certain situations such as shadows in the affected area.
Apply a trained network to a sequence using the Inference node.