We’re living in a time where artificial intelligence (AI) is starting to reshape the entire creative industry — and CopyCat, one of Nuke’s most powerful tools, puts that technology directly in your hands. It allows you to train a custom AI model using just a few “before and after” examples — and from that, it learns to replicate your work across an entire shot.
But how on earth does that actually work?
An ML model isn’t some magical, all-knowing brain. Think of it more like a very fast, very patient student. It doesn’t know anything to begin with — but if you show it enough examples, it can start to guess what it should do next.
Imagine teaching someone what a dog is just by showing them thousands of photos of dogs — no explanations, no rules. Over time, they start to notice patterns. That’s exactly how modern AI works.
There’s a psychological phenomenon called pareidolia — it’s what happens when you see shapes or faces in clouds, or think a pile of laundry in the dark looks like a person. It’s your brain looking for meaning in noise.
Modern AIs do something very similar. They don’t understand what they’re seeing — but they’ve been trained on so many examples that they can recognise patterns. They have their own kind of visual pareidolia.
So when you ask an AI to turn your video into a marble statue, or a retro animation, or a dream sequence — it finds what looks like those styles, based on what it’s learned before, and then applies that look to your footage.
CopyCat is a machine learning node in Nuke. It uses a process called supervised learning, which means:
You give it pairs of images: one original (the raw plate), and one corrected (your desired result).
CopyCat runs your original through a neural network and compares the result to your corrected version.
It checks how far off it was — and adjusts its internal parameters to get closer.
It repeats this process thousands of times.
Eventually, it learns to apply your transformation consistently — and saves it as a .cat model file.
Once trained, this model can be applied to entire shots using the Inference node.
It’s not just repeating a filter — it’s learning the exact logic behind the visual change you want.
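To make that loop concrete, here is a minimal sketch of the same compare-and-adjust idea in Python. This is not CopyCat’s internal code; it is a toy model with a single learnable parameter standing in for the network’s millions of weights.

```python
import numpy as np

# Toy stand-in for the training loop described above: learn a single "gain"
# parameter that maps input pixels to ground-truth pixels. CopyCat does the
# same thing with millions of parameters inside a deep network.
rng = np.random.default_rng(0)
inputs = rng.random((8, 64, 64, 3))     # eight "raw plate" crops
ground_truth = inputs * 1.8             # the "corrected" versions (here: a simple gain)

gain = 0.0                              # step 0: the model knows nothing yet
learning_rate = 0.1

for step in range(200):
    prediction = inputs * gain                  # run the inputs through the "network"
    error = prediction - ground_truth
    loss = np.mean(error ** 2)                  # how far off are we?
    gradient = np.mean(2 * error * inputs)      # direction that reduces the loss
    gain -= learning_rate * gradient            # adjust the internal parameter
    if step % 50 == 0:
        print(f"step {step}: loss = {loss:.4f}, gain = {gain:.3f}")

print(f"learned gain ≈ {gain:.3f} (the target was 1.8)")
```

Given enough steps, the parameter settles on the value that turns the input into the ground truth, which is exactly what a falling loss curve in CopyCat indicates, just at a far larger scale.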
CopyCat doesn’t know what a face is. It doesn’t understand what a bruise, a shadow, or a wig line means.
But it’s brilliant at imitating visual logic — the kind of patterns that a skilled compositor learns to see instinctively.
It’s like training a junior artist just by giving them lots of before-and-after comps. They don’t need a step-by-step breakdown — they just start to learn by example.
Because CopyCat isn’t just about automation — it’s about teaching a machine to replicate your visual judgement. And once trained, it can:
Rotoscope subjects
Remove objects, scars or tracking markers
Apply stylised looks
Restore motion blur or specular highlights
Generate masks from RGB images
And it can do all this frame-by-frame, with pixel-level consistency — which is a huge time-saver in production.
CopyCat is like a visual assistant: it learns from examples you give it.
It works by recognising patterns — not by understanding like a human would.
You train it by feeding in input/output pairs, and it learns the transformation.
Once trained, it can apply that transformation across sequences or shots automatically.
It’s part of a much bigger shift where AI is starting to assist in creative work — not by replacing the artist, but by learning from their choices.
We’re still in the early days — but already, CopyCat gives artists the ability to turn their own creative vision into an automated, repeatable process.
CopyCat is a node in Nuke — the compositing software used in visual effects (VFX) — that allows artists to train their own neural network directly inside a script. A neural network is a type of machine learning model that learns patterns from examples.
With CopyCat, you can train a model to perform specific tasks by providing it with paired images: by showing it “before and after” examples, you teach it to automate similar work on the other frames in the shot.
Tip: Use CopyCat when you need to automate a repetitive, time-consuming task across many frames — especially when traditional techniques would be too labour-intensive.
At its core, CopyCat trains a deep neural network by learning how to map input images to desired outputs. This process relies on a mathematical system of trial and error, guided by a loss function — a measurement of how far off the model’s output is from the ground truth.
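As a rough illustration of what such a measurement looks like (CopyCat’s exact loss function is internal to the node), here is a simple per-pixel error in Python:

```python
import numpy as np

# A hypothetical per-pixel loss: how far the model's output is from the ground
# truth. (CopyCat's actual loss function is internal to the node.)
model_output = np.array([0.20, 0.50, 0.90])   # pixels the network produced
ground_truth = np.array([0.25, 0.55, 0.95])   # pixels you supplied as the target

loss = np.mean(np.abs(model_output - ground_truth))   # mean absolute error
print(loss)   # roughly 0.05; smaller is better, and training pushes this toward zero
```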
The training pipeline works as follows:
Training often starts slow but improves steadily. It’s normal for early outputs to look completely wrong — this is part of the learning process.
At the beginning of training (step 0), the neural network is randomly initialised, meaning it has no understanding of the task. As a result, its first outputs will appear meaningless or noisy.
As training continues and the network receives feedback from the loss function, it begins to recognise patterns in the data. The model’s weights are adjusted repeatedly, allowing it to better approximate the transformation from input to ground truth.
Eventually, the model “learns” to mimic the desired output, even on new, unseen frames from the same shot.
CopyCat is well-suited for shot-specific tasks that are difficult or time-consuming to do by hand. These include:
You can monitor training progress in the CopyCat UI. A steadily decreasing loss value is a good indicator that the model is learning correctly.
To get the most out of CopyCat, keep these principles in mind:
Think of CopyCat like a custom tool you build for each shot — not a one-size-fits-all solution.
When training a CopyCat model, it’s important to remember that it has no built-in understanding of what it’s looking at. It doesn’t recognise people, vehicles, environments, or any kind of object in a semantic way.
Instead, it learns pixel relationships — for example, how certain colours, textures, or patterns relate to one another within the examples you provide. That means “white next to a skin tone” is a completely different thing to the model than “white next to a red wall,” unless both are included in the training data.
Always think in terms of what pixel patterns the model sees, not what humans interpret.
Choosing the right frames is critical to teaching your model how to generalise across an entire shot. Since CopyCat doesn’t “see” meaning, the dataset must be carefully constructed to reflect the visual variety within the shot.
When choosing frames for training, cover as much visual variation as possible in these areas:
Lighting conditions can dramatically change how surfaces appear. Make sure your model sees those variations.
For deblurring or stylisation tasks, the model must learn from both extremes of focus.
When working with CopyCat, think of frame selection as being similar to keyframing — you’re marking the critical moments that define the shot’s variation.
Keep these principles in mind:
If a subject rotates or moves drastically, and your dataset only includes one pose, the model is likely to fail when it encounters different positions.
Speed up training and improve model accuracy by guiding CopyCat to focus on the regions of interest (ROIs) within your input images — particularly useful when the effect you’re training for is localised to a small area (e.g. a blemish, blood splatter, or facial detail).
You can significantly reduce training time and improve results by simply cropping your data more strategically.
By default, CopyCat randomly samples crops from your training frames during each training step, which means many of those crops may not even contain the region you care about.
This slows down learning and makes convergence (the point where the model reliably produces the desired output) much less efficient.
If the effect only exists in a small portion of your image, default training will be very slow unless you intervene.
You can manually define custom crop areas around the region you want the model to learn — this makes the training far more targeted.
For example, if you’re removing a blemish on an actor’s cheek, crop tightly around that area of the face rather than feeding CopyCat the full frame.
This approach concentrates every training step on the pixels that actually change, so the model converges faster and spends less time on irrelevant background.
To implement explicit cropping, apply matching Crop nodes to both the input and the ground truth so the two branches stay perfectly aligned.
You can feed this cropped sequence into CopyCat as your training input.
The AppendClip node allows you to combine cropped regions from multiple frames into a single training clip — extremely useful for small but varied datasets.
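If you prefer to build this with Nuke’s Python API, a sketch might look like the following. The node names, crop box values and wiring are placeholders to adapt to your own script:

```python
import nuke

# Placeholder node names: point these at the Read/precomp nodes in your script.
plate = nuke.toNode('Plate')
ground_truth = nuke.toNode('GroundTruth')

roi = (820, 400, 1332, 912)   # x, y, r, t in pixels (example values only)

def crop_to_roi(source):
    """Crop one branch down to the region of interest. Using the same box on
    both branches keeps input and ground truth perfectly aligned."""
    crop = nuke.nodes.Crop()
    for index, value in enumerate(roi):
        crop['box'].setValue(value, index)
    crop['reformat'].setValue(True)   # output only the cropped region
    crop.setInput(0, source)
    return crop

input_branch = crop_to_roi(plate)
gt_branch = crop_to_roi(ground_truth)

# An AppendClip per branch can then string several cropped frame ranges
# together into a single training clip before it reaches CopyCat.
append_input = nuke.nodes.AppendClip()
append_input.setInput(0, input_branch)
```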
To make cropping truly effective, combine targeted crops with a few full-frame context frames. This ensures CopyCat learns both what to change and what not to change.
This is especially important for learning fine detail, such as skin texture, subtle colour changes, or small artefacts.
For example, a few untouched full-resolution frames included alongside your tight crops show CopyCat where not to apply the learned transformation.
Think of it like this — cropped frames teach the “what”, and full frames teach the “where”.
Improve the performance and robustness of your CopyCat models by artificially expanding your training dataset. This is done by creating synthetic variations of your existing frames — no extra rotoscoping or cleanup work required.
Think of data augmentation as a way to “teach” your model more than it saw — without creating more by hand.
In CopyCat, data augmentation refers to automatically modifying your existing input/ground truth image pairs in order to simulate new visual conditions.
This allows you to expose the model to conditions your original frames don’t fully cover, such as different orientations, exposures, framings and backgrounds, without any extra manual prep.
This is especially powerful when your current dataset is small or lacks variation — a common case in production shots.
Here are common augmentation methods used in CopyCat workflows, tailored for visual effects tasks:
Avoid vertical flips if gravity-related features (e.g., shadows, drips) are important — it could confuse the model.
Don’t go overboard — extreme rotations or zooms may break alignment between input and ground truth.
Use subtle, realistic colour shifts to mimic natural lighting variation. Avoid creating colour combinations that wouldn’t exist in the scene.
Always ensure that the ground truth is cropped and repositioned in sync with the input.
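One way to guarantee that sync in Nuke is to drive both branches from a single Transform, for example by cloning it, so any flip, rotation or reposition is applied identically to the input and the ground truth. A sketch, with placeholder node names:

```python
import nuke

# Placeholder node names: replace with the sources of your two branches.
plate = nuke.toNode('Plate')
ground_truth = nuke.toNode('GroundTruth')

# One Transform carries the augmentation (flip, rotation, reposition, zoom)...
augment = nuke.nodes.Transform(name='Augment_Input')
augment.setInput(0, plate)
augment['rotate'].setValue(8)     # example: a slight rotation
augment['scale'].setValue(1.1)    # example: a slight zoom

# ...and a clone of it sits on the ground-truth branch, so both branches always
# receive exactly the same geometric change and can never drift apart.
augment_gt = nuke.clone(augment)
augment_gt.setInput(0, ground_truth)
```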
Helps the model learn how to respond when subject boundaries interact with difficult backgrounds — like white-on-white or dark-on-dark.
Train a CopyCat model to generate alpha mattes (transparency masks) for a runner moving through a variety of real-world environments.
The model failed on frames where the runner’s white headphones overlapped with the white sky. There was not enough training data to help it distinguish the boundary.
Instead of redoing manual Roto, the team used data augmentation:
All new training pairs were derived entirely from synthetic variants of existing annotated frames.
Use augmentation to create your own “stress tests” — deliberately target the model’s weak points.
Unlike some general-purpose AI tools, CopyCat has no built-in knowledge of the world. It doesn’t understand what a person or headphone is. It only knows how to respond to patterns it has seen in your data.
That means: more diversity = better generalisation = fewer artefacts and errors during production use.
Follow these guidelines when applying augmentation:
A well-augmented dataset not only reduces manual prep work — it directly leads to lower loss values, faster convergence, and more production-ready results. CopyCat is highly sensitive to mismatched data — always quality-check before training.
Keep a reusable “augmentation toolkit” in your Nuke script — so you can apply and tweak these techniques quickly for different shots.
Learn how the size and scope of your dataset directly affects CopyCat’s performance — whether you’re training a model for a single shot or developing a broader tool for reuse across projects.
A bigger dataset isn’t always better — it depends on whether you’re solving a specific or general task.
CopyCat supports two main training strategies, and each one has very different data requirements.
This is the most common use of CopyCat: building a model tailored to one specific shot.
You don’t need a large dataset — just one that covers the full range of visual change in the shot. Choose training frames that represent the shot’s variation in lighting, focus, subject pose and background.
You’ll usually get excellent results from just 5–30 frames, as long as those frames are well chosen.
This approach is used when building a reusable CopyCat model — one that can be deployed across many shots or even different shows.
To generalise well, the model needs to see a huge variety of subjects, lighting conditions, camera setups and environments.
Real-World Example:
The Foundry’s in-house Human Matting model was trained on around 10,000 annotated examples — a strong baseline for general-purpose tools.
For generalised tasks, aim for 10,000–20,000 image pairs to start. Add more over time as you expand the use case.
Here’s how dataset size relates to typical production goals:
Use Case | Recommended Dataset Size | Purpose |
Shot-specific cleanup (e.g. bruise removal) | 5–30 frames | Fast turnaround, targeted accuracy |
Custom stylisation on one clip | 10–25 frames | One-off creative look, quick iteration |
General human matte extraction tool | 10k–100k+ frames | Consistent performance across multiple sequences |
Deblur or upscaling across different shots | 50k–1M+ frames | Long-term deployment as a studio tool |
Keep scope in mind — don’t overshoot. You don’t need thousands of frames to solve a single shot.
CopyCat models can inherit learning from existing networks — speeding up training for large-scale tools.
CopyCat’s data needs scale with the scope of the task:
When in doubt, start small. You can always expand your dataset if the model underperforms on certain edge cases.
Ensure consistent, high-quality results by avoiding common mistakes that can undermine CopyCat training performance — especially in complex or high-volume Nuke production environments.
Most CopyCat issues aren’t about the model — they’re about the data going in. Clean data = better results.
Below are the most frequent causes of degraded CopyCat output, along with how to resolve or avoid them.
CopyCat’s underlying machine learning model expects pixel values in the normalised floating-point range [0.0 – 1.0].
If the super-white pixels aren’t part of the effect you’re training (e.g. irrelevant highlights), clamp them into the 0–1 range before the data reaches CopyCat.
If the model does need to learn from high values (e.g. fire, lens flares, HDR glows), convert the footage to a log colour space instead, so those values are compressed into the 0–1 range without losing the detail.
⚠️ Note: Log colour space can slow training slightly due to value compression — expect more iterations to reach convergence.
If your training data is in one colour space (e.g. Rec.709) and inference happens in another (e.g. ACEScg), the model will underperform or fail entirely. This is known as domain shift.
Stick to a known working colour space for ML tasks — Rec.709 or ACEScg are both valid, as long as you’re consistent.
If you accidentally include extra channels — like alpha, depth, motion vectors, or even metadata layers — the model will treat them as part of the input signal.
A clean input = a smarter model. Noise in your channels = confusion in your output.
Here’s a quick guide to check your dataset before starting training:
Issue | What to Check | Recommended Action |
Pixel Value Range | Any values over 1.0? | Clamp or convert to Log colour space |
Colour Space | Training vs inference matched? | Standardise with OCIOColorSpace / Colorspace nodes |
Channel Structure | Extra channels (Z, motion, etc)? | Remove or shuffle to keep only RGB or RGBA |
Create a reusable “CopyCat precomp” template in your team’s Nuke pipeline to enforce consistency.
To avoid repeated errors and simplify team workflows, your training setup should always include a colour-space standardisation step (OCIOColorSpace or Colorspace), a clamp or log conversion for out-of-range values, and a channel cleanup so only the RGB(A) data you intend to train on reaches CopyCat.
This approach ensures that CopyCat receives clean, consistent, and predictable data — crucial for stable training, especially in multi-shot pipelines or across larger teams.
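A minimal sketch of such a precomp in Nuke Python is shown below. The node names, colour-space choice and clamp range are assumptions to adapt to your pipeline:

```python
import nuke

def copycat_precomp(source, working_colorspace='rec709'):
    """A simple 'CopyCat precomp' branch: RGB(A) only, a known colour space,
    and no values outside the 0-1 range. Apply it to both the input and the
    ground-truth branch so the training pair stays consistent."""
    # Keep only the channels the model should see.
    keep = nuke.nodes.Remove()
    keep['operation'].setValue('keep')
    keep['channels'].setValue('rgba')
    keep.setInput(0, source)

    # Standardise the working colour space (example value; match your pipeline).
    cs = nuke.nodes.Colorspace()
    cs['colorspace_out'].setValue(working_colorspace)
    cs.setInput(0, keep)

    # Clamp any remaining super-whites or negative values into the 0-1 range.
    clamp = nuke.nodes.Clamp()
    clamp['minimum'].setValue(0)
    clamp['maximum'].setValue(1)
    clamp.setInput(0, cs)
    return clamp

# Placeholder node names: feed both outputs into CopyCat's Input and Ground Truth.
input_branch = copycat_precomp(nuke.toNode('Plate'))
gt_branch = copycat_precomp(nuke.toNode('GroundTruth'))
```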
Label your training branches in the node graph (e.g. “Input to CopyCat” / “Ground Truth to CopyCat”) — makes reviews and debugging easier.
Learn how to choose the optimal CopyCat model size — Small, Medium, or Large — based on your task’s complexity, training data scale, and the balance between speed and quality.
Tip: Picking the right model size early can save hours of render time — or avoid underfitting entirely.
Each CopyCat model size reflects a different neural network architecture — with varying depth, parameter count, and computational requirements.
Here’s a breakdown:
Model Size | Training Speed | Inference Speed | Learning Capacity | Best For |
Small | ⚡ Fast | ⚡ Fast | 🧠 Low | Simple pixel-based tasks (e.g. deblur, upscale) |
Medium | ⚖️ Balanced | ⚖️ Balanced | 🧠 Moderate | Beauty, marker removal, single-shot matting |
Large | 🐢 Slower | 🐢 Slower | 🧠 High | Semantic tasks, generalised models, stylisation |
Model size affects memory usage and GPU load. If you’re training on a lower-spec machine, start with Small and evaluate.
Large models may require longer per-step training time — factor this into your delivery schedule.
Choose the model size based on the type of visual transformation you’re trying to learn:
Task Type | Recommended Model Size |
Basic image-to-image tasks (deblur, clean-up) | Small |
Beauty work, marker removal, inpainting | Medium |
Garbage matting, multi-subject segmentation | Large |
Stylisation or abstract visual transformations | Large |
Tasks involving semantic understanding — like identifying people, hair, or objects — benefit from the Large model’s deeper learning structure.
The amount of training data also affects which model will perform best:
Dataset Size | Model Size Guidance |
1–30 frames | Small or Medium (faster convergence, less overfitting) |
100–1,000+ frames | Medium or Large (allows better pattern learning) |
10,000+ frames | Large (can fully utilise diversity and avoid collapse) |
Using a Large model on a very small dataset may lead to overfitting — unless you're augmenting or fine-tuning from pretrained weights.
Here’s a quick guide for real-world scenarios:
Scenario | Recommended Model Size |
Quick, single-shot cleanup (e.g. blemish removal) | Small |
Shot-specific stylisation or roto/matting | Medium |
Multi-shot training, generalised matte generation | Large |
Training complex FX tools (e.g. dynamic segmentation) | Large |
For high-end production outputs, larger models often outperform smaller ones — even on modest datasets — but you'll need to accept longer training and render times.
Model size directly affects training and inference speed, GPU memory usage, and how complex a transformation the network can learn.
While Small and Medium models are excellent for speed and iteration, you’ll often get cleaner, more reliable results from the Large model — especially on nuanced, artistic, or multi-subject tasks.
If render time isn’t critical, always test with Large before committing to final output — especially for client-facing work.
Learn how to accelerate CopyCat training by leveraging pre-trained weights — allowing you to reduce training time significantly while improving early convergence, model quality, and consistency across shots.
Starting from a solid foundation saves time and boosts results, especially in high-pressure production environments.
By default, CopyCat starts with randomised model weights — this means the model knows nothing about your task and must learn everything from scratch. This “cold start” can take thousands of steps to show useful results.
However, CopyCat supports pre-trained model checkpoints, which provide a domain-specific starting point. These models already contain knowledge of common visual patterns (like faces, edges, or motion blur), enabling the model to learn your specific task much faster.
Think of pre-trained weights as a head start — instead of learning how to see, your model can jump straight to learning what to do.
CopyCat includes several built-in model weights that are production-safe and trained on large, diverse datasets. Each one is suited to a different kind of task:
Model Name | Domain Expertise | Best Used For |
Deblur | High-frequency detail reconstruction | Deblurring, inpainting, temporal artefact repair |
Upscale | Resolution and detail enhancement | Super-resolution, scale-sensitive FX |
HumanMatte | Human structure, contours, pose-aware | Garbage matting, beauty cleanup, roto assistance |
All built-in weights are trained on licensed, production-approved data — you can safely use them in commercial workflows.
Here’s what you can expect when using pre-trained weights versus training from scratch:
Scenario | Cold Start (Random Weights) | With Pretrained Weights |
Initial convergence | Slow (thousands of steps) | Fast (hundreds of steps) |
Semantic learning | Learns from zero | Starts from known structure |
Early generalisation | Weak | Robust, even with few examples |
Total training time | Long | Up to 10x faster |
Example: A human matting model trained from scratch took over 3,000 steps to stabilise. The same setup, using HumanMatte weights, produced usable results in under 500 steps — with visibly better mattes and lower loss values.
Beyond built-in weights, you can also save your own CopyCat model checkpoints during training. This enables a form of transfer learning — using a trained model as a starting point for another related task.
This is ideal for sequences of similar shots, recurring actors or sets, and studio tools you want to keep improving across a show or season.
Custom checkpoints are a powerful way to build studio-specific ML tools that improve across a show or season.
Here’s how to decide which model or strategy to use based on your task:
Task Type | Recommended Strategy |
Single-shot matte | Use built-in HumanMatte weights |
Sequential beauty cleanup | Train on one → Save checkpoint → Reuse downstream |
Stylisation / inpainting | Start with Deblur weights for pixel-based priors |
Actor or scene consistency | Train once → Create reusable checkpoint |
Fast prototyping | Always use pre-trained weights unless domain mismatch |
If your task overlaps visually with the domain of a built-in model (e.g., faces, skin, edges), a pretrained weight will almost always outperform a cold start — especially early in training.
For best results, start with a pretrained weight and fine-tune with your own data. This combines speed with custom accuracy.
Expect significant reductions in training time, better convergence, and more robust generalisation — even with limited data.
Pretrained models aren’t just shortcuts — they’re the foundation of scalable, production-ready CopyCat workflows.
Learn how to correctly configure epochs in CopyCat training to achieve optimal results — based on your dataset size, task complexity, and whether you’re starting from scratch or using pretrained weights.
Tip: Don’t just guess your training duration — use epochs and step counts strategically to balance time, precision, and generalisation.
Before diving in, let’s clarify key terms used throughout the training process:
In CopyCat, steps control how often the model updates; epochs give you a dataset-size-relative way to manage learning duration.
Here are typical step ranges based on dataset size and whether you’re using random initialisation or pretrained weights:
Use Case | Starting From | Recommended Steps | Notes |
Small dataset (single-shot) | Scratch | 15,000–30,000 steps | Go lower if compute or time is limited |
Small dataset (with pretrained) | Pretrained | 5,000–15,000 steps | Often sufficient for high-quality convergence |
Large dataset (general model) | Scratch or Pretrained | 100,000–300,000+ steps | Needed for robust generalisation across shots |
More steps ≠ better results unless the dataset justifies it. Use the loss curve to determine when to stop.
Suppose your dataset works out to 4 training steps per epoch (for example, 16 training crops per epoch with a batch size of 4).
To reach 20,000 steps, you’d need 20,000 ÷ 4 = 5,000 epochs.
In CopyCat, it’s common to define training duration by steps, but estimating with epochs helps you budget time based on dataset scale.
While CopyCat tracks training in steps, the epoch is a standard way of measuring training progress in machine learning. It scales naturally with the size of your training set: one epoch always means one full pass over your data, however many frames that is.
Internally, CopyCat adjusts behaviour (e.g. learning rate decay, sampling logic) based on epoch progression, not just raw step count.
Think of epochs as a schedule — they allow CopyCat to learn efficiently no matter the size of your training set.
Here’s how to adjust epochs based on your goals:
Goal | Recommended Adjustment |
Faster convergence | Use pretrained weights and reduce total epoch/step count |
Higher precision | Increase total steps (15k–30k+) for better fidelity |
Avoid overfitting | Watch the training loss curve — if it flattens, stop early |
Improve temporal consistency | Increase epochs (and crop variation) to help with flicker |
Flickering outputs are often a result of under-training or inconsistent input variation — more epochs can help the model stabilize.
For production, build a simple spreadsheet to calculate step/epoch targets for different training setups — it’s an easy way to standardise training across your team.
Understand how to configure crop size in CopyCat training to strike the right balance between speed, GPU efficiency, and contextual awareness — especially when working on tasks that vary from low-level pixel edits to high-level semantic transformations.
Tip: Crop size directly affects how much the model “sees” — too small, and it can’t understand the scene; too large, and you may overload your GPU.
Crop size refers to the pixel dimensions (e.g. 256×256) of the random patches that CopyCat extracts from your input/ground truth image pairs during training.
Rather than training on full-resolution frames — which would be memory-intensive and slow — CopyCat randomly samples smaller regions (crops), allowing:
Crop size doesn’t change your final output resolution — it only affects training.
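To picture what that sampling does, here is a NumPy sketch of the concept (not CopyCat’s internal code). Note that the same window is cut from both the input and the ground truth:

```python
import numpy as np

def sample_paired_crop(input_img, gt_img, crop_size=256, rng=None):
    """Cut the same random window out of an input/ground-truth pair."""
    rng = rng or np.random.default_rng()
    height, width = input_img.shape[:2]
    y = rng.integers(0, height - crop_size + 1)
    x = rng.integers(0, width - crop_size + 1)
    window = (slice(y, y + crop_size), slice(x, x + crop_size))
    return input_img[window], gt_img[window]

# Example: a 2048x1152 frame pair yields matching 256x256 training patches.
plate = np.zeros((1152, 2048, 3), dtype=np.float32)
ground_truth = np.zeros((1152, 2048, 3), dtype=np.float32)
in_crop, gt_crop = sample_paired_crop(plate, ground_truth, crop_size=256)
print(in_crop.shape, gt_crop.shape)   # (256, 256, 3) (256, 256, 3)
```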
The crop size determines how much of the image CopyCat sees at each step, which in turn impacts how well it can understand both local detail and global context.
Crop Size | Characteristics | Best For |
Small | Fast, low-memory, limited context | Pixel-local tasks (deblur, denoise, upscale) |
Large | Slower, high-memory, strong scene understanding | Semantic tasks (human matting, object removal) |
If you’re training a model to remove a blemish on a face, a small crop might confuse the blemish with another feature. A larger crop includes the eye, brow, or facial contour — giving the model enough context to make accurate edits.
Parameter | Small Crop | Large Crop |
Speed | ✅ Faster training | ❌ Slower due to larger input |
Memory Usage | ✅ Lower GPU requirement | ❌ Higher GPU demand |
Context Capture | ❌ Weak (local info only) | ✅ Stronger global awareness |
Generalisation | ❌ May overfit local noise | ✅ Better real-world robustness |
For high-resolution source material, a small crop may completely miss key features — especially in sparse or complex frames.
Use these crop size ranges based on the type of task:
Task Type | Recommended Crop Size |
Pixel-based tasks (e.g., upscale, deblur) | 128–256 px |
Localised edits (e.g., blemish removal, object cleanup) | 384–512 px |
Semantic or full-subject tasks (e.g., human matting, facial work) | 512–768+ px |
These are square crops by default (e.g. 512×512), but you can adjust to fit non-square aspect ratios if needed — just ensure input and ground truth align exactly.
On 8–12 GB GPUs, 512×512 is often the upper limit for batch sizes above 4 when using the Large model.
If your model isn’t converging well — or seems confused by the visual context — try increasing the crop size to help it see more of the surrounding structure.
Learn how to set the Batch Size in CopyCat training to balance GPU memory usage, training speed, and model stability — especially when working with high-resolution crops or complex tasks.
Tip: Batch size is one of the most GPU-sensitive parameters in CopyCat. Setting it right can drastically improve training efficiency and prevent crashes.
Batch size defines the number of training crops (input/ground truth pairs) that CopyCat processes in parallel during each training step.
Smaller batches allow for faster, more frequent updates and consume less GPU memory.
Think of batch size as how much “learning material” the model sees at once — bigger batches offer a clearer signal but require more GPU power.
Batch Size | Memory Usage | Training Stability | Gradient Noise | Convergence Behaviour |
Small | ✅ Low GPU memory usage | ❌ Less stable | ❌ High | ✅ Fast updates, but more jittery |
Large | ❌ High GPU memory usage | ✅ Smooth learning | ✅ Low | ❌ Slower per-step updates, more stable |
Noisy tasks (e.g. with aggressive data augmentation or lighting variance) benefit from slightly larger batches for stability.
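That stability difference is really just averaging: each training step updates the model from the average error signal over its batch, so larger batches give a steadier estimate. A toy NumPy illustration of the effect (nothing CopyCat-specific):

```python
import numpy as np

# Toy illustration: each crop contributes a noisy error signal, and one training
# step averages the signals in its batch. Bigger batches give a steadier update.
rng = np.random.default_rng(1)
per_crop_signal = rng.normal(loc=1.0, scale=0.5, size=10000)   # hypothetical signals

for batch_size in (1, 4, 16):
    usable = (len(per_crop_signal) // batch_size) * batch_size
    batch_means = per_crop_signal[:usable].reshape(-1, batch_size).mean(axis=1)
    print(f"batch size {batch_size:>2}: update noise (std) = {batch_means.std():.3f}")
# The noise shrinks roughly with the square root of the batch size.
```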
GPU Class | Typical Max Batch Size (512×512 crop) |
RTX A6000, RTX 4090 | 4–8 |
RTX 3090, 4080 | 2–6 |
RTX 2080 Ti, 3070, M1 Max | 2–4 |
If you're hitting VRAM limits or CopyCat stalls, lower your batch size first, not the crop size.
Batch size impacts how quickly your model sees all the data:
Steps = (Number of crops per epoch ÷ Batch size) × Epochs
If you change your batch size, recalculate your epoch count so you still hit your target step count.
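A small helper based on the formula above makes that recalculation easy; the crop and batch numbers below are purely illustrative:

```python
def epochs_for_target_steps(target_steps, crops_per_epoch, batch_size):
    """Epochs needed to hit a step target, following:
    steps = (crops per epoch / batch size) * epochs."""
    steps_per_epoch = crops_per_epoch / batch_size
    return target_steps / steps_per_epoch

# Illustrative numbers only: 16 training crops per epoch, aiming for 20,000 steps.
print(epochs_for_target_steps(20000, crops_per_epoch=16, batch_size=4))  # 5000.0
print(epochs_for_target_steps(20000, crops_per_epoch=16, batch_size=2))  # 2500.0
```

With a smaller batch, each epoch contains more steps, so fewer epochs are needed to reach the same step count. Recalculating beats guessing.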
Use Case | Recommended Batch Size |
Pixel-level tasks (e.g. deblur, upscale) | 4–8 |
Semantic edits (e.g. matting, face work) | 2–4 (crop size-dependent) |
High-resolution crops (e.g. 768×768+) | 1–2 (monitor VRAM) |
Batch size is the first setting to adjust if CopyCat is crashing, underperforming, or running slowly on your hardware.
Learn how to monitor CopyCat training progress effectively, interpret loss behaviour, and resolve common training issues — particularly in fast-paced, high-stakes production environments.
Tip: Machine learning isn’t a black box. If you know how to read the loss curve and spot visual artefacts early, you can save hours of wasted training time.
In the CopyCat Graph tab, you’ll find the loss curve, which shows how the model’s error (loss) decreases across training steps.
Training Phase | Expected Loss Curve Behaviour |
Early Training | Sharp drop — model is learning to “memorise” training pairs |
Mid to Late | Slower, more gradual decline — model begins to generalise |
Plateau | Curve flattens — model has learned all it can from the current dataset |
A gently flattening curve is normal. But sharp spikes or stagnation from the start signal a deeper issue.
Loss Behaviour | Interpretation | Suggested Fix |
No decrease | Poor frame selection or misaligned data | Review training pairs; check input vs ground truth alignment |
Oscillating loss | High learning rate or low dataset variation | Lower learning rate; diversify training set |
Early flattening | Underfitting — model too small or task too easy | Use larger model or increase crop size |
Sudden spikes | Superwhite pixels or format mismatch | Clamp values or convert to log colour space |
If training crashes or stalls because the GPU runs out of memory, lower the crop size or batch size to stay within VRAM limits. Check GPU usage in Task Manager or via system monitors.
Restarting training from a saved checkpoint can prevent losing earlier progress.
After training finishes, check the model’s output on a few frames that weren’t part of the training set before signing it off.
Overfitting is more likely with small datasets or tiny crop sizes. Add variation or increase crop size to help the model generalise.
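One quick generalisation check is to render the Inference output for a few frames you deliberately held out of training and compare them with your own corrections. A sketch, assuming both sets have been rendered to image files (the paths here are hypothetical):

```python
import numpy as np
import imageio.v3 as iio   # any image reader that returns arrays will do

def heldout_error(inference_path, reference_path):
    """Mean absolute per-pixel difference between the model's output and your
    own correction on a frame that was NOT in the training set."""
    output = iio.imread(inference_path).astype(np.float32)
    reference = iio.imread(reference_path).astype(np.float32)
    return float(np.mean(np.abs(output - reference)))

# Hypothetical file paths. Errors that are much higher on held-out frames than
# on training frames are a classic sign of overfitting.
for frame in (1012, 1055, 1090):
    err = heldout_error(f"renders/inference.{frame}.png", f"renders/manual_fix.{frame}.png")
    print(frame, round(err, 4))
```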
Use Case | What to Monitor | What to Tune |
Single Shot | Smooth, fast-falling loss | Frame selection, crop size, batch size |
Generalised Tasks | Slow, steady loss reduction | Batch size ↑, dataset diversity ↑ |
Noisy Outputs | Loss spikes, visual artefacts | Clamp/log space, remove extra channels |
Cross-reference loss behaviour with visual outputs. A “good” loss curve means nothing if the frames don’t look right — and vice versa.
Learn how to improve the quality and generalisation of your CopyCat models by identifying common weaknesses and applying targeted refinements — without needing to start from scratch.
Tip: Most CopyCat models don’t need re-training — they just need more training or better-targeted data.
If results look weak or unfinished across the whole shot, the model is simply undertrained: it hasn’t seen enough iterations to fully learn the task.
Resume training from your existing checkpoint rather than starting over; this saves time by picking up exactly where you left off.
If loss is still dropping and your output looks half-baked — keep training. You’re not done yet.
If the model fails only on specific frames, those frames aren’t well-represented in the training set: the model has never seen a similar scenario before.
You don’t need to overhaul the entire dataset. One or two smartly chosen frames can fix a recurring issue.
To refine CopyCat model performance:
Problem | Action |
Undertrained model, weak results | Resume training for more steps |
Selective failure on specific frames | Add targeted training data (real or augmented) |
Nuke’s CopyCat node represents a transformative shift in how artists approach shot-specific tasks in visual effects — blending the power of machine learning with the creative control of node-based compositing. This guide has outlined not just how CopyCat works, but how to get the most out of it through smart data selection, targeted training strategies, and practical troubleshooting. Whether you’re cleaning up blemishes, generating mattes, or stylising complex footage, CopyCat empowers artists to build custom, high-quality tools tailored to each shot. With the right preparation and understanding, even a small dataset can deliver production-ready results — fast, flexible, and fully integrated into your Nuke pipeline.
Here’s a beginner-friendly glossary of terms found in the CopyCat documentation. These explanations aim to help newcomers understand core concepts related to machine learning, visual effects (VFX), and the CopyCat tool in Nuke.
This custom GPT — Nuke CopyCat TD v1.0 — is a specialized AI assistant designed to help artists, compositors, and technical directors work effectively with Foundry Nuke’s CopyCat tool and its related machine learning workflows (like AIR nodes and the Inference node).
Think of it as your senior TD mentor for anything related to CopyCat. It helps you:
Understand how CopyCat and machine learning concepts work inside Nuke.
Set up training with the right parameters (epochs, batch size, crop size, model type, etc.).
Troubleshoot problems, such as bad predictions, loss curve issues, or GPU crashes.
Optimize results, speed up training, and improve model generalization.
Design workflows for specific VFX tasks — like matte generation, relighting, cleanup, or stylization.
It always provides a 📦 Recommended CopyCat Parameters section tailored to your shot, task, and hardware, with plain-English explanations for every choice.
CopyCat is a machine learning node in Foundry’s Nuke that lets users train custom AI models using before-and-after image pairs to automate complex visual effects tasks.
CopyCat uses supervised learning to analyze differences between input (raw plate) and ground truth (desired result) images, training a neural network to apply the same transformation across a sequence.
CopyCat is ideal for tasks like rotoscoping, object removal, beauty work, stylization, deblurring, and generating mattes — all with pixel-level consistency.
For shot-specific models, 5–30 carefully chosen frames are typically enough. For general-purpose tools, you may need 10,000+ training pairs.
Not easily. CopyCat models are usually shot-specific. However, with a large and diverse dataset, generalized models can be trained for broader use cases.
Once a CopyCat model is trained, the Inference node applies the learned transformation across entire shots automatically.
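As a sketch of that hand-off in Nuke Python (treat the knob name for the .cat file as an assumption and confirm it in the Inference node’s properties, since it can differ between Nuke versions):

```python
import nuke

plate = nuke.toNode('Plate')          # placeholder Read node name
inference = nuke.nodes.Inference()    # the Inference node from Nuke's AIR toolset
inference.setInput(0, plate)

# Point the node at the .cat file CopyCat wrote during training. The knob name
# below is an assumption; confirm it in your Inference node's properties panel.
inference['modelFile'].setValue('/path/to/trained_model.cat')
```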
Use smart cropping to focus training on the region of interest, and data augmentation to synthetically expand your dataset with variations in lighting, pose, and background.
Ensure input/output pairs are aligned, include varied lighting and motion, and remove extraneous channels to prevent confusing the model.