Kernels

The cornerstone of the Blink framework is the concept of a Blink kernel. In Blink, a kernel defines a computational process that is run for every pixel in a given output image.

The main performance benefit of using kernels is that they run in parallel: the code for each output pixel can be executed simultaneously. On a GPU, thousands of kernel calls might run at the same time. However, because all pixels are processed in parallel, you have no control over the order in which pixels in the output image are filled in. The Invert kernel, introduced below, is a good example of an operation that can be parallelised in this way: each output value can be calculated independently, as it does not depend on any other value in the output. In fact, many image processing operations are similarly straightforward to map to a parallel processing model.
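The independence described above can be illustrated with a small model. The following is a sketch in plain C++, not Blink code: invertValue and processInOrder are hypothetical names, and the sequential loop only stands in for a parallel launch. Because each output depends only on its own input, visiting the pixels in any order gives the same image.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-pixel operation: the result depends only on its own input.
float invertValue(float value) {
    return 1.0f - value;
}

// Visits the pixels in the order given. 'order' is assumed to be a
// permutation of the indices 0 .. src.size() - 1.
std::vector<float> processInOrder(const std::vector<float>& src,
                                  const std::vector<std::size_t>& order) {
    assert(order.size() == src.size());
    std::vector<float> dst(src.size(), 0.0f);
    for (std::size_t index : order) {
        assert(index < src.size());
        dst[index] = invertValue(src[index]);
    }
    return dst;
}
```

Calling processInOrder on the same input with the orders {0, 1, 2} and {2, 0, 1} produces identical outputs. An operation in which one output read another output would not have this property.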

Iteration Space

When processing an image, a kernel runs its process() function in a massively parallel way, and will run the same piece of code for each pixel, or each component within a pixel, individually. A kernel’s iteration space can be thought of as the total area over which the individual process() function calls will be run. For Blink kernels run in Nuke’s BlinkScript node, the iteration space is determined by the resolution of the single output image, which is specified with eWrite (more detail in the Images section).

For example, a kernel processing each pixel in a 3x4 RGB image will run the process() function in parallel with an iteration space arranged in a 3x4 grid. A kernel processing each component in a 3x4 RGB image will run in parallel with an iteration space arranged in a 3x4x3 grid. The same kernel can be run for an output image of any size, and the iteration space will scale accordingly. For example, a component-wise kernel running on a UHD (3840x2160) RGBA output image will have an iteration space of 3840x2160x4.
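The arithmetic behind these grid sizes can be sketched as follows. This is plain C++ for illustration only, not part of the Blink API: IterationSpace, pixelWiseSpace, componentWiseSpace and processCalls are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: the extent of an iteration space.
struct IterationSpace {
    std::size_t width;
    std::size_t height;
    std::size_t depth;  // 1 for pixel-wise, component count for component-wise
};

// Pixel-wise: one process() call per pixel.
IterationSpace pixelWiseSpace(std::size_t width, std::size_t height) {
    assert(width > 0 && height > 0);
    return IterationSpace{width, height, 1u};
}

// Component-wise: one process() call per component of every pixel.
IterationSpace componentWiseSpace(std::size_t width, std::size_t height,
                                  std::size_t components) {
    assert(width > 0 && height > 0 && components > 0);
    return IterationSpace{width, height, components};
}

// Total number of process() calls made over the iteration space.
std::size_t processCalls(const IterationSpace& space) {
    return space.width * space.height * space.depth;
}
```

Under this model, the 3x4 RGB image gives 12 calls pixel-wise and 36 calls component-wise, and the UHD RGBA case gives 3840 x 2160 x 4 = 33,177,600 calls.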

The Invert Kernel

To introduce Blink kernels, we will use this Invert kernel, which takes a single image as input and inverts the colours.

// Copyright (c) 2024 The Foundry Visionmongers Ltd.  All Rights Reserved.
kernel Invert : ImageComputationKernel<eComponentWise>
{
  Image<eRead> src;   // Input image
  Image<eWrite> dst;  // Output image

  param:
    float multiply;  // This parameter is made available to the user.

  local:
    float whiteAccessPoint;  //This local variable is not exposed to the user.

  // In define(), parameters can be given labels and default values.
  void define() {
    defineParam(multiply, "Multiply", 1.0f);
  }

  // The init() function is run once before any calls to process().
  void init() {
    whiteAccessPoint = 1.0f;  //Local variables can be initialised here.
  }

  // The process function is run at every pixel to produce the output.
  void process() {
    //Invert the input value from src and multiply:
    dst() = (whiteAccessPoint - src()) * multiply;
  }
};
Figure: The Invert kernel’s effect on a ColorWheel in a Nuke BlinkScript node.

Defining a Kernel

Kernels are defined similarly to C++ classes, following this structure:

kernel ExampleKernel : ImageComputationKernel<eGranularity>
{
    //... kernel body in here
};

This declares a kernel called ExampleKernel, which is an ImageComputationKernel. Here eGranularity is a placeholder for one of the granularities described under Kernel Granularity. The kernel body is any code written between the curly brackets {} and needs to contain the following:

At least one image specification

Image specifications define the images that are read from and written to by the kernel. By convention, these are the first members of the kernel.

A single process() function

The process() function (see The process() Function) is called for each point in the iteration space. This is where the kernel reads from its inputs and writes to its outputs.

The Invert kernel is defined as a Blink Image Computation Kernel that will process images in a component-wise manner.

kernel Invert : ImageComputationKernel<eComponentWise>

Kernel Granularity

The kernel granularity tells the kernel how it can access the image data at each position in the iteration space. A kernel can be iterated over in either a component-wise or pixel-wise manner:

ePixelWise

A pixel-wise kernel can access the entire pixel at a given position within the iteration space. It can write to all the components in the output and can access any component in the input. You would use a pixel-wise kernel for any operation where there is interdependence between the components, for example a saturation kernel, where the R, G and B components are required to calculate luminosity.

eComponentWise

Component-wise processing means that each component in the iteration space is processed independently. Only the current component’s value can be read from any of the input images or written to the output image. For example, in an RGBA image, the kernel will be run independently for each of the R, G, B and A components. This should be used where there is no interdependence between the components of each pixel, as in our Invert kernel.
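The difference between the two granularities can be modelled in plain C++. This is a sketch, not Blink code: invertComponent and saturatePixel are hypothetical names, and the Rec. 709 luma weights are an assumption chosen for illustration. The inversion needs only one component at a time, whereas the saturation needs all three components of the pixel at once.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Pixel = std::array<float, 3>;  // R, G, B

// Component-wise style: the result needs only the current component.
float invertComponent(float value) {
    return 1.0f - value;
}

// Pixel-wise style: every output component needs R, G and B together.
// The Rec. 709 luma weights below are an assumption for illustration.
Pixel saturatePixel(const Pixel& in, float saturation) {
    assert(saturation >= 0.0f);
    const float luma = 0.2126f * in[0] + 0.7152f * in[1] + 0.0722f * in[2];
    Pixel out{};
    for (std::size_t c = 0; c < in.size(); ++c) {
        // Blend each component between the grey value and its original value.
        out[c] = luma + (in[c] - luma) * saturation;
    }
    return out;
}
```

With a saturation of 0, saturatePixel returns a grey pixel whose three components all equal the luma, something a component-wise kernel could not compute because it never sees the other two components.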

Images

For our kernel to process images, we need to define the images that it will use. The Invert kernel has a single input image, which we call src (short for “source”), and a single output, dst (“destination”), which are declared like this:

Image<eRead> src;   // Input image
Image<eWrite> dst;  // Output image

This tells us that src is an Image object. eRead tells the kernel that it will need to read the contents of this image, making it an input image. Similarly, dst is also an Image, but this time eWrite tells the kernel that it will need to write to this image, making it an output image.

You can also tell the kernel more about how you intend to access the image and what to do if you attempt to access outside the image area. This is explained in detail in the Images section.

Every kernel should have at least one output image.

Parameters

Parameters are public variables that will be exposed to users, and are used to manipulate the behaviour of the kernel. A kernel’s parameters are declared in the param section. If your kernel doesn’t require any parameters, you can omit this section.

Parameter values can be accessed from outside the kernel. In a BlinkScript node, they appear on the Kernel Parameters tab.

The Invert kernel has a single parameter, multiply:

param:
  float multiply;  // This parameter is made available to the user.

Locals

Locals are private variables local to the kernel, which you do not want to expose to the user. They can be added in the local section of the kernel. If you don’t need any local variables, you don’t need to add a local section.

The Invert kernel has a single local variable, whiteAccessPoint, which is used to invert the incoming colour image. Here we assume that the white point of the input is 1.0f, and so set the value to 1.0f. Like parameters, local variables can be any of the built-in types.

local:
  float whiteAccessPoint;  //This local variable is not exposed to the user.

The define() Function

The define() function is used to set information about kernel parameters. This should only contain calls to the function defineParam(), which has the following form:

defineParam(paramName, "externalParamName", defaultValue[, optionalParamProperties]);

For example, here the integer parameter radius is given a user label “Radius” and default value of 5. To allow this parameter to be scaled in Nuke when proxy mode is enabled, the optional parameter property eParamProxyScale is supplied.

defineParam(radius, "Radius", 5, eParamProxyScale);

In the Invert kernel, the multiply parameter is given a display name, "Multiply", and a default float value of 1.0f.

// In define(), parameters can be given labels and default values.
void define() {
  defineParam(multiply, "Multiply", 1.0f);
}

The define() function is called just once, when the kernel is first created, to define the parameters. If your kernel has any parameters, it’s good practice to give them a display name and default value within the define() function. If it doesn’t have any parameters, there is no need to include a define() function in your kernel.

The init() Function

Local variables can be set up in the init() function, which is run once before the parallel part of the kernel computation (in the process() function) starts. By using local variables for values that are shared by all pixel computations, you avoid recalculating them in every call to process(), which could happen millions of times. The process() function has read-only access to the local variables. If no initialisation is required, it’s not necessary to include an init() function in your kernel.
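The saving can be illustrated with a plain C++ model of the init()/process() split. This is a sketch, not Blink code: ExposureModel and run are hypothetical, the stops member plays the role of a parameter, and gain plays the role of a local. The relatively costly power-of-two is computed once in init(), and the const process() only reads it.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical model of a kernel with one parameter and one local.
struct ExposureModel {
    float stops = 0.0f;  // plays the role of a param
    float gain = 1.0f;   // plays the role of a local
    int initCalls = 0;   // bookkeeping for this sketch only

    // Runs once: derive the shared value from the parameter.
    void init() {
        assert(std::isfinite(stops));
        gain = std::exp2(stops);
        ++initCalls;
    }

    // Runs per value: const, so the local can only be read here.
    float process(float value) const {
        return value * gain;
    }
};

// Stand-in for the framework: init() once, then process() for every value.
std::vector<float> run(ExposureModel& kernel, const std::vector<float>& src) {
    kernel.init();
    std::vector<float> dst;
    dst.reserve(src.size());
    for (float value : src) {
        dst.push_back(kernel.process(value));
    }
    return dst;
}
```

However many values are processed, std::exp2 is evaluated a single time per run. Declaring process() as const mirrors the read-only access to locals described above.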

In the Invert kernel, the whiteAccessPoint variable, which is the same for every pixel in the image, is initialised here to the floating-point value of 1.0f.

// The init() function is run once before any calls to process().
void init() {
  whiteAccessPoint = 1.0f;  //Local variables can be initialised here.
}

The process() Function

The process() function is where the image computation is performed. This function is called once for every output pixel (in a pixel-wise kernel), or for every component of every output pixel (in a component-wise kernel). It can have one of three signatures:

void process()

This signature is for kernels which do the same processing regardless of where they are in the iteration space.

void process(int2 pos)

This signature is for kernels which need to know the x (pos.x) and y (pos.y) coordinates of their position in the iteration space.

void process(int3 pos)

This signature is only available for kernels with eComponentWise granularity. Here, (pos.x, pos.y) gives the coordinates of the current pixel position in the iteration space, while pos.z is the current component. For example, in an RGB image, the R component would have pos.z == 0, G: pos.z == 1, and B: pos.z == 2.

Note

The pos parameter will behave differently based on the Image specification. More detail on this can be found in the Images section.
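To make the role of pos.z concrete, here is a plain C++ model of a component-wise iteration. It is a sketch, not Blink code: Int3, bufferIndex, forEachComponent and tagComponents are hypothetical, the interleaved RGBRGB... buffer layout is an assumption of the model, and the nested loops stand in for calls that Blink would make in parallel and in no guaranteed order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the int3 position passed to process().
struct Int3 {
    int x;
    int y;
    int z;  // component index: 0 = R, 1 = G, 2 = B in an RGB image
};

// Index into an interleaved buffer (a layout assumed for this model).
std::size_t bufferIndex(Int3 pos, int width, int components) {
    return static_cast<std::size_t>((pos.y * width + pos.x) * components + pos.z);
}

// Calls process once for every point in a component-wise iteration space.
template <typename Process>
void forEachComponent(int width, int height, int components, Process process) {
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            for (int z = 0; z < components; ++z) {
                process(Int3{x, y, z});
            }
        }
    }
}

// Example: write each component's own index (pos.z) into the output.
std::vector<float> tagComponents(int width, int height, int components) {
    assert(width > 0 && height > 0 && components > 0);
    std::vector<float> dst(static_cast<std::size_t>(width * height * components), 0.0f);
    forEachComponent(width, height, components, [&](Int3 pos) {
        dst[bufferIndex(pos, width, components)] = static_cast<float>(pos.z);
    });
    return dst;
}
```

For a 2x1 RGB image, tagComponents(2, 1, 3) fills the buffer with 0, 1, 2, 0, 1, 2: every R slot holds 0, every G slot 1 and every B slot 2.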

The Invert kernel does not need the spatial position of the pixels: it performs the same computation regardless of where it is located within the image. Therefore, it uses the first signature, with no position argument.

void process() {
  //Invert the input value from src and multiply:
  dst() = (whiteAccessPoint - src()) * multiply;
}

Since the Invert kernel is component-wise, for each output component, it can only access the corresponding component in its input. The current component of an Image is accessed through the () operator.

So here, dst() accesses the current component of the output image at the current pixel position, while src() is used to access the current component of the input at the same position.

Because src was specified with eRead, we can only read the value using src(). However, we can write to dst, since this was specified with eWrite. More information on image access can be found in the Images section.

To invert each component of the incoming image, the Invert kernel subtracts the value of src from its local variable, whiteAccessPoint. It then multiplies the inverted value by the multiply parameter.
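As a worked example of this arithmetic, the following plain C++ function mirrors the expression in process(). It is a sketch rather than Blink code: invertValue is a hypothetical name, and whitePoint stands for the kernel's whiteAccessPoint local.

```cpp
#include <cassert>

// Mirrors dst() = (whiteAccessPoint - src()) * multiply for a single value.
float invertValue(float src, float whitePoint, float multiply) {
    assert(whitePoint > 0.0f);  // the model assumes a positive white point
    return (whitePoint - src) * multiply;
}
```

With a white point of 1.0f, an input of 0.25f inverts to 0.75f when multiply is 1.0f, and to 1.5f when multiply is 2.0f. An input equal to the white point always gives 0.0f, whatever the multiplier.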