Introduction to Kernels

The first kernel we’re going to look at is a simple one that takes a single image as input, inverts the colours in it, multiplies them by a value and then writes them to an output image. Let’s take a closer look at it.

The InvertKernel

kernel InvertKernel : ImageComputationKernel<eComponentWise>
{
  Image<eRead, eAccessPoint, eEdgeClamped> src;  //the input image
  Image<eWrite> dst;  //the output image

  param:
    float multiply;  //This parameter is made available to the user.

  local:
    float _whitePoint;  //This local variable is not exposed to the user.

  //In define(), parameters can be given labels and default values.
  void define() {
    defineParam(multiply, "Multiply", 1.0f);
  }

  //The init() function is run before any calls to process().
  void init() {
    _whitePoint = 1.0f;  //Local variables can be initialised here.
  }

  //The process function is run at every pixel to produce the output.
  void process() {
    //Invert the input value from src and multiply:
    dst() = (_whitePoint - src()) * multiply;
  }
};

Kernels and Iteration

A Blink kernel is declared in a similar way to a class in C++:

kernel InvertKernel : ImageComputationKernel<eComponentWise>

This line declares InvertKernel as a Blink “kernel”. It derives from ImageComputationKernel, which describes a kernel that is used to produce an output image. The eComponentWise in angle brackets means the kernel is run over the images in a componentwise manner: each channel in the output image is processed independently, and when processing a channel, only values from the corresponding channel in the input(s) can be accessed.

The alternative is a pixelwise kernel (ePixelWise). At each pixel, a pixelwise kernel can write to all the channels in the output and can access any channel in the input. You would use a pixelwise kernel for any operation where there is interdependence between the channels, for example a saturation adjustment. A pixelwise kernel would be declared like this:

kernel SaturationKernel : ImageComputationKernel<ePixelWise>
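
To make the channel interdependence concrete, here is a minimal sketch in plain C++ (not Blink) of the arithmetic a pixelwise saturation might perform on one pixel. The Pixel alias, the saturatePixel function and the Rec. 709 luma weights are assumptions chosen for illustration; none of them are part of the Blink API.

```cpp
#include <array>

// Hypothetical stand-in for one RGBA pixel; this is not a Blink type.
using Pixel = std::array<float, 4>;

// Emulates what one pixelwise process() call could compute for a single pixel.
// The Rec. 709 luma weights are an assumption made for this sketch.
Pixel saturatePixel(const Pixel& in, float saturation) {
  const float luma = 0.2126f * in[0] + 0.7152f * in[1] + 0.0722f * in[2];
  Pixel out = in;  // alpha (index 3) is passed through unchanged
  for (int c = 0; c < 3; ++c) {
    // Each output channel depends on luma, and so on all three colour inputs.
    out[c] = luma + (in[c] - luma) * saturation;
  }
  return out;
}
```

Because every output channel needs the red, green and blue inputs at once, this operation could not be expressed componentwise.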

When using an ImageComputationKernel, you have no control over the order in which pixels (for pixelwise kernels) or components (for componentwise kernels) in the output image will be filled in. One kernel call will be launched for each point in the output space. The idea is that all these kernel calls are independent of one another and can potentially be executed in parallel. On a GPU at least, thousands of these kernel calls might be run at the same time. Our InvertKernel is a good example of an operation that can be parallelised in this way: each output value can be calculated independently as it does not depend on any of the other values in the output. In fact, many image processing operations are similarly straightforward to map to a parallel processing model.
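
The independence of kernel calls can be illustrated with a small CPU sketch in plain C++ (not Blink). The invertComponent and runInOrder functions are hypothetical names for this example; the only point being made is that visiting the components in any order yields the same output.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for one componentwise kernel call:
// reads one input value and produces one output value.
float invertComponent(float srcValue, float multiply) {
  const float whitePoint = 1.0f;  // assumed white point, as in the InvertKernel
  return (whitePoint - srcValue) * multiply;
}

// Runs one call per component, visiting components in the order given by
// `order`. Each call touches only its own element of src and dst.
std::vector<float> runInOrder(const std::vector<float>& src, float multiply,
                              const std::vector<std::size_t>& order) {
  std::vector<float> dst(src.size(), 0.0f);
  for (std::size_t i : order) {
    dst[i] = invertComponent(src[i], multiply);
  }
  return dst;
}
```

Calling runInOrder with the indices in forward or in reverse order gives identical results, which is the property that lets such calls be scheduled in parallel.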

Image Specification

Inside the kernel body we need to specify the images it requires. The InvertKernel has a single input image, src, and a single output, dst, which are declared like this:

  Image<eRead, eAccessPoint, eEdgeClamped> src;  //the input image
  Image<eWrite> dst;  //the output image

This tells us that src is an object of type Image, and the eRead tells us that we require only read access to this image. Similarly, dst is also an Image but this time we need write access to it, eWrite.

In the image specification you can also tell the compiler more about how you intend to access the image and what to do if you attempt to access outside the image area. For example, in the InvertKernel the input image has eAccessPoint access, which means that it will only be accessed at the current position in the output space. It also has an edge method, eEdgeClamped, which specifies that the edge pixels should be repeated for pixel accesses outside the bounds of the input image. We’ll see further examples of access specifiers in the next section.
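
As a rough illustration of what an edge-clamped read means, the following plain C++ sketch (not Blink, and not a description of how Blink implements it) clamps out-of-bounds coordinates to the nearest valid pixel. Image1 and readClamped are hypothetical names used for this example only.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical single-channel image used only for this sketch.
struct Image1 {
  int width;
  int height;
  std::vector<float> data;  // row-major, width * height values
};

// Reads the value at (x, y). Coordinates outside the image are clamped to the
// nearest edge, so edge pixels are effectively repeated outwards.
float readClamped(const Image1& img, int x, int y) {
  const int cx = std::max(0, std::min(x, img.width - 1));
  const int cy = std::max(0, std::min(y, img.height - 1));
  return img.data[cy * img.width + cx];
}
```

With eAccessPoint access the kernel only reads at the current output position, so the edge method matters mainly when the input is smaller than the output or when wider access patterns are used.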

All kernels should have at least an output image, which you could declare with the single line:

Image<eWrite> dst;

Parameters

Parameters for the kernel are declared in the param section, in the same way as you would declare member variables in a C++ class. If your kernel doesn’t require any parameters, you can omit this section.

The InvertKernel has a single parameter, multiply:

  param:
    float multiply;  //This parameter is made available to the user.

When a kernel is compiled inside the BlinkScript node, knobs will be generated for each of the kernel’s parameters and added to the Kernel Parameters tab. When first created, the node has just one parameter here, Multiply.

Kernel parameters can be C++ built-in types such as float or int. Vector parameters are also supported: float2, float3 and float4 are 2-, 3- and 4-component vectors respectively, and similarly for int2, int3, int4. float3x3 (a 3x3 matrix) and float4x4 (a 4x4 matrix) matrix types are available as well.

Local Variables

Variables local to the kernel, which you don’t want to expose to the user, can be added in the local section. Again, if you don’t need any local variables, you don’t need to add a local section.

The InvertKernel has a single local variable, _whitePoint. This is used to invert the incoming colour image, and here we assume that the white point of the input is 1.

Like parameters, local variables can be C++ built-in types such as float or int; vector types float2, float3, float4, int2, int3 and int4; or matrix types float3x3 or float4x4.

Local variables can be set up in the init() function, which is run on a single thread before the parallel part of the processing (the process() function) starts. process() can be run in parallel by multiple threads at once, and each instance of process() has read-only access to the local variables. If needed, you can also declare temporary variables within process(); only the current execution thread can read and write these.
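
The division of labour between init() and process() can be mimicked in plain C++ (not Blink). In this sketch, the InvertSketch struct and the runInvertSketch driver are hypothetical: init() runs once, and process() is marked const to reflect its read-only access to the local variable. The driver here is a simple serial loop, whereas Blink may run the process() calls in parallel.

```cpp
#include <vector>

struct InvertSketch {
  float multiply = 1.0f;    // stands in for the param section
  float whitePoint = 0.0f;  // stands in for the local section

  // Runs once, before any process() call.
  void init() { whitePoint = 1.0f; }

  // const: a process() call may read the locals but cannot modify them.
  float process(float srcValue) const {
    const float inverted = whitePoint - srcValue;  // temporary, private to this call
    return inverted * multiply;
  }
};

// Hypothetical driver: calls init() once, then process() once per input value.
std::vector<float> runInvertSketch(InvertSketch kernel, const std::vector<float>& src) {
  kernel.init();
  std::vector<float> dst;
  dst.reserve(src.size());
  for (float value : src) {
    dst.push_back(kernel.process(value));
  }
  return dst;
}
```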

The define() Function

Parameters can be set up in the define function, using calls to defineParam(). defineParam takes three arguments: the name of the parameter, a string containing the parameter name that will be displayed to the user, and a default value for the parameter. Here, the multiply parameter is given a display name, Multiply, and a default value of 1:

  void define() {
    defineParam(multiply, "Multiply", 1.0f);
  }

If your kernel has any parameters, it’s good practice to give them a display name and default value using the define function. If it doesn’t have any parameters, there is no need to include a define function in your kernel.

The init() Function

The init() function is called once, before any of the image processing starts, and is the place to do any initialisation that’s required beforehand. In the InvertKernel, the _whitePoint variable is initialised here to the floating-point value 1.0.

  void init() {
    _whitePoint = 1.0f;  //Local variables can be initialised here.
  }

Other things can also be set up here: for example, you can tell the compiler exactly how you intend to access your input images. We’ll see an example of this in the next kernel.

If no initialisation is required, it’s not necessary to include an init() function in your kernel.

The process() Function

The process function is where the real work gets done. This will be called once for every output pixel, in a pixelwise kernel, or once for every component of every output pixel, in a componentwise one. In the InvertKernel, it looks like this:

  void process() {
    //Invert the input value from src and multiply:
    dst() = (_whitePoint - src()) * multiply;
  }

Since the InvertKernel is componentwise, for each output component it can only access the corresponding component in its input. The current component of an Image is accessed by appending () to the image name. So here, dst() accesses the current component of the output image at the current pixel position, while src() is used to access the current component of the input at the same position. Because src was specified with eRead, we can only read its value using src(). However, we can write to dst, since this was specified with eWrite. A third access specifier, eReadWrite, is also available and should be used when you require both read and write access to an image. We write to dst by assigning a new value to dst(), as here:

    dst() = (_whitePoint - src()) * multiply;

To invert each component of the incoming image, we subtract the value of src from our local variable, _whitePoint. We then multiply the inverted value by the value of the multiply parameter.

Changing the contents of the process function changes the operation performed by the kernel. For example, you could create a “Gain” kernel by ignoring _whitePoint and simply multiplying the incoming component by the multiply value. The local section and the init() function would then no longer be required and could be removed. You might also rename the multiply parameter to gain, give it a different default value, and rename the kernel itself.

Next Steps

In the next section we’ll look at a blur kernel that requires more complicated access to its input.