The next kernel we’re going to look at does a convolution, or weighted two-dimensional blur. The blur weights are taken from a second input image, *filter*, so for every output pixel, the kernel needs to access all the pixels from the filter image. We do this using random access.

Here’s the whole kernel:

```
//Warning: connecting a large image to the filter input will cause the kernel to run very slowly!
//If running on a GPU connected to a display, this will cause problems if the time taken to
//execute the kernel is longer than your operating system allows. Use with caution!
kernel ConvolutionKernel : public ImageComputationKernel<ePixelWise>
{
  Image<eRead, eAccessRanged2D, eEdgeClamped> src;
  Image<eRead, eAccessRandom> filter;
  Image<eWrite> dst;

  local:
    int2 _filterOffset;

  void init()
  {
    //Get the size of the filter input and store the radius.
    int2 filterRadius(filter.bounds.width()/2, filter.bounds.height()/2);

    //Store the offset that maps a position in the filter image to an
    //offset from the current pixel:
    _filterOffset.x = -filter.bounds.x1 - filterRadius.x;
    _filterOffset.y = -filter.bounds.y1 - filterRadius.y;

    //Set up the access for the src image
    src.setRange(-filterRadius.x, -filterRadius.y, filterRadius.x, filterRadius.y);
  }

  void process() {
    SampleType(src) valueSum(0);
    ValueType(filter) filterSum(0);

    //Iterate over the filter image
    for(int j = filter.bounds.y1; j < filter.bounds.y2; j++) {
      for(int i = filter.bounds.x1; i < filter.bounds.x2; i++) {
        //Get the filter value
        ValueType(filter) filterVal = filter(i, j, 0);

        //Multiply the src value by the corresponding filter weight and accumulate
        valueSum += filterVal * src(i + _filterOffset.x, j + _filterOffset.y);

        //Update the filter sum with the current filter value
        filterSum += filterVal;
      }
    }

    //Normalise the value sum, avoiding division by zero
    if (filterSum != 0)
      valueSum /= filterSum;

    dst() = valueSum;
  }
};
```

The first difference between this kernel and the ones we’ve seen previously is that this one iterates over its output in a pixelwise fashion:

```
kernel ConvolutionKernel : public ImageComputationKernel<ePixelWise>
```

For this convolution, we use only the first channel of the filter image, so it makes sense to use pixelwise iteration - it means we only have to access each pixel in the filter image once per output pixel, rather than once for each component of each output pixel.
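The saving is easy to quantify: componentwise iteration would read the whole filter once per component, so four times per output pixel for an RGBA image, while pixelwise iteration reads it once. A back-of-envelope count (plain C++ for illustration; the factor of 4 is an assumption for RGBA):

```cpp
#include <cassert>

// Filter reads per output pixel for a filterW x filterH filter.
int pixelwiseReads(int filterW, int filterH) {
    return filterW * filterH;                 // one pass over the filter
}

int componentwiseReads(int filterW, int filterH, int nComponents) {
    return nComponents * filterW * filterH;   // one pass per component
}
```

For a 5×5 filter and an RGBA output, that is 25 filter reads per output pixel instead of 100.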

The ConvolutionKernel needs to access the whole of the *filter* input at every output pixel. To do this, we use the *eAccessRandom* access method, which allows access to any position in the image from all positions in the output.

```
Image<eRead, eAccessRandom> filter;
```

With *eAccessRandom* input access, you can also specify an edge method for the input, in exactly the same way as you would with *eAccessRanged1D* or *eAccessRanged2D* access. However, in this case we don’t intend to access any pixels outside the *filter* input, so no edge method is specified for this input.

Since random access lets you access any input pixel at any output location, there’s no need to specify anything further about the access requirements for the *filter* input in the *init()* function. We do, however, store some information about the *filter* input here that will be used to help us with our access to *src*.

It’s often useful to know the bounds of a random access input; in this case, for example, we need to know the size of the *filter* input in order to decide how big our convolution needs to be. A random access input has a *bounds* member which you can access via *inputName.bounds*.

*bounds* is actually a structure called a *recti*, which is a rectangle with integer co-ordinates. You can get the width of the rectangle using *bounds.width()*, or the height using *bounds.height()*. Here, the *width* and *height* functions are used to obtain a radius for our *filter* input.

```
//Get the size of the filter input and store the radius.
int2 filterRadius(filter.bounds.width()/2, filter.bounds.height()/2);
```
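Outside Blink, the same computation can be mimicked with a minimal recti-like struct (a sketch; the real Blink *recti* has more members, but the width, height and radius arithmetic is the same):

```cpp
#include <cassert>

// Minimal stand-in for Blink's recti: a rectangle with integer coordinates.
// (x1, y1) is the inclusive lower-left corner; (x2, y2) is the exclusive
// upper bound, matching the j < bounds.y2 loop condition in the kernel.
struct recti {
    int x1, y1, x2, y2;
    int width()  const { return x2 - x1; }
    int height() const { return y2 - y1; }
};

// The radius computation from the kernel's init(): half the filter size,
// rounded down.
int radiusX(const recti& bounds) { return bounds.width() / 2; }
int radiusY(const recti& bounds) { return bounds.height() / 2; }
```

For a 5×5 filter whose bounds run from (0, 0) to (5, 5), this gives a radius of 2 in each direction.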

You can also access the lower and upper bounds of the rectangle by using *bounds.x1*, *bounds.y1*, *bounds.x2* and *bounds.y2* respectively. Here, *bounds.x1* and *bounds.y1* are used to set up and store an offset that we will use later for accessing the *src* input.

```
//Store the offset that maps a position in the filter image to an
//offset from the current pixel:
_filterOffset.x = -filter.bounds.x1 - filterRadius.x;
_filterOffset.y = -filter.bounds.y1 - filterRadius.y;
```

Finally, the radius we calculated earlier is used to initialise the 2D ranged access to the *src* input.

```
//Set up the access for the src image
src.setRange(-filterRadius.x, -filterRadius.y, filterRadius.x, filterRadius.y);
```

(This code will actually overestimate the access needed for an even-sized input by one row and column, but that’s OK - the important thing is not to underestimate the range we need, which could give the wrong results or even lead to a crash.)
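To see the overestimate, consider a hypothetical 4-pixel-wide filter: the computed radius is 2, so the range from -2 to +2 covers 5 columns even though only 4 are read. A quick check (plain C++, not Blink code):

```cpp
#include <cassert>

// Number of src columns the kernel actually reads for a filter of width w:
// exactly w, one src column per filter column.
int columnsRead(int filterWidth) { return filterWidth; }

// Number of columns covered by setRange(-radius, ..., radius, ...):
// the offsets from -radius to +radius inclusive.
int columnsRequested(int filterWidth) {
    int radius = filterWidth / 2;
    return 2 * radius + 1;
}
```

For odd filter widths the two agree exactly; for even widths the requested range is one column wider than needed, which is safe.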

Inside the *process()* function, we iterate over the *filter* image to do the convolution. The filter is centred over the current output position and each input pixel covered by the filter is multiplied by the corresponding filter weight. These weighted input values are accumulated, as are the filter weights; if the filter weights do not sum to one, this is compensated for at the end in order to preserve the brightness of the input.

The weighted input values are accumulated into *valueSum*, declared here:

```
SampleType(src) valueSum(0);
```

The ConvolutionKernel has pixelwise access to its *src* input so we can access a whole pixel at a time. A pixel has the *SampleType* of the input, accessed here using *SampleType(src)*. All images in Nuke are floating point, so *SampleType* here will be a vector of floating point values. The number of components in the vector will be the same as the number of components in *src*; for example, this will be 4 if *src* is an RGBA image. The declaration above initialises all of the components to zero.
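The behaviour of this vector type can be mimicked with a minimal 4-component struct (a sketch assuming an RGBA *src*; in a real kernel, Blink supplies *SampleType(src)* for you with these operations already defined):

```cpp
#include <cassert>

// Minimal stand-in for SampleType(src) on an RGBA input: four floats with
// the operations the kernel uses (accumulate, scalar multiply, scalar divide).
struct Sample4 {
    float v[4];
    explicit Sample4(float x = 0.0f) { for (float& c : v) c = x; }
    Sample4& operator+=(const Sample4& o) {
        for (int i = 0; i < 4; ++i) v[i] += o.v[i];
        return *this;
    }
    Sample4& operator/=(float s) {
        for (int i = 0; i < 4; ++i) v[i] /= s;
        return *this;
    }
};

// Multiplying a sample by a filter weight scales every component.
inline Sample4 operator*(float w, const Sample4& s) {
    Sample4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = w * s.v[i];
    return r;
}
```

This mirrors the kernel's accumulation: `valueSum += filterVal * src(...)` scales all four components by the single filter weight and adds them on.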

With pixelwise access it’s also possible to access a single component from an input pixel. We do this with the *filter* input, as only the first component is used for the filter weights. A single component’s type will be the *ValueType* of its input, which in this case, since we are in Nuke, is always float. Here *filterSum*, into which the filter weights will be accumulated, is declared as type *ValueType(filter)* and its single value is initialised to zero:

```
ValueType(filter) filterSum(0);
```

We iterate over the bounds of the *filter* image in order to accumulate the weighted *src* values:

```
//Iterate over the filter image
for(int j = filter.bounds.y1; j < filter.bounds.y2; j++) {
for(int i = filter.bounds.x1; i < filter.bounds.x2; i++) {
//Get the filter value
ValueType(filter) filterVal = filter(i, j, 0);
//Multiply the src value by the corresponding filter weight and accumulate
valueSum += filterVal * src(i + _filterOffset.x, j + _filterOffset.y);
//Update the filter sum with the current filter value
filterSum += filterVal;
}
}
```

Random access is not relative to the current position, so the *filter* input is accessed using the position *(i, j)* inside the filter image. *filter(i, j, 0)* returns the zeroth component at this position as a *ValueType(filter)*.

*Warning: Because this iteration is done at every pixel, it’s a good idea to keep the filter input fairly small, otherwise this kernel will take a very long time to run! This can also cause problems when the kernel is running on a GPU which is connected to your display, if the time taken exceeds the time-out limit for your operating system.*

The *src* input is accessed with a 2D range, as in the BoxBlur2DKernel, except that this time we access an entire pixel at a time rather than a single component. The access looks just the same, *src(xOffset, yOffset)*, but this time it will return a value of type *SampleType(src)* instead of a single float. This value - which is actually a vector of values, as described above - is multiplied by the filter weight and accumulated into *valueSum*.

```
valueSum += filterVal * src(i + _filterOffset.x, j + _filterOffset.y);
```

At the end of the *process()* function, the *valueSum* is normalised by the accumulated filter weights to preserve the brightness of the input. The ConvolutionKernel doesn’t have any control over the values in its *filter* input, which might contain all zero values, so we need to be careful to avoid a division by zero here.

```
//Normalise the value sum, avoiding division by zero
if (filterSum != 0)
valueSum /= filterSum;
```
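The whole pattern, centring the filter over the current pixel, accumulating weighted values, and guarding the normalisation, can be sketched as plain C++ on single-channel float buffers (an approximation for illustration only; a real Blink kernel runs *process()* per pixel in parallel and handles the *eEdgeClamped* behaviour for you):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Convolve a single-channel image with a filter, clamping reads at the
// edges (mimicking eEdgeClamped) and normalising by the filter sum when
// it is non-zero. Images are stored row-major.
std::vector<float> convolve(const std::vector<float>& src, int srcW, int srcH,
                            const std::vector<float>& filt, int fW, int fH) {
    const int rX = fW / 2, rY = fH / 2;  // filter radius, as in init()
    std::vector<float> dst(src.size(), 0.0f);
    for (int y = 0; y < srcH; ++y) {
        for (int x = 0; x < srcW; ++x) {
            float valueSum = 0.0f, filterSum = 0.0f;
            for (int j = 0; j < fH; ++j) {
                for (int i = 0; i < fW; ++i) {
                    const float w = filt[j * fW + i];
                    // Centre the filter over (x, y) and clamp to the edges.
                    const int sx = std::clamp(x + i - rX, 0, srcW - 1);
                    const int sy = std::clamp(y + j - rY, 0, srcH - 1);
                    valueSum += w * src[sy * srcW + sx];
                    filterSum += w;
                }
            }
            // Normalise, avoiding division by zero.
            dst[y * srcW + x] = (filterSum != 0.0f) ? valueSum / filterSum
                                                    : valueSum;
        }
    }
    return dst;
}
```

A 3×3 box filter (all weights 1) applied to a constant image leaves it unchanged, since the normalisation divides nine equal weighted values by nine; an all-zero filter simply produces zero output rather than a division-by-zero error.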

Now that you’ve seen the main access patterns and kernel types, you should be ready to start writing your own kernels. If the examples we’ve looked at so far don’t cover everything you need, you can look at the list in *Example Kernels* for more clues, or at the Reference Guide.