03 Dec 07

### Code Optimisation

My brother has been helping me with the maths for a little project (well, doing it for me, in fact, since I’m a mathematical dunce).

The idea was to create a more flexible version of the classic FreeFrame ‘blow’ and ‘smear’ effects.

In this variation, two movable points, A and B, are defined. Pixels on an imaginary line between the two points are smeared towards the edge of the frame, in a direction at 90 degrees to the line. It’s much easier to explain with a diagram than in words.

With Ed’s help, I was able to get a version of the effect working a while back, but I was never satisfied with the code. On Saturday, I finally got around to sitting down and re-writing it to run much more efficiently. Following Ed’s suggestions, I separated out some of the heavier maths into functions, which would only be called when needed. I’ve never tried doing this before, as for some reason I’d always assumed you could only have one function in a CIKernel. It seems to work, though, so it’s a technique I’ll be using in the future.

Here is the final code:

    // Function to calculate value of t
    float calculate_t(vec2 A, vec2 B, vec2 C)
    {
        // t is the position along A->B (0 at A, 1 at B) of the point where the
        // normal from C meets the line. If t is in the range 0 to 1, the current
        // pixel lies on a line normal to B-A that crosses the segment between A and B
        return dot((B-A),(C-A)) / dot((B-A),(B-A));
    }

    // Function to calculate position of pixel to copy
    vec2 smearSource(vec2 A, vec2 B, vec2 C, float t)
    {
        // D is the point on B-A where the normal from the current pixel meets it
        vec2 D = A+t * (B-A);
        return D;
    }

    // Main kernel function (returns image)
    kernel vec4 tb_superSmear(sampler Image, vec2 A, vec2 B)
    {
        // Current pixel location
        vec2 C = samplerCoord(Image);
        // Pixels on the unaffected side of the line get t = -1.0, so they are left alone
        float t = ((A.x-B.x)*(C.y-B.y) < (C.x-B.x)*(A.y-B.y)) ? -1.0 : calculate_t(A, B, C);
        // Outside the range 0 to 1, the pixel samples itself; otherwise it copies from the line
        vec2 samplePos = (t < 0.0) ? C : (t > 1.0) ? C : smearSource(A, B, C, t);
        // Output
        return sample(Image, samplePos);
    }
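To check the maths away from Quartz Composer, the per-pixel logic can be sketched on the CPU. The following is a rough Python model of the kernel above, not Core Image code: points are plain `(x, y)` tuples, the helper names `dot`, `sub` and `sample_position` are made up for illustration, and it assumes A and B are different points (otherwise the division in `calculate_t` fails).

```python
# CPU-side sketch of the kernel's per-pixel logic (plain Python, not Core Image).
# Helper names are hypothetical; points are (x, y) tuples. Assumes A != B.

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def calculate_t(a, b, c):
    """Position along A->B (0 at A, 1 at B) where the normal from C meets the line."""
    ab = sub(b, a)
    return dot(ab, sub(c, a)) / dot(ab, ab)

def sample_position(a, b, c):
    """Coordinate that the pixel at C would copy its colour from."""
    # Same side test as the kernel: pixels on this side of the line are left alone.
    if (a[0] - b[0]) * (c[1] - b[1]) < (c[0] - b[0]) * (a[1] - b[1]):
        return c
    t = calculate_t(a, b, c)
    # Pixels whose normal misses the segment between A and B are also left alone.
    if t < 0.0 or t > 1.0:
        return c
    ab = sub(b, a)
    # D = A + t * (B - A): the point on the line directly "above" the pixel.
    return (a[0] + t * ab[0], a[1] + t * ab[1])

A, B = (0, 0), (8, 0)
print(sample_position(A, B, (2, -3)))   # (2.0, 0.0): smeared from the line
print(sample_position(A, B, (2, 3)))    # (2, 3): wrong side, unchanged
print(sample_position(A, B, (12, -3)))  # (12, -3): beyond B, unchanged
```

Every pixel in the affected strip reads from the same point on the line as its neighbours along the normal, which is what produces the smear out to the edge of the frame.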

If I change the line

    float t = ((A.x-B.x)*(C.y-B.y) < (C.x-B.x)*(A.y-B.y)) ? -1.0 : calculate_t(A, B, C);

to

    float t = ((A.x-B.x)*(C.y-B.y) > (C.x-B.x)*(A.y-B.y)) ? -1.0 : calculate_t(A, B, C);

I’m able to ‘flip’ the effect from one side of the line to the other.

Rather than adding more code to the CIKernel, I thought I’d just have two copies of the kernel, with just that one line changed, then switch between them with a Multiplexer. This cuts down on the work that has to be done per pixel, and patches that don’t produce an output (i.e. ones that are not routed to an output port by the Multiplexer) shouldn’t consume system resources anyway.
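The reason swapping `<` for `>` flips the effect is that the comparison is really a test of which side of the line through A and B the pixel sits on. A small Python sketch (again not Core Image code, and with hypothetical function names) makes this visible by writing the test as a single signed value:

```python
# Plain-Python sketch of the side test; function names are hypothetical.

def side(a, b, c):
    """Signed value: its sign says which side of the line through A and B pixel C is on."""
    return (a[0] - b[0]) * (c[1] - b[1]) - (c[0] - b[0]) * (a[1] - b[1])

def skipped_original(a, b, c):
    # Equivalent to the '<' version of the kernel line: these pixels get t = -1.0.
    return side(a, b, c) < 0.0

def skipped_flipped(a, b, c):
    # Equivalent to the '>' version: the opposite half of the frame is skipped.
    return side(a, b, c) > 0.0

A, B = (0, 0), (8, 0)
print(skipped_original(A, B, (2, 3)), skipped_flipped(A, B, (2, 3)))    # True False
print(skipped_original(A, B, (2, -3)), skipped_flipped(A, B, (2, -3)))  # False True
print(skipped_original(A, B, (2, 0)), skipped_flipped(A, B, (2, 0)))    # False False
```

Pixels lying exactly on the line are skipped by neither version, so both copies of the kernel treat the line itself the same way.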

The whole question of optimising CIKernel code isn’t something I’ve really addressed before. It’s always been more a case of just getting things to work. It’s definitely something I’ll have to start considering in the future though.