Motion - Dilate Nine Speed Patch

Dilate9 Speed Patch


This patch was created to improve the speed of the dilate9 function in alg.c. As the execution profile below shows, the function is CPU intensive while motion is being detected, accounting for 18.83% of the profiled run time:

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 18.83     96.56    36.12      193   187.16   187.16  dilate9
(See MotionProfiling for more info and the example profile from which the above snippet was taken.)

While writing the patch, I found a bug in the original function. The patch includes a fix for it.

Description of Patch

The patch has two purposes. The first and foremost purpose is to increase the speed of the dilate9 function. The current function is slow, mainly because it performs the same calculations several times. I wrote a test program that compares the current/old function and the optimized/new function:

Running 2000 iterations of old_dilate9 with image size 320x240: 42.75 ms/iteration
Running 2000 iterations of new_dilate9 with image size 320x240: 10.89 ms/iteration

We can also compare the dilate9 entry in the execution profile with the patch applied against the original entry shown above:

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  9.51     54.04     7.05      144    48.96    48.96  dilate9

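As can be seen, the optimized function takes roughly 74% less time per call by either measurement (42.75 ms down to 10.89 ms in the test program, 187.16 ms down to 48.96 ms in the profile), which makes it nearly four times as fast. The speed improvement is achieved mainly by cutting down on the number of statements executed in the inner loop.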
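The kind of saving described above, avoiding repeated calculations in the inner loop, can be illustrated with a minimal sketch. This is not the code from the patch: the function names, the separate output buffer and the border handling (border pixels are simply set to 0) are assumptions made for illustration. The naive version re-reads all nine neighbours for every pixel, while the sliding version computes each 3-pixel column maximum once and reuses it for the three output pixels that share it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: largest of three unsigned pixel values. */
static unsigned char max3(unsigned char a, unsigned char b, unsigned char c)
{
    unsigned char m = a > b ? a : b;
    return m > c ? m : c;
}

/* Naive 3x3 dilation: every interior pixel re-reads all nine neighbours.
 * Border pixels are set to 0 (an assumption of this sketch). */
void dilate3x3_naive(const unsigned char *src, unsigned char *dst,
                     int width, int height)
{
    memset(dst, 0, (size_t)width * (size_t)height);
    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            unsigned char m = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++) {
                    unsigned char p = src[(y + dy) * width + (x + dx)];
                    if (p > m)
                        m = p;
                }
            dst[y * width + x] = m;
        }
    }
}

/* Same output, but each column maximum is computed once per row and
 * carried along as the window slides one pixel to the right. */
void dilate3x3_sliding(const unsigned char *src, unsigned char *dst,
                       int width, int height)
{
    memset(dst, 0, (size_t)width * (size_t)height);
    if (width < 3 || height < 3)
        return;
    for (int y = 1; y < height - 1; y++) {
        const unsigned char *up   = src + (y - 1) * width;
        const unsigned char *mid  = src + y * width;
        const unsigned char *down = src + (y + 1) * width;
        unsigned char left   = max3(up[0], mid[0], down[0]);
        unsigned char centre = max3(up[1], mid[1], down[1]);
        for (int x = 1; x < width - 1; x++) {
            unsigned char right = max3(up[x + 1], mid[x + 1], down[x + 1]);
            dst[y * width + x] = max3(left, centre, right);
            left = centre;   /* reuse the two maxima already computed */
            centre = right;
        }
    }
}
```

In this sketch the inner loop of the sliding version reads three new pixels per output pixel instead of nine, which is the sort of reduction in inner-loop work the paragraph above refers to.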

The second, and also very important, purpose is to fix a bug detected in the current dilate9 function. The bug occurs because the function treats the image as an array of (signed) char, so luminance (Y) values above 127 are read as negative numbers, and it then uses the macro MAX(x, y), which compares the absolute values of its two operands. As a result, luminance values above 127 may in some cases be considered smaller than luminance values below 127. See the mailing list discussion for more info.
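The signed char half of this problem can be shown with a small sketch. It does not reproduce Motion's MAX macro: PLAIN_MAX below is a hypothetical stand-in that does an ordinary comparison, and both function names are made up for illustration. It also assumes a typical two's-complement platform, where converting 200 to signed char yields -56.

```c
/* Hypothetical stand-in for a generic maximum macro (not Motion's MAX). */
#define PLAIN_MAX(x, y) ((x) > (y) ? (x) : (y))

/* Maximum of two luminance values when the bytes are read as signed char.
 * A value such as 200 typically becomes -56 and loses against 100. */
int max_as_signed(unsigned char a, unsigned char b)
{
    signed char sa = (signed char)a;
    signed char sb = (signed char)b;
    return (unsigned char)PLAIN_MAX(sa, sb);
}

/* Maximum of the same two values when they are kept as unsigned char. */
int max_as_unsigned(unsigned char a, unsigned char b)
{
    return PLAIN_MAX(a, b);
}
```

Under those assumptions, max_as_signed(200, 100) picks 100 while max_as_unsigned(200, 100) picks 200, so a bright pixel can lose to a darker one when the image is read through a signed type.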

Installation of Patch

The installation is very straightforward:

  1. tar xzf motion-3.1.18_snap4.tar.gz
  2. cd motion-3.1.18
  3. zcat ../motion-3.1.18_snap4-dilate9.patch.gz | patch -p1
  4. ./configure && make

Testing and Validation

Since I don't have any test pictures for which I know the expected result after running the function, I created a program that randomly generates a picture, runs both the old and the new functions on it, and compares the results:

Testing accuracy of new_dilate9 compared to old_dilate9; 15000 iterations with image size 320x240: all ok

In other words, the new function generated the same result as the old in 15000 random cases. Note that in this test, the bug mentioned above had been fixed in the old function as well.
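A randomized comparison of this kind could look roughly like the sketch below. It is not the test program used above: the filter_fn signature, count_mismatches and the two stand-in filters (two ways of inverting a pixel) are hypothetical and only serve to make the example self-contained. The real test would pass the old and new dilate9 implementations instead.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical signature shared by the two implementations under test. */
typedef void (*filter_fn)(const unsigned char *src, unsigned char *dst,
                          int width, int height);

/* Runs both functions on randomly generated images and returns the number
 * of iterations whose outputs differ, or -1 if allocation fails. */
int count_mismatches(filter_fn old_fn, filter_fn new_fn,
                     int width, int height, int iterations, unsigned int seed)
{
    size_t size = (size_t)width * (size_t)height;
    unsigned char *src = malloc(size);
    unsigned char *out_old = malloc(size);
    unsigned char *out_new = malloc(size);
    int mismatches = 0;

    if (!src || !out_old || !out_new) {
        free(src);
        free(out_old);
        free(out_new);
        return -1;
    }
    srand(seed);  /* fixed seed so a failing run can be repeated */
    for (int i = 0; i < iterations; i++) {
        for (size_t p = 0; p < size; p++)
            src[p] = (unsigned char)(rand() & 0xFF);
        old_fn(src, out_old, width, height);
        new_fn(src, out_new, width, height);
        if (memcmp(out_old, out_new, size) != 0)
            mismatches++;
    }
    free(src);
    free(out_old);
    free(out_new);
    return mismatches;
}

/* Stand-in "old" filter: inverts each pixel by subtraction. */
void invert_subtract(const unsigned char *src, unsigned char *dst,
                     int width, int height)
{
    for (size_t i = 0; i < (size_t)width * (size_t)height; i++)
        dst[i] = (unsigned char)(255 - src[i]);
}

/* Stand-in "new" filter: inverts each pixel with XOR, same result. */
void invert_xor(const unsigned char *src, unsigned char *dst,
                int width, int height)
{
    for (size_t i = 0; i < (size_t)width * (size_t)height; i++)
        dst[i] = (unsigned char)(src[i] ^ 0xFF);
}
```

A return value of 0 from count_mismatches corresponds to the "all ok" result reported above. Random images only show that the two functions agree with each other, not that either one is correct, which is why the old function had to have the bug fixed before the comparison.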

-- PerJonsson - 11 Nov 2004

Discussion and Comments

I will post a patch for the dilate5 function next week as well. It can be optimized in a similar way to the dilate9 function.

-- PerJonsson - 12 Nov 2004

I have already added this patch to my source tree.

And I have released a snapshot release with it.

Excellent job.

-- KennethLavrsen - 12 Nov 2004
Topic revision: r6 - 30 Jan 2005, KennethLavrsen