OpenCV Adventure: December 2010

Friday, December 31, 2010

Chamfer Matching

My understanding of Chamfer Matching is that it detects a particular object in an image by matching edge pixels. In the default example, the cluttered logo image is distance-transformed, and the logo image, reduced to its edges, is then looked up in the cluttered one. The look-up produces a list of candidates ordered by score. Put simply, the score for a location is determined by placing the logo on the distance-transformed image: it is the sum, over the logo's edge pixels, of the edge pixel value times the corresponding distance value. This is repeated with the logo placed at various locations. The implementation also makes multiple passes over a set of user-specified scales.
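The scoring described above can be sketched in a few lines of plain Python. This is only a toy illustration of the idea, not the OpenCV implementation: the function names, the brute-force L1 distance transform, and the tiny test image are all my own assumptions, and edge pixels are taken to have the value 1.

```python
def distance_transform(edges):
    """L1 distance from each pixel to the nearest edge pixel (brute force)."""
    h, w = len(edges), len(edges[0])
    edge_pts = [(y, x) for y in range(h) for x in range(w) if edges[y][x]]
    return [[min(abs(y - ey) + abs(x - ex) for ey, ex in edge_pts)
             for x in range(w)] for y in range(h)]


def chamfer_cost(dist, template_pts, oy, ox):
    """Sum of the distance values under the template's edge points at (oy, ox)."""
    return sum(dist[oy + ty][ox + tx] for ty, tx in template_pts)


def chamfer_match(edges, template_pts, templ_h, templ_w):
    """Slide the template over the image; return (cost, (y, x)) sorted ascending."""
    dist = distance_transform(edges)
    h, w = len(edges), len(edges[0])
    candidates = [(chamfer_cost(dist, template_pts, oy, ox), (oy, ox))
                  for oy in range(h - templ_h + 1)
                  for ox in range(w - templ_w + 1)]
    return sorted(candidates)


# Hypothetical toy data: a 3x3 square outline at row 2, column 1, plus one noise pixel.
image = [
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
# Edge points of the template (the same square outline), as (row, column) offsets.
square = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1), (2, 2)]

best_cost, best_pos = chamfer_match(image, square, 3, 3)[0]
print(best_cost, best_pos)
```

The lowest cost is the best candidate here: the template lands exactly on the square outline, so every edge point sits on a zero in the distance map.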

Performance: registering logo.png with logo_in_clutter.png takes 1-2 minutes with default parameters. The cluttered logo image is 600x824 in size. The logo image is 252x257 in size.

Parameters:

Increase the minMatchDistance parameter to reduce matching the same object more than once. More distinct matches are returned as a result. The parameter value seems to be a Manhattan distance.
Increasing maxMatches from its default also helps discover the intended match if it is rotated.
Use the trio [ scales, minScale, maxScale ] to improve matching across size variations.
pad_X and pad_Y are the x-step and y-step of the Sliding Window Iterator used by the current implementation.
The templScale parameter simply states the current scale factor of the supplied 'logo'; leave it as 1.
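To make the interplay of minMatchDistance and maxMatches concrete, here is a minimal sketch of how I imagine such a filter could work. It is an assumption on my part, not the code from the OpenCV sample: suppress_close_matches and the candidate list are hypothetical, and the Manhattan distance follows my guess above.

```python
def suppress_close_matches(candidates, min_match_distance, max_matches):
    """Keep the cheapest candidates that are at least min_match_distance apart.

    candidates: list of (cost, (x, y)); distance is Manhattan (|dx| + |dy|).
    """
    kept = []
    for cost, (x, y) in sorted(candidates):
        far_enough = all(abs(x - kx) + abs(y - ky) >= min_match_distance
                         for _, (kx, ky) in kept)
        if far_enough:
            kept.append((cost, (x, y)))
        if len(kept) == max_matches:
            break
    return kept


# Hypothetical candidates: three hits on one object near (10, 10), two elsewhere.
candidates = [
    (0.1, (10, 10)),
    (0.2, (11, 10)),
    (0.3, (40, 40)),
    (0.4, (12, 12)),
    (0.5, (80, 5)),
]

loose = suppress_close_matches(candidates, min_match_distance=1, max_matches=2)
strict = suppress_close_matches(candidates, min_match_distance=5, max_matches=2)
print(loose)
print(strict)
```

With the small distance, both slots go to the same object; with the larger one, the second slot goes to a different location.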

Observations:

Noise edges in the top half of the cluttered image often achieve 'high scores' against the 'logo', so it is essential to look at the other candidate matches as well.
The results and costs vectors are sorted in ascending order of cost.

Good introduction:

Presentation "Chamfer Matching and Hausdorff Distance" by Ankur Datta
Class Diagrams from the original implementation by Marius Muja: http://www.ros.org/doc/api/chamfer_matching/html/inherits.html

Thursday, December 30, 2010

Image Transform Samples

Pyramid Thresholds

Changing Pyramid Levels from 4 to 6 makes the regions bigger (threshold2 = 0).
Significant differences between steps are observed at the lower end of threshold2.

DFT

Use pictures of simple geometry to better see the characteristics of the output. Real-life images seem to give just a gray, noisy output.
http://www.cs.ioc.ee/~khoros2/linear/dft-pulse-example/front-page.html
My understanding of DFT is that it breaks down an image into a series of simpler images superimposed together, similar to the way the Fourier Theorem describes: a periodic function is made up of a series of sine waves of different amplitudes and phases, at frequencies that are multiples of the original's. I got this idea after reading the website above and a quick read of the DCT Basis Functions section of Digital Video Decompression.
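The superposition idea can be tried out on a 1-D pulse, in the spirit of the pulse example linked above. This is a minimal sketch using only the standard library, not the OpenCV dft() call; the dft and idft helpers and the 16-sample pulse are my own illustrative choices.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform: superimpose the sinusoids again (real part only)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


# A simple rectangular pulse: 4 samples high, 12 samples low.
pulse = [1.0] * 4 + [0.0] * 12

spectrum = dft(pulse)
magnitudes = [abs(c) for c in spectrum]   # amplitude of each frequency component
phases = [cmath.phase(c) for c in spectrum]
rebuilt = idft(spectrum)                  # should reproduce the pulse

for k, m in enumerate(magnitudes):
    print(k, round(m, 3))
```

Each entry of the spectrum holds the amplitude and phase of one sinusoid; adding them all back up with idft gives the original pulse, up to rounding.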

Distance Transform

Good introduction: http://homepages.inf.ed.ac.uk/rbf/HIPR2/distance.htm
The contours get 'rounder' going from C -> L1 -> L2.
Grid size (3, 5, Exact) only makes a significant difference for L2 in my tests using lena.jpg and stuffs.jpg.
Which objects have their edges detected varies with the edge threshold. It makes sense, as more details of an object are revealed when the threshold gets close to the intensity level of the object (in grayscale).
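The shape of the contours under the three distance types can be previewed without OpenCV. The sketch below is only an illustration with helper names of my own: it marks every grid point within a given distance of the centre under the C (Chebyshev), L1 (Manhattan) and L2 (Euclidean) metrics.

```python
import math


def dist_c(dx, dy):
    """Chebyshev (C) distance: the larger of the two axis offsets."""
    return max(abs(dx), abs(dy))


def dist_l1(dx, dy):
    """Manhattan (L1) distance: the sum of the two axis offsets."""
    return abs(dx) + abs(dy)


def dist_l2(dx, dy):
    """Euclidean (L2) distance."""
    return math.hypot(dx, dy)


def contour(metric, radius, half=4):
    """ASCII picture of all grid points within `radius` of the centre."""
    return ["".join("#" if metric(dx, dy) <= radius else "."
                    for dx in range(-half, half + 1))
            for dy in range(-half, half + 1)]


for name, metric in (("C", dist_c), ("L1", dist_l1), ("L2", dist_l2)):
    print(name)
    print("\n".join(contour(metric, 3)))
```

C fills a square, L1 a diamond, and L2 the closest thing to a disc that the grid allows.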

HoughLines

Added HoughCircles to the sample program. The radius is often wrong for small circles (pic1.png), but the centers are correct. This is consistent with a comment in the OpenCV book that the radius value should not be relied on.
HoughCircles() has a built-in Canny. Running a second Canny on an already Canny-edge-detected image adds a dotted edge inside the first one, resulting in a double circle. Somehow this gives a more accurate radius value.
Still unable to get all the circles detected from pic1.png test image.
>>> See later post on Randomized Hough Transforms for ellipse detection. <<<

Laplace Edge Detector

Added a switch to disable the edge detector, to compare the effects of different blurring methods.
Output from Median filtering is like drawing with a thick paint brush.
Added Bilateral Filtering to the rotation; it significantly slows down the movie.
Median filtering is slower than Gaussian, but a bit faster than Bilateral.
Bilateral filtering shows thick edges like Median, and preserves more edges than the latter.
Getting more bang for the buck with Gaussian.
Higher Sigma -> Larger Window -> Slower and blurrier images, except with Bilateral.
Made the Laplace aperture (odd and < 31) a function of Sigma. Reasonable output obtained within 7. Fewer and thinner edges with smaller aperture.
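The different character of Median and Gaussian smoothing shows up even on a single row of pixels. This is a toy 1-D sketch with my own helper functions and a made-up signal, not the OpenCV filters: the row contains one noise spike and one step edge, and the border is handled by repeating the end values.

```python
import math


def gaussian_kernel(sigma, ksize):
    """Normalised 1-D Gaussian weights for an odd window size."""
    half = ksize // 2
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample(signal, i):
    """Read signal[i], repeating the end values outside the valid range."""
    return signal[min(max(i, 0), len(signal) - 1)]


def gaussian_blur(signal, sigma, ksize):
    """Weighted average over the window: smooths noise and edges alike."""
    kernel = gaussian_kernel(sigma, ksize)
    half = ksize // 2
    return [sum(kernel[j + half] * sample(signal, i + j)
                for j in range(-half, half + 1))
            for i in range(len(signal))]


def median_blur(signal, ksize):
    """Middle value of the window: removes spikes, keeps the step sharp."""
    half = ksize // 2
    return [sorted(sample(signal, i + j) for j in range(-half, half + 1))[half]
            for i in range(len(signal))]


# Hypothetical pixel row: a noise spike at index 4 and a step edge at index 8.
row = [0, 0, 0, 0, 255, 0, 0, 0, 200, 200, 200, 200, 200, 200, 200, 200]

median_out = median_blur(row, 3)
gauss_out = gaussian_blur(row, sigma=1.0, ksize=3)
print(median_out)
print([round(v, 1) for v in gauss_out])
```

The median output has flat regions and a hard step, which is roughly the 'thick paint brush' look, while the Gaussian output smears both the spike and the step into their neighbours.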

Friday, December 24, 2010

See the Math in HTML Documentation

The math formulas in the HTML doc are supposed to be viewable as PNG images. The image file translation is handled by Sphinx using Latex. For me, it didn't work right away just by specifying the 'latex' path and 'dvipng' option in conf.py.

http://sphinx.pocoo.org/ext/math.html#module-sphinx.ext.pngmath

The above explains why the README in OpenCV/doc asks the user to install 'dvipng'.

In order to get Latex support, I installed the 'tetex-bin' Cygwin package, which in turn installs python 2.6 as my default. I then added 'easy_install' (setuptools 0.6) for python 2.6. I also re-installed Sphinx and plasTex in case the old versions were not compatible with python 2.6. Finally, I added the 'dvipng' python package with easy_install.

At this point, the first error from 'buildall' has to do with a missing 'latex.fmt' (and fmtutil.cnf, etc.). It turns out that my 'tetex-bin' installation is too small and does not cover that, as I found that the 'latex' directory is missing from /usr/share/texmf/tex/. So I did a greedy install, covering all the tetex packages except those obviously for development purposes.

Now the error is a missing 'utf8x.def', indicating a lack of Unicode support in this Latex installation. It has to be added manually as there is no corresponding Cygwin package.

http://www.unruh.de/DniQ/latex/unicode/
It redirects to www.softbase.org. Follow the link to the CTAN site for a unicode.zip file.

It's quite straightforward: simply unzip the unicode.tgz to /usr/share/texmf/tex/latex/. All the necessary files will be under the directory 'ucs'. Let latex know about this by updating the /usr/share/texmf/tex/latex/ls-R file: simply go to that directory and run 'mktexlsr'. Afterwards, double-check that 'utf8x.def' is in the ls-R file.

Monday, December 20, 2010

Wildcard Expansion

Tried to use a wildcard to supply a list of image files to the imagelist_creator sample as command line arguments. This requires linking another object, 'setargv.obj', to enable wildcard expansion, which is disabled by default.

The MSDN page that saved me:
http://msdn.microsoft.com/en-us/library/8bch7bkk%28v=VS.90%29.aspx

On Visual C++ 2008 Express, append 'setargv.obj' to the 'Additional options' under the 'Linker->Command Line' field of Project Properties.

Tuesday, December 14, 2010

3rd-Party + Test + Docs

Tried studying the header files now that they are much better organized. 'Feature2d' has a lot of new algorithms for feature detection and extraction. Overwhelmed by the changes and the massive number of functions in the header files. Need a better strategy to go over those.

Did a little look around the OpenCV package:
3rd-Party

Lapack is the installed (default?) linear algebra library. It's written in Fortran, developed at the University of Tennessee, Knoxville. Interestingly, Garmin's Chairman is an alumnus and donated a building for Computer Science. Looking back at the CMake config, noticed that there is another linear algebra library called Eigen. There is a comparison of Eigen versus Lapack.
ilmimf: ILM (the all-too-famous FX studio) donated an open-source package to support OpenEXR (High-Dynamic-Range photo format). No source is included in this package. It's a build option in CMake.
gtest: Google Test framework based on xUnit (unit test framework). Doesn't look like it is being used either.
libjasper: JPEG2000 library. Multiple resolutions sound like a good idea for network transmission. It is said to require more computing power. Wonder how many cameras or other real-life devices support it.

Test

Class name: Cxts ( Cv + Tests? )
The test sources are under project names cvtest_*; they are configured to be built in CMake. Notice that all tests are descendants of Cxts. Individual tests seem to be registered with the 'master' class instance upon each instantiation.
Only the top-level cvtest_*.exe are built and installed under 'bin' among the samples.
Running: run cvtest_*.exe (-h for help) to execute all or selected tests.
Check out the willowgarage site for test coverage table ( functions / conditions ).

Docs

Package comes with an all-in-one PDF. Tried building the HTML version, one for each language (C, C++ and Python), by following the instructions given in the README.
Need to manually install a few python packages in Cygwin. Start with 'setuptools', which includes the important 'easy_install' tool. Update: There are some CMake configuration options regarding docs (turn on Advanced and Grouped). That could be easier?!
Use 'easy_install' to download and install 'pyparsing', 'plasTex' and 'sphinx'. Installation is straightforward once I figured out what 'easy_install' is about. But spending an hour on this just to make the HTML doc seems like a bit too much. Strongly suggest that the HTML doc be included in the package.
As indicated in Release notes: Bibliography is indeed broken. Fetch the 'opencv.bib' from SVN repository to fix.

Monday, December 13, 2010

OpenCV 2.2 Installation Notes

Finished reading the OpenCV book. It is based on OpenCV 2.0. Now that version 2.2 is out, time to play with new things. Reading the release notes, it looks like a lot of work has been done on the 3D side of things since.

Build Options to be explored: TBB, IPP, QT, EIGEN2, OPENGL, CUDA.

And it would be interesting to build and run Android version.

Python-related build error in OpenCV-Win32 DEBUG build:

Missing python26-d.lib
http://stackoverflow.com/questions/1236060/compiling-python-modules-whith-debug-defined-on-msvc
http://tech.groups.yahoo.com/group/OpenCV/message/73972