Friday, December 31, 2010

Chamfer Matching

My understanding of Chamfer Matching is that it detects a particular object in an image by scanning the image for the object's edge pixels. In the default example, the cluttered logo image is distance-transformed, and the logo image, reduced to edges, is then looked up in the cluttered one. The look-up method produces a list of candidates ordered by score. Put simply, the score for one location is found by placing the logo on the distance-transformed image and summing, over the logo's edge pixels, the edge pixel value times the corresponding distance value. This is repeated with the logo placed at various locations. The implementation also makes multiple passes over a set of user-specified scales.
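The scoring idea can be sketched in a few lines of plain Python. This is only my reading of the approach, not the OpenCV implementation: the function names, the tiny test images and the city-block distance transform are all assumptions made for illustration, and the cost is averaged over the template's edge pixels so that lower is better.

```python
def distance_transform_l1(edges):
    """City-block distance from every pixel to the nearest edge pixel (value 1)."""
    h, w = len(edges), len(edges[0])
    inf = h + w  # larger than any city-block distance inside the grid
    dist = [[0 if edges[y][x] else inf for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def chamfer_candidates(image_edges, template_edges, step=1):
    """Slide the template over the distance image; return (cost, (x, y)) sorted by cost."""
    dist = distance_transform_l1(image_edges)
    h, w = len(image_edges), len(image_edges[0])
    th, tw = len(template_edges), len(template_edges[0])
    points = [(y, x) for y in range(th) for x in range(tw) if template_edges[y][x]]
    candidates = []
    for oy in range(0, h - th + 1, step):
        for ox in range(0, w - tw + 1, step):
            # Mean distance under the template's edge pixels at this offset.
            cost = sum(dist[oy + y][ox + x] for y, x in points) / len(points)
            candidates.append((cost, (ox, oy)))
    candidates.sort()  # ascending cost: best match first
    return candidates


# Hypothetical toy data: a small corner shape hidden in an otherwise empty edge map.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
template = [
    [1, 1],
    [1, 0],
]
best_cost, best_offset = chamfer_candidates(image, template)[0]
print(best_cost, best_offset)
```

The step argument plays the role of the sliding-window step: a larger step scans fewer positions at the risk of stepping over the best one.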

Performance: registering logo.png with logo_in_clutter.png takes 1-2 minutes with default parameters. The cluttered logo image is 600x824 in size. The logo image is 252x257 in size.

Parameters:

  • Increase the minMatchDistance parameter to reduce matching the same object more than once. More matches are returned as a result. The parameter value seems to be a Manhattan distance.
  • Increasing maxMatches from its default also helps discover the intended match if it is rotated.
  • Use the trio [ scales, minScale, maxScale ] to improve matching when the object varies in size.
  • pad_X and pad_Y are the x-step and y-step of the Sliding Window Iterator, which is used by the current implementation.
  • The templScale parameter simply labels the current scale factor of the supplied 'logo'; it should be left at 1.


Observations:

  • Noise edges in the top half of the cluttered image often achieve 'high scores' with the 'logo', so it is essential to look at the other candidate matches as well.
  • The results and costs vectors are sorted in ascending order of cost.


Good introduction:

Thursday, December 30, 2010

Image Transform Samples

Pyramid Thresholds

  • Changing Pyramid Levels from 4 to 6 makes the regions bigger ( threshold2 = 0)
  • Significant differences with each step observed at lower end of threshold2.
DFT

  • Use pictures of simple geometry to better tell the characteristics of output. Real life images seem to just give a gray noisy output.
  • http://www.cs.ioc.ee/~khoros2/linear/dft-pulse-example/front-page.html
  • My understanding of DFT is that it breaks down an image into a series of simpler images superimposed together, similar to the way the Fourier Theorem describes: a periodic function is made up of a series of sine waves of different amplitudes and phases, at frequencies that are multiples of the original. I got this idea after reading the website above and a quick read of the DCT Basis Functions section of Digital Video Decompression.
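To make the "sum of sinusoids" idea concrete, here is a minimal 1-D sketch in plain Python (the image case is the same thing applied along rows and columns). The pulse signal and the function names are my own choices for illustration; this is a direct O(n^2) evaluation of the DFT definition, not how OpenCV computes it.

```python
import cmath


def dft(signal):
    """Discrete Fourier transform by direct evaluation of the definition."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def rebuild(coeffs):
    """Superimpose the sinusoids again (inverse DFT) and keep the real part."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


pulse = [1, 1, 0, 0, 0, 0, 0, 0]  # a short rectangular pulse
coeffs = dft(pulse)

# Each coefficient describes one sinusoid: its amplitude and its phase.
amplitudes = [abs(c) / len(pulse) for c in coeffs]
phases = [cmath.phase(c) for c in coeffs]

restored = rebuild(coeffs)
print([round(v, 6) for v in restored])
```

Adding the component sinusoids back together gives the original pulse, which is the sense in which the transform "breaks down" the signal without losing anything.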
Distance Transform

  • Good introduction: http://homepages.inf.ed.ac.uk/rbf/HIPR2/distance.htm
  • The contours get 'rounder' going from C -> L1 -> L2.
  • The mask size choice (3, 5 or Exact) only makes a significant difference with L2, from my tests using lena.jpg and stuffs.jpg.
  • Which objects have their edges detected varies with the edge threshold. It makes sense, as more details of a certain object are revealed when the threshold gets close to the intensity level of the object (in grayscale).
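The three distance types differ only in how the distance between two pixels is measured. The sketch below is my own illustration, not the sample program: it marks every pixel within a fixed distance of the centre under each metric, which is the shape one contour of the distance transform would take around a single edge pixel.

```python
import math


def pixel_distance(dx, dy, metric):
    """Distance between two pixels separated by (dx, dy) under the given metric."""
    if metric == "C":    # chessboard: the larger of the two offsets
        return max(abs(dx), abs(dy))
    if metric == "L1":   # city block: sum of the offsets
        return abs(dx) + abs(dy)
    if metric == "L2":   # Euclidean
        return math.hypot(dx, dy)
    raise ValueError("unknown metric: " + metric)


def contour_rows(radius, metric):
    """Text picture of all pixels within `radius` of the centre pixel."""
    span = range(-radius, radius + 1)
    return [
        "".join("#" if pixel_distance(dx, dy, metric) <= radius else "." for dx in span)
        for dy in span
    ]


shapes = {metric: contour_rows(3, metric) for metric in ("C", "L1", "L2")}
for metric, rows in shapes.items():
    print(metric)
    print("\n".join(rows))
```

With radius 3 the printout shows a square for C, a diamond for L1 and an approximate disc for L2.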
HoughLines

  • Added HoughCircles to the sample program. The radius is often wrong with small circles (pic1.png). The centers are correct. This is consistent with a comment in the OpenCV book that the radius value should not be relied on.
  • HoughCircles() has a built-in Canny. Running a second Canny on a Canny edge-detected image adds a dotted edge inside the first one, resulting in a double circle. Somehow this gives a more accurate radius value.
  • Still unable to get all the circles detected from pic1.png test image.
  • >>> See later post on Randomized Hough Transforms for ellipse detection. <<<
Laplace Edge Detector
  • Added a switch to disable the edge detector, to compare the effects of different blurring methods.
  • Output from Median filtering is like drawing with a thick paint brush.
  • Added Bilateral Filtering to the rotation - significantly slows down the movie.
  • Median filtering is slower than Gaussian, but a bit faster than Bilateral.
  • Bilateral filtering shows thick edges like Median, and preserves more edges than the latter.
  • Getting more bang for the buck with Gaussian.
  • Higher Sigma -> Larger Window -> Slower and Blurrier images except Bilateral.
  • Made the Laplace aperture (odd and < 31) a function of Sigma. Reasonable output obtained within 7. Fewer and thinner edges with smaller aperture.

Friday, December 24, 2010

See the Math in HTML Documentation

The math formulas in the HTML docs are supposed to be viewable as PNG images. The image translation is handled by Sphinx using LaTeX. For me, it didn't work right away just by specifying the 'latex' path and the 'dvipng' option in conf.py.

http://sphinx.pocoo.org/ext/math.html#module-sphinx.ext.pngmath

The above explains why the README in OpenCV/doc asks the user to install 'dvipng'.
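For reference, the conf.py settings mentioned above look roughly like this. pngmath_latex and pngmath_dvipng are the option names documented on the Sphinx page linked above; the paths are assumptions for a Cygwin install and need adjusting to wherever the two programs actually live.

```python
# conf.py (fragment) - settings for the sphinx.ext.pngmath extension.
# The paths below are assumed Cygwin locations, not taken from the OpenCV docs.
extensions = ['sphinx.ext.pngmath']   # keep any extensions already listed
pngmath_latex = '/usr/bin/latex'      # command used to run LaTeX
pngmath_dvipng = '/usr/bin/dvipng'    # command used to convert DVI to PNG
```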

In order to get LaTeX support, I installed the 'tetex-bin' Cygwin package, which in turn installed Python 2.6 as my default. I then added 'easy_install' (setuptools 0.6) for Python 2.6. I also re-installed Sphinx and plasTex in case the old versions were not compatible with Python 2.6. Finally, I added the 'dvipng' Python package with easy_install.

At this point, the first error from 'buildall' has to do with a missing 'latex.fmt' (and fmtutil.cnf, etc). It turns out that my 'tetex-bin' installation is too small and does not cover that: the 'latex' directory is missing from /usr/share/texmf/tex/. So I did a greedy install, covering all the tetex packages except those obviously meant for development purposes.

Now the error is missing 'utf8x.def' - indicating a lack of Unicode support in this Latex installation. It has to be added manually as there is no corresponding Cygwin package.

http://www.unruh.de/DniQ/latex/unicode/
It will be redirected to www.softbase.org. Follow the link to CTAN site for a unicode.zip file.

It's quite straightforward: simply unzip the unicode.tgz to /usr/share/texmf/tex/latex/. All the necessary files will be under the directory 'ucs'. Let LaTeX know about this by updating the /usr/share/texmf/tex/latex/ls-R file: simply go to that directory and run 'mktexlsr'. Afterwards, double-check that 'utf8x.def' is listed in the ls-R file.

Monday, December 20, 2010

Wildcard Expansion

Tried to use a wildcard to supply a list of image files to the imagelist_creator sample as command-line arguments. This requires linking another object, 'setargv.obj', to enable wildcard expansion, which is disabled by default.

The MSDN page that saved me:
http://msdn.microsoft.com/en-us/library/8bch7bkk%28v=VS.90%29.aspx

On Visual C++ 2008 Express, append 'setargv.obj' to the 'Additional options' under the 'Linker->Command Line' field of Project Properties.

Tuesday, December 14, 2010

3rd-Party + Test + Docs

Tried studying the header files now that they are much better organized. 'Feature2d' has a lot of new algorithms for feature detection and extraction. Overwhelmed by the changes and the massive number of functions in the header files. Need a better strategy to go over those.

Did a little look around the OpenCV package:
3rd-Party

  • Lapack is the installed (default?) linear algebra library. It's written in Fortran, developed at the University of Tennessee, Knoxville. Interestingly, Garmin's Chairman is an alumnus and donated a building for Computer Science. Looking back at the CMake config, noticed that there is another linear algebra library called Eigen. There is a comparison between Eigen and Lapack.
  • ilmimf: ILM (the all-too-famous FX studio) donated an open-source package to support OpenEXR (a High-Dynamic-Range photo format). No source is included in this package. It's a build option in CMake.
  • gtest: Google Test framework, based on xUnit (unit test framework). Doesn't look like it is being used either.
  • libjasper: JPEG2000 library. Multiple resolutions sound like a good idea for network transmission. It is said to require more computing power. Wonder how many cameras or other real-life devices support it.

Test

  • Class name: Cxts ( Cv + Tests? )
  • The test sources are under project names cvtest_*; they are configured to be built in CMake. Notice that all tests are descendants of Cxts. Individual tests seem to be registered with the 'master' class instance upon each instantiation.
  • Only the top-level cvtest_*.exe are built and installed under 'bin' among the samples.
  • Running: run the cvtest_*.exe (-h for help) to execute all or selected tests.
  • Check out the willowgarage site for test coverage table ( functions / conditions ).

Docs

  • The package comes with an all-in-one PDF. Tried building the HTML version, one for each language (C, C++ and Python), by following the instructions given in the README.
  • Need to _manually_ install a few Python packages in Cygwin. Start with 'setuptools', which includes the important 'easy_install' tool. Update: there are some CMake configuration options regarding docs (turn on Advanced and Grouped). That could be easier?!
  • Use 'easy_install' to download and install 'pyparsing', 'plasTex' and 'sphinx'. Installation is straightforward once I figured out what 'easy_install' is about. But spending an hour on this just to make the HTML docs seems a bit too much. Strongly suggest including the HTML docs in the package.
  • As indicated in the release notes, the Bibliography is indeed broken. Fetch 'opencv.bib' from the SVN repository to fix it.

Monday, December 13, 2010

OpenCV 2.2 Installation Notes

Finished reading the OpenCV book. It is based on OpenCV 2.0. Now that version 2.2 is out, it is time to play with new things. Reading the release notes, it looks like a lot of work has been done on the 3D side of things since.

Build Options to be explored: TBB, IPP, QT, EIGEN2, OPENGL, CUDA.

And it would be interesting to build and run Android version.



Python-related build error in OpenCV-Win32 DEBUG build:

Thursday, March 18, 2010

It's time to pick a simple example to gain more confidence! I was able to finish the 'drawing.c' example. It goes through the OpenCV drawing functions: lines, rectangles, and so on. They are all pretty rich APIs, able to go beyond the basic drawing primitives. The PolyLine API is a bit difficult to understand at the beginning. It turns out it is able to draw a number of contours with a single call. The angle arguments of ellipse are confusing.

OpenCV uses a 'liberal font' called Hershey. It's a vector font. Interesting story behind this:
http://idlastro.gsfc.nasa.gov/idl_html_help/Hershey_Vector_Font_Samples.html

The example uses the random number generator APIs provided by OpenCV. Equally impressive, they let you specify the distribution type (Uniform or Normal). Useful for generating an array of numbers, such as a 'noisy screen'.

Wednesday, March 17, 2010

With some experimentation, it turns out that adjusting the following parameters is useful:
  • command-line thresholds: increasing the range reduces the noise seen in the 'Diff window'
  • perimeter ratio argument to cvSegmentFGMask(): reduce this to keep smaller areas from being thrown away. It works well if the noise has been reduced.
Now able to put the 'car thieves' on the 'Connected' screen after tuning the above arguments.

Notes: a car leaving the parking lot leaves a permanent 'change'. Passing cars leave marks also. The rays from headlights introduce a big block of 'differences'. A low frame rate makes fast-moving objects blurry. A person is more likely to show up on the 'radar' if he/she wears light-colored clothes. Skin is 'dark'.

The number of frames to learn MATTERS: use a small value if the expected movement is small, as in the tennis lesson videos. Otherwise, all the subsequent movements will still be 'in the dark'.
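To see why the thresholds and the learning period interact, here is a much-simplified stand-in for the idea. It is not the codebook algorithm from the sample: it just learns a single min/max range per pixel from the training frames and flags anything outside that range, widened by two margins. All names and pixel values are made up for illustration.

```python
def learn_background(frames):
    """Per-pixel (low, high) intensity range seen during the learning frames."""
    low = [min(values) for values in zip(*frames)]
    high = [max(values) for values in zip(*frames)]
    return low, high


def foreground_mask(frame, low, high, min_margin, max_margin):
    """1 where a pixel falls outside the learned range widened by the margins."""
    return [
        1 if value < lo - min_margin or value > hi + max_margin else 0
        for value, lo, hi in zip(frame, low, high)
    ]


# Hypothetical grayscale frames, flattened to four pixels each.
training = [
    [10, 12, 50, 52],
    [11, 13, 49, 51],
    [9, 12, 51, 50],
]
low, high = learn_background(training)

# Pixel 0 flickers slightly (noise); pixel 2 is a genuinely new bright object.
new_frame = [13, 12, 90, 51]
tight = foreground_mask(new_frame, low, high, 0, 0)
loose = foreground_mask(new_frame, low, high, 3, 3)
print(tight, loose)
```

With zero margins the flicker is reported as foreground; widening the margins suppresses it while the real object remains. In this toy model, anything that moves during the learning frames widens the learned range at those pixels, so similar values seen later are treated as background.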

Finally, had trouble editing an SVN log message. This needs the pre-revprop-change hook to be set up. The templates that come with the SVN installation are Unix shell scripts. Was able to find a great working one for Windows here (Awesome!):
http://ayria.livejournal.com/33438.html

Saturday, March 13, 2010

Finally able to run the bgfg_codebook demo. A few hours were spent getting the videos ready. The source is surveillance camera video posted on YouTube, downloaded as MP4 (discovered that short ones, like 0:09, don't seem to work). The next step is to get the video into a suitable input format. MediaCoder does that job (the latest 0.7XX version keeps failing though). The input format chosen is Huffyuv (requires a simple codec installation) in an AVI container. I suppose the requirement is to have something like motion JPEG where every frame is a full frame, unlike MP4. For some reason this PC is unable to play motion JPEG in an AVI container.

The results with the default parameters are not very good. The 'Frame Differences' output seems alright, but the 'Connected Component' output is often unable to 'connect the dots'. Tested using the Car-Theft-At-JoJo video.

Downloaded the Tennis Lesson #1 from "Fuzzy Yellow Balls". Wonder if that would give a better result.

Friday, March 12, 2010

Spent the last few days on the sample bgfg_codebook. At this point, I have slowly examined the source code of the sample program, which exercises the functions of a cvaux module: cvbgfg_codebook. Have yet to run the program. Along the way, explored a little of the dynamic storage provided by CVCore: MemStorage and CVSeq. Another topic touched on is the built-in contour tracing, supposedly the same as edge-detection/perimeter-finding. The ApproxPoly() function uses something called the Douglas-Peucker algorithm to reduce the number of points needed to represent a curve. Wonder if that is what Inkscape does for 'Simplify Path'. Amazed by the depth of this OpenCV library.
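For the record, the Douglas-Peucker idea fits in a short recursive function. This is a textbook-style sketch in plain Python, not the ApproxPoly() source: keep the two end points, find the point farthest from the chord between them, and recurse on both halves only if that point is farther away than a tolerance. The sample polyline and the tolerance are made up, and distance is measured to the chord segment.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the segment, clamped to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Return a simplified copy of the polyline within tolerance epsilon."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord first-last.
    index, dmax = 0, -1.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [points[0], points[-1]]  # everything in between is close enough
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # the split point appears in both halves


curve = [(0, 0), (1, 0.1), (2, 0), (3, 3), (4, 0), (5, 0.1), (6, 0)]
simplified = douglas_peucker(curve, 0.5)
print(simplified)
```

The small wiggles at (1, 0.1) and (5, 0.1) are dropped while the spike at (3, 3) survives; a larger epsilon removes more points.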

Saturday, March 6, 2010

cvAdaptiveSkinDetector

It turns out that the argument to cvWaitKey() is the timeout value for waiting for user input. Passing zero (0) makes it wait until the user presses a key.

Spent a few hours today looking through the AdaptiveSkinDetector sample program. It is an exercise of the cvAdaptiveSkinDetector module. The program expects either a series of frame pictures split from a video, or it grabs frames from the webcam input. The skin threshold is just a range of Hue values. The adaptive part comes in as it adjusts the range based on the histogram collected from frame to frame.
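A rough sketch of what "adjusting the range based on the histogram" could look like is given below. It is a guess at the general idea only and not the cvAdaptiveSkinDetector algorithm: the update rule, the rate parameter, the starting range and the hue values are all hypothetical.

```python
from collections import Counter


def skin_mask(hues, low, high):
    """1 where the hue falls inside the current skin range."""
    return [1 if low <= h <= high else 0 for h in hues]


def adapt_range(hues, low, high, rate=0.5):
    """Move the hue range part of the way towards the hues actually observed in it."""
    histogram = Counter(h for h in hues if low <= h <= high)
    if not histogram:
        return float(low), float(high)  # nothing classified as skin: keep the range
    observed_low, observed_high = min(histogram), max(histogram)
    new_low = low + rate * (observed_low - low)
    new_high = high + rate * (observed_high - high)
    return new_low, new_high


# Hypothetical hue values for one frame and a hypothetical starting range.
frame_hues = [5, 8, 9, 10, 11, 12, 40, 90]
low, high = 3, 33
mask = skin_mask(frame_hues, low, high)
new_range = adapt_range(frame_hues, low, high)
print(mask, new_range)
```

Calling adapt_range() once per frame narrows the range towards the hues that keep showing up inside it.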

Friday, February 12, 2010

Turned off the 'ignore' settings so that the libraries could be added/imported to SVN. This solves the issue of missing files at link time. Now able to run the adaptive-skin-detector demo. Still unable to make it stop at each frame. Looks like cvWaitKey() somehow does not work.

Thursday, February 11, 2010

Starting a journal

On the way to the studio tonight, it suddenly hit me: writing down my progress as a journal seems like a good way to see how much time I put into this experiment. Before today, I spent one month reading the book Practical Computer Vision Using C and summarized the points in an Xmind map.
Last night I resumed by setting up SVN and importing the OpenCV 2.0 software. Used CMake 2.8 and Visual C++ 2008 Express to build the software. Ran into a link-time problem: unable to find the libgcc static library (.a) for the ffmpeg library.