While looking for a project on `Image Processing', we were intrigued by the problem of finding a particular object in a given image. We were excited by the idea of manipulating with images. A direct application of this can be sensed in Robot Vision.
How does our project relate to `Robot Vision'?
Imagine a robot who is asked to pick up a particular shape
of object present before him. First, he must sense it (i.e. identify its
location), then reach near the object, and pick it up. Here we deal with
the two dimensional analog of sensing the object ; the vision of
the robot corresponds to the given image and what he is supposed
to pick is what we wish to find in that image. Though it
is a simplified version of the actual problem, the implementation of the
actual `vision' won't need any new algorithms!
Example :
We wish to find a particular object in a 2-D image, say for example, Africa in the world map. Humans do this by first looking at the outline of Africa, then matching that outline in the world map. Our aim is to simulate this, for we want our robots to be as human as they can! We first get the edge map of both Africa and World Map, then compare the edges, keeping in mind that the comparison has to be independent of the scale and orientation of the object (see Methodology for details).
But, the problem in its original form is quite tough;
so we do some simplifications to the problem domain, to probe at least
to some depth in this field. Firstly, we assume that there is only one
object present in every image we deal with. Secondly, the intensity
difference between the object and the background is marked enough i.e.
more than some threshold value. With these simplifications let us see the
Theory
used to implement this.
Mr. F. Mokhtarian and Mr. Alan Mackworth has done a
lot of work in the field of our consideration. It is his work which has
inspired us to experiment in this field and make a humble effort to further
his fine work.
His papers (see references
) have defined a unique way of observing and studying 2-D
closed shapes. The fore ground of the image is assumed to ot
to be that of an object whose intensity differs markedly from that if its
background . The object is extracted from the image using this very property
. A edge detector was actually not used because not all edges of the image
were required. The program implemented can trace the outer most closed
curve of the object and thus proceed . Noise is softened and if the
background is lighter than the foreground then the image is inverted. Now
comes the most important part , the essence of the program . The Curvature
Scale Space . It is a mapping of the image of the object from three dimansional
space to a space which represents each point as a curvature w.r.t. the
arclength. The soln. to K(u,o~) = 0 , are the (u,o~) points plotted on
the C.S.S.plot . Now the image is represented as its C.S.S. plot .
This plot is unique for each curve and vice versa and stable in the sense
that slight change in the image is reflected as slight change in the plot
.
Xu(u,o~) Yuu(u,o~)
- Yu(u,o~) Xuu(u,o~)
curvature(u,o~) =
-------------------------------
( Xu2(u,o~)
+ Yu2(u,o~)
)1.5o~ denotes the width
of the Gaussian .
Sigma is directly proportional to the amount the
curve has evolved and hence the amount the curve is freed of noise and
smoothened.
Here :
X(u , o~) = Integ( x(v) * exp(-(u-v)2/2o~2)
* dv)
Y(u , o~) = Integ( y(v) * exp(-(u-v)2/2o~2)*
dv)
where:
Integ denotes the integral evaluated from -infinity to +infinity.
Here 'u' and 'v' are the normalized arc length parameters .
Solutions to "curvature(u,o~) = 0 " were the plotted
on curvature(u) vs 'sigma' gave the Curvature Scale Space
Image. Once the CSS contours of the main and auxiliary image were obtained
they were ready for match.
The CSS plot is reflective of the shape of the object only (size is normalised and curvature has no dependance on orientation or location in space). We see all the possible matches of two CSS plots by finding the no. of maximas of model above (say) 90% of image maximum.These are the no of possible matches . Then we extract all the maximas of image and all the maximas of the model above the threshold. We find the likely shift of the CSS plot of model in correspondance with the CSS plot of image and superimpose the maxima points and find the cost by finding the distance between the points.The minimum cost gives the best match .Do this for all the models and the image to find which model matches the image.
The choice of this method for
recognizing patterns over other methods was because this had the combined
credentials of being computationally cheap , simple and efficient. And
it had the properties of invariance(under rotation, translation ,uniform
scaling of object image) , stability (of curve and corresponding CSS image)
, uniqueness. Besides great amount of noise and distortion is tolerated.
During evolution and arc length evolution of the CSS contour the corresponding
planar curve is simplified , becomes convex , shrinks in length , smoothens
. Thus its recognition process partly resembles the humans pattern recognition
capabilities.
The image URL's are read from a file and loaded by the
implplementation
code. Our aim is to recognize the image amongst the model(s)
, overlooking all the noise and disturbances in the two images and obtaining
as accurate a result as ( we intend to) the human eyes would have returned.
For each of the images we first find the background intensity.
If it is white then we invert the image. A smoothening filter
is applied to smoothen the image.Next the outer boundary is traced
for simplifying the analysis of the images . This educes the amount of
data to be processed while preserving useful structural information about
the object boundaries.Now the image is
Gaussian smoothed( to reduce
false results due to noise) to give the smoothed gradient at each pixel.
Shown below is a test image and versions at various levels of smoothening.
|
|||
This is followed by conversion of the BINARY EDGE IMAGE
to CURVATURE SCALE SPACE IMAGE. This representation is useful where
one has to recognize noisy curves of arbitrary shapes at an arbitrary
scale & orientation.
This method convolutes a path (arc length ) based
parametric
representation of a planar curve ( closed 2 - D object) with a Gaussian
function , as the Gaussian width varies from a small to a large value.
Basically we plot the curvature vs. the normalized arc length of
the planar curve. As the Gaussian width is increased , the
scale of the image increases or the image evolves and thus the amount of
noise is reduced and the curve distortions smoothened. The benefits of
this representation are that it is invariant under rotation
,uniform scaling and translation of the curve. It s unique
(
or there is a 1 to 1 correspondence between a curve and is C.S.S.
image) and it is stable ( slight changes in the curve does not significantly
affect the C.S.S. image and vice versa). It is also an efficient ,
simple. For details see C.S.S.Imaging
. Shown below are the original image, the contour image
and
the curvature scale space image which could be obtained using C. S.
S. Imaging.
To match the two images , we basically implement the algorithm described in the previous section. The computed costs are stored in an array which is then traversed to get the minimum cost.
The lower the cost of the node, the better the match is. In fact, cost = 0 denotes a perfect match. High costs are not desirable. We apply a simple serial match approach to compute the node with the lowest cost. This node corresponds to the best possible match . The names of the best matched images are returned.
Thus we've found the pattern of the auxiliary image
in
our main image. This is how we have to executed our objective.
We have experimented with some images created in a Linux Utility "Xfig" with added uniform noise.Our sample input was :
|
||
|
|
|
The obtained convolved edges of all the images are :-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
cost[image<-->model
1] = 750.3223
cost[image<-->model
2] = INFINITY (No Match!)
cost[image<-->model
3] = 0.0 (Perfect Match !!!!!)
A robust implementation has been developed to find the best match of an image from a set of images already available in our data base . The property worked upon is that the CSS plot of an image is itself a unique representation of an image . Instead of matching images , matching their CSS plots is more computationally efficient and accurate . The maximas on the CSS plots are non intersecting and isolated . Thus by the maxima - matching algorithm we are not only trying to match a set of arbitrary finite points, but we are actually matching the whole CSS plot (assuming the typical CSS plot structure - elliptic curves with major axis several times minor axis and open at bottom and closed at top ). The results obtained satisfactorily show proper match with realistic cost outputs . The claims of matching objects with different orientation , size and position in space have been achieved and the method overlooks noise distribution. Though some drawbacks are that only a single object should be on the image , it should form a closed contour and the foreground and background should have a definite intensity difference.Thus we are concentrating on the boundary values of the image only and the internal features are ignored . This resticts the usefulness of the project to some extent and introduces scope for further development.
Scope for Future Work :-
Future work is possible in developing a matching algorithm to match 1 object in a given set of objects on an image which may be occluded or may occlude other objects in the view. Thereby adjacent or intersecting objects on image may be determined too. Another possibilty is to map the internal features of a object on image too. Also a more immediate problem may be to combine this work with that of morphology of contours which are arbitrarily generated since little work on this has been done before . And also in the corner detection algorithm apply by those trying to generate three dimension objects given more than 1 of its two - dimensional views.
The papers referred for the formulation of the objective
of
this project are as follows :
@In Proceedings{ location = {New Delhi,India
},
date = {22-23},
publisher = {IEEE Press },
authors = {Farzin Mokhtarian and Riku Suomela and Ka Cheung Chan},
title = { Image Point Feature Detection through Curvature Scale
Space },
year = { 1998 },
month = {December },
institution = {Centre for Vision Speech and Signal Processing,
Department of Electronic and Electrical Engineering,
University of Surrey, Guildford, England },
e-mail = {F.Mokhtarian@ee.surrey.ac.uk}
www = { http;//www.ee.surrey.ac.uk/Personal/F.Mokhtarian/
}
annote = {
This paper deals with obtaining the corner points of an image using the curvature scale space method. This method is very robust and performed better than the other corner detectors available at that time. The detector also provided point features at multiple scales.
}
@Article{
author = {F.Mokhtarian _et_al:1995},
title = {Silhoutte based Isolated Object Recognition through Curvature
Scale Space},
year = {1995},
volume={17},
number={5},
pages={539 - 544},
journal = {IEEE Trans. on Pattern Analysis and Machine Intelligence},
institution = {Centre for Vision and Signal Processing ,
Department of Electronic and Electrical Engg.
University of Surrey ,Guilford , England},
e-mail = {F.Mokhtarian@ee.surrey.ac.uk}
www = { http://www.ee.surrey.ac.uk/Personal/F.Mokhtarian/
}
@Article{ Mokhtarian,F./Mackworth,A.K. : 1986,
author = { F. Mokhtarian and A.K. Mackworth_et_al :1986},
title = {Scale-based description and recognition of planar curves and
two dimensional shapes},
year = { 1986},
month = {January},
volume = {8},
number= {1},
pages = {34-43},
journal={IEEE Trans. on Pattern Analysis and Machine Intelligence},
institution = {Univ. of Surrey,England},
e-mail = {F.Mokhtarian@ee.surrey.ac.uk}
www = { http;//www.ee.surrey.ac.uk/Personal/F.Mokhtarian/
}
@Article{ Mokhtarian,F./Mackworth,A.K. :1992},
author = {F. Mokhtarian and A.K. Mackworth_et_al :1992},
title = {A theory of multi-scale, curvature-based shape representation
for planar curves},
year = {1992},
month = {August},
volume = {14},
number ={8},
pages = {789-805},
journal = {IEEE Trans. on Pattern Analysis and Machine Intelligence},
institution= {Univ. of Surrey. England },
e-mail = {F.Mokhtarian@ee.surrey.ac.uk}
www = { http;//www.ee.surrey.ac.uk/Personal/F.Mokhtarian/
}
Also see Theoretical Background
for a general idea of the contents of these papers.
Several online resources are available for reference to
related topics :-