-------- Original Message --------
> Date: Mon, 23 Jun 2008 20:08:13 +0900
> From: Casimir <pikEISPAMMMseli / welho.com>
> To: ruby-talk / ruby-lang.org
> Subject: Re: Texture/image similarity with ruby? (computer vision)

Dear Casimir,


As scientific folklore has it, if you haven't got the answer, it's because you did not understand
the question ...

> Axel Etzold wrote on Thu, 19 Jun 2008 20:33:43 +0900
> 
> > well, what does similarity of two images/textures mean for you ?
> 
> Perceptual similarity as a human subject would experience it.

I think this statement is not (yet) useful, since there is no such thing as a unique human perception
of an image. A subset of images in your collection might be similar because they all show postcard destinations (the skyline of Manhattan, Big Ben in London, a camel in front of the Pyramids in Cairo, a koala in Australia, a Tyrolean yodeller), but some of the same images show living beings whereas others show architecture, and are therefore dissimilar to a psychologist or to a civil engineer ...
I know of many neuroscientists who train monkeys, usually for at least six months, more often for a year, on some very specific task to test a particular hypothesis about early vision. (It takes this long because you can't explain to the monkey
in English what to do, they'll tell you.) These monkeys then become experts on any weird task. Theoretical computer science holds some nice, albeit non-constructive (in the sense of their mathematical proof style), theorems about computability using neural networks that can be confirmed experimentally this way.
So there is a description problem that may lead you down very different paths on the same set of data, depending on whether you are a psychologist, a civil engineer or a tourist manager in the example above.
On the other hand, there is the anecdote of a journalist asking a judge at the Supreme Court of the United States to define obscenity (right after a decision the journalist didn't like).

Journalist: How do you define obscenity?
Judge: I recognise it when I see it.

So, given the preliminary information at breakfast that we need to book the summer holidays, we might all classify the
data from the example above the way the tourist manager would as we sit in the travel agent's shop. There is still the question of how to do that.

> 
> At the moment I am focusing on following the problem: Given any single 
> photograph and a random set of photos (20-100), which of the random set 
> is most similar, perceptually, to the target photo.
> 
> I have made some simple tests that divide image into color-channel 
> components and a luminosity channel, downsample the channels into a 
> 16x16 arrays, and calculates the difference between the target photo to 
> each of the random ones. Difference hashing its called?
> 
> Results are rather confusing. Most of the time perceived similarity (as 
> I experience it) does not exist, even if statistically the images might 
> be similar.

This shows that apparently, the human visual system does not use this kind
of downsampling for similarity measures....
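
To make the scheme you describe concrete, here is a minimal Ruby sketch under my own assumptions: a channel is already available as a 2-D array of numbers (no image library is used), the downsampling is plain block averaging, and the difference is the mean absolute difference per cell. The names downsample, grid_distance and most_similar are made up for illustration, and your actual tests may well differ.

```ruby
# Average-pool one channel (a 2-D array of numbers) down to size x size cells.
# Assumes the channel's height and width are both at least `size`.
def downsample(channel, size = 16)
  height = channel.length
  width  = channel.first.length
  Array.new(size) do |by|
    y0 = by * height / size
    y1 = (by + 1) * height / size
    Array.new(size) do |bx|
      x0 = bx * width / size
      x1 = (bx + 1) * width / size
      sum = 0.0
      (y0...y1).each { |y| (x0...x1).each { |x| sum += channel[y][x] } }
      sum / ((y1 - y0) * (x1 - x0))
    end
  end
end

# Mean absolute difference between two equally sized grids.
def grid_distance(a, b)
  total = 0.0
  a.each_index { |y| a[y].each_index { |x| total += (a[y][x] - b[y][x]).abs } }
  total / (a.length * a.first.length)
end

# Index of the candidate whose downsampled channel is closest to the target's.
def most_similar(target, candidates, size = 16)
  t = downsample(target, size)
  candidates.each_index.min_by { |i| grid_distance(t, downsample(candidates[i], size)) }
end
```

For several channels (colour components plus luminosity) one could sum the per-channel distances. Note that such a score only measures pixel statistics, which is presumably why it disagrees with what you perceive.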


> ... [Support Vector Machines] is one of the possible avenues. Gabor features sets used this kind 
> of approach I believe. 

Gabor feature sets (or wavelets in general), support vector machines and, I'd claim, any
feasible and plausible approach to modelling vision will need some kind of data reduction ...
Reconstruct an image with loss using basis functions; use only a few, but still enough that
the loss is not (annoyingly) perceptible. The number of parameters you'll need to do that
will then also suffice to do the classification.
Interpersonal mileage may vary ... there are some individuals who cannot distinguish Bordeaux
from Bourgogne when tasting either, but then they don't write wine connoisseur guides.


> But, I don't see this as the most interesting approach. The particular 
> problem I wrestle with has rather small sets, and training would have to 
> be performed for every photo. 

That contradicts your statement above ... if there is such a thing as human similarity
perception, it should hold for all cases brought in front of you.
In other words, if you are a judge at a supreme court, you should not confuse and irritate
people by making decisions that seem totally contradictory all the time ... those people who do not like the decisions
made will still appreciate internal consistency ...

> May be training a nn is the only way to really do this.

Neural networks can be seen as hardware for support vector machines, Gabor patches, etc. (even if
this is a gedankenexperiment most of the time).

> E. Borasky - Yes, I have the tools down, but clearly dont have a 
> suitable perceptual image comparison algo yet.

I would take an iterative approach: classify the pictures that look similar to you by hand, and then plot the classes as point clusters in different colours in the space of variables which you think are useful (colour, hue, ...).

Can you easily draw a line to separate the dots of different classes? Then an SVM in that particular
space delivers what you want.
Otherwise, introducing a new dimension might give you the opportunity to introduce a separating plane,
or you might need yet another one. Eventually, there's the Hahn-Banach theorem from functional
analysis, which shows how to separate point sets (strictly speaking, disjoint convex sets) using a hyperplane (i.e., something linear
and therefore computationally well-behaved).
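
As a rough illustration of the "can you draw a line?" test, here is a small Ruby sketch. It uses a plain perceptron, not a real SVM: it only looks for some separating hyperplane and does not maximise the margin. The function name train_perceptron, the toy points and the product feature x*y are my own assumptions, chosen only to show how an extra dimension can make two classes separable.

```ruby
# Perceptron training: looks for weights w and a bias b such that
# label * (w . x + b) > 0 for every sample. Labels must be +1 or -1.
# Returns [w, b] on success, or nil if no separating hyperplane was
# found within max_epochs passes over the data.
def train_perceptron(points, labels, max_epochs = 1000)
  w = Array.new(points.first.length, 0.0)
  b = 0.0
  max_epochs.times do
    mistakes = 0
    points.each_with_index do |x, i|
      activation = b + w.each_index.inject(0.0) { |s, j| s + w[j] * x[j] }
      next if labels[i] * activation > 0
      # Misclassified (or on the boundary): nudge the hyperplane towards x.
      w = w.each_index.map { |j| w[j] + labels[i] * x[j] }
      b += labels[i]
      mistakes += 1
    end
    return [w, b] if mistakes.zero?
  end
  nil
end

# Toy data: two hand-labelled classes that no straight line separates in 2-D.
points = [[0, 0], [1, 1], [0, 1], [1, 0]]
labels = [-1, -1, 1, 1]
flat_result = train_perceptron(points, labels)      # no line exists here

# Add a third variable (here the product x * y) and try again.
lifted = points.map { |x, y| [x, y, x * y] }
lifted_result = train_perceptron(lifted, labels)    # a plane can be found
```

In practice the extra variable would be another image feature you suspect matters, and a proper SVM library would be the tool of choice once the classes look separable.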



> 
> Thanks also to Ron Fox for pointing out its not going to be easy. :)
> 
> So, I guess I could use the rest of my lunch break to elaborate on the 
> Question.
> 
> What kind of computational algorithm would provide the perceptual 
> similarity score, rating or hash of some kind between two or more images 
> that would match the way humans perceive best?
> 
> I guess one would need two distinct classifications: similarity of 
> morphological appearance (features, shapes, ? in image) and similarity 
> of the colors (of the areas).

A good idea would be to have a look at a good introduction to wavelets ...

http://www.gvsu.edu/math/wavelets/tutorials.htm

You can use wavelets as basis functions, fit parameters for your data and then try to find
separable parameter sets for your different classes.
Think of something plausible when choosing ... as Ron Fox said, it's not easy ...
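
To give an idea of what "wavelets as basis functions" can look like in Ruby, here is a sketch using the simplest wavelet, the Haar wavelet, on a 1-D signal whose length is a power of two. The function names are my own, the averaging (unnormalised) form of the transform is an assumption made for readability, and keeping the largest coefficients is just one plausible way of choosing parameters. For an image one would apply the same steps along rows and then along columns.

```ruby
# Full 1-D Haar decomposition. The signal length must be a power of two.
# Output layout: [overall average, coarsest detail, ..., finest details].
def haar_forward(signal)
  out = signal.map(&:to_f)
  n = out.length
  while n > 1
    half = n / 2
    avg  = Array.new(half) { |i| (out[2 * i] + out[2 * i + 1]) / 2.0 }
    diff = Array.new(half) { |i| (out[2 * i] - out[2 * i + 1]) / 2.0 }
    out[0, n] = avg + diff
    n = half
  end
  out
end

# Inverse of haar_forward: rebuilds the signal from its coefficients.
def haar_inverse(coeffs)
  out = coeffs.dup
  n = 1
  while n < out.length
    pairs = Array.new(n) { |i| [out[i] + out[n + i], out[i] - out[n + i]] }
    out[0, 2 * n] = pairs.flatten
    n *= 2
  end
  out
end

# Lossy step: zero all but the `keep` largest-magnitude coefficients.
def truncate(coeffs, keep)
  kept = coeffs.each_index.max_by(keep) { |i| coeffs[i].abs }
  coeffs.each_with_index.map { |c, i| kept.include?(i) ? c : 0.0 }
end

coeffs = haar_forward([9, 7, 3, 5])
approx = haar_inverse(truncate(coeffs, 2))   # reconstruction from 2 parameters
```

The few coefficients that survive the truncation are the parameters you would then plot per class and try to separate.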

Best regards,

Axel 