There are no stars in this image.
There are also no galaxies, no asteroids, and there is no interstellar background.
Images contain pixels, and nothing else. An image can answer questions like "What is the gray value at location x,y?" or "What are the dimensions of this image?" or "What is the average gray value?" or even "What is the gray value histogram?" That is all.
If something says "I see stars in that image," then that something is a vision system. The stars that it sees are not in the image -- they are in its own head, in data structures that it has created by doing some highly nontrivial computation, using that image as input.
It's very hard for humans to understand that the objects they see are not explicitly in the image, but are highly abstract constructs in their own minds -- because, for humans, vision is utterly effortless. When you look at this page and then glance around the room, you are probably using more compute power than exists in the United States of America, including all the secret parts, and you're doing it without so much as frowning. (Which makes it pretty hard on guys who try to get machines to see what you can see. But that ... is another story.)
Image Processing is not Vision
There are two kinds of processing you can do to an image: image processing and machine vision.
Image processing consists of operations that take an image in, and put out a transformation of the image. For example, an image processing operation may take in a color image and put out a monochrome version of it, or a contrast-enhanced version. Or it may put out the sum of all the pixel values, or the average pixel value, or a histogram of all the pixel values.
Image processing stays within the realm of images and their properties.
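The distinction can be sketched in a few lines. This is a minimal example, assuming NumPy arrays as images; the function names are mine, not any standard API. Every output here is an image or a property of one -- nothing leaves the image domain.

```python
import numpy as np

def to_monochrome(rgb):
    """One simple choice of color-to-gray conversion: average the channels."""
    return rgb.mean(axis=2)

def image_properties(gray):
    """Outputs that stay within the realm of images and their properties."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    return {
        "dimensions": gray.shape,   # "What are the dimensions of this image?"
        "mean": gray.mean(),        # "What is the average gray value?"
        "histogram": hist,          # "What is the gray value histogram?"
    }

img = np.random.randint(0, 256, (4, 6, 3))   # a toy 4x6 "color image"
gray = to_monochrome(img)
print(image_properties(gray)["dimensions"])  # (4, 6)
```

A contrast-enhanced version, a sum of pixel values, and so on would all fit the same mold: image in, image (or image statistic) out.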
Vision, on the other hand, takes in an image, but then puts out a data structure that is not in the realm of images-and-their-properties. For instance, it may take in a gray scale image and put out a data structure that says "I see a chair at position X1, Y1, X2, Y2 with certainty C".
Chairs are not in images. The vision system needs to add a lot of non-image knowledge to determine that a certain pattern of brighter and darker regions probably represents a chair.
But even simple features like bright edges, or streaks, or regions of a given color are not quite in the image domain. A uniformly colored region is not explicit in the image. It is implicit, and must be gotten out (turned into a data structure) by some nontrivial processing.
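The vision-level output, by contrast, is a claim about the world rather than an image. A sketch of what such a data structure might look like (the field names are my own invention, matching the "chair at X1, Y1, X2, Y2 with certainty C" example above):

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """A vision-level result: not an image, but a claim derived from one."""
    label: str        # e.g. "chair" or "streak"
    x1: float         # bounding box, in image coordinates
    y1: float
    x2: float
    y2: float
    certainty: float  # 0.0 .. 1.0

# "I see a chair at position 10, 20, 90, 160 with certainty 0.8."
d = Detection("chair", 10, 20, 90, 160, 0.8)
```

Nothing in this structure is a pixel; getting from pixels to a `Detection` is the whole job of the vision system.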
I Need Both
In the asteroid-finding system I am writing, I should have both an image processing level and a machine vision level. These two levels should be well separated from each other, so that I can run the lower-level image processing by itself. Because -- I want to be able to test the machine vision level against what a human can do, after only the image processing code has been run. The goal will be to get machine vision code that can at least get in the ballpark of a system composed of image processing plus human vision.
Testing
To characterize vision-system performance I will make a little gadget that creates random simulated asteroid streaks in real images that I have taken.
- Take 50 real images.
- A program draws random streaks into half of them.
- The starting point and direction of each streak are random.
- The brightness of the streak is not random -- it is chosen by an input argument.
- I don't know which images have streaks and which don't.
- I examine all 50 images visually and write down the locations where I think I see a streak.
- Look at the 'answers' saved by the streak-drawing program, and count how many real streaks I missed and how many times I thought I saw a streak that was not really there. (False negatives and false positives.)
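The steps above could be sketched roughly like this. This is a skeleton, not the real gadget: the streak parameters and image IDs stand in for actual drawing into real frames, and all the names are mine.

```python
import random

def make_test_set(n_images=50, brightness=40, rng=None):
    """Pick half the images at random to receive a streak; return the answer key.
    (Placeholder for code that would actually draw streaks into real images.)"""
    rng = rng or random.Random(0)
    ids = list(range(n_images))
    rng.shuffle(ids)
    streaked = set(ids[: n_images // 2])
    answers = {}
    for i in range(n_images):
        if i in streaked:
            answers[i] = {                       # random start and direction,
                "x": rng.uniform(0, 1024),       # brightness from the argument
                "y": rng.uniform(0, 1024),
                "angle_deg": rng.uniform(0, 360),
                "brightness": brightness,
            }
        else:
            answers[i] = None
    return answers

def score(answers, reports):
    """Compare an observer's reports (a set of image IDs) to the answer key."""
    false_pos = sum(1 for i in reports if answers[i] is None)
    false_neg = sum(1 for i, a in answers.items() if a is not None and i not in reports)
    return false_pos, false_neg

answers = make_test_set()
reports = {i for i, a in answers.items() if a is not None}  # a perfect observer
print(score(answers, reports))  # (0, 0)
```

In the real thing, `score` would also have to match reported streak *locations* against the saved ones, not just image IDs; that matching tolerance is a detail to be pinned down.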
I should probably write down what kind of performance I hope to achieve with any vision system, whether it is natural or artificial -- i.e., what level of false positives (hallucinated streaks) is acceptable.
The goodness of the vision system should probably be expressed something like this: "For every four streaks that the vision system reports, three of them, on average, will be real."
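That "three out of four" figure is just a precision target, and it can be computed directly from the false-positive count. A trivial sketch (my own framing, not terminology from the post):

```python
def precision(true_pos, false_pos):
    """Fraction of reported streaks that are actually real."""
    return true_pos / (true_pos + false_pos)

# "For every four streaks reported, three are real":
print(precision(3, 1))  # 0.75
```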
In this endeavor, false positives are very expensive. They cause you to go take a follow-up picture. False negatives are not a big deal: when you miss a streak that is there. After all, the point is finding new asteroids. If we miss one, that's OK. We didn't know about it before, and we still don't.
Well, unless I miss the one that's coming to destroy civilization, or our species, or life on land, or whatever. That would be bad.
Next Step
OK, so that's my next step. Write this testing system, and use it to characterize my own 'natural' vision system performance. See if it also gives me ideas about improvements to the image processing level. And get ready to use the same test on the machine vision system.