The Thousand-Mile Telescope: Breakthrough

No, my next step is not to make a freaking testing system.

What I want here is an image processing and machine vision solution to finding faint streaks! I don't want a testing system! (Well, OK. A testing system will be useful. But first I want a solution to test.)

For months now I have been bouncing back and forth between a vision-based approach and a purely statistical approach. At various times I could convince myself that one or the other was the Right Stuff -- but that conviction never lasted longer than a day or two.

The uncertainty has been painful.

At last I have a solution that blends these two approaches: vision/structural and statistical. And I think this one is truly The Right Stuff.

Outline of the Approach

Get statistics for the background
Make the stars go away
Do region growing on all remaining pixels brighter than 2 standard deviations above the background mean. (Experiment with this threshold.)
Take statistics of the region sizes
See if there are any regions that greatly stand out

The region growing is the machine vision part. Using the statistics of the region size is ... um ... the statistical part. My first experiment shows that these two together might be a Really Big Deal.

Region Growing

Does everybody know what region growing is? It's easy.

You have a binary image, and you want to grow regions for the white pixels.
Search the image in scan-line order until you find a white pixel. Start a new region data structure and put this pixel in it.
Look at all its nearest neighbors. If you don't find any white pixels, you're done.
If you do find white pixels, add them to the region.
Now check all the neighbors of the points you just added.
Keep going this way until you run out of newly added points to check around. When that happens, your region has finished growing.
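The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for the experiment: the function name grow_regions is made up, the image is a plain list of lists, and treating the eight surrounding pixels as "nearest neighbors" is an assumption (pass connectivity=4 to use only the four edge neighbors).

```python
from collections import deque

def grow_regions(binary, connectivity=8):
    """Return a list of regions; each region is a list of (row, col) white pixels."""
    rows, cols = len(binary), len(binary[0])
    if connectivity == 8:
        # Assumption: diagonal pixels count as neighbors.
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    else:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):                 # scan-line order
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            region = [(r, c)]             # start a new region at this pixel
            frontier = deque([(r, c)])    # newly added points still to check around
            while frontier:
                pr, pc = frontier.popleft()
                for dr, dc in offsets:
                    nr, nc = pr + dr, pc + dc
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and binary[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        region.append((nr, nc))
                        frontier.append((nr, nc))
            regions.append(region)        # frontier is empty: region is finished
    return regions

image = [
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
sizes = sorted(len(region) for region in grow_regions(image))
print(sizes)
```

The choice of connectivity matters for thin diagonal features: with only four neighbors, a one-pixel-wide diagonal line breaks up into single-pixel regions.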

Will This Work?

I decided to check the last part first, because I already know I can make the stars go away. But will the vision-and-stats part work? If not, let's stop right here.
Let's go through the steps.

1. Simulate the Background

This is easy. I have already measured the background statistics in several images and written a little gadget to make images with identical background stats.
Here is what one looks like, zoomed in on the relevant gray values. (And zoomed in spatially to show nice big pixels.)
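For readers who want to play along, here is a rough stand-in for such a background generator. It is not the gadget described above: it assumes the background is plain Gaussian noise, and the mean and sigma values in the example are placeholders, not measured statistics.

```python
import random

def simulate_background(rows, cols, mean, sigma, seed=None):
    """Make a rows x cols image of Gaussian noise, clamped to the 16-bit range."""
    rng = random.Random(seed)
    image = []
    for _ in range(rows):
        row = []
        for _ in range(cols):
            value = int(round(rng.gauss(mean, sigma)))
            row.append(min(65535, max(0, value)))   # keep within 0..65535
        image.append(row)
    return image

# Placeholder statistics, chosen only for illustration.
background = simulate_background(64, 64, mean=1000.0, sigma=20.0, seed=1)
```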

These 16-bit images actually look perfectly black. What I have done here is to take a 'slice' of 256 gray values and display them in an 8-bit image. The gray values are selected so that the darkest pixels in this image are about 4 standard deviations below the mean, while the brightest are about 4 standard deviations above.
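The display trick can be sketched like this. The function name to_display and the linear rescaling are assumptions; when the window from 4 standard deviations below the mean to 4 above happens to be 256 gray values wide, the rescaling is very nearly a straight slice like the one described above.

```python
def to_display(image, mean, sigma, n_sigma=4.0):
    """Map gray values in [mean - n_sigma*sigma, mean + n_sigma*sigma] to 0..255."""
    low = mean - n_sigma * sigma
    high = mean + n_sigma * sigma
    scale = 255.0 / (high - low)
    out = []
    for row in image:
        # Values outside the window are clipped to black or white.
        out.append([min(255, max(0, int(round((v - low) * scale)))) for v in row])
    return out

# Hypothetical numbers: with sigma = 32 the window is 256 gray values wide.
shown = to_display([[872, 1000, 1128]], mean=1000.0, sigma=32.0)
```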

2. Add a Streak

Next we add a simulated streak to the image.

The amount of energy that this streak adds to the image is determined by my studies of stars of known brightness in real images I have taken. So this streak is a simulated object of magnitude 20, and I am moving it 10 pixels.

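If this were one of my real 10-minute images from T27 ( http://www.itelescope.net/telescope-t27/ ), that would mean an angular motion of about 0.5 arcseconds per minute. (10 pixels in 10 minutes is 1 pixel per minute, so this implies a pixel scale of roughly 0.5 arcseconds.)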

The streak is right in the middle of the image, and slopes up to the left at about a 45 degree angle.
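A bare-bones way to paint such a streak is sketched below. Everything here is an assumption made for illustration: the function name add_streak, spreading the energy evenly along a one-pixel-wide line with no blur, and the total_counts value, which is a placeholder rather than the calibrated energy of a magnitude 20 object. Rows count downward, so an angle of 45 degrees runs from upper left to lower right.

```python
import math

def add_streak(image, center, length, angle_deg, total_counts, samples_per_pixel=10):
    """Spread total_counts evenly along a line segment; modifies image in place."""
    rows, cols = len(image), len(image[0])
    n = max(1, int(round(length * samples_per_pixel)))
    dr = math.sin(math.radians(angle_deg))   # row step (rows increase downward)
    dc = math.cos(math.radians(angle_deg))   # column step
    share = total_counts / n
    for i in range(n):
        t = (i + 0.5) / n - 0.5              # runs from -0.5 to +0.5 along the streak
        r = int(round(center[0] + t * length * dr))
        c = int(round(center[1] + t * length * dc))
        if 0 <= r < rows and 0 <= c < cols:
            image[r][c] += share
    return image

blank = [[0.0] * 32 for _ in range(32)]
add_streak(blank, center=(16, 16), length=10, angle_deg=45, total_counts=500.0)
```

A real streak would also be smeared by the telescope's point spread function, which this sketch leaves out.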

This is not what you would normally call a bright streak. I think it would be very hard to find, by any normal means, in a 3056x3056 image, like the ones I get from T27.

It is miserable. Hopeless. Inconceivable. We should give up.

3. Grow the Regions

These 'regions' are also called connected components, by the way.
Threshold at 2 standard deviations above the background mean (I should experiment with that) and find all connected regions of such pixels.

For debugging, I also draw all the regions I find into a new image, making each region white on black, for visibility.
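The thresholding and the debug image are simple enough to sketch. The names threshold_mask and draw_regions are hypothetical, and regions is assumed to be a list of lists of (row, col) tuples from whatever connected-component routine is in use.

```python
def threshold_mask(image, mean, sigma, n_sigma=2.0):
    """Binary image: 1 where a pixel is more than n_sigma deviations above the mean."""
    cutoff = mean + n_sigma * sigma
    return [[1 if v > cutoff else 0 for v in row] for row in image]

def draw_regions(regions, rows, cols, white=255):
    """Draw every region white on a black image, for eyeballing the result."""
    debug = [[0] * cols for _ in range(rows)]
    for region in regions:
        for r, c in region:
            debug[r][c] = white
    return debug

# Tiny made-up example: background mean 100, sigma 10, so the cutoff is 120.
mask = threshold_mask([[100, 125, 119], [121, 90, 130]], mean=100.0, sigma=10.0)
debug = draw_regions([[(0, 1), (1, 0)], [(1, 2)]], rows=2, cols=3)
```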

Here is what we get:

4. Use the Statistic on Region Size

So here's the cool part. Regions that are caused by random agglomeration of bright pixels certainly do happen, but the size of such regions has a pretty small standard deviation.

Their average size is about 12 pixels, with a standard deviation of only 2.6. This means that it is very hard for such random agglomerations to get very large. Only about 1 in a thousand of them will be larger than 20 pixels, which is about 3 standard deviations above the mean!
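The "1 in a thousand" figure is easy to sanity-check if you are willing to assume the region sizes are roughly normally distributed. That assumption is only for this check, and the function name upper_tail is made up.

```python
import math

def upper_tail(x, mean, sigma):
    """Probability that a normal variable with this mean and sigma exceeds x."""
    z = (x - mean) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Mean 12, standard deviation 2.6, cutoff 20 pixels.
p = upper_tail(20, 12, 2.6)
print(p, "or about 1 in", round(1 / p))
```

Under that assumption the probability comes out close to one in a thousand. Region sizes are counts and cannot go below 1, so the real distribution is probably skewed, and the true tail may be somewhat heavier than this.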

But the asteroid streak, even though it looks awfully dim in gray scale, can grow as long as it wants! So in this domain -- the domain of the size of regions significantly brighter than the mean background -- this thing is ... really big.

5. Be Shocked and Awed

In this domain, the size of the asteroid streak's region is fifteen standard deviations above the mean.

In technical statistical terminology, that is Freaking Enormous.

If we were talking about random variations in audible noise, and a fifteen standard deviation increase occurred -- it would blow me across the room and through the wall.

I think we may be onto something.
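For the curious, the final "does anything stand out?" check might look something like the sketch below. The function name standout_regions and the cutoff of 8 standard deviations are assumptions, and the list of sizes is invented. The 51 in it is simply what 12 + 15 x 2.6 works out to, not a measured value.

```python
def standout_regions(sizes, mean_size, sd_size, min_sigmas=8.0):
    """Return (index, size, z) for regions far above the typical region size."""
    flagged = []
    for index, size in enumerate(sizes):
        z = (size - mean_size) / sd_size   # how many deviations above the mean
        if z >= min_sigmas:
            flagged.append((index, size, z))
    return flagged

# Invented region sizes: ordinary noise blobs plus one long streak.
sizes = [11, 14, 9, 51, 13, 12]
flagged = standout_regions(sizes, mean_size=12.0, sd_size=2.6)
print(flagged)
```

With a cutoff that high, ordinary noise blobs essentially never get flagged, which is the whole appeal of looking at region size instead of brightness.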

Thursday, May 14, 2015
