Sunday, March 27, 2016

Finding Bright Stars with Region Growing


What does it mean for a piece of software to 'find' a star in an image?  The star is already there in the image, isn't it?

No, there is no star in the image.  The image has nothing but pixels in it.  To find a star, a program must build a data structure that makes knowledge about the star's existence and location explicit.  There are no stars but what we make.
 


There is no star in this image.

To make knowledge about stars explicit in my program, I will use a region-growing algorithm.  We will start out with a vanilla region-growing algorithm.  Later, we will add something strange and wonderful.

Here is a quick sketch of how it works:

  1. Select a threshold.  We will make regions out of pixels whose values are greater than or equal to that threshold, and ignore all the others.  Let's call a pixel bright if its value is >= threshold.
  2. Find a pixel in the image that is bright.  Start a new region with it.
  3. Look at all its neighbors.  If any are bright, add them to the region.
  4. For all the ones you just added, look at all their neighbors.  Add to the region any that are bright.
  5. Keep doing this, until you get to a point where you don't add any new pixels to the region.  Then stop.
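The five steps boil down to a few lines of code.  Here is a minimal sketch in Python with NumPy -- the function name and the stack-based bookkeeping are mine, not sacred:

```python
import numpy as np

def grow_region(img, seed, threshold):
    """Grow one region of bright (>= threshold) pixels outward from a
    seed pixel, using 8-connectivity.  Assumes the seed itself is bright.
    Returns the set of (row, col) pixels in the region."""
    h, w = img.shape
    region = set()
    frontier = [seed]
    while frontier:
        r, c = frontier.pop()
        if (r, c) in region:
            continue
        region.add((r, c))
        # Examine all 8 neighbors; queue the bright ones we haven't seen.
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < h and 0 <= nc < w \
                        and img[nr, nc] >= threshold \
                        and (nr, nc) not in region:
                    frontier.append((nr, nc))
    return region
```

The loop ends exactly when step 5 says to stop: when no unvisited bright neighbors remain.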
Here's an illustration.



This is the image we will be using.  I've outlined all the pixels in red so you can see them.  All the bright pixels have been set to white, while all non-brights have been set to gray.






To find the first bright pixel, scan the image top to bottom, left to right, testing the value of every pixel until you find a bright one.


Start a new region, and add that pixel to it.

Actually, I will keep three different groups of pixels while growing this region.  Let's call them the current generation, the region, and the new generation.  I will color-code them.  Right now the pixel we just found is the only one in the new generation.



Now we can get started.  First, move the new generation into the current generation.
Then, for all the pixels in the current generation, look at their 8 neighbors.  If any of those neighbors are bright, add them to the new generation.


After finishing checking the neighbors for all of the current generation, move all of the current generation to region.  Move all new generation pixels into current generation.  Then repeat the process.
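Here is how that generational bookkeeping might look in code.  A sketch, with my own names; returning the list of generations makes the layered growth easy to inspect:

```python
import numpy as np

def grow_region_in_generations(img, seed, threshold):
    """Grow a region in generations.  Returns a list of generations
    (each a set of (row, col) pixels); their union is the region."""
    h, w = img.shape
    region = set()            # pixels whose neighbors are fully checked
    new_generation = {seed}   # the bright pixel we just found
    generations = []
    while new_generation:
        # Promote the new generation to current, then scan its neighbors.
        current_generation = new_generation
        new_generation = set()
        for r, c in current_generation:
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr or dc) and 0 <= nr < h and 0 <= nc < w \
                            and img[nr, nc] >= threshold \
                            and (nr, nc) not in region \
                            and (nr, nc) not in current_generation:
                        new_generation.add((nr, nc))
        # Move the checked generation into the region and repeat.
        region |= current_generation
        generations.append(current_generation)
    return generations
```

When a pass finds no new bright neighbors, `new_generation` comes back empty and the loop stops -- there is nothing more to check.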


I have numbered the generations here, so you can see how they get added to the region in layers.





Finally, you reach a point where, when you check all the current pixels, you find no new ones.  So you put the current generation into the region, and you're done.  There is nothing more to check.

Now we know explicitly where a star is.  We know how many pixels are in it, we know its centroid X and Y.  We can look at the original image, before thresholding, and know its average gray value.  We know all kinds of stuff.
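Making that knowledge explicit is a few lines once the region exists as a data structure.  A sketch, assuming the region is stored as a set of (row, col) pairs:

```python
import numpy as np

def region_stats(region, original_img):
    """Turn a grown region into explicit knowledge about the star:
    size, centroid, and average gray value in the original
    (un-thresholded) image."""
    pixels = np.array(sorted(region))        # shape (n, 2): rows, cols
    rows, cols = pixels[:, 0], pixels[:, 1]
    return {
        "pixel_count": len(region),
        "centroid_x": cols.mean(),
        "centroid_y": rows.mean(),
        "mean_gray": original_img[rows, cols].mean(),
    }
```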

That's how we find a bright star.



Sunday, March 20, 2016

Autothresh



Here is a beautifully thresholded image of the night sky of 5 March, 2016.

By the way, my T27 images are 27 minutes of arc across.  A full moon would fit almost perfectly inside this box.

This is not the original image as it came from the telescope.  The original image has gray values that go from 0 to 65,535, and on my screen it doesn't look like much.

In fact, the original looks like this:



Of course the original actually has much more dynamic range than my thresholded (I call it "sliced") version, but I don't care.

I don't care because:
  1. My monitor can't display it.
  2. My eyes can't see it.
  3. My image manipulation tools can't handle it worth a damn.  (Maybe the next version of the Gimp will do better.)
  4. My algorithms don't need it.  The things I'm looking for are all down near the noise.
So I like my thresholded images, and I'll be using them.

Actually, it has been thresholded two ways.  Like so.

Here is a histogram of the original image.  (On the x-axis it has gray-values, on the y-axis it shows how many pixels of that gray value are in the image.)
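Computing such a histogram is one NumPy call; the wrapper name here is mine:

```python
import numpy as np

def gray_histogram(img16):
    """Histogram of a 16-bit image: index = gray value,
    value = number of pixels in the image with that gray value."""
    return np.bincount(img16.ravel(), minlength=65536)
```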






















The graph actually goes all the way to 65535 on the right, but the little bumps of pixels up there are so small you can't see them on this scale.  The big normal distribution of pixels you are seeing here is the Dark Night Sky -- which contains practically all the pixels in the image.

But my 8-bit sliced image only has 256 gray values, so here is what I did:
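In code, the slice boils down to a subtract and a clamp.  This sketch assumes the simplest possible mapping -- one 16-bit gray level per 8-bit level, with the lower threshold landing on 0 and everything 255 levels above it saturating to white:

```python
import numpy as np

def slice_to_8bit(img16, lower):
    """Slice a 16-bit image into an 8-bit one.  Values at or below
    'lower' become 0 (black); values at or above lower + 255 become
    255 (white); everything in between maps one-to-one."""
    shifted = img16.astype(np.int32) - lower   # signed, so no wraparound
    return np.clip(shifted, 0, 255).astype(np.uint8)
```

A slice could also span a wider range of gray values and rescale, but the one-to-one version is the easiest to reason about.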




























The question is -- how can we automate this process?  I want to write software to do all this with no human intervention, so that my software can find asteroids while I sit on a beach and sip Piña Coladas.  So -- how can we find a nice threshold without me clicking on stuff with a mouse?

The trick is in the structure of that histogram.  Because we are looking at the night sky, it will always look like that: a great huge normal distribution of pixels down in the dark, and a teensy smattering of pixels all the way to the right.

How can we describe the ideal threshold position, relative to the structure of that histogram?





The criteria for the Magic Spot are structural criteria: aspects of the structure of this histogram.  We want a spot that has a rapid descent on its left, but where the histogram is nice and flat to its right.

To find that kind of structure, we will create a structuring element.  Since the space we are looking at is one-dimensional (it's actually just a list of numbers) our structuring element will also be one-dimensional.  It will be two line segments, one to the left of the-point-we-are-looking-at-in-the-histogram, and one to the right.




The two halves of the structuring element both contribute to the total score for the point in the histogram that we are considering, but they do so in very different ways.  The left half generates a high score if the part of the histogram that it is overlaying has a large drop in value from left to right.  The right part of the structuring element generates a high score only if its part of the histogram is nice and flat.  Between the two of these, they generate the highest score right at the place where we want to start our 8-bit, 256 grayvalue image: the right knee of the histogram's big curve.

By the way, the right half of the structuring element is not looking at the original histogram -- it's looking at the first derivative of the histogram.  It just adds up all the values of the first derivative that fall under it, and thus gets an idea of the net up or down movement of the part of the histo it is on.

But here's a problem.

How big should this structuring element be?  I don't want to assign it an arbitrary size that I simply know will work (like I did at first, during development.)

We want to ensure that the left half of the structuring element covers a good fraction of the right half of the Big Curve, but we don't want it to exceed half of the Big Curve's 'wavelength'.  And we want to be able to determine the size fairly easily, because we are inherently lazy.

So let's just do this:



It's easy to find the highest point in the histogram, and easy to walk to the right until the value has fallen to half of the maximum.  The distance we move rightward is a nice size for the left half of my structuring element, and we might as well use the same size for the right half.
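As code, that sizing rule is a one-way walk down the histogram (the function name is mine):

```python
import numpy as np

def structuring_element_half_width(hist):
    """Size the structuring element from the histogram itself: find the
    peak of the big curve, then walk rightward until the count falls to
    half the peak value.  The distance walked is the half-width."""
    peak = int(np.argmax(hist))
    half_max = hist[peak] / 2.0
    i = peak
    while i < len(hist) - 1 and hist[i] > half_max:
        i += 1
    return i - peak
```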

Using that strategy for sizing the structuring element, and using the structuring element to find me my lower threshold for the 8-bit image 'slice', is what gave me these lovely images (these are small crops from the main image):



Notice how, in this 3-frame movie, the background brightness doesn't change at all.  The auto-thresholder found good thresholds on all three images, even though the background brightness of the sky was increasing during the 30 minutes I was taking the pictures.  (I think the humidity was increasing, or something.)

Also notice how the trapezoid of stars in the upper left is preserved.  Those are some of the dimmest objects in the images, and the threshold is doing a good job of not knocking them out.

This is the thresholder I will be using, until it dies or I find somebody better.



Sunday, March 13, 2016

What is a Good Threshold?


If we're going to identify objects in the night sky, we need to find a grayscale threshold that means: Above this value, the pixels are probably part of an object.  Below it they are probably not.

A threshold is just an integer -- the pixels in our image, brought to us by the miraculously excellent Proline camera, are 16-bit integers, whose possible values range from 0 to 65,535.
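Concretely, with the image in a NumPy array, applying a threshold is a single comparison.  The pixel values here are made up:

```python
import numpy as np

# A tiny made-up 16-bit image and a threshold of 651: every pixel at or
# above the threshold is marked as "probably part of an object".
img = np.array([[640, 655, 1200],
                [648, 9000, 652]], dtype=np.uint16)
threshold = 651
object_mask = img >= threshold   # boolean array, same shape as img
```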








The reason we need to think about a threshold is because the background -- the dark between the stars -- is not perfectly black.  If it were perfectly black, then its pixel values would all be zero, and our threshold would always be "anything above zero".  But it's not.  In the images I have taken, the dark between the stars tends to be around gray value 1000.  The darkest sky I have seen so far (on March 5th, 2016) had a background around 650.

And that background is never perfectly smooth.  It has noise.  It has a normal distribution of values, so that, if you brighten it enough and zoom in enough, it looks like this:




Further, if we just wanted to find nice, reasonable stars, we wouldn't have to worry about setting a good threshold.  Nice, reasonable stars are so much brighter than the background noise that you could choose practically any threshold at all and it would work just fine.

But we want to see the faintest things that hide right down in the noise, so we need the best threshold that can possibly be found.

How do we know if we have a good one?

Here are a couple examples, from the lovely dark sky of 5 March 2016, above Siding Spring, Australia.


 
The image on the left has been thresholded at 500, the one on the right at 600.

Look at the trapezoid of 4 faint stars near the upper left of both images.  They are just as discernible in the right image as in the left.  In fact, there are no faint stars visible in the left image that are not also visible in the right.

If we go too far into the noise with our threshold, we are not making any new faint stars visible.  We are only brightening the noise.

What we want is the highest threshold we can get -- and thus the darkest background we can get -- that does not lose those faintest stars.  That will be our criterion for a good threshold.

It's easy to find a good threshold 'by hand', by slicing the 16-bit image at many levels and inspecting the results -- but we need to automate the process.  Next week we will look at a method of doing that.





Sunday, March 6, 2016

Holophonos


As we start making algorithms, let's avoid the temptation of generality.  There are many different kinds of things that move in the night sky, and if we make algorithms to find all of them we will be wasting our time.

The thing we are looking for has never before made a pass close to the Earth.  It is far away, and dark, and moving really slow way out in the cold places beyond Jupiter.  It is in an odd high-inclination orbit.  All these factors have combined to ensure that no one has noticed it yet.

Of course, its stealthiness is only part of the problem.

Three or four years from now, if we are lucky enough to look in just the right place, and have software that is good enough to process the images in just the right way, we will see something very much like this:


Holophonos

This is Holophonos: the killer of all things.

It is a chunk of nickel-iron about 4 miles across and if we don't see it soon enough it will kill everything you love, everything you hate, everything you hope for or fear, the future and the past, faith, knowledge, history, technology, dance music, fine dining, cities, villages, languages, cars, candy, comedies, proms, pies, parades, breakfasts, basements, butchers, bakers, candle stick makers, and baby's first steps.

There may be more than one reason why there are human beings on this planet.  But if Life on Earth could speak, it would say that there's only one reason it cares about.

It would say:
You are here to stop that rock.  Do whatever you need to do, use whatever you need to use, but stop that rock.

We can't stop it if we can't see it.




Saturday, March 5, 2016

Lights in the Night Sky


Three minutes of arc and thirty minutes of time of the sky over Siding Spring, Australia, on the night of 5 March, 2016.


Long have I wandered in the dark spaces between the stars, and in the vaster spaces still of all the ways you can screw up thinking about them.

My problem has been that I've been trying to come up with a way to find moving objects without ever actually making a judgment that says This Is An Object.  I don't mind writing posts about efforts that don't pan out or ideas that need to be corrected later, but I'm kind of shy about writing when I know darn well that I am totally full of shit.  Well, I believe I have made some advances lately which have persuaded me that I am now down to about half full, so here we go again.

You can, of course, get better science writing elsewhere, folks -- but not at this price-point.

The two great principles that I learned from working in machine vision are:
  • throw away as much data as you can, and
  • put off binarizing decisions as long as possible
By throwing away data, I mean:   the three images I took tonight come to about 60 megabytes.  If I find paydirt in them, it will be a little moving object that can probably be described with a few small numbers.  Maybe 32 bytes total.  A good way to understand your algorithms is that they are a way of discarding the useless data, without getting rid of what you want.

I think it was Michelangelo who first said:
Questo scrittore è un idiota. Vorrei fossi ancora vivo, così potuto fargli il culo.
 which of course means:
I find asteroids by getting rid of all the pixels that they ain't.


By put off binarizing decisions, I mean: a decision like these pixels are foreground and everything else is background -- kind of pins things down.  Even if you preserve the actual pixel values, all your algorithms are now working with that binarizing assumption.  It's best if you can find a way to refine your data-ore as far as possible without making limiting judgments like that.



A rare photograph of the T27 telescope in its natural habitat.


Well, I've tried to stick to those ideas for a long time and they have led nowhere I want to be.  So at last I emerge from the wilderness with this insight:
You need to find the objects in each image.

There is just no way around this.  You need to make a binarizing decision that These Things are Objects, and the Rest of the Image is Just Background.  There is simply no way around this.  You find the objects you are interested in, do that in each image of your sequence, and then look for objects-that-move.

Seems pretty obvious, right?  Yeah, well.  It is possible to get Way Too Fancy while thinking about this stuff, and that is maybe the best way you could find to never get anything done.

There is one more insight that is at least as valuable, about how to do the vision work.

The night sky has many different kinds of objects in it.  Use a different algorithm for each.

So there you have it.  This gives me a way to proceed.  I don't think I will be wandering anymore.



Thanks to the heroes at iTelescope who let me rent T27, and to the heroes at PlaneWave Instruments for making it!