Hey! I'm back to talk a bit more about alpha tested image rendering!
This is a continuation of my previous post, so if things are not making any sense you can go back to that one and take a look ;)
Just a little recap of my arbitrary goal: I want to draw a circle on the screen (and maybe generalize to more shapes in the future!), using a small texture, like 16x16, as the base and scaling it up to a bigger size, let's say 256x256.
The setup we ended with last time was:
generate a 16x16 texture that is:
- black if the center of the pixel is inside the circle
- white if the center is outside
magnify the texture using bilinear scaling (interpolating each in-between pixel linearly from its 4 nearest neighbours)
alpha test the resulting image to get sharp edges instead of blurry areas
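Here is a rough sketch of that setup in plain Python, in case it helps to see it written down. The radius of 6 texels, the clamp-to-edge sampling and all the function names are my own assumptions for illustration, not something taken from the previous post.

```python
def make_mask(n=16, radius=6.0):
    """n x n texture: 0.0 (black) if the texel center is inside the circle, 1.0 (white) outside."""
    c = n / 2.0  # circle center, in texel coordinates
    return [[0.0 if ((x + 0.5 - c) ** 2 + (y + 0.5 - c) ** 2) ** 0.5 <= radius else 1.0
             for x in range(n)] for y in range(n)]

def sample_bilinear(tex, u, v):
    """Sample tex at continuous texel coordinates (u, v), clamping at the edges."""
    n = len(tex)
    # Shift by half a texel so that texel centers land on integer coordinates.
    fx = min(max(u - 0.5, 0.0), n - 1.0)
    fy = min(max(v - 0.5, 0.0), n - 1.0)
    x0, y0 = int(fx), int(fy)
    x1, y1 = min(x0 + 1, n - 1), min(y0 + 1, n - 1)
    tx, ty = fx - x0, fy - y0
    top = tex[y0][x0] * (1 - tx) + tex[y0][x1] * tx
    bottom = tex[y1][x0] * (1 - tx) + tex[y1][x1] * tx
    return top * (1 - ty) + bottom * ty

def upscale_and_test(tex, size=256, threshold=0.5):
    """Magnify tex to size x size and alpha test it: 0 = inside (dark), 1 = outside."""
    scale = len(tex) / size
    return [[0 if sample_bilinear(tex, (x + 0.5) * scale, (y + 0.5) * scale) < threshold else 1
             for x in range(size)] for y in range(size)]

image = upscale_and_test(make_mask())
```

Since black (0) means inside in the source texture, everything that ends up below the threshold counts as inside in the result.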
The astute among you may notice that the final image does not look like a circle at all. I mean, it is less blocky and less blurry, so an improvement for sure, but we can do better.
The key thing to notice is that by applying those steps to upscale the image, we basically redefined how we describe our shape. Instead of each pixel directly describing the color we want to paint, we end up with something more like describing the shape with a math equation: since black (0) means inside, wherever the interpolated value is smaller than 0.5 we are inside our shape, and wherever it is bigger than 0.5 we are outside.
Noticing this, we can do some cool things with the first step. If we know that what we care about in the image is the position of the resulting 0.5 (grey) contour, we can paint our initial pixels in such a way that, when we scale the image with bilinear interpolation, we get an isoline that better represents a circle.
One very common method is to paint each pixel in the original image with its distance to the nearest point on the border of the shape.
There are two big problems with using this, though:
The distance defines our shape as the exact points where d = 0, and since we are sampling points and linearly interpolating, near the border we end up interpolating between something like 0.02 and 0.03, so we never get a perfect zero. Maybe the best we can do is treat anything smaller than some arbitrary epsilon as zero, and end up with something like this for the border:
We also cannot easily tell the inside of our shape from the outside. In our previous technique we could just check if the value in our scaled image was bigger or smaller than 0.5.
There is a technique that kinda solves those two problems, and it is so useful in so many areas that it gets its own name: Signed Distance Fields (or simply SDF).
The idea is that we do the same thing we did before, but now, if the point we are looking at is inside the shape, we say that the distance is negative. So now:
Between two points on opposite sides of the border we are interpolating between something like 0.02 and -0.03, so the result will always pass through 0
To check if we are inside or outside our shape we can just check the sign!
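To make the difference concrete, here is a tiny sketch that interpolates between two neighbouring texel centers sitting on opposite sides of the border, once with the plain distance and once with the signed one. The circle (center at 8, 8 and radius 6) and the two texel positions are assumptions I picked for the example.

```python
import math

CX, CY, R = 8.0, 8.0, 6.0  # assumed circle: center and radius, in texels

def signed_distance(x, y):
    """Distance to the circle border, negative when (x, y) is inside."""
    return math.hypot(x - CX, y - CY) - R

def unsigned_distance(x, y):
    """Plain distance to the circle border, never negative."""
    return abs(signed_distance(x, y))

def lerp(a, b, t):
    return a + (b - a) * t

# Two neighbouring texel centers on the same row: one inside, one outside.
inner, outer = (13.5, 7.5), (14.5, 7.5)

steps = [i / 100 for i in range(101)]
unsigned_samples = [lerp(unsigned_distance(*inner), unsigned_distance(*outer), t) for t in steps]
signed_samples = [lerp(signed_distance(*inner), signed_distance(*outer), t) for t in steps]

# The unsigned interpolation never gets anywhere near zero...
print(min(unsigned_samples))
# ...while the signed one changes sign, so it has to cross zero on the way.
print(min(abs(s) for s in signed_samples))
```

The unsigned values stay stuck between the two positive samples, while the signed ones start negative and end positive, which is exactly why the zero contour shows up where the border is.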
Well, let the result speak for itself (in this case, I drew positive values as shades of purple and negative values as green)
And now our border is way better defined as well!
Nice! Our circle is looking way more circular now, and we are still only using our old 16x16 texture!!
Well, actually there are still a few small things that we can improve. The first is that our texture color values only go from 0 to 1, so we had to cheat a little by using an RGB texture with different colors for positive and negative values (we also divide the distance by the size of the image so we never actually get anything smaller than 0 or bigger than 1). This also makes the bilinear interpolation a little funky, since it interpolates the channels separately.
So what actually gets used in games and other media where this application makes sense is a Pseudo Signed Distance Field. We basically observe that:
we don't care about having precision far from the area where we have borders, so we can clamp our SDF between -2 and 2 pixels from the border, for example
we also don't need the contour to be specifically at 0. If we go back to what we had before and put the border at a value of 0.5, we can shift and scale our values from the range -2 to 2 into the range 0 to 1 by dividing by 4 and adding 0.5.
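Written as code, that remapping is pretty much a one-liner. This is just a sketch: the `spread` parameter (the 2 pixel clamp) and the function names are mine, and `decode` is only there to show that the mapping can be undone near the border.

```python
def encode(distance, spread=2.0):
    """Clamp a signed distance to [-spread, spread] and map it to [0, 1], border at 0.5."""
    clamped = max(-spread, min(spread, distance))
    return clamped / (2.0 * spread) + 0.5  # with spread = 2: divide by 4, add 0.5

def decode(value, spread=2.0):
    """Map a stored [0, 1] value back to a signed distance in pixels."""
    return (value - 0.5) * 2.0 * spread

print(encode(-2.0), encode(0.0), encode(2.0))
```

Anything further than `spread` pixels from the border saturates to plain 0 or 1, which is fine because we only care about where the 0.5 contour ends up.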
What we get at the end of this is a single 16x16 image to which we can apply the exact same steps we were doing before, so basically no new "runtime" computation is needed, and now we get a way smoother circle:
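If you want a number instead of a picture, here is a sketch that runs the same magnify and alpha test steps on both kinds of 16x16 texture and counts how many of the 256x256 output pixels disagree with the ideal circle. The radius of 6 texels, the clamp-to-edge sampling and every name in here are assumptions of mine for the sake of the example.

```python
import math

N, SIZE, R = 16, 256, 6.0  # texture size, output size, circle radius in texels (all assumed)
C = N / 2.0                # circle center, in texel coordinates

def dist(x, y):
    """Signed distance to the circle border: negative inside."""
    return math.hypot(x - C, y - C) - R

def build(texel_value):
    """Build an N x N texture by evaluating texel_value at every texel center."""
    return [[texel_value(x + 0.5, y + 0.5) for x in range(N)] for y in range(N)]

def mask_value(x, y):
    """Old approach: black (0) inside, white (1) outside."""
    return 0.0 if dist(x, y) <= 0.0 else 1.0

def sdf_value(x, y):
    """New approach: distance clamped to [-2, 2] and remapped to [0, 1]."""
    return min(max(dist(x, y), -2.0), 2.0) / 4.0 + 0.5

def bilinear(tex, u, v):
    """Bilinear sample at continuous texel coordinates, clamping at the edges."""
    fx = min(max(u - 0.5, 0.0), N - 1.0)
    fy = min(max(v - 0.5, 0.0), N - 1.0)
    x0, y0 = int(fx), int(fy)
    x1, y1 = min(x0 + 1, N - 1), min(y0 + 1, N - 1)
    tx, ty = fx - x0, fy - y0
    top = tex[y0][x0] * (1 - tx) + tex[y0][x1] * tx
    bottom = tex[y1][x0] * (1 - tx) + tex[y1][x1] * tx
    return top * (1 - ty) + bottom * ty

def wrong_pixels(tex):
    """Count output pixels where the alpha test disagrees with the ideal circle."""
    wrong = 0
    for py in range(SIZE):
        for px in range(SIZE):
            u, v = (px + 0.5) * N / SIZE, (py + 0.5) * N / SIZE
            if (bilinear(tex, u, v) < 0.5) != (dist(u, v) < 0.0):
                wrong += 1
    return wrong

mask_errors = wrong_pixels(build(mask_value))
sdf_errors = wrong_pixels(build(sdf_value))
print("mask:", mask_errors, "sdf:", sdf_errors)
```

Both textures go through exactly the same `bilinear` and `< 0.5` steps; the only thing that changes is what we painted into the 16x16 texels.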
That is a great result, but this is not the end of the road for our technique!
So far, all I did was describe techniques that are fairly well known, but there are still a lot of things to understand.
And if we understand those things better, maybe we can push the technique forward and make improvements in quality and performance, and maybe get smaller, better looking games ;)
Some of the things that I've been trying to understand are:
- why specifically did we use the distance field as the underlying way to encode our shape? Is it the best we can do?
- how much can we zoom before we start getting visual glitches?
- can we get similar results in a shader with fewer texture samples? Can we get better results with more?
- what are the implications of interpolating with bicubic or biquadratic?
- if we don't use a single value for our isosurface, could we maybe get lines thinner than a pixel of the original image?
- how can we best decide what texture size to use to encode our original image?
This seems like such a simple sequence of steps, but there is so much more that we can do and so much further that we can push it!