** end horizline **
Note: If you're using 3D clipping, you can of course forget these graphical
clippings (clippings inside the filler), and save a lot of calculating. I
recommend using that method.
Kewlkuul. That's the idea of drawing a triangle, but there's of course
still a lot to optimize. I wish you a good time with optimization, because
this is the most time-consuming part of the whole 3D engine.
3.1.1 Fixed point
It's not always a good idea to use floating point
numbers. In these kinds of situations, it's sometimes useful to be familiar
with fixed point, which means that for example a 32bit number is
treated as the lower 16 bits were the decimal part and only the upper 16
bits the integer part of the number.
But how to do that? Easy: We only multiply the desired number by 2^16,
perform the desired operations, and finally divide it by the same number
2^16. What use is that? The operations between the multiplication and the
divide happen to be a tad bit more exact than with 'traditional' integer
numbers. Try it out and see the difference!
A piece of pseudo code can be found below in the gouraud
chapter.
3.2 Gouraud Triangle
The idea of gouraud and flat triangle is nearly
the same. Gouraud takes only three parameters more (the color value of each of
the vertices), and the routine just interpolates between them drawing a
beautiful shaded triangle. Flat triangle interpolated only one value (x in
connection with y), gouraud needs three (x related to y, color related to y,
and color related to x). Drawing a gouraud triangle, we add only two parts to
the flat triangle routine. The horizline routine gets a bit more complicated
due to the interpolation of the color value related to x but the main routine
itself remains nearly the same.
I'm not giving you a full pseudo gouraud routine, you have to code it
yourself with the help of some hints. However, I'll show the critical parts of
the routine. The first outer loop of the remixed main routine:
- - c[0] is the color value of (x[0],y[0])
- - the dc's are calculated in the same way as d but interpolating c
related to y instead of x related to y
- for (a=y[0] -> y[1])
- gouraud_horizline( x[0] + (a-y[0])*d[0], x[0] + (a-y[0])*d[2],
c[0] + (a-y[0])*dc[0], c[0] + (a-y[0])*dc[2], a )
- endfor
Gouraud horizline with fixed point and without
clipping:
- - dc is of a 32bit integer type
- < a compare: is x1 greater than x2? if yes, xchg both x1 and x2,
and c1 and c2 >
- dc = ((c2-c1)*65536)/(x2-x1)
- for (a=x1 -> x2)
- putpixel(a,y,c1+((a-x1)*dc)/65536)
- endfor
I'm using so many parentheses to ensure that the
calculations are performed in the right order (some interpreters start
calculating from the right depending on the calculation).
Maybe I'd better write a bit about optimizing. First, we can use derivates
here, too: the derivative (growing speed) of c1+((a-x1)*dc) is dc. Using this
piece of information, we get rid of all the multiplications and need only one
add in the interpolation part of gouraud:
- c1=c1*65536 ; Note!
- for (a=x1 -> x2)
- putpixel(a,y,c1/65536) ; Note!
- c1=c1+dc
- endfor
Actually it's more complicated to even think
about that c1+((a-x1)*dc); the loop above is very clear and we remember it
from the last interpolating example where we interpolated the value 0 to the
value 4 in seven steps. The loop above interpolates the value c1 to c2 in
(x2-x1) steps. How in the world you could put it clearler? :)
Now some quick words about further optimization. You can use assembly of
course, how could you else code fast vector graphics?-) As you see, we have to
divide by 65536 (2^16) in the example above. It's widely known that divide is
a very slow operation if it's performed often (well, many compilers optimize
divides by the powers of two to sars). The fastest method is to use the carry
flag: We have two 16bit variables, registers if you wish. (Another possibility
is to use one 8bit and one 16bit variable if we want to save registers and
accept 8bit decimal part.)
- < dx = c1's decimal part, bx = c1's integer part >
- in the loop:
- add dx,[adder's_decimal_part]
- adc bx,[adder's_integer_part]
Why adc? When
the upper command rolls dx around, carry flag is set and the value of dx is
set the lower 16 bits of the addition. In the adc command, the adder's integer
part is added to bx. It's also incremented by one if the decimal part in the
preceding command grew over 2^16-1 (if the carry flag is set that is). Now we
have the integer part in bx and needen't to shift or divide anything. Another
possibility would be using .8 fixed point:
- < ax=16bit fixed point number (=original c1 * 256) >
- in the loop:
- mov [screen+screenpos],ah ;ah = integer part
- add ax,[fixed_incrementer]
Thus we get the
divide by 256 for free.
This idea can be extended very, very much, for example so that we
interpolate more than one value in a single register etc.
3.3 Texture Triangle
First, the idea of linear or 'classical' texture
mapping (without perspective correction). Linear mapping works pretty well in
some scenes, but perspective correction is in some way needed in most 3D
systems.
Anyway: Again we're using the idea of interpolation: now we'll code a
texture triangle filler. And again the idea is perfectly the same, only two
more values to interpolate, that is five values total. In texture mapping, we
interpolate x, u, and v related to y, and u and v related to x (u and v are
coordinates in the 2D bitmap space). The situation is maybe easiest to
understand by looking at the following picture:
The left
triangle is the triangle which is drawn onto the screen. There's a single
scanline (one call to the horizline routine) pointed out as an example. The
triangle on the right is the same triangle in the bitmap space, and there's
the same scanline drawn from another point of view into it, too. So we need
just to interpolate, interpolate, and once more interpolate in a texture
filler -- an easy job if you've understood the idea of a gouraud filler (I
myself coded a linear texture mapper in one day after coding a gouraud
routine)!
You would like some pseudo code you say? No way, and here's why:
- a) the code is so much like in gouraud and flat I would feel stupid
writing it once again,
- b) you should do something by yourself, too, don't you think? :)
But as I said, e-mail works if you're having insurmountable problems
:I
An optimization trick: the color deltas in gouraud and (u,v) coordinate
deltas in texture remain constant, so we need to calculate them only once /
polygon.
Let's take the u delta in linear texturing as an example. As we know, we
need to interpolate u1 to u2 in the horizline routine in (x2-x1) steps. We are
in the need of a u delta (ku) which would be the same for the whole polygon.
So instead of calculating in each scanline this:
h_ku = (h_u2 - h_u1) / (h_x2 - h_x1),
we do like this
in the setup part of the polygon routine: We know that
- h_u2 = u2 = u1 + (y2 - y1) * ku2,
- h_u1 = u1 + (y2 - y1) * ku1,
- h_x2 = x2 = x1 + (y2 - y1) * kx2,
- h_x1 = x1 + (y2 - y1) * kx1,
WHEN y=y2 (when y is the y
of the second vertex).
This can be easily seen (for instance) from the setup part of the second
part of the triangle. When we place the values of the variables h_u2, h_u1,
h_x1, and h_x1 (above) to the u delta statement,
- h_ku = (h_u2 - h_u1) / (h_x2 - h_x1),
we get the
following statement as a result:
- [u1 + (y2 - y1) * ku2] - [u1 + (y2 - y1) * ku1]
- h_ku=-----------------------------------------------
- [x1 + (y2 - y1) * kx2] - [x1 + (y2 - y1) * kx1]
<=>
- (y2 - y1) * (u1 - u1 + ku2 - ku1)
- h_ku=---------------------------------
- (y2 - y1) * (x1 - x1 + kx2 - kx1)
<=>
- ku2 - ku1
- h_ku=---------
- outerUdelta2-outerUdelta1
- innerUdelta = ---------------------------.
- outerXdelta2-outerXdelta1
Nice! But
what if kx2 = kx1? This of course means that the polygon is just one line, so
ku doesn't need any specific value; zero does the job very well. Note!
If we're using integers, a fixed point is required to ensure adequate
precision!
Optimization trick #2: In the horizline routine, we don't need to compare
the x's (x1 is always less than or equal to x2) if in the main routine we
examine the values of d and do as follows: interpolating from y1 to y2, give
the value with the greater d as the first one, and from y2 to y3, the value
with the smaller d as the first parameter.
3.3.1 The idea of perspective correction
Let's start with an
example: the idea of a 3D starfield. We take a point, say (1,1,3000), from
which we go to the point (1,1,1) plotting dots every 10 units. This can of
course be done by interpolating the z coordinate linearly: a nice straight
line of dots is drawn into the 3-space. But what if we transform it into the
2D computer screen? The distances between the dots aren't equal anymore! How
to interpolate it in a way that it would look on the screen same as in the
3-space? We should of course use the linear interpolation of 2D
coordinates.
As we remember, the 3D->2D transformation formula is the following:
In our example, both x and y are one, so
we can substitute:
We've thus found out that if we want a 3D
curve to look like a straight line on the screen, we need to interpolate 1/z
instead of z itself. Also we notice that if z is a constant, we aren't in
the need of 'perspective correction'.
But what does this have to do with texture mapping -- the fact is
that we're drawing the texture triangle between already 2D transformated
coordinates? Yes, the coordinates of the 3-space have been
transformated into the screen, but how about the texture space
(bitmap)? Yes, it's a 2D plane and it doesn't seem to be rational to bend
(or straighten :) into 2D, but try yourself linear texturemapping and
come then saying it looks all good -- and tell me the trick you used ;)
We'd maybe better do something. What about doing it like this:
No, it's really not allright yet. Now
linear interpolation looks like linear interpolation on the screen, too, but
those aren't the coordinates we would like to present the bitmap
coordinates; (u,v) would be a more suitable pair. So how about this: we
interpolate also 1/z (z_2D) linearly, and perform the following operation
for every pixel:
- u_bitmap = u_2D / z_2D,
- v_bitmap = v_2D / z_2D,
or
- u_bitmap = (u/z) / (1/z),
- v_bitmap = (v/z) / (1/z),
or
- u_bitmap = u,
- v_bitmap = v!
So we have the right coordinates and --
even better -- they're interpolated right (note: calculate u_2D and v_2D in
the init part of the filler, interpolate them (and z_2D of course), and then
calculate u_bitmap and v_bitmap in the inner loop)! A neat technique, I'd
say. But slow. SSLLOOWW. Mere two divides per PIXEL X) But phew,
we've some tricks to make the thing to go a bit faster:
1) Let's not perform this slow operation for every single pixel, but
let's follow the example of Quake and use it only every 8th or 16th pixel.
We can use linear interpolation between them, and the difference can't be
noticed anywhere else than speed ;)
2) One of the multiplies can be thrown off by using this trick (can also
be used in texture mapping etc):
- z = 1/z_2D ; z = 1/(1/z) = z
- u_bitmap = u_2D*z
- v_bitmap = v_2D*z.
3.3.2 Fitting a texture onto an object
What? You can't fit a texture
onto an object? It can't be that hard...
Argh, alrighty: perform env-mapping with original vertex normals, save
these texture coordinates and use them all the time. In practice:
- - au = vertex a's u coord
- - av = vertex a's v coord
- etc...
- for (a=0 -> number_of_faces)
- face[a].au = normal[ face[a].a ].x / 2 + 127
- face[a].av = normal[ face[a].a ].y / 2 + 127
- face[a].bu = normal[ face[a].b ].x / 2 + 127
- face[a].bv = normal[ face[a].b ].y / 2 + 127
- face[a].cu = normal[ face[a].c ].x / 2 + 127
- face[a].cv = normal[ face[a].c ].y / 2 + 127
- endfor
And use like this:
- texture_filler (
- x1,y1,z1,face[].au,face[].av,
- x2,y2,z2,face[].bu,face[].bv,
- x3,y3,z3,face[].cu,face[].cv )
This is
called planar mapping. The technique sucks in one way: polygons whose
normal is nearly x- or y-axis get bad texture coordinates. If the problem is
too bad for your object (as it is for certain types of them), try so-called
spherical mapping or cylindrical mapping. I haven't tried them
out but Advanced Animation and Rendering Techniques has
something concerning the subject. The main idea is to play with
spherical or cylindrical co-ordinates instead of good ol' xyz.
3.3.3 Bilinear filtering
Using bilinear filtering smoothens textures
and reduces pixelation -- but on the other hand practically jams a 3D
engine. 3D cards usually perform bilinear filtering automatically, but I'm
sure you won't kill me for describing the subject anyway ;)
[Wog/Orange] An example: Take a piece of graph paper and plot a point
there randomly. It probably won't hit the center of a square. Now draw a
square using this point as the center point. A part of the square hits other
squares than the square in which the plotted point is, in other words a
certain percentage hits the total of four squares. Using these percentages,
we blend the right-colored pixel from the texels and draw it into the
screen.
[Chem] C pseudo example:
- typedef struct { float r, g, b; } pixel;
- float xf = frac x ; fractional part
- float yf = frac y
- int xd = trunc x ; integer part
- int yd = trunc y
- float w1 = (1.0 - xf) * (1.0 - yf) ; weight
- float w2 = (xf) * (1.0 - yf)
- float w3 = (1.0 - xf) * (yf)
- float w4 = (xf) * (yf)
- pixel p1 = GetBitmapPixel(xd, yd) ; pixel rgb
- pixel p2 = GetBitmapPixel(xd + 1, yd)
- pixel p3 = GetBitmapPixel(xd,yd + 1)
- pixel p4 = GetBitmapPixel(xd + 1,yd + 1)
- float red = p1.r*w1 + p2.r*w2 + p3.r*w3 + p4.r*w4
- float green = p1.g*w1 + p2.g*w2 + p3.g*w3 + p4.g*w4
- float blue = p1.b*w1 + p2.b*w2 + p3.b*w3 + p4.b*w4
(w1234 are the weights of the four texels the pixel hits)
It's worth mentioning that if the filler skips some texels, the routine
above gives them no weight at all. Taking them into account is quite an
interesting operation if we're talking from the code's point of view.
Mip-maps eliminate the problem very well.
3.4 Texturing + Shading
Using texturing and shading at the same time
is quite straightforward to implement the basic idea being that we just
interpolate the values of both texture and shade and blend them in a suitable
ratio. An example inner loop:
- - 16bit fixed point
- - tab = precalculated table in which are located the values of
texture and angular-interpolated phong
- - putpixel parameters: x,y,red,green,blue.
- for (a=y1 -> y2)
- texel = tmap[ u/65536 + v/65536*256 ]
- putpixel( x/65536,a,
- tab[texel,c/65536].r,
- tab[texel,c/65536].g,
- tab[texel,c/65536].b )
- x += kx
- u += ku
- v += kv
- c += kc
- endfor
An precalculation example of the Tab-table
(without a hilight):
- for (a=0 -> 255)
- for (b=0 -> 255)
- tab[a,b].r = pal[a].r * phong[b].r / 256
- tab[a,b].g = pal[a].g * phong[b].g / 256
- tab[a,b].b = pal[a].b * phong[b].b / 256
- endfor
- endfor
3.5 The Idea of Convex Polygons
[Chem] We are willing to draw the
following pentagon:
I'm
starting with the assumption that you've read and understood how to
draw a triangle. The main idea is that we follow the sides 1 and 2 from the
highest vertex down to the lowest, and draw horizontal lines between them.
1. Find the highest vertex from which we start drawing. The highest
vertex is the same for both sides, so it's both "start1" and "start2".
2. Take the preceding vertex from the vertex list and call it
"stop1". Take the following vertex and call it "stop2".
3. It should be clear now that we're following the lines start1-stop1 and
start2-stop2. Now we just interpolate start1.x to stop1.x and start2.x to
stop2.x, and draw each step a horizontal line between the interpolated x
coordinates. We begin interpolating from the y of a start and stop it to the
higher stop's y. Start is start1 at the beginning. The higher-up stop
is just "stop" to the end.
4. We've reached stop, in other words we've successfully drawn a part of
the polygon. Now depending on which stop was the higher-up one (stop1 or
stop2), we do as follows:
- stop1 was higher:
- start1 = stop1
- stop1 = stop1 preceding vertex
- start = start1 = stop1
- stop2 was higher:
- start2 = stop2
- stop2 = stop2 following vertex
- start = start2
5. Go to the point 3 until
the whole polygon's been drawn.
I'm not saying that this would be the best
way to draw convex polygons. This is only the technique which first popped
into my mind. Note also that this works only for convex polygons, concave and
complex polygons don't work this straightforwardly! [I'd say that this is the
best technique. Comments? -ica]Back to the index