4.1 Z-Sorting
The idea of z-sorting is that we sort polygons to an order determined by
the average of their z coordinates. It's easy to use quicksort to achieve
this, as in the following pseudo code:
-
function quicksort(left, right)
-
if left<right
-
q = partition(left,right)
-
quicksort(left,q)
-
quicksort(q+1,right)
-
endif
-
endf
-
function partition(left, right)
-
- crd is the table of the rotated vertices
-
- face is the polygon table
-
x = crd[face[left,0],2] + crd[face[left,1],2] + crd[face[left,2],2]
-
a = left-1
-
b = right+1
-
loop forever:
-
decrement b by one as long as (crd[face[b,0],2] +
-
crd[face[b,1],2] + crd[face[b,2],2]) is less than x
-
increment a by one as long as (crd[face[a,0],2] +
-
crd[face[a,1],2] + crd[face[a,2],2]) is greater than x
-
if a<b
-
else
-
break looping and return b
-
endif
-
endloop
-
endf
Now we just draw the polygons straight from the table.
Z-sort is really not a perfect sorting technique but it's enough for
many purposes (for example many asm97-demos used z-sort). Anyway, I suggest
learning also at least Z-buffer and maybe BSP-tree and S-buffer if you
have time and interest.
4.2 Z-buffer
Z-buffer is the easiest (but by no means the fastest) way of doing objects
which can intersect each other. Contrary to BSP-tree, Z-buffer doesn't
require splitting polygons when they intersect. BSP-tree is unbeatable
for stable scenes, though. As the name indicates, we use a buffer (a table),
into which we save the z and color values of the points. The size of this
table should be the same as the size of the screen, for example in the
MCGA mode 64000 at least 16bit elements. Z-buffer looks like this in C:
Before we start drawing anything to the z-buffer, we need to set the buffer
to the maximum value it can reach. Then, when drawing polygons, we interpolate
also z. In the inner loop we perform a compare, and if the current z is
less than the z in the z-buffer, we set the current z as the z-buffer value
and draw the pixel. If not, we continue with the next pixel.
This check slows the routine down dramatically, but luckily we can interpolate
z just like x or anything else: only one add per per pixel (plus the check).
Finally, when all the points of all the polygons have been checked, we
draw the z-buffer into the screen.
A short piece of pseudo code from the hline:
-
z = z1
-
for x=x1 -> x2
-
if (z < zbuffer[y*320+x])
-
zbuffer[y*320+x] = z
-
plot(x,y,color)
-
endif
-
z = z + kz ; interpolate z
-
endfor
About optimizing: you don't need to reset the z-buffer so often (never
I'd say) if you interpolate 1/z and perform all the operations above but
change the compare from '<' to '>' :)
How to implement it, you ask. Easy: 1/z is always at the range 0..1,
so if we substract 1 from each value of the z-buffer, it's 'reset' (every
possible value of 1/z is greater than this value which is at the range
-1..0). The best way is to use this trick in the inner loop:
-
1st frame:
-
if (new_1/z_value + 1) > zbuffer[].z
-
2nd frame:
-
if (new_1/z_value + 2) > zbuffer[].z
etc.
It's strongly recommended to interpolate 1/z instead of z even if you
didn't use the trick above -- in the screen space, z isn't actually linear
when 1/z is (see 3.3.1 Perspective correction), so
it's is interpolated right only when using 1/z interpolation. So the pseudo
code above changes a bit (opz = one per zed):
-
opz = opz1
-
for x=x1 -> x2
-
if ( (opz+cur_frame) < zbuffer[y*320+x] )
-
zbuffer[y*320+x] = opz
-
plot(x,y,color)
-
endif
-
opz = opz + k_opz ; interpolate z
-
endfor
4.3 BSP-Tree
BSP-tree (Binary Space Partitioning tree) is a very hard-to-implement but
great sorting technique. I'm not describing how to code linked lists, so
if you can't use them, find a good programming book and read about it.
The special case of linked lists, a binary tree, means that every
element of the tree is attached to two other elements below it in the hierarchy:
Etcetc. Implementing this should be straightforward so I don't believe
you'll have any problems.
4.3.1 The main idea
BSP-tree is as easy use in 2-space as in 3-space. It's easier to draw a
2D world so I'll describe the main idea using one. Here are six lines:
(X = camera location) These should be drawn in the right order using BSP-tree.
First we'll create the tree:
We start with the line 1. We check on which side of it the rest of the
lines are, and make a binary tree in which we put the lines depending on
the side they're located. If some lines are partly at the left, partly
at the right side of the line 1, they're split, and the two pieces are
placed one to the left, one to the right side. Then we take the line 2
and perform the same thing with the lines 3, 4, 5, and 6, and go on like
this until all the lines have been arranged. If we want, we can stop in
some certain point and sort the remaining lines with some else sorting
technique.
When we're ready to draw the lines, we perform the next operation: We
check, on which side of line 1 the camera is located. If it's on the left,
we check if there are lines at right and if there are some, we go
down the right branch. When the right branch is ready, we draw the line
one and go down the left branch. REMEMBER to draw the line 1 in the place
mentioned.
In our example, we perform the following splits:
The binary tree looks like this:
The drawing order of the lines in respect to the camera point in the picture
is so the following:
-
1) We're on the right side of 1. 2a is drawn first, then 1, then we go
down the right branch.
-
2) We're also on the right side of 2b, so we go down the left branch.
-
3) We're on the left of 3a, so we draw first 3a, and then go down the left
branch (no lines on the right).
-
4) We're on the right of 4. So the order is 5c, 4, 5b.
-
5) We return back to 2b and draw it, and go down the right branch towards
3b.
-
6) We're on the left of 3b so we draw 6a and 3b and go down the right branch.
-
7) Related to 5a we're at left, so we draw 6b and finally 5a.
So the final order is 2a,1,3a,5c,4,5b,2b,6a,3b,6b,5a. Seems to be quite
rational. I suggest testing the functioning of BSP with a paper. I myself
drew a 18-line 'world' just to clarify the idea of BSP-tree. An A4 full
of strange lines and symbols X)
4.3.2 Required formulas
Neat thing that, but how to implement it? Here I'll describe the use of
all the needed formulas.
The equation of a plane is the following:
Nx * (X-ax) + Ny * (Y-ay) + Nz * (Z-az) = 0,
where N is the normal vector of the plane, X, Y, and Z arbitrary variables,
and (ax,ay,az) one point on the plane. Now we should derive the equation
of the plane going through all the vertices of our polygon. To do it, we
need the normal vector, which can be calculated by creating two vectors
from the polygon vertices and by taking their cross product. Now we place
the coordinates of the desired point into the equation to the places of
X, Y, and Z. If we get zero as a result, the point is on the plane. If
the result is negative, the point is located on the one side of the plane,
and if it's positive, it's on the other side. This technique is used to
determine on which side of a plane the points are.
If all the vertices of a polygon aren't at the same side of a plane,
the plane and the polygon intersect each other. In this kind of situation
we need to calculate the intersecting points and with the help of those
form two new polygons which are on different sides of the plane. To do
this, the equation of a line in 3-space is required:
-
X = X1 + (X2-X1)t
-
Y = Y1 + (Y2-Y1)t
-
Z = Z1 + (Z2-Z1)t
This line goes through the points (X1,Y1,Z1) and (X2,Y2,Z2) t getting all
real values. When we place these X, Y, and Z to the places of X, Y, and
Z in the plane equation, we've get the coordinates of the intersection
point:
-
Nx*(X1+(X2-X1)t-ax) +
-
Ny*(Y1+(Y2-Y1)t-ay) +
-
Nz*(Z1+(Z2-Z1)t-az) = 0
We solve t:
-
Nx*(X2-X1)t + Ny*(Y2-Y1)t + Nz*(Z2-Z1)t =
-
Nx*(X1-ax) + Ny*(Y1-ay) + Nz*(Z1-az)
-
t * ( Nx*(X2-X1) + Ny*(Y2-Y1) + Nz*(Z2-Z1) ) =
-
Nx*(ax-X1) + Ny*(ay-Y1) + Nz*(az-Z1)
-
Nx*(ax-X1) + Ny*(ay-Y1) + Nz*(az-Z1)
-
t = --------------------------------------
-
Nx*(X2-X1) + Ny*(Y2-Y1) + Nz*(Z2-Z1)
Luckily we need to use this only in the init part... X)
4.3.3 Hints
1) Calculating the plane equation is so slow it'd maybe be better to calculate
the normals just at the beginning of the program and rotate them among
vertices.
2) The easiest way is seldom the fastest. It would maybe be better to
find the alternative with the least number of polygon splits -- the number
of polygons can easily be reduced to a half which speeds up the thing dramatically.
4.4 S-buffer
S-buffer (or scanline buffer or segmented buffer or span buffer or... :)
is an enhanced -- and remarkably faster -- version of z-buffer, where we
sort scanlines rather than single pixels. In the form I'm describing here,
it was developed by Paul Nettle of Hot Wax Software (_-_ at #coders), but
some people had used kind of an s-buffer many years before Nettle developed
his version. In any case, we can get rid of a great amount of calculations
by using s-buffer; in most of the horizline calls we need only compare
the ending coordinates. So it looks like a very interesting way of sorting
polygons.
So we have a table or a linked list into which we save the data
of every hline call. We compare them to each other, act accordingly, and
finally draw the visible scanlines (or parts of them). For example the
following type of table works (for texturemapping, pmode):
-
typedef struct
-
{
-
short xb,xe,zb,ze; // x_begin, x_end, ...
-
byte ub,ue,vb,ve;
-
long kz,ku,kv; // kz = (ze-zb)*65536/(xe-xb) etc
-
unsigned char used; // if 0 -> not used
-
} sbuf_t;
-
sbuftype sbuf[200][100]; // 200 lines in mode 320x200,
-
// max. 100 segments / line
Note! The segments are ordered by the size of the x coordinates
from left to right. This kind of table consumes memory 200*100*28=560000
bytes (due to alignment), and the more complicated the world is, the greater
should the maximum number of segments be. I recommend using a linked list
instead of a table; there's no maximum number of segments, and adding a
segment between two other segments is far easier.
An s-buffered polygon routine takes also the z coordinates of the points
(which have already been transformed into 2D) as parameters. In the outer
loop, we interpolate also z coordinates. An sbuf-hline doesn't actually
draw anything but it adds a new scanline into the s-buffer, and not until
the s-buffer is ready the whole lot is drawn into the screen, just like
with z-buffer.
Segments on the same horizontal line can be located in many ways related
to each other, so they must be compared and act accordingly. There are
six ways in which segments can be located related to each other (old line
= v, new line = u):
With these six cases we can present all the possible locations related
to each other, also special cases (like one pixel segments and segments
located to equal (x,y) coordinates). Humm... Paul mentioned in #coders
that his own s-buffer requires only two or three compares in the hline!
I would be interested to see the implementation. Paul has written a document
describing his technique but he hasn't released it yet (so as not to annoy
his employees in some weird way...) So everyone flood his email until he
gives up and finally releases the doc! ;) [addition: the technique is quite
probably this: he uses BSP-tree for stabile parts of the scene, then he
inserts these pre-sorted parts into a s-buffer front->back, then uses normal
s-buffer for moving objects and moving parts of the scene. The bsp-tree->s-buffer
insertion routine can be optimized ultra fast; a scanline is added if and
only if there's no segment in the s-buffer already in those coordinates.]
In the case 1 we just add the segment into the buffer before
the next segment and return to the main polygon routine (all the segments
to the left have already been checked; the list or table is ordered left
-> right).
In the case 2 we try the next old segment if it exists. If not, we add
the segment into the buffer as the last one.
In the case 3 we split the segment at the end of the old segment (remember
to calculate the new values of z, u, and v), compare the left part to the
old segment, and add it into the buffer if needed. Then we continue with
the right part and the next old segment.
In the case 4 we split the segment at the beginning of the old segment,
add the left part into the buffer before the old segment, and act with
the right part like above with the left part.
In the case 5 we split the segment at both ends of the old scanline
and act like in the cases 3 and 4.
In the case 6 we act directly like with the left part in the case 3.
Here are routines to empty and flip a s-buffer using the sbuf table
above.
-
- tmap is 256x256-sized bitmap
-
function flip_sbuf
-
integer i,j,k
-
integer du,dv
-
for i=0 -> 199
-
j = 0
-
while sbuf[i][j].used<>0 and j<100
-
du = 0
-
dv = 0
-
for k=sbuf[i][j].xb -> sbuf[i][j].xe
-
putpixel(k,i,tmap[sbuf[i][j].ub+du/65536+
-
(sbuf[i][j].vb+dv/65536)*256])
-
du = du + sbuf[i][j].ku
-
dv = dv + sbuf[i][j].kv
-
endfor
-
j = j + 1
-
endwhile
-
endfor
-
endf
-
function clear_sbuf()
-
integer i,j
-
for i=0 -> 199
-
j = 0
-
while sbuf[i][j].used<>0 and j<200
-
sbuf[i][j].used=0
-
j = j + 1
-
endwhile
-
endfor
-
endf