OpenGL pixel and texel placement
A How-To on writing a bitblit imitating OpenGL texture mapping

Written by Bert Peers, bert [at] bpeers [dot] com - Last update : 28/10/2002

Introduction

Recently I had to write a software rasterizer which would simply blit a texture map to the screen. It was a straightforward bitblitter, except that it also had to support zooming in (smooth enlarging). I already had an implementation of this throw-bitmap-at-the-screen stuff using OpenGL: a simple texture mapped quad, facing the viewer, with filtering set to GL_LINEAR, did the job fine.

The tricky part was making sure that the software routine, and the OpenGL quad approach, displayed exactly the same image, give or take some quantization errors. For a game, it may not be so bad if there is a subtle shift when you switch engines, but for "serious" applications, such as medical or geographical visualization, being off by half a pixel is just not acceptable. For these fields, you need an almost mathematical approach to reading and interpreting the spec, so you can be reasonably sure that the pixels you show are the same, rather than relying on it "looking the same". This is what the article below tries to do.

So the question is, given...

  • A texture map of N x M texels...
  • A viewer-facing quad with Window Coordinates (left, top, right, bottom)...
  • Texture coordinates of (S_0, T_0, S_1, T_1)...
  • A viewport of (0, 0, width, height)...
  • A mapping mode of GL_LINEAR...
  • All of the above producing a magnification, or at least a 1:1 view...
... which pixels are colored, using which texels, interpolated by which blending factors ?

1. The viewport

The most important thing to realize right away is that OpenGL doesn't really deal with "pixels" until the very last step, the rasterization. Until then, it is best to think of all coordinates as being specified on a true Cartesian 2D plane. There is an origin plus two unit vectors (X and Y); together, these allow you to know the exact position of any point (p_x, p_y) you specify, without even mentioning pixels.

In this spirit, the command glViewport (x, y, w, h) is precisely the way to specify where that origin is, and how long those unit vectors are. You do so by effectively saying, "the origin and unit vectors are positioned in such a way that the leftmost edge of the screen is at X-coordinate x, the lowermost edge is at Y-coordinate y, and the rightmost and uppermost edges are at x+w and y+h, respectively". Note the use of the word edge, not pixel. If you specify glViewport (0, ..., 5, ...), and your screen happens to be 5 pixels across, you've put the origin at the left edge of the leftmost pixel, and you've put the X == 5 vertical line at the right edge of the rightmost pixel. See the figure below.

By the Version 1.1 spec, we get from 2.10.1 :
Viewport transformation parameters are specified using

void Viewport ( int x, int y, sizei w, sizei h ) ;

where x and y give the x and y window coordinates of the viewport's lower-left 
corner and w and h give the viewport's width and height, respectively.

From the figure, you can see that if we call glViewport with x = 0 and a width equal to our horizontal resolution, then the center of pixel i has X coordinate 0.5 + i and every pixel has a width of exactly 1. Call this "lemma" A.
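
Lemma A is small enough to write down directly. The following is only a sketch in Python; the helper names pixel_center_x and pixel_edges_x are made up for illustration, and it assumes the viewport starts at x = 0 and spans the full horizontal resolution, as in the text.

```python
def pixel_center_x(i):
    """Window X coordinate of the center of pixel column i (Lemma A)."""
    return i + 0.5


def pixel_edges_x(i):
    """Left and right edge of pixel column i; every pixel is exactly 1 wide."""
    return (float(i), float(i + 1))


# A screen that is 5 pixels across, as in the glViewport (0, ..., 5, ...) example.
centers = [pixel_center_x(i) for i in range(5)]
```

For the 5 pixel wide screen, the leftmost pixel spans X = 0 to 1 and the rightmost one spans X = 4 to 5, so the viewport edges coincide with pixel edges and never with pixel centers.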

Note that from the above interpretation, there is not really a reason why the arguments to the viewport should be integers, other than "2.10.1 says so".

2. The polygon

Suppose you draw a rectangle-like quad, with the projection and the modelview matrix set up to let the vertex coordinates end up being the window coordinates; that is, it looks like you're drawing directly on the viewport plane. Suppose we then say
glVertex... (0, ... );
...
glVertex... (5, ... );
...
Which pixels are drawn ?

Let's first throw some spec at it; from 3.5.1 :
The rule for determining which fragments are produced by polygon rasterization is 
called point sampling. The two-dimensional projection obtained by taking the x 
and y window coordinates of the polygon's vertices is formed. Fragment centers 
that lie inside of this polygon are produced by rasterization. Special treatment 
is given to a fragment whose center lies on a polygon boundary edge. In such a 
case we require that if two polygons lie on either side of a common edge (with 
identical endpoints) on which a fragment center lies, then exactly one of the 
polygons results in the production of the fragment during rasterization. 
Also, from 3. :
A fragment is located by its lower-left corner, which lies on integer grid 
coordinates. Rasterization operations also refer to a fragment's center, 
which is offset by (1/2, 1/2) from its lower-left corner (and so lies on half-integer 
coordinates).
(Lemma B).

So what the heck does all that mean ? Quite simply, it means that as a polygon is scanconverted, a pixel will be colored with the value from the polygon if and only if either the pixel's center lies strictly inside the polygon's boundaries, or the pixel's center lies exactly on the polygon boundary and the pixel is on "the good side" of the boundary.

This "good side" is typically defined as "including the upper left". Unless I missed it in the spec, this is not a hard guarantee, so this is a potential vulnerability in our OpenGL-imitation exercise. Even so, we'd still only be drawing the wrong pixels near the edges, which would not show up at shared edges. Also, the above does not influence our coloring of the internal pixels. And you'd need to have the boundary passing exactly through the center, which also is not so common.

In short, for the above glVertex sequence, fragments will be emitted with fragment centers at X coordinates 0.5, 1.5, 2.5, 3.5 and 4.5, which are dead-on the pixel centers from Section 1. This is nice, because it means the "data associated with the fragment" in Section 3 will effectively be computed "at the pixel center", which makes things easier (call this Lemma B.2).
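
To make the point sampling rule concrete, here is a minimal Python sketch that lists which pixel columns produce a fragment for a span running from X = left to X = right. The function name covered_columns is hypothetical. The tie rule (a center exactly on the left edge is kept, one exactly on the right edge is dropped) is an assumption on my part; the spec only demands that exactly one of two neighbouring polygons claims such a pixel.

```python
def covered_columns(left, right, width):
    """Pixel columns whose center lies inside the span [left, right).

    Assumption: centers exactly on the left boundary are kept and centers
    exactly on the right boundary are dropped. The spec does not mandate
    which side wins.
    """
    cols = []
    for i in range(width):
        center = i + 0.5  # fragment center, offset (1/2) from the pixel's left edge
        if left <= center < right:
            cols.append(i)
    return cols
```

With left = 0 and right = 5 on a 5 pixel wide screen this yields columns 0 to 4. Shrinking the quad to left = 0.6, right = 4.4 drops the two outer columns, because their centers at 0.5 and 4.5 are no longer inside.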

3. The texture map

Assuming nothing fancy like mipmapping or borders is used, a call to glTexImage2D simply uploads the N x M map to some internal representation of exactly the same dimensions. The more interesting question is, again, what are the precise coordinates of the individual texels, their edges, and of the entire map's edges ?

As can be seen from the spec's figure 3.10, the entire texture map is interpreted as having a lower left at coordinate (0, 0) and an upper right at coordinate (1, 1). So, when you upload a 4 x 4 texture map, you establish the following frame :

Observe that for a texture of width N, the X coordinate of the center of texel i (i counting from 0..N-1) will be 1/(2.N) + i/N, or (0.5 + i) / N (Lemma C). This center is important, because OpenGL considers it to be the "hot spot" where no filtering is needed. If a fragment is emitted having "associated data" (i.e., a texture coordinate) which puts it exactly at the center of some texel, then that texel's color value will be emitted unfiltered, whether you use GL_NEAREST or GL_LINEAR. This follows from the - 1/2 nudging in equation 3.15, part 3.8.1. If the fragment center is not exactly on a texel center, then interpolation will be used (for GL_LINEAR).
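
Lemma C and the "hot spot" claim can be checked numerically. The Python sketch below uses two illustrative helpers of my own naming, texel_center_s and linear_lookup_position. The second one applies the - 1/2 nudge and splits the result into an integer texel index and a blend fraction, in the way I read the spec.

```python
import math


def texel_center_s(i, n):
    """Texture coordinate of the center of texel i in a map n texels wide (Lemma C)."""
    return (i + 0.5) / n


def linear_lookup_position(s, n):
    """Return (i0, alpha): the left texel index and the blend fraction towards
    texel i0 + 1, after scaling s to texel units and nudging by - 1/2."""
    u = s * n - 0.5
    i0 = math.floor(u)
    return i0, u - i0
```

For a 4 texel wide map the centers sit at 0.125, 0.375, 0.625 and 0.875. Feeding the center of texel 2 into linear_lookup_position gives index 2 with a blend fraction of 0, meaning the neighbouring texel contributes nothing.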

Going back to 3.5.1 of the spec :
Then the value f of a datum at a fragment produced by rasterizing a triangle is given by
[...]
a, b, and c are the barycentric coordinates of the fragment for which the data are produced.
a, b, and c must correspond precisely to the exact coordinates of the center of 
the fragment. Another way of saying this is that the data associated with a fragment 
must be sampled at the fragment's center. 
Texture coordinates are computed by interpolating the values given at the vertices, using the barycentric coordinates of the fragment center. Combined with B.2, we know this comes down to computing those texture coordinates at the pixel centers. (Lemma C2).

If the computed S, T coordinates are not exactly at a texel center, and GL_LINEAR is used, OpenGL floors and ceils to the four surrounding integer values, looks up the texels at those integer coordinates, and blends their colors using the relative distances from the true texel centers (3.8.1, Eq. 3.13 - 3.15). (Lemma D)
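
Lemma D in two dimensions can be sketched as follows, in Python. This is my own reading of the equations rather than a transcription of them. The name bilinear_sample, the list-of-rows texture layout (row 0 at T = 0) and the clamping of out-of-range indices to the nearest edge texel are all assumptions made for the example; wrapping modes are ignored.

```python
import math


def bilinear_sample(texmap, s, t):
    """Blend the four texels surrounding (s, t), GL_LINEAR style.

    texmap is a list of rows of scalar values, indexed texmap[j][i].
    Assumption: indices outside the map are clamped to the nearest edge.
    """
    m = len(texmap)      # height in texels
    n = len(texmap[0])   # width in texels
    u = s * n - 0.5      # texel units, nudged so texel centers land on integers
    v = t * m - 0.5
    i0, j0 = math.floor(u), math.floor(v)
    a, b = u - i0, v - j0  # relative distances from the lower-left texel center

    def texel(i, j):
        return texmap[min(max(j, 0), m - 1)][min(max(i, 0), n - 1)]

    return ((1 - a) * (1 - b) * texel(i0, j0)
            + a * (1 - b) * texel(i0 + 1, j0)
            + (1 - a) * b * texel(i0, j0 + 1)
            + a * b * texel(i0 + 1, j0 + 1))
```

On the 2 x 2 map [[0, 10], [20, 30]], sampling at (0.25, 0.25) hits the center of the first texel and returns it unfiltered, while sampling at (0.5, 0.5) sits exactly between all four centers and returns their average.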

4. Texture mapped polygon rasterization

Finally, we come to the actual rasterization. For clarity, let's write a super dumb, super unoptimized loop.

Setup 1. Transform the quad's vertices to Window Coordinates, as per 2.10. This gives us four floats: left, top, right, bottom. Let's focus on one single row of pixels at some integer row index j, with pixel center at some viewport Y-coordinate f.

Setup 2. For every pixel i on line j, 0 <= i < horiz. resolution, compute the X coordinate of pixel i's center :
P_x = i + 0.5 (from A).

Step 1. If P_x < left or P_x >= right, skip this pixel; it's outside the polygon (from B).

Step 2. Suppose our given texture coordinates are 0 at left, 1 at right; then we can compute the texture coordinate of the fragment at pixel i by interpolating at X coordinate P_x (from B.2), producing
S_x = (P_x - left) / (right - left) (from C2).

Step 3. If we're not filtering (GL_NEAREST), compute the integer Si_x as
Si_x = round (S_x . N - 0.5) due to GL_NEAREST (for round) and C. Here round must round halves upwards, which makes the expression equal to floor (S_x . N). The pixel's color is texmap [Si_x, ...]
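
A small Python sketch of Step 3. The function name nearest_texel_index is made up, and clamping the result into 0..N-1 (so that S = 1.0 maps to the last texel) is an assumption, since wrapping is ignored here. Note that it uses floor rather than the language's own round: Python's built-in round sends halves to the nearest even integer, which would pick the wrong texel whenever S_x . N lands exactly on an odd integer.

```python
import math


def nearest_texel_index(s, n):
    """Index of the texel whose center is nearest to texture coordinate s.

    floor(s * n) matches round(s * n - 0.5) when halves round upwards.
    Python's round() rounds halves to even, so it is deliberately avoided.
    Assumption: the index is clamped so that s == 1.0 picks the last texel.
    """
    i = math.floor(s * n)
    return min(max(i, 0), n - 1)
```

For a 4 texel wide map, S = 0.75 lies exactly on the edge between texels 2 and 3. floor picks texel 3, whereas Python's round (0.75 . 4 - 0.5) = round (2.5) would give 2.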

Step 4. If we're filtering (GL_LINEAR), compute both the floor and ceil of the interpolated texture coordinate, and interpolate using the relative distance (from D) :
Sf_x = S_x . N - 0.5
Si_x = floor (Sf_x)
alpha = Sf_x - Si_x
color = (1 - alpha) . texmap [Si_x, ...] + alpha. texmap [Si_x + 1, ...]
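
Step 4 as a runnable Python sketch for one row of scalar texels. The name sample_linear_1d is hypothetical, and clamping the two indices to the ends of the row is an assumption standing in for the wrapping modes this article ignores. The integer part is taken with a true floor: for S_x close to 0, Sf_x is negative, and truncating towards zero would produce a negative alpha.

```python
import math


def sample_linear_1d(texrow, s):
    """GL_LINEAR style lookup in a single row of scalar texels."""
    n = len(texrow)
    sf = s * n - 0.5       # texel units, nudged so texel centers are integers
    si = math.floor(sf)    # left texel; floor, not truncation, for sf < 0
    alpha = sf - si        # always in [0, 1)
    # Assumption: clamp indices at the row ends instead of wrapping.
    lo = texrow[min(max(si, 0), n - 1)]
    hi = texrow[min(max(si + 1, 0), n - 1)]
    return (1 - alpha) * lo + alpha * hi
```

With the row [0, 10, 20, 30], S = 0.375 is the center of texel 1 and returns 10 unfiltered, while S = 0.5 is halfway between texels 1 and 2 and returns 15.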

Step 5. If the S texture coordinates, associated with vertices at Window Coordinates left and right, were S_0 and S_1 instead of simply 0, 1, compute S_x as
S_x = S_0 + (S_1 - S_0) . (P_x - left) / (right - left) in Step 2.

---

Putting it all together for the most general case :

Given a polygon scanline with boundaries at left and right, with texture coordinates at those boundaries of S_0 and S_1, using GL_LINEAR, then for every pixel i,

  • If i + 0.5 < left || i + 0.5 >= right, skip it
  • Else blend between texel floor ( ( S_0 + (S_1 - S_0) . (i + 0.5 - left) / (right - left) ) . N - 0.5 ) and the next, using relative distances
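
The whole scanline recipe fits in one super dumb, super unoptimized Python loop. This is a sketch under stated assumptions, not a reference implementation: blit_scanline is a made-up name, texels are plain scalars, untouched pixels are marked with None, and texel indices are clamped at the row ends because wrapping modes are out of scope.

```python
import math


def blit_scanline(texrow, left, right, s0, s1, width):
    """Imitate a GL_LINEAR textured span on one row of `width` pixels."""
    n = len(texrow)
    out = [None] * width  # None marks pixels the span does not cover
    for i in range(width):
        px = i + 0.5                                        # Lemma A
        if px < left or px >= right:                        # Lemma B
            continue
        s = s0 + (s1 - s0) * (px - left) / (right - left)   # Lemma C2
        sf = s * n - 0.5                                    # Lemma C
        si = math.floor(sf)
        alpha = sf - si
        # Assumption: clamp at the row ends instead of wrapping.
        lo = texrow[min(max(si, 0), n - 1)]
        hi = texrow[min(max(si + 1, 0), n - 1)]
        out[i] = (1 - alpha) * lo + alpha * hi              # Lemma D
    return out
```

Blitting the row [0, 10, 20, 30] at 1:1 (left = 0, right = 4, S running from 0 to 1) reproduces the texels exactly, since every pixel center lands on a texel center. Stretching the same row over 8 pixels gives 0, 2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 30: no output pixel coincides with a texel center any more, so every interior value is a blend.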

---

Clipping, wrapping modes, vertical iteration and efficiency have all been ignored. Another good exercise would be imitating trilinear minification, which comes down to the same ideas, but using 3 fetches, blended with just the right alphas -- more spec digging.
Have fun !
