Chapter 4: Geometric Objects and Transformations

In this chapter, we slow down and study some of the mathematics associated with computer graphics.  The basic operation is the transformation of coordinates, which is essentially equivalent to changing the viewpoint from which a graphic object is displayed.

We begin this chapter by pinning down a few terms.  In three-dimensional graphics, we are working in a Euclidean affine vector space.  Were this a math class, we would spend time giving a formal definition of a vector space and then give the additional requirements for it to be both Euclidean and affine.  This being a computer science class, we move right to the punch line and specify the space in terms that are quite familiar.

An affine vector space is a mathematical construct with three types of objects:

Scalars (think real numbers)

Points (in three dimensions, think X, Y, Z)

Vectors (think a directed line segment connecting two points)

A vector space is Euclidean if it supports a measure of distance and, by extension, an idea of trigonometry.  Our vector space will use the standard ideas of distance, often called the hypotenuse rule, and, by extension, the standard trigonometry.

Theoreticians consider it very important to distinguish between vectors and points as classes of objects in our vector space.  For now we shall be content to note this distinction and discuss why few students bother to differentiate the two constructs.  When needed, we shall repeat the statement and claim that some objects are points and not vectors, etc.

When many of us bother to think about such things, we tend to imagine a three-dimensional space as a system with an origin; i.e., a distinguished point with coordinates (0, 0, 0).  In such a coordinate system, the correspondence between points and vectors is so natural that it is hard to remember that the two represent distinct concepts.  Consider the figure below.

We are looking at two views of the (Y, Z)-plane in a three dimensional (X, Y, Z) coordinate system.  On the left, we see the point P = (0, Y, Z) and on the right we see the vector from the origin to the point P; the vector is called V = (0, Y, Z).  Note that for every point
P = (X, Y, Z) in the three-dimensional space there exists a vector V = (X, Y, Z) connecting the origin of the coordinate space to the point P.  This isomorphism of the two object types can cause some confusion.  When we need to, we shall clarify the differences.

When considering a vector space with three distinct object types, we should first investigate the types of operations allowed on those types.  We begin with scalars, the most familiar type: all we are talking about here is real numbers.

Scalars

We are all familiar with a number of binary operations on real numbers: addition, subtraction, multiplication, division, and quite a few others.  We focus on addition, subtraction, and multiplication.  In real numbers, subtraction is the inverse of addition.

Vectors

We first consider operations between two vectors and then consider operations involving one vector and either a scalar or a point.  Vectors may be added; thus for two vectors V1 and V2, the sum (V1 + V2) is defined.  As vector addition is well defined, it follows immediately that vector subtraction is also defined.  For V1 = (X1, Y1, Z1) and V2 = (X2, Y2, Z2), we have
V1 + V2 = (X1 + X2, Y1 + Y2, Z1 + Z2)
V1 − V2 = (X1 − X2, Y1 − Y2, Z1 − Z2).

In particular, V1 + V1 = (2·X1, 2·Y1, 2·Z1).  We generalize from this observation to note that multiplication of a vector by a scalar is an allowed operation, so for V = (X, Y, Z) and scalar a we have a·V = (a·X, a·Y, a·Z).  This equality holds in the following special cases:

a = 0        0·V = (0, 0, 0)
a = 1        1·V = (X, Y, Z)
a = −1     −1·V = (−X, −Y, −Z).

We originally defined a vector as a difference between two points.  For that reason, addition of a vector to a point is a defined operation.  Consider point P = (X1, Y1, Z1) and vector
V = (X2, Y2, Z2).  The sum is defined as P + V = (X1 + X2, Y1 + Y2, Z1 + Z2).  By extension one might consider subtraction of a vector from a point, P − V = P + (−1·V), although this is usually considered just another case of addition of a vector to a point.

Note that the result of addition of a vector to a point is another point.

Points

There are only a few operations allowed on points.
1)   Addition of a vector to a point, giving another point.
2)   Subtraction of one point from another, giving a vector.
3)   Affine addition of two points, an operation which implicitly involves the vector connecting the two points.

In particular, addition of points is not a supported operation and, by extension, neither is multiplication of a point by a scalar.  For two points P1 = (X1, Y1, Z1) and P2 = (X2, Y2, Z2), affine addition is defined in terms of the vector V = P2 − P1 = (X2 − X1, Y2 − Y1, Z2 − Z1).

For scalar α, we define affine addition as the addition of a vector to a point

Q   = P1 + α·V

= P1 + α·(P2 − P1) = (1 − α)·P1 + α·P2.

For 0.0 ≤ α ≤ 1.0, this forms the set of points on the line segment between P1 and P2.
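
Affine addition is simple enough to state as running code.  Below is a minimal Python sketch (the function name affine_add is my own, not from the notes):

```python
def affine_add(p1, p2, alpha):
    """Q = (1 - alpha)*P1 + alpha*P2, computed coordinate by coordinate.

    For 0.0 <= alpha <= 1.0, Q lies on the line segment from P1 to P2;
    alpha = 0 gives P1, alpha = 1 gives P2, alpha = 0.5 gives the midpoint.
    """
    return tuple((1 - alpha) * c1 + alpha * c2 for c1, c2 in zip(p1, p2))

P1 = (0.0, 0.0, 0.0)
P2 = (2.0, 4.0, 6.0)
midpoint = affine_add(P1, P2, 0.5)   # (1.0, 2.0, 3.0)
```

Values of α outside [0.0, 1.0] still produce points on the infinite line through P1 and P2, just not on the segment itself.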

For an example, let us consider the equation in two dimensions Y = a·X + b.

Let P1 be the X-intercept and P2 be the Y-intercept.  The Y-intercept occurs at X = 0, thus at Y = b, or P2 = (0, b).  The X-intercept occurs when a·X + b = 0, or X = −b/a, thus P1 = (−b/a, 0).

The line segment connecting the two points is thus (1 − α)·P1 + α·P2 = (b/a·(α − 1), α·b), so the line segment connecting the two points should be described by the following equations:

X = b/a·(α − 1)

Y = α·b              for 0.0 ≤ α ≤ 1.0.

To produce equations in the standard form, we first solve both equations for α.
Equation 1 becomes α = a·X/b + 1
Equation 2 becomes α = Y/b,
thus we have Y/b = a·X/b + 1, or Y = a·X + b, as expected.

As a final remark, consider the point Q = (b/a·(α − 1), α·b) for α = 2.0.  This becomes
Q = (b/a, 2·b), or X = b/a and Y = 2·b.  For X = b/a, the linear equation evaluates to
Y = a·(b/a) + b = b + b = 2·b, so the point Q = (b/a, 2·b) is on the line described by the linear equation.  It is just not on the line segment between the two points.

Convexity

We have defined the line segment between points P1 and P2 as the set of all points Q such that Q = (1 − α)·P1 + α·P2, with 0.0 ≤ α ≤ 1.0.  Equivalently, we have given the definition as Q = α1·P1 + α2·P2, with α1 ≥ 0, α2 ≥ 0, and α1 + α2 = 1.0.  In general, we extend this concept to N points, P1 … PN, by defining
Q = α1·P1 + α2·P2 + … + αN·PN, with the two constraints

α1 ≥ 0, α2 ≥ 0, …, αN ≥ 0, and

α1 + α2 + … + αN = 1.0.

The set of all points Q so formed is called the convex hull of the point set P1, P2, …, PN.  There are a number of ways to imagine the convex hull.  Let H be a polygon whose vertices are a subset of the points P1, P2, …, PN.  We pick the vertices of the polygon H so as to make it convex; specifically, the line segment between any two of the points P1, P2, …, PN lies entirely within H (in its interior or on its boundary).  Then H is a convex hull of the point set.

With the informal definition just given, it is not obvious that the convex hull of a point set is uniquely defined.  Given the formal definition, the claim that the convex hull is uniquely defined is at least a bit more plausible.  It may even be true.
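
The N-point definition translates directly into code.  The following Python sketch (function name mine, not from the notes) enforces the two constraints before forming Q:

```python
def convex_combination(points, alphas):
    """Q = alpha1*P1 + ... + alphaN*PN, defined only when every
    alpha_i >= 0 and the alphas sum to 1."""
    assert all(a >= 0 for a in alphas), "weights must be non-negative"
    assert abs(sum(alphas) - 1.0) < 1e-9, "weights must sum to 1"
    dim = len(points[0])
    return tuple(sum(a * p[k] for a, p in zip(alphas, points))
                 for k in range(dim))

# Equal weights give the centroid, which always lies in the convex hull.
triangle = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
centroid = convex_combination(triangle, [1/3, 1/3, 1/3])
```

Every point the function can produce lies in the convex hull of the input points; the centroid is just the most symmetric choice of weights.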

Dot and Cross Products of Vectors.

When we discussed the three elements of an affine vector space, we indicated that there were additional operators on vectors that were not necessary for the formal definitions.  We now discuss the two standard vector products and later specify convenient methods for their computation.  The two products are the dot product and the cross product.  Let U and V be two vectors.  The dot product is denoted U · V and the cross product U × V.

The properties of the dot and cross products derive from two important properties of any pair of vectors.  Remember that vectors denote only a direction and a magnitude.  We can easily imagine any two vectors V1 and V2 as being rooted at the same point P and connecting P to points P1 and P2 respectively, so V1 = P1 − P and V2 = P2 − P.  Imagining the two vectors to be sides of a triangle, we may uniquely define the angle between the two vectors in the obvious way.  The second property is that if the vectors are not collinear, then the three points P, P1, and P2 uniquely define a plane.  Two vectors V1 and V2 are collinear if there exists a non-zero real number b such that V2 = b · V1.

The dot product of two vectors is a scalar.  As we are working in a Euclidean space, we note further that for any vector U, U · U = |U|² is the square of the magnitude of the vector.  For any two vectors U and V, it can be shown that −(|U| · |V|) ≤ U · V ≤ (|U| · |V|) and that the angle θ between the two vectors is given by cos(θ) = U · V / (|U| · |V|).  If U · V = 0, then the two vectors U and V are said to be orthogonal or perpendicular to each other.

The cross product of two vectors U and V is a vector orthogonal to each of U and V; thus, (U × V) · U = 0 and (U × V) · V = 0.  The magnitude of the cross product is determined by the sine of the angle between the two vectors.  If θ is the angle between the vectors U and V, then |U × V| = |U| · |V| · sin(θ).  If U and V are collinear, then |U × V| = 0.

We shall present convenient methods for computing both the dot product and cross product of two vectors after we have specified the vector space in which we shall be working.

Planes

Consider three points P, Q, and R such that the vectors (P − Q) and (P − R) are not collinear.  It is a theorem of plane geometry that three such points uniquely determine a plane.  We shall now consider some of the implications of such a statement.

The line segment that connects P and Q is the set of points of the form
S(α) = α·P + (1 − α)·Q, for 0 ≤ α ≤ 1.

Given this, we can similarly describe the line segment connecting S to R by
T(β) = β·S + (1 − β)·R, for 0 ≤ β ≤ 1.

T may be described in terms of the original points P, Q, and R by the combination

T(α, β)      = β·(α·P + (1 − α)·Q) + (1 − β)·R

= (α·β)·P + (β − α·β)·Q + (1 − β)·R.

At this point, we take a detour to note the similarity of the idea of this derivation to the derivation for the convex hull of three points.  Let
α1 = α·β, α2 = (β − α·β), and α3 = (1 − β).  Then α1 + α2 + α3 = 1.

Since 0 ≤ α ≤ 1 and 0 ≤ β ≤ 1, it immediately follows that 0 ≤ α·β ≤ 1, so 0 ≤ α1 ≤ 1.

Since 0 ≤ β ≤ 1, it follows immediately that 0 ≤ (1 − β) ≤ 1, so 0 ≤ α3 ≤ 1.  We see also that
0 ≤ (1 − α) ≤ 1, so 0 ≤ (β·(1 − α)) ≤ 1, so 0 ≤ α2 ≤ 1, and we have a proper convex hull equation T = α1·P + α2·Q + α3·R.

Returning to our original derivation, we have

T(α, β)      = (α·β)·P + (β − α·β)·Q + (1 − β)·R

= P + (α·β − 1)·P + (β − α·β)·Q + (1 − β)·R
= P + (α·β − β − 1 + β)·P + (β − α·β)·Q + (1 − β)·R
= P + (α·β − β)·P + (−1 + β)·P + (β − α·β)·Q + (1 − β)·R
= P − (β − α·β)·P − (1 − β)·P + (β − α·β)·Q + (1 − β)·R

= P + (β − α·β)·Q − (β − α·β)·P + (1 − β)·R − (1 − β)·P

= P + (β − α·β)·(Q − P) + (1 − β)·(R − P), the book's result.

Define vectors U = (Q − P) and V = (R − P) to get

T(α, β)      = P + (β − α·β)·U + (1 − β)·V, which the textbook represents as
T(α, β)      = P + αU + βV, obviously redefining α and β.  Note that we drop the excessive use of the multiplication symbol, writing αU for α·U, to reduce clutter.

The equation can be rewritten as T(α, β) − P = αU + βV.  We can construct a vector W perpendicular to each of U and V by taking their cross product W = U × V.  Then
W · (αU + βV) = α(W · U) + β(W · V) = 0, as W · U = 0 and W · V = 0.

The equation of a plane could thus be defined as W · (T(α, β) − P) = 0; thus the plane is defined by a point in the plane and a vector perpendicular to the plane.  We shall return to this observation when we have further examined the dot product and cross product.

Review of Matrix and Vector Notation.

A matrix is a rectangular array of numbers.  Those students with programming experience may consider a matrix to be a two-dimensional array of real numbers.  An (N by M)-matrix has N rows, each of M entries, and so has M columns.  Two matrices are of the same size if they have the same number of rows and the same number of columns.  If N = M, the matrix is square.  The fact that most of the matrices of interest to us are square should not blind us to the existence of matrices that are not square.

The most familiar examples of non-square matrices are the row vector and column vector.

A row vector is a (1 by M) matrix.
A column vector is an (N by 1) matrix.

A matrix element is variously denoted:

In algebra we use subscripts:                          AI,J
In many programming languages we use parentheses:      A(I, J)
In C++ we use square brackets:                         a[i][j]

These notes will mostly use the parenthesis style.

Addition is defined for two matrices if and only if they are the same size.  Consider two
(N by M)-matrices A and B.  The sum matrix C = A + B is also an (N by M) matrix, with elements defined by C(I, J) = A(I, J) + B(I, J), for 1 ≤ I ≤ N and 1 ≤ J ≤ M.

Note that, unlike C++ and the .NET languages, matrices in our usage are 1-based.

Matrix addition is commutative, so A + B = B + A.

The transpose of a matrix A, denoted Aᵀ, is obtained by swapping the rows and columns of the matrix, thus Aᵀ(I, J) = A(J, I).  The transpose of an (N by M) matrix is an (M by N) matrix.  If a matrix is square, then so is its transpose.  Note that (Aᵀ)ᵀ = A.

The product of two matrices A and B, often written as AB, is defined if and only if the number of columns in matrix A is the same as the number of rows in matrix B.

Let A be an (L by M) matrix
Let B be an (M by N) matrix

Then C = AB is an (L by N) matrix, with elements

C(I, J) = Σ A(I, K) · B(K, J), with the sum taken for 1 ≤ K ≤ M.

Note that matrix multiplication usually does not commute.  In the above scenario, the product matrix BA will not be defined unless L = N.  If L = N, then each of AB and BA will be a square matrix, but they will be of different sizes unless L = M = N.

Suppose that both A and B are 3 by 3 matrices, and C = AB is the product matrix. Then

C(I, J) = A(I, 1) · B(1, J) + A(I, 2) · B(2, J) + A(I, 3) · B(3, J).

Suppose U is a 3-D row vector; i.e., a (1 by 3) matrix.  Then its transpose Uᵀ is a 3-D column vector; i.e., a (3 by 1) matrix.  Let A be a (3 by 3) matrix.  The following products are defined: UA, AUᵀ, UUᵀ and UᵀU (the last one is a bit strange).  The figure below illustrates a few sample computations.

Note that the matrix product UUᵀ is a scalar, while the matrix product UᵀU is a 3 by 3 matrix.
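
These size rules are easy to check by computing the products with a hand-rolled multiply.  A Python sketch (helper name mine) following the sum formula for C(I, J):

```python
def matmul(A, B):
    """C = AB for an (L by M) matrix A and an (M by N) matrix B,
    each stored as a list of rows; C(i, j) = sum over k of A(i,k)*B(k,j)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

U  = [[1.0, 2.0, 3.0]]        # row vector, a (1 by 3) matrix
Ut = [[1.0], [2.0], [3.0]]    # its transpose, a (3 by 1) matrix

UUt = matmul(U, Ut)   # (1 by 1): [[14.0]], effectively the scalar U . U
UtU = matmul(Ut, U)   # (3 by 3) matrix whose (i, j) entry is U[i]*U[j]
```

The (1 by 1) result of UUᵀ is why that product is usually read as a scalar; UᵀU, by contrast, genuinely is a 3 by 3 matrix.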

Basis Vectors and Coordinate Systems

Consider three vectors V1, V2, and V3 that are linearly independent.  Those few students who are interested are invited to look up the definition of linear independence; we shall specify a related property of the vectors that is sufficient for linear independence.  We stipulate that the dot product of any two distinct vectors in the set is 0; thus VI · VJ = 0 if I ≠ J.

It is common practice to express arbitrary vectors as linear combinations of these basis vectors; thus
W = a1V1 + a2V2 + a3V3, and
X = b1V1 + b2V2 + b3V3.

We now calculate the dot product W · X.

W · X        = a1V1 · (b1V1 + b2V2 + b3V3)

+ a2V2 · (b1V1 + b2V2 + b3V3)
+ a3V3 · (b1V1 + b2V2 + b3V3)

= a1b1V1 · V1 + a1b2V1 · V2 + a1b3V1 · V3

+ a2b1V2 · V1 + a2b2V2 · V2 + a2b3V2 · V3
+ a3b1V3 · V1 + a3b2V3 · V2 + a3b3V3 · V3

= a1b1V1 · V1 + a2b2V2 · V2 + a3b3V3 · V3

With this interpretation, we have
W · W        = a1a1V1 · V1 + a2a2V2 · V2 + a3a3V3 · V3.

What we expect is that W · W = (a1)² + (a2)² + (a3)².  This is true if we further specify the basis vectors so that they are normal (unit length) vectors: V1 · V1 = 1, V2 · V2 = 1, and V3 · V3 = 1.

Thus we have an orthonormal set of basis vectors; they are unit vectors that are mutually orthogonal (perpendicular to each other).

Following normal usage, we call the basis vectors e1, e2, and e3, where e1 = (1, 0, 0), e2 = (0, 1, 0), and e3 = (0, 0, 1).

A vector W then has the standard representation in terms of the basis vectors: W = a1e1 + a2e2 + a3e3.

In our text, we shall often write vectors in the row form W = (a1, a2, a3).  There is no theoretical significance to this notation; it is just easier to write vectors this way and to omit the transpose superscript that formally distinguishes a row vector from a column vector.  To recap, consider the two vectors W = (a1, a2, a3) and X = (b1, b2, b3).  Then W · X = a1b1 + a2b2 + a3b3.

As an exercise in the dot product, consider the equation of a plane through a point P0.  The equation is N · (P − P0) = 0.  The first example is as follows:

What is the equation of the plane perpendicular to (0, 0, 1) and through the point (2, 3, 5)?

Answer: Let P = (X, Y, Z) and P0 = (2, 3, 5).  We stipulated that N = (0, 0, 1), so the equation of the plane is (0, 0, 1) · (X − 2, Y − 3, Z − 5) = 0 or Z − 5 = 0 or Z = 5.

What is the equation of the plane perpendicular to (1, 1, 1) and through the point (2, 3, 5)?

Answer: The equation is (1, 1, 1) · (X − 2, Y − 3, Z − 5) = 0 or X − 2 + Y − 3 + Z − 5 = 0 or
X + Y + Z − 10 = 0, or X + Y + Z = 10.

What is the equation of the plane perpendicular to (−1, −1, −1), through the point (2, 3, 5)?

Answer: The equation is (−1, −1, −1) · (X − 2, Y − 3, Z − 5) = 0
or 2 − X + 3 − Y + 5 − Z = 0 or 10 − X − Y − Z = 0, or X + Y + Z = 10.  Note that this is the same answer as the previous question, due to the fact that (−1, −1, −1) = −(1, 1, 1).  We shall soon consider situations when we need to distinguish these two cases as representing the outside of a geometric solid vs. the inside.

Cross Product, Revisited.

Consider the two vectors W = (a1, a2, a3) and V = (b1, b2, b3).  We now present a simple method for calculating the cross product W × V.  Recall that the cross product W × V is a vector perpendicular to each of W and V.  The easiest method for producing the cross product involves taking the determinant of a (3 by 3)-matrix whose first row holds the basis vectors e1, e2, e3 and whose remaining rows hold the components of W and V.  The equation is

W × V = (a2b3 − a3b2, a3b1 − a1b3, a1b2 − a2b1).

It is easily shown from the above that W × V = −(V × W); the cross product anti-commutes.

It is easily shown that W · (W × V) = V · (W × V) = 0, so that the vector W × V is perpendicular to each of V and W.  We show this for W · (W × V).

W · (W × V)     = a1(a2b3 − a3b2) + a2(a3b1 − a1b3) + a3(a1b2 − a2b1)

= a1a2b3 − a1a3b2 + a2a3b1 − a2a1b3 + a3a1b2 − a3a2b1

= a1a2b3 − a2a1b3 + a2a3b1 − a3a2b1 + a3a1b2 − a1a3b2

= 0, so W × V is perpendicular to W.
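
The component formula and both orthogonality identities can be verified in a few lines.  A Python sketch (helper names mine, not from the notes):

```python
def dot(u, v):
    """U . V = a1*b1 + a2*b2 + a3*b3."""
    return sum(a * b for a, b in zip(u, v))

def cross(w, v):
    """W x V = (a2*b3 - a3*b2, a3*b1 - a1*b3, a1*b2 - a2*b1)."""
    a1, a2, a3 = w
    b1, b2, b3 = v
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

W = (1.0, 2.0, 3.0)
V = (4.0, 5.0, 6.0)
WxV = cross(W, V)   # (-3.0, 6.0, -3.0)
```

Here dot(W, WxV) and dot(V, WxV) are both zero, and cross(V, W) is the negation of cross(W, V), exhibiting the anti-commutativity noted above.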

Two Special Square Matrices

If we restrict our attention to square matrices, say (N by N)-matrices, we find it helpful to define two special matrices: the zero matrix and the identity matrix.

0    the zero matrix        for every (N by N)-matrix A, A + 0 = 0 + A = A

I     the identity matrix    for every (N by N)-matrix A, AI = IA = A.

The components of the zero matrix are easily specified: they are all 0s.  The components of the identity matrix are 1s along the diagonal of the matrix and 0s elsewhere; thus

I(J, K) = 1 if J = K and 0 if J ≠ K.  For (3 by 3) matrices, we have:

I3 = | 1  0  0 |
     | 0  1  0 |
     | 0  0  1 |

The subscript 3 on the label I3 is to distinguish the matrix as the (3 by 3) identity matrix.  When the size of the matrix is clear, the subscript is often omitted.

If a matrix A is invertible, there exists a matrix, denoted A⁻¹, such that AA⁻¹ = A⁻¹A = I.  There exist a number of methods for computing the inverse of an invertible matrix; we consider the one that is simplest to implement.  This method of elementary row operations is taken from the book Linear Algebra by Friedberg, Insel, and Spence (Prentice-Hall, 1989).

Definition: Let A be an (M by N)-matrix.  Any one of the following three operations on the rows of A is called an elementary row operation:

a)   Interchanging any two rows of A

b)   Multiplying any row of A by a nonzero constant

c)   Adding any constant multiple of a row of A to another row.

Elementary column operations are defined in a similar manner, but we do not use them.

Definition: Let A be an (N by N)-matrix.  We define the (N by 2N) augmented matrix
C = (A | IN) as pasting the identity matrix onto the matrix A, so that

C(I, J)  = A(I, J) if 1 ≤ J ≤ N
= 1 if J = (N + I)
= 0 otherwise.

Theorem: Let A be an invertible (N by N)-matrix.  If the augmented matrix C = (A | IN) is transformed into a matrix of the form (IN | B) by a finite number of elementary row operations, then B = A⁻¹.

We consider the example matrix A at right.  First, we construct the augmented matrix (A | I3).

The first thing we note is that we need a nonzero entry in column 1 of the first row of the augmented matrix.  But C(1, 1) = 0, so we swap rows 1 and 2 to get the matrix at left.  We equally well could have swapped rows 1 and 3.

The basic strategy is to move column by column from left to right to cause the matrix A to be replaced by the matrix I3.  We first drive the diagonal element in the column to 1 and then drive the non-diagonal elements to 0.  In column 1, we want A(1, 1) = 1, A(1, 2) = 0, and A(1, 3) = 0.  After this we work on columns 2 and 3.

In order to place a 1 in the first column of the first row, we multiply the first row of the augmented matrix by 0.5 to get the matrix at left.  We are now prepared to drive A(1, 2) and A(1, 3) to zero.

In the first column we find that A(1, 2) = 0 already so that no adjustment is required.  We see that A(1, 3) = −3, so we add 3 times row 1 to row 3 to get the result at right.  We are now done converting the first column and move to column 2.

In order to change the second column of the matrix above into the second column of I3, we multiply row 2 by 0.5 to place a 1 in the
(2, 2) position.  This operation produces the matrix at left.

We now complete work on the second column by adding 2 times the second row to the first row and by adding 3 times row 2 to row 3.  The result of this operation is the matrix at right.

Only the third column remains to be changed.  In order to place a 1 in the (3, 3) position, we multiply the third row by 0.25.  The result of this operation is the matrix at left.

We complete the process of converting the matrix by adding 3 times row 3 to row 1 and 2 times row 3 to row 2.  The result of this process is shown at right.  Note that the matrix is now of the form (I3 | B).

As a result of the above, we now have the inverse of the matrix A.

This completes the example.
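
The procedure in the example mechanizes directly.  The Python sketch below (mine, not from the notes) performs the three elementary row operations on the augmented matrix (A | I), swapping in the largest available pivot, which also handles the zero-pivot case from step one of the example:

```python
def invert(A):
    """Invert a square matrix via elementary row operations on (A | I).

    A is a list of rows and is left unmodified; raises ZeroDivisionError
    if A is singular (no row swap can supply a nonzero pivot).
    """
    n = len(A)
    # Build the augmented matrix C = (A | I).
    C = [[float(x) for x in A[i]] + [1.0 if j == i else 0.0 for j in range(n)]
         for i in range(n)]
    for col in range(n):
        # (a) interchange rows to bring the largest pivot into position
        best = max(range(col, n), key=lambda r: abs(C[r][col]))
        C[col], C[best] = C[best], C[col]
        # (b) multiply the pivot row by a constant, driving the pivot to 1
        p = C[col][col]
        C[col] = [x / p for x in C[col]]
        # (c) add multiples of the pivot row to zero the rest of the column
        for r in range(n):
            if r != col and C[r][col] != 0.0:
                f = C[r][col]
                C[r] = [x - f * y for x, y in zip(C[r], C[col])]
    # C now has the form (I | B); the right half B is the inverse.
    return [row[n:] for row in C]
```

Always choosing the largest pivot (partial pivoting) is a numerical refinement; the hand computation above only swapped rows when the pivot position held a zero.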

More Properties of Matrices

We need a few more results on matrix multiplication in order to work our way through some of the presentations just to come.

The first result relates to the inverse of a matrix product.

Theorem:  Let A and B be invertible (N by N)-matrices.  Let C = AB be the matrix product of A and B.  Then C⁻¹ = B⁻¹A⁻¹.

Proof: If A is invertible, then there exists a matrix A⁻¹ such that AA⁻¹ = A⁻¹A = IN.  The same is true of matrix B.  Recalling that for any (N by N) matrix M, we have
M·IN = IN·M = M, we note the following

(AB)(B⁻¹A⁻¹) = A(BB⁻¹)A⁻¹ = A·IN·A⁻¹ = AA⁻¹ = IN.

(B⁻¹A⁻¹)(AB) = B⁻¹(A⁻¹A)B = B⁻¹·IN·B = B⁻¹B = IN.

Theorem: Let A and B be matrices such that the product C = AB is defined.
Then Cᵀ = BᵀAᵀ.

Proof:  Let A be an (L by M)-matrix and B be an (M by N)-matrix.  Then the product
C = AB is defined, with C being an (L by N) matrix.

Note that Aᵀ is an (M by L)-matrix and Bᵀ is an (N by M)-matrix.  From this observation, we see that the matrix product BᵀAᵀ is defined, while the product AᵀBᵀ is not defined if L ≠ N.  The equality itself follows entry by entry: Cᵀ(J, I) = C(I, J) = Σ A(I, K)·B(K, J) = Σ Bᵀ(J, K)·Aᵀ(K, I) = (BᵀAᵀ)(J, I), with each sum taken for 1 ≤ K ≤ M.

This last result applies to square matrices, but is not restricted to them.  One particular case of interest is the product of a matrix and a vector.

Let A be an (N by N)-matrix and let X be an (N by 1) column vector.  Then Xᵀ is a (1 by N) row vector.  We consider the product of the matrix and the vector.

(AX)ᵀ = XᵀAᵀ and (XᵀA)ᵀ = AᵀX, as (Xᵀ)ᵀ = X.

We now give a theorem on the inverse of the transpose of a matrix.  We begin with a definition of a new symbol used to facilitate the proof.

Definition: We define the Kronecker delta by δI,J = 1 if I = J and δI,J = 0 if I ≠ J.

Note that δI,J = δJ,I.

Theorem:  Let A be an N-by-N matrix with an inverse.  Then (Aᵀ)⁻¹ = (A⁻¹)ᵀ.

Proof:  Let A be an invertible N-by-N matrix.  Let B = A⁻¹.  Then we have the two matrix products A·B = B·A = IN, the N-by-N identity matrix.

The claim is that Bᵀ = (Aᵀ)⁻¹, thus Aᵀ·Bᵀ = Bᵀ·Aᵀ = IN.

Expressing these matrix products as sums (each sum taken for 1 ≤ K ≤ N), we have the following equivalent expressions.

(A·B)(I, J) = Σ A(I, K)·B(K, J) = δI,J

(B·A)(J, I) = Σ B(J, K)·A(K, I) = δJ,I

We use the matrix products to complete the proof.

(Aᵀ·Bᵀ)(I, J) = Σ Aᵀ(I, K)·Bᵀ(K, J) = Σ A(K, I)·B(J, K) = (B·A)(J, I) = δJ,I = δI,J

(Bᵀ·Aᵀ)(I, J) = Σ Bᵀ(I, K)·Aᵀ(K, J) = Σ B(K, I)·A(J, K) = (A·B)(J, I) = δJ,I = δI,J

We have just shown that Aᵀ·Bᵀ = Bᵀ·Aᵀ = IN.

Since B = A⁻¹, we have shown that (Aᵀ)⁻¹ = (A⁻¹)ᵀ.

Thus we have shown that for an invertible matrix, the inverse of the transpose is equal to the transpose of the inverse.  This property becomes quite useful when we consider coordinate transformations.
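
Both results are easy to spot-check numerically.  A Python sketch using (2 by 2) matrices, where the inverse has a closed form (helper names are mine):

```python
def mul2(A, B):
    """Product of two (2 by 2) matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    """Closed-form inverse of a (2 by 2) matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def tr2(A):
    """Transpose of a (2 by 2) matrix."""
    return [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[2.0, 0.0], [1.0, 2.0]]

inv_of_product = inv2(mul2(A, B))              # (AB)^-1
product_of_inverses = mul2(inv2(B), inv2(A))   # B^-1 A^-1, should match

inv_of_transpose = inv2(tr2(A))        # (A^T)^-1
transpose_of_inverse = tr2(inv2(A))    # (A^-1)^T, should match
```

A numerical check like this is not a proof, of course, but it is a cheap way to catch a transcription error in either identity.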

Change of Coordinate Systems

We now investigate a number of transformations of coordinate systems.  The two most basic methods are rotation of the coordinate system and translation of the coordinate system.

There are a number of reasons for interest in coordinate transformations.  We first present an example not directly related to computer graphics.

Consider the following 2-dimensional equation in (X, Y)-coordinate space.

One might say that the geometric figure represented by this equation is not obvious, except for the obvious fact that it is a two-dimensional figure.  There are standard methods, not to be discussed in this class, to simplify this equation so that it may be understood.  These standard methods lead to the following coordinate transformation.

Making this transformation leads to the following equation in the variables U and V.

Division of both sides of this equation leads to the form U²/9 + V²/16 = 1.

This is easily recognized as the equation of an ellipse, centered at (U, V) = (0, 0) with semi-major axis of length 4 along the V axis and semi-minor axis of length 3 along the U axis.  The figures below show the ellipse in both (U, V)-space and (X, Y)-space.

We now consider the more traditional presentation in terms of distinct sets of basis vectors.  Consider two sets of basis vectors U = {U1, U2, U3} and V = {V1, V2, V3}.  The key to the transformation is the specification of one set of basis vectors in terms of the other basis set.  The representation of a set of basis vectors in terms of itself is quite simple.  In terms of the basis vectors U1, U2, and U3, we have U1 = (1, 0, 0), U2 = (0, 1, 0), and U3 = (0, 0, 1).  Then the set of basis vectors (note the set notation) is given by
U = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}.

Suppose that the basis vector set U is to be expressed in terms of another basis vector set V.  The conventional way to do this is to use matrix multiplication.  Consider two sets of basis vectors in three dimensions: U = {U1, U2, U3} and V = {V1, V2, V3}.  Given the representation of a vector in the basis set V, we want to find it in the basis set U.  We begin with the representation of the basis vectors in the set U in terms of those in V.

U1 = m11·V1 + m12·V2 + m13·V3

U2 = m21·V1 + m22·V2 + m23·V3

U3 = m31·V1 + m32·V2 + m33·V3.

Recalling that in the V coordinate set V1 = (1, 0, 0), V2 = (0, 1, 0), and V3 = (0, 0, 1), this is usually written in the following way.

U1 = m11·(1, 0, 0) + m12·(0, 1, 0) + m13·(0, 0, 1) = (m11, m12, m13)

U2 = m21·(1, 0, 0) + m22·(0, 1, 0) + m23·(0, 0, 1) = (m21, m22, m23)

U3 = m31·(1, 0, 0) + m32·(0, 1, 0) + m33·(0, 0, 1) = (m31, m32, m33).

In three dimensions, this is written compactly using a (3 by 3)-matrix M whose rows hold the coefficients above:

M = | m11  m12  m13 |
    | m21  m22  m23 |
    | m31  m32  m33 |

To give a specific example, we consider the coordinate transformation mentioned earlier.

Since this is a two-dimensional transformation, it is represented by a (2 by 2) matrix.

We now investigate the relationship of this matrix to the matrix M discussed above.  To do this we must investigate the representation of the basis vectors.  Let ex and ey be the basis vectors in the (X, Y) system and eu and ev be the basis vectors in the (U, V) system.  In the (X, Y)-coordinate system, ex = (1, 0) and ey = (0, 1).  In the (U, V) system we have

The above shows the (U, V)-coordinates of the (X, Y) basis vectors.  Put another way:

We may also want to see the (X, Y)-coordinates of the (U, V) basis vectors.  It is easy to show that

When we discuss rotations, the method for inverting these transformations will be obvious.

Transformation of a General Vector

We now address the question of conversion of a vector from one coordinate system to another.  In this case, we assume that the vector W is defined in the V coordinate system and want to find its representation in the U coordinate system.

Given that W = a1·V1 + a2·V2 + a3·V3, or W = (a1,a2,a3) in the V-system, we know that the vector has a representation in the U-system W = b1·U1 + b2·U2 + b3·U3.  The task is to compute the coefficients b1, b2, and b3 given a1, a2, and a3, and the matrix M.

Just a bit ago, we showed the matrix M as representing the U basis vectors in the V basis.

We now note that the two representations of W, W = a1·V1 + a2·V2 + a3·V3 and
W = b1·U1 + b2·U2 + b3·U3, can be written in matrix notation as W = aᵀ·v and W = bᵀ·u, where a and b are the column vectors of coefficients and u and v are columns holding the basis vectors, with u = M·v.

Putting all of this together, we get the matrix equation bᵀ·M·v = aᵀ·v, so bᵀ·M = aᵀ, or a = Mᵀ·b.

We want the vector b in terms of vector a, so b = (Mᵀ)⁻¹·a.

We now apply all this algebra to a simple example.

Consider the following example, with the U basis set expressed in terms of the V-set:

U1 = (1, 1, 0)    =    1·V1 + 1·V2 + 0·V3

U2 = (1, −1, 1)   =    1·V1 − 1·V2 + 1·V3

U3 = (1, 1, 2)    =    1·V1 + 1·V2 + 2·V3

The (3 by 3)-matrix describing this is shown below

M = | 1   1  0 |
    | 1  −1  1 |
    | 1   1  2 |

Taking M⁻¹, the inverse of matrix M, allows us to express the V-set in terms of the U-set, just as M expresses the U-set in terms of the V-set.

Our goal of converting a vector representation from the V-basis set to the U-basis set requires that we compute the inverse of the transpose of M; that is, (Mᵀ)⁻¹.

Suppose that W = 1·V1 + 1·V2 + 2·V3.  What is the representation of W in the U-set?

Thus W = U3, which should not be a surprise, as we used the equation for U3 to define W.

As an aside, we note that the length of W in the V system is (1 + 1 + 4)^(1/2) = 6^(1/2), or approximately 2.45, while the length of W in the U system is obviously 1, so the transformation above is not length preserving.  To make matters worse, consider the vector
X = 1·V1 + 1·V2 + 0·V3, which is U1 in the U system.  This vector's length in the V system is 2^(1/2), or approximately 1.41, while again it is 1 in the U system.  Not only does this transformation fail to preserve lengths, but it also distorts objects.  Next time you go to a fair and visit a fun house, think of distorting image transformations as you stand in front of one of the funny mirrors.
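
The example can be verified without writing out (Mᵀ)⁻¹ explicitly: solving Mᵀ·b = a produces the same b.  A Python sketch of that check (function name mine):

```python
def solve3(A, rhs):
    """Solve the (3 by 3) system A x = rhs by Gauss-Jordan elimination."""
    C = [row[:] + [r] for row, r in zip(A, rhs)]   # augmented matrix (A | rhs)
    for col in range(3):
        best = max(range(col, 3), key=lambda r: abs(C[r][col]))
        C[col], C[best] = C[best], C[col]          # swap in a nonzero pivot
        p = C[col][col]
        C[col] = [x / p for x in C[col]]           # drive the pivot to 1
        for r in range(3):
            if r != col:
                f = C[r][col]
                C[r] = [x - f * y for x, y in zip(C[r], C[col])]
    return [C[i][3] for i in range(3)]

# Rows of M are the V-coordinates of U1, U2, U3 from the example above.
M = [[1.0, 1.0, 0.0], [1.0, -1.0, 1.0], [1.0, 1.0, 2.0]]
Mt = [[M[j][i] for j in range(3)] for i in range(3)]

a = [1.0, 1.0, 2.0]    # W in the V-system
b = solve3(Mt, a)      # W in the U-system: approximately (0, 0, 1), i.e. U3
```

The sum of squares of a is 6 while that of b is 1, exhibiting numerically the failure to preserve lengths noted above.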

To appreciate the difference between length preservation and non-distortion, consider the following change of basis vectors.

U1 = 2·V1

U2 = 2·V2        or M = 2·I3, with M⁻¹ = 0.5·I3.

U3 = 2·V3

Each of the U basis vectors is twice the length of a V basis vector, but the U basis vectors are all of the same length in the V-system, so an object will appear as half-sized in the U-system, but it will not appear distorted.

Another Example of Change of Representation

The discussion just above is terse and accurate, but may leave too much to the imagination.  In order to get a better idea of what is happening, we discuss the book's example from section 4.3.2, but give an entirely different presentation.

Suppose we have two basis sets for a vector space: {U1, U2, U3} and {V1, V2, V3}, related by
U1 = V1
U2 = V1 + V2
U3 = V1 + V2 + V3.

There are two questions one might ask.

1) Given a vector represented in the U-system, represent it in the V-system.
2) Given a vector represented in the V-system, represent it in the U-system.

Note that the book addresses only the second question.

1)   Suppose that W is represented in the U-system as W = a1·U1 + a2·U2 + a3·U3.  The representation in the V-system is given quite simply based on the above equations.  We start by making simple substitutions for U1, U2, and U3

W = a1·(V1) + a2·(V1 + V2) + a3·(V1 + V2 + V3)
  = (a1 + a2 + a3)·V1 + (a2 + a3)·V2 + a3·V3,

so we have the representation of W in the V-system as W = (a1 + a2 + a3, a2 + a3, a3).

2)   We now consider the problem discussed in section 4.3.2 of the textbook.  Suppose that W is represented in the V-system as W = b1·V1 + b2·V2 + b3·V3.  The formal way to approach this problem is to obtain the inverse of the transformation matrix, as was done by the text.  We shall consider a simpler approach.  Consider the transformation as a set of equations with the unknown variables being V1, V2, and V3.

V1 = U1
V1 + V2 = U2
V1 + V2 + V3 = U3

We solve for V1, V2, and V3 in terms of U1, U2, and U3.

To solve for V1, we merely copy the first equation.  To solve for V2, we substitute the first equation into the second to get U1 + V2 = U2, or V2 = U2 - U1.  To solve for V3, we substitute the second equation into the third equation (V1 + V2) + V3 = U3, or U2 + V3 = U3, to get
V3 = U3 - U2.  Thus we have the equations

V1 = U1

V2 = U2 - U1

V3 = U3 - U2

If given W in the V-system as W = b1·V1 + b2·V2 + b3·V3, then we just substitute again
W = b1·U1 + b2·(U2 - U1) + b3·(U3 - U2)
  = (b1 - b2)·U1 + (b2 - b3)·U2 + b3·U3,

so W in the U-system is represented as
W = (b1 - b2, b2 - b3, b3).

Example: Suppose W = 1·V1 + 2·V2 + 3·V3.  Then we have in the U-system
W = (1 - 2)·U1 + (2 - 3)·U2 + 3·U3 = -1·U1 - 1·U2 + 3·U3, as in the textbook.

In terms used by the book, we have u = M·v, with

M = [1  0  0]
    [1  1  0]
    [1  1  1]

It follows then that v = M^(-1)·u, with

M^(-1) = [ 1   0  0]
         [-1   1  0]
         [ 0  -1  1]

Now W = a1·V1 + a2·V2 + a3·V3 = a^T·v, where a = [a1, a2, a3]^T and v = [V1, V2, V3]^T.  (Note that the book's a holds the V-system coefficients here, the reverse of the roles that a and b played in parts 1 and 2 above.)

In the U-system, W is represented as W = b1·U1 + b2·U2 + b3·U3 or

W = b^T·u, where b = [b1, b2, b3]^T and u = [U1, U2, U3]^T.  From here, we proceed as in the derivation above, showing that W = b^T·u = b^T·M·v = a^T·v, so b^T·M = a^T and a = M^T·b, as before.
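The matrix form can be checked the same way: with M built from the equations relating the two basis sets, a = M^T·b should reproduce the substitution result.  A small Python sketch using plain lists:

```python
# a = M^T . b for the basis change U1 = V1, U2 = V1 + V2, U3 = V1 + V2 + V3.
# Row i of M holds the V-coefficients of Ui.
M = [[1, 0, 0],
     [1, 1, 0],
     [1, 1, 1]]

def mtv(M, b):
    """Return M^T . b for a 3x3 matrix M and a 3-vector b."""
    return [sum(M[i][j] * b[i] for i in range(3)) for j in range(3)]

b = [-1, -1, 3]     # W in the U-system, from the worked example
print(mtv(M, b))    # [1, 2, 3]: W back in the V-system
```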

Homogeneous Coordinates

The concept of homogeneous coordinates seems to have arisen from a number of considerations, including a desire to have a uniform notation that differentiated points from vectors and a desire to speed up the computation of general coordinate transformations.

In the standard 3-D system, both points and vectors are represented as triples.  In our notes, we use different fonts to designate the two, P = (1, 2, 3)^T and V = (1, 2, 3)^T, and generally use different letters (Ps and Qs for points; Ws, Xs, Ys, and Zs for vectors), but this convention does not convey any information to the computer.

A more general coordinate transformation can be viewed as a rotation and a translation, composed in either order; rotate then translate or translate then rotate.  In matrix terms,

U = M·V + X
U = N·(V + X), where U, V, and X are column vectors.

Note that, in this formulation, M and N are both 3-by-3 matrices.  We could arrive at a formula for deriving the matrix N in terms of M or vice-versa, but that is not the focus of this discussion.  The point is that in the above system, we need both a matrix addition and a matrix multiplication to handle a combination rotation and translation.  It would be more efficient to develop a system in which a single matrix operation would suffice.

We move to the idea made possible by the use of homogeneous coordinates: that a single matrix multiplication can be made to represent a general coordinate transformation.  We first consider only rotations and translations and then consider more general transformations.

Homogeneous coordinates are based on the use of four-dimensional entities to represent three-dimensional points and vectors.  We shall discover a number of uses for the additional dimension, but should always remember that it is a trick to facilitate computation.

In homogeneous coordinates, three-dimensional points and vectors are represented by four-dimensional arrays.  The basis of this four-dimensional representation is the definition of a three-dimensional frame:

three basis vectors, usually denoted (1, 0, 0), (0, 1, 0), and (0, 0, 1), and
a unique reference point P0, often called the origin of the coordinate system.

In such a system, a vector can be represented by three components

W = a1·V1 + a2·V2 + a3·V3

But a point needs both a reference point (the origin) and a vector

P = P0 + X = P0 + b1·V1 + b2·V2 + b3·V3

Homogeneous coordinates use a four-dimensional notation to reflect this difference.  Homogeneous coordinates have as an expanded basis set the following hybrid four-dimensional array, which I call B4 for the basis of a 4-D system.

B4 = [V1]
     [V2]
     [V3]
     [P0]

In the pure mathematical sense, this is a heterogeneous array.  The first three elements in the array are vectors, while the last element in the array is a point.  It is enough for us that points and vectors share a common representation.

Vectors are represented by a 4-D array with the last element 0.

Points are represented by a 4-D array with the last element 1.

Consider the vector, represented in standard three-dimensional form as

W = a1·V1 + a2·V2 + a3·V3 = [a1, a2, a3]^T.

In the four-dimensional homogeneous coordinate system, the vector is represented as

W = [a1, a2, a3, 0]^T, so that

W^T·B4 = a1·V1 + a2·V2 + a3·V3 + 0·P0 = a1·V1 + a2·V2 + a3·V3.

Consider the point, represented in standard three-dimensional form as

P = P0 + X = P0 + b1·V1 + b2·V2 + b3·V3, or loosely as (b1, b2, b3).

In the four-dimensional homogeneous coordinate system, the point is represented as

P = [b1, b2, b3, 1]^T, so that

P^T·B4 = b1·V1 + b2·V2 + b3·V3 + 1·P0 = b1·V1 + b2·V2 + b3·V3 + P0.

The big payoff comes in the ability to represent a translation of coordinate systems by a matrix.  Suppose we have a coordinate system with origin P0.  A translated coordinate system is one with a new origin Q0, specified by a vector offset from P0.

Consider two frames (V1, V2, V3, P0) and (U1, U2, U3, Q0) and the transformation from the
V-system to the U-system.

U1 = M11·V1 + M12·V2 + M13·V3

U2 = M21·V1 + M22·V2 + M23·V3

U3 = M31·V1 + M32·V2 + M33·V3

Q0 = M41·V1 + M42·V2 + M43·V3 +P0

Note that Q0, the origin of the new system, is displaced from P0, the origin of the old system, by the vector M41·V1 + M42·V2 + M43·V3.  In homogeneous coordinates, this transformation is represented by the matrix

M = [M11  M12  M13  0]
    [M21  M22  M23  0]
    [M31  M32  M33  0]
    [M41  M42  M43  1]

with transpose

M^T = [M11  M21  M31  M41]
      [M12  M22  M32  M42]
      [M13  M23  M33  M43]
      [  0    0    0    1]

We use the transpose M^T in both point and vector transformations.  Vectors transform as

w' = M^T·w, where w = [a1, a2, a3, 0]^T

Points transform as

p' = M^T·p, where p = [b1, b2, b3, 1]^T

We immediately notice one important property of these transformations: vectors transform into vectors (0 in the fourth row) and points transform into points (1 in the fourth row).

Also [a1, a2, a3, 0]^T + [b1, b2, b3, 1]^T = [a1 + b1, a2 + b2, a3 + b3, 1]^T, so a vector added to a point is a point.
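The fourth-component bookkeeping can be demonstrated directly.  A short Python sketch with arbitrary sample coordinates:

```python
# Homogeneous coordinates: vectors end in 0, points end in 1.

def add4(u, v):
    """Componentwise sum of two homogeneous 4-tuples."""
    return [u[i] + v[i] for i in range(4)]

w = [2, 0, 1, 0]     # a vector
p = [5, 5, 5, 1]     # a point

print(add4(p, w))    # [7, 5, 6, 1]: point + vector is a point
print(add4(w, w))    # [4, 0, 2, 0]: vector + vector is a vector
```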

We now consider some special coordinate transformations.  In our first example, we pick

M = [1  0  0  0]
    [0  1  0  0]
    [0  0  1  0]
    [3  4  5  1]

so

M^T = [1  0  0  3]
      [0  1  0  4]
      [0  0  1  5]
      [0  0  0  1]

We first transform the point P = [x, y, z, 1]^T to identify this transformation.

M^T·[x, y, z, 1]^T = [x + 3, y + 4, z + 5, 1]^T, which is a translation.

Note that this corresponds to translating the origin by (3, 4, 5), as the values of each component are increased.  We now consider the effect of such a transformation on a vector.

M^T·[x, y, z, 0]^T = [x, y, z, 0]^T

Note here that the components of the vector are invariant under the transformation; that is, they do not change.  The reason is that a vector is just a direction in space and a magnitude.  Thus vectors are not changed by merely moving the origin of coordinates.
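A quick Python check of the translation example, using M^T with the offset (3, 4, 5) in its last column:

```python
# M^T for the translation example: offset (3, 4, 5) in the last column.
MT = [[1, 0, 0, 3],
      [0, 1, 0, 4],
      [0, 0, 1, 5],
      [0, 0, 0, 1]]

def matvec(M, v):
    """Multiply a 4x4 matrix by a homogeneous 4-vector."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

print(matvec(MT, [1, 1, 1, 1]))   # [4, 5, 6, 1]: the point moves
print(matvec(MT, [1, 1, 1, 0]))   # [1, 1, 1, 0]: the vector is unchanged
```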

Finally, we consider a pure rotation applied to a point.  For a pure rotation, the fourth row of M holds no translation (M41 = M42 = M43 = 0), so the fourth component of the point remains 1 and the first three components are simply the rotated coordinates.

Considering a pure rotation applied to a vector would give similar results.  In each case the pure rotation in homogeneous coordinates strongly resembles the corresponding rotation in the three-dimensional model.

Affine Transformations

We are developing the theory of coordinate transformations because such transformations are essential to the understanding of graphics systems.  At this point, we consider what type of transformations are required for work with graphics systems.

It turns out that not all transformations can be used for graphics and, fortunately, that the subset of transformations that can be used is the simpler subset.  This is the subset of
affine transformations.

To understand the idea of an affine transformation, consider the related concept of linear function.  A function f( ) is linear if and only if
f(a·x + b·y) = a·f(x) + b·f(y)           for all scalars a and b
and all variables x and y.

Consider the function f(x) = x^2.  This is not a linear function; specifically,

f(a·x + b·y) = (a·x + b·y)^2 = a^2·x^2 + b^2·y^2 + 2·a·b·x·y
             = a^2·f(x) + b^2·f(y) + 2·a·b·x·y,

which is not equal to a·f(x) + b·f(y) in general.

There is another property of linear functions, shared by affine transformations.  In terms of linear functions of one variable, we have the following interesting property

f(a·x + (1  a)·y) = a·f(x) + (1  a)·f(y).

This generalizes to three dimensional points as
f(a·p + (1  a)·q) = a·f(p) + (1  a)·f(q).

Recall that the set a·p + (1 - a)·q, for 0 ≤ a ≤ 1, is the set of all points on the line segment between the two points p and q.  Note that this maps onto the set a·f(p) + (1 - a)·f(q), which is the set of all points on the line segment between the two points f(p) and f(q).

This observation would be an interesting curiosity were it not for an immediate consequence.  To apply a transformation to the line from point p to point q, we use two steps.
1)         Compute the two points f(p) and f(q)

2)         Connect these two points f(p) and f(q) with a straight line.

The straight line between f(p) and f(q) is the image of the straight line between p and q.
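This is easy to see numerically.  The sketch below uses an arbitrary affine map of the plane (my own choice of coefficients, purely for illustration) and checks that interpolating between p and q and then applying f gives the same point as applying f first and interpolating after:

```python
def f(p):
    """An arbitrary affine map of the plane: a linear part plus an offset."""
    x, y = p
    return (2 * x - y + 1, x + 3 * y - 2)

def lerp(p, q, a):
    """The point a*p + (1 - a)*q on the segment between p and q."""
    return tuple(a * pi + (1 - a) * qi for pi, qi in zip(p, q))

p, q, a = (1.0, 2.0), (4.0, -1.0), 0.25
print(f(lerp(p, q, a)) == lerp(f(p), f(q), a))   # True
```

The values chosen here are powers-of-two fractions, so the floating-point arithmetic on both sides is exact and the equality test is safe.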

One property of transformations that is worth noting is the number of degrees of freedom.  Basically, this is the number of parameters that can be set independently.  Consider the function of three variables f(x, y, z) = a1·x + a2·y + a3·z.  Suppose that the scalars a1, a2, and a3 are independent.  In that case the transformation has three degrees of freedom, because each of these three scalars can be set arbitrarily.

If, on the other hand, we have the constraint that a1 + a2 + a3 = 1, then only two of the three scalars can be set arbitrarily and the third value is then determined.  In this case the transformation has only two degrees of freedom.  The book discusses the degrees of freedom of transformation at some length, but it is not likely that this course will make much use of the concept.

Rotation, Translation, and Scaling

We have studied the use of homogeneous coordinates and the related 4-by-4 matrices to represent coordinate transformations.  We have mentioned that such matrices could be used to represent transformations that are quite general, but hinted that we would use simpler methods to construct these matrices.

We shall view many of these general coordinate transformations as being constructed from three simpler affine transformations: rotation, translation, and scaling.  We shall then discuss the OpenGL practice of building the general transformation matrix as a product of the matrices representing the simpler transformations.

We begin with the translation method of coordinate transformation.  Under coordinate translations, points are transformed but vectors are not.  Consider a translation of the coordinate system by the displacement vector d = (ax, ay, az)^T.  In the homogeneous coordinate system, this is represented by the 4-by-4 matrix

T = [1  0  0  ax]
    [0  1  0  ay]
    [0  0  1  az]
    [0  0  0   1]

Points [x, y, z, 1]^T translate as follows:

T·[x, y, z, 1]^T = [x + ax, y + ay, z + az, 1]^T

This translation is sometimes written as T(ax, ay, az).

We obtain the inverse of the translation by noting that if T corresponds to a translation by vector d = (ax, ay, az)^T, then the inverse transformation corresponds to a translation by the vector -d = (-ax, -ay, -az)^T.  Based on this we have the following observation

T^(-1)(ax, ay, az) = T(-ax, -ay, -az), represented by the matrix

[1  0  0  -ax]
[0  1  0  -ay]
[0  0  1  -az]
[0  0  0    1]

It is easily shown that vectors are not changed by this translation.
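The claim that T^(-1)(ax, ay, az) = T(-ax, -ay, -az) can be verified by multiplying the two matrices, as this Python sketch does:

```python
def T(ax, ay, az):
    """Homogeneous translation matrix."""
    return [[1, 0, 0, ax],
            [0, 1, 0, ay],
            [0, 0, 1, az],
            [0, 0, 0, 1]]

def matmul(A, B):
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

I = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
print(matmul(T(3, 4, 5), T(-3, -4, -5)) == I)   # True
```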

# Scaling

The second affine transformation to be considered is scaling.  There are two types of scaling: uniform and non-uniform scaling.  In uniform scaling, each coordinate is multiplied by the same scale factor, denoted b.

For |b| > 1, an object gets larger

For 0 < |b| < 1, an object gets smaller.

Values of b < 0 correspond to reflection.

Uniform scaling is represented by the matrix

S(b) = [b  0  0  0]
       [0  b  0  0]
       [0  0  b  0]
       [0  0  0  1]

Consider the effect of uniform scaling on both vectors and points.

For vector w:  S(b)·[a1, a2, a3, 0]^T = [b·a1, b·a2, b·a3, 0]^T

For point P:  S(b)·[x, y, z, 1]^T = [b·x, b·y, b·z, 1]^T

Non-uniform scaling is a transformation with a different scaling factor for each axis.  Uniform scaling preserves shapes; non-uniform scaling does not.  Non-uniform scaling is represented by

S(bx, by, bz) = [bx   0   0  0]
                [ 0  by   0  0]
                [ 0   0  bz  0]
                [ 0   0   0  1]

It is easily shown that both vectors and points transform in a manner similar to uniform scaling.  We obtain the inverse of the scaling matrix by inverting each of the scaling factors; thus

S^(-1)(bx, by, bz) = S(1/bx, 1/by, 1/bz) = [1/bx     0     0  0]
                                           [   0  1/by     0  0]
                                           [   0     0  1/bz  0]
                                           [   0     0     0  1]
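As a check, multiplying S(bx, by, bz) by S(1/bx, 1/by, 1/bz) yields the identity.  The sketch below uses power-of-two factors so the floating-point products are exact:

```python
def S(bx, by, bz):
    """Homogeneous (non-uniform) scaling matrix."""
    return [[bx, 0, 0, 0],
            [0, by, 0, 0],
            [0, 0, bz, 0],
            [0, 0, 0, 1]]

def matmul(A, B):
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

P = matmul(S(2.0, 4.0, 8.0), S(0.5, 0.25, 0.125))
print(all(P[i][j] == (1 if i == j else 0)
          for i in range(4) for j in range(4)))   # True
```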

# Question: Do Scaling and Translation Commute?

Suppose we want to do a coordinate transformation involving scaling and translation.  Does it matter if we scale first and then translate, or translate first and then scale?  In mathematical terms, we are asking if the two matrices commute; i.e.,
does S(bx, by, bz)·T(ax, ay, az) = T(ax, ay, az)·S(bx, by, bz)?

We first compute the matrix S(bx, by, bz)·T(ax, ay, az).

S(bx, by, bz)·T(ax, ay, az) = [bx   0   0  bx·ax]
                              [ 0  by   0  by·ay]
                              [ 0   0  bz  bz·az]
                              [ 0   0   0      1]

T(ax, ay, az)·S(bx, by, bz) = [bx   0   0  ax]
                              [ 0  by   0  ay]
                              [ 0   0  bz  az]
                              [ 0   0   0   1]

The mathematics clearly indicates that the coordinate transformations do not commute.  It is instructive to observe the reason for failure to commute.

Consider the first product S(bx, by, bz)·T(ax, ay, az).  Because the translation is done first, followed by the scaling, we note that the translation vector (ax, ay, az) is also scaled, so we have translation by the vector (bx·ax, by·ay, bz·az).  The second product corresponds to translation of the already-scaled figure by the vector (ax, ay, az).
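The failure to commute is visible in the last column of the two products, which holds the net translation.  A Python sketch with sample values:

```python
def T(ax, ay, az):
    """Homogeneous translation matrix."""
    return [[1, 0, 0, ax], [0, 1, 0, ay], [0, 0, 1, az], [0, 0, 0, 1]]

def S(bx, by, bz):
    """Homogeneous scaling matrix."""
    return [[bx, 0, 0, 0], [0, by, 0, 0], [0, 0, bz, 0], [0, 0, 0, 1]]

def matmul(A, B):
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

ST = matmul(S(2, 2, 2), T(3, 4, 5))   # translate, then scale
TS = matmul(T(3, 4, 5), S(2, 2, 2))   # scale, then translate
print([row[3] for row in ST])         # [6, 8, 10, 1]: the offset was scaled
print([row[3] for row in TS])         # [3, 4, 5, 1]: the offset is unchanged
```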

# Rotation

The generalized rotation operator is specified in terms of an angle of rotation and a vector (possibly a unit vector) serving as the axis of rotation.  The operator is represented as
R(θ, Vx, Vy, Vz), where V = (Vx, Vy, Vz) is the axis of rotation.  We first examine a number of simple rotations, specifically

R(θx, 1, 0, 0)         rotation about the X-axis,

R(θy, 0, 1, 0)         rotation about the Y-axis, and

R(θz, 0, 0, 1)         rotation about the Z-axis.

The first case we consider is R(θz, 0, 0, 1), also known as rotation in the X-Y plane about the origin.  We shall examine this rotation and then make some statements applicable to all three of these basic rotations.  After considering the basic rotations, we consider the more general rotations and show that these can be represented as the concatenation of the basic transformations.

Let Rz(θz) denote rotation by angle θz about the Z-axis.  It is easily shown that the 4-by-4 matrix representing this rotation is given by

Rz(θz) = [cos θz  -sin θz  0  0]
         [sin θz   cos θz  0  0]
         [      0        0  1  0]
         [      0        0  0  1]

Similar matrices represent Rx(θx) and Ry(θy).

Let R(θ) represent any of the three rotations Rx(θx), Ry(θy), or Rz(θz).  Noting that a rotation by θ can always be undone by a rotation by -θ, we have the immediate result that R^(-1)(θ) = R(-θ).  Applying this logic to Rz(θz), and using cos(-θ) = cos θ and sin(-θ) = -sin θ, we see that

Rz^(-1)(θz) = Rz(-θz) = [ cos θz  sin θz  0  0]
                        [-sin θz  cos θz  0  0]
                        [      0       0  1  0]
                        [      0       0  0  1]

From this we see that Rz^(-1)(θz) = Rz^T(θz), a property shared by the other rotation matrices.  A matrix whose inverse equals its transpose is called an orthogonal matrix.
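Orthogonality is easy to confirm numerically: Rz(θ) times its transpose is the identity, up to floating-point roundoff.  A Python sketch with an arbitrary angle:

```python
import math

def Rz(t):
    """Homogeneous rotation about the Z-axis by angle t (radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0, 0],
            [s,  c, 0, 0],
            [0,  0, 1, 0],
            [0,  0, 0, 1]]

def matmul(A, B):
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]

P = matmul(Rz(0.7), transpose(Rz(0.7)))
print(all(abs(P[i][j] - (1 if i == j else 0)) < 1e-12
          for i in range(4) for j in range(4)))   # True
```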

Question: Do rotations commute?  To answer this question, we consider a specific case involving rotations about the X-axis and the Z-axis.  If this pair fails to commute, we have shown that, in general, rotations fail to commute.  Consider two transforms Rx(φ) and Rz(θ).

We first consider the product Rx(φ)·Rz(θ)

Rx(φ)·Rz(θ) = [      cos θ        -sin θ       0  0]
              [cos φ·sin θ   cos φ·cos θ  -sin φ  0]
              [sin φ·sin θ   sin φ·cos θ   cos φ  0]
              [          0             0       0  1]

We now consider the product Rz(θ)·Rx(φ)

Rz(θ)·Rx(φ) = [cos θ  -sin θ·cos φ   sin θ·sin φ  0]
              [sin θ   cos θ·cos φ  -cos θ·sin φ  0]
              [    0         sin φ         cos φ  0]
              [    0             0             0  1]

The two products are not equal, so we conclude that the rotations do not commute.
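The same machinery confirms the conclusion for a particular pair of angles (0.5 and 0.9 radians, chosen arbitrarily):

```python
import math

def Rx(t):
    """Homogeneous rotation about the X-axis by angle t (radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def Rz(t):
    """Homogeneous rotation about the Z-axis by angle t (radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(A, B):
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

print(matmul(Rx(0.5), Rz(0.9)) == matmul(Rz(0.9), Rx(0.5)))   # False
```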

Concatenation of Transformations

We have now developed matrix representations of the basic transformation types.  We next consider examples of affine transformations formed by concatenating sequences of these basic transformations.

Suppose we apply a sequence of three transformations to a point p, the transformations being represented by matrices A, B, and C.  The transformations are applied in this order: A first, then B, and finally C.  We can write the sequence of transformations as follows:

p2 = Ap
p3 = Bp2
q = Cp3

This can be written as a concatenation of transformations q = C(B(A(p))), represented by the matrix product q = C·B·A·p, or q = M·p where M = C·B·A.  It is important to remember that the order of transformations represented by a product of matrices is best read right to left.
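The right-to-left reading can be verified by comparing the single product matrix with step-by-step application.  The sketch below uses translation and scaling matrices as the three transformations; the specific values are arbitrary:

```python
def T(ax, ay, az):
    """Homogeneous translation matrix."""
    return [[1, 0, 0, ax], [0, 1, 0, ay], [0, 0, 1, az], [0, 0, 0, 1]]

def S(b):
    """Homogeneous uniform scaling matrix."""
    return [[b, 0, 0, 0], [0, b, 0, 0], [0, 0, b, 0], [0, 0, 0, 1]]

def matmul(A, B):
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matvec(M, v):
    """Multiply a 4x4 matrix by a homogeneous 4-vector."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

A, B, C = T(1, 0, 0), S(2), T(0, 3, 0)
p = [1, 1, 1, 1]
M = matmul(C, matmul(B, A))                     # M = C.B.A
stepwise = matvec(C, matvec(B, matvec(A, p)))   # A first, then B, then C
print(matvec(M, p) == stepwise)                 # True
```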

The book shows that a generalized rotation about an arbitrary vector can be represented as the concatenation of basic translation and rotations.  We shall not pursue this in detail, but should be aware of it.

# The Instance Transformation

Most graphical systems include primitives for drawing common shapes such as ellipses and rectangles.  Many applications of graphics require a collection of composite shapes that can be scaled, rotated, and translated.  We consider a number of such shapes used in the study of digital logic.  In this illustration, we use the symbols for the two-input AND gate and the two-input OR gate.

Each of these symbols would have a representation stored in some standard form.  Each of the two symbols would be represented at the origin of the coordinate system, in a standard orientation, and at a standard size.  Consider the AND gate, which is the symbol drawn on the left in the above illustration.  To draw this gate with a different size and orientation at a different location, we apply the transformation

M = T(v)·R(θ)·S(b)

to the symbol, thus creating the instance at the correct location.
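A Python sketch of the instance transformation: scale a master-symbol point by 2, rotate it 90 degrees about the Z-axis, then translate it to (10, 20, 0).  The particular numbers are illustrative only, not tied to any real gate symbol.

```python
import math

def T(ax, ay, az):
    """Homogeneous translation matrix."""
    return [[1, 0, 0, ax], [0, 1, 0, ay], [0, 0, 1, az], [0, 0, 0, 1]]

def S(b):
    """Homogeneous uniform scaling matrix."""
    return [[b, 0, 0, 0], [0, b, 0, 0], [0, 0, b, 0], [0, 0, 0, 1]]

def Rz(t):
    """Homogeneous rotation about the Z-axis by angle t (radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(A, B):
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matvec(M, v):
    """Multiply a 4x4 matrix by a homogeneous 4-vector."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

# M = T(v) . R(theta) . S(b): scale first, then rotate, then translate.
M = matmul(T(10, 20, 0), matmul(Rz(math.pi / 2), S(2)))
p = [1, 0, 0, 1]                             # a point of the master symbol
print([round(c, 6) for c in matvec(M, p)])   # [10.0, 22.0, 0, 1]
```

The point (1, 0, 0) is scaled to (2, 0, 0), rotated onto the Y-axis at (0, 2, 0), and finally placed at (10, 22, 0), confirming the right-to-left order of the product.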