Chapter 4: Geometric Objects and
Transformations
In this chapter, we slow down and study some of the mathematics associated with computer graphics. The basic operation is called transformation of coordinates, basically equivalent to changing the viewing point when displaying a graphic object.
We begin this chapter by getting rid of a few terms. In three-dimensional graphics, we are working in an Euclidean affine vector space. Were this a math class, we would spend time giving a formal definition of a vector space and then give the additional requirements for it to be both Euclidean and affine. This being a computer science class, we move right to the punch line and specify the space in terms that are quite familiar.
An affine vector space is a
mathematical construct with three types of objects
Scalars (think real numbers)
Points (in three dimensions, think X, Y, Z)
Vectors (think a line connecting two points)
A vector space is Euclidean if it supports a measure of distance and, by extension, an idea of trigonometry. Our vector space will use the standard ideas of distance, often called the hypotenuse rule, and, by extension, the standard trigonometry.
Theoreticians consider it very important to distinguish between vectors and points as classes of objects in our vector space. For now we shall be content to repeat this point and discuss why few students bother to differentiate the two constructs. When needed, we may repeat the statement and claim that some objects are points and not vectors, etc.
When many of us bother to think about such things, we tend to imagine a three-dimensional space as a system with an origin; i.e., a distinguished point with coordinates (0, 0, 0). In such a coordinate system, the correspondence between points and vectors is so natural that it is hard to remember that the two represent distinct concepts. Consider the figure below.

We are looking at two views of
the (Y, Z)-plane in a three dimensional (X, Y, Z) coordinate system. On the left, we see the point P = (0, Y, Z)
and on the right we see the vector from the origin to the point P; the vector
is called V = (0, Y, Z). Note that for
every point
P = (X, Y, Z) in the three-dimensional space there exists a vector V = (X, Y,
Z) connecting the origin of the coordinate space to the point P. This isomorphism
of the two object types can cause some confusion. When we need to, we shall clarify the
differences.
When considering a vector space with three distinct object types, we should first investigate the types of operations allowed on those types. We begin with scalars, the most familiar type all we are talking about here is real numbers.
Scalars
We are all familiar with a number of binary operations on real numbers: addition, subtraction, multiplication, division, and quite a few others. We focus on addition, subtraction, and multiplication. In real numbers, subtraction is the inverse of addition.
Vectors
We first consider operations
between two vectors and then consider operations involving one vector and either
a scalar or a point. Vectors may be
added; thus for two vectors V1 and V2, the sum (V1
+ V2) is defined. As vector
addition is well defined, it follows immediately that vector subtraction is
also defined. For V1 = (X1,
Y1, Z1) and V2 = (X2, Y2,
Z2), we have
V1 + V2
= (X1 + X2, Y1 + Y2, Z1
+ Z2)
V1 V2
= (X1 X2, Y1 Y2, Z1
Z2).
In particular, V1 + V1 = (2·X1, 2·Y1, 2·Z1). We generalize from this observation to note that multiplication of a vector by a scalar is an allowed operation, so for V = (X, Y, Z) and scalar a we have a·V = (a·X, a·Y, a·Z). This equality holds in the following special cases
a
= 0 0·V = (0, 0, 0)
a = 1 1·V = (X, Y, Z)
a = 1 1·V = ( X, Y, Z).
We originally defined a vector
as a difference between two points. For
that reason, addition of a vector to a point is a defined operation. Consider point P = (X1, Y1,
Z1) and vector
V = (X2, Y2, Z2). The sum is defined as P + V = (X1
+ X2, Y1 + Y2, Z1 + Z2). By extension one might consider subtraction
of a vector from a point P V = P + ( 1·V), although this is
usually considered just another case of addition of a vector to a point.
Note that the result of addition of a vector to a point is another
point.
Points
There are only a few operations
allowed on points.
1) Addition
of a vector to a point, giving another point.
2) Subtraction
of one point from another, giving a vector.
3) Affine
addition of two points, an operation which implicitly involves
the vector connecting the two
points.
In particular, addition of points is not a supported operation and, by extension, neither is multiplication of a point by a scalar. For two points P1 = (X1, Y1, Z1) and P2 = (X2, Y2, Z2), affine addition is defined in terms of the vector V = P2 P1 = (X2 X1, Y2 Y1, Z2 Z1).
For scalar a , we define affine addition as the addition of a vector to a point
Q = P1 + a · V
= P1 + a · (P2 P1) = (1 a)· P1 + a · P2.
For 0.0 £ a £ 1.0, this forms the set of points on the line segment between P1 and P2.
For an example, let us consider the equation in two dimensions Y = a·X + b.
Let
P1 be the X-intercept and P2 be the Y-intercept. The Y-intercept occurs at X = 0, thus at Y =
b, or P2 = (0, b). The
X-intercept occurs when a·X + b = 0, or X =
b/ a, thus P1 = ( b/a, 0).
The line segment connecting the two points is thus (1 a)· P1 + a · P2 = ( b/a · (a 1), a · b), so the line segment connecting the two points should be described by the following equations:
X = b/a · (a 1)
Y = a · b for 0.0 £ a £ 1.0.
To produce equations in the
standard form, we first solve both equations for a.
Equation 1 becomes a = a · X /
b + 1
Equation 2 becomes a = Y
/ b,
thus we have Y / b = a · X / b + 1, or Y = a·X + b, as expected.
As a final remark, consider the
point Q = ( b/a ·
(a
1), a
·
b) for a
= 2.0. This becomes
Q = ( b/a , 2 ·
b) or X = b/a and Y = 2 ·
b. For X = b/a, the linear equation
evaluates to
Y = a·(b/a)
+ b b + b = 2 · b, so the point Q = ( b/a
, 2 ·
b) is on the line described by the linear equation. It is just not on the line segment between
the two points.
Convexity
We have defined the line
segment between points P1 and P2 as the set of all points
Q such that Q = (1 a)· P1 + a · P2, with 0.0 £ a £
1.0. Equivalently, we have given the
definition as Q = a1· P1
+ a2
·
P2, with a1
³
0, a2
³
0, and a1
+ a2
= 1.0. In general, we extend this
concept to N points, P1
PN, by defining
Q = a1· P1
+ a2· P2
+
+ aN· PN,
with the two constraints
a1 ³ 0, a2 ³ 0, , aN ³ 0, and
a1 + a2 + + aN = 1.0.
The set of all points Q so formed is called the convex hull of the point set P1, P2, , PN. There are a number of ways to imagine the convex hull. Let H be a polygon with vertices a subset of the points P1, P2, PN. We pick the vertices of the polygon H so as to make it convex, specifically the line segment between any two of the points in P1, P2, , PN is included entirely in the interior of H. Then H is a convex hull of the point set.
With the informal definition just given, it is not obvious that the convex hull of a point set is uniquely defined. Given the formal definition, the claim that the convex hull is uniquely defined is at least a bit more plausible. It may even be true.
Dot and Cross Products of Vectors.
When we discussed the three elements of an affine vector space, we indicated that there were additional operators on vectors that were not necessary for the formal definitions. We now discuss the two standard vector products and later specify convenient methods for their computation. The two products are the dot product and the cross product. Let U and V be two vectors. The dot product is denoted U · V and the cross product U x V.
The properties of the dot and vector product derive from two important properties of any pair of vectors. Remember that vectors denote only a direction and a magnitude. We can easily imagine any two vectors V1 and V2 as being rooted at the same point P and connecting P to points P1 and P2 respectively, so V1 = P1 P and V2 = P2 P. Imagining the two vectors to be sides of a triangle, we may uniquely define the angle between the two vectors in the obvious way. The second property is that if the vectors are not collinear, then the three points P, P1, and P2 uniquely define a plane. Two vectors V1 and V2 are collinear if there exists a non-zero real number b such that V2 = b · V1.
The dot product of two vectors is a scalar. As we are working in an Euclidean space, we note further that for any vector U, U · U = |U|2 is the square of the magnitude of the vector. For any two vectors U and V, it can be shown that (|U| · |V|) £ U · V £ (|U| · |V|) and that the angle Q between the two vectors is given by cos(Q) = U · V / (|U| · |V|). If U · V = 0, then the two vectors U and V are said to be orthogonal or perpendicular to each other.
The cross product of two vectors U and V is a vector orthogonal to each of U and V; thus, (U x V) · U = 0 and (U x V) · V = 0. The magnitude of the cross product is determined by the sine of the angle between the two vectors. If Q is the angle between the vectors U and V, then |U x V| = |U| · |V| · sin(Q). If U and V are collinear, then |U x V| = 0.
We shall present convenient methods for computing both the dot product and cross product of two vectors after we have specified the vector space in which we shall be working.
Planes
Consider three points P, Q, and R such that the vectors (P Q) and (P R) are not collinear. It is a theorem of plane geometry that three such points uniquely determine a plane. We shall now consider some of the implications of such a statement.
The line segment that connects
P and Q is the set of points of the form
S(a) = a · P +
(1 a)
·
Q, for 0 £
a
£
1.
Given this, we can similarly
describe the line segment connecting S to R by
T(b) = b · S +
(1 b)
·
R, for 0 £
b
£
1.
T may be described in terms of the original points P, Q, and R by the combination
T(a, b) = b · (a · P + (1 a) · Q) + (1 b) · R
= (a · b · P) + (b a · b) · Q + (1 b) · R.
At this point, we take a detour
to note the similarity to the idea of this derivation to the derivation for the
convex hull of three points. Let
a1 = a · b, a2
= (b
a
·
b)
and a3
= (1 b). Then a1 + a2
+ a3
= 1.
Since 0 £ a £ 1 and 0 £ b £ 1, it immediately follows that 0 £ a · b £ 1, so 0 £ a1 £ 1.
Since 0 £ b £ 1, it follows immediately
that 0 £
(1 b)
£
1, so 0 £
a3
£
1. We see also that
0 £
(1 a)
£
1, so 0 £
(b
·
(1 a)
) £
1, so 0 £
a2
£
1, and we have a proper convex hull equation T = a1 · P + a2
·
Q + a3
·
R.
Returning to our original derivation, we have
T(a, b) = (a · b · P) + (b a · b) · Q + (1 b) · R.
= P + (a · b 1 )· P +
(b
a
·
b)
·
Q + (1 b)
·
R
= P + (a · b b 1
+ b)· P +
(b
a
·
b)
·
Q + (1 b)
·
R
= P + (a · b b)· P +
( 1 + b)· P +
(b
a
·
b)
·
Q + (1 b)
·
R
= P (b a · b)· P
(1 b)
·
P + (b
a
·
b)
·
Q + (1 b)
·
R
= P + (b a · b) · Q (b a · b)· P + (1 b) · R (1 b) · P
= P + (b a · b) · (Q P) + (1 b) · (R P), the books result.
Define vectors U = (Q P) and V = (R P) to get
T(a,
b) = P + (b a · b) · U + (1 b) · V, which the text book represents as
T(a, b) = P + aU + bV, obviously redefining a and b. Note that we drop the excessive use of the
multiplication symbol, writing aU for a · U to avoid
confusion.
The equation can be rewritten
as T(a,
b)
P = aU + bV. We can construct a vector
W perpendicular to each of U and V by taking their cross product W
= U x V. Then
W · (aU + bV) = a(W ·
U) + b(W ·
V) = 0, as W ·
U = 0 and W ·
V = 0.
The equation of a plane could thus be defined as W ·(T(a, b) P) = 0; thus the plane is defined by a point in the plane and a vector perpendicular to the plane. We shall return to this observation when we have further examined the dot product and cross product.
Review of Matrix and Vector Notation.
A matrix is a rectangular array of numbers. Those students with programming experience may consider a matrix to be a two-dimensional array of real numbers. An (N by M)-matrix has N rows, each of M entries, and so has M columns. Two matrices are of the same size, if they have the same number of rows and the same number of columns. If N = M, the matrix is square. The fact that most of the matrices of interest to us are square should not blind us to the existence of matrices that are not square.
The most familiar examples of non-square matrices are the row vector and column vector.
A row vector is an (1 by M) matrix.
A column vector is an (N by 1)
matrix.
A matrix element is variously denoted
In algebra we use subscripts AI,J
In many programming languages we use parentheses A(I, J)
In C++ we use square brackets a[i][j]
These notes will mostly use the style of parentheses.
Addition is defined for two
matrices if and only if they are the same size.
Consider two
(N by M matrices) A and B.
The sum matrix C = A + B is also an (N by M) matrix, with elements defined by C(I,J) =
A(I,J) + B(I,J), 1 £
I £
N and 1 £
J £
M.
Note that, unlike C++ and the .NET languages, matrices in our usage are 1-based.
Matrix addition is commutative, so A + B = B + A.
The transpose of a matrix A, denoted AT, is obtained by swapping the rows and columns of the matrix, thus AT(I, J) = A(J, I). The transpose of an (N by M) matrix is an (M by N) matrix. If a matrix is square, then so is its transpose. Note that (AT)T = A.
The product of two matrices A and B, often written as AB, is defined if and only if the number of columns in matrix A is the same as the number of rows in matrix B.
Let A be an (L by M)
matrix
Let B be an (M by N) matrix
Then C = AB is an (L by N) matrix, with elements
C(I, J) = ε A(I, K) · B(K, J), with the sum taken for 1 £ K £ M.
Note that matrix multiplication usually does not commute. In the above scenario, the product matrix BA will not be defined unless L = N. If L = N, then each of AB and BA will be square matrices, but they will be different size unless L = M = N.
Suppose that both A and B are 3 by 3 matrices, and C = AB is the product matrix. Then
C(I, J) = A(I, 1) · B(1, J) + A(I, 2) · B(2, J) + A(I, 3) · B(3, J).
Suppose U is a 3-D row vector; i.e., a (1 by 3) matrix. Then its transpose UT is a 3-D column vector; i.e., a (3 by 1) matrix. Let A be a (3 by 3) matrix. The following products are defined: UA, AUT, UUT and UTU (the last one is a bit strange). The figure below illustrates a few sample computations.


Note that the matrix product UUT is a scalar, while the matrix product UTU is a 3 by 3 matrix.
Basis Vectors and Coordinate
Systems
Consider three vectors V1, V2, and V3 that are linearly independent. Those few students who are interested are invited to look up the definition of linear independence; we shall specify a related property of the vectors that is sufficient for linear independence. We stipulate that the dot products of two distinct vectors in the set is 0; thus VI · VJ = 0 if I Ή J.
It is common practice to
express arbitrary vectors as linear combinations of these basis vectors; thus W = a1V1 + a2V2 + a3V3, and
X = b1V1 + b2V2 + b3V3.
We now calculate the dot product W · X.
W · X = a1V1 · (b1V1 + b2V2 + b3V3)
+ a2V2
·
(b1V1 + b2V2 + b3V3)
+ a3V3 · (b1V1 + b2V2 + b3V3)
= a1b1V1 · V1 + a1b2V1 · V2 + a1b3V1 · V3
+ a2b1V2
·
V1 + a2b2V2 · V2 + a2b3V2
·
V3
+ a3b1V3 · V1 + a3b2V3
·
V2 + a3b3V3 · V3
= a1b1V1 · V1 + a2b2V2 · V2 + a3b3V3 · V3
With this interpretation, we
have
W
·
W =
a1a1V1 · V1 + a2a2V2
·
V2 + a3a3V3 · V3.
What we expect is that W · W = (a1)2 + (a2)2 + (a3)2. This is true if we further specify the basis vectors so that they are normal vectors: V1 · V1 = 1, V2 · V2 = 1, and V3 · V3 = 1.
Thus we have an orthonormal set of basis vectors; they are normal vectors orthogonal (perpendicular to each other).
Following normal usage, we call the basis vectors e1, e2, and e3, where

A vector W then has the standard representation in terms of the basis vectors.

In our text, we shall often write vectors in the row form W = (a1, a2, a3). There is no theoretical significance to this notation, it is just easier to write vectors this way and to forget to denote the row with the superscript indicating transpose. To recap, consider the two vectors W = (a1, a2, a3) and X = (b1, b2, b3). Then W · X = a1b1+ a2b2+ a3b3.
As an exercise in the dot product, consider the equation of a plane through a point P0. The equation is N · (P P0) = 0. The first example is as follows:
What is the equation of the plane perpendicular to (0, 0, 1) and through the point (2, 3, 5)?
Answer: Let P = (X, Y, Z) and P0 = (2, 3, 5). We stipulated that N = (0, 0, 1), so the equation of the plane is (0, 0, 1) · (X 2, Y 3, Z 5) = 0 or Z 5 = 0 or Z = 5.
What is the equation of the plane perpendicular to (1, 1, 1) and through the point (2, 3, 5)?
Answer: The equation is (1, 1, 1)
·
(X 2, Y 3, Z 5) = 0 or X 2 + Y 3 + Z 5 = 0 or
X + Y + Z 10 = 0, or X + Y + Z = 10.
What is the equation of the plane perpendicular to (1, 1, 1), through the point (2, 3, 5)?
Answer: The equation is (1, 1, 1) · (X 2, Y 3, Z 5) = 0
or 2 X + 3 Y + 5 Z = 0 or 10 X Y Z = 0, or X + Y + Z = 10. Note that this is the same answer as the
previous question, due to the fact that (1, 1, 1) = (1, 1, 1). We shall soon consider situations when we
need to distinguish these two cases as representing the outside of a
geometric solid vs. the inside.
Cross Product, Revisited.
Consider the two vectors W = (a1, a2, a3) and V = (b1, b2, b3). We now present a simple method for calculating the cross product W x V. Recall that the cross product W x V is a vector perpendicular to each of W and V. The easiest method for producing the cross product involves taking the determinant of a (3 by 3)-matrix. The equation is

It is easily shown from the above that W x V = V x W; the cross-product anti-commutes.
It is easily shown that W · (W x V) = V · (W x V) = 0, so that the vector W x V is perpendicular to each of V and W. We show this for W · (W x V).
W · (W x V) = a1(a2b3 a3b2) + a2(a3b1 a1b3) + a3(a1b2 a2b1)
= a1a2b3 a1a3b2 + a2a3b1 a2a1b3 + a3a1b2 a3a2b1
= a1a2b3 a2a1b3 + a2a3b1 a3a2b1 + a3a1b2 a1a3b2
= 0, so W x V is perpendicular to W.
Two Special Square Matrices
If we restrict our attention to square matrices, say (N by N)-matrices, we find it helpful to define two special matrices: the zero matrix and the identity matrix.
0 the zero matrix for every (N by N)-matrix A, A + 0 = 0 + A = A
I the identity matrix for every (N by N)-matrix A, AI = IA = A.
The components of the zero matrix are easily specified: they are all 0s. The components of the identity matrix are 1s along the diagonal of the matrix and 0s elsewhere; thus
IJK = 1 if J = K and 0 if J Ή K. For (3 by 3) matrices, we have:
The
subscript 3 on the label I3 is to distinguish the matrix as the (3
by 3) identity matrix. When the size of
the matrix is clear, the subscript is often omitted.
If a matrix A is invertible, there exists a matrix, denoted A1, such that A A1 = A1 A = I. There exist a number of methods for computing the inverse of an invertible matrix; we consider the one that is simplest to implement. This method of elementary row operations is taken from the book Linear Algebra by Friedberg, Insel, and Spence (Prentice-Hall, 1989).
Definition: Let A be an (M by N)-matrix. Any one of the following three operations on the rows of A is called an elementary row operation:
a) Interchanging any two rows of A
b) Multiplying any row of A by a nonzero constant
c) Adding any constant multiple of a row of A to another row.
Elementary column operations are defined in a similar manner, but we do not use them.
Definition: Let A be an
(N by N)-matrix. We define the (N by 2N)
augmented matrix
C = (A | IN) as pasting the identity matrix onto the matrix A,
so that
C(I, J) = A(I, J)
if 1 £
J £
N
= 1 if J = (N + I)
= 0 otherwise
Theorem: Let A be an invertible (N by N)-matrix. If the augmented matrix C = (A | IN) is transformed into a matrix of the form (IN | B) by a series of a finite number of elementary row operations, then B = A1.

We consider the example matrix A at right. First, we construct the augmented matrix (A | I3).
The
first thing we note is that we need a nonzero entry in column 1 of the first
row of the augmented matrix. But C(1, 1)
= 0, so we swap rows 1 and 2 to get the matrix at left. We equally well could have swapped rows 1 and
3.
The basic strategy is to move column by column from left to right to cause the matrix A to be replaced by the matrix I3. We first drive the diagonal element in the column to 1 and then drive the non-diagonal elements to 0. In column 1, we want A(1, 1) = 1, A(1, 2) = 0, and A(1, 3) = 0. After this we work on columns 2 and 3.
In
order to place a 1 in the first column of the first row, we multiply the first
row of the augmented matrix by 0.5 to get the matrix at left. We are now prepared to drive A(1, 2) and A(1,
3) to zero.
In
the first column we find that A(1, 2) = 0 already so that no adjustment is
required. We see that A(1, 3) = 3, so we
add 3 times row 1 to row 3 to get the result at right. We are now complete with the process of
converting the first column and move to column 2.
In
order to change the second column of the matrix above into the second column of
I3, we multiply row 2 by 0.5 to place a 1 in the
(2, 2) position. This operation produces
the matrix at left.
We
now complete work on the second column by adding 2 times the second row to the
first row and by adding 3 times row 2 to row 3.
The result of this operation is the matrix at right.
Only
the third column remains to be changed.
In order to place a 1 in the (3, 3) position, we multiply the third row
by 0.25. The result of this operation is
the matrix at left.
We
complete the process of converting the matrix by adding 3 times row 3 to row 1
and 2 times row 3 to row 2. The result
of this process is shown at right. Note
that the matrix is now of the form (I3 | B).
As a result of the above, we now have the inverse of the matrix A.

This completes the example.
We need a few more results on matrix multiplication in order to work our way through some of the presentations just to come.
The first result relates to the inverse of a matrix product.
Theorem: Let A and B be invertible (N by N)-matrices. Let C = AB be the matrix product of A and B. Then C1 = B1A1.
Proof: If A is
invertible, then there exists a matrix A1 such that AA1
= A1A = IN. The same is true of matrix B. Recalling that for any (N by N) matrix M,
we have
M IN = IN M = M, we note the
following
AB B1A1 = A B B1A1 = A IN A1 = A A1 = IN.
B1A1 AB = B1 A1A B = B1 IN B = B1 B = IN.
Theorem: Let A
and B be matrices such that the product C = AB is defined.
Then CT = BTAT.
Proof: Let A be an (L by M)-matrix and B
be an (M by N) matrix. Then the product
C = AB is defined, with C being an (L by N) matrix.
Note that AT is an (M by L)-matrix and BT in an (N by M) matrix. From this observation, we see that the matrix product BTAT is defined, while the product ATBT is not defined if L Ή M.
![]()
![]()
This last result applies to square matrices, but is not restricted to them. One particular case of interest is the product of a matrix and a vector.
Let A by an (N by N)- matrix and let X be an (N by 1) column vector. Then XT is a (1 by N) row vector. We consider the product of the matrix and the vector.
(A X)T = XT AT and (XT A)T = AT X, as (XT)T = X.
We now give a theorem on the inverse of the transpose of a matrix. We begin with a definition of a new symbol used to facilitate the proof.
Definition: We define the Kronecker delta by dI,J = 1 if I = J and dI,J = 0 if I Ή J.
Note that dI,J
= dJ,I.
Theorem: Let A be an N-by-N matrix with an inverse. Then (AT)1 = (A1)T.
Proof: Let A be an invertible N-by-N matrix. Let B = A1. Then we have the two matrix products A·B = B·A = IN, the N-by-N identity matrix.
The claim is that BT = (AT)1, thus AT·BT = BT·AT = IN.
Expressing these matrix products by sums, we have the following equivalent expressions.
(A·B)I,J
= ![]()
(B·A)J,I
= ![]()
We use the matrix products to complete the proof.
(AT·BT)I,J
= ![]()
= ![]()
(BT·AT)J,I
= ![]()
=
.
We have just shown that AT·BT = BT·AT = IN.
Since B = A1, we have shown that (AT)1 = (A1)T.
Thus we have shown that for an invertible matrix, the inverse of the transpose is equal to the transpose of the inverse. This property becomes quite useful when we consider coordinate transformations.
We now investigate a number of transformations of coordinate systems. The two most basic methods are rotation of the coordinate system and translation of the coordinate system.
There are a number of reasons for interest in coordinate transformations. We first present an example not directly related to computer graphics.
Consider the following 2-dimensional equation in (X, Y)-coordinate space.
![]()
One might say that the geometric figure represented by the figure is not obvious, except in the obvious fact that it is a two-dimensional figure. There are standard methods, not to be discussed in this class, to simplify this equation so that it may be understood. These standard methods lead to the following coordinate transformation.

Making this transformation leads to the following equation in the variables U and V.
![]()
Division of both sides of this equation leads to the following form.
![]()
This is easily recognized as the equation of an ellipse, centered at (U, V) = (0, 0) with semi-major axis of length 4 along the V axis and semi-minor axis of length 3 along the U axis. The figures below show the ellipse in both (U, V)-space and (X, Y)-space.

We now consider the more
traditional presentation in terms of distinct sets of basis vectors. Consider two sets of basis vectors U =
{U1, U2, U3} and V
= {V1, V2, V3}. The key to the transformation is the
specification of one set of basis vectors in terms of the other basis set. The representation of a set of basis vectors
in terms of itself is quite simple. In
terms of the basis vectors U1, U2, and U3,
we have U1 = (1, 0, 0), U2 = (0, 1, 0), and
U3 = (0, 0, 1). Then
the set of basis vectors (note the set notation) is given by
U = {(1, 0, 0), (0, 1, 0),
(0, 0, 1)}.
Suppose that the basis vector set U is to be expressed in terms of another basis vector set V. The conventional way to do this is to use matrix multiplication. Consider two sets of basis vectors in three dimensions: U = {U1, U2, U3} and V = {V1, V2, V3}. Given the representation of a vector in the basis set V, we want to find it in the basis set U. We begin with the representation of the basis vectors in the set U in terms of those in V.
U1 = m11·V1 + m12·V2 + m13·V3
U2 = m21·V1 + m22·V2 + m23·V3
U3 = m31·V1 + m32·V2 + m33·V3.
Recalling that in the V coordinate set V1 = (1, 0, 0), V2 = (0, 1, 0), and V3 = (0, 0, 1), this is usually written in the following way.
U1 = m11·(1, 0, 0) + m12·(0, 1, 0) + m13·(0, 0, 1) = (m11, m12, m13)
U2 = m21·(1, 0, 0) + m22·(0, 1, 0) + m23·(0, 0, 1) = (m21, m22, m23)
U3 = m31·(1, 0, 0) + m32·(0, 1, 0) + m33·(0, 0, 1) = (m31, m32, m33).
In three dimensions, this is written compactly using a (3 by 3)-matrix M.

To give a specific example, we consider the coordinate transformation mentioned earlier.

Since this is a two-dimensional transformation, it is represented by a (2 by 2) matrix.

We now investigate the
relationship of this matrix to the matrix M discussed above. To do this we must investigate the
representation of the basis vectors. Let
and
be the basis vectors
in the (X, Y) system and
and
be the basis vectors
in the (U, V) system. In the (X,
Y)-coordinate system,
= (1, 0) and
= (0, 1). In the (U, V) system we have

The above shows the (U, V)-coordinates of the (X, Y) basis vectors. Put another way:
![]()
We may also want to see the (X, Y)-coordinates of the (U, V) basis vectors. It is easy to show that
![]()
When we discuss rotations, the method for inverting these transformations will be obvious.
We now address the question of conversion of a vector from one coordinate system to another. In this case, we assume that the vector W is defined in the V coordinate system and want to find its representation in the U coordinate system.
Given that W = a1·V1 + a2·V2 + a3·V3, or W = (a1,a2,a3) in the V-system, we know that the vector has a representation in the U-system W = b1·U1 + b2·U2 + b3·U3. The task is to compute the coefficients b1, b2, and b3 given a1, a2, and a3, and the matrix M.
Just a bit ago, we showed the matrix M as representing the U basis vectors in the V basis.

We now note that the two
representations of W, W = a1·V1
+ a2·V2
+ a3·V3
and
W = b1·U1
+ b2·U2
+ b3·U3
can be written in matrix notation as

Putting all of this together, we get a matrix equation

We want the vector b in terms of vector a, so b = (MT)1a.
We now apply all this algebra to a simple example.
Consider the following example, with the U basis set expressed in terms of the V-set:
U1 = ( 1, 1, 0) = 1·V1 + 1·V2 + 0·V3
U2 = ( 1, 1, 1) = 1·V1 1·V2 + 1·V3
U3 = ( 1, 1, 2) = 1·V1 + 1·V2 + 2·V3
The (3 by 3)-matrix describing this is shown below

Taking (M)1, the inverse of matrix M, allows us to express the U-set in terms of the V-set and the V-set in terms of the U-set, as follows.

Our goal of converting a vector representation from the V-basis set to the U-basis set requires that we compute the inverse of the transpose of M; that is (MT)1.

Suppose that W = 1·V1 + 1·V2 + 2·V3. What is the representation of W in the U-set?

Thus W = U3, which should not be a surprise, as we used the equation for U3 to define W.
As an aside, we note that the
length of W in the V system is (1 + 1 + 4)1/2 =
61/2 or approximately 2.45, while the length of W in the U system is
obviously 1, so the transformation above is not length preserving. To make matters worse, consider the vector
X = 1·V1 + 1·V2
+ 0·V3,
which is U1 in the U system. This vectors length in the V system is 21/2, or
approximately 1.41, while again it is 1 in the U system. Not only does this
transformation fail to preserve lengths, but it also distorts the objects. Next time you go to a fair and attend a funny
house, think of distorting image transformations as you stand in front of one
of the funny mirrors.
To appreciate the difference between length preservation and non-distortion, consider the following change of basis vectors.
U1 = 2·V1
U2 =
2·V2 or
, with 
U3 = 2·V3
Each of the U basis vectors is twice the length of a V basis vector, but the U basis vectors are all of the same length in the V-system, so an object will appear as half-sized in the U-system, but it will not appear distorted.
The discussion just above is terse and accurate, but may leave too much to the imagination. In order to get a better idea of what is happening, we discuss the books example from section 4.3.2, but give an entirely different presentation.
Suppose we have two basis sets
for a vector space: {U1, U2, U3}
and {V1, V2, V3}, related
by
U1 = V1
U2 = V1
+ V2
U3 = V1
+ V2 + V3.
There are two questions one might ask.
1) Given a vector represented in the U-system, represent
it in the V-system.
2) Given a vector represented
in the V-system, represent it in the U-system.
Note that the book addresses only the second question.
1) Suppose that W is represented in the U-system as W = a1·U1 + a2·U2 + a3·U3. The representation in the V-system is given quite simply based on the above equations. We start by making simple substitutions for U1, U2, and U3
W = a1·(V1)
+ a2·(V1
+ V2) + a3·(V1 + V2 + V3)
= (a1+ a2+
a3)·V1
+ (a2+
a3)·V2
+ (a3)·V3,
so we have the representation of W in the
V-system as W = (a1+ a2+ a3,
a2+
a3,
a3).
2) We now consider the problem discussed in section 4.3.2 of the textbook. Suppose that W is represented in the V-system as W = b1·V1 + b2·V2 + b3·V3. The formal way to approach this problem is to obtain the inverse of the transformation matrix, as was done by the text. We shall consider a simpler approach. Consider the transformation as a set of equations with the unknown variables being V1, V2, and V3.
V1 =
U1
V1 + V2 = U2
V1 + V2
+ V3 = U3
We solve for V1, V2, and V3 in terms of U1, U2, and U3.
To solve for V1,
we merely copy the first equation. To
solve for V2, we substitute the first equation into the
second to get U1 + V2= U2,
or V2 = U2 U1. To solve for V3, we
substitute the second equation in the third equation (V1 + V2)
+ V3 = U3, or U2 + V3
= U3 to get
V3 = U3 U2. Thus we have the equations
V1 = U1
V2 = U2 U1
V3 = U3 U2
If given W in the
V-system as W = b1·V1 + b2·V2
+ b3·V3,
then we just substitute again
W = b1·U1
+ b2·(U2
U1) + b3·(U3 U2)
= (b1 b2)·U1
+ (b2
b3)·U2
+ (b3)·U3,
so W in the U-system is represented as
W = (b1
b2,
b2
b3,
b3).
Example: Suppose W = 1·V1 + 2·V2
+ 3·V3. Then
we have in the U-system
W
= (1 2)·U1
+ (2 3)·U2
+ (3)·U3
= 1·U1
1·U2
+ 3·U3,
as in the textbook.
In terms used by the book, we
have
, with M =
.
It follows then that
, with M 1 =
.
Now W = a1·V1 + a2·V2 + a3·V3 = aT·
, where a =
.
In the U-system, W is represented as W = b1·U1 + b2·U2 +b3·U3 or
W = bT·
, where b =
. From here, we
proceed as in the derivation above, showing that W = bT·
= bT·M·
= aT·
, so bT·M
= aT
and a = MT·b, as before.
Homogeneous Coordinates
The concept of homogeneous coordinates seems to have arisen from a number of considerations, including a desire to have a uniform notation that differentiated points from vectors and a desire to speed up the computation of general coordinate transformations.
In the standard 3-D system, both points and vectors are represented as triples. In our notes, we use different fonts to designate the two, P = (1, 2, 3)T and V = (1, 2, 3)T, and generally use different letters (Ps and Qs for points, Ws, Xs, Ys, and Zs for vectors), but this convention does not convey any information to the computer.
A more general coordinate transformation can be viewed as a rotation and a translation, composed in either order; rotate then translate or translate then rotate. In matrix terms,
U = M·V + X
U = N·(V + X),
where U, V, and X are column
vectors.
Note that, in this formulation, M and N are both 3-by-3 matrices. We could arrive at a formula for deriving the matrix N in terms of M or vice-versa, but that is not the focus of this discussion. The point is that in the above system, we need both a matrix addition and a matrix multiplication to handle a combination rotation and translation. It would be more efficient to develop a system in which a single matrix operation would suffice.
We move to the idea made possible by the use of homogeneous coordinates that a single matrix multiplication can be made to represent a general coordinate transformation. We first consider only rotations and translations and then consider more general transformations.
Homogeneous coordinates are based on the use of four-dimensional entities to represent three dimensional points and vectors. We shall discover a number of uses for the additional dimension, but should always remember that they are a trick to facilitate computation.
In homogeneous coordinates, three-dimensional points and vectors are represented by four-dimensional arrays. The basis of this four-dimensional representation is the definition of a three-dimensional frame:
three basis vectors, usually denoted (1, 0, 0), (0, 1,
0), and (0, 0, 1), and
a unique reference point P0,
often called the origin of the
coordinate system.
In such a system, a vector can be represented by three components
W = a1·V1 + a2·V2 + a3·V3
But a point needs both a reference point (the origin) and a vector
P = P0 + X = P0 + b1·V1 + b2·V2 + b3·V3
Homogeneous coordinates use a four-dimensional notation to reflect this difference. Homogeneous coordinates have as an expanded basis set the following hybrid four-dimensional array, which I call B4 for the basis of a 4-D system.

In the pure mathematical sense, this is a heterogeneous array. The first three elements in the array are vectors, while the last element in the array is a point. It is enough for us that points and vectors share a common representation.
Vectors are represented by a 4-D array with
the last element 0.
Points are represented by a 4-D array with
the last element 1.
Consider the vector, represented in standard three-dimensional form as
W = a1·V1 + a2·V2 + a3·V3 = [a1, a2, a3]T.
In the four-dimensional homogeneous coordinate system, the vector is represented as
, so that 
or WT·B4 = a1·V1 + a2·V2 + a3·V3 + 0 = a1·V1 + a2·V2 + a3·V3
Consider the point, represented in standard three-dimensional form as
P = P0 + X = P0 + b1·V1 + b2·V2 + b3·V3, or loosely as (b1, b2, b3).
In the four-dimensional homogeneous coordinate system, the point is represented as
, so that 
or PT·B4 = b1·V1 + b2·V2 + b3·V3 + P0.
The big payoff comes in the ability to represent a translation of coordinate systems by a matrix. Suppose we have a coordinate system with origin P0. A translated coordinate system is one with a new origin Q0, specified by a vector offset from P0.
Consider two frames (V1, V2, V3,
P0) and (U1, U2, U3, Q0) and the transformation from the
V-system to the U-system.
U1 = M11·V1 + M12·V2 + M13·V3
U2 = M21·V1 + M22·V2 + M23·V3
U3 = M31·V1 + M32·V2 + M33·V3
Q0 = M41·V1 + M42·V2 + M43·V3 +P0
Note that the Q0, the origin of the new system, is displaced from P0, the origin of the old system by the vector M41·V1 + M42·V2 + M43·V3. In homogeneous coordinates, this transformation is represented by the matrix
, with transpose 
We use the transpose MT in both point and vector transformations. Vectors transform as

Points transform as

We immediately notice one important property of these transformations: vectors transform into vectors (0 in the fourth row) and points transform into points (1 in the fourth row).
Also
, so a vector added to a point is a point.
We now consider some special coordinate transformations. In our first example, we pick
, so 
We first transform the point P to identify this transformation.
- a translation.
Note that this corresponds to translating the origin by (3, 4, 5), as the values of each component are increased. We now consider that effect of such a transformation on a vector.

Note here that the components of the vector are invariant under the transformation; that is, they do not change. The answer is that the vector is just a direction in space and a magnitude. Thus vectors are not changed by just changing the origin of coordinates.
Finally, we consider a pure rotation applied to a point.

Considering a pure rotation applied to a vector would give similar results. In each case the pure rotation in homogeneous coordinates strongly resembles the corresponding rotation in the three-dimensional model.
Affine Transformations
We are developing the theory of coordinate transformations because such transformations are essential to the understanding of graphics systems. At this point, we consider what type of transformations are required for work with graphics systems.
It turns out that not all
transformations can be used for graphics and, fortunately, that the subset of
transformations that can be used is the simpler subset. This is the subset of
affine transformations.
To understand the idea of an
affine transformation, consider the related concept of linear function. A function f( ) is linear if and only if
f(a·x + b·y)
= a·f(x)
+ b·f(y) for all scalars a and b
and
all variables x and y.
Consider the function f(x)
= x2. This is not a
linear function, specifically
f(a·x + b·y) = (a·x + b·y)2
= a2·x2
+ b2·y2
+ 2·a·b·x·y
= a2·f(x)
+ b2·f(y)
+ 2·a·b·x·y.
There is another property of linear functions, shared by affine transformations. In terms of linear functions of one variable, we have the following interesting property
f(a·x + (1 a)·y) = a·f(x) + (1 a)·f(y).
This generalizes to three
dimensional points as
f(a·p + (1 a)·q)
= a·f(p)
+ (1 a)·f(q).
Recall that the set a·p + (1 a)·q, for 0 £ a £ 1, is the set of all points on the line between the two points p and q. Note that this maps onto the set a·f(p) + (1 a)·f(q), which is the set of all points on the line between the two points f(p) and f(q).
This observation would be an
interesting curiosity were it not for an immediate consequence. To apply a transformation to the line from
point p to point q, we use two steps.
1) Compute the two points f(p) and f(q)
2) Connect these two points f(p) and f(q) with a straight line.
The straight line between f(p) and f(q) is the image of the straight line between p and q.
One property of transformations that is worth note is the degrees of freedom. Basically, this is the number of variables that can be set independently. Consider the function of three variables f(x, y, z) = a1·x + a2·y + a3·z. Suppose that the scalars a1, a2, and a3 are independent. In that case the transformation has three degrees of freedom because each of these three scalars can be set arbitrarily.
If, on the other hand, we have the constraint that a1 + a2 + a3 = 1, then only two of the three scalars can be set arbitrarily and the third value is then determined. In this case the transformation has only two degrees of freedom. The book discusses the degrees of freedom of transformation at some length, but it is not likely that this course will make much use of the concept.
Rotation, Translation, and
Scaling
We have studied the use of homogeneous coordinates and the related 4-by-4 matrices to represent coordinate transformations. We have mentioned that such matrices could be used to represent transformations that are quite general, but hinted that we would use simpler methods to construct these matrices.
We shall view many of these general coordinate transformations as being constructed from three simpler affine transformations: rotation, translation, and scaling. We shall then discuss the OpenGL practice of building the general transformation matrix as a product of the matrices representing the simpler transformations.
We begin with the translation method of coordinate transformation. Under coordinate translations, points are transformed but vectors are not. Consider a translation of the coordinate system by the displacement vector d = (ax, ay, az)T. In the homogeneous coordinate system, this is represented by the 4-by-4 matrix.

Points [x, y, z, 1]T translate as follows:

This translation is sometimes written at T(ax, ay, az).
We obtain the inverse of the translation by noting that if T corresponds to a translation by vector d = (ax, ay, az)T, then the inverse transformation corresponds to a translation by the vector d = (ax, ay, az)T. Based on this we have the following observation
T1(ax, ay, az)
= T(ax,
ay,
az),
represented by matrix
.
It is easily shown that vectors are not changed by this translation.
The second affine transformation to be considered is scaling. There are two types of scaling: uniform and non-uniform scaling. In uniform scaling, each coordinate is multiplied by the same scale factor, denoted b.
For |b| > 1, an object gets larger
For 0 < |b| < 1, an object gets smaller.
Values of b < 0 correspond to reflection.
Uniform scaling is represented
by the matrix S(b)
=
.
Consider the effect of uniform scaling on both vectors and points.
For vector w 
For point P 
Non-uniform scaling is a
transformation with different scaling factors for each axis. Uniform scaling preserves shapes; non-uniform
scaling does not. Non-uniform scaling is
represented by S(bx,
by,
bz)
=
.
It is easily shown that both
vectors and points transform in manner similar to uniform scaling. We apply the inverse of the scaling matrix by
applying the inverse of each of the scaling factor, thus S1(bx,
by,
bz)
=
.
Suppose we want to do a
coordinate transformation involving scaling and translation. Does it matter if we scale first and then
translate or translate first and then scale.
In mathematical terms, we are asking if the two matrices commute; i.e.,
does S(bx, by,
bz)·T(ax,
ay,
az)
= T(ax,
ay,
az)·S(bx,
by,
bz)?
We first compute the matrix S(bx, by, bz)·T(ax, ay, az).
S(bx, by,
bz)·T(ax,
ay,
az)
= 
T(ax, ay,
az)·S(bx,
by,
bz)
= 
The mathematics clearly indicates that the coordinate transformations do not commute. It is instructive to observe the reason for failure to commute.
Consider the first product S(bx, by, bz)·T(ax, ay, az). Because the translation is done first, followed by the scaling, we note that the translation vector (ax, ay, az) is also scaled, so we have translation by the vector (bxax, byay, bzaz). The second transformation corresponds to translation of the scaled figure by the vector (ax, ay, az).
Rotation
The generalized rotation
operator is specified in terms of an angle of rotation and a vector (possibly a
unit vector) serving as the axis of rotation.
The operator is represented as
R(Q, VX, VY,
VZ) were V = (VX,
VY, VZ) is the axis of rotation. We first examine a number of simple
rotations, specifically
R(QX, 1, 0, 0) rotation about the Xaxis, and
R(QY, 0, 1, 0) rotation about the Yaxis, and
R(QZ, 0, 0, 1) rotation about the Zaxis.
The first case we consider is R(QZ, 0, 0, 1), also known as rotation in the X-Y plane about the origin. We shall examine this rotation and then make some statements applicable to all three of these basic rotations. After considering the basic rotations, we consider the more general rotations and show that these can be represented as the concatenation of the basic transformations.
Let RZ(QZ) denote rotation by angle QZ about the Z-axis. It is easily shown that the 4-by-4 matrix representing this rotation is given by
RZ(QZ)
=
. Similar matrices
represent RX(QX) and RY(QY).
Let R(Q) represent any of the three rotations RX(QX), RY(QY), or RZ(QZ). Noting that a rotation by Q can always be undone by a rotation by Q, we have the immediate result that R1(Q) = R( Q). Applying this logic to RZ(QZ), we see that
RZ1(QZ)
=
=
.
From this we see that RZ1(QZ) = RZT(QZ), a property shared by the other rotation matrices. A matrix whose inverse equals its transpose is called an orthogonal matrix.
Question: Do rotations commute? To answer this question, we consider a specific case involving rotations around the X-axis and Z-axis. If this fails to commute, we have shown that, in general, rotations fail to commute. Consider two transforms RX(F) and RZ(Q).
We first consider the product RX(F)·RZ(Q)
·
= 
We now consider the product RZ(Q)·RX(F)
·
= 
The two products are not equal, so we conclude that the rotations do not commute.
Concatenation of
Transformations
We now have developed matrix representations of the basic transformation types. We now consider examples of affine transformations formed by concatenating sequences of these basic transformations.
Suppose a sequence of three transformations on a point p, represented by matrices A, B, and C. The transformations are applied in this order: A first, then B, and finally C. We can consider the sequence of transformations as follows:
p2 = Ap
p3 = Bp2
q = Cp3
This can be written as a concatenation of transformations q = C(B(A(p))), represented by the matrix product q = CBAp, or q = Mp where M = C·B·A. It is important to remember that the order of transformations represented by a product of matrices is best read right to left.
The book shows that a generalized rotation about an arbitrary vector can be represented as the concatenation of basic translation and rotations. We shall not pursue this in detail, but should be aware of it.
Most graphical systems include primitives for drawing common shapes such as ellipses and rectangles. Many applications of graphics require a collection of composite shapes that can be scaled, rotated, and translated. We consider a number of such shapes used in the study of digital logic. In this illustration, we use the symbols for 2-input AND gate and OR gates.

Each of these symbols would have a representation stored in some standard form. Each of the two symbols would be represented at the origin of the coordinate system and in a standard orientation and standard size. Consider the AND gate, which is the symbol drawn on the left in the above illustration. To draw this gate with a different size and orientation at a different location, we apply the transformation
M = T(v)R(q)S(b) to the symbol, thus creating the instance at the correct location.