Author: Romain Vergne (website)
Please cite my name and add a link to my web page if you use this course

Image synthesis and OpenGL: transformations

Quick links to:

Goal
Vectors
Basic operations on vectors
2D transformations
Homogeneous coordinates
3D transformations
Transformations on complex objects
The model matrix
The view matrix
The projection matrix
The viewport
From a point to the screen

Goal

How to move / rotate / scale objects?
How to modify the position of the point of view?
How to modify the type of projection of the camera?

Vectors

Let us consider a vector \( \mathbf{x}=( x_1, x_2, \cdots, x_n )^T \) of a vector space \( E^n \)
and a basis \( B= ( \mathbf{e}_1, \mathbf{e}_2, \cdots , \mathbf{e}_n) \): a linearly independent set of vectors that spans \( E^n \).
\( \mathbf{x} \) can be uniquely expressed in the basis \( B \) of \( E^n \) by \[ \mathbf{x}=\sum_{i=1}^n x_i \mathbf{e}_i \]

Examples using orthonormal bases:

In 2D

\( E=\mathbb{R}^2 \)
\( \mathbf{e}_1=(1,0)^T \) (the x-axis)
\( \mathbf{e}_2=(0,1)^T \) (the y-axis)
\( B= (\mathbf{e}_1, \mathbf{e}_2) = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} \)
\( \mathbf{x}=(-2,1)^T = -2 \mathbf{e}_1 + 1 \mathbf{e}_2 \)

In 3D

\( E=\mathbb{R}^3 \)
\( \mathbf{e}_1=(1,0,0)^T \) (the x-axis)
\( \mathbf{e}_2=(0,1,0)^T \) (the y-axis)
\( \mathbf{e}_3=(0,0,1)^T \) (the z-axis)
\( B= (\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3) = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{pmatrix} \)
\( \mathbf{x}=(a_x,a_y,a_z)^T = a_x \mathbf{e}_1 + a_y \mathbf{e}_2 +a_z \mathbf{e}_3\)

Basic operations on vectors

Norm

The norm (or magnitude) of a vector \( \mathbf{x}= (x_1,x_2,\cdots ,x_n)^T \) is defined by \( | \mathbf{x} | = \sqrt{x_1^2+x_2^2+ \cdots +x_n^2} \)

Normalized vector

The normalized (or unit) vector of \( \mathbf{x}= (x_1,x_2,\cdots ,x_n)^T \) is a vector in the same direction but with norm 1: \( \hat{\mathbf{x}}= \frac{\mathbf{x}}{ | \mathbf{x} |} = (\frac{x_1}{| \mathbf{x} |}, \frac{x_2}{| \mathbf{x} |}, \cdots , \frac{x_n}{| \mathbf{x} |})^T \)
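
The norm and the normalized vector can be sketched in a few lines of Python. The function names `norm` and `normalize` are illustrative choices, not part of any library, and vectors are assumed to be plain lists of numbers.

```python
import math

def norm(x):
    """Euclidean norm of a vector given as a sequence of numbers."""
    return math.sqrt(sum(c * c for c in x))

def normalize(x):
    """Return the unit vector pointing in the same direction as x."""
    n = norm(x)
    if n == 0.0:
        # The zero vector has no direction, so it cannot be normalized.
        raise ValueError("cannot normalize the zero vector")
    return [c / n for c in x]

v = [3.0, 4.0]
print(norm(v))       # 5.0
print(normalize(v))  # [0.6, 0.8]
```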

Scalar product

The scalar product between 2 vectors \( \mathbf{a} = (a_x,a_y,a_z)^T \) and \( \mathbf{b} = (b_x,b_y,b_z)^T \) is defined by

\( \mathbf{a} \cdot \mathbf{b} = | \mathbf{a} | | \mathbf{b} | \cos ( \theta ) \), with \( \theta \) the angle between \( \mathbf{a} \) and \( \mathbf{b} \)

\( \mathbf{a} \cdot \mathbf{b} = a_xb_x + a_yb_y + a_zb_z \)
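
Equating the two expressions of the scalar product gives a way to recover the angle between two vectors. The following is a minimal sketch; `dot` and `angle_between` are hypothetical helper names, and the clamping step is an added safeguard against rounding errors.

```python
import math

def dot(a, b):
    """Scalar product: sum of the component-wise products."""
    return sum(ai * bi for ai, bi in zip(a, b))

def angle_between(a, b):
    """Angle (in radians) between two non-zero vectors."""
    cos_theta = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    # Clamp to [-1, 1] so that rounding errors never break acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

a = [1.0, 0.0, 0.0]
b = [0.0, 2.0, 0.0]
print(dot(a, b))                          # 0.0: the vectors are perpendicular
print(math.degrees(angle_between(a, b)))  # approximately 90 degrees
```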


Vector product (or cross-product)

The vector (or cross-) product between 2 vectors \( \mathbf{a} = (a_x,a_y,a_z)^T \) and \( \mathbf{b} = (b_x,b_y,b_z)^T \) is defined by

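\( \mathbf{a} \times \mathbf{b} = | \mathbf{a} | | \mathbf{b} | \sin ( \theta ) \mathbf{n}\), with \( \theta \) the angle between \( \mathbf{a} \) and \( \mathbf{b} \), and \( \mathbf{n} \) a unit vector perpendicular to the plane containing \( \mathbf{a} \) and \( \mathbf{b} \)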

\( \mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3\\ a_x & a_y & a_z\\ b_x & b_y & b_z \end{vmatrix} = \begin{pmatrix} a_yb_z-a_zb_y\\ a_zb_x-a_xb_z\\ a_xb_y-a_yb_x \end{pmatrix} \)
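
The component formula translates directly into code. This sketch uses a hypothetical helper named `cross` and checks the anti-symmetry \( \mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a} \) on the first two basis vectors.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as sequences of 3 numbers."""
    ax, ay, az = a
    bx, by, bz = b
    return [ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx]

e1 = [1, 0, 0]
e2 = [0, 1, 0]
print(cross(e1, e2))  # [0, 0, 1]
print(cross(e2, e1))  # [0, 0, -1]
```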


Exercises

What is the angle between 2 normalized vectors \( \mathbf{a} \) and \( \mathbf{b} \) ?
What is the projection of a vector \( \mathbf{a} \) onto a vector \( \mathbf{b} \) ?
What is the normal of a triangle, given its 3 vertices \( p_1 \) , \( p_2 \) and \( p_3 \) ?
What is the area of this triangle?

2D transformations

We look for the operators that scale/rotate/translate a point or a vector \( \mathbf{P}= \begin{pmatrix} x\\y \end{pmatrix} \) into another one \( \mathbf{P}^\prime= \begin{pmatrix} x^\prime\\y^\prime \end{pmatrix} \)

Scaling

Scaling transformation
Multiply each coordinate by a scaling factor:

\[
\begin{pmatrix}
x^\prime\\
y^\prime
\end{pmatrix}
=
\begin{pmatrix}
s_x x\\
s_y y
\end{pmatrix}
\]

MATRIX FORM: \( \mathbf{P}^\prime = \mathbf{S} \mathbf{P} \)

\[
\begin{pmatrix}
x^\prime\\
y^\prime
\end{pmatrix}
=
\begin{pmatrix}
s_x & 0\\
0 & s_y
\end{pmatrix}
\begin{pmatrix}
x\\
y
\end{pmatrix}
\]

Rotation

Rotation transformation
Rotate each vector around the origin:

\[
\begin{pmatrix}
x^\prime\\
y^\prime
\end{pmatrix}
=
\begin{pmatrix}
\cos (\theta) x - \sin (\theta) y\\
\sin (\theta) x + \cos (\theta) y
\end{pmatrix}
\]

MATRIX FORM: \( \mathbf{P}^\prime = \mathbf{R}_\theta \mathbf{P} \)

\[
\begin{pmatrix}
x^\prime\\
y^\prime
\end{pmatrix}
=
\begin{pmatrix}
\cos (\theta) & - \sin (\theta)\\
\sin (\theta) & \cos (\theta)
\end{pmatrix}
\begin{pmatrix}
x\\
y
\end{pmatrix}
\]
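
Scaling and rotation are both plain matrix-vector products, which can be sketched as follows. The helpers `mat_vec`, `scaling` and `rotation` are illustrative names, and matrices are assumed to be stored as lists of rows.

```python
import math

def mat_vec(m, p):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in m]

def scaling(sx, sy):
    """2x2 scaling matrix S."""
    return [[sx, 0.0],
            [0.0, sy]]

def rotation(theta):
    """2x2 rotation matrix R_theta (counter-clockwise, theta in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

p = [1.0, 0.0]
print(mat_vec(scaling(2.0, 3.0), p))      # [2.0, 0.0]
print(mat_vec(rotation(math.pi / 2), p))  # close to [0.0, 1.0], up to rounding
```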

Translation

Translation transformation
Translation is a simple vector sum

\[
\begin{pmatrix}
x^\prime\\
y^\prime
\end{pmatrix}
=
\begin{pmatrix}
x + t_x\\
y + t_y
\end{pmatrix}
\]

MATRIX FORM: \( \mathbf{P}^\prime = \mathbf{P} + \mathbf{T}\)

\[
\begin{pmatrix}
x^\prime\\
y^\prime
\end{pmatrix}
=
\begin{pmatrix}
x\\
y
\end{pmatrix}
+
\begin{pmatrix}
t_x\\
t_y
\end{pmatrix}
\]

Notation issue

Scaling = matrix multiplication
Rotation = matrix multiplication
Translation = vector addition

We would like to find a single, unified representation for all transformations, so that we can concatenate multiple transformations.
How can we transform the translation into a matrix multiplication?

Homogeneous coordinates

Powerful geometric tool, used in many applications: imaging, robotics, vision, etc.
We add a third coordinate \( w \) to a 2D point, which becomes a 3-component vector: \[ \begin{pmatrix} x \\ y \end{pmatrix} \rightarrow \begin{pmatrix} x \\ y\\ 1 \end{pmatrix} \equiv \begin{pmatrix} wx \\ wy\\ w \end{pmatrix}, \quad w \neq 0 \]
2 points \( (x_1,y_1,w_1)^T \) and \( (x_2,y_2,w_2)^T \) are equal if and only if \( x_1/ w_1 = x_2 / w_2 \) and \( y_1 / w_1 = y_2 / w_2 \)
When \( w=0 \), the coordinates represent a point at infinity, i.e. a direction (a vector)

Homogeneous coordinates make it possible to represent affine spaces (and to differentiate points from vectors)
They also make it possible to represent projection transformations
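
The conversion between Cartesian and homogeneous coordinates, and the equality rule stated above, can be sketched as follows. The function names are hypothetical, and the tolerance `eps` is an arbitrary choice for floating-point comparison.

```python
def to_homogeneous(p, w=1.0):
    """Lift a Cartesian point to homogeneous coordinates with the given w."""
    if w == 0.0:
        raise ValueError("w must be non-zero for a finite point")
    return [c * w for c in p] + [w]

def from_homogeneous(h):
    """Divide by the last coordinate w to get back a Cartesian point."""
    *coords, w = h
    if w == 0.0:
        raise ValueError("w = 0 encodes a point at infinity (a direction)")
    return [c / w for c in coords]

def same_point(h1, h2, eps=1e-12):
    """Two homogeneous points are equal if they agree after division by w."""
    a, b = from_homogeneous(h1), from_homogeneous(h2)
    return all(abs(x - y) <= eps for x, y in zip(a, b))

print(to_homogeneous([1.0, 2.0]))                    # [1.0, 2.0, 1.0]
print(from_homogeneous([2.0, 4.0, 2.0]))             # [1.0, 2.0]
print(same_point([1.0, 2.0, 1.0], [3.0, 6.0, 3.0]))  # True
```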

Back on 2D transformations using homogeneous coordinates

Scaling: \( \mathbf{P}^\prime = \mathbf{S} \mathbf{P} \)

\[
\begin{pmatrix}
x^\prime\\
y^\prime\\
w^\prime\\
\end{pmatrix}
=
\begin{pmatrix}
s_x & 0 & 0\\
0 & s_y & 0\\
0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
x\\
y\\
w\\
\end{pmatrix}
\]
\[ \Downarrow \]
\[
\begin{pmatrix}
s_x x\\
s_y y\\
w\\
\end{pmatrix}
\Rightarrow
\begin{pmatrix}
s_x x / w\\
s_y y / w\\
\end{pmatrix}
\]

Rotation: \( \mathbf{P}^\prime = \mathbf{R}_\theta \mathbf{P} \)

\[
\begin{pmatrix}
x^\prime\\
y^\prime\\
w^\prime\\
\end{pmatrix}
=
\begin{pmatrix}
\cos (\theta) & -\sin (\theta) & 0\\
\sin (\theta) & \cos (\theta) & 0\\
0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
x\\
y\\
w\\
\end{pmatrix}
\]
\[ \Downarrow \]
\[
\begin{pmatrix}
x \cos (\theta) - y \sin (\theta)\\
x \sin (\theta) + y \cos (\theta)\\
w\\
\end{pmatrix}
\Rightarrow
\begin{pmatrix}
(x/w) \cos (\theta) - (y/w) \sin (\theta)\\
(x/w) \sin (\theta) + (y/w) \cos (\theta)\\
\end{pmatrix}
\]

Translation: \( \mathbf{P}^\prime = \mathbf{T} \mathbf{P} \)

\[
\begin{pmatrix}
x^\prime\\
y^\prime\\
w^\prime\\
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & t_x\\
0 & 1 & t_y\\
0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
x\\
y\\
w\\
\end{pmatrix}
\]
\[ \Downarrow \]
\[
\begin{pmatrix}
x + w t_x\\
y + w t_y\\
w\\
\end{pmatrix}
\Rightarrow
\begin{pmatrix}
x / w + t_x\\
y / w + t_y\\
\end{pmatrix}
\]
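
With homogeneous coordinates, translation becomes a matrix product like the other transformations. The sketch below (with illustrative helper names) also shows that a vector, encoded with \( w=0 \), is left unchanged by a translation, while a point, encoded with \( w=1 \), is moved.

```python
def mat_vec(m, p):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in m]

def translation(tx, ty):
    """3x3 homogeneous translation matrix T."""
    return [[1.0, 0.0, tx],
            [0.0, 1.0, ty],
            [0.0, 0.0, 1.0]]

T = translation(5.0, -1.0)
point = [2.0, 3.0, 1.0]      # w = 1: a position
direction = [2.0, 3.0, 0.0]  # w = 0: a direction

print(mat_vec(T, point))      # [7.0, 2.0, 1.0]
print(mat_vec(T, direction))  # [2.0, 3.0, 0.0]
```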

Combining transformations

Simply by multiplying matrices
Every 2D affine transformation can be expressed as a matrix in homogeneous coordinates
Very general notation
Example for a translation followed by a rotation (the rightmost matrix is applied first):

\( \mathbf{M} = \mathbf{R}_\theta \mathbf{T} \)

Rotation around a point \( \mathbf{q} \), where \( \mathbf{T}_\mathbf{q} \) denotes the translation by the vector \( \mathbf{q} \):

Translate \( \mathbf{q} \) to the origin: \( \mathbf{T}_{-\mathbf{q}} \)
Rotate around the origin: \( \mathbf{R}_\theta \)
Translate back to \( \mathbf{q} \): \( \mathbf{T}_\mathbf{q} = \mathbf{T}_{-\mathbf{q}}^{-1} \)

Rotation of \( \mathbf{p} \) around \( \mathbf{q} \):
1 translate to the origin: \( \mathbf{T}_{-\mathbf{q}} \mathbf{p} \)
2 rotate around the origin: \( \mathbf{R}_\theta \mathbf{T}_{-\mathbf{q}} \mathbf{p} \)
3 translate back to \( \mathbf{q} \): \( \mathbf{T}_\mathbf{q} \mathbf{R}_\theta \mathbf{T}_{-\mathbf{q}} \mathbf{p} \)
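
The three steps (translate by \( -\mathbf{q} \), rotate, translate by \( +\mathbf{q} \)) can be combined into a single matrix. This is a minimal sketch with hypothetical helper names; the point, center and angle are arbitrary example values.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(m, p):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in m]

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_about(q, theta):
    """Rotation of angle theta around the point q, as one 3x3 matrix."""
    to_origin = translation(-q[0], -q[1])  # applied first (rightmost)
    back = translation(q[0], q[1])         # applied last (leftmost)
    return mat_mul(back, mat_mul(rotation(theta), to_origin))

q = (1.0, 1.0)
M = rotation_about(q, math.pi / 2)
print(mat_vec(M, [2.0, 1.0, 1.0]))  # close to [1.0, 2.0, 1.0]
print(mat_vec(M, [1.0, 1.0, 1.0]))  # q itself stays (close to) fixed
```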

3D transformations

Same principle
We use a fourth coordinate \( w \): \[ \begin{pmatrix} x \\ y\\ z \end{pmatrix} \rightarrow \begin{pmatrix} x \\ y\\ z\\ 1 \end{pmatrix} \equiv \begin{pmatrix} wx \\ wy\\ wz\\ w \end{pmatrix}, \quad w \neq 0 \]
2 points \( (x_1,y_1,z_1,w_1)^T \) and \( (x_2,y_2,z_2,w_2)^T \) are equal if and only if \( x_1/ w_1 = x_2 / w_2 \), \( y_1 / w_1 = y_2 / w_2 \) and \( z_1 / w_1 = z_2 / w_2 \)

3D scaling: \( \mathbf{P}^\prime = \mathbf{S} \mathbf{P} \)

\[
\mathbf{S}
=
\begin{pmatrix}
s_x & 0 & 0 & 0\\
0 & s_y & 0 & 0\\
0 & 0 & s_z & 0\\
0 & 0 & 0 & 1
\end{pmatrix}
\]

3D translation: \( \mathbf{P}^\prime = \mathbf{T} \mathbf{P} \)

\[
\mathbf{T}
=
\begin{pmatrix}
1 & 0 & 0 & t_x\\
0 & 1 & 0 & t_y\\
0 & 0 & 1 & t_z\\
0 & 0 & 0 & 1
\end{pmatrix}
\]

3D rotation: \( \mathbf{P}^\prime = \mathbf{R}_{x/y/z} \mathbf{P} \)

Depends on an axis and an angle

Simple expression around the \( x \), \( y \), and \( z \) axes
Other rotations are expressed as combinations of these simple rotations

For a direct representation, use quaternions

Invariant along the \( \mathbf{x} \) axis: \[ \mathbf{R}_x = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & \cos \theta & -\sin \theta & 0\\ 0 & \sin \theta & \cos \theta & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \]
Invariant along the \( \mathbf{y} \) axis: \[ \mathbf{R}_y = \begin{pmatrix} \cos \theta & 0 & \sin \theta & 0\\ 0 & 1 & 0 & 0\\ -\sin \theta & 0 & \cos \theta & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \]
Invariant along the \( \mathbf{z} \) axis: \[ \mathbf{R}_z = \begin{pmatrix} \cos \theta & -\sin \theta & 0 & 0\\ \sin \theta & \cos \theta & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \]

Every affine 3D transformation can be expressed as a combination of translations, scalings and rotations, and can therefore be represented by a \( 4 \times 4 \) matrix that uses homogeneous coordinates.

Multiplication of matrices is not commutative!
The order of transformations is important: \[ \mathbf{R}\mathbf{T} \neq \mathbf{T}\mathbf{R} \]
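
The effect of the order can be checked numerically. The sketch below, with illustrative helper names and arbitrary example values, applies a rotation around the \( z \) axis and a translation along \( x \) to the origin, in both orders.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(m, p):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in m]

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

R = rotation_z(math.pi / 2)
T = translation(1.0, 0.0, 0.0)
origin = [0.0, 0.0, 0.0, 1.0]

# RT: translate first, then rotate. The origin ends up close to (0, 1, 0).
print(mat_vec(mat_mul(R, T), origin))
# TR: rotate first, then translate. The origin ends up at (1, 0, 0).
print(mat_vec(mat_mul(T, R), origin))
```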

Transformations on complex objects

A complex object is defined as a combination of smaller objects
We want B and C to follow A when A moves
Solution: use relative coordinates:

position of B relative to A
position of C relative to B

Assuming that "draw A" draws a rectangle centered at the origin and aligned with the x-axis, and that \( \mathbf{T} \) is a translation by half the rectangle length, we can concatenate transformations:

\( \mathbf{M} \): initial matrix
\( \mathbf{M}_1 =\mathbf{M} \mathbf{R}_\alpha \mathbf{T} \)
draw A
\( \mathbf{M}_2 = \mathbf{M}_1 \mathbf{T} \mathbf{R}_\beta \mathbf{T} \)
draw B
\( \mathbf{M}_3 = \mathbf{M}_2 \mathbf{T} \mathbf{R}_\gamma \mathbf{T} \)
draw C1

Question: how can we come back to \( \mathbf{M}_2 \) to draw C2 and C3?
Solution: keep track of all the previous matrices using a stack:

\( \mathbf{M} \): initial matrix
pushmatrix()
\( \mathbf{M} = \mathbf{M} \mathbf{R}_\alpha \mathbf{T} \)
draw A
\( \mathbf{M} = \mathbf{M} \mathbf{T} \mathbf{R}_\beta \mathbf{T} \)
draw B
pushmatrix()
\( \mathbf{M} = \mathbf{M} \mathbf{T} \mathbf{R}_\gamma^1 \mathbf{T} \)
draw C1
popmatrix()
pushmatrix()
\( \mathbf{M} = \mathbf{M} \mathbf{T} \mathbf{R}_\gamma^2 \mathbf{T} \)
draw C2
popmatrix()
pushmatrix()
\( \mathbf{M} = \mathbf{M} \mathbf{T} \mathbf{R}_\gamma^3 \mathbf{T} \)
draw C3
popmatrix()
popmatrix()

Hierarchical representation of objects.
Reusable objects (scene graph)
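
The push/pop mechanism described above can be sketched with a small class. `MatrixStack` and its methods are hypothetical names (this is not the OpenGL API), the angles and the half-length are arbitrary example values, and the drawing calls are left as comments.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

class MatrixStack:
    def __init__(self):
        self.current = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        self._saved = []

    def push(self):
        # Save a copy of the current matrix.
        self._saved.append([row[:] for row in self.current])

    def pop(self):
        # Restore the most recently saved matrix.
        self.current = self._saved.pop()

    def apply(self, m):
        # Post-multiply, as in M = M R T.
        self.current = mat_mul(self.current, m)

T = translation(0.5, 0.0)  # half-length translation (example value)
stack = MatrixStack()
stack.push()
stack.apply(rotation(0.3)); stack.apply(T)                   # draw A here
stack.apply(T); stack.apply(rotation(0.2)); stack.apply(T)   # draw B here
matrix_for_b = [row[:] for row in stack.current]
for gamma in (-0.4, 0.0, 0.4):
    stack.push()
    stack.apply(T); stack.apply(rotation(gamma)); stack.apply(T)
    # draw C1, C2, C3 here using stack.current
    stack.pop()
print(stack.current == matrix_for_b)  # True: each pop restores the matrix of B
stack.pop()                           # back to the initial matrix
```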

The model matrix

This matrix is defined by any combination of the matrices seen above; it transforms object coordinates into world coordinates.

The view matrix

We want to define a virtual camera that comprises:

a camera position \( \mathbf{e} \)
an x-axis: the right vector \( \mathbf{r} \)
a y-axis: the up vector \( \mathbf{u} \)
a z-axis: the view vector \( \mathbf{v} \)

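The whole scene will thus be expressed in this coordinate system. The view matrix maps world coordinates to camera coordinates: its rows are the camera axes (the camera looks down its own negative z-axis), and its last column holds the negated camera position projected onto these axes:

\[
\mathbf{V}=
\begin{pmatrix}
r_x & r_y & r_z & -\mathbf{r} \cdot \mathbf{e}\\
u_x & u_y & u_z & -\mathbf{u} \cdot \mathbf{e}\\
-v_x & -v_y & -v_z & \mathbf{v} \cdot \mathbf{e}\\
0 & 0 & 0 & 1\\
\end{pmatrix}
\]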

The right, up and view vectors can be easily found using more intuitive parameters:

a camera position \( \mathbf{e} \)
the position \( \mathbf{c} \) of a reference point at which the camera is looking
an up-vector \( \mathbf{u}^\prime \)

The view vector \( \mathbf{v} \) is given by \( \mathbf{v} = (\mathbf{c}- \mathbf{e})/ |\mathbf{c}- \mathbf{e}|\)
The right vector \( \mathbf{r} \) is given by \( \mathbf{r} = (\mathbf{v} \times \mathbf{u}^\prime) / |\mathbf{v} \times \mathbf{u}^\prime| \)
The up vector \( \mathbf{u} \) is given by \( \mathbf{u} = \mathbf{r} \times \mathbf{v} \)
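
The construction of the camera frame from \( \mathbf{e} \), \( \mathbf{c} \) and \( \mathbf{u}^\prime \) can be sketched as follows. The function `look_at` is a hypothetical name; it assumes the usual OpenGL convention in which the camera looks down its negative z-axis, so the rows of the matrix are \( \mathbf{r} \), \( \mathbf{u} \) and \( -\mathbf{v} \), and the last column is the negated camera position expressed along these axes.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]

def mat_vec(m, p):
    return [sum(row[j] * p[j] for j in range(len(p))) for row in m]

def look_at(eye, center, up_hint):
    """World-to-camera matrix built from intuitive camera parameters."""
    v = normalize(sub(center, eye))   # view vector
    r = normalize(cross(v, up_hint))  # right vector
    u = cross(r, v)                   # true up vector (already unit length)
    return [[r[0], r[1], r[2], -dot(r, eye)],
            [u[0], u[1], u[2], -dot(u, eye)],
            [-v[0], -v[1], -v[2], dot(v, eye)],
            [0.0, 0.0, 0.0, 1.0]]

V = look_at([0.0, 0.0, 5.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0])
# The world origin lies 5 units in front of the camera: z is close to -5.
print(mat_vec(V, [0.0, 0.0, 0.0, 1.0]))
# The camera position itself maps to the origin of the camera frame.
print(mat_vec(V, [0.0, 0.0, 5.0, 1.0]))
```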

The projection matrix


Orthographic projection: object sizes do not change, parallel lines are kept parallel
Perspective projection: far objects are smaller, similar to the human eye

Orthographic projection

Simplest ones: the 3 common views for technical drawings:

Side view: \[ \mathbf{P}_x = \begin{pmatrix} 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \]
Top view: \[ \mathbf{P}_y = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \]
Front view: \[ \mathbf{P}_z = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \]

More generally, the orthographic matrix can be defined using a 6-tuple (left,right,bottom,top,near,far), which defines the clipping planes:

\[
\mathbf{P}
=
\begin{pmatrix}
\frac{2}{right-left} & 0 & 0 & -\frac{right+left}{right-left}\\
0 & \frac{2}{top-bottom} & 0 & -\frac{top+bottom}{top-bottom}\\
0 & 0 & \frac{-2}{far-near} & -\frac{far+near}{far-near}\\
0 & 0 & 0 & 1
\end{pmatrix}
\]

This matrix applies a translation, followed by a scaling, to map the view volume to a cube centered at the origin, with its minimum (resp. maximum) corner at \( (-1,-1,-1) \) (resp. \( (1,1,1) \) ).
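
The orthographic matrix can be checked on the corners of the view volume. This sketch uses a hypothetical function `orthographic` and arbitrary example bounds; it assumes the camera looks down the negative z-axis, so the near and far planes are at \( z=-near \) and \( z=-far \).

```python
def mat_vec(m, p):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in m]

def orthographic(left, right, bottom, top, near, far):
    """4x4 orthographic projection matrix from the 6 clipping planes."""
    return [[2.0 / (right - left), 0.0, 0.0, -(right + left) / (right - left)],
            [0.0, 2.0 / (top - bottom), 0.0, -(top + bottom) / (top - bottom)],
            [0.0, 0.0, -2.0 / (far - near), -(far + near) / (far - near)],
            [0.0, 0.0, 0.0, 1.0]]

P = orthographic(-4.0, 4.0, -3.0, 3.0, 1.0, 11.0)
# The corner (left, bottom, -near) maps to (-1, -1, -1).
print(mat_vec(P, [-4.0, -3.0, -1.0, 1.0]))
# The corner (right, top, -far) maps to (1, 1, 1).
print(mat_vec(P, [4.0, 3.0, -11.0, 1.0]))
```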

Perspective projection

\[
\mathbf{P}
=
\begin{pmatrix}
\frac{2 near}{right-left} & 0 & \frac{right+left}{right-left} & 0\\
0 & \frac{2 near}{top-bottom} & \frac{top+bottom}{top-bottom} & 0\\
0 & 0 & \frac{-(far+near)}{far-near} & \frac{-2 far near}{far-near}\\
0 & 0 & -1 & 0
\end{pmatrix}
\]

Usually defined with more intuitive parameters:

\[
\mathbf{P}
=
\begin{pmatrix}
\frac{f}{aspect} & 0 & 0 & 0\\
0 & f & 0 & 0\\
0 & 0 & \frac{far+near}{near-far} & \frac{2 far near}{near-far}\\
0 & 0 & -1 & 0
\end{pmatrix}
\],
where \( f = 1/ \tan (\frac{fovy}{2}) \) and \( aspect = \frac{w}{h} \)
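
The perspective matrix built from \( fovy \), \( aspect \), \( near \) and \( far \) can be sketched as follows. The names `perspective` and `project` are illustrative; `project` performs the division by \( w \) that turns homogeneous coordinates back into normalized coordinates, and the parameter values are arbitrary examples.

```python
import math

def mat_vec(m, p):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in m]

def perspective(fovy, aspect, near, far):
    """4x4 perspective matrix; fovy is the vertical field of view in radians."""
    f = 1.0 / math.tan(fovy / 2.0)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
            [0.0, 0.0, -1.0, 0.0]]

def project(m, point):
    """Apply the projection to a 3D point, then divide by w."""
    x, y, z, w = mat_vec(m, [point[0], point[1], point[2], 1.0])
    return [x / w, y / w, z / w]

P = perspective(math.radians(60.0), 16.0 / 9.0, 0.1, 100.0)
print(project(P, [0.0, 0.0, -0.1]))    # on the near plane: z close to -1
print(project(P, [0.0, 0.0, -100.0]))  # on the far plane: z close to +1
```

After the division by \( w \), farther points end up closer to the center of the image, which is what makes far objects look smaller.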

The viewport

A normalized (projected) point \( \mathbf{x}_e= (x_{e},y_e)^T \) still has to be transformed to be displayed in the screen coordinate system:
\[
\begin{pmatrix}
x_w\\
y_w\\
\end{pmatrix}
=
\begin{pmatrix}
x_0 + \frac{x_e+1}{2}w\\
y_0 + \frac{y_e+1}{2}h\\
\end{pmatrix}
\],
where \( (x_0, y_0)^T \) is the screen origin, and \( w, h \) are the screen width and height.
Question: what about the \( z \) coordinate?
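
The viewport formula for the \( x \) and \( y \) coordinates can be sketched directly. The function name `viewport` and the 800 by 600 screen are example choices.

```python
def viewport(xe, ye, x0, y0, width, height):
    """Map normalized coordinates in [-1, 1] to screen coordinates."""
    xw = x0 + (xe + 1.0) / 2.0 * width
    yw = y0 + (ye + 1.0) / 2.0 * height
    return xw, yw

print(viewport(0.0, 0.0, 0, 0, 800, 600))    # (400.0, 300.0): screen center
print(viewport(-1.0, -1.0, 0, 0, 800, 600))  # (0.0, 0.0): screen origin
print(viewport(1.0, 1.0, 0, 0, 800, 600))    # (800.0, 600.0): opposite corner
```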

From a point to the screen

The screen position of a 3D point is obtained by simply combining transformations:

The modelview matrix transforms a scene from object coordinates to eye coordinates:

Given \( \mathbf{M} \), the model matrix, and \( \mathbf{V} \), the view matrix, the modelview matrix is simply obtained by \( \mathbf{V} \mathbf{M} \)

The modelview-projection matrix transforms a scene from object coordinates to clip coordinates:

Given \( \mathbf{M} \), the model matrix, \( \mathbf{V} \), the view matrix, and \( \mathbf{P} \) the projection matrix, the modelview-projection matrix is simply obtained by \( \mathbf{P} \mathbf{V} \mathbf{M} \)
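
The whole chain, from a point in object coordinates to a pixel position, can be sketched as follows. All helper names and numeric values are illustrative assumptions: the model matrix is a simple translation, the camera sits at \( (0,0,5) \) and looks down the negative z-axis (so the view matrix is a translation by \( (0,0,-5) \)), and the viewport is 600 by 600 pixels.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(m, p):
    return [sum(row[j] * p[j] for j in range(len(p))) for row in m]

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def perspective(fovy, aspect, near, far):
    f = 1.0 / math.tan(fovy / 2.0)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
            [0.0, 0.0, -1.0, 0.0]]

def to_screen(p_obj, model, view, proj, x0, y0, width, height):
    """Object coordinates -> clip coordinates -> normalized -> screen."""
    mvp = mat_mul(proj, mat_mul(view, model))  # P V M
    xc, yc, zc, wc = mat_vec(mvp, [p_obj[0], p_obj[1], p_obj[2], 1.0])
    xe, ye = xc / wc, yc / wc                  # division by w
    return (x0 + (xe + 1.0) / 2.0 * width,
            y0 + (ye + 1.0) / 2.0 * height)

model = translation(1.0, 0.0, 0.0)
view = translation(0.0, 0.0, -5.0)
proj = perspective(math.radians(90.0), 1.0, 0.1, 100.0)
# The object origin lands close to pixel (360, 300).
print(to_screen([0.0, 0.0, 0.0], model, view, proj, 0, 0, 600, 600))
```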
