Simple 3D scenes

You will be creating 2D images from 3D scenes. In order to create the image, the scene must be viewed from some location. So, there are three major components in 3D graphics: a 3D model, a virtual camera position, and the output 2D image.


Large 3D scenes often have thousands of models and many cameras, and must be managed in complex hierarchies. We will start with simple scenes using only vertices and triangles. While any kind of shape can be rendered to a 2D image, triangles are a natural starting point because they are the simplest bounded shape. Many other shapes can be approximated with a large number of triangles.

A triangle is the bounded region enclosed by three vertices. These are usually 3D vertices, but they can be 2D for 2D scenes. Vertices are often stored in one large array, with each vertex's position in the array serving as its unique ID. Triangles can then be formed by referring to these IDs.

For example, a 2D square with an area of 4, centered at the origin, could be built from two triangles and four vertices. Each vertex has a unique ID:

vertex ID    value (x, y)
0            -1, -1
1             1, -1
2             1,  1
3            -1,  1

The two triangles could then be built from these vertices:

triangle ID    value (vertex IDs)
0              0, 1, 2
1              0, 2, 3
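
The two tables above can be sketched in code as a pair of arrays. This is only an illustration, not a required layout: the names `vertices`, `triangles`, `triangle_area`, and `mesh_area` are made up for this example, and a vertex's ID is assumed to be its index in the list. Summing the triangle areas is a quick check that the two triangles really cover the square of area 4.

```python
# Vertex array: the index of each entry is its vertex ID.
vertices = [
    (-1.0, -1.0),  # ID 0
    (1.0, -1.0),   # ID 1
    (1.0, 1.0),    # ID 2
    (-1.0, 1.0),   # ID 3
]

# Each triangle is a tuple of three vertex IDs.
triangles = [
    (0, 1, 2),  # ID 0
    (0, 2, 3),  # ID 1
]


def triangle_area(a, b, c):
    """Area of a 2D triangle, from the cross product of two of its edges."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return abs(cross) / 2.0


def mesh_area(vertices, triangles):
    """Total area of all triangles, looking up each vertex by its ID."""
    return sum(
        triangle_area(*(vertices[i] for i in tri)) for tri in triangles
    )


print(mesh_area(vertices, triangles))
```

Note that vertices 0 and 2 are each listed once but used by both triangles. Sharing vertices this way is the main reason to store IDs rather than repeating coordinates in every triangle.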


Most cameras are very complicated internally, requiring many lenses and complex focusing algorithms to function correctly. These lenses are difficult to simulate, so we will use a simpler model: the pinhole camera.

Pinhole cameras are closed boxes with film on one side and a small hole on the opposite side. Light enters through the hole and is projected onto the film. The field of view (FOV) is the solid angle of the scene that is visible to the camera. In a pinhole camera, the FOV is determined by the size of the film surface and the distance between the film and the pinhole.
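
The relationship between film size, pinhole distance, and field of view can be sketched with a little trigonometry. The function below is an illustration under a simplifying assumption: it computes the planar angle of view along one film dimension (width or height) rather than the full solid angle, and it assumes the pinhole sits directly opposite the center of the film. The names `field_of_view`, `film_size`, and `pinhole_distance` are chosen for this example.

```python
import math


def field_of_view(film_size, pinhole_distance):
    """Angle of view (radians) along one film dimension of a pinhole camera.

    Half of the film and the pinhole distance form a right triangle,
    so the half-angle is atan((film_size / 2) / pinhole_distance).
    """
    return 2.0 * math.atan(film_size / (2.0 * pinhole_distance))


# Film 2 units wide, 1 unit behind the pinhole: a 90 degree angle of view.
print(math.degrees(field_of_view(2.0, 1.0)))
```

Moving the film closer to the pinhole, or using larger film, widens the angle. Moving the film farther away narrows it.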

Rendering pipeline

The camera projects the scene onto a recording surface. In graphics, this recording surface is the output 2D image, so the 3D scene needs to be projected to 2D. This process involves several transforms: one to the camera view, one to account for perspective, another to prepare for 2D projection, and the final 2D projection.
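
As a rough preview of the perspective step alone, here is a minimal sketch of how a pinhole camera maps a 3D point to 2D. It is not the full pipeline described above, and the conventions are assumptions made for this example: the pinhole is at the origin, the camera looks down the positive z axis, and the image plane sits at a hypothetical distance `image_distance` in front of the pinhole (which avoids the upside-down image a real pinhole camera produces on its film).

```python
def project_point(point, image_distance=1.0):
    """Project a 3D point onto the image plane of a pinhole at the origin.

    By similar triangles, a point at depth z lands on the image plane
    at (x, y) scaled by image_distance / z.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    scale = image_distance / z
    return (x * scale, y * scale)


# The same x and y at twice the depth lands half as far from the center.
print(project_point((2.0, 4.0, 2.0)))
print(project_point((2.0, 4.0, 4.0)))
```

Dividing by depth is what makes distant objects appear smaller in the output image.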

We'll look at these transforms in detail later. The goal of all of them is to produce a 2D output image.