Saturday, September 12, 2015

Speed-up raycasting with Octrees

In our last issue we had to wait 8 minutes to render a genie lamp and a sphere. For every pixel we asked whether any of the many triangles was intersected. This scene consisted of
  • 4 objects (1 lamp, 2 spheres and 1 plane)
  • 34,378 triangles (one sphere has no triangles; see the OBJ file for the counts)
  • 640x480 = 307,200 pixels
giving us more than 10.5 billion intersection queries for a single rendered image. A single CPU thread typically runs at a few GHz, which means it can calculate roughly 1-2 billion simple steps per second.

Nowadays a CPU can have multiple cores (mine has 4 cores with 2 threads each). I could use multithreading to calculate the color of, say, 8 pixels at the same time, but I suggest avoiding that complicated architecture, for three reasons:
  • It needs correct locking when several threads try to access the same memory
  • Some of you (beginners) might not be familiar with multithreading and locking
  • Debugging becomes much harder, and the algorithms get more difficult to read and analyze

Possible speed ups without multithreading

That said, we need another way to speed up raycasting (and raytracing). I found a few possible ways to accelerate our render process without multithreading or multiprocessing, but we will include only one of them in our framework (maybe I'll add the others later too). We will start with the most complex one and go step by step to the simpler methods.

Bounding sphere

This was my first solution: our objects store all their points in a list, and the bounding sphere places itself so that all points are enclosed by it (see the graph here). We already discussed ray/sphere intersections, and we know that the calculation is done with a single formula.

Only if the bounding sphere is hit by the ray do we search the list of stored triangles for the closest intersection. That way we avoid many, many queries if the ray isn't even close to the object.

The big disadvantage of bounding spheres is placing the center and radius correctly. The usual algorithms are approximations with an error margin; in other words, it is quite hard to calculate the perfect sphere with the smallest radius covering all points in question.

Bounding Box

Like the bounding sphere, it encloses all points in question. A box has 8 corner points and 6 faces, which means more queries to figure out whether the ray intersects the box, especially if the box is also rotated and scaled by transformations. Finding the correct sizes is also quite difficult.

Binary Space Partitioning (BSP)

A BSP separates a space into two subspaces, these subspaces into another two subspaces, and so on. The area in which to search for an object therefore gets smaller and smaller, partitioning points and triangles into certain regions.

The difficult part is implementing good separators.

Axis-Aligned Bounding Box (AABB)

This is the simpler version of the regular bounding box. As the name says, each face is aligned to one of the three axes x, y and z. The center is the average of the minimal and maximal values for each axis, and the differences of these values are the bounding sizes.

There are two ways to describe an AABB:
  • Using a point for minimal values and a point for maximal values
  • Using a center and a vector describing the (minimal) distance from the center to each face
The calculation for a ray/AABB intersection is also a lot simpler than that for an arbitrary bounding box, and it serves as the foundation of the data structure we will use in the framework.

Octree

Imagine an AABB. Now halve all three sizes (x, y and z). You get a box which fits into the original box eight times.

An Octree is a cube which contains 8 sub-cubes, and each sub-cube contains another 8 sub-cubes (64 sub-cubes on the second level), and so on. This gives us a hierarchy of nodes to store information. If the ray hits the big cube, we check which sub-cubes it hits and descend into ever smaller cubes. The number of candidate points/triangles gets smaller and smaller, so we don't have to look through the whole list.

This kind of data structure is called a partitioning tree. It separates data into subsections, helping us find the relevant data much faster than searching the whole list.
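To make this concrete, here is a minimal C++ sketch of how such a node could look. The names OctreeNode and Surface3D follow the classes described in the framework section below, but the exact members are my assumption, not the repository code.

// Minimal sketch of an octree node; member names are assumptions,
// not the exact layout of the repository's OctreeNode class.
#include <vector>

struct Vec3 { double x, y, z; };
struct Surface3D;                         // triangle, defined elsewhere in the framework

struct OctreeNode {
    Vec3 center;                          // center of this node's AABB
    Vec3 halfSize;                        // half extents along x, y and z
    OctreeNode* children[8];              // sub-cubes; null until they are needed
    std::vector<Surface3D*> triangles;    // only filled while this node is a leaf
    int depth;                            // used to enforce the maximum depth

    OctreeNode(Vec3 c, Vec3 h, int d) : center(c), halfSize(h), depth(d) {
        for (int i = 0; i < 8; ++i) children[i] = 0;
    }
};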

Almighty Octree

The most important thing about an axis-aligned bounding box is that it surrounds the whole object (e.g. a sphere). We do that by taking the min and max of x, y and z. To ensure that all points are really inside the box we enlarge the border by ε.
This alone would already improve our raycasting: only when the ray intersects the AABB do we go through the list of triangles and search for the closest one hit by the ray. But a sphere can have thousands or millions of triangles; we don't have time to look at all of them.

The advantage of this AABB is that we can halve the sizes along x, y and z and separate it into 8 smaller sub-AABBs. Here is what I mean:
These sub-AABBs can be divided into another 8 sub-AABBs and so on. The nice part is that we can decide when to create a sub-AABB (let's call them nodes, as with binary trees). If a node exists, then there is at least one piece of data to expect in it.

Unfortunately memory consumption grows exponentially with depth (on the order of 8^n nodes at depth n), so we can't go to infinite depths. Instead of saying a child node may hold only one triangle, we let a node hold a limited number of triangles. If that limit is exceeded, we split the stored triangles into the appropriate child nodes.

Sometimes a triangle exists in several sub-nodes (because of its size). Let's say an object has 5,000 triangles: the octree might store 5,010 triangle references or even more. In return we avoid cross-queries to other nodes of the same level.

Optionally you can make an octree a perfect cube of size a, where a is the maximum of width, height and depth, but I suggest giving it separate width, height and depth values instead of a single size.

Add triangle to Octree

My suggestion is the following algorithm (a code sketch follows after this list):
  • If the limit is not reached OR the node has reached max depth
    • Add the triangle to the list
    • If that triangle exceeds the limit and max depth is not reached
      • Query all stored triangles for intersection with the sub-nodes
      • Create a sub-node if there is none for that position
    • Tell the node that its limit is reached
  • Otherwise add the triangle to the sub-nodes (more than one fitting node is possible)
    • Create a sub-node if there is none for that position
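Here is a rough translation of that recipe into code, building on the OctreeNode sketch above. triangleIntersectsAABB stands for the triangle/AABB test discussed in the next subsection; childBox and the two constants are illustrative assumptions, not the repository's names, and the "limit reached" flag is replaced by simply checking whether children exist.

// Sketch of the insertion routine described above (not the repository code).
// Uses Vec3, Surface3D and OctreeNode from the sketch above.
#include <vector>

static const int    MAX_DEPTH      = 8;
static const size_t TRIANGLE_LIMIT = 16;

// Forward declaration: SAT-based triangle/AABB test from the next subsection.
bool triangleIntersectsAABB(const Surface3D& t, const Vec3& center, const Vec3& halfSize);

// Center and half size of the i-th octant (bit 0: x, bit 1: y, bit 2: z).
void childBox(const OctreeNode& node, int i, Vec3& c, Vec3& h) {
    h = { node.halfSize.x * 0.5, node.halfSize.y * 0.5, node.halfSize.z * 0.5 };
    c = { node.center.x + ((i & 1) ? h.x : -h.x),
          node.center.y + ((i & 2) ? h.y : -h.y),
          node.center.z + ((i & 4) ? h.z : -h.z) };
}

void insert(OctreeNode& node, Surface3D* tri) {
    bool isLeaf = true;
    for (int i = 0; i < 8; ++i) if (node.children[i]) isLeaf = false;

    if (isLeaf && (node.triangles.size() < TRIANGLE_LIMIT || node.depth == MAX_DEPTH)) {
        node.triangles.push_back(tri);        // limit not reached, or we may not go deeper
        return;
    }
    // Node has to be (or already is) split: push the new triangle and any
    // triangles still stored here down into every child whose box it touches.
    node.triangles.push_back(tri);
    std::vector<Surface3D*> pending;
    pending.swap(node.triangles);             // this node keeps no triangles anymore
    for (size_t k = 0; k < pending.size(); ++k) {
        for (int i = 0; i < 8; ++i) {
            Vec3 c, h;
            childBox(node, i, c, h);
            if (!triangleIntersectsAABB(*pending[k], c, h)) continue;
            if (!node.children[i])
                node.children[i] = new OctreeNode(c, h, node.depth + 1);
            insert(*node.children[i], pending[k]);   // may land in several children
        }
    }
}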

Triangle/AABB intersection test

The big question is: when does a triangle intersect an AABB? Tomas Möller suggested a very fast algorithm which answers our problem (paper). He uses the Separating Axis Theorem (SAT) to answer 3 questions in 13 tests:
  1. Is there at least one triangle point inside the AABB? (True if there is)
  2. Does the triangle's plane overlap the AABB? (True if it does)
  3. Is there a separating axis built from the triangle's edges and the box's axes? (False if there is)
I modified the algorithm so that instead of the overlap test (test 2) I check whether a triangle edge intersects the AABB.

Ray/AABB intersection test

This test is quite simple if you remember the ray/plane intersection test. We need 6 calculations using 2 points: the point with only the minimum values and the point with only the maximum values.

Once again, here is the formula for ray/plane intersection:

t = ((P0 - S) · n) / (v · n)

where t is the scalar distance along the ray with starting point S and direction v, P0 is one of the corner points of the AABB plane (Pmin or Pmax) and n is the AABB plane's normal. This normal is either (1,0,0), (0,1,0) or (0,0,1), depending on the plane in question.

Let's say we want to take the front plane; then n is (0,0,1) and we need Pmin. Our formula shortens to

t = (Pmin.z - S.z) / v.z

For the planes front, left and bottom we use Pmin, and for the planes back, right and top we use Pmax. This makes six calculations. Easy, isn't it?

With that we get six different values for t. Putting each into the ray equation gives a point which lies exactly on the plane, but is it within the boundary too? If you used the calculation for the front plane, you have to check whether the x and y values lie inside the AABB's face. You get the boundary values from Pmin and Pmax.

If both checks succeed, then the ray intersects the plane and therefore the whole AABB.
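The same idea is usually written as the "slab" test: intersect the ray with the two parallel planes per axis and keep the overlapping t-interval. Here is a sketch of that variant (a compact equivalent of the six plane checks, assuming a ray S + t·v and a box given by its min/max points):

#include <algorithm>
#include <limits>

struct Vec3 { double x, y, z; };

// Slab test: returns true if the ray S + t*v hits the box [bmin, bmax] for some t >= 0.
bool rayHitsAABB(const Vec3& S, const Vec3& v, const Vec3& bmin, const Vec3& bmax) {
    double tmin = 0.0;
    double tmax = std::numeric_limits<double>::max();
    const double o[3]  = { S.x, S.y, S.z };
    const double d[3]  = { v.x, v.y, v.z };
    const double lo[3] = { bmin.x, bmin.y, bmin.z };
    const double hi[3] = { bmax.x, bmax.y, bmax.z };

    for (int i = 0; i < 3; ++i) {
        if (d[i] == 0.0) {                      // ray parallel to this pair of planes
            if (o[i] < lo[i] || o[i] > hi[i]) return false;
            continue;
        }
        double t1 = (lo[i] - o[i]) / d[i];      // same formula as t = (Pmin - S) / v
        double t2 = (hi[i] - o[i]) / d[i];
        if (t1 > t2) std::swap(t1, t2);
        tmin = std::max(tmin, t1);
        tmax = std::min(tmax, t2);
        if (tmin > tmax) return false;          // intervals no longer overlap: miss
    }
    return true;
}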

Framework

As announced, I didn't change the object settings in Scene3D, so the rendering times stay comparable. But I used the time to make some structural improvements:
  • Scene3D:
    • Offers option to build Octrees for each object
    • All lists are std::maps now for Object3D, Light, Material and Texture
    • Uses "keys" to get objects, but can still be turned into std::vector
    • That way, objects created from OBJLoader can be transformed easily
  • Object3D:
    • Contains the new implemented Octree object linked to that object
    • Ray intersection checks whether there is an octree (not null)
    • If not, it goes through each Surface3D like before (octrees stay optional)
    • Now has an ID to better identify objects (e.g. when the Octree is updated)
  • Octree:
    • From Object3D instance calculates center and size of AABB
    • Creates OctreeNode as root node with prepared informations
  • OctreeNode:
    • Has 8 child nodes of type OctreeNode
    • Stores center, size and 8 childCenters as Vector3D
    • Contains a list of stored Surface3D while the limit or max depth is not reached
    • Static values for max depth and triangle limit, valid for all nodes and sub-nodes
  • OBJLoader:
    • Can now read .obj and .mtl files from different folders
    • saves file path so that it can find MTL file
  • main.cpp:
    • Added timer to see rendering time in seconds
Once again, let's summarize the previously rendered scene
  • 4 objects (1 lamp, 2 spheres and 1 plane)
  • 34,378 triangles (one sphere has no triangles; see the OBJ file for the counts)
  • 640x480 = 307,200 pixels
Time without octrees:
around 8 minutes

Time with octrees
7-8 seconds

To answer your question: I still don't use any graphics libraries and therefore no graphics card. BUT there is an Intel i7 inside (4 cores/8 threads at 2.4 GHz). So the result may differ from computer to computer.

Conclusion

We added a data structure to our framework that limits the number of queries per pixel, and we were able to cut the rendering time down to roughly a fiftieth (about 1:50). And we still didn't (and never will) add multithreading to the framework.

In the next issue we will build a somewhat nicer scene using external OBJ files (maybe your creations) and implement the long-awaited RAYTRACING, where we will simulate reflection (mirrors) and refraction (glass). We will also give a small introduction to shadowing to make everything a bit more realistic.

Well, see ya in the next issue.

Repository

Tested with:
  • Visual Studio 2013 (Windows 8.1) (in 7-8 seconds)
  • GNU Compiler 4.9.2 (Ubuntu 12.04) (in 13-14 seconds in a VM)

Sunday, August 23, 2015

Wavefront OBJ file format and Normal Mapping

Before going into more advanced render techniques I'd like to use more professional object designs than spheres, cubes or cylinders. Textures already make things look better, but we shouldn't keep such simple designs in our awesome framework. We won't be able to support every command of the file format, though.

Wavefront .obj file format

Wavefront OBJ is the simplest file format for storing 3D model information. Every line starts with a command followed by a bunch of numbers. With them we can store points, texture coordinates, normals and surfaces. Unfortunately light sources are not part of the file format.

I mark variables between <...> (i.e. <x> <y> <z>, <"name">). Quotes for strings and no quotes for numbers:

Comment "#"

Every line starting with a # symbol is a comment

Syntax:
# <"comment">

Example:
# This is a comment

Material library "mtllib"

A line starting with "mtllib" tells us that we want to include a material file (.mtl) into our object. This has the advantage to store ambient, diffuse, specular and texture informations into an extern file. In generall the filename of both .obj and .mtl files are the same (classroom.obj and classroom.mtl). Even whitespace is possible in a filename, so be careful

Syntax:
mtllib <"filepath">

Examples:
mtllib classroom.mtl
mtllib genie lamb 05.mtl

Object "o"

A line starting with "o" tells us that a new object will be created. All lines afterwards and until the next "o" will belong to this object. This allows us to have more than one object in one file.

Syntax:
o <"object name">

Examples:
o Plane
o Lamb01
o Chair

It may happen that there is no "o" at all, so we need to create a new object anyway.

Group "g"

A line starting with "g" describes that the following lines belong to that group. It won't create a new object, but it is more like they belong together.

Syntax:
g <"group name">

Examples:
g window
g front

Vertex "v"

Finally something that contains important information: "v" is a vertex, or in our case a point. They normally appear in several lines in a row, especially since we need at least 3 points to create a surface.

Syntax:
v <x> <y> <z>

Example:
v -1 2 5.2
v -0.000000 0.114194 -0.315641

Texture vertex "vt"

Same as "v", but uses u, v and optional w. (w is used for bump mapping, explained later). In most cases they are between 0 and +1 like explained in 2D texture mapping. They also appear  in several lines in a row.

Syntax:
vt <u> <v>
vt <u> <v> <w>

Example:
vt 0.450580 0.776692
vt 0.382673 0.517478 0.247539

Normal vertex "vn"

A vertex normal is normally used for normal interpolation on smooth surfaces, as in Gouraud shading and Phong shading. The trick is that each surface corner (optionally) stores a vertex normal. The normal at a certain point on the surface can then be interpolated the same way as texture coordinates, using u and v.

Syntax:
vn <x> <y> <z>

Material usage "usemtl"

Before we assign points and texture points to each surface we need to define the used material. Remember the line "mtllib"? Exactly: In this file we can find the material with the given name. (Material file syntax explained below).

Syntax:
usemtl <"material name">

Examples:
usemtl Material001
usemtl Rocky
usemtl gold

Smooth group "s"

This command puts all following surfaces into the same smoothing group, which allows smooth shading along their shared edges. There is also the option to set smooth shading "off".

Syntax:
s <1-32>
s off

Example:
s 7

Surface "f"

Now the use of those nasty pointers in our framework will make sense. Wavefront OBJ numbers each "v", "vt" and "vn" like a list. In other words, after reading all vertices, texture vertices and normal vertices we have 3 lists, each of them starting with index 1, NOT 0. A face can have 3 or 4 points, and for each there are 4 syntax variants to take care of.

Syntax with vertex alone:
f <v1> <v2> <v3>
f <v1> <v2> <v3> <v4>

Syntax with vertex and texture:
f <v1>/<vt1> <v2>/<vt2> <v3>/<vt3>
f <v1>/<vt1> <v2>/<vt2> <v3>/<vt3> <v4>/<vt4>

Syntax with vertex, texture and normal:
f <v1>/<vt1>/<vn1> <v2>/<vt2>/<vn2> <v3>/<vt3>/<vn3>
f <v1>/<vt1>/<vn1> <v2>/<vt2>/<vn2> <v3>/<vt3>/<vn3> <v4>/<vt4>/<vn4>

Syntax with vertex and normal:
f <v1>//<vn1> <v2>//<vn2> <v3>//<vn3>
f <v1>//<vn1> <v2>//<vn2> <v3>//<vn3> <v4>//<vn4>

Examples:
f 146/27 8405/32 8406/33 147/28
f 146/27/8 8405/32/8 8406/33/8 147/28/8
f 1//6 4//6 8//6
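To give an idea of the parsing work hidden inside OBJLoader, here is a small illustrative sketch that splits one "f" entry of the form v, v/vt, v/vt/vn or v//vn into its indices. The function name is my own; it doesn't handle negative (relative) indices and isn't the repository's parser.

#include <sstream>
#include <string>
#include <vector>

// One corner of a face: indices into the v / vt / vn lists (1-based, 0 = absent).
struct FaceIndex { int v, vt, vn; };

// Parses a line like "f 146/27/8 8405/32/8 8406/33/8" (3 or 4 corners).
std::vector<FaceIndex> parseFaceLine(const std::string& line) {
    std::vector<FaceIndex> corners;
    std::istringstream in(line);
    std::string token;
    in >> token;                                  // skip the leading "f"
    while (in >> token) {
        FaceIndex fi = { 0, 0, 0 };
        size_t first  = token.find('/');
        size_t second = (first == std::string::npos) ? std::string::npos
                                                     : token.find('/', first + 1);
        fi.v = std::stoi(token.substr(0, first));
        // "v//vn" leaves an empty vt field, which stays 0.
        if (first != std::string::npos && second != std::string::npos && second != first + 1)
            fi.vt = std::stoi(token.substr(first + 1, second - first - 1));
        else if (first != std::string::npos && second == std::string::npos)
            fi.vt = std::stoi(token.substr(first + 1));
        if (second != std::string::npos && second + 1 < token.size())
            fi.vn = std::stoi(token.substr(second + 1));
        corners.push_back(fi);
    }
    return corners;                               // 4 corners -> split into two triangles
}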

Wavefront .mtl file format

In most cases an OBJ file comes with an MTL (material) file containing color and other information we described in our Material class. As in OBJ, every line starts with a command.

Comment "#"

I already described the "#" command, so let's skip it.

Material definition "newmtl"

This command tells us to create a new material object with the given name.

Syntax:
newmtl <"material name">

Examples:
newmtl Rocky
newmtl Material001

I used "Rocky" on purpose (see above ^^).

Ambient color "Ka"

The first component of the Phong illumination model. The command is "Ka" followed by 3 floating-point numbers from 0 to 1 representing red, green and blue.

Syntax:
Ka <r> <g> <b>

Example:
Ka 0.000000 0.000000 0.000000

Diffuse color "Kd"

The second component of the Phong illumination model. Same format as the ambient color.

Syntax:
Kd <r> <g> <b>

Example:
Kd 0.790599 0.552374 0.126178

Specular color "Ks"

The third component of the Phong illumination model. Same format as the ambient and diffuse colors.

Syntax:
Ks <r> <g> <b>

Example:
Ks 1.909807 1.671032 0.369581

Specular exponent "Ns"

Describes the specular exponent that controls the tightness of the highlight. After the command "Ns" there is a single floating-point number from 0 to 1000.

Syntax:
Ns <exponent>

Example:
Ns 109.803922

Refraction coefficient "Ni"

Important for raytracing (explained later). If you are a physicist, you know that transparent objects (e.g. glass and water) have a refraction index which bends light rays when they enter the surface. It is not important right now how to calculate it.

Syntax:
Ni <refraction coefficient>

Example (no refraction):
Ni 1.000000

Dissolve factor or transparency "d"

Speaking of transparent objects like glass: this command tells us how much of the object's own color you will see (1.0 means fully opaque).

Syntax:
d <dissolve factor>

Example:
d 1.000000

Illumination model "illum"

Not really something you can figure out yourself: MTL files define around 10 different illumination models.

Syntax:
illum <setting>

Example (Highlight on):
illum 2

A list of possible illumination models can be found here.

Mapping "map_" + option

There is also the possibility of not just using the material color but sampling a texture at the given point for each component (ambient, diffuse and specular) color. That way we are more flexible in designing realistic scenes.

Syntax:
map_Kd <"texture file">
map_Ks 
<"texture file">
map_Ka 
<"texture file">
map_Bump 
<"texture file">
map_d 
<"scalar texture file">

Our problem is that most referenced texture files are PNG, JPEG or TGA; supporting all these formats is too much for our simple framework. We have to figure out ourselves which image is used for texture mapping and add it to the framework by hand.

Reflection type "refl"

Also something we need for raytracing. This one is quite difficult to figure out because there seems to be more than one possible syntax. I believe it is aimed at reflection mapping.

Syntax:
refl -type <sphere/cube_"side"> <"filename">
refl -type <sphere/cube_"side"> -options <"option"> -args <"args"> <"filename">

Example:
refl -type sphere chrome.rla

Normal vector transformation

Now that we know the contents of an OBJ file we need to update our surface class. It now contains:
  • 3x Vertex points
  • 3x Texture points
  • 3x Normal vectors
Normal vectors only give us a direction; they must not change when the surface is translated. You should treat normals like planes (see here). To transform a normal correctly we calculate

n' = (M^-1)^T · n

where n is the normal and M is the transformation matrix, here inverted and transposed.

It took me some time to figure this out myself, so I'll show you how I do it. In 3D we normally work with a 4x4 matrix looking like this:

M4x4 = | m00 m01 m02 tx |
       | m10 m11 m12 ty |
       | m20 m21 m22 tz |
       |  0   0   0   1 |

From this transformation matrix we only need the components responsible for scale and rotation, not for translation. We cut off the 4th row and 4th column, getting a 3x3 matrix M.

M = | m00 m01 m02 |
    | m10 m11 m12 |
    | m20 m21 m22 |

From this we first calculate the inverse matrix and afterwards the transpose. We can then either multiply this matrix with a 3D vector directly OR turn M back into a 4x4 matrix for a homogeneous matrix transformation.
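Here is a small sketch of that step for a plain 3x3 matrix stored as double[3][3]. The row-major layout and the function name are my assumptions; the inverse is computed from the cofactors, and the final transpose folds into the same pass.

#include <cmath>
#include <stdexcept>

// Builds the "normal matrix" N = (M^-1)^T for a 3x3 row-major matrix m.
// inverse(m) = cofactor(m)^T / det, so (inverse(m))^T is simply cofactor(m) / det.
void normalMatrix(const double m[3][3], double n[3][3]) {
    double det = m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
               - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
               + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
    if (std::fabs(det) < 1e-12) throw std::runtime_error("matrix not invertible");

    n[0][0] =  (m[1][1] * m[2][2] - m[1][2] * m[2][1]) / det;
    n[0][1] = -(m[1][0] * m[2][2] - m[1][2] * m[2][0]) / det;
    n[0][2] =  (m[1][0] * m[2][1] - m[1][1] * m[2][0]) / det;
    n[1][0] = -(m[0][1] * m[2][2] - m[0][2] * m[2][1]) / det;
    n[1][1] =  (m[0][0] * m[2][2] - m[0][2] * m[2][0]) / det;
    n[1][2] = -(m[0][0] * m[2][1] - m[0][1] * m[2][0]) / det;
    n[2][0] =  (m[0][1] * m[1][2] - m[0][2] * m[1][1]) / det;
    n[2][1] = -(m[0][0] * m[1][2] - m[0][2] * m[1][0]) / det;
    n[2][2] =  (m[0][0] * m[1][1] - m[0][1] * m[1][0]) / det;
}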

Phong interpolation

We can set up 3 (equal or different) normals (n0, n1 and n2) at the surface's corners, giving us the opportunity to manipulate the surface normal as we please. That way we can make a surface look smoother (like a sphere).

We are already able to use barycentric coordinates properly: we use them to test whether a point is inside a surface and to calculate texture coordinates for UV mapping. We will now do the same calculation for normals.

From the barycentric coordinates we get α, β and γ. Our normal at point P is

n(P) = normalize(α·n0 + β·n1 + γ·n2)

With that there is no need to fake an object's normals like I did in the previous version. We can also scale a sphere as we please and it will keep the correct normals for interpolation. This is important when working with customized vertex normals.
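A minimal sketch of that interpolation, assuming the barycentric coordinates α, β and γ of the hit point are already known from the intersection test:

#include <cmath>

struct Vec3 { double x, y, z; };

// Phong normal interpolation: blend the three corner normals with the
// barycentric coordinates of the hit point and renormalize the result.
Vec3 interpolateNormal(const Vec3& n0, const Vec3& n1, const Vec3& n2,
                       double alpha, double beta, double gamma) {
    Vec3 n = { alpha * n0.x + beta * n1.x + gamma * n2.x,
               alpha * n0.y + beta * n1.y + gamma * n2.y,
               alpha * n0.z + beta * n1.z + gamma * n2.z };
    double len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    n.x /= len; n.y /= len; n.z /= len;
    return n;
}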

Normal Mapping

With normal mapping we are able to simulate relief on a surface without manipulating the object's geometry. In other words, a sphere stays a sphere for the render engine, but the illumination gets a different normal than usual.

In addition to a color map (a.k.a. texture) we have, for instance, a picture of the earth and its normal map (source):




This normal map stores a surface normal as RGB values from 0 to 255, which for us means 0 to 1. But vectors can also point in negative directions, which is why we have to remap the range 0..1 to -1..+1. R is x, G is y and B is z. The map is defined relative to a general normal of (0,0,1).

As you can see, the normal map is mostly blue: normals usually don't point backwards, so the blue channel of this image is always above 0.5.

On our surface (e.g. a sphere) we only know the normal. We must find a way to get 2 additional vectors so that we can form something like a look-at matrix (remember my post about 3D viewing). These 2 new vectors are called tangent and bitangent (like the right and up vectors of a look-at matrix). A way to calculate these two can be found on this website.

We have to prepare some values to get tangent:

  • Surface's edges (p1-p0 and p2-p0) in x,y,z
  • texture edges (t1-t0 and t2-t0) in u,v

You can also calculate just the tangent; the bitangent is then the cross product of the normal and the tangent. With that we can finally build our transformation matrix, either as 3x3 or as 4x4 (homogeneous).

IMPORTANT: Invert and transpose the matrix M when it is applied to normals, which is exactly the case in normal mapping.
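The tangent calculation from the linked page boils down to solving a small 2x2 system with the surface edges and the UV edges. Here is a sketch under that assumption (p0..p2 are the corner positions, t0..t2 their texture coordinates); the exact class and function names are illustrative.

#include <cmath>

struct Vec3 { double x, y, z; };
struct Vec2 { double u, v; };

static Vec3 sub(Vec3 a, Vec3 b)     { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3 scale(Vec3 a, double s) { return { a.x * s, a.y * s, a.z * s }; }
static Vec3 cross(Vec3 a, Vec3 b)   { return { a.y * b.z - a.z * b.y,
                                               a.z * b.x - a.x * b.z,
                                               a.x * b.y - a.y * b.x }; }

// Tangent from the surface edges and the texture (UV) edges; bitangent as
// normal x tangent, as described above.
void tangentSpace(Vec3 p0, Vec3 p1, Vec3 p2, Vec2 t0, Vec2 t1, Vec2 t2,
                  Vec3 normal, Vec3& tangent, Vec3& bitangent) {
    Vec3 e1 = sub(p1, p0), e2 = sub(p2, p0);              // surface edges
    double du1 = t1.u - t0.u, dv1 = t1.v - t0.v;          // texture edges
    double du2 = t2.u - t0.u, dv2 = t2.v - t0.v;
    double f = 1.0 / (du1 * dv2 - du2 * dv1);             // assumes non-degenerate UVs
    tangent = scale(sub(scale(e1, dv2), scale(e2, dv1)), f);
    bitangent = cross(normal, tangent);
    // tangent, bitangent and normal form the columns of the TBN matrix that
    // maps the normal-map vector into the surface's space.
}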

Previously our images looked like this: very boring, flat, and with a specular highlight.


And that's how the same sphere with the same lighting and the same view looks with a normal map:


As you can see, some areas on the dark side of the earth are still bright. That's because, according to Phong's illumination model, the new normal is still at a small angle to the light source. We can avoid this by adding hard shadows.
Honestly, I almost fell from my chair when I saw this picture for the first time after running the framework.

Framework

Once again, I made some minor changes to the framework:
  • Surface3D
    • Added Texture object normalMap
    • interpolates normals with and without normal map
  • Polygon3D
    • Derived from Object3D
    • Is able to add extern created points and surfaces
  • OBJLoader
    • Text parser to read OBJ and MTL files
    • Only works with an existing Scene3D object
    • Splits each line into blocks (separated by space ' ')
    • Freely extendable
    • Can read "f" commands with 3 (one triangle) and 4 (two triangles) parameters
    • Saves materials, objects and surfaces immediately into Scene3D
  • Scene3D
    • Method to load obj files from OBJLoader object
    • Method to add Object3D and Material objects
I first intended to add a bounding sphere algorithm to the framework, but I left that aside because this issue should focus on loading OBJ files and normal mapping. It also demonstrates how slow raycasting really is without such a structure.

Preparations for OBJ and MTL files

There are several 3D modeling programs on the web. It is up to you which one you choose, as long as it can export your creations to OBJ files. Blender's homepage offers many demo files you can export to OBJ.

For our framework, you need to consider the following options (i.e. in Blender v2.73)
  • What is IMPORTANT
    • Forward settings: +Z-Forward
    • Up settings: +Y-Up
    • Objects as OBJ objects (not group)
    • Write Materials
  • What is OPTIONAL
    • Include UVs (for texture mapping)
    • Triangulate Faces (3 instead of 4 points, file would be larger)
    • Write Normals (for correct phong interpolation)
Settings for File -> Export -> Wavefront (.obj) in Blender v2.73

Output

This Image was rendered in 8 minutes with size 640x480. It loads file "genie_lamp_05.obj" and creates a sphere with texture "earth.ppm" and normal map "earth_normalmap.ppm"

Left: an object loaded from "genie_lamp_05.obj" (blender file source)
Right: a sphere with a normal-mapped earth surface (texture and normal map source)

Conclusion

We are now able to load external OBJ files from modern 3D modeling programs like Blender, Maya and AutoCAD. But we are facing the harsh reality that our framework becomes slower and slower without better data structures.

In the next issue we will improve the OBJLoader so that it can use external folders, and we will talk about a special data structure that speeds up raycasting and the later render algorithms. The data structure I think fits best is the octree; I will change the structure a bit so that a sub-octree can contain multiple leaves (or in our framework: Surface3Ds). Scene3D will also see some changes, turning all std::vector into std::map.

I'm doing this before raytracing because I don't want to wait 8 minutes every time I want to test the framework. Having said that, see you in the next issue.

Repository

Tested with:
  • Visual Studio 2013 (Windows 8.1)
  • GNU Compiler 4.9.2 (Ubuntu 12.04)

Thursday, August 13, 2015

Raycasting and Phong Shading

We took our time and stayed with just one render technique and a simple shading model. This time we leave the complex transformation calculations behind and move on to a more photo-realistic rendering and shading algorithm.

Raycasting

Raycasting is the precursor of raytracing. It simply shoots several rays through a plane and runs intersection queries against every surface in our scene. Each ray returns the modified color of the closest surface found; with modified I mean after going through a shading algorithm.

The number of rays depends on the number of pixels we need to render; in other words, the number of rays equals the size of our pixel and depth buffers. All rays start at the camera position. After the transformation into view space the camera sits at (0,0,0), and the plane the rays are shot through is the near plane we discussed in the 3D viewing section.



The size of the plane can be derived from our viewport values: fov, screen width and height, and the near plane. You can see the corner points of that view plane in the figure above. But we don't want the color at a pixel's corner; we want the one where the ray goes exactly through the center of the pixel (marked as black dots).

The best way to do so is to calculate the distance from one pixel to the next. That means we divide the side lengths of the view plane by the number of pixels on each side:

step_x = (xmax - xmin) / width,   step_y = (ymax - ymin) / height
After that we need a starting point. Using the value of the top-left corner for x = 0 and y = 0 we would hit exactly the corner of the pixel, which we DON'T WANT. We have to add half a step; then we hit exactly the center of the intended pixel. We also mustn't forget to mirror the y coordinates.

V.x = xmin + (x + 0.5) · step_x
V.y = ymax - (y + 0.5) · step_y
V.z = near

where x and y are pixel coordinates from 0 to width-1 and from 0 to height-1, and V is the point on the view plane.

By the way, the near value can be skipped, which means the view plane sits 1 unit in front of the camera. You only need near if you don't want to draw pixels outside of the viewport.

The last problem we need to fix is the differing length of each ray. We would get problems with our depth buffer if 1 unit had a different length per pixel (like mixing up meters and yards for different pixels in the depth buffer). So we normalize the direction vector of the ray. Our ray formula looks like this:

P(t) = S + t · v,   with v = (V - S) / |V - S|

In the case of raycasting, the start point S is the zero point (0,0,0); that's where our camera sits in view space.
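Putting the last few formulas together, generating the ray through pixel (x, y) could look like this. A sketch only: the view-plane borders and the function name are assumptions, derived from the viewport/fov values discussed in the 3D viewing post.

#include <cmath>

struct Vec3 { double x, y, z; };
struct Ray  { Vec3 start, dir; };

// Builds the view-space ray through the center of pixel (x, y).
// left/right/bottom/top describe the near plane, nearZ its distance to the camera.
Ray buildRay(int x, int y, int width, int height,
             double left, double right, double bottom, double top, double nearZ) {
    double stepX = (right - left) / width;        // size of one pixel on the plane
    double stepY = (top - bottom) / height;
    Vec3 V = { left + (x + 0.5) * stepX,          // +0.5: hit the pixel center
               top  - (y + 0.5) * stepY,          // y is mirrored (screen y grows downwards)
               nearZ };
    double len = std::sqrt(V.x * V.x + V.y * V.y + V.z * V.z);
    Ray r;
    r.start = { 0.0, 0.0, 0.0 };                  // camera sits at the origin in view space
    r.dir   = { V.x / len, V.y / len, V.z / len };// normalized, so t is a real distance
    return r;
}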

Ray Intersections

Rays have to do lots of intersection calculations to figure out whether they actually "see" a surface. More precisely, there are width * height * (number of surfaces) calculations to be done. That is why raycasting, and later raytracing, is not very efficient.

Ray-Plane-Intersection

We were talking about surfaces most of the time, so why is this subsection called ray-plane intersection? Because our triangulated surface lies in a plane. The plane equation is

Ax + By + Cz + D = 0,   with (A, B, C) = N and D = -(N · P0)

where N is the normal vector of the surface, P0 is a corner point of the plane (surface) and P = (x, y, z) is an unknown point on the plane. A, B and C are the normal's x, y and z values and D is the negative dot product of the normal and a corner point.

Replacing x, y and z with the ray formula we can calculate the distance t:

t = -(N · S + D) / (N · v)

As long as N and v are not perpendicular (which would mean the ray's direction is parallel to the plane) we will always hit the plane. Putting t into our ray equation gives us the hit point, but one thing still has to be done: there is a chance that the hit point is NOT inside the triangle.


There are a few ways to figure out whether the point is inside or outside the triangle: a same-side test or the use of barycentric coordinates. Both methods are described here. In my eyes the same-side test takes longer, so I suggest using barycentric coordinates. How to calculate them was already explained in my previous post on 2D texture mapping.

I suggest using the Möller-Trumbore intersection algorithm, which works directly on a given ray. With this method we can calculate the barycentric coordinates and the distance t directly and check whether the conditions for a point inside a triangle hold.
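For completeness, here is a compact sketch of the Möller-Trumbore test. It returns the distance t and the two barycentric coordinates u and v (the third one is 1 - u - v); the helper names are my own.

#include <cmath>

struct Vec3 { double x, y, z; };
static Vec3 sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) { return { a.y * b.z - a.z * b.y,
                                             a.z * b.x - a.x * b.z,
                                             a.x * b.y - a.y * b.x }; }

// Moeller-Trumbore ray/triangle intersection (sketch).
// Returns true and fills t, u, v if the ray S + t*dir hits triangle (p0, p1, p2).
bool intersectTriangle(Vec3 S, Vec3 dir, Vec3 p0, Vec3 p1, Vec3 p2,
                       double& t, double& u, double& v) {
    const double EPS = 1e-9;
    Vec3 e1 = sub(p1, p0), e2 = sub(p2, p0);
    Vec3 p = cross(dir, e2);
    double det = dot(e1, p);
    if (std::fabs(det) < EPS) return false;       // ray parallel to the triangle plane
    double invDet = 1.0 / det;
    Vec3 s = sub(S, p0);
    u = dot(s, p) * invDet;
    if (u < 0.0 || u > 1.0) return false;         // outside the triangle
    Vec3 q = cross(s, e1);
    v = dot(dir, q) * invDet;
    if (v < 0.0 || u + v > 1.0) return false;
    t = dot(e2, q) * invDet;
    return t > EPS;                               // hit must lie in front of the start point
}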

Ray-Sphere-Intersection

Even a sphere is built from multiple triangles, so why is there an extra formula for spheres? That's simple: the sphere is not only here for its own sake, but also to group several surfaces inside one sphere. Only if the sphere is hit by the ray do we calculate possible intersections with the surfaces inside it.

3 ways how a ray intersects with a sphere

The sphere equation is

(x - C.x)² + (y - C.y)² + (z - C.z)² = r²

where C is the sphere's center. Replacing x, y and z with the ray equation we get

(S + t·v - C) · (S + t·v - C) = r²

where S is the ray's starting point and v is the ray's direction vector. The formula is a bit long, so we introduce the vector from the ray's starting point to the sphere's center, d = C - S:

(t·v - d) · (t·v - d) = r²
Because we want t, we group all terms by powers of t and get the following quadratic equation:

(v · v)·t² - 2·(v · d)·t + (d · d - r²) = 0

or in short

a·t² + b·t + c = 0,   with a = v · v,  b = -2·(v · d),  c = d · d - r²

All that's left is to solve this quadratic equation:

t = (-b ± sqrt(b² - 4ac)) / (2a)

The formula looks quite long, but we have the option to normalize v. That way the length of v is 1 and the factor v · v drops out; the common factor 2 also cancels, getting rid of the fraction:

t = (v · d) ± sqrt((v · d)² - d · d + r²)
The sign of the expression under the square root (not the square root itself) decides which case we have (see figure above).

  • result > 0 => 2 hits (take the closest one, return true)
  • result = 0 => 1 hit (return true)
  • result < 0 => 0 hits (return false)
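The same solution in code, assuming a normalized direction vector v as discussed above (function and helper names are illustrative):

#include <cmath>

struct Vec3 { double x, y, z; };
static Vec3 sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Ray/sphere test with normalized direction v; returns the closest positive t.
bool intersectSphere(Vec3 S, Vec3 v, Vec3 C, double r, double& t) {
    Vec3 d = sub(C, S);                       // vector from ray start to sphere center
    double b = dot(v, d);                     // projection of d onto the ray
    double disc = b * b - dot(d, d) + r * r;  // the expression under the square root
    if (disc < 0.0) return false;             // no hit
    double root = std::sqrt(disc);
    t = b - root;                             // the closer of the two solutions
    if (t < 0.0) t = b + root;                // ray starts inside the sphere
    return t >= 0.0;
}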

Phong shading

Phong shading is the combination of 3 different color reflections:
  • Ambient reflection
  • Diffuse reflection
  • Specular reflection
The sum of these reflection types gives the color drawn at the pixel; the last two types are summed over all light sources.

C = C_ambient + sum over all lights of (C_diffuse + C_specular)

In all reflection types I will use M for the material and L for the light source.

Vectors needed for Phong shading (H is only used in Blinn-Phong shading)

Ambient reflection

With ambient reflection we simulate global illumination. That means we don't actually follow the physics of light here.


Attenuation

To understand attenuation, take a flashlight and light up a dark room (e.g. your room at night). Close to the wall it appears pretty bright, but keep lighting the wall while walking backwards: you will see that the wall is no longer as bright as it was close up.

Attenuation can be seen as light getting weaker with distance. But different light sources fall off differently, so we use four components to calculate the scalar attenuation value (combined in the formula after this list):

  • constant attenuation
  • linear attenuation
  • quadratic attenuation
  • distance between point and light source
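Combined in the usual way, assuming the common constant/linear/quadratic model (the framework's constant names may differ), the attenuation for a light at distance d is

attenuation = 1 / (att_constant + att_linear · d + att_quadratic · d²)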

Diffuse reflection

The diffuse part is a mix of the following components:
  • surface color (e.g. the texture at P)
  • material diffuse color
  • light diffuse color
  • angle between the surface's normal and the direction from the point to the light source (dot product)
The dot product between the light direction and the normal (if both are normalized) returns cos(θ), a value between -1 and +1 (as in Lambert shading). We can only work with positive values; negative values become 0.

C_diffuse = max(0, N · L_dir) · M_diffuse · L_diffuse · C_surface(P)   (color products component-wise)
Specular reflection

Last but not least we need the so-called specular effect, the shiny glow you see on metallic objects or bells. We need:
  • material specular color
  • light specular color
  • material shining constant
  • direction of reflected light
  • direction to camera
First we need the reflection of an incoming vector. With a given normal we can calculate

R = V - 2·(V · N)·N

where R is the reflected vector. This formula will be used a lot in raytracing, where we simulate mirror effects.

V is the incoming vector and N is the normal. In this case the incoming vector is (P - L), the opposite direction of what we needed for diffuse reflection. Now we need the direction to our camera: after the scene transformation the camera sits at (0,0,0), so we can use V = -P.
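As a summary, here is a condensed sketch of the per-light part of the shader (color products component-wise). The names and the exact attenuation/texture handling are assumptions; the framework's Shader class may differ in detail.

#include <algorithm>
#include <cmath>

struct Vec3 { double x, y, z; };                          // also used as an RGB color here
static Vec3 add(Vec3 a, Vec3 b)    { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
static Vec3 sub(Vec3 a, Vec3 b)    { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3 mul(Vec3 a, Vec3 b)    { return { a.x * b.x, a.y * b.y, a.z * b.z }; }
static Vec3 scale(Vec3 a, double s){ return { a.x * s, a.y * s, a.z * s }; }
static double dot(Vec3 a, Vec3 b)  { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 normalize(Vec3 a) { double l = std::sqrt(dot(a, a)); return scale(a, 1.0 / l); }

// Diffuse + specular contribution of one point light at position Lpos.
// P: hit point (view space), N: surface normal, surfCol: texture color at P.
Vec3 shadeOneLight(Vec3 P, Vec3 N, Vec3 surfCol,
                   Vec3 Mdiff, Vec3 Mspec, double shininess,
                   Vec3 Lpos, Vec3 Ldiff, Vec3 Lspec,
                   double kc, double kl, double kq) {
    Vec3 toLight = sub(Lpos, P);
    double dist = std::sqrt(dot(toLight, toLight));
    Vec3 L = scale(toLight, 1.0 / dist);
    double att = 1.0 / (kc + kl * dist + kq * dist * dist);

    double nDotL = std::max(0.0, dot(N, L));                  // negative values become 0
    Vec3 diffuse = scale(mul(mul(Mdiff, Ldiff), surfCol), nDotL);

    Vec3 I = sub(P, Lpos);                                    // incoming light direction
    Vec3 R = normalize(sub(I, scale(N, 2.0 * dot(I, N))));    // R = V - 2(V.N)N
    Vec3 toCam = normalize(scale(P, -1.0));                   // camera at origin: V = -P
    double spec = std::pow(std::max(0.0, dot(R, toCam)), shininess);
    Vec3 specular = scale(mul(Mspec, Lspec), spec);

    return scale(add(diffuse, specular), att);
}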


Framework

With raycasting we need some new classes. Of course I also updated the previously created Material class:
  • Renderer3DRaycasting: derived from Renderer3D. render function and raycasting function
  • Object3D: added abstract getNormal() function (won't be abstract anymore in next issue)
  • Light: Point light class with shading attributes
  • Material: class with shading attributes
  • Ray: line segment with start point and direction. Essential for raycasting
  • Scene3D: added material and light source lists and separated the initialization into 4 functions
  • Shader: class with shader functions (phong shading for now)
I figured out that cmath's abs(double) function doesn't behave the same under VC++ and the GNU compiler. It took me an hour to realize that. My tip: NEVER USE abs(double); use fabs instead.

Raycasting works "per pixel", which is totally different to rasterization. We need to ask each surface for intersection and only take the one closest to our camera. If there is no surface in that pixel's direction, keep the background color. Otherwise get intersection point and continue with phong shading.

In the Phong shading function, check whether there is at least one light source and a surface to shade. For each light source, calculate the attenuation and the ambient, diffuse and specular intensities. The surface (texture) color is only used for the ambient and diffuse parts.

Texture coordinates for Cube and Sphere

Some of you might have noticed that I skipped 3D texture mapping. I'm sorry about that. I added predefined texture coordinates for cube and sphere. That has the advantage that you can use one function to add textures. To see how that works, there are 2 images in folder "sources".

Texture for cube, to see which area has to be placed

World map for sphere.

Output

For comparison, here is the same scene rendered twice: one image has specular reflection, the other has none (look at the earth).

Rendered scene with specular effect

Rendered scene without specular effect

Conclusion

Raycasting is the first step to a more realistic rendered world. On the other hand it takes longer to render scenes because of the "per pixel" method, but the waiting is worth it, especially for more complex scenes.

If you think in the next issue I'll start with raytracing, you guessed wrong. I have something special in mind for you, but I'll keep that a secret. You have to wait for my next post. But there is a hint I can give you:

We will render professional designed objects.

See you in my next post.

Repository


Tested with:
  • Visual Studio 2013 (Windows 8.1)
  • GNU Compiler 4.9.2 (Ubuntu 12.04)

Monday, July 27, 2015

3D Transformations and Viewing

It's finally time! We add a new dimension to our framework and work with 3D graphics from now on. We won't go straight to photo-realistic graphics; that requires global illumination models, and they are too advanced to understand right now.

In this post we will stay with rasterization from our 2D framework, but we will extend the algorithm by adding a depth buffer to keep the pixel closest to our camera.

This chapter contains many 3D transformations that look very similar to their 2D counterparts. But we have to be careful about 3D viewing, because there are two different kinds of projection:
Orthogonal projection is like drawing a 3D object (e.g. a cube) on paper the way you learned in elementary or secondary school. You look at each object as if it were straight in front of you.

Perspective projection is the way we humans see our surroundings. Everything is projected towards a single point, looking out in every direction from there. The best examples are eyes and cameras.

We will use orthogonal projection only in this issue, to show the difference to perspective projection. From then on everything will be rendered the way we actually look at objects.

3D Transformations

I won't go into details in this section; we already discussed 2D transformations, and 3D works very similarly. I will only describe which transformations exist and what their formulas look like.

Transformations still have to follow a certain order to give the intended result:

  1. scale/rotate
  2. translate

Translation

Scale

Rotation

Now I do have to go into detail, because we had just one rotation in 2D. In 3D we can rotate around all three axes, and each of them has its own formula. For instance, when we rotate around the z-axis only the x and y values change; z stays unchanged.







Viewing

After using the above transformation formulas to move an Object3D from object space to world space, we need to transform it from world coordinates to viewing coordinates, i.e. to how a camera sees objects from its position and angle. The camera always sits at (0, 0, 0) in view space. We can do this either with the inverted transformation matrices for rotation and translation or with a look-at matrix.

A lookat matrix needs the following parameters:

  • camera position in world space (eye)
  • point in world space where camera looks at (target point)
  • lookup vector, the direction in which the camera's up side points (mostly [0,1,0], the up vector)
red: input parameters (camera, lookat, lookup)
blue: calculated transformation vectors (forward, right, up)








forward is the normalized direction from the camera to the look-at point.
right is perpendicular to the lookup vector and the forward vector, and normalized.
up is perpendicular to forward and right.

With this our viewing transformation matrix is complete, looking like this:



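A sketch of how such a look-at matrix can be built. I assume the +Z-forward, +Y-up convention used in this blog and a row-major matrix with the translation in the last column; the repository's Camera3D may arrange it differently.

#include <cmath>

struct Vec3 { double x, y, z; };
static Vec3 sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) { return { a.y * b.z - a.z * b.y,
                                             a.z * b.x - a.x * b.z,
                                             a.x * b.y - a.y * b.x }; }
static Vec3 normalize(Vec3 a) {
    double l = std::sqrt(dot(a, a));
    return { a.x / l, a.y / l, a.z / l };
}

// Builds the 4x4 view matrix from eye, target and lookup (+Z forward, +Y up assumed).
void lookAt(Vec3 eye, Vec3 target, Vec3 lookup, double m[4][4]) {
    Vec3 forward = normalize(sub(target, eye));        // camera looks along +z afterwards
    Vec3 right   = normalize(cross(lookup, forward));
    Vec3 up      = cross(forward, right);

    double rows[3][3] = { { right.x,   right.y,   right.z   },
                          { up.x,      up.y,      up.z      },
                          { forward.x, forward.y, forward.z } };
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) m[i][j] = rows[i][j];
        m[i][3] = -dot({ rows[i][0], rows[i][1], rows[i][2] }, eye);  // moves the eye to the origin
        m[3][i] = 0.0;
    }
    m[3][3] = 1.0;
}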
Projection

As described before, we separate orthogonal and perspective projection. Both transformation matrices return normalized viewport coordinates from -1 to +1 on all three axes. This is important for the final transformation to screen coordinates.

Orthogonal projection

The transformation matrix looks almost like the one in 2D viewing: we need to define the borders left/right, top/bottom and near/far, where left is xmin, bottom is ymin and near is zmin. This works because we have a parallel projection in which every projection line has the same direction.
Viewport for orthogonal projection



The result contains values from -1 to +1 for each point within our view box. But we are using the view's width, height, far and near as parameters where xmax is +width/2 and xmin is -width/2.

Perspective projection

This method looks more realistic because it is the way we see things. On the other hand, calculating the normalized viewport coordinates is more complicated.

A point in view space and where it should be projected
We take the near plane as projection plane and it is positioned along z-axis pointing to our camera. Therefore our plane formula is



or


where n is the position of the near plane. After calculating the distance using line-plane-intersection (I will explain further details in Raycasting and Raytracing section) we get



With that we have our projected x and y coordinates, but now every z coordinate equals n, so our depth buffer is null and void. We need a formula that moves z to a new position, with the condition that a z exactly at the near plane maps to n and a z exactly at the far plane maps to f. I found a formula in a video lecture that fulfills these conditions:

z' = (f + n) - (f · n) / z
where n is the near plane and f is the far plane. When working with transformation matrices we have no way to divide values; we can only add, subtract and multiply. The way out is to use homogeneous coordinates by adding an additional coordinate w. Normally this value is +1; whenever w is anything other than +1, we have to divide the point by that value.



In case of perspective projection we need to divide by z. Therefore our transformation matrix from view space to projection space looks like this:



Now our point is transformed to projection space, but still not to normalized view space. We make this task very simple: we just reuse the transformation matrix for orthogonal projection, with some changes. Instead of left/right and top/bottom we reduce the parameters to:

  • fov - the field of view in degrees; the human eye covers roughly 60° vertically.
  • aspect - the ratio between width and height, depending on the fov orientation
With these two parameters we can calculate left/right and top/bottom using basic trigonometry, taking the distance to the near plane as the adjacent leg and the unknown distance to the edges of the near plane as the opposite leg. fov describes the whole vertical angle the camera can see, which means the camera looks half the fov angle up and half the fov angle down.

This gives us the following formulas






Now replacing theta with fov/2 and using aspect as multiplier for left and right borders we can calculate the 4 parameters for orthogonal projection and replace them.



resulting for example



I think I used too many formulas. That might confuse some readers. It's better to avoid any more complicated formulas and write down the new transformation matrix for normalized viewing.



Note that this transformation matrix can also be used in pure orthogonal projection in case we want to compare image output of 2 different projections.

We can also combine the above 2 transformation matrices for normalized view and projection. The final transformation matrix for perspective projection is:



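To summarize the derivation, here is one way to build the combined perspective matrix directly from fov, aspect, near and far. It follows the steps above (project onto the near plane, remap z between n and f, then normalize); the exact matrix used in the repository may be arranged differently.

#include <cmath>

// Combined perspective matrix (row-major, +Z forward, z mapped to [-1, +1]).
// fov is the vertical field of view in degrees, aspect = width / height.
void perspectiveMatrix(double fov, double aspect, double n, double f, double m[4][4]) {
    const double PI = 3.14159265358979323846;
    double t = std::tan(fov * 0.5 * PI / 180.0);    // tan(fov/2), fov given in degrees

    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j) m[i][j] = 0.0;

    m[0][0] = 1.0 / (aspect * t);                   // x scaled by 1 / (aspect * tan(fov/2))
    m[1][1] = 1.0 / t;                              // y scaled by 1 / tan(fov/2)
    m[2][2] = (f + n) / (f - n);                    // remap z so that n -> -1 and f -> +1
    m[2][3] = -2.0 * f * n / (f - n);               //   (after the division by w)
    m[3][2] = 1.0;                                  // w = z: the perspective divide
}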
Screen coordinates

No matter which projection we used, the points end up inside a normalized box. Now we just need to calculate screen coordinates, where top-left is (0,0) and bottom-right is (width, height). As in a previous post we have to deal with flipped y-coordinates, meaning we have to mirror them. In addition we need depth information for our rendering: we are only allowed to draw a pixel if its depth is smaller than the one in the depth buffer and if the value lies between 0 and 1 in screen space. The calculation goes:

x_screen = (x + 1) / 2 · width
y_screen = (1 - y) / 2 · height
z_screen = (z + 1) / 2
or as transformation matrix:



Rasterization with Depth Buffer (z-Buffer)

We will use the same algorithm as in the previous post on 2D rendering, except that we also check the depth value of each pixel. We do that by adding a depth buffer of the same size as the frame buffer: width x height. So this time we may only color a pixel if our current z is smaller than the stored z AND if z is between 0 and 1.

We already use the DDA algorithm to get the appropriate x and y values. The question is how to get z. One way is to interpolate it with DDA just like x and y, but then we would have to update z along both edges and on the way from one edge to the other; lots of calculations and adjustments would be necessary, and a programmer would quickly lose sight of the algorithm.

My suggestion is to use a technique normally used in raycasting and raytracing: ray-plane intersection. A ray needs a starting point and a direction. As the starting point we use (x, y, 0), where x and y are the screen coordinates from our rasterization interpolation. As the direction we use (0, 0, 1), i.e. along the z-axis.
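Worked out for this special ray, the surface's plane equation gives the depth directly (a small sketch of the step, assuming the plane Ax + By + Cz + D = 0 with C ≠ 0, i.e. the surface is not viewed exactly edge-on):

A·x + B·y + C·t + D = 0   =>   z = t = -(A·x + B·y + D) / C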

Framework

Now that we are dealing with 3 dimensions, we need to create lots of new classes:
  • Camera3D - holding transformation values and returns 3 matrices
  • Surface3D - surface with color and texture
    • 3 pointers to Vector3D for corner points of triangle
    • 3 Vector2D texture coordinates for UV mapping
  • Object3D - abstract class for 3D objects like spheres or cubes
    • list of 3D points
    • list of Surface3D
    • link to a texture (discussed later)
  • Scene3D - contains a list of Object3D and a possibility to transform all objects at once
  • Renderer3D - abstract render class derived from Renderer
    • contains depth buffer of same size as pixel buffer
    • Scene3D instance returning a list of all surfaces to render
    • implemented Rasterization algorithm
  • Renderer3DRasterization - derived from Renderer3D
Previously we got a list of all 2D objects to render; we did that because there were also border lines to draw. This time there are no border lines, so we can simply focus on surfaces.

Lambert's Shade of Coloring - Flat Shading

Shading describes how a surface should be colored. This may depend on the number of light sources, the angle of the incoming light or diffuse reflection. We will use the simple Lambert shading model, also known as flat shading, to describe the luminous intensity of the coloring. We need to calculate the dot product of the surface's normal and the direction from the surface to a light source:

cos(α) = (N · L) / (|N| · |L|)

This fraction returns a number between -1 and +1. If the result is negative, the color becomes black; otherwise we keep the calculated color. If the surface's normal points directly at the light source, the result is +1 and the full color is drawn.

In this example I let the camera be the light source and I also keep the background color white. Normally the background is black, but I think it doesn't matter at this early stage of the blog.
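A minimal sketch of that flat-shading rule (the function name is my own; with the camera as light source, toLight is simply the direction from the surface to the camera):

#include <algorithm>
#include <cmath>

struct Vec3 { double x, y, z; };
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Lambert / flat shading: scale the surface color by the cosine between the
// surface normal and the direction to the light; negative values give black.
Vec3 flatShade(Vec3 color, Vec3 normal, Vec3 toLight) {
    double cosA = dot(normal, toLight) /
                  (std::sqrt(dot(normal, normal)) * std::sqrt(dot(toLight, toLight)));
    double k = std::max(0.0, cosA);
    return { color.x * k, color.y * k, color.z * k };
}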

Output

Once again we get an image from our pixel buffer, but I also included a depth-buffer image with gray values (also as PPM). The darker the pixel, the closer the object at that pixel is to the camera. As promised, I show the difference between perspective and orthogonal projection; I set the far plane of both to the same distance.

First perspective...
Perspective Pixel Buffer

Perspective Depth Buffer

...and orthogonal

Orthogonal Pixel Buffer

Orthogonal Depth Buffer

Conclusion

We discussed some transformation types for objects and that there are two different types of projections. To get a clue for colors we added a simple shader. We extended rasterization with depth queries and stored them in a depth buffer where we created an output image.

In the future we will hardly find usages for projection matrices. The only other case where they are needed is for Radiosity, but this is another story.

In the next issue we will go for Raycasting which can also be used for both orthogonal and perspective projection. But I think showing both won't be necessary. I also intend to only use perspective projection from now on, preparing both is not such an easy task. Also we will describe new shading models and upgrade the Material class.

Alright, see you in my next post.

Repository


Tested with:
  • Visual Studio 2013 (Windows 8.1)
  • GNU Compiler 4.9.2 (Ubuntu 12.04)