Monday, May 28, 2012

OpenGL Camera Class Tutorial Part 1: An Object Model

Introduction

I've taken classes, searched the internet, and never found exactly what I wanted in an OpenGL Camera class / tutorial. So here it goes: yet another camera class tutorial.

I've used a camera class in all of my OpenGL projects. It has been extremely useful and has provided shortcuts to a lot of other necessary components of a graphics system. My camera class has evolved from its beginnings as a homework assignment to the point that it provides many miscellaneous features. Honestly, it is just a hodge-podge of useful features and never had a specific purpose or structure. So, I've decided to rewrite it and document the process.

Goals

There are three goals of this tutorial:
  1. A tutorial for myself. Computer graphics is not my day job, but I love to tinker around with OpenGL and have a bunch of half-finished projects. With that in mind, I would love to more fully understand the concepts so my time tinkering isn't just time debugging and being confused.
  2. A tutorial for others. If I can get through this, maybe I can show others so they don't have to make the same mistakes I did.
  3. Show code that works. I don't want to show only a portion of the code, or pseudo code. I want to show code that can be easily used and compiled. I will not provide source files, but all the code will be shown in the tutorials. You only need to copy and paste the code into a syntactically correct C# class.

Camera Features

The following features are what I want in my camera, with each visited by its own tutorial:
  1. An object model representation of the position and orientation of my camera (this tutorial).
  2. The ability to slide and rotate my object model relative to its local coordinate system.
  3. An object model representation of the OpenGL viewport / frustum.
  4. The ability to use the object model to configure the model-view and projection matrices.
  5. Acquire a picking ray from the camera.

Development Environment

  • I am using Visual C# 2010 as my development environment. This should not prevent anyone from using this class in Java or C++. As a matter of fact, I have a version written in Java for Android's implementation of OpenGL ES, and a version in C++.
  • How do I use OpenGL in C#? A very nifty library I found called OpenTK. Besides providing an interface to OpenGL, it also provides matrices, vectors, and other helpful 3D constructs and math utilities.

The FreeCamera Class

The class is named FreeCamera because of its ability to model an arbitrary orientation and position in 3-D space (not because it is free as in freedom nor because it is free as in no cost; though, it does comply with these definitions as well). Many camera classes only allow orientation defined as an azimuth / elevation (strict pitch / yaw), or only rotation on one or two axes -- to me, that is an AxisRestrictedCamera. Not only will this class allow pitch, yaw, and roll of the camera, it will also allow rotation of the camera about an arbitrarily defined axis. To do this, meet the beginning code for the camera:
public class FreeCamera
{
  public Vector3 U;
  public Vector3 V;
  public Vector3 N;
  public Vector3 Position;
}
The class contains four public Vector3 fields: U, V, N, and Position. The Position field provides the camera's position. U, V, and N, on the other hand, need some explaining. Each one is a vector pointing in the positive direction of one of the camera's local axes. Now, this is where I got lost when I was learning this the first time. So, I want to explain this a bit more.

Consider a representation of 3-D space:
World Coordinate System.
I have an X-axis, Y-axis, Z-axis and an origin marked with a big black dot. I can simply find or define positions anywhere in this space knowing the X, Y, and Z components of a point. For my camera, this is the space that the camera lives in (and anything the camera looks at) -- the world coordinate system.

Now I'll define a camera's position with a big orange dot and define its orientation. A simple way to define a camera's orientation is to give it something to look at and tell it which way is up. So, I'll define a camera with a position, a "look," and an "up." The camera's position will be at about (1.5, 0, 1.5) in world coordinates, it will look at the world coordinate system origin (0, 0, 0), and its "up" will point in the direction of the world coordinate system Y-axis (0, 1, 0):
World Coordinate System with Camera.
I have the camera defined in the world. What I want to do, though, is just look at the camera itself:
Camera, alone.
The camera can be described by its position (orange dot), a specific position it is looking at (black dot), and the direction known to be "up" for the camera (magenta vector arrow). The next trick is to find a way to represent the camera's local coordinate system. Why is this important? I already have a position, know what I'm looking at, and could easily move it toward that point, at an angle, etc., using fundamental math operations. But, I can think of two reasons why I want a local coordinate system, instead:
  1. Convenience. If I want to move forward, great, I'll know exactly which way is forward (or any other direction). If I want pitch, yaw, or roll, then I will know exactly how to rotate and model my new position. Both of those statements are made from the camera's perspective, not the world coordinate perspective -- I wouldn't say, "I'm going to move 8 units from the center of my house toward the front door and 6 units to the east wall," just to describe my intent to walk forward 1 step.
  2. OpenGL doesn't really use a position / look / up model of a camera. Its GLU utility library does have a function, gluLookAt, that takes these parameters, but I won't be using that here. Eventually, what I'll do is use the U,V,N representation of the camera's local system to set the model-view matrix of OpenGL.
So, I'll redefine the camera's model with a local U,V,N system (analogous to X,Y,Z). Imagine a line coming straight out of the screen, behind the camera; this is the positive N-axis. So, the camera will always be pointing down the negative N-axis. Looking to the right of the camera will be the positive U-axis, and looking up will be the positive V-axis:
Camera with U,V,N.

Why is the U,V,N system defined this way? It follows the right-hand rule and, using this model, it is easy to populate the model-view matrix. When put together, the world coordinate system contains the camera, which has its own personal coordinate system relative to its orientation. Here is a final diagram of the 3-D world and the camera:

World Coordinate System with Camera and U,V,N system.
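As a preview of why this layout makes the model-view matrix easy to populate, here is a sketch of the textbook view matrix built from an orthonormal U, V, N and the camera position, written in OpenGL's column-vector convention. The symbol P for the camera position is my own shorthand here, and this is the standard form (equivalent to what gluLookAt builds), not necessarily the exact code a later part of this series will use:

```latex
M_{\text{view}} =
\begin{bmatrix}
U_x & U_y & U_z & -\,\mathbf{U}\cdot\mathbf{P} \\
V_x & V_y & V_z & -\,\mathbf{V}\cdot\mathbf{P} \\
N_x & N_y & N_z & -\,\mathbf{N}\cdot\mathbf{P} \\
0   & 0   & 0   & 1
\end{bmatrix}
```

The three axes simply become the rows of the rotation part, and the last column moves the camera's position to the origin.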
How do I calculate the U, V, and N axes? It is actually quite simple with the look / up vectors. The positive N-axis points behind the camera, so subtracting the look point from the camera's position gives the positive N-axis vector. U is the cross product of the up vector and N (again, see the right-hand rule). V could be the same as the up vector, but the up vector isn't guaranteed to be orthogonal to the look direction. Telling the camera which way is "up" is really more of a hint about the orientation. So, V is calculated as the cross product of N and U. Each result is normalized so that U, V, and N are unit vectors. This is the first method I'll add to the FreeCamera class:
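public void LookAt(Vector3 look, Vector3 up)
{
  // N points from the look point back toward the camera (behind the camera).
  N = Position - look;
  N.Normalize();

  // U points to the camera's right. Note: if up is parallel to N,
  // this cross product has zero length and cannot be normalized meaningfully.
  U = Vector3.Cross(up, N);
  U.Normalize();

  // V is the camera's true "up", orthogonal to both N and U.
  V = Vector3.Cross(N, U);
  V.Normalize();
}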
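To make this concrete, here is the arithmetic worked out by hand for the camera in the diagrams: position (1.5, 0, 1.5), looking at the origin, with the world Y-axis as the up hint. The symbol P again stands for the camera position (my shorthand), and the values are rounded to four decimal places:

```latex
\begin{aligned}
\mathbf{N} &= \frac{\mathbf{P} - \mathbf{look}}{\lVert \mathbf{P} - \mathbf{look} \rVert}
            = \frac{(1.5,\ 0,\ 1.5)}{\sqrt{4.5}}
            \approx (0.7071,\ 0,\ 0.7071) \\
\mathbf{U} &= \mathbf{up} \times \mathbf{N}
            = (0,\ 1,\ 0) \times (0.7071,\ 0,\ 0.7071)
            \approx (0.7071,\ 0,\ -0.7071) \\
\mathbf{V} &= \mathbf{N} \times \mathbf{U}
            = (0.7071,\ 0,\ 0.7071) \times (0.7071,\ 0,\ -0.7071)
            \approx (0,\ 1,\ 0)
\end{aligned}
```

Here V comes out equal to the up hint only because that hint was already orthogonal to the look direction. As a check on the right-hand rule, U x V gives back (0.7071, 0, 0.7071), which is N.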
I'll also add some constructors to the FreeCamera class to finish up this tutorial:
public FreeCamera()
{
  Position = new Vector3(0, 0, 0);
  LookAt(
    new Vector3(0, 0, -1),
    new Vector3(0, 1, 0));
}

public FreeCamera(Vector3 position)
{
  Position = position;
  LookAt(
    position + new Vector3(0, 0, -1),
    new Vector3(0, 1, 0));
}
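For convenience, here are the pieces from this tutorial assembled into one file, followed by a small check that the axes come out orthonormal and right-handed. The FreeCameraDemo class and its output format are hypothetical additions of mine, not part of the camera class, and the `using OpenTK;` line assumes an OpenTK release from around the time of writing, where Vector3 lives in the OpenTK namespace (newer releases moved it to OpenTK.Mathematics):

```csharp
using System;
using OpenTK; // assumption: Vector3 is in this namespace (OpenTK.Mathematics in newer releases)

public class FreeCamera
{
  public Vector3 U;
  public Vector3 V;
  public Vector3 N;
  public Vector3 Position;

  public FreeCamera()
  {
    Position = new Vector3(0, 0, 0);
    LookAt(new Vector3(0, 0, -1), new Vector3(0, 1, 0));
  }

  public FreeCamera(Vector3 position)
  {
    Position = position;
    LookAt(position + new Vector3(0, 0, -1), new Vector3(0, 1, 0));
  }

  public void LookAt(Vector3 look, Vector3 up)
  {
    N = Position - look;      // points behind the camera
    N.Normalize();

    U = Vector3.Cross(up, N); // points to the camera's right
    U.Normalize();

    V = Vector3.Cross(N, U);  // the camera's true "up"
    V.Normalize();
  }
}

// Hypothetical demo, not part of the tutorial's class.
public static class FreeCameraDemo
{
  public static void Main()
  {
    // The camera from the diagrams: at (1.5, 0, 1.5), looking at the origin.
    FreeCamera camera = new FreeCamera(new Vector3(1.5f, 0f, 1.5f));
    camera.LookAt(new Vector3(0, 0, 0), new Vector3(0, 1, 0));

    // Unit-length axes: each length should be very close to 1.
    Console.WriteLine("|U| = {0}, |V| = {1}, |N| = {2}",
      camera.U.Length, camera.V.Length, camera.N.Length);

    // Mutually orthogonal axes: each dot product should be very close to 0.
    Console.WriteLine("U.V = {0}, V.N = {1}, N.U = {2}",
      Vector3.Dot(camera.U, camera.V),
      Vector3.Dot(camera.V, camera.N),
      Vector3.Dot(camera.N, camera.U));

    // Right-handed system: U x V should reproduce N, so this should be near zero.
    Vector3 difference = Vector3.Cross(camera.U, camera.V) - camera.N;
    Console.WriteLine("|U x V - N| = {0}", difference.Length);
  }
}
```

Note that the default constructor produces U, V, N equal to the world X, Y, and Z axes, so a fresh camera sits at the origin looking down the negative Z-axis, which matches OpenGL's own default view.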

Conclusion

This isn't a tutorial about OpenGL, C#, OpenTK, linear algebra, or mathematical concepts. I don't attempt to optimize any methods or operations. This tutorial is about creating a camera class to use in OpenGL. If hints of any other topic than a camera class are provided, that is only as a bonus. It is an assignment for the reader to learn any other concepts needed to utilize this tutorial. In the next tutorial, I'll cover sliding and rotating the camera: Part 2.