Being a fan of OOP, I tend to write a lot of object-oriented code. Coming up with a meaningful object model that behaves in an appropriate way is just as important as having a meaningful interface to your objects. A concrete object is an object that actually behaves in the manner you’d expect without any weird side-effects, and has the same kind of attributes that you’d expect of a primitive data type.
Creating concrete data objects/classes is a good thing to do, as it reduces the probability of bugs, and crazy side-effects. It’s also an important first step in writing intuitive code - which will be the topic of a later blog post.
I’d like to quote one of my lecturers that I learned from during my time at UTS…
If in doubt, do as the ints do.
I realise the meaning of this point isn’t obvious on the surface, but with some example code it’ll all become clear.
An example concrete data type - a 3D Vector class
Let’s say you’re writing some code to do some 3D rendering, and you’re in need of a class that can handle the functionality and behaviour of Vectors in 3D space. The first and most obvious thing that you need to handle are x, y and z coordinates. Let’s start with a very basic Vector3 class:
class Vector3
{
public:
    float X;
    float Y;
    float Z;
};
At this point you’ve got an object that gives public access to its internal workings. Can we say that’s what the int datatype does? Are you able to mess around with its inner workings? No, you’re not. Not just that, but the idea of public member variables breaks the whole notion of encapsulation/information hiding. We need to hide the internal workings, but in doing so we remove the ability to set and get the values on the object. We need to expose some functions that will allow us to do that. Our improved version of the code might look like this:
class Vector3
{
private:
    // change the variable names to make it obvious that
    // they are member variables, not local or global.
    float m_X;
    float m_Y;
    float m_Z;

public:
    // getters
    float GetX();
    float GetY();
    float GetZ();

    // setters
    void SetX( float x );
    void SetY( float y );
    void SetZ( float z );
};
This is an improvement, as we now have control over the inner workings of the object without exposing the implementation to external classes. Now let’s say that we want to be able to construct a new Vector3 object through a variety of ways, ie. in exactly the same way we can with int. For example:
// this is what we can do with ints:
int a( 1 );
int b = a;
int c( b );
int d, e;
d = e = c;

// we want to do the same kind of things with our
// own class
Vector3 a( 1.0, 0.0, -1.0 );
Vector3 b = a;
Vector3 c( b );
Vector3 d, e;
d = e = c;
We need to expose some options for construction and assignment. So our next iteration might look like this:
class Vector3
{
private:
    // change the variable names to make it obvious that
    // they are member variables, not local or global.
    float m_X;
    float m_Y;
    float m_Z;

public:
    // construction - default to zero if nothing is passed in
    Vector3( float x = 0.0, float y = 0.0, float z = 0.0 );
    Vector3( Vector3& v );

    // assignment operator
    void operator=( Vector3& v );

    // getters
    float GetX();
    float GetY();
    float GetZ();

    // setters
    void SetX( float x );
    void SetY( float y );
    void SetZ( float z );
};
The above interface should do what we need it to… or should it? A more careful examination will reveal that it doesn’t actually behave exactly as you’d expect. The copy constructor takes a reference to another Vector3, which could be modified inside the copy constructor. We have the same issue with our assignment operator. Integers do not behave this way, so we need to modify our interface a bit more to tidy it up:
class Vector3
{
private:
    // change the variable names to make it obvious that
    // they are member variables, not local or global.
    float m_X;
    float m_Y;
    float m_Z;

public:
    // construction - default to zero if nothing is passed in
    Vector3( float x = 0.0, float y = 0.0, float z = 0.0 );
    Vector3( const Vector3& v );

    // assignment operator
    void operator=( const Vector3& v );

    // getters
    float GetX();
    float GetY();
    float GetZ();

    // setters
    void SetX( float x );
    void SetY( float y );
    void SetZ( float z );
};
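To see what the const buys us in practice, here is a minimal sketch. It uses a cut-down stand-in for Vector3 rather than the full class, and the helper name CopyFromConst is made up for illustration. With the earlier Vector3( Vector3& v ) signature, the copy below would be rejected by the compiler because the source object is const.

```cpp
// Cut-down stand-in for the article's Vector3 (illustration only).
class Vector3
{
private:
    float m_X;
    float m_Y;
    float m_Z;

public:
    Vector3( float x = 0.0f, float y = 0.0f, float z = 0.0f )
        : m_X( x ), m_Y( y ), m_Z( z ) {}

    // const reference: the source is promised to stay untouched, so
    // const objects and temporaries can be copied as well.
    Vector3( const Vector3& v )
        : m_X( v.m_X ), m_Y( v.m_Y ), m_Z( v.m_Z ) {}

    float GetX() { return m_X; }
};

// Hypothetical helper: copy from a const source and read the copy.
float CopyFromConst()
{
    const Vector3 source( 1.0f, 2.0f, 3.0f );
    Vector3 copy( source );   // would not compile with Vector3( Vector3& v )
    return copy.GetX();
}
```

So the const is a promise to the caller, and it also widens what the constructor will accept.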
We’ve now made sure that the Vector3 class does not modify anything that doesn’t belong to it, which again is how the integers behave. But there is still something missing. Because the assignment operator returns void, it doesn’t allow for chaining (eg. d = e = c;) the way the ints do, so we need to make a slight adjustment to the overload:
// return a const reference to ourself
const Vector3& operator=( const Vector3& v );
There, much better.
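As a quick sanity check of the chaining behaviour, here is a small sketch. Again it uses a cut-down stand-in for Vector3, and ChainedAssign is a hypothetical helper that exists only for this example. If operator= returned void, the inner e = c would produce nothing to assign to d and the line would fail to compile.

```cpp
// Cut-down stand-in for the article's Vector3 (illustration only).
class Vector3
{
private:
    float m_X;
    float m_Y;
    float m_Z;

public:
    Vector3( float x = 0.0f, float y = 0.0f, float z = 0.0f )
        : m_X( x ), m_Y( y ), m_Z( z ) {}

    // return a const reference to ourself so assignments can be chained
    const Vector3& operator=( const Vector3& v )
    {
        m_X = v.m_X;
        m_Y = v.m_Y;
        m_Z = v.m_Z;
        return *this;
    }

    float GetZ() const { return m_Z; }
};

// Hypothetical helper: chain two assignments and read the result.
float ChainedAssign()
{
    Vector3 c( 1.0f, 0.0f, -1.0f );
    Vector3 d, e;
    d = e = c;   // evaluated right to left, as d = ( e = c )
    return d.GetZ();
}
```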
We’re now at the stage where we want to be able to add/subtract/multiply/divide vectors together, but before we start overloading the operators we should look at the way integers behave when they go through the same operations:
int a, b, c, d;
a = b;              // b doesn't change, a does.
a * b;              // both a and b do not change.
a = b * c;          // both b and c do not change, but a does.
a = ( b + c ) * d;  // a is the only thing that changes
a += b / c;         // again, a is the only thing that changes.
It’s obvious from this that when we overload the operators, we only modify the content of the object if it is on the left hand side of one of the assignment operators. Our interface should make this obvious:
class Vector3
{
private:
    // change the variable names to make it obvious that
    // they are member variables, not local or global.
    float m_X;
    float m_Y;
    float m_Z;

public:
    // construction - default to zero if nothing is passed in
    Vector3( float x = 0.0, float y = 0.0, float z = 0.0 );
    Vector3( const Vector3& v );

    // assignment operators - not const because the current object
    // needs to change
    const Vector3& operator=( const Vector3& v );
    const Vector3& operator+=( const Vector3& v );
    const Vector3& operator-=( const Vector3& v );
    const Vector3& operator*=( const Vector3& v );
    const Vector3& operator/=( const Vector3& v );

    // other operators - all const functions to make sure that the
    // internal state of the object doesn't get modified when the
    // function is called. Separate temporary instances of Vector3
    // objects are created and returned when executed.
    Vector3 operator+( const Vector3& v ) const; // v1 + v2
    Vector3 operator-( const Vector3& v ) const; // v1 - v2
    Vector3 operator/( const Vector3& v ) const; // v1 / v2
    Vector3 operator*( const Vector3& v ) const; // v1 * v2

    // getters
    float GetX();
    float GetY();
    float GetZ();

    // setters
    void SetX( float x );
    void SetY( float y );
    void SetZ( float z );
};
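Here is a small sketch of that rule in action, using only + and += on a cut-down stand-in for Vector3. The two helper functions are hypothetical and exist purely for this example. Note that the operands of + are declared const, which only compiles because operator+ is itself a const member function.

```cpp
// Cut-down stand-in for the article's Vector3 (illustration only).
class Vector3
{
private:
    float m_X;
    float m_Y;
    float m_Z;

public:
    Vector3( float x = 0.0f, float y = 0.0f, float z = 0.0f )
        : m_X( x ), m_Y( y ), m_Z( z ) {}

    // compound assignment: modifies the left-hand side
    const Vector3& operator+=( const Vector3& v )
    {
        m_X += v.m_X;
        m_Y += v.m_Y;
        m_Z += v.m_Z;
        return *this;
    }

    // binary plus: const, returns a fresh object
    Vector3 operator+( const Vector3& v ) const
    {
        return Vector3( m_X + v.m_X, m_Y + v.m_Y, m_Z + v.m_Z );
    }

    float GetX() const { return m_X; }
};

// Hypothetical helper: b + c must leave both b and c alone.
bool BinaryPlusLeavesOperandsAlone()
{
    const Vector3 b( 1.0f, 2.0f, 3.0f );
    const Vector3 c( 4.0f, 5.0f, 6.0f );
    Vector3 a = b + c;
    return a.GetX() == 5.0f && b.GetX() == 1.0f && c.GetX() == 4.0f;
}

// Hypothetical helper: a += b changes a (and only a).
float CompoundPlusChangesLeftSide()
{
    Vector3 a( 1.0f, 1.0f, 1.0f );
    const Vector3 b( 2.0f, 2.0f, 2.0f );
    a += b;
    return a.GetX();
}
```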
The class is starting to take shape, but there are some other functions missing that are an integral part of any Vector class: normalise(), dot() and cross().
The normalise function is used to create a unit vector (ie. a vector with a length of 1.0). But the question here is: when calling the function, should the object be modified, or should it return a copy of the Vector with a unit length? Let’s look at the difference in function signatures:
// this function would modify the object directly
void Normalise();

// this function would return a new vector that is normalised
Vector3 Normalise() const;
Making a decision like this can be a bit of a pain in the butt, but we’re fortunate in this case because we can deduce what should be done! Generally when dealing with normalised vectors, we tend to retain a reference to the normal while dealing with a stack of other vectors. So another instance of a vector is used alongside the other vectors. Let’s look at how this might be done with both of the above functions if we had a vector that we wanted to reuse, but also needed a normalised version of:
// this is the vector we want to keep as is, but need a
// normalised copy of
Vector3 someVector;

// here's how we'd do it with the first option:
Vector3 normal1 = someVector;
normal1.Normalise();

// here's how we'd do it with the second option:
Vector3 normal2 = someVector.Normalise();
In my view, the second option is easier to read, and is a bit more intuitive. Not just that, but it’s less code! So based on this, we’ll utilise the second version of the function. Our class now looks like this:
class Vector3
{
private:
    // change the variable names to make it obvious that
    // they are member variables, not local or global.
    float m_X;
    float m_Y;
    float m_Z;

public:
    // construction - default to zero if nothing is passed in
    Vector3( float x = 0.0, float y = 0.0, float z = 0.0 );
    Vector3( const Vector3& v );

    // assignment operators - not const because the current object
    // needs to change
    const Vector3& operator=( const Vector3& v );
    const Vector3& operator+=( const Vector3& v );
    const Vector3& operator-=( const Vector3& v );
    const Vector3& operator*=( const Vector3& v );
    const Vector3& operator/=( const Vector3& v );

    // other operators - all const functions to make sure that the
    // internal state of the object doesn't get modified when the
    // function is called. Separate temporary instances of Vector3
    // objects are created and returned when executed.
    Vector3 operator+( const Vector3& v ) const; // v1 + v2
    Vector3 operator-( const Vector3& v ) const; // v1 - v2
    Vector3 operator/( const Vector3& v ) const; // v1 / v2
    Vector3 operator*( const Vector3& v ) const; // v1 * v2

    // getters
    float GetX();
    float GetY();
    float GetZ();

    // setters
    void SetX( float x );
    void SetY( float y );
    void SetZ( float z );

    // helpers
    Vector3 Normalise() const;
};
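A rough sketch of how the chosen Normalise() behaves is shown here. It uses a cut-down stand-in for Vector3 and borrows a Length() function, which the article only adds to the class further down. The two helper functions are hypothetical and exist just for this example.

```cpp
#include <cmath>

// Cut-down stand-in for the article's Vector3 (illustration only).
class Vector3
{
private:
    float m_X;
    float m_Y;
    float m_Z;

public:
    Vector3( float x = 0.0f, float y = 0.0f, float z = 0.0f )
        : m_X( x ), m_Y( y ), m_Z( z ) {}

    double Length() const
    {
        return std::sqrt( m_X * m_X + m_Y * m_Y + m_Z * m_Z );
    }

    // const: returns a unit-length copy and leaves this object alone
    Vector3 Normalise() const
    {
        const double length = Length();
        if( length == 0.0 )
        {
            return Vector3();   // zero vector stays a zero vector
        }
        const float invLength = static_cast<float>( 1.0 / length );
        return Vector3( m_X * invLength, m_Y * invLength, m_Z * invLength );
    }
};

// Hypothetical helper: length of the normalised copy of (3, 0, 4).
double LengthOfNormalised()
{
    const Vector3 someVector( 3.0f, 0.0f, 4.0f );
    const Vector3 normal = someVector.Normalise();
    return normal.Length();
}

// Hypothetical helper: the original keeps its length of 5.
bool NormaliseLeavesOriginalAlone()
{
    const Vector3 someVector( 3.0f, 0.0f, 4.0f );
    const Vector3 normal = someVector.Normalise();
    return someVector.Length() == 5.0 && normal.Length() < 5.0;
}
```

Because the result is built from floats, the length of the normalised copy is only approximately 1.0, so compare it with a small tolerance rather than with ==.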
So, next up we need the dot() function, which gives us the dot product of two vectors. The dot product is a single value (a scalar) that is related to the angle between the two vectors: it equals the product of their lengths multiplied by the cosine of that angle. A single value is therefore what the function should return. Since the dot product requires two vectors, it would make sense for the function to be given a reference to the second vector that is part of the dot product equation. None of the objects should be modified at all during the course of the function, so we should make that obvious in the function signature. Our class now looks like this:
class Vector3
{
private:
    // change the variable names to make it obvious that
    // they are member variables, not local or global.
    float m_X;
    float m_Y;
    float m_Z;

public:
    // construction - default to zero if nothing is passed in
    Vector3( float x = 0.0, float y = 0.0, float z = 0.0 );
    Vector3( const Vector3& v );

    // assignment operators - not const because the current object
    // needs to change
    const Vector3& operator=( const Vector3& v );
    const Vector3& operator+=( const Vector3& v );
    const Vector3& operator-=( const Vector3& v );
    const Vector3& operator*=( const Vector3& v );
    const Vector3& operator/=( const Vector3& v );

    // other operators - all const functions to make sure that the
    // internal state of the object doesn't get modified when the
    // function is called. Separate temporary instances of Vector3
    // objects are created and returned when executed.
    Vector3 operator+( const Vector3& v ) const; // v1 + v2
    Vector3 operator-( const Vector3& v ) const; // v1 - v2
    Vector3 operator/( const Vector3& v ) const; // v1 / v2
    Vector3 operator*( const Vector3& v ) const; // v1 * v2

    // getters
    float GetX();
    float GetY();
    float GetZ();

    // setters
    void SetX( float x );
    void SetY( float y );
    void SetZ( float z );

    // helpers
    Vector3 Normalise() const;
    double Dot( const Vector3& v ) const;
};
Note that the function is marked as const to imply that the object will not have a different state when it’s called, and the parameter is a const reference to imply that the parameter will not be modified as well.
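To make the link between the dot product and the angle concrete, here is a minimal sketch. It uses a cut-down stand-in for Vector3, and AngleBetween is a hypothetical free function that is not part of the class in this article. It relies on the identity dot = |a| |b| cos(angle) and assumes neither vector has zero length.

```cpp
#include <cmath>

// Cut-down stand-in for the article's Vector3 (illustration only).
class Vector3
{
private:
    float m_X;
    float m_Y;
    float m_Z;

public:
    Vector3( float x = 0.0f, float y = 0.0f, float z = 0.0f )
        : m_X( x ), m_Y( y ), m_Z( z ) {}

    // neither this object nor v is modified
    double Dot( const Vector3& v ) const
    {
        return m_X * v.m_X + m_Y * v.m_Y + m_Z * v.m_Z;
    }

    double Length() const
    {
        return std::sqrt( Dot( *this ) );
    }
};

// Hypothetical helper: angle in radians between two non-zero vectors.
// There is no clamping here, so rounding could push the ratio slightly
// outside [-1, 1] for nearly parallel vectors.
double AngleBetween( const Vector3& a, const Vector3& b )
{
    return std::acos( a.Dot( b ) / ( a.Length() * b.Length() ) );
}
```

For perpendicular vectors such as (1, 0, 0) and (0, 1, 0) the dot product is zero, which corresponds to an angle of a quarter turn.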
Finally, we need a function which will determine the cross product of two vectors (commonly used to determine the vector that is perpendicular to two input vectors). The cross product equation takes two vectors and results in another vector, which again implies that the two input vectors do not change. Based on this implication, our function signature should be easy to deduce. While we’re there, let’s just chuck in a cheeky length() function which will give us the magnitude of the vector. Our class now looks like this:
class Vector3
{
private:
    // change the variable names to make it obvious that
    // they are member variables, not local or global.
    float m_X;
    float m_Y;
    float m_Z;

public:
    // construction - default to zero if nothing is passed in
    Vector3( float x = 0.0, float y = 0.0, float z = 0.0 );
    Vector3( const Vector3& v );

    // assignment operators - not const because the current object
    // needs to change
    const Vector3& operator=( const Vector3& v );
    const Vector3& operator+=( const Vector3& v );
    const Vector3& operator-=( const Vector3& v );
    const Vector3& operator*=( const Vector3& v );
    const Vector3& operator/=( const Vector3& v );

    // other operators - all const functions to make sure that the
    // internal state of the object doesn't get modified when the
    // function is called. Separate temporary instances of Vector3
    // objects are created and returned when executed.
    Vector3 operator+( const Vector3& v ) const; // v1 + v2
    Vector3 operator-( const Vector3& v ) const; // v1 - v2
    Vector3 operator/( const Vector3& v ) const; // v1 / v2
    Vector3 operator*( const Vector3& v ) const; // v1 * v2

    // getters
    float GetX();
    float GetY();
    float GetZ();

    // setters
    void SetX( float x );
    void SetY( float y );
    void SetZ( float z );

    // helpers
    Vector3 Normalise() const;
    double Dot( const Vector3& v ) const;
    double Length() const;
    Vector3 Cross( const Vector3& v ) const;
};
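As a quick illustration of the perpendicular property, here is a small sketch built on a cut-down stand-in for Vector3. IsPerpendicularToBoth is a hypothetical helper for this example only. It compares against zero exactly, which is only reasonable here because the test values are small whole numbers; real code would normally use a tolerance.

```cpp
// Cut-down stand-in for the article's Vector3 (illustration only).
class Vector3
{
private:
    float m_X;
    float m_Y;
    float m_Z;

public:
    Vector3( float x = 0.0f, float y = 0.0f, float z = 0.0f )
        : m_X( x ), m_Y( y ), m_Z( z ) {}

    double Dot( const Vector3& v ) const
    {
        return m_X * v.m_X + m_Y * v.m_Y + m_Z * v.m_Z;
    }

    // const: builds and returns a new vector, inputs are untouched
    Vector3 Cross( const Vector3& v ) const
    {
        return Vector3( m_Y * v.m_Z - m_Z * v.m_Y,
                        m_Z * v.m_X - m_X * v.m_Z,
                        m_X * v.m_Y - m_Y * v.m_X );
    }

    float GetZ() const { return m_Z; }
};

// Hypothetical helper: the cross product should have a zero dot
// product with both of its inputs.
bool IsPerpendicularToBoth( const Vector3& a, const Vector3& b )
{
    const Vector3 n = a.Cross( b );
    return n.Dot( a ) == 0.0 && n.Dot( b ) == 0.0;
}
```

Note that the order matters: swapping the operands flips the direction of the result.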
The interface to our class now looks fine... except for one small omission. Const objects are read-only objects, and hence reading values from a const object should be legal. On the flip side, when we read a given x, y, or z value from a vector, the vector shouldn’t be modified at all. So with that in mind, we should mark each of the getter functions as const as well:
class Vector3
{
private:
    // change the variable names to make it obvious that
    // they are member variables, not local or global.
    float m_X;
    float m_Y;
    float m_Z;

public:
    // construction - default to zero if nothing is passed in
    Vector3( float x = 0.0, float y = 0.0, float z = 0.0 );
    Vector3( const Vector3& v );

    // assignment operators - not const because the current object
    // needs to change
    const Vector3& operator=( const Vector3& v );
    const Vector3& operator+=( const Vector3& v );
    const Vector3& operator-=( const Vector3& v );
    const Vector3& operator*=( const Vector3& v );
    const Vector3& operator/=( const Vector3& v );

    // other operators - all const functions to make sure that the
    // internal state of the object doesn't get modified when the
    // function is called. Separate temporary instances of Vector3
    // objects are created and returned when executed.
    Vector3 operator+( const Vector3& v ) const; // v1 + v2
    Vector3 operator-( const Vector3& v ) const; // v1 - v2
    Vector3 operator/( const Vector3& v ) const; // v1 / v2
    Vector3 operator*( const Vector3& v ) const; // v1 * v2

    // getters
    float GetX() const;
    float GetY() const;
    float GetZ() const;

    // setters
    void SetX( float x );
    void SetY( float y );
    void SetZ( float z );

    // helpers
    Vector3 Normalise() const;
    double Dot( const Vector3& v ) const;
    double Length() const;
    Vector3 Cross( const Vector3& v ) const;
};
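The place where const getters really earn their keep is any function that receives a vector by const reference. A minimal sketch follows, using a cut-down stand-in for Vector3 and a hypothetical helper called SumOfComponents. If the getters were not marked const, the calls inside the helper would not compile.

```cpp
// Cut-down stand-in for the article's Vector3 (illustration only).
class Vector3
{
private:
    float m_X;
    float m_Y;
    float m_Z;

public:
    Vector3( float x = 0.0f, float y = 0.0f, float z = 0.0f )
        : m_X( x ), m_Y( y ), m_Z( z ) {}

    // const getters: callable on const objects and const references
    float GetX() const { return m_X; }
    float GetY() const { return m_Y; }
    float GetZ() const { return m_Z; }
};

// Hypothetical helper: only has read-only access to v.
float SumOfComponents( const Vector3& v )
{
    return v.GetX() + v.GetY() + v.GetZ();
}
```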
We’re done! Our class looks complete (enough), so let’s pump out a definition for the functions (just for the sake of clarity).
// assumes the class declaration above is visible at this point
#include <cmath>

Vector3::Vector3( float x, float y, float z )
    : m_X( x ), m_Y( y ), m_Z( z )
{
    // don't need to do anything in the body of the
    // function because we're using the initialisation
    // lists instead.
}

Vector3::Vector3( const Vector3& v )
    : m_X( v.m_X ), m_Y( v.m_Y ), m_Z( v.m_Z )
{
    // don't need to do anything in the body of the
    // function because we're using the initialisation
    // lists instead.
}

const Vector3& Vector3::operator=( const Vector3& v )
{
    // check for self-assignment
    if( this != &v )
    {
        m_X = v.m_X;
        m_Y = v.m_Y;
        m_Z = v.m_Z;
    }

    // return a reference to ourselves for chaining
    return *this;
}

const Vector3& Vector3::operator+=( const Vector3& v )
{
    m_X += v.m_X;
    m_Y += v.m_Y;
    m_Z += v.m_Z;
    return *this;
}

const Vector3& Vector3::operator-=( const Vector3& v )
{
    m_X -= v.m_X;
    m_Y -= v.m_Y;
    m_Z -= v.m_Z;
    return *this;
}

const Vector3& Vector3::operator*=( const Vector3& v )
{
    m_X *= v.m_X;
    m_Y *= v.m_Y;
    m_Z *= v.m_Z;
    return *this;
}

const Vector3& Vector3::operator/=( const Vector3& v )
{
    m_X /= v.m_X;
    m_Y /= v.m_Y;
    m_Z /= v.m_Z;
    return *this;
}

Vector3 Vector3::operator+( const Vector3& v ) const
{
    return Vector3( m_X + v.m_X, m_Y + v.m_Y, m_Z + v.m_Z );
}

Vector3 Vector3::operator-( const Vector3& v ) const
{
    return Vector3( m_X - v.m_X, m_Y - v.m_Y, m_Z - v.m_Z );
}

Vector3 Vector3::operator/( const Vector3& v ) const
{
    return Vector3( m_X / v.m_X, m_Y / v.m_Y, m_Z / v.m_Z );
}

Vector3 Vector3::operator*( const Vector3& v ) const
{
    return Vector3( m_X * v.m_X, m_Y * v.m_Y, m_Z * v.m_Z );
}

float Vector3::GetX() const { return m_X; }
float Vector3::GetY() const { return m_Y; }
float Vector3::GetZ() const { return m_Z; }

void Vector3::SetX( float x ) { m_X = x; }
void Vector3::SetY( float y ) { m_Y = y; }
void Vector3::SetZ( float z ) { m_Z = z; }

Vector3 Vector3::Normalise() const
{
    double length = Length();
    if( length == 0.0 )
    {
        // return a zero vector if it's got zero length
        return Vector3();
    }

    double invLength = 1.0 / length;
    return Vector3( m_X * invLength, m_Y * invLength, m_Z * invLength );
}

double Vector3::Dot( const Vector3& v ) const
{
    return m_X * v.m_X + m_Y * v.m_Y + m_Z * v.m_Z;
}

double Vector3::Length() const
{
    return std::sqrt( m_X * m_X + m_Y * m_Y + m_Z * m_Z );
}

Vector3 Vector3::Cross( const Vector3& v ) const
{
    return Vector3( m_Y * v.m_Z - m_Z * v.m_Y,
                    m_Z * v.m_X - m_X * v.m_Z,
                    m_X * v.m_Y - m_Y * v.m_X );
}
Now that we have a functional class, we should have no problem using it in all of the above scenarios without any crazy side-effects. The class should be concrete!
Disclaimer: This code hasn’t been compiled, and might not run without a few tweaks. I wrote this off the top of my head while sitting in front of the TV! Comments, questions and flames are welcome.










July 15, 2007
Hey,
Nice article. I was just curious as to why you decided to do the self-assignment check in the assignment operator.
I can’t really see any disadvantage from copying the values across even if it is self-assignment. Obviously it seems redundant but from a code point of view does it add much having the if there?
July 15, 2007
Hey Gav,
Thanks for the comment mate. Your question is a good one, and it’s one that I had thought about before I punched out the code.
The code that’s listed here is far from production code, and the aim is to illustrate good programming practice. In general, checking for assignment to yourself is a good thing to do. In the case of the Vector3 class listed above, you could easily get away with not checking for self-assignment. However, in classes where memory is allocated to members, or pointers/references are used, it’s much better to check for self-assignment. I decided to put it in (despite not needing it in this case) because I felt it’s a practice that’s worth following. It’s better to put this in when it’s not needed than to leave it out when it is needed.
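To show the kind of class where the check really matters, here is a rough sketch. IntBuffer and SelfAssignThenRead are hypothetical names invented for this example and have nothing to do with Vector3. The class owns heap memory, so its assignment operator frees the old array before copying from the source. Without the check, a self-assignment would free the array and then try to copy from the memory it had just released.

```cpp
#include <cstddef>

// Hypothetical resource-owning class (illustration only).
class IntBuffer
{
private:
    std::size_t m_Size;
    int* m_Data;

public:
    explicit IntBuffer( std::size_t size, int fill = 0 )
        : m_Size( size ), m_Data( new int[size] )
    {
        for( std::size_t i = 0; i < m_Size; ++i ) { m_Data[i] = fill; }
    }

    IntBuffer( const IntBuffer& other )
        : m_Size( other.m_Size ), m_Data( new int[other.m_Size] )
    {
        for( std::size_t i = 0; i < m_Size; ++i ) { m_Data[i] = other.m_Data[i]; }
    }

    ~IntBuffer() { delete[] m_Data; }

    const IntBuffer& operator=( const IntBuffer& other )
    {
        // check for self-assignment: without this we would delete
        // our own array and then read from it
        if( this != &other )
        {
            delete[] m_Data;
            m_Size = other.m_Size;
            m_Data = new int[m_Size];
            for( std::size_t i = 0; i < m_Size; ++i ) { m_Data[i] = other.m_Data[i]; }
        }
        return *this;
    }

    int At( std::size_t index ) const { return m_Data[index]; }
};

// Hypothetical helper: assign a buffer to itself through an alias.
int SelfAssignThenRead()
{
    IntBuffer buffer( 4, 7 );
    const IntBuffer& alias = buffer;
    buffer = alias;   // self-assignment in disguise
    return buffer.At( 0 );
}
```

Self-assignment rarely looks like x = x in real code. It usually sneaks in through references or pointers that happen to refer to the same object, as with the alias above.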
I might do another post later on why checking for self-assignment should be the rule of thumb rather than the exception.
Thanks for the comment!
July 15, 2007
I was about six years into C++ coding back in 1998 when I suddenly realized that the whole point of OOP is to describe an object in simple English and then hammer the code to fit the simple interface description.
Instead of thinking about really cool things to do with code inside of an object, instead think of the most foolproof, idiot-resistant interface on the outside you could envision, then code it internally to service that interface.
This is where the agile school comes up with the idea of actually writing the interface and unit test before you write any code. It wasn’t until I was forced to do it this way I realized how brilliant it is.
Inside your simple, elegant, easily described interface, make it super tough and robust so a bad call doesn’t cause it to keel over and die. This is why you should use ASSERTs all over the place and even in spots you think it is ridiculous to check for say, a NULL pointer in that call. It may seem dumb to you at first but in reality it protects you against your own dumb mistakes. You might be single stepping a piece of code for hours before you get around to examining that argument that “must be set fer sure” before you realize it is passed as NULL.
So your objects have a bog simple wrapper outside that covers a very paranoid and suspicious object inside that does everything it can to intercept or prevent itself from falling over just because it gets a bad call from the outside world.
The problem is not learning to code this way, the problem is that 90% of the places out there won’t LET YOU WRITE CODE that makes this much sense. They always claim they are just too busy to spend much time thinking about the problem, because after all they are three years overdue … largely because they don’t code this way. See how that works? That’s why most places just plain bite down on it, hard.
July 15, 2007
Hi Cleve, thanks for your comment. I’m not sure that it’s entirely on-topic, but it’s worth discussing at least :).
I don’t know if I agree with the idea that OOP is about hammering code to fit a simple interface description. To me that implies that the interface doesn’t fit the meaning/intention of the class. But perhaps I’m just not getting a clear view of your point.
To me, a class isn’t ready when there’s nothing left to add. It’s ready when there’s nothing left that you can take away that doesn’t stop it from working the way it needs to. Excessive function overloads are a clear candidate for removal (just because it might end up being used, it doesn’t mean that its presence is justified). The fewer entry points there are into the class, the fewer points there are for it to die on.
I agree with the idea of defensive coding, and I do think that checking for NULL and using ASSERT are things that should be done constantly. There have been some recent murmurs around the web that checking for NULL is a bad idea, but I don’t agree. I don’t care what anyone says, if your code blows up on site, you’re going to look bad in front of your client. Code defensively as much as possible, it’s the only way to go.
Aside from that, you need to make sure that the other coders who are writing the application alongside you are actually doing the same thing too.
As far as the general state of the industry is concerned, I guess I agree with you for the most part. There are many companies out there who are not really pushing for ideal development practices despite knowing their obvious benefits. The excuses that are used are generally “lack of time” or “lack of budget”. I’ve lost count of the times I’ve heard “now just isn’t a good time”. Well, I’m afraid it’s never a good time to rewrite stuff. But the best time is now, before the code becomes an even bigger and less maintainable beast. The original code base tends to become something it was never designed to be, generally because it wasn’t actually “designed”. It just evolved from someone’s “idea”. This is extremely common, and is part of the reason why there are so many developers out there who are paid so much more than they’re worth purely because they got into a project at the start and have written so much dodgy code that they now have job security. Unfortunately for those people who are keen to write decent code, these guys are just happy to carry on pumping out line after line of complete gunk. The business seems happy because things appear to be getting done, but they don’t realise that with every line that’s written, the cost of maintenance is going up 5 times faster.
It’s a sad state of affairs for our industry, and I don’t think it’s going to change any time soon. This is why I’m putting time into building a portfolio so that I can market myself as a freelancer. Based on how things are at the moment, it seems that writing my own stuff from scratch is the only way I’ll be able to avoid being The Code Cleaner.
July 18, 2007
Nice blog on OO, but to be honest, I’m starting to lose some of the faith.
Especially in categories where abstract things don’t map very easily to objects, writing pure OO can lead to some awkward looking code. In my opinion that is why Java gets a reputation for having some beautiful code in some problem domains, and verbose god-awful shite in others.
For example, OO is fundamentally based on coupling state and function into one conceptual thing. But any good programmer should develop a cold uneasy fear in the pit of their stomach whenever someone says the word “coupling”, and what it implies.
Essentially the C++ STL is an example of using OO-based language features to do something fundamentally non-OO. The STL is fundamentally functional programming, and the more I use it, and Python for my scripting needs, the more I’m starting to get a crush on functional programming.
In my current code, I am creating a series of simple objects, much as you have done in your example, but I’m gluing those bits together with functional programming techniques. This means the bits that actually function are simply and compactly written without having to jigger them onto an awkward OO framework, but the data itself are plain old objects, with proper encapsulation. Practically speaking, this means I have very few FooWriter or BarLartomatic style classes; instead I have easier-to-use functions such as write (Foo, File) and transform (Lart, Bar), where data and action can be decoupled and swapped out when the wind changes.
July 18, 2007
I hear what you’re saying Ryan. I love functional programming too (it’s one of the things I loved to teach when I was at uni), and its power is just immense. The perfect world would be that ideal cross of basic/simple objects which are tied together with “functional” code.
Coupling is the spawn of Satan. It’s currently the reason why my job is taking 5 to 10 times longer to perform “simple” maintenance tasks.
July 18, 2007
I’m sorry you expected me to be on-topic. I thought you had seen some of my previous posts.
Actually, I agree with everything in your reply. You are right about seeing if you can strip the class down until anything else removed prevents the class from doing what it is supposed to do.
I think I was saying as much before, unfortunately it was too boring that way so I added a lot of colorful language to get cheap laughs.
July 19, 2007
Nah it’s ok mate. Staying on topic can be hard, and I tend to go off on a lot of tangents myself