
3: C++ Optimizations You Can Do "As You Go"

Defy the software engineering mantra of "optimization procrastination." These techniques can be added to your code today! In general, these methods not only make your code more efficient, but increase readability and maintainability, too.

3.1: Pass Class Parameters by Reference

Passing an object by value requires that the entire object be copied (copy ctor), whereas passing by reference does not invoke a copy constructor, though you pay a "dereference" penalty when the object is used within the function. This is an easy tip to forget, especially for small objects. As you'll see, even for relatively small objects, the penalty of passing by value can be stiff. I compared the speed of the following functions:

template <class T> void ByValue(T t) { }
template <class T> void ByReference(const T& t) { }
template <class T> void ByPointer(const T* t) { }

For strings, passing by reference is almost 30 times faster! For the bitmap class, it's thousands of times faster. What is surprising is that even passing a small complex-number object by reference is almost 40% faster than passing it by value. Only ints and smaller objects should be passed by value, because it's cheaper to copy them than to take the dereferencing hit within the function.
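To see where the cost comes from without a timer, here is a minimal sketch that counts copy constructor calls instead of measuring time. The Tracked type and the two CopiesMade helpers are hypothetical names introduced only for this illustration; they are not part of the benchmark described above.

```cpp
#include <string>

// Hypothetical instrumented type: counts how often it is copied.
struct Tracked
    {
    static int copies;
    std::string payload;

    explicit Tracked(const std::string& s) : payload(s) { }
    Tracked(const Tracked& other) : payload(other.payload) { ++copies; }
    };
int Tracked::copies = 0;

// Same shapes as the functions compared above.
template <class T> void ByValue(T) { }
template <class T> void ByReference(const T&) { }

// Number of copy constructor calls triggered by one by-value call.
int CopiesMadeByValue()
    {
    Tracked obj("hello");
    Tracked::copies = 0;
    ByValue(obj);       // the parameter is copy constructed from obj
    return Tracked::copies;
    }

// Number of copy constructor calls triggered by one by-reference call.
int CopiesMadeByReference()
    {
    Tracked obj("hello");
    Tracked::copies = 0;
    ByReference(obj);   // no copy: the function refers to obj itself
    return Tracked::copies;
    }
```

In this sketch the by-value call copies the object once and the by-reference call copies nothing. How expensive that one copy is depends entirely on the type, which is why the gap is so much larger for bitmaps than for strings.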

3.2: Postpone Variable Declaration as Long as Possible

In traditional C (before C99), variables must be declared at the top of a block. It seems natural to use this same method in C++. However, in C++, declaring a variable can be an expensive operation when the object has a non-trivial constructor or destructor. C++ allows you to declare a variable wherever you need to. The only restriction is that the variable must be declared before it's used. For maximum efficiency, declare variables in the minimum scope necessary, and then only immediately before they're used.

// Declare Outside (b is true half the time)
T x;
if (b)
    x = t;

// Declare Inside (b is true half the time)
if (b)
    T x = t;

Without exception, it's as fast or faster to declare the objects within the scope of the if statement. The only time where it may make sense to declare an object outside of the scope where it's used is in the case of loops. An object declared at the top of a loop is constructed each time through the loop. If the object naturally changes every time through the loop, declare it within the loop. If the object is constant throughout the loop, declare it outside the loop.
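The difference between the two placements can be sketched by counting constructor calls. Counted and the two helper functions below are hypothetical names used only for this illustration.

```cpp
// Hypothetical instrumented type: counts constructor calls of any kind.
struct Counted
    {
    static int constructions;
    int value;

    Counted() : value(0) { ++constructions; }
    Counted(const Counted& other) : value(other.value) { ++constructions; }
    Counted& operator=(const Counted& other)
        {
        value = other.value;
        return *this;
        }
    };
int Counted::constructions = 0;

// Declared outside: x is constructed even when b is false.
int ConstructionsDeclaredOutside(bool b, const Counted& t)
    {
    Counted::constructions = 0;
    Counted x;
    if (b)
        x = t;
    return Counted::constructions;
    }

// Declared inside: x exists only on the path that uses it.
int ConstructionsDeclaredInside(bool b, const Counted& t)
    {
    Counted::constructions = 0;
    if (b)
        {
        Counted x = t;
        (void)x; // silence unused-variable warnings in this sketch
        }
    return Counted::constructions;
    }
```

When b is false, the "outside" version still pays for one construction (and the matching destruction), while the "inside" version pays for nothing. When b is true, the "outside" version pays for a construction plus an assignment, and the "inside" version pays for a single copy construction.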

3.3: Prefer Initialization over Assignment

Another C holdover is the habit of defining a variable first and giving it a value later. In C++, it's to your advantage to initialize a variable at the moment it's declared. Initializing an object from another object of the same type invokes the object's copy constructor. That's it. Defining and then assigning an object invokes both the default constructor and then the assignment operator. Why take two steps when one will do?

This recommendation nicely complements postponing declarations. In the ideal case, postpone your declaration until you can do an initialization.

// Initialization
T x = t; // alternately T x(t); either one invokes copy ctor

// Assignment
T x;   // default ctor
x = t; // assignment operator

Initializing a complex value is over four times faster than declaring and assigning. Even for strings, the gain is 6%. Surprisingly, it makes little difference for the bitmap object. That's because the time to default construct a bitmap is minuscule in comparison to the time required to copy one bitmap to another.
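The "one step versus two" claim can be checked directly with a type that counts its special member calls. Probe and the two helper functions are hypothetical names introduced for this sketch only.

```cpp
// Hypothetical instrumented type: counts each special member call.
struct Probe
    {
    static int defaultCtors;
    static int copyCtors;
    static int assignments;

    Probe() { ++defaultCtors; }
    Probe(const Probe&) { ++copyCtors; }
    Probe& operator=(const Probe&) { ++assignments; return *this; }

    static void Reset() { defaultCtors = copyCtors = assignments = 0; }
    static int Total() { return defaultCtors + copyCtors + assignments; }
    };
int Probe::defaultCtors = 0;
int Probe::copyCtors = 0;
int Probe::assignments = 0;

// Initialization: a single copy constructor call.
int CallsForInitialization(const Probe& t)
    {
    Probe::Reset();
    Probe x = t;
    (void)x; // silence unused-variable warnings in this sketch
    return Probe::Total();
    }

// Assignment: a default constructor call followed by operator=.
int CallsForAssignment(const Probe& t)
    {
    Probe::Reset();
    Probe x;
    x = t;
    return Probe::Total();
    }
```

Initialization makes one call and assignment makes two. Whether the extra call matters depends on how expensive the default constructor is relative to the copy, which is consistent with the bitmap result above.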

Here's a real-world case from the company where I work. This is code that's running today, slightly modified to protect the guilty. It probably looks similar to code in your own projects. The input strings are copied to slightly different string objects.

void SNCommGPSendNewUser(const SNstring& sUser, const SNstring& sPass,
                         /* 9 more SNstring params ... */ )
    {
    string User;
    string Pass;
    User = sUser; // Convert to our format
    Pass = sPass;
    // etc . . .
    }

Here's the code revised to use the initialization technique.

void SNCommGPSendNewUser(const SNstring& sUser, const SNstring& sPass,
                         /* 9 more SNstring params ... */ )
    {
    // Convert to our format
    string User = sUser;
    string Pass = sPass;
    // etc . . .
    }

Readability improvement: 100%. Lines of code: 50% of original. Speed improvement: just over 3%. Not huge, but certainly nothing to complain about. Triple win.

3.4: Use Constructor Initialization Lists

In any constructor that initializes member objects, it can pay big dividends to set the objects using an initialization list rather than within the constructor itself. Why? Class member variables are automatically constructed using their default constructor prior to entry within the class constructor itself. You can override this behavior by specifying a different member constructor (usually a copy constructor) in the initialization list. Multiple initializations are separated with commas (not shown here).

template <class T> class CtorInit
    {
    T m_Value;
public:

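    // no list (the two constructors are shown together for comparison;
    // a real class would define only one of them)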
    CtorInit(const T& t) // m_Value default ctor called here automatically
        {
        m_Value = t; // m_Value assignment operator called
        }

    // with list
    CtorInit(const T& t) : m_Value(t) { } // m_Value copy ctor called
    };

The drawback to using initialization lists is that there's no way to do error checking on incoming values before the members are initialized. In the "no list" example we could do some validation on t within the CtorInit function before assigning it. In the "with list" example, we can't do any error checking until we've actually entered the CtorInit body, by which time m_Value has already been initialized from t. There's also a readability drawback, especially if you're not used to initialization lists.

Nevertheless, these are good performance gains, particularly for the complex object. This type of performance can outweigh the drawbacks.
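Here is a sketch of the comma-separated form mentioned earlier, with a counter to show how many member operations each style performs. Member, NoList, and WithList are hypothetical names used only for this illustration.

```cpp
#include <string>

// Hypothetical instrumented member type: counts default constructions,
// copy constructions, and assignments in a single combined counter.
struct Member
    {
    static int calls;
    std::string text;

    Member() { ++calls; }
    Member(const Member& m) : text(m.text) { ++calls; }
    Member& operator=(const Member& m)
        {
        text = m.text;
        ++calls;
        return *this;
        }
    };
int Member::calls = 0;

// Assigns in the constructor body: each member is default constructed
// before the body runs, then assigned inside it.
class NoList
    {
    Member m_First;
    Member m_Second;
public:
    NoList(const Member& a, const Member& b)
        {
        m_First = a;
        m_Second = b;
        }
    };

// Uses an initialization list; the initializers are separated by commas.
class WithList
    {
    Member m_First;
    Member m_Second;
public:
    WithList(const Member& a, const Member& b)
        : m_First(a), m_Second(b)
        {
        }
    };
```

Constructing a NoList performs four member operations (two default constructions and two assignments), while constructing a WithList performs two copy constructions. Note that members are always initialized in the order they are declared in the class, regardless of the order they appear in the list, so it's least confusing to keep the two orders the same.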

3.5: Use Operator += Instead of Operator + Alone

One of the great things about C++ is the ability to define your own operators. Rather than coding strC = strA; strC.append(strB); you can code strC = strA + strB. The notion of "appending" is conveyed much more simply and precisely using operator +. One thing to keep in mind, however, is that operator + returns a temporary value that must be both constructed and destructed, whereas operator += modifies an existing value. In fact, operator + can usually be defined in terms of operator +=.

T T::operator + (const T& t) const // const: doesn't modify *this
    {
    T result(*this); // temporary object
    result += t;
    return result;
    }

It's typically more efficient to use += instead of + alone, because we avoid generating a temporary object. Consider the following functions. They give the same result, but one uses + alone and the other uses +=.

template <class T> T OperatorAlone(const T& a, const T& b)
    {
    T c(a + b);
    return (c);
    }

template <class T> T OperatorEquals(const T& a, const T& b)
    {
    T c(a);
    c += b;
    return (c);
    }

For intrinsic types, + alone gives better results, but for non-trivial classes, especially classes with costly construction time, += is the better choice.
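The temporaries are easiest to see when several values are combined in one expression. The sketch below counts constructions for a chain of additions; Vec2, SumWithPlus, and SumWithPlusEquals are hypothetical names for this illustration. The exact counts depend on how much copy elision your compiler performs, so treat the numbers as a lower bound for + and an upper bound for +=.

```cpp
// Hypothetical instrumented value type: counts every construction.
struct Vec2
    {
    static int constructions;
    double x, y;

    Vec2(double x_, double y_) : x(x_), y(y_) { ++constructions; }
    Vec2(const Vec2& v) : x(v.x), y(v.y) { ++constructions; }

    Vec2& operator+=(const Vec2& v)
        {
        x += v.x;
        y += v.y;
        return *this;
        }
    };
int Vec2::constructions = 0;

// operator + defined in terms of operator +=
Vec2 operator+(const Vec2& a, const Vec2& b)
    {
    Vec2 result(a); // every call builds a new object
    result += b;
    return result;
    }

// Sum using + alone: each + in the chain builds its own result object.
Vec2 SumWithPlus(const Vec2& a, const Vec2& b, const Vec2& c, const Vec2& d)
    {
    return a + b + c + d;
    }

// Same sum using +=: one copy of a, then in-place updates.
Vec2 SumWithPlusEquals(const Vec2& a, const Vec2& b, const Vec2& c, const Vec2& d)
    {
    Vec2 total(a);
    total += b;
    total += c;
    total += d;
    return total;
    }
```

With four operands, the + version constructs at least three objects (one per +), while the += version constructs one, plus whatever copies the compiler fails to elide on return. The gap grows with the length of the chain and with the cost of constructing the type.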

3.6: Use Prefix Operators

Objects that provide the concept of increment and decrement often provide the ++ and -- operators. There are two types of incrementing, prefix (++x) and postfix (x++). In the prefix case, the value is increased and the new value is returned. In the postfix case, the value is increased, but the old value is returned. Correctly implementing the postfix case requires saving the original value of the object - in other words, creating a temporary object. The postfix code can usually be defined in terms of the prefix code.

const T T::operator ++ (int) // postfix
    {
    T orig(*this);
    ++(*this); // call prefix operator
    return (orig);
    }

The clear recommendation: avoid postfix operators. In fact, you may want to declare the postfix operators in the private section of the class so that the compiler will flag incorrect usage for you automatically.

Strings aren't included in the results because increment doesn't make sense for strings. (It doesn't really make sense for bitmaps, either, but I defined increment to increase the width and height by one, forcing a reallocation.) Where this recommendation really shines is for mathematical objects like complex. The prefix operator is almost 50% faster for complex objects.
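As a rough illustration of where the extra cost comes from, the sketch below counts copies made by each form in a loop. Counter and the two helper functions are hypothetical names for this example, not the classes used in the measurements above.

```cpp
// Hypothetical instrumented type: counts copy constructor calls.
struct Counter
    {
    static int copies;
    int value;

    explicit Counter(int v) : value(v) { }
    Counter(const Counter& c) : value(c.value) { ++copies; }

    Counter& operator++() // prefix: modify in place, return self
        {
        ++value;
        return *this;
        }

    const Counter operator++(int) // postfix, in terms of prefix
        {
        Counter orig(*this); // the saved original is a copy
        ++(*this);
        return orig;
        }
    };
int Counter::copies = 0;

// Copies made by n prefix increments.
int CopiesForPrefix(int n)
    {
    Counter c(0);
    Counter::copies = 0;
    for (int i = 0; i < n; ++i)
        ++c;
    return Counter::copies;
    }

// Copies made by n postfix increments.
int CopiesForPostfix(int n)
    {
    Counter c(0);
    Counter::copies = 0;
    for (int i = 0; i < n; ++i)
        c++;
    return Counter::copies;
    }
```

The prefix loop makes no copies at all. The postfix loop makes at least one copy per iteration, even though the returned value is thrown away, because the saved original has to be constructed before the increment happens.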

3.7: Use Explicit Constructors

Explicit is a recent keyword added to C++ and now part of the language standard. This keyword solves the following potential problem. Suppose you have the following class:

class pair
    {
    double x, y;
public:
    pair() : x(0), y(0) { } // default ctor, needed for "pair p;" below
    pair(const string& s) { . . . }
    bool operator == (const pair& c) const { . . . }
    };

Now suppose you do the following comparison, either purposefully or accidentally:

pair p;
string s;
if (p == s) { . . . }

Your compiler is pretty smart. It knows how to compare two pairs because you told it how in the pair class. It also knows how to create a pair given a string, so it can easily evaluate (p == s). The drawback is that we've hidden the construction of the second pair - it's implicit. If that constructor is expensive, it's difficult to see that it's being invoked. Worse, if we made a mistake and we didn't really want to compare a pair with a string, the compiler won't tell us.

My advice: make all single-argument constructors (except the copy constructor) explicit.

explicit pair(const string& s) { . . . }

Now the (p == s) line will give a compiler error. If you really want to compare these guys, you must explicitly call the constructor:

if (p == pair(s)) { . . . }

Using explicit will protect you from stupid mistakes and make it easier for you to pinpoint potential bottlenecks.
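Here is a compilable sketch of the same idea using a different, hypothetical class (Label) so that no elided code from the pair example has to be guessed. It assumes a C++11 or later compiler for static_assert and std::is_convertible; the conversions counter and the SameLabel function exist only for this illustration.

```cpp
#include <string>
#include <type_traits>

// Hypothetical class with an explicit single-argument constructor.
struct Label
    {
    static int conversions; // counts constructions from std::string
    std::string text;

    explicit Label(const std::string& s) : text(s) { ++conversions; }
    bool operator==(const Label& other) const { return text == other.text; }
    };
int Label::conversions = 0;

// Because the constructor is explicit, a std::string does not silently
// turn into a Label.
static_assert(!std::is_convertible<std::string, Label>::value,
              "std::string must not convert implicitly to Label");

bool SameLabel(const Label& p, const std::string& s)
    {
    // return p == s;     // would not compile: no implicit conversion
    return p == Label(s); // the conversion is visible at the call site
    }
```

Each call to SameLabel constructs exactly one Label from the string, and that construction appears in the source where a reader or a profiler can find it. Removing explicit from the constructor would make the commented-out line compile, and the static_assert would fail.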
