Item 4 - Make sure that objects are initialized before they’re used.
Reading uninitialized values yields undefined behaviour.
Always initialize your objects before using them. For non-member objects of built-in types, do this manually. For example:
For everything else, the responsibility for initialization falls on constructors. Make sure that all constructors initialize everything in the object.
Note: Do not confuse assignment with initialization.
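A sketch of what an assignment-based constructor might look like. The member names come from the text; the PhoneNumber struct and the two accessors are assumptions added only to make the example compile and be checkable:

```cpp
#include <list>
#include <string>

// Hypothetical stand-in for a phone number type.
struct PhoneNumber {
    std::string digits;
};

class ABEntry {  // ABEntry = "Address Book Entry"
public:
    ABEntry(const std::string& name, const std::string& address,
            const std::list<PhoneNumber>& phones);

    // Accessors added for illustration only.
    const std::string& name() const { return theName; }
    int timesConsulted() const { return numTimesConsulted; }

private:
    std::string theName;
    std::string theAddress;
    std::list<PhoneNumber> thePhones;
    int numTimesConsulted;
};

ABEntry::ABEntry(const std::string& name, const std::string& address,
                 const std::list<PhoneNumber>& phones) {
    theName = name;         // these are all assignments,
    theAddress = address;   // not initializations
    thePhones = phones;
    numTimesConsulted = 0;
}
```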
A constructor that assigns to each member in its body will yield ABEntry objects with the values you expect, but it's still not the best approach. The rules of C++ stipulate that the data members of an object are initialized before the body of a constructor is entered. Inside the ABEntry constructor, theName, theAddress, and thePhones aren’t being initialized, they’re being assigned. Initialization took place earlier, when their default constructors were automatically called prior to entering the body of the ABEntry constructor. This isn’t true for numTimesConsulted, because it’s a built-in type. For it, there’s no guarantee it was initialized at all prior to its assignment.
A better way to write the ABEntry constructor is to use the member initialization list instead of assignments:
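A sketch of the same constructor written with a member initialization list. As before, PhoneNumber and the accessors are assumptions made for the sake of a self-contained example:

```cpp
#include <list>
#include <string>

// Hypothetical stand-in for a phone number type.
struct PhoneNumber {
    std::string digits;
};

class ABEntry {
public:
    ABEntry(const std::string& name, const std::string& address,
            const std::list<PhoneNumber>& phones)
        : theName(name),          // these are now all initializations
          theAddress(address),
          thePhones(phones),
          numTimesConsulted(0) {} // the constructor body is now empty

    // Accessors added for illustration only.
    const std::string& address() const { return theAddress; }
    int timesConsulted() const { return numTimesConsulted; }

private:
    std::string theName;
    std::string theAddress;
    std::list<PhoneNumber> thePhones;
    int numTimesConsulted;
};
```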
This constructor yields the same end result as the one above, but it will often be more efficient. The assignment-based version first called default constructors to initialize theName, theAddress, and thePhones, then promptly assigned new values on top of the default-constructed ones. All the work performed in those default constructions was therefore wasted. The member initialization list approach avoids that problem, because the arguments in the initialization list are used as constructor arguments for the various data members. In this case theName is copy-constructed from name, theAddress is copy-constructed from address, and thePhones is copy-constructed from phones. For most types, a single call to a copy constructor is more efficient, sometimes much more efficient, than a call to the default constructor followed by a call to the copy assignment operator.
If ABEntry had a constructor taking no parameters, it could be implemented like this:
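A sketch of such a default constructor, with every member named in the initialization list. PhoneNumber and the accessors are again hypothetical additions:

```cpp
#include <list>
#include <string>

// Hypothetical stand-in for a phone number type.
struct PhoneNumber {
    std::string digits;
};

class ABEntry {
public:
    ABEntry()
        : theName(),             // call theName's default constructor
          theAddress(),          // do the same for theAddress
          thePhones(),           // and for thePhones
          numTimesConsulted(0) {}  // explicitly initialize the int to 0

    // Accessors added for illustration only.
    const std::string& name() const { return theName; }
    int timesConsulted() const { return numTimesConsulted; }

private:
    std::string theName;
    std::string theAddress;
    std::list<PhoneNumber> thePhones;
    int numTimesConsulted;
};
```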
For objects of built-in type like numTimesConsulted, there is no difference in cost between initialization and assignment, but for consistency, it's often best to initialize everything via member initialization.
One aspect of C++ that isn't fickle is the order in which an object's data is initialized.
Base classes are initialized before derived classes (Item 12), and within a class, data members are initialized in the order in which they are declared.
In ABEntry, for example, theName will always be initialized first, theAddress second, thePhones third and numTimesConsulted last. This is true even if they are listed in a different order in the member initialization list.
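To see the declaration-order rule in action, here is a small sketch. Tracer, Widget, and initLog are hypothetical names invented for this illustration. The initialization list deliberately names the members in the "wrong" order, which many compilers will warn about:

```cpp
#include <string>
#include <vector>

// Hypothetical log that records the order in which members are constructed.
std::vector<std::string>& initLog() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical member type that logs its own construction.
struct Tracer {
    explicit Tracer(const std::string& label) { initLog().push_back(label); }
};

class Widget {
public:
    // The list names `second` before `first`, but declaration order wins.
    Widget() : second("second"), first("first") {}

private:
    Tracer first;   // declared first, so initialized first
    Tracer second;  // declared second, so initialized second
};
```

Listing members in declaration order, as the item recommends, keeps the source honest about what actually happens.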
One more thing to keep in mind is the order of initialization of non-local static objects defined in different translation units.
Let's pick that phrase apart bit by bit.
A static object is one that exists from the time it's constructed until the end of the program. Stack and heap-based objects are thus excluded. Included are global objects, objects defined at namespace scope, objects declared static inside classes, objects declared static inside functions, and objects declared static at file scope. Static objects inside functions are known as local static objects (because they're local to a function), and the other kinds of static objects are known as non-local static objects. Static objects are destroyed when the program exits, i.e. their destructors are called when main finishes executing.
A translation unit is the source code giving rise to a single object file. It's basically a single source file, plus all of its #include files.
The problem we're concerned with, then, involves at least two separately compiled source files, each of which contains at least one non-local static object. And the actual problem is this: if initialization of a non-local static object in one translation unit uses a non-local static object in a different translation unit, the object it uses could be uninitialized, because the relative order of initialization of non-local static objects defined in different translation units is undefined.
Example:
Suppose you have a FileSystem class that makes files on the Internet look like they're local. Since your class makes the world look like a single file system, you might create a special object at global or namespace scope representing the single file system:
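A sketch of such a library class and its global object. The numDisks member function and its placeholder return value are assumptions. In a real library the header would typically only declare the object (extern FileSystem tfs;) and one source file would define it:

```cpp
#include <cstddef>

class FileSystem {  // from your library
public:
    // Hypothetical member function; the value 4 is a placeholder.
    std::size_t numDisks() const { return 4; }
};

FileSystem tfs;  // object for clients to use; "tfs" = "the file system"
```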
Now suppose some client creates a class for directories in a file system. Naturally, their class uses the tfs object:
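A sketch of the client's class. The FileSystem stand-in is repeated so the snippet compiles on its own; the constructor parameter and the disks accessor are assumptions:

```cpp
#include <cstddef>
#include <string>

// Minimal stand-in for the library side (placeholder implementation).
class FileSystem {
public:
    std::size_t numDisks() const { return 4; }
};
FileSystem tfs;

class Directory {  // created by a library client
public:
    explicit Directory(const std::string& path)
        : path_(path),
          disks_(tfs.numDisks()) {}  // uses the tfs object

    std::size_t disks() const { return disks_; }

private:
    std::string path_;
    std::size_t disks_;
};
```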
Further suppose this client decides to create a single Directory object for temporary files:
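A sketch of that object, with stand-ins for the earlier classes repeated so it compiles on its own. Note that this snippet puts everything in one translation unit, where objects are initialized in definition order, so it happens to be safe. The hazard described in this item arises when tfs and tempDir live in different source files:

```cpp
#include <cstddef>
#include <string>

// Minimal stand-ins (placeholder implementations).
class FileSystem {
public:
    std::size_t numDisks() const { return 4; }
};
FileSystem tfs;

class Directory {
public:
    explicit Directory(const std::string& path)
        : path_(path), disks_(tfs.numDisks()) {}
    std::size_t disks() const { return disks_; }

private:
    std::string path_;
    std::size_t disks_;
};

Directory tempDir("/tmp");  // directory for temporary files; path is hypothetical
```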
Now the importance of initialization order becomes apparent: unless tfs is initialized before tempDir, tempDir's constructor will attempt to use tfs before it's been initialized. But tfs and tempDir were created by different people at different times in different source files: they're non-local static objects defined in different translation units. How can you be sure that tfs will be initialized before tempDir?
Fortunately, a small design change eliminates the problem entirely. All that has to be done is to move each non-local static object into its own function, where it's declared static. These functions return references to the objects they contain. Clients then call the functions instead of referring to the objects. In other words, non-local static objects are replaced with local static objects.
This approach is founded on C++'s guarantee that local static objects are initialized when the object's definition is first encountered during a call to that function. So if you replace direct accesses to non-local static objects with calls to functions that return references to local static objects, you're guaranteed that the references you get back will refer to initialized objects.
Here's the technique applied to both tfs and tempDir:
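A sketch of the function-based version. The class bodies, the placeholder disk count, and the "/tmp" path are assumptions; the shape of the two reference-returning functions is the point:

```cpp
#include <cstddef>
#include <string>

class FileSystem {
public:
    std::size_t numDisks() const { return 4; }  // placeholder value
};

FileSystem& tfs() {        // replaces the tfs object
    static FileSystem fs;  // define and initialize a local static object
    return fs;             // return a reference to it
}

class Directory {
public:
    explicit Directory(const std::string& path)
        : path_(path),
          disks_(tfs().numDisks()) {}  // now calls tfs() instead of using tfs
    std::size_t disks() const { return disks_; }

private:
    std::string path_;
    std::size_t disks_;
};

Directory& tempDir() {               // replaces the tempDir object
    static Directory td("/tmp");     // define and initialize a local static object
    return td;                       // return a reference to it
}
```

Clients write tfs() and tempDir() where they previously wrote tfs and tempDir. Whichever function is called first, the FileSystem object is constructed before the Directory constructor reads from it.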
The reference-returning functions dictated by this scheme are always simple: define and initialize a local static object on line 1, return it on line 2. On the other hand, the fact that these functions contain static objects makes them problematic in multithreaded systems. Then again, any kind of non-const static object, local or non-local, is trouble waiting to happen in the presence of multiple threads. One way to deal with such trouble is to manually invoke all the reference-returning functions during the single-threaded startup portion of the program. This eliminates initialization-related race conditions. (Since C++11, the language itself guarantees that the initialization of a local static object is thread-safe, although later concurrent use of the object still needs its own synchronization.)
To avoid using objects before they're initialized, then, you need to do only three things. First, manually initialize non-member objects of built-in types. Second, use member initialization lists to initialize all parts of an object. Finally, design around the initialization order uncertainty that afflicts non-local static objects defined in separate translation units.
Things to Remember:
Manually initialize objects of built-in types, because C++ only sometimes initializes them itself.
In a constructor, prefer use of the member initialization list to assignment inside the body of the constructor. List data members in the initialization list in the same order they're declared in the class.
Avoid initialization order problems across translation units by replacing non-local static objects with local static objects.