Walletfox.com

Pointer vs reference: The perils of reassignment

As you know, in C++ you can refer to an existing object either through a pointer or through a reference. Although the two might appear to represent similar concepts, one important difference is that you can reassign a pointer to point to a different address, but you cannot do this with a reference. A reference is bound to one object when it is initialized and practically stands for that object from then on.
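The difference can be sketched with two plain int variables before we move on to a larger example. The names reassignDemo and Result below are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical result type for this illustration only.
struct Result {
    int a;
    int b;
    bool pointerMoved;     // did the pointer end up holding the address of b?
    bool referenceStillA;  // does the reference still refer to a?
};

// Reassigns a pointer, then assigns through a reference, and reports
// what happened to the two objects involved.
Result reassignDemo()
{
    int a = 1;
    int b = 2;

    int* p = &a;   // p holds the address of a
    p = &b;        // reassignment: p now holds the address of b, a and b are untouched
    bool pointerMoved = (p == &b);

    int& r = a;    // r is bound to a for its whole lifetime
    r = b;         // not a rebinding: copies the value of b into a
    bool referenceStillA = (&r == &a);

    return {a, b, pointerMoved, referenceStillA};
}
```

After the call, a holds the value 2 because the assignment through r overwrote it, while the reassigned pointer changed neither a nor b.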

The tallest skyscraper

To show you the difference between a pointer and a reference, we are going to look for the tallest Skyscraper in a list of Skyscraper objects. We are going to use a naive algorithm that starts by marking an arbitrary Skyscraper object (e.g. the first one in the list) as the tallest. As the iteration progresses, any time we find a taller Skyscraper object, we need to reassign the 'address holder' of the tallest skyscraper to the address of that taller object. In our first example, we are going to use a pointer to store the address of the currently tallest skyscraper. In the second one, we are going to use a reference (it will become clear very soon that using a reference for this purpose is a big mistake). Below you can see the class Skyscraper, which we are going to use throughout this example.

#ifndef SKYSCRAPER_H
#define SKYSCRAPER_H
#include <stdexcept>
#include <string>
#include <iostream>

class Skyscraper
{
public:
    Skyscraper(const std::string& name, double height);
    auto name() const {return m_name;}
    auto height() const {return m_height;}

private:
    std::string m_name;
    double m_height;
};

std::ostream& operator<<(std::ostream &output, const Skyscraper &s);

#endif // SKYSCRAPER_H

#include "skyscraper.h"

Skyscraper::Skyscraper(const std::string &name, double height): m_name(name),
    m_height(height){
    if(m_height < 0)
        throw std::invalid_argument(
                "Height must be non-negative.");
}

std::ostream& operator<<(std::ostream &output, const Skyscraper &s)
{
    output << s.name() << " " << s.height() << '\n';
    return output;
}

For simplicity, we are only going to use a short list of four skyscrapers, which we are going to store in a vector.

#include <iostream>
#include <vector>
#include "skyscraper.h"

int main()
{
    auto skyscrapers = std::vector<Skyscraper> {Skyscraper("Empire State", 381),
                                                Skyscraper("Petronas", 482),
                                                Skyscraper("Burj Khalifa", 828),
                                                Skyscraper("Taipei", 509)};
}

Address of the tallest Skyscraper - pointer

Now let's have a look at the details of the algorithm that identifies the tallest skyscraper. Initially, the pointer tallestScraper points to skyscrapers.at(0), i.e. the Empire State.

Skyscraper* tallestScraper = &skyscrapers.at(0);

for(auto i = 1u; i < skyscrapers.size(); ++i){
   if(skyscrapers.at(i).height() > tallestScraper->height())
      tallestScraper = &skyscrapers.at(i);
}

Notice the first line of the code above. As we are declaring a pointer, an asterisk has to appear on the left side of the initialization. Because we are storing the address of an object, an ampersand has to appear on the right side.

Next, we iterate over the remaining skyscrapers. If the next skyscraper, i.e. skyscrapers.at(1), happens to be taller than the skyscraper whose address is stored in tallestScraper, we reassign the pointer tallestScraper to the address of skyscrapers.at(1). If it is shorter, we do nothing. We continue in this way until we reach the end of the list.

You can see the entire main.cpp below.

#include <iostream>
#include <vector>
#include "skyscraper.h"

int main()
{
    auto skyscrapers = std::vector<Skyscraper> {Skyscraper("Empire State", 381),
                                                Skyscraper("Petronas", 482),
                                                Skyscraper("Burj Khalifa", 828),
                                                Skyscraper("Taipei", 509)};

    Skyscraper* tallestScraper = &skyscrapers.at(0);

    for(auto i = 1u; i < skyscrapers.size(); ++i)
        if(skyscrapers.at(i).height() > tallestScraper->height())
            tallestScraper = &skyscrapers.at(i);

    std::cout << "The tallest skyscraper: " << *tallestScraper << '\n';
    for(auto i = 0u; i < skyscrapers.size(); ++i)
        std::cout << skyscrapers.at(i);
}

If you run the code, you are going to get the output below. Everything is alright, the tallest building is Burj Khalifa. Reassigning a pointer during the iterations works just fine.

The tallest skyscraper: Burj Khalifa 828

Empire State 381
Petronas 482
Burj Khalifa 828
Taipei 509

Currently tallest skyscraper

Step  Name          Address
0     Empire State  0xa92c68
1     Petronas      0xa92ca0
2     Burj Khalifa  0xa92cf0
3     Burj Khalifa  0xa92cf0
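The step-by-step behaviour of the pointer can be reproduced with a small tracing function. This is only a sketch: Tower is a hypothetical stand-in for the Skyscraper class so that the block compiles on its own, and tracePointer is a name introduced here, not part of the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the article's Skyscraper class.
struct Tower {
    std::string name;
    double height;
};

// Runs the naive search and records, after every step, which element the
// pointer currently points to. Assumes towers is not empty.
std::vector<const Tower*> tracePointer(const std::vector<Tower>& towers)
{
    std::vector<const Tower*> trace;
    const Tower* tallest = &towers.at(0);
    trace.push_back(tallest);                 // step 0
    for (std::size_t i = 1; i < towers.size(); ++i) {
        if (towers.at(i).height > tallest->height)
            tallest = &towers.at(i);          // reassign the pointer
        trace.push_back(tallest);             // step i
    }
    return trace;
}
```

For the four skyscrapers used in this article, the recorded pointers refer to elements 0, 1, 2 and 2 of the vector, and the vector itself is never modified. The concrete addresses will differ from run to run.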

Address of the tallest Skyscraper - reference

Now let's have a look at what would happen if we used a reference instead of a pointer to represent the address of the tallest Skyscraper. Notice some changes in the code. As we are now declaring a reference, the asterisk in the initialization is replaced by an ampersand. On the other hand, the ampersand disappears from the right side of the initialization. Also, in the loop, the member functions of tallestScraper are now accessed through '.' instead of the '->' that we used previously.

Skyscraper& tallestScraper = skyscrapers.at(0);

for(auto i = 1u; i < skyscrapers.size(); ++i){
   if(skyscrapers.at(i).height() > tallestScraper.height())
      tallestScraper = skyscrapers.at(i);
}

The entire code can be seen below:

#include <iostream>
#include <vector>
#include "skyscraper.h"

int main()
{
    auto skyscrapers = std::vector<Skyscraper> {Skyscraper("Empire State", 381),
                                                Skyscraper("Petronas", 482),
                                                Skyscraper("Burj Khalifa", 828),
                                                Skyscraper("Taipei", 509)};

    Skyscraper& tallestScraper = skyscrapers.at(0);

    for(auto i = 1u; i < skyscrapers.size(); ++i){
        if(skyscrapers.at(i).height() > tallestScraper.height())
            tallestScraper = skyscrapers.at(i);
    }

    std::cout << "The tallest skyscraper: " << tallestScraper << '\n';
    for(auto i = 0u; i < skyscrapers.size(); ++i)
        std::cout << skyscrapers.at(i);
}

If you run the code, everything seems to be alright at first. The tallest skyscraper seems to be the Burj Khalifa. But here comes the problem. Look at the original list of skyscrapers printed below. You will see that the Empire State has disappeared from our list and we now have two Burj Khalifas!

The tallest skyscraper: Burj Khalifa 828

Burj Khalifa 828
Petronas 482
Burj Khalifa 828
Taipei 509

What happened is clear already after the first iteration. Remember how we initially bound the reference tallestScraper to skyscrapers.at(0)? When the algorithm found a taller skyscraper, the statement tallestScraper = skyscrapers.at(i) did not rebind the reference to that taller skyscraper. The reference kept referring to the first element of the vector regardless of what we tried to do. Instead, each assignment invoked the copy assignment operator of Skyscraper and overwrote the contents (i.e. the name and height) of that first element. In the end, the first element changed from Empire State with 381 m to Burj Khalifa with 828 m. Remember, once you initialize a reference, you cannot reassign it. A reference practically stands for an object. You cannot make it 'point' to a different object.


Currently tallest skyscraper - reference

Step  Name          Address
0     Empire State  0x842cf8
1     Petronas      0x842cf8 !
2     Burj Khalifa  0x842cf8 !
3     Burj Khalifa  0x842cf8 !
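If you prefer reference-like syntax but still need to rebind during the search, one possible alternative to a raw pointer is std::reference_wrapper from the header <functional>. The sketch below is not part of the original program: Tower is a hypothetical stand-in for Skyscraper and tallest is a name introduced for illustration. Note also that declaring the reference in the faulty version as const Skyscraper& would have turned the accidental overwrite into a compile-time error, because you cannot assign through a reference to const.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the article's Skyscraper class, kept minimal
// so that this sketch compiles on its own.
struct Tower {
    std::string name;
    double height;
};

// Same naive search, but with std::reference_wrapper: assigning to the
// wrapper rebinds it to another element instead of copying into the
// element it currently refers to. Assumes towers is not empty.
const Tower& tallest(const std::vector<Tower>& towers)
{
    std::reference_wrapper<const Tower> best = towers.at(0);
    for (std::size_t i = 1; i < towers.size(); ++i) {
        if (towers.at(i).height > best.get().height)
            best = towers.at(i);   // rebinds the wrapper; no element is modified
    }
    return best.get();
}
```

With the four skyscrapers from this article, the returned reference refers to the third element of the vector (Burj Khalifa), and the first element still holds Empire State with 381 m.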
