2-minute read: Which is faster, std::endl or '\n'?

We at Johnny’s Software Lab LLC are experts in performance. If performance is in any way a concern in your software project, feel free to contact us.

A few days ago I wrote a small app to illustrate one of the articles I was preparing. Basically, the program loaded a file from the hard disk, sorted its lines, and then wrote only the unique values to another file (omitting duplicates).

The function for writing unique values to a file looks like this:

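#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Writes each distinct line of the (already sorted) vector to file_name.
void remove_duplicates_and_save(const std::vector<std::string>& lines,
                                const std::string& file_name) {
    std::ofstream myfile(file_name);

    if (lines.empty()) {
        return;  // nothing to write; lines[0] below would be out of bounds
    }

    myfile << lines[0] << std::endl;
    for (std::size_t i = 1; i < lines.size(); i++) {
        if (lines[i] != lines[i - 1]) {
            myfile << lines[i] << std::endl;
        }
    }
}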

As you can see, the function is simple enough and there is nothing special about it. The whole program took 2.3 seconds to complete on a file with 1 million lines. When I ran it through speedscope’s flamegraphs, I got the following output:

Flamegraphs clearly show that most of the runtime is eaten away by function remove_duplicates_and_save

As you can see, a lot of time is spent in the remove_duplicates_and_save function, and if you look a little closer, it involves a lot of flushing! For those who don’t know, flushing means pushing the data held in the stream’s in-memory buffer out to the operating system, and ultimately to the hard drive, and it is a very expensive operation if done often. So to increase performance, the C++ standard library normally flushes only when its internal data buffer is full, when the stream is closed, or when you explicitly ask for it.
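To make the number of flushes visible, here is a minimal sketch that counts how often a stream asks its buffer to flush. The class CountingBuf and the two helper functions are hypothetical names made up for this illustration; they are not part of the original program, and an in-memory string buffer stands in for the file.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>

// Hypothetical helper for illustration: an in-memory buffer that counts how
// often the stream asks it to flush (the standard library calls sync()).
class CountingBuf : public std::stringbuf {
public:
    std::size_t flushes = 0;

protected:
    int sync() override {
        ++flushes;
        return 0;  // 0 signals success to the stream
    }
};

// Writes n lines terminated by std::endl and returns the number of flushes.
std::size_t flushes_with_endl(std::size_t n) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (std::size_t i = 0; i < n; ++i) {
        out << "line" << std::endl;  // newline plus a flush, every time
    }
    return buf.flushes;
}

// Writes n lines terminated by '\n' and returns the number of flushes.
std::size_t flushes_with_newline(std::size_t n) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (std::size_t i = 0; i < n; ++i) {
        out << "line" << '\n';  // newline only, no flush requested
    }
    return buf.flushes;
}
```

Under these assumptions, flushes_with_endl(1000) reports one flush per line, while flushes_with_newline(1000) reports none. With a real std::ofstream, each of those flushes typically becomes a separate write request to the operating system.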

I expected the remove_duplicates_and_save function to take less time than sort_lines; however, this was not the case. Upon closer inspection, I found the culprit. According to the C++ standard, outputting std::endl writes a newline and then flushes the buffer, which degrades performance. Replacing std::endl with '\n' gave the following flamegraph:

Flamegraph for whole program, where we replaced std::endl with ‘\n’

The program’s overall runtime went down from 2.3 seconds to 0.65 seconds. The function remove_duplicates_and_save almost disappeared from the flamegraph, which means its runtime is now very short. It is an unlucky choice in the design of the C++ standard library, but std::endl is a very inefficient way to write a new line to a file! So use '\n' instead!

