Johnny's Software Lab

Resources for Software Performance Engineers

All posts in Performance

“Premature optimization is the root of all evil”. Or is it?

Posted on April 12, 2021 (updated March 19, 2022) | Author: Ivica Bogosavljević | Posted in: C++ Performance, Performance

We investigate premature optimization, or more specifically, the cases in which you want to think about performance early.

Flexibility and Performance

Posted on March 25, 2021 (updated December 6, 2025) | Author: Ivica Bogosavljević | Posted in: C++ Performance, Help the Compiler, Performance, Standard Library and Performance

In this post we talk about how to write code that is both flexible and fast!

Speedscope: visualize what your program is doing and where it is spending time

Posted on March 13, 2021 (updated March 19, 2022) | Author: Ivica Bogosavljević | Posted in: Developer Tools, Performance, Performance Analysis Tools | 9 Replies

In this post we introduce Speedscope, a useful tool that helps you visualize what your program is doing and where it is spending time.

Speeding up an Image Processing Algorithm

Posted on March 3, 2021 (updated May 8, 2022) | Author: Ivica Bogosavljević | Posted in: Computational Performance, Performance

A post explaining how a few small changes in the right places can have a drastic effect on the performance of an image processing algorithm named Canny.

The true price of virtual functions in C++

Posted on February 21, 2021 (updated April 8, 2023) | Author: Ivica Bogosavljević | Posted in: C++ Performance, Performance | 12 Replies

We talk about virtual functions, and how the performance of software with virtual functions depends on many factors: the cost of additional instructions, cache misses, branch prediction misses, instruction cache misses and compiler optimizations.

2-minute read: Class Size, Member Layout and Speed

Posted on February 13, 2021 (updated March 19, 2022) | Author: Ivica Bogosavljević | Posted in: C++ Performance, Low Level Performance, Performance | 1 Reply

We explore how class size and the layout of its data members affect your program’s speed.

Performance Tuning Contest: February 2021 edition

Posted on February 7, 2021 (updated March 19, 2022) | Author: Ivica Bogosavljević | Posted in: Performance, Performance Contest

Take part in a performance tuning contest to learn more about performance tuning on real-world code.

Crash course introduction to parallelism: SIMD Parallelism

Posted on January 19, 2021 (updated March 19, 2022) | Author: Ivica Bogosavljević | Posted in: Computational Performance, Low Level Performance, Parallelization, Performance

This is the first article about hardware support for parallelization. We talk about SIMD, an extension almost every processor nowadays has that lets you speed up your program.

2-minute read: How is Big O notation relevant on modern systems?

Posted on January 10, 2021 (updated March 19, 2022) | Author: Ivica Bogosavljević | Posted in: 2 Minute Reads, Algorithms and Performance, Performance

Big O notation is commonly used to describe algorithm performance. But modern hardware makes performance analysis much harder than it used to be. In this short article we give three interesting examples to illustrate the limits of big O notation.

Crash course introduction to parallelism: the algorithms

Posted on January 4, 2021 (updated March 19, 2022) | Author: Ivica Bogosavljević | Posted in: Parallelization, Performance

When it comes to performance, there are two ways to go: one is to make better use of the existing hardware resources, the other is to use new hardware resources. We already talked a lot about how to increase the performance of your program by better using the existing resources, for example, by decreasing…

