Johnny's Software Lab

Resources for Software Performance Engineers


All posts by Ivica Bogosavljević

About Ivica Bogosavljević

Senior Software Engineer with 10 years of experience in Linux and bare-metal embedded systems. His professional focus is application performance improvement: techniques that make your C/C++ program run faster through better algorithms, better exploitation of the underlying hardware, and better use of the standard library, the programming language, and the operating system.

Memory Access Pattern and Performance: the Example of Matrix Multiplication

Posted on May 20, 2021 | March 11, 2023 | Author: Ivica Bogosavljević | Posted in: Computational Performance, Low Level Performance, Performance

We use a matrix multiplication example to investigate loop interchange and loop tiling as techniques for speeding up programs that work with matrices.

“Premature optimization is the root of all evil”. Or is it?

Posted on April 12, 2021 | March 19, 2022 | Author: Ivica Bogosavljević | Posted in: C++ Performance, Performance

We investigate premature optimization, and more specifically, the cases in which you want to think about performance early.

Flexibility and Performance

Posted on March 25, 2021 | December 6, 2025 | Author: Ivica Bogosavljević | Posted in: C++ Performance, Help the Compiler, Performance, Standard Library and Performance

In this post we talk about how to write code that is both flexible and fast!

Speedscope: visualize what your program is doing and where it is spending time

Posted on March 13, 2021 | March 19, 2022 | Author: Ivica Bogosavljević | Posted in: Developer Tools, Performance, Performance Analysis Tools | 9 Replies

In this post we introduce Speedscope, a useful tool to help you visualize what your program is doing and where it is spending time.

Speeding up an Image Processing Algorithm

Posted on March 3, 2021 | May 8, 2022 | Author: Ivica Bogosavljević | Posted in: Computational Performance, Performance

A post explaining how a few small changes in the right places can have a drastic effect on the performance of an image processing algorithm named Canny.

The true price of virtual functions in C++

Posted on February 21, 2021 | April 8, 2023 | Author: Ivica Bogosavljević | Posted in: C++ Performance, Performance | 12 Replies

We talk about virtual functions, and how the performance of software with virtual functions depends on many factors: the cost of additional instructions, cache misses, branch prediction misses, instruction cache misses and compiler optimizations.

2-minute read: Class Size, Member Layout and Speed

Posted on February 13, 2021 | March 19, 2022 | Author: Ivica Bogosavljević | Posted in: C++ Performance, Low Level Performance, Performance | 1 Reply

We explore how class size and the layout of its data members affect your program’s speed.

Performance Tuning Contest: February 2021 edition

Posted on February 7, 2021 | March 19, 2022 | Author: Ivica Bogosavljević | Posted in: Performance, Performance Contest

Take part in a performance tuning contest to learn more about performance tuning on real-world code.

Crash course introduction to parallelism: SIMD Parallelism

Posted on January 19, 2021 | March 19, 2022 | Author: Ivica Bogosavljević | Posted in: Computational Performance, Low Level Performance, Parallelization, Performance

This is the first article about hardware support for parallelization. We talk about SIMD, an extension found in almost every processor nowadays that lets you speed up your program.

2-minute read: How is Big O notation relevant on modern systems?

Posted on January 10, 2021 | March 19, 2022 | Author: Ivica Bogosavljević | Posted in: 2 Minute Reads, Algorithms and Performance, Performance

Big O notation is commonly used to describe algorithm performance. But modern hardware makes performance analysis much harder than it used to be. In this short article we give three interesting examples to illustrate the limits of big O notation.


