Low Level Performance Archives - Johnny's Software Lab

Exposing More Parallelism Is the Hidden Reason Why Some Vectorized Loops Are Faster – Not Vectorization per se

February 26, 2026February 26, 2026Ivica BogosavljevićLow Level Performance, Performance, Vectorization2 Replies

I was preparing an article about Highway – portable vectorization library by Google – so I ported a few examples from my vectorization workshop from AVX to Highway. One of the examples was vectorized binary search. I assume most readers are familiar with simple binary search. It looks something like this: We take a lookup…

Read

The messy reality of SIMD (vector) functions

July 4, 2025September 7, 2025Ivica BogosavljevićPerformance, Toolchain and Performance, VectorizationLeave a Reply

We’ve discussed SIMD and vectorization extensively on this blog, and it was only a matter of time before SIMD (or vector) functions came up. In this post, we explore what SIMD functions are, when they are useful, and how to declare and use them effectively. A SIMD function is a function that processes more than…

Read

An optimizing compiler doesn’t help much with long instruction dependencies

May 31, 2025July 2, 2025Ivica Bogosavljević2 Minute Reads, Memory Subsystem Performance, Performance, Toolchain and Performance1 Reply

Does it matter if we are compiling with optimizations off (O0) or optimizations on (O3) if the problem is memory bound? Let’s find out…

Memory Subsystem Optimizations – The Remaining Topics

October 31, 2024November 13, 2024Ivica BogosavljevićLow Level Performance, Memory Subsystem Performance, PerformanceLeave a Reply

This is the last memory optimization that we are covering in this blog. You can see the full list of all memory subsystem optimization that we covered earlier here. Definitely a read for anyone who is trying to improve performance of memory intensive software. In this post, we are covering a few remaining optimization techniques…

Read