Performance Debugging with llvm-mca: Simulating the CPU!
We debug our performance problem by simulating it with llvm-mca!
Senior Software Engineer with 10 years of experience in Linux and bare-metal embedded systems. His professional focus is application performance improvement: techniques that make your C/C++ program run faster through better algorithms, better exploitation of the underlying hardware, and better use of the standard library, the programming language, and the operating system.
Flamegraphs are a great way to visualize resource consumption in your program, and I am a big fan of them (I have written about them on two occasions – here and here). My biggest concern with flamegraphs is bad or missing tooling: to create flamegraphs, you need to have a good profiler and a binary…
This is the last memory optimization that we are covering in this blog. You can see the full list of all the memory subsystem optimizations that we covered earlier here. Definitely worth a read for anyone who is trying to improve the performance of memory-intensive software. In this post, we are covering a few remaining optimization techniques…
In this post we investigate methods to speed up convergence loops – while loops that slowly converge to the correct result.
In this post we talk about memory mechanisms that increase memory access latency, and we explore techniques to avoid them in latency-sensitive systems.
We explore the performance of latency-sensitive applications, or more specifically, how to avoid evicting your critical data from the data cache.
We investigate explicit software prefetching, a mechanism software developers can use to prefetch the data in advance so it is ready once the program needs it.
A story of a very large loop with a long instruction dependency chain.
How to avoid register spilling in vectorized code with many constants?
We investigate the unusual way the memory subsystem interacts with branch prediction and how this interaction shapes software performance.