Welcome to Johnny’s Software Lab, a blog for anyone interested in fast software written in C and C++.
Your program doesn’t run fast enough? You need someone to talk to about your software’s performance? You or your team want to learn to write faster software? Whatever it is, we can help you. Check out the consulting page for more info.
Featured posts
Optimizations for the Memory Subsystem
Memory Subsystem Performance has since grown into a comprehensive guide on how to speed up your software by increasing the amount of data available to the CPU.
C++ Performance
Posts about making better use of C++ language features:
Low-level optimizations
Posts about making better use of the underlying hardware:
- Make your program run faster by better using the data cache
- How branches influence the performance of your code and what can you do about it?
- Make your programs run faster: avoid function calls
- Make your programs run faster: avoid expensive instructions
- Making your program run faster in a multithreaded environment
Compilers, Toolchains and Performance
Posts about improving the performance of your code by making better use of compilers and toolchains:
Parallel Programming
Posts related to improving the performance of your code by using additional resources in the computer system:
Performance Analysis Tools
Posts related to analyzing software’s performance:
Debugging
Posts sorted by most recent
- Memory Subsystem Optimizations – The Remaining Topics
- Speeding Up Convergence Loops. Or, on Vectorization and Precision Control
- Latency-Sensitive Application and the Memory Subsystem Part 2: Memory Management Mechanisms
- Latency-Sensitive Applications and the Memory Subsystem: Keeping the Data in the Cache
- The pros and cons of explicit software prefetching
- A story of a very large loop with a long instruction dependency chain
- On Avoiding Register Spills in Vectorized Code with Many Constants
- Unexpected Ways Memory Subsystem Interacts with Branch Prediction
- Multithreading and the Memory Subsystem
- Speeding Up Translation of Virtual To Physical Memory Addresses: TLB and Huge Pages
- Faster hash maps, binary trees etc. through data layout modification
- Performance Through Memory Layout
- Measuring Memory Subsystem Performance
- Hiding Memory Latency With In-Order CPU Cores OR How Compilers Optimize Your Code
- Software Performance and Class Layout
- Horrible Code, Clean Performance
- Decreasing the Number of Memory Accesses: The Compiler’s Secret Life 2/2
- Decreasing the Number of Memory Accesses 1/2
- Frugal Programming: Saving Memory Subsystem Bandwidth
- Loop Optimizations: interpreting the compiler optimization report
- For Software Performance, the Way Data is Accessed Matters!
- What is faster: vec.emplace_back(x) or vec[x] ?
- When an instruction depends on the previous instruction depends on the previous instructions… : long instruction dependency chains and performance
- The memory subsystem from the viewpoint of software: how memory subsystem affects software performance 2/3
- The memory subsystem from the viewpoint of software: how memory subsystem affects software performance 1/3
- Instruction-level parallelism in practice: speeding up memory-bound programs with low ILP
- Memory consumption, dataset size and performance: how does it all relate?
- Crash course introduction to parallelism: Multithreading
- Vectorization, dependencies and outer loop vectorization: if you can’t beat them, join them
- Making your program run faster: the key concepts of software performance
- Why is quicksort faster than heapsort? And how to make them faster?
- When vectorization hits the memory wall: investigating the AVX2 memory gather instruction
- What are premature optimizations?
- Why do programs get slower with time?
- Loop Optimizations: taking matters into your hands
- Loop Optimizations: how does the compiler do it?
- The quest for the fastest linked list
- Performance Tuning Contest: July 2021 edition
- Debugging performance issues in kernel space: minor fault and major faults
- Hardware performance counters the easy way: quickstart likwid-perfctr
- Debugging performance issues in kernel space: system calls
- Memory Access Pattern and Performance: the Example of Matrix Multiplication
- “Premature optimization is the root of all evil”. Or is it?
- Flexibility and Performance
- Speedscope: visualize what your program is doing and where it is spending time
- Speeding up an Image Processing Algorithm
- The true price of virtual functions in C++
- 2-minute read: Class Size, Member Layout and Speed
- Performance Tuning Contest: February 2021 edition
- Crash course introduction to parallelism: SIMD Parallelism
- 2-minute read: How is Big O notation relevant on modern systems?
- Crash course introduction to parallelism: the algorithms
- Making your program run faster in a multithreaded environment
- 2-minute read: the Magic Touch of Parallel Algorithms
- 2-minute read: What is faster, std::endl or ‘\n’?
- Use explicit data prefetching to faster process your data structure
- Multitime: a small utility to measure your program’s runtime
- Excessive copying in C++ and your program’s speed
- Make your programs run faster: avoid expensive instructions
- Tune your program’s speed with profile guided optimizations
- Process polymorphic classes in lightning speed
- RR: The magic of record and replay debugging
- The price of dynamic memory: Memory Access
- The price of dynamic memory: Allocation
- Lessons in debugging: observe how programs interact with the Linux kernel with STRACE
- How branches influence the performance of your code and what can you do about it?
- CPU Dispatching: Make your code both portable and fast
- GDB: A quick guide to make your debugging easier
- Make your programs run faster: avoid function calls
- FlameGraphs: Understand where your program is spending time
- MOSH – a simple SSH replacement that works when network conditions are bad
- Link Time Optimizations: New Way to Do Compiler Optimizations
- Make your programs run faster by better using the data cache
- A story about error recovery a.k.a those boring recurring bugs and what to do about them