A story of a very large loop with a long instruction dependency chain.
All posts tagged loop blocking
For Software Performance, the Way Data is Accessed Matters!
Author: Ivica Bogosavljević. Posted in: Low Level Performance, Memory Subsystem Performance, Performance. 2 Replies.
In our experiments with memory access patterns, we saw that good data locality is key to good software performance. Accessing memory sequentially and splitting the data set into small pieces that are processed individually both improve data locality and software speed. In this post, we will present a few techniques to improve the…
Memory Access Pattern and Performance: the Example of Matrix Multiplication
Author: Ivica Bogosavljević. Posted in: Computational Performance, Low Level Performance, Performance.
We use a matrix multiplication example to investigate loop interchange and loop tiling as techniques for speeding up programs that work with matrices.
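To make the two techniques concrete, here is a minimal sketch of what loop interchange and loop tiling can look like for square matrix multiplication. It is an illustration under assumptions, not the code from the post: the flat row-major `Matrix` alias, the function names, and the `tile` parameter are all hypothetical, and the best tile size depends on the cache sizes of the machine.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumption: an n x n matrix stored row-major in a flat vector.
using Matrix = std::vector<double>;

// Baseline i-j-k order: the inner loop reads b with stride n (column-wise),
// which gives poor data locality for large n.
void multiply_naive(const Matrix& a, const Matrix& b, Matrix& c, std::size_t n) {
    std::fill(c.begin(), c.end(), 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

// Loop interchange: swapping the j and k loops (i-k-j order) makes the inner
// loop walk both b and c sequentially along a row.
void multiply_interchanged(const Matrix& a, const Matrix& b, Matrix& c, std::size_t n) {
    std::fill(c.begin(), c.end(), 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += aik * b[k * n + j];
        }
}

// Loop tiling (blocking): process tile x tile sub-blocks so the data touched
// by the inner loops is small enough to be reused while it is still cached.
void multiply_tiled(const Matrix& a, const Matrix& b, Matrix& c,
                    std::size_t n, std::size_t tile) {
    if (tile == 0) tile = 1;  // guard against a zero step
    std::fill(c.begin(), c.end(), 0.0);
    for (std::size_t ii = 0; ii < n; ii += tile)
        for (std::size_t kk = 0; kk < n; kk += tile)
            for (std::size_t jj = 0; jj < n; jj += tile) {
                const std::size_t i_end = std::min(ii + tile, n);
                const std::size_t k_end = std::min(kk + tile, n);
                const std::size_t j_end = std::min(jj + tile, n);
                for (std::size_t i = ii; i < i_end; ++i)
                    for (std::size_t k = kk; k < k_end; ++k) {
                        const double aik = a[i * n + k];
                        for (std::size_t j = jj; j < j_end; ++j)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}
```

All three functions compute the same product; only the order in which memory is visited differs. Whether interchange or tiling pays off on a given machine, and by how much, is something to measure rather than assume.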