Efficient Approximations for Cache-conscious Data Placement (virtual)
Sat 18 Jun 2022 01:30 - 01:50 at Toucan - Verification & Optimization
There is a huge and growing gap between the speed of accesses to data stored in main memory and in cache. Thus, cache misses account for a significant portion of runtime overhead in virtually every program, and minimizing them has been an active research topic for decades. The primary and most classical formal model for this problem is that of Cache-conscious Data Placement (CDP): given a commutative cache with constant capacity $k$ and a sequence $\Sigma$ of accesses to data elements, the goal is to map each data element to a cache line such that the total number of cache misses over $\Sigma$ is minimized. CDP has been widely studied since the 1990s. In POPL 2002, Petrank and Rawitz proved a notoriously strong hardness result: they showed that for every $k \geq 3$, CDP is not only NP-hard but also hard to approximate within any non-trivial factor unless $\text{P}=\text{NP}$. As such, all subsequent works gave up on theoretical improvements and instead focused on heuristic algorithms with no theoretical guarantees.
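To make the CDP objective concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a simplified cache model in which each of the $k$ lines holds only the most recently accessed element mapped to it, and in which first-time (cold) accesses count as misses; the abstract does not spell out these details. The names `count_misses` and `brute_force_placement` are hypothetical and introduced only for illustration.

```python
from itertools import product


def count_misses(accesses, placement):
    """Count misses for an access sequence under a fixed placement.

    Assumed model: each cache line holds the last element accessed on it,
    and an access misses whenever its line currently holds something else
    (including nothing at all).
    """
    resident = {}  # cache line -> element currently held on that line
    misses = 0
    for element in accesses:
        line = placement[element]
        if resident.get(line) != element:
            misses += 1
            resident[line] = element
    return misses


def brute_force_placement(accesses, k):
    """Try all k^n placements and return (fewest misses, placement).

    Exponential in the number of distinct elements, so this is only
    usable on toy instances.
    """
    elements = sorted(set(accesses))
    best = None
    for lines in product(range(k), repeat=len(elements)):
        placement = dict(zip(elements, lines))
        misses = count_misses(accesses, placement)
        if best is None or misses < best[0]:
            best = (misses, placement)
    return best
```

For example, `brute_force_placement(list("abcabc"), 2)` examines all $2^3$ placements of three elements onto two lines. The exhaustive search is what becomes infeasible at scale, which is why the hardness result above matters.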
In this work, we present the first-ever positive theoretical result for CDP. The fundamental idea behind our approach is that real-world instances of the problem have specific structural properties that can be exploited to obtain efficient algorithms with strong approximation guarantees. Specifically, the access graphs corresponding to many real-world access sequences are sparse and tree-like. This was already well-known in the community but has only been used to design heuristics without guarantees. In contrast, we provide efficient algorithms that provably approximate the optimal number of cache misses within any factor $1 + \epsilon$, assuming that the access graph of a specific degree $d_\epsilon$ is sparse, i.e., sparser real-world instances lead to tighter approximations. We also provide experimental results showing that our approach frequently outperforms previous methods.