All you need is Superword-Level Parallelism: Systematic Control-Flow Vectorization with SLP
Thu 16 Jun 2022 04:10 - 04:30 at Toucan - Tensors
Superword-level parallelism (SLP) vectorization is a proven technique for vectorizing straight-line code. It works by replacing independent, isomorphic instruction sequences with equivalent vector instructions. Larsen and Amarasinghe originally developed SLP vectorization, combined with loop unrolling to vectorize inner loops, as a simpler, more flexible alternative to traditional loop-based vectorization. However, this vision of replacing traditional loop-based vectorization has not been realized because SLP is unable to reason directly about control flow.
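To make the packing idea concrete, here is a minimal sketch in plain Python that models four independent, isomorphic scalar statements and the single vector operation that could replace them. The helper `vadd4` is a hypothetical stand-in for a 4-lane vector add, not a real instruction or library call, and the example is only an illustration of straight-line SLP, not code from the paper.

```python
# Toy model of SLP packing. Python lists stand in for vector registers;
# vadd4 is a hypothetical 4-lane vector add used only for illustration.

def scalar_version(b, c):
    """Four independent, isomorphic scalar statements (straight-line code)."""
    a = [0.0] * 4
    a[0] = b[0] + c[0]
    a[1] = b[1] + c[1]
    a[2] = b[2] + c[2]
    a[3] = b[3] + c[3]
    return a


def vadd4(x, y):
    """Hypothetical 4-lane vector add: one 'instruction' acting on a whole pack."""
    return [x[0] + y[0], x[1] + y[1], x[2] + y[2], x[3] + y[3]]


def packed_version(b, c):
    """Same computation after packing: two vector loads and one vector add."""
    vb = b[0:4]  # models a vector load of b[0..3]
    vc = c[0:4]  # models a vector load of c[0..3]
    return vadd4(vb, vc)


b = [1.0, 2.0, 3.0, 4.0]
c = [10.0, 20.0, 30.0, 40.0]
assert scalar_version(b, c) == packed_version(b, c)
```

The four scalar statements qualify for packing because they perform the same operation (isomorphic) and none reads a value another one writes (independent).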
In this work, we introduce SuperVectorization, a new vectorization framework that generalizes SLP vectorization to uncover parallelism that spans different basic blocks and different loop nests. With the capability to systematically vectorize instructions across control-flow regions such as basic blocks and loops, our framework simultaneously subsumes the roles of inner-loop, outer-loop, and straight-line vectorizers. We achieve a 1.36× geometric-mean speedup on Polybench [23] compared to LLVM’s default vectorization pipeline, which includes both a loop vectorizer and an SLP vectorizer. On a set of serial graphics benchmarks from Pharr and Mark [18], our vectorizer achieves a 1.47× geometric-mean speedup, with the most promising result being a 3.28× speedup on a volume renderer with complex, deeply nested control-flow constructs that prevent vectorization by existing vectorizers. We believe SuperVectorization paves the way for a unifying vectorization framework that subsumes traditional loop and SLP vectorization.
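The sketch below illustrates why control flow gets in the way of straight-line SLP, and one well-known way around it: rewriting branches as per-lane selects (if-conversion). This is an assumption chosen for illustration and is not presented as the mechanism SuperVectorization uses; `vselect4` is a hypothetical 4-lane select helper, and plain Python lists stand in for vector registers.

```python
# Toy model: branchy code versus a select-based (if-converted) form.
# vselect4 is a hypothetical 4-lane select, used only for illustration.

def branchy(x):
    """Each element takes its own branch, so the assignments sit in
    different basic blocks and a straight-line packer sees nothing to pack."""
    y = [0.0] * 4
    for i in range(4):
        if x[i] > 0:
            y[i] = x[i] * 2.0
        else:
            y[i] = -x[i]
    return y


def vselect4(mask, if_true, if_false):
    """Hypothetical 4-lane select: picks per lane according to the mask."""
    return [if_true[i] if mask[i] else if_false[i] for i in range(4)]


def if_converted(x):
    """Both sides are computed for every lane, then blended by the mask.
    Every step is now an isomorphic, straight-line operation on a pack."""
    mask = [xi > 0 for xi in x]        # models a vector compare
    doubled = [xi * 2.0 for xi in x]   # models a vector multiply
    negated = [-xi for xi in x]        # models a vector negate
    return vselect4(mask, doubled, negated)


x = [1.0, -2.0, -0.5, 3.0]
assert branchy(x) == if_converted(x) == [2.0, 2.0, 0.5, 6.0]
```

The select-based form does more arithmetic per lane, since both sides of the branch are evaluated, so whether it pays off depends on the target and on how expensive each side is.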
Wed 15 Jun (displayed time zone: Pacific Time (US & Canada))
Session time: 15:30 - 16:50

| Time | Length | Type | Title | Track | Authors |
| --- | --- | --- | --- | --- | --- |
| 15:30 | 20m | Talk | Autoscheduling for Sparse Tensor Algebra with an Asymptotic Cost Model | PLDI | |
| 15:50 | 20m | Talk | DISTAL: The Distributed Tensor Algebra Compiler | PLDI | Rohan Yadav (Stanford University), Alex Aiken (Stanford University), Fredrik Kjolstad (Stanford University) |
| 16:10 | 20m | Talk | All you need is Superword-Level Parallelism: Systematic Control-Flow Vectorization with SLP | PLDI | Yishen Chen (Massachusetts Institute of Technology), Charith Mendis (University of Illinois at Urbana-Champaign), Saman Amarasinghe (Massachusetts Institute of Technology) |
| 16:30 | 20m | Talk | Warping Cache Simulation of Polyhedral Programs | PLDI | |
Thu 16 Jun (displayed time zone: Pacific Time (US & Canada))
Session time: 03:30 - 04:50

| Time | Length | Type | Title | Track | Authors |
| --- | --- | --- | --- | --- | --- |
| 03:30 | 20m | Talk | Autoscheduling for Sparse Tensor Algebra with an Asymptotic Cost Model | PLDI | |
| 03:50 | 20m | Talk | DISTAL: The Distributed Tensor Algebra Compiler | PLDI | Rohan Yadav (Stanford University), Alex Aiken (Stanford University), Fredrik Kjolstad (Stanford University) |
| 04:10 | 20m | Talk | All you need is Superword-Level Parallelism: Systematic Control-Flow Vectorization with SLP | PLDI | Yishen Chen (Massachusetts Institute of Technology), Charith Mendis (University of Illinois at Urbana-Champaign), Saman Amarasinghe (Massachusetts Institute of Technology) |
| 04:30 | 20m | Talk | Warping Cache Simulation of Polyhedral Programs | PLDI | |

