Mon 13 Jun 2022 16:15 - 16:30 at Boardroom - Evening Chair(s): Swarat Chaudhuri
Tue 14 Jun 2022 04:15 - 04:30 at Boardroom - Evening

Machine learning promises to transform compilation and software engineering, yet is frequently limited by the scope of available datasets. In particular, there is a lack of the runnable, real-world datasets required for tasks ranging from neural program synthesis to machine learning-guided program optimization. We introduce a new dataset, ExeBench, which attempts to address this. It tackles two key issues with real-world code: references to external types and functions, and scalable generation of IO examples. ExeBench is the first publicly available dataset that pairs real-world C code taken from GitHub with IO examples that allow these programs to be run. We develop a toolchain that scrapes GitHub, analyzes the code, and generates runnable snippets of code. We analyze our benchmark suite using several metrics, and show it is representative of real-world code. ExeBench contains 4.5M compilable and 700k executable C functions. This scale of executable, real functions will enable the next generation of machine learning-based programming tasks.

Mon 13 Jun

Displayed time zone: Pacific Time (US & Canada)

15:30 - 17:00
Evening - MAPS at Boardroom +12h
Chair(s): Swarat Chaudhuri University of Texas at Austin
15:30
45m
Keynote
Unsupervised Program Synthesis: Hierarchy and Perception
MAPS
Kevin Ellis Cornell University
16:15
15m
Talk
ExeBench: An ML-scale dataset of executable C functions
MAPS
Jordi Armengol-Estapé University of Edinburgh, Jackson Woodruff University of Edinburgh, Alexander Brauckmann University of Edinburgh, José Wesley de Souza Magalhães University of Edinburgh, Michael F. P. O'Boyle University of Edinburgh
16:30
15m
Talk
Automatically Debugging AutoML Pipelines Using Maro: ML Automated Remediation Oracle
MAPS
Julian Dolby IBM Research, USA, Jason Tsay IBM Research, Martin Hirzel IBM Research
16:45
15m
Talk
A Graph Neural Network-based performance model for Deep Learning Applications
MAPS
Shikhar Singh University of Texas, James Hegarty Facebook, Hugh Leather University of Edinburgh, UK, Benoit Steiner Facebook

Tue 14 Jun

Displayed time zone: Pacific Time (US & Canada)

03:30 - 05:00
Evening - MAPS at Boardroom
03:30
45m
Keynote
Unsupervised Program Synthesis: Hierarchy and Perception
MAPS
Kevin Ellis Cornell University
04:15
15m
Talk
ExeBench: An ML-scale dataset of executable C functions
MAPS
Jordi Armengol-Estapé University of Edinburgh, Jackson Woodruff University of Edinburgh, Alexander Brauckmann University of Edinburgh, José Wesley de Souza Magalhães University of Edinburgh, Michael F. P. O'Boyle University of Edinburgh
04:30
15m
Talk
Automatically Debugging AutoML Pipelines Using Maro: ML Automated Remediation Oracle
MAPS
Julian Dolby IBM Research, USA, Jason Tsay IBM Research, Martin Hirzel IBM Research
04:45
15m
Talk
A Graph Neural Network-based performance model for Deep Learning Applications
MAPS
Shikhar Singh University of Texas, James Hegarty Facebook, Hugh Leather University of Edinburgh, UK, Benoit Steiner Facebook