Thu 16 Jun 2022 11:00 - 11:20 at Toucan - Analysis Chair(s): Xiaokang Qiu
Thu 16 Jun 2022 23:00 - 23:20 at Toucan - Analysis

The increasing popularity of WebAssembly as a compilation target creates a demand for understanding and reverse engineering WebAssembly binaries. An important first step in this process is to recover the types of functions in the binary. Unfortunately, there is currently no automated approach for obtaining type information beyond the four built-in, low-level types of WebAssembly. This paper presents SnowWhite, a learning-based approach for recovering precise, high-level parameter and return types for WebAssembly functions. SnowWhite distinguishes itself from prior work for other binary formats by representing the types-to-predict in an expressive type language. This language can describe a large number of complex types, instead of the fixed and usually small type vocabulary used in prior binary type prediction approaches. As types are sentences in the type language, we formulate the prediction as a sequence prediction task and build on the success of neural sequence-to-sequence models. We evaluate SnowWhite on a large-scale dataset of 6.3 million type samples extracted from over 300,000 WebAssembly object files. The results show the type language to be more expressive than prior work, precisely describing 1,225 types instead of the 7 to 35 types considered previously. Despite this expressiveness, the type prediction has high accuracy, exactly predicting 44.5% (75.2%) of all parameter types and 57.7% (80.5%) of all return types within the top-1 (top-5) predictions.
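To make the top-1 and top-5 figures concrete, the sketch below treats each type as a sentence (a tuple of tokens) and counts a sample as correct only if the true sentence exactly matches one of the first k ranked predictions. The token names (`pointer`, `primitive`, `struct`, and so on), the function `top_k_accuracy`, and the sample data are all invented for illustration; they are assumptions, not SnowWhite's actual type language, evaluation code, or results.

```python
from typing import List, Sequence, Tuple

# A type is modelled as a sentence: a tuple of tokens.
# The token vocabulary here is hypothetical, chosen only for illustration.
TypeSentence = Tuple[str, ...]


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[TypeSentence]],
    ground_truth: Sequence[TypeSentence],
    k: int,
) -> float:
    """Fraction of samples whose true type sentence exactly matches
    one of the first k ranked predictions for that sample."""
    if len(ranked_predictions) != len(ground_truth):
        raise ValueError("predictions and ground truth must have equal length")
    if not ground_truth:
        return 0.0
    hits = sum(
        1
        for preds, truth in zip(ranked_predictions, ground_truth)
        if truth in list(preds)[:k]
    )
    return hits / len(ground_truth)


# Made-up ground truth for three function parameters.
truths: List[TypeSentence] = [
    ("pointer", "primitive", "char"),
    ("primitive", "int", "32"),
    ("pointer", "struct"),
]

# Made-up ranked predictions, best guess first.
predictions: List[List[TypeSentence]] = [
    [("pointer", "primitive", "char"), ("pointer", "struct")],
    [("primitive", "float", "64"), ("primitive", "int", "32")],
    [("primitive", "int", "32"), ("pointer", "primitive", "char")],
]

top1 = top_k_accuracy(predictions, truths, k=1)
top2 = top_k_accuracy(predictions, truths, k=2)
```

Because a match requires the whole token sequence to be identical, a prediction that gets only the outer `pointer` right still counts as a miss, which is why exact-match accuracy is a strict measure for an expressive type language.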

Thu 16 Jun

Displayed time zone: Pacific Time (US & Canada)

10:40 - 12:00
Analysis (PLDI) at Toucan +12h
Chair(s): Xiaokang Qiu Purdue University, USA
10:40
20m
Talk
CycleQ: an efficient basis for cyclic equational reasoning
PLDI
Eddie Jones University of Bristol, C.-H. Luke Ong University of Oxford, Steven Ramsay University of Bristol
DOI
11:00
20m
Talk
Finding the Dwarf: Recovering Precise Types from WebAssembly Binaries
PLDI
Daniel Lehmann University of Stuttgart, Michael Pradel University of Stuttgart
DOI Pre-print
11:20
20m
Talk
Abstract Interpretation Repair
PLDI
Roberto Bruni University of Pisa, Roberto Giacobazzi University of Verona, Roberta Gori University of Pisa, Francesco Ranzato University of Padova
DOI Pre-print
11:40
20m
Talk
Differential Cost Analysis with Simultaneous Potentials and Anti-potentials
PLDI
Đorđe Žikelić IST Austria, Pauline Bolignano Amazon, Bor-Yuh Evan Chang University of Colorado Boulder & Amazon, Franco Raimondi Amazon & Middlesex University
DOI Pre-print
22:40 - 00:00
Analysis (PLDI) at Toucan
22:40
20m
Talk
CycleQ: an efficient basis for cyclic equational reasoning
PLDI
Eddie Jones University of Bristol, C.-H. Luke Ong University of Oxford, Steven Ramsay University of Bristol
DOI
23:00
20m
Talk
Finding the Dwarf: Recovering Precise Types from WebAssembly Binaries
PLDI
Daniel Lehmann University of Stuttgart, Michael Pradel University of Stuttgart
DOI Pre-print
23:20
20m
Talk
Abstract Interpretation Repair
PLDI
Roberto Bruni University of Pisa, Roberto Giacobazzi University of Verona, Roberta Gori University of Pisa, Francesco Ranzato University of Padova
DOI Pre-print
23:40
20m
Talk
Differential Cost Analysis with Simultaneous Potentials and Anti-potentials
PLDI
Đorđe Žikelić IST Austria, Pauline Bolignano Amazon, Bor-Yuh Evan Chang University of Colorado Boulder & Amazon, Franco Raimondi Amazon & Middlesex University
DOI Pre-print