(OOPSLA 2020) Feedback-Driven Semi-Supervised Synthesis of Program Transformations
It is fairly common for developers to make repeated edits in code that are all instances of a more general program transformation. Since this process can be tedious and error-prone, we study the problem of automatically learning program transformations from past edits, which can then be used to predict future edits. We take the novel view of the problem as a semi-supervised learning problem: apart from the concrete edits that are instances of the general transformation, the learning procedure also exploits access to additional inputs (program subtrees) that are marked as positive or negative depending on whether the transformation applies to those inputs. We present a procedure to solve the semi-supervised transformation learning problem using anti-unification and programming-by-example synthesis technology. To eliminate reliance on access to marked additional inputs, we generalize the semi-supervised learning procedure to a feedback-driven procedure that also generates the marked additional inputs in an iterative loop. We apply these ideas to build and evaluate three applications that use different mechanisms for generating feedback. Compared to existing tools that learn program transformations from edits, our feedback-driven semi-supervised approach is vastly more effective in successfully predicting edits while requiring significantly less past edit data.
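To make the role of anti-unification concrete, the following is a minimal sketch of first-order anti-unification over program subtrees, together with a matcher that decides whether a candidate pattern accepts an additional input. It is an illustration only, not the procedure presented in the paper. The tree encoding (nested tuples of the form `(label, child, ...)` with string leaves), the `?h` hole naming, and the function names `anti_unify` and `matches` are all assumptions made for this example.

```python
from itertools import count


def anti_unify(a, b, table=None, fresh=None):
    """Return a least general pattern covering trees a and b.

    Trees are nested tuples (label, child, ...) with string leaves
    (a hypothetical encoding). Differing subtrees become holes named
    "?h0", "?h1", ...; the same pair of subtrees always maps to the
    same hole.
    """
    if table is None:
        table = {}
    if fresh is None:
        fresh = count()
    if a == b:
        return a
    same_shape = (
        isinstance(a, tuple) and isinstance(b, tuple)
        and len(a) == len(b) and a[0] == b[0]
    )
    if same_shape:
        children = tuple(
            anti_unify(x, y, table, fresh) for x, y in zip(a[1:], b[1:])
        )
        return (a[0],) + children
    if (a, b) not in table:
        table[(a, b)] = f"?h{next(fresh)}"
    return table[(a, b)]


def matches(pattern, tree, binding=None):
    """Check whether tree is an instance of pattern.

    Assumes no real leaf starts with "?h". A hole that occurs more than
    once must be bound to the same subtree at every occurrence.
    """
    if binding is None:
        binding = {}
    if isinstance(pattern, str) and pattern.startswith("?h"):
        if pattern in binding:
            return binding[pattern] == tree
        binding[pattern] = tree
        return True
    if isinstance(pattern, tuple) and isinstance(tree, tuple):
        return (
            len(pattern) == len(tree)
            and pattern[0] == tree[0]
            and all(matches(p, t, binding)
                    for p, t in zip(pattern[1:], tree[1:]))
        )
    return pattern == tree


# Two past edit locations that differ only in one argument name.
edit_a = ("call", ("name", "assertEqual"), ("name", "x"), ("const", "None"))
edit_b = ("call", ("name", "assertEqual"), ("name", "y"), ("const", "None"))
pattern = anti_unify(edit_a, edit_b)

# Additional inputs, as in the semi-supervised setting: the first should be
# accepted by the pattern, the second should be rejected.
candidate_pos = ("call", ("name", "assertEqual"), ("name", "z"), ("const", "None"))
candidate_neg = ("call", ("name", "assertEqual"), ("name", "z"), ("const", "0"))
```

In this sketch, a pattern that accepts an input marked negative would be too general and one that rejects an input marked positive would be too specific; how such marked inputs are used to refine the pattern is the subject of the paper and is not modeled here.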