MLforSE

Since 2018, we have been researching the intersection of artificial intelligence and software engineering. Our goal is to teach computers to understand, write, and improve code. We primarily use deep learning techniques, including large language models and recurrent neural networks. Currently, our main interests are code interpretation by large language models, program transformations, and decompilation.

Code Interpretation by Large Language Models

The use of large language models in software development is no longer unusual. However, it remains unclear how effectively these models can contribute to software development tasks and support code comprehension. Our research aims to answer this question.
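As an illustration of the kind of task studied here, the minimal sketch below asks a language model to explain a short code snippet. It is not our experimental setup; it assumes the `openai` Python package (v1.x) and an OpenAI-compatible chat endpoint, and the model name, prompt wording, and snippet are illustrative placeholders.

```python
# Minimal sketch of a code-comprehension query to a large language model.
# Assumes the `openai` package (v1.x) and an API key in OPENAI_API_KEY;
# model name and prompt are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

snippet = """
def mystery(xs):
    seen = set()
    return [x for x in xs if not (x in seen or seen.add(x))]
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "user",
         "content": f"Explain in one sentence what this Python function does:\n{snippet}"},
    ],
)
print(response.choices[0].message.content)
```

Whether explanations like this actually help developers understand unfamiliar code is exactly the kind of question our research examines.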

Code Decompilation

The goal of code decompilation is to recover source code from a compiled binary. Deep learning offers advantages over traditional methods: it produces more idiomatic code and is less dependent on the specific compiler. Our system is based entirely on deep learning: even literals and constants are identified without heuristics, unlike in other deep-learning-based systems. Our long-term objective is to support every element of the C language.
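The toy pair below illustrates how the decompilation task can be framed for a neural model: compiler-produced assembly as input, idiomatic C source as the target. It is a hand-written, purely illustrative example, not output of our system, and the exact assembly depends on the compiler and its flags.

```python
# Illustrative toy input/output pair for neural decompilation,
# expressed as Python strings. Hand-written, not produced by our system.

# Input: x86-64 assembly a compiler might emit for a small function
# (abridged; the exact listing depends on the compiler and optimization level).
asm_input = """
add_one:
    lea eax, [rdi+1]
    ret
"""

# Target: idiomatic C to be reconstructed. Note that the literal 1 must be
# recovered as well; in a fully learned system this is done by the model
# itself rather than by hand-written heuristics.
c_target = """
int add_one(int x) {
    return x + 1;
}
"""

training_pair = {"source": asm_input.strip(), "target": c_target.strip()}
print(training_pair["source"])
print(training_pair["target"])
```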

Program Transformations

Deep learning is also highly effective for transforming less idiomatic, poorly optimized, or faulty code into improved versions. Our focus is primarily on making Python programs more idiomatic and better optimized, and on automatic error correction.
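The hand-written before/after pair below shows the kind of source-to-source rewrite targeted here: the same behaviour, expressed first less and then more idiomatically. It is an illustrative example rather than model output.

```python
# Illustrative before/after pair for idiomatic rewriting (hand-written, not model output).

# Less idiomatic: index-based loop with manual accumulation.
def even_squares_verbose(values):
    result = []
    for i in range(len(values)):
        if values[i] % 2 == 0:
            result.append(values[i] * values[i])
    return result

# More idiomatic equivalent: direct iteration in a list comprehension.
def even_squares_idiomatic(values):
    return [v * v for v in values if v % 2 == 0]

# Both versions compute the same result.
assert even_squares_verbose([1, 2, 3, 4]) == even_squares_idiomatic([1, 2, 3, 4]) == [4, 16]
```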