Our SOftware Analytics Research (SOAR) group primarily works at the intersection of software engineering, cybersecurity and data science, encompassing socio-technical aspects, and analysis of different kinds of software artefacts (e.g., code, execution traces, bug reports, Q&A posts, and developer networks) and the interplay among them. We are particularly interested in transforming passive software engineering data into automated tools that can improve system reliability, security, and performance, increase developer productivity, and generate new insights for decision makers. We are also interested in connecting with practitioners to distil insights and discover pain points that can help direct future research effort.

Our group loves to collaborate. Aside from colleagues in SMU, our group has collaborated with Microsoft Research (Redmond and India), Adobe (USA), SAP (Germany), University of Illinois at Urbana-Champaign (USA), CMU (USA), NUS, NTU, Zhejiang University (China), Peking University (China), Chinese University of Hong Kong (China), IIITD (India), Weizmann Institute of Science (Israel), Tel Aviv University (Israel), University of Milano-Bicocca (Italy), DIKU (Denmark), Inria (France), Monash University (Australia), Australian National University (Australia), Stellenbosch University (South Africa), and many more.

Our work has been published in top/major conferences and journals in the areas of software engineering (ICSE, FSE, ASE, ISSTA, ICSME, PLDI, TSE, TOSEM), artificial intelligence and data science (IJCAI, AAAI, KDD, VLDB, ICDE, ACL), and cybersecurity (ESORICS, TIFS).

The following describes a few areas of software analytics work that we have pursued:

 Bug and Vulnerability Management

We are interested in examining the entire process of how developers manage bugs and vulnerabilities, and how a data-driven approach can help. Our work in this area includes:

 Code and Documentation Management

Given a large code base, a large set of code repositories, or a large set of libraries, it is often hard to find code snippets, methods or libraries of interest, given a particular need. Additionally, due to the fast pace of software development, documentation is often unavailable or outdated. Our work has addressed these pain points in the following ways:

 Empirical Studies

We are also interested in bridging the gap between research and practice through empirical studies. Such studies are important to ensure that the technologies we design are relevant to practitioners, address their pain points, and are not evaluated in a biased way. Our work in this area includes: