
I am broadly interested in the areas of Software Engineering, Programming Languages, and Data Mining. I have devoted substantial effort to code mining, analysis, and comprehension, aiming to provide practical techniques and tools for enhancing software reliability, increasing development productivity, reducing maintenance cost, and improving user experience.

Context-Aware Software Mining and Analysis

A general theme of my work is mining and analysis for software engineering, such as detection of code clones, code query processing, detection of bugs, search for bug fixes, and search for better testing & debugging techniques.

The search is being carried out on various contextual data sources in addition to program code itself, such as code change histories, program bug databases, test suites, developer activities, user feedback, and socio-technical information pertaining to the complex interactions between people and technologies in both software development processes and real-world usage scenarios.

To extract information from these data sources and to search and analyze it efficiently, various technologies are being employed, such as static & dynamic program analysis, software engineering methodologies, data mining, information retrieval, natural language processing, and distributed computing techniques.

Publications

Scalable Code Clone Detection

Our studies and others' have found that on average more than 20% of code in large programs is cloned code, which often leads to higher maintenance cost and subtle software defects. The goal of our research is to scalably and accurately detect various code clones, track their evolution and migration among large programs, and manage them properly to facilitate program understanding and reengineering. Many applications, such as code refactoring, bug detection, and plagiarism detection, can stem from code clone detection and analysis.

  • DECKARD: A Code Clone and Clone-Related Bug Detection Tool
  • Active Refinement of Clone Anomaly Reports, by Lucia, David LO, Lingxiao JIANG, and Aditya BUDI. Accepted for the 34th International Conference on Software Engineering (ICSE '12), Zurich, Switzerland, 2012. [To appear]
  • Automatic Mining of Functionally Equivalent Code Fragments via Random Testing, by Lingxiao JIANG and Zhendong SU. In the proceedings of the 18th International Symposium on Software Testing and Analysis (ISSTA '09), Chicago, Illinois, USA, 2009. [PDF from ACM DL Author-ize service, on ACM DL, slides.pdf]
  • Scalable Detection of Semantic Clones, by Mark GABEL, Lingxiao JIANG, and Zhendong SU. In the proceedings of the 30th International Conference on Software Engineering (ICSE '08), Leipzig, Germany, 2008. [PDF from ACM DL Author-ize service, on ACM DL, slides.pdf]
  • Context-Based Detection of Clone-Related Bugs, by Lingxiao JIANG, Zhendong SU, and Edwin CHIU. In the proceedings of the 6th joint meeting of the 11th European Software Engineering Conference and the 15th ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE '07), Dubrovnik, Croatia, 2007. [PDF from ACM DL Author-ize service, on ACM DL, slides.pdf]
  • DECKARD: Scalable and Accurate Tree-based Detection of Code Clones, by Lingxiao JIANG, Ghassan MISHERGHI, Zhendong SU, and Stephane GLONDU. In the proceedings of the 29th International Conference on Software Engineering (ICSE '07), Minneapolis, Minnesota, USA, 2007. [pdf, ps, slides.pdf, on IEEE Xplore and ACM DL]

Code Queries

  • Code Search via Topic-Enriched Dependence Graph Matching, by Shaowei WANG, David LO, and Lingxiao JIANG. In the proceedings of the 18th Working Conference on Reverse Engineering (WCRE '11), Limerick, Ireland, 2011. [pdf, on IEEE Xplore]
  • Concern Localization Using Information Retrieval: An Empirical Study on Linux Kernel, by Shaowei WANG, David LO, Zhenchang XING, and Lingxiao JIANG. In the proceedings of the 18th Working Conference on Reverse Engineering (WCRE '11), Limerick, Ireland, 2011. [pdf, on IEEE Xplore]

Automated Testing

Automated Debugging

  • Search-Based Fault Localization, by Shaowei WANG, David LO, Lingxiao JIANG, Lucia, and Hoong Chuin LAU. In the proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE '11), Lawrence, Kansas, USA, 2011. [pdf, slides.pdf, on IEEE Xplore]
  • Comprehensive Evaluation of Association Measures for Fault Localization, by Lucia, David LO, Lingxiao JIANG, and Aditya BUDI. In the proceedings of the 26th IEEE International Conference on Software Maintenance (ICSM '10), Timisoara, Romania, 2010. [pdf, dataset, on IEEE Xplore]
  • Context-Aware Statistical Debugging: From Bug Predictors to Faulty Control Flow Paths, by Lingxiao JIANG and Zhendong SU. In the proceedings of the 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE '07), Atlanta, Georgia, USA, 2007. [PDF from ACM DL Author-ize service, on ACM DL, slides.pdf]

Optimization and Quality Assurance

  • Real-time Trip Information Service For A Large Taxi Fleet, by Rajesh Krishna BALAN, Khoa Xuan NGUYEN, and Lingxiao JIANG. In the proceedings of the 9th International Conference on Mobile Systems, Applications, and Services (MobiSys '11), Washington, DC, USA, 2011. [PDF from ACM DL Author-ize service, on ACM DL]
  • Static Validation of C Preprocessor Macros, by Andreas SAEBJOERNSEN, Lingxiao JIANG, Daniel QUINLAN, and Zhendong SU. In the proceedings of the 24th IEEE/ACM International Conference on Automated Software Engineering (ASE '09), Auckland, New Zealand, 2009. [pdf, on IEEE Xplore and ACM DL]
  • Osprey: A Practical Type System for Validating Dimensional Unit Correctness of C Programs, by Lingxiao JIANG and Zhendong SU. In the proceedings of the 28th International Conference on Software Engineering (ICSE '06), Shanghai, China, 2006. [PDF from ACM DL Author-ize service, on ACM DL, slides.pdf]