With as much as 60-90% of software life cycle resources spent on program maintenance, automated tools that help developers explore and understand today's complex software are critical to expediting maintenance. One important source of information that maintenance tools can draw on is the lexical information in comments and identifiers. Identifier names often communicate a programmer's intent and help developers map real-world concepts to code during comprehension. This research focuses on two questions:
- How existing information retrieval (IR) and natural language processing (NLP) techniques can be leveraged to create more effective software tools, and
- How these techniques can be specialized for the domain of software.
The goal of the NLPA project is twofold. The first aim is to develop specialized information retrieval techniques and natural language analyses for software, so that software maintenance tools can take full advantage of the wealth of information in program identifiers. The second is to integrate these techniques into software tools that expedite the maintenance activities of program exploration, concern location, and fault localization.
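
To make the idea of mining identifiers concrete, the following is a minimal sketch of one common preprocessing step: splitting identifiers into their component words and ranking them against a natural language query by word overlap. It is an illustration only and is not taken from the NLPA project; the function names `split_identifier` and `rank_by_query`, the splitting rules, and the overlap-based scoring are all assumptions chosen for brevity.

```python
import re


def split_identifier(name):
    """Split an identifier into lowercase words (illustrative rules only).

    Splits on underscores and other non-word characters, then on
    camelCase boundaries, keeping acronym runs such as "XML" together.
    """
    words = []
    for part in re.split(r"[_\W]+", name):
        # Acronym run | optional capital followed by lowercase letters | digits
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [w.lower() for w in words]


def rank_by_query(query, identifiers):
    """Rank identifiers by how many query words they contain.

    Hypothetical scoring: a plain count of shared words. Identifiers
    sharing no word with the query are dropped; ties are broken
    alphabetically so the result is deterministic.
    """
    query_words = set(query.lower().split())
    scored = []
    for ident in identifiers:
        overlap = len(query_words & set(split_identifier(ident)))
        if overlap:
            scored.append((overlap, ident))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [ident for _, ident in scored]


if __name__ == "__main__":
    names = ["saveUserFile", "loadUserProfile", "parseXMLFile", "render_frame"]
    print(split_identifier("parseXMLFile"))
    print(rank_by_query("save user file", names))
```

A realistic tool would likely need more than this, for example handling abbreviations, same-case concatenations such as `username`, and weighting schemes from information retrieval, which is the kind of specialization for software that the second research question points to.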