I’m always looking for new ways to improve the quality of the software systems I’m responsible for. Recently I found an online article in the German magazine Object Spektrum with the promising title “Software failure prediction: Intelligent quality assurance through statistical methods”.
The two authors, Steffen Herbold and Jens Grabowski, describe a method to predict failures in software systems by analyzing commit messages in source code management (SCM) systems like Subversion or Git. The approach is quite simple: count the number of bug-fixing comments per file. The file with the most bug-fixing comments is assumed to contain the most bugs, so that file or class needs the most attention during the test phase.
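The counting idea can be sketched in a few lines of Java 8. This is only an illustration, not the code of my tool or of the article: the `Commit` class, the `isBugFix` rule (a message starting with "FIX") and all other names are assumptions made up for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BugFixCounter {

    /** Hypothetical commit model: a message plus the paths it touched. */
    static final class Commit {
        final String message;
        final List<String> files;

        Commit(String message, List<String> files) {
            this.message = message;
            this.files = files;
        }
    }

    /** Assumed convention: a bug-fix commit message starts with "FIX". */
    static boolean isBugFix(Commit commit) {
        return commit.message.startsWith("FIX");
    }

    /** Counts bug-fix commits per file and returns them with the highest count first. */
    static Map<String, Long> countFixesPerFile(List<Commit> commits) {
        Map<String, Long> counts = commits.stream()
                .filter(BugFixCounter::isBugFix)
                .flatMap(c -> c.files.stream())
                .collect(Collectors.groupingBy(f -> f, Collectors.counting()));

        // Copy into a LinkedHashMap so the descending order is preserved.
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        List<Commit> commits = Arrays.asList(
                new Commit("FIX: null check in parser", Arrays.asList("Parser.java")),
                new Commit("Add CSV export", Arrays.asList("Export.java")),
                new Commit("FIX: off-by-one", Arrays.asList("Parser.java", "Lexer.java")));

        countFixesPerFile(commits)
                .forEach((file, fixes) -> System.out.println(file + " " + fixes));
    }
}
```

The files at the top of the resulting map are the candidates for extra attention during testing.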
To improve the strategy, the two authors suggest weighting recent bug fixes more heavily than older ones.
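One possible way to express such a weighting is an exponential decay with a half-life. I don't know which formula the article uses, so the decay function, the half-life parameter and all names below are assumptions for illustration only.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

public class RecencyWeightedScore {

    /**
     * Weight of a single bug fix: 1.0 for a fix made right now, 0.5 for a fix
     * that is exactly one half-life old, 0.25 for two half-lives, and so on.
     * Exponential decay is an assumption, not the formula from the article.
     */
    static double weight(Instant fixTime, Instant now, Duration halfLife) {
        double ageInHalfLives =
                (double) Duration.between(fixTime, now).toMillis() / halfLife.toMillis();
        return Math.pow(0.5, ageInHalfLives);
    }

    /** Score of one file: the sum of the weights of all its bug fixes. */
    static double score(List<Instant> fixTimes, Instant now, Duration halfLife) {
        return fixTimes.stream()
                .mapToDouble(t -> weight(t, now, halfLife))
                .sum();
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Duration halfLife = Duration.ofDays(30); // hypothetical tuning parameter

        List<Instant> recentFixes = Arrays.asList(
                now.minus(Duration.ofDays(1)), now.minus(Duration.ofDays(3)));
        List<Instant> oldFixes = Arrays.asList(
                now.minus(Duration.ofDays(300)), now.minus(Duration.ofDays(400)),
                now.minus(Duration.ofDays(500)));

        // Two recent fixes outweigh three old ones with this half-life.
        System.out.println("recent: " + score(recentFixes, now, halfLife));
        System.out.println("old:    " + score(oldFixes, now, halfLife));
    }
}
```

With a plain count the file with three old fixes would rank first; with the decay the file with two recent fixes does.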
Since I was looking for a new pet project to learn some of the new features of Java 8 and to work with Git again, I started to implement a tool to analyze SCM comments.
The first version will only support Subversion as SCM, but it already supports filtering on different data. Filters are currently available for:
- Time range (e.g. only comments from the last four weeks). Currently only absolute date-times are supported.
- Minimum number of comments
- File name (e.g. only files ending with *.java and *.c are analyzed)
- Comment content (e.g. only comments starting with FIX are analyzed)
The last three filters use Java's regular expressions and pattern matcher to do the filtering. With this solution I don’t have to implement my own filter interpreter.
By the end of the year I want to provide console-based input and output of the results.
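A minimal sketch of how such regex-based filters can be built on `java.util.regex.Pattern` and Java 8 predicates. The class name, the helper `matching` and the two example expressions are hypothetical and not taken from the tool itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class RegexFilters {

    /** Builds a predicate that accepts strings fully matching the given regular expression. */
    static Predicate<String> matching(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return s -> pattern.matcher(s).matches();
    }

    // Example expression: file names ending with .java or .c
    static final Predicate<String> SOURCE_FILE = matching(".*\\.(java|c)");

    // Example expression: comments starting with FIX; (?s) lets '.' span line breaks
    static final Predicate<String> FIX_COMMENT = matching("(?s)FIX.*");

    /** Keeps only the values accepted by the predicate. */
    static List<String> filter(List<String> values, Predicate<String> predicate) {
        return values.stream().filter(predicate).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("Main.java", "util.c", "README.md");
        List<String> comments = Arrays.asList("FIX: NPE in parser", "Add new feature");

        System.out.println(filter(files, SOURCE_FILE));
        System.out.println(filter(comments, FIX_COMMENT));
    }
}
```

Because the filter is just a user-supplied expression, a new filter criterion only needs a new pattern, not new parsing code.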
Open tasks for next year are:
- GUI based on JavaFX
- Support for Git
- Providing some base algorithms for aging of comments (value new comments higher than older ones)
- … (more to come)
The current state of the development is available on GitHub.
If you have any other suggestions for my pet project, feel free to leave a comment.