Reproduction package for the paper "Hunting bugs: Towards an automated approach to identifying which change caused a bug through regression testing", submitted to the EMSE Journal (still under review).
This repository contains a tool that, given a commit that fixes a bug and a test that reveals that bug, finds the commit that introduced the bug. It also includes a collection of Jupyter Notebooks for analyzing the tool's results in detail.
This package contains:
.
├── analysis # Jupyter Notebooks for data analysis
├── configFiles # Config files for each project
├── dataset # The dataset generated by our tool (BIC-RT)
├── dockerfiles # Docker files for all necessary images to perform the experiment
├── projects # Subjects of the experiment (git repositories)
├── py # Python scripts to perform the experiment
├── results # Results generated by the experiment
├── scripts # Bash scripts to run the experiment easily
├── tmp # Folder for temporary files
└── README.md
- SetUp
- Reproducing the experiment of the paper
- How to use BugHunter
- Git >= 2.25
- Docker >= 20.10 (used: build f0df350)
The tool requires the following Docker images:
- Defects4J image
docker build -f dockerfiles/defects4j/defects4j.Dockerfile -t defects4j:2.1.1 .
- RegTestExecutor image
docker build -f dockerfiles/regression-seeker.Dockerfile -t regression-seeker:0.2.4 .
If, in later steps, the container created from this image does not have permission to access Docker, rebuild the image passing the GID of the Docker socket as a build argument:
DOCKER_GID=$(stat -c '%g' /var/run/docker.sock)
docker build --build-arg DOCKER_GID=$DOCKER_GID -f dockerfiles/regression-seeker.Dockerfile -t regression-seeker:0.2.4 .
- Analysis image
docker build -f dockerfiles/analysis.Dockerfile -t regression-seeker-analysis:0.1.1 .
The experiment was carried out in 3 phases/steps:
- Extract bug information from the Defects4J dataset.
- Execute the regression test on past commits (per bug).
- Analyze the results.
To carry out the experiment, we will use the Defects4J dataset, which has a simple API to obtain the information of the bugs. All projects in this dataset have been selected except Chart.
For each bug, this step generates a configuration file in JSON format with all the bug information (as shown below) and stores it in the configFiles/ folder.
{
"id": "1",
"project": "Closure",
"git_url": "D4J",
"docker_image": "defects4j:2.1.1",
"bug_report": "https://storage.googleapis.com/google-code-archive/v2/code.google.com/closure-compiler/issues/issue-253.json",
"fix_commit": "1dfad5043a207e032a78ef50c3cba50488bcd300",
"build": "ant -Dd4j.project.id=Closure compile",
"build_test": "ant -Dd4j.project.id=Closure compile.tests",
"test_command": "ant -Dd4j.project.id=Closure -Dtest.entry.class=com.google.javascript.jscomp.CommandLineRunnerTest -Dtest.entry.method=testSimpleModeLeavesUnusedParams run.dev.tests",
"folder": "test/com/google/javascript/jscomp/",
"file": "CommandLineRunnerTest.java",
"test_report": "target/surefire-reports/TEST-com.google.javascript.jscomp.CommandLineRunnerTest.xml"
}
To generate these files automatically, a file called project-config.json was written manually for each project; it provides the structure for the generated configuration files.
To extract the information of each bug, we use the following command:
$ ./scripts/runExtractBugsD4J.sh <project_name>
From the configuration files generated in the previous step, we run the experiment. This experiment consists of:
- Read the bug configuration file
- Clone the repository (in the case of Defects4J, this is done through its tool).
- Execute the test that reveals the bug in the bug fixing commit (BFC) and check that the test succeeds.
- Execute the test that reveals the bug in the commit before the BFC and check that it fails
- Execute the test that reveals the bug in all previous commits
In the steps that execute regression tests, the tool follows this procedure:
- Checkout the corresponding commit
- Transplant (copy) the regression test
- Compile the source code
- Compile the regression test
- Execute the regression test.
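The per-commit procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the function name `run_at_commit`, the step names, and the `test_backup` parameter (a copy of the regression test saved from the fix commit) are hypothetical, and the real tool runs the build and test commands inside a Docker container rather than directly on the host. The `build`, `build_test`, `test_command`, `folder` and `file` values come from the bug configuration file.

```python
import os
import shlex
import subprocess
from typing import Callable, Dict, List, Tuple


def run_shell(command: str, cwd: str) -> bool:
    """Run a shell command in cwd and report whether it exited with status 0."""
    return subprocess.run(command, shell=True, cwd=cwd).returncode == 0


def run_at_commit(
    commit: str,
    config: Dict[str, str],
    repo_dir: str,
    test_backup: str,
    run: Callable[[str, str], bool] = run_shell,
) -> Dict[str, bool]:
    """Apply the five steps in order and stop at the first one that fails.

    test_backup is assumed to be an absolute path to the regression test
    file as it exists in the fix commit (hypothetical parameter).
    """
    test_path = os.path.join(config["folder"], config["file"])
    steps: List[Tuple[str, str]] = [
        ("checkout", f"git checkout --force {shlex.quote(commit)}"),
        (
            "transplant",
            f"mkdir -p {shlex.quote(config['folder'])} && "
            f"cp {shlex.quote(test_backup)} {shlex.quote(test_path)}",
        ),
        ("build", config["build"]),
        ("build_test", config["build_test"]),
        ("test", config["test_command"]),
    ]
    outcome: Dict[str, bool] = {}
    for name, command in steps:
        outcome[name] = run(command, repo_dir)
        if not outcome[name]:
            break  # later steps cannot be trusted once an earlier one fails
    return outcome
```

Passing the command runner as a parameter keeps the sketch easy to try without a real repository: a fake `run` function can simulate, for example, a commit where the test code no longer compiles.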
For each bug, the following information is obtained:
- The result of compiling the source code, compiling the test code and running the regression test in JSON format.
- The logs of each of these three phases
- The test report generated
The results of this step (which we will call raw results) can be found in Zenodo (https://zenodo.org/record/8274835) as <project>-raw-results.tar.gz (~16GB), where <project> is the name of the Defects4J project.
They should be unzipped inside the results/ directory, so that the results are placed in results/<project>.
To run the experiment on a bug, we use the following command:
$ ./scripts/runExperiment.sh <project> <bug_id>
This step requires a running Docker container based on the analysis image built during setup (regression-seeker-analysis):
$ ./scripts/runNotebook.sh
The analysis of the results is easily visualized through a Jupyter Notebook.
This notebook:
- Performs a bug-introducing change (BIC) search.
  - This analysis can also be run directly from the terminal using ./scripts/runAnalysis.sh <project> <bug>
- Analyzes and displays results that answer RQ1A: How far can a test be transplanted into the past?
- Analyzes and displays results that answer RQ1B: How compilability and runnability problems impact the transplantation of the regression tests to the past?
- Analyzes and displays results that answer RQ2: Can the BIC for a given bug be found using its regression test?
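To give an intuition for the BIC search, here is a simplified sketch. It is an assumption about how such a search can work, not the notebook's actual code: the function name `find_bic_candidates` and the three outcome labels are hypothetical. The idea is to walk back from the commit just before the BFC; while the test fails, the bug is still present; the first commit where the test passes is bug-free, so the bug must have been introduced between that commit and the oldest failing one. Commits where the test could not be built or run are kept as additional candidates.

```python
from typing import List, Optional, Tuple


def find_bic_candidates(outcomes: List[Tuple[str, str]]) -> Optional[List[str]]:
    """Return the BIC candidates, or None if no regression is detected.

    outcomes holds (commit hash, outcome) pairs ordered from the commit
    just before the BFC back into the past. Each outcome is one of
    "fail", "pass" or "unknown" (test could not be built or executed).
    """
    candidates: List[str] = []
    seen_fail = False
    for commit, outcome in outcomes:
        if outcome == "fail":
            # The bug is still present here, so any newer candidates
            # are discarded and the window restarts at this commit.
            candidates = [commit]
            seen_fail = True
        elif outcome == "unknown":
            # Cannot tell whether the bug is present: keep as candidate.
            candidates.append(commit)
        elif outcome == "pass":
            # First bug-free commit: the window collected so far holds the BIC.
            return candidates if seen_fail else None
    return None  # the test never passes in the past: no regression found
```

With a single candidate the result is unambiguous; several candidates mean that the test could not be executed on some commits between the oldest failing commit and the first passing one.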
When running all notebook cells, processed results are generated in the analysis/results/ folder.
The results of this step (which we will call processed results) can be found in Zenodo (https://zenodo.org/record/8274835) as processed-results.tar.gz (350MB).
From the results obtained in the previous step (processed results), a BIC dataset is generated through a Jupyter Notebook.
The result of executing all the cells of this notebook is the CSV file dataset/BIC-RT.csv, included in the Git repository.
In this step, we use the dataset generated in the previous step to evaluate the performance of different derivations of the SZZ algorithm.
All SZZ derivations are part of this repository and are located in py/szz/. A suite of adapters has been written to facilitate the use of these algorithms from Python code:
- OpenSZZ.py
- PySZZ.py
- SZZUnleashed.py
These algorithms will use the project's git repository and the configuration file generated in Step 1. The latter file must be adapted to include additional information required by these algorithms (date the fix was created, date the issue was opened and date the issue was closed). To adapt this configuration file, the following command is provided:
$ ./scripts/adaptAllIssues.sh
The execution of the SZZ algorithms on the detected regressions is automated through the scripts located in scripts/szz/.
$ scripts/szz/run<SZZ_Algorithm>.sh
The results of the execution of these algorithms are part of the raw results mentioned above and are available at Zenodo (https://zenodo.org/record/8274835) as szz-raw-results.tar.gz.
They should be unzipped inside the results/ directory, so that the results are placed in results/szz/.
To visualize the results of these SZZ derivations, we again use a Jupyter Notebook.
To validate our dataset, we compared our results with those of a popular BIC benchmark, InduceBenchmark, which includes BICs for the Defects4J dataset.
The analysis of the common BICs (whether the identification matches or not) together with how the different algorithms behave (the derivations of the SZZ and our proposal) can be found in a Jupyter Notebook.
To evaluate the reasons why a regression test cannot be transplanted, we evaluated the logs at the first commit (starting from the BFC) where the test can no longer be built. The results of this analysis can be found in a Jupyter Notebook.
This section shows how to use our tool for other projects.
Background:
As an example, we will use a SpringBootSamples application that contains a Divider feature, which divides two numbers provided in an input string. This feature was recently fixed because it did not allow dividing by 0, even though this had previously been supported.
All the relevant information must be provided to our tool in a configuration file in JSON format.
- id: The identifier of the bug within the project.
- project: Name of our project (avoid using spaces or special characters).
- git_url: URL from which the project will be downloaded. The tool assumes that it is a Git repository. If the project does not have a public URL, you can manually place it in results/<project>_Bug_<id>, as the tool will then assume that the project is already downloaded and skip the cloning phase.
- docker_image: Name of the Docker image that will be used to build the project and run the tests. It is recommended to use the official images that contain the necessary tools for these steps.
- bug_report: Link to the bug report for this bug. NOT used by the tool.
- fix_commit: The complete hash of the commit where the bug has been fixed and which includes the regression test.
- build: Source code build command.
- build_test: Test code build command. It will be executed after the previous command.
- test_command: Regression test execution command.
- folder and file: Location and name of the test file to be transplanted to the past (copied from the fix commit).
- test_report: Location of the test result. The tool does NOT use this file, but it can save it automatically for each commit, so that further analysis can be performed.
- fixes: (OPTIONAL) A file or command that will be executed with bash inside the container BEFORE the build command is executed. It can be used to modify, add or delete files to ensure the correct transplant of the test.
Here is an example configuration file for finding this bug:
{
"id": "1",
"project": "SpringBootSamples",
"git_url": "https://github.com/Maes95/SpringBootSamples.git",
"docker_image": "maven:3-jdk-8-slim",
"bug_report": "-",
"fix_commit": "add8221fb5314265ce7d7a8a4002078a498511a3",
"build": "mvn clean compile",
"build_test": "mvn test-compile",
"test_command": "mvn -Dtest=DividerTest#divideBy0 test",
"folder": "src/test/java/samples/websocket/tomcat/divider/",
"file": "DividerTest.java",
"test_report": "target/surefire-reports/TEST-samples.websocket.tomcat.divider.DividerTest.xml",
"fixes": "echo \"No fixes!\""
}
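Before launching an experiment, it can help to sanity-check a hand-written configuration file. The following sketch is not part of the tool: the function name `check_config` is hypothetical, and which keys the tool strictly requires is an assumption based on the field descriptions above (every field except the optional fixes). It also assumes SHA-1 commit hashes of 40 hexadecimal characters, as in the examples of this README.

```python
import re
from typing import Dict, List

# Assumption: every documented field except the optional "fixes" is expected.
EXPECTED_KEYS = [
    "id", "project", "git_url", "docker_image", "bug_report", "fix_commit",
    "build", "build_test", "test_command", "folder", "file", "test_report",
]


def check_config(config: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems (empty if none are found)."""
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if key not in config]
    # The README asks for the complete hash of the fix commit.
    if not re.fullmatch(r"[0-9a-f]{40}", config.get("fix_commit", "")):
        problems.append("fix_commit should be a complete 40-character hash")
    # The README advises against spaces or special characters in the name.
    if re.search(r"[^A-Za-z0-9_.-]", config.get("project", "")):
        problems.append("project should avoid spaces and special characters")
    return problems


example = {
    "id": "1",
    "project": "SpringBootSamples",
    "git_url": "https://github.com/Maes95/SpringBootSamples.git",
    "docker_image": "maven:3-jdk-8-slim",
    "bug_report": "-",
    "fix_commit": "add8221fb5314265ce7d7a8a4002078a498511a3",
    "build": "mvn clean compile",
    "build_test": "mvn test-compile",
    "test_command": "mvn -Dtest=DividerTest#divideBy0 test",
    "folder": "src/test/java/samples/websocket/tomcat/divider/",
    "file": "DividerTest.java",
    "test_report": "target/surefire-reports/TEST-samples.websocket.tomcat.divider.DividerTest.xml",
}
print(check_config(example))
```

In practice the dictionary would be loaded with `json.load` from the file stored in configFiles/.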
Once the configuration file is ready, the experiment can be run automatically to obtain the test result in each of the project commits, using the following command:
./scripts/runExperiment.sh <project> <id>
In our case:
./scripts/runExperiment.sh SpringBootSamples 1
This command will launch a Docker container from the regression-seeker image (named RS--Bug-) that will carry out the experiment and use an auxiliary container with the image indicated in the configuration.
The results of each commit can be viewed while the experiment runs in results/<project>/Bug_<id>/commits/, which will include the logs of each step, the test result and a results.json file with the summary of each phase.
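To inspect these per-commit summaries programmatically, a small helper such as the one below can collect every results.json file under the commits/ folder. This is only a sketch: the function name `load_commit_summaries` is hypothetical, the exact sub-folder layout inside commits/ is not assumed (the search is recursive), and no assumption is made about the keys stored in each results.json.

```python
import json
from pathlib import Path
from typing import Dict


def load_commit_summaries(commits_dir: str) -> Dict[str, dict]:
    """Map each folder that contains a results.json to its parsed content.

    Keys are folder paths relative to commits_dir.
    """
    summaries: Dict[str, dict] = {}
    for path in sorted(Path(commits_dir).rglob("results.json")):
        with path.open(encoding="utf-8") as handle:
            summaries[str(path.parent.relative_to(commits_dir))] = json.load(handle)
    return summaries
```

For the example in this section, it could be called as `load_commit_summaries("results/SpringBootSamples/Bug_1/commits")`.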
Once the experiment is finished and we have the results, we will use the following command to search for the commit that introduced the regression in the code:
./scripts/runAnalysis.sh <project> <id>
In our case:
./scripts/runAnalysis.sh SpringBootSamples 1
This will generate an analysis results folder in analysis/results/<project>/Bug_<id> containing a JSON file with the analysis result. For our example, it is as follows:
{
"id": "SpringBootSamples_Bug_1",
"bug": "Bug_1",
"project": "SpringBootSamples",
"fix_pass": true,
"prev_fails": true,
"category": "A regression is detected",
"sub_category": "Unique candidates",
"test_name": "DividerTest#divideBy0",
"bug_report": "-",
"fix_commit": "add8221fb5314265ce7d7a8a4002078a498511a3",
"BIC_candidates": [
[
2,
"eb79de443301486e24618623d2158deb1329c0cc"
]
],
"bic_position": 2,
"bic_age": 0,
"executionsOnPast": 4,
"buildFail": 4,
"buildTestFail": 1,
"numCommits": 9,
"totalDays": 0,
"transplantability_days": 0,
"transplantability_position": 3
}
For this example, the commit with the hash "eb79de443301486e24618623d2158deb1329c0cc" was found to be the one that introduced the bug, located 2 commits behind the fix commit.
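The analysis result can also be read from a script. The sketch below turns such a JSON object into a one-line summary. The function name `summarize` is hypothetical, and it assumes that each entry of BIC_candidates is a pair of (position relative to the fix commit, commit hash), which is consistent with the example above but not otherwise documented here.

```python
import json


def summarize(result: dict) -> str:
    """Describe the BIC candidates found in an analysis result."""
    if not (result.get("fix_pass") and result.get("prev_fails")):
        return f"{result['id']}: test does not pass at the fix and fail just before it"
    candidates = result.get("BIC_candidates", [])
    if not candidates:
        return f"{result['id']}: no BIC candidate"
    # Assumption: each candidate is [position, commit hash].
    parts = [
        f"{commit[:10]} ({position} commits before the fix)"
        for position, commit in candidates
    ]
    return f"{result['id']}: " + ", ".join(parts)


# A trimmed copy of the example result shown above.
example_result = json.loads("""
{
  "id": "SpringBootSamples_Bug_1",
  "fix_pass": true,
  "prev_fails": true,
  "BIC_candidates": [[2, "eb79de443301486e24618623d2158deb1329c0cc"]]
}
""")
print(summarize(example_result))
# prints: SpringBootSamples_Bug_1: eb79de4433 (2 commits before the fix)
```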