Add script & data to reproduce RQ2-4 results
agb94 committed May 13, 2022
1 parent 59404dd commit 0ccff96
Showing 196 changed files with 495 additions and 30 deletions.
12 changes: 12 additions & 0 deletions Defects4J-generated-tests/README.md
@@ -95,6 +95,18 @@ docker rm fdg # remove the container
```

## Detailed Description
Although we do not provide the complete raw experiment results due to storage constraints, the precomputed fault localisation results are available in the directory [./output](./output). Each file path in that directory follows the pattern `<time_budget>/<test_suite_id>/<diagnosability_metric>.pkl`.
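
As a rough illustration of that layout, the sketch below splits such a path into its three components. The helper name `parse_output_path`, the `OutputKey` type, and the sample path are hypothetical; only the `<time_budget>/<test_suite_id>/<diagnosability_metric>.pkl` pattern comes from the description above.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class OutputKey(NamedTuple):
    time_budget: str
    test_suite_id: str
    metric: str


def parse_output_path(path: str) -> OutputKey:
    """Split '<time_budget>/<test_suite_id>/<metric>.pkl' into its parts.

    Only the last three path components are inspected, so any leading
    directory (e.g. './output') is ignored.
    """
    parts = PurePosixPath(path).parts
    if len(parts) < 3 or not parts[-1].endswith(".pkl"):
        raise ValueError(f"unexpected path layout: {path}")
    time_budget, test_suite_id, filename = parts[-3:]
    # Strip only the trailing '.pkl'; metric names may contain dots (e.g. 'FDG:0.5').
    return OutputKey(time_budget, test_suite_id, filename[: -len(".pkl")])


# Hypothetical sample path, for illustration only.
key = parse_output_path("output/180/TS_180_01/FDG:0.5.pkl")
print(key)
```

Slicing off the suffix by length, rather than splitting on the first dot, keeps metric names such as `FDG:0.5` intact.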

**Using the precomputed data, you can reproduce the RQ2-RQ4 results presented in our paper with the notebook [./plot-RQ2-RQ4.ipynb](./plot-RQ2-RQ4.ipynb).**

Note that the precomputed `.pkl` files were generated using [./summarize_FL_results.py](./summarize_FL_results.py). The script summarises the fault localisation results produced in **Step 3** into a pandas DataFrame. For example,
```shell
python summarize_FL_results.py newTS 60 FDG:0.5 --output output.pkl
```
this command summarises the fault localisation results in `./docker/results/localisation/newTS/` that were obtained with the diagnosability metric `FDG:0.5`, and saves the summary to `output.pkl`. The time budget argument (`60` here) is only recorded in the summary; it does not affect which files are read. You can try this command once you have finished **Step 3** or **Step 4**.
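
For orientation, a summary saved this way could be loaded with `pd.read_pickle` and aggregated. The sketch below uses a small hand-made DataFrame with the same column names as the script's output (the rows are made-up toy values, not experimental data) and counts, per query budget, how many rows have a rank within the top `n`. The helper `count_within_top_n` is hypothetical and is not necessarily the aggregation used in the paper or the notebook.

```python
import pandas as pd

# Column names follow summarize_FL_results.py; the rows are made-up toy values.
COLUMNS = ['Test Suite', 'Project', 'Version', 'Time Budget',
           'Query Budget', 'Noise Probability', 'Rank']
toy = pd.DataFrame(
    data=[
        ['newTS', 'Lang', '1', 60, 0, 0.0, 7],
        ['newTS', 'Lang', '1', 60, 1, 0.0, 2],
        ['newTS', 'Math', '5', 60, 0, 0.0, 1],
        ['newTS', 'Math', '5', 60, 1, 0.0, 1],
    ],
    columns=COLUMNS,
)
# With a real summary, one would instead use: toy = pd.read_pickle("output.pkl")


def count_within_top_n(df: pd.DataFrame, n: int) -> pd.Series:
    """Per query budget, count the rows whose rank is at most n."""
    return (df['Rank'] <= n).groupby(df['Query Budget']).sum()


top3 = count_within_top_n(toy, 3)
print(top3)
```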

---
If you want to replicate each experiment in our paper, please refer to the following detailed descriptions for each research question:

### RQ2: IFL Performance

Binary files not shown (189 files).
15 changes: 15 additions & 0 deletions Defects4J-generated-tests/output/README.md
@@ -0,0 +1,15 @@
The test suites were generated using EvoSuite with the following seeds.

| Budget (s) | Test Suite ID | Seed | Budget (s) | Test Suite ID | Seed |
|------------:|--------------:|-----:|------------:|--------------:|-----:|
| 180 | TS_180_01 | 0 | 600 | TS_600_01 | 1004 |
| | TS_180_02 | 1004 | | TS_600_02 | 4001 |
| | TS_180_03 | 4001 | | TS_600_03 | 7777 |
| | TS_180_04 | 1001 | | TS_600_04 | 0 |
| | TS_180_05 | 3567 | | TS_600_05 | 1001 |
| | TS_180_06 | 1111 | | TS_600_06 | 1111 |
| | TS_180_07 | 10 | | TS_600_07 | 10 |
| | TS_180_08 | 1234 | | TS_600_08 | 1234 |
| | TS_180_09 | 4321 | | TS_600_09 | 4321 |
| | TS_180_10 | 7777 | | TS_600_10 | 3567 |
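
The table can also be transcribed into a lookup for scripting. The sketch below copies the seeds from the table (the names `SEEDS` and `seed_for` are just for illustration) and checks that both time budgets draw on the same ten seed values, assigned to the suite IDs in a different order.

```python
# Seeds transcribed from the table above, keyed by budget (s); list position = suite index - 1.
SEEDS = {
    180: [0, 1004, 4001, 1001, 3567, 1111, 10, 1234, 4321, 7777],
    600: [1004, 4001, 7777, 0, 1001, 1111, 10, 1234, 4321, 3567],
}


def seed_for(test_suite_id: str) -> int:
    """Look up the seed for an ID such as 'TS_180_01'."""
    _, budget, index = test_suite_id.split("_")
    return SEEDS[int(budget)][int(index) - 1]


print(seed_for("TS_600_03"))
# Both budgets use the same set of seeds, in a different order.
print(sorted(SEEDS[180]) == sorted(SEEDS[600]))
```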

365 changes: 365 additions & 0 deletions Defects4J-generated-tests/plot-RQ2-RQ4.ipynb

Large diffs are not rendered by default.

69 changes: 69 additions & 0 deletions Defects4J-generated-tests/summarize_FL_results.py
@@ -0,0 +1,69 @@


import os
import re
import argparse
import pandas as pd
from tqdm import tqdm

FL_DIR = "./docker/results/localisation/"
noise_probs = None  # undefined in the original script; assumption: None keeps every noise probability

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('test_suite_id', type=str)
    parser.add_argument('time_budget', type=int)
    parser.add_argument('metric', type=str)
    parser.add_argument('--max-query-budget', '-q', type=int, default=10)
    parser.add_argument('--output', '-o', type=str, default="./output.pkl")
    args = parser.parse_args()

    ts_id = args.test_suite_id
    time_budget = args.time_budget
    metric = args.metric
    query_budgets = list(range(args.max_query_budget + 1))
    result_dir = os.path.join(FL_DIR, ts_id)
    output_path = args.output

    rows = []
    for filename in tqdm(os.listdir(result_dir), colour="green"):
        if filename.startswith("."):  # skip hidden files
            continue

        if f"ranks-{metric}" not in filename:  # keep only files for the requested metric
            continue

        groups = re.search(r"(\w+)-(\d+)-ranks", filename)
        if not groups:
            continue
        project, version = groups.group(1), groups.group(2)

        groups = re.search(r"noise_(\d\.\d)\.pkl", filename)
        if groups:
            noise_prob = float(groups.group(1))
            if noise_probs is not None and noise_prob not in noise_probs:
                continue
        else:
            noise_prob = 0.0  # no noise suffix in the filename

        ranks = pd.read_pickle(
            os.path.join(result_dir, filename)
        )
        for query_budget in query_budgets:
            if f"rank-{query_budget}" in ranks.columns:
                buggy_ranks = ranks.loc[
                    ranks['is_buggy'] == True,  # rows flagged as buggy
                    f"rank-{query_budget}"].values
                for buggy_rank in buggy_ranks:
                    rows.append(
                        [ts_id, project, version, time_budget, query_budget, noise_prob, buggy_rank]
                    )
    df = pd.DataFrame(
        data=rows,
        columns=['Test Suite', 'Project', 'Version',
                 'Time Budget', 'Query Budget', 'Noise Probability', 'Rank']
    )

    print(df)
    df.to_pickle(output_path)
    print(f"Saved to {output_path}")
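
To make the filename handling concrete, the sketch below applies the same two regular expressions to sample names. The filenames and the helper `parse_result_filename` are hypothetical, chosen only to match the patterns in the script; the actual naming of the files under `./docker/results/localisation/` is not shown in this commit.

```python
import re


def parse_result_filename(filename: str):
    """Return (project, version, noise_prob), or None if the name does not match."""
    match = re.search(r"(\w+)-(\d+)-ranks", filename)
    if not match:
        return None
    project, version = match.group(1), match.group(2)

    # Files without a 'noise_<p>.pkl' suffix are treated as noise-free (0.0).
    noise = re.search(r"noise_(\d\.\d)\.pkl", filename)
    noise_prob = float(noise.group(1)) if noise else 0.0
    return project, version, noise_prob


# Hypothetical filenames, for illustration only.
print(parse_result_filename("Lang-1-ranks-FDG:0.5.pkl"))
print(parse_result_filename("Math-12-ranks-FDG:0.5-noise_0.3.pkl"))
print(parse_result_filename("notes.txt"))
```

Note that the version is kept as a string, as in the script, and that the noise pattern only accepts a single digit on each side of the decimal point.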
11 changes: 0 additions & 11 deletions Defects4J-human-written-tests/README.md
@@ -1,16 +1,5 @@
# Prioritisation Simulation (RQ1)

## Development Environment
- Python 3.9.1
- Installing dependencies
```shell
pip install -r requirements.txt
```
- (OS X) If `libshm.dylib` is not loaded, please install `libomp`.
```shell
brew install libomp
```

## Precomputed Simulation Results

The precomputed results are available in the directory `output/`.
42 changes: 23 additions & 19 deletions Defects4J-human-written-tests/plot-RQ1.ipynb
@@ -227,7 +227,26 @@
" mdf = df[df.iteration == iteration].copy()\n",
" mdf[f\"acc@{n}\"] = mdf[column] <= n\n",
" grouped = mdf.groupby('program').mean()[f\"acc@{n}\"]\n",
" return int(round(grouped.sum(), 0))"
" return int(round(grouped.sum(), 0))\n",
"\n",
"def get_acc_n_table(dfs, N=[1,3,5,10], num_iters=10):\n",
" rows = []\n",
" for metric in dfs:\n",
" frame = get_highest_ranks(dfs[metric])\n",
" for iteration in range(0, num_iters+1):\n",
" row = [metric, iteration]\n",
" for n in N:\n",
" value = accuracy_at(frame, iteration, n)\n",
" row.append(value)\n",
" rows.append(row)\n",
" row = [metric, \"Full\"]\n",
" for n in N:\n",
" row.append(accuracy_at(frame, 0, n, column=\"full_rank\"))\n",
" rows.append(row)\n",
"\n",
" df = pd.DataFrame(data=rows,\n",
" columns=[\"metric\", \"iteration\"] + [f\"acc@{n}\" for n in N])\n",
" return df"
]
},
{
@@ -426,27 +445,12 @@
}
],
"source": [
"upper_row= [\"EntBug\", \"Total\", \"DDU\", \"Add\", \"RAPTER\", \"TfD\"]\n",
"lower_row = [\"FLINT\", \"Prox\", \"S3\", \"Split\", \"Cover\", \"FDG\"]\n",
"N = [1,3,5,10]\n",
"num_iters = 10\n",
"df = get_acc_n_table(dfs, N=N, num_iters=num_iters)\n",
"\n",
"rows = []\n",
"for metric in upper_row + lower_row:\n",
" frame = get_highest_ranks(dfs[metric])\n",
" for iteration in range(0, num_iters+1):\n",
" row = [metric, iteration]\n",
" for n in N:\n",
" value = accuracy_at(frame, iteration, n)\n",
" row.append(value)\n",
" rows.append(row)\n",
" row = [metric, \"Full\"]\n",
" for n in N:\n",
" row.append(accuracy_at(frame, 0, n, column=\"full_rank\"))\n",
" rows.append(row)\n",
"\n",
"df = pd.DataFrame(data=rows,\n",
" columns=[\"metric\", \"iteration\"] + [f\"acc@{n}\" for n in N])\n",
"upper_row= [\"EntBug\", \"Total\", \"DDU\", \"Add\", \"RAPTER\", \"TfD\"]\n",
"lower_row = [\"FLINT\", \"Prox\", \"S3\", \"Split\", \"Cover\", \"FDG\"]\n",
"\n",
"markdown = \"\"\n",
"markdown += f\"# Table 2\\n\"\n",
11 changes: 11 additions & 0 deletions README.md
@@ -5,6 +5,17 @@
This is an artifact accompanying the paper **FDG: A Precise Measurement of
Fault Diagnosability Gain of Test Cases** (ISSTA 2022).

## Development Environment
- Python 3.9.1
- Installing dependencies
```shell
pip install -r requirements.txt
```
- (OS X) If `libshm.dylib` is not loaded, please install `libomp`.
```shell
brew install libomp
```

### Package structure
```bash
├── Defects4J-human-written-tests/ # RQ1
