Commit

cosmetics
lfoppiano committed Jan 10, 2025
1 parent a6bea43 commit 833291c
Showing 8 changed files with 488 additions and 548 deletions.
64 changes: 32 additions & 32 deletions doc/benchmarks/flavors/article_light/benchmaking-biorxiv.md
@@ -7,59 +7,59 @@ Evaluation on 1996 random PDF files out of 1998 PDF (ratio 1.0).

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 82.99 | 81.45 | 82.22 | 1995 |
| first_author | 96.32 | 94.63 | 95.47 | 1993 |
| title | 78.19 | 73.65 | 75.85 | 1996 |
| | | | | |
| **all fields (micro avg.)** | **85.94** | **83.24** | **84.57** | 5984 |
| all fields (macro avg.) | 85.84 | 83.24 | 84.51 | 5984 |
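
For reference, the f1 column and the macro-average row can be re-derived from the per-field precision and recall in the table above. The sketch below is a minimal check, not the evaluation code that produced these numbers. The helper names `f1_score` and `macro_average` are hypothetical. It assumes the macro f1 is the mean of the per-field f1 values. Because the inputs are already rounded to two decimals, results can differ from the table in the last digit.

```python
# Hypothetical helpers: re-derive f1 and macro averages from the rounded
# precision/recall values of the strict-matching table (biorxiv).

def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(values):
    """Unweighted mean over fields; every field counts equally."""
    return sum(values) / len(values)


# (precision, recall) per field, copied from the table above.
rows = {
    "authors": (82.99, 81.45),
    "first_author": (96.32, 94.63),
    "title": (78.19, 73.65),
}

per_field_f1 = {label: f1_score(p, r) for label, (p, r) in rows.items()}
macro_precision = macro_average([p for p, _ in rows.values()])
macro_recall = macro_average([r for _, r in rows.values()])
# Assumption: macro f1 is the mean of the per-field f1 values.
macro_f1 = macro_average(list(per_field_f1.values()))

for label, value in per_field_f1.items():
    print(f"{label}: f1 = {value:.2f}")
print(f"macro avg: {macro_precision:.2f} / {macro_recall:.2f} / {macro_f1:.2f}")
```

The micro average cannot be recomputed this way, since it needs the raw counts of expected, extracted and correct fields, which the table does not list.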



#### Soft Matching (ignoring punctuation, case and space characters mismatches)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 83.55 | 82.01 | 82.77 | 1995 |
| first_author | 96.63 | 94.93 | 95.77 | 1993 |
| title | 80.64 | 75.95 | 78.22 | 1996 |
| | | | | |
| **all fields (micro avg.)** | **87.03** | **84.29** | **85.64** | 5984 |
| all fields (macro avg.) | 86.94 | 84.3 | 85.59 | 5984 |
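
Soft matching, as described in the heading, ignores punctuation, case and space differences. One way to picture it is to normalize both strings before comparing them. This is a minimal sketch under that reading, not the benchmark's actual implementation. `soft_normalize` and `soft_match` are hypothetical names, and only ASCII punctuation is stripped here.

```python
import string

# Translation table that deletes ASCII punctuation and whitespace.
# Assumption: non-ASCII punctuation is left untouched in this sketch.
_DROP = str.maketrans("", "", string.punctuation + string.whitespace)


def soft_normalize(text):
    """Lowercase the text and drop punctuation and whitespace."""
    return text.lower().translate(_DROP)


def soft_match(expected, extracted):
    """True when the two strings are equal after normalization."""
    return soft_normalize(expected) == soft_normalize(extracted)


print(soft_match("Deep Learning: A Review", "deep learning a review."))
print(soft_match("Deep Learning", "Deep Earning"))
```

The first call differs only in case and punctuation, so it matches. The second differs by a letter, so it does not.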



#### Levenshtein Matching (Minimum Levenshtein distance at 0.8)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 91.57 | 89.87 | 90.72 | 1995 |
| first_author | 96.78 | 95.08 | 95.93 | 1993 |
| title | 92.13 | 86.77 | 89.37 | 1996 |
| | | | | |
| **all fields (micro avg.)** | **93.51** | **90.57** | **92.02** | 5984 |
| all fields (macro avg.) | 93.49 | 90.58 | 92 | 5984 |
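
The 0.8 threshold in the heading implies an edit distance normalized to the 0..1 range. The sketch below assumes a similarity of `1 - distance / max(len(a), len(b))`, which is one common normalization. The source does not state which normalization the benchmark uses, so treat the formula and the function names as assumptions.

```python
def levenshtein_distance(a, b):
    """Number of single-character insertions, deletions and substitutions
    needed to turn a into b (two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a, b):
    """Assumed normalization: 1.0 for identical strings, 0.0 for disjoint."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def levenshtein_match(expected, extracted, threshold=0.8):
    """True when the normalized similarity reaches the threshold."""
    return levenshtein_similarity(expected, extracted) >= threshold


print(levenshtein_distance("kitten", "sitting"))
print(levenshtein_match("benchmark", "benchmarks"))
```

Under this normalization a single extra character in a ten-character string gives a similarity of 0.9, which passes the 0.8 threshold.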



#### Ratcliff/Obershelp Matching (Minimum Ratcliff/Obershelp similarity at 0.95)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 87.59 | 85.96 | 86.77 | 1995 |
| first_author | 96.32 | 94.63 | 95.47 | 1993 |
| title | 88.35 | 83.22 | 85.71 | 1996 |
| | | | | |
| **all fields (micro avg.)** | **90.79** | **87.93** | **89.34** | 5984 |
| all fields (macro avg.) | 90.75 | 87.94 | 89.32 | 5984 |
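
A Ratcliff/Obershelp-style similarity can be approximated with Python's standard `difflib.SequenceMatcher`, whose `ratio()` returns twice the number of matched characters divided by the total length of both strings. The sketch below is an illustration only: `difflib` implements a closely related algorithm, so its scores are not guaranteed to equal those of the benchmark's evaluation code. The wrapper names are hypothetical.

```python
from difflib import SequenceMatcher


def ratcliff_obershelp_similarity(a, b):
    """Similarity in 0..1, computed as 2 * matches / (len(a) + len(b))."""
    # autojunk=False disables difflib's heuristic for long sequences,
    # so every character is considered.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def ratcliff_obershelp_match(expected, extracted, threshold=0.95):
    """True when the similarity reaches the threshold."""
    return ratcliff_obershelp_similarity(expected, extracted) >= threshold


title = "A benchmark of header extraction"
print(ratcliff_obershelp_match(title, title + "."))
print(ratcliff_obershelp_similarity("abcd", "bcde"))
```

With a 0.95 threshold, a long title with one trailing character added still matches, while short strings tolerate almost no difference.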


#### Instance-level results
64 changes: 32 additions & 32 deletions doc/benchmarks/flavors/article_light/benchmaking-elife.md
@@ -7,59 +7,59 @@ Evaluation on 957 random PDF files out of 982 PDF (ratio 1.0).

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 79.68 | 77.43 | 78.54 | 957 |
| first_author | 91.83 | 89.33 | 90.56 | 956 |
| title | 89.25 | 85.89 | 87.54 | 957 |
| | | | | |
| **all fields (micro avg.)** | **86.91** | **84.22** | **85.54** | 2870 |
| all fields (macro avg.) | 86.92 | 84.22 | 85.55 | 2870 |



#### Soft Matching (ignoring punctuation, case and space characters mismatches)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 80 | 77.74 | 78.86 | 957 |
| first_author | 91.83 | 89.33 | 90.56 | 956 |
| title | 96.42 | 92.79 | 94.57 | 957 |
| | | | | |
| **all fields (micro avg.)** | **89.39** | **86.62** | **87.98** | 2870 |
| all fields (macro avg.) | 89.41 | 86.62 | 88 | 2870 |



#### Levenshtein Matching (Minimum Levenshtein distance at 0.8)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 92.8 | 90.18 | 91.47 | 957 |
| first_author | 92.26 | 89.75 | 90.99 | 956 |
| title | 98.05 | 94.36 | 96.17 | 957 |
| | | | | |
| **all fields (micro avg.)** | **94.35** | **91.43** | **92.87** | 2870 |
| all fields (macro avg.) | 94.37 | 91.43 | 92.87 | 2870 |



#### Ratcliff/Obershelp Matching (Minimum Ratcliff/Obershelp similarity at 0.95)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 85.59 | 83.18 | 84.37 | 957 |
| first_author | 91.83 | 89.33 | 90.56 | 956 |
| title | 97.94 | 94.25 | 96.06 | 957 |
| | | | | |
| **all fields (micro avg.)** | **91.77** | **88.92** | **90.32** | 2870 |
| all fields (macro avg.) | 91.79 | 88.92 | 90.33 | 2870 |


#### Instance-level results
64 changes: 32 additions & 32 deletions doc/benchmarks/flavors/article_light/benchmaking-plos.md
@@ -7,59 +7,59 @@ Evaluation on 1000 random PDF files out of 998 PDF (ratio 1.0).

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|----------|---------|
| authors | 98.97 | 99.07 | 99.02 | 969 |
| first_author | 99.28 | 99.38 | 99.33 | 969 |
| title | 95.77 | 95.1 | 95.43 | 1000 |
| | | | | |
| **all fields (micro avg.)** | **97.99** | **97.82** | **97.9** | 2938 |
| all fields (macro avg.) | 98.01 | 97.85 | 97.93 | 2938 |



#### Soft Matching (ignoring punctuation, case and space characters mismatches)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|----------|---------|
| authors | 98.97 | 99.07 | 99.02 | 969 |
| first_author | 99.28 | 99.38 | 99.33 | 969 |
| title | 99.3 | 98.6 | 98.95 | 1000 |
| | | | | |
| **all fields (micro avg.)** | **99.18** | **99.01** | **99.1** | 2938 |
| all fields (macro avg.) | 99.18 | 99.02 | 99.1 | 2938 |



#### Levenshtein Matching (Minimum Levenshtein distance at 0.8)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|------------|---------|
| authors | 99.28 | 99.38 | 99.33 | 969 |
| first_author | 99.38 | 99.48 | 99.43 | 969 |
| title | 99.7 | 99 | 99.35 | 1000 |
| | | | | |
| **all fields (micro avg.)** | **99.45** | **99.29** | **99.37** | 2938 |
| all fields (macro avg.) | 99.45 | 99.29 | 99.37 | 2938 |



#### Ratcliff/Obershelp Matching (Minimum Ratcliff/Obershelp similarity at 0.95)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 99.18 | 99.28 | 99.23 | 969 |
| first_author | 99.28 | 99.38 | 99.33 | 969 |
| title | 99.5 | 98.8 | 99.15 | 1000 |
| | | | | |
| **all fields (micro avg.)** | **99.32** | **99.15** | **99.23** | 2938 |
| all fields (macro avg.) | 99.32 | 99.15 | 99.23 | 2938 |


#### Instance-level results
64 changes: 32 additions & 32 deletions doc/benchmarks/flavors/article_light/benchmaking-pmc.md
@@ -7,59 +7,59 @@ Evaluation on 1943 random PDF files out of 1941 PDF (ratio 1.0).

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 92.2 | 91.91 | 92.05 | 1941 |
| first_author | 96.28 | 95.98 | 96.13 | 1941 |
| title | 84.33 | 83.38 | 83.85 | 1943 |
| | | | | |
| **all fields (micro avg.)** | **90.95** | **90.42** | **90.69** | 5825 |
| all fields (macro avg.) | 90.94 | 90.42 | 90.68 | 5825 |



#### Soft Matching (ignoring punctuation, case and space characters mismatches)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 94.11 | 93.82 | 93.96 | 1941 |
| first_author | 96.64 | 96.34 | 96.49 | 1941 |
| title | 92.04 | 90.99 | 91.51 | 1943 |
| | | | | |
| **all fields (micro avg.)** | **94.27** | **93.72** | **93.99** | 5825 |
| all fields (macro avg.) | 94.26 | 93.72 | 93.99 | 5825 |



#### Levenshtein Matching (Minimum Levenshtein distance at 0.8)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 96.28 | 95.98 | 96.13 | 1941 |
| first_author | 96.95 | 96.65 | 96.8 | 1941 |
| title | 98.18 | 97.07 | 97.62 | 1943 |
| | | | | |
| **all fields (micro avg.)** | **97.13** | **96.57** | **96.85** | 5825 |
| all fields (macro avg.) | 97.14 | 96.57 | 96.85 | 5825 |



#### Ratcliff/Obershelp Matching (Minimum Ratcliff/Obershelp similarity at 0.95)

**Field-level results**

| label | precision | recall | f1 | support |
|-----------------------------|-----------|-----------|-----------|---------|
| authors | 95.3 | 95 | 95.15 | 1941 |
| first_author | 96.28 | 95.98 | 96.13 | 1941 |
| title | 96.2 | 95.11 | 95.65 | 1943 |
| | | | | |
| **all fields (micro avg.)** | **95.92** | **95.36** | **95.64** | 5825 |
| all fields (macro avg.) | 95.93 | 95.36 | 95.64 | 5825 |


#### Instance-level results
