Merge from upstream #13

Open. Wants to merge 3 commits into base: master
30 changes: 17 additions & 13 deletions README.md
@@ -1,7 +1,7 @@
# easy2acl.py

This short script is useful when peer review is done in EasyChair but the proceedings are to be produced with aclpub. You must retrieve some information from EasyChair before running the script.

easy2acl.py produces two files for use with aclpub: the `db` file, and an archive `final.tar.gz` containing a folder `final`, which in turn contains the PDF files of the accepted submissions. Familiarize yourself with the `db` file, which is described in the aclpub documentation.

Please report bugs and suggest improvements.
@@ -10,36 +10,36 @@ Please report bugs and suggest improvements.

The Python 3 packages PyPDF2 and unicode_tex are required and can be installed using pip. The `tar` command is also required and must be available on your `PATH`.

## How to run

Create the files `accepted` and `submissions` and the folder `pdf` as shown in [Getting data from EasyChair](#getting-data-from-easychair). Before running this script, your file structure should look like this:

    |-- easy2acl.py
    |-- submissions
    |-- accepted
    `-- pdf
        |-- ..._submission_1.pdf
        |-- ..._submission_2.pdf
        `-- ...

Run the script:

    $ python3 easy2acl.py

When the script has finished, you will find the files `db` and `final.tar.gz` in the same folder. Place these files in your `proceedings` folder as suggested by the aclpub documentation, and proceed as you usually would with aclpub.
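
As a convenience, the expected layout can be checked before running the script. The following is only a minimal sketch: the helper name `missing_inputs` is hypothetical and is not part of easy2acl.py. It assumes nothing beyond the file and folder names listed above.

```python
from pathlib import Path


def missing_inputs(root="."):
    """Return the names of the expected inputs that are absent under root.

    Hypothetical helper: only checks that the two files and the 'pdf'
    folder exist, not that their contents are well formed.
    """
    root = Path(root)
    missing = []
    for name in ("submissions", "accepted"):
        if not (root / name).is_file():
            missing.append(name)
    if not (root / "pdf").is_dir():
        missing.append("pdf")
    return missing


if __name__ == "__main__":
    problems = missing_inputs()
    if problems:
        print("Missing before running easy2acl.py:", ", ".join(problems))
    else:
        print("Layout looks complete.")
```

An empty list means the two files and the `pdf` folder are all present; it says nothing about whether the pasted tables are valid.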

## Additional information

It is your responsibility to make sure that the `db` file is correct. The author(s) of this script make no claims that this script works as intended. Below are some things to look out for regarding the data you get from EasyChair, and the assumptions made by the script.

* **Title of submission in EasyChair does not match title in the submitted PDF.** In case of a substantial change to the title, and depending on the policy of your conference, you might want to contact the Program Chair. You might want to do so anyway if the title is used anywhere else, for example in the conference program.

* **Order of authors of submission in EasyChair does not match the order in the submitted PDF.**

* **Order of author name internally, as in `<first> <last>`, in EasyChair is incorrect.** This can cause problems with the order of the papers since they are written to the `db` file in alphabetical order according to the first author's last name.

* **Author has multiple names before the last name, e.g. `<first> <middle> <last>`.** This can cause problems with the order of the papers since they are written in alphabetical order according to the first author's last name. The script assumes the format `<first> <last> [<last>] [<last>] ...`.

* **Some diacritics and special characters in names are not converted by the script.** If a character you expected to be translated into a LaTeX escape code was not, the unicode_tex package may not handle it. Also make sure that the name was entered properly in EasyChair; the person who entered it might have left out the diacritics. If you want to be nice, you can check the names in your resulting `db` file against the names in the actual submissions and make the appropriate changes to the `db` file.
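
The name-format assumption described in the list above can be illustrated with a short sketch. The function `split_author` is a hypothetical stand-in written for this example, not the script's actual code; it only mirrors the stated rule that the first token is the first name and everything after it is treated as the last name.

```python
def split_author(fullname):
    """Split a name as '<first> <last> [<last>] ...'.

    Hypothetical illustration: the first whitespace-separated token is
    taken as the first name, and all remaining tokens as the last name.
    """
    parts = fullname.split()
    first = parts[0] if parts else ""
    last = " ".join(parts[1:])
    return last, first


if __name__ == "__main__":
    for name in ["Ada Lovelace", "Mary Ann Evans"]:
        last, first = split_author(name)
        print(f"{name!r}: last={last!r}, first={first!r}")
```

Under this rule, a name entered as "Mary Ann Evans" gets the last name "Ann Evans", so the paper would be sorted under "A" rather than "E". Entries like this are worth correcting by hand in the `db` file.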

## Getting data from EasyChair
@@ -54,10 +54,14 @@ We now have information about all the submissions but not whether they are accep

### A list of the accepted submissions

Go to _Status -> All papers_. Here we find the information on what submissions are accepted. Copy the content of this table as you did with the previous one, except only select the lines of the _ACCEPTED_ papers. It is very important that you do not include any of the _REJECTED_ papers when selecting the content, or else these papers will be included in the resulting `db` file as well. Save the content as `accepted`, and make sure that each row in the table corresponds to one line in the resulting file. A sample `accepted` file is available [here](example-files/accepted).
Go to _Status -> All papers_. Here we find the information on what submissions are accepted. Copy the content of this table as you did with the previous one. Save the content as `accepted`, and make sure that each row in the table corresponds to one line in the resulting file. A sample `accepted` file is available [here](example-files/accepted).

### A short explanation of the steps above

Neither of the two pages we saved data from contains, on its own, all the information we need to create the `db` file: the _Submissions_ page does not say which submissions are accepted, and the _Status_ page does not give the author names of the papers. By taking the intersection of the submission IDs of the two lists that we saved, we get the information we need about the accepted submissions.

Copying the table contents directly from the web browser results in a clean tab-separated list when pasted into a text editor. This makes the data easy to work with, and if the table format changes in EasyChair, the script is simple to adapt.
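
The intersection step can be sketched as follows. This is an illustration under assumptions, not the script's implementation: the helper `ids_from_table` is hypothetical, the sample rows are invented, and the column layout (submission ID in the first column, decision in the last column of the status table) is assumed for the example.

```python
def ids_from_table(text, keep=lambda fields: True):
    """Collect first-column IDs from tab-separated text.

    Hypothetical helper: 'keep' receives the list of fields for a row
    and decides whether that row's ID is included.
    """
    ids = set()
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines left over from copy and paste
        fields = line.split("\t")
        if keep(fields):
            ids.add(fields[0])
    return ids


# Invented sample data; real files come from the EasyChair tables.
submissions_text = (
    "1\tA. Author\tFirst paper\n"
    "2\tB. Author\tSecond paper\n"
    "3\tC. Author\tThird paper\n"
)
status_text = (
    "1\tFirst paper\tACCEPT\n"
    "2\tSecond paper\tREJECT\n"
    "3\tThird paper\tACCEPT\n"
)

submitted = ids_from_table(submissions_text)
# Assumed: a decision starting with "A" marks an accepted paper.
accepted = ids_from_table(status_text, keep=lambda f: f[-1].startswith("A"))
final_ids = sorted(submitted & accepted)
print(final_ids)
```

Because each row is split on tabs, adapting to a changed table format should mostly be a matter of changing which column indices are read.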

# Contributors

* Asad Sayeed
74 changes: 39 additions & 35 deletions easy2acl.py
@@ -1,32 +1,32 @@
#!/usr/bin/env python3

#,----
#| easy2acl.py - Convert data from EasyChair for use with aclpub
#|
#| Author: Nils Blomqvist
#|
#| Documentation
#| -------------
#| Full documentation at http://github.com/nblomqvist/easy2acl.
#|
#| Quick reference
#| ---------------
#| Before running this script, your file structure should look like this:
#|
#| |-- easy2acl.py
#| |-- submissions
#| |-- accepted
#| `-- pdf
#|     |-- ..._submission_1.pdf
#|     |-- ..._submission_2.pdf
#|     `-- ...
#|
#| Run the script:
#|
#| $ ./easy2acl.py
#|
#| When the script has finished, you will find the files 'db' and 'final.tar.gz'
#| in the same folder.
#`----

from shutil import copy, rmtree
@@ -41,23 +41,26 @@ def texify(string):

    """
    output = ''

    for w in string.split():
        output += unicode_to_tex(w) + ' '
    output = output.strip()

    return output

#,----
#| Append each accepted submission, as a tuple, to the 'accepted' list.
#`----
accepted = []

with open('accepted') as accepted_file:
for line in accepted_file:
entry = line.split("\t")
submission_id = entry[0]
title = entry[1]

if entry[-1][0] == "A": # if it's "ACCEPT"
#print(entry[-1])
submission_id = entry[0]
title = entry[1]

accepted.append((submission_id, title))

@@ -82,7 +85,7 @@ def texify(string):
for last in author_fullname[1:]:
author_last_name += last + ' '
author_last_name = author_last_name.strip()  # strip() returns a new string; keep the result

authors_clean.append((author_last_name, author_first_name))

submissions.append((submission_id, title, authors_clean))
@@ -106,10 +109,10 @@ def texify(string):
#| Add the submissions whose submission ID is found in the 'accepted' list to a
#| new list 'final_papers'. A match must be made for both the submission ID and
#| the title (just in case).
#|
#| Copy the PDFs whose submission ID is found in the 'accepted' list to a
#| directory 'final'.
#|
#| Finally, compress the 'final' directory into 'final.tar.gz' and remove folder
#| 'final'.
#`----
@@ -129,12 +132,13 @@ def texify(string):
copy(current_path, final_path)
break

Popen(['tar', '-czf', 'final.tar.gz', 'final'])
myprocess = Popen(['tar', '-czf', 'final.tar.gz', 'final'])
myprocess.wait()
rmtree('final')

#,----
#| Write the db file.
#|
#| Sort papers naturally by key 'first author's last name'.
#`----
final_papers = sorted(final_papers, key=lambda paper: paper[2][0][0])
@@ -144,20 +148,20 @@ def texify(string):
        id = paper[0]
        title = texify(paper[1])
        authors = paper[2]

        db.write('P: ' + id + '\n')
        db.write('T: ' + title + '\n')
        for author in authors:
            lastname = texify(author[0])
            firstname = texify(author[1])

            db.write('A: ' + lastname + ', ' + firstname + '\n')

        for pdf in pdfs:
            if paper[0] == pdf[0]:
                path = pdf[3]
                length = str(pdf[1])

                db.write('F: ' + path + '\n')
                db.write('L: ' + length + '\n')
                break