The code, queries, etc. here are unlikely to be updated as my process evolves. Later repos will likely take progressively different approaches with more elaborate tooling, as my habit is to try to improve at least one part of the process each time around.
All the relevant metadata now lives in config.json: ideally nothing should need tweaking after this. We need to be careful here to get the history of Wikidata IDs for the constituency correct.
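I haven't reproduced the real schema here, but a config.json for a repo like this might look roughly like the following (the `wikipedia` key is the one read by the scrape command below; the `seats` list and its field names are illustrative assumptions, shown as a list so that each successive Wikidata item for the constituency covers its own date range):

```json
{
  "wikipedia": "https://en.wikipedia.org/wiki/...",
  "seats": [
    { "id": "Q...", "start": "1885", "end": "1950" },
    { "id": "Q...", "start": "1983" }
  ]
}
```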
```shell
jq -r .wikipedia config.json | xargs bundle exec ruby scraper.rb | tee wikipedia.csv
```
Check for candidates whose party has no Wikidata ID:

```shell
xsv search -v -s party 'Q' wikipedia.csv
```

Nothing missing.
Check for elections with no Wikidata ID:

```shell
xsv search -v -s election 'Q' wikipedia.csv | xsv select electionLabel | uniq
```
Missing:
- 1890 Bristol East by-election
- 1895 Bristol East by-election
I created these as Q98521628 and Q98521631 and manually inserted them into the scraped file. (Later versions should do more of this automatically.)
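A later iteration could do that patching automatically. A minimal Ruby sketch of the step (the `election`/`electionLabel` column names come from the scrape above; the fix-up hash holds the two items created by hand):

```ruby
require 'csv'

# Wikidata items created by hand for elections the scrape couldn't resolve
FIXES = {
  '1890 Bristol East by-election' => 'Q98521628',
  '1895 Bristol East by-election' => 'Q98521631',
}.freeze

# Fill in the election item on any row that is missing one but whose
# label matches a known fix; returns the patched CSV as a string.
def patch_missing_elections(csv_text)
  table = CSV.parse(csv_text, headers: true)
  table.each do |row|
    next unless row['election'].to_s.strip.empty?
    fix = FIXES[row['electionLabel']]
    row['election'] = fix if fix
  end
  table.to_csv
end
```

Usage would be something like `File.write('wikipedia.csv', patch_missing_elections(File.read('wikipedia.csv')))`.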
Look up Wikidata items for the candidates the scrape couldn't resolve, by querying on their names as English labels:

```shell
xsv search -v -s id 'Q' wikipedia.csv | xsv select name | tail +2 |
  sed -e 's/^/"/' -e 's/$/"@en/' | paste -s - |
  xargs -0 wd sparql find-candidates.js |
  jq -r '.[] | [.name, .item.value, .election.label, .constituency.label, .party.label] | @csv' |
  tee candidates.csv
```
Left-join the scraped rows to the found items on candidate name:

```shell
xsv join -n --left 2 wikipedia.csv 1 candidates.csv | xsv select '10,1-8' | sed $'1i\\\nfoundid' | tee combo.csv
```
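For reference, the same left join in plain Ruby (assuming the column positions used above: the candidate name is column 2 of wikipedia.csv and column 1 of candidates.csv, and the found item is column 2 of candidates.csv):

```ruby
# Left-join scraped rows to SPARQL results on candidate name,
# prepending the found Wikidata item (or nil when there is no match)
def left_join_by_name(wiki_rows, cand_rows)
  by_name = cand_rows.group_by { |r| r[0] }
  wiki_rows.map do |row|
    match = (by_name[row[1]] || []).first
    [match && match[1], *row]
  end
end
```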
Generate the QuickStatements commands:

```shell
bundle exec ruby generate-qs.rb config.json | tee commands.qs
```
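commands.qs holds QuickStatements v2 rows, one tab-separated claim per line. The exact properties generate-qs.rb emits aren't shown here; a hypothetical builder for a "position held" claim might look like this (P39 = position held, P2715 = elected in, P4100 = parliamentary group; all item IDs are placeholders, not from this run):

```ruby
# Build one tab-separated QuickStatements v2 claim: a P39 (position held)
# statement with "elected in" and "parliamentary group" qualifiers
def qs_claim(person:, position:, election:, party:)
  [person, 'P39', position, 'P2715', election, 'P4100', party].join("\t")
end
```

e.g. `qs_claim(person: 'Q1234', position: 'Q5678', election: 'Q98521628', party: 'Q9')`.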
The commands were then sent to QuickStatements as https://tools.wmflabs.org/editgroups/b/QSv2T/1597907922692