Facebook Hackathon 2018 - News/Blog Quality Ranking

Possible Domains

newsrank.org
mediarank.*
inforank.*
sourcerank.*

Basic Idea

Populate a database with data that measures the ‘information quality’ of a publication
Rank publications according to the data
Expose these rankings via a web interface and via a public API
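As a rough sketch of the populate-then-rank flow above, the snippet below stores one row per publication and metric in SQLite and orders publications by a single metric. The table name `scores`, its columns, the metric name and the sample publications are all assumptions made for illustration; none of them come from an existing schema.

```python
import sqlite3


def build_db(rows):
    """Create an in-memory database and load (publication, metric, value) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per publication per metric.
    conn.execute(
        "CREATE TABLE scores (publication TEXT, metric TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)
    return conn


def rank(conn, metric):
    """Return publication names for one metric, highest value first."""
    cur = conn.execute(
        "SELECT publication FROM scores WHERE metric = ? "
        "ORDER BY value DESC, publication ASC",
        (metric,),
    )
    return [name for (name,) in cur]


# Made-up sample data; here a higher value is assumed to be better.
conn = build_db([
    ("Example Times", "links_per_article", 4.5),
    ("Sample Post", "links_per_article", 7.0),
    ("Demo Daily", "links_per_article", 1.2),
])
ranking = rank(conn, "links_per_article")
```

The web interface and the public API could both read from the same `rank` query, so the ranking logic would only live in one place.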

Different ‘Information Quality’ Metrics

“I trust the experts” - i.e. How many academics contribute to the publication?
“I value sources” - i.e. How many outbound links does an average article on the publication contain?
“I don’t want click-funded content” - i.e. Do they have a subscription revenue model?
“I want to avoid dishonest publications” - i.e. How many times have their articles been reviewed as “false” on Snopes.com, TrueFact, etc.?
"I want to avoid highly biased sources" - i.e. Use https://mediabiasfactcheck.com/

Getting the data - “I trust the experts”

Scrape articles from a list of publications
Get the authors from the articles
Check if authors have:
1. Published papers (check Google Scholar)
2. Published Books
3. (Stretch goal) Check whether their published work matches the article's topic (use an ML API if one exists)
Rank publications by how often they publish academics
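The ranking step could look roughly like the sketch below. It assumes the scraper has already produced (publication, author) pairs, and it takes the "is this author an academic?" check as a plain callable. That callable is a hypothetical stand-in: how to query Google Scholar or a book catalogue is left open here, and the names in the sample data are invented.

```python
from collections import defaultdict


def academic_share(articles, is_academic):
    """Map each publication to the fraction of its bylines held by academics.

    articles: iterable of (publication, author) pairs, one per article byline.
    is_academic: callable(author) -> bool. Hypothetical stand-in for the
        real lookup (published papers or books), which is not specified here.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    cache = {}  # look each author up only once
    for publication, author in articles:
        if author not in cache:
            cache[author] = bool(is_academic(author))
        totals[publication] += 1
        hits[publication] += cache[author]
    return {pub: hits[pub] / totals[pub] for pub in totals}


def rank_by_academics(articles, is_academic):
    """Publications ordered by academic share, highest first."""
    shares = academic_share(articles, is_academic)
    return sorted(shares, key=lambda pub: (-shares[pub], pub))


# Invented sample data standing in for scraped bylines.
known_academics = {"A. Researcher", "B. Professor"}
articles = [
    ("Example Times", "A. Researcher"),
    ("Example Times", "C. Reporter"),
    ("Sample Post", "B. Professor"),
    ("Sample Post", "A. Researcher"),
    ("Demo Daily", "C. Reporter"),
]
shares = academic_share(articles, known_academics.__contains__)
ranking = rank_by_academics(articles, known_academics.__contains__)
```

Using a share rather than a raw count keeps publications that simply publish more articles from ranking higher by volume alone.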

Getting the data - “I don’t want click-funded content”

Manually check whether each publication on the list has a subscription revenue model

Getting the data - “I value sources”

Scrape articles from a list of publications
Count how many external hyperlinks an article contains on average
1. Note: hyperlinks != sources, but it’ll do for an MVP
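A minimal sketch of the link count, using only the Python standard library. It assumes the article HTML has already been downloaded and treats a link as external when its host differs from the article's host. That rule is an assumption: it would count a `www.` variant or a sister subdomain as external, and it counts navigation and footer links as well as links in the article body. The URLs and HTML in the sample are invented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def count_external_links(html, article_url):
    """Count http(s) links whose host differs from the article's host."""
    parser = LinkCollector()
    parser.feed(html)
    own_host = urlparse(article_url).netloc.lower()
    count = 0
    for href in parser.hrefs:
        # Resolve relative links against the article URL before comparing.
        target = urlparse(urljoin(article_url, href))
        if target.scheme in ("http", "https") and target.netloc.lower() != own_host:
            count += 1
    return count


def average_external_links(pages):
    """pages: list of (article_url, html) pairs for one publication."""
    counts = [count_external_links(html, url) for url, html in pages]
    return sum(counts) / len(counts) if counts else 0.0


# Invented sample pages standing in for scraped articles.
pages = [
    ("https://news.example.com/a1",
     '<a href="/about">About</a> <a href="https://example.org/study">study</a> '
     '<a href="mailto:x@example.com">mail</a>'),
    ("https://news.example.com/a2",
     '<a href="https://example.org/a">a</a> <a href="http://example.net/b">b</a> '
     '<a href="#top">top</a>'),
]
average = average_external_links(pages)
```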

Getting the data - “I want to avoid dishonest publications”

Fact-check organisations will (I think) more often fact-check people than individual articles.
So if we search the fact-check websites for the names in the author table, we can trace any hits back to the publications those authors have written for.
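The trace-back step might be sketched as follows. It assumes two inputs that would come from earlier steps: (author, publication) pairs from the author table, and a list of names that appeared in "false" fact-check results. Both are hypothetical shapes, all names are invented, and matching on a lowercased name is a deliberate simplification that will confuse different people who share a name.

```python
from collections import Counter


def false_ratings_by_publication(author_publications, flagged_names):
    """Sum, per publication, the fact-check hits recorded against its authors.

    author_publications: iterable of (author, publication) pairs.
    flagged_names: iterable of names, one entry per "false" fact-check hit.
    """
    # Naive normalisation: trim whitespace and ignore case.
    flagged = Counter(name.strip().lower() for name in flagged_names)
    totals = Counter()
    seen = set()
    for author, publication in author_publications:
        key = (author.strip().lower(), publication)
        if key in seen:
            continue  # count each author once per publication
        seen.add(key)
        totals[publication] += flagged[key[0]]
    return totals


# Invented sample data.
author_publications = [
    ("A. Pundit", "Example Times"),
    ("A. Pundit", "Sample Post"),
    ("C. Reporter", "Sample Post"),
    ("C. Reporter", "Demo Daily"),
]
flagged_names = ["A. Pundit", "a. pundit ", "D. Stranger"]
totals = false_ratings_by_publication(author_publications, flagged_names)
```

Note that a hit against an author is credited to every publication they have written for, which matches the idea above but may be unfair to publications that carried only an unrelated piece.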

Getting the data - "I want to avoid highly biased sources"

MediaBiasFactCheck.com has a lot of data on this, but they don't seem to provide a public API. Scraping their list-view pages shouldn't be too hard though.
