uniformize older scrapers #18

alexn11 · 2020-12-21T22:44:30Z

Some of the scrapers output different columns (the-bfd.py, cato-institute.py, co2-coalition.py) or are missing the source column (bbc-non-climate, breibart-defense, the-onion-politics). If these are to be used again, they should be changed, and the scripts in the normalizer directory that are intended to correct this should then be removed.
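To make the idea concrete, here is a minimal sketch of what uniformizing a scraper's rows could look like. The target column names and the aliases are assumptions for illustration only; this issue does not state the actual schema.

```python
# Hypothetical target schema; the real column names are not given in this issue.
STANDARD_COLUMNS = ["url", "title", "text", "source"]

# Hypothetical aliases that an older scraper might have used for the same fields.
COLUMN_ALIASES = {"link": "url", "headline": "title", "body": "text"}


def normalize_row(row, default_source):
    """Return a dict with exactly STANDARD_COLUMNS.

    Known aliases are renamed, unknown columns are dropped, missing columns
    become empty strings, and an empty `source` is filled with default_source.
    """
    renamed = {COLUMN_ALIASES.get(key, key): value for key, value in row.items()}
    normalized = {column: renamed.get(column, "") for column in STANDARD_COLUMNS}
    if not normalized["source"]:
        normalized["source"] = default_source
    return normalized


if __name__ == "__main__":
    old_row = {"link": "https://example.org/a", "headline": "A title", "body": "Some text"}
    print(normalize_row(old_row, "bbc-non-climate"))
```

Doing this inside each scraper, rather than in a separate normalizer script, would keep a single place where the output format is defined.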

ricjhill · 2021-01-15T00:21:42Z

This scraping script works on most data sources, and its output is standardized.

Just build the Docker file and have a look. The differences between the data sources can be fixed in the filter function for each data source and by custom extractions from the HTML.

https://github.com/ClimateMisinformation/Scrapers/tree/create-container-climatediscussionnexus.com/infrastructure/docker/climatediscussionnexus-scrape
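As a rough illustration of the per-source filter function mentioned above, the sketch below picks a filter by source name and falls back to a default. The function names, the `text` field, and the 20-word rule are all hypothetical and are not taken from the linked branch.

```python
def keep_nonempty(article):
    """Default filter: keep any article whose text is not blank."""
    return bool(article.get("text", "").strip())


def keep_long_articles(article):
    """Hypothetical stricter rule: drop stubs shorter than 20 words."""
    return len(article.get("text", "").split()) >= 20


# Hypothetical registry mapping a source name to its filter function.
SOURCE_FILTERS = {"climatediscussionnexus.com": keep_long_articles}


def filter_articles(articles, source):
    """Apply the filter registered for `source`, or the default one."""
    keep = SOURCE_FILTERS.get(source, keep_nonempty)
    return [article for article in articles if keep(article)]
```

With this shape, supporting a new source only means registering one more function, while the output format stays the same for every source.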

alexn11 self-assigned this Dec 21, 2020

alexn11 added the maybe Possible improvement label Dec 21, 2020
