Page-Scraper

This is a Python-based web scraper for collecting information about students who qualified for
Google Summer of Code in past years. It collects the name, organisation and project details of
each successful candidate from Google's official webpage, https://summerofcode.withgoogle.com,
and stores them in a .csv file.
It then compares the collected data with a student database in .json format and returns
the names of the matching students.
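
As a rough illustration of the storage step, the sketch below writes already-extracted records to a
.csv file using Python's standard csv module. The column names and the write_records helper are
assumptions made for this example; the layout that scraper.py actually produces may differ.

```python
import csv

# Hypothetical column names; scraper.py may use different headers.
FIELDS = ["name", "organisation", "project"]


def write_records(records, csv_path):
    """Write a list of dicts keyed by FIELDS to csv_path and return the row count."""
    # newline="" lets the csv module control line endings on every platform.
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)
    return len(records)
```

Calling write_records with a list of dictionaries and a file name such as org_info.csv would produce
one header row followed by one row per candidate. Fields containing commas are quoted automatically.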

Important Stuff

This requires Python 3.x installed on your system, along with the BeautifulSoup4 and Requests libraries.
If you don't have these dependencies, follow the installation steps below:

sudo apt install python3
sudo apt install python3-pip
pip3 install beautifulsoup4 requests
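
To confirm that both libraries are visible to your Python interpreter, something like the following
sketch can help. The missing_packages helper is hypothetical and not part of this repository; it only
checks whether the import names bs4 and requests can be found.

```python
import importlib.util

# Maps each pip package name to the module name used in import statements.
REQUIRED = {"beautifulsoup4": "bs4", "requests": "requests"}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All dependencies are installed.")
```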

How to Use

Download the repository to your local machine and make sure all the dependencies are installed.
If you want to store the data in a different .csv file, change the file name in scraper.py and
make the same change in check.py.
To use a different JSON database, copy the .json file into the same directory as check.py
and change the file name in check.py (scraper.py remains unchanged).
Finally, run the scraper from your terminal with the following command:

python scraper.py

Provide the URL as input. The data is stored in the specified .csv file (org_info.csv in this case).
Next, run check.py, which prints the names common to both files along with some other details.
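
The comparison step can be pictured roughly as in the sketch below. It is not the code in check.py:
it assumes the JSON file holds a list of objects with a "name" key, that the .csv file has a "name"
column, and that names match when they are equal after lower-casing and collapsing whitespace. The
find_matches and normalise helpers are hypothetical.

```python
import csv
import json


def normalise(name):
    """Lower-case a name and collapse runs of whitespace into single spaces."""
    return " ".join(name.split()).lower()


def find_matches(csv_path, json_path, csv_column="name", json_key="name"):
    """Return the CSV rows whose name also appears in the JSON student list."""
    with open(json_path, encoding="utf-8") as handle:
        students = json.load(handle)  # assumed: a list of objects
    known = {normalise(student[json_key]) for student in students}

    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [row for row in csv.DictReader(handle)
                if normalise(row[csv_column]) in known]
```

With the default file names in this repository, the call would look like
find_matches("org_info.csv", "students.json"), provided the files follow the assumed layout.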
