TeamForm Web Scraping Project

Description

This project scrapes league ranking data from the TeamForm website. It drives a headless Chrome browser through the site and collects the data for each specified quarter/week.

Features

Web Scraping for League Ranking Data: Collects league ranking data from the TeamForm website, covering team performance and standings.
Headless Chrome Browser: Navigates pages and collects data without opening a visible browser window.
Configurable Data Load: The PAGES_NUMBER variable in _functions.py sets how many times the 'Load More' button is clicked; each click reveals additional rows of data.
Memory Efficiency: Capping the number of 'Load More' clicks (e.g., 17) keeps memory usage bounded while still fetching a substantial amount of data.
Focused Data Retrieval: The script currently retrieves league data only. 'Club' and 'National' data are not supported yet, though the structure leaves room to add them.
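
The capped 'Load More' behaviour described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: only PAGES_NUMBER and the example value of 17 come from this README, while the helper names (load_more_rows, make_headless_chrome, selenium_button_finder), the pause length, and any XPath passed in are hypothetical. The Selenium imports are deferred so the click loop can be exercised without a browser.

```python
import time

# PAGES_NUMBER is named in this README; 17 is the example value it mentions.
PAGES_NUMBER = 17


def load_more_rows(find_button, pages_number=PAGES_NUMBER, pause_seconds=1.0):
    """Click 'Load More' at most pages_number times.

    find_button is a hypothetical callable that returns a clickable object,
    or None when the button is no longer present. Returns the click count.
    """
    clicks = 0
    for _ in range(pages_number):
        button = find_button()
        if button is None:
            break  # no more data to load
        button.click()
        clicks += 1
        time.sleep(pause_seconds)  # give the page time to append new rows
    return clicks


def make_headless_chrome():
    """Create a headless Chrome driver (assumes Selenium 4 is installed)."""
    from selenium import webdriver  # imported lazily on purpose

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def selenium_button_finder(driver, xpath):
    """Build a find_button callable for a Selenium driver.

    The XPath is an assumption; the real page markup is not documented here.
    """
    from selenium.webdriver.common.by import By

    def find():
        matches = driver.find_elements(By.XPATH, xpath)
        return matches[0] if matches else None

    return find
```

Passing the button lookup in as a callable keeps the cap logic independent of the browser. With a real driver, the call would be something like load_more_rows(selenium_button_finder(driver, xpath)), where xpath is whatever expression matches the button on the page.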

Python Version Support

This project supports Python 3.8.

Note: This software has not been tested on earlier or later versions of Python.

Installation

Clone the repository:

git clone https://github.com/avchauzov/teamform_web_scraping.git

Navigate to the project directory:

cd teamform_web_scraping

Install the required dependencies:

pip install -r requirements.txt

Usage

Ensure the required dependencies, the Chrome browser, and Chromium are installed.
Update the file paths and the links.json file as needed.
Run the main script to perform web scraping:

python main.py

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

Name: Andrew Chauzov
Email: [email protected]

For more information or inquiries about the project, feel free to reach out via email.

Acknowledgements

Selenium: A powerful tool for browser automation used in this project for efficient web scraping.
