TeamForm Web Scraping Project

Description

This project scrapes league ranking data from the TeamForm website. It uses a headless Chrome browser to navigate the site and collect data for each specified quarter/week.

Features

Web Scraping for League Ranking Data: Collects league ranking data (team performance and standings) from the TeamForm website.
Headless Chrome Browser: Uses a headless Chrome browser to navigate pages and collect data without opening a visible window.
Configurable Data Load: The PAGES_NUMBER variable in _functions.py sets how many times the 'Load More' button is clicked. Each click reveals additional rows of data, so this value controls how much data is loaded.
Memory Efficiency: Limits the number of pages loaded to avoid memory issues. The script clicks the 'Load More' button a fixed number of times (e.g., 17) to fetch a substantial yet manageable amount of data.
Focused Data Retrieval: The script currently retrieves league data only. 'Club' and 'National' data are not supported yet, but the structure allows for adding them later.
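
The 'Load More' behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in _functions.py: the helper names (load_rows, make_headless_driver, make_selenium_clicker) and the XPath used to find the button are assumptions made for this example, and the value 17 is simply the one quoted in the feature list. The click loop takes a plain callable so it can be tried without a browser.

```python
import time
from typing import Callable

# Value quoted in the feature list above; the project's actual setting
# lives in _functions.py and may differ.
PAGES_NUMBER = 17


def load_rows(click_load_more: Callable[[], bool],
              pages_number: int = PAGES_NUMBER,
              pause: float = 0.0) -> int:
    """Call click_load_more up to pages_number times.

    Stops early if the callable returns False (for example, when the
    button is no longer on the page). Returns the number of successful clicks.
    """
    clicks = 0
    for _ in range(pages_number):
        if not click_load_more():
            break
        clicks += 1
        if pause:
            time.sleep(pause)  # give the page time to render new rows
    return clicks


def make_headless_driver():
    """Create a headless Chrome driver (Selenium 4 API)."""
    from selenium import webdriver  # imported lazily so load_rows works without Selenium

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def make_selenium_clicker(driver, xpath: str = "//button[contains(., 'Load More')]"):
    """Wrap a driver in a click callable. The XPath is a guess, not the site's real markup."""
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By

    def click() -> bool:
        try:
            driver.find_element(By.XPATH, xpath).click()
        except NoSuchElementException:
            return False
        return True

    return click
```

With a real browser, something like load_rows(make_selenium_clicker(driver), pause=1.0) would click the button up to PAGES_NUMBER times. Capping the number of clicks is what keeps the amount of page content held in memory bounded.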

Python Version Support

This project supports Python 3.8.

Note: This software has not been tested on earlier or later versions of Python.

Installation

Clone the repository:

git clone https://github.com/avchauzov/teamform_web_scraping.git

Navigate to the project directory:

cd teamform_web_scraping

Install the required dependencies:

pip install -r requirements.txt

Usage

Ensure the required dependencies, the Chrome browser, and Chromium are installed.
Update the file paths and the links.json file as needed.
Run the main script to start scraping:

python main.py

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

Name: Andrew Chauzov
Email: avchauzov@gmail.com

For more information or inquiries about the project, feel free to reach out via email.

Acknowledgements

Selenium: A powerful tool for browser automation used in this project for efficient web scraping.
