Data Analysis and Visualization

This repository presents basic concepts of data analysis. It covers the technical content and also encourages a critical view of the data.

Python libraries
- NumPy
  
  NumPy is a Python library for working with arrays. It also provides functions for linear algebra, Fourier transforms, and matrices.
- SciPy
  
  SciPy is a scientific computation library that uses NumPy underneath. SciPy stands for Scientific Python. It provides additional utility functions for optimization, statistics, and signal processing.
- Pandas
  
  pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
- StatsModels
  
  statsmodels is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests, and statistical data exploration.
- Matplotlib
  
  Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.
- Seaborn
  
  Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
- Datetime
  
  The datetime module includes functions and classes for doing date and time parsing, formatting, and arithmetic.
- Threading
  
  The threading module constructs higher-level threading interfaces on top of the lower-level _thread module.
- Speedtest
  
  A command-line interface for testing internet bandwidth using speedtest.net.
- Faker
  
  Faker is a Python package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill-in your persistence to stress test it, or anonymize data taken from a production service, Faker is for you.
- Missingno
  
  Missingno is a Python library that provides the ability to understand the distribution of missing values through informative visualizations.
- FuzzyWuzzy
  
  FuzzyWuzzy is a simple-to-use string-matching package that uses Levenshtein distance to calculate the differences between sequences.
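
Of the libraries listed above, datetime ships with Python, so its three stated uses (parsing, formatting, and arithmetic) can be sketched without installing anything. The timestamp and format strings below are made-up sample values, not data from this repository.

```python
from datetime import datetime, timedelta

# Parsing: turn a text timestamp into a datetime object.
# The timestamp and the format string are illustrative assumptions.
start = datetime.strptime("2021-03-15 08:30", "%Y-%m-%d %H:%M")

# Arithmetic: add a fixed duration, then measure the gap between two moments.
end = start + timedelta(days=2, hours=4)
elapsed = end - start

# Formatting: render the result back to text in a day/month/year layout.
label = end.strftime("%d/%m/%Y %H:%M")

print(label)                    # 17/03/2021 12:30
print(elapsed.total_seconds())  # 187200.0
```

Subtracting two datetime objects yields a timedelta, which is why `elapsed` exposes `total_seconds()` rather than a calendar date.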

Introduction to Machine Learning
- Scikit-learn
  - Classification
  - Regression
  - Clustering
  - Dimensionality reduction
  - Model selection
  - Preprocessing
- XGBoost
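
As a minimal sketch of how three of the Scikit-learn areas listed above fit together (preprocessing, model selection, and classification), the example below scales features, holds out a test split, and trains a classifier. The Iris dataset, the logistic regression model, and the parameter values are assumptions chosen for illustration; they are not taken from this repository's notebooks.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in dataset (illustrative choice).
X, y = load_iris(return_X_y=True)

# Model selection: keep 25% of the rows aside for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing + classification chained in one estimator, so the scaler
# is fitted on the training rows only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```

Wrapping the scaler and the classifier in a pipeline keeps test data out of the scaling statistics, which is one simple way to avoid leakage when evaluating a model.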

The content related to Data Science (Machine Learning and Deep Learning) is available in another repository.

Data Scraping
- Scrapy
- Selenium WebDriver

SQL

