ML-video-game-analysis-synthetic

A processing of a simplified video game dataset using k-nearest neighbors, and using Synthetic Data Vault and Regression/Classification models to verify synthetic model accuracy

data

We use a simplified dataset that lists age, weight, height, gender and gaming preference in terms of RPG, FPS, and so on.

approach

We first approach the video game dataset with a simple k-nearest neighbors example, dropping the gaming preference column in an attempt to predict it given different values of age, height, weight and gender.

We then flip this process on an axis by using the same data to predict probable genders, given age, height, weight, and gaming preference.

synthetic data generation and verification

We then deviate and use the Synthetic Data Vault to extend the size of the original dataset. This requires installation of the Synthetic Data Vault packages. We use this to create a dataset that is 5x the original.

Using SDV, we create two synthetic sets, a Gaussian Copula version and a CTGAN version.

We want to verify the veracity of the synthetic data, so we use both linear regression (with label conversion to numeric representations) and a classification model (using the gaming preference column as classification labels).

This classification and linear regression process requires instalation of SKlearn through scikit-learn.

Pandas and Numpy need to be included but are not listed as they are standard packages.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
classification_analysis_synthetic.py		classification_analysis_synthetic.py
k-nearest-video-games-gender.py		k-nearest-video-games-gender.py
k-nearest-video-games.py		k-nearest-video-games.py
linear_regression_analysis_synthetic.py		linear_regression_analysis_synthetic.py
sdv_generate_videog_data.py		sdv_generate_videog_data.py
video-game-dataset.csv		video-game-dataset.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ML-video-game-analysis-synthetic

data

approach

synthetic data generation and verification

About

Releases

Packages

Languages

License

jtatman/ML-video-game-analysis-synthetic

Folders and files

Latest commit

History

Repository files navigation

ML-video-game-analysis-synthetic

data

approach

synthetic data generation and verification

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages