gizmodata/benchmark-bigquery


BigQuery benchmark repo

This repo is intended to run benchmark queries against BigQuery.

Setup (to run locally)

1. Clone the repo

git clone https://github.com/voltrondata/benchmark-bigquery

2. Set up Python

Create a new Python 3.8+ virtual environment and install the requirements with:

cd benchmark-bigquery

# Create the virtual environment
python3 -m venv ./venv

# Activate the virtual environment
. ./venv/bin/activate

# Upgrade pip, setuptools, and wheel
pip install --upgrade pip setuptools wheel

# Install the benchmark-bigquery package (in editable mode)
pip install --editable .
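As a quick sanity check after the steps above, the following minimal sketch reports whether the interpreter meets the 3.8+ requirement and whether a virtual environment is active. The function name `environment_problems` is hypothetical and not part of this repo; the venv check relies on the standard behaviour that `sys.prefix` differs from `sys.base_prefix` inside a virtual environment.

```python
import sys


def environment_problems(version_info=sys.version_info,
                         prefix=sys.prefix,
                         base_prefix=sys.base_prefix):
    """Return a list of problems with the current Python setup (empty if none).

    Hypothetical helper for illustration; not part of benchmark-bigquery.
    """
    problems = []
    # The README asks for Python 3.8 or newer.
    if tuple(version_info[:2]) < (3, 8):
        problems.append("Python 3.8 or newer is required")
    # Inside a venv, sys.prefix points at the venv and differs from sys.base_prefix.
    if prefix == base_prefix:
        problems.append("no virtual environment is active")
    return problems


if __name__ == "__main__":
    for problem in environment_problems():
        print("WARNING:", problem)
```

Running it with the virtual environment activated should print nothing.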

3. Create a .env file in the root of the repo folder

Create a .env file in the root folder of the repo. It is git-ignored for security reasons.

Sample contents:

export GOOGLE_PROJECT_ID="voltron-data-developers"
export DATASET_ID="voltron-data-developers.tpch_10"
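To catch typos in the .env file before running a benchmark, a small checker along these lines may help. It is only a sketch: the README does not say how the package itself loads the file, so `parse_env_file` and `check_env` are hypothetical helpers. The check that DATASET_ID contains a dot assumes the fully qualified project.dataset form seen in the sample.

```python
import shlex


def parse_env_file(text):
    """Parse lines of the form `export KEY="value"` into a dict.

    Hypothetical helper; not necessarily how benchmark-bigquery reads .env.
    """
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=value line
        # shlex.split removes shell-style quoting around the value.
        parts = shlex.split(value)
        values[key.strip()] = parts[0] if parts else ""
    return values


def check_env(values):
    """Return a list of problems; empty when both settings look plausible."""
    problems = []
    for name in ("GOOGLE_PROJECT_ID", "DATASET_ID"):
        if not values.get(name):
            problems.append(f"{name} is missing or empty")
    dataset = values.get("DATASET_ID", "")
    # Assumption taken from the sample: DATASET_ID is written as project.dataset.
    if dataset and "." not in dataset:
        problems.append("DATASET_ID is not in project.dataset form")
    return problems
```

For example, `check_env(parse_env_file(open(".env").read()))` returns an empty list for the sample contents above.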

4. Authenticate with Google Cloud

gcloud auth application-default login
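To confirm that the login above left usable credentials, the sketch below looks for the Application Default Credentials file in the places Google's client libraries normally check: the path named by GOOGLE_APPLICATION_CREDENTIALS, then the well-known gcloud config directory. The helper name `find_adc_file` is hypothetical, and the sketch does not handle a custom CLOUDSDK_CONFIG location or credentials supplied by a metadata server.

```python
import os
from pathlib import Path


def find_adc_file(env=os.environ, home=None):
    """Return the path of the ADC JSON file if one exists, else None.

    Hypothetical helper for illustration; it only checks that a file is
    present, not that the credentials inside it are valid.
    """
    # 1. An explicit override takes precedence.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        path = Path(explicit)
        return path if path.is_file() else None

    # 2. The file written by `gcloud auth application-default login`.
    if os.name == "nt":
        base = Path(env.get("APPDATA", "")) / "gcloud"
    else:
        base = (home or Path.home()) / ".config" / "gcloud"
    candidate = base / "application_default_credentials.json"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_adc_file()
    print(found if found else "No ADC file found; run the gcloud login step again")
```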

Running the benchmarks (with default settings)

benchmark-bigquery

Note: this creates a file called "benchmark_results.json" in the data directory, containing the query run details.
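The layout of the results file is not documented here, so the following sketch makes no assumptions about its fields. It only loads the JSON and reports its top-level shape, which can be a useful first look before writing any analysis. The helper name `describe_results` is hypothetical, and the path in the usage line assumes the data directory sits in the repo root.

```python
import json
from pathlib import Path


def describe_results(path):
    """Load a JSON file and describe its top-level structure as a string.

    Hypothetical helper; it does not assume any particular result schema.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, list):
        return f"list with {len(data)} entries"
    if isinstance(data, dict):
        return "object with keys: " + ", ".join(sorted(data))
    return f"scalar of type {type(data).__name__}"


if __name__ == "__main__":
    results_path = Path("data") / "benchmark_results.json"  # assumed location
    if results_path.is_file():
        print(describe_results(results_path))
```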

To see more options:

benchmark-bigquery --help

Converting the benchmark JSON output data to Excel format

benchmark-bigquery-convert-output-to-excel

Note: this creates an Excel file called "benchmark_results.xlsx" in the data directory, containing the query run details.

About

Tools to benchmark Google Cloud BigQuery
