Week 2 project #88

Open
wants to merge 9 commits into
base: main
4 changes: 4 additions & 0 deletions greenery/.gitignore
@@ -0,0 +1,4 @@

target/
dbt_packages/
logs/
52 changes: 52 additions & 0 deletions greenery/README.md
@@ -0,0 +1,52 @@
# Week 1

## Metrics:

- **Total Users**: The total number of unique users registered on the platform.
- **Average Orders per Hour**: The average number of orders placed every hour.
- **Average Hours to Deliver**: On average, the number of hours it takes from an order being placed to being delivered.
- **Users by Purchase Count**: How many users have made one, two, or three or more purchases.
- **Average Sessions per Hour**: The average number of unique browsing sessions on the platform every hour.

## Results:

| Metric | Value |
|---------------------------------|------------|
| Total Users | 130 |
| Average Orders per Hour | 7.520 |
| Average Hours to Deliver | 93.403 |
| Users with One Purchase | 25 |
| Users with Two Purchases | 28 |
| Users with Three+ Purchases | 71 |
| Average Sessions per Hour | 16.327 |


# Week 2

## Business objectives:

- **What is our user repeat rate?**: The ratio of users who made two or more purchases to all users who made at least one purchase.
- **Define good indicators for potential repeat users**
- **Define good indicators for potential non-repeat users**


## Results:

| Metric | Value |
|---------------------------------|------------|
| Repeat rate | 79.84% |
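
As a check on the figure above, and assuming the Week 1 purchase counts still describe the same data, the repeat rate can be worked out as users with two or more purchases over all users with at least one purchase:

$$
\text{repeat rate} = \frac{28 + 71}{25 + 28 + 71} = \frac{99}{124} \approx 79.84\%
$$

Note that the denominator is the 124 users who placed an order, not the 130 registered users.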

- **Define good indicators for potential repeat users**
  - **Number of items on previous order**
  - **Usage of promo codes**
  - **Number of sessions until purchase**
  - **Average delivery time for the user**
  - **Good reviews of the products**
  - **Good reviews of the platform itself**

- **Define good indicators for potential non-repeat users**
  - **Type of product ordered (one-type purchases)**
  - **Email spam rate**
  - **Average delivery time for the user**
  - **Bad reviews of the products**
  - **Bad reviews of the platform itself**
Empty file added greenery/analyses/.gitkeep
Empty file.
90 changes: 90 additions & 0 deletions greenery/analyses/week_1_project.sql
@@ -0,0 +1,90 @@
WITH
--1. How many users do we have?
qtt_users AS (
SELECT
COUNT(DISTINCT user_id) AS distinct_users
FROM {{ ref("stg_postgres__users") }}
)--qtt_users



--2. On average, how many orders do we receive per hour?
, orders_per_hour AS (
SELECT
date_trunc('HOUR', created_at) AS hour
, COUNT(DISTINCT order_id) AS order_count
FROM {{ ref("stg_postgres__orders") }}
GROUP BY date_trunc('HOUR', created_at)
)--orders_per_hour


, avg_orders_per_hour AS (
SELECT
AVG(order_count) AS avg_order_per_hour
FROM orders_per_hour
)--avg_orders_per_hour


--3. On average, how long does an order take from being placed to being delivered?
, avg_delivery_time AS (
SELECT
AVG(DATEDIFF(hour, created_at, delivered_at)) AS avg_hours_to_deliver
FROM {{ ref("stg_postgres__orders") }}
WHERE delivered_at IS NOT NULL
)--avg_delivery_time



-- 4. How many users have only made one purchase? Two purchases? Three+ purchases?
, purchases_by_user AS (
SELECT
user_id
, COUNT(DISTINCT order_id) AS total_purchases
FROM {{ ref("stg_postgres__orders") }}
GROUP BY user_id
)--purchases_by_user

, purchase_counter AS (
SELECT
SUM(CASE WHEN total_purchases = 1 THEN 1 ELSE 0 end) AS users_with_one_purchase
, SUM(CASE WHEN total_purchases = 2 THEN 1 ELSE 0 end) AS users_with_two_purchase
, SUM(CASE WHEN total_purchases >= 3 THEN 1 ELSE 0 end) AS users_with_three_purchase
FROM purchases_by_user
)--purchase_counter



-- 5. On average, how many unique sessions do we have per hour?
,sessions_per_hour AS (
SELECT
DATE_TRUNC('HOUR', created_at) AS session_hour
, COUNT(DISTINCT session_id) AS session_count
FROM {{ ref("stg_postgres__events") }}
GROUP BY DATE_TRUNC('HOUR', created_at)
)

, avg_sessions_per_hour AS (
SELECT
AVG(session_count) AS avg_session_per_hour
FROM sessions_per_hour
)



SELECT
qtt_users.distinct_users
, avg_orders_per_hour.avg_order_per_hour
, avg_delivery_time.avg_hours_to_deliver
, purchase_counter.users_with_one_purchase
, purchase_counter.users_with_two_purchase
, purchase_counter.users_with_three_purchase
, avg_sessions_per_hour.avg_session_per_hour
FROM qtt_users
CROSS JOIN
avg_orders_per_hour
CROSS JOIN
avg_delivery_time
CROSS JOIN
purchase_counter
CROSS JOIN
avg_sessions_per_hour
25 changes: 25 additions & 0 deletions greenery/analyses/week_2_project.sql
@@ -0,0 +1,25 @@
with
purchases_per_user AS (
SELECT
user_id
,COUNT(DISTINCT order_id) AS purchase_counter
FROM {{ ref("stg_postgres__orders") }}
GROUP BY user_id
)--purchases_per_user

, purchase_counter AS (
SELECT
COUNT(user_id) AS users_who_purchased
, sum(CASE WHEN purchase_counter >= 2
THEN 1
ELSE 0
END
) AS users_who_purchased_twice_or_more
FROM purchases_per_user
)--purchase_counter

SELECT
users_who_purchased,
users_who_purchased_twice_or_more,
users_who_purchased_twice_or_more / users_who_purchased as repeat_rate
FROM purchase_counter
39 changes: 39 additions & 0 deletions greenery/dbt_project.yml
@@ -0,0 +1,39 @@

# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'greenery'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'greenery'

# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets: # directories to be removed by `dbt clean`
- "target"
- "dbt_packages"


# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models

# In this example config, we tell dbt to build all models in the example/
# directory as views. These settings can be overridden in the individual model
# files using the `{{ config(...) }}` macro.
models:
  greenery:
    # Config indicated by + and applies to all files under models/example/
    example:
      +materialized: view
    marts:
      +materialized: table
Empty file added greenery/macros/.gitkeep
Empty file.
38 changes: 38 additions & 0 deletions greenery/models/marts/core/_core__models.yml
@@ -0,0 +1,38 @@
version: 2

models:
  - name: dim_products
    columns:
      - name: product_id
        description: Unique UUID for each product
        tests:
          - unique
          - not_null
          - relationships:
              to: ref('stg_postgres__products')
              field: product_id

  - name: dim_users
    columns:
      - name: user_id
        description: Unique UUID for each user
        tests:
          - unique
          - not_null
          - relationships:
              to: ref('stg_postgres__users')
              field: user_id

  - name: fact_orders
    columns:
      - name: order_id
        description: Unique UUID for each order
        tests:
          - unique
          - not_null
      - name: user_id
        description: Unique UUID for each user
        tests:
          - relationships:
              to: ref('stg_postgres__users')
              field: user_id
34 changes: 34 additions & 0 deletions greenery/models/marts/core/dim_products.sql
@@ -0,0 +1,34 @@
WITH
products AS (
SELECT
product_id
,name
,price
,inventory
FROM {{ ref("stg_postgres__products") }}
)--products

, ordered_products AS (
SELECT
product_id
,num_orders
,quantity_shipped
,quantity_delivered
,quantity_preparing
,total_quantity
FROM {{ ref("int_products") }}
)--ordered_products

SELECT
products.product_id
,products.name
,products.price
,products.inventory
,ordered_products.num_orders
,ordered_products.quantity_shipped
,ordered_products.quantity_delivered
,ordered_products.quantity_preparing
,ordered_products.total_quantity

FROM products
LEFT JOIN ordered_products USING (product_id)
10 changes: 10 additions & 0 deletions greenery/models/marts/core/dim_users.sql
@@ -0,0 +1,10 @@
select
user_id
,first_name
,last_name
,email
,phone_number
,created_at
,updated_at
,address_id
FROM {{ ref("stg_postgres__users") }}
45 changes: 45 additions & 0 deletions greenery/models/marts/core/fact_orders.sql
@@ -0,0 +1,45 @@
WITH
orders AS (
SELECT
order_id
,promo_id
,user_id
,address_id
,created_at
,order_cost
,shipping_cost
,order_total
,tracking_id
,shipping_service
,estimated_delivery_at
,delivered_at
,status
FROM {{ ref("stg_postgres__orders") }}
)--orders

, promos AS (
SELECT
promo_id
,discount
,status
FROM {{ ref("stg_postgres__promos") }}
)--promos

SELECT
orders.order_id
,orders.promo_id
,orders.user_id
,orders.address_id
,orders.created_at
,orders.order_cost
,orders.shipping_cost
,orders.order_total
,orders.tracking_id
,orders.shipping_service
,orders.estimated_delivery_at
,orders.delivered_at
,orders.status
,promos.discount AS promo_discount
FROM orders
LEFT JOIN promos
ON orders.promo_id = promos.promo_id