Project w2 #87

Open
wants to merge 3 commits into
base: main
4 changes: 4 additions & 0 deletions greenery/.gitignore
@@ -0,0 +1,4 @@

target/
dbt_packages/
logs/
108 changes: 108 additions & 0 deletions greenery/Project 1/readme.md
@@ -0,0 +1,108 @@
# Week 1 Project:

**Question :**
How many users do we have?

**Answer :**
130 unique users

**Query :**

```sql
SELECT COUNT(DISTINCT user_id) AS user_count
FROM dev_db.dbt_burjack86gmailcom.stg_postgres__users
```

---

**Question :**
On average, how many orders do we receive per hour?

**Answer :**
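On average, we receive approximately 15 orders per hour of the day. The query below buckets orders by hour of day (0 to 23) across all dates, not by individual calendar hours.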

**Query :**

```sql
SELECT DISTINCT
AVG(COUNT(*)) OVER () AS avg_orders_per_hour
FROM
dev_db.dbt_burjack86gmailcom.stg_postgres__orders
GROUP BY
EXTRACT(HOUR FROM created_at)
```
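
If the question is read as "orders per calendar hour", so that each date-and-hour combination is its own bucket, the query could be sketched as follows. This is an alternative reading, not the query used for the answer above. It assumes Snowflake's `DATE_TRUNC`, and the CTE and column names (`orders_by_hour`, `order_hour`, `orders_in_hour`) are illustrative.

```sql
-- Sketch: average orders per calendar hour (one bucket per date + hour).
-- Hours with no orders produce no row, so they are not counted as zeros.
WITH orders_by_hour AS (
    SELECT
        DATE_TRUNC('hour', created_at) AS order_hour
        , COUNT(DISTINCT order_id) AS orders_in_hour
    FROM dev_db.dbt_burjack86gmailcom.stg_postgres__orders
    GROUP BY DATE_TRUNC('hour', created_at)
)

SELECT AVG(orders_in_hour) AS avg_orders_per_hour
FROM orders_by_hour
```

Because `EXTRACT(HOUR FROM ...)` yields at most 24 buckets while `DATE_TRUNC` yields one per active hour in the data, the two readings will generally give different averages.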

---

**Question :**
On average, how long does an order take from being placed to being delivered?

**Answer :**
On average, it takes almost 4 days (circa 3 days and 22 hours) for an order to be delivered.

**Query :**

```sql
SELECT AVG(DATEDIFF(DAY, created_at, delivered_at)) AS avg_delivery_days
FROM dev_db.dbt_burjack86gmailcom.stg_postgres__orders
WHERE status = 'delivered'
```
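
`DATEDIFF(DAY, ...)` in Snowflake counts the day boundaries crossed, so it cannot resolve the hours mentioned in the answer. A minimal sketch at finer granularity, assuming the same staging model and that `delivered_at` is populated for delivered orders (the output column names are illustrative):

```sql
-- Sketch: measure the gap in minutes, then convert to hours and days.
SELECT
    AVG(DATEDIFF(MINUTE, created_at, delivered_at)) / 60 AS avg_delivery_hours
    , AVG(DATEDIFF(MINUTE, created_at, delivered_at)) / (60 * 24) AS avg_delivery_days
FROM dev_db.dbt_burjack86gmailcom.stg_postgres__orders
WHERE status = 'delivered'
```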

---

**Question :**
How many users have only made one purchase? Two purchases? Three+ purchases?

**Answer :**
Users by number of purchases are as follows:
- One purchase = 25
- Two purchases = 28
- Three+ purchases = 71

**Query :**

```sql
WITH agg_purchases AS (
    -- One row per user with their number of distinct orders
    SELECT
        COUNT(DISTINCT order_id) AS order_count
        , CASE
            WHEN order_count = 1 THEN '=1'
            WHEN order_count = 2 THEN '=2'
            WHEN order_count >= 3 THEN '>= 3'
        END AS purchase_cohort
    FROM dev_db.dbt_burjack86gmailcom.stg_postgres__orders
    GROUP BY user_id
)

-- Each row of agg_purchases is one user, so this counts users per cohort
SELECT
    purchase_cohort
    , COUNT(*) AS user_count
FROM agg_purchases
GROUP BY purchase_cohort
```

---

**Question :**
On average, how many unique sessions do we have per hour?

**Answer :**
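On average, there are almost 40 sessions per hour of the day (39.46). The query below buckets sessions by hour of day (0 to 23) across all dates, not by individual calendar hours.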

**Query :**

```sql
WITH sessions_hour AS (
    SELECT
        EXTRACT(HOUR FROM created_at) AS hour_of_day
        , COUNT(DISTINCT session_id) AS sessions_per_hour
    FROM dev_db.dbt_burjack86gmailcom.stg_postgres__events
    GROUP BY EXTRACT(HOUR FROM created_at)
)

SELECT
AVG(sessions_per_hour) AS avg_sessions_per_hour
FROM sessions_hour
```
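
For a per-calendar-hour reading of the same question, a sketch using `DATE_TRUNC` could look like this. It is an alternative interpretation, not the query behind the figure above, and the names `sessions_by_hour`, `session_hour`, and `sessions_in_hour` are illustrative. A session that spans two hours is counted once in each hour.

```sql
-- Sketch: average distinct sessions per calendar hour (one bucket per date + hour).
-- Hours with no events produce no row, so they are not counted as zeros.
WITH sessions_by_hour AS (
    SELECT
        DATE_TRUNC('hour', created_at) AS session_hour
        , COUNT(DISTINCT session_id) AS sessions_in_hour
    FROM dev_db.dbt_burjack86gmailcom.stg_postgres__events
    GROUP BY DATE_TRUNC('hour', created_at)
)

SELECT AVG(sessions_in_hour) AS avg_sessions_per_hour
FROM sessions_by_hour
```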

15 changes: 15 additions & 0 deletions greenery/README.md
@@ -0,0 +1,15 @@
Welcome to your new dbt project!

### Using the starter project

Try running the following commands:
- dbt run
- dbt test


### Resources:
- Learn more about dbt [in the docs](https://docs.getdbt.com/docs/introduction)
- Check out [Discourse](https://discourse.getdbt.com/) for commonly asked questions and answers
- Join the [chat](https://community.getdbt.com/) on Slack for live discussions and support
- Find [dbt events](https://events.getdbt.com) near you
- Check out [the blog](https://blog.getdbt.com/) for the latest news on dbt's development and best practices
Empty file added greenery/analyses/.gitkeep
Empty file.
37 changes: 37 additions & 0 deletions greenery/dbt_project.yml
@@ -0,0 +1,37 @@

# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'greenery'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'greenery'

# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets:         # directories to be removed by `dbt clean`
  - "target"
  - "dbt_packages"


# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models

# In this example config, we tell dbt to build all models in the example/
# directory as views. These settings can be overridden in the individual model
# files using the `{{ config(...) }}` macro.
models:
  greenery:
    # Config indicated by + and applies to all files under models/example/
    example:
      +materialized: view
Empty file added greenery/macros/.gitkeep
Empty file.
27 changes: 27 additions & 0 deletions greenery/models/example/my_first_dbt_model.sql
@@ -0,0 +1,27 @@

/*
Welcome to your first dbt model!
Did you know that you can also configure models directly within SQL files?
This will override configurations stated in dbt_project.yml
Try changing "table" to "view" below
*/

{{ config(materialized='table') }}

with source_data as (

select 1 as id
union all
select null as id

)

select *
from source_data

/*
Uncomment the line below to remove records with null `id` values
*/

-- where id is not null
6 changes: 6 additions & 0 deletions greenery/models/example/my_second_dbt_model.sql
@@ -0,0 +1,6 @@

-- Use the `ref` function to select from other models

select *
from {{ ref('my_first_dbt_model') }}
where id = 1
21 changes: 21 additions & 0 deletions greenery/models/example/schema.yml
@@ -0,0 +1,21 @@

version: 2

models:
  - name: my_first_dbt_model
    description: "A starter dbt model"
    columns:
      - name: id
        description: "The primary key for this table"
        tests:
          - unique
          - not_null

  - name: my_second_dbt_model
    description: "A starter dbt model"
    columns:
      - name: id
        description: "The primary key for this table"
        tests:
          - unique
          - not_null
18 changes: 18 additions & 0 deletions greenery/models/marts/Marketing/fact_users_orders.sql
@@ -0,0 +1,18 @@
{{
  config(
    materialized='table'
  )
}}

SELECT
    u.user_id
    , count(o.order_id) AS count_orders
    , count(o.promo_id) AS count_promo_orders
    , date(min(o.created_at)) AS first_order_date
    , date(max(o.created_at)) AS last_order_date
    , round(avg(o.order_total), 2) AS avg_order_total
FROM {{ ref('stg_postgres__users') }} u
LEFT JOIN {{ ref('stg_postgres__orders') }} o
    ON u.user_id = o.user_id
    -- Filter in the join condition: a WHERE filter on o.status would drop
    -- users without delivered orders and turn the LEFT JOIN into an inner join.
    AND o.status = 'delivered'
GROUP BY ALL
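
One possible check for this model, written as a dbt singular test (a SQL file under the `tests/` path configured in `dbt_project.yml`, which fails if it returns any rows). The file name is hypothetical, and the rule is an assumption about the intended data contract: a user's promo orders should never exceed their total orders.

```sql
-- tests/assert_promo_orders_not_above_orders.sql (hypothetical file name)
-- Returns offending rows; dbt reports the test as failed if any row comes back.
SELECT
    user_id
    , count_orders
    , count_promo_orders
FROM {{ ref('fact_users_orders') }}
WHERE count_promo_orders > count_orders
```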
30 changes: 30 additions & 0 deletions greenery/models/staging/postgres/_postgres__sources.yml
@@ -0,0 +1,30 @@
version: 2

sources:

  - name: postgres # project name
    schema: public
    database: raw

    tables:
      - name: addresses
        description: >
          Address IDs for each customer.
      - name: users
        description: >
          Customer IDs and customer information.
      - name: promos
        description: >
          Promo codes, status, and discounts.
      - name: products
        description: >
          Products list including name, price, and inventory.
      - name: orders
        description: >
          Orders event table.
      - name: order_items
        description: >
          Products for each order event.
      - name: events
        description: >
          Events by user and webpage.
2 changes: 2 additions & 0 deletions greenery/models/staging/postgres/stg_postgres__addresses.sql
@@ -0,0 +1,2 @@
select ADDRESS_ID, ADDRESS, ZIPCODE, STATE, COUNTRY
from {{source('postgres', 'addresses')}}
19 changes: 19 additions & 0 deletions greenery/models/staging/postgres/stg_postgres__addresses.yml
@@ -0,0 +1,19 @@
version: 2

models:
  - name: stg_postgres__addresses
    description: Table for customer addresses
    columns:
      - name: address_id
        description: Unique address ID in the platform (Primary key)
        tests:
          - unique
          - not_null
      - name: address
        description: Address description (street number, name etc)
      - name: zipcode
        description: The zipcode
      - name: state
        description: State of the address
      - name: country
        description: Country of the address
2 changes: 2 additions & 0 deletions greenery/models/staging/postgres/stg_postgres__events.sql
@@ -0,0 +1,2 @@
select EVENT_ID, SESSION_ID, USER_ID, PAGE_URL, CREATED_AT, EVENT_TYPE, ORDER_ID, PRODUCT_ID
from {{source('postgres', 'events')}}
26 changes: 26 additions & 0 deletions greenery/models/staging/postgres/stg_postgres__events.yml
@@ -0,0 +1,26 @@
version: 2

models:

  - name: stg_postgres__events
    description: Events based on customer interactions with the website
    columns:
      - name: event_id
        description: Unique event ID in the platform (Primary key)
        tests:
          - unique
          - not_null
      - name: session_id
        description: Browsing session ID
      - name: user_id
        description: ID of the user associated with the event
      - name: page_url
        description: URL of the event
      - name: created_at
        description: Timestamp of the event
      - name: event_type
        description: Type of event
      - name: order_id
        description: Order ID related to the event if there was one
      - name: product_id
        description: Product ID related to the event if there was one
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
select ORDER_ID, PRODUCT_ID, QUANTITY
from {{source('postgres', 'order_items')}}
13 changes: 13 additions & 0 deletions greenery/models/staging/postgres/stg_postgres__order_items.yml
@@ -0,0 +1,13 @@
version: 2

models:

  - name: stg_postgres__order_items
    description: Table that links orders with products and quantities
    columns:
      - name: order_id
        description: Order ID
      - name: product_id
        description: Product ID
      - name: quantity
        description: Quantity ordered
2 changes: 2 additions & 0 deletions greenery/models/staging/postgres/stg_postgres__orders.sql
@@ -0,0 +1,2 @@
select ORDER_ID, USER_ID, PROMO_ID, ADDRESS_ID, CREATED_AT, ORDER_COST, SHIPPING_COST, ORDER_TOTAL, TRACKING_ID, SHIPPING_SERVICE, ESTIMATED_DELIVERY_AT, DELIVERED_AT, STATUS
from {{source('postgres', 'orders')}}
36 changes: 36 additions & 0 deletions greenery/models/staging/postgres/stg_postgres__orders.yml
@@ -0,0 +1,36 @@
version: 2

models:

  - name: stg_postgres__orders
    description: Table with customer orders and their details
    columns:
      - name: order_id
        description: Unique order ID in the platform (Primary key)
        tests:
          - unique
          - not_null
      - name: user_id
        description: User ID related to the order
      - name: promo_id
        description: Promo ID, if one was applied to the order
      - name: address_id
        description: Delivery address ID
      - name: created_at
        description: Timestamp indicating when the order was created
      - name: order_cost
        description: Cost for the respective order
      - name: shipping_cost
        description: Cost of shipping
      - name: order_total
        description: Total cost of the order
      - name: tracking_id
        description: Tracking number for the order
      - name: shipping_service
        description: Company that was used for shipping
      - name: estimated_delivery_at
        description: Estimated date of delivery
      - name: delivered_at
        description: Actual time of delivery
      - name: status
        description: Order status
2 changes: 2 additions & 0 deletions greenery/models/staging/postgres/stg_postgres__products.sql
@@ -0,0 +1,2 @@
select PRODUCT_ID, NAME, PRICE, INVENTORY
from {{source('postgres', 'products')}}