Create open_data athena db #691

Merged
Changes from 53 commits (56 commits in total)
b31690d: First commit (wrridgeway, Dec 26, 2024)
a8ae382: Add docs (wrridgeway, Dec 26, 2024)
b781046: Add +schema (wrridgeway, Dec 26, 2024)
4fcff39: Rename view (wrridgeway, Dec 26, 2024)
14e5021: Merge branch 'master' into 652-create-current-year-only-parcel-univer… (wrridgeway, Jan 2, 2025)
bb04d72: Add exposure (wrridgeway, Jan 2, 2025)
255f9aa: Revert exposure (wrridgeway, Jan 2, 2025)
8d4df29: Update exposure name (wrridgeway, Jan 2, 2025)
267a952: Language (wrridgeway, Jan 2, 2025)
1f65d65: Add new open data asset (wrridgeway, Jan 2, 2025)
4da462a: Merge branch 'master' into 652-create-current-year-only-parcel-univer… (wrridgeway, Jan 3, 2025)
c715482: Add new assets to workflow (wrridgeway, Jan 3, 2025)
a072eda: Update worklflow asset names (wrridgeway, Jan 3, 2025)
587bf72: Update default asset (wrridgeway, Jan 3, 2025)
5b4a4be: Merge branch 'master' into 652-create-current-year-only-parcel-univer… (wrridgeway, Jan 12, 2025)
3bcbf9d: Add condo view (wrridgeway, Jan 12, 2025)
fec9365: Correct columns (wrridgeway, Jan 13, 2025)
241e976: Correct schema (wrridgeway, Jan 13, 2025)
20448e7: Add sf (wrridgeway, Jan 13, 2025)
84ea5ec: Add sf (wrridgeway, Jan 13, 2025)
49c6009: Add sf (wrridgeway, Jan 13, 2025)
03c472d: Rename current pu (wrridgeway, Jan 13, 2025)
eb38047: Forgot exposure (wrridgeway, Jan 13, 2025)
985dd12: Add historic parcel universe (wrridgeway, Jan 13, 2025)
68e84cb: Add missing columns as comments (wrridgeway, Jan 13, 2025)
7335329: Remove accidental changes (wrridgeway, Jan 13, 2025)
88a521f: Remove accidental change (wrridgeway, Jan 13, 2025)
7a77761: Add addresses, row ids (wrridgeway, Jan 13, 2025)
d324887: Add assessed values (wrridgeway, Jan 13, 2025)
4f80b7d: Add appeals (wrridgeway, Jan 13, 2025)
1c819eb: Add row_id (wrridgeway, Jan 13, 2025)
7a57818: Add sales (wrridgeway, Jan 13, 2025)
959d7d2: Add sales (wrridgeway, Jan 13, 2025)
2a30f9d: Add sales (wrridgeway, Jan 13, 2025)
244050f: Add sales (wrridgeway, Jan 13, 2025)
d3f80b9: Add sales (wrridgeway, Jan 13, 2025)
a16daee: Add sales (wrridgeway, Jan 13, 2025)
1197a56: Add sales (wrridgeway, Jan 13, 2025)
fda44cc: Add exempt parcels (wrridgeway, Jan 13, 2025)
2b5a1e1: Correct exempt parcel pk (wrridgeway, Jan 13, 2025)
bcc03e7: Add proximity (wrridgeway, Jan 13, 2025)
29849b9: Remove row_id construction from api script (wrridgeway, Jan 13, 2025)
a996df5: Simplify commenting (wrridgeway, Jan 13, 2025)
179e50e: Simplify row id handling (wrridgeway, Jan 13, 2025)
9ea9d4c: Align condo columns (wrridgeway, Jan 14, 2025)
3af0898: Organize excluded columns (wrridgeway, Jan 14, 2025)
6e8e9e6: Remove vestigial row_id ref (wrridgeway, Jan 14, 2025)
1e303c3: Simplify docs (wrridgeway, Jan 14, 2025)
fbc43c5: Add parcel status (wrridgeway, Jan 14, 2025)
75b95b3: Add permits (wrridgeway, Jan 14, 2025)
8b164e0: Correct exposure ref (wrridgeway, Jan 14, 2025)
a7a7a70: Remove notes column (wrridgeway, Jan 14, 2025)
45f7e43: Reconfigure schema (wrridgeway, Jan 14, 2025)
f625870: Remove row_id column description. (wrridgeway, Jan 14, 2025)
226af86: Language (wrridgeway, Jan 14, 2025)
2a2578e: Strip out missing columns commenting (wrridgeway, Jan 14, 2025)
15 changes: 9 additions & 6 deletions .github/workflows/socrata_upload.yaml
Member Author

Adding both new assets and renaming the parcel universe assets here.

@@ -7,16 +7,19 @@ on:
type: choice
description: Target Socrata asset
options:
- Parcel Universe
- Single and Multi-Family Improvement Characteristics
- Residential Condominium Unit Characteristics
- Parcel Sales
- Assessed Values
- Appeals
- Assessed Values
- Parcel Addresses
- Parcel Proximity
- Parcel Sales
- Parcel Status
- Permits
- Property Tax-Exempt Parcels
default: Parcel Universe
- Parcel Universe (Historic)
- Parcel Universe (Current Year)
- Residential Condominium Unit Characteristics
- Single and Multi-Family Improvement Characteristics
default: Parcel Universe (Current Year)
required: true
overwrite:
# True for overwrite, False for update
2 changes: 2 additions & 0 deletions dbt/dbt_project.yml
@@ -51,6 +51,8 @@ models:
+schema: location
model:
+schema: model
open_data:
+schema: open_data
proximity:
+schema: proximity
reporting:
2 changes: 1 addition & 1 deletion dbt/models/default/docs.md
@@ -128,7 +128,7 @@ View containing aggregate land square footage for all PINs.
View containing building permits organized by PIN, with extra metadata
recorded by CCAO permit specialists during the permit processing workflow.

**Primary Key**: `pin`, `date_issued`
**Primary Key**: `pin`, `permit_number`
Member Author
wrridgeway, Jan 14, 2025

This combo uniquely identifies rows and feels more intuitive to me:

select count(*) from default.vw_pin_permit group by pin, permit_number having count(*) > 1

Yields zero rows. I also needed to make sure that none of the columns I used for Socrata PKs had any NULL values, and this combo suits that condition.
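
A minimal sketch of the NULL check mentioned here, assuming the Trino-style `COUNT_IF` aggregate is available in Athena. This is an illustration of the idea, not necessarily the query that was actually run; the output column names `null_pin` and `null_permit_number` are made up for the example.

```sql
-- Sketch: count NULLs in each candidate Socrata primary key column.
-- Both counts should be 0 if (pin, permit_number) is safe to use as a PK.
-- Assumes Athena/Trino syntax; output column names are illustrative only.
SELECT
    COUNT_IF(pin IS NULL) AS null_pin,
    COUNT_IF(permit_number IS NULL) AS null_permit_number
FROM default.vw_pin_permit
```

Together with the duplicate check, zero counts here would indicate the combination is both unique and fully populated.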

Contributor

[Praise] Yup, this new version is correct. Thanks!

{% enddocs %}

# vw_pin_sale
71 changes: 71 additions & 0 deletions dbt/models/open_data/docs.md
@@ -0,0 +1,71 @@
# vw_appeal

{% docs view_vw_appeal %}
Thin wrapper around `default.vw_pin_appeal` that powers the `Appeals` open data asset
{% enddocs %}

# vw_assessed_value

{% docs view_vw_assessed_value %}
Thin wrapper around `default.vw_pin_history` that powers the `Assessed Values` open data asset
{% enddocs %}

# vw_parcel_address

{% docs view_vw_parcel_address %}
Thin wrapper around `default.vw_pin_address` that powers the `Parcel Addresses` open data asset
{% enddocs %}

# vw_parcel_proximity

{% docs view_vw_parcel_proximity %}
Thin wrapper around `proximity.vw_pin10_proximity` that powers the `Parcel Proximity` open data asset
{% enddocs %}

# vw_parcel_sale

{% docs view_vw_parcel_sale %}
Thin wrapper around `default.vw_pin_sale` that powers the `Parcel Sales` open data asset
{% enddocs %}

# vw_parcel_status

{% docs view_vw_parcel_status %}
Thin wrapper around `default.vw_pin_status` that powers the `Parcel Status` open data asset
{% enddocs %}

# vw_parcel_universe_current

{% docs view_vw_parcel_universe_current %}
Thin wrapper around `default.vw_pin_universe` that powers the `Parcel Universe (Current Year)` open data asset
{% enddocs %}

# vw_parcel_universe_historic

{% docs view_vw_parcel_universe_historic %}
Thin wrapper around `default.vw_pin_universe` that powers the `Parcel Universe (Historic)` open data asset
{% enddocs %}

# vw_permit

{% docs view_vw_permit %}
Thin wrapper around `default.vw_pin_permit` that powers the `Permits` open data asset
{% enddocs %}

# vw_property_tax_exempt_parcel

{% docs view_vw_property_tax_exempt_parcel %}
Thin wrapper around `default.vw_pin_exempt` that powers the `Property Tax-Exempt Parcels` open data asset
{% enddocs %}

# vw_res_condo_unit_char

{% docs view_vw_res_condo_unit_char %}
Thin wrapper around `default.vw_pin_condo_char` that powers the `Residential Condominium Unit Characteristics` open data asset
{% enddocs %}

# vw_sf_mf_improvement_char

{% docs view_vw_sf_mf_improvement_char %}
Thin wrapper around `default.vw_card_res_char` that powers the `Single and Multi-Family Improvement Characteristics` open data asset
{% enddocs %}
Member Author
wrridgeway, Jan 14, 2025

Mostly just a lift and shift from the default folder, but also added the three new assets and changed refs to point to the open_data db.

@@ -4,7 +4,7 @@ exposures:
type: dashboard
url: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Appeals/y282-6ig3
depends_on:
- ref('default.vw_pin_appeal')
- ref('open_data.vw_appeal')
owner:
name: Data Department
meta:
@@ -26,7 +26,7 @@ exposures:
type: dashboard
url: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Assessed-Values/uzyt-m557
depends_on:
- ref('default.vw_pin_history')
- ref('open_data.vw_assessed_value')
owner:
name: Data Department
meta:
@@ -43,12 +43,13 @@ exposures:

Use cases: Alone, can characterize assessments in a given area. Can be combined with characteristic data to make more nuanced generalizations about assessments. Can be combined with sales data to conduct ratio studies.


- name: parcel_addresses
label: Parcel Addresses
type: dashboard
url: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Parcel-Addresses/3723-97qp
depends_on:
- ref('default.vw_pin_address')
- ref('open_data.vw_parcel_address')
owner:
name: Data Department
meta:
@@ -64,12 +65,33 @@ exposures:

Use cases: Can be used for geocoding or joining address-level data to other datasets.

- name: parcel_proximity
label: Parcel Proximity
type: dashboard
url: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Parcel-Proximity/ydue-e5u3
depends_on:
- ref('open_data.vw_parcel_proximity')
owner:
name: Data Department
meta:
test_row_count: true
asset_id: ydue-e5u3
primary_key:
- pin10
- year
description: |
Cook County 10-digit parcels with attached distances to various spatial features.

Notes: Refreshed monthly; data is updated yearly as spatial files are made available.

Use cases: Can be used to isolate parcels by distance to specific spatial features.

- name: parcel_sales
label: Parcel Sales
type: dashboard
url: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Parcel-Sales/wvhk-k5uv
depends_on:
- ref('default.vw_pin_sale')
- ref('open_data.vw_parcel_sale')
owner:
name: Data Department
meta:
@@ -84,12 +106,52 @@ exposures:

Use cases: Alone, sales data can be used to characterize real estate markets. Sales paired with characteristics can be used to find comparable properties or as an input to an automated modeling application. Sales paired with assessments can be used to calculate sales ratio statistics. Outliers can be easily removed using filters constructed from class, township, and year variables.

- name: parcel_universe
label: Parcel Universe
- name: parcel_status
label: Parcel Status
type: dashboard
url: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Parcel-Status/uuu4-fqy8
depends_on:
- ref('open_data.vw_parcel_status')
owner:
name: Data Department
meta:
test_row_count: true
asset_id: uuu4-fqy8
primary_key:
- pin
- year
description: |
Collection of various PIN-level physical and assessment-related
statuses collected and documented across the CCAO and Data Department.
Constructs the Data Department's AHSAP indicator.

- name: parcel_universe_current_year
label: Parcel Universe (Current Year)
type: dashboard
url: https://datacatalog.cookcountyil.gov/dataset/Assessor-Parcel-Universe-Current-Year-/pabr-t5kh
depends_on:
- ref('open_data.vw_parcel_universe_current')
owner:
name: Data Department
meta:
test_row_count: true
asset_id: pabr-t5kh
primary_key:
- pin
- year
description: |
Most recent year universe of Cook County parcels with attached geographic, governmental, and spatial data.

Notes: Contains a cornucopia of locational and spatial data for all parcels in Cook County.

Use cases: Joining parcel-level data to this dataset allows analysis and reporting across a number of different political, tax, Census, and other boundaries.

- name: parcel_universe_historic
label: Parcel Universe (Historic)
type: dashboard
url: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Parcel-Universe/nj4t-kc8j
depends_on:
- ref('default.vw_pin_universe')
- ref('open_data.vw_parcel_universe_historic')
owner:
name: Data Department
meta:
@@ -105,19 +167,37 @@ exposures:

Use cases: Joining parcel-level data to this dataset allows analysis and reporting across a number of different political, tax, Census, and other boundaries.

- name: permits
label: Permits
type: dashboard
url: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Permits/6yjf-dfxs
depends_on:
- ref('open_data.vw_permit')
owner:
name: Data Department
meta:
test_row_count: true
asset_id: 6yjf-dfxs
primary_key:
- pin
- permit_number
description: |
Building permits organized by PIN, with extra metadata recorded by CCAO
permit specialists during the permit processing workflow.

- name: property_tax_exempt_parcels
label: Property Tax-Exempt Parcels
type: dashboard
url: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Property-Tax-Exempt-Parcels/vgzx-68gb
depends_on:
- ref('default.vw_pin_exempt')
- ref('open_data.vw_property_tax_exempt_parcel')
owner:
name: Data Department
meta:
test_row_count: true
asset_id: vgzx-68gb
primary_key:
- pin10
- pin
- year
description: |
Parcels with property tax-exempt status across all of Cook County per tax year, from Tax Year 2022 on, with geographic coordinates and addresses.
@@ -126,14 +206,15 @@ exposures:

Use cases: Can be used to study parcels that are exempted from paying property taxes.


- name: residential_condominium_unit_characteristics
label: Residential Condominium Unit Characteristics
type: dashboard
tags:
- type_condo
url: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Residential-Condominium-Unit-Characteri/3r7i-mrz4
depends_on:
- ref('default.vw_pin_condo_char')
- ref('open_data.vw_res_condo_unit_char')
owner:
name: Data Department
meta:
@@ -158,7 +239,7 @@ exposures:
- type_res
url: https://datacatalog.cookcountyil.gov/Property-Taxation/Assessor-Single-and-Multi-Family-Improvement-Chara/x54s-btds
depends_on:
- ref('default.vw_card_res_char')
- ref('open_data.vw_sf_mf_improvement_char')
owner:
name: Data Department
meta:
26 changes: 26 additions & 0 deletions dbt/models/open_data/open_data.vw_appeal.sql
@@ -0,0 +1,26 @@
-- Copy of default.vw_pin_appeal that feeds the "Appeals" open data asset.
SELECT
pin || year || case_no AS row_id,
pin,
class,
township_code,
year,
mailed_bldg,
mailed_land,
mailed_tot,
certified_bldg,
certified_land,
certified_tot,
case_no,
appeal_type,
change,
reason_code1,
reason_desc1,
reason_code2,
reason_desc2,
reason_code3,
reason_desc3,
agent_code,
agent_name,
status
FROM {{ ref('default.vw_pin_appeal') }}
49 changes: 49 additions & 0 deletions dbt/models/open_data/open_data.vw_assessed_value.sql
@@ -0,0 +1,49 @@
-- Copy of default.vw_pin_history that feeds the "Assessed Values" open data
-- asset.

/* The following columns are not included in the open data asset, or are
currently hidden:
mailed_class
certified_class
board_class
change_reason
oneyr_pri_mailed_bldg
oneyr_pri_mailed_land
oneyr_pri_mailed_tot
oneyr_pri_certified_bldg
oneyr_pri_certified_land
oneyr_pri_certified_tot
oneyr_pri_board_bldg
oneyr_pri_board_land
oneyr_pri_board_tot
oneyr_pri_change_reason
twoyr_pri_mailed_bldg
twoyr_pri_mailed_land
twoyr_pri_mailed_tot
twoyr_pri_certified_bldg
twoyr_pri_certified_land
twoyr_pri_certified_tot
twoyr_pri_board_bldg
twoyr_pri_board_land
twoyr_pri_board_tot
twoyr_pri_change_reason
*/

SELECT
CONCAT(pin, year) AS row_id,
pin,
year,
class,
township_code,
township_name,
nbhd,
mailed_bldg,
mailed_land,
mailed_tot,
certified_bldg,
certified_land,
certified_tot,
board_bldg,
board_land,
board_tot
FROM {{ ref('default.vw_pin_history') }}