Test rolling SF sale-price feature #95

Draft
wants to merge 18 commits into
base: 2025-assessment-year
from
1 change: 1 addition & 0 deletions R/setup.R
@@ -19,6 +19,7 @@ suppressPackageStartupMessages({
# Resolve package namespace conflicts, preferring the library::function pair
# shown over other functions with the same name from different libraries
conflicts_prefer(
data.table::`:=`,

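Review comment from the PR author: I intermittently got an error that `data.table` and `rlang` were conflicting (both packages export `:=`), so this line sets an explicit preference.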
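
For context, a minimal sketch of the kind of clash this preference resolves. It assumes the `conflicted`, `data.table`, and `rlang` packages are installed; the toy table `dt` and its columns are hypothetical and not part of this repo.

```r
library(conflicted)
library(data.table)
library(rlang)

# Both data.table and rlang export `:=`. With conflicted attached, an
# ambiguous name raises an error until a winner is declared, so declare
# one up front (the same call this PR adds to R/setup.R).
conflicts_prefer(data.table::`:=`)

# Hypothetical toy table: add a column by reference with data.table's `:=`
dt <- data.table(x = 1:3)
dt[, y := x * 2L]
```

Declaring the preference once during setup should keep later `dt[, col := value]` calls unambiguous, rather than relying on which package happened to be attached last.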

dplyr::filter,
dplyr::first,
dplyr::lag,
23 changes: 23 additions & 0 deletions README.md
@@ -114,10 +114,15 @@ ones used in the most recent assessment model.
| Total Condominium Building Non-Livable Parcels | char_building_non_units | Count of non-livable 14-digit PINs | Characteristic | numeric | X |
| Condominium Building Is Mixed Use | char_bldg_is_mixed_use | The 10-digit PIN (building) contains a 14-digit PIN that is neither class 299 nor 399 | Characteristic | logical | X |
| Total Condominium Building Square Footage | char_building_sf | Square footage of the *building* (PIN10) containing this unit | Characteristic | numeric | X |
| Building Square Footage | char_building_sf | Square footage of the *building* (PIN10) containing this unit | Characteristic | numeric | X |
| Condominium Unit Square Footage | char_unit_sf | Square footage of the condominium unit associated with this PIN | Characteristic | numeric | X |
| Unit Square Footage | char_unit_sf | Square footage of the condominium unit associated with this PIN | Characteristic | numeric | X |
| Condominium Unit Bedrooms | char_bedrooms | Number of bedrooms in the building | Characteristic | numeric | X |
| Bedrooms | char_bedrooms | Number of bedrooms in the building | Characteristic | numeric | X |
| Condominium Unit Half Baths | char_half_baths | Number of half baths | Characteristic | numeric | X |
| Half Baths | char_half_baths | Number of half baths | Characteristic | numeric | X |
| Condominium Unit Full Baths | char_full_baths | Number of full bathrooms | Characteristic | numeric | X |
| Full Baths | char_full_baths | Number of full bathrooms | Characteristic | numeric | X |
| Condominium % Ownership | meta_tieback_proration_rate | Proration rate applied to the PIN | Meta | numeric | X |
| Condominium Building Strata 1 | meta_strata_1 | Condominium Building Strata - 10 Levels | Meta | character | X |
| Condominium Building Strata 2 | meta_strata_2 | Condominium Building Strata - 100 Levels | Meta | character | X |
@@ -135,6 +140,24 @@ ones used in the most recent assessment model.
| Average Daily Traffic Count on Nearest Collector Road | prox_nearest_road_collector_daily_traffic | Daily traffic of nearest collector road | Proximity | numeric | X |
| Nearest New Construction (Feet) | prox_nearest_new_construction_dist_ft | Nearest new construction distance (feet) | Proximity | numeric | X |
| Nearest Major Stadium (Feet) | prox_nearest_stadium_dist_ft | Nearest stadium distance (feet) | Proximity | numeric | X |
| NA | time_sale_roll_mean_nbhd_sf_t0_w1 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_sf_t0_w2 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_sf_t0_w3 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_sf_t1_w1 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_sf_t1_w2 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_sf_t1_w3 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_sf_t2_w1 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_sf_t2_w2 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_sf_t2_w3 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_condo_t0_w1 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_condo_t0_w2 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_condo_t0_w3 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_condo_t1_w1 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_condo_t1_w2 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_condo_t1_w3 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_condo_t2_w1 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_condo_t2_w2 | | NA | NA | X |
| NA | time_sale_roll_mean_nbhd_condo_t2_w3 | | NA | NA | X |
| Percent Population Age, Under 19 Years Old | acs5_percent_age_children | Percent of the people 17 years or younger | ACS5 | numeric | |
| Percent Population Age, Over 65 Years Old | acs5_percent_age_senior | Percent of the people 65 years or older | ACS5 | numeric | |
| Median Population Age | acs5_median_age_total | Median age for whole population | ACS5 | numeric | |
23 changes: 23 additions & 0 deletions docs/data-dict.csv
@@ -4,10 +4,15 @@ Total Condominium Building Livable Parcels,char_building_units,Count of livable
Total Condominium Building Non-Livable Parcels,char_building_non_units,Count of non-livable 14-digit PINs,Characteristic,numeric,TRUE
Condominium Building Is Mixed Use,char_bldg_is_mixed_use,The 10-digit PIN (building) contains a 14-digit PIN that is neither class 299 nor 399,Characteristic,logical,TRUE
Total Condominium Building Square Footage,char_building_sf,Square footage of the _building_ (PIN10) containing this unit,Characteristic,numeric,TRUE
Building Square Footage,char_building_sf,Square footage of the _building_ (PIN10) containing this unit,Characteristic,numeric,TRUE
Condominium Unit Square Footage,char_unit_sf,Square footage of the condominium unit associated with this PIN,Characteristic,numeric,TRUE
Unit Square Footage,char_unit_sf,Square footage of the condominium unit associated with this PIN,Characteristic,numeric,TRUE
Condominium Unit Bedrooms,char_bedrooms,Number of bedrooms in the building,Characteristic,numeric,TRUE
Bedrooms,char_bedrooms,Number of bedrooms in the building,Characteristic,numeric,TRUE
Condominium Unit Half Baths,char_half_baths,Number of half baths,Characteristic,numeric,TRUE
Half Baths,char_half_baths,Number of half baths,Characteristic,numeric,TRUE
Condominium Unit Full Baths,char_full_baths,Number of full bathrooms,Characteristic,numeric,TRUE
Full Baths,char_full_baths,Number of full bathrooms,Characteristic,numeric,TRUE
Condominium % Ownership,meta_tieback_proration_rate,Proration rate applied to the PIN,Meta,numeric,TRUE
Condominium Building Strata 1,meta_strata_1,Condominium Building Strata - 10 Levels,Meta,character,TRUE
Condominium Building Strata 2,meta_strata_2,Condominium Building Strata - 100 Levels,Meta,character,TRUE
@@ -25,6 +30,24 @@ Average Daily Traffic Count on Nearest Arterial Road,prox_nearest_road_arterial_
Average Daily Traffic Count on Nearest Collector Road,prox_nearest_road_collector_daily_traffic,Daily traffic of nearest collector road,Proximity,numeric,TRUE
Nearest New Construction (Feet),prox_nearest_new_construction_dist_ft,Nearest new construction distance (feet),Proximity,numeric,TRUE
Nearest Major Stadium (Feet),prox_nearest_stadium_dist_ft,Nearest stadium distance (feet),Proximity,numeric,TRUE
NA,time_sale_roll_mean_nbhd_sf_t0_w1,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_sf_t0_w2,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_sf_t0_w3,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_sf_t1_w1,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_sf_t1_w2,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_sf_t1_w3,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_sf_t2_w1,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_sf_t2_w2,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_sf_t2_w3,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_condo_t0_w1,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_condo_t0_w2,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_condo_t0_w3,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_condo_t1_w1,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_condo_t1_w2,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_condo_t1_w3,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_condo_t2_w1,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_condo_t2_w2,,NA,NA,TRUE
NA,time_sale_roll_mean_nbhd_condo_t2_w3,,NA,NA,TRUE
"Percent Population Age, Under 19 Years Old",acs5_percent_age_children,Percent of the people 17 years or younger,ACS5,numeric,FALSE
"Percent Population Age, Over 65 Years Old",acs5_percent_age_senior,Percent of the people 65 years or older,ACS5,numeric,FALSE
Median Population Age,acs5_median_age_total,Median age for whole population,ACS5,numeric,FALSE
16 changes: 8 additions & 8 deletions dvc.lock
@@ -5,8 +5,8 @@ stages:
deps:
- path: pipeline/00-ingest.R
hash: md5
-md5: f758cc2d2c8dbe928806ffb0a46ab821
-size: 24134
+md5: b91e2ec22113406aae490edb36fbb7dd
+size: 33642
params:
params.yaml:
assessment:
@@ -31,12 +31,12 @@ stages:
outs:
- path: input/assessment_data.parquet
hash: md5
-md5: b1462cc55efa7d8beb5ec2af9a649a9b
-size: 76103136
+md5: a8e25e4fe7e62b85b63d5175d82603e8
+size: 79249396
- path: input/char_data.parquet
hash: md5
-md5: 09a842b0910fa84c9fa7834593ee488c
-size: 149301395
+md5: c413791724db9725659ef536c429602f
+size: 149205649
- path: input/condo_strata_data.parquet
hash: md5
md5: ded3ecde590af57e6b98a8935fae0215
@@ -47,8 +47,8 @@ stages:
size: 6019
- path: input/training_data.parquet
hash: md5
-md5: ef87ceb9be93d8ae85118991ab5269f2
-size: 76713007
+md5: 17ff61dc7cd2ecd272d46f9777b80080
+size: 100488708
train:
cmd: Rscript pipeline/01-train.R
deps: