Chapter 13
coderLMN edited this page Jun 15, 2018 · 2 revisions
Chapter 13 covers acquiring semi-structured data. The data source has moved from FTP to HTTP; the current URL is http://www.wcc.nrcs.usda.gov/ftpref/data/climate/table/temperature/history/california/
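The station location data is gone as well. A largely equivalent listing is available at https://wrcc.dri.edu/Monitoring/Stations/station_inventory_show.php?snet=snotel&sstate=CA, but its format differs from the original file. For example, the argument sep="|" is no longer correct, because the fields are separated by the tab character \t rather than by |. You will need to parse the rest yourself and check that the result is correct.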
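As a minimal sketch of the separator problem, the R snippet below reads the same two rows once with `sep = "|"` and once with `sep = "\t"`. It assumes the downloaded file really is tab-delimited, as described above; the inline `sample_rows` text is only a stand-in for that file. The `quote = ""` and `comment.char = ""` arguments are my own additions: with the `read.table` defaults, the apostrophe in HAGAN'S MEADOW would be treated as a quote and the `#` in RUBICON #2 as the start of a comment.

```r
# Two rows from the listing, written with tab separators (an assumption
# about the downloaded file; this inline text stands in for it).
sample_rows <- paste(
  "HGNC1\t19L03S\t04009\tHAGAN'S MEADOW\t3851\t11956\t8000\t101000\t781001\t861001",
  "RUBC1\t20L02S\t04019\tRUBICON #2\t3900\t12008\t7500\t101000\t801001\t900930",
  sep = "\n"
)

# Read the sample with a given separator; everything is kept as character
# so that leading zeros (e.g. STNUM 04009) survive.
read_rows <- function(sep) {
  read.table(text = sample_rows, sep = sep, header = FALSE,
             quote = "", comment.char = "",
             colClasses = "character")
}

wrong <- read_rows("|")   # no "|" in the data, so each line is one field
right <- read_rows("\t")  # one column per data item

ncol(wrong)
ncol(right)
right$V4
```

If `ncol()` reports a single column for your own file, the separator is the first thing to check.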
The fields you need are probably just these:
NRCSID Sitename Lat. Long. Elev.
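The rest can be dropped. If you want to use the header=F argument, keep only the data below the row of dashes and do not save anything else to the file; that makes parsing easier. You can also save it as a .csv file first, open it in Excel, delete the columns you don't need, and then read the remaining fields from R.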
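The trimming step can also be done in R instead of Excel. The sketch below is one way to do it, again assuming tab-separated fields. `raw_lines` is a hypothetical stand-in for what `readLines()` would return for the saved page, and the column positions 2, 4, 5, 6, 7 are taken from the header row of the listing pasted further down this page.

```r
# Hypothetical stand-in for readLines() on the saved page: title, header,
# dashes, data rows, legend. Tab-separated fields are an assumption.
raw_lines <- c(
  "Station Data Inventory Listings",
  "Hbk5\tNRCSID\tSTNUM\tSitename\tLat.\tLong.\tElev.\tSDPXNV\tStart\tEnd",
  "-----\t------\t-----\t--------------------\t----\t-----\t-----\t------\t------\t------",
  "ADMC1\t20H13S\t04001\tADIN MTN\t4115\t12046\t6200\t101000\t841001\t890930",
  "XXXC1\t19L06S\t04018\tPOISON FLAT\t3830\t11938\t7900\t101000\t801001\t870930 ?",
  "RUBC1\t20L02S\t04019\tRUBICON #2\t3900\t12008\t7500\t101000\t801001\t900930",
  "Hbk5 is National Weather Service Handbook 5 ID. Question mark (?) follows"
)

# Keep only what follows the row of dashes.
dash_row <- grep("^-{5}", raw_lines)[1]
below <- raw_lines[seq_len(length(raw_lines) - dash_row) + dash_row]

# Drop the trailing "?" uncertainty marker that some rows carry.
below <- sub("[[:space:]]*\\?$", "", below)

# Legend text does not split into ten tab-separated fields, so filter on that.
is_data <- lengths(strsplit(below, "\t", fixed = TRUE)) == 10

# Save the data rows only, then read them back with header = FALSE.
data_file <- tempfile(fileext = ".txt")
writeLines(below[is_data], data_file)
stations <- read.table(data_file, sep = "\t", header = FALSE,
                       quote = "", comment.char = "",
                       colClasses = "character")

# Columns 2, 4, 5, 6, 7 are NRCSID, Sitename, Lat., Long., Elev.
stations <- stations[, c(2, 4, 5, 6, 7)]
names(stations) <- c("NRCSID", "Sitename", "Lat", "Long", "Elev")
stations
```

Reading everything as character first keeps the raw values intact; convert Lat, Long and Elev with `as.numeric()` once you have checked them.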
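In addition, every value in the second column needs the prefix 'CA'. For example, the first entry 20H13S should become CA20H13S, so that it matches the format of the id field in the book's code.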
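Adding the prefix is a one-liner with `paste0`. The sketch below uses a small hand-made data frame; the column name `id` follows the book's code as described above, while the de-duplication step is my own assumption (the listing repeats a station once for every change in its record).

```r
# A few rows in the shape described above (NRCSID without the state prefix).
stations <- data.frame(
  NRCSID   = c("20H13S", "20H13S", "19L05S"),
  Sitename = c("ADIN MTN", "ADIN MTN", "BLUE LAKES"),
  stringsAsFactors = FALSE
)

# Prefix every NRCS ID with the state code.
stations$id <- paste0("CA", stations$NRCSID)

# The listing repeats a station for every change; keeping one row per id
# is an assumption about what the book's code expects.
stations_unique <- stations[!duplicated(stations$id), ]
stations_unique$id
```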
I am also pasting the content here, in case that page cannot be found in the future either:
Station Data Inventory Listings
Snotel Network: California
WRCC Snotel Inventory. Last updated 970307. Kelly Redmond.
Hbk5 NRCSID STNUM Sitename Lat. Long. Elev. SDPXNV Start End
----- ------ ----- -------------------- ---- ----- ----- ------ ------ ------
ADMC1 20H13S 04001 ADIN MTN 4115 12046 6200 101000 841001 890930
ADMC1 20H13S 04001 ADIN MTN 4115 12046 6200 101111 891001 999999
BLAC1 19L05S 04002 BLUE LAKES 3836 11955 8000 101000 801001 830930
BLAC1 19L05S 04002 BLUE LAKES 3836 11955 8000 101111 831001 999999
CDRC1 20H06S 04003 CEDAR PASS 4135 12018 7100 101000 781001 900930
CDRC1 20H06S 04003 CEDAR PASS 4135 12018 7100 101111 901001 999999
CSSC1 20K31S 04004 CSS LAB 3920 12022 6900 100000 811001 830930
CSSC1 20K31S 04004 CSS LAB 3920 12022 6900 101000 831001 860930
CSSC1 20K31S 04004 CSS LAB 3920 12022 6900 101110 861001 870714
CSSC1 20K31S 04004 CSS LAB 3920 12022 6900 101111 870715 999999
DMLC1 20H12S 04005 DISMAL SWAMP 4158 12010 7000 101000 801001 890930
DMLC1 20H12S 04005 DISMAL SWAMP 4158 12010 7000 101111 891001 999999
EFTC1 19L19S 04006 EBBETTS PASS 3833 11948 8700 101000 781001 870930
EFTC1 19L19S 04006 EBBETTS PASS 3833 11948 8700 101111 871001 999999
ECOC1 20L06S 04007 ECHO PEAK 3851 12004 7800 101000 801001 840930
ECOC1 20L06S 04007 ECHO PEAK 3851 12004 7800 101111 840930 999999
FLFC1 20L10S 04008 FALLEN LEAF 3856 12003 6300 101000 791001 900930
FLFC1 20L10S 04008 FALLEN LEAF 3856 12003 6300 101111 901001 999999
HGNC1 19L03S 04009 HAGAN'S MEADOW 3851 11956 8000 101000 781001 861001
HGNC1 19L03S 04009 HAGAN'S MEADOW 3851 11956 8000 101110 861002 870614
HGNC1 19L03S 04009 HAGAN'S MEADOW 3851 11956 8000 101111 870615 999999
HVNC1 19L24S 04010 HEAVENLY VALLEY 3856 11954 8850 101000 781001 900930
HVNC1 19L24S 04010 HEAVENLY VALLEY 3856 11954 8850 101111 901001 999999
ICPC1 20K04S 04011 INDEPENDENCE CAMP 3927 12017 7000 101000 781001 830930
ICPC1 20K04S 04011 INDEPENDENCE CAMP 3927 12017 7000 101111 831001 999999
ICKC1 20K03S 04012 INDEPENDENCE CREEK 3929 12017 6500 101000 801001 900930
ICKC1 20K03S 04012 INDEPENDENCE CREEK 3929 12017 6500 101111 901001 999999
ILKC1 20K05S 04013 INDEPENDENCE LAKE 3925 12019 8450 101000 781001 940930
ILKC1 20K05S 04013 INDEPENDENCE LAKE 3925 12019 8450 101111 941001 999999
LELC1 19L38S 04014 LEAVITT LAKE 3816 11937 9400 101111 891001 999999
LVTC1 19L08S 04015 LEAVITT MEADOWS 3820 11933 7200 101000 801001 890930
LVTC1 19L08S 04015 LEAVITT MEADOWS 3820 11933 7200 101111 891001 999999
LOBC1 19L17S 04016 LOBDELL LAKE 3826 11922 9200 101000 781001 890930
LOBC1 19L17S 04016 LOBDELL LAKE 3826 11922 9200 101111 891001 999999
MNPC1 19L40S 04017 MONITOR PASS 3835 11936 8350 101111 901001 999999
XXXC1 19L06S 04018 POISON FLAT 3830 11938 7900 101000 801001 870930 ?
XXXC1 19L06S 04018 POISON FLAT 3830 11938 7900 101111 881001 999999
RUBC1 20L02S 04019 RUBICON #2 3900 12008 7500 101000 801001 900930
RUBC1 20L02S 04019 RUBICON #2 3900 12008 7500 101111 901001 999999
SRAC1 19L07S 04020 SONORA PASS 3819 11936 8800 101000 781001 820930
SRAC1 19L07S 04020 SONORA PASS 3819 11936 8800 101111 821001 999999
SPCC1 19L39S 04021 SPRATT CREEK 3840 11949 6200 101000 801001 880930
SPCC1 19L39S 04021 SPRATT CREEK 3840 11949 6200 101110 881001 890418
SPCC1 19L39S 04021 SPRATT CREEK 3840 11949 6200 101111 890419 999999
SQWC1 20K30S 04022 SQUAW VALLEY G.C. 3911 12015 8200 101000 801001 900930
SQWC1 20K30S 04022 SQUAW VALLEY G.C. 3911 12015 8200 101111 901001 999999
THOC1 20K27S 04023 TAHOE CITY CROSS 3910 12009 6750 001000 801001 810930
THOC1 20K27S 04023 TAHOE CITY CROSS 3910 12009 6750 101000 811001 880930
THOC1 20K27S 04023 TAHOE CITY CROSS 3910 12009 6750 101110 881001 890417
THOC1 20K27S 04023 TAHOE CITY CROSS 3910 12009 6750 101111 890418 999999
TRUC1 20K13S 04024 TRUCKEE #2 3918 12012 6400 101000 801001 880930
TRUC1 20K13S 04024 TRUCKEE #2 3918 12012 6400 101110 881001 890417
TRUC1 20K13S 04024 TRUCKEE #2 3918 12012 6400 101111 890418 999999
VGAC1 19L13S 04025 VIRGINIA LAKES RIDGE 3805 11915 9200 101000 781001 820930
VGAC1 19L13S 04025 VIRGINIA LAKES RIDGE 3805 11915 9200 101111 821001 999999
WRDC1 20K25S 04026 WARD CREEK #3 3908 12014 6750 101000 781001 900930
WRDC1 20K25S 04026 WARD CREEK #3 3908 12014 6750 101111 901001 999999
????? 19L18S 04027 WET MEADOWS 3837 11952 8050 101000 801001 900731
Hbk5 is National Weather Service Handbook 5 ID. Question mark (?) follows
if guessed or inferred from circumstantial evidence
????? means NRCS ID exists, but Handbook 5 ID not found
NRCSID is NRCS ID
NWS Handbook 5 IDs may not have been assigned for now-deactivated stations;
NWS Handbook 5 IDs were taken from NWS Location Identifier software
Sitename is NRCS name; none appear to have changed during the lifetime
of the network
Lat is Deg Min N
Lon is Deg Min W
Elevation is in feet, from NRCS files
NWS Handbook 5 position/elevations often differ from NRCS values, NRCS used
SDPXNV is indicator of elements reported. 1-present, 0-absent
S-Snow Water Equivalent, D-Depth of snow, P-Precipitation
X-Maximum Temp, N-Minimum Temp, V-Average Daily Temp
Start is start date for this entry in format yymmdd
End is end date for this entry in format yymmdd
New entry for every change in IDs, names, positions, elements reported
? at very end indicates major uncertainty about station name or ID
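According to the legend above, Lat is degrees and minutes north and Lon is degrees and minutes west, packed into a single number (4115 is 41° 15' N, 12046 is 120° 46' W). If you need decimal degrees for plotting, a conversion along these lines should work. Whether the book's code expects decimal degrees is an assumption on my part, and `dm_to_decimal` is a hypothetical helper name.

```r
# Convert a packed degrees-minutes value (e.g. 4115 = 41 deg 15 min)
# to decimal degrees. Accepts numbers or character strings.
dm_to_decimal <- function(dm) {
  dm <- as.numeric(dm)
  degrees <- dm %/% 100   # everything above the last two digits
  minutes <- dm %% 100    # the last two digits
  degrees + minutes / 60
}

lat <- dm_to_decimal("4115")     # north, so positive
lon <- -dm_to_decimal("12046")   # west, so negative by the usual convention

lat
lon
```

For ADIN MTN this gives 41.25 for the latitude and roughly -120.767 for the longitude.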