It's hard to understand the organization of, and connections between, all the data we're trying to back up. Here are a few notes that might help make sense of what the data represents and how to prioritize what to back up:
- Data is built up in Levels: L0 data is whatever is considered raw off of instruments, L1 is when that has been ordered and calibrated into standard units, and L2 & up represent further calculations into different products.
- An L2 or L3 product will often depend on multiple other L1/L2 products from different sources as inputs. Producing this L2/L3 data might also require many days of computer processing time using non-public algorithms.
- There are some organized groups of data under various 'DAAC' sites but there is no unified organized system, which means data is spread out and often copied all over the place as needed. Each group owns their own products, but depends on other products too.
- It's really difficult to prioritize what data is most important. One could say X is important and Y isn't, but another research team might need Y.
- While L0 data is the most important in that it's the bottom level of the pyramid, it's often the least directly used because it takes a lot of human/computational work to get something calibrated and standardized.
So basically, all the data is connected in a complex web that's not easy to untangle.
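To make the "web" idea a bit more concrete, here is a rough Python sketch of how the dependencies between levels could be written down and walked. The product names and the `DEPENDS_ON` table are made up purely for illustration (they are not real datasets), and the real relationships are certainly messier than this.

```python
from collections import deque

# Hypothetical product names, invented for illustration only.
# Each product maps to the products it is computed from.
DEPENDS_ON = {
    "sensorA_L0": [],
    "sensorB_L0": [],
    "sensorA_L1": ["sensorA_L0"],
    "sensorB_L1": ["sensorB_L0"],
    "cloud_mask_L2": ["sensorA_L1"],
    "surface_temp_L2": ["sensorA_L1", "sensorB_L1"],
    "climate_grid_L3": ["surface_temp_L2", "cloud_mask_L2"],
}


def upstream(product, graph=DEPENDS_ON):
    """Return every product that `product` depends on, directly or indirectly."""
    seen = set()
    queue = deque(graph[product])
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(graph[current])
    return seen


def downstream_count(graph=DEPENDS_ON):
    """For each product, count how many other products rely on it."""
    counts = {name: 0 for name in graph}
    for name in graph:
        for dep in upstream(name, graph):
            counts[dep] += 1
    return counts


if __name__ == "__main__":
    print(sorted(upstream("climate_grid_L3")))
    print(downstream_count())
```

In this toy example the L0 products have the most things relying on them even though nobody would use them directly, which is the same tension described in the notes above. A count like this is only one possible way to rank things; it says nothing about how expensive a product is to regenerate or who needs it.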