FALL 2024 & SPRING 2025: Project Organization #14

Open
andrewfagerheim opened this issue Sep 23, 2024 · 17 comments

This issue contains meeting notes, action items, and updates starting September 2024. For similar issues from previous semesters see:

andrewfagerheim commented Sep 23, 2024

23 Sept 2024: Meeting w @dhruvbalwada

Notes: parallelization

  • Use dask (that's built into xarray) to parallelize the interpolation process. If this automatic integration doesn't work, then we might need to use the dask library directly. Hopefully that won't be necessary; it would warrant further discussion.
  • Choose the profile dimension for chunking; this is the parallelizable dimension because our analysis treats each profile separately.
  • Ensure that multiple cores are being used by checking the dask dashboard. Start with simple test cases from the xarray documentation to make sure the functionality is working as expected. Also check that the LEAP hub and gyre both give the same results, to make sure results are not machine-dependent. (A minimal chunking sketch follows below.)
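
As a reference for the dask setup, here is a minimal sketch of the chunking and dashboard check. The file name `argo_section.nc`, the profile dimension `N_PROF`, the chunk size, and the `TEMP` variable are placeholders, not the project's actual names:

```python
import xarray as xr
from dask.distributed import Client

# Local cluster; the printed dashboard link shows whether multiple cores are busy.
client = Client()
print(client.dashboard_link)

# Open lazily and chunk along the profile dimension (name is an assumption);
# each chunk of profiles can then be processed by a separate dask worker.
ds = xr.open_dataset("argo_section.nc", chunks={"N_PROF": 1000})

# Any xarray operation now builds a lazy dask graph; .compute() runs it in parallel.
anom = (ds["TEMP"] - ds["TEMP"].mean("N_PROF")).compute()
```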

Notes: vertical coordinate

  • A broader note on using density as the vertical coordinate; there are two reasons to do this.
    • The first is mechanistic---to remove the influence of internal waves stretching and squeezing isopycnals but not stirring the tracers.
    • The second is analytic---to more readily visualize if bands of high variance are "sliding" along isopycnals into the interior in regions of deepwater formation, which would connect these variance results with ventilation.
  • The process for this coordinate transformation is as follows (a per-profile sketch of the first two steps is given after this list):
    • C(z,x,t), ρ(z,x,t) --> C(ρ,x,t)
    • C(ρ,x,t) --> C(z_ρ,x,t), where z_ρ is the mean depth of each isopycnal
    • Then either C(z_ρ,x,t) is used directly to calculate variance,
    • or C(z_ρ,x,t) --> stretching of N² is considered, and that metric is then used to calculate variance.
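
A per-profile sketch of the first two steps in this transformation, using plain numpy; the function and variable names and the density levels are placeholders, not the project's actual code:

```python
import numpy as np

def profile_to_density_coord(tracer, rho, rho_levels):
    """Step 1, C(z,x,t) -> C(rho,x,t): interpolate one profile onto fixed density surfaces."""
    good = np.isfinite(tracer) & np.isfinite(rho)
    order = np.argsort(rho[good])                  # np.interp needs monotonically increasing x
    return np.interp(rho_levels, rho[good][order], tracer[good][order],
                     left=np.nan, right=np.nan)

# Step 2, C(rho,x,t) -> C(z_rho,x,t): relabel each density surface by its mean depth
# across the section, so the values are unchanged but the vertical axis becomes
# "average depth of that isopycnal"; variance along z_rho then excludes isopycnal heave.
# depth_of_rho[j, i] = depth at which profile j crosses rho_levels[i]
# z_rho = np.nanmean(depth_of_rho, axis=0)
```

Looping this per-profile function over many profiles is exactly where the dask parallelization above would come in.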

Next steps:

  • reread Ferrari 2005 and take notes here, see Dhruv's notes here
  • read SWOT proposal and take notes here
  • test dask parallelization with example from xarray documentation
  • test dask parallelization with interpolation function

andrewfagerheim commented Oct 14, 2024

14 Oct 2024: Meeting w @dhruvbalwada

Notes:

  • Process is likely crashing because of a memory error (especially because the dataset I'm creating is pretty gigantic). The first thing to test is simply updating the packages. Dhruv mentioned that dask memory management issues have been addressed recently, so hopefully updating the packages fixes this too.
  • apply_ufunc: the function should be written so that you could provide a plain np array (no coordinate information, etc. that xr provides) and it would run completely fine. apply_ufunc then adapts the behavior of this function on an np array to provide the same behavior with an xr dataset, including using dask if you tell it to. (A minimal sketch follows below.)
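
A minimal apply_ufunc sketch along these lines. The running-mean anomaly function is a stand-in (not the project's actual get_anom), and the dimension names and chunk size are assumptions:

```python
import numpy as np
import xarray as xr

def vertical_anomaly(profile, window=21):
    """Pure-numpy: anomaly from a running-mean smooth of one 1-D profile.
    (NaNs would need explicit handling in practice.)"""
    kernel = np.ones(window) / window
    smooth = np.convolve(profile, kernel, mode="same")
    return profile - smooth

anom = xr.apply_ufunc(
    vertical_anomaly,
    ds["CT"].chunk({"N_PROF": 500}),           # dask-backed input
    input_core_dims=[["PRES_INTERPOLATED"]],   # the dimension the function consumes
    output_core_dims=[["PRES_INTERPOLATED"]],  # ...and returns
    vectorize=True,                            # loop the 1-D function over every profile
    dask="parallelized",                       # run chunks of profiles on separate workers
    output_dtypes=[float],
)
result = anom.compute()
```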

Unresolved questions:

  • Why does dask not run in parallel for an xarray function that takes a long time (minutes) to run on one CPU?
  • Why does the kernel crash when a large dataset is used? (Shouldn't dask be able to resolve memory load?)
  • What are the barriers to running apply_ufunc?

Next steps:

  • update versions of xarray and dask, see if this resolves memory crashing issue
  • compare how long it takes to run get_anom function on an Argo section with and without dask
  • (assuming dask is quicker) use dask to perform a coordinate transformation

andrewfagerheim commented Oct 22, 2024

22 Oct 2024: Meeting w @dhruvbalwada

Notes:

  • Ocean Mixing Ch3: lots of connections with Ferrari & Polzin which is great. Start with a general overview of the chapter and all the sections, so people know where to go if they'd like more info about everything. Then can go through a more detailed description of different sections:
    • Mixing conversation: walk through of tracer equation, double and triple decompositions
    • Non-dissipative theories: quick orientation to theories (where they occur, their assumptions)
    • Mixing and circulation: hand-wavy description of three mechanisms mentioned
    • Mixing and circulation locations: make a chart summarizing isotropic mixing vs mesoscale stirring at different depths
  • What is pseudo-depth? It seems like it may be discussed in Gouretski & Koltermann 2004? We're wondering if it could be 1) roughly analogous to mean isopycnal depth across basins, or 2) accounting for irregularities in the number and value of density measurements across many depths. If it's the former, this would be really relevant to our Argo sections; the latter would be a lot less interesting.

Next steps:

  • work on dask steps listed above
  • read Castro et al. 2024 and take good notes
  • prepare for discussion on how to apply Castro et al. and Ferrari & Polzin to Argo data
  • look into "pseudo-depth"

andrewfagerheim commented Nov 1, 2024

31 Oct 2024: Meeting w @dhruvbalwada

Notes:

  • Castro 2024: uses the triple decomposition on profiles of temperature and shear from the North Atlantic to create estimates for diapycnal and isopycnal mixing. Connects the rates of mixing to large-scale ocean circulation, in particular to AMOC.
  • This is a great study and there's a lot of detail that I should dive into further:
    • is $\overline{\theta}$ the same as $\theta^m$?
    • how is $\overline{\theta}$ computed in this study?
    • it feels like some of the plots are using a double decomposition and some are using a triple decomposition; understand better what is being used for each part of the analysis
  • But even more broadly speaking, there is a lot of work here that I think can be replicated using Argo. For example:
    • Fig 3: these are plots of $\overline{\theta}$ and $\theta^m$, which seems like what I've done already
    • Fig 6: top panel is stratification which can definitely be calculated using argo; bottom panel is the contribution of the mean state to total variance which can also be calculated
    • Fig 8-10: dissipation (black) can't be calculated by spectra like they do because you'd need microstructure measurements, but we could use the Osborn and Cox (1972) formula: $\chi_{\theta} = P_{\theta^2} \approx 2K_{\rho} (\frac{\partial \overline{\theta}}{\partial z})^2$ (eqn 3); diapycnal production (blue) I think can be calculated from argo using the formula: $P_{\theta^2}^{\perp} = 2K_{\rho} (\frac{\partial \theta^m}{\partial z})^2$ (eqn 8). (A rough calculation sketch is given after this list.)
  • I feel like it's worth thinking more about the role of filtering in my analysis. Here they defined the mean state using a 4th-degree polynomial fit, but spatial averaging would work too. I'm wondering whether spatial averaging alone would be adequate to define the mean vs anomaly for our analysis? This also depends on whether we're taking a double or triple decomposition approach.
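
For reference, a rough sketch of how eqns 3 and 8 might be evaluated from Argo-derived profiles. It assumes `theta_bar` (the large-scale mean profile) and `theta_m` (the profile entering the triple decomposition) are already computed as DataArrays on a pressure dimension named `PRES_INTERPOLATED`, and it takes a constant K_rho purely for illustration:

```python
K_RHO = 1e-5   # m^2 s^-1; constant diapycnal diffusivity, an assumption for illustration

# Treating dbar ~ m for a first pass at the vertical gradients.
chi_theta = 2 * K_RHO * theta_bar.differentiate("PRES_INTERPOLATED") ** 2  # eqn 3 (dissipation)
prod_perp = 2 * K_RHO * theta_m.differentiate("PRES_INTERPOLATED") ** 2    # eqn 8 (diapycnal production)
```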

Next steps:

  • write up notes for Castro 2024
  • plan time for each day next week using GCal
  • continue tackling dask checklist
  • download and run latest version of argopy

Next 1-2 weeks:

  • run vertical coordinate interpolation function in parallel with dask
  • have one netcdf/zarr file with all argo data, processed

Next 3-4 weeks:

  • make variance section plots with 3 vertical coordinates (depth, density, average density depth)
  • make plots like Fig 6 in Castro

Next 4-8 weeks:

  • make plots like Fig 8-10 in Castro

6 Nov 2024: Mental Organization

Notes:

  • I think I've successfully updated argopy to version 1.0.0 which is great and should allow parallel loading with dask (will need to test this). However a few things to note:
    • I couldn't do this by running `conda update --all` alone; that would keep the version at 0.1.15, I believe. I had to run `conda install conda-forge::argopy` to get version 1.0.0.
    • Additionally, this new version of argopy seems to not allow the newest version of xarray. So currently I'm running xarray version 2023.6.0, but I'm wondering if this doesn't have the more recent updates to dask integration. Maybe it would be worth figuring out how to run 2023.6.0 < xarray < 2024.3.0 with argopy 1.0.0.
  • As for the dask tests, I can't work on that yet because the gyre dashboard appears to be down. I emailed IT, so hopefully I will hear back soon on this.

andrewfagerheim commented Nov 7, 2024

7 Nov 2024: Meeting w @dhruvbalwada

Notes:

  • things to try on the argopy front:
    • run latest version of argopy with newest version possible of dask
    • it would be great to have one netcdf file with all data globally, HOWEVER if I end up spending too much more time on this it's not going to be worth it
    • backup would be to load regions bit by bit without using dask, then concatenate them together into one dataset
  • on the dask front:
    • main priority is to be able to use apply_ufunc to run my own functions in parallel using dask
    • this will be useful for lots of things: vertical coordinate transformations, filtering, polynomial fit, etc
  • on the readings front:
    • there are a lot of directions we could go with this work and I feel like I'm getting a little lost in the options. goal for today is to create a chart of each "offshoot," which papers they're most closely based on, and what the next steps and eventual outcomes would be
    • discuss with Dhruv and Andreas

Immediate next steps:

  • make project summary table
  • (if possible) update env file and load global box (still getting error)
  • (if necessary) concatenate regions together

Slightly longer term next steps:

  • compare polynomial fit to filtering to spatial averaging --> pick Castro region, compare to Fig 3 (compare profiles and vertical gradient) [tracer gradient decomposition] --> temp & salinity (see the comparison sketch after this list)
  • learn apply_ufunc [any depth operation on many profiles] --> this is where spice would become more relevant
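
As referenced above, a sketch of the mean-state comparison for a single cast. The 4th-degree polynomial and ~100 m filter scale follow the discussion but are assumptions to tune, and `uniform_filter1d` is just a stand-in low-pass filter:

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

def candidate_means(z, theta, poly_deg=4, filter_scale_m=100.0, dz=2.0):
    """Two candidate definitions of the 'mean' profile for one cast.

    Returns (polynomial fit, running-mean filter); compare their vertical
    gradients and the resulting anomalies, as in Castro et al. Fig. 3.
    """
    coeffs = np.polyfit(z, theta, deg=poly_deg)
    theta_poly = np.polyval(coeffs, z)

    window = max(int(filter_scale_m / dz), 1)      # grid points per filter scale
    theta_filt = uniform_filter1d(theta, size=window, mode="nearest")
    return theta_poly, theta_filt
```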

andrewfagerheim commented Nov 11, 2024

11 Nov 2024: Meeting w @dhruvbalwada

Notes:

  • Notes on the project table:
    • Add vertical coord. stuff to the multi-scale column
    • Maybe flag seasonality as a potential thing to explore somewhere on the table
    • Refine information in the tracer decomp & dissipation column using notes I jotted at the end
    • Big picture idea here is to deeply understand how to calculate all 4 arrows in triple decomposition diagram using argo. Create a list of what I would still need to perform all of these calculations
  • Notes on smoothing notebook:
    • Initially, it seems like filtering is definitely winning, especially toward the boundaries (as visible with the first derivative in particular). It would be good to continue testing this however---look at lots of profiles with different shapes, levels of variance, etc. Maybe using the synthetic profiles notebook would be particularly helpful for this.
    • Try over the same depth region as the paper
    • Compare lots of different orders and filter scales
    • For spatial averaging: use Roemmich & Gilson's climatology instead of creating a box around each profile

Next steps:

  • make edits to project table
  • run through checklist above for smoothing tests
  • concatenate global argo dataset
  • write up how to calculate all four pathways/terms of the triple decomposition

2 Dec 2024: Meeting w @dhruvbalwada

Notes:

  • I showed a first pass attempt at the mesoscale vs microscale pathways for the Atlantic basin. There's a lot of little updates to work on (notes at the bottom of this notebook), but I'm listing a few main things here:
    • check units! Ferrari & Castro don't have temperature in the units for some reason, but more importantly my units are really far off (I think this is mostly because I'm plotting sections as the log; the average profiles have a magnitude that makes much more sense, between 10**-8 and 10**-11)
    • term 2 should be filtered again to only take the large-scale features from the filter (this will make the patterns look more smooth like the term 3 plot)
    • look at individual profiles to see what's happening with "inflection points" in both temp and salinity
    • create similar plots to Castro and Ferrari & Polzin papers, maybe selecting just a few profiles in each of those regions to look at
    • compute dissipation term and compare this to the sum of mesoscale and microscale terms. Are there substantial differences?
    • create ratios of mesoscale to total, microscale to total; this may make it easier to see what terms dominate in certain regions?
  • Try to do as much of this stuff as reasonable before individual meeting on Thurs

Longer-Term Planning:

  • Thinking about what to focus on over winter break:
    • Load Argo data globally and calculate mesoscale and microscale pathways
    • Write updated methods section
    • Get global estimates of K_rho

andrewfagerheim commented Dec 6, 2024

5 Dec 2024: Meeting w @dhruvbalwada

Much to write about!

Notes: Atlantic variance pathways sections

  • For mesoscale plots: adding the filtering step did smooth profiles vertically as expected, so the sections look closer to the microscale ones. The mesoscale sections still have less lateral coherence than the microscale sections, though, which makes sense: eddies are happening sporadically at different locations, so general regions might have higher variance but individual profiles next to each other wouldn't be expected to match. With the microscale, however, this is just a projection of the mean state, so it makes sense that this would be smoother/more coherent laterally as well.
  • There are some regions that still look a little rough---wondering if this is just because we're taking the variance of a gradient? Or is the filter scale not quite large enough, especially at high latitudes?
  • Ratio plots: these are great and it's easier to start to make statements about which pathway dominates. Here are some patterns:
    • Most places are dominated by, and variance is almost exclusively produced by, microscale turbulence.
    • In places with low total variance, mesoscales usually dominate. (This suggests the profile is already very smooth, so the only thing left is small perturbations.)
    • In some places with low total variance, microscales dominate. (For the mesoscale term to be large, it requires a strong lateral gradient.)
  • There are also some features in the plots we'd like to investigate further:
    • CT & SA variance dominated by mesoscales at all depths around 40S-65S, this is the ACC region.
    • "Blob" of low CT variance dominated by mesoscales between 20S-0 and about 1250m, what is this?
    • Patches of high CT but especially SA variance around 40N and 1250m, this is probably Mediterranean outflow? (Also interesting that warm temperature anomaly dips that deep in the N Atlantic compared to the S Atlantic.)
    • Patch of high SA mesoscale variance at 20N and 250m. No idea what this could be. In general, actually SA looks "stranger" to me. There's also high SA mesoscale variance from 20S-20N and 1750m-1250m that I don't understand (maybe just because total variance is pretty low?).

Notes: things to look at next

  • Writing: methods should be well on their way at this point so write this section. Also start making bullet points about findings or features we should be looking at more closely.
  • Sections: all of the things from the comment above I didn't get to this week.
  • 2D histogram: for each of CT and SA plot 1) total variance on one axis and mesoscale variance on the other, 2) total variance on one axis and mesoscale ratio on the other.
  • T-S, volume: use this tutorial to create a T-S plot where the color is associated with the volume of water in each bin
  • T-S, variance: same as above, but where the color is associated with the variance of the water in each bin (a rough histogram sketch for both is given after this list)
  • Calculating [1]: should be able to calculate this term in two ways: 1) using estimates of Ke and lateral tracer gradients, 2) using lateral gradients and [2] to compute [1]. Do we get similar results? Also, what about calculating our own value for Ke? Do we get similar results to the stock values?
  • Then expand all of these plots to different sections. First look at previously loaded sections for posters and also the NATRE region, then try one between W Africa and our Atlantic section (to try to track this high CT mesoscale region).
  • Also still need to concatenate argo dataset globally---I keep running into memory errors that sent me down a rabbit hole I never resolved. So I feel like breaking things down into smaller boxes should be the next attempt
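
A rough sketch of the two T-S heatmaps listed above, with "volume" approximated by sample counts per bin. The variable names, bin edges, and the pointwise variance field `micro_var` are placeholders:

```python
import numpy as np
import matplotlib.pyplot as plt

sa = ds["SA"].values.ravel()
ct = ds["CT"].values.ravel()
var = micro_var.values.ravel()                 # pointwise variance field (placeholder)
good = np.isfinite(sa) & np.isfinite(ct) & np.isfinite(var)

bins = [np.linspace(33, 37, 80), np.linspace(-2, 25, 80)]        # SA, CT edges
counts, xe, ye = np.histogram2d(sa[good], ct[good], bins=bins)   # count ~ "volume" per bin
var_sum, _, _ = np.histogram2d(sa[good], ct[good], bins=bins, weights=var[good])
mean_var = np.where(counts > 0, var_sum / counts, np.nan)        # mean variance per T-S bin

fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)
axes[0].pcolormesh(xe, ye, np.log10(counts.T + 1), cmap="viridis")
axes[1].pcolormesh(xe, ye, np.log10(mean_var.T), cmap="magma")
axes[0].set(xlabel="SA", ylabel="CT", title="log10(count) per T-S bin")
axes[1].set(xlabel="SA", title="log10(mean variance) per T-S bin")
```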

Resources:

Outstanding questions:

  • How can we characterize the different "regimes" of variance? (high variance + high mesoscale; high variance + high microscale; low variance + high mesoscale; low variance + high microscale)
  • What is going on with the patches of high variance in unexpected places? (CT mesoscale variance deep near equator)
  • continue list

Next steps (over winter break):

  • create 2D histograms of total variance vs pathway/ratio variance
  • create T-S diagram heatmap of volume of water
  • create T-S diagram heatmap of variance
  • calculate term [1] using two methods
  • write solid draft of methods section
  • convert CT and SA into same units

andrewfagerheim commented Dec 12, 2024

6 Dec 2024: Meeting w @dhruvbalwada

Notes:

  • Results for NATRE region look generally comparable to Ferrari & Polzin, at least above 1km. However, below this, the mesoscale pathway really drops off---is this really an observed pattern? Or is this the result of a change in sampling rates?
  • So the next step on this front is to dig into float sampling rates. I think the best thing here is to somehow make a new variable or mask that is saved with the dataset as a whole. I think it's best to preserve all the data we can load (instead of weeding out floats with a low sampling rate for example), but pass along the information required to distinguish between floats with different resolutions later on.
  • It will take a bit of thinking to translate a sampling rate variable (which would have the frequency of the old pressure grid) into a variable on the interpolated pressure grid. (A rough sketch follows below.)
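
A rough sketch of how a `sample_rate` variable could be computed on the raw grid and carried onto the interpolated dataset. The dimension and variable names (`PRES`, `N_LEVELS`, `N_PROF`) and the 10 dbar threshold are assumptions:

```python
import numpy as np
import xarray as xr

def median_spacing(pres_raw):
    """Median vertical spacing (dbar) of one raw profile."""
    p = pres_raw[np.isfinite(pres_raw)]
    return np.nanmedian(np.diff(np.sort(p))) if p.size > 1 else np.nan

# ds_raw: profiles on the original pressure grid; ds_interp: the interpolated dataset,
# assumed to contain the same profiles in the same order.
sample_rate = xr.apply_ufunc(
    median_spacing, ds_raw["PRES"],
    input_core_dims=[["N_LEVELS"]],
    vectorize=True,
)

# Attach as a per-profile coordinate so no data is thrown away at this stage,
# but later analysis can still select high-resolution floats.
ds_interp = ds_interp.assign_coords(sample_rate=("N_PROF", sample_rate.values))
high_res = ds_interp.isel(N_PROF=(ds_interp.sample_rate < 10).values)
```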

Next steps:

  • include sample_rate as a variable
  • replot average variance profiles in the NATRE region using high resolution profiles

andrewfagerheim commented Jan 2, 2025

16 Dec 2024: Meeting w @dhruvbalwada

Notes:

  • Generally it seems like similar trends emerge when all data are included and when only high-res data are included. This is promising and suggests we will hopefully be okay to only include high-res data
  • Uncertainty estimates: it would be good to have uncertainty estimates, especially for average profile plots. One way to accomplish this is "bootstrapping": the idea is to resample the data many different times in a random way, recalculate the desired metric for each resample, and finally set the uncertainty estimate based on the spread of these samples. (See the sketch after this list.)
  • The microscale SA plot has 2 minima: likely because of changes in the salinity gradients. This could be because Mediterranean outflow water is flowing out into the Atlantic and different amounts of mixing causes the water to stop at different depths.
  • T-S diagrams: idea here is that a large amount of spread along an isopycnal suggests a lot of mixing is occurring between fresh background water and salty Mediterranean Outflow. You can find more info on this in Ferrari & Polzin. Also, try making a T-S plot where the color is based on latitude. Speer & Forget has a lot of info on identifying water masses using this type of plot.
  • 2D histograms: need to plot all points in all profiles, not just for the average profiles. This should hopefully create plots that are more useful and show the distribution of the data better.
  • Atlantic sections: need to add isopycnals to these section plots so it's possible to track water masses between plots. Also, try averaging profiles in ~5deg bins.
  • For the paper: the results section should have a few section plots to familiarize the reader with what metrics we are using, what large/small values mean, etc. Then move to a more statistical approach about the pathway distributions across entire basins, the globe, etc for different depths for example.
  • So next, it's important to begin moving to the basin and global scales, which requires having more argo data preprocessed. If the problem is with jupyter notebook, you can use python scripts to run code instead. Also, it's probably fine for now to throw out profiles that don't have high enough sampling.
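
A minimal bootstrap sketch for the uncertainty estimate mentioned above, assuming the profiles of interest are stacked into a numpy array of shape (n_prof, n_depth); the function name and defaults are placeholders:

```python
import numpy as np

def bootstrap_mean_profile(profiles, n_boot=1000, ci=95, seed=None):
    """Bootstrap confidence interval for a mean profile.

    Rows (profiles) are resampled with replacement; the spread of the resampled
    means gives the uncertainty band for the average profile plot.
    """
    rng = np.random.default_rng(seed)
    n_prof, n_depth = profiles.shape
    boot_means = np.empty((n_boot, n_depth))
    for i in range(n_boot):
        idx = rng.integers(0, n_prof, size=n_prof)   # resample with replacement
        boot_means[i] = np.nanmean(profiles[idx], axis=0)
    half = (100 - ci) / 2
    lower, upper = np.nanpercentile(boot_means, [half, 100 - half], axis=0)
    return np.nanmean(profiles, axis=0), lower, upper
```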

Next steps:

  • items from above updates
  • sections: add isopycnals, average in ~5deg bins
  • 2D histograms: remake with all points in all profiles
  • T-S diagrams: make plot where color is based on latitude
  • create a workflow for preprocessing large swaths of argo data

andrewfagerheim commented Jan 8, 2025

8 Jan 2025: Meeting w @dhruvbalwada

Notes:

  • Sections:
    • The 2D histograms are helpful, in particular the total variance-mesoscale ratio plots. Different regimes show up fairly quickly, and look different for each section. The log-log plots are less helpful, they're just roughly a 1:1 line.
    • The T-S diagrams may be interesting, but there's a lot of information to take in and it's hard to tell what to pay attention to across a whole basin. The NATRE box might be more interesting, especially because the Ferrari & Polzin paper directs us to a specific section to focus on.
    • Related to this: Dhruv mentioned some papers do more complex things with T-S diagrams to study water mass transformations. Mixing is usually assumed to have a minimal impact, but maybe this indicates that should be reconsidered?
    • Sections: To address the partial data issue, it would probably be good to make sections where high-res data is binned into 2deg/5deg bins and averaged.
    • Pacific notebook: High mesoscale regions seem highly localized to the poles, probably because the Pacific is older than the Atlantic (and therefore assumed to be more well-mixed). The one anomaly is near the equator in the Pacific.
  • Data loading:
    • The system I have going now seems to work really well to load and process large regions in a reasonable amount of time. However, this PSAL error is cropping up again and this is preventing further progress on basin-scale results. So this needs to be resolved.
    • There are two things to try here: 1) download a new sync copy of Argo data, 2) look into the argopy functions themselves to preempt the xarray error. In that case, I would need to use a try/except structure to only load profiles that have a necessary set of variables (sketched after this list).
    • If the first step works, then this should be a simple fix. If it doesn't, the second is probably going to be a lot more complex. And it might be necessary to submit another issue to argopy, which would require some specific documentation (like a list of problem profiles for example).
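
A sketch of box-by-box loading with a try/except guard, using argopy's `DataFetcher`. The box size, date range, Atlantic domain, and data source are placeholders:

```python
import xarray as xr
from argopy import DataFetcher

loaded, failed = [], []
for lat0 in range(-60, 60, 10):
    for lon0 in range(-80, 20, 10):
        # [lon_min, lon_max, lat_min, lat_max, pres_min, pres_max, date_min, date_max]
        region = [lon0, lon0 + 10, lat0, lat0 + 10, 0, 2000, "2004-01", "2024-01"]
        try:
            loaded.append(DataFetcher(src="gdac").region(region).to_xarray())
        except Exception as err:
            # e.g. a profile file missing PSAL; keep enough detail to report upstream
            failed.append((region, repr(err)))

atlantic = xr.concat(loaded, dim="N_POINTS")   # argopy's point-format dimension
print(f"{len(failed)} problem boxes:", failed)
```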

Next steps: loading

  • load new argo sync
  • (if necessary) rework argopy's to_xarray() method

Next steps: project

  • make sections binned by 2deg/5deg
  • calculate term [1] using two methods
  • get CT and SA into the same units

andrewfagerheim commented Jan 17, 2025

17 Jan 2025: Updates

Notes:

  • The new filtering method for removing profiles that don't have PSAL before concatenation seems to be working! I cloned the argopy package, updated the to_xarray method to see if a file has a PSAL variable before concatenating, and installed this version of the package into my environment. It seemed like this solved almost all of the errors!
  • But there are still a few missing boxes (it looks like 4), and because my screen turned off before everything finished loading, I don't have the error logs on record. So that's unfortunate and will take additional work to sort out what's going on.
  • Also, there are very large margins around the coasts and islands in the polar regions that seem surprising, and similarly in the Southern Ocean. I'm not quite sure what to think about this.
  • For now, I'm loading the Indian basin as well, however I started this on the shuttle so it also disconnected and I'm not getting the error logs. I think it makes sense to work on two parallel tracks:
    • Load all argo data globally. The pre-filtering step has resolved almost all of the problem boxes, so I'll be able to get most of the data. This will also allow for some testing with what metrics, plots, etc work best with more global-scale data.
    • Dive into remaining errors. At some point (when you know you'll be at your computer for a while and will remain on the same network), try to load a basin and see what the actual errors reported are. I'm really hoping it's just another missing variable. This would be easy to solve: just make a list of variables that must be present (instead of only PSAL). If it's something broader, I'm not sure what to do with that.

Next steps:

  • above steps I haven't gotten to yet
  • load all argo data globally
  • resolve remaining problem boxes

andrewfagerheim commented Jan 17, 2025

17 Jan 2025: Meeting w @dhruvbalwada

Notes:

  • Next steps should be to figure out why the remaining boxes are still blank and to determine how many problem floats are being removed from the dataset. Additionally, rerun with a more up-to-date synced dataset to see if this makes a difference. The end goal here is to either confirm that 1) this is a very weird but localized error with a few floats that probably isn't a big deal or 2) this is actually more wide-spread than we thought and should be looked into further.
  • In the future, looking into icechunk for version controlling data would be helpful. This would allow updates to data to be added instead of total reprocessing (if I understand correctly). Probably not something to consider right now however.
  • Another thing to work on: add shading to the section plots to visualize which areas have good data coverage and which areas are more sparse.
  • One really useful thing to do now is to create a gridded dataset from the basin-scale datasets. The idea here would be to group the data into 3deg x 3deg bins and take the average within each. Then you can use the hvplot wrapper in xarray for dynamic visualization: for example, using a slider to correspond with a certain dimension, like sliding across longitudes for a section or across depths for a map. I think you should save both a "normal" netcdf file for the whole globe and also a "gridded" file. (A rough sketch follows below.)
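
A rough sketch of the 3x3 degree gridding plus an hvplot slider. The `CT_variance` variable, output file name, and dimension names are placeholders:

```python
import numpy as np
import hvplot.xarray  # noqa: F401  (registers the .hvplot accessor on xarray objects)

# Flatten to a dataframe; to_dataframe carries the LATITUDE/LONGITUDE coords along.
df = ds["CT_variance"].to_dataframe().reset_index()
df["lat_bin"] = np.floor(df["LATITUDE"] / 3) * 3 + 1.5     # 3-degree bin centers
df["lon_bin"] = np.floor(df["LONGITUDE"] / 3) * 3 + 1.5

gridded = (df.groupby(["lat_bin", "lon_bin", "PRES_INTERPOLATED"])["CT_variance"]
             .mean()
             .to_xarray())
gridded.to_netcdf("atl_basin_gridded.nc")                   # hypothetical output file

# Interactive map with a depth slider; swap the slider dimension to get sections.
gridded.hvplot(x="lon_bin", y="lat_bin", groupby="PRES_INTERPOLATED", cmap="viridis")
```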

Next steps:

  • rerun Atlantic basin and
    • figure out remaining problem boxes
    • log float and profile IDs of problem floats
  • shading for section plots
  • create gridded atl_basin dataset
  • create interactive plots with gridded data
  • (for down the road): explore more sophisticated data loading & processing workflow

22 Jan 2025: Meeting with Andreas

Notes:

  • This was a great conversation, tons to think about!! I'll try to remember everything and decode my notes appropriately.
  • One general criticism (positive connotation) or thing to think about: Is spatial filtering really doing what we want it to?
    1. f changes with latitude, and with it the scales of eddies, internal waves, etc.
    2. skepticism that 100m is an appropriate scale for filtering; he thinks that everything greater than 100m is really just measuring the impact of internal waves
  • How to solve this? Off the top of his head, he wondered about binning and taking the average to be the mean profile; then we aren't dealing with spatial scales at all. I'm pretty sure I tried looking at this and it was pretty messy, but it's worth looking at more closely. Maybe try it for one section and compare the results side-by-side.
  • Also mentioned was using modes, but I understood less of this. Something to look into though.
  • Look at what Ferrari & Polzin did to define different scales, I'm pretty sure they did spatial filtering. But check into this to make sure---and also how large were the bins?
  • Another thing to think about is GO-SHIP cruises, to use section data to validate what argo is indicating. His suggestion is that GO-SHIP data is known to be the highest quality, so use the same metrics and compare: how similar do they look? As long as things look pretty similar, this then validates argo to look at global patterns.
  • He was really fascinated by the coherent spatial features that follow isopycnals almost perfectly, but then seem to break down quite suddenly.

Next steps (long term):

  • think about scale definition more carefully
  • load GO-SHIP section and perform similar analysis

andrewfagerheim commented Feb 3, 2025

31 Jan 2025: Meeting w @dhruvbalwada

Notes:

  • These gridded plots are looking good and offer more opportunity to explore features across a basin, which is good. There are a few ways to improve them at some point (not all of these are immediate next steps):
    • Add labeled isopycnals to all 3 plots. It might take a bit of creativity to figure out how to find the right values for each plot. (Because the map will have a much smaller density range than the sections, for example.)
    • It would be nice to animate each plot, to have an animated map across all depths for example, or a section across all longitudes.
    • More long-term, I would love to make an interactive dashboard (see holoviz) where anyone could explore the dataset themselves. But this should either be a personal/side project or done during the summer, I'm worried it could easily detract from actual research progress.
  • Additionally, it will be important to compute along-isopycnal tracer gradients. Even if you don't go all the way to computing term [1], visually comparing regions of high mesoscale variance with high lateral gradients would be a useful exploration.
  • But the current barrier to resolve is the data loading problems: identifying how many problem floats/profiles there are, and what other errors may be thrown.

Next steps:

  • rerun Atlantic basin and:
    • figure out remaining problem boxes
    • log float and profile IDs of problem floats
  • add isopycnals to gridded plots
  • create animated gridded plots?

6 Feb 2025: Meeting w Natassa

Notes:

  • She also suggested adding shading/dots/etc to denote which parts of my plots have low data availability. Conceptually I'm trying to think about how to do this, because it would mean the binned ds must retain some information about the original ds. Is there a way to add some kind of "N_PROF count" to the binning function? (A small sketch follows after this list.)
  • Natassa noted that this type of work would be useful to modelers to constrain model parametrizations of eddy activity. However, the metric of "variance" wouldn't be the most relevant.
    • Instead, she recommended "converting the variances into fluxes." I will have to think more about this. Flux of what?
    • Additionally, since this is Argo data over a 20-year period, it will be difficult to compare to models directly. Some kind of seasonality or interdecadal analysis could be used, for example like dividing by ENSO index or NAO or just seasonality directly. Hopefully this would present a way to compare time variance to models while maintaining a large enough sample of Argo floats to remain confident about the results.
  • As far as scheduling an AC meeting this semester, Natassa doesn't have any long-term block-out dates in March. I need to connect with Dhruv and Andreas to ask them the same, then propose a few meeting times (on Tues/Thurs).
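
One possible way to keep an N_PROF count in the binning step, continuing the pandas-groupby gridding sketch from the 17 Jan notes (the dataframe `df`, its bin columns, and the `CT_variance` variable are placeholders from that sketch):

```python
# df with lat_bin / lon_bin columns as in the earlier gridding sketch
grouped = df.groupby(["lat_bin", "lon_bin", "PRES_INTERPOLATED"])["CT_variance"]
binned_mean = grouped.mean().to_xarray()
n_prof = grouped.count().to_xarray().rename("n_prof")      # samples per bin

# Example: mask (or hatch/stipple) bins with too few samples before plotting.
binned_masked = binned_mean.where(n_prof >= 10)
```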
