2017 meeting notes

Michael Levy edited this page Nov 7, 2017 · 23 revisions

November 7, 2017

It makes sense to add yaml_to_json.py to the MARBL_tools package (and have it use the python class in that package). Open question about where the output JSON file should go, though -- one possibility is to look to see where CIME puts Fortran autogenerated by genf90, and name a directory in a similar manner.

The temporary solution of POP using put_settings() to make sure init_bury_coeff_opt is set correctly is okay, but the flag controlling it should be named lmarbl_bury_coeff_vars_in_restfile. Keith points out that we want to keep init_bury_coeff_opt in MARBL even after MARBL is computing its own running means, because a user may want to re-initialize the running mean.

Next step for marbl_dev_levy will be getting MARBL_NT directly from MARBL instead of OCN_TRACER_MODULES_OPT.


October 31, 2017

Demoed the new POP buildnml script, which calls a MARBL python script to create marbl_in instead of relying on build-namelist; lots of suggestions for cleaning up the MARBL scripts:

  1. Instead of input file, refer to marbl_in as a MARBL settings file
    • MARBL_input_file.py -> MARBL_generate_settings_file.py
    • MARBL_parameter_values.py -> MARBL_settings_file_class.py
    • lib_dir -> MARBL_settings_class_dir
  2. Instead of marbl/tools/generate_input_file, have all python in marbl/MARBL_tools and use __init__.py to allow code like
    import MARBL_tools
    MARBL_tools.gen_settings_file()
  3. Move default_values.yaml -> marbl/src/default_settings_values.yaml
  4. marbl.parameters_input -> user_settings_marbl

Also, some design decisions to keep thinking about / ask CSEG about:

  1. Can we have users put user_settings_marbl in $CASEROOT?
  2. Where should MARBL SourceMods go? Currently in SourceMods/src.pop, but maybe SourceMods/src.pop/marbl or SourceMods/src.marbl?
  3. Need to make sure that sys.path.insert(0,SourceModsDir) does what we want -- namely, that if an updated version of MARBL_settings_file_class.py is in SourceMods, that gets imported rather than the one from MARBL_tools
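
One way to check item 3 is to build two throwaway directories that each hold a module of the same name and see which copy Python imports. This is only a sketch: the module name demo_settings_class, the WHERE attribute, and the temporary directories are hypothetical stand-ins, so the check does not depend on a MARBL checkout.

```python
import importlib
import os
import sys
import tempfile


def import_with_override(module_name, override_dir):
    """Import module_name, preferring a copy in override_dir if present."""
    # Position 0 is the first place Python looks for the module.
    sys.path.insert(0, override_dir)
    # Drop any cached copy so the search path is consulted again.
    sys.modules.pop(module_name, None)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


def _write_module(directory, module_name, where):
    """Write a one-line module that records which directory it lives in."""
    with open(os.path.join(directory, module_name + ".py"), "w") as f:
        f.write("WHERE = %r\n" % where)


# Hypothetical stand-ins for the MARBL_tools and SourceMods directories
tools_dir = tempfile.mkdtemp()
sourcemods_dir = tempfile.mkdtemp()
_write_module(tools_dir, "demo_settings_class", "MARBL_tools")
_write_module(sourcemods_dir, "demo_settings_class", "SourceMods")

sys.path.insert(0, tools_dir)
module = import_with_override("demo_settings_class", sourcemods_dir)
print(module.WHERE)
```

Because the override directory ends up ahead of the tools directory in sys.path, the SourceMods copy is the one that gets imported. The sys.modules.pop() call matters if the module was already imported earlier in the same process.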

And a bug we noticed looking through the code: MARBL probably does not work with multi-instance; we need to make marbl_nml_filename a namelist parameter in ecosys_driver.F90 and make sure it looks for marbl_in_#### if ninst > 1.
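
The real fix belongs in ecosys_driver.F90, but the naming rule itself is easy to sketch in Python. The function name, the 1-based instance index, and the four-digit zero padding are assumptions (one reading of the marbl_in_#### shorthand).

```python
def marbl_settings_filename(inst_index, ninst, base="marbl_in"):
    """Return the settings file name for one instance.

    Assumes a four-digit, 1-based, zero-padded suffix (marbl_in_0001, ...)
    when ninst > 1, and the bare base name for single-instance runs.
    """
    if ninst < 1 or not 1 <= inst_index <= ninst:
        raise ValueError("need ninst >= 1 and 1 <= inst_index <= ninst")
    if ninst == 1:
        return base
    return "%s_%04d" % (base, inst_index)
```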

Keith asked about generating the settings file via a class method instead of having a separate script, but I don't like that idea because I think we want separate scripts to act on the class (especially once we are ready to auto-generate the Fortran code).


October 24, 2017

It sounds like using YAML will soon be blessed by the powers that be (aka CIME devs), so we'll move forward with YAML for generating the input file. Worst case scenario, we end up writing a tool to convert our YAML file into XML.

After POP is up and running with YAML-based input file generation, we will move grazing into zooplankton_type (so zooplankton(:)%grazing(:) instead of grazing(:,:)). Also, we discussed how it doesn't make sense to call grazing%construct from marbl_settings_define_PFT_count; it should be called from a separate subroutine that is called between marbl_settings_define_PFT_count and marbl_settings_set_defaults_PFT_derived_type (maybe marbl_settings_construct_PFT_derived_types?). And really, it will be the zooplankton(:) constructor (which will call the grazing constructor in turn).

For POP use: introduce marbl.parameters_input in $POPROOT/input_templates (an empty file save for comments about using it to change MARBL parameters). If the file is not present in SourceMods/src.pop, use the [empty] file in input_templates/.

Next priority (after input file work is complete) will be having MARBL provide a list of diagnostics. This will likely require converting ocn.ecosys.tavg.csh to Python; we want to use YAML to define all possible diagnostics. It will be a simple dictionary, with diags[diag_name] = requested_freq; requested_freq should be one of none, low, medium, high (or a list if it should appear in multiple streams). Care will be needed for tracer-based diagnostics, because most of those will be computed by the GCM itself rather than being passed by MARBL. So something like

diagnostics.yaml
----------------
_nonliving_tracers :
                   - DIC
                   - NO3
                   - NH4
_per_tracer:
   default_tracer :
      surface_flux : none
      virtual_flux : none
   ALK :
      surface_flux : medium
      virtual_flux : medium

ECOSYS_ATM_PRESS : medium

ECOSYS_IFRAC :
   - medium
   - high

ECOSYS_XKW :
   - medium
   - high

SCHMIDT_O2 : medium

SCHMIDT_CO2 : medium

Possibly with more keys in the _per_tracer subdictionary.

The python script to parse this dictionary should also allow a text file override (similar to the parameter input file), maybe just

DIAGNAME requested_freq

Like with the parameters, input_templates/marbl.diagnostics_input could be empty but copied to SourceMods and edited. Tracer-specific changes would be harder to implement in this manner.
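
The dictionary lookup and text override described above could look something like the sketch below. It uses a plain Python dict in place of the parsed YAML and only covers the top-level (non-tracer) entries; the function names and the use of ! as a comment character in the override file are assumptions.

```python
VALID_FREQS = ("none", "low", "medium", "high")


def normalize_freqs(requested):
    """Return requested_freq as a list, accepting a string or a list."""
    freqs = [requested] if isinstance(requested, str) else list(requested)
    for freq in freqs:
        if freq not in VALID_FREQS:
            raise ValueError("unknown frequency: %s" % freq)
    return freqs


def apply_overrides(diags, override_lines):
    """Apply 'DIAGNAME requested_freq' lines on top of the defaults."""
    result = {name: normalize_freqs(freq) for name, freq in diags.items()}
    for line in override_lines:
        line = line.split("!")[0].strip()  # assumed comment character
        if not line:
            continue
        name, freq = line.split()
        if name not in result:
            raise KeyError("unknown diagnostic: %s" % name)
        result[name] = normalize_freqs(freq)
    return result


# Top-level entries from the example above, written as a Python dict
defaults = {
    "ECOSYS_ATM_PRESS": "medium",
    "ECOSYS_IFRAC": ["medium", "high"],
    "ECOSYS_XKW": ["medium", "high"],
    "SCHMIDT_O2": "medium",
    "SCHMIDT_CO2": "medium",
}
diags = apply_overrides(defaults, ["! turn one field off", "SCHMIDT_O2 none"])
```

Storing every frequency as a list keeps the single-stream and multi-stream cases on the same code path. An override line of this form can only name one stream per diagnostic, which is part of why tracer-specific changes would need more thought.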


October 17, 2017

Not much beyond what is listed in the agenda, but we agreed that I should pause work on the input file name generation tool for a day to move iron_frac_in_dust and iron_frac_in_bc from MARBL to POP (was originally on Keith's list ahead of splitting the dust flux into fine and coarse components).


October 10, 2017

Keith pointed out that the YAML file contains a mix of python logicals (keys that look like "ciso_on = True") and YAML logicals (default_value : true, append_to_config_keywords : true, etc). What we want is mostly Fortran logicals (ciso_on = .true., default_value : .true.) wherever a value eventually ends up in Fortran code, with YAML logicals (append_to_config_keywords : true) for the rest. When we start to support input files in the python, we will also need to support a range of acceptable Fortran logicals, e.g. .true., True, and T.
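
A minimal sketch of accepting several spellings of a Fortran logical in the python follows. The notes only list .true., True, and T; the matching false spellings, the case-insensitive comparison, and both function names are assumptions.

```python
def parse_fortran_logical(text):
    """Convert a Fortran-style logical string to a Python bool."""
    token = text.strip().lower()
    if token in (".true.", "true", "t"):
        return True
    if token in (".false.", "false", "f"):
        return False
    raise ValueError("not a recognized logical: %r" % text)


def to_fortran_logical(value):
    """Write a Python bool back out in the canonical Fortran spelling."""
    return ".true." if value else ".false."
```

Normalizing on input and always writing .true. / .false. on output would keep the generated settings file consistent regardless of how the user typed the value.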

As seen in the append_to_config_keywords : true example, I need to be more consistent with the leading _; the rule of thumb is "if it does not correspond to something in the MARBL Fortran, it is YAML-specific and therefore needs to start with _."

To support different tracer counts, I will add a get_tracer_count routine to the class that will parse

    _array_size : *NT
    _array_size_increment :
       variable_PtoC = .false. : -3
       ciso_on = .true. : 14
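
A rough sketch of what get_tracer_count might do with that block is below. It is written as a free function rather than a class method, it assumes the *NT alias has already been resolved to an integer, and it assumes settings are stored as Fortran-style strings; the base count of 30 in the example is made up.

```python
def get_tracer_count(base_count, increments, settings):
    """Return base_count plus every increment whose condition holds.

    increments maps 'name = value' strings to integer offsets;
    settings maps variable names to their current value strings.
    """
    count = base_count
    for condition, delta in increments.items():
        name, expected = [part.strip() for part in condition.split("=", 1)]
        if settings[name].strip().lower() == expected.lower():
            count += delta
    return count


# The _array_size_increment block above, as it would look once parsed
increments = {"variable_PtoC = .false.": -3, "ciso_on = .true.": 14}

# Hypothetical base count and settings, for illustration only
settings = {"variable_PtoC": ".true.", "ciso_on": ".true."}
tracer_count = get_tracer_count(30, increments, settings)
```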

Also, when it comes time to add this tool to POP, we will remove POPROOT/source/marbl (which is just MARBLROOT/src) and add the full MARBLROOT to POP's SVN externals. So POPROOT/marbl will contain the full MARBL checkout; also POP will add MARBLROOT to env_build.xml and then the buildcpp, buildnml, and buildexe scripts can check SourceMods before falling back to MARBLROOT when looking for MARBL code and utilities.

Matt pointed out that following the input file generation, the next big task for MARBL will be to provide the GCM with a list of diagnostics being provided. POP, for example, should use MARBL output to decide what goes into tavg_contents.


October 3, 2017

More discussion on how to get resolution-dependent defaults into the YAML. Going to make default_value a dictionary if there are multiple possible defaults, and then pass a list (or dictionary?) of keys. Luckily, variables seem to depend on either resolution OR the value of a previous variable rather than a combination of the two, because the logic to figure out the default when there are multiple keys to match is far tougher.
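
As a sketch of the single-key case, assuming the dictionary form of default_value carries a "default" entry plus one entry per special case (the entry names and the function name are hypothetical):

```python
def pick_default(default_value, keys=()):
    """Return the default for a variable.

    default_value is either a scalar or a dict with a "default" entry plus
    entries for specific conditions; keys lists the conditions that hold.
    """
    if not isinstance(default_value, dict):
        return default_value
    matches = [key for key in keys if key in default_value]
    if len(matches) > 1:
        # The multi-key case is the hard one, so refuse rather than guess.
        raise ValueError("multiple keys match: %s" % ", ".join(matches))
    if matches:
        return default_value[matches[0]]
    return default_value["default"]
```

For example, pick_default({"default": 1.0, "gx3v7": 2.0}, ["gx3v7"]) picks the grid-specific entry, while an empty key list falls back to the "default" entry. The values here are made up.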

Keith will update POP to require the abio and ciso modules to use the same d14c forcing when both modules are being run, but will NOT impose a similar restriction on atmospheric CO2 when abio and ecosys are both used.


September 22, 2017

Compiling and running a Fortran executable to generate CPP macros and the MARBL namelist is not a reasonable approach: some machines require different compiler flags to run executables on a login node rather than on the compute nodes, and then we would not be able to run the MARBL executable when the namelist is regenerated during a continuation run. Other smaller issues presented themselves too, but the big one alone prevents us from even prototyping this workflow.

Instead we will look at auto-generating Fortran code that contains the default values provided by an XML (or JSON?) file. The Fortran code would be generated by a python script, and a different python script would also generate marbl_in.

This still leaves us needing a tool to generate the tracer count, but for now we will focus on the namelist. One possibility for the tracer count would be to use the auto-generation tool for tracer initialization as well, so we'll keep that in mind as we move forward.

In more detail-oriented discussion, we talked about how to generate marbl_in; one possibility is to have build-namelist do it, but rather than rely on namelist_definitions.xml we could prepend something to MARBL variables in user_nl_pop. For example:

! POP variables are entered in the usual way
ltavg_ignore_extra_streams = .true.
n_tavg_streams = 1
tavg_freq_opt = 'nday'
tavg_file_freq_opt = 'nday'
lecosys_tavg_all = .true.

! MARBL variables have a "MARBL: " prefix
MARBL: autotroph_cnt = 4
MARBL: ciso_on = .true.
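
Splitting such a file into POP and MARBL pieces would be simple; a sketch, with a hypothetical function name and the assumption that lines starting with ! are comments:

```python
MARBL_PREFIX = "MARBL:"


def split_user_nl(lines):
    """Split user_nl_pop lines into (pop_lines, marbl_lines)."""
    pop_lines, marbl_lines = [], []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and comments
        if line.startswith(MARBL_PREFIX):
            # Strip the prefix so MARBL sees a plain "name = value" line
            marbl_lines.append(line[len(MARBL_PREFIX):].strip())
        else:
            pop_lines.append(line)
    return pop_lines, marbl_lines


pop_lines, marbl_lines = split_user_nl([
    "! POP variables are entered in the usual way",
    "n_tavg_streams = 1",
    "",
    "! MARBL variables have a \"MARBL: \" prefix",
    "MARBL: autotroph_cnt = 4",
    "MARBL: ciso_on = .true.",
])
```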

September 19, 2017

I showed a brief demo of the generic MARBL interface, and we talked about the best way to distribute it. For now we will leave it in its own repository, but after the folks at UCI have a chance to expand on it, we will look at bringing it into NCAR/MARBL. We would need to rethink the top-level directory structure, possibly adding a drivers directory and moving driver_src/ and driver_exe/ out of the tests directory.

A significant amount of time was spent discussing how to get MARBL's input file generation out of POP's build-namelist. As a first pass, POP's buildcpp script will build MARBL's stand-alone driver; this can be run for two tasks:

  1. buildcpp: Determine tracer count (I'll add a new test that just outputs ECOSYS_BASE_NT, CISO_NT, and MARBL_NT)
  2. buildnml: Write out the MARBL input file (using gen_inputfile and passing user_nl_marbl to overwrite defaults).

We would need to be aware of MARBL files in SourceMods/src.pop when we build the driver, and it would need to be rebuilt in buildnml if files are modified after buildcpp is run.

We also found a bug in how I hard-coded marbl_in as the MARBL namelist file name; this needs to be compatible with multi-instance runs, so marbl_nml_filename should be in &ecosys_driver_nml rather than a Fortran parameter.

Mariana would like to sit down with Jim E and some of the MARBL team to discuss this approach in more detail.


August 29, 2017

Most of the meeting was spent discussing the best way to initialize PFTs. We will introduce a new MARBL parameter named PFT_defaults; a value of 'user-specified' will force the user to set all PFT-related variables; the default value will be 'CESM2', which will provide the 3 autotrophs, 1 zooplankton, and 3 grazing prey classes that are currently the default.

We will also rename grazer_prey_cnt to max_grazer_prey_cnt as that is a more descriptive name -- if we have 2 zooplankton, one of which grazes on three different biomass aggregates and another that only grazes on two then we need max_grazer_prey_cnt = 3; in the future, we may consider grazer_prey_cnt(zooplankton_cnt).

We will have some CESM-specific comments in MARBL, but only temporarily -- not all of the MARBL parameters that can be changed via put_setting() calls are available in the POP build-namelist script, and we want to make it clear to CESM users which parameters need to be changed in the Fortran code as opposed to which default values are over-written by put_setting() calls from POP. After MARBL has its own tool to generate input files, the comments will change to alert users to variables that should be changed via input file.


August 15, 2017

Our plan for CESM 2.0 is to continue to develop at our natural pace and see where MARBL stands at the code freeze - we already have several more features in than we expected thanks to delays in the CESM finalization. Keith is working on a new initial condition file; ecosys_jan_IC_gx1v6_20161123.nc is an intermediate file useful for some testing but is not needed in place of ecosys_jan_IC_gx1v6_20150108.nc given that it will not be the final version of the file and updating from the old version will require new baselines for all our tests.

Lots of progress was made towards #189. Most notable is that POP's build-namelist tool generates a separate marbl_in file that MARBL parses without Fortran's namelist functions. This is a great step towards having MARBL generate its own input file, which will be tackled after #189 is accepted.

I have a fork of the repository holding the MARBL documentation, and I will keep it up to date with my code changes so that the documentation can be updated once #189 is pulled to master.


July 25, 2017

Most of the meeting was spent discussing initialization; the current plan is to have marbl_instance%user_settings replace both %configuration and %parameters; the %put() call would store the keyword - value pairs in a linked list until after each round of %add_var() calls; at that point, variables that have been added would be updated (and the linked list entry would be deleted), while put() calls for variables that haven't been added yet would be kept (and tried again after the next round of %add_var()). If the linked list is not empty at the end of init() then it indicates a put() call did not match any of the parameters and MARBL will abort with an error.
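
The deferred put() logic is easier to see in a few lines of Python than in prose. This is a sketch of the idea, not the Fortran design: it uses a plain Python list where the plan calls for a linked list, and the class and method names other than put() and add_var() are made up.

```python
class UserSettings:
    """Hold put() requests until the matching variable has been added."""

    def __init__(self):
        self.values = {}    # variables registered via add_var()
        self.pending = []   # (name, value) pairs not yet matched

    def put(self, name, value):
        self.pending.append((name, value))

    def add_var(self, name, default):
        self.values[name] = default

    def apply_pending(self):
        """Call after each round of add_var(); keep unmatched puts."""
        still_pending = []
        for name, value in self.pending:
            if name in self.values:
                self.values[name] = value
            else:
                still_pending.append((name, value))
        self.pending = still_pending

    def finalize(self):
        """Call at the end of init(); leftover puts are errors."""
        if self.pending:
            names = ", ".join(name for name, _ in self.pending)
            raise KeyError("put() did not match any parameter: " + names)
```

A put() for a variable that is only added in a later round simply survives apply_pending() until that round, which is what lets the GCM make all of its put() calls up front.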

We also talked about using a single monolithic init() instead of breaking the calls into different phases. One reason for doing this is that the different phases were introduced just so that GCMs had the opportunity to call put() in between phases, but that will no longer be necessary. Also, we will add some error checking to make sure different initialization routines are called in the right order (for example, we can not initialize tracer_restore_vars until we have constructed marbl_tracer_indices): this will likely mimic POP's initialization error checking, where we register a string with the subroutine name at the beginning of each phase and check to see if routines that are depended on have been run by looking for that name in the registry.
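
The registry idea can be sketched as follows. This is my own Python illustration of the pattern, not POP's actual implementation, and the class, method, and routine names are hypothetical.

```python
class InitRegistry:
    """Track which initialization routines have already run."""

    def __init__(self):
        self._done = set()

    def register(self, name, requires=()):
        """Record that a routine is running; fail if a dependency has not run."""
        missing = [dep for dep in requires if dep not in self._done]
        if missing:
            raise RuntimeError(
                "%s called before %s" % (name, ", ".join(missing)))
        self._done.add(name)

    def has_run(self, name):
        return name in self._done
```

Each routine would call register() with its own name at the top, listing the routines it depends on, so an out-of-order call fails immediately with a message naming the missing step.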

In the future, we can perhaps abandon the MARBL namelist altogether -- POP's build-namelist tool could write a separate file (marbl_in?) that contains a generic format like

var1_name   var1_type   var1_value
var2_name   var2_type   var2_value
...

Which would turn into the default format of MARBL's gen_parameters tool. The POP driver would then read this file and call marbl_instance%settings%put() and we could strip all namelist support out of MARBL.
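
Reading the three-column format is straightforward; a sketch in Python, where the set of type names (integer, real, logical, string) and the function name are assumptions since the notes do not spell them out:

```python
def parse_settings_line(line):
    """Parse 'name type value' into (name, converted_value)."""
    name, var_type, raw = line.split(None, 2)
    raw = raw.strip()
    if var_type == "integer":
        value = int(raw)
    elif var_type == "real":
        # Accept Fortran double-precision exponents such as 1.5d0
        value = float(raw.lower().replace("d", "e"))
    elif var_type == "logical":
        value = raw.lower() in (".true.", "true", "t")
    elif var_type == "string":
        value = raw.strip("'\"")
    else:
        raise ValueError("unknown type: %s" % var_type)
    return name, value
```

Splitting on at most two runs of whitespace means a string value can itself contain spaces.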


July 12, 2017

We talked about what the next steps in MARBL development should be. After bringing runtime configurability to PFT counts into MARBL (currently in progress), it will probably be a good time to work on MARBL setting up its own namelist, and it would be useful to have better control over diagnostic output before bringing abio tracers into MARBL. So the recommended path forward is

  1. Finish runtime-configurable PFT
  2. MARBL building its own namelist
  3. More flexibility in marbl_domain_type
  4. Better control over what diagnostics are returned to GCM

For the namelist generation, we compiled a list of options the namelist generation tool should support

  • --output-file-format: default will be Fortran namelist, but also support MPAS and MOM parameter file formats
  • --default-file: XML file containing general defaults
  • --non-default-files: a way to have MARBL provide multiple XML files that build on each other (e.g. turning CISO on)
  • --user-specified-file: a way for the user to specify changes from the default settings
  • --user-specified-file-format: XML file? something like CESM's user_nl text files? Something MPAS or MOM specific? etc etc.
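
If the tool ends up being written in Python, the option list maps directly onto argparse. In this sketch only the option names come from the list above; the choices, defaults, and which options are required are guesses.

```python
import argparse


def build_parser():
    """Command-line interface for a hypothetical namelist generation tool."""
    parser = argparse.ArgumentParser(
        description="Generate a MARBL namelist (sketch)")
    parser.add_argument("--output-file-format", default="namelist",
                        choices=["namelist", "MPAS", "MOM"],
                        help="format of the generated file")
    parser.add_argument("--default-file", required=True,
                        help="XML file containing general defaults")
    parser.add_argument("--non-default-files", nargs="*", default=[],
                        help="XML files layered on top of the defaults")
    parser.add_argument("--user-specified-file", default=None,
                        help="file with the user's changes")
    parser.add_argument("--user-specified-file-format", default=None,
                        help="format of the user-specified file")
    return parser


args = build_parser().parse_args(
    ["--default-file", "defaults.xml", "--non-default-files", "ciso.xml"])
```

argparse converts the dashes to underscores, so the values are available as args.output_file_format, args.non_default_files, and so on.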

For better control of diagnostics, what if MARBL did not allocate field_2d or field_3d but instead left that up to the GCM? Instead of compute_now we could just look to see if memory was allocated. (Maybe make them pointers, then check if associated?) This led to the follow-up question about whether there are other parts of the interface that could be updated in this manner (state, forcings, fluxes, tendencies, etc). Conclusion was that it probably doesn't make sense to determine which tracer tendencies are returned in this manner (it's easier to think of tracers in natural groupings than as individual quantities).