Using EFSOI with GDAS

AndrewEichmann-NOAA edited this page Jul 26, 2022 · 19 revisions

Current Status

last edited July 26, 2022

The EFSOI code has been merged into the GSI-utils develop branch, and the scripts and j-jobs into the global-workflow develop branch. Most of the remaining necessary files have preliminary approval to be merged with the global-workflow develop branch. As of this writing, EFSOI can be run on Orion using the development fork, which has been merged with global-workflow develop from July 22, 2022 (ffcd5b) and GSI-utils develop from July 19 (322cc7b).

Quick Start

Code, building, and experiment setup

To use EFSOI, clone the forked global-workflow repository, check out the EFSOI branch, and run the usual sequence of scripts. This branch of global-workflow clones a development fork of GSI-utils at a fixed hash; that fork contains the necessary fix file and some scripts for analyzing the EFSOI output. At the time of this writing EFSOI works on Orion and previously worked on WCOSS.

To set up the current global-workflow repository with the latest EFSOI development:

git clone --recursive https://github.com/AndrewEichmann-NOAA/global-workflow.git

cd global-workflow/

git checkout feature/EFSOI

Then check out, build, and link global-workflow as usual.

Run workflow/setup_expt.py as usual for a cycling experiment. A 20-member ensemble suffices for testing; 80 members are used for experiments. GFS need not be run separately.

In config.base in your expdir, set

export DO_EFSOI="YES"

Also, per the global-workflow instructions for cycling experiments, set the following:

imp_physics from 8 (Thompson) to 11 (GFDL)

CCPP_SUITE to FV3_GFS_v16 (or another suite that uses GFDL)
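Before running setup_xml.py, the three settings above can be checked with a short script. This is a minimal sketch, not part of global-workflow: the function names are hypothetical, and it assumes each variable is set by a plain `NAME=value` or `export NAME=value` line. Assignments that use shell expansion (for example `${VAR:-default}`) will be reported as mismatches and need to be checked by hand, as will a CCPP_SUITE set to a different GFDL-based suite.

```python
import re

# Values taken from the instructions on this page.
EXPECTED = {
    "DO_EFSOI": "YES",
    "imp_physics": "11",
    "CCPP_SUITE": "FV3_GFS_v16",
}

# Matches NAME=value or export NAME=value, with optional quotes and trailing comment.
_ASSIGN = re.compile(r'^\s*(?:export\s+)?(\w+)=(["\']?)(.*?)\2\s*(?:#.*)?$')


def read_settings(text):
    """Return the last plain assignment found for each variable in a shell config."""
    settings = {}
    for line in text.splitlines():
        match = _ASSIGN.match(line)
        if match:
            settings[match.group(1)] = match.group(3)
    return settings


def check_config(text, expected=EXPECTED):
    """Return {name: (found, wanted)} for each setting that is missing or different."""
    found = read_settings(text)
    return {
        name: (found.get(name), wanted)
        for name, wanted in expected.items()
        if found.get(name) != wanted
    }
```

To use it, pass the contents of config.base from your expdir, for example `check_config(open("config.base").read())`. An empty dictionary means all three settings match.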

Then run workflow/setup_xml.py from the EFSOI build of global-workflow. This will set up the workflow with the extra EFSOI tasks. The experiment can be started as usual.

What Should Happen

During the first complete cycle, the gdaseupdfsoi task (the ensemble update with settings specific to EFSOI) runs with the same priority as gdaseupd, leading to a parallel set of EFSOI-specific gdas tasks ending with post-processing. In this first cycle the first 30-hour ensemble forecast (metatask gdasefmnfsoi) is generated, and the gdasefsoi task does not run. During the second complete cycle, the 30-hour forecast is made again and post-processed to generate 24-hour and 30-hour ensemble means. The gdasefsoi task in this cycle sits idle until the cycle 24 hours later is active and has created the verifying analysis. The gdasefsoi task from the second complete cycle then runs, using the 24-hour forecast from that cycle and the 30-hour forecast from the previous cycle, and creates the final observation sensitivity (osense) file, which is placed in the osense directory in COMROT. The process repeats for the following cycles for the length of the experiment.
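The timing described above can be sketched in a few lines of Python. This is an illustration only: the function and key names are hypothetical and are not part of global-workflow, and the 6-hour cycle interval is the usual GDAS spacing.

```python
from datetime import datetime, timedelta


def efsoi_inputs(cycle):
    """Return the cycles that supply the inputs for the gdasefsoi task of `cycle`."""
    return {
        # 24-hour ensemble forecast initialized in this cycle
        "forecast_24h_init": cycle,
        # 30-hour ensemble forecast initialized in the previous cycle
        "forecast_30h_init": cycle - timedelta(hours=6),
        # cycle whose analysis verifies both forecasts
        "verifying_analysis": cycle + timedelta(hours=24),
    }


t0 = datetime(2022, 7, 1, 6)
for name, when in efsoi_inputs(t0).items():
    print(f"{name}: {when:%Y%m%d%H}")
```

Both forecasts are valid at the time of the verifying analysis, which is why the gdasefsoi task for a cycle cannot start until the cycle 24 hours later has produced its analysis.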

One result of this process is that the EFSOI-specific data, stored in efsoigdas directories with a structure similar to that of enkfgdas directories, has to be kept on disk for a longer time than the other data files, and can eat up space. Likewise the osense files are several hundred MB for each cycle. These osense files can be analyzed with scripts in sorc/gsi_utils.fd/src/EFSOI_Utilities/scripts.

From Theory to Practice

Background

Ensemble Forecast Sensitivity to Observation Impacts is based on a method developed in Langland and Baker (2004) that uses a model adjoint and the Kalman gain to determine the positive or negative impact of individual assimilated observations on the error of a forecast relative to a verifying analysis. The state vector plot below illustrates the concept.

Plot by Rahul Mahajan

The background Xb and analysis Xa of a given cycle are both used to initialize forecasts, Xbf and Xaf respectively. These forecasts are then compared to a verifying analysis Xt to obtain the respective errors of the two forecasts. The difference between the errors is traced back to the individual assimilated observations using the following equation:


Kalnay et al. (2012) developed a method to use ensemble forecasts and observation error covariance in lieu of an adjoint and Kalman gain:
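As commonly written following Kalnay et al. (2012), the resulting estimate of the change in forecast error is given below. The notation is an assumption taken from that paper and may differ from the notation used elsewhere on this page, and the localization applied in practice is omitted.

```latex
\Delta e^{2}
  = \mathbf{e}_{t|0}^{T}\,\mathbf{C}\,\mathbf{e}_{t|0}
  - \mathbf{e}_{t|-6}^{T}\,\mathbf{C}\,\mathbf{e}_{t|-6}
  \approx \frac{1}{K-1}\,
    \delta\mathbf{y}_{0}^{T}\,\mathbf{R}^{-1}\,\mathbf{Y}^{a}_{0}\,
    \mathbf{X}^{fT}_{t|0}\,\mathbf{C}\,
    \left(\mathbf{e}_{t|0} + \mathbf{e}_{t|-6}\right)
```

Here K is the ensemble size, δy0 is the vector of observation-minus-background innovations, R is the observation error covariance, Ya0 holds the analysis ensemble perturbations in observation space, Xf(t|0) holds the forecast ensemble perturbations at the verification time, C defines the error norm, and e(t|0) and e(t|-6) are the errors of the forecasts started from the analysis and from the background, both measured against the verifying analysis.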

References:

Kalnay, E., Ota, Y., Miyoshi, T. and Liu, J. 2012. A simpler formulation of forecast sensitivity to observations: application to ensemble Kalman filters. Tellus, 64A, 18462.

Ota, Y., Derber, J., Kalnay, E. and Miyoshi, T. 2013. Ensemble-based observation impact estimates using the NCEP GFS. Tellus, 65A, 20038.

EFSOI in GDAS and global-workflow

In more concrete terms within GDAS and global-workflow, the variables in the EFSOI equation are represented as follows:

where the green terms are stored in the initial "osense file" generated during the EFSOI-specific ensemble update task (gdaseupdfsoi) for a given cycle t0, and the values used for the forecast perturbation (the red term) are in 24-hour ensemble member forecasts initialized with the analysis at t0, also generated by gdaseupdfsoi. The forecast errors (the blue terms) are calculated using the 24-hour forecast ensemble mean, the 30-hour forecast ensemble mean from the cycle t0-6hr (which is functionally the same as a 24-hour forecast initialized with the background at t0), and the verifying analysis from t0+24hr. Both the 24-hour and 30-hour ensemble forecasts are run at the same time for t0 with the metatask gdasefmnfsoi: the 24-hour forecast is for the EFSOI calculation for cycle t0 and the 30-hour forecast is for cycle t0+6hr. The ensemble means are generated with gdasepmnfsoi. Note that the 24-hour forecasts used are specific to the global model; regional models may use shorter forecasts for the same purpose.
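The calculation described above can be illustrated with a toy example in plain Python. This is a sketch under stated assumptions, not the EFSOI Fortran code: it follows the unlocalized form in Kalnay et al. (2012), uses a diagonal error norm and diagonal observation error covariance, and all function and argument names are hypothetical. The innovations, error variances, and observation-space perturbations stand in for the osense-file terms, `xf` for the forecast perturbations, and `e_t0` and `e_tm6` for the forecast errors.

```python
def forecast_error(forecast_mean, verifying_analysis):
    """Error of an ensemble-mean forecast relative to the verifying analysis."""
    return [f - a for f, a in zip(forecast_mean, verifying_analysis)]


def efsoi_impacts(innovation, ob_err_var, ya, xf, norm_weights, e_t0, e_tm6):
    """Estimated impact of each observation on the forecast error (no localization).

    innovation   : p observation-minus-background values
    ob_err_var   : p observation error variances
    ya           : p rows of K analysis perturbations in observation space
    xf           : n rows of K forecast perturbations at verification time
    norm_weights : n diagonal weights of the error norm
    e_t0, e_tm6  : n-element forecast errors (from analysis, from background)
    """
    n_state = len(xf)
    k = len(xf[0])  # ensemble size
    # Project the weighted sum of the two forecast errors onto each member.
    w = [
        sum(xf[i][m] * norm_weights[i] * (e_t0[i] + e_tm6[i]) for i in range(n_state))
        for m in range(k)
    ]
    return [
        innovation[j] / ob_err_var[j] * sum(ya[j][m] * w[m] for m in range(k)) / (k - 1)
        for j in range(len(innovation))
    ]
```

With this sign convention a negative value means the observation reduced the forecast error, since the quantity being estimated is the error of the forecast from the analysis minus the error of the forecast from the background.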

Running EFSOI

Tools for Analysis

Developer Notes

Location of EFSOI-relevant code

The code to run EFSOI is spread out over three repositories: GSI-utils for the EFSOI-exclusive Fortran code and Python scripts for analysis, GSI for libraries in enkf and gsi, and global-workflow for scripts to run within a cycling experiment.

GSI-utils

Everything in this repository is under src/EFSOI_Utilities/, with the Fortran in src/EFSOI_Utilities/src and Python scripts in src/EFSOI_Utilities/scripts. Under src/EFSOI_Utilities/fix is a version of the file global_anavinfo.l127.txt from the GSI fix directory that is identical except for an entry for the EFSOI executable. At the time of this writing the location of this file is assigned to ANAVINFO in the config.efso in the experiment directory, though it should be merged with the regular fix file.

The files under src are as follows:

  • efsoi.f90
  • efsoi_main.f90
  • gridio_efsoi.f90
  • loadbal_efsoi.f90
  • loc_advection.f90
  • scatter_chunks_efsoi.f90
  • statevec_efsoi.f90

The files with names ending in _efsoi.f90 were originally derived from similar files under the EnKF code in GSI, because for various reasons those could not be used as is as libraries. Otherwise an effort has been made to reduce code duplication, and certain modules are linked from the EnKF and GSI code. As such, the GSI-utils build needs to be told their locations as gsi_ROOT and enkf_ROOT, as described in the GSI-utils INSTALL.md. This is done automatically in the global-workflow build, which is probably the easiest context for development here.

GSI

global-workflow

The osense file

The osense file is output by both the EnKF executable during the ensemble update task and the EFSOI executable during the efsoi task. The update outputs the statistical information of each assimilated observation required to perform the EFSOI calculation. The EFSOI executable reads this file in and then overwrites it with the same information plus the observation sensitivities.

The following tables document the contents of the osense file. The first describes the single header, whose variables are common to all observations, and the second describes the record for each observation. The conventional and ozone observations are generally handled separately from the satellite observations, though the record format is the same. The variable names are as they are used in the EnKF/EFSOI code, and the same names are used by convention in the accompanying Python scripts.

| Type | Variable Name | Description |
|---|---|---|
| real(r_single) | obfit_prior | Observation fit to the first guess |
| real(r_single) | obsprd_prior | Spread of observation prior |
| real(r_single) | ensmean_obnobc | Ensemble mean first guess (no bias correction) |
| real(r_single) | ensmean_ob | Ensemble mean first guess (bias corrected) |
| real(r_single) | ob | Observation value |
| real(r_single) | oberrvar | Observation error variance |
| real(r_single) | lon | Longitude |
| real(r_single) | lat | Latitude |
| real(r_single) | pres | Pressure |
| real(r_single) | time | Observation time |
| real(r_single) | oberrvar_orig | Original error variance |
| integer(i_kind) | stattype | Observation type |
| character(len=20) | obtype | Observation element / Satellite name |
| integer(i_kind) | indxsat | Satellite index (channel) set to zero |
| real(r_single) | osense_kin | Observation sensitivity (kinetic energy) [J/kg] |
| real(r_single) | osense_dry | Observation sensitivity (dry total energy) [J/kg] |
| real(r_single) | osense_moist | Observation sensitivity (moist total energy) [J/kg] |
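Once the per-observation records have been read, a typical first step is to total the sensitivities by observation type. The sketch below mirrors the field names in the table as a Python dataclass and sums one sensitivity field per obtype. It is an illustration under assumptions: the class and function names are hypothetical, it does not read the binary file, and it is not taken from the analysis scripts in EFSOI_Utilities. The obtype values in the usage example are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class OsenseRecord:
    """One per-observation record, with the field names used in the table above."""
    obfit_prior: float = 0.0
    obsprd_prior: float = 0.0
    ensmean_obnobc: float = 0.0
    ensmean_ob: float = 0.0
    ob: float = 0.0
    oberrvar: float = 0.0
    lon: float = 0.0
    lat: float = 0.0
    pres: float = 0.0
    time: float = 0.0
    oberrvar_orig: float = 0.0
    stattype: int = 0
    obtype: str = ""
    indxsat: int = 0
    osense_kin: float = 0.0
    osense_dry: float = 0.0
    osense_moist: float = 0.0


def total_impact_by_obtype(records, field="osense_moist"):
    """Sum one sensitivity field over all records, grouped by obtype."""
    totals = defaultdict(float)
    for rec in records:
        # Fortran character(len=20) values are blank-padded, so strip them.
        totals[rec.obtype.strip()] += getattr(rec, field)
    return dict(totals)


# Synthetic records for illustration only.
records = [
    OsenseRecord(obtype="ps".ljust(20), osense_moist=-0.5),
    OsenseRecord(obtype="ps".ljust(20), osense_moist=-0.25),
    OsenseRecord(obtype="t".ljust(20), osense_moist=0.125),
]
print(total_impact_by_obtype(records))
```

Passing `field="osense_kin"` or `field="osense_dry"` totals the other two sensitivity measures in the same way.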