Using EFSOI with GDAS
last edited July 26, 2022
The EFSOI code is merged with the GSI-utils develop branch, the scripts and j-jobs are merged with the global-workflow develop branch, and most of the remaining necessary files have preliminary approval to be merged with the global-workflow develop branch. As of this writing, EFSOI can be run on Orion using the development fork, which has been merged with global-workflow develop from July 22, 2022 (ffcd5b) and GSI-utils develop from July 19 (322cc7b).
To use EFSOI, clone the forked global-workflow repository, check out the EFSOI branch, and run the usual sequence of scripts. This branch of global-workflow clones a hash of a development fork of GSI-utils that contains the necessary fix file and some scripts for analyzing the EFSOI output. At the time of this writing, EFSOI works on Orion and previously worked on WCOSS.
To set up the current global-workflow repository with the latest EFSOI development:
git clone --recursive https://github.com/AndrewEichmann-NOAA/global-workflow.git
cd global-workflow/
git checkout feature/EFSOI
and then checkout, build, and link global-workflow as usual.
Run `workflow/setup_expt.py` as usual for a cycling experiment. For testing, 20-member ensembles suffice; 80-member ensembles are used for experiments. GFS need not be run separately.
In `config.base` in your expdir, set
export DO_EFSOI="YES"
Also, per the global-workflow instructions for cycling experiments, set the following:
imp_physics from 8 (Thompson) to 11 (GFDL)
CCPP_SUITE to FV3_GFS_v16 (or another suite that uses GFDL)
Then run `workflow/setup_xml.py` from the EFSOI build of global-workflow. This will set up the workflow with the extra EFSOI tasks. The experiment can be started as usual.
During the first complete cycle, the `gdaseupdfsoi` task (the ensemble update with settings specific to EFSOI) runs with the same priority as `gdaseupd`, leading to a parallel set of EFSOI-specific gdas tasks that ends with post-processing. In this first complete cycle, the first 30-hour ensemble forecast (metatask `gdasefmnfsoi`) is generated, and the `gdasefsoi` task never runs. During the second complete cycle, the 30-hour forecast is made again and post-processed to generate 24-hour and 30-hour ensemble means. The `gdasefsoi` task in this cycle sits idle until the cycle 24 hours later is active and has created the verifying analysis. The `gdasefsoi` task from the second complete cycle then runs, using the 24-hour forecast from that cycle and the 30-hour forecast from the previous cycle, and creates the final observation sensitivity (osense) file. This file is placed in the osense directory in COMROT. The process repeats for the following cycles for the length of the experiment.
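The timing described above can be sketched with simple date arithmetic. This is only an illustration: the function name `efsoi_inputs`, the dictionary keys, and the example date are hypothetical and do not come from global-workflow.

```python
from datetime import datetime, timedelta

CYCLE_STEP = timedelta(hours=6)  # GDAS cycles every 6 hours


def efsoi_inputs(t0):
    """Return the cycles that supply the inputs to the gdasefsoi task of cycle t0.

    Hypothetical helper for illustration only; not part of global-workflow.
    """
    return {
        # 24-hour ensemble forecast initialized from the analysis at t0
        "fcst24_init": t0,
        # 30-hour ensemble forecast initialized one cycle earlier
        "fcst30_init": t0 - CYCLE_STEP,
        # both forecasts are valid at the time of the verifying analysis
        "verifying_analysis": t0 + timedelta(hours=24),
    }


inputs = efsoi_inputs(datetime(2022, 7, 1, 0))  # arbitrary example cycle
for name, when in inputs.items():
    print(f"{name:20s} {when:%Y%m%d%H}")
```

Because the verifying analysis only exists 24 hours after t0, the `gdasefsoi` task for t0 cannot finish until four further cycles have run, which is why the EFSOI-specific data stays on disk longer than the rest.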
One result of this process is that the EFSOI-specific data, stored in `efsoigdas` directories with a structure similar to that of the `enkfgdas` directories, must be kept on disk longer than the other data files and can consume considerable space. Likewise, the osense files are several hundred MB for each cycle. These osense files can be analyzed with the scripts in `sorc/gsi_utils.fd/src/EFSOI_Utilities/scripts`.
Ensemble Forecast Sensitivity to Observation Impacts is based on a method developed in Langland and Baker (2004) that uses a model adjoint and the Kalman gain to determine the positive or negative impact of individual assimilated observations on the error of a forecast relative to a verifying analysis. The state vector plot below illustrates the concept.
Plot by Rahul Mahajan
The forecast background Xb and analysis Xa of a given cycle are both used to initialize forecasts, Xbf and Xaf respectively. These forecasts are then compared to a verifying analysis Xt to obtain the respective errors of the two forecasts. The difference in the errors is traced back to the individual assimilated observations using the following equation:
Kalnay et al. (2012) developed a method to use ensemble forecasts and observation error covariance in lieu of an adjoint and Kalman gain:
Kalnay, E., Ota, Y., Miyoshi, T. and Liu, J., 2012. A simpler formulation of forecast sensitivity to observations: application to ensemble Kalman filters. Tellus, 64A, 18462.
Ota, Y., Derber, J., Kalnay, E. and Miyoshi, T., 2013. Ensemble-based observation impact estimates using the NCEP GFS. Tellus, 65A, 20038.
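As a reference point, the ensemble formulation is commonly written in the following form, following the notation generally used in Kalnay et al. (2012). The symbols here are assumptions for illustration and may not match the notation of the equations originally shown on this page; the localization applied in practice (Ota et al., 2013) is omitted.

$$
\Delta e^2 = e_{t|0}^{T}\, C\, e_{t|0} - e_{t|-6}^{T}\, C\, e_{t|-6}
\approx \frac{1}{K-1}\, \delta y_0^{T}\, R^{-1}\, Y_0^{a}\, X_{t|0}^{fT}\, C \left( e_{t|0} + e_{t|-6} \right)
$$

Here $K$ is the ensemble size, $\delta y_0$ the innovation (observation minus first guess) at the analysis time, $R$ the observation error covariance, $Y_0^{a}$ the analysis ensemble perturbations in observation space, $X_{t|0}^{f}$ the forecast ensemble perturbations at the verification time $t$, $C$ the matrix defining the error norm, and $e_{t|0}$ and $e_{t|-6}$ the errors of the forecasts initialized from the analysis and from the previous cycle, each measured against the verifying analysis.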
In more concrete terms within GDAS and global-workflow, the variables in the EFSOI equation are represented as follows:
where the green terms are stored in the initial "osense file" generated during the EFSOI-specific ensemble update task (`gdaseupdfsoi`) for a given cycle t0, and the values used for the forecast perturbation (the red term) come from 24-hour ensemble member forecasts initialized with the analysis at t0 that is also generated by `gdaseupdfsoi`. The forecast errors (the blue terms) are calculated using the 24-hour forecast ensemble mean, the 30-hour forecast ensemble mean from the cycle t0-6hr (which is functionally the same as a 24-hour forecast initialized with the background at t0), and the verifying analysis from t0+24hr. Both the 24-hour and 30-hour ensemble forecasts are run at the same time for t0 with the metatask `gdasefmnfsoi`: the 24-hour forecast is for the EFSOI calculation for cycle t0, and the 30-hour forecast is for cycle t0+6hr. The ensemble means are generated with `gdasepmnfsoi`. Note that the 24-hour forecasts used are specific to the global model; regional models may use shorter forecasts for the same purpose.
The osense file
The osense file is output by both the EnKF executable during the ensemble update task and the EFSOI executable during the efsoi task. The update writes the statistical information for each assimilated observation that is required to perform the EFSOI calculation. The EFSOI executable reads this file in and then overwrites it with the same information plus the observation sensitivities.
The following tables document the contents of the osense file. The first covers the single header, with variables common to all observations, and the second describes the record for each observation. The conventional and ozone observations are generally handled separately from the satellite observations, though the record format is the same. The variable names are those used in the EnKF/EFSOI code, and by convention the same names are used in the accompanying Python scripts.
Type | Variable Name | Description |
---|---|---|
real(r_single) | obfit_prior | Observation fit to the first guess |
real(r_single) | obsprd_prior | Spread of observation prior |
real(r_single) | ensmean_obnobc | Ensemble mean first guess (no bias correction) |
real(r_single) | ensmean_ob | Ensemble mean first guess (bias corrected) |
real(r_single) | ob | Observation value |
real(r_single) | oberrvar | Observation error variance |
real(r_single) | lon | Longitude |
real(r_single) | lat | Latitude |
real(r_single) | pres | Pressure |
real(r_single) | time | Observation time |
real(r_single) | oberrvar_orig | Original error variance |
integer(i_kind) | stattype | Observation type |
character(len=20) | obtype | Observation element / Satellite name |
integer(i_kind) | indxsat | Satellite index (channel) set to zero |
real(r_single) | osense_kin | Observation sensitivity (kinetic energy) [J/kg] |
real(r_single) | osense_dry | Observation sensitivity (Dry total energy) [J/kg] |
real(r_single) | osense_moist | Observation sensitivity (Moist total energy) [J/kg] |
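The per-observation record above can be mirrored as a NumPy structured dtype, which is convenient for aggregating sensitivities by observation type. This is a sketch under assumptions: `r_single` and `i_kind` are taken to be 4 bytes each, and the actual on-disk layout of the osense file (record markers, byte order, field ordering) is not confirmed here. The helper `total_impact_by_obtype` and the synthetic records, including the `obtype` values, are hypothetical.

```python
import numpy as np

# Field names and order follow the table; sizes assume 4-byte r_single and i_kind.
# The real file layout (record markers, byte order) is NOT verified here.
OSENSE_RECORD = np.dtype([
    ("obfit_prior", "f4"),
    ("obsprd_prior", "f4"),
    ("ensmean_obnobc", "f4"),
    ("ensmean_ob", "f4"),
    ("ob", "f4"),
    ("oberrvar", "f4"),
    ("lon", "f4"),
    ("lat", "f4"),
    ("pres", "f4"),
    ("time", "f4"),
    ("oberrvar_orig", "f4"),
    ("stattype", "i4"),
    ("obtype", "S20"),
    ("indxsat", "i4"),
    ("osense_kin", "f4"),
    ("osense_dry", "f4"),
    ("osense_moist", "f4"),
])


def total_impact_by_obtype(records, field="osense_moist"):
    """Sum one sensitivity field over all records, grouped by obtype."""
    totals = {}
    for rec in records:
        # Fortran character fields are space padded, so strip before grouping.
        key = rec["obtype"].decode().strip()
        totals[key] = totals.get(key, 0.0) + float(rec[field])
    return totals


# Synthetic records for illustration only; not real osense data.
records = np.zeros(3, dtype=OSENSE_RECORD)
records["obtype"] = [b"t", b"amsua_n19", b"t"]
records["osense_moist"] = [-0.5, -0.25, 0.125]
print(total_impact_by_obtype(records))
```

Grouping on `obtype` (or on `stattype` for conventional observations) gives the kind of per-instrument totals that impact summaries are usually built from.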