Add option to use netcdf global_berror files #810

Merged

@RussTreadon-NOAA RussTreadon-NOAA commented Nov 27, 2024

Description
This PR adds the option for gsi.x to read netcdf format global_berror files.

Resolves #808

Depends on

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

ctests run on WCOSS2 (Dogwood) with Passed result for all tests.

NetCDF format global_berror files tested for various resolutions on Dogwood. Runs using netcdf and binary global_berror files yield bitwise identical analysis increment files.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • New and existing tests pass with my changes

RussTreadon-NOAA commented Nov 27, 2024

This PR will be changed to Ready for review once

  • GSI-fix PR #25 is merged into develop
  • g-w issue #3128 is closed
  • the fix submodule hash in RussTreadon-NOAA:feature/netcdf_berror is updated
  • GSI_BINARY_SOURCE_DIR is updated to point at the new directory for staged non-ASCII GSI fix files

CoryMartin-NOAA
CoryMartin-NOAA previously approved these changes Dec 2, 2024
@RussTreadon-NOAA

Install RussTreadon-NOAA:feature/netcdf_berror at 0e07d71 and develop at 9ada88e on Dogwood, Hera, Hercules, and Orion. Run ctests with the following results

Dogwood (WCOSS2)

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/berror/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  1207.91 sec
2/6 Test #6: global_enkf ......................   Passed  1331.62 sec
3/6 Test #5: hafs_3denvar_hybens ..............   Passed  1633.44 sec
4/6 Test #4: hafs_4denvar_glbens ..............   Passed  1813.80 sec
5/6 Test #1: global_4denvar ...................   Passed  2523.63 sec
6/6 Test #2: rtma .............................   Passed  3010.41 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 3010.53 sec

Orion

Test project /work2/noaa/da/rtreadon/git/gsi/pr810/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  967.15 sec
2/6 Test #6: global_enkf ......................   Passed  1207.74 sec
3/6 Test #2: rtma .............................   Passed  1867.69 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  3085.11 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  3267.06 sec
6/6 Test #1: global_4denvar ...................***Failed  4202.53 sec

83% tests passed, 1 tests failed out of 6

Total Test time (real) = 4202.56 sec

The global_4denvar test failed due to

The runtime for global_4denvar_hiproc_updat is 881.487325 seconds.  This has exceeded maximum allowable threshold time of 815.596658 seconds, resulting in Failure of timethresh2 the regression test.

gsi.x wall times are known to be highly variable on Orion, especially when running in the /work/noaa/stmp fileset. This is not a fatal failure.

Hera

Test project /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr810/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  858.84 sec
2/6 Test #6: global_enkf ......................   Passed  2735.79 sec
3/6 Test #5: hafs_3denvar_hybens ..............   Passed  3085.84 sec
4/6 Test #4: hafs_4denvar_glbens ..............   Passed  3624.06 sec
5/6 Test #1: global_4denvar ...................   Passed  3727.23 sec
6/6 Test #2: rtma .............................   Passed  3850.52 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 3850.55 sec

Hercules
All GSI ctests are still pending on Hercules. The machine has 2803 jobs in the queue. 103 are running. 2700 are pending. Not sure when the Hercules ctests will complete.

@RussTreadon-NOAA

Hercules

Test project /work/noaa/da/rtreadon/git/gsi/pr810/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  27630.52 sec
2/6 Test #6: global_enkf ......................   Passed  29783.16 sec
3/6 Test #2: rtma .............................   Passed  31817.00 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  31827.25 sec
5/6 Test #1: global_4denvar ...................   Passed  31994.43 sec
6/6 Test #4: hafs_4denvar_glbens ..............***Failed  32437.47 sec

83% tests passed, 1 tests failed out of 6

Total Test time (real) = 32437.51 sec

The hafs_4denvar_glbens failure is due to

The runtime for hafs_4denvar_glbens_loproc_updat is 368.637720 seconds.  This has exceeded maximum allowable threshold time of 348.262471 seconds, resulting in Failure time-thresh of the regression test.

The runtime for hafs_4denvar_glbens_hiproc_updat is 264.768721 seconds.  This has exceeded maximum allowable threshold time of 253.254322 seconds, resulting in Failure of timethresh2 the regression test.

This is not a fatal failure. The /work/noaa/stmp fileset is known to impact executable wall time.
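For a sense of scale, the overruns quoted in the timing messages in this thread (one on Orion, two on Hercules) can be expressed as a percentage over the threshold. The sketch below only does arithmetic on the reported numbers; the function name `percent_over` is made up for illustration and is not part of the GSI regression scripts.

```python
def percent_over(runtime_s, threshold_s):
    """Return how far runtime exceeds threshold, as a percentage of the threshold."""
    return 100.0 * (runtime_s - threshold_s) / threshold_s


# (runtime, threshold) pairs in seconds, copied from the messages in this thread.
reported = {
    "global_4denvar_hiproc_updat (Orion)": (881.487325, 815.596658),
    "hafs_4denvar_glbens_loproc_updat (Hercules)": (368.637720, 348.262471),
    "hafs_4denvar_glbens_hiproc_updat (Hercules)": (264.768721, 253.254322),
}

for name, (runtime_s, threshold_s) in reported.items():
    print(f"{name}: {percent_over(runtime_s, threshold_s):.1f}% over threshold")
```

All three overruns come out below 10% of the threshold, which is consistent with the wall-time variability noted for the /work/noaa/stmp fileset.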

@RussTreadon-NOAA

Additional tests on Dogwood

Run the following cases:

  1. C24L31 with C12L31 ensemble
  2. C96L127
  3. C192L127
  4. C384L127
  5. C768L127 with C384L127 ensemble

Cases 2 through 4 were run twice: (a) develop gsi.x using the binary global_berror, (b) feature/netcdf_berror gsi.x using the netcdf global_berror. For each case, the analysis results are identical across the two executables and global_berror formats.

The two jobs for Case 5 remain pending in the queue.

Case 1 is the ultra low-resolution case to be used in g-w CI. The feature/netcdf_berror gsi.x successfully ran to completion using the L31 netcdf global_berror file.

@RussTreadon-NOAA RussTreadon-NOAA marked this pull request as ready for review December 3, 2024 18:16
@RussTreadon-NOAA

Case 5, C768L127 with C384L127 ensemble, is done. feature/netcdf_berror gsi.x using the netcdf global_berror generates analysis increments that are bitwise identical to those generated by develop gsi.x using the binary global_berror.
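For anyone wanting to repeat a bitwise-identical check on two output files, one generic approach is to compare cryptographic hashes of the file contents. This is a minimal sketch and not necessarily the method used for the results in this PR; the helper names `sha256_of` and `bitwise_identical` are hypothetical, and the caller supplies the paths of the two increment files.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def bitwise_identical(path_a, path_b):
    """True when both files hash to the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Note that this compares whole files, so two files holding the same field values but different metadata (for example a differing global attribute) would be reported as different.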

@RussTreadon-NOAA

@danholdaway , @DavidNew-NOAA , and @CatherineThomas-NOAA

This PR adapts code from the JEDI GSIBEC to allow gsi.x to read netcdf format global_berror files. EIB added the netcdf global_berror files to their staged GSI directory. gsi.x determines the file format, netcdf or binary, upon reading local file berror_stats in the run directory.
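To illustrate what detecting the format at read time can look like, here is a small sketch that inspects the leading bytes of a file. It is an assumption for illustration only: this thread does not describe how gsi.x makes the decision, and the Fortran code in this PR may do it differently (for example by attempting a netCDF open). The function name `berror_format` is hypothetical. The byte signatures themselves are standard: classic-model netCDF files begin with `CDF` followed by a version byte, and netCDF-4 files are HDF5 files, which begin with an 8-byte HDF5 signature.

```python
# Leading bytes of classic, 64-bit-offset, and CDF-5 netCDF files: "CDF" + version.
NETCDF_CLASSIC_VERSIONS = (b"\x01", b"\x02", b"\x05")
# netCDF-4 files are HDF5 containers and start with the HDF5 signature.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def berror_format(path):
    """Classify a file as 'netcdf' or 'binary' from its first 8 bytes.

    Simplification: an HDF5 file with a user block carries its signature at a
    later offset; that case is reported as 'binary' here.
    """
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(HDF5_SIGNATURE):
        return "netcdf"
    if head[:3] == b"CDF" and head[3:4] in NETCDF_CLASSIC_VERSIONS:
        return "netcdf"
    return "binary"
```

With a check of this kind, the same local file name (berror_stats in the run directory) can hold either format without a new namelist switch, which matches the behavior described above.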

@CoryMartin-NOAA created netcdf global_berror files for C24L31 and C12L31. He also created a C24L31 deterministic background along with a 3-member C12L31 ensemble. These L31 netcdf global_berror files are in EIB's staged directory along with netcdf global_berror files for other resolutions. The updated fix hash in this PR brings in L31 ascii fix files. I successfully ran 3DEnVar gsi.x with a C24 deterministic background and a 3-member C12 ensemble.

GSI code review policy requires two peer reviews and approvals before PRs can be merged into develop. If any of you have time, a review would be helpful. Thanks.

@RussTreadon-NOAA

@DavidNew-NOAA or @CatherineThomas-NOAA , do you have time to review this PR? If not, let me know and I'll reach out to other developers.

@DavidNew-NOAA DavidNew-NOAA left a comment

Looks fine to me, with the caveat that I don't know my way around GSI very well.

@RussTreadon-NOAA

Thank you @DavidNew-NOAA !

@RussTreadon-NOAA

Install RussTreadon-NOAA:feature/netcdf_berror at 63df6ff on Dogwood, Hera, and Orion. Use develop at 9ada88e as contrl. RussTreadon-NOAA:feature/netcdf_berror is updat. Run ctests with the following results

Dogwood

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr810/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  907.34 sec
2/6 Test #6: global_enkf ......................   Passed  971.99 sec
3/6 Test #4: hafs_4denvar_glbens ..............   Passed  1454.36 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  1454.78 sec
5/6 Test #2: rtma .............................   Passed  1929.28 sec
6/6 Test #1: global_4denvar ...................   Passed  2822.58 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 2822.59 sec

Hera

Test project /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr810/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  495.79 sec
2/6 Test #6: global_enkf ......................   Passed  761.26 sec
3/6 Test #2: rtma .............................   Passed  971.18 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  1226.19 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  1349.04 sec
6/6 Test #1: global_4denvar ...................   Passed  1926.24 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 1926.26 sec

Orion

Test project /work2/noaa/da/rtreadon/git/gsi/pr810/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #6: global_enkf ......................   Passed  909.31 sec
2/6 Test #3: rrfs_3denvar_rdasens .............   Passed  1026.53 sec
3/6 Test #2: rtma .............................   Passed  1687.96 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  2719.90 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  3021.75 sec
6/6 Test #1: global_4denvar ...................   Passed  3722.15 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 3722.17 sec

In the above tests, RussTreadon-NOAA:feature/netcdf_berror gsi.x was run using the binary global_berror file. Repeat the above ctests using the netcdf global_berror when running RussTreadon-NOAA:feature/netcdf_berror gsi.x. Results are as follows:

Dogwood

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr810/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 5: hafs_3denvar_hybens
    Start 4: hafs_4denvar_glbens
    Start 6: global_enkf
    Start 3: rrfs_3denvar_rdasens
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  1026.80 sec
2/6 Test #6: global_enkf ......................   Passed  1223.14 sec
3/6 Test #5: hafs_3denvar_hybens ..............   Passed  1395.08 sec
4/6 Test #2: rtma .............................   Passed  1510.72 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  1634.73 sec
6/6 Test #1: global_4denvar ...................   Passed  2043.95 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 2044.07 sec

Hera

era(hfe05):/scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr810/build$ tail -f stdout_ctest_netcdf.txt
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 2: rtma
    Start 6: global_enkf
    Start 3: rrfs_3denvar_rdasens
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  496.49 sec
2/6 Test #6: global_enkf ......................   Passed  765.35 sec
3/6 Test #2: rtma .............................   Passed  970.30 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  1117.35 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  1350.57 sec
6/6 Test #1: global_4denvar ...................   Passed  1928.94 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 1928.97 sec

Orion

Test project /work2/noaa/da/rtreadon/git/gsi/pr810/build
    Start 1: global_4denvar
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 6: global_enkf
1/6 Test #6: global_enkf ......................   Passed  788.22 sec
2/6 Test #3: rrfs_3denvar_rdasens .............   Passed  966.63 sec
3/6 Test #2: rtma .............................   Passed  1748.11 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  2780.33 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  3020.22 sec
6/6 Test #1: global_4denvar ...................   Passed  3662.79 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 3662.81 sec

Additionally, the develop and RussTreadon-NOAA:feature/netcdf_berror gsi.x were run using input from the operational 20241203 06Z GDAS at analysis grid resolutions of C96, C192, and C384. The RussTreadon-NOAA:feature/netcdf_berror gsi.x was run twice: (1) binary global_berror, (2) netcdf global_berror. Analysis results from the three runs at each resolution are identical. Wall times are comparable. The sizes of the stdout files are comparable.

The operational 20241206 06Z GDAS was run at C768. Again, identical analysis results are generated with comparable wall times and stdout sizes.

@RussTreadon-NOAA RussTreadon-NOAA merged commit 6fd4bd5 into NOAA-EMC:develop Dec 10, 2024
4 checks passed
@RussTreadon-NOAA RussTreadon-NOAA deleted the feature/netcdf_berror branch December 10, 2024 20:43
Successfully merging this pull request may close these issues.

Extend GSI to read netcdf global_berror files