Iasi ng fov bugs #832

Merged
merged 4 commits into from
Feb 3, 2025

Conversation

wx20jjung
Contributor

@wx20jjung wx20jjung commented Feb 1, 2025

Description
An error was discovered in the scan position calculation in read_iasing.f90. The initial proxy data set did not contain enough information to calculate the scan position properly: its field of view (FOV) spanned only 1-16, and it had no field of regard. In the latest data set the FOV spans 1-224, which is consistent with the FOV EUMETSAT plans to provide. The error was discovered using this new data set. This PR fixes #830.

Resolves #830

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?
NESDIS generated two IASI-NG bufr files with the correct field of view (FOV) range, 1-224. NCEP processed these bufr files into "dump" files. With these files, I determined that there was a problem in the logic that derives the scan position. The problem was traced to the output of the mod() function: when mod() returns 0, the value should be a multiple of the 4th scan position. This is now correct. These tests were conducted with the two file sets on S4 at C96 resolution.
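
As an illustration of the mod() pitfall described above, here is a minimal sketch. It is not the code in read_iasing.f90; the function names are hypothetical, and it assumes positions are numbered 1-4 within each group of four. Like Fortran's mod(), the shell `%` operator returns 0 for exact multiples of 4, so a remainder of 0 has to be treated as the 4th position.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not taken from read_iasing.f90.
# Assumption: positions are numbered 1-4 within each group of four.

# Naive version: an exact multiple of 4 yields 0, which is outside 1-4.
position_naive() {
  echo $(( $1 % 4 ))
}

# Corrected version: a remainder of 0 is mapped to the 4th position.
position_fixed() {
  local r=$(( $1 % 4 ))
  if (( r == 0 )); then
    r=4
  fi
  echo "$r"
}

for n in 1 2 3 4 5 8; do
  echo "n=$n naive=$(position_naive "$n") fixed=$(position_fixed "$n")"
done
```

For n=4 and n=8 the naive version prints 0, while the corrected version prints 4.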

Reproducing these results requires modifications to the global-workflow, the CRTM and the global_satinfo file. The global-workflow needs logic to copy the bufr file(s) into its workspace. The CRTM version must include the iasi-ng and metimage coefficient files. The satinfo file must also contain an iasi-ng_metop-sg_a1 channel selection.

Another issue was found with the FOV information in the bufr files. The FOV entries are transposed from their expected locations. The NESDIS bufr file FOV entries are:
1 5 9 13
2 6 10 14
3 7 11 15
4 8 12 16

EUMETSAT's documentation has the FOV entries as:
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16

This inconsistency was accounted for in determining the scan position.
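
The relationship between the two layouts can be sketched as follows. This is only an illustration under stated assumptions, not the logic in read_iasing.f90: the function names are hypothetical, the two grids above are read as the same 4x4 arrangement numbered column-wise (NESDIS) versus row-wise (EUMETSAT), FOVs 1-224 are assumed to come in consecutive blocks of 16 (one per field of regard), and each grid column is assumed to be one scan position, giving the 1-56 range named in the linked issue.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; function names are hypothetical and this is
# not the logic in read_iasing.f90.

# Map a local FOV number (1-16) in the NESDIS (column-wise) layout to the
# number found at the same grid location in the EUMETSAT (row-wise) layout.
nesdis_to_eumetsat() {
  local n=$1
  local row=$(( (n - 1) % 4 ))   # 0-based row in the NESDIS grid
  local col=$(( (n - 1) / 4 ))   # 0-based column in the NESDIS grid
  echo $(( row * 4 + col + 1 ))
}

# Assumptions: FOVs 1-224 are numbered in consecutive blocks of 16 (one
# block per field of regard), and each grid column is one scan position.
scan_position() {
  local fov=$1
  local block=$(( (fov - 1) / 16 ))          # 0-based field of regard, 0-13
  local local_fov=$(( (fov - 1) % 16 + 1 ))  # 1-16 within the field of regard
  local eum
  eum=$(nesdis_to_eumetsat "$local_fov")
  local col=$(( (eum - 1) % 4 + 1 ))         # 1-4 within the field of regard
  echo $(( block * 4 + col ))
}

echo "NESDIS FOV 5 sits where EUMETSAT has FOV $(nesdis_to_eumetsat 5)"
echo "FOV 224 maps to scan position $(scan_position 224)"
```

Under these assumptions, FOV 1 maps to scan position 1 and FOV 224 maps to scan position 56.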

I have also replaced the bufr mnemonics with those in EUMETSAT's published list. Documentation suggests these mnemonics will be used for IASI-NG distribution, including DBNet. The EUMETSAT documentation also shows that the number of channels distributed will follow a delayed replication entry. The documentation for the various metop-sg instruments is available at:

https://user.eumetsat.int/resources/user-guides/metop-sg-test-data

This site also includes the EUMETSAT metop-sg instrument bufr tables.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • New and existing tests pass with my changes
    This code is currently NOT used by the GSI.

This code passes the ctests on hera. Orion continues to hang.

Hera stats:
[Jim.Jung@hfe04 build]$ ctest -j 6
Test project /scratch1/NCEPDEV/jcsda/Jim.Jung/save/ctests/update/build
Start 1: global_4denvar
Start 2: rtma
Start 3: rrfs_3denvar_rdasens
Start 4: hafs_4denvar_glbens
Start 5: hafs_3denvar_hybens
Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens ............. Passed 1045.33 sec
2/6 Test #6: global_enkf ...................... Passed 1515.30 sec
3/6 Test #4: hafs_4denvar_glbens .............. Passed 2368.49 sec
4/6 Test #1: global_4denvar ................... Passed 2446.88 sec
5/6 Test #5: hafs_3denvar_hybens .............. Passed 2618.69 sec
6/6 Test #2: rtma ............................. Passed 2774.23 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 2774.27 sec

Orion tests:
Test project /work2/noaa/nesdis-rdo1/jjung/ctests/update/build
Start 1: global_4denvar
Start 2: rtma
Start 3: rrfs_3denvar_rdasens
Start 4: hafs_4denvar_glbens
Start 5: hafs_3denvar_hybens
Start 6: global_enkf
1/6 Test #6: global_enkf ...................... Passed 728.02 sec
2/6 Test #3: rrfs_3denvar_rdasens ............. Passed 968.65 sec
3/6 Test #2: rtma ............................. Passed 1627.87 sec
4/6 Test #5: hafs_3denvar_hybens .............. Passed 2734.22 sec
5/6 Test #4: hafs_4denvar_glbens .............. Passed 3031.66 sec

I continue to have problems on Orion: the global_4denvar test does not complete.

@wx20jjung
Contributor Author

@DavidHuber-NOAA , @ADCollard Would you review my iasi-ng_fov_bugs branch?

Collaborator

@DavidHuber-NOAA DavidHuber-NOAA left a comment

Changes look OK syntactically, though I am not familiar with the science changes here.

@RussTreadon-NOAA
Contributor

@wx20jjung, when you say global_4denvar does not complete on Orion, do you mean one of the following?

  1. gsi.x aborts with an execution error
  2. gsi.x is terminated because the specified wall clock limit is exceeded

If the failure is due to 1, we obviously need to investigate the failure and fix it.

If the failure is due to 2, this is a known feature following the Orion Rocky 8 upgrade. Increasing the global_4denvar wall time in regression/regression_param.sh allows the job to complete.

I ran ctests for your PR on Orion. My run of global_4denvar yielded the following Orion gsi.x wall times

global_4denvar_hiproc_contrl/stdout:The total amount of wall time                        = 743.307286
global_4denvar_hiproc_updat/stdout:The total amount of wall time                        = 734.357374
global_4denvar_loproc_contrl/stdout:The total amount of wall time                        = 958.980489
global_4denvar_loproc_updat/stdout:The total amount of wall time                        = 962.058930
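
As a rough check, the wall times above can be compared with a time limit expressed in seconds. The sketch below is only illustrative: `all_within_limit` is a hypothetical helper, not part of the regression scripts, and the values are copied by hand from the stdout lines above. The gsi.x wall time is also only part of the total job time, so this comparison is a lower bound.

```shell
#!/usr/bin/env bash
# Wall times (seconds) reported above for the four global_4denvar runs.
times="743.307286 734.357374 958.980489 962.058930"

# Hypothetical helper: print "yes" if every time given after the first
# argument is below the limit (in seconds), otherwise print "no".
all_within_limit() {
  local limit=$1
  shift
  printf '%s\n' "$@" |
    awk -v limit="$limit" '$1 >= limit { bad = 1 } END { print (bad ? "no" : "yes") }'
}

# $times is left unquoted on purpose so that it splits into four arguments.
echo "within 10 minutes (600 s):  $(all_within_limit 600 $times)"
echo "within 20 minutes (1200 s): $(all_within_limit 1200 $times)"
```

All four runs exceed 600 seconds and finish well under 1200 seconds, so the first line reports "no" and the second "yes".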

We should increase the Orion global_4denvar wall time in regression/regression_param.sh from 10 minutes to 20 minutes.

--- a/regression/regression_param.sh
+++ b/regression/regression_param.sh
@@ -67,8 +67,8 @@ case $regtest in
            topts[1]="0:10:00" ; popts[1]="12/8/" ; ropts[1]="/1"
            topts[2]="0:10:00" ; popts[2]="12/10/" ; ropts[2]="/2"
         elif [[ "$machine" = "Orion" ]]; then
-           topts[1]="0:10:00" ; popts[1]="12/8/" ; ropts[1]="/1"
-           topts[2]="0:10:00" ; popts[2]="12/12/" ; ropts[2]="/2"
+           topts[1]="0:20:00" ; popts[1]="12/8/" ; ropts[1]="/1"
+           topts[2]="0:20:00" ; popts[2]="12/12/" ; ropts[2]="/2"
         elif [[ "$machine" = "Hercules" ]]; then
            topts[1]="0:10:00" ; popts[1]="12/8/" ; ropts[1]="/1"
            topts[2]="0:10:00" ; popts[2]="12/12/" ; ropts[2]="/2"

With a 20-minute global_4denvar wall time on Orion, all ctests pass:

Test project /work2/noaa/da/rtreadon/git/gsi/pr832/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #6: global_enkf ......................   Passed  730.70 sec
2/6 Test #3: rrfs_3denvar_rdasens .............   Passed  970.14 sec
3/6 Test #2: rtma .............................   Passed  1629.13 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  2783.55 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  2962.62 sec
6/6 Test #1: global_4denvar ...................   Passed  3604.25 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 3604.95 sec

@RussTreadon-NOAA
Contributor

WCOSS2 ctests

Installed wx20jjung:iasi-ng_fov_bugs at bcc9ef1 and develop at e374f91 on Cactus. Ran ctests with the following results:

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr832/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  851.50 sec
2/6 Test #6: global_enkf ......................   Passed  1217.44 sec
3/6 Test #5: hafs_3denvar_hybens ..............   Passed  1340.91 sec
4/6 Test #4: hafs_4denvar_glbens ..............   Passed  1401.33 sec
5/6 Test #2: rtma .............................   Passed  1634.90 sec
6/6 Test #1: global_4denvar ...................   Passed  1983.39 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 1983.41 sec

Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Can't comment on the accuracy of the changes but assume @wx20jjung knows what mnemonics to change and how to use them.

Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

The doubling of the global_4denvar wall time on Orion is correct.

Approve.

@wx20jjung
Contributor Author

I changed the time limits for global_4denvar as per Russ' instructions. All the ctests now pass for me on orion.
orion-login-4[29] jjung$ ctest -j 6
Test project /work2/noaa/nesdis-rdo1/jjung/ctests/update/build
Start 1: global_4denvar
Start 2: rtma
Start 3: rrfs_3denvar_rdasens
Start 4: hafs_4denvar_glbens
Start 5: hafs_3denvar_hybens
Start 6: global_enkf
1/6 Test #6: global_enkf ...................... Passed 727.96 sec
2/6 Test #3: rrfs_3denvar_rdasens ............. Passed 967.03 sec
3/6 Test #2: rtma ............................. Passed 1689.52 sec
4/6 Test #5: hafs_3denvar_hybens .............. Passed 2783.27 sec
5/6 Test #4: hafs_4denvar_glbens .............. Passed 3020.94 sec
6/6 Test #1: global_4denvar ................... Passed 3602.11 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 3602.15 sec

I changed the time options for global_4denvar on Orion from 10 minutes to 20 minutes:
         elif [[ "$machine" = "Orion" ]]; then
            topts[1]="0:20:00" ; popts[1]="12/8/" ; ropts[1]="/1"
            topts[2]="0:20:00" ; popts[2]="12/12/" ; ropts[2]="/2"

These timing changes are now committed to iasi-ng_fov_bugs.

@RussTreadon-NOAA RussTreadon-NOAA merged commit 92165a4 into NOAA-EMC:develop Feb 3, 2025
4 checks passed
@wx20jjung wx20jjung deleted the iasi-ng_fov_bugs branch February 4, 2025 14:15
Development

Successfully merging this pull request may close these issues.

Fix logic in read_iasing for deriving scan position (1-56) from field-of-view (1-224).
3 participants