A. GSI Community Tools¶
BUFR Format and BUFR Tools¶
Under ./util/bufr_tools, there are many Fortran examples to illustrate basic BUFR/PrepBUFR file process skills such as encoding, decoding, and appending. For details of these examples and the BUFR format, please see the BUFR/PrepBUFR Users Guide, which is freely available on-line
http://www.dtcenter.org/com-GSI/BUFR/docs/index.php
The observation BUFR files generated by NCEP (for example, PrepBUFR and BUFR files from NCEP ftp server or gdas1.t12z.prepbufr.nr in tutorial case) are in Big Endian binary format. For release version older than 3.2, the BUFR files have to be converted from Big Endian BUFR file to Little Endian file when used by the GSI on Linux platform. Please refer to older version Users guide on how to convert.
Since release 3.2, BUFRLIB can automatically identify and handle either byte orders. For Intel and PGI compilers on Linux, the Big Endian BUFR/PrepBUFR files can be used by GSI without byte swap.
Read GSI Diagnostic Files¶
Lots of useful information about how one observation was used in the analysis such as innovation, observation values, observation error and adjusted observation error, and quality control information, has been saved in diagnostic files. To generate these diagnosis files, namelist variable write_diag in namelist section &SETUP needs to be true. The write_diag variable has been introduced in Part 4 of Section 3.4. The following is an example of using the write_diag variable to control diagnostic files. When we set the number of outer loops to 2, and set the write_diag namelist variable to the following:
write_diag(1)=.true.,write_diag(2)=.false.,write_diag(3)=.true.,
diag_amsua_metop-a_anl.2014061700 diag_amsua_n18_ges.2014061700
diag_amsua_metop-a_ges.2014061700 diag_conv_anl.2014061700
diag_amsua_n15_anl.2014061700 diag_conv_ges.2014061700
diag_amsua_n15_ges.2014061700 diag_hirs4_metop-a_anl.2014061700
diag_amsua_n18_anl.2014061700 diag_hirs4_metop-a_ges.2014061700
.4in1 read_diag_conv.f90 : Reads the diagnostic files for conventional observations. For example: diag_conv_anl.2014061700 and diag_conv_ges.2014061700
read_diag_rad.f90 : Reads the diagnostic files for radiance observation. For example:
diag_amsua_n15_ges.2014061700 diag_hirs4_metop-a_anl.2014061700
diag_amsua_n18_anl.2014061700 diag_hirs4_metop-a_ges.2014061700
To compile the programs, use the makefile provided:
./make
Note: since information in the GSI include directory is required, the GSI must have been compiled first.
To run read_diag_conv.exe, a namelist file namelist.conv needs to be in the directory along with the executable. The namelist.conv only has two parameters:
&iosetup
infilename='./diag_conv_anl', : The path and name of GSI diagnosis file
outfilename='./results_conv_anl', : The path and name of a text file used to
save the content of the diagnostic file
/
The user can set the test case directory and file diag_conv_anl.2014061700 from section 5.2 as the entry for infilename in the namelist, then run the executable
./read_diag_conv.exe
The results are placed in the file specified by the outfilename entry in the namelist. In this case that would be a file results_conv_anl located in the directory where the executable was run.
Similarly, to run read_diag_rad.exe, the namelist file namelist.rad is needed. It contains the same parameters as namelist.conv but it links to radiance diag files. After setting it to use the same case from section 5.2, such as:
&iosetup
infilename='(test directory)/diag_amsua_n18_ges.2014061700',
outfilename='./results_amsua_n18_ges',
/
cdiagbuf = station id
rdiagbuf(1) = observation type
rdiagbuf(2) = observation subtype
rdiagbuf(3) = observation latitude (degrees)
rdiagbuf(4) = observation longitude (degrees)
rdiagbuf(5) = station elevation (meters)
rdiagbuf(6) = observation pressure (hPa)
rdiagbuf(7) = observation height (meters)
rdiagbuf(8) = observation time (hours relative to analysis time)
rdiagbuf(9) = input prepbufr qc or event mark
rdiagbuf(10) = setup qc or event mark (currently qtflg only)
rdiagbuf(11) = read_prepbufr data usage flag
rdiagbuf(12) = analysis usage flag (1=use, -1=not used)
rdiagbuf(13) = nonlinear qc relative weight
rdiagbuf(14) = prepbufr inverse obs error (K**-1)
rdiagbuf(15) = read_prepbufr inverse obs error (K**-1)
rdiagbuf(16) = final inverse observation error (K**-1)
rdiagbuf(17) = observation (K)
rdiagbuf(18) = obs-ges used in analysis (K)
rdiagbuf(19) = obs-ges without bias correction (K)
For wind observations, the content after index 16 (check src/main/setupw.f90) is:
rdiagbuf(17) = earth relative u wind component observation (m/s)
rdiagbuf(18) = earth relative u obs-ges used in analysis (m/s)
rdiagbuf(19) = earth relative u obs-ges w/o bias correction (m/s)
rdiagbuf(20) = earth relative v wind component observation (m/s)
rdiagbuf(21) = earth relative v obs-ges used in analysis (m/s)
rdiagbuf(22) = earth relative v obs-ges w/o bias correction (m/s)
The read_diag_conv.exe reads these arrays and outputs important information in the text file results_conv_anlspecified by the user in the &iosetup namelist. For example:
station obs obs obs obs obs usag obs O-B
ID type time latitude longitude pressure value
ps @ 46047 : 180 -0.17 32.40 240.50 1014.30 1 1014.30 -0.09
t @ 72293 : 120 -0.98 32.85 242.88 996.00 1 297.25 1.26
uv @ 72293 : 220 -0.98 32.85 242.88 996.00 1 3.90 0.38 3.30 2.66
For wind, the last 4 columns are the wind components in the order of: U observation, O-B for U, V observation, O-B for V.
For radiance observations, the data is stored in two arrays: diagbuf and diagbufchan. Their contents are listed as follows (please refer to src/main/setuprad.f90 for more details):
diagbuf(1) = observation latitude (degrees)
diagbuf(2) = observation longitude (degrees)
diagbuf(3) = model (guess) elevation at observation location
diagbuf(4) = observation time (hours relative to analysis time)
diagbuf(5) = sensor scan position
diagbuf(6) = satellite zenith angle (degrees)
diagbuf(7) = satellite azimuth angle (degrees)
diagbuf(8) = solar zenith angle (degrees)
diagbuf(9) = solar azimuth angle (degrees)
diagbuf(10) = sun glint angle (degrees) (sgagl)
diagbuf(11) = surface fractional coverage by water
diagbuf(12) = surface fractional coverage by land
diagbuf(13) = surface fractional coverage by ice
diagbuf(14) = surface fractional coverage by snow
if(.not. retrieval)then
diagbuf(15) = surface temperature over water (K)
diagbuf(16) = surface temperature over land (K)
diagbuf(17) = surface temperature over ice (K)
diagbuf(18) = surface temperature over snow (K)
diagbuf(19) = soil temperature (K)
diagbuf(20) = soil moisture
diagbuf(21) = surface land type
else
diagbuf(15) = SST first guess used for SST retrieval
diagbuf(16) = NCEP SST analysis at t
diagbuf(17) = Physical SST retrieval
diagbuf(18) = Navy SST retrieval
diagbuf(19) = d(ta) corresponding to sstph
diagbuf(20) = d(qa) corresponding to sstph
diagbuf(21) = data type
endif
diagbuf(22) = vegetation fraction
diagbuf(23) = snow depth
diagbuf(24) = surface wind speed (m/s)
! Note: The following quantities are not computed for all sensors
if (.not.microwave) then
diagbuf(25) = cloud fraction (%)
diagbuf(26) = cloud top pressure (hPa)
else
diagbuf(25) = cloud liquid water (kg/m**2)
diagbuf(26) = total column precip. water (km/m**2)
endif
diagbuf(27) = foundation temperature: Tr
diagbuf(28) = diurnal warming: d(Tw) at depth zob
diagbuf(29) = sub-layer cooling: d(Tc) at depth zob
diagbuf(30) = d(Tz)/d(Tr)
diagbufchan include loop through channel i from 1 to nchanl:
diagbufchan(1,i)= observed brightness temperature (K)
diagbufchan(2,i)= observed - simulated Tb with bias corrrection (K)
diagbufchan(3,i)= observed - simulated Tb with no bias correction (K)
diagbufchan(4,i)= inverse observation error
diagbufchan(5,i)= quality control mark or event indicator
diagbufchan(6,i)= surface emissivity
diagbufchan(7,i)= stability index
diagbufchan(8,i)= d(Tb)/d(Ts)
do j=1,npred+1
diagbufchan(7+j,i)= Tb bias correction terms (K)
end do
In the sample output file results_amsua_n18_ges, only the observation location and time are written in the file. Users can write out other information based on the list.
Read and Plot Convergence Information from fort.220¶
In section 4.6, we introduced how to check the convergence information in the fort.220 file. Further detail on the fort.220 convergence information can be found in the Advanced Users Guide. Here, we provide tools to filter this file and to plot the values of the cost function and the norm of gradient during each iteration.
These tools - one ksh script and one ncl script - are in: ./util/Analysis_Utilities/plot_cost_grad directory:
The ksh script, filter_fort220.ksh, only has one line:
grep 'cost,grad,step,b' fort.220 | sed -e 's/cost,grad,step,b,step? = //g' | sed -e 's/good//g' > cost_gradient.txt
ncl GSI_cost_gradient.ncl
The pdf file GSI_cost_gradient.pdf is created. The pdf file contains plots of the convergence of the GSI analysis like those in section 4.6.
Plot Single Observation Test Result and Analysis Increment¶
[taba1]
The main difference between the ARW and NMM core used in GSI is that ARW is on a C grid, while NMM is on an E grid. GSI_singleobs_nmm.ncl calls fill_nmm_grid2.ncl to convert the E grid to an A grid for plotting, while GSI_singleobs_arw.ncl itself includes a C grid to A grid convertor.
Before running ncl scripts, users need to set up two links:
- cdf_analysis - Link to analysis result in NetCDF format (wrf_inout)
- cdf_bk - Link to background file in netCDF format
http://www.ncl.ucar.edu/
Generate initial regional ensembles¶
Under the ./util/EnKF/arw/src directory, there are two sub-directories: enspreproc_regional.fd/ and initialens_regional.fd/. The first one is to extract ensemble pertubations from GDAS 80 member ensembles and the second one is to add the extracted ensembles to a regional WRF background field (considered as the mean filed) to generate initial regional ensembles.
Before using these two unitilies, you should have already sucessfully compiled the GSI and gotten the “gsi.exe” file. After that, under the build directory, type “cmake -DBUILD_UTIL_COM path_to_comGSI ” to compile the utilities. A sucessful compilation should yield “enspreproc.exe” and “initialens.exe” in the bin/ directory.
Now, the next step is to get GDAS spectrally smoothed atmospheric ensemble forecasts. These files should be in the sigma format, which is currrenlty the only format supported by “enspreproc.exe”. You need to contact NCEP or other appropriate contacts to download these kind of ensembles. These ensemble files follow the name convection of “sfg_$CDATE_fhr$FEs_mem$MEM”. $CDATE is the cycle date, such as 2017011518 which means 18z of Jan. 15th, 2017. $FE is the forecast hour, for example, 06 means 6 hour of forecasts. $MEM is the member number. Here is an example of GDAS ensmbles: sfg_2017011518_fhr06s_mem001.
After you download the required GDAS ensembles, follow the following steps:
1. Running “enspreproc.exe”, enter the enspreproc_regional.fd/ directory:
(1). generate the file “fileslist01”. This file lists the ensemble files to be used in the calculation of ensemble pertubations. For example, if it is determined to use 20 members to generate ensemble perturbations, the file “filelist01” will be as follows:
sfg_2017011518_fhr06s_mem001
sfg_2017011518_fhr06s_mem002
sfg_2017011518_fhr06s_mem003
...
sfg_2017011518_fhr06s_mem018
sfg_2017011518_fhr06s_mem019
sfg_2017011518_fhr06s_mem020
(2). Modify the file “namelist.input”, change “n_ens” to the total number of ensembles to be used.
(3). Copy the “anavinfo” file used by GSI into current directory.
(4). Copy the background WRF file, name it as “wrf_inout”.
(5). Create a job description file, submit the job to get it run in parallel.
After the successful running of “enspreproc.exe”, you will get ensemble perturbations as follows:
en_perts4ars.mem0001
en_perts4ars.mem0002
en_perts4ars.mem0003
...
en_perts4ars.mem0018
en_perts4ars.mem0019
en_perts4ars.mem0020
2. Runnning “initialens.exe”, enter the initialens_regional.fd/ directory:
(1). Modify the file “namelist.input”, change “n_ens” to the total number of ensembles to be used.
(2). Copy wrf_inout to current directry
(3). Copy wrf_inout to wrfinput_d01.mem$MEM files as follows:
cp wrf_inout wrfinput_d01.mem0001
cp wrf_inout wrfinput_d01.mem0002
cp wrf_inout wrfinput_d01.mem0003
...
cp wrf_inout wrfinput_d01.mem0018
cp wrf_inout wrfinput_d01.mem0019
cp wrf_inout wrfinput_d01.mem0020
Be sure that each member has a correspoding wrfinput_d01 file. These files will be updated by “initialens.exe” later.
(4). Link the ensemble perturbations generated by “enspreproc.exe” to current directory. Something like this ln -s ../enspreproc_regional.fd/en_perts4arw.mem*.
(5). Create a job description file, submit the job to get it run in parallel. Please note that only 1 processor is required to run “initialens.exe” but submitting it to run on computing node is a must.
After the sucessful running of “initialens.exe”, all the wrfinput_d01.mem$MEM files are updated with ensemble perturbation added to the background or “mean” state of the original wrf_inout.
Now the initial regionl ensembles have been sucessfully generated.