.. _gsi_tool:

A. GSI Community Tools
======================

BUFR Format and BUFR Tools
--------------------------

Under *./util/bufr_tools*, there are many Fortran examples that illustrate basic
BUFR/PrepBUFR file processing skills such as encoding, decoding, and appending.
For details of these examples and the BUFR format, please see the BUFR/PrepBUFR
User's Guide, which is freely available on-line:

::

  http://www.dtcenter.org/com-GSI/BUFR/docs/index.php

The observation BUFR files generated by NCEP (for example, PrepBUFR and BUFR
files from the NCEP ftp server, or *gdas1.t12z.prepbufr.nr* in the tutorial
case) are in Big Endian binary format. For release versions older than 3.2,
these files had to be converted from Big Endian to Little Endian before being
used by GSI on a Linux platform; please refer to the User's Guide for those
older versions for conversion instructions. Since release 3.2, BUFRLIB can
automatically identify and handle either byte order, so for Intel and PGI
compilers on Linux the Big Endian BUFR/PrepBUFR files can be used by GSI
without a byte swap.

Read GSI Diagnostic Files
-------------------------

A lot of useful information about how each observation was used in the
analysis, such as the innovation, observation value, observation error,
adjusted observation error, and quality control information, is saved in the
diagnostic files. To generate these diagnostic files, the namelist variable
*write_diag* in namelist section *&SETUP* needs to be true. The *write_diag*
variable was introduced in Part 4 of Section 3.4.

The following is an example of using the *write_diag* variable to control the
diagnostic files. When we set the number of outer loops to 2 and set the
*write_diag* namelist variable to the following:

::

  write_diag(1)=.true.,write_diag(2)=.false.,write_diag(3)=.true.,

GSI will write out diagnostic files before the start of the 1st outer loop
(O-B) and after the completion of the 2nd outer loop (O-A).
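In the *&SETUP* namelist this combination looks like the fragment below. This
is only a sketch: *miter* is the namelist variable that sets the number of
outer loops, and all other *&SETUP* entries are omitted here:

::

  &SETUP
    miter=2,
    write_diag(1)=.true.,write_diag(2)=.false.,write_diag(3)=.true.,
  /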
We don't want GSI to write out diagnostic files after the 1st outer loop
because we set *write_diag(2)=.false.*

This is what we set in our example case described in section 5.2. From this
case, we can see the following diagnostic files generated from the GSI
analysis:

::

  diag_amsua_metop-a_anl.2014061700   diag_amsua_n18_ges.2014061700
  diag_amsua_metop-a_ges.2014061700   diag_conv_anl.2014061700
  diag_amsua_n15_anl.2014061700       diag_conv_ges.2014061700
  diag_amsua_n15_ges.2014061700       diag_hirs4_metop-a_anl.2014061700
  diag_amsua_n18_anl.2014061700       diag_hirs4_metop-a_ges.2014061700

All files are identified by a filename containing three elements. The first
element, "diag", indicates these are combined diagnostic files. The second
element identifies the observation type (here, "conv" means conventional
observations from PrepBUFR and "amsua_n15" corresponds to AMSU-A radiance
observations from NOAA-15). The last element identifies which step of the
outer loop the files were generated for: "anl" means the contents were written
after the last outer loop (from *write_diag(3)=.true.*) and "ges" means the
contents were written before the first outer loop (from
*write_diag(1)=.true.*).

To help users read the information from these diagnostic files, we have
provided two Fortran programs in the *./util/Analysis_Utilities/read_diag/*
directory:

*read_diag_conv.f90*
  Reads the diagnostic files for conventional observations, for example
  *diag_conv_anl.2014061700* and *diag_conv_ges.2014061700*

*read_diag_rad.f90*
  Reads the diagnostic files for radiance observations, for example:

  ::

    diag_amsua_n15_ges.2014061700    diag_hirs4_metop-a_anl.2014061700
    diag_amsua_n18_anl.2014061700    diag_hirs4_metop-a_ges.2014061700

To compile the programs, use the makefile provided:

::

  ./make

Note: since information in the GSI *include* directory is required, the GSI
must have been compiled first.
To run *read_diag_conv.exe*, a namelist file *namelist.conv* needs to be in
the directory along with the executable. The *namelist.conv* file only has
two parameters:

::

  &iosetup
    infilename='./diag_conv_anl',     : The path and name of the GSI diagnostic file
    outfilename='./results_conv_anl', : The path and name of a text file used to
                                        save the content of the diagnostic file
  /

The user can set the test case directory and file *diag_conv_anl.2014061700*
from section 5.2 as the entry for *infilename* in the namelist, then run the
executable:

::

  ./read_diag_conv.exe

The results are placed in the file specified by the *outfilename* entry in the
namelist. In this case that would be a file *results_conv_anl* located in the
directory where the executable was run.

Similarly, to run *read_diag_rad.exe*, the namelist file *namelist.rad* is
needed. It contains the same parameters as *namelist.conv* but points to
radiance diag files. After setting it to use the same case from section 5.2,
such as:

::

  &iosetup
    infilename='(test directory)/diag_amsua_n18_ges.2014061700',
    outfilename='./results_amsua_n18_ges',
  /

running the executable creates the text file *results_amsua_n18_ges* specified
by the namelist, in the directory where *read_diag_rad.exe* runs.

For the conventional observations, the data is stored in two arrays:
*rdiagbuf* and *cdiagbuf*.
Their contents are listed as follows, using temperature (check
*src/main/setupt.f90*) as an example:

::

  cdiagbuf     = station id
  rdiagbuf(1)  = observation type
  rdiagbuf(2)  = observation subtype
  rdiagbuf(3)  = observation latitude (degrees)
  rdiagbuf(4)  = observation longitude (degrees)
  rdiagbuf(5)  = station elevation (meters)
  rdiagbuf(6)  = observation pressure (hPa)
  rdiagbuf(7)  = observation height (meters)
  rdiagbuf(8)  = observation time (hours relative to analysis time)
  rdiagbuf(9)  = input prepbufr qc or event mark
  rdiagbuf(10) = setup qc or event mark (currently qtflg only)
  rdiagbuf(11) = read_prepbufr data usage flag
  rdiagbuf(12) = analysis usage flag (1=use, -1=not used)
  rdiagbuf(13) = nonlinear qc relative weight
  rdiagbuf(14) = prepbufr inverse obs error (K**-1)
  rdiagbuf(15) = read_prepbufr inverse obs error (K**-1)
  rdiagbuf(16) = final inverse observation error (K**-1)
  rdiagbuf(17) = observation (K)
  rdiagbuf(18) = obs-ges used in analysis (K)
  rdiagbuf(19) = obs-ges without bias correction (K)

For wind observations, the content after index 16 (check
*src/main/setupw.f90*) is:

::

  rdiagbuf(17) = earth relative u wind component observation (m/s)
  rdiagbuf(18) = earth relative u obs-ges used in analysis (m/s)
  rdiagbuf(19) = earth relative u obs-ges w/o bias correction (m/s)
  rdiagbuf(20) = earth relative v wind component observation (m/s)
  rdiagbuf(21) = earth relative v obs-ges used in analysis (m/s)
  rdiagbuf(22) = earth relative v obs-ges w/o bias correction (m/s)

The *read_diag_conv.exe* program reads these arrays and outputs the important
information in the text file *results_conv_anl* specified by the user in the
*&iosetup* namelist.
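Because the results file is plain text, standard shell tools can post-process
it. The sketch below pulls out only the temperature ("t") records and prints
the station id, observed value, and O-B; the file written here is a
hypothetical stand-in for *results_conv_anl*, and the awk field numbers assume
the column layout of the sample output shown below:

```shell
# Write a small stand-in results file (hypothetical content for this sketch).
cat > results_conv_anl << 'EOF'
ps @ 46047 : 180 -0.17 32.40 240.50 1014.30 1 1014.30 -0.09
t @ 72293 : 120 -0.98 32.85 242.88 996.00 1 297.25 1.26
uv @ 72293 : 220 -0.98 32.85 242.88 996.00 1 3.90 0.38 3.30 2.66
EOF
# Keep temperature records; print station id ($3), obs value ($11), O-B ($12).
awk '$1 == "t" { print $3, $11, $12 }' results_conv_anl
# → 72293 297.25 1.26
```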
For example:

::

  station   obs   obs   obs       obs        obs       usage  obs      O-B
  ID        type  time  latitude  longitude  pressure         value
  ps @ 46047 :  180  -0.17  32.40  240.50  1014.30  1  1014.30  -0.09
  t  @ 72293 :  120  -0.98  32.85  242.88   996.00  1   297.25   1.26
  uv @ 72293 :  220  -0.98  32.85  242.88   996.00  1     3.90   0.38  3.30  2.66

For wind, the last 4 columns are the wind components in the order: U
observation, O-B for U, V observation, O-B for V.

For radiance observations, the data is stored in two arrays: *diagbuf* and
*diagbufchan*. Their contents are listed as follows (please refer to
*src/main/setuprad.f90* for more details):

::

  diagbuf(1)  = observation latitude (degrees)
  diagbuf(2)  = observation longitude (degrees)
  diagbuf(3)  = model (guess) elevation at observation location
  diagbuf(4)  = observation time (hours relative to analysis time)
  diagbuf(5)  = sensor scan position
  diagbuf(6)  = satellite zenith angle (degrees)
  diagbuf(7)  = satellite azimuth angle (degrees)
  diagbuf(8)  = solar zenith angle (degrees)
  diagbuf(9)  = solar azimuth angle (degrees)
  diagbuf(10) = sun glint angle (degrees) (sgagl)
  diagbuf(11) = surface fractional coverage by water
  diagbuf(12) = surface fractional coverage by land
  diagbuf(13) = surface fractional coverage by ice
  diagbuf(14) = surface fractional coverage by snow
  if(.not. retrieval)then
     diagbuf(15) = surface temperature over water (K)
     diagbuf(16) = surface temperature over land (K)
     diagbuf(17) = surface temperature over ice (K)
     diagbuf(18) = surface temperature over snow (K)
     diagbuf(19) = soil temperature (K)
     diagbuf(20) = soil moisture
     diagbuf(21) = surface land type
  else
     diagbuf(15) = SST first guess used for SST retrieval
     diagbuf(16) = NCEP SST analysis at t
     diagbuf(17) = Physical SST retrieval
     diagbuf(18) = Navy SST retrieval
     diagbuf(19) = d(ta) corresponding to sstph
     diagbuf(20) = d(qa) corresponding to sstph
     diagbuf(21) = data type
  endif
  diagbuf(22) = vegetation fraction
  diagbuf(23) = snow depth
  diagbuf(24) = surface wind speed (m/s)
  !  Note: the following quantities are not computed for all sensors
  if (.not.microwave) then
     diagbuf(25) = cloud fraction (%)
     diagbuf(26) = cloud top pressure (hPa)
  else
     diagbuf(25) = cloud liquid water (kg/m**2)
     diagbuf(26) = total column precip. water (kg/m**2)
  endif
  diagbuf(27) = foundation temperature: Tr
  diagbuf(28) = diurnal warming: d(Tw) at depth zob
  diagbuf(29) = sub-layer cooling: d(Tc) at depth zob
  diagbuf(30) = d(Tz)/d(Tr)

The *diagbufchan* array includes a loop over channel *i* from 1 to *nchanl*:

::

  diagbufchan(1,i) = observed brightness temperature (K)
  diagbufchan(2,i) = observed - simulated Tb with bias correction (K)
  diagbufchan(3,i) = observed - simulated Tb with no bias correction (K)
  diagbufchan(4,i) = inverse observation error
  diagbufchan(5,i) = quality control mark or event indicator
  diagbufchan(6,i) = surface emissivity
  diagbufchan(7,i) = stability index
  diagbufchan(8,i) = d(Tb)/d(Ts)
  do j=1,npred+1
     diagbufchan(7+j,i) = Tb bias correction terms (K)
  end do

In the sample output file *results_amsua_n18_ges*, only the observation
location and time are written. Users can write out other information based on
the list above.

Read and Plot Convergence Information from fort.220
---------------------------------------------------

In section 4.6, we introduced how to check the convergence information in the
*fort.220* file. Further details on the *fort.220* convergence information can
be found in the Advanced User's Guide. Here, we provide tools to filter this
file and to plot the values of the cost function and the norm of the gradient
at each iteration. These tools, one ksh script and one ncl script, are in the
*./util/Analysis_Utilities/plot_cost_grad* directory.

The ksh script, *filter_fort220.ksh*, only has one line:

::

  grep 'cost,grad,step,b' fort.220 | sed -e 's/cost,grad,step,b,step? = //g' | sed -e 's/good//g' > cost_gradient.txt

To run *filter_fort220.ksh*, the *fort.220* file needs to be in the same
directory.
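Before pointing the script at a real run, the grep/sed pipeline can be
exercised on a sample line. The *fort.220* entry written below is
hypothetical (the exact numeric layout may differ between GSI versions); the
point is that the filter strips the leading label and the trailing "good",
leaving only the numbers:

```shell
# Hypothetical fort.220 line for this sketch.
echo 'cost,grad,step,b,step? =   1   0  1.23456E+05  6.78901E+02  4.5E-01  0.0 good' > fort.220
# Same pipeline as filter_fort220.ksh.
grep 'cost,grad,step,b' fort.220 \
  | sed -e 's/cost,grad,step,b,step? = //g' \
  | sed -e 's/good//g' > cost_gradient.txt
cat cost_gradient.txt
```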
The script will filter the values of the cost function and the norm of the
gradient at each iteration out of *fort.220* into a text file called
*cost_gradient.txt*.

Once the file *cost_gradient.txt* is ready, run the ncl script
*GSI_cost_gradient.ncl* to generate the plot:

::

  ncl GSI_cost_gradient.ncl

The file *GSI_cost_gradient.pdf* is created. It contains plots of the
convergence of the GSI analysis like those in section 4.6.

Plot Single Observation Test Result and Analysis Increment
----------------------------------------------------------

In Section 4.2, we introduced how to do a single observation test with GSI.
Here we provide users with ncl scripts to plot the single observation test
results.

There are 5 ncl scripts in the *util/Analysis_Utilities/plots_ncl* directory:

.. table:: List of ncl plotting tools

  ==========================  ================================================================
  GSI_singleobs_arw.ncl       Plot single observation test with ARW NetCDF background
  GSI_singleobs_nmm.ncl       Plot single observation test with NMM NetCDF background
  Analysis_increment.ncl      Plot analysis increment from the case with ARW NetCDF background
  Analysis_increment_nmm.ncl  Plot analysis increment from the case with NMM NetCDF background
  fill_nmm_grid2.ncl          E grid to A grid converter
  ==========================  ================================================================

The main difference between the ARW and NMM cores as used in GSI is that ARW
is on a C grid, while NMM is on an E grid. *GSI_singleobs_nmm.ncl* calls
*fill_nmm_grid2.ncl* to convert the E grid to an A grid for plotting, while
*GSI_singleobs_arw.ncl* itself includes a C grid to A grid converter.
Before running the ncl scripts, users need to set up two links:

- *cdf_analysis* : link to the analysis result in NetCDF format (*wrf_inout*)
- *cdf_bk* : link to the background file in NetCDF format

These scripts read in the analysis and background fields of temperature (T),
the U and V components of wind, and moisture (Q), and calculate the difference
of the analysis field minus the background field. Then XY sections (left
column) and XZ sections (right column) are plotted for T, U, V, and Q through
the point that has the maximum analysis increment from the single observation.
The default single observation test here is T; if the user conducts other
single observation tests, the corresponding changes should be made to the
scripts.

The scripts *Analysis_increment.ncl* and *Analysis_increment_nmm.ncl* are very
similar to the ones for the single observation test, but only the XY section
for a certain analysis level is plotted.

For more information on how to use ncl, please check the NCL website:

::

  http://www.ncl.ucar.edu/

Generate Initial Regional Ensembles
-----------------------------------

Under the *./util/EnKF/arw/src* directory, there are two sub-directories:
*enspreproc_regional.fd/* and *initialens_regional.fd/*. The first is used to
extract ensemble perturbations from the 80-member GDAS ensemble, and the
second adds the extracted perturbations to a regional WRF background field
(treated as the mean field) to generate the initial regional ensembles.

Before using these two utilities, you should have already successfully
compiled the GSI and obtained the *gsi.exe* file. After that, under the build
directory, type "cmake -DBUILD_UTIL_COM path_to_comGSI" to compile the
utilities. A successful compilation should yield *enspreproc.exe* and
*initialens.exe* in the *bin/* directory.

Now, the next step is to get the GDAS spectrally smoothed atmospheric ensemble
forecasts.
These files should be in sigma format, which is currently the only format
supported by *enspreproc.exe*. You will need to contact NCEP or another
appropriate source to download these ensembles. The ensemble files follow the
naming convention "sfg_$CDATE_fhr$FEs_mem$MEM", where $CDATE is the cycle date
(for example, 2017011518 means 18z of Jan. 15th, 2017), $FE is the forecast
hour (for example, 06 means a 6-hour forecast), and $MEM is the member number.
Here is an example of a GDAS ensemble file name:
*sfg_2017011518_fhr06s_mem001*.

After you download the required GDAS ensembles, follow these steps:

1. To run *enspreproc.exe*, enter the *enspreproc_regional.fd/* directory:

   (1) Generate the file "filelist01". This file lists the ensemble files to
       be used in the calculation of ensemble perturbations. For example, if
       20 members are used to generate ensemble perturbations, the file
       "filelist01" will be as follows:

       ::

         sfg_2017011518_fhr06s_mem001
         sfg_2017011518_fhr06s_mem002
         sfg_2017011518_fhr06s_mem003
         ...
         sfg_2017011518_fhr06s_mem018
         sfg_2017011518_fhr06s_mem019
         sfg_2017011518_fhr06s_mem020

   (2) Modify the file "namelist.input", changing "n_ens" to the total number
       of ensemble members to be used.

   (3) Copy the "anavinfo" file used by GSI into the current directory.

   (4) Copy the background WRF file in and name it "wrf_inout".

   (5) Create a job description file and submit the job to run in parallel.

   After a successful run of *enspreproc.exe*, you will get ensemble
   perturbations as follows:

   ::

     en_perts4arw.mem0001
     en_perts4arw.mem0002
     en_perts4arw.mem0003
     ...
     en_perts4arw.mem0018
     en_perts4arw.mem0019
     en_perts4arw.mem0020

2. To run *initialens.exe*, enter the *initialens_regional.fd/* directory:

   (1) Modify the file "namelist.input", changing "n_ens" to the total number
       of ensemble members to be used.

   (2) Copy wrf_inout into the current directory.
   (3) Copy wrf_inout to the wrfinput_d01.mem$MEM files as follows:

       ::

         cp wrf_inout wrfinput_d01.mem0001
         cp wrf_inout wrfinput_d01.mem0002
         cp wrf_inout wrfinput_d01.mem0003
         ...
         cp wrf_inout wrfinput_d01.mem0018
         cp wrf_inout wrfinput_d01.mem0019
         cp wrf_inout wrfinput_d01.mem0020

       Be sure that each member has a corresponding wrfinput_d01 file. These
       files will be updated by *initialens.exe* later.

   (4) Link the ensemble perturbations generated by *enspreproc.exe* into the
       current directory, for example:
       *ln -s ../enspreproc_regional.fd/en_perts4arw.mem\**

   (5) Create a job description file and submit the job. Please note that only
       one processor is required to run *initialens.exe*, but it must still be
       submitted to run on a compute node.

After a successful run of *initialens.exe*, all the *wrfinput_d01.mem$MEM*
files are updated with the ensemble perturbations added to the background (or
"mean") state of the original wrf_inout. The initial regional ensembles have
now been successfully generated.
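The repetitive per-member file preparation above (writing "filelist01" in step
1(1) and making the wrfinput_d01 copies in step 2(3)) can be scripted rather
than typed by hand. This is a sketch assuming 20 members and the 2017011518
cycle used in the examples:

```shell
n_ens=20
cdate=2017011518

# Step 1(1): write filelist01 listing the GDAS ensemble members to use.
: > filelist01
m=1
while [ "$m" -le "$n_ens" ]; do
  printf 'sfg_%s_fhr06s_mem%03d\n' "$cdate" "$m" >> filelist01
  m=$((m + 1))
done

# Placeholder so this sketch runs standalone; in practice wrf_inout is
# your real WRF background file.
[ -e wrf_inout ] || : > wrf_inout

# Step 2(3): copy the background to one wrfinput_d01 file per member.
m=1
while [ "$m" -le "$n_ens" ]; do
  cp wrf_inout "$(printf 'wrfinput_d01.mem%04d' "$m")"
  m=$((m + 1))
done
```

Note the member suffix widths differ: three digits in the sfg file names
(mem001) but four digits in the wrfinput_d01 file names (mem0001), matching
the listings above.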