Global Model Test Bed

CCPP Parameterization Documentation Standards

Documentation

Purpose

Scientific and technical documentation is integral to code maintenance and to fostering understanding among stakeholders. Thorough documentation should therefore be required of physics schemes before they are included in the CCPP. This section describes the process that was followed for documenting the operational GFS physics. Doxygen was chosen as the platform for generating human-readable output because of its built-in support for Fortran, its high level of configurability, and its ability to parse in-line comments within the source code. Keeping documentation with the source itself increases the likelihood that the documentation will be updated along with the underlying code. Additionally, version control of the code automatically provides version control of the documentation.

The purpose of this document is to provide an understanding of how to properly document a physics scheme using doxygen in-line comments. It covers what kind of information should be in the documentation, how to mark up the in-line comments so that doxygen will parse them correctly, where to put various comments within the code, and how to configure and run doxygen to generate HTML (or other format) output. Part of this documentation, namely the subroutine argument tables, has functional significance as part of the CCPP infrastructure. These tables must be in a particular format to be parsed by Python scripts that automatically generate a software cap for a given physics scheme. Although the procedure outlined herein is not the only possible one, following it will provide a level of continuity with previously documented schemes.

Information to include in documentation

In order for this documentation to serve both software engineering and scientific purposes, a broad array of information should be included in the documentation as described in the following list:

  1. Brief one or two sentence overview of the parameterization. In some cases, a version number may exist (e.g., SHOC v9), and that information can be included here.
    1. This brief overview gets printed at the top of the webpage that describes the parameterization; e.g. This parameterization calculates the effect of process X on the model variables Y and Z using the method of Scientist 1 and Scientist 2.
  2. General introduction (one or two paragraphs) to the physical parameterization to include
    1. Scientific origin (including references)
    2. Brief scheme history (especially if the scheme has been improved through iteration)
    3. What the scheme calculates
    4. Key features and differentiating points
  3. A list of source files included in the scheme (if more than one)
  4. A diagram depicting calling hierarchy within the scheme (if the scheme has a complex calling structure); e.g. subroutine A calls subroutines B and C. Subroutine C calls subroutine D.
  5. A section describing any intra-physics communication; e.g. scheme X needs variables a, b, and c from scheme Y in order to calculate d and scheme X provides variables e, f, and g to scheme Z
  6. For each subroutine that is an entry point to the scheme, further documentation will include
    1. A brief one-line description
    2. An argument table (CRITICAL) that includes
      1. local variable name
      2. CF-compliant standard name
        1. this is used as a key to allow schemes within the CCPP to work together
      3. short description
      4. units (written with explicit exponents, e.g. m2 s-2 for m2/s2)
      5. rank (0 for scalar, 1 for 1-D array, 2 for 2-D array, etc.)
      6. data type (integer, real, logical, etc.)
      7. kind (e.g. the specified floating point precision kind)
      8. intent (in, out, inout)
      9. optional (T/F)
    3. a general overview of the steps of the algorithm
      1. describe in paragraph or bullet form the gist of what operation is performed (and how) in the subroutine
    4. (if available) a detailed description of the algorithm, including equations, discretizations, etc.
      1. this is the most time-consuming aspect of documenting a scheme, but it is arguably the most valuable information for the scientific community

    Using doxygen

    Doxygen is software that parses in-line comments within source code to produce coherent documentation in a variety of formats, including HTML. It is available on many HPC systems and is readily downloadable/installable on personal machines. The workflow for using doxygen to generate documentation is as follows:

    1. Use specially-formatted in-line comments within source code.
    2. Include and edit a doxygen configuration file to control what source files get parsed, how the source files get parsed, and to format the output.
    3. On a machine with doxygen installed, run the doxygen command from the command line with the doxygen configuration file given as an argument:
      1. > doxygen path_to_config_file/name_of_config_file
      2. Running this command may output compiler-like errors that need to be fixed in order to produce proper output
    4. Output is generated depending on the type specified (HTML, LaTeX, etc.) in the configuration file and is put in a location specified in the configuration file
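
The workflow above can be scripted. The following is a minimal Python sketch, not part of the CCPP tooling: the function names and the example configuration path are hypothetical, and it assumes doxygen writes its warnings to standard error, which is its default behavior when no warning log file is configured.

```python
import shutil
import subprocess


def doxygen_command(config_file):
    """Return the argument list for running doxygen on one configuration file."""
    return ["doxygen", str(config_file)]


def run_doxygen(config_file):
    """Run doxygen and return the non-empty lines it wrote to standard error.

    These lines are the compiler-like warnings and errors that should be
    fixed before the generated output is considered complete.
    """
    if shutil.which("doxygen") is None:
        raise RuntimeError("doxygen executable not found on PATH")
    result = subprocess.run(
        doxygen_command(config_file),
        capture_output=True,
        text=True,
        check=False,
    )
    return [line for line in result.stderr.splitlines() if line.strip()]


if __name__ == "__main__":
    # Hypothetical path; replace with the location of your configuration file.
    for message in run_doxygen("path_to_config_file/name_of_config_file"):
        print(message)
```

Collecting the messages in a list makes it easy to fail a build or a review check whenever the list is not empty.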

    Configuration file

    If starting from scratch, you can generate a default configuration file using the command

    doxygen -g default
    

    Open the configuration file with a text editor and manually edit the options as needed. The format follows

    TAGNAME = VALUE or 
    TAGNAME = VALUE1 VALUE2 …
    

    The default file includes plenty of comments to explain all the options.

    Doxygen files for layout (ccpp_dox_layout.xml), an HTML style (ccpp_dox_extra_style.css), and a bibliography (library.bib) are provided with the CCPP. Additionally, a configuration file is supplied, with the following options modified from the default:

    • PROJECT_NAME
    • REPEAT_BRIEF = NO
    • OPTIMIZE_FOR_FORTRAN = YES
    • EXTENSION_MAPPING = .f=FortranFixed
    • LAYOUT_FILE = ccpp_dox_layout.xml
    • HTML_EXTRA_STYLESHEET = ccpp_dox_extra_style.css
    • CITE_BIB_FILES = library.bib
    • FILE_PATTERNS = *.f *.txt
    • SOURCE_BROWSER = YES
    • REFERENCED_BY_RELATION = YES
    • REFERENCES_RELATION = YES
    • DISABLE_INDEX = YES
    • GENERATE_TREEVIEW = YES
    • USE_MATHJAX = YES
    • EXTRA_PACKAGES = amsmath

    In order to change which source files to parse, edit the INPUT tag. In order to change where the output is placed, edit the OUTPUT_DIRECTORY tag.
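
As an illustration of the TAGNAME = VALUE format, the sketch below rewrites selected tags in the text of a configuration file. It is a hypothetical helper, not something shipped with doxygen or the CCPP, and it makes simplifying assumptions: each tag sits on a single line, and values continued with a trailing backslash or extended with += are left untouched.

```python
import re

# Matches "TAGNAME =" at the start of a line (but not "TAGNAME +=").
TAG_RE = re.compile(r"^\s*([A-Z_0-9]+)\s*=")


def set_tags(config_text, overrides):
    """Return config_text with every tag in overrides set to its new value.

    Existing single-line 'TAG = value' entries are replaced in place;
    tags that do not appear in the text are appended at the end.
    """
    remaining = dict(overrides)
    output = []
    for line in config_text.splitlines():
        match = TAG_RE.match(line)
        if match and match.group(1) in remaining:
            tag = match.group(1)
            output.append("%s = %s" % (tag, remaining.pop(tag)))
        else:
            output.append(line)
    for tag, value in remaining.items():
        output.append("%s = %s" % (tag, value))
    return "\n".join(output) + "\n"


if __name__ == "__main__":
    sample = "# Example fragment\nINPUT                  =\nOUTPUT_DIRECTORY       =\n"
    # The directory names here are placeholders.
    print(set_tags(sample, {"INPUT": "physics", "OUTPUT_DIRECTORY": "doc/html_out"}))
```

Editing the file by hand works just as well; a helper like this is mainly useful when the same overrides have to be applied repeatedly, for example on several machines.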

    In-line comment formatting

    For a complete description of doxygen-styled comment syntax, visit the link: http://www.stack.nl/~dimitri/doxygen/manual/docblocks.html

    Be aware that regular Fortran comments beginning with ! are not parsed by doxygen; only comments that begin with !> or !! are parsed. The original code comments can therefore be left in place, since doxygen simply ignores them. If you want doxygen to use an original comment, you must modify the comment line to begin with !>. Typically, one-line comments start with !>, while multiline comment blocks start with !> on the first line and use either !> or !! on each line thereafter.
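
The rule can be expressed as a small classifier. This is a rough Python sketch with a hypothetical function name; it only looks at comments introduced by !, so fixed-form comments that start with C or * in column 1 are reported as "other".

```python
def comment_kind(line):
    """Classify one Fortran source line.

    Returns "doxygen" for comments doxygen parses (!> or !!),
    "plain" for ordinary ! comments, and "other" for everything else.
    """
    stripped = line.lstrip()
    if stripped.startswith(("!>", "!!")):
        return "doxygen"
    if stripped.startswith("!"):
        return "plain"
    return "other"


if __name__ == "__main__":
    for text in ["!> \\brief One-line summary", "!! continued text",
                 "! arguments", "      var3 = 1.0"]:
        print(comment_kind(text), "|", text)
```

Running a classifier like this over a source file gives a quick picture of how many existing comments would need their prefix changed to appear in the generated documentation.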

    Groups for each parameterization

    Defining groups in doxygen helps to keep code that is spread across multiple files grouped together in the documentation. For example, suppose parameterization A consists of two separate files. A group is defined in the first source file, and the code in the second source file is identified as being part of the same group. In the output that is produced, a module is created for each group that is defined (note: this is, of course, different from the word module used in the Fortran language). For the purposes of the CCPP documentation, a group should be defined for each individual physics scheme (e.g. Zhao-Carr microphysics, Simplified Arakawa-Schubert deep convection, etc.). A group is defined using:

    !> \defgroup group_name group_title
    

    where group_name is the identifier and the group_title is what the group is referred to in the output. For all local code (code within the same file where the group is defined) to be included in a group, it should be enclosed by the two lines that denote the beginning and end of the code block:

    !> @{
    code to be included
    !> @}
    
    For code that is separated from the group definition, either within the same file or in another file, use the following tag within a comment block to include it in a defined group:

    \ingroup group_name
    

    Enclose all code that should be included in the group using the same two lines with curly braces as above.
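
Because every @{ needs a matching @}, unbalanced markers are an easy mistake to make. The following Python sketch shows one way to check the pairing before running doxygen; the function name is hypothetical, and it assumes the markers appear only in comment lines beginning with !> or !!.

```python
import re

MARKER_RE = re.compile(r"@[{}]")


def check_group_braces(lines):
    """Return a list of problems with '@{' / '@}' pairing in Fortran source lines."""
    problems = []
    depth = 0
    for number, line in enumerate(lines, start=1):
        stripped = line.lstrip()
        # Only doxygen comment lines can carry grouping markers.
        if not stripped.startswith(("!>", "!!")):
            continue
        for marker in MARKER_RE.findall(stripped):
            if marker == "@{":
                depth += 1
            elif depth == 0:
                problems.append("line %d: '@}' has no matching '@{'" % number)
            else:
                depth -= 1
    if depth > 0:
        problems.append("%d '@{' marker(s) never closed" % depth)
    return problems


if __name__ == "__main__":
    example = ["!> \\defgroup group_name group_title", "!> @{",
               "      subroutine code_to_be_included", "      end subroutine",
               "!> @}"]
    print(check_group_braces(example))
```

An empty list means every opening marker was closed; any other result points at the line or count that needs attention.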

    In the same comment block where a group is defined for a physics scheme, there should be some additional documentation. First, using the \brief tag, a brief one or two sentence description of the scheme should be included. After a blank doxygen comment line, begin a more detailed one or two paragraph description of the scheme that may include items like the scheme origin, history, physical terms that it calculates, basic method for calculating those terms, a list of key differentiating features, scientific references, etc.

    Following the brief and detailed descriptions, insert two \section tags into this comment block. One section should include a diagram that describes the calling hierarchy of the scheme. For example, if a scheme consists of many subroutines, the diagram should show where program control starts and how it branches to the various subroutines. This diagram, which should be in an image format (.png, .jpg, etc.), needs to be created outside of doxygen (for example, in PowerPoint). Doxygen will also auto-generate a call tree, but such trees are usually more difficult to interpret than diagrams created separately.

    The second section should describe interphysics interactions; that is, it should contain a description of how this scheme interacts with other physics schemes (e.g. scheme A depends on input from scheme B and provides output for scheme C).

    Source File Description

    Doxygen provides the \file tag as a way to provide documentation at the source file level. That is, in the documentation that is generated, one may navigate by source code filenames (if desired) rather than through a functional navigation. The most important documentation organization is through the group concept mentioned above, because the division of a scheme into multiple source files is often functionally irrelevant. Nevertheless, a \file tag provides an alternate path for navigating the documentation, so every source file should include at least a small documentation block that uses the \file tag to describe what code is in the file. The brief description for each file is displayed next to the source filename on the doxygen-generated File List page.

    Subroutine Blocks

    Each subroutine that is an entry point for a parameterization should be documented with a documentation block immediately preceding its definition in the source. The documentation block should include the following components:

    • a brief one- or two-sentence description with the \brief tag
    • a more detailed one or two paragraph description of the function of the subroutine
    • an argument table that includes entries for each subroutine argument
      • The argument table content should be immediately preceded by the following line: !! \section arg_table_SUBROUTINE_NAME
      • this line is required for a CCPP-related script to find the table!
      • The argument table should be immediately followed by a blank doxygen line: !!
      • This is needed to denote the end of an argument table
      • The first line of the table should contain the following header names
        • local var name (contains the local subroutine variable name)
        • standard name (CF-compliant standard name)
        • description
        • units (written with explicit exponents, e.g. m2 s-2 for m2/s2)
        • rank (0 for scalar, 1 for 1-D array, 2 for 2-D array, etc.)
        • data type (integer, real, logical, etc.)
        • kind (e.g. the specified floating-point precision kind; at present this applies to floating-point kinds, to be extended to different integer kinds in the future)
        • intent (in/out/inout)
        • optional (T/F)
    • a section called general algorithm with a bullet or numbered list of the tasks completed in the subroutine algorithm
    • At the end of the initial subroutine documentation block, a detailed algorithm section is started and the entirety of the code is enclosed within the !!@{ and !!@} delimiters. This way, any comments explaining detailed aspects of the code are automatically included in the detailed algorithm section.
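
To illustrate why the marker line and the terminating blank doxygen line matter, here is a rough Python sketch of how a script could locate and read an argument table. It is not the CCPP script itself; the function name and the returned structure are assumptions made for this example.

```python
def parse_arg_table(lines, subroutine_name):
    """Return the argument table rows for one subroutine as a list of dicts.

    The table starts after the line '!! \\section arg_table_<name>' and ends
    at the first blank doxygen line ('!!') or non-doxygen line.
    """
    marker = "\\section arg_table_" + subroutine_name
    header = None
    rows = []
    in_table = False
    for line in lines:
        text = line.strip()
        if not in_table:
            if text.startswith("!!") and text[2:].strip() == marker:
                in_table = True
            continue
        body = text[2:].strip() if text.startswith("!!") else ""
        if not body:
            break  # end of the argument table
        # Drop the outer pipes, then split the remaining text into cells.
        if body.startswith("|"):
            body = body[1:]
        if body.endswith("|"):
            body = body[:-1]
        cells = [cell.strip() for cell in body.split("|")]
        if header is None:
            header = cells
        elif set(body) <= set("|-: "):
            continue  # separator row between header and data
        else:
            rows.append(dict(zip(header, cells)))
    return rows


if __name__ == "__main__":
    source = [
        "!! \\section arg_table_SUBROUTINE_NAME",
        "!! | local var name | standard name | description | units | rank | type | kind | intent | optional |",
        "!! |----------------|---------------|-------------|-------|------|------|------|--------|----------|",
        "!! | var1           | CF_var1       | descr. var1 | m2 s-2| 2    | real | dp   | in     | F        |",
        "!!",
    ]
    print(parse_arg_table(source, "SUBROUTINE_NAME"))
```

If the marker line is missing or misspelled, a reader like this finds no table at all, which is why the exact format of that line is treated as critical.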

    External files

    Since the configuration file specifies that *.txt files are parsed by doxygen, documentation can also be placed in external text files. As an example, the file mainpage.txt is included. It contains a comment block that has the \mainpage tag. Any documentation in this block is included on the top level of the output; i.e., the home page for the CCPP physics documentation.

    Citation/Bibliography

    Doxygen can handle in-comment paper citations and link them to an automatically created bibliography page. The bibliographic data for any cited papers must be in BibTeX format and saved in a .bib file. The .bib file is included in the same repository as the source files, and the doxygen configuration option CITE_BIB_FILES should point to it.

    To use citations within the comment text, use the following tag:

    \cite bibtex_key_to_paper
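
A citation only resolves if its key exists in the .bib file. The sketch below is one hypothetical way to cross-check the two in Python; the regular expressions are simplifying assumptions (keys made of letters, digits, underscores, colons, and hyphens) rather than a full BibTeX parser.

```python
import re

# Key following a \cite tag; trailing punctuation such as "." is excluded.
CITE_RE = re.compile(r"\\cite\s+([A-Za-z0-9_:\-]+)")
# Key of a BibTeX entry such as "@article{key,".
BIB_ENTRY_RE = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def cited_keys(source_text):
    """Return the set of keys used with \\cite in the given source text."""
    return set(CITE_RE.findall(source_text))


def bib_keys(bib_text):
    """Return the set of entry keys defined in the given .bib text."""
    return set(BIB_ENTRY_RE.findall(bib_text))


def missing_citations(source_text, bib_text):
    """Return the sorted keys that are cited but not defined in the .bib text."""
    return sorted(cited_keys(source_text) - bib_keys(bib_text))


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    source = "!! See \\cite bibtex_key_to_paper for details.\n"
    bib = "@article{another_key,\n  title = {Placeholder}\n}\n"
    print(missing_citations(source, bib))
```

Any key this reports would otherwise surface later as a doxygen warning about an unresolved citation.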
    

    Equations

    See http://www.stack.nl/~dimitri/doxygen/manual/formulas.html for information about including equations. The configuration option USE_MATHJAX should be set to YES in the doxygen configuration file for the best rendering.

    There are many great online resources for learning to use the LaTeX math typesetting used in doxygen.

    Example Source Code

    The following code is included as an example for how to document a physics scheme to be included in the CCPP. This example demonstrates a scheme (scheme_X) that is enclosed in a module that uses two subroutines (sub_A, which is the entry point, and sub_B). While an argument table is provided here for both subroutines, it is only mandatory for the entry point subroutines (so the schemes can interface with the IPDe routines).

    !> \file example.f
    !!  This file contains module mod_1 with subroutines A and B, 
    !!  which together make up the entirety of scheme X.
    
          module mod_1
          implicit none
          private
          public :: scheme_X_sub_A
    ! Kind parameter assumed here so that the example compiles.
          integer, parameter :: dp = kind(1.0d0)
    
          contains
    
    !> \defgroup scheme_x Scheme X 
    !! @{
    !! \brief Scheme X calculates the change in T and q due to the 
    !! such-and-such process based on the methods of X.
    !!
    !! This is a more detailed description of scheme X that includes 
    !! references, like Smith and Smith (2017) \cite smith_and_smith_2017. 
    !! This is the place to include information about scheme origin, 
    !! history, physical terms that it calculates, basic method for 
    !! calculating those terms, a list of key differentiating 
    !! features, scientific references,  etc.
    !!
    !! \section diagram Calling Hierarchy Diagram
    !! \image html Scheme_X_Flowchart.png "Diagram depicting the calling structure of Scheme X" height=2cm
    !! \section interphysics Interphysics Communication
    !! This space is reserved for a description of how this scheme 
    !! uses information from other scheme types and/or how information
    !! calculated in this scheme is used in other scheme types.
    
    !>  \brief This subroutine contains all the logic for doing part A of Scheme X.
    !!
    !!  This is the place for a more detailed description of what 
    !! functionally happens in subroutine A. It is appropriate to cite 
    !! references here as well, but only those that pertain to this 
    !! particular subroutine.
    !!
    !! \section arg_table_scheme_X_sub_A
    !! | local var name | standard name | description | units | rank | type  | kind  | intent | optional |
    !! |----------------|----------|-------------|-------|------|-------|-------|--------|----------|
    !! | var1           | CF_var1  | descr. var1 | m2 s-2|   2  |  real | dp    |  in    | F        |
    !! | var2           | CF_var2  | descr. var2 | K     |   1  |integer|       |  out   | F        |
    !!
    !!  \section general_A General Algorithm
    !!  -# Step 1
    !!  -# Step 2
    !!  -# Step n
    !!
    !!  \section detailed_A Detailed Algorithm
    !!  @{
          subroutine scheme_X_sub_A (var1, var2)
    
          use :: mod_2, only : helper_sub
    
          implicit none
    ! arguments
          real(kind=dp), dimension(:,:), intent(in) :: var1
          integer, dimension(:), intent(out) :: var2
    ! local variables
          real(kind=dp) :: var3
    
    !> ## This comment will be part of the 'detailed algorithm' section. 
    !! It is good practice to have important in-line code comments 
    !! parsed by doxygen to be included in documentation
    
          var3 = sum(var1**2)
    
    ! Comments that do not start with '!>' or '!!' remain valid 
    ! fortran comments but are not parsed by doxygen.
    
          var2 = 0.5*var3
    
          end subroutine scheme_X_sub_A
    !> @} (closes detailed algorithm section)
    
    !>  \brief This subroutine contains all the logic for doing part B of Scheme X.
    !!
    !!  This is the place for a more detailed description of what 
    !! functionally happens in subroutine B. It is appropriate to cite 
    !! references here as well, but only those that pertain to this 
    !! particular subroutine.
    !!
    !! \section arg_table_scheme_X_sub_B
    !! | local var name | standard name | description | units | rank | type  | kind  | intent | optional |
    !! |----------------|----------|-------------|-------|------|-------|-------|--------|----------|
    !! | var4           | CF_var4  | descr. var4 | m2 s-2|   2  |  real | dp    |  in    | F        |
    !! | var5           | CF_var5  | descr. var5 | K     |   1  |integer|       |  out   | F        |
    !!
    !!  \section general_B General Algorithm
    !!  -# Step 1
    !!  -# Step 2
    !!  -# Step n
    !!
    !!  \section detailed_B Detailed Algorithm
    !!  @{
          subroutine scheme_X_sub_B (var4, var5)
    
          use :: mod_2, only : helper_sub
    
    
          implicit none
    ! arguments
          real(kind=dp), dimension(:,:), intent(in) :: var4
          integer, dimension(:), intent(out) :: var5
    ! local variables
          real(kind=dp) :: var3
    
    !> ## This comment will be part of the 'detailed algorithm' section. 
    !! It is good practice to have important in-line code comments 
    !! parsed by doxygen to be included in documentation
    
          var3 = sum(var4**2)
    
    ! Comments that do not start with '!>' or '!!' remain valid 
    ! fortran comments but are not parsed by doxygen.
    
          var5 = 0.5*var3
    
          end subroutine scheme_X_sub_B
    !> @} (closes detailed algorithm section)
    !! @} (closes scheme_X group)
    
          end module mod_1
    
