NITE: Executive Summary
In response to recommendations from the DTC Science Advisory Board and the UCACN, and discussions with NCEP leadership, the DTC undertook the task of creating a design for an infrastructure to facilitate development of NCEP numerical models by scientists both within and outside of EMC. Requirements for this NWP Information Technology Environment, or NITE, are based on a survey of potential NITE users, information obtained during site visits to EMC, UKMO, and ECMWF, discussions with focus groups, and reviews of various existing model development systems. The NITE design put forth to address these requirements includes the following elements:
Data management and experiment database. The data management element provides scientists with access to input datasets (model and observations), a mechanism for storing selected output from all experiments, and tools for browsing, interrogating, subsetting, and easily retrieving data. An important aspect of this element is establishing standards for storing model and observation datasets, as well as their metadata. Another important aspect is capturing all metadata pertaining to an experiment. To facilitate sharing of information, NITE needs to record key aspects of the experiment setup, such as provenance of source code and scripts, configuration files, and namelist parameters, in a searchable database.
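As an illustration only, the searchable experiment database described above could be sketched as follows. This is a minimal sketch using Python's built-in sqlite3 module; the table layout, column names, experiment names, and revision strings are all hypothetical and are not part of the NITE design.

```python
import json
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create a small experiment-metadata table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS experiments (
               name TEXT PRIMARY KEY,
               suite TEXT NOT NULL,
               code_revision TEXT NOT NULL,
               config_json TEXT NOT NULL
           )"""
    )
    return conn


def record_experiment(conn, name, suite, code_revision, config):
    """Store the provenance and configuration of one experiment."""
    conn.execute(
        "INSERT INTO experiments VALUES (?, ?, ?, ?)",
        (name, suite, code_revision, json.dumps(config, sort_keys=True)),
    )
    conn.commit()


def find_by_suite(conn, suite):
    """Return the names of all experiments recorded for a given suite."""
    rows = conn.execute(
        "SELECT name FROM experiments WHERE suite = ? ORDER BY name", (suite,)
    )
    return [row[0] for row in rows]


def load_config(conn, name):
    """Return the stored configuration of an experiment, or None if unknown."""
    row = conn.execute(
        "SELECT config_json FROM experiments WHERE name = ?", (name,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

A scientist could then call find_by_suite(conn, "NAM") to list recorded experiments and load_config to retrieve the exact namelist settings of any one of them. A production system would likely need a richer schema and a shared database server rather than a local file.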
Source code management and build systems. Single SVN code repositories or distributed Git repositories need to be available for all workflow components run within NITE. This will support code unification and collaboration between developers. Code repositories need to be available outside of the NCEP firewall, where the community can access them. Fast, parallel build systems should be implemented to efficiently build all workflow components of a suite before experiments are conducted.
Suite definition and configuration tools. All configurable aspects of a suite are abstracted to files that can be edited to create experiments. No aspects of the directory structure, batch system, namelist parameters, etc., are hardcoded in the scripts or source code. Predefined suites are provided as a starting point for creating experiments, with scientists also having the option to compose their own suites.
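One possible way to picture this element is a predefined suite definition that an experiment overrides selectively. The sketch below assumes suite definitions are held as nested dictionaries (for example, loaded from editable files); the suite name, keys, paths, and values are hypothetical placeholders.

```python
import copy

# Hypothetical predefined suite; in practice this would be read from an
# editable configuration file rather than defined in code.
PREDEFINED_SUITES = {
    "example_suite": {
        "directories": {"work": "/path/to/work", "output": "/path/to/output"},
        "batch": {"queue": "dev", "nodes": 4},
        "namelist": {"time_step_s": 60, "physics_option": 1},
    }
}


def deep_merge(base, overrides):
    """Return a copy of base with overrides applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def create_experiment(suite_name, overrides):
    """Build an experiment configuration from a predefined suite plus edits."""
    return deep_merge(PREDEFINED_SUITES[suite_name], overrides)
```

For example, create_experiment("example_suite", {"namelist": {"time_step_s": 30}}) changes only the time step and leaves the predefined suite itself untouched. Scripts would then read directories, batch settings, and namelist values from the resulting configuration instead of hardcoding them.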
Scripts. The scripting for NITE is such that each workflow component within NITE (e.g., GSI) is associated with a single script, regardless of which suite is being run (e.g., NAM or RAP). Standardization of scripts reduces the overall maintenance costs for NCEP's multiple suites and helps scientists familiar with one suite quickly learn another.
Workflow management system. The workflow management system handles all job submission activity. Hence, the scripts used to run workflow components within NITE do not contain job submission commands. To meet long-term plans for NITE, it will be important for this workflow management system to be available for use outside of WCOSS.
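The separation between component scripts and job submission can be sketched as follows. This is a hypothetical illustration, not the interface of Rocoto or any other existing workflow manager: the Workflow class, the task names, and the local_submit function are assumptions, and the standard-library graphlib module (Python 3.9 or later) is used to order tasks by their dependencies.

```python
from graphlib import TopologicalSorter


def run_component(name, config):
    """Stand-in for a component script; it contains no job submission logic."""
    return f"{name} ran for suite {config['suite']}"


class Workflow:
    """Minimal workflow manager: orders tasks and delegates submission."""

    def __init__(self, submit):
        self._submit = submit  # the only place that knows how jobs are launched
        self._deps = {}

    def add_task(self, name, depends_on=()):
        """Register a task and the tasks that must finish before it."""
        self._deps[name] = set(depends_on)

    def run(self, config):
        """Submit every task in dependency order and collect the results."""
        order = TopologicalSorter(self._deps).static_order()
        return [self._submit(run_component, task, config) for task in order]


def local_submit(func, task, config):
    """Run the task in-process; a batch-system submitter could replace this."""
    return func(task, config)
```

Because the submitter is passed in, the same component code could in principle run under a different batch system, or on a platform other than WCOSS, by swapping local_submit for another function with the same signature.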
Documentation and training. Documentation and training on all workflow components and suites available through NITE, as well as on NITE itself, are readily available through electronic means.
In addition to the elements above, standardized tools for data visualization and forecast verification need to be available to all scientists as part of NITE.
We recognize that substantial resources will be required for initial and ongoing development of NITE. However, NOAA already has several tools that can be used as a starting point for various NITE elements (e.g., MADIS, NOMADS, Rocoto, HWRF object-oriented scripts, VLab, and WRF Portal).
While the initial deployment of NITE will cause some disruption to model development, we are confident that this infrastructure will facilitate developing and running NCEP suites and will make transition of new research and development to operations more efficient and effective.