Modeling Philosophy

A model's ability to forecast or hindcast accurately is a complicated function of the model's own accuracy and of the accuracy, detail, and type of data used to formulate, calibrate, and initialize it. Regardless of the accuracy of a model's prediction, however, we can never know whether the model is "valid" in the sense of being a true representation of the system being modeled (Oreskes et al., 1994). An accurate prediction does not necessarily mean that the model is correct, nor does a poor prediction or hindcast necessarily invalidate it. Evidence can be gathered to corroborate a model, but such evidence does not confirm the model's veracity; it only increases our confidence that the model will be accurate. The most trustworthy models, then, are those that have withstood multiple tests against independent data under a variety of conditions. Such models can be extrapolated beyond the available data to give reasonable predictions or hindcasts.

The degree to which a model can be used for prediction depends on the type of question being asked and on the local temporal and spatial decorrelation scales of the processes being modeled. Predictions of the statistics of a process may remain accurate long into the future. Predictions of the details of local dynamics, by contrast, rapidly lose accuracy as boundary conditions influence the interior dynamics and as errors in the initial conditions cause the modeled and measured processes to diverge (see the sketch below). A model that can accurately predict from one data set to the next is an extremely useful tool, as it can then be used to explore the details of the dynamics underlying changes in the state variables between the data sets. Using models in this interpolative mode is one of the great strengths of coupled modeling and field programs: the model can quantify processes that were logistically impossible to sample in the field.
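
As a concrete illustration of this divergence, the sketch below integrates the Lorenz-63 equations as a toy stand-in for chaotic interior dynamics (the system, parameters, and step size are illustrative assumptions, not part of any model discussed here). Two runs differing by one part in a million in a single initial condition separate rapidly, even though the long-run statistics of the two trajectories remain essentially the same.

    import numpy as np

    def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0/3.0):
        # One forward-Euler step of the Lorenz-63 system (toy example).
        x, y, z = state
        dxdt = sigma * (y - x)
        dydt = x * (rho - z) - y
        dzdt = x * y - beta * z
        return state + dt * np.array([dxdt, dydt, dzdt])

    # Two runs differing only by a tiny initial-condition error.
    a = np.array([1.0, 1.0, 1.0])
    b = a + np.array([1e-6, 0.0, 0.0])

    for step in range(3001):
        if step % 500 == 0:
            print(f"t = {step * 0.01:5.1f}   separation = {np.linalg.norm(a - b):.3e}")
        a, b = lorenz_step(a), lorenz_step(b)

The separation grows by many orders of magnitude within a few model time units, which is why detailed local predictions degrade quickly while statistical predictions can remain useful.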

Defining the accuracy of a model is a difficult task. Ideally, the model should be tested against a data set independent of the one used to formulate and calibrate it; in practice, this is seldom possible. In any case, performance criteria must be defined for the model: an acceptable model must reproduce the available data within specified errors. Care must be taken to test the fit of all state variables, even those that are not the focus of the study. Does the model sacrifice realistic behavior of one state variable in order to reproduce another accurately? Sanity checks of this kind are essential at all stages of the modeling and should be developed in collaboration with observationalists.
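
One minimal form such a check might take is sketched below: a skill report that scores every state variable, not just the study's focus, against pre-agreed error tolerances. The variable names, tolerances, and synthetic data are hypothetical placeholders.

    import numpy as np

    def skill_report(model, obs, tolerances):
        # Score every state variable against its agreed error tolerance.
        for name, tol in tolerances.items():
            m, o = np.asarray(model[name]), np.asarray(obs[name])
            rmse = np.sqrt(np.mean((m - o) ** 2))
            bias = np.mean(m - o)
            r = np.corrcoef(m, o)[0, 1]
            verdict = "PASS" if rmse <= tol else "FAIL"
            print(f"{name:12s} rmse={rmse:6.3f} bias={bias:+6.3f} r={r:5.2f} [{verdict}]")

    # Hypothetical example: nitrate is the study focus, but phytoplankton
    # and zooplankton fits are checked as well.
    rng = np.random.default_rng(0)
    obs = {k: rng.normal(size=50) for k in ("nitrate", "phyto", "zoo")}
    model = {k: v + rng.normal(scale=0.3, size=50) for k, v in obs.items()}
    skill_report(model, obs, {"nitrate": 0.5, "phyto": 0.5, "zoo": 0.5})

A report of this kind makes it obvious when a tuned variable passes while a neglected one fails.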

A model that accurately predicts the statistical behavior of a system becomes a useful tool for designing sampling programs. Such a model can be used to infer decorrelation time and length scales, which in turn set the frequency and density of sampling. An ideal sampling scheme samples at intervals shorter than the decorrelation time and at spacings finer than the decorrelation length scale; typically three to five samples per decorrelation scale are adequate. The processes governing local decorrelation scales are often complex and may be driven by non-local forcings. Models are invaluable tools for estimating local decorrelation scales, given appropriate knowledge of the boundary conditions, and statistical estimates of the forcings and boundary conditions are often sufficient to obtain statistical estimates of the decorrelation scales.
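
The sketch below shows one way such an estimate might be made: take the decorrelation time of a model time series to be the lag at which its autocorrelation first falls below 1/e (one common working definition), then choose a sampling interval that yields roughly four samples per decorrelation scale. The AR(1) red-noise series standing in for model output is a hypothetical example.

    import numpy as np

    def decorrelation_time(series, dt):
        # Lag (in time units) at which the autocorrelation first drops below 1/e.
        x = np.asarray(series) - np.mean(series)
        acf = np.correlate(x, x, mode="full")[len(x) - 1:]
        acf /= acf[0]
        below = np.where(acf < 1.0 / np.e)[0]
        return below[0] * dt if below.size else np.nan

    # Hypothetical red-noise (AR(1)) series standing in for model output.
    rng = np.random.default_rng(1)
    dt, alpha = 1.0, 0.9  # daily output; AR(1) memory coefficient
    x = np.zeros(2000)
    for i in range(1, x.size):
        x[i] = alpha * x[i - 1] + rng.normal()

    tau = decorrelation_time(x, dt)
    print(f"decorrelation time ~ {tau:.1f} days")
    print(f"suggested sampling interval ~ {tau / 4:.1f} days (4 samples per scale)")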

