Mathematical models for newbies
Mar 20, 2012
Bishop Hill in Climate: Models

Reader Andrew sent me his summary of the basics of mathematical models, which I think readers will find useful.

I have been devising mathematical models (simulations) of physical processes for over 20 years, and I just wanted to point out some of the basics that might help people understand these types of models:

1. The physics of the process (to be modelled) may be well understood, but although this helps, it is somewhat irrelevant to the accuracy of all but the most simple models (although you will almost certainly not get a good model if you don't understand the physics). Nearly all computer models are based on mathematical formulae, commonly polynomial (series) expansions, that are representative of the physical situation. These expansions are typically of the form A + Bx + Cx² + Dx³ + Ex⁴ . . . and are truncated at some power of x (x representing the physical quantity under investigation; A, B, C etc. are calculated constants). I always tried to make it the x⁴ term, but this could lead (in the 1990s) to excessive calculation times (one commercial program, still in widespread use, truncates the series at the x² term). Thus there is always a 'remainder term', or 'residual', which the model will (hopefully be programmed to) attempt to estimate.
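
To make the truncation concrete, here is a minimal Python sketch using the series for e^x as a stand-in for a physical formula (the example, and the function name, are mine rather than anything from a particular modelling package). Truncating at a higher power leaves a smaller residual, at the cost of more arithmetic:

```python
import math

def exp_series(x, max_power):
    """Truncated series for e^x: 1 + x + x^2/2! + ... + x^n/n!"""
    return sum(x**k / math.factorial(k) for k in range(max_power + 1))

x = 0.5
for n in (2, 4):
    approx = exp_series(x, n)
    residual = math.exp(x) - approx   # the neglected higher-order terms
    print(f"truncated at x^{n}: approx {approx:.6f}, residual {residual:.2e}")
```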

2. The problem, or 'domain', over which the model is to be applied cannot (unless trivial) be simulated as a whole. It is therefore divided into small regular shapes (squares, cubes, or, more usually now, triangles and tetrahedra), generally called 'elements' - a process known as 'gridding' or 'meshing' - over which the (truncated) equations representing the physical situation can be applied relatively easily. Smaller elements usually produce more accurate results, but the computation time increases (and see 4 below). The elements can then be used to give a 'spot' or 'node' value of the physical quantity being simulated for each element (this is somewhat simplistic). A further mathematical process then combines the results for all these elements. A single calculation through each element node within the domain is known as a 'sweep' or, more commonly, an 'iteration'.
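
As a toy illustration of meshing and a single sweep, here is a Python sketch of a one-dimensional domain (a rod) divided into equal elements, with each interior node updated from its neighbours. The Jacobi-style update for steady heat flow is my choice of example, not anything from the text above:

```python
# Divide a 1D domain (a rod from x=0 to x=1) into 10 equal elements,
# giving 11 nodes at which the temperature will be calculated.
n_nodes = 11
temperature = [0.0] * n_nodes
temperature[0], temperature[-1] = 100.0, 0.0   # fixed boundary values

def sweep(t):
    """One 'iteration': recompute every interior node value from its
    neighbours (a Jacobi-style update for steady heat flow)."""
    new = t[:]
    for i in range(1, len(t) - 1):
        new[i] = 0.5 * (t[i - 1] + t[i + 1])
    return new

temperature = sweep(temperature)   # a single sweep over the mesh
print(temperature)
```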

3. Many iterations are undertaken until a programmed 'convergence' criterion is met - sometimes that the changes in node values between one iteration and the next are all below a certain value, or that the residuals (see 1 above) are all below a certain value. This process is somewhat easier for a 'static' situation, where the physical values to be calculated are constant. If you add a time-based (dynamic) component to the calculation - as with the atmosphere - it usually gets much more complex.
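
Continuing the same toy problem, a sketch of iterating until a convergence criterion is met (the tolerance is arbitrary; real codes expose it as a user setting):

```python
# Iterate the toy heat-flow model until the largest change in any
# node value between successive sweeps falls below a tolerance.
n_nodes, tolerance = 11, 1e-6
t = [0.0] * n_nodes
t[0] = 100.0                          # one end held at 100, the other at 0

for iteration in range(100_000):
    new = [t[0]] + [0.5 * (t[i - 1] + t[i + 1])
                    for i in range(1, n_nodes - 1)] + [t[-1]]
    change = max(abs(a - b) for a, b in zip(new, t))
    t = new
    if change < tolerance:            # convergence criterion met
        break

print(f"converged after {iteration + 1} iterations")
```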

4. I hope it can be seen that this process is 'absolutely riddled' with scope for errors, incorrect assumptions and erroneous simplifications. Not only that, but the whole process can become mathematically unstable due to interaction between the various steps, leading to the calculations 'exploding' to infinity or crashing to zero. This is a particular problem with dynamic situations, where the calculation 'time-step' can interact with the mesh/grid spacing, leading to the whole model 'falling over' or collapsing.
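
The time-step/grid-spacing interaction can be shown with the explicit scheme for 1D heat diffusion, for which the stability limit is dt ≤ dx²/2 (with unit diffusivity). The numbers below are illustrative only:

```python
def diffuse(dt, dx=0.1, steps=200):
    """Explicit time-stepping of 1D heat diffusion (unit diffusivity).
    Stable only while dt <= dx*dx/2; beyond that the values explode."""
    n = 11
    t = [0.0] * n
    t[n // 2] = 100.0                 # a hot spot in the middle
    r = dt / (dx * dx)
    for _ in range(steps):
        t = [t[0]] + [t[i] + r * (t[i - 1] - 2 * t[i] + t[i + 1])
                      for i in range(1, n - 1)] + [t[-1]]
    return max(abs(v) for v in t)

print(diffuse(dt=0.004))   # r = 0.4 < 0.5: stays bounded
print(diffuse(dt=0.006))   # r = 0.6 > 0.5: 'explodes' towards infinity
```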

5. Even if the model does converge to a solution, that does not mean the solution is a correct (or accurate) one. In another commercial program (different from that in 1 above), users are warned that an incorrect choice of element type can lead to solutions that are up to 2000% (yes, two thousand percent) away from the correct value. One big problem with ascertaining the accuracy of computer simulations is that you generally have to have some idea of what the answer should be, so that you have something against which to compare the calculated solution.
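
One common check, sketched here for the toy rod problem, is to run the model on a case whose exact answer is known. Steady heat flow in a uniform rod is a straight line between the boundary temperatures, so any deviation from that line is pure numerical error:

```python
# Verify the toy solver against a case with a known analytic answer.
n = 11
t = [0.0] * n
t[0] = 100.0
for _ in range(20_000):               # iterate well past convergence
    t = [t[0]] + [0.5 * (t[i - 1] + t[i + 1]) for i in range(1, n - 1)] + [t[-1]]

exact = [100.0 * (1 - i / (n - 1)) for i in range(n)]
worst = max(abs(a - b) for a, b in zip(t, exact))
print(f"largest error against the known solution: {worst:.2e}")
```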

6. Bear in mind that the process (simplistically) outlined above must be undertaken for each physical attribute being investigated, and it can be seen that, for an atmospheric model, this is a hugely non-trivial problem.

7. In research work I have found that computer models are a very useful tool for qualitative analysis, but much less so for accurate quantitative analysis. The models I have worked on have generally been used for automated process control, and invariably these require 'calibration' or 'tuning' against real-world measurements. Furthermore, these process-control models are built so that any calculated solution outside the physically measured range is 'viewed with suspicion'.
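
A minimal sketch of that calibration step, with made-up measurements and a deliberately crude one-parameter model (real process-control models are far richer):

```python
# Calibrate a one-parameter model y = k*x to measured data by least
# squares, then flag any prediction outside the measured range.
measurements = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]   # (input, observed)

# Closed-form least-squares gain: k = sum(x*y) / sum(x*x)
k = sum(x * y for x, y in measurements) / sum(x * x for x, _ in measurements)

lo = min(x for x, _ in measurements)
hi = max(x for x, _ in measurements)

def predict(x):
    if not (lo <= x <= hi):           # outside the calibrated range
        print(f"warning: x={x} is outside measured range [{lo}, {hi}]")
    return k * x

print(predict(2.5))    # interpolation: within the calibrated range
print(predict(10.0))   # extrapolation: 'viewed with suspicion'
```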
