The objective of this document is to examine all the above indices and to propose a logical, stable, unique and linear index for evaluating model performance. When the RMSE is normalized by the mean of the measurements, it is sometimes called the scatter index (SI) (Zambresky 1989). When the RMSE is normalized by a measured quantity used to drive the model, it is sometimes referred to as the operational performance index (OPI) (Ris et al. 1999). The OPI can, for example, be used to assess the performance of a shallow-water wave transformation model based on the wave height measured in deep water.

For simulation year 1, when the grossly over-simulated values are excluded from the computation, the difference-based statistical indicators, namely the mean error (ME), mean absolute error (MAE), root mean square error (RMSE) and relative RMSE (RRMSE), decrease compared with those computed with the grossly over-simulated values included, which is logical. The efficiency-based indicators, namely the Nash-Sutcliffe efficiency (ENS), the Legates-McCabe efficiency (ELM), the index of agreement d and the refined index of agreement dr, should instead increase. Similar behaviour is also observed for year 2.

Willmott et al. (2011) proposed a new index, dr, and compared it with the mean absolute error (MAE), showing that dr varies logically with MAE. However, dr should instead be compared with a mean absolute relative error, since MAE can differ between samples/data sets while the mean absolute relative error remains the same (i.e., the relative model performance is unchanged). In this study, the dr index does not follow the logical trend within a given data set, as seen in Table 2 (combined analysis), and it also behaves ambiguously between different data sets (year 1 versus combined data) when judged against the PMARE value.
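The difference-based indicators and the two normalized measures discussed above can be sketched as follows. This is a minimal illustration using common definitions; the function name and the exact PMARE formulation (percent mean absolute relative error) are stated here for clarity and are not taken verbatim from the cited works.

```python
import math

def difference_indicators(obs, sim):
    """Difference-based indicators (ME, MAE, RMSE) plus two normalized
    measures: the scatter index SI (RMSE over the mean observation) and
    PMARE (percent mean absolute relative error)."""
    n = len(obs)
    errors = [s - o for o, s in zip(obs, sim)]
    me = sum(errors) / n                          # mean error (bias)
    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    si = rmse / (sum(obs) / n)                    # scatter index (Zambresky 1989)
    pmare = 100.0 * sum(abs(s - o) / o for o, s in zip(obs, sim)) / n
    return {"ME": me, "MAE": mae, "RMSE": rmse, "SI": si, "PMARE": pmare}
```

Note how this illustrates the argument in the text: scaling both the observations and the simulations by the same factor changes MAE but leaves PMARE unchanged, because PMARE is a relative measure.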

Similar inconsistencies are also observed for the random records (Table 4, records 1 to 3, with respect to PMARE). Like the RMSE, smaller MAE values indicate a better agreement between measured and computed values. The refined index of Willmott et al. (2011), dr, is a refinement of the index of agreement (d) developed by Willmott (1981) as a standardized measure of the degree of model error; d varies between 0 and 1, where 1 indicates a perfect match and 0 indicates no agreement at all (Willmott 1981). Graphical methods give an overview and the actual picture, while the various indices provide quantitative measures. Any diagnosis made from a graph must therefore be supported by quantitative measures.
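The two agreement indices can be sketched as follows; this follows the published formulas as commonly stated (d from Willmott 1981, dr from Willmott et al. 2011 with the scaling constant c = 2), and the function names are illustrative.

```python
def willmott_d(obs, sim):
    """Index of agreement d (Willmott 1981); 1 = perfect match, 0 = no agreement."""
    obar = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - obar) + abs(o - obar)) ** 2 for o, s in zip(obs, sim))
    return 1.0 - num / den

def willmott_dr(obs, sim, c=2.0):
    """Refined index of agreement dr (Willmott et al. 2011); bounded in [-1, 1]."""
    obar = sum(obs) / len(obs)
    a = sum(abs(s - o) for o, s in zip(obs, sim))   # total absolute model error
    b = c * sum(abs(o - obar) for o in obs)         # scaled observed variability
    return 1.0 - a / b if a <= b else b / a - 1.0
```

Unlike d, the refined dr is bounded below by -1 rather than 0, which is one of the refinements Willmott et al. (2011) introduced.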

The indices should also be consistent in their results. Otherwise, the corresponding quantitative index is not suitable for comparing models and should be abandoned as a measure of model performance.