Selected discussion from N. Marwan: How to avoid potential pitfalls in recurrence plot based data analysis, International Journal of Bifurcation and Chaos, 21(4), 1003-1017 (2011). DOI:10.1142/S0218127411029008
Indicators of determinism – Artifacts based on time delay embedding
The length of a diagonal line in the RP corresponds to the time during which
the system evolves very similarly to how it evolved at another time, i.e.,
a segment of the phase space trajectory runs parallel and
within an \(\varepsilon\)-tube of another segment of the phase space
trajectory. Deterministic systems are often characterised by
repeated similar state evolution (corresponding to a local predictability),
yielding a large
number of diagonal lines in the RP. In contrast, systems with independent
subsequent values, like white noise, have RPs with mostly single
points. The fraction of recurrence points forming such
diagonal lines (of length \(l \ge l_{\min}\)),
\[ DET = \frac{\sum_{l=l_{\min}}^N l P(l)}{\sum_{l=1}^N l P(l)}, \]
where \(P(l)\) is the number of diagonal lines of length \(l\), is therefore called
determinism in the RQA. To some extent, this measure can be
interpreted as an indication of determinism in the data.
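To make the definition concrete, the following is a minimal sketch of how \(DET\) might be computed from a set of state vectors. It is an illustration under stated assumptions, not the implementation of any particular RQA package: the maximum norm, the thresholds, the exclusion of the line of identity, and all function names are choices made here. The two test signals (a sampled circle and Gaussian white noise) are likewise hypothetical examples.

```python
import numpy as np

def recurrence_matrix(states, eps):
    """R[i, j] = 1 where the maximum-norm distance of states i and j is <= eps."""
    states = np.asarray(states, dtype=float)
    if states.ndim == 1:
        states = states[:, None]          # treat a scalar series as 1-D states
    dist = np.abs(states[:, None, :] - states[None, :, :]).max(axis=2)
    return (dist <= eps).astype(int)

def diagonal_line_lengths(R):
    """Lengths of all diagonal lines above the main diagonal (LOI excluded)."""
    lengths = []
    for k in range(1, R.shape[0]):
        run = 0
        for v in np.diagonal(R, offset=k):
            if v:
                run += 1
            elif run:
                lengths.append(run)
                run = 0
        if run:                            # line that reaches the border
            lengths.append(run)
    return np.array(lengths, dtype=int)

def determinism(lengths, l_min=2):
    """DET: share of recurrence points that lie on lines of length >= l_min."""
    total = lengths.sum()
    return lengths[lengths >= l_min].sum() / total if total else float("nan")

rng = np.random.default_rng(0)
n = 400
phase = 2 * np.pi * np.arange(n) / 50      # periodic motion, 50 samples per cycle
circle = np.column_stack([np.sin(phase), np.cos(phase)])
noise = rng.standard_normal(n)             # independent Gaussian values

det_periodic = determinism(diagonal_line_lengths(recurrence_matrix(circle, 0.05)))
det_noise = determinism(diagonal_line_lengths(recurrence_matrix(noise, 0.2)))
print(f"DET periodic: {det_periodic:.2f}, DET white noise: {det_noise:.2f}")
```

Because the recurrence matrix is symmetric, counting lines in the upper triangle only does not change the ratio. The sum of all line lengths equals \(\sum_l l P(l)\), so no explicit histogram is needed.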
But we should be careful in using the term determinism in
a more general or mathematical sense. In a deterministic system,
given initial conditions always lead to exactly the same state,
i.e., there is no stochastic process involved.
Different methods can be used to test for determinism in
time series, e.g., a combined modelling-surrogate approach
(Small & Tse, 2003) or an analysis of the directionality
of the phase space trajectory (Kaplan & Glass, 1992).
High values of \(DET\)
might be an indication of determinism in the studied system,
but it is just a necessary condition, not a sufficient one.
Even for non-deterministic processes we can find longer
diagonal lines in the RP, resulting in increased \(DET\) values.
For example, the following (non-deterministic) auto-regressive
process \( x_i = 0.8 x_{i-1} + 0.3 x_{i-2} - 0.25 x_{i-3} + 0.8 \xi \)
(where \(\xi\) is white Gaussian noise) has a \(DET\)
value of \(0.6\)
(embedding dimension \(m=4\), delay \(\tau = 4\), and fixed recurrence
rate of \(0.1\)). As shown by Thiel et al. (2003),
stochastic processes can have RPs containing longer
diagonal lines just by chance (although very rarely). Moreover,
due to embedding we introduce correlations in the RP and,
therefore, even uncorrelated data (e.g., from a white noise process)
have spurious diagonal lines (Thiel et al., 2006; Marwan et al., 2007)
(Fig. 1). In addition, data pre-processing
like low-pass filtering (smoothing) is frequently used.
Such pre-processing can also introduce spurious line structures
in the RP. Therefore, a high value of the RQA
measure \(DET\) alone does not allow us to infer that the
studied system is deterministic. For such a conclusion we
need at least one further criterion included in the
RP: the directionality of the trajectory (Kaplan & Glass, 1992).
One possible solution is to use iso-directional
RPs (Horai et al., 2002) or perpendicular RPs (Choi et al., 1999);
if the measure then reaches \(DET \approx 1\) for
a very small recurrence density (i.e., \(RR<0.05\)), the
underlying system will be a deterministic one (like a periodic or chaotic system).
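The auto-regressive example above can be explored with a short simulation. The sketch below is a possible setup, not a reproduction of the original computation: the maximum norm, the series length, the burn-in, the random seed, and \(l_{\min} = 2\) are assumptions made here, and the recurrence threshold is chosen as the 10 % quantile of all pairwise distances so that the recurrence rate is fixed at \(0.1\). The resulting \(DET\) value depends on these choices and on the noise realisation.

```python
import numpy as np

def embed(x, m, tau):
    """Time delay embedding: row i is (x_i, x_{i+tau}, ..., x_{i+(m-1)tau})."""
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[k * tau : k * tau + n] for k in range(m)])

def distance_matrix(states):
    """Pairwise maximum-norm distances (the norm is an assumption here)."""
    return np.abs(states[:, None, :] - states[None, :, :]).max(axis=2)

def diagonal_line_lengths(R):
    """Lengths of all diagonal lines above the main diagonal (LOI excluded)."""
    lengths = []
    for k in range(1, R.shape[0]):
        run = 0
        for v in np.diagonal(R, offset=k):
            if v:
                run += 1
            elif run:
                lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return np.array(lengths, dtype=int)

def determinism(lengths, l_min=2):
    """DET: share of recurrence points that lie on lines of length >= l_min."""
    total = lengths.sum()
    return lengths[lengths >= l_min].sum() / total if total else float("nan")

# Simulate the AR(3) process from the text, driven by Gaussian white noise.
rng = np.random.default_rng(1)
n_total, burn_in = 1200, 200
xi = rng.standard_normal(n_total)
x = np.zeros(n_total)
for i in range(3, n_total):
    x[i] = 0.8 * x[i - 1] + 0.3 * x[i - 2] - 0.25 * x[i - 3] + 0.8 * xi[i]
x = x[burn_in:]                            # drop the start-up transient

states = embed(x, m=4, tau=4)
dist = distance_matrix(states)
off_diagonal = dist[np.triu_indices(len(states), k=1)]
eps = np.quantile(off_diagonal, 0.1)       # threshold for a recurrence rate of 0.1
rr = np.mean(off_diagonal <= eps)
det_ar = determinism(diagonal_line_lengths(dist <= eps))
print(f"RR = {rr:.3f}, DET = {det_ar:.2f}")
```

Any \(DET\) value clearly above that of uncorrelated noise is obtained here from a process that is stochastic by construction, which is the point of the example.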
Indicators of periodic systems
As explained in the previous section, deterministic systems cause a high value of the RQA measure \(DET\). This measure has been successfully used to detect transitions in the dynamics of complex systems. A frequently used example to demonstrate this ability is the study of the different dynamical regimes of the logistic map, where \(DET\) is able to detect the periodic windows (by values \(DET = 1\)). Therefore, it is often claimed that this measure is able to detect chaos-period transitions.
However, we can also find such high \(DET\) values for non-periodic, chaotic systems. For example, the Roessler system exhibits a transition from periodic to chaotic states in the parameter interval \(c \in [35, 45]\) (Fig. 2A). But due to the smooth phase space trajectory and high sampling frequency (sampling time \(\Delta t = 0.1\)), the RP for the chaotic trajectory consists almost exclusively of diagonal line structures (Fig. 3), resulting in a high value of \(DET\), i.e., \(DET \approx 1\) (Fig. 2B).
A very high value of \(DET\) is therefore not a clear or even sufficient indication of a periodic system. High values can be caused by very smooth phase space trajectories. This should also be considered when looking for indications of unstable periodic orbits (UPOs), where \(DET\) or the mean and maximal line lengths \(L\) and \(L_{\max}\) may not be sufficient. A solution could be to increase the minimal length \(l_{\min}\) of a diagonal recurrence structure which is considered to be a line. However, a better solution is to look at the cumulative distribution of the diagonal line lengths and estimate the \(K_2\) entropy (but this requires much longer time series). Recent work has shown that measures coming from complex network theory, such as the clustering coefficient, applied to recurrence matrices are more powerful and reliable for the detection of periodic dynamics (Zou et al., 2010).
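The effect of smooth, finely sampled chaotic trajectories on \(DET\), and of raising \(l_{\min}\), can be tried out with a sketch like the following. Note the assumptions: it uses the commonly studied chaotic Roessler parameters \(a = b = 0.2\), \(c = 5.7\) rather than the parameter range of Fig. 2, a fourth-order Runge-Kutta integrator with internal step \(0.01\), the original three-dimensional state vectors without embedding, a maximum-norm threshold of \(1.5\), and a Theiler window of 20 samples. All of these are illustrative choices, not values from the study.

```python
import numpy as np

def roessler(s, a=0.2, b=0.2, c=5.7):
    """Right-hand side of the Roessler equations (parameters assumed here)."""
    x, y, z = s
    return np.array([-y - z, x + a * y, b + z * (x - c)])

def rk4_step(f, s, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(s)
    k2 = f(s + 0.5 * h * k1)
    k3 = f(s + 0.5 * h * k2)
    k4 = f(s + h * k3)
    return s + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def diagonal_line_lengths(R, theiler):
    """Lengths of diagonal lines above the LOI, skipping offsets k <= theiler."""
    lengths = []
    for k in range(theiler + 1, R.shape[0]):
        run = 0
        for v in np.diagonal(R, offset=k):
            if v:
                run += 1
            elif run:
                lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return np.array(lengths, dtype=int)

def determinism(lengths, l_min):
    """DET: share of recurrence points that lie on lines of length >= l_min."""
    total = lengths.sum()
    return lengths[lengths >= l_min].sum() / total if total else float("nan")

# Integrate with step 0.01 and keep every 10th point: sampling time 0.1.
h, stride, n_transient, n_keep = 0.01, 10, 500, 1500
s = np.array([1.0, 1.0, 0.0])
samples = []
for step in range((n_transient + n_keep) * stride):
    s = rk4_step(roessler, s, h)
    if (step + 1) % stride == 0:
        samples.append(s)
traj = np.array(samples[n_transient:])      # discard the transient

dist = np.abs(traj[:, None, :] - traj[None, :, :]).max(axis=2)
lengths = diagonal_line_lengths(dist <= 1.5, theiler=20)
det_lmin2 = determinism(lengths, l_min=2)
det_lmin10 = determinism(lengths, l_min=10)
print(f"DET (l_min=2): {det_lmin2:.3f}, DET (l_min=10): {det_lmin10:.3f}")
```

By construction, \(DET\) can only decrease or stay constant when \(l_{\min}\) is raised. How much it decreases for a given system indicates how strongly the value for small \(l_{\min}\) is driven by the smoothness of the trajectory.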
Indicators of chaos
The RP visualises the recurrence structure of the considered system (based on the phase space trajectory). The basic idea behind RPs comes, in general, from the study of chaos. Therefore, it can be considered a nonlinear tool for data analysis. But this origin does not justify interpreting complex structures in the RP or high values of RQA measures as indicators of chaos or nonlinearity in the dynamical system.
As mentioned above, uncorrelated stochastic systems have mostly short or almost no diagonal line structures in their RPs, whereas deterministic and regular systems, like periodic processes, have mostly long and continuous diagonal line structures. Chaotic processes also have diagonal lines, but shorter ones, and can have single recurrence points. Nevertheless, it is difficult (almost impossible) to infer the type of dynamics only from the appearance of an RP; only periodic and white noise processes can be identified with some certainty.
The alternative is to look at the RQA measures quantifying the structures in an RP which are related to some dynamical characteristics of the system. As diagonal lines in the RP correspond to parallel running trajectory segments, it is clear that the length of these lines is related to the divergence behaviour of the dynamical system. The divergence rate of phase space trajectories is measured by the Lyapunov exponent. In fact, the lengths of the diagonal lines are directly related to dynamical invariants such as the \(K_2\) entropy or the \(D_2\) correlation dimension (Faure & Korn, 1998; Thiel et al., 2004). The \(K_2\) entropy is a lower bound of the sum of the positive Lyapunov exponents.
For example, RQA measures based on the length of the diagonal lines, like determinism \(DET\) and mean line length \(L\), also depend on the type of the dynamics of the system (rather low values for uncorrelated stochastic (white noise) systems, higher values for more regular, correlated and also chaotic systems). It has been suggested to measure the length of the longest diagonal line \(L_{\max}\) and interpret its inverse \(DIV = \frac{1}{L_{\max}}\) as an estimator of the maximal Lyapunov exponent (Trulla et al., 1996). However, this interpretation carries a high risk of erroneous conclusions derived from RQA.
First, the main diagonal in the RP (i.e., the line of identity, LOI) is naturally the longest diagonal line, which is why it is usually excluded from the analysis. However, due to the tangential motion of the phase space trajectory, subsequent phase space vectors are often also considered as recurrence points (known as sojourn points) (Marwan et al., 2007); tangential motion becomes even more influential for highly sampled or smooth systems. These recurrence points lead to further continuous diagonal lines directly adjacent to the LOI. Without excluding an appropriate corridor along the LOI (the Theiler window), \(L_{\max}\) will be artificially large (\(\approx N\)) and \(DIV\) too small.
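A small numerical sketch can illustrate the role of the Theiler window. The test signal (a sum of two sines with incommensurate frequencies), the sampling, the threshold, and the window width of 50 samples are hypothetical choices made for this illustration. The signal is sampled so finely that consecutive values always differ by less than the threshold, so the diagonal next to the LOI is filled completely.

```python
import numpy as np

def max_diagonal_line(R, theiler):
    """Longest diagonal line above the LOI, ignoring offsets k <= theiler."""
    longest = 0
    for k in range(theiler + 1, R.shape[0]):
        run = 0
        for v in np.diagonal(R, offset=k):
            run = run + 1 if v else 0
            longest = max(longest, run)
    return longest

n, dt, eps = 600, 0.05, 0.15
t = dt * np.arange(n)
x = np.sin(t) + np.sin(np.sqrt(2.0) * t)     # smooth, finely sampled signal
R = np.abs(x[:, None] - x[None, :]) <= eps   # scalar series, no embedding

l_max_loi_only = max_diagonal_line(R, theiler=0)   # only the LOI excluded
l_max_theiler = max_diagonal_line(R, theiler=50)   # corridor of 50 samples excluded

for label, l_max in [("LOI excluded only", l_max_loi_only),
                     ("Theiler window 50", l_max_theiler)]:
    div = 1.0 / l_max if l_max else float("inf")
    print(f"{label}: L_max = {l_max}, DIV = {div:.4f}")
```

With only the LOI excluded, the longest line is the sojourn-point diagonal of length \(N-1\), so \(DIV\) reflects the sampling rather than the divergence of trajectories.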
Second, as explained above, even white noise can have long diagonal lines, leading to a small \(DIV\) value just by chance (Fig. 1). Although the probability for the occurrence of such long lines is rather small, the probability that lines of length two occur in RPs of stochastic processes is, on the contrary, rather high. Only one line of length two is enough to get a finite value of \(DIV\), which might be misinterpreted as a finite Lyapunov exponent and thus as evidence that the system is chaotic instead of stochastic.
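That white noise readily yields a finite \(DIV\) can be checked with a few lines. The series length, threshold, and random seed below are assumptions for illustration only; the series is not embedded and only the LOI is excluded.

```python
import numpy as np

def max_diagonal_line(R):
    """Longest diagonal line above the main diagonal (LOI excluded)."""
    longest = 0
    for k in range(1, R.shape[0]):
        run = 0
        for v in np.diagonal(R, offset=k):
            run = run + 1 if v else 0
            longest = max(longest, run)
    return longest

rng = np.random.default_rng(2)
x = rng.standard_normal(300)                 # Gaussian white noise
R = np.abs(x[:, None] - x[None, :]) <= 0.2   # recurrence rate of roughly 0.1

l_max = max_diagonal_line(R)
div = 1.0 / l_max if l_max else float("inf")
print(f"L_max = {l_max}, DIV = {div:.3f}")
```

The finite \(DIV\) obtained here comes from a purely stochastic series, so it carries no information about a Lyapunov exponent.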
Therefore, we have to be careful in interpreting the RQA measures themselves as indicators of chaos. Moreover, such a conclusion cannot be drawn by applying a simple surrogate test where the data points are simply shuffled (such a test would only destroy the correlation structure within the data and, thus, the frequency information).
RP or RQA alone cannot be used to infer nonlinearity from a time series. For this purpose, advanced surrogate techniques are more appropriate (Schreiber & Schmitz, 2000; Rapp et al., 2001).
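The difference between shuffling and a surrogate that keeps the linear correlation structure can be sketched as follows. The phase-randomised (Fourier transform) surrogate shown here is one basic member of the family of advanced surrogate techniques; it is chosen for illustration and is not claimed to be the specific procedure of the cited works. The AR(1) test series and all function names are assumptions.

```python
import numpy as np

def shuffle_surrogate(x, rng):
    """Random permutation: keeps the value distribution, destroys correlations."""
    return rng.permutation(x)

def phase_randomised_surrogate(x, rng):
    """Keeps the Fourier amplitudes (power spectrum), randomises the phases."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(spectrum))
    phases[0] = 0.0                        # keep the mean (zero frequency) unchanged
    if n % 2 == 0:
        phases[-1] = 0.0                   # the Nyquist component must stay real
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

def lag1_autocorrelation(x):
    """Sample autocorrelation at lag one."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

# Correlated test series: AR(1) process (hypothetical example data).
rng = np.random.default_rng(3)
n = 512
x = np.zeros(n)
for i in range(1, n):
    x[i] = 0.9 * x[i - 1] + rng.standard_normal()

shuffled = shuffle_surrogate(x, rng)
randomised = phase_randomised_surrogate(x, rng)

print("lag-1 autocorrelation")
print(f"  original:         {lag1_autocorrelation(x):.2f}")
print(f"  shuffled:         {lag1_autocorrelation(shuffled):.2f}")
print(f"  phase-randomised: {lag1_autocorrelation(randomised):.2f}")
```

The shuffled series loses the autocorrelation, so a difference in an RQA measure between data and shuffled surrogates only shows that the data are correlated. The phase-randomised series retains the power spectrum and corresponds to the null hypothesis of a linear stochastic process.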
Significance of RQA measures
When analysing time series by a windowed RQA, an important
question is how significant the variation of the RQA measures is.
A sub-optimal scaling of the plotted RQA measures
can mislead us into concluding that the studied system
has changed its regime or that it is nonstationary
(Fig. 4A, B). Therefore, it is strongly recommended
to cross-check the scaling of the presentation and
to present confidence intervals (Fig. 4C, D).
Confidence intervals can be calculated in various
ways, but we should avoid deriving them by simply
shuffling the original data. One approach could be
a bootstrap resampling of the line structures in the
RP (Marwan et al., 2013). Another approach
fits the probability of serial dependences (diagonal lines)
to a binomial distribution (Hirata et al., 2011). Whatever approach
we choose, the estimation of the confidence intervals is
not a trivial task, but in the future the standard software
for RQA should include such tests.
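As a rough illustration of the bootstrap idea, the sketch below resamples a list of diagonal line lengths with replacement and derives a percentile interval for \(DET\). It is a strongly simplified sketch, not the procedure of the cited work: it treats the lines as independent, the line lengths are hypothetical stand-in data drawn from a geometric distribution instead of a real RP, and the number of resamplings and the function name are assumptions.

```python
import numpy as np

def bootstrap_det_interval(lengths, l_min=2, n_boot=1000, alpha=0.05, rng=None):
    """Percentile bootstrap interval for DET from resampled line lengths."""
    rng = np.random.default_rng() if rng is None else rng
    lengths = np.asarray(lengths)
    dets = np.empty(n_boot)
    for b in range(n_boot):
        # Draw as many lines as observed, with replacement.
        sample = rng.choice(lengths, size=len(lengths), replace=True)
        dets[b] = sample[sample >= l_min].sum() / sample.sum()
    return np.quantile(dets, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(4)
# Hypothetical line lengths (all >= 1); in practice they come from the RP window.
lengths = rng.geometric(0.5, size=500)

det_hat = lengths[lengths >= 2].sum() / lengths.sum()
lower, upper = bootstrap_det_interval(lengths, rng=rng)
print(f"DET = {det_hat:.3f}, 95% interval: [{lower:.3f}, {upper:.3f}]")
```

In a windowed analysis, such an interval would be computed per window, so that a change in \(DET\) between windows can be judged against the width of the intervals.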
A common statement on recurrence analysis is that it is useful for analysing short data series. But we have to ask, how short is short? The required length for the estimation of dynamical invariants will be discussed in the following Subsect. When applying RQA, we should be aware that the RQA measures are statistical measures (like an average) and need some minimal length before a variation can be considered to be significant.
