Analysing the dynamics and relative influence of variables affecting ecosystem responses using functional PCA and boosted regression trees: A seagrass case study
Methods in Ecology and Evolution
School of Science / Centre for Marine Ecosystems Research
- Understanding the relative influence of variables on ecosystem responses, and the dynamics of their effects, is necessary for effective ecosystem monitoring and management. We develop an approach to this problem, also known as causal pathways analysis, using functional Principal Components Analysis (fPCA) and machine learning within a scenario analysis framework.
- fPCA is used to identify the most influential variables for the correlated, non‐homogeneous and nonlinear time series data characteristic of complex ecosystems. Hierarchical clustering of fPCA scores reveals groups of more homogeneous scenarios and similarly influential variables. The resultant subset of variables helps to overcome model identifiability problems when analysing time‐lagged effects using Boosted Regression Trees (BRT).
- We use simulated data generated by a Dynamic Bayesian Network (DBN) of ecological windows for seagrass ecosystems given dredging stressors; 3,024 scenarios with 75 state variables are analysed. The BRT demonstrated a high level of fit (R² ≈ 0.97, MSE ≈ 0.16), supporting the validity of influential variables identified by fPCA. Influential variables identified included genus, location type, light, growth and seed. Six consecutive months of positive growth and adequate light were important for predicting states of high or moderate population.
- Compared to traditional scenario analysis and sensitivity analysis approaches, our approach captured n‐way interactions while simultaneously accounting for time correlations. Some variables and their dynamics agreed with existing knowledge, but new variables and/or time lags of their effects were also identified, corresponding to opportunities for further investigation as well as informing monitoring and management. Although we demonstrate our method on state variables with DBN simulated data, it is equally applicable to general time series data.
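
The fPCA step summarised above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each scenario's time series is observed on a common, evenly spaced grid, so that fPCA reduces to an ordinary PCA of the centred curves computed by singular value decomposition. The function name `fpca_scores`, the 24-step grid, and the synthetic curves are all hypothetical and serve only as an illustration.

```python
import numpy as np


def fpca_scores(curves, n_components=2):
    """Discretised fPCA sketch (assumption: common, evenly spaced time grid).

    curves: array of shape (n_scenarios, n_timepoints).
    Returns per-scenario scores, discretised eigenfunctions, and the
    proportion of variance explained by each retained component.
    """
    curves = np.asarray(curves, dtype=float)
    centred = curves - curves.mean(axis=0)  # subtract the mean curve
    # Rows of vt are the discretised eigenfunctions (principal components).
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, vt[:n_components], explained[:n_components]


# Hypothetical synthetic data: 50 scenarios, 24 time steps, two dominant
# modes of variation plus a small amount of noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 24)
a = rng.normal(size=(50, 1))
b = rng.normal(size=(50, 1))
curves = (
    3.0 * a * np.sin(2 * np.pi * t)
    + 1.0 * b * np.cos(2 * np.pi * t)
    + 0.05 * rng.normal(size=(50, 24))
)

scores, components, explained = fpca_scores(curves, n_components=2)
```

In a workflow like the one described, the rows of `scores` would be the input to hierarchical clustering, and the variables retained after clustering would be passed to the BRT; both of those steps are outside this sketch.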