I'm working at the interface between machine learning and environmental science. My research currently focuses on the detection of extreme events in Earth observation data (such as fAPAR, NDVI, carbon fluxes). The question of how climate extremes affect ecosystems is of high importance for understanding climate-carbon feedbacks and hence future climate predictions. I try to identify large-scale extremes in impact variables and trace them back to climatic drivers and external disturbances.
Zscheischler, J., Michalak, A., Schwalm, M., Mahecha, M., Huntzinger, D., Reichstein, M., Berthier, G., Ciais, P., Cook, R., El-Masri, B., Huang, M., Ito, A., Jain, A., King, A., Lei, H., Lu, C., Mao, J., Peng, S., Poulter, B., Ricciuto, D., Shi, X., Tao, B., Tian, H., Viovy, N., Wang, W., Wei, Y., Yang, J., Zeng, N.
Reichstein, M., Bahn, M., Ciais, P., Frank, D., Mahecha, M., Seneviratne, S., Zscheischler, J., Beer, C., Buchmann, N., Frank, D., Papale, D., Rammig, A., Smith, P., Thonicke, K., van der Velde, M., Vicca, S., Walz, A., Wattenbach, M.
Climate Extremes and the Carbon Cycle
Nature, 500, pages: 287-295, 2013 (article)
Latest climate projections suggest that both frequency and intensity of climate extremes will be substantially modified over the course of the coming decades. As a consequence, we need to understand to what extent and via which pathways climate extremes affect the state and functionality of terrestrial ecosystems and the associated biogeochemical cycles on a global scale. So far, the impacts of climate extremes on the terrestrial biosphere have mainly been investigated on the basis of case studies, while global assessments are widely lacking. In order to facilitate global analysis of this kind, we present a methodological framework that first detects spatiotemporally contiguous extremes in Earth observations and then infers the likely pathway of the preceding climate anomaly. The approach does not require long time series, is computationally fast, and is easily applicable to a variety of data sets with different spatial and temporal resolutions. The key element of our analysis strategy is to search directly in the relevant observations for spatiotemporally connected components exceeding a certain percentile threshold. We also put an emphasis on the characterization of the extreme event distribution and scrutinize the attribution issue. We exemplify the analysis strategy by exploring the fraction of absorbed photosynthetically active radiation (fAPAR) from 1982 to 2011. Our results suggest that the hot spots of extremes in fAPAR lie in Northeastern Brazil, Southeastern Australia, Kenya and Tanzania. Moreover, we demonstrate that the size distribution of extremes follows a distinct power law. The attribution framework reveals that extremes in fAPAR are primarily driven by phases of water scarcity.
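The detection step described in this abstract can be illustrated with a small sketch. It flags every voxel of a (time, latitude, longitude) data cube that falls below a low percentile and then groups neighbouring flagged voxels into events by breadth-first search. The function names, the 10th-percentile default, the use of the lower tail and the 26-neighbour connectivity are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from collections import deque
from itertools import product


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values given")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def detect_extremes(cube, q=10.0):
    """Group voxels below the q-th percentile into connected events.

    cube is a nested list indexed as cube[t][y][x]. Two extreme voxels
    belong to the same event if they touch in time or space
    (26-connectivity, an assumption of this sketch).
    Returns a list of events, largest first; each event is a list of
    (t, y, x) tuples.
    """
    nt, ny, nx = len(cube), len(cube[0]), len(cube[0][0])
    flat = [v for plane in cube for row in plane for v in row]
    threshold = percentile(flat, q)
    extreme = {
        (t, y, x)
        for t, y, x in product(range(nt), range(ny), range(nx))
        if cube[t][y][x] < threshold
    }
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    events, seen = [], set()
    for start in sorted(extreme):
        if start in seen:
            continue
        seen.add(start)
        queue, event = deque([start]), []
        while queue:  # breadth-first search over neighbouring extreme voxels
            t, y, x = queue.popleft()
            event.append((t, y, x))
            for dt, dy, dx in offsets:
                nb = (t + dt, y + dy, x + dx)
                if nb in extreme and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        events.append(event)
    return sorted(events, key=len, reverse=True)
```

Calling `detect_extremes(cube)` on a nested list of anomalies returns the events ordered by size, so the lengths of the returned lists give the empirical size distribution of extremes. For real data cubes an array library would be the more practical choice; plain lists are used here only to keep the sketch free of dependencies.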
In Proceedings of the International Conference on Computational Science, 18, pages: 2337 - 2346, Procedia Computer Science, (Editors: Alexandrov, V and Lees, M and Krzhizhanovskaya, V and Dongarra, J and Sloot, PMA), Elsevier, Amsterdam, Netherlands, ICCS, 2013 (inproceedings)
Artificial Intelligence, 182-183, pages: 1-31, May 2012 (article)
While conventional approaches to causal inference are mainly based on conditional (in)dependences, recent methods also account for the shape of (conditional) distributions. The idea is that the causal hypothesis “X causes Y” imposes that the marginal distribution P_X and the conditional distribution P_{Y|X} represent independent mechanisms of nature. Recently it has been postulated that the shortest description of the joint distribution P_{X,Y} should therefore be given by separate descriptions of P_X and P_{Y|X}. Since description length in the sense of Kolmogorov complexity is uncomputable, practical implementations rely on other notions of independence. Here we define independence via orthogonality in information space. This way, we can explicitly describe the kind of dependence that occurs between P_Y and P_{X|Y} making the causal hypothesis “Y causes X” implausible. Remarkably, this asymmetry between cause and effect becomes particularly simple if X and Y are deterministically related. We present an inference method that works in this case. We also discuss some theoretical results for the non-deterministic case although it is not clear how to employ them for a more general inference method.
In Proceedings of the International Conference on Computational Science, 9, pages: 897-906, Procedia Computer Science, (Editors: H. Ali, Y. Shi, D. Khazanchi, M. Lees, G.D. van Albada, J. Dongarra, P.M.A. Sloot), Elsevier, Amsterdam, Netherlands, ICCS, June 2012 (inproceedings)
Classifying the land surface according to different climate zones is often a prerequisite for global diagnostic or predictive modelling studies. Classical classifications such as the prominent Köppen–Geiger (KG) approach rely on heuristic decision rules. Although these heuristics may transport some process understanding, such a discretization may appear “arbitrary” from a data-oriented perspective. In this contribution we compare the precision of a KG classification to an unsupervised classification (k-means clustering). Generally speaking, we revisit the problem of “climate classification” by investigating the inherent patterns in multiple data streams in a purely data-driven way. One question is whether we can reproduce the KG boundaries by exploring different combinations of climate and remotely sensed vegetation variables. In this context we also investigate whether climate and vegetation variables build similar clusters. In terms of statistical performances, k-means clearly outperforms classical climate classifications. However, a subsequent stability analysis only reveals a meaningful number of clusters if both climate and vegetation data are considered in the analysis. This is a setback for the hope to explain vegetation by means of climate alone. Clearly, classification schemes like Köppen–Geiger will play an important role in the future. However, future developments in this area need to be assessed based on data-driven approaches.
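The unsupervised classification mentioned in this abstract can be sketched as follows. Each grid cell is described by a tuple of climate and vegetation variables, the variables are standardized so that different units are comparable, and Lloyd's k-means algorithm groups the cells. The function names, the z-score standardization and the deterministic farthest-first initialization are assumptions of this sketch; the abstract does not say how the authors preprocessed the data or initialized the clustering.

```python
def standardize(rows):
    """Z-score each column so variables with different units are comparable."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [
        (sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
        for c, m in zip(cols, means)
    ]
    return [tuple((v - m) / s for v, m, s in zip(r, means, stds)) for r in rows]


def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm with farthest-first initialization (a sketch choice)."""
    centers = [tuple(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(
            tuple(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
        )
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so the centers are too
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers
```

A typical call would be `labels, centers = kmeans(standardize(rows), k)`, where each row holds, for example, mean temperature, annual precipitation and a vegetation index for one grid cell (hypothetical variables chosen for illustration). Running it once on climate variables alone and once on climate plus vegetation variables gives two label maps that can be compared with each other and with a KG map.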
In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, pages: 839-847, (Editors: F.G. Cozman and A. Pfeffer), AUAI Press, Corvallis, OR, USA, UAI, July 2011 (inproceedings)
We propose a method that infers whether linear relations between two high-dimensional variables X and Y are due to a causal influence from X to Y or from Y to X. The earlier proposed so-called Trace Method is extended to the regime where the dimension of the observed variables exceeds the sample size. Based on previous work, we postulate conditions that characterize a causal relation between X and Y. Moreover, we describe a statistical test and argue that both causal directions are typically rejected if there is a common cause. A full theoretical analysis is presented for the deterministic case but our approach seems to be valid for the noisy case, too, for which we additionally present an approach based on a sparsity constraint. The discussed method yields promising results for both simulated and real world data.
In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, pages: 143-150, (Editors: P Grünwald and P Spirtes), AUAI Press, Corvallis, OR, USA, UAI, July 2010 (inproceedings)
We consider two variables that are related to each other by an invertible function. While it has previously been shown that the dependence structure of the noise can provide hints to determine which of the two variables is the cause, we presently show that even in the deterministic (noise-free) case, there are asymmetries that can be exploited for causal inference. Our method is based on the idea that if the function and the probability density of the cause are chosen independently, then the distribution of the effect will, in a certain sense, depend on the function. We provide a theoretical analysis of this method, showing that it also works in the low noise regime, and link it to information geometry. We report strong empirical results on various real-world data sets from different domains.
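One way to make this asymmetry concrete is a slope-based score, sketched below. Both variables are rescaled to [0, 1], the points are sorted along the candidate cause, and the logarithm of the absolute slope between neighbouring points is averaged. The direction with the smaller score is taken as the causal one. The function names, the rescaling to the unit interval and the handling of ties are assumptions of this sketch, so it should be read as an illustration of the idea and not as the estimator used in the paper.

```python
import math


def rescale(values):
    """Map values linearly onto [0, 1] (assumes they are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def slope_score(cause, effect):
    """Mean log absolute slope of effect against cause, sorted by cause."""
    pairs = sorted(zip(rescale(cause), rescale(effect)))
    total, count = 0.0, 0
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx != 0 and dy != 0:  # skip ties, where the log slope is undefined
            total += math.log(abs(dy / dx))
            count += 1
    if count == 0:
        raise ValueError("need at least two distinct points")
    return total / count


def infer_direction(x, y):
    """Return the direction with the smaller slope score."""
    c_xy, c_yx = slope_score(x, y), slope_score(y, x)
    return "X -> Y" if c_xy < c_yx else "Y -> X"
```

For example, with `x` spread evenly over [0, 1] and `y = x ** 3`, the score from `x` to `y` is negative and the score in the reverse direction is positive, so `infer_direction(x, y)` favours `x` as the cause. For a strictly monotonic noise-free relation the two scores are negatives of each other, so only the sign matters in that case.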