More thinking about less data: a perspective from the 2nd Provence Summer Workshop
Nature Publishing Group
Faculty of Health, Engineering and Science
School of Medical Sciences/Centre of Excellence for Alzheimer's Disease Research and Care
Doppler intuited that a sound's pitch could be altered by the relative velocity between the source and an observer; 70 years later, Hubble used the same principle and 42 data points to prove the universe was indeed expanding. Arguably, no other data set of 0.042 Kb has done more to change our understanding of the cosmos. Although the data set was modest in volume, it took Hubble several years to acquire these precious numbers. Nowadays we conduct neuroscience in a state of instant data overload. In a matter of hours we can produce a structural image of an individual's brain comprising a matrix of 256 x 256 x 128 = 8 388 608 data points and a resting-state functional magnetic resonance imaging (MRI) time series (83 MB) and, from a simple blood sample, derive the person's genetic sequence by GWAS (30 K), state of gene expression by microarray (another 30 K) and metabolomic profile using any of a number of commercially available chips (1 K). We need not necessarily stop there. In principle, the number of brain-gene-omic interactions on permutation alone approaches 10^19.
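One plausible way to arrive at a figure of this order is simply to multiply the counts quoted above. The sketch below does that arithmetic; it assumes that "30 K" and "1 K" mean 30 000 and 1 000 measurements, and that an "interaction" is one voxel paired with one genetic marker, one transcript and one metabolite. The variable names are illustrative, and the text does not state that this is how the estimate was derived.

```python
# Counts quoted in the text; the names are illustrative assumptions.
voxels = 256 * 256 * 128      # structural MRI matrix
genetic_markers = 30_000      # "30 K" from GWAS
transcripts = 30_000          # "another 30 K" from microarray
metabolites = 1_000           # "1 K" from a metabolomic chip

# Assumption: one interaction = one item drawn from each of the four sets.
combinations = voxels * genetic_markers * transcripts * metabolites

print(f"voxels: {voxels:,}")
print(f"combinations: {combinations:.2e}")
```

Under these assumptions the product is about 7.5 x 10^18, which is consistent with a number that "approaches" 10^19.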