Thursday, January 14, 2010



Friendship network of children in a US school [James Moody]

'The Fourth Paradigm', a remarkable publication from Microsoft Research, is about discovery based on data-intensive science - a new kind of scientific exploration.

Gordon Bell writes: 'It was Tycho Brahe’s assistant Johannes Kepler who took Brahe’s catalog of systematic astronomical observations and discovered the laws of planetary motion. This established the division between the mining and analysis of captured and carefully archived experimental data and the creation of theories. This division is one aspect of the Fourth Paradigm.'

The book is a collection of essays by various authors on many different aspects of this broad subject. To whet your appetite, here is an extract from an excellent review of the book by Michael Nielsen [Nature, vol. 462, 10 December 2009]:

'Hundreds of projects in fields ranging from genomics to computational linguistics to astronomy demonstrate a major shift in the scale at which scientific data are taken, and in how they are processed, shared and communicated to the world. Most significantly, there is a shift in how researchers find meaning in data, with sophisticated algorithms and statistical techniques becoming part of the standard scientific toolkit. The Fourth Paradigm is about this shift, how scientists are dealing with it, and some of the consequences. Its 30 chapters, written by some 70 authors, cover a wide range of aspects of data-intensive science.

'The book is in four parts. The first two parts are a panorama of the new ways in which data are obtained, through new instruments and large-scale sensor networks. The fields covered range from cosmology to the environment and from healthcare to biology. Most of the chapters in these sections follow a common pattern. Each introduces a complex system of scientific interest - the human brain, the world's oceans, the global health system and so on - before supplying an explanation of how we are building an instrument or a network of sensors to map out that system comprehensively and, in some cases, to track its real-time behaviour.

'We learn in one chapter, for example, about steps towards building a complete map of the human brain - the "connectome". Another chapter describes the Ocean Observatories Initiative, a major effort funded by the US National Science Foundation to build an enormous underwater sensor network in the northeast Pacific, off the coasts of Oregon, Washington and British Columbia. And so on, example after example.'

The book is also about the next step in libraries: digital data libraries. It concerns the whole range of issues to do with curating, preserving and making accessible, now and in the future, scientific data.

Gordon Bell, in the foreword to the book, writes: 'I believe that we will soon see a time when data will live forever as archival media - just like paper-based storage - and be publicly accessible in the “cloud” to humans and machines. Only recently have we dared to consider such permanence for data, in the same way we think of “stuff” held in our national libraries and museums! Such permanence still seems far-fetched until you realize that capturing data provenance, including individual researchers’ records and sometimes everything about the researchers themselves, is what libraries insist on and have always tried to do. The “cloud” of magnetic polarizations encoding data and documents in the digital library will become the modern equivalent of the miles of library shelves holding paper and embedded ink particles.'

What is envisaged here is an interlinked network of the world's scientific knowledge in one big database.


A Web Mapping information visualisation tool, showing a representation of the 'Information Visualisation' web community. [Starlight]

The Fourth Paradigm is available as a free PDF download from Microsoft Research.

Graphics from the excellent site Visual Complexity
