36 The data deluge: an e-Science perspective

Tony Hey1,2 and Anne Trefethen1
1 EPSRC, Swindon, United Kingdom
2 University of Southampton, Southampton, United Kingdom

Grid Computing – Making the Global Infrastructure a Reality. Edited by F. Berman, A. Hey and G. Fox. 2003 John Wiley & Sons, Ltd. ISBN 0-470-85319-0

INTRODUCTION

There are many issues that should be considered in examining the implications of the imminent flood of data that will be generated both by the present and by the next generation of global ‘e-Science’ experiments. The term e-Science is used to represent the increasingly global collaborations – of people and of shared resources – that will be needed to solve the new problems of science and engineering [1]. These e-Science problems range from the simulation of whole engineering or biological systems, to research in bioinformatics, proteomics and pharmacogenetics. In all these instances we will need to be able to pool resources and to access expertise distributed across the globe. The information technology (IT) infrastructure that will make such collaboration possible in a secure and transparent manner is referred to as the Grid [2]. Thus, in this chapter the term Grid is used as a shorthand for the middleware infrastructure that is currently being developed to support global e-Science collaborations. When mature, this Grid middleware will enable the sharing of computing resources, data resources and experimental facilities in a much more routine and secure fashion than is possible at present. Needless to say, present Grid middleware falls far short of these ambitious goals. Both e-Science and the Grid have fascinating sociological as well as technical aspects. We shall consider only technological issues in this chapter.

The two key technological drivers of the IT revolution are Moore's Law – the exponential increase in computing power and solid-state memory – and the dramatic increase in communication bandwidth made possible by .