Editorial Reviews
From the Publisher
"…a useful introductory overview of a wide range of chemometric techniques…" (Organic Process Research & Development)
"...a wonderful merging of statistical principles with more sophisticated chemometric designs..." (Spectroscopy Online, Vol 19(3), March 2004)
Chemometrics
Data Analysis for the Laboratory and Chemical Plant
By Richard G. Brereton
John Wiley & Sons
ISBN: 0471489786
Chapter One
Introduction
1.1 Points of View
There are numerous groups of people interested in chemometrics. One of the problems over the past two decades is that each group has felt it is dominant or unique in the world. This is because scientists tend to be rather insular. An analytical chemist will publish in analytical chemistry journals and work in an analytical chemistry department; a statistician, chemical engineer or organic chemist will tend to gravitate towards his or her own colleagues. There are a few brave souls who try to cross disciplines, but on the whole this is difficult. Many of the latest advances in theoretical statistics are too advanced for routine chemometrics applications, whereas many of the problems encountered by the practising analytical chemist, such as calibrating pipettes and checking balances, are too mundane for the statistician. Cross-citation analysis of different groups of journals, where one looks at which journal cites which other journal, provides fascinating insights into the gap between the theoretical statistics and chemometrics literature and the applied analytical chemistry journals. The potential for chemometrics is huge, ranging from physical chemistry such as kinetics and equilibrium studies, to organic chemistry such as reaction optimisation and QSAR, theoretical chemistry, and most areas of chromatography and spectroscopy, on to applications as varied as environmental monitoring, scientific archaeology, biology, forensic science, industrial process monitoring and geochemistry. On the whole, however, there is no focus, the ideas being dissipated in each discipline separately. The specialist chemometrics community tends to be mainly interested in industrial process control and monitoring plus certain aspects of analytical chemistry, mainly near-infrared spectroscopy, probably because these are areas where there is significant funding for pure chemometrics research.
A small number of tutorial papers, reviews and books are known by the wider community, but on the whole there is quite a gap, especially between computer based statisticians and practising analytical chemists.
This division between disciplines spills over into industrial research. There are often quite separate data analysis and experimental sections in many organisations. A mass spectrometrist interested in principal components analysis is unlikely to be given time by his or her manager to spend a couple of days a week mastering the various ins and outs of modern chemometric techniques. If the problem is simple, that is fine; if more sophisticated, the statistician or specialist data analyst will muscle in, and try to take over the project. But the statistician may have no feeling for the experimental difficulties of mass spectrometry, and may not understand when it is most effective to continue with the interpretation and processing of data, or when to suggest changing some mass spectrometric parameters.
All these people have some interest in data analysis or chemometrics, but approach the subject in radically different ways. Writing a text that is supposed to appeal to a broad church of scientists must take this into account. The average statistician likes to build on concepts such as significance tests, matrix least squares and so on. A statistician is unlikely to be satisfied if he or she cannot understand a method in algebraic terms. Most texts, even most introductory texts, aimed at statisticians contain a fair amount of algebra. Chemical engineers, whilst not always so keen to learn about distributions and significance tests, are often very keen on matrix algebra, and a chemometrics course taught by a chemical engineer will often start with matrix least squares and linear algebra.
Practical chemists, on the other hand, often think quite differently. Many laboratory based chemists are doing what they are doing precisely because at an early phase in their career they were put off by mathematicians. This is especially so with organic chemists. They do not like ideas expressed in terms of formal maths, and equations are 'turn offs'. So a lecture course aimed at organic chemists would contain a minimum of maths. Yet some of these people recognise later in their career that they do need data analytical tools, even if these are to design simple experiments or for linear regression, or in QSAR. They will not, however, be attracted to chemometrics if they are told they are required first to go on a course on advanced statistical significance testing and distributions, just to be able to perform a simple optimisation in the laboratory. I was told once by a very advanced mathematical student that it was necessary to understand Galois field theory in order to perform multilevel calibration designs, and that everyone in chemometrics should know what Krylov space is. For someone coming from a discipline close to computing and physics, this may be true. In fact, the theoretical basis of some of the methods can be best understood by these means. However, tell an experimentalist in the laboratory that this understanding is required prior to performing these experiments and he or she, even if convinced that chemometrics has an important role, will shy away. In this book we do not try to introduce the concepts of Galois field theory or Krylov space, although I would suspect not many readers would be disappointed by such omissions.
Analytical chemists are major users of chemometrics, but their approach to the subject often causes big dilemmas. Many analytical chemists are attracted to the discipline because they are good at instrumentation and practical laboratory work. The majority spend their days recording spectra or chromatograms. They know what to do if a chromatographic column needs changing, or if a mass spectrum is not fragmenting as expected. Few have opted to work in this area specifically because of their mathematical background, yet many are confronted with huge quantities of data. The majority of analytical chemists accept the need for statistics and a typical education would involve some small level of statistics, such as comparison of means and of errors and a little on significance tests, but the majority of analytical texts approach these subjects with a minimum of maths. A number then try to move on to more advanced data analysis methods, mainly chemometrics, but often do not recognise that a different knowledge base and skills are required. The majority of practising analytical chemists are not mathematicians, and find equations difficult; however, it is important to have some understanding of the background to the methods they use. Quite correctly, it is not necessary to understand the statistical theory of principal components analysis or singular value decomposition or even to write a program to perform this (although it is in fact very easy!). However, it is necessary to have a feel for methods for data scaling, variable selection and interpretation of the principal components, and if one has such knowledge it probably is not too difficult to expand one's understanding to the algorithms themselves. In fact, the algorithms are a relatively small part of the data analysis, and even in a commercial chemometric software package PCA or PLS (two popular approaches) may involve between 1 and 5% of the code.
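To illustrate the remark that a program for principal components analysis is short, and that scaling is a separate decision from the algorithm itself, the following is a minimal sketch in Python using NumPy's singular value decomposition. It is not taken from the book: the function name `pca_svd`, the `scale` option and the random data matrix are all assumptions made for illustration.

```python
import numpy as np

def pca_svd(X, n_components=2, scale=False):
    """PCA of a samples-by-variables matrix via SVD (illustrative sketch).

    Hypothetical helper, not the book's code. Columns are always
    mean-centred; if scale=True each column is also divided by its
    sample standard deviation (often called autoscaling).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # mean-centre each variable
    if scale:
        Xc = Xc / X.std(axis=0, ddof=1)     # standardise each variable
    # Thin SVD: Xc = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates
    loadings = Vt[:n_components]                      # one row per component
    explained = (s ** 2) / np.sum(s ** 2)             # fraction of variance
    return scores, loadings, explained[:n_components]

# Made-up data: 10 samples, 4 variables (random numbers, not real measurements)
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 4))
scores, loadings, explained = pca_svd(X, n_components=2, scale=True)
```

The decomposition itself is a single library call. Most of the choices that affect interpretation (whether to scale, how many components to keep, which variables to include) sit outside it, which is consistent with the point made above that the algorithm is a small part of the analysis.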
The relationship of chemometrics to different disciplines is indicated in Figure 1.1. On the left are the enabling sciences, mainly quite mathematical and not laboratory based. Statistics, of course, plays a major role in chemometrics, and many applied statisticians will be readers of this book. Statistical approaches are based on mathematical theory, so statistics falls between mathematics and chemometrics. Computing is important as much of chemometrics relies on software. However, chemometrics is not really computer science, and this book will not describe approaches such as neural networks or genetic programming, despite their potential importance in helping solve many complex problems in chemistry. Engineers, especially chemical and process engineers, have an important need for chemometric methods in many areas of their work, and have a quite different perspective from the mainstream chemist.
On the right are the main disciplines of chemistry that benefit from chemometrics. Analytical chemistry is probably the most significant area, although some analytical chemists make the mistake of claiming chemometrics uniquely as their own. Chemometrics has a major role to play and had many of its origins within analytical chemistry, but is not exclusively within this domain. Environmental chemists, biologists, food chemists as well as geochemists, chemical archaeologists, forensic scientists and so on depend on good analytical chemistry measurements, and many routinely use multivariate approaches, especially for pattern recognition, and so need chemometrics to help interpret their data. These scientists tend to identify with analytical chemists. The organic chemist has a somewhat different need for chemometrics, primarily in the areas of experimental design (e.g. optimising reaction conditions) and QSAR (quantitative structure-activity relationships) for drug design. Finally, physical chemists such as spectroscopists, kineticists and materials scientists often come across methods for signal deconvolution and multivariate data analysis.
Different types of people will be interested in chemometrics, as illustrated in Figure 1.2. The largest numbers are application scientists. Many of these will not have a very strong mathematical background, and their main interest is to define the need for data analysis, to design experiments and to interpret results. This group may consist of some tens of thousands of people worldwide, and is quite large. A smaller number of people will apply methods in new ways, some of them developing software. These may well be consultants who interface with the users: many specialist academic research groups are at this level. They are not doing anything astoundingly novel as far as theoretical statisticians are concerned, but they will take problems that are too tough and complex for an applications scientist and produce new solutions, often tinkering with the existing methods. Industrial data analysis sections and dedicated software houses usually fit into this category too. There will be a few thousand people in such categories worldwide, often organised into diverse disciplines. A rather smaller number of people will be involved in implementing the first applications of computational and statistical methods to chemometrics. There is a huge theoretical statistical and computational literature of which only a small portion will eventually be useful to chemists. In-vogue approaches such as multimode data analysis, Bayesian statistics, and wavelet transforms are as yet not in common currency in mainstream chemistry, but fascinate the more theoretical chemometrician, and over the years some will make their way into the chemists' toolbox. Perhaps in this group there are a few hundred or so people around the world, often organised into very tightly knit communities. At the top of the heap are a very small number of theoreticians.
Not much of chemical data analysis is truly original from the point of view of the mathematician: many of the 'new' methods might have been reported in the mathematical literature 10, 20 or even 50 years ago; maybe the number of mathematically truly original chemometricians is 10 or fewer. However, mathematical novelty is not the only sign of innovation. In fact, much of science involves connecting ideas. A good chemometrician may have the mathematical ability to understand the ideas of the theoretician and then translate these into potential applications. He or she needs to be a good listener and to be able to link the various levels of the triangle. Chemical data analysis differs from more unitary disciplines such as organic chemistry, where most scientists have a similar training base, and above a certain professional level the difference is mainly in the knowledge base.
Readers of this book are likely to be of two kinds, as illustrated in Figure 1.3. The first are those who wish to ascend the triangle, either from outside or from a low level. Many of these might be analytical chemists, for example an NIR spectroscopist who has seen the need to process his or her data and may wish some further insight into the methods being used. Or an organic chemist might wish to have the skills to optimise a synthesis, or a food chemist may wish to be able to interpret the tools for relating the results of a taste panel to chemical constituents. Possibly you have read a paper, attended a conference or a course or seen some software demonstrated. Or perhaps in the next-door laboratory someone is already doing some chemometrics; perhaps you have heard about experimental design or principal components analysis and need some insight into the methods. Maybe you have some results but have little idea how to interpret them, and perhaps by changing parameters using a commercial package you are deluged with graphs and not really certain whether they are meaningful. Some readers might be MSc or PhD students wishing to delve a little deeper into chemometrics.
The second group already has some mathematical background but wishes to enter the triangle from the side. Some readers of this book will be applied statisticians, often working in industry. Matrix algebra, significance tests and distributions are well known to them, but what is needed is to brush up on techniques as applied specifically to chemical problems. In some organisations there are specific data processing sections, and this book is intended as a particularly useful reference for professionals working in such an environment. Because there are not a large number of intensive courses in chemical data analysis, especially leading to degrees, someone with a general background in statistical data analysis who has moved job or is taking on extra responsibilities will find this book a valuable reference. Chemical engineers have a special interest in chemometrics, and many are encountering the ideas when used to monitor processes.
1.2 Software and Calculations
The key to chemometrics is to understand how to perform meaningful calculations on data. In most cases these calculations are too complex to do by hand or using a calculator, so it is necessary to use some software.
The approach taken in this text, which differs from many books on chemometrics, is to understand the methods using numerical examples. Some excellent texts and reviews are more descriptive, listing the methods available together with literature references and possibly some examples. Others have a big emphasis on equations and output from packages. This book, however, is based primarily on how I personally learn and understand new methods, and how I have found it most effective to help students working with me.