Field of Science


Jim Simons: "We never override the computer"

Billionaire-mathematician Jim Simons has been called the most successful investor of all time. His Renaissance Technologies hedge fund has returned an average of 40% annually (after fees) over the last 20 years. The firm uses proprietary mathematical algorithms and models to exploit statistical asymmetries and fluctuations in stock prices and turn those price differentials into profit.

Simons made groundbreaking contributions to geometry and topology before founding Renaissance, and his background enabled him to recruit top mathematicians, computer scientists, physicists and statisticians to the company. In fact, the company actively stays away from recruiting anyone with a financial or Wall Street background.

I've been enjoying the recent biography of Simons, "The Man Who Solved the Market", by Gregory Zuckerman. But there's an interesting video of a Simons talk at San Francisco State University from 2014 in which he says something very intriguing about the models that Renaissance builds:

"The only rule is that we never override the computer. No one ever comes in any day and says the computer wants to do this and that’s crazy and we shouldn’t do it. You don’t do it because you can’t simulate that, you can’t study the past and wonder whether the boss was gonna come in and change his mind about something. So you just stick with it, and it’s worked."

It struck me that this is how molecular modeling should be done as well. As I mentioned in a previous post, a major problem with modeling is that it's mostly applied in a slapdash manner to drug discovery problems, with heavy human intervention - often for the right reasons, because the algorithms don't work great - obscuring the true successes and failures of the models. But as Simons's quote indicates, the only way to truly improve the models is to take their results at face value, without any human intervention, and test them. At the very minimum, "simulating" historical human intervention is going to be pretty hard. So the only way we'll know what works and what doesn't is if we trust the models and let them rip. As I pointed out though, in most organizations experimenters are simply not incentivized, nor are there enough resources, to carry out this comprehensive testing.

Jim Simons and Renaissance can do it because (1) they have the wisdom to realize that this is the only way to get the models to work, and (2) they have pockets deep enough that even model failures can be tolerated. Most drug discovery organizations, especially smaller ones, presumably can't do (2). But they could still do it in a limited sense in a handful of projects. What's really necessary though is (1), and my concern is that we'll be waiting for that even if we have the resources to do (2).

How chemistry exemplifies the Fermi method

In my review of the new biography of Enrico Fermi I alluded to one of Fermi's most notable qualities - his uncanny ability to reach rapid conclusions to tough problems based on order of magnitude, back of the envelope calculations. This method of approximation has since come to be known as the Fermi method, and problems which can especially benefit from applying it are called Fermi problems.

It struck me that chemistry is an especially fertile ground for applying the Fermi method, and in fact many chemists probably use the technique in their daily work without explicitly realizing it. To understand why this is so, it's worth taking a look at some of the details of the method and the kinds of problems to which it can be fruitfully applied.

At the heart of the Fermi method is a way to make educated guesses about the different factors and quantities that could affect the answer to a problem. Whether you are looking at complicated problems in physics or chemistry, or for that matter in psychology or economics, much of complex problem solving involves examining the different factors that could influence the magnitude and nature of the solution. For instance, say you were calculating the trajectory of a bomb dropped from an airplane. In that case you would consider parameters like the velocity of the plane, the velocity of the bomb, air resistance, the weight of the bomb, the angle at which it was dropped and so on. If you were trying to gauge the impact of a certain economic proposal on the economy, you would consider the market and demographic to which the proposal was applied, the presence or absence of existing elements which could interact positively or negatively with the proposed policy, rates of inflation, potential changes in the prices of goods relevant to the policy and so on. The first part of the Fermi method simply involves writing down such factors and making sure you have a more or less comprehensive list.

The second part of the Fermi method consists of making educated guesses for each of these factors. The crucial point is that you don't need to estimate each factor to the fourth or fifth decimal place. In fact it was precisely this approach that made Fermi such a novelty in his time; because physicists could calculate quantities to four decimal places, they were often tempted to do so. Fermi showed that they didn't have to, and in some sense he weaned them away from this temptation. The fact of the matter is that you don't always need a high degree of accuracy to reach actionable, semi-qualitative conclusions; you just need to know some rough numbers and get the answer right to an order of magnitude. That was the key insight of Fermi's technique.

Now, before I proceed and discuss how these two aspects of the Fermi method may apply to chemistry, it's worth noting that there are of course several examples in which an order of magnitude answer is simply not good enough. A famous example concerns the very Manhattan Project of which Fermi was such a valued member. In the early phases of the project, when General Leslie Groves was picked as its head, he quizzed the scientists in Chicago about how much fissile material they would need. When they said that at that point all they could give him was an answer correct to an order of magnitude, he was indignant and pointed out that this would be tantamount to ordering a wedding cake without knowing whether it needed to feed one person, ten people or a hundred.

Notwithstanding such specific cases though, it's clear that there are in fact several examples of general problems which can benefit from Fermi's technique. Chemistry in fact is a poster child for both key aspects of the method illustrated above. Many problems in chemistry involve estimating the various kinds of forces - electrostatic, hydrophobic, hydrogen bonding, Van der Waals - influencing the interaction of one molecule with another. For instance when a drug molecule is interacting with a protein, all these factors play an important role. Sometimes they synergize with each other and sometimes they oppose each other. Using the Fermi method, you would first simply make sure you are listing all of them as comprehensively as possible. The goal is to come up with a total number resulting from all these contributions that would crucially provide you with the strength, or free energy, of the interaction between the drug and the protein, a quantity measured in kcal/mol.

This part is where the method is especially useful. When you are trying to come up with numbers for each of these forces, it's valuable simply to know some ranges; you don't need to know the answers to three decimal places. For instance, you know that hydrogen bonds can contribute 2-5 kcal/mol, electrostatic interactions usually add 1-2 kcal/mol, and all the hydrophobic interactions will add a few kcal/mol to the mix. There are some trickier estimates such as those for the entropy of interaction, but there are also approximations for these. Sum up these interactions and you can come up with a reasonable estimate for the free energy of binding. The job becomes easier when what you are interested in are differences and not absolute values. For instance you may be given a list of small molecules and asked to rank these in order of their free energies. In those cases you just have to look at differences: for instance, if one molecule is forming an extra hydrogen bond and the other isn't, you can say that the first one is better by about 2-3 kcal/mol. You can also use your knowledge of experimental measurements for calibrating your estimates, another trait which Fermi supremely exemplified.
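To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python. The per-interaction values are just midpoints of the rough ranges quoted above, and the per-rotatable-bond entropy penalty is an assumption added for illustration; this is a Fermi-style estimate, not a validated scoring function.

```python
# A Fermi-style estimate of binding free energy: list the interactions,
# assign each a rough value, and add them up. Numbers are illustrative only.

INTERACTION_ESTIMATES = {          # rough contribution per interaction, kcal/mol
    "hydrogen_bond": 2.5,          # text quotes roughly 2-5 kcal/mol; take a midpoint
    "electrostatic": 1.5,          # roughly 1-2 kcal/mol
    "hydrophobic_contact": 0.7,    # "a few kcal/mol" spread over several contacts
}
ENTROPY_PENALTY = 1.0              # assumed crude penalty per rotatable bond

def estimate_binding_energy(counts, rotatable_bonds=0):
    """Sum rough per-interaction contributions to get an order-of-magnitude
    binding free energy (more negative = tighter binding)."""
    favorable = sum(INTERACTION_ESTIMATES[kind] * n for kind, n in counts.items())
    return -(favorable - ENTROPY_PENALTY * rotatable_bonds)

# Ranking two hypothetical ligands by their *difference*, as described above:
lig_a = {"hydrogen_bond": 3, "electrostatic": 1, "hydrophobic_contact": 5}
lig_b = {"hydrogen_bond": 2, "electrostatic": 1, "hydrophobic_contact": 5}
delta = estimate_binding_energy(lig_a, 4) - estimate_binding_energy(lig_b, 4)
print(f"Ligand A estimated better by {-delta:.1f} kcal/mol (one extra H-bond)")
```

Run as written, the two hypothetical ligands differ by about 2.5 kcal/mol, exactly the kind of one-extra-hydrogen-bond difference described above.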

This then is the Fermi method of approximate guesses in action. One of the reasons it's far more prevalent in chemistry than in physics is that in chemistry it's usually not even possible to calculate numbers to very high accuracy. Therefore, unlike some physicists, chemists are not even tempted to try and have long resigned themselves (if you will) to making do with approximate solutions. Today the Fermi method is incorporated both in the minds of seasoned working chemists and in computer programs which try to automate the process. Both the seasoned chemist and the computer program first list all the interactions between molecules and then estimate the strength of each interaction based on rough numbers, adding up to a final value.

The method does not work all the time, since every interaction is only modeled approximately and some important real-life component may be missed. But it works well enough for chemists and computers to employ it in a variety of useful tasks, from narrowing down the set of drug molecules that have to be made to prioritizing molecules for new materials and energy applications. Enrico Fermi's ghost lives on in test tubes, computers, fume hoods and spectrometers, more than even his wide-ranging mind could have imagined.

What is chemical intuition?

Recently I read a comment by a leading chemist in which he said that in chemistry, intuition is much more important than in physics. This is a curious comment since intuition is one of those things which is hard to define but which most people who play the game appreciate when they see it. It is undoubtedly important in any scientific discipline and certainly so in physics; Einstein and Feynman for instance were regarded as the outstanding intuitionists of their age, men whose grasp of physical reality largely unaided by mathematical analysis was unmatched. Yet it seems to me that "chemical intuition" is a phrase which you hear much more than "physical intuition". When it comes to intuition, chemists seem to be more in the league of financial traders, geopolitical experts and psychologists than physicists.
 

Why is this the case? The simple reason is that in chemistry, unlike physics, armchair mathematical manipulation and theorizing can take you only so far. Most chemical systems are too complex for the kind of first-principles approaches that yield predictions in physics to an uncanny degree of accuracy; the same is true of biology. While armchair speculation and order-of-magnitude calculations can certainly be very valuable, no chemist can design a zeolite, predict the ultimate product of a complex polymer synthesis or list the biological properties that a potential drug can have by simply working through the math. As the great organic chemist R B Woodward once said of his decision to pursue chemistry rather than math, in chemistry, ideas have to answer to reality. Chemistry much more than physics is an experimental science built on a foundation of rigorous and empirical models, and as the statistician George Box once memorably quipped, all models are wrong, but some are useful. It is chemical intuition that can separate the good models from the bad ones.

How, then, to acquire chemical intuition? All chemists crave intuition; few have it. It's hard to define, but I think a good definition would be a quality that lets one skip a lot of the details and get to the essential result, often one that is counterintuitive. That definition reminds me of a recent book by the philosopher Daniel Dennett in which he describes an intellectual device called an "intuition pump". An "intuition pump" is essentially a shortcut - anything from a linguistic trick to a thought experiment - that allows one to skirt the usual process of rigorous and methodical analysis and get to the point. A lot of chemical thinking involves the fine art of manipulating intuition pumps. It is the art of asking the simple, decisive question that gets to the heart of the matter. As in a novel mathematical proof, a moment of chemical intuition carries an element of surprise. And as with a truly ingenious mathematical derivation, it should ideally lead us to smack our foreheads and ask why we could not think of something so simple before.

Ultimately when it comes to harnessing intuition, there can be no substitute for experience. Yet the masters of the art in the last fifty years have imparted valuable lessons on how to acquire it. Here are three that I have noticed, and I would think they would apply as much to other disciplines as to chemistry.

1. Don't ignore the obvious: One of the most striking features of chemistry as a science is that very palpable properties like color, smell, taste and elemental state are directly connected to molecular structure. For instance, there is an unforgettably direct connection between the smell of a simple molecule called cis-3-hexenol and that of freshly cut grass. Once you smell both separately it is virtually impossible to forget the connection. Chemists who are known for their intuition never lose sight of these simple molecular properties, and they use them as disarming filters that can cut through the complex calculations and the multimillion dollar chemical analysis.

Colors, smells and explosions are what often attract budding chemists to their trade at an early age, and these qualities are also precisely the ones which can be important elements of chemical intuition. I remember an anecdote about the Caltech chemist Harry Gray (an expert, among other things, on colored chemical compounds) who once deflated the predictions of some sophisticated quantum mechanical calculation by simply asking what the color of the proposed compound was; apparently there was no way the calculations could have been right if the compound had a particular color. As you immerse yourself in laborious compound characterization, computational modeling and statistical significance, don't forget what you can taste, touch, smell and see. As Pink Floyd said, this is all that your world will ever be.

2. Get a feel for energetics: The essence of chemistry can be boiled down to a fight: a fight to the death among countless factors that rally either for or against the amount of useful energy - technically called the free energy - that a system can provide. In one sense all of chemistry is one big multivariable optimization problem. When you are designing molecules as anticancer agents, for hydrogen storage or solar energy conversion or as enzyme mimics, ultimately what decides whether they will work or not is energetics: how well they can stabilize and be stabilized, and ultimately lower the free energy of the system. Intimate familiarity with numbers can help in these cases. Get a feel for the rough contributions made by hydrogen bonds, electrostatics, steric effects and solvent influences, essentially all the important interactions between molecules that dictate the fate of chemical systems. Often the key to improving the properties of molecules is to figure out what single interaction or combination of interactions is responsible for a particular property; you can then tweak that property by turning the knobs on the relevant factors.

Order of magnitude calculations and rough guesses are especially important for chemists working at the interface of chemistry and biology; remember, life is a game played within a 3 kcal/mol window, and any insight that allows you to nail down numbers within this window can only help. Linus Pauling was lying in bed with a cold when he managed to build accurate models of protein structure, largely based on an unmatched feel for such numbers that allowed him to make educated guesses about bond lengths and angles. And every chemist can learn from the incomparable intuition of Enrico Fermi, who dropped pieces of paper when the blast wave from the first atomic bomb reached him and used the distance they were blown to arrive at a crude estimate of the yield.
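To see what a feel for that 3 kcal/mol window buys you, here is a small worked example (my own illustration, not from the original post) using the standard relation between free energy and an equilibrium constant, delta-G = -RT ln K:

```python
import math

# Convert free energy differences (kcal/mol) into fold-changes in an
# equilibrium constant such as a binding affinity, via delta_G = -RT ln K.
RT = 0.001987 * 298.15   # gas constant (kcal/mol/K) times ~25 C, about 0.59 kcal/mol

def fold_change(delta_delta_g):
    """Fold-change in affinity corresponding to a free energy difference (kcal/mol)."""
    return math.exp(delta_delta_g / RT)

for ddg in (1.0, 1.4, 3.0):
    print(f"{ddg:.1f} kcal/mol -> ~{fold_change(ddg):.0f}-fold change in affinity")
# About 1.4 kcal/mol is roughly a factor of ten; a 3 kcal/mol window spans ~150-fold.
```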

A striking case of insight acquired through thinking about energetics is illustrated by a story that the Nobel Prize-winning chemist Roald Hoffmann narrates in an issue of the magazine "American Scientist". Hoffmann was theoretically investigating the conversion of graphene to graphane, which is the saturated counterpart of graphene (one in which all double bonds have been converted to single ones), under high pressure. After having done some high-level calculations, his student came into his office and communicated a very counterintuitive result: apparently graphane was more stable than the equivalent amount of benzene. This was surprising because every chemistry student learns that so-called aromatic compounds with alternating double bonds are more stable than their single-bond analogs because of the ubiquitous phenomenon of resonance. Hoffmann could not believe the result and his first reaction was to suspect that something must be wrong with the calculation.

Then, as he himself recalls, he leaned back in his chair, closed his eyes and brought half a century's store of chemical intuition to bear on the problem. Ultimately after all the book-keeping had been done, it turned out that the result was a simple consequence of energetics; the energy gained in the formation of strong carbon-carbon bonds more than offset that incurred due to the loss of aromaticity. The fact that it took a Nobel Laureate some time to work out the result is not in any way a criticism but a resounding validation of thinking in terms of simple energetics. Chemistry is full of surprises- even for Roald Hoffmann- and that's what makes it endlessly exciting.

3. Stay in touch with the basics, and learn from other fields: This is a lesson that is often repeated but seldom practiced. An old professor of mine used to recommend flipping open an elementary chemistry textbook every day to a random page and reading a few pages. Sometimes our research becomes so specialized and we become so enamored of our little corner of the chemical world that we forget the big picture. Part of the lesson here simply involves not missing the forest for the trees and always thinking of basic principles of structure and reactivity in the bigger sense.

This also involves keeping in touch with other fields of chemistry since an organic chemist never knows when a basic fact from his college inorganic chemistry textbook will come in handy. Most great chemists who were masters of chemical intuition could seamlessly transition their thoughts between different subfields of their science. This lesson is especially important in today's age when specialization has become so intense that it can sometimes lead to condescension toward fields other than your own. A corollary of learning from other fields is collaboration; what you don't have you can at least partially borrow. As Oppenheimer used to say about afternoon tea when he was director of the Institute for Advanced Study, "Tea is where we explain to each other what we don't understand". Chemists and scientists in general need to have tea more often.

Ultimately if we want to develop chemical intuition, it is worth remembering that all our favorite molecules, whether solar energy catalysts, cancer drugs or fertilizers, are all part of the same chemical universe, obeying the same rules even if in diverse contexts. Ultimately, no matter what kind of molecule we are interrogating, "Wir sind alle Chemiker" - we are all chemists, every single one of us.

This is a revised version of an older post. 

A bond by any other name...: How the simple definition of a hydrogen bond gives us a glimpse into the heart of chemistry

Basic hydrogen bonding between two water molecules,
with the central hydrogen shared between two oxygens
A few years ago, a committee organized by the International Union of Pure and Applied Chemistry (IUPAC) - the international body of chemists that defines chemical terms and guides the lexicon of the field - met to debate and finalize the precise definition of a hydrogen bond.

Defining a hydrogen bond in the year 2011? That was odd to say the least. Hydrogen bonds have been known for at least seventy years now. It was in the 1940s that Linus Pauling defined them as foundational elements of protein structure - the glue that holds molecules, including life-giving biological molecules like proteins and DNA, together. Water molecules form hydrogen bonds with each other, and this feature accounts for water's unique properties. Whether it's sculpting the shape and form of DNA, governing the properties of materials or orchestrating the delicate dance of biochemistry performed by enzymes, these interactions are essential. Simply put, without hydrogen bonds, life would be impossible. No wonder that they have been extensively studied in hundreds of thousands of molecular structures since Pauling highlighted them in the 1940s. Today no chemistry textbook would be considered legitimate without a section on hydrogen bonding. The concept of a hydrogen bond has become as familiar and important to a chemist as the concept of an electromagnetic wave or pressure is to a physicist.

What the devil, then, were chemists doing defining them in 2011?

It turns out that behind the story of defining hydrogen bonds lies a paradigm that takes us into the very guts of the nature and philosophy of chemistry. It leads us to ask what a chemical bond is in the first place; drill down deep enough into it, and it starts appearing like one of those existential questions a Zen monk would ask about life or the soul.

Remarkably enough, in spite of the debate and redefinitions, the basic features of a hydrogen bond are not controversial. A hydrogen bond is a bond that exists between a hydrogen and two other non-hydrogen atoms. Those two other atoms are most commonly oxygen and nitrogen, but that's where the consensus seems to end and the endless debate begins.

The basic reason for the debate has to do with the very definition of a bond as laid out by pioneering chemists like Gilbert Newton Lewis, Irving Langmuir and Linus Pauling in the 1920s. Loosely speaking a bond is a force of attraction between two atoms. It was Lewis who made the groundbreaking suggestion that a chemical bond results from the donation, acceptance or sharing of electrons. Pauling codified this definition into a rigorous principle using the laws of quantum mechanics. He showed that using quantum mechanics, you could prove that the sharing of electrons between two atoms leads to a net lowering of energy between them: this sounds logical since if there were no lowering of energy, there would be no incentive for two atoms to come close to each other.

So far so good. Everyone accepts this basic fact about chemical bonds. Treatises have been written about the various bonds between different atoms seen in a bewildering array of metallic, organic and inorganic molecules; Pauling's seminal book, "The Nature of the Chemical Bond", described such interactions in loving detail. In this book he showed that bonds can be covalent or ionic; the former involve a symmetric sharing of electrons while the latter involve an asymmetric give and take. There are also many intermediate cases. Furthermore, whether a bond is ionic or covalent depends on the electronegativity of the atoms involved. Electronegativity is the innate tendency of an atom to hold on tightly to its electrons. Fluorine for instance is the most electronegative element of all, holding on to its electrons in a fiery embrace. Metals on the other hand are happy to give away their electrons. Not surprisingly, bonds between metals and fluorine are readily formed.

It is in the context of electronegativity that it's easy to understand hydrogen bonds. A hydrogen atom, just like a metal, is happy to give away its lone electron and form a positive ion. Oxygen and nitrogen atoms are happy to accept this electron. Thus, a hydrogen bond forms when a hydrogen atom is shared between two oxygen atoms (or one oxygen and one nitrogen, or two nitrogens). Because of the difference in electronegativity, the hydrogen acquires a slightly positive partial charge, and the other atoms acquire a slightly negative partial charge (represented in the picture above). Think of two oxygen atoms with a hydrogen between them acting as a bridge: that's a hydrogen bond between them. Because a hydrogen atom is small and has only a single electron it can achieve the feat of forming this bridge; what is fascinating is that this is partly the result of a unique quantum mechanical phenomenon called the tunnel effect.

All this is rather uncontroversial. The debate starts when we ask what criteria we use to actually identify such hydrogen bonds. Chemists usually identify bonds by looking at experimental structures of molecules. The most common method of acquiring these structures is by x-ray diffraction. Bouncing x-rays off a crystallized molecule allows us to accomplish the molecular equivalent of taking a photograph. The molecules under question can be anything from common salt to complicated proteins. These structures form a cornerstone of chemistry, and they are not just of academic importance but have also been immensely useful in the design of new drugs and materials.

When chemists acquire the crystal structure of a molecule, they will usually do something very simple: they will measure the distances between various atoms. The hallmark of a chemical bond is a short distance between two atoms; think of it as two people embracing each other and reducing the distance between themselves. How short, exactly? Short enough to be less than the sum of the Van der Waals radii of the two atoms. Imagine the Van der Waals radius of an atom as a kind of safe perimeter which the atom uses to hold other atoms at bay. Each atom has a unique Van der Waals radius defining its safe perimeter, measured in angstroms (1 Å = 10^-10 meters). Hydrogen for instance has a radius of 1.2 Å, and nitrogen has a radius of 1.5 Å. If a chemist measures the distance between a hydrogen and a nitrogen in a crystal structure as being less than 2.7 Å (1.2 + 1.5), she would usually declare that there's a bond between the two atoms. If the three atoms in a hydrogen bond satisfy this geometric criterion, then one could legitimately claim a bond between them.
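As a minimal sketch of that procedure (the hydrogen and nitrogen radii are the approximate values quoted above, the oxygen value is my own addition, and real structural surveys also check angles and other criteria):

```python
import math

# Flag a putative hydrogen bond when the H...acceptor distance is shorter
# than the sum of the two Van der Waals radii (e.g. 1.2 + 1.5 = 2.7 A for H...N).

VDW_RADIUS = {"H": 1.2, "N": 1.5, "O": 1.52}   # angstroms; O value assumed for illustration

def looks_hydrogen_bonded(h_xyz, acceptor_xyz, acceptor_element):
    """Distance-only check between a hydrogen and a candidate acceptor atom."""
    cutoff = VDW_RADIUS["H"] + VDW_RADIUS[acceptor_element]
    return math.dist(h_xyz, acceptor_xyz) < cutoff

# Example: a hydrogen 2.1 A away from a nitrogen would be flagged as hydrogen bonded.
print(looks_hydrogen_bonded((0.0, 0.0, 0.0), (2.1, 0.0, 0.0), "N"))   # True
```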

But this simple procedure turns out to be far trickier to accept than we can imagine. Here's the problem: a crystallized molecule is a complex assembly of hundreds or even thousands of atoms, all of which are jostling for space. The eventual formation of the crystal is a game of compromises; some atoms have to give way so that others can be happy. As the pioneering chemist Jack Dunitz has succinctly noted, "Atoms have to go somewhere". Thus, some atoms are simply forced together against their will, just like people packed into a cramped train compartment. Now, just like those cramped people, two such atoms will end up unnaturally close, often at a distance shorter than the sum of their Van der Waals radii. The key term here is "unnaturally": the distance is short not because the atoms actually want to be close, but because they are forced to be so. Can one then say that there is a bond between them? And how short does this distance need to be in order to be called a bond?

There's another conundrum that emerged as thousands of crystal structures of molecules started to appear over the last few decades. While the "classical" hydrogen atoms bridging oxygens and nitrogens were well known, short distances also started appearing between hydrogen and atoms like carbon and chlorine. The purported hydrogen bond zoo started to get populated with ever more exotic creatures. For instance chemists started noticing close contacts between hydrogens and the flat faces of benzene rings: these hydrogens are attracted to the ghostly electron clouds lying on top of the rings. Sometimes you did not even need a benzene ring; two doubly bonded carbon atoms would act as magnets for hydrogens and pull them close. And these hydrogens in turn would not just be bonded to oxygens or nitrogens as their second partners; they could be bonded to carbons, or even to metallic elements. Experimental observation forced the concept of the hydrogen bond to be extended not just to other bona fide atoms but to more abstract entities like clouds of electrons.


An illustration of two kinds of hydrogen bonds:
conventional ones with oxygen atoms in blue
and those with a benzene ring in red
Two overriding questions thus emerged from this decades-long exercise of observation and analysis: Can one claim a bond between two atoms in a molecule - specifically a hydrogen bond in the case of hydrogen, oxygen or nitrogen - simply based on the distance between them? And what atoms exactly constitute a hydrogen bond?

The IUPAC committee in 2011 seems to have exorcised both demons with a single stroke. They defined a hydrogen bond as any kind of bond between a hydrogen and two other atoms, provided that the other two atoms are more electronegative than hydrogen itself. That always included oxygen and nitrogen, but now the definition was expanded to include carbon and halogens (chlorine, fluorine, bromine and iodine) as well as ghostly entities like electron clouds on top of benzene rings. As far as the distance criterion was concerned, IUPAC made a wise decision in allowing it to be fairly variable and not simply limited to a hard cutoff within the sum of Van der Waals radii.
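Here is a rough sketch of the broadened criterion, using standard Pauling electronegativities; the simple comparison is my own illustration, not the literal IUPAC text, which also weighs geometric, energetic and spectroscopic evidence:

```python
# Under the expanded notion, the acceptor-side partner only needs to be more
# electronegative than hydrogen itself. Pauling electronegativity values.

PAULING_EN = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44,
              "F": 3.98, "Cl": 3.16, "Br": 2.96, "I": 2.66}

def allowed_hydrogen_bond_partner(element):
    """True if the element is more electronegative than hydrogen and can
    therefore, under the broadened definition, take part in a hydrogen bond."""
    return PAULING_EN[element] > PAULING_EN["H"]

print([el for el in PAULING_EN if el != "H" and allowed_hydrogen_bond_partner(el)])
# -> ['C', 'N', 'O', 'F', 'Cl', 'Br', 'I']: carbon and the halogens now qualify too.
```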

The IUPAC decision illuminates three very important aspects of chemistry. Firstly, it tells us that many chemical concepts are fuzzy and only approximately defined. For instance, although we have defined the distance between the hydrogen and other two atoms in a hydrogen bond as being less than the sum of their Van der Waals radii, in reality these distances are best derived as a statistical distribution. Some hydrogen bonds are very short, some are very long, and most are in the middle: even this classification has been a hotbed of contention, and you will find arguments about the nature and energies of "short" and "weak" hydrogen bonds in the scientific literature that approach the vigor of gladiatorial contests. The ones that are short usually involve nitrogen and oxygen, while the long ones involve the elements included in the expansive IUPAC definition. This fuzzy reality extends to other concepts in chemistry such as aromaticity, polarizability and acid and base behavior. Unlike concepts in physics like wavelength and electrical conductivity, concepts in chemistry can be much more malleable. However this malleability is not a bug but a feature, since it allows you to capture a vast amount of chemical diversity in a central idea.

The second key aspect of chemistry that the IUPAC statement reveals is the fact that scientific concepts don't have to always be rigorously defined in order to be useful. The definition of a hydrogen bond goes to the heart of chemistry as a science based on models. A hydrogen bond is a bond, but it's really a model based on our ideas of the myriad ways in which molecules interact. The key feature of a model is utility, not reality. In my own work for instance, I can recognize and exploit hydrogen bonds between proteins and drugs and improve their directionality and strength, all without worrying one bit about their precise, rigorous definition. Similarly polymer chemists can create new forms of hydrogen-bonded plastics with novel functions without caring much whether their hydrogen bonds conform to some well-defined ideal. Whatever the debate about hydrogen bonds may be, my view of them is similar to Potter Stewart's: I know them when I see them. Or, if I wanted to offer a cheesy but still quotable line from one of "The Matrix" movies, a hydrogen bond is merely a phrase; what matters is the connection it implies.

Ultimately, this fuzzy yet useful view of hydrogen bonding makes an important statement about the philosophy of chemistry. It tells us that chemistry has its own philosophy based on models and utility that is independent of its roots in physics. As the chemist Roald Hoffmann memorably put it, chemical concepts like aromaticity and electronegativity start to "wilt at their edges" when examined too closely, and Hoffmann could have been talking about the hydrogen bond there. The physics-based view of a hydrogen bond would involve writing down the Schrodinger equation for these bonds and rigorously solving it to obtain quantities like the energy and geometry of the bond. But that exercise is too reductionist; it does not actually help me understand the nature and variety of these bonds. It won't tell me why water is liquid and why it flows from mountains, why it's a universal solvent and why it's an existential life-giver. Chemistry proclaims its own emergent language and philosophy, and while the material entities of which molecules are composed are grounded in physics, the molecules themselves belong squarely in the domain of chemistry.

The fact that chemists are still debating hydrogen bonds means that chemistry remains an exciting and vibrant discipline, one in which even basic concepts are still being sculpted and fine-tuned. It tells us that for the foreseeable future chemistry will continue to be a science without limits, a country with open borders. The hydrogen bond should give chemists an opportunity to celebrate the very soul of their work.

The Uncertainty Principle for climate (and chemical) models

A recent issue of Nature had an interesting article on what seems to be a wholly paradoxical feature of models used in climate science; as the models are becoming increasingly realistic, they are also becoming less accurate and predictive because of growing uncertainties. I can only imagine this to be an excruciatingly painful fact for climate modelers who seem to be facing the equivalent of the Heisenberg uncertainty principle for their field. It's an especially worrisome time to deal with such issues since the modelers need to include their predictions in the next IPCC report on climate change which is due to be published next year.

A closer look at the models reveals that this behavior is not as paradoxical as it sounds, although it's still not clear how you would get around it. The article especially struck a chord with me because I see similar problems bedeviling models used in chemical and biological research. In the case of climate change, the fact is that earlier models were crude and did not account for many fine-grained factors that are now being included (such as the rate at which ice falls through clouds). In principle and even in practice there's a bewildering number of such factors. Fortuitously, the crudeness of the models also prevented the uncertainties associated with these factors from being included in the modeling. The uncertainty remained hidden. Now that more real-world factors are being included, the uncertainties endemic in these factors reveal themselves and get tacked on to the models. You thus face an ironic tradeoff: as your models strive to mirror the real world better, they also become more uncertain. It's like swimming in quicksand; the harder you try to get out of it, the deeper you get sucked in.

This dilemma is not unheard of in the world of computational chemistry and biology. A lot of the models we currently use for predicting protein-drug interactions, for instance, are remarkably simple and yet accurate enough to be useful. Several reasons account for this unexpected accuracy, among them cancellation of errors (the Fermi principle), similarities of training sets to test sets and sometimes just plain luck. Error analysis is unfortunately not a priority in most of these studies, since the whole point is to publish correct results. Unless this culture changes, our road to accurate prediction will be painfully slow.

But here's an example of how "more can be worse". For the last few weeks I have been using a very simple model to try to predict the diffusion of druglike molecules through cell membranes. This is an important problem in drug development since even your most stellar test-tube candidate will be worthless until it makes its way into cells. Cell membranes are hydrophobic while the water surrounding them is hydrophilic. The ease with which a potential drug transfers from the surrounding water into the membrane depends, among other factors, on its solvation energy - on how readily the drug can shed water molecules; the smaller this desolvation penalty, the easier it is for the drug to get across. This simple model, which calculates the solvation energy, seems to do unusually well in predicting the diffusion of drugs across real cell membranes, a process that's much more complex than just solvation-desolvation.
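In spirit, such a model reduces to ranking compounds by a single descriptor. A minimal sketch, with hypothetical compound names and placeholder energies (the real model computes the solvation energies rather than looking them up):

```python
# Rank compounds by their aqueous solvation energies and treat a smaller
# desolvation penalty as a proxy for easier passive membrane permeation.

solvation_energy = {       # kcal/mol; more negative = better solvated in water
    "compound_A": -18.4,
    "compound_B": -9.7,
    "compound_C": -13.1,
}

def rank_by_predicted_permeability(energies):
    """Compounds that pay the smallest penalty for shedding water are
    predicted to cross the membrane most easily (ranked first)."""
    return sorted(energies, key=energies.get, reverse=True)

print(rank_by_predicted_permeability(solvation_energy))
# -> ['compound_B', 'compound_C', 'compound_A']
```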

One of the fundamental assumptions in the model is that the molecule exists in just one conformation in both water and the membrane. This assumption is clearly false since in reality, molecules are highly flexible creatures that interconvert between several conformations both in water and inside the membrane. To relax this assumption, a recent paper explicitly calculated the conformations of the molecule in water and included this factor in the diffusion predictions. This was certainly more realistic. To their surprise, the authors found that making the calculation more realistic made the predictions worse. While the exact mix of factors responsible for this failure can be complicated to tease apart, what's likely happening is that the more realistic factors also bring more noise and uncertainty with them. This uncertainty piles up, errors which were likely canceling before no longer cancel, and the whole prediction becomes fuzzier and less useful.
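The "more realistic" treatment amounts to something like a Boltzmann-weighted average over a conformational ensemble. A sketch with placeholder numbers, the point being that every extra input - relative conformer energies, per-conformer solvation terms - arrives with its own error bar:

```python
import math

# Boltzmann-average a property (here, solvation energy) over conformers.
# Conformer energies below are hypothetical placeholders.

RT = 0.001987 * 298.15   # kcal/mol at ~25 C

def boltzmann_average(conformers):
    """conformers: list of (relative_energy, solvation_energy) pairs in kcal/mol."""
    weights = [math.exp(-rel_e / RT) for rel_e, _ in conformers]
    weighted = sum(w * solv for w, (_, solv) in zip(weights, conformers))
    return weighted / sum(weights)

ensemble = [(0.0, -12.5), (0.8, -14.0), (1.5, -10.9)]   # placeholder values
print(f"Ensemble-averaged solvation energy: {boltzmann_average(ensemble):.1f} kcal/mol")
```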

I believe that this is what is partly happening in climate models. Including more real-life factors in the models does not mean that all those factors are well-understood. You are inevitably introducing some known unknowns. Ill-understood factors will introduce more uncertainty. Well-understood factors will introduce less uncertainty. Ultimately the accuracy of the models will depend on the interplay between these two kinds of factors, and currently it seems that the rate of inclusion of new factors is higher than the rate at which those factors can be accurately calculated.
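A toy illustration of that tradeoff (my own, not from the Nature article): if the uncertainties of independent factors add roughly in quadrature, every poorly constrained new term widens the total error bar even when its average contribution is small.

```python
import math

# Quadrature sum of independent 1-sigma uncertainties (arbitrary units).
# Real model errors can partly cancel, so this is only an illustration.

def combined_uncertainty(sigmas):
    return math.sqrt(sum(s * s for s in sigmas))

crude_model = [0.3, 0.4]                           # a few coarse but well-constrained terms
realistic_model = crude_model + [0.2, 0.7, 0.9]    # added fine-grained, noisier terms

print(f"crude:     +/- {combined_uncertainty(crude_model):.2f}")
print(f"realistic: +/- {combined_uncertainty(realistic_model):.2f}")
```

Here the "realistic" model has more physics in it but an error bar roughly two and a half times wider.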

The article goes on to note that in spite of this growing uncertainty the basic predictions of climate models are broadly consistent. However it also acknowledges the difficulty in explaining the growing uncertainty to a public which has become more skeptical of climate change since 2007 (when the last IPCC report was published). As a chemical modeler I can sympathize with the climate modelers. 

But the lesson to take away from this dilemma is that crude models sometimes work better than more realistic ones. Perhaps the climate modelers should remember George Box's quote that "all models are wrong, but some are useful". It is a worthy endeavor to try to make models more realistic, but it is even more important to make them useful.

The future of science: Will models usurp theories?

This year's Nobel Prize for physics was awarded to Saul Perlmutter, Brian Schmidt and Adam Riess for their discovery of an accelerating universe, a finding leading to the startling postulate that about 75% of our universe is made up of a hitherto unknown entity called dark energy. All three were considered favorite candidates for a long time, so this is not surprising at all. The prize also underscores the continuing importance of cosmology, since it was awarded in 2006 to George Smoot and John Mather, again for work confirming the Big Bang and the universe's expansion.

This is an important discovery which stands on the shoulders of august minds and an exciting history. It continues a grand narrative that starts with Henrietta Swan Leavitt (who established a standard reference for calculating astronomical distances), runs through Albert Einstein (whose despised cosmological constant was resurrected by these findings) and Edwin Hubble, continues through Georges Lemaitre and George Gamow (with their ideas about the Big Bang), and finally culminates in our current sophisticated understanding of the expanding universe. Anyone who wants to know more about the personalities and developments leading to today's event should read Richard Panek's excellent book "The 4 Percent Universe".

But what is equally interesting is the ignorance that the prizewinning discovery reveals. The prize was really awarded for the observation of an accelerating universe, not the explanation. Nobody really knows why the universe is accelerating. The current explanation for the acceleration consists of a set of different models, none of which has been definitively shown to explain the facts well enough. And this makes me wonder whether such a proliferation of models without accompanying concrete theories is going to characterize science in the future.

The twentieth century saw theoretical advances in physics that agreed with experiment to an astonishing degree of accuracy. The culmination of achievement in modern physics was surely quantum electrodynamics (QED) which is supposed to be the most accurate theory of physics we have. Since then we have had some successes in quantitatively correlating theory to experiment, most notably in the work on validating the Big Bang and the development of the standard model of particle physics. But dark energy- there's no theory for it that remotely approaches the rigor of QED when it comes to comparison with experiment.

Of course it's unfair to criticize dark energy since we are just getting started on tackling its mysteries. Maybe someday a comprehensive theory will be found, but given the complexity of what we are trying to achieve (essentially explain the nature of all the matter and energy in the universe) it seems likely that we may always be stuck with models, not actual theories. And this may be the case not just with cosmology but with other sciences. The fact is that the kinds of phenomena that science has been dealing with recently have been multifactorial, complex and emergent. The kind of mechanical, reductionist approaches that worked so well for atomic physics and molecular biology may turn out to be too impoverished for taking apart these phenomena. Take biology for instance. Do you think we could have a complete "theory" for the human brain that can quantitatively calculate all brain states leading to consciousness and our reaction to the external world? How about trying to build a "theory" for signal transduction that would allow us to not just predict but truly understand (in a holistic way) all the interactions with drugs and biomolecules that living organisms undergo? And then there's other complex phenomena like the economy, the weather and social networks. It seems wise to say that we don't anticipate real overarching theories for these phenomena anytime soon.

On the other hand, I think it's a sign of things to come that most of these fields are rife with explanatory models of varying accuracy and validity. Most importantly, modeling and simulation are starting to be considered a respectable "third leg" of science, in addition to theory and experiment. One simple reason for this is the recognition that many of science's greatest current challenges may not be amenable to quantitative theorizing, and we may have to treat models of phenomena as independent, authoritative explanatory entities in their own right. We are already seeing this happen in chemistry, biology, climate science and social science, and I have been told that even cosmologists are now extensively relying on computational models of the universe. Admittedly these models are still far behind theory and experiment, which have had head starts of about a thousand years. But there can be little doubt that such models can only become more accurate with increasing computational firepower. How accurate remains to be seen, but it's worth noting that there are already books that make a case for an independent, study-worthy philosophy of modeling and simulation. These books exhort philosophers of science to treat models not just as convenient applications and representations of theories (which would then be the only fundamental things worth studying) but as independent explanatory devices in themselves that deserve separate philosophical consideration.

Could this then be at least part of the future of science? A future where robust experimental observations are encompassed not by beautifully rigorous and complete theories like general relativity or QED but only by different models which are patched together through a combination of rigor, empirical data, fudge factors and plain old intuition? This would be a new kind of science, as useful in its applications as its old counterpart but rooting itself only in models and not in complete theories. Given the history of theoretical science, such a future may seem dark and depressing. That is because as the statistician George Box famously quipped, although some models are useful, all models are wrong. What Box meant was that models often feature unrealistic assumptions about all kinds of details that nonetheless allow us to reproduce the essential features of reality. Thus they can never provide the sure connection to "reality" that theories seem to. This is especially a problem when disparate models give the same answer to a question. In the absence of discriminating ideas, which model is then the "correct" one? The usual answer is "none of them", since they all do an equally good job of explaining the facts. But this view of science, where models that can be judged only on the basis of their utility are the ultimate arbiters of reality and where there is thus no sense of a unified theoretical framework, feels deeply unsettling. In this universe the "real" theory will always remain hidden behind a facade of models, much as reality is always hidden behind the event horizon of a black hole. Such a universe can hardly warm the cockles of the heart of those who are used to crafting grand narratives for life and the universe. However it may be the price we pay for more comprehensive understanding. In the future, Nobel Prizes may be frequently awarded for important observations for which there are no real theories, only models. The discovery of dark matter and energy and our current attempts to understand the brain and signal transduction could well be the harbingers of this new kind of science.

Should we worry about such a world rife with models and devoid of theories? Not necessarily. If there's one thing we know about science, it's that it evolves. Grand explanatory theories have traditionally been supposed to be a key part - probably the key part - of the scientific enterprise. But this is mostly because of historical precedent as well as a psychological urge for seeking elegance and unification. Such belief has been resoundingly validated in the past, but its utility may well have plateaued. I am not advocating some "end of science" scenario here - far from it - but as the recent history of string theory and theoretical physics in general demonstrates, even the most mathematically elegant and psychologically pleasing theories may have scant connection to reality. Because of the sheer scale and complexity of what we are currently trying to explain, we may have hit a roadblock in the application of the largely reductionist traditional scientific thinking which has served us so well for half a millennium.

Ultimately what matters though is whether our constructs - theories, models, rules of thumb or heuristic pattern recognition - are up to the task of building consistent explanations of complex phenomena. The business of science is explanation; whether it comes through unified narratives or piecemeal models is secondary. Although the former sounds more psychologically satisfying, science does not really care about stoking our egos. What is out there exists, and we do whatever's necessary and sufficient to unravel it.

In praise of contradiction

Scientists usually don't like contradictions. A contradiction in experimental results is like a canary in a coal mine. It sets off alarm bells and compels the experimentalist to double-check his or her setup. A contradiction in theoretical results can be equally bad if not worse. It could mean you made a simple arithmetical mistake. Contradiction could force you to go back to the drawing board and start afresh. Science is not the only human activity where contradictions are feared and disparaged. A politician or businessman who contradicts himself is not considered trustworthy. A consumer product which garners contradictory reviews raises suspicions about its true value. Contradictory trends in the stock market can put investors in a real bind.

Yet contradiction and paradoxes have a hallowed place in intellectual history. First of all, contradiction is highly instructive simply because it forces us to think further and deeper. It reveals a discrepancy in our understanding of the world which needs to be resolved and encourages scientists to perform additional experiments and decisive calculations to settle the matter. It is only when scientists observe contradictory results that the real fun of discovery begins. It’s the interesting paradoxes and the divergent conclusions that often point to a tantalizing reality which is begging to be teased apart by further investigation.

Let's consider that purest realm of human thought, mathematics. In mathematics, the concept of proof by contradiction or reductio ad absurdum has been highly treasured for millennia. It has provided some of the most important and beautiful proofs in the field, like the irrationality of the square root of two. In his marvelous book "A Mathematician's Apology", the great mathematician G H Hardy paid the ultimate tribute to this potent weapon:
"Reductio ad absurdum, which Euclid loved so much, is one of a mathematician's finest weapons. It is a far finer gambit than any chess gambit: a chess player may offer the sacrifice of a pawn or even a piece, but a mathematician offers the game."
However, the power of contradiction goes far beyond opening a window into abstract realms of thought. Twentieth-century physics demonstrated that contradiction and paradoxes constitute the centerpiece of reality itself. At the turn of the century, it was a discrepancy in results from blackbody radiation that sparked one of the greatest revolutions in intellectual history in the form of the quantum theory. Paradoxes such as the twin paradox are at the heart of the theory of relativity. But it was in the hands of Niels Bohr that contradiction was transformed into a subtler and lasting facet of reality which Bohr named 'complementarity'. Complementarity entailed the presence of seemingly opposite concepts whose co-existence was nonetheless critical for an understanding of reality. It was immortalized in one of the most enduring and bizarre paradoxes of all, wave-particle duality. Wave-particle duality taught us that contradiction is not only an important aspect of reality but an indispensable one. Photons of light and electrons behave as both waves and particles. The two qualities seem to be maddeningly at odds with each other. Yet both are absolutely essential to grasp the essence of physical reality. Bohr codified this deep understanding of nature with a characteristically pithy statement: "The opposite of a big truth is also a big truth". Erwin Schrödinger, despite his own disdain for complementarity, highlighted an even more bizarre quantum phenomenon - entanglement - wherein particles that are completely separated from each other are nonetheless intimately connected; by doing this Schrödinger brought us the enduring image of a cat helplessly trapped in limbo between a state of life and death.

The creative tension generated by seemingly contradictory phenomena and results has been fruitful in other disciplines as well. Darwin was troubled by the instances of altruism he observed in the wild; these seemed to contradict the 'struggle for existence' which he was describing. It took the twentieth century and theories of kin selection and reciprocal altruism to fit these seemingly paradoxical observations into the framework of modern evolutionary theory. The history of organic chemistry is studded with efforts to determine the molecular structures of complex natural products like penicillin and chlorophyll. In many of these cases, contradictory proposed structures, like those for penicillin, spurred intense efforts to discover the true structure. Clearly, contradiction is not only a vital feature of science but also a constant and valuable companion of the process of scientific discovery.

These glittering instances of essential contradiction in science would seem perfectly at home with the human experience. While contradiction in science can be both disturbing and ultimately rewarding, many religions and philosophies have long savored this feature of the world. The Chinese philosophy of Yin and Yang recognizes the role of opposing and contrary forces in sustaining human life. In India, the festival celebrating the beginning of the Hindu new year includes a ritual where every member of the family consumes a little piece of sweet jaggery (solidified sugarcane juice) wrapped in a bitter leaf of the Neem tree (which contains the insecticide azadirachtin). The sweet and the bitter are supposed to exemplify the essential combination of happy and sad moments that are necessary for a complete life. Similar paradoxes are recognized in Western theology, for instance in the doctrines of the Trinity and the Incarnation.

The ultimate validation of contradiction however is not through its role in life or in scientific truth but through its role as an inseparable part of our very psyche. We all feel disturbed by contradiction, yet how many of us can claim to hold perfectly consistent, mutually compatible beliefs about every aspect of our life? You may love your son, yet his egregious behavior may lead you to sometimes (hopefully not often) wish he had not been born. We often speak of 'love-hate' relationships which exemplify opposing feelings toward a loved one. If we minutely observed our behavior at every moment, such observation would undoubtedly reveal numerous instances of contradictory thoughts and behavior. This discrepancy is not only an indelible part of our consciousness; we all realize that it actually enriches our life, makes it more complex, more unpredictable. It is what makes us human.

Why would contradictory thinking be an important part of our psyche? I am no neuroscientist, but I believe that our puzzlement about contradiction would be mitigated if we realized that we human beings perceive reality by building models of the world. It has always been debatable whether the reality we perceive is what is truly 'out there' (and this question may never be answered); what is now certain is that neural events in our brains enable us to build sensory models of the world. Some of the elements in the model are more fundamental and fixed while others are flexible and constantly updated. The world that we perceive is what is revealed to us through this kind of interactive modeling. These models are undoubtedly some of the most complex ever generated, and anyone who has built models of complex phenomena would recognize how difficult it is to achieve a perfectly logically consistent model. Model building also typically involves errors, of which some may accumulate and others may cancel. In addition, models can always be flawed because they don't include all the relevant elements of reality. All these limitations lead to models in which a few facts can appear contradictory, but trying to make these facts consistent with each other could well lead to even worse and unacceptable problems with other parts of the model. Simply put, we compromise and end up living with a model that has a few contradictions rather than one that has too many. Further research in neuroscience will undoubtedly shed light on the details of the model building done by the brain, but what seems unsurprising is that these models contain some contradictory worldviews which nonetheless preserve their overall utility.

Yet there are those who would seek to condemn such contradictory thinking as an anomaly. In my opinion, one of the most prominent examples of such a viewpoint in the last few years has been the criticism of religious-minded scientists by several so-called 'New Atheists' like Richard Dawkins and Sam Harris. The New Atheists have made it their mission to banish what they see as artificial barriers created between science and religion for the sake of political correctness, practical expediency and plain fear of offending the other party. There is actually much truth to this viewpoint, but the New Atheists seem to take it beyond its strictly utilitarian value.

A case in point is Francis Collins, the current director of the NIH. Collins is famous as a first-rate scientist who is also a devout evangelical Christian. The problem with Collins is not that he is deeply religious but that he tends to blur the line between science and religion. A particularly disturbing instance is a now widely discussed set of slides from a presentation in which he tries to somehow scientifically justify the existence and value of the Christian God. Collins's conversion to deep religious faith when he apparently saw the Trinity juxtaposed on a beautiful frozen waterfall during a hike is also strange, and at the very least displays a poor chain of causation and inadequate critical thinking.

But all this does not make Collins any less of an able administrator. He does not need to mix science with religion to justify his abilities as a science manager. To my knowledge there is not a single instance of his religious beliefs dictating his preference for NIH funding or policy. In practice if not in principle, Collins manages to admirably separate science from storytelling. But the New Atheists are still not satisfied. They rope in Collins among a number of prominent scientists who they think are 'schizophrenic' in conducting scientific experiments during the week and then suspending critical thinking on Sundays when they pray in church. They express incredulity that someone as intelligent as Francis Collins can so neatly compartmentalize his rational and 'irrational' brain and somehow sustain two completely opposite - contradictory - modes of thought.

For a long time I actually agreed with this viewpoint. Yet as we have seen before, such seemingly contradictory thinking seems to be a mainstay of the human psyche and human experience. There are hundreds of scientists like Collins who largely manage to separate their scientific and religious beliefs. Thinking about it a bit more, I realized that the New Atheists' insistence on banishing such mutually exclusive streams of thinking seems to go against a hallowed principle that they themselves have emphasized to no end- a recognition of reality as it is. If the New Atheists, and indeed all of us, hold reality to be sacrosanct, then we need to realize that contradictory thinking and behavior are essential elements of this reality. As the history of science demonstrates, appreciating contradiction can even be essential in deciphering the workings of the physical world.

Now this certainly does not mean that we should actively encourage contradiction in our thinking. We also recognize the role of tragedy in the human experience, but few of us would strive to deliberately make our lives tragic. Contradictory thinking should be recognized, highlighted and swiftly dealt with, whether in science or life. But its value in shaping our experience should also be duly appreciated. Paradox seems to be a building block in the fabric of the world, whether in the mind of Francis Collins or in the nature of the universe. We should in fact celebrate the remarkable fact that the human mind can subsume opposing thoughts within its function and still operate within the realm of reason. Simply denying this and proclaiming that it should not be so would mean denying the very thing we are striving for- a deeper and more honest understanding of reality.

Life (and chemistry) is a box of models

One of the most important challenges in teaching students chemistry is conveying the fact that chemistry is essentially a milieu of models. Too often students can misinterpret the conceptual devices taught to them as "real" entities. While models seem to perpetually and cruelly banish the concept of "reality" itself to fanciful speculation at best, the real beauty of chemistry is in how the simplest of models can explain a vast range of diverse chemical phenomena. Students' understanding can only be enriched by communicating to them the value of models as a window into our world. How can we achieve this?

We can start by stating that fact explicitly. Very few of my chemistry teachers even mentioned the word "model" in their discourse, let alone emphasized the preponderance of models used in chemistry. One can claim that all of chemistry is in fact a model. The reason for this is not hard to grasp: models come to our aid when the world gets too complex. The complexity of chemical systems, which can rarely be described from first principles, makes this 'central science' especially amenable to modeling.

You can start with the simplest fact taught in freshman chemistry class- the structure of methane as it is drawn on paper. The methane molecule of course exists in real life, but that does not mean that you can actually see four bonds growing out tetrahedrally from a central carbon. Recent advancements in techniques like scanning tunneling microscopy have brought an astoundingly real feel to molecules, but what you see is still diffuse electron density and not actual bonds. The tetrahedral representation of methane that we draw on paper is very much a model.
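In fact the tetrahedral picture can be written down as a purely geometric model in a few lines. Here is a minimal sketch (in Python, purely for illustration) that places four unit "C-H vectors" at the corners of a tetrahedron and recovers the familiar 109.5-degree angle; the "bonds" are nothing more than lines we choose to draw between coordinates.

    # A minimal geometric "model" of methane: a carbon at the origin and four
    # hydrogens at the corners of a regular tetrahedron. The ~109.47 degree
    # bond angle is a property of the model, not something we ever "see".
    import numpy as np

    carbon = np.zeros(3)
    hydrogens = np.array([
        [ 1,  1,  1],
        [ 1, -1, -1],
        [-1,  1, -1],
        [-1, -1,  1],
    ], dtype=float)
    hydrogens /= np.linalg.norm(hydrogens, axis=1, keepdims=True)  # unit C-H vectors

    # Angle between any two C-H "bonds"
    cos_theta = np.dot(hydrogens[0], hydrogens[1])
    angle = np.degrees(np.arccos(cos_theta))
    print(f"H-C-H angle in the tetrahedral model: {angle:.2f} degrees")  # ~109.47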

Once students realize that even their simple representations of molecules are models, the road ahead becomes easier. Since we are talking about methane, we will inevitably talk about hybridization and describe how the carbon is sp3 hybridized. But of course hybridization is merely a mathematical and conceptual device- and a very powerful one at that- and this needs to become clear. Hybridization in methane leads to discussion about hybridization in other molecules. This is usually followed by one of the most conceptually simple and useful models in chemistry where you can make back-of-the-envelope calculations to get real and useful results- VSEPR. VSEPR is a great example of a simple model that works in a great number of cases; asking whether it is "real" is futile. Thus, one can drive home the importance of modeling even in the first few sessions of chemistry 101.
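To show just how little machinery VSEPR needs, here is a toy sketch of my own (the tiny lookup table is my shorthand, not any standard library): count the electron domains around the central atom and read off the idealized arrangement.

    # VSEPR as a lookup table: count electron domains (bonding pairs + lone pairs)
    # around the central atom and read off the idealized arrangement. A crude
    # model, but it gets a surprising amount of freshman chemistry right.
    ELECTRON_DOMAIN_GEOMETRY = {
        2: "linear (180 deg)",
        3: "trigonal planar (120 deg)",
        4: "tetrahedral (109.5 deg)",
        5: "trigonal bipyramidal (90/120 deg)",
        6: "octahedral (90 deg)",
    }

    def vsepr_geometry(bonding_pairs: int, lone_pairs: int) -> str:
        domains = bonding_pairs + lone_pairs
        return ELECTRON_DOMAIN_GEOMETRY.get(domains, "beyond this toy table")

    print(vsepr_geometry(4, 0))  # methane: tetrahedral
    print(vsepr_geometry(3, 1))  # ammonia: tetrahedral domains (pyramidal shape)
    print(vsepr_geometry(2, 2))  # water: tetrahedral domains (bent shape)

The model says nothing about why electron domains arrange themselves this way; it simply works often enough to be worth teaching.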

Once these facts become clear, the floodgates can open. Students can cease to think of the world as real and still be happy. Think the famous "12-6" Lennard-Jones curve used to model van der Waals interactions is real? Think again. It does a marvelous job of representing in simple terms an incredibly complex and delicate tension between attraction and repulsion engendered by point charges, dipoles and higher order terms, and it's no more than that. But it works! It's a disarmingly simple model that's incorporated in most popular molecular modeling programs. How about crystal field theory? Another fantastic model that does a great job of explaining the properties of transition metal complexes without being real. Of course, let's not even get started on that ubiquitous act that initiates a newbie into the world of organic chemistry- arrow pushing. That's the very epitome of modeling for you. And after this onslaught, students should have little trouble understanding that those ephemeral, seductive twin forms of benzene that seem to interconvert into each other on paper are pure fiction.
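To see how disarmingly simple the "12-6" model really is, here is a minimal sketch; the argon-like parameters are rough textbook values used only for illustration.

    # The Lennard-Jones "12-6" potential: a two-parameter caricature of the
    # delicate balance between short-range repulsion and dispersion attraction.
    # Parameters are approximate textbook values for argon, for illustration only.
    import numpy as np

    EPSILON = 0.238   # well depth, kcal/mol (approximate, argon)
    SIGMA = 3.4       # distance where V = 0, angstroms (approximate, argon)

    def lennard_jones(r):
        """V(r) = 4*eps*[(sigma/r)**12 - (sigma/r)**6]"""
        sr6 = (SIGMA / r) ** 6
        return 4.0 * EPSILON * (sr6**2 - sr6)

    r = np.linspace(3.0, 8.0, 501)
    v = lennard_jones(r)
    r_min = r[np.argmin(v)]
    print(f"Minimum near r = {r_min:.2f} A, V = {v.min():.3f} kcal/mol")
    # Analytically the minimum sits at 2**(1/6)*sigma ~ 3.82 A with depth -epsilon.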

Want a book that teaches chemistry through models? You are in luck. One of the best books conveying the reality of chemistry as model building also turns out to be one of the most influential scientific books of the 20th century- Linus Pauling's "The Nature of the Chemical Bond". In this book Pauling introduces dozens of ideas like polarization, hybridization, the ionic and covalent character of bonds, resonance and hydrogen bonding, many of them enshrined in his valence bond theory. And all of them are models. If conveying the importance of models to students gives us an opportunity to introduce them to this classic text, the effort will already have been worthwhile.

So would students turn fatalistic and despondent once they have been convinced that the world is not real but is a model? Not at all. The singular fact that snatches hope from the jaws of defeat is that we can build such models at all and use them to understand the world. Think about it: we build models that are laughably simplistic representations of a hideously complex reality that will probably remain out of our reach forever. And yet these apparent embarrassments help us understand protein folding, design new drugs against cancer, build solar cells, bake a cake and capture the smell of a rose in a bottle.

What more could we want?

Models, laws and the limits of reductionism

I am currently reading Stuart Kauffman's "Reinventing the Sacred" and it's turning out to be one of the most thought-provoking books I have read in a long time, full of mind-bending ideas. Kauffman, who was originally trained as a physician, was for many years a member of the famous Santa Fe Institute, a bastion of interdisciplinary complexity research.

Kauffman is a kind of polymath who draws upon physics, chemistry, biology, computer science and economics to essentially argue the limitations of reductionism and the existence of emergent phenomena. He makes some fascinating arguments, for instance about biology not being reducible to specific physics. One of the main reasons this cannot be done is that the evolution of complex biological systems is contingent and can follow any number of virtually infinite courses depending on slightly different conditions; according to Kauffman, this infinity is not just a 'countable infinity' but an 'uncountable' one (more on this mind-boggling distinction later). Biological systems are also highly non-linear and full of feedback and 'surprises', and these qualities make their prediction difficult not just in practice but even in principle.

I am sure I will have much more to say about Kauffman's book later, but for now I want to focus on his argument against reductionism based on what is called the 'multiple platform' framework. Kauffman's basic thesis draws on an argument made by the Nobel laureate Philip Anderson. Anderson wrote a groundbreaking article in Science in 1972 ("More is Different") on the limits of reductionism. To illustrate the multiple platform principle, he talked about computers processing 1s and 0s and manipulating them to give a myriad of results. The question is: is the processing of 1s and 0s in a computer uniquely dependent upon the specific physics involved (which in this case would be quantum mechanics)? The answer may seem obvious, but Anderson says that it's hard to make this argument, since one can also get the same results from manipulating buckets that are either empty (0s) or filled with water (1s). Thus, the binary operations of a computer cannot be reduced to specific physics since they can be realized on 'multiple platforms'.
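As a toy version of the multiple platform idea (my own sketch, not Anderson's or Kauffman's example verbatim), the same one-bit half adder can be realized on Python booleans or on "buckets" that are either full or empty; the logic is indifferent to the substrate.

    # The same half-adder realized on two different "platforms": abstract booleans
    # and buckets of water. The logic does not care what implements the 1s and 0s.

    def half_adder_bits(a: bool, b: bool):
        return a ^ b, a & b          # (sum, carry)

    FULL, EMPTY = "full", "empty"

    def half_adder_buckets(a: str, b: str):
        sum_bucket = FULL if (a == FULL) != (b == FULL) else EMPTY
        carry_bucket = FULL if (a == FULL) and (b == FULL) else EMPTY
        return sum_bucket, carry_bucket

    for x, y in [(False, False), (False, True), (True, False), (True, True)]:
        bits = half_adder_bits(x, y)
        buckets = half_adder_buckets(FULL if x else EMPTY, FULL if y else EMPTY)
        print(x, y, "->", bits, "|", buckets)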

Another example that Kauffman cites concerns the Navier-Stokes equations, the basic equations of fluid dynamics. The equations are classical and can be derived from Newton's laws. One would think that they would be ultimately reducible to the movements of individual particles of fluid and thus to quantum mechanics. Yet as of today, nobody has found a way to derive the Navier-Stokes equations from those of quantum mechanics. However, the physicist Leo Kadanoff has actually 'derived' these equations by using a rather simple 'toy world' of beads on a lattice. The movement of fluids, and therefore the equations, can be modeled by moving the beads around. Thus, we again have an example of multiple platforms leading to the same phenomenon, precluding the unique dependence of the phenomenon on a particular set of laws.

All this is extremely interesting, but I am not sure I follow Kauffman here. The toy world or the bucket brigades that Kadanoff and Anderson talk about are models. Models are very different from physical laws. Sure, there can be multiple models (or platforms) for deriving a given set of phenomena, but the existence of multiple models does not preclude dependence on a unique set of laws. A close analogy which I often think of comes from molecular mechanics. A molecular mechanics model assumes a molecule to be a classical set of balls and springs, with the electrons neglected. By any definition this is a ludicrously simple model that completely ignores quantum effects (or at best accounts for them implicitly by taking its parameters from experiment). Yet, with the right parametrization, it works well enough to be useful. There could conceivably be many other models which would give the same results. Yet nobody would argue that the behavior of molecules modeled with molecular mechanics is not reducible to quantum mechanics.
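To make "balls and springs" concrete, here is a hedged sketch of the simplest term in any force field, the harmonic bond stretch; the force constant and equilibrium length are rough, illustrative values in the neighborhood of a C-H bond, not those of any particular force field.

    # Bond stretching in molecular mechanics: a classical Hooke's-law spring,
    # E = k * (r - r0)**2, with parameters fitted rather than derived.
    # Illustrative numbers only, loosely in the range used for a C-H bond.
    K_STRETCH = 340.0   # kcal/(mol A^2), assumed force constant
    R0 = 1.09           # equilibrium bond length in angstroms

    def bond_stretch_energy(r: float) -> float:
        return K_STRETCH * (r - R0) ** 2

    for r in (1.05, 1.09, 1.15):
        print(f"r = {r:.2f} A -> strain energy = {bond_stretch_energy(r):.2f} kcal/mol")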

Kauffman’s argument that the explanatory arrows don’t always point downwards because one cannot always extrapolate upwards from lower-level phenomena is very well-taken. Emergent properties are surely real. But at least in the specific cases he considers, I am not sure that one can make an argument about phenomena not being reducible to specific physics simply because they can be derived from multiple platforms. The multiple platforms are models. The specific physics constitutes a set of laws, which is quite different.

More model perils; parametrize this

Now here's a very interesting review article that puts some of the pitfalls of models that I have mentioned on these pages in perspective. The article is by Jack Dunitz and his long-time colleague Angelo Gavezzotti. Dunitz is in my opinion one of the finest chemists and technical writers of the last half century and I have learnt a lot from his articles. Two that are on my "top 10" list are his article showing the entropic gain accrued by displacing water molecules in crystals and proteins (a maximum of 2 kcal/mol for strongly bound water) and his paper demonstrating that organic fluorine rarely, if ever, forms hydrogen bonds.
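That 2 kcal/mol figure is itself a lovely back-of-the-envelope: if the entropy gained on releasing a tightly bound water to the liquid is at most roughly 7 cal/(mol K) (the assumption in this sketch), then at room temperature the free-energy payoff tops out around T times delta-S, or about 2 kcal/mol.

    # Back-of-the-envelope for the ~2 kcal/mol ceiling: take an assumed upper
    # bound of about 7 cal/(mol K) for the entropy gained on releasing a tightly
    # bound water to the liquid, and multiply by room temperature.
    DELTA_S = 7.0      # cal/(mol K), assumed upper bound
    T = 300.0          # K

    free_energy_gain = T * DELTA_S / 1000.0   # kcal/mol
    print(f"T*dS ~ {free_energy_gain:.1f} kcal/mol")   # ~2 kcal/mol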

In any case, in this article he talks about an area in which he is the world's acknowledged expert: organic crystal structures. Understanding and predicting (the horror!) crystal structures essentially boils down to understanding the forces that make molecules stick to each other. Dunitz and Gavezzotti describe theoretical and historical attempts to model forces between molecules, and many of their statements about the inherent limitations of modeling these forces rang as loudly in my mind as the bell in Sainte-Mère-Église during the Battle of Normandy.

Dunitz has a lot to say about atom-atom potentials, which are the most popular framework for modeling inter- and intramolecular interactions. Basically, such potentials assume simple functional forms that model the attractive and repulsive interactions between atoms, which are treated as rigid balls. This is also, of course, the fundamental approximation in molecular mechanics and force fields. The interactions are basically Coulombic interactions (relatively simple to model) and more complicated dispersion interactions which are essentially quantum mechanical in nature. The real and continuing challenge is to model these weak dispersive interactions.
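Schematically, the atom-atom potentials in the crystal-packing literature often take a Buckingham-like "exp-6" form plus a Coulomb term between point charges. The sketch below shows that functional form with entirely made-up coefficients, which is rather the point: the coefficients come from parameterization, not first principles.

    # A generic atom-atom pair potential of the kind used for organic crystals:
    # exponential repulsion + r^-6 dispersion + Coulomb term between point charges.
    # All coefficients here are placeholders; in practice they are fitted.
    import math

    COULOMB_CONST = 332.06  # kcal*A/(mol*e^2), converts q1*q2/r to kcal/mol

    def atom_atom_energy(r, A=40000.0, B=3.6, C=500.0, q1=0.1, q2=-0.1):
        repulsion = A * math.exp(-B * r)               # short-range exchange repulsion
        dispersion = -C / r**6                         # attractive dispersion
        electrostatics = COULOMB_CONST * q1 * q2 / r   # point-charge Coulomb term
        return repulsion + dispersion + electrostatics

    for r in (3.0, 3.5, 4.0, 5.0):
        print(f"r = {r:.1f} A -> E = {atom_atom_energy(r):.2f} kcal/mol")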

But the problem is fuzzy. As Dunitz says, atom-atom potentials are popular mainly because they are simple in form and easy to calculate. However, they have scant, if any, connection to "reality". This point cannot be stressed enough. As this blog has noted several times before, we use models because they work, not because they are real. The coefficients in the functional forms of the atom-atom potentials are essentially adjusted so that the calculated energies and structures reproduce experiment, and there are several ways to skin this cat. For instance, atomic point charges are rather arbitrary (and definitely not "real") and can be calculated and assigned by a variety of theoretical approaches. In the end, nobody knows if the final values or even the functional forms have much to do with the real forces inside crystals. It's all a question of parameterization, and while parameterization may seem like a magic wand that can give you anything you want, that's precisely the problem with it...that it may give you anything you want without reproducing the underlying reality. Overfitting is also a constant headache and, in my opinion, one of the biggest problems with any modeling, whether in chemistry, quantitative finance or atmospheric science. More on that later, but a small illustration follows below.
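Overfitting is easy to demonstrate in miniature. The toy example below (generic, nothing to do with crystal potentials specifically) fits noisy data generated from a simple quadratic with both a quadratic and a wildly flexible degree-9 polynomial; the flexible fit typically hugs the training points more closely while predicting fresh points worse, which is exactly the trap a heavily parameterized potential can fall into.

    # Miniature overfitting demo: a degree-9 polynomial usually hugs the noisy
    # training points more closely than a quadratic, but tends to generalize
    # worse to fresh data drawn from the same underlying curve.
    import numpy as np

    rng = np.random.default_rng(0)

    def truth(x):
        return 1.0 - 2.0 * x + 0.5 * x**2

    x_train = np.linspace(0, 4, 12)
    y_train = truth(x_train) + rng.normal(0, 0.3, x_train.size)
    x_test = np.linspace(0.2, 3.8, 50)
    y_test = truth(x_test) + rng.normal(0, 0.3, x_test.size)

    for degree in (2, 9):
        coeffs = np.polyfit(x_train, y_train, degree)
        train_err = np.sqrt(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
        test_err = np.sqrt(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
        print(f"degree {degree}: train RMSE = {train_err:.2f}, test RMSE = {test_err:.2f}")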

An accurate treatment of intermolecular forces will have to take electron delocalization into consideration. The part which is the hardest to deal with is the part close to the bottom of the famous Van der Waals energy curve, where there is an extremely delicate balance between repulsion and attraction. Naturally one thinks of quantum mechanics to handle such fine details. A host of sophisticated methods have been developed to calculate molecular energies and forces. But those who think QM will take them to heaven may be mistaken; it may in fact take them to hell.

Let's start with the basics. In any QM calculation one uses a certain theoretical framework and a certain basis set to represent atomic and molecular orbitals, and one then adds terms to the basis set to improve accuracy. Consider Hartree-Fock theory. As Dunitz says, it is essentially useless for dealing with dispersion because it does not take electron correlation into account, no matter how large a basis set you use. More sophisticated methods have names like "Møller-Plesset perturbation theory with second-order corrections" (MP2), but these may greatly overestimate the interaction energy, and more importantly the calculations become hideously computer intensive for anything more than the simplest molecules.
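The "hideously computer intensive" part has simple arithmetic behind it. The formal scalings usually quoted are roughly N^4 for Hartree-Fock, N^5 for MP2 and N^7 for coupled-cluster methods like CCSD(T) (real implementations do better, but the trend holds), so doubling the size of a molecule costs very different amounts at different levels of theory:

    # Formal scaling of common quantum chemistry methods with system size N
    # (roughly, the number of basis functions). Doubling N multiplies the cost
    # by 2**exponent.
    SCALING = {"HF": 4, "MP2": 5, "CCSD(T)": 7}   # commonly quoted formal exponents

    for method, exponent in SCALING.items():
        slowdown = 2 ** exponent
        print(f"{method}: doubling the system costs ~{slowdown}x more time")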

True, there are "model systems" like the benzene dimer (which has been productively beaten to death) for which extremely high levels of theory have been developed that approach experimental accuracy within a hairsbreadth. But firstly, model systems are just that, model systems; the benzene dimer is not exactly a molecular arrangement which real-life chemists deal with all the time. Secondly, a practical chemist would rather have an accuracy of 1 kcal/mol for a large system than an accuracy of 0.1 kcal/mol for a simple system like the benzene dimer. Thus, while MP2 and other methods may give you unprecedented accuracy for some model systems, they are usually far too expensive for most systems of biological interest to be very useful.

DFT still seems to be one of the best techniques around for dealing with intermolecular forces. But "classical" DFT suffers from a well-known inability to treat dispersion. Dispersion-corrected, or "parameterized", DFT, in which an inverse sixth power term is added to the energy, can work well and promises to be a very useful addition to the theoretical chemist's arsenal. More parameterization though.
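For the curious, the added term is, schematically, a damped pairwise -C6/r^6 correction of the kind popularized by Grimme; the sketch below uses invented coefficients, since the real C6 values and damping radii are tabulated and fitted for each element and functional.

    # Schematic Grimme-style "-D" dispersion correction for a single atom pair:
    # a damped -C6/r^6 term bolted onto the DFT energy. Coefficients are invented
    # placeholders; real C6 values and damping radii are tabulated and fitted.
    import math

    def dispersion_correction(r, c6=20.0, r_vdw=3.5, s6=1.0, d=20.0):
        damping = 1.0 / (1.0 + math.exp(-d * (r / r_vdw - 1.0)))  # kills the term at short range
        return -s6 * c6 / r**6 * damping

    for r in (2.5, 3.5, 5.0, 8.0):
        print(f"r = {r:.1f} A -> E_disp = {dispersion_correction(r):.5f} (arbitrary units)")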

And yet, as Dunitz points out, problems remain. Even if one can accurately calculate the interaction energy of the benzene dimer, it is not really possible to know how much of it comes from dispersion and how much comes from higher order terms. Atom-atom potentials are happiest calculating interaction energies at large distances, where the Coulomb term is pretty much the only one that survives; but at the small interatomic distances that are of most interest to the chemist and the crystallographer, a complex dance between attraction and repulsion, monopoles, dipoles, multipoles and overlapping electron clouds manifests itself. The devil himself would have a hard time calculating interactions in these regions.

The theoretical physicist turned Wall Street quant Emanuel Derman (author of the excellent book "My Life as a Quant: Reflections on Physics and Finance") says that one of the problems with the financial modelers on Wall Street is that they suffer from "physics envy". Just like in physics, they want to discover three laws that govern 99% of the financial world. More typically, as Derman says, they end up discovering 99 laws that seem to govern 3% of the financial world with varying error margins. I would go a step further and say that even physics is accurate only in the limit of ideal cases, and this deviation from absolute accuracy shows distinctly in theoretical chemistry. Just consider that the Schrödinger equation can be solved exactly only for the hydrogen atom, which is where chemistry only begins. Anything more complicated than that, and even the most austere physicist cannot help but approximate, parametrize, and constantly struggle with errors and noise. As much as the theoretical physicist would like to tout the platonic purity of his theories, their practical applications would without exception involve much approximation. There is a reason why that pinnacle of twentieth century physics is called the Standard Model.
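The hydrogen atom is "exactly solvable" in the sense that its bound-state energies come out in closed form, E_n = -13.6 eV / n^2; helium already forces you to approximate. The arithmetic takes two lines:

    # Exact bound-state energies of the hydrogen atom: E_n = -13.6 eV / n^2.
    # This is where closed-form quantum mechanics ends; helium already
    # requires approximation.
    RYDBERG_EV = 13.6057  # eV

    for n in range(1, 5):
        print(f"n = {n}: E = {-RYDBERG_EV / n**2:.3f} eV")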

I would say that computational modelers in virtually every field from finance to climate change to biology and chemistry suffer from what Freeman Dyson has called "technical arrogance". We have made enormous progress in understanding complex systems in the last fifty years, and yet when it comes to modeling the stock market, the climate or protein folding, we seem to think that we know it all. But we don't. Far from it. Until we do, all we can do is parametrize, and try to avoid the fallacy of equating our models with reality.

That's right Dorothy. Everything is a model. Let's start with the benzene dimer.

Dunitz, J. D., & Gavezzotti, A. (2009). How molecules stick together in organic crystals: weak intermolecular interactions. Chemical Society Reviews, 38(9). DOI: 10.1039/b822963p