AI won’t take your job until it does

Since the advent of the industrial revolution, there has been concern that machines will take jobs, and usually this is true. Machines do take jobs. The automatic loom took away weaving jobs. Trains took away pony express jobs. The usual process is that a technology comes along and it either increases the productivity of workers or replaces a job that humans used to do. However, historically the net effect has not been a loss of jobs in toto. Instead, humans started doing other things, which expanded the economy in other directions, and that allowed still other humans to do other things, and so forth.

Is AI an ordinary technology that will displace some jobs while creating others, or will it simply take away jobs? There are good arguments on both sides, but I think the discussion glosses over an important point, which is that what happens with AI is a moving target that depends on the capability of AI. A technology like the wheel or air conditioning has a pinnacle that is restricted to what it was meant to do. AI is different. The aim of AI is to replicate or supersede all human capabilities. So clearly, if an AI is developed that can do anything and everything you do better and cheaper, it will replace you. The question is whether there is a limit to how good AI can get, and this boils down to whether or not biology and the human brain are describable by the laws of physics. If you believe you or your mind is beyond physics, then machines can never fully replace you. But if you believe in a mechanistic view of the universe where everything follows the same understandable laws (as I do), then there is nothing in principle that would negate human-level or superhuman artificial intelligence.

I am certain that human-level AGI is possible and will be attained within my lifetime, but I no longer think it will be soon enough to make me obsolete before I retire. If anything, it may even prolong my career because it can compensate for my eventual (current?) decline. If you haven't tried ChatGPT or Google Gemini or xAI's Grok lately, I suggest you give it a whirl. The capabilities are simply astonishing. They write computer code, summarize scientific research, do mathematical calculations, and so forth. They are also far from perfect. They hallucinate, lose focus, and can forget what happened in a discussion just minutes ago. I used AI to write most of the code for a project I just finished. I had to iterate with it a lot and give it a lot of guidance, but it can instantly generate hundreds of lines of code. It's hard to tell if it increased my productivity, given that it makes a lot of mistakes and I have to repeat steps over and over again. Perhaps I could have gone faster if I had just written the code myself. But the AI models are getting better. AI leaders like Andrej Karpathy and Demis Hassabis think that several more breakthroughs are required before we get to full AGI, but the labs are frantically working and they will come. Those who argue that AI will never reach human-level intelligence because it just replicates the entire internet have either not kept up with the progress or don't actually understand how models are trained. Learning from the internet is just a portion of what goes into an AI model and will be less and less important in the future.

The answer as to whether AI will take your job or be a productivity enhancer is yes. It will do both. It will help you with your job. It will create new jobs. As it gets better it will take jobs. How this all plays out is anyone's guess. AI is like sea-level rise. As the water level rises, we run to higher ground, and that high ground becomes more and more valuable. It is most valuable right before it is submerged, just as I will be most productive right before I am completely obsolete.

COVID exposed liberalism’s greatest gap

I argued in a post four years ago (see here) that Western Liberalism is inherently conflicted. By Liberalism, I mean the modern continuation of the philosophy that arose from the British enlightenment of the seventeenth century initiated by the writings of John Locke. In that previous post, I wrote of the inherent conflict between the “freedom to (do as one chooses)” and the “freedom from (oppression, bullying, …)”, now loosely represented by twentieth-century libertarianism and American New Deal liberalism, respectively. But what should be done if one’s personal choice oppresses another person? I argued that liberalism has no answer, and this conflict is just the tip of the iceberg of liberalism’s biggest problem, which was overtly exposed during the COVID-19 pandemic.

During the pandemic, public health officials would proclaim that their health directives such as social distancing, masking, shutting down schools, and so forth were merely "following the science". I believe they meant it. They really did think that they were following the science, but science, as David Hume pointed out long ago, cannot make claims on moral choices. What they should have said was "according to my current take on the incomplete data that we have, I believe the somewhat draconian actions I suggest will minimize deaths due to COVID infection, which, according to my personal moral principles, is what I deem to be most important." I strongly suspect that most of those officials and scientists don't actually realize that their statements and actions are reflections of their personal moral philosophies rather than some scientific truth. I know because I myself was one of those people who was either blind to or in denial of this glaring gap. I only realized my delusion after many a discussion with a conservative thinker.

Philosophers have been grappling with this moral gap since the advent of liberalism. The Utilitarians such as Jeremy Bentham and John Stuart Mill believed that society should be designed to maximize happiness for the maximum number of people, but even they disagreed as to what constitutes happiness. John Rawls critiqued Utilitarianism and proposed a solution based on his "veil of ignorance" concept. He proposed that society should be designed to be the one that a person who had no control over which attributes or segment of society they would be born into would find most acceptable. It was "obvious" to him that most people would choose a society that made sure the worst off still had a pretty good life, which is what modern American Liberals advocate and is probably what someone with an upper-middle-class intellectual background would suggest. But this choice is not universal. It is quite possible that some people don't want to live in such a society. Maybe they would rather live in a world where a small set of individuals dominate the rest and take their chances on where they end up. Lots of people like to gamble and take risks. Plato advocated for rule by aristocrats, which meant "rule by the best" and not the modern usage of landed gentry. I don't think it's a stretch to suggest that many Silicon Valley power brokers would agree with this idea. There is simply no universally accepted view of what an ideal society would be.

In terms of COVID, many liberals have now come to accept that a more balanced response, weighing factors such as the harm to children's education, the viability of small businesses, and restrictions on personal freedom and speech along with minimizing deaths due to the disease, would have been better. However, I don't think they have fully grasped that there is still a glaring hole in Liberalism that can never be filled by Liberalism alone. In 1989, as communism was collapsing, Francis Fukuyama suggested that we had reached the "end of history" with Western Liberalism as a stable ideological fixed point. While he thought that it may not be a smooth process towards universal Liberalism and has refined his ideas since, he still fundamentally believes that Liberalism (more towards the libertarian view) will win out in the end. I, personally, am not so sure. I think we will be lost for a very long time.

I don't have any good answers, but I do think the most relevant work on this question may be C.S. Lewis's (yes, the same one) 1943 book, "The Abolition of Man", which addresses this gap and defends objective moral law, which he, interestingly, calls "the Tao". You may or may not agree with it, but this book may be one of the most insightful (and underrated) works of philosophy ever. I'm going to let you read it first before I give my take.

The rate of AI takeoff

Artificial General Intelligence or AGI means having cognitive capabilities as good as or better than humans in every facet. As I opined long ago (see here), while there is no inherent physical law to prevent AGI from exceeding humans, I doubt it would make economic sense to do so for everything. Is there really that much utility in a machine being able to tell jokes better than any human? It's not clear to me how anyone or anything could be funnier than the late Sam Kinison anyway. It would take a lot of reinforcement learning in front of a lot of audiences, and even then it might fail because comedy in the '80s was just funnier (I'm purposely dating myself).

The real issue for society is when AGI will be able to do most jobs better than humans. I think AGI capability may come reasonably soon, maybe in the next few years, but I think it will take some time before it completely upends society. AI aficionados believe in something called the intelligence explosion: once AGIs reach a certain level, they will design new AGIs and iterate rapidly toward unbounded intelligence. Yet, even with iterative self-improvement there is still a wide range of possible AGI takeoff speeds, depending on exactly how each generation of AGI will increase the capability of the next.

We can make this quantitative with a simple growth analogy. For population growth, if each member of the population reproduces at some fixed rate then the rate of increase will scale as the size of the population, like bacteria dividing in two and thus doubling the population every generation. This is classic exponential growth and can be represented by the ordinary differential equation:

\frac{dx}{dt} = r x

where x is the population size and r is a rate constant measured in units of inverse time, like per day. We need a differential equation to get x(t), the population size at any given time, because we only have local information (see my last toolkit for the physical world post). The left hand side of the growth equation is the derivative of population size with respect to time, which is the slope or rate of population growth. The population size x is a number while t is a time, and a number divided by time is a speed or rate. For the equation to make sense, the right hand side must have the same "dimensions" as the left, so that is why we need to include a rate constant r, which must have units of inverse time. In physics/math parlance, r sets the scale. Usually, it's easier to think in terms of time rather than rate, so we often set r = 1/\tau, where \tau is called a time constant. It's like a doubling time or half-life but in base e rather than base 2. It matters quite a bit if the time constant is a year or a century. The full solution to the growth equation is

x(t) = x_0 e^{r(t-t_0)}

where t_0 is the initial time and x_0 is the initial condition at time t_0. For every derivative in a differential equation, you need to specify a piece of information – initial conditions for time derivatives and boundary conditions for spatial derivatives. For simplicity, let t_0 = 0 (you can always shift the time so that whenever you start is called zero) and let x_0 = 1 (you can always rescale the population and think in terms of population relative to the initial population), so we get x(t) = e^{rt} = \exp(rt). In math notation, \exp(a) means e^a. The rate constant determines everything. If you think that the time constant for AGI growth is a year or two, like Moore's law, then you'll have really rapid growth. You'll get a tenfold increase in less than a decade. If you believe it's more like ten years, then everything takes ten times longer and the growth you would have seen in a decade takes a century. If r is negative then you get decay. The exact same equation describes both exponential growth and exponential decay. It also explains exponential approach to an equilibrium. For example, if I combine linear growth with exponential decay I get

\frac{dx}{dt} = I - r x

for which x will approach the equilibrium value I/r exponentially with a rate constant of r. This equation can even be applied to body weight, where the time constant is about a year. The general rule of thumb is that it will take about three time constants to get 95% of the way there, so for body weight, you have to eat consistently for three years to know your eventual body weight, or you can use the NIH body weight planner, which incorporates some of the modeling work that I did with Kevin Hall.
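To make these time scales concrete, here is a minimal numerical sketch (the time constants are purely illustrative, not forecasts). It checks the time to a tenfold increase under pure exponential growth, and the rule of thumb that exponential relaxation toward an equilibrium covers about 95% of the distance after three time constants:

```python
import math

# Pure exponential growth x(t) = exp(t / tau): time to a tenfold increase is tau * ln(10).
for tau_years in (1, 2, 10):          # illustrative time constants, not estimates
    print(f"tau = {tau_years:>2} yr: tenfold increase in {tau_years * math.log(10):.1f} years")

# Exponential approach to equilibrium, x(t)/x_eq = 1 - exp(-t / tau):
# fraction of the way to equilibrium after 1, 2, and 3 time constants.
for n in (1, 2, 3):
    print(f"after {n} time constant(s): {1 - math.exp(-n):.0%} of the way there")
```

A one- or two-year time constant gives a tenfold increase in under five years, a ten-year time constant stretches that to over two decades, and three time constants gets you to about 95% of the equilibrium, which is where the three-year rule for body weight comes from.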

In exponential growth, it will take infinite time to get to infinity. There is no singularity, which would require x to go to infinity in a finite time, like a black hole. However, it is possible to have a growth equation exhibit a singularity by changing the equation to

\frac{dx}{dt} = r x^\alpha

where \alpha is a measure of the gain in rate with population size. One way to think about this is to rewrite the right hand side of the equation as (r x^{\alpha-1}) x and consider r x^{\alpha-1} as an x dependent rate constant. It gets larger for larger x if \alpha is greater than one and smaller if it is less than one. Cities are an example where there is an increasing advantage with scale and \alpha is greater than one. Economic growth would not be as fast if people were spread out rather than concentrated in one place. It is not a coincidence that almost all the AI companies and startups are located in San Francisco and most finance companies are in New York or London.

We can solve the differential equation by rearranging to obtain

\frac{dx}{r x^\alpha} = dt

and integrating both sides to get

\frac{x^{1 - \alpha}}{r(1-\alpha)} = t + C

where C is an integration constant determined by the initial condition. If \alpha > 1, then by choosing the initial condition appropriately, we can write

x \propto \frac{1}{(t_S - t)^{1/(\alpha-1)}}

Notice that we have t_S - t in the denominator. Suppose we start at a time t less than t_S. As t gets closer to t_S, x gets bigger and bigger, and x reaches infinity when t = t_S. In math, this is called a finite-time blow up and is a bona fide singularity. Things start slow and then they explode. On the flip side, if \alpha is less than one then you'll get slower-than-exponential growth.
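If you want to see the difference between these regimes without doing the integral, a few lines of numerical integration will show it (a rough sketch; the parameter values are arbitrary illustrations):

```python
# Forward-Euler integration of dx/dt = r * x**alpha starting from x = 1.
# alpha < 1: slower than exponential; alpha = 1: exponential; alpha > 1: finite-time blowup.
r, dt, t_max = 1.0, 1e-3, 10.0

for alpha in (0.5, 1.0, 1.5):                      # illustrative exponents
    x, t = 1.0, 0.0
    while t < t_max and x < 1e9:                   # treat 10^9 as "blown up"
        x += dt * r * x**alpha
        t += dt
    status = "blew up" if x >= 1e9 else "still finite"
    print(f"alpha = {alpha}: x({t:.2f}) = {x:.3g} ({status})")
```

With alpha = 1.5, x shoots past any bound near t = 2 (the singularity time for these particular parameters), while the other two cases stay finite over the whole run.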

The rate of AI takeoff thus boils down to the values of r and \alpha. Neither needs to be constant either; they could change depending on circumstance. For AI, x would not correspond to a population size but rather to some measure of AI capability. When AI capability is limited, as it was even a year ago, \alpha is near zero. AI improvement up to now has not been limited by how good AI is but rather by human ingenuity, data, and compute. The dramatic increase in performance we've seen recently rides on top of Moore's law, the cleverness of human AI researchers, and the audacity to keep going bigger.

I use AI tools daily and I've gotten pretty good at getting them to do what I need them to do, but I need to be quite specific and I often have to iterate multiple times before I get something useful. The biggest problem I have is that the AI forgets what transpired a few interactions ago, so I have to explain the problem over and over again. It's kind of like working with someone without a hippocampus, who will forget everything once it's out of their immediate attention. Part of this will be solved with longer context windows. But having a completely viable AGI helper may require a dedicated model whose weights are updated as you interact with it. All of this will require a lot more compute, and that will be a rate-limiting step for at least the near future.

AI will increase the productivity of AI research, but for growth to explode, AI cannot be limited by anything other than itself. Even if AI takes over all research, including designing hardware and building foundries and power plants, it still has to obey the laws of physics. The size of transistors on GPUs is limited by the structure and properties of matter. How fast information can travel is limited by the speed of light. How large buildings can be is limited by the strength of gravity. AI optimists seem to think that all problems are limited by intelligence, but intelligence can only do so much. The theory of computation shows that there are problems that are just plain hard to solve no matter how smart you are. Biological systems are complex and contingent on random evolutionary quirks that must simply be discovered. From my own experience, I don't think intelligence is the rate-limiting step for progress in medicine. Most drugs fail because you just don't know how a drug will behave unless you actually test it on lots of people, and that is just going to be slow. Being smarter might make the odds better, but it will still take time to test. I thus think the jury is still out on how fast AI will take off. I am sure that I will be replaced someday, but I'm really not sure when.

My obsolescence is nigh

Nine years ago I wrote a post, Alpha Go and the Future of Work, where I argued that the replacement of human work with AI was inevitable and that humans would have to figure out what to do with themselves. I hypothesized a possible non-dystopian future (I felt obliged to go against type and be optimistic in that post) where humans could live in a fabricated pre-industrial world doing low productivity but rewarding tasks like making pottery or writing poems while having most of our material needs met by machines. I also argued that mathematicians could still do mathematics since it was infinite and even if machines were better at it than us, we could still find open problems to explore and amuse ourselves. That post was written pre-ChatGPT and so I wasn’t sure when the machine takeover would happen but now it is clear to me that the time is soon if not now.

When OpenAI first introduced their GPT (which stands for generative pre-trained transformer) models a few years after I wrote my post, I was skeptical that it would get us all the way to AGI. (What used to be called artificial intelligence (AI) is now called artificial general intelligence (AGI) because AI is now used for what used to be called machine learning, which is what used to be called statistics). I was skeptical because GPT used supervised learning, which is limited by the data it is trained on. It was noticed around 2012 that deep learning model performance got better if you increased the amount of data, the number of parameters in the model, and the amount of computation (now called compute). (I notice that in the 21st century, verbs can be nouns and vice versa). The great improvements we have seen recently in AI are mostly due to the increase in these three quantities. While technical tweaks can help, generally bigger is better. That is why the big tech companies are scrambling to build larger and larger data centers. I'm still long on NVIDIA because even though newer smaller models can often do the same thing as older larger models (e.g. DeepSeek R1), there is still much to be gained by going bigger, both in the training of the models and especially, as I will get to later, in the delivery.

What convinced me that AGI is around the corner is that the power of the current frontier large language models (LLMs) like OpenAI's ChatGPT-4.5 and Anthropic's Claude 3.7 Sonnet is all in the post-training. To really understand why, I need to explain how large language models work and are trained. Those in the know can just skip ahead. What these models do is take a sequence of tokens (tokens are like pieces of words) and predict the next token. The prediction involves a series of simple mathematical operations like matrix multiplication with a whole lot of adjustable parameters. Changing the values of the parameters changes the prediction. Pre-training involves comparing the predicted token with the actual token and then adjusting the parameters (using gradient descent) to make any discrepancy smaller. The pre-training phase involves taking these models, feeding them as much of the world's written record as can be acquired, and then adjusting the hundred billion or more model parameters. This process can take months on thousands upon thousands of GPUs (graphical processing units, made mostly by NVIDIA) and costs many millions of dollars.
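For the programmers, here is a toy version of that pre-training loop, with a tiny embedding-plus-linear model standing in for a transformer and random integers standing in for real text (everything here, sizes included, is invented for illustration):

```python
import torch
import torch.nn as nn

# Toy "pre-training": predict the next token id from the current one, adjusting
# parameters by gradient descent to shrink the discrepancy (cross-entropy loss).
vocab_size, dim = 50, 16
tokens = torch.randint(0, vocab_size, (1000,))             # stand-in for real text
inputs, targets = tokens[:-1], tokens[1:]                  # each token predicts the next

model = nn.Sequential(nn.Embedding(vocab_size, dim), nn.Linear(dim, vocab_size))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):
    logits = model(inputs)                                 # predicted scores for the next token
    loss = loss_fn(logits, targets)                        # how far off the predictions are
    optimizer.zero_grad()
    loss.backward()                                        # gradients of the loss w.r.t. parameters
    optimizer.step()                                       # nudge the parameters downhill
```

A real frontier model swaps the two-layer toy for a transformer with on the order of a hundred billion parameters and swaps the random tokens for trillions of tokens of text, but the loop has the same shape.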

This phase is then followed by post-training, which is why ChatGPT will produce nicely organized paragraphs with an enthusiastic and polite tone (xAI's Grok 3 will give you rude responses if you choose the unhinged mode). Post-training involves giving a model feedback on its responses to steer it in a desired direction. Part of this involves something called reinforcement learning (RL), which is how animals and humans mostly learn. In RL, a model (or animal) will modify its actions (called a policy) to maximize its predicted rewards (e.g. being right or wrong, or catching a rabbit). So you could give it a math problem (the model, not the wolf) and, instead of training it to match the solution explicitly as in pre-training, you simply let it try to solve the problem and then tell it if it is right or wrong. In this way, you let the machine figure out on its own how it should respond. You can also have it produce responses and ask humans to give feedback on the answers (i.e. reinforcement learning with human feedback, RLHF). RL is where the magic happens. It's what was used to train Alpha Go. RL is what will take LLMs to AGI.
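To see how learning from a bare right-or-wrong signal differs from matching a target token, here is about the smallest possible reinforcement learning example (a contrived single-question toy, nothing like a real RLHF pipeline):

```python
import torch

# The "policy" is a single probability of answering "yes" to one question whose
# correct answer happens to be "yes". The model never sees the answer, only a
# reward (1 if right, 0 if wrong), yet the REINFORCE update steadily raises the
# probability of the rewarded response.
logit = torch.zeros(1, requires_grad=True)        # one learnable parameter
optimizer = torch.optim.SGD([logit], lr=0.5)

for step in range(200):
    policy = torch.distributions.Bernoulli(logits=logit)
    answer = policy.sample()                      # sample a response: 1 means "yes"
    reward = answer.item()                        # right answer rewarded, wrong one not
    loss = -reward * policy.log_prob(answer)      # maximize reward-weighted log-probability
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"probability of the correct answer: {torch.sigmoid(logit).item():.2f}")
```

Scale the single parameter up to billions and the single yes/no question up to graded judgments about entire responses, and you have the flavor of how post-training steers a model.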

The CEOs of the leading AI companies are predicting that AGI will arrive this year or next. I am generally a skeptic and a naysayer about tech hype, but this time I'm convinced it will happen. I use LLMs regularly in my work and daily life and can see in real time how fast they are improving. All of them can pass the Turing test with ease. Sure, they hallucinate occasionally and give wrong answers, but so do humans. However, while AGI will arrive in principle very soon, it is not clear how long it will take to drastically change the world. One of the factors will be cost. AGI will be expensive, and the bulk of that cost will be in the delivery. When an LLM generates a response, it does so one token at a time. After it produces the first token, it adds it to the original sequence and runs the whole thing back through the model to get the next token. This is why the current models limit how long an input, called the context length, can be. Right now context lengths are on the order of a hundred thousand tokens, with the Google models up to 2 million. The longer the context length, the more compute is required to generate responses. Additionally, models with some reasoning capability will run through the model multiple times and build on previous passes, verify responses, backtrack and correct errors, or just try out different responses and then pick the best one. This all requires a lot of compute. Advances are continuously bringing down this cost, but whatever we end up with will cost more than a simple Google search. The increase in required compute, be it a hundred- or a thousand-fold, will require increasing the existing infrastructure by that factor. That will be a lot of data centers using a lot of energy.
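The token-by-token loop described above looks roughly like this in code (a schematic sketch; `model.predict_next` and the tokenizer are hypothetical placeholders, not a real API):

```python
# Schematic autoregressive decoding: each generated token is appended to the
# sequence, and the whole (truncated) sequence is fed back in to get the next one.
def generate(model, tokenizer, prompt, max_new_tokens=100, context_length=100_000):
    tokens = tokenizer.encode(prompt)
    for _ in range(max_new_tokens):
        context = tokens[-context_length:]         # anything older than this is forgotten
        next_token = model.predict_next(context)   # one full forward pass per generated token
        tokens.append(next_token)
    return tokenizer.decode(tokens)
```

Every generated token costs a pass through the model over the whole context, and reasoning models make many such passes, which is why the delivery, rather than the training, will dominate the cost of serving these systems.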

Given this expense, the first jobs that AI will displace are among the highest paying. As I mentioned in my post nine years ago, Alpha Go was easily able to defeat humans at Go not because it was particularly good at Go but because humans (in general) are really bad at it. The more practice, study, and training it takes for a human to do a task, the easier it will be for a machine to do it better. Thus, the jobs that require specialized skills, like computer programming, accounting, statistics, and applied math, will be the first to be replaced. The things that humans do naturally, like guessing which member of the tribe stole their lunch, will be the hardest for machines to replicate because those are capabilities that evolved over millions of years. That is not to say that machines using RL won't be able to do them, just that it will likely take more time and compute. After all, natural selection is a form of reinforcement learning.

So sadly, I may be the last person to hold my job. I am fortunate that this is arriving in the twilight of my career so it doesn't affect me that much, but I feel bad for those younger than me. For example, I've been working on my software tool StochasticGene for the past five years, which builds on my work from the past decade. An AI agent will soon be able to recreate everything I've done in 30 seconds. It will also be easier for the AI to recreate my code from scratch rather than help me build on what I have, because it's much harder for the AI to understand what I've done and integrate changes into it without breaking something (which it has done several times already) than to just make its own code that will fit single molecule imaging data with stochastic models of gene transcription. I also feel that once an AI is better at my job than me, I should simply step aside. Unlike pure mathematics, where nothing vital is really at stake, doing my job faster and better could possibly lead to a new treatment sooner. While I'm heartened that maybe this will accelerate medical research so that it will save a life someday, I am also very sad that I won't get to do what I truly loved doing.

2025-03-24: I corrected a grammatical error in the first paragraph. Yes, I actually wrote this without using an LLM.

Toolkit for the physical world: Chapter 2 – All interactions are local

I still haven’t quite decided how to organize this series. I’ll stick with chapter numbers for now although these short posts hardly count as real chapters and the chapters don’t necessarily need to be read in order. I just need to try to get back into the habit of writing so bear with me as I try to figure this out.

A crucial thing to know about (our modern understanding of) the physical world is that all physical interactions are local. By this I mean that for thing A to affect thing B, thing A has to go out and "touch" thing B or send an agent to do the "touching" for it. There is no action at a distance without some mediator. Well, what about radio and TV (mobile phones, for any Gen Z readers), you might ask? Yes, the magic of modern technology allows us to use a device to hear voices and watch images from around the world. In fact, Newton kind of kicked off the quantitative science revolution by proposing that the moon stays in orbit around the earth because earth's gravity exerts a pull on it. So how is that not action at a distance?

Well, this seeming contradiction is resolved by the concept of fields. Although a full understanding of a field requires some advanced physics like Maxwell’s equations, general relativity, and quantum mechanics, the gist of it can be understood quite easily. A field is a thing that has a value at every point in space. A gravitational field is the thing in space that pushes an object with mass in a given direction. The gravitational field is also induced by any object with mass. So the earth and moon both create gravitational fields that attract each other. They don’t crash into each other because they are moving so they kind of miss each other as they fall towards each other. Yes, it is confusing but trust me for now. Orbital motion is very well understood, so well understood that we can launch a spacecraft and land it on the moon or any planet.

I can hear some of you muttering that a field still sounds like action at a distance. How can the earth set up a gravitational field that affects the moon? Yes, the caveat is that nothing can travel faster than the speed of light. So, it actually takes time for the earth to set up its gravitational field. If the earth were to suddenly disappear, the moon would not notice for a little more than a second, which is the time it takes light to travel from the earth to the moon. The gravitational field generated by the earth would dissipate as a complicated wave initiated at the former location of the earth. The same goes for turning on a light in a room. It seems instantaneous, but it took time for the electricity to reach the bulb to generate light and then for that light to fill the room.
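That one-second delay is just the earth-moon distance divided by the speed of light:

```python
# Light travel time from the earth to the moon at the average earth-moon distance.
moon_distance_km = 384_400                 # mean distance
speed_of_light_km_per_s = 299_792.458
print(moon_distance_km / speed_of_light_km_per_s)   # about 1.3 seconds
```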

Locality is also true when it comes to biology and your health. Drugs work because the drug molecules travel through your blood stream and affect cells. A painkiller like ibuprofen works because it inhibits the action of certain molecules. If you think some agent like microplastics or power transmission lines in the environment can cause illness, then you need to come up with a local explanation of why and then test it. For example, microplastics might cause health problems by stimulating immune cells and causing excessive inflammation. But it's a lot harder to come up with a plausible hypothesis for power lines. Power lines carry AC current at 50 or 60 cycles per second, so they do emit radio waves, but the corresponding wavelengths are on the order of thousands of kilometers. Microwaves have wavelengths on the order of ten centimeters and heat up tissue by inducing polarized molecules like water to vibrate and bang into each other, thereby creating heat. Electromagnetic waves at very high frequencies like infrared radiation, visible light, and UV radiation can affect tissue, but they are not generated by power lines. That is not to say that power lines couldn't affect health, but you need to come up with a plausible explanation with local interactions.
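The wavelength numbers come from dividing the speed of light by the frequency (a typical microwave oven runs at about 2.45 GHz):

```python
# Wavelength = speed of light / frequency for the two examples above.
c = 299_792_458                                   # speed of light in m/s
for label, freq_hz in [("60 Hz power line", 60.0), ("2.45 GHz microwave oven", 2.45e9)]:
    print(f"{label}: wavelength of about {c / freq_hz:,.2f} m")
```

The 60 Hz line comes out at roughly five thousand kilometers, while the microwave is about twelve centimeters, which is why the two interact with matter so differently.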

You can get quite far just using locality to try to understand how something can work or whether it is reasonable. One of my pet peeves about science fiction movies is that you can see the laser beams in ray gun battles. Now, in order to see something, light has to enter your eyes. So to see a laser beam shooting from one point to another, some of the photons in the beam of light have to reach your eye. Well, why would that happen? If you were shooting through air, some of the photons might scatter off of air molecules and enter your eye, but that wouldn't happen in the vacuum of space. If there is no scattering, then there is no reason a well-designed weapon, which would concentrate all of its energy on the target, would be visible. A visible beam just means energy is being wasted. My series will have succeeded beyond all expectations if future moviegoers become discerning enough to insist that laser beams not be visible in movies.

Toolkit for the physical world: Chapter 1 – Where do we start?

Obviously, I should have prepared Chapter 1 after I posted the Introduction but of course I hadn’t. I thought it would be easy to write these posts but then I got stuck in the “everything is connected” conundrum. Science is like an arch where all the stones need to be in place for it to stand. Everything relates to everything else. So like every other botched attempt to explain science, I’m going to have to blunder down the same path of introducing a lot of concepts that won’t fully make sense until they are all put together.

I’ll start with the main scientific premise, which is that science is objective – meaning that everything that happens is governed by a strict set of rules or laws, which we mostly understand. In principle, these laws of physics give us the tools to understand and explain all physical phenomena. But there’s a catch: these laws operate primarily at a microscopic level. What we observe at human scales is not always a direct application of these laws but the cumulative effect of their manifold applications.

A natural way to explain science would be to start with these laws. The world is made up of tiny elementary particles, and their movements and interactions are governed by four fundamental forces: gravity, electromagnetism, the weak nuclear force, and the strong nuclear force. There are also universal quantities, like energy and momentum, that are conserved. These laws form the backbone of how the universe works, but it’s not always obvious how they’re being applied.

A good rule of thumb is that the simpler the physical system—like the motion of a single particle—the easier it is to figure out which laws are in operation and how to make accurate predictions. But when it comes to more complex phenomena, like living organisms, there’s a myriad of microscopic actions playing out across a wide range of scales. Sometimes, out of this turmoil, new regularities will emerge—meta-laws, so to speak. These emergent meta-laws aren’t fundamental, nor are they entirely reliable, but they’re often just good enough to let us make predictions.

There’s no one-size-fits-all way to discover and use these meta-laws, and that’s what makes the practice of science both engaging and frustrating. It’s also what makes science so hard to explain to a layperson. How do you explain why we’re both confident and confused at the same time? Why we can answer some questions but not others? Why flimsy pieces of evidence sometimes carry more weight than they should? It’s why we occasionally succeed but often fail. The fundamental laws give us confidence that there might be a way to figure things out, but they don’t tell us how to find it. And sometimes, what looks like a meta-law turns out to be nothing more than us spotting patterns in randomness.

Science is two things: a vast body of accumulated knowledge that explains how the world operates, and the process by which we acquire and validate new knowledge that revises and updates the old. I’m always skeptical of what we think we know, but I trust the process. In fact, writing this series feels a lot like doing science. I don’t know exactly where I’m going or whether I’m doing it right, but I do know I can at least leave a trail—one that others can follow, improve, and fix.

Toolkit for the physical world: An introduction

I had trouble writing this post so I gave it to ChatGPT4o and asked it to fix it. This is what it came up with (after some further editing by me; I had to remove some, but not all, of the cheeriness, for example):

I can’t remember the exact source—maybe a podcast like the BBC’s In Our Time or something I read—but it was about a public debate over a contentious scientific topic in the Victorian age. What struck me wasn’t the details of the topic itself, but how the discourse unfolded. The discussion revolved around the plausibility of competing hypotheses in light of existing data, with proposals for experiments to resolve the uncertainties. What stood out most was the shared, implicit assumption that the scientific method was the only valid way to uncover the truth. It was the kind of rigorous, data-driven reasoning we rarely see in the public sphere today.

This observation led me to reflect on a curious paradox: though our world is more reliant on science and technology than ever before, the public’s connection to science has diminished. During the Victorian age, science was widely admired—even romanticized—for its potential to reveal the secrets of the universe. Today, that admiration has waned. In popular culture, magic often eclipses science, both in entertainment value and in how it captivates the imagination. Just look at the way science is portrayed in movies: more often than not, it’s either a plot device for catastrophic hubris (Frankenstein, Jurassic Park) or a stand-in for magic (Iron Man, Doctor Strange). Arthur C. Clarke famously wrote, “Any sufficiently advanced technology is indistinguishable from magic;” it feels like many have skipped over the science entirely and gone straight to the magic.

Why has this disconnect grown? Part of the answer lies in the failures of science and the hubris of scientists—manifested in events like World Wars, nuclear weapons, climate change, social media’s unintended consequences, the COVID-19 pandemic, etc. I won’t delve into that here. Instead, I want to focus on another factor: the very nature of science itself. Thinking scientifically isn’t “natural” for the human brain. Our minds evolved to make quick inferences with limited data, spot patterns, and leap to conclusions—abilities that were essential for survival but are fundamentally at odds with the slow, methodical process of gathering evidence, testing hypotheses, and refining theories that science demands. Science, like playing the violin, is a skill that must be learned. It requires years of practice, patience, and, above all, the resilience to persist through repeated failure.

This leads to another challenge: science education. Understanding modern science and technology requires not only a grasp of the scientific method but also a large body of background knowledge. Physics, chemistry, biology, mathematics—the sheer volume of information is daunting. Compounding the problem is that science is often taught as a series of isolated facts, obscuring the process by which those facts were discovered. The messy, iterative, failure-ridden practice of science rarely makes it into the classroom, much less the big screen. And let’s face it: a realistic depiction of someone running failed experiments over months or years wouldn’t exactly be box office gold.

Yet the practice of science is what makes it so powerful. Hypothesis testing, experimentation, measurement, and iteration—these are the tools we’ve honed to understand the physical world. They’re slow and arduous, but they work. My greatest superpower, as someone who practices science, is the ability to fail repeatedly and keep going. That resilience is crucial because science is mostly failure—until it isn’t.

I thus want to write a series of articles outlining some essential concepts and tools you need to understand the physical world. You really just need to know a specific set of keystone concepts and a handful of practical tools to navigate the complexities of the modern world. I hope these ideas will equip you to assess statements for their validity, distinguish fact from fiction, and evaluate statistical claims with a discerning eye. Whether it's understanding the principles behind a medical study, interpreting the risks of new technology, or simply questioning "too-good-to-be-true" claims, these tools aim to allow you to cut through noise and misinformation. They're not about mastering every detail but about recognizing patterns, asking the right questions, and applying a systematic way of thinking that empowers you to make informed decisions in a world driven by data and complexity. Through this series, I hope to provide a toolkit for understanding the physical world.

The Physics of Refrigeration

Today, the high temperature for Baltimore was 37 Celsius, which is body temperature. Much of the US is experiencing a heat wave with many places in the south and southwest experiencing highs over 40 C. It’s pretty clear that without air conditioning (which is a room refrigerator), life in the summer would be unbearable. I think the refrigerator is one of the greatest inventions in history. Now of course things like writing, the wheel, the printing press, electric lights, etc. may have had more impact on the world but I think pumping heat (which is what a refrigerator does) is the greatest because it required an extremely deep understanding of physics and math before it could happen. It would be very unlikely to accidentally discover how to pump heat.

Given the importance of refrigeration and air conditioning to modern life, I think it is imperative that everyone knows how they actually work. People probably know that it involves some sort of refrigerant that is pumped around. They may even vaguely remember learning about the Carnot cycle in their high school or college physics classes. However, my guess is that many people don’t really get intuitively how a refrigerator works, unlike say a furnace, which they do get (i.e. it burns a fuel that makes hot air that is blown around the house, or it burns gas to heat water that is pumped around the house).

To make a refrigerator, there are a few things you need to know. The first is that there is this thing called energy that is conserved. This is known as the first law of thermodynamics. What is energy? Physicists and philosophers could debate that into the wee hours of the night but suffice to say it is a thing that can take many forms but cannot be destroyed. More importantly, energy can take the form of work (i.e. the ability to move something) or heat, which is a form of internal energy measured by temperature. The second thing you need to know is that heat flows from a higher temperature to a lower temperature. This is the second law of thermodynamics and has to do with entropy always increasing, which is a topic I will visit in the future.

The second law is less obvious than it seems. If you boil some water and leave it on the counter, it will cool down to room temperature. Heat flows out of the hot water and into the room. However, if you take a cup of cold water out of the fridge and leave it on the counter, then it will warm up to room temperature. So, if you were not careful, you might believe things just flow to room temperature. The question is why the cup got warmer but the room did not get colder if heat flowed from the room to the cup. Likewise in the first example, the hot cup got colder but the room didn't seem to change either. Why? Well, it is because the room is really big compared to the cup, and how much the temperature changes when you add or subtract heat depends on how much stuff (called heat capacity) you have. Removing a cup's worth of heat from the room only changed the temperature by an imperceptible amount because the room has a lot of heat capacity. The excess energy was basically spread out over a large volume. Refrigeration is hard because you need to make heat flow from a cold area to a hotter area. The laws of thermodynamics, which are absolutely not obvious and are one of the greatest human achievements, also explain why perpetual motion machines are not possible (another post perhaps).

The first two laws of thermodynamics still don't tell you how to make a heat pump. They only tell you that it requires work (energy). But what work do you need to do? Most heat pumps use some form of compression and expansion cycle because when you compress a gas/fluid (i.e. do work on it) the temperature goes up, and conversely, when you let a gas expand against something like a piston (the gas does work), the temperature goes down. Because energy is conserved, when you do work on a gas that energy is converted into heat or "internal" energy, which is manifested by an increase in temperature. When the gas expands against something it must use some internal energy and the temperature will decrease. Finally, most gases will cool down just by moving from high pressure to lower pressure through a small opening (called a throttle). This is called the Joule-Thomson effect and occurs because the gas molecules have an attractive force for each other (Van der Waals forces) and thus expansion requires work to pull them further apart.
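You can estimate how strong the compression heating is with the ideal-gas relation for adiabatic compression, T2 = T1 (P2/P1)^((gamma-1)/gamma). The numbers below are for air and are only illustrative:

```python
# Temperature rise of air under adiabatic compression: T2 = T1 * (P2/P1)**((gamma-1)/gamma).
gamma = 1.4                       # ratio of specific heats for air
T1 = 293.0                        # starting temperature, about 20 C, in kelvin
for pressure_ratio in (2, 5, 10):
    T2 = T1 * pressure_ratio ** ((gamma - 1) / gamma)
    print(f"compress {pressure_ratio:>2}x: {T1 - 273.15:.0f} C -> {T2 - 273.15:.0f} C")
```

Even a modest compression heats the gas well above room temperature, and letting it expand again does the reverse, which is the handle a heat pump exploits.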

Now, we almost have all we need to build a refrigerator. As you can anticipate, we will use expanding gas to cool our fridge. One possible design is to have a container of compressed gas with a throttle attached so the gas cools as it is released and expands. Pump this cool gas into the fridge where it can absorb heat from the inside of the fridge (heat flows from the warmer interior of the fridge into the colder gas and heats that gas). Pumping the gas back out and compressing it makes it a full cycle. Energy is used to pump the gas around and to compress it. This is called a reverse gas cycle, or reverse Brayton cycle. It uses energy to move heat from a cold place to a warm place. In the forward direction, the Brayton cycle is a heat engine (in fact it is a jet turbine) where you convert heat into work. The problem with the Brayton cycle is that it is very inefficient (or impossible) to use as a heat pump at room temperatures or lower. This is because you can only get so much cooling when you throttle a gas at low temperatures (modern freezers can go down to -40 C).

Thus, most modern fridges and air conditioners use a vapour-compression cycle, where a liquid/gas combination refrigerant is used. The principle is similar: you get cooling by expansion, but the added twist is that you don't just expand a gas, you let a liquid "boil" into gas as it expands. This results in a much larger drop in temperature but requires finding a magical substance that is liquid at room temperature when under compression and then turns into gas when sent through a throttle. Freon, which is such a refrigerant, was invented in the early twentieth century and made refrigerators practical and affordable for the masses. However, Freon (and related CFCs) also destroyed the ozone layer that protects the earth from harmful UV rays and was phased out internationally under the Montreal Protocol. We should note that this beneficial agreement would not have happened so quickly if chemical companies hadn't already found replacements (and could thus enforce new patent monopolies since the old ones had expired). Regulation can happen when the interests of capitalists are aligned. Something to note when dealing with climate change.

There is one final trick that makes the modern fridge/AC more efficient: when you compress the gas back into a liquid, its temperature rises to much higher than room temperature, even higher than the temperature in Texas during a heat wave (although barely). This hot liquid is then cooled by releasing heat into the less hot environment, giving it a head start before it passes through the throttle. A commercial heat pump is an air conditioner that can move heat in both directions. When it is in air conditioning mode, the refrigerant expands and cools inside the house and is compressed outside of the house; when acting as a heater, it does the opposite.

Finally, it is actually possible to build a refrigerator with no moving parts using what is called the thermoelectric effect (the Peltier effect), where passing electricity across a junction of two different materials can cause one side to cool down and the other to heat up due to a quantum mechanical effect (which I may explain later). Now that you know the laws of thermodynamics and how a refrigerator works, explain why you can't cool a room by leaving the fridge door open.

The venture capital subsidy is over

Much has been written about the low approval of the current President and the general disgruntlement of the population despite conventional economic measures being good. The unemployment rate is low, wages are rising, and inflation is abating. The most logical explanation is that while the rate of inflation is slowing, prices are still higher than before, and while unemployment affects a few, high prices affect everyone. I think this is correct, but one thing that doesn't seem to be mentioned is that one of the reasons prices are higher, and in some cases much higher, is that many of the tech sector services that people relied on, like ride share and food delivery, were basically giving away their goods for free before, and now they are not. Companies like Uber, DoorDash, and Grubhub were never profitable and were kept afloat by unlimited venture capital money, especially from SoftBank, and now this subsidy is gone. Now if you want a ride or want food delivered, you're going to have to pay full price, and this has made a lot of people really unhappy.

The subsidy was premised on the Silicon Valley idea that all start-ups need to "scale" until they are a virtual monopoly (like Amazon, Google/Alphabet and Facebook/Meta). However, the one thing that these tech strategists seemed to not consider is that scaling is useful when getting bigger means getting better, either by having lower cost, acquiring more data, or exploiting network effects. Scaling can work for search and entertainment companies because the more users there are, the more data you have to determine what people want. It works for social network companies because the more people there are on the network, the more other people want to join that network. However, it doesn't really work for delivery and transportation. Costs do not really decrease if Uber or DoorDash get bigger. You still need to pay a person to drive a car, and the more miles they drive the more it costs. It could possibly scale if the cars were bigger and drove on fixed routes (like public transportation), but no one has yet figured out how to scale point-to-point services. The pitch was that the tech companies would optimize the routes, but that essentially means solving the "traveling salesman problem", which is NP-hard (i.e. no efficient solution is known, and the problem is believed to get exponentially harder as it gets bigger). Thus, while these tech companies got bigger they just burned through more cash. The primary beneficiaries were us. We got rides and food for next to nothing and now that's over. However, it was not all costless. It hurt existing industries like traditional taxis, which were heavily regulated. One of the greatest failures in oversight was letting Uber operate in New York, but that is another story.

Now these companies are either going bankrupt or increasing their prices. It is true that inflation is partially responsible for ending the subsidy because it led to higher interest rates which made borrowing more expensive but the reckoning would have to come sooner or later. Technological idealism does not obviate the laws of physics or capitalism – all businesses need to make money.

The myth of the zero sum college admission game

A major concern of the commentary in the wake of the recent US Supreme Court decision eliminating the role of race in college admission is how to maintain diversity in elite colleges. What is not being written about is that maybe we shouldn't have exclusive elite colleges to start with. Americans seem to take for granted that attending an elite college is a zero sum game. However, there is no reason that a so-called elite education must be a scarce resource. Harvard, with its $50-billion-plus endowment, could easily expand its incoming freshman class by a factor of 5 or 10. It doesn't because that obviously would make its product less prestigious and diminish its brand. It is a policy decision that allows elite universities like Harvard and Stanford to maintain their status. Being old is not an excuse. Ancient universities in Europe, like the University of Bologna in Italy or the University of Heidelberg in Germany, are state run and have acceptance rates well over 50%.

The main problem in the US is not that exclusive universities exist but that they have undue power. Kids are scrambling to get in because they believe it gives them a leg up in life. And they are mostly correct. All the Supreme Court Justices, save one, went to either Harvard or Yale law school. The faculty of elite schools tend to get their degrees from elite schools. High power consulting, Wall Street, and law firms tend to recruit from a small set of elite schools. Yet, this is only because we as a society choose it to be this way. In the distant past, elite colleges were basically finishing schools for the wealthy and powerful. Going to an Ivy League school was not what conferred you power and wealth. You were there because you already had power and wealth. It has only been in the past half century or so that the elite schools started admitting on the basis of merit. The cynical view is that the world was getting more technical and thus it was useful for the wealthy and powerful to have access to talented recruits.

While it is true that the top schools generally have more resources and more research-active faculty, what really makes them elite is the quality of their students. It is not that elite colleges produce the best graduates but rather that the best students choose elite colleges. Now there is an oversupply of gifted students. For every student who is admitted to a top ten school there are probably five or more others who would have done equally well. This is not entirely a negative thing. Having talent spread across more universities is a boon to students and society.

As seen with what happened in California and Michigan, eliminating race-conscious admission will likely decrease the number of under-represented minorities at elite schools. But this only matters if going to an elite school is the only way to access the levers of power and have a productive life. We could make an elite education available to everyone. We could increase supply by increasing funding to state run universities, and we could take away the public subsidy of elite private schools by taxing their land and endowments. The fact that affirmative action still matters over a half century later is an indication of failure. There is talent everywhere and that talent should be given a chance to flourish.

Chat GPT and the end of human culture

I had a personal situation this past year that kept me from posting much, but today I decided to sit down and write something – all by myself, without any help from anyone or anything. I could have enlisted the help of Chat GPT or some other large language model (LLM) but I didn't. These posts generally start out as a foggy idea, which then takes on a life of its own. Part of my enjoyment of writing these things is that I really don't know what they will say until I'm finished. But sometime in the near future I'm pretty sure that WordPress will have a little window where you can type an idea and an LLM will just write the post for you. At first I will resist using it, but one day I might not feel well and I'll try it and like it, and eventually all my posts will be created by a generative AI. Soon afterwards, the AI will learn what I like to blog about and how often I do so, and it will just start posting on its own without my input. Maybe most or all content will be generated by an AI.

These LLMs are created by training a neural network to predict the next word of a sentence, given the previous words, sentences, paragraphs, and essentially everything that has ever been written. The machine is fed some text and produces what it thinks should come next. It then compares its prediction with the actual answer and updates its settings (connection weights) based on some score of how well it did. When fed the entire corpus of human knowledge (or at least what is online), it can do remarkably well, as we have all seen. As I have speculated previously (see here), this isn't all too surprising given that the written word is relatively new in our evolutionary history. Thus, humans aren't really all that good at it and there isn't all that much variety in what we write. Once an AI has the ability to predict the next word, it doesn't take much more tinkering to make it generate an entire text: it simply appends its predicted word to the text and predicts again, word after word. (Image generators made their leap with a different technology, called a diffusion model, which I may describe in more technical detail in the future. In the simplest terms, a diffusion model finds successive small modifications that transform an image into pure noise and can then be run backwards, starting from random noise, to create a new image.)

When all content is generated by AI, the AI will no longer have any human data on which to further train. Human written culture will then be frozen. The written word will just consist of rehashing of previous thoughts along with random insertions generated by a machine. If the AI starts to train on AI generated text then it could leave human culture entirely. Generally, when these statistical learning machines train on their own generated data they can go unstable and become completely unpredictable. Will the AI be considered conscious by then?

New Paper

Distributing task-related neural activity across a cortical network through task-independent connections

Nature Communications, volume 14, Article number: 2851 (2023)

Abstract

Task-related neural activity is widespread across populations of neurons during goal-directed behaviors. However, little is known about the synaptic reorganization and circuit mechanisms that lead to broad activity changes. Here we trained a subset of neurons in a spiking network with strong synaptic interactions to reproduce the activity of neurons in the motor cortex during a decision-making task. Task-related activity, resembling the neural data, emerged across the network, even in the untrained neurons. Analysis of trained networks showed that strong untrained synapses, which were independent of the task and determined the dynamical state of the network, mediated the spread of task-related activity. Optogenetic perturbations suggest that the motor cortex is strongly-coupled, supporting the applicability of the mechanism to cortical networks. Our results reveal a cortical mechanism that facilitates distributed representations of task-variables by spreading the activity from a subset of plastic neurons to the entire network through task-independent strong synapses.

Milo Time

Milo Kessler died of osteosarcoma on March 11, 2022. He was just 18. He was a math major and loved tennis. I never met Milo but I think of him often. I got to know his father Daryl after Milo had passed. Daryl created this podcast about Milo. It’s very well done and gives me comfort.

How many different universes can there be?

If there are an infinite number of universes or even if a single universe is infinite in extent then each person (or thing) should have an infinite number of doppelgangers, each with a slight variation. The argument is that if the universe is infinite, and most importantly does not repeat itself, then all possible configurations of matter and energy (or bits of stuff) can and will occur. There should be an infinite number of other universes or other parts of our universe that contain another solar system, with a sun and an earth and a you, except that maybe one molecule in one of your cells is in a different position or moving at a different velocity. A simple way to think about this is to imagine an infinite black and white TV screen where each pixel can be either black or white. If the screen is nonperiodic (and sufficiently generic, e.g. random) then any finite configuration of pixels can be found somewhere on the screen. This is kind of like how any sequence of digits can be found in the digits of Pi (assuming, as is conjectured but unproven, that Pi is a normal number), or how an infinite number of monkeys typing will eventually type out Hamlet. This generalizes to changing or time dependent universes where any sequence of flickering pixels will exist somewhere on the screen.

Not all universes are possible if you include any type of universal rule in your universe. Universes that violate the rule are excluded. If the pixels obeyed Newton’s law of motion then arbitrary sequences of pixels could no longer occur because the configuration of pixels in the next moment of time depends on the previous moment. However, we can have all possible worlds if we assume that rules are not universal and can change over different parts of the universe.

Some universes are also excluded if we introduce rational belief. For example, it is possible that there is another universe, like in almost every movie these days, that is like ours but slightly different. However, it is impossible for a purely rational person in a given universe to believe arbitrary things. Rational belief is as strong a constraint on the universe as any law of motion. One cannot believe in the conservation of energy and the Incredible Hulk (who can increase in mass by a factor of a thousand within seconds) at the same time. Energy is not universally conserved in the Marvel Cinematic Universe. (Why these supposedly super-smart scientists in those universes don’t invent a perpetual motion machine is a mystery.) Rationality does not even have to be universal. Just having a single rational person excludes certain universes. Science is impossible in a totally random universe in which nothing is predictable. However, if a person merely believed they were rational but were actually not then any possible universe is again possible.

Ultimately, this boils down to the question of what exactly exists. I for one believe that concepts such as rationality, beauty, and happiness exist as much as matter and energy do. Thus for me, all possible universes cannot exist. There does not exist a universe where I am happy while there is so much suffering and pain in the world.

2023-12-28: Corrected a typo.

Falling through the earth part 2

In my previous post, I showed that an elevator falling from the surface through the center of the earth due to gravity alone would obey the dynamics of a simple harmonic oscillator. I did not know what would happen if the shaft went through some arbitrary chord through the earth. Rick Gerkin believed that it would take the same amount of time for all chords and it turns out that he is correct. The proof is very simple. Consider any chord (straight path) through the earth. Now take a plane and slice the earth through that chord and the center of the earth. This is always possible because the two endpoints of the chord together with the center give three points, and three points specify a plane. Now, looking perpendicular to the plane, you can always rotate the earth such that you see the situation in the figure below.

Let the blue dot represent the elevator on this chord, and let x be its distance along the chord from the chord’s midpoint. It will fall towards the midpoint. The total gravitational acceleration on the elevator points towards the center of the earth along the vector r and, from the previous post, has magnitude \omega^2 r. The acceleration driving the elevator is the component along the chord, which is \omega^2 r times the cosine of the angle between the chord and r. But r times that cosine is exactly x! Thus, the acceleration of the elevator along the chord has magnitude \omega^2 x directed towards the midpoint, and the equation of motion for the elevator is \ddot x = -\omega^2 x, which is true for all chords and is the same as what we derived before. Hence, it will take the same amount of time to transit the earth. This is a perfect example of how most problems are solved by conceptualizing them in the right way.
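
A quick numerical check of this (my own sketch, not part of the original post): slide an elevator without friction along a chord whose midpoint sits a perpendicular distance d from the center, using the interior gravity law derived in the previous post, and time the one-way transit. Every chord should give the same answer, about 42 minutes.

# Numerically time the transit along chords at different offsets d from the center.
import numpy as np

g, R = 9.8, 6.371e6            # surface gravity (m/s^2) and earth radius (m)
omega = np.sqrt(g / R)

def transit_time(d, dt=0.1):
    """One-way transit time along a chord whose midpoint is a distance d from the center."""
    half_chord = np.sqrt(R**2 - d**2)
    x, v, t = half_chord, 0.0, 0.0        # start at rest at one end of the chord
    while x > -half_chord:
        # Inside the earth the acceleration is -omega^2 times the position vector (x, d);
        # only its component along the chord, -omega^2 * x, moves the elevator.
        v += -omega**2 * x * dt
        x += v * dt
        t += dt
    return t

for d in [0.0, 0.3 * R, 0.6 * R, 0.9 * R]:
    print(f"d = {d/R:.1f} R : {transit_time(d)/60:.1f} min")
print(f"analytic half period pi/omega : {np.pi/omega/60:.1f} min")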

Falling through the earth

The 2012 remake of the classic film Total Recall features a giant elevator that plunges through the earth from Australia to England. This trip is called the “fall”, which I presume to mean it is propelled by gravity alone in an evacuated tube. The film states that the trip takes 17 minutes (I don’t remember if this is to get to the center of the earth or the other side). It also made some goofy point that the seats flip around in the vehicle when you cross the center because gravity reverses. This makes no sense because when you fall you are weightless, and if you are strapped in, what difference does it make which direction you are facing? In any case, I was still curious to know if 17 minutes was remotely accurate, and the privilege of a physics education is that one is given the tools to calculate the transit time through the earth due to gravity.

The first thing to do is to make an order of magnitude estimate to see if the time is even in the ballpark. For this you only need middle school physics. The gravitational acceleration for a mass at the surface of the earth is g = 9.8 m/s^2. The radius of the earth is 6.371 million metres. Using the formula that distance r = 1/2 g t^2 (which you get by integrating the acceleration twice over time), you get t = \sqrt{2 r / g}. Plugging in the numbers gives 1140 seconds or 19 minutes. So it would take 19 minutes to get to the center of the earth if you constantly accelerated at 9.8 m/s^2. It would take the same amount of time to get back to the surface. Given that the gravitational acceleration at the surface is an upper bound on the acceleration inside, the real transit time should be longer. I don’t know who they consulted but 17 minutes is not too far off.
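
For anyone who wants to check the arithmetic, the estimate is a couple of lines of Python (just the formula above with standard values for g and R):

from math import sqrt

g = 9.8        # m/s^2, surface gravity
R = 6.371e6    # m, radius of the earth

t = sqrt(2 * R / g)                     # time to fall a distance R at constant acceleration g
print(f"{t:.0f} s = {t/60:.0f} min")    # ~1140 s, about 19 min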

We can calculate a more accurate time by including the effect of the gravitational force changing as you transit through the earth but this will require calculus. It’s a beautiful calculation so I’ll show it here. Newton’s law for the gravitational force between a point mass m and a point mass M separated by a distance r is

F = - \frac{G Mm}{r^2}

where G = 6.67\times 10^{-11} m^3 kg^{-1} s^{-2} is the gravitational constant. If we assume that mass M (i.e. earth) is fixed then Newton’s 2nd law of motion for the mass m is given by m \ddot r = F. The equivalence of inertial mass and gravitational mass means you can divide m from both sides. So, if a mass m were outside of the earth, then it would accelerate towards the earth as

\ddot r = F/m = - \frac{G M}{r^2}

and this number is g when r is the radius of the earth. This is the reason that all objects fall with the same acceleration, apocryphally validated by Galileo on the Tower of Pisa. (It is also the basis of the theory of general relativity).

However, the earth is not a point but an extended ball where each point of the ball exerts a gravitational force on any mass inside or outside of the ball. (Nothing can shield gravity). Thus to compute the force acting on a particle we need to integrate over the contributions of each point inside the earth. Assume that the density of the earth \rho is constant so the mass M = \frac{4}{3} \pi  R^3\rho where R is the radius of the earth. The force between two point particles acts in a straight line between the two points. Thus for an extended object like the earth, each point in it will exert a force on a given particle in a different direction. So the calculation involves integrating over vectors. This is greatly simplified because a ball is highly symmetric. Consider the figure below, which is a cross section of the earth sliced down the middle.

The particle/elevator is the red dot, which is located a distance r from the center. (It can be inside or outside of the earth). We will assume that it travels on an axis through the center of the earth. We want to compute the net gravitational force on it from each point in the earth along this central axis. All distances (positive and negative) are measured with respect to the center of the earth. The blue dot is a point inside the earth with coordinates (x,y). There is also the third coordinate coming out of the page but we will not need it. For each blue point on one side of the earth there is another point diametrically opposed to it. The forces exerted by the two blue points on the red point are symmetrical. Their contributions in the y direction are exactly opposite and cancel, leaving the net force only along the x axis. In fact there is an entire circle of points with radius y (orthogonal to the page) around the central axis where each point on the circle combines with a partner point on the opposite side to yield a force only along the x axis. Thus to compute the net force on the elevator we just need to integrate the contribution from concentric rings over the volume of the earth. This reduces an integral over three dimensions to just two.

The magnitude of the force (density) between the blue and red dot is given by

\frac{G m \rho}{(r-x)^2+y^2}

To get the component of the force along the x direction we need to multiply by the cosine of the angle between the central axis and the line joining the red and blue dots, which is

\frac{r-x}{((r-x)^2+y^2)^{1/2}}

(i.e. the ratio of the adjacent side to the hypotenuse of the relevant triangle). Now, to capture the contributions from all the pairs on the circle we multiply by the circumference, which is 2\pi y. Putting this together gives

F/m = -G\rho \int_{-R}^{R}\int_{0}^{\sqrt{R^2-x^2}}\frac{r-x}{((r-x)^2+y^2)^{3/2}} 2\pi y dy dx

The y integral extends from zero to the edge of the earth, which is \sqrt{R^2-x^2}. (This is R at x=0 (the center) and zero at x=\pm R (the poles) as expected). The x integral extends from one pole to the other, hence -R to R. Completing the y integral gives

2\pi G\rho \int_{-R}^{R}\left. \frac{r-x}{((r-x)^2+y^2)^{1/2}} \right|_{0}^{\sqrt{R^2-x^2}}dx

= 2\pi G\rho \int_{-R}^{R}\left[ \frac{r-x}{((r-x)^2+R^2-x^2)^{1/2}} - \frac{r-x}{|r-x|} \right]dx (*)

The second term comes from the 0 limit of the integral, which is \frac{r-x}{((r-x)^2)^{1/2}}. The square root of a number has positive and negative roots, but the denominator here is a distance and is therefore always positive, so we must take the absolute value. The first term of the above integral can be completed straightforwardly (I’ll leave it as an exercise) but the second term must be handled with care because r-x can change sign depending on whether r is greater or less than x. For a particle outside of the earth r-x is always positive and we get

\int_{-R}^{R} \frac{r-x}{|r-x|} dx =  \int_{-R}^{R} dx = 2R, r > R

Inside the earth, we must break the integral up into two parts

\int_{-R}^{R} \frac{r-x}{|r-x|} dx = \int_{-R}^{r}  dx - \int_{r}^{R} dx = r+R - R + r = 2r, -R \le r\le R

The first term of (*) integrates to

\left[ \frac{(r^2-2rx+R^2)^{1/2}(-2r^2+rx+R^2)}{3r^2}  \right]_{-R}^{R}

= \frac{(r^2-2rR+R^2)^{1/2}(-2r^2+rR+R^2)}{3r^2} -   \frac{(r^2+2rR+R^2)^{1/2}(-2r^2-rR+R^2)}{3r^2}

Using the fact that (r \pm R)^2 = r^2 \pm 2rR + R^2, we get

= \frac{|R-r|(-2r^2+rR+R^2)}{3r^2} -   \frac{|R+r|(-2r^2-rR+R^2)}{3r^2}

(We again need the absolute value signs.) For r > R, the particle is outside of the earth and |R-r| = r-R, |R+r| = r + R. Putting everything together gives

F/m = 2\pi G\rho \left[ \frac{6r^2R-2R^3}{3r^2} - 2 R\right] = -\frac{4}{3}\pi R^3\rho G\frac{1}{r^2} = - \frac{MG}{r^2}

Thus, we have explicitly shown that the gravitational force exerted by a uniform ball is equivalent to concentrating all the mass in the center. By symmetry, the same result holds for r < -R, with the sign flipped so that the force still points towards the center.
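
As a sanity check (my own addition, not part of the original derivation), the ring-decomposition integral can be evaluated numerically and compared with -GM/r^2 for points outside the earth. The mean density below is approximate and just for illustration; the interior case can be checked the same way, though the integrand has an integrable singularity at x = r, y = 0 that adaptive quadrature handles less gracefully.

# Numerically evaluate F/m from the double integral and compare with -GM/r^2.
import numpy as np
from scipy.integrate import dblquad

G = 6.67e-11              # m^3 kg^-1 s^-2
R = 6.371e6               # m
rho = 5515.0              # kg/m^3, approximate mean density of the earth
M = 4 / 3 * np.pi * R**3 * rho

def force_per_mass(r):
    """F/m from integrating over rings: x runs from -R to R, y from 0 to sqrt(R^2 - x^2)."""
    integrand = lambda y, x: (r - x) / ((r - x)**2 + y**2)**1.5 * 2 * np.pi * y
    val, _ = dblquad(integrand, -R, R, lambda x: 0.0, lambda x: np.sqrt(R**2 - x**2))
    return -G * rho * val

for r in [1.1 * R, 1.5 * R, 2.0 * R]:
    print(f"r = {r/R:.1f} R : integral {force_per_mass(r):.4f} m/s^2, point mass {-G*M/r**2:.4f} m/s^2")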

For -R \le r  \le R we have

F/m = 2\pi G\rho \left[ \frac{4}{3} r - 2r\right] =-\frac{4}{3}\pi\rho G r = -\frac{G M}{R^3}r

Remarkably, the gravitational force on a particle inside the earth is just the force on the surface scaled by the ratio r/R. The equation of motion of the elevator is thus

\ddot r = - \omega^2 r with \omega^2 = GM/R^3 = g/R

(Recall that the gravitational acceleration at the surface is g = GM/R^2 = 9.8 m/s^2). This is the classic equation for a harmonic oscillator with solutions of the form \sin \omega t. Thus, a period (i.e. round trip) is given by 2\pi/\omega. Plugging in the numbers gives 5062 seconds or 84 minutes. A transit through the earth once would be half that at 42 minutes and the time to fall to the center of the earth would be 21 minutes, which I find surprisingly close to the back of the envelope estimate.
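
The numbers quoted above are easy to verify (a quick sketch using the same constants as before):

from math import pi, sqrt

g = 9.8          # m/s^2
R = 6.371e6      # m

omega = sqrt(g / R)
period = 2 * pi / omega
print(f"period (round trip) : {period:.0f} s = {period/60:.0f} min")   # about 84 min
print(f"one-way transit     : {period/2/60:.0f} min")                  # about 42 min
print(f"fall to the center  : {period/4/60:.0f} min")                  # about 21 min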

Now Australia is not exactly antipodal to England so the tube in the movie did not go directly through the center, which would make the calculation much harder. This would be a shorter distance but the gravitational force would be at an angle to the tube so there would be less acceleration and something would need to keep the elevator from rubbing against the walls (wheels or magnetic levitation). I actually don’t know if it would take a shorter or longer time than going through the center. If you calculate it, please let me know.

The dynamics of inflation

Inflation, the steady increase of prices and wages, is a nice example of what is called a marginal mode, line attractor, or invariant manifold in dynamical systems. What this means is that the dynamical system governing wages and prices has an equilibrium that is not a single point but rather a line or curve in price and wage space. This is easy to see because if we suddenly one day decided that all prices and wages were to be denominated in some other currency, say Scooby Snacks, nothing should change in the economy. Instead of making 15 dollars an hour, you now make 100 Scooby Snacks an hour and a Starbucks Frappuccino will now cost 25 Scooby Snacks, etc. As long as wages and prices are in balance, it does not matter what they are denominated in. That is why the negative effects of inflation are more subtle than simply having everything cost more. In a true inflationary state your inputs should always balance your outputs but at an ever increasing price. Inflation is bad because it changes how you think about the future and because adjustments to the economy always take time and have costs.

This is why our current situation of price increases does not yet constitute inflation. We are currently experiencing a supply shock that has made goods scarce and thus prices have increased to compensate. Inflation will only take place when businesses start to increase prices and wages in anticipation of future increases. We can show this in a very simple mathematical model. Let P represent some average of all prices and W represent average wages (actually they will represent the logarithms of both quantities but that will not matter for the argument). So in equilibrium P = W. Now suppose there is a supply shock and prices increase. In order to get back into equilibrium, wages should increase, which we can write as

\dot{W} = P - W

where the dot indicates the first derivative (i.e. rate of change of W is positive if P is greater than W). Similarly, if wages are higher than prices, prices should increase and we have

\dot{P} = W- P

Now notice that the equilibrium (where there is no change in W or P) is given by W=P but given that there is only one equation and two unknowns, there is no unique solution. W and P can have any value as long as they are the same. W - P = 0 describes a line in W-P space and thus it is called a line attractor. (Mathematicians would call this an invariant manifold because it is a smooth surface that the dynamics leave invariant: trajectories that start on it stay on it. Physicists would call this a marginal mode because if you were to solve the eigenvalue equation governing this system, it would have a zero eigenvalue, which means that its eigenvector (called a mode) is on the margin between stable and unstable.) Now if you add the two equations together you get

\dot{P} + \dot{W} = \dot{S} = 0

which implies that the rate of change of the sum of P and W, which I call S, is zero, i.e. there is no inflation. Thus if prices and wages respond immediately to changes then there can be no inflation (in this simple model). Now suppose we have instead

\ddot{W} = P - W

\ddot{P} = W-P

The second derivatives of W and P now respond to the differences. This is like having a delay or some momentum. Instead of the rate of change of S responding to price-wage differences, the rate of change of the momentum of S reacts. Now when we add the two equations together we get

\ddot{S} = 0

If we integrate this we now get

\dot{S} = C

where C is a constant set by the initial conditions, namely the initial rate of change of S. So if C is positive, the rate of change of S stays positive and S will just keep on increasing forever. Now what is C? Well, it is the anticipatory increase in S. If you were lucky enough that C was zero (i.e. no anticipation) then there would be no inflation. Remember that W and P are logarithms so C is the rate of inflation. Interestingly, the way to combat inflation in this simple toy model is to add a first derivative term. This changes the equation to

\ddot{S} + \dot{S} = 0

which is analogous to adding friction to a mechanical system (though this is different from what an economist would call friction). The first derivative counters the anticipatory effect of the second derivative. The solution to this equation will return to a state of zero inflation (exercise to the reader).
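
To make the three regimes concrete, here is a minimal numerical sketch (my own illustration, not part of the original argument; the anticipatory rate of 0.5 and the unit friction coefficient are arbitrary choices). The first-order model keeps the inflation rate \dot S at zero, the second-order model locks it at the anticipated rate C, and adding friction makes it decay back to zero.

# Simulate the three toy models of (log) wages W and prices P discussed above.
dt, steps = 0.01, 2000

def simulate(order, friction=0.0):
    """Return the final inflation rate dS/dt = dW/dt + dP/dt."""
    W, P = 0.0, 1.0                                  # a supply shock pushes prices above wages
    dW, dP = 0.0, (0.5 if order == 2 else 0.0)       # anticipatory price increases set C = 0.5
    for _ in range(steps):
        if order == 1:
            dW, dP = P - W, W - P                    # first-order model: rates track the gap
        else:
            dW += (P - W - friction * dW) * dt       # second-order model, optional friction
            dP += (W - P - friction * dP) * dt
        W += dW * dt
        P += dP * dt
    return dW + dP

print(f"first order            : dS/dt = {simulate(order=1):+.3f}")                 # no inflation
print(f"second order           : dS/dt = {simulate(order=2):+.3f}")                 # constant inflation C
print(f"second order + friction: dS/dt = {simulate(order=2, friction=1.0):+.3f}")   # decays to zero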

Now of course this model is too simple to actually describe the real economy but I think it gives an intuition to what inflation is and is not.

2022-05-18: Typos corrected.