Artificial Intelligence, but at what cost?
Richard Mortier · May 15, 2025 · #academic

Our inestimable and most excellent Chaplain, Revd Dr Helen Orchard, likes to have a theme for the Sunday evensong services for the term. Back in Michaelmas 2023 it was … AI. I said I’d help find someone to give a sermon from a technical perspective but then signally failed to do so (sorry!). So in the end I said I’d do it, even though AI is not my thing and I’d never given a sermon before. Or, for that matter, attended evensong. Take the opportunities offered and all that.
I realised this week that, although a few people at the time had asked for copies, I’d also done nothing about that (I am nothing if not consistently rubbish). So here’s the text, more or less as given, on 15 October 2023. Note that the golden eagle I mount is a rather fine lectern in our Chapel (pictured). Nothing more salacious than that. Filthy minds.
Three editorial notes given that it’s been over a year and a half since I gave this (my! how time flies…):
- I allude to this but should be clear: the neural network is not the only technological approach to producing AI – several others exist and are both useful and used, machine learning being one that’s particularly productive in recent years. However the most hyped was and still seems to be various forms of neural network so that’s what I focused on.
- I refer to “static datasets” because the versions of ChatGPT at the time were trained infrequently on a given dataset of the moment. Training updates now seem much more frequent (perhaps weekly), user context is maintained throughout a chat session, and user feedback sought at the end. So while it’s still technically true that the datasets involved are static, it’s much less noticeable.
- The example of “God save the” worked particularly because this was only about a year after Queen Elizabeth II died, so “queen” was likely still the instinctive response of many.
Finally, just in case it’s not clear – I tend toward the sceptical end regarding AI. Potentially a useful tool in some circumstances but all claims about AGI are nonsense and the singularity won’t happen because of the machines. Human stupidity on the other hand seems without bound. And always follow the money.
As I mount the golden eagle for the first time, I should say that I am not normally given to preaching – though my children might disagree with that statement – but as the theme this term is Artificial Intelligence, Helen asked me to speak to you about that from the perspective of a computer scientist. Unless you catch me in a pub after a couple of pints, I am also not given to philosophising, so I will limit myself to the physical reality of Artificial Intelligence, or AI. Specifically, what it is and what it costs. I will use AIs that generate text as examples, as these so-called Large Language Models have been the focus of considerable interest in recent months, but the same basic mechanisms and problems apply to AIs used to generate images, music, videos and so on.
First, what is it? AI is a catch-all term for a set of technologies that attempt to replicate whatever we call “intelligence”. Computer scientists, cognitive psychologists and mathematicians have worked on these various technologies for decades, but the current vogue is very much for a particular set of mathematical techniques that try to produce brain-like behaviour by modelling inter-connected neurons.
Each neuron is stimulated by one or more input signals which it combines to produce an output signal with some probability. The outputs of some neurons are connected to the inputs of some other neurons, creating an enormous network. The effect in our brains might be that an input signal “I want a biscuit” results in an output signal that causes us to move an arm to pick up a biscuit. In a modern “generative AI”, the input might be a sentence or paragraph or two of text, and the resulting output might be an image or a sequence of words.
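For the programmers in the congregation, here is a toy sketch in Python of what a single artificial neuron does; the weights and inputs are entirely made up for illustration, whereas a real network learns its weights from data.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: combine the input signals using weights,
    then squash the result to a value between 0 and 1 that can be read
    as the probability of the neuron 'firing'."""
    combined = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-combined))   # the sigmoid squashing function

# Made-up weights for a 'biscuit neuron': it fires when the hunger and
# biscuit-visibility signals are both high.
print(neuron([0.9, 0.8], [2.0, 1.5], -2.0))   # ~0.73: probably reach for it
print(neuron([0.1, 0.2], [2.0, 1.5], -2.0))   # ~0.18: probably not
```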
As a simple example of what I mean, if I asked you to give the next few words in the phrase starting “God save the” you might say “king send him victorious”. You have just performed inference using your own language model, generating some likely output text given three words of input. I’ll come back to that example later.
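To make that concrete, here is a deliberately tiny “language model” sketched in Python. The contexts and probabilities are invented purely for illustration; a real model learns them from billions of words of text.

```python
import random

# A toy language model: for a given three-word context, the plausible next
# words and how likely each one is. These numbers are invented for illustration.
next_word = {
    ("God", "save", "the"): [("king", 0.60), ("queen", 0.35), ("data", 0.05)],
}

def infer(context):
    """Generate a likely next word given the context -- this is inference."""
    words, probabilities = zip(*next_word[context])
    return random.choices(words, weights=probabilities, k=1)[0]

print(infer(("God", "save", "the")))   # usually "king", sometimes "queen"
```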
I said the inputs were combined to produce the output with some probability, but how exactly? The process for combining inputs involves a set of parameters that are determined by finding the values that give the best fit to some a priori data. This is known as training if you’re an AI specialist, or parameter fitting if you’re a statistician.
A simple analogy: you may recall that a straight line is defined by two parameters, its slope and its intercept (where it crosses the vertical axis). If you had a set of two-dimensional data points that you thought were straightforwardly related, you might try to discover that relationship by drawing the best straight line you could through them; but which particular line would you think was the best? A reasonable choice might be the one that minimised the total distance from the line to each point. For an AI the maths is a little more complex, but that’s basically what happens: training finds the parameter values that give the best fit to a large set of training data.
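A minimal sketch of that fitting process in Python, using made-up data points and the standard least-squares choice of “best” (minimising the total squared vertical distance from the line to each point):

```python
# Five made-up data points that roughly follow y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

# "Training": find the slope and intercept that best fit the data.
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

print(f"best fit: y = {slope:.2f}x + {intercept:.2f}")
# A modern AI does the same in spirit, but with billions of parameters
# instead of two, and billions of data points instead of five.
```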
So that’s a modern AI: a statistical model that, when stimulated by one or more inputs, produces outputs with some probability. The inputs might be words or images or some other thing, and the outputs might be words or images or some other thing. The underlying model might be wrapped up by other models that, for example, try to filter out undesirable outputs or provide for different ways of consuming inputs.
It is the sheer scale that makes this work: your brain has perhaps 100 billion neurons, each of which might connect to 10,000 other neurons for a total of perhaps one million billion connections, whereas an AI such as a recent version of ChatGPT might have 175 billion parameters, with each artificial neuron connected to just hundreds of others. The underlying mathematics has been known for decades; it is the combination of massive training datasets and the enormous computational resources of the cloud that has enabled us to build these AIs.
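The back-of-the-envelope comparison, using the round figures quoted above (rough orders of magnitude, not precise measurements):

```python
# Rough orders of magnitude only, using the figures quoted above.
brain_neurons = 100e9                 # ~100 billion neurons
connections_per_neuron = 10_000
brain_connections = brain_neurons * connections_per_neuron
model_parameters = 175e9              # parameters quoted for a ChatGPT-era model

print(f"brain: ~{brain_connections:.0e} connections")      # ~1e+15, a million billion
print(f"model: ~{model_parameters:.2e} parameters")        # ~1.75e+11
print(f"the brain has roughly {brain_connections / model_parameters:,.0f}x more")
```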
Second, ignoring the hysteria around so-called Artificial General Intelligence and The Singularity, what costs do these AIs incur?
To return to the example I used, I said that you might have completed the phrase “God save the” with the words “king send him victorious”. In some sense that is the “correct” completion. But perhaps some of you would have initially thought “queen send her victorious”. And I have at least one friend who would naturally respond “queen and her fascist regime”.
Human experience is varied and personal – the training process I described typically uses large static datasets collected by scraping the Internet. While the resulting AI can be configured not always to produce identical outputs given identical inputs, the training process does naturally lead to a kind of homogenisation. Simplistically, if your group is not represented in that training dataset, its experience will not be represented in the AI and thus will not be reproduced in the output. Worse, if the training data contains misrepresentations or attacks on your group, the AI will by default capture and perpetuate them; this has already been observed to be a particular problem for women, Jews, and many minorities.
Further, I mentioned that training data is scraped from the Internet – but as the musical Avenue Q famously put it, “the Internet is for porn”. A lot of that text is rather fantastical and describes actions generally unacceptable in polite society, so the companies producing and operating AIs try to create guardrails by building other models that filter offensive outputs generated by their AIs – but how do you train such a model? You need to start with examples of offensive output that are labelled as such so that you can train a model to differentiate between what is offensive and what is inoffensive. But creating that labelled data involves human labour. For example, OpenAI were reported as outsourcing this activity to workers in Kenya paid less than $2 per hour to label perhaps 200 paragraphs per day of offensive input text with the type of offensiveness: rape, torture, incest, and so on. Unpleasant and psychologically damaging work.
There are also more practical problems posed by the resources used to create and operate AIs. In particular, energy and water.
It takes a lot of computation to train and operate a large popular AI – OpenAI reported about three and a half thousand petaflop/s-days in 2020 to train their GPT-3 model, where a petaflop represents a million billion computations. That is, about 10 years of a computer running at one petaflop per second. For comparison, your phone might achieve 0.1% of that performance. But as the bumper sticker has it, the cloud is just someone else’s computer – in the case of a training run for a large AI model, several hundred thousand computers in a datacenter. For example, Microsoft’s Iowa datacenter was built out for training models for OpenAI and has 285,000 standard processor cores and 10,000 GPUs (more powerful and power-hungry processors that you might be familiar with if you’re a gamer).
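The arithmetic behind “about 10 years”, assuming the commonly reported figure of roughly 3,640 petaflop/s-days for that training run:

```python
# Rough arithmetic only; 3,640 petaflop/s-days is the commonly reported
# figure for the GPT-3 training run ("three and a half thousand" above).
petaflop_s_days = 3640
years_at_one_petaflop = petaflop_s_days / 365.25
print(f"~{years_at_one_petaflop:.0f} years at a sustained 1 petaflop/s")

# A phone managing ~0.1% of a petaflop/s would need a thousand times longer.
print(f"~{years_at_one_petaflop / 0.001:,.0f} years on such a phone")
```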
This means CO2 from the energy to power the computers plus water to cool them. How much? Well, estimates computed for earlier, smaller models put the CO2 footprint of a single training run at roughly the same as a round-trip flight from New York to San Francisco. Once trained, individual queries are comparatively cheap – but ChatGPT experienced the fastest ever growth of an Internet service. Earlier this year it was estimated as serving hundreds of millions of queries per day, resulting in energy consumption of perhaps 1 gigawatt-hour each day – the equivalent of the daily electricity use of 33,000 American households.
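The household comparison works out as follows, assuming an average American household uses about 30 kWh of electricity per day (a round figure close to official averages):

```python
# 1 gigawatt-hour per day, expressed in household-equivalents.
gwh_per_day = 1
kwh_per_day = gwh_per_day * 1_000_000       # 1 GWh = 1,000,000 kWh
household_kwh_per_day = 30                  # assumed average US household
print(f"~{kwh_per_day / household_kwh_per_day:,.0f} households")   # ~33,333
```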
As for water, Microsoft has reported that its global water usage increased 34% from 2021 to 2022; Google’s increased 20% in the same period, but from a higher baseline. The increase is believed to be substantially due to training and operating AI. A group from the University of California at Riverside estimate that each “conversation” with ChatGPT uses, directly and indirectly, about a pint of water – and this generally needs to be clean drinking water that will not leave residues that clog systems. The month before GPT-4 training was completed, Microsoft’s Iowa datacenters consumed 11.5 million gallons, about 6% of the district’s drinking water. The amounts vary based on season and location of the datacenter but it seems clear that water consumption is very substantial and could impact local communities and ecosystems. And of course, there is a tension here: cheap and green solar energy improves the carbon footprint, but solar is most plentiful in hot, sunny places and seasons, and those higher temperatures usually also worsen the water footprint as more cooling is required.
So there’s a view of AI – an impressive set of mathematical and computational techniques that can recreate some human behaviours to some extent in some circumstances, at significant practical and moral cost. My own view is threefold.
First, using the phrase “Artificial Intelligence” to describe these technologies, rather than something less emotive such as Computationally Intensive Statistics, inevitably generates a very strong hype cycle. We are currently at a point in that cycle where a welcome degree of scepticism is starting to come in, and people are more actively questioning what exactly these technologies can and can’t do.
Second, we have largely proceeded to date without concern for any of the costs I discussed earlier, and – also welcome – that is changing: the costs are significant and we cannot ignore them.
Third, there are interesting legal and economic tussles taking place as to who owns the training data, who owns the weights – that is, the AIs – produced, and by whom and how AIs should be regulated. In particular, it is notable that many companies are claiming that there is a need for regulatory barriers to be introduced – but those are the companies that have already reached a scale where they can overcome those barriers, so such barriers will serve only to keep newcomers out of the marketplace, entrenching the existing power of “big tech” (OpenAI, Google, Microsoft, Amazon, Meta, etc).
Finally, since I used the word hysteria earlier to describe hyped fears of Artificial General Intelligence and the Singularity: please be sceptical of anyone claiming these as a serious existential risk, particularly if they are associated with the aforementioned “big tech”! I view most of that discourse as a “dead cat” strategy, an attempt to distract from the current harms they are causing today by pointing to vague, nebulous, yet potentially infinite future harms. For more about the quite startling beliefs of many of those sounding those alarms, I recommend reading about the TESCREAL set of ideologies – Transhumanism, Extropianism, Singularitarianism, Cosmism, Rationalism, Effective Altruism, Longtermism.
Thank you.
References
Background
- “Language Models are Few-Shot Learners”, Brown et al (OpenAI), 2020. https://arxiv.org/abs/2005.14165
- “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?”, Bender et al, FAccT ’21. https://doi.org/10.1145/3442188.3445922
- “The Internet is for porn”, Stephanie D’Abruzzo & Rick Lyon, Avenue Q. https://genius.com/Stephanie-dabruzzo-and-rick-lyon-the-internet-is-for-porn-lyrics
Hidden Work
- “OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic”, Time.com, 2023. https://time.com/6247678/openai-chatgpt-kenya-workers/
- “Behind the secretive work of the many, many humans helping to train AI”, NPR, 2023. https://www.npr.org/2023/06/26/1184392406/behind-the-secretive-work-of-the-many-many-humans-helping-to-train-ai
Energy
- “Energy and Policy Considerations for Deep Learning in NLP”, Strubell et al, 2019. https://arxiv.org/abs/1906.02243
- “Training a single AI model can emit as much carbon as five cars in their lifetimes”, MIT Technology Review, 2019. https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/
Water
- “Artificial intelligence technology behind ChatGPT was built in Iowa — with a lot of water”, AP News, 2023. https://apnews.com/article/chatgpt-gpt4-iowa-ai-water-consumption-microsoft-f551fde98083d17a7e8d904f8be822c4
- “A.I. tools fueled a 34% spike in Microsoft’s water consumption, and one city with its data centers is concerned about the effect on residential supply”, Fortune, 2023. https://fortune.com/2023/09/09/ai-chatgpt-usage-fuels-spike-in-microsoft-water-consumption/
- “Making AI Less ‘Thirsty’: Uncovering and Addressing the Secret Water Footprint of AI Models”, Li et al, 2023. https://arxiv.org/abs/2304.03271