
Can AI Predict the Future?

Mar 17 · 8 min read

If you could reliably predict the future, even just a little bit, you would possess something akin to a superpower. We are not talking about the sci-fi sense of time travel or crystal balls. Imagine instead being able to predict whether your health condition will improve or degrade over the next three months, which startup will dominate its market in five years, or, much more commonly, exactly what the weather will be like tomorrow!


In our latest episode, we dive into the complex world of time series data, exploring the subtle but crucial distinction between prophecy and forecasting. AI is not about seeing the future; it is about prediction under uncertainty. It is about estimating which futures are more likely than others. As it turns out, AI is incredibly good at this—under the right conditions.


The Magic of “Time Series” Data

To understand how a machine predicts the future, we first need to understand the concept of a time series. In data science, a time series is simply a data structure consisting of measurements taken over time. It is a sequence where the strict order of events matters.


Every day, we generate countless time series. Your phone battery draining from 97% to 93% to 88% is a time series. Your beat-by-beat heart rate is a time series. Even a city’s fluctuating energy demand throughout the day is a time series. In all these examples, the core idea is the same: you are measuring a variable that changes in relation to its past.


However, not all sequences are time series. If you roll a six-sided die one hundred times, you could represent that data sequentially, but it would not be meaningful. The result of your next roll is completely unconstrained by your previous rolls. For AI to predict the future, it requires an environment where the past actively constrains the future. The stronger the constraint, the more predictable the system.
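One way to see this constraint in numbers is autocorrelation: how strongly each value relates to the one before it. Here is a minimal sketch in Python comparing die rolls with a toy temperature-like series (the 0.9 “persistence” factor and the variable names are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def lag1_autocorr(x):
    """Correlation between each value and the one before it."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]

# A hundred die rolls: each outcome ignores the past entirely.
dice = rng.integers(1, 7, size=100)

# A toy "temperature" series: today is mostly yesterday plus noise,
# so the past actively constrains the future.
temps = np.zeros(100)
for t in range(1, 100):
    temps[t] = 0.9 * temps[t - 1] + rng.normal(0, 1)

print(f"Die rolls lag-1 autocorrelation:   {lag1_autocorr(dice):+.2f}")  # near 0
print(f"Temperature lag-1 autocorrelation: {lag1_autocorr(temps):+.2f}") # near +0.9
```

The die rolls hover near zero autocorrelation; the persistent series sits near 0.9, which is exactly the kind of structure a forecasting model can exploit.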


The Enemies of Forecasting: Randomness and Feedback Loops

Even when a system is theoretically predictable, there are practical challenges that prevent AI from being a flawless oracle.


  1. The Problem of Randomness: Randomness is a fundamental mathematical difficulty in forecasting because small errors compound over time. If an AI is slightly unsure about where an object will be in one hour, predicting its location in ten hours becomes dramatically harder, because those small errors accumulate (see the sketch after this list). Furthermore, unprecedented, chaotic events—like the outbreak of COVID-19 or a sudden factory fire—introduce randomness that historical patterns simply cannot capture.

  2. Unknown Unknowns: Sometimes the most important driver of a future event is not in your dataset at all. Imagine trying to predict supermarket sales in early 2020. If you only had data from 2000 to 2019, your models would be completely baffled by the massive, sudden spike in toilet paper demand. The AI model itself did not break; the world changed, and the cause (panic psychology amplified by the news) was not a variable the AI was measuring.

  3. Feedback Loops: Because we live in an interactive world, predictions often change behaviour, and behaviour changes what happens next. Consider Google Maps predicting a traffic jam. If the app tells thousands of drivers that a specific route will be heavily congested, those drivers will take alternative routes. The original route may then become unexpectedly clear.

    In financial markets, this phenomenon is even more pronounced. If a highly influential investor (or AI system) predicts a stock will surge, people will follow that advice, creating a surge in demand that artificially fulfils the prophecy. In these environments, the forecast itself actively alters the future.
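To make the compounding-randomness point in item 1 concrete, here is a small Monte Carlo sketch (all numbers invented for illustration): an object whose hourly position error is just ±1 unit, simulated over many possible futures.

```python
import numpy as np

rng = np.random.default_rng(0)

# Suppose each hour our prediction of an object's position picks up a
# small error with standard deviation 1 unit. Simulate many futures.
n_futures, n_hours = 10_000, 10
steps = rng.normal(loc=0.0, scale=1.0, size=(n_futures, n_hours))
positions = steps.cumsum(axis=1)  # errors accumulate hour by hour

for h in [1, 5, 10]:
    spread = positions[:, h - 1].std()
    print(f"After {h:>2} hours, uncertainty has grown to +/- {spread:.2f} units")
```

For a simple random walk like this, the spread grows with the square root of the horizon (roughly 1.0, 2.2, and 3.2 units here); in chaotic systems it can grow exponentially, which is why long-range forecasts degrade so quickly.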


A Brief History of Looking Forward

Humans have been fascinated by understanding time and predicting the future for thousands of years. Early attempts relied on sweeping narratives, like the ancient prophecies found in the Book of Revelation. These were symbolic and vague enough to be reinterpreted later, meaning there was no way to mathematically falsify them.

True forecasting—modelling the set of possible outcomes as a distribution—requires precision and an acceptance of uncertainty.


  • 1600s to 1705: The birth of celestial mechanics. Building on Johannes Kepler's newly formulated laws of planetary motion and Isaac Newton's theory of gravity, Edmond Halley studied historical comet observations. He made a staggering 50-plus-year forecast, accurately predicting the return of Halley’s Comet in 1758.

  • 1960 to 1969: Rudolf E. Kalman published the Kalman filter, a mathematical algorithm that proved vital for the Apollo 11 moon landing. Because spacecraft sensors drifted and were imperfect, scientists used physics to predict the ship's position, tracked how wrong that prediction might be, and corrected it with sensor data (see the sketch after this timeline). By explicitly incorporating uncertainty, they successfully navigated to the moon.

  • 1970: Box and Jenkins developed ARIMA (Autoregressive Integrated Moving Average), a general class of statistical models. Instead of humans defining the strict physics of a system, these models were flexible frameworks that learned patterns directly from historical time series data.

  • 2023: We entered the era of deep learning with Google DeepMind's release of GraphCast. For over 50 years, global weather forecasting relied on massive supercomputers running numerical physics simulations. GraphCast, trained on decades of historical weather data, outperformed these traditional simulators on the vast majority of evaluation targets, producing a global forecast in under a minute.
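The predict-then-correct loop behind the Kalman filter can be sketched in one dimension in a few lines of Python. This is a simplified illustration, not the Apollo implementation; the motion model and noise values are made up:

```python
# A minimal 1-D Kalman-style filter: predict with a motion model,
# widen our uncertainty, then correct with a noisy sensor reading.
def kalman_step(x, P, z, velocity=1.0, Q=0.1, R=0.5):
    # Predict: physics says we moved forward; uncertainty P grows by Q.
    x_pred = x + velocity
    P_pred = P + Q
    # Correct: blend prediction and measurement z, weighted by trust.
    K = P_pred / (P_pred + R)   # gain: 0 = trust the model, 1 = trust the sensor
    x_new = x_pred + K * (z - x_pred)
    P_new = (1 - K) * P_pred
    return x_new, P_new

x, P = 0.0, 1.0  # initial position estimate and its variance
for z in [1.2, 1.9, 3.1, 3.9, 5.2]:  # noisy position readings
    x, P = kalman_step(x, P, z)
    print(f"estimate: {x:5.2f}  (variance {P:.3f})")
```

The key idea is the gain K: when the model's uncertainty is large relative to the sensor's, the filter leans on the measurement, and vice versa.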



How Does AI Perceive Time?

It is easy to imagine an AI experiencing the continuous flow of time just as we do, but computers do not inherently understand concepts like “Tuesday” or “3:00 PM.” To a machine learning model, everything is simply a number. Time is perceived in discrete, fragmented moments. To give AI clues about time, data scientists might include variables representing the time of day or the day of the week, acknowledging that a Monday behaves differently from a Saturday. However, AI's true superpower lies in its ability to handle multivariate data.
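To make the calendar-clue idea concrete before we look at multivariate data, here is a minimal pandas sketch of turning raw timestamps into numeric features (the column names are hypothetical):

```python
import pandas as pd

# Hourly readings over two weeks; the timestamp itself is not a
# number a model can use, so we derive calendar features from it.
df = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=24 * 14, freq="h"),
})
df["hour"] = df["timestamp"].dt.hour              # 0-23: time of day
df["day_of_week"] = df["timestamp"].dt.dayofweek  # 0 = Monday, 6 = Sunday
df["is_weekend"] = df["day_of_week"] >= 5         # Mondays behave unlike Saturdays

print(df.head())
```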


Visualisation of the raw weather data. There is a clear seasonal trend, but there are some extreme outliers among the pressure measurements, which make it difficult to see any patterns (Sebastian Callh, 2020).

A univariate time series is just one line wobbling over time, like tracking daily temperature. A multivariate time series is like a massive spreadsheet moving through time—simultaneously tracking temperature, humidity, pressure, rainfall, and wind. Most real-world systems are not just one line; they are many lines interacting.

Rather than constantly observing a continuous stream, the AI looks at data in isolated “windows.” It might process 90 days of weather data all at once to predict the next seven days.
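A windowing step like that might look as follows in NumPy; the 90-day history and seven-day horizon come from the example above, while the function name is hypothetical:

```python
import numpy as np

def make_windows(series, history=90, horizon=7):
    """Slice one long series into (input window, target window) pairs."""
    X, y = [], []
    for start in range(len(series) - history - horizon + 1):
        X.append(series[start : start + history])
        y.append(series[start + history : start + history + horizon])
    return np.array(X), np.array(y)

daily_temps = np.random.default_rng(1).normal(15, 5, size=365)
X, y = make_windows(daily_temps)
print(X.shape, y.shape)  # (269, 90) input windows, (269, 7) target windows
```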


From Memory to Attention: Under the Hood

Historically, engineers built Recurrent Neural Networks (RNNs) to process time series data sequentially, in the order it was generated. An RNN reads a sequence one step at a time, carrying a little “memory” forward. However, RNNs suffered from vanishing gradients—older information quickly faded, and the model forgot important past events.


This led to the development of the Long Short-Term Memory (LSTM) network. LSTMs introduced a brilliant “forget gate,” allowing the model to selectively remember and forget certain timesteps. If an LSTM is analysing weather every ten minutes, it learns to forget hours of light drizzle while heavily retaining the memory of a massive pressure drop, ensuring it does not get overwhelmed by useless noise.
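As a rough sketch of what such a model looks like in code, here is a minimal LSTM forecaster in PyTorch (layer sizes and names are illustrative, not a production architecture):

```python
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    """Reads a window of past values, predicts the next `horizon` steps."""
    def __init__(self, n_features=1, hidden=32, horizon=7):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, horizon)

    def forward(self, x):           # x: (batch, time, features)
        _, (h_n, _) = self.lstm(x)  # h_n: final hidden state after the whole window
        return self.head(h_n[-1])   # forecast from the last memory state

model = LSTMForecaster()
window = torch.randn(8, 90, 1)      # a batch of 8 windows, 90 time steps each
print(model(window).shape)          # torch.Size([8, 7]): seven steps ahead
```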


More recently, the AI landscape has shifted towards Transformers (the technology underpinning tools like ChatGPT). Transformers abandon the step-by-step reading approach entirely. Instead, they use an attention mechanism, looking at all time steps simultaneously. If a Transformer is trying to make a prediction for today, it asks: “Which other moments in history are most relevant to this exact situation?” Instead of reading a book left to right, it can instantly flip to the exact page that contains the answer.
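The core of that attention mechanism fits in a few lines. Here is a minimal NumPy sketch of scaled dot-product self-attention over a short series (shapes and names are illustrative):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention over time steps."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)   # relevance of every step to every other step
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V              # weighted mix of all time steps at once

rng = np.random.default_rng(0)
T, d = 10, 4                        # 10 time steps, 4 features each
X = rng.normal(size=(T, d))
out = attention(X, X, X)            # self-attention: the series attends to itself
print(out.shape)                    # (10, 4): every step sees all the others
```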



The Three Core Tasks of Time Series AI

When we feed these complex neural networks our historical data, we are usually asking them to perform one of three tasks:


  1. Forecasting (“What comes next?”): For example, the UK's National Energy System Operator relies heavily on forecasting to balance the power grid. Storing energy is highly inefficient, so supply must perfectly match demand. If the AI overestimates demand, power stations generate electricity that goes to waste, driving up consumer prices. If it underestimates demand, the country could face rolling blackouts.

  2. Classification (“What is happening right now?”): Wearable tech uses time series data to classify human behaviour. A smartwatch tracks your heart rate, movement, and breathing over time. Without you pushing a button, it can look at that window of data and categorise whether you are sitting at a desk, walking, or swimming.

  3. Anomaly Detection (“Is this behaviour weird?”): Anomaly detection asks if a sequence deviates from what is considered “normal” for that specific system (a minimal version is sketched after this list). This is how modern bank fraud algorithms operate. Buying a coffee in London is normal behaviour. Buying a coffee in London three minutes after a purchase is made in Tokyo is a glaring anomaly.

    This is also how the Apple Watch’s FDA-cleared Atrial Fibrillation (AFib) detection feature works. It tracks your heart rate to learn what your “normal” looks like. It does not alert you after a single irregular beat; it waits for multiple irregular readings over time. While studies show the watch's sensitivity is somewhat low (meaning it misses some real AFib episodes), its specificity is very high—meaning it rarely gives a false alarm, so each alert is a credible early warning of a condition strongly linked to strokes.
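Here is a toy version of the anomaly-detection idea promised above: learn “normal” from a rolling window and flag readings that sit far outside it. The threshold and the heart-rate numbers are invented for illustration; real detectors such as the watch's are far more sophisticated:

```python
import numpy as np

def rolling_zscore_anomalies(series, window=30, threshold=4.0):
    """Flag points far outside the recent rolling mean, in rolling-std units."""
    flags = []
    for t in range(window, len(series)):
        recent = series[t - window : t]
        z = (series[t] - recent.mean()) / (recent.std() + 1e-9)
        flags.append(abs(z) > threshold)
    return np.array(flags)

rng = np.random.default_rng(7)
heart_rate = rng.normal(70, 3, size=200)  # a resting heart rate around 70 bpm
heart_rate[150] = 140                     # one wildly irregular reading
anomalies = rolling_zscore_anomalies(heart_rate)
print(np.nonzero(anomalies)[0] + 30)      # the spike at index 150 is flagged
```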


Weather forecasting using a trained AI model. The model has picked up on the seasonality in the data and successfully extrapolates. The final observations are somewhat irregular, and not perfectly captured by the model (Sebastian Callh, 2020).

The Trap of Spurious Correlations

While AI is incredibly powerful at spotting patterns, it has zero common sense! This leads to a dangerous pitfall known as spurious correlations—when two variables appear mathematically linked but have no causal relationship.


For example, historical data shows a near-perfect correlation between the amount of cheese consumed in a given year and the number of people who tragically die by becoming tangled in their bedsheets. Similarly, there is a striking statistical correlation between the release of Nicolas Cage movies and the number of swimming pool drownings. Ice cream sales and shark attacks also rise and fall together—though both are simply caused by the arrival of warm summer weather.


An AI blindly hunting for patterns might confidently use Nicolas Cage movie releases to forecast public safety risks. We know that correlation does not equal causation, but an algorithm sifting through millions of tiny correlations across a vast, multivariate dataset does not know the difference. The primary skill in data science is shifting from asking “how do I build a forecast?” to “how do I critically evaluate one?”
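You can manufacture a spurious correlation on demand: two completely independent series that merely share an upward trend will often correlate strongly. A quick sketch (the labels are tongue-in-cheek; the data is random):

```python
import numpy as np

rng = np.random.default_rng(3)

# Two completely unrelated random walks -- call them yearly cheese
# consumption and yearly Nicolas Cage film counts (purely illustrative).
years = 20
cheese = np.cumsum(rng.normal(1.0, 1.0, size=years))      # trends upward
cage_films = np.cumsum(rng.normal(1.0, 1.0, size=years))  # also trends upward

r = np.corrcoef(cheese, cage_films)[0, 1]
print(f"Correlation between unrelated trending series: {r:+.2f}")
# Shared trends routinely produce |r| > 0.8 with no causal link at all.
```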



The Ethics of Having a Crystal Ball

As AI forecasting becomes more embedded in our infrastructure, we must confront the ethical implications of a machine that anticipates human behaviour.


First, there is the issue of historical bias. AI models can only learn from the data of the past. If a predictive policing model is trained on decades of arrest records, it learns where arrests have historically been made, not necessarily where crime occurs today. Deploying that model into the real world simply entrenches and accelerates historical inequities, trapping society in the prejudices of the past.


Second, predicting the future threatens individual autonomy. The more granularly an AI can predict consumer or patient behaviour, the more that knowledge can be used to nudge or restrict people. If a wearable device accurately forecasts that a patient's health will severely decline over the next year, a health insurance company equipped with that data could preemptively price that individual out of coverage.


AI forecasting does not magically see the future. It learns from the past and makes highly calibrated guesses about what similar-looking futures might bring. The world is fundamentally chaotic and constantly evolving. As Kieren and Riku point out, the future is always yet to come, and the only truly predictable thing about it is change.



If you enjoyed reading, don’t forget to subscribe to our newsletter for more, share it with a friend or family member, and let us know your thoughts — whether it’s feedback, future topics, or guest ideas, we’d love to hear from you!
