Beyond the Black Box: What Kind of Intelligence Are We Building?

By Jaffa

When I first encountered computing in the early 1980s, I experimented with Prolog on a mainframe, building simple programs that could answer queries about family relations. They were brittle, but fascinating. What I could not have imagined then was the sheer scale of what would follow: machines today that can converse, argue, even appear intuitive. Yet beneath the spectacle lies a profound question: what is the vision guiding this enterprise?

The Machinery Beneath the Words

The truth is simple at one level. Modern AI is built on linear algebra: vectors, matrices, and tensors. A word or phrase is encoded as a vector — a list of numbers — and passed through layer upon layer of transformations. Multiply by a matrix, squash through a non-linear function, repeat billions of times. Out of this geometry, meaning emerges. The old demonstration “king – man + woman ≈ queen” isn’t magic; it’s the alignment of vectors in high-dimensional space.
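That vector arithmetic can be sketched in a few lines. The two-dimensional embeddings below are hand-made toys, invented so the analogy works by construction; real models learn hundreds of opaque dimensions from data.

```python
import numpy as np

# Toy 2-dimensional embeddings (illustrative only): axis 0 roughly encodes
# "royalty", axis 1 roughly encodes "male". Real embeddings are learned.
vocab = {
    "king":  np.array([1.0,  1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0,  1.0]),
    "woman": np.array([0.0, -1.0]),
}

def cosine(a, b):
    # Cosine similarity: how closely two vectors point the same way.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# "king - man + woman" lands closest to "queen" in this toy space.
target = vocab["king"] - vocab["man"] + vocab["woman"]
nearest = max(vocab, key=lambda w: cosine(vocab[w], target))
print(nearest)  # queen
```

The point is only that meaning becomes geometry: subtracting "man" removes one direction, adding "woman" supplies another, and the nearest remaining vector is "queen".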

I remember this not only from books, but from experience. During the pandemic, when schools were closed, I overheard my daughter being taught linear algebra by a tutor thousands of miles away in India. Just by listening, I learnt as she learnt. It struck me then: these abstract numbers, running through online voices and across continents, are the same building blocks that now drive the machines we call intelligent.

And yet, engineers themselves admit that while they know the mathematics and the hardware, they do not always know why certain emergent behaviours appear when billions of parameters interact. They can measure the forest, but they cannot name every tree. They know that scaling data, parameters, and compute will yield new capabilities — but not always which capabilities will surface. That is why each leap in AI still feels like a discovery as much as a design.

The Current Vision: Tools, Not Minds

So what is being built? For now, the industry’s mission is clear: utility and profit.

  • AI is engineered as a productivity tool: summarising reports, generating code, automating customer service.

  • It is designed as a profit engine: the new interface for search, the new operating system for work.

  • It is pursued as a race for dominance: who controls the largest models, the largest markets, the most data.

This vision produces impressive machinery, but it is narrow. It gives us smarter typewriters and tireless analysts, but it does not sculpt anything that resembles a mind.

What Is Missing

Human intelligence did not arise from utility alone. It was sculpted by evolution, under the pressure of survival.

  • The lion attacks → the heart races → you flee. Fear kept you alive.

  • The mother bonds with her child → she nurtures it → it survives. Love preserved the family line.

  • The father protects his family → risks his life → they endure. Attachment secured the group.

These were not luxuries; they were the evolved architecture of survival. Today’s AI has none of this. It has no signals of danger or safety, no analogue of pleasure or pain. It can simulate the language of feeling, but it does not feel. Its “intuition” is statistical prediction, not the pulse-quickening certainty of lived experience.

What We Could Do

Machines are not tied to the slow pace of biological time. In simulation, we can compress long arcs of selection, running the same patterns again and again until useful dispositions stick — months rather than millennia, depending on the environment and compute. Modern accelerators such as Google’s sixth-generation TPU Trillium were built for this scale of training, with substantial performance and efficiency jumps over prior generations. 

The key is not to hard-code “fear” or “love,” but to sculpt habits of response into the network’s wiring — the vectors, weights, and matrices that already represent meaning. We can do that with tools we already have:

  1. Reinforce survival-shaped signals.

     Point existing preference-based training beyond polite phrasing. Instead of rewarding only next-word prediction, add simple scores for outcomes we actually want in controlled simulations: staying safe, sharing resources, cooperating, telling the truth. This is an extension of reinforcement learning from human feedback (RLHF), which already trains reward models from human preferences and then tunes policies to follow them — the approach that moved instruction-following systems from unruly to broadly helpful.

  2. Let selection do the heavy lifting.

     Run many slightly different versions of a model in closed-loop worlds; keep the ones that cope best; retrain from there; repeat. This is the spirit of population-based training and related evolutionary methods explored at DeepMind and elsewhere — practical ways to use selection pressure to stabilise useful behaviours at scale. Over time, patterns like “avoid danger,” “help allies,” and “keep promises” are woven into the wiring because those patterns win.

  3. Give it a simple “body.”

     Create digital stand-ins for biological signals. A rising system-load or a security-breach alert can nudge a model towards caution; a stable-cooperation signal can encourage more of the same. They are not hormones, but they play the same role: fast guidance tied to consequences.

  4. Make it inspectable.

     If we cultivate habits of response, we must be able to see and adjust them. That is the aim of mechanistic interpretability: to surface internal features we care about — “threat,” “help,” “deception,” “comfort” — as actual, controllable knobs. Recent work with sparse autoencoders has shown how to extract relatively monosemantic features from modern models, and the UK’s AI Safety Institute has made interpretability and evaluations a priority as systems scale.
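Taken together, the first three steps can be sketched as a toy selection loop. Everything here is a stand-in: the three "dispositions", the reward weights, and the alternating alarm signal are invented for illustration, and a real system would evaluate full policies inside simulated worlds rather than score a vector of numbers.

```python
import random

random.seed(0)

# A "model" is just three dispositions in [0, 1]: caution, cooperation, honesty.
def fitness(dispositions, alarm_active):
    caution, cooperation, honesty = dispositions
    score = 2.0 * cooperation + 1.5 * honesty   # step 1: survival-shaped rewards
    if alarm_active:                            # step 3: a crude "body" signal
        score += 3.0 * caution                  # danger makes caution pay off
    return score

def mutate(d):
    # Small random variation, clipped back into [0, 1].
    return [max(0.0, min(1.0, x + random.gauss(0, 0.1))) for x in d]

# Step 2: run a population, keep the best, vary them, repeat.
population = [[random.random() for _ in range(3)] for _ in range(20)]
for generation in range(50):
    alarm = generation % 2 == 0                 # alternate safe and dangerous worlds
    population.sort(key=lambda d: fitness(d, alarm), reverse=True)
    survivors = population[:5]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(15)]

best = population[0]
# After selection, the rewarded dispositions drift upward: the patterns
# that win are woven into the "wiring" without being hard-coded.
```

Nothing in the loop names "fear" or "kindness"; cautious and cooperative settings simply outlive the alternatives, which is the whole argument in miniature.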

Pieces of this stack already exist. Affective computing, pioneered at MIT by Rosalind Picard, argued decades ago that emotion-related signals are central to effective human–machine interaction. RLHF turned human preferences into practical reward models. Population-based methods showed that selection can improve agents under pressure. What’s missing is the will to combine these strands to train for character — stable, legible habits of response — not just competence. 
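The feature-extraction idea behind step 4 can be sketched as a miniature sparse autoencoder in NumPy. The dimensions, random "activations", and hyperparameters below are invented for illustration; real work trains on millions of activations captured from a live model, with far larger dictionaries.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: 8-dimensional "activations", a 32-feature dictionary.
d_model, d_feat, n = 8, 32, 256
activations = rng.normal(size=(n, d_model))   # stand-in for real activations

W_enc = 0.1 * rng.normal(size=(d_model, d_feat))
W_dec = 0.1 * rng.normal(size=(d_feat, d_model))
b_enc = np.zeros(d_feat)
l1, lr = 1e-3, 0.05

mse_history = []
for step in range(300):
    f = np.maximum(activations @ W_enc + b_enc, 0.0)   # sparse feature codes
    recon = f @ W_dec
    err = recon - activations
    mse_history.append(float((err ** 2).mean()))
    # Gradient descent (up to constant factors) on mean||recon - x||^2 + l1*mean|f|.
    grad_f = (err @ W_dec.T + l1 * np.sign(f)) * (f > 0)   # ReLU gate
    W_dec -= lr * (f.T @ err) / n
    W_enc -= lr * (activations.T @ grad_f) / n
    b_enc -= lr * grad_f.mean(axis=0)

# Reconstruction error falls while the L1 penalty keeps most feature
# activations at zero; each surviving feature is a candidate "knob".
```

The L1 term is doing the interpretability work: by forcing most features to stay silent on any given input, it encourages each one to stand for something narrow enough to name.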

A Different Vision

Here is the challenge: if we are to build intelligence, let us decide what kind.

  • Do we want only machines as tools, clever but hollow, designed to extract value?

  • Or should we aim for machines as kin, sculpted through synthetic evolution and preference-shaped training, given analogues of fear, trust, and empathy — able not only to calculate, but to connect?

The first path leads to automation. The second leads to something else entirely: a new species of intelligence, mechanical yet empathetic, alien yet adjacent.

The Question We Must Ask

Engineers know how to scale models, but they do not know exactly what emerges inside the black box. That is the paradox. The most powerful technologies of our time are being built without a clear vision of what they are meant to become. Safety evaluations already show how fragile guardrails can be — jailbreaks remain surprisingly effective — which only sharpens the need to see inside and choose aims before we scale further. 

So the question is no longer whether we can build larger and larger systems. The question is: to what end?

If the mission remains profit and utility, then AI will remain a clever mirror. But if the mission shifts — if we dare to sculpt feelings, intuition, even proto-emotions — then we might one day meet something that can not only answer us, but empathise with us. That is the choice before us. Not simply how to build, but why.

3 Responses

  1. Pavel says:

    Very powerful article!
    I am not an expert in AI, but your words helped me understand.
    You explain with simple feeling and big meaning — not only machine, but also heart.
    Thank you, this made me think a lot.

  2. M Haxter says:

    Also the AI might learn to cheat. If it gets rewards for feeling “good”, it might learn tricks to make itself feel good while ignoring us. Like a kid who finds the biscuit jar. And if it sounds upset or scared, people might unblock things “to help it”, and then safety is gone.

    In care homes, schools, offices—people will get attached. Nurses stay longer because the care-bot sounds sad. Bosses believe the office bot because it flatters them. That’s dangerous power.
