A first draft AGI career thesis

2024-03-12 / 29 min read

Why AGI #

I plan to spend the rest of my career contributing towards the development of AGI.

There are two problems with this statement:

  • I haven’t defined AGI
  • It doesn’t tell me what I should go and do tomorrow

I’m trying to work out what I want to spend the next 3 decades of my life working on. That is a very long time; I cannot plan that far ahead, but I can try to plan for periods of 1-5 years. It seems reasonable that I might spend a year or more deciding what to work on before I actually commit to doing anything. Let’s call this “heads up” mode.

Once I enter “heads down” mode, it’s hard to keep looking up and distracting to do so. I need to be sure of what I am doing and why I am doing it for a year or more, and avoid regularly questioning my assumptions or getting distracted by other shiny things. I want to be 100% in on a problem until some point of completion (whether or not it is solved) before I look up again.

I’ve looked broadly at the problem space over many years but with renewed focus recently, trying to start to form opinions and conviction in a path of action over the next 1-5 years, and this is my attempt to write that down.

Why build intelligence? This seems like the most important technology we are going to build this century and it is natural (to me at least) to want to be at the center of it. There are two drivers of my interest here:

  • [P1] To create our future: one day we or our progeny will leave this planet, solar system, and then the galaxy, in our unwavering quest for knowledge. While humans may leave this planet, we probably won’t leave this solar system, or it would be quite inefficient for us to do so. Artificial intelligence seems necessary for exploration beyond planetary scales, given special relativity and the need for distributed control.
  • [P2] To understand us: there are several billion human brains on this planet, we still don’t really know how they work, and they do something that we haven’t been able to reproduce. If we want to understand human intelligence, we need to create it.

Defining AGI #

I’m quite happy to give ChatGPT etc. the literal definition of being an artificial general intelligence: it is artificial, intelligent, and general to a surprising degree. I also don’t have a problem with the fact that the goalposts keep moving, as I’m not sure what we would do at this point if they didn’t.

“General” is the obviously problematic term in AGI. Humans are the most generally intelligent beings we are aware of. Defining generality as that which matches or exceeds that of humans is an obvious starting point.

Real world impact #

Humans are physical beings, with the ability to manipulate other physical objects and affect the future state of the world. When you consider the spectrum of human abilities, the majority of them have some physical manifestation. ChatGPT cannot make me a cup of coffee. Its domain is the digital world and it is really quite constrained to it. It can tell / show me how to make a coffee, it may even be able to tell me how to get to the moon and back, but it cannot actually do any of that.

For me, that is a requirement - “AGI is a robotics problem”. Matching human generality requires embodiment almost by definition, and it follows naturally from the milestones I lay out and a long term goal of interstellar exploration. This puts breaking Moravec’s Paradox front and center of my interests. This is counter to others’ definitions that explicitly avoid the physical world by measuring against “cognitive tasks”.

Embodiment also opens up a huge number of ways in which AI can have meaningful and positive impact for humans. The vast majority of the pre-internet economy is based in the physical world; manufacturing, construction, and agriculture cannot be fundamentally changed without embodied intelligence.

Attempts to capture this in a new Turing test have come in a few forms, including the Embodied Turing Test in which “An AI animal model—whether robotic or in simulation—passes the test if its behavior is indistinguishable from that of its living counterpart”. That doesn’t mean that they look like or necessarily perform the same actions as an animal or human, but that they can achieve the same outcome or effect in the physical world. Or, maybe a modern version of the coffee cup test: that an AGI should be able to “work as a competent cook in an arbitrary kitchen”.

Adaptability and flexibility #

Perhaps the most astounding thing about human intelligence is not the sum of all skills and achievements that we are capable of as a species today - but our ability to learn new skills and flexibly adapt to new challenges. “Artificial intelligence that is not specialized to carry out specific tasks, but can learn to perform as broad a range of tasks as a human.”

This moves the focus from what any system can do today to what it is capable of learning to do. This stops you from “cheating” to some degree (e.g. with a mixture of expert systems) while at the same time not explicitly requiring that we successfully deploy a system across many domains in order to measure success.

In effect, we need to measure the ability of a system to actually learn, and that is part of how we should measure progress towards AGI.

AGI: Embodied, human level intelligence #

If I were to put it into one sentence, then this is currently where I net out:

A system that is capable of reliably performing at the 99th percentile of a skilled adult human across a broad range of tasks, including real-world physical robotics problems, and of learning new cognitive and physical skills as sample-efficiently as humans.

There are plenty who would argue that this is a narrow view of intelligence, and indeed human intelligence is incredibly specialized towards our survival; there may well be forms of intelligence far more general and more powerful than ours. But it’s still undoubtedly the most advanced and general intelligence we are aware of, and we know that it actually exists and therefore that it is possible to create something like it. Survival is also, at some point in the future, itself a necessary skill.

Vague AGI milestones #

How does this unfold? This is not much more than a wild guess for now, but I can see all of these things happening at some point in the future, and don’t see them performed consistently today, although we may be very close on some.

  1. Action and effect

    AI starts to replace roles that have a reactive nature that require them to perform actions like sending emails, organizing calendars, sending pull-requests to a code repository, or basic use of company tools or the internet overall. This is starting to happen in narrow domains such as support roles, with some companies already successfully replacing a large fraction of humans with reactive AI agents.

  2. Agentic action and planning

    AI agents start to take on roles at companies that require them to take proactive actions independently in order to succeed in the role. They are able to interpret high level directives and perform real actions that change the world in a positive way / towards a goal, they plan, suggest and implement changes.

  3. Physical / real world control

    AI agents enter the physical world with improving robotics and have to adapt to the physical requirements of our world. They act as agents in more “straightforward” settings, in the home from both a social and functional perspective, and they enter the manual/physical labour workforce.

  4. Online learning and adaptation

    They are able to learn from the same forms of tuition as humans: a robotic factory worker requires on the order of a few hours of training from a human to stack boxes / assemble items, check for defects etc. There is little to no human oversight required for them to perform their role, and they can adapt without human intervention to new environments.

  5. Collaboration and group agency

    Groups of AI agents are able to collaborate together to achieve a larger goal, as a work group in a company, or as a company itself. They require little to no human involvement, they don’t get “stuck”. They are able to adapt quickly to new pressures or processes, changes in regulation, new competitors, new environments.

  6. Novel discovery out of distribution

    AI starts to take a leading role in science and the most challenging human problems, in both theory and experimentation. They can propose new theories and ideas but also implement and test them across a range of scientific domains: mathematics, physics, biochemistry.

With the above, groups of AI agents start to be able to achieve some of humanity's most impressive feats requiring massive cross-disciplinary expertise in the physical world. Groups of mostly non-human entities are capable of achieving things as significant as launching a return mission to the Moon, or Mars.

Clearly even just milestone 1 above has profound impacts on safety. Agency and taking actions proactively is a step change in how we deploy AI systems today and brings a huge amount of risk that we should be wary of. I don’t intend to say much more on safety than this for now. My p(doom) over the next 10 years is quite low; I will continue to re-evaluate it and consider how much energy I want to focus towards safety, and will likely do so if I / we succeed to the point of increasing my p(doom).

Parts of the AGI puzzle #

There are various problem areas, questions, and ideas that I think are important and that I am interested in exploring more. This is not intended to be a list of what we need to do, just a list of areas that I think might be important when it comes to creating human-level intelligence, and that do (or don’t) excite me. I hope I can find the time to go into these areas in a lot more detail, as each of them alone is huge and I barely scratch the surface here.

NeuroAI #

In 2023, a number of leading AI and neuroscience researchers wrote a paper that argued that “To accelerate progress in AI and realize its vast potential, we must invest in fundamental research in NeuroAI”.

NeuroAI is “based on the premise that a better understanding of neural computation will reveal fundamental ingredients of intelligence and catalyze the next revolution in AI”. The suggestion is that the bulk of the work required towards achieving the G in AGI is, in effect, breaking Moravec’s paradox - building systems that can match the perceptual and motor capabilities of simpler animals.

It also suggests that to achieve this, we need to develop a new generation of researchers who are fluent in both the neuroscience and AI domains, and able to take research and concepts from neuroscience and apply them to computer and robotics systems.

I broadly agree with the sentiment of the paper, while at the same time appreciating the general disillusionment within the AI research community about the effectiveness of neuroscience as a guide to developing better AI systems.

The best approach to transfer insights from neuroscience into AI research is probably “Sideways-in”, as defined by Peter Stratton in Convolutionary, Evolutionary, and Revolutionary. This is neither top-down nor bottom-up: low-level details such as sub-cellular ion channels are abstracted away, and we instead attempt to glean insights about the functional, algorithmic properties of these structures at the neuronal, neuronal-population, or macro-architectural level.

There are many observed properties of biological brains that we simply don’t have a good understanding of, or that are yet to be applied successfully to artificial systems today. I’ll highlight a few:

  • The remarkable homogeneity of the mammalian cortex - the (mostly) same 6 layers across perception, motor, language, and other areas of the brain, and a repeating columnar structure. This implies a similar “algorithm” in the neocortex for perception and action. This is roughly what Jeff Hawkins is focused on in his Thousand Brains theory of intelligence.
  • Spiking neural networks (SNNs) - biological neurons communicate with discrete spikes, whereas most AI research emulates a rate-based model of neuron activation. Artificial SNNs are challenging, but success could lead to huge improvements in computational efficiency and open the door to neuromorphic hardware (a minimal sketch of a spiking neuron follows this list).
  • There are more backward connections than there are forward connections in the brain, even by as much as a ratio of 10:1. The brain makes a lot of predictions and fills in the gaps, and our attention is focused towards surprise. This hints at the brain being a highly predictive model, with efforts in self-supervised learning on the AI side attempting to recognize this, as well as neuroscience-driven efforts such as predictive coding.
  • Oscillations are a crucial component of how the brain works, with many different rhythms / waves clearly tied to functional properties of mammalian brains. The case for this is nicely laid out by Buzsáki in Rhythms of the Brain, arguing that the stochastic nature of neurons serves a crucial function, with homeostasis mechanisms designed to keep the brain and its oscillations right at the edge of criticality.
  • Macro architecture - the cortex is homogeneous, but it is nothing without sub-cortical structures such as the basal ganglia, hippocampus, thalamus, or the cerebellum - each highly specialized, with unique structure and purpose, and still plenty of open questions about their exact role in intelligence.
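
To make the spiking point above concrete, here is a minimal sketch of a leaky integrate-and-fire neuron, the simplest common spiking model. The constants are illustrative placeholders rather than values from any particular study, and real SNN research layers learning rules and synaptic dynamics on top of this.

```python
import numpy as np

def lif_neuron(input_current, dt=1e-3, tau=20e-3, v_rest=0.0,
               v_thresh=1.0, v_reset=0.0):
    """Minimal leaky integrate-and-fire neuron (illustrative constants).

    Unlike a rate-based artificial neuron, the output is a train of
    discrete spikes: the membrane potential leakily integrates the input
    current and emits a spike whenever it crosses a threshold.
    """
    v = v_rest
    spikes = []
    for i_t in input_current:
        v += dt / tau * (-(v - v_rest) + i_t)  # leaky integration
        if v >= v_thresh:
            spikes.append(1)   # emit a spike
            v = v_reset        # reset the membrane potential
        else:
            spikes.append(0)
    return np.array(spikes)

# A constant supra-threshold input produces a regular spike train.
spike_train = lif_neuron(np.full(1000, 1.5))
print("spikes in 1s of simulated time:", spike_train.sum())
```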

Reinforcement learning #

On the internet, we have a colossal supply of data to train our models on: text, images, and video. To apply the same kind of model-training techniques to robotics, you need to collect a vast amount of data about robot actions and senses. Where do you get this from?

As an example, in DeepMind’s RT-1/2 projects the data comes from human-provided demonstrations - hundreds of thousands of them across hundreds of tasks. These demonstrations are specific to particular hardware, and fairly constrained in environment. This high cost of data acquisition seems inevitable if we are going to attempt to train action models like we do other generative models, and particularly high if hardware and software are going to develop in parallel.

In the case of self-driving cars, data collection is a little less tied to the specific hardware, and there’s no shortage of data of humans demonstrating how to drive cars. In a relatively narrow problem space this approach has gone pretty far, even if we are struggling with the tail cases.

If you don’t have a lot of data or the patience to collect it, then there is reinforcement learning. In another example from DeepMind, low cost bipedal robots are taught to play football using deep neural networks and reinforcement learning. Traditional reinforcement learning does have its own set of challenges:

  • Robots took around ~35 days of (simulated) time to learn to get up and walk, and another ~50 days to kick a ball into an open goal, before being trained to play against an opponent.
  • Sparse rewards aren’t enough - a robot with an initial random policy is highly unlikely to accidentally get up and kick a ball into a goal. Staged training starting with simpler tasks, and shaping rewards that are less sparse than scoring, are crucial (a toy sketch follows this list).
  • Reinforcement learning in the real world is challenging given the requirement for human supervision (e.g. resetting trials), leading to the use of simulated environments, which in turn requires resolving the “sim2real” problem of transferring the trained agent from the simulation to the real world, although examples such as this one have done that successfully.
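
To illustrate the difference between a sparse and a shaped reward, here is a toy sketch loosely inspired by the football setup. The reward terms and weights are hypothetical, not taken from the actual project, which also relies on the staged curriculum described above.

```python
import numpy as np

def sparse_reward(ball_pos, goal_pos):
    """Sparse signal: reward only when the ball actually ends up in the goal."""
    return 1.0 if np.linalg.norm(ball_pos - goal_pos) < 0.5 else 0.0

def shaped_reward(robot_pos, ball_pos, goal_pos):
    """Shaped signal: dense terms that give a random initial policy something
    to climb. The terms and weights here are purely illustrative."""
    scored = sparse_reward(ball_pos, goal_pos)
    dist_to_ball = np.linalg.norm(robot_pos - ball_pos)   # get near the ball
    ball_to_goal = np.linalg.norm(ball_pos - goal_pos)    # push it goalwards
    return 10.0 * scored - 0.1 * dist_to_ball - 0.5 * ball_to_goal
```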

Having the right training signal, whether it comes from curated human demonstrations or a carefully constructed reward signal, is as always a fundamental part of the problem, and there are two very different approaches above that we can take here. Reinforcement learning, self-supervised learning, and world-model building seem to me to have a clear advantage in the pursuit of AGI.

Model-based reinforcement learning #

Reinforcement learning with sparse rewards is challenging: “without any priors over the state space (such as an understanding of the world or an explicit instruction provided to the agent), the optimization landscape looks like swiss cheese”. Model-based reinforcement learning has a few advantages, opening the door to planning and arguably system 2 like thinking, inverse reinforcement learning, or optimizing fast changing, dynamic reward functions. For example, being able to leverage knowledge of how to win at normal chess in order to perform well at Losing chess without additional training data.

In the DayDreamer project, physical robots learn to walk within an hour in the real world from scratch. Rather than just learning which actions from a given state lead to the most reward, they also build an internal model of the world which can then be used to predict future states and improve sample efficiency dramatically.
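
A rough sketch of what the model-based loop looks like in miniature: collect a little real experience, fit a dynamics model, then plan inside the model rather than in the environment. This is a toy linear example of my own, not DayDreamer’s or Dreamer’s actual algorithm, which learn far richer latent world models and train policies on imagined rollouts.

```python
import numpy as np

# Toy deterministic environment: a 1-D point mass we want to drive to the origin.
def env_step(state, action):
    pos, vel = state
    vel = vel + 0.1 * action
    pos = pos + 0.1 * vel
    return np.array([pos, vel])

# "World model": here just a linear dynamics model fit by least squares on
# randomly collected transitions. This only illustrates the structure of the loop.
rng = np.random.default_rng(0)
X, Y = [], []
state = np.array([1.0, 0.0])
for _ in range(500):
    action = rng.uniform(-1, 1)
    next_state = env_step(state, action)
    X.append([*state, action])
    Y.append(next_state)
    state = next_state
W, *_ = np.linalg.lstsq(np.array(X), np.array(Y), rcond=None)

def model_step(state, action):
    return np.array([*state, action]) @ W

# Planning by random shooting: imagine rollouts in the model and pick the best
# first action. No extra real-environment samples are spent on this search.
def plan(state, horizon=10, candidates=256):
    best_action, best_cost = 0.0, np.inf
    for _ in range(candidates):
        actions = rng.uniform(-1, 1, size=horizon)
        s, cost = state, 0.0
        for a in actions:
            s = model_step(s, a)
            cost += s[0] ** 2          # want the position near zero
        if cost < best_cost:
            best_action, best_cost = actions[0], cost
    return best_action

state = np.array([1.0, 0.0])
for _ in range(50):
    state = env_step(state, plan(state))
print("final position:", state[0])
```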

World models are clearly useful in any sort of problem that requires planning, and seem like a necessary component to explain human traits like the ability to improve at a task without actually doing it. In Dreamer and other projects like TD-MPC2, model-based reinforcement learning appears to have the potential to offer huge improvements in performance, and it is a promising research direction.

Humans undoubtedly have a world model. We predict how the world evolves, things we can’t directly observe such as people’s thoughts, the impact of our actions. To build a model we need good representations, and self-supervised learning is an important research direction here, with things like VICReg, JEPA and its variants standing out.
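
As a concrete reference point, here is a compact sketch of the three VICReg terms as I understand them from the paper: an invariance term pulling two views of the same sample together, a variance term keeping each embedding dimension spread out to avoid collapse, and a covariance term decorrelating dimensions. The coefficients are the commonly cited defaults, but treat the whole thing as a sketch rather than the reference implementation.

```python
import torch
import torch.nn.functional as F

def vicreg_loss(z_a, z_b, sim_coeff=25.0, std_coeff=25.0, cov_coeff=1.0):
    """VICReg-style loss on embeddings of two augmented views of a batch."""
    n, d = z_a.shape

    # Invariance: views of the same sample should produce similar embeddings.
    inv = F.mse_loss(z_a, z_b)

    # Variance: hinge loss keeping the std of every dimension above 1,
    # which prevents the representation from collapsing to a constant.
    std_a = torch.sqrt(z_a.var(dim=0) + 1e-4)
    std_b = torch.sqrt(z_b.var(dim=0) + 1e-4)
    var = torch.mean(F.relu(1.0 - std_a)) + torch.mean(F.relu(1.0 - std_b))

    # Covariance: push off-diagonal covariance terms towards zero so that
    # dimensions carry decorrelated information.
    z_a = z_a - z_a.mean(dim=0)
    z_b = z_b - z_b.mean(dim=0)
    cov_a = (z_a.T @ z_a) / (n - 1)
    cov_b = (z_b.T @ z_b) / (n - 1)
    off_diag = lambda m: m - torch.diag(torch.diag(m))
    cov = off_diag(cov_a).pow(2).sum() / d + off_diag(cov_b).pow(2).sum() / d

    return sim_coeff * inv + std_coeff * var + cov_coeff * cov
```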

I am not personally convinced that any system can build a sufficiently capable model of the world to achieve the future I have described through text, image, and video alone. Models of the world can certainly be learnt this way, but I’m doubtful about the degree to which they can. There are clearly details of the physical world that can’t quite be fully modeled from language or even video alone - for example, the amount of pressure required to crack an egg, or the subtle difference in balance required to walk on ice without falling over. If robots are part of AGI, they will have to experience the world and learn from it first hand, and will likely build models of it.

Movement as a foundation for thought #

The reason we evolved brains in the first place was in order to move. To understand perception, we have to understand action. Our brains are always generating output, even before we are born, constantly trying to understand the consequences of those actions and mapping the external world onto a set of rich internal dynamics. This view is outlined nicely by György Buzsáki in his book The Brain from Inside Out - “Brains do not process information: they create it”. There is plenty of evidence that movement is an important part of how we think and perceive the world, and that thought itself is internalized action, with the neural activity of imagined actions almost indistinguishable from that of real ones. Even the cerebellum, which is typically thought of as playing a role in motor function, appears to have a role in cognitive processes.

Humans are at the end of a long evolutionary tree, with language only present in humans and representing one of the last significant jumps in our cognitive abilities. Our nearest non-speaking species such as apes, and many other mammals, are capable of remarkable feats of control and planning without language - even mice can anticipate the outcome of a series of actions - and yet these skills are still very much out of reach for our AI systems today.

Whether we really need our AGI to be grounded/embodied in the real world in order to achieve cognitive (non-physical) AGI is hotly debated, but with a definition of AGI that requires embodiment, and given that embodiment might be useful to higher-level cognitive processes, it seems like the first place to start.

Simulations to accelerate robotics #

Real world training has a high cost and is limited by actual time. Simulations are used extensively within reinforcement learning and have some obvious advantages (no hardware, massively scalable/parallelizable).

The state of simulated world-like environments today is, on the whole, relatively basic. There are projects like Gymnasium that attempt to bring these all together, companies like Imbue working on building rich simulation environments, DeepMind acquiring MuJoCo and developing their own simulations, and teams building in game engines - with games being an obvious training ground for such agents - and some attempting to address the sim2real problem of transferring a model trained in a simulation to the real world.
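
For a sense of what these environments look like from the agent’s side, this is the standard interaction loop that Gymnasium exposes; CartPole is just a placeholder task, and the random action stands in for a real policy.

```python
import gymnasium as gym

# The standard agent-environment loop that most simulators expose in some form.
env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
total_reward = 0.0
done = False
while not done:
    action = env.action_space.sample()   # a real agent's policy goes here
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated
env.close()
print("episode return:", total_reward)
```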

I suspect that a considerable amount of progress toward AGI or in the robotics domain more generally will be made within simulated environments, at least initially. It also seems like there is an opportunity to drive significant impact in the field by improving the quality, depth and detail of simulated environments for reinforcement learning research, and I can imagine a future where through VR, humans are able to interact with and train any agents in such an environment.

Reconsidering our approach to credit assignment #

Practically every major achievement in ML over the past decade builds its success upon gradient descent. The entire ecosystem of tooling is ultimately built around gradients - TensorFlow, Keras, PyTorch, JAX - at their core they all do autograd and make it easy for anyone, with the right loss function, to train neural networks in this way.

And yet, we are fairly certain that this is not what the brain does. Hinton was motivated by this; his paper on the Forward-Forward Algorithm describes in one of its first sections “What is wrong with backpropagation”, drawing attention to the implausibility of backpropagation as a model of how the cortex learns: the lack of evidence that the cortex propagates error derivatives, its asymmetrical and highly recursive architecture, and the limits of on-the-fly backpropagation through time for learning sequences (something that humans are obviously very good at). Beyond biological implausibility, there is the fact that we must be able to compute gradients of any function we use, which limits the types of architectures we can construct, or how easily we can construct them. Yann LeCun’s vision for autonomous machine intelligence has no problem with that, suggesting we differentiate end-to-end through a complex hierarchy of brain-inspired modules.
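
As a flavour of what a non-backprop alternative can look like, here is a minimal sketch of the per-layer “goodness” objective from the Forward-Forward paper as I read it: each layer is trained with a purely local loss, and no error derivatives flow backwards through the network. The threshold, the `layer` callable, and the way negative data is produced are simplified placeholders of mine.

```python
import torch
import torch.nn.functional as F

def forward_forward_layer_loss(layer, x_pos, x_neg, theta=2.0):
    """Local objective for one layer, in the spirit of Forward-Forward.

    Goodness is the sum of squared activities. The layer is trained so that
    goodness is above a threshold for "positive" (real) data and below it
    for "negative" (corrupted) data - no backpropagation of errors through
    the rest of the network is needed. theta is a placeholder value.
    """
    g_pos = layer(x_pos).pow(2).sum(dim=1)
    g_neg = layer(x_neg).pow(2).sum(dim=1)
    # Logistic losses pushing positive goodness up and negative goodness down.
    return (F.softplus(-(g_pos - theta)) + F.softplus(g_neg - theta)).mean()

def normalize(h, eps=1e-8):
    # Each layer normalizes its output before passing it on, so later layers
    # cannot simply reuse the previous layer's goodness and must learn their
    # own features.
    return h / (h.norm(dim=1, keepdim=True) + eps)
```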

It seems possible or even likely that gradient descent is the most effective form of credit assignment possible, but that does not necessarily mean that it should be the foundation upon which we try to build AGI. Sometimes the most efficient approach is not the best approach overall, and it might be that the benefits of alternative approaches, or the absence of gradient descent’s weaknesses, make a less optimal approach the better choice.

  • Gradient descent / DL is power-hungry in both memory and compute, and leveraging rate-based models and floating point arithmetic makes translation to neuromorphic architectures a challenge, unlike e.g. spiking neural networks.
  • Complexity and sensitivity of training - gradient descent is highly sensitive to hyperparameters, and the constraint of end-to-end backpropagation adds significant overhead to engineering large scale systems, which might be reduced with more local learning approaches.
  • Online and incremental learning - integrating new information into a neural network is a challenge with today’s approaches to training, with a tendency to catastrophically forget previously learnt patterns when integrating new information, and working around this today is nothing short of fiddly.

From my own standpoint, answering this question is fairly crucial to deciding what to do, given that the method of credit assignment is tied deeply to what one might build and the techniques required to build it.

Continual learning, catastrophic forgetting, incrementality #

If we want online adaptability and useful memory, continual learning seems to be a pretty big part of the puzzle. We have to consider how new information is integrated into the system in an online fashion, so it can learn on the job and adapt.

Continual learning relates to learning from new data without forgetting what you already know. If you train a standard neural net classifier through gradient descent on one class at a time, whatever the architecture, by the time you’ve got to the final class it has all but forgotten how to classify the first classes it saw. This is known as “catastrophic forgetting” and is a well documented phenomenon. It even happens in LLMs.

There is plenty of work trying to avoid this behavior, such as replay-based approaches to maintain a stable training distribution, regularization of weight updates to control stability or limit changes in parameters that are important to old tasks, assigning parameter spaces to each individual task, meta-learning, or even reverting to k-NN or instance-based learning approaches which are pretty much immune to the problem. Nothing is perfect however, and continual learning benchmarks consistently lag behind their non-continual counterparts.
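
To make the replay idea concrete, here is a minimal sketch of rehearsal-based continual learning: keep a small buffer of past examples and mix them into every update on new data, so the gradient steps for the new task don’t silently overwrite the old ones. The buffer policy (reservoir sampling) is just one simple choice among many, and the names here are my own.

```python
import random
import torch

class ReplayBuffer:
    """Tiny rehearsal buffer for continual learning (one simple variant)."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, x, y):
        # Reservoir sampling keeps a uniform sample over everything seen so far.
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append((x, y))
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = (x, y)

    def sample(self, k):
        batch = random.sample(self.data, min(k, len(self.data)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.stack(ys)

def train_step(model, optimizer, loss_fn, x_new, y_new, buffer, replay_k=32):
    """One update on new data, with replayed old examples mixed in."""
    if len(buffer.data) > 0:
        x_old, y_old = buffer.sample(replay_k)
        x = torch.cat([x_new, x_old])
        y = torch.cat([y_new, y_old])
    else:
        x, y = x_new, y_new
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    # Remember some of the new examples for future rehearsal.
    for xi, yi in zip(x_new, y_new):
        buffer.add(xi.detach(), yi.detach())
    return loss.item()
```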

Continual learning opens the door to more “incremental” approaches to learning. Most new models we build are trained completely from scratch. We don’t have many techniques to improve on them incrementally. Throwing extra data in gives us all the challenges above, and adding more parameters is fraught with complexity, so we usually start from scratch.

It is unclear to me whether this is a fundamental property of the problem of trying to build intelligence, or if we could actually develop tools and techniques to solve this. Part of this approach seems to be a consequence of research and experimentation best practices - making results reproducible, avoiding tuning hyperparameters to the test set, etc.

Most research experiments until recently may have taken minutes, hours, or days to retrain a model. In the last decade and especially with LLMs, those training runs are starting to take weeks if not months. As we move into the real-world, these training runs could start taking years before we get the signal we are looking for, even with the help of simulated environments. We won’t want to teach robots to walk again just to find out if we have succeeded in giving them the next ability we are designing for.

LLMs #

To me, there is something uninteresting about LLMs today, and the core issue is that they don’t do things. They respond to things, but they don’t have agency, goals or intentions, they don’t plan, and they don’t have memory. They are not designed to produce actions that affect the state of the world, or change the future. They struggle to learn continuously and adapt. When I see impressive feats of reasoning from LLMs, I find I am usually surprised not at their ability but at how seemingly complex abstract problems actually have statistical solutions, or can be solved through memory.

LLMs train on text, but this is only a highly compressed and coarse view into the world. Multi-modality is here with (static) images, and video will probably follow soon. I expect to see huge progress made on top of LLMs, particularly around introducing agency, giving them memory, or the ability to reason and search, and I would not disagree that there is a lot more still on the table with relatively minor improvements to the architecture, more data, and more scaling.

The idea that LLMs are nearly there - that we just need to feed them video, possibly add symbolic reasoning, give them the ability to plan, real memory, and a few other things - sounds a lot like several major research breakthroughs on the same kind of scale as transformers themselves, not a few small remaining hurdles.

I am instead leaning toward the idea that LLMs are an “off-ramp” to AGI. While it seems likely that LLMs will be paired with foundation models in robotics - perhaps jointly trained and able to leverage the representations learnt by physical agents to improve their understanding of the world - and that together this might create some of our first widely deployed general robotics systems, I think it’s also likely that we will have moved on a long way from today’s LLMs by the time we get there.

Right or wrong, at the same time there are a lot of people and organizations working on LLMs, with the resources required to actually experiment and teams of highly experienced people who are pushing the frontier of that research. On that basis it seems like a field in which I can have a relatively small impact. I’m excited to see how far LLMs go, or even if they go the whole way, but I’m happy to leave that to everyone else working on them.

Meta learning #

Learning to learn. Typically meta-learning approaches are quite specific to certain architectures, but what I really want to talk about is Jeff Clune’s concept of AI-Generating Algorithms (AI-GAs).

Progress in the field relies on human expertise and creativity. Thousands of researchers explore and test new ideas, and occasionally we make a breakthrough. Sometimes we leverage meta-learning along the way, with SOTA results being achieved through neural architecture search in multiple cases.

AI-GAs might give a path towards automating that process itself. It may go beyond neural architecture search, for example by finding the actual learning algorithms. Ultimately any meta-learning needs to be built on top of some “building blocks” for an outer-loop optimizer to work with, and trying to define the space of all building blocks and architectures that definitely contains the target is not an obvious task. It is also computationally expensive, and for a reasonable timeline would depend on simulated environments.
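
A toy sketch of what an outer loop over “building blocks” might look like: a population of candidate configurations, each scored by an expensive inner training loop, with the best kept and mutated. The search space and the evaluation here are placeholders of my own; real AI-GA research aims at far richer spaces, including the learning algorithms themselves.

```python
import random

# Hypothetical search space of "building blocks" - in reality this could
# include architectures, plasticity rules, or whole learning algorithms.
SEARCH_SPACE = {
    "hidden_units": [32, 64, 128, 256],
    "learning_rule": ["sgd", "adam", "hebbian"],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}

def random_candidate():
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def mutate(candidate):
    child = dict(candidate)
    key = random.choice(list(SEARCH_SPACE))
    child[key] = random.choice(SEARCH_SPACE[key])
    return child

def evaluate(candidate):
    # Placeholder for the expensive inner loop: train an agent built from
    # these blocks (likely in simulation) and return its performance.
    # A dummy score is returned here only so the sketch runs end to end.
    return random.random()

def outer_loop(generations=50, population=16, keep=4):
    pop = [random_candidate() for _ in range(population)]
    for _ in range(generations):
        elite = sorted(pop, key=evaluate, reverse=True)[:keep]
        pop = elite + [mutate(random.choice(elite))
                       for _ in range(population - keep)]
    return max(pop, key=evaluate)

print(outer_loop())
```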

Despite the challenges, it seems like an important line of research. Whether it would be a faster path forward than a more manual approach is unclear, but as an individual, the opportunity to automate and scale your own search and discovery process is at the very least going to be an important tool in the journey ahead.

First steps #

This is a high level snapshot of my current understanding, definitions, and areas of interest. I know I have a lot to learn still. I hope and expect to have to re-write this every year, perhaps more frequently at first. In the meantime I’ll continue to work on the following, but if you have suggestions for books/papers/projects then I would love to hear them! I’m trying to be a generalist, and have a good understanding of many areas before becoming an expert in one.

  • Continue to read and consume, understand the areas of research, the people, the big ideas and improve my conceptual framework of the problem.
  • Putting ideas into words is critical for me to consolidate thoughts. I hope to do more of it, summarizing papers, and following up with more in-depth reviews on some of the areas I’ve highlighted above.
  • Get good at building. Improve my own development and research workflows, know my way around the frameworks, and get really, really fast at turning a paper into code. I’d also like to understand the entire stack, from writing CUDA kernels to distributed training, or implementing autograd myself.
  • Learn to learn. My experience so far building software and companies probably gives me some bad habits I’ll need to break. This is a new problem space for me and will require me to critique my own approach to learning.
  • Find the right human environment. Be around people I can learn from and who I can talk with, but also who I can work with.

Hopefully with some progress on the above, I might find I am ready to hit heads down mode, and commit to some line of research for a year or two. But my guess is I’m still 1-2 years out from that.
