rlpAI Artificial Intelligence Project

Last Updated: 4/11/2019
Status: In Progress


Develop an Artificial General Intelligence (AGI) agent that can learn to function in a continuous, real-time environment through various means of training, using a general architecture that is not specific to any task, ability, or robot design.  The goal is not human-level intelligence, but rather the highest-functioning general intelligence achievable on modest hardware.


The agent should be able to make sense of itself and its environment quickly – similar to a human or animal.

The agent should be able to develop a broad understanding of the world through experiences and interactions in the environment.

The agent should be able to learn how to accomplish tasks which are fundamentally different from each other, and retain the skills learned.

The agent should be able to use relevant knowledge learned from prior experiences in order to solve new problems.

The agent should be able to learn how to communicate with a human (using virtual reality) or another agent in the environment.

General Description

The agent is general: it is designed to learn to function in any arbitrary, unknown environment.  In other words, the agent code is independent of the environment.  It doesn't have any prior knowledge of the world or of itself at startup.

The agent learns from experiencing the world as it finds ways to meet objectives.   It determines the best actions to take based on what has been learned.  It must learn everything it needs to know to operate.

Since the world is open and continuous, the agent is free to take any one of an infinite number of paths to anywhere.  Therefore, curriculum training with objectives is used to guide the agent to learning opportunities where it can discover and learn new things.
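As a concrete illustration of curriculum training, here is a minimal sketch (all names are hypothetical; `Objective` and `run_curriculum` are not from the actual codebase) in which objectives are presented in order and the agent acts until each one is met.  Note that there is no reward signal: the objective only tells the trainer when to move on.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Objective:
    """A training objective: a name plus a completion check on agent state."""
    name: str
    is_met: Callable[[Dict], bool]


def run_curriculum(state: Dict, objectives: List[Objective], act) -> List[str]:
    """Present objectives in sequence; the agent steps until each is met.

    No reward is handed to the agent -- the objective only gates when the
    curriculum advances to the next learning opportunity.
    """
    completed = []
    for obj in objectives:
        while not obj.is_met(state):
            act(state, obj)  # one agent step toward the current objective
        completed.append(obj.name)
    return completed


# Toy usage: an 'agent' that just walks along x toward increasing targets.
state = {"x": 0}
curriculum = [
    Objective("reach_x3", lambda s: s["x"] >= 3),
    Objective("reach_x5", lambda s: s["x"] >= 5),
]
print(run_curriculum(state, curriculum, lambda s, obj: s.update(x=s["x"] + 1)))
# prints ['reach_x3', 'reach_x5']
```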

The environment runs in real time and is indifferent to any agents who come into play.  Since the agent runs asynchronously, it has to be able to think fast enough to keep up with the environment.
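A minimal sketch of that asynchrony (hypothetical names; `RealTimeEnv` is a toy stand-in for the actual simulator): the environment advances on its own clock in one thread, while the agent samples whatever frame is current in another.  A slow agent skips frames; it never pauses the world.

```python
import threading
import time


class RealTimeEnv:
    """Toy environment that advances on its own clock, whether or not any
    agent is ready to act (illustrative sketch, not the actual simulator)."""

    def __init__(self, hz: float = 100.0):
        self.dt = 1.0 / hz
        self.tick = 0
        self._lock = threading.Lock()
        self.running = True

    def run(self, duration: float) -> None:
        """Advance the world unconditionally for `duration` seconds."""
        end = time.monotonic() + duration
        while time.monotonic() < end:
            with self._lock:
                self.tick += 1  # the world moves on no matter what
            time.sleep(self.dt)
        self.running = False

    def latest(self) -> int:
        """The agent always sees the newest frame, never a queued backlog."""
        with self._lock:
            return self.tick


def agent_loop(env: RealTimeEnv, seen: list) -> None:
    """Agent thread: think, then sample whatever frame is current.  Slow
    thinking just means skipped frames, never a paused environment."""
    while env.running:
        seen.append(env.latest())
        time.sleep(0.01)  # simulated 'thinking' time
```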

What This Is Not

This agent is not a reinforcement-learning agent, a machine-learning algorithm, a deep-learning system, or an evolutionary algorithm.  It uses many elements from the field of AI, but this is an AGI system, not a narrow AI algorithm.

Development Approach

The AI has to be developed bottom-up, with questions like: If the agent were able to do what that baby monkey just did, what functionality would it need to support that?

Observe the behavior of people and animals, especially babies, and reverse-engineer these behaviors down to basic functionality.  Then test the functionality by considering a set of very different tasks, such as: Tic-Tac-Toe, navigation to a waypoint in an unknown physical environment, a task with multiple prerequisites, a classic control problem, playing tag with another agent, and learning to walk.  Try to understand how the architecture will support the data and processes of these tasks, keeping in mind that the software can't be written for any of these tasks specifically.

Developing from the top down misses the intent of this development activity.  It isn't helpful for the agent to be able to perform advanced tasks (such as playing video games or similar) if it can't understand what it's doing.  Basic functionality is king.

For example, it should eventually learn to play the game Breakout, but not by running a CNN plus a reinforcement-learning algorithm for 100,000 or more episodes.  It needs to understand what a game is, what blocks are, what a ball is, how a ball interacts with a paddle, how to reset a game, and what a score is, and it needs to do all of this visually.  Having already learned these basics through other experiences, it should be able to learn how to play Breakout, and any game for that matter, relatively quickly.

Additional sources of inspiration: psychology, introspection, insects, other AGI projects.

Design Philosophy

The software architecture is the piping for the data to run through.  It’s the flow of data and the structures built at run-time that will ultimately constitute intelligence, not the architecture.  The architecture has to provide specific basic low-level functionality, but generally just supports the emergent growth of the AI as it has experiences.

The world is made of things, so the agent must be designed around the ability to deal with things:  objects, ideas, events, etc.

Basic functionality allows cognitive capabilities.
Cognitive capabilities allow intelligence.
Intelligence allows intentional behaviors.
Intentional behaviors allow meaningful interactions with the world.


The Environment

The primary environment is intended to be a reality simulator capable of providing a rich sensory experience to the agent.  It is continuous and, for all intents and purposes, infinite in size.  It has the following characteristics:

  • 3 dimensional, continuous, open, real-time
  • Simulates basic physics and collisions
  • Can simulate an unlimited variety of objects
  • Can include sub-environments within the environment (such as games)

Note that the environment does not give the agent a reward like a standard reinforcement learning environment would.  There is no reward channel.  How would the environment know what a reward is or is not?

Also note that unlike typical reinforcement learning environments, this one doesn't wait for the agent to decide what to do.  The environment runs in real time no matter what – it's not synchronous with the agents (post v0.4).
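A sketch of what the missing reward channel looks like in practice (function names are illustrative, not the project's actual API): the environment exposes raw sensor readings only, and anything reward-like has to be constructed inside the agent from what it senses.

```python
def sense(world: dict) -> dict:
    """Observation-only channel: raw sensor readings, never a reward field.

    Contrast with a Gym-style step() that returns (obs, reward, done, info).
    """
    return {"vision": world["pixels"], "battery": world["battery"]}


def appraise(obs: dict, prev_obs: dict) -> float:
    """Illustrative agent-internal valuation: the agent itself decides that
    keeping its battery charged is good -- the environment never says so."""
    return obs["battery"] - prev_obs["battery"]


# Toy usage
world = {"pixels": [0, 1, 0], "battery": 0.50}
obs = sense(world)
assert "reward" not in obs  # the environment offers no reward channel
```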

Desirable Agent Characteristics

There are several important characteristics which the agent needs to possess:

    • Multi-Domain Knowledge:  Learns how to move, navigate, play different games, charge its battery, communicate, etc., without forgetting what it learned.
    • Learns Quickly:  It's not practical for an AGI to require tens or hundreds of thousands of epochs in order to learn how to do something.  Learning should be mostly online, and one-shot where that makes sense.
    • Hierarchical Learning:  It has to learn in a manner which supports knowledge and skill sharing across domains and activities.
    • Self-Motivated:  If not actively pursuing a goal, the agent should prefer to work on improving its skills and its knowledge of the world.  A significant fraction of its knowledge will be learned in an unsupervised manner.
    • Trainable:  The agent should be trainable through various means, including curriculum training, training by demonstration, reinforcement rewards, and others.
    • Realistic Sensors:  The sensors included in the robot need to be something which could exist in the real world such as vision cameras, GPS, contact sensors, etc.  The sensors cannot provide information to the agent which would typically be considered as unknowable or unobservable in the real world.
    • Real-Time Processing:  The agent needs to be able to operate in an environment with other agents and humans, where physics operates in real time.  The environment needs to be able to run fast enough that virtual reality isn't hindered by frame rate.