2003 Scientific Research and Experimental Development tax credit claim
T661 - Part 2 – Scientific or Technological Project Information
Step 1 - Detailed Project Description
Project Identification: code and name
Project Number – 1
Project Name – Adaptron
Project Type – Basic Research
Subject Area – Artificial Intelligence, Artificial Life
A. What is the Technological objective of the work?
This research aims to simulate human learning and thinking using an artificial neural network (ANN) for pattern recognition with an integrated behaviour network for action selection. The resulting agent is called Adaptron. The objective is to devise an ANN that adds nodes as new experiences are acquired and prunes nodes (forgets them) as new learning replaces old. The research also aims to extend the ANN by connecting recognition events to actions.
The goal is for Adaptron to learn to function in any environment that produces quantized stimuli and obeys a set of deterministic rules whenever actions are performed within it.
Adaptron must begin with the ability to recognize a primitive / non-reducible set of stimuli and the ability to produce a predefined set of primitive actions. It must also be preprogrammed to recognize one subset of its stimuli as rewarding and another, disjoint subset as punishing. With only these predetermined parameters, Adaptron should learn to recognize combinations of the primitive stimuli from its environment and to perform primitive actions, and combinations of primitive actions, so as to minimize punishing stimuli and maximize rewarding stimuli.
Thus Adaptron must be able to “live” in an artificial environment in which it can sense the stimuli and produce actions. The environment must be 100% deterministic: from any given initial state, a given action performed by Adaptron must always result in the same final state. All detectable dimensions of the environment must be 100% discrete; there are no continuously measurable quantities. The environment cannot change unless Adaptron performs an action, i.e. there are no other agents in the environment changing its state. The environment must produce rewarding and punishing stimuli. Designing Adaptron to live in a continuous, changing and noisy environment is a goal for subsequent research projects.
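The environmental constraints described above can be illustrated with a small sketch. This is not one of the Adaptron test environments; the corridor layout, the class name GridWorld, the action names and the stimulus labels are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GridWorld:
    """Hypothetical 1-D corridor with discrete positions 0..size-1.

    Assumption for illustration: the right end is rewarding and the
    left end is punishing. Nothing changes unless the agent acts.
    """
    size: int = 5

    def step(self, state: int, action: str) -> Tuple[int, str]:
        """Return (next_state, stimulus) for a state and an action.

        The result depends only on the arguments, so the same state and
        action always give the same outcome (determinism).
        """
        if action == "left":
            nxt = max(0, state - 1)
        elif action == "right":
            nxt = min(self.size - 1, state + 1)
        else:
            raise ValueError(f"unknown primitive action: {action}")

        if nxt == self.size - 1:
            stimulus = "reward"
        elif nxt == 0:
            stimulus = "punish"
        else:
            stimulus = f"cell{nxt}"  # neutral, quantized stimulus
        return nxt, stimulus
```

Calling step twice with the same state and action returns the same pair, which is the determinism property the environment is required to have.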
When the research has proven that the theories are correct and the software design is viable, Adaptron Inc. plans to promote the software for imbedding in robots and control systems.
B. How would the success of this work advance the technology?
Existing ANNs are built from a fixed number of nodes, and the weights on the connections between these nodes are adjusted as the network is trained to recognize a set of input stimuli. Devising an ANN that grows, i.e. adds nodes as it encounters new stimuli, as a means of learning has not been attempted. Such an ANN could be used in an adaptive control system without having to reprogram it with a new set of nodes.
Existing behaviour networks are designed to be general-purpose networks with learning rules imbedded by the developer. They have not been integrated with ANNs nor constrained by their topology and learning algorithms. Successful integration of behaviour networks with ANNs that grow should result in an adaptive system that can build increasingly complicated hierarchical behaviour networks.
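The idea of a network that adds nodes when it meets new stimuli can be sketched as follows. This is only an illustration under stated assumptions, not the Adaptron design: the class GrowingNet, the pairing rule (each new node joins the node recognized so far with the next primitive stimulus) and the integer node ids are all hypothetical.

```python
from typing import Dict, Hashable, List, Tuple


class GrowingNet:
    """Hypothetical store of recognition nodes that grows on novelty."""

    def __init__(self, primitives: List[str]) -> None:
        # One node per primitive stimulus; ids are assigned in order.
        self.nodes: Dict[Hashable, int] = {}
        for p in primitives:
            self.nodes[("prim", p)] = len(self.nodes)

    def recognise(self, stimuli: List[str]) -> Tuple[int, int]:
        """Return (node id for the whole sequence, number of nodes added).

        Assumes a non-empty sequence of known primitive stimuli.
        """
        added = 0
        current = self.nodes[("prim", stimuli[0])]
        for s in stimuli[1:]:
            nxt = self.nodes[("prim", s)]
            key = ("pair", current, nxt)
            if key not in self.nodes:
                # Novel combination: grow the network by one node.
                self.nodes[key] = len(self.nodes)
                added += 1
            current = self.nodes[key]
        return current, added
```

In this sketch a sequence seen for the first time adds nodes, while a repeated sequence is recognized by the existing nodes and adds none. Pruning of old nodes is deliberately left out.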
An advanced example of where Adaptron might be used is the control of the lights at an intersection. Possible sources of stimuli would be television cameras pointing at the roads and sidewalks involved, pedestrian crossing buttons, vehicle detection wires under the pavement, and a clock. Adaptron would be trained to control the lights for optimum traffic flow and to handle complicated situations such as slow pedestrians, accidents and emergency vehicles. Once the learning is complete, the ability to continue learning would be disabled and the resulting control software and experience would be imbedded in an actual intersection control unit.
The area of robotics is also a prime candidate for advancement through the use of Adaptron. Artificial Intelligence research into robots has been progressing for several decades and has achieved many successes.
For example, projects such as CYC have accumulated large databases of common sense knowledge that could be imbedded in a robot for it to operate sensibly in the real world, but this knowledge has primarily been input by its developers rather than learnt. This knowledge is also not grounded in, or related to, the stimuli that a robot would experience.
Another example is Brooks’s robots, such as Cog, which are well grounded in the environment but do not have the ability to combine learnt behaviour into more complicated behaviour, i.e. they do not scale well.
Many robots have been developed to perform specific tasks in very narrow environments and perform some learning, but they cannot handle general-purpose situations. The scientific advancement that Adaptron aims to accomplish is general-purpose learning and thinking software that can be imbedded in a robot such that it can learn all its knowledge and operate in any environment in which it is placed.
C. Explain the scientific or technological uncertainty needing resolution for the advancement in B.
An ANN that grows hierarchically as it learns to recognize more complicated patterns of stimuli has not been invented. It is uncertain how the growth should be controlled. It is also unknown whether any node in an ANN can be used to trigger actions. With current fixed-node ANNs the actions would only be associated with final recognition nodes, not hidden nodes. With a growing ANN the actions will end up associated with hidden nodes. Strategies for using novelty, familiarity, punishment and reward as feedback in guiding the growth of the ANN must also be discovered.
Of even greater scientific uncertainty is how thinking can be introduced into the ANN. Based on the idea that thinking is a stream of expectations that model experienced stimuli in a goal-directed fashion, various processes based on signaling between the nodes need to be invented and tested.
D. Describe the work to be undertaken this year to resolve the above technical problem.
Adaptron’s success at learning and thinking will be determined by observing its actions in test environments and by inspecting its internal memory traces and processes. The levels of learning and thinking to be achieved are:
- Recognition of a series of stimuli.
- Performance of actions to obtain novel stimuli and avoid boring situations.
- Performance of actions to obtain rewarding stimuli and avoid punishing situations (learning).
- Aggregation of stimuli recognized and actions performed into new compound actions that can be automatically performed (habitualization).
- Internal reproduction of past experiences in order to decide whether to perform them (thinking).
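The second and third levels in the list above (seeking novel and rewarding stimuli, avoiding boring and punishing ones) can be illustrated with a minimal action-selection sketch. It is not the Adaptron algorithm: the function name choose_action, the memory table keyed by (state, action) and the numeric scores, including the choice to rank reward above novelty, are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple


def choose_action(
    state: str,
    actions: List[str],
    memory: Dict[Tuple[str, str], str],
    rewarding: Set[str],
    punishing: Set[str],
) -> str:
    """Pick the action with the best expected outcome from memory.

    memory maps (state, action) to the stimulus last observed after
    performing that action in that state. Scores are arbitrary:
    rewarding = 2, untried (novel) = 1, familiar neutral = 0,
    punishing = -1.
    """
    def score(action: str) -> int:
        outcome = memory.get((state, action))
        if outcome is None:
            return 1   # never tried here: expected to be novel
        if outcome in rewarding:
            return 2
        if outcome in punishing:
            return -1
        return 0       # familiar and neutral, i.e. "boring"

    # max() returns the first best action, so ties resolve deterministically.
    return max(actions, key=score)
```

With an empty memory every action counts as novel, so the first action is tried; as outcomes are recorded, punishing actions are avoided and rewarding ones preferred.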
Initial testing of Adaptron will be done in artificial environments simulated in software.
E. List the research deliverables generated.
Adaptron Inc. acquired the research notes and multiple versions of the Adaptron simulation software written in BASIC from Brett Martensen in January 2003.
From Jan 1st to June 30th 2003
Research notes added to 1998 Notebook started 3rd May 1998:
- Design of nodes to improve recognition of sequential stimuli,
- Design of nodes to learn action sequences,
- Descriptions of suitable environments for testing purposes,
- Ideas for recognition of simultaneous stimuli.
Versions of the Adaptron software that were developed are named:
Screen captures of the memory dumps of test runs of Adaptron have been kept.
A logbook is kept of the experiments performed and the time spent doing research by the specified employee.