Controlling an octopus arm
Task Description
The goal of this task is to control an octopus arm in order to reach a desired goal region.
Additionally, the agent is required to reach the goal in minimum time.
The simulator offers a graphical display for visualization, which can be turned off during training.
A screenshot (with the two red dots representing targets) is shown below.
Files
The main binary distribution package contains all files needed to run the simulator and to create and run agents.
Template agents are provided in C, Java, and Python.
As the simulator is written in Java, there is only one download for all platforms.
Please refer to the included readme file for specific system requirements and usage instructions.
Additional documents:
Source code
The files above are sufficient to use both the simulator and the random agents. However, we also provide
the source code, for anyone interested:
- octopus-environment-src.zip (last updated May 6, 2006)
This is the source code for the environment in the distribution above, in the same directory structure.
- octopus-agent-src.zip
This is the source code for the Java agent, including all of its classes. Note that the only
code that needs to be modified is already provided in the distribution.
The source code for the C and Python agents is already in the distribution as well.
Support
For support, please e-mail Danny Castonguay (dan.cast at mail dot mcgill dot ca) or Eric Vigeant (evigea at cim dot mcgill dot ca), and cc Doina Precup (dprecup at cs dot mcgill dot ca).
Possible extensions
The current code distribution allows for two types of tasks to be defined:
- Reaching. In these tasks, a set of targets (visualized as red dots) is specified. The agent can be required
to touch any one of the targets, or to touch the targets in a sequence. Several different target sequences
can be specified as well. There is a penalty per time step (minimum-energy tasks may be added in the future).
- Feeding. This is a more challenging task. The agent must make the arm push several pieces of food into
an elliptical mouth. Each piece of food offers a specific reward when eaten.
A screenshot is below; the beige circles are food, while the black region is the mouth.
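To make the two task types more concrete, here is a minimal Python sketch of how their rewards could be computed. It is an illustration only, not the simulator's code: the names ReachingTask, touches, in_mouth and feeding_reward, the constants STEP_PENALTY, TOUCH_RADIUS and goal_reward, and the simplification that a single arm point is tested against each target are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

STEP_PENALTY = -0.01  # hypothetical per-time-step penalty
TOUCH_RADIUS = 0.1    # hypothetical distance at which a target counts as touched


def touches(tip: Point, target: Point, radius: float = TOUCH_RADIUS) -> bool:
    """True if the (assumed) arm point lies within `radius` of the target."""
    dx = tip[0] - target[0]
    dy = tip[1] - target[1]
    return dx * dx + dy * dy <= radius * radius


@dataclass
class ReachingTask:
    """Sketch of a reaching task: touch any target, or all targets in order."""
    targets: List[Point]
    in_sequence: bool = False
    goal_reward: float = 10.0  # hypothetical bonus for completing the task
    next_index: int = 0        # next target to touch when in_sequence is True

    def step(self, tip: Point) -> Tuple[float, bool]:
        """Return (reward, done) for one time step."""
        reward = STEP_PENALTY  # every step costs something, favoring fast solutions
        if self.in_sequence:
            if touches(tip, self.targets[self.next_index]):
                self.next_index += 1
            done = self.next_index == len(self.targets)
        else:
            done = any(touches(tip, t) for t in self.targets)
        if done:
            reward += self.goal_reward
        return reward, done


def in_mouth(food: Point, center: Point, semi_axes: Point) -> bool:
    """True if a food position lies inside an axis-aligned elliptical mouth."""
    nx = (food[0] - center[0]) / semi_axes[0]
    ny = (food[1] - center[1]) / semi_axes[1]
    return nx * nx + ny * ny <= 1.0


def feeding_reward(foods: List[Tuple[Point, float]],
                   center: Point, semi_axes: Point) -> float:
    """Sum the values of all (position, value) food pieces inside the mouth."""
    return sum(value for pos, value in foods if in_mouth(pos, center, semi_axes))
```

Under these assumptions, a per-step penalty means that an agent maximizing its total reward is also minimizing the time taken to reach the goal. The actual reward values and task geometry are set in the XML file shipped with the distribution.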
These tasks can be configured using the XML file provided with the distribution above. We would be grateful
for feedback regarding these tasks.