2 Walknet

ReaCog is an expansion of a control system that has been realized as a neural network. The underlying system has been termed Walknet. Walknet is biologically inspired and is supposed to describe the results of many behavioral studies on the walking behavior of stick insects (Dürr et al. 2004; Schilling et al. 2013b). We will briefly sketch the properties of the network as far as is required for understanding the basic abilities considered here.

Overall, the controller has to deal with the difficult task of coordinating multiple degrees of freedom (DoFs); in the case of the hexapod walker, the body has twenty-two DoFs. There are three DoFs for each of the six legs, and an additional four DoFs are present between the body segments. The system is redundant, as only six DoFs are needed to define a position and orientation in three-dimensional space. The controller therefore has to deal with sixteen extra DoFs.

The architecture of the Walknet controller is decentralized. Each leg has an individual and more or less independent controller that decides which action to choose (only two of the six leg-controllers are shown in figure 2, the black boxes in the lower part). A single leg controller consists of several procedures. In the figure, each procedure is represented as a single black box. In the basic system, the two important behaviors a leg can perform are the swing and the stance movement. The procedures themselves are realized as artificial recurrent neural networks (RNNs). Examples are the two basic procedures: the “Swing-net”, which controls the swing movement, and the “Stance-net”, which controls the stance movement of the leg. These networks constitute the procedural memory of the system. The procedural modules receive direct sensory input and provide motor control commands as output, but there are also modules that provide input to another module. The controller on the leg level determines which procedure should be active at any given time, depending on the current state of the leg (swing or stance) as well as on sensory inputs (ground contact, position). In addition, controllers of neighboring legs can influence each other through a small number of connections between those controllers. These influences are explicitly derived from experiments on the coordination of legs in walking stick insects.
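The DoF count and the leg-level selection between swing and stance can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the Walknet implementation: the class name `LegController`, the threshold `pep` (a hypothetical posterior extreme position) and the two switching rules are chosen only to show how the current state and the sensory inputs (ground contact, position) could jointly determine the active procedure.

```python
from dataclasses import dataclass

# DoF bookkeeping as given in the text.
LEGS = 6
DOF_PER_LEG = 3
BODY_SEGMENT_DOF = 4
TOTAL_DOF = LEGS * DOF_PER_LEG + BODY_SEGMENT_DOF  # 22
REDUNDANT_DOF = TOTAL_DOF - 6                      # 16 extra DoFs


@dataclass
class LegController:
    """Hypothetical leg-level selector between the swing and stance procedures."""
    state: str = "stance"
    pep: float = -1.0  # assumed posterior extreme position (arbitrary units)

    def select(self, ground_contact: bool, position: float) -> str:
        # Assumed rule 1: ground contact during swing starts the stance movement.
        if self.state == "swing" and ground_contact:
            self.state = "stance"
        # Assumed rule 2: reaching the posterior limit during stance starts a swing.
        elif self.state == "stance" and position <= self.pep:
            self.state = "swing"
        return self.state


leg = LegController()
history = [
    leg.select(ground_contact=True, position=0.0),    # remains in stance
    leg.select(ground_contact=True, position=-1.2),   # posterior limit reached
    leg.select(ground_contact=False, position=-0.5),  # leg is in the air
    leg.select(ground_contact=True, position=0.8),    # touchdown
]
```

In this sketch each leg holds its own state, so six independent instances would be needed for a hexapod; the coupling between neighboring legs described above is deliberately left out.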

As was found in the insects, during the swing movement (protraction) the legs aim towards a position at the front, close to the position of the anterior leg. Therefore, each leg possesses a so-called “target net” in order to produce these targeted movements. During forward walking the so-called “Target_fw-net” is responsible for this targeting. During backward walking “Target_bw-net” is used. Both directly influence the Swing-net. Procedures marked as blue boxes (“body model”, “leg model”) will be explained below (section 3a).

ReaCog is expanded by an RNN, which consists of motivation units (figure 2, marked in red). This network allows the system to autonomously select one of the different possible behaviors. For example, the system may choose between forward walking, backward walking, and standing. A motivation unit is an artificial neuron with a linear summation of its inputs and a piecewise linear activation function, with output values ranging from zero to one. Applied to a procedure, for example Swing-net, a motivation unit determines the strength of the output of the corresponding procedural network (in a multiplicative way). As mentioned above, motivation units form a recurrent neural network and can influence each other through excitatory or inhibitory connections (as shown in figure 2).
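The behavior of a single motivation unit can be illustrated with a minimal Python sketch. The function names, the weights and the absence of a bias term are assumptions made for illustration; they are not taken from the Walknet or ReaCog implementation.

```python
def piecewise_linear(x: float) -> float:
    """Piecewise linear activation: identity between 0 and 1, clipped outside."""
    return min(1.0, max(0.0, x))


def motivation_activation(inputs, weights) -> float:
    """Linear summation of the weighted inputs followed by the clipped activation."""
    return piecewise_linear(sum(w * x for w, x in zip(weights, inputs)))


def gated_output(motivation: float, procedure_output):
    """Scale the output of a procedure (e.g. Swing-net) multiplicatively."""
    return [motivation * value for value in procedure_output]


# Hypothetical example: one excitatory input (weight 1.0) and one
# inhibitory input (weight -1.0) drive the unit to half activation.
m = motivation_activation(inputs=[1.0, 0.5], weights=[1.0, -1.0])
joint_commands = gated_output(m, [0.2, -0.4, 0.8])
```

With a motivation of zero the procedure is effectively switched off, and with a motivation of one its output passes unchanged, which is the sense in which the unit determines the strength of the procedure's output.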

In addition, there are sensory units that are part of this RNN and that can directly influence the motivation units’ activation, e.g., as shown in figure 2 for the “lower-level” units for Swing and Stance. There, an active ground-contact sensor of a leg reinforces the stance motivation unit for this leg. As the motivation unit network can be arbitrarily expanded, it allows the control of complex behaviors. Only a small group of behaviors is illustrated here: units such as “walk”, “fw” (forward), “bw” (backward), and “leg1” are depicted (for more examples see Schilling et al. 2013b; Cruse & Wehner 2011).
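How a sensory unit could shift the balance between the Swing and Stance units of one leg is sketched below in Python. This is a toy model under assumptions: the synchronous update rule, the mutual inhibition between the two units and the weights `w_sensor` and `w_inhib` are hypothetical and are not the published connection strengths.

```python
def clip01(x: float) -> float:
    """Piecewise linear activation limited to the range [0, 1]."""
    return min(1.0, max(0.0, x))


def update_leg_units(swing, stance, ground_contact, w_sensor=2.0, w_inhib=1.0):
    """One synchronous update of the swing and stance motivation units of a leg.

    Assumptions: each unit sustains itself, the two units inhibit each other,
    and the ground-contact sensory unit excites only the stance unit.
    """
    sensor = 1.0 if ground_contact else 0.0
    new_stance = clip01(stance + w_sensor * sensor - w_inhib * swing)
    new_swing = clip01(swing - w_inhib * stance)
    return new_swing, new_stance


# The leg is swinging (swing=1, stance=0) when ground contact is detected.
swing, stance = 1.0, 0.0
for _ in range(2):
    swing, stance = update_leg_units(swing, stance, ground_contact=True)
```

In this toy model the stance unit is driven up by the sensor on the first update and suppresses the swing unit on the second; without ground contact the swing state simply persists.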

The network of motivation and sensory units does not have to form a simple, tree-like structure (see figure 2). It can constitute a heterarchy. Motivation units can be bi-directionally connected through positive (arrowheads) and negative (T-shaped heads) connections. As shown in the figure, this can lead to cycles. There are also different overlapping subnetworks, e.g., the “leg” units as well as the motivation unit for “walk” are active during both backward and forward walking. But only one unit indicating the direction of walking can be active at any given time, i.e., either the unit “fw” or the unit “bw”. As a consequence, multiple stable attractor states are formed through the combinations of excitatory and inhibitory connections. These stable “internal states” stabilize the behavior of the overall control system, as the system cannot be easily disturbed solely through inappropriate sensory inputs. For example, sensory inputs are treated differently depending on the current state (swing or stance) of the control system, and these internal states can be differentiated on a higher level, e.g., into walking, standing, or feeding (for details see Schilling et al. 2013a; Schilling et al. 2013b).
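The emergence of stable attractor states from excitatory and inhibitory connections can be illustrated with a three-unit toy network in Python. The weight matrix `W`, the discrete update rule and the initial activations are assumptions chosen for illustration only; they are not the weights used in ReaCog.

```python
def clip01(x: float) -> float:
    """Piecewise linear activation limited to the range [0, 1]."""
    return min(1.0, max(0.0, x))


UNITS = ["walk", "fw", "bw"]

# Hypothetical weights, W[i][j] is the connection from unit j to unit i:
# every unit sustains itself, "walk" excites both direction units,
# and "fw" and "bw" inhibit each other.
W = [
    [1.0, 0.0, 0.0],   # walk
    [0.5, 1.0, -1.0],  # fw
    [0.5, -1.0, 1.0],  # bw
]


def step(a, external=(0.0, 0.0, 0.0)):
    """One synchronous update: weighted sum plus external input, then clipping."""
    return [
        clip01(sum(w * x for w, x in zip(row, a)) + e)
        for row, e in zip(W, external)
    ]


def settle(a, steps=20):
    """Iterate the network without external input."""
    for _ in range(steps):
        a = step(a)
    return a


forward = settle([1.0, 0.6, 0.4])   # slight initial preference for "fw"
backward = settle([1.0, 0.4, 0.6])  # slight initial preference for "bw"

# A brief, inappropriate input to "bw" does not dislodge the forward state.
disturbed = step(forward, external=(0.0, 0.0, 0.3))
```

In this sketch a small initial preference is amplified until exactly one direction unit is fully active while “walk” remains active in both cases, mirroring the overlapping subnetworks described above. The last line shows the stabilizing effect: the inhibition from the active “fw” unit outweighs the transient input to “bw”.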

Figure 3: Step pattern arising from the decentralized leg-controllers connected by local rules and the environment. Abscissa is time; black bars indicate swing movement; the gaps represent stance movement of this leg (from top to bottom: front left leg (FL), middle left leg (ML), hind left leg (HL), and correspondingly front right leg (FR), middle right leg (MR) and hind right leg (HR) for the right side). The lower bars indicate 500 iterations corresponding to 5 s real time. These “foot-fall patterns” show various locally or globally stable patterns depending on walking velocity (a: slow, b: fast) and on starting position. In (a) the legs start with an “uncomfortable” leg configuration leading to a gallop-like pattern (indicated by the vertical ellipses) that after about six steps changes to the globally stable pattern, typical for slow insect walking (see inclined ellipses, step # 8). (b) shows fast walking leading to a tripod gait characterized by synchronous swing movements of ML, FR, HR and FL, HL, MR (see vertical ellipses).

For an RNN, maintaining a stable state is a non-trivial problem, in particular when there are various disturbances. To illustrate the adaptability and, at the same time, the stability of the behavior controlled by such a motivation unit network, figure 3 shows two cases of hexapod walking. Figure 3a shows an example of slow walking in which the legs begin from a difficult starting configuration (both front legs, both middle legs and both hind legs start from the same position, which is contrary to the coordination found in normal walking, where opposite legs alternate). Nonetheless, the agent is able to walk. After some steps, the agent reaches a temporally stable pattern corresponding to normal walking. Figure 3b shows a step pattern corresponding to high-speed walking, often termed “tripod gait”. Although this gait is usually considered to be a regular pattern, detailed inspection shows that there are local temporal variations, while the overall pattern remains stable (for videos of further walking examples see Schilling et al. 2013b). It is important to note that none of these step patterns is explicitly implemented; they arise as emergent properties (for details see Schilling et al. 2013a). As another impressive emergent property, Bläsing (2006) showed that, with some minor extensions, this walker is able to climb over large obstacles (which can be more than twice the normal step-width).