Re: poly: Software for superintelligence

From: Nick Bostrom <>
Date: Fri Jan 23 1998 - 12:42:35 PST

Anders Sandberg wrote:
> At least I would love to hear it, since my field of research is memory
> consolidation using Hebbian learning right now.

:-). Ok, I've cut out a passage that contains the core of the idea.
The rest is available at It
was written two years ago so it's time to update it in the light of
new data.

[As we recall, the task is to store complex representations
(representations with a structure) in long-term memory after a
one-shot presentation using only neurobiologically realistic
mechanisms, in this case Hebbian learning.]

Integration through annexation.
In order to integrate the patterns A, B, and C into a complex
whole, simply clamp them on adjacent sections of an ANN! This way the
complex representation is stored after one shot, and the synaptic
mechanisms that support this process are the well-known phenomena of
Hebbian short- and long-term potentiation. The memory trace is
distributed, robust and manifests graceful degradation. And it is
content addressable.
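
The annexation idea can be sketched with a toy Hopfield-style net.
This is only an illustration under stated assumptions: the symbols A,
B and C are hypothetical 8-unit +/-1 codes (rows of a Hadamard matrix,
chosen so that they are mutually orthogonal), storage is a single
Hebbian outer-product step, and recall clamps the cue section while
all other units update synchronously. The names hebbian_store and
recall are made up for the example.

```python
import numpy as np

# Hypothetical toy symbols: mutually orthogonal +/-1 codes of length 8
# (rows of a Hadamard matrix). Real cortical patterns would be noisier.
H2 = np.array([[1, 1], [1, -1]])
H8 = np.kron(np.kron(H2, H2), H2)
A, B, C = H8[1], H8[2], H8[3]
SEG = 8  # units per section

def hebbian_store(patterns):
    """One-shot Hebbian (outer-product) weights without self-connections."""
    W = sum(np.outer(p, p) for p in patterns)
    np.fill_diagonal(W, 0)
    return W

def recall(W, cue, clamped, steps=5):
    """Hold the clamped units fixed; let all other units follow their input."""
    s = cue.copy()
    for _ in range(steps):
        s = np.where(clamped, s, np.sign(W @ s)).astype(int)
    return s

ABC = np.concatenate([A, B, C])
W = hebbian_store([ABC])  # stored after a single presentation

# Content addressing: clamp only A on the left section, leave the rest silent.
cue = np.concatenate([A, np.zeros(2 * SEG, dtype=int)])
clamped = np.arange(3 * SEG) < SEG
recalled = recall(W, cue, clamped)
print(np.array_equal(recalled, ABC))  # prints True
```

With orthogonal codes the crosstalk is exactly zero, which is why the
recall is exact here; with random or overlapping codes it would only
be approximate.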

Suppose now that we want to store the pattern CBA in the same memory
compartment where we stored ABC. Will this incur the risk that ABA
or CBC is retrieved when B is clamped to the middle section? Not if
the ANN is fully connected, or if there are sufficiently strong
connections between the left and the right sections. There are many
cortical areas that satisfy this requirement, even for complex
representations much longer and bigger than a triple, at least if the
constituent patterns are not too big. They need not be. In principle
they could be just symbols of concepts whose full meaning and content
were stored elsewhere.
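
The role of the left-right connections can be checked in a toy model.
The sketch below is an illustration, not a cortical model: it assumes
hypothetical orthogonal 8-unit codes for A, B and C, Hebbian
outer-product storage of ABC and CBA, and a helper is_stable (a name
invented here) that tests whether a state is a fixed point of the
sign-update dynamics. It compares a fully connected net with a copy in
which the connections between the left and right sections are removed.

```python
import numpy as np

# Hypothetical toy symbols: mutually orthogonal +/-1 codes of length 8.
H2 = np.array([[1, 1], [1, -1]])
H8 = np.kron(np.kron(H2, H2), H2)
A, B, C = H8[1], H8[2], H8[3]
SEG = 8  # units per section

def hebbian_store(patterns):
    """One-shot Hebbian (outer-product) weights without self-connections."""
    W = sum(np.outer(p, p) for p in patterns)
    np.fill_diagonal(W, 0)
    return W

def is_stable(W, s):
    """True if no unit wants to flip, i.e. s is a fixed point."""
    return bool(np.all(np.sign(W @ s) == s))

ABC = np.concatenate([A, B, C])
CBA = np.concatenate([C, B, A])
ABA = np.concatenate([A, B, A])  # the unwanted blend

W_full = hebbian_store([ABC, CBA])

# Same weights, but with the left<->right connections removed.
W_cut = W_full.copy()
W_cut[:SEG, 2 * SEG:] = 0
W_cut[2 * SEG:, :SEG] = 0

print(is_stable(W_full, ABC), is_stable(W_full, CBA))  # both stored memories hold
print(is_stable(W_full, ABA))  # blend is not an attractor when fully connected
print(is_stable(W_cut, ABA))   # blend becomes a spurious attractor when cut
```

In this toy setting the blend ABA is unstable only because the left
and right sections can "see" each other, which is the point made above.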

For example, take the thought "Dogs can swim." The concept of "dogs"
presumably contains the essentials of the whole lot of things the
subject knows about dogs; and likewise for the concept "can swim". So
a person's grasp of these concepts must involve a vast number of
nodes and connections. But this knowledge needs to be represented only
once. It does not need to be explicit in the representation of "Dogs
can swim." It would, in principle, suffice if the pattern DS were
laid down in an ANN, presuming that there is a mechanism at hand that
can activate the "dog" concept in response to the pattern D, and the
"can swim" concept in response to S. And the pattern DS could be
stored by a very small number of neurons.

This is not to be taken literally as a theory of concept
representations in the brain, but only as an illustration of the fact
that the full representation of a concept need not be repeated in
every ANN representation of a thought. A more realistic theory might
perhaps start with the assumption that there are, in general, no
separate representations of the conceptual content; there are only
the concept symbols that occur in individual thoughts and beliefs:
and the concept is nothing but the contribution this symbol makes to
the belief representations wherein it occurs. A special status might
then be granted to concepts that are directly grounded in perception;
and so on. But it is clearly beyond the scope of the present document
to elaborate on this line of thought.

So the need for multiple concept instantiations does not necessarily
spell disaster. They are quite cheap if a symbolic encoding is used.
Without prejudicing the issue of whether the symbolic attractors
would mostly be extended over a wide cortical area, with very many
attractors occupying the same region, or rather tend to be smallish,
lying side by side, it can nevertheless be instructive to calculate
how many propositions could be stored in a cortical ANN of 1 mm^2. Let
V be the size of the conceptual vocabulary. Then one concept can be
coded in log2(V) bits (or less, if the concepts have different usage
frequencies). Let the average number of concept-instances in a belief
(presumably somewhat less than the number of words in the sentences
that express it) be n. Let d be the neuronal density in units of
number of neurons per square mm. We then have

N = d * 0.138 / (log2(V) * n * Robustness)

where 0.138 is the Hopfield value (i.e. the ratio of the storage
capacity of a Hopfield net and the number of neurons it contains),
and Robustness is a value that compensates for the difference in
efficiency between an ideal Hopfield net and a noisy, asymmetric
partially connected sheet of cortical cells. To get a very rough
estimate of N we can take V=100,000, n=5, Robustness=50, and (from
Douglas & Martin (1991)) d=10^5. We then obtain N=1000, give or take an
order of magnitude or so. This does not seem to be on wholly the
wrong scale.
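
The arithmetic can be replayed with a few lines of Python. The sketch
evaluates the expression literally, reading the logarithm as base 2
and taking d as 10^5 neurons per mm^2; the function name
propositions_per_mm2 is invented for the example. With these inputs
the expression as written comes out at roughly 3 rather than 1000, so
the figure quoted above presumably rests on a different reading of the
formula or of its parameters. The code does not try to guess which.

```python
import math

HOPFIELD_RATIO = 0.138  # stored patterns per neuron in an ideal Hopfield net

def propositions_per_mm2(d, V, n, robustness):
    """Evaluate N = d * 0.138 / (log2(V) * n * Robustness) as written."""
    bits_per_concept = math.log2(V)  # about 16.6 bits for V = 100,000
    return d * HOPFIELD_RATIO / (bits_per_concept * n * robustness)

# Parameter values from the text; d read as 10^5 (an assumption).
N = propositions_per_mm2(d=1e5, V=100_000, n=5, robustness=50)
print(f"N = {N:.2f} propositions per mm^2")
```

Changing d, V, n or robustness in the call shows how sensitive the
estimate is to each assumption; N scales linearly with d and inversely
with n and Robustness, but only logarithmically with V.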

Another problem is this: How do we access all the patterns that begin
with the subpattern A, for example? If we feed in A to the first
position in the ANN, it will settle into an attractor, ABC, say. But
there might be other memories that also begin with A, e.g. ACB, ADE,
etc. If we simply repeat the process of clamping A to the first
position, we may be frustrated to discover that the network keeps
being sucked in by the same pattern ABC each time. ABC might be the
strongest attractor beginning with A, and this prevents us from ever
recalling the other memories from the clue A alone.

One countermeasure is to have the neurons get tired after a
while, so that the neurons active in the B and the C of ABC
eventually retire and allow the activity to flow over to another
basin. Depending on the delay before exhaustion sets in, this would
make the attention flow quickly or slowly from item to item.
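
A deliberately simplified sketch of the fatigue idea follows. It swaps
the distributed attractor net for a local code, with one unit per
symbol in each section and a winner-take-all choice within each
section, because that makes the effect easy to follow. The memory
strengths (3, 2, 1), the fatigue step and all function names are
hypothetical, and fatigue here sets in after a single recall instead
of building up gradually.

```python
import numpy as np

SYMBOLS = "ABCDE"
K = len(SYMBOLS)  # one unit per symbol in each section (local code)
SECTIONS = 3      # left, middle, right

def unit(section, symbol):
    """Index of the unit that stands for `symbol` in `section`."""
    return section * K + SYMBOLS.index(symbol)

def hebbian_store(memories):
    """Strengthen the weight between every pair of co-active units."""
    W = np.zeros((SECTIONS * K, SECTIONS * K))
    for word, strength in memories:
        active = [unit(sec, sym) for sec, sym in enumerate(word)]
        for i in active:
            for j in active:
                if i != j:
                    W[i, j] += strength
    return W

def recall(W, cue, fatigue, steps=3):
    """Clamp `cue` on the left; each free section keeps its best-driven unit."""
    state = np.zeros(SECTIONS * K)
    state[unit(0, cue)] = 1.0
    for _ in range(steps):
        drive = W @ state - fatigue  # tired units receive less net input
        for sec in (1, 2):
            block = slice(sec * K, (sec + 1) * K)
            winner = sec * K + int(np.argmax(drive[block]))
            state[block] = 0.0
            state[winner] = 1.0
    return "".join(SYMBOLS[int(np.argmax(state[sec * K:(sec + 1) * K]))]
                   for sec in range(SECTIONS))

def scan(W, cue, attempts, fatigue_step):
    """Repeat the same cue; units that just fired accumulate fatigue."""
    fatigue = np.zeros(SECTIONS * K)
    seen = []
    for _ in range(attempts):
        word = recall(W, cue, fatigue)
        seen.append(word)
        for sec in (1, 2):
            fatigue[unit(sec, word[sec])] += fatigue_step
    return seen

W = hebbian_store([("ABC", 3.0), ("ACB", 2.0), ("ADE", 1.0)])
without = scan(W, "A", attempts=3, fatigue_step=0.0)
with_fatigue = scan(W, "A", attempts=3, fatigue_step=10.0)
print(without)       # ['ABC', 'ABC', 'ABC']
print(with_fatigue)  # ['ABC', 'ACB', 'ADE']
```

Without fatigue the strongest memory wins every time; with it, the
units that just fired drop out and the next-strongest memory surfaces.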

A less passive approach is to include an extra context segment in
each complex attractor. Thus we would have ABC1, ACB2, ADE3, etc. In
order to scan through all patterns beginning with A, we begin by
clamping A and 1 to the first position and the context position,
respectively. Then we change to the next context, 2; then to 3, and
so forth. Each pattern ABC1, ACB2, ADE3, etc. will then come forth in
turn, and will be maintained for exactly as long as we (or the
system) choose. The context need not, of course, be represented as a
distinct section; it can equally well be a general "colouration" of
the whole pattern. And the same holds for the other parts of the
representation: the sharp, discrete, linear form suggested here is
chosen merely for the sake of clarity of exposition; in nature
things will be more muddled.
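
The context scan can be sketched in the same toy Hopfield style. The
assumptions are all illustrative: symbols are hypothetical orthogonal
16-unit +/-1 codes, the context gets its own section of 32 units (a
16-unit code repeated twice, so that it outweighs the shared cue A),
and the names encode, decode and scan are invented here. In this toy
setting a context section no larger than the others would leave some
units tied between the competing memories.

```python
import numpy as np

# Hypothetical orthogonal +/-1 codes: rows of a 16x16 Hadamard matrix.
H2 = np.array([[1, 1], [1, -1]])
H16 = np.kron(np.kron(H2, H2), np.kron(H2, H2))
SEG = 16
SYM = {name: H16[i + 1] for i, name in enumerate("ABCDE")}
# Context codes are repeated twice to give the context section extra weight.
CTX = {k: np.tile(H16[5 + k], 2) for k in (1, 2, 3)}

def encode(word, k):
    """Concatenate three symbol codes and one context code."""
    return np.concatenate([SYM[ch] for ch in word] + [CTX[k]])

def decode(s):
    """Name the best-matching symbol in each of the three symbol sections."""
    segs = [s[i * SEG:(i + 1) * SEG] for i in range(3)]
    return "".join(max(SYM, key=lambda name: SYM[name] @ seg) for seg in segs)

memories = [("ABC", 1), ("ACB", 2), ("ADE", 3)]
W = sum(np.outer(p, p) for p in (encode(w, k) for w, k in memories))
np.fill_diagonal(W, 0)  # Hebbian outer-product weights, no self-connections

def scan(cue, k, steps=5):
    """Clamp `cue` on the first section and context k; let the rest settle."""
    s = np.concatenate([SYM[cue], np.zeros(2 * SEG, dtype=int), CTX[k]])
    free = np.zeros(s.size, dtype=bool)
    free[SEG:3 * SEG] = True  # only the middle and right sections may change
    for _ in range(steps):
        s = np.where(free, np.sign(W @ s), s).astype(int)
    return decode(s)

results = [scan("A", k) for k in (1, 2, 3)]
print(results)  # ['ABC', 'ACB', 'ADE']
```

Stepping k through 1, 2, 3 brings up each memory in turn, and each
stays put for as long as that context is held.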

Nick Bostrom
Received on Fri Jan 23 20:48:50 1998
