Segmented memory (Classifier)

This module is not part of the HTM theory; it just popped up as I was implementing the Temporal Memory.

We said CC can be implemented as a 3D array with shape (num-of-segments, num-of-mCols, nbits-per-iSDP). Segmented memory has the same shape: (nsegs, rows, isdp).
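
As a rough illustration (a sketch only; the real iSegmentedMemory internals may differ), the memory could be allocated like this, with every cell holding the indexes of one stored iSDP:

import numpy as np

nsegs, nrows, nbits = 5, 500, 10   # segments, mCols/rows, indexes per iSDP
segs = np.zeros((nsegs, nrows, nbits), dtype=np.uint16)  # empty memory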

Because this module is used to implement TM, it accepts two inputs (in the TM case we store transitions, S1 => S2). So if we exclude the time component from the equation, the transition becomes a classification.

We kill two birds with one stone :)

Input one in this case is the predictor (P0) and input two is the predicted (P1).

The update works by first figuring out which segment to use (the first dimension); more on this later.

Then P1 is used to find the indexes of the rows (second dimension) where the P0 pattern is stored (third dimension).

The prediction, on the other hand, works by comparing P0 to every row in every segment and calculating the similarity, then summing the similarities of every row across the segments and finally picking the 2% of row-indexes with the biggest sums: the so-called Winners-Take-All algorithm. The winning row-indexes then become the prediction.
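
Here is a minimal NumPy sketch of that predict step, under my own assumptions about the internals: cells hold iSDP indexes, similarity is the count of shared indexes, and the function name and signature are hypothetical:

import numpy as np

def predict(segs, p0, sparsity=0.02):
    nsegs, nrows, nbits = segs.shape
    # similarity of P0 with every (segment, row) cell = number of shared indexes
    sim = np.array([[np.intersect1d(segs[s, r], p0).size
                     for r in range(nrows)] for s in range(nsegs)])
    score = sim.sum(axis=0)               # per-row sum across the segments
    topn = max(1, int(nrows * sparsity))  # 2% of the rows win
    winners = np.argsort(score)[-topn:]   # row-indexes with the biggest sums
    return np.sort(winners)               # those indexes are the predicted iSDP

Note that because the winners are ~2% of the rows and vsize equals the number of rows, the output is itself an iSDP with the same sparsity as the inputs.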

Now about the segment selection mechanism.

Initially all segments are empty, so in this case we pick one at random and store the iSDP. If, on the other hand, all segments are full, we pick the most similar one, “thin” the union of the segment-iSDP and the data-iSDP, and store the result back.

 segs[seg,row,:] = isdp.thin(isdp.union(segs[seg,row,:], input_data), nbits=40)
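
Tying it together, here is how I imagine the selection and the update could look (a sketch, not the actual implementation; pick_segment and learn are hypothetical names, treating an all-zero cell as empty is my assumption, and the isdp helpers are the ones from the line above):

import numpy as np

def pick_segment(segs, row, data):
    cells = segs[:, row, :]                  # all segments for this row
    empty = np.where(~cells.any(axis=1))[0]  # unused cells (all-zero marker)
    if empty.size > 0:
        return np.random.choice(empty)       # empty ones: pick at random
    sims = [np.intersect1d(cells[s], data).size for s in range(len(cells))]
    return int(np.argmax(sims))              # all full: pick the most similar

def learn(segs, p0, p1, nbits=40):
    for row in p1:                           # P1 picks the rows ...
        seg = pick_segment(segs, row, p0)    # ... this mechanism picks the segment
        segs[seg, row, :] = isdp.thin(isdp.union(segs[seg, row, :], p0), nbits=nbits)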

Later, in the comparison stage, all the rules of how similarity degrades on thinned vectors apply. (Roughly: union-ing two disjoint 40-bit patterns and thinning back to 40 bits keeps about half of each, so the stored cell matches either original with ~50% similarity, and every further merge dilutes it more.)

It is equally easy to use UNION instead of THINNING, but that would not preserve sparsity, and because we use iSDP rather than binary SDP it would require us to increase the size of the 3rd dimension. Even then we would have to thin eventually, just at a higher number of ‘bits’, i.e. lower sparsity (higher density). The code change to do that is easy if we need it. Let’s say the UNION is allowed to grow to twice the current sparsity of 40 bits, i.e. 80:

 segs[seg,row,:] = isdp.thin(isdp.union(segs[seg,row,:], input_data), nbits=80)

It seems better, at least logically, to add more segments instead, but I haven’t tested that.

At the end of the day, SegmentedMemory is de facto a CLASSIFIER.

This is still a research project and I’ve done only limited testing. Keep this in mind.

import sys, string
sys.path.extend(['../../ilib'])
from isegmented_memory import *
from ilexicon import *
vsize = 500; spa = 0.02                         # 500-bit vectors at 2% sparsity
x = iLexicon(vsize=vsize, spaOnbits=spa)        # lexicon of symbol iSDPs
segm = iSegmentedMemory(shape=(5, vsize, spa))  # 5 segments

Generate symbols from ‘a’ to ‘z’:

x.az()

One-shot learning …

segm.learn(data=x.a, pred=x.z)
segm.learn(data=x.b, pred=x.z)
segm.learn(data=x.c, pred=x.z)

print(x.best_match(segm.predict(x.a)))
print(x.bm(segm.predict(x.b)))
print(x.bm(segm.predict(x.c)))
z
z
z

First, the case of different symbols predicting the same symbol; it seems to work ;)

Next, let’s try one symbol predicting different results. As expected, the prediction varies and depends on how the thinning worked out.

segm.learn(data=x.d, pred=x.z)
segm.learn(data=x.d, pred=x.y)
segm.learn(data=x.d, pred=x.x)

print(x.bm(segm.predict(x.d)))
print(x.best_top(segm.predict(x.d), topn=3))
y
['y', 'x', 'z']

We can also predict the top-n best candidates.

Finally, can we predict “ourselves”? It seems so:

segm.learn(data=x.f, pred=x.f)
print(x.bm(segm.predict(x.f)))
f

If we did not train for something, we get a random result.

[x.bm(segm.predict(x[sym])) for sym in string.ascii_lowercase]
['z', 'z', 'z', 'y', 'z', 'f', 'x', 'z', 'z', 'z', 'z', 'f', 'f', 'z', 'z', 'x', 'z', 'x', 'z', 'x', 'z', 'x', 'x', 'x', 'z', 'x']
print(segm.mem_map(segs=[0,1,2,3,4], lex=x))
0009 :   *   d0017 c0044   *   a0008
0031 :   *     *     *   d0037   *  
0055 : d0029   *     *   a0008   *  
0064 :   *     *     *     *   d0037
0076 :   *     *     *     *   d0037
0077 :   *     *     *   f0077   *  
0079 :   *     *     *     *   f0077
0084 : f0077   *     *     *     *  
0086 :   *     *     *     *   f0077
0093 :   *     *     *     *   d0037
0100 :   *     *     *     *   d0037
0121 :   *   c0044 b0017 d0037 a0008
0145 : f0077   *     *     *     *  
0162 :   *     *     *     *   d0037
0164 :   *     *     *   f0077   *  
0170 :   *     *     *   d0037   *  
0180 : f0077   *     *     *     *  
0187 :   *     *     *     *   f0077
0203 :   *     *     *   c0037 b0017
0237 :   *     *   d0037   *     *  
0246 : a0008 c0044   *     *   b0017
0254 :   *     *   d0037   *     *  
0267 :   *     *     *   d0037   *  
0273 :   *   d0037   *     *     *  
0290 :   *     *     *     *   d0037
0298 :   *     *     *     *   d0037
0313 :   *   c0037 a0008   *     *  
0334 : d0037 c0044 a0008   *   b0017
0338 :   *     *   f0077   *     *  
0355 :   *     *     *     *   d0037
0374 :   *   d0037   *     *     *  
0375 :   *     *     *     *   d0037
0399 : d0037   *     *     *     *  
0406 : f0077   *     *     *     *  
0407 : b0044   *   d0037 a0008   *  
0426 :   *   b0029   *     *   a0008
0427 :   *     *   d0037   *     *  
0448 :   *   a0008 c0044 b0017   *  
0472 :   *   d0037   *     *     *  
0493 :   *   d0037   *     *     *  

x.add_items(['apple','orange','tomato', 'fruit', 'vegie', 'chair','table','unknown'])
segm.learn(data=x.apple, pred=x.fruit)
segm.learn(data=x.orange, pred=x.fruit)
segm.learn(data=x.chair, pred=x.unknown)

print(x.best_match(segm.predict(x.apple)))
print(x.bm(segm.predict(x.orange)))
print(x.bm(segm.predict(x.chair)))
fruit
fruit
unknown
print(segm.info)
Seg-Shape (segs,vsize,bits) : (5, 500, 10), Segments : 5
Sparsity/bits: 0.02/10
Match-thresh: 0.1, vsize: 500
Mem: 0.05 MB, used:3%/840, total:25000
Capacity: ~250 patterns, pat/mem:5242.88
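
My reading of those capacity numbers, as a back-of-the-envelope check (the 2-bytes-per-index assumption is mine):

nsegs, nrows, nbits = 5, 500, 10
total = nsegs * nrows * nbits       # 25000 cells, matches "total:25000"
mem_mb = total * 2 / 2.0**20        # 2-byte indexes => ~0.05 MB
capacity = nsegs * nrows / nbits    # every pattern occupies nbits rows => ~250
print(capacity / mem_mb)            # => 5242.88, matches "pat/mem"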