Spatial Pooler/Mapper
I already mentioned the purpose of the Spatial Pooler (I call mine the Spatial Mapper). The question is how to implement it.
The implementation is a mix of HTM theory and the Self-Organizing Map (SOM).
The core is a 2D numpy array with values in the range [0,1]. The X axis has the size of the input SDP and the Y axis has the size of the output.
You can think of each column as a pseudo-neuron on the output (Y) side, where every row is a connection to one input (X) bit. A percentage of each column (by default 85%) is randomly filled with numbers between 0 and 1.
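Here is a minimal sketch of how such a matrix could be initialised. The helper name `init_weights`, its parameters, and the choice to leave the unconnected entries at 0 are my assumptions for illustration; the source does not say what the remaining 15% holds.

```python
import numpy as np

def init_weights(n_input, n_output, conn_pct=0.85, seed=0):
    """Hypothetical helper: rows are input bits, columns are output pseudo-neurons."""
    rng = np.random.default_rng(seed)
    weights = np.zeros((n_input, n_output))
    n_conn = int(round(conn_pct * n_input))      # connections per column
    for col in range(n_output):
        # pick which input rows this column is connected to
        rows = rng.choice(n_input, size=n_conn, replace=False)
        weights[rows, col] = rng.random(n_conn)  # random values in [0, 1)
    return weights

weights = init_weights(200, 1000)
print(weights.shape)
```

With an input of 200 bits and an output of 1000 bits, every column ends up with 170 random connections.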
When an SDP arrives on the X axis it is converted to a 0/1 mask. The mask is multiplied across the rows (zero-rows are zeroed, one-rows are kept) and then each column is summed. The columns with the highest sums (by default the top 2%, winner takes all) are chosen for update.
We also filter out the values below a threshold (default: 0.1) and exclude them from the sum.
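The selection step can be sketched on a toy matrix as follows. The function `winners` and its parameter names are hypothetical. This version sums the surviving weights as described in the text; a variant could count the connections above the threshold instead.

```python
import numpy as np

def winners(input_bits, weights, top_pct=0.02, thresh=0.1):
    """Hypothetical sketch: return the indices of the winning columns."""
    mask = np.zeros(weights.shape[0])
    mask[list(input_bits)] = 1                # SDP indices -> 0/1 mask
    masked = mask.reshape(-1, 1) * weights    # zero-rows zeroed, one-rows kept
    masked[masked < thresh] = 0               # weak connections do not count
    sums = masked.sum(axis=0)                 # one score per column
    k = max(1, int(round(top_pct * weights.shape[1])))
    return np.sort(np.argsort(sums)[-k:])     # the k columns with the highest sums

toy = np.array([
    [0.9, 0.05, 0.0, 0.3],
    [0.8, 0.05, 0.5, 0.0],
    [0.0, 0.90, 0.9, 0.9],
])
# input bits 0 and 1 are ON; keep the top half of the four columns
print(winners([0, 1], toy, top_pct=0.5))
```

Row 2 is masked out, so its large values do not help columns 1 to 3, and column 1 scores zero because both of its active connections are below the threshold.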
Now that we have the winning Y columns, we update their one-rows and zero-rows according to the following two formulas:
 ones += learn_rate * ( 1 - ones)
 zeros += unlearn_rate * ( 0 - zeros)
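This keeps the updated values between 0 and 1: each update moves a value a fixed fraction of the remaining distance toward its target (1 or 0), so it approaches the bound smoothly and never overshoots.
The unlearn rate has to be much smaller than the learn rate.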
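To get a feel for the two formulas, here is a small worked example using the default rates reported further down (0.1 and 0.01). The function names are mine, chosen only for this sketch.

```python
def learn(value, learn_rate=0.1):
    """One update of a one-row weight: move a fraction of the way toward 1."""
    return value + learn_rate * (1 - value)

def unlearn(value, unlearn_rate=0.01):
    """One update of a zero-row weight: move a fraction of the way toward 0."""
    return value + unlearn_rate * (0 - value)

w = 0.5
history = []
for _ in range(3):
    w = learn(w)
    history.append(w)
print(history)   # approximately 0.55, 0.595, 0.6355
```

Each step closes 10% of the remaining gap to 1, so the steps get smaller as the weight grows. A single unlearn step from 0.5 only goes down to 0.495, which is why forgetting is so much slower than learning.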
So far so good, but we have a small problem: the algorithm as described will tend to use the most popular columns and neglect the rest. To fix that, we boost the rarely used columns.
During the learning phase we count every use of a column and from this derive a score which becomes smaller for heavily used columns and vice versa. This score is then used when calculating the WTA sum.
Here is how the sum operation looks:
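The exact scoring formula is not given here, so the following is only an illustration of the idea. I assume a score of one minus the fraction of steps in which a column won; `boost_scores` is a hypothetical name.

```python
import numpy as np

def boost_scores(use_counts, n_steps):
    """Assumed formula: 1 minus the fraction of steps in which a column won."""
    freq = np.asarray(use_counts, dtype=float) / max(1, n_steps)
    return 1.0 - freq

olap = np.array([10.0, 9.0, 3.0])             # raw overlaps of three columns
boost = boost_scores([8, 2, 0], n_steps=10)   # column 0 has won 8 times out of 10
boosted = olap * boost                        # overlap scaled by the boost score
winner = int(np.argmax(boosted))
print(boost, boosted, winner)
```

Without the boost, column 0 would win again. With it, the slightly weaker but rarely used column 1 gets its turn.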
olap = np.sum((input.reshape(-1,1) * self.weight_matrix) > self.permanence_thresh, axis=0)
if boost : olap = (olap * self.boost)
So let's see it in action:
import sys
sys.path.extend(['../../ilib', '../../ilib/encoders', '../../ilib/modules'])
from scalar_encoder import *
from category_encoder import *
from ispatial_mapper import *
se = ScalarEncoder(minimum=0,maximum=100,vsize=200,spaOnbits=20)
print se.info
> Scalar encoder -----
min-max/range : 0-100/100
buckets,width,n : 180,20,200
resolution : 0.56, 0.0056%
sparsity : 0.10, spa_nbits:20
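The numbers in this report fit together: 200 bits minus a width of 20 leaves 180 buckets, the resolution is 100/180 ≈ 0.56, and the sparsity is 20/200 = 0.10. The sketch below is my own re-creation of a scalar encoder that is consistent with these numbers and with the outputs shown later. `encode_scalar` is a hypothetical function, and the real ScalarEncoder may well be implemented differently.

```python
def encode_scalar(value, minimum=0, maximum=100, vsize=200, nbits=20):
    """Hypothetical re-creation: return the indices of the ON bits."""
    buckets = vsize - nbits                   # 180 possible start positions
    start = int((value - minimum) * buckets / (maximum - minimum))
    return list(range(start, start + nbits))  # a run of nbits consecutive bits

bits = encode_scalar(55)
print(bits[0], bits[-1])   # 99 118
```

Nearby values share most of their bits because their runs start only a few positions apart.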
sm = iSpatialMapper(shape=(200,1000),spa=(20,0.02))
As I said, you don't need to pretrain it, but to illustrate how it works we need to feed it some random data.
print sm.info
Shape: (200, 1000)
Sparsity: (0.1, 0.02)
ON Bits count : (20,20)
Learn rates : (0.1, 0.01)
Boost learn rate : 0.01 (on:True)
Overlap match % : 0.5
Cell conn. % : 0.85
Permanence thresh : 0.1
None
We batch train it with 2000 random data points within the encoder value range.
sm.batch_train( [ se.encode(v) for v in np.random.randint(0,100,size=2000) ] )
print "Encode 55 > ", se.encode(55)
print "Normalize 55 > ",sm.predict(se.encode(55))
Encode 55 >  99:100:101:102:103:104:105:106:107:108:109:110:111:112:113:114:115:116:117:118
Normalize 55 >  29:75:216:248:254:257:282:334:421:436:440:463:496:582:673:712:810:868:886:954
The same input should consistently generate the same output:
print sm.predict(se.encode(53)) // sm.predict(se.encode(53))
print sm.predict(se.encode(55)) // sm.predict(se.encode(55))
print sm.predict(se.encode(57)) // sm.predict(se.encode(57))
1.0
1.0
1.0
print se.encode(55) / se.encode(57)
print sm.predict(se.encode(55)) / sm.predict(se.encode(57))
print se.encode(55) / se.encode(53)
print sm.predict(se.encode(55)) / sm.predict(se.encode(53))
17
7
16
8
It should also preserve the similarity ratio, although the exact ratio will vary depending on how the encoders and the Spatial Mapper are configured.
print se.encode(55) // se.encode(40)
print sm.predict(se.encode(55)) // sm.predict(se.encode(40))
print se.encode(55) // se.encode(70)
print sm.predict(se.encode(55)) // sm.predict(se.encode(70))
0.0
0.0
0.0
0.0
At some point, for values that are far enough apart, the similarity drops to zero.
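Judging by the outputs above, `/` returns the number of shared ON bits and `//` returns a ratio between 0 and 1. Here is a minimal sketch of both on plain index sets. The function names and the definition of the ratio (overlap divided by the number of ON bits) are assumptions, and the bit ranges for 57 and 40 are inferred from the reported overlaps rather than printed by the encoder.

```python
def overlap(a, b):
    """Number of ON bits that two SDPs share."""
    return len(set(a) & set(b))

def similarity(a, b):
    """Overlap as a fraction of the ON bits in `a` (assumed definition)."""
    return overlap(a, b) / len(a)

v55 = range(99, 119)    # ON bits of se.encode(55), as printed earlier
v57 = range(102, 122)   # assumed: the same run shifted by three bits
v40 = range(72, 92)     # assumed: a run far away from v55

print(overlap(v55, v57), similarity(v55, v55), similarity(v55, v40))
```

Once the two runs no longer touch, the overlap is zero no matter how much further apart the values move.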
Pretrain
Earlier we trained SMap with random values. There is a method for that.
Lexicon + SM.encode/decode
In addition, you can use SMap's own encode() method as a shortcut.
Forward/encode : Value => Encoder => SMap => iSDP
This method saves time by encoding and then mapping the value in one call.
You may also need the reverse: given an iSDP similar to some SM-generated vector, get back the encoded symbol, i.e. decode(). This needs some thought because it is not a trivial operation. The reason is that we have two processes …
Forward/encode : Value => Encoder => SMap => iSDP
Backward/decode : ~iSDP => correct iSDP => Encoder => Value
The Spatial Mapper mapping is not reversible, so we have to build a map to do that. Luckily we have the Lexicon, where we can store a symbol <=> iSDP map. So every time we encode and SMap a symbol, we store the output iSDP and the symbol in a lexicon. Later, when we decode something that returns an iSDP that has to be reversed, we use this map. Of course nothing is free: this lexicon takes memory. Also, you have to use SMap's encode() method, which builds this map on the forward pass.
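The idea can be sketched with a plain dictionary. Everything below is hypothetical: the `Lexicon` class, the `encode` helper and the toy encoder and mapper are stand-ins for illustration only, not the library's actual classes or methods.

```python
class Lexicon:
    """Hypothetical symbol <=> iSDP store."""

    def __init__(self):
        self.items = {}                       # symbol -> frozenset of ON bits

    def add(self, symbol, isdp):
        self.items[symbol] = frozenset(isdp)

    def best_match(self, isdp):
        """Return the stored symbol whose iSDP overlaps most with `isdp`."""
        probe = set(isdp)
        return max(self.items, key=lambda s: len(self.items[s] & probe))


def encode(value, encoder, mapper, lexicon):
    """Forward pass: Value => Encoder => SMap => iSDP, remembering the pair."""
    isdp = mapper(encoder(value))
    lexicon.add(value, isdp)
    return isdp


# toy stand-ins for the encoder and the spatial mapper
toy_encoder = lambda v: range(v, v + 4)
toy_mapper = lambda bits: [(7 * b) % 101 for b in bits]

lex = Lexicon()
encode(10, toy_encoder, toy_mapper, lex)      # stores [70, 77, 84, 91]
encode(30, toy_encoder, toy_mapper, lex)      # stores [8, 15, 22, 29]

noisy = [70, 77, 84, 5]                       # one bit differs from the iSDP of 10
print(lex.best_match(noisy))
```

The backward pass is a nearest-neighbour lookup: the noisy iSDP is matched to the closest stored one, and the symbol saved with it is returned. The memory cost grows with every distinct symbol that was encoded.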