SDP:SDR vs SDP:BSC

Vector Symbolic Architecture (VSA) can be based on many different types of vectors.

For example, my previous project Bi is based on Kanerva's Binary Spatter Code (BSC) vectors, which are binary (0|1) vectors, normally 10000 bits long with a density of 50%, i.e. 50% of the bits are ONE and 50% are ZERO.

This new project is based on Sparse Distributed Representation (SDR) binary vectors with a sparsity of 2% and a size of 2048 bits (as per Numenta HTM specs). When I work and test I normally use vectors of 500 or 1000 bits, which may not be enough for a production environment but has been OK in my research so far.
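To make the two specs concrete, here is a minimal sketch that builds one random vector of each kind with plain numpy. It does not use the isdp module; the variable names and the fixed seed are my own assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is repeatable

# BSC: dense binary vector, every bit is 1 with probability 0.5
bsc = rng.integers(0, 2, size=10_000, dtype=np.uint8)

# SDR: sparse binary vector with 2% of the bits set
sdr_size = 2048
on_bits = int(sdr_size * 0.02)  # 40 bits out of 2048
sdr = np.zeros(sdr_size, dtype=np.uint8)
sdr[rng.choice(sdr_size, size=on_bits, replace=False)] = 1

print(bsc.mean(), sdr.sum())
```

The BSC density only comes out close to 50% because each bit is drawn independently, while the SDR has exactly 40 bits set.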

BTW the parameters are not set in stone: the more I experiment, the more I think they are task specific. So treat the specs I mentioned in both cases as a rule of thumb.

You can change the specs system wide in the isdp.py module.


SDP

What then is SDP?

SDP is the umbrella term I use for all kinds of distributed vectors, which by the way could be real-valued vectors too (with a size of ~300 real numbers).

SDP is an abbreviation of Semantic-Distributed-Pointer.

The difference between an SDP and the standard pointers used in programming languages is that the vector itself has a meaning and is not just a reference to a memory location.


SDP vs iSDP

So SDP, but what about the ‘i’ ?

Normally the easiest way to implement a binary vector is as a 1D bitmap of 0s and 1s. The problem is that I could not find a library, in any language, that is speedy enough to handle large memories, and there is no hardware acceleration, so I had to figure out another solution.

That's why I decided to switch to indexed SDP. When a binary vector is sparse, instead of storing every bit we can store an array with the index of every bit that has a value of 1.

  So the vector : 0010010000
  becomes : [2,5]

We are still missing one more thing: the size of the vector. We can then rewrite our vector as a tuple, with the size first.

 (10,2,5)
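The conversion above can be sketched in a few lines of numpy. This is only an illustration of the idea, not the isdp module's own code; the names bits, idxs, as_tuple and restored are made up for the example.

```python
import numpy as np

# The bitmap from the example above
bits = np.array([int(c) for c in "0010010000"], dtype=np.uint8)

# Indices of the bits that are set to 1
idxs = np.flatnonzero(bits)

# Size first, then the indices
as_tuple = (bits.size, *idxs.tolist())

# Going back: allocate `size` zeros and switch on the listed indices
restored = np.zeros(as_tuple[0], dtype=np.uint8)
restored[list(as_tuple[1:])] = 1

print(idxs.tolist(), as_tuple)  # [2, 5] (10, 2, 5)
```

Nothing is lost in the round trip: restored is bit-for-bit the same as the original bitmap.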

But why stop there? Let's just extend a numpy array, add a vsize attribute to it, and we get iSDP.
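Here is a rough sketch of what "extend a numpy array and add vsize" could look like, following the standard numpy subclassing pattern. The class name IndexedSDP and the to_bits helper are hypothetical and are not taken from isdp.py; only the vsize attribute comes from the text.

```python
import numpy as np

class IndexedSDP(np.ndarray):
    """Hypothetical sketch: sorted indices of the 1-bits plus the vector size."""

    def __new__(cls, indices, vsize):
        # Store the indices as 16-bit unsigned integers and view them as our class
        obj = np.asarray(sorted(indices), dtype=np.uint16).view(cls)
        obj.vsize = vsize
        return obj

    def __array_finalize__(self, obj):
        # Called for views, slices and copies: carry vsize along when it exists
        if obj is None:
            return
        self.vsize = getattr(obj, "vsize", None)

    def to_bits(self):
        # Expand back to the plain 0/1 bitmap
        bits = np.zeros(self.vsize, dtype=np.uint8)
        bits[np.asarray(self)] = 1
        return bits

v = IndexedSDP([2, 5], vsize=10)
print(v.vsize, v.tolist(), v.to_bits().tolist())
```

Because it is still an ndarray, all the usual numpy functions keep working on it, and slices such as v[:1] keep the vsize attribute.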

The benefit of using iSDP is that the memory requirements are smaller for sparse vectors compared to the binary version.

(We need a 16-bit integer per index.)

iSDP = 16 * size * sparsity% = 16 * 2000 * 0.02 = 640 bits
bSDP = size = 2000 bits

or in percent terms :

iSDP/bSDP = 16 * s% = 16 * 0.02 = 32% of the size of bSDP

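If we could make numpy work with 10- or 11-bit integers it would be even smaller: 20% with 10 bits (enough for sizes up to 1024) or 22% with 11 bits (enough for sizes up to 2048).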
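The arithmetic above is easy to check in code. This is just a sketch: the helper isdp_bits is a name I made up, and the packed case is hypothetical, since numpy has no 10- or 11-bit integer type.

```python
import math

def isdp_bits(size, sparsity, bits_per_index=16):
    """Bits needed to store the indices of the 1-bits (hypothetical helper)."""
    return round(size * sparsity) * bits_per_index

size, sparsity = 2000, 0.02

isdp = isdp_bits(size, sparsity)  # 40 indices * 16 bits
bsdp = size                       # one bit per position
ratio = isdp / bsdp

# Smallest index width that can address every position of the vector
min_index_bits = math.ceil(math.log2(size))
packed_ratio = isdp_bits(size, sparsity, min_index_bits) / bsdp

print(isdp, bsdp, ratio, min_index_bits, packed_ratio)
```

For size 2000 the smallest usable index width is 11 bits, so the packed version would take 22% of the binary size.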

The other benefit is that when many iSDPs are stored in big memory chunks, operations on them are faster because of numpy and the ability to speed them up with tools like numba, tensorflow and hardware acceleration.

The disadvantage is that the implementation is more complex.
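As a small taste of what operating directly on the index arrays looks like, here is a sketch using numpy's set routines. The two vectors are made up, and overlap and union are only assumed here as typical operations; this is not how the isdp module necessarily implements them.

```python
import numpy as np

# Two indexed vectors (positions of the 1-bits), values invented for the example
a = np.array([2, 5, 9], dtype=np.uint16)
b = np.array([5, 7, 9], dtype=np.uint16)

overlap = np.intersect1d(a, b)  # bits set in both vectors
union = np.union1d(a, b)        # bits set in either vector

print(overlap.tolist(), union.tolist())  # [5, 9] [2, 5, 7, 9]
```

Both calls only touch the handful of stored indices, never the full bitmap.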

Let’s try the isdp module :