Keyvi index

Keyvi is a key-value (KV) store.

Unlike other KV stores such as Redis, its underlying technology is not a hash table but Finite State Transducers (FST).
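To get a feel for why an automaton-based structure behaves differently from a hash table, here is a minimal sketch of a toy trie. This is NOT how Keyvi is implemented: a real FST also shares common suffixes and is far more compact. The class name ToyTrie and its methods are made up for this illustration. It only shows that keys stored character by character share their prefixes, which makes prefix queries natural.

```python
# Toy trie: NOT Keyvi's data structure, only an illustration of prefix sharing.
class ToyTrie:
    def __init__(self):
        self.children = {}    # next character -> child node
        self.is_key = False   # True if a stored key ends at this node
        self.value = None

    def set(self, key, value):
        node = self
        for ch in key:
            node = node.children.setdefault(ch, ToyTrie())
        node.is_key = True
        node.value = value

    def get(self, key):
        node = self
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.value if node.is_key else None

    def keys_with_prefix(self, prefix):
        # Walk down to the node for the prefix, then collect every key below it.
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        found = []
        stack = [(prefix, node)]
        while stack:
            path, current = stack.pop()
            if current.is_key:
                found.append(path)
            for ch, child in current.children.items():
                stack.append((path + ch, child))
        return sorted(found)


trie = ToyTrie()
for k, v in [("abc", 1), ("abcd", 2), ("abxy", 3), ("brum", 4)]:
    trie.set(k, v)

print(trie.get("abcd"))
print(trie.keys_with_prefix("ab"))
```

A hash table can answer the first question but has no cheap way to answer the second one, because hashing destroys the ordering and prefix structure of the keys.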

There are two structures: Index and Dictionary. Both share the same on-disk data format.

The difference is that the Dictionary is static, while the Index is dynamic, i.e. UPDATABLE.

Another quirk is that there can be only ONE Writer but MANY Readers.

Below is an example of how the Index works.

import keyvi.index

First, let's create a read-write object.

kv = keyvi.index.Index('test')

... then use .Set() to create a KV pair

kv.Set('abc', 'abc')

... then some more

kv.Set('abcd', str(555))
kv.Set('abxy', '23.67')
kv.Set('brum','789')
kv.Set('abxyz', f'{int(55.12345678*100)}')

We use .Get(), which returns a match object

match = kv.Get('abc')

We can get the value back from this object, if it exists

match.GetValue()
'abc'
kv.Get('abcd').GetValue()
555
kv.Get('abxyz').GetValue()/100
55.120

We can also search for keys that are not an exact match by specifying a Levenshtein distance.

.GetFuzzy(key, distance, len-of-exact-prefix)

match2 = kv.GetFuzzy('abcd',1,1)
[ m.GetValue() for m in match2]
['abc', 555]
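To make the three parameters concrete, here is a brute-force sketch in plain Python. It is not how Keyvi does it (Keyvi works on the FST, it does not scan all keys), and the helper names levenshtein and fuzzy_keys are my own, not part of the Keyvi API. It only illustrates what "distance 1 with an exact prefix of length 1" selects from the keys stored so far.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_keys(keys, query, max_distance, exact_prefix_len):
    """Hypothetical helper: brute-force stand-in for a fuzzy lookup."""
    prefix = query[:exact_prefix_len]
    return [k for k in keys
            if k.startswith(prefix) and levenshtein(k, query) <= max_distance]


keys = ["abc", "abcd", "abxy", "abxyz", "brum"]
print(fuzzy_keys(keys, "abcd", 1, 1))   # ['abc', 'abcd']
```

'abc' is one deletion away from 'abcd' and 'abcd' is an exact match, while 'abxy' needs two substitutions and 'brum' fails the exact-prefix check. That agrees with the two values returned above.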

You can also open the Index as read-only, as I mentioned in the beginning

ro = keyvi.index.ReadOnlyIndex('test')

We pass a string as the value to .Set(), but get back the correct type ...

ro.Get('abxy').GetValue()
23.670
type(ro.Get('abxy').GetValue())
float

.. the same for integer

ro.Get('brum').GetValue()
789
type(ro.Get('brum').GetValue())
int
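One plausible explanation for this behaviour (my assumption, the outputs above do not prove it) is that the value string is parsed as JSON, and kept as a plain string when it is not valid JSON. The function decode_value below is a hypothetical model of that rule written with the standard json module. It is not Keyvi code.

```python
import json


def decode_value(raw):
    """Hypothetical model: parse as JSON, fall back to the raw string."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


for raw in ["abc", "555", "23.67", "789"]:
    value = decode_value(raw)
    print(repr(raw), "->", repr(value), type(value).__name__)

# Under this model, a numeric-looking value stays a string
# only if it is passed as a JSON string, i.e. with its own quotes.
quoted = json.dumps("555")
print(repr(quoted), "->", repr(decode_value(quoted)))
```

If the assumption holds, it also means that a value like '555' cannot be stored as the text "555" without quoting it first.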

Because the KV file is memory-mapped, reading is Faaast… and as a bonus, thanks to the underlying FST technology, everything is compressed on the fly … win/win

%timeit ro.Get('brum').GetValue()
591 ns ± 3.12 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
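For readers who have not used memory mapping before, here is a small sketch with Python's standard mmap module. The file content is a made-up stand-in and has nothing to do with Keyvi's real file format. The point is only that a mapped file can be sliced like a bytes object, and the operating system loads the touched pages on demand instead of reading the whole file up front.

```python
import mmap
import os
import tempfile

# Write a small file to stand in for an on-disk index (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"HEADER--" + b"x" * 4096 + b"brum=789")
    path = f.name

with open(path, "rb") as f:
    # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        header = mapped[:8]                 # copy out the first 8 bytes
        offset = mapped.find(b"brum=")      # search directly in the mapping
        tail = mapped[offset:offset + 8]

os.remove(path)
print(header, offset, tail)
```

Several processes can map the same file read-only at once, which fits nicely with the many-readers model described earlier.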

Storing sequences

Now I will show you how to save sequences, i.e. lists of integers, as a key, a value, or both.

By default Keyvi does not support that, but we can convert the list to a string beforehand and from a string back to a list on the way out.

import struct

# list of numbers => packed big-endian bytes
def nums2str(nums, itype='H'):  # B: uint8, H: uint16, I: uint32, Q: uint64
    return struct.pack(f">{len(nums)}{itype}", *nums)

# packed bytes => tuple of numbers; wrap in list() if a list is needed
def str2nums(b, itype='H'):
    size = struct.calcsize(f">{itype}")  # item size in bytes: 1, 2, 4 or 8
    return struct.unpack(f">{len(b)//size}{itype}", b)

nums2str([1,2,3,4])
b'\x00\x01\x00\x02\x00\x03\x00\x04'
str2nums(nums2str([1,2,3,4]))
(1, 2, 3, 4)
kv.Set(nums2str([1,2,3,4]), 'a list key')
kv.Get(nums2str([1,2,3,4])).GetValue()
'a list key'
kv.Set(nums2str([5,6,7]), nums2str([7,8,9]))
val = kv.Get(nums2str([5,6,7])).GetValue()
val
'\x00\x07\x00\x08\x00\t'
str2nums(bytes(val,'utf-8'))
(7, 8, 9)
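One caveat with the bytes(val, 'utf-8') step: it round-trips cleanly here because every byte is below 128. Packed integers can easily contain bytes of 128 and above, which are not valid UTF-8 on their own. A hedged workaround is to turn the packed bytes into ASCII text first, for example with base64. The helpers nums_to_text and text_to_nums below are my own names. I have not verified how Keyvi itself treats such text, so take this as a sketch of the encoding step only.

```python
import base64
import struct

packed = struct.pack(">3H", 7, 8, 200)   # 200 == 0xC8, a byte above 127

# The lone byte 0xC8 is not valid UTF-8, so the raw bytes are not text.
try:
    packed.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False


def nums_to_text(nums, itype="H"):
    """Hypothetical helper: pack integers, then base64-encode to ASCII text."""
    raw = struct.pack(f">{len(nums)}{itype}", *nums)
    return base64.b64encode(raw).decode("ascii")


def text_to_nums(text, itype="H"):
    """Inverse of nums_to_text."""
    raw = base64.b64decode(text)
    count = len(raw) // struct.calcsize(f">{itype}")
    return struct.unpack(f">{count}{itype}", raw)


encoded = nums_to_text([7, 8, 200])
print(decodes_as_utf8, encoded, text_to_nums(encoded))
```

The price is that the text is about a third longer than the packed bytes, and encoded keys no longer share prefixes the same way the raw bytes do.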

I just decided to try it, and it seems Keyvi supports lists: you just need to pass them as a string ;)

kv.Set('str-list','[1,2,3]')
kv.Get('str-list').GetValue()
[1, 2, 3]
type(kv.Get('str-list').GetValue())
list
kv.Set('[1,2]','key-str-list')
kv.Get('[1,2]').GetValue()
'key-str-list'