Valentina Database KeyValue v7.0

Introduction

One reason why Valentina DB is famous is its ability to use and mix different database models and features, while others consider them to be mutually exclusive.

For example, while some developers approach Valentina DB as a powerful SQL Relational DBMS, others utilize the Valentina API (40-50 classes with about 1000 methods total), i.e. they use it in NON-SQL way.

A new feature in Valentina Database Release 7.0 is another “alternative feature”: KeyValue store.

  • Up to now you have used Tables/Views and Links, now you can use KeyValues inside of your Valentina DB
  • You still can use only Tables/Links and ignore KeyValues
  • You can use only KeyValues and ignore Tables/Links
  • You can create few KeyValues in your db
  • You can use both Tables and KeyValues in the same database
  • You can use KeyValues via API only
  • You can use KeyValues via new commands of Valentina SQL
  • You can use KeyValues in way as others do: your keys, etc
  • You can use our suggestions about keys
  • We will consider how KeyValues can extend Relational Model
  • We will show how Tables and Links can be friends with KeyValues and cooperate

KeyValue Stores from Other Database Vendors

Let's look at some well known KeyValue using database management systems: BerkleyDB, LevelDB and RockDB.

  • They are made in C or C++
  • They store arbitrary key/data pairs as byte arrays

They have simple and similar API:

* Put( key, value )
* Get( key ) -> value
* Delete( key )

It is known that (at 2016/2017):

  • BerkleyDB was part of mySQL up to v5.1 as Table Storage;
  • MariaDB 6/7 going to put LevelDB as Table Storage;
  • Oracle has put BerkleyDB under SQLite API;
  • Some startup tries to put RockDB as underline Table Storage for SQL Relational DB.

VDB KeyValue Overview

In Valentina DB v7 we have to implement KeyValue as multi-level and multi-purpose feature.

1) The first level going to be a simplest KeyValue store with simple API. And we making [NEW] step! We do the integration of KeyValue feature into Valentina SQL providing special SQL commands!

2) The second level could be - usage of KeyValue as a Table Store instead of our vertical store. But we skip this level for now.

3) The third level improves the power of KeyValue in Valentina SQL. We provide KeyValue WITH KEY command. This allows you easy describe a complex key having few parts without the need to provide comparator on C/CPP level.

4) The forth levels - integrates KeyValue into Object-Relational Data Model of Valentina DB even more deeply! This level provides few kinds of KeyValues:

  • KeyValue FOR TABLE tbl_name
  • KeyValue FOR LINK link_name
  • KeyValue FOR TABLES

As far as we know, nobody even tried to do this before (End of 2016).

VDB Default KeyValue

At this first level, we have provide API of regular KeyValues:

For this, we introduce I_KeyValue interface in our CPP ADK, and some special methods in I_Database interface to Create/Drop KeyValue objects in the scope of VDB. Yes, the first difference you can see is that you can have many KeyValue objects inside of single db. Each KeyValue has own name.

We should notice, that even on this level, Valentina provide small improvement: Value is stored as VARIANT, i.e it has the TYPE byte prefix. This allows:

  • Return back Value of exactly same type;
  • a GUI application, such as Valentina Studio, is able to show value in the correct format of type, what is much better than just see 'set of bytes in HEX form'.

SQL Integration

Additionally we provide Valentina SQL commands to work with this kind of VDB KeyValue:

  • KEYVALUE keyvalue_name INSERT ( key:value, … )
  • KEYVALUE keyvalue_name UPDATE ( key:value, … )
  • KEYVALUE keyvalue_name UPSERT ( key:value, … )
  • KEYVALUE keyvalue_name DELETE( key, … )
  • KEYVALUE keyvalue_name GET ( key, … )
  • SHOW KEYVALUES [FROM db_name]
  • GET/SET PROPERTIES OF KEYVALUE keyvalue_name

SQL & Comparators

The main question on SQL level is - how to define comparator of keys? On this first level we will use a general comparator, which compares key-string using memcmp(), i.e considers key-string as a byte-array. Besides, key-string is used in UTF8 format.

Below we will consider special kinds of KeyValue, which will have more sophisticated complex comparators.

VDB KeyValue WITH KEY

CREATE KEYVALUE keyvalue_name WITH KEY ( COLUMN_TYPE, ... );
  • This KeyValue is NOT related to Object-Releational Data Model of Valentina DB!
  • This KeyValue is universal, as a vdb_default_keyvalue, but it has the specified structure of complex KEY defined by user.

Information about KEY structure allows to Valentina SQL engine considers this key as a Compound Value and automatically creates comparator to sort keys.

Key Structure

To define the structure of Key, it is enough to list TYPES of its parts.

You can use:

  • INTEGER TYPES: BYTE, SHORT, USHORT, LONG, ULONG, LLONG, ULLONG
  • STRING TYPES: VARCHAR
  • BINARY TYPES: VARBINARY

Example:

CREATE KEYVALUE keyvalue_name WITH KEY ( ULONG, VARCHAR, ULONG );

VDB KeyValue FOR TABLE

Now let's talk about KeyValue as helper-object to Relational Table. Many articles during years have point that Relational Tables are not flexible things if you need a very different set of fields for records. So let's try to resolve this problem?

Let you have table T1 with 10 fields (f1..f10). Let Valentina DB creates for you KeyValue as helper object of that table T1.

Such KeyValue will use a special kind of keys: 'RecID.PropX'.
This complex key has two parts 'ULONG.VARCHAR'.

Let we have insert into empty table T1 its first record, which get RecID = 1:

INSERT INTO T1 VALUES(1, 2, ..., 10 )

Now we can insert into its paired keyvalue-object:

KEYVALUE kv_T1 INSERT( '1.prop1':'value1', '1.prop2':'value2', '1.prop3':'value3',  ...   '1.prop500':'value500' )

Easy to see that this 500 pairs will be perfectly associated with the record with RecID = 1.

We can say the same in more short way (syntax sugar):

KEYVALUE kv_T1 INSERT( 'prop1':'value1', 'prop2':'value2', 'prop3':'value3',  ...   'prop500':'value500' ) FOR RECORD 1;
 
-- RECORD keyword is optional:
KEYVALUE kv_T1 INSERT( 'prop1':'value1', 'prop2':'value2', 'prop3':'value3',  ...   'prop500':'value500' ) FOR 1

1) Realize that Valentina DB can do this syntax sugar only because it is aware of this special kind of KeyValue, so it is able to provide special forms of SQL command.
2) Realize that inside you see a nice list of key:value pairs, when using such syntax.

LastRecID() Usage

Also you may wonder, if you can use LastRecID() feature? Yes you can:

KEYVALUE kv_T1 INSERT( 'prop1':'value1', 'prop2':'value2', 'prop3':'value3',  ...   'prop500':'value500' ) FOR LastRecID()

Please note that we allow here to specify LastRecID() without parameters, but KeyValue knows its Table, so it will be able to execute this as Last_RecID_Of_Table( 'T1' );

Definition

To support this kind of KeyValue, VDB provides a special version of SQL command:

CREATE KEYVALUE keyvalue_name FOR TABLE table_name

This KeyValue automatically knows the structure of key: {RecID.PropName}

Search

To READ all keys that are associated with some RecID you can use the nice syntax:

-- Think about it as GET LIKE '1.*'
KEYVALUE keyvalue_name GET LIKE ( '1' )

RecID Checks

It is easy to see that such (Table + KeyValue) construction, should be able to check RecID when a pair is inserted into KeyValue.

RecID Delete

This KeyValue kind must cooperate with Table when a record is deleted. By analogy with ForeignKey and ObjectPtr link, we can see here two abilities:

  • CASCADE deletion of associated key-value pairs.
  • RESTRICT deletion of table record if it has associated key-value pairs.

Table ALTER/DROP

This KeyValue object works as a child object to its Table. So we should think about its reactions to ALTER and DROP of the table.

ALTER requires no actions from KeyValue, even if we change the name of the table because objects are linked internally by Object IDs.

DROP of the table should follow to CASCADE or RESTRICT settings of Table.

VDB KeyValue FOR LINK

Let us remind that Valentina DB has another great feature - Links. Valentina v6/v7 knows about the following kinds of links: FK, ObjectPtr, BinaryLink.

BinaryLink could work with 1:1, 1:M and M:M links between 2 tables. They are smaller and faster of regular solution MM-Table. But BinaryLink was not able to have associated info to link.

EXAMPLE of MM Table Sales

This table says that person with ID 1 have buy product with ID 45, 8 items of this product, at 29 May 2016

PersonPtr ProductPtr ItemsCount Date
1 45 8 2016-05-29

With KeyValue FOR LINK, we can solve this task for BinaryLink also! We can use pair of RecIDs as part of the key to specify an association between Link “record” and key-value pairs.

Such KeyValue will use a special kind of keys: 'RecID1.RecID2.PropertyName'.
This complex key has three parts 'ULONG.ULONG.VARCHAR'.

INSERT INTO BINARY LINK linkSales (1, 45);
KEYVALUE kv_LinkSales INSERT( '1.45.ItemsCount':8, '1.45.Date':'2016-05-29' );

The same in more short form (syntax sugar):

INSERT INTO BINARY LINK linkSales (1, 45);
KEYVALUE kv_LinkSales INSERT( 'ItemsCount':8, 'Date':'2016-05-29' ) FOR RECORD (1, 45);

Realize that Valentina DB can do this syntax sugar only because it is aware of this special kind of KeyValue, so it is able to provide special forms of SQL command.

To support this kind of KeyValue, VDB provides special version of SQL command:

CREATE KEYVALUE keyvalue_name FOR LINK link_name

VDB KeyValue FOR TABLES

Please notice FOR TABLES - plural form.

CREATE KEYVALUE keyvalue_name FOR TABLES;

This KeyValue automatically knows structure of key: {TableID.RecID.PropName}

This KeyValue is able to keep records of different Tables in the same store.

Such KeyValue will use a special kind of keys: 'TableID.RecID.Property'.
This complex key has three parts 'ULONG.ULONG.VARCHAR'.

KEYVALUE kv_ALL INSERT 
(
    'T1.1.first_name' : 'Jon',
    'T1.1.last_name'  : 'Jonson',
 
    'T2.24.picture'   : :1,
    'T2.24.tags'      : 'sea;sun;rest',
);

Why we consider this as a special case?

Because we think, that it allows developing flexible solutions, when you can keep data in both horizontal tables and such KeyValue, yet having the ability to move records between them.

VDB KeyValue Integration

We have made a huge job, to integrate this NON-RELATIONAL (as many people did think) feature into our ORDBMS smoothly and in a consistent way. For example:

  • KeyValue SQL commands accept SQL binding: KEYVALUE kv1 INSERT ( :1 : :2 );
  • KeyValue SQL commands accept SQL vars: KEYVALUE kv1 INSERT ( @key : @value );
  • KeyValue will participate in dumps, clone, diagnose, etc operations.
  • REST API of VSERVER will be able to work with KeyValue, because of SQL commands.

KeyValue Journaling

KeyValue information is stored in the .dat/.blb volumes, where VDB keeps info of Tables and Links. These volumes are automatically journaled.

KeyValue in Stored Procedures

Since we have SQL commands for KeyValues, and for the value we providing support of @var - it is possible to use KeyValue in Stored Procedures.

KeyValue in Triggers

Again yes, because we have SQL commands.

Performace

To Others

in 7.0 we did _not_ spend a lot of time working on performance. But the first benches show quite comparable times to existing top-level KeyValue.

For example, LevelDB says they can do 400,000 inserts per second (they not mention hardware). So they need 2.5 seconds for million pairs. Our bench shows similar time (2.5-2.6 sec) on MacBook Pro 2015 Mid. For 10 million VDB shows 26-28 seconds, i.e. linear time.

Future we can increase results with sure because we see possibilities.

SQL via API

Although KeyValue Stores usually are made in CPP, they are wrapped usually by different programming languages, for example, JavaScript, Python, etc, where developers do a lot of preparation work using strings.

Taking this into account, you should not be scared by SQL to access KeyValues. SQL just becomes universal string-format to communicate with KeyValue Valentina Server from different languages.

Good news are that using VDatabase.CreateStatement() method and VSqlStatement class, you will have speed comparable to API methods.