Vladimir Esipov
Valentina Database ADK
Saturday, July 27 2024, 07:20 AM
The type of the virtual RecID field is fixed: it is always 32 bits (UInt32).
But often, when designing a DB, we know the maximum capacity of a table in advance; this is usually the case for reference tables.
Maybe it makes sense to introduce a limit on the maximum number of records (MaxRecCount) in the table metadata, chosen from a predefined list when creating a table:
255 -> UInt8
65535 -> UInt16
Default -> UInt32.
My point is that the RecID value is stored in indexes and links, so its size directly affects the size of index files.
That is, with the native VDB data storage method we win by not needing a primary key (which the wiki highlights as a special feature), but at the same time we lose on the size of secondary indexes and links (which the wiki is silent about :p ).
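
To make the proposal concrete, here is a rough Python sketch. The names `recid_width_bytes` and `recid_bytes_in_index` are made up for illustration and are not part of the Valentina API. It treats MaxRecCount as the largest RecID value the chosen type must hold (a UInt8 holds values up to 255), and it counts only the bytes spent on RecIDs, ignoring keys and page overhead.

```python
def recid_width_bytes(max_rec_count=None):
    """Hypothetical mapping from MaxRecCount to bytes per stored RecID."""
    if max_rec_count is None:
        return 4                      # Default -> UInt32
    if max_rec_count <= 0xFF:
        return 1                      # fits in UInt8
    if max_rec_count <= 0xFFFF:
        return 2                      # fits in UInt16
    return 4                          # everything else stays UInt32


def recid_bytes_in_index(record_count, width):
    """Bytes used by RecIDs alone in one index (one RecID per record)."""
    return record_count * width


# A reference table with 50,000 rows and a declared limit of 65535:
rows = 50_000
saved = (recid_bytes_in_index(rows, recid_width_bytes())
         - recid_bytes_in_index(rows, recid_width_bytes(65535)))
print(saved)  # bytes saved per index by storing UInt16 instead of UInt32
```

For small reference tables the absolute saving per index is modest, so the gain would mostly show up in databases with many such tables or many links.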

Or go the other way: use the method SQLite uses to store rowid (each row identifier is stored as a variable-length integer, so small rowid values take up less disk space than large ones) and try to apply it to storing RecID values in indexes.
In memory, the RecID would stay the same 32 bits (for simplicity).
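
For readers unfamiliar with variable-length integers, a minimal Python sketch of the idea follows. It is a big-endian base-128 encoding restricted to the UInt32 range, in the spirit of the SQLite varint; it does not implement SQLite's special 9-byte case, and it is not taken from the Valentina code base. The function names are illustrative.

```python
def encode_varint(value):
    """Encode 0 <= value <= 0xFFFFFFFF as 1 to 5 bytes, 7 data bits per byte."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value out of UInt32 range")
    groups = [value & 0x7F]           # last byte: high bit clear (stop marker)
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: high bit set
        value >>= 7
    return bytes(reversed(groups))    # most significant group first


def decode_varint(buf, pos=0):
    """Decode one varint starting at pos; return (value, next_pos)."""
    value = 0
    while True:
        byte = buf[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:           # high bit clear: this was the last byte
            return value, pos


for recid in (1, 127, 128, 16_383, 16_384, 0xFFFFFFFF):
    print(recid, len(encode_varint(recid)))
```

With this scheme RecIDs up to 127 take 1 byte and up to 16,383 take 2 bytes, but any RecID of 2^28 or more takes 5 bytes, which is more than the fixed 4 bytes of a UInt32. So the gain depends on how large the RecIDs in a given table actually are.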

Or would such steps not yield significant gains?
Ruslan Zasukhin (Accepted Answer)
Hi Vladimir,

We were going to use the SQLite algorithm to compress RecIDs inside the index; there is even a Git branch where this task was started. Note that this approach gives us a variable-size RecID in the index.

Hmm, you are putting forward an interesting idea with the limit ... UInt8, UInt32

Another issue: UInt32 for RecID can be too small for some tables. It gives us about 4 billion records in a table, but the world population, for example, is already 8 billion.
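
The arithmetic behind that limit, as a quick Python check (the population figure is the round 8 billion mentioned above, not an exact count):

```python
UINT32_MAX = 2**32 - 1            # largest value a UInt32 RecID can hold
population = 8_000_000_000        # round figure used in this post

print(UINT32_MAX)                 # 4294967295
print(population > UINT32_MAX)    # True: one row per person does not fit
print(population.bit_length())    # 33: at least 33 bits are needed
```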


This is not easy to change and improve; it will affect fundamental parts of the engine. But we need to do it... The question is when.
Vladimir Esipov (Accepted Answer)
Yes, I have a rough idea of the complexity of the task and the number of changes in the code base.
It's easier with a single pointer size...
It's hard for me to assess how practical this is, and whether the effect would be significant enough to justify the cost of reworking the code.
I just put the idea forward... It's up to you to decide whether the game is worth the candle.

Ruslan, please consider the proposal to use temporary indexes. Maybe something can be done?

P.S. I would like to note: the integration with DuckDB comes first, and all these innovations, if they are destined to be implemented, come later...