Sixth - system for data storage, computation, exploration and interaction
- This is a subproject of Sixth
- download latest snapshot
- This program is free software; you can redistribute it and/or modify it under the terms of version 3 of the GNU Lesser General Public License or later as published by the Free Software Foundation.
- Program author:
- Svjatoslav Agejenko
- Homepage: http://svjatoslav.eu
- Email: mailto://svjatoslav@svjatoslav.eu
- other applications hosted at svjatoslav.eu
1 Vision / goal
Provide versioned, clustered, flexible, object-relational database functionality for the Sixth computation engine.
- I hate object-relational impedance mismatch.
- I don't like converting data between a persistent database and runtime
objects for every transaction. How about creating a unified
database/computation engine instead, to:
- Eliminate the constant moving and converting of data between two systems.
- Abstract away the difference between RAM and persistent storage. Let the system decide at runtime which data to keep in which kind of memory.
1.1 Inspiration
- Relational databases:
- Transactional.
- Indexable / Quickly searchable.
- Git (version control system)
- Versionable
- Branchable / mergeable.
- Transparent consistency, checksumming and deduplication.
- (Git as a database: https://www.kenneth-truyers.net/2016/10/13/git-nosql-database/ )
1.2 Solution (the big idea)
I see a 4D data structure.
Dimensions:
- List of all the objects in the system (rows).
- List of all declared unique object fields (columns).
- List of all historical transactions/commits/versions (think of sheets of paper).
- List of all concurrently running branches/threads. Branches can appear and merge over time as needed.
- (Every cell is a concrete field value within an object.)
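To make the four dimensions concrete, here is a minimal in-memory sketch. It is not the Sixth implementation. The names CellAddress and FourDStore are hypothetical, and so is the lookup rule that a read at some version returns the newest value written at or before that version on the same branch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellAddress:
    """Hypothetical address of one cell in the 4D structure."""
    object_id: str  # which object (row)
    field: str      # which declared field (column)
    version: int    # which transaction/commit (sheet of paper)
    branch: str     # which branch/thread


class FourDStore:
    """Toy store: a dictionary from 4D address to field value."""

    def __init__(self):
        self._cells = {}

    def put(self, object_id, field, version, branch, value):
        self._cells[CellAddress(object_id, field, version, branch)] = value

    def get(self, object_id, field, version, branch):
        # Assumed rule: newest value written at or before `version`.
        candidates = [
            (address.version, value)
            for address, value in self._cells.items()
            if address.object_id == object_id
            and address.field == field
            and address.branch == branch
            and address.version <= version
        ]
        if not candidates:
            raise KeyError((object_id, field, version, branch))
        return max(candidates, key=lambda pair: pair[0])[1]


store = FourDStore()
store.put("user-1", "name", 1, "main", "Alice")
store.put("user-1", "name", 3, "main", "Alicia")
name_at_v2 = store.get("user-1", "name", 2, "main")
name_at_v3 = store.get("user-1", "name", 3, "main")
```

The linear scan in get is only for readability. A real engine would need an index to avoid it.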
Partitioning/clustering:
- Why not partition (load balance) the data as required across networked physical computers along any of the dimensions declared above?
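One possible reading of this idea, as a sketch only: pick a dimension and hash its value to choose a node. The function pick_node, the node names and the use of CRC32 are all assumptions made for illustration, not part of Sixth.

```python
import zlib


def pick_node(address, dimension, nodes):
    """Choose a node for a cell by hashing one dimension of its address.

    `address` is a dict with the keys object_id, field, version and branch.
    """
    key = str(address[dimension]).encode("utf-8")
    return nodes[zlib.crc32(key) % len(nodes)]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical machines
cell_1 = {"object_id": "user-1", "field": "name", "version": 1, "branch": "main"}
cell_2 = {"object_id": "user-2", "field": "email", "version": 7, "branch": "main"}

# Partitioning along the branch dimension keeps a whole branch on one node.
node_1 = pick_node(cell_1, "branch", nodes)
node_2 = pick_node(cell_2, "branch", nodes)
```

Partitioning along object_id instead would spread the objects of one branch over several nodes, which is the kind of trade-off the question above leaves open.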
Indexing (for fast searching):
- Why not index along arbitrary dimensions (as required)?
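As a rough illustration, an index along one dimension can be a map from each value of that dimension to the set of cell addresses that carry it. The names Address and build_index below are hypothetical.

```python
from collections import defaultdict, namedtuple

# Hypothetical 4D cell address: row, column, commit, branch.
Address = namedtuple("Address", ["object_id", "field", "version", "branch"])


def build_index(addresses, dimension):
    """Group cell addresses by their value along the chosen dimension."""
    index = defaultdict(set)
    for address in addresses:
        index[getattr(address, dimension)].add(address)
    return index


addresses = [
    Address("user-1", "name", 1, "main"),
    Address("user-1", "email", 2, "main"),
    Address("user-2", "name", 2, "experiment"),
]

by_field = build_index(addresses, "field")
by_branch = build_index(addresses, "branch")
```

The same function serves every dimension, so which indexes exist can be decided as required rather than fixed up front.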
Further optimizations:
- At the current early stage, the focus is on the smallest possible set of features that provides the greatest possible power/benefit :)
- Once features are locked, anything can be optimized. Optimization for size (deduplication) can be solved using a Git-style content-addressable storage mechanism.
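The deduplication idea can be sketched as follows: every value is stored under the hash of its own bytes, so identical content is kept only once and can be checksummed on read. This is an illustration, not Sixth's storage format. The class ContentStore is hypothetical, and SHA-256 is chosen here for convenience.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is the hash of the content."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so it is stored once.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        # Re-hash on read to detect corruption (checksumming).
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored content does not match its checksum")
        return data

    def __len__(self):
        return len(self._blobs)


blobs = ContentStore()
key_1 = blobs.put(b"Alice")
key_2 = blobs.put(b"Alice")  # duplicate content, no extra storage
key_3 = blobs.put(b"Bob")
```

Cells in the 4D structure could then hold only such keys, so a value repeated across many versions or branches costs one blob.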
2 Current status
- A very simple persistent key-value map is implemented.
The long-term goal is to implement more advanced features on top of this.