Sixth - system for data storage, computation, exploration and interaction
- This is a subproject of Sixth +
- This is a subproject of Sixth + -
- download latest snapshot +
- download latest snapshot +
- This program is free software; you can redistribute it and/or modify it under the terms of version 3 of the GNU Lesser General Public License or later as published by the Free Software Foundation. +
- Program author:
-
-
- Svjatoslav Agejenko -
- Homepage: http://svjatoslav.eu/ -
- Email: mailto://svjatoslav@svjatoslav.eu/ -
+ - Svjatoslav Agejenko + +
- Homepage: http://svjatoslav.eu + +
- Email: mailto://svjatoslav@svjatoslav.eu + +
1 Current status
+1 Vision / goal
+Provide versioned, clustered, flexible, object-relational database +functionality for the Sixth computation engine. +
+ +-
+
- I hate object-relational impedance mismatch. + + +
- I don't like converting data between a persistent database and runtime
+objects for every transaction. How about creating a unified
+database/computation engine instead, to:
+
-
+
- Eliminate the constant moving and converting of data between the two systems. + +
- Abstract away the difference between RAM and persistent storage. Let +the system decide at runtime which data to keep in what kind of +memory. + +
+
1.1 Inspiration
+-
+
- Relational databases:
+
-
+
- Transactional. + +
- Indexable / Quickly searchable. + +
+
+ - Git (version control system)
+
-
+
- Versionable + +
- Branchable / mergeable. + +
- Transparent consistency, checksumming and deduplication. + +
- (Git as a database: +https://www.kenneth-truyers.net/2016/10/13/git-nosql-database/ ) + +
+
1.2 Solution (the big idea)
++I see a 4D data structure. +
+ + + + ++Dimensions: +
+-
+
- List of all the objects in the system (rows). + +
- List of all declared unique object fields (columns). + +
- List of all historical transactions/commits/versions (think of +sheets of paper). + +
- List of all concurrently running branches/threads. Branches can +appear and merge over time as needed. + +
- (Every cell is a concrete field value within an object) + +
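
The four dimensions listed above can be pictured as a map keyed by (object, field, version, branch). What follows is only a toy in-memory sketch of that idea, not Sixth's actual design: the class name `FourDStore`, its methods, and the rule that a read returns the latest write at or before the requested version are all assumptions made for illustration. Branch forking and merging are not modeled.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

# Hypothetical cell address: (object id, field name, version number, branch name).
CellKey = Tuple[int, str, int, str]


@dataclass
class FourDStore:
    """Toy model of the 4D structure; every name here is illustrative."""

    cells: Dict[CellKey, Any] = field(default_factory=dict)

    def put(self, obj: int, fld: str, version: int, branch: str, value: Any) -> None:
        """Write one cell: a concrete field value at a given version and branch."""
        self.cells[(obj, fld, version, branch)] = value

    def get(self, obj: int, fld: str, version: int, branch: str) -> Any:
        """Read a cell as of `version`: the latest write at or before it on `branch`."""
        candidates = [
            v
            for (o, f, v, b) in self.cells
            if o == obj and f == fld and b == branch and v <= version
        ]
        if not candidates:
            raise KeyError((obj, fld, version, branch))
        return self.cells[(obj, fld, max(candidates), branch)]


store = FourDStore()
store.put(1, "name", 1, "main", "Alice")
store.put(1, "name", 3, "main", "Alicia")
store.put(1, "name", 2, "experiment", "Bob")

name_at_v2 = store.get(1, "name", 2, "main")          # falls back to the version 1 write
name_at_v3 = store.get(1, "name", 3, "main")          # sees the version 3 write
name_on_branch = store.get(1, "name", 2, "experiment")
```

A flat map like this makes it easy to see that any of the four key components could serve as a partitioning or indexing axis; a real engine would need a far more compact representation.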
+Partitioning/clustering: +
+-
+
- Why not partition (load balance) as required across networked +physical computers along arbitrary dimension(s) declared above? + +
+Indexing (for fast searching): +
-
-
- Implemented very simple persistent key-value map. +
- Why not index along arbitrary dimensions (as required)? + +
+Further optimizations: +
+-
+
- At this early stage, the focus is on the minimum possible set of +features that would provide the maximum possible power/benefit :) + +
- Once features are locked, anything can be optimized. Optimization for +size (deduplication) can be solved using a Git-style content-addressable +storage mechanism. + +
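
The deduplication idea mentioned above can be sketched as a store that names each value by a hash of its content, so identical values are kept only once. This is a minimal illustration under stated assumptions: `ContentStore` is a hypothetical name, the store lives in memory only, and SHA-256 over the raw bytes is used for simplicity, which is not Git's exact object format.

```python
import hashlib
from typing import Dict


class ContentStore:
    """Toy content-addressable store; names and hashing scheme are assumptions."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data under the hash of its content and return that hash."""
        digest = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same key, so it is stored only once.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Fetch data by hash and verify it still matches (checksumming)."""
        data = self._blobs[digest]
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored content does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


cas = ContentStore()
first = cas.put(b"same field value")
second = cas.put(b"same field value")   # duplicate content, no new entry
other = cas.put(b"different value")
```

Because the address is derived from the content, the same mechanism gives consistency checking and deduplication together, which matches the Git-inspired properties listed under Inspiration.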
2 Current status
+-
+
- Implemented very simple persistent key-value map. +
@@ -186,11 +359,25 @@ Long term goal is to implement more advanced features on top of this.