X-Git-Url: http://www2.svjatoslav.eu/gitweb/?a=blobdiff_plain;f=doc%2Findex.org;h=62f29575f2468c2c9565251a1a431d65a01aadf9;hb=2316187f481ff4854fd93e381e6b1c802cd5bac0;hp=3c9334126ef631e1a5313c269291f8c790350482;hpb=a41607862942cced0ec94799ce3adb183cb06f06;p=sixth-data.git

diff --git a/doc/index.org b/doc/index.org
index 3c93341..62f2957 100644
--- a/doc/index.org
+++ b/doc/index.org
@@ -17,6 +17,70 @@
 - [[http://svjatoslav.eu/programs.jsp][other applications hosted at svjatoslav.eu]]
+* (document settings) :noexport:
+** use dark style for TWBS-HTML exporter
+#+HTML_HEAD:
+#+HTML_HEAD:
+#+HTML_HEAD: "
+#+HTML_HEAD:
+
+* Vision / goal
+Provide versioned, clustered, flexible, object-relational database
+functionality for the [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html][Sixth computation engine]].
+
++ I hate object-relational impedance mismatch.
+
++ I don't like converting data between a persistent database and
+  runtime objects for every transaction. How about creating a unified
+  database/computation engine instead, to:
+  + Eliminate the constant moving and converting of data between two
+    systems.
+  + Abstract away the difference between RAM and persistent storage.
+    Let the system decide at runtime which data to keep in which kind
+    of memory.
+
+** Inspiration
++ Relational databases:
+  + Transactional.
+  + Indexable / quickly searchable.
+
++ Git (version control system):
+  + Versionable.
+  + Branchable / mergeable.
+  + Transparent consistency, checksumming and deduplication.
+  + (Git as a database:
+    https://www.kenneth-truyers.net/2016/10/13/git-nosql-database/ )
+
+** Solution (the big idea)
+I see a 4D data structure.
+
+[[file:data model.png]]
+
+Dimensions:
++ List of all the objects in the system (rows).
++ List of all declared unique object fields (columns).
++ List of all historical transactions/commits/versions (think of
+  sheets of paper).
++ List of all concurrently running branches/threads. Branches can
+  appear and merge over time as needed.
++ (Every cell is a concrete field value within an object.)
+
+Partitioning/clustering:
++ Why not partition (load balance) as required across networked
+  physical computers along any of the dimensions declared above?
+
+Indexing (for fast searching):
++ Why not index along arbitrary dimensions (as required)?
+
+Further optimizations:
++ At the current early stage, I am trying to focus on the smallest
+  possible set of features that provides the greatest possible
+  power/benefit :)
++ Once features are locked, anything can be optimized. Optimization
+  for size (deduplication) can be solved using a Git-style
+  content-addressable storage mechanism.
+
 * Current status
 - Implemented very simple persistent key-value map.
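
The four dimensions listed under "Solution (the big idea)" can be pictured as a composite key on top of a plain key-value map, similar in spirit to the one mentioned under "Current status". The following is only a minimal sketch under stated assumptions, not the project's actual design: the names `CellKey` and `FourDStore` are hypothetical, Python is used purely for illustration, and the read rule (return the newest write at or before the requested version on the same branch) is an assumption that the document does not specify. Branch creation and merging are not modeled.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class CellKey:
    """Hypothetical 4D coordinate of one cell (illustrative names only)."""
    object_id: int  # dimension 1: object (row)
    field: str      # dimension 2: declared field (column)
    version: int    # dimension 3: transaction/commit/version
    branch: str     # dimension 4: concurrently running branch


class FourDStore:
    """In-memory stand-in for a key-value map keyed by 4D coordinates."""

    def __init__(self) -> None:
        self._cells: Dict[CellKey, Any] = {}

    def put(self, object_id: int, field: str, version: int,
            branch: str, value: Any) -> None:
        # Each write occupies exactly one cell of the 4D structure.
        self._cells[CellKey(object_id, field, version, branch)] = value

    def get(self, object_id: int, field: str, version: int,
            branch: str) -> Any:
        # Assumed read rule: the newest write at or before `version`
        # on the same branch; None if the cell was never written.
        candidates = [
            key for key in self._cells
            if key.object_id == object_id
            and key.field == field
            and key.branch == branch
            and key.version <= version
        ]
        if not candidates:
            return None
        newest = max(candidates, key=lambda key: key.version)
        return self._cells[newest]


store = FourDStore()
store.put(1, "name", 1, "main", "Alice")
store.put(1, "name", 3, "main", "Alicia")
store.put(1, "name", 2, "experiment", "Al")

print(store.get(1, "name", 2, "main"))        # Alice
print(store.get(1, "name", 3, "main"))        # Alicia
print(store.get(1, "name", 5, "experiment"))  # Al
```

Because every coordinate is part of the key, partitioning or indexing "along a dimension" amounts to grouping keys by that component, for example placing all cells of one branch on one machine. The linear scan in `get` is there only to keep the sketch short.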