From: Svjatoslav Agejenko
Date: Thu, 25 May 2017 20:02:21 +0000 (+0300)
Subject: Elaborated on high level vision.
X-Git-Url: http://www2.svjatoslav.eu/gitweb/?a=commitdiff_plain;ds=sidebyside;h=2316187f481ff4854fd93e381e6b1c802cd5bac0;p=sixth-data.git

Elaborated on high level vision.
---
diff --git a/doc/data model.png b/doc/data model.png
new file mode 100644
index 0000000..da3274d
Binary files /dev/null and b/doc/data model.png differ
diff --git a/doc/index.html b/doc/index.html
index 29e212a..cc7cce5 100644
--- a/doc/index.html
+++ b/doc/index.html
@@ -1,184 +1,357 @@
- - - + + - - - Sixth - system for data storage, computation, exploration and interaction - - + + + + + + + +" + - -
-

Sixth - system for data storage, computation, exploration and interaction

-
-

Table of Contents

- -
-
+
+

Sixth - system for data storage, computation, exploration and interaction

+
+ -
  • other applications hosted at svjatoslav.eu
  • +
  • other applications hosted at svjatoslav.eu +
  • -
    -

    1 Current status

    +
    +

    1 Vision / goal

    +

    +Provide versioned, clustered, flexible, object-relational database +functionality for the Sixth computation engine. +

    + +
      +
    • I hate object-relational impedance mismatch. +
    • + +
    • I don't like converting data between a persistent database and runtime +objects for every transaction. How about creating a unified +database/computation engine instead, to: +
        +
      • Eliminate constant moving and converting of data between 2 systems. +
      • +
      • Abstract away the difference between RAM and persistent storage. Let +the system decide at runtime which data to keep in which kind of +memory. +
      • +
      +
    • +
    +
    + +
    +

    1.1 Inspiration

    +
    +
      +
    • Relational databases: +
        +
      • Transactional. +
      • +
      • Indexable / Quickly searchable. +
      • +
      +
    • + +
    • Git (version control system) + +
    • +
    +
    +
    + +
    +

    1.2 Solution (the big idea)

    +
    +

    +I see a 4D data structure. +

    + + +
    +

    data model.png +

    +
    + +

    +Dimensions: +

    +
      +
    • List of all the objects in the system (rows). +
    • +
    • List of all declared unique object fields (columns). +
    • +
    • List of all historical transactions/commits/versions (think of +sheets of paper). +
    • +
    • List of all concurrently running branches/threads. Branches can +appear and merge over time as needed. +
    • +
    • (Every cell is a concrete field value within an object.) +
    • +
    + +
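
To make the four dimensions listed above concrete, here is a minimal sketch of a cell store addressed by (object, field, version, branch). It is not code from the Sixth project: the class name `FourDStore`, its methods, and the rule "a read sees the latest value written at or before the requested version" are assumptions made only for illustration. Branch forking and merging are not modelled; each branch is treated as an independent history.

```python
from bisect import bisect_right
from collections import defaultdict


class FourDStore:
    """Toy 4D cell store (hypothetical): (object, field, version, branch) -> value."""

    def __init__(self):
        # (branch, object_id, field) -> ascending list of versions,
        # with a parallel list holding the value written at each version.
        self._versions = defaultdict(list)
        self._values = defaultdict(list)

    def write(self, branch, version, object_id, field, value):
        key = (branch, object_id, field)
        versions = self._versions[key]
        # Assumption: versions of one cell are written in increasing order.
        if versions and version <= versions[-1]:
            raise ValueError("versions must increase within a cell")
        versions.append(version)
        self._values[key].append(value)

    def read(self, branch, version, object_id, field):
        """Return the value visible at `version`, or None if never written."""
        key = (branch, object_id, field)
        versions = self._versions.get(key, [])
        # Number of recorded versions that are <= the requested version.
        i = bisect_right(versions, version)
        if i == 0:
            return None
        return self._values[key][i - 1]


store = FourDStore()
store.write("main", 1, "user-1", "name", "Alice")
store.write("main", 3, "user-1", "name", "Alicia")
print(store.read("main", 2, "user-1", "name"))  # value as of version 2
```

Storing only the versions at which a cell actually changed keeps the "sheets of paper" dimension sparse; a real engine would need a different layout once branches can fork from and merge into each other.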

    +Partitioning/clustering: +

    +
      +
    • Why not partition (load-balance) as required across networked +physical computers along any of the dimension(s) declared above? +
    • +
    + +

    +Indexing (for fast searching): +

      -
    • Implemented very simple persistent key-value map.
    • +
    • Why not index along arbitrary dimensions (as required)? +
    • +
    + +
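
As a rough illustration of indexing along an arbitrary dimension, the sketch below groups cell coordinates by whichever axis is requested. The coordinate order (object, field, version, branch), the `DIMENSIONS` tuple and the `build_index` function are hypothetical names chosen for this example, not part of Sixth.

```python
from collections import defaultdict

# Assumed coordinate order of every cell key (hypothetical).
DIMENSIONS = ("object", "field", "version", "branch")


def build_index(cells, dimension):
    """Group cell coordinates by their value along one dimension."""
    axis = DIMENSIONS.index(dimension)
    index = defaultdict(list)
    for coords in cells:
        index[coords[axis]].append(coords)
    return index


cells = {
    ("user-1", "name", 1, "main"): "Alice",
    ("user-1", "email", 1, "main"): "alice@example.org",
    ("user-2", "name", 2, "main"): "Bob",
}

by_field = build_index(cells, "field")
for coords in by_field["name"]:
    print(coords, cells[coords])
```

The same function builds an index by object, version or branch simply by passing a different dimension name, which is the point of treating all four axes uniformly.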

    +Further optimizations: +

    +
      +
    • At the current early stage, I am trying to focus on the minimum possible set of +features that would provide the maximum possible power/benefit :) +
    • +
    • Once features are locked, anything can be optimized. Optimization for +size (deduplication) can be solved using a Git-style +content-addressable storage mechanism. +
    • +
    +
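
The deduplication idea can be sketched as a content-addressable store, where the key of each value is a hash of its bytes, so identical values are stored once. This is an illustrative assumption, not the project's implementation: the `ContentStore` class is hypothetical, and it uses SHA-256 from Python's `hashlib` rather than Git's own object format.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store (hypothetical): key = hash of the content."""

    def __init__(self):
        self._blobs = {}  # hex digest -> bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same key, so it is kept only once.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def __len__(self):
        return len(self._blobs)


blobs = ContentStore()
first = blobs.put(b"same field value")
second = blobs.put(b"same field value")
print(first == second, len(blobs))
```

Cells in different versions or branches could then hold only the digest, so an unchanged value costs one reference per cell instead of one copy.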
    +
    +
    + +
    +

    2 Current status

    +
    +
      +
    • Implemented very simple persistent key-value map. +

    @@ -186,11 +359,25 @@ Long term goal is to implement more advanced features on top of this.

    +
    +
    +
    +

    Author: Svjatoslav Agejenko

    +

    Created: 2017-05-25 Thu 22:58

    +

    Emacs 25.1.1 (Org-mode 8.2.10)

    +
diff --git a/doc/index.org b/doc/index.org
index 3c93341..62f2957 100644
--- a/doc/index.org
+++ b/doc/index.org
@@ -17,6 +17,70 @@
 - [[http://svjatoslav.eu/programs.jsp][other applications hosted at svjatoslav.eu]]
 
+* (document settings) :noexport:
+** use dark style for TWBS-HTML exporter
+#+HTML_HEAD:
+#+HTML_HEAD:
+#+HTML_HEAD: "
+#+HTML_HEAD:
+
+* Vision / goal
+Provide versioned, clustered, flexible, object-relational database
+functionality for the [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html][Sixth computation engine]].
+
++ I hate object-relational impedance mismatch.
+
++ I don't like converting data between a persistent database and runtime
+  objects for every transaction. How about creating a unified
+  database/computation engine instead, to:
+  + Eliminate constant moving and converting of data between 2 systems.
+  + Abstract away the difference between RAM and persistent storage. Let
+    the system decide at runtime which data to keep in which kind of
+    memory.
+
+** Inspiration
++ Relational databases:
+  + Transactional.
+  + Indexable / Quickly searchable.
+
++ Git (version control system)
+  + Versionable
+  + Branchable / mergeable.
+  + Transparent consistency, checksumming and deduplication.
+  + (Git as a database:
+    https://www.kenneth-truyers.net/2016/10/13/git-nosql-database/ )
+
+** Solution (the big idea)
+I see a 4D data structure.
+
+[[file:data model.png]]
+
+Dimensions:
++ List of all the objects in the system (rows).
++ List of all declared unique object fields (columns).
++ List of all historical transactions/commits/versions (think of
+  sheets of paper).
++ List of all concurrently running branches/threads. Branches can
+  appear and merge over time as needed.
++ (Every cell is a concrete field value within an object.)
+
+Partitioning/clustering:
++ Why not partition (load-balance) as required across networked
+  physical computers along any of the dimension(s) declared above?
+
+Indexing (for fast searching):
++ Why not index along arbitrary dimensions (as required)?
+
+Further optimizations:
++ At the current early stage, I am trying to focus on the minimum possible set of
+  features that would provide the maximum possible power/benefit :)
++ Once features are locked, anything can be optimized. Optimization for
+  size (deduplication) can be solved using a Git-style
+  content-addressable storage mechanism.
+
 * Current status
 - Implemented very simple persistent key-value map.