X-Git-Url: http://www2.svjatoslav.eu/gitweb/?a=blobdiff_plain;f=doc%2Findex.html;h=d0f979d5b047cfde09b816345d0080da6f77137e;hb=32e305d7f189ddd3a6e62a2abf29065187cc75b2;hp=cc7cce511bd10eec26e31452391debda615c27db;hpb=2316187f481ff4854fd93e381e6b1c802cd5bac0;p=sixth-data.git diff --git a/doc/index.html b/doc/index.html index cc7cce5..d0f979d 100644 --- a/doc/index.html +++ b/doc/index.html @@ -2,7 +2,7 @@ Sixth - system for data storage, computation, exploration and interaction - + @@ -220,25 +220,28 @@ License or later as published by the Free Software Foundation. -
  • other applications hosted at svjatoslav.eu +
  • other applications hosted at svjatoslav.eu
  • -

    1 Vision / goal

    +

    1 Vision / goal

    -Provide versioned, clustered, flexible, object-relational database -functionality for the Sixth computation engine. +Provide a versioned, clustered, flexible, distributed, multi-dimensional +data storage engine for the Sixth computation engine.

    + +
    -
    -

    1.1 Inspiration

    -
    +
    +

    2 Inspiration

    +
    • Relational databases:
        @@ -280,102 +286,131 @@ memory.
    -
    -
    -

    1.2 Solution (the big idea)

    -
    -

    -I see 4D data structure. -

    - - -
    -

    data model.png -

    -
    +
    +

    2.1 Brain

    +
    +
      +
    • Appears to have a more than three-dimensional design. (Food for +thought…) +
    • -

      -Dimensions: -

      +
    • From there come the following ideas:
        -
      • List of all the objects in the system (rows). +
      • Maybe every problem can be translated to geometry (use any shapes +and as many dimensions as you need). Solution(s) to such problems +would then appear as relatively simple search/comparison/lookup +results. As a bonus, such geometrical *data storage* AND +*computation* can naturally be made *parallel* and +*distributed*. That's what neurons in the brain appear to be +doing! :) Learning means building/updating the model (the hard +part). Question answering means making (relatively simple) lookups +(geometrical queries) against the model.
      • -
      • List of all declared unique object fields (columns). + +
      • Mapping of hyperspace to the traditional object-oriented programming +model: +
          +
        • Object is a point in space (universe). Each object member +variable translates to its own dimension. That is: if a class +declares 4 variables for an object, then the corresponding object +can be stored as a single point inside a 4-dimensional +space. Variable values translate to point coordinates in +space. That is: an integer, a floating point number and even a boolean +or a string can be translated to a linear value that can be used as +a coordinate along a particular dimension.
        • -
        • List of all historical transactions/commits/versions (think of -sheets of paper). + +
        • Each class declares its own space (universe). All class +instances (objects) are points inside that particular +universe. References between objects of different types are +hyperlinks (portals) between different universes.
        • -
        • List of all concurrently running branches/threads. Branches can -appear and merge over time as needed. +
      • -
      • (Every cell is concrete field value within an object) +
    +
    +
    +
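The mapping described above (an object is a point, each member variable is its own dimension, and a lookup is a geometrical query) can be illustrated with a minimal sketch. This is not code from the Sixth project. The `Person` class, the `to_coordinate` encoding (in particular the lossy string-prefix encoding) and the `range_query` helper are all hypothetical names and choices made only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Hypothetical class with 4 member variables, i.e. a 4-dimensional universe."""
    age: int
    height_m: float
    employed: bool
    name: str


def to_coordinate(value):
    """Translate one field value to a linear value (one possible encoding)."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        # Assumption: order strings by their first 4 UTF-8 bytes.
        # This is lossy; longer strings sharing a prefix collapse together.
        prefix = value.encode("utf-8")[:4].ljust(4, b"\0")
        return float(int.from_bytes(prefix, "big"))
    raise TypeError(f"no coordinate encoding for {type(value).__name__}")


def to_point(obj):
    """Map an object to a point: one coordinate per declared field."""
    return tuple(to_coordinate(getattr(obj, f.name)) for f in fields(obj))


def range_query(points, lower, upper):
    """Geometrical query: return the points inside an axis-aligned box."""
    return [
        p for p in points
        if all(lo <= c <= hi for c, lo, hi in zip(p, lower, upper))
    ]


if __name__ == "__main__":
    universe = [
        to_point(Person(30, 1.80, True, "Ann")),
        to_point(Person(50, 1.70, False, "Bob")),
    ]
    inf = float("inf")
    # "Who is between 25 and 40 years old?" expressed as a box in 4D space.
    print(range_query(universe, (25, 0, 0, 0), (40, inf, inf, inf)))
```

A linear scan is used here only to keep the sketch short. An index along one or more dimensions would replace it in any realistic setting.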
    -

    -Partitioning/clustering: -

    +
    +

    3 Current status

    +
      -
    • Why not to partition/(load balance) as required across networked -physical computers along arbitrary dimension(s) declared above ? +
    • More or less defined Vision / goal.
    • -
    -

    -Indexing (for fast searching): -

    +
  • Collected some ideas. +
  • + +
  • Implemented a very simple persistent key-value map.
      -
    • Why not to index along arbitrary dimensions (as required) ? +
    • The long-term goal is to use it as a backing storage engine and +implement more advanced features on top of it. +
    • +
  • +
    +
    +
    +
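To make the idea of a "very simple persistent key-value map" concrete, here is a minimal sketch. It does not reflect how the project actually implements its map. The `PersistentMap` class, its append-only JSON log format and its method names are assumptions chosen for illustration.

```python
import json
import os


class PersistentMap:
    """Hypothetical key-value map backed by an append-only log file."""

    def __init__(self, path):
        self.path = path
        self._data = {}
        # Replay the log so that the latest value for each key wins.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as log:
                for line in log:
                    key, value = json.loads(line)
                    self._data[key] = value

    def put(self, key, value):
        # Append the change to disk first, then update the in-memory view.
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps([key, value]) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)
```

Because every `put` is appended rather than overwritten, the log also keeps the history of each key, which is one simple way more advanced features such as versioning could later be layered on top.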

    4 See also

    +

    -Further optimizations: +Interesting or competing projects with good ideas:

    +
      -
    • In current early stage, trying to focus on minimum possible set of -features that would provide maximum possible set of power/benefit :) -
    • -
    • Once features are locked, anything can be optimized. Optimization for -size (deduplication) can be solved using a Git-style content -addressable storage mechanism. +
    • GRAKN.AI: database in the form of a knowledge graph that uses +machine reasoning to simplify data processing challenges for AI +applications. + -
    -
    -
    + -
    -

    2 Current status

    -
    +
  • GemStone/S, based on Smalltalk. +
  • -

    -Long term goal is to implement more advanced features on top of this. -

    +
  • Magma distributed database in Smalltalk. + +
  • +