X-Git-Url: http://www2.svjatoslav.eu/gitweb/?a=blobdiff_plain;f=doc%2Findex.html;h=34a8cc164340f6dd4a7695c57474d1a408ef49cd;hb=d8fc769e32de3c36ee37ad7ca0450c4c0e05a765;hp=29e212a12b7257c69acc942f32670f0e59ad3def;hpb=a41607862942cced0ec94799ce3adb183cb06f06;p=sixth-data.git diff --git a/doc/index.html b/doc/index.html index 29e212a..34a8cc1 100644 --- a/doc/index.html +++ b/doc/index.html @@ -1,196 +1,540 @@ - - - + + - - - Sixth - system for data storage, computation, exploration and interaction - - + + + + + + + +" + - -
-

Sixth - system for data storage, computation, exploration and interaction

-
-

Table of Contents

- -
-
+
+

Sixth - system for data storage, computation, exploration and interaction

+
+ -
  • other applications hosted at svjatoslav.eu
  • +
  • other applications hosted at svjatoslav.eu +
  • -
    -

    1 Current status

    +
    +

    1 Vision / goal

    +

    +Provide a versioned, clustered, flexible, distributed, multi-dimensional +data storage engine for the Sixth computation engine. +

    + +
      +
    • Speaking of traditional relational databases and object-oriented +business applications: +
        -
      • Implemented very simple persistent key-value map.
      • +
      • I hate object-relational impedance mismatch. +
      • + +
      • I don't like converting data between a persistent database and +runtime objects for every transaction. How about creating a unified +database/computation engine instead, to: +
      • + +
      • Eliminate the constant moving and converting of data between two systems +and make computation happen close to where the data is stored. +
      • + +
      • Abstract away the difference between RAM and persistent storage. Let +the system decide at runtime which data to keep in which kind of +memory. +
      +
    • +
    +
    +
    + +
    +

    2 Inspiration

    +
    +
      +
    • Relational databases: +
        +
      • Transactional. +
      • +
      • Indexable / Quickly searchable. +
      • +
      +
    • + +
    • Git (version control system) +
        +
      • Versionable +
      • +
      • Branchable / mergeable. +
      • +
      • Transparent consistency, checksumming and deduplication. +
      • +
      • (Git as a database: +
      • +
      +

      +https://www.kenneth-truyers.net/2016/10/13/git-nosql-database/ ) +

      +
    • +
    +
    + +
    +

    2.1 Brain

    + +
    +
    +
    +

    3 Ideas

    +
    +
    +

    3.1 Distributed computation and data storage

    +

    -Long term goal is to implement more advanced features on top of this. +Maybe every problem can be translated to geometry (use any shapes and +as many dimensions as you need). Solution(s) to such problems would +then appear as relatively simple search/comparison/lookup results. As +a bonus, such geometrical *data storage* AND *computation* can +naturally be made *parallel* and *distributed*. That's what neurons in +the brain appear to be doing! :) Learning means building/updating +the model (the hard part). Question answering means making (relatively +simple) lookups (geometrical queries) against the model.
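The split between learning (building the model) and answering (a geometrical lookup) can be illustrated with a toy nearest-neighbour search. This is only a sketch under assumptions: the class name `GeometricModel`, its methods and the sample points are hypothetical and are not part of Sixth.

```python
import math


class GeometricModel:
    """Toy model: learned facts are labelled points in an n-dimensional space."""

    def __init__(self):
        self._points = []  # list of (coordinates, answer) pairs

    def learn(self, coordinates, answer):
        # "Learning" here is just storing a labelled point.
        self._points.append((tuple(coordinates), answer))

    def ask(self, query):
        # "Answering" is a lookup: return the answer of the nearest stored point.
        nearest = min(self._points, key=lambda entry: math.dist(entry[0], query))
        return nearest[1]


model = GeometricModel()
model.learn((0.0, 0.0), "cold")
model.learn((10.0, 10.0), "hot")
result_near_origin = model.ask((1.0, 2.0))
result_far = model.ask((9.0, 7.0))
```

Each distance is computed independently of the others, which is why a lookup of this kind can be split across many workers.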

    +
    +

    3.2 Mapping of hyperspace to traditional object-oriented model

    +
    +

    +Object-oriented programming is inspired by the way the human mind +operates. It allows the programmer to express ideas to the computer in more +human-like terms. +

    + +

    +It is possible to map the object model to a geometrical hyperspace: +

    + +
      +
    • An object is a point in space (universe). Each member variable of the object +translates to its own dimension. That is: if a class declares 4 +variables for an object, then the corresponding object can be stored as +a single point inside a 4-dimensional space. Variable values translate +to point coordinates in space. That is: an integer, a floating point +number and even a boolean or a string can be translated to a linear value +that can be used as a coordinate along a particular dimension. +
    • + +
    • Each class declares its own space (universe). All class instances +(objects) are points inside that particular universe. References +between objects of different types are hyperlinks (portals) between +different universes. +
    • +
    +
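The mapping described above can be sketched in a few lines. Everything here is an assumption made for illustration: the `Person` class, the helper names, and especially the string encoding (the first UTF-8 bytes read as a base-256 fraction), which is just one possible way to turn a string into a linear value.

```python
from dataclasses import dataclass, fields


def to_coordinate(value, str_bytes=6):
    """Translate one member value to a coordinate (hypothetical encoding)."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        # Read the first bytes as a base-256 fraction in [0, 1),
        # so that the byte order of the prefix is preserved.
        data = value.encode("utf-8")[:str_bytes]
        return float(sum(b / 256 ** (i + 1) for i, b in enumerate(data)))
    raise TypeError(f"unsupported member type: {type(value).__name__}")


@dataclass
class Person:  # hypothetical class with 4 member variables
    age: int
    height_m: float
    employed: bool
    name: str


def to_point(obj):
    """One dimension per member variable: the object becomes a single point."""
    return tuple(to_coordinate(getattr(obj, f.name)) for f in fields(obj))


point = to_point(Person(age=30, height_m=1.8, employed=True, name="A"))
```

A class with 4 member variables yields a 4-tuple, that is, a point in that class's own 4-dimensional universe. The string encoding is lossy beyond the first `str_bytes` bytes, which a real implementation would have to address.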
    +
    +
    +

    3.3 Handling of relations

    +
    +

    +Suppose we want to create a database of books and authors. A book can +have multiple authors, and a single person can be an author of multiple +books. It is possible to store how many hours of work each author has +contributed to every book, using hyperspace as follows: +

    + +
      +
    • Every dimension corresponds to one particular book author (10 +authors in the database would require a 10-dimensional space). +
        +
      • A point in space corresponds to one particular book. +
          +
        • The point's location along a particular (author) dimension corresponds +to the amount of work contributed by that author to the given +book. +
        • +
        +
      • +
      +
    • +
    + +

    +Alternatively: +

    + +
      +
    • Every dimension corresponds to one particular book. +
        +
      • A point in space corresponds to one particular author in the entire +database. +
          +
        • The point's location along a particular (book) dimension corresponds to +the amount of work contributed to that book by the given author (point). +
        • +
        +
      • +
      +
    • +
    +
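The two layouts above are transposes of each other, which a small sketch can make concrete. The sparse dictionary representation, the function names and the sample data are assumptions chosen for illustration only.

```python
# Books as points, authors as dimensions: books[book][author] = hours of work.
# A missing coordinate is treated as 0 hours.
books = {
    "Book A": {"alice": 120.0, "bob": 30.0},
    "Book B": {"bob": 200.0},
}


def transpose(points):
    """Swap the roles of points and dimensions."""
    result = {}
    for point, coordinates in points.items():
        for dimension, value in coordinates.items():
            result.setdefault(dimension, {})[point] = value
    return result


def coordinate(points, point, dimension):
    """Location of a point along one dimension (0.0 when nothing is stored)."""
    return points.get(point, {}).get(dimension, 0.0)


# Authors as points, books as dimensions: the alternative layout.
authors = transpose(books)
```

Both views hold the same information, so the choice between them mainly decides which queries are cheap: "who worked on this book" in the first layout, "what did this author work on" in the second.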
    +
    + +
    +

    3.4 Layered architecture

    +
    +
    +
    layer 1
    disk / block storage / partition +
    + +
    layer 2
    key/value storage. Keys are unique and are dictated by +the storage engine. A value is an arbitrary but size-limited byte +array. This layer is responsible for handling disk +defragmentation and for consistency during crash +recovery. +
    + +
    layer 3
    key/value storage. Keys are content hashes. Values are +arbitrary but size-limited content byte arrays. This +layer effectively implements content-addressable +storage. Content-addressable storage enables Git-like +behavior (possibility for competing branches, retaining +history, transparent deduplication). +
    + +
    layer 4
    Implements arbitrary dimensional multiverse. +
    + +
    layer 5
    Distributed computation engine. +
    +
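How layer 3 could sit on top of layer 2 can be sketched as follows. This is not the actual Sixth implementation: the class names are hypothetical, an in-memory dictionary stands in for the persistent layer 2 store, and SHA-256 is an assumed choice of content hash.

```python
import hashlib


class KeyValueStore:
    """Stand-in for layer 2: the store itself dictates the keys."""

    def __init__(self):
        self._data = {}
        self._next_key = 0

    def put(self, value: bytes) -> int:
        key = self._next_key
        self._next_key += 1
        self._data[key] = value
        return key

    def get(self, key: int) -> bytes:
        return self._data[key]


class ContentStore:
    """Sketch of layer 3: values are addressed by the hash of their content."""

    def __init__(self, backend: KeyValueStore):
        self._backend = backend
        self._index = {}  # content hash -> layer 2 key

    def put(self, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self._index:  # identical content is stored only once
            self._index[digest] = self._backend.put(content)
        return digest

    def get(self, digest: str) -> bytes:
        return self._backend.get(self._index[digest])


backend = KeyValueStore()
store = ContentStore(backend)
first = store.put(b"hello")
second = store.put(b"hello")  # same content, same hash, no second copy
```

Because a key is derived from the content, storing the same bytes twice costs nothing extra, and a stored value can be verified against its key on every read.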
    +
    +
    +
    + +
    +

    4 Current status

    +
    +
      +
    • More or less defined Vision / goal. +
    • + +
    • Collected some ideas. +
    • + +
    • Implemented very simple persistent key-value map. +
        +
      • The long-term goal is to use it as a backing storage engine and +implement more advanced features on top of it. +
      • +
      +
    • +
    +
    +
    + +
    +

    5 See also

    +
    +

    +Interesting or competing projects with good ideas: +

    + + +
    +
    +
    +