X-Git-Url: http://www2.svjatoslav.eu/gitweb/?a=blobdiff_plain;f=doc%2Findex.html;h=709f4d27295bc3ebf94583a7c7551d731a1a8c78;hb=d2dcbfcedcc730df5f45ed3c428e6f2eae849916;hp=e8c9fdfca180d7612e711648c7ba109c45a6c56d;hpb=6a94b638363b5e6d79b8c873f2995637fc73039c;p=sixth-data.git diff --git a/doc/index.html b/doc/index.html index e8c9fdf..709f4d2 100644 --- a/doc/index.html +++ b/doc/index.html @@ -1,12 +1,198 @@ - - + + + + + + Sixth Data - Data storage and computing engine - - - - - + + + @@ -14,355 +200,255 @@ footer {background-color: #111 !important;} pre {background-color: #111; color: #ccc;} - + -
-

Sixth Data - Data storage and computing engine

+
+

Sixth Data - Data storage and computing engine

+ -
-

1 General

+
+

1 General

-
  • Other software projects hosted at svjatoslav.eu -
  • +
  • Other software projects hosted at svjatoslav.eu
  • -
    -

    1.1 Source code

    +
    +

    1.1 Source code

    -
    -

    2 Vision / goal

    +
    +

    2 Vision / goal

    -Provide versioned, clustered, flexible, distributed, multi-dimensional -data storage engine for the Sixth computation engine. +Provide a hackable, versioned, optimized, distributed, geometrical, +arbitrary-dimensional (hypercube-based) data storage and computation +engine (inspired by the brain) for a general-purpose visual computing +environment called Sixth.

    +

    +Because Lisp is a hackable, self-defined, programmable programming +language, it would be used to provide imperative programming support. +

    +
    +
    +
    +

    3 Inspiration

    +
      -
    • Speaking of traditional relational database and object oriented -business applications: - +
    • see also: OLAP cube.
    • +
    +
    +
    +

    3.1 Brain

    +
    - +
+  • Such properties allow parallel geometrical computation and +fit the CM-1 Connection Machine architecture well (for an extra +hardware-accelerated solution).
  • -
    -

    3 Inspiration

    -
    - + +

    +we can pre-distribute data across computation units and perform +parallel geometrical computation. +

    +
    +
    -
    -

    3.1 Brain

    -
    +
    +

    4 Reasons for the hypercube as a so-called first-class citizen

    +
    +
  • Nicely captures apparent properties of the brain.
  • +
  • Naturally supports distributed and parallel geometrical data storage +and computation.
  • -
    +
    +

    5 Geometrical computation idea

    +
    -
    -

    4 Ideas

    -
    -
    -

    4.1 Distributed computation and data storage

    -
    +
    +

    5.1 Distributed computation and data storage

    +
    +

    +Many problems can be translated to geometry (use any shapes and as +many dimensions as you need). Solution(s) to such problems could +then be found via geometrical search/comparison/lookup. As a +bonus, such geometrical data storage AND computation can +naturally be made parallel and distributed. +

    +

    -Maybe every problem can be translated to geometry (use any shapes and -as many dimensions as you need). Solution(s) to such problems would -then appear as relatively simple search/comparison/lookup results. As -a bonus, such geometrical *data storage* AND *computation* can be -naturally made in *parallel* and *distributed*. That's what neurons in -the brain appear to be doing ! :) . Learning means building/updating -the model (the hard part). Question answering is making (relatively -simple) lookups (geometrical queries) against the model. +Learning means building/updating/re-balancing the model (the hard +part). Question answering is making (relatively simple) lookups +(geometrical queries) against the model.
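The idea of answering a question as a geometrical lookup against pre-distributed data can be sketched roughly as follows. This is only an illustration under stated assumptions: the `partition`, `inside` and `box_query` names, the band-based placement rule, and the use of threads as stand-ins for computation units are all hypothetical and are not part of Sixth Data.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(points, unit_count):
    """Pre-distribute points across units by slicing the first axis into
    equal-width bands (a hypothetical placement rule)."""
    low = min(p[0] for p in points)
    high = max(p[0] for p in points)
    width = (high - low) / unit_count or 1.0  # avoid zero width
    units = [[] for _ in range(unit_count)]
    for p in points:
        index = min(int((p[0] - low) / width), unit_count - 1)
        units[index].append(p)
    return units


def inside(point, box_min, box_max):
    """True if the point lies within the axis-aligned box."""
    return all(lo <= c <= hi for c, lo, hi in zip(point, box_min, box_max))


def box_query(units, box_min, box_max):
    """Ask every unit for its matches in parallel, then merge the answers."""
    def search(unit):
        return [p for p in unit if inside(p, box_min, box_max)]

    with ThreadPoolExecutor(max_workers=len(units)) as pool:
        partial_results = list(pool.map(search, units))
    return sorted(p for part in partial_results for p in part)


points = [(0.0, 0.0), (1.0, 5.0), (2.0, 2.0), (8.0, 1.0), (9.0, 9.0)]
units = partition(points, unit_count=3)
matches = box_query(units, box_min=(0.5, 0.0), box_max=(8.5, 3.0))
```

Each unit only scans its own points, so the query cost is spread across units. A real deployment would presumably place units on separate machines rather than threads.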

    -
    -

    4.2 Mapping of hyperspace to traditional object-oriented model

    -
    +
    +

    5.2 Mapping hypercube to object-oriented model and relational database

    +

    Object oriented programming is inspired by the way human mind operates. It allows programmer to express ideas to computer in a more @@ -370,210 +456,167 @@ human-like terms.

    -It is possible to map object model to geometrical hyperspace: +It is also possible to map an object model and a relational +database to a geometrical hyperspace:

      -
    • Object is a point in space (universe). Each object member variable -translates to its own dimension. That is: if class declares 4 -variables for an object, then corresponding object can be stored as -a single point inside 4 dimensional space. Variable values translate -to point coordinates in space. That is: Integer, floating point -number and even boolean and string can be translated to linear value -that can be used as a coordinate along particular dimension. -
    • - -
    • Each class declares its own space (universe). All class instances -(objects) are points inside that particular universe. References -between objects of different types are hyperlinks (portals) between -different universes. -
    • +
    • An object or database table row is a point in the arbitrary-dimensional +space of a hypercube. Each object member variable or database table +column can be mapped to its own dimension in the hypercube. That is: if +a class declares 4 variables for an object, then the corresponding object +can be stored as a single point inside a 4-dimensional +hypercube. Variable values translate to point coordinates in that +hypercube. That is: numbers and strings can be translated to a linear +value that can be used as a coordinate along a particular dimension.
    • + +
    • Each object class or database table declares its own hypercube that +contains instances (objects) of that class or rows of a table.
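A minimal sketch of the mapping described in the list above, assuming a hypothetical `Book` class with 4 member variables. The string encoding shown is just one possible way to obtain a linear value; the document does not prescribe one.

```python
from dataclasses import dataclass, fields


@dataclass
class Book:
    # Hypothetical class: 4 member variables -> 4 dimensions.
    title: str
    pages: int
    price: float
    in_print: bool


def string_to_coordinate(text: str, width: int = 8) -> float:
    """Map a string to a number between 0 and 1 that follows the byte-wise
    ordering of its first `width` UTF-8 bytes (an illustrative choice)."""
    data = text.encode("utf-8")[:width].ljust(width, b"\0")
    return int.from_bytes(data, "big") / 256 ** width


def to_point(obj) -> tuple:
    """Translate each member variable into one coordinate."""
    point = []
    for field in fields(obj):
        value = getattr(obj, field.name)
        if isinstance(value, str):
            point.append(string_to_coordinate(value))
        else:
            point.append(float(value))  # int, float and bool are already linear
    return tuple(point)


point = to_point(Book("Dune", 412, 9.99, True))
```

Here `point` is a 4-tuple, one coordinate per member variable. All `Book` instances would live in the same 4-dimensional hypercube.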
    -
    -

    4.3 Handling of relations

    -
    + +
    +

    5.3 Mapping entity relations in hypercube

    +

    -Consider we want to create database of books and authors. Book can -have multiple authors, and single person can be author for multiple -books. It is possible to store how many hours of work each author has -contributed to every book, using hyperspace as follows: +Suppose we want to create a database of:

    -
      -
    • Every dimension corresponds to one particular book author. (10 -authors in the database, would require 10 dimensional space) -
        -
      • Point in space corresponds to one particular book. -
          -
        • Point location along particular (author) dimension corresponds -to amount of work contributed by particular author for given -book. -
        • -
        -
      • -
      -
    • +
    • Books.
    • +
    • Authors.
    • +
    • Effort: Amount of time contributed by every author to every book +that he/she wrote.

    -Alternatively: +The information above can be represented as a 3D cube whose dimensions are:

    - -
      -
    • Every dimension corresponds to one particular book.
        -
      • Point in space corresponds to one particular author in the entire -database. -
          -
        • Point location along particular (book) dimension corresponds to -amount of work contributed for book by given author (point). -
        • -
        -
      • -
      -
    • +
    • X: Book
    • +
    • Y: Author
    • +
    • Z: Effort
    -
    -
    - -
    -

    4.4 Layered architecture

    -
    -
    -
    layer 1
    disk / block storage / partition -
    -
    layer 2
    key/value storage. Keys are unique and are dictated by -storage engine. Value is arbitrary but limited size byte -array. This layer is responsible for handling disk -defragmentation and consistency in case of crash -recovery. -
    - -
    layer 3
    key/value storage. Keys are content hashes. Values are -arbitrary but limited size content byte arrays. This -layer effectively implements content addressable -storage. Content addressible storage enables GIT-like -behavior (possibility for competing branches, retaining -history, transparent deduplication) -
    - -
    layer 4
    Implements arbitrary dimensional multiverse. -
    - -
    layer 5
    Distributed computation engine. -
    -
    +

    +Points in that cube would nicely capture many-to-many relations +between authors and books. +
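As a rough illustration of the Book / Author / Effort cube, the sketch below stores each relation as one point and answers questions by slicing along an axis. The sample data and the function names `authors_of`, `books_by` and `total_effort` are hypothetical.

```python
# Each point is (book, author, effort_hours): X = book, Y = author, Z = effort.
points = {
    ("Book A", "Alice", 120.0),
    ("Book A", "Bob", 40.0),
    ("Book B", "Alice", 15.0),
}


def authors_of(book):
    """Slice the cube at X == book and project onto the author axis."""
    return sorted(author for b, author, _ in points if b == book)


def books_by(author):
    """Slice the cube at Y == author and project onto the book axis."""
    return sorted(b for b, a, _ in points if a == author)


def total_effort(author):
    """Sum the Z coordinate over the slice Y == author."""
    return sum(effort for _, a, effort in points if a == author)
```

One book can appear in several points (one per author) and one author in several points (one per book), which is how the many-to-many relation is captured without a separate join table.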

    -
    -

    5 Current status

    -
    +
    +

    6 Current status

    +
      -
    • More or less defined Vision / goal. -
    • +
    • More or less defined Vision / goal.
    • -
    • Collected some ideas. -
    • +
    • Collected some inspiring ideas.
    • Implemented very simple persistent key-value map.
      • Long term goal is to use it as a backing storage engine and -implement more advanced features on top of this. -
      • -
      -
    • +implement more advanced features on top of this via layered +architecture. +
    -
    -

    6 See also

    -
    +
    +

    7 See also

    +

    Interesting or competing projects with good ideas:

      -
    • GRAKN.AI +
    • flexible user interface building for interacting with different data
        -
      • database in the form of a knowledge graph that uses machine -reasoning to simplify data processing challenges for AI -applications. https://grakn.ai/ -
      • +
      • Glamorous Toolkit +
          +
        • Moldable development environment. It is a live notebook. It is a +flexible search interface. It is a fancy code editor. It is a +software analysis platform. It is a data visualization engine. All +in one.
        • +
      • +
    - +
    -
  • Magma +
    +

    7.1 Computation on multi dimensional data

    +
    -
  • +
    +
    +
    +

    7.2 Distributed, reliable, parallel computing systems

    +
    +
      +
    • ChrysaLisp +
        +
      • Assembler/C-Script/Lisp 64 bit, MIMD, multi CPU, multi threaded, +multi core, multi user Parallel OS. With GUI, Terminal, OO +Assembler, Class libraries, C-Script compiler, Lisp interpreter, +Debugger, and more…
      • +
    • Gemstone/S
      • Completely distributed smalltalk based computing -system. -
      • -
      -
    • +system. +
    + +
  • http://phantomos.org/ +
      +
    • Programs run forever. System crash or reboot does not destroy +state of running program.
    • +
  • + +
  • Magma +
      +
    • Multi-user object database for Squeak
    • +
  • TAOS
      -
    • Completely distributed operating system/virtual machine: -
    • +
    • Completely distributed operating system/virtual machine:
    • +
  • - +
    +
    -
  • ChrysaLisp +
    +

    7.3 Rules based machine reasoning

    +
      -
    • Assembler/C-Script/Lisp 64 bit, MIMD, multi CPU, multi threaded, -multi core, multi user Parallel OS. With GUI, Terminal, OO -Assembler, Class libraries, C-Script compiler, Lisp interpreter, -Debugger, and more… -
    • -
    -
  • +
  • GRAKN.AI +
      +
    • database in the form of a knowledge graph that uses machine +reasoning to simplify data processing challenges for AI +applications. https://grakn.ai/
    • +
  • + +
  • Prolog programming language
  • -
    -
    -

    Author: Svjatoslav Agejenko

    -

    Created: 2019-01-18 Fri 23:27

    -

    Emacs 26.1 (Org-mode 9.1.9)

    -
    +
    +

    Author: Svjatoslav Agejenko

    +

    Created: 2021-03-16 Tue 20:48

    +


    +