X-Git-Url: http://www2.svjatoslav.eu/gitweb/?a=blobdiff_plain;f=doc%2Findex.html;h=3ee73a7b8933a5dbf1d1ba37e21e7812f024d1c4;hb=6be6c3c06011a8cbf136a4b795fb7af2c44a5e43;hp=28f924f76fc0583f20e9a4a859579d8fe4f49dcf;hpb=bb7b2daf4049f53eefbde9912daca7f31a3b6717;p=sixth-data.git diff --git a/doc/index.html b/doc/index.html index 28f924f..3ee73a7 100644 --- a/doc/index.html +++ b/doc/index.html @@ -1,401 +1,601 @@ - - + + + -Sixth - system for data storage, computation, exploration and interaction - - - - - - - -" + + + +Sixth Data - Data storage and computing engine + + - + + + + + -
-

Sixth - system for data storage, computation, exploration and interaction

-
-
-
-

1.1 Inspiration

+
+

1.1 Source code

+
+
+ +
+

2 Vision / goal

+
+

+Provide a hackable, versioned, optimized, distributed, geometrical, +arbitrary-dimensional (hypercube-based) data storage and computation +engine (inspired by the brain) for the general-purpose visual computing +environment called Sixth. +

-
  • Git (version control system) +

    +Because Lisp is a hackable, self-defined, programmable programming +language, it would be used to provide imperative programming support. +

    +
  • +
    +
    +

    3 Inspiration

    +
    - +
    +
    +

    3.1 Brain

    +
    +
    -
    -

    1.2 Solution (the big idea)

    -
    +
    +

    3.2 CM-1 Connection Machine

    +

    -I see 4D data structure. +https://en.wikipedia.org/wiki/Connection_Machine

    - -
    -

    data model.png +

    +A massively parallel machine (thousands of CPUs) connected via +the machine's internal 12-dimensional hypercube network, which makes it possible to +efficiently simulate arbitrary-dimensional hypercube and network +topologies between computational units. For example, when +solving or simulating a 5-dimensional problem, we can arrange +computational units into a virtual 5D network. See: +http://www.mission-base.com/tamiko/theory/cm_txts/di-ch2.html

    -

    -Dimensions: +We can pre-distribute data across computational units and perform +parallel geometrical computation.
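
The hypercube network idea can be illustrated with a minimal Python sketch. It assumes the textbook numbering scheme in which nodes are numbered 0 .. 2^n - 1 and two nodes are linked when their binary ids differ in exactly one bit. The function names are hypothetical, and this is not a description of the CM-1's actual routing hardware.

```python
def hypercube_neighbors(node: int, dimensions: int) -> list[int]:
    """Return the ids of nodes directly linked to `node`.

    Assumption: nodes are numbered 0 .. 2**dimensions - 1 and two nodes
    are linked when their binary ids differ in exactly one bit.
    """
    if not 0 <= node < 2 ** dimensions:
        raise ValueError("node id out of range")
    # Flipping each bit in turn yields one neighbor per dimension.
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hop_distance(a: int, b: int) -> int:
    """Minimum number of links between nodes a and b (differing bits)."""
    return bin(a ^ b).count("1")


print(hypercube_neighbors(0, 3))
print(hop_distance(0, 7))
```

Under this numbering every node in a 12-dimensional hypercube has 12 direct neighbors, and any two nodes are at most 12 hops apart.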

    +
    +
    +
    + +
    +

    4 Reasons for hypercube as a so called first class citizen

    +
      -
    • List of all the objecs in the system (rows). -
    • -
    • List of all declared unique object fields (columns). -
    • -
    • List of all historical transactions/commits/versions (think of -sheets of paper). -
    • -
    • List of all concurrently running branches/threads. Branches can -appear and merge over time as needed. -
    • -
    • (Every cell is concrete field value within an object) -
    • +
    • A hypercube is a fairly general-purpose data structure that naturally +encapsulates a wide variety of data and problems.
    • + +
    • Nicely captures apparent properties of the brain.
    • + +
    • Naturally supports distributed and parallel geometrical data storage +and computation.
    • + +
    • Dedicated hardware like the CM-1 can be built around the hypercube concept, +so that data, computation process and hardware all +fit together beautifully while complementing each other's +strengths.
    • + +
    • Hypercube-stored data (and the computation process) is geometrical by +nature and should fit nicely with the "3D first" user interface ideology +of the parent Sixth project.
    +
    +
    +
    +

    5 Geometrical computation idea

    +
    +
    +
    +

    5.1 Distributed computation and data storage

    +
    +

    +Many problems can be translated to geometry (use any shapes and as +many dimensions as you need). Solutions to such problems can +then be found via geometrical search, comparison and lookup. As a +bonus, such geometrical data storage and computation can +naturally be parallelized and distributed. +

    + +

    +Learning means building, updating and re-balancing the model (the hard +part). Question answering means making (relatively simple) lookups +(geometrical queries) against the model. +
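
A minimal Python sketch of the learning versus lookup split. It assumes a nearest-neighbor search as the geometrical query. The names `build_model` and `query` and the sample points are made up for illustration and are not part of the project.

```python
import math


def build_model(examples):
    """'Learning': store labelled points.

    A real engine would also index and re-balance them; this sketch
    only keeps a flat list.
    """
    return [(tuple(point), label) for point, label in examples]


def query(model, point):
    """'Question answering': label of the stored point nearest to `point`."""
    nearest = min(model, key=lambda entry: math.dist(entry[0], point))
    return nearest[1]


# Hypothetical 2D model with two labelled points.
model = build_model([((0.0, 0.0), "cold"), ((10.0, 10.0), "hot")])
print(query(model, (1.0, 2.0)))
```

The lookup here is a linear scan. The point is only that the query is a distance comparison against stored geometry, which is the kind of work that can be spread across many computation units.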

    +
    +
    +
    +

    5.2 Mapping hypercube to object-oriented model and relational database

    +
    +

    +Object-oriented programming is inspired by the way the human mind +operates. It allows the programmer to express ideas to the computer in more +human-like terms. +

    -Partitioning/clustering: +It is also possible to map the object model and relational +databases to geometrical hyperspace:

    +
      -
    • Why not to partition/(load balance) as required across networked -physical computers along arbitrary dimension(s) declared above ? -
    • +
    • An object or database table row is a point in the arbitrary-dimensional +space of a hypercube. Each object member variable or database table +column can be mapped to its own dimension in the hypercube. That is, if +a class declares 4 variables for an object, then the corresponding object +can be stored as a single point inside a 4-dimensional +hypercube. Variable values translate to point coordinates in that +hypercube: numbers and strings can be translated to a linear +value that can be used as a coordinate along a particular dimension.
    • + +
    • Each object class or database table declares its own hypercube that +contains instances (objects) of that class or rows of that table.
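
The mapping above can be sketched in Python as follows. The `Person` class, its fields, and the string-to-coordinate scheme (the first 6 UTF-8 bytes read as a fraction in [0, 1)) are assumptions made for illustration. The source does not specify how strings are linearized.

```python
from dataclasses import dataclass, fields


def to_coordinate(value) -> float:
    """Translate a field value into a linear coordinate (hypothetical scheme)."""
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        # Read the first 6 UTF-8 bytes as a base-256 fraction in [0, 1),
        # so that byte-wise ordering of those prefixes is preserved.
        data = value.encode("utf-8")[:6].ljust(6, b"\0")
        return int.from_bytes(data, "big") / 256 ** 6
    raise TypeError(f"unsupported field type: {type(value).__name__}")


@dataclass
class Person:  # hypothetical class with 4 member variables
    name: str
    age: int
    height_cm: float
    city: str


def to_point(obj) -> tuple:
    """Map an object to a point: one coordinate per declared field."""
    return tuple(to_coordinate(getattr(obj, f.name)) for f in fields(obj))


point = to_point(Person("Ada", 36, 170.0, "London"))
print(len(point))  # one coordinate per field, so a 4D point
```

Strings sharing the same first 6 bytes collapse to the same coordinate in this sketch. A real engine would need a finer or lossless mapping.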
    +
    +
    +
    +

    5.3 Mapping entity relations in hypercube

    +

    -Indexing (for fast searching): +Suppose we want to create a database of:

      -
    • Why not to index along arbitrary dimensions (as required) ? -
    • +
    • Books.
    • +
    • Authors.
    • +
    • Effort: the amount of time each author contributed to each book +they wrote.

    -Further optimizations: +The information above can be represented as a 3D cube whose dimensions are:

      -
    • In current early stage, trying to focus on minimum possible set of -features that would provide maximum possible set of power/benefit :) -
    • -
    • Once featres are locked. Anything can be optimised. Optimization for -size (deduplication) can be solved using Git style content -addressible storage mechanism. -
    • +
    • X: Book
    • +
    • Y: Author
    • +
    • Z: Effort
    + +

    +Points in that cube would nicely capture many-to-many relations +between authors and books. +
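
A minimal Python sketch of the book/author/effort cube, with each point stored as a (book, author, effort) tuple. The book titles, author names, hour values and function names are made up for illustration.

```python
# Each point is (X: book, Y: author, Z: effort in hours); values are invented.
points = [
    ("Book A", "Alice", 120.0),
    ("Book A", "Bob", 40.0),
    ("Book B", "Alice", 300.0),
]


def authors_of(points, book):
    """Slice the cube at X = book and project onto the author axis."""
    return sorted({author for b, author, _ in points if b == book})


def books_by(points, author):
    """Slice the cube at Y = author and project onto the book axis."""
    return sorted({book for book, a, _ in points if a == author})


def total_effort(points, author):
    """Sum the Z coordinate over all points of one author."""
    return sum(effort for _, a, effort in points if a == author)


print(authors_of(points, "Book A"))
print(books_by(points, "Alice"))
print(total_effort(points, "Alice"))
```

One book can have several authors and one author several books, so both directions of the many-to-many relation are answered by slicing the same set of points along a different axis.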

    -
    -

    2 Current status

    -
    +
    +

    6 Current status

    +
      +
    • More or less defined Vision / goal.
    • + +
    • Collected some inspiring ideas.
    • +
    • Implemented a very simple persistent key-value map. -
    • +
        +
      • The long-term goal is to use it as a backing storage engine and +implement more advanced features on top of it via a layered +architecture.
      • +
    +
    +
    +
    +

    7 See also

    +

    -Long term goal is to implement more advanced features on top of this. +Interesting or competing projects with good ideas:

    -
    -
    -
    -

    3 TODO

    -
    -
    -

    3.1 check out Magma

    -
      -
    • http://wiki.squeak.org/squeak/2665 -
    • +
    • Flexible user interface building for interacting with different kinds of data +
        +
      • Glamorous Toolkit +
          +
        • Moldable development environment. It is a live notebook. It is a +flexible search interface. It is a fancy code editor. It is a +software analysis platform. It is a data visualization engine. All +in one.
        • +
      • +
    + +
    +

    7.1 Computation on multi dimensional data

    +
    -
    -
    -
    -

    Author: Svjatoslav Agejenko

    -

    Created: 2017-06-13 Tue 22:14

    -

    Emacs 25.1.1 (Org-mode 8.2.10)

    -
    +
    +
    +
    +

    Author: Svjatoslav Agejenko

    +

    Created: 2021-04-01 Thu 19:11

    +


    +