Sixth Data - Data storage and computing engine

1. General +
- 1.1. Source code
+
2. Vision / goal
3. Inspiration +
- 3.1. Brain
- 3.2. CM-1 Connection Machine
+
4. Reasons for hypercube as a so called first class citizen
5. Geometrical computation idea +
+
6. Current status

1 General

1. General

This is a subproject of Sixth -
This program is free software: you can redistribute it and/or modify -it under the terms of the GNU Lesser General Public License as -published by the Free Software Foundation, either version 3 of the -License, or (at your option) any later version. -
This program is free software: released under Creative Commons Zero +(CC0) license
Program author:
- Svjatoslav Agejenko -
- Homepage: https://svjatoslav.eu -
- Email: mailto://svjatoslav@svjatoslav.eu -
-
Svjatoslav Agejenko
Homepage: https://svjatoslav.eu
Email: mailto://svjatoslav@svjatoslav.eu

Other software projects hosted at svjatoslav.eu -

Other software projects hosted at svjatoslav.eu

1.1 Source code

1.1. Source code

Download latest snapshot in TAR GZ format -
Download latest snapshot in TAR GZ format
Browse Git repository online -
Browse Git repository online
Clone Git repository using command: +

+Clone Git repository using command: +

 git clone https://www2.svjatoslav.eu/git/sixth-data.git
+

See JavaDoc. -
See JavaDoc.

2 Vision / goal

2. Vision / goal

-Provide a versioned, clustered, flexible, distributed, multi-dimensional -data storage engine for the Sixth computation engine. +Provide a hackable, versioned, optimized, distributed, geometrical, +arbitrary-dimensional (hypercube-based) data storage and computation +engine (inspired by the brain) for the general-purpose visual computing +environment called Sixth.

Speaking of traditional relational database and object oriented -business applications: - -
- I hate object-relational impedance mismatch. -
- I don't like to convert data between persistent database and -runtime objects for every transaction. How about creating united -database/computation engine instead to: -
- Eliminate constant moving and converting of data between 2 systems -and make computing happen close to where the data is stored. -
- Abstract away difference between RAM VS persistent storage. Let -the system decide at runtime which data to keep in what kind of -memory. -
-

+Because Lisp is a hackable, self-defined, programmable programming +language, it would be used to provide imperative programming support. +

- -

3 Inspiration

3. Inspiration

Relational databases: -
- Transactional. -
- Indexable / Quickly searchable. -
-
Git (version control system) -
- Versionable -
- Branchable / mergeable. -
- Transparent consistency, checksumming and deduplication. -
- (Git as a database: -
-
-https://www.kenneth-truyers.net/2016/10/13/git-nosql-database/ ) -
-
see also: OLAP cube.

- -

3.1 Brain

3.1. Brain

Brain appears to have more than 3D dimensional design: -https://singularityhub.com/2017/06/21/is-there-a-multidimensional-mathematical-world-hidden-in-the-brains-computation/ -
Brain appears to be natural geometrical/parallel data storage and +computational engine: +
- https://www.quantamagazine.org/the-brain-maps-out-ideas-and-memories-like-spaces-20190114/
- Geometrical Thinking Offers a Window Into Computation
Even more awesome is that brain appears to operate and is wired as +arbitrary/variable dimensional structure: +https://singularityhub.com/2017/06/21/is-there-a-multidimensional-mathematical-world-hidden-in-the-brains-computation/
Brain appears to use geometry to map thoughts and even sounds: +
On top of this, this multidimensional space that brain represents +has dynamic/variable resolution/density:
- https://www.quantamagazine.org/the-brain-maps-out-ideas-and-memories-like-spaces-20190114/ -
- https://www.quantamagazine.org/goals-and-rewards-redraw-the-brains-map-of-the-world-20190328 -
-
https://www.quantamagazine.org/goals-and-rewards-redraw-the-brains-map-of-the-world-20190328

It directly inspires the Geometrical computation idea and fits nicely -with the CM-1 Connection Machine design. -

Such properties allow parallel Geometrical computation and +fit beautifully with the CM-1 Connection Machine architecture (for an extra +hardware-accelerated solution).

3.2 CM-1 Connection Machine

+ +

3.2. CM-1 Connection Machine

https://en.wikipedia.org/wiki/Connection_Machine

see: Geometrical computation -
Computation unit has local CPU and RAM. -
Data is pre-distributed across computation units. -
Machine's internal 12-dimensional hypercube network makes it possible to -efficiently simulate an arbitrary-dimensional network topology between -computational units, so that when we are solving/simulating, for -example, a 5-dimensional problem, we can arrange computational units -into a virtual 5D network. See: +
+Massively parallel machine (thousands of CPUs) whose +internal 12-dimensional hypercube network makes it possible to +efficiently simulate arbitrary-dimensional hypercube and network +topologies between computational units, so that when we are +solving/simulating, for example, a 5-dimensional problem, we can arrange +computational units into a virtual 5D network. See: http://www.mission-base.com/tamiko/theory/cm_txts/di-ch2.html -

+ +

+we can pre-distribute data across computation units and perform +parallel geometrical computation. +
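The addressing behind such a hypercube network can be illustrated with a short Python sketch. It assumes the textbook model, in which each node of a d-dimensional hypercube has a d-bit address and is wired to the d nodes whose addresses differ in exactly one bit. The function names `neighbors` and `route` are made up for this illustration, and the actual CM-1 router is more involved than this.

```python
def neighbors(node, dims):
    """Return the addresses directly wired to `node` in a `dims`-dimensional hypercube."""
    # Flipping one address bit moves along exactly one dimension.
    return [node ^ (1 << bit) for bit in range(dims)]


def route(src, dst):
    """Return the addresses visited when differing bits are fixed from lowest to highest."""
    path = [src]
    current = src
    diff = src ^ dst  # set bits mark the dimensions where src and dst differ
    bit = 0
    while diff:
        if diff & 1:
            current ^= 1 << bit
            path.append(current)
        diff >>= 1
        bit += 1
    return path


print(len(neighbors(0, 12)))    # every node of a 12D hypercube has 12 links
print(route(0b00000, 0b10101))  # [0, 1, 5, 21]
```

In this model, the 32 nodes that share the same upper 7 address bits of a 12-bit address form a 5D sub-hypercube, which is one simple way to picture a virtual 5D network inside the larger machine.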

4 Ideas

4. Reasons for hypercube as a so called first class citizen

4.1 Geometrical computation

Inspired by Brain. -
Fits nicely with CM-1 Connection Machine properties. -
Hypercube is a quite general-purpose data structure that naturally +encapsulates a wide variety of data and problems.
Nicely captures apparent properties of the brain.
Naturally supports distributed and parallel geometrical data storage +and computation.
Dedicated hardware like CM-1 can be built around the hypercube concept, +which results in data, computation process and hardware all +fitting together beautifully while complementing each other's +strengths.
Hypercube-stored data (and computation process) has geometry by its +nature and should fit nicely with the "3D first" user interface ideology +of the parent Sixth project.

5. Geometrical computation idea

5.1. Distributed computation and data storage

+Lots of problems can be translated to geometry (use any shapes and as +many dimensions as you need). Solution(s) to such problems could +then be found via geometrical search/comparison/lookup results. As a +bonus, such geometrical data storage AND computation can +naturally be made parallel and distributed. +

4.1.1 Distributed computation and data storage

-Maybe every problem can be translated to geometry (use any shapes and -as many dimensions as you need). Solution(s) to such problems would -then appear as relatively simple search/comparison/lookup results. As -a bonus, such geometrical *data storage* AND *computation* can be -naturally made in *parallel* and *distributed*. That's what neurons in -the brain appear to be doing ! :) . Learning means building/updating -the model (the hard part). Question answering is making (relatively -simple) lookups (geometrical queries) against the model. +Learning means building/updating/re-balancing the model (the hard +part). Question answering is making (relatively simple) lookups +(geometrical queries) against the model.
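The split between learning (building the model) and question answering (a geometrical lookup) can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the class name `GeometricModel` and its methods are hypothetical, the model is a plain in-memory list, and the lookup is a brute-force nearest-point search rather than anything parallel or distributed.

```python
import math


class GeometricModel:
    """Toy model: stored points with attached answers, queried by distance."""

    def __init__(self):
        self.points = []  # list of (coordinates, answer) pairs

    def learn(self, coordinates, answer):
        # Learning here is just adding a point; a real engine would also
        # re-balance and index the space.
        self.points.append((tuple(coordinates), answer))

    def query(self, coordinates):
        # Question answering: return the answer attached to the nearest point.
        nearest = min(self.points, key=lambda entry: math.dist(entry[0], coordinates))
        return nearest[1]


model = GeometricModel()
model.learn((0, 0), "cold")
model.learn((10, 10), "hot")
print(model.query((1, 2)))  # cold
```

The linear scan in `query` is the part that a distributed engine could split across computation units, with each unit searching only the points it stores.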

4.1.2 Mapping of hyperspace to traditional object-oriented model

5.2. Mapping hypercube to object-oriented model and relational database

Object-oriented programming is inspired by the way the human mind operates. It allows the programmer to express ideas to the computer in more human-like terms.

-It is possible to map the object model to geometrical hyperspace: +It is also possible to map the object model and a relational +database to geometrical hyperspace:

Object is a point in space (universe). Each object member variable -translates to its own dimension. That is: if class declares 4 -variables for an object, then corresponding object can be stored as -a single point inside 4 dimensional space. Variable values translate -to point coordinates in space. That is: Integer, floating point -number and even boolean and string can be translated to linear value -that can be used as a coordinate along particular dimension. -
Each class declares its own space (universe). All class instances -(objects) are points inside that particular universe. References -between objects of different types are hyperlinks (portals) between -different universes. -
An object or database table row is a point in an arbitrary-dimensional +hypercube space. Each object member variable or database table +column can be mapped to its own dimension in the hypercube. That is: if +a class declares 4 variables for an object, then the corresponding object +can be stored as a single point inside a 4-dimensional +hypercube. Variable values translate to point coordinates in that +hypercube. That is: numbers and strings can be translated to a linear +value that can be used as a coordinate along a particular dimension.
Each object class or database table declares its own hypercube that +contains instances (objects) of that class or rows of the table.
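The object-to-point mapping described above could look roughly like the following Python sketch. Everything named here is an assumption made for illustration: the `Person` class, the `to_coordinate` and `to_point` helpers, and in particular the string encoding, which simply reads the first 6 UTF-8 bytes as one integer so that short strings keep their byte-wise ordering along the axis.

```python
from dataclasses import dataclass, fields


def to_coordinate(value):
    """Translate one member variable into a linear value usable as a coordinate."""
    if isinstance(value, bool):
        return int(value)  # False -> 0, True -> 1
    if isinstance(value, (int, float)):
        return value
    if isinstance(value, str):
        # Hypothetical encoding: first 6 UTF-8 bytes, zero padded, as one integer.
        prefix = value.encode("utf-8")[:6].ljust(6, b"\x00")
        return int.from_bytes(prefix, "big")
    raise TypeError(f"no coordinate mapping for {type(value).__name__}")


@dataclass
class Person:
    name: str
    age: int
    height_m: float
    employed: bool


def to_point(obj):
    """Map a dataclass instance to a point: one dimension per member variable."""
    return tuple(to_coordinate(getattr(obj, f.name)) for f in fields(obj))


point = to_point(Person("Ada", 36, 1.7, True))
print(len(point))  # 4 member variables give a point in 4-dimensional space
```

Strings that share their first 6 bytes collapse onto the same coordinate in this sketch, so a real engine would need a richer encoding along string dimensions.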

4.1.3 Handling of relations

+ +

5.3. Mapping entity relations in hypercube

-Consider we want to create database of books and authors. Book can -have multiple authors, and single person can be author for multiple -books. It is possible to store how many hours of work each author has -contributed to every book, using hyperspace as follows: +Consider we want to create database of:

- -

Every dimension corresponds to one particular book author. (10 -authors in the database, would require 10 dimensional space) -
- Point in space corresponds to one particular book.
  - Point location along particular (author) dimension corresponds -to amount of work contributed by particular author for given -book. -
  -
-
Books.
Authors.
Effort: Amount of time contributed by every author to every book +that he/she wrote.

-Alternatively: +The information above can be represented as a 3D cube whose dimensions are:

Every dimension corresponds to one particular book. -
- Point in space corresponds to one particular author in the entire -database. -
  - Point location along particular (book) dimension corresponds to -amount of work contributed for book by given author (point). -
  - X: Book
  - Y: Author
  - Z: Effort
  -
-

- -

4.2 Layered architecture

layer 1: disk / block storage / partition -
layer 2: key/value storage. Keys are unique and are dictated by -storage engine. Value is arbitrary but limited size byte -array. This layer is responsible for handling disk -defragmentation and consistency in case of crash -recovery. -
layer 3: key/value storage. Keys are content hashes. Values are -arbitrary but limited size content byte arrays. This -layer effectively implements content addressable -storage. Content addressable storage enables Git-like -behavior (possibility for competing branches, retaining -history, transparent deduplication) -
layer 4: Implements arbitrary dimensional multiverse. -
layer 5: Distributed computation engine. -

+Points in that cube would nicely capture many to many relations +between authors and the books. +
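A minimal Python sketch of the Book / Author / Effort cube is shown below. The book titles, author names and hour values are made-up sample data, and the helper functions are hypothetical; the point is only that each (book, author, effort) triple is one point, and that many-to-many questions become filters along one axis.

```python
from collections import namedtuple

# One point in the 3D cube: X = book, Y = author, Z = effort (hours).
Point = namedtuple("Point", ["book", "author", "effort"])

# Made-up sample data for illustration only.
points = [
    Point("Book A", "Alice", 120.0),
    Point("Book A", "Bob", 40.0),
    Point("Book B", "Alice", 75.0),
]


def authors_of(book):
    """Slice the cube at one X value and read off the Y coordinates."""
    return sorted(p.author for p in points if p.book == book)


def books_by(author):
    """Slice the cube at one Y value and read off the X coordinates."""
    return sorted(p.book for p in points if p.author == author)


def total_effort(author):
    """Sum the Z coordinates of all points on one author's slice."""
    return sum(p.effort for p in points if p.author == author)


print(authors_of("Book A"))   # ['Alice', 'Bob']
print(books_by("Alice"))      # ['Book A', 'Book B']
print(total_effort("Alice"))  # 195.0
```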

5 Current status

6. Current status

More or less defined Vision / goal. -
More or less defined Vision / goal.
Collected some ideas. -
Collected some inspiring ideas.
Implemented very simple persistent key-value map.
- Long term goal is to use it as a backing storage engine and -implement more advanced features on top of this. -
-

- -

6 See also

-Interesting or competing projects with good ideas: -

- -

CM-1 Connection Machine -
GRAKN.AI -
- database in the form of a knowledge graph that uses machine -reasoning to simplify data processing challenges for AI -applications. https://grakn.ai/ -
-
Magma -
- Multi-user object database for Squeak -
-
Gemstone/S -
- Completely distributed smalltalk based computing -system. -
-
TAOS -
- Completely distributed operating system/virtual machine: -
-
ChrysaLisp -
- Assembler/C-Script/Lisp 64 bit, MIMD, multi CPU, multi threaded, -multi core, multi user Parallel OS. With GUI, Terminal, OO -Assembler, Class libraries, C-Script compiler, Lisp interpreter, -Debugger, and more… -
-
