#+TITLE: Sixth Data - Data storage and computing engine

- This is a subproject of [[http://www3.svjatoslav.eu/projects/sixth/][Sixth]]

- This program is free software: you can redistribute it and/or modify
  it under the terms of the [[https://www.gnu.org/licenses/lgpl.html][GNU Lesser General Public License]] as
  published by the Free Software Foundation, either version 3 of the
  License, or (at your option) any later version.

- Homepage: http://svjatoslav.eu
- Email: mailto://svjatoslav@svjatoslav.eu

- [[http://www.svjatoslav.eu/projects/][Other software projects hosted at svjatoslav.eu]]

- [[http://www2.svjatoslav.eu/gitweb/?p=sixth-data.git;a=snapshot;h=HEAD;sf=tgz][Download latest snapshot in TAR GZ format]]

- [[http://www2.svjatoslav.eu/gitweb/?p=sixth-data.git;a=summary][Browse Git repository online]]

- Clone Git repository using command:
  : git clone http://www2.svjatoslav.eu/git/sixth-data.git

* Vision / goal
:PROPERTIES:
:ID:       f6764282-a6f6-44e6-8716-b428074dd093
:END:

Provide a versioned, clustered, flexible, distributed, multi-dimensional
data storage engine for the [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html][Sixth computation engine]].

+ Regarding traditional relational databases and object-oriented
  business applications:

  + I hate the object-relational impedance mismatch.

  + I don't like converting data between the persistent database and
    runtime objects for every transaction. How about creating a unified
    database/computation engine instead, to:

    + Eliminate the constant moving and converting of data between two
      systems, and make computation happen close to where the data is
      stored.

    + Abstract away the difference between RAM and persistent storage.
      Let the system decide at runtime which data to keep in what kind
      of storage.

+ Relational databases:

  + Indexable / quickly searchable.

+ Git (version control system):

  + Branchable / mergeable.
  + Transparent consistency, checksumming and deduplication.
  + https://www.kenneth-truyers.net/2016/10/13/git-nosql-database/

* Ideas
:PROPERTIES:
:ID:       d2375acc-af14-4f18-8ad0-7949501178c5
:END:

+ The brain appears to have a design with more than three dimensions.
  Food for thought:

  + https://singularityhub.com/2017/06/21/is-there-a-multidimensional-mathematical-world-hidden-in-the-brains-computation/

  + It directly inspires the following ideas:
    + [[id:5d287158-53ea-44a2-a754-dd862366066a][Distributed computation and data storage]]
    + [[id:a117c11e-97c1-4822-88b2-9fc10f96caec][Mapping of hyperspace to traditional object-oriented model]]
    + [[id:b6b15bd2-c78b-4c51-a343-72843a515c29][Handling of relations]]

** Distributed computation and data storage
:PROPERTIES:
:ID:       5d287158-53ea-44a2-a754-dd862366066a
:END:

Maybe every problem can be translated to geometry (use any shapes and
as many dimensions as you need). Solutions to such problems would
then appear as relatively simple search/comparison/lookup results. As
a bonus, such geometrical *data storage* AND *computation* can
naturally be made *parallel* and *distributed*. That's what neurons in
the brain appear to be doing! :) Learning means building/updating
the model (the hard part). Question answering means making (relatively
simple) lookups (geometrical queries) against the model.

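As a minimal sketch of what such a geometrical query could look like,
assume the model is just a set of points that has been split across
several partitions (hypothetical stand-ins for cluster nodes). A
nearest-point lookup can then run on every partition independently,
and the partial answers are merged at the end. The names
=nearest_in_partition= and =nearest= are made up for this illustration
and are not part of Sixth Data.

#+BEGIN_SRC python
import math
from concurrent.futures import ThreadPoolExecutor


def nearest_in_partition(points, query):
    """Return (distance, point) for the stored point closest to query.

    Assumes the partition is non-empty.
    """
    return min((math.dist(p, query), p) for p in points)


def nearest(partitions, query):
    """Query every partition independently, then merge the partial answers."""
    with ThreadPoolExecutor() as pool:
        partial = list(
            pool.map(lambda part: nearest_in_partition(part, query), partitions)
        )
    return min(partial)[1]


# The "model": points in 3 dimensional space, split over two hypothetical nodes.
partitions = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(5.0, 5.0, 5.0), (9.0, 0.0, 2.0)],
]

# "Question answering" is a lookup against the model.
answer = nearest(partitions, (4.0, 4.5, 5.0))
#+END_SRC

Because each partition is searched on its own, the same shape of query
would work whether the partitions live in threads, processes or on
separate machines. Only the merge step needs to see all partial results.
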
** Mapping of hyperspace to traditional object-oriented model
:PROPERTIES:
:ID:       a117c11e-97c1-4822-88b2-9fc10f96caec
:END:

Object-oriented programming is inspired by the way the human mind
operates. It allows the programmer to express ideas to the computer in
a more natural way.

It is possible to map an object model to geometrical hyperspace:

+ An object is a point in space (universe). Each member variable of
  the object translates to its own dimension. That is: if a class
  declares 4 variables for an object, then the corresponding object
  can be stored as a single point inside 4 dimensional space. Variable
  values translate to point coordinates in that space. That is: an
  integer, a floating point number and even a boolean or a string can
  be translated to a linear value that can be used as a coordinate
  along a particular dimension.

+ Each class declares its own space (universe). All class instances
  (objects) are points inside that particular universe. References
  between objects of different types are hyperlinks (portals) between
  those universes.

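The mapping above can be sketched in a few lines. This is only an
illustration under stated assumptions: the =Book= class, the
=to_coordinate= and =to_point= helpers, and in particular the string
encoding (the first six UTF-8 bytes read as a base-256 fraction) are
hypothetical choices, not something the project prescribes.

#+BEGIN_SRC python
from dataclasses import dataclass, fields


@dataclass
class Book:
    """Hypothetical class with 4 member variables, so 4 dimensions."""
    pages: int
    price: float
    in_print: bool
    title: str


def to_coordinate(value):
    """Translate one member variable to a linear value (assumed encoding)."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        # Read the first 6 UTF-8 bytes as a base-256 fraction in [0, 1).
        # Strings that differ within those bytes keep their byte-wise order.
        data = value.encode("utf-8")[:6]
        return float(sum(b / 256 ** (i + 1) for i, b in enumerate(data)))
    raise TypeError(f"no coordinate mapping for {type(value).__name__}")


def to_point(obj):
    """Map an object to a point: one coordinate per declared member variable."""
    return tuple(to_coordinate(getattr(obj, f.name)) for f in fields(obj))


book = Book(pages=320, price=19.5, in_print=True, title="Dune")
point = to_point(book)  # a single point in the 4 dimensional "Book" universe
#+END_SRC

Truncating strings to six bytes keeps the coordinate exactly
representable as a floating point number, at the cost of mapping
strings with the same six-byte prefix to the same coordinate. A real
engine would need a richer encoding.
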
** Handling of relations
:PROPERTIES:
:ID:       b6b15bd2-c78b-4c51-a343-72843a515c29
:END:

Suppose we want to create a database of books and authors. A book can
have multiple authors, and a single person can be an author of
multiple books. It is possible to store how many hours of work each
author has contributed to every book, using hyperspace as follows:

+ Every dimension corresponds to one particular book author. (10
  authors in the database would require 10 dimensional space.)
+ A point in space corresponds to one particular book.
  + The point's location along a particular (author) dimension
    corresponds to the amount of work contributed by that author to
    the given book.

Alternatively, the roles can be swapped:

+ Every dimension corresponds to one particular book.
+ A point in space corresponds to one particular author in the entire
  database.
  + The point's location along a particular (book) dimension
    corresponds to the amount of work contributed to that book by the
    given author (point).

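Both layouts hold the same information, which a small sketch can make
concrete. The author names, book titles and hour figures below are
invented sample data, and =hours=, =books_by= and =transpose= are
hypothetical helper names used only for this illustration.

#+BEGIN_SRC python
# One dimension per author (hypothetical sample data).
authors = ["alice", "bob", "carol"]

# Each book is a point; coordinate i is the number of hours
# contributed by authors[i].
books = {
    "Book A": (120.0, 0.0, 40.0),
    "Book B": (0.0, 300.0, 0.0),
}


def hours(book, author):
    """Read one coordinate: hours the author contributed to the book."""
    return books[book][authors.index(author)]


def books_by(author):
    """All books whose point is off zero along the author's dimension."""
    axis = authors.index(author)
    return sorted(title for title, point in books.items() if point[axis] > 0)


def transpose():
    """Swap the roles: one dimension per book, one point per author."""
    titles = sorted(books)
    points = {
        author: tuple(books[title][i] for title in titles)
        for i, author in enumerate(authors)
    }
    return titles, points


titles, author_points = transpose()
#+END_SRC

In this toy form the two layouts are simply transposes of the same
table. Which one is more convenient depends on whether queries more
often start from a book or from an author.
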
** Layered architecture
+ layer 1 :: Disk / block storage / partition.

+ layer 2 :: Key/value storage. Keys are unique and are dictated by
             the storage engine. A value is an arbitrary byte array of
             limited size. This layer is responsible for handling disk
             defragmentation and consistency in case of a crash.

+ layer 3 :: Key/value storage. Keys are content hashes. Values are
             arbitrary content byte arrays of limited size. This
             layer effectively implements content-addressable
             storage. Content-addressable storage enables Git-like
             behavior (possibility of competing branches, retaining
             history, transparent deduplication).

+ layer 4 :: Implements arbitrary dimensional multiverse.

+ layer 5 :: Distributed computation engine.

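To illustrate how layer 3 could sit on top of layer 2, here is a
minimal in-memory sketch. It is not the project's implementation:
=KeyValueLayer=, =ContentStore= and =MAX_VALUE_SIZE= are hypothetical
names, SHA-256 is an assumed choice of hash, and nothing is written to
disk, so crash consistency and defragmentation are out of scope here.

#+BEGIN_SRC python
import hashlib

MAX_VALUE_SIZE = 4096  # assumed size limit for a single value, in bytes


class KeyValueLayer:
    """Stand-in for layer 2: the storage engine dictates the keys."""

    def __init__(self):
        self._values = {}
        self._next_key = 0

    def put(self, value: bytes) -> int:
        if len(value) > MAX_VALUE_SIZE:
            raise ValueError("value exceeds the size limit")
        key = self._next_key
        self._next_key += 1
        self._values[key] = value
        return key

    def get(self, key: int) -> bytes:
        return self._values[key]


class ContentStore:
    """Stand-in for layer 3: keys are content hashes."""

    def __init__(self, backend: KeyValueLayer):
        self._backend = backend
        self._index = {}  # content hash -> layer 2 key

    def put(self, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self._index:  # identical content is stored once
            self._index[digest] = self._backend.put(content)
        return digest

    def get(self, digest: str) -> bytes:
        content = self._backend.get(self._index[digest])
        if hashlib.sha256(content).hexdigest() != digest:
            raise IOError("stored content does not match its hash")
        return content


store = ContentStore(KeyValueLayer())
first = store.put(b"hello")
second = store.put(b"hello")  # same content, same key, no second copy
#+END_SRC

Because the key is derived from the content, deduplication and
checksumming come almost for free: storing the same bytes twice yields
the same key, and every read can be verified against it.
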
- More or less defined the [[id:f6764282-a6f6-44e6-8716-b428074dd093][Vision / goal]].

- Collected some [[id:d2375acc-af14-4f18-8ad0-7949501178c5][ideas]].

- Implemented a very simple persistent key-value map.
  - The long-term goal is to use it as a backing storage engine and
    implement more advanced features on top of it.

Interesting or competing projects with good ideas:

+ GRAKN.AI: database in the form of a knowledge graph that uses
  machine reasoning to simplify data processing challenges for AI.

+ Gemstone/S, based on Smalltalk.
  + http://esug.org/data/ESUG2015/3%20wednesday/1100-1130%20SQL%20Queries%20on%20Smalltalk%20Objects/SQL%20Queries%20in%20Smalltalk%20(James%20Foster).pdf

+ Magma, a distributed database in Smalltalk.
  + http://wiki.squeak.org/squeak/2665

+ https://github.com/zetavm/zetavm

* (document settings) :noexport:
** use dark style for TWBS-HTML exporter
#+HTML_HEAD: <link href="https://bootswatch.com/4/darkly/bootstrap.min.css" rel="stylesheet">
#+HTML_HEAD: <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/1.11.2/jquery.min.js"></script>
#+HTML_HEAD: <script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.1/js/bootstrap.min.js"></script>
#+HTML_HEAD: <style type="text/css">
#+HTML_HEAD: footer {background-color: #111 !important;}
#+HTML_HEAD: pre {background-color: #111; color: #ccc;}
#+HTML_HEAD: </style>