X-Git-Url: http://www2.svjatoslav.eu/gitweb/?p=sixth-data.git;a=blobdiff_plain;f=doc%2Findex.org;h=6bfc99f6f16588c86a277a402ac61a3a395c1312;hp=3c9334126ef631e1a5313c269291f8c790350482;hb=6a94b638363b5e6d79b8c873f2995637fc73039c;hpb=a41607862942cced0ec94799ce3adb183cb06f06

diff --git a/doc/index.org b/doc/index.org
index 3c93341..6bfc99f 100644
--- a/doc/index.org
+++ b/doc/index.org
@@ -1,23 +1,195 @@
-#+TITLE: Sixth - system for data storage, computation, exploration and interaction
+#+TITLE: Sixth Data - Data storage and computing engine

------
-- This is a subproject of [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html][Sixth]]
+* (document settings) :noexport:
+** use dark style for TWBS-HTML exporter
+#+HTML_HEAD:
+#+HTML_HEAD:
+#+HTML_HEAD:
+#+HTML_HEAD:

-- [[http://www2.svjatoslav.eu/gitweb/?p=sixth-data.git;a=snapshot;h=HEAD;sf=tgz][download latest snapshot]]
+* General
+- This is a subproject of [[http://www3.svjatoslav.eu/projects/sixth/][Sixth]]

-- This program is free software; you can redistribute it and/or modify
-  it under the terms of version 3 of the [[https://www.gnu.org/licenses/lgpl.html][GNU Lesser General Public
-  License]] or later as published by the Free Software Foundation.
+- This program is free software: you can redistribute it and/or modify
+  it under the terms of the [[https://www.gnu.org/licenses/lgpl.html][GNU Lesser General Public License]] as
+  published by the Free Software Foundation, either version 3 of the
+  License, or (at your option) any later version.
 - Program author:
   - Svjatoslav Agejenko
   - Homepage: http://svjatoslav.eu
   - Email: mailto://svjatoslav@svjatoslav.eu

-- [[http://svjatoslav.eu/programs.jsp][other applications hosted at svjatoslav.eu]]
+- [[http://www.svjatoslav.eu/projects/][Other software projects hosted at svjatoslav.eu]]
+** Source code
+- [[http://www2.svjatoslav.eu/gitweb/?p=sixth-data.git;a=snapshot;h=HEAD;sf=tgz][Download latest snapshot in TAR GZ format]]
+- [[http://www2.svjatoslav.eu/gitweb/?p=sixth-data.git;a=summary][Browse Git repository online]]
+
+- Clone Git repository using command:
+  : git clone http://www2.svjatoslav.eu/git/sixth-data.git
+
+* Vision / goal
+  :PROPERTIES:
+  :ID: f6764282-a6f6-44e6-8716-b428074dd093
+  :END:
+Provide versioned, clustered, flexible, distributed, multi-dimensional
+data storage engine for the [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html][Sixth computation engine]].
+
++ Speaking of traditional relational database and object oriented
+  business applications:
+
+  + I hate object-relational impedance mismatch.
+
+  + I don't like to convert data between persistent database and
+    runtime objects for every transaction. How about creating united
+    database/computation engine instead to:
+
+    + Eliminate constant moving and converting of data between 2 systems
+      and make computing happen close to where the data is stored.
+
+    + Abstract away difference between RAM VS persistent storage. Let
+      the system decide at runtime which data to keep in what kind of
+      memory.
+
+* Inspiration
++ Relational databases:
+  + Transactional.
+  + Indexable / Quickly searchable.
+
++ Git (version control system)
+  + Versionable
+  + Branchable / mergeable.
+  + Transparent consistency, checksumming and deduplication.
+
+  (Git as a database:
+  https://www.kenneth-truyers.net/2016/10/13/git-nosql-database/ )
+
+** Brain
+   :PROPERTIES:
+   :ID: d2375acc-af14-4f18-8ad0-7949501178c5
+   :END:
++ Brain appears to have more than 3D dimensional design:
+  https://singularityhub.com/2017/06/21/is-there-a-multidimensional-mathematical-world-hidden-in-the-brains-computation/
+
++ Brain appears to use geometry to map thoughts and even sounds:
+  https://www.quantamagazine.org/the-brain-maps-out-ideas-and-memories-like-spaces-20190114/
+
+
++ It directly inspires following ideas
+  + [[id:5d287158-53ea-44a2-a754-dd862366066a][Distributed computation and data storage]]
+  + [[id:a117c11e-97c1-4822-88b2-9fc10f96caec][Mapping of hyperspace to traditional object-oriented model]]
+  + [[id:b6b15bd2-c78b-4c51-a343-72843a515c29][Handling of relations]]
+* Ideas
+** Distributed computation and data storage
+   :PROPERTIES:
+   :ID: 5d287158-53ea-44a2-a754-dd862366066a
+   :END:
+Maybe every problem can be translated to geometry (use any shapes and
+as many dimensions as you need). Solution(s) to such problems would
+then appear as relatively simple search/comparison/lookup results. As
+a bonus, such geometrical *data storage* AND *computation* can be
+naturally made in *parallel* and *distributed*. That's what neurons in
+the brain appear to be doing ! :) . Learning means building/updating
+the model (the hard part). Question answering is making (relatively
+simple) lookups (geometrical queries) against the model.
+** Mapping of hyperspace to traditional object-oriented model
+   :PROPERTIES:
+   :ID: a117c11e-97c1-4822-88b2-9fc10f96caec
+   :END:
+Object oriented programming is inspired by the way human mind
+operates. It allows programmer to express ideas to computer in a more
+human-like terms.
+
+It is possible to map object model to geometrical hyperspace:
+
++ Object is a point in space (universe). Each object member variable
+  translates to its own dimension.
+  That is: if class declares 4
+  variables for an object, then corresponding object can be stored as
+  a single point inside 4 dimensional space. Variable values translate
+  to point coordinates in space. That is: Integer, floating point
+  number and even boolean and string can be translated to linear value
+  that can be used as a coordinate along particular dimension.
+
++ Each class declares its own space (universe). All class instances
+  (objects) are points inside that particular universe. References
+  between objects of different types are hyperlinks (portals) between
+  different universes.
+** Handling of relations
+   :PROPERTIES:
+   :ID: b6b15bd2-c78b-4c51-a343-72843a515c29
+   :END:
+Consider we want to create database of books and authors. Book can
+have multiple authors, and single person can be author for multiple
+books. It is possible to store how many hours of work each author has
+contributed to every book, using hyperspace as follows:
+
++ Every dimension corresponds to one particular book author. (10
+  authors in the database, would require 10 dimensional space)
+  + Point in space corresponds to one particular book.
+    + Point location along particular (author) dimension corresponds
+      to amount of work contributed by particular author for given
+      book.
+
+Alternatively:
+
++ Every dimension corresponds to one particular book.
+  + Point in space corresponds to one particular author in the entire
+    database.
+    + Point location along particular (book) dimension corresponds to
+      amount of work contributed for book by given author (point).
+
+** Layered architecture
++ layer 1 :: disk / block storage / partition
+
++ layer 2 :: key/value storage. Keys are unique and are dictated by
+             storage engine. Value is arbitrary but limited size byte
+             array. This layer is responsible for handling disk
+             defragmentation and consistency in case of crash
+             recovery.
+
++ layer 3 :: key/value storage. Keys are content hashes.
+             Values are
+             arbitrary but limited size content byte arrays. This
+             layer effectively implements content addressable
+             storage. Content addressable storage enables GIT-like
+             behavior (possibility for competing branches, retaining
+             history, transparent deduplication)
+
++ layer 4 :: Implements arbitrary dimensional multiverse.
+
++ layer 5 :: Distributed computation engine.

 * Current status
+- More or less defined [[id:f6764282-a6f6-44e6-8716-b428074dd093][Vision / goal]].
+
+- Collected some [[id:d2375acc-af14-4f18-8ad0-7949501178c5][ideas]].

 - Implemented very simple persistent key-value map.
+  - Long term goal is to use it as a backing storage engine and
+    implement more advanced features on top of this.
+
+* See also
+Interesting or competing projects with good ideas:
+
++ GRAKN.AI
+  + database in the form of a knowledge graph that uses machine
+    reasoning to simplify data processing challenges for AI
+    applications. https://grakn.ai/
+
++ [[http://wiki.squeak.org/squeak/2665][Magma]]
+  + Multi-user object database for Squeak
+
++ [[http://esug.org/data/ESUG2015/3%20wednesday/1100-1130%20SQL%20Queries%20on%20Smalltalk%20Objects/SQL%20Queries%20in%20Smalltalk%20(James%20Foster).pdf][Gemstone/S]]
+  + Completely distributed smalltalk based computing
+    system.
+
++ [[http://www.uruk.org/emu/Taos.html][TAOS]]
+  + Completely distributed operating system/virtual machine:

-Long term goal is to implement more advanced features on top of this.
++ [[https://github.com/vygr/ChrysaLisp][ChrysaLisp]]
+  + Assembler/C-Script/Lisp 64 bit, MIMD, multi CPU, multi threaded,
+    multi core, multi user Parallel OS. With GUI, Terminal, OO
+    Assembler, Class libraries, C-Script compiler, Lisp interpreter,
+    Debugger, and more...