X-Git-Url: http://www2.svjatoslav.eu/gitweb/?a=blobdiff_plain;f=doc%2Findex.org;h=1dce2777ff536fc431240bc6f28661857927c166;hb=058fc98562d8714f5ffcdb89c50f685b474c61fc;hp=f4919bb94324cbbf975fcd7d2c7084819ee5a32e;hpb=6da6d15b3291c8e14035f5f4f2bd8a2493ab0143;p=sixth-data.git diff --git a/doc/index.org b/doc/index.org index f4919bb..1dce277 100644 --- a/doc/index.org +++ b/doc/index.org @@ -1,89 +1,174 @@ -#+TITLE: Sixth - system for data storage, computation, exploration and interaction +#+TITLE: Sixth Data - Data storage and computing engine ------ -- This is a subproject of [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html][Sixth]] - -- [[http://www2.svjatoslav.eu/gitweb/?p=sixth-data.git;a=snapshot;h=HEAD;sf=tgz][download latest snapshot]] +* (document settings) :noexport: +** use dark style for TWBS-HTML exporter +#+HTML_HEAD: +#+HTML_HEAD: +#+HTML_HEAD: +#+HTML_HEAD: -- This program is free software; you can redistribute it and/or modify - it under the terms of version 3 of the [[https://www.gnu.org/licenses/lgpl.html][GNU Lesser General Public - License]] or later as published by the Free Software Foundation. 
+* General +- This program is free software: released under Creative Commons Zero + (CC0) license - Program author: - Svjatoslav Agejenko - - Homepage: http://svjatoslav.eu + - Homepage: https://svjatoslav.eu - Email: mailto://svjatoslav@svjatoslav.eu -- [[http://www.svjatoslav.eu/programs.jsp][other applications hosted at svjatoslav.eu]] +- [[https://www.svjatoslav.eu/projects/][Other software projects hosted at svjatoslav.eu]] +** Source code +- [[https://www2.svjatoslav.eu/gitweb/?p=sixth-data.git;a=snapshot;h=HEAD;sf=tgz][Download latest snapshot in TAR GZ format]] -* (document settings) :noexport: -** use dark style for TWBS-HTML exporter -#+HTML_HEAD: -#+HTML_HEAD: -#+HTML_HEAD: " -#+HTML_HEAD: +- [[https://www2.svjatoslav.eu/gitweb/?p=sixth-data.git;a=summary][Browse Git repository online]] + +- Clone Git repository using command: + : git clone https://www2.svjatoslav.eu/git/sixth-data.git + +- See [[https://www3.svjatoslav.eu/projects/sixth-data/apidocs/][JavaDoc]]. * Vision / goal :PROPERTIES: :ID: f6764282-a6f6-44e6-8716-b428074dd093 :END: -Provide versioned, clustered, flexible, distributed, multi-dimensional -data storage engine for the [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html][Sixth computation engine]]. - -+ Speaking of traditional relational database and object oriented - business applications: - + I hate object-relational impedance mismatch. - - + I don't like to convert data between persistent database and - runtime objects for every transaction. How about creating united - database/computation engine instead to: - + Eliminate constant moving and converting of data between 2 systems. - + Abstract away difference between RAM VS persistent storage. Let - the system decide at runtime which data to keep in what kind of - memory. - -** Inspiration -+ Relational databases: - + Transactional. - + Indexable / Quickly searchable. - -+ Git (version control system) - + Versionable - + Branchable / mergeable. 
- + Transparent cansistency, checksumming and deduplication. - + (Git as a database: - https://www.kenneth-truyers.net/2016/10/13/git-nosql-database/ ) - -+ Brain (appears to have more than 3D dimensional design. Food for - thought...) - + https://singularityhub.com/2017/06/21/is-there-a-multidimensional-mathematical-world-hidden-in-the-brains-computation/ - + From there comes following idea: Maybe every problem can be - translated to geometry (use any shapes and as many dimensions as - you need). Solution(s) to such problems would then appear as - relatively simple search/comparison/lookup results. As a bonus, - such geometrical *data storage* AND *computation* can be - naturally made in *parallel* and *distributed*. That's what - neurons in the brain appear to be doing ! :) . Learning means - building/updating the model (the hard part). Question answering - is making (relatively simple) lookups (geometrical queries) - against the model. - +Provide hackable, versioned, optimized, distributed, geometrical, +arbitrary dimensional ([[id:96116550-a6a1-4700-bef7-865d0deee7ea][hypercube based]]) data storage and computation +engine ([[id:d2375acc-af14-4f18-8ad0-7949501178c5][as inspired by the brain]]) for general purpose visual computing +environment called [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html][Sixth]]. + +Because [[http://www.paulgraham.com/rootsoflisp.html][Lisp is hackable self defined programmable programming +language]] it would be used to provide [[https://en.wikipedia.org/wiki/Imperative_programming][imperative programming]] support. +* Inspiration +:PROPERTIES: +:ID: 0fa6354b-18c9-4120-bbf5-c7239aebecab +:END: ++ see also: [[https://en.wikipedia.org/wiki/OLAP_cube][OLAP cube]]. 
+** Brain + :PROPERTIES: + :ID: d2375acc-af14-4f18-8ad0-7949501178c5 + :END: ++ The brain appears to be a natural geometrical/parallel data storage and + computational engine: + + https://www.quantamagazine.org/the-brain-maps-out-ideas-and-memories-like-spaces-20190114/ + + [[https://www.simonsfoundation.org/2021/04/07/geometrical-thinking-offers-a-window-into-computation/][Geometrical Thinking Offers a Window Into Computation]] + ++ Even more awesome is that the brain appears to operate, and to be wired, as + an arbitrary/variable dimensional structure: + https://singularityhub.com/2017/06/21/is-there-a-multidimensional-mathematical-world-hidden-in-the-brains-computation/ + ++ On top of this, the multidimensional space that the brain represents + has dynamic/variable resolution/density: + + https://www.quantamagazine.org/goals-and-rewards-redraw-the-brains-map-of-the-world-20190328 + ++ Such properties allow parallel [[id:171fe375-c737-41e6-b429-a414f6abc5d8][Geometrical computation]] and + fit the [[id:01aa65c1-3d44-44a8-9b90-58454bc6be80][CM-1 Connection Machine]] architecture beautifully (for an extra + hardware accelerated solution). + + +** CM-1 Connection Machine +:PROPERTIES: +:ID: 01aa65c1-3d44-44a8-9b90-58454bc6be80 +:END: +https://en.wikipedia.org/wiki/Connection_Machine + +A massively parallel machine: thousands of CPUs are connected via the +machine's internal 12-dimensional hypercube network, which makes it possible to +efficiently simulate an arbitrary dimensional hypercube and network +topology between computational units. So when we are +solving/simulating, for example, a 5 dimensional problem, we can arrange +computational units into a virtual 5D network. See: +http://www.mission-base.com/tamiko/theory/cm_txts/di-ch2.html + +We can pre-distribute data across computation units and perform +parallel [[id:171fe375-c737-41e6-b429-a414f6abc5d8][geometrical computation]].
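The hypercube network idea in the CM-1 section can be illustrated with a small sketch. It is not the CM-1's actual routing logic. It only relies on the standard property of a hypercube graph: if nodes are numbered with d-bit addresses, two nodes are directly linked exactly when their addresses differ in one bit. The function names (`neighbors`, `hop_distance`, `subcube`) are hypothetical, and treating a "virtual 5D network" as a 5-dimensional subcube of the 12-dimensional one is an assumption chosen for simplicity.

```python
def neighbors(node: int, dimensions: int) -> list:
    """Addresses directly linked to `node`: flip each of the `dimensions` bits once."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hop_distance(a: int, b: int) -> int:
    """Minimum number of links between two nodes: the count of differing address bits."""
    return bin(a ^ b).count("1")


def subcube(dimensions: int, sub_dimensions: int, base: int = 0) -> list:
    """Nodes of a lower-dimensional subcube inside a `dimensions`-dimensional hypercube.

    The high bits are taken from `base` and kept fixed, the low
    `sub_dimensions` bits run through every combination.
    """
    if not 0 <= sub_dimensions <= dimensions:
        raise ValueError("sub_dimensions must be between 0 and dimensions")
    high = (base >> sub_dimensions) << sub_dimensions
    return [high | low for low in range(1 << sub_dimensions)]


# A 12-dimensional hypercube has 2**12 = 4096 nodes, each with 12 direct links.
links_of_node_zero = neighbors(0, 12)

# A virtual 5D network: 32 nodes that only need links along their 5 low bits.
virtual_5d = subcube(12, 5, base=0b101000000000)
```

Every node in `virtual_5d` finds all five of its 5D neighbours inside the same set, so a 5 dimensional problem could be laid out there without using links that leave the subcube.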
+ +* Reasons for hypercube as a so-called first-class citizen +:PROPERTIES: +:ID: 96116550-a6a1-4700-bef7-865d0deee7ea +:END: ++ A hypercube is a quite general purpose data structure that naturally + encapsulates a wide variety of data and problems. + ++ Nicely captures the apparent [[id:d2375acc-af14-4f18-8ad0-7949501178c5][properties of the brain]]. + ++ Naturally supports distributed and parallel [[id:171fe375-c737-41e6-b429-a414f6abc5d8][geometrical data storage + and computation.]] + ++ Dedicated hardware like [[id:01aa65c1-3d44-44a8-9b90-58454bc6be80][CM-1]] can be built around the hypercube concept, + which results in data, computation process and hardware all + beautifully fitting together while complementing each other's + strengths. + ++ Hypercube stored data (and computation process) has geometry by its + nature and should fit nicely with the "3D first" user interface ideology + of the parent [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html][Sixth]] project. +* Geometrical computation idea +:PROPERTIES: +:ID: 171fe375-c737-41e6-b429-a414f6abc5d8 +:END: +** Distributed computation and data storage + :PROPERTIES: + :ID: 5d287158-53ea-44a2-a754-dd862366066a + :END: +Lots of problems can be translated to geometry (use any shapes and as +many dimensions as you need). Solution(s) to such problems could +then be found via geometrical search/comparison/lookup. As a +bonus, such geometrical *data storage* AND *computation* can +naturally be made *parallel* and *distributed*. + +Learning means building/updating/re-balancing the model (the hard +part). Question answering is making (relatively simple) lookups +(geometrical queries) against the model. +** Mapping hypercube to object-oriented model and relational database + :PROPERTIES: + :ID: a117c11e-97c1-4822-88b2-9fc10f96caec + :END: +Object oriented programming is inspired by the way the human mind +operates. It allows the programmer to express ideas to the computer in more +human-like terms.
+ +It is actually also possible to map an object model and a relational +database to a geometrical hyperspace: + ++ An object or database table row is a point in the hypercube's arbitrary + dimensional space. Each object member variable or database table + column can be mapped to its own dimension in the hypercube. That is: if + a class declares 4 variables for an object, then the corresponding object + can be stored as a single point inside a 4 dimensional + hypercube. Variable values translate to point coordinates in that + hypercube. That is: numbers and strings can be translated to a linear + value that can be used as a coordinate along a particular dimension. + ++ Each object class or database table declares its own hypercube that + contains instances (objects) of that class or rows of a table. + +** Mapping entity relations in hypercube + :PROPERTIES: + :ID: b6b15bd2-c78b-4c51-a343-72843a515c29 + :END: +Suppose we want to create a database of: ++ Books. ++ Authors. ++ Effort: Amount of time contributed by every author to every book + that he/she wrote. + +The information above can be represented as a 3D cube where the dimensions are: ++ X: Book ++ Y: Author ++ Z: Effort + +Points in that cube would nicely capture many-to-many relations +between authors and books. * Current status - More or less defined [[id:f6764282-a6f6-44e6-8716-b428074dd093][Vision / goal]]. +- Collected some [[id:0fa6354b-18c9-4120-bbf5-c7239aebecab][inspiring]] [[id:171fe375-c737-41e6-b429-a414f6abc5d8][ideas]]. + - Implemented very simple persistent key-value map. - Long term goal is to use it as a backing storage engine and - implement more advanced features on top of this. - -* TODO -+ check out GRAKN.AI: database in the form of a knowledge graph that - uses machine reasoning to simplify data processing challenges for AI - applications. - + https://grakn.ai/ - -+ check out Magma - + http://wiki.squeak.org/squeak/2665 + implement more advanced features on top of this via a layered + architecture.
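The mapping described under "Mapping hypercube to object-oriented model and relational database" and the Books / Authors / Effort cube can be sketched as follows. This is only an illustration under stated assumptions, not the project's implementation: the class names (`Contribution`, `Hypercube`), the helpers `string_to_coordinate` and `to_point`, the sample records and the particular string encoding (first 6 UTF-8 bytes read as a base-256 fraction) are all hypothetical. Lookups scan every point linearly here, where a real engine would presumably use a spatial index.

```python
from dataclasses import dataclass, fields


def string_to_coordinate(text: str, width: int = 6) -> float:
    """Translate a string to a linear value in [0, 1).

    Hypothetical encoding: the first `width` UTF-8 bytes are read as one
    big-endian integer, so byte-wise ordering of those bytes is preserved.
    """
    data = text.encode("utf-8")[:width].ljust(width, b"\0")
    return int.from_bytes(data, "big") / float(256 ** width)


def to_point(record) -> tuple:
    """One coordinate per declared member variable, in declaration order."""
    coordinates = []
    for field in fields(record):
        value = getattr(record, field.name)
        if isinstance(value, str):
            coordinates.append(string_to_coordinate(value))
        else:
            coordinates.append(float(value))
    return tuple(coordinates)


@dataclass(frozen=True)
class Contribution:
    book: str            # X dimension
    author: str          # Y dimension
    effort_hours: float  # Z dimension


class Hypercube:
    """Holds the points of one class (or table); each point keeps its record."""

    def __init__(self):
        self.points = {}

    def add(self, record) -> None:
        self.points[to_point(record)] = record

    def slice(self, axis: int, value) -> list:
        """Records whose coordinate along `axis` equals the coordinate of `value`."""
        if isinstance(value, str):
            target = string_to_coordinate(value)
        else:
            target = float(value)
        return [r for p, r in self.points.items() if p[axis] == target]


cube = Hypercube()
cube.add(Contribution("Book A", "Alice", 120.0))
cube.add(Contribution("Book A", "Bob", 30.0))
cube.add(Contribution("Book B", "Alice", 45.0))

# Many-to-many relation read off the cube as two different slices.
authors_of_book_a = sorted(r.author for r in cube.slice(0, "Book A"))
books_of_alice = sorted(r.book for r in cube.slice(1, "Alice"))
```

Fixing the X coordinate yields every author of one book, and fixing the Y coordinate yields every book of one author, so the same set of points answers both directions of the relation.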