X-Git-Url: http://www2.svjatoslav.eu/gitweb/?a=blobdiff_plain;f=doc%2Findex.org;h=caa29aa68ff459f2cf4c98dfb5808582001817b0;hb=b382105757c3d9a2bb528fa8e4218354feff2429;hp=42a9704a411a23bd79f0eeaad2c3e411ce8673fa;hpb=bb7b2daf4049f53eefbde9912daca7f31a3b6717;p=sixth-data.git

diff --git a/doc/index.org b/doc/index.org
index 42a9704..caa29aa 100644
--- a/doc/index.org
+++ b/doc/index.org
@@ -14,7 +14,7 @@
 
 - Homepage: http://svjatoslav.eu
 - Email: mailto://svjatoslav@svjatoslav.eu
-- [[http://svjatoslav.eu/programs.jsp][other applications hosted at svjatoslav.eu]]
+- [[http://www.svjatoslav.eu/programs.jsp][other applications hosted at svjatoslav.eu]]
 
 * (document settings) :noexport:
 
@@ -28,20 +28,25 @@
 #+HTML_HEAD:
 
 * Vision / goal
-Provide versioned, clustered, flexible, object-relational database
-functionality for the [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html][Sixth computation engine]].
-
-+ I hate object-relational impedance mismatch.
-
-+ I don't like to convert data between persistent database and runtime
-  objects for every transaction. How about creating united
+  :PROPERTIES:
+  :ID: f6764282-a6f6-44e6-8716-b428074dd093
+  :END:
+Provide versioned, clustered, flexible, distributed, multi-dimensional
+data storage engine for the [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html][Sixth computation engine]].
+
++ Speaking of traditional relational database and object oriented
+  business applications:
+  + I hate object-relational impedance mismatch.
+
+  + I don't like to convert data between persistent database and
+    runtime objects for every transaction. How about creating united
   database/computation engine instead to:
   + Eliminate constant moving and converting of data between 2 systems.
   + Abstract away difference between RAM VS persistent storage. Let
-    the system decide at runtime which data to keep in what kind of
-    memory.
+      the system decide at runtime which data to keep in what kind of
+      memory.
 
-** Inspiration
+* Inspiration
 + Relational databases:
   + Transactional.
   + Indexable / Quickly searchable.
@@ -51,40 +56,103 @@ functionality for the [[http://www2.svjatoslav.eu/gitbrowse/sixth/doc/index.html
   + Branchable / mergeable.
   + Transparent cansistency, checksumming and deduplication.
   + (Git as a database:
-    https://www.kenneth-truyers.net/2016/10/13/git-nosql-database/ )
-
-** Solution (the big idea)
-I see 4D data structure.
-
-[[file:data model.png]]
-
-Dimensions:
-+ List of all the objecs in the system (rows).
-+ List of all declared unique object fields (columns).
-+ List of all historical transactions/commits/versions (think of
-  sheets of paper).
-+ List of all concurrently running branches/threads. Branches can
-  appear and merge over time as needed.
-+ (Every cell is concrete field value within an object)
-
-Partitioning/clustering:
-+ Why not to partition/(load balance) as required across networked
-  physical computers along arbitrary dimension(s) declared above ?
-
-Indexing (for fast searching):
-+ Why not to index along arbitrary dimensions (as required) ?
-
-Further optimizations:
-+ In current early stage, trying to focus on minimum possible set of
-  features that would provide maximum possible set of power/benefit :)
-+ Once featres are locked. Anything can be optimised. Optimization for
-  size (deduplication) can be solved using Git style content
-  addressible storage mechanism.
+      https://www.kenneth-truyers.net/2016/10/13/git-nosql-database/ )
+
+** Brain
+   :PROPERTIES:
+   :ID: d2375acc-af14-4f18-8ad0-7949501178c5
+   :END:
++ Appears to have more than 3D dimensional design. Food for
+  thought...)
+  https://singularityhub.com/2017/06/21/is-there-a-multidimensional-mathematical-world-hidden-in-the-brains-computation/
+
++ It directly inspires following ideas
+  + [[id:5d287158-53ea-44a2-a754-dd862366066a][Distributed comutation and data storage]]
+  + [[id:a117c11e-97c1-4822-88b2-9fc10f96caec][Mapping of hyperspace to traditional object-oriented model]]
+  + [[id:b6b15bd2-c78b-4c51-a343-72843a515c29][Handling of relations]]
+
+* Ideas
+** Distributed computation and data storage
+   :PROPERTIES:
+   :ID: 5d287158-53ea-44a2-a754-dd862366066a
+   :END:
+Maybe every problem can be translated to geometry (use any shapes and
+as many dimensions as you need). Solution(s) to such problems would
+then appear as relatively simple search/comparison/lookup results. As
+a bonus, such geometrical *data storage* AND *computation* can be
+naturally made in *parallel* and *distributed*. That's what neurons in
+the brain appear to be doing ! :) . Learning means building/updating
+the model (the hard part). Question answering is making (relatively
+simple) lookups (geometrical queries) against the model.
+** Mapping of hyperspace to traditional object-oriented model
+   :PROPERTIES:
+   :ID: a117c11e-97c1-4822-88b2-9fc10f96caec
+   :END:
+Object oriented programming is inspired by the way human mind
+operates. It allows programmer to express ideas to computer in a more
+human-like terms.
+
+It is possible to map object model to geometrical hyperspace:
+
++ Object is a point in space (universe). Each object member variable
+  translates to its own dimension. That is: if class declares 4
+  variables for an object, then corresponding object can be stored as
+  a single point inside 4 dimensional space. Variable values translate
+  to point coordinates in space. That is: Integer, floating point
+  number and even boolean and string can be translated to linear value
+  that can be used as a coordinate along particular dimension.
+
++ Each class declares its own space (universe). All class instances
+  (objects) are points inside that particular universe. References
+  between objects of different types are hyperlinks (portals) between
+  different universes.
+** Handling of relations
+   :PROPERTIES:
+   :ID: b6b15bd2-c78b-4c51-a343-72843a515c29
+   :END:
+Consider we want to create database of books and authors. Book can
+have multiple authors, and single person can be author for multiple
+books. It is possible to store how many hours of work each author has
+contributed to every book, using hyperspace as follows:
+
++ Every dimension corresponds to one particular book author. (10
+  authors in the database, would require 10 dimensional space)
+  + Point in space corresponds to one particular book.
+  + Point location along particular (author) dimension corresponds
+    to amount of work contributed by particular author for given
+    book.
+
+Alternatively:
+
++ Every dimension corresponds to one particular book.
+  + Point in space corresponds to one particular author in the entire
+    database.
+  + Point location along particular (book) dimension corresponds to
+    amount of work contributed for book by given author (point).
+
+
 
 * Current status
+- More or less defined [[id:f6764282-a6f6-44e6-8716-b428074dd093][Vision / goal]].
+
+- Collected some [[id:d2375acc-af14-4f18-8ad0-7949501178c5][ideas]].
+
 - Implemented very simple persistent key-value map.
+  - Long term goal is to use it as a backing storage engine and
+    implement more advanced features on top of this.
+
+* See also
+Interesting or competing projects with good ideas:
 
-Long term goal is to implement more advanced features on top of this.
++ GRAKN.AI: database in the form of a knowledge graph that uses
+  machine reasoning to simplify data processing challenges for AI
+  applications.
+  + https://grakn.ai/
 
-* TODO
-** check out Magma
++ Gemstone/S based on Smalltalk.
+  http://esug.org/data/ESUG2015/3%20wednesday/1100-1130%20SQL%20Queries%20on%20Smalltalk%20Objects/SQL%20Queries%20in%20Smalltalk%20(James%20Foster).pdf
+
++ Magma distributed database in Smalltalk.
+  http://wiki.squeak.org/squeak/2665
+
++ ZetaVM
+  + https://github.com/zetavm/zetavm
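
Note on the "Handling of relations" section added by this change: the two layouts it describes (books as points with one dimension per author, or authors as points with one dimension per book) can be read as transposes of each other. The following is only a minimal sketch of that reading, not code from sixth-data. The `Hyperspace` class, its `put`, `coordinate` and `transpose` methods, and the sample book and author names are all hypothetical, and coordinates are stored sparsely on the assumption that a missing coordinate means zero hours of work.

```python
from dataclasses import dataclass, field


@dataclass
class Hyperspace:
    """Hypothetical sparse space: each point keeps only its non-zero coordinates."""

    points: dict[str, dict[str, float]] = field(default_factory=dict)

    def put(self, point: str, dimension: str, value: float) -> None:
        # Store the coordinate of `point` along `dimension`.
        self.points.setdefault(point, {})[dimension] = value

    def coordinate(self, point: str, dimension: str) -> float:
        # Assumption: a missing coordinate means "no contribution", i.e. 0.
        return self.points.get(point, {}).get(dimension, 0.0)

    def transpose(self) -> "Hyperspace":
        # Swap roles: every dimension becomes a point and every point a dimension.
        flipped = Hyperspace()
        for point, coords in self.points.items():
            for dimension, value in coords.items():
                flipped.put(dimension, point, value)
        return flipped


# First layout: books are points, authors are dimensions,
# coordinates are hours of work (sample data is made up).
books = Hyperspace()
books.put("Book A", "alice", 120.0)
books.put("Book A", "bob", 30.0)
books.put("Book B", "bob", 200.0)

# Alternative layout: authors are points, books are dimensions.
authors = books.transpose()
```

Under these assumptions, `books.coordinate("Book A", "bob")` and `authors.coordinate("bob", "Book A")` read the same stored value, so either layout can answer both "who worked on this book" and "what did this author work on".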