Welcome to the Semantic DB Project!

The Semantic DB is an experimental knowledge representation language and engine in which we reduce everything to kets and operators. Why would we want to do this? Because, we claim, this is a natural notation for representing brain-like things. In particular, kets can be considered a representation of synapses, with the coefficient of a ket giving the activity level of that synapse. Operators, in turn, represent changes in the state of a system. We also claim this language/notation is more powerful at representing knowledge than the current standard notations for the semantic web. There is a lot left to do in this project! While we do have a working Python implementation (the reference implementation), work is currently underway on a C++ version. And of course, our documentation needs a lot of attention!

So what on Earth is a ket? It is simply a notation that associates a string with a float, where the string is essentially arbitrary but chosen to be human readable (an advantage over using random integers as labels). So, for example, slightly sleepy might be represented by 0.2|sleepy> and very hungry by 0.9|hungry>. Combined with what we call learn rules, this is already sufficient to represent some simple knowledge about Fred:

full-name |Fred> => |Fred Smith>
mother |Fred> => |Joan Smith>
father |Fred> => |Eric Smith>
age |Fred> => |52>
sibling |Fred> => |Emily> + |Robert> + |Mary>

Here we have the "literal operators" full-name, mother, father, age, sibling. Mathematically they can be considered as a type of "sparse matrix multiplication", noting that because kets can have arbitrary labels, regular matrix representations are infeasible for our literal operators (both because we don't ahead of time know the dimensions of those matrices, and because the required matrices would sometimes be very large). A good example of the advantage to a sparse representation is the movie-actor data set, extracted from IMDB, which maps actors to movies, and movies to actors. If that data was represented using matrices, they would be very large indeed! Literal operators are the simplest kind of operator in our project, though most operators have no mapping to matrices, they can still be considered a type of multiplication. Indeed, we can chain them together into what we call "operator sequences". For literal operators this is simply sparse matrix multiplication, for other operators this is a more general type of multiplication. So for example, we could ask the age of Fred's grandfather on his mother side by the operator sequence: age father mother |Fred>.
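To make the chaining concrete, here is a small worked example. The two extra learn rules below (for Joan Smith and her father) are hypothetical, not part of the knowledge defined above; they are included only to show how an operator sequence resolves, one operator at a time, from right to left:

father |Joan Smith> => |Frank Jones>
age |Frank Jones> => |76>

age father mother |Fred>
mother |Fred> gives |Joan Smith>
father |Joan Smith> gives |Frank Jones>
age |Frank Jones> gives |76>

So the full operator sequence returns |76>, the age of Fred's maternal grandfather.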

What can we represent with kets? Aside from learn rules, we have three primary data structures: kets, superpositions, and sequences. Kets are the building block, and are simply string/float pairs (if the float is not specified, it defaults to 1). Superpositions are the addition of several kets (yes, you can add and subtract them in a natural way), and represent the simultaneous "activation" of several kets, each with its own activation level. Sequences are time-ordered lists of superpositions, with each superposition separated by a dot. Some examples:

Kets: 3|apple>, 2|orange>, 0.7|hungry>, 0.4|sleepy>, |red>, |dog>, etc.
Superposition, say a shopping list: 5|apple> + 4|orange> + |milk> + |steak> + |chocolate>
Sequence, say the spelling of Fred: |F> . |r> . |e> . |d>
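
Superpositions add in a natural way: kets with the same label merge, and their coefficients add. For example, combining the shopping list above with a second, smaller list (written informally here, rather than as exact engine syntax):

(5|apple> + 4|orange> + |milk> + |steak> + |chocolate>) + (2|apple> + |bread>)
gives: 7|apple> + 4|orange> + |milk> + |steak> + |chocolate> + |bread>

And since learn rules work with sequences as well as superpositions, the spelling of Fred could be stored using a (hypothetical) spelling operator:

spelling |Fred> => |F> . |r> . |e> . |d>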

Much more to come!

Some links:
The Python implementation on GitHub
The corresponding readme
Encoders and prediction write-up
Usage information for our operators
The C++ version (i.e., Semantic DB 3.1)
Some sample sw files
A larger collection of sw files