The semantic db project.
Some proof of concept examples using the BKO scheme to represent semantic knowledge:
|content> => |sw-url: http://semantic-db.org/frog.sw>
+ |sw-url: http://semantic-db.org/george.sw>
+ |sw-url: http://semantic-db.org/early-us-presidents.sw>
+ |sw-url: http://semantic-db.org/top-5-longest-rivers.sw>
+ |sw-url: http://semantic-db.org/UK-to-US-spelling.sw>
+ |sw-url: http://semantic-db.org/currency-exchange.sw>
+ |sw-url: http://semantic-db.org/temperature-conversion.sw>
+ |sw-url: http://semantic-db.org/train-of-thought.swc>
+ |sw-url: http://semantic-db.org/basic-numbers.sw>
+ |sw-url: http://semantic-db.org/small-primes.sw>
+ |sw-url: http://semantic-db.org/common-cliche.sw>
+ |sw-url: http://semantic-db.org/share-example.sw>
+ |sw-url: http://semantic-db.org/common-internet-acronyms.sw>
+ |sw-url: http://semantic-db.org/basic-alphabet.sw>
+ |sw-url: http://semantic-db.org/the-doors.sw>
+ |sw-url: http://semantic-db.org/xml-to-bko-breakfast-menu-example.sw>
+ |sw-url: http://semantic-db.org/loebner-bots.sw>
"What was the party of the third president?"
|answer> => party |x> for |x> in |early US Presidents: _list> such that <number: 3|number|x> == 1
-- using inverse:
party inverse-president-number |number: 3>
-- inverse is the way to go, sure, but here is the other way too:
(using more recent notation, and dropping the "|x> for |x>" bit which we don't need because of the linearity of operators)
|answer> => party (|x> in "" |early US Presidents: _list> such that <number: 3|number|x> == 1)
"Who was the president in 1825?"
|answer> => |x> in |early US Presidents: _list> such that <year: 1825|era|x> > 0.5
-- using inverse:
inverse-president-era |year: 1825>
"Which long river does Burundi drain into?"
|answer> => |x> in |river: long: _list> such that <country: Burundi|drainage-basin-countries|x> > 0.8
"What was the dissolution date of the party of the sixth president?"
|answer> => dissolved: party |x> for |x> in |early US Presidents: _list> such that <number: 6|number|x> == 1
-- using inverse:
dissolved party inverse-president-number |number: 6>
-- again, the long version is:
|answer> => dissolved party (|x> in "" |early US Presidents: _list> such that <number: 6|number|x> == 1)
-- another version, which is closer to the English version, is:
|answer> => dissolution-date party sixth |president>
"Which early presidents were members of the Democratic-Replublican party?"
|answer> => |x> in |early US Presidents: _list> such that <political party: Democratic-Republican|party|x> > 0.9
-- using inverse:
inverse-party |party: Democratic-Republican>
Alternatively, if we have a full list of members of the Democratic-Republican party, we could instead do:
|answer> => |_self><early US Presidents: _list||political party: Democratic-Republican: _list>
Or, if we make use of the set intersection function (which is a useful thing indeed):
|answer> => intersection(|early US Presidents: _list>, |political party: Democratic-Republican: _list>)
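As an aside, here is a tiny Python sketch of what a set-intersection function over superpositions might look like, assuming intersection takes the coefficient-wise minimum. The president and party lists here are made-up stand-ins, not real data:

def intersection(*sps):
    # toy superpositions as dicts: ket label -> coefficient
    common = set.intersection(*(set(sp) for sp in sps))
    return {label: min(sp[label] for sp in sps) for label in common}

# hypothetical stand-in data:
early_presidents = {"Washington": 1, "Adams": 1, "Jefferson": 1, "Madison": 1}
dem_rep_members = {"Jefferson": 1, "Madison": 1, "Monroe": 1}

print(intersection(early_presidents, dem_rep_members))
# {'Jefferson': 1, 'Madison': 1}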
"How many siblings does George have?"
|answer> => count: siblings |person: George>
If we have data on George, Ed and Travis's friends we can do:
"Which friends do George, Ed and Travis have in common?"
|answer> => intersection(friends|person: George>, friends|person: Ed>, friends|person: Travis>)
-- which BTW is a common pattern:
|answer> => intersection(op|U>, op|V>, op|X>, op|Y>)
"Which actors do movie name-a and movie name-b have in common?"
|answer> => intersection(actors|movie: name-a>,actors|movie: name-b>)
"What can I get for breakfast that is under $6?"
|answer> => |x> in |menu: breakfast> such that <*|value: price|x> < 6
"I want something with bacon in it!"
|answer> => |x> in |menu: breakfast> such that <food: bacon|read:description|x> > 0.9
"I want something with fruit in it!"
|answer> => |x> in |menu: breakfast> such that <fruit: *|read:description|x> > 0.9
"What can I get for breakfast that is under 700 calories?"
|answer> => |x> in |menu: breakfast> such that <*|value: calories|x> < 700
"I want waffles that are under $8."
|answer> => |x> in |menu: breakfast> such that <food: waffles|read:name|x> > 0.9 and <*|value:price|x> < 8
"Who was US president during WWII?"
|answer> => |x> in "" |US Presidents: _list> such that float-count intersection(era|World War II>,era|x>) > 0
$ wc -l *.txt
    316 Alice in Wonderland.txt
    828 Frankenstein.txt
   1889 I Robot.txt
   7910 Moby Dick.txt
  38287 Shakespeare.txt
   1619 Sherlock Holmes.txt
   1815 Tom Sawyer.txt
  52664 total

I think there is a bug in not handling dashes neatly enough. Also, it would be nice to look at unique word pairs, and maybe triples too.
$ wc -l *
    271 Alice in Wonderland.1.txt
  27857 Alice in Wonderland.10.txt
   6793 Alice in Wonderland.2.txt
  18837 Alice in Wonderland.3.txt
  25345 Alice in Wonderland.4.txt
  27202 Alice in Wonderland.5.txt
  27651 Alice in Wonderland.6.txt
  27763 Alice in Wonderland.7.txt
  27810 Alice in Wonderland.8.txt
  27836 Alice in Wonderland.9.txt
    697 Frankenstein.1.txt
  75341 Frankenstein.10.txt
  20938 Frankenstein.2.txt
  56251 Frankenstein.3.txt
  71791 Frankenstein.4.txt
  74860 Frankenstein.5.txt
  75252 Frankenstein.6.txt
  75308 Frankenstein.7.txt
  75326 Frankenstein.8.txt
  75334 Frankenstein.9.txt
   1695 I Robot.1.txt
  70307 I Robot.10.txt
  25501 I Robot.2.txt
  55359 I Robot.3.txt
  67637 I Robot.4.txt
  69818 I Robot.5.txt
  70170 I Robot.6.txt
  70257 I Robot.7.txt
  70284 I Robot.8.txt
  70297 I Robot.9.txt
   5461 Moby Dick.1.txt
 216522 Moby Dick.10.txt
  79483 Moby Dick.2.txt
 171928 Moby Dick.3.txt
 208439 Moby Dick.4.txt
 215056 Moby Dick.5.txt
 216193 Moby Dick.6.txt
 216416 Moby Dick.7.txt
 216478 Moby Dick.8.txt
 216506 Moby Dick.9.txt
  35723 Shakespeare.1.txt
 922531 Shakespeare.10.txt
 347982 Shakespeare.2.txt
 748365 Shakespeare.3.txt
 892318 Shakespeare.4.txt
 915498 Shakespeare.5.txt
 919727 Shakespeare.6.txt
 921198 Shakespeare.7.txt
 921963 Shakespeare.8.txt
 922355 Shakespeare.9.txt
   1234 Sherlock Holmes.1.txt
 105912 Sherlock Holmes.10.txt
  26212 Sherlock Holmes.2.txt
  72755 Sherlock Holmes.3.txt
  98566 Sherlock Holmes.4.txt
 104534 Sherlock Holmes.5.txt
 105584 Sherlock Holmes.6.txt
 105814 Sherlock Holmes.7.txt
 105879 Sherlock Holmes.8.txt
 105901 Sherlock Holmes.9.txt
   1011 Tom Sawyer.1.txt
  71496 Tom Sawyer.10.txt
  21822 Tom Sawyer.2.txt
  54309 Tom Sawyer.3.txt
  68334 Tom Sawyer.4.txt
  70931 Tom Sawyer.5.txt
  71354 Tom Sawyer.6.txt
  71444 Tom Sawyer.7.txt
  71477 Tom Sawyer.8.txt
  71489 Tom Sawyer.9.txt
12105988 total
1) a quite literal network:

O |a1> => |a2>
O |a2> => |a3>
O |a3> => |a4>
O |a4> => |a5>
O |a5> => |a6>
O |a6> => |a7>
O |a7> => |a8>
O |a8> => |a9>
O |a9> => |a10>
O |a10> => |a1> + |b1>
O |b1> => |b2>
O |b2> => |b3>
O |b3> => |b4>
O |b4> => |b5>
O |b5> => |b6>
O |b6> => |b7>
O |b7> => |b1>

Here is a diagram of that network.
2) the methanol molecule:

molecular-pieces |molecule: methanol> => |methanol: 1> + |methanol: 2> + |methanol: 3> + |methanol: 4> + |methanol: 5> + |methanol: 6>

atom-type |methanol: 1> => |atom: H>
bonds-to |methanol: 1> => |methanol: 4>
atom-type |methanol: 2> => |atom: H>
bonds-to |methanol: 2> => |methanol: 4>
atom-type |methanol: 3> => |atom: H>
bonds-to |methanol: 3> => |methanol: 4>
atom-type |methanol: 4> => |atom: C>
bonds-to |methanol: 4> => |methanol: 1> + |methanol: 2> + |methanol: 3> + |methanol: 5>
atom-type |methanol: 5> => |atom: O>
bonds-to |methanol: 5> => |methanol: 4> + |methanol: 6>
atom-type |methanol: 6> => |atom: H>
bonds-to |methanol: 6> => |methanol: 5>

Here is a diagram of the methanol "network".
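For comparison, here is roughly what those methanol rules look like as plain Python dicts. A minimal sketch only, nothing to do with the real implementation:

# atom-type and bonds-to, keyed on the methanol piece number
atom_type = {1: "H", 2: "H", 3: "H", 4: "C", 5: "O", 6: "H"}
bonds_to = {1: [4], 2: [4], 3: [4], 4: [1, 2, 3, 5], 5: [4, 6], 6: [5]}

# eg, what does the carbon atom (piece 4) bond to?
print([atom_type[k] for k in bonds_to[4]])    # ['H', 'H', 'H', 'O']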
c = new_context("a context") -- create a new context c.learn(...) -- learn something in that context print(c.dump_universe() -- print everything we know about the current contextwe do:
C = context_list("global") C.set("first context") -- switch to context if it already exists in the context_list, else create it. C.learn(...) -- learn something in that context C.set("another context") -- switch to context if it already exists in the context_list, else create it. C.learn(...) -- learn something in that context C.set("first context") -- switch back to previous context. print(C.dump_universe()) -- dump the current context print(C.dump_multiverse()) -- dump all known context's in the context_list.The other thing is I finally wrote the code to parse a "molecule of knowledge".
fish|x> + |a> + |b>

works, but:

fish|x> + (|a> + |b>)

is broken.
$ ./check_the_parser.py | grep "#"

To check just for pass/fail, use:

$ ./check_the_parser.py | grep "##"

Decided to write a shell script wrapper for this.
# a|x> + b|y> => a|x> + b|y>
def algebra_add(one,two):
    return one + two

# a|x> * b|y> => a*b |x*y>
def algebra_mult(one,two,Abelian=True):
    one = superposition() + one    # hack so one and two are definitely sp, not ket
    two = superposition() + two
    result = superposition()
    for x in one.data:
        for y in two.data:
            labels = x.label.split('*') + y.label.split('*')
            if Abelian:
                labels.sort()
            label = "*".join(labels)
            result += ket(label,x.value * y.value)
    return result

# (a|x> + b|y> + c|z>)^|n>
# eg: (|a> + |b> + |c>)^|2> = |a*a> + 2.000|a*b> + 2.000|a*c> + |b*b> + 2.000|b*c> + |c*c>
def algebra_power(one,two):
    one = superposition() + one
    two_label = two.ket().label
    null, power = extract_category_value(two_label)
    try:
        n = int(power)
    except:
        return ket("",0)
    if n <= 0:
        return ket("1")
    result = one
    for k in range(n - 1):
        result = algebra_mult(result,one)
    return result

# implement basic algebra:
def algebra(one,operator,two):
    # assume Abelian for now.
    op_label = operator if type(operator) == str else operator.the_label()
    null, op = extract_category_value(op_label)
    if op not in ['+','*','^']:
        return ket("",0)
    if op == '+':
        return algebra_add(one,two)
    elif op == '*':
        return algebra_mult(one,two)
    elif op == '^':
        return algebra_power(one,two)
    else:
        return ket("",0)

# simple complex number mult:
def complex_algebra_mult(one,two):
    one = superposition() + one    # hack so one and two are definitely sp, not ket
    two = superposition() + two
    result = superposition()
    for x in one.data:
        for y in two.data:
            if x.label == 'real' and y.label == 'real':
                result += ket("real",x.value * y.value)
            if x.label == 'real' and y.label == 'imag':
                result += ket("imag",x.value * y.value)
            if x.label == 'imag' and y.label == 'real':
                result += ket("imag",x.value * y.value)
            if x.label == 'imag' and y.label == 'imag':
                result += ket("real",-1 * x.value * y.value)
    return result

Works pretty well, considering how little work it took. But will almost certainly tweak/improve it in the future.
supported-ops |x> => |op: op1> + |op: op2> + |op: op3>
op1 |x> => |a> + |b> + |c>
op2 |x> => |d> + |e>
op3 |x> => |f> + |g> + |h> + |i>
# maps ket -> ket
# 3|x> => 3|x>
# |number: 7.2> => 7.2| >    # NB: the space in the ket label.
# 2|number: 3> => 6| >       # We can't use just |> because it is dropped all over the place!
# 8|number: text> => 0| >
# |3.7> => 3.7| >
# 3|5> => 15| >
# so the maths eqn: 3a + 7
# in my notation is 3|a> + 7| >
def category_number_to_number(one):    # find better name!
    one = one.ket()
    cat, value = extract_category_value(one.label)
    try:
        n = float(value)
    except:
        if cat == 'number':    # not 100% sure I want to keep these two lines
            return ket(" ",0)
        return one
    return ket(" ",one.value * n)

# a|x> * b|y> => a*b |x*y>
def algebra_mult(one,two,Abelian=True):
    one = superposition() + one    # hack so one and two are definitely sp, not ket
    two = superposition() + two
    result = superposition()
    for x in one.data:
        x = category_number_to_number(x)
        for y in two.data:
            y = category_number_to_number(y)
            print("x*y",x,"*",y)
            labels = [ L for L in x.label.split('*') + y.label.split('*') if L.strip() != '' ]
            if Abelian:
                labels.sort()
            label = "*".join(labels)
            if label == '':    # we can't have ket("",value), since it will be dropped.
                label = " "
            result += ket(label,x.value * y.value)
    return result

# (a|x> + b|y>)^|n>
def algebra_power(one,two,Abelian=True):
    one = superposition() + one
    two = category_number_to_number(two)
    try:
        n = int(two.value)
    except:
        return ket(" ",0)
    if n <= 0:
        return ket(" ",1)
    result = one
    for k in range(n - 1):
        result = algebra_mult(result,one,Abelian)
    return result
op1 |x> => |a> + 2.000|b> + 5.000|c> + 2.000|d>
op2 |x> => 0.200|a> + 0.300|b> + 0.500|e>
op3 |x> => 0.100|b> + 0.100|a> + 0.100|c> + 0.200|d>
op1 |y> => 3.000|m> + |n> + 7.000|o> + 2.000|p> + |q>

And note that we have:
count-sum op1 |x> = |number: 10.0>
count-sum op2 |x> = |number: 1.0>    # op2 has currency conservation.
count-sum op3 |x> = |number: 0.5>
count-sum op1 |y> = |number: 14.0>

Well, if we create inverses, we now observe:
inverse-op1 op1 |x> = 10.000|x>
inverse-op2 op2 |x> = |x>
inverse-op3 op3 |x> = 0.500|x>
inverse-op1 op1 |y> = 14.000|y>

So if the rule has currency conservation, as in the op2 case (where count-sum op2|x> == |number: 1>), then the inverse is an exact inverse (surprised I haven't tested this till now).
inverse-op1 op1 inverse-op1 op1 |x> = 100.000|x>
inverse-op2 op2 inverse-op2 op2 |x> = |x>
inverse-op3 op3 inverse-op3 op3 |x> = 0.250|x>
inverse-op1 op1 inverse-op1 op1 |y> = 196.000|y>

or the mixed case:
inverse-op3 op3 inverse-op1 op1 inverse-op1 op1 |x> = 50.000|x>
inverse-op3 op3 inverse-op1 op1 inverse-op1 op1 inverse-op3 op3 |x> = 25.000|x>

Next, provided the operators are disjoint, eg:
intersection(op1|x>,op1|y>) = |>

then we have this (ie, |x> and |y> are independent and do not "interact"):
inverse-op1 op1 (|x> + |y>) = 10.000|x> + 14.000|y>
inverse-op1 op1 (3|x> + 2|y>) = 30.000|x> + 28.000|y>

I guess this is just showing that these things are well behaved if you are careful.
inverse-op1 op1 (3|x> + 2|y>) = 30.000|x> + 28.000|y> == inverse-op1 op1 3|x> + inverse-op1 op1 2|y>

If |x> and |y> interacted, we would not have this equality.
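Here is a quick Python sketch of why this works: "create inverse" learns inverse-op1 |a_i> => |x> for every ket in op1 |x>, so applying op1 then inverse-op1 scales |x> by count-sum op1 |x> (assuming the rules are disjoint). apply() here is just a toy linear-operator application over dicts, not the engine code:

# operators as dict-of-dicts: source ket -> {result ket: coeff}
op1 = {"x": {"a": 1, "b": 2, "c": 5, "d": 2},
       "y": {"m": 3, "n": 1, "o": 7, "p": 2, "q": 1}}

# "create inverse": each ket on the right maps back to its source ket
inverse_op1 = {a: {src: 1} for src, sp in op1.items() for a in sp}

def apply(op, sp):
    out = {}
    for k, c in sp.items():
        for k2, c2 in op.get(k, {}).items():
            out[k2] = out.get(k2, 0) + c * c2
    return out

print(apply(inverse_op1, apply(op1, {"x": 3, "y": 2})))
# {'x': 30, 'y': 28} -- matching inverse-op1 op1 (3|x> + 2|y>) above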
"no one" => <person: *|op|x> == 0 "no thing" => <object: *|op|x> == 0 "no place" => <location: *|op|x> == 0 "no where" => <location: *|op|x> == 0 "no how" => <... > == 0 "no body" => <person: *|op|x> == 0 "some one" => <person: *|op|x> > 0 "some thing" => <object: *|op|x> > 0 "some place" => <location: *|op|x> > 0 "some where" => <location: *|op|x> > 0 "some how" => <... > > 0 "some body" => <person: *|op|x> > 0 Not, sure but maybe this: "any one" => pick-elt |_self><person: *|op|x> "any thing" => pick-elt |_self><object: *|op|x> "any place" => pick-elt |_self><location: *|op|x> "any where" => pick-elt |_self><location: *|op|x> "any how" => pick-elt |_self><... > "any body" => pick-elt |_self><person: *|op|x> "every one" => |_self><person: *|op|x> "every thing" => |_self><object: *|op|x> "every place" => |_self><location: *|op|x> "every where" => |_self><location: *|op|x> "every how" => |_self><... > "every body" => |_self><person: *|op|x>
sa: bah |z> => |it worked!>
sa: foo |A> #=> shout bah |z>
sa: bah |A> #=> shout read |text: a couple of words>
sa: dump
----------------------------------------
|context> => |context: sw console>

supported-ops |z> => |op: bah>
bah |z> => |it worked!>

supported-ops |A> => |op: foo> + |op: bah>
foo |A> #=> shout bah |z>
bah |A> #=> shout read |text: a couple of words>
----------------------------------------
sa: foo |A>
IT WORKED!
|IT WORKED!>

sa: bah |A>
WORD: A
WORD: COUPLE
WORD: OF
WORD: WORDS
|WORD: A> + |WORD: COUPLE> + |WORD: OF> + |WORD: WORDS>

I guess a simple description is that you store rules that don't get processed until they get activated. So even running dump does not activate the rules. Only when invoked inside x.apply_op() are they activated.
bah |x> => |it worked!>
o |a> => |b>
o |b> => |c>
o |c> #=> |d> + shout bah |x>
o |d> => |e>
o |e> => |f>

Load that into the console, then observe:
sa: o |a>
|b>

sa: o^2 |a>
|c>

sa: o^3 |a>
IT WORKED!
|d> + |IT WORKED!>

sa: o^4 |a>
IT WORKED!
|e>

sa: o^5 |a>
IT WORKED!
|f>

Here I am using "shout bah |x>" as an example side-effect. But really there are no limits on what it could be.
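If it helps, here is a toy Python picture of the => vs #=> distinction: a #=> rule is stored unevaluated, like a thunk, and only fires when it is applied. None of this is the actual engine code:

rules = {}
rules[("bah", "x")] = "it worked!"    # bah |x> => |it worked!>  -- a plain stored value
rules[("o", "c")] = lambda: (print("IT WORKED!"), "d")[1]    # o |c> #=> ...  -- a stored thunk

def apply_op(op, label):
    rule = rules.get((op, label))
    return rule() if callable(rule) else rule    # thunks only fire here

print(apply_op("bah", "x"))    # no side effect, just 'it worked!'
print(apply_op("o", "c"))      # prints IT WORKED!, then returns 'd'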
"Y tastes exactly like X" => taste |Y> => taste |X> "Y smells a lot like X" => smell |Y> => 0.9 smell |X> "Y sounds a little like X" => sound |Y> => 0.2 sound |X> "Y feels a tiny bit like X" => feel |Y> => 0.05 feel |X> "Y looks not at all like X" => looks |Y> => NOT looks |X> -- this one is provisional.Note that they all correspond to the senses. I am not currently sure if there are similar for non sensory input.
$ ./the_semantic_db_console.py
Welcome!

sa: smell |X> => |roses>
sa: -- Y smells a lot like X
sa: smell |Y> => 0.85 smell |X>
sa: dump
----------------------------------------
|context> => |context: sw console>

supported-ops |X> => |op: smell>
smell |X> => |roses>

supported-ops |Y> => |op: smell>
smell |Y> => 0.850|roses>
----------------------------------------
sa: similar[smell] |Y>
0.850|X>

sa: smell |Z> => 0.4|roses>
sa: dump
----------------------------------------
|context> => |context: sw console>

supported-ops |X> => |op: smell>
smell |X> => |roses>

supported-ops |Y> => |op: smell>
smell |Y> => 0.850|roses>

supported-ops |Z> => |op: smell>
smell |Z> => 0.400|roses>
----------------------------------------
sa: similar[smell] |Y>
0.850|X> + 0.471|Z>

sa: similar[smell] |X>
0.850|Y> + 0.400|Z>

sa: similar[smell] |Z>
0.471|Y> + 0.400|X>
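A rough Python sketch of what similar[op] is doing under the hood, assuming the similarity metric is simm(f,g) = sum_k min(f_k, g_k) / max(count-sum f, count-sum g), which reproduces the 0.850 and 0.471 coefficients above:

def simm(f, g):
    # f, g are superpositions as dicts: ket label -> coefficient
    overlap = sum(min(f[k], g[k]) for k in f.keys() & g.keys())
    return overlap / max(sum(f.values()), sum(g.values()))

smell = {"X": {"roses": 1.0}, "Y": {"roses": 0.85}, "Z": {"roses": 0.4}}

def similar(op, label):
    # compare against every other ket that supports this operator
    return {other: simm(op[label], op[other]) for other in op if other != label}

print(similar(smell, "Y"))    # {'X': 0.85, 'Z': 0.4705...}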
R = <b|op5 op4 op3 op2 op1|a>

In terms of the brain, this is a sum over pathways through brain-space.
We can say, given |a>, |b>, |c>, |d> and:

R1 = <b|op2 op1|a>
R2 = <d|op2 op1|c>
R1' = <b|op2 op1(|a> + |c>)
R2' = <d|op2 op1(|a> + |c>)

If R1 == R1' and R2 == R2' then |a> and |c> do not interact.
If R1 != R1' or R2 != R2' then |a> and |c> do interact.

Also, a quick comment:
If <b|some-op-sequence|a> > 0, then we can say there exists a brain-space pathway between |a> and |b>. And a math proof/chain of logic bears some similarity with a brain-space pathway. Details later!

16/5/2014 update: maybe it goes something like this:
sa: first-op (|truth-a> + |truth-b> + |truth-c>)
|truth-d> + |truth-e>

sa: second-op first-op (|truth-a> + |truth-b> + |truth-c>)
|truth-f> + |truth-g> + |truth-h> + |truth-i>

sa: third-op second-op first-op (|truth-a> + |truth-b> + |truth-c>)
|truth-j>

sa: fourth-op third-op second-op first-op (|truth-a> + |truth-b> + |truth-c>)
|truth-k> + |truth-l>

sa: fifth-op fourth-op third-op second-op first-op (|truth-a> + |truth-b> + |truth-c>)
|truth-m> + |desired-result> + |truth-n> + |truth-o> + |truth-p>

ie: <desired-result|fifth-op fourth-op third-op second-op first-op (|truth-a> + |truth-b> + |truth-c>) == 1

Of course, in general, finding the right operators to apply at each level is non-trivial.
$ ./the_semantic_db_console.py
Welcome!

sa: context friends
sa: friends |Fred> => |Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie>
sa: friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Robert> + |Frank> + |Julie>
sa: dump
----------------------------------------
|context> => |context: friends>

supported-ops |Fred> => |op: friends>
friends |Fred> => |Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie>

supported-ops |Sam> => |op: friends>
friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Robert> + |Frank> + |Julie>
----------------------------------------
sa: -- what friends does Fred have?
sa: friends |Fred>
|Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie>

sa: -- what friends does Sam have?
sa: friends |Sam>
|Charlie> + |George> + |Emma> + |Jack> + |Robert> + |Frank> + |Julie>

sa: -- how many friends does Fred have?
sa: count friends |Fred>
|number: 8>

sa: -- how many friends does Sam have?
sa: count friends |Sam>
|number: 7>

sa: -- what friends do Fred and Sam have?
sa: union (friends |Fred>,friends|Sam>)
|Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie> + |George> + |Robert> + |Frank> + |Julie>

sa: -- how many friends do Fred and Sam have?
sa: count union (friends |Fred>,friends|Sam>)
|number: 12>

sa: -- what friends do Fred and Sam have in common?
sa: common (friends |Fred>,friends|Sam>)    -- common is an alias for intersection.
|Jack> + |Emma> + |Charlie>

sa: -- how many friends do Fred and Sam have in common?
sa: count common (friends|Fred>,friends|Sam>)
|number: 3>
sa: age |Mary> => rescale[1] smooth[1]^2 |age: 40>
sa: dump
----------------------------------------
|context> => |context: sw console>

supported-ops |Mary> => |op: age>
age |Mary> => 0.167|age: 38.0> + 0.667|age: 39.0> + |age: 40.0> + 0.667|age: 41.0> + 0.167|age: 42.0>
----------------------------------------
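A small Python sketch of how smooth[dx] and rescale[1] could produce those coefficients, assuming smooth[dx] applies a 1/4, 1/2, 1/4 kernel (which exactly reproduces the 0.167, 0.667, 1, 0.667, 0.167 shape above):

def smooth(sp, dx=1):
    # sp maps a numeric value (eg an age) to a coefficient
    out = {}
    for x, a in sp.items():
        for shift, w in [(-dx, 0.25), (0, 0.5), (dx, 0.25)]:
            out[x + shift] = out.get(x + shift, 0) + a * w
    return out

def rescale(sp, top=1):
    # rescale so the largest coefficient equals top
    m = max(sp.values())
    return {x: top * a / m for x, a in sp.items()}

sp = smooth(smooth({40: 1}))    # smooth[1]^2 |age: 40>
print(rescale(sp))
# {38: 0.1667, 39: 0.6667, 40: 1.0, 41: 0.6667, 42: 0.1667}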
siblings |person: *> #=> brothers |_self> + sisters |_self>
children |person: *> #=> sons |_self> + daughters |_self>
parents |person: *> #=> mother |_self> + father |_self>
uncles |person: *> #=> brothers parents |_self>
aunts |person: *> #=> sisters parents |_self>
aunts-and-uncles |person: *> #=> siblings parents |_self>
cousins |person: *> #=> children siblings parents |_self>
grand-fathers |person: *> #=> father parents |_self>
grand-mothers |person: *> #=> mother parents |_self>
grand-parents |person: *> #=> parents parents |_self>
grand-children |person: *> #=> children children |_self>
great-grand-parents |person: *> #=> parents parents parents |_self>
great-grand-children |person: *> #=> children children children |_self>
def label_decent(x):
    print("x:",x)
    result = [x]
    if x == "*":
        return result
    if x.endswith(": *"):
        x = x[:-3]
    while True:
        try:
            x,null = x.rsplit(": ",1)
            result.append(x + ": *")
        except:
            result.append("*")
            return result

eg, if you feed in this label "a: b: c: d: fred", it returns these trial labels:
a: b: c: d: fred
a: b: c: d: *
a: b: c: *
a: b: *
a: *
*

And the key code in context.recall() is:
match = False
for trial_label in label_decent(label):
    if trial_label in self.known_kets:
        if op in self.rules[trial_label]:
            rule = self.rules[trial_label][op]
            match = True
            break
if not match:
    print("recall not found")
    rule = ket("",0)
Celcius |*> #=> C |_self>
Celsius |*> #=> C |_self>

Ditto for Fahrenheit and Kelvin:

Fahrenheit |*> #=> F |_self>
Kelvin |*> #=> K |_self>

sa: roughly |*> #=> rescale[1] smooth[1]^2 |_self>
sa: roughly |age: 40>
0.167|age: 38.0> + 0.667|age: 39.0> + |age: 40.0> + 0.667|age: 41.0> + 0.167|age: 42.0>

And we can neatly implement the concept of generalisations and stereotypes:
"80% of women take too long to get ready" take-too-long-to-get-ready |person: female: *> #=> 0.8 |yes> But we can over-ride with: "Mary doesn't take long to get ready" take-too-long-to-get-ready |person: female: Mary> => 0.1 |yes> "70% of men drink too much" drink-too-much |person: male: *> #=> 0.7|yes> But we can over-ride with: "Fred is a tea-totaler" drink-too-much |person: male: Fred> => |no> "60% of British have bad teeth" have-bad-teeth |person: UK: *> #=> 0.6 |yes> "10% of Americans have bad teeth" have-bad-teeth |person: US: *> #=> 0.1 |yes>And we can implement the idea of general rules for plurals, overridden by more specific rules:
sa: plural |word: *> #=> read-letters(|_self> + |letter: s>)
sa: plural |word: apple>
|word: apples>

sa: plural |word: mouse>
|word: mouses>

sa: plural |word: mouse> => |word: mice>    -- let's store the exception to the rule:
sa: plural |word: mouse>                    -- test it works.
|word: mice>

sa: plural |word: rabbit>                   -- test the general rule again.
|word: rabbits>

sa: plural |rabbit>                         -- show the importance of the "word: " prefix, since the rule was defined for |word: *>
recall not found
|>

Here is a closely related example:
sa: is-human |person: *> #=> |yes>
sa: is-human |person: Fred>
|yes>

sa: is-human |person: Sam> => |no!>    -- we don't like Sam.
sa: is-human |person: Eric>
|yes>

sa: is-human |person: Sam>
|no!>
sa: read |text: Two of our famous Belgian Waffles with plenty of real maple syrup.>
|word: two> + 2.000|word: of> + |word: our> + |word: famous> + |word: belgian> + |word: waffles> + |word: with> + |word: plenty> + |word: real> + |word: maple> + |word: syrup>

But with active read we have examples such as:
|text: Two of our famous Belgian Waffles with plenty of real maple syrup.>
becomes:
|number: 2> + |country: Belgium> + |food: belgian waffles> + |food: waffles> + |food: maple syrup>

|text: Light Belgian waffles covered with strawberries and whipped cream.>
becomes:
|country: Belgium> + |food: belgian waffles> + |food: waffles> + |food: strawberries> + |fruit: strawberries> + |food: whipped cream> + |food: cream>

|text: Two eggs, bacon or sausage, toast, and our ever-popular hash browns.>
becomes:
|number: 2> + |food: egg> + |food: bacon> + |food: sausage> + |food: toast> + |food: hash browns>

This is done using the pattern recognition code, and this set of rules:
|food: waffles> => |word: waffles>
|country: Belgium> => |word: belgian>
|food: strawberries> => |word: strawberries>
|fruit: strawberries> => |word: strawberries>
|food: berries> => |word: berries>
|fruit: berries> => |word: berries>
|country: France> => |word: french>
|food: toast> => |word: toast>
|meal: breakfast> => |word: breakfast>
|food: egg> => |word: egg>
|food: eggs> => |word: eggs>
|food: bacon> => |word: bacon>
|food: sausage> => |word: sausage>
|food: sausages> => |word: sausages>
|number: 2> => |word: two>
|food: cream> => |word: cream>
|food: belgian waffles> => |word: belgian> + |word: waffles>
|food: maple syrup> => |word: maple> + |word: syrup>
|food: whipped cream> => |word: whipped> + |word: cream>
|food: hash browns> => |word: hash> + |word: browns>

BTW, the code for active read is currently at the bottom of the functions code.
def silent_active_read_text(context,one,pattern=""):
    result = superposition()
    data = read_text(one).data    # later other functions here too instead of just read_text()
    for k in range(len(data)):
        y1 = data[k]
        result += context.pattern_recognition(y1,pattern).drop_below(1)
        if k < len(data) - 1:
            y2 = data[k] + data[k + 1]    # this line corresponds to my "buffer" idea. Explain later!
            result += context.pattern_recognition(y2,pattern).drop_below(1)
    return result
sa: load next-breakfast-menu.sw

-- first, without active-buffer:
sa: read |text: two of our famous belgian waffles with plenty of real maple syrup>
|word: two> + 2.000|word: of> + |word: our> + |word: famous> + |word: belgian> + |word: waffles> + |word: with> + |word: plenty> + |word: real> + |word: maple> + |word: syrup>

-- now, apply the active-buffer function:
sa: active-buffer[2,1] read |text: two of our famous belgian waffles with plenty of real maple syrup>
|number: 2> + |country: Belgium> + |food: belgian waffles> + |food: waffles> + |food: maple syrup>

-- another example:
sa: read description |food: Homestyle Breakfast>
|word: Two> + |word: eggs> + |word: bacon> + |word: or> + |word: sausage> + |word: toast> + |word: and> + |word: our> + |word: ever-popular> + |word: hash> + |word: browns>

-- now apply the active-buffer function:
sa: active-buffer[2,0] read description |food: Homestyle Breakfast>
2.000|food: eggs> + 2.000|food: bacon> + 2.000|food: sausage> + 2.000|food: toast> + 2.500|food: hash browns>

-- NB: it missed the |word: Two> since the current version of read does not convert to lower-case, and the pattern is:
-- |number: 2> => |word: two>
-- ie, lowercase "two"
sa: load next-breakfast-menu.sw
sa: description |food: Homestyle Breakfast>
|text: "Two eggs, bacon or sausage, toast, and our ever-popular hash browns">

sa: read description |food: Homestyle Breakfast>
|word: two> + |word: eggs> + |word: bacon> + |word: or> + |word: sausage> + |word: toast> + |word: and> + |word: our> + |word: ever-popular> + |word: hash> + |word: browns>

sa: active-buffer[2,1] read description |food: Homestyle Breakfast>
|number: 2> + |food: eggs> + |food: bacon> + |food: sausage> + |food: toast> + |food: hash browns>

-- now define a pattern:
sa: |food: homestyle breaky> => |food: eggs> + |food: bacon> + |food: sausage>

-- now another layer of active-buffer:
sa: active-buffer[2,0] active-buffer[2,1] read description |food: Homestyle Breakfast>
3.000|food: homestyle breaky>

-- a slightly tweaked version of the same (just jiggling coeffs really):
sa: active-buffer[3,1] active-buffer[2,1] read description |food: Homestyle Breakfast>
|food: homestyle breaky>

So maybe I should try to explain what is going on here.
-- load up the data I prepared:
sa: load spell-active-buffer.sw

-- let's take a look:
sa: display
context: sw console

  l1
  supported-ops: op:
               : a, b, f, r, o, g, u, v, w

  l2
  supported-ops: op:
               : 2.00 s, g, r, t, a, c, xy, z

  animal: frog
  supported-ops: op: spell
          spell: f, r, o, g

  weather: fog
  supported-ops: op: spell
          spell: f, o, g

  animal: rat
  supported-ops: op: spell
          spell: r, a, t

-- now play with active buffer:
sa: active-buffer[7,0,spell] "" |l1>
6.655|animal: rat> + 14.269|weather: fog> + 17.164|animal: frog>

sa: active-buffer[7,0,spell] "" |l2>
2.977|weather: fog> + 6.155|animal: frog> + 12.899|animal: rat>

Of course, this would work better in terms of using sequences, but we don't have the code for that.
rat is matching "a, b, f, r"    (note the r and a)
fog is matching "f, r, o, g"    (ignoring the r)

and:

frog is matching "g, r"         (blind to order of the kets)
rat is matching "r, t, a"       (blind to order of the kets)

Still works as a fine proof of concept though.
|a> . |b> . |c> . |d>

maps to:

c1 (|a> + |b> + |c> + |d>)
+ c2 (|ab> + |bc> + |cd>)
+ c3 (|ac> + |bd>)
+ c4 (|abc> + |bcd>)
+ c5 (|abd> + |acd>)
+ c6 |abcd>

So a superposition (the first term) acts as a first approximation to a sequence.
|f> . |r> . |g> . |o>

maps to:

|f> . |r> . c |o> . c |g>

Then compare:

|f> . |r> . c |o> . c |g>

with

|f> . |r> . |o> . |g>

How to actually do this in the general case in code, I currently have no idea!
-- first load up some knowledge:
sa: |person: Fred Smith> => |word: fred> + |word: freddie> + |word: smith> + |word: smithie>    -- various names and nick-names
sa: |person: Mary> => |word: mazza>    -- just a nick-name
sa: |greeting: Hey!> => |word: hey>
sa: |question: what is> => |word: what's>
sa: |direction: up> => |word: up>
sa: |phrase: having a baby> => read |text: having a baby>
sa: |phrase: in the family way> => read |text: in the family way>
sa: |phrase: up the duff> => read |text: up the duff>
sa: |phrase: with child> => read |text: with child>
sa: |concept: pregnancy> => |phrase: having a baby> + |phrase: in the family way> + |phrase: up the duff> + |phrase: with child>

-- save a copy:
sa: save active-buffer-play.sw

-- now start playing with it:
sa: active-buffer[7,0] read |text: Hey Freddie what's up?>
2.083|greeting: Hey!> + 1.500|person: Fred Smith> + 2.917|question: what is> + 2.083|direction: up> + 1.250|phrase: up the duff>

-- up the duff is in there because of the word "up"
-- indeed, this shows up if we apply another layer of active-buffer:
sa: active-buffer[7,0] active-buffer[7,0] read |text: Hey Freddie what's up?>
0.988|concept: pregnancy>

-- now test phrase matching a concept, in this case phrases that mean pregnant.
sa: active-buffer[7,0] read |text: Hey Mazza, you with child, up the duff, in the family way, having a baby?>
2.593|greeting: Hey!> + 4.186|person: Mary> + 11.586|phrase: with child> + 6.857|direction: up> + 23.414|phrase: up the duff> + 25.000|phrase: in the family way> + 9.224|phrase: having a baby>

-- one more layer of active-buffer:
sa: active-buffer[7,0] active-buffer[7,0] read |text: Hey Mazza, you with child, up the duff, in the family way, having a baby?>
11.069|concept: pregnancy>
# active-buffer[N,t] some-superposition         -- uses "" as the default pattern.
# active-buffer[N,t,pattern] some-superposition -- uses your chosen pattern (we can't use "" as the pattern, due to broken parser!)
# eg: active-buffer[3,0] read |text: I want french waffles>
# where:
#   N is an int         -- the size of the active buffer
#   t is a float        -- the drop below threshold
#   pattern is a string -- the pattern we are using
def console_active_buffer(one,context,parameters):    # one is the passed in superposition.
    try:
        N,t,pattern = parameters.split(',')
        N = int(N)
        t = float(t)
    except:
        try:
            N,t = parameters.split(',')
            N = int(N)
            t = float(t)
            pattern = ""
        except:
            return ket("",0)
    result = superposition()
    data = one.data
    for k in range(len(data)):
        for n in range(N):
            if k < len(data) - n:
                y = superposition()
                y.data = data[k:k+n+1]    # I guess this is the bit you could call the buffer.
                result += context.pattern_recognition(y,pattern).drop_below(t)
    return result    # .coeff_sort() here?
sa: load internet-acronyms.sw
sa: active-buffer[7,0] read |text: WTF is going on OMg thx RTFM!>
2.593|phrase: What The Fuck> + 4.826|phrase: Oh My God> + 4.043|phrase: Thanks> + 2.593|phrase: Read the Fine Manual> + 2.593|phrase: Read the Fucking Manual>

I think that is pretty cool. And the results will only get better with more knowledge in BKO form.
sa: load internet-acronyms.sw
sa: create inverse
sa: read |text: WTF is going on OMg thx RTFM!>
|word: wtf> + |word: is> + |word: going> + |word: on> + |word: omg> + |word: thx> + |word: rtfm>

sa: active-buffer[7,0] read |text: WTF is going on OMg thx RTFM!>
2.593|phrase: What The Fuck> + 4.826|phrase: Oh My God> + 4.043|phrase: Thanks> + 2.593|phrase: Read the Fine Manual> + 2.593|phrase: Read the Fucking Manual>

sa: active-buffer[7,0,inverse-] active-buffer[7,0] read |text: WTF is going on OMg thx RTFM!>
1.302|word: wtf> + 3.221|word: omg> + 3.274|word: thx> + 4.044|word: rtfm>

sa: active-buffer[7,0] active-buffer[7,0,inverse-] active-buffer[7,0] read |text: WTF is going on OMg thx RTFM!>
1.333|phrase: What The Fuck> + 2.509|phrase: Oh My God> + 2.264|phrase: Thanks> + 1.525|phrase: Read the Fine Manual> + 1.525|phrase: Read the Fucking Manual>

So back and forth. Words to phrase. Phrase to words. Words to phrase. Kinda cool!
|phrase: Be Right Back> => |word: brb>
|phrase: By The Way> => |word: btw>

Maybe they should use acronym as the operator-label instead of the empty operator-label "":
acronym |phrase: Be Right Back> => |word: brb>
acronym |phrase: By The Way> => |word: btw>

sa: create inverse
sa: active-buffer[7,0,acronym] read |text: BTW, brb!!>
1.500|phrase: By The Way> + 1.500|phrase: Be Right Back>

sa: active-buffer[7,0,inverse-acronym] active-buffer[7,0,acronym] read |text: BTW, brb!!>
1.167|word: btw> + 1.167|word: brb>
|context> => |context: H I pat rec>

#   #
#   #
#   #
#####
#   #
#   #
#   #
pixels |letter: H> => |pixel: 1: 1> + |pixel: 1: 5>
pixels |letter: H> +=> |pixel: 2: 1> + |pixel: 2: 5>
pixels |letter: H> +=> |pixel: 3: 1> + |pixel: 3: 5>
pixels |letter: H> +=> |pixel: 4: 1> + |pixel: 4: 2> + |pixel: 4: 3> + |pixel: 4: 4> + |pixel: 4: 5>
pixels |letter: H> +=> |pixel: 5: 1> + |pixel: 5: 5>
pixels |letter: H> +=> |pixel: 6: 1> + |pixel: 6: 5>
pixels |letter: H> +=> |pixel: 7: 1> + |pixel: 7: 5>

    #
#   #
#   #
### #
#
#   #
#   #
pixels |noisy: H> => |pixel: 1: 5>
pixels |noisy: H> +=> |pixel: 2: 1> + |pixel: 2: 5>
pixels |noisy: H> +=> |pixel: 3: 1> + |pixel: 3: 5>
pixels |noisy: H> +=> |pixel: 4: 1> + |pixel: 4: 2> + |pixel: 4: 3> + |pixel: 4: 5>
pixels |noisy: H> +=> |pixel: 5: 1>
pixels |noisy: H> +=> |pixel: 6: 1> + |pixel: 6: 5>
pixels |noisy: H> +=> |pixel: 7: 1> + |pixel: 7: 5>

#   #
#
# ###
#####
##  #
#   #
### #
pixels |noisy: H2> => |pixel: 1: 1> + |pixel: 1: 5>
pixels |noisy: H2> +=> |pixel: 2: 1>
pixels |noisy: H2> +=> |pixel: 3: 1> + |pixel: 3: 3> + |pixel: 3: 4> + |pixel: 3: 5>
pixels |noisy: H2> +=> |pixel: 4: 1> + |pixel: 4: 2> + |pixel: 4: 3> + |pixel: 4: 4> + |pixel: 4: 5>
pixels |noisy: H2> +=> |pixel: 5: 1> + |pixel: 5: 2> + |pixel: 5: 5>
pixels |noisy: H2> +=> |pixel: 6: 1> + |pixel: 6: 5>
pixels |noisy: H2> +=> |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 5>

#####
  #
  #
  #
  #
  #
#####
pixels |letter: I> => |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 3> + |pixel: 1: 4> + |pixel: 1: 5>
pixels |letter: I> +=> |pixel: 2: 3>
pixels |letter: I> +=> |pixel: 3: 3>
pixels |letter: I> +=> |pixel: 4: 3>
pixels |letter: I> +=> |pixel: 5: 3>
pixels |letter: I> +=> |pixel: 6: 3>
pixels |letter: I> +=> |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5>

####
  #


  #
  #
# ###
pixels |noisy: I> => |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 3> + |pixel: 1: 4>
pixels |noisy: I> +=> |pixel: 2: 3>
pixels |noisy: I> +=> |>
pixels |noisy: I> +=> |>
pixels |noisy: I> +=> |pixel: 5: 3>
pixels |noisy: I> +=> |pixel: 6: 3>
pixels |noisy: I> +=> |pixel: 7: 1> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5>

##  #
 ###
  #
  #
  ###
####
#####
pixels |noisy: I2> => |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 5>
pixels |noisy: I2> +=> |pixel: 2: 2> + |pixel: 2: 3> + |pixel: 2: 4>
pixels |noisy: I2> +=> |pixel: 3: 3>
pixels |noisy: I2> +=> |pixel: 4: 3>
pixels |noisy: I2> +=> |pixel: 5: 3> + |pixel: 5: 4> + |pixel: 5: 5>
pixels |noisy: I2> +=> |pixel: 6: 1> + |pixel: 6: 2> + |pixel: 6: 3> + |pixel: 6: 4>
pixels |noisy: I2> +=> |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5>

OK. Then we drop into the console:
$ ./the_semantic_db_console.py
Welcome!

sa: load H-I-pat-rec.sw
loading sw file: H-I-pat-rec.sw

sa: simm |*> #=> 100 similar[pixels] |_self>    -- use this to save typing.
sa: simm |noisy: H>
82.353|letter: H> + 61.905|noisy: H2> + 26.667|letter: I> + 25.000|noisy: I2> + 14.286|noisy: I>

sa: simm |noisy: H2>
76.190|letter: H> + 61.905|noisy: H> + 47.619|noisy: I2> + 38.095|letter: I> + 19.048|noisy: I>

sa: simm |letter: H>
82.353|noisy: H> + 76.190|noisy: H2> + 35.000|noisy: I2> + 29.412|letter: I> + 17.647|noisy: I>

sa: simm |noisy: I>
73.333|letter: I> + 45.000|noisy: I2> + 19.048|noisy: H2> + 17.647|letter: H> + 14.286|noisy: H>

sa: simm |noisy: I2>
65.000|letter: I> + 47.619|noisy: H2> + 45.000|noisy: I> + 35.000|letter: H> + 25.000|noisy: H>

sa: simm |letter: I>
73.333|noisy: I> + 65.000|noisy: I2> + 38.095|noisy: H2> + 29.412|letter: H> + 26.667|noisy: H>

Here is the code to print out the letters based on the defined pixels.
def pixels_to_string(context,one):
    data = one.apply_op(context,"pixels")
    I = 5
    J = 7
    string = ""
    for j in range(1,J+1):
        for i in range(1,I+1):
            elt = ket("pixel: " + str(j) + ": " + str(i))
            coeff = data.find_value(elt)
            c = '#'
            if coeff == 0:
                c = ' '
            string += c
        string += "\n"
    return string.rstrip('\n')

def pixel_ket(i,j):
    return ket("pixel: " + str(j) + ": " + str(i))

def create_pixel_rules(label,image):
    pre = "pixels |" + label + "> +=>"
    i = 0
    j = 0
    for line in image.split('\n'):
        result = superposition()
        j += 1
        for c in line:
            i += 1
            if c != ' ':
                result += pixel_ket(i,j)
        print(pre,result)
        i = 0

Here is an example going back and forth:
C = context_list("pattern recognition play") -- define a context_list. load_sw(C,"H-I-pat-rec.sw") -- load up the data from the relevant .sw file. image = pixels_to_string(C,ket("letter: H")) -- convert the rules for |letter: H> to a string representation. print(image) # # # # # # ##### # # # # # # create_pixel_rules("letter: H",image) -- convert the string "image" back to a list of rules. pixels |letter: H> +=> |pixel: 1: 1> + |pixel: 1: 5> pixels |letter: H> +=> |pixel: 2: 1> + |pixel: 2: 5> pixels |letter: H> +=> |pixel: 3: 1> + |pixel: 3: 5> pixels |letter: H> +=> |pixel: 4: 1> + |pixel: 4: 2> + |pixel: 4: 3> + |pixel: 4: 4> + |pixel: 4: 5> pixels |letter: H> +=> |pixel: 5: 1> + |pixel: 5: 5> pixels |letter: H> +=> |pixel: 6: 1> + |pixel: 6: 5> pixels |letter: H> +=> |pixel: 7: 1> + |pixel: 7: 5>This means I can now create pixel rule sets for images just by doing ascii art, without the hard work of working out the rules manually.
#######
#  #  #
#  #  #
### ###
#  #  #
#  #  #
#######
pixels |squares> +=> |pixel: 1: 1> + |pixel: 1: 2> + |pixel: 1: 3> + |pixel: 1: 4> + |pixel: 1: 5> + |pixel: 1: 6> + |pixel: 1: 7>
pixels |squares> +=> |pixel: 2: 1> + |pixel: 2: 4> + |pixel: 2: 7>
pixels |squares> +=> |pixel: 3: 1> + |pixel: 3: 4> + |pixel: 3: 7>
pixels |squares> +=> |pixel: 4: 1> + |pixel: 4: 2> + |pixel: 4: 3> + |pixel: 4: 5> + |pixel: 4: 6> + |pixel: 4: 7>
pixels |squares> +=> |pixel: 5: 1> + |pixel: 5: 4> + |pixel: 5: 7>
pixels |squares> +=> |pixel: 6: 1> + |pixel: 6: 4> + |pixel: 6: 7>
pixels |squares> +=> |pixel: 7: 1> + |pixel: 7: 2> + |pixel: 7: 3> + |pixel: 7: 4> + |pixel: 7: 5> + |pixel: 7: 6> + |pixel: 7: 7>
dim-1 |squares> => |dimension: 7>
dim-2 |squares> => |dimension: 7>
y = M x

[ y1 ]   [ 0 1 1 0 ] [ x1 ]
[ y2 ] = [ 4 0 2 3 ] [ x2 ]
[ y3 ]   [ 2 1 4 4 ] [ x3 ]
                     [ x4 ]

In BKO this is:
M |x1> => 0.0|y1> + 4.0|y2> + 2.0|y3>
M |x2> => |y1> + 0.0|y2> + |y3>
M |x3> => |y1> + 2.0|y2> + 4.0|y3>
M |x4> => 0.0|y1> + 3.0|y2> + 4.0|y3>

A couple of examples.
sa: M (|x1> + |x2> + |x3> + |x4>)
2.000|y1> + 9.000|y2> + 11.000|y3>

ie, y = (2,9,11)
sa: M (9|x1> + 3|x2> + 0|x3> + 4|x4>)
3.000|y1> + 48.000|y2> + 37.000|y3>

ie, y = (3,48,37)
y = M1 x

[ y1 ] = [ 0 7 1 1 6 4 1 ] [ x1 ]
[ y2 ]   [ 3 6 4 0 4 8 2 ] [ x2 ]
                           [ x3 ]
                           [ x4 ]
                           [ x5 ]
                           [ x6 ]
                           [ x7 ]

z = M2 y

[ z1 ]   [ 6 0 ] [ y1 ]
[ z2 ]   [ 2 3 ] [ y2 ]
[ z3 ] = [ 7 4 ]
[ z4 ]   [ 9 0 ]
[ z5 ]   [ 5 1 ]

In BKO this is:
M1 |x1> => 3|y2>    -- NB: we drop/ignore terms that have coeff == 0.
M1 |x2> => 7|y1> + 6|y2>
M1 |x3> => |y1> + 4|y2>
M1 |x4> => |y1>
M1 |x5> => 6|y1> + 4|y2>
M1 |x6> => 4|y1> + 8|y2>
M1 |x7> => |y1> + 2|y2>

M2 |y1> => 6|z1> + 2|z2> + 7|z3> + 9|z4> + 5|z5>
M2 |y2> => 3|z2> + 4|z3> + |z5>

Now, let's play with the BKO.
sa: M1 (|x1> + |x2> + |x3> + |x4> + |x5> + |x6> + |x7>)
27.000|y2> + 20.000|y1>    -- NB: the order of |y1> and |y2> is reversed. This is irrelevant.

ie, y = (20,27)
sa: M1 (8|x1> + 9|x3> + 3|x4> + |x5> + 6|x6> + |x7>)
43.000|y1> + 114.000|y2>

ie, y = (43,114)
sa: M2 (|y1> + |y2>)
6.000|z1> + 5.000|z2> + 11.000|z3> + 9.000|z4> + 6.000|z5>

ie, z = (6,5,11,9,6)
sa: M2 (43|y1> + 114|y2>)
258.000|z1> + 428.000|z2> + 757.000|z3> + 387.000|z4> + 329.000|z5>

ie, z = (258,428,757,387,329)
sa: M2 M1 (8|x1> + 9|x3> + 3|x4> + |x5> + 6|x6> + |x7>)
258.000|z1> + 428.000|z2> + 757.000|z3> + 387.000|z4> + 329.000|z5>

Finally, if we want to find M = M2 M1, so that z = M x, then we can do that easily enough too:
sa: M2 M1 |x1>
9.000|z2> + 12.000|z3> + 3.000|z5>

sa: M2 M1 |x2>
42.000|z1> + 32.000|z2> + 73.000|z3> + 63.000|z4> + 41.000|z5>

sa: M2 M1 |x3>
6.000|z1> + 14.000|z2> + 23.000|z3> + 9.000|z4> + 9.000|z5>

sa: M2 M1 |x4>
6.000|z1> + 2.000|z2> + 7.000|z3> + 9.000|z4> + 5.000|z5>

sa: M2 M1 |x5>
36.000|z1> + 24.000|z2> + 58.000|z3> + 54.000|z4> + 34.000|z5>

sa: M2 M1 |x6>
24.000|z1> + 32.000|z2> + 60.000|z3> + 36.000|z4> + 28.000|z5>

sa: M2 M1 |x7>
6.000|z1> + 8.000|z2> + 15.000|z3> + 9.000|z4> + 7.000|z5>

In standard matrix representation, this is:
z = M x

[ z1 ]   [ 0  42 6  6 36 24 6  ] [ x1 ]
[ z2 ]   [ 9  32 14 2 24 32 8  ] [ x2 ]
[ z3 ] = [ 12 73 23 7 58 60 15 ] [ x3 ]
[ z4 ]   [ 0  63 9  9 54 36 9  ] [ x4 ]
[ z5 ]   [ 3  41 9  5 34 28 7  ] [ x5 ]
                                 [ x6 ]
                                 [ x7 ]

Interestingly enough, we can use this same construct, ie "op4 op3 op2 op1 |x_i>", to find the "effective" matrix representation for operators more interesting than just literal operators.
M_mn = <m|some-op|n>

Of course, this breaks if some-op is non-linear.
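Here is a toy Python version of that construct: feed each basis ket through the operator sequence, and the results are the columns of the effective matrix M_mn = <m|some-op|n>. Operators are again dict-of-dicts; this is an illustration, not the engine code:

M1 = {"x1": {"y2": 3}, "x2": {"y1": 7, "y2": 6}, "x3": {"y1": 1, "y2": 4},
      "x4": {"y1": 1}, "x5": {"y1": 6, "y2": 4}, "x6": {"y1": 4, "y2": 8},
      "x7": {"y1": 1, "y2": 2}}
M2 = {"y1": {"z1": 6, "z2": 2, "z3": 7, "z4": 9, "z5": 5},
      "y2": {"z2": 3, "z3": 4, "z5": 1}}

def apply(op, sp):
    out = {}
    for k, c in sp.items():
        for k2, c2 in op.get(k, {}).items():
            out[k2] = out.get(k2, 0) + c * c2
    return out

# column n of M = M2 M1 is just the operator sequence applied to the basis ket |x_n>
for n in ("x1", "x2", "x3", "x4", "x5", "x6", "x7"):
    print(n, apply(M2, apply(M1, {n: 1})))
# x1 {'z2': 9, 'z3': 12, 'z5': 3}  -- matching M2 M1 |x1> above, and so on.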
For example:

[ a1  ]   [ 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 ] [ a1  ]
[ a2  ]   [ 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ a2  ]
[ a3  ]   [ 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ a3  ]
[ a4  ]   [ 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ a4  ]
[ a5  ]   [ 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ a5  ]
[ a6  ]   [ 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 ] [ a6  ]
[ a7  ]   [ 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 ] [ a7  ]
[ a8  ]   [ 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 ] [ a8  ]
[ a9  ] = [ 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 ] [ a9  ]
[ a10 ]   [ 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 ] [ a10 ]
[ b1  ]   [ 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 ] [ b1  ]
[ b2  ]   [ 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 ] [ b2  ]
[ b3  ]   [ 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 ] [ b3  ]
[ b4  ]   [ 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 ] [ b4  ]
[ b5  ]   [ 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 ] [ b5  ]
[ b6  ]   [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 ] [ b6  ]
[ b7  ]   [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 ] [ b7  ]
O |a1> => |a2>
O |a2> => |a3>
O |a3> => |a4>
O |a4> => |a5>
O |a5> => |a6>
O |a6> => |a7>
O |a7> => |a8>
O |a8> => |a9>
O |a9> => |a10>
O |a10> => |a1> + |b1>
O |b1> => |b2>
O |b2> => |b3>
O |b3> => |b4>
O |b4> => |b5>
O |b5> => |b6>
O |b6> => |b7>
O |b7> => |b1>

BTW, "currency conservation" corresponds to sum of columns == 1 (which we don't have here due to the O |a10> => |a1> + |b1> rule).
count-sum superposition == count-sum some-op superposition

eg:
sa: count-sum (5|a1> + 9|a2> + 4|a3> + 2|a4> + |a5> + 9|a6> + 2|a7> + 7|a8> + 5|a9> + 2|a10> + 4|b1> + 5|b2> + 7|b3> + 8|b6> + 4|b7>)
|number: 74.0>

sa: -- without currency conservation, apply the O matrix once:
sa: count-sum O (5|a1> + 9|a2> + 4|a3> + 2|a4> + |a5> + 9|a6> + 2|a7> + 7|a8> + 5|a9> + 2|a10> + 4|b1> + 5|b2> + 7|b3> + 8|b6> + 4|b7>)
|number: 76.0>

sa: -- apply the O matrix 20 times:
sa: count-sum O^20 (5|a1> + 9|a2> + 4|a3> + 2|a4> + |a5> + 9|a6> + 2|a7> + 7|a8> + 5|a9> + 2|a10> + 4|b1> + 5|b2> + 7|b3> + 8|b6> + 4|b7>)
|number: 166.0>

sa: O |a10> => 0.3|a1> + 0.7|b1>    -- restore currency conservation, and apply O once:
sa: count-sum O (5|a1> + 9|a2> + 4|a3> + 2|a4> + |a5> + 9|a6> + 2|a7> + 7|a8> + 5|a9> + 2|a10> + 4|b1> + 5|b2> + 7|b3> + 8|b6> + 4|b7>)
|number: 74.0>

sa: -- apply the O matrix 20 times:
sa: count-sum O^20 (5|a1> + 9|a2> + 4|a3> + 2|a4> + |a5> + 9|a6> + 2|a7> + 7|a8> + 5|a9> + 2|a10> + 4|b1> + 5|b2> + 7|b3> + 8|b6> + 4|b7>)
|number: 74.00000000000001>

-- and we still have the original amount of currency after 20 rounds of the O matrix.
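In matrix language, currency conservation just says every column sums to 1 (a column-stochastic matrix). A quick Python check on the O operator, using the same dict-of-dicts toy representation as the sketches above:

# the O operator; each dict value is the column for one ket
O = {"a" + str(i): {"a" + str(i + 1): 1} for i in range(1, 10)}
O.update({"b" + str(i): {"b" + str(i + 1): 1} for i in range(1, 7)})
O["a10"] = {"a1": 1, "b1": 1}    # this column sums to 2, so currency grows
O["b7"] = {"b1": 1}

def conserves_currency(op, tol=1e-9):
    return all(abs(sum(column.values()) - 1) < tol for column in op.values())

print(conserves_currency(O))         # False, because of the a10 column
O["a10"] = {"a1": 0.3, "b1": 0.7}    # the restored rule from the console run
print(conserves_currency(O))         # True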
sa: dump
----------------------------------------
|context> => |context: algebra play>

supported-ops |*> => |op: foo>
foo |*> #=> algebra(""|_self>,|^>,|3>)

supported-ops |x> => |op: >
 |x> => |a>

supported-ops |y> => |op: >
 |y> => |a> + |b>

supported-ops |z> => |op: >
 |z> => |a> + |b> + |c>
----------------------------------------
sa: foo |x>
|a*a*a>

sa: foo |y>
|a*a*a> + 3.000|a*a*b> + 3.000|a*b*b> + |b*b*b>

sa: foo |z>
|a*a*a> + 3.000|a*a*b> + 3.000|a*a*c> + 3.000|a*b*b> + 6.000|a*b*c> + 3.000|a*c*c> + |b*b*b> + 3.000|b*b*c> + 3.000|b*c*c> + |c*c*c>

26/4/2014 update: Now, what if you want functions of more than one variable?
----------------------------------------
|x> => |a>
power |x> => |4>

|y> => |a> + |b>
power |y> => |5>

foo |*> #=> algebra(""|_self>,|^>,|3>)
bah |*> #=> algebra(""|_self>,|^>,power|_self>)
----------------------------------------
sa: foo |x>
|a*a*a>

sa: bah |x>
|a*a*a*a>

sa: foo |y>
|a*a*a> + 3.000|a*a*b> + 3.000|a*b*b> + |b*b*b>

sa: bah |y>
|a*a*a*a*a> + 5.000|a*a*a*a*b> + 10.000|a*a*a*b*b> + 10.000|a*a*b*b*b> + 5.000|a*b*b*b*b> + |b*b*b*b*b>

These being equivalent to:
foo(t) = t^3
bah(t,power) = t^power
[ y1 ]   [ 7 3 2 9 ] [ x1 ]
[ y2 ] = [ 0 0 0 0 ] [ x2 ]
[ y3 ]   [ 5 2 0 3 ] [ x3 ]
[ y4 ]   [ 0 0 0 0 ] [ x4 ]

In BKO terms this is identical to:
[ y1 ] = [ 7 3 2 9 ] [ x1 ]
[ y3 ]   [ 5 2 0 3 ] [ x2 ]
                     [ x3 ]
                     [ x4 ]

and:
[ y1 ]   [ 7 3 2 9 ] [ x1 ]
[ y2 ]   [ 0 0 0 0 ] [ x2 ]
[ y3 ] = [ 5 2 0 3 ] [ x3 ]
[ y4 ]   [ 0 0 0 0 ] [ x4 ]
[ y5 ]   [ 0 0 0 0 ]
[ y6 ]   [ 0 0 0 0 ]
[ y7 ]   [ 0 0 0 0 ]

with this set of rules:
M |x1> => 7|y1> + 5|y3>
M |x2> => 3|y1> + 2|y3>
M |x3> => 2|y1>
M |x4> => 9|y1> + 3|y3>
sa: features |my perfect woman> => |beautiful> + |smart> + |skinny> + |educated> + |loving> + |sexy>
sa: features |Mary> => |loving> + |skinny>
sa: features |Liz> => |smart> + |educated> + |loving>
sa: features |Jane> => |skinny> + |sexy>
sa: features |Mia> => |smart> + |skinny> + |educated> + |loving>
sa: features |Emma> => |athletic> + |skinny> + |sexy> + |beautiful> + |religious>
sa: features |Donna> => |beautiful> + |smart> + |skinny> + |educated> + |sexy>
sa: features |the goddess> => |beautiful> + |smart> + |skinny> + |educated> + |loving> + |sexy>

sa: fsimm |*> #=> 100 similar[features] |_self>    -- use this to save typing
sa: fsimm |my perfect woman>
100.000|the goddess> + 83.333|Donna> + 66.667|Mia> + 50.000|Liz> + 50.000|Emma> + 33.333|Mary> + 33.333|Jane>

sa: -- she is out of my league:
sa: drop in-range[80,100] fsimm |my perfect woman>
100.000|the goddess> + 83.333|Donna>

sa: -- she is in my league:
sa: drop in-range[50,80] fsimm |my perfect woman>
66.667|Mia> + 50.000|Liz> + 50.000|Emma>

sa: -- I'm not all that interested in her:
sa: drop-above[49] fsimm |my perfect woman>
33.333|Mary> + 33.333|Jane>

There are a bunch of things working underneath to make this all work!
F0 0
F1 1
F2 1
F3 2
F4 3
F5 5
F6 8
F7 13
F8 21
F9 34
F10 55
F11 89
F12 144
F13 233
F14 377
F15 610
F16 987
F17 1597
F18 2584
F19 4181
F20 6765
F21 10946
F22 17711
F23 28657
F24 46368
F25 75025
F26 121393
F27 196418
F28 317811
F29 514229
F30 832040
F31 1346269
F32 2178309
F33 3524578
F34 5702887
F35 9227465
F36 14930352
F37 24157817
F38 39088169
F39 63245986
F40 102334155
F41 165580141
F42 267914296
F43 433494437
F44 701408733
F45 1134903170

Here is Fibonacci in BKO:
|context> => |context: Fibonacci method 1>
fib |0> => |0>
fib |1> => |1>
n-1 |*> #=> arithmetic(|_self>,|->,|1>)
n-2 |*> #=> arithmetic(|_self>,|->,|2>)
fib |*> #=> fib n-1 |_self> + fib n-2 |_self>

|context> => |context: Fibonacci method 2>
fib |0> => |0>
fib |1> => |1>
n-1 |*> #=> arithmetic(|_self>,|->,|1>)
n-2 |*> #=> arithmetic(|_self>,|->,|2>)
fib |*> #=> to-number ( fib n-1 |_self> + fib n-2 |_self> )

|context> => |context: Fibonacci method 3>
fib |0> => |0>
fib |1> => |1>
n-1 |*> #=> arithmetic(|_self>,|->,|1>)
n-2 |*> #=> arithmetic(|_self>,|->,|2>)
fib |*> #=> arithmetic( fib n-1 |_self>, |+>, fib n-2 |_self>)

Now, initially this is really quite slow. Pages and pages of debugging info fill my screen even for fib |20>.
sa: fib |34> => |5702887>
sa: fib |35> => |9227465>
sa: fib |45>
|1134903170>

And of course, we can now easily find the Golden Ratio:
sa: fib-ratio |*> #=> arithmetic( fib |_self> , |/>, fib n-1 |_self> )
sa: fib-ratio |45>
|1.618033988749895>

OK. Now here is a fun thing. We can store literal rules that over-ride the general rule. Again, words fail me. Here is an example:
sa: fib |13> => fib |13>    -- learn the literal/specific rule using the general/function rule (on the right hand side).
sa: fib |14> => fib |14>
sa: dump
----------------------------------------
|context> => |context: Fibonacci method 3>

supported-ops |0> => |op: fib>
fib |0> => |0>

supported-ops |1> => |op: fib>
fib |1> => |1>

supported-ops |*> => |op: n-1> + |op: n-2> + |op: fib> + |op: fib-ratio>
n-1 |*> #=> arithmetic(|_self>,|->,|1>)
n-2 |*> #=> arithmetic(|_self>,|->,|2>)
fib |*> #=> arithmetic( fib n-1 |_self>, |+>, fib n-2 |_self>)
fib-ratio |*> #=> arithmetic( fib |_self> , |/>, fib n-1 |_self> )

supported-ops |13> => |op: fib>
fib |13> => |233>    -- this is the interesting bit. On the right, "fib |13>" has been replaced with its value, |233>.

supported-ops |14> => |op: fib>
fib |14> => |377>    -- here too.
----------------------------------------

Now whenever the code wants to know "fib |13>" or "fib |14>" it uses the specific rule, rather than the fib |*> general rule.
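In conventional programming terms, this specific-rule-beats-general-rule trick is essentially memoisation. A rough Python equivalent (not the engine code):

fib_rules = {0: 0, 1: 1}    # the literal rules: fib |0> => |0>, fib |1> => |1>

def fib(n):
    if n in fib_rules:      # a specific rule always wins over the general rule
        return fib_rules[n]
    return fib(n - 1) + fib(n - 2)    # the general fib |*> rule

# the analogue of "fib |13> => fib |13>": store the computed value as a new specific rule
fib_rules[13] = fib(13)
fib_rules[14] = fib(14)
print(fib_rules[13], fib_rules[14], fib(20))    # 233 377 6765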
fact |0> => |1>
n-1 |*> #=> arithmetic(|_self>,|->,|1>)
fact |*> #=> arithmetic( |_self>, |*>, fact n-1 |_self>)
sa: fib (|33> + |37> + |40>)
|3524578> + |24157817> + |102334155>

sa: fact (|3> + |4> + |5> + |6>)
|6> + |24> + |120> + |720>
sa: fact |number: 0> => |number: 1>
sa: n-1 |number: *> #=> arithmetic(|_self>,|->,|number: 1>)    -- NB: the |number: 1> instead of just |1>.
sa: fact |number: *> #=> arithmetic( |_self>, |*>, fact n-1 |_self>)
sa: fact |number: 6>
|number: 720>
----------------------------------------
|context> => |context: average>

ave |*> #=> arithmetic(count-sum "" |_self>,|/>,count "" |_self>)
apply-weights |*> #=> mult(""|_self>, weights|_self>)
weighted-ave |*> #=> arithmetic(count-sum apply-weights |_self>,|/>,count-sum weights |_self>)
harmonic-mean |*> #=> arithmetic(count "" |_self>,|/>,count-sum invert "" |_self>)

|x> => |a> + 2.000|b> + 3.000|c> + 4.000|d>
weights |x> => 0.100|a> + 0.100|b> + 0.700|c> + 0.100|d>

|y> => |a> + 2.000|b> + 5.000|c> + 7.000|d>
weights |y> => 2.000|a> + 14.000|b> + 8.000|c> + 32.000|d>

|z> => 60.000|a> + 40.000|b>
----------------------------------------
sa: ave |x>
|number: 2.5>

sa: ave |y>
|number: 3.75>

sa: weighted-ave |x>
|number: 2.8>

sa: weighted-ave |y>
|number: 5.25>

-- then making use of linearity we can do more than one at a time:
sa: ave (|x> + |y> + |z>)
|number: 2.5> + |number: 3.75> + |number: 50.0>

sa: weighted-ave (|x> + |y>)
|number: 2.8> + |number: 5.25>

sa: harmonic-mean (|x> + |y> + |z>)
|number: 1.9200000000000004> + |number: 2.170542635658915> + |number: 47.99999999999999>
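For reference, here are the same three means written directly in Python over a superposition-as-dict; this reproduces the 2.5, 2.8 and 1.92 results for |x> above:

def ave(sp):
    return sum(sp.values()) / len(sp)    # count-sum / count

def weighted_ave(sp, weights):
    # count-sum apply-weights / count-sum weights
    return sum(sp[k] * weights[k] for k in sp) / sum(weights.values())

def harmonic_mean(sp):
    return len(sp) / sum(1 / v for v in sp.values())    # count / count-sum invert

x = {"a": 1, "b": 2, "c": 3, "d": 4}
weights_x = {"a": 0.1, "b": 0.1, "c": 0.7, "d": 0.1}
print(ave(x), weighted_ave(x, weights_x), harmonic_mean(x))
# 2.5 2.8 1.92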
# bko_if(|True>,|a>,|b>) -- returns |a>
# bko_if(|False>,|c>,|d>) -- returns |d>
def bko_if(condition,one,two):
    if condition.the_label() == "True":
        return one
    else:
        return two

sa: if (|True>,|x>,|y>)
|x>

sa: if(|False>,|x>,|y>)
|y>

sa: foo |fish> => |True>
sa: bah |fish> => |False>
sa: if (foo |fish> , |true fish> , |false fish>)    -- foo|fish> is an indirect condition.
|true fish>

sa: if (bah |fish> , |true fish> , |false fish>)    -- so is bah|fish>
|false fish>

Just discovered there is kind of a bug here!
sa: if (foo|fish>, shout |true fish>, shout |false fish>)
TRUE FISH     -- note the code shouts both cases,
FALSE FISH    -- independent of which branch the if statement chooses.
len: 3
sp: |True>
sp: |TRUE FISH>
sp: |FALSE FISH>
op in whitelist
3 py: bko_if(pieces[0],pieces[1],pieces[2])
|TRUE FISH>

Heh. Found a neat solution:
-- method 1:
sa: shout if (foo |fish>, |true fish>, |false fish>)
TRUE FISH
|TRUE FISH>

-- method 2:
sa: |true fish> #=> shout |true fish>
sa: |false fish> #=> shout |false fish>
sa: "" if(bah|fish>, |true fish>, |false fish>)
FALSE FISH
|FALSE FISH>

-- method 3:
activate |true fish> #=> shout |true fish branch>
activate |false fish> #=> shout |false fish branch>
sa: activate if(bah|fish>, |true fish>, |false fish>)
FALSE FISH BRANCH
|FALSE FISH BRANCH>

So, let's try with words...
if(condition,op3 op2 op1|a>,op3 op2 op1|b>)

Correct:
op3 op2 op1 if(condition,|a>,|b>)

That is method 1.
if(condition,op3 op2 op1|a>,op7 op6|b>)

Correct:
temp-op |a> => op3 op2 op1 |a>
temp-op |b> => op7 op6 |b>
temp-op if(condition,|a>,|b>)

But of course, sometimes you do want the superposition calculated before applying the if statement.
temp-op |x> => op2 op1 |x>
temp-op |y> => op6 op5 op4 |y>
temp-op if(condition,bah foo|x>, bah|y>)

Nope! Broken if <x|bah foo|x> and <y|bah foo|x> == 0, and <x|bah|y> and <y|bah|y> == 0.
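The underlying issue is the same one any eager language has: arguments are evaluated before the function body runs, so both branches fire their side-effects. A tiny Python illustration, plus the thunk version that mirrors methods 2 and 3 above:

def shout(s):
    print(s.upper())
    return s

def bko_if(condition, one, two):
    return one if condition else two

bko_if(True, shout("true fish"), shout("false fish"))    # shouts both!

# delaying evaluation: pass unevaluated branches, then activate only the winner
branch = bko_if(True, lambda: shout("true fish"), lambda: shout("false fish"))
branch()    # shouts only TRUE FISH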
def valid_op(op):
    if not op[0].isalpha():
        return False
    return all(c in ascii_letters + '0123456789-' for c in op)

and now we have this:
def valid_op(op):
    if not op[0].isalpha() and not op[0] == '!':
        return False
    return all(c in ascii_letters + '0123456789-!' for c in op)

So now we can do:
sa: ! |False> => |True>
sa: !|True> => |False>
sa: shout! |*> #=> shout |_self>

sa: ! |False>
|True>

sa: ! ! |False>    -- NB: this is ! !, ie separated by a space (this is an op-sequence after-all) and not !! unless you define that yourself.
|False>

sa: shout! |fish>
FISH
|FISH>

BTW, we had to use ! and not "NOT" since NOT is already used as a sigmoid.
sa: !!|*> #=> ! !|_self>
sa: !!|True>
|True>

sa: !!|False>
|False>
!|one> => |two>
!|two> => |three>
!|three> => |four>
!|four> => |five>
!|five> => |six>
!|six> => |one>

And so we have:
sa: ! |one>
|two>

sa: ! ! |one>
|three>

sa: ! ! ! |one>
|four>

sa: ! ! ! ! |one>
|five>

sa: ! ! ! ! ! |one>
|six>

sa: ! ! ! ! ! ! |one>
|one>

sa: ! ! ! ! ! ! ! |one>
|two>
def invert(x):
    if x == 0:    # the idea of 1/0 = 0 has some merit.
        return 0  # eg, in the model a 0 coeff means ignore. Seems appropriate that 1/ignore = ignore.
    else:
        return 1/x

eg:
sa: x = 3|x> + 0.5|y> + 0|z>
sa: id
3.000|x> + 0.500|y> + 0.000|z>

sa: invert
0.333|x> + 2.000|y> + 0.000|z>

sa: invert invert
3.000|x> + 0.500|y> + 0.000|z>
sa: friends-list |*> #=> extract-value list-to-words extract-value friends |_self>
sa: hello! |*> #=> merge-labels(|Hello > + |_self> + |!>)
sa: friends |person: Eric> => |person: Fred> + |person: Sam> + |person: Harry> + |person: Mary> + |person: liz>
sa: friends |Fred> => |Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie>
sa: friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Robert> + |Frank> + |Julie>
sa: friends |Charlie> => |Jack> + |Emma>
sa: dump
----------------------------------------
|context> => |context: hello friends>

supported-ops |*> => |op: friends-list> + |op: hello!>
friends-list |*> #=> extract-value list-to-words extract-value friends |_self>
hello! |*> #=> merge-labels(|Hello > + |_self> + |!>)

supported-ops |person: Eric> => |op: friends>
friends |person: Eric> => |person: Fred> + |person: Sam> + |person: Harry> + |person: Mary> + |person: liz>

supported-ops |Fred> => |op: friends>
friends |Fred> => |Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie>

supported-ops |Sam> => |op: friends>
friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Robert> + |Frank> + |Julie>

supported-ops |Charlie> => |op: friends>
friends |Charlie> => |Jack> + |Emma>
----------------------------------------
sa: hello! |Harry>
|Hello Harry!>

sa: hello! friends-list |Charlie>
|Hello Jack and Emma!>

sa: hello! friends-list |Sam>
|Hello Charlie, George, Emma, Jack, Robert, Frank and Julie!>

sa: hello! friends-list |Fred>
|Hello Jack, Harry, Ed, Mary, Rob, Patrick, Emma and Charlie!>

sa: hello! friends-list |person: Eric>
|Hello Fred, Sam, Harry, Mary and liz!>
|context> => |context: non-linear resonance>

resonance |*> #=> 1000 drop-below[0.99] simm(""|_self>, ""|g>) |g>

|g> => |a> + |b> + |c> + |d>
|f1> => |a>
|f2> => |a> + |b>
|f3> => |a> + |b> + |c>
|f4> => |a> + |b> + |c> + 0.900|d>
|f5> => 0.950|a> + |b> + |c> + |d>
|f6> => |a> + |b> + |c> + |d>
|f7> => |a> + |b> + |c> + |d> + |e>
----------------------------------------
sa: resonance |f1>    -- no sign of a resonance.
|>

sa: resonance |f2>    -- no sign of a resonance.
|>

sa: resonance |f3>    -- still no sign of a resonance.
|>

sa: ket-simm(""|f4>,""|g>)    -- test how close |f4> and |g> are.
0.981|simm>                   -- 98% match.

sa: resonance |f4>    -- 98% match, and yet still no resonance.
|>

sa: ket-simm(""|f5>,""|g>)    -- test how close |f5> and |g> are.
0.991|simm>                   -- 99% match.

sa: resonance |f5>
990.506|g>            -- finally, we have resonance of |g>

sa: resonance |f6>    -- |f6> is a perfect match with |g>
1000.000|g>

sa: resonance |f7>
|>                    -- |f7> doesn't resonate with |g>.
|curve> => 15|x: 1> + 4|x: 2> + 8|x: 3> + 5|x: 4> + 4|x: 5> + 16|x: 6> + 0|x: 7> + 17|x: 8> + 4|x: 9> + 17|x: 10> + 5|x: 11> + 15|x: 12> + 10|x: 13> + 19|x: 14> + 11|x: 15> + 1|x: 16> + 10|x: 17> + 13|x: 18> + 3|x: 19> + 5|x: 20> + 4|x: 21> + 5|x: 22> + 13|x: 23> + 7|x: 24> + 6|x: 25> + 12|x: 26> + 9|x: 27> + 3|x: 28> + 3|x: 29> + 8|x: 30>

In the brain this could, for example, be the set of sound-wave patterns that excite the "frog" neuron.
def list_to_sp(s,list):
    result = superposition()
    result.data = [ket(s + str(k),v) for k,v in enumerate(list,1)]   # NB: enumerate from 1, so labels start at |x: 1> as in the example below.
    return result

def sp_to_list(sp):
    return [x.value for x in sp.ket_sort().data]   # NB: the ket_sort(). Even if we shuffle the sp, we get the same list back.

For example:
15.000|x: 1> + 4.000|x: 2> + 8.000|x: 3> + 5.000|x: 4> + 4.000|x: 5> + 16.000|x: 6> + 0.000|x: 7> + 17.000|x: 8> + 4.000|x: 9>
maps to:
[15.0, 4.0, 8.0, 5.0, 4.0, 16.0, 0.0, 17.0, 4.0]
fuzzy-resonance-1 |*> #=> 200 drop-below[0.51] simm(""|_self>, ""|g>) |g>
fuzzy-resonance-2 |*> #=> 200 clean drop-below[0.51] simm(""|_self>, ""|g>) |g>

sa: fuzzy-resonance-1 |f1>
|>
sa: fuzzy-resonance-1 |f2>
|>
sa: fuzzy-resonance-1 |f3>
150.000|g>
sa: fuzzy-resonance-1 |f4>
196.154|g>
sa: fuzzy-resonance-1 |f5>
198.101|g>
sa: fuzzy-resonance-1 |f6>
200.000|g>
sa: fuzzy-resonance-1 |f7>
160.000|g>

sa: fuzzy-resonance-2 |f1>
|>
sa: fuzzy-resonance-2 |f2>
|>
sa: fuzzy-resonance-2 |f3>
200.000|g>
sa: fuzzy-resonance-2 |f4>
200.000|g>
sa: fuzzy-resonance-2 |f5>
200.000|g>
sa: fuzzy-resonance-2 |f6>
200.000|g>
sa: fuzzy-resonance-2 |f7>
200.000|g>
[ y1 ]   [ s1[x1,t1] ] [ sum[x1,t] ] [ a1 a2 a3 a4 a5 ] [ x1 ]
[ y2 ]   [ s2[x1,t2] ]                                  [ x2 ]
[ y3 ]   [ s3[x1,t3] ]                                  [ x3 ]
[ y4 ] = [ s4[x1,t4] ]                                  [ x4 ]
[ y5 ]   [ s5[x1,t5] ]                                  [ x5 ]
[ y6 ]   [ s6[x1,t6] ]
[ y7 ]   [ s7[x1,t7] ]
[ y8 ]   [ s8[x1,t8] ]

This is a simplified model of a single neuron, and uses my idea of a function matrix (which I need to describe at some point).
{a1,a2,a3,a4,a5} are reals/floats and can be positive or negative.
sum[x,t] sums the input x for a time-slice of length t, then spits out the result at the end of that time slice. If we don't include the sum[] term, assume t = 0. Indeed, we only need t > 0 if we want time-dependence.
s_k[x,t_k] are sigmoids, with passed in parameter t_k.

Note that there are a lot of free parameters here, and I have no idea how the brain tweaks them!
d = a OR b OR c

[ d ] = [ BF[x1] ] [ 1 1 1 ] [ BF[x1] ] [ a ]
                             [ BF[x2] ] [ b ]
                             [ BF[x3] ] [ c ]

d = a AND b AND c

[ d ] = [ BF[x1] ] [ 1/3 1/3 1/3 ] [ BF[x1] ] [ a ]
                                   [ BF[x2] ] [ b ]
                                   [ BF[x3] ] [ c ]

d = a XOR b XOR c

[ d ] = [ XF[x1] ] [ 1 1 1 ] [ BF[x1] ] [ a ]
                             [ BF[x2] ] [ b ]
                             [ BF[x3] ] [ c ]

Where BF[x] and XF[x] are sigmoids:
def binary_filter(x):
    if x <= 0.96:
        return 0
    else:
        return 1

def xor_filter(x):
    if 0.96 <= x and x <= 1.04:
        return 1
    else:
        return 0

In BKO, and the console, this is:
sa: -- d = a OR b OR c
sa: binary-filter to-number count-sum binary-filter (|a> + 0|b> + |c>)
| >        -- NB: this is equivalent to 1, and not the empty ket (look closely!)

sa: -- d = a AND b AND c
sa: binary-filter to-number 0.3333 count-sum binary-filter (|a> + |b> + |c>)
| >
sa: binary-filter to-number 0.3333 count-sum binary-filter (|a> + 0|b> + |c>)
0.000| >   -- NB: this is equivalent to 0.

sa: -- d = a XOR b XOR c
sa: xor-filter to-number count-sum binary-filter (0|a> + |b> + 0|c>)
| >
sa: xor-filter to-number count-sum binary-filter (|a> + |b> + 0|c>)
0.000| >
sa: xor-filter to-number count-sum binary-filter (|a> + |b> + |c>)
0.000| >

OK. Let's try for a slightly more complex example:
f = (a AND b AND c) OR (d AND e)

[ f ] = [ BF[x1] ] [ 1 1 ] [ BF[x1] ] [ 1/3 1/3 1/3  0   0  ] [ BF[x1] ] [ a ]
                           [ BF[x2] ] [  0   0   0  1/2 1/2 ] [ BF[x2] ] [ b ]
                                                              [ BF[x3] ] [ c ]
                                                              [ BF[x4] ] [ d ]
                                                              [ BF[x5] ] [ e ]
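Before moving on, here is a small Python sketch of these gate constructions, assuming inputs are roughly 0/1 valued. It chains binary-filter, weighted sum, and an outer sigmoid, mirroring the matrices above (my helper names, not project code):

def binary_filter(x):
    return 1 if x > 0.96 else 0

def xor_filter(x):
    return 1 if 0.96 <= x <= 1.04 else 0

def gate(outer, weights, inputs):
    # outer sigmoid of a weighted sum of binary-filtered inputs,
    # ie [ d ] = [ outer[x1] ] [ w1 w2 w3 ] [ BF[x_k] ] [ inputs ]
    return outer(sum(w * binary_filter(x) for w, x in zip(weights, inputs)))

def OR3(a, b, c):  return gate(binary_filter, [1, 1, 1], [a, b, c])
def AND3(a, b, c): return gate(binary_filter, [1/3, 1/3, 1/3], [a, b, c])
def AND2(d, e):    return gate(binary_filter, [1/2, 1/2], [d, e])
def XOR3(a, b, c): return gate(xor_filter, [1, 1, 1], [a, b, c])

def f(a, b, c, d, e):
    # f = (a AND b AND c) OR (d AND e): two AND gates feeding an OR gate
    return gate(binary_filter, [1, 1], [AND3(a, b, c), AND2(d, e)])

print(OR3(1, 0, 1), AND3(1, 0, 1), XOR3(0, 1, 0), f(1, 1, 1, 0, 0))   # 1 0 1 1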
And we can do a version of set union and intersection too, but first we need some pieces:

def pos(x):        # the simplest of the sigmoids.
    if x <= 0:
        return 0
    else:
        return x

abs(x) = pos(x) + pos(-x)
abs(a - b) = pos(a - b) + pos(-a + b)
a + b + abs(a - b) = 2*max(a,b)
a + b - abs(a - b) = 2*min(a,b)

eg:
[ r1 ]   [ 1  1  1 ] [ pos[x1] ] [  1  1 ] [ a ]
[ r2 ] = [ 1 -1 -1 ] [ pos[x2] ] [  1 -1 ] [ b ]
                     [ pos[x3] ] [ -1  1 ]

ie:
r1 = a + b + pos(a - b) + pos(-a + b) = 2*max(a,b)
r2 = a + b - pos(a - b) - pos(-a + b) = 2*min(a,b)

And then we need the observation that max corresponds to a version of set union (it even works for values other than 0 and 1), and min corresponds to intersection.
def set_union(f,g):
    return [max(f[k],g[k]) for k in range(len(f))]

def set_intersection(f,g):
    return [min(f[k],g[k]) for k in range(len(f))]

So finally, in MatSumSig, we have union and intersection (ignoring a factor of 2):
[ U1 ]   [ 1  1  1  0  0  0  0  0  0  0  0  0 ] [ pos[x1]  ] [  1  1  0  0  0  0  0  0 ] [ f1 ]
[ I1 ] = [ 1 -1 -1  0  0  0  0  0  0  0  0  0 ] [ pos[x2]  ] [  1 -1  0  0  0  0  0  0 ] [ g1 ]
[ U2 ]   [ 0  0  0  1  1  1  0  0  0  0  0  0 ] [ pos[x3]  ] [ -1  1  0  0  0  0  0  0 ] [ f2 ]
[ I2 ]   [ 0  0  0  1 -1 -1  0  0  0  0  0  0 ] [ pos[x4]  ] [  0  0  1  1  0  0  0  0 ] [ g2 ]
[ U3 ]   [ 0  0  0  0  0  0  1  1  1  0  0  0 ] [ pos[x5]  ] [  0  0  1 -1  0  0  0  0 ] [ f3 ]
[ I3 ]   [ 0  0  0  0  0  0  1 -1 -1  0  0  0 ] [ pos[x6]  ] [  0  0 -1  1  0  0  0  0 ] [ g3 ]
[ U4 ]   [ 0  0  0  0  0  0  0  0  0  1  1  1 ] [ pos[x7]  ] [  0  0  0  0  1  1  0  0 ] [ f4 ]
[ I4 ]   [ 0  0  0  0  0  0  0  0  0  1 -1 -1 ] [ pos[x8]  ] [  0  0  0  0  1 -1  0  0 ] [ g4 ]
                                                [ pos[x9]  ] [  0  0  0  0 -1  1  0  0 ]
                                                [ pos[x10] ] [  0  0  0  0  0  0  1  1 ]
                                                [ pos[x11] ] [  0  0  0  0  0  0  1 -1 ]
                                                [ pos[x12] ] [  0  0  0  0  0  0 -1  1 ]

ie:
[U1,U2,U3,U4] = 2 * [max(f1,g1), max(f2,g2), max(f3,g3), max(f4,g4)]
[I1,I2,I3,I4] = 2 * [min(f1,g1), min(f2,g2), min(f3,g3), min(f4,g4)]

Then finally, a simple version of my favourite toy: simm:
simm(w,f,g): sum(w[k]*min(f[k],g[k]) for k in range(len(f)))

[ r ] = [ sigmoid[x1] ] [ w1 w2 w3 w4 ] [ I1 ]
                                        [ I2 ]
                                        [ I3 ]
                                        [ I4 ]

where I think in this case it is assumed w_k >= 0.

And all at once:
[ r ] = [ sigmoid[x1] ] [ w1 w2 w3 w4 ] [ pos[x1] ] [ 1 -1 -1  0  0  0  0  0  0  0  0  0 ] [ pos[x1]  ] [  1  1  0  0  0  0  0  0 ] [ f1 ]
                                        [ pos[x2] ] [ 0  0  0  1 -1 -1  0  0  0  0  0  0 ] [ pos[x2]  ] [  1 -1  0  0  0  0  0  0 ] [ g1 ]
                                        [ pos[x3] ] [ 0  0  0  0  0  0  1 -1 -1  0  0  0 ] [ pos[x3]  ] [ -1  1  0  0  0  0  0  0 ] [ f2 ]
                                        [ pos[x4] ] [ 0  0  0  0  0  0  0  0  0  1 -1 -1 ] [ pos[x4]  ] [  0  0  1  1  0  0  0  0 ] [ g2 ]
                                                                                           [ pos[x5]  ] [  0  0  1 -1  0  0  0  0 ] [ f3 ]
                                                                                           [ pos[x6]  ] [  0  0 -1  1  0  0  0  0 ] [ g3 ]
                                                                                           [ pos[x7]  ] [  0  0  0  0  1  1  0  0 ] [ f4 ]
                                                                                           [ pos[x8]  ] [  0  0  0  0  1 -1  0  0 ] [ g4 ]
                                                                                           [ pos[x9]  ] [  0  0  0  0 -1  1  0  0 ]
                                                                                           [ pos[x10] ] [  0  0  0  0  0  0  1  1 ]
                                                                                           [ pos[x11] ] [  0  0  0  0  0  0  1 -1 ]
                                                                                           [ pos[x12] ] [  0  0  0  0  0  0 -1  1 ]

Now, this can be called a space simm, but there is also a time based simm. I think it goes like this:
simm(w,f,g): sum(w[t]*min(f[t],g[t]) for t in range(len(f)))   -- k is space based, t is time based, but otherwise an identical equation.

[ r ] = [ sum[x1,t2] ] [ sigmoid[x1,t1] ] [ 1 -1 -1 ] [ pos[x1] ] [  1  1 ] [ f ]
                                                      [ pos[x2] ] [  1 -1 ] [ g ]
                                                      [ pos[x3] ] [ -1  1 ]

ie, in words: a simm of f,g with respect to time.
f[k] => f[k-1]/4 + f[k]/2 + f[k+1]/4

[ f0  ]         [ 3 1 0 0 0 0 0 0 0 0 0 ] [ f0  ]
[ f1  ]         [ 1 2 1 0 0 0 0 0 0 0 0 ] [ f1  ]
[ f2  ]         [ 0 1 2 1 0 0 0 0 0 0 0 ] [ f2  ]
[ f3  ]         [ 0 0 1 2 1 0 0 0 0 0 0 ] [ f3  ]
[ f4  ]         [ 0 0 0 1 2 1 0 0 0 0 0 ] [ f4  ]
[ f5  ] = 1/4   [ 0 0 0 0 1 2 1 0 0 0 0 ] [ f5  ]
[ f6  ]         [ 0 0 0 0 0 1 2 1 0 0 0 ] [ f6  ]
[ f7  ]         [ 0 0 0 0 0 0 1 2 1 0 0 ] [ f7  ]
[ f8  ]         [ 0 0 0 0 0 0 0 1 2 1 0 ] [ f8  ]
[ f9  ]         [ 0 0 0 0 0 0 0 0 1 2 1 ] [ f9  ]
[ f10 ]         [ 0 0 0 0 0 0 0 0 0 1 3 ] [ f10 ]

Now in BKO:
smooth |f0> => 0.75|f0> + 0.25|f1>
smooth |f1> => 0.25|f0> + 0.5|f1> + 0.25|f2>
smooth |f2> => 0.25|f1> + 0.5|f2> + 0.25|f3>
smooth |f3> => 0.25|f2> + 0.5|f3> + 0.25|f4>
smooth |f4> => 0.25|f3> + 0.5|f4> + 0.25|f5>
smooth |f5> => 0.25|f4> + 0.5|f5> + 0.25|f6>
smooth |f6> => 0.25|f5> + 0.5|f6> + 0.25|f7>
smooth |f7> => 0.25|f6> + 0.5|f7> + 0.25|f8>
smooth |f8> => 0.25|f7> + 0.5|f8> + 0.25|f9>
smooth |f9> => 0.25|f8> + 0.5|f9> + 0.25|f10>
smooth |f10> => 0.25|f9> + 0.75|f10>

Note that we have currency conservation. ie, the sum of each column = 1 (taking into account the 1/4), and count-sum smooth |fk> = 1.
sa: smooth^300 |f5>
0.091|f0> + 0.091|f1> + 0.091|f2> + 0.091|f3> + 0.091|f4> + 0.091|f5> + 0.091|f6> + 0.091|f7> + 0.091|f8> + 0.091|f9> + 0.091|f10>

sa: smooth^300 |f0>
0.091|f0> + 0.091|f1> + 0.091|f2> + 0.091|f3> + 0.091|f4> + 0.091|f5> + 0.091|f6> + 0.091|f7> + 0.091|f8> + 0.091|f9> + 0.091|f10>

sa: count-sum smooth^300 |f5>
|number: 1.0000000000000002>

sa: invert smooth^300 |f5>
11.000|f0> + 11.000|f1> + 11.000|f2> + 11.000|f3> + 11.000|f4> + 11.000|f5> + 11.000|f6> + 11.000|f7> + 11.000|f8> + 11.000|f9> + 11.000|f10>

sa: invert smooth^300 |f0>
10.954|f0> + 10.957|f1> + 10.965|f2> + 10.975|f3> + 10.987|f4> + 11.000|f5> + 11.013|f6> + 11.025|f7> + 11.036|f8> + 11.043|f9> + 11.047|f10>

sa: invert smooth^500 |f0>
10.999|f0> + 10.999|f1> + 10.999|f2> + 11.000|f3> + 11.000|f4> + 11.000|f5> + 11.000|f6> + 11.000|f7> + 11.000|f8> + 11.001|f9> + 11.001|f10>
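The convergence to 1/11 everywhere is easy to check outside the console too. A quick numpy sketch (my construction of the 11-point smoothing matrix above, not project code):

import numpy as np

n = 11
S = np.zeros((n, n))
for k in range(n):
    S[k, k] = 0.5
    if k > 0:
        S[k, k-1] = 0.25
    if k < n-1:
        S[k, k+1] = 0.25
S[0, 0] = 0.75        # boundary rows keep the currency that would otherwise leak out
S[n-1, n-1] = 0.75

f = np.zeros(n)
f[5] = 1              # a delta function at |f5>

g = np.linalg.matrix_power(S, 300) @ f
print(np.round(g, 3))   # roughly 0.091 everywhere, ie 1/11
print(g.sum())          # currency conservation: still 1.0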
Now, the brother (which I haven't actually played with much), unsmooth (yeah, it needs a better name, though I guess we could call it a balanced discrete derivative):

f[k] => - f[k-1]/2 + f[k] - f[k+1]/2

[ f0  ]         [  1 -1  0  0  0  0  0  0  0  0  0 ] [ f0  ]
[ f1  ]         [ -1  2 -1  0  0  0  0  0  0  0  0 ] [ f1  ]
[ f2  ]         [  0 -1  2 -1  0  0  0  0  0  0  0 ] [ f2  ]
[ f3  ]         [  0  0 -1  2 -1  0  0  0  0  0  0 ] [ f3  ]
[ f4  ]         [  0  0  0 -1  2 -1  0  0  0  0  0 ] [ f4  ]
[ f5  ] = 1/2   [  0  0  0  0 -1  2 -1  0  0  0  0 ] [ f5  ]
[ f6  ]         [  0  0  0  0  0 -1  2 -1  0  0  0 ] [ f6  ]
[ f7  ]         [  0  0  0  0  0  0 -1  2 -1  0  0 ] [ f7  ]
[ f8  ]         [  0  0  0  0  0  0  0 -1  2 -1  0 ] [ f8  ]
[ f9  ]         [  0  0  0  0  0  0  0  0 -1  2 -1 ] [ f9  ]
[ f10 ]         [  0  0  0  0  0  0  0  0  0 -1  1 ] [ f10 ]

Note that we don't have currency conservation. The sum of each column = 0.
[ filtered-signal ] = [ pos[x1] ] [ 1 -1 ] [ signal      ]
                                           [ off-current ]
where:
  signal is a time varying signal.
  off-current is a time varying off-current (ie, an inhibitory signal of roughly the same strength as the signal).
  filtered-signal is the result.

[ filtered-signal ] = [ pos[x1] ] [ 1 -10 ] [ signal      ]
                                            [ off-current ]
where off-current is a strongly inhibitory signal.

[ filtered-signal ] = [ pos[x1] ] [ 1 -0.2 ] [ signal      ]
                                             [ off-current ]
where off-current is a weakly inhibitory signal.

Of course, this also means it takes "currency" to switch off a signal. eg, intrusive thoughts you can't quite mentally switch off.
[ d ]   [ fn1[x1] ] [ a ]
[ e ] = [ fn2[x2] ] [ b ]
[ f ]   [ fn3[x3] ] [ c ]

which expands to:
d = fn1[a]
e = fn2[b]
f = fn3[c]

OK. So we work from the right, inserting the values from the applied vector into the respective x_i (here x1 = a, x2 = b, x3 = c).
[ d ]   [ bah1[x3]       ] [ a ]
[ e ] = [ bah2[x2,x1]    ] [ b ]
[ f ]   [ bah3[x1,x2,x3] ] [ c ]

which expands to:
d = bah1[c]
e = bah2[b,a]
f = bah3[a,b,c]

And more interestingly, the functions in the function matrices can have "stored data" (in this case L_i).
[ d ]   [ foo[L1,x] ] [ a ]
[ e ] = [ foo[L2,x] ] [ b ]
[ f ]   [ foo[L3,x] ] [ c ]

which expands to:
d = foo[L1,(a,b,c)]   -- NB: x_i are elements, x is the vector (x1,x2,...,xn)
e = foo[L2,(a,b,c)]
f = foo[L3,(a,b,c)]

For example, if we set:
L1 = (m1,m2,m3)
L2 = (m4,m5,m6)
L3 = (m7,m8,m9)

and
foo[u,v] = dot-product(u,v)

then:
[ d ]   [ foo[L1,x] ] [ a ]
[ e ] = [ foo[L2,x] ] [ b ]
[ f ]   [ foo[L3,x] ] [ c ]

expands to a standard matrix:
[ d ]   [ m1 m2 m3 ] [ a ]
[ e ] = [ m4 m5 m6 ] [ b ]
[ f ]   [ m7 m8 m9 ] [ c ]

ie:
d = m1*a + m2*b + m3*c
e = m4*a + m5*b + m6*c
f = m7*a + m8*b + m9*c

And that's about it.
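The expansion rules are mechanical enough that a toy evaluator is only a few lines. Here is a minimal Python sketch of applying one function-matrix column to a vector, under the conventions above (0-indexed, and all names are mine), before the bigger worked example below:

# a function matrix column is a list of (function, indices) pairs; applying it to a
# vector x calls each function on the named components of x
def apply_fn_matrix(fn_column, x):
    return [f(*[x[i] for i in idx]) for f, idx in fn_column]

# [ d ]   [ bah1[x3]       ] [ a ]
# [ e ] = [ bah2[x2,x1]    ] [ b ]
# [ f ]   [ bah3[x1,x2,x3] ] [ c ]
bah_column = [
    (lambda c: 2*c,             (2,)),      # bah1[x3]
    (lambda b, a: b - a,        (1, 0)),    # bah2[x2,x1]
    (lambda a, b, c: a + b + c, (0, 1, 2)), # bah3[x1,x2,x3]
]

print(apply_fn_matrix(bah_column, [1, 2, 3]))   # [6, 1, 6]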
[ d ]   [ foo1[x1] ] [ 5 6 1 0 2 ] [ fn1[x1]  ] [ m1 m2 m3 ] [ a ]
[ e ] = [ foo1[x2] ] [ 8 8 7 2 1 ] [ fn2[x3]  ] [ m4 m5 m6 ] [ b ]
[ f ]   [ foo2[x1] ]               [ fn3[x2]  ] [ m7 m8 m9 ] [ c ]
[ g ]   [ foo2[x2] ]               [ bah1[x1] ]
                                   [ bah2[x3] ]

It is going to be ugly, but let's show how we expand this down:
[ d ]   [ foo1[x1] ] [ 5 6 1 0 2 ] [ fn1[x1]  ] [ m1*a + m2*b + m3*c ]
[ e ] = [ foo1[x2] ] [ 8 8 7 2 1 ] [ fn2[x3]  ] [ m4*a + m5*b + m6*c ]
[ f ]   [ foo2[x1] ]               [ fn3[x2]  ] [ m7*a + m8*b + m9*c ]
[ g ]   [ foo2[x2] ]               [ bah1[x1] ]
                                   [ bah2[x3] ]

[ d ]   [ foo1[x1] ] [ 5 6 1 0 2 ] [ fn1[m1*a + m2*b + m3*c]  ]
[ e ] = [ foo1[x2] ] [ 8 8 7 2 1 ] [ fn2[m7*a + m8*b + m9*c]  ]
[ f ]   [ foo2[x1] ]               [ fn3[m4*a + m5*b + m6*c]  ]
[ g ]   [ foo2[x2] ]               [ bah1[m1*a + m2*b + m3*c] ]
                                   [ bah2[m7*a + m8*b + m9*c] ]

[ d ]   [ foo1[x1] ] [ 5*fn1[m1*a + m2*b + m3*c] + 6*fn2[m7*a + m8*b + m9*c] + 1*fn3[m4*a + m5*b + m6*c] + 0*bah1[m1*a + m2*b + m3*c] + 2*bah2[m7*a + m8*b + m9*c] ]
[ e ] = [ foo1[x2] ] [ 8*fn1[m1*a + m2*b + m3*c] + 8*fn2[m7*a + m8*b + m9*c] + 7*fn3[m4*a + m5*b + m6*c] + 2*bah1[m1*a + m2*b + m3*c] + 1*bah2[m7*a + m8*b + m9*c] ]
[ f ]   [ foo2[x1] ]
[ g ]   [ foo2[x2] ]

d = foo1[5*fn1[m1*a + m2*b + m3*c] + 6*fn2[m7*a + m8*b + m9*c] + 1*fn3[m4*a + m5*b + m6*c] + 0*bah1[m1*a + m2*b + m3*c] + 2*bah2[m7*a + m8*b + m9*c]]
e = foo1[8*fn1[m1*a + m2*b + m3*c] + 8*fn2[m7*a + m8*b + m9*c] + 7*fn3[m4*a + m5*b + m6*c] + 2*bah1[m1*a + m2*b + m3*c] + 1*bah2[m7*a + m8*b + m9*c]]
f = foo2[5*fn1[m1*a + m2*b + m3*c] + 6*fn2[m7*a + m8*b + m9*c] + 1*fn3[m4*a + m5*b + m6*c] + 0*bah1[m1*a + m2*b + m3*c] + 2*bah2[m7*a + m8*b + m9*c]]
g = foo2[8*fn1[m1*a + m2*b + m3*c] + 8*fn2[m7*a + m8*b + m9*c] + 7*fn3[m4*a + m5*b + m6*c] + 2*bah1[m1*a + m2*b + m3*c] + 1*bah2[m7*a + m8*b + m9*c]]

Heh. The idea of layers and layers of matrices and fn-matrices is actually a tidy little model of how computation in the brain works (kind of the point of the MatSumSig model).
[ y1 ] = [ secure-hash[x1] ] [ a b ] [ x1 ]
[ y2 ]   [ secure-hash[x2] ] [ c d ] [ x2 ]

y1 = secure-hash[a*x1 + b*x2]
y2 = secure-hash[c*x1 + d*x2]
----------------------------------------
|context> => |context: greetings play>

hello |*> #=> merge-labels(|Hello, > + |_self> + |!>)
hey |*> #=> merge-labels(|Hey Ho! > + |_self> + |.>)
wat-up |*> #=> merge-labels(|Wat up my homie! > + |_self> + | right?>)
greetings |*> #=> merge-labels(|Greetings fine Sir. I believe they call you > + |_self> + |.>)
howdy |*> #=> merge-labels(|Howdy partner!>)
good-morning |*> #=> merge-labels(|Good morning > + |_self> + |.>)
gday |*> #=> merge-labels(|G'day > + |_self> + |.>)
random-greet |*> #=> pick-elt ( hello |_self> + hey |_self> + wat-up |_self> + greetings |_self> + howdy |_self> + good-morning |_self> + gday |_self>)
friends-list |*> #=> extract-value list-to-words extract-value friends |_self>

friends |Charlie> => |Jack> + |Emma>
friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Rober> + |Frank> + |Julie>
----------------------------------------
sa: random-greet |Matt>
fn: pick-elt    -- these three lines are some debugging info,
len: 1          -- and show the possible greetings it can choose from:
sp: |Hello, Matt!> + |Hey Ho! Matt.> + |Wat up my homie! Matt right?> + |Greetings fine Sir. I believe they call you Matt.> + |Howdy partner!> + |Good morning Matt.> + |G'day Matt.>
|Good morning Matt.>

sa: random-greet |George>
|Wat up my homie! George right?>

sa: random-greet friends-list |Charlie>
fn: pick-elt
len: 1
sp: |Hello, Jack and Emma!> + |Hey Ho! Jack and Emma.> + |Wat up my homie! Jack and Emma right?> + |Greetings fine Sir. I believe they call you Jack and Emma.> + |Howdy partner!> + |Good morning Jack and Emma.> + |G'day Jack and Emma.>
|Hey Ho! Jack and Emma.>

sa: random-greet friends-list |Sam>
fn: pick-elt
len: 1
sp: |Hello, Charlie, George, Emma, Jack, Rober, Frank and Julie!> + |Hey Ho! Charlie, George, Emma, Jack, Rober, Frank and Julie.> + |Wat up my homie! Charlie, George, Emma, Jack, Rober, Frank and Julie right?> + |Greetings fine Sir. I believe they call you Charlie, George, Emma, Jack, Rober, Frank and Julie.> + |Howdy partner!> + |Good morning Charlie, George, Emma, Jack, Rober, Frank and Julie.> + |G'day Charlie, George, Emma, Jack, Rober, Frank and Julie.>
|G'day Charlie, George, Emma, Jack, Rober, Frank and Julie.>
fib |0> => |0>
fib |1> => |1>
n-1 |*> #=> arithmetic(|_self>,|->,|1>)
n-2 |*> #=> arithmetic(|_self>,|->,|2>)
fib |*> #=> arithmetic( fib n-1 |_self>, |+>, fib n-2 |_self>)
fib-ratio |*> #=> arithmetic( fib |_self> , |/>, fib n-1 |_self> )

becomes:
fib 0 = 0
fib 1 = 1
n-1 * = _self - 1
n-2 * = _self - 2
fib * = fib n-1 _self + fib n-2 _self
fib-ratio * = fib _self / fib n-1 _self

fact |0> => |1>
n-1 |*> #=> arithmetic(|_self>,|->,|1>)
fact |*> #=> arithmetic( |_self>, |*>, fact n-1 |_self>)

becomes:
fact 0 = 1
n-1 * = _self - 1
fact * = _self * fact n-1 _self

ave |*> #=> arithmetic(count-sum "" |_self>,|/>,count "" |_self>)

becomes:
ave * = count-sum "" _self / count "" _self   -- Not sure this works, as it needs to mix labels with numbers to work.

hello |*> #=> merge-labels(|Hello, > + |_self> + |!>)
hey |*> #=> merge-labels(|Hey Ho! > + |_self> + |.>)
wat-up |*> #=> merge-labels(|Wat up my homie! > + |_self> + | right?>)
greetings |*> #=> merge-labels(|Greetings fine Sir. I believe they call you > + |_self> + |.>)
howdy |*> #=> merge-labels(|Howdy partner!>)
good-morning |*> #=> merge-labels(|Good morning > + |_self> + |.>)
gday |*> #=> merge-labels(|G'day > + |_self> + |.>)
random-greet |*> #=> pick-elt ( hello |_self> + hey |_self> + wat-up |_self> + greetings |_self> + howdy |_self> + good-morning |_self> + gday |_self>)
friends-list |*> #=> extract-value list-to-words extract-value friends |_self>

becomes:
hello * = "Hello, ${_self}!" hey * = "Hey Ho! ${_self}." wat-up * = "Wat up my homie! ${_self} right?" greetings * = "Greetings fine Sir. I believe they call you ${_self}." howdy * = "Howdy partner!" good-morning * = "Good morning ${_self}." gday * = "G'day ${_self}." random-greet * = pick-elt [ hello _self, hey _self, wat-up _self, greetings _self, howdy _self, good-morning _self, gday _self] friends-list * = extract-value list-to-words extract-value friends _self
hello |*> #=> merge-labels(|Hello, > + |_self> + |!>)
hey |*> #=> merge-labels(|Hey Ho! > + |_self> + |.>)
wat-up |*> #=> merge-labels(|Wat up my homie! > + |_self> + | right?>)
greetings |*> #=> merge-labels(|Greetings fine Sir. I believe they call you > + |_self> + |.>)
howdy |*> #=> merge-labels(|Howdy partner!>)
good-morning |*> #=> merge-labels(|Good morning > + |_self> + |.>)
gday |*> #=> merge-labels(|G'day > + |_self> + |.>)
random-greet |*> #=> pick-elt ( hello |_self> + hey |_self> + wat-up |_self> + greetings |_self> + howdy |_self> + good-morning |_self> + gday |_self>)
friends-list |*> #=> extract-value list-to-words extract-value friends |_self>

becomes:
hello |*> #=> |Hello, > _ |_self> _ |!>
hey |*> #=> |Hey Ho! > _ |_self> _ |.>
wat-up |*> #=> |Wat up my homie! > _ |_self> _ | right?>
greetings |*> #=> |Greetings fine Sir. I believe they call you > _ |_self> _ |.>
howdy |*> #=> |Howdy partner!>
good-morning |*> #=> |Good morning > _ |_self> _ |.>
gday |*> #=> |G'day > _ |_self> _ |.>
random-greet |*> #=> pick-elt ( hello |_self> + hey |_self> + wat-up |_self> + greetings |_self> + howdy |_self> + good-morning |_self> + gday |_self>)
friends-list |*> #=> extract-value list-to-words extract-value friends |_self>
a*b = \Sum_k abs(a_k . b_k)       -- discrete version, where . is the standard multiplication operator.
a*b = \Int dt abs(a(t) . b(t))    -- continuous version.

Both of which have the property:
0 <= w*[f - g] <= w*f + w*g

where w*[f - g] = 0 if f == g, and w*[f - g] = w*f + w*g if f and g are perfectly disjoint.

Which is just a standard property of metrics. See WP.
2. d(x,y) = 0 if and only if x = y
4. d(x,z) <= d(x,y) + d(y,z)

OK. So we can normalize the range of this simm to [0,1] by simply dividing by w*f + w*g:
0 <= w*[f - g]/(w*f + w*g) <= 1

OK. So this is a good start, but we find for some examples it doesn't work as well as expected (details later!). So instead, we add a term R abs(w*f - w*g) to both the numerator and the denominator:
0 <= (w*[f - g] + R abs(w*f - w*g))/(w*f + w*g + R abs(w*f - w*g)) <= 1

Next, set R = 1, and note that a + b + abs(a - b) = 2*max(a,b), so now we have:
0 <= (w*[f - g] + abs(w*f - w*g))/2.max(w*f,w*g) <= 1

And, in the physics tradition, it has some symmetries:
given a scalar k (not equal to 0), we have:
1) symmetry under w => k.w
2) symmetry under f => k.f and g => k.g
3) symmetry under w => w.exp(i*t)/R(t), f => R(t).exp(-i*t).f, g => R(t).exp(-i*t).g

Now, I don't know a use for this last one, but I put it in because it reminds me of gauge transformations in physics, and it is a good decider on what terms are valid to use in our metric.
Next, we map our metric m to a similarity measure 1 - m (so that identical f,g give 1, and disjoint f,g give 0). ie:

0 <= m <= 1

becomes:
1 >= 1 - m >= 0

ie:
(w*[f - g] + R abs(w*f - w*g))/(w*f + w*g + R abs(w*f - w*g))

becomes:
(w*f + w*g + R abs(w*f - w*g) - w*[f - g] - R abs(w*f - w*g))/(w*f + w*g + R abs(w*f - w*g))
  = (w*f + w*g - w*[f - g])/(w*f + w*g + R abs(w*f - w*g))
  = (w*f + w*g - w*[f - g])/2.max(w*f,w*g)    -- assuming R = 1

So there we have it:
simm(w,f,g) = (w*f + w*g - w*[f - g])/2.max(w*f,w*g)

NB: if w is not given, ie simm(f,g), then assume w_k = 1 for all k, or w(t) = 1 for all t.

Now a couple of observations.
First, using a + b - abs(a - b) = 2*min(a,b) (valid when the coefficients are >= 0), the simm can be rewritten as:

simm(w,f,g) = \Sum_k w_k min(f_k , g_k) / max(w*f,w*g)
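Both forms are easy to check numerically. A quick sketch of my own, over plain lists, verifying that the two expressions agree for non-negative w, f, g:

import random

def dot(w, f):                       # w*f = \Sum_k abs(w_k . f_k)
    return sum(abs(a*b) for a, b in zip(w, f))

def simm_metric(w, f, g):
    wf, wg = dot(w, f), dot(w, g)
    wfg = dot(w, [a-b for a, b in zip(f, g)])
    return (wf + wg - wfg) / (2 * max(wf, wg))

def simm_min(w, f, g):
    return sum(wk * min(fk, gk) for wk, fk, gk in zip(w, f, g)) / max(dot(w, f), dot(w, g))

for _ in range(5):
    w = [random.random() for _ in range(10)]   # w_k >= 0
    f = [random.random() for _ in range(10)]   # f_k, g_k >= 0
    g = [random.random() for _ in range(10)]
    assert abs(simm_metric(w, f, g) - simm_min(w, f, g)) < 1e-9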
And then these motivate the superposition versions of simm:

def weighted_simm(w,A,B):
    A = multiply(w,A)
    B = multiply(w,B)
    return intersection(A.normalize(),B.normalize()).count_sum()

def simm(A,B):
    return intersection(A.normalize(),B.normalize()).count_sum()

where:
A.normalize(), B.normalize() implement the idea of rescaling so that w*f == w*g, and also max(w*f,w*g) = 1
intersection(...) is equivalent to the min(f,g) term.
def unscaled_simm(A,B):
    wf = A.count_sum()
    wg = B.count_sum()
    if wf == 0 and wg == 0:
        return 0
    return intersection(A,B).count_sum()/max(wf,wg)
def silent_simm(A,B):
    # handle single kets, where we don't want rescaling to s1*wf == s2*wg
    if A.count() <= 1 and B.count() <= 1:
        a = A.ket()
        b = B.ket()
        if a.label != b.label:
            return 0
        a = max(a.value,0)        # just making sure they are >= 0.
        b = max(b.value,0)
        if a == 0 and b == 0:     # prevent div by zero.
            return 0
        return min(a,b)/max(a,b)
    # default case:
    return intersection(A.normalize(),B.normalize()).count_sum()

Next, since we have a simm, we can now implement the Landscape function:
L(f,x) = simm(f,g(x))

where f is an input pattern, there are different patterns g(x) at each point x, and in general x is not continuous.

So the Landscape function neatly maps an incoming pattern into a mathematical surface.
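A minimal sketch of the Landscape function, assuming the patterns g(x) are stored in a dict keyed by x (the names and toy data are mine, not project code):

def simm(A, B):
    # normalized overlap of two dict patterns (same sketch as earlier)
    sa, sb = sum(A.values()), sum(B.values())
    if sa == 0 or sb == 0:
        return 0
    return sum(min(A.get(k, 0)/sa, B.get(k, 0)/sb) for k in set(A) | set(B))

def landscape(f, patterns):
    # L(f,x) = simm(f, g(x)), evaluated at every stored point x
    return {x: simm(f, g) for x, g in patterns.items()}

patterns = {"frog":  {"croak": 1, "green": 1, "small": 1},
            "whale": {"big": 1, "swims": 1}}
print(landscape({"croak": 1, "green": 1}, patterns))   # peaks at "frog" (2/3), zero at "whale"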
Define Jp as the set of p'th roots of unity, ie exp(i*2*Pi*k/p), and give individual roots the names j_pk. eg, J3 = {j_31,j_32,j_33}.

The useful fact is that they add up to 0:
j_21 + j_22 = 0    (ie 1 + -1 = 0)
j_31 + j_32 + j_33 = 0
j_41 + j_42 + j_43 + j_44 = 0
and so on.

And just like above, we have:
0 <= w*[j_21 f1 + j_22 f2] <= w*f1 + w*f2
0 <= w*[j_31 f1 + j_32 f2 + j_33 f3] <= w*f1 + w*f2 + w*f3
0 <= w*[j_41 f1 + j_42 f2 + j_43 f3 + j_44 f4] <= w*f1 + w*f2 + w*f3 + w*f4
and so on.

And also like above, we have:
a + b + abs(a - b) = 2*max(a,b)
a1 + a2 + a3 + abs(j_31 a1 + j_32 a2 + j_33 a3) approx-equal 3*max(a1,a2,a3)
a1 + a2 + a3 + a4 + abs(j_41 a1 + j_42 a2 + j_43 a3 + j_44 a4) approx-equal 4*max(a1,a2,a3,a4)
and so on.

The point is, we now have a family of simm's:
simm(w,f,g) = (w*f + w*g - w*[f - g])/2.max(w*f,w*g)
simm(w,f1,f2) = (w*f1 + w*f2 - w*[j_21 f1 + j_22 f2])/2.max(w*f1,w*f2)
simm(w,f1,f2,f3) = (w*f1 + w*f2 + w*f3 - w*[j_31 f1 + j_32 f2 + j_33 f3])/3.max(w*f1,w*f2,w*f3)
simm(w,f1,f2,f3,f4) = (w*f1 + w*f2 + w*f3 + w*f4 - w*[j_41 f1 + j_42 f2 + j_43 f3 + j_44 f4])/4.max(w*f1,w*f2,w*f3,w*f4)
simm(w,f1,f2,f3,f4,f5) = (w*f1 + w*f2 + w*f3 + w*f4 + w*f5 - w*[j_51 f1 + j_52 f2 + j_53 f3 + j_54 f4 + j_55 f5])/5.max(w*f1,w*f2,w*f3,w*f4,w*f5)
and so on.

Which presumably can be shrunk down to this:
simm(w,f,g) = \Sum_k w_k min(f_k , g_k)/max(w*f,w*g)
simm(w,f1,f2) = \Sum_k w_k min(f1_k,f2_k)/max(w*f1,w*f2)
simm(w,f1,f2,f3) = \Sum_k w_k min(f1_k,f2_k,f3_k)/max(w*f1,w*f2,w*f3)
simm(w,f1,f2,f3,f4) = \Sum_k w_k min(f1_k,f2_k,f3_k,f4_k)/max(w*f1,w*f2,w*f3,w*f4)
simm(w,f1,f2,f3,f4,f5) = \Sum_k w_k min(f1_k,f2_k,f3_k,f4_k,f5_k)/max(w*f1,w*f2,w*f3,w*f4,w*f5)
and so on.

OK. That doesn't look too hard to implement in python at least. More work in MatSumSig though. We presumably need quite a few layers of processing.
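As a quick check of the "not too hard in python" claim, here is a sketch of the shrunk-down family over plain lists (my code, and only the min form, not the roots-of-unity form):

def multi_simm(w, *fs):
    # \Sum_k w_k min(f1_k,...,fn_k) / max(w*f1,...,w*fn), assuming w_k, f_k >= 0
    numer = sum(wk * min(col) for wk, col in zip(w, zip(*fs)))
    denom = max(sum(wk * fk for wk, fk in zip(w, f)) for f in fs)
    return numer / denom if denom > 0 else 0

w  = [1, 1, 1, 1]
f1 = [1, 1, 1, 1]
f2 = [1, 1, 1, 0]
f3 = [1, 1, 0.5, 1]
print(multi_simm(w, f1, f2))        # 0.75
print(multi_simm(w, f1, f2, f3))    # 0.625 -- the min is now over all three patterns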
N = floor(1/2 - log_2(frequency-of-this-item/frequency-of-most-common-item))

In python this is:
N = math.floor(0.5 - math.log(current/largest,2))

Then I decided to normalize it so that, like simm, it is in [0,1]: 1 for an exact match, 0 for not in the set.
# e is a ket, X is a superposition
# for best effect X should be a frequency list
def normed_frequency_class(e,X):
    X = X.drop()                      # drop elements with coeff <= 0
    smallest = X.find_min_coeff()     # return the min coeff in X as float
    largest = X.find_max_coeff()      # return the max coeff in X as float
    f = X.find_value(e)               # return the value of ket e in superposition X as float
    if largest <= 0 or f <= 0:        # otherwise the math.log() blows up!
        return 0
    fc_max = math.floor(0.5 - math.log(smallest/largest,2)) + 1   # NB: the + 1 is important, else the smallest element in X gets reported as not in set.
    return 1 - math.floor(0.5 - math.log(f/largest,2))/fc_max

nfc can be considered a type of fuzzy set membership function.
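To see the behaviour of nfc, here is the same formula over a plain dict frequency list (a standalone sketch of mine; the superposition version above is the real one):

import math

def nfc(e, freq):
    # normed frequency class of element e in the frequency list freq (a dict)
    freq = {k: v for k, v in freq.items() if v > 0}   # the drop() step
    if e not in freq:
        return 0
    smallest, largest = min(freq.values()), max(freq.values())
    fc_max = math.floor(0.5 - math.log2(smallest/largest)) + 1
    return 1 - math.floor(0.5 - math.log2(freq[e]/largest))/fc_max

freq = {"the": 1000, "frog": 30, "axolotl": 1}
print(nfc("the", freq))       # 1.0    -- the most common element is a full member
print(nfc("frog", freq))      # 0.545  -- further down the frequency list, weaker membership
print(nfc("axolotl", freq))   # 0.091  -- small but non-zero, thanks to the + 1 in fc_max
print(nfc("emu", freq))       # 0      -- not in the set at all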
def pattern_recognition(self,pattern,op="pattern",t=0):
    op = op.label.split("op: ")[-1] if type(op) == ket else op
    result = superposition()
    for x in self.known_kets:
        if op in self.rules[x]:
            candidate_pattern = self.recall(op,x)            # do we need active=True here? probably. Also a few spots below.
            value = silent_simm(pattern,candidate_pattern)   # eg, what happens if candidate_pattern is not ket/sp?
            if value > t:
                result.data.append(ket(x,value))
    return result.coeff_sort()

def map_to_topic(self,e,op,t=0):
    # do we need the op = op.label.split("op: ...") stuff here?
    result = superposition()
    for x in self.known_kets:
        if op in self.rules[x]:
            frequency_list = self.recall(op,x)               # do we need active=True here?
            value = normed_frequency_class(e,frequency_list)
            if value > t:
                result.data.append(ket(x,value))
    return result.normalize(100).coeff_sort()
    # NB: .normalize(100) is a key component of this function.
    # Half the magic is in nfc(), the other half in normalize(100).
    # eg, say "foo" is a long way down the frequency list for some object,
    # so its nfc() is small. But if it is not in any other frequency list,
    # then we want it a 100% match to that one frequency list.

Then in the ket class (not yet in superposition, that I know of), we have:
# do we need a superposition version of this? Probably...
# implements: similar[op] |x>
def similar(self,context,op):
    f = self.apply_op(context,op)    # use apply_op or context.recall() directly?
    print("f:",f.display())          # in light of the active=True thing, apply_op() seems the right answer.
    # return context.pattern_recognition(f,op)   # yeah, but what about in pat_rec?
    return context.pattern_recognition(f,op).delete_ket(self)
    # we delete self, ie |x>, from the result, since it is always a 100% match anyway.

# implements: find-topic[op] |x>
def find_topic(self,context,op):
    return context.map_to_topic(self,op)

23/5/2014 update: Here are some nice results using find-topic. In this case, guessing if a name is last, male or female, using this data set.
mother_child(trude, sally).

father_child(tom, sally).
father_child(tom, erica).
father_child(mike, tom).

sibling(X, Y) :- parent_child(Z, X), parent_child(Z, Y).

parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).

This results in the following query being evaluated as true:
?- sibling(sally, erica).
Yes

The query:
?- sibling(sally, sally).
also succeeds.

Now, in BKO:
|context> => |context: prolog example>

mother |sally> => |trude>
child |trude> => |sally>

father |sally> => |tom>
child |tom> => |sally>

father |erica> => |tom>
child |tom> +=> |erica>

father |tom> => |mike>
child |mike> => |tom>

parent |*> #=> mother |_self> + father |_self>
sibling |*> #=> child parent |_self>    -- this being the BKO equivalent of: sibling(X, Y) :- parent_child(Z, X), parent_child(Z, Y)
sibling-clean |*> #=> drop (child parent |_self> + -|_self>)

-- now put it to use:
sa: sibling |sally>
|sally> + |erica>
sa: sibling |erica>
|sally> + |erica>
sa: sibling-clean |sally>
|erica>
sa: sibling-clean |erica>
|sally>

-- applying bra's is not yet implemented, but if it was, then:
-- "is erica a sibling of sally?"
-- would map to:
sa: <erica|sibling|sally> == 1
|a: b: c: d: e> would correspond to the list ["a","b","c","d","e"]. And we use merge-labels to pre/append lists:

sa: merge-labels(|x: y: z> + |: > + |a: b: c: d: e>)
|x: y: z: a: b: c: d: e>
sa: merge-labels(|a: b: c: d: e> + |: > + |x: y: z>)
|a: b: c: d: e: x: y: z>

-- and then cf CAR and CDR in LISP:
sa: extract-category |a: b: c: d: e: f>
|a: b: c: d: e>
sa: extract-category extract-category |a: b: c: d: e: f>
|a: b: c: d>
sa: extract-category^4 |a: b: c: d: e: f>
|a: b>
sa: extract-value |a: b: c: d: e: f>
|f>
sa: extract-value extract-category |a: b: c: d: e: f>
|e>
sa: extract-value extract-category^4 |a: b: c: d: e: f>
|b>

Indeed, in python we have:
>>> "a: b: c: d: e".split(": ")
['a', 'b', 'c', 'd', 'e']
sa: range(|C: 0>,|C: 100>,|10>)
|C: 0> + |C: 10> + |C: 20> + |C: 30> + |C: 40> + |C: 50> + |C: 60> + |C: 70> + |C: 80> + |C: 90> + |C: 100>
sa: F range(|C: 0>,|C: 100>,|10>)
|F: 32.00> + |F: 50.00> + |F: 68.00> + |F: 86.00> + |F: 104.00> + |F: 122.00> + |F: 140.00> + |F: 158.00> + |F: 176.00> + |F: 194.00> + |F: 212.00>
sa: range(|F: 0>,|F: 100>,|10>)
|F: 0> + |F: 10> + |F: 20> + |F: 30> + |F: 40> + |F: 50> + |F: 60> + |F: 70> + |F: 80> + |F: 90> + |F: 100>
sa: C range(|F: 0>,|F: 100>,|10>)
|C: -17.78> + |C: -12.22> + |C: -6.67> + |C: -1.11> + |C: 4.44> + |C: 10.00> + |C: 15.56> + |C: 21.11> + |C: 26.67> + |C: 32.22> + |C: 37.78>
pat-rec(x,x) = ...
well-behaved means similar objects return similar superpositions (this is the hard bit to achieve, but hopefully not impossible)
deterministic means if you feed in the same object, you get essentially the same superposition. There is some lee-way, in that it doesn't have to be 100% identical on each run, but close.
distinctive means different object types have easily distinguishable superpositions (again, this is on the hard side)

Interestingly enough, the superposition can be pretty much anything, provided it is "well-behaved", "deterministic" and "distinctive".
sa: load fragment-documents-3.sw
loading sw file: sw-examples/fragment-documents-3.sw
sa: dump |slashdot-1>
supported-ops |slashdot-1> => |op: fragment-lengths> + |op: fragment-hash>
fragment-lengths |slashdot-1> => 1807.000|0> + 500.000|2> + 346.000|5> + 209.000|4> + 165.000|3> + 134.000|7> + 128.000|8> + 86.000|6> + 58.000|21> + 51.000|41> + 45.000|13> + 44.000|17> + 43.000|20> + 43.000|29> + 42.000|16> + 39.000|24> + 39.000|36> + 34.000|11> + 30.000|32> + 27.000|1> + 26.000|55> + 26.000|31> + 25.000|19> + 25.000|9> + 23.000|43> + 23.000|37> + 23.000|33> + 22.000|65> + 22.000|48> + 21.000|10> + 20.000|25> + 19.000|14> + 19.000|93> + 19.000|38> + 18.000|18> + 18.000|39> + 17.000|70> + 17.000|46> + 17.000|23> + 16.000|80> + 16.000|22> + 16.000|131> + 14.000|30> + 13.000|72> + 13.000|74> + 13.000|42> + 13.000|34> + 13.000|27> + 11.000|26> + 10.000|15> + 10.000|73> + 10.000|75> + 10.000|76> + 9.000|59> + 8.000|12> + 8.000|68> + 8.000|47> + 8.000|57> + 8.000|77> + 7.000|54> + 7.000|50> + 7.000|35> + 6.000|69> + 5.000|94> + 5.000|81> + 5.000|88> + 5.000|63> + 5.000|62> + 5.000|61> + 5.000|40> + 5.000|44> + 5.000|87> + 4.000|97> + 4.000|95> + 4.000|45> + 4.000|49> + 4.000|111> + 4.000|113> + 4.000|115> + 4.000|53> + 4.000|116> + 4.000|109> + 4.000|107> + 4.000|139> + 4.000|135> + 3.000|228> + 3.000|146> + 3.000|120> + 3.000|124> + 3.000|71> + 3.000|179> + 3.000|52> + 3.000|51> + 3.000|181> + 3.000|100> + 3.000|28> + 2.000|128> + 2.000|91> + 2.000|154> + 2.000|156> + 2.000|150> + 2.000|119> + 2.000|157> + 2.000|85> + 2.000|84> + 2.000|83> + 2.000|82> + 2.000|141> + 2.000|166> + 2.000|421> + 2.000|168> + 2.000|167> + 2.000|160> + 2.000|60> + 2.000|199> + 2.000|110> + 2.000|112> + 2.000|114> + 2.000|104> + 2.000|402> + 2.000|189> + 2.000|188> + 2.000|103> + 2.000|102> + 2.000|101> + 2.000|415> + 2.000|86> + 2.000|99> + |220> + |121> + |1851> + |126> + |222> + |231> + |232> + |159> + |90> + |155> + |117> + |153> + |118> + |477> + |242> + |149> + |3346> + |143> + |147> + |145> + |144> + |256> + |250> + |450> + |685> + |78> + |79> + |174> + |175> + |172> + |390> + |337> + |332> + |262> + |260> + |695> + |3148> + |165> + |169> + |66> + |64> + |327> + |324> + |3435> + |772> + |198> + |431> + |430> + |191> + |193> + |197> + |1321> + |58> + |56> + |180> + |208> + |1336> + |383> + |206> + |207> + |203> + |105> + |1520> + |379> + |417> + |419> + |148> + |2008> + |495> + |138> + |133> + |130> + |136>
fragment-hash |slashdot-1> => 1808.000|09> + 332.000|aa> + 254.000|79> + 149.000|4d> + 88.000|91> + 66.000|83> + 54.000|d9> + 35.000|bd> + 33.000|14> + 33.000|2d> + 29.000|d4> + 28.000|08> + 28.000|e8> + 27.000|56> + 26.000|a0> + 26.000|ed> + 25.000|97> + 24.000|54> + 23.000|93> + 23.000|a5> + 22.000|75> + 22.000|4f> + 22.000|66> + 22.000|42> + 21.000|7a> + 21.000|4e> + 21.000|b7> + 20.000|ca> + 20.000|ac> + 20.000|de> + 20.000|6f> + 19.000|c8> + 19.000|84> + 19.000|49> + 19.000|fa> + 18.000|13> + 18.000|81> + 18.000|ef> + 17.000|0f> + 17.000|ad> + 17.000|48> + 17.000|31> + 17.000|f2> + 17.000|b6> + 16.000|1b> + 16.000|4c> + 16.000|3c> + 15.000|be> + 14.000|95> + 14.000|53> + 13.000|12> + 13.000|10> + 13.000|62> + 13.000|9e> + 13.000|f3> + 12.000|9a> + 11.000|d0> + 11.000|60> + 11.000|46> + 11.000|59> + 11.000|34> + 11.000|39> + 11.000|3f> + 11.000|26> + 10.000|16> + 10.000|1f> + 10.000|02> + 10.000|da> + 10.000|8c> + 10.000|4a> + 10.000|63> + 10.000|2a> + 10.000|ec> + 9.000|c5> + 9.000|dc> + 9.000|55> + 9.000|50> + 9.000|30> + 9.000|33> + 9.000|ea> + 9.000|e9> + 9.000|e6> + 9.000|e2> + 8.000|0d> + 8.000|d1> + 8.000|d2> + 8.000|9b> + 8.000|92> + 8.000|bc> + 8.000|b1> + 8.000|7e> + 8.000|a9> + 8.000|70> + 8.000|71> + 8.000|73> + 8.000|8e> + 8.000|bb> + 8.000|5d> + 8.000|57> + 8.000|6e> + 8.000|3e> + 8.000|e4> + 8.000|e0> + 7.000|17> + 7.000|1d> + 7.000|1e> + 7.000|85> + 7.000|b4> + 7.000|b0> + 7.000|8b> + 7.000|69> + 7.000|41> + 7.000|77> + 7.000|21> + 7.000|fe> + 7.000|e1> + 6.000|d6> + 6.000|d8> + 6.000|cb> + 6.000|06> + 6.000|01> + 6.000|03> + 6.000|c4> + 6.000|af> + 6.000|99> + 6.000|db> + 6.000|dd> + 6.000|b2> + 6.000|7c> + 6.000|a8> + 6.000|5c> + 6.000|47> + 6.000|2b> + 6.000|a6> + 6.000|32> + 6.000|eb> + 6.000|f8> + 6.000|f4> + 6.000|3d> + 6.000|86> + 6.000|82> + 5.000|0e> + 5.000|0c> + 5.000|18> + 5.000|d3> + 5.000|cc> + 5.000|1c> + 5.000|04> + 5.000|c6> + 5.000|8f> + 5.000|88> + 5.000|7f> + 5.000|78> + 5.000|74> + 5.000|ce> + 5.000|61> + 5.000|64> + 5.000|40> + 5.000|45> + 5.000|44> + 5.000|52> + 5.000|6c> + 5.000|6a> + 5.000|6d> + 5.000|2e> + 5.000|25> + 5.000|29> + 5.000|fc> + 5.000|fd> + 5.000|e5> + 4.000|0a> + 4.000|0b> + 4.000|19> + 4.000|d5> + 4.000|d7> + 4.000|9f> + 4.000|00> + 4.000|c2> + 4.000|f0> + 4.000|7d> + 4.000|90> + 4.000|72> + 4.000|65> + 4.000|5f> + 4.000|5e> + 4.000|5a> + 4.000|9d> + 4.000|58> + 4.000|2f> + 4.000|a7> + 4.000|36> + 4.000|37> + 4.000|3a> + 4.000|23> + 4.000|ff> + 3.000|80> + 3.000|15> + 3.000|cf> + 3.000|1a> + 3.000|05> + 3.000|c9> + 3.000|87> + 3.000|89> + 3.000|b8> + 3.000|ab> + 3.000|7b> + 3.000|ae> + 3.000|c1> + 3.000|a4> + 3.000|8d> + 3.000|8a> + 3.000|c3> + 3.000|a1> + 3.000|cd> + 3.000|68> + 3.000|43> + 3.000|a2> + 3.000|6b> + 3.000|38> + 3.000|f1> + 3.000|f7> + 3.000|3b> + 3.000|b3> + 3.000|22> + 3.000|28> + 3.000|e7> + 2.000|9c> + 2.000|ba> + 2.000|07> + 2.000|c0> + 2.000|98> + 2.000|df> + 2.000|b9> + 2.000|67> + 2.000|76> + 2.000|35> + 2.000|bf> + 2.000|f9> + 2.000|f6> + 2.000|27> + 2.000|20> + 2.000|fb> + |11> + |c7> + |4b> + |b5> + |24> + |51> + |a3> + |ee> + |f5> + |e3>

Generated using this python:
file_table = {
    "eztv-1"        : "web-pages/eztv-1.html",
    "eztv-2"        : "web-pages/eztv-2.html",
    "diary-1"       : "web-pages/k5-diary-1.html",
    "diary-2"       : "web-pages/k5-diary-2.html",
    "wc-comments-1" : "web-pages/wc-comments-1.html",
    "wc-comments-2" : "web-pages/wc-comments-2.html",
    "slashdot-1"    : "web-pages/slashdot-1.html",
    "slashdot-2"    : "web-pages/slashdot-2.html",
    "semantic-1"    : "web-pages/semantic-db-1.html",
}

def fragment_string(s,fragments):
    r = [s]
    for frag in fragments:
        list = r
        r = []
        for s in list:
            r += s.split(frag)
    return r

fragments = ["<","|",">"]

def dict_to_sp(dict):
    result = superposition()
    for x in dict:
        result.data.append(ket(x,dict[x]))
    return result

def dict_load_fragment_lengths(filename,fragments):
    dict = {}
    with open(filename,'r') as f:
        text = f.read()
        for sequence in fragment_string(text,fragments):
            length = str(len(sequence.strip()))
            if length not in dict:
                dict[length] = 1
            else:
                dict[length] += 1
    return dict_to_sp(dict)

import hashlib

# in testing so far, this thing works great! Much more discriminating power (by 10 points roughly) than frag_lengths,
# where "discriminating power" is the difference in coeff between the largest coeff in the sp, and the second largest.
# discrimination() is currently both in the functions and the superposition class.
def dict_load_fragment_hash(filename,fragments):
    dict = {}
    with open(filename,'r') as f:
        text = f.read()
        for sequence in fragment_string(text,fragments):
            hash = hashlib.sha1(sequence.strip().encode('utf-8')).hexdigest()[-2:]
            if hash not in dict:
                dict[hash] = 1
            else:
                dict[hash] += 1
    return dict_to_sp(dict)

for topic in file_table:
    file = file_table[topic]
    print("topic: " + topic)
    print("file: " + file)
    x = topic
    C.learn("fragment-lengths",x,dict_load_fragment_lengths(file,fragments).coeff_sort())
    C.learn("fragment-hash",x,dict_load_fragment_hash(file,fragments).coeff_sort())

# insert these rules into context:
# simm |*> #=> 100 similar[fragment-lengths] |_self>
# hs |*> #=> 100 similar[fragment-hash] |_self>
C.learn("simm","*",stored_rule("100 similar[fragment-lengths] |_self>"))
C.learn("hs","*",stored_rule("100 similar[fragment-hash] |_self>"))

name = "web-pages/fragment-documents-4.sw"
save_sw(C,name)

Let's generate some results. Get the console to do it at load time, by adding this to the end of the fragment-documents-3.sw file:
|simm-result-1> => simm |eztv-1>
|simm-result-2> => simm |eztv-2>
|simm-result-3> => simm |slashdot-1>
|simm-result-4> => simm |slashdot-2>
|simm-result-5> => simm |diary-1>
|simm-result-6> => simm |diary-2>
|simm-result-7> => simm |wc-comments-1>
|simm-result-8> => simm |wc-comments-2>
|simm-result-9> => simm |semantic-1>

|hs-result-1> => hs |eztv-1>
|hs-result-2> => hs |eztv-2>
|hs-result-3> => hs |slashdot-1>
|hs-result-4> => hs |slashdot-2>
|hs-result-5> => hs |diary-1>
|hs-result-6> => hs |diary-2>
|hs-result-7> => hs |wc-comments-1>
|hs-result-8> => hs |wc-comments-2>
|hs-result-9> => hs |semantic-1>

Done. Now we have (after tidying up the dump in the console):
|simm-result-1> => 97.025|eztv-2> + 65.993|slashdot-2> + 65.819|slashdot-1> + 63.885|diary-2> + 63.594|diary-1> + 58.907|wc-comments-1> + 58.907|wc-comments-2> + 41.707|semantic-1>
|simm-result-2> => 97.025|eztv-1> + 67.337|slashdot-2> + 67.180|slashdot-1> + 65.823|diary-2> + 65.564|diary-1> + 59.827|wc-comments-2> + 59.827|wc-comments-1> + 43.032|semantic-1>
|simm-result-3> => 98.915|slashdot-2> + 76.447|diary-2> + 76.194|diary-1> + 68.010|wc-comments-1> + 67.962|wc-comments-2> + 67.180|eztv-2> + 65.819|eztv-1> + 56.437|semantic-1>
|simm-result-4> => 98.915|slashdot-1> + 76.371|diary-2> + 76.107|diary-1> + 67.939|wc-comments-1> + 67.892|wc-comments-2> + 67.337|eztv-2> + 65.993|eztv-1> + 56.353|semantic-1>
|simm-result-5> => 98.525|diary-2> + 76.194|slashdot-1> + 76.107|slashdot-2> + 75.691|wc-comments-1> + 75.653|wc-comments-2> + 65.564|eztv-2> + 63.594|eztv-1> + 55.682|semantic-1>
|simm-result-6> => 98.525|diary-1> + 76.447|slashdot-1> + 76.371|slashdot-2> + 76.215|wc-comments-1> + 76.201|wc-comments-2> + 65.823|eztv-2> + 63.885|eztv-1> + 56.463|semantic-1>
|simm-result-7> => 99.811|wc-comments-2> + 76.215|diary-2> + 75.691|diary-1> + 68.010|slashdot-1> + 67.939|slashdot-2> + 65.518|semantic-1> + 59.827|eztv-2> + 58.907|eztv-1>
|simm-result-8> => 99.811|wc-comments-1> + 76.201|diary-2> + 75.653|diary-1> + 67.962|slashdot-1> + 67.892|slashdot-2> + 65.492|semantic-1> + 59.827|eztv-2> + 58.907|eztv-1>
|simm-result-9> => 65.518|wc-comments-1> + 65.492|wc-comments-2> + 56.463|diary-2> + 56.437|slashdot-1> + 56.353|slashdot-2> + 55.682|diary-1> + 43.032|eztv-2> + 41.707|eztv-1>

|hs-result-1> => 96.824|eztv-2> + 69.881|slashdot-1> + 69.730|slashdot-2> + 65.903|diary-2> + 65.323|diary-1> + 60.620|wc-comments-1> + 60.603|wc-comments-2> + 46.397|semantic-1>
|hs-result-2> => 96.824|eztv-1> + 69.748|slashdot-1> + 69.612|slashdot-2> + 66.569|diary-2> + 65.970|diary-1> + 61.013|wc-comments-1> + 60.982|wc-comments-2> + 45.875|semantic-1>
|hs-result-3> => 98.205|slashdot-2> + 69.881|eztv-1> + 69.748|eztv-2> + 67.442|diary-2> + 66.801|diary-1> + 58.669|wc-comments-2> + 58.612|wc-comments-1> + 44.965|semantic-1>
|hs-result-4> => 98.205|slashdot-1> + 69.730|eztv-1> + 69.612|eztv-2> + 67.063|diary-2> + 66.447|diary-1> + 58.519|wc-comments-2> + 58.462|wc-comments-1> + 44.890|semantic-1>
|hs-result-5> => 97.902|diary-2> + 68.407|wc-comments-2> + 68.298|wc-comments-1> + 66.801|slashdot-1> + 66.447|slashdot-2> + 65.970|eztv-2> + 65.323|eztv-1> + 41.417|semantic-1>
|hs-result-6> => 97.902|diary-1> + 68.762|wc-comments-2> + 68.625|wc-comments-1> + 67.442|slashdot-1> + 67.063|slashdot-2> + 66.569|eztv-2> + 65.903|eztv-1> + 42.049|semantic-1>
|hs-result-7> => 99.669|wc-comments-2> + 68.625|diary-2> + 68.298|diary-1> + 61.013|eztv-2> + 60.620|eztv-1> + 58.612|slashdot-1> + 58.462|slashdot-2> + 43.165|semantic-1>
|hs-result-8> => 99.669|wc-comments-1> + 68.762|diary-2> + 68.407|diary-1> + 60.982|eztv-2> + 60.603|eztv-1> + 58.669|slashdot-1> + 58.519|slashdot-2> + 43.180|semantic-1>
|hs-result-9> => 46.397|eztv-1> + 45.875|eztv-2> + 44.965|slashdot-1> + 44.890|slashdot-2> + 43.180|wc-comments-2> + 43.165|wc-comments-1> + 42.049|diary-2> + 41.417|diary-1>
def list_load_fragment_hash(filename,fragments):
    array = [0] * 4096     # NB: 256 changes to 4096
    with open(filename,'r') as f:
        text = f.read()
        for sequence in fragment_string(text,fragments):
            hash = hashlib.sha1(sequence.encode('utf-8')).hexdigest()[-3:]   # NB: [-2:] changed to [-3:]
            x = int(hash,16)
            array[x] += 1
    return array

Here are the results:
drop-below[0.8] simm(""|f>, ""|g>)                  -- more specific
vs:
simm(drop-below[2] "" |f>, drop-below[2] "" |g>)    -- less specific, more general, which will be useful.
hs |*> #=> 100 similar[fragment-hash-big] |_self>

-- some post processing:
|hs-result-1> => hs |big-eztv-1>
|hs-result-2> => hs |big-eztv-2>
|hs-result-3> => hs |big-slashdot-1>
|hs-result-4> => hs |big-slashdot-2>
|hs-result-5> => hs |big-slashdot-3>
|hs-result-6> => hs |big-diary-1>
|hs-result-7> => hs |big-diary-2>
|hs-result-8> => hs |big-wc-comments-1>
|hs-result-9> => hs |big-wc-comments-2>
|hs-result-10> => hs |big-semantic-1>
|hs-result-11> => hs |big-semantic-2>

Giving these results:
|hs-result-1> => 94.222|big-eztv-2> + 28.930|big-semantic-2> + 27.309|big-semantic-1> + 25.386|big-slashdot-3> + 25.268|big-slashdot-1> + 24.881|big-slashdot-2> + 20.443|big-diary-2> + 19.792|big-diary-1> + 18.030|big-wc-comments-1> + 18.025|big-wc-comments-2>
|hs-result-2> => 94.222|big-eztv-1> + 28.527|big-semantic-2> + 26.552|big-semantic-1> + 25.029|big-slashdot-3> + 24.902|big-slashdot-1> + 24.620|big-slashdot-2> + 21.521|big-diary-2> + 20.880|big-diary-1> + 18.256|big-wc-comments-2> + 18.240|big-wc-comments-1>
|hs-result-3> => 96.367|big-slashdot-2> + 79.561|big-slashdot-3> + 25.268|big-eztv-1> + 24.902|big-eztv-2> + 20.522|big-semantic-2> + 20.063|big-semantic-1> + 16.364|big-diary-2> + 16.054|big-diary-1> + 12.592|big-wc-comments-1> + 12.558|big-wc-comments-2>
|hs-result-4> => 96.367|big-slashdot-1> + 79.506|big-slashdot-3> + 24.881|big-eztv-1> + 24.620|big-eztv-2> + 20.252|big-semantic-2> + 19.877|big-semantic-1> + 16.310|big-diary-2> + 16.000|big-diary-1> + 12.660|big-wc-comments-1> + 12.593|big-wc-comments-2>
|hs-result-5> => 79.561|big-slashdot-1> + 79.506|big-slashdot-2> + 25.386|big-eztv-1> + 25.029|big-eztv-2> + 21.065|big-semantic-2> + 20.537|big-semantic-1> + 16.763|big-diary-2> + 16.606|big-diary-1> + 13.020|big-wc-comments-1> + 12.987|big-wc-comments-2>
|hs-result-6> => 96.013|big-diary-2> + 40.747|big-wc-comments-1> + 40.747|big-wc-comments-2> + 20.880|big-eztv-2> + 19.792|big-eztv-1> + 18.007|big-semantic-1> + 17.932|big-semantic-2> + 16.606|big-slashdot-3> + 16.054|big-slashdot-1> + 16.000|big-slashdot-2>
|hs-result-7> => 96.013|big-diary-1> + 40.610|big-wc-comments-1> + 40.610|big-wc-comments-2> + 21.521|big-eztv-2> + 20.443|big-eztv-1> + 18.154|big-semantic-1> + 18.120|big-semantic-2> + 16.763|big-slashdot-3> + 16.364|big-slashdot-1> + 16.310|big-slashdot-2>
|hs-result-8> => 99.533|big-wc-comments-2> + 40.747|big-diary-1> + 40.610|big-diary-2> + 18.240|big-eztv-2> + 18.030|big-eztv-1> + 13.799|big-semantic-2> + 13.771|big-semantic-1> + 13.020|big-slashdot-3> + 12.660|big-slashdot-2> + 12.592|big-slashdot-1>
|hs-result-9> => 99.533|big-wc-comments-1> + 40.747|big-diary-1> + 40.610|big-diary-2> + 18.256|big-eztv-2> + 18.025|big-eztv-1> + 13.849|big-semantic-2> + 13.791|big-semantic-1> + 12.987|big-slashdot-3> + 12.593|big-slashdot-2> + 12.558|big-slashdot-1>
|hs-result-10> => 88.817|big-semantic-2> + 27.309|big-eztv-1> + 26.552|big-eztv-2> + 20.537|big-slashdot-3> + 20.063|big-slashdot-1> + 19.877|big-slashdot-2> + 18.154|big-diary-2> + 18.007|big-diary-1> + 13.791|big-wc-comments-2> + 13.771|big-wc-comments-1>
|hs-result-11> => 88.817|big-semantic-1> + 28.930|big-eztv-1> + 28.527|big-eztv-2> + 21.065|big-slashdot-3> + 20.522|big-slashdot-1> + 20.252|big-slashdot-2> + 18.120|big-diary-2> + 17.932|big-diary-1> + 13.849|big-wc-comments-2> + 13.799|big-wc-comments-1>

Wow. So swapping from 256 to 4096 buckets has really increased the discrimination of this!
|hs-result-1> => 96.824|eztv-2> + 69.881|slashdot-1> + 69.730|slashdot-2> + ...
vs:
|hs-result-1> => 94.222|big-eztv-2> + 28.930|big-semantic-2> + 27.309|big-semantic-1> + 25.386|big-slashdot-3> + ...
drop-2-hash-op |*> #=> drop-below[2] fragment-hash-big |_self>
drop-3-hash-op |*> #=> drop-below[3] fragment-hash-big |_self>
drop-4-hash-op |*> #=> drop-below[4] fragment-hash-big |_self>
drop-5-hash-op |*> #=> drop-below[5] fragment-hash-big |_self>
drop-6-hash-op |*> #=> drop-below[6] fragment-hash-big |_self>

drop-2-hash |big-eztv-1> => drop-2-hash-op |_self>
drop-2-hash |big-eztv-2> => drop-2-hash-op |_self>
...
drop-3-hash |big-eztv-1> => drop-3-hash-op |_self>
...
count-1 |big-eztv-1> => count fragment-hash-big |_self>
...
count-2 |big-eztv-1> => count drop-2-hash |_self>
...
count-3 |big-eztv-1> => count drop-3-hash |_self>
...

And then we have this:
count-1 |big-eztv-1> => |number: 2113>    -- number of distinct kets when using 4096 buckets
count-2 |big-eztv-1> => |number: 700>     -- number of distinct kets with coeff 2 and above
count-3 |big-eztv-1> => |number: 202>     -- number of distinct kets with coeff 3 and above
count-4 |big-eztv-1> => |number: 73>      -- number of distinct kets with coeff 4 and above
count-5 |big-eztv-1> => |number: 34>      -- number of distinct kets with coeff 5 and above
count-6 |big-eztv-1> => |number: 28>      -- number of distinct kets with coeff 6 and above

count-1 |big-eztv-2> => |number: 2120>
count-2 |big-eztv-2> => |number: 740>
count-3 |big-eztv-2> => |number: 208>
count-4 |big-eztv-2> => |number: 81>
count-5 |big-eztv-2> => |number: 35>
count-6 |big-eztv-2> => |number: 31>

count-1 |big-slashdot-1> => |number: 1023>
count-2 |big-slashdot-1> => |number: 268>
count-3 |big-slashdot-1> => |number: 114>
count-4 |big-slashdot-1> => |number: 82>
count-5 |big-slashdot-1> => |number: 73>
count-6 |big-slashdot-1> => |number: 66>

count-1 |big-slashdot-2> => |number: 1020>
count-2 |big-slashdot-2> => |number: 267>
count-3 |big-slashdot-2> => |number: 113>
count-4 |big-slashdot-2> => |number: 83>
count-5 |big-slashdot-2> => |number: 74>
count-6 |big-slashdot-2> => |number: 68>

count-1 |big-slashdot-3> => |number: 1044>
count-2 |big-slashdot-3> => |number: 261>
count-3 |big-slashdot-3> => |number: 129>
count-4 |big-slashdot-3> => |number: 95>
count-5 |big-slashdot-3> => |number: 83>
count-6 |big-slashdot-3> => |number: 73>

count-1 |big-diary-1> => |number: 596>
count-2 |big-diary-1> => |number: 171>
count-3 |big-diary-1> => |number: 98>
count-4 |big-diary-1> => |number: 78>
count-5 |big-diary-1> => |number: 66>
count-6 |big-diary-1> => |number: 62>

count-1 |big-diary-2> => |number: 619>
count-2 |big-diary-2> => |number: 174>
count-3 |big-diary-2> => |number: 97>
count-4 |big-diary-2> => |number: 78>
count-5 |big-diary-2> => |number: 66>
count-6 |big-diary-2> => |number: 62>

count-1 |big-wc-comments-1> => |number: 493>
count-2 |big-wc-comments-1> => |number: 124>
count-3 |big-wc-comments-1> => |number: 58>
count-4 |big-wc-comments-1> => |number: 45>
count-5 |big-wc-comments-1> => |number: 39>
count-6 |big-wc-comments-1> => |number: 37>

count-1 |big-wc-comments-2> => |number: 494>
count-2 |big-wc-comments-2> => |number: 123>
count-3 |big-wc-comments-2> => |number: 58>
count-4 |big-wc-comments-2> => |number: 45>
count-5 |big-wc-comments-2> => |number: 39>
count-6 |big-wc-comments-2> => |number: 37>

count-1 |big-semantic-1> => |number: 1926>
count-2 |big-semantic-1> => |number: 730>
count-3 |big-semantic-1> => |number: 327>
count-4 |big-semantic-1> => |number: 193>
count-5 |big-semantic-1> => |number: 132>
count-6 |big-semantic-1> => |number: 91>

count-1 |big-semantic-2> => |number: 2266>
count-2 |big-semantic-2> => |number: 970>
count-3 |big-semantic-2> => |number: 412>
count-4 |big-semantic-2> => |number: 230>
count-5 |big-semantic-2> => |number: 150>
count-6 |big-semantic-2> => |number: 100>
drop-6-simm |*> #=> 100 similar[drop-6-hash] |_self>

drop-6-simm |big-eztv-1> => drop-6-simm |_self>
drop-6-simm |big-eztv-2> => drop-6-simm |_self>
drop-6-simm |big-slashdot-1> => drop-6-simm |_self>
drop-6-simm |big-slashdot-2> => drop-6-simm |_self>
drop-6-simm |big-slashdot-3> => drop-6-simm |_self>
drop-6-simm |big-diary-1> => drop-6-simm |_self>
drop-6-simm |big-diary-2> => drop-6-simm |_self>
drop-6-simm |big-wc-comments-1> => drop-6-simm |_self>
drop-6-simm |big-wc-comments-2> => drop-6-simm |_self>
drop-6-simm |big-semantic-1> => drop-6-simm |_self>
drop-6-simm |big-semantic-2> => drop-6-simm |_self>
$ grep "^drop" sw-examples/fragment-documents-big-hash-more-post-processing--saved-2.sw | grep "simm" | less drop-2-simm |big-slashdot-2> => 96.888|big-slashdot-1> + 86.420|big-slashdot-3> + 19.525|big-eztv-1> + 18.822|big-eztv-2> + 13.456|big-diary-2> + 13.104|big-diary-1> + 12.465|big-semantic-1> + 11.684|big-semantic-2> + 10.829|big-wc-comments-2> + 10.815|big-wc-comments-1> drop-3-simm |big-slashdot-2> => 99.005|big-slashdot-1> + 91.635|big-slashdot-3> + 21.095|big-eztv-1> + 21.020|big-eztv-2> + 13.811|big-diary-2> + 13.488|big-diary-1> + 12.379|big-semantic-1> + 11.458|big-semantic-2> + 10.735|big-wc-comments-1> + 10.735|big-wc-comments-2> drop-4-simm |big-slashdot-2> => 99.177|big-slashdot-1> + 94.003|big-slashdot-3> + 21.831|big-eztv-1> + 21.619|big-eztv-2> + 14.148|big-diary-2> + 13.835|big-diary-1> + 13.529|big-semantic-1> + 12.731|big-semantic-2> + 11.193|big-wc-comments-1> + 11.193|big-wc-comments-2> drop-5-simm |big-slashdot-2> => 99.608|big-slashdot-1> + 95.326|big-slashdot-3> + 21.891|big-eztv-1> + 21.720|big-eztv-2> + 14.257|big-semantic-1> + 14.047|big-diary-2> + 13.722|big-diary-1> + 13.650|big-semantic-2> + 11.466|big-wc-comments-1> + 11.466|big-wc-comments-2> drop-6-simm |big-slashdot-2> => 99.315|big-slashdot-1> + 95.522|big-slashdot-3> + 22.063|big-eztv-1> + 21.905|big-eztv-2> + 15.225|big-semantic-1> + 14.513|big-semantic-2> + 14.234|big-diary-2> + 13.907|big-diary-1> + 11.607|big-wc-comments-1> + 11.607|big-wc-comments-2> drop-2-simm |big-wc-comments-1> => 99.737|big-wc-comments-2> + 41.842|big-diary-1> + 41.707|big-diary-2> + 17.071|big-eztv-2> + 16.528|big-eztv-1> + 11.578|big-slashdot-3> + 10.815|big-slashdot-2> + 10.815|big-slashdot-1> + 10.449|big-semantic-1> + 9.549|big-semantic-2> drop-3-simm |big-wc-comments-1> => 99.900|big-wc-comments-2> + 41.927|big-diary-1> + 41.879|big-diary-2> + 17.916|big-eztv-2> + 17.167|big-eztv-1> + 11.251|big-slashdot-3> + 10.735|big-slashdot-2> + 10.735|big-slashdot-1> + 10.247|big-semantic-2> + 10.217|big-semantic-1> drop-4-simm |big-wc-comments-1> => 99.896|big-wc-comments-2> + 41.229|big-diary-1> + 41.155|big-diary-2> + 18.062|big-eztv-2> + 17.341|big-eztv-1> + 11.431|big-slashdot-3> + 11.201|big-slashdot-1> + 11.193|big-slashdot-2> + 10.387|big-semantic-1> + 10.327|big-semantic-2> drop-5-simm |big-wc-comments-1> => 99.893|big-wc-comments-2> + 41.620|big-diary-1> + 41.548|big-diary-2> + 18.639|big-eztv-2> + 17.712|big-eztv-1> + 11.726|big-slashdot-3> + 11.475|big-slashdot-1> + 11.466|big-slashdot-2> + 10.445|big-semantic-2> + 10.281|big-semantic-1> drop-6-simm |big-wc-comments-1> => 99.892|big-wc-comments-2> + 42.094|big-diary-1> + 42.017|big-diary-2> + 18.828|big-eztv-2> + 17.905|big-eztv-1> + 11.623|big-slashdot-1> + 11.622|big-slashdot-3> + 11.607|big-slashdot-2> + 10.709|big-semantic-2> + 10.509|big-semantic-1> drop-2-simm |big-slashdot-3> => 86.420|big-slashdot-2> + 86.332|big-slashdot-1> + 19.757|big-eztv-1> + 19.140|big-eztv-2> + 14.599|big-diary-2> + 14.146|big-diary-1> + 13.144|big-semantic-1> + 12.428|big-semantic-2> + 11.592|big-wc-comments-2> + 11.578|big-wc-comments-1> drop-3-simm |big-slashdot-3> => 91.635|big-slashdot-2> + 91.483|big-slashdot-1> + 21.401|big-eztv-1> + 21.387|big-eztv-2> + 14.822|big-diary-2> + 14.275|big-diary-1> + 12.588|big-semantic-1> + 11.946|big-semantic-2> + 11.251|big-wc-comments-1> + 11.251|big-wc-comments-2> drop-4-simm |big-slashdot-3> => 94.027|big-slashdot-1> + 94.003|big-slashdot-2> + 22.434|big-eztv-1> + 22.263|big-eztv-2> + 15.225|big-diary-2> + 14.681|big-diary-1> + 
13.727|big-semantic-1> + 13.053|big-semantic-2> + 11.431|big-wc-comments-1> + 11.431|big-wc-comments-2> drop-5-simm |big-slashdot-3> => 95.350|big-slashdot-1> + 95.326|big-slashdot-2> + 22.655|big-eztv-1> + 22.500|big-eztv-2> + 14.936|big-diary-2> + 14.611|big-diary-1> + 14.315|big-semantic-1> + 13.708|big-semantic-2> + 11.726|big-wc-comments-1> + 11.726|big-wc-comments-2> drop-6-simm |big-slashdot-3> => 95.628|big-slashdot-1> + 95.522|big-slashdot-2> + 23.057|big-eztv-1> + 22.928|big-eztv-2> + 15.306|big-semantic-1> + 15.187|big-diary-2> + 14.861|big-diary-1> + 14.594|big-semantic-2> + 11.622|big-wc-comments-1> + 11.622|big-wc-comments-2> drop-2-simm |big-wc-comments-2> => 99.737|big-wc-comments-1> + 41.883|big-diary-1> + 41.748|big-diary-2> + 17.051|big-eztv-2> + 16.504|big-eztv-1> + 11.592|big-slashdot-3> + 10.829|big-slashdot-2> + 10.829|big-slashdot-1> + 10.463|big-semantic-1> + 9.551|big-semantic-2> drop-3-simm |big-wc-comments-2> => 99.900|big-wc-comments-1> + 41.927|big-diary-1> + 41.879|big-diary-2> + 17.916|big-eztv-2> + 17.167|big-eztv-1> + 11.251|big-slashdot-3> + 10.735|big-slashdot-2> + 10.735|big-slashdot-1> + 10.247|big-semantic-2> + 10.217|big-semantic-1> drop-4-simm |big-wc-comments-2> => 99.896|big-wc-comments-1> + 41.229|big-diary-1> + 41.155|big-diary-2> + 18.062|big-eztv-2> + 17.341|big-eztv-1> + 11.431|big-slashdot-3> + 11.201|big-slashdot-1> + 11.193|big-slashdot-2> + 10.387|big-semantic-1> + 10.327|big-semantic-2> drop-5-simm |big-wc-comments-2> => 99.893|big-wc-comments-1> + 41.620|big-diary-1> + 41.548|big-diary-2> + 18.639|big-eztv-2> + 17.712|big-eztv-1> + 11.726|big-slashdot-3> + 11.475|big-slashdot-1> + 11.466|big-slashdot-2> + 10.445|big-semantic-2> + 10.281|big-semantic-1> drop-6-simm |big-wc-comments-2> => 99.892|big-wc-comments-1> + 42.094|big-diary-1> + 42.017|big-diary-2> + 18.828|big-eztv-2> + 17.905|big-eztv-1> + 11.623|big-slashdot-1> + 11.622|big-slashdot-3> + 11.607|big-slashdot-2> + 10.709|big-semantic-2> + 10.509|big-semantic-1> drop-2-simm |big-eztv-2> => 91.919|big-eztv-1> + 19.140|big-slashdot-3> + 18.954|big-slashdot-1> + 18.822|big-slashdot-2> + 18.502|big-diary-2> + 18.168|big-diary-1> + 17.071|big-wc-comments-1> + 17.051|big-wc-comments-2> + 14.576|big-semantic-2> + 14.373|big-semantic-1> drop-3-simm |big-eztv-2> => 92.067|big-eztv-1> + 21.387|big-slashdot-3> + 21.020|big-slashdot-2> + 20.968|big-slashdot-1> + 18.885|big-diary-2> + 18.682|big-diary-1> + 17.916|big-wc-comments-1> + 17.916|big-wc-comments-2> + 10.938|big-semantic-1> + 10.033|big-semantic-2> drop-4-simm |big-eztv-2> => 91.371|big-eztv-1> + 22.263|big-slashdot-3> + 21.630|big-slashdot-1> + 21.619|big-slashdot-2> + 19.604|big-diary-2> + 19.334|big-diary-1> + 18.062|big-wc-comments-1> + 18.062|big-wc-comments-2> + 11.566|big-semantic-1> + 10.769|big-semantic-2> drop-5-simm |big-eztv-2> => 91.874|big-eztv-1> + 22.500|big-slashdot-3> + 21.732|big-slashdot-1> + 21.720|big-slashdot-2> + 20.235|big-diary-2> + 19.963|big-diary-1> + 18.639|big-wc-comments-1> + 18.639|big-wc-comments-2> + 12.323|big-semantic-1> + 11.531|big-semantic-2> drop-6-simm |big-eztv-2> => 91.821|big-eztv-1> + 22.928|big-slashdot-3> + 21.977|big-slashdot-1> + 21.905|big-slashdot-2> + 20.446|big-diary-2> + 20.173|big-diary-1> + 18.828|big-wc-comments-1> + 18.828|big-wc-comments-2> + 13.466|big-semantic-1> + 12.758|big-semantic-2> drop-2-simm |big-slashdot-1> => 96.888|big-slashdot-2> + 86.332|big-slashdot-3> + 19.664|big-eztv-1> + 18.954|big-eztv-2> + 13.366|big-diary-2> + 13.014|big-diary-1> + 
12.598|big-semantic-1> + 11.857|big-semantic-2> + 10.829|big-wc-comments-2> + 10.815|big-wc-comments-1> drop-3-simm |big-slashdot-1> => 99.005|big-slashdot-2> + 91.483|big-slashdot-3> + 21.043|big-eztv-1> + 20.968|big-eztv-2> + 13.811|big-diary-2> + 13.488|big-diary-1> + 12.611|big-semantic-1> + 11.661|big-semantic-2> + 10.735|big-wc-comments-1> + 10.735|big-wc-comments-2> drop-4-simm |big-slashdot-1> => 99.177|big-slashdot-2> + 94.027|big-slashdot-3> + 21.841|big-eztv-1> + 21.630|big-eztv-2> + 14.160|big-diary-2> + 13.847|big-diary-1> + 13.535|big-semantic-1> + 12.738|big-semantic-2> + 11.201|big-wc-comments-1> + 11.201|big-wc-comments-2> drop-5-simm |big-slashdot-1> => 99.608|big-slashdot-2> + 95.350|big-slashdot-3> + 21.903|big-eztv-1> + 21.732|big-eztv-2> + 14.264|big-semantic-1> + 14.059|big-diary-2> + 13.734|big-diary-1> + 13.656|big-semantic-2> + 11.475|big-wc-comments-1> + 11.475|big-wc-comments-2> drop-6-simm |big-slashdot-1> => 99.315|big-slashdot-2> + 95.628|big-slashdot-3> + 22.136|big-eztv-1> + 21.977|big-eztv-2> + 15.238|big-semantic-1> + 14.525|big-semantic-2> + 14.257|big-diary-2> + 13.930|big-diary-1> + 11.623|big-wc-comments-1> + 11.623|big-wc-comments-2> drop-2-simm |big-eztv-1> => 91.919|big-eztv-2> + 19.757|big-slashdot-3> + 19.664|big-slashdot-1> + 19.525|big-slashdot-2> + 16.839|big-diary-2> + 16.543|big-diary-1> + 16.528|big-wc-comments-1> + 16.504|big-wc-comments-2> + 14.555|big-semantic-1> + 14.422|big-semantic-2> drop-3-simm |big-eztv-1> => 92.067|big-eztv-2> + 21.401|big-slashdot-3> + 21.095|big-slashdot-2> + 21.043|big-slashdot-1> + 17.167|big-wc-comments-1> + 17.167|big-wc-comments-2> + 16.626|big-diary-2> + 16.527|big-diary-1> + 10.899|big-semantic-1> + 9.977|big-semantic-2> drop-4-simm |big-eztv-1> => 91.371|big-eztv-2> + 22.434|big-slashdot-3> + 21.841|big-slashdot-1> + 21.831|big-slashdot-2> + 17.341|big-wc-comments-1> + 17.341|big-wc-comments-2> + 17.077|big-diary-2> + 16.807|big-diary-1> + 11.472|big-semantic-1> + 10.542|big-semantic-2> drop-5-simm |big-eztv-1> => 91.874|big-eztv-2> + 22.655|big-slashdot-3> + 21.903|big-slashdot-1> + 21.891|big-slashdot-2> + 17.712|big-wc-comments-1> + 17.712|big-wc-comments-2> + 17.367|big-diary-2> + 17.095|big-diary-1> + 12.336|big-semantic-1> + 11.531|big-semantic-2> drop-6-simm |big-eztv-1> => 91.821|big-eztv-2> + 23.057|big-slashdot-3> + 22.136|big-slashdot-1> + 22.063|big-slashdot-2> + 17.905|big-wc-comments-1> + 17.905|big-wc-comments-2> + 17.574|big-diary-2> + 17.301|big-diary-1> + 13.499|big-semantic-1> + 12.775|big-semantic-2> drop-2-simm |big-semantic-1> => 87.282|big-semantic-2> + 14.972|big-diary-1> + 14.903|big-diary-2> + 14.555|big-eztv-1> + 14.373|big-eztv-2> + 13.144|big-slashdot-3> + 12.598|big-slashdot-1> + 12.465|big-slashdot-2> + 10.463|big-wc-comments-2> + 10.449|big-wc-comments-1> drop-3-simm |big-semantic-1> => 91.060|big-semantic-2> + 16.607|big-diary-1> + 16.553|big-diary-2> + 12.611|big-slashdot-1> + 12.588|big-slashdot-3> + 12.379|big-slashdot-2> + 10.938|big-eztv-2> + 10.899|big-eztv-1> + 10.217|big-wc-comments-1> + 10.217|big-wc-comments-2> drop-4-simm |big-semantic-1> => 93.378|big-semantic-2> + 17.871|big-diary-2> + 17.617|big-diary-1> + 13.727|big-slashdot-3> + 13.535|big-slashdot-1> + 13.529|big-slashdot-2> + 11.566|big-eztv-2> + 11.472|big-eztv-1> + 10.387|big-wc-comments-1> + 10.387|big-wc-comments-2> drop-5-simm |big-semantic-1> => 95.103|big-semantic-2> + 17.879|big-diary-2> + 17.620|big-diary-1> + 14.315|big-slashdot-3> + 14.264|big-slashdot-1> + 14.257|big-slashdot-2> + 
12.336|big-eztv-1> + 12.323|big-eztv-2> + 10.281|big-wc-comments-1> + 10.281|big-wc-comments-2> drop-6-simm |big-semantic-1> => 96.188|big-semantic-2> + 18.303|big-diary-2> + 18.053|big-diary-1> + 15.306|big-slashdot-3> + 15.238|big-slashdot-1> + 15.225|big-slashdot-2> + 13.499|big-eztv-1> + 13.466|big-eztv-2> + 10.509|big-wc-comments-1> + 10.509|big-wc-comments-2> drop-2-simm |big-diary-2> => 97.510|big-diary-1> + 41.748|big-wc-comments-2> + 41.707|big-wc-comments-1> + 18.502|big-eztv-2> + 16.839|big-eztv-1> + 14.903|big-semantic-1> + 14.599|big-slashdot-3> + 14.071|big-semantic-2> + 13.456|big-slashdot-2> + 13.366|big-slashdot-1> drop-3-simm |big-diary-2> => 98.236|big-diary-1> + 41.879|big-wc-comments-1> + 41.879|big-wc-comments-2> + 18.885|big-eztv-2> + 16.626|big-eztv-1> + 16.553|big-semantic-1> + 15.550|big-semantic-2> + 14.822|big-slashdot-3> + 13.811|big-slashdot-2> + 13.811|big-slashdot-1> drop-4-simm |big-diary-2> => 97.872|big-diary-1> + 41.155|big-wc-comments-1> + 41.155|big-wc-comments-2> + 19.604|big-eztv-2> + 17.871|big-semantic-1> + 17.380|big-semantic-2> + 17.077|big-eztv-1> + 15.225|big-slashdot-3> + 14.160|big-slashdot-1> + 14.148|big-slashdot-2> drop-5-simm |big-diary-2> => 98.533|big-diary-1> + 41.548|big-wc-comments-1> + 41.548|big-wc-comments-2> + 20.235|big-eztv-2> + 17.911|big-semantic-2> + 17.879|big-semantic-1> + 17.367|big-eztv-1> + 14.936|big-slashdot-3> + 14.059|big-slashdot-1> + 14.047|big-slashdot-2> drop-6-simm |big-diary-2> => 98.820|big-diary-1> + 42.017|big-wc-comments-1> + 42.017|big-wc-comments-2> + 20.446|big-eztv-2> + 18.423|big-semantic-2> + 18.303|big-semantic-1> + 17.574|big-eztv-1> + 15.187|big-slashdot-3> + 14.257|big-slashdot-1> + 14.234|big-slashdot-2> drop-2-simm |big-diary-1> => 97.510|big-diary-2> + 41.883|big-wc-comments-2> + 41.842|big-wc-comments-1> + 18.168|big-eztv-2> + 16.543|big-eztv-1> + 14.972|big-semantic-1> + 14.146|big-slashdot-3> + 14.118|big-semantic-2> + 13.104|big-slashdot-2> + 13.014|big-slashdot-1> drop-3-simm |big-diary-1> => 98.236|big-diary-2> + 41.927|big-wc-comments-1> + 41.927|big-wc-comments-2> + 18.682|big-eztv-2> + 16.607|big-semantic-1> + 16.527|big-eztv-1> + 15.604|big-semantic-2> + 14.275|big-slashdot-3> + 13.488|big-slashdot-2> + 13.488|big-slashdot-1> drop-4-simm |big-diary-1> => 97.872|big-diary-2> + 41.229|big-wc-comments-1> + 41.229|big-wc-comments-2> + 19.334|big-eztv-2> + 17.617|big-semantic-1> + 17.409|big-semantic-2> + 16.807|big-eztv-1> + 14.681|big-slashdot-3> + 13.847|big-slashdot-1> + 13.835|big-slashdot-2> drop-5-simm |big-diary-1> => 98.533|big-diary-2> + 41.620|big-wc-comments-1> + 41.620|big-wc-comments-2> + 19.963|big-eztv-2> + 17.652|big-semantic-2> + 17.620|big-semantic-1> + 17.095|big-eztv-1> + 14.611|big-slashdot-3> + 13.734|big-slashdot-1> + 13.722|big-slashdot-2> drop-6-simm |big-diary-1> => 98.820|big-diary-2> + 42.094|big-wc-comments-1> + 42.094|big-wc-comments-2> + 20.173|big-eztv-2> + 18.166|big-semantic-2> + 18.053|big-semantic-1> + 17.301|big-eztv-1> + 14.861|big-slashdot-3> + 13.930|big-slashdot-1> + 13.907|big-slashdot-2> drop-2-simm |big-semantic-2> => 87.282|big-semantic-1> + 14.576|big-eztv-2> + 14.422|big-eztv-1> + 14.118|big-diary-1> + 14.071|big-diary-2> + 12.428|big-slashdot-3> + 11.857|big-slashdot-1> + 11.684|big-slashdot-2> + 9.551|big-wc-comments-2> + 9.549|big-wc-comments-1> drop-3-simm |big-semantic-2> => 91.060|big-semantic-1> + 15.604|big-diary-1> + 15.550|big-diary-2> + 11.946|big-slashdot-3> + 11.661|big-slashdot-1> + 11.458|big-slashdot-2> + 
10.247|big-wc-comments-1> + 10.247|big-wc-comments-2> + 10.033|big-eztv-2> + 9.977|big-eztv-1>
drop-4-simm |big-semantic-2> => 93.378|big-semantic-1> + 17.409|big-diary-1> + 17.380|big-diary-2> + 13.053|big-slashdot-3> + 12.738|big-slashdot-1> + 12.731|big-slashdot-2> + 10.769|big-eztv-2> + 10.542|big-eztv-1> + 10.327|big-wc-comments-1> + 10.327|big-wc-comments-2>
drop-5-simm |big-semantic-2> => 95.103|big-semantic-1> + 17.911|big-diary-2> + 17.652|big-diary-1> + 13.708|big-slashdot-3> + 13.656|big-slashdot-1> + 13.650|big-slashdot-2> + 11.531|big-eztv-2> + 11.531|big-eztv-1> + 10.445|big-wc-comments-1> + 10.445|big-wc-comments-2>
drop-6-simm |big-semantic-2> => 96.188|big-semantic-1> + 18.423|big-diary-2> + 18.166|big-diary-1> + 14.594|big-slashdot-3> + 14.525|big-slashdot-1> + 14.513|big-slashdot-2> + 12.775|big-eztv-1> + 12.758|big-eztv-2> + 10.709|big-wc-comments-1> + 10.709|big-wc-comments-2>

Cool result! The discrimination is now 70-odd points, compared with roughly 20 last time.
Next, a new built-in function: common[op], which finds what the elements of a superposition have in common with respect to an op.

# common[op] (|x> + |y> + |z>)
# eg: common[friends] (|Fred> + |Sam>)
# eg: common[actors] (|movie-1> + |movie-2>)
# or indirectly:
# |list> => |Fred> + |Sam> + |Charles>
# common[friends] "" |list>
# -- this has the advantage that we can consider arbitrarily long lists, without much hassle.
def common(one, context, op):
    if one.count() <= 1:    # this should also neatly filter out kets, I presume.
        return one.apply_op(context, op)
    r = one.data[0].apply_op(context, op)
    for k in range(1, one.count()):
        sp = one.data[k].apply_op(context, op)
        r = intersection(r, sp)
    return r

The old way to do this was:
common(friends |Fred>, friends |Sam>)            -- where "common" is an alias for "intersection"
common(actors |movie-1>, actors |movie-2>)
common(friends |Fred>, friends |Sam>, friends |Charles>)

So, a small but useful improvement.
"common" : ".apply_sp_fn(common,context,\"{0}\")",
Recall the examples from earlier in this document:

If we have data on George, Ed and Travis's friends we can do:
"Which friends do George, Ed and Travis have in common?"
|answer> => intersection(friends|person: George>, friends|person: Ed>, friends|person: Travis>)
-- which BTW is a common pattern:
|answer> => intersection(op|U>, op|V>, op|X>, op|Y>)
"Which actors do movie name-a and movie name-b have in common?"
|answer> => intersection(actors|movie: name-a>, actors|movie: name-b>)

Now we can implement these using:
|answer> => common[friends] (|person: George> + |person: Ed> + |person: Travis>)
|answer> => common[op] (|U> + |V> + |X> + |Y>)
|answer> => common[actors] (|movie-a> + |movie-b>)
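For what it's worth, here is a minimal standalone sketch of the same pattern in plain Python. It assumes superpositions are just dicts from ket labels to coefficients, and that intersection keeps the minimum coefficient of each shared ket; the friends/Fred/Sam toy data below is my own stand-in, not project code:

from functools import reduce

def intersection(sp1, sp2):
    # keep only the kets present in both superpositions, taking the smaller coefficient
    return {k: min(v, sp2[k]) for k, v in sp1.items() if k in sp2}

def common(op, kets):
    # apply op to each ket, then fold the results together with intersection
    return reduce(intersection, (op(k) for k in kets))

# toy data, standing in for a hypothetical "friends" operator:
friends = {"Fred": {"Jack": 1, "Harry": 1, "Ed": 1},
           "Sam": {"Charlie": 1, "Harry": 1, "Ed": 1}}.get

print(common(friends, ["Fred", "Sam"]))    # {'Harry': 1, 'Ed': 1}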
count-1 |Tom-Sawyer-1M> => |number: 10772> -- number of distinct kets when using 1048576 buckets count-2 |Tom-Sawyer-1M> => |number: 4431> -- number of distinct kets with coeff 2 and above count-3 |Tom-Sawyer-1M> => |number: 2893> -- number of distinct kets with coeff 3 and above count-4 |Tom-Sawyer-1M> => |number: 2149> -- number of distinct kets with coeff 4 and above count-5 |Tom-Sawyer-1M> => |number: 1692> -- number of distinct kets with coeff 5 and above count-6 |Tom-Sawyer-1M> => |number: 1417> -- number of distinct kets with coeff 6 and above count-7 |Tom-Sawyer-1M> => |number: 1223> -- number of distinct kets with coeff 7 and above count-8 |Tom-Sawyer-1M> => |number: 1044> -- number of distinct kets with coeff 8 and above count-9 |Tom-Sawyer-1M> => |number: 908> -- number of distinct kets with coeff 9 and above count-10 |Tom-Sawyer-1M> => |number: 818> -- number of distinct kets with coeff 10 and above count-1 |Gone-with-Wind-1M> => |number: 24270> -- note again, the sparse representation pays off. Otherwise, these would all have 1 million terms! count-2 |Gone-with-Wind-1M> => |number: 12666> count-3 |Gone-with-Wind-1M> => |number: 9152> count-4 |Gone-with-Wind-1M> => |number: 7294> count-5 |Gone-with-Wind-1M> => |number: 6182> count-6 |Gone-with-Wind-1M> => |number: 5307> count-7 |Gone-with-Wind-1M> => |number: 4674> count-8 |Gone-with-Wind-1M> => |number: 4195> count-9 |Gone-with-Wind-1M> => |number: 3819> count-10 |Gone-with-Wind-1M> => |number: 3525> count-1 |Frankenstein-1M> => |number: 9160> count-2 |Frankenstein-1M> => |number: 4582> count-3 |Frankenstein-1M> => |number: 3184> count-4 |Frankenstein-1M> => |number: 2432> count-5 |Frankenstein-1M> => |number: 1945> count-6 |Frankenstein-1M> => |number: 1612> count-7 |Frankenstein-1M> => |number: 1394> count-8 |Frankenstein-1M> => |number: 1215> count-9 |Frankenstein-1M> => |number: 1077> count-10 |Frankenstein-1M> => |number: 968> count-1 |Alice-in-Wonderland-1M> => |number: 4744> count-2 |Alice-in-Wonderland-1M> => |number: 2035> count-3 |Alice-in-Wonderland-1M> => |number: 1360> count-4 |Alice-in-Wonderland-1M> => |number: 1024> count-5 |Alice-in-Wonderland-1M> => |number: 827> count-6 |Alice-in-Wonderland-1M> => |number: 696> count-7 |Alice-in-Wonderland-1M> => |number: 611> count-8 |Alice-in-Wonderland-1M> => |number: 525> count-9 |Alice-in-Wonderland-1M> => |number: 458> count-10 |Alice-in-Wonderland-1M> => |number: 414> count-1 |Shakespeare-1M> => |number: 72218> count-2 |Shakespeare-1M> => |number: 30731> count-3 |Shakespeare-1M> => |number: 20170> count-4 |Shakespeare-1M> => |number: 15475> count-5 |Shakespeare-1M> => |number: 12679> count-6 |Shakespeare-1M> => |number: 10781> count-7 |Shakespeare-1M> => |number: 9392> count-8 |Shakespeare-1M> => |number: 8412> count-9 |Shakespeare-1M> => |number: 7571> count-10 |Shakespeare-1M> => |number: 6909> count-1 |Moby-Dick-1M> => |number: 26238> count-2 |Moby-Dick-1M> => |number: 11258> count-3 |Moby-Dick-1M> => |number: 7372> count-4 |Moby-Dick-1M> => |number: 5520> count-5 |Moby-Dick-1M> => |number: 4424> count-6 |Moby-Dick-1M> => |number: 3671> count-7 |Moby-Dick-1M> => |number: 3137> count-8 |Moby-Dick-1M> => |number: 2746> count-9 |Moby-Dick-1M> => |number: 2415> count-10 |Moby-Dick-1M> => |number: 2157> count-1 |I-Robot-1M> => |number: 9218> count-2 |I-Robot-1M> => |number: 4168> count-3 |I-Robot-1M> => |number: 2803> count-4 |I-Robot-1M> => |number: 2098> count-5 |I-Robot-1M> => |number: 1711> count-6 |I-Robot-1M> => |number: 1424> count-7 |I-Robot-1M> => |number: 
1234>
count-8 |I-Robot-1M> => |number: 1100>
count-9 |I-Robot-1M> => |number: 985>
count-10 |I-Robot-1M> => |number: 893>
count-1 |Sherlock-Holmes-1M> => |number: 10684>
count-2 |Sherlock-Holmes-1M> => |number: 5126>
count-3 |Sherlock-Holmes-1M> => |number: 3549>
count-4 |Sherlock-Holmes-1M> => |number: 2729>
count-5 |Sherlock-Holmes-1M> => |number: 2254>
count-6 |Sherlock-Holmes-1M> => |number: 1941>
count-7 |Sherlock-Holmes-1M> => |number: 1672>
count-8 |Sherlock-Holmes-1M> => |number: 1467>
count-9 |Sherlock-Holmes-1M> => |number: 1330>
count-10 |Sherlock-Holmes-1M> => |number: 1183>
count-1 |nineteen-eighty-four-1M> => |number: 11454>
count-2 |nineteen-eighty-four-1M> => |number: 5444>
count-3 |nineteen-eighty-four-1M> => |number: 3700>
count-4 |nineteen-eighty-four-1M> => |number: 2831>
count-5 |nineteen-eighty-four-1M> => |number: 2289>
count-6 |nineteen-eighty-four-1M> => |number: 1914>
count-7 |nineteen-eighty-four-1M> => |number: 1633>
count-8 |nineteen-eighty-four-1M> => |number: 1445>
count-9 |nineteen-eighty-four-1M> => |number: 1272>
count-10 |nineteen-eighty-four-1M> => |number: 1161>

And now the simm results:
$ grep "^drop" sw-examples/frag_ebooks_1M_post_processing--saved.sw | grep "simm" drop-10-simm |nineteen-eighty-four-1M> => 69.955|Sherlock-Holmes-1M> + 69.483|Tom-Sawyer-1M> + 69.442|I-Robot-1M> + 68.444|Moby-Dick-1M> + 64.511|Gone-with-Wind-1M> + 64.251|Frankenstein-1M> + 62.139|Alice-in-Wonderland-1M> + 49.982|Shakespeare-1M> drop-10-simm |Moby-Dick-1M> => 68.730|Sherlock-Holmes-1M> + 68.444|nineteen-eighty-four-1M> + 66.421|Tom-Sawyer-1M> + 64.975|Frankenstein-1M> + 64.652|I-Robot-1M> + 62.815|Gone-with-Wind-1M> + 60.736|Alice-in-Wonderland-1M> + 56.115|Shakespeare-1M> drop-10-simm |Shakespeare-1M> => 57.121|Sherlock-Holmes-1M> + 56.115|Moby-Dick-1M> + 54.958|Gone-with-Wind-1M> + 54.597|Frankenstein-1M> + 53.924|I-Robot-1M> + 50.770|Tom-Sawyer-1M> + 49.982|nineteen-eighty-four-1M> + 45.706|Alice-in-Wonderland-1M> drop-10-simm |Tom-Sawyer-1M> => 70.489|Sherlock-Holmes-1M> + 69.483|nineteen-eighty-four-1M> + 68.727|I-Robot-1M> + 67.818|Gone-with-Wind-1M> + 66.947|Alice-in-Wonderland-1M> + 66.421|Moby-Dick-1M> + 63.893|Frankenstein-1M> + 50.770|Shakespeare-1M> drop-10-simm |Sherlock-Holmes-1M> => 70.581|I-Robot-1M> + 70.489|Tom-Sawyer-1M> + 69.955|nineteen-eighty-four-1M> + 68.730|Moby-Dick-1M> + 68.671|Frankenstein-1M> + 65.807|Gone-with-Wind-1M> + 62.923|Alice-in-Wonderland-1M> + 57.121|Shakespeare-1M> drop-10-simm |Frankenstein-1M> => 68.671|Sherlock-Holmes-1M> + 64.975|Moby-Dick-1M> + 64.251|nineteen-eighty-four-1M> + 63.893|Tom-Sawyer-1M> + 60.598|I-Robot-1M> + 58.861|Gone-with-Wind-1M> + 57.375|Alice-in-Wonderland-1M> + 54.597|Shakespeare-1M> drop-10-simm |Gone-with-Wind-1M> => 67.818|Tom-Sawyer-1M> + 65.807|Sherlock-Holmes-1M> + 64.511|nineteen-eighty-four-1M> + 63.202|I-Robot-1M> + 62.815|Moby-Dick-1M> + 58.861|Frankenstein-1M> + 58.766|Alice-in-Wonderland-1M> + 54.958|Shakespeare-1M> drop-10-simm |Alice-in-Wonderland-1M> => 66.947|Tom-Sawyer-1M> + 62.923|Sherlock-Holmes-1M> + 62.561|I-Robot-1M> + 62.139|nineteen-eighty-four-1M> + 60.736|Moby-Dick-1M> + 58.766|Gone-with-Wind-1M> + 57.375|Frankenstein-1M> + 45.706|Shakespeare-1M> drop-10-simm |I-Robot-1M> => 70.581|Sherlock-Holmes-1M> + 69.442|nineteen-eighty-four-1M> + 68.727|Tom-Sawyer-1M> + 64.652|Moby-Dick-1M> + 63.202|Gone-with-Wind-1M> + 62.561|Alice-in-Wonderland-1M> + 60.598|Frankenstein-1M> + 53.924|Shakespeare-1M>So these ebooks are, using this method, roughly 60-70% similar, except for Shakespeare who is around 55% similar to the rest.
Next, some code to build a 2D grid, where each grid cell learns operators pointing to its compass-direction neighbours:

def ket_elt(j, i):
    return ket("grid: " + str(j) + " " + str(i))

def ket_elt_bd(j, i, I, J):
    # finite universe model:
    # if i <= 0 or j <= 0 or i > I or j > J:
    #     return ket("", 0)
    # NB: this makes use of the fact that if the learn rule is |> then it is ignored.
    # ie, it is not learnt. eg: foo |x> => |> leaves |x> unchanged.
    #
    # torus model:
    i = (i - 1) % I + 1
    j = (j - 1) % J + 1
    return ket("grid: " + str(j) + " " + str(i))

def create_grid(c, I, J):
    c.learn("dim-1", "grid", str(I))
    c.learn("dim-2", "grid", str(J))
    for j in range(1, J + 1):
        for i in range(1, I + 1):
            elt = ket_elt(j, i)
            c.add_learn("elements", "grid", elt)
            c.learn("N", elt, ket_elt_bd(j - 1, i, I, J))
            c.learn("NE", elt, ket_elt_bd(j - 1, i + 1, I, J))
            c.learn("E", elt, ket_elt_bd(j, i + 1, I, J))
            c.learn("SE", elt, ket_elt_bd(j + 1, i + 1, I, J))
            c.learn("S", elt, ket_elt_bd(j + 1, i, I, J))
            c.learn("SW", elt, ket_elt_bd(j + 1, i - 1, I, J))
            c.learn("W", elt, ket_elt_bd(j, i - 1, I, J))
            c.learn("NW", elt, ket_elt_bd(j - 1, i - 1, I, J))
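The only mildly tricky line in there is the torus wrap-around, since the grid coordinates are 1-based rather than 0-based. A quick standalone check of the arithmetic:

# sanity check of the torus wrap-around used in ket_elt_bd:
# coordinates run 1..I, and (i - 1) % I + 1 maps any integer back into that range.
def wrap(i, I):
    return (i - 1) % I + 1

I = 5
print([wrap(i, I) for i in range(-1, 8)])
# [4, 5, 1, 2, 3, 4, 5, 1, 2]  ie, 0 wraps to 5, 6 wraps to 1, and so on.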
Next up: binary trees. Here is a simple one:

left |x> => |a>
right |x> => |b>
left |a> => |c>
right |a> => |d>
left |b> => |e>
right |b> => |f>
left |c> => |g>
right |c> => |h>
left |d> => |i>
right |d> => |j>
left |e> => |k>
right |e> => |l>
left |f> => |m>
right |f> => |n>

And then we can add other info to nodes. eg:
text |x> => |start node>
text |a> => |first child node>
text |b> => |second child node>

OK. Now, let's look at this after we load it into the console:
----------------------------------------
|context> => |context: simple binary tree>

left |x> => |a>
right |x> => |b>
text |x> => |start node>
left |a> => |c>
right |a> => |d>
text |a> => |first child node>
left |b> => |e>
right |b> => |f>
text |b> => |second child node>
left |c> => |g>
right |c> => |h>
left |d> => |i>
right |d> => |j>
left |e> => |k>
right |e> => |l>
left |f> => |m>
right |f> => |n>
child |*> #=> left |_self> + right |_self>
----------------------------------------

Now that we have this, we can descend the tree, eg:
sa: right left |x>
|d>

sa: right right left |x>
|j>

sa: right left right |x>
|l>

sa: child^2 |a>
|g> + |h> + |i> + |j>

sa: child^2 |b>
|k> + |l> + |m> + |n>

sa: child^3 |x>
|g> + |h> + |i> + |j> + |k> + |l> + |m> + |n>

And if nodes don't exist, the code handles that gracefully by returning the empty ket (which is also the identity ket):
sa: left child^3 |x>
|>

sa: child^4 |x>
|>
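A toy dict-based sketch of that graceful-failure behaviour (my own illustration, not the project's code):

# applying an op to a node with no learn rule returns the empty superposition
# (playing the role of |>), which is the identity under superposition addition,
# so chains of operators simply fizzle out instead of crashing.
def apply_op(rules, op, node):
    return rules.get((op, node), [])

rules = {("left", "x"): ["a"], ("right", "x"): ["b"]}
print(apply_op(rules, "left", "q"))             # [] -- no rule, no crash
print(["x"] + apply_op(rules, "left", "q"))     # ['x'] -- adding |> changes nothing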
Next, what if we want multiple levels of the tree at once? Well, that is trivial enough too, though since the current parser does not handle the compact notation, we have to do it a little more verbosely:

sa: 1 |x>
|x>

sa: -- (1 + child) |x>
sa: |x> + child |x>
|x> + |a> + |b>

sa: -- (1 + child + child^2) |x>            -- this is the notation we would use if the parser could handle it.
sa: |x> + child |x> + child^2 |x>           -- instead, for now, we have to do this.
|x> + |a> + |b> + |c> + |d> + |e> + |f>

sa: -- (1 + child + child^2 + child^3) |x>
sa: |x> + child |x> + child^2 |x> + child^3 |x>
|x> + |a> + |b> + |c> + |d> + |e> + |f> + |g> + |h> + |i> + |j> + |k> + |l> + |m> + |n>

sa: -- (1 + child + child^2 + child^3 + child^4) |x>
sa: |x> + child |x> + child^2 |x> + child^3 |x> + child^4 |x>
|x> + |a> + |b> + |c> + |d> + |e> + |f> + |g> + |h> + |i> + |j> + |k> + |l> + |m> + |n>
-- NB: no new terms. We have reached the bottom of the tree.

Which, weirdly enough, reminds me of this from Quantum Mechanics:
exp(A) |Psi>

which expands to:

(1 + A + A^2/2! + A^3/3! + A^4/4! + A^5/5! + ... + A^n/n! + ...) |Psi>

where A is just some QM operator. But we don't want the 1/n! coeffs, so just apply the "clean" sigmoid:
clean exp(child) |x>

which expands to:

(1 + child + child^2 + child^3 + child^4 + ... + child^n + ... ) |x>

-- heh. I might write an exp[child,n] function now. This looks useful!
-- eg, another application is the six degrees of separation idea: exp[friends,6] |Fred>

Now, finally, if we create inverses in the console, we can ascend the tree too.
sa: create inverse
sa: dump
----------------------------------------
|context> => |context: simple binary tree>

left |x> => |a>
right |x> => |b>
text |x> => |start node>
left |a> => |c>
right |a> => |d>
text |a> => |first child node>
inverse-left |a> => |x>
left |b> => |e>
right |b> => |f>
text |b> => |second child node>
inverse-right |b> => |x>
left |c> => |g>
right |c> => |h>
inverse-left |c> => |a>
left |d> => |i>
right |d> => |j>
inverse-right |d> => |a>
left |e> => |k>
right |e> => |l>
inverse-left |e> => |b>
left |f> => |m>
right |f> => |n>
inverse-right |f> => |b>
child |*> #=> left |_self> + right |_self>
inverse-supported-ops |op: left> => |x> + |a> + |b> + |c> + |d> + |e> + |f>
inverse-supported-ops |op: right> => |x> + |a> + |b> + |c> + |d> + |e> + |f>
inverse-supported-ops |op: text> => |x> + |a> + |b>
inverse-text |start node> => |x>
inverse-supported-ops |op: inverse-left> => |a> + |c> + |e> + |g> + |i> + |k> + |m>
inverse-text |first child node> => |a>
inverse-supported-ops |op: inverse-right> => |b> + |d> + |f> + |h> + |j> + |l> + |n>
inverse-text |second child node> => |b>
inverse-left |g> => |c>
inverse-right |h> => |c>
inverse-left |i> => |d>
inverse-right |j> => |d>
inverse-left |k> => |e>
inverse-right |l> => |e>
inverse-left |m> => |f>
inverse-right |n> => |f>
inverse-supported-ops |op: child> => |*>
inverse-supported-ops |op: inverse-supported-ops> => |op: left> + |op: right> + |op: text> + |op: inverse-left> + |op: inverse-right> + |op: child> + |op: inverse-supported-ops> + |op: inverse-text>
inverse-supported-ops |op: inverse-text> => |start node> + |first child node> + |second child node>
----------------------------------------

sa: inverse-left |a>
|x>

sa: inverse-right |b>
|x>

sa: inverse-right inverse-left |m>
|b>

sa: inverse-right inverse-right inverse-left |m>
|x>

-- create the parent operator out of inverse-left and inverse-right:
sa: parent |*> #=> inverse-left |_self> + inverse-right |_self>

sa: parent |m>
|f>

sa: parent^2 |m>
|b>

sa: parent^3 |m>
|x>

sa: parent^2 |k>
|b>

sa: parent^2 |h>
|a>

sa: parent^3 |h>
|x>

sa: parent^4 |h>
|>
-- we have gone past the top of the tree.

And that, I think, covers the basics of binary trees in BKO.
Next, a Lisp-style CAR/CDR decomposition of a sentence. (the cat spied a rat) becomes:

CAR |the cat spied a rat> => |the>
CDR |the cat spied a rat> => |cat spied a rat>
CAR |cat spied a rat> => |cat>
CDR |cat spied a rat> => |spied a rat>
CAR |spied a rat> => |spied>
CDR |spied a rat> => |a rat>
CAR |a rat> => |a>
CDR |a rat> => |rat>
CAR |rat> => |rat>
CDR |rat> => |>
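These rules are mechanical enough that they can be generated straight from the plain sentence. A hedged sketch (this generator is my own; the project may do it differently):

def car_cdr_rules(sentence):
    # walk down the sentence, emitting a CAR/CDR pair at each step
    words = sentence.split()
    rules = []
    while words:
        head, tail = words[0], words[1:]
        rules.append("CAR |%s> => |%s>" % (" ".join(words), head))
        rules.append("CDR |%s> => |%s>" % (" ".join(words), " ".join(tail)))
        words = tail
    return rules

for r in car_cdr_rules("the cat spied a rat"):
    print(r)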
# exp[child,n] |x>
# maps to: (1 + child + child^2 + ... + child^n ) |x>
# cf: exp(A) |Psi> in QM.
# if n <= 0, return |x>
def exp(one, context, parameters):
    try:
        op, n = parameters.split(",")    # slightly hackish. Don't know a better way to do it ....
        print("exp op " + op)
        print("exp n " + n)
        n = int(n)
    except:
        return one
    r = one
    tmp = one
    for k in range(n):
        tmp = tmp.apply_op(context, op)
        r += tmp
    return r

And now for some examples. So let's use the binary tree data from yesterday.
sa: load simple-binary-tree.sw
sa: dump
----------------------------------------
|context> => |context: simple binary tree>

left |x> => |a>
right |x> => |b>
text |x> => |start node>
left |a> => |c>
right |a> => |d>
text |a> => |first child node>
left |b> => |e>
right |b> => |f>
text |b> => |second child node>
left |c> => |g>
right |c> => |h>
left |d> => |i>
right |d> => |j>
left |e> => |k>
right |e> => |l>
left |f> => |m>
right |f> => |n>
child |*> #=> left |_self> + right |_self>
----------------------------------------

sa: exp[child,-1] |x>    -- if you get the n parameter wrong, it just returns |x>
|x>

sa: exp[child,0] |x>     -- this is an expected result.
|x>

sa: exp[child,1] |x>
|x> + |a> + |b>

sa: exp[child,2] |x>
|x> + |a> + |b> + |c> + |d> + |e> + |f>

sa: exp[child,3] |x>
|x> + |a> + |b> + |c> + |d> + |e> + |f> + |g> + |h> + |i> + |j> + |k> + |l> + |m> + |n>

sa: exp[child,4] |x>     -- notice no new terms. We are at the bottom of the tree.
|x> + |a> + |b> + |c> + |d> + |e> + |f> + |g> + |h> + |i> + |j> + |k> + |l> + |m> + |n>

sa: exp[child,5] |x>     -- notice no new terms. We are at the bottom of the tree.
|x> + |a> + |b> + |c> + |d> + |e> + |f> + |g> + |h> + |i> + |j> + |k> + |l> + |m> + |n>

sa: exp[child,1] |a>
|a> + |c> + |d>

sa: exp[child,1] |b>
|b> + |e> + |f>

sa: exp[child,1] |f>
|f> + |m> + |n>

sa: exp[child,1] (|a> + |f>)    -- NB: exp[] can be applied to both kets and superpositions.
|a> + |f> + |c> + |d> + |m> + |n>
-- Here we essentially have: exp[child,1] |a> + exp[child,1] |f>

sa: exp[child,1] |x>
|x> + |a> + |b>

sa: text exp[child,1] |x>    -- I think this is a cool result! Shows a little of the power of my notation.
|start node> + |first child node> + |second child node>
Now, relabel the tree nodes with their path bit-strings (each step prepends a direction bit: left = 0, right = 1):

text |x> => |start node>
left |x> => |0>        -- left |x>
right |x> => |1>       -- right |x>

text |0> => |first child node>
left |0> => |00>       -- left left |x>
right |0> => |10>      -- right left |x>

text |1> => |second child node>
left |1> => |01>       -- left right |x>
right |1> => |11>      -- right right |x>

text |00> => |third child node>
left |00> => |000>     -- left left left |x>
right |00> => |100>    -- right left left |x>

text |10> => |fourth child node>
left |10> => |010>     -- left right left |x>
right |10> => |110>    -- right right left |x>

text |01> => |fifth child node>
left |01> => |001>     -- left left right |x>
right |01> => |101>    -- right left right |x>

text |11> => |sixth child node>
left |11> => |011>     -- left right right |x>
right |11> => |111>    -- right right right |x>

child |*> #=> left |_self> + right |_self>

OK. Now that we have these new ket labels, showing the difference between child^n |x> and exp[child,n] |x> will be easier.
sa: child |x>
|0> + |1>

sa: text child |x>
|first child node> + |second child node>

sa: exp[child,1] |x>
|x> + |0> + |1>

sa: text exp[child,1] |x>
|start node> + |first child node> + |second child node>

sa: child^2 |x>
|00> + |10> + |01> + |11>

sa: text child^2 |x>
|third child node> + |fourth child node> + |fifth child node> + |sixth child node>

sa: exp[child,2] |x>
|x> + |0> + |1> + |00> + |10> + |01> + |11>

sa: text exp[child,2] |x>
|start node> + |first child node> + |second child node> + |third child node> + |fourth child node> + |fifth child node> + |sixth child node>

sa: child^3 |x>
|000> + |100> + |010> + |110> + |001> + |101> + |011> + |111>

sa: exp[child,3] |x>
|x> + |0> + |1> + |00> + |10> + |01> + |11> + |000> + |100> + |010> + |110> + |001> + |101> + |011> + |111>
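Since the labels are just path bit-strings (newest direction bit prepended), the whole left/right structure can be generated programmatically. A small sketch, using my own hypothetical tree_rules helper rather than anything in the project:

def tree_rules(depth):
    # "" plays the role of the root ket |x>
    rules, layer = [], [""]
    for _ in range(depth):
        next_layer = []
        for node in layer:
            label = node if node else "x"
            rules.append("left |%s> => |%s>" % (label, "0" + node))
            rules.append("right |%s> => |%s>" % (label, "1" + node))
            next_layer += ["0" + node, "1" + node]
        layer = next_layer
    return rules

for r in tree_rules(3):
    print(r)    # reproduces the left/right rules above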
# exp-max[op] |x>
# maps to: (1 + op + op^2 + ... + op^n) |x>
# such that exp[op,n] |x> == exp[op,n+1] |x>
# (strictly speaking, it is their lengths being compared)
# Warning though, we have no idea beforehand how large n and the resulting superposition are going to be.
# Also, for large data sets this is going to be big-O expensive. But someone smarter than me can fix that problem, presumably.
def exp_max(one, context, parameters):
    try:
        op, t = parameters.split(",")
        t = int(t)
    except:
        op = parameters
        t = 0
    r = one
    tmp = one
    previous_size = len(r)    # yup. I finally implemented len() for superpositions/kets.
    n = 0
    while True:
        tmp = tmp.apply_op(context, op)
        r += tmp
        # if len(r) == previous_size:       # a variant is: len(r) - previous_size <= t
        if len(r) - previous_size <= t:     # since kets add in superpositions, this difference is the number of newly discovered kets.
            break                           # so, if this is 0, then we have reached the end of the network.
        previous_size = len(r)              # and if this is say 1, then in this round we only found 1 new ket,
        n += 1                              # which in some cases is enough to say this will suffice as the end of the network.
    print("n:", n)
    return r

A comment:
# Something I have wanted to do for a very long time is to split an academic field of study into categories.
# Roughly: exp-max[references,t] |some seed physics paper>
# where the "references" operator applied to a paper on arxiv.org returns the list of papers it references.
# We may (though maybe not) need t > 0, else it might drag in all of arxiv.org.

Now, for some examples using the binary tree data.
sa: load binary-tree.sw
sa: child |*> #=> left |_self> + right |_self>

sa: exp-max[child] |x>    -- slurp the whole tree, starting at the |x> node.
n: 3                      -- BTW, considering the verbosity in the console, the big-O on this is probably bad!
|x> + |0> + |1> + |00> + |10> + |01> + |11> + |000> + |100> + |010> + |110> + |001> + |101> + |011> + |111>

sa: exp-max[child] |0>    -- slurp the whole tree, starting at the |0> node.
n: 2                      -- NB: here n = 2, above it was n = 3 (where n is how many steps exp has descended)
|0> + |00> + |10> + |000> + |100> + |010> + |110>

sa: exp-max[left] |x>     -- slurp down the left branch of the tree.
n: 3
|x> + |0> + |00> + |000>

sa: exp-max[right] |x>    -- slurp down the right branch of the tree.
n: 3
|x> + |1> + |11> + |111>

sa: create inverse        -- create all the inverses

sa: exp-max[inverse-left] |000>    -- climb up the tree starting at |000>
n: 3
|000> + |00> + |0> + |x>

-- now some examples from the middle branches, not the edges:
sa: exp-max[inverse-left] |101>
n: 0
|101>           -- inverse-left is not defined for |101>

sa: exp-max[inverse-right] |101>   -- try again:
n: 1
|101> + |01>    -- inverse-right is not defined for |01> (hence we can climb no higher)

-- let's define a general rule for parents:
sa: parent |*> #=> inverse-left |_self> + inverse-right |_self>

-- try again:
sa: exp-max[parent] |101>
n: 3
|101> + |01> + |1> + |x>    -- success.

sa: exp-max[parent] |011>
n: 3
|011> + |11> + |1> + |x>
-- yup. Using parent we can climb the tree from any branch upwards to the top.

BTW, perhaps another use or two of this code is:
exp-max[people-you-know] |some seed person>
exp-max[url-links-to] |some seed webpage>

Quick test: even cyclic networks don't cause an infinite loop. See here:
sa: load simple-network.sw
sa: matrix[O]
[ a1  ] = [ 0    0    0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    ] [ a1  ]
[ a2  ]   [ 1.00 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ a2  ]
[ a3  ]   [ 0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ a3  ]
[ a4  ]   [ 0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ a4  ]
[ a5  ]   [ 0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ a5  ]
[ a6  ]   [ 0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    ] [ a6  ]
[ a7  ]   [ 0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    ] [ a7  ]
[ a8  ]   [ 0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    ] [ a8  ]
[ a9  ]   [ 0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    ] [ a9  ]
[ a10 ]   [ 0    0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    ] [ a10 ]
[ b1  ]   [ 0    0    0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    1.00 ] [ b1  ]
[ b2  ]   [ 0    0    0    0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    ] [ b2  ]
[ b3  ]   [ 0    0    0    0    0    0    0    0    0    0    0    1.00 0    0    0    0    0    ] [ b3  ]
[ b4  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    1.00 0    0    0    0    ] [ b4  ]
[ b5  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    1.00 0    0    0    ] [ b5  ]
[ b6  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    1.00 0    0    ] [ b6  ]
[ b7  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1.00 0    ] [ b7  ]

sa: exp-max[O] |a1>
n: 16
2.000|a1> + 2.000|a2> + 2.000|a3> + 2.000|a4> + 2.000|a5> + 2.000|a6> + 2.000|a7> + 2.000|a8> + |a9> + |a10> + 2.000|b1> + |b2> + |b3> + |b4> + |b5> + |b6> + |b7>

sa: exp-max[O] |a3>
n: 14
2.000|a3> + 2.000|a4> + 2.000|a5> + 2.000|a6> + 2.000|a7> + 2.000|a8> + |a9> + |a10> + |a1> + 2.000|b1> + |a2> + |b2> + |b3> + |b4> + |b5> + |b6> + |b7>

sa: exp-max[O] |b1>    -- NB: {b1,b2,b3,b4,b5,b6,b7} are a sub-network,
n: 6                   -- so we never hit any of the a_n nodes.
2.000|b1> + |b2> + |b3> + |b4> + |b5> + |b6> + |b7>

sa: exp-max[O] |b3>
n: 6
2.000|b3> + |b4> + |b5> + |b6> + |b7> + |b1> + |b2>

Note, I don't really understand the meaning of the coeffs, but we can get rid of them easily enough using clean.
sa: clean exp-max[O] |a1>
n: 16
|a1> + |a2> + |a3> + |a4> + |a5> + |a6> + |a7> + |a8> + |a9> + |a10> + |b1> + |b2> + |b3> + |b4> + |b5> + |b6> + |b7>

sa: clean exp-max[O] |a3>
n: 14
|a3> + |a4> + |a5> + |a6> + |a7> + |a8> + |a9> + |a10> + |a1> + |b1> + |a2> + |b2> + |b3> + |b4> + |b5> + |b6> + |b7>

sa: clean exp-max[O] |b1>
n: 6
|b1> + |b2> + |b3> + |b4> + |b5> + |b6> + |b7>
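As an aside, here is a rough standalone sketch of why exp-max terminates even on a cycle: the support (the set of distinct kets seen so far) only ever grows, and the loop exits the first round that discovers no new kets, no matter how the coefficients keep circulating. The dict-based graph and helper below are my own toy stand-ins, not project code:

def exp_max(op, start):
    # repeatedly apply op, stopping once a round discovers no new nodes
    support, frontier = {start}, {start}
    n = 0
    while frontier:
        frontier = {y for x in frontier for y in op.get(x, [])} - support
        support |= frontier
        if frontier:
            n += 1
    return n, support

cycle = {"b1": ["b2"], "b2": ["b3"], "b3": ["b1"]}    # a 3-cycle
print(exp_max(cycle, "b1"))    # (2, {'b1', 'b2', 'b3'}) -- terminates, no infinite loop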
Also, these networks and sub-networks are making me think of group theory in maths (and groups vs subgroups and so on).

sa: create inverse

sa: exp-max[inverse-O] |a1>
n: 9
2.000|a1> + |a10> + |a9> + |a8> + |a7> + |a6> + |a5> + |a4> + |a3> + |a2>

sa: exp-max[inverse-O] |a3>
n: 9
2.000|a3> + |a2> + |a1> + |a10> + |a9> + |a8> + |a7> + |a6> + |a5> + |a4>

sa: exp-max[inverse-O] |b3>
n: 12
2.000|b3> + 2.000|b2> + 2.000|b1> + 3.000|a10> + 2.000|b7> + 2.000|a9> + 2.000|b6> + 2.000|a8> + 2.000|b5> + 2.000|a7> + 2.000|b4> + |a6> + |a5> + |a4> + |a3> + |a2> + |a1>

Heh, so now {a1,a2,a3,a4,a5,a6,a7,a8,a9,a10} is a sub-group of {a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,b1,b2,b3,b4,b5,b6,b7}. For reference, here are the O and inverse-O learn rules:
O |a1> => |a2>
inverse-O |a1> => |a10>
O |a2> => |a3>
inverse-O |a2> => |a1>
O |a3> => |a4>
inverse-O |a3> => |a2>
O |a4> => |a5>
inverse-O |a4> => |a3>
O |a5> => |a6>
inverse-O |a5> => |a4>
O |a6> => |a7>
inverse-O |a6> => |a5>
O |a7> => |a8>
inverse-O |a7> => |a6>
O |a8> => |a9>
inverse-O |a8> => |a7>
O |a9> => |a10>
inverse-O |a9> => |a8>
O |a10> => |a1> + |b1>
inverse-O |a10> => |a9>
O |b1> => |b2>
inverse-O |b1> => |a10> + |b7>
O |b2> => |b3>
inverse-O |b2> => |b1>
O |b3> => |b4>
inverse-O |b3> => |b2>
O |b4> => |b5>
inverse-O |b4> => |b3>
O |b5> => |b6>
inverse-O |b5> => |b4>
O |b6> => |b7>
inverse-O |b6> => |b5>
O |b7> => |b1>
inverse-O |b7> => |b6>

And here is the inverse-O matrix:
sa: matrix[inverse-O]
[ a1  ] = [ 0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ a1  ]
[ a2  ]   [ 0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ a2  ]
[ a3  ]   [ 0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ a3  ]
[ a4  ]   [ 0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    ] [ a4  ]
[ a5  ]   [ 0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    ] [ a5  ]
[ a6  ]   [ 0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    ] [ a6  ]
[ a7  ]   [ 0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    ] [ a7  ]
[ a8  ]   [ 0    0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    ] [ a8  ]
[ a9  ]   [ 0    0    0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    ] [ a9  ]
[ a10 ]   [ 1.00 0    0    0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    ] [ a10 ]
[ b1  ]   [ 0    0    0    0    0    0    0    0    0    0    0    1.00 0    0    0    0    0    ] [ b1  ]
[ b2  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    1.00 0    0    0    0    ] [ b2  ]
[ b3  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    1.00 0    0    0    ] [ b3  ]
[ b4  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    1.00 0    0    ] [ b4  ]
[ b5  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1.00 0    ] [ b5  ]
[ b6  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1.00 ] [ b6  ]
[ b7  ]   [ 0    0    0    0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    ] [ b7  ]
|matrix>
sa: create inverse
sa: nbr |*> #=> O |_self> + inverse-O |_self>
-- propagate up and down in each step. Though it does make things somewhat inefficient!

sa: relevant-kets[O]
|a1> + |a2> + |a3> + |a4> + |a5> + |a6> + |a7> + |a8> + |a9> + |a10> + |b1> + |b2> + |b3> + |b4> + |b5> + |b6> + |b7>

sa: vector[nbr] relevant-kets[O]
[ a1  ] = [ 0    1.00 0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    ] [ a1  ]
[ a2  ]   [ 1.00 0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ a2  ]
[ a3  ]   [ 0    1.00 0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ a3  ]
[ a4  ]   [ 0    0    1.00 0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    ] [ a4  ]
[ a5  ]   [ 0    0    0    1.00 0    1.00 0    0    0    0    0    0    0    0    0    0    0    ] [ a5  ]
[ a6  ]   [ 0    0    0    0    1.00 0    1.00 0    0    0    0    0    0    0    0    0    0    ] [ a6  ]
[ a7  ]   [ 0    0    0    0    0    1.00 0    1.00 0    0    0    0    0    0    0    0    0    ] [ a7  ]
[ a8  ]   [ 0    0    0    0    0    0    1.00 0    1.00 0    0    0    0    0    0    0    0    ] [ a8  ]
[ a9  ]   [ 0    0    0    0    0    0    0    1.00 0    1.00 0    0    0    0    0    0    0    ] [ a9  ]
[ a10 ]   [ 1.00 0    0    0    0    0    0    0    1.00 0    1.00 0    0    0    0    0    0    ] [ a10 ]
[ b1  ]   [ 0    0    0    0    0    0    0    0    0    1.00 0    1.00 0    0    0    0    1.00 ] [ b1  ]
[ b2  ]   [ 0    0    0    0    0    0    0    0    0    0    1.00 0    1.00 0    0    0    0    ] [ b2  ]
[ b3  ]   [ 0    0    0    0    0    0    0    0    0    0    0    1.00 0    1.00 0    0    0    ] [ b3  ]
[ b4  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    1.00 0    1.00 0    0    ] [ b4  ]
[ b5  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    1.00 0    1.00 0    ] [ b5  ]
[ b6  ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    1.00 0    1.00 ] [ b6  ]
[ b7  ]   [ 0    0    0    0    0    0    0    0    0    0    1.00 0    0    0    0    1.00 0    ] [ b7  ]
|matrix>
-- no sort order bug here.

sa: exp-max[nbr] |a3>
100.000|a3> + 49.000|a4> + 50.000|a2> + 77.000|a5> + 89.000|a1> + 28.000|a6> + 39.000|a10> + 45.000|a7> + 59.000|b1> + 56.000|a9> + 17.000|a8> + 10.000|b2> + 10.000|b7> + 11.000|b3> + 11.000|b6> + 2.000|b4> + 2.000|b5>
-- as promised, all in one network.

sa: exp-max[nbr] |b3>
133.000|b3> + 199.000|b4> + 268.000|b2> + 135.000|b5> + 204.000|b1> + 160.000|b6> + 273.000|a10> + 229.000|b7> + 70.000|a1> + 70.000|a9> + 70.000|a2> + 70.000|a8> + 12.000|a3> + 12.000|a7> + 13.000|a4> + 13.000|a6> + 2.000|a5>
-- also all in one network.

-- and for a quick idea of how inefficient it is to go up and down with each step:
sa: count-sum exp-max[nbr] |a3>
|number: 655.0>

sa: count-sum exp-max[nbr] |b3>
|number: 1933.0>

sa: count-sum exp-max[nbr] |b1>
|number: 840.0>

-- heh. OK. Another (more efficient) way to do it. Though I am not 100% sure it will slurp in the entire network.
-- Indeed, my hunch is that there exist more interesting networks where this trick is not sufficient to slurp in the whole network.
sa: exp-max[O] |a1> + exp-max[inverse-O] |a1>
4.000|a1> + 3.000|a2> + 3.000|a3> + 3.000|a4> + 3.000|a5> + 3.000|a6> + 3.000|a7> + 3.000|a8> + 2.000|a9> + 2.000|a10> + 2.000|b1> + |b2> + |b3> + |b4> + |b5> + |b6> + |b7>

sa: exp-max[O] |b1> + exp-max[inverse-O] |b1>
4.000|b1> + 2.000|b2> + 2.000|b3> + 3.000|b4> + 3.000|b5> + 3.000|b6> + 3.000|b7> + 3.000|a10> + 2.000|a9> + 2.000|a8> + 2.000|a7> + |a6> + |a5> + |a4> + |a3> + |a2> + |a1>

sa: count-sum (exp-max[O] |a1> + exp-max[inverse-O] |a1>)
|number: 37.0>

sa: count-sum (exp-max[O] |b1> + exp-max[inverse-O] |b1>)
|number: 35.0>
Now, back to the binary tree data:

sa: create inverse
sa: parent |*> #=> inverse-left |_self> + inverse-right |_self>
sa: nghbr |*> #=> parent |_self> + child |_self>
-- this is the useful bit (defining a neighbour operator), but we need a little work first.

sa: exp-max[child] |x>    -- OK. We have a copy of the tree.
n: 3
|x> + |0> + |1> + |00> + |10> + |01> + |11> + |000> + |100> + |010> + |110> + |001> + |101> + |011> + |111>

sa: map[nghbr] exp-max[child] |x>    -- create the nghbr data for all elements in the tree.
sa: matrix[nghbr]                    -- take a look
[ 00  ] = [ 1.00 1.00 1.00 0    0    0    1.00 0    0    0    1.00 0    0    0    0    1.00 ] [ 0   ]
[ 0   ]   [ 0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ 00  ]
[ 000 ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ 000 ]
[ 01  ]   [ 0    0    0    1.00 1.00 1.00 0    0    1.00 0    0    1.00 0    0    0    1.00 ] [ 1   ]
[ 1   ]   [ 0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    ] [ 01  ]
[ 001 ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ 001 ]
[ 10  ]   [ 1.00 0    0    0    0    0    1.00 1.00 0    0    0    0    1.00 0    0    0    ] [ 10  ]
[ 010 ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ 010 ]
[ 11  ]   [ 0    0    0    1.00 0    0    0    0    1.00 1.00 0    0    0    1.00 0    0    ] [ 11  ]
[ 011 ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ 011 ]
[ 100 ]   [ 0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ 100 ]
[ 101 ]   [ 0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    ] [ 101 ]
[ 110 ]   [ 0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    ] [ 110 ]
[ 111 ]   [ 0    0    0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    ] [ 111 ]
[ x   ]   [ 1.00 0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    0    ] [ *   ]
                                                                                              [ x   ]
-- observe that:
-- |x> has 2 children, and no parents
-- middle nodes have 2 children and 1 parent
-- bottom nodes have 0 children and 1 parent
-- DOH! There is a sort order bug for {0,00,000} and maybe others. eg, the children of |x> are wrong!
-- No idea how to fix. But I do know where: in the natural-sort function.

sa: exp-max[nghbr] |010>    -- put it to use with an example.
n: 6                        -- NB: we didn't need the map[nghbr] function for this to work. That was only so we had a matrix.
16.000|010> + 61.000|10> + 30.000|0> + 15.000|110> + 39.000|x> + 46.000|00> + 9.000|1> + 8.000|000> + 8.000|100> + 11.000|01> + 11.000|11> + |001> + |101> + |011> + |111>

sa: exp-max[nghbr] |01>     -- another example.
n: 5
61.000|01> + 30.000|1> + 15.000|001> + 15.000|101> + 39.000|x> + 46.000|11> + 9.000|0> + 8.000|011> + 8.000|111> + 11.000|00> + 11.000|10> + |000> + |100> + |010> + |110>

Point being, from anywhere on the binary tree we can reach all the other nodes. We just had to define the nghbr (neighbour) operator.
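In graph terms: adding parent to child turns the tree into a connected undirected graph, so a plain breadth-first search from any node reaches all 15 nodes. A standalone sketch, using my own encoding of the tree with "" standing in for |x>:

from collections import deque

def nghbr(node, depth=3):
    # children prepend a direction bit, the parent strips it
    kids = ["0" + node, "1" + node] if len(node) < depth else []
    parent = [node[1:]] if node else []
    return kids + parent

def reachable(start):
    seen, todo = {start}, deque([start])
    while todo:
        for n in nghbr(todo.popleft()):
            if n not in seen:
                seen.add(n)
                todo.append(n)
    return seen

print(len(reachable("010")))    # 15 -- every node of the depth-3 tree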
-- another way of defining children and parents of a node:
sa: nbr |*> #=> left |_self> + inverse-left |_self> + right |_self> + inverse-right |_self>

-- instead of using matrix, use vector (where you specify the kets of interest)
-- the idea is this way we can use ops that are general rules, rather than having to run map with them first.
-- and the only reason we had to run map first was so that we had the right list from context.relevant_kets(op)
sa: vector[nbr] exp-max[child] |x>
[ 0   ] = [ 1.00 1.00 0    1.00 1.00 0    0    1.00 1.00 0    0    0    0    0    0    ] [ x   ]
[ 00  ]   [ 0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    ] [ 0   ]
[ 000 ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ 1   ]
[ 1   ]   [ 1.00 0    1.00 0    0    1.00 1.00 0    0    0    0    1.00 1.00 0    0    ] [ 00  ]
[ 01  ]   [ 0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    ] [ 10  ]
[ 001 ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ 01  ]
[ 10  ]   [ 0    1.00 0    0    1.00 0    0    0    0    1.00 1.00 0    0    0    0    ] [ 11  ]
[ 010 ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ 000 ]
[ 11  ]   [ 0    0    1.00 0    0    0    1.00 0    0    0    0    0    0    1.00 1.00 ] [ 100 ]
[ 011 ]   [ 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    ] [ 010 ]
[ 100 ]   [ 0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    0    ] [ 110 ]
[ 101 ]   [ 0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    ] [ 001 ]
[ 110 ]   [ 0    0    0    0    1.00 0    0    0    0    0    0    0    0    0    0    ] [ 101 ]
[ 111 ]   [ 0    0    0    0    0    0    1.00 0    0    0    0    0    0    0    0    ] [ 011 ]
[ x   ]   [ 0    1.00 1.00 0    0    0    0    0    0    0    0    0    0    0    0    ] [ 111 ]
|matrix>
-- heh. Still has the sort order bug!
# 1/5/2014:
# to-value and to-category (maybe come up with better names!)
#
# to-value |> => |>
# to-value |19> => 19| >                       -- NB the space, cf to-number
# to-value |age: 23> => 23|age>
# to-value |age: 23.5> => 23.5|age>
# to-value |string> => |string> or 0| >        -- currently the first one.
# to-value |cat: val> => |cat: val> or 0|cat>
# to-value |cat1: cat2: 13> => 13|cat1: cat2>
#
# to-category 57| > => |57>
# to-category |age> => |age: 1>
# to-category 23|age> => |age: 23>

def to_value(one):    # tested. Seems to work as desired!
    # do we need one = one.ket() here?
    cat, value = extract_category_value(one.label)
    print("cat: " + cat)
    print("value: " + value)
    if len(cat) == 0:
        label = " "
    else:
        label = cat
    try:
        x = float(value)
        return ket(label, x)
    except ValueError:
        return one

def to_category(one):
    # do we need one = one.ket() here?
    label = one.label
    if label in ["", " "]:    # maybe label.strip() == ""?
        label = ""            # Also, stop using -- for comments in python!
    else:
        label += ": "
    return ket(label + "%.3f" % one.value)
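To make the intended behaviour concrete, here is a standalone toy version that works on bare labels instead of the ket class (an assumption-laden sketch of my own; extract_category_value presumably does something close to the rpartition below):

def to_value(label, value=1.0):
    # split "cat1: cat2: 13" into category "cat1: cat2" and value "13"
    cat, _, val = label.rpartition(": ")
    try:
        return (cat if cat else " ", float(val))    # |age: 23> -> 23|age>
    except ValueError:
        return (label, value)                       # non-numeric label: leave unchanged

print(to_value("age: 23"))           # ('age', 23.0)
print(to_value("cat1: cat2: 13"))    # ('cat1: cat2', 13.0)
print(to_value("19"))                # (' ', 19.0) -- NB the space label
print(to_value("string"))            # ('string', 1.0)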
count-1 |semantic-2-64k> => |number: 4580> count-2 |semantic-2-64k> => |number: 739> count-3 |semantic-2-64k> => |number: 304> count-4 |semantic-2-64k> => |number: 207> count-5 |semantic-2-64k> => |number: 140> count-6 |semantic-2-64k> => |number: 87> count-7 |semantic-2-64k> => |number: 63> count-8 |semantic-2-64k> => |number: 48> count-9 |semantic-2-64k> => |number: 35> count-10 |semantic-2-64k> => |number: 35> count-1 |eztv-1-64k> => |number: 2888> count-2 |eztv-1-64k> => |number: 152> count-3 |eztv-1-64k> => |number: 47> count-4 |eztv-1-64k> => |number: 32> count-5 |eztv-1-64k> => |number: 28> count-6 |eztv-1-64k> => |number: 27> count-7 |eztv-1-64k> => |number: 26> count-8 |eztv-1-64k> => |number: 23> count-9 |eztv-1-64k> => |number: 22> count-10 |eztv-1-64k> => |number: 22> count-1 |slashdot-3-64k> => |number: 1203> count-2 |slashdot-3-64k> => |number: 165> count-3 |slashdot-3-64k> => |number: 104> count-4 |slashdot-3-64k> => |number: 89> count-5 |slashdot-3-64k> => |number: 80> count-6 |slashdot-3-64k> => |number: 70> count-7 |slashdot-3-64k> => |number: 55> count-8 |slashdot-3-64k> => |number: 52> count-9 |slashdot-3-64k> => |number: 46> count-10 |slashdot-3-64k> => |number: 45> count-1 |slashdot-1-64k> => |number: 1179> count-2 |slashdot-1-64k> => |number: 159> count-3 |slashdot-1-64k> => |number: 92> count-4 |slashdot-1-64k> => |number: 79> count-5 |slashdot-1-64k> => |number: 72> count-6 |slashdot-1-64k> => |number: 65> count-7 |slashdot-1-64k> => |number: 55> count-8 |slashdot-1-64k> => |number: 52> count-9 |slashdot-1-64k> => |number: 48> count-10 |slashdot-1-64k> => |number: 47> count-1 |wc-comments-2-64k> => |number: 528> count-2 |wc-comments-2-64k> => |number: 99> count-3 |wc-comments-2-64k> => |number: 54> count-4 |wc-comments-2-64k> => |number: 44> count-5 |wc-comments-2-64k> => |number: 38> count-6 |wc-comments-2-64k> => |number: 36> count-7 |wc-comments-2-64k> => |number: 34> count-8 |wc-comments-2-64k> => |number: 32> count-9 |wc-comments-2-64k> => |number: 31> count-10 |wc-comments-2-64k> => |number: 29> count-1 |diary-1-64k> => |number: 629> count-2 |diary-1-64k> => |number: 157> count-3 |diary-1-64k> => |number: 90> count-4 |diary-1-64k> => |number: 77> count-5 |diary-1-64k> => |number: 66> count-6 |diary-1-64k> => |number: 62> count-7 |diary-1-64k> => |number: 62> count-8 |diary-1-64k> => |number: 58> count-9 |diary-1-64k> => |number: 56> count-10 |diary-1-64k> => |number: 56> count-1 |eztv-2-64k> => |number: 2919> count-2 |eztv-2-64k> => |number: 182> count-3 |eztv-2-64k> => |number: 46> count-4 |eztv-2-64k> => |number: 34> count-5 |eztv-2-64k> => |number: 31> count-6 |eztv-2-64k> => |number: 30> count-7 |eztv-2-64k> => |number: 29> count-8 |eztv-2-64k> => |number: 26> count-9 |eztv-2-64k> => |number: 26> count-10 |eztv-2-64k> => |number: 26> count-1 |diary-2-64k> => |number: 657> count-2 |diary-2-64k> => |number: 156> count-3 |diary-2-64k> => |number: 89> count-4 |diary-2-64k> => |number: 76> count-5 |diary-2-64k> => |number: 65> count-6 |diary-2-64k> => |number: 62> count-7 |diary-2-64k> => |number: 62> count-8 |diary-2-64k> => |number: 58> count-9 |diary-2-64k> => |number: 56> count-10 |diary-2-64k> => |number: 56> count-1 |wc-comments-1-64k> => |number: 528> count-2 |wc-comments-1-64k> => |number: 99> count-3 |wc-comments-1-64k> => |number: 54> count-4 |wc-comments-1-64k> => |number: 44> count-5 |wc-comments-1-64k> => |number: 38> count-6 |wc-comments-1-64k> => |number: 36> count-7 |wc-comments-1-64k> => |number: 34> count-8 |wc-comments-1-64k> => |number: 32> 
count-9 |wc-comments-1-64k> => |number: 31>
count-10 |wc-comments-1-64k> => |number: 29>
count-1 |slashdot-2-64k> => |number: 1173>
count-2 |slashdot-2-64k> => |number: 162>
count-3 |slashdot-2-64k> => |number: 93>
count-4 |slashdot-2-64k> => |number: 79>
count-5 |slashdot-2-64k> => |number: 72>
count-6 |slashdot-2-64k> => |number: 65>
count-7 |slashdot-2-64k> => |number: 55>
count-8 |slashdot-2-64k> => |number: 52>
count-9 |slashdot-2-64k> => |number: 48>
count-10 |slashdot-2-64k> => |number: 47>
count-1 |semantic-1-64k> => |number: 2495>
count-2 |semantic-1-64k> => |number: 441>
count-3 |semantic-1-64k> => |number: 213>
count-4 |semantic-1-64k> => |number: 133>
count-5 |semantic-1-64k> => |number: 102>
count-6 |semantic-1-64k> => |number: 62>
count-7 |semantic-1-64k> => |number: 40>
count-8 |semantic-1-64k> => |number: 27>
count-9 |semantic-1-64k> => |number: 23>
count-10 |semantic-1-64k> => |number: 23>

And now the simm results:
drop-1-simm |semantic-2-64k> => 65.846|semantic-1-64k> + 8.067|diary-1-64k> + 8.021|diary-2-64k> + 6.087|eztv-1-64k> + 6.086|slashdot-3-64k> + 6.036|eztv-2-64k> + 5.741|slashdot-1-64k> + 5.722|slashdot-2-64k> + 4.875|wc-comments-2-64k> + 4.863|wc-comments-1-64k> drop-2-simm |semantic-2-64k> => 73.858|semantic-1-64k> + 11.679|diary-1-64k> + 11.633|diary-2-64k> + 8.002|slashdot-3-64k> + 7.878|slashdot-1-64k> + 7.871|slashdot-2-64k> + 7.543|wc-comments-2-64k> + 7.543|wc-comments-1-64k> + 6.206|eztv-1-64k> + 6.198|eztv-2-64k> drop-3-simm |semantic-2-64k> => 77.788|semantic-1-64k> + 13.761|diary-1-64k> + 13.708|diary-2-64k> + 9.388|slashdot-3-64k> + 9.331|slashdot-1-64k> + 9.328|slashdot-2-64k> + 8.981|wc-comments-2-64k> + 8.981|wc-comments-1-64k> + 7.461|eztv-1-64k> + 7.461|eztv-2-64k> drop-4-simm |semantic-2-64k> => 77.417|semantic-1-64k> + 14.565|diary-1-64k> + 14.510|diary-2-64k> + 10.068|slashdot-3-64k> + 10.004|slashdot-1-64k> + 10.004|slashdot-2-64k> + 9.532|wc-comments-2-64k> + 9.532|wc-comments-1-64k> + 8.093|eztv-1-64k> + 8.093|eztv-2-64k> drop-5-simm |semantic-2-64k> => 79.583|semantic-1-64k> + 15.292|diary-1-64k> + 15.236|diary-2-64k> + 10.793|slashdot-3-64k> + 10.720|slashdot-1-64k> + 10.720|slashdot-2-64k> + 9.826|wc-comments-2-64k> + 9.826|wc-comments-1-64k> + 8.778|eztv-1-64k> + 8.778|eztv-2-64k> drop-6-simm |semantic-2-64k> => 79.927|semantic-1-64k> + 16.338|diary-1-64k> + 16.267|diary-2-64k> + 11.652|slashdot-3-64k> + 11.561|slashdot-1-64k> + 11.561|slashdot-2-64k> + 10.001|wc-comments-2-64k> + 10.001|wc-comments-1-64k> + 9.580|eztv-1-64k> + 9.580|eztv-2-64k> drop-7-simm |semantic-2-64k> => 79.571|semantic-1-64k> + 16.886|diary-2-64k> + 16.705|diary-1-64k> + 11.901|slashdot-3-64k> + 11.771|slashdot-1-64k> + 11.771|slashdot-2-64k> + 10.171|wc-comments-2-64k> + 10.171|wc-comments-1-64k> + 10.080|eztv-1-64k> + 10.080|eztv-2-64k> drop-8-simm |semantic-2-64k> => 79.337|semantic-1-64k> + 17.309|diary-2-64k> + 17.050|diary-1-64k> + 12.324|slashdot-3-64k> + 12.191|slashdot-1-64k> + 12.191|slashdot-2-64k> + 10.479|eztv-1-64k> + 10.479|eztv-2-64k> + 9.567|wc-comments-2-64k> + 9.567|wc-comments-1-64k> drop-9-simm |semantic-2-64k> => 80.936|semantic-1-64k> + 17.558|diary-2-64k> + 17.299|diary-1-64k> + 12.807|slashdot-3-64k> + 12.653|slashdot-1-64k> + 12.653|slashdot-2-64k> + 10.906|eztv-1-64k> + 10.906|eztv-2-64k> + 9.686|wc-comments-2-64k> + 9.686|wc-comments-1-64k> drop-10-simm |semantic-2-64k> => 80.936|semantic-1-64k> + 17.558|diary-2-64k> + 17.299|diary-1-64k> + 12.818|slashdot-3-64k> + 12.663|slashdot-1-64k> + 12.663|slashdot-2-64k> + 10.906|eztv-1-64k> + 10.906|eztv-2-64k> + 9.866|wc-comments-2-64k> + 9.866|wc-comments-1-64k> drop-1-simm |eztv-1-64k> => 94.118|eztv-2-64k> + 14.042|slashdot-3-64k> + 13.544|slashdot-2-64k> + 13.489|slashdot-1-64k> + 11.737|diary-2-64k> + 11.588|diary-1-64k> + 11.527|wc-comments-2-64k> + 11.509|wc-comments-1-64k> + 7.677|semantic-1-64k> + 6.087|semantic-2-64k> drop-2-simm |eztv-1-64k> => 90.561|eztv-2-64k> + 20.942|slashdot-3-64k> + 20.170|slashdot-1-64k> + 20.155|slashdot-2-64k> + 15.597|wc-comments-2-64k> + 15.597|wc-comments-1-64k> + 14.809|diary-2-64k> + 14.520|diary-1-64k> + 9.051|semantic-1-64k> + 6.206|semantic-2-64k> drop-3-simm |eztv-1-64k> => 92.151|eztv-2-64k> + 21.999|slashdot-3-64k> + 21.427|slashdot-2-64k> + 21.404|slashdot-1-64k> + 16.428|wc-comments-2-64k> + 16.428|wc-comments-1-64k> + 15.645|diary-2-64k> + 15.345|diary-1-64k> + 10.699|semantic-1-64k> + 7.461|semantic-2-64k> drop-4-simm |eztv-1-64k> => 91.835|eztv-2-64k> + 
22.414|slashdot-3-64k> + 21.806|slashdot-2-64k> + 21.750|slashdot-1-64k> + 16.300|wc-comments-2-64k> + 16.300|wc-comments-1-64k> + 15.989|diary-2-64k> + 15.686|diary-1-64k> + 11.832|semantic-1-64k> + 8.093|semantic-2-64k> drop-5-simm |eztv-1-64k> => 91.655|eztv-2-64k> + 22.688|slashdot-3-64k> + 21.984|slashdot-2-64k> + 21.927|slashdot-1-64k> + 16.544|wc-comments-2-64k> + 16.544|wc-comments-1-64k> + 16.052|diary-2-64k> + 15.745|diary-1-64k> + 12.518|semantic-1-64k> + 8.778|semantic-2-64k> drop-6-simm |eztv-1-64k> => 91.816|eztv-2-64k> + 23.304|slashdot-3-64k> + 22.407|slashdot-2-64k> + 22.349|slashdot-1-64k> + 16.716|wc-comments-2-64k> + 16.716|wc-comments-1-64k> + 16.183|diary-2-64k> + 15.917|diary-1-64k> + 13.807|semantic-1-64k> + 9.580|semantic-2-64k> drop-7-simm |eztv-1-64k> => 91.800|eztv-2-64k> + 23.049|slashdot-3-64k> + 22.448|slashdot-2-64k> + 22.388|slashdot-1-64k> + 16.378|wc-comments-2-64k> + 16.378|wc-comments-1-64k> + 16.188|diary-2-64k> + 15.922|diary-1-64k> + 14.815|semantic-1-64k> + 10.080|semantic-2-64k> drop-8-simm |eztv-1-64k> => 91.744|eztv-2-64k> + 23.340|slashdot-3-64k> + 22.730|slashdot-2-64k> + 22.669|slashdot-1-64k> + 16.198|wc-comments-2-64k> + 16.198|wc-comments-1-64k> + 16.171|diary-2-64k> + 15.905|diary-1-64k> + 15.600|semantic-1-64k> + 10.479|semantic-2-64k> drop-9-simm |eztv-1-64k> => 91.443|eztv-2-64k> + 24.012|slashdot-3-64k> + 23.161|slashdot-2-64k> + 23.098|slashdot-1-64k> + 16.345|wc-comments-2-64k> + 16.345|wc-comments-1-64k> + 16.320|diary-2-64k> + 16.053|diary-1-64k> + 15.896|semantic-1-64k> + 10.906|semantic-2-64k> drop-10-simm |eztv-1-64k> => 91.443|eztv-2-64k> + 24.142|slashdot-3-64k> + 23.283|slashdot-2-64k> + 23.221|slashdot-1-64k> + 16.320|diary-2-64k> + 16.053|diary-1-64k> + 15.896|semantic-1-64k> + 15.643|wc-comments-2-64k> + 15.643|wc-comments-1-64k> + 10.906|semantic-2-64k> drop-1-simm |slashdot-3-64k> => 77.224|slashdot-2-64k> + 77.217|slashdot-1-64k> + 14.042|eztv-1-64k> + 13.832|eztv-2-64k> + 11.313|diary-2-64k> + 11.143|diary-1-64k> + 8.821|wc-comments-2-64k> + 8.821|wc-comments-1-64k> + 8.165|semantic-1-64k> + 6.086|semantic-2-64k> drop-2-simm |slashdot-3-64k> => 92.552|slashdot-2-64k> + 92.470|slashdot-1-64k> + 20.942|eztv-1-64k> + 20.751|eztv-2-64k> + 13.720|diary-2-64k> + 13.400|diary-1-64k> + 11.105|wc-comments-2-64k> + 11.105|wc-comments-1-64k> + 10.961|semantic-1-64k> + 8.002|semantic-2-64k> drop-3-simm |slashdot-3-64k> => 95.009|slashdot-1-64k> + 95.007|slashdot-2-64k> + 21.999|eztv-1-64k> + 21.835|eztv-2-64k> + 14.396|diary-2-64k> + 14.059|diary-1-64k> + 12.626|semantic-1-64k> + 10.926|wc-comments-2-64k> + 10.926|wc-comments-1-64k> + 9.388|semantic-2-64k> drop-4-simm |slashdot-3-64k> => 95.704|slashdot-2-64k> + 95.660|slashdot-1-64k> + 22.414|eztv-1-64k> + 22.253|eztv-2-64k> + 14.732|diary-2-64k> + 14.389|diary-1-64k> + 13.807|semantic-1-64k> + 10.940|wc-comments-2-64k> + 10.940|wc-comments-1-64k> + 10.068|semantic-2-64k> drop-5-simm |slashdot-3-64k> => 96.441|slashdot-2-64k> + 96.384|slashdot-1-64k> + 22.688|eztv-1-64k> + 22.540|eztv-2-64k> + 14.854|diary-2-64k> + 14.532|semantic-1-64k> + 14.503|diary-1-64k> + 11.211|wc-comments-2-64k> + 11.211|wc-comments-1-64k> + 10.793|semantic-2-64k> drop-6-simm |slashdot-3-64k> => 97.030|slashdot-2-64k> + 96.971|slashdot-1-64k> + 23.304|eztv-1-64k> + 23.156|eztv-2-64k> + 15.880|semantic-1-64k> + 15.074|diary-2-64k> + 14.752|diary-1-64k> + 11.652|semantic-2-64k> + 11.086|wc-comments-2-64k> + 11.086|wc-comments-1-64k> drop-7-simm |slashdot-3-64k> => 97.909|slashdot-2-64k> + 
97.854|slashdot-1-64k> + 23.049|eztv-1-64k> + 22.900|eztv-2-64k> + 16.636|semantic-1-64k> + 13.495|diary-2-64k> + 13.173|diary-1-64k> + 11.901|semantic-2-64k> + 10.224|wc-comments-2-64k> + 10.224|wc-comments-1-64k> drop-8-simm |slashdot-3-64k> => 97.882|slashdot-2-64k> + 97.827|slashdot-1-64k> + 23.340|eztv-1-64k> + 23.189|eztv-2-64k> + 17.445|semantic-1-64k> + 13.715|diary-2-64k> + 13.391|diary-1-64k> + 12.324|semantic-2-64k> + 9.594|wc-comments-2-64k> + 9.594|wc-comments-1-64k> drop-9-simm |slashdot-3-64k> => 97.884|slashdot-2-64k> + 97.839|slashdot-1-64k> + 24.012|eztv-1-64k> + 23.857|eztv-2-64k> + 17.797|semantic-1-64k> + 13.413|diary-2-64k> + 13.079|diary-1-64k> + 12.807|semantic-2-64k> + 9.704|wc-comments-2-64k> + 9.704|wc-comments-1-64k> drop-10-simm |slashdot-3-64k> => 97.872|slashdot-2-64k> + 97.827|slashdot-1-64k> + 24.142|eztv-1-64k> + 23.952|eztv-2-64k> + 17.808|semantic-1-64k> + 13.430|diary-2-64k> + 13.097|diary-1-64k> + 12.818|semantic-2-64k> + 9.890|wc-comments-2-64k> + 9.890|wc-comments-1-64k> drop-1-simm |slashdot-1-64k> => 96.165|slashdot-2-64k> + 77.217|slashdot-3-64k> + 13.489|eztv-1-64k> + 13.291|eztv-2-64k> + 10.937|diary-2-64k> + 10.801|diary-1-64k> + 8.392|wc-comments-2-64k> + 8.392|wc-comments-1-64k> + 7.855|semantic-1-64k> + 5.741|semantic-2-64k> drop-2-simm |slashdot-1-64k> => 97.959|slashdot-2-64k> + 92.470|slashdot-3-64k> + 20.170|eztv-1-64k> + 19.987|eztv-2-64k> + 12.980|diary-2-64k> + 12.660|diary-1-64k> + 10.793|semantic-1-64k> + 10.221|wc-comments-2-64k> + 10.221|wc-comments-1-64k> + 7.878|semantic-2-64k> drop-3-simm |slashdot-1-64k> => 99.536|slashdot-2-64k> + 95.009|slashdot-3-64k> + 21.404|eztv-1-64k> + 21.241|eztv-2-64k> + 13.739|diary-2-64k> + 13.402|diary-1-64k> + 12.569|semantic-1-64k> + 10.403|wc-comments-2-64k> + 10.403|wc-comments-1-64k> + 9.331|semantic-2-64k> drop-4-simm |slashdot-1-64k> => 99.494|slashdot-2-64k> + 95.660|slashdot-3-64k> + 21.750|eztv-1-64k> + 21.589|eztv-2-64k> + 14.049|diary-2-64k> + 13.743|semantic-1-64k> + 13.707|diary-1-64k> + 10.713|wc-comments-2-64k> + 10.713|wc-comments-1-64k> + 10.004|semantic-2-64k> drop-5-simm |slashdot-1-64k> => 99.943|slashdot-2-64k> + 96.384|slashdot-3-64k> + 21.927|eztv-1-64k> + 21.779|eztv-2-64k> + 14.459|semantic-1-64k> + 14.143|diary-2-64k> + 13.792|diary-1-64k> + 10.972|wc-comments-2-64k> + 10.972|wc-comments-1-64k> + 10.720|semantic-2-64k> drop-6-simm |slashdot-1-64k> => 99.942|slashdot-2-64k> + 96.971|slashdot-3-64k> + 22.349|eztv-1-64k> + 22.201|eztv-2-64k> + 15.789|semantic-1-64k> + 14.312|diary-2-64k> + 13.991|diary-1-64k> + 11.561|semantic-2-64k> + 11.111|wc-comments-2-64k> + 11.111|wc-comments-1-64k> drop-7-simm |slashdot-1-64k> => 99.940|slashdot-2-64k> + 97.854|slashdot-3-64k> + 22.388|eztv-1-64k> + 22.239|eztv-2-64k> + 16.506|semantic-1-64k> + 13.358|diary-2-64k> + 13.037|diary-1-64k> + 11.771|semantic-2-64k> + 10.218|wc-comments-2-64k> + 10.218|wc-comments-1-64k> drop-8-simm |slashdot-1-64k> => 99.939|slashdot-2-64k> + 97.827|slashdot-3-64k> + 22.669|eztv-1-64k> + 22.518|eztv-2-64k> + 17.312|semantic-1-64k> + 13.576|diary-2-64k> + 13.252|diary-1-64k> + 12.191|semantic-2-64k> + 9.589|wc-comments-2-64k> + 9.589|wc-comments-1-64k> drop-9-simm |slashdot-1-64k> => 99.938|slashdot-2-64k> + 97.839|slashdot-3-64k> + 23.098|eztv-1-64k> + 22.943|eztv-2-64k> + 17.642|semantic-1-64k> + 13.242|diary-2-64k> + 12.908|diary-1-64k> + 12.653|semantic-2-64k> + 9.687|wc-comments-2-64k> + 9.687|wc-comments-1-64k> drop-10-simm |slashdot-1-64k> => 99.937|slashdot-2-64k> + 97.827|slashdot-3-64k> + 
23.221|eztv-1-64k> + 23.065|eztv-2-64k> + 17.652|semantic-1-64k> + 13.258|diary-2-64k> + 12.924|diary-1-64k> + 12.663|semantic-2-64k> + 9.873|wc-comments-2-64k> + 9.873|wc-comments-1-64k> drop-1-simm |wc-comments-2-64k> => 99.533|wc-comments-1-64k> + 39.457|diary-1-64k> + 39.239|diary-2-64k> + 11.952|eztv-2-64k> + 11.527|eztv-1-64k> + 8.821|slashdot-3-64k> + 8.460|slashdot-2-64k> + 8.392|slashdot-1-64k> + 6.678|semantic-1-64k> + 4.875|semantic-2-64k> drop-2-simm |wc-comments-2-64k> => 99.906|wc-comments-1-64k> + 42.452|diary-2-64k> + 42.446|diary-1-64k> + 16.382|eztv-2-64k> + 15.597|eztv-1-64k> + 11.105|slashdot-3-64k> + 10.221|slashdot-1-64k> + 10.214|slashdot-2-64k> + 8.796|semantic-1-64k> + 7.543|semantic-2-64k> drop-3-simm |wc-comments-2-64k> => 99.898|wc-comments-1-64k> + 41.866|diary-2-64k> + 41.846|diary-1-64k> + 17.344|eztv-2-64k> + 16.428|eztv-1-64k> + 10.926|slashdot-3-64k> + 10.403|slashdot-1-64k> + 10.400|slashdot-2-64k> + 9.476|semantic-1-64k> + 8.981|semantic-2-64k> drop-4-simm |wc-comments-2-64k> => 99.895|wc-comments-1-64k> + 41.388|diary-2-64k> + 41.363|diary-1-64k> + 17.221|eztv-2-64k> + 16.300|eztv-1-64k> + 10.940|slashdot-3-64k> + 10.713|slashdot-1-64k> + 10.713|slashdot-2-64k> + 9.717|semantic-1-64k> + 9.532|semantic-2-64k> drop-5-simm |wc-comments-2-64k> => 99.892|wc-comments-1-64k> + 41.749|diary-2-64k> + 41.721|diary-1-64k> + 17.623|eztv-2-64k> + 16.544|eztv-1-64k> + 11.211|slashdot-3-64k> + 10.972|slashdot-1-64k> + 10.972|slashdot-2-64k> + 10.001|semantic-1-64k> + 9.826|semantic-2-64k> drop-6-simm |wc-comments-2-64k> => 99.891|wc-comments-1-64k> + 42.199|diary-1-64k> + 42.175|diary-2-64k> + 17.797|eztv-2-64k> + 16.716|eztv-1-64k> + 11.111|slashdot-1-64k> + 11.111|slashdot-2-64k> + 11.086|slashdot-3-64k> + 10.204|semantic-1-64k> + 10.001|semantic-2-64k> drop-7-simm |wc-comments-2-64k> => 99.889|wc-comments-1-64k> + 41.206|diary-1-64k> + 41.181|diary-2-64k> + 17.521|eztv-2-64k> + 16.378|eztv-1-64k> + 10.408|semantic-1-64k> + 10.224|slashdot-3-64k> + 10.218|slashdot-1-64k> + 10.218|slashdot-2-64k> + 10.171|semantic-2-64k> drop-8-simm |wc-comments-2-64k> => 99.888|wc-comments-1-64k> + 40.310|diary-1-64k> + 40.280|diary-2-64k> + 17.396|eztv-2-64k> + 16.198|eztv-1-64k> + 9.594|slashdot-3-64k> + 9.589|slashdot-1-64k> + 9.589|slashdot-2-64k> + 9.567|semantic-2-64k> + 9.422|semantic-1-64k> drop-9-simm |wc-comments-2-64k> => 99.886|wc-comments-1-64k> + 39.936|diary-1-64k> + 39.915|diary-2-64k> + 17.543|eztv-2-64k> + 16.345|eztv-1-64k> + 9.704|slashdot-3-64k> + 9.687|slashdot-1-64k> + 9.687|slashdot-2-64k> + 9.686|semantic-2-64k> + 9.516|semantic-1-64k> drop-10-simm |wc-comments-2-64k> => 99.884|wc-comments-1-64k> + 38.335|diary-1-64k> + 38.315|diary-2-64k> + 16.841|eztv-2-64k> + 15.643|eztv-1-64k> + 9.890|slashdot-3-64k> + 9.873|slashdot-1-64k> + 9.873|slashdot-2-64k> + 9.866|semantic-2-64k> + 9.696|semantic-1-64k> drop-1-simm |diary-1-64k> => 95.959|diary-2-64k> + 39.457|wc-comments-2-64k> + 39.457|wc-comments-1-64k> + 12.996|eztv-2-64k> + 11.588|eztv-1-64k> + 11.143|slashdot-3-64k> + 10.801|slashdot-1-64k> + 10.735|slashdot-2-64k> + 10.374|semantic-1-64k> + 8.067|semantic-2-64k> drop-2-simm |diary-1-64k> => 97.965|diary-2-64k> + 42.446|wc-comments-2-64k> + 42.446|wc-comments-1-64k> + 17.071|eztv-2-64k> + 14.835|semantic-1-64k> + 14.520|eztv-1-64k> + 13.400|slashdot-3-64k> + 12.660|slashdot-1-64k> + 12.648|slashdot-2-64k> + 11.679|semantic-2-64k> drop-3-simm |diary-1-64k> => 98.515|diary-2-64k> + 41.846|wc-comments-2-64k> + 41.846|wc-comments-1-64k> + 18.197|eztv-2-64k> + 
16.206|semantic-1-64k> + 15.345|eztv-1-64k> + 14.059|slashdot-3-64k> + 13.761|semantic-2-64k> + 13.402|slashdot-1-64k> + 13.396|slashdot-2-64k> drop-4-simm |diary-1-64k> => 98.311|diary-2-64k> + 41.363|wc-comments-2-64k> + 41.363|wc-comments-1-64k> + 18.562|eztv-2-64k> + 16.572|semantic-1-64k> + 15.686|eztv-1-64k> + 14.565|semantic-2-64k> + 14.389|slashdot-3-64k> + 13.707|slashdot-1-64k> + 13.707|slashdot-2-64k> drop-5-simm |diary-1-64k> => 98.754|diary-2-64k> + 41.721|wc-comments-2-64k> + 41.721|wc-comments-1-64k> + 18.801|eztv-2-64k> + 16.819|semantic-1-64k> + 15.745|eztv-1-64k> + 15.292|semantic-2-64k> + 14.503|slashdot-3-64k> + 13.792|slashdot-1-64k> + 13.792|slashdot-2-64k> drop-6-simm |diary-1-64k> => 98.822|diary-2-64k> + 42.199|wc-comments-2-64k> + 42.199|wc-comments-1-64k> + 18.978|eztv-2-64k> + 17.165|semantic-1-64k> + 16.338|semantic-2-64k> + 15.917|eztv-1-64k> + 14.752|slashdot-3-64k> + 13.991|slashdot-1-64k> + 13.991|slashdot-2-64k> drop-7-simm |diary-1-64k> => 98.822|diary-2-64k> + 41.206|wc-comments-2-64k> + 41.206|wc-comments-1-64k> + 18.989|eztv-2-64k> + 17.248|semantic-1-64k> + 16.705|semantic-2-64k> + 15.922|eztv-1-64k> + 13.173|slashdot-3-64k> + 13.037|slashdot-1-64k> + 13.037|slashdot-2-64k> drop-8-simm |diary-1-64k> => 98.809|diary-2-64k> + 40.310|wc-comments-2-64k> + 40.310|wc-comments-1-64k> + 19.023|eztv-2-64k> + 17.186|semantic-1-64k> + 17.050|semantic-2-64k> + 15.905|eztv-1-64k> + 13.391|slashdot-3-64k> + 13.252|slashdot-1-64k> + 13.252|slashdot-2-64k> drop-9-simm |diary-1-64k> => 98.801|diary-2-64k> + 39.936|wc-comments-2-64k> + 39.936|wc-comments-1-64k> + 19.165|eztv-2-64k> + 17.367|semantic-1-64k> + 17.299|semantic-2-64k> + 16.053|eztv-1-64k> + 13.079|slashdot-3-64k> + 12.908|slashdot-1-64k> + 12.908|slashdot-2-64k> drop-10-simm |diary-1-64k> => 98.801|diary-2-64k> + 38.335|wc-comments-2-64k> + 38.335|wc-comments-1-64k> + 19.165|eztv-2-64k> + 17.367|semantic-1-64k> + 17.299|semantic-2-64k> + 16.053|eztv-1-64k> + 13.097|slashdot-3-64k> + 12.924|slashdot-1-64k> + 12.924|slashdot-2-64k> drop-1-simm |eztv-2-64k> => 94.118|eztv-1-64k> + 13.832|slashdot-3-64k> + 13.343|slashdot-2-64k> + 13.291|slashdot-1-64k> + 13.143|diary-2-64k> + 12.996|diary-1-64k> + 11.952|wc-comments-2-64k> + 11.935|wc-comments-1-64k> + 7.528|semantic-1-64k> + 6.036|semantic-2-64k> drop-2-simm |eztv-2-64k> => 90.561|eztv-1-64k> + 20.751|slashdot-3-64k> + 19.987|slashdot-1-64k> + 19.972|slashdot-2-64k> + 17.360|diary-2-64k> + 17.071|diary-1-64k> + 16.382|wc-comments-2-64k> + 16.382|wc-comments-1-64k> + 9.051|semantic-1-64k> + 6.198|semantic-2-64k> drop-3-simm |eztv-2-64k> => 92.151|eztv-1-64k> + 21.835|slashdot-3-64k> + 21.264|slashdot-2-64k> + 21.241|slashdot-1-64k> + 18.497|diary-2-64k> + 18.197|diary-1-64k> + 17.344|wc-comments-2-64k> + 17.344|wc-comments-1-64k> + 10.699|semantic-1-64k> + 7.461|semantic-2-64k> drop-4-simm |eztv-2-64k> => 91.835|eztv-1-64k> + 22.253|slashdot-3-64k> + 21.646|slashdot-2-64k> + 21.589|slashdot-1-64k> + 18.865|diary-2-64k> + 18.562|diary-1-64k> + 17.221|wc-comments-2-64k> + 17.221|wc-comments-1-64k> + 11.832|semantic-1-64k> + 8.093|semantic-2-64k> drop-5-simm |eztv-2-64k> => 91.655|eztv-1-64k> + 22.540|slashdot-3-64k> + 21.836|slashdot-2-64k> + 21.779|slashdot-1-64k> + 19.107|diary-2-64k> + 18.801|diary-1-64k> + 17.623|wc-comments-2-64k> + 17.623|wc-comments-1-64k> + 12.518|semantic-1-64k> + 8.778|semantic-2-64k> drop-6-simm |eztv-2-64k> => 91.816|eztv-1-64k> + 23.156|slashdot-3-64k> + 22.259|slashdot-2-64k> + 22.201|slashdot-1-64k> + 19.244|diary-2-64k> + 
18.978|diary-1-64k> + 17.797|wc-comments-2-64k> + 17.797|wc-comments-1-64k> + 13.807|semantic-1-64k> + 9.580|semantic-2-64k> drop-7-simm |eztv-2-64k> => 91.800|eztv-1-64k> + 22.900|slashdot-3-64k> + 22.299|slashdot-2-64k> + 22.239|slashdot-1-64k> + 19.254|diary-2-64k> + 18.989|diary-1-64k> + 17.521|wc-comments-2-64k> + 17.521|wc-comments-1-64k> + 14.815|semantic-1-64k> + 10.080|semantic-2-64k> drop-8-simm |eztv-2-64k> => 91.744|eztv-1-64k> + 23.189|slashdot-3-64k> + 22.579|slashdot-2-64k> + 22.518|slashdot-1-64k> + 19.289|diary-2-64k> + 19.023|diary-1-64k> + 17.396|wc-comments-2-64k> + 17.396|wc-comments-1-64k> + 15.600|semantic-1-64k> + 10.479|semantic-2-64k> drop-9-simm |eztv-2-64k> => 91.443|eztv-1-64k> + 23.857|slashdot-3-64k> + 23.005|slashdot-2-64k> + 22.943|slashdot-1-64k> + 19.432|diary-2-64k> + 19.165|diary-1-64k> + 17.543|wc-comments-2-64k> + 17.543|wc-comments-1-64k> + 15.896|semantic-1-64k> + 10.906|semantic-2-64k> drop-10-simm |eztv-2-64k> => 91.443|eztv-1-64k> + 23.952|slashdot-3-64k> + 23.128|slashdot-2-64k> + 23.065|slashdot-1-64k> + 19.432|diary-2-64k> + 19.165|diary-1-64k> + 16.841|wc-comments-2-64k> + 16.841|wc-comments-1-64k> + 15.896|semantic-1-64k> + 10.906|semantic-2-64k> drop-1-simm |diary-2-64k> => 95.959|diary-1-64k> + 39.239|wc-comments-2-64k> + 39.239|wc-comments-1-64k> + 13.143|eztv-2-64k> + 11.737|eztv-1-64k> + 11.313|slashdot-3-64k> + 10.937|slashdot-1-64k> + 10.872|slashdot-2-64k> + 10.305|semantic-1-64k> + 8.021|semantic-2-64k> drop-2-simm |diary-2-64k> => 97.965|diary-1-64k> + 42.452|wc-comments-2-64k> + 42.452|wc-comments-1-64k> + 17.360|eztv-2-64k> + 15.118|semantic-1-64k> + 14.809|eztv-1-64k> + 13.720|slashdot-3-64k> + 12.980|slashdot-1-64k> + 12.968|slashdot-2-64k> + 11.633|semantic-2-64k> drop-3-simm |diary-2-64k> => 98.515|diary-1-64k> + 41.866|wc-comments-2-64k> + 41.866|wc-comments-1-64k> + 18.497|eztv-2-64k> + 16.499|semantic-1-64k> + 15.645|eztv-1-64k> + 14.396|slashdot-3-64k> + 13.739|slashdot-1-64k> + 13.734|slashdot-2-64k> + 13.708|semantic-2-64k> drop-4-simm |diary-2-64k> => 98.311|diary-1-64k> + 41.388|wc-comments-2-64k> + 41.388|wc-comments-1-64k> + 18.865|eztv-2-64k> + 16.951|semantic-1-64k> + 15.989|eztv-1-64k> + 14.732|slashdot-3-64k> + 14.510|semantic-2-64k> + 14.049|slashdot-1-64k> + 14.049|slashdot-2-64k> drop-5-simm |diary-2-64k> => 98.754|diary-1-64k> + 41.749|wc-comments-2-64k> + 41.749|wc-comments-1-64k> + 19.107|eztv-2-64k> + 17.242|semantic-1-64k> + 16.052|eztv-1-64k> + 15.236|semantic-2-64k> + 14.854|slashdot-3-64k> + 14.143|slashdot-1-64k> + 14.143|slashdot-2-64k> drop-6-simm |diary-2-64k> => 98.822|diary-1-64k> + 42.175|wc-comments-2-64k> + 42.175|wc-comments-1-64k> + 19.244|eztv-2-64k> + 17.648|semantic-1-64k> + 16.267|semantic-2-64k> + 16.183|eztv-1-64k> + 15.074|slashdot-3-64k> + 14.312|slashdot-1-64k> + 14.312|slashdot-2-64k> drop-7-simm |diary-2-64k> => 98.822|diary-1-64k> + 41.181|wc-comments-2-64k> + 41.181|wc-comments-1-64k> + 19.254|eztv-2-64k> + 17.829|semantic-1-64k> + 16.886|semantic-2-64k> + 16.188|eztv-1-64k> + 13.495|slashdot-3-64k> + 13.358|slashdot-1-64k> + 13.358|slashdot-2-64k> drop-8-simm |diary-2-64k> => 98.809|diary-1-64k> + 40.280|wc-comments-2-64k> + 40.280|wc-comments-1-64k> + 19.289|eztv-2-64k> + 17.823|semantic-1-64k> + 17.309|semantic-2-64k> + 16.171|eztv-1-64k> + 13.715|slashdot-3-64k> + 13.576|slashdot-1-64k> + 13.576|slashdot-2-64k> drop-9-simm |diary-2-64k> => 98.801|diary-1-64k> + 39.915|wc-comments-2-64k> + 39.915|wc-comments-1-64k> + 19.432|eztv-2-64k> + 18.022|semantic-1-64k> + 
17.558|semantic-2-64k> + 16.320|eztv-1-64k> + 13.413|slashdot-3-64k> + 13.242|slashdot-1-64k> + 13.242|slashdot-2-64k> drop-10-simm |diary-2-64k> => 98.801|diary-1-64k> + 38.315|wc-comments-2-64k> + 38.315|wc-comments-1-64k> + 19.432|eztv-2-64k> + 18.022|semantic-1-64k> + 17.558|semantic-2-64k> + 16.320|eztv-1-64k> + 13.430|slashdot-3-64k> + 13.258|slashdot-1-64k> + 13.258|slashdot-2-64k> drop-1-simm |wc-comments-1-64k> => 99.533|wc-comments-2-64k> + 39.457|diary-1-64k> + 39.239|diary-2-64k> + 11.935|eztv-2-64k> + 11.509|eztv-1-64k> + 8.821|slashdot-3-64k> + 8.460|slashdot-2-64k> + 8.392|slashdot-1-64k> + 6.658|semantic-1-64k> + 4.863|semantic-2-64k> drop-2-simm |wc-comments-1-64k> => 99.906|wc-comments-2-64k> + 42.452|diary-2-64k> + 42.446|diary-1-64k> + 16.382|eztv-2-64k> + 15.597|eztv-1-64k> + 11.105|slashdot-3-64k> + 10.221|slashdot-1-64k> + 10.214|slashdot-2-64k> + 8.796|semantic-1-64k> + 7.543|semantic-2-64k> drop-3-simm |wc-comments-1-64k> => 99.898|wc-comments-2-64k> + 41.866|diary-2-64k> + 41.846|diary-1-64k> + 17.344|eztv-2-64k> + 16.428|eztv-1-64k> + 10.926|slashdot-3-64k> + 10.403|slashdot-1-64k> + 10.400|slashdot-2-64k> + 9.476|semantic-1-64k> + 8.981|semantic-2-64k> drop-4-simm |wc-comments-1-64k> => 99.895|wc-comments-2-64k> + 41.388|diary-2-64k> + 41.363|diary-1-64k> + 17.221|eztv-2-64k> + 16.300|eztv-1-64k> + 10.940|slashdot-3-64k> + 10.713|slashdot-1-64k> + 10.713|slashdot-2-64k> + 9.717|semantic-1-64k> + 9.532|semantic-2-64k> drop-5-simm |wc-comments-1-64k> => 99.892|wc-comments-2-64k> + 41.749|diary-2-64k> + 41.721|diary-1-64k> + 17.623|eztv-2-64k> + 16.544|eztv-1-64k> + 11.211|slashdot-3-64k> + 10.972|slashdot-1-64k> + 10.972|slashdot-2-64k> + 10.001|semantic-1-64k> + 9.826|semantic-2-64k> drop-6-simm |wc-comments-1-64k> => 99.891|wc-comments-2-64k> + 42.199|diary-1-64k> + 42.175|diary-2-64k> + 17.797|eztv-2-64k> + 16.716|eztv-1-64k> + 11.111|slashdot-1-64k> + 11.111|slashdot-2-64k> + 11.086|slashdot-3-64k> + 10.204|semantic-1-64k> + 10.001|semantic-2-64k> drop-7-simm |wc-comments-1-64k> => 99.889|wc-comments-2-64k> + 41.206|diary-1-64k> + 41.181|diary-2-64k> + 17.521|eztv-2-64k> + 16.378|eztv-1-64k> + 10.408|semantic-1-64k> + 10.224|slashdot-3-64k> + 10.218|slashdot-1-64k> + 10.218|slashdot-2-64k> + 10.171|semantic-2-64k> drop-8-simm |wc-comments-1-64k> => 99.888|wc-comments-2-64k> + 40.310|diary-1-64k> + 40.280|diary-2-64k> + 17.396|eztv-2-64k> + 16.198|eztv-1-64k> + 9.594|slashdot-3-64k> + 9.589|slashdot-1-64k> + 9.589|slashdot-2-64k> + 9.567|semantic-2-64k> + 9.422|semantic-1-64k> drop-9-simm |wc-comments-1-64k> => 99.886|wc-comments-2-64k> + 39.936|diary-1-64k> + 39.915|diary-2-64k> + 17.543|eztv-2-64k> + 16.345|eztv-1-64k> + 9.704|slashdot-3-64k> + 9.687|slashdot-1-64k> + 9.687|slashdot-2-64k> + 9.686|semantic-2-64k> + 9.516|semantic-1-64k> drop-10-simm |wc-comments-1-64k> => 99.884|wc-comments-2-64k> + 38.335|diary-1-64k> + 38.315|diary-2-64k> + 16.841|eztv-2-64k> + 15.643|eztv-1-64k> + 9.890|slashdot-3-64k> + 9.873|slashdot-1-64k> + 9.873|slashdot-2-64k> + 9.866|semantic-2-64k> + 9.696|semantic-1-64k> drop-1-simm |slashdot-2-64k> => 96.165|slashdot-1-64k> + 77.224|slashdot-3-64k> + 13.544|eztv-1-64k> + 13.343|eztv-2-64k> + 10.872|diary-2-64k> + 10.735|diary-1-64k> + 8.460|wc-comments-2-64k> + 8.460|wc-comments-1-64k> + 7.896|semantic-1-64k> + 5.722|semantic-2-64k> drop-2-simm |slashdot-2-64k> => 97.959|slashdot-1-64k> + 92.552|slashdot-3-64k> + 20.155|eztv-1-64k> + 19.972|eztv-2-64k> + 12.968|diary-2-64k> + 12.648|diary-1-64k> + 10.787|semantic-1-64k> + 
10.214|wc-comments-2-64k> + 10.214|wc-comments-1-64k> + 7.871|semantic-2-64k> drop-3-simm |slashdot-2-64k> => 99.536|slashdot-1-64k> + 95.007|slashdot-3-64k> + 21.427|eztv-1-64k> + 21.264|eztv-2-64k> + 13.734|diary-2-64k> + 13.396|diary-1-64k> + 12.566|semantic-1-64k> + 10.400|wc-comments-2-64k> + 10.400|wc-comments-1-64k> + 9.328|semantic-2-64k> drop-4-simm |slashdot-2-64k> => 99.494|slashdot-1-64k> + 95.704|slashdot-3-64k> + 21.806|eztv-1-64k> + 21.646|eztv-2-64k> + 14.049|diary-2-64k> + 13.743|semantic-1-64k> + 13.707|diary-1-64k> + 10.713|wc-comments-2-64k> + 10.713|wc-comments-1-64k> + 10.004|semantic-2-64k> drop-5-simm |slashdot-2-64k> => 99.943|slashdot-1-64k> + 96.441|slashdot-3-64k> + 21.984|eztv-1-64k> + 21.836|eztv-2-64k> + 14.459|semantic-1-64k> + 14.143|diary-2-64k> + 13.792|diary-1-64k> + 10.972|wc-comments-2-64k> + 10.972|wc-comments-1-64k> + 10.720|semantic-2-64k> drop-6-simm |slashdot-2-64k> => 99.942|slashdot-1-64k> + 97.030|slashdot-3-64k> + 22.407|eztv-1-64k> + 22.259|eztv-2-64k> + 15.789|semantic-1-64k> + 14.312|diary-2-64k> + 13.991|diary-1-64k> + 11.561|semantic-2-64k> + 11.111|wc-comments-2-64k> + 11.111|wc-comments-1-64k> drop-7-simm |slashdot-2-64k> => 99.940|slashdot-1-64k> + 97.909|slashdot-3-64k> + 22.448|eztv-1-64k> + 22.299|eztv-2-64k> + 16.506|semantic-1-64k> + 13.358|diary-2-64k> + 13.037|diary-1-64k> + 11.771|semantic-2-64k> + 10.218|wc-comments-2-64k> + 10.218|wc-comments-1-64k> drop-8-simm |slashdot-2-64k> => 99.939|slashdot-1-64k> + 97.882|slashdot-3-64k> + 22.730|eztv-1-64k> + 22.579|eztv-2-64k> + 17.312|semantic-1-64k> + 13.576|diary-2-64k> + 13.252|diary-1-64k> + 12.191|semantic-2-64k> + 9.589|wc-comments-2-64k> + 9.589|wc-comments-1-64k> drop-9-simm |slashdot-2-64k> => 99.938|slashdot-1-64k> + 97.884|slashdot-3-64k> + 23.161|eztv-1-64k> + 23.005|eztv-2-64k> + 17.642|semantic-1-64k> + 13.242|diary-2-64k> + 12.908|diary-1-64k> + 12.653|semantic-2-64k> + 9.687|wc-comments-2-64k> + 9.687|wc-comments-1-64k> drop-10-simm |slashdot-2-64k> => 99.937|slashdot-1-64k> + 97.872|slashdot-3-64k> + 23.283|eztv-1-64k> + 23.128|eztv-2-64k> + 17.652|semantic-1-64k> + 13.258|diary-2-64k> + 12.924|diary-1-64k> + 12.663|semantic-2-64k> + 9.873|wc-comments-2-64k> + 9.873|wc-comments-1-64k> drop-1-simm |semantic-1-64k> => 65.846|semantic-2-64k> + 10.374|diary-1-64k> + 10.305|diary-2-64k> + 8.165|slashdot-3-64k> + 7.896|slashdot-2-64k> + 7.855|slashdot-1-64k> + 7.677|eztv-1-64k> + 7.528|eztv-2-64k> + 6.678|wc-comments-2-64k> + 6.658|wc-comments-1-64k> drop-2-simm |semantic-1-64k> => 73.858|semantic-2-64k> + 15.118|diary-2-64k> + 14.835|diary-1-64k> + 10.961|slashdot-3-64k> + 10.793|slashdot-1-64k> + 10.787|slashdot-2-64k> + 9.051|eztv-1-64k> + 9.051|eztv-2-64k> + 8.796|wc-comments-2-64k> + 8.796|wc-comments-1-64k> drop-3-simm |semantic-1-64k> => 77.788|semantic-2-64k> + 16.499|diary-2-64k> + 16.206|diary-1-64k> + 12.626|slashdot-3-64k> + 12.569|slashdot-1-64k> + 12.566|slashdot-2-64k> + 10.699|eztv-1-64k> + 10.699|eztv-2-64k> + 9.476|wc-comments-2-64k> + 9.476|wc-comments-1-64k> drop-4-simm |semantic-1-64k> => 77.417|semantic-2-64k> + 16.951|diary-2-64k> + 16.572|diary-1-64k> + 13.807|slashdot-3-64k> + 13.743|slashdot-1-64k> + 13.743|slashdot-2-64k> + 11.832|eztv-1-64k> + 11.832|eztv-2-64k> + 9.717|wc-comments-2-64k> + 9.717|wc-comments-1-64k> drop-5-simm |semantic-1-64k> => 79.583|semantic-2-64k> + 17.242|diary-2-64k> + 16.819|diary-1-64k> + 14.532|slashdot-3-64k> + 14.459|slashdot-1-64k> + 14.459|slashdot-2-64k> + 12.518|eztv-1-64k> + 12.518|eztv-2-64k> + 
10.001|wc-comments-2-64k> + 10.001|wc-comments-1-64k> drop-6-simm |semantic-1-64k> => 79.927|semantic-2-64k> + 17.648|diary-2-64k> + 17.165|diary-1-64k> + 15.880|slashdot-3-64k> + 15.789|slashdot-1-64k> + 15.789|slashdot-2-64k> + 13.807|eztv-1-64k> + 13.807|eztv-2-64k> + 10.204|wc-comments-2-64k> + 10.204|wc-comments-1-64k> drop-7-simm |semantic-1-64k> => 79.571|semantic-2-64k> + 17.829|diary-2-64k> + 17.248|diary-1-64k> + 16.636|slashdot-3-64k> + 16.506|slashdot-1-64k> + 16.506|slashdot-2-64k> + 14.815|eztv-1-64k> + 14.815|eztv-2-64k> + 10.408|wc-comments-2-64k> + 10.408|wc-comments-1-64k> drop-8-simm |semantic-1-64k> => 79.337|semantic-2-64k> + 17.823|diary-2-64k> + 17.445|slashdot-3-64k> + 17.312|slashdot-1-64k> + 17.312|slashdot-2-64k> + 17.186|diary-1-64k> + 15.600|eztv-1-64k> + 15.600|eztv-2-64k> + 9.422|wc-comments-2-64k> + 9.422|wc-comments-1-64k> drop-9-simm |semantic-1-64k> => 80.936|semantic-2-64k> + 18.022|diary-2-64k> + 17.797|slashdot-3-64k> + 17.642|slashdot-1-64k> + 17.642|slashdot-2-64k> + 17.367|diary-1-64k> + 15.896|eztv-1-64k> + 15.896|eztv-2-64k> + 9.516|wc-comments-2-64k> + 9.516|wc-comments-1-64k> drop-10-simm |semantic-1-64k> => 80.936|semantic-2-64k> + 18.022|diary-2-64k> + 17.808|slashdot-3-64k> + 17.652|slashdot-1-64k> + 17.652|slashdot-2-64k> + 17.367|diary-1-64k> + 15.896|eztv-1-64k> + 15.896|eztv-2-64k> + 9.696|wc-comments-2-64k> + 9.696|wc-comments-1-64k>

OK. Some pretty good results there! I wonder whether upscaling to 1M buckets, as I have already done in the ebook case, would improve the results. I suspect not.
sa: meta-simm |*> #=> 100 similar[drop-4-simm] |_self>
sa: meta-simm |slashdot-1-64k>
68.117|slashdot-3-64k> + 68.039|slashdot-2-64k> + 48.623|eztv-1-64k> + 48.024|eztv-2-64k> + 45.224|semantic-1-64k> + 41.308|semantic-2-64k> + 40.832|diary-2-64k> + 40.647|diary-1-64k> + 39.798|wc-comments-2-64k> + 39.798|wc-comments-1-64k>

So, terrible results. No need to test the other kets, or the other drop-n's. But it is an interesting fact nonetheless (ie, meta-simm's give worse results).
Collecting these results into matrix form (these are the drop-6-simm values):

[ diary-1-64k       ]   [ 100.00 98.822 15.917 18.978 17.165 16.338 13.991 13.991 14.752 42.199 42.199 ] [ diary-1-64k       ]
[ diary-2-64k       ]   [ 98.822 100.00 16.183 19.244 17.648 16.267 14.312 14.312 15.074 42.175 42.175 ] [ diary-2-64k       ]
[ eztv-1-64k        ]   [ 15.917 16.183 100.00 91.816 13.807 9.580  22.349 22.407 23.304 16.716 16.716 ] [ eztv-1-64k        ]
[ eztv-2-64k        ]   [ 18.978 19.244 91.816 100.00 13.807 9.580  22.201 22.259 23.156 17.797 17.797 ] [ eztv-2-64k        ]
[ semantic-1-64k    ]   [ 17.165 17.648 13.807 13.807 100.00 79.927 15.789 15.789 15.880 10.204 10.204 ] [ semantic-1-64k    ]
[ semantic-2-64k    ] = [ 16.338 16.267 9.580  9.580  79.927 100.00 11.561 11.561 11.652 10.001 10.001 ] [ semantic-2-64k    ]
[ slashdot-1-64k    ]   [ 13.991 14.312 22.349 22.201 15.789 11.561 100.00 99.942 96.971 11.111 11.111 ] [ slashdot-1-64k    ]
[ slashdot-2-64k    ]   [ 13.991 14.312 22.407 22.259 15.789 11.561 99.942 100.00 97.030 11.111 11.111 ] [ slashdot-2-64k    ]
[ slashdot-3-64k    ]   [ 14.752 15.074 23.304 23.156 15.880 11.652 96.971 97.030 100.00 11.086 11.086 ] [ slashdot-3-64k    ]
[ wc-comments-1-64k ]   [ 42.199 42.175 16.716 17.797 10.204 10.001 11.111 11.111 11.086 100.00 99.891 ] [ wc-comments-1-64k ]
[ wc-comments-2-64k ]   [ 42.199 42.175 16.716 17.797 10.204 10.001 11.111 11.111 11.086 99.891 100.00 ] [ wc-comments-2-64k ]
# these two functions help to pretty-print tables, and matrices in particular:
def normalize_column_return_list(s,n):
    lines = (s.split('\n') + ['']*n)[:n]
    max_len = max(len(x) for x in lines)
    return [x.ljust(max_len) for x in lines]

def paste_columns(data,pre='',sep=' ',post=''):
    if len(data) == 0:
        return ""
    columns = len(data)
    rows = max(s.count('\n') + 1 for s in data)
    r = [normalize_column_return_list(s,rows) for s in data]
    return "\n".join(pre + sep.join(r[j][k] for j in range(columns)) + post for k in range(rows))

eg:
col_1 = "3\n9\n217\n13"
col_2 = "0\n5\n-3\n-513"
col_3 = "3.1415\n2.17\n1.23\n-6"
matrix = paste_columns([col_1,col_2,col_3],'[ ',' ',' ]')
print(matrix)

spits out:
[ 3   0    3.1415 ]
[ 9   5    2.17   ]
[ 217 -3   1.23   ]
[ 13  -513 -6     ]

And then this (admittedly messy) code:
def sp_to_vect(one):
    if one.count() <= 1:
        vect = one.the_label()
    else:
        vect = "\n".join(x.label for x in one.data)
    return paste_columns([vect],'[ ','',' ]')

def sp_to_list(one):
    if one.count() <= 1:
        return one.the_label()
    return "\n".join(x.label for x in one.data)

# make 0.000 coeffs prettier!
def coeff_to_str(x):
    if x == 0:
        return "0"
    else:
        return str("%.3f" % x)    # this means if we want to change precision, we only need to change it here.

def sp_coeffs_to_list(one):
    if one.count() <= 1:
        return coeff_to_str(one.the_value())
    return "\n".join(coeff_to_str(x.value) for x in one.data)

# code to spit out a pretty-printed matrix given BKO rules:
def matrix(context,op):
    one = context.relevant_kets(op).ket_sort()    # one is the list of kets that will be on the right hand side.
                                                  # usefully, relevant_kets() always returns a superposition.
    if one.count() == 0:                          # if one is empty, return the identity ket.
        return ket("",0)
    two = superposition()                         # two is the list of kets that will be on the left hand side.
    for elt in one.data:
        sp = elt.apply_op(context,op)
        two = union(two,sp)
    two = two.ket_sort()
    empty = two.multiply(0)                       # empty is the two list, with all coeffs set to 0
    matrix_columns = []                           # convert to list-comprehension?
    for elt in one.data:
        sp = (elt.apply_op(context,op) + empty).ket_sort()    # we add "empty" so the column has all the elements.
        matrix_columns.append(sp_coeffs_to_list(sp))
    x = sp_to_vect(one)
    y = sp_to_vect(two)
    M = paste_columns(matrix_columns,'[ ',' ',' ]')
    matrix = paste_columns([y,'=',M,x])
    print(matrix)
    #print("\n" + paste_columns(matrix_columns,'',' ',''))
    return ket("matrix")                          # just here so it returns a ket of some sort. Has no meaning, really.
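Before the console examples, here is a tiny self-contained check of the layout logic (the column strings below are hand-built stand-ins for what sp_to_vect() and sp_coeffs_to_list() produce from superpositions; only the two pretty-print helpers above are needed):

# demo of the y = M x layout, using only paste_columns():
y = "[ y1 ]\n[ y2 ]"                                   # left-hand kets, one per line
M_cols = ["0\n3.000", "7.000\n6.000", "1.000\n4.000"]  # one string per matrix column
M = paste_columns(M_cols,'[ ',' ',' ]')
x = "[ x1 ]\n[ x2 ]\n[ x3 ]"                           # right-hand kets, one per line
print(paste_columns([y,'=',M,x]))

which spits out:
[ y1 ] = [ 0     7.000 1.000 ] [ x1 ]
[ y2 ]   [ 3.000 6.000 4.000 ] [ x2 ]
                               [ x3 ]

And now in the console: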
$ ./the_semantic_db_console.py Welcome! sa: load matrix-example-2.sw sa: dump ---------------------------------------- |context> => |context: 2 matrix play> M1 |x1> => 3.000|y2> M1 |x2> => 7.000|y1> + 6.000|y2> M1 |x3> => |y1> + 4.000|y2> M1 |x4> => |y1> M1 |x5> => 6.000|y1> + 4.000|y2> M1 |x6> => 4.000|y1> + 8.000|y2> M1 |x7> => |y1> + 2.000|y2> M2 |y1> => 6.000|z1> + 2.000|z2> + 7.000|z3> + 9.000|z4> + 5.000|z5> M2 |y2> => 3.000|z2> + 4.000|z3> + |z5> ---------------------------------------- sa: -- what happens if we try an unknown operator, we get the empty ket: sa: matrix[fish] 0.000|> sa: -- now, the M1 matrix: sa: matrix[M1] [ y1 ] = [ 0 7.000 1.000 1.000 6.000 4.000 1.000 ] [ x1 ] [ y2 ] [ 3.000 6.000 4.000 0 4.000 8.000 2.000 ] [ x2 ] [ x3 ] [ x4 ] [ x5 ] [ x6 ] [ x7 ] |matrix> sa: -- now, the M2 matrix: sa: matrix[M2] [ z1 ] = [ 6.000 0 ] [ y1 ] [ z2 ] [ 2.000 3.000 ] [ y2 ] [ z3 ] [ 7.000 4.000 ] [ z4 ] [ 9.000 0 ] [ z5 ] [ 5.000 1.000 ] |matrix> sa: -- now another data set: sa: load fred-sam-friends.sw sa: dump ---------------------------------------- |context> => |context: friends> friends |Fred> => |Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie> friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Rober> + |Frank> + |Julie> ---------------------------------------- sa: matrix[friends] [ Charlie ] = [ 1.000 1.000 ] [ Fred ] [ Ed ] [ 1.000 0 ] [ Sam ] [ Emma ] [ 1.000 1.000 ] [ Frank ] [ 0 1.000 ] [ George ] [ 0 1.000 ] [ Harry ] [ 1.000 0 ] [ Jack ] [ 1.000 1.000 ] [ Julie ] [ 0 1.000 ] [ Mary ] [ 1.000 0 ] [ Patrick ] [ 1.000 0 ] [ Rob ] [ 1.000 0 ] [ Rober ] [ 0 1.000 ] |matrix> sa: -- now another data set: sa: load bots.sw sa: matrix[name] [ Bella ] = [ 1.000 0 0 ] [ bot: Bella ] [ Emma ] [ 0 1.000 0 ] [ bot: Emma ] [ Madison ] [ 0 0 1.000 ] [ bot: Madison ] |matrix> sa: matrix[age] [ age: 23 ] = [ 0 0 1.000 ] [ bot: Bella ] [ age: 29 ] [ 0 1.000 0 ] [ bot: Emma ] [ age: 31 ] [ 1.000 0 0 ] [ bot: Madison ] |matrix> sa: matrix[religion] [ religion: Christianity ] = [ 1.000 0 0 ] [ bot: Bella ] [ religion: Islam ] [ 0 0 1.000 ] [ bot: Emma ] [ religion: Taoism ] [ 0 1.000 0 ] [ bot: Madison ] |matrix> sa: matrix[make-of-car] [ car: BMW ] = [ 0 1.000 0 ] [ bot: Bella ] [ car: Bugatti ] [ 0 0 1.000 ] [ bot: Emma ] [ car: Porsche ] [ 1.000 0 0 ] [ bot: Madison ] |matrix> sa: matrix[mother] [ Madison ] = [ 0 1.000 0 ] [ bot: Bella ] [ Mia ] [ 1.000 0 1.000 ] [ bot: Emma ] [ bot: Madison ] |matrix> sa: matrix[father] [ Ian ] = [ 0 0 1.000 ] [ bot: Bella ] [ Nathan ] [ 0 1.000 0 ] [ bot: Emma ] [ William ] [ 1.000 0 0 ] [ bot: Madison ] |matrix> sa: matrix[bed-time] [ time: 10:30pm ] = [ 0 0 1.000 ] [ bot: Bella ] [ time: 2am ] [ 0 1.000 0 ] [ bot: Emma ] [ time: 8pm ] [ 1.000 0 0 ] [ bot: Madison ] |matrix> sa: -- now another data set: sa: load in-my-league.sw sa: matrix[features] [ athletic ] = [ 0 1.000 0 0 0 0 0 0 ] [ Donna ] [ beautiful ] [ 1.000 1.000 0 0 0 0 1.000 1.000 ] [ Emma ] [ educated ] [ 1.000 0 0 1.000 0 1.000 1.000 1.000 ] [ Jane ] [ loving ] [ 0 0 0 1.000 1.000 1.000 1.000 1.000 ] [ Liz ] [ religious ] [ 0 1.000 0 0 0 0 0 0 ] [ Mary ] [ sexy ] [ 1.000 1.000 1.000 0 0 0 1.000 1.000 ] [ Mia ] [ skinny ] [ 1.000 1.000 1.000 0 1.000 1.000 1.000 1.000 ] [ my perfect woman ] [ smart ] [ 1.000 0 0 1.000 0 1.000 1.000 1.000 ] [ the goddess ] |matrix> sa: -- now another data set: sa: load shopping-basket.sw sa: matrix[basket] [ apple ] = [ 3.000 0 4.000 0 0 ] [ f ] [ bananas ] [ 0 1.000 0 0 0 ] [ user 1 ] [ bread ] [ 1.000 1.000 0 0 1.000 ] [ 
user 2 ] [ carrots ] [ 0 1.000 0 0 0 ] [ user 3 ] [ cheese ] [ 0 0 0 1.000 1.000 ] [ user 4 ] [ chocolate ] [ 0 1.000 0 1.000 0 ] [ coffee ] [ 1.000 0 1.000 0 0 ] [ milk ] [ 1.000 1.000 1.000 0 0 ] [ olive oil ] [ 0 0 0 1.000 0 ] [ oranges ] [ 5.000 0 0 0 0 ] [ pizza ] [ 0 0 0 1.000 0 ] [ salami ] [ 0 0 0 0 1.000 ] [ steak ] [ 1.000 0 1.000 0 0 ] [ tea ] [ 0 1.000 0 0 0 ] [ vegemite ] [ 0 0 0 1.000 1.000 ] |matrix> sa: -- now another data set: sa: load breakfast-menu.sw sa: matrix[price] [ price: 4.50 ] = [ 0 0 1.000 0 0 ] [ food: Belgian Waffles ] [ price: 5.95 ] [ 1.000 0 0 0 0 ] [ food: Berry-Berry Belgian Waffles ] [ price: 6.95 ] [ 0 0 0 1.000 0 ] [ food: French Toast ] [ price: 7.95 ] [ 0 0 0 0 1.000 ] [ food: Homestyle Breakfast ] [ price: 8.95 ] [ 0 1.000 0 0 0 ] [ food: Strawberry Belgian Waffles ] |matrix> sa: matrix[calories] [ calories: 600 ] = [ 0 0 1.000 0 0 ] [ food: Belgian Waffles ] [ calories: 650 ] [ 1.000 0 0 0 0 ] [ food: Berry-Berry Belgian Waffles ] [ calories: 900 ] [ 0 1.000 0 0 1.000 ] [ food: French Toast ] [ calories: 950 ] [ 0 0 0 1.000 0 ] [ food: Homestyle Breakfast ] [ food: Strawberry Belgian Waffles ] |matrix> sa: matrix[name] [ text: "Belgian Waffles" ] = [ 1.000 0 0 0 0 ] [ food: Belgian Waffles ] [ text: "Berry-Berry Belgian Waffles" ] [ 0 1.000 0 0 0 ] [ food: Berry-Berry Belgian Waffles ] [ text: "French Toast" ] [ 0 0 1.000 0 0 ] [ food: French Toast ] [ text: "Homestyle Breakfast" ] [ 0 0 0 1.000 0 ] [ food: Homestyle Breakfast ] [ text: "Strawberry Belgian Waffles" ] [ 0 0 0 0 1.000 ] [ food: Strawberry Belgian Waffles ] |matrix> sa: matrix[description] [ text: "Light Belgian waffles covered with an assortment of fresh berries and whipped cream" ] = [ 0 1.000 0 0 0 ] [ food: Belgian Waffles ] [ text: "Light Belgian waffles covered with strawberries and whipped cream" ] [ 0 0 0 0 1.000 ] [ food: Berry-Berry Belgian Waffles ] [ text: "Thick slices made from our homemade sourdough bread" ] [ 0 0 1.000 0 0 ] [ food: French Toast ] [ text: "Two eggs, bacon or sausage, toast, and our ever-popular hash browns" ] [ 0 0 0 1.000 0 ] [ food: Homestyle Breakfast ] [ text: "Two of our famous Belgian Waffles with plenty of real maple syrup" ] [ 1.000 0 0 0 0 ] [ food: Strawberry Belgian Waffles ] |matrix> sa: -- now for a real world data set: sa: load fragment-documents-64k--post-processing--saved--cleaned.sw sa: matrix[drop-10-hash] [ 0651 ] = [ 0 0 0 0 16.000 17.000 0 0 0 0 0 ] [ diary-1-64k ] [ 08fa ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ diary-2-64k ] [ 09a6 ] [ 0 0 50.000 50.000 0 0 0 0 0 0 0 ] [ eztv-1-64k ] [ 0b57 ] [ 12.000 12.000 0 0 0 0 0 0 0 12.000 12.000 ] [ eztv-2-64k ] [ 0b6f ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ semantic-1-64k ] [ 0be8 ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ semantic-2-64k ] [ 0c6d ] [ 0 0 0 0 16.000 16.000 0 0 0 0 0 ] [ slashdot-1-64k ] [ 0e76 ] [ 0 0 0 0 257.000 701.000 0 0 0 0 0 ] [ slashdot-2-64k ] [ 0fa0 ] [ 0 0 0 0 0 0 23.000 23.000 23.000 0 0 ] [ slashdot-3-64k ] [ 141b ] [ 0 0 0 0 14.000 25.000 0 0 0 0 0 ] [ wc-comments-1-64k ] [ 1466 ] [ 0 0 0 0 0 0 19.000 19.000 19.000 0 0 ] [ wc-comments-2-64k ] [ 16c6 ] [ 0 0 34.000 34.000 0 0 0 0 0 0 0 ] [ 176a ] [ 23.000 23.000 0 0 0 0 0 0 0 0 0 ] [ 1853 ] [ 0 0 0 0 0 10.000 0 0 0 0 0 ] [ 18a5 ] [ 0 0 0 0 0 0 16.000 16.000 16.000 0 0 ] [ 1a42 ] [ 0 0 13.000 13.000 0 0 0 0 0 0 0 ] [ 1b1b ] [ 0 0 0 0 0 0 12.000 12.000 12.000 0 0 ] [ 1e4c ] [ 0 0 0 0 0 0 11.000 11.000 11.000 0 0 ] [ 1fac ] [ 73.000 73.000 0 0 0 0 0 0 0 17.000 17.000 ] [ 223a ] [ 28.000 29.000 0 0 0 0 0 
0 0 0 0 ] [ 2353 ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ 2454 ] [ 0 0 0 0 0 0 17.000 17.000 17.000 0 0 ] [ 25db ] [ 23.000 23.000 0 0 0 0 0 0 0 0 0 ] [ 27d8 ] [ 17.000 17.000 0 0 0 12.000 0 0 0 13.000 13.000 ] [ 28ab ] [ 0 0 0 0 0 0 0 0 0 10.000 11.000 ] [ 2a7a ] [ 0 0 0 0 0 0 10.000 10.000 10.000 0 0 ] [ 2c50 ] [ 0 0 0 0 12.000 24.000 0 0 0 0 0 ] [ 2d14 ] [ 0 0 0 0 0 0 26.000 26.000 26.000 0 0 ] [ 2d55 ] [ 0 0 17.000 17.000 0 0 0 0 0 0 0 ] [ 2d5d ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ 2f08 ] [ 0 0 0 0 0 0 24.000 24.000 24.000 0 0 ] [ 2fc8 ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ 364d ] [ 0 0 23.000 23.000 0 0 144.000 144.000 144.000 0 0 ] [ 3678 ] [ 12.000 12.000 0 0 0 0 0 0 0 0 0 ] [ 3695 ] [ 0 0 0 0 0 0 0 0 0 30.000 30.000 ] [ 377b ] [ 21.000 21.000 0 0 0 0 0 0 0 0 0 ] [ 3808 ] [ 18.000 27.000 0 0 26.000 26.000 0 0 0 0 0 ] [ 38e6 ] [ 0 0 0 33.000 0 0 0 0 0 0 0 ] [ 3932 ] [ 0 0 57.000 57.000 0 0 0 0 0 0 0 ] [ 3c81 ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ 3eac ] [ 0 0 0 0 0 0 15.000 15.000 0 0 0 ] [ 3f8b ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ 3fea ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ 410c ] [ 0 0 0 0 0 0 0 0 0 16.000 15.000 ] [ 42ef ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ 4325 ] [ 0 0 0 0 0 11.000 0 0 0 0 0 ] [ 4342 ] [ 0 0 0 0 0 0 16.000 16.000 15.000 0 0 ] [ 4576 ] [ 0 0 0 0 0 12.000 0 0 0 0 0 ] [ 4a13 ] [ 0 0 0 0 0 0 15.000 15.000 0 0 0 ] [ 4c07 ] [ 0 0 0 0 0 12.000 0 0 0 0 0 ] [ 4d14 ] [ 0 0 0 0 0 13.000 0 0 0 0 0 ] [ 4d2d ] [ 73.000 73.000 0 0 330.000 406.000 28.000 28.000 30.000 0 0 ] [ 5006 ] [ 31.000 32.000 10.000 40.000 0 0 0 0 0 0 0 ] [ 507e ] [ 14.000 14.000 11.000 11.000 0 0 0 0 0 0 0 ] [ 53ec ] [ 13.000 13.000 0 0 0 0 0 0 0 67.000 67.000 ] [ 5439 ] [ 0 0 0 0 116.000 130.000 0 0 0 0 0 ] [ 57ed ] [ 0 0 0 0 0 0 22.000 22.000 22.000 0 0 ] [ 5a93 ] [ 0 0 0 0 0 0 14.000 14.000 14.000 0 0 ] [ 5fad ] [ 0 0 0 0 0 0 16.000 16.000 15.000 0 0 ] [ 6231 ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ 62dc ] [ 0 0 0 0 74.000 99.000 0 0 0 0 0 ] [ 63fe ] [ 0 0 0 33.000 0 0 0 0 0 0 0 ] [ 6541 ] [ 11.000 11.000 0 0 0 0 0 0 0 11.000 11.000 ] [ 69bd ] [ 0 0 0 0 0 0 33.000 33.000 33.000 0 0 ] [ 6be1 ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ 6d21 ] [ 0 0 0 0 0 10.000 0 0 0 0 0 ] [ 6df4 ] [ 0 0 0 0 0 15.000 0 0 0 0 0 ] [ 6e98 ] [ 0 0 0 0 0 0 0 0 0 30.000 30.000 ] [ 6f6b ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ 6f7f ] [ 0 0 100.000 100.000 0 0 0 0 0 0 0 ] [ 70de ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ 7549 ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ 75de ] [ 38.000 38.000 0 0 0 0 0 0 0 0 0 ] [ 764d ] [ 0 0 0 0 18.000 31.000 0 0 0 0 0 ] [ 7718 ] [ 0 0 0 0 0 0 0 0 0 11.000 11.000 ] [ 78ca ] [ 0 0 0 0 0 0 15.000 15.000 13.000 0 0 ] [ 7bf0 ] [ 0 0 14.000 14.000 0 0 0 0 0 0 0 ] [ 7cef ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ 7d75 ] [ 0 0 0 0 0 0 19.000 18.000 18.000 0 0 ] [ 8071 ] [ 0 0 0 0 52.000 52.000 0 0 0 0 0 ] [ 8097 ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ 8365 ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ 854e ] [ 49.000 49.000 294.000 324.000 0 0 17.000 17.000 17.000 40.000 40.000 ] [ 8731 ] [ 12.000 12.000 0 0 0 0 0 0 0 0 0 ] [ 8a83 ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ 8b81 ] [ 0 0 13.000 13.000 0 0 0 0 0 0 0 ] [ 8e4d ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ 8f98 ] [ 30.000 31.000 10.000 40.000 0 0 0 0 0 0 0 ] [ 9183 ] [ 53.000 53.000 0 34.000 0 0 0 0 0 19.000 19.000 ] [ 9197 ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ 924b ] [ 24.000 24.000 0 0 0 0 0 0 0 0 0 ] [ 92b7 ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ 93d9 ] [ 0 0 0 0 0 0 0 0 0 30.000 30.000 ] [ 94aa ] [ 10.000 
10.000 0 0 0 0 0 0 0 0 0 ] [ 9513 ] [ 19.000 21.000 0 0 0 0 0 0 0 0 0 ] [ 95b7 ] [ 0 0 0 0 12.000 12.000 0 0 0 0 0 ] [ 972b ] [ 0 0 50.000 50.000 0 0 0 0 0 0 0 ] [ 9759 ] [ 38.000 40.000 0 0 15.000 15.000 0 0 0 32.000 32.000 ] [ 9ed4 ] [ 0 0 0 0 0 0 26.000 26.000 25.000 0 0 ] [ 9ef1 ] [ 0 0 0 38.000 0 0 0 0 0 0 0 ] [ a379 ] [ 0 0 12.000 12.000 0 0 251.000 251.000 251.000 0 0 ] [ a48a ] [ 59.000 60.000 0 0 0 0 0 0 0 92.000 92.000 ] [ a4c1 ] [ 0 0 0 0 21.000 27.000 0 0 0 0 0 ] [ a4e8 ] [ 0 0 0 0 0 0 0 0 0 13.000 13.000 ] [ a7c0 ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ a8c9 ] [ 0 0 0 0 0 0 0 0 0 30.000 30.000 ] [ ab87 ] [ 0 0 0 0 0 10.000 0 0 0 0 0 ] [ acd0 ] [ 128.000 128.000 0 0 0 0 0 0 0 22.000 22.000 ] [ ae23 ] [ 0 0 0 0 12.000 12.000 0 0 0 0 0 ] [ ae2a ] [ 10.000 10.000 0 0 12.000 13.000 0 0 0 0 0 ] [ b095 ] [ 0 0 0 0 0 14.000 0 0 0 0 0 ] [ b209 ] [ 0 0 0 0 0 24.000 0 0 0 0 0 ] [ b61c ] [ 0 0 1031.000 1031.000 0 0 0 0 0 19.000 19.000 ] [ b649 ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ ba48 ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ bb1c ] [ 26.000 26.000 0 0 0 0 0 0 0 0 0 ] [ bb29 ] [ 0 0 0 0 0 17.000 0 0 0 0 0 ] [ bb2f ] [ 0 0 0 0 11.000 14.000 0 0 0 0 0 ] [ bb56 ] [ 0 0 0 0 0 0 22.000 22.000 22.000 0 0 ] [ bd83 ] [ 0 0 0 0 0 0 42.000 42.000 42.000 0 0 ] [ be4c ] [ 0 0 0 0 0 0 0 0 0 30.000 30.000 ] [ bf7c ] [ 22.000 22.000 0 0 0 0 0 0 0 0 0 ] [ c206 ] [ 0 0 50.000 50.000 0 0 0 0 0 0 0 ] [ c24f ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ c395 ] [ 15.000 15.000 0 0 0 0 0 0 0 15.000 15.000 ] [ c441 ] [ 0 0 0 0 0 0 0 0 0 30.000 30.000 ] [ c727 ] [ 0 0 0 0 54.000 57.000 0 0 0 0 0 ] [ c77a ] [ 13.000 13.000 0 0 0 0 0 0 0 0 0 ] [ c90d ] [ 17.000 17.000 83.000 113.000 0 0 0 0 0 0 0 ] [ cb1f ] [ 12.000 12.000 0 0 0 0 0 0 0 66.000 66.000 ] [ d202 ] [ 0 0 0 0 162.000 204.000 0 0 0 0 0 ] [ d362 ] [ 13.000 13.000 0 0 0 0 0 0 0 13.000 13.000 ] [ d426 ] [ 14.000 14.000 14.000 14.000 0 0 0 0 0 0 0 ] [ d43c ] [ 0 0 17.000 47.000 0 0 0 0 0 0 0 ] [ d691 ] [ 0 0 0 0 0 0 78.000 78.000 78.000 0 0 ] [ d8d7 ] [ 0 0 0 0 13.000 13.000 0 0 0 0 0 ] [ ddb9 ] [ 33.000 33.000 0 0 0 0 0 0 0 0 0 ] [ de0f ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ def0 ] [ 0 0 0 0 162.000 204.000 0 0 0 0 0 ] [ e191 ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ e1a2 ] [ 102.000 102.000 0 0 0 0 0 0 0 20.000 20.000 ] [ e341 ] [ 74.000 74.000 0 0 0 0 0 0 0 17.000 17.000 ] [ e3a1 ] [ 59.000 59.000 0 0 0 0 0 0 0 16.000 16.000 ] [ e3f2 ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ e5aa ] [ 160.000 168.000 584.000 614.000 268.000 278.000 331.000 332.000 340.000 76.000 76.000 ] [ e78f ] [ 0 0 0 0 13.000 13.000 0 0 0 0 0 ] [ ea98 ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ efac ] [ 0 0 33.000 33.000 0 0 0 0 0 0 0 ] [ f03c ] [ 0 0 0 0 0 0 15.000 15.000 15.000 0 0 ] [ f1d9 ] [ 0 0 0 0 0 0 49.000 49.000 49.000 0 0 ] [ f7b6 ] [ 0 0 0 0 0 0 12.000 12.000 12.000 0 0 ] [ fa5d ] [ 25.000 26.000 0 0 0 0 0 0 0 66.000 66.000 ] [ fb84 ] [ 0 0 0 0 0 0 16.000 16.000 16.000 0 0 ] [ fde9 ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] [ fdea ] [ 10.000 10.000 0 0 0 0 0 0 0 0 0 ] |matrix>Here is the drop-2-hash matrix.
$ ./the_semantic_db_console.py
Welcome!

sa: load matrix-example-2.sw
sa: matrix[M1]
[ y1 ] = [ 0     7.000 1.000 1.000 6.000 4.000 1.000 ] [ x1 ]
[ y2 ]   [ 3.000 6.000 4.000 0     4.000 8.000 2.000 ] [ x2 ]
                                                       [ x3 ]
                                                       [ x4 ]
                                                       [ x5 ]
                                                       [ x6 ]
                                                       [ x7 ]
|matrix>
sa: matrix[M2]
[ z1 ] = [ 6.000 0     ] [ y1 ]
[ z2 ]   [ 2.000 3.000 ] [ y2 ]
[ z3 ]   [ 7.000 4.000 ]
[ z4 ]   [ 9.000 0     ]
[ z5 ]   [ 5.000 1.000 ]
|matrix>
sa: matrix[M2,M1]
[ z1 ] = [ 6.000 0     ] [ 0     7.000 1.000 1.000 6.000 4.000 1.000 ] [ x1 ]
[ z2 ]   [ 2.000 3.000 ] [ 3.000 6.000 4.000 0     4.000 8.000 2.000 ] [ x2 ]
[ z3 ]   [ 7.000 4.000 ]                                               [ x3 ]
[ z4 ]   [ 9.000 0     ]                                               [ x4 ]
[ z5 ]   [ 5.000 1.000 ]                                               [ x5 ]
                                                                       [ x6 ]
                                                                       [ x7 ]
|matrix>
sa: -- even (sort of) works if you get the order wrong:
sa: matrix[M1,M2]
[  ] = [ 0 0 0 0 0 ] [ 6.000 0     ] [ y1 ]
                     [ 2.000 3.000 ] [ y2 ]
                     [ 7.000 4.000 ]
                     [ 9.000 0     ]
                     [ 5.000 1.000 ]
|matrix>

Here's the code:
# code to return a single matrix, and the left-hand superposition:
# one must be a superposition
# op is a literal op
def single_matrix(one,context,op):
    one = one.apply_sigmoid(set_to,1)
    two = superposition()                # two is the list of kets that will be on the left hand side.
    for elt in one.data:                 # heh. using one.data kind of breaks the superposition abstract interface idea.
        sp = elt.apply_op(context,op)
        two = union(two,sp)
    two = two.ket_sort().multiply(0)     # merged two, and empty into the same thing.
    matrix_columns = [sp_coeffs_to_list((elt.apply_op(context,op) + two).ket_sort()) for elt in one.data]
    M = paste_columns(matrix_columns,'[ ',' ',' ]')    # M is the matrix
    return two, M

# third version.
# this one I want to handle multiple ops at once, and then chain the matrices.
# eg: matrix[M2,M1]
# or: matrix[friends,friends] -- ie, matrix of second-order friends
def multi_matrix(context,ops):
    ops = ops.split(',')[::-1]
    print("ops:",ops)
    one = context.relevant_kets(ops[0]).ket_sort()    # one is the list of kets that will be on the right hand side.
                                                      # usefully, relevant_kets() always returns a superposition.
    if one.count() == 0:                              # if one is empty, return the identity ket.
        return ket("",0)
    two, M = single_matrix(one,context,ops[0])
    matrices = [M]
    for op in ops[1:]:
        two, M = single_matrix(two,context,op)
        matrices.append(M)
    x = sp_to_vect(one)
    y = sp_to_vect(two)
    line = [y,'='] + matrices[::-1] + [x]
    matrix = paste_columns(line)
    print(matrix)
    return ket("matrix")

OK. Now some more examples:
sa: load child-parent-binary-tree.sw sa: matrix[left] [ 0 ] = [ 0 0 0 0 0 0 1.000 ] [ 0 ] [ 00 ] [ 1.000 0 0 0 0 0 0 ] [ 00 ] [ 000 ] [ 0 1.000 0 0 0 0 0 ] [ 01 ] [ 001 ] [ 0 0 1.000 0 0 0 0 ] [ 1 ] [ 01 ] [ 0 0 0 1.000 0 0 0 ] [ 10 ] [ 010 ] [ 0 0 0 0 1.000 0 0 ] [ 11 ] [ 011 ] [ 0 0 0 0 0 1.000 0 ] [ x ] |matrix> sa: matrix[right] [ 1 ] = [ 0 0 0 0 0 0 1.000 ] [ 0 ] [ 10 ] [ 1.000 0 0 0 0 0 0 ] [ 00 ] [ 100 ] [ 0 1.000 0 0 0 0 0 ] [ 01 ] [ 101 ] [ 0 0 1.000 0 0 0 0 ] [ 1 ] [ 11 ] [ 0 0 0 1.000 0 0 0 ] [ 10 ] [ 110 ] [ 0 0 0 0 1.000 0 0 ] [ 11 ] [ 111 ] [ 0 0 0 0 0 1.000 0 ] [ x ] |matrix> sa: matrix[child] [ 0 ] = [ 0 0 0 0 0 0 0 1.000 ] [ * ] [ 00 ] [ 0 1.000 0 0 0 0 0 0 ] [ 0 ] [ 000 ] [ 0 0 1.000 0 0 0 0 0 ] [ 00 ] [ 001 ] [ 0 0 0 1.000 0 0 0 0 ] [ 01 ] [ 01 ] [ 0 0 0 0 1.000 0 0 0 ] [ 1 ] [ 010 ] [ 0 0 0 0 0 1.000 0 0 ] [ 10 ] [ 011 ] [ 0 0 0 0 0 0 1.000 0 ] [ 11 ] [ 1 ] [ 0 0 0 0 0 0 0 1.000 ] [ x ] [ 10 ] [ 0 1.000 0 0 0 0 0 0 ] [ 100 ] [ 0 0 1.000 0 0 0 0 0 ] [ 101 ] [ 0 0 0 1.000 0 0 0 0 ] [ 11 ] [ 0 0 0 0 1.000 0 0 0 ] [ 110 ] [ 0 0 0 0 0 1.000 0 0 ] [ 111 ] [ 0 0 0 0 0 0 1.000 0 ] |matrix> sa: matrix[parent] [ 0 ] = [ 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 0 ] [ * ] [ 00 ] [ 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 ] [ 0 ] [ 01 ] [ 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 ] [ 00 ] [ 1 ] [ 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 ] [ 000 ] [ 10 ] [ 0 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 ] [ 001 ] [ 11 ] [ 0 0 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 ] [ 01 ] [ x ] [ 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 0 0 ] [ 010 ] [ 011 ] [ 1 ] [ 10 ] [ 100 ] [ 101 ] [ 11 ] [ 110 ] [ 111 ] |matrix> sa: -- these next two are not super useful, but here they are anyway. sa: matrix[parent,child] [ 0 ] = [ 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 0 ] [ 0 0 0 0 0 0 0 1.000 ] [ * ] [ 00 ] [ 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 ] [ 0 1.000 0 0 0 0 0 0 ] [ 0 ] [ 01 ] [ 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 ] [ 0 0 1.000 0 0 0 0 0 ] [ 00 ] [ 1 ] [ 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 ] [ 0 0 0 1.000 0 0 0 0 ] [ 01 ] [ 10 ] [ 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 ] [ 0 0 0 0 1.000 0 0 0 ] [ 1 ] [ 11 ] [ 0 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 ] [ 0 0 0 0 0 1.000 0 0 ] [ 10 ] [ x ] [ 1.000 0 0 0 0 0 0 1.000 0 0 0 0 0 0 ] [ 0 0 0 0 0 0 1.000 0 ] [ 11 ] [ 0 0 0 0 0 0 0 1.000 ] [ x ] [ 0 1.000 0 0 0 0 0 0 ] [ 0 0 1.000 0 0 0 0 0 ] [ 0 0 0 1.000 0 0 0 0 ] [ 0 0 0 0 1.000 0 0 0 ] [ 0 0 0 0 0 1.000 0 0 ] [ 0 0 0 0 0 0 1.000 0 ] |matrix> sa: matrix[child,parent] [ 0 ] = [ 0 0 0 0 0 0 1.000 ] [ 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 0 ] [ * ] [ 00 ] [ 1.000 0 0 0 0 0 0 ] [ 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 ] [ 0 ] [ 000 ] [ 0 1.000 0 0 0 0 0 ] [ 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 ] [ 00 ] [ 001 ] [ 0 0 1.000 0 0 0 0 ] [ 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 ] [ 000 ] [ 01 ] [ 0 0 0 1.000 0 0 0 ] [ 0 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 ] [ 001 ] [ 010 ] [ 0 0 0 0 1.000 0 0 ] [ 0 0 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 ] [ 01 ] [ 011 ] [ 0 0 0 0 0 1.000 0 ] [ 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 0 0 ] [ 010 ] [ 1 ] [ 0 0 0 0 0 0 1.000 ] [ 011 ] [ 10 ] [ 1.000 0 0 0 0 0 0 ] [ 1 ] [ 100 ] [ 0 1.000 0 0 0 0 0 ] [ 10 ] [ 101 ] [ 0 0 1.000 0 0 0 0 ] [ 100 ] [ 11 ] [ 0 0 0 1.000 0 0 0 ] [ 101 ] [ 110 ] [ 0 0 0 0 1.000 0 0 ] [ 11 ] [ 111 ] [ 0 0 0 0 0 1.000 0 ] [ 110 ] [ 111 ] |matrix>
sa: load matrix-example-2.sw sa: matrix[M1] [ y1 ] = [ 0 7.000 1.000 1.000 6.000 4.000 1.000 ] [ x1 ] [ y2 ] [ 3.000 6.000 4.000 0 4.000 8.000 2.000 ] [ x2 ] [ x3 ] [ x4 ] [ x5 ] [ x6 ] [ x7 ] |matrix> sa: matrix[M2] [ z1 ] = [ 6.000 0 ] [ y1 ] [ z2 ] [ 2.000 3.000 ] [ y2 ] [ z3 ] [ 7.000 4.000 ] [ z4 ] [ 9.000 0 ] [ z5 ] [ 5.000 1.000 ] |matrix> sa: matrix[M2,M1] [ z1 ] = [ 6.000 0 ] [ 0 7.000 1.000 1.000 6.000 4.000 1.000 ] [ x1 ] [ z2 ] [ 2.000 3.000 ] [ 3.000 6.000 4.000 0 4.000 8.000 2.000 ] [ x2 ] [ z3 ] [ 7.000 4.000 ] [ x3 ] [ z4 ] [ 9.000 0 ] [ x4 ] [ z5 ] [ 5.000 1.000 ] [ x5 ] [ x6 ] [ x7 ] |matrix> sa: merged-matrix[M2,M1] [ z1 ] = [ 0 42.000 6.000 6.000 36.000 24.000 6.000 ] [ x1 ] [ z2 ] [ 9.000 32.000 14.000 2.000 24.000 32.000 8.000 ] [ x2 ] [ z3 ] [ 12.000 73.000 23.000 7.000 58.000 60.000 15.000 ] [ x3 ] [ z4 ] [ 0 63.000 9.000 9.000 54.000 36.000 9.000 ] [ x4 ] [ z5 ] [ 3.000 41.000 9.000 5.000 34.000 28.000 7.000 ] [ x5 ] [ x6 ] [ x7 ] |matrix> sa: merged-matrix[M1,M2] [ ] = [ 0 0 ] [ y1 ] [ y2 ] |matrix> sa: load child-parent-binary-tree.sw sa: merged-matrix[parent,child] -- parent child |object> == 2 |object> since it is a binary tree. [ 0 ] = [ 0 2.000 0 0 0 0 0 0 ] [ * ] [ 00 ] [ 0 0 2.000 0 0 0 0 0 ] [ 0 ] [ 01 ] [ 0 0 0 2.000 0 0 0 0 ] [ 00 ] [ 1 ] [ 0 0 0 0 2.000 0 0 0 ] [ 01 ] [ 10 ] [ 0 0 0 0 0 2.000 0 0 ] [ 1 ] [ 11 ] [ 0 0 0 0 0 0 2.000 0 ] [ 10 ] [ x ] [ 0 0 0 0 0 0 0 2.000 ] [ 11 ] [ x ] |matrix> sa: merged-matrix[child,parent] -- child parent |object> is the same as sibling |object> [ 0 ] = [ 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 0 0 ] [ * ] [ 00 ] [ 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 0 ] [ 0 ] [ 000 ] [ 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 ] [ 00 ] [ 001 ] [ 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 ] [ 000 ] [ 01 ] [ 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 ] [ 001 ] [ 010 ] [ 0 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 ] [ 01 ] [ 011 ] [ 0 0 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 ] [ 010 ] [ 1 ] [ 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 0 0 ] [ 011 ] [ 10 ] [ 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 0 ] [ 1 ] [ 100 ] [ 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 0 ] [ 10 ] [ 101 ] [ 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 0 ] [ 100 ] [ 11 ] [ 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 0 ] [ 101 ] [ 110 ] [ 0 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 0 ] [ 11 ] [ 111 ] [ 0 0 0 0 0 0 0 1.000 0 0 0 0 0 0 1.000 ] [ 110 ] [ 111 ] |matrix>
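The merged-matrix code isn't shown here, but from the output above it clearly multiplies the chained matrices together, rather than pasting them side by side. A quick plain-Python check of the numbers (my own sketch, not the project's code):

# naive matrix multiply: A is m x n, B is n x p, result is m x p
def matmul(A,B):
    return [[sum(A[i][k]*B[k][j] for k in range(len(B))) for j in range(len(B[0]))] for i in range(len(A))]

M1 = [[0,7,1,1,6,4,1],
      [3,6,4,0,4,8,2]]                   # the matrix[M1] data, as rows
M2 = [[6,0],[2,3],[7,4],[9,0],[5,1]]     # the matrix[M2] data, as rows

for row in matmul(M2,M1):                # reproduces the merged-matrix[M2,M1] table above
    print(row)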
# 23/5/2014:
# let's implement a map function (since we can't have multi-line for loops, this will have to do!)
# eg: map[op] (|x> + |y>)
# runs:
# op |x> => op |_self>
# op |y> => op |_self>
# ie, it converts function operators (op on the right hand side) into literal operators (on the left hand side)
# eg: map[fib] (|10> + |11>)
# eg: map[child] (|x> + |0> + |1> + |00> + |01> + |10> + |11>)
# or indirectly:
# map[op] "" |list>
#
# one is a ket/sp
# op is a string
def map(one,context,op):
    one = superposition() + one      # map kets to superposition. Maybe have a ket/sp function called x.superposition()??
    for x in one.data:               # what if x has x.value != 1? x.apply_op handles that.
        context.learn(op,x,x.apply_op(context,op))
    return ket("map")

Probably not clear what it is doing, so maybe an example at the console will help:
sa: load small-fib.sw
loading sw file: sw-examples/small-fib.sw
sa: dump
----------------------------------------
|context> => |context: sw console>
fib |0> => |0>
fib |1> => |1>
n-1 |*> #=> arithmetic(|_self>,|->,|1>)
n-2 |*> #=> arithmetic(|_self>,|->,|2>)
fib |*> #=> arithmetic( fib n-1 |_self>, |+>, fib n-2 |_self>)    -- fib |*> is a function operator.
----------------------------------------
sa: fib |8> => fib |8>    -- do one example by hand.
sa: dump
----------------------------------------
|context> => |context: sw console>
fib |0> => |0>
fib |1> => |1>
n-1 |*> #=> arithmetic(|_self>,|->,|1>)
n-2 |*> #=> arithmetic(|_self>,|->,|2>)
fib |*> #=> arithmetic( fib n-1 |_self>, |+>, fib n-2 |_self>)
fib |8> => |21>    -- NB: this line.
----------------------------------------
sa: map[fib] (|13> + |14> + |15> + |16>)    -- use our map function
sa: dump
----------------------------------------
|context> => |context: sw console>
fib |0> => |0>
fib |1> => |1>
n-1 |*> #=> arithmetic(|_self>,|->,|1>)
n-2 |*> #=> arithmetic(|_self>,|->,|2>)
fib |*> #=> arithmetic( fib n-1 |_self>, |+>, fib n-2 |_self>)
fib |8> => |21>
fib |13> => |233>    -- for |0>,|1>,|8> and now |13>,|14>,|15>,|16> fib is a literal operator.
fib |14> => |377>    -- a kind of memoization, I suppose.
fib |15> => |610>
fib |16> => |987>
----------------------------------------
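A rough plain-Python analogue of what map[fib] is doing (a sketch of the idea, not project code): precompute the rule-defined function into a lookup table, and give table entries precedence over the general rule:

fib_rules = {0: 0, 1: 1}          # literal rules: fib |0> => |0>, fib |1> => |1>

def fib(n):
    if n in fib_rules:            # a literal rule takes precedence...
        return fib_rules[n]
    return fib(n-1) + fib(n-2)    # ...else fall back to the general fib |*> rule

for n in [13,14,15,16]:           # map[fib] (|13> + |14> + |15> + |16>)
    fib_rules[n] = fib(n)         # learn the result as a new literal rule (memoization)

print(fib_rules)                  # {0: 0, 1: 1, 13: 233, 14: 377, 15: 610, 16: 987}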
Note, a key part of why this works is that literal operators (not sure that is the best name for them) have higher precedence than operators applied to |*> or |category: *>. For example, the label "a: b: c: d: fred" descends through these trial labels:

a: b: c: d: fred
a: b: c: d: *
a: b: c: *
a: b: *
a: *
*

So, for example, fib |20> will have these two trial labels in context.recall():
20
*

So, if fib |20> is defined, use that; else drop back and try fib |*>. If fib |*> is not defined, return |>.
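Here is a minimal sketch of that label-descent order (my own illustration; the real context code may differ in the details):

def label_descent(label):
    # "a: b: c: d: fred" -> ["a: b: c: d: fred", "a: b: c: d: *", "a: b: c: *", "a: b: *", "a: *", "*"]
    result = [label]
    pieces = label.split(': ')
    for k in range(len(pieces) - 1, 0, -1):
        result.append(': '.join(pieces[:k]) + ': *')
    result.append('*')
    return result

print(label_descent('a: b: c: d: fred'))
print(label_descent('20'))    # ['20', '*']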
The matching logic in context.recall() looks like this:

match = False
for trial_label in label_decent(label):
    if trial_label in self.known_kets:
        if op in self.rules[trial_label]:
            rule = self.rules[trial_label][op]
            match = True
            break
if not match:
    print("recall not found")
    rule = ket("",0)
So, using our map function, in BKO we can write:

result |*> #=> fn |_self>
|list> => |a> + |b> + |c> + |d> + |e>
map[result] "" |list>

In a real programming language this would be something like this:
list = [a,b,c,d,e]
result = map(fn,list)
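Or, as runnable Python, with a toy fn standing in for merge-labels(|fn > + |_self>):

fn = lambda x: 'fn ' + x                         # toy stand-in for merge-labels(|fn > + |_self>)
result = list(map(fn, ['a','b','c','d','e']))
print(result)                                    # ['fn a', 'fn b', 'fn c', 'fn d', 'fn e']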
Here is a worked example in the console:

sa: fn |*> #=> merge-labels(|fn > + |_self>)    -- fn is just some example function
sa: result |*> #=> fn |_self>                   -- we want the results of fn with result as the operator label
sa: |list> => |a> + |b> + |c> + |d> + |e>       -- define our list
sa: dump                                        -- have a look at what we have so far:
----------------------------------------
|context> => |context: sw console>
fn |*> #=> merge-labels(|fn > + |_self>)
result |*> #=> fn |_self>
|list> => |a> + |b> + |c> + |d> + |e>
----------------------------------------
sa: map[result] "" |list>                       -- apply the map
sa: dump                                        -- have a look at what we have now:
----------------------------------------
|context> => |context: sw console>
fn |*> #=> merge-labels(|fn > + |_self>)
result |*> #=> fn |_self>
|list> => |a> + |b> + |c> + |d> + |e>
result |a> => |fn a>
result |b> => |fn b>
result |c> => |fn c>
result |d> => |fn d>
result |e> => |fn e>
----------------------------------------
sa: matrix[result]                              -- have a look at what we have in matrix form:
[ fn * ] = [ 1.00 0    0    0    0    0    ] [ * ]
[ fn a ]   [ 0    1.00 0    0    0    0    ] [ a ]
[ fn b ]   [ 0    0    1.00 0    0    0    ] [ b ]
[ fn c ]   [ 0    0    0    1.00 0    0    ] [ c ]
[ fn d ]   [ 0    0    0    0    1.00 0    ] [ d ]
[ fn e ]   [ 0    0    0    0    0    1.00 ] [ e ]
|matrix>
A later tweak: map with two parameters, where the second is the destination operator:

sa: fn |*> #=> merge-labels(|fn > + |_self>)
sa: map[fn,destination] (|x> + |y> + |z>)
sa: dump
----------------------------------------
|context> => |context: sw console>
fn |*> #=> merge-labels(|fn > + |_self>)
destination |x> => |fn x>
destination |y> => |fn y>
destination |z> => |fn z>
----------------------------------------
sa: matrix[destination]
[ fn x ] = [ 1.00 0    0    ] [ x ]
[ fn y ]   [ 0    1.00 0    ] [ y ]
[ fn z ]   [ 0    0    1.00 ] [ z ]
|matrix>

Small, but useful improvement!
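The change needed in the map code is small. A sketch of the two-parameter version (my guess at the obvious extension, using the same helper classes as the map code above; the real code may differ): map[fn,destination] applies fn, but learns the results under the destination operator:

def map2(one,context,fn_op,dest_op):
    one = superposition() + one                                # cast kets to a superposition
    for x in one.data:
        context.learn(dest_op,x,x.apply_op(context,fn_op))     # dest_op |x> => fn_op |x>
    return ket("map")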
Now, consider the friends network and the url links-to network:

friends |Alex> => |Jason> + |Ed> + |Mary> + |Liz> + |Beth> + |James> + |nathan>
friends |Bill> => |Jason> + |Beth> + |lena> + |John> + |nathan>
friends |Harry> => |charlie> + |bella> + |sam> + |smithie> + |david> + |nathan>

links-to |url 1> => |url k> + |url g> + |url b> + |url f> + |url l> + |url e> + |url j>
links-to |url 2> => |url h> + |url l> + |url b> + |url g> + |url i>
links-to |url 3> => |url m> + |url a> + |url d> + |url c> + |url n> + |url l>

Right, so at least superficially the friends network and the url links-to network look entirely different, but they actually share the same network structure.
sa: matrix[friends]
[ bella   ] = [ 0    0    1.00 ] [ Alex  ]
[ Beth    ]   [ 1.00 1.00 0    ] [ Bill  ]
[ charlie ]   [ 0    0    1.00 ] [ Harry ]
[ david   ]   [ 0    0    1.00 ]
[ Ed      ]   [ 1.00 0    0    ]
[ James   ]   [ 1.00 0    0    ]
[ Jason   ]   [ 1.00 1.00 0    ]
[ John    ]   [ 0    1.00 0    ]
[ lena    ]   [ 0    1.00 0    ]
[ Liz     ]   [ 1.00 0    0    ]
[ Mary    ]   [ 1.00 0    0    ]
[ nathan  ]   [ 1.00 1.00 1.00 ]
[ sam     ]   [ 0    0    1.00 ]
[ smithie ]   [ 0    0    1.00 ]
|matrix>
sa: matrix[links-to]
[ url a ] = [ 0    0    1.00 ] [ url 1 ]
[ url b ]   [ 1.00 1.00 0    ] [ url 2 ]
[ url c ]   [ 0    0    1.00 ] [ url 3 ]
[ url d ]   [ 0    0    1.00 ]
[ url e ]   [ 1.00 0    0    ]
[ url f ]   [ 1.00 0    0    ]
[ url g ]   [ 1.00 1.00 0    ]
[ url h ]   [ 0    1.00 0    ]
[ url i ]   [ 0    1.00 0    ]
[ url j ]   [ 1.00 0    0    ]
[ url k ]   [ 1.00 0    0    ]
[ url l ]   [ 1.00 1.00 1.00 ]
[ url m ]   [ 0    0    1.00 ]
[ url n ]   [ 0    0    1.00 ]
|matrix>

So I guess the implication is that just given a particular network structure (ie, a matrix), it is generally impossible to reconstruct the meaning of that network.
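We can make that concrete. Reading the node correspondence off the two matrices (bella <-> url a, Beth <-> url b, and so on down the rows), a few lines of plain Python confirm the two networks have identical structure:

friends = {'Alex':  {'Jason','Ed','Mary','Liz','Beth','James','nathan'},
           'Bill':  {'Jason','Beth','lena','John','nathan'},
           'Harry': {'charlie','bella','sam','smithie','david','nathan'}}

links_to = {'url 1': {'url k','url g','url b','url f','url l','url e','url j'},
            'url 2': {'url h','url l','url b','url g','url i'},
            'url 3': {'url m','url a','url d','url c','url n','url l'}}

# node correspondence, read off the matrix rows above:
node_map = {'Alex':'url 1','Bill':'url 2','Harry':'url 3',
            'bella':'url a','Beth':'url b','charlie':'url c','david':'url d',
            'Ed':'url e','James':'url f','Jason':'url g','John':'url h',
            'lena':'url i','Liz':'url j','Mary':'url k','nathan':'url l',
            'sam':'url m','smithie':'url n'}

# the mapping carries each friends list exactly onto the corresponding links-to list:
assert all({node_map[f] for f in friends[p]} == links_to[node_map[p]] for p in friends)
print("the two networks have the same structure")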
sa: create inverse
sa: merged-matrix[links-to,inverse-links-to]
[ url a ] = [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ url a ]
[ url b ]   [ 0    2.00 0    0    1.00 1.00 2.00 1.00 1.00 1.00 1.00 2.00 0    0    ] [ url b ]
[ url c ]   [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ url c ]
[ url d ]   [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ url d ]
[ url e ]   [ 0    1.00 0    0    1.00 1.00 1.00 0    0    1.00 1.00 1.00 0    0    ] [ url e ]
[ url f ]   [ 0    1.00 0    0    1.00 1.00 1.00 0    0    1.00 1.00 1.00 0    0    ] [ url f ]
[ url g ]   [ 0    2.00 0    0    1.00 1.00 2.00 1.00 1.00 1.00 1.00 2.00 0    0    ] [ url g ]
[ url h ]   [ 0    1.00 0    0    0    0    1.00 1.00 1.00 0    0    1.00 0    0    ] [ url h ]
[ url i ]   [ 0    1.00 0    0    0    0    1.00 1.00 1.00 0    0    1.00 0    0    ] [ url i ]
[ url j ]   [ 0    1.00 0    0    1.00 1.00 1.00 0    0    1.00 1.00 1.00 0    0    ] [ url j ]
[ url k ]   [ 0    1.00 0    0    1.00 1.00 1.00 0    0    1.00 1.00 1.00 0    0    ] [ url k ]
[ url l ]   [ 1.00 2.00 1.00 1.00 1.00 1.00 2.00 1.00 1.00 1.00 1.00 3.00 1.00 1.00 ] [ url l ]
[ url m ]   [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ url m ]
[ url n ]   [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ url n ]
|matrix>
sa: merged-matrix[friends,inverse-friends]
[ bella   ] = [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ bella   ]
[ Beth    ]   [ 0    2.00 0    0    1.00 1.00 2.00 1.00 1.00 1.00 1.00 2.00 0    0    ] [ Beth    ]
[ charlie ]   [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ charlie ]
[ david   ]   [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ david   ]
[ Ed      ]   [ 0    1.00 0    0    1.00 1.00 1.00 0    0    1.00 1.00 1.00 0    0    ] [ Ed      ]
[ James   ]   [ 0    1.00 0    0    1.00 1.00 1.00 0    0    1.00 1.00 1.00 0    0    ] [ James   ]
[ Jason   ]   [ 0    2.00 0    0    1.00 1.00 2.00 1.00 1.00 1.00 1.00 2.00 0    0    ] [ Jason   ]
[ John    ]   [ 0    1.00 0    0    0    0    1.00 1.00 1.00 0    0    1.00 0    0    ] [ John    ]
[ lena    ]   [ 0    1.00 0    0    0    0    1.00 1.00 1.00 0    0    1.00 0    0    ] [ lena    ]
[ Liz     ]   [ 0    1.00 0    0    1.00 1.00 1.00 0    0    1.00 1.00 1.00 0    0    ] [ Liz     ]
[ Mary    ]   [ 0    1.00 0    0    1.00 1.00 1.00 0    0    1.00 1.00 1.00 0    0    ] [ Mary    ]
[ nathan  ]   [ 1.00 2.00 1.00 1.00 1.00 1.00 2.00 1.00 1.00 1.00 1.00 3.00 1.00 1.00 ] [ nathan  ]
[ sam     ]   [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ sam     ]
[ smithie ]   [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ smithie ]
|matrix>
sa: merged-matrix[inverse-links-to,links-to]
[ url 1 ] = [ 7.00 3.00 1.00 ] [ url 1 ]
[ url 2 ]   [ 3.00 5.00 1.00 ] [ url 2 ]
[ url 3 ]   [ 1.00 1.00 6.00 ] [ url 3 ]
|matrix>
sa: merged-matrix[inverse-friends,friends]
[ Alex  ] = [ 7.00 3.00 1.00 ] [ Alex  ]
[ Bill  ]   [ 3.00 5.00 1.00 ] [ Bill  ]
[ Harry ]   [ 1.00 1.00 6.00 ] [ Harry ]
|matrix>

BTW, while we have this data loaded in the console, these do make sense:
sa: matrix[inverse-friends]
[ Alex  ] = [ 0    1.00 0    0    1.00 1.00 1.00 0    0    1.00 1.00 1.00 0    0    ] [ bella   ]
[ Bill  ]   [ 0    1.00 0    0    0    0    1.00 1.00 1.00 0    0    1.00 0    0    ] [ Beth    ]
[ Harry ]   [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ charlie ]
                                                                                      [ david   ]
                                                                                      [ Ed      ]
                                                                                      [ James   ]
                                                                                      [ Jason   ]
                                                                                      [ John    ]
                                                                                      [ lena    ]
                                                                                      [ Liz     ]
                                                                                      [ Mary    ]
                                                                                      [ nathan  ]
                                                                                      [ sam     ]
                                                                                      [ smithie ]
|matrix>
sa: matrix[inverse-links-to]
[ url 1 ] = [ 0    1.00 0    0    1.00 1.00 1.00 0    0    1.00 1.00 1.00 0    0    ] [ url a ]
[ url 2 ]   [ 0    1.00 0    0    0    0    1.00 1.00 1.00 0    0    1.00 0    0    ] [ url b ]
[ url 3 ]   [ 1.00 0    1.00 1.00 0    0    0    0    0    0    0    1.00 1.00 1.00 ] [ url c ]
                                                                                      [ url d ]
                                                                                      [ url e ]
                                                                                      [ url f ]
                                                                                      [ url g ]
                                                                                      [ url h ]
                                                                                      [ url i ]
                                                                                      [ url j ]
                                                                                      [ url k ]
                                                                                      [ url l ]
                                                                                      [ url m ]
                                                                                      [ url n ]
|matrix>

BTW, talking of "inverse-links-to", I wonder how hard it would be to implement a basic "page rank" in BKO?
sa: merged-matrix[inverse-friends,friends]
[ Alex  ] = [ 7.00 3.00 1.00 ] [ Alex  ]
[ Bill  ]   [ 3.00 5.00 1.00 ] [ Bill  ]
[ Harry ]   [ 1.00 1.00 6.00 ] [ Harry ]
|matrix>

Call the matrix M. Then:
M(x,y) = count common[friends] (|x> + |y>)

Or, in general, set M = merged-matrix[inverse-op,op]; then:
M(x,y) = count common[op] (|x> + |y>) = count intersection(op|x>,op|y>)
sa: inverse-friends friends |Alex>
7.000|Alex> + 3.000|Bill> + |Harry>
sa: inverse-friends friends |Bill>
3.000|Alex> + 5.000|Bill> + |Harry>
sa: inverse-friends friends |Harry>
6.000|Harry> + |Alex> + |Bill>

Heh. So it should be kind of obvious:
M(x,y) = <y|inverse-op op|x> -- noting that M is symmetrical in x,y.
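And a quick plain-Python check of that identity on the friends data from above (my own verification, not console output):

friends = {'Alex':  {'Jason','Ed','Mary','Liz','Beth','James','nathan'},
           'Bill':  {'Jason','Beth','lena','John','nathan'},
           'Harry': {'charlie','bella','sam','smithie','david','nathan'}}

for x in friends:
    print(x, [len(friends[x] & friends[y]) for y in friends])   # M(x,y) = count of common friends

# Alex [7, 3, 1]
# Bill [3, 5, 1]
# Harry [1, 1, 6]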
sa: load average.sw
sa: dump
----------------------------------------
|context> => |context: average>
ave |*> #=> arithmetic(count-sum "" |_self>,|/>,count "" |_self>)
apply-weights |*> #=> mult(""|_self>, weights|_self>)
weighted-ave |*> #=> arithmetic(count-sum apply-weights |_self>,|/>,count-sum weights |_self>)
tmp-ave |*> #=> arithmetic(count-sum 100 "" |_self>,|/>,count 100 "" |_self>)
harmonic-mean |*> #=> arithmetic(count "" |_self>,|/>,count-sum invert "" |_self>)
|u> => |a> + 2.000|b> + 3.000|c> + 4.000|d>
|x> => |a> + 2.000|b> + 3.000|c> + 4.000|d>
weights |x> => 0.100|a> + 0.100|b> + 0.700|c> + 0.100|d>
|y> => |a> + 2.000|b> + 5.000|c> + 7.000|d>
weights |y> => 2.000|a> + 14.000|b> + 8.000|c> + 32.000|d>
|tmp> => 0.100|a> + 0.100|b> + 0.700|c> + 0.100|d>
|z> => 60.000|a> + 40.000|b>
----------------------------------------
sa: map[ave,average] (|u> + |x> + |y> + |z>)
sa: matrix[average]
[ number: 2.5  ] = [ 1.00 1.00 0    0    ] [ u ]    -- NB: not one-to-one
[ number: 3.75 ]   [ 0    0    1.00 0    ] [ x ]
[ number: 50.0 ]   [ 0    0    0    1.00 ] [ y ]
                                           [ z ]
|matrix>
sa: map[harmonic-mean,harmonic] (|x> + |y> + |z>)
sa: matrix[harmonic]
[ number: 1.9200000000000004 ] = [ 1.00 0    0    ] [ x ]    -- NB: it is one-to-one
[ number: 2.170542635658915  ]   [ 0    1.00 0    ] [ y ]
[ number: 47.99999999999999  ]   [ 0    0    1.00 ] [ z ]
|matrix>
sa: map[weighted-ave,weighted-average] (|x> + |y>)
sa: matrix[weighted-average]
[ number: 2.8  ] = [ 1.00 0    ] [ x ]
[ number: 5.25 ]   [ 0    1.00 ] [ y ]
|matrix>
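As a quick plain-Python sanity check of those numbers, using the coefficients of |x> from the dump above:

coeffs  = [1.0, 2.0, 3.0, 4.0]    # |x> => |a> + 2|b> + 3|c> + 4|d>
weights = [0.1, 0.1, 0.7, 0.1]    # weights |x>

ave      = sum(coeffs)/len(coeffs)                                 # count-sum / count
harmonic = len(coeffs)/sum(1/c for c in coeffs)                    # count / count-sum invert
weighted = sum(w*c for w,c in zip(weights,coeffs))/sum(weights)    # as in weighted-ave

print(ave, harmonic, weighted)    # 2.5, ~1.92, 2.8 -- matching the matrices above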
They are metaphors for each other. They are different representations of the same thing. They are homomorphisms.
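As a quick sanity check, the ave, weighted-ave and harmonic-mean arithmetic can be reproduced in plain Python (just the numbers from the average.sw example above, none of the project's classes):

u = [1, 2, 3, 4]                         # |u> => |a> + 2|b> + 3|c> + 4|d>
print(sum(u) / len(u))                   # 2.5  -- ave |u>

x = [1, 2, 3, 4]
wx = [0.1, 0.1, 0.7, 0.1]                # weights |x>
print(sum(c * w for c, w in zip(x, wx)) / sum(wx))   # 2.8  -- weighted-ave |x>

print(len(x) / sum(1 / c for c in x))    # 1.92 -- harmonic-mean |x>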
def metric_mbr(metric, x, thresh, data):
    for elt in data:
        # depending on whether 0 or 1 means an exact match, you may want to swap to: <= thresh
        if metric(x, elt) >= thresh:
            return True
    return False

def categorize_list(data, metric, thresh):
    out_list = []
    for x in data:
        n = 0
        del_list = []
        for i in range(len(out_list)):
            if metric_mbr(metric, x, thresh, out_list[i]):
                if n == 0:
                    out_list[i].append(x)
                    idx = i
                    n = 1
                else:
                    out_list[idx] += out_list[i]
                    del_list.append(i)
        if n == 0:
            out_list.append([x])
        else:
            out_list = [x for index, x in enumerate(out_list) if index not in del_list]
    return out_list
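A small usage sketch (hypothetical metric and data), where the metric returns 1 for identical items and decays towards 0, so "near" means metric(x, elt) >= thresh:

metric = lambda a, b: 1 / (1 + abs(a - b))
print(categorize_list([1, 2, 3, 10, 11, 12], metric, 0.5))
# [[1, 2, 3], [10, 11, 12]] -- two distinct bridging sets / categories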
Now I need to write a BKO version.

# 28/5/2014:
# working towards a BKO version of the categorize code.
# first, the equivalent of metric_mbr, using simm.
#
# one is a superposition
# op is a string
# x is a ket
# thresh is a float
def simm_mbr(context, op, x, thresh, one):
    f = x.apply_op(context, op)
    for elt in one.data:
        g = elt.apply_op(context, op)
        if silent_simm(f, g) >= thresh:
            return True
    return False

# categorize[op,thresh,destination]
def categorize(context, parameters):
    try:
        op, thresh, destination = parameters.split(',')
        thresh = float(thresh)
        destination = ket(destination)
    except:
        return ket("", 0)

    one = context.relevant_kets(op)    # one is a superposition
    print("one:", one)

    out_list = []                      # out_list will be a list of superpositions.
    for x in one.data:                 # x is of course a ket
        n = 0
        del_list = []                  # del_list will be a list of integers.
        for i in range(len(out_list)):
            if simm_mbr(context, op, x, thresh, out_list[i]):
                if n == 0:
                    out_list[i] += x
                    idx = i
                    n = 1
                else:
                    out_list[idx] += out_list[i]
                    del_list.append(i)
        if n == 0:
            # we use "superposition() + x" instead of just "x" so out_list is always a list of superpositions, not kets.
            out_list.append(superposition() + x)
        else:
            out_list = [x for index, x in enumerate(out_list) if index not in del_list]

    for k, sp in enumerate(out_list):
        print("sp:", sp)
        context.learn("category-" + str(k), destination, sp)
    return ket("categorize")
sa: load H-I-pat-rec.sw
sa: simm |*> #=> 100 similar[pixels] |_self> + 100 |_self>   -- need to add 100 |_self> else the diagonals in the simm matrix will be 0.
sa: |list> => |letter: H> + |noisy: H> + |noisy: H2> + |letter: I> + |noisy: I> + |noisy: I2> + |letter: O>
sa: |list> => shuffle "" |list>   -- shuffle the list.
sa: map[simm,simm-pixels] "" |list>
sa: matrix[simm-pixels]
[ letter: H ]  = [ 100.00 29.41  40.91  82.35  76.19  17.65  35.00  ] [ letter: H ]
[ letter: I ]    [ 29.41  100.00 45.45  26.67  38.10  73.33  65.00  ] [ letter: I ]
[ letter: O ]    [ 40.91  45.45  100.00 36.36  50.00  36.36  40.91  ] [ letter: O ]
[ noisy: H ]     [ 82.35  26.67  36.36  100.00 61.90  14.29  25.00  ] [ noisy: H ]
[ noisy: H2 ]    [ 76.19  38.10  50.00  61.90  100.00 19.05  47.62  ] [ noisy: H2 ]
[ noisy: I ]     [ 17.65  73.33  36.36  14.29  19.05  100.00 45.00  ] [ noisy: I ]
[ noisy: I2 ]    [ 35.00  65.00  40.91  25.00  47.62  45.00  100.00 ] [ noisy: I2 ]
|matrix>
sa: categorize[pixels,0.6,result]
one: |letter: H> + |noisy: H> + |noisy: H2> + |letter: I> + |noisy: I> + |noisy: I2> + |letter: O>   -- the order is determined by relevant-kets[op]
sp: |letter: H> + |noisy: H> + |noisy: H2>
sp: |letter: I> + |noisy: I> + |noisy: I2>
sp: |letter: O>
|categorize>
sa: dump |result>
category-0 |result> => |letter: H> + |noisy: H> + |noisy: H2>
category-1 |result> => |letter: I> + |noisy: I> + |noisy: I2>
category-2 |result> => |letter: O>
BTW, it turns out shuffling the list does nothing. The matrix spits out results using sort, and the relevant-kets[op] order is the same as in the original .sw file.
sa: load fragment-documents-64k--post-processing--saved.sw
sa: matrix[drop-6-simm]
[ diary-1-64k ]        = [ 0 0     98.82 15.92 18.98 17.16 16.34 13.99 13.99 14.75 42.20 42.20 ] [ * ]
[ diary-2-64k ]          [ 0 98.82 0     16.18 19.24 17.65 16.27 14.31 14.31 15.07 42.17 42.17 ] [ diary-1-64k ]
[ eztv-1-64k ]           [ 0 15.92 16.18 0     91.82 13.81 9.58  22.35 22.41 23.30 16.72 16.72 ] [ diary-2-64k ]
[ eztv-2-64k ]           [ 0 18.98 19.24 91.82 0     13.81 9.58  22.20 22.26 23.16 17.80 17.80 ] [ eztv-1-64k ]
[ semantic-1-64k ]       [ 0 17.16 17.65 13.81 13.81 0     79.93 15.79 15.79 15.88 10.20 10.20 ] [ eztv-2-64k ]
[ semantic-2-64k ]       [ 0 16.34 16.27 9.58  9.58  79.93 0     11.56 11.56 11.65 10.00 10.00 ] [ semantic-1-64k ]
[ slashdot-1-64k ]       [ 0 13.99 14.31 22.35 22.20 15.79 11.56 0     99.94 96.97 11.11 11.11 ] [ semantic-2-64k ]
[ slashdot-2-64k ]       [ 0 13.99 14.31 22.41 22.26 15.79 11.56 99.94 0     97.03 11.11 11.11 ] [ slashdot-1-64k ]
[ slashdot-3-64k ]       [ 0 14.75 15.07 23.30 23.16 15.88 11.65 96.97 97.03 0     11.09 11.09 ] [ slashdot-2-64k ]
[ wc-comments-1-64k ]    [ 0 42.20 42.17 16.72 17.80 10.20 10.00 11.11 11.11 11.09 0     99.89 ] [ slashdot-3-64k ]
[ wc-comments-2-64k ]    [ 0 42.20 42.17 16.72 17.80 10.20 10.00 11.11 11.11 11.09 99.89 0     ] [ wc-comments-1-64k ]
                                                                                                  [ wc-comments-2-64k ]
|matrix>
-- NB: the 0 on the diagonal because we used (ie, no + 100 |_self> term):
sa: dump |*>
drop-1-simm |*> #=> 100 similar[hash-64k] |_self>
drop-2-simm |*> #=> 100 similar[drop-2-hash] |_self>
drop-3-simm |*> #=> 100 similar[drop-3-hash] |_self>
drop-4-simm |*> #=> 100 similar[drop-4-hash] |_self>
drop-5-simm |*> #=> 100 similar[drop-5-hash] |_self>
drop-6-simm |*> #=> 100 similar[drop-6-hash] |_self>
drop-7-simm |*> #=> 100 similar[drop-7-hash] |_self>
drop-8-simm |*> #=> 100 similar[drop-8-hash] |_self>
drop-9-simm |*> #=> 100 similar[drop-9-hash] |_self>
drop-10-simm |*> #=> 100 similar[drop-10-hash] |_self>
sa: categorize[drop-6-hash,0.75,result]
one: |semantic-2-64k> + |eztv-1-64k> + |slashdot-3-64k> + |slashdot-1-64k> + |wc-comments-2-64k> + |diary-1-64k> + |eztv-2-64k> + |diary-2-64k> + |wc-comments-1-64k> + |slashdot-2-64k> + |semantic-1-64k>
sp: |semantic-2-64k> + |semantic-1-64k>
sp: |eztv-1-64k> + |eztv-2-64k>
sp: |slashdot-3-64k> + |slashdot-1-64k> + |slashdot-2-64k>
sp: |wc-comments-2-64k> + |wc-comments-1-64k>
sp: |diary-1-64k> + |diary-2-64k>
|categorize>
sa: dump |result>
category-0 |result> => |semantic-2-64k> + |semantic-1-64k>
category-1 |result> => |eztv-1-64k> + |eztv-2-64k>
category-2 |result> => |slashdot-3-64k> + |slashdot-1-64k> + |slashdot-2-64k>
category-3 |result> => |wc-comments-2-64k> + |wc-comments-1-64k>
category-4 |result> => |diary-1-64k> + |diary-2-64k>
BTW, the big-O for categorize is pretty terrible at the moment. So is intersection_fn(), which I have been intending to fix for a while now (presumably with the help of ordered dictionaries).
First, we say x is near y if metric[x,y] <= t, for a metric of your choice and some threshold t (of course, different values of t change the result!).

Then a linear bridging set is a set {x0,x1,x2,x3,...,xn} such that:
1) x_k is near x_(k+1), for all k in {0,1,...,n-1}
2) x_0 is not near x_n

A general bridging set is a set {x0,x1,x2,x3,...,xn} such that:
1) for every j in {0,1,...,n}, x_j is near an x_k for some k != j in {0,1,...,n}
   -- ie, every element in the set is near some other element in the set
2) there exists at least one j,k pair such that x_j is not near x_k
   -- in the categorize code, we tend to drop this requirement.

Hrmm... maybe (2) should be changed to:
2) there may exist j,k pairs such that x_j is not near x_k
to clearly distinguish bridging sets from standard equivalence classes. cf, the definition of an equivalence operator a ~ b:
1) a ~ a
2) if a ~ b then b ~ a   -- ie, symmetric
3) if a ~ b, and b ~ c, then a ~ c   -- ie, transitive. NB: in general, bridging sets don't have this property!
4) if a ~ b then a is in [b]   -- ie, a is a member of the equivalence class [b]

The point: given a set of elements, the categorize code partitions it into distinct general bridging sets. Also, the lack of transitivity in bridging sets is why the categorize code has to go through some contortions! (Where by "lack of transitivity" I mean: just because a is near b, and b is near c, doesn't imply a is near c.)

Some examples:
The easiest is a bridge. It is a very simple example of a linear bridging set, and along with species DNA was a motivator for the bridging set idea. Set the left bank to be x_0, the right bank to be x_n, and the steps you take from one side to the other form the bridging set.
The 400m running track on an oval is a simple general bridging set; so is the path you take for your morning jog.
If we have a metric that can measure the similarity in DNA (some version of simm perhaps), then each species forms a distinct bridging set. A good use case for the categorize code, BTW.
The collection of atoms that make up, say, a dog forms another bridging set.
The tree of life, ie the evolution of life from single cell to multi-cellular life, is a big bridging set. A smaller version of this: you are in a bridging set with your parents, grand-parents, and back through your ancestors. And then via your parents, you are in a bridging set with your siblings, their children, their children's children and so on.
A train of thought, or a math proof, can also be considered a bridging set (though I'm not sure what putting it in these terms buys us).
A person's face from all different angles/perspectives forms a bridging set. This idea should be useful! Ditto a banana, or a pen, or a stapler, or a tiger, or an elephant -- any object really.
Your appearance, first as a baby, then up through adulthood, and then old age, forms a linear bridging set.
Scenes in a movie/tv show, even as characters move around a bit, form a general bridging set.
Your weekly shopping basket is usually a linear bridging set (if you are a consistent shopper). Eg, from 5 years ago, week by week, till now.
There are other trivial examples of linear bridging sets:
{a,b,c,d,e,f,...,x,y,z}
{0,1,2,3,4,5,6,...,20,21,22,23,24,25}
Water slowly brought to the boil.
etc.

Some notes:
The value of t has a strong influence on the result. Set it too tight and your categories splinter into smaller ones. Set it too loose, and everything ends up in the same category.
The addition of a single new element can sometimes merge two or more categories into one, if it is in a key location.
And the other way too: the removal of a key element can fracture a category into two or more pieces (eg, if you remove the middle of the bridge, it is no longer a single bridging set).

OK. Weirdly, we can map bridging sets into equivalence classes, but I'm not sure what it buys us! If a and b are members of the same bridging set, then we can say: a ~ b. Then the standard equivalence class conditions (1,2,3,4 above) are met.
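One way to see this mapping: "a ~ b iff a and b are in the same bridging set" is just "a and b are in the same connected component of the near-graph". A minimal sketch, assuming near() is your thresholded metric (and this is essentially what the categorize code computes):

def components(data, near):
    comps = []
    for x in data:
        hits = [c for c in comps if any(near(x, y) for y in c)]
        merged = {x}.union(*hits) if hits else {x}
        comps = [c for c in comps if c not in hits] + [merged]
    return comps

near = lambda a, b: abs(a - b) <= 1
print(components([1, 2, 3, 10, 11, 12], near))   # [{1, 2, 3}, {10, 11, 12}]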
|context> => |context: www proposal>

describes |document: www proposal> => |"Hypertext"> + |A Proposal "Mesh">
refers-to |document: www proposal> => |Comms ACM>

describes |Comms ACM> => |"Hypertext">

includes |"Hypertext"> => |Linked information> + |Hypermedia>

for-example |Linked information> => |Hyper Card> + |ENQUIRE> + |A Proposal "Mesh">

describes |a proposal "mesh"> => |CERN>
unifies |a proposal "mesh"> => |ENQUIRE> + |VAX/NOTES> + |uucp News> + |CERNDOC>

examples |Computer conferencing> => |IBM GroupTalk> + |uucp News> + |VAX/NOTES> + |A Proposal "Mesh">

for-example |Hierarchical systems> => |CERN> + |CERNDOC> + |Vax/Notes> + |uucp News> + |IBM GroupTalk>

includes |CERNDOC> => |document: www proposal>

wrote |person: Tim Berners-Lee> => |document: www proposal>
sa: load www-proposal.sw
sa: display
  context: www proposal

  document: www proposal
    supported-ops: op: describes, op: refers-to
         describes: "Hypertext", A Proposal "Mesh"
         refers-to: Comms ACM

  Comms ACM
    supported-ops: op: describes
         describes: "Hypertext"

  "Hypertext"
    supported-ops: op: includes
          includes: Linked information, Hypermedia

  Linked information
    supported-ops: op: for-example
       for-example: Hyper Card, ENQUIRE, A Proposal "Mesh"

  a proposal "mesh"
    supported-ops: op: describes, op: unifies
         describes: CERN
           unifies: ENQUIRE, VAX/NOTES, uucp News, CERNDOC

  Computer conferencing
    supported-ops: op: examples
          examples: IBM GroupTalk, uucp News, VAX/NOTES, A Proposal "Mesh"

  Hierarchical systems
    supported-ops: op: for-example
       for-example: CERN, CERNDOC, Vax/Notes, uucp News, IBM GroupTalk

  CERNDOC
    supported-ops: op: includes
          includes: document: www proposal

  person: Tim Berners-Lee
    supported-ops: op: wrote
             wrote: document: www proposal

So this is just one more representation of knowledge (along with mind-maps, matrices and of course sw).
Cyc:
(#$isa #$BillClinton #$UnitedStatesPresident)
"Bill Clinton belongs to the collection of U.S. presidents"

(#$genls #$Tree-ThePlant #$Plant)
"All trees are plants"

(#$capitalCity #$France #$Paris)
"Paris is the capital of France."

(#$implies (#$and (#$isa ?OBJ ?SUBSET) (#$genls ?SUBSET ?SUPERSET)) (#$isa ?OBJ ?SUPERSET))
"if OBJ is an instance of the collection SUBSET and SUBSET is a subcollection of SUPERSET, then OBJ is an instance of the collection SUPERSET"

(#$relationAllExists #$biologicalMother #$ChordataPhylum #$FemaleAnimal)
"for every instance of the collection #$ChordataPhylum (i.e. for every chordate), there exists a female animal (instance of #$FemaleAnimal) which is its mother (described by the predicate #$biologicalMother)"

This roughly translates to the following BKO:
is-a |person: Bill Clinton> => |United States President>
or:
|United States President: _list> => ... + |person: Bill Clinton> + ...
<person: Bill Clinton|"" |United States President: _list> == 1

is-plant |tree: *> => |yes>

capital-city |country: France> => |city: Paris>

<OBJ|""|subset: _list> == 1
|superset: _list> => ... + |subset: _list> + ...
<OBJ|""|superset: _list> == 1
(or something like that, not 100% sure)

has-mother |Chordata Phylum: *> => |yes>
sa: load matrix-as-network.sw
sa: vector[friends] |Alex>
[ Beth ]    = [ 1.000 ] [ Alex ]
[ Ed ]        [ 1.000 ]
[ James ]     [ 1.000 ]
[ Jason ]     [ 1.000 ]
[ Liz ]       [ 1.000 ]
[ Mary ]      [ 1.000 ]
[ nathan ]    [ 1.000 ]
|matrix>
sa: vector[friends] (|Bill> + |Harry>)
[ bella ]    = [ 0     1.000 ] [ Bill ]
[ Beth ]       [ 1.000 0     ] [ Harry ]
[ charlie ]    [ 0     1.000 ]
[ david ]      [ 0     1.000 ]
[ Jason ]      [ 1.000 0     ]
[ John ]       [ 1.000 0     ]
[ lena ]       [ 1.000 0     ]
[ nathan ]     [ 1.000 1.000 ]
[ sam ]        [ 0     1.000 ]
[ smithie ]    [ 0     1.000 ]
|matrix>
sa: relevant-kets[friends]
|Alex> + |Bill> + |Harry>
sa: vector[friends] relevant-kets[friends]   -- do it indirectly.
[ bella ]    = [ 0     0     1.000 ] [ Alex ]
[ Beth ]       [ 1.000 1.000 0     ] [ Bill ]
[ charlie ]    [ 0     0     1.000 ] [ Harry ]
[ david ]      [ 0     0     1.000 ]
[ Ed ]         [ 1.000 0     0     ]
[ James ]      [ 1.000 0     0     ]
[ Jason ]      [ 1.000 1.000 0     ]
[ John ]       [ 0     1.000 0     ]
[ lena ]       [ 0     1.000 0     ]
[ Liz ]        [ 1.000 0     0     ]
[ Mary ]       [ 1.000 0     0     ]
[ nathan ]     [ 1.000 1.000 1.000 ]
[ sam ]        [ 0     0     1.000 ]
[ smithie ]    [ 0     0     1.000 ]
|matrix>
-- and this one for fun!
sa: vector[friends] shuffle relevant-kets[friends]   -- NB: the shuffle in there changed the order (on the right hand side)
[ bella ]    = [ 1.000 0     0     ] [ Harry ]
[ Beth ]       [ 0     1.000 1.000 ] [ Alex ]
[ charlie ]    [ 1.000 0     0     ] [ Bill ]
[ david ]      [ 1.000 0     0     ]
[ Ed ]         [ 0     1.000 0     ]
[ James ]      [ 0     1.000 0     ]
[ Jason ]      [ 0     1.000 1.000 ]
[ John ]       [ 0     0     1.000 ]
[ lena ]       [ 0     0     1.000 ]
[ Liz ]        [ 0     1.000 0     ]
[ Mary ]       [ 0     1.000 0     ]
[ nathan ]     [ 1.000 1.000 1.000 ]
[ sam ]        [ 1.000 0     0     ]
[ smithie ]    [ 1.000 0     0     ]
|matrix>
sa: vector[friends] shuffle relevant-kets[friends]   -- NB: different order. Harry, Bill, Alex instead of Harry, Alex, Bill.
[ bella ]    = [ 1.000 0     0     ] [ Harry ]
[ Beth ]       [ 0     1.000 1.000 ] [ Bill ]
[ charlie ]    [ 1.000 0     0     ] [ Alex ]
[ david ]      [ 1.000 0     0     ]
[ Ed ]         [ 0     0     1.000 ]
[ James ]      [ 0     0     1.000 ]
[ Jason ]      [ 0     1.000 1.000 ]
[ John ]       [ 0     1.000 0     ]
[ lena ]       [ 0     1.000 0     ]
[ Liz ]        [ 0     0     1.000 ]
[ Mary ]       [ 0     0     1.000 ]
[ nathan ]     [ 1.000 1.000 1.000 ]
[ sam ]        [ 1.000 0     0     ]
[ smithie ]    [ 1.000 0     0     ]
|matrix>
sa: |list> => |Bill> + |Alex>
sa: vector[friends] "" |list>   -- another indirect example.
[ Beth ]    = [ 1.000 1.000 ] [ Bill ]
[ Ed ]        [ 0     1.000 ] [ Alex ]
[ James ]     [ 0     1.000 ]
[ Jason ]     [ 1.000 1.000 ]
[ John ]      [ 1.000 0     ]
[ lena ]      [ 1.000 0     ]
[ Liz ]       [ 0     1.000 ]
[ Mary ]      [ 0     1.000 ]
[ nathan ]    [ 1.000 1.000 ]
|matrix>
-- if you pass in |>, it defaults back to show everyone that supports that op.
sa: vector[friends] |>
[ bella ]    = [ 0     0     1.000 ] [ Alex ]
[ Beth ]       [ 1.000 1.000 0     ] [ Bill ]
[ charlie ]    [ 0     0     1.000 ] [ Harry ]
[ david ]      [ 0     0     1.000 ]
[ Ed ]         [ 1.000 0     0     ]
[ James ]      [ 1.000 0     0     ]
[ Jason ]      [ 1.000 1.000 0     ]
[ John ]       [ 0     1.000 0     ]
[ lena ]       [ 0     1.000 0     ]
[ Liz ]        [ 1.000 0     0     ]
[ Mary ]       [ 1.000 0     0     ]
[ nathan ]     [ 1.000 1.000 1.000 ]
[ sam ]        [ 0     0     1.000 ]
[ smithie ]    [ 0     0     1.000 ]
|matrix>
-- if you don't pass in a superposition, it defaults back to the default ket.
sa: id   -- show the default ket
0.000|>   -- the default ket is 0|>
sa: x = fred   -- set the default ket
sa: id   -- show the default ket
|fred>
sa: vector[friends]   -- same as: vector[friends] |fred>
[  ]  = [ 0 ] [ fred ]   -- fred currently has no friends.
|matrix>
sa: friends |fred> => |Sam> + |Mary>   -- give fred a couple of friends
sa: vector[friends]
[ Mary ]  = [ 1.000 ] [ fred ]
[ Sam ]     [ 1.000 ]
|matrix>
So that is about it. Same as merged-matrix, but you get to choose who you are interested in, instead of always giving data on everyone (that supports that op)!
sa: matrix[letter-count]
[ a ]  = [ 9083  26317 142241 23325 76232  35669 260565 35285 23871 ] [ Alice-in-Wonderland ]
[ b ]    [ 1621  4766  25476  4829  15699  6847  50138  6117  4763  ] [ Frankenstein ]
[ c ]    [ 2817  9055  37297  7379  21938  11349 72409  10725 6942  ] [ Gone-with-Wind ]
[ d ]    [ 5228  16720 85897  12139 37966  18763 144619 18828 15168 ] [ I-Robot ]
[ e ]    [ 15084 45720 228415 37293 117608 59029 440119 54536 37230 ] [ Moby-Dick ]
[ f ]    [ 2248  8516  34779  5940  20363  9936  73859  9105  6270  ] [ nineteen-eighty-four ]
[ g ]    [ 2751  5762  38283  6037  20489  9113  61948  8023  6822  ] [ Shakespeare ]
[ h ]    [ 7581  19400 119901 16803 61947  28093 234301 28284 19130 ] [ Sherlock-Holmes ]
[ i ]    [ 7803  21411 101987 20074 62942  30304 214275 27361 18380 ] [ Tom-Sawyer ]
[ j ]    [ 222   431   1501   346   915    310   2955   421   465   ]
[ k ]    [ 1202  1722  18290  2370  8011   3512  32029  3590  3136  ]
[ l ]    [ 5053  12603 79783  12870 42338  18395 156371 17276 12426 ]
[ m ]    [ 2245  10295 39595  6534  22871  10513 101507 11391 7255  ]
[ n ]    [ 7871  24220 123989 21302 65429  31516 231652 29337 20858 ]
[ o ]    [ 9245  25050 130230 24555 69648  34287 299732 34452 24251 ]
[ p ]    [ 1796  5939  23979  5148  16553  8058  50638  6987  4766  ]
[ q ]    [ 135   323   1270   321   1244   397   2998   416   182   ]
[ r ]    [ 6400  20708 105074 17003 52446  25861 224994 25378 16262 ]
[ s ]    [ 6980  20808 107430 18044 62734  28382 232317 27105 17852 ]
[ t ]    [ 11631 29706 157163 28316 86983  42127 311911 39232 28389 ]
[ u ]    [ 3867  10340 50453  9483  26933  12903 121631 13527 9376  ]
[ v ]    [ 911   3788  15224  3062  8540   4252  36692  4471  2451  ]
[ w ]    [ 2696  7335  43623  6761  21174  11225 78929  10754 7735  ]
[ x ]    [ 170   675   1700   508   1037   779   4867   567   326   ]
[ y ]    [ 2442  7743  37639  6552  16849  9071  90162  9267  6830  ]
[ z ]    [ 79    243   1045   208   598    303   1418   150   155   ]
Here is the (scaled) similarity:
sa: 100 similar[letter-count] |Sherlock-Holmes>
97.879|nineteen-eighty-four> + 97.541|Tom-Sawyer> + 97.393|Moby-Dick> + 97.352|I-Robot> + 97.116|Gone-with-Wind> + 97.089|Alice-in-Wonderland> + 97.079|Shakespeare> + 96.516|Frankenstein>
So the relative frequency of letters is very similar across a broad range of texts, including back to Shakespeare. Presumably other languages, say French, German or Italian, would have different frequencies.
Standard simm is normalized/scaled so that (see the definition of simm above) w*f == w*g:

def simm(A,B):
    return intersection(A.normalize(),B.normalize()).count_sum()

But there is also a non-normalized/scaled version:

def unscaled_simm(A,B):
    return intersection(A,B).count_sum()/max(A.count_sum(),B.count_sum())

Usually, from experience, it seems the scaled version gives better results. But for comparison, here is the unscaled simm (yet to be wired into the processor, BTW):

sa: 100 unscaled-similar[letter-count] |Sherlock-Holmes>
95.354|nineteen-eighty-four> + 78.455|Frankenstein> + 69.638|Tom-Sawyer> + 68.690|I-Robot> + 46.045|Moby-Dick> + 27.084|Alice-in-Wonderland> + 24.687|Gone-with-Wind> + 12.244|Shakespeare>

Now some work in the console:
sa: load ebook-letter-counts.sw
sa: |list> => |nineteen-eighty-four> + |Tom-Sawyer> + |I-Robot> + |Gone-with-Wind> + |Frankenstein> + |Shakespeare> + |Moby-Dick> + |Sherlock-Holmes> + |Alice-in-Wonderland>
sa: norm |*> #=> normalize letter-count |_self>
sa: usimm |*> #=> 100 unscaled-similar[letter-count] |_self> + 100 |_self>   -- unscaled-similar not yet wired in.
sa: simm |*> #=> 100 similar[letter-count] |_self> + 100 |_self>
sa: map[norm,normalized-letter-count] "" |list>
sa: map[usimm,unscaled-simm-matrix] "" |list>
sa: map[simm,simm-matrix] "" |list>
sa: matrix[normalized-letter-count]
[ a ]  = [ 0.07753 0.07750 0.08118 0.07848 0.08114 0.07909 0.07375 0.08157 0.07923 ] [ Alice-in-Wonderland ]
[ b ]    [ 0.01384 0.01403 0.01454 0.01625 0.01671 0.01518 0.01419 0.01414 0.01581 ] [ Frankenstein ]
[ c ]    [ 0.02404 0.02666 0.02129 0.02483 0.02335 0.02516 0.02049 0.02479 0.02304 ] [ Gone-with-Wind ]
[ d ]    [ 0.04462 0.04923 0.04902 0.04084 0.04041 0.04160 0.04093 0.04352 0.05034 ] [ I-Robot ]
[ e ]    [ 0.12875 0.13463 0.13035 0.12548 0.12518 0.13089 0.12457 0.12607 0.12357 ] [ Moby-Dick ]
[ f ]    [ 0.01919 0.02508 0.01985 0.01999 0.02167 0.02203 0.02091 0.02105 0.02081 ] [ nineteen-eighty-four ]
[ g ]    [ 0.02348 0.01697 0.02185 0.02031 0.02181 0.02021 0.01753 0.01855 0.02264 ] [ Shakespeare ]
[ h ]    [ 0.06471 0.05713 0.06843 0.05654 0.06594 0.06229 0.06632 0.06538 0.06349 ] [ Sherlock-Holmes ]
[ i ]    [ 0.06660 0.06305 0.05820 0.06754 0.06700 0.06719 0.06065 0.06325 0.06100 ] [ Tom-Sawyer ]
[ j ]    [ 0.00189 0.00127 0.00086 0.00116 0.00097 0.00069 0.00084 0.00097 0.00154 ]
[ k ]    [ 0.01026 0.00507 0.01044 0.00797 0.00853 0.00779 0.00907 0.00830 0.01041 ]
[ l ]    [ 0.04313 0.03711 0.04553 0.04330 0.04507 0.04079 0.04426 0.03994 0.04124 ]
[ m ]    [ 0.01916 0.03032 0.02260 0.02199 0.02434 0.02331 0.02873 0.02633 0.02408 ]
[ n ]    [ 0.06718 0.07132 0.07076 0.07168 0.06964 0.06988 0.06557 0.06782 0.06923 ]
[ o ]    [ 0.07891 0.07376 0.07432 0.08262 0.07413 0.07603 0.08484 0.07964 0.08049 ]
[ p ]    [ 0.01533 0.01749 0.01368 0.01732 0.01762 0.01787 0.01433 0.01615 0.01582 ]
[ q ]    [ 0.00115 0.00095 0.00072 0.00108 0.00132 0.00088 0.00085 0.00096 0.00060 ]
[ r ]    [ 0.05463 0.06098 0.05996 0.05721 0.05582 0.05734 0.06368 0.05867 0.05397 ]
[ s ]    [ 0.05958 0.06127 0.06131 0.06071 0.06677 0.06293 0.06576 0.06266 0.05925 ]
[ t ]    [ 0.09927 0.08747 0.08969 0.09528 0.09259 0.09341 0.08828 0.09069 0.09422 ]
[ u ]    [ 0.03301 0.03045 0.02879 0.03191 0.02867 0.02861 0.03443 0.03127 0.03112 ]
[ v ]    [ 0.00778 0.01115 0.00869 0.01030 0.00909 0.00943 0.01039 0.01034 0.00813 ]
[ w ]    [ 0.02301 0.02160 0.02490 0.02275 0.02254 0.02489 0.02234 0.02486 0.02567 ]
[ x ]    [ 0.00145 0.00199 0.00097 0.00171 0.00110 0.00173 0.00138 0.00131 0.00108 ]
[ y ]    [ 0.02084 0.02280 0.02148 0.02205 0.01793 0.02011 0.02552 0.02142 0.02267 ]
[ z ]    [ 0.00067 0.00072 0.00060 0.00070 0.00064 0.00067 0.00040 0.00035 0.00051 ]
|matrix>
sa: matrix[unscaled-simm-matrix]
[ Alice-in-Wonderland ]   = [ 100.00000 34.50011  6.68626   39.42134  12.47074  25.97839  3.31616   27.08393  38.88633  ] [ Alice-in-Wonderland ]
[ Frankenstein ]            [ 34.50011  100.00000 19.38041  87.14738  36.14696  75.27262  9.61202   78.45510  87.86411  ] [ Frankenstein ]
[ Gone-with-Wind ]          [ 6.68626   19.38041  100.00000 16.96103  53.61561  25.73779  49.59655  24.68720  17.19438  ] [ Gone-with-Wind ]
[ I-Robot ]                 [ 39.42134  87.14738  16.96103  100.00000 31.63450  65.89134  8.41209   68.69032  96.69821  ] [ I-Robot ]
[ Moby-Dick ]               [ 12.47074  36.14696  53.61561  31.63450  100.00000 48.00428  26.59149  46.04481  32.06974  ] [ Moby-Dick ]
[ nineteen-eighty-four ]    [ 25.97839  75.27262  25.73779  65.89134  48.00428  100.00000 12.76506  95.35360  66.77162  ] [ nineteen-eighty-four ]
[ Shakespeare ]             [ 3.31616   9.61202   49.59655  8.41209   26.59149  12.76506  100.00000 12.24400  8.52782   ] [ Shakespeare ]
[ Sherlock-Holmes ]         [ 27.08393  78.45510  24.68720  68.69032  46.04481  95.35360  12.24400  100.00000 69.63764  ] [ Sherlock-Holmes ]
[ Tom-Sawyer ]              [ 38.88633  87.86411  17.19438  96.69821  32.06974  66.77162  8.52782   69.63764  100.00000 ] [ Tom-Sawyer ]
|matrix>
sa: matrix[simm-matrix]
[ Alice-in-Wonderland ]   = [ 100.00000 94.93789  96.51590  97.31733  96.76409  97.11247  95.57426  97.08918  97.49465  ] [ Alice-in-Wonderland ]
[ Frankenstein ]            [ 94.93789  100.00000 95.97417  96.01124  95.22426  96.47916  95.24344  96.51556  95.53771  ] [ Frankenstein ]
[ Gone-with-Wind ]          [ 96.51590  95.97417  100.00000 96.00326  96.98087  97.01012  95.91322  97.11608  97.16821  ] [ Gone-with-Wind ]
[ I-Robot ]                 [ 97.31733  96.01124  96.00326  100.00000 97.30176  97.87105  96.06118  97.35198  97.11884  ] [ I-Robot ]
[ Moby-Dick ]               [ 96.76409  95.22426  96.98087  97.30176  100.00000 98.04903  96.06974  97.39297  96.84666  ] [ Moby-Dick ]
[ nineteen-eighty-four ]    [ 97.11247  96.47916  97.01012  97.87105  98.04903  100.00000 95.54986  97.87913  97.10264  ] [ nineteen-eighty-four ]
[ Shakespeare ]             [ 95.57426  95.24344  95.91322  96.06118  96.06974  95.54986  100.00000 97.07934  95.89015  ] [ Shakespeare ]
[ Sherlock-Holmes ]         [ 97.08918  96.51556  97.11608  97.35198  97.39297  97.87913  97.07934  100.00000 97.54125  ] [ Sherlock-Holmes ]
[ Tom-Sawyer ]              [ 97.49465  95.53771  97.16821  97.11884  96.84666  97.10264  95.89015  97.54125  100.00000 ] [ Tom-Sawyer ]
|matrix>
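For reference, here are the two simm variants sketched over plain frequency dicts (assuming, as in the definitions above, that the intersection of two weighted sets takes the min of the coeffs):

def count_sum(d):
    return sum(d.values())

def intersection(f, g):
    return {k: min(f[k], g[k]) for k in f.keys() & g.keys()}

def normalize(d):
    s = count_sum(d)
    return {k: v / s for k, v in d.items()}

def simm(A, B):           # scaled: only the shape of the distribution matters
    return count_sum(intersection(normalize(A), normalize(B)))

def unscaled_simm(A, B):  # unscaled: raw counts matter too
    return count_sum(intersection(A, B)) / max(count_sum(A), count_sum(B))

A = {"a": 2, "b": 6}
B = {"a": 1, "b": 3}      # same shape as A, half the size
print(simm(A, B))         # 1.0 -- the scaled version calls these identical
print(unscaled_simm(A, B))  # 0.5 -- the unscaled version does not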
-- how many words in the data set?
sa: count "" |word: _list>
|number: 233088>

-- now some words:
sa: POS |word: the>
0.500|POS: Definite Article> + 0.500|POS: Adverb>
sa: POS |word: frog>
0.500|POS: Noun> + 0.500|POS: Verb (participle)>
sa: POS |word: swim>
0.250|POS: Verb (participle)> + 0.250|POS: Verb (transitive)> + 0.250|POS: Noun> + 0.250|POS: Verb (intransitive)>
sa: POS |word: fly>
0.200|POS: Verb (participle)> + 0.200|POS: Verb (intransitive)> + 0.200|POS: Verb (transitive)> + 0.200|POS: Noun> + 0.200|POS: Adjective>
sa: POS |word: Australia>
|POS: Noun>

-- do this to catch words/objects we have no POS data on:
sa: POS |*> => |don't know>
sa: POS |word: alkjdf>
|don't know>

-- add up all the parts-of-speech in a sentence:
sa: POS read |text: the frog jumped over the mat on his way to dinner>
1.500|POS: Definite Article> + 2.500|POS: Adverb> + 2.750|POS: Noun> + 0.750|POS: Verb (participle)> + |don't know> + |POS: Preposition> + 0.750|POS: Adjective> + 0.250|POS: Verb (transitive)> + 0.500|POS: Pronoun>

Now, a couple of comments:
sa: load matrix-as-network.sw
sa: |list> => |Alex> + |Bill> + |Harry>
sa: dump "" |list>   -- take a look at the data
friends |Alex> => |Jason> + |Ed> + |Mary> + |Liz> + |Beth> + |James> + |nathan>
friends |Bill> => |Jason> + |Beth> + |lena> + |John> + |nathan>
friends |Harry> => |charlie> + |bella> + |sam> + |smithie> + |david> + |nathan>
sa: intersection(friends |Alex>, friends |Bill>, friends |Harry>)   -- the strict version.
|nathan>
sa: friends "" |list>   -- take a look at the data
2.000|Jason> + |Ed> + |Mary> + |Liz> + 2.000|Beth> + |James> + 3.000|nathan> + |lena> + |John> + |charlie> + |bella> + |sam> + |smithie> + |david>
sa: drop-below[3] friends "" |list>   -- replicate the (strict) intersection result
3.000|nathan>
sa: drop-below[2] friends "" |list>   -- softer "intersection" result
2.000|Jason> + 2.000|Beth> + 3.000|nathan>
-- and if the non-one coeffs bug you, simply:
sa: clean drop-below[3] friends "" |list>
|nathan>
sa: clean drop-below[2] friends "" |list>
|Jason> + |Beth> + |nathan>
So I guess the general idea is you have a largish training set (I have no exact number in mind at the moment).
foo |signal> => drop-below[t] (foo |example 1> + foo |example 2> + foo |example 3> + ... + foo |example n>)
OK. Here is a worked example:
sa: load H-I-pat-rec.sw
sa: |list> => |letter: H> + |noisy: H> + |noisy: H2>
sa: print-pixels[pixels] |letter: H>
I: 5 J: 7
1 1 1 1 1 1 11111 1 1 1 1 1 1
|pixels>
sa: print-pixels[pixels] |noisy: H>
I: 5 J: 7
1 1 1 1 1 111 1 1 1 1 1 1
|pixels>
sa: print-pixels[pixels] |noisy: H2>
I: 5 J: 7
1 1 1 1 111 11111 11 1 1 1 111 1
|pixels>
sa: dim-1 |merged H> => |dimension: 5>   -- we need to define the dimensions, else print-pixels has no idea of the size.
sa: dim-2 |merged H> => |dimension: 7>
sa: pixels |merged H> => pixels "" |list>   -- define the merged H pattern
sa: print-pixels[pixels] |merged H>
I: 5 J: 7
2 3 3 2 3 113 33323 31 2 3 3 311 3   -- this pattern can be considered signal + noise.
|pixels>
sa: pixels |merged H> => common[pixels] "" |list>   -- the intersection version (note, common is an alias for intersection)
sa: print-pixels[pixels]
I: 5 J: 7
1 1 1 1 111 1 1 1 1 1 1
|pixels>
sa: pixels |merged H> => drop-below[3] pixels "" |list>   -- the addition + strict threshold filter version
sa: print-pixels[pixels]
I: 5 J: 7
3 3 3 3 333 3 3 3 3 3 3
|pixels>
sa: pixels |merged H> => drop-below[2] pixels "" |list>   -- the addition + softer threshold filter version
sa: print-pixels[pixels]
I: 5 J: 7
2 3 3 2 3 3 33323 3 2 3 3 3 3   -- heh. And our H pattern emerges out of the noise!
|pixels>
sa: pixels |merged H> => clean drop-below[2] pixels "" |list>   -- clean the coeffs back to 1.
sa: print-pixels[pixels]
I: 5 J: 7
1 1 1 1 1 1 11111 1 1 1 1 1 1   -- and there is our nice clean H pattern!
|pixels>

where I used a typing short-cut when I used "" |list>. eg:
pixels |merged H> => pixels "" |list>
pixels |merged H> => common[pixels] "" |list>
pixels |merged H> => drop-below[3] pixels "" |list>
pixels |merged H> => drop-below[2] pixels "" |list>
pixels |merged H> => clean drop-below[2] pixels "" |list>
is identical to:
pixels |merged H> => pixels |letter: H> + pixels |noisy: H> + pixels |noisy: H2>
pixels |merged H> => intersection( pixels|letter: H> + pixels|noisy: H> + pixels|noisy: H2>)
pixels |merged H> => drop-below[3] (pixels |letter: H> + pixels |noisy: H> + pixels |noisy: H2>)
pixels |merged H> => drop-below[2] (pixels |letter: H> + pixels |noisy: H> + pixels |noisy: H2>)
pixels |merged H> => clean drop-below[2] (pixels |letter: H> + pixels |noisy: H> + pixels |noisy: H2>)
So I consider that a nice demonstration/proof-of-concept of my learning idea.
well-behaved means similar objects return similar superpositions (this is the hard bit to achieve, but hopefully not impossible)
deterministic means if you feed in the same object, you get essentially the same superposition. There is some leeway, in that it doesn't have to be 100% identical on each run, but close.
distinctive means different object types have easily distinguishable superpositions (again, this is on the hard side)
|A> => |a1> + |a2> + |a3> + |ab> + |ac> + |abc>
|B> => |b1> + |b2> + |b3> + |ab> + |bc> + |abc>
|C> => |c1> + |c2> + |c3> + |ac> + |bc> + |abc>
NB: any superposition with coeffs in {0,1} can be considered a standard maths set.
A = {a1,a2,a3,ab,ac,abc}
B = {b1,b2,b3,ab,bc,abc}
C = {c1,c2,c3,ac,bc,abc}
I'm not sure if it goes the other way: can all sets be represented by superpositions?
S = {{a1,a2,a3},{b1,b2},c1,{d1},{e1,e2,e3,e4},f1}
OK. How about this:
|S> => |a> + |b> + |c1> + |d> + |e> + |f1>
|a> => |a1> + |a2> + |a3>
|b> => |b1> + |b2>
|d> => |d1>
|e> => |e1> + |e2> + |e3> + |e4>
Now, let's give some examples to show the correspondence between intersection and addition:
sa: intersection(""|A>,""|B>) |ab> + |abc> sa: ""|A> + ""|B> |a1> + |a2> + |a3> + 2.000|ab> + |ac> + 2.000|abc> + |b1> + |b2> + |b3> + |bc> sa: drop-below[2] (""|A> + ""|B>) 2.000|ab> + 2.000|abc> sa: clean drop-below[2] (""|A> + ""|B>) -- as promised, this is the same as an intersection. |ab> + |abc> -- though I think there are cases where it will give a different answer from intersection. -- one example is when coeffs of A, B and C are not in {0,1}Here are the other two:
sa: intersection(""|A>,""|C>) |ac> + |abc> sa: ""|A> + ""|C> |a1> + |a2> + |a3> + |ab> + 2.000|ac> + 2.000|abc> + |c1> + |c2> + |c3> + |bc> sa: clean drop-below[2] (""|A> + ""|C>) |ac> + |abc> sa: intersection(""|B>,""|C>) |bc> + |abc> sa: ""|B> + ""|C> |b1> + |b2> + |b3> + |ab> + 2.000|bc> + 2.000|abc> + |c1> + |c2> + |c3> + |ac> sa: clean drop-below[2] (""|B> + ""|C>) |bc> + |abc>Now, the final example, intersection between A, B and C.
sa: intersection(""|A>,""|B>,""|C>) |abc> sa: ""|A> + ""|B> + ""|C> -- it's cool to note the coeffs here match the numbers in the Venn diagram above. |a1> + |a2> + |a3> + 2.000|ab> + 2.000|ac> + 3.000|abc> + |b1> + |b2> + |b3> + 2.000|bc> + |c1> + |c2> + |c3> sa: clean drop-below[2] (""|A> + ""|B> + ""|C>) |ab> + |ac> + |abc> + |bc> sa: clean drop-below[3] (""|A> + ""|B> + ""|C>) |abc>And I guess an observation. If you add a bunch of sets using superposition notation the coeffs represent the number of sets that object is in.
sa: clean (""|A> + ""|B> + ""|C>) |a1> + |a2> + |a3> + |ab> + |ac> + |abc> + |b1> + |b2> + |b3> + |bc> + |c1> + |c2> + |c3>
S = {{a1,a2,a3},{b1,b2},c1,{d1},{e1,e2,e3,e4},f1}

set |S> => |a> + |b> + |c1> + |d> + |e> + |f1>
set |a> => |a1> + |a2> + |a3>
set |b> => |b1> + |b2>
set |c1> => |c1>
set |d> => |d1>
set |e> => |e1> + |e2> + |e3> + |e4>
set |f1> => |f1>

21/8/2014 update: And we can use exp-max on this data.
sa: load nested-set.sw
sa: exp-max[set] |S>
|S> + |a> + |b> + 3.000|c1> + |d> + |e> + 3.000|f1> + |a1> + |a2> + |a3> + |b1> + |b2> + |d1> + |e1> + |e2> + |e3> + |e4>
sa: exp-max[set] |a>
|a> + |a1> + |a2> + |a3>
|x> in op |object> such that <y|op-sequence|x> >= 0.7
Then, if you want to map the resulting |x> to something (making use of the linearity of operators):
another-op (|x> in op |object> such that <y|op-sequence|x> >= 0.7)
Then use it in a learn-rule:
|some answer> => yet-another-op another-op (|x> in op |object> such that <y|op-sequence|x> >= 0.7)
Anyway, something along those lines.
Maybe the brackets are not necessary. eg:
|some answer> => yet-another-op another-op |x> in op |object> such that <y|op-sequence|x> >= 0.7
We can have more than one condition. eg:
|answer> => some-op |x> in op |object> such that <y|op-sequence|x> >= 0.7 or <z|op-sequence-2|x> == 0.3 and op-foo |x> == 2
kevin-bacon-0 |result> => actors movies |actor: Kevin Bacon>   -- set of actors that share a movie with Kevin.
kevin-bacon-1 |result> => actors movies actors movies |actor: Kevin Bacon>   -- set of actors one step removed from Kevin.
kevin-bacon-2 |result> => actors movies actors movies actors movies |actor: Kevin Bacon>   -- set of actors two steps removed.
kevin-bacon-3 |result> => actors movies actors movies actors movies actors movies |actor: Kevin Bacon>   -- three steps removed
BTW, the coeffs of the actor kets contain some information too.
# similar to context.recall(op,label), but this works with files instead of all data in memory.
# Motivated by wanting to run the Kevin Bacon game on IMDB, but EC2 is too expensive to do it all in memory.
# So hopefully I can do it using this approach (I already have a copy of IMDB in sw format - took about a week on EC2)
#
# filename is the sw data/source file
# op is the operator label, a string
# label is the ket label, a string or a ket
#
# returns a superposition
def file_recall(filename, op, label):
    if type(label) == ket:
        coeff = label.value
        ket_label = label.label
    else:
        coeff = 1
        ket_label = label
    pattern = op + " |" + ket_label + "> => "
    n = len(pattern)
    print("pattern:", pattern)
    print("n:", n)
    with open(filename, 'r') as f:
        for line in f:
            if line.startswith(pattern):
                print("line:", line)
                return extract_literal_superposition(line[n:])[0].multiply(coeff)
    return ket("", 0)
Did some quick testing, with:
sw_file = "sw-examples/fred-sam-friends.sw" r = file_recall(sw_file,"friends","Fred") print("r:",r)Then again, using "Sam" instead. And quashed a small bug, and it works great. Now I can upscale it to Kevin Bacon size, using this code:
# a single layer of the Kevin Bacon game.
# one is a superposition (though should handle kets too)
# returns a superposition.
#
# the code does:
#   actors movies one-superposition
# eg:
#   actors movies |actor: Kevin Bacon>
#
def Kevin_Bacon_game(bacon_file, one):
    if type(one) == str:                    # make sure we have a superposition,
        one = superposition() + ket(one)    # even if fed a string or a ket
    elif type(one) == ket:                  # Hrmm... there has to be a neater way to write this mess!
        one = superposition() + one
#    one = one.apply_sigmoid(clean)         # optional, to clean the coeffs from our incoming sp

    sp1 = superposition()
    for x in one.data:
        sp1 += file_recall(bacon_file, "movies", x)

    sp2 = superposition()
    for x in sp1.data:
        sp2 += file_recall(bacon_file, "actors", x)

    print("len:", len(sp2))
    return sp2.coeff_sort()
OK. Then put it to use:
# this is the full game we are trying to replicate:
# kevin-bacon-0 |result> => actors movies |actor: Kevin Bacon>                                           -- set of actors that share a movie with Kevin.
# kevin-bacon-1 |result> => actors movies actors movies |actor: Kevin Bacon>                             -- set of actors one step removed from Kevin.
# kevin-bacon-2 |result> => actors movies actors movies actors movies |actor: Kevin Bacon>               -- set of actors two steps removed.
# kevin-bacon-3 |result> => actors movies actors movies actors movies actors movies |actor: Kevin Bacon> -- three steps removed
# ...
sw_bacon_file = "sw-examples/just-movies-imdb.sw"   # our imdb data
r = ket("actor: Kevin (I) Bacon")                   # NB: we can choose any actor we like! We have the whole damn imdb to choose from!
N = 4                                               # How deep we want to go. For now 4, but maybe 10 or bigger later!
for k in range(N):
    r = Kevin_Bacon_game(sw_bacon_file, r)
    C.learn("kevin-bacon-" + str(k), "result", r)

name = "sw-examples/kevin-bacon.sw"                 # save the results.
save_sw(C, name)
OK. I improved the code (hopefully faster, or at least less RAM), so now it writes results as it goes, instead of storing it all in RAM (not that I know anything about flushing and what-not).
# let's write a version that writes to disk as it goes.
sw_bacon_file = "sw-examples/just-movies-imdb.sw"         # our imdb data
sw_dest_file = "sw-examples/fast-write--kevin-bacon.sw"   # where we are going to save the results
dest = open(sw_dest_file, 'w')

# fake the context header:
dest.write("----------------------------------------\n")
dest.write("|context> => |context: Kevin Bacon game>\n\n")   # can't be bothered to fake the supported-ops line.

r = ket("actor: Kevin (I) Bacon")   # NB: we can choose any actor we like! We have the whole damn imdb to choose from!
N = 10                              # How deep we want to go. For now 4, but maybe 10 or bigger later!
for k in range(N):
    r = Kevin_Bacon_game(sw_bacon_file, r)
    dest.write("kevin-bacon-" + str(k) + " |result> => " + r.display(True) + "\n")   # r.display(True) for an exact dump, not the str(sp) version.

dest.write("----------------------------------------\n")
dest.close()
Anyway, here are the kevin-bacon-0 results.
kevin-bacon-1 |result> => actors movies actors movies |actor: Kevin Bacon>
I have since found out that others call this kevin-bacon-2, and kevin-bacon-0 == |actor: Kevin Bacon>.
n = <actor: Y|actors movies|actor: Kevin Bacon>
and more generally:
m = <actor: Y|actors movies|actor: X>
n is the number of pathways between Kevin Bacon and actor Y (also, of course, the number of movies they have shared).
m is the number of pathways between actor X and actor Y.
Not yet sure of the meaning of the more general:
d1(X,Y,k) = <actor: Y|[actors movies]^k |actor: X>
or:
d2(X,Y,k) = <actor: Y|[actors movies clean]^k |actor: X>
Also NB, we see here a clear example showing that left association of operators is meaningless in this scheme (as we already noted way below).
(<actor: Y|actors)
clearly can't make sense, since the "actors" op is not defined when applied to an actor. In contrast:
(movies |actor: Y>)
clearly does make sense.
d(X,Y,op,k) = <X|op^k|Y>
where:
  X,Y can be pretty much anything
  op is just some operator, and can be compound, eg op = [actors movies], cf. just above
  d() is symmetrical in X,Y only if op is "well defined" (which I need to define at some point :)
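A toy sketch of the pathway counting (hypothetical data, plain dicts rather than the project's classes). One application of the compound op [actors movies] counts shared movies; applying it k times counts the length-k pathways:

from collections import Counter

movies = {"X": ["m1", "m2"], "Y": ["m2", "m3"], "Z": ["m3"]}   # actor -> movies
actors = {"m1": ["X"], "m2": ["X", "Y"], "m3": ["Y", "Z"]}     # movie -> actors

def actors_movies(sp):
    # apply [actors movies] to a superposition (here a Counter of actors):
    out = Counter()
    for a, c in sp.items():
        for m in movies.get(a, []):
            for b in actors.get(m, []):
                out[b] += c
    return out

sp = actors_movies(Counter({"X": 1}))   # one application, ie k = 1
print(sp["Y"])                          # 1 -- X and Y share exactly one movie (m2)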
$ ./minimalist_find_common_ma.py "Sandra Bullock" "Keanu Reeves"
common movies for: Sandra Bullock Keanu Reeves
number of common movies: 6
common movies: movie: Inside 'Speed' (2002), movie: Cómo conseguir un papel en Hollywood (2007), movie: The Lake House (2006), movie: The Making of 'Speed' (1994), movie: Speed (1994), movie: Twentieth Century Fox: The Blockbuster Years (2000)

$ ./minimalist_find_common_ma.py "Star Trek Into Darkness (2013)" "Paul (2011)"
common actors for: Star Trek Into Darkness (2013) Paul (2011)
number of common actors: 2
common actors: actor: Simon Pegg, actor: Bill Hader

$ ./minimalist_find_common_ma.py "Star Trek Into Darkness (2013)" "12 Years a Slave (2013)"
common actors for: Star Trek Into Darkness (2013) 12 Years a Slave (2013)
number of common actors: 1
common actors: actor: Benedict Cumberbatch
Note, BTW, that the code automatically works out whether you have given it 2 actors or 2 movies (though at the cost of a longer run-time).
def file_recall(filename, op, label):
    pattern = op + " |" + label + "> => "
    n = len(pattern)
    with open(filename, 'r') as f:
        for line in f:
            if line.startswith(pattern):
                line = line[n:]
#                return line[1:-1].split("> + |")   # NB how easy it is to parse a well defined literal superposition with all coeffs of 1.
                # put in a tweak to filter out "Awards" shows:
                # Much cleaner than the extract_literal_superposition() code!
                # but some TV shows remain. The fix is to filter out (TV) and (V) when parsing the original text version of IMDB.
                return [x for x in line[1:-1].split("> + |") if "Awards" not in x]
    return []
imdb-votes |movie: The Matrix (1999)> => |votes: 896239>
imdb-votes-self |movie: The Matrix (1999)> => 896239|movie: The Matrix (1999)>
imdb-rating |movie: The Matrix (1999)> => |rating: 8.7>
imdb-rating-self |movie: The Matrix (1999)> => 8.7|movie: The Matrix (1999)>
Anyway, the plan is to do things like:
imdb-rating movies |actor: Fred>
average-movie-rating |actor: Fred> => arithmetic(count-sum imdb-rating-self movies |_self>,|/>,count imdb-rating-self movies |_self>)
Once we have that working, generate the full set:
average-movie-rating |actor: *> #=> arithmetic(count-sum imdb-rating-self movies |_self>,|/>,count imdb-rating-self movies |_self>)
map[average-movie-rating] "" |actor: _list>
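And for reference, the same average in plain Python (toy ratings dict; names hypothetical):

movies = {"actor: Fred": ["movie: x", "movie: y"]}   # toy data
rating = {"movie: x": 8.7, "movie: y": 6.3}

def average_movie_rating(actor):
    ratings = [rating[m] for m in movies[actor] if m in rating]
    return sum(ratings) / len(ratings)

print(average_movie_rating("actor: Fred"))   # 7.5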
$ ./find_average_movie_rating.py "Kevin (I) Bacon"
actor: Kevin (I) Bacon
number of movies: 74
ratings: 8.1 movie: A Little Vicious (1991) 8.1 movie: Skum Rocks! (2013) 8.0 movie: JFK (1991) 8.0 movie: Mystic River (2003) 7.9 movie: Saving Angelo (2007) 7.8 movie: Sundance Skippy (2010) 7.8 movie: X-Men: First Class (2011) 7.7 movie: Frost/Nixon (2008) 7.6 movie: A Few Good Men (1992) 7.6 movie: Animal House (1978) 7.6 movie: Apollo 13 (1995) 7.6 movie: Freedom Downtime (2001) 7.6 movie: Planes, Trains & Automobiles (1987) 7.5 movie: Crazy, Stupid, Love. (2011) 7.5 movie: Sleepers (1996) 7.4 movie: Beyond All Boundaries (2009) 7.3 movie: Going to Pieces: The Rise and Fall of the Slasher Film (2006) 7.3 movie: Murder in the First (1995) 7.3 movie: The Woodsman (2004) 7.2 movie: Diner (1982) 7.2 movie: Tremors (1990) 7.1 movie: Vanilla Ice Archive (2012) 7.0 movie: Balto (1995) 7.0 movie: My Dog Skip (2000) 7.0 movie: Stir of Echoes (1999) 7.0 movie: The Air I Breathe (2007) 6.9 movie: A Look Behind the Scenes: Super (2011) 6.9 movie: Digging to China (1997) 6.9 movie: Natural Disasters: Forces of Nature (2004) 6.8 movie: Death Sentence (2007) 6.8 movie: Eastwood Directs: The Untold Story (2013) 6.8 movie: Lemon Sky (1988) 6.8 movie: Rails & Ties (2007) 6.8 movie: Super (2010/I) 6.8 movie: We Married Margo (2000) 6.6 movie: My One and Only (2009) 6.6 movie: Starting Over (1979) 6.6 movie: Where the Truth Lies (2005) 6.5 movie: Flatliners (1990) 6.5 movie: Friday the 13th (1980) 6.5 movie: Only When I Laugh (1981) 6.5 movie: Wild Things (1998) 6.4 movie: Boffo! Tinseltown's Bombs and Blockbusters (2006) 6.4 movie: Footloose (1984) 6.3 movie: Jayne Mansfield's Car (2012) 6.3 movie: Telling Lies in America (1997) 6.3 movie: The Big Picture (1989) 6.3 movie: The River Wild (1994) 6.2 movie: Hero at Large (1980) 6.2 movie: Trapped (2002) 6.1 movie: New York Skyride (1994) 6.1 movie: White Water Summer (1987) 5.9 movie: Queens Logic (1991) 5.8 movie: Criminal Law (1988) 5.8 movie: Novocaine (2001) 5.8 movie: She's Having a Baby (1988) 5.7 movie: End of the Line (1987) 5.7 movie: Forty Deuce (1982) 5.7 movie: Hollow Man (2000) 5.6 movie: Cavedweller (2004) 5.6 movie: R.I.P.D. (2013) 5.5 movie: He Said, She Said (1991) 5.5 movie: Loverboy (2005) 5.5 movie: Quicksilver (1986) 5.4 movie: Beauty Shop (2005) 5.4 movie: Enormous Changes at the Last Minute (1983) 5.4 movie: Imagine New York (2003) 5.4 movie: Picture Perfect (1997) 5.4 movie: The Air Up There (1994) 5.3 movie: In the Cut (2003) 5.3 movie: These Vagabond Shoes (2009) 5.2 movie: Film Trix 2004 (2004) 5.1 movie: Elephant White (2011) 4.7 movie: Pyrates (1991)
average movie rating: 6.56

$ ./find_average_movie_rating.py "Brad Pitt"
actor: Brad Pitt
number of movies: 61
ratings: 8.9 movie: Fight Club (1999) 8.7 movie: Se7en (1995) 8.3 movie: Inglourious Basterds (2009) 8.3 movie: Snatch. (2000) 8.2 movie: 12 Years a Slave (2013) 8.1 movie: Exit Through the Gift Shop (2010) 8.1 movie: Twelve Monkeys (1995) 8.0 movie: True Romance (1993) 7.8 movie: Being John Malkovich (1999) 7.8 movie: Ocean's Eleven (2001) 7.8 movie: The Curious Case of Benjamin Button (2008) 7.6 movie: Interview with the Vampire: The Vampire Chronicles (1994) 7.6 movie: Moneyball (2011) 7.6 movie: The Assassination of Jesse James by the Coward Robert Ford (2007) 7.5 movie: Babel (2006) 7.5 movie: Legends of the Fall (1994) 7.5 movie: Sleepers (1996) 7.5 movie: The Big Uneasy (2010) 7.4 movie: Beyond All Boundaries (2009) 7.4 movie: Special Thanks to Roy London (2005) 7.4 movie: Thelma & Louise (1991) 7.4 movie: Touch of Evil (2011) 7.3 movie: A River Runs Through It (1992) 7.3 movie: Megamind (2010) 7.2 movie: Troy (2004) 7.1 movie: Confessions of a Dangerous Mind (2002) 7.1 movie: Meet Joe Black (1998) 7.1 movie: World War Z (2013) 7.0 movie: Burn After Reading (2008) 7.0 movie: Seven Years in Tibet (1997) 7.0 movie: Smash His Camera (2010) 7.0 movie: Spy Game (2001) 6.9 movie: Ocean's Thirteen (2007) 6.7 movie: Don't Tell My Booker!!! (2007) 6.7 movie: Kalifornia (1993) 6.7 movie: The Tree of Life (2011) 6.6 movie: Bad Boy Kummer (2010) 6.6 movie: Sinbad: Legend of the Seven Seas (2003) 6.5 movie: Mr. & Mrs. Smith (2005) 6.4 movie: Boffo! Tinseltown's Bombs and Blockbusters (2006) 6.4 movie: Ocean's Twelve (2004) 6.3 movie: Contact (1992) 6.3 movie: Less Than Zero (1987) 6.2 movie: Killing Them Softly (2012) 6.1 movie: Los Angeles (2005) 6.1 movie: The Mexican (2001) 6.0 movie: Happy Together (1989) 6.0 movie: No Man's Land (1987) 6.0 movie: The Devil's Own (1997) 5.9 movie: Happy Feet Two (2011) 5.8 movie: Johnny Suede (1991) 5.7 movie: Across the Tracks (1990) 5.4 movie: The Counselor (2013) 5.4 movie: The Dark Side of the Sun (1988) 5.2 movie: The Favor (1994) 4.8 movie: Full Frontal (2002) 4.7 movie: Brad Pitt Video Portrait (2006) 4.7 movie: Cool World (1992) 4.6 movie: Abby Singer (2003) 4.4 movie: Hunk (1987) 4.1 movie: Cutting Class (1989)
average movie rating: 6.77

$ ./find_average_movie_rating.py "Angelina Jolie"
actor: Angelina Jolie
number of movies: 47
ratings: 8.1 movie: Exit Through the Gift Shop (2010) 7.8 movie: Changeling (2008) 7.6 movie: Kung Fu Panda (2008) 7.3 movie: Girl, Interrupted (1999) 7.3 movie: Kung Fu Panda 2 (2011) 7.3 movie: Maleficent (2014) 7.3 movie: The Day After Peace (2008) 7.2 movie: Playing by Heart (1998) 7.2 movie: The International Criminal Court (2013) 7.1 movie: Jane's Journey (2010) 7.0 movie: Smash His Camera (2010) 6.7 movie: A Mighty Heart (2007) 6.7 movie: Don't Tell My Booker!!! (2007) 6.7 movie: The Good Shepherd (2006) 6.7 movie: Wanted (2008) 6.6 movie: The Bone Collector (1999) 6.5 movie: Mr. & Mrs. Smith (2005) 6.4 movie: Beyond Borders (2003) 6.4 movie: Gone in Sixty Seconds (2000) 6.4 movie: Salt (2010) 6.4 movie: Top Priority: The Terror Within (2012) 6.3 movie: Beowulf (2007) 6.2 movie: Hackers (1995) 6.2 movie: Valencia: The Movie/S (2013) 6.1 movie: A Place in Time (2007) 6.1 movie: Foxfire (1996) 6.1 movie: Sky Captain and the World of Tomorrow (2004) 6.1 movie: Taking Lives (2004) 6.1 movie: The Fever (2004) 6.0 movie: Original Sin (2001) 6.0 movie: Pushing Tin (1999) 6.0 movie: Shark Tale (2004) 6.0 movie: The Tourist (2010) 5.8 movie: Life or Something Like It (2002) 5.7 movie: Lara Croft: Tomb Raider (2001) 5.6 movie: Playing God (1997) 5.5 movie: Alexander (2004) 5.4 movie: Lara Croft Tomb Raider: The Cradle of Life (2003) 5.2 movie: Lookin' to Get Out (1982) 5.2 movie: Love Is All There Is (1996) 5.2 movie: Mojave Moon (1996) 5.0 movie: Alice & Viril (1993) 5.0 movie: Trading Women (2003) 4.9 movie: Angela & Viril (1993) 4.9 movie: Hell's Kitchen (1998) 4.3 movie: Without Evidence (1995) 2.8 movie: Sledge: The Untold Story (2005)
average movie rating: 6.18

$ ./find_average_movie_rating.py "Tom Cruise"
actor: Tom Cruise
number of movies: 50
ratings: 8.1 movie: A Tribute to J.J. Abrams (2013) 8.1 movie: Edge of Tomorrow (2014) 8.0 movie: Magnolia (1999) 8.0 movie: Rain Man (1988) 8.0 movie: Stanley Kubrick: A Life in Pictures (2001) 7.8 movie: Religulous (2008) 7.7 movie: Minority Report (2002) 7.7 movie: The Last Samurai (2003) 7.6 movie: A Few Good Men (1992) 7.6 movie: Collateral (2004) 7.6 movie: Interview with the Vampire: The Vampire Chronicles (1994) 7.5 movie: Mission: Impossible Ghost Protocol Special Feature - Soaring in Dubai (2011) 7.5 movie: Space Station 3D (2002) 7.4 movie: Mission: Impossible - Ghost Protocol (2011) 7.4 movie: The Queen (2006) 7.3 movie: Der Geist des Geldes (2007) 7.3 movie: Eyes Wide Shut (1999) 7.3 movie: Jerry Maguire (1996) 7.2 movie: Born on the Fourth of July (1989) 7.2 movie: The Outsiders (1983) 7.1 movie: Valkyrie (2008) 7.0 movie: Jack Reacher (2012) 7.0 movie: Mission: Impossible (1996) 7.0 movie: Oblivion (2013/I) 7.0 movie: The Color of Money (1986) 7.0 movie: Tropic Thunder (2008) 6.9 movie: Vanilla Sky (2001) 6.8 movie: Mission: Impossible III (2006) 6.8 movie: Risky Business (1983) 6.8 movie: Sex, Drugs & Religion (2010) 6.8 movie: The Firm (1993) 6.8 movie: Top Gun (1986) 6.7 movie: Don't Tell My Booker!!! (2007) 6.7 movie: Taps (1981) 6.5 movie: Far and Away (1992) 6.5 movie: War of the Worlds (2005) 6.4 movie: Boffo! Tinseltown's Bombs and Blockbusters (2006) 6.4 movie: Legend (1985) 6.3 movie: Knight and Day (2010) 6.2 movie: Austin Powers in Goldmember (2002) 6.2 movie: Lions for Lambs (2007) 6.0 movie: Mission: Impossible II (2000) 5.9 movie: All the Right Moves (1983) 5.9 movie: Rock of Ages (2012) 5.8 movie: Days of Thunder (1990) 5.7 movie: Cocktail (1988) 5.4 movie: August (2008) 4.8 movie: Losin' It (1983) 4.6 movie: Endless Love (1981) 4.1 movie: Junket Whore (1998)
average movie rating: 6.83

$ ./find_average_movie_rating.py "Alyssa Milano"
actor: Alyssa Milano
number of movies: 27
ratings: 7.3 movie: Life After Tomorrow (2006) 7.1 movie: The Blue Hour (2007) 6.9 movie: Where the Day Takes You (1991) 6.7 movie: 10 Minutes (2010) 6.6 movie: Commando (1985) 6.5 movie: Jimmy Zip (1996) 6.5 movie: Old Enough (1984) 6.2 movie: Fear (1996) 6.0 movie: Pathology (2008) 5.9 movie: Buying the Cow (2002) 5.9 movie: Hall Pass (2011) 5.8 movie: Little Sister (1992) 5.8 movie: My Girlfriend's Boyfriend (2010) 5.6 movie: Dickie Roberts: Former Child Star (2003) 5.6 movie: Glory Daze (1995) 5.6 movie: Kiss the Bride (2002) 5.6 movie: New Year's Eve (2011) 5.6 movie: Rockin' the Corps: An American Thank You (2005) 5.2 movie: Dinotopia: Quest for the Ruby Sunstone (2005) 5.1 movie: Hugo Pool (1997) 5.0 movie: Below Utopia (1997) 4.8 movie: Deadly Sins (1995) 4.3 movie: Embrace of the Vampire (1995) 4.3 movie: Poison Ivy II (1996) 4.2 movie: Speed Zone (1989) 3.9 movie: Conflict of Interest (1993) 3.5 movie: Double Dragon (1994)
average movie rating: 5.61
favourite-movies |Fred> => 6 |movie: a> + 9 |movie: b> + 8 |movie: c> + 10 |movie: d> + 6.5 |movie: e>
-- where the coeffs represent how strongly Fred liked those movies.
-- Indeed, this is a case where negative coeffs would be useful.
-- eg, in my case, -20 for musicals!
-- -20 |feature: musical>

-- a db of features for movies, with the respective strengths as coeffs.
features |movie: a> => 5 |feature: 9> + 14 |feature: 3> + 9 |feature: 1> + 2 |feature: 20>
features |movie: b> => 13 |feature: 24> + |feature: 27> + 6 |feature: 42> + 14 |feature: 23> + 6 |feature: 22>
features |movie: c> => 3 |feature: 4> + 7 |feature: 44> + 2 |feature: 28>
features |movie: d> => 11 |feature: 10> + 4 |feature: 43> + 4 |feature: 4> + 13 |feature: 23> + 12 |feature: 21> + 2 |feature: 1>
features |movie: e> => 1 |feature: 47> + 6 |feature: 11> + 13 |feature: 2> + 11 |feature: 17> + 7 |feature: 2>

features |Fred's favourite movie features> => features favourite-movies |Fred>
|Fred's suggested movies: 1> => 100 similar[features] |Fred's favourite movie features>
OK. If we load that up into the console we get:
|Fred's suggested movies: 1> => 48.279|movie: d> + 36.485|movie: b> + 18.392|movie: e> + 14.892|movie: a> + 10.127|movie: c>
OK. It sort of works, but a movie Fred rated 8/10 ended up with the lowest score, so we need to tweak it.
favourite-movies |Fred> => 6 |movie: a> + 9 |movie: b> + 8 |movie: c> + 10 |movie: d> + 6.5 |movie: e>

normed-features |movie: a> => normalize( 5 |feature: 9> + 14 |feature: 3> + 9 |feature: 1> + 2 |feature: 20>)
normed-features |movie: b> => normalize( 13 |feature: 24> + |feature: 27> + 6 |feature: 42> + 14 |feature: 23> + 6 |feature: 22>)
normed-features |movie: c> => normalize( 3 |feature: 4> + 7 |feature: 44> + 2 |feature: 28>)
normed-features |movie: d> => normalize( 11 |feature: 10> + 4 |feature: 43> + 4 |feature: 4> + 13 |feature: 23> + 12 |feature: 21> + 2 |feature: 1>)
normed-features |movie: e> => normalize( 1 |feature: 47> + 6 |feature: 11> + 13 |feature: 2> + 11 |feature: 17> + 7 |feature: 2> )

normed-features |normed Fred's favourite movie features> => normed-features favourite-movies |Fred>
|Fred's suggested movies: 2> => 100 similar[normed-features] |normed Fred's favourite movie features>
Then load it up in the console, and we get:
|Fred's suggested movies: 2> => 41.602|movie: d> + 29.939|movie: b> + 22.455|movie: c> + 16.456|movie: e> + 16.291|movie: a>
Which looks like we are on the right track at least, since it has the same ordering as his favourite movies:
-- sorted by coeff:
favourite-movies |Fred> => 10 |movie: d> + 9 |movie: b> + 8 |movie: c> + 6.5 |movie: e> + 6 |movie: a>
So the hard bit now is to build up a database mapping movies to a list of features, and their respective strengths.
normed-features |movie: a> => normalize( 5 |feature: 9> + 14 |feature: 3> + 9 |feature: 1> + 2 |feature: 20>)
normed-features |movie: b> => normalize( 13 |feature: 24> + |feature: 27> + 6 |feature: 42> + 14 |feature: 23> + 6 |feature: 22>)
normed-features |movie: c> => normalize( 3 |feature: 4> + 7 |feature: 44> + 2 |feature: 28>)
normed-features |movie: d> => normalize( 11 |feature: 10> + 4 |feature: 43> + 4 |feature: 4> + 13 |feature: 23> + 12 |feature: 21> + 2 |feature: 1>)
normed-features |movie: e> => normalize( 1 |feature: 47> + 6 |feature: 11> + 13 |feature: 2> + 11 |feature: 17> + 7 |feature: 2> )
normed-features |movie: f> => normalize( ...)
normed-features |movie: g> => normalize(...)
etc.
Then for each user, we get them to rate their favourite movies.
favourite-movies |Fred> => 6 |movie: a> + 9 |movie: b> + 8 |movie: c> + 10 |movie: d> + 6.5 |movie: e>
favourite-movies |Sam> => 9 |movie: a> + 5 |movie: b> + 6 |movie: c> + 10 |movie: g>
etc.
Then for each user we do a little processing:
normed-features |normed Fred's favourite movie features> => normed-features favourite-movies |Fred>
normed-features |normed Sam's favourite movie features> => normed-features favourite-movies |Sam>
etc.
Then we look up the data and try to find the best match:
|Fred's suggested movies> => 100 similar[normed-features] |normed Fred's favourite movie features>
|Sam's suggested movies> => 100 similar[normed-features] |normed Sam's favourite movie features>
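Here is the whole pipeline sketched end-to-end in plain Python (toy data, two movies only; simm here is the overlap of normalized distributions, as defined earlier):

def normalize(d):
    s = sum(d.values())
    return {k: v / s for k, v in d.items()}

def simm(f, g):   # overlap of two already-normalized distributions
    return sum(min(f[k], g[k]) for k in f.keys() & g.keys())

features = {
    "movie: a": {"f9": 5, "f3": 14, "f1": 9, "f20": 2},
    "movie: c": {"f4": 3, "f44": 7, "f28": 2},
}
favourite = {"movie: a": 6, "movie: c": 8}   # Fred's ratings

# normed-features favourite-movies |Fred>:
profile = {}
for movie, rating in favourite.items():
    for f, w in normalize(features[movie]).items():
        profile[f] = profile.get(f, 0) + rating * w

# 100 similar[normed-features] against every movie:
scores = {m: 100 * simm(normalize(profile), normalize(fs)) for m, fs in features.items()}
print(sorted(scores.items(), key=lambda kv: -kv[1]))   # the higher-rated movie: c scores higher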
evidence-1 |crime: a> => normalize( 9 |suspect: 1> + 2 |suspect: 2> + 15 |suspect: 3>)
evidence-2 |crime: a> => normalize( 0 |suspect: 1> + 5 |suspect: 2> + 4 |suspect: 3> + 12 |suspect: 4>)
evidence-3 |crime: a> => normalize( 7 |suspect: 1> + 13 |suspect: 2> + 25 |suspect: 4>)
evidence-4 |crime: a> => normalize( 2 |suspect: 1> + 4 |suspect: 3> + 6 |suspect: 6>)

|suspect> => evidence-1 |crime: a> + evidence-2 |crime: a> + evidence-3 |crime: a> + evidence-4 |crime: a>
Load this up into the console, and we have:
sa: load evidence-crime-vs-suspect.sw
sa: dump
----------------------------------------
|context> => |context: evidence of crime vs suspect>

evidence-1 |crime: a> => 0.346|suspect: 1> + 0.077|suspect: 2> + 0.577|suspect: 3>
evidence-2 |crime: a> => 0.000|suspect: 1> + 0.238|suspect: 2> + 0.190|suspect: 3> + 0.571|suspect: 4>
evidence-3 |crime: a> => 0.156|suspect: 1> + 0.289|suspect: 2> + 0.556|suspect: 4>
evidence-4 |crime: a> => 0.167|suspect: 1> + 0.333|suspect: 3> + 0.500|suspect: 6>

|suspect> => 0.668|suspect: 1> + 0.604|suspect: 2> + 1.101|suspect: 3> + 1.127|suspect: 4> + 0.500|suspect: 6>
----------------------------------------
sa: coeff-sort "" |suspect>
1.127|suspect: 4> + 1.101|suspect: 3> + 0.668|suspect: 1> + 0.604|suspect: 2> + 0.500|suspect: 6>
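The same evidence-combining step sketched with Counter (using the numbers from above), confirming the suspect ordering:

from collections import Counter

def normalize(d):
    s = sum(d.values())
    return Counter({k: v / s for k, v in d.items()})

evidence = [
    Counter({"suspect 1": 9, "suspect 2": 2, "suspect 3": 15}),
    Counter({"suspect 2": 5, "suspect 3": 4, "suspect 4": 12}),
    Counter({"suspect 1": 7, "suspect 2": 13, "suspect 4": 25}),
    Counter({"suspect 1": 2, "suspect 3": 4, "suspect 6": 6}),
]

suspect = sum((normalize(e) for e in evidence), Counter())
print(suspect.most_common())   # suspect 4 first, then suspect 3 -- matching coeff-sort above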
sa: load WP-word-frequencies.sw
sa: find-topic[words] |wikipedia>
18.539|WP: US presidents> + 18.539|WP: particle physics> + 16.479|WP: rivers> + 16.479|WP: physics> + 16.479|WP: country list> + 13.483|WP: Australia>

sa: find-topic[words] |adelaide>
74.576|WP: Adelaide> + 25.424|WP: Australia>

sa: find-topic[words] |sydney>
60.241|WP: Australia> + 39.759|WP: Adelaide>

sa: find-topic[words] |canberra>
100.000|WP: Australia>

sa: find-topic[words-2] |aami stadium>
100.000|WP: Adelaide>

sa: find-topic[words-2] |river torrens>
100.000|WP: Adelaide>

sa: find-topic[words] (|river> + |nile>)   -- an example of a superposition version of find-topic.
76.811|WP: rivers> + 13.788|WP: Adelaide> + 9.401|WP: Australia>

sa: find-topic[words-2] |river nile>       -- NB: the ket version gives a better answer in this case.
100.000|WP: rivers>
-- If you know the exact phrase, use the ket version.
-- If you don't, then use the superposition version, which adds up the results from the pieces.

sa: find-topic[words-2] |adelaide university>   -- here is an example of what I was just talking about.
|>
-- there is no exact match for "adelaide university"

sa: find-topic[words] (|adelaide> + |university>)
66.236|WP: Adelaide> + 33.764|WP: Australia>
-- at least this time, using the sp version, we got something of a result.

sa: find-topic[words-3] |university of adelaide>   -- ahh... we found an exact match this time.
76.923|WP: Adelaide> + 23.077|WP: Australia>
-- this also means the WP page for Australia contains the phrase "university of adelaide".
sa: find-topic[words] |physics>
54.237|WP: physics> + 45.763|WP: particle physics>

sa: find-topic[words-2] |particle physics>
60.000|WP: particle physics> + 40.000|WP: physics>
-- so we have an exact phrase match of "particle physics"

sa: find-topic[words] (|particle> + |physics>)
51.605|WP: particle physics> + 48.395|WP: physics>
-- we have a match with a "softer" phrase match too, of course.

sa: find-topic[words] |electron>
62.791|WP: particle physics> + 37.209|WP: physics>

sa: find-topic[words-2] |bill clinton>
100.000|WP: US presidents>

sa: find-topic[words-2] |george bush>   -- no match on the exact phrase,
|>                                      -- probably because of the need to disambiguate between father and son.

sa: find-topic[words] (|george> + |bush>)
67.705|WP: US presidents> + 22.363|WP: Australia> + 9.932|WP: Adelaide>
-- the softer match still gives good results.

sa: find-topic[words-2] |richard nixon>
100.000|WP: US presidents>

sa: find-topic[words-2] |thomas jefferson>
100.000|WP: US presidents>

sa: find-topic[words] |reagan>
100.000|WP: US presidents>

sa: find-topic[words-2] |united states>   -- heh. matched more than expected.
34.913|WP: rivers> + 24.938|WP: US presidents> + 13.965|WP: particle physics> + 13.092|WP: country list> + 13.092|WP: Australia>

sa: find-topic[words-2] |united kingdom>
56.000|WP: Australia> + 28.000|WP: country list> + 16.000|WP: US presidents>

sa: find-topic[words] |thailand>
66.667|WP: rivers> + 33.333|WP: country list>

sa: find-topic[words] |burma>
100.000|WP: country list>

sa: find-topic[words-2] |new zealand>   -- I'm getting the impression the WP page on countries does not mention the country very often.
66.667|WP: Australia> + 16.667|WP: Adelaide> + 16.667|WP: country list>
-- and hence the resulting low coeffs for that page,
-- since smaller frequencies for a term give smaller find-topic coeffs.

sa: find-topic[words] |japan>
53.598|WP: Australia> + 24.566|WP: particle physics> + 21.836|WP: country list>

sa: find-topic[words] |egypt>
66.667|WP: rivers> + 33.333|WP: country list>

sa: find-topic[words] |brazil>
85.714|WP: rivers> + 14.286|WP: country list>

-- now, an aside. The results for Egypt and Brazil are similar. ie, they are more known for having rivers than as countries.
-- let's check this in the console:
sa: |t1> => find-topic[words] |egypt>
sa: |t2> => find-topic[words] |brazil>
sa: 100 ket-simm(""|t1>, "" |t2>)
80.952|simm>
-- so yeah. Simm backs up that thought: 81% similarity in terms of their find-topic results.
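As an aside, the simm used in that last check is easy to sketch. This is just one plausible variant (sum of minimum coefficients, rescaled by the larger total), not a quote of the console's ket-simm implementation; on these inputs it gives roughly the 80.952 figure above, up to rounding of the coefficients:

def simm(f, g):
    # rescaled overlap between two superpositions, represented here as dicts
    overlap = sum(min(f.get(k, 0), g.get(k, 0)) for k in set(f) | set(g))
    return overlap / max(sum(f.values()), sum(g.values()))

t1 = {"WP: rivers": 66.667, "WP: country list": 33.333}   # find-topic[words] |egypt>
t2 = {"WP: rivers": 85.714, "WP: country list": 14.286}   # find-topic[words] |brazil>
print("%.3f|simm>" % (100 * simm(t1, t2)))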
OK. Now, let's make fuller use of the superposition version of find-topic:

sa: find-topic[words] (|australia> + |austria> + |brazil> + |chile> + |denmark> + |holland> + |germany> + |france> + |japan> + |italay> + |greece>)
39.901|WP: country list> + 24.711|WP: rivers> + 19.970|WP: Australia> + 7.646|WP: Adelaide> + 4.324|WP: particle physics> + 3.448|WP: physics>

-- and now again, without the "italay" typo:
sa: find-topic[words] (|australia> + |austria> + |brazil> + |chile> + |denmark> + |holland> + |germany> + |france> + |japan> + |italy> + |greece>)
38.242|WP: country list> + 24.433|WP: rivers> + 19.765|WP: Australia> + 10.494|WP: Adelaide> + 3.931|WP: particle physics> + 3.135|WP: physics>
-- heh. So it didn't change much.

sa: find-topic[words] (|adelaide> + |perth> + |sydney> + |melbourne> + |brisbane> + |hobart> + |darwin> + |canberra>)
68.536|WP: Australia> + 27.623|WP: Adelaide> + 3.841|WP: US presidents>
-- heh. This is a nice example of signal percolating upwards as you add more terms.
-- Let's make that more obvious. First, save me some effort by saving this as WP-post-processing.sw:

|t1> => find-topic[words] (|adelaide>)             -- NB: there is no |context> line here,
|t2> => find-topic[words] (|adelaide> + |perth>)   -- so results get merged into the current context.
|t3> => find-topic[words] (|adelaide> + |perth> + |sydney>)
|t4> => find-topic[words] (|adelaide> + |perth> + |sydney> + |melbourne>)
|t5> => find-topic[words] (|adelaide> + |perth> + |sydney> + |melbourne> + |brisbane>)
|t6> => find-topic[words] (|adelaide> + |perth> + |sydney> + |melbourne> + |brisbane> + |hobart>)
|t7> => find-topic[words] (|adelaide> + |perth> + |sydney> + |melbourne> + |brisbane> + |hobart> + |darwin>)
|t8> => find-topic[words] (|adelaide> + |perth> + |sydney> + |melbourne> + |brisbane> + |hobart> + |darwin> + |canberra>)
|list> => |t1> + |t2> + |t3> + |t4> + |t5> + |t6> + |t7> + |t8>

-- then load it up in the console:
sa: load WP-post-processing.sw
sa: dump "" |list>
|t1> => 74.576|WP: Adelaide> + 25.424|WP: Australia>
|t2> => 62.712|WP: Australia> + 37.288|WP: Adelaide>
|t3> => 61.888|WP: Australia> + 38.112|WP: Adelaide>
|t4> => 61.476|WP: Australia> + 38.524|WP: Adelaide>
|t5> => 60.720|WP: Australia> + 39.280|WP: Adelaide>
|t6> => 58.048|WP: Australia> + 36.831|WP: Adelaide> + 5.121|WP: US presidents>
|t7> => 64.042|WP: Australia> + 31.569|WP: Adelaide> + 4.389|WP: US presidents>
|t8> => 68.536|WP: Australia> + 27.623|WP: Adelaide> + 3.841|WP: US presidents>
-- so it sort of works (note the changes in the coeff of |WP: Australia>), but not quite as well as I hoped.
-- Let's try with US presidents. Again, save some effort by saving this as WP-post-processing-2.sw:

|s1> => find-topic[words-2] (|thomas jefferson>)
|s2> => find-topic[words-2] (|thomas jefferson> + |ronald regan>)
|s3> => find-topic[words-2] (|thomas jefferson> + |ronald regan> + |richard nixon>)
|s4> => find-topic[words-2] (|thomas jefferson> + |ronald regan> + |richard nixon> + |bill clinton>)
|s5> => find-topic[words-2] (|thomas jefferson> + |ronald regan> + |richard nixon> + |bill clinton> + |barack obama>)
|s6> => find-topic[words-2] (|thomas jefferson> + |ronald regan> + |richard nixon> + |bill clinton> + |barack obama> + |george washington>)
|s7> => find-topic[words-2] (|thomas jefferson> + |ronald regan> + |richard nixon> + |bill clinton> + |barack obama> + |george washington> + |james monroe>)
|s8> => find-topic[words-2] (|thomas jefferson> + |ronald regan> + |richard nixon> + |bill clinton> + |barack obama> + |george washington> + |james monroe> + |jimmy carter>)
|list> => |s1> + |s2> + |s3> + |s4> + |s5> + |s6> + |s7> + |s8>

-- then load it up in the console:
sa: load WP-post-processing-2.sw
sa: dump "" |list>
|s1> => 100.000|WP: US presidents>
|s2> => 100.000|WP: US presidents>
|s3> => 100.000|WP: US presidents>
|s4> => 100.000|WP: US presidents>
|s5> => 100.000|WP: US presidents>
|s6> => 100.000|WP: US presidents>
|s7> => 100.000|WP: US presidents>
|s8> => 100.000|WP: US presidents>
-- hrmm... a boring, but correct, result.

-- Let's try again. How about first names this time? Something a little harder, and more interesting, I hope.
|s1> => find-topic[words] (|thomas>)
|s2> => find-topic[words] (|thomas> + |ronald>)
|s3> => find-topic[words] (|thomas> + |ronald> + |richard>)
|s4> => find-topic[words] (|thomas> + |ronald> + |richard> + |bill>)
|s5> => find-topic[words] (|thomas> + |ronald> + |richard> + |bill> + |barack>)
|s6> => find-topic[words] (|thomas> + |ronald> + |richard> + |bill> + |barack> + |george>)
|s7> => find-topic[words] (|thomas> + |ronald> + |richard> + |bill> + |barack> + |george> + |james>)
|s8> => find-topic[words] (|thomas> + |ronald> + |richard> + |bill> + |barack> + |george> + |james> + |jimmy>)
|list> => |s1> + |s2> + |s3> + |s4> + |s5> + |s6> + |s7> + |s8>

sa: load WP-post-processing-3.sw
sa: dump "" |list>
|s1> => 63.953|WP: US presidents> + 23.256|WP: Australia> + 12.791|WP: Adelaide>
|s2> => 81.977|WP: US presidents> + 11.628|WP: Australia> + 6.395|WP: Adelaide>
|s3> => 82.856|WP: US presidents> + 12.880|WP: Australia> + 4.264|WP: Adelaide>
|s4> => 87.142|WP: US presidents> + 9.660|WP: Australia> + 3.198|WP: Adelaide>
|s5> => 89.714|WP: US presidents> + 7.728|WP: Australia> + 2.558|WP: Adelaide>
|s6> => 85.108|WP: US presidents> + 9.450|WP: Australia> + 5.443|WP: Adelaide>
|s7> => 79.819|WP: US presidents> + 12.097|WP: Australia> + 6.863|WP: Adelaide> + 1.221|WP: rivers>
|s8> => 76.786|WP: US presidents> + 11.561|WP: Adelaide> + 10.585|WP: Australia> + 1.069|WP: rivers>
-- so it works pretty well. Though having a larger collection of wikipedia pages might weaken the effect.
sa: split |thomas ronald richard bill barack george james jimmy>
|thomas> + |ronald> + |richard> + |bill> + |barack> + |george> + |james> + |jimmy>

sa: find-topic[words] split |thomas ronald richard bill barack george james jimmy>
76.786|WP: US presidents> + 11.561|WP: Adelaide> + 10.585|WP: Australia> + 1.069|WP: rivers>

which, as expected, gives the same result as |s8> above.
def split_ket(one):
    # split a ket's label on white-space, and return the pieces as a superposition of kets
    result = superposition()
    result.data = [ket(w) for w in one.the_label().split()]
    return result
sa: history 1000
  files
  load WP-word-frequencies.sw
  find-topic[words] |south australia>
  find-topic[words-2] |south australia>
  find-topic[words] |adelaide>
  find-topic[words] |sydney>
  find-topic[words] |wikipedia>
  find-topic[words] |canberra>
  find-topic[words] (|river> + |nile>)
  find-topic[words-2] |river nile>
  find-topic[words-2] |adelaide university>
  find-topic[words] (|adelaide> + |university>)
  find-topic[words-3] |university of adelaide>
  find-topic[words] |physics>
  find-topic[words-2] |particle physics>
  find-topic[words] (|particle> + |physics>)
  find-topic[words] |electron>
  find-topic[words-2] |bill clinton>
  find-topic[words] |george bush>
  find-topic[words-2] |george bush>
  find-topic[words] (|george> + |bush>)
  find-topic[words-2] |richard nixon>
  find-topic[words-2] |thomas jefferson>
  find-topic[words] |reagan>
  find-topic[words-2] |aami stadium>
  find-topic[words-2] |river torrens>
  find-topic[words-2] |rundle mall>
  find-topic[words-2] |united states>
  find-topic[words-2] |united kingdom>
  find-topic[words] |thailand>
  find-topic[words] |burma>
  find-topic[words] |new zealand>
  find-topic[words-2] |new zealand>
  find-topic[words] |japan>
  find-topic[words] |egypt>
  find-topic[words] |brazil>
  |t1> => find-topic[words] |egypt>
  |t2> => find-topic[words] |brazil>
  ket-simm(""|t1>, "" |t2>)
  history 100
  100 ket-simm(""|t1>, "" |t2>)
  matrix[words]
  history 1000

Second, here are the WP word to frequency lists as matrices:
find |*> #=> find-topic[words] |_self> + find-topic[words-2] |_self> + find-topic[words-3] |_self>

Anyway, something close to that.
sa: find-topic[words] |physics>
54.237|WP: physics> + 45.763|WP: particle physics>

sa: find-topic[words-2] |particle physics>
60.000|WP: particle physics> + 40.000|WP: physics>

sa: find-topic[words] (|particle> + |physics>)
51.605|WP: particle physics> + 48.395|WP: physics>

Here we have:
sa: find |physics>
54.237|WP: physics> + 45.763|WP: particle physics>

sa: find |particle physics>
60.000|WP: particle physics> + 40.000|WP: physics>

sa: find (|particle> + |physics>)
103.210|WP: particle physics> + 96.790|WP: physics>

So find applied to a superposition is broken. Not sure why, yet.
normed-find |*> #=> normalize[100] (find-topic[words] |_self> + find-topic[words-2] |_self> + find-topic[words-3] |_self>)

So the bug is in using |*> rules on a superposition. I'll try and think of a fix later.
-- First try is simply:
sa: compress-ratio |*> #=> arithmetic(count words |_self>, |/>, count-sum words |_self>)

-- test an example:
sa: compress-ratio |WP: Adelaide>
|number: 0.25677603423680456>

-- tweak it:
sa: compress-ratio-100 |*> #=> 100 to-number arithmetic(count words |_self>, |/>, count-sum words |_self>)
sa: compress-ratio-100 |WP: Adelaide>
25.678| >

-- now, let's map compress-ratio-100 to all our WP pages, and show the result:
-- NB: relevant-kets is useful. It means we don't have to manually specify the list of interest.
sa: map[compress-ratio-100,cr-100] relevant-kets[words]
sa: matrix[cr-100]
[ ] = [  25.68  29.43  30.15  35.40  28.81  35.99  33.33 ] [ WP: Adelaide         ]
                                                           [ WP: Australia        ]
                                                           [ WP: country list     ]
                                                           [ WP: particle physics ]
                                                           [ WP: physics          ]
                                                           [ WP: rivers           ]
                                                           [ WP: US presidents    ]

-- now the words-2 data:
sa: w2-compress-ratio-100 |*> #=> 100 to-number arithmetic(count words-2 |_self>, |/>, count-sum words-2 |_self>)
sa: map[w2-compress-ratio-100,cr-100-2] relevant-kets[words-2]
sa: matrix[cr-100-2]
[ ] = [  75.00  77.49  65.21  80.15  77.02  82.07  71.26 ] [ WP: Adelaide         ]
                                                           [ WP: Australia        ]
                                                           [ WP: country list     ]
                                                           [ WP: particle physics ]
                                                           [ WP: physics          ]
                                                           [ WP: rivers           ]
                                                           [ WP: US presidents    ]

-- now the words-3 data:
sa: w3-compress-ratio-100 |*> #=> 100 to-number arithmetic(count words-3 |_self>, |/>, count-sum words-3 |_self>)
sa: map[w3-compress-ratio-100,cr-100-3] relevant-kets[words-3]
sa: matrix[cr-100-3]
[ ] = [  92.96  94.59  81.56  94.18  94.66  93.78  87.30 ] [ WP: Adelaide         ]
                                                           [ WP: Australia        ]
                                                           [ WP: country list     ]
                                                           [ WP: particle physics ]
                                                           [ WP: physics          ]
                                                           [ WP: rivers           ]
                                                           [ WP: US presidents    ]

Summary, I guess: as the ngrams get bigger, the compression ratio approaches no compression (which is an expected result).
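A quick sketch of what compress-ratio is computing, under the assumption that count gives the number of distinct ngrams in a page's frequency list and count-sum gives the total ngram count:

from collections import Counter

def compress_ratio(tokens, n):
    # distinct ngrams divided by total ngrams; as n grows, repeats get rarer
    # and the ratio drifts towards 1 (ie, no compression)
    ngrams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(ngrams)
    return len(counts) / sum(counts.values())

text = "the cat sat on the mat and the cat sat on the hat".split()
for n in (1, 2, 3):
    print(n, round(100 * compress_ratio(text, n), 2))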
sa: load early-us-presidents.sw
sa: display
  context: early US Presidents

  early US Presidents: _list
    supported-ops: op:
                 : Washington, Adams, Jefferson, Madison, Monroe, Q Adams

  Washington
    supported-ops: op: president-number, op: president-era, op: party, op: full-name
    president-number: number: 1
    president-era: year: 1789, year: 1790, year: 1791, year: 1792, year: 1793, year: 1794, year: 1795, year: 1796, year: 1797
    party: party: Independent
    full-name: person: George Washington

  person: George Washington
    supported-ops: op:
                 : US President: George Washington

  Adams
    supported-ops: op: president-number, op: president-era, op: party, op: full-name
    president-number: number: 2
    president-era: year: 1797, year: 1798, year: 1799, year: 1800, year: 1801
    party: party: Federalist
    full-name: person: John Adams

  person: John Adams
    supported-ops: op:
                 : US President: John Adams

  Jefferson
    supported-ops: op: president-number, op: president-era, op: party, op: full-name
    president-number: number: 3
    president-era: year: 1801, year: 1802, year: 1803, year: 1804, year: 1805, year: 1806, year: 1807, year: 1808, year: 1809
    party: party: Democratic-Republican
    full-name: person: Thomas Jefferson

  person: Thomas Jefferson
    supported-ops: op:
                 : US President: Thomas Jefferson

  Madison
    supported-ops: op: president-number, op: president-era, op: party, op: full-name
    president-number: number: 4
    president-era: year: 1809, year: 1810, year: 1811, year: 1812, year: 1813, year: 1814, year: 1815, year: 1816, year: 1817
    party: party: Democratic-Republican
    full-name: person: James Madison

  person: James Madison
    supported-ops: op:
                 : US President: James Madison

  Monroe
    supported-ops: op: president-number, op: president-era, op: party, op: full-name
    president-number: number: 5
    president-era: year: 1817, year: 1818, year: 1819, year: 1820, year: 1821, year: 1822, year: 1823, year: 1824, year: 1825
    party: party: Democratic-Republican
    full-name: person: James Monroe

  person: James Monroe
    supported-ops: op:
                 : US President: James Monroe

  Q Adams
    supported-ops: op: president-number, op: president-era, op: party, op: full-name
    president-number: number: 6
    president-era: year: 1825, year: 1826, year: 1827, year: 1828, year: 1829
    party: party: Democratic-Republican
    full-name: person: John Quincy Adams

  person: John Quincy Adams
    supported-ops: op:
                 : US President: John Quincy Adams

  party: Democratic-Republican
    supported-ops: op: founded, op: dissolved
    founded: year: 1791
    dissolved: year: 1825

And another example:
sa: load bots.sw
sa: display
  context: bot profile

  bot: Bella
    supported-ops: op: name, op: mother, op: father, op: birth-sign, op: number-siblings, op: wine-preference, op: favourite-fruit, op: favourite-music, op: favourite-play, op: hair-colour, op: eye-colour, op: where-live, op: favourite-holiday-spot, op: make-of-car, op: religion, op: personality-type, op: current-emotion, op: bed-time, op: age
    name: Bella
    mother: Mia
    father: William
    birth-sign: birth-sign: Cancer
    number-siblings: number: 1
    wine-preference: wine: Merlot
    favourite-fruit: fruit: pineapples
    favourite-music: music: genre: punk
    favourite-play: play: Endgame
    hair-colour: hair-colour: gray
    eye-colour: eye-colour: hazel
    where-live: location: Sydney
    favourite-holiday-spot: location: Paris
    make-of-car: car: Porsche
    religion: religion: Christianity
    personality-type: personality-type: the guardian
    current-emotion: emotion: fear
    bed-time: time: 8pm
    age: age: 31

  bot: Emma
    supported-ops: op: name, op: mother, op: father, op: birth-sign, op: number-siblings, op: wine-preference, op: favourite-fruit, op: favourite-music, op: favourite-play, op: hair-colour, op: eye-colour, op: where-live, op: favourite-holiday-spot, op: make-of-car, op: religion, op: personality-type, op: current-emotion, op: bed-time, op: age
    name: Emma
    mother: Madison
    father: Nathan
    birth-sign: birth-sign: Capricorn
    number-siblings: number: 4
    wine-preference: wine: Pinot Noir
    favourite-fruit: fruit: oranges
    favourite-music: music: genre: hip hop
    favourite-play: play: No Exit
    hair-colour: hair-colour: red
    eye-colour: eye-colour: gray
    where-live: location: New York
    favourite-holiday-spot: location: Taj Mahal
    make-of-car: car: BMW
    religion: religion: Taoism
    personality-type: personality-type: the visionary
    current-emotion: emotion: kindness
    bed-time: time: 2am
    age: age: 29

  bot: Madison
    supported-ops: op: name, op: mother, op: father, op: birth-sign, op: number-siblings, op: wine-preference, op: favourite-fruit, op: favourite-music, op: favourite-play, op: hair-colour, op: eye-colour, op: where-live, op: favourite-holiday-spot, op: make-of-car, op: religion, op: personality-type, op: current-emotion, op: bed-time, op: hungry, op: age, op: friends
    name: Madison
    mother: Mia
    father: Ian
    birth-sign: birth-sign: Cancer
    number-siblings: number: 6
    wine-preference: wine: Pinot Noir
    favourite-fruit: fruit: pineapples
    favourite-music: music: genre: blues
    favourite-play: play: Death of a Salesman
    hair-colour: hair-colour: red
    eye-colour: eye-colour: amber
    where-live: location: Vancouver
    favourite-holiday-spot: location: Uluru
    make-of-car: car: Bugatti
    religion: religion: Islam
    personality-type: personality-type: the performer
    current-emotion: emotion: indignation
    bed-time: time: 10:30pm
    hungry: starving
    age: age: 23
    friends: bot: Emma, bot: Bella

And another example:
sa: load george.sw
sa: display
  context: George

  context: George
    supported-ops: op: source
    source: sw-url: http://semantic-db.org/george.sw

  word: george
    supported-ops: op: spell, op:
    spell: 2.00 letter: g, 2.00 letter: e, letter: o, letter: r
         : person: George

  person: George
    supported-ops: op: age, op: dob, op: hair-colour, op: eye-colour, op: gender, op: height, op: wife, op: occupation, op: friends, op: mother, op: father, op: sisters, op: brothers, op: siblings, op: parents, op: family, op: family-and-friends, op: email, op: education, op: can-swim
    age: age: 29
    dob: date: 1984-05-23
    hair-colour: hair-colour: brown
    eye-colour: eye-colour: blue
    gender: gender: male
    height: height: cm: 176
    wife: person: Beth
    occupation: occupation: car salesman
    friends: person: Fred, person: Jane, person: Liz, person: Andrew
    mother: person: Sarah
    father: person: David
    sisters: person: Emily
    brothers: person: Frank, person: Tim, person: Sam
    siblings: person: Frank, person: Tim, person: Sam, person: Emily
    parents: person: Sarah, person: David
    family: person: Sarah, person: David, person: Frank, person: Tim, person: Sam, person: Emily
    family-and-friends: person: Sarah, person: David, person: Frank, person: Tim, person: Sam, person: Emily, person: Fred, person: Jane, person: Liz, person: Andrew
    email: email: george.douglas@gmail.com
    education: education: high-school
    can-swim: 0.70 yes

  person: David Douglas
    supported-ops: op: is-dead
    is-dead: yes

And another example:
sa: load breakfast-menu.sw
sa: display
  context: breakfast menu

  menu: breakfast
    supported-ops: op:
                 : food: Belgian Waffles, food: Strawberry Belgian Waffles, food: Berry-Berry Belgian Waffles, food: French Toast, food: Homestyle Breakfast

  food: Belgian Waffles
    supported-ops: op: name, op: price, op: description, op: calories
    name: text: "Belgian Waffles"
    price: price: 5.95
    description: text: "Two of our famous Belgian Waffles with plenty of real maple syrup"
    calories: calories: 650

  food: Strawberry Belgian Waffles
    supported-ops: op: name, op: price, op: description, op: calories
    name: text: "Strawberry Belgian Waffles"
    price: price: 7.95
    description: text: "Light Belgian waffles covered with strawberries and whipped cream"
    calories: calories: 900

  food: Berry-Berry Belgian Waffles
    supported-ops: op: name, op: price, op: description, op: calories
    name: text: "Berry-Berry Belgian Waffles"
    price: price: 8.95
    description: text: "Light Belgian waffles covered with an assortment of fresh berries and whipped cream"
    calories: calories: 900

  food: French Toast
    supported-ops: op: name, op: price, op: description, op: calories
    name: text: "French Toast"
    price: price: 4.50
    description: text: "Thick slices made from our homemade sourdough bread"
    calories: calories: 600

  food: Homestyle Breakfast
    supported-ops: op: name, op: price, op: description, op: calories
    name: text: "Homestyle Breakfast"
    price: price: 6.95
    description: text: "Two eggs, bacon or sausage, toast, and our ever-popular hash browns"
    calories: calories: 950

  word: waffles
    supported-ops: op:
                 : food: waffles

  word: belgian
    supported-ops: op:
                 : country: Belgium

  word: strawberries
    supported-ops: op:
                 : food: strawberries, fruit: strawberries

  word: berries
    supported-ops: op:
                 : food: berries, fruit: berries

  word: french
    supported-ops: op:
                 : country: France

  word: toast
    supported-ops: op:
                 : food: toast

  word: breakfast
    supported-ops: op:
                 : meal: breakfast

  word: egg
    supported-ops: op:
                 : food: egg

  word: eggs
    supported-ops: op:
                 : food: egg

  word: bacon
    supported-ops: op:
                 : food: bacon

  word: sausage
    supported-ops: op:
                 : food: sausage

  word: two
    supported-ops: op:
                 : number: 2

  word: cream
    supported-ops: op:
                 : food: cream

And another example:
sa: load binary-tree.sw
sa: display
  context: binary tree

  x
    supported-ops: op: text, op: left, op: right
    text: start node
    left: 0
    right: 1

  0
    supported-ops: op: text, op: left, op: right
    text: first child node
    left: 00
    right: 10

  1
    supported-ops: op: text, op: left, op: right
    text: second child node
    left: 01
    right: 11

  00
    supported-ops: op: text, op: left, op: right
    text: third child node
    left: 000
    right: 100

  10
    supported-ops: op: text, op: left, op: right
    text: fourth child node
    left: 010
    right: 110

  01
    supported-ops: op: text, op: left, op: right
    text: fifth child node
    left: 001
    right: 101

  11
    supported-ops: op: text, op: left, op: right
    text: sixth child node
    left: 011
    right: 111

And of course, you don't have to display the entire context. You can choose your ket/sp of interest.
sa: display |bot: Madison>
or:
sa: display (|bot: Madison> + |bot: Emma>)

BTW, the code for all this is (in the code file):
def display_ket(self, one):                       # one is a ket
    label = one.the_label() if type(one) == ket else one
    head = "  " + label + "\n"
    op_list = self.rule_list[label]
    if len(op_list) == 0:
        return head
    max_len = max(len(op) for op in op_list)
    sep = ": "
    frame = "\n".join("    " + op.rjust(max_len) + sep + self.recall(op, label).readable_display() for op in op_list)
    return head + frame + "\n"

def display_sp(self, sp):
    if type(sp) == ket:
        return self.display_ket(sp)
    if type(sp) == superposition:
        return "\n".join(self.display_ket(x) for x in sp.data)

def display_all(self):
    head = "  context: " + self.name + "\n\n"
    return head + "\n".join(self.display_ket(x) for x in self.known_kets)
sa: load early-us-presidents.sw
sa: freq
7.000|op: > + 6.000|op: president-number> + 6.000|op: president-era> + 6.000|op: party> + 6.000|op: full-name> + 6.000|party: Democratic-Republican> + 5.000|Washington> + 5.000|Adams> + 5.000|Jefferson> + 5.000|Madison> + 5.000|Monroe> + 5.000|Q Adams> + 3.000|year: 1825> + 2.000|year: 1791> + 2.000|year: 1797> + 2.000|person: George Washington> + 2.000|year: 1801> + 2.000|person: John Adams> + 2.000|year: 1809> + 2.000|person: Thomas Jefferson> + 2.000|year: 1817> + 2.000|person: James Madison> + 2.000|person: James Monroe> + 2.000|person: John Quincy Adams> + |early US Presidents: _list> + |number: 1> + |year: 1789> + |year: 1790> + |year: 1792> + |year: 1793> + |year: 1794> + |year: 1795> + |year: 1796> + |party: Independent> + |US President: George Washington> + |number: 2> + |year: 1798> + |year: 1799> + |year: 1800> + |party: Federalist> + |US President: John Adams> + |number: 3> + |year: 1802> + |year: 1803> + |year: 1804> + |year: 1805> + |year: 1806> + |year: 1807> + |year: 1808> + |US President: Thomas Jefferson> + |number: 4> + |year: 1810> + |year: 1811> + |year: 1812> + |year: 1813> + |year: 1814> + |year: 1815> + |year: 1816> + |US President: James Madison> + |number: 5> + |year: 1818> + |year: 1819> + |year: 1820> + |year: 1821> + |year: 1822> + |year: 1823> + |year: 1824> + |US President: James Monroe> + |number: 6> + |year: 1826> + |year: 1827> + |year: 1828> + |year: 1829> + |US President: John Quincy Adams> + |op: founded> + |op: dissolved>

sa: load bots.sw
sa: freq
21.000|bot: Madison> + 20.000|bot: Bella> + 20.000|bot: Emma> + 3.000|op: name> + 3.000|op: mother> + 3.000|op: father> + 3.000|op: birth-sign> + 3.000|op: number-siblings> + 3.000|op: wine-preference> + 3.000|op: favourite-fruit> + 3.000|op: favourite-music> + 3.000|op: favourite-play> + 3.000|op: hair-colour> + 3.000|op: eye-colour> + 3.000|op: where-live> + 3.000|op: favourite-holiday-spot> + 3.000|op: make-of-car> + 3.000|op: religion> + 3.000|op: personality-type> + 3.000|op: current-emotion> + 3.000|op: bed-time> + 3.000|op: age> + 2.000|Mia> + 2.000|birth-sign: Cancer> + 2.000|fruit: pineapples> + 2.000|Madison> + 2.000|wine: Pinot Noir> + 2.000|hair-colour: red> + |Bella> + |William> + |number: 1> + |wine: Merlot> + |music: genre: punk> + |play: Endgame> + |hair-colour: gray> + |eye-colour: hazel> + |location: Sydney> + |location: Paris> + |car: Porsche> + |religion: Christianity> + |personality-type: the guardian> + |emotion: fear> + |time: 8pm> + |age: 31> + |Emma> + |Nathan> + |birth-sign: Capricorn> + |number: 4> + |fruit: oranges> + |music: genre: hip hop> + |play: No Exit> + |eye-colour: gray> + |location: New York> + |location: Taj Mahal> + |car: BMW> + |religion: Taoism> + |personality-type: the visionary> + |emotion: kindness> + |time: 2am> + |age: 29> + |op: hungry> + |op: friends> + |Ian> + |number: 6> + |music: genre: blues> + |play: Death of a Salesman> + |eye-colour: amber> + |location: Vancouver> + |location: Uluru> + |car: Bugatti> + |religion: Islam> + |personality-type: the performer> + |emotion: indignation> + |time: 10:30pm> + |starving> + |age: 23>

sa: load george.sw
sa: freq
21.000|person: George> + 4.000|person: Sarah> + 4.000|person: David> + 4.000|person: Emily> + 4.000|person: Frank> + 4.000|person: Tim> + 4.000|person: Sam> + 2.000|letter: g> + 2.000|letter: e> + 2.000|word: george> + 2.000|person: Fred> + 2.000|person: Jane> + 2.000|person: Liz> + 2.000|person: Andrew> + 1.700|yes> + |op: source> + |sw-url: http://semantic-db.org/george.sw> + |context: George> + |op: spell> + |op: > + |letter: o> + |letter: r> + |op: age> + |op: dob> + |op: hair-colour> + |op: eye-colour> + |op: gender> + |op: height> + |op: wife> + |op: occupation> + |op: friends> + |op: mother> + |op: father> + |op: sisters> + |op: brothers> + |op: siblings> + |op: parents> + |op: family> + |op: family-and-friends> + |op: email> + |op: education> + |op: can-swim> + |age: 29> + |date: 1984-05-23> + |hair-colour: brown> + |eye-colour: blue> + |gender: male> + |height: cm: 176> + |person: Beth> + |occupation: car salesman> + |email: george.douglas@gmail.com> + |education: high-school> + |op: is-dead> + |person: David Douglas>

sa: load breakfast-menu.sw
sa: freq
14.000|op: > + 5.000|food: Belgian Waffles> + 5.000|food: Strawberry Belgian Waffles> + 5.000|food: Berry-Berry Belgian Waffles> + 5.000|food: French Toast> + 5.000|food: Homestyle Breakfast> + 5.000|op: name> + 5.000|op: price> + 5.000|op: description> + 5.000|op: calories> + 2.000|calories: 900> + 2.000|food: egg> + |menu: breakfast> + |text: "Belgian Waffles"> + |price: 5.95> + |text: "Two of our famous Belgian Waffles with plenty of real maple syrup"> + |calories: 650> + |text: "Strawberry Belgian Waffles"> + |price: 7.95> + |text: "Light Belgian waffles covered with strawberries and whipped cream"> + |text: "Berry-Berry Belgian Waffles"> + |price: 8.95> + |text: "Light Belgian waffles covered with an assortment of fresh berries and whipped cream"> + |text: "French Toast"> + |price: 4.50> + |text: "Thick slices made from our homemade sourdough bread"> + |calories: 600> + |text: "Homestyle Breakfast"> + |price: 6.95> + |text: "Two eggs, bacon or sausage, toast, and our ever-popular hash browns"> + |calories: 950> + |food: waffles> + |word: waffles> + |country: Belgium> + |word: belgian> + |food: strawberries> + |fruit: strawberries> + |word: strawberries> + |food: berries> + |fruit: berries> + |word: berries> + |country: France> + |word: french> + |food: toast> + |word: toast> + |meal: breakfast> + |word: breakfast> + |word: egg> + |word: eggs> + |food: bacon> + |word: bacon> + |food: sausage> + |word: sausage> + |number: 2> + |word: two> + |food: cream> + |word: cream>

sa: load binary-tree.sw
sa: freq
7.000|op: text> + 7.000|op: left> + 7.000|op: right> + 4.000|0> + 4.000|1> + 4.000|00> + 4.000|10> + 4.000|01> + 4.000|11> + 3.000|x> + |start node> + |first child node> + |second child node> + |third child node> + |000> + |100> + |fourth child node> + |010> + |110> + |fifth child node> + |001> + |101> + |sixth child node> + |011> + |111>

And the code for all this is (in the code file):
def to_freq_list(self):
    result = superposition()
    for x in self.known_kets:
        op_list = self.rule_list[x]
        count_x = len(op_list) - 1      # subtract 1 because we don't want to count the supported-ops term.
        for op in op_list:
            rule = self.recall(op, x)
            if type(rule) == ket or type(rule) == superposition:   # we currently want to ignore stored_rules.
                result += rule.apply_sigmoid(clean)   # we don't care about the coeffs (hence the clean), just if the ket is present or not.
        result += ket(x, count_x)
    return result.coeff_sort()
sa: find-topic[kets] |person: Thomas Jefferson>
sa: find-topic[kets] |person: George>
sa: find-topic[kets] |bot: Emma>

And then it will reply with the best sw file. At least, that is the plan.
-- decided not to put the result in the standard sw file directory. Else the code would "eat its own tail", by running itself on its own result.
-- so we need to change to the appropriate directory:
sa: cd sw-frequency-list

-- load the file:
sa: load sw-files-to-frequency-lists.sw

-- which file is best for Thomas Jefferson?
sa: find-topic[kets] |person: Thomas Jefferson>
50.000|sw file: breaky-presidents.sw> + 50.000|sw file: early-us-presidents.sw>

-- which file is best for George?
sa: find-topic[kets] |person: George>
60.000|sw file: george.sw> + 40.000|sw file: recall-general-rules-example.sw>

-- let's look at their frequency lists (using dump):
sa: dump find-topic[kets] |person: George>
kets |sw file: george.sw> => 21.000|person: George> + 4.000|person: Sarah> + 4.000|person: David> + 4.000|person: Emily> + 4.000|person: Frank> + 4.000|person: Tim> + 4.000|person: Sam> + 2.000|word: george> + 2.000|person: Fred> + 2.000|person: Jane> + 2.000|person: Liz> + 2.000|person: Andrew> + 2.000|yes> + |op: source> + |sw-url: http://semantic-db.org/george.sw> + |context: George> + |op: spell> + |op: > + |letter: g> + |letter: e> + |letter: o> + |letter: r> + |op: age> + |op: dob> + |op: hair-colour> + |op: eye-colour> + |op: gender> + |op: height> + |op: wife> + |op: occupation> + |op: friends> + |op: mother> + |op: father> + |op: sisters> + |op: brothers> + |op: siblings> + |op: parents> + |op: family> + |op: family-and-friends> + |op: email> + |op: education> + |op: can-swim> + |age: 29> + |date: 1984-05-23> + |hair-colour: brown> + |eye-colour: blue> + |gender: male> + |height: cm: 176> + |person: Beth> + |occupation: car salesman> + |email: george.douglas@gmail.com> + |education: high-school> + |op: is-dead> + |person: David Douglas>
kets |sw file: recall-general-rules-example.sw> => 4.000|person: *> + 2.000|op: bro> + 2.000|op: sis> + 2.000|person: George> + 2.000|person: Zack> + |person: Fred> + |person: Harry> + |person: Mary> + |op: my-id> + |*> + |op: sibs> + |op: is-human> + |op: brothers> + |op: sisters> + |bro 3> + |bro 4> + |bro 5> + |sis 1>

-- which file is best for the Emma bot?
sa: find-topic[kets] |bot: Emma>
50.000|sw file: bot-emma.sw> + 50.000|sw file: bots.sw>

-- let's look at these two files' frequency lists (this time using display):
sa: display find-topic[kets] |bot: Emma>
  sw file: bot-emma.sw
    supported-ops: op: kets
    kets: 18.00 bot: Emma, op: name, op: mother, op: father, op: birth-sign, op: number-siblings, op: wine-preference, op: favourite-fruit, op: favourite-music, op: favourite-play, op: hair-colour, op: eye-colour, op: where-live, op: favourite-holiday-spot, op: make-of-car, op: religion, op: personality-type, op: current-emotion, op: bed-time, Emma, Madison, Nathan, birth-sign: Capricorn, number: 4, wine: Pinot Noir, fruit: oranges, music: genre: hip hop, play: No Exit, hair-colour: red, eye-colour: gray, location: New York, location: Taj Mahal, car: BMW, religion: Taoism, personality-type: the visionary, emotion: kindness, time: 2am

  sw file: bots.sw
    supported-ops: op: kets
    kets: 21.00 bot: Madison, 20.00 bot: Bella, 20.00 bot: Emma, 3.00 op: name, 3.00 op: mother, 3.00 op: father, 3.00 op: birth-sign, 3.00 op: number-siblings, 3.00 op: wine-preference, 3.00 op: favourite-fruit, 3.00 op: favourite-music, 3.00 op: favourite-play, 3.00 op: hair-colour, 3.00 op: eye-colour, 3.00 op: where-live, 3.00 op: favourite-holiday-spot, 3.00 op: make-of-car, 3.00 op: religion, 3.00 op: personality-type, 3.00 op: current-emotion, 3.00 op: bed-time, 3.00 op: age, 2.00 Mia, 2.00 birth-sign: Cancer, 2.00 fruit: pineapples, 2.00 Madison, 2.00 wine: Pinot Noir, 2.00 hair-colour: red, Bella, William, number: 1, wine: Merlot, music: genre: punk, play: Endgame, hair-colour: gray, eye-colour: hazel, location: Sydney, location: Paris, car: Porsche, religion: Christianity, personality-type: the guardian, emotion: fear, time: 8pm, age: 31, Emma, Nathan, birth-sign: Capricorn, number: 4, fruit: oranges, music: genre: hip hop, play: No Exit, eye-colour: gray, location: New York, location: Taj Mahal, car: BMW, religion: Taoism, personality-type: the visionary, emotion: kindness, time: 2am, age: 29, op: hungry, op: friends, Ian, number: 6, music: genre: blues, play: Death of a Salesman, eye-colour: amber, location: Vancouver, location: Uluru, car: Bugatti, religion: Islam, personality-type: the performer, emotion: indignation, time: 10:30pm, starving, age: 23

-- look for files that make use of the "friends" operator:
sa: find-topic[kets] |op: friends>
23.622|sw file: fred-sam-friends.sw> + 23.622|sw file: hello-friends.sw> + 23.622|sw file: matrix-as-network.sw> + 11.811|sw file: random-greetings.sw> + 7.874|sw file: friends.sw> + 4.724|sw file: bots.sw> + 4.724|sw file: george.sw>

-- look for files that make use of the "fib" operator:
sa: find-topic[kets] |op: fib>
25.000|sw file: active-fib-play.sw> + 25.000|sw file: fib-play.sw> + 25.000|sw file: next-fib-play.sw> + 25.000|sw file: small-fib.sw>

-- look for files about frogs:
sa: find-topic[kets] |animal: frog>
100.000|sw file: frog.sw>

-- look for files about Fred:
sa: find-topic[kets] |Fred>
37.500|sw file: fred-sam-friends.sw> + 37.500|sw file: simple-movie-recommendation-example.sw> + 25.000|sw file: hello-friends.sw>

Very cool. Works nicely.
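For the curious, here is a rough Python sketch of find-topic-style scoring, under the assumption that each file is scored by the term's normalized frequency, and the scores are then rescaled across files to sum to 100. That is my reading of the behaviour above, not a quote of the actual implementation, and the file names and counts below are made up for illustration:

freq_lists = {
    "sw file: frog.sw":   {"animal: frog": 3, "sound: ribbit": 1},
    "sw file: george.sw": {"person: George": 21, "person: Sarah": 4},
    "sw file: bots.sw":   {"bot: Madison": 21, "bot: Emma": 20},
}

def find_topic(term):
    # score each file by the term's normalized frequency, then rescale to sum to 100
    scores = {}
    for doc, freqs in freq_lists.items():
        if term in freqs:
            scores[doc] = freqs[term] / sum(freqs.values())
    total = sum(scores.values())
    return {doc: 100 * s / total for doc, s in scores.items()} if total else {}

print(find_topic("animal: frog"))   # {'sw file: frog.sw': 100.0}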
-- align and rescale the image, if needed:
     translate the image (in animals, the direction of the eye does this step)
     rotate the image
     scale the image bigger or smaller
-- then process it. For a start:
     1) unsmooth. ie: f[k] => -f[k-1]/2 + f[k] - f[k+1]/2   (applied once)
     2) drop-below[t]
     3) Gaussian smooth. ie: f[k] => f[k-1]/4 + f[k]/2 + f[k+1]/4   (applied maybe 300 times?)
-- then some more steps I haven't yet worked out.

where:
  (1) highlights edges
  (2) filters out slowly changing gradients
  (3) blurs the edges, so matching is less strict on exact alignment of pixels

Anyway, that is the gist of the idea. Need to write/run code to see how well it works, and what we need to do next.
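A first stab at those three steps in Python, on a 1D signal. The boundary handling (leaving the end samples unchanged) is my own choice; the notes above don't specify what to do at the edges:

def unsmooth(f):
    # f[k] => -f[k-1]/2 + f[k] - f[k+1]/2   (highlights edges)
    return [f[0]] + [-f[k-1]/2 + f[k] - f[k+1]/2 for k in range(1, len(f)-1)] + [f[-1]]

def drop_below(f, t):
    # zero out anything below the threshold t (filters slowly changing gradients)
    return [x if x >= t else 0 for x in f]

def smooth(f):
    # f[k] => f[k-1]/4 + f[k]/2 + f[k+1]/4   (Gaussian-ish blur; apply many times)
    return [f[0]] + [f[k-1]/4 + f[k]/2 + f[k+1]/4 for k in range(1, len(f)-1)] + [f[-1]]

signal = [0, 0, 0, 10, 10, 10, 0, 0, 0]
g = drop_below(unsmooth(signal), 2)
for _ in range(300):
    g = smooth(g)
print([round(x, 3) for x in g])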
concept 1: 0, 1, 1, 2, 5, 10, 20, 50, 60, 70, 80, 85, 90, 100, 100, 100, 0, 0, 0, 0, 0, 0, 0
concept 2: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 100, 100, 100, 100, 100, 80, 50
-- the joke eventually fades

So, the brain has machinery that looks out for humorous events.
some concept: 0, 0, 0, 0, 0, 0, 500, 500, 500, 450, 400, 400, 350, 200, 200, 100, 50, 0, 0

Swearing is somewhat similar to the wow effect.
age |Fred> => |age: 23>

Fine, but how about learning indirectly? The motivating example being:
sa: |you> => |Fred>             -- "you" currently means Fred
sa: age "" |you> => |age: 23>   -- learn "your" age
sa: |you> => |Sam>              -- "you" currently means Sam
sa: age "" |you> => |age: 21>   -- learn "your" age
sa: dump
----------------------------------------
|context> => |context: sw console>

|you> => |Sam>

age |Fred> => |age: 23>

age |Sam> => |age: 21>
----------------------------------------

The idea is to replicate what happens in a conversation with someone, where you have place-holders for the people you are talking about.
sa: load fred-sam-friends.sw
sa: age friends (|Fred> + |Sam>) => |age: 31>   -- ie, learn in one go that all of Fred's and Sam's friends are 31 years old.
sa: dump
----------------------------------------
|context> => |context: friends>

friends |Fred> => |Jack> + |Harry> + |Ed> + |Mary> + |Rob> + |Patrick> + |Emma> + |Charlie>
friends |Sam> => |Charlie> + |George> + |Emma> + |Jack> + |Rober> + |Frank> + |Julie>

age |Jack> => |age: 31>
age |Harry> => |age: 31>
age |Ed> => |age: 31>
age |Mary> => |age: 31>
age |Rob> => |age: 31>
age |Patrick> => |age: 31>
age |Emma> => |age: 31>
age |Charlie> => |age: 31>
age |George> => |age: 31>
age |Rober> => |age: 31>
age |Frank> => |age: 31>
age |Julie> => |age: 31>
----------------------------------------

Another common usage is:
op "" |list> => |bah> -- all elements in |list> have the same definition of op.(at least in the general case, and then over-write the specific cases)
sa: |list> => |Maz> + |Liz> + |Sarah>   -- just define some list.
sa: op-self "" |list> => 19 |_self>     -- NB: the |_self> here.
sa: dump                                -- the parse_rule_line() swaps in the right meaning for each case.
|list> => |Maz> + |Liz> + |Sarah>
op-self |Maz> => 19.000|Maz>            -- NB: |_self> is now |Maz>
op-self |Liz> => 19.000|Liz>            -- |_self> is now |Liz>
op-self |Sarah> => 19.000|Sarah>        -- |_self> is now |Sarah>
sa: load early-us-presidents.sw         -- load a data set. The bigger the data set, the better the result.
sa: train-of-thought[30] |Jefferson>
context: sw console
one: |Jefferson>
n: 30
|X>: |Jefferson>

|year: 1805>
recall not found
0.000|>            -- it's bugging out because we ran into a dead end.
recall not found   -- we don't know anything at all about |year: 1805>, so the train of thought stops.
0.000|>
recall not found   -- we also don't know anything about |>, so the train of thought is dead.
0.000|>
recall not found

-- quick check:
sa: dump |year: 1805>
-- nope. nothing.

sa: create inverse   -- this is the fix. It should prevent most dead ends, by pointing back to where we came from.
sa: dump |year: 1805>
inverse-president-era |year: 1805> => |Jefferson>
-- NB: if we run into |year: 1805> we can at least now go back to |Jefferson>

-- so now let's try again:
sa: train-of-thought[30] |Jefferson>   -- 30 steps in our train.
context: sw console
one: |Jefferson>
n: 30
|X>: |Jefferson>

|party: Democratic-Republican>
|Q Adams>
|party: Democratic-Republican>
|Jefferson>
|early US Presidents: _list>
|Q Adams>
|party: Democratic-Republican>
|year: 1791>
|party: Democratic-Republican>
|Madison>
|year: 1814>
|Madison>
|person: James Madison>
|US President: James Madison>
|person: James Madison>
|US President: James Madison>
|person: James Madison>
|Madison>
|early US Presidents: _list>
|Q Adams>
|number: 6>
|Q Adams>
|person: John Quincy Adams>
|Q Adams>
|number: 6>
|Q Adams>
|party: Democratic-Republican>
|year: 1825>
|party: Democratic-Republican>
|Madison>

6.000|party: Democratic-Republican> + 6.000|Q Adams> + |Jefferson> + 2.000|early US Presidents: _list> + |year: 1791> + 4.000|Madison> + |year: 1814> + 3.000|person: James Madison> + 2.000|US President: James Madison> + 2.000|number: 6> + |person: John Quincy Adams> + |year: 1825>

Yeah. The results here aren't super great, but that is only because the data set is nowhere near big enough to work well.
# where n is an int.
def console_train_of_thought(one, context, n):
    try:
        n = int(n)
    except:
        return ket("", 0)
    print("context:", context.name)
    print("one:", one)
    print("n:", n)
    X = one.pick_elt()
    print("|X>:", X)
    print()
    result = superposition()
    for k in range(n):
        op = X.apply_op(context, "supported-ops").pick_elt()   # |op> => pick-elt supported-ops |X>
        X = X.apply_op(context, op).pick_elt()                 # |X> => pick-elt apply(|op>,|X>)
        result.data.append(X)
        print(X.display())
    return result                                              # return a record of the train-of-thought

Now, what is the idea behind this?
Well:
  start with a seed superposition
  use pick-elt to randomly choose an element from that superposition
  then in a loop:
    look up that ket's supported-ops (ie, the operators that are relevant for that ket)
    randomly pick one of those operators
    apply that op to your ket
    randomly choose a new ket from the resulting ket/sp
    repeat the loop

BTW, here is another example of pick-elt:
----------------------------------------
|context> => |context: schrodingers cat>

is-alive |cat> => 0.500|yes> + 0.500|no>
alive? |*> #=> normalize pick-elt is-alive |_self>
----------------------------------------

sa: alive? |cat>
|yes>
sa: .      -- dot in the console means repeat the last computation
|no>       -- heh. saves typing. In this case "alive? |cat>"
sa: .
|yes>
sa: .
|no>
sa: .
|no>

And some trivia: I seem to recall the original motivation for the supported-ops operator was so we could write a train-of-thought function.
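For reference, pick-elt itself can be sketched in a couple of lines. Whether the console's version weights the choice by coefficient or picks uniformly over the kets is an implementation detail I won't swear to; this sketch uses a coefficient-weighted choice:

import random

def pick_elt(sp):
    # randomly choose one ket from a superposition (dict of label -> coeff),
    # weighted here by coefficient; a uniform choice is the other obvious reading
    kets = list(sp.keys())
    weights = list(sp.values())
    return random.choices(kets, weights=weights)[0]

is_alive = {"yes": 0.5, "no": 0.5}
for _ in range(5):
    print("|%s>" % pick_elt(is_alive))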
$ cat sw-examples/deli-closing-times.sw
|context> => |context: deli closing time>

|weekday: _list> => |day: monday> + |day: tuesday> + |day: wednesday> + |day: thursday> + |day: friday>
|weekend: _list> => |day: saturday> + |day: sunday>

deli-closing-time "" |weekday: _list> => |time: 6pm>
deli-closing-time "" |weekend: _list> => |time: 4:30pm>
deli-closing-time |public holiday> => |closed>

$ ./the_semantic_db_console.py
Welcome!

sa: load deli-closing-times.sw
sa: dump
----------------------------------------
|context> => |context: deli closing time>

|weekday: _list> => |day: monday> + |day: tuesday> + |day: wednesday> + |day: thursday> + |day: friday>
|weekend: _list> => |day: saturday> + |day: sunday>

deli-closing-time |day: monday> => |time: 6pm>
deli-closing-time |day: tuesday> => |time: 6pm>
deli-closing-time |day: wednesday> => |time: 6pm>
deli-closing-time |day: thursday> => |time: 6pm>
deli-closing-time |day: friday> => |time: 6pm>
deli-closing-time |day: saturday> => |time: 4:30pm>
deli-closing-time |day: sunday> => |time: 4:30pm>
deli-closing-time |public holiday> => |closed>
----------------------------------------

sa: display
  context: deli closing time

  weekday: _list
    supported-ops: op:
                 : day: monday, day: tuesday, day: wednesday, day: thursday, day: friday

  weekend: _list
    supported-ops: op:
                 : day: saturday, day: sunday

  day: monday
    supported-ops: op: deli-closing-time
    deli-closing-time: time: 6pm

  day: tuesday
    supported-ops: op: deli-closing-time
    deli-closing-time: time: 6pm

  day: wednesday
    supported-ops: op: deli-closing-time
    deli-closing-time: time: 6pm

  day: thursday
    supported-ops: op: deli-closing-time
    deli-closing-time: time: 6pm

  day: friday
    supported-ops: op: deli-closing-time
    deli-closing-time: time: 6pm

  day: saturday
    supported-ops: op: deli-closing-time
    deli-closing-time: time: 4:30pm

  day: sunday
    supported-ops: op: deli-closing-time
    deli-closing-time: time: 4:30pm

  public holiday
    supported-ops: op: deli-closing-time
    deli-closing-time: closed

sa: matrix[deli-closing-time]
[ closed       ] = [  0     0     0     0     0     0     0     1.00 ] [ day: friday    ]
[ time: 4:30pm ]   [  0     0     1.00  1.00  0     0     0     0    ] [ day: monday    ]
[ time: 6pm    ]   [  1.00  1.00  0     0     1.00  1.00  1.00  0    ] [ day: saturday  ]
                                                                       [ day: sunday    ]
                                                                       [ day: thursday  ]
                                                                       [ day: tuesday   ]
                                                                       [ day: wednesday ]
                                                                       [ public holiday ]
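The learn-time expansion of those "" |list> rules into one rule per list element can be sketched like this. learn_for_each is a hypothetical helper, standing in for what parse_rule_line() does:

rules = {}
weekday_list = ["day: monday", "day: tuesday", "day: wednesday",
                "day: thursday", "day: friday"]
weekend_list = ["day: saturday", "day: sunday"]

def learn_for_each(op, members, value):
    # one specific learn rule per member of the list
    for member in members:
        rules[(op, member)] = value

learn_for_each("deli-closing-time", weekday_list, "time: 6pm")
learn_for_each("deli-closing-time", weekend_list, "time: 4:30pm")
rules[("deli-closing-time", "public holiday")] = "closed"

print(rules[("deli-closing-time", "day: wednesday")])   # time: 6pm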
-- learn that "movie: x" has an imdb rating of 8 imdb-rating-self |movie: x> => 8 |movie: x> -- NB: since this is an op-self operator, the ket on the left and right are the same. -- learn that "movie: y" has an imdb rating of 5.5 -- op-self only changes the coeff of the ket, not the label. imdb-rating-self |movie: y> => 5.5 |movie: y> -- then we need to do the same for all movies we have knowledge about.Next, we need to know the movies of an actor:
movies |actor: v> => |movie: a> + |movie: b> + |movie: c> + ...

-- eg, Robin Williams (fill in real data later!)
movies |actor: Robin Williams> => |movie: alpha> + |movie: beta> + ...

Now, we have enough to find his best movies (in this case, anything with an imdb rating of 7 or above):
best-movies |actor: Robin Williams> => coeff-sort drop-below[7] imdb-rating-self movies |actor: Robin Williams>

-- or, more generally:
best-movies |actor: *> #=> coeff-sort drop-below[7] imdb-rating-self movies |_self>
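The best-movies pipeline is just a filter and a sort, so it sketches easily in Python. The ratings below are invented for illustration:

imdb_rating = {"movie: a": 8, "movie: b": 5.5, "movie: c": 7.5, "movie: d": 6.9}
movies = ["movie: a", "movie: b", "movie: c", "movie: d"]   # movies |actor: v>

def drop_below(sp, t):
    # keep only kets with coeff >= t
    return {k: v for k, v in sp.items() if v >= t}

def coeff_sort(sp):
    # sort kets by coefficient, highest first
    return sorted(sp.items(), key=lambda x: -x[1])

rated = {m: imdb_rating[m] for m in movies}   # imdb-rating-self movies |actor: v>
best = coeff_sort(drop_below(rated, 7))       # coeff-sort drop-below[7]
for movie, rating in best:
    print("%s|%s>" % (rating, movie))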
for x in word_list:
  for y in word_list:
    shared-context |words: x, y> => drop-below[t2] (drop-below[t1] find-topic[op] |x> + drop-below[t1] find-topic[op] |y>)

An example, perhaps: I am currently reading a book about Charles Babbage and the Difference Engine.
-- before we have defined anything:
sa: plural |word: cat>
|>
-- in English, |> means "I don't know anything about that".

-- define a general rule:
sa: plural |word: *> #=> merge-labels(|_self> + |s>)

-- test it:
sa: plural |word: cat>
|word: cats>
sa: plural |word: dog>
|word: dogs>

-- ok. But what about the irregular forms?
sa: plural |word: mouse>
|word: mouses>
sa: plural |word: foot>
|word: foots>

-- ok. we have a general rule, so now just define the specific rules:
-- learn the mouse specific rule:
sa: plural |word: mouse> => |word: mice>
-- learn the foot specific rule:
sa: plural |word: foot> => |word: feet>

-- now, try again:
sa: plural |word: mouse>
|word: mice>
sa: plural |word: foot>
|word: feet>

And let's check what this looks like in matrix form:

sa: matrix[plural]
[ word: *s   ] = [  1.00  0     0    ] [ word: *     ]
[ word: feet ]   [  0     1.00  0    ] [ word: foot  ]
[ word: mice ]   [  0     0     1.00 ] [ word: mouse ]
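The general-rule-plus-override behaviour amounts to a specific lookup with a fallback. A minimal Python sketch, assuming the general rule is just appending an s:

specific_plural = {"word: mouse": "word: mice", "word: foot": "word: feet"}

def plural(word):
    # a specific rule wins; otherwise fall back to the general |*> rule,
    # ie merge-labels(|_self> + |s>)
    if word in specific_plural:
        return specific_plural[word]
    return word + "s"

for w in ["word: cat", "word: dog", "word: mouse", "word: foot"]:
    print("plural |%s> is |%s>" % (w, plural(w)))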
1) <x||y> == 0 if x != y. (NB: we deviate from QM a little here, since this is not always true in QM. eg: <p|x> = exp(ipx))
2) <x||y> == 1 if x == y.
3) <!x||y> == 1 if x != y. (NB: the ! acts as a not. cf, the -v switch for grep)
4) <!x||y> == 0 if x == y.
5) <x: *||y: z> == 0 if x != y.
6) <x: *||y: z> == 1 if x == y, for any z.
7) applying bras is linear: <x|(|a> + |b> + |c>) == <x||a> + <x||b> + <x||c>
8) if a coeff is not given, then it is 1. eg, <x| == <x|1 and 1|x> == |x>
9) bras and kets commute with the coefficients. eg, <x|7 == 7 <x| and 13|x> == |x>13
10) in contrast to QM, in BKO operators are right associative only. <a|(op|b>) is valid, and is identical to <a|op|b>. (<a|op)|b> is invalid, and undefined.
11) again, in contrast to QM, <a|op|b> != <b|op|a>^* (a consequence of (10), really)
12) applying projections is linear: |x><x|(|a> + |b> + |c>) == |x><x||a> + |x><x||b> + |x><x||c>
13) kets in superpositions commute: |a> + |b> == |b> + |a>
14) kets in sequences do not commute: |a> . |b> != |b> . |a>
    Though maybe in the sequence version of simm, this would be useful: |a> . |b> = c |b> . c |a>, where usually c < 1.
    (yeah, it "bugs out" if you swap it back again, but in practice it should be fine)
    another example: |c> . |a> . |b> = c |a> . c |c> . |b> = c |a> . c |b> . c^2 |c>
15) operators (in general) do not commute: <b|op2 op1|a> != <b|op1 op2|a>
16) if a coeff in a superposition is zero, we can drop it from the superposition without changing the meaning of that superposition.
17) we can arbitrarily add kets to a superposition, if they have coeff zero, without changing the meaning of that superposition.
18) |> is the identity element for superpositions: sp + |> == |> + sp == sp.
19) the + sign in superpositions is literal. ie, kets add: |a> + |a> + |a> = 3|a> and |a> + |b> + |c> + 6|b> = |a> + 7|b> + |c>
20) <x|op-sequence|y> is always a scalar/float.
21) |x><x|op-sequence|y> is always a ket or a superposition.

Now, some examples:
<hungry|(0.2|hungry> + 10|tired> + 10|emotion: happy>)
  == <hungry|0.2|hungry> + <hungry|10|tired> + <hungry|10|emotion: happy>
  == 0.2 + 0 + 0
  == 0.2

<tired|(0.2|hungry> + 10|tired> + 10|emotion: happy>)
  == <tired|0.2|hungry> + <tired|10|tired> + <tired|10|emotion: happy>
  == 0 + 10 + 0
  == 10

Applying a category bra to a superposition (in this case the category of people):
<person: *|(2|fish> + 9|animal: cat> + 5|person: Fred> + 7|animal: dog> + 2|person: Sam> + 13|building: house>)
  == <person: *|2|fish> + <person: *|9|animal: cat> + <person: *|5|person: Fred> + <person: *|7|animal: dog> + <person: *|2|person: Sam> + <person: *|13|building: house>
  == 0 + 0 + <person: *|5|person: Fred> + 0 + <person: *|2|person: Sam> + 0
  == 5 + 2
  == 7

And some examples of projections:
|_self><fish|(2|fish> + 9|animal: cat> + 5|person: Fred> + 7|animal: dog> + 2|person: Sam> + 13|building: house>)
  == 2|fish>

|_self><animal: *|(2|fish> + 9|animal: cat> + 5|person: Fred> + 7|animal: dog> + 2|person: Sam> + 13|building: house>)
  == 9|animal: cat> + 7|animal: dog>

|_self><person: *|(2|fish> + 9|animal: cat> + 5|person: Fred> + 7|animal: dog> + 2|person: Sam> + 13|building: house>)
  == 5|person: Fred> + 2|person: Sam>

|_self><building: *|(2|fish> + 9|animal: cat> + 5|person: Fred> + 7|animal: dog> + 2|person: Sam> + 13|building: house>)
  == 13|building: house>

|_self><person: Fred|(2|fish> + 9|animal: cat> + 5|person: Fred> + 7|animal: dog> + 2|person: Sam> + 13|building: house>)
  == 5|person: Fred>

And a couple using the negation feature:

|_self><!animal: *|(2|fish> + 9|animal: cat> + 5|person: Fred> + 7|animal: dog> + 2|person: Sam> + 13|building: house>)
  == 2|fish> + 0 + 5|person: Fred> + 0 + 2|person: Sam> + 13|building: house>

|_self><!person: *|(2|fish> + 9|animal: cat> + 5|person: Fred> + 7|animal: dog> + 2|person: Sam> + 13|building: house>)
  == 2|fish> + 9|animal: cat> + 0 + 7|animal: dog> + 0 + 13|building: house>

Physically in the brain (in this model):
1) brain-space is the 3D lattice representing the physical location of neurons in a brain (a 4D lattice if we count time, though the exact time-step depends on the integration window at each neuron)
2) each ket corresponds to a neuron.
3) applying a bra corresponds to measuring the value of that neuron (since it is spiking, averaged over some time period, presumably).
4) some operators step/propagate through brain-space. eg: op |x> => |a> + |b>
5) some operators "measure" a value (this is common in QM). eg: op-self |x> => n |x>, where n is a scalar/float, and the convention in this case is to label the operator op-self.

Some notes:
a) we can map a superposition to the lattice representation by adding in kets with 0 coeff for all kets not mentioned in the superposition, but that are in the lattice.
b) a couple of examples of (4) are:
     population |location: Adelaide> => |population: 1200000>
     age |person: Fred> => |age: 26>
c) examples of (5) (representing the same information) are:
     population-self |location: Adelaide> => 1200000 |location: Adelaide>
     age-self |person: Fred> => 26 |person: Fred>

Depending on what you are doing, you choose the form you want.
friends |Fred> => |Sam> + |Harry>
age |Fred> => |age: 21>
parents |Harry> => |Liz> + |Richard>

In python, these correspond to:
context.learn("friends","Fred",ket("Sam") + ket("Harry")) context.learn("age","Fred","age: 21") context.learn("parents","Harry",ket("Liz") + ket("Richard"))Now, if we load them up in the console:
sa: friends |Fred> => |Sam> + |Harry>
sa: age |Fred> => |age: 21>
sa: parents |Harry> => |Liz> + |Richard>
sa: dump
----------------------------------------
|context> => |context: sw console>

friends |Fred> => |Sam> + |Harry>
age |Fred> => |age: 21>

parents |Harry> => |Liz> + |Richard>
----------------------------------------

eg:
-- query the age of Fred:
sa: age |Fred>
|age: 21>

-- learn the age of Harry's mum:
sa: age |Liz> => |age: 42>

-- query the age of Harry's mum:
sa: age |Liz>
|age: 42>

In python, the queries are:
context.recall("age",ket("Fred")) context.recall("age",ket("Liz")) -- or: ket("Fred").apply_op(context,"age") ket("Liz").apply_op(context,"age")BTW, "molecules of knowledge" or learn rules, I guess, could be considerd labelled pointers mapping a ket to a superposition.
sa: |tmp> => |some> + |result>
sa: "" |tmp>   -- NB: |tmp> and "" |tmp> are distinct objects.
|some> + |result>