Lightning Talks at NSS Convention

I stole the idea for Lightning Talks from the Python (computer programming language) community. It was a general session at a Python conference in Vancouver, and it was very very cool. I wondered whether the NSS (National Speleological Society – cavers) would be able to use such a session at their annual convention.

Five years later, I have facilitated four Lightning Talks sessions at NSS Conventions, and this year will be the fifth.

This year, Lightning Talks will be on Thursday afternoon at Convention. The usual rules will apply.

  • Please provide your name and the title of your presentation on a 3×5 card to help me with accounting. Blank 3×5 cards will be provided.
  • We will be set-up for powerpoint, if you need to project any visuals. Bring other formats at your own risk. You may use free-of-cost software like OpenOffice (or LibreOffice) to make presentations and save as powerpoint format. Put the file(s) on a USB flash drive, and that’s all you need to bring.
  • Keep it short. Five minutes is a good target. There will be no hooks or sirens or red lights, but if you seem to be running long and eyes are glazing over in the audience, I will make arm or finger gestures at you. In extreme cases, I may begin applause.
  • We will go until we run out of time, or until we run out of presentations. Multiple presentations by any individual speaker will be allowed if time allows.

Lightning Talks is a good place for cavers to try out presenting at Convention. The atmosphere is pretty casual and tolerant, and there’s not a whole lot of preparation required. Since there is no preregistration, you can decide on-the-fly whether to participate. If you embarrass easily, a couple of practice run-throughs will improve your confidence.

Suggested topics:

  • Say a little about a big cave discovery
  • Say a lot about a little cave discovery
  • Relate something that a caver has done
  • Relate something that a caver shouldn’t do
  • Propose a half-baked idea about caves, cavers, or caving (allow time for discussion)
  • Demonstrate a cool device or technique
  • Say something cool and recent about your Grotto, Region, Survey, Project, or significant other
  • Tell a short story about caves, cavers, or caving
  • Enumerate the caves of Rhode Island (actually, this one has already been done)

Over the past four years, we have averaged about thirty in the audience and about a dozen presentations at NSS Convention Lightning Talks. Who knows? Some year, we may run out of time before we run out of presentations.

Posted in Uncategorized | Comments Off

Choosing a Python JSON Translator

Choosing a Python JSON Translator

This is a draft for comments. I already know it needs further work.

Among other things, I plan to move this off WordPress so that the data are more generally usable.

I noticed publication of python-cjson recently. I wondered whether the
claimed 10 to 200 times speed increase was real. It was reasonable to expect that a
“C” back end would make a difference. So, a few lines of code later, here
is a quick evaluation of the popular JSON translators for Python.

If you do not want to read all of this, here is the bottom line.
All of the translators evaluated here work OK. python-cjson is much faster,
but it is less tolerant of edge-case inputs than some. Fortunately, in python
you do not really have to choose. You have the option to send the offending
input to another, more tolerant, implementation when cjson throws an exception.
So, you can have fast JSON processing and still adhere to the mantra, “Be
liberal in what you accept, conservative in what you send.”

This evaluation borrows heavily from a comparison of php json libraries .

Modules Evaluated

In no particular order, here are the JSON translators for this evaluation. Disclosure notification: I am the author of the minjsons.

Package Version License url Cheese Shop Name
python-cjson 1.0.3 LGPL http://cheeseshop.python.org/pypi/python-cjson/1.0.3 python-cjson
demjson 1.1 LGPL http://deron.meranda.us/python/demjson/ demjson
simplejson 1.4 MIT http://undefined.org/python/#simplejson simplejson
json-py 3.4 LGPL http://sourceforge.net/projects/json-py/
minjson (old) 3.4 (dist with json-py) LGPL/ZPL http://sourceforge.net/projects/json-py/
minjson distributed with jsonserver BSD http://zif.hill-street.net/jsonserver/ soon to be zif.jsonserver.minjson in “zif”

Speed tests

Code for the testdata object for the speed tests follows.

testdata = []
data1 = [1,1.01,'hello world',True,None,-1,11.011,"~!@#$%^&*()_+|",False,None]
data2 = {}
for k in ['zero','one','two','three','four','five','six','seven','eight','nine']:
    data2[k] = data1
for k in range(10):
    testdata.append(data2)
    

Timings were done using timeit, the minimum time of five sets
of 100 repetitions encoding or decoding the testdata object. These are not
really benchmarks. These numbers only provide a general idea of the
relative speed of the tested implementations.

Raw Speed of JSON Encoding Implementations
python-cjson demjson simplejson json-py minjson (old) minjson
Actual time (seconds) 0.1 3.0 0.9 0.8 0.3 1.0
Normalized (cjson=1.0) 1.0 31.4 9.5 8.6 3.0 10.2

In the above table, cjson is approximately three to thirty times as fast
encoding objects to JSON as the pure-python translators.

Raw Speed of JSON Decoding Implementations
python-cjson demjson simplejson json-py minjson (old) minjson
Actual time (seconds) 0.1 5.0 2.2 5.1 11.9 5.2
Normalized (cjson=1.0) 1.0 86.3 38.5 88.9 206.9 90.9

In the above table, cjson is approximately 40 to 200 times as fast
decoding objects from JSON as the pure-python translators.

Decoding tests

Python-cjson is fast. What are the risks? How well does it
decode JSON? json-py and minjson have some legacy code in them from earlier definitions
and misunderstandings of JSON. In particular, c-style comments and
single-quoted strings are allowed to some degree by these implementations.
cjson and simplejson are less tolerant of these and some other edge cases
that may crop up in some user-created JSON. Again, borrowing heavily from
the php evaluation article, following is a table of outputs from the various
python JSON translators, given some edge-case inputs. All of the tested translators
are presumably strict on output, so only input edge cases are illustrated here.

Edge-case JSON Decoding Results
test # input (wp removed backslashes; look to the right to see what is going on.) python-cjson demjson simplejson json-py minjson (old) minjson
0
""
empty JSON description (‘can not decode value’, ”) No JSON object could be decoded Nothing to read: ” Syntax error. Could not read ”. Unacceptable JSON expression:
1
"1"
1 1 1 1 1 1
2
"true"
True True True True True True
3
"null"
None None None None None None
4
""hello""
hello hello hello hello hello hello
5
"not a value"
cannot parse JSON description: not a value (‘unknown keyword or identifier’, u’not’) No JSON object could be decoded Trying to read null: ‘not a value’ Syntax error. Could not read ‘not a value’. Unacceptable JSON expression: not a value
6
"[]"
[] [] [] [] [] []
7
"[1]"
[1] [1] [1] [1] [1] [1]
8
"[1.1]"
[1.1000000000000001] [1.1000000000000001] [1.1000000000000001] [1.1000000000000001] [1.1000000000000001] [1.1000000000000001]
9
"[-1E+4]"
[-10000.0] [-10000.0] [-10000.0] Not a valid JSON array: ‘[-1E+4]‘ due to: ‘E’ [-10000.0] [-10000.0]
10
"[100.0e-2]"
[1.0] [1.0] [1.0] Not a valid JSON array: ‘[100.0e-2]‘ due to: ‘e’ [1.0] [1.0]
11
"[.5]"
cannot parse JSON description [0.5] Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘[.5]‘ [0.5] [0.5]
12
"[5.]"
[5.0] (‘decimal point must be followed by at least one digit’, u’5.]’) Expecting , delimiter: line 1 column 3 (char 3) [5.0] [5.0] [5.0]
13
"[.]"
cannot parse JSON description (‘decimal point must be followed by at least one digit’, ‘.]’) Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘[.]‘ Syntax error. Could not read ‘[.]‘. Unacceptable JSON expression: [.]
14
"[5..5]"
invalid number starting at position 1 (‘decimal point must be followed by at least one digit’, u’5..5]’) Expecting , delimiter: line 1 column 3 (char 3) Not a valid JSON number: ’5..5′ Syntax error. Could not read ‘[5..5]‘. Unacceptable JSON expression: [5..5]
15
"[10e]"
invalid number starting at position 1 (‘not a valid JSON numeric literal’, u”) Expecting , delimiter: line 1 column 4 (char 4) Not a valid JSON array: ‘[10e]‘ due to: ‘e’ Syntax error. Could not read ‘[10e]‘. Unacceptable JSON expression: [10e]
16
"[e10]"
cannot parse JSON description (‘unknown keyword or identifier’, u’e10′) Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘[e10]‘ Strings must be quoted. Could not read ‘[e10]‘. Unacceptable JSON expression: [e10]
17
"[010e2]"
[1000.0] (‘initial zero digit must not be followed by other digits (octal numbers are not permitted)’, u’010e2]’) Expecting , delimiter: line 1 column 3 (char 3) Not a valid JSON array: ‘[010e2]‘ due to: ‘e’ [1000.0] [1000.0]
18
"[010.2]"
[10.199999999999999] (‘initial zero digit must not be followed by other digits (octal numbers are not permitted)’, u’010.2]’) Expecting , delimiter: line 1 column 3 (char 3) [10.199999999999999] [10.199999999999999] [10.199999999999999]
19
"[010]"
[10] (‘initial zero digit must not be followed by other digits (octal numbers are not permitted)’, u’010]’) Expecting , delimiter: line 1 column 3 (char 3) [10] [8] [10]
20
"[0xFF]"
cannot parse JSON description [255] Expecting , delimiter: line 1 column 3 (char 3) Not a valid JSON array: ‘[0xFF]‘ due to: ‘x’ [255] [255.0]
21
"[0xff]"
cannot parse JSON description [255] Expecting , delimiter: line 1 column 3 (char 3) Not a valid JSON array: ‘[0xff]‘ due to: ‘x’ [255] [255.0]
22
"[true]"
[True] [True] [True] [True] [True] [True]
23
"[TRUE]"
cannot parse JSON description (‘unknown keyword or identifier’, u’TRUE’) Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘[TRUE]‘ Strings must be quoted. Could not read ‘[TRUE]‘. Unacceptable JSON expression: [TRUE]
24
"[null]"
[None] [None] [None] [None] [None] [None]
25
"[NULL]"
cannot parse JSON description: NULL] (‘unknown keyword or identifier’, u’NULL’) Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘[NULL]‘ Strings must be quoted. Could not read ‘[NULL]‘. Unacceptable JSON expression: [NULL]
26
"[""]"
[''] [''] [u''] [''] [''] [u'']
27
"["a"]"
['a'] ['a'] [u'a'] ['a'] ['a'] [u'a']
28
" [ "a" ] "
['a'] ['a'] [u'a'] ['a'] ['a'] [u'a']
29
"[1,]"
expecting array item at position 3 [1] Expecting object: line 1 column 3 (char 3) Input is not valid JSON: ‘[1,]‘ [1] Unacceptable JSON expression: [1,]
30
"[,]"
expecting array item at position 1 [demjson.undefined] Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘[,]‘ Syntax error. Could not read ‘[,]‘. Unacceptable JSON expression: [,]
31
"[,1]"
expecting array item at position 1 [demjson.undefined, 1] Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘[,1]‘ Syntax error. Could not read ‘[,1]‘. Unacceptable JSON expression: [,1]
32
"[1,,1]"
expecting array item at position 3 [1, demjson.undefined, 1] Expecting object: line 1 column 3 (char 3) Input is not valid JSON: ‘[1,,1]‘ Syntax error. Could not read ‘[1,,1]‘. Unacceptable JSON expression: [1,,1]
33
"// comment here
[1]"
cannot parse JSON description [1] No JSON object could be decoded [1] [1] [1]
34
"[// comment here
1]"
cannot parse JSON description [1] Expecting object: line 1 column 1 (char 1) [1] [1] [1]
35
"[1// comment here
]"
cannot parse JSON description [1] Expecting , delimiter: line 1 column 3 (char 3) [1] [1] [1]
36
"[1]// comment here
"
extra data after JSON description at position 3 [1] Extra data: line 1 column 3 – line 2 column 1 (char 3 – 20) [1] Syntax error. Could not read ‘[1]// comment here’. Unacceptable JSON expression: [1]// comment here
37
"/*comment here*/[1]"
cannot parse JSON description [1] No JSON object could be decoded [1] [1] [1]
38
"[/*comment here*/1]"
cannot parse JSON description [1] Expecting object: line 1 column 1 (char 1) [1] [1] [1]
39
"[1/*comment here*/]"
cannot parse JSON description [1] Expecting , delimiter: line 1 column 3 (char 3) [1] [1] [1]
40
"[1]/*comment here*/"
extra data after JSON description at position 3 [1] Extra data: line 1 column 3 – line 1 column 19 (char 3 – 19) [1] [1] [1]
41
"/**/[1]"
cannot parse JSON description [1] No JSON object could be decoded [1] [1] [1]
42
"//
[1]"
cannot parse JSON description [1] No JSON object could be decoded [1] [1] [1]
43
"[1
// comment here
]"
cannot parse JSON description [1] Expecting , delimiter: line 2 column 2 (char 5) [1] Syntax error. Could not read ‘[1
]‘.
[1]
44
"[
// comment here
1]"
cannot parse JSON description [1] Expecting object: line 2 column 1 (char 3) [1] Syntax error. Could not read ‘[
1]‘.
[1]
45
"[1]// comment here"
extra data after JSON description at position 3 [1] Extra data: line 1 column 3 – line 1 column 18 (char 3 – 18) [1] Syntax error. Could not read ‘[1]// comment here’. Unacceptable JSON expression: [1]// comment here
46
"[1]
// comment here"
extra data after JSON description at position 5 [1] Extra data: line 2 column 1 – line 2 column 16 (char 5 – 20) [1] Syntax error. Could not read ‘[1]
// comment here’.
Unacceptable JSON expression: [1]
// comment here
47
"["a
// this is a comment
b"]"
['a\r\n// this is a comment\r\nb'] (‘line terminator characters must be escaped inside string literals’, u’\r\n// this is a comment\r\nb”]’) [u'a\r\n// this is a comment\r\nb'] ['a\r\n// this is a comment\r\nb'] Syntax error. Could not read ‘["a
b"]‘.
Unacceptable JSON expression: ["a
b"]
48
"["a // this is not a comment b"]"
['a // this is not a comment b'] ['a // this is not a comment b'] [u'a // this is not a comment b'] ['a // this is not a comment b'] ['a // this is not a comment b'] [u'a // this is not a comment b']
49
"['a']"
cannot parse JSON description ['a'] Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘['a']‘ ['a'] [u'a']
50
"["a]"
unterminated string starting at position 1 (‘string literal is not terminated with a quotation mark’, u’["a]‘) Unterminated string starting at: line 1 column 1 (char 1) Not a valid JSON string: ‘["a]‘ Syntax error. Could not read ‘["a]‘. Unacceptable JSON expression: ["a]
51
"[a"]"
cannot parse JSON description (‘unknown keyword or identifier’, u’a') Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘[a"]‘ Syntax error. Could not read ‘[a"]‘. Unacceptable JSON expression: [a"]
52
"['a"]"
cannot parse JSON description (‘string literal is not terminated with a quotation mark’, u’[\'a"]‘) Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘['a"]‘ Syntax error. Could not read ‘['a"]‘. Unacceptable JSON expression: ['a"]
53
"["a']"
unterminated string starting at position 1 (‘string literal is not terminated with a quotation mark’, u’["a\']‘) Unterminated string starting at: line 1 column 1 (char 1) Not a valid JSON string: ‘["a']‘ Syntax error. Could not read ‘["a']‘. Unacceptable JSON expression: ["a']
54
"[']"
cannot parse JSON description (‘string literal is not terminated with a quotation mark’, “[']“) Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘[']‘ Syntax error. Could not read ‘[']‘. Unacceptable JSON expression: [']
55
"['']"
cannot parse JSON description [''] Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘['']‘ [''] [u'']
56
"[''']"
cannot parse JSON description (‘string literal is not terminated with a quotation mark’, u”[''']“) Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘[''']‘ Syntax error. Could not read ‘[''']‘. <minjson.ListToken instance at 0x2b5b09e9edd0>
57
"["]"
unterminated string starting at position 1 (‘string literal is not terminated with a quotation mark’, ‘["]‘) Unterminated string starting at: line 1 column 1 (char 1) Not a valid JSON string: ‘["]‘ Syntax error. Could not read ‘["]‘. Unacceptable JSON expression: ["]
58
"["""]"
unterminated string starting at position 3 (‘string literal is not terminated with a quotation mark’, u’["""]‘) Expecting , delimiter: line 1 column 4 (char 4) Not a valid JSON array: ‘["""]‘ due to: ‘”‘ Syntax error. Could not read ‘["""]‘. <minjson.ListToken instance at 0x2b5b09e9ed88>
59
"["'"]"
["'"] ["'"] [u"'"] ["'"] ["'"] [u"'"]
60
"['"']"
cannot parse JSON description ['"'] Expecting object: line 1 column 1 (char 1) Input is not valid JSON: ‘['"']‘ ['"'] [u'"']
61
"["\'"]"
["\\'"] ["'"] Invalid \escape: “‘”: line 1 column 3 (char 3) Not a valid escaped JSON character: ”’ in ["\'"] ["'"] [u"\\'"]
62
"["\\'"]"
["\\'"] ["\\'"] [u"\\'"] ["\\'"] ["\\'"] [u"\\'"]
63
"["\""]"
['"'] ['"'] [u'"'] ['"'] ['"'] [u'"']
64
"["\"]"
unterminated string starting at position 1 (‘string literal is not terminated with a quotation mark’, u’["\\"]‘) Unterminated string starting at: line 1 column 1 (char 1) Not a valid JSON string: ‘["\"]‘ Syntax error. Could not read ‘["\"]‘. Unacceptable JSON expression: ["\"]
65
"["\\"]"
['\\'] ['\\'] [u'\\'] ['\\'] ['\\'] [u'\\']
66
"["\\\"]"
unterminated string starting at position 1 (‘string literal is not terminated with a quotation mark’, u’["\\\\\\"]‘) Unterminated string starting at: line 1 column 1 (char 1) Not a valid JSON string: ‘["\\\"]‘ Syntax error. Could not read ‘["\\\"]‘. Unacceptable JSON expression: ["\\\"]
67
"
["a"]"
['a'] ['a'] [u'a'] ['a'] ['a'] [u'a']
68
"[
"a"]"
['a'] ['a'] [u'a'] ['a'] Syntax error. Could not read ‘[
"a"]‘.
[u'a']
69
"["a
"]"
['a\r\n'] (‘line terminator characters must be escaped inside string literals’, u’\r\n”]’) [u'a\r\n'] ['a\r\n'] Syntax error. Could not read ‘["a
"]‘.
Unacceptable JSON expression: ["a
"]
70
"["a"
]"
['a'] ['a'] [u'a'] ['a'] Syntax error. Could not read ‘["a"
]‘.
[u'a']
71
"["a"]
"
['a'] ['a'] [u'a'] ['a'] ['a'] [u'a']
72
"["\u0041\u00DC"]"
[u'A\xdc'] [u'A\xdc'] [u'A\xdc'] [u'A\xdc'] ['\\u0041\\u00DC'] [u'A\xdc']
73
"["\b\t\f\v\r\n"]"
['\x08\t\x0c\x0b\r\n'] ['\x08\t\x0c\x0b\r\n'] Invalid \escape: ‘v’: line 1 column 9 (char 9) Not a valid escaped JSON character: ‘v’ in ["\b\t\f\v\r\n"] ['\x08\t\x0c\x0b\r\n'] [u'\x08\t\x0c\\v\r\n']
74
"["\b\t\f\r\n"]"
['\x08\t\x0c\r\n'] ['\x08\t\x0c\r\n'] [u'\x08\t\x0c\r\n'] ['\x08\t\x0c\r\n'] ['\x08\t\x0c\r\n'] [u'\x08\t\x0c\r\n']
75
"["\x41\xDC"]"
['\\x41\\xDC'] [u'A\xdc'] Invalid \escape: ‘x’: line 1 column 3 (char 3) Not a valid escaped JSON character: ‘x’ in ["\x41\xDC"] ['A\xdc'] [u'\\x41\\xDC']
76
"[ ] [ ]"
extra data after JSON description at position 4 (‘unexpected or extra text’, u’[ ]‘) Extra data: line 1 column 4 – line 1 column 7 (char 4 – 7) Input is not valid JSON: ‘[ ] [ ]‘ Syntax error. Could not read ‘[ ] [ ]‘. Unacceptable JSON expression: [ ] [ ]
77
"["a" "b"]"
['a', 'b'] ['a', 'b'] Expecting , delimiter: line 1 column 6 (char 6) Not a valid JSON array: ‘["a" "b"]‘ due to: ‘”‘ ['ab'] Unacceptable JSON expression: ["a" "b"]
78
"[1 2]"
[1, 2] [1, 2] Expecting , delimiter: line 1 column 4 (char 4) Not a valid JSON array: ‘[1 2]‘ due to: ’2′ Syntax error. Could not read ‘[1 2]‘. Unacceptable JSON expression: [1 2]
79
"[{}]"
[{}] [{}] [{}] [{}] [{}] [{}]
80
"[ { } ]"
[{}] [{}] [{}] Input is not valid JSON: ‘[ { } ]‘ [{}] [{}]
81
"[{1}]"
expecting property name in object at position 2 (‘object property (dictionary key) has no value, expected “:”‘, u’{1}]’) Expecting property name: line 1 column 2 (char 2) Not a valid JSON object key (should be a string): 1 Syntax error. Could not read ‘[{1}]‘. Unacceptable JSON expression: [{1}]
82
"[{1:1}]"
expecting property name in object at position 2 [{1: 1}] Expecting property name: line 1 column 2 (char 2) Not a valid JSON object key (should be a string): 1 [{1: 1}] Unacceptable JSON expression: [{1:1}]
83
"[{:1}]"
expecting property name in object at position 2 (‘can not decode value’, u’:1}]’) Expecting property name: line 1 column 2 (char 2) Input is not valid JSON: ‘[{:1}]‘ Syntax error. Could not read ‘[{:1}]‘. Unacceptable JSON expression: [{:1}]
84
"[{"1":}]"
cannot parse JSON description (‘can not decode value’, u’}]’) Expecting object: line 1 column 6 (char 6) Input is not valid JSON: ‘[{"1":}]‘ Syntax error. Could not read ‘[{"1":}]‘. Unacceptable JSON expression: [{"1":}]
85
"[{"1":1}]"
[{'1': 1}] [{'1': 1}] [{u'1': 1}] [{'1': 1}] [{'1': 1}] [{u'1': 1}]
86
"[{"":1}]"
[{'': 1}] [{'': 1}] [{u'': 1}] [{'': 1}] [{'': 1}] [{u'': 1}]
87
"[{"a":}]"
cannot parse JSON description (‘can not decode value’, u’}]’) Expecting object: line 1 column 6 (char 6) Input is not valid JSON: ‘[{"a":}]‘ Syntax error. Could not read ‘[{"a":}]‘. Unacceptable JSON expression: [{"a":}]
88
"[{"a":1}]"
[{'a': 1}] [{'a': 1}] [{u'a': 1}] [{'a': 1}] [{'a': 1}] [{u'a': 1}]
89
"[{"true":1}]"
[{'true': 1}] [{'true': 1}] [{u'true': 1}] [{'true': 1}] [{'true': 1}] [{u'true': 1}]
90
"[{true:1}]"
expecting property name in object at position 2 [{True: 1}] Expecting property name: line 1 column 2 (char 2) Not a valid JSON object key (should be a string): True [{True: 1}] Unacceptable JSON expression: [{true:1}]
91
"[{null:1}]"
expecting property name in object at position 2 (‘object properties (dictionary keys) must be either string literals or numbers’, u’{null:1}]’) Expecting property name: line 1 column 2 (char 2) Not a valid JSON object key (should be a string): None [{None: 1}] Unacceptable JSON expression: [{null:1}]
92
"[{a "a":1}]"
expecting property name in object at position 2 (‘object property (dictionary key) has no value, expected “:”‘, u’{a “a”:1}]’) Expecting property name: line 1 column 2 (char 2) Input is not valid JSON: ‘[{a "a":1}]‘ Syntax error. Could not read ‘[{a "a":1}]‘. Unacceptable JSON expression: [{a "a":1}]
93
"[{"a":1"a"}]"
missing colon after object property name at position 10 (‘object property (dictionary key) has no value, expected “:”‘, u’{“a”:1″a”}]’) Expecting , delimiter: line 1 column 7 (char 7) Not a valid JSON array: ‘[{"a":1"a"}]‘ due to: ‘”‘ Syntax error. Could not read ‘[{"a":1"a"}]‘. Unacceptable JSON expression: [{"a":1"a"}]
94
"[{a b:"a"}]"
expecting property name in object at position 2 (‘object property (dictionary key) has no value, expected “:”‘, u’{a b:”a”}]’) Expecting property name: line 1 column 2 (char 2) Input is not valid JSON: ‘[{a b:"a"}]‘ Syntax error. Could not read ‘[{a b:"a"}]‘. Unacceptable JSON expression: [{a b:"a"}]
95
"[{a b:a b}]"
expecting property name in object at position 2 (‘object property (dictionary key) has no value, expected “:”‘, u’{a b:a b}]’) Expecting property name: line 1 column 2 (char 2) Input is not valid JSON: ‘[{a b:a b}]‘ Syntax error. Could not read ‘[{a b:a b}]‘. Unacceptable JSON expression: [{a b:a b}]
96
"[{a 1:a 1}]"
expecting property name in object at position 2 (‘object property (dictionary key) has no value, expected “:”‘, u’{a 1:a 1}]’) Expecting property name: line 1 column 2 (char 2) Input is not valid JSON: ‘[{a 1:a 1}]‘ Syntax error. Could not read ‘[{a 1:a 1}]‘. Unacceptable JSON expression: [{a 1:a 1}]
97
"[{a:b:c}]"
expecting property name in object at position 2 (‘unknown keyword or identifier’, u’b') Expecting property name: line 1 column 2 (char 2) Input is not valid JSON: ‘[{a:b:c}]‘ Syntax error. Could not read ‘[{a:b:c}]‘. Unacceptable JSON expression: [{a:b:c}]
98
"[/*comment here*/{"1":1}]"
cannot parse JSON description [{'1': 1}] Expecting object: line 1 column 1 (char 1) [{'1': 1}] [{'1': 1}] [{u'1': 1}]
99
"[{/*comment here*/"1":1}]"
expecting property name in object at position 2 [{'1': 1}] Expecting property name: line 1 column 2 (char 2) [{'1': 1}] [{'1': 1}] [{u'1': 1}]
100
"[{"1"/*comment here*/:1}]"
missing colon after object property name at position 5 [{'1': 1}] Expecting : delimiter: line 1 column 5 (char 5) [{'1': 1}] [{'1': 1}] [{u'1': 1}]
101
"[{"1":/*comment here*/1}]"
cannot parse JSON description [{'1': 1}] Expecting object: line 1 column 6 (char 6) [{'1': 1}] [{'1': 1}] [{u'1': 1}]
102
"[{"1":1/*comment here*/}]"
expecting property name in object at position 7 [{'1': 1}] Expecting , delimiter: line 1 column 7 (char 7) [{'1': 1}] [{'1': 1}] [{u'1': 1}]
103
"[{"1":1}/*comment here*/]"
cannot parse JSON description [{'1': 1}] Expecting , delimiter: line 1 column 9 (char 9) [{'1': 1}] [{'1': 1}] [{u'1': 1}]

Final Notes

The above tables show that there are trade-offs between speed and
liberalness with input JSON. Fortunately, you can have it both ways! python-cjson
defines DecodeError, so your server can try: to decode with cjson, and if it fails,
you can handle the exception to use a more liberal translator. Choose your
secondary decoder by speed and/or how well it handles edge-case inputs from the
tables above. This page is largely automatically generated, (code available on request),
so this page can be re-done with additional or revised translators relatively quickly.
Just let me know if and when another run should be made.

I’m not altogether sure whether the above table shows any implementation bugs.
There seems to be a bit of disagreement in test case 73. I had started this evaluation for
personal reasons, but the results came out so interesting that I thought I would share.

Posted in python | Comments Off

Factoradics update

It’s nice to see that there is some interest in the factoradic concept.

I’ve done some updates on the code, particularly “factoring” out some factorial calculations. It was silly to calculate a full factorial for each index when we are iterating through the entire list anyway and can get the factorial value by multiplying as we go.

Anyway, updated code is here. It now memoizes values and their factoradic representations to an sqlite db so that they can be looked up (either way) once calculated. I am using python’s marshal module to represent objects in the db, for speed. My first stab used cPickle, but the code is now different enough that swapping the import will not work. Hint: cPickle.loads wants a string, if you are afraid of marshal. But beware, str(aLargeInt) is surprisingly slow.

The current module is fine for moderately large sets, in the range of a couple of tens of thousands of items. It is extraordinarily zippy for less than a thousand items, which is the size of most Zope 3 containers I am dealing with. The larger the size of the set, the longer it will take for the initial calculations. Even pulling from the database gets slower as the number of items gets very large. I think it is something you just have to deal with when you are working with large numbers.

Posted in python | Comments Off

Factoradics in Python

How to store several different orderings of a set of data?

I want to deliver a test to students. The test is delivered in a computer lab. Johnny can see Janie’s screen. How to keep Johnny from copying Janie’s answers?

The questions are presented to the students one at a time. One solution would be that, even though Janie and Johnny get the same questions, they get them in a different order. Each student would see a different permutation of the questions.

At first guess, one would consider generating a random order for each instance of the test. But what if a computer crashes, or someone sets off a fire alarm during the test? How to know which permutation of questions Johnny has, and which permutation of questions Janie has, so that they can resume the same test with the questions in the same order?

Somehow, the ordering of questions needs to be stored on the server, attached to the student’s data.

The quick solution is to store a list of question ids, in the order randomly generated for each student. Not too bad. But we can do better. We can store an index to the permutation.

Factoradics

A factoradic is a special way to represent any non-negative integer. It’s a little like a basic representation of a number, where the digits are multiplied by a power of a base going from right to left. But instead of a power of the base, the multiplier is the factorial of the index. for example, factoradic [1,2,0,0] would be 1 * 3! + 2 * 2! + 0 * 1! + 0 * 0!. Neat? Maybe. But there’s more. Factoradics can be used to create a two-way mapping from an integer to a particular permutation of a set, and from a particular permutation of a set to an integer.

For any set of n unique items, there are n! possible permutations of the set. I’ll call that a permutation index, where that number (we will use zero-based indexing) may be any integer from 0 to n! – 1. If you want to create a mapping of an integer to a permutation, there are many ways to do that.

But most of those easily-coded functions only work in one direction. You can get the permutation from the index, but not the index from the permutation.

What if you have a particular ordering and want to know what its permutation index is? You can, of course, iterate from 0 using a permutation-generating function until the permutation at the index matches the permutation. But there’s a better way, using factoradics.

Finding the Permutation Index of a Permutation

A simple example will demonstrate the method.

The canonical list is ['a','b','c','d','e'].

We wish to obtain the permutation index of ['c','a','e','d','b'].

The algorithm is:

  • copy the canonical list
  • for each item in the permutation
    • find the index of the item in the canonical list copy.
    • append that index to a list.
    • remove the item from the canonical list copy.

Working the algorithm:

  1. Canonical copy is ['a','b','c','d','e']
  2. The index of ‘c’ in the canonical copy is 2.
    Store [2]
    Canonical copy is ['a','b','d','e']
  3. The index of ‘a’ in the copy is 0.
    Store [2,0]
    Canonical copy is ['b','d','e']
  4. The index of ‘e’ in the copy is 2.
    Store [2,0,2]
    Canonical copy is ['b','d']
  5. The index of ‘d’ in the copy is 1.
    Store [2,0,2,1]
    Canonical copy is ['b']
  6. The index of ‘b’ in the copy is 0.
    Store [2,0,2,1,0]
    Canonical copy is []
    Done.

We get:
[2,0,2,1,0], which is a factoradic representation of the integer 53:
2 * 4! + 0 * 3! + 2 * 2! + 1 * 1! + 0 * 0!
2 * 24 + 2 * 2 + 1
48 + 4 + 1
53

The total number of permutations is 5!, or 120, and these are numbered 0 to 119.
We can say that, by this method, ['c','a','e','d','b'] is at permutation index 53 of ['a','b','c','d','e']. We obtained that index without generating the previous 53 permutations.

Obtaining a Permutation from a Permutation Index

The forward algorithm is similarly easy:

The canonical list is ['a','b','c','d','e'].

We wish to obtain the permutation at permutation index 53.

First, we need to calculate the factoradic representation of 53, which is [2,0,2,1,0]. The actual calculation is easy, requiring only divmod (%) and a factorial function.

Now that we have the factoradic representation of the permutation index, the algorithm is as follows:

  • make a list of integers 0,1,… the same length as the canonical list
  • for each item in the factoradic
    • obtain the item at that index in the integer list.
    • append that item to a list.
    • remove the item from the integer list
  • using the new list, make a list of the items in the canonical list at the index in the new list
  1. The canonical list has five items, so generate [0,1,2,3,4]
  2. First item in the factoradic is 2.
    2 is at index 2 in the integer list.
    Store [2]
    Remove 2 from the integer list.
    Integer list is now [0,1,3,4]
  3. Next item in the factoradic is 0.
    0 is at index 0 in the integer list.
    Store [2,0]
    Remove 0 from the integer list.
    Integer list is now [1,3,4]
  4. Next item in the factoradic is 2.
    4 is at index 2 in the integer list.
    Store [2,0,4]
    Remove 4 from the integer list.
    Integer list is now [1,3]
  5. Next item in the factoradic is 1.
    3 is at index 1 in the integer list.
    Store [2,0,4,3]
    Remove 3 from the integer list.
    Integer list is now [1]
  6. Next item in the factoradic is 0.
    1 is at index 0 in the integer list.
    Store [2,0,4,3,1]
    Remove 1 from the integer list.
    Integer list is now []
    Done iterating.
  7. Using [2,0,4,3,1] as indices to the canonical list, obtain ['c','a','e','d','b'].
    Done.

We know from above that ['c','a','e','d','b'] is at permutation index 53 of ['a','b','c','d','e'].

Where’s the Python Code?

The above algorithms are not Python-specific, but they may be expressed quite nicely in Python. I think there’s enough here for any competent Pythonista to make an acceptable implementation. I provide my own implementation at factoradic.py.

Where Else Can This be Used?

I think this technnique may be useful any time one needs to persist multiple orderings of objects, or any time the ordering process is itself expensive. One particularly nice aspect of this is that obtaining a slice e.g., [65:135] of a previously-stored permutation does not require a possibly-expensive reordering of the entire set each time a slice is requested.

Caveats

For large sets, the indexes are large. For a set with 30,000 items, an index to a random permutation can be over 120,000 digits. The provided code is probably OK to get started, but memoizing some of the function results (perhaps to disk!) and profiling for your specific case may be necessary for an efficient implementation. For small sets of 1000 items or so, the provided code is probably enough. For very large sets, prepare to trade-off memory and disk space for speed. Even with gmpy, calculating 30,000 factorials will take a while.

Credits

I originally found information about the factoradic technique at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnnetsec/html/permutations.asp.
The algorithm I demonstrate here is from Wikipedia.

Posted in python | Comments Off

Now, how to say “code”.

In the next few posts, I will want to show some code.

It should look like this:



class PythonClass(object):
    def __init__(self):
        pass

Ahh! preserve-code-formatting works! That was almost a show-stopper.

Posted in Uncategorized | Comments Off

__init__()

So, I have installed WordPress, and now is the time to create the first blog entry.

We’ll see how much fun this ultimately turns out to be.

The main purpose of this blog is to document my travels in Zope 3, and at the moment, Dojo.

For starters, I will present some ideas and implementations that I have worked out in the past few months.

Posted in Dojo, Zope 3 | Comments Off