2008/03/03

Intermittent Pickle Problems in Jython

The previous post mentioned "integrating" Jython and CPython by transmitting a stream of pickles between the two. I encountered one intermittent problem with this approach, and I'm unsure of its cause. (Hm, and I should probably post this to a Jython mailing list...)

Problem


In Jython I'd pickled the str() of a java.io.StringWriter into which I'd just written the SD representation of a CDK molecule. Jython created the pickle without complaint. But when I tried to unpickle it in CPython, for some molecules I got a traceback:
File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 970, in load_string
raise ValueError, "insecure string pickle"
ValueError: insecure string pickle


The error occurred consistently in my application code, always on the same input structure. But I couldn't derive a simple test script to demonstrate the problem.

Investigation


Examination of the problematic pickle data showed that a Python unicode literal prefix (u) had been inserted in front of the quoted string, even though the type code for the item was S (for string) rather than V (for unicode):
...
sS'sdf'
p3
Su'ZINC00000181\n CDK...
 ^ What the... ? (the u right after the S type code)
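
That stray u would explain the traceback. As far as I can tell, CPython's unpickler accepts the argument of an S opcode only if it starts and ends with the same quote character, and rejects anything else. Here is a minimal sketch of that check, paraphrased rather than copied from pickle.py; the function name is made up for illustration:

```python
def looks_like_quoted_string(rep):
    """Return True if rep starts and ends with the same quote character.

    Hypothetical helper: a paraphrase of the quoting check applied to the
    argument of an S (string) opcode, not the actual pickle.py source.
    """
    for q in ("'", '"'):
        if rep.startswith(q):
            # Needs a distinct closing quote, so a lone quote does not count.
            return len(rep) >= 2 and rep.endswith(q)
    # Anything else (such as a leading u) does not start with a quote at all.
    return False


good = looks_like_quoted_string("'sdf'")
bad = looks_like_quoted_string("u'ZINC00000181'")
```

Here good is True and bad is False, which matches the S'sdf' entry loading fine and the Su'ZINC... entry failing.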


Workaround


Google turned up a usable workaround: encode the offending string as utf-8 before trying to pickle it.


import codecs
enc = codecs.getencoder('utf8')
...
# The encoder returns an (encoded_string, length_consumed) tuple; keep only the string.
sdf = enc(sdf)[0]
...
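
For what it's worth, the same idea can probably be written with the string's own encode method, decoding again on the receiving side. A rough sketch, assuming the SD text is a unicode string and that protocol 0 (text) pickles are being exchanged; the helper names and the sample record are invented for illustration:

```python
import pickle


def pack_sdf(sdf_text):
    """Encode the text as UTF-8 bytes, then pickle the bytes (protocol 0)."""
    return pickle.dumps(sdf_text.encode("utf-8"), 0)


def unpack_sdf(data):
    """Unpickle the UTF-8 bytes and decode them back to text."""
    return pickle.loads(data).decode("utf-8")


# Made-up stand-in for a real SD record.
sample = u"ZINC00000181\n  CDK\n"
restored = unpack_sdf(pack_sdf(sample))
```

The receiving side has to know that the payload is UTF-8 and decode it explicitly, which is the price of sidestepping the unicode pickling path.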

