These days I'm learning how to work with some new (to me) Java cheminformatics toolkits. Google, Jython 2.2.1, and VoodooPad are helping me get up to speed:
- Google to find the documentation for the Java APIs
- Jython to learn API nuances without an edit/compile/test cycle
- VoodooPad as a scratchpad for sample scripts -- it's easy to save them in pages that sit alongside my worklog
During initial exploration I just bang out the script in VoodooPad, then repeatedly select all (Cmd-A), copy (Cmd-C), switch to a Terminal session running Jython (Cmd-Tab), and paste (Cmd-V).
The "learnings" are going into proper Jython scripts maintained with TextMate.
It's nice to be able to deploy using Jython instead of pure Java. Since almost all of the heavy lifting is being done inside the cheminformatics jars, I don't need to worry about the overhead of running a Python interpreter inside a JVM...
Here's a simple example that uses CDK to transform an SD string into a SMILES string.
import java.io
from org.openscience import cdk

def getCDKMol(sdf):
    # Parse one SD (MDL molfile) record into a CDK Molecule.
    reader = cdk.io.MDLReader(java.io.StringReader(sdf))
    result = cdk.Molecule()
    result = reader.read(result)
    return result

def getSmiles(cdkMol):
    # Write the molecule as SMILES into an in-memory buffer.
    result = java.io.StringWriter()
    writer = cdk.io.SMILESWriter(result)
    writer.write(cdkMol)
    return str(result)

def sdfToSmiles(sdf):
    return getSmiles(getCDKMol(sdf))

print sdfToSmiles("""1-7


 42 44  0  0  0  0            999 V2000
   -0.8349    0.9908    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0502    1.2457    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0502    2.0707    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8349    2.3257    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.3198    1.6582    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.6172    0.7608    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.0898    0.2062    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.5310   -0.0597    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.1984   -0.5446    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.9521   -0.2090    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0383    0.6114    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.3709    1.0964    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.8968    0.0346    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.1517   -0.7500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.5997   -1.3631    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.7927   -1.1915    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.5378   -0.4069    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.1448    1.6582    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.6195   -0.6940    0.0000 S   0  0  0  0  0  0  0  0  0  0  0  0
    3.1044   -0.0265    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.1346   -1.3614    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.2869   -1.1789    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.8546   -2.1477    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.6172    2.5557    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.0898    3.1103    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2227   -0.3952    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.1122   -1.3651    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.7920    0.9470    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.4571    1.9168    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.4488    0.6477    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9587   -0.9215    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2407   -1.8046    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.1632   -1.1420    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.1448    0.8332    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.1448    2.4832    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.9698    1.6582    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    3.7719   -0.5114    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.8020   -1.8463    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    3.9544   -1.6638    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.0700   -2.4026    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.6392   -1.8928    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.1096   -2.9323    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  1  5  1  0  0  0  0
  1  7  1  0  0  0  0
  2  3  2  0  0  0  0
  2  6  1  0  0  0  0
  3  4  1  0  0  0  0
  3 24  1  0  0  0  0
  4  5  2  0  0  0  0
  4 25  1  0  0  0  0
  5 18  1  0  0  0  0
  6  8  2  0  0  0  0
  6 12  1  0  0  0  0
  7 13  2  0  0  0  0
  7 17  1  0  0  0  0
  8  9  1  0  0  0  0
  8 26  1  0  0  0  0
  9 10  2  0  0  0  0
  9 27  1  0  0  0  0
 10 11  1  0  0  0  0
 10 19  1  0  0  0  0
 11 12  2  0  0  0  0
 11 28  1  0  0  0  0
 12 29  1  0  0  0  0
 13 14  1  0  0  0  0
 13 30  1  0  0  0  0
 14 15  2  0  0  0  0
 14 31  1  0  0  0  0
 15 16  1  0  0  0  0
 15 23  1  0  0  0  0
 16 17  2  0  0  0  0
 16 32  1  0  0  0  0
 17 33  1  0  0  0  0
 18 34  1  0  0  0  0
 18 35  1  0  0  0  0
 18 36  1  0  0  0  0
 19 20  2  0  0  0  0
 19 21  2  0  0  0  0
 19 22  1  0  0  0  0
 22 37  1  0  0  0  0
 22 38  1  0  0  0  0
 22 39  1  0  0  0  0
 23 40  1  0  0  0  0
 23 41  1  0  0  0  0
 23 42  1  0  0  0  0
M  END
$$$$
""")
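
The example above converts a single record, but an SD string can hold several, each terminated by a `$$$$` line. Here's a minimal sketch of one way to split such a string before conversion. It is plain Python with no CDK calls, and `splitSdRecords` and `sdfText` are hypothetical names of my own, not part of any toolkit. It sticks to constructs that should work in Jython 2.2.

```python
def splitSdRecords(sdf):
    # Hypothetical helper (not part of CDK): split a multi-record SD
    # string into single-record strings, each keeping its "$$$$" line.
    records = []
    current = []
    for line in sdf.split("\n"):
        current.append(line)
        if line.strip() == "$$$$":
            records.append("\n".join(current) + "\n")
            current = []
    # Keep a trailing record that lacks its "$$$$" terminator,
    # but ignore trailing blank lines.
    leftover = "\n".join(current)
    if leftover.strip():
        records.append(leftover)
    return records

# Assumed usage, with sdfText holding the contents of an SD file:
# for record in splitSdRecords(sdfText):
#     print sdfToSmiles(record)
```

Each returned string can be handed to a single-record function such as sdfToSmiles, so the CDK-facing code does not need to change.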