InChI

04 Nov
Published by RyanPakula

InChI is great and all, but today while exploring, I found two issues that bugged me. The first arose from a simple double-checking exercise: I drew praziquantel at http://pubchem.ncbi.nlm.nih.gov/edit/ and then took the InChI it returned and plugged it into chemspider.com to see if it would bring up the appropriate page for PZQ (which I already knew existed). I made the mistake of leaving one of the amide nitrogens as a carbon, so my ChemSpider search returned no hits at all. Now, in this case that's fine, because I messed up and could amend my structure, get the appropriate InChI, and have ChemSpider return the proper page. But it made me realize that I don't know of any search that returns similar structures in addition to exact matches. For example, if you search for a structure that contains an amine, but you'd also be interested in the methylamine, the ethylamine, or even the acetamide, I don't believe the literature on those would be displayed. You would have to search for each of these similar structures manually, or you might just give up if you weren't aware that a manual search was necessary. Maybe this isn't exactly the main purpose of InChIs, but if InChIs are supposed to allow for quick searching of structures, then a method of searching for (very) similar structures seems appropriate. Granted, this may be specific to ChemSpider's search engine, and another search engine may exist that allows similar-structure searching, but I don't know where to find it (it's not Google, that's for sure). And, yes, I do acknowledge that programming an algorithm that somehow decides what's similar enough and what isn't is likely very difficult, but that's not what I'm arguing here...
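To make the exact-versus-similar distinction concrete, here is a minimal Python sketch. It is not how ChemSpider or any InChI tool works: the fragment sets, the `LIBRARY` contents, the function names, and the 0.5 threshold are all made up for illustration. Similarity searches are commonly built on structural fingerprints compared with the Tanimoto coefficient rather than on identifier strings, and that is the idea shown here.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical, hand-assigned fragment features. A real system would derive
# fingerprints from the drawn structure with a cheminformatics toolkit.
LIBRARY = {
    "ethylamine": frozenset({"N", "C-N", "CH3", "C-C"}),
    "methylamine": frozenset({"N", "C-N", "CH3"}),
    "acetamide": frozenset({"N", "C-N", "CH3", "C-C", "C=O"}),
    "ethanol": frozenset({"O", "C-O", "CH3", "C-C"}),
}

# Hypothetical query ("propylamine") that is not in the library.
QUERY = frozenset({"N", "C-N", "CH3", "C-C", "C-C-C"})


def exact_search(query, library):
    """Return names whose features match the query exactly (all or nothing)."""
    return [name for name, feats in library.items() if feats == query]


def similarity_search(query, library, threshold=0.5):
    """Return (score, name) pairs at or above the threshold, best first."""
    scored = [(tanimoto(query, feats), name) for name, feats in library.items()]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)


if __name__ == "__main__":
    print(exact_search(QUERY, LIBRARY))  # exact matching finds nothing
    for score, name in similarity_search(QUERY, LIBRARY):
        print(f"{name}: {score:.2f}")
```

The exact search behaves like my failed lookup: one wrong atom and you get nothing back. The similarity search returns a ranked list instead, which is roughly the "did you also want these?" behaviour I was hoping for.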
 
The second issue is that InChI doesn't account for stereochemistry around non-C centers. This is fine for organic chemists (mostly), but for organometallic complexes, for example, the stereochemistry is completely ignored. Or what about stereochemistry at sulfur atoms? This will hopefully be addressed as InChI is improved, but I think InChI has already been announced to be in its final form. So, yeah, I'm not sure if this important issue will ever be addressed.
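For readers who haven't looked inside an InChI string: stereochemistry is carried in its own layers (`/t` for tetrahedral centers, `/b` for double bonds, plus `/m` and `/s`), so a stereocenter that isn't perceived simply never shows up there. The sketch below splits an InChI into its layers using only string handling. The helper names `inchi_layers` and `stereo_layers` are hypothetical, and the code assumes each layer prefix appears only once, which is a simplification (it would not handle repeated layers correctly).

```python
STEREO_PREFIXES = ("b", "t", "m", "s")


def inchi_layers(inchi):
    """Split an InChI string into version, formula, and prefixed layers.

    Simplification: assumes the second segment is the formula and that
    each layer prefix occurs at most once.
    """
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    parts = inchi[len("InChI="):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:  # skip empty segments
            layers[part[0]] = part[1:]
    return layers


def stereo_layers(inchi):
    """Return only the stereo-related layers that are present."""
    layers = inchi_layers(inchi)
    return {p: layers[p] for p in STEREO_PREFIXES if p in layers}


if __name__ == "__main__":
    # L-alanine, with its carbon stereocenter recorded in /t, /m, /s
    l_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
    # The same connectivity with no stereo information at all
    flat_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"
    print(stereo_layers(l_alanine))
    print(stereo_layers(flat_alanine))
```

If the stereochemistry you drew doesn't make it into those layers, two stereoisomers end up with the same string, which is exactly what worries me for metal and sulfur centers.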
 
Any comments or corrections to false statements I may have made?

Comments


Antony Williams (ChemSpiderman), Building a Structure Centric Community for Chemists. Blog: http://www.chemspider.com/blog Twitter: http://twitter.com/ChemSpiderman
It might have been easier to search for praziquantel on ChemSpider by name and download the structure (http://www.chemspider.com/Chemical-Structure.4722.html). That doesn't address your thoughts about the limitations of InChI, though. We had the ability to do SIMILARITY searches in place about 6 months ago on our alpha version. We have not yet rolled it out, and it won't happen anytime soon, but we DO have it on our list of things to do. However, it won't be done using InChIs.
There is definitely a need to support stereo around non-C centers, even in organics, as you say. For an example of the S-center issue, see http://www.chemspider.com/Chemical-Structure.23078563.html
Believe me, InChI is FAR from finished. There are very active teams at work at present discussing organometallics, polymers, Markush structures and the known limitations of InChI.
 

MatTodd

Interesting, Antony. InChI has no mechanism for comparing similar structures built into the system. Without knowing what I'm talking about, perhaps there is no need for this directly. But clearly it would be useful to be able to display similar structures were one to input a slightly-wrong structure, as was the case here. Rather like Google's useful "Did you mean..." prompts.

You're right, and I always do search for PZQ by name. As I said above, this was purely to test InChI and see if it actually worked, etc. I guess I feel like there should be a corresponding IUPAC website with a well-adapted, high-level search engine for InChI searches if they're going to push for InChI to be used. I don't believe IUPAC can depend on other websites, no matter how good those websites might be, to implement a search engine or function that enhances the usefulness of InChI. Rather, I think it's their burden to do this if they want InChI to take off...