SciFinder Update: Markush searching and more

UCSD SciFinder Guide – includes information on accessing SciFinder (web and client versions) and links to SciFinder guides and tutorials.

CAS has launched a new release of SciFinder Web. New features include:

  • CAS has added Markush searching, which retrieves patent documents containing generic Markush structures relevant to the query structure. The Markush option now appears when you go to Explore Substances, and it complements structure searching in the CAS Registry (see more about this below the list).
  • Under Explore References, you can now search by Digital Object Identifiers (DOI) if you have them.
  • There’s a new preference setting where you can automatically remove duplicate MEDLINE/PubMed references from your answer sets. Otherwise you can remove them manually after each search.
  • When you use the Similar Reactions feature, reaction centers in your answer sets will be highlighted:
    • Broad similarity: reaction centers highlighted
    • Medium similarity: reaction centers and adjacent atoms/bonds highlighted
    • Narrow similarity: reaction centers and extended atoms/bonds highlighted
  • Substance answer sets can now be sorted by molecular weight or molecular formula as well as by CAS Registry Number or number of references (ascending or descending order).
  • Keep Me Posted updating can be set for weekly or monthly alerting. Some additional customization features have been added.

More about Markush searching, from Ben Wagner, University at Buffalo:

1. When you select the Markush option, you are searching only the structures in the Marpat database, NOT the main REGISTRY file itself (54+ million structures). IMPACT: To do a complete novelty search, you must run your query structure under both options: substructure/exact structure and Markush.

2. The Markush search does not return a set of structure records. Instead, it returns the [Chemical Abstracts] CAplus patent literature records EQUIVALENT to the corresponding Marpat patent literature records. IMPACT: You never see the actual Marpat record or the actual target Markush structures that are searched.

3. No “hit” registry numbers are highlighted in the detailed CAplus records retrieved by a Markush query. About 55% of the time, the CAplus patent record contains an embedded structure drawing in the abstract. This embedded graphic (which used to appear in the printed Chemical Abstracts as well) is often expressed in Markush conventions. IMPACT: About 55% of the time, the graphic embedded in the abstract will give you a fair to very good indication of why your query structure retrieved the patent reference. About 45% of the time, you will have only a poor to fair chance of figuring out why the patent was retrieved, and you may need to refer to the full text of the patent. The chance is poor because no registry numbers are highlighted, so you depend on the title, abstract, and keyword indexing to determine relevance.

4. Markush searching follows the default settings for an STN Marpat search in novice mode. To give one example, ring atoms and ring groups (Hy, Cy, Cb) in the query are matched only to real atoms in the Markush structures. For example, a query pyridine may match only a result pyridine, not a generic Hy. IMPACT: An STN Marpat search in expert mode would likely retrieve more (perhaps many more) hit patents than running the exact same structure in SciFinder Markush mode. If you do Markush searching, try experimenting with real atoms/explicit structures vs. generic atoms at critical positions in your query structure. Markush structures are a simple concept for chemists, but the conventions for the actual computer algorithms that map between a query structure and stored target Markush connection tables, with a mix of generic and real atoms in both the query and the target structures, are extremely complex.
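
To make the pyridine example in point 4 a little more concrete, here is a toy Python sketch. It is not SciFinder or STN code and does not reflect how either system is implemented. The patent names, the `allow_generic_targets` flag, and the small ring lists are all hypothetical, and the flag only stands in loosely for the difference between the novice-mode defaults and a broader expert-mode search.

```python
# Toy illustration only: every name and data value here is hypothetical.
# Generic ring groups, with a few example rings each one could stand for.
GENERIC_RING_GROUPS = {
    "Cy": {"pyridine", "benzene", "cyclohexane"},  # any ring
    "Hy": {"pyridine"},                            # heterocyclic rings
    "Cb": {"benzene", "cyclohexane"},              # carbocyclic rings
}


def node_matches(query_ring, target_node, allow_generic_targets):
    """Return True if a real query ring matches one node of a target structure."""
    if target_node == query_ring:
        # A real ring in the target always matches the same real ring.
        return True
    if allow_generic_targets and target_node in GENERIC_RING_GROUPS:
        # Broader matching: a generic group matches if it covers the query ring.
        return query_ring in GENERIC_RING_GROUPS[target_node]
    return False


def search(query_ring, patents, allow_generic_targets):
    """Return the patents with at least one node that matches the query ring."""
    return [
        patent_id
        for patent_id, nodes in patents.items()
        if any(node_matches(query_ring, n, allow_generic_targets) for n in nodes)
    ]


# Hypothetical Markush records, reduced to a list of ring nodes each.
patents = {
    "patent_A": ["pyridine"],  # explicit (real) pyridine
    "patent_B": ["Hy"],        # generic heterocycle
    "patent_C": ["Cb"],        # generic carbocycle
}

strict_hits = search("pyridine", patents, allow_generic_targets=False)
broad_hits = search("pyridine", patents, allow_generic_targets=True)
print(strict_hits)  # ['patent_A']
print(broad_hits)   # ['patent_A', 'patent_B']
```

In this toy model the strict search misses the patent that claims only a generic heterocycle, which is the kind of gap the IMPACT note warns about. Real Markush matching involves far more than this single rule.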

Categories: Database News