Enterprise search users always want more, and dtSearch, a supplier of enterprise and developer text retrieval software, is offering it with the recently announced version 7.70 of its product line.
The new release features enhanced document filters, along with APIs for OEMs that provide data parsing, conversion and extraction.
Indexed Search In Under a Second
dtSearch, which began providing text retrieval in 1991, offers a line of enterprise search and developer text search products.
The search products’ spider can search local and remote content as well as static and dynamic web content, and it can reach across public and private sites, including support for log-ins and forms-based authentication. A single index can cover more than a terabyte of text, including directories, databases, online data and emails, and an unlimited number of indexes can be created and searched. According to dtSearch, indexed search time is under a second, even across terabytes.
The document filters already support a wide range of file formats and data types. In addition to all office productivity documents (Microsoft Office, OpenOffice, RTF, PDF and others), major email formats, compression formats such as ZIP and RAR, and Web-ready data such as HTML and XML/XSL, the filters are also built for dynamic data, including PHP, ASP.NET and all major databases.
Increased Search Support for Multi-Level Documents
Version 7.70 extends that support to images in Word (.doc/.docx), PowerPoint (.ppt/.pptx), Excel (.xls/.xlsx), Access (.mdb/.accdb), RTF, and email files such as Thunderbird (mbox/.eml) and Outlook (.pst/.msg). These formats are shown as highlighted hits in context, and there’s also support for documents created by the Japanese word processor, Ichitaro.
The new release also increases the product’s support for documents and images that reside in multi-level nested configurations. This means it can find and display images in an email file, for instance. The company said that it can also find and display images in a PowerPoint file that has been embedded in a Word document attached as a zipped file to an email.
For developers, a new “object extraction” API allows for navigation through an embedded object’s structure, as if it were a hierarchy, and for extraction of any object.
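To make the idea of navigating nested content as a hierarchy more concrete, here is a minimal sketch in Python. It is not the dtSearch API, whose calls are not described here. It uses only the standard zipfile module, handles only ZIP containers, and the names walk_nested and make_zip, along with the "!/" path separator, are made up for this illustration.

```python
import io
import zipfile
from typing import Dict, Iterator, Tuple


def walk_nested(data: bytes, prefix: str = "") -> Iterator[Tuple[str, bytes]]:
    """Yield (hierarchical path, raw bytes) for each leaf object in a ZIP.

    Hypothetical helper: any entry that is itself a ZIP is treated as a
    container and descended into, so leaves get paths like
    "outer.zip!/inner/file".
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            payload = archive.read(info)
            path = f"{prefix}{info.filename}"
            if zipfile.is_zipfile(io.BytesIO(payload)):
                # Nested container: recurse, extending the path prefix.
                yield from walk_nested(payload, prefix=path + "!/")
            else:
                yield path, payload


def make_zip(files: Dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP from a name-to-bytes mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Demo data: a placeholder "image" inside a ZIP inside another ZIP.
inner = make_zip({"slides/image1.png": b"fake-png-bytes"})
outer = make_zip({"report.txt": b"quarterly report", "deck.zip": inner})

found = dict(walk_nested(outer))
for object_path in found:
    print(object_path)
```

Because .docx and .pptx files are themselves ZIP packages, this sketch would descend into them as well. Older binary formats such as .doc or .pst would need format-specific parsers, which is the kind of work a commercial document filter takes on.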