
Cognitive Toolkit March 2022 Release Notes


Product: Cognitive Toolkit


Date: March 23, 2022

Package: (available from your customer Collab Space)


  • SharePoint On-Premise Support: The Cognitive Toolkit supports SharePoint On-Premise again, as a number of tools have been updated. For each tool listed here, see that tool's help for the available parameters and options:

    • CrawlSharePointOnPrem: Use this tool to crawl your SharePoint On-Premise instance. Note that the unique identifier for items in the Analytics Engine is now based on the SharePoint On-Premise Item ID (in the older design it was based on the File Ref). This change lets downstream tools such as AddPathValidation handle moved files better: a moved file keeps the same identifier, so it does not have to be re-attributed. Also note that full site crawling is not yet available; a source setting file needs to be defined for each site that is crawled. (Ref: CT-2356, CT-2428)

    • AddHashAndExtractedText: This tool has been updated to support SharePoint On-Premise. Use it to add hashes, extract text, or do both at the same time for your crawled SharePoint On-Premise instance. For performance reasons, we recommend the combined option for SharePoint On-Premise where possible. (Ref: CT-2429)

    • AddPathValidation: This tool has been updated to support SharePoint On-Premise. (Ref: CT-2460)

    • Dispose: This tool has been updated to support disposing of documents in SharePoint On-Premise. (Ref: CT-2430)

  • RunScript Coloured Error Messages: Our RunScript tool has been improved so that error messages are displayed in red, making them easier to spot when reviewing console output. (Ref: CT-2466)

  • Crawl Content Server Crumbs Improvement: When crawling Content Server via the CrawlContentServer tool, the crumbs field no longer includes the file name. The value is now truly the path in Content Server to the document, i.e. equivalent to the Content Server Hyperlinked Trail. (Ref: CT-2522)
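
To illustrate the CrawlSharePointOnPrem identifier change noted above, here is a minimal sketch of why an identifier based on the Item ID survives a move while one based on the File Ref does not. It is not the Toolkit's implementation: the hashing scheme, the function names, the `list_key` value, and the example paths are all assumptions made for illustration, and it assumes the item keeps its Item ID when it is moved.

```python
import hashlib


def id_from_file_ref(file_ref: str) -> str:
    """Hypothetical older-style identifier: derived from the item's path."""
    return hashlib.sha256(file_ref.encode("utf-8")).hexdigest()


def id_from_item_id(list_key: str, item_id: int) -> str:
    """Hypothetical newer-style identifier: derived from the Item ID."""
    return hashlib.sha256(f"{list_key}|{item_id}".encode("utf-8")).hexdigest()


# The same document before and after a move (made-up example values).
before = {"list_key": "hr-documents", "item_id": 42,
          "file_ref": "/sites/hr/Docs/Drafts/policy.docx"}
after = {"list_key": "hr-documents", "item_id": 42,
         "file_ref": "/sites/hr/Docs/Final/policy.docx"}

# A path-based identifier changes as soon as the file moves.
path_id_changed = (
    id_from_file_ref(before["file_ref"]) != id_from_file_ref(after["file_ref"])
)

# An Item-ID-based identifier is unaffected by the new path.
item_id_stable = (
    id_from_item_id(before["list_key"], before["item_id"])
    == id_from_item_id(after["list_key"], after["item_id"])
)

print("path-based id changed:", path_id_changed)
print("item-id-based id stable:", item_id_stable)
```

With a stable identifier, attributes already attached to the item can still be matched after the move, which is the re-attribution saving described in the note.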

Bug Fixes

  • Bug Fix: Fixed an issue with AddHashAndExtractedText where no items were processed if --text-timeout was set to 0. Items are now processed with this setting, with an unlimited timeout, as originally intended. (Ref: CT-2498)

  • Bug Fix: Fixed an issue with the ID field, which was being generated inconsistently by any of our Crawl tools (CrawlBox, CrawlContentServer, CrawlDocumentum, CrawlExchange, CrawlFileNet, CrawlFileSystem, CrawlMaximo, CrawlOneDrive, CrawlSharePointOnline, CrawlSharePointOnlineSites), but this issue was only in Cognitive Toolkit (Ref: CT-2857)

  • Bug Fix: Fixed an issue with the ID field being generated inconsistently for CrawlBox when using a single index. For the moment, the --use-single-index option has been removed from the CrawlBox tool to address this issue. (Ref: CT-2888)
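
The --text-timeout fix above reflects a common convention in which 0 means "no time limit" rather than "time out immediately". The following is a rough sketch of that convention only, not the Toolkit's code: the function names, the use of seconds as the unit, and the thread-based wait are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def effective_timeout(text_timeout: float) -> Optional[float]:
    """Map a configured timeout to a wait value; 0 becomes None (no limit)."""
    if text_timeout < 0:
        raise ValueError("timeout must be zero or positive")
    return None if text_timeout == 0 else text_timeout


def extract_with_timeout(extract: Callable[[], str], text_timeout: float) -> str:
    """Run a (hypothetical) extraction callable, waiting at most text_timeout.

    Passing 0 straight through to Future.result would raise a timeout error
    for any extraction that has not already finished, so nothing slow would
    ever be processed. Mapping 0 to None waits for as long as it takes.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(extract)
        return future.result(timeout=effective_timeout(text_timeout))


# With a setting of 0, the extraction is allowed to run to completion.
text = extract_with_timeout(lambda: "extracted text", 0)
print(text)
```

Keeping the mapping in one small function means the "0 is unlimited" rule is applied in a single place instead of being re-checked wherever a wait happens.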

Known Issues

The following items are known issues and are flagged for resolution in a later release:

  • There is currently no support for migration for SharePoint On-Premise.
