Cognitive Toolkit February 2021/2.4.0 Release Notes
Product: Cognitive Toolkit
Version: 2.4.0
Date: February 22, 2021
Ability to Crawl SharePoint Online Sites: The Cognitive Toolkit now ships with a tool, CrawlSharePointSites, that explicitly crawls SharePoint Online sites. Once gathered, this information can be used in subsequent steps, such as running CrawlSharePoint for a given site. See –help for CrawlSharePointSites for available parameters and options. (Ref: CT-1490)
CrawlSharePoint – Ability to Create one Index per Site: The CrawlSharePoint tool can now create one Index per Site, and this is now the default behavior. This is especially useful when the custom columns differ significantly from one site to another. To create a single Index instead, use the –use-single-index option. See –help for CrawlSharePoint for available parameters and options. (Ref: CT-1697)
CrawlContentServer – Ability to Crawl Description: The CrawlContentServer tool now also crawls the “Description” field from Content Server. In the Index, this field is generalDescription. (Ref: CT-1740)
CrawlFileNet – Exclude Hidden Class Definitions: The CrawlFileNet tool now ignores hidden document classes by default (these classes are missing information required for them to be crawled). (Ref: CT-1709)
Performance Improvements – Scroll Framework: The way several tools keep track of their progress (AddBreadCrumbs, AddCategoryData, AddHashAndExtractedText, CopyItems, RemoveField, RemoveItems, SetFileSystemPermissions) has been improved. These tools no longer constantly write back to the Index as they run (previously this was typically done via a JOB-ID field). As a result, they now run more smoothly and faster. (Ref: CT-1638, CT-1663, CT-1673, CT-1692, CT-1698)
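To illustrate the general idea behind the progress-tracking change described above, here is a minimal Python sketch. It is not the toolkit's implementation: the in-memory index, the field names (job_id, processed), and the function names are all hypothetical, chosen only to contrast per-item bookkeeping writes with a cursor kept outside the index.

```python
from bisect import bisect_right
from typing import Dict, Optional, Tuple

# Hypothetical in-memory stand-in for a search index: id -> document fields.
Index = Dict[str, dict]


def make_index(count: int) -> Index:
    """Build a small fake index with ids doc-0 .. doc-(count-1)."""
    return {f"doc-{i}": {} for i in range(count)}


def process_with_writeback(index: Index, job_id: str) -> int:
    """Write-back style: stamp each item with a job id to record progress.

    Returns the number of bookkeeping writes made to the index.
    """
    bookkeeping_writes = 0
    for doc_id in sorted(index):
        doc = index[doc_id]
        if doc.get("job_id") == job_id:
            continue  # already handled by this job
        doc["processed"] = True  # the actual work
        doc["job_id"] = job_id   # extra write purely to track progress
        bookkeeping_writes += 1
    return bookkeeping_writes


def process_with_scroll(
    index: Index, page_size: int, cursor: Optional[str] = None
) -> Tuple[int, Optional[str]]:
    """Scroll style: page through sorted ids and keep the position
    in a cursor held outside the index.

    Returns (bookkeeping writes to the index, last id processed).
    """
    bookkeeping_writes = 0  # never incremented: progress lives in `cursor`
    ids = sorted(index)
    # Resume just after the last id handled, if a cursor was supplied.
    start = 0 if cursor is None else bisect_right(ids, cursor)
    for page_start in range(start, len(ids), page_size):
        for doc_id in ids[page_start:page_start + page_size]:
            index[doc_id]["processed"] = True  # the actual work
            cursor = doc_id
    return bookkeeping_writes, cursor
```

In this sketch, the write-back version performs one extra index write per item, while the scroll version performs none and can resume from the returned cursor. Whether the toolkit persists its position in a comparable way is an assumption here.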
Removed Tools: As of this release, the following tools are no longer included in cognitive-toolkit.exe. Each of these tools was deprecated in a previous release but remained available until now:
AddHash
AddExtractedText
Bug Fix: Fixed an issue where AddPathValidation was not working properly with the –use-scroll option. (Ref: CT-1728)
Bug Fix: Fixed an issue where MigrateToContentServer was not working properly for Content Server Categories and RM Classifications. (Ref: CT-1724)