This guide explains the operations available within Cognitive Toolkit and how to use them.
Basics
Cognitive Toolkit contains 35 operations to help you understand and manage your data. These operations are runscript-based commands that can be run with a minimum set of required parameters or configured for more complex applications.
Source Setting Files
Source setting files provide the login credentials required to access a content source.
The following source setting files can be edited with your organization’s administrative login credential information:
Encyclopedia
🇦 A-B
Activate
| | Example |
|---|---|
| Activates the license. This operation must be run before you can use the Cognitive Toolkit. | Command using minimum required inputs: |
AddClassifications
| | Example |
|---|---|
| Records Content Server classification data in an index within the Analytics Engine. The fields created depend on the fields defined in Content Server. The CrawlContentServer operation must be performed before running AddClassifications. | Command using minimum required inputs: |
AddExtractedTextFromEngineeringDrawings
| | Example |
|---|---|
| Extracts full text from engineering drawings. | Command using minimum required inputs: |
AddFromSqlDatabase
| | Example |
|---|---|
| Uses a SQL query to migrate data to an index within the Analytics Engine. | Command using minimum required inputs: |
AddHashAndExtractedText
| | Example |
|---|---|
| Generates the hash value for each specified file and extracts its full text. Note: When a folder is renamed, which changes the file path, AddHashAndExtractedText checks whether the file itself has changed. If the file is unchanged, text extraction is not performed on it again. | Command using minimum required inputs: |
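The hash-then-skip behaviour described in the note can be pictured with a minimal Python sketch. It is an illustration only, not the Toolkit's implementation: the SHA-256 algorithm, the function names `file_hash` and `needs_extraction`, and the in-memory set of known hashes are all assumptions.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks (algorithm is an assumption)."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_extraction(path: Path, known_hashes: set[str]) -> bool:
    """Return True only when the file's content hash has not been seen before.

    A renamed folder changes the path but not the content hash, so the file
    is recognised as unchanged and text extraction can be skipped.
    """
    return file_hash(path) not in known_hashes
```

Because the check is keyed on content rather than on the path, a rename alone never triggers a second extraction in this sketch.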
AddPathValidation
| | Example |
|---|---|
| Checks an index within the Analytics Engine for changes by validating it against the data source itself. Changes that are detected result in a false value. | Command using minimum required inputs: |
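As a rough illustration of validating indexed items against the data source, the sketch below checks whether each item's path still exists and records the result. The `path` field name and the list-of-dictionaries index model are hypothetical; `path-valid` is the field name mentioned under RemoveItems in this guide. This is not the Toolkit's implementation.

```python
from pathlib import Path


def add_path_validation(items: list[dict]) -> list[dict]:
    """Return copies of the items with a boolean 'path-valid' field.

    Each item is assumed (hypothetically) to store its location under 'path'.
    An item whose file no longer exists at the source gets path-valid = False.
    """
    return [
        {**item, "path-valid": Path(item["path"]).exists()}
        for item in items
    ]
```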
AddPropertyData
| | Example |
|---|---|
| Pulls property data from an ECM and adds it to the index in the Analytics Engine. Content Server categories and attributes are currently supported. | Command using minimum required inputs: |
🇨 C-D
CacheFileSystemPermissions
| | Example |
|---|---|
| Caches the permissions on a file system item and creates the following fields within an index in the Analytics Engine: | Command using minimum required inputs: |
CopyItems
| | Example |
|---|---|
| Copies an object from one index to another within the Analytics Engine. | Command using minimum required inputs: |
CrawlBox
| | Example |
|---|---|
| Crawls for the metadata within Box and adds it to an index in the Analytics Engine. | Command using minimum required inputs: |
CrawlContentServer
| | Example |
|---|---|
| Crawls for the metadata within Content Server and adds it to an index in the Analytics Engine. Use this tool to crawl the Content Server database directly or via the REST API. Crawling the database directly leaves the REST API available for other applications. | Command using minimum required inputs: |
CrawlContentServerWorkflows
| | Example |
|---|---|
| Crawls for the metadata within Content Server workflows and adds it to an index in the Analytics Engine. Use this tool to crawl the Content Server database directly or via the REST API. Crawling the database directly leaves the REST API available for other applications. | Command using minimum required inputs: |
CrawlDocumentum
| | Example |
|---|---|
| Crawls for the metadata within Documentum and adds it to an index in the Analytics Engine. | Command using minimum required inputs: |
CrawlExchange
| | Example |
|---|---|
| Crawls for the metadata within Microsoft Exchange and adds it to an index in the Analytics Engine. | Command using minimum required inputs: |
CrawlFileNet
| | Example |
|---|---|
| Crawls for the metadata within FileNet and adds it to an index in the Analytics Engine. | Command using minimum required inputs: |
CrawlFileSystem
| | Example |
|---|---|
| Base operation for data discovery, generally performed before any other Cognitive Toolkit operation. It crawls the specified path (or multiple paths) for metadata. The metadata is then stored in an index within the Analytics Engine, where it can be further mined for insights. | Command using minimum required inputs: |
CrawlMaximo
| | Example |
|---|---|
| Crawls for the metadata within Maximo and adds it to an index in the Analytics Engine. | Command using minimum required inputs: |
CrawlOneDrive
| | Example |
|---|---|
| Crawls for the metadata within OneDrive and adds it to an index in the Analytics Engine. | Command using minimum required inputs: |
CrawlSharePointOnline
| | Example |
|---|---|
| Crawls for the metadata within SharePoint Online and adds it to an index in the Analytics Engine. | Command using minimum required inputs: |
CrawlSharePointOnPrem
| | Example |
|---|---|
| Crawls for the metadata within SharePoint On-Premise and adds it to an index in the Analytics Engine. | Command using minimum required inputs: |
CrawlSharePointOnlineSites
| | Example |
|---|---|
| Crawls SharePoint Online sites to create a list of all site names and adds them to an index in the Analytics Engine. This information can then be used to crawl specific subsites using the CrawlSharePointOnline or CrawlSharePointOnPrem operations. | Command using minimum required inputs: |
Dispose
| | Example |
|---|---|
| Deletes the data/files matched by the query. If successful, a field called [dispose] with a value of true is added to an index in the Analytics Engine. For confirmation, Dispose reports the number of files that will be deleted before the deletion runs. | Command using minimum required inputs: |
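The confirm-then-delete flow described for Dispose can be sketched in Python as follows. This is a hedged illustration, not the Toolkit's code: the `dispose` function, the `matches` and `confirm` callbacks, and the `path` field are hypothetical; only the `dispose` field with a value of true comes from the description above.

```python
from pathlib import Path
from typing import Callable


def dispose(
    items: list[dict],
    matches: Callable[[dict], bool],
    confirm: Callable[[int], bool],
) -> int:
    """Delete the files behind matching items after the count is confirmed.

    'matches' stands in for the query, and 'confirm' receives the number of
    files that would be deleted and returns True to proceed. Both are
    hypothetical stand-ins. Returns the number of files actually deleted.
    """
    targets = [item for item in items if matches(item)]
    if not confirm(len(targets)):
        return 0  # nothing is deleted when the count is not confirmed
    deleted = 0
    for item in targets:
        try:
            Path(item["path"]).unlink()
        except OSError:
            continue  # leave the item unmarked when deletion fails
        item["dispose"] = True  # mark only successful deletions
        deleted += 1
    return deleted
```

Reporting the count before anything is removed gives the operator a last chance to notice a query that matches far more files than intended.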
🇪 E-K
ExportFromIndex
| | Example |
|---|---|
| Exports the fields/values you specify from an index in the Analytics Engine into a comma-separated values (CSV) file. | Command using minimum required inputs: |
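To make the idea of exporting selected fields concrete, here is a minimal Python sketch that writes only the requested fields of each item as CSV. It models the index as a list of dictionaries, which is an assumption, and the function name `export_fields` is hypothetical; it does not reflect how the Toolkit reads from the Analytics Engine.

```python
import csv
import io


def export_fields(items: list[dict], fields: list[str]) -> str:
    """Return CSV text containing only the requested fields of each item.

    Fields missing from an item are written as empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow({name: item.get(name, "") for name in fields})
    return buffer.getvalue()
```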
ExtractAndCrawlPst
| | Example |
|---|---|
| Extracts text and performs a crawl of PST (email) files. | Command using minimum required inputs: |
ExtractEntities
| | Example |
|---|---|
| An information extraction technique whereby key elements from text are identified and classified into predefined categories. This operation transforms unstructured data into structured data that is machine readable and available for standard processing. Categories include: | Command using minimum required inputs: |
FindSimilarClassification
| | Example |
|---|---|
| Adds classifications to documents based on their similarity to other, already-classified documents in the Analytics Engine. For example, choose 5-10 documents of a similar kind and classify them by their document type, such as offer letters or purchase orders. The Shinydocs Cognitive Suite will “learn” from those examples and will be able to find other similar documents for classification. | Command using minimum required inputs: |
🇱 L-O
Migrate
| | Example |
|---|---|
| Migrates data/files from one source to another. Notes: | Command using minimum required inputs: |
🇵 P-S
RemoveField
| | Example |
|---|---|
| Removes the specified field from the specified index. | Command using minimum required inputs: |
RemoveIndex
| | Example |
|---|---|
| Removes the index from your Analytics Engine, but does not remove the index pattern. | Command using minimum required inputs: |
RemoveItems
| | Example |
|---|---|
| Removes items from the index. Use case: After a dataset has been crawled, files are sometimes deleted from it. Those files then have an invalid path in the index (path-valid = false). Apply the RemoveItems operation to remove index items that still display data for files users have deleted. | Command using minimum required inputs: |
RestoreCachedFileSystemPermissions
| | Example |
|---|---|
| Restores the file system permissions. Can be used in conjunction with the CacheFileSystemPermissions and SetFileSystemPermissions tools. If you run into problems running SetFileSystemPermissions, you can perform a restore, which uses the cached permission fields in the index to restore the original permissions on the actual file in the file system. | Command using minimum required inputs: |
RunWithCredentials
| | Example |
|---|---|
| Allows you to run the Cognitive Toolkit as a different user. | Command using minimum required inputs: |
SaveValue
| | Example |
|---|---|
| Saves values so that they can be used via substitution in tools. This tool also encrypts the passwords and user names required by options within --source-settings and/or given directly on the command line. | Command using minimum required inputs: |
SetFileSystemPermissions
| | Example |
|---|---|
| Resets permissions on the file system. Make sure you retain Administrator rights on the file system. CacheFileSystemPermissions must be performed before running this operation. | Command using minimum required inputs: |
🇹 T-Z
TagDuplicate
| | Example |
|---|---|
| Tags files that are considered duplicates, with the option to identify the primary duplicate. Run this operation after the AddHashAndExtractedText tool has added hash values. | Command using minimum required inputs: |
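Duplicate tagging by hash can be illustrated with a short Python sketch that groups items by hash value and flags every group with more than one member. The field names `hash`, `path`, `duplicate`, and `primary-duplicate` are hypothetical, and choosing the lexicographically first path as the primary is an assumption made for the example, not the Toolkit's rule.

```python
from collections import defaultdict


def tag_duplicates(items: list[dict]) -> list[dict]:
    """Tag items that share a hash and mark one item per group as primary.

    Items are modified in place and also returned. Items with a unique hash
    are left untouched.
    """
    groups: dict[str, list[dict]] = defaultdict(list)
    for item in items:
        groups[item["hash"]].append(item)

    for group in groups.values():
        if len(group) < 2:
            continue  # a unique hash means no duplicates
        # Assumed rule: the item with the smallest path is the primary.
        primary = min(group, key=lambda item: item["path"])
        for item in group:
            item["duplicate"] = True
            item["primary-duplicate"] = item is primary
    return items
```

The sketch depends on hash values already being present on every item, which mirrors the requirement to run AddHashAndExtractedText first.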
UpdateProperties
| | Example |
|---|---|
| Once you have migrated items into Content Server, this tool helps you update the items with categories and attributes (if this was not completed during migration). It also allows you to move and/or rename documents within Content Server. | Command using minimum required inputs: |
To access the complete list of available operations from within the Cognitive Toolkit, type the following at the root folder of the Cognitive Toolkit: CognitiveToolkit.exe -h!|--help!