(Archived) Bulk Document Enricher

README

Description

The BulkDocumentEnricher enriches documents in the index using a comma-separated value (CSV) file that maps each search term to the data to write into the index.

Prerequisites

The BulkDocumentEnricher is a Runscript-based script, so Cognitive-toolkit must be installed and available before you run it.

How to run

To run the BulkDocumentEnricher.cs script, you must provide the following parameters to the Runscript tool:

Option	Details	Required
`-c BulkDocumentEnricher`	This is always “BulkDocumentEnricher”	Yes
`-p <Path>`	The path to the script file	Yes
`-i <IndexName>`	Name of the index	Yes
`-u <URL>`	Server URL of the index	Yes
`-q <path to query file>`	The path to the JSON query file	Yes
`-csv <path to csv file>`	Path to the comma-separated value (csv) file	Yes
`-column-names`	A comma-separated list of the column names specified in the csv file	Yes
`-threads`	The number of threads. If not specified, defaults to 1	No
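
As a rough illustration, the options in the table can be assembled into an argument list like this. It is a sketch only: the executable name `runscript`, the paths, the index name, and the URL are placeholders that do not come from this README, and passing the column list as a value directly after `-column-names` is an assumption.

```python
def build_runscript_args(script_path, index_name, url, query_path, csv_path,
                         column_names, threads=None):
    """Assemble the documented options into an argument list (sketch)."""
    args = [
        "-c", "BulkDocumentEnricher",  # always this value
        "-p", script_path,
        "-i", index_name,
        "-u", url,
        "-q", query_path,
        "-csv", csv_path,
        # Assumption: the comma-separated column list follows the flag.
        "-column-names", ",".join(column_names),
    ]
    if threads is not None:  # optional; the README says it defaults to 1
        args += ["-threads", str(threads)]
    return args


# Hypothetical values, for illustration only.
args = build_runscript_args(
    "BulkDocumentEnricher.cs", "my-index", "http://localhost:9200",
    "query.json", "terms.csv", ["fullText", "category"], threads=4)

# "runscript" is a placeholder for however the Runscript tool is invoked.
print(" ".join(["runscript"] + args))
```

Building the arguments as a list rather than a single string avoids quoting problems when a path contains spaces.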

Format of the query file

The query file is a JSON file that uses the standard ElasticSearch/OpenSearch query language. When a field name in the query is surrounded by curly braces, the BulkDocumentEnricher replaces that placeholder with the value of that field from the CSV file.
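
The substitution can be pictured with the following minimal Python sketch. It is not the script's actual implementation: the helper name `fill_placeholders` is hypothetical, and it assumes that a placeholder occupies a whole string value and that its replacement comes from one row of the CSV file.

```python
import json


def fill_placeholders(node, values):
    """Return a copy of a parsed JSON query in which every "{field}"
    string is replaced by values[field]. Hypothetical helper."""
    if isinstance(node, dict):
        return {key: fill_placeholders(child, values) for key, child in node.items()}
    if isinstance(node, list):
        return [fill_placeholders(child, values) for child in node]
    if isinstance(node, str) and node.startswith("{") and node.endswith("}"):
        field = node[1:-1]
        # Leave unknown placeholders untouched rather than guessing.
        return values.get(field, node)
    return node


template = json.loads('{"match": {"fullText": "{fullText}"}}')
query = fill_placeholders(template, {"fullText": "Mickey"})
print(json.dumps(query))  # {"match": {"fullText": "Mickey"}}
```

Working on the parsed structure rather than on the raw text keeps the result valid JSON even when a value contains quotes or braces.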

Format of the comma-separated file

The first line lists the field name you would like to search against, followed by the field names you would like to decorate the documents with.

Subsequent lines list the term to search for, and if found, the value to set the decorator field to.

For example, given the following query:

CODE

{
    "bool": {
        "must": [
            {
                "exists": {
                    "field": "fullText"
                }
            },
            {
                "match": {
                    "fullText": "{fullText}"
                }
            }
        ]
    }
}

And the following CSV file:

CODE

fullText, category
Mickey, mouse
Donald, duck
Pluto, dwarf planet

Running the BulkDocumentEnricher will search the fullText of each document. If it finds the term “Mickey”, it will add a category field to the document and set it to “mouse”. If it finds the term “Donald”, it will add a category field set to “duck”, and if it finds the term “Pluto”, it will add a category field set to “dwarf planet”.
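
The CSV layout described above can be read as in the following minimal Python sketch. It is not the script's own parser; the function name `read_enrichment_rows` is hypothetical, and trimming the space after each comma is an assumption made to match the sample file.

```python
import csv
import io

SAMPLE = """fullText, category
Mickey, mouse
Donald, duck
Pluto, dwarf planet
"""


def read_enrichment_rows(text):
    """Yield (search_field, term, fields_to_set) for each data row."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    # First column is the field to search; the rest are fields to set.
    search_field, decorator_fields = header[0], header[1:]
    for row in reader:
        if not row:  # skip blank lines
            continue
        values = [value.strip() for value in row]
        yield search_field, values[0], dict(zip(decorator_fields, values[1:]))


for field, term, updates in read_enrichment_rows(SAMPLE):
    print(f"{field} matches {term!r} -> set {updates}")
```

Each yielded tuple pairs one search term with the fields that would be set on matching documents.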

You can tag documents with more than one field by adding additional columns to the CSV file. For example:

CODE

fullText, category, video, serial-number
Mickey, mouse, steamboat, 0001

This would apply three tags if the term “Mickey” is found in the fullText.

Adding an asterisk (*) to a column name indicates that the corresponding field should be treated as a single value (a string) rather than a list of values (an array). In that case, the new value overwrites the previous value instead of being appended to the list.
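
The difference between the two modes can be sketched as follows. This is an illustrative model only, not the script's code: the helper `apply_field` is hypothetical, documents are modelled as plain dictionaries, and the values “cartoon” and “fantasia” are made up for the example. The sketch takes a flag instead of parsing the asterisk, because this README does not say where in the column name the asterisk goes.

```python
def apply_field(document, name, value, single_value=False):
    """Update one field on a document held as a dict (hypothetical model).

    single_value=True models a column marked with an asterisk: the field
    holds a string and the new value overwrites the old one. Otherwise the
    field holds a list and the new value is appended to it.
    """
    if single_value:
        document[name] = value
    else:
        document.setdefault(name, []).append(value)
    return document


doc = {}
apply_field(doc, "category", "mouse")
apply_field(doc, "category", "cartoon")                     # appended
apply_field(doc, "video", "steamboat", single_value=True)
apply_field(doc, "video", "fantasia", single_value=True)    # overwrites
print(doc)  # {'category': ['mouse', 'cartoon'], 'video': 'fantasia'}
```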

Downloads

File	Modified
BulkDocumentEnricher-2.6.0.cs	Dec 16, 2025
BulkDocumentEnricher-2.6.1.0.2.cs	Dec 16, 2025
BulkDocumentEnricher.cs	Dec 16, 2025
BulkDocumentEnricher-2.6.1.0.4.cs	Dec 16, 2025