MicroStation Add-in for Shinydocs Cognitive Suite Administrator Manual and User Guide
Description
Use case: You have engineering drawings (.dwg and/or .dgn file extension) from which you want to extract text.
The Bentley MicroStation (64-bit) add-in for Shinydocs Cognitive Suite extracts text and title block information from native (digital) engineering drawings (.dwg and/or .dgn file extension).
The tool is query based, so it is run after a metadata crawl of the engineering drawings. Extracted fields are added to the index.
See --help for AddExtractedTextFromEngineeringDrawings for available parameters and options. (Ref: CT-1925)
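Because the tool is query based, it helps to picture what a query selecting the crawled drawings might look like. The following is a minimal sketch only: it assumes the index accepts an Elasticsearch-style JSON query (the text above does not state this), and the field name `extension` and the function `build_drawing_query` are hypothetical names chosen for illustration.

```python
import json


def build_drawing_query(extensions=("dwg", "dgn"), extension_field="extension"):
    """Build a query body matching documents by file extension.

    Assumption: the index stores the file extension in a field named
    `extension_field`. That field name is hypothetical.
    """
    # Normalize ".DWG" / "dwg" style inputs to lowercase without a leading dot.
    normalized = [ext.lower().lstrip(".") for ext in extensions]
    return {"query": {"terms": {extension_field: normalized}}}


if __name__ == "__main__":
    # Print the query so it can be saved to a file or inspected.
    print(json.dumps(build_drawing_query(), indent=2))
```

The query only selects documents that the metadata crawl has already indexed, which is why the crawl has to run first.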
Installation
Prerequisites
Bentley MicroStation* must be installed locally.
Shinydocs Cognitive Suite
Software Requirements
shinydocs-microstation-addin download
Step-by-Step
Extract the contents of the shinydocs-microstation-addin zip archive
The archive contains ShinydocsMicroStationAddin.dll and ShinydocsMicroStationAddin.dll.config
Configuration
Open ShinydocsMicroStationAddin.dll.config with Notepad++ or another source code editor
Proposed Solution:
We create a Cognitive Toolkit tool for this extraction. The tool works in concert with the Bentley MicroStation code base, which it requires. Like all of our other Cognitive Toolkit tools, it is query based, so it is run after a metadata crawl of the engineering drawings. Extracted fields are added to the index.
Acceptance Criteria:
Standard fields [text and title block metadata (name value pairs)] can be extracted from a .dwg (AutoCAD) engineering drawing (assuming they exist in the drawing of course) via this tool (leveraging Bentley MicroStation).
Standard fields [text and title block metadata (name value pairs)] can be extracted from a .dgn (MicroStation) engineering drawing.
Original documents (engineering drawings) may reside on a file system.
Original documents (engineering drawings) may reside in Content Server.
Ability to allow a user mapping for document-specific attribute collection tag names.
The same attribute may have different tag names when accessed through code. For example, in the dataset that we have from Synergy, the title attribute (title_att) had several different tag names, set by the user(s) who created the drawing, which together made up the full title of the drawing. Example:
DrawingA.dwg attribute name-value pairs for the title:
Title block attribute tag name | Attribute value (the lines together form the title) |
---|---|
TITLELINE1 | First line of the title |
TITLELINE2 | This is the second line of the title |
TITLELINE3 | This is the third line of the title |
TITLELINE4 | This is the fourth line of the title |
TITLELINE5 | This is the fifth line of the title |
DrawingB.dwg attribute name-value pairs for the title:
Title block attribute tag name | Attribute value (the lines together form the title) |
---|---|
DRAWINGTITLE-LINE1 | First line of the title |
DRAWINGTITLE-LINE2 | This is the second line of the title |
DRAWINGTITLE-LINE3 | This is the third line of the title |
DRAWINGTITLE-LINE4 | This is the fourth line of the title |
The mapping could possibly be defined in XML (Content Server apparently uses, or used, an XML file for this purpose).
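To make the tag-name mapping idea concrete, here is a minimal sketch of how a user-supplied XML mapping could fold the differently named title tags from DrawingA.dwg and DrawingB.dwg into a single field. The XML layout (`tagMappings`, `field`, `tag`) and the function names are hypothetical, invented for illustration. They do not describe the tool's actual configuration format or Content Server's.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping file: element and attribute names are illustrative only.
MAPPING_XML = """
<tagMappings>
  <field name="title">
    <tag>TITLELINE1</tag>
    <tag>TITLELINE2</tag>
    <tag>TITLELINE3</tag>
    <tag>TITLELINE4</tag>
    <tag>TITLELINE5</tag>
    <tag>DRAWINGTITLE-LINE1</tag>
    <tag>DRAWINGTITLE-LINE2</tag>
    <tag>DRAWINGTITLE-LINE3</tag>
    <tag>DRAWINGTITLE-LINE4</tag>
  </field>
</tagMappings>
"""


def load_mapping(xml_text):
    """Parse the mapping XML into a {tag_name: field_name} dictionary."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for field in root.findall("field"):
        for tag in field.findall("tag"):
            mapping[tag.text.strip()] = field.get("name")
    return mapping


def collect_fields(attributes, mapping):
    """Group attribute values by mapped field, keeping drawing order.

    `attributes` is a list of (tag_name, value) pairs so that repeated
    tag names are preserved. Unmapped tags are ignored.
    """
    collected = {}
    for tag_name, value in attributes:
        field_name = mapping.get(tag_name)
        if field_name is None:
            continue
        collected.setdefault(field_name, []).append(value)
    # Join multi-line values (such as a title split over several tags).
    return {name: " ".join(lines) for name, lines in collected.items()}


if __name__ == "__main__":
    drawing_a = [
        ("TITLELINE1", "First line of the title"),
        ("TITLELINE2", "This is the second line of the title"),
    ]
    print(collect_fields(drawing_a, load_mapping(MAPPING_XML)))
```

Keeping the mapping outside the code means that a new drawing set with yet another tag naming scheme only needs a mapping change, not a code change.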
Technical Requirements:
TBD
Implementation Notes: [dev only]
Bentley Systems MicroStation CONNECT treats DWGs the same as DGNs when accessing objects within a drawing through code. More information can be found in "How does MicroStation handle opening DWG files".
The current prototype uses the same code to access text within both DWGs and DGNs, using the MicroStation CONNECT SDK.
Code repo here for prototype: https://bitbucket.org/shinydocs/project-clearwater/src/dev/
User Guide
Like all of our other Cognitive Toolkit tools, this tool is query based, so it is run after a metadata crawl of the engineering drawings. Extracted fields are added to the index.