Best Practices for Starting a Crawl in Shinydocs Pro
Welcome to Shinydocs Pro!
Congratulations on choosing Shinydocs Pro to take control of your data. You’re on the path to gaining incredible insights into your information, uncovering duplicates, identifying PII, managing ROT, and optimizing Enterprise Search. Let’s help you get started the smart way.
Start Small
Watch the Video
Below, you'll find a short video explaining why starting with a small crawl is the best way to get up and running quickly.
When starting your first crawl, it’s important to keep it simple:
Select one or two data sources to begin with.
Identify a manageable size - a terabyte or less is ideal.
Monitor progress during the initial crawl to understand timing.
Prioritize areas where you’ll gain the most value, like duplicates, PII, and ROT.
Leverage results to plan additional crawls efficiently.
Expand gradually, adding more paths, sites, or workspaces as needed.
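If you are unsure which file share folders fit the "terabyte or less" guideline, a quick size check before the crawl can help. The sketch below is not part of Shinydocs Pro; it is a standalone Python illustration that totals the size of each candidate folder and picks up to two of them, in your priority order, that together stay under a limit. The function names, the example paths, and the choice of 1024^4 bytes as the threshold are all assumptions made for illustration.

```python
import os

# Assumption: treat "a terabyte" as 1024**4 bytes for this illustration.
TERABYTE = 1024 ** 4


def directory_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            try:
                # Symbolic links are skipped so linked data is not counted twice.
                if not os.path.islink(full_path):
                    total += os.path.getsize(full_path)
            except OSError:
                # Skip files that vanish or cannot be read during the walk.
                continue
    return total


def pick_first_crawl(candidates, limit=TERABYTE, max_sources=2):
    """Pick up to max_sources paths, in the order given, whose combined size fits within limit."""
    chosen = []
    running_total = 0
    for path in candidates:
        size = directory_size(path)
        if len(chosen) < max_sources and running_total + size <= limit:
            chosen.append((path, size))
            running_total += size
    return chosen, running_total


if __name__ == "__main__":
    # Hypothetical paths, listed from highest to lowest priority.
    candidates = [r"\\fileserver\finance", r"\\fileserver\hr", r"\\fileserver\archive"]
    chosen, total = pick_first_crawl(candidates)
    for path, size in chosen:
        print(f"{path}: {size / TERABYTE:.3f} TB")
    print(f"Combined: {total / TERABYTE:.3f} TB")
```

List the folders in the order you value them most, since the sketch keeps the first ones that fit and skips any that would push the total over the limit. Walking a very large share can take a while, so this is best run against a short list of candidates.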
Why Starting Small Matters
Think of your data crawl like unpacking after a big move. Trying to tackle every box in your new home at once is overwhelming and inefficient. Instead, start with the essentials - the kitchen and bedroom - where you’ll get the most immediate value. Once those are set up, you can move on to the less critical spaces, like the garage or spare room. Similarly, starting small with Shinydocs Pro allows you to focus on high-priority data first, delivering faster and more impactful results.
Adding too many data sources upfront, especially if they total multiple terabytes, can extend the time it takes to realize value from your analysis. By starting with a focused set of data, you’ll quickly see actionable results, such as:
Identifying duplicates to reduce storage waste.
Detecting personally identifiable information (PII) to maintain compliance.
Pinpointing redundant, obsolete, and trivial (ROT) data to clean up your environment.
Enhancing Enterprise Search capabilities for faster, more relevant results.
Examples of Starting Small
Here are examples of selecting the most important paths or sites to begin with for various systems:
SharePoint: Focus on sites with the highest activity or those containing critical business records, such as legal or HR documentation.
iManage: Begin with workspaces tied to ongoing or high-value cases, where analysis of duplicates and ROT can deliver quick wins.
File Systems (File Shares): Target specific directories with known storage issues or critical business data, such as financial records or client information.
Once you’ve processed the initial crawl, adding more data sources is seamless and efficient, just click !
Shinydocs Pro ensures that your subsequent crawls focus only on changes, saving time while delivering ongoing value.