With entity extraction, you can create archiving rules based on the types of entities that are present in your data or on pattern matches. You can use these entity-based and pattern-based archiving rules to make your data management operations more efficient.
Before You Begin
- To use the archiving rules with entity extraction, you must first create and enable the following additional setting on the CommServe client:
For more information about creating additional settings, see Add or Modify an Additional Setting.
- You must have configured at least one subclient for entity extraction. You can only configure archiving rules with entities for subclients that have been configured for entity extraction. See Configuring Entity Extraction.
- You must configure Commvault OnePass on the client. For more information, see the OnePass archive documentation for your agent.
Procedure
- Log in to the CommCell Console.
- In the CommCell Browser, expand Client Computers > Client Name > File System, and then click the backup set that contains the subclient that you want to configure.
- On the backup set tab, right-click the subclient that you want to configure, and then click Properties.
- Click the Disk Cleanup tab.
- To archive files based on a string of characters in the file name or file contents, enter a wildcard pattern in the Pattern Match field.
For example, to archive files with the TXT file extension, enter *.txt. Files whose contents match the pattern are also archived.
- To archive files based on entities that appear in the file contents, enter one or more entity match patterns in the Entity Match field as follows:
| Entity Type | Description | Entity Match Pattern |
| --- | --- | --- |
| BankRouting | Bank routing numbers. | entity_rtn:* |
| CreditCard | Credit card numbers. | entity_ccn:* |
| Email | Email addresses. | entity_email:* |
| ipAddress | IP addresses. | entity_ip:* |
| ITIN | Tax identification numbers. | entity_itin:* |
| Phone | Telephone numbers. | entity_phone:* |
| ssn | Social security numbers. | entity_ssn:* |
| USDL | US Driver's License ID numbers. | entity_usdl:* |
| Hostname | Fully qualified domain name (FQDN). | entity_hostname:* |
| FinanceTags | Patterns of words and phrases that are significant to finance and other businesses. | |
To view the complete list of words and phrases that are identified as entities using the FinanceTags entity type, see finance_tags.txt.
- Click OK.
When an archiving operation occurs, data in the subclient that matches the entity filters on the Disk Cleanup tab is archived.
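
To help plan which values to type into the Pattern Match and Entity Match fields, the following Python sketch may be useful. It is only an illustration and is not part of the product: the helper names `entity_match_patterns` and `matches_wildcard` are hypothetical, the key `Email` is an assumed name for the email address entity type, and the wildcard check uses Python's `fnmatch` module with case-insensitive comparison as a rough approximation. The product's actual matching rules, including case sensitivity and content matching, may differ. FinanceTags is left out because its match pattern is not listed in the procedure.

```python
from fnmatch import fnmatchcase

# Entity types and match patterns copied from the list in the procedure.
# "Email" is an assumed type name; FinanceTags is omitted because its
# match pattern is not given.
ENTITY_MATCH_PATTERNS = {
    "BankRouting": "entity_rtn:*",
    "CreditCard": "entity_ccn:*",
    "Email": "entity_email:*",
    "ipAddress": "entity_ip:*",
    "ITIN": "entity_itin:*",
    "Phone": "entity_phone:*",
    "ssn": "entity_ssn:*",
    "USDL": "entity_usdl:*",
    "Hostname": "entity_hostname:*",
}


def entity_match_patterns(entity_types):
    """Return the Entity Match pattern for each requested entity type."""
    unknown = [t for t in entity_types if t not in ENTITY_MATCH_PATTERNS]
    if unknown:
        raise KeyError("No match pattern listed for: " + ", ".join(unknown))
    return [ENTITY_MATCH_PATTERNS[t] for t in entity_types]


def matches_wildcard(file_name, pattern):
    """Approximate a wildcard match on a file name.

    Assumption: case is ignored. The product may behave differently.
    """
    return fnmatchcase(file_name.lower(), pattern.lower())


if __name__ == "__main__":
    # Patterns to enter for credit card and social security numbers.
    print(entity_match_patterns(["CreditCard", "ssn"]))
    # Quick check of which file names a *.txt pattern would select.
    for name in ["report.TXT", "notes.txt", "report.docx"]:
        print(name, matches_wildcard(name, "*.txt"))
```

The lookup returns a list instead of a single joined string because the procedure does not state how multiple entity match patterns are separated in the Entity Match field.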