Before you deploy Activate for Sensitive Data Governance, verify that the computers where you deploy the various components meet the following requirements.
Index Store, Content Analyzer, and Web Server Packages
You can use the following editions of Windows Server with the Index Store, Content Analyzer, and Web Server packages:
- Microsoft Windows Server 2019 Editions
- Microsoft Windows Server 2016 x64 Editions
- Microsoft Windows Server 2012 R2 x64 Editions
- Microsoft Windows Server 2012 x64 Edition
Hardware Specifications for the Index Server
Use these guidelines to select the appropriate hardware for your Index Server.
Considerations
- If your environment is between two sizes, size up to the larger specification.
- If you are using one Index Server for both files and email messages, use the size required by whichever source needs the larger specification. For example, 40 TB of file source data fits a small server, but 15 TB of email source data requires a medium server, so use the specification for a medium-sized server.
- The following options affect the performance of the Index Server:
  - Custom entities that include multiple regular expressions
  - Data classification plans that include the optical character recognition (OCR) option or the pre-generated previews option
File Data and Email Messages
Component | Large | Medium | Small |
---|---|---|---|
File source data size per node* | 160 TB | 80 TB | 40 TB |
Email source application size | 25 TB | 15 TB | 5 TB |
File objects per node (estimated) | 80 million | 40 million | 20 million |
Email objects per node (estimated)** | 250 million | 150 million | 50 million |
CPU | 32 cores | 16 cores | 8 cores |
RAM | 64 GB | 32 GB | 16 GB |
Index disk space (SSD class disk recommended) | 12 TB | 6 TB | 3 TB |

*Based on an average file size of 1 MB and on the assumption that 50 percent of documents are eligible for content indexing and there is only one version of the file.

**Based on an average message size of 100 KB. Messages with attachments are considered a single object.
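
The sizing considerations and the combined table can be sketched as a small lookup. This is only an illustration, not part of the product: the `Tier` class and the `pick_tier` function are hypothetical names, the limits are copied from the table above, and returning `None` for data volumes beyond the Large specification is an assumption, because the guidelines do not say what to do in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    file_tb: int        # maximum file source data per node, in TB
    email_tb: int       # maximum email source application size, in TB
    cores: int
    ram_gb: int
    index_disk_tb: int


# Values copied from the combined file and email table, smallest first.
TIERS = [
    Tier("Small", 40, 5, 8, 16, 3),
    Tier("Medium", 80, 15, 16, 32, 6),
    Tier("Large", 160, 25, 32, 64, 12),
]


def pick_tier(file_tb: float, email_tb: float) -> Optional[Tier]:
    """Return the smallest tier that covers both sources.

    Because the tiers grow monotonically, this is the same as sizing each
    source separately and keeping the larger result. Values between two
    sizes are rounded up to the larger specification.
    """
    for tier in TIERS:
        if file_tb <= tier.file_tb and email_tb <= tier.email_tb:
            return tier
    return None  # assumption: beyond Large is outside this table


if __name__ == "__main__":
    tier = pick_tier(file_tb=40, email_tb=15)
    print(tier.name, tier.cores, "cores,", tier.ram_gb, "GB RAM")
```

With 40 TB of file data and 15 TB of email data, the sketch selects the Medium tier, which matches the example in the considerations.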
Specifications for Dedicated Servers for File Data
Component | Large | Medium | Small |
---|---|---|---|
Source data size per node* | 160 TB | 80 TB | 40 TB |
Objects per node (estimated) | 80 million | 40 million | 20 million |
CPU or vCPU | 32 cores | 16 cores | 8 cores |
RAM | 64 GB | 32 GB | 16 GB |
Index disk space (SSD class disk recommended) | 12 TB | 6 TB | 3 TB |

*Based on an average file size of 1 MB and on the assumption that 50 percent of documents are eligible for content indexing and there is only one version of the file.
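
The estimated object counts can be reproduced from the footnote's assumptions. The sketch below shows one reading of the footnote that matches the table's figures; the function name `estimated_file_objects` is hypothetical, and the use of decimal units (1 TB = 1,000,000 MB) is an assumption. If your average file size or eligible fraction differs, the same arithmetic gives a rough estimate for your environment.

```python
def estimated_file_objects(source_tb: float,
                           avg_file_mb: float = 1.0,
                           eligible_fraction: float = 0.5) -> int:
    """Estimate indexable objects for a given amount of file source data.

    Assumes decimal units (1 TB = 1,000,000 MB) and one version per file.
    """
    total_files = source_tb * 1_000_000 / avg_file_mb
    return round(total_files * eligible_fraction)


if __name__ == "__main__":
    for tb in (40, 80, 160):
        print(f"{tb} TB -> {estimated_file_objects(tb):,} objects")
```

For 40 TB, 80 TB, and 160 TB, the function returns 20 million, 40 million, and 80 million objects, the same values as the Small, Medium, and Large columns.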
Specifications for Dedicated Servers for Email Messages
Component | Large | Medium | Small |
---|---|---|---|
Source application size | 25 TB | 15 TB | 5 TB |
Objects per node (estimated)* | 250 million | 150 million | 50 million |
Number of mailboxes** | 5000 | 2000 | 400 |
CPU | 16 cores | 16 cores | 8 cores |
RAM | 64 GB | 32 GB | 16 GB |
Index disk space (SSD class disk recommended) | 10 TB | 6 TB | 2 TB |

*Based on an average message size of 100 KB. Messages with attachments are considered a single object.

**Based on an average mailbox size of 5 GB, and an average of 50,000 messages per mailbox.
Hardware Specifications for the Content Analyzer
Content analyzer computers detect personally identifiable information (PII) in the data. Some Activate environments require dedicated content analyzer computers.
Review the following use cases to determine if you need dedicated content analyzer computers:
- You need to identify machine learning-based entities. For example, Address, Contextual Date, Location, Money, Organization, Person, and Time are entities that are identified by using machine learning. For a complete list of built-in entities, see Personally Identifiable Information.
- You want to configure parallel processing. You can add multiple content analyzer computers, and then create a separate data classification plan for each content analyzer.
Before you install the Content Analyzer package, use the following guidelines to select the appropriate hardware for your content analyzer computers.
Use case | CPU | RAM | Disk space |
---|---|---|---|
Machine learning-based entities | 32 cores | 64 GB | 2 TB |
Parallel processing | 16 cores | 32 GB | 1 TB |
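
As a rough planning aid, the two use cases and their hardware rows can be expressed as a lookup. The dictionary keys and the function name `dedicated_analyzer_specs` are hypothetical, and the values are copied from the table above. The guidelines do not say how to combine the rows when both use cases apply, so this sketch returns every matching row rather than merging them.

```python
# Values copied from the table above: (CPU cores, RAM in GB, disk space in TB).
CONTENT_ANALYZER_SPECS = {
    "machine_learning_entities": (32, 64, 2),
    "parallel_processing": (16, 32, 1),
}


def dedicated_analyzer_specs(ml_entities: bool, parallel: bool) -> dict:
    """Return the table rows for the use cases that apply.

    An empty dict means that neither listed use case applies.
    """
    wanted = {
        "machine_learning_entities": ml_entities,
        "parallel_processing": parallel,
    }
    return {use: CONTENT_ANALYZER_SPECS[use] for use, on in wanted.items() if on}


if __name__ == "__main__":
    print(dedicated_analyzer_specs(ml_entities=True, parallel=False))
```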