Adding OCR Support for Additional File Types

You can enable content indexing and entity detection for images that are embedded in additional file types. By default, optical character recognition (OCR) is applied only to scanned documents in PDF format.

Procedure

  • On the content analyzer computer, add the OCRFilterFileList additional setting with the values shown in the following table.

Property    Value
--------    -----
Name        OCRFilterFileList
Category    ContentAnalyzer
Type        String
Value       A comma-separated list of extensions for file types that can contain images. For example: pdf,jpeg,docx,doc,msg
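If the extension list is assembled by a script rather than typed by hand, a small helper can keep the value in the same shape as the example above. The following Python sketch is illustrative only: the function name build_ocr_filter_value is hypothetical, and the normalization it applies (lowercase, no leading dots, no spaces, no duplicates) is an assumption based on the example value, not documented behavior of the setting.

```python
def build_ocr_filter_value(extensions):
    """Return a comma-separated string such as 'pdf,jpeg,docx'.

    Assumption: the setting expects bare, lowercase extensions with no
    leading dots or spaces, as in the documented example value.
    """
    cleaned_list = []
    for ext in extensions:
        # Drop surrounding whitespace, any leading dot, and uppercase letters.
        cleaned = ext.strip().lstrip(".").lower()
        if not cleaned:
            continue  # skip empty entries
        if not cleaned.isalnum():
            raise ValueError(f"unexpected characters in extension: {ext!r}")
        if cleaned not in cleaned_list:
            cleaned_list.append(cleaned)  # keep first occurrence, preserve order
    return ",".join(cleaned_list)


if __name__ == "__main__":
    # Produces the value to paste into the Value field of the additional setting.
    print(build_ocr_filter_value(["PDF", ".jpeg", " docx", "doc", "msg", "pdf"]))
```

The resulting string is what you would enter in the Value field. The helper only formats text; it does not create or modify the additional setting.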

OCR Support for Scanned Documents
