Troubleshooting Backup - Oracle Agent

Table of Contents

Backup Failures

Troubleshooting Performance Issues

Completed with one or more errors

Oracle Errors

ORCL0001: Error 18:18 Oracle database is not available. The database may be down or in an unknown state

ORCL0003: Error 18:40 RMAN script execution failed for this job.

ORCL0004: Oracle bug with available patch - Backup and recovery impacted

ORCL0005: Log files required to troubleshoot Oracle Agent related errors

ORCL0006: Performance statistics for Oracle backup and restore

ORCL0007: Debugging allocate channel failures on UNIX clients

ORCL0009: Job fails due to sbtio.log size

ORCL0010: Command line backup fails

ORCL0011: Command line backup fails for large backups

ORCL0012: Offline backup with Lights Out script fails

ORCL0013: Backup timeout failure

ORCL0014: Backup fails because of $ORACLE_HOME/sqlplus/admin/glogin.sql

ORCL0015: Database block corruption

ORCL0016: Backup fails intermittently on Linux clients

ORCL0017: Configuring an Instance or a Backup fails on Windows clients

ORCL0019: Log backup fails

ORCL0020: Backup fails on Linux clients because of unknown instance status

ORCL0021: Shared memory error

ORCL0034: Backup Fails with Permissions Issue

ORCL0037: Multiple Jobs for Oracle Third Party Command Line Operations

ORCL0038: First Archive Log Backup May Fail after Initial Deployment

ORCL0039: Oracle crosscheck can take a long time to finish

ORCL0042: Validation Error When Running RMAN Scripts

ORCL0043: User Error When Running RMAN Scripts

ORCL0045: RMAN Backup Jobs Run Slowly

ORCL0047: Some Channels Unexpectedly Terminate during a Multistream Oracle 12c Backup

ORCL0050: RMAN third party Command Line backups are not running

ORCL0051: Backup Fails with ORA-19919 Error

ORCL0052: Backup Fails with Character Conversion Error

ORCL0053: An Oracle Backup Might Hang When There is No Space

RMAN-3002: failure of delete command or RMAN-6091: no channel allocated for maintenance

RMAN CrossCheck Failure During Data Aging

Backup Failures

The following section provides information on troubleshooting backups.

Troubleshooting Performance Issues

If you experience performance issues during backup, you can troubleshoot them by enabling logging of performance details in the log files. These performance counters contain information that helps resolve performance-related issues during backups.

  1. Perform a client backup to determine the performance statistics. See Backing Up an Oracle Subclient.

    Track the progress of the job from the Job Controller window of the CommCell Console.

    • Right-click the backup job and click Details and verify the Data Transferred on Network.
    • For example, if the backup job uses 10 streams, make sure to back up at least 200 GB of data. If you perform backups using 5 streams, make sure to back up at least 100 GB of data.
    • If the backup transfer rate is very slow, kill the job: right-click the backup job and then click Kill.
  2. View the log files of the backup job to verify the performance counters. See Viewing Log Files for Completed or Failed Jobs for more information.

    The following are the performance counters that need to be verified in the log files:

    Total Oracle I/O Time

    Time spent per SBT thread for reading the data from disk.

    Total MA I/O Time

    Time spent transferring data to the MediaAgent, that is, reading data from the network buffer and writing it to the disk.

  3. In the log file verify the above performance counters.

    If the Total Oracle I/O Time value is more than the Total MA I/O Time value then perform the following to improve performance:

    If the Total Oracle I/O Time value is less than the Total MA I/O Time value, then perform the following to improve performance:

    • If the write throughput of the disk is slow, run the CvDiskPerf tool to measure the throughput for the disk. See Disk Performance Tool for more information.
    • If the data transfer on the network is slow or you have a low bandwidth network environment, verify the network throughput by running the CvNetworkTestTool tool. If the network throughput is low, enable the nNumPipelineBuffers additional setting to increase the data transfer throughput from the client. See Increasing Data Transfer Throughput From Client for more information.
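
The comparison described in the steps above can be sketched in a few lines of Python. This is only an illustration: the counter values are assumed to have been read manually from the log file (the log format is not parsed here), the function names are hypothetical, and the 20 GB per stream figure is an inference from the 10-stream and 5-stream examples rather than a documented rule.

```python
# Assumption: 20 GB per stream, inferred from the two examples
# (10 streams -> 200 GB, 5 streams -> 100 GB).
GB_PER_STREAM = 20


def min_test_data_gb(streams: int) -> int:
    """Suggested minimum amount of data to back up for a meaningful test."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    return streams * GB_PER_STREAM


def likely_bottleneck(total_oracle_io_time: float, total_ma_io_time: float) -> str:
    """Compare the two counters and name the side that takes longer.

    Both values must be in the same unit, as read from the log file.
    """
    if total_oracle_io_time > total_ma_io_time:
        return "oracle"       # reading from the database disk dominates
    if total_oracle_io_time < total_ma_io_time:
        return "mediaagent"   # network transfer or MediaAgent disk dominates
    return "balanced"


if __name__ == "__main__":
    print(min_test_data_gb(10))
    print(likely_bottleneck(300.0, 1200.0))
```

A result of "mediaagent" corresponds to the case where the disk and network checks listed above apply.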

Completed with one or more errors

Backup jobs from the Oracle Agent are displayed as "Completed w/ one or more errors" in the Job History in the following cases:

  • When the RMAN script execution for the backup job completes with warnings.
  • When the job is killed after backing up some data.
  • During offline backups, if the database cannot be opened after a backup.

Oracle Errors

If you receive an Oracle error during an Oracle backup operation, we recommend that you follow procedures published by Oracle Corporation on resolving the specific error. We also advise you to consult with your on-site Oracle database administrator, as needed.

ORCL0001: Error 18:18 Oracle database is not available. The database may be down or in an unknown state

Issue

Offline backups are failing on Windows servers with the following error message:

ORACLE database is not available. The database may be down or in an unknown state

Resolution

This problem can occur if the TNS configuration for the database does not have a "static" listener. Most Oracle databases use dynamic listeners, so there can be a delay between the time the lights-out script starts the database and the time the database finishes registering with the dynamic listener. If the CommServe tries to connect to the database to get its status before the registration completes, the TNS connection fails, which results in an unknown status and this error.

To confirm the cause, scan the ClOraAgent.log on the client for an Oracle connection error where TNS is blocking connections.

If this is the error, you can add a directive to the tnsnames.ora file that prevents the connection from being blocked when a dynamic listener is used.

Go to the %ORACLE_HOME%\NETWORK\admin folder on the Oracle server and add (UR = A) to the connect_data directive.

Once this parameter is added, the connection should no longer be blocked, and the backup should run with no errors.
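
As a rough way to confirm that the directive landed in the right place, the following Python sketch checks whether a tnsnames.ora entry contains (UR = A) inside its CONNECT_DATA section. The alias, host, port, and service name in the sample entry are hypothetical, and the helper name is illustrative; it is not part of any Commvault or Oracle tool.

```python
import re


def has_ur_a(entry: str) -> bool:
    """Return True if (UR = A) appears inside the CONNECT_DATA section."""
    # Remove whitespace and ignore case so "(UR = A)" and "(ur=a)" both match.
    compact = re.sub(r"\s+", "", entry).upper()
    start = compact.find("(CONNECT_DATA=")
    if start < 0:
        return False
    # Walk the parentheses to find where the CONNECT_DATA section closes.
    depth = 0
    for i in range(start, len(compact)):
        if compact[i] == "(":
            depth += 1
        elif compact[i] == ")":
            depth -= 1
            if depth == 0:
                return "(UR=A)" in compact[start:i + 1]
    return False


# Hypothetical entry; replace the names with those from your tnsnames.ora file.
SAMPLE = """
ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = orcl)
      (UR = A)
    )
  )
"""

if __name__ == "__main__":
    print(has_ur_a(SAMPLE))
```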

ORCL0003: Error 18:40 RMAN script execution failed for this job.

Issue

Oracle backups fail with the following error message:

RMAN Script execution failed for this job. Please check RMAN log file for job failure reason.

The following entry is in the RMAN log file:

ORA-19511: Error received from media manager layer, error text:

Resolution

When the error message indicates an error from the Media Manager Layer, the cause could be a media error or a network error. Resolve the underlying media or network issue and then rerun the RMAN job.

If you still encounter the same issue, open a case with the contracted support center and upload the following logs for troubleshooting:

  • All logs from the CommServe
  • All logs from the MediaAgent
  • All logs from the Client
  • The RMAN script or backup script

Include the following logs:

  • JobManager.log from CommServe (log file is normally included with a send logs job for CS)
  • ORASBT.log from Client (log file is normally included with a send logs job for CL)
  • cvd.log from MediaAgent (log file is normally included with a send logs job for MA)
  • RMAN log files

    The RMAN log files are in the Commvault JobResults directory. For a Windows configuration (where C: represents the drive where the Commvault software resides), the location is:

     C:\Program Files\CommVault\ContentStore\iDataAgent\jobResults\CV_JobResults\2\0\<job_id>\backup.out

    For a UNIX configuration, the location is:

    /opt/commvault/iDataAgent/jobResults/CV_JobResults/2/0/<job_id>/backup.out

Additional Information

In a UNIX configuration, you must set the SBT_LIBRARY parameter to the location of the libobk.xx library. For information on how to set the SBT library parameter, see Using PARMS in the Oracle Allocate Channel Command.

You can create an RMAN script to test the SBT library. On the RMAN command line run the following sample script, substituting any required or optional Oracle SBT parameters. For information on required and optional SBT parameters, see SBT Parameters.

run {
allocate channel ch1 type 'sbt_tape'
PARMS="BLKSIZE=1048576,
SBT_LIBRARY=/opt/commvault/Base/libobk.so, ENV=(CvClientName=client_name,CvInstanceName=Instance001)"
TRACE 2 DEBUG 2;
}

ORCL0004: Oracle bug with available patch - Backup and recovery impacted

Issue

Oracle has released a patch titled "High SCN growth rate from ALTER DATABASE BEGIN BACKUP in 11g" under patch number 12371955.

This information is relevant to Commvault conventional Oracle backups and for IntelliSnap Oracle Online backups as the command ALTER DATABASE BEGIN BACKUP is issued when putting tablespaces into hot backup mode.

Resolution

Please review the issue and apply the patch to the CommCell Oracle clients.

An Oracle Support account is required to access the following link:

"Bug 12371955 Hot Backup can cause increased SCN growth rate leading to ORA-600 [2252] errors" (Modified 16-FEB-2012 Type PATCH Status PUBLISHED)

https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=PATCH&id=12371955.8

ORCL0005: Log files required to troubleshoot Oracle Agent related errors

Issue

Log files are required for backup troubleshooting.

Resolution

When you encounter errors during backup/restore, make sure to view the following logs for troubleshooting:

  • SrvOraAgent.log on the CommServe
  • ClOraAgent.log on the client
  • ORASBT.log on the client.

    This log file is required if you encounter errors during data transfer from/to the MediaAgent.

  • <job_results_directory>/2/0/<jobid>/<backup|restore>.out for RMAN specific errors on the client.

If the information in the log files is not sufficient to determine the failure reason, increase the debugging level in the EventManager/.properties file and re-run the job.

You can also view the RMAN logs from the Oracle Agent. From the Job Controller window or Job History window, right-click the specific job and click View RMAN Log.

The RMAN logs are stored in the job results directory as backup.out (for backup jobs) and restore.out (for restore jobs). If there is an issue with data aging, the file in the job results directory will be called crosscheck.out.
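
The location pattern for the backup and restore logs can be expressed as a small helper. This is a sketch for UNIX clients only: the function name is hypothetical, and the job results directory passed in the example is taken from the UNIX path shown earlier on this page; use the directory configured on your client.

```python
from pathlib import PurePosixPath


def rman_log_path(job_results_dir: str, job_id: int, operation: str) -> PurePosixPath:
    """Build <job_results_directory>/2/0/<jobid>/<backup|restore>.out."""
    if operation not in ("backup", "restore"):
        raise ValueError("operation must be 'backup' or 'restore'")
    return PurePosixPath(job_results_dir) / "2" / "0" / str(job_id) / f"{operation}.out"


if __name__ == "__main__":
    # 12345 is a placeholder job ID.
    print(rman_log_path("/opt/commvault/iDataAgent/jobResults/CV_JobResults", 12345, "backup"))
```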

ORCL0006: Performance statistics for Oracle backup and restore

The Oracle Agent logs the performance statistics in the ORASBT.log file with the following throughput information:

Backups

Oracle I/O Throughput:

Amount of data read by Oracle from disk in GB / Hour

MediaAgent I/O Throughput:

Amount of data written by the MediaAgent in GB / Hour

I/O Throughput:

Net amount of data read from Oracle and written to the MediaAgent (i.e. amount of data backed up) in GB / Hour

(This value includes time taken by Oracle as well as the MediaAgent.)

Restores

Oracle I/O Throughput:

Amount of data written by Oracle to disk in GB / Hour

MediaAgent I/O Throughput:

Amount of data read from the MediaAgent in GB / Hour

I/O Throughput:

Effective amount of data read from the MediaAgent and written by Oracle (i.e. amount of data restored) in GB / Hour.

(This value includes time taken by Oracle as well as the MediaAgent.)
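
To make the relationship between the three figures concrete, here is a minimal Python sketch. It rests on two assumptions that this page does not confirm: that a GB means 1024^3 bytes, and that the combined I/O throughput is computed over the sum of the Oracle time and the MediaAgent time. The function names are illustrative.

```python
BYTES_PER_GB = 1024 ** 3  # assumption: binary gigabytes


def gb_per_hour(data_bytes: float, seconds: float) -> float:
    """Convert an amount of data and an elapsed time to GB / Hour."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (data_bytes / BYTES_PER_GB) / (seconds / 3600)


def throughputs(data_bytes: float, oracle_io_seconds: float, ma_io_seconds: float) -> dict:
    """Return the three throughput figures for one job."""
    return {
        "oracle_io": gb_per_hour(data_bytes, oracle_io_seconds),
        "mediaagent_io": gb_per_hour(data_bytes, ma_io_seconds),
        # Assumption: the combined figure uses the time spent on both sides.
        "io": gb_per_hour(data_bytes, oracle_io_seconds + ma_io_seconds),
    }


if __name__ == "__main__":
    print(throughputs(200 * BYTES_PER_GB, 3600, 3600))
```

Under these assumptions the combined figure is always lower than either individual figure, which is consistent with the note that it includes the time taken by both Oracle and the MediaAgent.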

ORCL0007: Debugging allocate channel failures on UNIX clients

Use the following steps to troubleshoot allocate channel failures on UNIX clients:

Install/Permission issues

The Oracle user must belong to the Commvault group that was entered during the Oracle Agent installation. Otherwise, Oracle will not be able to write to the ORASBT.log file or access the Commvault registry at /etc/CommVaultRegistry. A common cause is selecting the 'sys' group during the Oracle Agent installation; the Oracle user does not belong to this group, so the allocation fails. Follow the installation instructions to create the Commvault group, and then reinstall the software packages on the client.

Library loading errors

Oracle backup library loading errors get logged into the temporary hook file libcvobk.log generated under /tmp folder. If this file does not exist, the Oracle user (not the root user) should create this file and run the backup again. Check this file for any library loading errors.

Create trace files:

Run the following RMAN command, and then collect the latest trace files from the udump directory of this Oracle instance. Also collect the Agent logs from the client.

rman target userid/password@instance nocatalog
run {
allocate channel ch1 type 'sbt_tape' trace=2 debug=2;
}
exit;

If the issue still exists, escalate to customer support with the output of the above steps.

If you use a Windows 32-bit client, check whether the /3GB switch is set in the boot.ini file.

ORCL0009: Job fails due to sbtio.log size

Issue

Sometimes, jobs fail because the sbtio.log file in the $UDUMP directory has grown too large.

Resolution

To resolve this, set the size limit for the sbtio.log file using the sMAXORASBTIOLOGFILESIZE additional setting. Once the specified size limit is reached, the sbtio.log file gets pruned automatically.

ORCL0010: Command line backup fails

Issue

Command Line backups fail.

Resolution

  • Make sure that the required media resource is available and then run the backup again.
  • For on demand backups, you can run more than one script for an instance. However, backup jobs fail if there is more than one instance in the argument file.
  • For Oracle on Windows, do not use a space after a comma in the argument file. A backup job fails if you leave a space after a comma in the argument file.
  • An RMAN command line backup fails with the following error:

    "Unable to open lock file /opt/commvault/Base/Temp/locks/.dir_lock: Permission denied"

    This may occur if the umask parameter is set as 022 in the .profile file for the Oracle instance. As a workaround, change the umask to 000 or 002 and try the backup again.
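
The effect of the umask values mentioned above can be checked with a short calculation. The sketch below assumes that the lock file is created with a requested mode of 0666 and that the Oracle user reaches it through group permissions; neither detail is stated on this page, so treat both as assumptions. The function names are illustrative.

```python
import stat


def effective_mode(requested: int, umask: int) -> int:
    """Permission bits a new file receives: requested bits minus umask bits."""
    return requested & ~umask


def group_can_write(mode: int) -> bool:
    """True if the group write bit is set in the given mode."""
    return bool(mode & stat.S_IWGRP)


if __name__ == "__main__":
    # Assumption: files are requested with mode 0666.
    for umask in (0o022, 0o002, 0o000):
        mode = effective_mode(0o666, umask)
        print(f"umask {umask:03o} -> mode {mode:03o}, group write: {group_can_write(mode)}")
```

With a umask of 022 the group write bit is removed, while 002 and 000 keep it, which matches the workaround described above.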

ORCL0011: Command line backup fails for large backups

Issue

Sometimes, the third party command line jobs may hang when you perform large backups and restores.

Resolution

This happens because ClDBControlAgent updates the job manager after every 100 MB of data transferred, which can cause a thread failure during large backups or restores after some of the data has been transferred.

The following exception will be seen in the ClDBControlAgent.log:

5710030 304 02/22 03:47:23 608119 OraAgentBase::NotifyCommServeJobContinue() - m_jobObject->setUnCompBytesToAdd(105119744) ...
5710030 304 02/22 03:47:24 608119 CvThread::start_func() - Unhandled exception.
5710030 405 02/22 03:47:37 608119 ClOraControlAgent::OnClientTimeout() - Got timed out while waiting for msg from client 0

To resolve this issue, set the sBYTESDIFFMBS additional setting to <value> (in MB) in OracleAgent/.properties.

The job manager is then updated after every <value> MB of data transferred.
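
To get a feel for how the update interval affects the number of job manager updates, the following Python sketch counts the updates for a given job size. It is simple arithmetic based on the 100 MB default interval described above; the function name and the 1 TB example size are illustrative only.

```python
DEFAULT_INTERVAL_MB = 100  # default update interval described above


def update_count(total_mb: int, interval_mb: int = DEFAULT_INTERVAL_MB) -> int:
    """Number of job manager updates for a job of total_mb megabytes."""
    if total_mb < 0 or interval_mb <= 0:
        raise ValueError("total_mb must be >= 0 and interval_mb must be > 0")
    return -(-total_mb // interval_mb)  # integer ceiling division


if __name__ == "__main__":
    one_tb_in_mb = 1024 * 1024
    print(update_count(one_tb_in_mb))        # with the default interval
    print(update_count(one_tb_in_mb, 1024))  # with a 1024 MB interval
```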

ORCL0012: Offline backup with Lights Out script fails

Issue

Offline backup using lights out script fails with the following error:

RMAN error "ORA-12528 TNS listener - all appropriate instances are blocking new connections"

Resolution

As a workaround, add a reference to the database in the listener.ora file as shown below:

SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = C:\oracle\product\10.2.0\db_1)
(PROGRAM = extproc)
)
(SID_DESC =
(SID_NAME = rman10g)
(ORACLE_HOME = C:\oracle\product\10.2.0\db_1)
(SID = rman10g)
)
)

Oracle offline backup with lights out option fails when you use the default value for retry attempts for the subclient. As a workaround, increase the retry attempts by setting the Tries number value greater than or equal to 5. See Configuring Oracle Subclients for Offline Backups for more details.

ORCL0013: Backup timeout failure

Issue

The backup fails because of a timeout.

Resolution

The default time for resources to allocate streams during RMAN command line backups is 86400 seconds (24 hours). If a backup fails due to a timeout being reached, you can configure the sALLOCATESTREAMSECS additional setting to increase the waiting time period.

ORCL0014: Backup fails because of $ORACLE_HOME/sqlplus/admin/glogin.sql

Issue

If the following line is present in the $ORACLE_HOME/sqlplus/admin/glogin.sql file, it may cause the SrvOraAgent server process on the CommServe to fail when browsing database contents or executing a backup.

set linesize 80

Resolution

To avoid such failures, comment out that line from the file and re-try the browse or backup operation.

ORCL0015: Database block corruption

Issue

The backup fails with the following error:

LISTING 2: r_20030520213618.log
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on d1 channel at 05/20/2003 21:36:26
ORA-19566: exceeded limit of 0 corrupt blocks for file
/u01/app/Oracle/oradata/MRP/sales_data_01.dbf

Resolution

Make sure that the maximum value for database block corruptions is set for the backup. It is recommended that you set this value to match the number of corrupted database blocks identified by RMAN for the database file being backed up.

ORCL0016: Backup fails intermittently on Linux clients

Issue

On Linux clients, if the libobk.so library fails to load, the backups may fail.

Resolution

As a workaround, do the following steps:

  1. Log in to the Oracle client computer as root.
  2. From the system prompt, enter the following command:

    ldconfig /<Base_directory_name>

    For example: # ldconfig <software installation path>/Base

This will ensure that the libobk.so library is loaded so that backups for Oracle on Linux can run successfully.

ORCL0017: Configuring an Instance or a Backup fails on Windows clients

Issue

Configuring an instance or a backup fails on Windows Clients.

Note: when using Oracle 12c, grant full control permission for the Oracle home user for the Commvault folder.

Resolution

Grant Full Control Permission to the Oracle Home User.

ORCL0019: Log backup fails

Issue

If the Oracle database is configured to save the archive logs in the Flash Recovery Area, and an Oracle subclient has both Backup Recovery Area and Archive Delete enabled at the same time, then the backup fails.

Resolution

To resolve this, use two different subclients, one for Backup Recovery Area and the other for Archive Delete.

ORCL0020: Backup fails on Linux clients because of unknown instance status

Issue

Backups may fail on Linux clients if the Oracle instance status is shown as UNKNOWN on CommCell Console.

Resolution

To resolve this issue, make sure that the nproc value in the /etc/security/limits.d/90-nproc.conf file is greater than 1024.
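
A quick way to review the current limits is to parse the file and flag any nproc entry that is not above 1024. The Python sketch below relies on the standard limits.conf layout (domain, type, item, value); the sample content and the function names are hypothetical, and on a real client you would read the text from the file itself.

```python
def nproc_limits(conf_text: str) -> list:
    """Return (domain, type, value) tuples for every nproc entry."""
    entries = []
    for line in conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split columns
        if len(fields) == 4 and fields[2] == "nproc":
            entries.append((fields[0], fields[1], fields[3]))
    return entries


def is_sufficient(value: str, minimum: int = 1024) -> bool:
    """True if the limit is greater than the minimum or is unlimited."""
    if value in ("unlimited", "infinity", "-1"):
        return True
    return int(value) > minimum


# Hypothetical file content for illustration.
SAMPLE_CONF = """
# Default limit for number of user's processes
*          soft    nproc     1024
root       soft    nproc     unlimited
"""

if __name__ == "__main__":
    for domain, limit_type, value in nproc_limits(SAMPLE_CONF):
        print(domain, limit_type, value, "ok" if is_sufficient(value) else "too low")
```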

ORCL0021: Shared memory error

Issue

The backup failed because the shared memory on the HP-UX PA-RISC client has not been configured per operational guidelines.

Resolution

Add the DisableIPC_GLOBAL file in the /apps/Commvault/Base directory on the client where the backup failed.

  1. Stop the Commvault software.
  2. Create an empty file called DisableIPC_GLOBAL in the /apps/Commvault/Base directory. From the command line, enter the following:

    touch /apps/Commvault/Base/DisableIPC_GLOBAL

  3. Restart the Commvault software.

ORCL0034: Backup Fails with Permissions Issue

Issue

The backup fails due to issues accessing the Commvault registry, log files and base directories.

The RMAN backup fails because it cannot load the CommVault SBT Media Management library.

Solution

Run the Database Readiness Check.

ORCL0037: Multiple Jobs for Oracle Third Party Command Line Operations

Issue

For Oracle 12c, when you use multiple streams for third party command line operations, multiple jobs may be started.

Solution

Add the user to the Local Security Policy:

  1. From Local Security Policy, navigate to User Rights Assignment.
  2. Right-click Act as part of the operating system and then select Properties.
  3. Click on Add User or Group and then click OK.
  4. Right-click Create a token object and then select Properties.
  5. Click on Add User or Group and then click OK.
  6. Right-click Replace a process level token and then select Properties.
  7. Click on Add User or Group and then click OK.

ORCL0038: First Archive Log Backup May Fail after Initial Deployment

Issue

On new Commvault deployments, or migrations from other vendors, the first archive log backup may fail because the logs may have been manually deleted.

The following Oracle error may be displayed.

RMAN-06059 expected archived log not found, loss of archived log compromises recoverability

Solution

  1. Run the crosscheck command to check for missing archive logs.

    crosscheck archivelog all ;

  2. Run the RMAN delete command to remove the entries and synchronize the RMAN catalog files with the database files.

    delete noprompt obsolete;

ORCL0039: Oracle crosscheck can take a long time to finish

Issue

If there are a large number of Oracle backups available because of a higher retention, then crosschecking these backups can take a long time.

Solution

Limit the CROSSCHECK scope by specifying backups completed after a specified time (for example, the last 40 days).

    crosscheck backup completed after 'SYSDATE - 40';
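
If you generate RMAN scripts programmatically, the time window can be parameterized. The Python sketch below only builds the command text shown above for a given number of days; the function name is hypothetical and nothing is sent to RMAN.

```python
def crosscheck_command(days: int) -> str:
    """Build a CROSSCHECK command limited to backups from the last N days."""
    if not isinstance(days, int) or days < 1:
        raise ValueError("days must be a positive integer")
    return f"crosscheck backup completed after 'SYSDATE - {days}';"


if __name__ == "__main__":
    print(crosscheck_command(40))
```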

ORCL0042: Validation Error When Running RMAN Scripts

Issue

An error message containing "Provide Valid Token" is returned when an RMAN script runs.

Solution

A valid token file was not included in the request.

  1. Run the qlogin command with the token file option (-f) to obtain a token file.
  2. Use the CvQcmdTokenFile parameter with the token file that the qlogin command generates.

    For information on required and optional SBT parameters, see SBT Parameters.

ORCL0043: User Error When Running RMAN Scripts

Issue

An error message containing "Provide competent user" is returned when an RMAN script runs.

Solution

The user does not have the correct permissions in the CommCell Console to run the backup job.

ORCL0045: RMAN Backup Jobs Run Slowly

Issue

The RMAN jobs run slowly.

Solution

The SQL statements may not be optimized. Set the Rules for the Oracle optimizer_mode Statement.

ORCL0047: Some Channels Unexpectedly Terminate during a Multistream Oracle 12c Backup

Issue

Some channels unexpectedly terminate during a multistream Oracle 12c backup before the operation completes.

Solution

Check the value of the PGA_AGGREGATE_LIMIT database parameter and increase it. The minimum default value that Oracle recommends is 2 GB. You can get more information in Oracle Doc ID 1520324.1.

ORCL0050: RMAN third party Command Line backups are not running

Before you run backups from the RMAN command line for the Oracle Agent, set the SBT_LIBRARY path and environment variables for CvClientName and CvInstanceName in the RMAN script. For example, on a Solaris client, provide the following path:

util_par_file = <ORACLE_HOME>/dbs/init@.utl
rman_parms="BLKSIZE=1048576,SBT_LIBRARY=/opt/commvault/Base64/libobk.so,ENV=(CvClientName=sunsign,CvInstanceName=Instance001)" rman_channels=1

where CvClientName and CvInstanceName are the names of the client and instance (for example, Instance001) where the SAP for Oracle Agent is installed.

On a Windows client, edit the $ORACLE_HOME\database\init<SID>.sap file and provide the parameters as shown below:

util_par_file = <ORACLE_HOME>\database\init@.utl
RMAN_PARMS="SBT_LIBRARY=,BLKSIZE=1048576,ENV=(CvClientName=<client_name>,CvInstanceName=<instance_name>)"

where CvClientName and CvInstanceName are the names of the client and instance (for example, Instance001) where the Oracle Agent is installed.

The SBT_LIBRARY values for the various platforms are listed below:

| Platform | SBT_LIBRARY |
| --- | --- |
| AIX with 64 bit Oracle | <Client Agent Install Path>/Base/libobk.a(shr.o) |
| HP UX PA RISC 64 bit Oracle | <Client Agent Install Path>/Base64/libobk.sl |
| Solaris with 64 bit Oracle | <Client Agent Install Path>/Base64/libobk.so |
| All Other Unix platforms | <Client Agent Install Path>/Base/libobk.so |

NOTE: The SBT_LIBRARY parameter is not applicable on Windows platforms.

When you use the RMAN utility on a Solaris client, set the following parameter on the client computer:

crle -64 -c /var/ld/64/ld.config -l/opt/commvault/Base64:/lib/64:/usr/lib/64
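
The platform-specific library paths and the parameter string can be assembled with a small helper, which may reduce typing mistakes when preparing scripts for several UNIX clients. This Python sketch is illustrative only: the platform keys and function names are hypothetical, and the install path in the example is taken from the Solaris sample above.

```python
# Library location relative to the client agent install path, per platform.
# The dictionary keys are hypothetical labels chosen for this sketch.
SBT_LIBRARY_BY_PLATFORM = {
    "aix64": "Base/libobk.a(shr.o)",
    "hpux_parisc64": "Base64/libobk.sl",
    "solaris64": "Base64/libobk.so",
    "other_unix": "Base/libobk.so",
}


def sbt_library(install_path: str, platform: str) -> str:
    """Return the full SBT_LIBRARY path for a UNIX platform."""
    return f"{install_path.rstrip('/')}/{SBT_LIBRARY_BY_PLATFORM[platform]}"


def rman_parms(install_path: str, platform: str, client: str, instance: str,
               blksize: int = 1048576) -> str:
    """Build the parameter string with the library path and ENV settings."""
    return (
        f"BLKSIZE={blksize},"
        f"SBT_LIBRARY={sbt_library(install_path, platform)},"
        f"ENV=(CvClientName={client},CvInstanceName={instance})"
    )


if __name__ == "__main__":
    print(rman_parms("/opt/commvault", "solaris64", "sunsign", "Instance001"))
```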

ORCL0051: Backup Fails with ORA-19919 Error

Issue

An Oracle backup that uses the Commvault Media Manager fails with the following error:

ORA-19919: encrypted backups to tertiary storage require Oracle Secure Backup

Resolution

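RMAN encryption of backups to tertiary storage requires Oracle Secure Backup. Because the backup uses the Commvault Media Manager instead, disable encryption in the RMAN configuration by using the following command: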

CONFIGURE ENCRYPTION FOR DATABASE OFF;

After you disable RMAN encryption, use the Commvault encryption on the client or subclient level.

ORCL0052: Backup Fails with Character Conversion Error

Issue

A backup fails with the following error:

Character conversion not supported

Resolution

The software sets the NLS_LANG environment variable to the American_America.US7ASCII character set by default. If the Oracle database on the client uses a different NLS character set (for example, WE8MSWIN1252), then the Agent’s backup operations may fail.

In such cases, use the <oracle_SID>_NLS_LANG additional setting to set the NLS_LANG environment variable to American_America.<database_character_set> on the client computer.
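
The name and value of the additional setting follow a fixed pattern, sketched below in Python. The SID "orcl" is a hypothetical example and the function names are illustrative; the character set must be the one that your database actually uses.

```python
def nls_setting_name(oracle_sid: str) -> str:
    """Name of the additional setting: <oracle_SID>_NLS_LANG."""
    return f"{oracle_sid}_NLS_LANG"


def nls_lang_value(database_character_set: str) -> str:
    """Value of the setting: American_America.<database_character_set>."""
    return f"American_America.{database_character_set}"


if __name__ == "__main__":
    # "orcl" is a hypothetical SID used only for illustration.
    print(nls_setting_name("orcl"), "=", nls_lang_value("WE8MSWIN1252"))
```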

ORCL0053: An Oracle Backup Might Hang When There is No Space

Issue

If there is no space left on the database, then the backup might hang when the software performs Oracle browse queries.

Resolution

  1. Connect to the database with the dba user.
  2. On the command line, type the following command:

    purge dba_recyclebin

  3. If you run the command and still have issues, then increase the debug level of the ClOraAgent.log file and contact Oracle support.

RMAN-3002: failure of delete command or RMAN-6091: no channel allocated for maintenance

If you applied the July 2018 (DBPSU/BP/RU) Oracle patch, an RMAN script might fail because the RMAN channel allocation fails during an RMAN operation.

Symptom

If you applied the July 2018 (DBPSU/BP/RU) Oracle patch, an RMAN script fails with a channel allocation error. The following errors appear in the RMAN log file:

RMAN-3002: failure of delete command at 07/24/2018 08:26:03
RMAN-6091: no channel allocated for maintenance (of an appropriate type)

Resolution

This is an issue with the Oracle patch. Oracle has addressed this problem. For more information, go to the Oracle support site and read the following articles:

Note: you need an Oracle support identifier to view these articles.

Oracle bug details:

https://support.oracle.com/epmos/faces/Registration?_adf.ctrl-state=2ku3098iy_4&_afrLoop=384247377134524

Oracle patch details: https://support.oracle.com/epmos/faces/BugMatrix?_afrLoop=259787922049607&id=28391990&_afrWindowMode=0&_adf.ctrl-state=nzva30kz7_150

RMAN CrossCheck Failure During Data Aging

For more information, see KB article 55003 - RMAN Crosscheck failed for Oracle Instance during a data aging operation.

 

Last modified: 1/9/2018 7:32:05 PM