RMAN-Time-Based Recovery

Time-based recovery allows the DBA to recover to a desired point of time. This situation is most likely to occur if archive logfiles or redo logfiles needed for recovery are lost or damaged and cannot be restored. In this situation you would apply all logs until a point in time specified by the UNTIL TIME clause of the RECOVER command.

Steps:

– If the database is still open, shut down the database using the SHUTDOWN command with the   ABORT option.
– Make a full backup of the database including all datafiles, a control file, and the parameter files in case an error is made during the recovery.
– Restore backups of all datafiles. Make sure the backups were taken before the point in time you are going to recover to. Any datafiles added after the point in time you are recovering to should not be restored. They will not be used in the recovery and will have to be recreated after recovery is complete. Any data in the datafiles created after the point of recovery will be lost.

Make sure read-only tablespace are offline before you start recovery so recovery does not try to update the datafile headers.
– Make sure all the datafiles in the database are online. You can check the status of datafiles by querying the V$DATAFILE view.
Now Connect the rman with target database and run the following command:
C:RMAN TARGET SYS/ORACLE@ORCL3 CATALOG RMAN/RMAN@SHAAN

RMAN> shutdown Abort;

RMAN> startup mount;

RMAN> sql “alter session set nls_date_format=”dd-mon-yyyy hh24:mi:ss””;

RMAN> set until time ’15-Mar-2012 14:11:00′;

RMAN> restore database;

RMAN> recover database;

RMAN> alter database open resetlogs;

Oracle will automatically stop recovery when the time specified in the RECOVER command has been reached. Oracle will respond with a recovery successful message.
– You should use the RESETLOGS option if you used a backup of the control file in recovery or the recovery was incomplete. Use the NORESETLOGS option if the recovery was complete. If you are using a standby database and must reset the logs, the standby database will need to be re-created. You can check the ALERT file to see if your incomplete recovery was actually a complete recovery. If the recovery was a complete recovery, the message in the ALERT file is as follows: “RESETLOGS after complete recovery through change scn” and If the recovery was incomplete, the following message is recorded: “RESETLOGS after incomplete recovery UNTIL CHANGE scn”

– After opening the database using the RESETLOGS option, perform a normal shutdown and a full database backup. If you do not do this, any changes made after the recovery and before the next full backup are unrecoverable. If you did not reset the logs, the database is still recoverable.

RMAN- Change-Based (SCN) Recovery

Change-based recovery allows the DBA to recover to a desired point of System change numner (SCN). This situation is most likely to occur if archive logfiles or redo logfiles needed for recovery are lost or damaged and cannot be restored.

From SQL*Plus as SYS get the last archived SCN using the following.

SQL> SELECT archivelog_change#-1 FROM v$database;

ARCHIVELOG_CHANGE#-1

——————–

7536295

Steps:

– If the database is still open, shut down the database using the SHUTDOWN command with the ABORT option.
– Make a full backup of the database including all datafiles, a control file, and the parameter files in case an error is made during the recovery.
– Restore backups of all datafiles. Make sure the backups were taken before the point in time you are going to recover to. Any datafiles added after the point in time you are recovering to should not be restored. They will not be used in the recovery and will have to be recreated after recovery is complete. Any data in the datafiles created after the point of recovery will be lost.

– Make sure read-only tablespace are offline before you start recovery so recovery does not try to update the datafile headers.
Now Connect the rman with the target database and run the following command:

RMAN TARGET SYS/ORACLE@ORCL3 CATALOG RMAN/RMAN@SHAAN

RMAN> shutdown Abort;

RMAN> startup mount;

RMAN>run {

set until scn 1048438;

restore database;

recover database;

alter database open resetlogs;

}

How to find Exact SCN for Oracle Restore
1. If you have a controlfile mounted then you can query the latest backup details, archivelog details and find the exact SCN to use in your recovery script.
select max(absolute_fuzzy_change#) fuzz#, max(checkpoint_change#) chkpnt#
from (select file#, completion_time, checkpoint_change#, absolute_fuzzy_change#
from v$backup_datafile
where incremental_level = 0
and trunc(completion_time) = to_date(‘Nov-28-2012′,’MON-DD-YYYY’)
and file# <> 0
order by completion_time desc
);
2. Either You can choose the day in which your level0 or level1 completed and modify appropriately.
select sequence#, to_date(first_time,’DD-MON-YYYY HH24:MI:SS’) first_time, first_change#,
to_date(next_time,’DD-MON-YYYY HH24:MI:SS’) next_time, next_change#
from v$archived_log
where completion_time between to_date(‘Nov-27-2012′,’MON-DD-YYYY’) and Sysdate;
Note: This will return 2 SCN numbers. Pick the greatest of the 2.
3. Or, since checkpoints occur at every log switch, You can then choose an SCN from the exact log switch closest to your recovery needs.
select sequence#, to_date(first_time,’DD-MON-YYYY HH24:MI:SS’) first_time, first_change#,
to_date(next_time,’DD-MON-YYYY HH24:MI:SS’) next_time, next_change# from v$archived_log
where completion_time between to_date(‘Nov-27-2012′,’MON-DD-YYYY’) and Sysdate ;
4. You can use logMinor to find Exact SCN prior to Crash.
BEGIN
DBMS_LOGMNR.add_logfile (
options     => DBMS_LOGMNR.new,
logfilename => ‘/tmp/a55050.arc’);
END;
/
BEGIN
DBMS_LOGMNR.start_logmnr (
starttime => to_date(’10-MAR-2010 18:50:00′,’DD-MON-YYYY HH24:MI:SS’),
endtime => to_date(’10-MAR-2010 19:05:00′,’DD-MON-YYYY HH24:MI:SS’),
options => Dbms_Logmnr.DDL_Dict_Tracking);
END;
/
Now Query with v$logmnr_contents Table and find the exact SCN
SELECT scn, to_char(timestamp,’DD-MON-YYYY HH24:MI:SS’) timest, operation, sql_redo
FROM   v$logmnr_contents
where scn between  21822207692 and 21822211410
order by scn;
Query to find the SCN for Incomplete Recovery After Restore
SELECT MIN(maxnc)
FROM  (SELECT MAX(a.NEXT_CHANGE#) maxnc
FROM V$ARCHIVED_LOG a, V$THREAD t
WHERE a.THREAD# = t.THREAD#
AND a.ARCHIVED=’YES’ AND t.ENABLED=’DISABLED’
GROUP BY a.THREAD#);
Note: You can also examine the alert.log to find the SCN of an event and recover to a prior SCN
if you follow the above steps, you will always be able to “alter database open resetlogs” without error.

Rman Backup with Stored Script

A stored scripts offer an alternative to command files for managing frequently used sequences of RMAN commands. The advantage of a stored script over a command file is that a stored script is always available to any RMAN client that can connect to the target database and recovery catalog, whereas command files are only available if the RMAN client has access to the file system on which they are stored.

A Stored scripts can be global or local. A local stored script is associated with the target database to which RMAN is connected when the script is created, and can only be executed when you are connected to that target database. A global stored script can be run against any database registered in the recovery catalog, if the RMAN client is connected to the recovery catalog and a target database.
Note that to work with stored scripts, even global ones, you must be connected to both a recovery catalog and a target instance.

RMAN TARGET SYS/ORACLE@ORCL3 CATALOG RMAN/RMAN@SHAAN
NOW on RMAN prompt create the desired scripts. Before creating scripts you must run and check the correctness of script.
CREATE GLOBAL SCRIPT full_backup
{
BACKUP DATABASE PLUS ARCHIVELOG;
DELETE OBSOLETE;
}
CREATE SCRIPT global_full_backup
{
BACKUP DATABASE PLUS ARCHIVELOG;
DELETE OBSOLETE;
}
CREATE GLOBAL SCRIPT global_full_backup
COMMENT ‘use only with ARCHIVELOG mode databases’
{
BACKUP DATABASE PLUS ARCHIVELOG;
DELETE OBSOLETE;
}
You can also create local and global script by reading its contents from text file. Before being Create Script you must create “my_script_file.txt” on the given location.
CREATE SCRIPT full_backup FROM FILE ‘D:BackupRmanmy_script_file.txt’;
How to EXECUTE the Script:
RUN
{
EXECUTE SCRIPT full_backup;
}
RUN
{
EXECUTE GLOBAL SCRIPT full_backup;
}
Executing a global script only affects the connected target database; to run a global script across multiple databases, you must connect the RMAN client to each one separately and execute the script. Your script will use the automatic channels configured at the time you execute the script. Use ALLOCATE CHANNEL commands in the script if you need to override the configured channels. Note that, because of the RUN block, if an RMAN command in the script fails, subsequent RMAN commands in the script will not execute.
DISPLAYING THE CONTENTS OF STORED SCRIPTS:
rman> PRINT SCRIPT full_backup;
rman> PRINT SCRIPT full_backup TO FILE ‘my_script_file.txt’;
rman> PRINT GLOBAL SCRIPT global_full_backup;
rman> PRINT GLOBAL SCRIPT global_full_backup TO FILE ‘my_script_file.txt’;
Listing Stored Scripts:
LIST SCRIPT NAMES;
This command displays the names of all stored scripts, both global and local, that can be executed for the currently connected target database: If RMAN is not connected to a target database when the LIST SCRIPT NAMES command is run, then RMAN will respond with an error.
LIST GLOBAL SCRIPT NAMES;
LIST ALL SCRIPT NAMES;
LIST GLOBAL SCRIPT NAMES and LIST ALL SCRIPT NAMES are the only commands that work when RMAN is connected to a recovery catalog without connecting to a target instance.
How to Update Stored Script:
REPLACE SCRIPT full_backup
{
BACKUP DATABASE PLUS ARCHIVELOG;
}
REPLACE GLOBAL SCRIPT global_full_backup
COMMENT ‘A script for full backup to be used with any database’
{
BACKUP AS BACKUPSET DATABASE PLUS ARCHIVELOG;
}
Global scripts can be updated using the REPLACE GLOBAL SCRIPT command when connected to a recovery catalog, as follows:
REPLACE GLOBAL SCRIPT global_full_backup FROM FILE ‘D:BACKUPRMANmy_script_file.txt’;
DELETING STORED SCRIPTS:
DELETE SCRIPT ‘full_backup’;
DELETE GLOBAL SCRIPT ‘global_full_backup’;
DELETE SCRIPT ‘global_full_backup’;
If you use DELETE SCRIPT without GLOBAL, and there is no stored script for the target database with the specified name, RMAN will look for a global stored script by the specified name and delete the global script if it exists.
Starting the RMAN Client and Running a Stored Script:
CMD> rman TARGET SYS/oracle@trgt CATALOG rman/cat@catdb SCRIPT ‘full_backup’;
Restrictions on Stored Script Names:
1. Avoid script name that begins with character other than A-Z or that are the same as reserve word.
2. When starting the RMAN client with the script name argument and if the local and global script name defiend with the same name then rman gives the priority to execute the local script.
3. For the EXECUTE SCRIPT, DELETE SCRIPT and PRINT SCRIPT commands, if the script name passed as an argument is not the name of a script defined for the connected target instance, RMAN will look for a global script by the same name to execute, delete or print.
4. Consider using some naming convetion to avoid mistakes due to confusion between global stored scripts and local stored scripts.
References:
http://docs.oracle.com/cd/B28359_01/backup.111/b28273/rcmsynta014.htm

Database Backup Strategy and Recovery Plan

To protect the database against potential media failures. You must develop proper backup strategy to perform recovery if disk failure damages the datafiles, online redo log files or control files. To develop a proper backup strategy and plan you must decide the type of requirement first, off course to know the type of requirements you need to find the answer’s to the following questions. I hope this article will  help to the fresher DBA who is recently join the company and working in independent mode.

If a media failure happens, is it acceptable to lose any transactions performed between the last backup and the time of failure?

If it is ‘NO’ then you must run the database in “ARCHIVE” mode otherwise you can run in “NOARCHIVE” mode.Link: Convert Database into Archivelog Mode

Does the database need to be available 24 hours a day, 7 days a week?

If it is ‘YES’ you must run the database in ARCHIVE mode and possibly configure the standby feature to minimize the downtime of database otherwise you can run the database in noarchive mode and take consistent (which require the database to be shutdown) whole database backup.

Link: Creating Physical Standyby Database

Will the database ever need to be recovered to a past point in time other than the time a backup was taken?

If it is ‘YES’ you must run the database in ARCHIVE mode other wise you can run the database in Noarchive Mode.
Can you afford downtime to perform a cold backup?

If it is YES please do so generally for a large database the answer is most likely ‘NO’, However there are backup solution in the market that can backup high rate per hour but in fact it is expensive solution.
What is your Recovery Window?
The more data you have, the longer it will take to restore and recover the entire database. You might want to backup more often or employ incremental backups to reduce the recovery window.

Do you have Read Only data?

If YES, backup time can be reduced by skipping READ ONLY tablespace. These only need to be backed up once, after the tablespace is made READ ONLY.

What kind of archived redologs volume can you expect?

The more bytes of redo you generate, the more bytes of redo that you will have to apply in recovery. You can lessen this by backing up more often and/or employing incremental.

Are you required to store the backup in special location?

It is a good idea to have offsite storage for a copy of your backups. In that way, if your entire data center catches fire, you can go offsite to get your data. Many companies are starting to ensure that their data is stored offsite in some form

What kind of Backup solution do you need?

Where the backup eventually ends up can be a big concern. With larger databases, it gets harder to backup directly to a disk device. And with offsite concerns, one probably wants a copy of the backup on tape somewhere.

Strategy to Complete database Recovery:

Before making Recovery Plan we need to identify all the risks that our data center can face such as: Media recovery, loss or corrupted datafiles, Loss of OS, loss of a disk drive, complete loss of our server etc. Oracle provides number of recovery plan we need to select best one as per our business requirement.
The following combination of backup strategies with RMAN when planning to recover from complete database loss will provide the maximum flexibility for recovery of database. Using this strategy it is possible to recover specific point in time, loss of CRD files as well as spfile. In order to completely restore or recover the lost database all of these items must be available, or, you must have the knowledge required to rebuild them and at last you must maintain some rules of Backup and Recovery:
–        The set of files (redundancy set) required for recovery any Oracle database are: datafiles, control file, or online redo log is called the.

  • – Multiplex the online redo log files and current control file at OS or hardware level. Thus an I/O failure or lost write should only corrupt one of the copies.
  • – Use OS or hardware mirroring for the primary datafiles if possible to avoid having to apply media recovery for simple disk failures.
  • – Keep at least one copy of most recent backup on hard disk.
  • – If your database is stored on a RAID device, then place the redundancy set on a set of devices that is not in the same RAID device.
  • – If you keep backupset on tapes, then maintain at least two copies of the data because tapes can fail. If you are keeping the backupset on disk then keep additional copy on different server.

Note: The complete database consists of datafiles, controlfiles, redolog file along with spfile. You can recover the loss of table from Export backup.

RMAN full offline backup:

This is good solution for large database but it requires down time. It is good to deal the corrupt or missing datafiles. Need to restore only comparably easier, any data entered into the database after the last full backup will be lost. Usually it is taken weekly or monthly basis depending upon the size/space or requirement you have.
How to take Cold Backup with RMAN or Manually

How to Restore From RMAN Cold Backup

Creating Test Envronment DB from Prod DB backup

RMAN full online backup with incremental and cumulative:

This is most preferable method, RMAN does not require down time. Data can be recovered up to the last entry in the online redo logs or possibly restored or recovered up to the date and time of the last entry in the archive log file. You can take incremental and cumulative backup on daily basis and full backup on weekly or monthly basis. Here I am adding two very good script for rman backup on windows environment, which can take upto 120 GB size of database full backup within 25 minutes and daily incremental /cumulative backup within 35 minutes.
Script: Rman Daily/Weekly HotBackup on Windows Env.

Script: Rman Hot Backup on Linux Environment

Script: Rman Cold Backup

EXPORT full backup:

Full exports will allow recovery to the last full export date. Any data entered after the export will not be recoverable. The best source to recover the loss of tables and cross platform migration of database. To recover a database using a full export you must re-create the database and then import the full export into the new database.
Import/Export Utility in Oracle 9i

Restore or Refresh Schema using Import/Export with Toad

Migrating or Transferring data from Oracle 9i to 10g using import/export

Testing the Plan

It is highly recommended to test your plan before actual problem occurs. When your database failed is not good time to test your plan and strategy go through possible causing the types of failure (in Test Environment) you want to be able to recover from and make sure your backup strategy is working.
Benefit of Test Environment:

  1. It requires almost same amount of space as the database or off course same as offline backup.
  2. We can Test or assure our backup validity. In case of disaster some times we came to know our backup is not valid backup for recovery (minimizing risk).
  3. While taking offline backup we have to change the mode of DB (shutdown) each and every time. Test environment will minimize this risk. As all of application and database on the same server (DBF structure not well organized). It is not preferable to change mode of db each time. Suppose our schedule offline backup is terminated in middle for any abnormal power off then DB will remain in any other stats will not available for end user.
  4. We can Test major changing before applying on the production server.
  5. We can replace any corrupt DBF from Test environment DBF and recover the database till the failure using our regular RMAN online backup.
  6. We can synchronize our Test environment with the weekly online production database.
  7. We can Test our application with the Test environment by making sure we can use this server as a production server in case of emergency.
  8. In future we can convert this standby as a mirroring of Production DB. If we have alternative standby server then creating mirroring from prod server risk will minimize.

Data Center having some other complex or expensive solution to improve the data security or avoid disaster without any downtime such as the concept of “DATAGUARD” and “RAC”. Here I am adding some of the Tested Recovery Scenario and will add some more in near future.

Test1: Disaster Recovery from the Scratch

Test2: Complete loss of database

Test3: Cold Backup Recovery
Test4: RMAN- Time Base Recovery

Test5: RMAN- Change-Based (SCN) Recovery

Test6: RMAN duplicate Database Recovery at a Past Point in‎Time

Common Restore or Recovery Technique in Oracle
Important link related to Rman backup:

Benefit of Rman Backup

Point to be considered with Rman Backup script

Format Parameter of Rman Backup

Rman Report and Show Command

Deleting Rman Backup

New Features in Rman since Oracle 9i/10g

Monitoring Rman Backup

List of Complete and Pending Rman Backup

Script to Monitor Rman status while the backup is running

How RMAN behave with the allocated channel during backup

An RMAN channel corresponds to one server session (default) used in Rman backup, Restore and recovery operation. Each channel establishes a connection from the RMAN client to a target or auxiliary database instance by starting a server session on the instance. The following statement uses parallelization, one RMAN BACKUP command backs up all datafiles of database, with all three channels in use. All the three channels are concurrently active first two channel (server session) copies 5-5 datafiles and last channel rest of 3 file in the total of 13 file in the orcl3 database.

run {

allocate channel d1 type disk FORMAT “D:BACKUPRMAN_BACKUPHOTBACKUPMonthly_%T_FULL_%d-%s_%p.db”;

allocate channel d2 type disk FORMAT “D:BACKUPRMAN_BACKUPHOTBACKUPMonthly_%T_FULL_%d-%s_%p.db”;

allocate channel d3 type disk FORMAT “D:BACKUPRMAN_BACKUPHOTBACKUPMonthly_%T_FULL_%d-%s_%p.db”;

backup

incremental level 0

tag Orcl3_FullDB_backup

filesperset 10

FORMAT “D:BACKUPRMAN_BACKUPHOTBACKUPMonthly_%T_FULL_%d-%s_%p.db”

DATABASE;

}
Check out the following backup logfile generated with above channel description:

allocated channel: d1

channel d1: sid=16 devtype=DISK

allocated channel: d2

channel d2: sid=13 devtype=DISK

allocated channel: d3

channel d3: sid=18 devtype=DISK

Starting backup at 08-JUL-12

channel d1: starting incremental level 0 datafile backupset

channel d1: specifying datafile(s) in backupset

input datafile fno=00001 name=C:ORACLE1ORADATAORCL3SYSTEM01.DBF

input datafile fno=00002 name=C:ORACLE1ORADATAORCL3UNDOTBS01.DBF

input datafile fno=00005 name=C:ORACLE1ORADATAORCL3EXAMPLE01.DBF

input datafile fno=00012 name=C:ORACLE1ORADATAORCL3LOGMNRTS.DBF

input datafile fno=00010 name=C:ORACLE1ORADATAORCL3XDB01.DBF

channel d1: starting piece 1 at 08-JUL-12

channel d2: starting incremental level 0 datafile backupset

channel d2: specifying datafile(s) in backupset

input datafile fno=00006 name=C:ORACLE1ORADATAORCL3INDX01.DBF

input datafile fno=00013 name=C:ORACLE1ORADATAORCL3OEM_REPOSITORY.DBF

input datafile fno=00009 name=C:ORACLE1ORADATAORCL3USERS01.DBF

input datafile fno=00003 name=C:ORACLE1ORADATAORCL3CWMLITE01.DBF

input datafile fno=00004 name=C:ORACLE1ORADATAORCL3DRSYS01.DBF

channel d2: starting piece 1 at 08-JUL-12

channel d3: starting incremental level 0 datafile backupset

channel d3: specifying datafile(s) in backupset

input datafile fno=00011 name=C:ORACLE1ORADATAORCL3MAIN_DBF01.DBF

input datafile fno=00007 name=C:ORACLE1ORADATAORCL3ODM01.DBF

input datafile fno=00008 name=C:ORACLE1ORADATAORCL3TOOLS01.DBF

channel d3: starting piece 1 at 08-JUL-12

channel d2: finished piece 1 at 08-JUL-12

piece handle=D:BACKUPRMAN_BACKUPHOTBACKUPMONTHLY_20120708_FULL_ORCL3-1452_1.DB comment=NONE

channel d2: backup set complete, elapsed time: 00:00:52

channel d3: finished piece 1 at 08-JUL-12

piece handle=D:BACKUPRMAN_BACKUPHOTBACKUPMONTHLY_20120708_FULL_ORCL3-1453_1.DB comment=NONE

channel d3: backup set complete, elapsed time: 00:02:43

channel d1: finished piece 1 at 08-JUL-12

piece handle=D:BACKUPRMAN_BACKUPHOTBACKUPMONTHLY_20120708_FULL_ORCL3-1451_1.DB comment=NONE

channel d1: backup set complete, elapsed time: 00:03:05

Finished backup at 08-JUL-12

Starting Control File and SPFILE Autobackup at 08-JUL-12

piece handle=D:BACKUPRMAN_BACKUPHOTBACKUPC-691421794-20120708-06 comment=NONE

Finished Control File and SPFILE Autobackup at 08-JUL-12

released channel: d1

released channel: d2

released channel: d3

Whether you allocate channels manually or automatically, you can use channel control commands and options to do the following:

  • Control the operating system resources RMAN uses when performing RMAN operations
  • Affect the degree of parallelism for a backup or restore command
  • Set limits on I/O bandwidth consumption (CONFIGURE CHANNEL…RATE)
  • Set limits on the size of backup pieces (with the MAXPIECESIZE parameter)
  • Send vendor-specific commands to the media manager (SEND)
  • Specify vendor-specific parameters for the media manager (CONFIGURE CHANNEL … PARMS)
  • Specify which instance performs the operation (CONFIGURE CHANNEL … CONNECT)

Factor affecting the number and size of backup sets                                                                      

  • The number of input files specified in eachbackupSpec clause
  • The number of channels that you allocate
  • The default number of files in each backup set (4 for datafiles and 16 for archived logs)
  • The default number of files read simultaneously by a single channel (8)
  • TheMAXSETSIZE parameter (specified on the CONFIGURE and BACKUP commands), which specifies a maximum backup set size

The most important rules for allocated channels in backup set creation are:

  • Each allocated channel that performs work in the backup job (not idle) generates at least one backup set.
  • RMAN always tries to divide the backup load so that all allocated channels have roughly the same amount of work to do.
  • The default number of datafiles in each backup set is determined by an internal RMAN limit (16 for archived logs, 4 for datafiles).
  • The number of datafiles multiplexed in a backup set is limited by the lesser of the number of files in each backup set and the default number of files read by a single channel simultaneously (8).
  • The maximum size of a backup set is determined by theMAXSETSIZE parameter of the CONFIGURE or BACKUP command.

So look at the issue shared by Helios blog having issue on Oracle version 11.2.0.1 database (2 node RAC system) on AIX 6.1. System is already around 25T and our full backup size almost 3,7T with using compress option. Number of datafiles are almost 1054. Total full backup duration is almost 4, 5 hours using 16 channels.

Issue: Our backup duration became almost 7,8 hours last 3 days. We were investigate backup log file and We noticed backup start 16 channels and that 16 channel backupsets job done almost 4:30 hours. After 16 channels backupsets end one new channel opening and it keep taking backupsets by using 1 channel.

Solution: We investigate issue with Oracle support. Here is the explanation of this behavior RMAN will try to balance the work load such that each channel processes the same amount of data. 17 backupsets are produced; each channel is processing 61-62 datafiles so you have 1054 datafiles. RMAN will initially take the no# files and divide by no# allocated channels – this gives 65.8. The upper limit for filesperset is 64 so: 1054/64 = 16.46 so Rman will create 17 backupsets. As you have 16 channels allocated the first 16 channel will process concurrently and the 17th backupset will inevitably run in isolation. If you allocate 18 channels you will get filesperset of 59 and all 18 channels should process concurrently.

So we increased our number of allocate channels for backup and Issue has been resolved.

References:

http://docs.oracle.com/cd/B12037_01/server.101/b10734/rcmconc1.htm#1012866

How to recover and open the database if the archive log required for recovery is missing.

 I came across a situation while restoring and recovering the one of our sister’s company database first time with the old backup, our restore process goes well and we were able to recreate the controlfile during recovery process, it is asking archive logs by giving error “archive log required for recovery is missing”. I tried all the possible location to find another valid copy of archivelog such as directories defined in the LOG_ARCHIVE_DEST, another directory in the same server or another server, RMAN backup, OS backup but I did not find required archivelog backup. When I consult with our team member, I came to know they are deleting the archivelog manually not having required archivelog. Apart from that I also tried other possible solution to open the database (discuss below) but fail to do. Finally with the help of some expert I tried hidden parameter “_ALLOW_RESETLOGS_CORRUPTION=TRUE” which will force to open database even though it is not properly recovered but unfortunately instance crashed after opening the database, I found our undo tablespace get corrupted.

Caution: Please do not using any hidden parameters without consulting oracle support as it is not recommended by oracle. When I read the Metal ink it clearly states there is no 100% guarantee that setting “_ALLOW_RESETLOGS_CORRUPTION=TRUE” will open the database because it allows to open database without consisting checks. This may results sometimes in corrupted database.

After digging more to resolve undo corruption I changes undo management to “Manual” in “initial parameter file” and able to open the database successfully.

Then after created new undo tablespace and dropped corrupted one. Finally changed back undo management to “AUTO’ and undo Tablespace to Newly created Undo tablespace. Thanks to support by expert we are able to resolve this issue and database is running fine.

Caution: Please do not try this concept on production system. Try on test environment makes sure you have valid backup before performing.

Steps:

  1. Set “_ALLOW_RESETLOGS_CORRUPTION=TRUE” in pfile.
  2. Startup Mount
  3. Recover database
  4. Alter database open Resetlogs.
  5. Set Undo_Management to “MANUAL” in pfile.
  6. Startup database
  7. Create new UNDO tablespace
  8. Change back Undo_Management to “AUTO” and Undo_Tablespace to Newly Create UNDO Tablespace
  9. Restart the instance;

Note: Do not forget to the full database backup again after successfully performing the above operation

Possible Solution: archivelog required for recovery is either missing, lost or corrupted or not found any other alternative backup location

If the archivelog is not found even in any of the alternative backup location then check the SCN (System Change Number) of the datafiles as well as whether the log sequence# required for the recovery is still available in the online redologs. It is one of the approach or strategy to recover and open the database in case of missing archivelog (rare chances of presence of required log sequence but you can try). For the SCN of the datafiles, it is important to know the mode of the database when the datafiles are backed up. That is whether the database is open, mounted or shutdown (normally) when the backup is taken.

If the datafiles are restored from an online or hot backup, which means that the database is open when the backup is taken, then we must apply at least the archivelog(s) or redolog(s) whose log sequence# are generated from the beginning and until the completion of the said backup that was used to restore the datafiles.

And if the datafiles are restored from an offline or cold backup, and the database is cleanly shutdown before the backup is taken, that means that the database is either not open, is in nomount mode or mounted when the backup is taken, then the datafiles are already synchronized in terms of their SCN. In this case we can open the database without even applying archivelogs, because the datafiles are already in a consistent state (except if there is a requirement to roll the database forward to a point-in-time after the said backup is taken).

The main thing here is to ensure that all of the online datafiles are synchronized in terms of their SCN before we can normally open the database.

select status, checkpoint_change#, to_char(checkpoint_time, ‘DD-MON-YYYY HH24:MI:SS’) as checkpoint_time, count(*)

from v$datafile_header

group by status, checkpoint_change#, checkpoint_time

order by status, checkpoint_change#, checkpoint_time;

the above query will determine whether the datafiles are synchronized or not.

Note: Querying the V$DATAFILE_HEADER to know the SCN recorded in the header of the physical datafile and not the V$DATAFILE (which derives the information from the controlfile).

STATUS  CHECKPOINT_CHANGE# CHECKPOINT_TIME        COUNT(*)

——- —————— ——————– ———-

ONLINE             2778991 26-AUG-2012 05:00:51          4

ONLINE             2778993 26-AUG-2012 05:00:55          4

ONLINE             2778996 26-AUG-2012 05:01:01          3

The results of the above query must return one and only one row for the online datafiles which means that they are already synchronized in terms of their SCN. Otherwise, if the results return more than one row for the online datafiles, then the datafiles are still not synchronized yet. In this case, we need to apply archivelog(s) or redolog(s) to synchronize all of the online datafiles.

Note: The CHECKPOINT_TIME in the V$DATAFILE_HEADER, which indicates the date and time how far the datafiles have been recovered.

The results of the query above may return some offline datafiles. So, please ensure that all of the required datafiles are online because we may not be able to recover later the offline datafile once we open the database in resetlogs ( In 10g onward we can recover the database beyond resetlogs due to the introduction of the format “%R” in the LOG_ARCHIVE_FORMAT).

select file#, name from v$datafile

where file# in (select file# from v$datafile_header

where status=’OFFLINE’);

If we are still having required log sequence# in the online redologs and the corresponding redolog member is still physically existing on disk, then we may apply them directly instead of the archivelog. To confirm, query:

select LF.member, L.group#, L.thread#, L.sequence#, L.status,

L.first_change#, L.first_time, DF.min_checkpoint_change#

from v$log L, v$logfile LF,

(select min(checkpoint_change#) min_checkpoint_change#

from v$datafile_header

where status=’ONLINE’) DF

where LF.group# = L.group#

and L.first_change# >= DF.min_checkpoint_change#;

Apply the redolog member returns from above query during recovery. Otherwise, if the above query returns no rows then try to apply each of the redolog members one by one during the recovery.

select * from v$logfile;;

If you have tried to apply all of the online redolog members instead of an archivelog during the recovery, but you always received the ORA-00310 error, as shown in the example below, then the log sequence# required for recovery is no longer available in the online redolog.

ORA-00279: change 187169545 generated at 26/08/2012 08:27:16 needed for thread 1
ORA-00289: suggestion : +ORABACK
ORA-00280: change 187169545 for thread 1 is in sequence #124

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
+ORABACK/ISSCO/group_3.458.503422112
ORA-00310: archived log contains sequence 185; sequence 124 required
ORA-00334: archived log: ‘+ORABACK/ISSCO/group_3.458.503422112’

After trying all of the possible solutions but you still cannot open the database due to missing archivelog or the corresponding log sequence# is no longer available in the online redolog, they are already overwritten during the redolog switches, then we cannot normally open the database, as the datafiles are in an inconsistent state then use the below option as a final step to open the database.

Option1: Restore the database from the available backup, and recover the database by applying up to the last available archivelog. In this case you will only recover the database up to the last archivelog that is applied and any data after that are lost. If no archivelogs are applied at all then you are able to recover at a restored backup point of time. However, if we restored from an online or hot backup, then we may not be able to open the database, because we still need to apply the archivelogs generated during the available backup in order to synchronize the SCN of the datafiles before we can normally open the database.

Option2: Force open the database by setting above mentioned hidden parameters in the init.ora parameter. Note that you can only do this under the guidance of Oracle Support with a service request. But there is no 100% guarantee that this will open the database.

How to fix undo block corruption (UNDOTBS) of database

One of users written today that he got the undo block corruption while he is trying to start the database.  The reason he explain that due to power fluctuation the database is abnormally shutdown. When he checks the alert log file then finds the following error:

ORA-01578: ORACLE data block corrupted (file # 8, block # 39)
ORA-01110: data file 8: ‘E:ORACLEPRODUCT10.1.0ORADATASAD1UNDOTBS01.DBF’

Error 1578 happened during db open, shutting down database
USER: terminating instance due to error 1578
Errors in file E:oracleproduct10.1.0adminsad1bdumpsad1_arc3_1400.trc:
The Primarily you have to check the alert log file for any type of error or corruption or you can use the utility dbv (database Verify utility) which provides a mechanism to validate the structure of oracle data files at the operating system level. You can use it on regular basis to inspect data files for sings of corruption but you can use this utility only with datafiles not with control or redolog file.

DBVERIFY – Verification starting: FILE = E:ORACLEPRODUCT10.1.0ORADATASAD1UNDOTBS01.DBF

DBV-00200: Block, dba 887245421, already marked corrupted

DBVERIFY – Verification complete

Total Pages Examined         : 104340

Total Pages Processed (Data) : 0

Total Pages Failing   (Data) : 0

Total Pages Processed (Index): 0

Total Pages Failing   (Index): 0

Total Pages Processed (Other): 101216

Total Pages Processed (Seg)  : 0

Total Pages Failing   (Seg)  : 0

Total Pages Empty            : 3124

Total Pages Marked Corrupt   : 1

Total Pages Influx           : 0

Highest block SCN            : 2144459844 (15.24)

Step1: If you have Cold backup

Connect with sys as sysdba

Shutdown immediate;

Startup mount;

Restore datafile 8;

Recover datafile 8;

Alter database open;

Step2: If you have RMAN backup

rman target sys/oracle@sad1.world

blockrecover datafile 8 block 22;

Step3: If you do not have backup

Create another tablespace

Create or edit pfile and change tablespace as:

Create pfile=’d:sad_pfile.sql’ from spfile’;

undo_management=manual

undo_tablespace=UNDOTBS2
Startup Nomount pfile=’d:sad_pfile.sql’;
Startup mount;
alter database open;
Create spfile from pfile=’d:sad_pfile.sql’;

Now you are able to open the database as well as connect the schema but you are not able to perform any DDL and DML operation. This is due to the old rollback segements are still active and either needs recovery  or offline or drop the those segments. If you try to create any table within the connected schema will return the following error:

SQL> connect hrms/hrms

Connected.

SQL> Create table payment_master_test (ser number);

ORA-00604: error occurred at recursive SQL level 1

ORA-01552: cannot use system rollback segment for non-system tablespace ‘HRMS’

ORA-06512: at line 19

Errors in file D:oracleora92adminsad1bdump/sad1_smon_21134.trc:

ORA-00604: error occurred at recursive SQL level 1

ORA-00376: file 238 cannot be read at this time

ORA-01110: data file 238: ‘E:f4oradatadwnonundotbs201.dbf’;

Try to find the old online rollback segment which needs recovery and make it offline one by one, thus will be easier to drop it.

SQL> Select segment_name, status from dba_rollback_segs where tablespace_name=’UNDOTBS2′ and status = ‘NEEDS RECOVERY’;

SEGMENT_NAME                   STATUS

—————————— —————-

_SYSSMU4$                      NEEDS RECOVERY

_SYSSMU5$                      NEEDS RECOVERY

_SYSSMU6$                      NEEDS RECOVERY

_SYSSMU7$                      NEEDS RECOVERY

_SYSSMU8$                      NEEDS RECOVERY

_SYSSMU9$                      NEEDS RECOVERY

_SYSSMU10$                     NEEDS RECOVERY

SQL>alter rollback segment “_SYSSMU4$” offline;

SQL> drop rollback segment “_SYSSMU4$”;

SQL> drop rollback segment “_SYSSMU5$”;

Once drop all the rollback segments then easily you can drop the old undo tablespace.

SQL> drop tablespace UNDOTBS2 including contents and datafiles;