Step by Step - How to resolve redo log file corruption using ALTER DATABASE CLEAR UNARCHIVED LOGFILE command



Connect to sys using sqlplus in admin command prompt
Check whether your database is in archive log mode or not. Run command ARCHIVE LOG LIST. It will show you the following details:
It is always preferable to have your database in archive log mode. The reason being if your database is in archive log mode you can recover from all committed changes in the event of an OS or disk failure.
This query will display the path of redo log files, their group and their status. It is always recommended to have a minimum of two members in one group.

As you can see I have only 1 member in group 6 whose current status is INACTIVE. I intentionally have 1 member to generate a scenario for the sake of this practical.

Now go to the specified path where the redo log member of group 6 is located and open it in a notepad. It will show some symbolic data. Delete couple of lines and save it the original location which is the one we got as a result of query.

Why did we do the above step i.e. why did we delete some lines of the inactive redo log file?
The reason is we cannot corrupt the redo log file of the member whose status is CURRENT. Also since I want to explain what is the danger when you have only 1 member in the group and also about the risks of the command which we will be using to restart our database.

Another question is since the status of the member whose redo log file we just corrupted is inactive, then how are we going to check or confirm whether the redo log file is actually corrupt or not and what impact it is going to have on our database?

We certainly cannot wait till that member becomes current to get the error, if any. So this is what we are going to do: We will force the database to switch the log file using
Keep repeating above two commands till the member of group 6 becomes CURRENT while status of other members become INACTIVE.

When it becomes CURRENT, Run SHUT IMMEDIATE and then STARTUP. You should get the following error/s:
But this is one of the most common errors you will get when learning database. So how are we going to know that our redo log file is corrupted and that is the actual reason why we cannot startup our database?

I know I know you are all freaked out since the database isn’t starting... In such case you should recite the vintage and famous magical words:
ABRA KA DABRA!!!

Keep reciting those 3 words till you realize that there is a beautiful text document which is located in the following location
It is the alert_admin (or alert_SID where SID stands for System Identifier. For example if your database name is orcl then the name will be alert_orcl) text document which records or logs everything that’s happening within your database.

Open that document in Notepad (for Windows) or VI (for Linux), then scroll down till the end (or press CTRL+END) and look for the following errors:
And thus we have successfully corrupted our redo log file and the impact of that is we are not able to startup our database! YIPEEE!! Wait a min I don’t think we should be happy about it.................

Let’s take a look at the following steps to resolve these errors:

Since we cannot startup database in current session, open another session and login as sys. We will be connected to an idle instance. Then start the database in mount mode using:
Then run one of the most dangerous DBA's commands you will ever have to run which is:
BUT WAIT!!!! Before running the above command let's first understand WHY this(above) particular command?
(One of the reasons why the above command is incomplete....MUAHAHAHAHA.!!!)

Isn’t it easier if we just drop that corrupted redo log file? Or why not run this command:
The reason is: If the status of your redo log file is CURRENT then there is no archive log file created of that particular redo log file. Hence we cannot run ALTER DATABASE CLEAR LOGFILE.
To check whether your redo log file is archived or not you can run the following queries:
You can see that status of redo log files in group 2 is CURRENT (in 1st query) and a big NO in front of group 2(in 2nd query. Also displays the status, which is CURRENT).
And also database won’t allow you to drop the redo log file whose status is CURRENT. If you try to drop the redo log file whose status is CURRENT it will give you an error. Take a look at the following example:
Here status of members of group 2 is CURRENT. Now run the following command:
But then you will say "AHAHAHA I can change the status of group from CURRENT to INACTIVE by forcing a log switch and then drop the file".

Well BOOO!!! It's not going to help since you will have to drop the entire group. The reason being you cannot just drop the redo log file if it’s the single-lonely member in the group. Take a look at the following example:
As you can see I have only one member in group 6 whose status is CURRENT. I have already explained and shown why we cannot drop redo log file when its status is CURRENT.
Now I will force a log switch using:

As you can see I have forced the log switch and changed the status from CURRENT to INACTIVE. Now look and understand carefully what happens when I try to drop redo log file or member of group 6.
It throws an error saying "cannot remove last log member" meaning you cannot drop redo log member if it is the only member remaining in the group.

Then comes the next question: Can I drop the entire group? Or What happens if I drop the entire group?

The answer to that is it will work but it’s not recommended. The reason being your redo log file of that group is corrupted so you are losing your data anyways. Also if you want the group back you will have recreate the group and add a redo log member thus increasing your own efforts.

So I guess it's time to go back to the most dangerous DBA's command and see what happens after we execute it. Before you execute check your archive log list (You will know the reason later on) and I am going to repeat the steps I mentioned above about corrupting the redo log file and then execute the command. (Geez!! Be patient.)

Step 1: Check the status of your redo log files.
Step 2: Corrupt the redo log member of the inactive group and for this practical only choose the group which has only member in it.

Step 3: As mentioned earlier you won’t see the error of corruption directly on the your screen. Your might get the following error on your screen:
You will be able to see the actual error in the alert log document(Please scroll above for the information about alert log document.)
Depending upon how much you have corrupted your redo log file or which lines you have deleted you should get following errors in the alert log document:
Looking at the error you can make a wild guess that you have deleted something in the redo log file thus causing the reduction in its size.

Step 4: Then perform the following steps:
As you can see database starts in MOUNT mode and then shows the error, which means you cannot start the database in OPEN mode. But you can start the database in MOUNT mode which is exactly what we need to run the required command.

Step 5: Connect to sys since you have shut down the instance. It will connect to an idle instance.
Your database has started in MOUNT mode.

Step 6: Now run the following DBA command:
This is the complete command. The reason I have given GROUP 6 is because I have corrupted the redo log member of this group and no archive log of this redo log member is created since its status is ACTIVE.

Step 7: Check the status of your redo log file by running the following commands:
As you can see the status of the group which we have cleared is now UNUSED which is group 6 in this case. Now let’s run the same query but this time we will check sequence of the group member too.
And as you can see the sequence of the UNUSED redo log group is 0.

Step 8: Now we will run switch log file command to change the current status which is UNUSED.
OOPSSS !!! Looks like our database is not in OPEN mode. So to change the mode to OPEN mode run the following command:
Now we will try to switch log file again.
Thus we need the database to be in OPEN mode to execute the above command.
Let's check the status and sequence of the redo log file again by executing the following command:
As you can see the status of the redo log file of group 6 has now changed to CURRENT from UNUSED and we also have new sequence generated which is 2657.

Step 9: Then run the following command:
You will get a long list. Just scroll to the bottom till you see a certain sequence whose name is nothing but a blank space. In this case look for sequence# 2652. You will find a blank space before it. The reason for the blank space is that there was no archive log created of that particular log file and hence we had to run clear unarchived log file command.

Now think carefully about what will happen if a situation arises where you are required to recover your database and restore it and you don’t have the backup after executing clear unarchived log file command?

Your backup, restore and recovery will work fine till the sequence# 2651 but when it reaches sequence# 2652 there will be a big mess. Since the redo log file wasn’t archived you will lose the data from that redo log file and we have seen earlier that we need redo log file for recovery in case of media failure. And in this case since we have no archive log of this redo log file we won’t be able to recover the database after sequence# 2651.

Thus TAKE BACKUP of the whole database as soon as you execute the following command and have followed the above steps:

If you have followed the above steps perfectly then you should be able to perform this practical without any problems.

Comments

Post a Comment

Popular posts from this blog

Oracle Architecture - Explained In Detailed - Administration I

Oracle Database SQL - Practise - Question - Scott Schema Examples