User Tools

Site Tools


how_to:repairing_corrupted_database

This is an old revision of the document!


Repairing corrupted database

With PostgreSQL (as with most high-end databases) the database system will automatically recover after power failures etc, if it can. If it can't then there is no manual 'repair' function, because the data cannot totally be recovered (or it would automatically have been recovered).

Common causes for unrecoverable data errors are: incorrect disk caching, virus scanners or faulty memory modules on the database server.

If you do get an unrecoverable error, the best solution is to restore the latest database backup. However, if you do this, you will lose any data between the database backup and the time when you restore it, so it should be done as soon as possible to avoid excessive data loss.

If that is not possible, then in some cases, you may be able to remove the damaged parts of the database and recover from there while keeping most of the important data intact. This article explains some possible ways to do this. Note that none of these methods are guaranteed, and are done at your own risk.

We can help you with these under our chargeable service, but there are no guarantees we will be able to recover anything.

Before you start you should:

  1. stop the VPOP3 service
  2. stop the VPOP3DB (or PostgreSQL-VPOP3) service
  3. take a full copy of the VPOP3\pgsql\data folder
  4. restart the VPOP3DB (or PostgreSQL-VPOP3) service

Do not restart the VPOP3 service yet.

Once you have repaired the database you should ideally perform a manual database backup & restore to ensure there are no other problems before restarting VPOP3.

unexpected chunk number

If you get an 'unexpected chunk number error for toast value..' in the messages.msgdata table then this means that one (or more) message records have been corrupted.

'toast' records in PostgreSQL are where it stores large data fields. This error means that the links between the main table data and the 'toast' values, or within the 'toast' value itself has been damaged.

A similar problem is an “Invalid page header in block x” error. The same procedure to resolve the problem should be followed in that case as well.

The only solution we know of is to delete the relevant record (message) totally.

Unfortunately, there is no easy way to find which record is damaged, so you need to scan the entire table.

Go to a command prompt in the VPOP3 directory and run 'psql' (in version 5 only). The default password is 'vpop3pass'

Finding the problem message

Copy/paste the following into the psql prompt:

DO $f$
declare
    curid INT := 0;
    vcontent TEXT;
    badid INT;
begin
FOR badid IN SELECT msgdataid FROM messages.msgdata ORDER BY msgdataid LOOP
    curid = curid + 1;
    if curid % 1000 = 0 then
        raise notice '% rows inspected', curid;
    end if;
    begin
        SELECT msgdata
        INTO vcontent
        FROM messages.msgdata where msgdataid = badid;
        vcontent := md5(vcontent);
    exception
        when others then
            raise notice 'data for message % is corrupt', badid;
            continue;
    end;
end loop;
end;
$f$;

This will read all the messages from the database, and tell you which message(s) it encountered an error with

Getting message summary info

Once you find the problem message, if you type:

SELECT subject, messagetime, fromaddr, tolist FROM messages.msgdata WHERE msgdataid = <problem message id>;

and

SELECT username, folder FROM messages.foldermessages INNER JOIN users.users ON messages.foldermessages.userid=users.users.username WHERE msgdataid= <problem message id>;

this will tell you summary information about the message which may help know which message will been deleted

Delete the problem message

Then, type:

DELETE FROM messages.foldermessages WHERE msgdataid= <problem message id>;
DELETE FROM messages.msgdata WHERE msgdataid= <problem message id>;

This will delete the message from the database, so it should now work OK, but without the message you have just deleted.

Rebuild database

After you have done this, we strongly recommend doing a full backup/restore of the database as there will be other problems which will cause problems at a later date, such as the automatic clean-up processes (meaning the database size will increase uncontrollably).

Also See

how_to/repairing_corrupted_database.1432715627.txt.gz · Last modified: 2018/11/14 10:44 (external edit)