This is an old revision of the document!
With PostgreSQL (as with most high-end databases) the database system will automatically recover after power failures etc, if it can. If it can't then there is no manual 'repair' function, because the data cannot totally be recovered (or it would automatically have been recovered).
Common causes for unrecoverable data errors are: incorrect disk caching, virus scanners or faulty memory modules on the database server.
If you do get an unrecoverable error, the best solution is to restore the latest database backup. However, if you do this, you will lose any data between the database backup and the time when you restore it, so it should be done as soon as possible to avoid excessive data loss.
If that is not possible, then in some cases, you may be able to remove the damaged parts of the database and recover from there while keeping most of the important data intact. This article explains some possible ways to do this. Note that none of these methods are guaranteed, and are done at your own risk.
We can help you with these under our chargeable service, but there are no guarantees we will be able to recover anything.
Before you start you should:
Do not restart the VPOP3 service yet.
Once you have repaired the database you should ideally perform a manual database backup & restore to ensure there are no other problems before restarting VPOP3.
If you get an 'unexpected chunk number error for toast value..' in the messages.msgdata table then this means that one (or more) message records have been corrupted.
'toast' records in PostgreSQL are where it stores large data fields. This error means that the links between the main table data and the 'toast' values, or within the 'toast' value itself has been damaged.
A similar problem is an “Invalid page header in block x” error. The same procedure to resolve the problem should be followed in that case as well.
The only solution we know of is to delete the relevant record (message) totally.
Unfortunately, there is no easy way to find which record is damaged, so you need to use a 'divide & conquer' approach.
Go to a command prompt in the VPOP3 directory and run 'psql' (in version 5 only). The default password is 'vpop3pass'
Type:
SELECT MAX(msgdataid) FROM messages.msgdata;
This will tell you the biggest msgdataid value. Take a note of that. For this example, assume this is 12000.
Then, divide the biggest number by 2 (6000 in this sample) and type
SELECT SUM(LENGTH(msgdata)) FROM messages.msgdata WHERE msgdataid BETWEEN 1 AND 6000;
If that works OK, then you know the bad record is from 6001 to 12000, but if it gives an error, you know it is between 1 and 6000.
Then, divide the range appropriately down until you have isolated the damaged message. For example if we check
etc
Once you find the problem message, if you type:
SELECT subject, messagetime, fromaddr, tolist FROM messages.msgdata WHERE msgdataid = <problem message id>;
and
SELECT username, folder FROM messages.foldermessages INNER JOIN users.users ON messages.foldermessages.userid=users.users.username WHERE msgdataid= <problem message id>;
this will tell you summary information about the message which may help know which message will been deleted
Sometimes the above queries won't find the problem message, because they are just retrieving the lengths of messages, so if that metadata is still valid, the database server may not read the actual message content, and thus not encounter the message. In that case, it will take longer to find the problem as you will need to get the entire message contents, eg:
SELECT msgdata FROM messages.msgdata WHERE msgdataid BETWEEN 1 AND 6000;
Then, type:
DELETE FROM messages.foldermessages WHERE msgdataid= <problem message id>; DELETE FROM messages.msgdata WHERE msgdataid= <problem message id>;
This will delete the message from the database, so it should now work OK, but without the message you have just deleted.
After you have done this, we strongly recommend doing a full backup/restore of the database in case there is any other damage to the database files.