This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revisionLast revisionBoth sides next revision | ||
how_to:repairing_corrupted_database [2014/07/17 11:25] – [unexpected chunk number] paul | how_to:repairing_corrupted_database [2023/09/08 13:52] – [Finding the problem message] paul | ||
---|---|---|---|
Line 9: | Line 9: | ||
If that is not possible, then in some cases, you may be able to remove the damaged parts of the database and recover from there while keeping most of the important data intact. This article explains some possible ways to do this. Note that none of these methods are guaranteed, and are done at your own risk. | If that is not possible, then in some cases, you may be able to remove the damaged parts of the database and recover from there while keeping most of the important data intact. This article explains some possible ways to do this. Note that none of these methods are guaranteed, and are done at your own risk. | ||
- | We can help you with these under our chargeable service, but there are no guarantees we will be able to recover anything. | + | We can help you with these under our chargeable service |
Before you start you should: | Before you start you should: | ||
Line 20: | Line 20: | ||
Once you have repaired the database you should ideally perform a manual database backup & restore to ensure there are no other problems before restarting VPOP3. | Once you have repaired the database you should ideally perform a manual database backup & restore to ensure there are no other problems before restarting VPOP3. | ||
+ | |||
+ | =====Finding the problem table/ | ||
+ | Often the error message will say something like: | ||
+ | |||
+ | Invalid page header in block x in relation base/ | ||
+ | | ||
+ | These numbers are ' | ||
+ | |||
+ | Go to a command prompt in the VPOP3 directory and run ' | ||
+ | |||
+ | Then type (or copy/paste) | ||
+ | |||
+ | select n.nspname AS schema, c.relname AS tablename, c.relkind as kind from pg_class c inner join pg_namespace n on (c.relnamespace=n.oid) where c.relfilenode=< | ||
+ | | ||
+ | In the above example error message < | ||
+ | |||
+ | If ' | ||
+ | |||
+ | If the problem is in a normal table, then depending on the table name, you may be able to delete the table without losing critical data. Please contact support@pscs.co.uk with the table name and we will be able to tell you what is stored in that table, and how to delete it if that is an appropriate action. | ||
=====unexpected chunk number===== | =====unexpected chunk number===== | ||
Line 31: | Line 50: | ||
The only solution we know of is to delete the relevant record (message) totally. | The only solution we know of is to delete the relevant record (message) totally. | ||
- | Unfortunately, | + | Unfortunately, |
- | Go to a command prompt in the VPOP3 directory and run ' | + | Go to a command prompt in the VPOP3\pgsql\bin |
====Finding the problem message==== | ====Finding the problem message==== | ||
- | Type: | ||
- | SELECT MAX(msgdataid) FROM messages.msgdata; | + | Copy/ |
- | + | ||
- | This will tell you the biggest msgdataid value. Take a note of that. For this example, assume this is 12000. | + | |
- | Then, divide the biggest number by 2 (6000 in this sample) and type | + | < |
+ | DO $f$ | ||
+ | declare | ||
+ | curid INT := 0; | ||
+ | vcontent TEXT; | ||
+ | badid INT; | ||
+ | begin | ||
+ | FOR badid IN SELECT msgdataid FROM messages.msgdata ORDER BY msgdataid LOOP | ||
+ | curid = curid + 1; | ||
+ | if curid % 1000 = 0 then | ||
+ | raise notice '% rows inspected' | ||
+ | end if; | ||
+ | begin | ||
+ | SELECT msgdata | ||
+ | INTO vcontent | ||
+ | FROM messages.msgdata where msgdataid = badid; | ||
+ | vcontent := md5(vcontent); | ||
+ | exception | ||
+ | when others then | ||
+ | raise notice 'data for message % is corrupt', | ||
+ | continue; | ||
+ | end; | ||
+ | end loop; | ||
+ | end; | ||
+ | $f$; | ||
+ | </ | ||
- | SELECT SUM(LENGTH(msgdata)) FROM messages.msgdata WHERE msgdataid BETWEEN 1 AND 6000; | + | This will read all the messages |
- | + | ||
- | If that works OK, then you know the bad record is from 6001 to 12000, but if it gives an error, you know it is between 1 and 6000. | + | |
- | Then, divide the range appropriately down until you have isolated the damaged message. For example if we check | + | However, sometimes it won't find all problem messages, so it is a good idea to run it again after fixing any problems, until this script doesn' |
- | * BETWEEN 1 and 6000 - error (so must be 1 to 6000) | + | |
- | * BETWEEN 1 and 3000 - OK (so must be 3001 to 6000) | + | |
- | * BETWEEN 3001 and 4500 - error (so must be 3001 to 4500) | + | |
- | * BETWEEN 3001 and 3750 - OK (so must be 3751 to 4500) | + | |
- | * BETWEEN 3751 and 4125 - OK (so must be 4126 to 4500) | + | |
- | * BETWEEN 4126 and 4313 - error (so must be 4126 to 4313) | + | |
- | etc | + | |
===Getting message summary info=== | ===Getting message summary info=== | ||
Line 62: | Line 94: | ||
SELECT subject, messagetime, | SELECT subject, messagetime, | ||
and | and | ||
- | SELECT username, folder FROM messages.foldermessages INNER JOIN users.users ON messages.foldermessages.userid=users.users.username | + | SELECT username, folder FROM messages.foldermessages |
this will tell you summary information about the message which may help know which message will been deleted | this will tell you summary information about the message which may help know which message will been deleted | ||
- | |||
- | ===If you can't find the problem message=== | ||
- | Sometimes the above queries won't find the problem message, because they are just retrieving the lengths of messages, so if that metadata is still valid, the database server may not read the actual message content, and thus not encounter the message. In that case, it will take longer to find the problem as you will need to get the entire message contents, eg: | ||
- | |||
- | SELECT msgdata FROM messages.msgdata WHERE msgdataid BETWEEN 1 AND 6000; | ||
Line 80: | Line 107: | ||
This will delete the message from the database, so it should now work OK, but without the message you have just deleted. | This will delete the message from the database, so it should now work OK, but without the message you have just deleted. | ||
+ | In some cases you may need to disable database triggers before doing this. To do this, stop VPOP3 first, then, before doing the DELETE commands do: | ||
+ | |||
+ | ALTER TABLE messages.foldermessages DISABLE TRIGGER USER; | ||
+ | |||
+ | After doing the deletions, re-enable the triggers by doing: | ||
+ | |||
+ | ALTER TABLE messages.foldermessages ENABLE TRIGGER USER; | ||
+ | | ||
+ | Failure to re-enable the triggers will cause big problems for VPOP3! | ||
+ | | ||
+ | In rare cases the database files may be even more corrupted, so you need to run this command before deleting the records: | ||
+ | |||
+ | SET zero_damaged_pages=on; | ||
+ | | ||
+ | This should be considered a 'last resort' | ||
====Rebuild database==== | ====Rebuild database==== | ||
- | After you have done this, we strongly recommend doing a full [[backup_vpop3# | + | After you have done this, we strongly recommend doing a full [[backup_vpop3# |
====Also See==== | ====Also See==== | ||
[[PostgreSQL Server won't start due to damaged log files]] | [[PostgreSQL Server won't start due to damaged log files]] |