User Tools

Site Tools


how_to:repairing_corrupted_database

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
how_to:repairing_corrupted_database [2015/01/13 09:00] – [Finding the problem message] paulhow_to:repairing_corrupted_database [2024/01/15 11:35] (current) paul
Line 9: Line 9:
 If that is not possible, then in some cases, you may be able to remove the damaged parts of the database and recover from there while keeping most of the important data intact. This article explains some possible ways to do this. Note that none of these methods are guaranteed, and are done at your own risk. If that is not possible, then in some cases, you may be able to remove the damaged parts of the database and recover from there while keeping most of the important data intact. This article explains some possible ways to do this. Note that none of these methods are guaranteed, and are done at your own risk.
  
-We can help you with these under our chargeable service, but there are no guarantees we will be able to recover anything.+We can help you with these under our chargeable service which costs £40/hr (or part of) for this type of problem, but there are no guarantees we will be able to recover anything.
  
 Before you start you should: Before you start you should:
Line 20: Line 20:
  
 Once you have repaired the database you should ideally perform a manual database backup & restore to ensure there are no other problems before restarting VPOP3. Once you have repaired the database you should ideally perform a manual database backup & restore to ensure there are no other problems before restarting VPOP3.
 +
 +=====Finding the problem table/index=====
 +Often the error message will say something like:
 +
 +  Invalid page header in block x in relation base/16385/512312
 +  
 +These numbers are 'random' so will be different from one installation to another. To determine which database table/index the relation refers to, follow these instructions
 +
 +Go to a command prompt in the VPOP3 directory and run 'psql' (in version 5 only). The default password is 'vpop3pass'
 +
 +Then type (or copy/paste)
 +
 +  select n.nspname AS schema, c.relname AS tablename, c.relkind as kind from pg_class c inner join pg_namespace n on (c.relnamespace=n.oid) where c.relfilenode=<filename>;
 +  
 +In the above example error message <filename> would be replaced by 512312
 +
 +If 'kind' is 'i' then the damaged relation is an index, so a database reindex should fix the problem. If it is 'r' then it is a normal table, or if it is 't' then it is a 'toast' table (see below).
 +
 +If the problem is in a normal table, then depending on the table name, you may be able to delete the table without losing critical data. Please contact support@pscs.co.uk with the table name and we will be able to tell you what is stored in that table, and how to delete it if that is an appropriate action.
  
 =====unexpected chunk number===== =====unexpected chunk number=====
Line 31: Line 50:
 The only solution we know of is to delete the relevant record (message) totally. The only solution we know of is to delete the relevant record (message) totally.
  
-Unfortunately, there is no easy way to find which record is damaged, so you need to use a 'divide & conquer' approach+Unfortunately, there is no easy way to find which record is damaged, so you need to scan the entire table
  
-Go to a command prompt in the VPOP3 directory and run 'psql' (in version 5 only). The default password is 'vpop3pass'+Go to a command prompt in the VPOP3\pgsql\bin directory and run 'psql -U postgres -p 5433 vpop3' (in version 3 or 4 omit the '-p 5433'). The default password is 'pgsqlpass'
  
 ====Finding the problem message==== ====Finding the problem message====
Line 46: Line 65:
     badid INT;     badid INT;
 begin begin
 +DROP TABLE IF EXISTS badids;
 +CREATE TEMP TABLE badids (msgdataid BIGINT);
 FOR badid IN SELECT msgdataid FROM messages.msgdata ORDER BY msgdataid LOOP FOR badid IN SELECT msgdataid FROM messages.msgdata ORDER BY msgdataid LOOP
     curid = curid + 1;     curid = curid + 1;
Line 58: Line 79:
     exception     exception
         when others then         when others then
 +            INSERT INTO badids (msgdataid) VALUES(badid);
             raise notice 'data for message % is corrupt', badid;             raise notice 'data for message % is corrupt', badid;
             continue;             continue;
Line 67: Line 89:
  
 This will read all the messages from the database, and tell you which message(s) it encountered an error with This will read all the messages from the database, and tell you which message(s) it encountered an error with
 +
 +However, sometimes it won't find all problem messages, so it is a good idea to run it again after fixing any problems, until this script doesn't find any more errors. 
 +
 +This script stores the bad message IDs in a temporary table called 'badids', so you can look at the values using 'SELECT * FROM badids;' or use the table to help with other functions if it helps (and you know SQL)
  
 ===Getting message summary info=== ===Getting message summary info===
Line 73: Line 99:
   SELECT subject, messagetime, fromaddr, tolist FROM messages.msgdata WHERE msgdataid = <problem message id>;   SELECT subject, messagetime, fromaddr, tolist FROM messages.msgdata WHERE msgdataid = <problem message id>;
 and and
-  SELECT username, folder FROM messages.foldermessages INNER JOIN users.users ON messages.foldermessages.userid=users.users.username WHERE msgdataid= <problem message id>;+  SELECT username, folder FROM messages.foldermessages INNER JOIN messages.folders USING(folderid) INNER JOIN users.users ON messages.folders.userid=users.users.usernumber WHERE msgdataid = <problem message id>;
  
 this will tell you summary information about the message which may help know which message will been deleted this will tell you summary information about the message which may help know which message will been deleted
Line 86: Line 112:
 This will delete the message from the database, so it should now work OK, but without the message you have just deleted. This will delete the message from the database, so it should now work OK, but without the message you have just deleted.
  
 +In some cases you may need to disable database triggers before doing this. To do this, stop VPOP3 first, then, before doing the DELETE commands do:
 +
 +  ALTER TABLE messages.foldermessages DISABLE TRIGGER USER;
 +
 +After doing the deletions, re-enable the triggers by doing:
 +
 +  ALTER TABLE messages.foldermessages ENABLE TRIGGER USER;
 +  
 +Failure to re-enable the triggers will cause big problems for VPOP3!
 +  
 +In rare cases the database files may be even more corrupted, so you need to run this command before deleting the records:
 +
 +  SET zero_damaged_pages=on;
 +  
 +This should be considered a 'last resort' option - http://www.postgresql.org/docs/9.1/static/runtime-config-developer.html#GUC-ZERO-DAMAGED-PAGES
 ====Rebuild database==== ====Rebuild database====
 After you have done this, we strongly recommend doing a full [[backup_vpop3#manual_database_backup|backup]]/[[restore_a_backup_of_vpop3|restore]] of the database as there will be  other problems which will cause problems at a later date, such as the automatic clean-up processes (meaning the database size will increase uncontrollably). After you have done this, we strongly recommend doing a full [[backup_vpop3#manual_database_backup|backup]]/[[restore_a_backup_of_vpop3|restore]] of the database as there will be  other problems which will cause problems at a later date, such as the automatic clean-up processes (meaning the database size will increase uncontrollably).
 ====Also See==== ====Also See====
 [[PostgreSQL Server won't start due to damaged log files]] [[PostgreSQL Server won't start due to damaged log files]]
how_to/repairing_corrupted_database.1421139611.txt.gz · Last modified: 2018/11/14 10:44 (external edit)