Hi guys,

Yesterday one of three drbd-mounted filesystems became inaccessible (e.g. ls -asl hangs, load >300, etc.).

Kernel messages at the time of the hang:

Sep 22 12:57:01 filera2 crond(pam_unix)[15732]: session closed for user root
Sep 22 12:58:01 filera2 crond(pam_unix)[16114]: session opened for user root by (uid=0)
Sep 22 12:58:01 filera2 nscd: nscd -HUP succeeded
Sep 22 12:58:01 filera2 crond(pam_unix)[16114]: session closed for user root
Sep 22 12:58:03 filera2 kernel: [<f08bfc42>] ext3_write_inode+0x22/0x3f [ext3]
Sep 22 12:58:03 filera2 kernel: [<c0171c82>] write_inode+0x30/0x37
Sep 22 12:58:03 filera2 kernel: [<c0171cf9>] __sync_single_inode+0x70/0x1c1
Sep 22 12:58:03 filera2 kernel: [<c017207c>] sync_sb_inodes+0x1a7/0x274
Sep 22 12:58:03 filera2 kernel: [<c01721da>] writeback_inodes+0x91/0xde
Sep 22 12:58:03 filera2 kernel: [<c01406e9>] balance_dirty_pages+0x7c/0x11c
Sep 22 12:58:03 filera2 kernel: [<f08bdf4e>] ext3_ordered_commit_write+0xb6/0xc5 [ext3]
Sep 22 12:58:03 filera2 kernel: [<c013dcbf>] generic_file_buffered_write+0x39a/0x47c
Sep 22 12:58:03 filera2 kernel: [<f0850d1c>] journal_get_write_access+0x25/0x2c [jbd]
Sep 22 12:58:03 filera2 kernel: [<f08c371a>] __ext3_journal_stop+0x19/0x34 [ext3]
Sep 22 12:58:03 filera2 kernel: [<c0171bbe>] __mark_inode_dirty+0xe2/0x176
Sep 22 12:58:03 filera2 kernel: [<c013e12a>] generic_file_aio_write_nolock+0x389/0x3b7
Sep 22 12:58:03 filera2 kernel: [<c013e263>] generic_file_aio_write+0x72/0xc6
Sep 22 12:58:03 filera2 kernel: [<f08bbd7a>] ext3_file_write+0x19/0x8b [ext3]
Sep 22 12:58:03 filera2 kernel: [<c01560d4>] do_sync_write+0x97/0xc9
Sep 22 12:58:03 filera2 kernel: [<c011f6ee>] autoremove_wake_function+0x0/0x2d
Sep 22 12:58:03 filera2 kernel: [<c026ec53>] release_sock+0xf/0x4f
Sep 22 12:58:03 filera2 kernel: [<c02933ac>] tcp_recvmsg+0x64a/0x681
Sep 22 12:58:03 filera2 kernel: [<c017fa0b>] v2_write_dquot+0xe8/0x128
Sep 22 12:58:03 filera2 kernel: [<c017ca0b>] dquot_commit+0xa2/0xf4
Sep 22 12:58:03 filera2 kernel: [<f08c5c2b>] ext3_write_dquot+0x39/0x4f [ext3]
Sep 22 12:58:03 filera2 kernel: [<c017cf15>] dqput+0x9e/0x117
Sep 22 12:58:03 filera2 kernel: [<c017d97f>] dquot_drop+0x4c/0x80
Sep 22 12:58:03 filera2 kernel: [<f08c5bdc>] ext3_dquot_drop+0x25/0x3b [ext3]
Sep 22 12:58:03 filera2 kernel: [<f08bbfdf>] ext3_free_inode+0xf4/0x31b [ext3]
Sep 22 12:58:03 filera2 kernel: [<f08bfeac>] ext3_mark_iloc_dirty+0x10/0x18 [ext3]
Sep 22 12:58:03 filera2 kernel: [<f08bff6f>] ext3_mark_inode_dirty+0x3a/0x41 [ext3]
Sep 22 12:58:03 filera2 kernel: [<f08bcfa4>] ext3_delete_inode+0x93/0xaa [ext3]
Sep 22 12:58:05 filera2 kernel: [<f08bcf11>] ext3_delete_inode+0x0/0xaa [ext3]
Sep 22 12:58:05 filera2 kernel: [<c016c3fb>] generic_delete_inode+0xa2/0xff
Sep 22 12:58:05 filera2 kernel: [<c016c5bd>] iput+0x5f/0x61
Sep 22 12:58:05 filera2 kernel: [<c0163c30>] sys_unlink+0xd7/0x132
Sep 22 12:58:05 filera2 kernel: [<c0156d94>] fget+0x3b/0x42
Sep 22 12:58:05 filera2 kernel: [<c01653ef>] sys_fcntl64+0x76/0x7d
Sep 22 12:58:05 filera2 kernel: [<c02c7377>] syscall_call+0x7/0xb
Sep 22 12:58:05 filera2 kernel: [<c02c007b>] unix_release_sock+0x15a/0x201
Sep 22 12:59:01 filera2 crond(pam_unix)[16497]: session opened for user root by (uid=0)
Sep 22 12:59:02 filera2 nscd: nscd -HUP succeeded
Sep 22 12:59:02 filera2 crond(pam_unix)[16497]: session closed for user root

Running CentOS 4.1 (2.6.9-11.ELsmp) with drbd version: 0.7.11 (api:77/proto:74).

Because of this hang we failed over, but the other side still had the hung filesystem. I thought: big trouble, but after an fsck it was working normally again. Although the fsck was running verbose, it didn't have to fix anything during the numerous passes. It did mention "filesystem was modified".

Superblock information:

Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          20f873d6-1e7a-4e5e-a0ee-d0af07c71bef
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal resize_inode filetype needs_recovery sparse_super large_file
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              29108032
Block count:              363470617
Reserved block count:     7269412
Free blocks:              38055650
Free inodes:              27042089
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         2624
Inode blocks per group:   82
Filesystem created:       Mon Apr 25 09:46:25 2005
Last mount time:          Thu Sep 22 14:19:58 2005
Last write time:          Thu Sep 22 14:19:58 2005
Mount count:              68
Maximum mount count:      26
Last checked:             Mon Apr 25 09:46:25 2005
Check interval:           15552000 (6 months)
Next check after:         Sat Oct 22 09:46:25 2005
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
Default directory hash:   tea
Directory Hash Seed:      38908950-8ab1-42f4-967d-f924e3378aea
Journal backup:           inode blocks

--Leroy
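
A side note on the superblock dump above: the mount count (68) is well past the maximum mount count (26), and "Last checked" still shows the creation date. The following is only a minimal sketch of how such fields could be checked automatically. The function names (parse_superblock, overdue_for_check) and the trimmed SAMPLE text are made up for illustration; the comparison is my understanding of the mount-count rule, not something taken from the e2fsprogs source.

```python
def parse_superblock(text):
    """Parse 'Key: value' lines as printed in the superblock dump above."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def overdue_for_check(fields):
    """True if the mount count has reached the configured maximum.

    Assumption: a maximum of 0 or -1 means mount-count checking is disabled.
    """
    mounts = int(fields["Mount count"])
    max_mounts = int(fields["Maximum mount count"])
    return max_mounts > 0 and mounts >= max_mounts


# A few fields copied from the dump in this post (hypothetical trimmed sample).
SAMPLE = """\
Filesystem state:         clean
Block count:              363470617
Block size:               4096
Mount count:              68
Maximum mount count:      26
"""

fields = parse_superblock(SAMPLE)
overdue = overdue_for_check(fields)
# Filesystem size in GiB, from block count times block size.
size_gib = int(fields["Block count"]) * int(fields["Block size"]) / 2**30

print("overdue for check:", overdue)
print("size (GiB): %.1f" % size_gib)
```

For the values in this post the sketch reports the filesystem as overdue by mount count, with a size of roughly 1.35 TiB. It says nothing about what caused the hang itself.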