Hello,
I have been running drbd 0.7.7 on SLES9 for a couple of months without any
problems. Every time I upgraded the distribution kernel, I recompiled drbd
against the new kernel sources and everything went well. But after the latest
upgrade to kernel version 2.6.5-7.139-smp SLES9_SP1_BRANCH-200501141541330000,
I get an Oops when I try to make a filesystem on the drbd0 device, and any
command accessing the device hangs until a hard reboot.
Hardware used: 2 machines, each with an Intel Xeon 3.0 GHz, 1 GB RAM and two
72 GB SCSI HDDs joined as a HW RAID 1; one partition (53 GB),
/dev/cciss/c0d0p7, is used as the physical device for drbd0.
Below are the steps I took and the dmesg output after them (for this test I
used another HDD with a 400 MB partition):
...previous initial resynchronization went OK.
$ /etc/init.d/drbd start
drbd: unsupported module, tainting kernel.
drbd: initialised. Version: 0.7.7 (api:77/proto:74)
drbd: SVN Revision: 1680 build by phil at wiesel, 2004-12-14 16:05:39
drbd: registered as block device major 147
drbd0: resync bitmap: bits=66168 words=2068
drbd0: size = 258 MB (264672 KB)
drbd0: 0 KB marked out-of-sync by on disk bit-map.
drbd0: No usable activity log found.
drbd0: Marked additional 0 KB as out-of-sync based on AL.
drbd0: drbdsetup [4587]: cstate Unconfigured --> StandAlone
drbd0: drbdsetup [4600]: cstate StandAlone --> Unconnected
drbd0: drbd0_receiver [4601]: cstate Unconnected --> WFConnection
drbd0: drbd0_receiver [4601]: cstate WFConnection --> WFReportParams
drbd0: Handshake successful: DRBD Network Protocol version 74
drbd0: Connection established.
drbd0: I am(S): 1:00000002:00000001:00000001:00000001:11
drbd0: Peer(S): 1:00000002:00000001:00000002:00000002:10
drbd0: drbd0_receiver [4601]: cstate WFReportParams --> WFBitMapT
drbd0: Secondary/Unknown --> Secondary/Secondary
drbd0: drbd0_receiver [4601]: cstate WFBitMapT --> SyncTarget
drbd0: Resync started as SyncTarget (need to sync 0 KB [0 bits set]).
drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
drbd0: drbd0_receiver [4601]: cstate SyncTarget --> Connected
$ drbdadm primary r0
drbd0: Secondary/Secondary --> Primary/Secondary
$ cat /proc/drbd
version: 0.7.7 (api:77/proto:74)
SVN Revision: 1680 build by phil at wiesel, 2004-12-14 16:05:39
0: cs:Connected st:Primary/Secondary ld:Consistent
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
$ mkreiserfs /dev/drbd0
Unable to handle kernel paging request at virtual address 4008d3c7
printing eip:
c0178555
*pde = 31dd8067
Oops: 0003 [#1]
SMP
CPU: 0
EIP: 0060:[<c0178555>] Tainted: G U
EFLAGS: 00210206 (2.6.5-7.139-smp SLES9_SP1_BRANCH-200501141541330000)
EIP is at __bio_clone+0x35/0xc0
eax: 0000000c ebx: 00000000 ecx: 00000003 edx: edc36b00
esi: cdfd77c0 edi: 4008d3c7 ebp: edd20bc4 esp: ede65b34
ds: 007b es: 007b ss: 0068
Process mkreiserfs (pid: 4650, threadinfo=ede64000 task=f1d779b0)
Stack: ee20a600 00000000 edd20ba8 f183f800 edc36b00 f93900b4 eddf6ff4 f7a95284
000b8005 00000000 00000001 edd20bc4 f7192904 f7192904 c1ae6e14 00000000
00000008 f7192904 00000008 c01741e1 00000000 00000000 00001000 00000000
Call Trace:
[<f93900b4>] drbd_make_request_26+0x364/0xbdf [drbd]
[<c01741e1>] __find_get_block+0xa1/0x1b0
[<c026152d>] generic_make_request+0x11d/0x200
[<f90bf7c3>] search_by_key+0x153/0x14c0 [reiserfs]
[<c0151703>] mempool_alloc+0x73/0x140
[<c0128d50>] autoremove_wake_function+0x0/0x40
[<c0261678>] submit_bio+0x68/0x120
[<c0128d50>] autoremove_wake_function+0x0/0x40
[<c015d5e6>] do_no_page+0x246/0x8e0
[<c01776f2>] bio_alloc+0xd2/0x1c0
[<c017386d>] submit_bh+0x17d/0x230
[<c0175967>] block_read_full_page+0x357/0x360
[<c0179c50>] blkdev_get_block+0x0/0x80
[<c014d3a7>] add_to_page_cache+0x57/0x180
[<c01552c0>] read_pages+0x130/0x1b0
[<c0153b7d>] __alloc_pages+0xad/0x310
[<c015eeb8>] handle_mm_fault+0x138/0xb60
[<c015544e>] do_page_cache_readahead+0x10e/0x180
[<c01555e8>] page_cache_readahead+0x128/0x240
[<c014e692>] do_generic_mapping_read+0x332/0x7d0
[<c014cc60>] file_read_actor+0x0/0xf0
[<c014f682>] __generic_file_aio_read+0x1e2/0x220
[<c014cc60>] file_read_actor+0x0/0xf0
[<c0153b7d>] __alloc_pages+0xad/0x310
[<c014f7ef>] generic_file_read+0x8f/0xb0
[<c015eeb8>] handle_mm_fault+0x138/0xb60
[<c0128d50>] autoremove_wake_function+0x0/0x40
[<c011df93>] do_page_fault+0x163/0x53f
[<c0172216>] vfs_read+0xc6/0x160
[<c0179d80>] block_llseek+0x0/0x110
[<c01724c1>] sys_read+0x91/0xf0
[<c01091d9>] sysenter_past_esp+0x52/0x79
Code: f3 a5 a8 02 74 02 66 a5 a8 01 74 01 a4 8b 42 0c 8b 0a 8b 5a
I do not understand the last message. Could you please give me a hint as to
what I am doing wrong and what to do next?
Thank you in advance,
Pavel Srubar
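
A reading aid for the Oops above, offered as a minimal sketch rather than a
diagnosis. On i386 the number after "Oops:" is the page-fault error code:
bit 0 set means a protection violation on a present page (clear means page not
present), bit 1 set means a write, bit 2 set means the fault came from user
mode. The Python below decodes that code and lists which registers hold the
faulting address. The function names and the OOPS excerpt variable are
illustrative, not part of drbd or of any kernel tool.

```python
import re

# Excerpt of the oops from this post (only the lines the sketch needs).
OOPS = """\
Unable to handle kernel paging request at virtual address 4008d3c7
Oops: 0003 [#1]
eax: 0000000c ebx: 00000000 ecx: 00000003 edx: edc36b00
esi: cdfd77c0 edi: 4008d3c7 ebp: edd20bc4 esp: ede65b34
"""


def decode_page_fault_code(code):
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


def parse_oops(text):
    """Pull the error code, fault address and registers out of oops text."""
    code = int(re.search(r"Oops:\s+([0-9a-fA-F]{4})", text).group(1), 16)
    addr = int(re.search(r"virtual address ([0-9a-fA-F]{8})", text).group(1), 16)
    # Lowercase three-letter names such as "eax:"; segment registers
    # ("ds:", "es:", "ss:") have two letters and are skipped.
    regs = {
        name: int(value, 16)
        for name, value in re.findall(r"\b(e[a-z]{2}):\s*([0-9a-f]{8})\b", text)
    }
    return {
        "fault": decode_page_fault_code(code),
        "address": addr,
        "registers_at_address": sorted(n for n, v in regs.items() if v == addr),
    }


info = parse_oops(OOPS)
print(info["fault"])
print(hex(info["address"]), info["registers_at_address"])
```

For this trace the code 0003 decodes as a kernel-mode write that hit a
protection violation, and edi is the only register equal to the faulting
address 4008d3c7. The Code: line starts with f3 a5, a rep movs copy whose
destination is edi, which is consistent with a memory copy inside __bio_clone
writing through a bad destination pointer. This does not by itself say why the
pointer is bad.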