Hello,

Running 2.6.17-rc1 on a dual-opteron in 64 bit mode with 32 bit
compatibility layer and 64 bit userspace.
debian drbd8-module-source package with fixes for ioctl32 conversion
(8.0-pre2-1)

Here is the ME TOO dump:

janneke:~# modprobe drbd
drbd: initialised. Version: 8.0-pre2 (api:81/proto:80)
drbd: SVN Revision: 2139 build by ard at tessa, 2006-04-18 18:07:26
drbd: registered as block device major 147
janneke:~# drbdsetup /dev/drbd0 disk /dev/sda9 internal flexible -d 193215870
drbd0: disk( Diskless -> Attaching )
drbd0: drbd_bm_resize called with capacity == 772863480
drbd0: bits = 96607935 in /usr/src/kernel/tyan-s2891/modules/drbd/drbd/drbd_bitmap.c:369
drbd0: resync bitmap: bits=96607935 words=1509499
drbd0: size = 368 GB (386431740 KB)
Unable to handle kernel paging request at 0000000000003240 RIP:
<ffffffff80256270>{pfn_to_page+32}
PGD 1001ee067 PUD 1001dc067 PMD 0
Oops: 0000 [1] SMP
CPU 1
Modules linked in: drbd ipv6 tg3
Pid: 1902, comm: drbdsetup Not tainted 2.6.17-rc1-tyan-s2891 #1
RIP: 0010:[<ffffffff80256270>] <ffffffff80256270>{pfn_to_page+32}
RSP: 0018:ffff81017c391ac0 EFLAGS: 00010216
RAX: 0000000000000020 RBX: 0000000000000000 RCX: 0000000000000020
RDX: 0000000000000000 RSI: 0000000000011280 RDI: 00000004100005ba
RBP: ffff8101000edbc0 R08: 0000000000000000 R09: 000000000000000d
R10: 00000000ffffffff R11: 0000000000000001 R12: ffff81017e066000
R13: ffff81017be05d40 R14: 0000000000000001 R15: 0000000000000001
FS: 00002b468b643640(0000) GS:ffff8101000c38c0(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000003240 CR3: 00000001001b9000 CR4: 00000000000006e0
Process drbdsetup (pid: 1902, threadinfo ffff81017c390000, task ffff81010017d800)
Stack: ffffffff88066370 0000000000000001 0000000000000b85 ffff81017e066000
       ffff81017be05d40 00000000ffffa368 ffffffff88066642 0000000000000000
       ffff81017e0665c0 0000000000000292
Call Trace: <ffffffff88066370>{:drbd:drbd_bm_page_io_async+96}
       <ffffffff88066642>{:drbd:drbd_bm_rw+98} <ffffffff880780bd>{:drbd:drbd_al_shrink+525}
       <ffffffff880658cf>{:drbd:drbd_bm_resize+943} <ffffffff88065901>{:drbd:drbd_bm_resize+993}
       <ffffffff88066b3e>{:drbd:drbd_bm_write+14} <ffffffff880680b4>{:drbd:drbd_determin_dev_size+724}
       <ffffffff80235fa9>{lock_timer_base+41} <ffffffff80236088>{__mod_timer+168}
       <ffffffff8806846b>{:drbd:drbd_check_al_size+443} <ffffffff88068983>{:drbd:drbd_ioctl_set_disk+1027}
       <ffffffff8806a7df>{:drbd:drbd_ioctl+799} <ffffffff8047bea4>{__mutex_lock_slowpath+772}
       <ffffffff8027e600>{do_open+608} <ffffffff80245491>{debug_mutex_add_waiter+161}
       <ffffffff8047bea4>{__mutex_lock_slowpath+772} <ffffffff8047c12f>{__mutex_unlock_slowpath+415}
       <ffffffff803110f4>{blkdev_driver_ioctl+100} <ffffffff8031131c>{blkdev_ioctl+492}
       <ffffffff8027e9bb>{block_ioctl+27} <ffffffff80288d3a>{do_ioctl+58}
       <ffffffff80289061>{vfs_ioctl+449} <ffffffff802890dd>{sys_ioctl+77}
       <ffffffff80209b1a>{system_call+126}

Code: 48 2b ba 40 32 00 00 48 8b 92 30 32 00 00 48 8d 04 fd 00 00
RIP <ffffffff80256270>{pfn_to_page+32} RSP <ffff81017c391ac0>
CR2: 0000000000003240
Killed

The code decoded is this:

Code; ffffffff80256270 <pfn_to_page+20/40>   <=====
   0: 48 2b ba 40 32 00 00    sub 0x3240(%rdx),%rdi   <=====
Code; ffffffff80256277 <pfn_to_page+27/40>
   7: 48 8b 92 30 32 00 00    mov 0x3230(%rdx),%rdx
Code; ffffffff8025627e <pfn_to_page+2e/40>
   e: 48 8d 04 fd 00 00 00    lea 0x0(,%rdi,8),%rax
Code; ffffffff80256285 <pfn_to_page+35/40>
  15: 00

There is definitely a difference from all the other dumps: my call trace
starts with
Call Trace: <ffffffff88066370>{:drbd:drbd_bm_page_io_async+96}
and that one ends up calling pfn_to_page+32 ...

And the next thing: I am definitely not using LVM or MD for this device.

Searching further:
./mm/page_alloc.c:struct page *pfn_to_page(unsigned long pfn)
./include/asm-x86_64/page.h:#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)

(which would explain the difference, since:
CONFIG_X86_64_ACPI_NUMA=y
which leads to:
CONFIG_DISCONTIGMEM=y
)

and in drbd_bitmap.c:drbd_bm_page_io_async:
        struct page *page = virt_to_page((char*)(b->bm) + (PAGE_SIZE*page_nr));

when rewritten to:
        printk(KERN_ERR "drbd bef b->bm=%p,page_nr=%d\n",(char*)(b->bm),page_nr);
        page = virt_to_page((char*)(b->bm) + (PAGE_SIZE*page_nr));
        printk(KERN_ERR "drbd aft b->bm=%p,page_nr=%d\n",(char*)(b->bm),page_nr);

delivers:
siep:~# drbdsetup /dev/drbd0 disk /dev/sda9 internal flexible
<snip>
drbd0: size = 368 GB (386431740 KB)
drbd bef b->bm=ffffc200005ba000,page_nr=0
Unable to handle kernel paging request at 0000000000003240 RIP:
<ffffffff80256270>{pfn_to_page+32}

The "aft" printk never shows up, so my conclusion that
        struct page *page = virt_to_page((char*)(b->bm) + (PAGE_SIZE*page_nr));
delivers the *0 reference is correct.

Which leaves us to determine whether b->bm=ffffc200005ba000 is incorrect or
page_nr=0 is incorrect. Since page_nr=0 (bear with me... I am a little
sleepy), only b->bm can be incorrect.
page_nr is the page number within the page buffer, and gets iterated starting
from 0 from within drbd_bm_rw.

(looking further up into the code, unless philip finds it first ;-) )

-- 
begin LOVE-LETTER-FOR-YOU.txt.vbs
I am a signature virus. Distribute me until the bitter end