[DRBD-user] drbd 7 + xfs + 2.6.7

Florin Cazacu florinc at reecemarketing.com
Fri Jul 23 21:31:41 CEST 2004


Lars Ellenberg wrote:

>
>ok.
>now, to help to find the actual problem, you could revert that again,
>but now recompile and install a new kernel with
>"kernel-hacking" ->
> [*] Kernel debugging
> [*] Debug memory allocations
> [*] Page alloc debugging
>or even enable xfs debugging...
>then recompile drbd, of course.
>and then trigger it again; maybe the logs will show something more
>interesting then...
>
>
>  
>
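For reference, the three "kernel-hacking" menu entries quoted above should correspond to the .config symbols below on a 2.6-series i386 kernel. This mapping comes from the Kconfig prompt texts, not from this thread, so treat it as an assumption and check it against the help text in your own tree. No symbol for XFS debugging is listed, since it is not clear from the thread whether this kernel offers one.

```
# Assumed mapping of the quoted menu entries to .config symbols
# (verify with "make menuconfig" help in your own kernel tree).
CONFIG_DEBUG_KERNEL=y      # [*] Kernel debugging
CONFIG_DEBUG_SLAB=y        # [*] Debug memory allocations
CONFIG_DEBUG_PAGEALLOC=y   # [*] Page alloc debugging
```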
    I compiled the kernel as you said, and I get the trace below. (XFS 
debugging is not enabled yet; I still need to find out how to turn it on. 
I remember it being a kernel option, but I don't see it in this kernel, 
and I don't see it in vanilla 2.6.7 either.)

Jul 23 14:13:23 dell1 kernel: Bad page state at free_hot_cold_page (in process 'rm', page c169fe20)
Jul 23 14:13:23 dell1 kernel: flags:0x20000080 mapping:00000000 mapcount:0 count:0
Jul 23 14:13:23 dell1 kernel: Backtrace:
Jul 23 14:13:23 dell1 kernel:  [<c0137f91>] bad_page+0x6d/0x99
Jul 23 14:13:23 dell1 kernel:  [<c0138688>] free_hot_cold_page+0x7c/0x120
Jul 23 14:13:23 dell1 kernel:  [<c02dfb3e>] skb_release_data+0x9b/0xae
Jul 23 14:13:23 dell1 kernel:  [<c02dfc9e>] skb_clone+0x1e/0x183
Jul 23 14:13:23 dell1 kernel:  [<c02dfb64>] kfree_skbmem+0x13/0x2c
Jul 23 14:13:23 dell1 kernel:  [<c02dfc08>] __kfree_skb+0x8b/0x103
Jul 23 14:13:23 dell1 kernel:  [<c0306358>] tcp_clean_rtx_queue+0x13d/0x3b4
Jul 23 14:13:23 dell1 kernel:  [<c0306c26>] tcp_ack+0xec/0x57a
Jul 23 14:13:23 dell1 kernel:  [<c030906c>] __tcp_data_snd_check+0xdd/0xec
Jul 23 14:13:23 dell1 kernel:  [<c02dfc08>] __kfree_skb+0x8b/0x103
Jul 23 14:13:24 dell1 kernel:  [<c03098ca>] tcp_rcv_established+0x460/0x8e0
Jul 23 14:13:24 dell1 kernel:  [<c03129a4>] tcp_v4_do_rcv+0x139/0x13e
Jul 23 14:13:24 dell1 kernel:  [<c02df08f>] __release_sock+0x3c/0x5c
Jul 23 14:13:24 dell1 kernel:  [<c02df793>] release_sock+0x77/0x79
Jul 23 14:13:24 dell1 kernel:  [<c02ff8cb>] tcp_sendpage+0x94/0x98
Jul 23 14:13:24 dell1 kernel:  [<f89511d6>] _drbd_send_page+0x5f/0x100 [drbd]
Jul 23 14:13:24 dell1 kernel:  [<f8951583>] drbd_send_dblock+0x30c/0x41c [drbd]
Jul 23 14:13:24 dell1 kernel:  [<c0298efc>] blk_plug_device+0x57/0x84
Jul 23 14:13:24 dell1 kernel:  [<f894c1ac>] drbd_make_request_common+0x3db/0x7a2 [drbd]
Jul 23 14:13:24 dell1 kernel:  [<c01151cc>] __change_page_attr+0x25/0x1a7
Jul 23 14:13:24 dell1 kernel:  [<c0134980>] find_lock_page+0x29/0xb7
Jul 23 14:13:24 dell1 kernel:  [<f894c637>] drbd_make_request_26+0xc4/0x249 [drbd]
Jul 23 14:13:24 dell1 kernel:  [<c029a787>] generic_make_request+0x113/0x194
Jul 23 14:13:24 dell1 kernel:  [<c01376e5>] mempool_alloc+0x8b/0x150
Jul 23 14:13:24 dell1 kernel:  [<c01195a5>] autoremove_wake_function+0x0/0x57
Jul 23 14:13:24 dell1 kernel:  [<c029a878>] submit_bio+0x70/0x121
Jul 23 14:13:24 dell1 kernel:  [<c0159a04>] __bio_add_page+0x118/0x11d
Jul 23 14:13:24 dell1 kernel:  [<c0159a3d>] bio_add_page+0x34/0x38
Jul 23 14:13:24 dell1 kernel:  [<c022f135>] _pagebuf_ioapply+0x1bd/0x2bb
Jul 23 14:13:24 dell1 kernel:  [<c022f2d3>] pagebuf_iorequest+0xa0/0x16e
Jul 23 14:13:24 dell1 kernel:  [<c013eb8f>] __kmalloc+0x1bc/0x259
Jul 23 14:13:24 dell1 kernel:  [<c0117ae5>] default_wake_function+0x0/0x12
Jul 23 14:13:24 dell1 kernel:  [<c022dbee>] _pagebuf_get_pages+0xef/0x15a
Jul 23 14:13:24 dell1 kernel:  [<c0117ae5>] default_wake_function+0x0/0x12
Jul 23 14:13:24 dell1 kernel:  [<c022e856>] pagebuf_associate_memory+0x6b/0x175
Jul 23 14:13:24 dell1 kernel:  [<c020f620>] xlog_bdstrat_cb+0x1f/0x64
Jul 23 14:13:24 dell1 kernel:  [<c02100aa>] xlog_sync+0x22b/0x491
Jul 23 14:13:24 dell1 kernel:  [<c021085b>] xlog_write+0x3b7/0x4ea
Jul 23 14:13:24 dell1 kernel:  [<c020f247>] xfs_log_write+0x67/0x99
Jul 23 14:13:24 dell1 kernel:  [<c021e61e>] xfs_trans_commit+0x118/0x44a
Jul 23 14:13:24 dell1 kernel:  [<c022051f>] xfs_trans_log_inode+0x2d/0x52
Jul 23 14:13:24 dell1 kernel:  [<c0207eeb>] xfs_ifree+0xbc/0xe9
Jul 23 14:13:24 dell1 kernel:  [<c0226535>] xfs_inactive+0x350/0x552
Jul 23 14:13:24 dell1 kernel:  [<c01151cc>] __change_page_attr+0x25/0x1a7
Jul 23 14:13:24 dell1 kernel:  [<c023688b>] vn_rele+0xb8/0xba
Jul 23 14:13:24 dell1 kernel:  [<c0235273>] linvfs_clear_inode+0x18/0x30
Jul 23 14:13:24 dell1 kernel:  [<c016ca13>] clear_inode+0xb8/0xd1
Jul 23 14:13:24 dell1 kernel:  [<c016d742>] generic_delete_inode+0x106/0x12e
Jul 23 14:13:24 dell1 kernel:  [<c016d90a>] iput+0x62/0x7c
Jul 23 14:13:24 dell1 kernel:  [<c0163b55>] sys_unlink+0x86/0x138
Jul 23 14:13:24 dell1 kernel:  [<c0105c1f>] syscall_call+0x7/0xb
Jul 23 14:13:24 dell1 kernel:
Jul 23 14:13:24 dell1 kernel: Trying to fix it up, but a reboot is needed

>solution approaches:
>  a. we could disable zero copy networking completely (tcp_sendpage).
>  b. we could make it configurable.
>  c. we could simply fall back to tcp_sendmsg for slab pages.
>
>patch for c. is attached.  if it works for Florin (please confirm),
>then it will go into svn soonish.
>
>  
>
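To make the three approaches quoted above concrete, here is a minimal userspace sketch of the decision they describe. It is not the attached patch and not DRBD code: struct model_page, enum send_path, choose_send_path() and the allow_zero_copy switch are hypothetical names invented for illustration, and only the choice between the zero-copy path (tcp_sendpage) and the copying path (tcp_sendmsg) is modelled.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct page.
 * Only the property relevant to approach (c) is modelled. */
struct model_page {
    bool is_slab;   /* true if the page belongs to the slab allocator */
};

/* The two ways the data could be handed to the network stack. */
enum send_path {
    SEND_ZERO_COPY, /* pass the page itself, as tcp_sendpage does */
    SEND_COPY       /* copy the data first, as tcp_sendmsg does */
};

/*
 * allow_zero_copy == false models approach (a), or approach (b)
 * with the option switched off: never use the zero-copy path.
 * The is_slab test models approach (c): fall back to copying
 * only for slab pages and keep zero copy for everything else.
 */
static enum send_path choose_send_path(const struct model_page *page,
                                       bool allow_zero_copy)
{
    if (!allow_zero_copy)
        return SEND_COPY;
    if (page->is_slab)
        return SEND_COPY;
    return SEND_ZERO_COPY;
}
```

In this model a slab page always takes the copying path, while an ordinary page takes the zero-copy path unless zero copy has been switched off. The appeal of (c) over (a) is that only the problematic pages pay for the extra copy.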
    With the patch you posted, these are the errors I get:

Jul 23 18:12:30 dell1 kernel: drbd0: _drbd_send_page: (page_count(page) < 1) in /usr/local/src/drbd-0.7.0/drbd/drbd_main.c:895
Jul 23 18:12:30 dell1 kernel: drbd0: someone wants to send a free page!
Jul 23 18:12:30 dell1 kernel:  [<f8952381>] _drbd_send_page+0x1ad/0x1ba [drbd]
Jul 23 18:12:30 dell1 kernel:  [<f895269a>] drbd_send_dblock+0x30c/0x41c [drbd]
Jul 23 18:12:30 dell1 kernel:  [<c0298efc>] blk_plug_device+0x57/0x84
Jul 23 18:12:30 dell1 kernel:  [<f894d1ac>] drbd_make_request_common+0x3db/0x7a2 [drbd]
Jul 23 18:12:30 dell1 kernel:  [<c01151cc>] __change_page_attr+0x25/0x1a7
Jul 23 18:12:30 dell1 kernel:  [<f894d637>] drbd_make_request_26+0xc4/0x249 [drbd]
Jul 23 18:12:30 dell1 kernel:  [<c029a787>] generic_make_request+0x113/0x194
Jul 23 18:12:30 dell1 kernel:  [<c01376e5>] mempool_alloc+0x8b/0x150
Jul 23 18:12:30 dell1 kernel:  [<c01195a5>] autoremove_wake_function+0x0/0x57
Jul 23 18:12:30 dell1 kernel:  [<c029a878>] submit_bio+0x70/0x121
Jul 23 18:12:30 dell1 kernel:  [<c0159a04>] __bio_add_page+0x118/0x11d
Jul 23 18:12:30 dell1 kernel:  [<c0159a3d>] bio_add_page+0x34/0x38
Jul 23 18:12:30 dell1 kernel:  [<c022f135>] _pagebuf_ioapply+0x1bd/0x2bb
Jul 23 18:12:30 dell1 kernel:  [<c022f2d3>] pagebuf_iorequest+0xa0/0x16e
Jul 23 18:12:30 dell1 kernel:  [<c013eb8f>] __kmalloc+0x1bc/0x259
Jul 23 18:12:30 dell1 kernel:  [<c0117ae5>] default_wake_function+0x0/0x12
Jul 23 18:12:30 dell1 kernel:  [<c022dbee>] _pagebuf_get_pages+0xef/0x15a
Jul 23 18:12:30 dell1 kernel:  [<c0117ae5>] default_wake_function+0x0/0x12
Jul 23 18:12:30 dell1 kernel:  [<c022e856>] pagebuf_associate_memory+0x6b/0x175
Jul 23 18:12:30 dell1 kernel:  [<c020f620>] xlog_bdstrat_cb+0x1f/0x64
Jul 23 18:12:31 dell1 kernel:  [<c02100aa>] xlog_sync+0x22b/0x491
Jul 23 18:12:31 dell1 kernel:  [<c02108af>] xlog_write+0x40b/0x4ea
Jul 23 18:12:32 dell1 kernel:  [<c020f247>] xfs_log_write+0x67/0x99
Jul 23 18:12:32 dell1 kernel:  [<c021e61e>] xfs_trans_commit+0x118/0x44a
Jul 23 18:12:33 dell1 kernel:  [<c01151cc>] __change_page_attr+0x25/0x1a7
Jul 23 18:12:33 dell1 kernel:  [<c021db02>] xfs_trans_dup+0x36/0xff
Jul 23 18:12:33 dell1 kernel:  [<c01154a7>] kernel_map_pages+0x33/0x64
Jul 23 18:12:33 dell1 kernel:  [<c021db02>] xfs_trans_dup+0x36/0xff
Jul 23 18:12:33 dell1 kernel:  [<c013e43f>] kmem_cache_alloc+0x179/0x1ff
Jul 23 18:12:33 dell1 kernel:  [<c021db14>] xfs_trans_dup+0x48/0xff
Jul 23 18:12:34 dell1 kernel:  [<c0220f3f>] xfs_dir_ialloc+0x13e/0x2ed
Jul 23 18:12:35 dell1 kernel:  [<c0228057>] xfs_mkdir+0x3d1/0x767
Jul 23 18:12:36 dell1 kernel:  [<c0232c2e>] linvfs_mknod+0x234/0x25d
Jul 23 18:12:37 dell1 kernel:  [<c01edba3>] xfs_dir2_lookup+0x14c/0x14e
Jul 23 18:12:37 dell1 kernel:  [<c013d4fd>] cache_init_objs+0xec/0x1ea
Jul 23 18:12:38 dell1 kernel:  [<c0220d22>] xfs_dir_lookup_int+0x4c/0x12b
Jul 23 18:12:38 dell1 kernel:  [<c0232c90>] linvfs_mkdir+0x2c/0x30
Jul 23 18:12:38 dell1 kernel:  [<c016336c>] vfs_mkdir+0x8d/0x104
Jul 23 18:12:38 dell1 kernel:  [<c01634a9>] sys_mkdir+0xc6/0xf5
Jul 23 18:12:38 dell1 kernel:  [<c0105c1f>] syscall_call+0x7/0xb
Jul 23 18:12:38 dell1 kernel:

There are lots of these. This one looks different from the others:

Jul 23 18:12:49 dell1 kernel: drbd0: _drbd_send_page: (page_count(page) < 1) in /usr/local/src/drbd-0.7.0/drbd/drbd_main.c:895
Jul 23 18:12:49 dell1 kernel: drbd0: someone want4ab>] permission+0x2f/0x4b
Jul 23 18:12:49 dell1 kernel:  [<c016288a>] vfs_create+0x99/0x110
Jul 23 18:12:49 dell1 kernel:  [<c0162ef1>] open_namei+0x3bb/0x40d
Jul 23 18:12:49 dell1 kernel:  [<c01538db>] filp_open+0x43/0x69
Jul 23 18:12:49 dell1 kernel:  [<c0153d25>] sys_open+0x5b/0x8b
Jul 23 18:12:49 dell1 kernel:  [<c0105c1f>] syscall_call+0x7/0xb
Jul 23 18:12:49 dell1 kernel:


-----
Florin Cazacu


