[DRBD-user] Page allocation failure

Lars Ellenberg lars.ellenberg at linbit.com
Wed Aug 24 14:54:50 CEST 2011



On Tue, Aug 23, 2011 at 05:26:01PM +0200, Matteo Tescione wrote:
> Hi all,
> 
> 
> we are seeing some strange page allocation failure on our target hosts running custom kernel 2.6.38.5 with drbd-8.3.11 userspace/modules. Link is across IPoIB.
> Is it a low memory situation or increasing the RAM size (currently we have 4GB) cannot help?

Yes: low on available atomic contiguous memory at that time.
"Atomic" memory here is memory that can be used right away, without
causing write-out or any other lengthy and potentially blocking
operation.
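
To make "contiguous" concrete: an order:1 request needs two physically
adjacent pages (8kB with 4kB pages). The following is only a minimal
Python sketch, assuming 4kB pages and the per-zone free-list line format
from the snippet quoted below; the helper names are made up for
illustration. It sums how much of a zone's free memory sits in blocks of
a given order or larger.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, as in the quoted report


def parse_free_list(line):
    """Return {order: block_count} from a 'Node 0 Normal: 1064*4kB ...' line."""
    counts = {}
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", line):
        # 4 kB -> order 0, 8 kB -> order 1, 16 kB -> order 2, ...
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        counts[order] = int(count)
    return counts


def free_kb_at_or_above(counts, min_order):
    """Free memory (kB) held in blocks of at least min_order."""
    return sum(n * (PAGE_KB << order)
               for order, n in counts.items() if order >= min_order)


# Line copied from the quoted Mem-Info output.
normal = ("Node 0 Normal: 1064*4kB 201*8kB 88*16kB 17*32kB 11*64kB "
          "4*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 9544kB")

counts = parse_free_list(normal)
total_kb = free_kb_at_or_above(counts, 0)
order1_kb = free_kb_at_or_above(counts, 1)
print(total_kb, order1_kb)
```

For the Normal zone line this gives 9544kB free in total, of which
5288kB is in blocks of order 1 or larger. Free blocks on the list do not
by themselves mean an atomic request succeeds: the allocator also
applies per-zone watermarks, which this sketch does not model.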

> is it safe to continue operations in such condition?

Safe: probably.
I think this "only" causes packets to be dropped,
and TCP will take care of the rest.

Nothing to do with DRBD directly, though.

Maybe the network stack should avoid trying to allocate order:1 pages
(or even try to avoid any allocation) from IRQ context.

Adding RAM will likely help.

Additionally, tuning socket buffers higher, or tuning some sysctls so
that you have more "atomic reserve", will probably help as well.
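
Before changing anything, it helps to look at the current values. The
reply above does not name specific sysctls, so the three below are an
assumption: vm.min_free_kbytes (raises the zone watermarks, i.e. the
reserve kept free) and net.core.rmem_max / net.core.wmem_max (upper
bounds for socket buffers). A minimal Python sketch that only reads
them, with a hypothetical helper name:

```python
from pathlib import Path

# Assumed candidates; the original reply does not name specific sysctls.
CANDIDATES = (
    "vm/min_free_kbytes",  # raises the zone watermarks (memory kept free)
    "net/core/rmem_max",   # upper bound for socket receive buffers
    "net/core/wmem_max",   # upper bound for socket send buffers
)


def read_sysctls(names=CANDIDATES, root="/proc/sys"):
    """Return {name: value-or-None} for each sysctl file under root."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(root) / name).read_text().strip()
        except OSError:
            values[name] = None  # not present or not readable here
    return values


if __name__ == "__main__":
    for name, value in read_sysctls().items():
        print(f"{name.replace('/', '.')} = {value}")
```

This only reports values; which ones to raise, and by how much, depends
on the workload and is not something the sketch decides.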

> below is a snippet:
> __alloc_pages_slowpath: 396 callbacks suppressed
> drbd0_asender: page allocation failure. order:1, mode:0x4020
> Pid: 25123, comm: drbd0_asender Not tainted 2.6.38 #3
> Call Trace:
>  <IRQ>  [<ffffffff8106ed74>] ? __alloc_pages_nodemask+0x17c/0x586
>  [<ffffffff8109596b>] ? alloc_pages_current+0x9e/0xa7
>  [<ffffffff81098c70>] ? new_slab+0xa4/0x226
>  [<ffffffff810999bd>] ? __slab_alloc+0x1da/0x2d7
>  [<ffffffff81253977>] ? dev_alloc_skb+0x16/0x2c
>  [<ffffffff8127a675>] ? ip_local_deliver_finish+0x78/0xc7
>  [<ffffffff81253977>] ? dev_alloc_skb+0x16/0x2c
>  [<ffffffff8109b1ca>] ? __kmalloc_node_track_caller+0x80/0xba
>  [<ffffffff81253591>] ? __alloc_skb+0x71/0x136
>  [<ffffffff81253977>] ? dev_alloc_skb+0x16/0x2c
>  [<ffffffffa020b131>] ? ipoib_cm_alloc_rx_skb+0x2c/0x37a [ib_ipoib]
>  [<ffffffffa020cf21>] ? ipoib_cm_handle_rx_wc+0x409/0x633 [ib_ipoib]
>  [<ffffffffa02071a1>] ? ipoib_ib_handle_rx_wc+0x219/0x241 [ib_ipoib]
>  [<ffffffffa0208457>] ? ipoib_poll+0x86/0x11f [ib_ipoib]
>  [<ffffffff8125d561>] ? net_rx_action+0xa6/0x156
>  [<ffffffff810346a4>] ? __do_softirq+0x90/0x11e
>  [<ffffffff81002d8c>] ? call_softirq+0x1c/0x28
>  [<ffffffff81004889>] ? do_softirq+0x33/0x6a
>  [<ffffffff81034612>] ? irq_exit+0x36/0x38
>  [<ffffffff81003f2c>] ? do_IRQ+0x9b/0xb2
>  [<ffffffff812d9353>] ? ret_from_intr+0x0/0xe
>  <EOI>  [<ffffffffa02d4706>] ? page_chain_tail+0x9/0x25 [drbd]
>  [<ffffffffa02d8c3f>] ? drbd_pp_free+0x71/0x10e [drbd]
>  [<ffffffffa02d8d10>] ? drbd_free_some_ee+0x34/0xa6 [drbd]
>  [<ffffffffa02d8e87>] ? drbd_process_done_ee+0x105/0x13c [drbd]
>  [<ffffffffa02da9a6>] ? drbd_asender+0x143/0x56b [drbd]
>  [<ffffffffa02ed11b>] ? drbd_thread_setup+0x138/0x1e1 [drbd]
>  [<ffffffff81002c94>] ? kernel_thread_helper+0x4/0x10
>  [<ffffffffa02ecfe3>] ? drbd_thread_setup+0x0/0x1e1 [drbd]
>  [<ffffffff81002c90>] ? kernel_thread_helper+0x0/0x10
> Mem-Info:
> Node 0 DMA per-cpu:
> CPU    0: hi:    0, btch:   1 usd:   0
> CPU    1: hi:    0, btch:   1 usd:   0
> CPU    2: hi:    0, btch:   1 usd:   0
> CPU    3: hi:    0, btch:   1 usd:   0
> Node 0 DMA32 per-cpu:
> CPU    0: hi:  186, btch:  31 usd: 165
> CPU    1: hi:  186, btch:  31 usd: 107
> CPU    2: hi:  186, btch:  31 usd:  56
> CPU    3: hi:  186, btch:  31 usd: 165
> Node 0 Normal per-cpu:
> CPU    0: hi:  186, btch:  31 usd:  11
> CPU    1: hi:  186, btch:  31 usd: 134
> CPU    2: hi:  186, btch:  31 usd:  87
> CPU    3: hi:  186, btch:  31 usd: 152
> active_anon:5791 inactive_anon:5350 isolated_anon:0
>  active_file:337540 inactive_file:494676 isolated_file:0
>  unevictable:6035 dirty:3506 writeback:291 unstable:0
>  free:11306 slab_reclaimable:46370 slab_unreclaimable:8605
>  mapped:2743 shmem:51 pagetables:1359 bounce:0
> Node 0 DMA free:15728kB min:28kB low:32kB high:40kB active_anon:0kB inactive_anon:0kB active_file:120kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15628kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:4kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 2162 3926 3926
> Node 0 DMA32 free:21580kB min:4408kB low:5508kB high:6612kB active_anon:800kB inactive_anon:7856kB active_file:739044kB inactive_file:1275184kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2214460kB mlocked:0kB dirty:8040kB writeback:396kB mapped:276kB shmem:12kB slab_reclaimable:145284kB slab_unreclaimable:11732kB kernel_stack:472kB pagetables:552kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 1764 1764
> Node 0 Normal free:9696kB min:3596kB low:4492kB high:5392kB active_anon:22364kB inactive_anon:13544kB active_file:610996kB inactive_file:703520kB unevictable:24140kB isolated(anon):0kB isolated(file):0kB present:1806336kB mlocked:24140kB dirty:5984kB writeback:768kB mapped:10696kB shmem:192kB slab_reclaimable:40192kB slab_unreclaimable:19104kB kernel_stack:2808kB pagetables:4884kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> Node 0 DMA: 2*4kB 1*8kB 2*16kB 2*32kB 2*64kB 1*128kB 2*256kB 1*512kB 2*1024kB 2*2048kB 2*4096kB = 15728kB
> Node 0 DMA32: 1187*4kB 1039*8kB 363*16kB 237*32kB 12*64kB 2*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 27476kB
> Node 0 Normal: 1064*4kB 201*8kB 88*16kB 17*32kB 11*64kB 4*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 9544kB
> 834441 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap  = 2104504kB
> Total swap = 2104504kB
> 1048560 pages RAM
> 50556 pages reserved
> 849683 pages shared
> 146502 pages non-shared
> SLUB: Unable to allocate memory on node -1 (gfp=0x20)
>   cache: kmalloc-8192, object size: 8192, buffer size: 8192, default order: 3, min order: 1
>   node 0: slabs: 313, objs: 604, free: 40
> 
> 
> Regards,
> --
> Matteo Tescione

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed


