<html><head><style>body{font-family:Helvetica,Arial;font-size:13px}</style></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Howdy folks, I’m having some problems getting a very basic two-node environment set up with DRBDv9 on Ubuntu 16.04 in Google’s cloud.<div><br></div><div>I’m using the ubuntu-1604-xenial-v20160627 image with basically this additional customization:</div><div><br></div><div>add-apt-repository ppa:linbit/linbit-drbd9-stack</div><div>apt update</div><div>apt install drbd-utils python-drbdmanage drbd-dkms</div><div><br></div><div>That appears to work and compiles the kernel module.</div><div><br></div><div>My VMs each have a 100 GB disk attached as /dev/sdb and I have been able to mostly get something working with:</div><div><br></div><div>on elysium-test01 vm:</div><div><br></div><div><div>vgcreate drbdpool /dev/sdb</div><div>drbdmanage init 10.12.0.2</div><div>drbdmanage add-node elysium-test02 10.12.0.3</div><div>drbdmanage add-resource data01</div><div>drbdmanage add-volume data01 90gb</div><div>drbdmanage assign-resource data01 elysium-test01</div><div>drbdmanage assign-resource data01 elysium-test02</div></div><div><br></div><div>on elysium-test02 vm:</div><div><br></div><div><div>vgcreate drbdpool /dev/sdb</div><div>drbdmanage join -p 6999 10.12.0.3 1 elysium-test01 10.12.0.2 0 mUEU/uPLZOAFpkZGgmlT</div></div><div><br></div><div>At this point, checking with drbd-overview, it looks like everything is happy and connected, though elysium-test02 is inconsistent.</div><div><br></div><div>on elysium-test01 vm:</div><div><br></div><div><div>mkfs.ext4 -F -E discard /dev/drbd100</div><div>mkdir -p /mnt/disks/data01</div><div>mount -o discard,defaults /dev/drbd100 /mnt/disks/data01</div></div><div><br></div><div>At this point everything looks okay, and logs show that elysium-test01 is now the primary for data01.</div><div><br></div><div>Then the problems start: on the elysium-test02 node, after a 
few seconds the logs show "BUG: unable to handle kernel NULL pointer dereference at (null)"</div><div><br></div><div><snip></div><div><div>Jul 14 01:11:57 ubuntu kernel: [ 444.132269] drbd data01/0 drbd100 elysium-test01: received new current UUID: 096A8FD96D357D37</div><div>Jul 14 01:11:59 ubuntu kernel: [ 445.934934] drbd data01/0 drbd100 elysium-test01: Resync done (total 70 sec; paused 0 sec; 1255580 K/sec)</div><div>Jul 14 01:11:59 ubuntu kernel: [ 445.934942] drbd data01/0 drbd100 elysium-test01: updated UUIDs 096A8FD96D357D36:0000000000000000:0000000000000000:0000000000000000</div><div>Jul 14 01:11:59 ubuntu kernel: [ 445.934954] drbd data01/0 drbd100: disk( Inconsistent -> UpToDate )</div><div>Jul 14 01:11:59 ubuntu kernel: [ 445.934957] drbd data01/0 drbd100 elysium-test01: repl( SyncTarget -> Established )</div><div>Jul 14 01:11:59 ubuntu kernel: [ 445.936014] drbd data01/0 drbd100 elysium-test01: helper command: /sbin/drbdadm after-resync-target</div><div>Jul 14 01:11:59 ubuntu drbdadm[12829]: Don't know which config file belongs to resource data01, trying default ones...</div><div>Jul 14 01:11:59 ubuntu kernel: [ 445.942075] drbd data01/0 drbd100 elysium-test01: helper command: /sbin/drbdadm after-resync-target exit code 0 (0x0)</div><div>Jul 14 01:12:01 ubuntu kernel: [ 448.254494] drbd data01 elysium-test01: peer( Primary -> Secondary )</div><div>Jul 14 01:12:12 ubuntu kernel: [ 458.528749] drbd data01 elysium-test01: Preparing remote state change 504089555 (primary_nodes=0, weak_nodes=0)</div><div>Jul 14 01:12:12 ubuntu kernel: [ 458.530153] drbd data01 elysium-test01: Committing remote state change 504089555</div><div>Jul 14 01:12:12 ubuntu kernel: [ 458.530168] drbd data01 elysium-test01: peer( Secondary -> Primary )</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.832177] BUG: unable to handle kernel NULL pointer dereference at (null)</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.840403] IP: [<ffffffff813f91ed>] memcpy_orig+0x9d/0x110</div><div>Jul 
14 01:12:14 ubuntu kernel: [ 460.846205] PGD 0 </div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.848811] Oops: 0002 [#1] SMP </div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.852394] Modules linked in: drbd_transport_tcp(OE) drbd(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev serio_raw parport_pc pvpanic parport ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse virtio_scsi</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.905902] CPU: 0 PID: 12729 Comm: drbd_r_data01 Tainted: G OE 4.4.0-28-generic #47-Ubuntu</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.915567] Hardware name: Google Google/Google, BIOS Google 01/01/2011</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.922378] task: ffff8800b991e040 ti: ffff8800b9af8000 task.ti: ffff8800b9af8000</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.930084] RIP: 0010:[<ffffffff813f91ed>] [<ffffffff813f91ed>] memcpy_orig+0x9d/0x110</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.938329] RSP: 0018:ffff8800b9afb9a8 EFLAGS: 00010202</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.943760] RAX: 0000000000000000 RBX: 0000000000000012 RCX: 0000000000000200</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.951091] RDX: 0000000000000012 RSI: ffff8800b9db80ae RDI: 0000000000000000</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.958348] RBP: ffff8800b9afb9e0 R08: 0000000000000000 R09: 0000000000000000</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.965591] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800b9afbbb0</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.972841] R13: 0000000000000012 R14: 0000000000000012 R15: ffff8800b9afbb90</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.980087] FS: 0000000000000000(0000) 
GS:ffff88012fc00000(0000) knlGS:0000000000000000</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.988289] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033</div><div>Jul 14 01:12:14 ubuntu kernel: [ 460.994144] CR2: 0000000000000000 CR3: 00000000ba48c000 CR4: 00000000001406f0</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.001397] Stack:</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.003631] ffffffff813fde16 ffff8800b9db80c0 0000000000000200 000000000000003e</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.011633] 0000000000000012 0000000000000012 000000000000002c ffff8800b9afba40</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.019793] ffffffff8170f018 0000000000000000 ffff88012aa42580 0000000000000002</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.027911] Call Trace:</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.030577] [<ffffffff813fde16>] ? copy_to_iter+0x1b6/0x260</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.036358] [<ffffffff8170f018>] skb_copy_datagram_iter+0x68/0x280</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.042960] [<ffffffff817694e3>] tcp_recvmsg+0x613/0xbe0</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.048567] [<ffffffff8179740e>] inet_recvmsg+0x7e/0xb0</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.053987] [<ffffffff816ffa3b>] sock_recvmsg+0x3b/0x50</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.059409] [<ffffffff816ffb91>] kernel_recvmsg+0x61/0x80</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.065002] [<ffffffffc02a9703>] dtt_recv_short+0x63/0x80 [drbd_transport_tcp]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.072666] [<ffffffffc02a97e0>] dtt_recv+0xc0/0x180 [drbd_transport_tcp]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.079771] [<ffffffffc0335f88>] drbd_recv+0x48/0x1f0 [drbd]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.085894] [<ffffffff816ffa3b>] ? 
sock_recvmsg+0x3b/0x50</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.091699] [<ffffffffc033ef98>] read_in_block+0xa8/0x350 [drbd]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.097937] [<ffffffffc0342140>] ? e_end_resync_block+0x110/0x110 [drbd]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.104848] [<ffffffffc0342250>] receive_Data+0x110/0xcb0 [drbd]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.111071] [<ffffffffc02a9803>] ? dtt_recv+0xe3/0x180 [drbd_transport_tcp]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.118253] [<ffffffffc0335f88>] ? drbd_recv+0x48/0x1f0 [drbd]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.124304] [<ffffffffc0342140>] ? e_end_resync_block+0x110/0x110 [drbd]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.131361] [<ffffffffc0342140>] ? e_end_resync_block+0x110/0x110 [drbd]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.138269] [<ffffffffc0345ee4>] drbd_receiver+0x3e4/0x620 [drbd]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.144573] [<ffffffffc0350420>] ? idr_has_entry+0x10/0x10 [drbd]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.150873] [<ffffffffc035047e>] drbd_thread_setup+0x5e/0x110 [drbd]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.157453] [<ffffffffc0350420>] ? idr_has_entry+0x10/0x10 [drbd]</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.163750] [<ffffffff810a0808>] kthread+0xd8/0xf0</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.168754] [<ffffffff810a0730>] ? kthread_create_on_node+0x1e0/0x1e0</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.175394] [<ffffffff81827a4f>] ret_from_fork+0x3f/0x70</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.180914] [<ffffffff810a0730>] ? 
kthread_create_on_node+0x1e0/0x1e0</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.187563] Code: 57 e8 4c 89 5f e0 48 8d 7f e0 73 d2 83 c2 20 48 29 d6 48 29 d7 83 fa 10 72 24 4c 8b 06 4c 8b 4e 08 4c 8b 54 16 f0 4c 8b 5c 16 f8 <4c> 89 07 4c 89 4f 08 4c 89 54 17 f0 4c 89 5c 17 f8 c3 90 83 fa </div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.214595] RIP [<ffffffff813f91ed>] memcpy_orig+0x9d/0x110</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.220601] RSP <ffff8800b9afb9a8></div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.224205] CR2: 0000000000000000</div><div>Jul 14 01:12:14 ubuntu kernel: [ 461.227643] ---[ end trace 670dbe9e8d37a576 ]---</div></div><div></snip></div><div><br></div><div>After this, nothing seems to work properly on either VM. Attempts to unmount the volume hang, and other commands like drbd-overview hang too; eventually I have to reboot both VMs to get back to some sort of sanity, yet DRBD is still basically non-functional and causing kernel errors :-(</div><div><br></div><div>Does anyone have any idea what's wrong?</div><div><br></div><div>Thanks!</div><div><br></div><div>—jason</div><div><br></div><div><br></div><div><br></div></body></html>