Hi all,

I have to cross-post to both the LVM and the DRBD mailing lists, as I have no clue where the issue lies, or whether it is a bug at all. I cannot get LVM working on top of DRBD: I get I/O errors, followed by the "Diskless" state.

Steps to reproduce:

Two machines:

A: CentOS7 x64; epel-provided packages
   kmod-drbd84-8.4.9-1.el7.elrepo.x86_64
   drbd84-utils-8.9.8-1.el7.elrepo.x86_64

B: CentOS6 x64; epel-provided packages
   kmod-drbd83-8.3.16-3.el6.elrepo.x86_64
   drbd83-utils-8.3.16-1.el6.elrepo.x86_64

drbd1.res:

resource drbd1 {
    protocol A;
    startup {
        wfc-timeout 240;
        degr-wfc-timeout 120;
        become-primary-on backuppc;
    }
    net {
        max-buffers 8000;
        max-epoch-size 8000;
        sndbuf-size 128k;
        shared-secret "13Lue=3";
    }
    syncer {
        rate 500M;
    }
    on backuppc {
        device /dev/drbd1;
        disk /dev/sdc;
        address 192.168.0.1:7790;
        meta-disk internal;
    }
    on drbd {
        device /dev/drbd1;
        disk /dev/sda;
        address 192.168.2.16:7790;
        meta-disk internal;
    }
}

I was able to create the DRBD device as expected (see the first line of the syslog further down), and it gets in sync. So I set up LVM and created filter rules so that LVM ignores the underlying physical device:

/etc/lvm/lvm.conf [node1]:
    filter = [ "r|/dev/sdc|" ]

/etc/lvm/lvm.conf [node2]:
    filter = [ "r|/dev/sda|" ]

LVM ignores sda as expected:

#> pvscan
  PV /dev/sda2   VG cl   lvm2 [15,00 GiB / 0    free]
  Total: 1 [15,00 GiB] / in use: 1 [15,00 GiB] / in no VG: 0 [0   ]

Now creating PV, VG, LV:

[root@backuppc etc]# pvcreate /dev/drbd1
  Physical volume "/dev/drbd1" successfully created.
[root@backuppc etc]# vgcreate test /dev/drbd1
  Volume group "test" successfully created
[root@backuppc etc]# lvcreate test -n test -L 3G
  Volume group "test" has insufficient free space (767 extents): 768 required.
[root@backuppc etc]# lvcreate test -n test -L 2.9G
  Rounding up size to full physical extent 2,90 GiB
  Logical volume "test" created.
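
As a side note, the first lvcreate failure looks like plain extent arithmetic and not a symptom of the I/O problem. A minimal sketch, assuming the 4 MiB extent size and the 767 total extents that vgdisplay reports for this VG; the helper name extents_needed is made up for illustration and is not an LVM API:

```python
import math

PE_SIZE_MIB = 4   # "PE Size 4,00 MiB" as reported by vgdisplay
TOTAL_PE = 767    # "Total PE 767" as reported by vgdisplay


def extents_needed(size_gib: float, pe_size_mib: int = PE_SIZE_MIB) -> int:
    """Round a requested LV size (GiB) up to whole physical extents."""
    return math.ceil(size_gib * 1024 / pe_size_mib)


for requested in (3, 2.9):
    need = extents_needed(requested)
    verdict = "fits" if need <= TOTAL_PE else "does not fit"
    print(f"-L {requested}G needs {need} of {TOTAL_PE} extents: {verdict}")
```

With these numbers, 3G asks for one extent more than the VG holds, while 2.9G rounds up to 743 extents, which matches the "Current LE" value vgdisplay shows for the LV.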
[root@backuppc etc]# vgdisplay -v test
  --- Volume group ---
  VG Name               test
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               3,00 GiB
  PE Size               4,00 MiB
  Total PE              767
  Alloc PE / Size       743 / 2,90 GiB
  Free  PE / Size       24 / 96,00 MiB
  VG UUID               pUPkxh-oS0f-MEUY-yIeJ-3zPb-Fkg1-TW1fgh

  --- Logical volume ---
  LV Path                /dev/test/test
  LV Name                test
  VG Name                test
  LV UUID                X0wpkL-niZ7-XT7u-zjT0-ETzC-hYbI-yyv13F
  LV Write Access        read/write
  LV Creation host, time backuppc, 2017-01-07 10:57:29 +0100
  LV Status              available
  # open                 0
  LV Size                2,90 GiB
  Current LE             743
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:2

  --- Physical volumes ---
  PV Name               /dev/drbd1
  PV UUID               3tcvkG-Keqk-vplB-f9zY-1X34-ZxCI-eFYPio
  PV Status             allocatable
  Total PE / Free PE    767 / 24

Creating the filesystem (output translated from German):

[root@backuppc etc]# mkfs.ext4 /dev/test/test
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
190464 inodes, 760832 blocks
38041 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=780140544
24 block groups
32768 blocks per group, 32768 fragments per group
7936 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

Mounting it and starting to use it:

[root@backuppc etc]# mount /dev/test/test /mnt
[root@backuppc etc]# cd /mnt/
[root@backuppc mnt]# cd ..

I immediately get I/O errors in syslog (and NO, the physical disk is not damaged.
Both are virtual machines (VMware ESXi 5.x) running on HW-RAID):

Jan 7 10:42:07 backuppc kernel: block drbd1: Resync done (total 166 sec; paused 0 sec; 18948 K/sec)
Jan 7 10:42:07 backuppc kernel: block drbd1: updated UUIDs 2C441CCF3B27BA41:0000000000000000:C9022D0F617A83BA:0000000000000004
Jan 7 10:42:07 backuppc kernel: block drbd1: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Jan 7 10:58:44 backuppc kernel: EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)
Jan 7 10:58:48 backuppc kernel: block drbd1: local WRITE IO error sector 5296+3960 on sdc
Jan 7 10:58:48 backuppc kernel: block drbd1: disk( UpToDate -> Failed )
Jan 7 10:58:48 backuppc kernel: block drbd1: Local IO failed in __req_mod. Detaching...
Jan 7 10:58:48 backuppc kernel: block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jan 7 10:58:48 backuppc kernel: block drbd1: disk( Failed -> Diskless )
Jan 7 10:58:48 backuppc kernel: drbd drbd1: sock was shut down by peer
Jan 7 10:58:48 backuppc kernel: drbd drbd1: peer( Secondary -> Unknown ) conn( Connected -> BrokenPipe ) pdsk( UpToDate -> DUnknown )
Jan 7 10:58:48 backuppc kernel: drbd drbd1: short read (expected size 8)
Jan 7 10:58:48 backuppc kernel: drbd drbd1: meta connection shut down by peer.
Jan 7 10:58:48 backuppc kernel: drbd drbd1: ack_receiver terminated
Jan 7 10:58:48 backuppc kernel: drbd drbd1: Terminating drbd_a_drbd1
Jan 7 10:58:48 backuppc kernel: block drbd1: helper command: /sbin/drbdadm pri-on-incon-degr minor-1
Jan 7 10:58:48 backuppc kernel: block drbd1: helper command: /sbin/drbdadm pri-on-incon-degr minor-1 exit code 0 (0x0)
Jan 7 10:58:48 backuppc kernel: block drbd1: Should have called drbd_al_complete_io(, 5296, 2027520), but my Disk seems to have failed :(
Jan 7 10:58:48 backuppc kernel: drbd drbd1: Connection closed
Jan 7 10:58:48 backuppc kernel: drbd drbd1: conn( BrokenPipe -> Unconnected )
Jan 7 10:58:48 backuppc kernel: drbd drbd1: receiver terminated
Jan 7 10:58:48 backuppc kernel: drbd drbd1: Restarting receiver thread
Jan 7 10:58:48 backuppc kernel: drbd drbd1: receiver (re)started
Jan 7 10:58:48 backuppc kernel: drbd drbd1: conn( Unconnected -> WFConnection )
Jan 7 10:58:48 backuppc kernel: drbd drbd1: Not fencing peer, I'm not even Consistent myself.
Jan 7 10:58:48 backuppc kernel: block drbd1: IO ERROR: neither local nor remote data, sector 29096+3968
Jan 7 10:58:48 backuppc kernel: dm-2: WRITE SAME failed. Manually zeroing.
Jan 7 10:58:48 backuppc kernel: block drbd1: IO ERROR: neither local nor remote data, sector 29096+256
Jan 7 10:58:48 backuppc kernel: block drbd1: IO ERROR: neither local nor remote data, sector 29352+256
Jan 7 10:58:48 backuppc kernel: block drbd1: IO ERROR: neither local nor remote data, sector 29608+256
Jan 7 10:58:48 backuppc kernel: block drbd1: IO ERROR: neither local nor remote data, sector 29864+256
Jan 7 10:58:49 backuppc kernel: drbd drbd1: Handshake successful: Agreed network protocol version 97
Jan 7 10:58:49 backuppc kernel: drbd drbd1: Feature flags enabled on protocol level: 0x0 none.
Jan 7 10:58:49 backuppc kernel: drbd drbd1: conn( WFConnection -> WFReportParams )
Jan 7 10:58:49 backuppc kernel: drbd drbd1: Starting ack_recv thread (from drbd_r_drbd1 [22367])
Jan 7 10:58:49 backuppc kernel: block drbd1: receiver updated UUIDs to effective data uuid: 2C441CCF3B27BA40
Jan 7 10:58:49 backuppc kernel: block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )

In the end my /proc/drbd looks like this:

version: 8.4.9-1 (api:1/proto:86-101)
GIT-hash: 9976da086367a2476503ef7f6b13d4567327a280 build by akemi@Build64R7, 2016-12-04 01:08:48
 1: cs:Connected ro:Primary/Secondary ds:Diskless/UpToDate A r-----
    ns:3212879 nr:0 dw:67260 dr:3149797 al:27 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

pvscan is still fine:

[root@backuppc log]# pvscan
  PV /dev/sda2    VG cl     lvm2 [15,00 GiB / 0    free]
  PV /dev/drbd1   VG test   lvm2 [3,00 GiB / 96,00 MiB free]
  Total: 2 [17,99 GiB] / in use: 2 [17,99 GiB] / in no VG: 0 [0   ]

So, does anyone have an idea what is going wrong here?

Greetings
Christian
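
P.S. For anyone who wants to check the same state from a script: a minimal sketch that splits the /proc/drbd resource line quoted above into connection state (cs), roles (ro) and disk states (ds), each given as local/peer. The function name parse_drbd_status and the returned dictionary layout are assumptions made for illustration; they are not part of any DRBD tool.

```python
def parse_drbd_status(line: str) -> dict:
    """Split a DRBD 8.x /proc/drbd resource line into local and peer states."""
    minor, rest = line.strip().split(":", 1)
    # Keep only key:value tokens; bare flags such as "A" and "r-----" are skipped.
    fields = dict(tok.split(":", 1) for tok in rest.split() if ":" in tok)
    local_role, peer_role = fields["ro"].split("/")
    local_disk, peer_disk = fields["ds"].split("/")
    return {
        "minor": int(minor),
        "connection": fields["cs"],
        "local_role": local_role,
        "peer_role": peer_role,
        "local_disk": local_disk,
        "peer_disk": peer_disk,
    }


status = parse_drbd_status(
    "1: cs:Connected ro:Primary/Secondary ds:Diskless/UpToDate A r-----"
)
# True when the local node is Primary but has lost its backing disk.
diskless_primary = (
    status["local_role"] == "Primary" and status["local_disk"] == "Diskless"
)
print(status, diskless_primary)
```

For the line above, this reports a connected Primary whose local disk state is Diskless while the peer is UpToDate, which is the state described in this post.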