[DRBD-user] drbd and heartbeat

Marcel Kraan marcel at kraan.net
Sun May 13 11:22:13 CEST 2012


And after a power failure, this is what happens.
Everything is working, but I have a DUnknown state?

[root@kvmstorage1 ~]# cat /proc/drbd
version: 8.3.12 (api:88/proto:86-96)
GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by phil@Build64R6, 2012-04-08 09:36:52
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:684 dr:9946 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:224


[root@kvmstorage2 ~]# cat /proc/drbd
version: 8.3.12 (api:88/proto:86-96)
GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by phil@Build64R6, 2012-04-08 09:36:52
 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
    ns:244 nr:484 dw:1456 dr:15139 al:1 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:264

drbdadm connect main

This is not helping, because it is already connected.

drbdadm disconnect main
drbdadm detach main
drbdadm attach main
drbdadm connect main

It still works perfectly; only the state worries me :-)
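
For reference, DRBD reports DUnknown as the peer's disk state whenever no network connection to the peer is available, so the state follows from the connection state (cs) rather than from the local disk. The two /proc/drbd outputs above can be read mechanically. What follows is only a minimal Python sketch: the function name parse_drbd_status is hypothetical, and the regular expression is an assumption that covers only the 8.3-style status line shown in this post.

```python
import re

# Matches the per-device status line of an 8.3-style /proc/drbd, for example:
#  " 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----"
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)"
    r"\s+ro:(?P<ro_self>[^/\s]+)/(?P<ro_peer>\S+)"
    r"\s+ds:(?P<ds_self>[^/\s]+)/(?P<ds_peer>\S+)"
)


def parse_drbd_status(text):
    """Return one dict per device status line found in the given text."""
    devices = []
    for line in text.splitlines():
        match = STATUS_RE.match(line)
        if not match:
            continue  # version, GIT-hash and counter lines are skipped
        dev = match.groupdict()
        dev["minor"] = int(dev["minor"])
        # DUnknown means the peer's disk state cannot be determined.
        dev["peer_disk_known"] = dev["ds_peer"] != "DUnknown"
        devices.append(dev)
    return devices


# Status lines copied from the two nodes in this post.
node1 = " 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----"
node2 = " 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----"

for name, text in (("kvmstorage1", node1), ("kvmstorage2", node2)):
    for dev in parse_drbd_status(text):
        print(name, dev["cs"], dev["ds_self"], dev["ds_peer"], dev["peer_disk_known"])
```

Neither node is in the Connected state here (one is waiting for a connection, the other is StandAlone), which is consistent with both showing DUnknown for the peer.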



[root@kvmstorage2 ~]# drbdsetup 0 net ipv4:192.168.123.212:7788 ipv4:192.168.123.211:7788 C --set-defaults --create-device
May 13 11:20:25 kvmstorage2 kernel: block drbd0: Handshake successful: Agreed network protocol version 96
May 13 11:20:25 kvmstorage2 kernel: block drbd0: conn( WFConnection -> WFReportParams ) 
May 13 11:20:25 kvmstorage2 kernel: block drbd0: Starting asender thread (from drbd0_receiver [5135])
May 13 11:20:25 kvmstorage2 kernel: block drbd0: data-integrity-alg: <not-used>
May 13 11:20:25 kvmstorage2 kernel: block drbd0: drbd_sync_handshake:
May 13 11:20:25 kvmstorage2 kernel: block drbd0: self C12A485E56F51104:9555562D91EACAC2:A615ADBD6A39BD99:A614ADBD6A39BD99 bits:66 flags:0
May 13 11:20:25 kvmstorage2 kernel: block drbd0: peer E33CEADD1FF28EE1:9555562D91EACAC3:A615ADBD6A39BD98:A614ADBD6A39BD99 bits:64 flags:0
May 13 11:20:25 kvmstorage2 kernel: block drbd0: uuid_compare()=100 by rule 90
May 13 11:20:25 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
May 13 11:20:25 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
May 13 11:20:25 kvmstorage2 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
May 13 11:20:25 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0
May 13 11:20:25 kvmstorage2 kernel: block drbd0: meta connection shut down by peer.
May 13 11:20:25 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
May 13 11:20:25 kvmstorage2 kernel: block drbd0: conn( WFReportParams -> Disconnecting ) 
May 13 11:20:25 kvmstorage2 kernel: block drbd0: error receiving ReportState, l: 4!
May 13 11:20:25 kvmstorage2 kernel: block drbd0: asender terminated
May 13 11:20:25 kvmstorage2 kernel: block drbd0: Terminating asender thread
May 13 11:20:25 kvmstorage2 kernel: block drbd0: Connection closed
May 13 11:20:25 kvmstorage2 kernel: block drbd0: conn( Disconnecting -> StandAlone ) 
May 13 11:20:25 kvmstorage2 kernel: block drbd0: receiver terminated
May 13 11:20:25 kvmstorage2 kernel: block drbd0: Terminating receiver thread
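
The kernel log above contains the line "Split-Brain detected but unresolved, dropping connection!", after which the connection state goes to StandAlone. A minimal Python sketch for pulling those two facts out of such a log follows. The function name summarize_drbd_log is hypothetical, and the patterns are assumptions based only on the message formats visible in this post.

```python
import re

# Matches state transitions such as "conn( WFConnection -> WFReportParams )".
CONN_RE = re.compile(r"conn\(\s*(\S+)\s*->\s*(\S+)\s*\)")


def summarize_drbd_log(lines):
    """Collect connection-state transitions and note any split-brain report."""
    transitions = []
    split_brain = False
    for line in lines:
        match = CONN_RE.search(line)
        if match:
            transitions.append((match.group(1), match.group(2)))
        if "Split-Brain detected" in line:
            split_brain = True
    return {
        "split_brain": split_brain,
        "transitions": transitions,
        # The state the connection ended up in, if any transition was seen.
        "final_state": transitions[-1][1] if transitions else None,
    }


# A few lines copied from the kvmstorage2 log in this post.
sample = [
    "May 13 11:20:25 kvmstorage2 kernel: block drbd0: conn( WFConnection -> WFReportParams ) ",
    "May 13 11:20:25 kvmstorage2 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!",
    "May 13 11:20:25 kvmstorage2 kernel: block drbd0: conn( WFReportParams -> Disconnecting ) ",
    "May 13 11:20:25 kvmstorage2 kernel: block drbd0: conn( Disconnecting -> StandAlone ) ",
]

summary = summarize_drbd_log(sample)
print(summary["split_brain"], summary["final_state"])
```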



[root@kvmstorage1 ~]# drbdadm connect all
May 13 11:20:24 kvmstorage1 kernel: block drbd0: conn( StandAlone -> Unconnected ) 
May 13 11:20:24 kvmstorage1 kernel: block drbd0: Starting receiver thread (from drbd0_worker [1467])
May 13 11:20:24 kvmstorage1 kernel: block drbd0: receiver (re)started
May 13 11:20:24 kvmstorage1 kernel: block drbd0: conn( Unconnected -> WFConnection ) 
May 13 11:20:25 kvmstorage1 kernel: block drbd0: Handshake successful: Agreed network protocol version 96
May 13 11:20:25 kvmstorage1 kernel: block drbd0: conn( WFConnection -> WFReportParams ) 
May 13 11:20:25 kvmstorage1 kernel: block drbd0: Starting asender thread (from drbd0_receiver [5991])
May 13 11:20:25 kvmstorage1 kernel: block drbd0: data-integrity-alg: <not-used>
May 13 11:20:25 kvmstorage1 kernel: block drbd0: drbd_sync_handshake:
May 13 11:20:25 kvmstorage1 kernel: block drbd0: self E33CEADD1FF28EE1:9555562D91EACAC3:A615ADBD6A39BD98:A614ADBD6A39BD99 bits:64 flags:0
May 13 11:20:25 kvmstorage1 kernel: block drbd0: peer C12A485E56F51104:9555562D91EACAC2:A615ADBD6A39BD99:A614ADBD6A39BD99 bits:66 flags:0
May 13 11:20:25 kvmstorage1 kernel: block drbd0: uuid_compare()=100 by rule 90
May 13 11:20:25 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
May 13 11:20:25 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
May 13 11:20:25 kvmstorage1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
May 13 11:20:25 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0
May 13 11:20:25 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
May 13 11:20:25 kvmstorage1 kernel: block drbd0: conn( WFReportParams -> Disconnecting ) 
May 13 11:20:25 kvmstorage1 kernel: block drbd0: error receiving ReportState, l: 4!
May 13 11:20:25 kvmstorage1 kernel: block drbd0: asender terminated
May 13 11:20:25 kvmstorage1 kernel: block drbd0: Terminating asender thread
May 13 11:20:25 kvmstorage1 kernel: block drbd0: Connection closed
May 13 11:20:25 kvmstorage1 kernel: block drbd0: conn( Disconnecting -> StandAlone ) 
May 13 11:20:25 kvmstorage1 kernel: block drbd0: receiver terminated
May 13 11:20:25 kvmstorage1 kernel: block drbd0: Terminating receiver thread
May 13 11:21:18 kvmstorage1 ntpd[1447]: synchronized to 88.159.164.16, stratum 2
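
Both logs report an unresolved split brain. The DRBD 8.3 User's Guide describes a manual recovery: on the node whose changes are to be discarded, run "drbdadm secondary <resource>" and then "drbdadm -- --discard-my-data connect <resource>"; on the other node, run "drbdadm connect <resource>" if it is also StandAlone. The Python sketch below only builds that command list and executes nothing. The function name is hypothetical, and treating kvmstorage2 as the node to discard is purely an assumption for illustration; which node's data to throw away is a decision only the administrator can make.

```python
def split_brain_recovery_plan(resource, victim, survivor):
    """Return {node: [command, ...]} for manual recovery; nothing is executed.

    victim   -- node whose local changes will be DISCARDED (assumption: chosen
                by the administrator after checking the data on both nodes)
    survivor -- node whose data is kept
    """
    return {
        victim: [
            ["drbdadm", "secondary", resource],
            # DRBD 8.3 syntax; the leading "--" passes the option through.
            ["drbdadm", "--", "--discard-my-data", "connect", resource],
        ],
        survivor: [
            # Only needed if the survivor is also in the StandAlone state.
            ["drbdadm", "connect", resource],
        ],
    }


# Illustrative only: the victim/survivor choice here is an assumption.
plan = split_brain_recovery_plan("main", victim="kvmstorage2", survivor="kvmstorage1")
for node, commands in plan.items():
    for cmd in commands:
        print(node + "#", " ".join(cmd))
```

Printing the plan instead of running it leaves room to review it first, because the discard step cannot be undone.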


On 11 May 2012, at 17:57, Marcelo Pereira wrote:

> You can test it now, and check if the amount decreases or stays!!
> 
> 
> On Fri, May 11, 2012 at 11:55 AM, <marcel at kraan.net> wrote:
> I'll wait. But I waited 2 days ago and then it started again. I'll let you know when it happens again.
> 
> --
> The ultimate is landing in Waist deep champagne powder..
> But I guess the BigAirBAG would be the best man made option.
> 
> BigAirBAG BV
> 
> Office:
> Amsterdamseweg 68  1981LH  Velsen Zuid  The Netherlands
> 
> Factory:
> Jupiter 2  8448CD  Heerenveen The Netherlands
> 
> VAT/BTW             : NL8065.67.831.B01
> Chamber of Commerce : 08076232 Amsterdam
> IBAN                : NL82RABO0109278585
> Phone               : +31 654378837
> Fax                 : +31 235513420
> Website              : http://www.bigairbag.com
> 
> 
> On 11 May 2012, at 17:31, Marcelo Pereira <marcelops at gmail.com> wrote:
> 
> > It seems to be ok!!
> >
> > Let it sync!! It will take 35 hours, but just wait!!
> >
> > If you want to check if it's sync'ing, then pay attention to this info:
> >
> > [>....................] sync'ed:  0.6% (3354164/*3373748*)M
> >
> > This is the amount of data that the server is actually sync'ing. You can
> > stop DRBD and restart it (not the physical server, only the
> > service). Then check whether the amount stays the same (3373748) or
> > decreases.
> >
> > If it decreases, then go grab a coffee (or a dozen coffees) and let it
> > sync. You should be all set after that.
> >
> > Regards,
> > Marcelo
> >
> > On Fri, May 11, 2012 at 11:11 AM, Marcel Kraan <marcel at kraan.net> wrote:
> >
> >> This is node 1 and 2
> >>
> >> [root@kvmstorage1 ~]# cat /proc/drbd
> >> version: 8.3.12 (api:88/proto:86-96)
> >> GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by phil@Build64R6, 2012-04-08 09:36:52
> >> 0: cs:SyncTarget ro:Secondary/Secondary ds:Inconsistent/UpToDate C r-----
> >>    ns:72 nr:25543460 dw:25543136 dr:8978 al:1 bm:1559 lo:1 pe:7420 ua:0 ap:0 ep:1 wo:b oos:3434666332
> >> [>....................] sync'ed:  0.6% (3354164/3373748)M
> >> finish: 34:56:00 speed: 27,304 (20,504) want: 51,200 K/sec
> >>
> >>
> >> [root@kvmstorage2 drbd.d]# cat /proc/drbd
> >> version: 8.3.12 (api:88/proto:86-96)
> >> GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by phil@Build64R6, 2012-04-08 09:36:52
> >> 0: cs:SyncSource ro:Secondary/Primary ds:UpToDate/Inconsistent C r-----
> >>    ns:19677536 nr:60 dw:132 dr:19695382 al:1 bm:1201 lo:1 pe:2 ua:64 ap:0 ep:1 wo:b oos:3435043932
> >> [>....................] sync'ed:  0.6% (3354532/3373748)M
> >> finish: 33:28:00 speed: 28,492 (20,408) K/sec
> >>
> >>
> >> On 11 May 2012, at 17:09, Marcelo Pereira wrote:
> >>
> >> what about the /proc/drbd on the other node?
> >>
> >>
> >> On Fri, May 11, 2012 at 11:01 AM, Marcel Kraan <marcel at kraan.net> wrote:
> >>
> >>> I have removed the partition and have now made a new one.
> >>>
> >>> I forced the sync rate to 100M:
> >>> drbdsetup /dev/drbd0 syncer -r 100M
> >>> (but I am only getting 25M)
> >>>
> >>>
> >>> [root@kvmstorage2 ~]# cat /proc/drbd
> >>> version: 8.3.12 (api:88/proto:86-96)
> >>> GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by phil@Build64R6, 2012-04-08 09:36:52
> >>> 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
> >>>    ns:3424212 nr:48 dw:112 dr:3442010 al:1 bm:209 lo:1 pe:2 ua:64 ap:0 ep:1 wo:b oos:3451297116
> >>> [>....................] sync'ed:  0.1% (3370404/3373748)M
> >>> finish: 35:04:59 speed: 27,324 (9,508) K/sec
> >>>
> >>>
> >>> [root@kvmstorage2 ~]# drbdsetup /dev/drbd0 primary
> >>> [root@kvmstorage2 ~]# mount /dev/drbd0 /datastore/
> >>>
> >>> [root@kvmstorage2 ~]# ls -al /datastore/
> >>> total 871628916
> >>> drwxr-xr-x   4 root root         4096 May 10 16:36 .
> >>> dr-xr-xr-x. 25 root root         4096 May 11 16:54 ..
> >>> -rw-r--r--   1 root root 838860800000 May 11 12:08 BigAirBag.com-data.img
> >>> -rw-r--r--   1 root root  53687091200 May 11 13:54 BigAirBag.com.img
> >>> drwx------   2 qemu qemu        16384 May  9 22:46 lost+found
> >>> drwxr-xr-x.  3 root root         4096 May 11 14:16 nfs
> >>>
> >>>
> >>> [root@kvmstorage2 ~]# mount
> >>> /dev/mapper/vg_storage1-lv_root on / type ext4 (rw)
> >>> proc on /proc type proc (rw)
> >>> sysfs on /sys type sysfs (rw)
> >>> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> >>> tmpfs on /dev/shm type tmpfs (rw)
> >>> /dev/sda1 on /boot type ext4 (rw)
> >>> none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
> >>> nfsd on /proc/fs/nfsd type nfsd (rw)
> >>> /dev/drbd0 on /datastore type ext4 (rw)
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On 11 May 2012, at 15:38, Marcelo Pereira wrote:
> >>>
> >>> Dear Marcel,
> >>>
> >>> I don't want to continue the conversation until you give us the info that we
> >>> need to "hunt the issue".
> >>>
> >>> If it's taking long to sync, or if it's "resync'ing" after a reboot, then
> >>> something might be wrong, but we can't figure it out without the
> >>> information.
> >>>
> >>> If you happen to have the /proc/drbd, I can try something, otherwise
> >>> leave me alone!!
> >>>
> >>> Have a good day!
> >>> Marcelo
> >>>
> >>> On Fri, May 11, 2012 at 9:26 AM, <marcel at kraan.net> wrote:
> >>>
> >>>> The /dev/drbd device was syncing at a speed of 29,000 K/sec.
> >>>> But the partition is 3.4TB, so it takes 2 days. The syncing time
> >>>> is my problem.
> >>>>
> >>>>
> >>>> On 11 May 2012, at 15:09, Marcelo Pereira <marcelops at gmail.com> wrote:
> >>>>
> >>>>> Ok, come back when you manage to put it back online, and with the output of that file.
> >>>>>
> >>>>> Have a good weekend. And good luck.
> >>>>>
> >>>>> --Marcelo
> >>>>>
> >>>>> On May 11, 2012, at 9:01 AM, marcel at kraan.net wrote:
> >>>>>
> >>>>>> I have removed the setup, and the partition along with it.
> >>>>>>
> >>>>>>
> >>>>>> On 11 May 2012, at 14:59, Marcelo Pereira <marcelops at gmail.com> wrote:
> >>>>>>
> >>>>>>> Where is the /proc/drbd ??
> >>>>>>>
> >>>>>>> --Marcelo
> >>>>>>>
> >>>>>>> On May 11, 2012, at 8:38 AM, Marcel Kraan <marcel at kraan.net> wrote:
> >>>>>>>
> >>>>>>>> I just deleted the whole setup; it drives me crazy.
> >>>>>>>> I will format the disks and go with a normal NFS mount and RAID5/hotspare.
> >>>>>>>>
> >>>>>>>> [root@kvmstorage2 drbd.d]# cat main.res
> >>>>>>>> resource main {
> >>>>>>>>
> >>>>>>>> protocol C;
> >>>>>>>>
> >>>>>>>> startup { wfc-timeout 0; degr-wfc-timeout 120; }
> >>>>>>>>
> >>>>>>>> disk { on-io-error detach; }
> >>>>>>>>
> >>>>>>>> on kvmstorage1.localdomain {
> >>>>>>>>    device /dev/drbd0;
> >>>>>>>>    disk /dev/sdb1;
> >>>>>>>>    meta-disk internal;
> >>>>>>>>    address 192.168.123.211:7788;
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> on kvmstorage2.localdomain {
> >>>>>>>>    device /dev/drbd0;
> >>>>>>>>    disk /dev/sdb1;
> >>>>>>>>    meta-disk internal;
> >>>>>>>>    address 192.168.123.212:7788;
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ###################################################################
> >>>>>>>>
> >>>>>>>> [root@kvmstorage2 drbd.d]# cat global_common.conf
> >>>>>>>> global {
> >>>>>>>> usage-count no;
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> common {
> >>>>>>>> protocol C;
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> resource drbd {
> >>>>>>>>    handlers {
> >>>>>>>>            pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
> >>>>>>>>            pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
> >>>>>>>>            local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
> >>>>>>>>            # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
> >>>>>>>>            # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
> >>>>>>>>            # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
> >>>>>>>>            # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
> >>>>>>>>            # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
> >>>>>>>>    }
> >>>>>>>>
> >>>>>>>>    startup {
> >>>>>>>>            # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
> >>>>>>>>            # become-primary-on both;
> >>>>>>>>    }
> >>>>>>>>
> >>>>>>>>    disk {
> >>>>>>>>            # on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
> >>>>>>>>            # no-disk-drain no-md-flushes max-bio-bvecs
> >>>>>>>>            # c-plan-ahead 200;
> >>>>>>>>            # c-max-rate 10M;
> >>>>>>>>            # c-fill-target 15M;
> >>>>>>>>    }
> >>>>>>>>
> >>>>>>>>    net {
> >>>>>>>>            # sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
> >>>>>>>>            # max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
> >>>>>>>>            # after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
> >>>>>>>>            # allow-two-primaries;
> >>>>>>>>    }
> >>>>>>>>
> >>>>>>>>    syncer {
> >>>>>>>>            # rate after al-extents use-rle cpu-mask verify-alg csums-alg
> >>>>>>>>            rate 110M;
> >>>>>>>>    }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 11 May 2012, at 14:32, Marcelo Pereira wrote:
> >>>>>>>>
> >>>>>>>>> Hey Marcel,
> >>>>>>>>>
> >>>>>>>>> Could you send your /etc/drbd.conf and the output from /proc/drbd?
> >>>>>>>>>
> >>>>>>>>> If you reboot the servers properly, you shouldn't need to do a full sync; it'll sync only what changed instead.
> >>>>>>>>>
> >>>>>>>>> Get back with the files, and someone will probably help you out.
> >>>>>>>>>
> >>>>>>>>> []s
> >>>>>>>>>
> >>>>>>>>> --Marcelo
> >>>>>>>>>
> >>>>>>>>> On May 11, 2012, at 8:21 AM, Marcel Kraan <marcel at kraan.net> wrote:
> >>>>>>>>>
> >>>>>>>>>> I have a 3TB disk shared for drbd0.
> >>>>>>>>>>
> >>>>>>>>>> It syncs at 30MB/sec and takes 2 days to complete.
> >>>>>>>>>>
> >>>>>>>>>> While syncing, heartbeat is not working (is this correct?)
> >>>>>>>>>>
> >>>>>>>>>> When I take both servers offline, the disks need to resync again (2 days).
> >>>>>>>>>>
> >>>>>>>>>> Is this normal?
> >>>>>>>>>>
> >>>>>>>>>> Is there a newer way to do clustering?
> >>>>>>>>>>
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> drbd-user mailing list
> >>>>>>>>>> drbd-user at lists.linbit.com
> >>>>>>>>>> http://lists.linbit.com/mailman/listinfo/drbd-user
