First, thank you very much for your time, Lars.
>> #cat /proc/drbd
>> #grep . /sys/module/drbd/*version
-------------------------
version: 8.4.1 (api:1/proto:86-100)
GIT-hash: 91b4c048c1a0e06777b5f65d312b38d47abaea80 build by root at drbdnodeA, 2012-02-13 16:06:27
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate B r-----
ns:0 nr:8496 dw:8496 dr:0 al:0 bm:6 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
/sys/module/drbd/srcversion:4A4FDD6F2ECF22BD2AD5970
/sys/module/drbd/version:8.4.1
-------------------------
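To tell whether the module ever really goes away, the status lines above could be reduced to one short line per device and logged over time. This is only a minimal sketch: the function name drbd_state is made up for illustration, and it assumes the 8.4-style /proc/drbd layout shown above (a "N: cs:... ro:... ds:..." line per minor).

```shell
# Hypothetical helper (not part of DRBD): read /proc/drbd-style text on
# stdin and print "minor cs ro ds" for every device status line found.
drbd_state() {
    awk '/^ *[0-9]+: cs:/ {
        minor = $1
        sub(/:$/, "", minor)          # "0:" -> "0"
        cs = "?"; ro = "?"; ds = "?"
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^cs:/)      cs = substr($i, 4)   # connection state
            else if ($i ~ /^ro:/) ro = substr($i, 4)   # local/peer role
            else if ($i ~ /^ds:/) ds = substr($i, 4)   # local/peer disk state
        }
        print minor, cs, ro, ds
    }'
}
```

Called as `drbd_state < /proc/drbd` from a loop that prefixes a timestamp, this would give a compact history to compare against the time of the failed monitor. If /proc/drbd is missing at that moment, the module was not loaded.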
The initrd file also looks good; it contains only the 8.4.1 module.
>> So you *do* have a working DRBD,
>> and only the monitor operation fails "occasionally" (much too often,
>> still), with the below error log.
Yes, this *may* be the case.
I'm not sure whether the drbd module really crashes (and is then started
again by pacemaker) or whether it never failed at all.
So far, all I really see is that pacemaker detects "a problem" and
initiates the failover.
After that, all services continue to run on the other node, and drbd has
switched its primary/secondary roles (and I see all these errors/messages
in the log).
If it helps, the crm_mon output changes from:
--------------------------------------
Online: [ drbdnodeA drbdnodeB ]
 Master/Slave Set: ms_drbd_r0 [p_drbd_r0]
     Masters: [ drbdnodeA ]
     Slaves: [ drbdnodeB ]
 Resource Group: g_haservices
     p_ipv4 (ocf::heartbeat:IPaddr2): Started drbdnodeA
     p_fsmount_cgpro (ocf::heartbeat:Filesystem): Started drbdnodeA
     p_exportnfs_cgpro (ocf::heartbeat:exportfs): Started drbdnodeA
 Clone Set: cl_lsb_nfsserver [p_lsb_nfsserver]
     Started: [ drbdnodeA drbdnodeB ]
 Clone Set: cl_exportnfs_root [p_exportnfs_root]
     Started: [ drbdnodeA drbdnodeB ]
--------------------------------------
to:
--------------------------------------
Online: [ drbdnodeA drbdnodeB ]
 Master/Slave Set: ms_drbd_r0 [p_drbd_r0]
     Masters: [ drbdnodeB ]
     Slaves: [ drbdnodeA ]
 Resource Group: g_haservices
     p_ipv4 (ocf::heartbeat:IPaddr2): Started drbdnodeB
     p_fsmount_cgpro (ocf::heartbeat:Filesystem): Started drbdnodeB
     p_exportnfs_cgpro (ocf::heartbeat:exportfs): Started drbdnodeB
 Clone Set: cl_lsb_nfsserver [p_lsb_nfsserver]
     Started: [ drbdnodeA drbdnodeB ]
 Clone Set: cl_exportnfs_root [p_exportnfs_root]
     Started: [ drbdnodeA drbdnodeB ]

Failed actions:
    p_drbd_r0:0_monitor_15000 (node=drbdnodeA, call=26, rc=7, status=complete): not running
--------------------------------------
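For reference when reading the "Failed actions" entry: rc=7 is the OCF return code OCF_NOT_RUNNING, i.e. the resource agent's monitor reported the resource as not running, which matches the "not running" text crm_mon prints. A small lookup sketch follows; the function name ocf_rc_name is made up for illustration, and the code names are the standard OCF resource agent return codes.

```shell
# Hypothetical helper: translate a numeric OCF resource agent return
# code (as shown in crm_mon "Failed actions") into its symbolic name.
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        4) echo OCF_ERR_PERM ;;
        5) echo OCF_ERR_INSTALLED ;;
        6) echo OCF_ERR_CONFIGURED ;;
        7) echo OCF_NOT_RUNNING ;;
        8) echo OCF_RUNNING_MASTER ;;
        9) echo OCF_FAILED_MASTER ;;
        *) echo "unknown ($1)" ;;
    esac
}

ocf_rc_name 7
```

This alone does not say why the monitor saw the resource as not running; it only confirms what the agent returned to pacemaker.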
Christoph Roethlisberger