hi,

On 02/21/2011 10:36 AM, Lars Ellenberg wrote:
> Fix your fence-peer helper,
> that may be the cause of trouble there.

which actually is 'your' fence-peer helper, right? :)

Feb 16 03:13:45 c02n01 kernel: [3675911.371516] block drbd0: updated UUIDs A9AE9E56A0D5D66F:0000000000000000:3E9700A8847A37AD:3E9600A8847A37AD
Feb 16 03:13:45 c02n01 kernel: [3675911.371635] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Feb 16 03:13:45 c02n01 kernel: [3675911.505550] block drbd0: bitmap WRITE of 3050 pages took 34 jiffies
Feb 16 03:13:45 c02n01 kernel: [3675911.505615] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Feb 16 03:13:45 c02n01 cibadmin: [14957]: info: Invoked: cibadmin -Q -t 1
Feb 16 03:13:45 c02n01 crm-fence-peer.sh[14918]: WARNING peer is Secondary, did not place the constraint!
Feb 16 03:13:45 c02n01 kernel: [3675912.019501] block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 1 (0x100)
Feb 16 03:13:45 c02n01 kernel: [3675912.019622] block drbd0: fence-peer helper broken, returned 1
Feb 16 03:13:45 c02n01 kernel: [3675912.019687] block drbd0: pdsk( UpToDate -> DUnknown )
Feb 16 03:13:45 c02n01 kernel: [3675912.019768] block drbd0: new current UUID 6798C570121477F1:A9AE9E56A0D5D66F:3E9700A8847A37AD:3E9600A8847A37AD

thus, basically coming back to [1] where florian asks:

> Look at your paste. You have no node where DRBD is Secondary. What do
> you expect the agent to do?

(i know, i talked about the agent in this email. but the agent and
crm-fence-peer.sh are closely tied, aren't they?)

looking at crm-fence-peer.sh's source, i see:

>         Secondary|Primary)
>                 # WTF? We are supposed to fence the peer,
>                 # but the replication link is just fine?
>                 echo WARNING "peer is $DRBD_peer, did not place the constraint!"
>                 rc=0
>                 return
>                 ;;
>         esac

so, this should actually be obsoleted by fixing the following bug, right?
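
as a side note on the "exit code 1 (0x100)" line in the kernel log above: the value in parentheses looks like a raw wait status, with the helper's exit code in the high byte. below is a minimal sketch of how such a status could be decoded and classified. the function name classify_fence_status is hypothetical, and the meanings for codes 3 to 7 are an assumption taken from the DRBD 8.3 fence-peer handler convention as i remember it, so please check them against the documentation for your DRBD version.

```shell
#!/bin/sh
# Sketch only: decode the wait status logged for the fence-peer helper.
# classify_fence_status is a hypothetical name, not part of DRBD or
# crm-fence-peer.sh. The meanings for codes 3..7 are an ASSUMPTION based
# on the DRBD 8.3 handler convention; verify them for your version.

classify_fence_status() {
    status=$(( $1 ))                  # accepts decimal or 0x-prefixed hex
    code=$(( (status >> 8) & 0xff ))  # the exit code sits in the high byte
    case "$code" in
        3) echo "$code: peer is inconsistent" ;;
        4) echo "$code: peer is outdated" ;;
        5) echo "$code: peer is down or unreachable" ;;
        6) echo "$code: peer is primary" ;;
        7) echo "$code: peer was stonithed" ;;
        *) echo "$code: helper broken (unexpected exit code)" ;;
    esac
}

# The status from the log above: 0x100 decodes to exit code 1.
classify_fence_status 0x100
```

under these assumptions, any exit code outside 3 to 7 falls into the default branch, which would match the "fence-peer helper broken, returned 1" message in the log.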
on the other hand, what's wrong with trying to disconnect and reconnect the
resources and see what happens? (e.g. via a tiny constraint that is only
valid for PT1M)

> Feb 16 06:25:04 c02n01 kernel: [3687390.947555] block drbd1: pdsk( UpToDate -> DUnknown )
>
> This should not have happened, either:
> We must not change the pdsk state to DUnknown while keeping conn state at Connected.
> That's nonsense.
>
> Feb 16 06:25:04 c02n01 kernel: [3687390.947633] block drbd1: new current UUID 89084B22FE454C03:3C1DADF6B38C1AD7:E7E50184F3F3AC0B:E7E40184F3F3AC0B

please let me know if you need any further input from my side.

thanks,
raoul

[1] http://www.gossamer-threads.com/lists/drbd/users/20605#20605

-- 
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc.          email.          r.bhatia at ipax.at
Technischer Leiter

IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
Barawitzkagasse 10/2/2/11           email.            office at ipax.at
1190 Wien                           tel.               +43 1 3670030
FN 277995t HG Wien                  fax.            +43 1 3670030 15
____________________________________________________________________