I updated to 8.4.0 and still have a stability issue.

Outline of steps:

- Stopped pacemaker/clvmd/cman and stopped drbd. Verified the kernel module was unloaded.
- Updated to 8.4.0, started drbd on both sides and verified primary/primary.
- Ran a verify: 0 blocks out of sync.
- Started cman/clvmd/pacemaker; everything came up clean.
- Rebooted cluster node 01. When drbd reconnected after the reboot, node 02 kernel panicked and node 01 complained about being outdated.

The kernel panic from 02 is here: http://i.imgur.com/cSOzV.png

The dmesg output from 01 is below. It seems to be more of a symptom of 02 crashing than anything else. My next troubleshooting step is to bring them both up with only drbd and reboot to see if it reconnects properly.

d-con gfs00: PingAck did not arrive in time.
d-con gfs00: peer( Primary -> Unknown ) conn( WFSyncUUID -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
block drbd0: IO ERROR: neither local nor remote disk
d-con gfs00: asender terminated
d-con gfs00: Terminating asender thread
block drbd0: helper command: /sbin/drbdadm pri-on-incon-degr minor-0
block drbd0: helper command: /sbin/drbdadm pri-on-incon-degr minor-0 exit code 0 (0x0)
Buffer I/O error on device drbd0, logical block 5242688
block drbd0: bitmap WRITE of 0 pages took 0 jiffies
block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
block drbd0: IO ERROR: neither local nor remote disk
d-con gfs00: Connection closed
d-con gfs00: Not fencing peer, I'm not even Consistent myself.
d-con gfs00: conn( NetworkFailure -> Unconnected )
d-con gfs00: receiver terminated
d-con gfs00: Restarting receiver thread
d-con gfs00: receiver (re)started
d-con gfs00: conn( Unconnected -> WFConnection )
Buffer I/O error on device drbd0, logical block 5242688
block drbd0: IO ERROR: neither local nor remote disk
Buffer I/O error on device drbd0, logical block 5242709
Buffer I/O error on device drbd0, logical block 5242709
Buffer I/O error on device drbd0, logical block 0
Buffer I/O error on device drbd0, logical block 0
Buffer I/O error on device drbd0, logical block 1
Buffer I/O error on device drbd0, logical block 5242710
Buffer I/O error on device drbd0, logical block 5242710
dlm: Using SCTP for communications
SCTP: Hash tables configured (established 65536 bind 65536)
block drbd0: 271 messages suppressed in /root/rpmbuild/BUILD/drbd-8.4.0/drbd/drbd_req.c:856.
block drbd0: IO ERROR: neither local nor remote disk
block drbd0: IO ERROR: neither local nor remote disk
block drbd0: IO ERROR: neither local nor remote disk
__ratelimit: 265 callbacks suppressed
Buffer I/O error on device drbd0, logical block 0
Buffer I/O error on device drbd0, logical block 1
Buffer I/O error on device drbd0, logical block 2
Buffer I/O error on device drbd0, logical block 3
Buffer I/O error on device drbd0, logical block 0
Buffer I/O error on device drbd0, logical block 0
Buffer I/O error on device drbd0, logical block 1
Buffer I/O error on device drbd0, logical block 2
Buffer I/O error on device drbd0, logical block 3
Buffer I/O error on device drbd0, logical block 0
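For anyone comparing logs from both nodes, the state changes buried in the dmesg output above (for example conn( WFSyncUUID -> NetworkFailure )) may be easier to follow once pulled out on their own. The following is only a rough sketch, assuming the "field( Old -> New )" format seen in this log; the function name state_transitions and the sample string are illustrative and not part of any DRBD tooling.

```python
import re

# Matches state changes written as "field( Old -> New )", the form that
# appears in the dmesg lines quoted in this message (assumed format).
TRANSITION = re.compile(r"(\w+)\(\s*(\w+)\s*->\s*(\w+)\s*\)")


def state_transitions(log_text):
    """Return (field, old_state, new_state) tuples in order of appearance."""
    return TRANSITION.findall(log_text)


# Three lines copied from the log above, used here as sample input.
sample = (
    "d-con gfs00: peer( Primary -> Unknown ) conn( WFSyncUUID -> NetworkFailure ) "
    "pdsk( UpToDate -> DUnknown )\n"
    "d-con gfs00: conn( NetworkFailure -> Unconnected )\n"
    "d-con gfs00: conn( Unconnected -> WFConnection )\n"
)

for field, old, new in state_transitions(sample):
    print(f"{field}: {old} -> {new}")
```

Filtering the result on the conn field alone gives the connection path on 01: WFSyncUUID, NetworkFailure, Unconnected, WFConnection.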