Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi, I have managed to get this setup to work in a virtual environment, but am running into problems when running on physical hardware. My current setup is the latest debian linux with the approved packages for drbd and heartbeat This is my heartbeat status when running. root at HPSATA01:~# crm_mon -1 ============ Last updated: Tue Apr 2 11:14:32 2013 Stack: Heartbeat Current DC: hpsata02 (9e01243d-9232-437e-81ad-e00aabffef8f) - partition with quorum Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b 2 Nodes configured, unknown expected votes 2 Resources configured. ============ Online: [ hpsata01 hpsata02 ] Master/Slave Set: ms_g_drbd Masters: [ hpsata01 ] Slaves: [ hpsata02 ] Resource Group: g_services MetaFS (ocf::heartbeat:Filesystem): Started hpsata01 lvmdata (ocf::heartbeat:LVM): Started hpsata01 ClusterIP (ocf::heartbeat:IPaddr2): Started hpsata01 iscsi (lsb:iscsitarget): Started hpsata01 Failed actions: iscsi_monitor_0 (node=hpsata01, call=7, rc=1, status=complete): unknown error iscsi_monitor_0 (node=hpsata02, call=105, rc=1, status=complete): unknown error root at HPSATA01:~# And my DRBD status: root at HPSATA01:~# service drbd status drbd driver loaded OK; device status: version: 8.3.7 (api:88/proto:86-91) srcversion: EE47D8BF18AC166BE219757 m:res cs ro ds p mounted fstype 0:meta Connected Primary/Secondary UpToDate/UpToDate C /meta ext3 1:data Connected Primary/Secondary UpToDate/UpToDate C root at HPSATA01:~# So far so good. The problem arises when I put the primary node to standby. The same thing happens when I reboot, power off, or pull the plug. root at HPSATA01:~# crm_mon -1 ============ Last updated: Tue Apr 2 12:54:56 2013 Stack: Heartbeat Current DC: hpsata02 (9e01243d-9232-437e-81ad-e00aabffef8f) - partition with quorum Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b 2 Nodes configured, unknown expected votes 2 Resources configured. ============ Node hpsata01 (d60c3fcb-55e4-44a1-9e7a-2b03efb501e8): standby Online: [ hpsata02 ] Master/Slave Set: ms_g_drbd Slaves: [ hpsata02 ] Stopped: [ g_drbd:0 ] On node 2 this is the DRBD status root at HPSATA02:~# service drbd status drbd driver loaded OK; device status: version: 8.3.7 (api:88/proto:86-91) srcversion: EE47D8BF18AC166BE219757 m:res cs ro ds p mounted fstype 0:meta WFConnection Secondary/Unknown Outdated/DUnknown C 1:data WFConnection Secondary/Unknown Outdated/DUnknown C root at HPSATA02:~# Here is some info from my ha-debug file: Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: unpack_config: On loss of CCM Quorum: Ignore Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0 Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: unpack_status: Node hpsata01 is in standby-mode Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: determine_online_status: Node hpsata01 is standby Apr 02 12:53:33 HPSATA02 pengine: [1892]: WARN: unpack_rsc_op: Processing failed op iscsi_monitor_0 on hpsata01: unknown error (1) Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: determine_online_status: Node hpsata02 is online Apr 02 12:53:33 HPSATA02 pengine: [1892]: WARN: unpack_rsc_op: Processing failed op iscsi_monitor_0 on hpsata02: unknown error (1) Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: clone_print: Master/Slave Set: ms_g_drbd Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: short_print: Slaves: [ hpsata02 ] Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: short_print: Stopped: [ g_drbd:0 ] Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: group_print: Resource Group: g_services Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print: MetaFS (ocf::heartbeat:Filesystem): Stopped Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print: lvmdata (ocf::heartbeat:LVM): Stopped Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print: ClusterIP (ocf::heartbeat:IPaddr2): Stopped Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print: iscsi (lsb:iscsitarget): Stopped Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: native_merge_weights: drbd_meta:0: Rolling back scores from drbd_data:0 Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: native_color: Resource drbd_meta:0 cannot run anywhere Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: native_color: Resource drbd_data:0 cannot run anywhere Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: Promoting g_drbd:1 (Slave hpsata02) Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: ms_g_drbd: Promoted 1 instances of a possible 1 to master Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: Promoting g_drbd:1 (Slave hpsata02) Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: ms_g_drbd: Promoted 1 instances of a possible 1 to master Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: RecurringOp: Start recurring monitor (30s) for ClusterIP on hpsata02 Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Leave resource drbd_meta:0 (Stopped) Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Leave resource drbd_data:0 (Stopped) Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Promote drbd_meta:1 (Slave -> Master hpsata02) Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Promote drbd_data:1 (Slave -> Master hpsata02) Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start MetaFS (hpsata02) Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start lvmdata (hpsata02) Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start ClusterIP (hpsata02) Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start iscsi (hpsata02) Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ] Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: unpack_graph: Unpacked transition 51: 21 actions in 21 synapses Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_te_invoke: Processing graph 51 (ref=pe_calc-dc-1364928813-381) derived from /var/lib/pengine/pe-input-1669.bz2 Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 43 fired and confirmed Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 84: notify drbd_meta:1_pre_notify_promote_0 on hpsata02 (local) Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=84:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_meta:1_notify_0 ) Apr 02 12:53:33 HPSATA02 lrmd: [1883]: info: rsc:drbd_meta:1:143: notify Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 86: notify drbd_data:1_pre_notify_promote_0 on hpsata02 (local) Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=86:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_data:1_notify_0 ) Apr 02 12:53:33 HPSATA02 lrmd: [1883]: info: rsc:drbd_data:1:144: notify Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: process_pe_message: Transition 51: PEngine Input stored in: /var/lib/pengine/pe-input-1669.bz2 Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_meta:1_notify_0 (call=143, rc=0, cib-update=254, confirmed=true) ok Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_data:1_notify_0 (call=144, rc=0, cib-update=255, confirmed=true) ok Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_meta:1_pre_notify_promote_0 (84) confirmed on hpsata02 (rc=0) Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_data:1_pre_notify_promote_0 (86) confirmed on hpsata02 (rc=0) Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 44 fired and confirmed Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 41 fired and confirmed Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 27 fired and confirmed Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 16: promote drbd_meta:1_promote_0 on hpsata02 (local) Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=16:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_meta:1_promote_0 ) Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_meta:1:145: promote Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:promote:stdout) Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_meta:1_promote_0 (call=145, rc=0, cib-update=256, confirmed=true) ok Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_meta:1_promote_0 (16) confirmed on hpsata02 (rc=0) Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 20: promote drbd_data:1_promote_0 on hpsata02 (local) Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=20:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_data:1_promote_0 ) Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_data:1:146: promote Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:promote:stdout) Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_data:1_promote_0 (call=146, rc=0, cib-update=257, confirmed=true) ok Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_data:1_promote_0 (20) confirmed on hpsata02 (rc=0) Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 28 fired and confirmed Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 42 fired and confirmed Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 45 fired and confirmed Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 85: notify drbd_meta:1_post_notify_promote_0 on hpsata02 (local) Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=85:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_meta:1_notify_0 ) Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_meta:1:147: notify Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 87: notify drbd_data:1_post_notify_promote_0 on hpsata02 (local) Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=87:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_data:1_notify_0 ) Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_data:1:148: notify Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:notify:stdout) Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:notify:stdout) drbd[2542]: 2013/04/02_12:53:34 ERROR: meta: Called drbdadm -c /etc/drbd.conf outdate meta drbd[2543]: 2013/04/02_12:53:34 ERROR: data: Called drbdadm -c /etc/drbd.conf outdate data drbd[2542]: 2013/04/02_12:53:34 ERROR: meta: Exit code 17 drbd[2543]: 2013/04/02_12:53:34 ERROR: data: Exit code 17 drbd[2542]: 2013/04/02_12:53:34 ERROR: meta: Command output: drbd[2543]: 2013/04/02_12:53:34 ERROR: data: Command output: Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:notify:stdout) Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:notify:stdout) Apr 02 12:53:34 HPSATA02 attrd: [1885]: info: attrd_trigger_update: Sending flush op to all hosts for: master-drbd_meta:1 (-INFINITY) Apr 02 12:53:34 HPSATA02 attrd: [1885]: info: attrd_perform_update: Sent update 137: master-drbd_meta:1=-INFINITY Apr 02 12:53:34 HPSATA02 attrd: [1885]: info: attrd_trigger_update: Sending flush op to all hosts for: master-drbd_data:1 (-INFINITY) Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:notify:stdout) Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:notify:stdout) Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_meta:1_notify_0 (call=147, rc=0, cib-update=258, confirmed=true) ok Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_data:1_notify_0 (call=148, rc=0, cib-update=259, confirmed=true) ok Once I bring node 1 back up everything works fine again. If anyone has a clue as to what is going on I would appreciate the help. Thanks, Jared. ________________________________ This email and any files transmitted with it are intended solely for the use of the individuals or entities to which they are addressed and are not to be disclosed to any other party. If you have received this email in error please return it to the sender, and delete any copies thereof. Thank you for your cooperation. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20130402/9822eb35/attachment.htm>