Hi.

I'm trying to set up a two-node cluster with two DRBD resources that will be used with OCFS2.

The configuration is:

openSUSE 12.1, kernel 2.6.37.6-101-desktop (I had to downgrade it, as I had trouble with DRBD and the original kernel)
DRBD version drbd-8.3.9-
Pacemaker version pacemaker-1.1.6

If I start DRBD with the system init script, everything works fine and both resources come up as Primary/Primary:

apolo:~ # cat /proc/drbd
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:672 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:680 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

When the cluster is configured to start the resources, they intermittently fail to start properly. I end up with a split brain on one of the resources, yet crm reports everything as OK, because there is at least one resource in Primary state, even though it is StandAlone:

apolo:~ # crm resource
crm(live)resource# status
 Master/Slave Set: msDRBD_0 [resDRBD_0]
     Masters: [ apolo diana ]
 Master/Slave Set: msDRBD_1 [resDRBD_1]
     Masters: [ apolo diana ]

apolo:~ # cat /proc/drbd
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:672 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----
    ns:0 nr:0 dw:0 dr:680 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Here are my configuration files for the DRBD resources and the cluster:

https://dl.dropbox.com/u/96446079/backup.res
https://dl.dropbox.com/u/96446079/export.res
https://dl.dropbox.com/u/96446079/crm_resources.txt

I even tried adding longer timeouts to the start/stop operations, but it made no difference:

    op start interval="0" timeout="240" \
    op stop interval="0" timeout="100" \

Can you please help me solve this issue?

Best regards,
Carlos
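
Since crm shows both master/slave sets as healthy while /proc/drbd shows one device as StandAlone, the mismatch is only visible in the DRBD connection state. A minimal sketch of a check for that, assuming the DRBD 8.3 /proc/drbd layout shown above; the function name find_unconnected is made up for illustration and is not part of DRBD or Pacemaker:

```python
import re

# Matches the first status line of each device in /proc/drbd (DRBD 8.3 layout), e.g.
# " 1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----"
STATUS_RE = re.compile(r"(\d+):\s+cs:(\S+)\s+ro:(\S+)\s+ds:(\S+)")


def find_unconnected(proc_drbd_text):
    """Return (minor, connection_state, roles) for each device whose cs is not Connected.

    Note: this also reports transient states such as WFConnection or
    SyncSource, not only StandAlone.
    """
    problems = []
    for match in STATUS_RE.finditer(proc_drbd_text):
        minor, cs, roles, _disk_states = match.groups()
        if cs != "Connected":
            problems.append((int(minor), cs, roles))
    return problems
```

On a node, it could be fed the contents of /proc/drbd, for example find_unconnected(open("/proc/drbd").read()). With the second output above it would report minor 1 as StandAlone with roles Primary/Unknown.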