[DRBD-user] Couldn't find filesystem ext3 when running HA

Cindy KS TOH kstoh at dlsjubm.com.my
Tue Mar 24 04:24:57 CET 2009



Dear Sir/Madam,

Can anyone help me diagnose what is wrong with my setup? I cannot get
HA to work as expected.
I tested with one DRBD partition and it worked fine with HA.
Now I am using two DRBD partitions. If I start DRBD and mount the
partitions manually, everything works, but when HA is supposed to bring
them up automatically, it does not work as I want. It fails to mount
/dev/drbd1 and then gives up all the other resources as well.

I have included my drbd.conf, ha.cf, haresources and ha-debug log in
this email, because I am not sure how to describe the problem.
Help?

From
Cindy

-----------------------------------------------
heartbeat[8043]: 2009/03/24_09:19:34 info: **************************
heartbeat[8043]: 2009/03/24_09:19:34 info: Configuration validated. 
Starting heartbeat 2.1.3
heartbeat[8044]: 2009/03/24_09:19:34 info: heartbeat: version 2.1.3
heartbeat[8044]: 2009/03/24_09:19:34 info: Heartbeat generation: 1237792620
heartbeat[8044]: 2009/03/24_09:19:34 info: glib: UDP Broadcast heartbeat 
started on port 694 (694) interface eth0
heartbeat[8044]: 2009/03/24_09:19:34 info: glib: UDP Broadcast heartbeat 
closed on port 694 interface eth0 - Status: 1
heartbeat[8044]: 2009/03/24_09:19:34 info: G_main_add_TriggerHandler: 
Added signal manual handler
heartbeat[8044]: 2009/03/24_09:19:34 info: G_main_add_TriggerHandler: 
Added signal manual handler
heartbeat[8044]: 2009/03/24_09:19:34 info: G_main_add_SignalHandler: 
Added signal handler for signal 17
heartbeat[8044]: 2009/03/24_09:19:34 info: Local status now set to: 'up'
heartbeat[8044]: 2009/03/24_09:19:35 info: Link f10-1:eth0 up.
heartbeat[8044]: 2009/03/24_09:20:24 info: Link f10-2:eth0 up.
heartbeat[8044]: 2009/03/24_09:20:24 info: Status update for node f10-2: 
status up
heartbeat[8055]: 2009/03/24_09:20:24 debug: notify_world: setting 
SIGCHLD Handler to SIG_DFL
heartbeat[8044]: 2009/03/24_09:20:24 debug: get_delnodelist: delnodelist=
harc[8055]:    2009/03/24_09:20:24 info: Running /etc/ha.d/rc.d/status 
status
heartbeat[8044]: 2009/03/24_09:20:25 info: Comm_now_up(): updating 
status to active
heartbeat[8044]: 2009/03/24_09:20:25 info: Local status now set to: 'active'
heartbeat[8044]: 2009/03/24_09:20:25 info: Status update for node f10-2: 
status active
heartbeat[8072]: 2009/03/24_09:20:25 debug: notify_world: setting 
SIGCHLD Handler to SIG_DFL
harc[8072]:    2009/03/24_09:20:25 info: Running /etc/ha.d/rc.d/status 
status
heartbeat[8044]: 2009/03/24_09:20:35 info: local resource transition 
completed.
heartbeat[8044]: 2009/03/24_09:20:35 info: Initial resource acquisition 
complete (T_RESOURCES(us))
IPaddr[8127]:    2009/03/24_09:20:35 INFO:  Resource is stopped
heartbeat[8091]: 2009/03/24_09:20:35 info: Local Resource acquisition 
completed.
heartbeat[8044]: 2009/03/24_09:20:35 debug: StartNextRemoteRscReq(): 
child count 1
heartbeat[8166]: 2009/03/24_09:20:35 debug: notify_world: setting 
SIGCHLD Handler to SIG_DFL
harc[8166]:    2009/03/24_09:20:35 info: Running 
/etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[8166]:    2009/03/24_09:20:35 received ip-request-resp 
172.16.3.100 OK yes
ResourceManager[8187]:    2009/03/24_09:20:35 info: Acquiring resource 
group: f10-1 172.16.3.100 drbddisk::r0 drbddisk::r1 
Filesystem::/dev/drbd0::/data::ext3 Filesystem::/dev/drbd1::/data2::ext3
heartbeat[8044]: 2009/03/24_09:20:35 info: remote resource transition 
completed.
IPaddr[8214]:    2009/03/24_09:20:35 INFO:  Resource is stopped
ResourceManager[8187]:    2009/03/24_09:20:35 info: Running 
/etc/ha.d/resource.d/IPaddr 172.16.3.100 start
ResourceManager[8187]:    2009/03/24_09:20:35 debug: Starting 
/etc/ha.d/resource.d/IPaddr 172.16.3.100 start
IPaddr[8290]:    2009/03/24_09:20:35 INFO: Using calculated nic for 
172.16.3.100: eth1
IPaddr[8290]:    2009/03/24_09:20:36 INFO: Using calculated netmask for 
172.16.3.100: 255.255.0.0
IPaddr[8290]:    2009/03/24_09:20:36 DEBUG: Using calculated broadcast 
for 172.16.3.100: 172.16.255.255
IPaddr[8290]:    2009/03/24_09:20:36 INFO: eval ifconfig eth1:0 
172.16.3.100 netmask 255.255.0.0 broadcast 172.16.255.255
IPaddr[8290]:    2009/03/24_09:20:36 DEBUG: Sending Gratuitous Arp for 
172.16.3.100 on eth1:0 [eth1]
IPaddr[8273]:    2009/03/24_09:20:36 INFO:  Success
INFO:  Success
ResourceManager[8187]:    2009/03/24_09:20:36 debug: 
/etc/ha.d/resource.d/IPaddr 172.16.3.100 start done. RC=0
Filesystem[8432]:    2009/03/24_09:20:36 INFO:  Resource is stopped
ResourceManager[8187]:    2009/03/24_09:20:36 info: Running 
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 start
ResourceManager[8187]:    2009/03/24_09:20:36 debug: Starting 
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 start
Filesystem[8513]:    2009/03/24_09:20:36 INFO: Running start for 
/dev/drbd0 on /data
Filesystem[8502]:    2009/03/24_09:20:36 INFO:  Success
INFO:  Success
ResourceManager[8187]:    2009/03/24_09:20:36 debug: 
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 start done. RC=0
Filesystem[8581]:    2009/03/24_09:20:36 INFO:  Resource is stopped
*ResourceManager[8187]:    2009/03/24_09:20:36 info: Running 
/etc/ha.d/resource.d/Filesystem /dev/drbd1 /data2 ext3
 start
ResourceManager[8187]:    2009/03/24_09:20:36 debug: Starting 
/etc/ha.d/resource.d/Filesystem /dev/drbd1 /data2 ext3
 start
Filesystem[8662]:    2009/03/24_09:20:37 INFO: Running start for 
/dev/drbd1 on /data2
Filesystem[8662]:    2009/03/24_09:20:37 ERROR: Couldn't find filesystem 
ext3
 in /proc/filesystems
Filesystem[8651]:    2009/03/24_09:20:37 ERROR:  Illegal argument
ERROR:  Illegal argument*
ResourceManager[8187]:    2009/03/24_09:20:37 debug: 
/etc/ha.d/resource.d/Filesystem /dev/drbd1 /data2 ext3
 start done. RC=2
ResourceManager[8187]:    2009/03/24_09:20:37 ERROR: Return code 2 from 
/etc/ha.d/resource.d/Filesystem
ResourceManager[8187]:    2009/03/24_09:20:37 CRIT: Giving up resources 
due to failure of Filesystem::/dev/drbd1::/data2::ext3
ResourceManager[8187]:    2009/03/24_09:20:37 info: Releasing resource 
group: f10-1 172.16.3.100 drbddisk::r0 drbddisk::r1 
Filesystem::/dev/drbd0::/data::ext3 Filesystem::/dev/drbd1::/data2::ext3
ResourceManager[8187]:    2009/03/24_09:20:37 info: Running 
/etc/ha.d/resource.d/Filesystem /dev/drbd1 /data2 ext3
 stop
ResourceManager[8187]:    2009/03/24_09:20:37 debug: Starting 
/etc/ha.d/resource.d/Filesystem /dev/drbd1 /data2 ext3
 stop
Filesystem[8765]:    2009/03/24_09:20:37 INFO: Running stop for 
/dev/drbd1 on /data2
Filesystem[8754]:    2009/03/24_09:20:37 INFO:  Success
INFO:  Success
ResourceManager[8187]:    2009/03/24_09:20:37 debug: 
/etc/ha.d/resource.d/Filesystem /dev/drbd1 /data2 ext3
 stop done. RC=0
ResourceManager[8187]:    2009/03/24_09:20:37 info: Running 
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop
ResourceManager[8187]:    2009/03/24_09:20:37 debug: Starting 
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop
Filesystem[8848]:    2009/03/24_09:20:37 INFO: Running stop for 
/dev/drbd0 on /data
Filesystem[8848]:    2009/03/24_09:20:37 INFO: Trying to unmount /data
Filesystem[8848]:    2009/03/24_09:20:37 INFO: unmounted /data successfully
Filesystem[8837]:    2009/03/24_09:20:37 INFO:  Success
INFO:  Success
ResourceManager[8187]:    2009/03/24_09:20:37 debug: 
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop done. RC=0
ResourceManager[8187]:    2009/03/24_09:20:37 info: Running 
/etc/ha.d/resource.d/drbddisk r1 stop
ResourceManager[8187]:    2009/03/24_09:20:37 debug: Starting 
/etc/ha.d/resource.d/drbddisk r1 stop
ResourceManager[8187]:    2009/03/24_09:20:37 debug: 
/etc/ha.d/resource.d/drbddisk r1 stop done. RC=0
ResourceManager[8187]:    2009/03/24_09:20:37 info: Running 
/etc/ha.d/resource.d/drbddisk r0 stop
ResourceManager[8187]:    2009/03/24_09:20:37 debug: Starting 
/etc/ha.d/resource.d/drbddisk r0 stop
ResourceManager[8187]:    2009/03/24_09:20:37 debug: 
/etc/ha.d/resource.d/drbddisk r0 stop done. RC=0
ResourceManager[8187]:    2009/03/24_09:20:37 info: Running 
/etc/ha.d/resource.d/IPaddr 172.16.3.100 stop
ResourceManager[8187]:    2009/03/24_09:20:37 debug: Starting 
/etc/ha.d/resource.d/IPaddr 172.16.3.100 stop
In IP Stop
SIOCDELRT: No such process
IPaddr[9012]:    2009/03/24_09:20:38 INFO: ifconfig eth1:0 down
IPaddr[8995]:    2009/03/24_09:20:38 INFO:  Success
INFO:  Success
ResourceManager[8187]:    2009/03/24_09:20:38 debug: 
/etc/ha.d/resource.d/IPaddr 172.16.3.100 stop done. RC=0
hb_standby[9057]:    2009/03/24_09:21:08 Going standby [foreign].
heartbeat[8044]: 2009/03/24_09:21:08 info: f10-1 wants to go standby 
[foreign]
heartbeat[8044]: 2009/03/24_09:21:08 info: standby: f10-2 can take our 
foreign resources
heartbeat[9071]: 2009/03/24_09:21:08 info: give up foreign HA resources 
(standby).
heartbeat[9071]: 2009/03/24_09:21:08 info: foreign HA resource release 
completed (standby).
heartbeat[8044]: 2009/03/24_09:21:08 info: Local standby process 
completed [foreign].
heartbeat[8044]: 2009/03/24_09:21:08 WARN: 1 lost packet(s) for [f10-2] 
[36:38]
heartbeat[8044]: 2009/03/24_09:21:08 info: remote resource transition 
completed.
heartbeat[8044]: 2009/03/24_09:21:08 info: No pkts missing from f10-2!
heartbeat[8044]: 2009/03/24_09:21:08 info: Other node completed standby 
takeover of foreign resources.
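
To pull the failing step out of a long ha-debug log like the one above, the ERROR and CRIT entries can be filtered out. The following is only a rough sketch: the regular expression is an assumption based on the line layout seen in this log (name[pid]: timestamp LEVEL: message), the function name find_failures is made up for illustration, and wrapped continuation lines are ignored.

```python
import re

# Assumed layout, taken from the lines in this log:
#   name[pid]: YYYY/MM/DD_HH:MM:SS LEVEL: message
LOG_LINE = re.compile(
    r"^\*?(?P<source>[\w-]+)\[(?P<pid>\d+)\]:\s+"
    r"(?P<stamp>\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Za-z]+):\s*(?P<message>.*)$"
)


def find_failures(lines):
    """Return (timestamp, source, level, message) for ERROR and CRIT lines."""
    failures = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match is None:
            continue  # continuation line or a line without a level field
        level = match.group("level").upper()
        if level in {"ERROR", "CRIT"}:
            failures.append(
                (match.group("stamp"), match.group("source"),
                 level, match.group("message"))
            )
    return failures


if __name__ == "__main__":
    # Path taken from the debugfile setting in ha.cf.
    with open("/var/log/ha-debug") as log:
        for stamp, source, level, message in find_failures(log):
            print(stamp, source, level, message)
```

For the log above this would report the Filesystem errors at 09:20:37 and the CRIT line from ResourceManager, which is where the resource group is given up.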

---------------------------
[root@f10-1 ~]# cat /etc/ha.d/ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility     local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
bcast   eth0            # Linux
auto_failback on
node    f10-1
node    f10-2
---------------------------
[root@f10-1 ~]# cat /etc/ha.d/haresources
f10-1 172.16.3.100 drbddisk::r0 drbddisk::r1 Filesystem::/dev/drbd0::/data::ext3 Filesystem::/dev/drbd1::/data2::ext3
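
In the log above, the filesystem type for /dev/drbd1 is printed as "ext3" followed by a line break before "start", and again before "in /proc/filesystems". One possible explanation (an assumption, not something confirmed here) is that the last argument of the haresources entry carries a stray trailing character such as a carriage return. A minimal sketch to check an entry for such characters; the function name suspicious_args is hypothetical:

```python
import re


def suspicious_args(line):
    """Return (resource, argument) pairs whose argument contains
    whitespace or non-printable characters."""
    # Split on spaces and tabs only, so a carriage return stays attached
    # to the field it belongs to instead of being swallowed.
    fields = [f for f in re.split(r"[ \t]+", line.rstrip("\n")) if f]
    findings = []
    for field in fields[1:]:  # fields[0] is the node name
        for part in field.split("::"):
            if any(ch.isspace() or not ch.isprintable() for ch in part):
                findings.append((field, part))
    return findings


if __name__ == "__main__":
    # Hypothetical entry with a trailing carriage return (CRLF line ending).
    entry = ("f10-1 172.16.3.100 drbddisk::r0 drbddisk::r1 "
             "Filesystem::/dev/drbd0::/data::ext3 "
             "Filesystem::/dev/drbd1::/data2::ext3\r\n")
    for resource, arg in suspicious_args(entry):
        print(repr(resource), "has suspicious argument", repr(arg))
```

When reading the real file, opening it with open(path, newline="") keeps any carriage returns intact, since Python's default text mode would otherwise translate them away before the check runs.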

---------------------------
[root@f10-1 ~]# cat /etc/drbd.conf

global {
    usage-count yes;
}
common {
  syncer { rate 100M; }
}
resource r0 {
  protocol C;

  handlers {
    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
  }

  startup {
    degr-wfc-timeout 120;    # 2 minutes.
  }

  disk {
    on-io-error   detach;
  }

  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }

  syncer {
    rate 1000M;
    al-extents 257;
  }

  on f10-1 {
    device     /dev/drbd0;
    disk       /dev/VolGroup00/LogVol03;
    address    10.0.0.1:7788;
    meta-disk  /dev/VolGroup00/LogVol02[0];
  }

  on f10-2 {
    device    /dev/drbd0;
    disk      /dev/VolGroup00/LogVol03;
    address   10.0.0.2:7788;
    meta-disk /dev/VolGroup00/LogVol02[0];
  }
}

resource r1 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
  }
  startup {
    wfc-timeout         0;  ## Infinite!
    degr-wfc-timeout  120;  ## 2 minutes.
  }
  disk {
    on-io-error detach;
  }
  net {
    # timeout           60;
    # connect-int       10;
    # ping-int          10;
    # max-buffers     2048;
    # max-epoch-size  2048;
    # cram-hmac-alg "sha1";
    # shared-secret "FooFunFactory";
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    rate 1000M;
    al-extents 257;
  }

  device    /dev/drbd1;
  disk      /dev/VolGroup00/LogVol04;
  meta-disk /dev/VolGroup00/LogVol02[1];

  on f10-1 {
    address    10.0.0.1:7789;
  }

  on f10-2 {
    address     10.0.0.2:7789;
  }
}





