On Mon, Aug 11, 2008 at 12:01:58PM -0400, Knight, Doug wrote:
> List,
>
> This morning I found the following error in /var/log/messages. I use heartbeat
> to manage drbd. I am running drbd 8.2.5. I was running a moderate level of load
> testing on my server, though the stats I see from the crash indicate the system
> was lightly loaded. This error caused heartbeat to think drbd had failed, and
> it restarted everything that depended upon drbd (including postgres).
> Everything came back up fine with the heartbeat restart of drbd. My questions
> are:
>
> 1) What does exit code 20 from drbdsetup mean?

"not further specified generic error" or some such.

> 2) The driver was definitely loaded, so what could cause the “no response
> from driver” message? Can a heavy system load cause some kind of timeout?

yes. apparently.

what kind of load did you have?
what kind of servers are these?
any virtualization involved?
how many cpu cores?
kernel version?
any "high priority" or real-time priority tasks (other than heartbeat)?

> Excerpt from messages:
>
> Aug 11 07:13:12 arc-stgsky-agg1 lrmd: : info: RA output:
> (rsc_drbd_7788:monitor:stderr) No response from the DRBD driver! Is the module
> loaded?
>
> Aug 11 07:13:12 arc-stgsky-agg1 lrmd: : info: RA output:
> (rsc_drbd_7788:monitor:stderr) Command '/sbin/drbdsetup /dev/drbd0 state'
> terminated with exit code 20 drbdadm aborting
>
> Aug 11 07:13:16 arc-stgsky-agg1 crmd: : info: process_lrm_event: LRM
> operation rsc_drbd_7788_monitor_120000 (call=477, rc=7) complete
>
> My current /proc/drbd:

which does not tell what it looked like during the incident...
but it does not matter anyways.

> version: 8.2.5 (api:88/proto:86-88)
>
> GIT-hash: 9faf052fdae5ef0c61b4d03890e2d2eab550610c build by
> root at arc-stgsky-agg1.wsicorp.com, 2008-05-19 10:01:19
>
> 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
>
> ns:593314408 nr:296596 dw:592558332 dr:124756728 al:2019770 bm:1535 lo:0
> pe:0 ua:0 ap:0
>
> resync: used:0/31 hits:65601 misses:191 starving:0 dirty:0 changed:191
>
> act_log: used:0/257 hits:682916576 misses:2027491 starving:4 dirty:7717
> changed:2019770

-- 
: Lars Ellenberg                           http://www.linbit.com :
: DRBD/HA support and consulting             sales at linbit.com :
: LINBIT Information Technologies GmbH       Tel +43-1-8178292-0 :
: Vivenotgasse 48, A-1120 Vienna/Europe     Fax +43-1-8178292-82 :
__
please don't Cc me, but send to list -- I'm subscribed