>
> We finally have our systems configured and functioning in a
> drbd environment with redhat 9.0 / 2.4.x kernel SMP.
> Thanks to all for the assistance. Now that the partitions
> are syncing, we see that we are getting some errors on the
> primary as follows:
>
> drbd0: Secondary/Secondary --> Primary/Secondary kjournald
> starting. Commit interval 5 seconds
> EXT3 FS 2.4-0.9.19, 19 August 2002 on drbd(147,0), internal journal
> EXT3-fs: mounted filesystem with ordered data mode.
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967294
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967294
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967294
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967294
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967294
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> aacraid:ID(0:00:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
> drbd0: [drbd0_worker/2411] sock_sendmsg time expired, ko = 4294967295
>
> Notice that I've only seen one occurrence of the aacraid error
> on the primary...
> Now, see the output from the secondary box below. These
> systems are both Dell 2400 with the same configuration /
> hardware. Is this pointing to a hardware issue? I don't
> think it is, but I'm not sure, that's why I'm asking if
> anyone has any ideas. I notice it never complains about
> devices (0:00:0) or (0:01:0) which would be hard drive 0 and
> 1. Anyone have anything to add to this? I believe the time
> expired errors are being caused by the underlying issue,
> whatever that may be. I don't think any tweaking of drbd
> will fix it, but instead maybe a raid issue or hardware issue.
>
> drbd0: Resync started as SyncTarget (need to sync 52863812 KB
> [13215953 bits set]).
> aacraid:ID(0:05:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:03:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:04:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:05:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:04:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> drbd0: Secondary/Secondary --> Secondary/Primary
> aacraid: <...repeats 1 more times>
> aacraid:ID(0:03:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:03:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:04:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:05:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:02:0) Timeout detected on cmd[0x28] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:03:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:05:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid: <...repeats 2 more times>
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid: <...repeats 1 more times>
> aacraid:ID(0:04:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:03:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid: <...repeats 1 more times>
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid: <...repeats 1 more times>
> aacraid:ID(0:05:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid: <...repeats 1 more times>
> aacraid:ID(0:04:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:04:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid: <...repeats 2 more times>
> aacraid:ID(0:05:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:04:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:04:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:02:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:05:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:03:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> aacraid:ID(0:04:0) Timeout detected on cmd[0x2a] aacraid:SCSI
> Channel[0]: Timeout Detected On 1 Command(s)
> [root@linux2 src]#
>
> Thanks,
> Dan

Some more info for you. On the system that is not showing the aacraid
errors (linux1 / primary) the system load is very low:

[root@linux1 /]# uptime
 09:40:23 up 1:07, 1 user, load average: 0.02, 0.09, 0.08

However, on the system that is showing the problems (linux2 / secondary)
the system load is very high. I am thinking this is because of whatever
is the underlying issue:

[root@linux2 mail]# uptime
 23:46:14 up 1:25, 1 user, load average: 2.96, 2.85, 2.03

Any thoughts on this?

Thanks,
Dan

>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
>
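
For anyone wanting to check the observation that the secondary never complains about devices (0:00:0) or (0:01:0), a rough way to tally the aacraid timeouts per device ID is sketched below. This is only an illustration in Python, not something from the original thread: the function name `tally_timeouts` and the short sample log are made up for the example, and the "<...repeats N more times>" lines are deliberately not counted. For reference, SCSI opcode 0x2a is WRITE(10) and 0x28 is READ(10).

```python
import re
from collections import Counter

# Matches messages such as: aacraid:ID(0:05:0) Timeout detected on cmd[0x2a]
# The three numbers in the ID are kept together as one device key.
TIMEOUT_RE = re.compile(
    r"aacraid:ID\((\d+:\d+:\d+)\) Timeout detected on cmd\[(0x[0-9a-fA-F]+)\]"
)


def tally_timeouts(log_text):
    """Count aacraid timeout messages per device ID and per SCSI opcode.

    Hypothetical helper for illustration. Lines of the form
    "aacraid: <...repeats N more times>" are ignored, so the counts
    are a lower bound on the number of timeouts actually logged.
    """
    per_device = Counter()
    per_cmd = Counter()
    for match in TIMEOUT_RE.finditer(log_text):
        device, cmd = match.groups()
        per_device[device] += 1
        per_cmd[cmd.lower()] += 1
    return per_device, per_cmd


if __name__ == "__main__":
    # A few lines in the same shape as the secondary's log (illustrative sample).
    sample = """\
aacraid:ID(0:05:0) Timeout detected on cmd[0x2a]
aacraid:SCSI Channel[0]: Timeout Detected On 1 Command(s)
aacraid:ID(0:02:0) Timeout detected on cmd[0x2a]
aacraid:SCSI Channel[0]: Timeout Detected On 1 Command(s)
aacraid:ID(0:02:0) Timeout detected on cmd[0x28]
aacraid:SCSI Channel[0]: Timeout Detected On 1 Command(s)
"""
    devices, commands = tally_timeouts(sample)
    for device, count in sorted(devices.items()):
        print(f"device {device}: {count} timeout(s)")
    for cmd, count in sorted(commands.items()):
        print(f"opcode {cmd}: {count} timeout(s)")
```

Because the pattern is searched anywhere in the text, it should also work on log lines that still carry the "> " quote prefix. Feeding it the real output of dmesg or /var/log/messages instead of the sample would show whether the timeouts are spread across several devices or concentrated on one.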