Hi,

my setup: drbd-0.7.24, 2 drbd devices (active/active in the 0.7.x sense), one for nfs (drbd0) and one for database files/postgresql (drbd1). For simplicity's sake, let's call the servers nfs and database as well.

This morning, I failed over the nfs (drbd0) device to the second server, upgraded nfs-utils on the nfs (now secondary) server, failed the resource back over to it and restarted the services. Everything went as planned...

...not quite: although this part worked, nfs is now occasionally spewing messages like this:

---snip---
Dec 7 10:29:43 nfs kernel: drbd1: 195 messages suppressed in /usr/src/2.6/2.6/2.6.12.6/drbd-0.7.24/drbd/drbd_req.c:214.
Dec 7 10:29:43 nfs kernel: drbd1: Not in Primary state, no IO requests allowed
Dec 7 10:29:43 nfs kernel: printk: 119 messages suppressed.
Dec 7 10:29:43 nfs kernel: Buffer I/O error on device drbd1, logical block 11645072
Dec 7 10:29:43 nfs kernel: drbd1: Not in Primary state, no IO requests allowed
Dec 7 10:29:43 nfs kernel: Buffer I/O error on device drbd1, logical block 11645072
Dec 7 10:29:43 nfs kernel: drbd1: Not in Primary state, no IO requests allowed
Dec 7 10:29:43 nfs kernel: Buffer I/O error on device drbd1, logical block 0
Dec 7 10:29:43 nfs kernel: drbd1: Not in Primary state, no IO requests allowed
Dec 7 10:29:43 nfs kernel: Buffer I/O error on device drbd1, logical block 1
Dec 7 10:29:43 nfs kernel: drbd1: Not in Primary state, no IO requests allowed
Dec 7 10:29:43 nfs kernel: Buffer I/O error on device drbd1, logical block 2
Dec 7 10:29:43 nfs kernel: Buffer I/O error on device drbd1, logical block 3
Dec 7 10:29:43 nfs kernel: Buffer I/O error on device drbd1, logical block 4
Dec 7 10:29:43 nfs kernel: Buffer I/O error on device drbd1, logical block 5
Dec 7 10:29:43 nfs kernel: Buffer I/O error on device drbd1, logical block 6
Dec 7 10:29:43 nfs kernel: Buffer I/O error on device drbd1, logical block 7
---snip---

Note that drbd1 was not even mentioned in this procedure and indeed, no-one messed with it as far as I can tell. The only thing I noticed out of the ordinary is that the servers' clocks had drifted apart by a few (20 or so) seconds due to a hastily written ntp daemon init script.

I have also invalidated the secondary database device and did a full sync, but the messages keep coming at 5-10 minute intervals.

Any ideas what these mean / how to make them stop / correct the problem (if any)? (I am certain that no daemon, programme, script, whatever is trying to do direct i/o to /dev/drbd1 - apart from drbd itself, that is.)

Also note that everything, apart from those messages, seems to be in order, i.e. cat /proc/drbd shows

---snip---
version: 0.7.24 (api:79/proto:74)
SVN Revision: 2875 build by @nfs, 2007-09-27 21:19:02
 0: cs:Connected st:Primary/Secondary ld:Consistent
    ns:1791478308 nr:168499164 dw:1966908328 dr:832356966 al:15116024 bm:20140 lo:0 pe:0 ua:0 ap:0
 1: cs:Connected st:Secondary/Primary ld:Consistent
    ns:1653508 nr:166246668 dw:166908692 dr:8437441 al:12636 bm:8913 lo:0 pe:0 ua:0 ap:0
---snip---

on nfs and the opposite on the database.
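
For anyone who wants to check the per-device roles without reading the status by eye, the /proc/drbd output quoted above can be picked apart with a few lines of Python. This is only a minimal sketch, assuming the 0.7-style "N: cs:... st:Local/Peer ld:..." layout shown in this post; the function name parse_drbd_roles and the SAMPLE string are made up for illustration and are not part of drbd.

```python
import re

# Matches 0.7-style device lines such as
#   " 1: cs:Connected st:Secondary/Primary ld:Consistent"
# (layout assumed from the output quoted in this post).
DEVICE_LINE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)"
    r"\s+st:(?P<local>[^/\s]+)/(?P<peer>\S+)"
    r"\s+ld:(?P<ld>\S+)"
)


def parse_drbd_roles(text):
    """Return {minor: {"cs", "local", "peer", "ld"}} for each device line."""
    devices = {}
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            fields = match.groupdict()
            devices[int(fields.pop("minor"))] = fields
    return devices


# Hypothetical sample: the status lines from this post, as seen on "nfs".
SAMPLE = """\
version: 0.7.24 (api:79/proto:74)
SVN Revision: 2875 build by @nfs, 2007-09-27 21:19:02
 0: cs:Connected st:Primary/Secondary ld:Consistent
    ns:1791478308 nr:168499164 dw:1966908328 dr:832356966
 1: cs:Connected st:Secondary/Primary ld:Consistent
    ns:1653508 nr:166246668 dw:166908692 dr:8437441
"""

if __name__ == "__main__":
    for minor, info in sorted(parse_drbd_roles(SAMPLE).items()):
        print("drbd%d: local=%s peer=%s cs=%s ld=%s" % (
            minor, info["local"], info["peer"], info["cs"], info["ld"]))
```

On a live node the same function could be fed the contents of /proc/drbd instead of SAMPLE. In the sample, drbd1 is Secondary on nfs, which is the device the "Not in Primary state" messages refer to.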