[DRBD-user] 0.6.11 "unknown packet type" ... connection lost

Felix Ide felix.ide-drbd at educators.de
Thu Feb 19 10:28:48 CET 2004


Hi Lars,

I added the patch and this is what I get now:

syslog on primary node:
Feb 18 18:56:16 drbd1-bfd kernel: drbd0: send_cmd_dontwait returned 4
Feb 18 18:56:16 drbd1-bfd kernel: drbd0: [bonnie++/20787] sock_sendmsg returned -32
Feb 18 18:56:16 drbd1-bfd kernel: drbd0: send_cmd_dontwait returned -1000
Feb 18 18:56:16 drbd1-bfd kernel: drbd0: send_cmd_dontwait returned -1000
Feb 18 18:56:19 drbd1-bfd kernel: drbd0: Connection lost.
Feb 18 18:56:19 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 18 18:56:19 drbd1-bfd kernel: drbd0: Synchronisation started blks=64
Feb 18 18:56:39 drbd1-bfd kernel: drbd0: Synchronisation done.
Feb 18 19:12:55 drbd1-bfd kernel: drbd0: send_cmd_dontwait returned 4
Feb 18 19:12:55 drbd1-bfd kernel: drbd0: [kupdated/7] sock_sendmsg returned -32
Feb 18 19:12:55 drbd1-bfd kernel: drbd0: send_cmd_dontwait returned -1001
Feb 18 19:12:55 drbd1-bfd kernel: drbd0: send_cmd_dontwait returned -1001
Feb 18 19:12:55 drbd1-bfd kernel: drbd0: send_cmd_dontwait returned -1000
Feb 18 19:13:03 drbd1-bfd kernel: drbd0: Connection lost.
Feb 18 19:13:03 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 18 19:13:03 drbd1-bfd kernel: drbd0: Synchronisation started blks=64
Feb 18 19:13:06 drbd1-bfd kernel: drbd0: transferlog too small!!
Feb 18 19:13:08 drbd1-bfd kernel: drbd0: [drbd_syncer_0/21484] sock_sendmsg returned -104
Feb 18 19:13:08 drbd1-bfd kernel: drbd0: Syncer send failed.
Feb 18 19:13:08 drbd1-bfd kernel: drbd0: send_cmd_dontwait returned -1001
Feb 18 19:13:08 drbd1-bfd kernel: drbd0: send_cmd_dontwait returned -1000
Feb 18 19:13:11 drbd1-bfd kernel: drbd0: Connection lost.
Feb 18 19:13:11 drbd1-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 18 19:13:11 drbd1-bfd kernel: drbd0: Synchronisation started blks=64
Feb 18 19:23:44 drbd1-bfd kernel: drbd0: Synchronisation done.

syslog on secondary node:
Feb 18 18:56:22 drbd2-bfd kernel: drbd0: unknown packet type! 83740267:8374:0267
Feb 18 18:56:22 drbd2-bfd kernel: drbd0: Connection lost.
Feb 18 18:56:25 drbd2-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 18 19:13:01 drbd2-bfd kernel: drbd0: unknown packet type! 83740267:8374:0267
Feb 18 19:13:01 drbd2-bfd kernel: drbd0: Connection lost.
Feb 18 19:13:09 drbd2-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B
Feb 18 19:13:14 drbd2-bfd kernel: drbd0: [drbd_asender_0/24363] sock_sendmsg returned 0
Feb 18 19:13:14 drbd2-bfd kernel: drbd0: Connection lost.
Feb 18 19:13:17 drbd2-bfd kernel: drbd0: Connection established. size=26097472 KB / blksize=4096 B

After that, the primary system stalled! Heartbeat tries to switch to the
secondary node, but during the takeover the "datadisk drbd0 start" command
hangs and the failover does not complete ("datadisk drbd0 start" has now been
hanging for more than 12 hours).

Again my question: what is the largest possible size of the transferlog? It is
currently set to 16000; when I try to increase it, I get error messages while
drbd starts and parses the config file.

Thanks for your help and your work,
Felix


On Monday, 16 February 2004 19:59, Lars Ellenberg wrote:
> / 2004-02-16 18:34:14 +0100
>
> \ Lars Ellenberg:
> > / 2004-02-16 13:23:24 +0100
> >
> > \ Felix Ide:
> > > This is what the secondary says:
> > > Feb 16 13:11:46 drbd2-bfd kernel: drbd0: unknown packet type!
> > > 83740267:8374:0267
> >
> > oh shit.
> >
> > this looks like
> > "DRBD_MAGIC DRBD_MAGIC"
> >
> > Philipp, is it possible that we (I) created a new BUG while
> > trying to solve the blocking write_hints issue?
>
> I remember someone else reporting something similar recently,
> but searching my local archive does not give me any hit,
> and neither did an online search.
>
> anyway: please add the "debug printk" below.
> if it triggers, then I know what I did wrong,
> and have an idea how to fix it.
>
> 	Lars Ellenberg
>
>
> diff -u -p -r1.90 drbd_main.c
> --- drbd_main.c	18 Jan 2004 20:18:19 -0000	1.90
> +++ drbd_main.c	16 Feb 2004 18:39:37 -0000
> @@ -1078,6 +1078,10 @@ STATIC void drbd_send_write_hint(void *d
>  		//       ": send_cmd_dontwait would have blocked\n");
>  		queue_task(&mdev->write_hint_tq, &tq_disk);
>  	} else {
> +		if (i != 0)
> +			printk(KERN_ERR DEVICE_NAME
> +				"%d: send_cmd_dontwait returned %d\n",
> +				(int)(mdev-drbd_conf),i);
>  		// no need for error handling here,
>  		// drbd_send_cmd_dontwait already does it.
>  		clear_bit(WRITE_HINT_QUEUED, &mdev->flags);
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user

-- 

Kind regards,
Felix Ide

---------------------------------------------------------
 e d u c a t o r s  -  W i r   b i l d e n   E r f o l g
---------------------------------------------------------
 Felix Ide                    |   educators GmbH
 Managing Director            |   Hölzengraben 2
 felix.ide at educators.de       |   D-67657 Kaiserslautern

 Tel.: +49(0)631 34106-11
 Fax : +49(0)631 34106-22

 http://www.educators.de/
---------------------------------------------------------



