[DRBD-user] 0.7.6 and sync after invalidate

Philipp Reisner philipp.reisner at linbit.com
Tue Nov 23 17:53:17 CET 2004


On Tuesday 23 November 2004 13:40, Eugene Crosser wrote:
> Philipp and guys,
>
> I seem to hit the same problem today that I already reported long ago
> and that was apparently fixed long ago.  I am running kernel.org kernel
> 2.6.10-rc2 and drbd branch/drbd-0.7 checked out this morning, which
> reports itself as 0.7.6.  I was testing my system reaction to pulling
> out a disk; it did all right, drbd noticed underlying device failure and
> dutifully panicked.  After reconnecting the disk (to hardware RAID0), I
> got the system up and ran "drbdadm invalidate all" on the system with
> the would-be-replaced disk.  In half an hour, SyncTarget reported sync
> complete, but the SyncSource did not:
>
> Nov 23 14:00:58 nfsb2.mail.back kernel: drbd0: 214540288 KB now marked
> out-of-sync by on disk bit-map.
> Nov 23 14:00:58 nfsb2.mail.back kernel: drbd0: drbd0_receiver [155]:
> cstate Connected --> SyncSource
> Nov 23 14:00:58 nfsb2.mail.back kernel: drbd0: Resync started as
> SyncSource (need to sync 214540288 KB [53635072 bits set]).
> Nov 23 14:02:43 nfsb1.mail.back ntpd[133]: time set -0.032269 s
> Nov 23 14:16:10 nfsb2.mail.back -- MARK --
> Nov 23 14:18:05 nfsb1.mail.back ntpd[133]: time reset -0.145189 s
> Nov 23 14:36:10 nfsb2.mail.back -- MARK --
> Nov 23 14:38:45 nfsb1.mail.back -- MARK --
> Nov 23 14:41:42 nfsb1.mail.back kernel: drbd0: Resync done (total 2445
> sec; paused 0 sec; 87744 K/sec)
> Nov 23 14:41:42 nfsb1.mail.back kernel: drbd0: drbd0_worker [152]:
> cstate SyncTarget --> Connected
>

The line where nfsb1 starts the sync is missing; could you please post
that as well?
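
As a quick sanity check on the figures in the quoted log, here is a
minimal sketch. It assumes each bitmap bit covers 4 KB, which is only
inferred from the log's own numbers (214540288 KB for 53635072 bits);
the variable names are illustrative and not taken from drbd.

```python
# Figures copied from the quoted kernel log.
out_of_sync_kb = 214540288   # "KB now marked out-of-sync"
bits_set = 53635072          # "bits set"
resync_seconds = 2445        # "total 2445 sec"

# Assumption: a fixed number of KB per bitmap bit, inferred from the log.
kb_per_bit = out_of_sync_kb // bits_set
assert kb_per_bit * bits_set == out_of_sync_kb

# Average rate if the whole out-of-sync area was sent in the reported time.
avg_kb_per_sec = out_of_sync_kb / resync_seconds

print(kb_per_bit)
print(round(avg_kb_per_sec))
```

The average comes out within a few K/sec of the 87744 K/sec that the
SyncTarget logged, so the target's "Resync done" figures look
consistent with a full resync of the device.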

> Now, this is /proc/drbd on both nodes:
>
> root@hanode1:~# cat /proc/drbd
> version: 0.7.6 (api:77/proto:74)
> SVN Revision: 1649 build by crosser at ariel.sovam.com, 2004-11-23 11:15:51
>   0: cs:Connected st:Secondary/Primary ld:Consistent
>      ns:0 nr:234193156 dw:234193156 dr:0 al:0 bm:27472 lo:0 pe:0 ua:0 ap:0
>
> root@hanode2:~# cat /proc/drbd
> version: 0.7.6 (api:77/proto:74)
> SVN Revision: 1649 build by crosser at ariel.sovam.com, 2004-11-23 11:15:51
>   0: cs:SyncSource st:Primary/Secondary ld:Consistent
>      ns:234169036 nr:17584552 dw:38525788 dr:219993817 al:8449 bm:30241
> lo:0 pe:0 ua:0 ap:0
>          [===================>] sync'ed:100.0% (1/209512)M
>          finish: 0:00:00 speed: 120 (44,464) K/sec
>

Hmmm, this is not good....
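
The mismatch can be spotted mechanically. The following is a rough
sketch, not a drbd tool: connection_state() is a hypothetical helper,
and the two strings are the status lines quoted above. It only assumes
that the connection state appears as a "cs:" field in /proc/drbd, as
it does in this output.

```python
import re

def connection_state(proc_drbd_text):
    """Return the value of the first 'cs:' field in the text, or None."""
    match = re.search(r"\bcs:(\S+)", proc_drbd_text)
    return match.group(1) if match else None

# Status lines copied from the quoted /proc/drbd output.
hanode1 = "  0: cs:Connected st:Secondary/Primary ld:Consistent"
hanode2 = "  0: cs:SyncSource st:Primary/Secondary ld:Consistent"

states = (connection_state(hanode1), connection_state(hanode2))

# The target already went back to Connected; the source still claims
# to be syncing, which is the inconsistency reported in this thread.
stuck = states[0] == "Connected" and states[1] != "Connected"
print(states, stuck)
```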

> Another interesting thing: on the SyncSource node (hanode2), `uptime'
> reports an unreasonable load average:
>
> root@hanode2:~# uptime
>   15:34:27  up  3:18,  1 user,  load average: 407.18, 593.69, 871.42
>
> which is simply impossible given that there are only 119 processes...
> Running "drbdadm disconnect all" and "drbdadm connect all" apparently
> put things in order.  Maybe.
>

Hmmm... what the hell is happening here? Your ntp daemon set the time
during the resync... hmmm

-phil
-- 
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria    http://www.linbit.com :


