[DRBD-user] Problems after upgrade 8.2.0 to 8.3.0

John Du jjohndu at gmail.com
Tue Feb 10 00:10:22 CET 2009



Lars Ellenberg wrote:
> On Mon, Feb 09, 2009 at 01:04:37PM -0800, John Du wrote:
>   
>>>> I still do not understand why iostat only shows DRBD devices on this  
>>>> particular node with 8.2.7 and 8.3.0 but not other nodes with the 
>>>> same  hardware, same Linux Kernel and same DRBD version.
>>>>     
>>>>         
>>> io stats accounting was introduced only in drbd-8.0.12 respective 8.2.6.
>>> if you don't see drbd in iostats, you probably use an older DRBD version.
>>>
>>>   
>>>       
>> I obviously did not make myself clear.  We were running 8.3 on six nodes
>> and only this node showed DRBD in iostat and only this node was having
>> the problem I reported. I reverted to 8.2 on this node to keep our
>> production running.
>>     
>
> so you say
>   six nodes.
>   same hardware. same linux kernel. same drbd.
>   but ONE node behaves different.
>
> pretty non-deterministic behaviour for software.
>
>   
Yes. Everything is identical, yet this one node works with 8.2 but not
with 8.3.  I know it is hard to believe.  It is hard for me to believe too.
Assuming something is different on this node, what difference would make
DRBD 8.3 fail where 8.2 works?  Is it possible that 8.3 interprets the
meta data differently than 8.2?

According to your message, iostat should show DRBD devices with 8.3.  But
it does not on any of the other five nodes.
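
One way to compare the nodes directly, as a rough sketch: iostat takes its
numbers from /proc/diskstats and, as far as I know, leaves out devices whose
counters are all zero, so looking at the raw drbd entries there separates
"no accounting" from "no device". The function name drbd_io_counts is made
up for this example, and the column positions assume the usual
/proc/diskstats layout (major, minor, name, then counters).

```python
import re

def drbd_io_counts(text):
    """Map each drbdN device in /proc/diskstats-style text to
    (reads completed, writes completed)."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Columns: major, minor, name, reads completed, reads merged,
        # sectors read, ms reading, writes completed, ...
        if len(fields) >= 8 and re.fullmatch(r"drbd\d+", fields[2]):
            counts[fields[2]] = (int(fields[3]), int(fields[7]))
    return counts

if __name__ == "__main__":
    try:
        with open("/proc/diskstats") as f:
            counts = drbd_io_counts(f.read())
    except OSError:
        counts = {}  # not on Linux, or /proc not readable
    for name, (reads, writes) in sorted(counts.items()):
        print(f"{name}: reads={reads} writes={writes}")
    if not counts:
        print("no drbd entries found in /proc/diskstats")
```

A node where the drbd entry exists but both counters stay at zero under load
would explain a missing line in iostat without any difference in the DRBD
version.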
> I doubt I can help, as if that is true,
> circumstantial evidence suggests that it has nothing to do with drbd,
> but everything to do with whatever makes the non-behaving node behave
> different.
>
> though my guess is
> that either these nodes are not all that identical as you think they are.
> or you installed the new kernel module, but did not actually reload it.
>
>   
The log from the problematic node follows.  You can see it went from
8.3.0 to 8.2.7 to 8.2.0.  The log does not show that the server was slow,
but trust me, it was very, very slow.  I ran all three versions of DRBD
with the same config file shown in my original message.

Feb  6 22:22:17 newimapn kernel: drbd: initialised. Version: 8.3.0 
(api:88/proto:86-89)
Feb  6 22:22:17 newimapn kernel: drbd: GIT-hash: 
9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by root at newimapr, 
2009-02-02 23:57:10
Feb  6 22:22:17 newimapn kernel: drbd: registered as block device major 147
Feb  6 22:22:17 newimapn kernel: drbd: minor_table @ 0xffff81021f0294c0
Feb  6 22:22:17 newimapn kernel: drbd1: disk( Diskless -> Attaching )
Feb  6 22:22:17 newimapn kernel: drbd1: Starting worker thread (from 
cqueue/3 [257])
Feb  6 22:22:17 newimapn kernel: klogd 1.4.1, ---------- state change 
----------
Feb  6 22:22:17 newimapn kernel: drbd1: Found 4 transactions (192 active 
extents) in activity log.
Feb  6 22:22:17 newimapn kernel: drbd1: Method to ensure write ordering: 
barrier
Feb  6 22:22:17 newimapn kernel: drbd1: max_segment_size ( = BIO size ) 
= 32768
Feb  6 22:22:17 newimapn kernel: drbd1: drbd_bm_resize called with 
capacity == 2571204968
Feb  6 22:22:17 newimapn kernel: drbd1: resync bitmap: bits=321400621 
words=5021885
Feb  6 22:22:17 newimapn kernel: drbd1: size = 1226 GB (1285602484 KB)
Feb  6 22:22:17 newimapn kernel: drbd1: recounting of set bits took 
additional 43 jiffies
Feb  6 22:22:17 newimapn kernel: drbd1: 148 KB (37 bits) marked 
out-of-sync by on disk bit-map.
Feb  6 22:22:17 newimapn kernel: drbd1: disk( Attaching -> UpToDate )
Feb  6 22:22:17 newimapn kernel: drbd1: conn( StandAlone -> Unconnected )
Feb  6 22:22:17 newimapn kernel: drbd1: Starting receiver thread (from 
drbd1_worker [5507])
Feb  6 22:22:17 newimapn kernel: drbd1: receiver (re)started
Feb  6 22:22:17 newimapn kernel: drbd1: conn( Unconnected -> WFConnection )
Feb  6 22:22:52 newimapn kernel: drbd1: role( Secondary -> Primary )
Feb  6 22:22:53 newimapn kernel: kjournald starting.  Commit interval 5 
seconds
Feb  6 22:22:53 newimapn kernel: EXT3-fs warning: maximal mount count 
reached, running e2fsck is recommended
Feb  6 22:22:53 newimapn kernel: EXT3 FS on drbd1, internal journal
Feb  6 22:22:53 newimapn kernel: EXT3-fs: mounted filesystem with 
ordered data mode.
Feb  6 22:23:07 newimapn avahi-daemon[4967]: Withdrawing address record 
for 10.100.2.239 on eth0.
Feb  6 22:23:07 newimapn avahi-daemon[4967]: Leaving mDNS multicast 
group on interface eth0.IPv4 with address 10.100.2.239.
Feb  6 22:23:07 newimapn avahi-daemon[4967]: Joining mDNS multicast 
group on interface eth0.IPv4 with address 10.100.2.232.
Feb  6 22:23:09 newimapn kernel: drbd1: role( Primary -> Secondary )
Feb  6 22:24:45 newimapn kernel: drbd1: role( Secondary -> Primary )
Feb  6 22:24:45 newimapn kernel: kjournald starting.  Commit interval 5 
seconds
Feb  6 22:24:45 newimapn kernel: EXT3-fs warning: maximal mount count 
reached, running e2fsck is recommended
Feb  6 22:24:45 newimapn kernel: EXT3 FS on drbd1, internal journal
Feb  6 22:24:45 newimapn kernel: EXT3-fs: mounted filesystem with 
ordered data mode.
Feb  6 22:24:45 newimapn avahi-daemon[4967]: Registering new address 
record for 10.100.2.239 on eth0.
Feb  6 22:24:45 newimapn avahi-daemon[4967]: Withdrawing address record 
for 10.100.2.239 on eth0.
Feb  6 22:24:45 newimapn avahi-daemon[4967]: Registering new address 
record for 10.100.2.239 on eth0.
Feb  6 22:25:01 newimapn ntpd[4078]: synchronized to 10.100.2.249, stratum 2
Feb  6 22:25:03 newimapn avahi-daemon[4967]: Withdrawing address record 
for 10.100.2.239 on eth0.
Feb  6 22:25:04 newimapn kernel: drbd1: role( Primary -> Secondary )
Feb  6 22:26:08 newimapn kernel: drbd1: role( Secondary -> Primary )
Feb  6 22:26:58 newimapn kernel: kjournald starting.  Commit interval 5 
seconds
Feb  6 22:26:58 newimapn kernel: EXT3-fs warning: maximal mount count 
reached, running e2fsck is recommended
Feb  6 22:26:58 newimapn kernel: EXT3 FS on drbd1, internal journal
Feb  6 22:26:58 newimapn kernel: EXT3-fs: mounted filesystem with 
ordered data mode.
Feb  6 22:28:15 newimapn ntpd[4078]: synchronized to 70.86.250.6, stratum 2
Feb  6 22:28:18 newimapn ntpd[4078]: synchronized to 63.240.161.99, 
stratum 2
Feb  6 22:29:11 newimapn avahi-daemon[4967]: Registering new address 
record for 10.100.2.239 on eth0.
Feb  6 22:29:11 newimapn avahi-daemon[4967]: Withdrawing address record 
for 10.100.2.239 on eth0.
Feb  6 22:29:11 newimapn avahi-daemon[4967]: Registering new address 
record for 10.100.2.239 on eth0.
Feb  6 22:31:24 newimapn ntpd[4078]: synchronized to 70.86.250.6, stratum 2
Feb  6 22:36:43 newimapn ntpd[4078]: synchronized to 64.247.17.251, 
stratum 2
Feb  6 22:42:35 newimapn httpd: nss_ldap: reconnected to LDAP server 
ldap://ldap after 1 attempt
Feb  6 22:43:47 newimapn httpd: nss_ldap: reconnected to LDAP server 
ldap://ldap after 1 attempt
Feb  6 22:45:46 newimapn httpd: nss_ldap: reconnected to LDAP server 
ldap://ldap after 1 attempt
Feb  6 22:48:18 newimapn avahi-daemon[4967]: Withdrawing address record 
for 10.100.2.239 on eth0.
Feb  6 22:48:19 newimapn kernel: drbd1: role( Primary -> Secondary )
Feb  6 22:48:47 newimapn kernel: drbd1: conn( WFConnection -> 
Disconnecting )
Feb  6 22:48:47 newimapn kernel: drbd1: Discarding network configuration.
Feb  6 22:48:47 newimapn kernel: drbd1: Connection closed
Feb  6 22:48:47 newimapn kernel: drbd1: conn( Disconnecting -> StandAlone )
Feb  6 22:48:47 newimapn kernel: drbd1: receiver terminated
Feb  6 22:48:47 newimapn kernel: drbd1: Terminating receiver thread
Feb  6 22:48:47 newimapn kernel: drbd1: disk( UpToDate -> Diskless )
Feb  6 22:48:47 newimapn kernel: drbd1: drbd_bm_resize called with 
capacity == 0
Feb  6 22:48:47 newimapn kernel: drbd1: worker terminated
Feb  6 22:48:47 newimapn kernel: drbd1: Terminating worker thread
Feb  6 22:48:47 newimapn kernel: drbd: module cleanup done.
Feb  6 22:51:31 newimapn kernel: drbd: initialised. Version: 8.2.7 
(api:88/proto:86-88)
Feb  6 22:51:31 newimapn kernel: drbd: GIT-hash: 
61b7f4c2fc34fe3d2acf7be6bcc1fc2684708a7d build by root at newimapn, 
2009-02-06 22:33:19
Feb  6 22:51:31 newimapn kernel: drbd: registered as block device major 147
Feb  6 22:51:31 newimapn kernel: drbd: minor_table @ 0xffff81023c700480
Feb  6 22:51:31 newimapn kernel: drbd1: disk( Diskless -> Attaching )
Feb  6 22:51:31 newimapn kernel: drbd1: Starting worker thread (from 
cqueue/5 [259])
Feb  6 22:51:31 newimapn kernel: klogd 1.4.1, ---------- state change 
----------
Feb  6 22:51:31 newimapn kernel: drbd1: Found 4 transactions (192 active 
extents) in activity log.
Feb  6 22:51:31 newimapn kernel: drbd1: Method to ensure write ordering: 
barrier
Feb  6 22:51:31 newimapn kernel: drbd1: max_segment_size ( = BIO size ) 
= 32768
Feb  6 22:51:31 newimapn kernel: drbd1: drbd_bm_resize called with 
capacity == 2571204968
Feb  6 22:51:31 newimapn kernel: drbd1: resync bitmap: bits=321400621 
words=5021885
Feb  6 22:51:31 newimapn kernel: drbd1: size = 1226 GB (1285602484 KB)
Feb  6 22:51:31 newimapn kernel: drbd1: recounting of set bits took 
additional 40 jiffies
Feb  6 22:51:31 newimapn kernel: drbd1: 10 MB (2684 bits) marked 
out-of-sync by on disk bit-map.
Feb  6 22:51:31 newimapn kernel: drbd1: disk( Attaching -> UpToDate )
Feb  6 22:51:31 newimapn kernel: drbd1: conn( StandAlone -> Unconnected )
Feb  6 22:51:31 newimapn kernel: drbd1: Starting receiver thread (from 
drbd1_worker [7544])
Feb  6 22:51:31 newimapn kernel: drbd1: receiver (re)started
Feb  6 22:51:31 newimapn kernel: drbd1: conn( Unconnected -> WFConnection )
Feb  6 22:52:19 newimapn kernel: drbd1: role( Secondary -> Primary )
Feb  6 22:52:19 newimapn kernel: kjournald starting.  Commit interval 5 
seconds
Feb  6 22:52:19 newimapn kernel: EXT3-fs warning: maximal mount count 
reached, running e2fsck is recommended
Feb  6 22:52:19 newimapn kernel: EXT3 FS on drbd1, internal journal
Feb  6 22:52:19 newimapn kernel: EXT3-fs: mounted filesystem with 
ordered data mode.
Feb  6 22:52:20 newimapn avahi-daemon[4967]: Registering new address 
record for 10.100.2.239 on eth0.
Feb  6 22:52:20 newimapn avahi-daemon[4967]: Withdrawing address record 
for 10.100.2.239 on eth0.
Feb  6 22:52:20 newimapn avahi-daemon[4967]: Registering new address 
record for 10.100.2.239 on eth0.
Feb  6 22:57:07 newimapn ntpd[4078]: synchronized to 70.86.250.6, stratum 2
Feb  6 23:00:13 newimapn avahi-daemon[4967]: Withdrawing address record 
for 10.100.2.239 on eth0.
Feb  6 23:00:15 newimapn kernel: drbd1: role( Primary -> Secondary )
Feb  6 23:01:01 newimapn kernel: drbd1: conn( WFConnection -> 
Disconnecting )
Feb  6 23:01:01 newimapn kernel: drbd1: Discarding network configuration.
Feb  6 23:01:01 newimapn kernel: drbd1: Connection closed
Feb  6 23:01:01 newimapn kernel: drbd1: conn( Disconnecting -> StandAlone )
Feb  6 23:01:01 newimapn kernel: drbd1: receiver terminated
Feb  6 23:01:01 newimapn kernel: drbd1: Terminating receiver thread
Feb  6 23:01:01 newimapn kernel: drbd1: disk( UpToDate -> Diskless )
Feb  6 23:01:01 newimapn kernel: drbd1: drbd_bm_resize called with 
capacity == 0
Feb  6 23:01:01 newimapn kernel: drbd1: worker terminated
Feb  6 23:01:01 newimapn kernel: drbd1: Terminating worker thread
Feb  6 23:01:01 newimapn kernel: drbd: module cleanup done.
Feb  6 23:04:01 newimapn kernel: drbd: initialised. Version: 8.2.0 
(api:86/proto:86-87)
Feb  6 23:04:01 newimapn kernel: drbd: SVN Revision: 3079 build by 
root at newimapn, 2009-02-06 22:58:05
Feb  6 23:04:01 newimapn kernel: drbd: registered as block device major 147
Feb  6 23:04:01 newimapn kernel: drbd: minor_table @ 0xffff81023c700c80
Feb  6 23:04:01 newimapn kernel: drbd1: disk( Diskless -> Attaching )
Feb  6 23:04:01 newimapn kernel: klogd 1.4.1, ---------- state change 
----------
Feb  6 23:04:01 newimapn kernel: drbd1: Found 4 transactions (52 active 
extents) in activity log.
Feb  6 23:04:01 newimapn kernel: drbd1: max_segment_size ( = BIO size ) 
= 32768
Feb  6 23:04:01 newimapn kernel: drbd1: drbd_bm_resize called with 
capacity == 2571204968
Feb  6 23:04:01 newimapn kernel: drbd1: resync bitmap: bits=321400621 
words=5021885
Feb  6 23:04:01 newimapn kernel: drbd1: size = 1226 GB (1285602484 KB)
Feb  6 23:04:02 newimapn kernel: drbd1: reading of bitmap took 198 jiffies
Feb  6 23:04:02 newimapn kernel: drbd1: recounting of set bits took 
additional 39 jiffies
Feb  6 23:04:02 newimapn kernel: drbd1: 11 MB marked out-of-sync by on 
disk bit-map.
Feb  6 23:04:02 newimapn kernel: drbd1: disk( Attaching -> UpToDate )
Feb  6 23:04:02 newimapn kernel: drbd1: Writing meta data super block now.
Feb  6 23:04:02 newimapn kernel: drbd1: conn( StandAlone -> Unconnected )
Feb  6 23:04:02 newimapn kernel: drbd1: receiver (re)started
Feb  6 23:04:02 newimapn kernel: drbd1: conn( Unconnected -> WFConnection )
Feb  6 23:04:28 newimapn ntpd[4078]: synchronized to 63.240.161.99, 
stratum 2
Feb  6 23:04:46 newimapn kernel: drbd1: role( Secondary -> Primary )
Feb  6 23:04:46 newimapn kernel: drbd1: Writing meta data super block now.
Feb  6 23:04:46 newimapn kernel: kjournald starting.  Commit interval 5 
seconds
Feb  6 23:04:46 newimapn kernel: EXT3-fs warning: maximal mount count 
reached, running e2fsck is recommended
Feb  6 23:04:46 newimapn kernel: EXT3 FS on drbd1, internal journal
Feb  6 23:04:46 newimapn kernel: EXT3-fs: mounted filesystem with 
ordered data mode.
Feb  6 23:04:46 newimapn avahi-daemon[4967]: Registering new address 
record for 10.100.2.239 on eth0.
Feb  6 23:04:46 newimapn avahi-daemon[4967]: Withdrawing address record 
for 10.100.2.239 on eth0.
Feb  6 23:04:46 newimapn avahi-daemon[4967]: Registering new address 
record for 10.100.2.239 on eth0.
Feb  6 23:13:23 newimapn ntpd[4078]: synchronized to 70.86.250.6, stratum 2
Feb  6 23:16:24 newimapn kernel: drbd1: conn( WFConnection -> 
WFReportParams )
Feb  6 23:16:24 newimapn kernel: drbd1: Handshake successful: Agreed 
network protocol version 87
Feb  6 23:16:24 newimapn kernel: drbd1: data-integrity-alg:
Feb  6 23:16:24 newimapn kernel: drbd1: peer( Unknown -> Secondary ) 
conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
Feb  6 23:16:43 newimapn kernel: drbd1: Writing meta data super block now.
Feb  6 23:16:43 newimapn kernel: drbd1: BUG! md_sync_timer expired! 
Worker calls drbd_md_sync().
Feb  6 23:16:54 newimapn kernel: drbd1: conn( WFBitMapS -> SyncSource ) 
pdsk( UpToDate -> Inconsistent )
Feb  6 23:16:54 newimapn kernel: drbd1: Began resync as SyncSource (will 
sync 13052 KB [3263 bits set]).
Feb  6 23:16:54 newimapn kernel: drbd1: Writing meta data super block now.
Feb  6 23:17:12 newimapn kernel: drbd1: Resync done (total 18 sec; 
paused 0 sec; 724 K/sec)
Feb  6 23:17:12 newimapn kernel: drbd1: conn( SyncSource -> Connected ) 
pdsk( Inconsistent -> UpToDate )
Feb  6 23:17:12 newimapn kernel: drbd1: Writing meta data super block now.
Feb  6 23:18:12 newimapn kernel: drbd1: peer( Secondary -> Unknown ) 
conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
Feb  6 23:18:12 newimapn kernel: drbd1: Creating new current UUID
Feb  6 23:18:12 newimapn kernel: drbd1: Writing meta data super block now.
Feb  6 23:18:12 newimapn kernel: drbd1: asender terminated
Feb  6 23:18:12 newimapn kernel: drbd1: tl_clear()
Feb  6 23:18:12 newimapn kernel: drbd1: Connection closed
Feb  6 23:18:12 newimapn kernel: drbd1: conn( TearDown -> Unconnected )
Feb  6 23:18:12 newimapn kernel: drbd1: receiver terminated
Feb  6 23:18:12 newimapn kernel: drbd1: receiver (re)started
Feb  6 23:18:12 newimapn kernel: drbd1: conn( Unconnected -> WFConnection )
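
As a sanity check on the meta data question, the bitmap figures in the log
can be recomputed from the reported capacity, and they come out the same
under all three versions. The sketch below assumes one bitmap bit per 4 KB
of storage and 64-bit bitmap words; both are assumptions inferred from the
numbers in the log, not taken from the DRBD source. The function names are
made up for this example.

```python
SECTOR_BYTES = 512
KB_PER_BIT = 4       # assumption: one bitmap bit covers 4 KB
BITS_PER_WORD = 64   # assumption: 64-bit words on this kernel

def bitmap_geometry(capacity_sectors):
    """Return (size in KB, bitmap bits, bitmap words) for a device
    whose capacity is given in 512-byte sectors."""
    size_kb = capacity_sectors * SECTOR_BYTES // 1024
    bits = -(-size_kb // KB_PER_BIT)      # ceiling division
    words = -(-bits // BITS_PER_WORD)     # ceiling division
    return size_kb, bits, words

def out_of_sync_kb(bits_set):
    """Amount of data marked out-of-sync for a given number of set bits."""
    return bits_set * KB_PER_BIT

# "drbd_bm_resize called with capacity == 2571204968"
print(bitmap_geometry(2571204968))   # (1285602484, 321400621, 5021885)
# "37 bits", "2684 bits", "3263 bits set"
print(out_of_sync_kb(37), out_of_sync_kb(2684), out_of_sync_kb(3263))
# 148 10736 13052
```

These match "size = 1226 GB (1285602484 KB)", "bits=321400621
words=5021885", "148 KB (37 bits)", "10 MB (2684 bits)" and "13052 KB
[3263 bits set]" in the log, so at least the device size and bitmap layout
are read identically by 8.3.0, 8.2.7 and 8.2.0.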




> if you still have the kernel logs, double check whether you find the
> "drbd: initialised. Version 8.3.0 ..." line.
>
> if not, you never loaded nor used nor benchmarked against 8.3.0.
>
>   
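
For that check, a small sketch that pulls every "drbd: initialised" line
out of a syslog-style kernel log and lists which version was loaded when.
The sample lines are the three from the log above, written unwrapped as
syslog would have them before the mail client broke the lines; the names
VERSION_RE and drbd_loads are made up for this example.

```python
import re

# Matches e.g. "Feb  6 22:22:17 newimapn kernel: drbd: initialised. Version: 8.3.0 (...)"
VERSION_RE = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+kernel:\s+"
    r"drbd: initialised\. Version: (?P<version>\S+)"
)

def drbd_loads(lines):
    """Return (timestamp, host, version) for each DRBD module load found."""
    loads = []
    for line in lines:
        m = VERSION_RE.match(line)
        if m:
            loads.append((m.group("when"), m.group("host"), m.group("version")))
    return loads

sample = [
    "Feb  6 22:22:17 newimapn kernel: drbd: initialised. Version: 8.3.0 (api:88/proto:86-89)",
    "Feb  6 22:22:17 newimapn kernel: drbd: registered as block device major 147",
    "Feb  6 22:51:31 newimapn kernel: drbd: initialised. Version: 8.2.7 (api:88/proto:86-88)",
    "Feb  6 23:04:01 newimapn kernel: drbd: initialised. Version: 8.2.0 (api:86/proto:86-87)",
]

for when, host, version in drbd_loads(sample):
    print(when, host, version)
```

Run against the full kernel log of each node, this gives a quick list of
which module versions were actually loaded, and when, on every machine.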
