[DRBD-user] Problems after upgrade 8.2.0 to 8.3.0

John Du jjohndu at gmail.com
Tue Feb 10 01:09:37 CET 2009


John Du wrote:
> Lars Ellenberg wrote:
>> On Mon, Feb 09, 2009 at 01:04:37PM -0800, John Du wrote:
>>   
>>>>> I still do not understand why iostat only shows DRBD devices on this  
>>>>> particular node with 8.2.7 and 8.3.0 but not other nodes with the 
>>>>> same  hardware, same Linux Kernel and same DRBD version.
>>>>>     
>>>>>         
>>>> io stats accounting was introduced only in drbd-8.0.12 and 8.2.6,
>>>> respectively.
>>>> if you don't see drbd in iostat, you probably use an older DRBD version.
>>>>
>>> I obviously did not make myself clear.  We were running 8.3 on six nodes,
>>> and only this node showed DRBD in iostat and only this node had
>>> the problem I reported. I reverted to 8.2 on this node to keep our
>>> production going.
>>>
>>
>> so you say
>>   six nodes.
>>   same hardware. same linux kernel. same drbd.
>>   but ONE node behaves differently.
>>
>> pretty non-deterministic behaviour for software.
>>
>>   
> Yes. Everything is identical.  Only this node works with 8.2 but not
> 8.3.  I know it is hard to believe.  It is hard for me to believe too.
> Assuming something is different on this node, what difference would make
> DRBD 8.3 fail where 8.2 works?  Is it possible that 8.3 reads the meta
> data differently than 8.2?
>
> According to your message, iostat should show DRBD with 8.3.  But it
> does not on any of the other five nodes.
>> I doubt I can help, as if that is true,
>> circumstantial evidence suggests that it has nothing to do with drbd,
>> but everything to do with whatever makes the non-behaving node behave
>> differently.
>>
>> though my guess is
>> that either these nodes are not as identical as you think they are,
>> or you installed the new kernel module, but did not actually reload it.
>>
>>   
> The log from the problematic node follows.  You can see it went
> from 8.3.0 to 8.2.7 to 8.2.0.  You cannot tell from the log that the
> server was slow, though. Trust me, it was very, very slow. Also, I ran
> the different versions of DRBD with the same config file shown in my
> original message.
>
> Feb  6 22:22:17 newimapn kernel: drbd: initialised. Version: 8.3.0 
> (api:88/proto:86-89)
> Feb  6 22:22:17 newimapn kernel: drbd: GIT-hash: 
> 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by root at newimapr, 
> 2009-02-02 23:57:10
> Feb  6 22:22:17 newimapn kernel: drbd: registered as block device 
> major 147
> Feb  6 22:22:17 newimapn kernel: drbd: minor_table @ 0xffff81021f0294c0
> Feb  6 22:22:17 newimapn kernel: drbd1: disk( Diskless -> Attaching )
> Feb  6 22:22:17 newimapn kernel: drbd1: Starting worker thread (from 
> cqueue/3 [257])
> Feb  6 22:22:17 newimapn kernel: klogd 1.4.1, ---------- state change 
> ----------
> Feb  6 22:22:17 newimapn kernel: drbd1: Found 4 transactions (192 
> active extents) in activity log.
> Feb  6 22:22:17 newimapn kernel: drbd1: Method to ensure write 
> ordering: barrier
> Feb  6 22:22:17 newimapn kernel: drbd1: max_segment_size ( = BIO size 
> ) = 32768
> Feb  6 22:22:17 newimapn kernel: drbd1: drbd_bm_resize called with 
> capacity == 2571204968
> Feb  6 22:22:17 newimapn kernel: drbd1: resync bitmap: bits=321400621 
> words=5021885
> Feb  6 22:22:17 newimapn kernel: drbd1: size = 1226 GB (1285602484 KB)
> Feb  6 22:22:17 newimapn kernel: drbd1: recounting of set bits took 
> additional 43 jiffies
> Feb  6 22:22:17 newimapn kernel: drbd1: 148 KB (37 bits) marked 
> out-of-sync by on disk bit-map.
> Feb  6 22:22:17 newimapn kernel: drbd1: disk( Attaching -> UpToDate )
> Feb  6 22:22:17 newimapn kernel: drbd1: conn( StandAlone -> Unconnected )
> Feb  6 22:22:17 newimapn kernel: drbd1: Starting receiver thread (from 
> drbd1_worker [5507])
> Feb  6 22:22:17 newimapn kernel: drbd1: receiver (re)started
> Feb  6 22:22:17 newimapn kernel: drbd1: conn( Unconnected -> 
> WFConnection )
> Feb  6 22:22:52 newimapn kernel: drbd1: role( Secondary -> Primary )
> Feb  6 22:22:53 newimapn kernel: kjournald starting.  Commit interval 
> 5 seconds
> Feb  6 22:22:53 newimapn kernel: EXT3-fs warning: maximal mount count 
> reached, running e2fsck is recommended
> Feb  6 22:22:53 newimapn kernel: EXT3 FS on drbd1, internal journal
> Feb  6 22:22:53 newimapn kernel: EXT3-fs: mounted filesystem with 
> ordered data mode.
> Feb  6 22:23:07 newimapn avahi-daemon[4967]: Withdrawing address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:23:07 newimapn avahi-daemon[4967]: Leaving mDNS multicast 
> group on interface eth0.IPv4 with address 10.100.2.239.
> Feb  6 22:23:07 newimapn avahi-daemon[4967]: Joining mDNS multicast 
> group on interface eth0.IPv4 with address 10.100.2.232.
> Feb  6 22:23:09 newimapn kernel: drbd1: role( Primary -> Secondary )
> Feb  6 22:24:45 newimapn kernel: drbd1: role( Secondary -> Primary )
> Feb  6 22:24:45 newimapn kernel: kjournald starting.  Commit interval 
> 5 seconds
> Feb  6 22:24:45 newimapn kernel: EXT3-fs warning: maximal mount count 
> reached, running e2fsck is recommended
> Feb  6 22:24:45 newimapn kernel: EXT3 FS on drbd1, internal journal
> Feb  6 22:24:45 newimapn kernel: EXT3-fs: mounted filesystem with 
> ordered data mode.
> Feb  6 22:24:45 newimapn avahi-daemon[4967]: Registering new address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:24:45 newimapn avahi-daemon[4967]: Withdrawing address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:24:45 newimapn avahi-daemon[4967]: Registering new address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:25:01 newimapn ntpd[4078]: synchronized to 10.100.2.249, 
> stratum 2
> Feb  6 22:25:03 newimapn avahi-daemon[4967]: Withdrawing address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:25:04 newimapn kernel: drbd1: role( Primary -> Secondary )
> Feb  6 22:26:08 newimapn kernel: drbd1: role( Secondary -> Primary )
> Feb  6 22:26:58 newimapn kernel: kjournald starting.  Commit interval 
> 5 seconds
> Feb  6 22:26:58 newimapn kernel: EXT3-fs warning: maximal mount count 
> reached, running e2fsck is recommended
> Feb  6 22:26:58 newimapn kernel: EXT3 FS on drbd1, internal journal
> Feb  6 22:26:58 newimapn kernel: EXT3-fs: mounted filesystem with 
> ordered data mode.
> Feb  6 22:28:15 newimapn ntpd[4078]: synchronized to 70.86.250.6, 
> stratum 2
> Feb  6 22:28:18 newimapn ntpd[4078]: synchronized to 63.240.161.99, 
> stratum 2
> Feb  6 22:29:11 newimapn avahi-daemon[4967]: Registering new address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:29:11 newimapn avahi-daemon[4967]: Withdrawing address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:29:11 newimapn avahi-daemon[4967]: Registering new address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:31:24 newimapn ntpd[4078]: synchronized to 70.86.250.6, 
> stratum 2
> Feb  6 22:36:43 newimapn ntpd[4078]: synchronized to 64.247.17.251, 
> stratum 2
> Feb  6 22:42:35 newimapn httpd: nss_ldap: reconnected to LDAP server 
> ldap://ldap after 1 attempt
> Feb  6 22:43:47 newimapn httpd: nss_ldap: reconnected to LDAP server 
> ldap://ldap after 1 attempt
> Feb  6 22:45:46 newimapn httpd: nss_ldap: reconnected to LDAP server 
> ldap://ldap after 1 attempt
> Feb  6 22:48:18 newimapn avahi-daemon[4967]: Withdrawing address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:48:19 newimapn kernel: drbd1: role( Primary -> Secondary )
> Feb  6 22:48:47 newimapn kernel: drbd1: conn( WFConnection -> 
> Disconnecting )
> Feb  6 22:48:47 newimapn kernel: drbd1: Discarding network configuration.
> Feb  6 22:48:47 newimapn kernel: drbd1: Connection closed
> Feb  6 22:48:47 newimapn kernel: drbd1: conn( Disconnecting -> 
> StandAlone )
> Feb  6 22:48:47 newimapn kernel: drbd1: receiver terminated
> Feb  6 22:48:47 newimapn kernel: drbd1: Terminating receiver thread
> Feb  6 22:48:47 newimapn kernel: drbd1: disk( UpToDate -> Diskless )
> Feb  6 22:48:47 newimapn kernel: drbd1: drbd_bm_resize called with 
> capacity == 0
> Feb  6 22:48:47 newimapn kernel: drbd1: worker terminated
> Feb  6 22:48:47 newimapn kernel: drbd1: Terminating worker thread
> Feb  6 22:48:47 newimapn kernel: drbd: module cleanup done.
> Feb  6 22:51:31 newimapn kernel: drbd: initialised. Version: 8.2.7 
> (api:88/proto:86-88)
> Feb  6 22:51:31 newimapn kernel: drbd: GIT-hash: 
> 61b7f4c2fc34fe3d2acf7be6bcc1fc2684708a7d build by root at newimapn, 
> 2009-02-06 22:33:19
> Feb  6 22:51:31 newimapn kernel: drbd: registered as block device 
> major 147
> Feb  6 22:51:31 newimapn kernel: drbd: minor_table @ 0xffff81023c700480
> Feb  6 22:51:31 newimapn kernel: drbd1: disk( Diskless -> Attaching )
> Feb  6 22:51:31 newimapn kernel: drbd1: Starting worker thread (from 
> cqueue/5 [259])
> Feb  6 22:51:31 newimapn kernel: klogd 1.4.1, ---------- state change 
> ----------
> Feb  6 22:51:31 newimapn kernel: drbd1: Found 4 transactions (192 
> active extents) in activity log.
> Feb  6 22:51:31 newimapn kernel: drbd1: Method to ensure write 
> ordering: barrier
> Feb  6 22:51:31 newimapn kernel: drbd1: max_segment_size ( = BIO size 
> ) = 32768
> Feb  6 22:51:31 newimapn kernel: drbd1: drbd_bm_resize called with 
> capacity == 2571204968
> Feb  6 22:51:31 newimapn kernel: drbd1: resync bitmap: bits=321400621 
> words=5021885
> Feb  6 22:51:31 newimapn kernel: drbd1: size = 1226 GB (1285602484 KB)
> Feb  6 22:51:31 newimapn kernel: drbd1: recounting of set bits took 
> additional 40 jiffies
> Feb  6 22:51:31 newimapn kernel: drbd1: 10 MB (2684 bits) marked 
> out-of-sync by on disk bit-map.
> Feb  6 22:51:31 newimapn kernel: drbd1: disk( Attaching -> UpToDate )
> Feb  6 22:51:31 newimapn kernel: drbd1: conn( StandAlone -> Unconnected )
> Feb  6 22:51:31 newimapn kernel: drbd1: Starting receiver thread (from 
> drbd1_worker [7544])
> Feb  6 22:51:31 newimapn kernel: drbd1: receiver (re)started
> Feb  6 22:51:31 newimapn kernel: drbd1: conn( Unconnected -> 
> WFConnection )
> Feb  6 22:52:19 newimapn kernel: drbd1: role( Secondary -> Primary )
> Feb  6 22:52:19 newimapn kernel: kjournald starting.  Commit interval 
> 5 seconds
> Feb  6 22:52:19 newimapn kernel: EXT3-fs warning: maximal mount count 
> reached, running e2fsck is recommended
> Feb  6 22:52:19 newimapn kernel: EXT3 FS on drbd1, internal journal
> Feb  6 22:52:19 newimapn kernel: EXT3-fs: mounted filesystem with 
> ordered data mode.
> Feb  6 22:52:20 newimapn avahi-daemon[4967]: Registering new address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:52:20 newimapn avahi-daemon[4967]: Withdrawing address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:52:20 newimapn avahi-daemon[4967]: Registering new address 
> record for 10.100.2.239 on eth0.
> Feb  6 22:57:07 newimapn ntpd[4078]: synchronized to 70.86.250.6, 
> stratum 2
> Feb  6 23:00:13 newimapn avahi-daemon[4967]: Withdrawing address 
> record for 10.100.2.239 on eth0.
> Feb  6 23:00:15 newimapn kernel: drbd1: role( Primary -> Secondary )
> Feb  6 23:01:01 newimapn kernel: drbd1: conn( WFConnection -> 
> Disconnecting )
> Feb  6 23:01:01 newimapn kernel: drbd1: Discarding network configuration.
> Feb  6 23:01:01 newimapn kernel: drbd1: Connection closed
> Feb  6 23:01:01 newimapn kernel: drbd1: conn( Disconnecting -> 
> StandAlone )
> Feb  6 23:01:01 newimapn kernel: drbd1: receiver terminated
> Feb  6 23:01:01 newimapn kernel: drbd1: Terminating receiver thread
> Feb  6 23:01:01 newimapn kernel: drbd1: disk( UpToDate -> Diskless )
> Feb  6 23:01:01 newimapn kernel: drbd1: drbd_bm_resize called with 
> capacity == 0
> Feb  6 23:01:01 newimapn kernel: drbd1: worker terminated
> Feb  6 23:01:01 newimapn kernel: drbd1: Terminating worker thread
> Feb  6 23:01:01 newimapn kernel: drbd: module cleanup done.
> Feb  6 23:04:01 newimapn kernel: drbd: initialised. Version: 8.2.0 
> (api:86/proto:86-87)
> Feb  6 23:04:01 newimapn kernel: drbd: SVN Revision: 3079 build by 
> root at newimapn, 2009-02-06 22:58:05
> Feb  6 23:04:01 newimapn kernel: drbd: registered as block device 
> major 147
> Feb  6 23:04:01 newimapn kernel: drbd: minor_table @ 0xffff81023c700c80
> Feb  6 23:04:01 newimapn kernel: drbd1: disk( Diskless -> Attaching )
> Feb  6 23:04:01 newimapn kernel: klogd 1.4.1, ---------- state change 
> ----------
> Feb  6 23:04:01 newimapn kernel: drbd1: Found 4 transactions (52 
> active extents) in activity log.
> Feb  6 23:04:01 newimapn kernel: drbd1: max_segment_size ( = BIO size 
> ) = 32768
> Feb  6 23:04:01 newimapn kernel: drbd1: drbd_bm_resize called with 
> capacity == 2571204968
> Feb  6 23:04:01 newimapn kernel: drbd1: resync bitmap: bits=321400621 
> words=5021885
> Feb  6 23:04:01 newimapn kernel: drbd1: size = 1226 GB (1285602484 KB)
> Feb  6 23:04:02 newimapn kernel: drbd1: reading of bitmap took 198 jiffies
> Feb  6 23:04:02 newimapn kernel: drbd1: recounting of set bits took 
> additional 39 jiffies
> Feb  6 23:04:02 newimapn kernel: drbd1: 11 MB marked out-of-sync by on 
> disk bit-map.
> Feb  6 23:04:02 newimapn kernel: drbd1: disk( Attaching -> UpToDate )
> Feb  6 23:04:02 newimapn kernel: drbd1: Writing meta data super block now.
> Feb  6 23:04:02 newimapn kernel: drbd1: conn( StandAlone -> Unconnected )
> Feb  6 23:04:02 newimapn kernel: drbd1: receiver (re)started
> Feb  6 23:04:02 newimapn kernel: drbd1: conn( Unconnected -> 
> WFConnection )
> Feb  6 23:04:28 newimapn ntpd[4078]: synchronized to 63.240.161.99, 
> stratum 2
> Feb  6 23:04:46 newimapn kernel: drbd1: role( Secondary -> Primary )
> Feb  6 23:04:46 newimapn kernel: drbd1: Writing meta data super block now.
> Feb  6 23:04:46 newimapn kernel: kjournald starting.  Commit interval 
> 5 seconds
> Feb  6 23:04:46 newimapn kernel: EXT3-fs warning: maximal mount count 
> reached, running e2fsck is recommended
> Feb  6 23:04:46 newimapn kernel: EXT3 FS on drbd1, internal journal
> Feb  6 23:04:46 newimapn kernel: EXT3-fs: mounted filesystem with 
> ordered data mode.
> Feb  6 23:04:46 newimapn avahi-daemon[4967]: Registering new address 
> record for 10.100.2.239 on eth0.
> Feb  6 23:04:46 newimapn avahi-daemon[4967]: Withdrawing address 
> record for 10.100.2.239 on eth0.
> Feb  6 23:04:46 newimapn avahi-daemon[4967]: Registering new address 
> record for 10.100.2.239 on eth0.
> Feb  6 23:13:23 newimapn ntpd[4078]: synchronized to 70.86.250.6, 
> stratum 2
> Feb  6 23:16:24 newimapn kernel: drbd1: conn( WFConnection -> 
> WFReportParams )
> Feb  6 23:16:24 newimapn kernel: drbd1: Handshake successful: Agreed 
> network protocol version 87
> Feb  6 23:16:24 newimapn kernel: drbd1: data-integrity-alg:
> Feb  6 23:16:24 newimapn kernel: drbd1: peer( Unknown -> Secondary ) 
> conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
> Feb  6 23:16:43 newimapn kernel: drbd1: Writing meta data super block now.
> Feb  6 23:16:43 newimapn kernel: drbd1: BUG! md_sync_timer expired! 
> Worker calls drbd_md_sync().
> Feb  6 23:16:54 newimapn kernel: drbd1: conn( WFBitMapS -> SyncSource 
> ) pdsk( UpToDate -> Inconsistent )
> Feb  6 23:16:54 newimapn kernel: drbd1: Began resync as SyncSource 
> (will sync 13052 KB [3263 bits set]).
> Feb  6 23:16:54 newimapn kernel: drbd1: Writing meta data super block now.
> Feb  6 23:17:12 newimapn kernel: drbd1: Resync done (total 18 sec; 
> paused 0 sec; 724 K/sec)
> Feb  6 23:17:12 newimapn kernel: drbd1: conn( SyncSource -> Connected 
> ) pdsk( Inconsistent -> UpToDate )
> Feb  6 23:17:12 newimapn kernel: drbd1: Writing meta data super block now.
> Feb  6 23:18:12 newimapn kernel: drbd1: peer( Secondary -> Unknown ) 
> conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
> Feb  6 23:18:12 newimapn kernel: drbd1: Creating new current UUID
> Feb  6 23:18:12 newimapn kernel: drbd1: Writing meta data super block now.
> Feb  6 23:18:12 newimapn kernel: drbd1: asender terminated
> Feb  6 23:18:12 newimapn kernel: drbd1: tl_clear()
> Feb  6 23:18:12 newimapn kernel: drbd1: Connection closed
> Feb  6 23:18:12 newimapn kernel: drbd1: conn( TearDown -> Unconnected )
> Feb  6 23:18:12 newimapn kernel: drbd1: receiver terminated
> Feb  6 23:18:12 newimapn kernel: drbd1: receiver (re)started
> Feb  6 23:18:12 newimapn kernel: drbd1: conn( Unconnected -> 
> WFConnection )
>
>
>
>
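The attach messages in the log above are internally consistent, and the arithmetic can be checked by hand. What follows is a minimal sketch, assuming that the capacity passed to drbd_bm_resize is counted in 512-byte sectors, that one bitmap bit covers 4 KiB, and that bitmap words are 64 bits wide on this x86_64 host. These assumptions are inferred from the logged numbers, not taken from the DRBD source, and the function names are illustrative only.

```python
SECTOR_BYTES = 512      # assumption: capacity is reported in 512-byte sectors
BYTES_PER_BIT = 4096    # assumption: one bitmap bit tracks 4 KiB of storage
BITS_PER_WORD = 64      # assumption: 64-bit bitmap words on x86_64


def bitmap_geometry(capacity_sectors):
    """Derive the size and bitmap figures from a drbd_bm_resize capacity."""
    size_kb = capacity_sectors * SECTOR_BYTES // 1024
    sectors_per_bit = BYTES_PER_BIT // SECTOR_BYTES
    # Ceiling division: a partially covered tail still needs a bit or word.
    bits = -(-capacity_sectors // sectors_per_bit)
    words = -(-bits // BITS_PER_WORD)
    return {
        "size_kb": size_kb,
        "size_gb": size_kb >> 20,  # KB to GB, truncated
        "bits": bits,
        "words": words,
    }


def out_of_sync_kb(bits_set):
    """Convert a count of set bitmap bits into KB marked out-of-sync."""
    return bits_set * BYTES_PER_BIT // 1024


if __name__ == "__main__":
    print(bitmap_geometry(2571204968))
    print(out_of_sync_kb(37), out_of_sync_kb(3263))
```

Under these assumptions, the capacity of 2571204968 gives 1285602484 KB, 1226 GB, 321400621 bits and 5021885 words, and 37 and 3263 set bits give 148 KB and 13052 KB. All of these match the log, so all three module versions appear to read the same on-disk geometry.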
>> if you still have the kernel logs, double check whether you find the
>> "drbd: initialised. Version 8.3.0 ..." line.
>>
>> if not, you never loaded nor used nor benchmarked against 8.3.0.
>>
>>   
>

Forgot to show you the kernel version:

[root at newimapn# uname -a
Linux newimapn 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
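
Lars's suggestion to look for the "drbd: initialised. Version: ..." line can be automated across several nodes. The sketch below is a hedged illustration, not a DRBD tool. It assumes raw, unwrapped syslog lines in the format pasted above, and the helper names are made up for this example. The io accounting thresholds (8.0.12 and 8.2.6) come from Lars's message; treating 8.3 and later as including it is an assumption based on his remark that iostat should show DRBD with 8.3.

```python
import re

# Matches the module-load banner as it appears in the pasted kernel log.
INIT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"drbd: initialised\. Version: (?P<version>\d+(?:\.\d+)+)"
)


def loaded_versions(lines):
    """Return (timestamp, host, version) for every DRBD module load found."""
    found = []
    for line in lines:
        match = INIT_RE.match(line)
        if match:
            found.append((match["stamp"], match["host"], match["version"]))
    return found


def has_io_accounting(version):
    """True if this version should appear in iostat, per the thread."""
    v = tuple(int(part) for part in version.split("."))
    if v[:2] == (8, 0):
        return v >= (8, 0, 12)
    if v[:2] == (8, 2):
        return v >= (8, 2, 6)
    return v >= (8, 3)  # assumption: every later series includes it


SAMPLE = [
    "Feb  6 22:22:17 newimapn kernel: drbd: initialised. Version: 8.3.0 (api:88/proto:86-89)",
    "Feb  6 22:22:17 newimapn kernel: drbd1: disk( Diskless -> Attaching )",
    "Feb  6 22:51:31 newimapn kernel: drbd: initialised. Version: 8.2.7 (api:88/proto:86-88)",
    "Feb  6 23:04:01 newimapn kernel: drbd: initialised. Version: 8.2.0 (api:86/proto:86-87)",
]

for stamp, host, version in loaded_versions(SAMPLE):
    print(stamp, host, version, "io accounting:", has_io_accounting(version))
```

Running the same scan over /var/log/messages from each of the six nodes would show which module version each one actually loaded, which is the first thing to rule out before comparing iostat output.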