[DRBD-user] Cannot synchronize stacked device to backup server with DRBD9
Artur Kaszuba
artur at netmix.pl
Tue Jun 19 09:19:04 CEST 2018
Hi Lars, thanks for the answer.
On 18.06.2018 at 17:10, Lars Ellenberg wrote:
> On Wed, Jun 13, 2018 at 01:03:53PM +0200, Artur Kaszuba wrote:
>> I know about the 3-node solution and have used it for some time (from
>> ~9.0.8), but I had problems with stability and decided to change to a
>> stacked configuration, hoping it would be more stable. As a last resort
>> I will downgrade to drbd8, where I never had any stability problems,
>> but I would like to stay with 9 and after some time switch back to the
>> 3-node config.
>>
>> By stability I mean the following situation:
>> - latest version of drbd9 (9.0.14)
>> - kernel 4.13.0-43-generic on Ubuntu 16.04
>> - high disk usage/IO on drbd devices
>> - 3-node configuration
>> - random system crashes on the "drbdadm disconnect/connect" commands
>> When I disable one node everything works without problems and
>> disconnect/connect works perfectly. Before 9.0.14 I did not have such
>> crashes, but had others, which are fixed now.
>
> And you cannot be bothered to report "such crashes"
> in a way that makes it possible to understand and fix those?
>
> "random system crash" is not good enough :-/
>
Yes, I know this is not enough to find the reason for these crashes, and
that is why I did not report them separately; I only asked why the
stacking solution does not work in my case :).
Sorry, but I cannot write much more: this is happening in a production
environment and I cannot run tests there.
I can add:
- simple tests that try to reproduce the situation, but without high disk
usage, do not cause crashes
- the problems started after the upgrade from drbd 9.0.12 to 9.0.14 and
drbd-utils 9.3.0-1ppa1~xenial1 to 9.4.0-1ppa1~xenial1; before this we
did not have such crashes
- we have ~15 drbd resources in this environment, with high IO in a random
pattern (databases, indexers, git, file servers, kvm etc.)
>> Unfortunately I cannot wait for the next fix,
>> I need a stable environment.
>
> "I want it all, and I want it now" :-)
>
> For the benefit of those that can afford to wait for the next fix,
> maybe you should still report the crashes in a way that we can work with.
>
Sorry if I worded it the wrong way; English is not my native language and
I did not want to sound rude.
I was only describing this situation:
- the system worked without crashes for months
- the system is the core production environment of the company
- the drbd upgrade caused random crashes (3-node configuration for drbd9)
- we cannot manage/create drbd resources because the system could crash on
any drbdadm connect/disconnect command (which already happened in the
middle of the day when we were trying to reconnect the backup server :/)
This situation does not allow me to wait for the next fix; I need to find
another solution/workaround.
>> I prefer to use a stacked configuration, even though it is deprecated in
>> DRBD9. I decided to write this post because the stacked configuration is
>> still described in the documentation and should work? Unfortunately, for
>> now it is not possible to create such a configuration, or I missed
>> something :/
>
> I know there are DRBD 9 users using "stacked" configurations out there.
>
Hmm, maybe they created their resources some time ago and drbd keeps
working for already created resources. What I found is a problem with the
initial synchronization to the backup server:
- the source server pair is up and one node is primary
- the backup server tries to synchronize data (for the first time)
- the primary server tries to enter the Source state for the stacked
device; at this moment it fails with an error:
[1636671.252028] drbd system-test-U/0 drbd113 z1: helper command: /sbin/drbdadm before-resync-source
[1636671.255933] drbd system-test-U/0 drbd113: before-resync-source handler returned 1, dropping connection.
[1636671.255942] drbd system-test-U z1: conn( Connected -> Disconnecting ) peer( Secondary -> Unknown )
- the same error (same exit code) occurs when I execute drbdadm
before-resync-source directly:
'system-test-U' is a stacked resource, and not available in normal mode.
I think it could be a problem in drbd-utils or in the drbd module:
- drbdadm before-resync-source should detect the type of resource (without
requiring the --stacked switch)
- or the drbd module should execute the drbdadm before-resync-source
command with the "--stacked" switch when it starts the
before-resync-source handler for stacked resources
Of course, it could be caused by my misconfiguration (test config in the
initial mail), but I cannot find an error there :(
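To check whether other resources hit the same failure, the kernel log can be scanned for handlers that return a non-zero exit code. What follows is only a rough Python sketch, not part of drbd-utils: the function name find_failed_handlers and the regular expression are assumptions derived solely from the message format in the log excerpt above, and other DRBD versions may word the message differently.

```python
import re

# Assumed message format, taken from the excerpt in this mail:
#   drbd <resource>/<volume> <device>: <handler> handler returned <code>, ...
HANDLER_FAILED = re.compile(
    r"drbd (?P<resource>\S+?)/(?P<volume>\d+) (?P<device>drbd\d+):\s+"
    r"(?P<handler>[\w-]+)\s+handler returned (?P<code>\d+)"
)


def find_failed_handlers(log_text):
    """Return (resource, device, handler, exit_code) for each failed handler."""
    # Long kernel lines may arrive wrapped (e.g. in a mail), so collapse
    # all runs of whitespace into single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        (m["resource"], m["device"], m["handler"], int(m["code"]))
        for m in HANDLER_FAILED.finditer(flat)
        if int(m["code"]) != 0
    ]


sample = """\
[1636671.252028] drbd system-test-U/0 drbd113 z1: helper command:
/sbin/drbdadm before-resync-source
[1636671.255933] drbd system-test-U/0 drbd113: before-resync-source
handler returned 1, dropping connection.
"""
print(find_failed_handlers(sample))
```

In practice the input would be the output of dmesg or the kernel log file rather than the embedded sample string.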
> Maybe you missed to upgrade your drbd-utils?
> Current drbd-utils version would be 9.4.0
>
I am already using the latest versions of drbd-utils and the drbd-dkms module:
ii drbd-dkms 9.0.14-1ppa1~xenial1 all RAID 1 over TCP/IP for Linux module source
ii drbd-utils 9.4.0-1ppa1~xenial1 amd64 RAID 1 over TCP/IP for Linux (user utilities)
If someone could help me understand this situation I would be really
grateful.
--
Artur Kaszuba