[DRBD-user] DRBD - on ARM (armel)

Nick Liebmann nickdrbd at alfiecam.co.uk
Tue Jun 16 23:49:10 CEST 2009


Hi Philipp,

I have spent a bit more time looking into this. I am now working from 
the latest version of the git repository, which seems to have had your 
changes applied.

I have gone back to basics, since I was concerned I was feeding you 
misinformation.


I have built the userland binaries and the kernel module from the 
latest source (git HEAD, no patches).

I have copied them onto both machines (the two machines are identical).

My DRBD config is as follows:

***********************
global {
    usage-count yes;
}

common {
    syncer {
        rate 10M;
    }
}

resource r0 {
  protocol C;
  device     /dev/drbd0;
  meta-disk  internal;

  net {
    allow-two-primaries;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }

  startup {
#    become-primary-on both;
  }

  on nasty {
    address   XXX.XXX.XXX.XXX:7788;
    disk       /dev/sda7;
  }

  on coral {
    address   XXX.XXX.XXX.XXX:7788;
    disk       /dev/sda7;
  }
}
*****************************

I am running the following commands on both machines:

dd if=/dev/zero of=/dev/sda7 bs=10M

nasty:~# drbdadm create-md r0
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
success

nasty:~# modprobe drbd
</var/log/kern.log
Jun 16 22:32:14 nasty kernel: drbd: initialised. Version: 8.3.2rc1 
(api:88/proto:86-90)
Jun 16 22:32:14 nasty kernel: drbd: GIT-hash: 
df0abe3ea8a2beb412e094fbc76c8f4f0ba91858 build by nick at nasty, 2009-06-16 
22:04:24
Jun 16 22:32:14 nasty kernel: drbd: registered as block device major 147
Jun 16 22:32:14 nasty kernel: drbd: minor_table @ 0xc6f80620

nasty:~# drbdadm up r0
0: Failure: (126) UnknownMandatoryTag
Command 'drbdsetup 0 net XXX.XXX.XXX.XXX:7788 XXX.XXX.XXX.XXX:7788 C 
--set-defaults --create-device --allow-two-primaries 
--after-sb-0pri=discard-zero-changes --after-sb-1pri=discard-secondary 
--after-sb-2pri=disconnect' terminated with exit code 10

</var/log/kern.log
Jun 16 22:34:19 nasty kernel: block drbd0: Starting worker thread (from 
cqueue [8513])
Jun 16 22:34:19 nasty kernel: block drbd0: disk( Diskless -> Attaching )
Jun 16 22:34:19 nasty kernel: block drbd0: No usable activity log found.
Jun 16 22:34:19 nasty kernel: block drbd0: Method to ensure write 
ordering: barrier
Jun 16 22:34:19 nasty kernel: block drbd0: max_segment_size ( = BIO size 
) = 32768
Jun 16 22:34:19 nasty kernel: block drbd0: drbd_bm_resize called with 
capacity == 208696
Jun 16 22:34:19 nasty kernel: block drbd0: resync bitmap: bits=26087 
words=816
Jun 16 22:34:19 nasty kernel: block drbd0: size = 102 MB (104348 KB)
Jun 16 22:34:19 nasty kernel: block drbd0: Writing the whole bitmap, 
size changed
Jun 16 22:34:20 nasty kernel: block drbd0: 102 MB (26087 bits) marked 
out-of-sync by on disk bit-map.
Jun 16 22:34:20 nasty kernel: block drbd0: recounting of set bits took 
additional 0 jiffies
Jun 16 22:34:20 nasty kernel: block drbd0: 102 MB (26087 bits) marked 
out-of-sync by on disk bit-map.
Jun 16 22:34:20 nasty kernel: block drbd0: disk( Attaching -> 
Inconsistent )
Jun 16 22:34:20 nasty kernel: block drbd0: Unknown tag: 3526

As you can see, this is not good!

If I run the individual steps one at a time instead:

nasty:~# drbdadm attach r0
</var/log/kern.log
Jun 16 22:38:05 nasty kernel: block drbd0: Starting worker thread (from 
cqueue [8513])
Jun 16 22:38:05 nasty kernel: block drbd0: disk( Diskless -> Attaching )
Jun 16 22:38:05 nasty kernel: block drbd0: No usable activity log found.
Jun 16 22:38:05 nasty kernel: block drbd0: Method to ensure write 
ordering: barrier
Jun 16 22:38:05 nasty kernel: block drbd0: max_segment_size ( = BIO size 
) = 32768
Jun 16 22:38:05 nasty kernel: block drbd0: drbd_bm_resize called with 
capacity == 208696
Jun 16 22:38:05 nasty kernel: block drbd0: resync bitmap: bits=26087 
words=816
Jun 16 22:38:05 nasty kernel: block drbd0: size = 102 MB (104348 KB)
Jun 16 22:38:05 nasty kernel: block drbd0: recounting of set bits took 
additional 0 jiffies
Jun 16 22:38:05 nasty kernel: block drbd0: 102 MB (26087 bits) marked 
out-of-sync by on disk bit-map.
Jun 16 22:38:05 nasty kernel: block drbd0: disk( Attaching -> 
Inconsistent )

nasty:~# cat /proc/drbd
version: 8.3.2rc1 (api:88/proto:86-90)
GIT-hash: df0abe3ea8a2beb412e094fbc76c8f4f0ba91858 build by nick at nasty, 
2009-06-16 22:04:24
 0: cs:StandAlone ro:Secondary/Unknown ds:Inconsistent/DUnknown   r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:104348
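
As a sanity check, the capacity, size, bitmap and oos figures above are 
consistent with each other. A minimal sketch of the arithmetic, assuming 
the capacity is in 512-byte sectors, one bitmap bit covers 4 KiB, and a 
bitmap word is 32 bits on this platform. The constants and function names 
are my own, inferred from the log, not taken from the DRBD source:

```c
#include <stdint.h>

#define SKETCH_SECTOR_BYTES   512u   /* assumption: capacity is in 512-byte sectors */
#define SKETCH_BYTES_PER_BIT  4096u  /* assumption: one bitmap bit per 4 KiB */
#define SKETCH_BITS_PER_WORD  32u    /* assumption: 32-bit words on armel */

/* Device size in KB for a capacity given in sectors. */
static uint64_t sketch_capacity_to_kb(uint64_t sectors)
{
        return sectors * SKETCH_SECTOR_BYTES / 1024u;
}

/* Number of bitmap bits needed to cover the device, rounded up. */
static uint64_t sketch_capacity_to_bits(uint64_t sectors)
{
        uint64_t bytes = sectors * SKETCH_SECTOR_BYTES;
        return (bytes + SKETCH_BYTES_PER_BIT - 1) / SKETCH_BYTES_PER_BIT;
}

/* Number of words needed to hold that many bits, rounded up. */
static uint64_t sketch_bits_to_words(uint64_t bits)
{
        return (bits + SKETCH_BITS_PER_WORD - 1) / SKETCH_BITS_PER_WORD;
}
```

With capacity == 208696 this gives 104348 KB, 26087 bits and 816 words, 
which matches the log, so the bitmap sizing itself looks sane.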

nasty:~# drbdadm syncer r0

nasty:~# drbdadm connect r0
</var/log/kern.log
Jun 16 22:40:35 nasty kernel: block drbd0: conn( StandAlone -> 
Unconnected )
Jun 16 22:40:35 nasty kernel: block drbd0: Starting receiver thread 
(from drbd0_worker [1370])
Jun 16 22:40:35 nasty kernel: block drbd0: receiver (re)started
Jun 16 22:40:35 nasty kernel: block drbd0: conn( Unconnected -> 
WFConnection )
Jun 16 22:40:36 nasty kernel: block drbd0: Handshake successful: Agreed 
network protocol version 90
Jun 16 22:40:36 nasty kernel: block drbd0: conn( WFConnection -> 
WFReportParams )
Jun 16 22:40:36 nasty kernel: block drbd0: Starting asender thread (from 
drbd0_receiver [1424])
Jun 16 22:40:36 nasty kernel: block drbd0: data-integrity-alg: <not-used>
Jun 16 22:40:36 nasty kernel: block drbd0: drbd_sync_handshake:
Jun 16 22:40:36 nasty kernel: block drbd0: self 
0000000000000004:0000000000000000:0000000000000000:0000000000000000 
bits:26087 flags:0
Jun 16 22:40:36 nasty kernel: block drbd0: peer 
0000000000000004:0000000000000000:0000000000000000:0000000000000000 
bits:26087 flags:0
Jun 16 22:40:36 nasty kernel: block drbd0: uuid_compare()=0 by rule 1
Jun 16 22:40:36 nasty kernel: block drbd0: No resync, but 26087 bits in 
bitmap!
Jun 16 22:40:36 nasty kernel: block drbd0: peer( Unknown -> Secondary ) 
conn( WFReportParams -> Connected ) pdsk( DUnknown -> Inconsistent )

nasty:~# cat /proc/drbd
version: 8.3.2rc1 (api:88/proto:86-90)
GIT-hash: df0abe3ea8a2beb412e094fbc76c8f4f0ba91858 build by nick at nasty, 
2009-06-16 22:04:24
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:104348
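
My reading of the handshake lines above: both nodes report the same 
current UUID (0x4), uuid_compare() returns 0, and so no resync is started 
even though every bit in the bitmap is set. A sketch of that check as I 
understand it from the log alone. The constant and function names are 
hypothetical, and the real rule in the DRBD source may test more than this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; the value 4 is simply what both nodes logged as
 * their current UUID right after create-md. */
#define SKETCH_UUID_JUST_CREATED ((uint64_t)4)

/* True when both current UUIDs still hold the freshly-created value.
 * In the log this is the case reported as "uuid_compare()=0 by rule 1",
 * which is followed by "No resync". */
static bool sketch_both_just_created(uint64_t self_current, uint64_t peer_current)
{
        return self_current == SKETCH_UUID_JUST_CREATED &&
               peer_current == SKETCH_UUID_JUST_CREATED;
}
```

If that reading is right, it would explain why both disks stay 
Inconsistent after connecting.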


nasty:~# drbdadm -- --overwrite-data-of-peer primary r0
0: State change failed: (-2) Refusing to be Primary without at least one 
UpToDate disk
Command 'drbdsetup 0 primary --overwrite-data-of-peer' terminated with 
exit code 17

Jun 16 22:43:11 nasty kernel: block drbd0: State change failed: Refusing 
to be Primary without at least one UpToDate disk
Jun 16 22:43:11 nasty kernel: block drbd0:   state = { cs:Connected 
ro:Secondary/Secondary ds:Inconsistent/Inconsistent r--- }
Jun 16 22:43:11 nasty kernel: block drbd0:  wanted = { cs:Connected 
ro:Primary/Secondary ds:Inconsistent/Inconsistent r--- }

The behaviour is the same from whichever end I run the last command, and 
the logs are essentially the same at both ends.

I am confident my network (firewall) connection is valid; it was working 
with the previous (patched) version this morning.

Is there any more information you would like from me?

I have some time to do some more debugging tomorrow. Could you give 
me a clue where I should start looking?

Regards

Nick







Philipp Reisner wrote:
> On Monday 15 June 2009 23:21:51 you wrote:
>   
>> Hi,
>>
>> With the latest patch, things are not working very well at all!
>>
>>
>> I think the problem may lie here:
>>
>>
>> #define put_unaligned(val, ptr) ({                    \
>> +    typeof(val) v;                            \
>> +    switch (sizeof(*(ptr))) {                    \
>> +    case 1:                                \
>> +        *(uint8_t *)(ptr) = (uint8_t)(val);            \
>> +        break;                            \
>> +    case 2:                                \
>> +    case 4:                                \
>> +    case 8:                                \
>> +        v = val;                        \
>> +        memcpy(ptr, &v, sizeof(*(ptr)));            \
>> +        break;                            \
>> +    default:                            \
>> +        __bad_unaligned_access_size();                \
>> +        break;                            \
>> +    }                                \
>> +    (void)0; })
>> +
>>
>>
>> When called with
>>
>> put_unaligned(tag, tl->tag_list_cpos++);
>>
>> ptr is evaluated in the switch statement, and twice in the memcpy call,
>> and hence the post-increment will be done three times ....not what we want!
>>
>> Maybe an inline function, or taking a copy of ptr, is safer here.
>>
>>     
>
> Hi Nick,
>
> No, typeof() and sizeof() do not evaluate the expression at runtime. They
> just deliver the type or the size at compile time. But in that macro 
> there was an error with the type conversion.
>
> I have again attached a patch that should fix the issue. Please verify.
>
>   
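
For reference, a minimal sketch of the point about sizeof() and typeof(). 
This is not the DRBD macro and not the attached patch (I do not know 
exactly what the patch changed); put_unaligned_sketch and the helper 
function are names made up for illustration. It declares the temporary 
with the destination's type, which is one way to avoid a size mismatch 
between val and *ptr, and it shows that a post-incremented pointer 
argument advances only once:

```c
#include <stddef.h>
#include <string.h>

/* Sketch only. __typeof__ (a GCC/Clang extension) and sizeof do not
 * evaluate their operand here (no variable-length arrays are involved),
 * so (ptr) is evaluated exactly once, as the first argument of memcpy.
 * The temporary has the destination's type, so memcpy copies exactly
 * sizeof(*(ptr)) bytes of a valid object. */
#define put_unaligned_sketch(val, ptr) do {             \
        __typeof__(*(ptr)) tmp_ = (val);                \
        memcpy((ptr), &tmp_, sizeof(*(ptr)));           \
} while (0)

/* Stores `tag` through a post-incremented cursor, returns how many
 * elements the cursor moved, and reports the stored value via `out`. */
static ptrdiff_t sketch_cursor_advance(unsigned short tag, unsigned short *out)
{
        unsigned short buf[4] = {0, 0, 0, 0};
        unsigned short *cpos = buf;

        put_unaligned_sketch(tag, cpos++);
        *out = buf[0];
        return cpos - buf;
}
```

Calling sketch_cursor_advance(0x1234, &stored) returns 1 and leaves 
stored == 0x1234, so the post-increment happens once, not three times.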


