Hi,

I am trying to use DRBD 8.0.12 with a modified Linux 2.6.21 based code and a
proprietary Linux IP stack. I am running into a problem where the Primary
closes the connection to the Secondary because 'sock_recvmsg' on the primary
node returns ERESTARTSYS, which is not handled.

I have pasted the command activity and the dmesg output of the primary and
secondary nodes below. Has anyone else faced a similar problem with
ERESTARTSYS? How is this supposed to be handled in the kernel?

Any help is appreciated.

Thanks,
Praveen.

PRIMARY NODE COMMAND ACTIVITY:
==============================
root at drbd1:/root> drbdadm create-md all
v08 Magic number not found
v07 Magic number not found
v07 Magic number not found
v08 Magic number not found
Writing meta data...
initialising activity log
NOT initialized bitmap
New drbd meta data block sucessfully created.
success
root at drbd1:/root> cat /proc/drbd
version: 8.0.12 (api:86/proto:86)
GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by wichorus_bld at build5.wichorus.com, 2008-05-27 16:10:47
root at drbd1:/root> drbdadm up all
root at drbd1:/root> cat /proc/drbd
version: 8.0.12 (api:86/proto:86)
GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by wichorus_bld at build5.wichorus.com, 2008-05-27 16:10:47
 0: cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
    resync: used:0/61 hits:0 misses:0 starving:0 dirty:0 changed:0
    act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
root at drbd1:/root> drbdsetup /dev/drbd0 primary -o
root at drbd1:/root> cat /proc/drbd
version: 8.0.12 (api:86/proto:86)
GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by wichorus_bld at build5.wichorus.com, 2008-05-27 16:10:47
 0: cs:StandAlone st:Primary/Unknown ds:UpToDate/Inconsistent r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
    resync: used:0/61 hits:0 misses:0 starving:0 dirty:0 changed:0
    act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
root at drbd1:/root>

PRIMARY 'dmesg' DRBD OUTPUT:
==========================
root at drbd1:/root> dmesg|grep drbd
drbd: initialised. Version: 8.0.12 (api:86/proto:86)
drbd: GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by wichorus_bld at build5.wichorus.com, 2008-05-27 16:10:47
drbd: registered as block device major 147
drbd: minor_table @ 0xcf7cbcc0
drbd0: disk( Diskless -> Attaching )
drbd0: Starting worker thread (from cqueue/0 [2315])
drbd0: No usable activity log found.
drbd0: max_segment_size ( = BIO size ) = 32768
drbd0: drbd_bm_resize called with capacity == 1048152
drbd0: resync bitmap: bits=131019 words=4096
drbd0: size = 512 MB (524076 KB)
drbd0: Writing the whole bitmap, size changed
drbd0: writing of bitmap took 1 jiffies
drbd0: 512 MB (131019 bits) marked out-of-sync by on disk bit-map.
drbd0: reading of bitmap took 4 jiffies
drbd0: recounting of set bits took additional 0 jiffies
drbd0: 512 MB (131019 bits) marked out-of-sync by on disk bit-map.
drbd0: disk( Attaching -> Inconsistent )
drbd0: Writing meta data super block now.
drbd0: conn( StandAlone -> Unconnected )
drbd0: Starting receiver thread (from drbd0_worker [2339])
drbd0: receiver (re)started
drbd0: conn( Unconnected -> WFConnection )
drbd0: Handshake successful: DRBD Network Protocol version 86
drbd0: conn( WFConnection -> WFReportParams )
drbd0: Starting asender thread (from drbd0_receiver [2344])
drbd0: No resync, but 131019 bits in bitmap!
drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> Inconsistent )
drbd0: Writing meta data super block now.
drbd0: sock_recvmsg returned -512
drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure )
drbd0: asender terminated
drbd0: Terminating asender thread
drbd0: role( Secondary -> Primary ) disk( Inconsistent -> UpToDate )
drbd0: short read expecting header on sock: r=-512
drbd0: Writing meta data super block now.
drbd0: Forced to consider local data as UpToDate!
drbd0: Creating new current UUID
drbd0: Writing meta data super block now.
drbd0: tl_clear()
drbd0: Connection closed
drbd0: conn( NetworkFailure -> Unconnected )
drbd0: receiver terminated
drbd0: receiver (re)started
drbd0: conn( Unconnected -> WFConnection )
drbd0: Unable to bind sock2 (-98)
drbd0: conn( WFConnection -> Disconnecting )
drbd0: Discarding network configuration.
drbd0: tl_clear()
drbd0: Connection closed
drbd0: conn( Disconnecting -> StandAlone )
drbd0: receiver terminated
drbd0: Terminating receiver thread
root at drbd1:/root>

SECONDARY NODE COMMAND ACTIVITY:
=================================
root at drbd2:/root> drbdadm create-md all
v08 Magic number not found
v07 Magic number not found
v07 Magic number not found
v08 Magic number not found
Writing meta data...
initialising activity log
NOT initialized bitmap
New drbd meta data block sucessfully created.
success
root at drbd2:/root> cat /proc/drbd
version: 8.0.12 (api:86/proto:86)
GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by wichorus_bld at build5.wichorus.com, 2008-05-27 16:10:47
root at drbd2:/root> drbdadm up all
root at drbd2:/root> cat /proc/drbd
version: 8.0.12 (api:86/proto:86)
GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by wichorus_bld at build5.wichorus.com, 2008-05-27 16:10:47
 0: cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
    resync: used:0/61 hits:0 misses:0 starving:0 dirty:0 changed:0
    act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
root at drbd2:/root> cat /proc/drbd
version: 8.0.12 (api:86/proto:86)
GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by wichorus_bld at build5.wichorus.com, 2008-05-27 16:10:47
 0: cs:StandAlone st:Secondary/Unknown ds:Inconsistent/Inconsistent r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
    resync: used:0/61 hits:0 misses:0 starving:0 dirty:0 changed:0
    act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
root at drbd2:/root>

SECONDARY 'dmesg' DRBD OUTPUT:
=============================
root at drbd2:/root> dmesg|grep drbd
drbd: initialised. Version: 8.0.12 (api:86/proto:86)
drbd: GIT-hash: 5c9f89594553e32adb87d9638dce591782f947e3 build by wichorus_bld at build5.wichorus.com, 2008-05-27 16:10:47
drbd: registered as block device major 147
drbd: minor_table @ 0xcf531d40
drbd0: disk( Diskless -> Attaching )
drbd0: Starting worker thread (from cqueue/0 [2329])
drbd0: No usable activity log found.
drbd0: max_segment_size ( = BIO size ) = 32768
drbd0: drbd_bm_resize called with capacity == 1048152
drbd0: resync bitmap: bits=131019 words=4096
drbd0: size = 512 MB (524076 KB)
drbd0: Writing the whole bitmap, size changed
drbd0: writing of bitmap took 0 jiffies
drbd0: 512 MB (131019 bits) marked out-of-sync by on disk bit-map.
drbd0: reading of bitmap took 3 jiffies
drbd0: recounting of set bits took additional 1 jiffies
drbd0: 512 MB (131019 bits) marked out-of-sync by on disk bit-map.
drbd0: disk( Attaching -> Inconsistent )
drbd0: Writing meta data super block now.
drbd0: conn( StandAlone -> Unconnected )
drbd0: Starting receiver thread (from drbd0_worker [2341])
drbd0: receiver (re)started
drbd0: conn( Unconnected -> WFConnection )
drbd0: Handshake successful: DRBD Network Protocol version 86
drbd0: conn( WFConnection -> WFReportParams )
drbd0: Starting asender thread (from drbd0_receiver [2346])
drbd0: No resync, but 131019 bits in bitmap!
drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> Inconsistent )
drbd0: Writing meta data super block now.
drbd0: meta connection shut down by peer.
drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure )
drbd0: asender terminated
drbd0: Terminating asender thread
drbd0: sock was shut down by peer
drbd0: short read expecting header on sock: r=0
drbd0: Writing meta data super block now.
drbd0: tl_clear()
drbd0: Connection closed
drbd0: conn( NetworkFailure -> Unconnected )
drbd0: receiver terminated
drbd0: receiver (re)started
drbd0: conn( Unconnected -> WFConnection )
drbd0: Unable to bind sock2 (-98)
drbd0: conn( WFConnection -> Disconnecting )
drbd0: Discarding network configuration.
drbd0: tl_clear()
drbd0: Connection closed
drbd0: conn( Disconnecting -> StandAlone )
drbd0: receiver terminated
drbd0: Terminating receiver thread
root at drbd2:/root>
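
For anyone decoding the numeric codes in the dmesg output above, here is a minimal userspace sketch. It is not DRBD code. It assumes that -512 is ERESTARTSYS (as named in the post; the constant is kernel-internal and normally indicates a pending signal, so it is defined locally here) and that -98 is EADDRINUSE, which holds on most Linux architectures but may not on this modified kernel. The helper name rc_name is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Assumption: 512 is ERESTARTSYS as defined in the kernel's
 * include/linux/errno.h. Userspace <errno.h> does not expose it,
 * so it is defined locally for this sketch. */
#define KERNEL_ERESTARTSYS 512

/* Hypothetical helper: map a negative return code from a log line
 * (e.g. "sock_recvmsg returned -512") to a short name.
 * Returns a static string; not thread-safe. */
static const char *rc_name(int rc)
{
    static char buf[128];
    int err = -rc;

    if (rc >= 0)
        return "no error";
    if (err == KERNEL_ERESTARTSYS)
        return "ERESTARTSYS";
    if (err == EADDRINUSE)
        return "EADDRINUSE";

    /* Fall back to the C library's description for other codes. */
    snprintf(buf, sizeof buf, "errno %d (%s)", err, strerror(err));
    return buf;
}
```

Calling rc_name(-512) gives the name for the code in "sock_recvmsg returned -512". On a system where EADDRINUSE is 98, rc_name(-98) would name the code in "Unable to bind sock2 (-98)".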