Hi All,<div><br></div><div>I'm having trouble with the initial setup of DRBD Proxy. My config is as follows:</div><div><br></div><div><div>cat drbd.conf</div><div># See /usr/share/doc/drbd-8.0.2/drbd.conf or fully annotated file.</div>
<div><br></div><div>global { usage-count no; }</div><div><br></div><div>common {</div><div> protocol A;</div><div> handlers {</div><div> pri-on-incon-degr "/usr/bin/logger -p local2.emerg -t DRBD pri-on-incon-degr tripped";</div>
<div> pri-lost-after-sb "/usr/bin/logger -p local2.emerg -t DRBD pri-lost-after-sb tripped";</div><div> local-io-error "/usr/bin/logger -p local2.emerg -t DRBD local-io-error tripped";</div><div>
#outdate-peer "/usr/sbin/drbd-peer-outdater"; # needs more setup to use</div><div> }</div><div> syncer {</div><div> #rate 10M; # 100baseT </div><div> #rate 100M; # Full Gig-E</div><div> rate 12M; # don't saturate disk</div>
<div> }</div><div> startup {</div><div> wfc-timeout 30; # 30 seconds</div><div> degr-wfc-timeout 120; # 2 minutes.</div><div> } </div><div><br></div><div> net {</div><div> # increase timeout and maybe ping-int in net{}, if you see</div>
<div> # problems with "connection lost/connection established"</div><div><br></div><div> # timeout 60; # 6 seconds (unit = 0.1 seconds)</div><div> # connect-int 10; # 10 seconds (unit = 1 second)</div>
<div> # ping-int 10; # 10 seconds (unit = 1 second)</div><div> # ping-timeout 5; # 500 ms (unit = 0.1 seconds)</div><div><br></div><div># With this option set you might make both nodes primary. You only should use </div>
<div># this option if you use a shared storage file system on top of DRBD. At the </div><div># time of writing the only ones are: OCFS2 and GFS. If you use this option with </div><div># any other filesystem you are going to crash your nodes and to corrupt your data!</div>
<div># allow-two-primaries;</div><div> cram-hmac-alg "sha1";</div><div> shared-secret "nyb";</div><div> }</div><div>}</div><div><br></div></div><div><br></div><div><div>resource r0 {</div><div>
<br></div><div> proxy {</div><div> compression on;</div><div> memlimit 100M;</div><div> }</div><div><br></div><div> on ip-10-251-193-191 {</div><div> device /dev/drbd0;</div><div> disk /dev/VolGroup00/backupvg;</div>
<div> address 127.0.0.1:7789;</div><div> </div><div> flexible-meta-disk internal;</div><div><br></div><div> proxy on ip-10-251-193-191 {</div><div> inside 127.0.0.1:7788;</div>
<div> outside 10.251.193.191:7788;</div><div> }</div><div><br></div><div> }</div><div><br></div><div> on fx-5 {</div><div> device /dev/drbd0;</div><div> disk /dev/loop0;</div>
<div> </div><div> address 127.0.0.1:7789;</div><div> </div><div> flexible-meta-disk internal;</div><div><br></div><div> proxy on fx-5 {</div><div> inside 127.0.0.1:7788;</div>
<div> outside 38.104.nyb.nyb:7788;</div><div> }</div><div> }</div><div><br></div><div><br></div><div><br></div><div>}</div><div><br></div><div>I only have two nodes: one is a regular server (fx-5) and the other is in the Amazon cloud (ip-10-251-193-191). The log on the cloud node says:</div>
<div><br></div><div><div>Aug 28 14:20:46 ip-10-251-193-191 kernel: [9313062.114113] block drbd0: conn( Unconnected -> WFConnection ) </div><div>Aug 28 14:21:18 ip-10-251-193-191 kernel: [9313093.773054] block drbd0: Handshake successful: Agreed network protocol version 90</div>
<div>Aug 28 14:21:18 ip-10-251-193-191 kernel: [9313093.773111] block drbd0: sock_sendmsg returned -32</div><div>Aug 28 14:21:18 ip-10-251-193-191 kernel: [9313093.773116] block drbd0: conn( WFConnection -> BrokenPipe ) </div>
<div>Aug 28 14:21:18 ip-10-251-193-191 kernel: [9313093.773121] block drbd0: Authentication of peer failed</div><div>Aug 28 14:21:18 ip-10-251-193-191 kernel: [9313093.773123] block drbd0: Discarding network configuration.</div>
<div>Aug 28 14:21:18 ip-10-251-193-191 kernel: [9313093.773125] block drbd0: conn( BrokenPipe -> Disconnecting ) </div><div>Aug 28 14:21:18 ip-10-251-193-191 kernel: [9313093.773348] block drbd0: Connection closed</div>
<div>Aug 28 14:21:18 ip-10-251-193-191 kernel: [9313093.773358] block drbd0: conn( Disconnecting -> StandAlone ) </div><div>Aug 28 14:21:18 ip-10-251-193-191 kernel: [9313093.773409] block drbd0: receiver terminated</div>
<div>Aug 28 14:21:18 ip-10-251-193-191 kernel: [9313093.773411] block drbd0: Terminating receiver thread</div><div><br></div><div>netstat -an|grep 7788 on the cloud shows:</div><div><br></div><div><div>[root@ip-10-251-193-191:/etc] netstat -an|grep 7788</div>
<div>tcp 0 0 10.251.193.191:7788 0.0.0.0:* LISTEN </div><div>tcp 0 0 127.0.0.1:7788 0.0.0.0:* LISTEN</div>
<div><br></div><div><div>netstat -an|grep 7789</div><div>[root@ip-10-251-193-191:/etc]</div><div><br></div><div>Nothing shows up for 7789 there. I think the cause is the "Authentication of peer failed" message above. How do I fix that?</div><div>
<br></div><div>On the non-cloud side, netstat shows both 7788 and 7789 listening, but the logs constantly repeat this:</div><div><br></div><div><div>Aug 28 11:27:38 fx-5 kernel: block drbd0: Connection closed</div><div>Aug 28 11:27:38 fx-5 kernel: block drbd0: conn( BrokenPipe -> Unconnected ) </div>
<div>Aug 28 11:27:39 fx-5 kernel: block drbd0: conn( Unconnected -> WFConnection ) </div><div>Aug 28 11:27:53 fx-5 kernel: block drbd0: sock_recvmsg returned -11</div><div>Aug 28 11:27:53 fx-5 kernel: block drbd0: conn( WFConnection -> BrokenPipe ) </div>
<div>Aug 28 11:27:53 fx-5 kernel: block drbd0: short read expecting header on sock: r=-11</div><div>Aug 28 11:27:53 fx-5 kernel: block drbd0: Connection closed</div><div>Aug 28 11:27:53 fx-5 kernel: block drbd0: conn( BrokenPipe -> Unconnected ) </div>
<div>Aug 28 11:27:54 fx-5 kernel: block drbd0: conn( Unconnected -> WFConnection ) </div><div>Aug 28 11:28:08 fx-5 kernel: block drbd0: sock_recvmsg returned -11</div><div>Aug 28 11:28:08 fx-5 kernel: block drbd0: conn( WFConnection -> BrokenPipe ) </div>
<div>Aug 28 11:28:08 fx-5 kernel: block drbd0: short read expecting header on sock: r=-11</div><div>Aug 28 11:28:08 fx-5 kernel: block drbd0: Connection closed</div><div><br></div></div><div><br></div><div>-Tim</div></div>
</div></div></div>
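A note on reading the two log excerpts above: the numbers in "sock_sendmsg returned -32" and "sock_recvmsg returned -11" are negated Linux errno values. The following is only a minimal sketch for decoding them. The `decode` helper and the small `LINUX_ERRNO` table are hypothetical names made up for illustration, not part of DRBD, and the table hard-codes the Linux values (11 = EAGAIN, 32 = EPIPE) rather than querying the running system. It does not diagnose the authentication failure itself.

```python
import re

# Linux errno values seen in the logs above (hard-coded; other OSes number these differently).
LINUX_ERRNO = {
    11: ("EAGAIN", "resource temporarily unavailable; on a socket with a "
                   "receive timeout this usually means the timeout expired"),
    32: ("EPIPE", "broken pipe; the other end closed the connection"),
}

# Matches kernel log lines such as "block drbd0: sock_sendmsg returned -32".
PATTERN = re.compile(r"(sock_sendmsg|sock_recvmsg) returned -(\d+)")


def decode(line):
    """Return (call, errno_name, meaning) for a matching log line, else None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    call, code = match.group(1), int(match.group(2))
    name, meaning = LINUX_ERRNO.get(code, ("UNKNOWN", "not in this table"))
    return call, name, meaning


if __name__ == "__main__":
    samples = [
        "Aug 28 14:21:18 ip-10-251-193-191 kernel: block drbd0: sock_sendmsg returned -32",
        "Aug 28 11:27:53 fx-5 kernel: block drbd0: sock_recvmsg returned -11",
    ]
    for sample in samples:
        print(decode(sample))
```

Read this way, the cloud node's send hit a connection that the other end had already closed (EPIPE), while fx-5 kept waiting for a header that never arrived (EAGAIN).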