Hi,

I've set up a DRBD8 Active/Active configuration between two remote sites, connected via (Open)VPN over ADSL connections, with an OCFS2 filesystem. Both systems run Debian Etch, kernel 2.6.22-4-686, DRBD v8.0.11, OCFS2 1.3.3.

I know the solution is quite hazardous, but so far it has been working better than I expected. The big issue is that it's very unstable, because it suffers from network dropouts. Sometimes it runs for weeks without problems, sometimes it fails every other hour. I should now have a reasonably safe configuration that restores service upon reboot, but I would like to know whether there is some fine tuning I'm missing that would help it cope better with these problems.

I've recently made some changes to the VPN config (TCP in favour of UDP, for instance), but I still receive a lot of "PingAck did not arrive in time" errors.

I also have the chance to set up another ADSL line at each site, so I could bond two VPN connections together: can this improve the reliability of the DRBD connection? Actually, I would rather dedicate a whole VPN to DRBD and let other traffic go over the other connection.

Here follows my DRBD config:

resource r0 {
  protocol C;

  handlers {
    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";
    pri-lost "echo pri-lost. Have a look at the log files. | mail -s 'DRBD Alert' spam at me.com; reboot";
    pri-lost-after-sb "echo pri-lost-after-sb. Have a look at the log files. | mail -s 'DRBD Alert' spam at me.com; reboot";
    split-brain "echo split-brain. drbdadm -- --discard-my-data connect $DRBD_RESOURCE ? | mail -s 'DRBD Alert' spam at me.com";
  }

  startup {
    wfc-timeout 120;
    degr-wfc-timeout 120;
    become-primary-on both;
  }

  disk {
    on-io-error detach;
    fencing resource-only;
  }

  net {
    allow-two-primaries;
    after-sb-0pri discard-node-file-server-2;
    after-sb-1pri discard-secondary;
    after-sb-2pri call-pri-lost-after-sb;
    rr-conflict call-pri-lost;
    timeout 600;       # Time allowed for a response (in tenths of a second)
    connect-int 11;    # Time between two connection attempts
    ping-int 11;       # Keep-alive interval
    ping-timeout 100;  # Time allowed for a ping response (in tenths of a second)
  }

  syncer {
    rate 10M;
    al-extents 257;
  }
}

Any suggestion is highly appreciated! Thanks.

-- 
Lorenzo Milesi - lorenzo.milesi at yetopen.it
YetOpen S.r.l. - http://www.yetopen.it/
C.so E. Filiberto, 74
23900 Lecco - ITALY - Tel 0341 220 205 - Fax 178 607 8199
GPG/PGP Key-Id: 0xE704E230 - http://keyserver.linux.it
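
The timing options in the net section mix two units (tenths of a second for timeout and ping-timeout, whole seconds for connect-int and ping-int), which makes them hard to compare at a glance. The following is only a rough Python sketch that normalises the posted values to seconds. The helper names (to_seconds, check_timeout) are made up for illustration, and the rule it checks, that timeout should be lower than connect-int and ping-int, is taken from memory of the drbd.conf man page and should be verified against the documentation for the DRBD version in use.

```python
# Timing values copied from the net section of the posted config.
NET_TIMINGS = {
    "timeout": 600,       # tenths of a second
    "connect-int": 11,    # seconds
    "ping-int": 11,       # seconds
    "ping-timeout": 100,  # tenths of a second
}

# Options expressed in units of 0.1 s, per the comments in the config.
TENTHS = {"timeout", "ping-timeout"}


def to_seconds(timings):
    """Return every timing value converted to seconds."""
    return {
        name: (value / 10 if name in TENTHS else float(value))
        for name, value in timings.items()
    }


def check_timeout(seconds):
    """Return the intervals that `timeout` is NOT lower than.

    Assumption: drbd.conf expects timeout < connect-int and
    timeout < ping-int; verify this for your DRBD version.
    """
    return [
        name
        for name in ("connect-int", "ping-int")
        if seconds["timeout"] >= seconds[name]
    ]


seconds = to_seconds(NET_TIMINGS)
for name, value in seconds.items():
    print(f"{name:13s} {value:5.1f} s")

violations = check_timeout(seconds)
if violations:
    print("timeout is not lower than:", ", ".join(violations))
```

With the values above, timeout works out to 60 seconds against 11-second connect and ping intervals, so if the assumed rule holds for this DRBD version, the timeout setting is worth a second look.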