Sorry for another post; it's something I'm working on quite actively.
The problem appears to occur when a DRBD peer is rebooted while the
mount is in use (e.g. while a file is being transferred to it): the
surviving node gets hard-rebooted (no shutdown actions are run). Should
I assume this is a kernel error, or something that has since been fixed,
and raise a bug with the Ubuntu-server team asking them to port a DRBD
version newer than 8.0.11?
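
One way to check whether any handler is involved at all, as a rough
sketch only: temporarily swap the reboot-sane calls in the handlers
section for log-only commands and repeat the test. The handler names
are the ones from the drbd.conf quoted below; the log messages are made
up for illustration, and this assumes the rest of the resource section
stays unchanged.

```
handlers {
    # Debugging variant (assumption: for testing only, not production).
    # None of these call reboot-sane, so a reboot seen during the test
    # cannot have been started by these handlers.
    pri-on-incon-degr "echo 'DRBD: pri-on-incon-degr fired' >> /var/log/drbd.log";
    pri-lost-after-sb "echo 'DRBD: pri-lost-after-sb fired' >> /var/log/drbd.log";
    local-io-error    "echo 'DRBD: local-io-error fired' >> /var/log/drbd.log";
    pri-lost          "echo 'DRBD: pri-lost fired' >> /var/log/drbd.log";
}
```

If the node still hard-reboots with this in place, the handlers can be
ruled out. Running `cat /proc/drbd` on both nodes shows the version of
the loaded DRBD module, which would be worth quoting in any bug report.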
Henri Cook wrote:
> An off-list respondent suggested that I post my drbd.conf. I don't
> have any reboot options in there (that are being used), but I may be
> missing something. (Below)
>
> Please note that reboot-sane (a custom reboot script) emails me,
> writes to a file, waits 2 seconds, then reboots the system. It is NOT
> being called in the situation I describe (that's what I originally
> thought was happening).
>
> #
> # drbd.conf
> #
>
> skip {
>
> }
>
> global {
> # minor-count 64;
> # dialog-refresh 5; # 5 seconds
> usage-count yes;
> }
>
> common {
> syncer { rate 60M; }
> protocol C;
> }
>
> resource shared {
>
> handlers {
> pri-on-incon-degr "echo 'DRBD: Inconsistent - Rebooting.' >> /var/log/drbd.log ; /usr/local/sbin/reboot-sane";
> pri-lost-after-sb "echo 'DRBD: A split brain situation occurred. This node lost. Rebooted' >> /var/log/drbd.log ; /usr/local/sbin/reboot-sane";
> local-io-error "echo 'DRBD: A local IO error occurred, rebooting.' >> /var/log/drbd.log ; /usr/local/sbin/reboot-sane";
> pri-lost "echo 'DRBD: Pri-lost, check log files.' >> /var/log/drbd.log ; /usr/local/sbin/reboot-sane";
>
> # NB: The machine we're on now is torvil; the -p option here is different on the other host
> outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5 -r shared -p dean";
>
> split-brain "echo 'DRBD: A split-brain situation occurred and was resolved successfully.' >> /var/log/drbd.log";
> }
>
> startup {
> wfc-timeout 15;
>
> degr-wfc-timeout 30;
>
> # In case you are using DRBD for GFS/OCFS2 you want that the
> # startup script promotes it to primary. Nodenames are also
> # possible instead of "both".
> become-primary-on both;
> }
>
> disk {
> on-io-error detach;
>
> fencing resource-only;
> }
>
> net {
>
> # timeout 60; # 6 seconds (unit = 0.1 seconds)
> # connect-int 10; # 10 seconds (unit = 1 second)
> # ping-int 10; # 10 seconds (unit = 1 second)
> # ping-timeout 5; # 500 ms (unit = 0.1 seconds)
>
> # max-buffers 2048;
>
> # unplug-watermark 128;
>
> # max-epoch-size 2048;
>
> ko-count 4;
>
> allow-two-primaries;
>
> cram-hmac-alg "sha256";
> shared-secret "w405FDS^%tngpDSFg^";
>
> after-sb-0pri discard-older-primary;
> after-sb-1pri discard-secondary;
> after-sb-2pri call-pri-lost-after-sb;
>
> rr-conflict call-pri-lost;
>
> }
>
> syncer {
> al-extents 257;
> }
>
> on torvil {
> device /dev/drbd0;
> disk /dev/md4;
> address 10.0.0.2:7788;
> meta-disk /dev/md5[0];
> }
>
> on dean {
> device /dev/drbd0;
> disk /dev/md4;
> address 10.0.0.3:7788;
> meta-disk /dev/md5[0];
> }
> }
>
>
>
> Henri Cook wrote:
>> Please, can anyone help? This is severely affecting my setup.
>>
>> If I start, say, an FTP file transfer to my DRBD /shared directory on
>> node A, then reboot node B (the other machine in the Primary-Primary
>> configuration), DRBD on node A registers a NetworkFailure, which
>> appears to trigger a reboot action. I can't find anywhere to define
>> this behaviour; I'd very much like to stop the reboot happening.
>>
>> So, to confirm the behaviour: during a transfer to A onto /shared, if
>> I reboot B, then as soon as A loses the connection to B, A reboots as
>> well, crippling the pair.
>>
>> Henri
>> _______________________________________________
>> drbd-user mailing list
>> drbd-user at lists.linbit.com
>> http://lists.linbit.com/mailman/listinfo/drbd-user
>>