For those of you who are trying to set up a test cluster and want to stress test it while "simulating" hardware failures: I have just put my Cluster Test Harness into CVS, under testing/CTH/*. You can find a copy of its README below.

To use it for testing DRBD, you should *not* have heartbeat configured, and you should *not* start drbd after boot. Node failures are triggered by doing a "reboot -nf" on the node, which should do an immediate reboot. The node then comes back up right away, so it will not behave like a real node failure if you start drbd from the init scripts.

Link failure is simulated (currently) with "iptables ethX -j DROP", so you had better make sure that you have a dedicated nic for the DRBD traffic, or you will lock yourself out on the admin nic... At some point this will be improved to drop only the relevant drbd connections, which would even make it possible to simulate more independent nics than you actually have.

Disks are used by putting a device-mapper (DM) device on top of the real device. This DM device is then used as if it were the real device, which makes it possible to simulate disk failure by reloading the DM with the "error" target. Current test runs suggest that the "on_error detach;" option of DRBD does work; only the "drbdadm attach" command to *reattach* does not (yet) work as intended under all circumstances.

To get the IPs, hostnames, device names and so on right, you need to write your own configuration. Base it on the sample configurations.

To start, I suggest you only fail the link and relocate test resources. Once you are familiar with how it works, allow more components to fail.

If you want to suggest or contribute more generic test resources, tell me. If you think it (this CTH) does not work as intended, or you think how it was intended makes no sense, tell me. If this triggers bad behaviour of DRBD, scream out loud, and send in the logs and syslogs and oopses and whatever. Lengthy reports best go directly to Philipp and me, and NOT to the list.

Ah, we are talking DRBD-0.7 here, of course.
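To make the link and disk failure simulation described above more concrete, here is a minimal sketch. It is not taken from the CTH sources: the helper names, the mapping name cth_dm0, the nic eth1 and the device /dev/sdb1 are all made up for the example, and the iptables rules are just one plausible way to spell out the "ethX -j DROP" shorthand. The helpers only print the commands, so nothing changes on the node unless you pipe the output into a root shell yourself.

```shell
#!/bin/sh
# Sketch only, under the assumptions named above. All names are hypothetical.
# Every helper prints text; none of them touches the system.

# DM table that passes all I/O through to the real device (normal operation).
# usage: dm_linear_table SECTORS REAL_DEVICE
dm_linear_table() {
    printf '0 %s linear %s 0\n' "$1" "$2"
}

# DM table that fails every I/O request (the simulated dead disk).
# usage: dm_error_table SECTORS
dm_error_table() {
    printf '0 %s error\n' "$1"
}

# Commands that swap the table of an existing DM mapping.
# usage: dm_swap_cmds DM_NAME TABLE
dm_swap_cmds() {
    printf 'dmsetup suspend %s\n' "$1"
    printf "dmsetup load %s --table '%s'\n" "$1" "$2"
    printf 'dmsetup resume %s\n' "$1"
}

# Commands that drop (-A) or restore (-D) all traffic on one nic.
# usage: link_cmds -A|-D NIC
link_cmds() {
    printf 'iptables %s INPUT -i %s -j DROP\n' "$1" "$2"
    printf 'iptables %s OUTPUT -o %s -j DROP\n' "$1" "$2"
}

# Example: show what would be run to fail a 2048-sector disk and one link.
dm_swap_cmds cth_dm0 "$(dm_error_table 2048)"
link_cmds -A eth1
```

To "repair" the disk again, the same swap with the output of dm_linear_table would put the pass-through table back. The sector count has to match the real device; on current systems "blockdev --getsz DEVICE" should report it in 512-byte sectors.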
Thanks,
Lars Ellenberg

----------

= Cluster Test Harness =

== What is this? ==

This is meant to be a generic Cluster Test Harness (CTH).
Its main purpose is to simulate hardware failure.

Since the hardware aspects are abstracted, this can be used to test any
software/hardware interaction in the presence of hardware failures.
Because it is used to test DRBD's behaviour, some DRBD specifics are built in.

The CTH expects
 * to run on one controlling box
 * which has exclusive access to at least two test nodes via ssh
 * that the ssh login will not ask for a password, but just let me through
 * the test nodes to run linux kernel 2.6 (may work with 2.4, too)
 * the test nodes to have "dm" available
 * the test nodes to have "iptables" available

Test nodes may well be UML sessions.

'''Note''' that since some paths are hardcoded, the CTH expects to be run
from the current directory.

=== Requirements ===

Some of the internal scripting was easier to do with versions of grep and ping
that Debian (and maybe other distros) does not install by default.
In particular, I expect
 * ping from iputils-ping instead of netkit-ping
 * grep > 2.5 (e.g. from 2.5.1.ds1-1.backports.org.1)

For stress testing, I like to use tiobench and wbtest. If you want to use
them, you should compile and install them into e.g. /root/bin on the test nodes.

== File List ==

 LGE_CTH*:: The perl module. Only adventurous people should have a look at the internals here.
 functions.sh:: The core bash scripts, which are used and triggered by the CTH.
 generic_test.pl:: Example of how to use the LGE_CTH perl module, use a certain configuration, and generate random hardware failures and test resource relocations. Since some of it is still hardcoded, you have to edit it and choose a config file and some parameters.

some sample configuration files:
 * uml-minna.conf
 * bloodymary.conf

 CTH_bash.sh:: Example of how to use the core functions directly from bash.
 In contrast to generic_test.pl and the LGE_CTH perl modules, this is meant to script one particular failure scenario.
 CTH_bash.conf:: sample configuration for the CTH_bash.sh, similar to uml-minna.conf
 CTH_bash.helpers:: guess what ... to keep the CTH_bash.sh clean.

=== Test Resources ===

 * tiobench
 * wbtest

== AUTHOR ==

of all this crap is Lars Ellenberg <l.g.e at web.de>

In case you care for an explicit '''license statement''':
This is and needs to be GPL.

# vim: set ft=wiki-moin :
# $Id: README,v 1.1.2.1 2004/05/27 12:44:18 lars Exp $