[DRBD-cvs] r1447 - trunk/testing/CTH/wbtest

drbd-user@lists.linbit.com
Wed, 21 Jul 2004 14:16:54 +0200 (CEST)


Author: lars
Date: 2004-07-21 14:16:53 +0200 (Wed, 21 Jul 2004)
New Revision: 1447

Removed:
   trunk/testing/CTH/wbtest/run-wbtest.sh
Modified:
   trunk/testing/CTH/wbtest/README
   trunk/testing/CTH/wbtest/wbtest.c
Log:
rewrote wbtest to match our needs better

Modified: trunk/testing/CTH/wbtest/README
===================================================================
--- trunk/testing/CTH/wbtest/README	2004-07-21 11:23:18 UTC (rev 1446)
+++ trunk/testing/CTH/wbtest/README	2004-07-21 12:16:53 UTC (rev 1447)
@@ -1,25 +1,23 @@
+Original wbtest is
+ * Copyright (C) 2003-2004 EMC Corporation
+ *
+ * wbtest.c - a testing utility for the write barrier file system 
+ *            functionality.
+ *
+ * Written by Brett Russ <russb@emc.com>
 
-slightly patched by lge:
-fsync on directory descriptor, fflush(0) before fork() ...
+rewritten to match my needs
+ Copyright 2004 Lars Ellenberg <l.g.e@web.de>
 
-DON'T use run-wbtest.sh with the CTH, that does not make any sense.
-
 ----
 
 wbtest.c README
 
-Requirements:
+Intended Usage:
 
-Requires (at least) 2 drives: one test drive, with write cache (WC)
-enabled and one safe drive, with WC disabled.  The run-wbtest.sh self
-installer must be modified to point to the path to a directory on the
-test drive that will contain test data files (of size <min> to <max>;
-default 4B to 100KB).  Additionally, supply a path to a directory on
-the safe drive that will contain "checkpoint" files used to verify the
-integrity of the test data files after a powerfail/reboot cycle.
-Last, supply the device names for the safe and test drives such that
-the script can force WC disabled/enabled respectively on these drives
-prior to every run of wbtest.
+You should set up a DRBD pair, mount it on one node, and run wbtest
+on some path on that device. Then you can trigger more or less graceful
+failovers, and run wbtest on the other node.
 
 Compile:
 
@@ -27,52 +25,53 @@
 
 Summary:
 
-The program starts and first scans the safe dir for checkpoint files.
-This dir should be empty on first run.  Finding none, it will begin
-creating processes whose job is to write data to the test dir and log
-it in checkpoint files (1 per process).  Each process writes a number
-of files of size in range <min> to <max> and then exits.  A max of
-<concurrent> procs are launched at once.  A pass is defined as the
-cycle a process makes from <min> to <max> or <max> to <min>; odd PIDs
-ascend, even PIDs descend.  A <passes> value of 0 runs to infinity.
+The program starts and first scans the data dir for files matching the
+expected naming convention. All unfinished files (ending in '+') will
+then just be unlinked, all other files will be verified to match their
+expected size and content.
 
+This data dir should be empty on first run.  Finding no files, wbtest will
+begin to create processes whose job is to create new data files.  Each
+process writes a number of files of size in range <min> to <max>
+(default 4B -- 100KB) and then exits.  A max of <concurrent> procs are
+launched at once.  A pass is defined as the cycle a process makes from
+<min> to <max> or <max> to <min>; odd PIDs ascend, even PIDs descend.
+A <passes> value of 0 runs to infinity.  After <recycle> passes, wbtest
+starts to verify and remove single passes again, to avoid filling up the
+file system completely.
+
 The point is to cut off power to the running machine at a random point
 during the testing.  To do this, either set <passes> to 0 or ensure
 that you set it high enough such that I/O is running when the power is
-shut off.  Care must be taken to ensure that the file system serving
-the test data dir does not fill before power is cut.  Also, if you are
-automating the power failures, ensure not only that the disk doesn't
-fill before the power cuts, but also that the verification process (to
-be discussed next) finishes and I/O begins before power cuts out.
+shut off.
 
-Since the run-wbtest.sh script installs itself at the end of rc.local,
-the wbtest will launch at the end of the next boot.  Now, the safe dir
-should contain valid checkpoint files and these will be scanned and
-control the verification of the test data.  If errors are found,
-testing stops and if a log file was provided the file in error is
-logged.  All test data and checkpoint files that verify successfully
+On failover or reboot, the test data files are verified.
+The file naming convention controls the verification of the test data.
+These files are named <pid>-<size>-<hexdigits>, and are expected to be
+of size <size>, and to be filled with the four byte pattern <hexdigits>.
+A file ending in an additional '+' was not yet completely written, so it
+is not expected to verify correctly. It will just be removed in the
+verification step.
+
+If errors are found, testing stops and if a log file was provided the
+file in error is logged.  All test data files that verify successfully
 are removed.  Any test data files with errors are preserved for
 inspection.
 
 I think that's about it...here's the usage text:
 
-Usage: ./wbtest [-hvVs] [-m <min>] [-M <max>] [-p <passes>]
-[-c <concurrent>] [-l <vLog>] -s <safedir> -t <testdir>
+Usage: ./wbtest [-hvV] [-m <min>] [-M <max>] [-p <passes>]
+		[-r <recycle>] [-c <concurrent>] [-l <vLog>] -d <datadir>
 
-./wbtest - Version 1.0 Options:
-        -h prints this usage text
-        -v forces NO verification step of existing files (if any)
-        -V forces exit after verification step of existing files
-        <min> == minimum IO size to use (bytes)
-        <max> == maximum IO size to use (bytes)
-        <passes> == # of passes to run (0 for INF)
-        <concurrent> == # of processes to run at once
-        <vLog> == log all file verify failures here, otherwise mkstemp
-        <safedir> == writable DIRECTORY to store 'checkpoint' files
-        <testdir> == writable DIRECTORY to store 'test data' files
-
-        Difference between safedir and testdir is that safedir should 
-        be 'safe' storage meaning that it is not using drive write 
-        cache, whereas testdir is intended to be using drive write 
-        cache and the write barrier. Naturally, the two should be on 
-        separate drives
+./wbtest - Version 1.1-lge Options:
+  -h prints this usage text
+  -v SKIP verification step of existing files (if any)
+  -V forces exit after verification step of existing files
+  min         minimum IO size to use (bytes)
+  max         maximum IO size to use (bytes)
+  passes      # of passes to run (0 for INF)
+  recycle     # of passes after which recycling of disk space starts
+  concurrent  # of processes to run at once
+  vLog        log (append) all file verify failures here.
+              otherwise /tmp/wbtest-vLog-<timestamp>-<pid> will be used
+  datadir     required; writable directory on DRBD to store 'test data' files
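
The naming convention described in the README above (<pid>-<size>-<hexdigits>, with a trailing '+' marking an unfinished file) can be illustrated with a small C sketch. This is not the code from wbtest.c; the names classify_name, count_mismatches and the enum values are hypothetical and exist only for this illustration. It assumes the pattern is written as up to eight hex digits and that any trailing character other than a single '+' makes the name "strange".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch; wbtest.c uses other names. */
enum name_kind { NAME_STRANGE, NAME_FINISHED, NAME_UNFINISHED };

/*
 * Parse a data file name of the form <pid>-<size>-<hexdigits>[+].
 * On success the three fields are stored through the pointers.
 */
enum name_kind classify_name(const char *name, unsigned *pid,
			     unsigned *size, unsigned *pattern)
{
	char mark = '\0', extra = '\0';
	int n;

	assert(name != NULL);
	/* The two %c conversions catch a trailing '+' and any junk after it. */
	n = sscanf(name, "%u-%u-%8x%c%c", pid, size, pattern, &mark, &extra);

	if (n == 3)
		return NAME_FINISHED;	/* complete file: verify its content */
	if (n == 4 && mark == '+')
		return NAME_UNFINISHED;	/* never renamed: just remove it */
	return NAME_STRANGE;		/* not one of our files */
}

/* Count the 32-bit words in buf that differ from the expected pattern. */
size_t count_mismatches(const uint32_t *buf, size_t nwords, uint32_t pattern)
{
	size_t i, bad = 0;

	for (i = 0; i < nwords; i++)
		if (buf[i] != pattern)
			bad++;
	return bad;
}
```

A verification pass along these lines would unlink every name classified as NAME_UNFINISHED, read each NAME_FINISHED file in full, and report an error if the byte count differs from <size> or count_mismatches returns a non-zero value.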

Deleted: trunk/testing/CTH/wbtest/run-wbtest.sh
===================================================================
--- trunk/testing/CTH/wbtest/run-wbtest.sh	2004-07-21 11:23:18 UTC (rev 1446)
+++ trunk/testing/CTH/wbtest/run-wbtest.sh	2004-07-21 12:16:53 UTC (rev 1447)
@@ -1,36 +0,0 @@
-#!/bin/sh
-
-# device name for drive that holds safe directory
-drv_safe="/dev/hdX"
-drv_test="/dev/hdY"
-
-# path prefixes for safe and test directories
-safe_dir="/path/on/drv_safe/wbtest-safe"
-test_dir="/path/on/drv_test/wbtest-test"
-
-# path prefix to wbtest and run-wbtest.sh
-wbtest_path="/root"
-
-if [ ! -d $safe_dir ]; then
-    mkdir -p $safe_dir
-fi
-
-if [ ! -d $test_dir ]; then
-    mkdir -p $test_dir
-	
-fi
-
-# make sure that the safe drive has write cache disabled
-hdparm -W0 $drv_safe
-# make sure that the test drive has write cache enabled
-hdparm -W1 $drv_test
-
-RC_LOCAL_TOUCHED=/var/wbtest-mod-rc-local
-
-if [ ! -f $RC_LOCAL_TOUCHED ]; then
-    echo "${wbtest_path}/run-wbtest.sh" >> /etc/rc.d/rc.local
-    > $RC_LOCAL_TOUCHED
-fi
-
-# Run it!
-${wbtest_path}/wbtest -p 0 -c 40 -s $safe_dir -t $test_dir &

Modified: trunk/testing/CTH/wbtest/wbtest.c
===================================================================
--- trunk/testing/CTH/wbtest/wbtest.c	2004-07-21 11:23:18 UTC (rev 1446)
+++ trunk/testing/CTH/wbtest/wbtest.c	2004-07-21 12:16:53 UTC (rev 1447)
@@ -1,7 +1,7 @@
-/* 
+/*
  * Copyright (C) 2003-2004 EMC Corporation
  *
- * wbtest.c - a testing utility for the write barrier file system 
+ * wbtest.c - a testing utility for the write barrier file system
  *            functionality.
  *
  * Written by Brett Russ <russb@emc.com>
@@ -19,13 +19,38 @@
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ * ----------
+ * 2004 modified by Lars Ellenberg for testing the data integrity
+ * on a "shared" storage like DRBD.
+ *
+ * - I don't need the "Checkpoint files", since any data file name
+ *   already contains all necessary information about its content.
+ *   So I remove all references to the Checkpoint thingy.
+ * - a data file starts as  "$DATA_DIR/%pid-%size-%rnum+"
+ *   and will be renamed to "$DATA_DIR/%pid-%size-%rnum"
+ *   when it was fsynced to disk.
+ * - in the verify stage, we first delete all files that are not fully
+ *   synced to disk ("*+"). Then, all files are verified against their
+ *   expected content (according to their name), then removed.
+ * - when the disk fills up, we do some additional verify stage in
+ *   between, so we continuously produce IO load, but never fill up
+ *   the disk.
+ * - explicitly open log file early
+ *
+ * ToDo:
+ * - make endian safe to be able to test cross platform DRBD
+ * - handle signals more gracefully
+ *
  */
 
 #include <stdio.h>
 #include <stdlib.h>
+#include <stdarg.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <sys/wait.h>
+#include <sys/vfs.h>
 #include <fcntl.h>
 #include <unistd.h>
 #include <string.h>
@@ -34,661 +59,590 @@
 #include <dirent.h>
 #include <assert.h>
 #include <time.h>
+#include <endian.h>
 
-#define WBTEST_VERSION "1.0"
+#define WBTEST_VERSION "1.1-lge"
 
 typedef unsigned int UINT_32;
 
+/* hard limit number of passes before internal cleanup is due */
+#define MAX_RECYCLE   10000
+#define DATA_BUF_LEN   4096
+#define FNAME_LEN       128
 
-#define MEM_SCRATCH_LEN 128
-#define DATA_BUF_LEN 8096
 
-#define WAIT_ON_SINGLE_CHILD 1
-#define WAIT_ON_ALL_CHILDREN 2
+/* global parameters, can be changed on commandline, with defaults: */
+static char Data_path[FNAME_LEN] = "";	// -d
+static char Log_fname[FNAME_LEN] = "";	// -l
+static int Verify_only   = 0;		// -V
+static int Dont_verify   = 0;		// -v
+static UINT_32 Recycle   = 4000;	// -r
+static UINT_32 Pass_cnt  = 100;		// -p
+static UINT_32 Max_conc  = 25;		// -c
+static UINT_32 Min_size  = 4;		// -m
+static UINT_32 Max_size  = 100*1024;	// -M
 
-#define DEFAULT_MAX_CONCURRENT 25
-#define DEFAULT_PASS_COUNT 100
+/* globals, initialized after option parsing */
+static DIR     *Data_dir;	// so I can fsync(dirfd(dp));
+static FILE    *Log_fp;
+static UINT_32  Data_buffer[DATA_BUF_LEN];
 
 #define MIN(x,y) (((x) < (y)) ? (x) : (y))
+#define MAX(x,y) (((x) > (y)) ? (x) : (y))
 
-/* The range of file sizes in bytes we will write
- */
-static UINT_32 Min_io_sz = 4;
-static UINT_32 Max_io_sz = 102400;
+void Logf(char *fmt, ...)
+{
+	va_list ap;
 
-/* Checkpoint file descriptor and pointer 
- */
-static int Ckpt_fd = 0;
-static FILE *Ckpt_fp = NULL;
+	if (Log_fp) {
+		va_start(ap, fmt);
+		vfprintf(Log_fp, fmt, ap);
+		va_end(ap);
+		if ( fflush(Log_fp) || fsync(fileno(Log_fp)) ) {
+			fprintf(stderr, "Logf error: ");
+			perror(0);
+		}
+	}
+	va_start(ap, fmt);
+	vfprintf(stderr, fmt, ap);
+	va_end(ap);
+}
 
-static char *Mem_scratch_p = NULL;
-static char *Mem_scratch2_p = NULL;
-static UINT_32 *Data_p = NULL;
+#define PERROR(fmt , args...) do {		\
+	Logf( fmt ": %s (%d)\n" , ##args ,	\
+	      strerror(errno), errno );		\
+} while (0)
 
-#define FLAG_NO_VERIFY 0x00000001
-#define FLAG_VERIFY_ONLY 0x00000002
-static UINT_32 Flags = 0;
+/*
+ * in child process
+ ******************************************************/
 
-/* Directory required to hold checkpoint files.  This dir should be 
- * on a disk w/write cache and write barrier disabled.
- */
-static char *Checkpoint_path = NULL;
-/* Directory required to hold data files.  This dir should be 
- * on a disk w/write cache and write barrier enabled.
- */
-static char *Data_path = NULL;
-/* Failed verifies (data filenames that don't match the checkpoint file)
- * are logged here.
- */
-static char *Verify_fail_logf = NULL;
+void fill(const UINT_32 word, const size_t num_bytes)
+{
+	UINT_32 i;
+	UINT_32 *p = Data_buffer;
+	for (i = 0; i < num_bytes; i += 4) *p++ = word;
+}
 
-
-UINT_32 initial_setup_child(const pid_t pid)
+int write_file(const pid_t pid, size_t size)
 {
-	char *mem_p = NULL;
-	time_t seed;
+	static char    Fname_curr[FNAME_LEN] = "";
+	static char    Fname_done[FNAME_LEN] = "";
+	int fd;
+	ssize_t c = 0;
+	UINT_32 rnum = (UINT_32) random();
+	size_t wsize = size;
 
-	Mem_scratch_p = (char *) malloc((size_t) MEM_SCRATCH_LEN);
-	if (NULL == Mem_scratch_p) {
-		fprintf(stderr, "can't alloc mem scratch\n");
-		return 1;
-	}
-	mem_p = Mem_scratch_p;
+	snprintf(Fname_done, FNAME_LEN, "%u-%u-%08X", pid, size, rnum);
+	snprintf(Fname_curr, FNAME_LEN, "%s+", Fname_done);
 
-	sprintf(mem_p, "%s/%08u.chk", Checkpoint_path, pid);
+	fill(rnum, MIN(size, DATA_BUF_LEN));
 
-	Ckpt_fd =
-	    open(mem_p, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR);
-	if (-1 == Ckpt_fd) {
-		fprintf(stderr,
-			"can't open fd for checkpoint file errno=%i\n",
-			errno);
-		return 1;
+	fd = open(Fname_curr, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR);
+	if (-1 == fd) {
+		PERROR("open(%s,WRITE)", Fname_curr);
+		return -1;
 	}
 
-	Ckpt_fp = fdopen(Ckpt_fd, "w");
-	if (NULL == Ckpt_fp) {
-		fprintf(stderr,
-			"can't open fp for checkpoint file errno=%i\n",
-			errno);
-		return 1;
+	while (DATA_BUF_LEN < wsize) {
+		c = write(fd, (const void *) Data_buffer, DATA_BUF_LEN);
+		if (c != DATA_BUF_LEN) {
+			PERROR("D write(%s) (wrote %i/%u)",
+				Fname_curr, c, DATA_BUF_LEN);
+			wsize = 0;
+			c = -1;
+			break;
+		}
+		wsize -= DATA_BUF_LEN;
 	}
 
-
-	Data_p = (UINT_32 *) malloc((size_t) DATA_BUF_LEN);
-	if (NULL == Data_p) {
-		fprintf(stderr, "can't alloc data mem\n");
-		return 1;
+	if (wsize) {
+		c = write(fd, (const void *) Data_buffer, wsize);
+		if (c != wsize) {
+			PERROR("D write(%s) (wrote %i/%u)",
+				Fname_curr, c, wsize);
+			c = -1;
+		}
 	}
 
-	seed = time(NULL);
-	assert(-1 != seed);
-	seed ^= pid;
-	srandom((UINT_32) seed);
-
-	return 0;
-}
-
-void cleanup_mem(void)
-{
-	if (NULL != Mem_scratch_p) {
-		free((void *) Mem_scratch_p);
-		Mem_scratch_p = NULL;
+	if (0 != fsync(fd)) {
+		PERROR("D fsync(%s)", Fname_curr);
 	}
-	if (NULL != Mem_scratch2_p) {
-		free((void *) Mem_scratch2_p);
-		Mem_scratch2_p = NULL;
-	}
-	if (NULL != Data_p) {
-		free((void *) Data_p);
-		Data_p = NULL;
-	}
-}
 
-UINT_32 initial_setup_parent(void)
-{
-	Mem_scratch_p = (char *) malloc((size_t) MEM_SCRATCH_LEN);
-	if (NULL == Mem_scratch_p) {
-		fprintf(stderr, "can't alloc mem scratch\n");
-		return 1;
-	}
+	fd = close(fd);
+	assert(-1 != fd);
 
-	Mem_scratch2_p = (char *) malloc((size_t) MEM_SCRATCH_LEN);
-	if (NULL == Mem_scratch2_p) {
-		fprintf(stderr, "can't alloc mem scratch2\n");
-		return 1;
+	if (-1 != c) {
+		if (rename(Fname_curr, Fname_done)) {
+			PERROR("D rename(%s)", Fname_curr);
+		};
 	}
+	fsync(dirfd(Data_dir));
 
-	Data_p = (UINT_32 *) malloc((size_t) DATA_BUF_LEN);
-	if (NULL == Data_p) {
-		fprintf(stderr, "can't alloc data mem\n");
-		return 1;
-	}
-
 	return 0;
 }
 
-int wait_for_kids(UINT_32 * kids, int single)
+void run_ascending(const pid_t pid)
 {
-	int err = 0;
-
-	while (*kids) {
-		pid_t reaped_pid;
-		int status;
-
-		reaped_pid = wait(&status);
-
-		if (WIFEXITED(status)) {
-			(*kids)--;
-			if (WEXITSTATUS(status)) {
-				fprintf(stderr,
-					"child %u exited with status %u (%u remain)\n",
-					reaped_pid, WEXITSTATUS(status),
-					*kids);
-				err = 1; // DO error here, we do NOT want to keep going
-			}
-			if (WAIT_ON_SINGLE_CHILD == single) {
-				break;
-			}
-		} else if (WIFSIGNALED(status)) {
-			(*kids)--;
-			fprintf(stderr,
-				"child %u exited with status %u (%u remain)\n",
-				reaped_pid, WTERMSIG(status),
-				*kids);
-			err = 1; // DO error here, we do NOT want to keep going
-			if (WAIT_ON_SINGLE_CHILD == single) {
-				break;
-			}
-		} else if (0 > reaped_pid) {
-			fprintf(stderr,
-				"wait exited with error; quitting\n");
-			err = 1;
+	UINT_32 size;
+	for (size = Min_size; size <= Max_size; size <<= 1) {
+		// size &= ~3;
+		if (write_file(pid, size)) {
+			Logf("ending ascending run\n");
 			break;
 		}
 	}
-
-	return err;
 }
 
-/* Function responsible for recording the contents and size of each file
- * written to the WSC/WB enabled FS
- */
-UINT_32 record_file(const size_t size, const UINT_32 rNum)
+void run_descending(const pid_t pid)
 {
-
-	fprintf(Ckpt_fp, "%08u:%08X\n", size, rNum);
-
-	if (0 != fflush(Ckpt_fp)) {
-		fprintf(stderr, "fflush error: errno=%i\n", errno);
-	}
-
-	if (0 != fsync(Ckpt_fd)) {
-		fprintf(stderr, "fsync error: errno=%i\n", errno);
-	}
-
-	return 0;
-}
-
-void log_verify_failure(char *file_path_p)
-{
-	static FILE *fp = NULL;
-	int err;
-
-	if (NULL == file_path_p) {
-		if (NULL != fp) {
-			err = fclose(fp);
-			assert(EOF != err);
-			fp = NULL;
+	UINT_32 size;
+	for (size = Max_size; size >= Min_size; size >>= 1) {
+		size &= ~3;
+		if (write_file(pid, size)) {
+			Logf("ending descending run\n");
+			break;
 		}
-	} else {
-		if (NULL == fp) {
-			/* not open yet, open it */
-			if (Verify_fail_logf) {
-				fp = fopen(Verify_fail_logf, "a");
-			} else {
-				int fd = 0;
-				fd = mkstemp("/tmp/wbtest-vLog-XXXXXXX");
-				assert(-1 != fd);
-				fp = fdopen(fd, "a");
-			}
-			assert(NULL != fp);
-		}
-		fprintf(fp, "%s\n", file_path_p);
 	}
 }
 
-void fill(const UINT_32 * loc_p, const UINT_32 word,
-	  const size_t num_bytes)
-{
-	UINT_32 ctr;
-	UINT_32 *d_p = (UINT_32 *) loc_p;
+/*
+ * below only called from master process
+ ******************************************************/
 
-	for (ctr = 0; ctr < num_bytes; ctr += 4) {
-		*d_p++ = word;
-	}
-
-	return;
+void usage(char *prog)
+{
+	printf("Usage: %s [-hvV] [-m <min>] [-M <max>] [-p <passes>]\n"
+	       "\t\t[-r <recycle>] [-c <concurrent>] [-l <vLog>] -d <datadir>\n"
+	       "\n%s - Version %s Options:\n"
+	       "  -h prints this usage text\n"
+	       "  -v SKIP verification step of existing files (if any)\n"
+	       "  -V forces exit after verification step of existing files\n"
+	       "  min         minimum IO size to use (bytes)\n"
+	       "  max         maximum IO size to use (bytes)\n"
+	       "  passes      # of passes to run (0 for INF)\n"
+	       "  recycle     # of passes after which recycling of disk space starts\n"
+	       "  concurrent  # of processes to run at once\n"
+	       "  vLog        log (append) all file verify failures here.\n"
+	       "              otherwise /tmp/wbtest-vLog-<timestamp>-<pid> will be used\n"
+	       "  datadir     required; writable directory on DRBD to store 'test data' files\n",
+	       prog, prog, WBTEST_VERSION);
 }
 
-int write_file(const pid_t pid, size_t size)
+void parse_options(int argc, char *argv[])
 {
-	int fd;
-	ssize_t wrote;
-	char *mem_p = Mem_scratch_p;
-	UINT_32 *d_p = Data_p;
-	UINT_32 rnum = (UINT_32) random();
-	size_t size_sv = size;
-	DIR *dp;
+	int c;	/* getopt() returns int; a plain char may never compare equal to -1 */
 
-	dp = opendir(Checkpoint_path);
-	if (NULL == dp) {
-		fprintf(stderr, "opendir failed\n");
-		return 1;
-	}
+	while ((c = getopt(argc, argv, "c:hl:m:M:p:d:r:vV")) != -1) {
+		UINT_32 scr;
 
-	sprintf(mem_p, "%s/%u-%u-%08X", Data_path, pid, size, rnum);
-
-	fill(d_p, rnum, MIN(size, DATA_BUF_LEN));
-
-	fd = open(mem_p, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR);
-	if (-1 == fd) {
-		fprintf(stderr, "can't open fd for data file errno=%i\n",
-			errno);
-		return 1;
-	}
-
-	while (DATA_BUF_LEN < size) {
-		wrote = write(fd, (const void *) d_p, DATA_BUF_LEN);
-		if (wrote != DATA_BUF_LEN) {
-			fprintf(stderr,
-				"D write error (wrote %i/%u): errno=%i\n",
-				wrote, DATA_BUF_LEN, errno);
+		switch (c) {
+		case 'd':
+			/* "test" path -- dir on DRBD to store test data files
+			 */
+			snprintf(Data_path, FNAME_LEN, "%s", optarg);
+			assert(strlen(Data_path) == strlen(optarg));
+			break;
+		case 'l':
+			/* where to store the verify "log" showing found problems
+			 */
+			snprintf(Log_fname, FNAME_LEN, "%s", optarg);
+			assert(strlen(Log_fname) == strlen(optarg));
+			break;
+		case 'm':
+			scr = ((UINT_32) atoi(optarg)) & ~3;
+			Min_size = scr;
+			break;
+		case 'M':
+			scr = ((UINT_32) atoi(optarg)) & ~3;
+			Max_size = scr;
+			break;
+		case 'p':
+			scr = ((UINT_32) atoi(optarg));
+			Pass_cnt = scr;
+			break;
+		case 'c':
+			scr = ((UINT_32) atoi(optarg));
+			Max_conc = scr;
+			break;
+		case 'r':
+			scr = ((UINT_32) atoi(optarg));
+			Recycle = scr;
+			break;
+		case 'v':
+			Dont_verify = 1;
+			break;
+		case 'V':
+			Verify_only = 1;
+			break;
+		case 'h':
+		case '?':
+		default:
+			usage(argv[0]);
+			exit(1);
 		}
-		size -= DATA_BUF_LEN;
 	}
-
-	if (size) {
-		wrote = write(fd, (const void *) d_p, size);
-		if (wrote != size) {
-			fprintf(stderr,
-				"D write error (wrote %i/%u): errno=%i\n",
-				wrote, size, errno);
-		}
+	if (Min_size < 4)         Min_size = 4;
+	if (Max_size > 1024*1024) Max_size = 1024*1024;
+	if (Verify_only) Dont_verify = 0;
+	if (!Data_path[0]) {
+		fprintf(stderr,
+			"Missing -d Data_Path argument, required\n");
+		usage(argv[0]);
+		exit(1);
 	}
+	if (Log_fname[0]) {
+		Log_fp = fopen(Log_fname, "a");
+	} else {
+		int fd = 0;
+		time_t t = time(NULL);
 
-	if (0 != fsync(fd)) {
-		fprintf(stderr, "D fsync error: errno=%i\n", errno);
+		snprintf(Log_fname, FNAME_LEN,
+			"/tmp/wbtest-vLog-%u-%u",
+			(unsigned int)t, getpid());
+		fd = open(Log_fname, O_WRONLY | O_CREAT | O_EXCL | O_APPEND,
+		          S_IRUSR | S_IWUSR);
+		assert(-1 != fd); // if this was a real program, retry!
+		Log_fp = fdopen(fd, "a");
 	}
-
-	fd = close(fd);
-	assert(-1 != fd);
-
-	fsync(dirfd(dp));
-	closedir(dp);
-
-	record_file(size_sv, rnum);
-
-	return 0;
+	assert(NULL != Log_fp);
+	if (chdir(Data_path)) {
+		PERROR("chdir(%s)", Data_path);
+		exit(1);
+	}
+	{
+	char *p;
+	p = getcwd(Data_path,sizeof(Data_path));
+	assert(p == Data_path);
+	}
+	Data_dir = opendir(".");
+	if (NULL == Data_dir) {
+		PERROR("opendir(%s)", Data_path);
+		exit(1);
+	}
 }
 
-int verify_data(const UINT_32 * loc_p,
-		const size_t num_bytes, const UINT_32 word)
+int wait_for_kid(pid_t kid)
 {
-	UINT_32 ctr;
-	UINT_32 *d_p = (UINT_32 *) loc_p;
-	int mismatches = 0;
+	int err = 0;
 
-	for (ctr = 0; ctr < num_bytes; ctr += 4) {
-		if (word != *d_p++) {
-			mismatches++;
+	pid_t reaped_pid;
+	int status;
+
+	do {
+		reaped_pid = waitpid(kid,&status,0);
+	} while (reaped_pid == -1 && errno == EINTR);
+
+	if (WIFEXITED(status)) {
+		if (WEXITSTATUS(status)) {
+			Logf("child %u exited with status %u\n",
+				reaped_pid, WEXITSTATUS(status) );
+			err = 1; // DO error here, we do NOT want to keep going
 		}
+	} else if (WIFSIGNALED(status)) {
+		Logf("child %u killed by signal %u\n",
+			reaped_pid, WTERMSIG(status) );
+		err = 1; // DO error here, we do NOT want to keep going
+	} else if (0 > reaped_pid) {
+		Logf("wait exited with error; quitting\n");
+		err = 1;
 	}
-
-	return mismatches;
+	return err;
 }
 
-int read_file(pid_t pid, size_t size, UINT_32 rnum)
+pid_t spawn(int pass)
 {
-	char *mem_p = Mem_scratch2_p;
-	UINT_32 *d_p = Data_p;
-	int fd;
-	ssize_t reads;
-	int mismatches = 0;
-	int err = 0;
+	time_t seed;
+	pid_t pid;
 
-	sprintf(mem_p, "%s/%u-%u-%08X", Data_path, pid, size, rnum);
-	fd = open(mem_p, O_RDONLY);
-	if (-1 == fd) {
-		fprintf(stderr,
-			"can't open fd %s for read data: errno=%i\n",
-			mem_p, errno);
-		return 1;
-	}
+	fflush(0);
+	pid = fork();
+	assert(-1 != pid);
 
-	while (DATA_BUF_LEN < size) {
-		reads = read(fd, (void *) d_p, DATA_BUF_LEN);
-		assert(DATA_BUF_LEN == reads);
+	if (pid) return pid;
 
-		mismatches += verify_data(d_p, DATA_BUF_LEN, rnum);
+	// in child
+	pid = getpid();
 
-		size -= DATA_BUF_LEN;
-	}
+	seed = time(NULL);
+	assert(-1 != seed);
+	seed ^= pid;
+	srandom((UINT_32) seed);
 
-	if (size) {
-		reads = read(fd, (void *) d_p, size);
-		assert(size == reads);
-
-		mismatches += verify_data(d_p, size, rnum);
+	if (pass & 1) {
+		run_ascending(pid);
+	} else {
+		run_descending(pid);
 	}
+	exit(0); // child exit
+}
 
-	fd = close(fd);
-	assert(-1 != fd);
+void strange_fname(const char *name)
+{
+	Logf("%s: strange filename\n", name);
+}
 
-	if (0 < mismatches) {
-		printf("FAILED verify of %s: %i word mismatches\n", mem_p,
-		       mismatches);
-		err = 1;
-		log_verify_failure(mem_p);
+void remove_unfinished(const char *name)
+{
+	if (unlink(name)) {
+		PERROR("unlink(%s)", name);
 	} else {
-		fd = unlink(mem_p);
-		assert(-1 != fd);
+		Logf("%s: unfinished, removed.\n", name);
 	}
-
-	return err;
 }
 
-int parse_chkfile(char *filepath_p, pid_t pid)
+int verify_fname(const char *name, UINT_32 size, UINT_32 rnum)
 {
-	UINT_32 count_ttl = 0, count_err = 0;
-	int err = 0;
-	char buf[32];
-	FILE *fp = fopen(filepath_p, "r");
-	int j;
-
-	if (NULL == fp) {
-		fprintf(stderr,
-			"fopen ret err in parse_chkfile, errno=%i\n",
-			errno);
-		return 1;
+	UINT_32 rsize, errors;
+	int fd, i, c;
+	fd = open(name, O_RDONLY);
+	if (-1 == fd) {
+		PERROR("open(%s,READ)", name);
+		return -1;
 	}
-
-	while (fgets(buf, 32, fp) != NULL) {
-		char *tok_p;
-		size_t nl_pos = strlen(buf) - 1;
-		size_t iosz;
-		UINT_32 rnum;
-
-		// chomp:
-		if (buf[nl_pos] == '\n') {
-			buf[nl_pos] = '\0';
+	rsize = 0;
+	errors = 0;
+	do {
+		c = read(fd, (void *) Data_buffer, DATA_BUF_LEN);
+		if (c < 0) {
+			PERROR("read(%s)", name);
+			fd = close(fd);
+			assert(-1 != fd);
+			return -1;
 		}
-
-		iosz = strtoul(buf, &tok_p, 10);
-		assert(NULL != tok_p);
-
-		/* Advance past the colon */
-		/* assert(':' == *tok_p); */
-		if (':' != *tok_p) {
-			/* OK, I've seen this case where the checkfile will end with 
-			 * a bunch of NUL bytes and the line read in will not have
-			 * the correct format (will be all 0's).  This seems like a
-			 * FS quirk so in the interest of assuring all the data is OK 
-			 * on the write barrier/write cache partition I will log the 
-			 * quirk and move on.
-			 */
-			log_verify_failure(filepath_p);
-			continue;
+		for (i = 0; i < c/sizeof(UINT_32); i++) {
+			if (Data_buffer[i] != rnum) {
+				++errors;
+			}
 		}
-		++tok_p;
-
-		rnum = (UINT_32) strtoul(tok_p, NULL, 16);
-
-		++count_ttl;
-
-		if (read_file(pid, iosz, rnum)) {
-			err = 1;
-			++count_err;
+		rsize += c;
+	} while(c > 0);
+	fd = close(fd);
+	assert(-1 != fd);
+	if (errors == 0 && rsize == size) {
+		if (unlink(name)) {
+			PERROR("unlink(%s)", name);
+			return -1;
 		}
+		return 0;
 	}
-
-	printf("Processed checkfile %s: %u/%u passed\n",
-	       filepath_p, (count_ttl - count_err), count_ttl);
-
-	j = fclose(fp);
-	assert(EOF != j);
-
-	j = unlink(filepath_p);
-	assert(-1 != j);
-
-	return err;
+	if (errors)
+		Logf("%s: %u word errors\n", name, errors);
+	if (rsize != size)
+		Logf("%s: %u byte read, but %u expected\n",
+			name, rsize, size);
+	return -1;
 }
 
-int parse_dir(void)
+int do_verify(pid_t glob_pid)
 {
-	int err = 0;
-	char *tmp_p = Mem_scratch_p;
-	DIR *dp;
+	static char glob[FNAME_LEN];
+	int verified       = 0;
+	int verify_failure = 0;
 	struct dirent *dir_p;
-	pid_t pid;
+	char unfinished, tmp;
 
-	dp = opendir(Checkpoint_path);
-	if (NULL == dp) {
-		fprintf(stderr, "opendir failed\n");
-		return 1;
+	if (glob_pid) {
+		snprintf(glob,FNAME_LEN, "%u-%%u-%%08x%%c%%c", glob_pid);
+	} else {
+		snprintf(glob,FNAME_LEN, "%s", "%*[0-9]-%u-%08x%c%c");
 	}
 
-	while ((dir_p = readdir(dp)) != NULL) {
+	rewinddir(Data_dir);
+
+	while ((dir_p = readdir(Data_dir)) != NULL) {
+		int c, size, rnum;
 		if ((strcmp(dir_p->d_name, ".") == 0) ||
 		    (strcmp(dir_p->d_name, "..") == 0)) {
 			continue;
 		}
 
-		pid = strtoul(dir_p->d_name, NULL, 10);
-
-		sprintf(tmp_p, "%s/%s", Checkpoint_path, dir_p->d_name);
-
-		if (parse_chkfile(tmp_p, pid)) {
-			err = 1;
+		unfinished = tmp = '\0';
+		c = sscanf(dir_p->d_name, glob,
+				&size, &rnum, &unfinished, &tmp);
+		if (c == 2) {
+			++verified;
+			if (verify_fname(dir_p->d_name, size, rnum)) {
+				++verify_failure;
+			} else {
+				putchar('.');
+			}
+			if (verified % 50 == 0) {
+				printf("\t(%u/%u)\n",
+					verified - verify_failure,
+					verified);
+			}
+		} else if (c == 3) {
+			if (unfinished != '+') {
+				strange_fname(dir_p->d_name);
+				continue;
+			}
+			remove_unfinished(dir_p->d_name);
+			continue;
+		} else {
+			if (glob_pid) continue;
+			strange_fname(dir_p->d_name);
+			continue;
 		}
-		fsync(dirfd(dp));
 	}
-
-	return err;
+	printf("\t(%u/%u)\n",
+		verified - verify_failure,
+		verified);
+	if (glob_pid) {
+		/*
+		fprintf(stdout, "verify \"%u-*-*\": (%u/%u) passed\n",
+				glob_pid,
+				verified - verify_failure,
+				verified);
+		*/
+	} else {
+		Logf("verify: (%u/%u) passed\n",
+			verified - verify_failure,
+			verified);
+	}
+	return verify_failure;
 }
 
-/* This function starts with a small file size and increases
- */
-void run_ascending(const pid_t pid)
-{
-	UINT_32 size;
+struct array_s {
+	UINT_32 first, last, count, number;
+	UINT_32 v[0];
+};
 
-	for (size = Min_io_sz; size < Max_io_sz; size *= 2) {
-		size &= ~3;
-		if (write_file(pid, size)) {
-			fprintf(stderr, "ending ascending run\n");
-			break;
-		}
-	}
+struct array_s * array_new(UINT_32 number)
+{
+	UINT_32 bytes = sizeof(struct array_s)
+		      + sizeof(UINT_32)*number;
+	struct array_s *a = malloc(bytes);
+	if (!a) return NULL;
+	memset(a,0,bytes);
+	a->number = number;
+	return a;
 }
 
-/* This function starts with a large file size and decreases
- */
-void run_descending(const pid_t pid)
+void array_destroy(struct array_s *a)
 {
-	size_t size;
+	free(a);
+}
 
-	for (size = Max_io_sz; size > Min_io_sz; size /= 2) {
-		size &= ~3;
-		if (write_file(pid, size)) {
-			fprintf(stderr, "ending descending run\n");
-			break;
-		}
+void array_push(struct array_s *a, const UINT_32 v)
+{
+	assert(a->count < a->number);
+	if (a->count > 0) {
+		if (++a->last == a->number)
+			a->last = 0;
 	}
+	a->v[a->last] = v;
+	++a->count;
+	/*
+	fprintf(stderr, "=p v(%u:%u)[%u]: %u\n",
+		a->first, a->last, a->count, v);
+	*/
 }
 
-void usage(char *prog)
+UINT_32 array_shift(struct array_s *a)
 {
-	printf("Usage: %s [-hvVs] [-m <min>] [-M <max>] [-p <passes>]\n"
-	       "[-c <concurrent>] [-l <vLog>] -s <safedir> -t <testdir>\n"
-	       "\n%s - Version %s Options:\n"
-	       "\t-h prints this usage text\n"
-	       "\t-v forces NO verification step of existing files (if any)\n"
-	       "\t-V forces exit after verification step of existing files\n"
-	       "\t<min> == minimum IO size to use (bytes)\n"
-	       "\t<max> == maximum IO size to use (bytes)\n"
-	       "\t<passes> == # of passes to run (0 for INF)\n"
-	       "\t<concurrent> == # of processes to run at once\n"
-	       "\t<vLog> == log all file verify failures here, otherwise mkstemp\n"
-	       "\t<safedir> == writable DIRECTORY to store 'checkpoint' files\n"
-	       "\t<testdir> == writable DIRECTORY to store 'test data' files\n"
-	       "\n"
-	       "\tDifference between safedir and testdir is that safedir should \n"
-	       "\tbe 'safe' storage meaning that it is not using drive write \n"
-	       "\tcache, whereas testdir is intended to be using drive write \n"
-	       "\tcache and the write barrier. Naturally, the two should be on \n"
-	       "\tseparate drives\n", prog, prog, WBTEST_VERSION);
+	UINT_32 v;
+	assert(a->count > 0);
+	v = a->v[a->first];
+	if (++a->first == a->number)
+		a->first = 0;
+	--a->count;
+	/*
+	fprintf(stderr, "=s v(%u:%u)[%u]: %u\n",
+		a->first, a->last, a->count, v);
+	*/
+	return v;
 }
 
-void exit_safe(int ret_code)
+UINT_32 array_idx(struct array_s *a, UINT_32 i)
 {
-	cleanup_mem();
-	exit(ret_code);
+	UINT_32 v;
+	assert(a->count > i);
+	v = a->v[(a->first + i) % a->number];
+	/*
+	fprintf(stderr, "=i v(%u:%u)[%u:%u]: %u\n",
+		a->first, a->last, a->count, i, v);
+	*/
+	return v;
 }
 
 int main(int argc, char *argv[])
 {
-	int c;
-	pid_t pid = 1;
-	UINT_32 pass, pass_cnt = DEFAULT_PASS_COUNT;
-	UINT_32 max_conc = DEFAULT_MAX_CONCURRENT;
-	UINT_32 kids = 0;
 
-	while ((c = getopt(argc, argv, "c:hl:m:M:p:s:t:vV")) != -1) {
-		UINT_32 scr;
+	struct statfs s;
+	struct array_s *Kids;
+	UINT_32 blocks_per_pass, tmp, pass, kids = 0;
 
-		switch (c) {
-		case 's':
-			/* "safe" path -- non-cached dir to store checkpoint files
-			 */
-			Checkpoint_path = optarg;
-			break;
-		case 't':
-			/* "test" path -- cached/barrier protected dir to store test data
-			 * files
-			 */
-			Data_path = optarg;
-			break;
-		case 'l':
-			/* where to store the verify "log" showing found problems
-			 */
-			Verify_fail_logf = optarg;
-			break;
-		case 'm':
-			scr = ((UINT_32) atoi(optarg)) & ~3;
-			Min_io_sz = scr;
-			break;
-		case 'M':
-			scr = ((UINT_32) atoi(optarg)) & ~3;
-			Max_io_sz = scr;
-			break;
-		case 'p':
-			scr = ((UINT_32) atoi(optarg));
-			pass_cnt = scr;
-			break;
-		case 'c':
-			scr = ((UINT_32) atoi(optarg));
-			max_conc = scr;
-			break;
-		case 'v':
-			Flags |= FLAG_NO_VERIFY;
-			break;
-		case 'V':
-			Flags |= FLAG_VERIFY_ONLY;
-			break;
-		case 'h':
-		case '?':
-		default:
-			usage(argv[0]);
-			exit_safe(1);
-		}
-	}
+	parse_options(argc,argv);
 
-	if (!Checkpoint_path || !Data_path) {
-		fprintf(stderr,
-			"Missing -s or -t arguments, both required\n");
-		usage(argv[0]);
-		exit_safe(1);
-	}
-	// first, verify what's there using the checkpoint files
-	if (!(Flags & FLAG_NO_VERIFY)) {
-		printf("Beginning verify stage\n");
-		if (initial_setup_parent()) {
-			exit_safe(1);
-		}
-		if (parse_dir()) {
+	if (!Dont_verify) {
+		printf( "Data_path: %s\nLogfile:   %s\n"
+			"Beginning global verify stage.\n",
+			Data_path, Log_fname );
+		if (do_verify(0)) {
 			printf("Verify failed; program exiting.\n");
-			exit_safe(1);
-		} else {
-			printf
-			    ("Verify completed successfully; program continuing.\n");
+			exit(1);
 		}
-
-		log_verify_failure(NULL);	/* close out the failure log if opened */
-		cleanup_mem();	/* free used mem from parent */
-
-		if (Flags & FLAG_VERIFY_ONLY) {
-			printf("Performing verify step ONLY as desired\n");
-			exit_safe(0);
+		if (Verify_only) {
+			printf("Verify completed successfully.\n");
+			printf("Nothing else to do (-V).\n");
+			fclose(Log_fp);
+			exit(0);
 		}
+		printf("Verify completed successfully; program continuing.\n");
 	} else {
-		printf("skipping verify step as desired\n");
+		printf("Skip initial verify step (-v).\n");
 	}
 
-	printf
-	    ("Using I/O Min: %u, Max: %u, %u passes with %u procs running\n",
-	     Min_io_sz, Max_io_sz, pass_cnt, max_conc);
+	if (fstatfs(dirfd(Data_dir),&s)) {
+		perror("fstatfs");
+		return 1;
+	}
 
-	// now launch new I/O
-	for (pass = 0; (pass < pass_cnt) || (0 == pass_cnt); pass++) {
-		fflush(0);
-		pid = fork();
-		assert(-1 != pid);
+	/* estimate recycle count */
+	blocks_per_pass = ( ((Max_size<<1) -1) + (Min_size-1) ) / s.f_bsize;
+	tmp = (s.f_bavail>>1) / blocks_per_pass +1;
+	if (tmp > MAX_RECYCLE) tmp = MAX_RECYCLE;
+	if (Recycle > tmp) Recycle = tmp;
+	Recycle = MAX(Recycle,Max_conc+1);
 
-		if (0 == pid) {
-			// child
-			pid_t new_pid = getpid();
+	Logf("Using I/O Min: %u, Max: %u, %u passes, "
+	     "with %u procs running\n"
+	     "Recycling starts with the %u. pass\n"
+	     "Data_path: %s\n"
+	     "Logfile:   %s\n",
+	     Min_size, Max_size, Pass_cnt, Max_conc,
+	     Recycle, Data_path, Log_fname);
 
-			if (0 != initial_setup_child(new_pid)) {
-				exit_safe(1);
-			}
+	Kids = array_new(Recycle+10);
+	if (!Kids) {
+		Logf("array_new(%u): out of memory\n", Recycle+10);
+		exit(1);
+	}
 
-			if (new_pid % 2) {
-				run_ascending(new_pid);
-			} else {
-				run_descending(new_pid);
-			}
-			exit_safe(0); // child exit
-		} else {
-			if (++kids < max_conc) {
-				continue;
-			} else {
-				if (wait_for_kids
-				    (&kids, WAIT_ON_SINGLE_CHILD)) {
-					exit_safe(1);
-				}
-			}
+	// now launch new I/O
+	for (pass = 0; (pass < Pass_cnt) || (0 == Pass_cnt); pass++) {
+		if (Kids->count >= Recycle) {
+			do_verify(array_shift(Kids));
 		}
+		array_push(Kids, spawn(pass) );
+		if (++kids >= Max_conc) {
+			if ( wait_for_kid(array_idx(Kids,Kids->count - kids)) )
+				exit(1);
+			--kids;
+		}
+	}
 
-	}			// end pass loop
-
 	/* we have finished the number of passes we wanted to originate.  Wait
 	 * for the rest of the kids before leaving
 	 */
-	if ((pid > 0) && wait_for_kids(&kids, WAIT_ON_ALL_CHILDREN)) {
-		exit_safe(1);
+	while (kids--) {
+		if (wait_for_kid(-1)) exit(1);
 	}
 
-	exit_safe(0);
-	return 0;
+	fclose(Log_fp);
+	closedir(Data_dir);
+	array_destroy(Kids);
+
+	exit(0);
 }
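
Note on the Kids ring buffer used above: array_push() only advances
a->last when the buffer is non-empty. If array_shift() drains the buffer
and a value is then pushed, the value lands at the old a->last slot while
a->first has already moved on. main() does not appear to reach that state
as long as Recycle is at least 2, since it shifts only once Kids->count has
reached Recycle. The following is a minimal sketch, not the committed code,
of the same FIFO with the write slot derived from first and count, which
sidesteps the corner case. The names struct ring and ring_*() are
hypothetical; the real struct array_s definition is not shown in this hunk.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct array_s (fields named as in the diff). */
struct ring {
	uint32_t number;	/* capacity in elements */
	uint32_t count;		/* elements currently stored */
	uint32_t first;		/* index of the oldest element */
	uint32_t v[];		/* storage, 'number' slots */
};

/* Allocate a zeroed ring with room for 'number' values. */
struct ring *ring_new(uint32_t number)
{
	struct ring *r = calloc(1, sizeof(*r) + sizeof(uint32_t) * number);
	if (!r) return NULL;
	r->number = number;
	return r;
}

/* Append at the tail; the slot is computed, so no 'last' can go stale. */
void ring_push(struct ring *r, uint32_t v)
{
	assert(r->count < r->number);
	r->v[(r->first + r->count) % r->number] = v;
	++r->count;
}

/* Remove and return the oldest value. */
uint32_t ring_shift(struct ring *r)
{
	uint32_t v;
	assert(r->count > 0);
	v = r->v[r->first];
	r->first = (r->first + 1) % r->number;
	--r->count;
	return v;
}

/* Return the i-th oldest value without removing it. */
uint32_t ring_idx(const struct ring *r, uint32_t i)
{
	assert(i < r->count);
	return r->v[(r->first + i) % r->number];
}
```

In main() this would be used the same way as Kids: push the value returned
by spawn(), shift the oldest one before verifying it, and index relative to
the oldest entry when waiting for a particular child.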