[DRBD-cvs] r1447 - trunk/testing/CTH/wbtest
drbd-user@lists.linbit.com
Wed, 21 Jul 2004 14:16:54 +0200 (CEST)
Author: lars
Date: 2004-07-21 14:16:53 +0200 (Wed, 21 Jul 2004)
New Revision: 1447
Removed:
trunk/testing/CTH/wbtest/run-wbtest.sh
Modified:
trunk/testing/CTH/wbtest/README
trunk/testing/CTH/wbtest/wbtest.c
Log:
rewrote wbtest to match our needs better
Modified: trunk/testing/CTH/wbtest/README
===================================================================
--- trunk/testing/CTH/wbtest/README 2004-07-21 11:23:18 UTC (rev 1446)
+++ trunk/testing/CTH/wbtest/README 2004-07-21 12:16:53 UTC (rev 1447)
@@ -1,25 +1,23 @@
+Original wbtest is
+ * Copyright (C) 2003-2004 EMC Corporation
+ *
+ * wbtest.c - a testing utility for the write barrier file system
+ * functionality.
+ *
+ * Written by Brett Russ <russb@emc.com>
-slightly patched by lge:
-fsync on directory descriptor, fflush(0) before fork() ...
+rewritten to match my needs
+ Copyright 2004 Lars Ellenberg <l.g.e@web.de>
-DON'T use run-wbtest.sh with the CTH, that does not make any sense.
-
----
wbtest.c README
-Requirements:
+Intended Usage:
-Requires (at least) 2 drives: one test drive, with write cache (WC)
-enabled and one safe drive, with WC disabled. The run-wbtest.sh self
-installer must be modifed to point to the path to a directory on the
-test drive that will contain test data files (of size <min> to <max>;
-default 4B to 100KB). Additionally, supply a path to a directory on
-the safe drive that will contain "checkpoint" files used to verify the
-integrity of the test data files after a powerfail/reboot cycle.
-Last, supply the device names for the safe and test drives such that
-the script can force WC disabled/enabled respectively on these drives
-prior to every run of wbtest.
+You should set up a DRBD pair, mount the device on one node, and run wbtest
+on some path on that device. Then you can trigger more or less graceful
+failovers, and run wbtest on the other node.
Compile:
@@ -27,52 +25,53 @@
Summary:
-The program starts and first scans the safe dir for checkpoint files.
-This dir should be empty on first run. Finding none, it will begin
-creating processes whose job is to write data to the test dir and log
-it in checkpoint files (1 per process). Each process writes a number
-of files of size in range <min> to <max> and then exits. A max of
-<concurrent> procs are launched at once. A pass is defined as the
-cycle a process makes from <min> to <max> or <max> to <min>; odd PIDs
-acsend, even PIDs descend. A <passes> value of 0 runs to infinity.
+The program starts and first scans the data dir for files matching the
+expected naming convention. All unfinished files (ending in '+') will
+then just be unlinked, all other files will be verified to match their
+expected size and content.
+This data dir should be empty on first run. After a successful verification,
+wbtest begins to create processes whose job is to create new data files. Each
+process writes a number of files of size in range <min> to <max>
+(default 4B -- 100KB) and then exits. A max of <concurrent> procs are
+launched at once. A pass is defined as the cycle a process makes from
+<min> to <max> or <max> to <min>; odd PIDs ascend, even PIDs descend.
+A <passes> value of 0 runs to infinity. After <recycle> passes, wbtest
+starts to verify and remove the files of the oldest passes again, so that
+the file system never fills up completely.
+
The point is to cut off power to the running machine at a random point
during the testing. To do this, either set <passes> to 0 or ensure
that you set it high enough such that I/O is running when the power is
-shut off. Care must be taken to ensure that the file system serving
-the test data dir does not fill before power is cut. Also, if you are
-automating the power failures, ensure not only that the disk doesn't
-fill before the power cuts, but also that the verification process (to
-be discussed next) finishes and I/O begins before power cuts out.
+shut off.
-Since the run-wbtest.sh script installs itself at the end of rc.local,
-the wbtest will launch at the end of the next boot. Now, the safe dir
-should contain valid checkpoint files and these will be scanned and
-control the verification of the test data. If errors are found,
-testing stops and if a log file was provided the file in error is
-logged. All test data and checkpoint files that verify successfully
+On failover or reboot, the test data files are verified.
+The file naming convention controls the verification of the test data.
+These files are named <pid>-<size>-<hexdigits>, and are expected to be
+of size <size>, and to be filled with the four-byte pattern <hexdigits>.
+A file ending in an additional '+' was not yet completely written, so it
+is not expected to verify correctly. It will just be removed in the
+verification step.
+
+If errors are found, testing stops and the file in error is logged
+(see <vLog> below). All test data files that verify successfully
are removed. Any test data files with errors are preserved for
inspection.
I think that's about it...here's the usage text:
-Usage: ./wbtest [-hvVs] [-m <min>] [-M <max>] [-p <passes>]
-[-c <concurrent>] [-l <vLog>] -s <safedir> -t <testdir>
+Usage: ./wbtest [-hvV] [-m <min>] [-M <max>] [-p <passes>]
+ [-r <recycle>] [-c <concurrent>] [-l <vLog>] -d <datadir>
-./wbtest - Version 1.0 Options:
- -h prints this usage text
- -v forces NO verification step of existing files (if any)
- -V forces exit after verification step of existing files
- <min> == minimum IO size to use (bytes)
- <max> == maximum IO size to use (bytes)
- <passes> == # of passes to run (0 for INF)
- <concurrent> == # of processes to run at once
- <vLog> == log all file verify failures here, otherwise mkstemp
- <safedir> == writable DIRECTORY to store 'checkpoint' files
- <testdir> == writable DIRECTORY to store 'test data' files
-
- Difference between safedir and testdir is that safedir should
- be 'safe' storage meaning that it is not using drive write
- cache, whereas testdir is intended to be using drive write
- cache and the write barrier. Naturally, the two should be on
- separate drives
+./wbtest - Version 1.1-lge Options:
+ -h prints this usage text
+ -v SKIP verification step of existing files (if any)
+ -V forces exit after verification step of existing files
+ min minimum IO size to use (bytes)
+ max maximum IO size to use (bytes)
+ passes # of passes to run (0 for INF)
+ recycle # of passes after which recycling of disk space starts
+ concurrent # of processes to run at once
+ vLog log (append) all file verify failures here.
+ otherwise /tmp/wbtest-vLog-<timestamp>-<pid> will be used
+ datadir required; writable directory on DRBD to store 'test data' files
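
The naming convention described in the README above can be illustrated with a small, hedged sketch. The names wb_kind and classify_name are hypothetical and exist only for this illustration; wbtest.c itself does the equivalent parsing with sscanf() inside do_verify(). The sketch assumes names of the form <pid>-<size>-<hexdigits>, optionally followed by a single '+'.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names for this sketch; they do not appear in wbtest.c. */
enum wb_kind { WB_STRANGE = 0, WB_FINISHED = 1, WB_UNFINISHED = 2 };

/*
 * Classify a data file name of the form <pid>-<size>-<hexdigits>[+].
 * For WB_FINISHED and WB_UNFINISHED, *size and *pattern hold the
 * values parsed from the name.
 */
enum wb_kind classify_name(const char *name,
                           unsigned int *size, unsigned int *pattern)
{
    char suffix = '\0', extra = '\0';
    int n;

    assert(name != NULL && size != NULL && pattern != NULL);

    /*
     * %*[0-9] skips the pid without storing it. The two trailing %c
     * conversions detect a '+' suffix and any garbage following it.
     */
    n = sscanf(name, "%*[0-9]-%u-%8x%c%c", size, pattern, &suffix, &extra);
    if (n == 2)
        return WB_FINISHED;      /* verify size and content, then unlink */
    if (n == 3 && suffix == '+')
        return WB_UNFINISHED;    /* never fully synced: just unlink */
    return WB_STRANGE;           /* not one of our files */
}
```

A verifier built on this would open only the WB_FINISHED files and compare their length and content against the parsed size and pattern.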
Deleted: trunk/testing/CTH/wbtest/run-wbtest.sh
===================================================================
--- trunk/testing/CTH/wbtest/run-wbtest.sh 2004-07-21 11:23:18 UTC (rev 1446)
+++ trunk/testing/CTH/wbtest/run-wbtest.sh 2004-07-21 12:16:53 UTC (rev 1447)
@@ -1,36 +0,0 @@
-#!/bin/sh
-
-# device name for drive that holds safe directory
-drv_safe="/dev/hdX"
-drv_test="/dev/hdY"
-
-# path prefixes for safe and test directories
-safe_dir="/path/on/drv_safe/wbtest-safe"
-test_dir="/path/on/drv_test/wbtest-test"
-
-# path prefix to wbtest and run-wbtest.sh
-wbtest_path="/root"
-
-if [ ! -d $safe_dir ]; then
- mkdir -p $safe_dir
-fi
-
-if [ ! -d $test_dir ]; then
- mkdir -p $test_dir
-
-fi
-
-# make sure that the safe drive has write cache disabled
-hdparm -W0 $drv_safe
-# make sure that the test drive has write cache enabled
-hdparm -W1 $drv_test
-
-RC_LOCAL_TOUCHED=/var/wbtest-mod-rc-local
-
-if [ ! -f $RC_LOCAL_TOUCHED ]; then
- echo "${wbtest_path}/run-wbtest.sh" >> /etc/rc.d/rc.local
- > $RC_LOCAL_TOUCHED
-fi
-
-# Run it!
-${wbtest_path}/wbtest -p 0 -c 40 -s $safe_dir -t $test_dir &
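
With the checkpoint files gone, the content check reduces to comparing every 32-bit word of a file against the pattern encoded in its name. The following is a minimal sketch of that idea, not the code from wbtest.c; fill_words and count_mismatches are hypothetical helper names, and the sketch works on an in-memory buffer only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill the first num_words entries of buf with the 32-bit pattern word. */
void fill_words(uint32_t *buf, size_t num_words, uint32_t word)
{
    size_t i;

    assert(buf != NULL || num_words == 0);
    for (i = 0; i < num_words; i++)
        buf[i] = word;
}

/* Count the words in buf that differ from the expected pattern. */
size_t count_mismatches(const uint32_t *buf, size_t num_words, uint32_t word)
{
    size_t i, errors = 0;

    assert(buf != NULL || num_words == 0);
    for (i = 0; i < num_words; i++)
        if (buf[i] != word)
            errors++;
    return errors;
}
```

Because the pattern is stored in host byte order, a file written on a little-endian node would not verify on a big-endian peer without byte swapping.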
Modified: trunk/testing/CTH/wbtest/wbtest.c
===================================================================
--- trunk/testing/CTH/wbtest/wbtest.c 2004-07-21 11:23:18 UTC (rev 1446)
+++ trunk/testing/CTH/wbtest/wbtest.c 2004-07-21 12:16:53 UTC (rev 1447)
@@ -1,7 +1,7 @@
-/*
+/*
* Copyright (C) 2003-2004 EMC Corporation
*
- * wbtest.c - a testing utility for the write barrier file system
+ * wbtest.c - a testing utility for the write barrier file system
* functionality.
*
* Written by Brett Russ <russb@emc.com>
@@ -19,13 +19,38 @@
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * ----------
+ * 2004 modified by Lars Ellenberg for testing the data integrity
+ * on a "shared" storage like DRBD.
+ *
+ * - I don't need the "Checkpoint files", since any data file name
+ * already contains all necessary information about its content.
+ * So I remove all references to the Checkpoint thingy.
+ * - a data file starts as "$DATA_DIR/%pid-%size-%rnum+"
+ * and will be renamed to "$DATA_DIR/%pid-%size-%rnum"
+ * when it was fsynced to disk.
+ * - in the verify stage, we first delete all files that are not fully
+ * synced to disk ("*+"). Then, all files are verified against their
+ * expected content (according to their name), then removed.
+ * - before the disk fills up, we run additional verify stages in
+ * between, so we continuously produce I/O load, but never fill up
+ * the disk.
+ * - explicitly open log file early
+ *
+ * ToDo:
+ * - make it endian-safe, to be able to test cross-platform DRBD
+ * - handle signals more gracefully
+ *
*/
#include <stdio.h>
#include <stdlib.h>
+#include <stdarg.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
+#include <sys/vfs.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
@@ -34,661 +59,590 @@
#include <dirent.h>
#include <assert.h>
#include <time.h>
+#include <endian.h>
-#define WBTEST_VERSION "1.0"
+#define WBTEST_VERSION "1.1-lge"
typedef unsigned int UINT_32;
+/* hard limit number of passes before internal cleanup is due */
+#define MAX_RECYCLE 10000
+#define DATA_BUF_LEN 4096
+#define FNAME_LEN 128
-#define MEM_SCRATCH_LEN 128
-#define DATA_BUF_LEN 8096
-#define WAIT_ON_SINGLE_CHILD 1
-#define WAIT_ON_ALL_CHILDREN 2
+/* global parameters, can be changed on commandline, with defaults: */
+static char Data_path[FNAME_LEN] = ""; // -d
+static char Log_fname[FNAME_LEN] = ""; // -l
+static int Verify_only = 0; // -V
+static int Dont_verify = 0; // -v
+static UINT_32 Recycle = 4000; // -r
+static UINT_32 Pass_cnt = 100; // -p
+static UINT_32 Max_conc = 25; // -c
+static UINT_32 Min_size = 4; // -m
+static UINT_32 Max_size = 100*1024; // -M
-#define DEFAULT_MAX_CONCURRENT 25
-#define DEFAULT_PASS_COUNT 100
+/* globals, initialized after option parsing */
+static DIR *Data_dir; // so I can fsync(dirfd(dp));
+static FILE *Log_fp;
+static UINT_32 Data_buffer[DATA_BUF_LEN];
#define MIN(x,y) (((x) < (y)) ? (x) : (y))
+#define MAX(x,y) (((x) > (y)) ? (x) : (y))
-/* The range of file sizes in bytes we will write
- */
-static UINT_32 Min_io_sz = 4;
-static UINT_32 Max_io_sz = 102400;
+void Logf(char *fmt, ...)
+{
+ va_list ap;
-/* Checkpoint file descriptor and pointer
- */
-static int Ckpt_fd = 0;
-static FILE *Ckpt_fp = NULL;
+ if (Log_fp) {
+ va_start(ap, fmt);
+ vfprintf(Log_fp, fmt, ap);
+ va_end(ap);
+ if ( fflush(Log_fp) || fsync(fileno(Log_fp)) ) {
+ fprintf(stderr, "Logf error: ");
+ perror(0);
+ }
+ }
+ va_start(ap, fmt);
+ vfprintf(stderr, fmt, ap);
+ va_end(ap);
+}
-static char *Mem_scratch_p = NULL;
-static char *Mem_scratch2_p = NULL;
-static UINT_32 *Data_p = NULL;
+#define PERROR(fmt , args...) do { \
+ Logf( fmt ": %s (%d)\n" , ##args , \
+ strerror(errno), errno ); \
+} while (0)
-#define FLAG_NO_VERIFY 0x00000001
-#define FLAG_VERIFY_ONLY 0x00000002
-static UINT_32 Flags = 0;
+/*
+ * in child process
+ ******************************************************/
-/* Directory required to hold checkpoint files. This dir should be
- * on a disk w/write cache and write barrier disabled.
- */
-static char *Checkpoint_path = NULL;
-/* Directory required to hold data files. This dir should be
- * on a disk w/write cache and write barrier enabled.
- */
-static char *Data_path = NULL;
-/* Failed verifies (data filenames that don't match the checkpoint file)
- * are logged here.
- */
-static char *Verify_fail_logf = NULL;
+void fill(const UINT_32 word, const size_t num_bytes)
+{
+ UINT_32 i;
+ UINT_32 *p = Data_buffer;
+ for (i = 0; i < num_bytes; i += 4) *p++ = word;
+}
-
-UINT_32 initial_setup_child(const pid_t pid)
+int write_file(const pid_t pid, size_t size)
{
- char *mem_p = NULL;
- time_t seed;
+ static char Fname_curr[FNAME_LEN] = "";
+ static char Fname_done[FNAME_LEN] = "";
+ int fd;
+ ssize_t c = 0;
+ UINT_32 rnum = (UINT_32) random();
+ size_t wsize = size;
- Mem_scratch_p = (char *) malloc((size_t) MEM_SCRATCH_LEN);
- if (NULL == Mem_scratch_p) {
- fprintf(stderr, "can't alloc mem scratch\n");
- return 1;
- }
- mem_p = Mem_scratch_p;
+ snprintf(Fname_done, FNAME_LEN, "%u-%u-%08X", pid, size, rnum);
+ snprintf(Fname_curr, FNAME_LEN, "%s+", Fname_done);
- sprintf(mem_p, "%s/%08u.chk", Checkpoint_path, pid);
+ fill(rnum, MIN(size, DATA_BUF_LEN));
- Ckpt_fd =
- open(mem_p, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR);
- if (-1 == Ckpt_fd) {
- fprintf(stderr,
- "can't open fd for checkpoint file errno=%i\n",
- errno);
- return 1;
+ fd = open(Fname_curr, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR);
+ if (-1 == fd) {
+ PERROR("open(%s,WRITE)", Fname_curr);
+ return -1;
}
- Ckpt_fp = fdopen(Ckpt_fd, "w");
- if (NULL == Ckpt_fp) {
- fprintf(stderr,
- "can't open fp for checkpoint file errno=%i\n",
- errno);
- return 1;
+ while (DATA_BUF_LEN < wsize) {
+ c = write(fd, (const void *) Data_buffer, DATA_BUF_LEN);
+ if (c != DATA_BUF_LEN) {
+ PERROR("D write(%s) (wrote %i/%u)",
+ Fname_curr, c, DATA_BUF_LEN);
+ wsize = 0;
+ c = -1;
+ break;
+ }
+ wsize -= DATA_BUF_LEN;
}
-
- Data_p = (UINT_32 *) malloc((size_t) DATA_BUF_LEN);
- if (NULL == Data_p) {
- fprintf(stderr, "can't alloc data mem\n");
- return 1;
+ if (wsize) {
+ c = write(fd, (const void *) Data_buffer, wsize);
+ if (c != wsize) {
+ PERROR("D write(%s) (wrote %i/%u)",
+ Fname_curr, c, wsize);
+ c = -1;
+ }
}
- seed = time(NULL);
- assert(-1 != seed);
- seed ^= pid;
- srandom((UINT_32) seed);
-
- return 0;
-}
-
-void cleanup_mem(void)
-{
- if (NULL != Mem_scratch_p) {
- free((void *) Mem_scratch_p);
- Mem_scratch_p = NULL;
+ if (0 != fsync(fd)) {
+ PERROR("D fsync(%s)", Fname_curr);
}
- if (NULL != Mem_scratch2_p) {
- free((void *) Mem_scratch2_p);
- Mem_scratch2_p = NULL;
- }
- if (NULL != Data_p) {
- free((void *) Data_p);
- Data_p = NULL;
- }
-}
-UINT_32 initial_setup_parent(void)
-{
- Mem_scratch_p = (char *) malloc((size_t) MEM_SCRATCH_LEN);
- if (NULL == Mem_scratch_p) {
- fprintf(stderr, "can't alloc mem scratch\n");
- return 1;
- }
+ fd = close(fd);
+ assert(-1 != fd);
- Mem_scratch2_p = (char *) malloc((size_t) MEM_SCRATCH_LEN);
- if (NULL == Mem_scratch2_p) {
- fprintf(stderr, "can't alloc mem scratch2\n");
- return 1;
+ if (-1 != c) {
+ if (rename(Fname_curr, Fname_done)) {
+ PERROR("D rename(%s)", Fname_curr);
+ };
}
+ fsync(dirfd(Data_dir));
- Data_p = (UINT_32 *) malloc((size_t) DATA_BUF_LEN);
- if (NULL == Data_p) {
- fprintf(stderr, "can't alloc data mem\n");
- return 1;
- }
-
return 0;
}
-int wait_for_kids(UINT_32 * kids, int single)
+void run_ascending(const pid_t pid)
{
- int err = 0;
-
- while (*kids) {
- pid_t reaped_pid;
- int status;
-
- reaped_pid = wait(&status);
-
- if (WIFEXITED(status)) {
- (*kids)--;
- if (WEXITSTATUS(status)) {
- fprintf(stderr,
- "child %u exited with status %u (%u remain)\n",
- reaped_pid, WEXITSTATUS(status),
- *kids);
- err = 1; // DO error here, we do NOT want to keep going
- }
- if (WAIT_ON_SINGLE_CHILD == single) {
- break;
- }
- } else if (WIFSIGNALED(status)) {
- (*kids)--;
- fprintf(stderr,
- "child %u exited with status %u (%u remain)\n",
- reaped_pid, WTERMSIG(status),
- *kids);
- err = 1; // DO error here, we do NOT want to keep going
- if (WAIT_ON_SINGLE_CHILD == single) {
- break;
- }
- } else if (0 > reaped_pid) {
- fprintf(stderr,
- "wait exited with error; quitting\n");
- err = 1;
+ UINT_32 size;
+ for (size = Min_size; size <= Max_size; size <<= 1) {
+ // size &= ~3;
+ if (write_file(pid, size)) {
+ Logf("ending ascending run\n");
break;
}
}
-
- return err;
}
-/* Function responsible for recording the contents and size of each file
- * written to the WSC/WB enabled FS
- */
-UINT_32 record_file(const size_t size, const UINT_32 rNum)
+void run_descending(const pid_t pid)
{
-
- fprintf(Ckpt_fp, "%08u:%08X\n", size, rNum);
-
- if (0 != fflush(Ckpt_fp)) {
- fprintf(stderr, "fflush error: errno=%i\n", errno);
- }
-
- if (0 != fsync(Ckpt_fd)) {
- fprintf(stderr, "fsync error: errno=%i\n", errno);
- }
-
- return 0;
-}
-
-void log_verify_failure(char *file_path_p)
-{
- static FILE *fp = NULL;
- int err;
-
- if (NULL == file_path_p) {
- if (NULL != fp) {
- err = fclose(fp);
- assert(EOF != err);
- fp = NULL;
+ UINT_32 size;
+ for (size = Max_size; size >= Min_size; size >>= 1) {
+ size &= ~3;
+ if (write_file(pid, size)) {
+ Logf("ending descending run\n");
+ break;
}
- } else {
- if (NULL == fp) {
- /* not open yet, open it */
- if (Verify_fail_logf) {
- fp = fopen(Verify_fail_logf, "a");
- } else {
- int fd = 0;
- fd = mkstemp("/tmp/wbtest-vLog-XXXXXXX");
- assert(-1 != fd);
- fp = fdopen(fd, "a");
- }
- assert(NULL != fp);
- }
- fprintf(fp, "%s\n", file_path_p);
}
}
-void fill(const UINT_32 * loc_p, const UINT_32 word,
- const size_t num_bytes)
-{
- UINT_32 ctr;
- UINT_32 *d_p = (UINT_32 *) loc_p;
+/*
+ * below only called from master process
+ ******************************************************/
- for (ctr = 0; ctr < num_bytes; ctr += 4) {
- *d_p++ = word;
- }
-
- return;
+void usage(char *prog)
+{
+ printf("Usage: %s [-hvV] [-m <min>] [-M <max>] [-p <passes>]\n"
+ "\t\t[-r <recycle>] [-c <concurrent>] [-l <vLog>] -d <datadir>\n"
+ "\n%s - Version %s Options:\n"
+ " -h prints this usage text\n"
+ " -v SKIP verification step of existing files (if any)\n"
+ " -V forces exit after verification step of existing files\n"
+ " min minimum IO size to use (bytes)\n"
+ " max maximum IO size to use (bytes)\n"
+ " passes # of passes to run (0 for INF)\n"
+ " recycle # of passes after which recycling of disk space starts\n"
+ " concurrent # of processes to run at once\n"
+ " vLog log (append) all file verify failures here.\n"
+ " otherwise /tmp/wbtest-vLog-<timestamp>-<pid> will be used\n"
+ " datadir required; writable directory on DRBD to store 'test data' files\n",
+ prog, prog, WBTEST_VERSION);
}
-int write_file(const pid_t pid, size_t size)
+void parse_options(int argc, char *argv[])
{
- int fd;
- ssize_t wrote;
- char *mem_p = Mem_scratch_p;
- UINT_32 *d_p = Data_p;
- UINT_32 rnum = (UINT_32) random();
- size_t size_sv = size;
- DIR *dp;
+ int c; /* getopt() returns int; plain char may be unsigned and never equal -1 */
- dp = opendir(Checkpoint_path);
- if (NULL == dp) {
- fprintf(stderr, "opendir failed\n");
- return 1;
- }
+ while ((c = getopt(argc, argv, "c:hl:m:M:p:d:r:vV")) != -1) {
+ UINT_32 scr;
- sprintf(mem_p, "%s/%u-%u-%08X", Data_path, pid, size, rnum);
-
- fill(d_p, rnum, MIN(size, DATA_BUF_LEN));
-
- fd = open(mem_p, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR);
- if (-1 == fd) {
- fprintf(stderr, "can't open fd for data file errno=%i\n",
- errno);
- return 1;
- }
-
- while (DATA_BUF_LEN < size) {
- wrote = write(fd, (const void *) d_p, DATA_BUF_LEN);
- if (wrote != DATA_BUF_LEN) {
- fprintf(stderr,
- "D write error (wrote %i/%u): errno=%i\n",
- wrote, DATA_BUF_LEN, errno);
+ switch (c) {
+ case 'd':
+ /* "test" path -- dir on DRBD to store test data files
+ */
+ snprintf(Data_path, FNAME_LEN, "%s", optarg);
+ assert(strlen(Data_path) == strlen(optarg));
+ break;
+ case 'l':
+ /* where to store the verify "log" showing found problems
+ */
+ snprintf(Log_fname, FNAME_LEN, "%s", optarg);
+ assert(strlen(Log_fname) == strlen(optarg));
+ break;
+ case 'm':
+ scr = ((UINT_32) atoi(optarg)) & ~3;
+ Min_size = scr;
+ break;
+ case 'M':
+ scr = ((UINT_32) atoi(optarg)) & ~3;
+ Max_size = scr;
+ break;
+ case 'p':
+ scr = ((UINT_32) atoi(optarg));
+ Pass_cnt = scr;
+ break;
+ case 'c':
+ scr = ((UINT_32) atoi(optarg));
+ Max_conc = scr;
+ break;
+ case 'r':
+ scr = ((UINT_32) atoi(optarg));
+ Recycle = scr;
+ break;
+ case 'v':
+ Dont_verify = 1;
+ break;
+ case 'V':
+ Verify_only = 1;
+ break;
+ case 'h':
+ case '?':
+ default:
+ usage(argv[0]);
+ exit(1);
}
- size -= DATA_BUF_LEN;
}
-
- if (size) {
- wrote = write(fd, (const void *) d_p, size);
- if (wrote != size) {
- fprintf(stderr,
- "D write error (wrote %i/%u): errno=%i\n",
- wrote, size, errno);
- }
+ if (Min_size < 4) Min_size = 4;
+ if (Max_size > 1024*1024) Max_size = 1024*1024;
+ if (Verify_only) Dont_verify = 0;
+ if (!Data_path[0]) {
+ fprintf(stderr,
+ "Missing -d Data_Path argument, required\n");
+ usage(argv[0]);
+ exit(1);
}
+ if (Log_fname[0]) {
+ Log_fp = fopen(Log_fname, "a");
+ } else {
+ int fd = 0;
+ time_t t = time(NULL);
- if (0 != fsync(fd)) {
- fprintf(stderr, "D fsync error: errno=%i\n", errno);
+ snprintf(Log_fname, FNAME_LEN,
+ "/tmp/wbtest-vLog-%u-%u",
+ (unsigned int)t, getpid());
+ fd = open(Log_fname, O_WRONLY | O_CREAT | O_EXCL | O_APPEND,
+ S_IRUSR | S_IWUSR);
+ assert(-1 != fd); // if this was a real program, retry!
+ Log_fp = fdopen(fd, "a");
}
-
- fd = close(fd);
- assert(-1 != fd);
-
- fsync(dirfd(dp));
- closedir(dp);
-
- record_file(size_sv, rnum);
-
- return 0;
+ assert(NULL != Log_fp);
+ if (chdir(Data_path)) {
+ PERROR("chdir(%s)", Data_path);
+ exit(1);
+ }
+ {
+ char *p;
+ p = getcwd(Data_path,sizeof(Data_path));
+ assert(p == Data_path);
+ }
+ Data_dir = opendir(".");
+ if (NULL == Data_dir) {
+ PERROR("opendir(%s)", Data_path);
+ exit(1);
+ }
}
-int verify_data(const UINT_32 * loc_p,
- const size_t num_bytes, const UINT_32 word)
+int wait_for_kid(pid_t kid)
{
- UINT_32 ctr;
- UINT_32 *d_p = (UINT_32 *) loc_p;
- int mismatches = 0;
+ int err = 0;
- for (ctr = 0; ctr < num_bytes; ctr += 4) {
- if (word != *d_p++) {
- mismatches++;
+ pid_t reaped_pid;
+ int status;
+
+ do {
+ reaped_pid = waitpid(kid,&status,0);
+ } while (reaped_pid == -1 && errno == EINTR);
+
+ if (0 > reaped_pid) {
+ Logf("wait exited with error; quitting\n");
+ err = 1;
+ } else if (WIFEXITED(status)) {
+ if (WEXITSTATUS(status)) {
+ Logf("child %u exited with status %u\n",
+ reaped_pid, WEXITSTATUS(status) );
+ err = 1; // DO error here, we do NOT want to keep going
+ }
+ } else if (WIFSIGNALED(status)) {
+ Logf("child %u killed by signal %u\n",
+ reaped_pid, WTERMSIG(status) );
+ err = 1; // DO error here, we do NOT want to keep going
+ }
-
- return mismatches;
+ return err;
}
-int read_file(pid_t pid, size_t size, UINT_32 rnum)
+pid_t spawn(int pass)
{
- char *mem_p = Mem_scratch2_p;
- UINT_32 *d_p = Data_p;
- int fd;
- ssize_t reads;
- int mismatches = 0;
- int err = 0;
+ time_t seed;
+ pid_t pid;
- sprintf(mem_p, "%s/%u-%u-%08X", Data_path, pid, size, rnum);
- fd = open(mem_p, O_RDONLY);
- if (-1 == fd) {
- fprintf(stderr,
- "can't open fd %s for read data: errno=%i\n",
- mem_p, errno);
- return 1;
- }
+ fflush(0);
+ pid = fork();
+ assert(-1 != pid);
- while (DATA_BUF_LEN < size) {
- reads = read(fd, (void *) d_p, DATA_BUF_LEN);
- assert(DATA_BUF_LEN == reads);
+ if (pid) return pid;
- mismatches += verify_data(d_p, DATA_BUF_LEN, rnum);
+ // in child
+ pid = getpid();
- size -= DATA_BUF_LEN;
- }
+ seed = time(NULL);
+ assert(-1 != seed);
+ seed ^= pid;
+ srandom((UINT_32) seed);
- if (size) {
- reads = read(fd, (void *) d_p, size);
- assert(size == reads);
-
- mismatches += verify_data(d_p, size, rnum);
+ if (pass & 1) {
+ run_ascending(pid);
+ } else {
+ run_descending(pid);
}
+ exit(0); // child exit
+}
- fd = close(fd);
- assert(-1 != fd);
+void strange_fname(const char *name)
+{
+ Logf("%s: strange filename\n", name);
+}
- if (0 < mismatches) {
- printf("FAILED verify of %s: %i word mismatches\n", mem_p,
- mismatches);
- err = 1;
- log_verify_failure(mem_p);
+void remove_unfinished(const char *name)
+{
+ if (unlink(name)) {
+ PERROR("unlink(%s)", name);
} else {
- fd = unlink(mem_p);
- assert(-1 != fd);
+ Logf("%s: unfinished, removed.\n", name);
}
-
- return err;
}
-int parse_chkfile(char *filepath_p, pid_t pid)
+int verify_fname(const char *name, UINT_32 size, UINT_32 rnum)
{
- UINT_32 count_ttl = 0, count_err = 0;
- int err = 0;
- char buf[32];
- FILE *fp = fopen(filepath_p, "r");
- int j;
-
- if (NULL == fp) {
- fprintf(stderr,
- "fopen ret err in parse_chkfile, errno=%i\n",
- errno);
- return 1;
+ UINT_32 rsize, errors;
+ int fd, i, c;
+ fd = open(name, O_RDONLY);
+ if (-1 == fd) {
+ PERROR("open(%s,READ)", name);
+ return -1;
}
-
- while (fgets(buf, 32, fp) != NULL) {
- char *tok_p;
- size_t nl_pos = strlen(buf) - 1;
- size_t iosz;
- UINT_32 rnum;
-
- // chomp:
- if (buf[nl_pos] == '\n') {
- buf[nl_pos] = '\0';
+ rsize = 0;
+ errors = 0;
+ do {
+ c = read(fd, (void *) Data_buffer, DATA_BUF_LEN);
+ if (c < 0) {
+ PERROR("read(%s)", name);
+ fd = close(fd);
+ assert(-1 != fd);
+ return -1;
}
-
- iosz = strtoul(buf, &tok_p, 10);
- assert(NULL != tok_p);
-
- /* Advance past the colon */
- /* assert(':' == *tok_p); */
- if (':' != *tok_p) {
- /* OK, I've seen this case where the checkfile will end with
- * a bunch of NUL bytes and the line read in will not have
- * the correct format (will be all 0's). This seems like a
- * FS quirk so in the interest of assuring all the data is OK
- * on the write barrier/write cache partition I will log the
- * quirk and move on.
- */
- log_verify_failure(filepath_p);
- continue;
+ for (i = 0; i < c/sizeof(UINT_32); i++) {
+ if (Data_buffer[i] != rnum) {
+ ++errors;
+ }
}
- ++tok_p;
-
- rnum = (UINT_32) strtoul(tok_p, NULL, 16);
-
- ++count_ttl;
-
- if (read_file(pid, iosz, rnum)) {
- err = 1;
- ++count_err;
+ rsize += c;
+ } while(c > 0);
+ fd = close(fd);
+ assert(-1 != fd);
+ if (errors == 0 && rsize == size) {
+ if (unlink(name)) {
+ PERROR("unlink(%s)", name);
+ return -1;
}
+ return 0;
}
-
- printf("Processed checkfile %s: %u/%u passed\n",
- filepath_p, (count_ttl - count_err), count_ttl);
-
- j = fclose(fp);
- assert(EOF != j);
-
- j = unlink(filepath_p);
- assert(-1 != j);
-
- return err;
+ if (errors)
+ Logf("%s: %u word errors\n", name, errors);
+ if (rsize != size)
+ Logf("%s: %u byte read, but %u expected\n",
+ name, rsize, size);
+ return -1;
}
-int parse_dir(void)
+int do_verify(pid_t glob_pid)
{
- int err = 0;
- char *tmp_p = Mem_scratch_p;
- DIR *dp;
+ static char glob[FNAME_LEN];
+ int verified = 0;
+ int verify_failure = 0;
struct dirent *dir_p;
- pid_t pid;
+ char unfinished, tmp;
- dp = opendir(Checkpoint_path);
- if (NULL == dp) {
- fprintf(stderr, "opendir failed\n");
- return 1;
+ if (glob_pid) {
+ snprintf(glob,FNAME_LEN, "%u-%%u-%%08x%%c%%c", glob_pid);
+ } else {
+ snprintf(glob,FNAME_LEN, "%s", "%*[0-9]-%u-%08x%c%c");
}
- while ((dir_p = readdir(dp)) != NULL) {
+ rewinddir(Data_dir);
+
+ while ((dir_p = readdir(Data_dir)) != NULL) {
+ int c, size, rnum;
if ((strcmp(dir_p->d_name, ".") == 0) ||
(strcmp(dir_p->d_name, "..") == 0)) {
continue;
}
- pid = strtoul(dir_p->d_name, NULL, 10);
-
- sprintf(tmp_p, "%s/%s", Checkpoint_path, dir_p->d_name);
-
- if (parse_chkfile(tmp_p, pid)) {
- err = 1;
+ unfinished = tmp = '\0';
+ c = sscanf(dir_p->d_name, glob,
+ &size, &rnum, &unfinished, &tmp);
+ if (c == 2) {
+ ++verified;
+ if (verify_fname(dir_p->d_name, size, rnum)) {
+ ++verify_failure;
+ } else {
+ putchar('.');
+ }
+ if (verified % 50 == 0) {
+ printf("\t(%u/%u)\n",
+ verified - verify_failure,
+ verified);
+ }
+ } else if (c == 3) {
+ if (unfinished != '+') {
+ strange_fname(dir_p->d_name);
+ continue;
+ }
+ remove_unfinished(dir_p->d_name);
+ continue;
+ } else {
+ if (glob_pid) continue;
+ strange_fname(dir_p->d_name);
+ continue;
}
- fsync(dirfd(dp));
}
-
- return err;
+ printf("\t(%u/%u)\n",
+ verified - verify_failure,
+ verified);
+ if (glob_pid) {
+ /*
+ fprintf(stdout, "verify \"%u-*-*\": (%u/%u) passed\n",
+ glob_pid,
+ verified - verify_failure,
+ verified);
+ */
+ } else {
+ Logf("verify: (%u/%u) passed\n",
+ verified - verify_failure,
+ verified);
+ }
+ return verify_failure;
}
-/* This function starts with a small file size and increases
- */
-void run_ascending(const pid_t pid)
-{
- UINT_32 size;
+struct array_s {
+ UINT_32 first, last, count, number;
+ UINT_32 v[0];
+};
- for (size = Min_io_sz; size < Max_io_sz; size *= 2) {
- size &= ~3;
- if (write_file(pid, size)) {
- fprintf(stderr, "ending ascending run\n");
- break;
- }
- }
+struct array_s * array_new(UINT_32 number)
+{
+ UINT_32 bytes = sizeof(struct array_s)
+ + sizeof(UINT_32)*number;
+ struct array_s *a = malloc(bytes);
+ if (!a) return NULL;
+ memset(a,0,bytes);
+ a->number = number;
+ return a;
}
-/* This function starts with a large file size and decreases
- */
-void run_descending(const pid_t pid)
+void array_destroy(struct array_s *a)
{
- size_t size;
+ free(a);
+}
- for (size = Max_io_sz; size > Min_io_sz; size /= 2) {
- size &= ~3;
- if (write_file(pid, size)) {
- fprintf(stderr, "ending descending run\n");
- break;
- }
+void array_push(struct array_s *a, const UINT_32 v)
+{
+ assert(a->count < a->number);
+ if (a->count > 0) {
+ if (++a->last == a->number)
+ a->last = 0;
}
+ a->v[a->last] = v;
+ ++a->count;
+ /*
+ fprintf(stderr, "=p v(%u:%u)[%u]: %u\n",
+ a->first, a->last, a->count, v);
+ */
}
-void usage(char *prog)
+UINT_32 array_shift(struct array_s *a)
{
- printf("Usage: %s [-hvVs] [-m <min>] [-M <max>] [-p <passes>]\n"
- "[-c <concurrent>] [-l <vLog>] -s <safedir> -t <testdir>\n"
- "\n%s - Version %s Options:\n"
- "\t-h prints this usage text\n"
- "\t-v forces NO verification step of existing files (if any)\n"
- "\t-V forces exit after verification step of existing files\n"
- "\t<min> == minimum IO size to use (bytes)\n"
- "\t<max> == maximum IO size to use (bytes)\n"
- "\t<passes> == # of passes to run (0 for INF)\n"
- "\t<concurrent> == # of processes to run at once\n"
- "\t<vLog> == log all file verify failures here, otherwise mkstemp\n"
- "\t<safedir> == writable DIRECTORY to store 'checkpoint' files\n"
- "\t<testdir> == writable DIRECTORY to store 'test data' files\n"
- "\n"
- "\tDifference between safedir and testdir is that safedir should \n"
- "\tbe 'safe' storage meaning that it is not using drive write \n"
- "\tcache, whereas testdir is intended to be using drive write \n"
- "\tcache and the write barrier. Naturally, the two should be on \n"
- "\tseparate drives\n", prog, prog, WBTEST_VERSION);
+ UINT_32 v;
+ assert(a->count > 0);
+ v = a->v[a->first];
+ if (++a->first == a->number)
+ a->first = 0;
+ --a->count;
+ /*
+ fprintf(stderr, "=s v(%u:%u)[%u]: %u\n",
+ a->first, a->last, a->count, v);
+ */
+ return v;
}
-void exit_safe(int ret_code)
+UINT_32 array_idx(struct array_s *a, UINT_32 i)
{
- cleanup_mem();
- exit(ret_code);
+ UINT_32 v;
+ assert(a->count > i);
+ v = a->v[(a->first + i) % a->number];
+ /*
+ fprintf(stderr, "=i v(%u:%u)[%u:%u]: %u\n",
+ a->first, a->last, a->count, i, v);
+ */
+ return v;
}
int main(int argc, char *argv[])
{
- int c;
- pid_t pid = 1;
- UINT_32 pass, pass_cnt = DEFAULT_PASS_COUNT;
- UINT_32 max_conc = DEFAULT_MAX_CONCURRENT;
- UINT_32 kids = 0;
- while ((c = getopt(argc, argv, "c:hl:m:M:p:s:t:vV")) != -1) {
- UINT_32 scr;
+ struct statfs s;
+ struct array_s *Kids;
+ UINT_32 blocks_per_pass, tmp, pass, kids = 0;
- switch (c) {
- case 's':
- /* "safe" path -- non-cached dir to store checkpoint files
- */
- Checkpoint_path = optarg;
- break;
- case 't':
- /* "test" path -- cached/barrier protected dir to store test data
- * files
- */
- Data_path = optarg;
- break;
- case 'l':
- /* where to store the verify "log" showing found problems
- */
- Verify_fail_logf = optarg;
- break;
- case 'm':
- scr = ((UINT_32) atoi(optarg)) & ~3;
- Min_io_sz = scr;
- break;
- case 'M':
- scr = ((UINT_32) atoi(optarg)) & ~3;
- Max_io_sz = scr;
- break;
- case 'p':
- scr = ((UINT_32) atoi(optarg));
- pass_cnt = scr;
- break;
- case 'c':
- scr = ((UINT_32) atoi(optarg));
- max_conc = scr;
- break;
- case 'v':
- Flags |= FLAG_NO_VERIFY;
- break;
- case 'V':
- Flags |= FLAG_VERIFY_ONLY;
- break;
- case 'h':
- case '?':
- default:
- usage(argv[0]);
- exit_safe(1);
- }
- }
+ parse_options(argc,argv);
- if (!Checkpoint_path || !Data_path) {
- fprintf(stderr,
- "Missing -s or -t arguments, both required\n");
- usage(argv[0]);
- exit_safe(1);
- }
- // first, verify what's there using the checkpoint files
- if (!(Flags & FLAG_NO_VERIFY)) {
- printf("Beginning verify stage\n");
- if (initial_setup_parent()) {
- exit_safe(1);
- }
- if (parse_dir()) {
+ if (!Dont_verify) {
+ printf( "Data_path: %s\nLogfile: %s\n"
+ "Beginning global verify stage.\n",
+ Data_path, Log_fname );
+ if (do_verify(0)) {
printf("Verify failed; program exiting.\n");
- exit_safe(1);
- } else {
- printf
- ("Verify completed successfully; program continuing.\n");
+ exit(1);
}
-
- log_verify_failure(NULL); /* close out the failure log if opened */
- cleanup_mem(); /* free used mem from parent */
-
- if (Flags & FLAG_VERIFY_ONLY) {
- printf("Performing verify step ONLY as desired\n");
- exit_safe(0);
+ if (Verify_only) {
+ printf("Verify completed successfully.\n");
+ printf("Nothing else to do (-V).\n");
+ fclose(Log_fp);
+ exit(0);
}
+ printf("Verify completed successfully; program continuing.\n");
} else {
- printf("skipping verify step as desired\n");
+ printf("Skip initial verify step (-v).\n");
}
- printf
- ("Using I/O Min: %u, Max: %u, %u passes with %u procs running\n",
- Min_io_sz, Max_io_sz, pass_cnt, max_conc);
+ if (fstatfs(dirfd(Data_dir),&s)) {
+ perror("statfs");
+ return 1;
+ }
- // now launch new I/O
- for (pass = 0; (pass < pass_cnt) || (0 == pass_cnt); pass++) {
- fflush(0);
- pid = fork();
- assert(-1 != pid);
+ /* estimate recycle count */
+ blocks_per_pass = ( ((Max_size<<1) -1) + (Min_size-1) ) / s.f_bsize +1; /* +1: never 0 */
+ tmp = (s.f_bavail>>1) / blocks_per_pass +1;
+ if (tmp > MAX_RECYCLE) tmp = MAX_RECYCLE;
+ if (Recycle > tmp) Recycle = tmp;
+ Recycle = MAX(Recycle,Max_conc+1);
- if (0 == pid) {
- // child
- pid_t new_pid = getpid();
+ Logf("Using I/O Min: %u, Max: %u, %u passes, "
+ "with %u procs running\n"
+ "Recycling starts with the %u. pass\n"
+ "Data_path: %s\n"
+ "Logfile: %s\n",
+ Min_size, Max_size, Pass_cnt, Max_conc,
+ Recycle, Data_path, Log_fname);
- if (0 != initial_setup_child(new_pid)) {
- exit_safe(1);
- }
+ Kids = array_new(Recycle+10);
+ if (!Kids) {
+ Logf("array_new(%u): out of memory\n", Recycle);
+ exit(1);
+ }
- if (new_pid % 2) {
- run_ascending(new_pid);
- } else {
- run_descending(new_pid);
- }
- exit_safe(0); // child exit
- } else {
- if (++kids < max_conc) {
- continue;
- } else {
- if (wait_for_kids
- (&kids, WAIT_ON_SINGLE_CHILD)) {
- exit_safe(1);
- }
- }
+ // now launch new I/O
+ for (pass = 0; (pass < Pass_cnt) || (0 == Pass_cnt); pass++) {
+ if (Kids->count >= Recycle) {
+ do_verify(array_shift(Kids));
}
+ array_push(Kids, spawn(pass) );
+ if (++kids >= Max_conc) {
+ if ( wait_for_kid(array_idx(Kids,Kids->count - kids)) )
+ exit(1);
+ --kids;
+ }
+ }
- } // end pass loop
-
/* we have finished the number of passes we wanted to originate. Wait
* for the rest of the kids before leaving
*/
- if ((pid > 0) && wait_for_kids(&kids, WAIT_ON_ALL_CHILDREN)) {
- exit_safe(1);
+ while (kids--) {
+ if (wait_for_kid(-1)) exit(1);
}
- exit_safe(0);
- return 0;
+ fclose(Log_fp);
+ closedir(Data_dir);
+ array_destroy(Kids);
+
+ exit(0);
}
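
The write protocol this revision introduces (write to a name ending in '+', fsync, rename to the final name, fsync the directory) can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the code from the diff: publish_file is a hypothetical helper, it uses openat()/renameat() relative to a directory descriptor rather than chdir(), and it reports errors through its return value.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes from buf to "<final_name>+"
 * inside the directory open as dir_fd, flush it to disk, and only then
 * rename it to final_name. Returns 0 on success, -1 on error.
 */
int publish_file(int dir_fd, const char *final_name,
                 const void *buf, size_t len)
{
    char tmp_name[128];
    const char *p = buf;
    int fd, n;

    assert(final_name != NULL && (buf != NULL || len == 0));

    n = snprintf(tmp_name, sizeof(tmp_name), "%s+", final_name);
    if (n < 0 || (size_t)n >= sizeof(tmp_name))
        return -1;

    fd = openat(dir_fd, tmp_name, O_WRONLY | O_CREAT | O_TRUNC,
                S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    while (len > 0) {
        ssize_t c = write(fd, p, len);
        if (c < 0) {
            if (errno == EINTR)
                continue;        /* interrupted before writing: retry */
            close(fd);
            return -1;
        }
        p += c;                  /* handle short writes */
        len -= (size_t)c;
    }

    if (fsync(fd) != 0) {        /* data must be stable before the rename */
        close(fd);
        return -1;
    }
    if (close(fd) != 0)
        return -1;
    if (renameat(dir_fd, tmp_name, dir_fd, final_name) != 0)
        return -1;
    /* Make the rename itself durable; tolerate file systems that
     * report EINVAL for fsync() on a directory. */
    if (fsync(dir_fd) != 0 && errno != EINVAL)
        return -1;
    return 0;
}
```

The ordering is the point of the design: a name without the trailing '+' only ever appears after the file content has been synced, so after a failover any such file can be held to its full size and pattern.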