Hi,

I started some benchmarks on a 4.5 TB lvm which is built on top of 3 drbd devices (three 1.5 TB volumes of a 4.5 TB RAID5 with an Areca controller). The write performance seems to be good and is only limited by the GbE link.

Setup:

  Core 2 Duo 6600 CPU
  4 GB RAM
  Ubuntu Linux 6.06
  2.6.15-27-amd64-server (just switched to x86-amd64 from i386)
  drbd 0.7.15
  directly connected GbE links
  3 x 1.5 TB drbd devices (internal meta data, sdc1/r0, sdd1/r1, sde1/r2)
  3 x 1.5 TB lvm --> 1 x 4.5 TB lv with xfs as fs

I have some problems with xfs. When I run tiobench with 8 threads on the 4.5 TB lvm, the server hangs after some time. I'm still able to ping the server, I get the ssh login banner and I'm able to enter user/password, but then I don't get any further. The same happens on the console. There is also a high system load (~13, well, that's not _that_ high). There is no other way than to reset the server and reboot. There are no messages on the console; syslog just stops printing --Mark-- messages.

I also stopped tiobench once to avoid rebooting the system, and tiobench was listed in the process list as defunct. But the system load kept growing although the CPU was 100% idle and there was no disk activity. I had to reset the server after a couple of minutes.

This problem only happens when drbd is in disconnected mode (higher disk throughput?). It does not happen with ext3 (which is a bit slower), and it does not happen with a 4.5 TB lvm without drbd.

Here are some benchmark results:

tiobench --numruns 3 --threads 8 --block 4096 --size 8000

                                       File  Blk   Num                  Avg      Maximum    Lat%     Lat%     CPU
Test                                   Size  Size  Thr  Rate    (CPU%)  Latency  Latency    >2s      >10s     Eff
Sequential Reads (no drbd! xfs)        8000  4096  8    106.42  77.62%  0.844    10901.35   0.00264  0.00000  137
Sequential Writes (no drbd! xfs)       8000  4096  8    199.21  278.7%  0.280    42462.46   0.00307  0.00020  71
Sequential Reads (Connected xfs)       8000  4096  8    98.54   103.7%  0.912    11898.48   0.00302  0.00000  95
Sequential Writes (Connected xfs)      8000  4096  8    97.36   163.2%  0.666    118173.24  0.00336  0.00102  60
Sequential Reads (Connected ext3)      8000  4096  8    92.37   98.69%  0.988    12600.68   0.00307  0.00000  94
Sequential Writes (Connected ext3)     8000  4096  8    59.97   158.5%  1.221    58146.28   0.01143  0.00020  38
Sequential Reads (Disconnected xfs)    system hangs
Sequential Writes (Disconnected xfs)   system hangs
Sequential Reads (Disconnected ext3)   8000  4096  8    77.53   87.43%  1.187    16140.48   0.00395  0.00000  89
Sequential Writes (Disconnected ext3)  8000  4096  8    64.43   143.7%  1.020    135364.99  0.00928  0.00323  45

One other thing I only noticed with ext3 and not with xfs:

kernel: [242696.450763] drbd2: [tiotest/7417] sock_sendmsg time expired, ko = 3
kernel: [242794.614237] drbd2: [tiotest/7419] sock_sendmsg time expired, ko = 3
kernel: [242820.882346] drbd2: [kjournald/7015] sock_sendmsg time expired, ko = 3

Any ideas about this xfs/drbd problem? I'm a bit lost, because I don't see any kernel messages or logfile entries. I'm not even sure whether it's a kernel, drbd, lvm or xfs problem, but it only occurs in conjunction with xfs and drbd.

Ralf
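
As a rough sanity check of the statement that connected writes are limited by the GbE link, the sequential write rates above can be compared with the raw line rate of a single gigabit link. This is only a sketch under assumptions that are not stated in the post: the tiobench rates are taken to be in MB/s, 1 MB is taken as 10^6 bytes, and protocol overhead is ignored. The names `GBE_RAW_MB_PER_S` and `share_of_link` are made up for this illustration.

```python
# Sequential write rates copied from the tiobench results above.
# Assumption: tiobench reports these rates in MB/s.
GBE_RAW_MB_PER_S = 1000 / 8  # 1 Gbit/s raw line rate = 125 MB/s, no protocol overhead

write_rates = {
    "no drbd, xfs": 199.21,
    "connected, xfs": 97.36,
    "connected, ext3": 59.97,
    "disconnected, ext3": 64.43,
}


def share_of_link(rate_mb_per_s: float) -> float:
    """Return a rate as a fraction of the raw GbE line rate."""
    return rate_mb_per_s / GBE_RAW_MB_PER_S


for name, rate in write_rates.items():
    print(f"{name:20s} {rate:7.2f} MB/s  {share_of_link(rate):6.1%} of raw GbE")
```

Under these assumptions the connected xfs write rate comes to a bit under 80% of the raw link rate, while the local xfs write rate without drbd is well above what a single GbE link could carry. That is consistent with the replication link, and not the disks, being the bottleneck for connected writes.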