Lars Ellenberg wrote:
> On Fri, Jan 30, 2009 at 11:34:22AM -0800, John Du wrote:
>
>> Hi,
>>
>> I upgraded DRBD 8.2.0 to 8.3.0. The upgrade went smoothly. However,
>> the upgraded version is very slow and the system load is near 100.
>> Before the upgrade the load never exceeded 1.
>>
>> The DRBD worker thread goes into "uninterruptible state" very often,
>> for a long time.
>>
>> iostat shows:
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.52    0.00    0.57    1.47    0.00   97.44
>>
>> Device:     tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
>> sdb       31.02       548.36       668.39   27950963   34068995
>> drbd1     87.19        99.38       648.63    5065610   33061496
>>
>> sdb is the underlying storage, a SAN partition of 1.2 TB.
>>
>> Note that the read speed for sdb is 548 but for drbd1 99. The write
>> speeds for sdb and drbd1 are about the same. Note the numbers of blocks
>> read for sdb and drbd1 are very different.
>
> strange.
>
>> top shows:
>>
>> Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
>> Cpu(s):  0.2%us,  0.2%sy,  0.0%ni, 98.9%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st
>> Mem:   8175372k total,  8120248k used,    55124k free,   849224k buffers
>> Swap:  2031608k total,        0k used,  2031608k free,  4206700k cached
>>
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>  5507 root      15   0     0    0    0 D    0  0.0   0:29.36 drbd1_worker
>>
>> Here is the background information:
>>
>> OS: Linux 2.6.18-8.1.15.el5 #1 SMP Thu Oct 4 04:06:39 EDT 2007 x86_64
>> x86_64 x86_64 GNU/Linux
>>
>> The primary DRBD is not connected to the secondary.
>>
>> DRBD configuration:
>>
>> global {
>>     usage-count no;
>> }
>>
>> common {
>>     net {
>>         sndbuf-size 512k;
>>         timeout 60;
>>         connect-int 10;
>>         ping-int 10;
>>         max-buffers 2048;
>>         max-epoch-size 2048;
>>     }
>> }
>>
>> resource drbd0 {
>>     protocol A;
>>
>>     startup {
>>         wfc-timeout 30;
>>         degr-wfc-timeout 120;    # 2 minutes.
>>     }
>>
>>     on host1 {
>>         device /dev/drbd1;
>>         disk /dev/sdb1;
>>         address 10.100.2.232:7789;
>>         meta-disk internal;
>>     }
>>
>>     on host2 {
>>         device /dev/drbd1;
>>         disk /dev/sdb1;
>>         address 10.101.152.36:7789;
>>         meta-disk internal;
>>     }
>> }
>>
>> My questions are:
>>
>> 1. Does DRBD 8.3 re-organize the data on disk after the upgrade
>
> No.
>
>> making the IO on drbd1 slow now and it will return to normal after it
>> is done?
>>
>> 2. Can I roll back to 8.2.0 while investigating the cause of the
>> slowness?
>
> yes.
>
>> Does 8.3.0 make any changes that 8.2.0 does not recognize?
>
> no.
>
>> 3. What else can I do to improve the performance to a level close to
>> what it was before the upgrade?
>
> I have no idea what goes on there.
> It certainly is unexpected behaviour.

I upgraded the kernel to the latest Red Hat release, 2.6.18-128.el5, and
that did not fix the problem. I installed 8.2.7 and it behaves the same
as 8.3: the system becomes too slow to use. I rolled back to 8.2.0.
8.2.0 works in terms of writing data to the underlying storage, but it
is not stable in replicating data to the secondary node. Unloading the
DRBD module of 8.2.0 locks up the host.

8.2.7 and 8.3 do not log any errors when they are slow. The DRBD worker
thread stays in state D and other processes are blocked.

Here are some observations:

When running 8.2.0, the command "iostat" does not show the DRBD device.
With 8.3.0 or 8.2.7, "iostat" shows the DRBD device as one of the IO
devices. On all our other servers where DRBD is used, iostat does not
display the DRBD devices. In other words, when iostat shows a DRBD
device (/dev/drbd1 in my case) as one of the IO devices, the host system
becomes very slow.

Thanks for your help.

John
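
For anyone trying to reproduce the "worker thread stays in state D" observation without watching top, here is a minimal Python sketch that lists tasks currently in uninterruptible sleep by reading /proc/<pid>/stat. It is not part of the original report. The helper names parse_stat_line and tasks_in_state are made up for illustration, and the sketch assumes a Linux /proc layout in which each stat line starts with "pid (comm) state".

```python
import os


def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, comm, state).

    Hypothetical helper. The comm field is wrapped in parentheses and may
    itself contain spaces or ')', so use the first '(' and the last ')'.
    """
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def tasks_in_state(wanted="D", proc_root="/proc"):
    """Return [(pid, comm), ...] for processes whose state equals `wanted`."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux-style /proc; nothing to report
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited in the meantime, or unreadable entry
        if state == wanted:
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in tasks_in_state("D"):
        print(pid, comm)
```

Run repeatedly (for example once a second), this would show whether drbd1_worker is the only task stuck in D or whether other processes pile up behind it. It reports a single snapshot each time, so a task that is only briefly in D can be missed.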