Thanks again for the tip, but disconnecting the peer (i.e. WFConnection mode) was the first thing I did.

I'm currently deleting with

  find subdir/ -type f | while IFS= read -r LINE; do rm -vf -- "$LINE" && sleep 0.03; done

That delay seems to be enough to keep the device from blocking I/O access, so at least the machine is online again. Deleting this way, though, will most likely take until the end of next week.

Enjoy your weekend,
Joe

On 28.01.2011 21:59, Moti Levy wrote:
> All I can think of is that DRBD is trying to catch up and causes the
> delays.
> Maybe take one of the nodes offline and try to delete without "real time
> replication"?
>
> Moti
>
>
> On Fri, Jan 28, 2011 at 2:44 PM, Joseph Hauptmann <joseph at digiconcept.net> wrote:
>
>> Yes, I did try that. Doesn't make much of a (speed) difference.
>>
>> It seems that the problem is less that rm gets stuck for good, but that it
>> takes really long breaks (about 20 sec.) while deleting - during those
>> breaks the whole partition is stuck and iostat reports 100% utilization
>> compared to ~95% while actually deleting files. Could the "hang-time" be
>> DRBD writing meta-information (internal in my case) and blocking every other
>> access as long as the meta-data isn't written to the disk? Of course there is
>> also the ext3-journal that has to be written, but still I don't see why it
>> should take that long: I'm currently timing how long it takes to delete a
>> subdir with 285868 block-sized files in it (already more than 30 min).
>>
>> dmesg is clear, so it does not seem to be a SATA reset.
>>
>> Any other ideas?
>>
>> On 2011-01-28 20:02, Moti Levy wrote:
>>
>> Have you tried:
>> find dirname -type f -exec rm {} \;
>>
>>
>> On Fri, Jan 28, 2011 at 1:46 PM, Joseph Hauptmann <joseph at digiconcept.net> wrote:
>>> Hello DRBD-users worldwide...
>>>
>>> I've been using DRBD almost a year now, until now without problems that I
>>> couldn't resolve myself.
>>> But now I ran into quite a serious problem and I'm interested if someone
>>> else experienced something similar with or without DRBD (as of course I
>>> can't really be sure that DRBD is the problem):
>>>
>>> A few months ago a colleague of mine forgot to activate a cronjob that
>>> deletes a couple thousand very small temporary files each night on a
>>> DRBD-device. Now I have a directory with, I guess, more than a million files,
>>> which wouldn't be so bad if rm -rf {dir}/ could delete it. But sadly that
>>> is not the case.
>>> rm gets stuck after it has deleted a few hundred files and doesn't resume
>>> operation. Furthermore, all I/O access on the DRBD-device is completely
>>> stuck until the rm process is killed.
>>>
>>> I've already disconnected all resources from their peer and shut down most
>>> of the non-essential services on the machine.
>>>
>>> It's running Debian Lenny with
>>>
>>> uname -a
>>> Linux srv1.xxx.at 2.6.26-2-openvz-amd64 #1 SMP Wed May 12 18:14:56 UTC 2010 x86_64 GNU/Linux
>>>
>>> cat /proc/drbd
>>> version: 8.3.7 (api:88/proto:86-91)
>>> GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by root at srv1.xxx.at, 2010-03-28 21:47:13
>>>  0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
>>>     ns:1875795496 nr:0 dw:225995436 dr:566154981 al:105639961 bm:11019801 lo:2 pe:0 ua:0 ap:1 ep:1 wo:b oos:1242040
>>>  1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r----
>>>     ns:0 nr:31796784 dw:31796784 dr:2253416 al:0 bm:1134 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
>>>  2: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r----
>>>     ns:0 nr:57709884 dw:143774088 dr:8480 al:0 bm:50 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
>>>
>>> The filesystem on resource 0 is ext3 with a block size of 4096 and lies
>>> on a SW-RAID5 (far from ideal - I know).
>>>
>>> At the moment I'm using a bash hack that kills the rm process every 30 seconds and
>>> restarts it as long as the directory still exists.
>>>
>>> Thanks for any hints to what might cause this problem.
>>>
>>> Joe
>>>
>>> --
>>> Joseph Hauptmann
>>>
>>> /digiconcept/ - GmbH.
>>> 1080 Wien
>>> Blindengasse 52/1
>>>
>>> Tel. +43 1 218 0 212 - 24
>>> Fax +43 1 218 0 212 - 10
>>>
>>> _______________________________________________
>>> drbd-user mailing list
>>> drbd-user at lists.linbit.com
>>> http://lists.linbit.com/mailman/listinfo/drbd-user
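
For reference, the throttled deletion described at the top of this message could be wrapped in a small function. This is only a sketch: the name throttled_delete and its batch and pause parameters are hypothetical and not from the thread, and it pauses once per batch of files instead of after every single file. Whether any particular batch size or pause keeps a DRBD device responsive is untested here and would need tuning on the affected machine.

```shell
#!/bin/sh
# throttled_delete DIR [BATCH] [PAUSE]
# Hypothetical helper (not from the thread): removes every regular file
# below DIR, sleeping PAUSE seconds after each BATCH deletions so other
# I/O on the device gets a chance to run. Directories are left in place.
# Limitation: file names containing newlines are not handled.
throttled_delete() {
    dir=$1
    batch=${2:-100}   # assumed default: files removed between pauses
    pause=${3:-1}     # assumed default: seconds to sleep after a batch
    count=0
    find "$dir" -type f | while IFS= read -r file; do
        # Quote the name and use "--" so spaces or leading dashes are safe.
        rm -f -- "$file"
        count=$((count + 1))
        if [ $((count % batch)) -eq 0 ]; then
            sleep "$pause"
        fi
    done
}
```

It would be called as, for example, throttled_delete subdir/ 100 1; the empty directory tree can be removed afterwards with rm -rf once the files are gone.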