Yes, I did try that. It doesn't make much of a difference in speed.
It seems the problem is not so much that rm gets stuck for good, but that
it takes really long breaks (about 20 seconds) while deleting. During
those breaks the whole partition is stuck and iostat reports 100%
utilization, compared to ~95% while files are actually being deleted. Could
the "hang time" be DRBD writing its meta-data (internal in my case) and
blocking every other access as long as the meta-data hasn't been written to
disk? Of course there is also the ext3 journal that has to be written,
but I still don't see why it should take that long. I'm currently timing
how long it takes to delete a subdir with 285868 block-sized files in it
(already more than 30 minutes).
dmesg is clean, so it does not seem to be a SATA reset.
Any other ideas?
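
A timing run like the one mentioned above could be reproduced roughly as follows. This is only a minimal sketch, not necessarily the way the measurement was actually taken: it assumes a find that supports -delete (GNU find does), file names without newlines, and the function name time_delete is purely illustrative.

```shell
# Sketch: delete all regular files below a directory and report how many
# were removed and how long it took. "time_delete" is a hypothetical name.
time_delete() {
    local dir="$1"
    local start end count
    start=$(date +%s)
    # -print emits each path just before -delete unlinks it, so wc -l
    # counts the removed files (assumes no newlines in file names).
    count=$(find "$dir" -type f -print -delete | wc -l | tr -d ' ')
    end=$(date +%s)
    echo "deleted $count files in $((end - start)) s"
}
```

Called as `time_delete /path/to/subdir`, it prints a single summary line; running iostat in a second terminal at the same time would show whether the stalls line up with the 100% utilization phases.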
On 2011-01-28 20:02, Moti Levy wrote:
> Have you tried:
> find dirname -type f -exec rm {} \;
>
>
> On Fri, Jan 28, 2011 at 1:46 PM, Joseph Hauptmann
> <joseph at digiconcept.net> wrote:
>
> Hello DRBD-users worldwide...
>
> I've been using DRBD for almost a year now, so far without problems
> that I couldn't resolve myself.
> But now I've run into quite a serious problem, and I'd like to know
> whether anyone else has experienced something similar, with or without
> DRBD (as of course I can't really be sure that DRBD is the problem):
>
> A few months ago a colleague of mine forgot to activate a cronjob
> that deletes a couple of thousand very small temporary files each
> night on a DRBD device. Now I have a directory with, I guess, more
> than a million files, which wouldn't be so bad if rm -rf {dir}/
> could delete it. But sadly that is not the case.
> rm gets stuck after it has deleted a few hundred files and doesn't
> resume operation. Furthermore, all I/O access on the DRBD device
> is completely stuck until the rm process is killed.
>
> I've already disconnected all resources from their peers and shut
> down most of the non-essential services on the machine.
>
> It's running Debian Lenny with
>
> uname -a
> Linux srv1.xxx.at 2.6.26-2-openvz-amd64 #1
> SMP Wed May 12 18:14:56 UTC 2010 x86_64 GNU/Linux
>
> cat /proc/drbd
> version: 8.3.7 (api:88/proto:86-91)
> GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by
> root at srv1.xxx.at, 2010-03-28 21:47:13
> 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
> ns:1875795496 nr:0 dw:225995436 dr:566154981 al:105639961
> bm:11019801 lo:2 pe:0 ua:0 ap:1 ep:1 wo:b oos:1242040
> 1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r----
> ns:0 nr:31796784 dw:31796784 dr:2253416 al:0 bm:1134 lo:0 pe:0
> ua:0 ap:0 ep:1 wo:d oos:0
> 2: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r----
> ns:0 nr:57709884 dw:143774088 dr:8480 al:0 bm:50 lo:0 pe:0 ua:0
> ap:0 ep:1 wo:d oos:0
>
> The filesystem on resource 0 is ext3 with a block size of 4096
> and lies on a SW-RAID5 (far from ideal - I know).
>
>
> At the moment I'm using a bash hack that kills the rm process every 30
> seconds and restarts it as long as the directory still exists.
>
> Thanks for any hints as to what might cause this problem.
>
> Joe
>
> --
> Joseph Hauptmann
>
> /digiconcept/ - GmbH.
> 1080 Wien
> Blindengasse 52/1
>
> Tel. +43 1 218 0 212 - 24
> Fax +43 1 218 0 212 - 10
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
>
>
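
The kill-and-restart workaround described in the quoted message could look roughly like this. It is a sketch under assumptions, not the script actually in use: the function name rm_with_restarts and the one-second polling are illustrative, and only the 30-second interval comes from the description above.

```shell
# Sketch: run rm -rf on a directory, kill it if it has not finished after
# INTERVAL seconds, and start over while the directory still exists.
# Note: loops forever if rm can never succeed (e.g. missing permissions).
rm_with_restarts() {
    local dir="$1"
    local interval="${2:-30}"
    local pid waited
    while [ -d "$dir" ]; do
        rm -rf -- "$dir" &
        pid=$!
        waited=0
        # Poll once per second until rm exits or the interval has elapsed.
        while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$interval" ]; do
            sleep 1
            waited=$((waited + 1))
        done
        # Kill rm if it is still running; the outer loop then restarts it.
        kill "$pid" 2>/dev/null || true
        wait "$pid" 2>/dev/null || true
    done
}
```

Usage would be something like `rm_with_restarts /path/to/dir 30`. This only works around the stalls by giving other I/O a chance between runs; it does not address whatever causes them.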
--
Joseph Hauptmann
/digiconcept/ - GmbH.
1080 Wien
Blindengasse 52/1
Tel. +43 1 218 0 212 - 24
Fax +43 1 218 0 212 - 10