Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
On Tue, 15 Jul 2008, Philipp Reisner wrote:

> Hi Nathan,
>
> Thanks for typing that Call trace... Here is an excerpt from drbd_bitmap.c
> with the line marked where the crash happened.
>
> int drbd_bm_test_bit(struct drbd_conf *mdev, const unsigned long bitnr)
> {
>     unsigned long flags;
>     struct drbd_bitmap *b = mdev->bitmap;
>     int i;
>     ERR_IF(!b) return 0;
>     ERR_IF(!b->bm) return 0;
>
>     spin_lock_irqsave(&b->bm_lock, flags);
>     if (bitnr < b->bm_bits) {
>         i = test_bit(bitnr, b->bm) ? 1 : 0; <=<<==<<<===<<<<====<<<<<===== HERE
>     } else if (bitnr == b->bm_bits) {
>         i = -1;
>     } else { /* (bitnr > b->bm_bits) */
>         ERR("bitnr=%lu > bm_bits=%lu\n", bitnr, b->bm_bits);
>         i = 0;
>     }
>
>     spin_unlock_irqrestore(&b->bm_lock, flags);
>     return i;
> }
>
> You are right with that we should print a nice error message saying
> that something went wrong with the allocation of the bitmap instead of OOPSing
> in that case. As far as I know we do that.

So is it crashing because the set is larger than 4TB, or for some other reason?

> The question here is, why does it not abort with a failed bitmap allocation ?
> Can you provide us the kernel log from just before the crash ?

Absolutely nothing in the logs. :(

> Was the resync already running for some time, or does it crash instantaneously ?

It can happen almost instantaneously, it can take 4 sec, it can take 30 sec.
I thought it had something to do with the size, because it did run without
problems for almost 24 hours. Now I am in this less-than-30-sec mode.

> Are there any chances that you could also provide the upper part of the OOPS ?

I am open to ways to get that, I just don't know how. Since nothing is in the
logs, I had to get the trace by taking a screenshot of the display via the IPMI
card and typing it all in manually.

> Nathan, I do not want to create the impression that it will work for you if
> you help us to fix this. Probably it will then fail for you with a nice
> error message in the kernel log saying that the allocation of the bitmap
> failed...

I am open to helping you guys fix this regardless! Please let me know if there
is anything I can do to help. Also, I am new to git, but I pulled the latest
source and have the same problem.

><>
Nathan Stratton                CTO, BlinkMind, Inc.
nathan at robotics.net         nathan at blinkmind.com
http://www.robotics.net        http://www.blinkmind.com
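
For anyone following the quoted excerpt, here is a minimal userspace sketch of the same checks, not the DRBD code itself. It assumes a plain array of unsigned long words and leaves out the spinlock; the names toy_bitmap, toy_test_bit and toy_bm_test_bit are made up for illustration. It shows the four outcomes the quoted function distinguishes: bitmap missing, bit in range, bitnr exactly at bm_bits, and bitnr past the end.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical, simplified stand-in for struct drbd_bitmap (no lock). */
struct toy_bitmap {
    unsigned long *bm;      /* backing words; NULL if the allocation failed */
    unsigned long bm_bits;  /* number of valid bits in bm */
};

/* Userspace equivalent of a single-bit test on an array of words. */
static int toy_test_bit(unsigned long nr, const unsigned long *words)
{
    return (int)((words[nr / TOY_BITS_PER_LONG] >> (nr % TOY_BITS_PER_LONG)) & 1UL);
}

/*
 * Mirrors the return values of the quoted function:
 *   0  if the bitmap or its backing store is missing (with a message),
 *   0 or 1 for a bit inside the bitmap,
 *  -1  if bitnr is exactly bm_bits,
 *   0  (with a message) if bitnr is beyond bm_bits.
 */
int toy_bm_test_bit(const struct toy_bitmap *b, unsigned long bitnr)
{
    if (b == NULL || b->bm == NULL) {
        fprintf(stderr, "bitmap not allocated\n");
        return 0;
    }
    if (bitnr < b->bm_bits)
        return toy_test_bit(bitnr, b->bm);
    if (bitnr == b->bm_bits)
        return -1;
    fprintf(stderr, "bitnr=%lu > bm_bits=%lu\n", bitnr, b->bm_bits);
    return 0;
}
```

Note that the guards only catch a NULL pointer. If bm is non-NULL but covers fewer bits than bm_bits claims, the in-range branch still reads past the end, in the sketch as in any code of this shape. Whether that is what happens in the crash reported above is not established in this thread.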