Hi all,

We are currently running DRBD 8.2 on RHEL 4.5, with 5 DRBD partitions working together with GFS and multipath. This setup has worked well for the past few months.

Last week we increased the number of partitions to 12 and ran a full sync. An OOM occurred and we had to reboot the whole cluster.

The 2 storage servers have dual Xeon 3GHz CPUs and 2GB RAM. Is that too little to do the full sync? Our target is to increase to 30 partitions; how much memory would be enough for operational needs on the storage servers?

Any suggestions or ideas are welcome. Thanks a lot.

Cheers,
PN

/var/log/messages
...
Jun 12 12:09:02 ista02 kernel: drbd9: conn( Connected -> StartingSyncT ) disk( UpToDate -> Inconsistent )
Jun 12 12:09:02 ista02 kernel: drbd9: Writing meta data super block now.
Jun 12 12:09:02 ista02 kernel: drbd9: writing of bitmap took 145 jiffies
Jun 12 12:09:02 ista02 kernel: drbd9: 4886 MB marked out-of-sync by on disk bit-map.
Jun 12 12:09:02 ista02 kernel: drbd9: 5004052 KB now marked out-of-sync by on disk bit-map.
Jun 12 12:09:02 ista02 kernel: drbd9: Writing meta data super block now.
Jun 12 12:09:02 ista02 kernel: drbd9: conn( StartingSyncT -> WFSyncUUID )
Jun 12 12:09:02 ista02 kernel: drbd9: conn( WFSyncUUID -> SyncTarget )
Jun 12 12:09:02 ista02 kernel: drbd9: Began resync as SyncTarget (will sync 5004052 KB [1251013 bits set]).
Jun 12 12:09:02 ista02 kernel: drbd9: Writing meta data super block now.
Jun 12 12:09:02 ista02 kernel: drbd11: conn( Connected -> StartingSyncT ) disk( UpToDate -> Inconsistent )
Jun 12 12:09:02 ista02 kernel: drbd11: Writing meta data super block now.
Jun 12 12:09:03 ista02 kernel: drbd11: writing of bitmap took 355 jiffies
Jun 12 12:09:03 ista02 kernel: drbd11: 49 GB marked out-of-sync by on disk bit-map.
Jun 12 12:09:03 ista02 kernel: drbd11: 52217616 KB now marked out-of-sync by on disk bit-map.
Jun 12 12:09:03 ista02 kernel: drbd11: Writing meta data super block now.
Jun 12 12:09:03 ista02 kernel: drbd11: conn( StartingSyncT -> WFSyncUUID )
Jun 12 12:09:03 ista02 kernel: drbd11: conn( WFSyncUUID -> SyncTarget )
Jun 12 12:09:03 ista02 kernel: drbd11: Began resync as SyncTarget (will sync 52217616 KB [13054404 bits set]).
Jun 12 12:09:03 ista02 kernel: drbd11: Writing meta data super block now.
Jun 12 12:09:03 ista02 kernel: drbd12: State change failed: Device is diskless, the requesed operation requires a disk
Jun 12 12:09:03 ista02 kernel: drbd12: state = { cs:Connected st:Secondary/Primary ds:Diskless/UpToDate r--- }
Jun 12 12:09:03 ista02 kernel: drbd12: wanted = { cs:StartingSyncT st:Secondary/Primary ds:Inconsistent/UpToDate r--- }
Jun 12 12:09:12 ista02 kernel: drbd10: Resync done (total 60 sec; paused 0 sec; 16732 K/sec)
Jun 12 12:09:12 ista02 kernel: drbd10: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
Jun 12 12:09:12 ista02 kernel: drbd10: Writing meta data super block now.
Jun 12 12:14:11 ista02 kernel: oom-killer: gfp_mask=0xd2
Jun 12 12:14:11 ista02 kernel: Mem-info:
Jun 12 12:14:11 ista02 kernel: Node 0 DMA per-cpu:
Jun 12 12:14:11 ista02 kernel: cpu 0 hot: low 2, high 6, batch 1
Jun 12 12:14:11 ista02 kernel: cpu 0 cold: low 0, high 2, batch 1
Jun 12 12:14:11 ista02 kernel: cpu 1 hot: low 2, high 6, batch 1
Jun 12 12:14:11 ista02 kernel: cpu 1 cold: low 0, high 2, batch 1
Jun 12 12:14:11 ista02 kernel: Node 0 Normal per-cpu:
Jun 12 12:14:11 ista02 kernel: cpu 0 hot: low 32, high 96, batch 16
Jun 12 12:14:11 ista02 kernel: cpu 0 cold: low 0, high 32, batch 16
Jun 12 12:14:11 ista02 kernel: cpu 1 hot: low 32, high 96, batch 16
Jun 12 12:14:11 ista02 kernel: cpu 1 cold: low 0, high 32, batch 16
Jun 12 12:14:11 ista02 kernel: Node 0 HighMem per-cpu: empty
Jun 12 12:14:11 ista02 kernel:
Jun 12 12:14:11 ista02 kernel: Free pages: 12348kB (0kB HighMem)
Jun 12 12:14:12 ista02 kernel: Active:1356 inactive:637 dirty:0 writeback:0 unstable:0 free:3087 slab:14970 mapped:1327 pagetables:702
Jun 12 12:14:12 ista02 kernel: Node 0 DMA free:11836kB min:8kB low:16kB high:24kB active:0kB inactive:0kB present:16384kB pages_scanned:25 all_unreclaimable? yes
Jun 12 12:14:12 ista02 kernel: protections[]: 0 0 0
Jun 12 12:14:12 ista02 kernel: Node 0 Normal free:512kB min:1428kB low:2856kB high:4284kB active:5424kB inactive:2548kB present:2080192kB pages_scanned:10824 all_unreclaimable? yes
Jun 12 12:14:12 ista02 kernel: protections[]: 0 0 0
Jun 12 12:14:12 ista02 kernel: Node 0 HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Jun 12 12:14:12 ista02 kernel: protections[]: 0 0 0
Jun 12 12:14:12 ista02 kernel: Node 0 DMA: 7*4kB 2*8kB 1*16kB 4*32kB 2*64kB 2*128kB 2*256kB 1*512kB 0*1024kB 1*2048kB 2*4096kB = 11836kB
Jun 12 12:14:12 ista02 kernel: Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB
Jun 12 12:14:12 ista02 kernel: Node 0 HighMem: empty
Jun 12 12:14:12 ista02 kernel: Swap cache: add 17896, delete 17629, find 356/595, race 0+0
Jun 12 12:14:12 ista02 kernel: Free swap: 4128184kB
Jun 12 12:14:12 ista02 kernel: 524144 pages of RAM
Jun 12 12:14:12 ista02 kernel: 10212 reserved pages
Jun 12 12:14:12 ista02 kernel: 69354 pages shared
Jun 12 12:14:12 ista02 kernel: 267 pages swap cached
Jun 12 12:14:12 ista02 kernel: Out of Memory: Killed process 4713 (htt_server).
Jun 12 12:14:12 ista02 kernel: oom-killer: gfp_mask=0xd2
Jun 12 12:14:12 ista02 kernel: Mem-info:
Jun 12 12:14:13 ista02 kernel: Node 0 DMA per-cpu:
Jun 12 12:14:13 ista02 kernel: cpu 0 hot: low 2, high 6, batch 1
Jun 12 12:14:13 ista02 kernel: cpu 0 cold: low 0, high 2, batch 1
Jun 12 12:14:13 ista02 kernel: cpu 1 hot: low 2, high 6, batch 1
Jun 12 12:14:13 ista02 kernel: cpu 1 cold: low 0, high 2, batch 1
Jun 12 12:14:13 ista02 kernel: Node 0 Normal per-cpu:
Jun 12 12:14:13 ista02 kernel: cpu 0 hot: low 32, high 96, batch 16
Jun 12 12:14:13 ista02 kernel: cpu 0 cold: low 0, high 32, batch 16
Jun 12 12:14:13 ista02 kernel: cpu 1 hot: low 32, high 96, batch 16
Jun 12 12:14:13 ista02 kernel: cpu 1 cold: low 0, high 32, batch 16
Jun 12 12:14:13 ista02 kernel: Node 0 HighMem per-cpu: empty
Jun 12 12:14:13 ista02 kernel:
Jun 12 12:14:13 ista02 kernel: Free pages: 12348kB (0kB HighMem)
Jun 12 12:14:13 ista02 kernel: Active:1010 inactive:983 dirty:0 writeback:0 unstable:0 free:3087 slab:15023 mapped:1327 pagetables:662
Jun 12 12:14:13 ista02 kernel: Node 0 DMA free:11836kB min:8kB low:16kB high:24kB active:0kB inactive:0kB present:16384kB pages_scanned:35 all_unreclaimable? yes
Jun 12 12:14:13 ista02 kernel: protections[]: 0 0 0
Jun 12 12:14:13 ista02 kernel: Node 0 Normal free:512kB min:1428kB low:2856kB high:4284kB active:4040kB inactive:3932kB present:2080192kB pages_scanned:15774 all_unreclaimable? yes
Jun 12 12:14:13 ista02 kernel: protections[]: 0 0 0
Jun 12 12:14:13 ista02 kernel: Node 0 HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Jun 12 12:14:13 ista02 kernel: protections[]: 0 0 0
Jun 12 12:14:13 ista02 kernel: Node 0 DMA: 7*4kB 2*8kB 1*16kB 4*32kB 2*64kB 2*128kB 2*256kB 1*512kB 0*1024kB 1*2048kB 2*4096kB = 11836kB
Jun 12 12:14:13 ista02 kernel: Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB
Jun 12 12:14:13 ista02 kernel: Node 0 HighMem: empty
Jun 12 12:14:13 ista02 kernel: Swap cache: add 17904, delete 17637, find 356/599, race 0+0
Jun 12 12:14:13 ista02 kernel: Free swap: 4128900kB
Jun 12 12:14:13 ista02 kernel: 524144 pages of RAM
Jun 12 12:14:13 ista02 kernel: 10212 reserved pages
Jun 12 12:14:13 ista02 kernel: 69095 pages shared
Jun 12 12:14:13 ista02 kernel: 267 pages swap cached
Jun 12 12:14:13 ista02 kernel: Out of Memory: Killed process 4725 (cannaserver).
....
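
For anyone reading the numbers in the log above, here is a minimal sketch that redoes the arithmetic from the resync lines and the first oom-killer report. It is not a DRBD or kernel tool; the helper names (parse_zone_kb, parse_counters_pages, kb_per_bit) are made up for illustration, and the 4 kB page size is an assumption inferred from the log itself (free:3087 pages is reported as 12348kB). It only shows how much memory the listed counters account for; it does not say what holds the rest.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, consistent with free:3087 == 12348kB in the log

# Lines and figures copied from the first oom-killer report and the resync messages.
ZONE_LINE = ("Node 0 Normal free:512kB min:1428kB low:2856kB high:4284kB "
             "active:5424kB inactive:2548kB present:2080192kB "
             "pages_scanned:10824 all_unreclaimable? yes")
COUNTERS_LINE = ("Active:1356 inactive:637 dirty:0 writeback:0 unstable:0 "
                 "free:3087 slab:14970 mapped:1327 pagetables:702")
TOTAL_RAM_PAGES = 524144  # "524144 pages of RAM"
RESYNCS = {"drbd9": (5004052, 1251013), "drbd11": (52217616, 13054404)}  # (KB, bits set)


def parse_zone_kb(line):
    """Collect the name:<n>kB fields of a per-zone line as a dict of ints (kB)."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)kB", line)}


def parse_counters_pages(line):
    """Collect the name:<n> fields of the global counters line as ints (pages)."""
    return {name.lower(): int(value) for name, value in re.findall(r"(\w+):(\d+)", line)}


def kb_per_bit(kb, bits):
    """Amount of out-of-sync data represented by one set bitmap bit."""
    return kb / bits


zone = parse_zone_kb(ZONE_LINE)
counters = parse_counters_pages(COUNTERS_LINE)

total_ram_kb = TOTAL_RAM_PAGES * PAGE_KB
# Sum only the counters that do not obviously overlap ("mapped" is left out).
accounted_kb = PAGE_KB * sum(
    counters[k] for k in ("active", "inactive", "free", "slab", "pagetables"))
unaccounted_kb = total_ram_kb - accounted_kb

for dev, (kb, bits) in RESYNCS.items():
    print(f"{dev}: {kb_per_bit(kb, bits)} KB per set bit")
print(f"Normal zone below min watermark: {zone['free'] < zone['min']}")
print(f"total RAM: {total_ram_kb} kB, accounted for: {accounted_kb} kB, "
      f"not shown in these counters: {unaccounted_kb} kB")
```

Read this way, the Normal zone is under its min watermark (512kB free against min:1428kB) while swap is almost entirely free, and the active, inactive, slab, and pagetable counters together cover only a small part of the 2GB.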