Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Dear DRBD developers, dear DRBD users,

Actually I would be very fond of DRBD -- but unfortunately I have sometimes had data losses (rarely, but I had them).

FOR DEVELOPERS AND USERS:

DRBD versions principally concerned: 9.0.7-1, 9.0.8-1, 9.0.9rc1-1 ("THE VERSIONS").

I think the following configuration options are mandatory to have these data losses:

    net {
        congestion-fill   "1";           # 1 sector [1*)]
        on-congestion     "pull-ahead";
        protocol          "A";
        [... (other options)]
    }

(The goal of these settings: a very slow network connection should not slow down the local disk I/O.)

1*) I think it is sufficient that "congestion-fill" != 0.

The data losses showed as follows (actually only with version 9.0.7-1, but I guess the other versions are also concerned):

* There is a local DRBD device and its (one / only) peer.
* The local DRBD device is set to "primary", the peer device to "secondary".
* The local DRBD device is mounted as an ext3 filesystem.
* The contents of some memory-mapped files on the mounted DRBD device are changed (in total e.g. ca. 400 MB ... 2 GB).
* When the peer device has been synchronized with the local device, some sectors have not been transferred to the peer device (e.g. ca. 4 MB).
  This has been verified by md5summing the files before writing them to the DRBD device; then unmounting the local DRBD device, switching it to "secondary", switching the peer device to "primary", mounting the peer device as an ext3 filesystem and verifying the checksums of the files on it.
* The following is also essential: I have set 'csums-alg "md5"', so only sectors which differ between the local device and the peer device are transferred.
* Also, when after such a data loss I invalidate the peer device and force a total resynchronization of the peer device with the local device, the missing sectors show up in the status display as (ca.) "received: 4133". (So I know that about 4 MB are missing. [Only the differing sectors are transferred, see above.])

MOSTLY FOR DEVELOPERS (AND INTERESTED USERS):

* Every time I have had such a data loss, I have seen the following message in the system messages: "[...] SyncSource still sees bits set!! FIXME". (But I have not had a data loss every time I have seen this message.)

  (Source code, "drbd/drbd_receiver.c"; version 9.0.7-1: lines 6258 ff.; version 9.0.8-1: lines 6249 ff.:
  "
      /* TODO: Since DRBD9 we experience that SyncSource still has
         bits set... NEED TO UNDERSTAND AND FIX! */
      if (drbd_bm_total_weight(peer_device) > peer_device->rs_failed)
          drbd_warn(peer_device, "SyncSource still sees bits set!! FIXME\n");
  ")

* I strongly suppose that my data losses are related to those bits that are still set.

* In the following explanation I refer to the source code of version 9.0.7-1.

* Supposed Explanation:
  * Suppose, as the warning mentioned above says, some bits of the bitmap are set and stay set.
  * In "drbd/drbd_req.c::drbd_process_write_request()" at line 1404, because of my 'mandatory settings', sooner or later "drbd_should_do_remote()" returns "false"; so "remote == false".
  * Then, at line 1422, the following branch is taken:
    "
        } else if (drbd_set_out_of_sync([...]))
            _req_mod(req, QUEUE_FOR_SEND_OOS, peer_device);
        }
    "
  * If the request "req" now writes to sectors for which the bits in the bitmap are already set, "drbd_set_out_of_sync()" returns (a count of) "0", because already-set bits are not counted by the functions called by "drbd_set_out_of_sync()".
  * So the branch "_req_mod(req, QUEUE_FOR_SEND_OOS, peer_device);" is not taken; and the sectors will never be transmitted to the peer?
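* As an illustration, a minimal stand-alone model of that branch (my own simplified sketch, not the real DRBD code; it only assumes one bit per 4 KiB block and that set_out_of_sync() returns the number of newly set bits, which is what "drbd_set_out_of_sync()" appears to do):

  "
  #include <stdio.h>
  #include <stdbool.h>

  #define BLOCKS 16
  static bool oos_bitmap[BLOCKS];      /* stand-in for the out-of-sync bitmap */

  /* Returns how many bits changed from 0 to 1 (already-set bits count as 0). */
  static int set_out_of_sync(int block)
  {
      if (oos_bitmap[block])
          return 0;
      oos_bitmap[block] = true;
      return 1;
  }

  /* Models _req_mod(req, QUEUE_FOR_SEND_OOS, peer_device): the peer is told
     that this block is out of sync, so it can be resynchronized later. */
  static void queue_for_send_oos(int block)
  {
      printf("block %2d: out-of-sync notification queued for the peer\n", block);
  }

  /* Models the local-only write path taken when drbd_should_do_remote()
     returns false (protocol A, congestion pull-ahead). */
  static void process_write(int block)
  {
      if (set_out_of_sync(block))
          queue_for_send_oos(block);
      else
          printf("block %2d: bit already set, the peer is NOT notified\n", block);
  }

  int main(void)
  {
      oos_bitmap[5] = true;   /* leftover bit, as after "SyncSource still
                                 sees bits set!!" */
      process_write(4);       /* fresh block: peer gets notified            */
      process_write(5);       /* block with leftover bit: write stays local */
      return 0;
  }
  "

  In this model the write to block 4 queues a notification for the peer, while the write to block 5 (whose bit was already set) only prints "bit already set, the peer is NOT notified"; i.e. a write that hits a leftover bit never produces an out-of-sync notification for the peer.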
* So the branch "_req_mod(req, QUEUE_FOR_SEND_OOS, peer_device);" is not taken; and the sectors will never be transmitted to the peer? * I also observed the following (for all "THE VERSIONS"; I have also tried Version 8.4.10-1 with this, and it works correctly [the out-of sync-bios are completely synchronized]): * I have a test-resource with equivalent settings to the devices, where I have had data-losses (local-device: primary, peer-device: secondary); without filesystem. * I write by "dd if=/dev/urandom of=/dev/drbd6 bs=4096 count=1000" to the local device "/dev/drbd6". * Then when the writing has finished, I got (among others) the following status-messages for the local-device: "received:0 sent:1564 out-of-sync:4000 pending:0 unacked: 0" * The "out-of-sync"-count remains (the local and the peer device are never synchronized). * After a "drbdadm down [...]"-"drbdadm up [...]" sequence the "out-of- sync-count" is still "4000". (Why do 'you' not synchronize automati- cally two 'corresponding' devices out of sync after restarting?) * When I write out the metadata of the local device, I get the following: " [...] bitmap[0] { # at 0kB 12 times 0xFFFFFFFFFFFFFFFF; 0xFFFFFFFFFFFFFFFF; 0xFFFFFFFFFFFFFFFF; 0xFFFFFFFFFFFFFFFF; 0x000000FFFFFFFFFF; 65520 times 0x0000000000000000; } # bits-set 1000; "; so 'the "4000" out of sync' are persistent. * When I write (with the 'the "4000" out of sync' set) by "dd if=/dev/urandom of=/dev/drbd6 bs=4096 count=1000" again to the local device "/dev/drbd6"; I get e.g (among others) the following status-messages for the local-device: "received:0 sent:1432 out-of-sync:4000 pending:0 unacked: 0", then * only the first "1432" KBs have been transferred to the peer- device. * the requests "req" of the not transferred sectors are only and only the requests, for which (at the corresponding branches) "drbd_set_out_of_sync()" returns (a count of) "0". I am longing for a perfectly working DRBD, Sincerely Thomas Bruecker --------------------------------------------------------------------------- Appendix: * Proceedings to get the above propositions: I have logged for every request "req" 'passing through drbd' the start- sector# of the "bio" in the request. I have tagged the start-sector# according to 'the location / function', where I have logged the request / "bio"; e.g. a request where "drbd_set_out_of_sync()" returned (a count of) "0" produced e.g. the following output: "dcs2:72" (sector# 72). 
* Configuration files:

  * Devices with data loss:
    "
    resource EF1C0E32-3CB0-11DB-B6E3-0000C00A45A9.RESOURCE {
        net {
            congestion-fill   "1";
            csums-alg         "md5";
            on-congestion     "pull-ahead";
            protocol          "A";
            transport         "tcp";
        }
        on ico {
            address   ipv4 10.235.1.88:7789;
            node-id   1;
            volume 0 {
                device      "/dev/drbd0";
                disk        "/media/byUuid/EF1C0E32-3CB0-11DB-B6E3-0000C00A45A9.N1.V0.BASE";
                meta-disk   "internal";
            }
        }
        on xxxxx {
            address   ipv4 192.168.250.6:7789;
            node-id   0;
            volume 0 {
                device      "/dev/drbd0";
                disk        "/media/byUuid/EF1C0E32-3CB0-11DB-B6E3-0000C00A45A9.N0.V0.BASE";
                meta-disk   "internal";
            }
        }
    }
    "

  * Test resource:
    "
    resource EF1C0E32-3CB0-11DB-B6E3-0000C00A45A9.TEST3.RESOURCE {
        net {
            congestion-fill   "1";
            csums-alg         "md5";
            on-congestion     "pull-ahead";
            protocol          "A";
            transport         "tcp";
        }
        on xxxxx {
            address   ipv4 192.168.250.6:7792;
            node-id   0;
            volume 0 {
                device      "/dev/drbd6";
                disk        "/media/byUuid/EF1C0E32-3CB0-11DB-B6E3-0000C00A45A9.TEST3.N0.V0.BASE";
                meta-disk   "internal";
            }
        }
        on build-centOS-6-x.int.thomas-r-bruecker.ch {
            address   ipv4 10.235.1.42:7792;
            node-id   1;
            volume 0 {
                device      "/dev/drbd6";
                disk        "/media/byUuid/EF1C0E32-3CB0-11DB-B6E3-0000C00A45A9.TEST3.N1.V0.BASE";
                meta-disk   "internal";
            }
        }
    }
    "
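* Cross-check of the out-of-sync numbers from the test above (my own small stand-alone program, not part of the original test; it assumes DRBD's usual granularity of 4 KiB of data per bitmap bit and that the "out-of-sync" counter is reported in KiB):

  "
  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
      /* The non-zero words of the dumped bitmap: 12 + 3 words of all ones,
         then one partial word; the 65520 zero words contribute nothing. */
      uint64_t words[16];
      int i, bits = 0;

      for (i = 0; i < 15; i++)
          words[i] = 0xFFFFFFFFFFFFFFFFULL;
      words[15] = 0x000000FFFFFFFFFFULL;

      /* Count the set bits. */
      for (i = 0; i < 16; i++)
          for (uint64_t w = words[i]; w != 0; w >>= 1)
              bits += (int)(w & 1);

      /* 1000 bits * 4 KiB per bit = 4000 KiB */
      printf("bits-set %d => %d KiB out of sync\n", bits, bits * 4);
      return 0;
  }
  "

  This prints "bits-set 1000 => 4000 KiB out of sync", which matches the "# bits-set 1000" in the metadata dump and the "out-of-sync:4000" in the status output: dd wrote 1000 blocks of 4096 bytes, i.e. exactly one bitmap bit per block.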