Hi everyone,

I have a two-node HA cluster using DRBD for data replication. Filesystems are all ext3 in ordered-write mode, running DRBD v0.6.12 on Linux 2.4.26 and Debian GNU/Linux 3.0r2, with DRBD taken directly from <http://fsrc.csee.wvu.edu/debian/apt-repository>.

The two machines are connected to each other via GbE with a crossover cable. Each has two physical hardware SCSI RAID 10 arrays, with two DRBD devices set up, one for each array. One DRBD device is low-traffic and used for shared configs. The other is used heavily for the PostgreSQL database.

During synchronization, and whenever there is heavy database I/O, I get a lot of the following messages in my logs on the primary node (interestingly, the secondary node doesn't complain at all):

  kernel: drbd1: transferlog too small!!
  kernel: drbd1: tl messed up!
  kernel: drbd1: Epoch set size wrong!!found=192 reported=191

I searched the archives and found that this basically means I need to tune my drbd.conf file, and that it isn't something critical. Is this correct?

Would anyone know of a general tuning and optimization guide for DRBD? Or would anyone be able to spare some time to comment on my DRBD configuration file?

I'm also curious: what's the most reliable way of finding the value to put in disk-size? Or can it be omitted for configurations where the partitions on both sides are exactly the same size?

My configuration is as follows:

  resource drbd0 {
    protocol = C
    fsckcmd = fsck -p -y
    inittimeout = 60

    disk {
      do-panic
      disk-size = 61522304k
    }

    net {
      sndbuf-size = 1M
      sync-nice = -20
      sync-min = 4M
      sync-max = 600M
      tl-size = 5000
      timeout = 60
      connect-int = 10
      ping-int = 10
      ko-count = 4
    }

    on node1 {
      device = /dev/nb0
      disk = /dev/cciss/c0d0p5
      address = 192.168.4.1
      port = 7788
    }

    on node2 {
      device = /dev/nb0
      disk = /dev/cciss/c0d0p5
      address = 192.168.4.2
      port = 7788
    }
  }

  resource drbd1 {
    protocol = C
    fsckcmd = fsck -p -y
    inittimeout = 60

    disk {
      do-panic
      disk-size = 70005836k
    }

    net {
      sndbuf-size = 1M
      sync-nice = -20
      sync-min = 4M
      sync-max = 600M
      tl-size = 5000
      timeout = 60
      connect-int = 10
      ping-int = 10
      ko-count = 4
    }

    on node1 {
      device = /dev/nb1
      disk = /dev/cciss/c0d1p1
      address = 192.168.4.1
      port = 7789
    }

    on node2 {
      device = /dev/nb1
      disk = /dev/cciss/c0d1p1
      address = 192.168.4.2
      port = 7789
    }
  }

Thank you very much! :)

 --> Jijo

-- 
Federico Sevilla III : jijo.free.net.ph : When we speak of free software
GNU/Linux Specialist : GnuPG 0x93B746BE : we refer to freedom, not price.