Hi list,

I have two nodes running SuSE 8.0, set up to replicate 64GB of data through drbd. On top of /dev/nb0 I have ext3 with blocksize=1024 (mke2fs -j -b 1024 /dev/nb0). I serve a Pervasive 8.1 db and samba from that disk.

The versions are as follows:

drbd: 0.6.12
heartbeat: 1.0.3
Linux: (uname -a) Linux nodeb 2.4.18-64GB-SMP #1 SMP Wed Mar 27 13:58:12 UTC 2002 i686 unknown

The two nodes are called nodea and nodeb. Nodea is primary and nodeb is secondary. They replicate through a gigabit ethernet crossover.

If I understand how drbd works: if nodea is primary and I shut down nodeb, then as soon as nodeb boots again, and provided that nodea was not rebooted in the meantime, a quick sync should happen. This is not happening for me. What I do is as follows:

1) boot both nodes; eventually nodea becomes primary and mounts /dev/nb0
2) nodeb is secondary
3) shut down nodeb
4) copy data onto the drbd device or run some query on the db. If I ls /pervasive (mount point of /dev/nb0) I see the files are there and they are ok (using md5 checksumming)
5) boot nodeb again
6) nodeb does not quicksync
7) check that the nodes are not syncing by looking at /proc/drbd
8) make nodeb primary (run heartbeat restart on nodea)
9) ls /pervasive yields unpredictable results: from fs corruption to missing files (usually missing files)

I thought: maybe it is the fs cache, so I put a sync command in the background to run every 5 mins, and in haresources, but I still don't get the QuickSync to happen.

On the contrary, if I do not reboot the secondary, but only cause a failover by running /etc/init.d/heartbeat restart on the primary, all data gets migrated just fine. The problem occurs consistently, but only when I reboot the secondary.

Configuration and dmesg log are attached. What am I doing wrong?

Many thanks in advance,
Umberto

-------------- next part --------------
resource drbd0 {
  protocol=C
  fsckcmd=/bin/true
  disk {
    disk-size=65776188
    do-panic
  }
  net {
    tl-size = 5000
    sndbuf-size = 1280
    sync-rate=160M # bytes/sec
    timeout=60
    connect-int=10
    ping-int=10
  }
  on nodea {
    device=/dev/nb0
    disk=/dev/sda6
    address=192.168.1.1
    port=7789
  }
  on nodeb {
    device=/dev/nb0
    disk=/dev/sda6
    address=192.168.1.2
    port=7789
  }
}
-------------- next part --------------
following is dmesg log of nodea seeing nodeb reboot

<-- boot of nodea
drbd: initialised. Version: 0.6.12 (api:64/proto:62)
drbd0: Creating state file "/var/lib/drbd/drbd0"
bcm5700: eth0 NIC Link is Down
bcm5700: eth0 NIC Link is Up, 1000 Mbps full duplex
drbd0: Connection established. size=65776188 KB / blksize=4096 B
<-- nodeb is up too
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
IPv6 v0.8 for NET4.0
IPv6 over IPv4 tunneling driver
eth0: no IPv6 routers present
get_hw_addr uses obsolete (PF_INET,SOCK_PACKET)
eth1: no IPv6 routers present
Journalled Block Device driver loaded
drbd0: blksize=1024 B
<-- heartbeat ran datadisk start
kjournald starting. Commit interval 5 seconds
EXT3 FS 2.4-0.9.17, 10 Jan 2002 on drbd(43,0), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
<-- here I shut down nodeb manually
drbd0: Connection lost.
bcm5700: eth0 NIC Link is Down
bcm5700: eth0 NIC Link is Up, 100 Mbps full duplex
bcm5700: eth0 NIC Link is Down
bcm5700: eth0 NIC Link is Up, 1000 Mbps full duplex
bcm5700: eth0 NIC Link is Down
bcm5700: eth0 NIC Link is Up, 1000 Mbps full duplex
bcm5700: eth0 NIC Link is Down
<--- network is eventually up again, the two nodes reconnect to each other
bcm5700: eth0 NIC Link is Up, 1000 Mbps full duplex
drbd0: Connection established. size=65776188 KB / blksize=1024 B
<-- this is strange: 0 blks???
drbd0: Synchronisation started blks=0
drbd0: Synchronisation done.
drbd0: blksize=1024 B
-------------- next part --------------
nodea 192.168.100.20 sync datadisk::drbd0 sync smb psql
#
# Sync is my homegrown script. It just calls sync.
#
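
The suspicious "Synchronisation started blks=0" line in the nodea dmesg log above is easy to miss when reading by eye. A minimal sketch of a filter that flags it follows. It assumes the message wording is exactly as in the log excerpt in this post (other drbd versions may print something different), and the function and script names are hypothetical, not part of drbd or heartbeat.

```python
import re
import sys

# Matches the sync-start message exactly as it appears in the dmesg excerpt,
# e.g. "drbd0: Synchronisation started blks=0". The wording is an assumption
# taken from that one log; other drbd versions may differ.
SYNC_START = re.compile(r"^(drbd\d+): Synchronisation started blks=(\d+)")


def find_sync_starts(lines):
    """Return a list of (device, blocks) for every sync-start message."""
    events = []
    for line in lines:
        match = SYNC_START.match(line.strip())
        if match:
            events.append((match.group(1), int(match.group(2))))
    return events


def empty_syncs(lines):
    """Return the devices whose resync started with zero blocks to transfer."""
    return [dev for dev, blks in find_sync_starts(lines) if blks == 0]


if __name__ == "__main__":
    # Read kernel log text from standard input and report zero-block resyncs.
    for dev in empty_syncs(sys.stdin):
        print(f"{dev}: resync started with blks=0")
```

It could be run as `dmesg | python3 check_sync.py` on the primary after the secondary reconnects (check_sync.py being whatever name the script is saved under). A hit means only that drbd believed there was nothing to transfer; it does not say why.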