[DRBD-user] Fsync problem with DRBD

Sylwester Zelazko sylwester.zelazko at polcardservice.pl
Thu Nov 10 16:12:31 CET 2005

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


Hello, 

        I've got a strange problem with DRBD using protocol C and fsync() 
on files. An fsync() on the primary machine takes about 2-4 seconds, and 
what is strange to me is that this happens only on the /var filesystems 
of the servers which keep their spools there, so I guess it is connected 
with the rate of changes on that filesystem. 
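
To show exactly what I measure, here is roughly the test I use: a minimal 
sketch in C which writes one 4 KB block and times the fsync() call (the 
file name /var/spool/fsync-test is only an example; point it at the slow 
mount). 

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

int main(void)
{
    char buf[4096];
    struct timeval t0, t1;
    /* example path; put it on the affected filesystem */
    int fd = open("/var/spool/fsync-test", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    memset(buf, 'x', sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != sizeof(buf)) { perror("write"); return 1; }

    gettimeofday(&t0, NULL);
    if (fsync(fd) < 0) { perror("fsync"); return 1; }  /* the slow call */
    gettimeofday(&t1, NULL);

    printf("fsync took %.3f s\n",
           (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6);
    close(fd);
    return 0;
}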

Here is the configuration of one of the servers: 

global {
    minor-count 6;
    dialog-refresh 1;
}
resource var {
  protocol C;
  startup {
    degr-wfc-timeout 120;    # 2 minutes.
  }
  disk {
    on-io-error   detach;
  }
  net {
    on-disconnect reconnect;
  }
  syncer {
    rate 680M;
    group 1;
    al-extents 257;
  }
  on primary {
    device     /dev/drbd0;
    disk       /dev/ida/c0d0p5;
    address    1.1.1.1:7788;
    meta-disk  internal;
  }
  on secondary {
    device    /dev/drbd0;
    disk      /dev/cciss/c0d0p5;
    address   1.1.2.2:7788;
    meta-disk internal;
  }
}
resource usr {
  protocol C;
  startup {
    degr-wfc-timeout 120;    # 2 minutes.
  }
  disk {
    on-io-error   detach;
  }
  net {
    on-disconnect reconnect;
  }
  syncer {
    rate 680M;
    group 1;
    al-extents 257;
  }
  on primary {
    device     /dev/drbd1;
    disk       /dev/ida/c0d0p6;
    address    1.1.1.1:7789;
    meta-disk  internal;
  }
  on secondary {
    device    /dev/drbd1;
    disk      /dev/cciss/c0d0p6;
    address   1.1.2.2:7789;
    meta-disk internal;
  }
}

resource home {
  protocol C;
  startup {
    degr-wfc-timeout 120;    # 2 minutes.
  }
  disk {
    on-io-error   detach;
  }
  net {
    on-disconnect reconnect;
  }
  syncer {
    rate 680M;
    group 1;
    al-extents 257;
  }
  on primary {
    device     /dev/drbd2;
    disk       /dev/ida/c0d1p1;
    address    1.1.1.1:7790;
    meta-disk  internal;
  }
  on secondary {
    device    /dev/drbd2;
    disk      /dev/cciss/c0d0p7;
    address   1.1.2.2:7790;
    meta-disk internal;
  }
}

Current /proc/drbd on primary machine: 
version: 0.7.10 (api:77/proto:74)
SVN Revision: 1743 build by root at debian, 2004-11-11 15:10:16
 0: cs:Connected st:Primary/Secondary ld:Consistent
    ns:153776 nr:0 dw:1009555868 dr:85196925 al:1947 bm:1520 lo:0 pe:0 ua:0 ap:0
 1: cs:Connected st:Primary/Secondary ld:Consistent
    ns:419888 nr:0 dw:33664956 dr:16916101 al:15 bm:431 lo:0 pe:0 ua:0 ap:0
 2: cs:Connected st:Primary/Secondary ld:Consistent
    ns:5757168 nr:0 dw:638144400 dr:626179845 al:308612 bm:2781 lo:0 pe:0 ua:0 ap:0

The only thing all these servers have in common is a high DW count. I know 
it is high because of the many changes on that filesystem. AP never gets 
big, and the filesystem is not under heavy use when I fsync a freshly 
written file. Fsyncing on the other DRBD mount points of that server is 
fast, as it should be. 
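
If it helps, this is roughly how I watch those counters: a small C sketch 
that polls /proc/drbd a few times per second and prints the counter line 
of every device, so one can see whether pe/ap climb while the slow fsync 
is running. 

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char line[256];
    for (;;) {
        FILE *f = fopen("/proc/drbd", "r");
        if (!f) { perror("fopen /proc/drbd"); return 1; }
        while (fgets(line, sizeof(line), f))
            if (strstr(line, "ap:"))   /* the per-device counter line */
                fputs(line, stdout);
        fclose(f);
        fputs("--\n", stdout);
        usleep(250000);                /* sample four times per second */
    }
    return 0;
}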

As this is a production environment, I could only rebuild the DRBD disk on 
the secondary server, but if it is needed for the debugging process I can 
switch off the primary during off hours. 
As a quick workaround I set protocol B on that var mount point, but I am 
not satisfied with this. Could anyone explain why this happens and how to 
avoid it? 
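
For reference, the workaround is nothing more than the protocol line in 
the var resource; with protocol B a write is acknowledged as soon as the 
peer has received it, not when it has reached the peer's disk: 

resource var {
  protocol B;   # was C; ack on remote receipt instead of remote disk write
  # ... the rest of the resource stays unchanged ...
}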

Best regards,

-- 
Sylwester Żelazko
PolCard S.A. (Systems Administration Team)
(22)-515-38-04