[Csync2] Keeping files in sync on Debian 8.x

Martin Bähr mbaehr at email.archlab.tuwien.ac.at
Fri Sep 4 13:05:13 CEST 2015


Excerpts from Rabin Yasharzadehe's message of 2015-09-04 07:15:52 +0200:
> How many files and size are we talking about ?
> one way is to reduce the time between the syncs.

i have a similar problem, and i am running csync in a loop, that is, as soon as
it's done, it starts again with only a 1 minute delay.

i have done some number crunching on our logs and found that the average
handling time per dirty file is 400ms.
while we only had hundreds of dirty files, this meant a few runs per hour.
but as the number of changed files grew into the thousands and tens of
thousands, we now have csync running only a few times per day because each
run takes several hours:
        ----- total ----  avg per run  per
month   runtime    files  time  files file
            (s)            (s)        (ms)
2014-05 2091583 19203989  1570  14417  108
2014-06 2616801  8424713  1900   6118  310
2014-07 2485303  4622864  2124   3951  537
2014-08 2627817  4715252  2327   4176  557
2014-09 2591575  3579943  2093   2891  723
2014-10 2678521  4376318  2633   4303  612
2014-11 2755648  9627775 10477  36607  286
2014-12 3001554  9829024 30628 100296  305
2015-01 2204490  5739876 68890 179371  384
2015-02 2373900  2743593 38916  44976  865
2015-03 2320890 15195710 18131 118716  152
2015-04 2557158  4119004 15688  25269  620
2015-05 2679586  5888131 17176  37744  455
2015-06 2474713  5971289 15466  37320  414
2015-07 2685188  6891218 20342  52206  389
2015-08 2589799  3693746 17984  25651  701
2015-09  283830   228472 14938  12024 1242
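
the derived columns can be recomputed with a few lines of awk. this is only a
sketch: it assumes a hypothetical input with one line per run in the form
"month runtime-in-seconds dirty-files", and the function name monthly_summary
is made up for illustration. as a sanity check on the table, the 2015-01 row
gives 179371 files x 384 ms, roughly 68900 seconds, or about 19 hours per run.

```shell
#!/bin/bash
# sketch: aggregate per-run numbers into per-month rows.
# assumed input on stdin (hypothetical format), one line per run:
#   <month> <runtime in seconds> <number of dirty files>
# output columns:
#   month, total runtime (s), total files,
#   average runtime per run (s), average files per run, time per file (ms)
monthly_summary() {
    awk '{
        runs[$1]++          # number of runs in this month
        secs[$1]  += $2     # total runtime in seconds
        files[$1] += $3     # total dirty files
    } END {
        for (m in runs) {
            # milliseconds per dirty file, guarded against months with no files
            per_file = files[m] ? int(secs[m] * 1000 / files[m]) : 0
            printf "%s %d %d %d %d %d\n", m, secs[m], files[m],
                   secs[m] / runs[m], files[m] / runs[m], per_file
        }
    }' | sort   # awk does not guarantee the order of "for (m in ...)"
}
```

with two runs in one month of 100 s / 200 files and 300 s / 800 files, this
prints a per-file time of 400 for that month.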

the total runtime each month is more or less the same, there were some
downtimes here or there that account for larger differences.
the average runtime jumps at the end of the year, which is where we have most
of our sales.

what i don't yet understand is how we are actually handling more files, unless
maybe more files are being created and deleted before csync sees them.

or am i counting the wrong values?

the loop i am running is this:
# find the highest existing log suffix (rotated logs are named csync-xv.log,<n>)
l=$(ls /var/log/sync/csync-xv.log,* 2>/dev/null | awk -F, '$NF+0 > max { max = $NF+0 } END { print max+0 }')
while :; do
    l=$((l+1))
    # rotate the log of the previous run, then start a new one with a timestamp
    mv /var/log/sync/csync-xv.log "/var/log/sync/csync-xv.log,$l"
    date > /var/log/sync/csync-xv.log
    csync2 -xv -N target-server >> /var/log/sync/csync-xv.log 2>&1
    sleep 60
done

so i am creating a logfile for each run.

to get the above numbers i pull the timestamp at the beginning of each log
(the start of the run) and the modification time of the log file from ls -l
(the end of the run), and then for each log file i count the
"Marking file as dirty" lines.
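
roughly, that extraction can be sketched like this. it assumes GNU date and
GNU stat (as on debian) and that the first line of each log is the output of
date, as written by the loop above. the function name log_stats is made up
for illustration, and stat is used here in place of parsing ls -l.

```shell
#!/bin/bash
# sketch: per-run statistics from one rotated csync2 log.
# prints: <start time as epoch seconds> <runtime in seconds> <dirty files>
log_stats() {
    local log=$1 start end dirty
    # line 1 of the log is the `date` output written when the run started
    start=$(date -d "$(head -n 1 "$log")" +%s) || return 1
    # the log is last written when csync2 finishes, so its mtime marks the end
    end=$(stat -c %Y "$log")
    # one "Marking file as dirty" line is counted per dirty file
    dirty=$(grep -c 'Marking file as dirty' "$log")
    echo "$start $((end - start)) $dirty"
}

# usage: one output line per run, for all rotated logs
# for f in /var/log/sync/csync-xv.log,*; do log_stats "$f"; done
```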

regardless, a total runtime of several hours for each run is simply too slow.
it doesn't look like this can be improved, so i guess we have outgrown
csync's use case and hence we are now looking at alternative solutions like
glusterfs.

greetings, martin.

-- 
eKita                   -   the online platform for your entire academic life
-- 
chief engineer                                                       eKita.co
pike programmer      pike.lysator.liu.se    caudium.net     societyserver.org
secretary                                                      beijinglug.org
mentor                                                           fossasia.org
foresight developer  foresightlinux.org                            realss.com
unix sysadmin
Martin Bähr          working in china        http://societyserver.org/mbaehr/

