[Csync2] csync2 prefix mapping ==> I/O Error 'No such file or directory' in rsync-check
Samba
saasira at gmail.com
Sat Jun 2 02:39:37 CEST 2012
Sorry for flooding the list, but I think this problem could be worked around
if there were a way to specify the temporary directory for csync2.
However, a "tempdir /opt/temp" setting placed outside the "group" block in
csync2.cfg is either silently ignored, or its value is not used by the
rsync_check function.
Here is the config file:
tempdir /opt/temp;
group default
{
host master;
host (slave);
key /etc/csync2/csync2.key;
include /opt/test;
auto left;
}
Here is the confirmation for the above assumption:
Opening basis_file and sig_file..
ERROR: Could not open result from
tempnam(/opt/test/mgmt/tm/.tmclear.log.XXXXXX)!
I/O Error 'No such file or directory' in rsync-check:
/opt/test/mgmt/tm/tmclear.log
ERROR: Could not open result from
tempnam(/opt/test/mgmt/tm/.tmsecurity.log.25852)!
It looks like rsync_check or rsync_patch calls the "open_temp_file"
function, which in turn calls the "get_tmpname" function to get the name of
the temporary file, from within which we get the above error message that it
could not open the temporary file. There is a catch, however: "get_tmpname"
does not actually call the "tempnam" library function. Instead, it tries to
create a temporary file in the directory where the original file was
located.
Since that directory has been deleted, csync2 cannot create a temporary
file under it, and it does not use the "tempdir" configured in the csync2
configuration file either. The log message is also misleading, since
"get_tmpname" never calls "tempnam".
The call to the "tempnam" library function is made only in the
"paranoid_tmpfile" function, but I'm not sure how I can force csync2 to use
that method for creating temporary files instead of creating them in the
same directory as the file being modified or deleted.
Alternatively, "get_tmpname" could walk up the directory tree one level at
a time, checking whether the parent of the current directory [the one being
deleted or updated] still exists [it may have been deleted as well], until
it reaches the root of the directory tree configured for replication, and
create the temp file in the first parent directory that exists. But I guess
it would be easier to maintain and faster to execute if we just used the
"tempnam" function to create the files in the "tempdir" specified in the
config file.
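The parent-directory lookup described above could look roughly like this in
C. This is only a sketch under assumptions: "nearest_existing_dir" is a
hypothetical helper, not a function from the csync2 source, and it assumes
the replication root is passed in without a trailing slash.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper, NOT part of csync2: copy into `out` the deepest
 * directory above `path` that still exists, without climbing above `root`.
 * `root` is assumed to have no trailing slash.
 * Returns 0 on success, -1 if no usable directory was found. */
static int nearest_existing_dir(const char *path, const char *root,
                                char *out, size_t outlen)
{
    struct stat st;
    size_t rootlen;

    assert(path != NULL && root != NULL && out != NULL);
    rootlen = strlen(root);

    /* The file must lie strictly below the replication root. */
    if (strncmp(path, root, rootlen) != 0 || path[rootlen] != '/')
        return -1;
    if (strlen(path) >= outlen)
        return -1;
    strcpy(out, path);

    for (;;) {
        char *slash = strrchr(out, '/');

        /* Stop once stripping another component would leave the root. */
        if (slash == NULL || (size_t)(slash - out) < rootlen)
            return -1;
        *slash = '\0'; /* drop the last path component */

        if (stat(out, &st) == 0 && S_ISDIR(st.st_mode))
            return 0;  /* first directory that still exists */
    }
}
```

For /opt/test/mgmt/tm/tmclear.log with root /opt/test, this tries
/opt/test/mgmt/tm, then /opt/test/mgmt, then /opt/test, and stops at the
first one that exists.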
The patch submitted by Dennis Schafroth seems to take the route of
always executing "paranoid_tmpfile", which may or may not be good depending
on the CPU cost of the "mkstemp" function call. Also note that the patch
ignores the "tempdir" provided by the user in the csync2 config file(s).
Even if "mkstemp" is chosen, care must be taken to create the temp file
under the "tempdir" if one is specified, and otherwise under the last
available parent directory.
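A sketch of that selection in C, again under assumptions: "open_temp_in" is
a hypothetical helper and the ".csync2.XXXXXX" template name is made up for
illustration; neither comes from the csync2 source. Only "mkstemp" itself
is the real POSIX call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, NOT csync2 code: create a unique temp file in
 * `tempdir` when one is configured (non-NULL and non-empty), otherwise in
 * `fallback_dir` (e.g. the last available parent directory).
 * On success returns an open file descriptor and leaves the file name in
 * `name`; returns -1 on failure. */
static int open_temp_in(const char *tempdir, const char *fallback_dir,
                        char *name, size_t namelen)
{
    const char *dir = (tempdir != NULL && *tempdir != '\0') ? tempdir
                                                            : fallback_dir;
    int n;

    assert(name != NULL);
    if (dir == NULL)
        return -1;

    /* Build the mkstemp template; the file name prefix is an assumption. */
    n = snprintf(name, namelen, "%s/.csync2.XXXXXX", dir);
    if (n < 0 || (size_t)n >= namelen)
        return -1;

    /* mkstemp replaces XXXXXX in place and creates and opens the file. */
    return mkstemp(name);
}
```

One thing to keep in mind with a separate "tempdir": if the temporary file
is later renamed over the target file, rename() only works within a single
filesystem, so a tempdir on a different filesystem would need a copy
instead.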
Apart from the above, the patch looks good enough to fix the current issue
and also an issue with backing up of prefix mapped files. I will review,
test and verify the patch in a bit more detail and send my comments in a
couple of days.
Thanks and Regards,
Samba
===============================================================
On Sat, Jun 2, 2012 at 1:58 AM, Samba <saasira at gmail.com> wrote:
>
> Hi Lars,
>
>
> I think this issue also occurs for straight/direct replication to the
> same location on the slave.
>
> It took me some time to narrow down the root cause, but the actual issue
> seems to be this:
>
> When a subdirectory of a parent directory [configured to be replicated] is
> deleted on the primary, we get this issue. Ideally, csync2 is
> supposed to delete that subdirectory on the slave server as well when a sync
> is requested, but instead it throws errors as shown below:
>
> Interestingly, this issue is occurring only when a subdirectory is deleted
> but there seems to be no issue when a file anywhere within or outside that
> subdirectory is deleted.
>
> Note : "/opt/test" is configured to be replicated by csync2
>
> On deleting a subdirectory on master
>
>
> [root at master ~]# csync2 -x
> [root at master ~]# rm -rf /opt/test/mgmt/tm
> [root at master ~]# csync2 -x
> I/O Error 'No such file or directory' in rsync-check:
> /opt/test/mgmt/tm/tmclearmsg.log
>
> While syncing file /opt/test/mgmt/tm/tmtrace.log:
> ERROR from peer(/opt/test/mgmt/tm/tmtrace.log): slave octet-stream 12
> While syncing file /opt/test/mgmt/tm/tmsecurity.log:
> ERROR from peer(/opt/test/mgmt/tm/tmsecurity.log): slave rs6
> I/O Error 'No such file or directory' in rsync-check:
> /opt/test/mgmt/tm/tmoperations.log
>
> While syncing file /opt/test/mgmt/tm/tmaudit.log:
> ERROR from peer(/opt/test/mgmt/tm/tmaudit.log): slave octet-stream 216
> While syncing file /opt/test/mgmt/tm:
> ERROR from peer(/opt/test/mgmt/tm): slave rs6
> Finished with 4 errors.
>
>
>
> On deleting the subdirectory on Slave:
>
>
> [root at slave ~]# rm -rf /opt/test/mgmt/tm
>
> and running csync2 sync command from master:
> [root at master ~]# csync2 -x
> I/O Error 'No such file or directory' in rsync-check:
> /opt/test/mgmt/tm/tmclearmsg.log
> While syncing file /opt/test/mgmt/tm/tmtrace.log:
> ERROR from peer(/opt/test/mgmt/tm/tmtrace.log): slave octet-stream 0
> I/O Error 'No such file or directory' in rsync-check:
> /opt/test/mgmt/tm/tmtecurity.log
> While syncing file /opt/test/mgmt/tm/tmoperations.log:
> ERROR from peer(/opt/test/mgmt/tm/tmoperations.log): slave---
> While syncing file /opt/test/mgmt/tm/tmaudit.log:
> ERROR from peer(/opt/test/mgmt/tm/tmaudit.log): slave octet-stream 0
> Format-error while receiving data.
>
> A workaround is to delete the csync2 metadata and let csync2 regenerate it
> from scratch.
>
> So, after running "rm -rf /var/lib/csync2/*" on both master and slave,
> followed by "csync2 -x", csync2 at least stopped giving errors, but the
> subdirectory that is supposed to be deleted on the slave is still not
> deleted.
>
> Besides these, I noticed that csync2 sometimes produces output with
> weird characters, along with errors similar to the above. I have not yet
> narrowed down that scenario, but I'll produce a test case for that too in
> a day or two.
>
> The second case, where the slave deletes a directory, may not be an
> important use case for many, but the first test case, where a subdirectory
> is deleted on the master, is a common one, and that deletion needs to be
> replicated to the slave server.
>
> Another important observation is that csync2 fails fast on the first
> error it encounters. I think this rationale needs to be revisited, since
> there may be cases like the second test case I pointed out, which logically
> may not be a use case supported by csync2; still, I suppose it is not
> unreasonable to expect the overall replication to complete,
> because there are no additional problems with any of the other files/folders.
> Logical errors like the one mentioned above can be logged and the sync
> process continued, instead of blocking the entire sync on the
> first error it encounters.
>
> Perhaps errors need to be classified as fatal or not. For example, a
> connection error, SSL error, authentication key error, or csync2 lock
> error must be fatal, and the sync process must stop immediately
> without looking further. But logical errors, like not being able
> to replicate a particular file/folder due to permission issues or
> unsupported usage, can be logged while the sync process proceeds.
>
> I hope you look into the issue presented above and offer an
> advice/solution.
>
> Thanks and Regards,
> Samba
>
>
> =========================================================================================================
>
> On Fri, Jun 1, 2012 at 3:01 PM, Samba <saasira at gmail.com> wrote:
>
>> Hi Lars,
>>
>> It looks like the issue is not always reproducible, but there seems to be a
>> pattern which I could not yet identify. csync2 has occasionally failed to
>> replicate some files to alternate locations.
>>
>> Here is the test that I ran as per your suggestion:
>>
>> [root at master ~]# mkdir /opt/test-bet/net
>>> [root at master ~]# touch /opt/test-bet/net/abc
>>> [root at master ~]# csync2 -c
>>> [root at master ~]# csync2 -x
>>> [root at master ~]# csync2 -rf /opt/test-bet/net
>>> [root at master ~]# csync2 -x
>>> [root at master ~]# csync2 -rf /opt/test-bet/net
>>> [root at master ~]# ls /opt/test-bet/net
>>> abc
>>> [root at master ~]# rm -rf /opt/test-bet/net
>>> [root at master ~]# csync2 -x
>>> I/O Error 'No such file or directory' in rsync-check:
>>> /opt/test-bet/net/abc
>>> While syncing file %_opt_test-bet%/net:
>>> ERROR from peer(%_opt_test-bet%/net): slave octet-stream 12
>>> ERROR from peer(<no file>): slave rs6
>>> Finished with 2 errors.
>>
>>
>>
>> Here is my configuration file:
>>
>> group default
>>> {
>>> host master; #primary/master server
>>> host (slave); #secondary/slave server
>>> key /etc/csync2/csync2.key;
>>> include %_opt_test-bet%;
>>> exclude *~ .*;
>>>
>>> backup-directory /var/backups/csync2;
>>> backup-generations 3;
>>> auto left;
>>> }
>>> prefix _opt_test-bet {
>>> on master: /opt/test-bet;
>>> on *: /opt/best;
>>> }
>>
>>
>>
>>
>>
>> Unfortunately, we could not gather logs at that moment and hence are not
>> in a position to confirm whether the failure was also due to the deletion
>> of an existing file on master, or whether there are other cases in which it
>> fails that we have not yet identified.
>>
>> It would be great if you could look into this issue and verify whether the
>> patch submitted by Dennis Schafroth is good enough to fix this issue.
>>
>> I will also test and verify that patch to ensure that we do not
>> introduce any other issues with it.
>>
>> Thanks and Regards,
>> Samba
>>
>>
>> =======================================================
>>
>>
>>
>> On Thu, May 24, 2012 at 1:34 AM, Lars Ellenberg <
>> lars.ellenberg at linbit.com> wrote:
>>
>>> On Mon, May 21, 2012 at 03:12:41PM +0530, Samba wrote:
>>> > Hi Lars,
>>> >
>>> > I'm facing an issue with syncing files to alternate locations via
>>> prefix
>>> > mapping.
>>> >
>>> > Here is my config:
>>> >
>>> > group smgr_core
>>> > {
>>> > host master;
>>> > host (slave);
>>> > key /etc/csync2/csync2.key;
>>> > include %_var_net-snmp%;
>>> > backup-directory /var/backups/csync2;
>>> > backup-generations 3;
>>> > auto left;
>>> > }
>>> >
>>> >
>>> > prefix _var_net-snmp
>>> >
>>> > {
>>> >
>>> > on host[master]: /var/net-snmp;
>>> >
>>> > on *: /opt/trap_listener;
>>> >
>>> > }
>>> >
>>> > Errors:
>>> >
>>> > While syncing file %_var_net-snmp%/snmpd.conf:
>>> >
>>> >
>>> > ERROR from peer(%_var_net-snmp%/snmpd.conf): slave File is also marked
>>> > dirty here!
>>> >
>>> >
>>> > Auto-resolving conflict: Won 'master/slave' test. [but did not sync the
>>> > file]
>>> >
>>> >
>>> > I/O Error 'No such file or directory' in rsync-check:
>>> > /opt/trap_listener/mib_indexes/0
>>> >
>>> >
>>> > While syncing file %_var_net-snmp%/mib_indexes:
>>> >
>>> >
>>> > ERROR from peer(%_var_net-snmp%/mib_indexes): slave octet-stream 36
>>>
>>>
>>> Can you give me a "mkdir ; touch; csync2 -c; rm -r ; csync2 -x"
>>> mini-reproducer?
>>>
>>> > These errors seems to be the same as those mentioned by *Dennis
>>> Schafroth *
>>> >
>>> > in the mail thread given below:
>>> >
>>> > http://lists.linbit.com/pipermail/csync2/2011-May/000758.html
>>> >
>>> > which also contains a patch for the issue.
>>> >
>>> > Can you confirm if the patch has been applied to the source on Git
>>> trunk?
>>>
>>> Seems to have fallen through the cracks.
>>> Sorry about that.
>>>
>>> > If not, are there any plans to commit that patch or do you have a
>>> different
>>> > idea about the fix? I can test the patch verify if it works for me
>>> also but
>>> > it is better to build a stable rpm with the original source from trunk
>>> > rather than internal modifications, hence requesting you to commit the
>>> > patch if you agree with the fix
>>>
>>> I don't know yet if I agree :-/
>>> May be some days until I can actually review/test things
>>> or even try to come up with a better solution, if any.
>>>
>>> But if it works for Dennis, and it works for you,
>>> and it does apparently not break anything else,
>>> I certainly won't refuse to commit it.
>>>
>>>
>>> --
>>> : Lars Ellenberg
>>> : LINBIT | Your Way to High Availability
>>> : DRBD/HA support and consulting http://www.linbit.com
>>> _______________________________________________
>>> Csync2 mailing list
>>> Csync2 at lists.linbit.com
>>> http://lists.linbit.com/mailman/listinfo/csync2
>>>
>>
>>
>