* sync/fsync issues on cifs drives
@ 2012-03-30 17:47 Federico Sauter
From: Federico Sauter @ 2012-03-30 17:47 UTC
To: linux-cifs-u79uwXL29TY76Z2rM5mHXA
Greetings,
I am using an older kernel (2.6.27.57). I have made the following
observations:
Let S be a Windows shared drive that maps to the local directory s (on
Windows) and let Q be another shared drive on the same machine, such
that Q maps to the local directory q and s is the parent of q.
On my machine, I am mounting S and Q separately, so that S is
mounted read-only and Q is mounted with read/write access. I have tried
mounting S and Q with a different user for each as well as with the
same user for both. This does not seem to make a difference (at least
once it seemed to matter, but I could not reproduce that).
Scenario 1:
When I finish my operations, I write the results to Q and perform an
fsync system call on each of the written files.
I have observed that, with a Windows NT4 server, this leads to an error
condition:
fsync failed (11): Resource temporarily unavailable
The kernel log also shows entries similar to:
kernel: CIFS VFS: Write2 ret -11, wrote 9370
kernel: CIFS VFS: No response to cmd 46 mid 58582
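For reference, the userspace side of Scenario 1 can be sketched roughly as
follows. This is a minimal sketch and not the original program: the helper
name write_and_fsync is hypothetical, and only the format of the error
message is taken from the output above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write len bytes from buf to path, then fsync the file.
 * Returns 0 on success or a negative errno value on failure.
 * write_and_fsync is a hypothetical name used for illustration only.
 */
int write_and_fsync(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int fd, err = 0;

    assert(path != NULL && buf != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -errno;

    /* write() may write fewer bytes than requested, so loop. */
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            err = -errno;
            goto out;
        }
        p += n;
        len -= (size_t)n;
    }

    /* On the affected mounts, this is the call that reports EAGAIN (11). */
    if (fsync(fd) < 0) {
        err = -errno;
        fprintf(stderr, "fsync failed (%d): %s\n", -err, strerror(-err));
    }
out:
    if (close(fd) < 0 && err == 0)
        err = -errno;
    return err;
}
```

Calling write_and_fsync() once per result file on the Q mount corresponds to
the sequence described above. A return value of -EAGAIN matches the reported
error.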
Scenario 2:
In a separate process that has nothing to do with the shared drives, a
sync system call stalls for a very long time (possibly hours) before
returning.
I observed this behavior with NT4, and I have reports of it happening
with Windows 2008 R2 Server, but I could *not* reproduce it with Windows
XP PRO or with Windows Server 2003.
---- END of observations ----
Question 1: why does this happen?
Question 2: Is this fixed in a newer kernel release?
Question 3: Is there any way to limit a sync call to a single filesystem?
---- END of questions -----
I went ahead and looked at the kernel code, and the reason for this is
clearly the following code at fs/cifs/file.c:162:
    if (n_iov) {
        /* Search for a writable handle every time we call
         * CIFSSMBWrite2. We can't rely on the last handle
         * we used to still be valid
         */
        open_file = find_writable_file(CIFS_I(mapping->host));
        if (!open_file) {
            cERROR(1, ("No writable handles for inode"));
            rc = -EBADF;
        } else {
            rc = CIFSSMBWrite2(xid, cifs_sb->tcon,
                               open_file->netfid,
                               bytes_to_write, offset,
                               &bytes_written, iov, n_iov,
                               CIFS_LONG_OP);
            atomic_dec(&open_file->wrtPending);
            if (rc || bytes_written < bytes_to_write) {
                cERROR(1, ("Write2 ret %d, wrote %d",
                           rc, bytes_written));
                /* BB what if continued retry is
                   requested via mount flags? */
                if (rc == -ENOSPC)
                    set_bit(AS_ENOSPC, &mapping->flags);
                else
                    set_bit(AS_EIO, &mapping->flags);
            } else {
                cifs_stats_bytes_written(cifs_sb->tcon,
                                         bytes_written);
            }
        }
The return value of CIFSSMBWrite2 is never checked for an EAGAIN condition
(which is what is returned in those cases), so it is handled like any
other write error. I have not experimented with this yet (it may very
well be that any number of retries ends up in the same situation), but I
wanted to know whether modifying this would make sense at all. I am
fairly new to this portion of the kernel code.
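To make the retry idea concrete, here is a small userspace sketch. It is
only an illustration under stated assumptions, not kernel code and not a
proposed patch: retry_on_eagain, write_op_fn and flaky_write are
hypothetical names, and the attempt limit and delay are arbitrary. In the
kernel, the callback would stand in for the CIFSSMBWrite2 call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* A write operation that returns 0 or a negative errno value. */
typedef int (*write_op_fn)(void *ctx);

/*
 * Call op until it returns something other than -EAGAIN, at most
 * max_attempts times, sleeping delay_ms milliseconds between attempts.
 * Returns the result of the last attempt.
 */
int retry_on_eagain(write_op_fn op, void *ctx, int max_attempts, long delay_ms)
{
    struct timespec delay = {
        .tv_sec = delay_ms / 1000,
        .tv_nsec = (delay_ms % 1000) * 1000000L,
    };
    int rc = -EAGAIN;
    int attempt;

    assert(op != NULL && max_attempts > 0 && delay_ms >= 0);

    for (attempt = 0; attempt < max_attempts; attempt++) {
        rc = op(ctx);
        if (rc != -EAGAIN)
            break;              /* success, or an error that is not retried */
        if (attempt + 1 < max_attempts)
            nanosleep(&delay, NULL);
    }
    return rc;
}

/* Stand-in for a slow server: fails with -EAGAIN *ctx times, then succeeds. */
int flaky_write(void *ctx)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return -EAGAIN;
    }
    return 0;
}
```

The bound matters. If the server never answers, the loop still returns
-EAGAIN after max_attempts tries, which is the "any number of retries ends
up in the same situation" case.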
Note: I am not suggesting making a patch out of that idea, I just want
to check whether the idea makes sense.
Thanks in advance for your kind support!
Cheers,
--
Federico Sauter / Firmware developer
Innominate Security Technologies AG / protecting industrial networks
tel: +49.30.921028-210 / fax: +49.30.921028-020
Rudower Chaussee 13 / D-12489 Berlin / http://www.innominate.com/
Register Court: AG Charlottenburg, HR B 81603
Management Board: Dirk Seewald
Chairman of the Supervisory Board: Christoph Leifer
* Re: sync/fsync issues on cifs drives
@ 2012-03-30 20:03 ` Jeff Layton
From: Jeff Layton @ 2012-03-30 20:03 UTC
To: Federico Sauter; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Fri, 30 Mar 2012 19:47:43 +0200
Federico Sauter <fsauter-LVkJPw3T+odGBRGhe+f61g@public.gmane.org> wrote:
> Greetings,
>
>
> I am using an older kernel (2.6.27.57). I have made the following
> observations:
>
> Let S be a Windows shared drive that maps to the local directory s (on
> Windows) and let Q be another shared drive on the same machine, such
> that Q maps to the local directory q and s is the parent of q.
>
> On my machine, I am mounting S and Q separately, so that S is
> mounted read-only and Q is mounted with read/write access. I have tried
> mounting S and Q with a different user for each as well as with the
> same user for both. This does not seem to make a difference (at least
> once it seemed to matter, but I could not reproduce that).
>
> Scenario 1:
> When I finish my operations, I write the results to Q and perform an
> fsync system call on each of the written files.
>
> I have observed that, with a Windows NT4 server, this leads to an error
> condition:
>
> fsync failed (11): Resource temporarily unavailable
>
> The kernel log also shows entries similar to:
>
> kernel: CIFS VFS: Write2 ret -11, wrote 9370
> kernel: CIFS VFS: No response to cmd 46 mid 58582
>
> Scenario 2:
> In a separate process that has nothing to do with the shared drives, a
> sync system call stalls for a very long time (possibly hours) before
> returning.
>
> I observed this behavior with NT4, and I have reports of it happening
> with Windows 2008 R2 Server, but I could *not* reproduce it with Windows
> XP PRO or with Windows Server 2003.
>
> ---- END of observations ----
>
> Question 1: why does this happen?
>
> Question 2: Is this fixed in a newer kernel release?
>
> Question 3: Is there any way to limit a sync call to a single filesystem?
>
> ---- END of questions -----
>
> I went ahead and looked at the kernel code, and the reason for this is
> clearly the following code at fs/cifs/file.c:162:
>
>     if (n_iov) {
>         /* Search for a writable handle every time we call
>          * CIFSSMBWrite2. We can't rely on the last handle
>          * we used to still be valid
>          */
>         open_file = find_writable_file(CIFS_I(mapping->host));
>         if (!open_file) {
>             cERROR(1, ("No writable handles for inode"));
>             rc = -EBADF;
>         } else {
>             rc = CIFSSMBWrite2(xid, cifs_sb->tcon,
>                                open_file->netfid,
>                                bytes_to_write, offset,
>                                &bytes_written, iov, n_iov,
>                                CIFS_LONG_OP);
>             atomic_dec(&open_file->wrtPending);
>             if (rc || bytes_written < bytes_to_write) {
>                 cERROR(1, ("Write2 ret %d, wrote %d",
>                            rc, bytes_written));
>                 /* BB what if continued retry is
>                    requested via mount flags? */
>                 if (rc == -ENOSPC)
>                     set_bit(AS_ENOSPC, &mapping->flags);
>                 else
>                     set_bit(AS_EIO, &mapping->flags);
>             } else {
>                 cifs_stats_bytes_written(cifs_sb->tcon,
>                                          bytes_written);
>             }
>         }
>
> The return value of CIFSSMBWrite2 is never checked for an EAGAIN condition
> (which is what is returned in those cases), so it is handled like any
> other write error. I have not experimented with this yet (it may very
> well be that any number of retries ends up in the same situation), but I
> wanted to know whether modifying this would make sense at all. I am
> fairly new to this portion of the kernel code.
>
I think you might want to look at commit
941b853d779de3298e39f1eb4e252984464eaea8, though that has never really
had much testing in isolation from the conversion to async writes.
> Note: I am not suggesting making a patch out of that idea, I just want
> to check whether the idea makes sense.
>
> Thanks in advance for your kind support!
>
>
> Cheers,
>
2.6.27 is really old at this point...
The short answer here is that the behavior in more recent kernels
(3.x-ish) should be much better. The cifs client now does asynchronous
writes, which speeds things up tremendously. It's also more tolerant of
network problems during writeback.
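On question 3 (limiting a sync to a single filesystem): newer systems
provide syncfs(2), which flushes only the filesystem containing a given
open file descriptor. A minimal sketch follows, assuming Linux 2.6.39 or
later and glibc 2.14 or later, so it is not available on 2.6.27. The
helper name sync_one_fs is hypothetical.

```c
#define _GNU_SOURCE             /* needed for the syncfs() declaration */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Flush only the filesystem that contains path.
 * Returns 0 on success or a negative errno value on failure.
 * sync_one_fs is a hypothetical name used for illustration only.
 */
int sync_one_fs(const char *path)
{
    int fd, err = 0;

    assert(path != NULL);

    /* Any descriptor on the target filesystem works; a directory is fine. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -errno;

    if (syncfs(fd) < 0)
        err = -errno;

    close(fd);
    return err;
}
```

A process that only cares about its local disk could call sync_one_fs() on
a path on that disk instead of calling sync(), so it would not wait for
writeback on the cifs mounts.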
--
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>