From: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Wayne Walker <wwalker-7+hyfkrzchDWTcdHvfGLfFaTQe2KTcn/@public.gmane.org>
Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: Data corruption problem
Date: Fri, 18 Feb 2011 15:45:52 -0500
Message-ID: <20110218154552.7cf091a8@corrin.poochiereds.net>
In-Reply-To: <20110218183003.GF25484-7+hyfkrzchDWTcdHvfGLfFaTQe2KTcn/@public.gmane.org>

On Fri, 18 Feb 2011 12:30:04 -0600
Wayne Walker <wwalker-7+hyfkrzchDWTcdHvfGLfFaTQe2KTcn/@public.gmane.org> wrote:
> On Thu, Feb 10, 2011 at 11:14:59PM -0600, Wayne Walker wrote:
> > First, I'm not certain whether this is samba, the linux cifs driver, or
> > something else.
> >
> > During testing, one of my QA guys was running an in-house program that
> > generates pseudo-random, but fully recreatable, data and writes it to
> > a file. The file is named with a name that is essentially the seed to
> > the pseudo-random stream, so, given a filename, it can read the file
> > and verify that the data is correct.
> ... snip ...
>
> So, my QA guy has repeated the failure 93 times, only from a Linux box, so it appears to definitely be a cifs driver issue.
>
> What can I do to gather useful info? tcpdump on both client and server drops too many packets to be useful.
>
I asked before, but I don't think you ever gave a conclusive answer...
Did the kernel report an error when you did an fsync() or close()? I
suspect that it did, but sadly a lot of programs don't bother to check
for that (usually because they're not really able to deal with it).
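
As a rough illustration of the check being asked about, here is a minimal sketch in C of a writer that looks at the return value of every write(), fsync() and close() call. The helper name write_file_checked() and its error reporting are assumptions made up for this example; nothing here is taken from the in-house test program.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original program): write len bytes
 * to path and report any error from write(), fsync() or close().
 * Returns 0 on success, -1 on the first failure.
 */
int write_file_checked(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        fprintf(stderr, "open(%s): %s\n", path, strerror(errno));
        return -1;
    }

    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            fprintf(stderr, "write: %s\n", strerror(errno));
            close(fd);
            return -1;
        }
        p += n;                     /* short writes are legal, keep going */
        left -= (size_t)n;
    }

    /* Errors from deferred writeback may only be reported here. */
    if (fsync(fd) < 0) {
        fprintf(stderr, "fsync: %s\n", strerror(errno));
        close(fd);
        return -1;
    }

    /* close() can also return an error and must be checked. */
    if (close(fd) < 0) {
        fprintf(stderr, "close: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

A test program that logged a failure from any of these three calls, together with the filename, would show whether the kernel had reported the problem to userspace for the files that later failed verification.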
> From a Linux client (hostname: acorn):
> Feb 17 16:54:30 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
> Feb 17 16:57:10 acorn kernel: CIFS VFS: No response to cmd 47 mid 46382
> Feb 17 16:57:10 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
> Feb 17 16:57:16 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
> Feb 17 16:57:31 acorn kernel: CIFS VFS: No response for cmd 50 mid 46388
> Feb 17 16:59:52 acorn kernel: CIFS VFS: No response to cmd 47 mid 64873
> Feb 17 16:59:52 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
> Feb 17 16:59:53 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
>
Those mean that calls to the server were occasionally timing out.
That's not terribly unusual under heavy load. Until very recently when
that happened, the kernel would treat that like a hard error and would
disconnect the socket.
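
For reading the "ret -11" values in the log above: kernel code conventionally returns a negated errno value, and on most Linux architectures errno 11 is EAGAIN. The small C sketch below decodes such a value; the helper name describe_kernel_ret() is an assumption for illustration and is not part of the cifs driver.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: turn a kernel-style return code (0 or -errno)
 * into a human-readable string using the C library's strerror().
 */
const char *describe_kernel_ret(int ret)
{
    /* Negative values are negated errno codes; 0 means no error. */
    return ret < 0 ? strerror(-ret) : "success";
}
```

On a typical Linux system with glibc, describe_kernel_ret(-11) gives the EAGAIN text, "Resource temporarily unavailable".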
You may want to test something more recent (like 2.6.38-rc5) to see if
the problems go away with that. Since you mention you're using CentOS
you could also open a bug at bugzilla.redhat.com and I'll try to look
at it when I get time.

If you have a RH support contract you may also want to open a support
case with this problem which would allow me to give it more priority.

Cheers,
--
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>