From: Wayne Walker <wwalker-7+hyfkrzchDWTcdHvfGLfFaTQe2KTcn/@public.gmane.org>
To: linux-cifs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: Data corruption problem
Date: Fri, 18 Feb 2011 12:30:04 -0600
Message-ID: <20110218183003.GF25484@solid-constructs.com>
In-Reply-To: <20110211051458.GD27051-7+hyfkrzchDWTcdHvfGLfFaTQe2KTcn/@public.gmane.org>
On Thu, Feb 10, 2011 at 11:14:59PM -0600, Wayne Walker wrote:
> First, I'm not certain whether this is samba, the linux cifs driver, or
> something else.
>
> During testing, one of my QA guys was running an inhouse program that
> generates pseudo-random, but fully recreatable, data and writes it to
> a file, the file is named with a name that is essentially the seed to
> the pseudo-random stream, so, given a filename, it can read the file
> and verify that the data is correct.
... snip ...
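To make the test method concrete, here is a minimal Python sketch of the seed-in-the-filename idea quoted above. It is not the in-house program: the `<seed>_<size>.dat` naming scheme, the chunk size, and the function names are all made up for illustration.

```python
import os
import random

CHUNK = 64 * 1024  # bytes generated per step; an arbitrary choice for this sketch


def expected_chunks(seed, size):
    """Yield the deterministic byte stream for `seed`, `size` bytes in total."""
    rng = random.Random(seed)  # a str seed gives the same stream on every run
    remaining = size
    while remaining > 0:
        n = min(CHUNK, remaining)
        yield rng.getrandbits(8 * n).to_bytes(n, "little")
        remaining -= n


def write_file(directory, seed, size):
    """Write the stream to <directory>/<seed>_<size>.dat (hypothetical naming)."""
    path = os.path.join(directory, f"{seed}_{size}.dat")
    with open(path, "wb") as f:
        for chunk in expected_chunks(seed, size):
            f.write(chunk)
    return path


def verify_file(path):
    """Return None if the file matches its name, else the first bad offset."""
    stem = os.path.basename(path)[:-len(".dat")]
    seed, size_text = stem.rsplit("_", 1)
    offset = 0
    with open(path, "rb") as f:
        for want in expected_chunks(seed, int(size_text)):
            got = f.read(len(want))
            if got != want:
                for i, (a, b) in enumerate(zip(got, want)):
                    if a != b:
                        return offset + i
                return offset + len(got)  # short read: the file is truncated
            offset += len(want)
        if f.read(1):
            return offset  # extra data past the expected end
    return None
```

Because the expected bytes are regenerated from the name alone, the verifier needs no reference copy, and it can run on a different host than the one that wrote the file.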
So, my QA guy has reproduced the failure: 93 corrupt files, all written from a Linux box, so it definitely appears to be a CIFS driver issue.
What can I do to gather useful info? tcpdump on both the client and the server drops too many packets to be useful.
A couple weeks ago, when running my data generator, I ran into a data corruption problem when creating a ~8GB file using `dp'. Based on an analysis that Wayne performed, he concluded that this problem is likely a CIFS/Samba bug. Since then, I created a test environment that now writes data to a disk array from 3 clients (2 Windows & 1 Linux). Yesterday, I ran a job that writes 500GB of data spread across ~11,000 files. I used `dp' to read back each file and verify the data, and it found 93 corrupt files.
Here are the results: http://qatest-sp/ui/index_archive_node.php/results/data_generator_test_detail/89
A couple of things to note:
  * All the corrupt files were created on the Linux host `acorn'. None were from the Windows boxes.
  * The sizes of the corrupt files range from 350K to ~1 GB.
This time, I am able to see additional log messages that I did not see last time (perhaps since I did not reboot the machines).
From the Samba server (CentOS 5.5 samba-3.0.33-3.29.el5_5.1, hostname: snape):
[2011/02/17 18:20:41, 0] lib/util_sock.c:write_data(562)
write_data: write failure in writing to client 192.168.20.155. Error Broken pipe
[2011/02/17 18:20:41, 0] lib/util_sock.c:send_smb(761)
Error writing 55 bytes to client. -1. (Broken pipe)
[2011/02/17 18:20:41, 1] smbd/service.c:close_cnum(1274)
192.168.20.155 (192.168.20.155) closed connection to service data2
[2011/02/17 18:20:41, 1] smbd/service.c:close_cnum(1274)
192.168.20.155 (192.168.20.155) closed connection to service data2
[2011/02/17 18:20:41, 1] smbd/service.c:make_connection_snum(1077)
192.168.20.155 (192.168.20.155) connect to service data2 initially as user root (uid=0, gid=0) (pid 5312)
From a Linux client (hostname: acorn):
Feb 17 16:54:30 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
Feb 17 16:57:10 acorn kernel: CIFS VFS: No response to cmd 47 mid 46382
Feb 17 16:57:10 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
Feb 17 16:57:16 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
Feb 17 16:57:31 acorn kernel: CIFS VFS: No response for cmd 50 mid 46388
Feb 17 16:59:52 acorn kernel: CIFS VFS: No response to cmd 47 mid 64873
Feb 17 16:59:52 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
Feb 17 16:59:53 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
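For reading the client messages above, a small Python sketch that tallies them and attaches symbolic names may help. The tables are my own assumptions, not something the logs state: -11 is taken as -EAGAIN (Linux errno numbering), and the `cmd` values are taken to be decimal SMB command codes, which would make 47 (0x2F) SMB_COM_WRITE_ANDX and 50 (0x32) SMB_COM_TRANSACTION2. The `summarize` helper and its labels are hypothetical.

```python
import re
from collections import Counter

# Assumed lookup tables (Linux errno values, SMB1 command codes).
LINUX_ERRNO = {4: "EINTR", 5: "EIO", 11: "EAGAIN", 28: "ENOSPC", 32: "EPIPE"}
SMB_COMMANDS = {0x2F: "SMB_COM_WRITE_ANDX", 0x32: "SMB_COM_TRANSACTION2"}

WRITE_RE = re.compile(r"CIFS VFS: Write2 ret (-?\d+), wrote (\d+)")
NORESP_RE = re.compile(r"CIFS VFS: No response (?:to|for) cmd (\d+) mid (\d+)")


def summarize(lines):
    """Count failed writes per errno and unanswered requests per SMB command."""
    counts = Counter()
    for line in lines:
        m = WRITE_RE.search(line)
        if m:
            code = -int(m.group(1))  # the kernel logs the negated errno
            counts["write failed: " + LINUX_ERRNO.get(code, f"errno {code}")] += 1
            continue
        m = NORESP_RE.search(line)
        if m:
            cmd = int(m.group(1))
            counts["no response: " + SMB_COMMANDS.get(cmd, f"cmd {cmd}")] += 1
    return counts


# The client log lines from this message, copied verbatim.
SAMPLE = """\
Feb 17 16:54:30 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
Feb 17 16:57:10 acorn kernel: CIFS VFS: No response to cmd 47 mid 46382
Feb 17 16:57:10 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
Feb 17 16:57:16 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
Feb 17 16:57:31 acorn kernel: CIFS VFS: No response for cmd 50 mid 46388
Feb 17 16:59:52 acorn kernel: CIFS VFS: No response to cmd 47 mid 64873
Feb 17 16:59:52 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
Feb 17 16:59:53 acorn kernel: CIFS VFS: Write2 ret -11, wrote 0
"""

if __name__ == "__main__":
    for label, n in summarize(SAMPLE.splitlines()).items():
        print(f"{n:3d}  {label}")
```

If those assumptions hold, every failed write in this excerpt reports the same error with zero bytes written, and two of the three unanswered requests are writes.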
--
Wayne Walker
wwalker-7+hyfkrzchDWTcdHvfGLfFaTQe2KTcn/@public.gmane.org
(512) 633-8076
Senior Consultant
Solid Constructs, LLC
> A: Because it messes up the order in which people normally read text.
> > Q: Why is top-posting such a bad thing?
> > > A: Top-posting.
> > > > Q: What is the most annoying thing in e-mail?