All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nick Dokos <nicholas.dokos@hp.com>
To: Andreas Dilger <adilger@sun.com>
Cc: Nick Dokos <nicholas.dokos@hp.com>,
	Valerie Aurora <vaurora@redhat.com>,
	linux-ext4@vger.kernel.org
Subject: Re: ll_ver_fs data verification failure - 96TB fs
Date: Thu, 06 Aug 2009 17:28:08 -0400	[thread overview]
Message-ID: <18690.1249594088@alphaville.usa.hp.com> (raw)
In-Reply-To: Message from Andreas Dilger <adilger@sun.com> of "Thu, 06 Aug 2009 14:50:02 MDT." <20090806205002.GH3340@webber.adilger.int>

> On Aug 06, 2009  16:37 -0400, Nick Dokos wrote:
> > I did that to begin with but the problem turns out to be much more
> > mundane: there was an IO error on one of the volumes. It wasn't quite
> > obvious (no red lights going off) but there *was* a message in
> > /var/log/messages - unfortunately I missed it. I eventually recreated
> > the error by trying to read the file with ``od -c'' and then went back
> > and found the original error. I don't know why/how ll_ver_fs managed to
> > read the offset and come up with a 1M difference[1] -- ``od -c'' failed with
> > a big thud.
> 
> Can you have a look at the error handling in ll_ver_fs at that point?
> It seems that it might just have re-used the previous 1MB buffer, but
> didn't detect/report the error from the read, which would itself be bad.
> 

It looks right to me:

,----
|               ...
| 		if (read(fd, chunk_buf, chunksize) < 0) {
| 			fprintf(stderr, "\n%s: read %s+%llu failed: %s\n",
| 				progname, file, offset, strerror(errno));
| 			return 1;
| 		}
| 		if (verify_chunk(chunk_buf, chunksize, offset, time_st,
| 				 inode_st, file) != 0)
| 			return 1;
|               ...
`----

The read() should have failed (and I should have gotten a different error
message) but somehow it didn't - instead, verify_chunk() was called and
*that* detected the mismatch.

Thanks,
Nick




  reply	other threads:[~2009-08-06 21:28 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-03 13:54 ll_ver_fs data verification failure - 96TB fs Nick Dokos
2009-08-03 14:22 ` Nick Dokos
2009-08-06 20:04 ` Valerie Aurora
2009-08-06 20:37   ` Nick Dokos
2009-08-06 20:50     ` Andreas Dilger
2009-08-06 21:28       ` Nick Dokos [this message]
2009-08-06 21:43         ` Nick Dokos
2009-08-06 22:01           ` Nick Dokos
2009-08-06 22:54             ` Nick Dokos
2009-08-06 22:19         ` Andreas Dilger
2009-08-06 20:36 ` Valerie Aurora

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=18690.1249594088@alphaville.usa.hp.com \
    --to=nicholas.dokos@hp.com \
    --cc=adilger@sun.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=vaurora@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.