Linux NFS development
From: Christoph Hellwig <hch@infradead.org>
To: Chuck Lever III <chuck.lever@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: reservation errors during fstests on pNFS block
Date: Fri, 14 Jun 2024 11:26:17 -0700
Message-ID: <ZmyLSZGWDeaIXdx4@infradead.org>
In-Reply-To: <C02C8230-4ACA-4F2D-AC28-B9583ADCADA5@oracle.com>

On Fri, Jun 14, 2024 at 05:46:21PM +0000, Chuck Lever III wrote:
> > Reservation means another node has an active reservation on that LU.
> 
> There are only two accessors of the LUN: the NFS server and
> the NFS client running the test. That's why these errors are
> a little surprising to me.

You can create registrations from userspace, and some cluster managers
do that.  But none of that should happen for a default setup.

> > When pNFS layout access fails we fall back to normal access through the
> > MDS, so this is expected.
> 
> Expected, OK. From a usability standpoint, error messages like
> this would probably be alarming to administrators. I plan to
> convert the printk's and dprintk's in the NFSD layout code into
> trace points, but that doesn't help the messages emitted by the
> block and SCSI drivers. Ideally this should be less noisy.

Well, they really should be alarming because the admin configured
a block layout setup and it did not work as expected.  So it should
ring alarm bells.

> > Is generic/069 that first test that failed when doing a full xfstests
> > run?
> 
> Yes, it's a full run. generic/069 is the first test where there
> are remarkable system journal messages (ie, PR errors), though
> there are a few subsequent tests that are also whinging.

Interesting.  Normally only the server actually reserves the LU,
the clients just register.  And something went wrong here and only
for these tests.

> > Do you see LAYOUT* ops in /proc/self/mountstats for the previous
> > tests?
> 
> generic/013 is known to generate layout recalls, for example,
> so there is layout activity during the test run.

Ok.  The other thing would be to run blktrace on the client and
see that it shows I/O.  But all this sounds like the tests in
general work, but something is up with generic/069.
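
For the mountstats check quoted above, a minimal sketch of tallying the
LAYOUT* counters follows. It assumes the usual per-op line layout (operation
name, a colon, then counters, the first of which is the number of operations
issued). The function name `layout_op_counts` and the sample text in the usage
note are made up for illustration; they are not taken from this test run.

```python
import re

# Matches per-op lines such as "\tLAYOUTGET: 12 12 0 ..." and captures the
# operation name plus the first counter (assumed to be the op count).
OP_LINE = re.compile(r"^\s*(LAYOUT[A-Z_]*):\s+(\d+)")


def layout_op_counts(mountstats_text):
    """Return {op_name: count}, summed over every mount in the text."""
    counts = {}
    for line in mountstats_text.splitlines():
        m = OP_LINE.match(line)
        if m:
            name = m.group(1)
            counts[name] = counts.get(name, 0) + int(m.group(2))
    return counts
```

On the client one would pass it the contents of /proc/self/mountstats, e.g.
`layout_op_counts(open("/proc/self/mountstats").read())`. Counters that stay
at zero across a test would suggest the I/O went through the MDS instead of
the layout.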

generic/069 just does O_APPEND writes, so I can't see what
would be so special about it.
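
For reference, the I/O pattern in question looks roughly like the sketch
below. This is not the test's actual code, only an illustration of O_APPEND
semantics: every write lands at the current end of file, even with several
descriptors open on the same file. The helper name `interleaved_append` is
hypothetical.

```python
import os


def interleaved_append(path, rec_a, rec_b, rounds):
    """Alternate small appends through two O_APPEND descriptors.

    Returns the resulting file contents so the caller can verify them.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    fd1 = os.open(path, flags, 0o600)
    fd2 = os.open(path, flags, 0o600)
    try:
        for _ in range(rounds):
            os.write(fd1, rec_a)  # appended at the current end of file
            os.write(fd2, rec_b)  # also appended, despite a separate fd
    finally:
        os.close(fd1)
        os.close(fd2)
    with open(path, "rb") as f:
        return f.read()
```

Pointing `path` at a file on the pNFS block mount and comparing the returned
bytes against the expected sequence gives a quick integrity check that does
not depend on kernel messages.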

> 
> I can go back and try reproducing with just generic/069 and
> tcpdump as a first step. Is there a way I can tell that the
> PR errors are not reporting a possible data corruption?

xfstests in general does data verification to check for data integrity,
so we should not rely on kernel messages.

I'm a bit busy right now, but I'll try to reproduce this locally next
week.


Thread overview: 8+ messages
2024-06-14 14:46 reservation errors during fstests on pNFS block Chuck Lever III
2024-06-14 16:38 ` Christoph Hellwig
2024-06-14 17:46   ` Chuck Lever III
2024-06-14 18:26     ` Christoph Hellwig [this message]
2024-06-14 18:33       ` Chuck Lever III
2024-06-14 18:34         ` Christoph Hellwig
2024-06-15 18:09         ` Chuck Lever III
2024-06-17  5:38           ` Christoph Hellwig
