From: Chuck Lever III <chuck.lever@oracle.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: reservation errors during fstests on pNFS block
Date: Sat, 15 Jun 2024 18:09:20 +0000
Message-ID: <3B61FCDD-2684-4E5E-9790-2CEFDF69539D@oracle.com>
In-Reply-To: <ED7EB3B3-F56A-451F-A639-D30BBA125EE6@oracle.com>
> On Jun 14, 2024, at 2:33 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
>
>> On Jun 14, 2024, at 2:26 PM, Christoph Hellwig <hch@infradead.org> wrote:
>>
>> On Fri, Jun 14, 2024 at 05:46:21PM +0000, Chuck Lever III wrote:
>>>
>>> I can go back and try reproducing with just generic/069 and
>>> tcpdump as a first step. Is there a way I can tell that the
>>> PR errors are not reporting a possible data corruption?
>>
>> xfstests in general does data verification to check for data integrity,
>> so we should not rely on kernel messages.
>>
>> I'm a bit busy right now, but I'll try to reproduce this locally next
>> week.
>
> Thanks, I'll also try to investigate further.

This is 100% reproducible in my setup.

bl_alloc_lseg() calls this:

static struct nfs4_deviceid_node *
bl_find_get_deviceid(struct nfs_server *server,
		const struct nfs4_deviceid *id, const struct cred *cred,
		gfp_t gfp_mask)
{
	struct nfs4_deviceid_node *node;
	unsigned long start, end;

retry:
	node = nfs4_find_get_deviceid(server, id, cred, gfp_mask);
	if (!node)
		return ERR_PTR(-ENODEV);

nfs4_find_get_deviceid() tries to be clever and do a lookup without
the spin lock first.

If it can't find a matching deviceid, it creates a new device_info
(which calls bl_alloc_deviceid_node, and that registers the device's
PR key).

Then it takes the nfs4_deviceid_lock and looks up the deviceid again.
If it finds it this time, nfs4_find_get_deviceid() frees the spare
(new) device_info, which unregisters the PR key for the same device.

Any subsequent I/O from this client on that device gets EBADE.
The umount later unregisters the device's PR key again.
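
To make the ordering concrete, here is a minimal user-space sketch of that sequence. It is not kernel code: dev_node, alloc_node(), free_node(), insert_or_drop() and the pr_key_registered flag are hypothetical stand-ins for the deviceid node, bl_alloc_deviceid_node, the free path, the locked re-lookup, and the device's PR registration. It only assumes that registering is a side effect of construction and unregistering a side effect of destruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for "this client's PR key is registered on the device". */
static bool pr_key_registered;

struct dev_node {
	int id;
};

/* One-slot stand-in for the deviceid cache. */
static struct dev_node *cache;

/* Constructor with a side effect: building a node registers the key. */
static struct dev_node *alloc_node(int id)
{
	struct dev_node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->id = id;
	pr_key_registered = true;	/* registering twice changes nothing */
	return n;
}

/* Destructor undoes the side effect unconditionally. */
static void free_node(struct dev_node *n)
{
	pr_key_registered = false;
	free(n);
}

/* The re-lookup done "under the lock": insert the spare or drop it. */
static struct dev_node *insert_or_drop(struct dev_node *spare)
{
	if (cache) {
		free_node(spare);	/* loser: this also unregisters the key */
		return cache;
	}
	cache = spare;
	return spare;
}

/* Returns whether the key is still registered after two racing lookups. */
static bool simulate_race(void)
{
	struct dev_node *a, *b;
	bool still_registered;

	cache = NULL;
	pr_key_registered = false;

	/* Both callers miss the unlocked lookup and each build a node. */
	a = alloc_node(1);
	b = alloc_node(1);
	assert(a && b);

	insert_or_drop(a);	/* A wins and is cached */
	insert_or_drop(b);	/* B finds A's node and frees its spare */

	still_registered = pr_key_registered;

	free(cache);		/* cleanup only, not part of the model */
	cache = NULL;
	return still_registered;
}
```

In this model simulate_race() returns false: a node is cached and in use, but the key it depends on has been dropped by the loser's free.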

Seems like PR key registration should be done from a more
idempotent context...?
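
One possible shape for that, purely as an illustration and not a proposed patch: register only after a node has won the insertion, and record that fact on the node so the free path undoes only what that node actually did. Everything below is a hypothetical user-space model; lazy_node, lazy_alloc(), lazy_free(), lazy_insert_or_drop() and lazy_key_registered are invented names, and the real code would need to handle registration failure and locking, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the device-side PR registration state. */
static bool lazy_key_registered;

struct lazy_node {
	int id;
	bool registered;	/* did this node perform the registration? */
};

/* One-slot stand-in for the deviceid cache. */
static struct lazy_node *lazy_cache;

/* Construction has no side effect on the device. */
static struct lazy_node *lazy_alloc(int id)
{
	struct lazy_node *n = calloc(1, sizeof(*n));

	if (n)
		n->id = id;
	return n;
}

/* Only a node that registered the key unregisters it. */
static void lazy_free(struct lazy_node *n)
{
	if (n->registered)
		lazy_key_registered = false;
	free(n);
}

/* Re-lookup "under the lock": the winner registers, the loser just goes away. */
static struct lazy_node *lazy_insert_or_drop(struct lazy_node *spare)
{
	if (lazy_cache) {
		lazy_free(spare);	/* spare never registered: no side effect */
		return lazy_cache;
	}
	lazy_cache = spare;
	spare->registered = true;
	lazy_key_registered = true;
	return spare;
}

/* Returns true if the key survives the race and is dropped exactly at teardown. */
static bool simulate_deferred(void)
{
	struct lazy_node *a, *b, *winner;
	bool still_registered;

	lazy_cache = NULL;
	lazy_key_registered = false;

	/* Both callers miss the unlocked lookup and each build a node. */
	a = lazy_alloc(1);
	b = lazy_alloc(1);
	assert(a && b);

	winner = lazy_insert_or_drop(a);
	assert(winner == a);
	winner = lazy_insert_or_drop(b);	/* b is freed without touching the key */
	assert(winner == a);

	still_registered = lazy_key_registered;

	lazy_free(lazy_cache);	/* stands in for umount: the single unregister */
	lazy_cache = NULL;
	return still_registered && !lazy_key_registered;
}
```

Here simulate_deferred() returns true: the loser's free is a no-op with respect to the key, and the only unregister happens when the cached node itself goes away.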
--
Chuck Lever