From: Chuck Lever <chuck.lever@oracle.com>
To: Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: Trond Myklebust <trondmy@kernel.org>,
Anna Schumaker <anna@kernel.org>,
linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 06/14] sunrpc: add helpers to count and snapshot pending cache requests
Date: Thu, 19 Mar 2026 14:47:20 -0400 [thread overview]
Message-ID: <163efbf2-5e1a-4dde-b6c8-787c9b68e878@oracle.com> (raw)
In-Reply-To: <579b0239abbbb0b95d619e6b400bb919fedab60d.camel@kernel.org>
On 3/19/26 2:22 PM, Jeff Layton wrote:
> On Thu, 2026-03-19 at 14:07 -0400, Chuck Lever wrote:
>> On 3/16/26 11:14 AM, Jeff Layton wrote:
>>> Add sunrpc_cache_requests_count() and sunrpc_cache_requests_snapshot()
>>> to allow callers to count and snapshot the pending upcall request list
>>> without exposing struct cache_request outside of cache.c.
>>>
>>> Both functions skip entries that no longer have CACHE_PENDING set.
>>>
>>> The snapshot function takes a cache_get() reference on each item so the
>>> caller can safely use them after the queue_lock is released.
>>>
>>> These will be used by the nfsd generic netlink dumpit handler for
>>> svc_export upcall requests.
>>>
>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>> ---
>>> include/linux/sunrpc/cache.h | 5 ++++
>>> net/sunrpc/cache.c | 57 ++++++++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 62 insertions(+)
>>>
>>> diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
>>> index c358151c23950ab48e83991c6138bb7d0e049ace..343f0fb675d761e019989e47e03645484e0aa30f 100644
>>> --- a/include/linux/sunrpc/cache.h
>>> +++ b/include/linux/sunrpc/cache.h
>>> @@ -251,6 +251,11 @@ extern int sunrpc_cache_register_pipefs(struct dentry *parent, const char *,
>>> extern void sunrpc_cache_unregister_pipefs(struct cache_detail *);
>>> extern void sunrpc_cache_unhash(struct cache_detail *, struct cache_head *);
>>>
>>> +int sunrpc_cache_requests_count(struct cache_detail *cd);
>>> +int sunrpc_cache_requests_snapshot(struct cache_detail *cd,
>>> + struct cache_head **items,
>>> + u64 *seqnos, int max);
>>> +
>>> /* Must store cache_detail in seq_file->private if using next three functions */
>>> extern void *cache_seq_start_rcu(struct seq_file *file, loff_t *pos);
>>> extern void *cache_seq_next_rcu(struct seq_file *file, void *p, loff_t *pos);
>>> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
>>> index 819f12add8f26562fdc6aaa200f55dec0180bfbc..2a78560edee5ca07be0fce90f87ce43a19df245f 100644
>>> --- a/net/sunrpc/cache.c
>>> +++ b/net/sunrpc/cache.c
>>> @@ -1906,3 +1906,60 @@ void sunrpc_cache_unhash(struct cache_detail *cd, struct cache_head *h)
>>> spin_unlock(&cd->hash_lock);
>>> }
>>> EXPORT_SYMBOL_GPL(sunrpc_cache_unhash);
>>> +
>>> +/**
>>> + * sunrpc_cache_requests_count - count pending upcall requests
>>> + * @cd: cache_detail to query
>>> + *
>>> + * Returns the number of requests on the cache's request list that
>>> + * still have CACHE_PENDING set.
>>> + */
>>> +int sunrpc_cache_requests_count(struct cache_detail *cd)
>>> +{
>>> + struct cache_request *crq;
>>> + int cnt = 0;
>>> +
>>> + spin_lock(&cd->queue_lock);
>>> + list_for_each_entry(crq, &cd->requests, list) {
>>> + if (test_bit(CACHE_PENDING, &crq->item->flags))
>>> + cnt++;
>>> + }
>>> + spin_unlock(&cd->queue_lock);
>>> + return cnt;
>>> +}
>>> +EXPORT_SYMBOL_GPL(sunrpc_cache_requests_count);
>>> +
>>> +/**
>>> + * sunrpc_cache_requests_snapshot - snapshot pending upcall requests
>>> + * @cd: cache_detail to query
>>> + * @items: array to fill with cache_head pointers (caller-allocated)
>>> + * @seqnos: array to fill with sequence numbers (caller-allocated)
>>> + * @max: size of the arrays
>>> + *
>>> + * Only entries with CACHE_PENDING set are included. Takes a reference
>>> + * on each cache_head via cache_get(). Caller must call cache_put()
>>> + * on each returned item when done.
>>> + *
>>> + * Returns the number of entries filled.
>>> + */
>>> +int sunrpc_cache_requests_snapshot(struct cache_detail *cd,
>>> + struct cache_head **items,
>>> + u64 *seqnos, int max)
>>> +{
>>> + struct cache_request *crq;
>>> + int i = 0;
>>> +
>>> + spin_lock(&cd->queue_lock);
>>> + list_for_each_entry(crq, &cd->requests, list) {
>>> + if (i >= max)
>>> + break;
>>> + if (!test_bit(CACHE_PENDING, &crq->item->flags))
>>> + continue;
>>> + items[i] = cache_get(crq->item);
>>> + seqnos[i] = crq->seqno;
>>> + i++;
>>> + }
>>> + spin_unlock(&cd->queue_lock);
>>> + return i;
>>> +}
>>> +EXPORT_SYMBOL_GPL(sunrpc_cache_requests_snapshot);
>>>
>>
>> This API architecture introduces a TOCTOU, since as soon as the
>> queue_lock is dropped, the count can immediately become stale.
>>
>> The count returned by sunrpc_cache_requests_count() is used as the array
>> bound. To wit:
>>
>> cnt = sunrpc_cache_requests_count(cd);
>>
>> items = kcalloc(cnt, sizeof(*items), GFP_KERNEL);
>>
>> seqnos = kcalloc(cnt, sizeof(*seqnos), GFP_KERNEL);
>>
>> cnt = sunrpc_cache_requests_snapshot(cd, items, seqnos, cnt);
>>
>>
>>
>> This appears in all four dumpit handlers (ip_map, unix_gid, svc_export,
>> expkey).
>>
>> This isn't a memory safety issue; _snapshot() caps its output at max, so
>> if the list grows between the two calls, entries are silently truncated
>> rather than overflowing the buffer. If the list shrinks, the arrays are
>> merely oversized.
>>
>> However, the practical risk is incomplete data returned to userspace. If
>> the caller is guaranteed to be re-invoked (e.g., the netlink dumpit
>> callback gets called again until it returns 0), then missing items due
>> to list growth between count() and snapshot() is harmless: they'll be
>> picked up on the next pass.
>>
>> But looking at the callers, they all do this:
>>
>> if (cb->args[0])
>> return 0;
>>
>> and then set cb->args[0] after the single snapshot dump.
>>
>> The dumpit is a one-shot: it snapshots once and signals completion. If
>> the list grows between count() and snapshot(), the truncated entries are
>> silently lost and there's no subsequent pass to pick them up, IIUC.
>>
>
> Userland will receive a separate notify request whenever a
> cache_request is queued, and that notification is only sent after the
> cache_request is queued.
>
> So while it might not receive all of the queued requests from the
> kernel in the race you describe, it won't matter because select() will
> return the notify descriptor again on the next pass and mountd will
> pick up the remaining entries at that point.
Makes sense. But perhaps should be documented somewhere; the "retry"
loop is not obvious from the source code introduced here.
"Caller is expected to ..." or some such language.
--
Chuck Lever
next prev parent reply other threads:[~2026-03-19 18:47 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-16 15:14 [PATCH 00/14] nfsd/sunrpc: add support for netlink upcalls for mountd/exportd Jeff Layton
2026-03-16 15:14 ` [PATCH 01/14] nfsd: move struct nfsd_genl_rqstp to nfsctl.c Jeff Layton
2026-03-16 15:14 ` [PATCH 02/14] sunrpc: rename sunrpc_cache_pipe_upcall() to sunrpc_cache_upcall() Jeff Layton
2026-03-16 15:14 ` [PATCH 03/14] sunrpc: rename sunrpc_cache_pipe_upcall_timeout() Jeff Layton
2026-03-16 15:14 ` [PATCH 04/14] sunrpc: rename cache_pipe_upcall() to cache_do_upcall() Jeff Layton
2026-03-19 13:54 ` Chuck Lever
2026-03-16 15:14 ` [PATCH 05/14] sunrpc: add a cache_notify callback Jeff Layton
2026-03-19 15:13 ` Chuck Lever
2026-03-16 15:14 ` [PATCH 06/14] sunrpc: add helpers to count and snapshot pending cache requests Jeff Layton
2026-03-19 18:07 ` Chuck Lever
2026-03-19 18:22 ` Jeff Layton
2026-03-19 18:47 ` Chuck Lever [this message]
2026-03-16 15:14 ` [PATCH 07/14] sunrpc: add a generic netlink family for cache upcalls Jeff Layton
2026-03-19 18:44 ` Chuck Lever
2026-03-19 19:14 ` Chuck Lever
2026-03-19 19:19 ` Jeff Layton
2026-03-19 19:20 ` Chuck Lever
2026-03-19 19:31 ` Chuck Lever
2026-03-16 15:14 ` [PATCH 08/14] sunrpc: add netlink upcall for the auth.unix.ip cache Jeff Layton
2026-03-16 15:14 ` [PATCH 09/14] sunrpc: add netlink upcall for the auth.unix.gid cache Jeff Layton
2026-03-20 14:32 ` Chuck Lever
2026-03-16 15:14 ` [PATCH 10/14] nfsd: add new netlink spec for svc_export upcall Jeff Layton
2026-03-20 15:17 ` Chuck Lever
2026-03-23 20:00 ` Jeff Layton
2026-03-16 15:14 ` [PATCH 11/14] nfsd: add netlink upcall for the svc_export cache Jeff Layton
2026-03-16 15:14 ` [PATCH 12/14] nfsd: add netlink upcall for the nfsd.fh cache Jeff Layton
2026-03-16 15:14 ` [PATCH 13/14] sunrpc: add SUNRPC_CMD_CACHE_FLUSH netlink command Jeff Layton
2026-03-16 15:14 ` [PATCH 14/14] nfsd: add NFSD_CMD_CACHE_FLUSH " Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=163efbf2-5e1a-4dde-b6c8-787c9b68e878@oracle.com \
--to=chuck.lever@oracle.com \
--cc=Dai.Ngo@oracle.com \
--cc=anna@kernel.org \
--cc=jlayton@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neil@brown.name \
--cc=okorniev@redhat.com \
--cc=tom@talpey.com \
--cc=trondmy@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox