From: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
To: Trond Myklebust <trondmy@kernel.org>
Cc: chuck lever <chuck.lever@oracle.com>,
jlayton <jlayton@kernel.org>, anna <anna@kernel.org>,
linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] [RFC] nfs4: inject process namespace into COMPOUND tag
Date: Tue, 30 Jun 2026 15:48:26 +0200 (CEST)
Message-ID: <255476971.4622529.1782827306515.JavaMail.zimbra@desy.de>
In-Reply-To: <fb81ad5a850160daab7092a8289bc626862f6072.camel@kernel.org>
----- Original Message -----
> From: "Trond Myklebust" <trondmy@kernel.org>
> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>, "chuck lever" <chuck.lever@oracle.com>, "jlayton"
> <jlayton@kernel.org>, "anna" <anna@kernel.org>
> Cc: "linux-nfs" <linux-nfs@vger.kernel.org>
> Sent: Sunday, 28 June, 2026 23:15:42
> Subject: Re: [PATCH] [RFC] nfs4: inject process namespace into COMPOUND tag
> On Fri, 2026-06-26 at 17:10 +0200, Tigran Mkrtchyan wrote:
>> On large shared machines, multiple jobs of the same user often run in
>> parallel. For debugging, it's usually impossible to identify requests
>> coming from different processes.
>>
>> Batch systems like HTCondor or SLURM start every job in its own
>> namespace, thus passing namespace info to the server will help with
>> debugging.
>>
>> 192.168.122.150 → 192.168.178.69 NFS 260 V4 Call GETATTR FH: 0xd5ffb2cb
>> 192.168.178.69 → 192.168.122.150 NFS 324 V4 Reply (Call In 89) GETATTR
>> 192.168.122.150 → 192.168.178.69 NFS 260 V4 Call GETATTR FH: 0xd0b0a44e
>> 192.168.178.69 → 192.168.122.150 NFS 324 V4 Reply (Call In 95) GETATTR
>> 192.168.122.150 → 192.168.178.69 NFS 268 V4 Call ACCESS FH: 0xd0b0a44e, [Check: RD LU MD XT DL XAR XAW XAL]
>> 192.168.178.69 → 192.168.122.150 NFS 240 V4 Reply (Call In 101) ACCESS, [Allowed: RD LU MD XT DL XAR XAW XAL]
>> 192.168.122.150 → 192.168.178.69 NFS 284 V4 Call READDIR FH: 0xd0b0a44e
>> 192.168.178.69 → 192.168.122.150 NFS 664 V4 Reply (Call In 105) READDIR
>> 192.168.122.150 → 192.168.178.69 NFS 284 V4 Call ns:4026532507 GETATTR FH: 0xd67b66a5
>> 192.168.178.69 → 192.168.122.150 NFS 340 V4 Reply (Call In 111) ns:4026532507 GETATTR
>> 192.168.122.150 → 192.168.178.69 NFS 292 V4 Call ns:4026532507 ACCESS FH: 0xd67b66a5, [Check: RD LU MD XT DL XAR XAW XAL]
>> 192.168.178.69 → 192.168.122.150 NFS 256 V4 Reply (Call In 117) ns:4026532507 ACCESS, [Access Denied: MD XT DL XAW], [Allowed: RD LU XAR XAL]
>> 192.168.122.150 → 192.168.178.69 NFS 308 V4 Call ns:4026532507 READDIR FH: 0xd67b66a5
>> 192.168.178.69 → 192.168.122.150 NFS 200 V4 Reply (Call In 121) ns:4026532507 READDIR
>>
>> Suggested-by: Chuck Lever <chuck.lever@oracle.com>
>> Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
>> ---
>> fs/nfs/nfs4xdr.c | 24 +++++++++++++++++++-----
>> include/linux/sunrpc/sched.h | 2 ++
>> net/sunrpc/sched.c | 6 ++++++
>> 3 files changed, 27 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>> index c23c2eee1b5c..9c035c74a3b5 100644
>> --- a/fs/nfs/nfs4xdr.c
>> +++ b/fs/nfs/nfs4xdr.c
>> @@ -46,6 +46,7 @@
>> #include <linux/kdev_t.h>
>> #include <linux/module.h>
>> #include <linux/utsname.h>
>> +#include <linux/pid_namespace.h>
>> #include <linux/sunrpc/clnt.h>
>> #include <linux/sunrpc/msg_prot.h>
>> #include <linux/sunrpc/gss_api.h>
>> @@ -71,12 +72,8 @@ static void encode_layoutget(struct xdr_stream *xdr,
>> static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req,
>> struct nfs4_layoutget_res *res);
>>
>> -/* NFSv4 COMPOUND tags are only wanted for debugging purposes */
>> -#ifdef DEBUG
>> +/* Enable compound tags to include namespace information */
>> #define NFS4_MAXTAGLEN 20
>> -#else
>> -#define NFS4_MAXTAGLEN 0
>> -#endif
>>
>> /* lock,open owner id:
>> * we currently use size 2 (u64) out of (NFS4_OPAQUE_LIMIT >> 2)
>> @@ -1034,6 +1031,23 @@ static void encode_compound_hdr(struct xdr_stream *xdr,
>> {
>> __be32 *p;
>>
>> + /* Inject namespace info into compound tag if not already set */
>> + if (hdr->taglen == 0 && req->rq_task != NULL) {
>> + /* Use namespace info captured at task creation time */
>> + struct rpc_task *task = req->rq_task;
>> +
>> + if (taks->tk_ns_inum != 0) {
>
> Hmm.... This has not been compile tested.
I am pretty sure this is the code that I am running on a VM right now :)
>
>> + char ns_tag[NFS4_MAXTAGLEN + 1];
>> +
>> + hdr->taglen = snprintf(ns_tag, sizeof(ns_tag), "ns:%u", taks->tk_ns_inum);
>> + if (hdr->taglen > NFS4_MAXTAGLEN) {
>> + hdr->taglen = NFS4_MAXTAGLEN;
>> + ns_tag[NFS4_MAXTAGLEN] = '\0';
>> + }
>> + hdr->tag = ns_tag;
>
> ns_tag is only scoped to this block. I suggest that instead of
> assigning to hdr->taglen and hdr->tag, you just do the assignment to
> hdr->replen + call to encode_string() here, so you don't have to assign
> a scope limited buffer to an externally visible struct.
>
> Note also that there is no need to NUL terminate ns_tag[] when
> hdr->taglen > NFS4_MAXTAGLEN, since encode_string() does not require a nul
> terminated string. In addition, snprintf() always guarantees that the
> string is nul terminated, even when truncated by the buffer size :-).
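
The snprintf() point above can be checked in userspace. The following is only a minimal sketch, not the kernel code: format_ns_tag() and MAXTAGLEN are hypothetical names chosen for illustration, with MAXTAGLEN mirroring NFS4_MAXTAGLEN from the patch. It shows that snprintf() returns the length the full string would have had, so the only clamp needed is on the length, while the buffer is already NUL terminated.

```c
#include <stdio.h>
#include <stddef.h>

#define MAXTAGLEN 20	/* assumption: mirrors NFS4_MAXTAGLEN in the patch */

/*
 * Hypothetical helper: format "ns:<inum>" into buf (size bytes) and
 * return the number of bytes actually stored, not counting the NUL.
 */
static size_t format_ns_tag(char *buf, size_t size, unsigned int inum)
{
	/* snprintf() returns the untruncated length, or < 0 on error */
	int len = snprintf(buf, size, "ns:%u", inum);

	if (len < 0)
		return 0;
	/* Truncated: at most size - 1 bytes were stored, plus the NUL */
	if ((size_t)len >= size)
		return size ? size - 1 : 0;
	return (size_t)len;
}
```

With a buffer of MAXTAGLEN + 1 bytes, the returned length can never exceed MAXTAGLEN, and no explicit '\0' store is needed after the call.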
>
>> + }
>> + }
>> +
>> /* initialize running count of expected bytes in reply.
>> * NOTE: the replied tag SHOULD be the same is the one sent,
>> * but this is not required as a MUST for the server to do so. */
>> diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h
>> index 0dbdf3722537..d376b52a72a1 100644
>> --- a/include/linux/sunrpc/sched.h
>> +++ b/include/linux/sunrpc/sched.h
>> @@ -92,6 +92,8 @@ struct rpc_task {
>>
>> pid_t tk_owner; /* Process id for batching tasks */
>>
>> + unsigned int tk_ns_inum; /* PID namespace inum for namespace tracking */
>> +
>> int tk_rpc_status; /* Result of last RPC operation */
>> unsigned short tk_flags; /* misc flags */
>> unsigned short tk_timeouts; /* maj timeouts */
>> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
>> index 016f16ca5779..4e8e7fa849d5 100644
>> --- a/net/sunrpc/sched.c
>> +++ b/net/sunrpc/sched.c
>> @@ -21,6 +21,7 @@
>> #include <linux/mutex.h>
>> #include <linux/freezer.h>
>> #include <linux/sched/mm.h>
>> +#include <linux/pid_namespace.h>
>>
>> #include <linux/sunrpc/clnt.h>
>> #include <linux/sunrpc/metrics.h>
>> @@ -1110,6 +1111,11 @@ static void rpc_init_task(struct rpc_task *task, const struct rpc_task_setup *ta
>> task->tk_priority = task_setup_data->priority - RPC_PRIORITY_LOW;
>> task->tk_owner = current->tgid;
>>
>> + struct pid_namespace *pid_ns = task_active_pid_ns(current);
>> + /* Keep track on namespace id */
>> + if (pid_ns != &init_pid_ns)
>> + task->tk_ns_inum = pid_ns->ns.inum;
>
> For buffered writes, this will tell you the pid namespace of the
> process that is flushing the data, not that of the process that wrote
> the data into the page cache. Is that what you expected?
In general, with read-ahead, delegations, and other caching mechanisms,
is it possible to match NFS I/O to an application?
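
For matching a tag such as ns:4026532507 back to a job on the client, the number is the inode number of the PID namespace, which userspace can read by calling stat() on /proc/<pid>/ns/pid. A minimal userspace sketch follows; ns_inum_of() is a hypothetical helper name, not an existing API.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: store the inode number of the namespace file at
 * path (e.g. "/proc/self/ns/pid") in *inum. Returns 0 on success and
 * -1 if the path cannot be stat()ed.
 */
static int ns_inum_of(const char *path, unsigned long long *inum)
{
	struct stat st;

	/* stat() follows the /proc symlink to the namespace object */
	if (stat(path, &st) != 0)
		return -1;
	*inum = (unsigned long long)st.st_ino;
	return 0;
}
```

Calling ns_inum_of("/proc/self/ns/pid", &inum) from inside a job should give the value that would appear after "ns:" in the tag, assuming the patch stores pid_ns->ns.inum unchanged.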
Best regards,
Tigran.
>
>> +
>> /* Initialize workqueue for async tasks */
>> task->tk_workqueue = task_setup_data->workqueue;
>>
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trondmy@kernel.org, trond.myklebust@hammerspace.com