From: Trond Myklebust <trondmy@kernel.org>
To: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>,
chuck.lever@oracle.com, jlayton@kernel.org, anna@kernel.org
Cc: linux-nfs@vger.kernel.org
Subject: Re: [PATCH] [RFC] nfs4: inject process namespace into COMPOUND tag
Date: Sun, 28 Jun 2026 17:15:42 -0400
Message-ID: <fb81ad5a850160daab7092a8289bc626862f6072.camel@kernel.org>
In-Reply-To: <20260626151029.1516839-1-tigran.mkrtchyan@desy.de>
On Fri, 2026-06-26 at 17:10 +0200, Tigran Mkrtchyan wrote:
> On large shared machines, multiple jobs of the same user often run in
> parallel. For debugging, it's usually impossible to identify requests
> coming from different processes.
>
> Batch systems like HTCondor or SLURM start every job in its own
> namespace, thus passing namespace info to the server will help with
> debugging.
>
> 192.168.122.150 → 192.168.178.69 NFS 260 V4 Call GETATTR FH:
> 0xd5ffb2cb
> 192.168.178.69 → 192.168.122.150 NFS 324 V4 Reply (Call In 89)
> GETATTR
> 192.168.122.150 → 192.168.178.69 NFS 260 V4 Call GETATTR FH:
> 0xd0b0a44e
> 192.168.178.69 → 192.168.122.150 NFS 324 V4 Reply (Call In 95)
> GETATTR
> 192.168.122.150 → 192.168.178.69 NFS 268 V4 Call ACCESS FH:
> 0xd0b0a44e, [Check: RD LU MD XT DL XAR XAW XAL]
> 192.168.178.69 → 192.168.122.150 NFS 240 V4 Reply (Call In 101)
> ACCESS, [Allowed: RD LU MD XT DL XAR XAW XAL]
> 192.168.122.150 → 192.168.178.69 NFS 284 V4 Call READDIR FH:
> 0xd0b0a44e
> 192.168.178.69 → 192.168.122.150 NFS 664 V4 Reply (Call In 105)
> READDIR
> 192.168.122.150 → 192.168.178.69 NFS 284 V4 Call ns:4026532507
> GETATTR FH: 0xd67b66a5
> 192.168.178.69 → 192.168.122.150 NFS 340 V4 Reply (Call In 111)
> ns:4026532507 GETATTR
> 192.168.122.150 → 192.168.178.69 NFS 292 V4 Call ns:4026532507 ACCESS
> FH: 0xd67b66a5, [Check: RD LU MD XT DL XAR XAW XAL]
> 192.168.178.69 → 192.168.122.150 NFS 256 V4 Reply (Call In 117)
> ns:4026532507 ACCESS, [Access Denied: MD XT DL XAW], [Allowed: RD LU
> XAR XAL]
> 192.168.122.150 → 192.168.178.69 NFS 308 V4 Call ns:4026532507
> READDIR FH: 0xd67b66a5
> 192.168.178.69 → 192.168.122.150 NFS 200 V4 Reply (Call In 121)
> ns:4026532507 READDIR
>
> Suggested-by: Chuck Lever <chuck.lever@oracle.com>
> Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
> ---
> fs/nfs/nfs4xdr.c | 24 +++++++++++++++++++-----
> include/linux/sunrpc/sched.h | 2 ++
> net/sunrpc/sched.c | 6 ++++++
> 3 files changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index c23c2eee1b5c..9c035c74a3b5 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -46,6 +46,7 @@
> #include <linux/kdev_t.h>
> #include <linux/module.h>
> #include <linux/utsname.h>
> +#include <linux/pid_namespace.h>
> #include <linux/sunrpc/clnt.h>
> #include <linux/sunrpc/msg_prot.h>
> #include <linux/sunrpc/gss_api.h>
> @@ -71,12 +72,8 @@ static void encode_layoutget(struct xdr_stream
> *xdr,
> static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst
> *req,
> struct nfs4_layoutget_res *res);
>
> -/* NFSv4 COMPOUND tags are only wanted for debugging purposes */
> -#ifdef DEBUG
> +/* Enable compound tags to include namespace information */
> #define NFS4_MAXTAGLEN 20
> -#else
> -#define NFS4_MAXTAGLEN 0
> -#endif
>
> /* lock,open owner id:
> * we currently use size 2 (u64) out of (NFS4_OPAQUE_LIMIT >> 2)
> @@ -1034,6 +1031,23 @@ static void encode_compound_hdr(struct
> xdr_stream *xdr,
> {
> __be32 *p;
>
> + /* Inject namespace info into compound tag if not already
> set */
> + if (hdr->taglen == 0 && req->rq_task != NULL) {
> + /* Use namespace info captured at task creation time
> */
> + struct rpc_task *task = req->rq_task;
> +
> + if (taks->tk_ns_inum != 0) {
Hmm... 'taks' should be 'task', so this has evidently not been compile
tested.
> + char ns_tag[NFS4_MAXTAGLEN + 1];
> +
> + hdr->taglen = snprintf(ns_tag,
> sizeof(ns_tag), "ns:%u", taks->tk_ns_inum);
> + if (hdr->taglen > NFS4_MAXTAGLEN) {
> + hdr->taglen = NFS4_MAXTAGLEN;
> + ns_tag[NFS4_MAXTAGLEN] = '\0';
> + }
> + hdr->tag = ns_tag;
ns_tag is scoped only to this block, so hdr->tag would be left pointing
at a stack buffer that is out of scope by the time it is used. I suggest
that instead of assigning to hdr->taglen and hdr->tag, you just do the
assignment to hdr->replen and the call to encode_string() here, so that
you don't have to assign a scope-limited buffer to an externally visible
struct.
Note also that there is no need to NUL-terminate ns_tag[] when
hdr->taglen > NFS4_MAXTAGLEN, since encode_string() does not require a
NUL-terminated string. In addition, snprintf() always guarantees that
the string is NUL-terminated, even when truncated by the buffer size :-).
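To illustrate both points, here is a rough userspace sketch. It is not
kernel code and is not a drop-in replacement for the hunk above.
build_ns_tag(), xdr_put_opaque() and MAXTAGLEN are hypothetical names
made up for this example; xdr_put_opaque() only mimics the general XDR
opaque layout (explicit length plus padded data) and is not the client's
encode_string().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAXTAGLEN 20 /* stands in for NFS4_MAXTAGLEN from the patch */

/*
 * Hypothetical helper: format "ns:<inum>" into buf and return the number
 * of tag bytes to send. snprintf() returns the length the string would
 * have had, so only the returned length needs clamping; the buffer is
 * already NUL-terminated, even when the output was truncated.
 */
size_t build_ns_tag(char *buf, size_t size, unsigned int inum)
{
	int len;

	assert(size > 0);
	len = snprintf(buf, size, "ns:%u", inum);
	if (len < 0)
		return 0;
	if ((size_t)len >= size)
		return size - 1; /* truncated: clamp the length only */
	return (size_t)len;
}

/*
 * Hypothetical helper: XDR-style opaque encoding. A 4-byte big-endian
 * length is followed by the data, zero-padded to a multiple of 4 bytes.
 * The length travels explicitly, so the data needs no terminator.
 * Returns the number of bytes written to out.
 */
size_t xdr_put_opaque(unsigned char *out, const char *data, size_t len)
{
	size_t padded = (len + 3) & ~(size_t)3;

	out[0] = (unsigned char)((len >> 24) & 0xff);
	out[1] = (unsigned char)((len >> 16) & 0xff);
	out[2] = (unsigned char)((len >> 8) & 0xff);
	out[3] = (unsigned char)(len & 0xff);
	memcpy(out + 4, data, len);
	memset(out + 4 + len, 0, padded - len);
	return 4 + padded;
}
```

With inum 4026532507 (the value in the capture above), build_ns_tag()
returns 13, and xdr_put_opaque() then writes 20 bytes: the 4-byte length
followed by 16 bytes of padded data. Because the tag is built and
consumed in one place, the local buffer never needs to escape through
hdr->tag.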
> + }
> + }
> +
> /* initialize running count of expected bytes in reply.
> * NOTE: the replied tag SHOULD be the same is the one sent,
> * but this is not required as a MUST for the server to do
> so. */
> diff --git a/include/linux/sunrpc/sched.h
> b/include/linux/sunrpc/sched.h
> index 0dbdf3722537..d376b52a72a1 100644
> --- a/include/linux/sunrpc/sched.h
> +++ b/include/linux/sunrpc/sched.h
> @@ -92,6 +92,8 @@ struct rpc_task {
>
> pid_t tk_owner; /* Process id for
> batching tasks */
>
> + unsigned int tk_ns_inum; /* PID namespace
> inum for namespace tracking */
> +
> int tk_rpc_status; /* Result of last
> RPC operation */
> unsigned short tk_flags; /* misc flags */
> unsigned short tk_timeouts; /* maj timeouts */
> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> index 016f16ca5779..4e8e7fa849d5 100644
> --- a/net/sunrpc/sched.c
> +++ b/net/sunrpc/sched.c
> @@ -21,6 +21,7 @@
> #include <linux/mutex.h>
> #include <linux/freezer.h>
> #include <linux/sched/mm.h>
> +#include <linux/pid_namespace.h>
>
> #include <linux/sunrpc/clnt.h>
> #include <linux/sunrpc/metrics.h>
> @@ -1110,6 +1111,11 @@ static void rpc_init_task(struct rpc_task
> *task, const struct rpc_task_setup *ta
> task->tk_priority = task_setup_data->priority -
> RPC_PRIORITY_LOW;
> task->tk_owner = current->tgid;
>
> + struct pid_namespace *pid_ns = task_active_pid_ns(current);
> + /* Keep track on namespace id */
> + if (pid_ns != &init_pid_ns)
> + task->tk_ns_inum = pid_ns->ns.inum;
For buffered writes, this will tell you the pid namespace of the
process that is flushing the data, not that of the process that wrote
the data into the page cache. Is that what you expected?
> +
> /* Initialize workqueue for async tasks */
> task->tk_workqueue = task_setup_data->workqueue;
>
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trondmy@kernel.org, trond.myklebust@hammerspace.com