linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chuck Lever <chuck.lever@oracle.com>
To: Nikhil Jha <njha@janestreet.com>
Cc: Trond Myklebust <trondmy@kernel.org>,
	Anna Schumaker <anna@kernel.org>,
	Jeff Layton <jlayton@kernel.org>, Neil Brown <neilb@suse.de>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH v2 0/2] fix gss seqno handling to be more rfc-compliant
Date: Wed, 11 Jun 2025 15:10:49 -0400	[thread overview]
Message-ID: <d277b990-fa08-47b1-a827-a0e81e9d2ca8@oracle.com> (raw)
In-Reply-To: <20250611A1905207b67479b.njha@janestreet.com>

On 6/11/25 3:05 PM, Nikhil Jha wrote:
> On Wed, Jun 11, 2025 at 02:54:09PM -0400, Chuck Lever wrote:
>> On 6/11/25 2:50 PM, Nikhil Jha wrote:
>>> On Thu, Mar 20, 2025 at 09:16:15AM -0400, Chuck Lever wrote:
>>>> On 3/19/25 1:02 PM, Nikhil Jha via B4 Relay wrote:
>>>>> When the client retransmits an operation (for example, because the
>>>>> server is slow to respond), a new GSS sequence number is associated with
>>>>> the XID. In the current kernel code the original sequence number is
>>>>> discarded. Subsequently, if a response to the original request is
>>>>> received there will be a GSS sequence number mismatch. A mismatch will
>>>>> trigger another retransmit, possibly repeating the cycle, and after some
>>>>> number of failed retries EACCES is returned.
>>>>>
>>>>> RFC2203, section 5.3.3.1 suggests a possible solution... “cache the
>>>>> RPCSEC_GSS sequence number of each request it sends” and "compute the
>>>>> checksum of each sequence number in the cache to try to match the
>>>>> checksum in the reply's verifier." This is what FreeBSD’s implementation
>>>>> does (rpc_gss_validate in sys/rpc/rpcsec_gss/rpcsec_gss.c).
>>>>>
>>>>> However, even with this cache, retransmits directly caused by a seqno
>>>>> mismatch can still cause a bad message interleaving that results in this
>>>>> bug. The RFC already suggests ignoring incorrect seqnos on the server
>>>>> side, and this seems symmetric, so this patchset also applies that
>>>>> behavior to the client.
>>>>>
>>>>> These two patches are *not* dependent on each other. I tested them by
>>>>> delaying packets with a Python script hooked up to NFQUEUE. If it would
>>>>> be helpful I can send this script along as well.
>>>>>
>>>>> Signed-off-by: Nikhil Jha <njha@janestreet.com>
>>>>> ---
>>>>> Changes since v1:
>>>>>  * Maintain the invariant that the first seqno is always first in
>>>>>    rq_seqnos, so that it doesn't need to be stored twice.
>>>>>  * Minor formatting, and resending with proper mailing-list headers so the
>>>>>    patches are easier to work with.
>>>>>
>>>>> ---
>>>>> Nikhil Jha (2):
>>>>>       sunrpc: implement rfc2203 rpcsec_gss seqnum cache
>>>>>       sunrpc: don't immediately retransmit on seqno miss
>>>>>
>>>>>  include/linux/sunrpc/xprt.h    | 17 +++++++++++-
>>>>>  include/trace/events/rpcgss.h  |  4 +--
>>>>>  include/trace/events/sunrpc.h  |  2 +-
>>>>>  net/sunrpc/auth_gss/auth_gss.c | 59 ++++++++++++++++++++++++++----------------
>>>>>  net/sunrpc/clnt.c              |  9 +++++--
>>>>>  net/sunrpc/xprt.c              |  3 ++-
>>>>>  6 files changed, 64 insertions(+), 30 deletions(-)
>>>>> ---
>>>>> base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
>>>>> change-id: 20250314-rfc2203-seqnum-cache-52389d14f567
>>>>>
>>>>> Best regards,
>>>>
>>>> This seems like a sensible thing to do to me.
>>>>
>>>> Acked-by: Chuck Lever <chuck.lever@oracle.com>
>>>>
>>>> -- 
>>>> Chuck Lever
>>>
>>> Hi,
>>>
>>> We've been running this patch for a while now and noticed a (very silly
>>> in hindsight) bug.
>>>
>>> maj_stat = gss_validate_seqno_mic(ctx, task->tk_rqstp->rq_seqnos[i], seq, p, len);
>>>
>>> needs to be
>>>
>>> maj_stat = gss_validate_seqno_mic(ctx, task->tk_rqstp->rq_seqnos[i++], seq, p, len);
>>>
>>> Or the kernel gets stuck in a loop when you have more than two retries.
>>> I can resend this patch but I noticed it's already made its way into
>>> quite a few trees. Should this be a separate patch instead?
>>
>> The course of action depends on what trees you found the patch in.
>>
>>
>> -- 
>> Chuck Lever
> 
> It shows up here, so I think it's in v6.16-rc1 already.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.16-rc1&id=08d6ee6d8a10aef958c2af16bb121070290ed589

In that case, post your fix delta To: Anna, Cc: linux-nfs@ and she will
apply it for a subsequent upstream pull request for v6.16-rc.


-- 
Chuck Lever

      reply	other threads:[~2025-06-11 19:11 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-19 17:02 [PATCH v2 0/2] fix gss seqno handling to be more rfc-compliant Nikhil Jha via B4 Relay
2025-03-19 17:02 ` [PATCH v2 1/2] sunrpc: implement rfc2203 rpcsec_gss seqnum cache Nikhil Jha via B4 Relay
2025-03-19 17:02 ` [PATCH v2 2/2] sunrpc: don't immediately retransmit on seqno miss Nikhil Jha via B4 Relay
2025-03-20 13:16 ` [PATCH v2 0/2] fix gss seqno handling to be more rfc-compliant Chuck Lever
2025-06-11 18:50   ` Nikhil Jha
2025-06-11 18:54     ` Chuck Lever
2025-06-11 19:05       ` Nikhil Jha
2025-06-11 19:10         ` Chuck Lever [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d277b990-fa08-47b1-a827-a0e81e9d2ca8@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=Dai.Ngo@oracle.com \
    --cc=anna@kernel.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=jlayton@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=neilb@suse.de \
    --cc=netdev@vger.kernel.org \
    --cc=njha@janestreet.com \
    --cc=okorniev@redhat.com \
    --cc=pabeni@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=tom@talpey.com \
    --cc=trondmy@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).