Re: Handling of BADSESSON error

From: Trond Myklebust <trondmy@hammerspace.com>
To: "aglo@umich.edu" <aglo@umich.edu>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: Handling of BADSESSON error
Date: Wed, 14 Jun 2023 21:43:49 +0000
Message-ID: <ec5c685da9f44197f38e8b2e7514ca495b078a5e.camel@hammerspace.com>
In-Reply-To: <CAN-5tyGD2NCwgsUmaOVjhxjdtjxBng_LyjY1k5ap0qP0w+bxdg@mail.gmail.com>

On Wed, 2023-06-14 at 15:58 -0400, Olga Kornievskaia wrote:
> Hi Trond,
> 
> I'm looking for advice on how to handle the following problem: when
> BADSESSION is received (on an interrupted slot), we don't increment
> the seqid for that slot. The client releases the slot, and it's
> possible for another thread to use it before the session is frozen.
> Here are the (unfiltered, sequential) tracepoints showing the problem.
> Follow slot_nr=0 and seq_nr=7673.
> 
>    kworker/u2:26-541     [000] .....   869.508658:
> nfs4_sequence_done:
> error=-10052 (BADSESSION) session=0x90caa481 slot_nr=4 seq_nr=4259
> highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
>    kworker/u2:26-541     [000] .....   869.508661: nfs4_write:
> error=-10052 (BADSESSION) fileid=00:3b:111 fhandle=0x59c8ccff
> offset=2304664 count=7992 res=0 stateid=1:0x3f4f04cd
> layoutstateid=0:0x00000000
>     kworker/u2:1-3198    [000] .....   869.508898: nfs4_xdr_status:
> task:0000a2ae@00000011 xid=0x5d0f6dda error=-10052 (BADSESSION)
> operation=53
>     kworker/u2:1-3198    [000] .....   869.508905:
> nfs4_sequence_done:
> error=-10052 (BADSESSION) session=0x90caa481 slot_nr=0 seq_nr=7673
> highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
>               dt-3684    [000] .....   869.508918: nfs4_set_lock:
> error=-10052 (BADSESSION) cmd=SETLK:WRLCK range=1603340:1834535
> fileid=00:3b:109 fhandle=0x7c6bc6b4 stateid=1:0x8f5f1fe4
> lockstateid=0:0x7bd5c66f
> 
> *** This is the use of slot_nr=0 seq_nr=7673 that gets BADSESSION. The
> slot gets released without incrementing the seq#. The next tracepoint
> shows the slot being used again by another lock call ***
> 
>     kworker/u2:1-3198    [000] .....   869.508928:
> nfs4_setup_sequence: session=0x90caa481 slot_nr=0 seq_nr=7673
> highest_used_slotid=1
>    kworker/u2:29-549     [000] .....   869.509746:
> nfs4_sequence_done:
> error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7673
> highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
>               dt-3672    [000] .....   869.509770: nfs4_set_lock:
> error=0 (OK) cmd=SETLK:WRLCK range=146432:159743 fileid=00:3b:129
> fhandle=0x50fa2dd4 stateid=1:0xcf065b31 lockstateid=1:0x5c571804
>    kworker/u2:26-541     [000] .....   869.509814:
> nfs4_setup_sequence: session=0x90caa481 slot_nr=0 seq_nr=7674
> highest_used_slotid=0
>    kworker/u2:26-541     [000] .....   869.509857:
> nfs4_setup_sequence: session=0x90caa481 slot_nr=1 seq_nr=7805
> highest_used_slotid=1
> 
> ** Finally the state manager gets to run? But only after 3 "NEW" uses
> of slots are done **
> 
>  172.28.68.180-m-3751    [000] .....   869.510267: nfs4_state_mgr:
> hostname=172.28.68.180 clp state=MANAGER_RUNNING|CHECK_LEASE|0xc040
>    kworker/u2:29-549     [000] .....   869.510977: nfs4_xdr_status:
> task:0000a2c8@00000011 xid=0x5e0f6dda error=-10052 (BADSESSION)
> operation=53
>    kworker/u2:29-549     [000] .....   869.510983:
> nfs4_sequence_done:
> error=-10052 (BADSESSION) session=0x90caa481 slot_nr=1 seq_nr=7805
> highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
>    kworker/u2:29-549     [000] .....   869.510985: nfs4_write:
> error=-10052 (BADSESSION) fileid=00:3b:129 fhandle=0x50fa2dd4
> offset=146432 count=13312 res=0 stateid=1:0xcf065b31
> layoutstateid=0:0x00000000
>    kworker/u2:26-541     [000] .....   869.511318:
> nfs4_sequence_done:
> error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7674
> highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
>               dt-3669    [000] .....   869.511337: nfs4_set_lock:
> error=0 (OK) cmd=SETLK:WRLCK range=2462720:2469375 fileid=00:3b:138
> fhandle=0xe30d8cf3 stateid=1:0xe2787aa1 lockstateid=1:0x216421fe
>  172.28.68.180-m-3751    [000] .....   869.511918:
> nfs4_destroy_session: error=0 (OK) dstaddr=172.28.68.180
>  172.28.68.180-m-3751    [000] .....   869.513347:
> nfs4_create_session: error=0 (OK) dstaddr=172.28.68.180
> 
> To prevent reuse of the same slot/seqid when we receive
> BADSESSION, can we perhaps set slot->seq_done? Then, when
> nfs41_sequence_process() calls nfs41_sequence_free_slot(), it'd
> increment seq_nr. Slot re-use would be prevented.
> 
> Or, perhaps we set the NFS4_SLOT_TBL_DRAINING bit right in
> nfs41_sequence_process() for BADSESSION so that nothing else can get
> the slot when it's released?
> 
> Or some other way of preventing slots from being (re)used after
> receiving BADSESSION on that slot. The problem with re-using
> (interrupted) slots is that they get a cached reply from the server,
> and those operations "think" the operation succeeded and end up with
> wrong/invalid stateids, for instance.
> 
> Here's the sequence of events. First of all, this is a session
> trunking scenario where one of the servers leaves the group.
> 1. NFS OP uses slot=0 seq=0 and is sent to server 1. Server 1
>    processes the request and populates its session cache, but the
>    reply never reaches the client. The connection gets reset.
> 2. NFS OP is resent using slot=0 seq=0 to server 2, which just left
>    the trunking group. It replies with BADSESSION.
> 3. (The session is not frozen on the client yet.) A new NFS OP uses
>    slot=0 seq=0 and is sent to server 1. Server 1 responds out of
>    the session cache.
> 4. The client destroys the session.
> 5. The client uses the stateid returned from the new OP, which is
>    really invalid for the operation. The server fails the operation,
>    and an application failure occurs.
> 
> Thank you..

I suggest just adding a call along the lines of

	set_bit(NFS4_SLOT_TBL_DRAINING, &session->fc_slot_table.slot_tbl_state);

immediately before the call to nfs4_schedule_session_recovery() in
nfs41_sequence_process(). That ought to be race-free because we should
still be holding the slot. It won't try to do any of the other fancy
stuff in nfs4_drain_slot_tbl(). All that will happen is that
nfs4_setup_sequence() will stop allocating new unprivileged slots.
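
To illustrate the effect, here is a rough user-space sketch of the
idea. It is not kernel code: every model_* name and MODEL_* constant is
made up for the illustration, the slot table is reduced to two entries,
and locking and the RPC wait queues are left out entirely. It only
shows why setting the draining bit while the slot is still held keeps a
second unprivileged caller from picking up the same slot/seq_nr pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model. The names echo fs/nfs, but none of this
 * is the kernel's code; locking and RPC wait queues are left out. */
#define MODEL_ERR_BADSESSION (-10052)   /* value as shown in the traces */
#define MODEL_TBL_DRAINING   (1UL << 0) /* stands in for NFS4_SLOT_TBL_DRAINING */
#define MODEL_NR_SLOTS       2

struct model_slot {
	unsigned int seq_nr;
	bool in_use;
};

struct model_slot_table {
	unsigned long state;
	struct model_slot slots[MODEL_NR_SLOTS];
};

/* Stand-in for nfs4_setup_sequence(): while the table is draining,
 * unprivileged callers get no slot. */
static struct model_slot *model_setup_sequence(struct model_slot_table *tbl,
					       bool privileged)
{
	if ((tbl->state & MODEL_TBL_DRAINING) && !privileged)
		return NULL;
	for (size_t i = 0; i < MODEL_NR_SLOTS; i++) {
		if (!tbl->slots[i].in_use) {
			tbl->slots[i].in_use = true;
			return &tbl->slots[i];
		}
	}
	return NULL;
}

/* Stand-in for the completion path. On success the seq_nr advances.
 * On BADSESSION it does not, and (optionally) the draining bit is set
 * before the slot is released. */
static void model_sequence_done(struct model_slot_table *tbl,
				struct model_slot *slot, int error,
				bool drain_on_badsession)
{
	assert(slot->in_use);
	if (error == 0)
		slot->seq_nr++;
	else if (error == MODEL_ERR_BADSESSION && drain_on_badsession)
		tbl->state |= MODEL_TBL_DRAINING; /* slot is still held here */
	slot->in_use = false;
}

/* Returns the seq_nr a second unprivileged caller would send after the
 * first caller got BADSESSION on slot 0, or -1 if it gets no slot. */
static long model_run_scenario(bool drain_on_badsession)
{
	struct model_slot_table tbl = {
		.state = 0,
		.slots = { { .seq_nr = 7673 }, { .seq_nr = 7805 } },
	};
	struct model_slot *first = model_setup_sequence(&tbl, false);

	model_sequence_done(&tbl, first, MODEL_ERR_BADSESSION,
			    drain_on_badsession);

	struct model_slot *second = model_setup_sequence(&tbl, false);
	return second ? (long)second->seq_nr : -1;
}
```

In this model, model_run_scenario(false) hands the second caller slot 0
with seq_nr 7673 again, which is the reuse visible in the traces, while
model_run_scenario(true) returns -1 because the table is already marked
as draining by the time the slot is released.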

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
