From: Jeff Layton via Bugspray Bot <bugbot@kernel.org>
To: linux-nfs@vger.kernel.org, jlayton@kernel.org,
	chuck.lever@oracle.com,  herzog@phys.ethz.ch, tom@talpey.com,
	carnil@debian.org, anna@kernel.org,
	 benoit.gschwind@minesparis.psl.eu, trondmy@kernel.org,
	 harald.dunkel@aixigo.com, cel@kernel.org,
	baptiste.pellegrin@ac-grenoble.fr
Subject: Re: NFSD threads hang when destroying a session or client ID
Date: Thu, 23 Jan 2025 13:50:22 +0000
Message-ID: <20250123-b219710c18-e354a69e709a@bugzilla.kernel.org>
In-Reply-To: <20250120-b219710c0-da932078cddb@bugzilla.kernel.org>

Jeff Layton writes via Kernel.org Bugzilla:

There is another scenario that could explain a hang here. From nfsd4_cb_sequence_done():

------------------8<---------------------
        case -NFS4ERR_BADSLOT:
                goto retry_nowait;
        case -NFS4ERR_SEQ_MISORDERED:        
                if (session->se_cb_seq_nr[cb->cb_held_slot] != 1) {
                        session->se_cb_seq_nr[cb->cb_held_slot] = 1;
                        goto retry_nowait;     
                }      
                break;
        default:                          
                nfsd4_mark_cb_fault(cb->cb_clp);
        }                       
        trace_nfsd_cb_free_slot(task, cb);
        nfsd41_cb_release_slot(cb);             
                 
        if (RPC_SIGNALLED(task))
                goto need_restart;
out:                  
        return ret;
retry_nowait:
        if (rpc_restart_call_prepare(task))
                ret = false;                
        goto out;
------------------8<---------------------

Since nfsd4_cb_sequence_done() doesn't check RPC_SIGNALLED in the v4.1+ case until very late in the function, it's possible to get a BADSLOT or SEQ_MISORDERED error first. Both of those jump to retry_nowait, which skips the RPC_SIGNALLED check and causes the callback client to immediately resubmit the rpc_task to the RPC engine without resubmitting to the callback workqueue.

I think we should assume that when RPC_SIGNALLED returns true, the result is suspect, and that we should halt further processing of the CB_SEQUENCE response and restart the callback.
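
To make the ordering problem concrete, here is a rough user-space model of the control flow. It is only a sketch, not kernel code and not a patch: struct model_cb, the MODEL_* status values, the cb_outcome enum and both function names are hypothetical stand-ins, and the status-0 path and nfsd4_mark_cb_fault() are left out. The first function follows the ordering in the excerpt above; the second checks the signalled flag before looking at the status at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CB_SEQUENCE status codes (arbitrary values). */
enum { MODEL_OK = 0, MODEL_BADSLOT = 1, MODEL_SEQ_MISORDERED = 2 };

/* What the done-handler decides to do with the callback. */
enum cb_outcome { CB_DONE, CB_RETRY_NOWAIT, CB_NEED_RESTART };

struct model_cb {
	bool signalled;      /* models RPC_SIGNALLED(task) */
	int status;          /* models the CB_SEQUENCE status */
	unsigned int seq_nr; /* models se_cb_seq_nr[cb_held_slot] */
};

/* Ordering as in the quoted excerpt: the status is examined first, so
 * the retry paths return before the signalled flag is ever consulted. */
static enum cb_outcome sequence_done_current(struct model_cb *cb)
{
	assert(cb != NULL);

	switch (cb->status) {
	case MODEL_BADSLOT:
		return CB_RETRY_NOWAIT;
	case MODEL_SEQ_MISORDERED:
		if (cb->seq_nr != 1) {
			cb->seq_nr = 1;
			return CB_RETRY_NOWAIT;
		}
		break;
	default:
		break;
	}

	if (cb->signalled)
		return CB_NEED_RESTART;
	return CB_DONE;
}

/* Suggested ordering: a signalled task is treated as suspect and sent
 * back for a restart before its status is interpreted. */
static enum cb_outcome sequence_done_proposed(struct model_cb *cb)
{
	assert(cb != NULL);

	if (cb->signalled)
		return CB_NEED_RESTART;
	return sequence_done_current(cb);
}
```

In this model, a signalled task that carries a BADSLOT status comes back as CB_RETRY_NOWAIT from the first function and as CB_NEED_RESTART from the second, and the second leaves seq_nr untouched for a signalled SEQ_MISORDERED reply.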

View: https://bugzilla.kernel.org/show_bug.cgi?id=219710#c18
You can reply to this message to join the discussion.
-- 
Deet-doot-dot, I am a bot.
Kernel.org Bugzilla (bugspray 0.1-dev)