Date: Wed, 25 Jun 2025 20:56:26 +0200 (CEST)
From: "Mkrtchyan, Tigran"
To: Trond Myklebust
Cc: linux-nfs
Message-ID: <166279785.14465613.1750877786582.JavaMail.zimbra@desy.de>
Subject: Re: [PATCH] NFSv4/flexfiles: Fix handling of NFS level errors in I/O
X-Mailing-List: linux-nfs@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

(Second attempt, after linux-nfs@vger.kernel.org rejected the first one.)

Splitting RPC-level errors from the NFS errors returned by the DS looks
reasonable to me. So, if it helps:

Reviewed-by: Tigran Mkrtchyan

Tigran.

----- Original Message -----
> From: "Trond Myklebust"
> To: "linux-nfs"
> Sent: Wednesday, 25 June, 2025 01:17:12
> Subject: [PATCH] NFSv4/flexfiles: Fix handling of NFS level errors in I/O

> From: Trond Myklebust
>
> Allow the flexfiles error handling to recognise NFS level errors (as
> opposed to RPC level errors) and handle them separately. The main
> motivator is the NFSERR_PERM errors that get returned if the NFS client
> connects to the data server through a port number that is lower than
> 1024. In that case, the client should disconnect and retry a READ on a
> different data server, or it should retry a WRITE after reconnecting.
>
> Signed-off-by: Trond Myklebust
> ---
>  fs/nfs/flexfilelayout/flexfilelayout.c | 118 ++++++++++++++++++-------
>  1 file changed, 84 insertions(+), 34 deletions(-)
>
> diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c
> b/fs/nfs/flexfilelayout/flexfilelayout.c
> index df4807460596..4bea008dbebd 100644
> --- a/fs/nfs/flexfilelayout/flexfilelayout.c
> +++ b/fs/nfs/flexfilelayout/flexfilelayout.c
> @@ -1105,6 +1105,7 @@ static void ff_layout_reset_read(struct nfs_pgio_header
> *hdr)
>  }
>
>  static int ff_layout_async_handle_error_v4(struct rpc_task *task,
> +					   u32 op_status,
>  					   struct nfs4_state *state,
>  					   struct nfs_client *clp,
>  					   struct pnfs_layout_segment *lseg,
> @@ -1115,34 +1116,42 @@ static int ff_layout_async_handle_error_v4(struct
> rpc_task *task,
>  	struct nfs4_deviceid_node *devid = FF_LAYOUT_DEVID_NODE(lseg, idx);
>  	struct nfs4_slot_table *tbl = &clp->cl_session->fc_slot_table;
>
> -	switch (task->tk_status) {
> -	case -NFS4ERR_BADSESSION:
> -	case -NFS4ERR_BADSLOT:
> -	case -NFS4ERR_BAD_HIGH_SLOT:
> -	case -NFS4ERR_DEADSESSION:
> -	case -NFS4ERR_CONN_NOT_BOUND_TO_SESSION:
> -	case -NFS4ERR_SEQ_FALSE_RETRY:
> -	case -NFS4ERR_SEQ_MISORDERED:
> +	switch (op_status) {
> +	case NFS4_OK:
> +	case NFS4ERR_NXIO:
> +		break;
> +	case NFSERR_PERM:
> +		if (!task->tk_xprt)
> +			break;
> +		xprt_force_disconnect(task->tk_xprt);
> +		goto out_retry;
> +	case NFS4ERR_BADSESSION:
> +	case NFS4ERR_BADSLOT:
> +	case NFS4ERR_BAD_HIGH_SLOT:
> +	case NFS4ERR_DEADSESSION:
> +	case NFS4ERR_CONN_NOT_BOUND_TO_SESSION:
> +	case NFS4ERR_SEQ_FALSE_RETRY:
> +	case NFS4ERR_SEQ_MISORDERED:
>  		dprintk("%s ERROR %d, Reset session. Exchangeid "
>  			"flags 0x%x\n", __func__, task->tk_status,
>  			clp->cl_exchange_flags);
>  		nfs4_schedule_session_recovery(clp->cl_session, task->tk_status);
> -		break;
> -	case -NFS4ERR_DELAY:
> +		goto out_retry;
> +	case NFS4ERR_DELAY:
>  		nfs_inc_stats(lseg->pls_layout->plh_inode, NFSIOS_DELAY);
>  		fallthrough;
> -	case -NFS4ERR_GRACE:
> +	case NFS4ERR_GRACE:
>  		rpc_delay(task, FF_LAYOUT_POLL_RETRY_MAX);
> -		break;
> -	case -NFS4ERR_RETRY_UNCACHED_REP:
> -		break;
> +		goto out_retry;
> +	case NFS4ERR_RETRY_UNCACHED_REP:
> +		goto out_retry;
>  	/* Invalidate Layout errors */
> -	case -NFS4ERR_PNFS_NO_LAYOUT:
> -	case -ESTALE: /* mapped NFS4ERR_STALE */
> -	case -EBADHANDLE: /* mapped NFS4ERR_BADHANDLE */
> -	case -EISDIR: /* mapped NFS4ERR_ISDIR */
> -	case -NFS4ERR_FHEXPIRED:
> -	case -NFS4ERR_WRONG_TYPE:
> +	case NFS4ERR_PNFS_NO_LAYOUT:
> +	case NFS4ERR_STALE:
> +	case NFS4ERR_BADHANDLE:
> +	case NFS4ERR_ISDIR:
> +	case NFS4ERR_FHEXPIRED:
> +	case NFS4ERR_WRONG_TYPE:
>  		dprintk("%s Invalid layout error %d\n", __func__,
>  			task->tk_status);
>  		/*
> @@ -1155,6 +1164,11 @@ static int ff_layout_async_handle_error_v4(struct
> rpc_task *task,
>  		pnfs_destroy_layout(NFS_I(inode));
>  		rpc_wake_up(&tbl->slot_tbl_waitq);
>  		goto reset;
> +	default:
> +		break;
> +	}
> +
> +	switch (task->tk_status) {
>  	/* RPC connection errors */
>  	case -ENETDOWN:
>  	case -ENETUNREACH:
> @@ -1174,27 +1188,56 @@ static int ff_layout_async_handle_error_v4(struct
> rpc_task *task,
>  		nfs4_delete_deviceid(devid->ld, devid->nfs_client,
>  				&devid->deviceid);
>  		rpc_wake_up(&tbl->slot_tbl_waitq);
> -		fallthrough;
> +		break;
>  	default:
> -		if (ff_layout_avoid_mds_available_ds(lseg))
> -			return -NFS4ERR_RESET_TO_PNFS;
> -reset:
> -		dprintk("%s Retry through MDS. Error %d\n", __func__,
> -			task->tk_status);
> -		return -NFS4ERR_RESET_TO_MDS;
> +		break;
>  	}
> +
> +	if (ff_layout_avoid_mds_available_ds(lseg))
> +		return -NFS4ERR_RESET_TO_PNFS;
> +reset:
> +	dprintk("%s Retry through MDS. Error %d\n", __func__,
> +		task->tk_status);
> +	return -NFS4ERR_RESET_TO_MDS;
> +
> +out_retry:
>  	task->tk_status = 0;
>  	return -EAGAIN;
>  }
>
>  /* Retry all errors through either pNFS or MDS except for -EJUKEBOX */
>  static int ff_layout_async_handle_error_v3(struct rpc_task *task,
> +					   u32 op_status,
>  					   struct nfs_client *clp,
>  					   struct pnfs_layout_segment *lseg,
>  					   u32 idx)
>  {
>  	struct nfs4_deviceid_node *devid = FF_LAYOUT_DEVID_NODE(lseg, idx);
>
> +	switch (op_status) {
> +	case NFS_OK:
> +	case NFSERR_NXIO:
> +		break;
> +	case NFSERR_PERM:
> +		if (!task->tk_xprt)
> +			break;
> +		xprt_force_disconnect(task->tk_xprt);
> +		goto out_retry;
> +	case NFSERR_ACCES:
> +	case NFSERR_BADHANDLE:
> +	case NFSERR_FBIG:
> +	case NFSERR_IO:
> +	case NFSERR_NOSPC:
> +	case NFSERR_ROFS:
> +	case NFSERR_STALE:
> +		goto out_reset_to_pnfs;
> +	case NFSERR_JUKEBOX:
> +		nfs_inc_stats(lseg->pls_layout->plh_inode, NFSIOS_DELAY);
> +		goto out_retry;
> +	default:
> +		break;
> +	}
> +
>  	switch (task->tk_status) {
>  	/* File access problems. Don't mark the device as unavailable */
>  	case -EACCES:
> @@ -1218,6 +1261,7 @@ static int ff_layout_async_handle_error_v3(struct rpc_task
> *task,
>  		nfs4_delete_deviceid(devid->ld, devid->nfs_client,
>  				&devid->deviceid);
>  	}
> +out_reset_to_pnfs:
>  	/* FIXME: Need to prevent infinite looping here. */
>  	return -NFS4ERR_RESET_TO_PNFS;
>  out_retry:
> @@ -1228,6 +1272,7 @@ static int ff_layout_async_handle_error_v3(struct rpc_task
> *task,
>  }
>
>  static int ff_layout_async_handle_error(struct rpc_task *task,
> +					u32 op_status,
>  					struct nfs4_state *state,
>  					struct nfs_client *clp,
>  					struct pnfs_layout_segment *lseg,
> @@ -1246,10 +1291,11 @@ static int ff_layout_async_handle_error(struct rpc_task
> *task,
>
>  	switch (vers) {
>  	case 3:
> -		return ff_layout_async_handle_error_v3(task, clp, lseg, idx);
> -	case 4:
> -		return ff_layout_async_handle_error_v4(task, state, clp,
> +		return ff_layout_async_handle_error_v3(task, op_status, clp,
>  						       lseg, idx);
> +	case 4:
> +		return ff_layout_async_handle_error_v4(task, op_status, state,
> +						       clp, lseg, idx);
>  	default:
>  		/* should never happen */
>  		WARN_ON_ONCE(1);
> @@ -1302,6 +1348,7 @@ static void ff_layout_io_track_ds_error(struct
> pnfs_layout_segment *lseg,
>  	switch (status) {
>  	case NFS4ERR_DELAY:
>  	case NFS4ERR_GRACE:
> +	case NFS4ERR_PERM:
>  		break;
>  	case NFS4ERR_NXIO:
>  		ff_layout_mark_ds_unreachable(lseg, idx);
> @@ -1334,7 +1381,8 @@ static int ff_layout_read_done_cb(struct rpc_task *task,
>  		trace_ff_layout_read_error(hdr, task->tk_status);
>  	}
>
> -	err = ff_layout_async_handle_error(task, hdr->args.context->state,
> +	err = ff_layout_async_handle_error(task, hdr->res.op_status,
> +					   hdr->args.context->state,
>  					   hdr->ds_clp, hdr->lseg,
>  					   hdr->pgio_mirror_idx);
>
> @@ -1507,7 +1555,8 @@ static int ff_layout_write_done_cb(struct rpc_task *task,
>  		trace_ff_layout_write_error(hdr, task->tk_status);
>  	}
>
> -	err = ff_layout_async_handle_error(task, hdr->args.context->state,
> +	err = ff_layout_async_handle_error(task, hdr->res.op_status,
> +					   hdr->args.context->state,
>  					   hdr->ds_clp, hdr->lseg,
>  					   hdr->pgio_mirror_idx);
>
> @@ -1556,8 +1605,9 @@ static int ff_layout_commit_done_cb(struct rpc_task *task,
>  		trace_ff_layout_commit_error(data, task->tk_status);
>  	}
>
> -	err = ff_layout_async_handle_error(task, NULL, data->ds_clp,
> -					   data->lseg, data->ds_commit_index);
> +	err = ff_layout_async_handle_error(task, data->res.op_status,
> +					   NULL, data->ds_clp, data->lseg,
> +					   data->ds_commit_index);
>
>  	trace_nfs4_pnfs_commit_ds(data, err);
>  	switch (err) {
> --
> 2.49.0
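
For anyone skimming the thread, the control flow the patch introduces can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not the kernel code: every sk_* / SK_* name is hypothetical, the status set is reduced to a handful of representative cases, the numeric values are placeholders for the sketch, and "connected" merely stands in for the effect of xprt_force_disconnect(). What it tries to show is the ordering: the NFS-level status from the data server's reply is examined first, and only if that does not decide the outcome does the RPC-level tk_status get a look.

```c
#include <errno.h>

/* Hypothetical NFS-level status codes (placeholder values for this sketch). */
enum sk_op_status { SK_OP_OK = 0, SK_OP_PERM = 1, SK_OP_STALE = 70, SK_OP_DELAY = 10008 };

/* Hypothetical outcomes, standing in for -EAGAIN and the RESET_TO_* returns. */
enum sk_action { SK_ACT_RETRY, SK_ACT_RESET_TO_PNFS, SK_ACT_RESET_TO_MDS };

/* Hypothetical, much-reduced stand-in for struct rpc_task. */
struct sk_task {
	int tk_status;   /* RPC-level status, a negative errno */
	int connected;   /* 1 while the transport is up */
	int ds_marked;   /* 1 once the data server was marked unavailable */
};

static enum sk_action sk_handle_error(struct sk_task *task,
				      unsigned int op_status,
				      int other_ds_available)
{
	/* Stage 1: NFS-level status returned by the data server. */
	switch (op_status) {
	case SK_OP_OK:
		break;                  /* nothing decided, look at the RPC status */
	case SK_OP_PERM:
		task->connected = 0;    /* models forcing a disconnect */
		goto out_retry;
	case SK_OP_DELAY:
		goto out_retry;
	case SK_OP_STALE:
		goto reset;             /* layout is no good, go through the MDS */
	default:
		break;
	}

	/* Stage 2: RPC-level (transport) status. */
	switch (task->tk_status) {
	case -ECONNREFUSED:
	case -ENETUNREACH:
		task->ds_marked = 1;    /* models dropping the device id */
		break;
	default:
		break;
	}

	if (other_ds_available)
		return SK_ACT_RESET_TO_PNFS;
reset:
	return SK_ACT_RESET_TO_MDS;

out_retry:
	task->tk_status = 0;
	return SK_ACT_RETRY;
}
```

With this model, a reply carrying SK_OP_PERM drops the connection and retries regardless of tk_status, while a clean NFS-level status combined with -ECONNREFUSED marks the data server and falls back to another data server or to the MDS.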