* [PATCH 0/1] pNFS/flexfiles: mark device unavailable on fatal connection error
From: Tigran Mkrtchyan @ 2025-06-09 21:43 UTC
To: linux-nfs; +Cc: Trond Myklebust, Anna Schumaker, Tigran Mkrtchyan

As mentioned in the thread

https://lore.kernel.org/linux-nfs/601285843.50695650.1748800817824.JavaMail.zimbra@desy.de/T/#u

we observe that interrupted batch processing jobs put the client into an
unrecoverable state that requires a reboot of the client host. I was finally
able to build a custom kernel with all the required third-party drivers to
test my assumption, and marking the pNFS device unavailable does indeed fix
the issue.

Please consider the proposed change and backporting it to older kernels.

I did testing with (which is not part of the patch) and will try to add a
trace point as soon as I find out how to implement one.

Tigran Mkrtchyan (1):
  pNFS/flexfiles: mark device unavailable on fatal connection error

 fs/nfs/flexfilelayout/flexfilelayoutdev.c | 4 ++++
 1 file changed, 4 insertions(+)

-- 
2.49.0

* [PATCH 1/1] pNFS/flexfiles: mark device unavailable on fatal connection error
From: Tigran Mkrtchyan @ 2025-06-09 21:43 UTC
To: linux-nfs; +Cc: Trond Myklebust, Anna Schumaker, Tigran Mkrtchyan

Fixes: 260f32adb88 ("pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connect")

When an application gets killed (SIGTERM/SIGINT) while the pNFS client is
connecting to a DS, the client ends up in an infinite connect-disconnect
loop. The source of the issue is that
flexfilelayoutdev#nfs4_ff_layout_prepare_ds gets an error from
nfs4_pnfs_ds_connect with status ERESTARTSYS, which is set by
rpc_signal_task, but the error is treated as transient and is therefore
retried.

The issue is reproducible with the following script (there should be ~1000
files in the directory, and the client must not have any connections to
DSes):

```
echo 3 > /proc/sys/vm/drop_caches

for i in *
do
    head -1 "$i" &
    PP=$!
    sleep 10e-03
    kill -TERM "$PP"
done
```

Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
---
 fs/nfs/flexfilelayout/flexfilelayoutdev.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
index 4a304cf17c4b..0008a8180c9b 100644
--- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c
+++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
@@ -410,6 +410,10 @@ nfs4_ff_layout_prepare_ds(struct pnfs_layout_segment *lseg,
 			mirror->mirror_ds->ds_versions[0].wsize = max_payload;
 		goto out;
 	}
+	/* There is a fatal error to connect to DS. Mark it unavailable to avoid infinite retry loop. */
+	if (nfs_error_is_fatal(status))
+		nfs4_mark_deviceid_unavailable(&mirror->mirror_ds->id_node);
+
 noconnect:
 	ff_layout_track_ds_error(FF_LAYOUT_FROM_HDR(lseg->pls_layout),
 				 mirror, lseg->pls_range.offset,
-- 
2.49.0

* Re: [PATCH 1/1] pNFS/flexfiles: mark device unavailable on fatal connection error
From: Mkrtchyan, Tigran @ 2025-06-25 19:19 UTC
To: linux-nfs; +Cc: Trond Myklebust, Anna Schumaker

Hi Folks,

Do you have any opinion on this one? Would you like me to address it
differently?

Tigran.

* Re: [PATCH 1/1] pNFS/flexfiles: mark device unavailable on fatal connection error
From: Trond Myklebust @ 2025-06-25 19:39 UTC
To: Mkrtchyan, Tigran, linux-nfs; +Cc: Anna Schumaker

On Wed, 2025-06-25 at 21:19 +0200, Mkrtchyan, Tigran wrote:
> Hi Folks,
>
> Do you have any opinion on this one? Would you like me to address it
> differently?

I don't think we should mark the device as being unavailable just
because someone signalled the RPC task.

It would be better to have nfs4_ff_layout_prepare_ds() return any fatal
errors that it encounters using ERR_PTR(), so that the callers can
handle them. Then maybe return ERR_PTR(-EAGAIN) for the case where we
currently return NULL so that those callers don't have to use the hated
IS_ERR_OR_NULL() test.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trondmy@kernel.org, trond.myklebust@hammerspace.com

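The return convention suggested above can be sketched in userspace C. This is
only an illustration under stated assumptions, not the follow-up patch:
ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented here as stand-ins for the
kernel helpers of the same name, and demo_ds, demo_error_is_fatal(),
demo_prepare_ds() and demo_do_io() are hypothetical names. The classifier only
covers ERESTARTSYS, the status named in this thread, and the connect result is
passed in as a parameter instead of calling nfs4_pnfs_ds_connect().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(intptr_t error) { return (void *)error; }
static inline intptr_t PTR_ERR(const void *ptr) { return (intptr_t)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* ERESTARTSYS is kernel-internal and is not exported by userspace <errno.h>. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical stand-in for the data server object. */
struct demo_ds {
	int id;
};

/* Hypothetical classifier: only the status discussed in the thread. */
static bool demo_error_is_fatal(int status)
{
	return status == -ERESTARTSYS;
}

/*
 * Hypothetical model of the suggested convention: never return NULL.
 * connect_status simulates the result of the connect step
 * (0 on success, negative errno on failure).
 */
static struct demo_ds *demo_prepare_ds(struct demo_ds *ds, int connect_status)
{
	assert(connect_status <= 0);

	if (connect_status == 0)
		return ds;
	if (demo_error_is_fatal(connect_status))
		return ERR_PTR(connect_status);	/* caller must give up */
	return ERR_PTR(-EAGAIN);		/* caller may retry */
}

/* Caller side: a single IS_ERR() test, no IS_ERR_OR_NULL() needed. */
static int demo_do_io(struct demo_ds *ds, int connect_status)
{
	struct demo_ds *res = demo_prepare_ds(ds, connect_status);

	if (IS_ERR(res))
		return (int)PTR_ERR(res);
	return 0;
}
```

With this shape a signalled task surfaces -ERESTARTSYS to the caller without
the device being marked unavailable, while any other connect failure in the
sketch is reported as -EAGAIN so the caller can decide whether to retry.
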
* Re: [PATCH 1/1] pNFS/flexfiles: mark device unavailable on fatal connection error
From: Mkrtchyan, Tigran @ 2025-06-26 9:17 UTC
To: Trond Myklebust; +Cc: linux-nfs, Anna Schumaker

I posted a different patch with the suggested approach.

Tigran.
