Linux NFS development
From: Trond Myklebust <trondmy@hammerspace.com>
To: "rbergant@redhat.com" <rbergant@redhat.com>,
	"chuck.lever@oracle.com" <chuck.lever@oracle.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"anna.schumaker@netapp.com" <anna.schumaker@netapp.com>
Subject: Re: [PATCH] pNFS: make DS availability problems visible in log
Date: Thu, 4 Mar 2021 20:35:04 +0000
Message-ID: <9e98d684c6ecdbb0395beb66a0bc694c4ca870c8.camel@hammerspace.com>
In-Reply-To: <FBBBDFDD-6819-450C-879D-0B11B917BD10@oracle.com>

On Thu, 2021-03-04 at 20:20 +0000, Chuck Lever wrote:
> Hello Roberto!
> 
> > On Mar 3, 2021, at 10:33 AM, Roberto Bergantinos Corpas <
> > rbergant@redhat.com> wrote:
> > 
> > It would be useful to promote DS availability logging out of debug,
> > so that we are more aware when I/O is diverted to the MDS and some
> > part of the infrastructure has failed.
> > 
> > Also added logging for failed DS connection attempts.
> 
> Given that this enables remote system behavior to generate
> kernel log traffic that can fill the local root partition,
> I'd like to see either:
> 
> - the explicit use of rate limiting, or
> 
> - these dprintks replaced with tracepoints

I cannot accept a pr_warn(), even a rate limited one, for a timeout
error or for a connection error in the data path. Those are just too
nasty to deal with in a syslog.

Tracepoints would be acceptable.
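
As a rough illustration of the first option raised in the quoted review
(explicit rate limiting), the sketch below models an interval/burst
limiter in userspace C. It is loosely patterned on the idea behind the
kernel's pr_warn_ratelimited(), but struct rl_state, rl_allow() and
log_ds_error() are hypothetical names invented for this example and are
not kernel APIs. Time is passed in explicitly so the behaviour is easy
to check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model; NOT the kernel's struct ratelimit_state. */
struct rl_state {
	long interval;	/* window length, in seconds */
	int burst;	/* messages allowed per window */
	long begin;	/* start time of the current window */
	int printed;	/* messages emitted in the current window */
	int missed;	/* messages suppressed in the current window */
};

/* Return true if a message may be emitted at time `now` (seconds). */
static bool rl_allow(struct rl_state *rs, long now)
{
	assert(rs != NULL);

	if (now - rs->begin >= rs->interval) {
		/* The window has expired: open a new one, reset counters. */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}

/* Log a DS connection error unless the limiter suppresses it. */
static bool log_ds_error(struct rl_state *rs, long now, int err)
{
	if (!rl_allow(rs, now))
		return false;
	fprintf(stderr, "DS connection error %d\n", err);
	return true;
}
```

With interval = 5 and burst = 2, messages at t = 0 and t = 1 get
through, a third at t = 2 is suppressed, and logging resumes at t = 5.
Even with such a cap the messages still land in syslog, which is the
objection raised above; a tracepoint emits nothing unless it has been
enabled.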

> 
> 
> > Signed-off-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
> > ---
> > fs/nfs/filelayout/filelayout.c         | 4 ++--
> > fs/nfs/flexfilelayout/flexfilelayout.c | 6 +++---
> > fs/nfs/pnfs_nfs.c                      | 6 +++++-
> > 3 files changed, 10 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/nfs/filelayout/filelayout.c b/fs/nfs/filelayout/filelayout.c
> > index 7f5aa0403e16..fef2d31a501a 100644
> > --- a/fs/nfs/filelayout/filelayout.c
> > +++ b/fs/nfs/filelayout/filelayout.c
> > @@ -181,7 +181,7 @@ static int filelayout_async_handle_error(struct rpc_task *task,
> >         case -EIO:
> >         case -ETIMEDOUT:
> >         case -EPIPE:
> > -               dprintk("%s DS connection error %d\n", __func__,
> > +               pr_warn("%s DS connection error %d\n", __func__,
> >                         task->tk_status);
> >                 nfs4_mark_deviceid_unavailable(devid);
> >                 pnfs_error_mark_layout_for_return(inode, lseg);
> > @@ -190,7 +190,7 @@ static int filelayout_async_handle_error(struct rpc_task *task,
> >                 fallthrough;
> >         default:
> > reset:
> > -               dprintk("%s Retry through MDS. Error %d\n", __func__,
> > +               pr_warn("%s Retry through MDS. Error %d\n", __func__,
> >                         task->tk_status);
> >                 return -NFS4ERR_RESET_TO_MDS;
> >         }
> > diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
> > index a163533446fa..7150d94e80e6 100644
> > --- a/fs/nfs/flexfilelayout/flexfilelayout.c
> > +++ b/fs/nfs/flexfilelayout/flexfilelayout.c
> > @@ -1129,7 +1129,7 @@ static int ff_layout_async_handle_error_v4(struct rpc_task *task,
> >         case -EIO:
> >         case -ETIMEDOUT:
> >         case -EPIPE:
> > -               dprintk("%s DS connection error %d\n", __func__,
> > +               pr_warn("%s DS connection error %d\n", __func__,
> >                         task->tk_status);
> >                 nfs4_delete_deviceid(devid->ld, devid->nfs_client,
> >                                 &devid->deviceid);
> > @@ -1139,7 +1139,7 @@ static int ff_layout_async_handle_error_v4(struct rpc_task *task,
> >                 if (ff_layout_avoid_mds_available_ds(lseg))
> >                         return -NFS4ERR_RESET_TO_PNFS;
> > reset:
> > -               dprintk("%s Retry through MDS. Error %d\n", __func__,
> > +               pr_warn("%s Retry through MDS. Error %d\n", __func__,
> >                         task->tk_status);
> >                 return -NFS4ERR_RESET_TO_MDS;
> >         }
> > @@ -1167,7 +1167,7 @@ static int ff_layout_async_handle_error_v3(struct rpc_task *task,
> >                 nfs_inc_stats(lseg->pls_layout->plh_inode, NFSIOS_DELAY);
> >                 goto out_retry;
> >         default:
> > -               dprintk("%s DS connection error %d\n", __func__,
> > +               pr_warn("%s DS connection error %d\n", __func__,
> >                         task->tk_status);
> >                 nfs4_delete_deviceid(devid->ld, devid->nfs_client,
> >                                 &devid->deviceid);
> > diff --git a/fs/nfs/pnfs_nfs.c b/fs/nfs/pnfs_nfs.c
> > index 679767ac258d..322661a48348 100644
> > --- a/fs/nfs/pnfs_nfs.c
> > +++ b/fs/nfs/pnfs_nfs.c
> > @@ -934,8 +934,11 @@ static int _nfs4_pnfs_v4_ds_connect(struct nfs_server *mds_srv,
> >                                                 (struct sockaddr *)&da->da_addr,
> >                                                 da->da_addrlen, IPPROTO_TCP,
> >                                                 timeo, retrans, minor_version);
> > -                       if (IS_ERR(clp))
> > +                       if (IS_ERR(clp)) {
> > +                               pr_warn("%s: DS: %s unable to connect with address %s, error: %ld\n",
> > +                                       __func__, ds->ds_remotestr, da->da_remotestr, PTR_ERR(clp));
> >                                 continue;
> > +                       }
> > 
> >                         status = nfs4_init_ds_session(clp,
> >                                         mds_srv->nfs_client->cl_lease_time);
> > @@ -949,6 +952,7 @@ static int _nfs4_pnfs_v4_ds_connect(struct nfs_server *mds_srv,
> >         }
> > 
> >         if (IS_ERR(clp)) {
> > +               pr_warn("%s: no DS available\n", __func__);
> >                 status = PTR_ERR(clp);
> >                 goto out;
> >         }
> > -- 
> > 2.21.0
> > 
> 
> --
> Chuck Lever
> 
> 
> 

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



Thread overview: 6+ messages
2021-03-03 15:33 [PATCH] pNFS: make DS availability problems visible in log Roberto Bergantinos Corpas
2021-03-04 20:20 ` Chuck Lever
2021-03-04 20:35   ` Trond Myklebust [this message]
2021-03-05 12:29     ` Roberto Bergantinos Corpas
2021-03-05 13:42       ` Benjamin Coddington
2021-03-05 15:24         ` Roberto Bergantinos Corpas
