linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Gal Pressman <galpress@amazon.com>
Cc: Doug Ledford <dledford@redhat.com>,
	linux-rdma@vger.kernel.org,
	Alexander Matushevsky <matua@amazon.com>,
	Firas JahJah <firasj@amazon.com>,
	Yossi Leybovich <sleybo@amazon.com>
Subject: Re: [PATCH for-next 2/3] RDMA/efa: Count mmap failures
Date: Sun, 26 Apr 2020 10:30:00 -0300	[thread overview]
Message-ID: <20200426133000.GL26002@ziepe.ca> (raw)
In-Reply-To: <523c9dee-cedf-b762-8a68-cd1232e87e48@amazon.com>

On Sun, Apr 26, 2020 at 09:42:27AM +0300, Gal Pressman wrote:
> On 24/04/2020 21:26, Jason Gunthorpe wrote:
> > On Fri, Apr 24, 2020 at 06:25:54PM +0300, Gal Pressman wrote:
> >> On 24/04/2020 17:59, Jason Gunthorpe wrote:
> >>> On Mon, Apr 20, 2020 at 09:22:12AM +0300, Gal Pressman wrote:
> >>>> Add a new stat that counts mmap failures, which might help when
> >>>> debugging different issues.
> >>>>
> >>>> Reviewed-by: Firas JahJah <firasj@amazon.com>
> >>>> Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
> >>>> Signed-off-by: Gal Pressman <galpress@amazon.com>
> >>>>  drivers/infiniband/hw/efa/efa.h       | 3 ++-
> >>>>  drivers/infiniband/hw/efa/efa_verbs.c | 9 +++++++--
> >>>>  2 files changed, 9 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/drivers/infiniband/hw/efa/efa.h b/drivers/infiniband/hw/efa/efa.h
> >>>> index aa7396a1588a..77c9ff798117 100644
> >>>> +++ b/drivers/infiniband/hw/efa/efa.h
> >>>> @@ -1,6 +1,6 @@
> >>>>  /* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
> >>>>  /*
> >>>> - * Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
> >>>> + * Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All rights reserved.
> >>>>   */
> >>>>  
> >>>>  #ifndef _EFA_H_
> >>>> @@ -40,6 +40,7 @@ struct efa_sw_stats {
> >>>>  	atomic64_t reg_mr_err;
> >>>>  	atomic64_t alloc_ucontext_err;
> >>>>  	atomic64_t create_ah_err;
> >>>> +	atomic64_t mmap_err;
> >>>>  };
> >>>>  
> >>>>  /* Don't use anything other than atomic64 */
> >>>> diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
> >>>> index b555845d6c14..75eef1ec2474 100644
> >>>> +++ b/drivers/infiniband/hw/efa/efa_verbs.c
> >>>> @@ -44,7 +44,8 @@ struct efa_user_mmap_entry {
> >>>>  	op(EFA_CREATE_CQ_ERR, "create_cq_err") \
> >>>>  	op(EFA_REG_MR_ERR, "reg_mr_err") \
> >>>>  	op(EFA_ALLOC_UCONTEXT_ERR, "alloc_ucontext_err") \
> >>>> -	op(EFA_CREATE_AH_ERR, "create_ah_err")
> >>>> +	op(EFA_CREATE_AH_ERR, "create_ah_err") \
> >>>> +	op(EFA_MMAP_ERR, "mmap_err")
> >>>>  
> >>>>  #define EFA_STATS_ENUM(ename, name) ename,
> >>>>  #define EFA_STATS_STR(ename, name) [ename] = name,
> >>>> @@ -1569,6 +1570,7 @@ static int __efa_mmap(struct efa_dev *dev, struct efa_ucontext *ucontext,
> >>>>  		ibdev_dbg(&dev->ibdev,
> >>>>  			  "pgoff[%#lx] does not have valid entry\n",
> >>>>  			  vma->vm_pgoff);
> >>>> +		atomic64_inc(&dev->stats.sw_stats.mmap_err);
> >>>>  		return -EINVAL;
> >>>>  	}
> >>>>  	entry = to_emmap(rdma_entry);
> >>>> @@ -1604,12 +1606,14 @@ static int __efa_mmap(struct efa_dev *dev, struct efa_ucontext *ucontext,
> >>>>  		err = -EINVAL;
> >>>>  	}
> >>>>  
> >>>> -	if (err)
> >>>> +	if (err) {
> >>>>  		ibdev_dbg(
> >>>>  			&dev->ibdev,
> >>>>  			"Couldn't mmap address[%#llx] length[%#zx] mmap_flag[%d] err[%d]\n",
> >>>>  			entry->address, rdma_entry->npages * PAGE_SIZE,
> >>>>  			entry->mmap_flag, err);
> >>>> +		atomic64_inc(&dev->stats.sw_stats.mmap_err);
> >>>
> >>> Really? Isn't this something that is only possible with a buggy
> >>> rdma-core provider? Why count it?
> >>
> >> Though unlikely, it could happen, otherwise this error flow wouldn't exist in
> >> the first place.
> >>
> >> If for some reason a customer app steps on a bug we're not aware of, this
> >> counter could serve as a red flag.
> > 
> > But there are lots of cases where a buggy provider can cause error
> > exits, why choose this one to count against all the others?
> 
> It's not one against all others, most if not all of our userspace facing API
> error flows have a similar counter.

Hurm, seems a bit strange, but OK

> And TBH, I think that the mmap flow is quite convoluted with the cookie response
> from the crate verb, so it deserves a counter IMO.

How so? Userspace takes the u64 from the command and pass it to mmap,
what is convoluted?

Jason

  reply	other threads:[~2020-04-26 13:30 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-20  6:22 [PATCH for-next 0/3] EFA statistics minor updates Gal Pressman
2020-04-20  6:22 ` [PATCH for-next 1/3] RDMA/efa: Report create CQ error counter Gal Pressman
2020-04-20  6:22 ` [PATCH for-next 2/3] RDMA/efa: Count mmap failures Gal Pressman
2020-04-24 14:59   ` Jason Gunthorpe
2020-04-24 15:25     ` Gal Pressman
2020-04-24 18:26       ` Jason Gunthorpe
2020-04-26  6:42         ` Gal Pressman
2020-04-26 13:30           ` Jason Gunthorpe [this message]
2020-04-26 14:17             ` Gal Pressman
2020-04-20  6:22 ` [PATCH for-next 3/3] RDMA/efa: Count admin commands errors Gal Pressman
2020-05-02 23:33 ` [PATCH for-next 0/3] EFA statistics minor updates Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200426133000.GL26002@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=dledford@redhat.com \
    --cc=firasj@amazon.com \
    --cc=galpress@amazon.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=matua@amazon.com \
    --cc=sleybo@amazon.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).