From: dledford@redhat.com (Doug Ledford)
Subject: [PATCH v2 0/3] Fix request completion holes
Date: Mon, 13 Nov 2017 17:10:44 -0500
Message-ID: <1510611044.3735.49.camel@redhat.com>
In-Reply-To: <20171108100616.26605-1-sagi@grimberg.me>

On Wed, 2017-11-08 at 12:06 +0200, Sagi Grimberg wrote:
> We have two holes in nvme-rdma when completing a request.
> 
> 1. We never wait for the send work request to complete before completing
> a request. It is possible that the HCA retries a send operation (due
> to a dropped ack) after the nvme cqe has already arrived back at the host.
> If we unmap the host buffer upon reception of the cqe, the HCA might
> get iommu errors when attempting to access an unmapped host buffer.
> We must also wait for the send completion before completing a request.
> Most of the time it will arrive before the nvme cqe does, so
> we pay only for the extra cq entry processing.
> 
> 2. We don't wait for the request memory region to be fully invalidated
> in case the target didn't invalidate remotely. We must wait for the local
> invalidation to complete before completing the request.
> 
> Note that we might face two concurrent completion processing contexts for
> a single request. One is the ib_cq irq-poll context and the second is
> blk_mq_poll which is invoked from IOCB_HIPRI requests. Thus we need the
> completion flags updates (send/receive) to be atomic. A new request
> lock is introduced to guarantee the mutual exclusion of the completion
> flags updates.
> 
> Note that we could have used a per-queue lock for these updates (which
> would have generated fewer locks as we have fewer queues), but given that
> we access the request in the completion handlers we might benefit from having
> the lock local to the request. I'm open to suggestions though.
> 
> Changes from v1:
> - Added atomic send/resp_completed updates (via per-request lock)
> 
> Sagi Grimberg (3):
>   nvme-rdma: don't suppress send completions
>   nvme-rdma: don't complete requests before a send work request has
>     completed
>   nvme-rdma: wait for local invalidation before completing a request
> 
>  drivers/nvme/host/rdma.c | 125 ++++++++++++++++++++++++++---------------------
>  1 file changed, 70 insertions(+), 55 deletions(-)
> 
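
The completion rule in the quoted cover letter (end a request only after both the send completion and the nvme cqe have been seen, with the flag updates made mutually exclusive) can be sketched in userspace C. This is a minimal sketch under stated assumptions, not the code from the series: `struct sketch_request`, `sketch_send_done()` and `sketch_resp_done()` are hypothetical names, and a pthread mutex stands in for the per-request lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-request state; not the driver's struct. */
struct sketch_request {
	pthread_mutex_t lock;	/* stands in for the per-request lock */
	bool send_completed;	/* send work request completion was seen */
	bool resp_completed;	/* nvme cqe (response) was seen */
};

static void sketch_request_init(struct sketch_request *req)
{
	pthread_mutex_init(&req->lock, NULL);
	req->send_completed = false;
	req->resp_completed = false;
}

/*
 * Record the send completion. Returns true when the response was already
 * seen, meaning the caller is the second context and may end the request.
 */
static bool sketch_send_done(struct sketch_request *req)
{
	bool end;

	pthread_mutex_lock(&req->lock);
	assert(!req->send_completed);	/* expected once per request */
	req->send_completed = true;
	end = req->resp_completed;
	pthread_mutex_unlock(&req->lock);
	return end;
}

/*
 * Record the response (nvme cqe). Returns true when the send completion
 * was already seen, meaning the caller may end the request.
 */
static bool sketch_resp_done(struct sketch_request *req)
{
	bool end;

	pthread_mutex_lock(&req->lock);
	assert(!req->resp_completed);	/* expected once per request */
	req->resp_completed = true;
	end = req->send_completed;
	pthread_mutex_unlock(&req->lock);
	return end;
}
```

Whichever handler runs second gets `true` and would be the one to unmap the buffer and end the request; the first one only records its flag. Because both flags are read and written under the same lock, exactly one of the two contexts sees `true`, regardless of arrival order.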

Sagi, are you ready for me to take this series in?  It seemed like there
was a question as to whether you might want to try atomics instead of
spinlocks, or do you want to stick with spinlocks?
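
For reference, one way the atomics option could look is a counter of outstanding completions that each handler decrements. This is only a sketch under assumptions, written with C11 `<stdatomic.h>` in userspace; `struct sketch_atomic_request` and `sketch_atomic_put()` are hypothetical names and nothing here is taken from the posted patches.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request state: counts completions still outstanding. */
struct sketch_atomic_request {
	atomic_int pending;
};

static void sketch_atomic_init(struct sketch_atomic_request *req)
{
	/* Two events must arrive: the send completion and the nvme cqe. */
	atomic_init(&req->pending, 2);
}

/*
 * Called once from each completion handler. Returns true for the caller
 * that drops the count to zero, which is the one that may end the request.
 */
static bool sketch_atomic_put(struct sketch_atomic_request *req)
{
	int prev = atomic_fetch_sub(&req->pending, 1);

	assert(prev > 0);	/* more puts than expected completions is a bug */
	return prev == 1;
}
```

The trade-off in this sketch is that no lock is taken, but the state no longer records which completion arrived, only how many are still missing.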

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


