From: Sagi Grimberg <sagigrim@gmail.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: leon@kernel.org, linux-rdma@vger.kernel.org,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v2 01/24] mlx4-ib: Use coherent memory for priv pages
Date: Sun, 19 Jun 2016 12:48:51 +0300 [thread overview]
Message-ID: <57666A83.8050502@gmail.com> (raw)
In-Reply-To: <652EBA09-2978-414C-8606-38A96C63365A@oracle.com>
>> First of all, IIRC the patch author was Christoph wasn't he.
>>
>> Plus, you do realize that this patch makes the pages allocation
>> in granularity of pages. In systems with a large page size this
>> is completely redundant, it might even be harmful as the storage
>> ULPs need lots of MRs.
>
> I agree that the official fix should take a conservative
> approach to allocating this resource; there will be lots
> of MRs in an active system. This fix doesn't seem too
> careful.
>
>
>> Also, I don't see how that solves the issue, I'm not sure I even
>> understand the issue. Do you? Were you able to reproduce it?
>
> The issue is that dma_map_single() does not seem to DMA map
> portions of a memory region that are past the end of the first
> page of that region. Maybe that's a bug?
That seems weird to me, from looking at the code I didn't see
any indication that such a mapping would fail. Maybe we are seeing
a mlx4 specific issue? If this is some kind of generic dma-mapping
bug mlx5 would suffer from the same problem right? does it?
> This patch works around that behavior by guaranteeing that
>
> a) the memory region starts at the beginning of a page, and
> b) the memory region is never larger than a page
>
> This patch is not sufficient to repair mlx5, because b)
> cannot be satisfied in that case; the array of __be64's can
> be larger than 512 entries.
If a single page boundary is indeed the root-cause then I agree
this would not solve the problem for mlx5.
>> IFF the pages buffer end not being aligned to a cacheline is problematic
>> then why not extent it to end in a cacheline? Why in the next full page?
>
> I think the patch description justifies the choice of
> solution, but does not describe the original issue at
> all. The original issue had nothing to do with cacheline
> alignment.
>
> Lastly, this patch should remove the definition of
> MLX4_MR_PAGES_ALIGN.
The mlx4 PRM explicitly states that the translation (pages) vector
should align to 64 bytes and this is where this define comes from,
hence I don't think it should be removed from the code.
WARNING: multiple messages have this Message-ID (diff)
From: Sagi Grimberg <sagigrim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Linux NFS Mailing List
<linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH v2 01/24] mlx4-ib: Use coherent memory for priv pages
Date: Sun, 19 Jun 2016 12:48:51 +0300 [thread overview]
Message-ID: <57666A83.8050502@gmail.com> (raw)
In-Reply-To: <652EBA09-2978-414C-8606-38A96C63365A-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
>> First of all, IIRC the patch author was Christoph wasn't he.
>>
>> Plus, you do realize that this patch makes the pages allocation
>> in granularity of pages. In systems with a large page size this
>> is completely redundant, it might even be harmful as the storage
>> ULPs need lots of MRs.
>
> I agree that the official fix should take a conservative
> approach to allocating this resource; there will be lots
> of MRs in an active system. This fix doesn't seem too
> careful.
>
>
>> Also, I don't see how that solves the issue, I'm not sure I even
>> understand the issue. Do you? Were you able to reproduce it?
>
> The issue is that dma_map_single() does not seem to DMA map
> portions of a memory region that are past the end of the first
> page of that region. Maybe that's a bug?
That seems weird to me, from looking at the code I didn't see
any indication that such a mapping would fail. Maybe we are seeing
a mlx4 specific issue? If this is some kind of generic dma-mapping
bug mlx5 would suffer from the same problem right? does it?
> This patch works around that behavior by guaranteeing that
>
> a) the memory region starts at the beginning of a page, and
> b) the memory region is never larger than a page
>
> This patch is not sufficient to repair mlx5, because b)
> cannot be satisfied in that case; the array of __be64's can
> be larger than 512 entries.
If a single page boundary is indeed the root-cause then I agree
this would not solve the problem for mlx5.
>> IFF the pages buffer end not being aligned to a cacheline is problematic
>> then why not extent it to end in a cacheline? Why in the next full page?
>
> I think the patch description justifies the choice of
> solution, but does not describe the original issue at
> all. The original issue had nothing to do with cacheline
> alignment.
>
> Lastly, this patch should remove the definition of
> MLX4_MR_PAGES_ALIGN.
The mlx4 PRM explicitly states that the translation (pages) vector
should align to 64 bytes and this is where this define comes from,
hence I don't think it should be removed from the code.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2016-06-19 9:48 UTC|newest]
Thread overview: 98+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-15 3:15 [PATCH v2 00/24] NFS/RDMA client patches proposed for v4.8 Chuck Lever
2016-06-15 3:15 ` Chuck Lever
2016-06-15 3:15 ` [PATCH v2 01/24] mlx4-ib: Use coherent memory for priv pages Chuck Lever
2016-06-15 3:15 ` Chuck Lever
2016-06-15 4:28 ` Leon Romanovsky
2016-06-15 4:28 ` Leon Romanovsky
2016-06-15 16:40 ` Chuck Lever
2016-06-15 16:40 ` Chuck Lever
2016-06-16 14:35 ` Leon Romanovsky
2016-06-16 14:35 ` Leon Romanovsky
2016-06-16 21:10 ` Sagi Grimberg
2016-06-16 21:10 ` Sagi Grimberg
2016-06-16 21:58 ` Chuck Lever
2016-06-16 21:58 ` Chuck Lever
2016-06-17 9:20 ` Leon Romanovsky
2016-06-17 9:20 ` Leon Romanovsky
2016-06-17 19:55 ` Chuck Lever
2016-06-17 19:55 ` Chuck Lever
2016-06-18 10:56 ` Leon Romanovsky
2016-06-18 10:56 ` Leon Romanovsky
2016-06-18 20:08 ` Chuck Lever
2016-06-18 20:08 ` Chuck Lever
2016-06-19 10:04 ` Sagi Grimberg
2016-06-19 10:04 ` Sagi Grimberg
2016-06-19 19:38 ` Or Gerlitz
2016-06-19 19:38 ` Or Gerlitz
2016-06-19 19:43 ` Or Gerlitz
2016-06-19 19:43 ` Or Gerlitz
2016-06-19 20:02 ` Chuck Lever
2016-06-19 20:02 ` Chuck Lever
2016-06-20 5:44 ` Leon Romanovsky
2016-06-20 5:44 ` Leon Romanovsky
2016-06-20 6:34 ` Sagi Grimberg
2016-06-20 6:34 ` Sagi Grimberg
2016-06-20 7:01 ` Leon Romanovsky
2016-06-20 7:01 ` Leon Romanovsky
2016-06-20 8:35 ` Sagi Grimberg
2016-06-20 8:35 ` Sagi Grimberg
2016-06-20 13:41 ` Yishai Hadas
2016-06-20 13:41 ` Yishai Hadas
2016-06-21 13:56 ` Sagi Grimberg
2016-06-21 13:56 ` Sagi Grimberg
2016-06-21 14:35 ` Laurence Oberman
2016-06-21 14:35 ` Laurence Oberman
2016-06-19 9:58 ` Sagi Grimberg
2016-06-19 9:58 ` Sagi Grimberg
2016-06-19 9:48 ` Sagi Grimberg [this message]
2016-06-19 9:48 ` Sagi Grimberg
2016-06-17 9:05 ` Leon Romanovsky
2016-06-17 9:05 ` Leon Romanovsky
2016-06-19 7:05 ` Sagi Grimberg
2016-06-19 7:05 ` Sagi Grimberg
2016-06-15 3:15 ` [PATCH v2 02/24] xprtrdma: Remove FMRs from the unmap list after unmapping Chuck Lever
2016-06-15 3:15 ` Chuck Lever
2016-06-15 3:15 ` [PATCH v2 03/24] xprtrdma: Create common scatterlist fields in rpcrdma_mw Chuck Lever
2016-06-15 3:15 ` Chuck Lever
2016-06-15 3:15 ` [PATCH v2 04/24] xprtrdma: Move init and release helpers Chuck Lever
2016-06-15 3:15 ` Chuck Lever
2016-06-15 3:15 ` [PATCH v2 05/24] xprtrdma: Rename fields in rpcrdma_fmr Chuck Lever
2016-06-15 3:15 ` Chuck Lever
2016-06-15 3:16 ` [PATCH v2 06/24] xprtrdma: Use scatterlist for DMA mapping and unmapping under FMR Chuck Lever
2016-06-15 3:16 ` Chuck Lever
2016-06-15 3:16 ` [PATCH v2 07/24] xprtrdma: Refactor MR recovery work queues Chuck Lever
2016-06-15 3:16 ` Chuck Lever
2016-06-15 3:16 ` [PATCH v2 08/24] xprtrdma: Do not leak an MW during a DMA map failure Chuck Lever
2016-06-15 3:16 ` Chuck Lever
2016-06-15 3:16 ` [PATCH v2 09/24] xprtrdma: Remove ALLPHYSICAL memory registration mode Chuck Lever
2016-06-15 3:16 ` Chuck Lever
2016-06-15 3:16 ` [PATCH v2 10/24] xprtrdma: Remove rpcrdma_map_one() and friends Chuck Lever
2016-06-15 3:16 ` Chuck Lever
2016-06-15 3:16 ` [PATCH v2 11/24] xprtrdma: Reply buffer exhaustion can be catastrophic Chuck Lever
2016-06-15 3:16 ` Chuck Lever
2016-06-15 3:16 ` [PATCH v2 12/24] xprtrdma: Honor ->send_request API contract Chuck Lever
2016-06-15 3:16 ` Chuck Lever
2016-06-15 3:17 ` [PATCH v2 13/24] xprtrdma: Chunk list encoders must not return zero Chuck Lever
2016-06-15 3:17 ` Chuck Lever
2016-06-15 3:17 ` [PATCH v2 14/24] xprtrdma: Allocate MRs on demand Chuck Lever
2016-06-15 3:17 ` Chuck Lever
2016-06-15 3:17 ` [PATCH v2 15/24] xprtrdma: Release orphaned MRs immediately Chuck Lever
2016-06-15 3:17 ` Chuck Lever
2016-06-15 3:17 ` [PATCH v2 16/24] xprtrdma: Place registered MWs on a per-req list Chuck Lever
2016-06-15 3:17 ` Chuck Lever
2016-06-15 3:17 ` [PATCH v2 17/24] xprtrdma: Chunk list encoders no longer share one rl_segments array Chuck Lever
2016-06-15 3:17 ` Chuck Lever
2016-06-15 3:17 ` [PATCH v2 18/24] xprtrdma: rpcrdma_inline_fixup() overruns the receive page list Chuck Lever
2016-06-15 3:17 ` Chuck Lever
2016-06-15 3:17 ` [PATCH v2 19/24] xprtrdma: Do not update {head, tail}.iov_len in rpcrdma_inline_fixup() Chuck Lever
2016-06-15 3:17 ` Chuck Lever
2016-06-15 3:18 ` [PATCH v2 20/24] xprtrdma: Update only specific fields in private receive buffer Chuck Lever
2016-06-15 3:18 ` Chuck Lever
2016-06-15 3:18 ` [PATCH v2 21/24] xprtrdma: Clean up fixup_copy_count accounting Chuck Lever
2016-06-15 3:18 ` Chuck Lever
2016-06-15 3:18 ` [PATCH v2 22/24] xprtrdma: No direct data placement with krb5i and krb5p Chuck Lever
2016-06-15 3:18 ` Chuck Lever
2016-06-15 3:18 ` [PATCH v2 23/24] svc: Avoid garbage replies when pc_func() returns rpc_drop_reply Chuck Lever
2016-06-15 3:18 ` Chuck Lever
2016-06-15 3:18 ` [PATCH v2 24/24] NFS: Don't drop CB requests with invalid principals Chuck Lever
2016-06-15 3:18 ` Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=57666A83.8050502@gmail.com \
--to=sagigrim@gmail.com \
--cc=chuck.lever@oracle.com \
--cc=leon@kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.