All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: linux-rdma@vger.kernel.org, leonro@nvidia.com,
	selvin.xavier@broadcom.com, andrew.gospodarek@broadcom.com
Subject: Re: [PATCH  rdma-rc v1] RDMA/core: fix sg_to_page mapping for boundary condition
Date: Thu, 18 Aug 2022 20:52:28 -0300	[thread overview]
Message-ID: <Yv7QvMADD7g3yPWh@nvidia.com> (raw)
In-Reply-To: <20220816081153.1580612-1-kashyap.desai@broadcom.com>

On Tue, Aug 16, 2022 at 01:41:53PM +0530, Kashyap Desai wrote:
> This issue frequently hits if AMD IOMMU is enabled.
> 
> In case of 1MB data transfer, ib core is supposed to set 256 entries of
> 4K page size in MR page table. Because of the defect in ib_sg_to_pages,
> it breaks just after setting one entry.
> Memory region page table entries may find stale entries (or NULL if
> address is memset). Something like this -
> 
> crash> x/32a 0xffff9cd9f7f84000
> <<< -This looks like stale entries. Only first entry is valid ->>>
> 0xffff9cd9f7f84000:     0xfffffffffff00000      0x68d31000
> 0xffff9cd9f7f84010:     0x68d32000      0x68d33000
> 0xffff9cd9f7f84020:     0x68d34000      0x975c5000
> 0xffff9cd9f7f84030:     0x975c6000      0x975c7000
> 0xffff9cd9f7f84040:     0x975c8000      0x975c9000
> 0xffff9cd9f7f84050:     0x975ca000      0x975cb000
> 0xffff9cd9f7f84060:     0x975cc000      0x975cd000
> 0xffff9cd9f7f84070:     0x975ce000      0x975cf000
> 0xffff9cd9f7f84080:     0x0     0x0
> 0xffff9cd9f7f84090:     0x0     0x0
> 0xffff9cd9f7f840a0:     0x0     0x0
> 0xffff9cd9f7f840b0:     0x0     0x0
> 0xffff9cd9f7f840c0:     0x0     0x0
> 0xffff9cd9f7f840d0:     0x0     0x0
> 0xffff9cd9f7f840e0:     0x0     0x0
> 0xffff9cd9f7f840f0:     0x0     0x0
> 
> All addresses other than 0xfffffffffff00000 are stale entries.
> Once this kind of incorrect page entries are passed to the RDMA h/w,
> AMD IOMMU module detects the page fault whenever h/w tries to access
> addresses which are not actually populated by the ib stack correctly.
> Below prints are logged whenever this issue hits.

I don't understand this. AFAIK on AMD platforms you can't create an
IOVA mapping at -1 like you are saying above, so how is
0xfffffffffff00000 a valid DMA address?

Or, if the AMD IOMMU HW can actually do this, then I would say it is a
bug in the IOMM DMA API to allow the aperture used for DMA mapping to
get to the end of ULONG_MAX, it is just asking for overflow bugs.

And if we have to tolerate these addreses then the code should be
designed to avoid the overflow in the first place ie 'end_dma_addr'
should be changed to 'last_dma_addr = dma_addr + (dma_len - 1)' which
does not overflow, and all the logics carefully organized so none of
the math overflows.

Jason

  reply	other threads:[~2022-08-18 23:52 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-16  8:11 [PATCH rdma-rc v1] RDMA/core: fix sg_to_page mapping for boundary condition Kashyap Desai
2022-08-18 23:52 ` Jason Gunthorpe [this message]
2022-08-19  9:43   ` Kashyap Desai
2022-08-19 11:48     ` Jason Gunthorpe
2022-08-22 14:21       ` Kashyap Desai
2022-08-26 13:14         ` Jason Gunthorpe
2022-09-01 12:06           ` Kashyap Desai
2022-09-06 17:33             ` Jason Gunthorpe
2022-09-12 11:02               ` Kashyap Desai
2022-09-20 19:14                 ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yv7QvMADD7g3yPWh@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=andrew.gospodarek@broadcom.com \
    --cc=kashyap.desai@broadcom.com \
    --cc=leonro@nvidia.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=selvin.xavier@broadcom.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.