From: Nicolin Chen <nicolinc@nvidia.com>
To: Will Deacon <will@kernel.org>
Cc: <sagi@grimberg.me>, <hch@lst.de>, <axboe@kernel.dk>,
<kbusch@kernel.org>, <joro@8bytes.org>, <robin.murphy@arm.com>,
<jgg@nvidia.com>, <linux-nvme@lists.infradead.org>,
<linux-kernel@vger.kernel.org>, <iommu@lists.linux.dev>,
<murphyt7@tcd.ie>, <baolu.lu@linux.intel.com>
Subject: Re: [PATCH v1 0/2] nvme-pci: Fix dma-iommu mapping failures when PAGE_SIZE=64KB
Date: Wed, 14 Feb 2024 11:57:32 -0800
Message-ID: <Zc0bLAIXSAqsQJJv@Asurada-Nvidia>
In-Reply-To: <20240214164138.GA31927@willie-the-truck>
Hi Will,
On Wed, Feb 14, 2024 at 04:41:38PM +0000, Will Deacon wrote:
> Hi Nicolin,
>
> On Tue, Feb 13, 2024 at 01:53:55PM -0800, Nicolin Chen wrote:
> > It's observed that an NVME device is causing timeouts when Ubuntu boots
> > with a kernel configured with PAGE_SIZE=64KB due to failures in swiotlb:
> > systemd[1]: Started Journal Service.
> > => nvme 0000:00:01.0: swiotlb buffer is full (sz: 327680 bytes), total 32768 (slots), used 32 (slots)
> > note: journal-offline[392] exited with irqs disabled
> > note: journal-offline[392] exited with preempt_count 1
> >
> > An NVME device under a PCIe bus can be behind an IOMMU, so dma mappings
> > going through dma-iommu might be also redirected to swiotlb allocations.
> > Similar to dma_direct_max_mapping_size(), dma-iommu should implement its
> > dma_map_ops->max_mapping_size to return swiotlb_max_mapping_size() too.
> >
> > Though an iommu_dma_max_mapping_size() is a must, it alone can't fix the
> > issue. The swiotlb_max_mapping_size() returns 252KB, calculated from the
> > default pool 256KB subtracted by min_align_mask NVME_CTRL_PAGE_SIZE=4KB,
> > while dma-iommu can roundup a 252KB mapping to 256KB at its "alloc_size"
> > when PAGE_SIZE=64KB via iova->granule that is often set to PAGE_SIZE. So
> > this mismatch between NVME_CTRL_PAGE_SIZE=4KB and PAGE_SIZE=64KB results
> > in a similar failure, though its signature has a fixed size "256KB":
> > systemd[1]: Started Journal Service.
> > => nvme 0000:00:01.0: swiotlb buffer is full (sz: 262144 bytes), total 32768 (slots), used 128 (slots)
> > note: journal-offline[392] exited with irqs disabled
> > note: journal-offline[392] exited with preempt_count 1
> >
> > Both failures above occur to NVME behind IOMMU when PAGE_SIZE=64KB. They
> > were likely introduced for the security feature by:
> > commit 82612d66d51d ("iommu: Allow the dma-iommu api to use bounce buffers"),
> >
> > So, this series bundles two fixes together against that. They should be
> > taken at the same time to entirely fix the mapping failures.
>
> It's a bit of a shot in the dark, but I've got a pending fix to some of
> the alignment handling in swiotlb. It would be interesting to know if
> patch 1 has any impact at all on your NVME allocations:
>
> https://lore.kernel.org/r/20240205190127.20685-1-will@kernel.org
I applied these three patches locally for testing.
Note that I am building against a v6.6 kernel, where I see some warnings:
from kernel/dma/swiotlb.c:26:
kernel/dma/swiotlb.c: In function ‘swiotlb_area_find_slots’:
./include/linux/minmax.h:21:35: warning: comparison of distinct pointer types lacks a cast
21 | (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
| ^~
./include/linux/minmax.h:27:18: note: in expansion of macro ‘__typecheck’
27 | (__typecheck(x, y) && __no_side_effects(x, y))
| ^~~~~~~~~~~
./include/linux/minmax.h:37:31: note: in expansion of macro ‘__safe_cmp’
37 | __builtin_choose_expr(__safe_cmp(x, y), \
| ^~~~~~~~~~
./include/linux/minmax.h:75:25: note: in expansion of macro ‘__careful_cmp’
75 | #define max(x, y) __careful_cmp(x, y, >)
| ^~~~~~~~~~~~~
kernel/dma/swiotlb.c:1007:26: note: in expansion of macro ‘max’
1007 | stride = max(stride, PAGE_SHIFT - IO_TLB_SHIFT + 1);
| ^~~
Replacing max() with max_t() fixes these.
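
For illustration, here is a minimal userland sketch of why max_t() silences
the warning. It assumes that "stride" is an unsigned int (as in v6.6) while
"PAGE_SHIFT - IO_TLB_SHIFT + 1" is a plain int, which is the kind of mismatch
the __typecheck() in minmax.h rejects. The sketch_* names are hypothetical
stand-ins, not the kernel's own macros, and the shift values are assumed for
a 64KB-page configuration.

```c
/*
 * Userland stand-in for the kernel's max_t(type, x, y): cast both operands
 * to one explicit type before comparing, so a signed/unsigned pair is
 * accepted. Unlike the kernel macro, this one evaluates its arguments more
 * than once, so only pass side-effect-free expressions.
 */
#define sketch_max_t(type, x, y) \
	((type)(x) > (type)(y) ? (type)(x) : (type)(y))

/* Assumed values: 64KB pages and 2KB swiotlb slots. */
#define SKETCH_PAGE_SHIFT   16
#define SKETCH_IO_TLB_SHIFT 11

/* Hypothetical helper mirroring the expression flagged at swiotlb.c:1007. */
static unsigned int sketch_stride(unsigned int stride)
{
	return sketch_max_t(unsigned int, stride,
			    SKETCH_PAGE_SHIFT - SKETCH_IO_TLB_SHIFT + 1);
}
```

With these assumed shifts the lower bound is 16 - 11 + 1 = 6 slots, so a
stride of 1 is raised to 6 and a stride of 8 is left unchanged.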
And it seems to get worse, as even a 64KB mapping is failing:
[ 0.239821] nvme 0000:00:01.0: swiotlb buffer is full (sz: 65536 bytes), total 32768 (slots), used 0 (slots)
With a printk, I found the iotlb_align_mask isn't correct:
swiotlb_area_find_slots:alloc_align_mask 0xffff, iotlb_align_mask 0x800
But even after changing the iotlb_align_mask to 0x7ff, the 64KB
mapping still fails.
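
The numbers in the logs above can be related with a short userland sketch.
It assumes the usual swiotlb constants (IO_TLB_SHIFT = 11, i.e. 2KB slots,
and IO_TLB_SEGSIZE = 128 slots); the sketch_* names are hypothetical. The
last helper shows one way a value such as 0x800 could arise, under the
assumption that the mask is derived as min_align_mask & ~(IO_TLB_SIZE - 1)
with min_align_mask = NVME_CTRL_PAGE_SIZE - 1 = 0xfff. That derivation is an
assumption about the patched code and is not shown in this message.

```c
#include <stdbool.h>

#define SKETCH_IO_TLB_SHIFT   11                           /* 2KB per slot */
#define SKETCH_IO_TLB_SIZE    (1UL << SKETCH_IO_TLB_SHIFT)
#define SKETCH_IO_TLB_SEGSIZE 128UL                        /* slots per segment */

/* Number of 2KB slots needed to bounce a mapping of `size` bytes. */
static unsigned long sketch_nr_slots(unsigned long size)
{
	return (size + SKETCH_IO_TLB_SIZE - 1) >> SKETCH_IO_TLB_SHIFT;
}

/* True if the request fits in one contiguous segment of slots. */
static bool sketch_fits_segment(unsigned long size)
{
	return sketch_nr_slots(size) <= SKETCH_IO_TLB_SEGSIZE;
}

/* True if `mask` is a contiguous run of low bits (2^n - 1), e.g. 0xffff. */
static bool sketch_is_low_bit_mask(unsigned long mask)
{
	return (mask & (mask + 1)) == 0;
}

/* Assumed derivation: keep only the min_align_mask bits above the slot size. */
static unsigned long sketch_iotlb_align_mask(unsigned long min_align_mask)
{
	return min_align_mask & ~(SKETCH_IO_TLB_SIZE - 1);
}
```

Under these assumptions a 65536-byte mapping needs 32 slots and a
262144-byte mapping needs exactly 128, while the 327680-byte request from
the first log needs 160 and cannot fit in a single segment. 0xffff and 0x7ff
are contiguous low-bit masks, whereas 0x800 is a single bit, which is what
the assumed derivation yields for a min_align_mask of 0xfff.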
Thanks
Nicolin