* Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
@ 2023-11-03 15:13 Niklas Schnelle
2023-11-03 16:14 ` Halil Pasic
2023-11-03 18:59 ` Petr Tesařík
0 siblings, 2 replies; 20+ messages in thread
From: Niklas Schnelle @ 2023-11-03 15:13 UTC (permalink / raw)
To: Christoph Hellwig, Bjorn Helgaas, Marek Szyprowski, Robin Murphy,
Petr Tesarik, Ross Lagerwall
Cc: linux-pci, linux-kernel, iommu, Matthew Rosato, Halil Pasic
Hi Swiotlb Maintainers,
With s390 slated to use dma-iommu.c I was experimenting with
CONFIG_SWIOTLB_DYNAMIC but was getting errors all over the place.
Debugging this I've managed to narrow things down to what I believe is
a memory corruption issue caused by overrunning the entire transient
memory pool allocated by swiotlb. In my test this happens on the
iommu_dma_map_page(), swiotlb_tbl_map_single() path when handling
untrusted PCI devices.
I've seen this happen only for transient pools when:
* allocation size >= PAGE_SIZE, and
* the original address of the mapping is not page aligned.
The overrun can be seen clearly by adding a simple "tlb_addr +
alloc_size > pool->end" overrun check to swiotlb_tbl_map_single() and
forcing PCI test devices to be untrusted. With that in place, an NVMe
fio workload runs fine, but TCP/IP tests on a Mellanox ConnectX-4 VF
reliably trigger the overrun check and, with a debug print, produce
output like the following:
software IO TLB:
transient:1
index:1
dma_get_min_align_mask(dev):0
orig_addr:ac0caebe
tlb_addr=a4d0f800
start:a4d0f000
end:a4d10000
pool_size:4096
alloc_size:4096
offset:0
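The check itself is trivial; a minimal standalone sketch of the debug assertion described above (the helper name `overruns_pool` and the standalone form are ours, not an actual kernel patch; the values come from the debug output in this report):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical standalone form of the overrun check added to
 * swiotlb_tbl_map_single() for debugging: does the picked bounce
 * buffer extend past the end of its pool? */
static bool overruns_pool(uint64_t tlb_addr, size_t alloc_size,
			  uint64_t pool_end)
{
	return tlb_addr + alloc_size > pool_end;
}
```

With the values above, 0xa4d0f800 + 4096 == 0xa4d10800, which lies past pool->end == 0xa4d10000, so the check fires.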
The complete code used for this test is available on my
git.kernel.org[0] repository but it's basically v6.6 + iommu/next (for
s390 DMA API) + 2 trivial debug commits.
For further analysis I've worked closely with Halil Pasic.
We believe the reason this is happening is twofold. Under a
certain set of circumstances in the function swiotlb_find_slots():
1) the allocated transient pool cannot fit the required bounce buffer,
and
2) the invocation of swiotlb_pool_find_slots() finds "a suitable
slot" even though it should fail.
The reason for 2), i.e. swiotlb_pool_find_slots() thinking there is a
suitable bounce buffer in the pool, is that for transient pools
pool->slots[i].list ends up nonsensical, because
swiotlb_init_io_tlb_pool(), among other places in swiotlb, assumes
that the nslabs of the pool is a multiple of IO_TLB_SEGSIZE.
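To illustrate, swiotlb_init_io_tlb_pool() initializes the per-slot free-list counters roughly like this (a simplified sketch of the v6.6 logic, not the verbatim kernel code):

```c
#include <assert.h>

#define IO_TLB_SHIFT	11
#define IO_TLB_SIZE	(1 << IO_TLB_SHIFT)	/* 2 KiB per slot */
#define IO_TLB_SEGSIZE	128

/* Mirrors the kernel's io_tlb_offset(): a slot's position within
 * its IO_TLB_SEGSIZE-sized segment. */
static unsigned int io_tlb_offset(unsigned int i)
{
	return i & (IO_TLB_SEGSIZE - 1);
}

/* swiotlb_init_io_tlb_pool() sets slots[i].list to the number of
 * contiguous free slots from i to the end of the segment; this is
 * only correct when nslabs is a multiple of IO_TLB_SEGSIZE. */
static unsigned int init_list_value(unsigned int i)
{
	return IO_TLB_SEGSIZE - io_tlb_offset(i);
}
```

A one-page transient pool has nslabs == 2 (4096 / 2048), yet slot 0 is initialized with list == 128 and slot 1 with list == 127, i.e. the free list claims contiguous free slots far beyond the end of the pool.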
The reason for 1) is a bit more convoluted and not entirely understood
by us. We are certain, though, that the function swiotlb_find_slots()
allocates a pool with nr_slots(alloc_size) slots, where this alloc_size
is the alloc_size from swiotlb_tbl_map_single() + swiotlb_align_offset(),
but for alignment reasons some slots may get "skipped over" in
swiotlb_area_find_slots(), causing the picked slots to overrun the
allocation.
Not sure how to properly fix this as the different alignment
requirements get pretty complex quickly. So would appreciate your
input.
Thanks,
Niklas
[0] bounce-min branch of my git.kernel.org repository at
https://git.kernel.org/pub/scm/linux/kernel/git/niks/linux.git
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-03 15:13 Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y Niklas Schnelle
@ 2023-11-03 16:14 ` Halil Pasic
2023-11-03 20:50 ` Petr Tesařík
2023-11-03 18:59 ` Petr Tesařík
1 sibling, 1 reply; 20+ messages in thread
From: Halil Pasic @ 2023-11-03 16:14 UTC (permalink / raw)
To: Niklas Schnelle
Cc: Christoph Hellwig, Bjorn Helgaas, Marek Szyprowski, Robin Murphy,
Petr Tesarik, Ross Lagerwall, linux-pci, linux-kernel, iommu,
Matthew Rosato, Halil Pasic
On Fri, 03 Nov 2023 16:13:03 +0100
Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> The reason for 1) is a bit more convoluted and not entirely understood
> by us. We are certain though that the function swiotlb_find_slots()
> allocates a pool with nr_slots(alloc_size), where this alloc_size is
> the alloc_size from swiotlb_tbl_map_single() + swiotlb_align_offset(),
> but for alignment reasons some slots may get "skipped over" in
> swiotlb_area_find_slots() causing the picked slots to overrun the
> allocation.
>
> Not sure how to properly fix this as the different alignment
> requirements get pretty complex quickly. So would appreciate your
> input.
Let me post a more detailed analysis of why we observe
swiotlb_area_find_slots() considering the slot with
index 0 invalid in our particular case, and how that
relates to the whole "alignment" complex.
Currently there are three distinct mechanisms that dictate the "alignment":
a) min_align_mask (introduced by 36950f2da1ea ("driver core: add a
min_align_mask field to struct device_dma_parameters"))
b) alloc_size >= PAGE_SIZE which requires "page alignment"
c) alloc_align_mask.
In our case min_align_mask == 0 and a) is thus not applicable; because of b)
and c) we end up with iotlb_align_mask == 0x800. And because orig_addr &
0x800 == 0x800 but pool->start & 0x800 == 0, the slot at index 0 is skipped
over. Slot 0 is skipped over because it is page aligned, when
!!((1UL << PAGE_SHIFT) & orig_addr)
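For reference, the mask handling in v6.6's swiotlb_area_find_slots() boils down to the following (a simplified sketch; the concrete values are taken from this report, and we assume a 4 KiB iova granule, i.e. alloc_align_mask == 0xfff):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define IO_TLB_SHIFT	11
#define IO_TLB_SIZE	(1UL << IO_TLB_SHIFT)

/* Simplified from swiotlb_area_find_slots() in v6.6. */
static uint64_t compute_iotlb_align_mask(uint64_t min_align_mask,
					 uint64_t alloc_align_mask,
					 size_t alloc_size)
{
	uint64_t mask = min_align_mask | alloc_align_mask;

	if (alloc_size >= PAGE_SIZE)
		mask |= PAGE_SIZE - 1;	/* "page alignment", mechanism b) */
	mask &= ~(IO_TLB_SIZE - 1);	/* drop the in-slot bits */
	return mask;
}

/* A candidate slot is skipped when its address disagrees with
 * orig_addr under the mask. */
static bool slot_skipped(uint64_t slot_addr, uint64_t orig_addr,
			 uint64_t mask)
{
	return (slot_addr & mask) != (orig_addr & mask);
}
```

With min_align_mask == 0 the mask comes out as 0x800; slot 0 at 0xa4d0f000 disagrees with orig_addr == 0xac0caebe under that mask and is skipped, while slot 1 at 0xa4d0f800 matches, which is exactly the index:1 seen in the debug output, and why a 4 KiB allocation overruns the 4 KiB pool.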
Let us note that with the current implementation the min_align_mask, that
is mechanism a), also controls the tlb_addr within the first slot so that:
tlb_addr & min_align_mask == orig_addr & min_align_mask. In that sense a) is
very unlike b) and c).
For example, if !min_align_mask, then tlb_addr & (IO_TLB_SIZE - 1) is always 0,
even if the alloc_size is >= PAGE_SIZE or if alloc_align_mask is non-zero.
If with b) and c) the goal is that the swiotlb buffer shall not stretch over
more pages or address-space blocks of a size dictated by alloc_align_mask,
then that goal is accomplished. If however the goal is to preserve the offsets
modulo some power of 2 dictated either by PAGE_SHIFT or by alloc_align_mask,
then that goal is not reached.
But there is more to it! In the beginning there was b), or more precisely in the
olden days for mapping_size >= PAGE_SIZE we used to allocate properly page
aligned bounce buffers. That is tlb_addr & (~PAGE_MASK) == 0 regardless of what
orig_addr & (~PAGE_MASK) & (IO_TLB_SIZE - 1) is. That first got borked by
commit 1f221a0d0dbf ("swiotlb: respect min_align_mask") and then it did not get
fixed by commit 0eee5ae10256 ("swiotlb: fix slot alignment checks").
Let us also note that if more than one of the above mechanisms is applicable,
then for the slot selection the idea is apparently to go with the strictest
"alignment requirement", while for the offset within the slot only a) matters
(if applicable, i.e. min_align_mask != 0), which may appear strange if
not thoroughly documented.
In our opinion the first step towards getting this right is to figure out what
the different kinds of alignments are really supposed to mean. For each of the
mechanisms we need to understand and document whether it is about making sure
that the bounce buffer does not stretch over more of certain units of memory
(like pages, the iova granule size, whatever), or about preserving the offset
within a certain unit of memory, and if so to what extent (the least
significant n bits of orig_addr dictated by the respective mask, or something
different).
Thank you for your help in advance!
Regards,
Halil and Niklas
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-03 15:13 Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y Niklas Schnelle
2023-11-03 16:14 ` Halil Pasic
@ 2023-11-03 18:59 ` Petr Tesařík
2023-11-06 7:44 ` Christoph Hellwig
2023-11-08 10:52 ` Halil Pasic
1 sibling, 2 replies; 20+ messages in thread
From: Petr Tesařík @ 2023-11-03 18:59 UTC (permalink / raw)
To: Niklas Schnelle
Cc: Christoph Hellwig, Bjorn Helgaas, Marek Szyprowski, Robin Murphy,
Petr Tesarik, Ross Lagerwall, linux-pci, linux-kernel, iommu,
Matthew Rosato, Halil Pasic
Hello Niklas,
thank you for your report. This is some great analysis!
On Fri, 03 Nov 2023 16:13:03 +0100
Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> Hi Swiotlb Maintainers,
>
> With s390 slated to use dma-iommu.c I was experimenting with
> CONFIG_SWIOTLB_DYNAMIC but was getting errors all over the place.
> Debugging this I've managed to narrow things down to what I believe is
> a memory corruption issue caused by overrunning the entire transient
> memory pool allocated by swiotlb. In my test this happens on the
> iommu_dma_map_page(), swiotlb_tbl_map_single() path when handling
> untrusted PCI devices.
>
> I've seen this happen only for transient pools when:
> * allocation size >=PAGE_SIZE and,
> * the original address of the mapping is not page aligned.
>
> The overrun can be seen clearly by adding a simple "tlb_addr +
> alloc_size > pool->end" overrun check to swiotlb_tbl_map_single() and
> forcing PCI test devices to be untrusted. With that in place, an NVMe
> fio workload runs fine, but TCP/IP tests on a Mellanox ConnectX-4 VF
> reliably trigger the overrun check and, with a debug print, produce
> output like the following:
>
> software IO TLB:
> transient:1
> index:1
> dma_get_min_align_mask(dev):0
> orig_addr:ac0caebe
> tlb_addr=a4d0f800
> start:a4d0f000
> end:a4d10000
> pool_size:4096
> alloc_size:4096
> offset:0
>
> The complete code used for this test is available on my
> git.kernel.org[0] repository but it's basically v6.6 + iommu/next (for
> s390 DMA API) + 2 trivial debug commits.
>
> For further analysis I've worked closely with Halil Pasic.
>
> The reason why we think this is happening seems twofold. Under a
> certain set of circumstances in the function swiotlb_find_slots():
> 1) the allocated transient pool can not fit the required bounce buffer,
I am aware that this can happen. It's in fact why the index returned by
swiotlb_search_pool_area() is checked in swiotlb_find_slots().
> and
> 2) the invocation of swiotlb_pool_find_slots() finds "a suitable
> slot" even though it should fail.
This needs fixing.
> The reason for 2), i.e. swiotlb_pool_find_slots() thinking there is a
> suitable bounce buffer in the pool is that for transient pools pool-
> >slots[i].list end up nonsensical, because swiotlb_init_io_tlb_pool(),
> among other places in swiotlb, assumes that the nslabs of the pool is a
> multiple of IO_TLB_SEGSIZE
Ah... Yes, the transient pool size must also be a multiple of segment
size, but it is not enforced.
This reminds me of my debugging session that resulted in commit
aabd12609f91 ("swiotlb: always set the number of areas before
allocating the pool") and this conversation:
https://lore.kernel.org/linux-iommu/20230629074403.7f8ed9d6@meshulam.tesarici.cz/
> The reason for 1) is a bit more convoluted and not entirely understood
> by us. We are certain though that the function swiotlb_find_slots()
> allocates a pool with nr_slots(alloc_size), where this alloc_size is
> the alloc_size from swiotlb_tbl_map_single() + swiotlb_align_offset(),
> but for alignment reasons some slots may get "skipped over" in
> swiotlb_area_find_slots() causing the picked slots to overrun the
> allocation.
Yes, this can happen.
> Not sure how to properly fix this as the different alignment
> requirements get pretty complex quickly. So would appreciate your
> input.
I don't think it's possible to improve the allocation logic without
modifying the page allocator and/or the DMA atomic pool allocator to
take additional constraints into account.
I had a wild idea back in March, but it would require some intrusive
changes in the mm subsystem. Among other things, it would make memory
zones obsolete. I mean, people may actually like to get rid of DMA,
DMA32 and NORMAL, but you see how many nasty bugs were introduced even
by a relatively small change in SWIOTLB. Replacing memory zones with a
system based on generic physical allocation constraints would probably
blow up the universe. ;-)
Petr T
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-03 16:14 ` Halil Pasic
@ 2023-11-03 20:50 ` Petr Tesařík
2023-11-06 7:42 ` Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Petr Tesařík @ 2023-11-03 20:50 UTC (permalink / raw)
To: Halil Pasic
Cc: Niklas Schnelle, Christoph Hellwig, Bjorn Helgaas,
Marek Szyprowski, Robin Murphy, Petr Tesarik, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato
On Fri, 3 Nov 2023 17:14:47 +0100
Halil Pasic <pasic@linux.ibm.com> wrote:
>[...]
> In our case min_align_mask == 0 and a) is thus not applicable; because of b)
> and c) we end up with iotlb_align_mask == 0x800. And because orig_addr &
> 0x800 == 0x800 but pool->start & 0x800 == 0, the slot at index 0 is skipped
> over. Slot 0 is skipped over because it is page aligned, when
> !!((1UL << PAGE_SHIFT) & orig_addr)
Wait. These mask values can quickly become confusing. Do you mean
iotlb_align_mask == 0xfff?
> Let us note that with the current implementation the min_align_size mask, that
> is mechanism a) also controls the tlb_addr within the first slot so that:
> tlb_addr & min_align_mask == orig_addr & min_align_mask. In that sense a) is
> very unlike b) and c).
It is silently assumed that PAGE_SIZE >= IO_TLB_SIZE, so if the buffer
is page-aligned, the lower bits of the alignment inside the io tlb slot
must be zero.
If the same assumption is made about alloc_align_mask, it should be
documented, but it is not.
>[...]
> In our opinion the first step towards getting this right is to figure out what
> the different kinds of alignments are really supposed to mean. For each of the
> mechanisms we need to understand and document, whether making sure that the
> bounce buffer does not stretch over more of certain units of memory (like,
> pages, iova granule size, whatever), or is it about preserving offset within a
> certain unit of memory, and if yes to what extent (the least significant n-bits
> of the orig_addr dictated by the respective mask, or something different).
Seconded. I have also been struggling with the various alignment
constraints. I have even written (but not yet submitted) a patch to
calculate the combined alignment mask in swiotlb_tbl_map_single() and
pass it down to all other functions, just to make it clear what
alignment mask is used.
My understanding is that buffer alignment may be required by:
1. hardware which cannot handle an unaligned base address (presumably
because the chip performs a simple OR operation to get the addresses
of individual fields);
2. isolation of untrusted devices, where no two bounce buffers should
end up in the same iova granule;
3. allocation size; I could not find an explanation, so this might be
merely an attempt at reducing SWIOTLB internal fragmentation.
I hope other people on the Cc list can shed more light on the intended
behaviour.
Petr T
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-03 20:50 ` Petr Tesařík
@ 2023-11-06 7:42 ` Christoph Hellwig
2023-11-07 17:24 ` Halil Pasic
2023-11-06 10:08 ` Halil Pasic
2023-11-07 17:24 ` Halil Pasic
2 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-11-06 7:42 UTC (permalink / raw)
To: Petr Tesařík
Cc: Halil Pasic, Niklas Schnelle, Christoph Hellwig, Bjorn Helgaas,
Marek Szyprowski, Robin Murphy, Petr Tesarik, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato
On Fri, Nov 03, 2023 at 09:50:53PM +0100, Petr Tesařík wrote:
> Seconded. I have also been struggling with the various alignment
> constraints. I have even written (but not yet submitted) a patch to
> calculate the combined alignment mask in swiotlb_tbl_map_single() and
> pass it down to all other functions, just to make it clear what
> alignment mask is used.
That does sound like a good idea.
> My understanding is that buffer alignment may be required by:
>
> 1. hardware which cannot handle an unaligned base address (presumably
> because the chip performs a simple OR operation to get the addresses
> of individual fields);
There's all kinds of weird encodings that discard the low bits.
For NVMe it's the PRPs (that is actually documented in the NVMe
spec, so it might be easiest to grasp), but except for a Mellanox
vendor extension this is also how all RDMA memory registrations
work.
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-03 18:59 ` Petr Tesařík
@ 2023-11-06 7:44 ` Christoph Hellwig
2023-11-06 12:46 ` Petr Tesarik
2023-11-08 10:52 ` Halil Pasic
1 sibling, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-11-06 7:44 UTC (permalink / raw)
To: Petr Tesařík
Cc: Niklas Schnelle, Christoph Hellwig, Bjorn Helgaas,
Marek Szyprowski, Robin Murphy, Petr Tesarik, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato, Halil Pasic
On Fri, Nov 03, 2023 at 07:59:49PM +0100, Petr Tesařík wrote:
> I don't think it's possible to improve the allocation logic without
> modifying the page allocator and/or the DMA atomic pool allocator to
> take additional constraints into account.
>
> I had a wild idea back in March, but it would require some intrusive
> changes in the mm subsystem. Among other things, it would make memory
> zones obsolete. I mean, people may actually like to get rid of DMA,
> DMA32 and NORMAL, but you see how many nasty bugs were introduced even
> by a relatively small change in SWIOTLB. Replacing memory zones with a
> system based on generic physical allocation constraints would probably
> blow up the universe. ;-)
It would be very nice, at least for DMA32 or the 30/31-bit DMA pools
used on some architectures. For the x86-style 16MB zone DMA I suspect
just having a small pool on the side that's not even exposed to the
memory allocator would probably work better.
I think a lot of the MM folks would love to be able to kill off the
extra zones.
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-03 20:50 ` Petr Tesařík
2023-11-06 7:42 ` Christoph Hellwig
@ 2023-11-06 10:08 ` Halil Pasic
2023-11-07 17:24 ` Halil Pasic
2 siblings, 0 replies; 20+ messages in thread
From: Halil Pasic @ 2023-11-06 10:08 UTC (permalink / raw)
To: Petr Tesařík
Cc: Niklas Schnelle, Christoph Hellwig, Bjorn Helgaas,
Marek Szyprowski, Robin Murphy, Petr Tesarik, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato, Halil Pasic
On Fri, 3 Nov 2023 21:50:53 +0100
Petr Tesařík <petr@tesarici.cz> wrote:
> >[...]
> > In our case min_align_mask == 0 and a) is thus not applicable; because of b)
> > and c) we end up with iotlb_align_mask == 0x800. And because orig_addr &
> > 0x800 == 0x800 but pool->start & 0x800 == 0, the slot at index 0 is skipped
> > over. Slot 0 is skipped over because it is page aligned, when
> > !!((1UL << PAGE_SHIFT) & orig_addr)
>
> Wait. These mask values can quickly become confusing. Do you mean
> iotlb_align_mask == 0xfff?
I mean iotlb_align_mask == 0x800. The
iotlb_align_mask &= ~(IO_TLB_SIZE - 1);
in line 994
(https://github.com/torvalds/linux/blob/d2f51b3516dade79269ff45eae2a7668ae711b25/kernel/dma/swiotlb.c#L994C2-L994C41)
masks away the 0x7ff part of 0xfff (when PAGE_SHIFT == 12), leaving
0x800. My idea was to write 0x800 differently with a reference to
PAGE_SHIFT, because for a larger PAGE_SHIFT we end up with a different
pattern and thus a different requirement, but I didn't really think it
through properly, because even (1UL << (PAGE_SHIFT - 1)) (which is
0x800 for PAGE_SHIFT == 12) does not tell the full story: all bits in
the interval [IO_TLB_SHIFT, PAGE_SHIFT) matter, not just the most
significant one (for PAGE_SHIFT == 12 and IO_TLB_SHIFT == 11 there is
just one).
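Put as arithmetic, the bits of the page mask that survive the &= ~(IO_TLB_SIZE - 1) step are exactly those in the half-open interval [IO_TLB_SHIFT, PAGE_SHIFT) (a sketch of the arithmetic only, for the common PAGE_SHIFT == 12 case):

```c
#include <assert.h>

#define PAGE_SHIFT	12
#define IO_TLB_SHIFT	11

/* Bits of a "page alignment" mask that survive
 * iotlb_align_mask &= ~(IO_TLB_SIZE - 1): everything in the
 * half-open interval [IO_TLB_SHIFT, PAGE_SHIFT). */
static unsigned long surviving_bits(void)
{
	unsigned long page_mask = (1UL << PAGE_SHIFT) - 1;	/* 0xfff */
	unsigned long in_slot_mask = (1UL << IO_TLB_SHIFT) - 1;	/* 0x7ff */

	return page_mask & ~in_slot_mask;
}
```

For a larger PAGE_SHIFT the interval contains more than one bit, so the surviving pattern is no longer a single power of two.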
Shame on me! Sorry for the confusion!
Regards,
Halil
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-06 7:44 ` Christoph Hellwig
@ 2023-11-06 12:46 ` Petr Tesarik
0 siblings, 0 replies; 20+ messages in thread
From: Petr Tesarik @ 2023-11-06 12:46 UTC (permalink / raw)
To: Christoph Hellwig, Petr Tesařík
Cc: Niklas Schnelle, Bjorn Helgaas, Marek Szyprowski, Robin Murphy,
Ross Lagerwall, linux-pci, linux-kernel, iommu, Matthew Rosato,
Halil Pasic
Hi Christoph,
On 11/6/2023 8:44 AM, Christoph Hellwig wrote:
> On Fri, Nov 03, 2023 at 07:59:49PM +0100, Petr Tesařík wrote:
>> I don't think it's possible to improve the allocation logic without
>> modifying the page allocator and/or the DMA atomic pool allocator to
>> take additional constraints into account.
>>
>> I had a wild idea back in March, but it would require some intrusive
>> changes in the mm subsystem. Among other things, it would make memory
>> zones obsolete. I mean, people may actually like to get rid of DMA,
>> DMA32 and NORMAL, but you see how many nasty bugs were introduced even
>> by a relatively small change in SWIOTLB. Replacing memory zones with a
>> system based on generic physical allocation constraints would probably
>> blow up the universe. ;-)
>
> It would be very nice, at least for DMA32 or the 30/31-bit DMA pools
> used on some architectures. For the x86-style 16MB zone DMA I suspect
> just having a small pool on the side that's not even exposed to the
> memory allocator would probably work better.
>
> I think a lot of the MM folks would love to be able to kill off the
> extra zones.
There's more to it. If you look at DMA buffer allocations, they need
memory which is contiguous in DMA address space of the requesting
device, but we allocate buffers that are contiguous in physical address
of the CPU. This difference is in fact responsible for some of the odd
DMA address limits.
All hell breaks loose when you try to fix this properly. Instead, we get
away with the observation that physically contiguous memory regions
coincide with DMA contiguous regions on real-world systems. But if
anyone feels like starting from scratch, they could also take the extra
time to look at this part. ;-)
FWIW I'm not volunteering, or at least not this year.
Petr T
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-03 20:50 ` Petr Tesařík
2023-11-06 7:42 ` Christoph Hellwig
2023-11-06 10:08 ` Halil Pasic
@ 2023-11-07 17:24 ` Halil Pasic
2023-11-08 9:13 ` Petr Tesařík
2 siblings, 1 reply; 20+ messages in thread
From: Halil Pasic @ 2023-11-07 17:24 UTC (permalink / raw)
To: Petr Tesařík
Cc: Niklas Schnelle, Christoph Hellwig, Bjorn Helgaas,
Marek Szyprowski, Robin Murphy, Petr Tesarik, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato, Halil Pasic,
Jianxiong Gao
On Fri, 3 Nov 2023 21:50:53 +0100
Petr Tesařík <petr@tesarici.cz> wrote:
> > In our opinion the first step towards getting this right is to figure out what
> > the different kinds of alignments are really supposed to mean. For each of the
> > mechanisms we need to understand and document, whether making sure that the
> > bounce buffer does not stretch over more of certain units of memory (like,
> > pages, iova granule size, whatever), or is it about preserving offset within a
> > certain unit of memory, and if yes to what extent (the least significant n-bits
> > of the orig_addr dictated by the respective mask, or something different).
>
>
> Seconded. I have also been struggling with the various alignment
> constraints. I have even written (but not yet submitted) a patch to
> calculate the combined alignment mask in swiotlb_tbl_map_single() and
> pass it down to all other functions, just to make it clear what
> alignment mask is used.
Can you cc me when posting that rework?
>
> My understanding is that buffer alignment may be required by:
>
> 1. hardware which cannot handle an unaligned base address (presumably
> because the chip performs a simple OR operation to get the addresses
> of individual fields);
I'm not sure I understood this properly. What is "base address" in this
context? For swiotlb, is the "base address" basically the return value
of swiotlb_tbl_map_single(), which I referred to as tlb_addr previously?
If yes, I understand that satisfying 1 means satisfying
tlb_addr & combined_mask == 0, where combined_mask describes the
combined alignment requirement (i.e. combined_mask == min_align_mask |
alloc_align_mask | (alloc_size < PAGE_SIZE ? 0 : (PAGE_SIZE - 1))). Does
that sound about right?
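Spelled out as code, the combined requirement proposed above would be (purely our reading of the question, not existing kernel code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL

/* The combined alignment mask as proposed above: the device's
 * min_align_mask, an explicit alloc_align_mask, and implicit page
 * alignment for allocations of at least a page. */
static uint64_t combined_mask(uint64_t min_align_mask,
			      uint64_t alloc_align_mask, size_t alloc_size)
{
	return min_align_mask | alloc_align_mask |
	       (alloc_size < PAGE_SIZE ? 0 : PAGE_SIZE - 1);
}
```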
Can we assume that if 1. then the start address of the mapping
that is orig_addr needs to be already aligned?
>
> 2. isolation of untrusted devices, where no two bounce buffers should
> end up in the same iova granule;
>
> 3. allocation size; I could not find an explanation, so this might be
> merely an attempt at reducing SWIOTLB internal fragmentation.
Assuming I understood 1 correctly, I think we are missing something.
4. preserve the n (0 <= n <= 31) lowest bits of all addresses within the
mapping.
Were it just 1, 2 and 3, we wouldn't need the whole offsetting
business introduced by commit 1f221a0d0dbf ("swiotlb: respect
min_align_mask"). Let me cite from its commit message:
"""
Respect the min_align_mask in struct device_dma_parameters in swiotlb.
There are two parts to it:
1) for the lower bits of the alignment inside the io tlb slot, just
extent the size of the allocation and leave the start of the slot
empty
2) for the high bits ensure we find a slot that matches the high bits
of the alignment to avoid wasting too much memory
Based on an earlier patch from Jianxiong Gao <jxgao@google.com>.
"""
Do we agree, that 4. needs to be added to the list? Or was it supposed
to be covered by 1.?
AFAIU 4. is about either (tlb_addr & combined_mask) == (orig_addr &
combined_mask) or about (tlb_addr & min_align_mask) == (orig_addr &
min_align_mask). And I would like to know which of the two options it is.
Cc-ing Jianxiong.
>
> I hope other people on the Cc list can shed more light on the intended
> behaviour.
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-06 7:42 ` Christoph Hellwig
@ 2023-11-07 17:24 ` Halil Pasic
2023-11-08 7:30 ` Christoph Hellwig
0 siblings, 1 reply; 20+ messages in thread
From: Halil Pasic @ 2023-11-07 17:24 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Petr Tesařík, Niklas Schnelle, Bjorn Helgaas,
Marek Szyprowski, Robin Murphy, Petr Tesarik, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato, Halil Pasic,
Jianxiong Gao
On Mon, 6 Nov 2023 08:42:43 +0100
Christoph Hellwig <hch@lst.de> wrote:
> > 1. hardware which cannot handle an unaligned base address (presumably
> > because the chip performs a simple OR operation to get the addresses
> > of individual fields);
>
> There's all kinds of weird encodings that discard the low bits.
> For NVMe it's the PRPs (that is actually documented in the NVMe
> spec, so it might be easiest to grasp), but except for a Mellanox
> vendor extension this is also how all RDMA memory registrations
> work.
Thanks Christoph! So for NVMe in certain contexts the low
bits of addresses get discarded, but in other contexts the high bits
of addresses get discarded and the low bits need to remain the same
after the bounce (and that's why we need commits 36950f2da1ea ("driver
core: add a min_align_mask field to struct device_dma_parameters") and
1f221a0d0dbf ("swiotlb: respect min_align_mask")).
Does that sound about right?
Regards,
Halil
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-07 17:24 ` Halil Pasic
@ 2023-11-08 7:30 ` Christoph Hellwig
0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2023-11-08 7:30 UTC (permalink / raw)
To: Halil Pasic
Cc: Christoph Hellwig, Petr Tesařík, Niklas Schnelle,
Bjorn Helgaas, Marek Szyprowski, Robin Murphy, Petr Tesarik,
Ross Lagerwall, linux-pci, linux-kernel, iommu, Matthew Rosato,
Jianxiong Gao
On Tue, Nov 07, 2023 at 06:24:37PM +0100, Halil Pasic wrote:
> Thanks Christoph! So for NVMe in certain contexts the low
> bits of addresses get discarded, but in other contexts the high bits
> of addresses get discarded and the low bits need to remain the same
> after the bounce (and that's why we need commits 36950f2da1ea ("driver
> core: add a min_align_mask field to struct device_dma_parameters") and
> 1f221a0d0dbf ("swiotlb: respect min_align_mask")).
Nothing really gets discarded. NVMe basically requires all pages
(where the page is a device page, currently hard coded to 4k in
Linux) except for the first to not have an offset. This is what
the driver sets the virt boundary to in Linux. Thus, when bounce
buffering, all segments (except possibly the first) need to keep
their alignment.
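In other words, a mapping honours the virt boundary when every segment after the first starts at offset zero within a device page; a hedged sketch of that invariant (the struct and function names are ours, not the NVMe driver's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NVME_CTRL_PAGE_SIZE 4096UL	/* device page, hard coded to 4k */

/* A hypothetical DMA segment: bus address plus length. */
struct seg {
	uint64_t addr;
	size_t len;
};

/* True if all segments after the first begin on a device-page
 * boundary, i.e. the mapping respects the virt boundary mask. */
static bool respects_virt_boundary(const struct seg *segs, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (segs[i].addr & (NVME_CTRL_PAGE_SIZE - 1))
			return false;
	return true;
}
```

This is why a bounce buffer that fails to preserve the low-bit offsets of the original segments can break such devices.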
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-07 17:24 ` Halil Pasic
@ 2023-11-08 9:13 ` Petr Tesařík
2023-11-23 10:16 ` Petr Tesařík
0 siblings, 1 reply; 20+ messages in thread
From: Petr Tesařík @ 2023-11-08 9:13 UTC (permalink / raw)
To: Halil Pasic
Cc: Niklas Schnelle, Christoph Hellwig, Bjorn Helgaas,
Marek Szyprowski, Robin Murphy, Petr Tesarik, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato, Jianxiong Gao
On Tue, 7 Nov 2023 18:24:20 +0100
Halil Pasic <pasic@linux.ibm.com> wrote:
> On Fri, 3 Nov 2023 21:50:53 +0100
> Petr Tesařík <petr@tesarici.cz> wrote:
>
> > > In our opinion the first step towards getting this right is to figure out what
> > > the different kinds of alignments are really supposed to mean. For each of the
> > > mechanisms we need to understand and document, whether making sure that the
> > > bounce buffer does not stretch over more of certain units of memory (like,
> > > pages, iova granule size, whatever), or is it about preserving offset within a
> > > certain unit of memory, and if yes to what extent (the least significant n-bits
> > > of the orig_addr dictated by the respective mask, or something different).
> >
> >
> > Seconded. I have also been struggling with the various alignment
> > constraints. I have even written (but not yet submitted) a patch to
> > calculate the combined alignment mask in swiotlb_tbl_map_single() and
> > pass it down to all other functions, just to make it clear what
> > alignment mask is used.
>
> Can you cc me when posting that rework?
Absolutely. I mean, if it still makes sense after we clarify the
intended goals of the various alignment parameters. Because I believe
you have indeed found something!
> > My understanding is that buffer alignment may be required by:
> >
> > 1. hardware which cannot handle an unaligned base address (presumably
> > because the chip performs a simple OR operation to get the addresses
> > of individual fields);
>
> I'm not sure I understood this properly. What is "base address" in this
> context? Is for swiotlb "base address" basically the return value
> of swiotlb_tbl_map_single() -- I referred to this as tlb_addr previously?
>
> If yes, I understand that satisfying 1 means satisfying
> tlb_addr & combined_mask == 0, where combined_mask describes the
> combined alignment requirement (i.e. combined_mask == min_align_mask |
> alloc_align_mask | (alloc_size < PAGE_SIZE ? 0 : (PAGE_SIZE-1)). Does
> that sound about right?
>
> Can we assume that if 1. then the start address of the mapping
> that is orig_addr needs to be already aligned?
No, not really. A very nice diagram can be found in commit 5f89468e2f06
("swiotlb: manipulate orig_addr when tlb_addr has offset"):
"""
1. Get dma_addr_t from dma_map_single()
dma_addr_t tlb_addr = dma_map_single(dev, vaddr, vsize, DMA_TO_DEVICE);
|<---------------vsize------------->|
+-----------------------------------+
| | original buffer
+-----------------------------------+
vaddr
swiotlb_align_offset
|<----->|<---------------vsize------------->|
+-------+-----------------------------------+
| | | swiotlb buffer
+-------+-----------------------------------+
tlb_addr
"""
Here, the aligned address is outside the original buffer at
[vaddr, vaddr + vsize). This is what I referred to as the "base
address". The N lowest bits of this address are zero. It may not
even be mapped in the SWIOTLB if N is greater than IO_TLB_SHIFT.
However, the exact values of the N lowest bits of the original
buffer's physical start address are preserved in tlb_addr.
> > 2. isolation of untrusted devices, where no two bounce buffers should
> > end up in the same iova granule;
> >
> > 3. allocation size; I could not find an explanation, so this might be
> > merely an attempt at reducing SWIOTLB internal fragmentation.
>
>
> Assumed I understood 1 correctly, I think we are missing something.
>
> 4. preserve the n (0 <= n <= 31) lowest bits of all addresses within the
> mapping.
>
> Were it just 1, 2 and 3, we wouldn't need the whole offsetting
> business introduced by commit 1f221a0d0dbf ("swiotlb: respect
> min_align_mask"). Let me cite from its commit message:
>
> """
> Respect the min_align_mask in struct device_dma_parameters in swiotlb.
>
> There are two parts to it:
> 1) for the lower bits of the alignment inside the io tlb slot, just
> extent the size of the allocation and leave the start of the slot
> empty
> 2) for the high bits ensure we find a slot that matches the high bits
> of the alignment to avoid wasting too much memory
>
> Based on an earlier patch from Jianxiong Gao <jxgao@google.com>.
> """
>
> Do we agree, that 4. needs to be added to the list? Or was it supposed
> to be covered by 1.?
That's it. It's what case 1 is supposed to be. However, IIUC cases
2 and 3 don't need to preserve any lowest bits.
At least for case 3, I'm now quite confident that the intention
was to start big buffers on a page-aligned slot, leaving the gaps
for buffers smaller than a page.
Case 2 is not clear to me. Some comments suggest that it should
prevent exposing a single iova granule to multiple untrusted
devices. What the code really does now is prevent crossing an iova
granule boundary if the original buffer did not cross one. I'm not
sure whether it achieves the goal, because commit e81e99bacc9f
("swiotlb: Support aligned swiotlb buffers") also references
PAGE_SIZE, but AFAICS it should use the same logic as case 3
(page-aligned allocations).
To sum it up, there are two types of alignment:
1. specified by a device's min_align_mask; this says how many low
bits of a buffer's physical address must be preserved,
2. specified by allocation size and/or the alignment parameter;
this says how many low bits in the first IO TLB slot's physical
address must be zero.
I hope somebody can confirm or correct this summary before I go
and break something. You know, it's not like cleanups in SWIOTLB
have never broken anything. ;-)
Petr T
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-03 18:59 ` Petr Tesařík
2023-11-06 7:44 ` Christoph Hellwig
@ 2023-11-08 10:52 ` Halil Pasic
2023-11-08 11:04 ` Petr Tesarik
1 sibling, 1 reply; 20+ messages in thread
From: Halil Pasic @ 2023-11-08 10:52 UTC (permalink / raw)
To: Petr Tesařík
Cc: Niklas Schnelle, Christoph Hellwig, Bjorn Helgaas,
Marek Szyprowski, Robin Murphy, Petr Tesarik, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato, Halil Pasic
On Fri, 3 Nov 2023 19:59:49 +0100
Petr Tesařík <petr@tesarici.cz> wrote:
> > Not sure how to properly fix this as the different alignment
> > requirements get pretty complex quickly. So would appreciate your
> > input.
>
> I don't think it's possible to improve the allocation logic without
> modifying the page allocator and/or the DMA atomic pool allocator to
> take additional constraints into account.
I don't understand. What prevents us from calculating the amount of space
needed up front, so that even with the alignment waste the bounce buffer
still fits in the pool?
I believe alloc_size + combined_mask is a trivial upper bound, but we can
do slightly better since we know that we allocate pages.
For the sake of simplicity let us assume we only have the min_align_mask
requirement. Then I believe the worst case is that we need
(orig_addr & min_align_mask & PAGE_MASK) + (min_align_mask & ~PAGE_MASK)
extra space to fit.
Depending on how the semantics pan out one may be able to replace
min_align_mask with combined_mask.
Is your point that for large combined_mask values
__get_free_pages(GFP_NOWAIT | __GFP_NOWARN, required_order) is not
likely to complete successfully?
Regards,
Halil
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-08 10:52 ` Halil Pasic
@ 2023-11-08 11:04 ` Petr Tesarik
2023-11-08 14:32 ` Halil Pasic
0 siblings, 1 reply; 20+ messages in thread
From: Petr Tesarik @ 2023-11-08 11:04 UTC (permalink / raw)
To: Halil Pasic, Petr Tesařík
Cc: Niklas Schnelle, Christoph Hellwig, Bjorn Helgaas,
Marek Szyprowski, Robin Murphy, Ross Lagerwall, linux-pci,
linux-kernel, iommu, Matthew Rosato
On 11/8/2023 11:52 AM, Halil Pasic wrote:
> On Fri, 3 Nov 2023 19:59:49 +0100
> Petr Tesařík <petr@tesarici.cz> wrote:
>
>>> Not sure how to properly fix this as the different alignment
>>> requirements get pretty complex quickly. So would appreciate your
>>> input.
>>
>> I don't think it's possible to improve the allocation logic without
>> modifying the page allocator and/or the DMA atomic pool allocator to
>> take additional constraints into account.
>
> I don't understand. What prevents us from calculating the amount of space
> needed up front, so that even with the alignment waste the bounce buffer
> still fits in the pool?
>
> I believe alloc_size + combined_mask is a trivial upper bound, but we can
> do slightly better since we know that we allocate pages.
>
> For the sake of simplicity let us assume we only have the min_align_mask
> requirement. Then I believe the worst case is that we need
> (orig_addr & min_align_mask & PAGE_MASK) + (min_align_mask & ~PAGE_MASK)
> extra space to fit.
>
> Depending on how the semantics pan out one may be able to replace
> min_align_mask with combined_mask.
>
> Is your point that for large combined_mask values
> __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, required_order) is not
> likely to complete successfully?
Yes, that's the reason. OTOH it's probably worth a try. The point is
that mapping a DMA buffer is allowed to fail, so callers should be
prepared anyway.
And for the case you reported initially, I don't think there is any need
to preserve bit 11 (0x800) from the original buffer's physical address,
which is enough to fix it. See also my other email earlier today.
Petr T
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-08 11:04 ` Petr Tesarik
@ 2023-11-08 14:32 ` Halil Pasic
2023-11-08 14:45 ` Petr Tesarik
0 siblings, 1 reply; 20+ messages in thread
From: Halil Pasic @ 2023-11-08 14:32 UTC (permalink / raw)
To: Petr Tesarik
Cc: Petr Tesařík, Niklas Schnelle, Christoph Hellwig,
Bjorn Helgaas, Marek Szyprowski, Robin Murphy, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato, Halil Pasic
On Wed, 8 Nov 2023 12:04:12 +0100
Petr Tesarik <petr.tesarik1@huawei-partners.com> wrote:
[..]
> >
> > For the sake of simplicity let us assume we only have the min_align_mask
> > requirement. Then I believe the worst case is that we need
> > (orig_addr & min_align_mask & PAGE_MASK) + (min_align_mask & ~PAGE_MASK)
> > extra space to fit.
> >
> > Depending on how the semantics pan out one may be able to replace
> > min_align_mask with combined_mask.
> >
> > Is your point that for large combined_mask values
> > __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, required_order) is not
> > likely to complete successfully?
>
> Yes, that's the reason. OTOH it's probably worth a try. The point is
> that mapping a DMA buffer is allowed to fail, so callers should be
> prepared anyway.
>
> And for the case you reported initially, I don't think there is any need
> to preserve bit 11 (0x800) from the original buffer's physical address,
> which is enough to fix it. See also my other email earlier today.
Hm. Do you mean "[PATCH 1/1] swiotlb: fix out-of-bounds TLB allocations
with CONFIG_SWIOTLB_DYNAMIC" or a different one?
I only see "[PATCH 1/1] swiotlb: fix out-of-bounds TLB allocations
with CONFIG_SWIOTLB_DYNAMIC" but I don't think that one takes
care of "I don't think there is any need to preserve bit 11 (0x800)
from the original buffer's physical address". But it should take care of
the corruption, I agree with that. I hope to provide review for that
patch by the end of the day.
Regards,
Halil
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-08 14:32 ` Halil Pasic
@ 2023-11-08 14:45 ` Petr Tesarik
2023-11-10 9:22 ` Halil Pasic
0 siblings, 1 reply; 20+ messages in thread
From: Petr Tesarik @ 2023-11-08 14:45 UTC (permalink / raw)
To: Halil Pasic
Cc: Petr Tesařík, Niklas Schnelle, Christoph Hellwig,
Bjorn Helgaas, Marek Szyprowski, Robin Murphy, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato
On 11/8/2023 3:32 PM, Halil Pasic wrote:
> On Wed, 8 Nov 2023 12:04:12 +0100
> Petr Tesarik <petr.tesarik1@huawei-partners.com> wrote:
> [..]
>>>
>>> For the sake of simplicity let us assume we only have the min_align_mask
>>> requirement. Then I believe the worst case is that we need
>>> (orig_addr & min_align_mask & PAGE_MASK) + (min_align_mask & ~PAGE_MASK)
>>> extra space to fit.
>>>
>>> Depending on how the semantics pan out one may be able to replace
>>> min_align_mask with combined_mask.
>>>
>>> Is your point that for large combined_mask values
>>> __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, required_order) is not
>>> likely to complete successfully?
>>
>> Yes, that's the reason. OTOH it's probably worth a try. The point is
>> that mapping a DMA buffer is allowed to fail, so callers should be
>> prepared anyway.
>>
>> And for the case you reported initially, I don't think there is any need
>> to preserve bit 11 (0x800) from the original buffer's physical address,
>> which is enough to fix it. See also my other email earlier today.
>
> Hm. Do you mean "[PATCH 1/1] swiotlb: fix out-of-bounds TLB allocations
> with CONFIG_SWIOTLB_DYNAMIC" or a different one?
>
> I only see "[PATCH 1/1] swiotlb: fix out-of-bounds TLB allocations
> with CONFIG_SWIOTLB_DYNAMIC" but I don't think that one takes
> care of "I don't think there is any need to preserve bit 11 (0x800)
> from the original buffer's physical address".
Yes, I mean only this patch. I want to fix memory corruption fast, while
waiting for more feedback on my understanding of the alignment masks.
What I'm trying to say is that your specific use case may not even need
a bigger allocation if the page alignment should be interpreted differently.
Again, thank you for your in-depth inspection, because it's not totally
clear how the various alignment parameters should be interpreted. It's
difficult to write correct code then...
Petr T
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-08 14:45 ` Petr Tesarik
@ 2023-11-10 9:22 ` Halil Pasic
0 siblings, 0 replies; 20+ messages in thread
From: Halil Pasic @ 2023-11-10 9:22 UTC (permalink / raw)
To: Petr Tesarik
Cc: Petr Tesařík, Niklas Schnelle, Christoph Hellwig,
Bjorn Helgaas, Marek Szyprowski, Robin Murphy, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato, Halil Pasic
On Wed, 8 Nov 2023 15:45:49 +0100
Petr Tesarik <petr.tesarik1@huawei-partners.com> wrote:
> On 11/8/2023 3:32 PM, Halil Pasic wrote:
> > On Wed, 8 Nov 2023 12:04:12 +0100
> > Petr Tesarik <petr.tesarik1@huawei-partners.com> wrote:
> > [..]
> >>>
> >>> For the sake of simplicity let us assume we only have the min_align_mask
> >>> requirement. Then I believe the worst case is that we need
> >>> (orig_addr & min_align_mask & PAGE_MASK) + (min_align_mask & ~PAGE_MASK)
> >>> extra space to fit.
> >>>
> >>> Depending on how the semantics pan out one may be able to replace
> >>> min_align_mask with combined_mask.
> >>>
> >>> Is your point that for large combined_mask values
> >>> __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, required_order) is not
> >>> likely to complete successfully?
> >>
> >> Yes, that's the reason. OTOH it's probably worth a try. The point is
> >> that mapping a DMA buffer is allowed to fail, so callers should be
> >> prepared anyway.
> >>
> >> And for the case you reported initially, I don't think there is any need
> >> to preserve bit 11 (0x800) from the original buffer's physical address,
> >> which is enough to fix it. See also my other email earlier today.
> >
> > Hm. Do you mean "[PATCH 1/1] swiotlb: fix out-of-bounds TLB allocations
> > with CONFIG_SWIOTLB_DYNAMIC" or a different one?
> >
> > I only see "[PATCH 1/1] swiotlb: fix out-of-bounds TLB allocations
> > with CONFIG_SWIOTLB_DYNAMIC" but I don't think that one takes
> > care of "I don't think there is any need to preserve bit 11 (0x800)
> > from the original buffer's physical address".
>
> Yes, I mean only this patch. I want to fix memory corruption fast, while
> waiting for more feedback on my understanding of the alignment masks.
> What I'm trying to say is that your specific use case may not even need
> a bigger allocation if the page alignment should be interpreted differently.
>
> Again, thank you for your in-depth inspection, because it's not totally
> clear how the various alignment parameters should be interpreted. It's
> difficult to write correct code then...
I fully understand. Thanks for tackling this. We decided to go with a bug
report rather than a fix for the very same reason: lack of clarity on how
certain things are supposed to work. Let's see how the discussion
develops. :)
Regards,
Halil
>
> Petr T
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-08 9:13 ` Petr Tesařík
@ 2023-11-23 10:16 ` Petr Tesařík
2023-11-27 15:59 ` Christoph Hellwig
0 siblings, 1 reply; 20+ messages in thread
From: Petr Tesařík @ 2023-11-23 10:16 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Halil Pasic, Niklas Schnelle, Bjorn Helgaas, Marek Szyprowski,
Robin Murphy, Petr Tesarik, Ross Lagerwall, linux-pci,
linux-kernel, iommu, Matthew Rosato, Jianxiong Gao
Hello everybody,
I don't think I have ever seen an answer to my question regarding
alignment constraints on swiotlb bounce buffers:
On Wed, 8 Nov 2023 10:13:47 +0100
Petr Tesařík <petr@tesarici.cz> wrote:
>[...]
> To sum it up, there are two types of alignment:
>
> 1. specified by a device's min_align_mask; this says how many low
> bits of a buffer's physical address must be preserved,
>
> 2. specified by allocation size and/or the alignment parameter;
> this says how many low bits in the first IO TLB slot's physical
> address must be zero.
>
> I hope somebody can confirm or correct this summary before I go
> and break something. You know, it's not like cleanups in SWIOTLB
> have never broken anything. ;-)
If no answer means that nobody knows, then based on my understanding of
the existing code (both the implementation and its users), I can assume
that this is the correct interpretation.
I'm giving it a few more days. If there's still no reaction, expect a
beautiful documentation patch and a less beautiful cleanup patch in the
next week.
Petr T
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-23 10:16 ` Petr Tesařík
@ 2023-11-27 15:59 ` Christoph Hellwig
2023-11-28 7:16 ` Petr Tesařík
0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2023-11-27 15:59 UTC (permalink / raw)
To: Petr Tesařík
Cc: Christoph Hellwig, Halil Pasic, Niklas Schnelle, Bjorn Helgaas,
Marek Szyprowski, Robin Murphy, Petr Tesarik, Ross Lagerwall,
linux-pci, linux-kernel, iommu, Matthew Rosato, Jianxiong Gao
On Thu, Nov 23, 2023 at 11:16:08AM +0100, Petr Tesařík wrote:
> > To sum it up, there are two types of alignment:
> >
> > 1. specified by a device's min_align_mask; this says how many low
> > bits of a buffer's physical address must be preserved,
> >
> > 2. specified by allocation size and/or the alignment parameter;
> > this says how many low bits in the first IO TLB slot's physical
> > address must be zero.
Both are correct.
* Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y
2023-11-27 15:59 ` Christoph Hellwig
@ 2023-11-28 7:16 ` Petr Tesařík
0 siblings, 0 replies; 20+ messages in thread
From: Petr Tesařík @ 2023-11-28 7:16 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Halil Pasic, Niklas Schnelle, Bjorn Helgaas, Marek Szyprowski,
Robin Murphy, Petr Tesarik, Ross Lagerwall, linux-pci,
linux-kernel, iommu, Matthew Rosato, Jianxiong Gao
On Mon, 27 Nov 2023 16:59:13 +0100
Christoph Hellwig <hch@lst.de> wrote:
> On Thu, Nov 23, 2023 at 11:16:08AM +0100, Petr Tesařík wrote:
> > > To sum it up, there are two types of alignment:
> > >
> > > 1. specified by a device's min_align_mask; this says how many low
> > > bits of a buffer's physical address must be preserved,
> > >
> > > 2. specified by allocation size and/or the alignment parameter;
> > > this says how many low bits in the first IO TLB slot's physical
> > > address must be zero.
>
> Both are correct.
Great! Thank you for confirmation. Unfortunately, that's not quite how
the code works now.
I'm on it to fix things.
Petr T
end of thread, other threads:[~2023-11-28 7:16 UTC | newest]
Thread overview: 20+ messages
2023-11-03 15:13 Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y Niklas Schnelle
2023-11-03 16:14 ` Halil Pasic
2023-11-03 20:50 ` Petr Tesařík
2023-11-06 7:42 ` Christoph Hellwig
2023-11-07 17:24 ` Halil Pasic
2023-11-08 7:30 ` Christoph Hellwig
2023-11-06 10:08 ` Halil Pasic
2023-11-07 17:24 ` Halil Pasic
2023-11-08 9:13 ` Petr Tesařík
2023-11-23 10:16 ` Petr Tesařík
2023-11-27 15:59 ` Christoph Hellwig
2023-11-28 7:16 ` Petr Tesařík
2023-11-03 18:59 ` Petr Tesařík
2023-11-06 7:44 ` Christoph Hellwig
2023-11-06 12:46 ` Petr Tesarik
2023-11-08 10:52 ` Halil Pasic
2023-11-08 11:04 ` Petr Tesarik
2023-11-08 14:32 ` Halil Pasic
2023-11-08 14:45 ` Petr Tesarik
2023-11-10 9:22 ` Halil Pasic