Re: [PATCH] arm: dma-mapping: don't call folio_next() beyond the requested region

From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Rapoport <rppt@kernel.org>
Subject: Re: [PATCH] arm: dma-mapping: don't call folio_next() beyond the requested region
Date: Thu, 10 Aug 2023 12:06:09 +0100
Message-ID: <ZNTEoUB7V5BtNvfp@shell.armlinux.org.uk>
In-Reply-To: <20230810091955.3579004-1-m.szyprowski@samsung.com>

On Thu, Aug 10, 2023 at 11:19:55AM +0200, Marek Szyprowski wrote:
> Add a check for the non-zero offset case to avoid calling folio_next()
> beyond the requested region and relying on its parameters.
> 
> Fixes: cc24e9c0895c ("arm: implement the new page table range API")
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> ---
>  arch/arm/mm/dma-mapping.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 0474840224d9..6c952d6899f2 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -715,6 +715,8 @@ static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
>  
>  		if (offset) {
>  			left -= folio_size(folio) - offset;

In most cases, "offset" is masked by ~PAGE_MASK. The only exception is
when we're walking scatterlists, where it is whatever is in
sg->offset.

So, what is the range of values that sg->offset could take? If it is
guaranteed to be less than PAGE_SIZE (or folio_size() of the first
folio) then we're all good.

However, consider what happens with the above when offset is larger
than the first folio size. To show this, let's rewrite it:

	left += offset - folio_size(folio);

In that case, "left" becomes larger than the original size, which
surely is not what we want?

This wasn't a problem with the original code, because we guaranteed
that "off" was always less than PAGE_SIZE, so

	left -= PAGE_SIZE - off;

would always result in a reduction, but this is no longer the case
with folios.
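
To make the arithmetic concrete, here is a small userspace sketch of that
one adjustment. It is not kernel code: "fsize" stands in for
folio_size(folio), and the function name is made up for this example.

```c
#include <stddef.h>

/*
 * Userspace model, not kernel code.  "fsize" stands in for
 * folio_size(folio); the function name is hypothetical.
 */
static long long left_after_first_folio(long long left, size_t offset,
					 size_t fsize)
{
	/* Same arithmetic as: left -= folio_size(folio) - offset; */
	return left - ((long long)fsize - (long long)offset);
}
```

With a 4096-byte folio and left = 8192, an offset of 512 gives 4608, while
an offset of 6144 gives 10240, which is more than we started with.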

The more I'm looking at this, the more I'm convinced that the original
conversion is wrong. Let's go back to the original code and see what
it is doing:

                size_t left = size;

                pfn = page_to_pfn(page) + off / PAGE_SIZE;

This positions pfn to be the page frame number to which the page and
offset passed into the function reference.

                off %= PAGE_SIZE;

This gives us the offset _within_ that page.

                if (off) {
                        pfn++;
                        left -= PAGE_SIZE - off;
                }

This says that if the first page is only partially covered, we skip
marking it clean: only _full_ pages get marked clean.

                while (left >= PAGE_SIZE) {
                        page = pfn_to_page(pfn++);
                        set_bit(PG_dcache_clean, &page->flags);
                        left -= PAGE_SIZE;
                }

Here, we iterate over the remaining size, marking only _whole_ pages
clean and skipping a final partial page.
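
As a sanity check of that reading, the page-based walk can be modelled in
userspace. This is only a sketch: the 4 KiB page size, the struct and the
function name are assumptions for illustration, and the early return for
sizes below one page is added here so that the unsigned subtraction cannot
wrap in the model.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Hypothetical result type: which page frames would be marked clean. */
struct clean_range {
	unsigned long first_pfn;	/* first page frame marked clean */
	unsigned long count;		/* number of page frames marked clean */
};

static struct clean_range model_page_walk(unsigned long base_pfn,
					  unsigned long off, size_t size)
{
	struct clean_range r = { 0, 0 };
	size_t left = size;
	unsigned long pfn;

	/* Model-only guard: nothing to mark for less than one page */
	if (size < MODEL_PAGE_SIZE)
		return r;

	/* Page frame that base_pfn plus the offset refers to */
	pfn = base_pfn + off / MODEL_PAGE_SIZE;
	off %= MODEL_PAGE_SIZE;

	/* Partial first page: skip it and shorten the remaining size */
	if (off) {
		pfn++;
		left -= MODEL_PAGE_SIZE - off;
	}

	r.first_pfn = pfn;
	/* Count whole pages only; a final partial page is left alone */
	while (left >= MODEL_PAGE_SIZE) {
		r.count++;
		left -= MODEL_PAGE_SIZE;
	}
	return r;
}
```

For example, starting at pfn 100 with off = 4608 and size = 8192, the model
marks exactly one page, pfn 102: the offset skips pfn 100 entirely, pfn 101
is partial at the start, and pfn 103 is partial at the end.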

Now, if we consider the folio version:

+               ssize_t left = size;

This converts an unsigned size_t to a signed ssize_t, which will not give
the expected result if size is large enough to exceed the signed range.
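
A userspace sketch of the signedness problem, using long and unsigned long
in place of ssize_t and size_t. The function name is made up, and the
out-of-range conversion is implementation-defined in C; the comment assumes
the usual two's complement wrap that gcc and clang implement.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Userspace model, not kernel code.  Counts how many whole units of
 * "unit" bytes a signed "left" loop would process for a given size.
 */
static unsigned long whole_units_signed(unsigned long size, unsigned long unit)
{
	/* Wraps to a negative value when size > LONG_MAX (gcc/clang) */
	long left = (long)size;
	unsigned long n = 0;

	while (left >= (long)unit) {
		n++;
		left -= (long)unit;
	}
	return n;
}
```

For a size of 8192 and a unit of 4096 this counts two units as expected,
but for any size above LONG_MAX "left" starts out negative and the loop
never runs at all.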

+               size_t offset = offset_in_folio(folio, paddr);

paddr here is the physical address of the page plus the passed-in
offset. If the offset is larger than a page, then paddr points into one
of the following pages. What if the offset is larger than the first folio
size, bearing in mind that there is no limit on the offset in a
scatterlist?

If offset is bounded to the size of the folio, then that can truncate
the original "offset" that was passed in and we'll end up marking
the wrong folios, because:

+
+               if (offset) {
+                       left -= folio_size(folio) - offset;
+                       folio = folio_next(folio);
                }

This only allows us to move to the next folio.

Moreover, if offset here _is_ allowed to be bigger than folio_size()
then we end up _increasing_ "left" as stated above, so we end up
marking _more_ folios as clean than the user of this function requested.

+
+               while (left >= (ssize_t)folio_size(folio)) {
+                       set_bit(PG_dcache_clean, &folio->flags);
+                       left -= folio_size(folio);
+                       folio = folio_next(folio);
                }

So, in all, to me it looks like this conversion is basically wrong, and it
needs to be something like:

		size_t left = size;

		while (off >= folio_size(folio)) {
			off -= folio_size(folio);
			folio = folio_next(folio);
		}
		if (off) {
			/* Partial first folio */
			size_t first = folio_size(folio) - off;

			/* Size doesn't reach the end of the first folio, so exit */
			if (left < first)
				return;

			/* Truncate the size and move to the next folio */
			left -= first;
			folio = folio_next(folio);
		}

		while (left >= folio_size(folio)) {
			/* can't become negative */
			left -= folio_size(folio);
			set_bit(PG_dcache_clean, &folio->flags);
			if (!left)
				break;
			folio = folio_next(folio);
		}
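
For anyone who wants to poke at the loop structure above without a kernel,
here is a userspace sketch that walks an array of folio sizes and returns a
bitmask of the folios that would be marked clean. Everything here is a
stand-in for illustration: the function name, the example layout, and the
array bound "n" (the kernel code has no such bound) are assumptions, and
the mask limits the model to 32 folios.

```c
#include <stddef.h>

/* Hypothetical example layout: sizes of four consecutive folios. */
static const size_t mock_layout[4] = { 4096, 16384, 4096, 4096 };

/*
 * Userspace model, not kernel code.  Bit i of the result is set when
 * folio i would be marked clean.  Assumes n <= 32.
 */
static unsigned int mock_clean_mask(const size_t *sizes, size_t n,
				    size_t off, size_t size)
{
	unsigned int mask = 0;
	size_t left = size;
	size_t i = 0;

	/* Skip folios that lie wholly before the offset */
	while (i < n && off >= sizes[i]) {
		off -= sizes[i];
		i++;
	}

	if (i < n && off) {
		/* Partial first folio */
		size_t first = sizes[i] - off;

		/* Size doesn't reach the end of this folio, so exit */
		if (left < first)
			return mask;

		left -= first;
		i++;
	}

	/* Mark whole folios only; "left" can't wrap below zero */
	while (i < n && left >= sizes[i]) {
		left -= sizes[i];
		mask |= 1u << i;
		if (!left)
			break;
		i++;
	}
	return mask;
}
```

With this layout, off = 4608 and size = 20000 give a mask of 0x4: folio 0 is
skipped by the offset, folio 1 is partial at the start, folio 2 is fully
covered, and folio 3 is partial at the end.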

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
