From: Jason Gunthorpe <jgg@nvidia.com>
To: Steve Sistare <steven.sistare@oracle.com>
Cc: iommu@lists.linux.dev, Kevin Tian <kevin.tian@intel.com>,
Nicolin Chen <nicolinc@nvidia.com>
Subject: Re: [PATCH V4 5/9] iommufd: folio subroutines
Date: Mon, 21 Oct 2024 14:30:21 -0300 [thread overview]
Message-ID: <20241021173021.GC13034@nvidia.com> (raw)
In-Reply-To: <1729286856-127844-6-git-send-email-steven.sistare@oracle.com>
On Fri, Oct 18, 2024 at 02:27:32PM -0700, Steve Sistare wrote:
> -/* true if the pfn was added, false otherwise */
> -static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
> +/* returns the number of pfn's added */
> +static long batch_add_pfn_num(struct pfn_batch *batch, unsigned long pfn,
> + unsigned long nr)
> {
> - const unsigned int MAX_NPFNS = type_max(typeof(*batch->npfns));
> + const unsigned long MAX_NPFNS = type_max(typeof(*batch->npfns));
> + unsigned long n = 0;
>
> - if (batch->end &&
> - pfn == batch->pfns[batch->end - 1] + batch->npfns[batch->end - 1] &&
> - batch->npfns[batch->end - 1] != MAX_NPFNS) {
> - batch->npfns[batch->end - 1]++;
> - batch->total_pfns++;
> - return true;
> + if (batch->end) {
> + unsigned long npfn_end = batch->npfns[batch->end - 1];
> + unsigned long pfn_end = batch->pfns[batch->end - 1];
> +
> + if (pfn == pfn_end + npfn_end && npfn_end < MAX_NPFNS) {
> + n = min_t(unsigned long, MAX_NPFNS - npfn_end, nr);
> + batch->npfns[batch->end - 1] += n;
> + batch->total_pfns += n;
> + if (nr == n)
> + return n;
> + nr -= n;
Looking at this a bit more carefully, MAX_NPFNS is 2**32 and a PFN is
2**12, so a batch entry can describe 2**44 bytes of contiguous memory.
I don't think we need all this logic to try to do 'min'. If the thing
we are trying to add doesn't fit then just skip trying to join it and
continue on.
The next empty item will hold 2**44 and that can hold any single folio
- folio npages is limited to an unsigned int.
One reason I bring this up is that we really don't want to split
across folios, like if you get a 2M,2M,1G pattern then we don't want
to split the 1G across batches as that will definately harm
construction of the page table. Theoretical because of how big the
size is but still.
> +static void batch_from_folios(struct pfn_batch *batch, struct folio ***folios_p,
> + unsigned long *offset_p, unsigned long npages)
> +{
> + unsigned long offset = *offset_p;
> + struct folio **folios = *folios_p;
> +
> + while (npages) {
> + unsigned long n;
> + struct folio *folio = *folios;
> + unsigned long nr = folio_nr_pages(folio) - offset;
> + unsigned long pfn = page_to_pfn(folio_page(folio, offset));
> +
> + nr = min(nr, npages);
Ah, npages is used for the trailing partial folio too..
> + n = batch_add_pfn_num(batch, pfn, nr);
> + npages -= n;
> +
> + if (n == nr) {
> + folios++;
> + offset = 0;
> + } else if (n) {
> + offset += n;
> + } else {
> + break;
> + }
Then this gets to be quite a bit simpler
And you'll put the refcount adjustment in here like you said?
Otherwise this patch looks like the right thing
Thanks,
Jason
next prev parent reply other threads:[~2024-10-21 17:30 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-18 21:27 [PATCH V4 0/9] iommu_ioas_map_file Steve Sistare
2024-10-18 21:27 ` [PATCH V4 1/9] mm/gup: folio_add_pins Steve Sistare
2024-10-18 21:27 ` [PATCH V4 2/9] iommufd: rename uptr in iopt_alloc_iova Steve Sistare
2024-10-22 18:32 ` Nicolin Chen
2024-10-18 21:27 ` [PATCH V4 3/9] iommufd: generalize iopt_pages address Steve Sistare
2024-10-22 18:38 ` Nicolin Chen
2024-10-18 21:27 ` [PATCH V4 4/9] iommufd: pfn reader local variables Steve Sistare
2024-10-18 21:27 ` [PATCH V4 5/9] iommufd: folio subroutines Steve Sistare
2024-10-21 17:30 ` Jason Gunthorpe [this message]
2024-10-22 13:26 ` Steven Sistare
2024-10-18 21:27 ` [PATCH V4 6/9] iommufd: pfn reader for file mappings Steve Sistare
2024-10-19 18:41 ` kernel test robot
2024-10-21 17:32 ` Jason Gunthorpe
2024-10-18 21:27 ` [PATCH V4 7/9] iommufd: IOMMU_IOAS_MAP_FILE Steve Sistare
2024-10-21 17:34 ` Jason Gunthorpe
2024-10-18 21:27 ` [PATCH V4 8/9] iommufd: file mappings for mdev Steve Sistare
2024-10-18 21:27 ` [PATCH V4 9/9] iommufd: map file selftest Steve Sistare
2024-10-18 22:51 ` Nicolin Chen
2024-10-22 13:27 ` Steven Sistare
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241021173021.GC13034@nvidia.com \
--to=jgg@nvidia.com \
--cc=iommu@lists.linux.dev \
--cc=kevin.tian@intel.com \
--cc=nicolinc@nvidia.com \
--cc=steven.sistare@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.