All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: Paul Durrant <paul@xen.org>, xen-devel@lists.xenproject.org
Subject: Re: [PATCH v2 3/6] x86/iommu: introduce a rangeset to perform hwdom IOMMU setup
Date: Tue, 5 Dec 2023 16:29:14 +0100	[thread overview]
Message-ID: <ZW9Byij0hlUsVzDL@macbook> (raw)
In-Reply-To: <93b57f96-e47e-493b-b0f4-a8183ba8466f@suse.com>

On Tue, Dec 05, 2023 at 03:50:44PM +0100, Jan Beulich wrote:
> On 04.12.2023 10:43, Roger Pau Monne wrote:
> > This change just introduces the boilerplate code in order to use a rangeset
> > when setting up the hardware domain IOMMU mappings.  The rangeset is never
> > populated in this patch, so it's a non-functional change as far as the mappings
> > the domain gets established.
> > 
> > Note there's a change for HVM domains (ie: PVH dom0) that will get switched to
> > create the p2m mappings using map_mmio_regions() instead of
> > p2m_add_identity_entry(), so that ranges can be mapped with a single function
> > call if possible.  Note that the interface of map_mmio_regions() doesn't allow
> > creating read-only mappings, but so far there are no such mappings created for
> > PVH dom0 in arch_iommu_hwdom_init().
> 
> I don't understand this paragraph: The rangeset remains empty, so nothing is
> changing right here. DYM there is going to be such a change as a result of
> this patch, but in a later part of this series?

Yes, when the rangeset is populated and mappings are created based on
its contents, map_mmio_regions() will be used instead of
p2m_add_identity_entry().  I guess the '... that will get switched to
create the p2m ...' is not clear enough.

> > --- a/xen/drivers/passthrough/x86/iommu.c
> > +++ b/xen/drivers/passthrough/x86/iommu.c
> > @@ -370,10 +370,77 @@ static unsigned int __hwdom_init hwdom_iommu_map(const struct domain *d,
> >      return perms;
> >  }
> >  
> > +struct map_data {
> > +    struct domain *d;
> > +    unsigned int flush_flags;
> > +    bool ro;
> > +};
> > +
> > +static int __hwdom_init cf_check identity_map(unsigned long s, unsigned long e,
> > +                                              void *data)
> > +{
> > +    struct map_data *info = data;
> > +    struct domain *d = info->d;
> > +    long rc;
> > +
> > +    if ( iommu_verbose )
> > +        printk(XENLOG_INFO " [%010lx, %010lx] R%c\n",
> > +               s, e, info->ro ? 'O' : 'W');
> > +
> > +    if ( paging_mode_translate(d) )
> > +    {
> > +        if ( info->ro )
> > +        {
> > +            ASSERT_UNREACHABLE();
> > +            return 0;
> > +        }
> > +        while ( (rc = map_mmio_regions(d, _gfn(s), e - s + 1, _mfn(s))) > 0 )
> > +        {
> > +            s += rc;
> > +            process_pending_softirqs();
> > +        }
> > +    }
> > +    else
> > +    {
> > +        const unsigned int perms = IOMMUF_readable | IOMMUF_preempt |
> > +                                   (info->ro ? 0 : IOMMUF_writable);
> > +
> > +        if ( info->ro && !iomem_access_permitted(d, s, e) )
> 
> How is r/o-ness related to iomem_access_permitted()? The present callers
> are such that there is a connection, but that's invisible here. I guess
> either the field wants to change name (maybe mmio_ro or ro_mmio or even
> just mmio), or there wants to be a comment.

Will add:

"read-only ranges are only created based on the contents of
mmio_ro_ranges, and hence need the additional iomem_access_permitted()
check."

> > +        {
> > +            /*
> > +             * Should be more fine grained in order to not map the forbidden
> > +             * frame instead of rejecting the region as a whole, but it's only
> > +             * for read-only MMIO regions, which are very limited.
> > +             */
> 
> How certain are you/we that no two adjacent ones may appear, with
> different permissions granted to Dom0?

Yeah, I was already not very convinced by this, and I think the only
solution here is to iterate over the read-only ranges with page
granularity.  In any case read-only ranges are both few and small in
size, hence this is unlikely to be noticeable performance wise.

> > +            printk(XENLOG_DEBUG
> > +                   "IOMMU read-only mapping of region [%lx, %lx] forbidden\n",
> > +                   s, e);
> > +            return 0;
> > +        }
> > +        while ( (rc = iommu_map(d, _dfn(s), _mfn(s), e - s + 1,
> > +                                perms, &info->flush_flags)) > 0 )
> > +        {
> > +            s += rc;
> > +            process_pending_softirqs();
> > +        }
> > +    }
> > +    ASSERT(rc <= 0);
> > +    if ( rc )
> > +        printk(XENLOG_WARNING
> > +               "IOMMU identity mapping of [%lx, %lx] failed: %ld\n",
> > +               s, e, rc);
> > +
> > +    /* Ignore errors and attempt to map the remaining regions. */
> > +    return 0;
> > +}
> > +
> >  void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
> >  {
> >      unsigned long i, top, max_pfn, start, count;
> >      unsigned int flush_flags = 0, start_perms = 0;
> > +    struct rangeset *map;
> > +    struct map_data map_data = { .d = d };
> > +    int rc;
> >  
> >      BUG_ON(!is_hardware_domain(d));
> >  
> > @@ -397,6 +464,10 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
> >      if ( iommu_hwdom_passthrough )
> >          return;
> >  
> > +    map = rangeset_new(NULL, NULL, 0);
> > +    if ( !map )
> > +        panic("IOMMU init: unable to allocate rangeset\n");
> > +
> >      max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
> >      top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
> >  
> > @@ -451,6 +522,24 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
> >              goto commit;
> >      }
> >  
> > +    if ( iommu_verbose )
> > +        printk(XENLOG_INFO "d%u: identity mappings for IOMMU:\n",
> > +               d->domain_id);
> 
> %pd: ?

Indeed, I probably copied this from a different chunk and didn't
adjust to use %pd.

> > +    rc = rangeset_report_ranges(map, 0, ~0UL, identity_map, &map_data);
> > +    if ( rc )
> > +        panic("IOMMU unable to create mappings: %d\n", rc);
> > +    if ( is_pv_domain(d) )
> > +    {
> > +        map_data.ro = true;
> > +        rc = rangeset_report_ranges(mmio_ro_ranges, 0, ~0UL, identity_map,
> > +                                    &map_data);
> > +        if ( rc )
> > +            panic("IOMMU unable to create read-only mappings: %d\n", rc);
> > +    }
> > +
> > +    rangeset_destroy(map);
> 
> This could move up, couldn't it?

Yes, could be moved just after the rangeset_report_ranges(map...)
call.

> >      /* Use if to avoid compiler warning */
> >      if ( iommu_iotlb_flush_all(d, flush_flags) )
> 
> Don't you need to fold map.flush_flags into flush_flags ahead of this call?
> Or can the variable perhaps go away altogether, being replaced by the struct
> field?

Yes, the variable ends up being replaced in a later patch, hence I
didn't touch it here.

Thanks, Roger.


  reply	other threads:[~2023-12-05 15:29 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-04  9:42 [PATCH v2 0/6] x86/iommu: improve setup time of hwdom IOMMU Roger Pau Monne
2023-12-04  9:43 ` [PATCH v2 1/6] iommu/vt-d: do not assume page table levels for quarantine domain Roger Pau Monne
2023-12-05 14:24   ` Jan Beulich
2023-12-05 14:30     ` Roger Pau Monné
2023-12-04  9:43 ` [PATCH v2 2/6] amd-vi: set IOMMU page table levels based on guest reported paddr width Roger Pau Monne
2023-12-05 14:32   ` Jan Beulich
2023-12-05 15:11     ` Roger Pau Monné
2023-12-05 15:19       ` Jan Beulich
2023-12-04  9:43 ` [PATCH v2 3/6] x86/iommu: introduce a rangeset to perform hwdom IOMMU setup Roger Pau Monne
2023-12-05 14:50   ` Jan Beulich
2023-12-05 15:29     ` Roger Pau Monné [this message]
2023-12-04  9:43 ` [PATCH v2 4/6] x86/iommu: remove regions not to be mapped Roger Pau Monne
2023-12-05 15:11   ` Jan Beulich
2023-12-05 15:31     ` Roger Pau Monné
2023-12-05 15:34       ` Jan Beulich
2023-12-04  9:43 ` [PATCH v2 5/6] x86/iommu: switch hwdom IOMMU to use a rangeset Roger Pau Monne
2023-12-05 15:27   ` Jan Beulich
2023-12-05 15:43     ` Roger Pau Monné
2023-12-05 15:47       ` Jan Beulich
2023-12-05 16:07         ` Roger Pau Monné
2023-12-04  9:43 ` [PATCH v2 6/6] x86/iommu: cleanup unused functions Roger Pau Monne
2023-12-05 15:29   ` Jan Beulich
2023-12-05 15:48     ` Roger Pau Monné
2023-12-05 15:52       ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZW9Byij0hlUsVzDL@macbook \
    --to=roger.pau@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=paul@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.