All of lore.kernel.org
 help / color / mirror / Atom feed
From: Demi Marie Obenour <demi@invisiblethingslab.com>
To: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Jan Beulich <jbeulich@suse.com>,
	Roger Pau Monne <roger.pau@citrix.com>, Wei Liu <wl@xen.org>,
	"Tim (Xen.org)" <tim@xen.org>,
	George Dunlap <George.Dunlap@citrix.com>
Subject: Re: [PATCH v6 4/5] x86/mm: Reject invalid cacheability in PV guests by default
Date: Fri, 6 Jan 2023 00:23:13 -0500	[thread overview]
Message-ID: <Y7ewcXxuEL92rI4v@itl-email> (raw)
In-Reply-To: <Y7eHqmeNgFmf3NqH@mail-itl>

[-- Attachment #1: Type: text/plain, Size: 6791 bytes --]

On Fri, Jan 06, 2023 at 03:30:01AM +0100, Marek Marczykowski-Górecki wrote:
> On Thu, Jan 05, 2023 at 09:00:03PM -0500, Demi Marie Obenour wrote:
> > On Thu, Jan 05, 2023 at 08:28:26PM +0000, Andrew Cooper wrote:
> > > On 22/12/2022 10:31 pm, Demi Marie Obenour wrote:
> > > > diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> > > > index 424b12cfb27d6ade2ec63eacb8afe5df82465451..0230a7bc17cbd4362a42ea64cea695f31f5e0f86 100644
> > > > --- a/docs/misc/xen-command-line.pandoc
> > > > +++ b/docs/misc/xen-command-line.pandoc
> > > > @@ -1417,6 +1417,17 @@ detection of systems known to misbehave upon accesses to that port.
> > > >  ### idle_latency_factor (x86)
> > > >  > `= <integer>`
> > > >  
> > > > +### invalid-cacheability (x86)
> > > > +> `= allow | deny | trap`
> > > > +
> > > > +> Default: `deny` in release builds, otherwise `trap`
> > > > +
> > > > +Specify what happens when a PV guest tries to use one of the reserved entries in
> > > > +the PAT.  `deny` causes the attempt to be rejected with -EINVAL, `allow` allows
> > > > +the attempt, and `trap` causes a general protection fault to be raised.
> > > > +Currently, the reserved entries are marked as uncacheable in Xen's PAT, but this
> > > > +will change if new memory types are added, so guests must not rely on it.
> > > > +
> > > 
> > > This wants to be scoped under "pv", and not a top-level option.  Current
> > > parsing is at the top of xen/arch/x86/pv/domain.c
> > > 
> > > How about `pv={no-}oob-pat`, and parse it into the opt_pv_oob_pat boolean ?
> > 
> > Works for me, though I will use ‘invalid’ instead of ‘oob’ as valid PAT
> > entries might not be contiguous.
> 
> If we're talking about alternative PAT settings, I'm not sure if they
> can be called "invalid". With the default Xen's choice of PAT, top two
> entries are documented as reserved (in xen.h) and only that makes them
> forbidden to use. But with alternative settings, it's only behavior of
> Linux parsing that prefers using lower options. When breaking contract
> set in xen.h, "reserved" entries stop being reserved too.

That makes sense.

> So, _if_ that option would be applicable alternative PAT choice, it's
> only useful for debugging Linux specifically (assuming Linux won't
> change its approach to choosing entries - which I think it's allowed to
> do).

Point taken.

> > > There really is no need to distinguish between deny and trap.  IMO,
> > > injecting #GP unilaterally is fine (to a guest expecting the hypercall
> > > to succeed, -EINVAL vs #GP makes no difference, but #GP is far more
> > > useful to a human trying to debug issues here), but I could be talked
> > > into putting that behind a CONFIG_DEBUG if other feel strongly.
> > 
> > Marek, thoughts?
> 
> With Xen's default PAT, #GP may be useful indeed, but it must come with
> a message why it was injected.

In xl dmesg?

> > > > @@ -1343,7 +1374,34 @@ static int promote_l1_table(struct page_info *page)
> > > >          }
> > > >          else
> > > >          {
> > > > -            switch ( ret = get_page_from_l1e(pl1e[i], d, d) )
> > > > +            l1_pgentry_t l1e = pl1e[i];
> > > > +
> > > > +            if ( invalid_cacheability != INVALID_CACHEABILITY_ALLOW )
> > > > +            {
> > > > +                switch ( l1e.l1 & PAGE_CACHE_ATTRS )
> > > > +                {
> > > > +                case _PAGE_WB:
> > > > +                case _PAGE_UC:
> > > > +                case _PAGE_UCM:
> > > > +                case _PAGE_WC:
> > > > +                case _PAGE_WT:
> > > > +                case _PAGE_WP:
> > > > +                    break;
> > > > +                default:
> > > > +                    /*
> > > > +                     * If we get here, a PV guest tried to use one of the
> > > > +                     * reserved values in Xen's PAT.  This indicates a bug
> > > > +                     * in the guest.  If requested by the user, inject #GP
> > > > +                     * to cause the guest to log a stack trace.
> > > > +                     */
> > > > +                    if ( invalid_cacheability == INVALID_CACHEABILITY_TRAP )
> > > > +                        pv_inject_hw_exception(TRAP_gp_fault, 0);
> > > > +                    ret = -EINVAL;
> > > > +                    goto fail;
> > > > +                }
> > > > +            }
> > > 
> > > This will catch cases where the guest writes a full pagetable, then
> > > promotes it, but won't catch cases where the guest modifies an
> > > individual entry with mmu_update/etc.
> > > 
> > > The logic needs to be in get_page_from_l1e() to get applied uniformly to
> > > all PTE modifications.
> > 
> > I will move it there, and also update Qubes OS’s patchset.
> > 
> > > Right now, the l1_disallow_mask() check near the start hides the "can
> > > you use a nonzero cacheattr" check.  If I ever get around to cleaning up
> > > my post-XSA-402 series, I have a load of modifications to this.
> > 
> > I came up with some major cleanups too.
> > 
> > > But this could be something like this:
> > > 
> > > if ( opt_pv_oob_pat && (l1f & PAGE_CACHE_ATTRS) > _PAGE_WP )
> > > {
> > >     // #GP
> > >     return -EINVAL;
> > > }
> > > 
> > > fairly early on.
> > > 
> > > It occurs to me that this check is only applicable when we're using the
> > > Xen ABI APT, and seeing as we've talked about possibly making patch 5 a
> > > Kconfig option, that may want accounting here.  (Perhaps simply making
> > > opt_pb_oob_pat be false in a !XEN_PAT build.)
> > 
> > It’s actually applicable even with other PATs.  While Marek and I were
> > tracking down an Intel iGPU cache coherency problem, Marek used it to
> > verify that PAT entries that we thought were not being used were in fact
> > unused.  This allowed proving that the behavior of the GPU was impacted
> > by changes to PAT entries the hardware should not even be looking at,
> > and therefore that the hardware itself must be buggy.
> 
> In fact, I did that via WARN() on the Linux side, to _not_ have guest
> killed in this case, to potentially collect more info.

Whoops!  I must have misunderstood what you meant by "trap".

> As said above,
> with alternative PAT settings, the contract which entries are "valid"
> isn't there anymore, so punishing guest for using them isn't
> appropriate. It could still be useful feature for debugging Linux (and
> it feels, we'll need this feature for some time...). So, with !XEN_PAT
> it should be at least disabled by default, or maybe even hidden behind
> CONFIG_DEBUG.

Okay, makes sense.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2023-01-06  5:24 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-22 22:31 [PATCH v6 0/5] Make PAT handling less brittle Demi Marie Obenour
2022-12-22 22:31 ` [PATCH v6 1/5] x86/mm: Avoid hard-coding PAT in get_page_from_l1e() Demi Marie Obenour
2022-12-22 22:31 ` [PATCH v6 2/5] x86: Remove MEMORY_NUM_TYPES and NO_HARDCODE_MEM_TYPE Demi Marie Obenour
2022-12-22 22:31 ` [PATCH v6 3/5] x86/mm: make code robust to future PAT changes Demi Marie Obenour
2023-01-05 14:01   ` Jan Beulich
2023-01-05 19:58     ` Andrew Cooper
2022-12-22 22:31 ` [PATCH v6 4/5] x86/mm: Reject invalid cacheability in PV guests by default Demi Marie Obenour
2023-01-05 14:30   ` Jan Beulich
2023-01-05 20:28   ` Andrew Cooper
2023-01-06  2:00     ` Demi Marie Obenour
2023-01-06  2:30       ` Marek Marczykowski-Górecki
2023-01-06  5:23         ` Demi Marie Obenour [this message]
2023-01-06  7:11     ` Jan Beulich
2022-12-22 22:31 ` [PATCH v6 5/5] x86: Use Linux's PAT Demi Marie Obenour

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y7ewcXxuEL92rI4v@itl-email \
    --to=demi@invisiblethingslab.com \
    --cc=Andrew.Cooper3@citrix.com \
    --cc=George.Dunlap@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=marmarek@invisiblethingslab.com \
    --cc=roger.pau@citrix.com \
    --cc=tim@xen.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.