Re: [PATCH v2] AMD/intremap: Prevent use of per-device vector maps until irq logic is fixed


From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>,
	"Keir (Xen.org)" <keir@xen.org>, Jacob Shin <jacob.shin@amd.com>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH v2] AMD/intremap: Prevent use of per-device vector maps until irq logic is fixed
Date: Mon, 3 Jun 2013 16:17:18 +0100
Message-ID: <51ACB37E.6030802@citrix.com>
In-Reply-To: <51ACCBDC02000078000DA970@nat28.tlf.novell.com>

On 03/06/13 16:01, Jan Beulich wrote:
>>>> On 03.06.13 at 16:35, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> On 03/06/13 15:07, Jan Beulich wrote:
>>>>>> On 31.05.13 at 22:04, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>> In an effort to get AMD systems back to a non-regressed state, introduce a new
>>>> type of vector map called per-device-global.  This uses per-device vector maps
>>>> in the IOMMU, but uses a single used_vector map for the core IRQ logic.
>>> So what's the reason for not simply using OPT_IRQ_VECTOR_MAP_GLOBAL
>>> here?
>> Simply to make it obviously different until the core problem is fixed,
>> at which point I expect OPT_IRQ_VECTOR_MAP_PERDEV_GLOBAL to disappear.
> That's not a really good excuse...
>
>>>> This patch is intended to be removed as soon as the per-device logic is fixed
>>>> correctly.
>>> As a last resort thing this may be acceptable, but I'd much favor to
>>> fix this properly rather than hacking it like this.
>> While I agree that a proper fix would be good, what is going to happen
>> about 4.2 and 4.1 which won't have this new functionality backported?
>> Furthermore, unless this new functionality is going to race into 4.3 at
>> the last moment, 4.3 will also be in a regressed state.
> The new functionality (multi-vector MSI) doesn't necessarily need
> to be backported, but if the prereq change turns out to fix a bug,
> I don't see a reason not to try to backport that one.
>
> As to getting the patch in for 4.3 - George, would you revisit your
> opinion on the part of the multi-vector MSI series that originally
> I had hoped to get into 4.3 anyway?
>
>>>  Hence I'd really like
>>> to put up for discussion to instead use the patch[1] already posted
>>> as preparatory for the multi-vector MSI support doing away with the
>>> use of the vector for indexing the IRTE (and, in a second patch[2],
>>> the enforcement of OPT_IRQ_VECTOR_MAP_PERDEV).
>>>
>>> Also, overriding a command line request in the way you do is a
>>> no-go imo - even if this would cause [theoretical] problems,
>> Not theoretical.  I have reproduced the issue, albeit with a modified
>> Xen which deliberately limits the range of vectors considered for a
>> certain device, to increase the chances of a collision.
> You misunderstood my use of "theoretical": On a system with only
> MSI devices, no problem is to be expected afaict. Yet your change
> would affect those too.

Ah I see.

>
>>>  we
>>> ought to honor the request as long as we can't tell for sure that
>>> this is going to break the specific system. That's even more so
>>> since requesting per-device vector maps to be used on VT-d ought
>>> to yield exactly the same effect, yet you don't override the mode
>>> there.
>> Anyone using these vector maps with VT-d is mad.  I could tweak the
>> patch to not override the command line but simply warn when global is
>> chosen.
> Let's take a step back: What do we need those vector maps for in
> the first place, other than the disambiguation of AMD IOMMU
> IRTEs? If the answer is "nothing", then why was a command line
> option controlling this added in the first place? And in that case
> ripping them out the moment the patches mentioned above go in
> would seem like the right thing to do. George, I think you added all
> that - do you have any thoughts here?

As I remember, the original bug was that when migrating an interrupt in
Xen from one pcpu to another and choosing the same vector, the cleanup
code zapped the IRTE, causing loss of interrupts.  The used_vector logic
was added to prevent the interrupt migration code from choosing the same
vector on a different pcpu.
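
To illustrate the idea (this is not the actual Xen code, just a minimal
sketch): a used_vector-style bitmap can force a migration to pick a vector
that differs from the old one, so cleaning up the old vector's IRTE cannot
zap the new one.  All names here (struct vector_map, pick_vector,
migrate_vector) and the vector range constants are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS           256
#define FIRST_DYNAMIC_VECTOR 0x20   /* assumed range, for illustration only */
#define LAST_DYNAMIC_VECTOR  0xdf

/* One bit per vector; a set bit means "in use, do not hand out again". */
struct vector_map {
    uint64_t bits[NR_VECTORS / 64];
};

static bool vm_test(const struct vector_map *m, unsigned int v)
{
    assert(v < NR_VECTORS);
    return (m->bits[v / 64] >> (v % 64)) & 1;
}

static void vm_set(struct vector_map *m, unsigned int v)
{
    assert(v < NR_VECTORS);
    m->bits[v / 64] |= (uint64_t)1 << (v % 64);
}

static void vm_clear(struct vector_map *m, unsigned int v)
{
    assert(v < NR_VECTORS);
    m->bits[v / 64] &= ~((uint64_t)1 << (v % 64));
}

/* Claim the lowest vector not yet marked in the map; -1 if none is free. */
static int pick_vector(struct vector_map *used)
{
    for (unsigned int v = FIRST_DYNAMIC_VECTOR; v <= LAST_DYNAMIC_VECTOR; v++) {
        if (!vm_test(used, v)) {
            vm_set(used, v);
            return (int)v;
        }
    }
    return -1;
}

/*
 * Migrate an interrupt: the new vector is claimed while the old one is
 * still marked as used, so the two can never be equal.  Only then is the
 * old vector released (this is where the old IRTE would be cleaned up).
 */
static int migrate_vector(struct vector_map *used, int old_vector)
{
    int new_vector = pick_vector(used);

    if (new_vector >= 0)
        vm_clear(used, (unsigned int)old_vector);
    return new_vector;
}
```

With an empty map, pick_vector() hands out FIRST_DYNAMIC_VECTOR; migrating
that interrupt then yields the next free vector rather than the same one.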

I can't precisely comment about the introduction of the command line
option.  With hindsight, I suspect it might have been a lack of
understanding the extent of the problem.  I was certainly quite new to
interrupt remapping at the time and did feel a little out of my depth.

>
>>> Furthermore, if only MSI-X devices currently suffer from this, the
>>> scalability effect this has (allowing no more than about 200
>>> vectors to be in use even on huge systems) would call for limiting
>>> the effect to MSI-X capable devices (or perhaps even to devices
>>> actually using MSI-X).
>> As I said, this reverts to the behaviour before XSA-36, but without the
>> security issue of a single IOMMU interrupt remapping table.  Before
>> XSA-36, all AMD systems were limited in vector range because of the
>> global used_vector map.
> Right, so you'd trade one regression for another (less severe, but
> anyway).
>
> Jan
>

Absolutely, especially when it comes to trying to fix a regression we
have pushed out in a security fix.

Ideally a proper fix to the MSI-X issue can be found, but failing a timely
fix, reverting to the pre-XSA-36 behaviour but without the security
issue is a good solution.

~Andrew
