Re: VT-d flush timeout - Andrew Cooper

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>,
	Eddie Dong <eddie.dong@intel.com>,
	Yang Z Zhang <yang.z.zhang@intel.com>
Cc: LiangX Z Li <liangx.z.li@intel.com>,
	Kevin Tian <kevin.tian@intel.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: VT-d flush timeout
Date: Thu, 25 Sep 2014 11:44:57 +0100	[thread overview]
Message-ID: <5423F229.3020304@citrix.com> (raw)
In-Reply-To: <5423F4A30200007800038CB2@mail.emea.novell.com>

On 25/09/14 09:55, Jan Beulich wrote:
>>>> On 25.09.14 at 03:02, <eddie.dong@intel.com> wrote:
>>>> I don't get your point. What's the different between the 1s and the
>>>> large enough value? Since hardware completed the flush quickly, it will
>>>> never spin more than real flush time in normal case. 1s spin only
>>>> happens in the abnormal case.
>>> Right, but we can't ignore this abnormal case. In particular we can't 
>> exclude
>>> that a DomU with a device assigned may have ways to (perhaps indirectly)
>>> affect the completion time of the flushes.
>>>
>>>> My only concern is that, for QI flush, the spin time relies on the
>>>> length of the queue. I am not sure whether 1s is enough for worst case
>>>> and I think we should remove the 1s in QI flush. And I think this also
>>>> the same reason for Linux don't use timeout mechanism in QI flush.
>>> First of all I think both Linux and Xen in the majority of cases waits for
>>> completion of just individual queue entries. I.e. I'm not sure if the 
>> practical
>>> worst case really is equal to the theoretical one. And then removing a
>>> timeout just to allow _longer_ spinning isn't really a step forward. If the
>>> timeout isn't big enough, the only solution is to immediately replace it 
>> with
>>> asynchronous handling.
>>>
>> Giving this path of long time wait-loop only happens at the case when the 
>> hardware fails, I don't care if it enters panic in 1 seconds or in 10ms 
>> seconds. But the software compatibility (for all existing platforms and 
>> potential future platforms) is much more important.
> I agree for the paths leading to a panic(). But there's one such case
> where it doesn't: snb_vtd_ops_preamble().
>
>> Replacing wait-loop with an asynchronous handling mechanism is definitely 
>> good -- maybe put an TODO list for the IOMMU stuff which I believe requiring 
>> not small effort. But the main goal should be targeting the normal code path, 
>> i.e. success of the IOTLB operation no matter it is 1ms or 10ms. 
>>
>> However, as for the timeout code path, given that the specification doesn't 
>> say what the hardware WORST case is, using practical smaller number is not a 
>> good choice to me. Nobody is able to test on all platforms, and it may fail 
>> in future platform. Further more, the result of the mis-prediction to the 
>> hardware behavior is so serious-->leading to hypervisor panic.  Of course 
>> 1second is not a good value too, but at least it is verified in the past 
>> years.
> Once again - the ATS spec talks of 60 seconds (with possibly another
> 30 seconds added on top depending on how you read it). So while for
> the non-ATS case, provided the one case pointed out above gets dealt
> with, I agree we don't need to bother changing the code, the ATS case
> clearly makes asynchronous handling necessary. It is for that reason
> that we decided to disable ATS support by default. Albeit now that I
> think about it again, I'm not sure that was an appropriate action for
> the AMD side of things - Andrew, what do you think?
>
> Jan
>

The AMD code will wait for a period of time for the COMMAND_WAIT to
start, but not panic() on a timeout.

By my reading of the code, amd_iommu_flush_iotlb() can complete without
confirming that the iotlbs have actually been flushed, and all that is
left behind is an AMD_IOMMU_DEBUG() message.

~Andrew

next prev parent reply	other threads:[~2014-09-25 10:44 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-18  2:01 VT-d flush timeout Zhang, Yang Z
2014-08-18  9:47 ` Andrew Cooper
2014-08-18 12:48 ` Jan Beulich
2014-08-19  1:34   ` Zhang, Yang Z
2014-08-19 13:02     ` Jan Beulich
2014-08-21  3:16       ` Zhang, Yang Z
2014-08-22  7:33         ` Jan Beulich
2014-08-22  7:49           ` Zhang, Yang Z
2014-08-22  7:58             ` Jan Beulich
2014-08-22  8:05               ` Zhang, Yang Z
2014-08-25 17:21                 ` Dugger, Donald D
2014-09-25  1:02       ` Dong, Eddie
2014-09-25  8:55         ` Jan Beulich
2014-09-25 10:44           ` Andrew Cooper [this message]
2014-09-25 10:55             ` Jan Beulich
2014-09-25 10:56               ` Andrew Cooper
2014-09-25 11:48                 ` Jan Beulich
2014-09-25 21:21                   ` Tian, Kevin
2014-09-25 23:23           ` Dong, Eddie
2014-09-26  1:07             ` Zhang, Yang Z
2014-09-26  6:30             ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5423F229.3020304@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=eddie.dong@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=liangx.z.li@intel.com \
    --cc=xen-devel@lists.xen.org \
    --cc=yang.z.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.