From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: VT-d flush timeout
Date: Mon, 18 Aug 2014 10:47:10 +0100
Message-ID: <53F1CB9E.5070705@citrix.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Zhang, Yang Z" , "xen-devel@lists.xen.org"
Cc: "Li, LiangX Z" , "Tian, Kevin" , "'Jan Beulich (JBeulich@suse.com)'"
List-Id: xen-devel@lists.xenproject.org

On 18/08/14 03:01, Zhang, Yang Z wrote:
> Hi all
>
> This is continuing with previous discussion about VT-d spin loop.
> According previous discussion, we will deal with current 1 second
> flush timeout firstly.
>
> After reviewing Linux IOMMU code, it uses the timeout mechanism
> widely, e.g., flush iotlb and context via register based mechanism,
> __iommu_flush_context():
>         /* Make sure hardware complete it */
>         IOMMU_WAIT_OP(iommu, DMAR_CCMD_REG,
>                 dmar_readq, (!(val & DMA_CCMD_ICC)), val);
>
> The only place it doesn't use this timeout mechanism is queue based
> invalidation. I think the reason is that the max number of queue
> entry is 2^15 and we don't know how much time is needed really to
> flush 2^15 entries. So it is better to not use timeout here.
> Likewise, for Xen side, we will only remove the timeout in qi flush
> function and use spin for instead.
>
> Any comments?

Waiting 1 second for a timeout is quite antisocial, although as a
panic() is the result, the system wasn't going to stay alive anyway.

However, synchronously waiting for flush is not acceptable. As
identified in the Citrix/Intel monthly meetings, there is hardware which
takes milliseconds to reply to a flush. This is a meaningful fraction
of the default scheduling timeslice.

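To illustrate the distinction, here is a minimal sketch. It is not Xen or
Linux code: struct fake_iommu, read_status(), flush_wait_sync(),
flush_poll_once() and flush_wait_async() are hypothetical names, and the
hardware is simulated by a simple countdown. The synchronous variant holds
the CPU until completion or until its poll budget runs out; the
asynchronous variant checks once and hands control back to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an IOMMU whose flush completes after a
 * number of status reads. Real hardware latency is not modelled. */
struct fake_iommu {
    unsigned int polls_left;  /* simulated latency, in status reads */
    uint32_t status;          /* 0 = flush pending, 1 = flush done */
};

/* Simulated status-register read: reports "done" once the countdown
 * has reached zero, otherwise decrements it and reports "pending". */
static uint32_t read_status(struct fake_iommu *iommu)
{
    assert(iommu != NULL);
    if (iommu->polls_left == 0)
        iommu->status = 1;
    else
        iommu->polls_left--;
    return iommu->status;
}

/* Synchronous wait: spins on the status until the flush completes or
 * max_polls reads have been made. Returns 0 on completion, -1 on
 * timeout. The CPU does nothing else for the whole duration. */
static int flush_wait_sync(struct fake_iommu *iommu, unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++)
        if (read_status(iommu))
            return 0;
    return -1;
}

enum flush_state { FLUSH_DONE, FLUSH_PENDING };

/* Asynchronous step: a single check that returns immediately, so the
 * caller decides what to do while the flush is still pending. */
static enum flush_state flush_poll_once(struct fake_iommu *iommu)
{
    return read_status(iommu) ? FLUSH_DONE : FLUSH_PENDING;
}

/* Drives the asynchronous step to completion and counts how many times
 * the caller got control back while the flush was pending. */
static unsigned int flush_wait_async(struct fake_iommu *iommu)
{
    unsigned int yields = 0;

    while (flush_poll_once(iommu) == FLUSH_PENDING)
        yields++;  /* a real caller would run other work here, not loop */
    return yields;
}
```

Each FLUSH_PENDING return is a point at which something else could have
been scheduled; in the synchronous variant that time is simply spent
spinning.
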
It is my strong opinion that all spin loops like this need to be made
asynchronous, unless we know for certain that there is an upper bound
measured in a very small quantity of microseconds, where rescheduling
another vcpu might be a poor decision.

~Andrew