From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.3 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5C811C433E0 for ; Mon, 25 Jan 2021 08:44:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0B2D922D02 for ; Mon, 25 Jan 2021 08:44:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725789AbhAYImx (ORCPT ); Mon, 25 Jan 2021 03:42:53 -0500 Received: from mga14.intel.com ([192.55.52.115]:20336 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725967AbhAYIlm (ORCPT ); Mon, 25 Jan 2021 03:41:42 -0500 IronPort-SDR: mEjcgFaXfpOcMZE4qpJIdSdu+rXXZ7XZZfMf0XoTjKDJY8TfBhnI6Gn4N/4bAUNditG7GtIELQ r7Kcqxnlkq7A== X-IronPort-AV: E=McAfee;i="6000,8403,9874"; a="178907778" X-IronPort-AV: E=Sophos;i="5.79,373,1602572400"; d="scan'208";a="178907778" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Jan 2021 00:30:47 -0800 IronPort-SDR: g+0sRwh72EjCtadnOwMv0x3ysX2nnnQAeIFDPcMGQRoJJn1gcqNDTDz4dTeni5jRvYhAAGQJh5 WI6af5RlG7Xw== X-IronPort-AV: E=Sophos;i="5.79,373,1602572400"; d="scan'208";a="387241797" Received: from blu2-mobl3.ccr.corp.intel.com (HELO [10.255.29.249]) ([10.255.29.249]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Jan 2021 00:30:44 -0800 Cc: baolu.lu@linux.intel.com, "Raj, Ashok" , Jacob Pan , "Liu, Yi L" , "iommu@lists.linux-foundation.org" , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH 1/3] iommu/vt-d: Add rate limited information when PRQ overflows To: "Tian, Kevin" , Joerg Roedel , Will Deacon References: <20210121014505.1659166-1-baolu.lu@linux.intel.com> <20210121014505.1659166-2-baolu.lu@linux.intel.com> From: Lu Baolu Message-ID: <1a04f6c2-0c82-1693-c7e5-1333d3fced17@linux.intel.com> Date: Mon, 25 Jan 2021 16:30:41 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.6.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Kevin, On 2021/1/25 16:16, Tian, Kevin wrote: >> From: Lu Baolu >> Sent: Monday, January 25, 2021 2:29 PM >> >> Hi Kevin, >> >> On 2021/1/22 14:38, Tian, Kevin wrote: >>>> From: Lu Baolu >>>> Sent: Thursday, January 21, 2021 9:45 AM >>>> >>>> So that the uses could get chances to know what happened. >>>> >>>> Suggested-by: Ashok Raj >>>> Signed-off-by: Lu Baolu >>>> --- >>>> drivers/iommu/intel/svm.c | 10 ++++++++-- >>>> 1 file changed, 8 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c >>>> index 033b25886e57..f49fe715477b 100644 >>>> --- a/drivers/iommu/intel/svm.c >>>> +++ b/drivers/iommu/intel/svm.c >>>> @@ -895,6 +895,7 @@ static irqreturn_t prq_event_thread(int irq, void >> *d) >>>> struct intel_iommu *iommu = d; >>>> struct intel_svm *svm = NULL; >>>> int head, tail, handled = 0; >>>> + struct page_req_dsc *req; >>>> >>>> /* Clear PPR bit before reading head/tail registers, to >>>> * ensure that we get a new interrupt if needed. */ >>>> @@ -904,7 +905,6 @@ static irqreturn_t prq_event_thread(int irq, void >> *d) >>>> head = dmar_readq(iommu->reg + DMAR_PQH_REG) & >>>> PRQ_RING_MASK; >>>> while (head != tail) { >>>> struct vm_area_struct *vma; >>>> - struct page_req_dsc *req; >>>> struct qi_desc resp; >>>> int result; >>>> vm_fault_t ret; >>>> @@ -1042,8 +1042,14 @@ static irqreturn_t prq_event_thread(int irq, >> void >>>> *d) >>>> * Clear the page request overflow bit and wake up all threads that >>>> * are waiting for the completion of this handling. >>>> */ >>>> - if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) >>>> + if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) { >>>> + head = dmar_readq(iommu->reg + DMAR_PQH_REG) & >>>> PRQ_RING_MASK; >>>> + req = &iommu->prq[head / sizeof(*req)]; >>>> + pr_warn_ratelimited("IOMMU: %s: Page request overflow: >>>> HEAD: %08llx %08llx", >>>> + iommu->name, ((unsigned long long >>>> *)req)[0], >>>> + ((unsigned long long *)req)[1]); >>>> writel(DMA_PRS_PRO, iommu->reg + DMAR_PRS_REG); >>>> + } >>>> >>> >>> Not about rate limiting but I think we may have a problem in above >>> logic. It is incorrect to always clear PRO when it's set w/o first checking >>> whether the overflow condition has been cleared. This code assumes >>> that if an overflow condition occurs it must have been cleared by earlier >>> loop when hitting this check. However since this code runs in a threaded >>> context, the overflow condition could occur even after you reset the head >>> to the tail (under some extreme condition). To be sane I think we'd better >>> read both head/tail again after seeing a pending PRO here and only clear >>> PRO when it becomes a false indicator based on latest head/tail. >>> >> >> Yes, agreed. We can check the head and tail and clear the overflow bit >> until the queue is empty. The finial code looks like: >> >> /* >> * Clear the page request overflow bit and wake up all threads that >> * are waiting for the completion of this handling. >> */ >> if (readl(iommu->reg + DMAR_PRS_REG) & DMA_PRS_PRO) { >> head = dmar_readq(iommu->reg + DMAR_PQH_REG) & >> PRQ_RING_MASK; >> tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & >> PRQ_RING_MASK; >> if (head == tail) { >> req = &iommu->prq[head / sizeof(*req)]; >> pr_warn_ratelimited("IOMMU: %s: Page request >> overflow cleared: HEAD: %08llx %08llx", >> iommu->name, ((unsigned >> long long *)req)[0], >> ((unsigned long long >> *)req)[1]); >> writel(DMA_PRS_PRO, iommu->reg + DMAR_PRS_REG); >> } >> } >> >> Thought? >> > > Just a small comment. Is it useful to also print a warning in the true > overflow condition which has to wait for next interrupt to be cleared? > That's fine. :-) > Thanks > Kevin > Best regards, baolu