From: Tom Lendacky <thomas.lendacky-5C7GfCeVMHo@public.gmane.org>
To: Jan Vesely <jan.vesely-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>,
"Nath, Arindam" <Arindam.Nath-5C7GfCeVMHo@public.gmane.org>,
Craig Stein <stein12c-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
"iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org"
<iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>
Subject: Re: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush
Date: Wed, 21 Jun 2017 12:01:09 -0500
Message-ID: <bf685f44-019c-4c21-25d4-6a6ea647b7cc@amd.com>
In-Reply-To: <1498062018.17007.6.camel-kgbqMDwikbSVc3sceRu5cw@public.gmane.org>
On 6/21/2017 11:20 AM, Jan Vesely wrote:
> Hi Arindam,
>
> has this patch been replaced by Joerg's "[PATCH 0/7] iommu/amd:
> Optimize iova queue flushing" series?
Yes, Joerg's patches replaced this patch. He applied just the first two
patches of this series.
Thanks,
Tom
>
> Jan
>
> On Thu, 2017-06-08 at 22:33 +0200, Jan Vesely wrote:
>> On Tue, 2017-06-06 at 10:02 +0000, Nath, Arindam wrote:
>>>> -----Original Message-----
>>>> From: Lendacky, Thomas
>>>> Sent: Tuesday, June 06, 2017 1:23 AM
>>>> To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
>>>> Cc: Nath, Arindam <Arindam.Nath-5C7GfCeVMHo@public.gmane.org>; Joerg Roedel
>>>> <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>; Duran, Leo <leo.duran-5C7GfCeVMHo@public.gmane.org>; Suthikulpanit,
>>>> Suravee <Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org>
>>>> Subject: [PATCH v1 3/3] iommu/amd: Optimize the IOMMU queue flush
>>>>
>>>> After reducing the amount of MMIO performed by the IOMMU during
>>>> operation, perf data shows that flushing the TLB for all protection
>>>> domains during DMA unmapping is a performance issue. It is not
>>>> necessary to flush the TLBs for all protection domains, only the
>>>> protection domains associated with the IOVAs on the flush queue.
>>>>
>>>> Create a separate queue that tracks the protection domains associated
>>>> with the IOVAs on the flush queue. This new queue optimizes the
>>>> flushing of TLBs to the required protection domains.
>>>>
>>>> Reviewed-by: Arindam Nath <arindam.nath-5C7GfCeVMHo@public.gmane.org>
>>>> Signed-off-by: Tom Lendacky <thomas.lendacky-5C7GfCeVMHo@public.gmane.org>
>>>> ---
>>>> drivers/iommu/amd_iommu.c | 56 ++++++++++++++++++++++++++++++++++++++++-----
>>>> 1 file changed, 50 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>>>> index 856103b..a5e77f0 100644
>>>> --- a/drivers/iommu/amd_iommu.c
>>>> +++ b/drivers/iommu/amd_iommu.c
>>>> @@ -103,7 +103,18 @@ struct flush_queue {
>>>> struct flush_queue_entry *entries;
>>>> };
>>>>
>>>> +struct flush_pd_queue_entry {
>>>> + struct protection_domain *pd;
>>>> +};
>>>> +
>>>> +struct flush_pd_queue {
>>>> + /* No lock needed, protected by flush_queue lock */
>>>> + unsigned next;
>>>> + struct flush_pd_queue_entry *entries;
>>>> +};
>>>> +
>>>> static DEFINE_PER_CPU(struct flush_queue, flush_queue);
>>>> +static DEFINE_PER_CPU(struct flush_pd_queue, flush_pd_queue);
>>>>
>>>> static atomic_t queue_timer_on;
>>>> static struct timer_list queue_timer;
>>>> @@ -2227,16 +2238,20 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev)
>>>> *
>>>> *****************************************************************************/
>>>>
>>>> -static void __queue_flush(struct flush_queue *queue)
>>>> +static void __queue_flush(struct flush_queue *queue,
>>>> + struct flush_pd_queue *pd_queue)
>>>> {
>>>> - struct protection_domain *domain;
>>>> unsigned long flags;
>>>> int idx;
>>>>
>>>> /* First flush TLB of all known domains */
>>>> spin_lock_irqsave(&amd_iommu_pd_lock, flags);
>>>> - list_for_each_entry(domain, &amd_iommu_pd_list, list)
>>>> - domain_flush_tlb(domain);
>>>> + for (idx = 0; idx < pd_queue->next; ++idx) {
>>>> + struct flush_pd_queue_entry *entry;
>>>> +
>>>> + entry = pd_queue->entries + idx;
>>>> + domain_flush_tlb(entry->pd);
>>>> + }
>>>> spin_unlock_irqrestore(&amd_iommu_pd_lock, flags);
>>>>
>>>> /* Wait until flushes have completed */
>>>> @@ -2255,6 +2270,7 @@ static void __queue_flush(struct flush_queue *queue)
>>>> entry->dma_dom = NULL;
>>>> }
>>>>
>>>> + pd_queue->next = 0;
>>>> queue->next = 0;
>>>> }
>>>>
>>>> @@ -2263,13 +2279,15 @@ static void queue_flush_all(void)
>>>> int cpu;
>>>>
>>>> for_each_possible_cpu(cpu) {
>>>> + struct flush_pd_queue *pd_queue;
>>>> struct flush_queue *queue;
>>>> unsigned long flags;
>>>>
>>>> queue = per_cpu_ptr(&flush_queue, cpu);
>>>> + pd_queue = per_cpu_ptr(&flush_pd_queue, cpu);
>>>> spin_lock_irqsave(&queue->lock, flags);
>>>> if (queue->next > 0)
>>>> - __queue_flush(queue);
>>>> + __queue_flush(queue, pd_queue);
>>>> spin_unlock_irqrestore(&queue->lock, flags);
>>>> }
>>>> }
>>>> @@ -2283,6 +2301,8 @@ static void queue_flush_timeout(unsigned long unsused)
>>>> static void queue_add(struct dma_ops_domain *dma_dom,
>>>> unsigned long address, unsigned long pages)
>>>> {
>>>> + struct flush_pd_queue_entry *pd_entry;
>>>> + struct flush_pd_queue *pd_queue;
>>>> struct flush_queue_entry *entry;
>>>> struct flush_queue *queue;
>>>> unsigned long flags;
>>>> @@ -2292,10 +2312,22 @@ static void queue_add(struct dma_ops_domain *dma_dom,
>>>> address >>= PAGE_SHIFT;
>>>>
>>>> queue = get_cpu_ptr(&flush_queue);
>>>> + pd_queue = get_cpu_ptr(&flush_pd_queue);
>>>> spin_lock_irqsave(&queue->lock, flags);
>>>>
>>>> if (queue->next == FLUSH_QUEUE_SIZE)
>>>> - __queue_flush(queue);
>>>> + __queue_flush(queue, pd_queue);
>>>> +
>>>> + for (idx = 0; idx < pd_queue->next; ++idx) {
>>>> + pd_entry = pd_queue->entries + idx;
>>>> + if (pd_entry->pd == &dma_dom->domain)
>>>> + break;
>>>> + }
>>>> + if (idx == pd_queue->next) {
>>>> + /* New protection domain, add it to the list */
>>>> + pd_entry = pd_queue->entries + pd_queue->next++;
>>>> + pd_entry->pd = &dma_dom->domain;
>>>> + }
>>>>
>>>> idx = queue->next++;
>>>> entry = queue->entries + idx;
>>>> @@ -2309,6 +2341,7 @@ static void queue_add(struct dma_ops_domain *dma_dom,
>>>> if (atomic_cmpxchg(&queue_timer_on, 0, 1) == 0)
>>>> mod_timer(&queue_timer, jiffies + msecs_to_jiffies(10));
>>>>
>>>> + put_cpu_ptr(&flush_pd_queue);
>>>> put_cpu_ptr(&flush_queue);
>>>> }
>>>>
>>>> @@ -2810,6 +2843,8 @@ int __init amd_iommu_init_api(void)
>>>> return ret;
>>>>
>>>> for_each_possible_cpu(cpu) {
>>>> + struct flush_pd_queue *pd_queue = per_cpu_ptr(&flush_pd_queue,
>>>> + cpu);
>>>> struct flush_queue *queue = per_cpu_ptr(&flush_queue,
>>>> cpu);
>>>>
>>>> queue->entries = kzalloc(FLUSH_QUEUE_SIZE *
>>>> @@ -2819,6 +2854,12 @@ int __init amd_iommu_init_api(void)
>>>> goto out_put_iova;
>>>>
>>>> spin_lock_init(&queue->lock);
>>>> +
>>>> + pd_queue->entries = kzalloc(FLUSH_QUEUE_SIZE *
>>>> + sizeof(*pd_queue->entries),
>>>> + GFP_KERNEL);
>>>> + if (!pd_queue->entries)
>>>> + goto out_put_iova;
>>>> }
>>>>
>>>> err = bus_set_iommu(&pci_bus_type, &amd_iommu_ops);
>>>> @@ -2836,9 +2877,12 @@ int __init amd_iommu_init_api(void)
>>>>
>>>> out_put_iova:
>>>> for_each_possible_cpu(cpu) {
>>>> + struct flush_pd_queue *pd_queue = per_cpu_ptr(&flush_pd_queue,
>>>> + cpu);
>>>> struct flush_queue *queue = per_cpu_ptr(&flush_queue,
>>>> cpu);
>>>>
>>>> kfree(queue->entries);
>>>> + kfree(pd_queue->entries);
>>>> }
>>>>
>>>> return -ENOMEM;
>>>
>>> Craig and Jan, can you please confirm whether this patch fixes the
>>> IOMMU timeout errors you encountered before? If it does, then this is
>>> a better implementation of the fix I provided a few weeks back.
>>
>> I have only remote access to the machine, so I won't be able to test
>> until June 22nd.
>>
>> Jan
>>
>>>
>>> Thanks,
>>> Arindam
>>
>>
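For readers skimming the diff above: the core of the patch is a second
per-CPU array that records each protection domain at most once between
flushes, so the timer path flushes only domains with pending IOVAs instead
of every domain in the system. Below is a minimal, self-contained sketch of
that record-once/flush-once pattern; the names and types are hypothetical
stand-ins, not the kernel's own (the real code lives in
drivers/iommu/amd_iommu.c and relies on the flush_queue lock for
serialization):

    /* Sketch of the dedup-then-flush pattern behind the pd_queue, assuming
     * a fixed-size queue whose caller flushes before it can overflow. */
    #define PD_QUEUE_SIZE 256

    struct pd_queue {
            unsigned int next;            /* number of recorded domains */
            void *entries[PD_QUEUE_SIZE]; /* one slot per pending domain */
    };

    /* Record a domain, skipping it if already queued since the last flush. */
    static void pd_queue_add(struct pd_queue *q, void *pd)
    {
            unsigned int idx;

            for (idx = 0; idx < q->next; ++idx)
                    if (q->entries[idx] == pd)
                            return;       /* already pending, nothing to do */

            q->entries[q->next++] = pd;
    }

    /* Flush each recorded domain exactly once, then reset the queue. */
    static void pd_queue_flush(struct pd_queue *q, void (*flush_tlb)(void *pd))
    {
            unsigned int idx;

            for (idx = 0; idx < q->next; ++idx)
                    flush_tlb(q->entries[idx]);

            q->next = 0;
    }

The linear duplicate scan in pd_queue_add() mirrors the loop the patch adds
to queue_add(); with at most FLUSH_QUEUE_SIZE entries between flushes, the
O(n) scan is cheap compared to an unconditional TLB flush of every
protection domain.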