From: Jiang Liu <jiang.liu@linux.intel.com>
To: Kai Huang <dev.kai.huang@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>,
	Vinod Koul <vinod.koul@intel.com>,
	David Woodhouse <dwmw2@infradead.org>,
	linux-kernel@vger.kernel.org,
	dmaengine@vger.kernel.org,
	iommu@lists.linux-foundation.org,
	linux-pci@vger.kernel.org,
	Bjorn Helgaas <bhelgaas@google.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Yinghai Lu <yinghai@kernel.org>
Subject: Re: [RFC Patch Part2 V1 14/14] iommu/vt-d: update IOMMU state when memory hotplug happens
Date: Wed, 08 Jan 2014 14:21:52 +0800
Message-ID: <52CCEE80.20001@linux.intel.com>
In-Reply-To: <CAOtp4KqvzrJr=Z5xj-vZnnL--W6R2CjZ=m0rFgR9DzxVKjfwSQ@mail.gmail.com>



On 2014/1/8 14:14, Kai Huang wrote:
> On Wed, Jan 8, 2014 at 2:01 PM, Jiang Liu <jiang.liu@linux.intel.com> wrote:
>>
>>
>> On 2014/1/8 13:07, Kai Huang wrote:
>>> On Tue, Jan 7, 2014 at 5:00 PM, Jiang Liu <jiang.liu@linux.intel.com> wrote:
>>>> If a static identity domain is created, the IOMMU driver needs to
>>>> update the si_domain page table when a memory hotplug event happens.
>>>> Otherwise PCI device DMA operations can't access the hot-added memory regions.
>>>>
>>>> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
>>>> ---
>>>>  drivers/iommu/intel-iommu.c |   52 ++++++++++++++++++++++++++++++++++++++++++-
>>>>  1 file changed, 51 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index 83e3ed4..35a987d 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -33,6 +33,7 @@
>>>>  #include <linux/dmar.h>
>>>>  #include <linux/dma-mapping.h>
>>>>  #include <linux/mempool.h>
>>>> +#include <linux/memory.h>
>>>>  #include <linux/timer.h>
>>>>  #include <linux/iova.h>
>>>>  #include <linux/iommu.h>
>>>> @@ -3689,6 +3690,54 @@ static struct notifier_block device_nb = {
>>>>         .notifier_call = device_notifier,
>>>>  };
>>>>
>>>> +static int intel_iommu_memory_notifier(struct notifier_block *nb,
>>>> +                                      unsigned long val, void *v)
>>>> +{
>>>> +       struct memory_notify *mhp = v;
>>>> +       unsigned long long start, end;
>>>> +       struct iova *iova;
>>>> +
>>>> +       switch (val) {
>>>> +       case MEM_GOING_ONLINE:
>>>> +               start = mhp->start_pfn << PAGE_SHIFT;
>>>> +               end = ((mhp->start_pfn + mhp->nr_pages) << PAGE_SHIFT) - 1;
>>>> +               if (iommu_domain_identity_map(si_domain, start, end)) {
>>>> +                       pr_warn("dmar: failed to build identity map for [%llx-%llx]\n",
>>>> +                               start, end);
>>>> +                       return NOTIFY_BAD;
>>>> +               }
>>>
>>> Better to use iommu_prepare_identity_map? For si_domain, if
>>> hw_pass_through is used, there's no page table.
>> Hi Kai,
>>         Good catch!
>> It seems iommu_prepare_identity_map() is designed to handle
>> RMRRs. So how about not registering the memory hotplug notifier
>> if hw_pass_through is true?
> 
> I think that's also fine :)
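
A minimal sketch of that condition, written as a userspace mock rather than
driver code. The names sketch_domain and needs_memory_notifier() are
hypothetical and introduced only for illustration; in the driver the check
would presumably sit next to the register_memory_notifier() call in
intel_iommu_init().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's domain structure. */
struct sketch_domain {
	int id;
};

/*
 * Decide whether a memory hotplug notifier is worth registering.
 * It is needed only when a static identity domain exists AND that
 * domain is backed by a page table. With hardware pass-through the
 * si_domain has no page table, so there is nothing to update.
 */
static bool needs_memory_notifier(const struct sketch_domain *si_domain,
				  bool hw_pass_through)
{
	return si_domain != NULL && !hw_pass_through;
}
```

The existing "if (si_domain)" test in the patch corresponds to the first half
of this condition; the second half is the part discussed above.
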
> 
> Btw, I have a question related to memory hotplug but not specific to
> the Intel IOMMU. For devices that use DMA remapping, suppose
> the device is already using the memory that we are trying to remove.
> In this case, it looks like we need to change the existing iova <-> pa
> mappings for any pa in the memory range about to be removed,
> and point the mapping at a different pa (the iova remains the same). Does
> the existing code cover this? Is there a generic IOMMU-layer memory
> hotplug notifier to handle memory removal?
That's a big issue: how to reclaim memory that is in use. The current rule
is that memory used by DMA won't be removed until it is released.

> 
> -Kai
>>
>> Thanks!
>> Gerry
>>
>>>
>>>> +               break;
>>>> +       case MEM_OFFLINE:
>>>> +       case MEM_CANCEL_ONLINE:
>>>> +               /* TODO: enhance RB-tree and IOVA code to support splitting an iova */
>>>> +               iova = find_iova(&si_domain->iovad, mhp->start_pfn);
>>>> +               if (iova) {
>>>> +                       unsigned long start_pfn, last_pfn;
>>>> +                       struct dmar_drhd_unit *drhd;
>>>> +                       struct intel_iommu *iommu;
>>>> +
>>>> +                       start_pfn = mm_to_dma_pfn(iova->pfn_lo);
>>>> +                       last_pfn = mm_to_dma_pfn(iova->pfn_hi + 1) - 1;
>>>> +                       dma_pte_clear_range(si_domain, start_pfn, last_pfn);
>>>> +                       dma_pte_free_pagetable(si_domain, start_pfn, last_pfn);
>>>> +                       rcu_read_lock();
>>>> +                       for_each_active_iommu(iommu, drhd)
>>>> +                               iommu_flush_iotlb_psi(iommu, si_domain->id,
>>>> +                                       start_pfn, last_pfn - start_pfn + 1, 0);
>>>> +                       rcu_read_unlock();
>>>> +                       __free_iova(&si_domain->iovad, iova);
>>>> +               }
>>>
>>> Same as above. It looks like we need to consider hw_pass_through for the si_domain.
>>>
>>> -Kai
>>>
>>>> +               break;
>>>> +       }
>>>> +
>>>> +       return NOTIFY_OK;
>>>> +}
>>>> +
>>>> +static struct notifier_block intel_iommu_memory_nb = {
>>>> +       .notifier_call = intel_iommu_memory_notifier,
>>>> +       .priority = 0
>>>> +};
>>>> +
>>>>  int __init intel_iommu_init(void)
>>>>  {
>>>>         int ret = -ENODEV;
>>>> @@ -3761,8 +3810,9 @@ int __init intel_iommu_init(void)
>>>>         init_iommu_pm_ops();
>>>>
>>>>         bus_set_iommu(&pci_bus_type, &intel_iommu_ops);
>>>> -
>>>>         bus_register_notifier(&pci_bus_type, &device_nb);
>>>> +       if (si_domain)
>>>> +               register_memory_notifier(&intel_iommu_memory_nb);
>>>>
>>>>         intel_iommu_enabled = 1;
>>>>
>>>> --
>>>> 1.7.10.4
>>>>
>>>> _______________________________________________
>>>> iommu mailing list
>>>> iommu@lists.linux-foundation.org
>>>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
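
For reference, the MEM_GOING_ONLINE case in the quoted patch turns a PFN
range into an inclusive byte range before building the identity map. Below is
a small userspace sketch of that arithmetic, assuming 4 KiB pages. The names
SKETCH_PAGE_SHIFT, struct byte_range and pfn_range_to_bytes() are
illustrative only, not kernel symbols. The sketch widens the PFN to 64 bits
before shifting, which matters only where unsigned long is 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86. The kernel's PAGE_SHIFT is per-architecture. */
#define SKETCH_PAGE_SHIFT 12

struct byte_range {
	uint64_t start;	/* address of the first byte */
	uint64_t end;	/* address of the last byte, inclusive */
};

/*
 * Convert "nr_pages pages starting at start_pfn" into an inclusive
 * byte range, mirroring the start/end computation in the patch.
 */
static struct byte_range pfn_range_to_bytes(uint64_t start_pfn,
					    uint64_t nr_pages)
{
	struct byte_range r;

	assert(nr_pages > 0);	/* an empty range has no last byte */
	r.start = start_pfn << SKETCH_PAGE_SHIFT;
	r.end = ((start_pfn + nr_pages) << SKETCH_PAGE_SHIFT) - 1;
	return r;
}
```

For example, start_pfn 0x100000 with 0x8000 pages gives the range
[0x100000000-0x107ffffff], which is the form printed by the pr_warn() format
string in the patch.
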
