Re: [RFC PATCH 4/4] iommu: make IOVA domain page size explicit

All of lore.kernel.org
 help / color / mirror / Atom feed

From: leizhen <thunder.leizhen-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
To: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
Cc: Catalin Marinas <Catalin.Marinas-5wv7dgnIgG8@public.gmane.org>,
	Will Deacon <Will.Deacon-5wv7dgnIgG8@public.gmane.org>,
	"iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org"
	<iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
	"dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org"
	<dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	"linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org"
	<linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org>
Subject: Re: [RFC PATCH 4/4] iommu: make IOVA domain page size explicit
Date: Thu, 27 Nov 2014 15:10:50 +0800	[thread overview]
Message-ID: <5476CE7A.3020302@huawei.com> (raw)
In-Reply-To: <5475D63D.1050405-5wv7dgnIgG8@public.gmane.org>

On 2014/11/26 21:31, Robin Murphy wrote:
> Hi,
> 
> On 26/11/14 07:17, leizhen wrote:
> [...]
>>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>>> index a3dbba8..9e50625 100644
>>> --- a/drivers/iommu/iova.c
>>> +++ b/drivers/iommu/iova.c
>>> @@ -55,12 +55,16 @@ void free_iova_mem(struct iova *iova)
>>>   }
>>>
>>>   void
>>> -init_iova_domain(struct iova_domain *iovad, unsigned long start_pfn,
>>> -    unsigned long pfn_32bit)
>>> +init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>>> +    unsigned long start_pfn, unsigned long pfn_32bit)
>>>   {
>>> +    /* IOMMUs must support at least the CPU page size */
>>> +    BUG_ON(granule > PAGE_SIZE);
>>
>> ??? BUG_ON(granule < PAGE_SIZE);
>>
> 
> Granule represents the smallest page size that the IOMMU can map - we'd never allocate less IOVA space than a single page, so we don't need a unit smaller than that, and a bigger unit would make it
> problematic to allocate IOVA space for individual pages.
> 
> If, for instance, the CPU has 64k pages, and the IOMMU supports 1k tiny pages, granule would be less than PAGE_SIZE, but the IOMMU could happily that 64k as 64 contiguous tiny pages, even if it's a
> waste of TLB space.
> 
> However, say the IOMMU only supports 16k pages and we try to map two noncontiguous 4k CPU pages as a contiguous 8k buffer in IOVA space - this can't work, hence why the _minimum_ IOMMU page size, and
> thus granule, may never be _bigger_ than a CPU page.
> 

Oh, I just confused with the comment: /* IOMMUs must support at least the CPU page size */
But the code means at most the CPU page size. Now, I got it.

>>> +
>>>       spin_lock_init(&iovad->iova_rbtree_lock);
>>>       iovad->rbroot = RB_ROOT;
>>>       iovad->cached32_node = NULL;
>>> +    iovad->granule = granule;
>>
>> Do you need to check granule is 2^n ?
>>
> 
> Yes, that would make a lot of sense.
> 
>>>       iovad->start_pfn = start_pfn;
>>>       iovad->dma_32bit_pfn = pfn_32bit;
>>>   }
>>> diff --git a/include/linux/iova.h b/include/linux/iova.h
>>> index 591b196..3920a19 100644
>>> --- a/include/linux/iova.h
>>> +++ b/include/linux/iova.h
>>> @@ -28,6 +28,7 @@ struct iova_domain {
>>>       spinlock_t    iova_rbtree_lock; /* Lock to protect update of rbtree */
>>>       struct rb_root    rbroot;        /* iova domain rbtree root */
>>>       struct rb_node    *cached32_node; /* Save last alloced node */
>>> +    unsigned long    granule;    /* pfn granularity for this domain */
>>>       unsigned long    start_pfn;    /* Lower limit for this domain */
>>>       unsigned long    dma_32bit_pfn;
>>>   };
>>> @@ -37,6 +38,36 @@ static inline unsigned long iova_size(struct iova *iova)
>>>       return iova->pfn_hi - iova->pfn_lo + 1;
>>>   }
>>>
>>> +static inline unsigned long iova_shift(struct iova_domain *iovad)
>>> +{
>>> +    return __ffs(iovad->granule);
>>
>> I think it's good to save the result to a struct member. If we caculate it ervertime, it's too slow.
>>
> 
> I did give some consideration at the time to which of size, shift and mask to store - given the redundancy between them storing more than one seems excessively wasteful - and this made for the
> nicest-looking code IMO. These IOVA operations are going to be mostly used in map/unmap situations leading to an IOMMU TLB flush anyway, so I'm not sure how performance-critical they really are.
> 
> Are we likely to see this code used on architectures where __ffs takes more than a few cycles and has more impact than the general memory bloat of making structures bigger?
> 
> 
> Thanks for the review,
> Robin.
> 
>>> +}
>>> +
>>> +static inline unsigned long iova_mask(struct iova_domain *iovad)
>>> +{
>>> +    return iovad->granule - 1;
>>> +}
>>> +
>>> +static inline size_t iova_offset(struct iova_domain *iovad, dma_addr_t iova)
>>> +{
>>> +    return iova & iova_mask(iovad);
>>> +}
>>> +
>>> +static inline size_t iova_align(struct iova_domain *iovad, size_t size)
>>> +{
>>> +    return ALIGN(size, iovad->granule);
>>> +}
>>> +
>>> +static inline dma_addr_t iova_dma_addr(struct iova_domain *iovad, struct iova *iova)
>>> +{
>>> +    return (dma_addr_t)iova->pfn_lo << iova_shift(iovad);
>>> +}
>>> +
>>> +static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
>>> +{
>>> +    return iova >> iova_shift(iovad);
>>> +}
>>> +
>>>   int iommu_iova_cache_init(void);
>>>   void iommu_iova_cache_destroy(void);
>>>
>>> @@ -50,8 +81,8 @@ struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
>>>   struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
>>>       unsigned long pfn_hi);
>>>   void copy_reserved_iova(struct iova_domain *from, struct iova_domain *to);
>>> -void init_iova_domain(struct iova_domain *iovad, unsigned long start_pfn,
>>> -    unsigned long pfn_32bit);
>>> +void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>>> +    unsigned long start_pfn, unsigned long pfn_32bit);
>>>   struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
>>>   void put_iova_domain(struct iova_domain *iovad);
>>>   struct iova *split_and_remove_iova(struct iova_domain *iovad,
>>>
>>
>>
>>
> 
> 
> 
> .
>

WARNING: multiple messages have this Message-ID (diff)

From: thunder.leizhen@huawei.com (leizhen)
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC PATCH 4/4] iommu: make IOVA domain page size explicit
Date: Thu, 27 Nov 2014 15:10:50 +0800	[thread overview]
Message-ID: <5476CE7A.3020302@huawei.com> (raw)
In-Reply-To: <5475D63D.1050405@arm.com>

On 2014/11/26 21:31, Robin Murphy wrote:
> Hi,
> 
> On 26/11/14 07:17, leizhen wrote:
> [...]
>>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>>> index a3dbba8..9e50625 100644
>>> --- a/drivers/iommu/iova.c
>>> +++ b/drivers/iommu/iova.c
>>> @@ -55,12 +55,16 @@ void free_iova_mem(struct iova *iova)
>>>   }
>>>
>>>   void
>>> -init_iova_domain(struct iova_domain *iovad, unsigned long start_pfn,
>>> -    unsigned long pfn_32bit)
>>> +init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>>> +    unsigned long start_pfn, unsigned long pfn_32bit)
>>>   {
>>> +    /* IOMMUs must support at least the CPU page size */
>>> +    BUG_ON(granule > PAGE_SIZE);
>>
>> ??? BUG_ON(granule < PAGE_SIZE);
>>
> 
> Granule represents the smallest page size that the IOMMU can map - we'd never allocate less IOVA space than a single page, so we don't need a unit smaller than that, and a bigger unit would make it
> problematic to allocate IOVA space for individual pages.
> 
> If, for instance, the CPU has 64k pages, and the IOMMU supports 1k tiny pages, granule would be less than PAGE_SIZE, but the IOMMU could happily that 64k as 64 contiguous tiny pages, even if it's a
> waste of TLB space.
> 
> However, say the IOMMU only supports 16k pages and we try to map two noncontiguous 4k CPU pages as a contiguous 8k buffer in IOVA space - this can't work, hence why the _minimum_ IOMMU page size, and
> thus granule, may never be _bigger_ than a CPU page.
> 

Oh, I just confused with the comment: /* IOMMUs must support at least the CPU page size */
But the code means at most the CPU page size. Now, I got it.

>>> +
>>>       spin_lock_init(&iovad->iova_rbtree_lock);
>>>       iovad->rbroot = RB_ROOT;
>>>       iovad->cached32_node = NULL;
>>> +    iovad->granule = granule;
>>
>> Do you need to check granule is 2^n ?
>>
> 
> Yes, that would make a lot of sense.
> 
>>>       iovad->start_pfn = start_pfn;
>>>       iovad->dma_32bit_pfn = pfn_32bit;
>>>   }
>>> diff --git a/include/linux/iova.h b/include/linux/iova.h
>>> index 591b196..3920a19 100644
>>> --- a/include/linux/iova.h
>>> +++ b/include/linux/iova.h
>>> @@ -28,6 +28,7 @@ struct iova_domain {
>>>       spinlock_t    iova_rbtree_lock; /* Lock to protect update of rbtree */
>>>       struct rb_root    rbroot;        /* iova domain rbtree root */
>>>       struct rb_node    *cached32_node; /* Save last alloced node */
>>> +    unsigned long    granule;    /* pfn granularity for this domain */
>>>       unsigned long    start_pfn;    /* Lower limit for this domain */
>>>       unsigned long    dma_32bit_pfn;
>>>   };
>>> @@ -37,6 +38,36 @@ static inline unsigned long iova_size(struct iova *iova)
>>>       return iova->pfn_hi - iova->pfn_lo + 1;
>>>   }
>>>
>>> +static inline unsigned long iova_shift(struct iova_domain *iovad)
>>> +{
>>> +    return __ffs(iovad->granule);
>>
>> I think it's good to save the result to a struct member. If we caculate it ervertime, it's too slow.
>>
> 
> I did give some consideration at the time to which of size, shift and mask to store - given the redundancy between them storing more than one seems excessively wasteful - and this made for the
> nicest-looking code IMO. These IOVA operations are going to be mostly used in map/unmap situations leading to an IOMMU TLB flush anyway, so I'm not sure how performance-critical they really are.
> 
> Are we likely to see this code used on architectures where __ffs takes more than a few cycles and has more impact than the general memory bloat of making structures bigger?
> 
> 
> Thanks for the review,
> Robin.
> 
>>> +}
>>> +
>>> +static inline unsigned long iova_mask(struct iova_domain *iovad)
>>> +{
>>> +    return iovad->granule - 1;
>>> +}
>>> +
>>> +static inline size_t iova_offset(struct iova_domain *iovad, dma_addr_t iova)
>>> +{
>>> +    return iova & iova_mask(iovad);
>>> +}
>>> +
>>> +static inline size_t iova_align(struct iova_domain *iovad, size_t size)
>>> +{
>>> +    return ALIGN(size, iovad->granule);
>>> +}
>>> +
>>> +static inline dma_addr_t iova_dma_addr(struct iova_domain *iovad, struct iova *iova)
>>> +{
>>> +    return (dma_addr_t)iova->pfn_lo << iova_shift(iovad);
>>> +}
>>> +
>>> +static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
>>> +{
>>> +    return iova >> iova_shift(iovad);
>>> +}
>>> +
>>>   int iommu_iova_cache_init(void);
>>>   void iommu_iova_cache_destroy(void);
>>>
>>> @@ -50,8 +81,8 @@ struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
>>>   struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
>>>       unsigned long pfn_hi);
>>>   void copy_reserved_iova(struct iova_domain *from, struct iova_domain *to);
>>> -void init_iova_domain(struct iova_domain *iovad, unsigned long start_pfn,
>>> -    unsigned long pfn_32bit);
>>> +void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>>> +    unsigned long start_pfn, unsigned long pfn_32bit);
>>>   struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
>>>   void put_iova_domain(struct iova_domain *iovad);
>>>   struct iova *split_and_remove_iova(struct iova_domain *iovad,
>>>
>>
>>
>>
> 
> 
> 
> .
>

next prev parent reply	other threads:[~2014-11-27  7:10 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-25 17:27 [RFC PATCH 0/4] Genericise the IOVA allocator Robin Murphy
2014-11-25 17:27 ` Robin Murphy
     [not found] ` <cover.1416931258.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2014-11-25 17:27   ` [RFC PATCH 1/4] iommu: build iova.c for any IOMMU Robin Murphy
2014-11-25 17:27     ` Robin Murphy
     [not found]     ` <90240865e7cc00d0f7e471605ba9d2478b81ea88.1416931258.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2014-11-26  6:58       ` leizhen
2014-11-26  6:58         ` leizhen
     [not found]         ` <54757A20.2090609-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2014-11-26 12:19           ` Robin Murphy
2014-11-26 12:19             ` Robin Murphy
2014-11-25 17:27   ` [RFC PATCH 2/4] iommu: consolidate IOVA allocator code Robin Murphy
2014-11-25 17:27     ` Robin Murphy
2014-11-25 17:27   ` [RFC PATCH 3/4] iommu: make IOVA domain low limit flexible Robin Murphy
2014-11-25 17:27     ` Robin Murphy
2014-11-25 17:27   ` [RFC PATCH 4/4] iommu: make IOVA domain page size explicit Robin Murphy
2014-11-25 17:27     ` Robin Murphy
     [not found]     ` <3774c76c45b21820b6a6acf582b8f441b639ffe9.1416931258.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2014-11-26  7:17       ` leizhen
2014-11-26  7:17         ` leizhen
     [not found]         ` <54757E98.9010006-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2014-11-26 13:31           ` Robin Murphy
2014-11-26 13:31             ` Robin Murphy
     [not found]             ` <5475D63D.1050405-5wv7dgnIgG8@public.gmane.org>
2014-11-27  7:10               ` leizhen [this message]
2014-11-27  7:10                 ` leizhen
2014-11-27 12:43   ` [RFC PATCH 0/4] Genericise the IOVA allocator Sakari Ailus
2014-11-27 12:43     ` Sakari Ailus
2015-01-12 15:52   ` Joerg Roedel
2015-01-12 15:52     ` Joerg Roedel
     [not found]     ` <20150112155220.GD6343-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2015-01-12 16:05       ` Robin Murphy
2015-01-12 16:05         ` Robin Murphy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5476CE7A.3020302@huawei.com \
    --to=thunder.leizhen-hv44wf8li93qt0dzr+alfa@public.gmane.org \
    --cc=Catalin.Marinas-5wv7dgnIgG8@public.gmane.org \
    --cc=Will.Deacon-5wv7dgnIgG8@public.gmane.org \
    --cc=dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
    --cc=iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org \
    --cc=robin.murphy-5wv7dgnIgG8@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.