* [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
@ 2017-01-16 14:12 Tvrtko Ursulin
2017-01-16 14:12 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-01-16 14:12 UTC (permalink / raw)
To: Intel-gfx
Cc: Tomasz Stanislawski, Pawel Osciak, linux-kernel, Masahiro Yamada,
Kyungmin Park, Matt Porter, linux-media, Alexandre Bounine,
Marek Szyprowski
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Scatterlist entries have an unsigned int for the offset, so
correct the sg_alloc_table_from_pages function accordingly.
Since these are offsets within a page, unsigned int is
wide enough.
Also convert callers which were using unsigned long locally,
adding the lower_32_bits annotation to make it explicitly
clear what is happening.
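To illustrate the point, here is a minimal standalone sketch (not part of
the patch; PAGE_SIZE, PAGE_MASK and the two helpers are approximated to
match their usual kernel definitions) showing that the value produced by
lower_32_bits(offset_in_page(vaddr)) always fits in an unsigned int:

/* Hypothetical userspace sketch, for illustration only. */
#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))
/* Approximations of the kernel helpers used by this patch. */
#define offset_in_page(p) ((unsigned long)(uintptr_t)(p) & ~PAGE_MASK)
#define lower_32_bits(n)  ((uint32_t)((n) & 0xffffffff))

int main(void)
{
	unsigned long vaddr = 0x7f12345678abUL;
	/* Always < PAGE_SIZE, so unsigned int is wide enough. */
	unsigned int offset = lower_32_bits(offset_in_page(vaddr));

	printf("offset = %u (maximum possible %lu)\n", offset, PAGE_SIZE - 1);
	return 0;
}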
v2: Use offset_in_page. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Tomasz Stanislawski <t.stanislaws@samsung.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: linux-media@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> (v1)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
---
drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
drivers/rapidio/devices/rio_mport_cdev.c | 4 ++--
include/linux/scatterlist.h | 2 +-
lib/scatterlist.c | 2 +-
4 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index fb6a177be461..51e8765bc3c6 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
{
struct vb2_dc_buf *buf;
struct frame_vector *vec;
- unsigned long offset;
+ unsigned int offset;
int n_pages, i;
int ret = 0;
struct sg_table *sgt;
@@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
buf->dev = dev;
buf->dma_dir = dma_dir;
- offset = vaddr & ~PAGE_MASK;
+ offset = lower_32_bits(offset_in_page(vaddr));
vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
if (IS_ERR(vec)) {
ret = PTR_ERR(vec);
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
index 9013a585507e..0fae29ff47ba 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
* offset within the internal buffer specified by handle parameter.
*/
if (xfer->loc_addr) {
- unsigned long offset;
+ unsigned int offset;
long pinned;
- offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
+ offset = lower_32_bits(offset_in_page(xfer->loc_addr));
nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;
page_list = kmalloc_array(nr_pages,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe6acd7..c981bee1a3ae 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
- unsigned long offset, unsigned long size,
+ unsigned int offset, unsigned long size,
gfp_t gfp_mask);
size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 004fc70fc56a..e05e7fc98892 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
*/
int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
- unsigned long offset, unsigned long size,
+ unsigned int offset, unsigned long size,
gfp_t gfp_mask)
{
unsigned int chunks;
--
2.7.4
* [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow
2017-01-16 14:12 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
@ 2017-01-16 14:12 ` Tvrtko Ursulin
2017-03-07 8:58 ` Tvrtko Ursulin
2017-01-16 14:12 ` [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-01-16 14:12 UTC (permalink / raw)
To: Intel-gfx; +Cc: linux-kernel, Masahiro Yamada, Andy Shevchenko
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Since the scatterlist length field is an unsigned int, make
sure that sg_alloc_table_from_pages does not overflow it while
coalescing pages into a single entry.
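For illustration, a small standalone sketch (assuming 4 KiB pages;
SCATTERLIST_MAX_SEGMENT is introduced by the hunk below) of the largest
segment the code will now coalesce before starting a new entry:

/* Hypothetical userspace sketch, for illustration only. */
#include <stdio.h>
#include <limits.h>

#define PAGE_SIZE 4096u
#define PAGE_MASK (~(PAGE_SIZE - 1))
#define SCATTERLIST_MAX_SEGMENT (UINT_MAX & PAGE_MASK)

int main(void)
{
	/* 0xfffff000 bytes, i.e. 1048575 pages of 4 KiB each. */
	printf("max segment: %u bytes (%u pages)\n",
	       SCATTERLIST_MAX_SEGMENT, SCATTERLIST_MAX_SEGMENT / PAGE_SIZE);
	return 0;
}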
v2: Drop reference to future use. Use UINT_MAX.
v3: max_segment must be page aligned.
v4: Do not rely on compiler to optimise out the rounddown.
(Joonas Lahtinen)
v5: Simplified loops and use post-increments rather than
pre-increments. Use PAGE_MASK and fix comment typo.
(Andy Shevchenko)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
---
include/linux/scatterlist.h | 6 ++++++
lib/scatterlist.c | 31 ++++++++++++++++++++-----------
2 files changed, 26 insertions(+), 11 deletions(-)
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..4768eeeb7054 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -21,6 +21,12 @@ struct scatterlist {
};
/*
+ * Since the above length field is an unsigned int, below we define the maximum
+ * length in bytes that can be stored in one scatterlist entry.
+ */
+#define SCATTERLIST_MAX_SEGMENT (UINT_MAX & PAGE_MASK)
+
+/*
* These macros should be used after a dma_map_sg call has been done
* to get bus addresses of each of the SG entries and their lengths.
* You should only work with the number of sg entries dma_map_sg
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index e05e7fc98892..65f375645df5 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,17 +394,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
unsigned int offset, unsigned long size,
gfp_t gfp_mask)
{
- unsigned int chunks;
- unsigned int i;
- unsigned int cur_page;
+ const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
+ unsigned int chunks, cur_page, seg_len, i;
int ret;
struct scatterlist *s;
/* compute number of contiguous chunks */
chunks = 1;
- for (i = 1; i < n_pages; ++i)
- if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
- ++chunks;
+ seg_len = 0;
+ for (i = 1; i < n_pages; i++) {
+ seg_len += PAGE_SIZE;
+ if (seg_len >= max_segment ||
+ page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
+ chunks++;
+ seg_len = 0;
+ }
+ }
ret = sg_alloc_table(sgt, chunks, gfp_mask);
if (unlikely(ret))
@@ -413,17 +418,21 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
/* merging chunks and putting them into the scatterlist */
cur_page = 0;
for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
- unsigned long chunk_size;
- unsigned int j;
+ unsigned int j, chunk_size;
/* look for the end of the current chunk */
- for (j = cur_page + 1; j < n_pages; ++j)
- if (page_to_pfn(pages[j]) !=
+ seg_len = 0;
+ for (j = cur_page + 1; j < n_pages; j++) {
+ seg_len += PAGE_SIZE;
+ if (seg_len >= max_segment ||
+ page_to_pfn(pages[j]) !=
page_to_pfn(pages[j - 1]) + 1)
break;
+ }
chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
- sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
+ sg_set_page(s, pages[cur_page],
+ min_t(unsigned long, size, chunk_size), offset);
size -= chunk_size;
offset = 0;
cur_page = j;
--
2.7.4
* Re: [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow
2017-01-16 14:12 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
@ 2017-03-07 8:58 ` Tvrtko Ursulin
2017-03-07 10:16 ` Tvrtko Ursulin
0 siblings, 1 reply; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-03-07 8:58 UTC (permalink / raw)
To: Tvrtko Ursulin, Intel-gfx; +Cc: Masahiro Yamada, Andy Shevchenko, linux-kernel
Hi,
On 16/01/2017 14:12, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Since the scatterlist length field is an unsigned int, make
> sure that sg_alloc_table_from_pages does not overflow it while
> coalescing pages into a single entry.
>
> v2: Drop reference to future use. Use UINT_MAX.
> v3: max_segment must be page aligned.
> v4: Do not rely on compiler to optimise out the rounddown.
> (Joonas Lahtinen)
> v5: Simplified loops and use post-increments rather than
> pre-increments. Use PAGE_MASK and fix comment typo.
> (Andy Shevchenko)
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Anyone in the mood for reviewing from here to the end of the series?
Regards,
Tvrtko
> ---
> include/linux/scatterlist.h | 6 ++++++
> lib/scatterlist.c | 31 ++++++++++++++++++++-----------
> 2 files changed, 26 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index c981bee1a3ae..4768eeeb7054 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -21,6 +21,12 @@ struct scatterlist {
> };
>
> /*
> + * Since the above length field is an unsigned int, below we define the maximum
> + * length in bytes that can be stored in one scatterlist entry.
> + */
> +#define SCATTERLIST_MAX_SEGMENT (UINT_MAX & PAGE_MASK)
> +
> +/*
> * These macros should be used after a dma_map_sg call has been done
> * to get bus addresses of each of the SG entries and their lengths.
> * You should only work with the number of sg entries dma_map_sg
> diff --git a/lib/scatterlist.c b/lib/scatterlist.c
> index e05e7fc98892..65f375645df5 100644
> --- a/lib/scatterlist.c
> +++ b/lib/scatterlist.c
> @@ -394,17 +394,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
> unsigned int offset, unsigned long size,
> gfp_t gfp_mask)
> {
> - unsigned int chunks;
> - unsigned int i;
> - unsigned int cur_page;
> + const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
> + unsigned int chunks, cur_page, seg_len, i;
> int ret;
> struct scatterlist *s;
>
> /* compute number of contiguous chunks */
> chunks = 1;
> - for (i = 1; i < n_pages; ++i)
> - if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
> - ++chunks;
> + seg_len = 0;
> + for (i = 1; i < n_pages; i++) {
> + seg_len += PAGE_SIZE;
> + if (seg_len >= max_segment ||
> + page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
> + chunks++;
> + seg_len = 0;
> + }
> + }
>
> ret = sg_alloc_table(sgt, chunks, gfp_mask);
> if (unlikely(ret))
> @@ -413,17 +418,21 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
> /* merging chunks and putting them into the scatterlist */
> cur_page = 0;
> for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
> - unsigned long chunk_size;
> - unsigned int j;
> + unsigned int j, chunk_size;
>
> /* look for the end of the current chunk */
> - for (j = cur_page + 1; j < n_pages; ++j)
> - if (page_to_pfn(pages[j]) !=
> + seg_len = 0;
> + for (j = cur_page + 1; j < n_pages; j++) {
> + seg_len += PAGE_SIZE;
> + if (seg_len >= max_segment ||
> + page_to_pfn(pages[j]) !=
> page_to_pfn(pages[j - 1]) + 1)
> break;
> + }
>
> chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
> - sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
> + sg_set_page(s, pages[cur_page],
> + min_t(unsigned long, size, chunk_size), offset);
> size -= chunk_size;
> offset = 0;
> cur_page = j;
>
* Re: [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow
2017-03-07 8:58 ` Tvrtko Ursulin
@ 2017-03-07 10:16 ` Tvrtko Ursulin
0 siblings, 0 replies; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-03-07 10:16 UTC (permalink / raw)
To: Tvrtko Ursulin, Intel-gfx, David Dillow
Cc: Masahiro Yamada, Andy Shevchenko, linux-kernel
Hi David,
Chris noticed your "scatterlist: don't overflow length field" patch and
pinged me, so I am copying you on another thread which tries to solve
the same problem.
My latest series is here:
https://patchwork.freedesktop.org/series/18062/, but it has been going
on since November last year.
I like your BUILD_BUG_ON safety, but otherwise our patches are pretty
similar.
The i915 driver also benefits from the ability to create large sg chunks,
which saves us a few megabytes of RAM at runtime, but we do have to
degrade to smaller chunks when running under a hypervisor. For that we
are using the swiotlb_max_segment API Konrad recently added for this
purpose.
So what I did, in addition to fixing the overflow, was to export a new
flavour of sg_alloc_table_from_pages which allows you to control the
maximum chunk size.
Maybe you can have a look at my series and see if it would work for you?
I've been trying to gain some traction for it for some months now.
Regards,
Tvrtko
On 07/03/2017 08:58, Tvrtko Ursulin wrote:
>
> Hi,
>
> On 16/01/2017 14:12, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Since the scatterlist length field is an unsigned int, make
>> sure that sg_alloc_table_from_pages does not overflow it while
>> coalescing pages into a single entry.
>>
>> v2: Drop reference to future use. Use UINT_MAX.
>> v3: max_segment must be page aligned.
>> v4: Do not rely on compiler to optimise out the rounddown.
>> (Joonas Lahtinen)
>> v5: Simplified loops and use post-increments rather than
>> pre-increments. Use PAGE_MASK and fix comment typo.
>> (Andy Shevchenko)
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>> Cc: linux-kernel@vger.kernel.org
>> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
>
> Anyone in the mood for reviewing from here to the end of the series?
>
> Regards,
>
> Tvrtko
>
>> ---
>> include/linux/scatterlist.h | 6 ++++++
>> lib/scatterlist.c | 31 ++++++++++++++++++++-----------
>> 2 files changed, 26 insertions(+), 11 deletions(-)
>>
>> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
>> index c981bee1a3ae..4768eeeb7054 100644
>> --- a/include/linux/scatterlist.h
>> +++ b/include/linux/scatterlist.h
>> @@ -21,6 +21,12 @@ struct scatterlist {
>> };
>>
>> /*
>> + * Since the above length field is an unsigned int, below we define
>> the maximum
>> + * length in bytes that can be stored in one scatterlist entry.
>> + */
>> +#define SCATTERLIST_MAX_SEGMENT (UINT_MAX & PAGE_MASK)
>> +
>> +/*
>> * These macros should be used after a dma_map_sg call has been done
>> * to get bus addresses of each of the SG entries and their lengths.
>> * You should only work with the number of sg entries dma_map_sg
>> diff --git a/lib/scatterlist.c b/lib/scatterlist.c
>> index e05e7fc98892..65f375645df5 100644
>> --- a/lib/scatterlist.c
>> +++ b/lib/scatterlist.c
>> @@ -394,17 +394,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
>> unsigned int offset, unsigned long size,
>> gfp_t gfp_mask)
>> {
>> - unsigned int chunks;
>> - unsigned int i;
>> - unsigned int cur_page;
>> + const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
>> + unsigned int chunks, cur_page, seg_len, i;
>> int ret;
>> struct scatterlist *s;
>>
>> /* compute number of contiguous chunks */
>> chunks = 1;
>> - for (i = 1; i < n_pages; ++i)
>> - if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
>> - ++chunks;
>> + seg_len = 0;
>> + for (i = 1; i < n_pages; i++) {
>> + seg_len += PAGE_SIZE;
>> + if (seg_len >= max_segment ||
>> + page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
>> + chunks++;
>> + seg_len = 0;
>> + }
>> + }
>>
>> ret = sg_alloc_table(sgt, chunks, gfp_mask);
>> if (unlikely(ret))
>> @@ -413,17 +418,21 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
>> /* merging chunks and putting them into the scatterlist */
>> cur_page = 0;
>> for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
>> - unsigned long chunk_size;
>> - unsigned int j;
>> + unsigned int j, chunk_size;
>>
>> /* look for the end of the current chunk */
>> - for (j = cur_page + 1; j < n_pages; ++j)
>> - if (page_to_pfn(pages[j]) !=
>> + seg_len = 0;
>> + for (j = cur_page + 1; j < n_pages; j++) {
>> + seg_len += PAGE_SIZE;
>> + if (seg_len >= max_segment ||
>> + page_to_pfn(pages[j]) !=
>> page_to_pfn(pages[j - 1]) + 1)
>> break;
>> + }
>>
>> chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
>> - sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
>> + sg_set_page(s, pages[cur_page],
>> + min_t(unsigned long, size, chunk_size), offset);
>> size -= chunk_size;
>> offset = 0;
>> cur_page = j;
>>
* [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
2017-01-16 14:12 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
2017-01-16 14:12 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
@ 2017-01-16 14:12 ` Tvrtko Ursulin
2017-01-16 14:12 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
` (2 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-01-16 14:12 UTC (permalink / raw)
To: Intel-gfx; +Cc: linux-kernel, Masahiro Yamada
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Drivers like i915 benefit from being able to control the maximum
size of the sg coalesced segment while building the scatter-
gather list.
Introduce and export the __sg_alloc_table_from_pages function
which will give them that control.
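As a reference for how a driver is expected to use it, here is a minimal
calling sketch (error handling trimmed; the pages, n_pages and max_segment
variables are assumed to be set up by the caller, and max_segment must be
non-zero and page aligned):

	struct sg_table *st;
	int ret;

	st = kmalloc(sizeof(*st), GFP_KERNEL);
	if (!st)
		return -ENOMEM;

	/* Cap every coalesced segment at max_segment bytes. */
	ret = __sg_alloc_table_from_pages(st, pages, n_pages, 0,
					  (unsigned long)n_pages << PAGE_SHIFT,
					  max_segment, GFP_KERNEL);
	if (ret) {
		kfree(st);
		return ret;
	}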
v2: Reorder parameters. (Chris Wilson)
v3: Fix incomplete reordering in v2.
v4: max_segment needs to be page aligned.
v5: Rebase.
v6: Rebase.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-kernel@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
include/linux/scatterlist.h | 11 +++++----
lib/scatterlist.c | 58 +++++++++++++++++++++++++++++++++++----------
2 files changed, 52 insertions(+), 17 deletions(-)
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 4768eeeb7054..4d67a9652c7d 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -267,10 +267,13 @@ void sg_free_table(struct sg_table *);
int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
struct scatterlist *, gfp_t, sg_alloc_fn *);
int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
- struct page **pages, unsigned int n_pages,
- unsigned int offset, unsigned long size,
- gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, unsigned int max_segment,
+ gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask);
size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 65f375645df5..326119281167 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
EXPORT_SYMBOL(sg_alloc_table);
/**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- * an array of pages
- * @sgt: The sg table header to use
- * @pages: Pointer to an array of page pointers
- * @n_pages: Number of pages in the pages array
- * @offset: Offset from start of the first page to the start of a buffer
- * @size: Number of valid bytes in the buffer (after offset)
- * @gfp_mask: GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ * an array of pages
+ * @sgt: The sg table header to use
+ * @pages: Pointer to an array of page pointers
+ * @n_pages: Number of pages in the pages array
+ * @offset: Offset from start of the first page to the start of a buffer
+ * @size: Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a scatterlist node in bytes (page aligned)
+ * @gfp_mask: GFP allocation mask
*
* Description:
* Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,16 +390,18 @@ EXPORT_SYMBOL(sg_alloc_table);
* Returns:
* 0 on success, negative error on failure
*/
-int sg_alloc_table_from_pages(struct sg_table *sgt,
- struct page **pages, unsigned int n_pages,
- unsigned int offset, unsigned long size,
- gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, unsigned int max_segment,
+ gfp_t gfp_mask)
{
- const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
unsigned int chunks, cur_page, seg_len, i;
int ret;
struct scatterlist *s;
+ if (WARN_ON(!max_segment || offset_in_page(max_segment)))
+ return -EINVAL;
+
/* compute number of contiguous chunks */
chunks = 1;
seg_len = 0;
@@ -440,6 +443,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
return 0;
}
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ * an array of pages
+ * @sgt: The sg table header to use
+ * @pages: Pointer to an array of page pointers
+ * @n_pages: Number of pages in the pages array
+ * @offset: Offset from start of the first page to the start of a buffer
+ * @size: Number of valid bytes in the buffer (after offset)
+ * @gfp_mask: GFP allocation mask
+ *
+ * Description:
+ * Allocate and initialize an sg table from a list of pages. Contiguous
+ * ranges of the pages are squashed into a single scatterlist node. A user
+ * may provide an offset at a start and a size of valid data in a buffer
+ * specified by the page array. The returned sg table is released by
+ * sg_free_table.
+ *
+ * Returns:
+ * 0 on success, negative error on failure
+ */
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask)
+{
+ return __sg_alloc_table_from_pages(sgt, pages, n_pages, offset, size,
+ SCATTERLIST_MAX_SEGMENT, gfp_mask);
+}
EXPORT_SYMBOL(sg_alloc_table_from_pages);
void __sg_page_iter_start(struct sg_page_iter *piter,
--
2.7.4
* [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
2017-01-16 14:12 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
2017-01-16 14:12 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
2017-01-16 14:12 ` [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
@ 2017-01-16 14:12 ` Tvrtko Ursulin
2017-01-16 14:54 ` ✓ Fi.CI.BAT: success for series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Patchwork
2017-01-30 9:44 ` [PATCH 1/4] " Daniel Vetter
4 siblings, 0 replies; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-01-16 14:12 UTC (permalink / raw)
To: Intel-gfx
Cc: tursulin, Tvrtko Ursulin, Chris Wilson, linux-kernel,
Joonas Lahtinen
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
With the addition of __sg_alloc_table_from_pages we can control
the maximum coalescing size and eliminate a separate path for
allocating backing store here.
Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.
v2:
* Rename helper to i915_sg_segment_size and fix swiotlb override.
* Commit message update.
v3:
* Actually include the swiotlb override fix.
v4:
* Regroup parameters a bit. (Chris Wilson)
v5:
* Rebase for swiotlb_max_segment.
* Add DMA map failure handling as in abb0deacb5a6
("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").
v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen)
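To summarise the i915_sg_segment_size helper added below, the intended
mapping from swiotlb_max_segment() to the segment cap is (a sketch of the
expected values, assuming 4 KiB pages):

/*
 * swiotlb_max_segment() == 0  ->  SCATTERLIST_MAX_SEGMENT (no swiotlb limit)
 * swiotlb_max_segment() == 1  ->  PAGE_SIZE ("1" means one page)
 * anything else               ->  rounddown(value, PAGE_SIZE)
 */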
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 15 +++++++
drivers/gpu/drm/i915/i915_gem.c | 6 +--
drivers/gpu/drm/i915/i915_gem_userptr.c | 79 ++++++++++++---------------------
3 files changed, 45 insertions(+), 55 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f861418122ef..531f47fae143 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2599,6 +2599,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
(((__iter).curr += PAGE_SIZE) < (__iter).max) || \
((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
+static inline unsigned int i915_sg_segment_size(void)
+{
+ unsigned int size = swiotlb_max_segment();
+
+ if (size == 0)
+ return SCATTERLIST_MAX_SEGMENT;
+
+ size = rounddown(size, PAGE_SIZE);
+ /* swiotlb_max_segment_size can return 1 byte when it means one page. */
+ if (size < PAGE_SIZE)
+ size = PAGE_SIZE;
+
+ return size;
+}
+
static inline const struct intel_device_info *
intel_info(const struct drm_i915_private *dev_priv)
{
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d4c59b53532e..4dca3b52f495 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2254,7 +2254,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
struct sgt_iter sgt_iter;
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
- unsigned int max_segment;
+ unsigned int max_segment = i915_sg_segment_size();
int ret;
gfp_t gfp;
@@ -2265,10 +2265,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
- max_segment = swiotlb_max_segment();
- if (!max_segment)
- max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 6a8fa085b74e..95b62b9c5cd6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,64 +390,42 @@ struct get_pages_work {
struct task_struct *task;
};
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
- struct scatterlist *sg;
- int ret, n;
-
- *st = kmalloc(sizeof(**st), GFP_KERNEL);
- if (*st == NULL)
- return -ENOMEM;
-
- if (swiotlb_active()) {
- ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
- if (ret)
- goto err;
-
- for_each_sg((*st)->sgl, sg, num_pages, n)
- sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
- } else {
- ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
- 0, num_pages << PAGE_SHIFT,
- GFP_KERNEL);
- if (ret)
- goto err;
- }
-
- return 0;
-
-err:
- kfree(*st);
- *st = NULL;
- return ret;
-}
-
static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
- struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+ struct page **pvec, int num_pages)
{
- struct sg_table *pages;
+ unsigned int max_segment = i915_sg_segment_size();
+ struct sg_table *st;
int ret;
- ret = st_set_pages(&pages, pvec, num_pages);
- if (ret)
+ st = kmalloc(sizeof(*st), GFP_KERNEL);
+ if (!st)
+ return ERR_PTR(-ENOMEM);
+
+alloc_table:
+ ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
+ 0, num_pages << PAGE_SHIFT,
+ max_segment,
+ GFP_KERNEL);
+ if (ret) {
+ kfree(st);
return ERR_PTR(ret);
+ }
- ret = i915_gem_gtt_prepare_pages(obj, pages);
+ ret = i915_gem_gtt_prepare_pages(obj, st);
if (ret) {
- sg_free_table(pages);
- kfree(pages);
+ sg_free_table(st);
+
+ if (max_segment > PAGE_SIZE) {
+ max_segment = PAGE_SIZE;
+ goto alloc_table;
+ }
+
+ kfree(st);
return ERR_PTR(ret);
}
- return pages;
+ return st;
}
static int
@@ -531,7 +509,8 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
struct sg_table *pages = ERR_PTR(ret);
if (pinned == npages) {
- pages = __i915_gem_userptr_set_pages(obj, pvec, npages);
+ pages = __i915_gem_userptr_alloc_pages(obj, pvec,
+ npages);
if (!IS_ERR(pages)) {
__i915_gem_object_set_pages(obj, pages);
pinned = 0;
@@ -653,7 +632,7 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
else if (pinned < num_pages)
pages = __i915_gem_userptr_get_pages_schedule(obj, &active);
else
- pages = __i915_gem_userptr_set_pages(obj, pvec, num_pages);
+ pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
if (IS_ERR(pages)) {
__i915_gem_userptr_set_active(obj, active);
release_pages(pvec, pinned, 0);
--
2.7.4
* ✓ Fi.CI.BAT: success for series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
2017-01-16 14:12 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
` (2 preceding siblings ...)
2017-01-16 14:12 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
@ 2017-01-16 14:54 ` Patchwork
2017-01-30 9:44 ` [PATCH 1/4] " Daniel Vetter
4 siblings, 0 replies; 14+ messages in thread
From: Patchwork @ 2017-01-16 14:54 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
URL : https://patchwork.freedesktop.org/series/18062/
State : success
== Summary ==
Series 18062v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/18062/revisions/1/mbox/
fi-bdw-5557u total:246 pass:232 dwarn:0 dfail:0 fail:0 skip:14
fi-bsw-n3050 total:246 pass:207 dwarn:0 dfail:0 fail:0 skip:39
fi-bxt-j4205 total:246 pass:224 dwarn:0 dfail:0 fail:0 skip:22
fi-bxt-t5700 total:82 pass:69 dwarn:0 dfail:0 fail:0 skip:12
fi-byt-j1900 total:246 pass:219 dwarn:0 dfail:0 fail:0 skip:27
fi-byt-n2820 total:246 pass:215 dwarn:0 dfail:0 fail:0 skip:31
fi-hsw-4770 total:246 pass:227 dwarn:0 dfail:0 fail:0 skip:19
fi-hsw-4770r total:246 pass:227 dwarn:0 dfail:0 fail:0 skip:19
fi-ivb-3520m total:246 pass:225 dwarn:0 dfail:0 fail:0 skip:21
fi-ivb-3770 total:246 pass:225 dwarn:0 dfail:0 fail:0 skip:21
fi-kbl-7500u total:246 pass:225 dwarn:0 dfail:0 fail:0 skip:21
fi-skl-6260u total:246 pass:233 dwarn:0 dfail:0 fail:0 skip:13
fi-skl-6700hq total:246 pass:226 dwarn:0 dfail:0 fail:0 skip:20
fi-skl-6700k total:246 pass:222 dwarn:3 dfail:0 fail:0 skip:21
fi-skl-6770hq total:246 pass:233 dwarn:0 dfail:0 fail:0 skip:13
fi-snb-2520m total:246 pass:215 dwarn:0 dfail:0 fail:0 skip:31
fi-snb-2600 total:246 pass:214 dwarn:0 dfail:0 fail:0 skip:32
e0c7007e02b52375d3e5daa1bc4ef2e6d00e1016 drm-tip: 2017y-01m-16d-12h-26m-22s UTC integration manifest
a7efa97 drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
c329eed lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
75a27aa lib/scatterlist: Avoid potential scatterlist entry overflow
2e8bf74 lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
== Logs ==
For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3528/
* Re: [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
2017-01-16 14:12 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
` (3 preceding siblings ...)
2017-01-16 14:54 ` ✓ Fi.CI.BAT: success for series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Patchwork
@ 2017-01-30 9:44 ` Daniel Vetter
4 siblings, 0 replies; 14+ messages in thread
From: Daniel Vetter @ 2017-01-30 9:44 UTC (permalink / raw)
To: Tvrtko Ursulin
Cc: Tomasz Stanislawski, Pawel Osciak, Alexandre Bounine, Intel-gfx,
linux-kernel, Masahiro Yamada, Kyungmin Park, Matt Porter,
Marek Szyprowski, linux-media
Hi all,
Ok if we merge the entire series through drm-intel (likely for 4.12, 4.11
is getting a bit late)? We'd like to use this there, and Mauro already
reviewed the v4l side ...
Thanks, Daniel
On Mon, Jan 16, 2017 at 02:12:07PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Scatterlist entries have an unsigned int for the offset, so
> correct the sg_alloc_table_from_pages function accordingly.
>
> Since these are offsets within a page, unsigned int is
> wide enough.
>
> Also convert callers which were using unsigned long locally,
> adding the lower_32_bits annotation to make it explicitly
> clear what is happening.
>
> v2: Use offset_in_page. (Chris Wilson)
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: Pawel Osciak <pawel@osciak.com>
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Cc: Kyungmin Park <kyungmin.park@samsung.com>
> Cc: Tomasz Stanislawski <t.stanislaws@samsung.com>
> Cc: Matt Porter <mporter@kernel.crashing.org>
> Cc: Alexandre Bounine <alexandre.bounine@idt.com>
> Cc: linux-media@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> (v1)
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
> ---
> drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
> drivers/rapidio/devices/rio_mport_cdev.c | 4 ++--
> include/linux/scatterlist.h | 2 +-
> lib/scatterlist.c | 2 +-
> 4 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> index fb6a177be461..51e8765bc3c6 100644
> --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
> +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
> @@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
> {
> struct vb2_dc_buf *buf;
> struct frame_vector *vec;
> - unsigned long offset;
> + unsigned int offset;
> int n_pages, i;
> int ret = 0;
> struct sg_table *sgt;
> @@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
> buf->dev = dev;
> buf->dma_dir = dma_dir;
>
> - offset = vaddr & ~PAGE_MASK;
> + offset = lower_32_bits(offset_in_page(vaddr));
> vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
> if (IS_ERR(vec)) {
> ret = PTR_ERR(vec);
> diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
> index 9013a585507e..0fae29ff47ba 100644
> --- a/drivers/rapidio/devices/rio_mport_cdev.c
> +++ b/drivers/rapidio/devices/rio_mport_cdev.c
> @@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
> * offset within the internal buffer specified by handle parameter.
> */
> if (xfer->loc_addr) {
> - unsigned long offset;
> + unsigned int offset;
> long pinned;
>
> - offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
> + offset = lower_32_bits(offset_in_page(xfer->loc_addr));
> nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;
>
> page_list = kmalloc_array(nr_pages,
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index cb3c8fe6acd7..c981bee1a3ae 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
> int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
> int sg_alloc_table_from_pages(struct sg_table *sgt,
> struct page **pages, unsigned int n_pages,
> - unsigned long offset, unsigned long size,
> + unsigned int offset, unsigned long size,
> gfp_t gfp_mask);
>
> size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
> diff --git a/lib/scatterlist.c b/lib/scatterlist.c
> index 004fc70fc56a..e05e7fc98892 100644
> --- a/lib/scatterlist.c
> +++ b/lib/scatterlist.c
> @@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
> */
> int sg_alloc_table_from_pages(struct sg_table *sgt,
> struct page **pages, unsigned int n_pages,
> - unsigned long offset, unsigned long size,
> + unsigned int offset, unsigned long size,
> gfp_t gfp_mask)
> {
> unsigned int chunks;
> --
> 2.7.4
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* [PATCH 0/4] Userptr bo slab use optimization
@ 2017-07-27 9:05 Tvrtko Ursulin
2017-07-27 9:05 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
0 siblings, 1 reply; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-07-27 9:05 UTC (permalink / raw)
To: Intel-gfx; +Cc: Ben Widawsky
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Yet another attempt to get this series reviewed and merged...
I've heard Vulkan might be creating a lot of userptr objects, so it might be
interesting to check what benefit it brings to those use cases.
As an introduction, this allows i915 to create fewer sg table entries for the bo
backing store representation. As such it primarily saves kernel slab memory.
When we added this optimisation to normal i915 bos, the savings were, as far as
I remember, around 1-2MiB of slab after booting to a KDE desktop, and 2-4MiB on
the neverball (game) main screen (or maybe it was while playing).
So the thinking is, if Vulkan is indeed using a lot of userptr bos, it should
translate to similar savings there. It is not much, but the motto is that every
little helps.
At the low level the saving will be up to around 32 bytes for each 4KiB of a
userptr bo (1GiB of userptr bos = up to ~8MiB of slab saving), with the actual
number depending on the backing store fragmentation.
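The ~8MiB figure follows directly from the per-page saving; worked out as a
sanity check:

  1 GiB / 4 KiB           = 262144 pages
  262144 pages * 32 bytes = 8388608 bytes = 8 MiB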
Tvrtko Ursulin (4):
lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
lib/scatterlist: Avoid potential scatterlist entry overflow
lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
drivers/gpu/drm/i915/i915_drv.h | 15 +++++
drivers/gpu/drm/i915/i915_gem.c | 6 +-
drivers/gpu/drm/i915/i915_gem_userptr.c | 79 +++++++++--------------
drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 +-
drivers/rapidio/devices/rio_mport_cdev.c | 4 +-
include/linux/scatterlist.h | 17 +++--
lib/scatterlist.c | 87 +++++++++++++++++++-------
7 files changed, 126 insertions(+), 86 deletions(-)
--
2.9.4
* [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
2017-07-27 9:05 [PATCH 0/4] Userptr bo slab use optimization Tvrtko Ursulin
@ 2017-07-27 9:05 ` Tvrtko Ursulin
2017-07-28 11:06 ` Chris Wilson
0 siblings, 1 reply; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-07-27 9:05 UTC (permalink / raw)
To: Intel-gfx; +Cc: linux-kernel, Ben Widawsky
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
With the addition of __sg_alloc_table_from_pages we can control
the maximum coalescing size and eliminate a separate path for
allocating backing store here.
Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.
v2:
* Rename helper to i915_sg_segment_size and fix swiotlb override.
* Commit message update.
v3:
* Actually include the swiotlb override fix.
v4:
* Regroup parameters a bit. (Chris Wilson)
v5:
* Rebase for swiotlb_max_segment.
* Add DMA map failure handling as in abb0deacb5a6
("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").
v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen)
v7: Rebase.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 15 +++++++
drivers/gpu/drm/i915/i915_gem.c | 6 +--
drivers/gpu/drm/i915/i915_gem_userptr.c | 79 ++++++++++++---------------------
3 files changed, 45 insertions(+), 55 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2c7456f4ed38..6383940e8d55 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2749,6 +2749,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
(((__iter).curr += PAGE_SIZE) < (__iter).max) || \
((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
+static inline unsigned int i915_sg_segment_size(void)
+{
+ unsigned int size = swiotlb_max_segment();
+
+ if (size == 0)
+ return SCATTERLIST_MAX_SEGMENT;
+
+ size = rounddown(size, PAGE_SIZE);
+ /* swiotlb_max_segment_size can return 1 byte when it means one page. */
+ if (size < PAGE_SIZE)
+ size = PAGE_SIZE;
+
+ return size;
+}
+
static inline const struct intel_device_info *
intel_info(const struct drm_i915_private *dev_priv)
{
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6faabf34f142..a60885d6231b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2339,7 +2339,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
struct sgt_iter sgt_iter;
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
- unsigned int max_segment;
+ unsigned int max_segment = i915_sg_segment_size();
gfp_t noreclaim;
int ret;
@@ -2350,10 +2350,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
- max_segment = swiotlb_max_segment();
- if (!max_segment)
- max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index ccd09e8419f5..60c10d4118ae 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -399,64 +399,42 @@ struct get_pages_work {
struct task_struct *task;
};
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
- struct scatterlist *sg;
- int ret, n;
-
- *st = kmalloc(sizeof(**st), GFP_KERNEL);
- if (*st == NULL)
- return -ENOMEM;
-
- if (swiotlb_active()) {
- ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
- if (ret)
- goto err;
-
- for_each_sg((*st)->sgl, sg, num_pages, n)
- sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
- } else {
- ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
- 0, num_pages << PAGE_SHIFT,
- GFP_KERNEL);
- if (ret)
- goto err;
- }
-
- return 0;
-
-err:
- kfree(*st);
- *st = NULL;
- return ret;
-}
-
static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
- struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+ struct page **pvec, int num_pages)
{
- struct sg_table *pages;
+ unsigned int max_segment = i915_sg_segment_size();
+ struct sg_table *st;
int ret;
- ret = st_set_pages(&pages, pvec, num_pages);
- if (ret)
+ st = kmalloc(sizeof(*st), GFP_KERNEL);
+ if (!st)
+ return ERR_PTR(-ENOMEM);
+
+alloc_table:
+ ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
+ 0, num_pages << PAGE_SHIFT,
+ max_segment,
+ GFP_KERNEL);
+ if (ret) {
+ kfree(st);
return ERR_PTR(ret);
+ }
- ret = i915_gem_gtt_prepare_pages(obj, pages);
+ ret = i915_gem_gtt_prepare_pages(obj, st);
if (ret) {
- sg_free_table(pages);
- kfree(pages);
+ sg_free_table(st);
+
+ if (max_segment > PAGE_SIZE) {
+ max_segment = PAGE_SIZE;
+ goto alloc_table;
+ }
+
+ kfree(st);
return ERR_PTR(ret);
}
- return pages;
+ return st;
}
static int
@@ -540,7 +518,8 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
struct sg_table *pages = ERR_PTR(ret);
if (pinned == npages) {
- pages = __i915_gem_userptr_set_pages(obj, pvec, npages);
+ pages = __i915_gem_userptr_alloc_pages(obj, pvec,
+ npages);
if (!IS_ERR(pages)) {
__i915_gem_object_set_pages(obj, pages);
pinned = 0;
@@ -661,7 +640,7 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
pages = __i915_gem_userptr_get_pages_schedule(obj);
active = pages == ERR_PTR(-EAGAIN);
} else {
- pages = __i915_gem_userptr_set_pages(obj, pvec, num_pages);
+ pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
active = !IS_ERR(pages);
}
if (active)
--
2.9.4
* Re: [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
2017-07-27 9:05 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
@ 2017-07-28 11:06 ` Chris Wilson
0 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2017-07-28 11:06 UTC (permalink / raw)
To: Tvrtko Ursulin, Intel-gfx
Cc: Ben Widawsky, Jason Ekstrand, Tvrtko Ursulin, linux-kernel,
Joonas Lahtinen
Quoting Tvrtko Ursulin (2017-07-27 10:05:04)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> With the addition of __sg_alloc_table_from_pages we can control
> the maximum coalescing size and eliminate a separate path for
> allocating backing store here.
>
> Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
> SWIOTLB max segment size") this enables more compact sg lists to
> be created and so has a beneficial effect on workloads with many
> and/or large objects of this class.
>
> v2:
> * Rename helper to i915_sg_segment_size and fix swiotlb override.
> * Commit message update.
>
> v3:
> * Actually include the swiotlb override fix.
>
> v4:
> * Regroup parameters a bit. (Chris Wilson)
>
> v5:
> * Rebase for swiotlb_max_segment.
> * Add DMA map failure handling as in abb0deacb5a6
> ("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").
>
> v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen)
>
> v7: Rebase.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 15 +++++++
> drivers/gpu/drm/i915/i915_gem.c | 6 +--
> drivers/gpu/drm/i915/i915_gem_userptr.c | 79 ++++++++++++---------------------
> 3 files changed, 45 insertions(+), 55 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2c7456f4ed38..6383940e8d55 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2749,6 +2749,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
> (((__iter).curr += PAGE_SIZE) < (__iter).max) || \
> ((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
>
> +static inline unsigned int i915_sg_segment_size(void)
> +{
> + unsigned int size = swiotlb_max_segment();
> +
> + if (size == 0)
> + return SCATTERLIST_MAX_SEGMENT;
> +
> + size = rounddown(size, PAGE_SIZE);
Looks like swiotlb_max_segment() is always page aligned when not 1.
And it returns bytes, ok.
Given that you are using a power of two, you can use round_down().
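(For reference, a sketch of the difference, assuming the usual kernel
definitions: rounddown() uses a modulo and works for any divisor, while
round_down() masks and therefore needs a power-of-two divisor.)

/*
 * rounddown(x, y):  x - (x % y)    -- any y
 * round_down(x, y): x & ~(y - 1)   -- y must be a power of two
 *
 * e.g. rounddown(10000, 4096) == round_down(10000, 4096) == 8192
 */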
> + /* swiotlb_max_segment_size can return 1 byte when it means one page. */
> + if (size < PAGE_SIZE)
> + size = PAGE_SIZE;
> +
> + return size;
> +}
> +
> static inline const struct intel_device_info *
> intel_info(const struct drm_i915_private *dev_priv)
> {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 6faabf34f142..a60885d6231b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2339,7 +2339,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
> struct sgt_iter sgt_iter;
> struct page *page;
> unsigned long last_pfn = 0; /* suppress gcc warning */
> - unsigned int max_segment;
> + unsigned int max_segment = i915_sg_segment_size();
> gfp_t noreclaim;
> int ret;
>
> @@ -2350,10 +2350,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
> GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
> GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
>
> - max_segment = swiotlb_max_segment();
> - if (!max_segment)
> - max_segment = rounddown(UINT_MAX, PAGE_SIZE);
Conversion to i915_sg_segment_size(), ok.
> -
> st = kmalloc(sizeof(*st), GFP_KERNEL);
> if (st == NULL)
> return ERR_PTR(-ENOMEM);
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index ccd09e8419f5..60c10d4118ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -399,64 +399,42 @@ struct get_pages_work {
> struct task_struct *task;
> };
>
> -#if IS_ENABLED(CONFIG_SWIOTLB)
> -#define swiotlb_active() swiotlb_nr_tbl()
> -#else
> -#define swiotlb_active() 0
> -#endif
Converted to i915_sg_segment_size(), nice.
> -static int
> -st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
> -{
> - struct scatterlist *sg;
> - int ret, n;
> -
> - *st = kmalloc(sizeof(**st), GFP_KERNEL);
> - if (*st == NULL)
> - return -ENOMEM;
> -
> - if (swiotlb_active()) {
> - ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
> - if (ret)
> - goto err;
> -
> - for_each_sg((*st)->sgl, sg, num_pages, n)
> - sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
> - } else {
> - ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
> - 0, num_pages << PAGE_SHIFT,
> - GFP_KERNEL);
> - if (ret)
> - goto err;
> - }
> -
> - return 0;
> -
> -err:
> - kfree(*st);
> - *st = NULL;
> - return ret;
> -}
> -
> static struct sg_table *
> -__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
> - struct page **pvec, int num_pages)
> +__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
> + struct page **pvec, int num_pages)
> {
> - struct sg_table *pages;
> + unsigned int max_segment = i915_sg_segment_size();
> + struct sg_table *st;
> int ret;
>
> - ret = st_set_pages(&pages, pvec, num_pages);
> - if (ret)
> + st = kmalloc(sizeof(*st), GFP_KERNEL);
> + if (!st)
> + return ERR_PTR(-ENOMEM);
> +
> +alloc_table:
> + ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
> + 0, num_pages << PAGE_SHIFT,
> + max_segment,
> + GFP_KERNEL);
> + if (ret) {
> + kfree(st);
> return ERR_PTR(ret);
> + }
>
> - ret = i915_gem_gtt_prepare_pages(obj, pages);
> + ret = i915_gem_gtt_prepare_pages(obj, st);
> if (ret) {
> - sg_free_table(pages);
> - kfree(pages);
> + sg_free_table(st);
> +
> + if (max_segment > PAGE_SIZE) {
> + max_segment = PAGE_SIZE;
> + goto alloc_table;
> + }
> +
> + kfree(st);
> return ERR_PTR(ret);
> }
Much neater.
>
> - return pages;
> + return st;
> }
>
> static int
> @@ -540,7 +518,8 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
> struct sg_table *pages = ERR_PTR(ret);
>
> if (pinned == npages) {
> - pages = __i915_gem_userptr_set_pages(obj, pvec, npages);
> + pages = __i915_gem_userptr_alloc_pages(obj, pvec,
> + npages);
> if (!IS_ERR(pages)) {
> __i915_gem_object_set_pages(obj, pages);
> pinned = 0;
> @@ -661,7 +640,7 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
> pages = __i915_gem_userptr_get_pages_schedule(obj);
> active = pages == ERR_PTR(-EAGAIN);
> } else {
> - pages = __i915_gem_userptr_set_pages(obj, pvec, num_pages);
> + pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
> active = !IS_ERR(pages);
> }
> if (active)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
* [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
@ 2017-05-04 15:54 Tvrtko Ursulin
2017-05-04 15:54 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
0 siblings, 1 reply; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-05-04 15:54 UTC (permalink / raw)
To: Intel-gfx
Cc: Tomasz Stanislawski, Pawel Osciak, linux-kernel, Masahiro Yamada,
Kyungmin Park, Matt Porter, linux-media, Alexandre Bounine,
Marek Szyprowski
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Scatterlist entries have an unsigned int for the offset, so
correct the sg_alloc_table_from_pages function accordingly.
Since these are offsets within a page, unsigned int is
wide enough.
Also convert callers which were using unsigned long locally,
adding the lower_32_bits annotation to make it explicitly
clear what is happening.
v2: Use offset_in_page. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Tomasz Stanislawski <t.stanislaws@samsung.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: linux-media@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> (v1)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
---
drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
drivers/rapidio/devices/rio_mport_cdev.c | 4 ++--
include/linux/scatterlist.h | 2 +-
lib/scatterlist.c | 2 +-
4 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 2db0413f5d57..b5009c1649bc 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
{
struct vb2_dc_buf *buf;
struct frame_vector *vec;
- unsigned long offset;
+ unsigned int offset;
int n_pages, i;
int ret = 0;
struct sg_table *sgt;
@@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
buf->dev = dev;
buf->dma_dir = dma_dir;
- offset = vaddr & ~PAGE_MASK;
+ offset = lower_32_bits(offset_in_page(vaddr));
vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
if (IS_ERR(vec)) {
ret = PTR_ERR(vec);
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
index 50b617af81bd..a8b6696ab6cb 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
* offset within the internal buffer specified by handle parameter.
*/
if (xfer->loc_addr) {
- unsigned long offset;
+ unsigned int offset;
long pinned;
- offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
+ offset = lower_32_bits(offset_in_page(xfer->loc_addr));
nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;
page_list = kmalloc_array(nr_pages,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe6acd7..c981bee1a3ae 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
- unsigned long offset, unsigned long size,
+ unsigned int offset, unsigned long size,
gfp_t gfp_mask);
size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index c6cf82242d65..11f172c383cb 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
*/
int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
- unsigned long offset, unsigned long size,
+ unsigned int offset, unsigned long size,
gfp_t gfp_mask)
{
unsigned int chunks;
--
2.9.3
* [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
2017-05-04 15:54 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
@ 2017-05-04 15:54 ` Tvrtko Ursulin
0 siblings, 0 replies; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-05-04 15:54 UTC (permalink / raw)
To: Intel-gfx; +Cc: linux-kernel
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
With the addition of __sg_alloc_table_from_pages we can control
the maximum coalescing size and eliminate a separate path for
allocating backing store here.
Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.
v2:
* Rename helper to i915_sg_segment_size and fix swiotlb override.
* Commit message update.
v3:
* Actually include the swiotlb override fix.
v4:
* Regroup parameters a bit. (Chris Wilson)
v5:
* Rebase for swiotlb_max_segment.
* Add DMA map failure handling as in abb0deacb5a6
("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").
v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen)
v7: Rebase.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 15 +++++++
drivers/gpu/drm/i915/i915_gem.c | 6 +--
drivers/gpu/drm/i915/i915_gem_userptr.c | 79 ++++++++++++---------------------
3 files changed, 45 insertions(+), 55 deletions(-)
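In outline, the new userptr path in the diff below allocates the table with
the largest segment size i915_sg_segment_size() allows and, if DMA/GTT
preparation then fails, frees the table and retries with single-page
segments. A condensed sketch of that flow (error handling trimmed to the
essentials, mirroring the diff rather than adding to it):

	unsigned int max_segment = i915_sg_segment_size();
	struct sg_table *st;
	int ret;

	st = kmalloc(sizeof(*st), GFP_KERNEL);
	if (!st)
		return ERR_PTR(-ENOMEM);

alloc_table:
	ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
					  0, num_pages << PAGE_SHIFT,
					  max_segment, GFP_KERNEL);
	if (ret) {
		kfree(st);
		return ERR_PTR(ret);
	}

	ret = i915_gem_gtt_prepare_pages(obj, st);
	if (ret) {
		sg_free_table(st);
		/* DMA remapping can fail for large segments; fall back to
		 * single PAGE_SIZE segments before giving up. */
		if (max_segment > PAGE_SIZE) {
			max_segment = PAGE_SIZE;
			goto alloc_table;
		}
		kfree(st);
		return ERR_PTR(ret);
	}

	return st;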
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b20ed16da0ad..320c16df1c9c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2676,6 +2676,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
(((__iter).curr += PAGE_SIZE) < (__iter).max) || \
((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
+static inline unsigned int i915_sg_segment_size(void)
+{
+ unsigned int size = swiotlb_max_segment();
+
+ if (size == 0)
+ return SCATTERLIST_MAX_SEGMENT;
+
+ size = rounddown(size, PAGE_SIZE);
+ /* swiotlb_max_segment_size can return 1 byte when it means one page. */
+ if (size < PAGE_SIZE)
+ size = PAGE_SIZE;
+
+ return size;
+}
+
static inline const struct intel_device_info *
intel_info(const struct drm_i915_private *dev_priv)
{
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f9c6b9b5002c..b2727905ef2b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2336,7 +2336,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
struct sgt_iter sgt_iter;
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
- unsigned int max_segment;
+ unsigned int max_segment = i915_sg_segment_size();
int ret;
gfp_t gfp;
@@ -2347,10 +2347,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
- max_segment = swiotlb_max_segment();
- if (!max_segment)
- max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 58ccf8b8ca1c..d003076702ad 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -399,64 +399,42 @@ struct get_pages_work {
struct task_struct *task;
};
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
- struct scatterlist *sg;
- int ret, n;
-
- *st = kmalloc(sizeof(**st), GFP_KERNEL);
- if (*st == NULL)
- return -ENOMEM;
-
- if (swiotlb_active()) {
- ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
- if (ret)
- goto err;
-
- for_each_sg((*st)->sgl, sg, num_pages, n)
- sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
- } else {
- ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
- 0, num_pages << PAGE_SHIFT,
- GFP_KERNEL);
- if (ret)
- goto err;
- }
-
- return 0;
-
-err:
- kfree(*st);
- *st = NULL;
- return ret;
-}
-
static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
- struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+ struct page **pvec, int num_pages)
{
- struct sg_table *pages;
+ unsigned int max_segment = i915_sg_segment_size();
+ struct sg_table *st;
int ret;
- ret = st_set_pages(&pages, pvec, num_pages);
- if (ret)
+ st = kmalloc(sizeof(*st), GFP_KERNEL);
+ if (!st)
+ return ERR_PTR(-ENOMEM);
+
+alloc_table:
+ ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
+ 0, num_pages << PAGE_SHIFT,
+ max_segment,
+ GFP_KERNEL);
+ if (ret) {
+ kfree(st);
return ERR_PTR(ret);
+ }
- ret = i915_gem_gtt_prepare_pages(obj, pages);
+ ret = i915_gem_gtt_prepare_pages(obj, st);
if (ret) {
- sg_free_table(pages);
- kfree(pages);
+ sg_free_table(st);
+
+ if (max_segment > PAGE_SIZE) {
+ max_segment = PAGE_SIZE;
+ goto alloc_table;
+ }
+
+ kfree(st);
return ERR_PTR(ret);
}
- return pages;
+ return st;
}
static int
@@ -540,7 +518,8 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
struct sg_table *pages = ERR_PTR(ret);
if (pinned == npages) {
- pages = __i915_gem_userptr_set_pages(obj, pvec, npages);
+ pages = __i915_gem_userptr_alloc_pages(obj, pvec,
+ npages);
if (!IS_ERR(pages)) {
__i915_gem_object_set_pages(obj, pages);
pinned = 0;
@@ -661,7 +640,7 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
pages = __i915_gem_userptr_get_pages_schedule(obj);
active = pages == ERR_PTR(-EAGAIN);
} else {
- pages = __i915_gem_userptr_set_pages(obj, pvec, num_pages);
+ pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
active = !IS_ERR(pages);
}
if (active)
--
2.9.3
* [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
@ 2017-01-11 9:00 Tvrtko Ursulin
2017-01-11 9:00 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
0 siblings, 1 reply; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-01-11 9:00 UTC (permalink / raw)
To: Intel-gfx
Cc: tursulin, Tvrtko Ursulin, Masahiro Yamada, Pawel Osciak,
Marek Szyprowski, Kyungmin Park, Tomasz Stanislawski, Matt Porter,
Alexandre Bounine, linux-media, linux-kernel
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Scatterlist entries have an unsigned int for the offset so
correct the sg_alloc_table_from_pages function accordingly.
Since these are offsets within a page, unsigned int is
wide enough.
Also convert callers which were using unsigned long locally,
adding the lower_32_bits annotation to make it explicitly
clear what is happening.
v2: Use offset_in_page. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Pawel Osciak <pawel@osciak.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Tomasz Stanislawski <t.stanislaws@samsung.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: linux-media@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> (v1)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
---
drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
drivers/rapidio/devices/rio_mport_cdev.c | 4 ++--
include/linux/scatterlist.h | 2 +-
lib/scatterlist.c | 2 +-
4 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index fb6a177be461..51e8765bc3c6 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
{
struct vb2_dc_buf *buf;
struct frame_vector *vec;
- unsigned long offset;
+ unsigned int offset;
int n_pages, i;
int ret = 0;
struct sg_table *sgt;
@@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
buf->dev = dev;
buf->dma_dir = dma_dir;
- offset = vaddr & ~PAGE_MASK;
+ offset = lower_32_bits(offset_in_page(vaddr));
vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
if (IS_ERR(vec)) {
ret = PTR_ERR(vec);
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
index 9013a585507e..0fae29ff47ba 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
* offset within the internal buffer specified by handle parameter.
*/
if (xfer->loc_addr) {
- unsigned long offset;
+ unsigned int offset;
long pinned;
- offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
+ offset = lower_32_bits(offset_in_page(xfer->loc_addr));
nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;
page_list = kmalloc_array(nr_pages,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe6acd7..c981bee1a3ae 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
- unsigned long offset, unsigned long size,
+ unsigned int offset, unsigned long size,
gfp_t gfp_mask);
size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 004fc70fc56a..e05e7fc98892 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
*/
int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
- unsigned long offset, unsigned long size,
+ unsigned int offset, unsigned long size,
gfp_t gfp_mask)
{
unsigned int chunks;
--
2.7.4
* [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
2017-01-11 9:00 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
@ 2017-01-11 9:00 ` Tvrtko Ursulin
0 siblings, 0 replies; 14+ messages in thread
From: Tvrtko Ursulin @ 2017-01-11 9:00 UTC (permalink / raw)
To: Intel-gfx; +Cc: linux-kernel
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
With the addition of __sg_alloc_table_from_pages we can control
the maximum coalescing size and eliminate a separate path for
allocating backing store here.
Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.
v2:
* Rename helper to i915_sg_segment_size and fix swiotlb override.
* Commit message update.
v3:
* Actually include the swiotlb override fix.
v4:
* Regroup parameters a bit. (Chris Wilson)
v5:
* Rebase for swiotlb_max_segment.
* Add DMA map failure handling as in abb0deacb5a6
("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
---
drivers/gpu/drm/i915/i915_drv.h | 10 +++++
drivers/gpu/drm/i915/i915_gem.c | 6 +--
drivers/gpu/drm/i915/i915_gem_userptr.c | 79 ++++++++++++---------------------
3 files changed, 40 insertions(+), 55 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2b325032fedc..a944ff0c5c68 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2594,6 +2594,16 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
(((__iter).curr += PAGE_SIZE) < (__iter).max) || \
((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
+static inline unsigned int i915_sg_segment_size(void)
+{
+ unsigned int size = swiotlb_max_segment();
+
+ if (size == 0)
+ size = UINT_MAX;
+
+ return rounddown(size, PAGE_SIZE);
+}
+
static inline const struct intel_device_info *
intel_info(const struct drm_i915_private *dev_priv)
{
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 13c02015709c..421827069a2f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2248,7 +2248,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
struct sgt_iter sgt_iter;
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
- unsigned int max_segment;
+ unsigned int max_segment = i915_sg_segment_size();
int ret;
gfp_t gfp;
@@ -2259,10 +2259,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
- max_segment = swiotlb_max_segment();
- if (!max_segment)
- max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 6a8fa085b74e..95b62b9c5cd6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,64 +390,42 @@ struct get_pages_work {
struct task_struct *task;
};
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
- struct scatterlist *sg;
- int ret, n;
-
- *st = kmalloc(sizeof(**st), GFP_KERNEL);
- if (*st == NULL)
- return -ENOMEM;
-
- if (swiotlb_active()) {
- ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
- if (ret)
- goto err;
-
- for_each_sg((*st)->sgl, sg, num_pages, n)
- sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
- } else {
- ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
- 0, num_pages << PAGE_SHIFT,
- GFP_KERNEL);
- if (ret)
- goto err;
- }
-
- return 0;
-
-err:
- kfree(*st);
- *st = NULL;
- return ret;
-}
-
static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
- struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+ struct page **pvec, int num_pages)
{
- struct sg_table *pages;
+ unsigned int max_segment = i915_sg_segment_size();
+ struct sg_table *st;
int ret;
- ret = st_set_pages(&pages, pvec, num_pages);
- if (ret)
+ st = kmalloc(sizeof(*st), GFP_KERNEL);
+ if (!st)
+ return ERR_PTR(-ENOMEM);
+
+alloc_table:
+ ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
+ 0, num_pages << PAGE_SHIFT,
+ max_segment,
+ GFP_KERNEL);
+ if (ret) {
+ kfree(st);
return ERR_PTR(ret);
+ }
- ret = i915_gem_gtt_prepare_pages(obj, pages);
+ ret = i915_gem_gtt_prepare_pages(obj, st);
if (ret) {
- sg_free_table(pages);
- kfree(pages);
+ sg_free_table(st);
+
+ if (max_segment > PAGE_SIZE) {
+ max_segment = PAGE_SIZE;
+ goto alloc_table;
+ }
+
+ kfree(st);
return ERR_PTR(ret);
}
- return pages;
+ return st;
}
static int
@@ -531,7 +509,8 @@ __i915_gem_userptr_get_pages_worker(struct work_struct *_work)
struct sg_table *pages = ERR_PTR(ret);
if (pinned == npages) {
- pages = __i915_gem_userptr_set_pages(obj, pvec, npages);
+ pages = __i915_gem_userptr_alloc_pages(obj, pvec,
+ npages);
if (!IS_ERR(pages)) {
__i915_gem_object_set_pages(obj, pages);
pinned = 0;
@@ -653,7 +632,7 @@ i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
else if (pinned < num_pages)
pages = __i915_gem_userptr_get_pages_schedule(obj, &active);
else
- pages = __i915_gem_userptr_set_pages(obj, pvec, num_pages);
+ pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
if (IS_ERR(pages)) {
__i915_gem_userptr_set_active(obj, active);
release_pages(pvec, pinned, 0);
--
2.7.4
* [PATCH 0/4] Compact userptr object backing store allocation
@ 2016-11-11 8:50 Tvrtko Ursulin
2016-11-11 8:50 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
0 siblings, 1 reply; 14+ messages in thread
From: Tvrtko Ursulin @ 2016-11-11 8:50 UTC (permalink / raw)
To: Intel-gfx; +Cc: linux-kernel
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Userptr backing store with SWIOTLB active is currently allocated in the same
inefficient manner, with one sg entry per object page, that commit
871dfbd67d4e ("drm/i915: Allow compaction upto SWIOTLB max segment size") fixed
for regular GEM objects.
We can fix that by adding a new __sg_alloc_table_from_pages core function which
allows us to control the maximum desired coalesced segment size.
Other than that the series starts with two simple fixes to
sg_alloc_table_from_pages which deal with incorrect data type usage and a
theoretical overflow condition. Fixing the latter enables easy addition of the
above mentioned __sg_alloc_table_from_pages.
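As a rough sketch of what changes for the userptr path — assuming a pinned
page vector pvec of num_pages pages, and using the argument order the helper
settles on in later revisions of the series:

	#include <linux/scatterlist.h>

	/*
	 * Sketch only. Previously, with SWIOTLB active, the userptr path
	 * called sg_alloc_table() and then sg_set_page() once per page,
	 * i.e. one segment per PAGE_SIZE. With the new helper the same
	 * pages are coalesced into segments no larger than max_segment.
	 */
	static int userptr_sg_sketch(struct sg_table *st, struct page **pvec,
				     unsigned int num_pages,
				     unsigned int max_segment)
	{
		return __sg_alloc_table_from_pages(st, pvec, num_pages,
						   0, num_pages << PAGE_SHIFT,
						   max_segment, GFP_KERNEL);
	}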
Tvrtko Ursulin (4):
lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
lib/scatterlist: Avoid potential scatterlist entry overflow
lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
drivers/gpu/drm/i915/i915_drv.h | 9 +++
drivers/gpu/drm/i915/i915_gem.c | 15 +----
drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++-------
drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 +-
drivers/rapidio/devices/rio_mport_cdev.c | 4 +-
include/linux/scatterlist.h | 11 ++--
lib/scatterlist.c | 78 ++++++++++++++++++++------
7 files changed, 87 insertions(+), 62 deletions(-)
--
2.7.4
* [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
2016-11-11 8:50 [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
@ 2016-11-11 8:50 ` Tvrtko Ursulin
2016-11-11 10:23 ` Chris Wilson
0 siblings, 1 reply; 14+ messages in thread
From: Tvrtko Ursulin @ 2016-11-11 8:50 UTC (permalink / raw)
To: Intel-gfx; +Cc: linux-kernel
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
With the addition of __sg_alloc_table_from_pages we can control
the maximum coalescing size and eliminate a separate path for
allocating backing store here.
Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.
v2:
* Rename helper to i915_sg_segment_size and fix swiotlb override.
* Commit message update.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_drv.h | 9 +++++++++
drivers/gpu/drm/i915/i915_gem.c | 15 +--------------
drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++++++----------------------
3 files changed, 16 insertions(+), 36 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 30777dee3f9c..319f8def0f86 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4175,4 +4175,13 @@ int remap_io_mapping(struct vm_area_struct *vma,
__T; \
})
+static inline unsigned int i915_sg_segment_size(void)
+{
+#if IS_ENABLED(CONFIG_SWIOTLB)
+ return swiotlb_nr_tbl() << IO_TLB_SHIFT;
+#else
+ return UINT_MAX;
+#endif
+}
+
#endif
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c20edba7f2a..cb4c188a395c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2223,15 +2223,6 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
mutex_unlock(&obj->mm.lock);
}
-static unsigned int swiotlb_max_size(void)
-{
-#if IS_ENABLED(CONFIG_SWIOTLB)
- return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
-#else
- return 0;
-#endif
-}
-
static void i915_sg_trim(struct sg_table *orig_st)
{
struct sg_table new_st;
@@ -2267,7 +2258,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
struct sgt_iter sgt_iter;
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
- unsigned int max_segment;
+ unsigned int max_segment = rounddown(i915_sg_segment_size(), PAGE_SIZE);
int ret;
gfp_t gfp;
@@ -2278,10 +2269,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
- max_segment = swiotlb_max_size();
- if (!max_segment)
- max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 64261639f547..b4461f1832a6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,36 +390,20 @@ struct get_pages_work {
struct task_struct *task;
};
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
static int
st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
{
- struct scatterlist *sg;
- int ret, n;
+ int ret;
*st = kmalloc(sizeof(**st), GFP_KERNEL);
if (*st == NULL)
return -ENOMEM;
- if (swiotlb_active()) {
- ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
- if (ret)
- goto err;
-
- for_each_sg((*st)->sgl, sg, num_pages, n)
- sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
- } else {
- ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
- 0, num_pages << PAGE_SHIFT,
- GFP_KERNEL);
- if (ret)
- goto err;
- }
+ ret = __sg_alloc_table_from_pages(*st, pvec, num_pages, 0,
+ num_pages << PAGE_SHIFT,
+ GFP_KERNEL, i915_sg_segment_size());
+ if (ret)
+ goto err;
return 0;
--
2.7.4
* Re: [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
2016-11-11 8:50 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
@ 2016-11-11 10:23 ` Chris Wilson
0 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2016-11-11 10:23 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: Intel-gfx, linux-kernel
On Fri, Nov 11, 2016 at 08:50:20AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> With the addition of __sg_alloc_table_from_pages we can control
> the maximum coalescing size and eliminate a separate path for
> allocating backing store here.
>
> Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
> SWIOTLB max segment size") this enables more compact sg lists to
> be created and so has a beneficial effect on workloads with many
> and/or large objects of this class.
>
> v2:
> * Rename helper to i915_sg_segment_size and fix swiotlb override.
> * Commit message update.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 9 +++++++++
> drivers/gpu/drm/i915/i915_gem.c | 15 +--------------
> drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++++++----------------------
> 3 files changed, 16 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 30777dee3f9c..319f8def0f86 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -4175,4 +4175,13 @@ int remap_io_mapping(struct vm_area_struct *vma,
> __T; \
> })
>
> +static inline unsigned int i915_sg_segment_size(void)
> +{
> +#if IS_ENABLED(CONFIG_SWIOTLB)
> + return swiotlb_nr_tbl() << IO_TLB_SHIFT;
> +#else
> + return UINT_MAX;
> +#endif
> +}
> +
> #endif
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1c20edba7f2a..cb4c188a395c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2223,15 +2223,6 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> mutex_unlock(&obj->mm.lock);
> }
>
> -static unsigned int swiotlb_max_size(void)
> -{
> -#if IS_ENABLED(CONFIG_SWIOTLB)
> - return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
> -#else
> - return 0;
> -#endif
> -}
> -
> static void i915_sg_trim(struct sg_table *orig_st)
> {
> struct sg_table new_st;
> @@ -2267,7 +2258,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
> struct sgt_iter sgt_iter;
> struct page *page;
> unsigned long last_pfn = 0; /* suppress gcc warning */
> - unsigned int max_segment;
> + unsigned int max_segment = rounddown(i915_sg_segment_size(), PAGE_SIZE);
> int ret;
> gfp_t gfp;
>
> @@ -2278,10 +2269,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
> GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
> GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
>
> - max_segment = swiotlb_max_size();
> - if (!max_segment)
> - max_segment = rounddown(UINT_MAX, PAGE_SIZE);
> -
> st = kmalloc(sizeof(*st), GFP_KERNEL);
> if (st == NULL)
> return ERR_PTR(-ENOMEM);
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 64261639f547..b4461f1832a6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -390,36 +390,20 @@ struct get_pages_work {
> struct task_struct *task;
> };
>
> -#if IS_ENABLED(CONFIG_SWIOTLB)
> -#define swiotlb_active() swiotlb_nr_tbl()
> -#else
> -#define swiotlb_active() 0
> -#endif
> -
> static int
> st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
> {
> - struct scatterlist *sg;
> - int ret, n;
> + int ret;
>
> *st = kmalloc(sizeof(**st), GFP_KERNEL);
> if (*st == NULL)
> return -ENOMEM;
>
> - if (swiotlb_active()) {
> - ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
> - if (ret)
> - goto err;
> -
> - for_each_sg((*st)->sgl, sg, num_pages, n)
> - sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
> - } else {
> - ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
> - 0, num_pages << PAGE_SHIFT,
> - GFP_KERNEL);
> - if (ret)
> - goto err;
> - }
> + ret = __sg_alloc_table_from_pages(*st, pvec, num_pages, 0,
> + num_pages << PAGE_SHIFT,
Petty:
	ret = __sg_alloc_table_from_pages(*st, pvec, num_pages,      /* pvec + num_pages are paired */
					  0, num_pages << PAGE_SHIFT, /* offset + size are paired */
					  i915_sg_segment_size(),
					  GFP_KERNEL);
And for some reason I always like to see gfp_t last.
Otherwise looks good,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
Thread overview: 14+ messages
2017-01-16 14:12 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
2017-01-16 14:12 ` [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow Tvrtko Ursulin
2017-03-07 8:58 ` Tvrtko Ursulin
2017-03-07 10:16 ` Tvrtko Ursulin
2017-01-16 14:12 ` [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages Tvrtko Ursulin
2017-01-16 14:12 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
2017-01-16 14:54 ` ✓ Fi.CI.BAT: success for series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Patchwork
2017-01-30 9:44 ` [PATCH 1/4] " Daniel Vetter
-- strict thread matches above, loose matches on Subject: below --
2017-07-27 9:05 [PATCH 0/4] Userptr bo slab use optimization Tvrtko Ursulin
2017-07-27 9:05 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
2017-07-28 11:06 ` Chris Wilson
2017-05-04 15:54 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
2017-05-04 15:54 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
2017-01-11 9:00 [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages Tvrtko Ursulin
2017-01-11 9:00 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
2016-11-11 8:50 [PATCH 0/4] Compact userptr object backing store allocation Tvrtko Ursulin
2016-11-11 8:50 ` [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations Tvrtko Ursulin
2016-11-11 10:23 ` Chris Wilson