linux-sh.vger.kernel.org archive mirror
* [PATCH] sh: Remove memset from coherent memory allocator (for
@ 2010-07-29 11:50 Andrew Murray
  2010-07-29 13:13 ` [PATCH] sh: Remove memset from coherent memory allocator (for comments) Matt Fleming
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Andrew Murray @ 2010-07-29 11:50 UTC (permalink / raw)
  To: linux-sh

From: Andrew Murray <amurray@mpc-data.co.uk>

This patch removes unnecessary memset in coherent memory allocator.

Signed-off-by: Andrew Murray <amurray@mpc-data.co.uk>
---
--- linux-2.6-old/arch/sh/mm/consistent.c       2010-07-25
05:01:33.813493496 +0100
+++ linux-2.6/arch/sh/mm/consistent.c   2010-07-25 05:03:42.763055056 +0100
@@ -42,7 +42,6 @@ void *dma_generic_alloc_coherent(struct
        if (!ret)
                return NULL;

-       memset(ret, 0, size);
        /*
         * Pages from the page allocator may have data present in
         * cache. So flush the cache before using uncached memory.
@@ -141,7 +140,7 @@ int __init platform_resource_setup_memor
        if (!memsize)
                return 0;

-       buf = dma_alloc_coherent(NULL, memsize, &dma_handle, GFP_KERNEL);
+       buf = dma_alloc_coherent(NULL, memsize, &dma_handle,
GFP_KERNEL | __GFP_ZERO);
        if (!buf) {
                pr_warning("%s: unable to allocate memory\n", name);
                return -ENOMEM;


* Re: [PATCH] sh: Remove memset from coherent memory allocator (for  comments)
  2010-07-29 11:50 [PATCH] sh: Remove memset from coherent memory allocator (for Andrew Murray
@ 2010-07-29 13:13 ` Matt Fleming
  2010-07-29 14:03 ` [PATCH] sh: Remove memset from coherent memory allocator (for Andrew Murray
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Matt Fleming @ 2010-07-29 13:13 UTC (permalink / raw)
  To: linux-sh

On Thu, 29 Jul 2010 12:50:59 +0100, Andrew Murray <amurray@mpcdata.com> wrote:
> From: Andrew Murray <amurray@mpc-data.co.uk>
> 
> This patch removes unnecessary memset in coherent memory allocator.
> 
> Signed-off-by: Andrew Murray <amurray@mpc-data.co.uk>
> ---
> --- linux-2.6-old/arch/sh/mm/consistent.c       2010-07-25
> 05:01:33.813493496 +0100
> +++ linux-2.6/arch/sh/mm/consistent.c   2010-07-25 05:03:42.763055056 +0100
> @@ -42,7 +42,6 @@ void *dma_generic_alloc_coherent(struct
>         if (!ret)
>                 return NULL;
> 
> -       memset(ret, 0, size);
>         /*
>          * Pages from the page allocator may have data present in
>          * cache. So flush the cache before using uncached memory.
> @@ -141,7 +140,7 @@ int __init platform_resource_setup_memor
>         if (!memsize)
>                 return 0;
> 
> -       buf = dma_alloc_coherent(NULL, memsize, &dma_handle, GFP_KERNEL);
> +       buf = dma_alloc_coherent(NULL, memsize, &dma_handle,
> GFP_KERNEL | __GFP_ZERO);
>         if (!buf) {
>                 pr_warning("%s: unable to allocate memory\n", name);
>                 return -ENOMEM;

Your mail client has word wrapped this patch. Check out
Documentation/email-clients.txt to see if there are any tips for your
client.

This patch seems OK in principle but I think there's a safer way to get
the results you want. With your patch, you've changed the semantics of
dma_generic_alloc_coherent(). Previously it was guaranteed to return a
chunk of zeroed memory; now you're relying on the caller passing
__GFP_ZERO to indicate whether they want zeroed memory or not. That
means that if there's a bit of code that expects zeroed memory but
doesn't pass __GFP_ZERO, it'll now be broken.

(You could argue that this hypothetical caller of dma_alloc_coherent()
is already broken if it doesn't pass __GFP_ZERO but my point is that it
could be a lot of work to track down all the callers and figure out
exactly what guarantees they expect).

The safest approach is to follow what x86 does in its
dma_generic_alloc_coherent() implementation; it adds the __GFP_ZERO flag
unconditionally before allocating pages.
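
As a rough illustration only, here is a userspace sketch of that idea. It
is not kernel code: MODEL_GFP_ZERO, model_get_pages(),
model_alloc_coherent() and model_is_zeroed() are made-up names, and
malloc() stands in for the page allocator. The point it tries to show is
that OR-ing the flag in inside the allocator keeps the zeroing guarantee
no matter what the caller passes.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for __GFP_ZERO; not the kernel's actual value. */
#define MODEL_GFP_ZERO 0x1u

/* Models a page allocator: clears the buffer only when the flag is set. */
static void *model_get_pages(unsigned int flags, size_t size)
{
	unsigned char *p = malloc(size);

	if (!p)
		return NULL;

	/* Fill with a pattern so that un-zeroed memory is detectable. */
	memset(p, 0xAA, size);

	if (flags & MODEL_GFP_ZERO)
		memset(p, 0, size);

	return p;
}

/* Models the coherent allocator: the zero flag is forced unconditionally. */
static void *model_alloc_coherent(unsigned int flags, size_t size)
{
	flags |= MODEL_GFP_ZERO;
	return model_get_pages(flags, size);
}

/* Returns 1 if every byte of the buffer is zero, 0 otherwise. */
static int model_is_zeroed(const void *buf, size_t size)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i < size; i++)
		if (p[i])
			return 0;
	return 1;
}
```

With this shape, a caller that passes no flags to model_alloc_coherent()
still gets a cleared buffer, whereas calling model_get_pages() directly
without the flag does not.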


* RE: [PATCH] sh: Remove memset from coherent memory allocator (for
  2010-07-29 11:50 [PATCH] sh: Remove memset from coherent memory allocator (for Andrew Murray
  2010-07-29 13:13 ` [PATCH] sh: Remove memset from coherent memory allocator (for comments) Matt Fleming
@ 2010-07-29 14:03 ` Andrew Murray
  2010-07-29 19:36 ` Guennadi Liakhovetski
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Andrew Murray @ 2010-07-29 14:03 UTC (permalink / raw)
  To: linux-sh

> -----Original Message-----
> From: Matt Fleming [mailto:matt@console-pimps.org]
> Your mail client has word wrapped this patch. Check out
> Documentation/email-clients.txt to see if there are any tips for your
> client.
>
> This patch seems OK in principle but I think there's a safer way to get
> the results you want. With your patch, you've changed the semantics of
> dma_generic_alloc_coherent().  Previously it was guaranteed to return a
> chunk of zero'd memory, now you're relying on the caller passing
> __GFP_ZERO to indicate whether they want zero'd memory or not. Which
> means that if there's a bit of code that expects zero'd memory but
> doesn't pass __GFP_ZERO, it'll now be broken.
>
> (You could argue that this hypothetical caller of dma_alloc_coherent()
> is already broken if it doesn't pass __GFP_ZERO but my point is that it
> could be a lot of work to track down all the callers and figure out
> exactly what guarantees they expect).
>
> The safest approach is to follow what x86 does in its
> dma_generic_alloc_coherent() implementation; it adds the __GFP_ZERO
> flag
> unconditionally before allocating pages.

I share your concern (and also make the broken caller argument). New
patch...

From: Andrew Murray <amurray@mpc-data.co.uk>

This patch reduces the time taken to allocate coherent dma memory.

Signed-off-by: Andrew Murray <amurray@mpc-data.co.uk>
---
--- linux-2.6-old/arch/sh/mm/consistent.c	2010-07-25 05:01:33.813493496 +0100
+++ linux-2.6/arch/sh/mm/consistent.c	2010-07-25 08:11:28.969943650 +0100
@@ -38,11 +38,11 @@ void *dma_generic_alloc_coherent(struct
 	void *ret, *ret_nocache;
 	int order = get_order(size);

+	gfp |= __GFP_ZERO;
 	ret = (void *)__get_free_pages(gfp, order);
 	if (!ret)
 		return NULL;

-	memset(ret, 0, size);
 	/*
 	 * Pages from the page allocator may have data present in
 	 * cache. So flush the cache before using uncached memory.
--


* Re: [PATCH] sh: Remove memset from coherent memory allocator (for
  2010-07-29 11:50 [PATCH] sh: Remove memset from coherent memory allocator (for Andrew Murray
  2010-07-29 13:13 ` [PATCH] sh: Remove memset from coherent memory allocator (for comments) Matt Fleming
  2010-07-29 14:03 ` [PATCH] sh: Remove memset from coherent memory allocator (for Andrew Murray
@ 2010-07-29 19:36 ` Guennadi Liakhovetski
  2010-07-29 19:52 ` [PATCH] sh: Remove memset from coherent memory allocator (for comments) Matt Fleming
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Guennadi Liakhovetski @ 2010-07-29 19:36 UTC (permalink / raw)
  To: linux-sh

On Thu, 29 Jul 2010, Matt Fleming wrote:

> On Thu, 29 Jul 2010 12:50:59 +0100, Andrew Murray <amurray@mpcdata.com> wrote:
> > From: Andrew Murray <amurray@mpc-data.co.uk>
> > 
> > This patch removes unnecessary memset in coherent memory allocator.
> > 
> > Signed-off-by: Andrew Murray <amurray@mpc-data.co.uk>
> > ---
> > --- linux-2.6-old/arch/sh/mm/consistent.c       2010-07-25
> > 05:01:33.813493496 +0100
> > +++ linux-2.6/arch/sh/mm/consistent.c   2010-07-25 05:03:42.763055056 +0100
> > @@ -42,7 +42,6 @@ void *dma_generic_alloc_coherent(struct
> >         if (!ret)
> >                 return NULL;
> > 
> > -       memset(ret, 0, size);
> >         /*
> >          * Pages from the page allocator may have data present in
> >          * cache. So flush the cache before using uncached memory.
> > @@ -141,7 +140,7 @@ int __init platform_resource_setup_memor
> >         if (!memsize)
> >                 return 0;
> > 
> > -       buf = dma_alloc_coherent(NULL, memsize, &dma_handle, GFP_KERNEL);
> > +       buf = dma_alloc_coherent(NULL, memsize, &dma_handle,
> > GFP_KERNEL | __GFP_ZERO);
> >         if (!buf) {
> >                 pr_warning("%s: unable to allocate memory\n", name);
> >                 return -ENOMEM;
> 
> Your mail client has word wrapped this patch. Check out
> Documentation/email-clients.txt to see if there are any tips for your
> client.
> 
> This patch seems OK in principle but I think there's a safer way to get
> the results you want. With your patch, you've changed the semantics of
> dma_generic_alloc_coherent().  Previously it was guaranteed to return a
> chunk of zero'd memory, now you're relying on the caller passing
> __GFP_ZERO to indicate whether they want zero'd memory or not. Which
> means that if there's a bit of code that expects zero'd memory but
> doesn't pass __GFP_ZERO, it'll now be broken.
> 
> (You could argue that this hypothetical caller of dma_alloc_coherent()
> is already broken if it doesn't pass __GFP_ZERO but my point is that it
> could be a lot of work to track down all the callers and figure out
> exactly what guarantees they expect).
> 
> The safest approach is to follow what x86 does in its
> dma_generic_alloc_coherent() implementation; it adds the __GFP_ZERO flag
> unconditionally before allocating pages.

I don't know where this is specified, but this is part of the API
contract: dma_alloc_coherent() _must_ return zeroed memory. There have
been many patches on LKML removing superfluous memset(0) calls after
dma_alloc_coherent().

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/


* RE: [PATCH] sh: Remove memset from coherent memory allocator (for  comments)
  2010-07-29 11:50 [PATCH] sh: Remove memset from coherent memory allocator (for Andrew Murray
                   ` (2 preceding siblings ...)
  2010-07-29 19:36 ` Guennadi Liakhovetski
@ 2010-07-29 19:52 ` Matt Fleming
  2010-08-04  7:37 ` Paul Mundt
  2010-08-04  7:53 ` [PATCH] sh: Remove memset from coherent memory allocator (for Andrew Murray
  5 siblings, 0 replies; 7+ messages in thread
From: Matt Fleming @ 2010-07-29 19:52 UTC (permalink / raw)
  To: linux-sh

On Thu, 29 Jul 2010 15:03:20 +0100, Andrew Murray <amurray@mpcdata.com> wrote:
> 
> I share your concern (and also make the broken caller argument). New
> patch...
> 
> From: Andrew Murray <amurray@mpc-data.co.uk>
> 
> This patch reduces the time taken to allocate coherent dma memory.
> 
> Signed-off-by: Andrew Murray <amurray@mpc-data.co.uk>
> ---
> --- linux-2.6-old/arch/sh/mm/consistent.c	2010-07-25
> 05:01:33.813493496 +0100
> +++ linux-2.6/arch/sh/mm/consistent.c	2010-07-25 08:11:28.969943650
> +0100
> @@ -38,11 +38,11 @@ void *dma_generic_alloc_coherent(struct
>  	void *ret, *ret_nocache;
>  	int order = get_order(size);
> 
> +	gfp |= __GFP_ZERO;
>  	ret = (void *)__get_free_pages(gfp, order);
>  	if (!ret)
>  		return NULL;
> 
> -	memset(ret, 0, size);
>  	/*
>  	 * Pages from the page allocator may have data present in
>  	 * cache. So flush the cache before using uncached memory.
> --

Reviewed-by: Matt Fleming <matt@console-pimps.org>


* Re: [PATCH] sh: Remove memset from coherent memory allocator (for comments)
  2010-07-29 11:50 [PATCH] sh: Remove memset from coherent memory allocator (for Andrew Murray
                   ` (3 preceding siblings ...)
  2010-07-29 19:52 ` [PATCH] sh: Remove memset from coherent memory allocator (for comments) Matt Fleming
@ 2010-08-04  7:37 ` Paul Mundt
  2010-08-04  7:53 ` [PATCH] sh: Remove memset from coherent memory allocator (for Andrew Murray
  5 siblings, 0 replies; 7+ messages in thread
From: Paul Mundt @ 2010-08-04  7:37 UTC (permalink / raw)
  To: linux-sh

On Thu, Jul 29, 2010 at 03:03:20PM +0100, Andrew Murray wrote:
> > -----Original Message-----
> > From: Matt Fleming [mailto:matt@console-pimps.org]
> > Your mail client has word wrapped this patch. Check out
> > Documentation/email-clients.txt to see if there are any tips for your
> > client.
> >
> > This patch seems OK in principle but I think there's a safer way to get
> > the results you want. With your patch, you've changed the semantics of
> > dma_generic_alloc_coherent().  Previously it was guaranteed to return a
> > chunk of zero'd memory, now you're relying on the caller passing
> > __GFP_ZERO to indicate whether they want zero'd memory or not. Which
> > means that if there's a bit of code that expects zero'd memory but
> > doesn't pass __GFP_ZERO, it'll now be broken.
> >
> > (You could argue that this hypothetical caller of dma_alloc_coherent()
> > is already broken if it doesn't pass __GFP_ZERO but my point is that it
> > could be a lot of work to track down all the callers and figure out
> > exactly what guarantees they expect).
> >
> > The safest approach is to follow what x86 does in its
> > dma_generic_alloc_coherent() implementation; it adds the __GFP_ZERO
> > flag
> > unconditionally before allocating pages.
> 
> I share your concern (and also make the broken caller argument). New
> patch...
> 
> From: Andrew Murray <amurray@mpc-data.co.uk>
> 
> This patch reduces the time taken to allocate coherent dma memory.
> 
> Signed-off-by: Andrew Murray <amurray@mpc-data.co.uk>

Looks ok to me. What sort of numbers are we looking at, precisely? It's
been a while since I profiled our page clearing code, and it would be nice
to see how that compares against a flat memset() on variable page sizes.


* Re: [PATCH] sh: Remove memset from coherent memory allocator (for
  2010-07-29 11:50 [PATCH] sh: Remove memset from coherent memory allocator (for Andrew Murray
                   ` (4 preceding siblings ...)
  2010-08-04  7:37 ` Paul Mundt
@ 2010-08-04  7:53 ` Andrew Murray
  5 siblings, 0 replies; 7+ messages in thread
From: Andrew Murray @ 2010-08-04  7:53 UTC (permalink / raw)
  To: linux-sh

On 4 August 2010 08:37, Paul Mundt <lethal@linux-sh.org> wrote:
>
> On Thu, Jul 29, 2010 at 03:03:20PM +0100, Andrew Murray wrote:
> > > -----Original Message-----
> > > From: Matt Fleming [mailto:matt@console-pimps.org]
> > > Your mail client has word wrapped this patch. Check out
> > > Documentation/email-clients.txt to see if there are any tips for your
> > > client.
> > >
> > > This patch seems OK in principle but I think there's a safer way to get
> > > the results you want. With your patch, you've changed the semantics of
> > > dma_generic_alloc_coherent().  Previously it was guaranteed to return a
> > > chunk of zero'd memory, now you're relying on the caller passing
> > > __GFP_ZERO to indicate whether they want zero'd memory or not. Which
> > > means that if there's a bit of code that expects zero'd memory but
> > > doesn't pass __GFP_ZERO, it'll now be broken.
> > >
> > > (You could argue that this hypothetical caller of dma_alloc_coherent()
> > > is already broken if it doesn't pass __GFP_ZERO but my point is that it
> > > could be a lot of work to track down all the callers and figure out
> > > exactly what guarantees they expect).
> > >
> > > The safest approach is to follow what x86 does in its
> > > dma_generic_alloc_coherent() implementation; it adds the __GFP_ZERO
> > > flag
> > > unconditionally before allocating pages.
> >
> > I share your concern (and also make the broken caller argument). New
> > patch...
> >
> > From: Andrew Murray <amurray@mpc-data.co.uk>
> >
> > This patch reduces the time taken to allocate coherent dma memory.
> >
> > Signed-off-by: Andrew Murray <amurray@mpc-data.co.uk>
>
> Looks ok to me. What sort of numbers are we looking at, precisely? It's
> been a while since I profiled our page clearing code, and it would be nice
> to see how that compares against a flat memset() on variable page sizes.

I've got some figures in my notebook, though they refer to the time
taken for the sh7724_devices_setup initcall to complete with
memchunk.veu0 and vpu arguments of 8m and 4m (2.6.31-rc7 kernel), so
they only provide a rough indication:

160ms when using memset (~ < 75MB/s)
58ms with no memset or equivalent
89ms with GFP_ZERO (~ < 134MB/s)
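
For anyone wanting to redo the arithmetic, a small sketch follows. It
assumes 12 MB cleared in total (the 8m and 4m arguments added together,
which is what the 75 and 134 figures above correspond to) and, as a
further assumption, treats the 58ms run as the cost of everything except
clearing. The helper names are made up for this example. On those
assumptions the clearing alone would work out to roughly 118MB/s for
memset and roughly 387MB/s for GFP_ZERO, but these are derived estimates,
not measurements.

```c
#include <stdio.h>

/* Throughput in MB/s for clearing `mbytes` megabytes in `ms` milliseconds. */
static double throughput_mb_per_s(double mbytes, double ms)
{
	return mbytes / (ms / 1000.0);
}

/* Time attributed to clearing once the no-clear baseline is subtracted. */
static double clearing_ms(double total_ms, double baseline_ms)
{
	return total_ms - baseline_ms;
}

/* Prints gross and baseline-adjusted throughput for the two variants. */
static void print_report(void)
{
	const double mbytes = 12.0;      /* assumed: 8m + 4m */
	const double baseline_ms = 58.0; /* initcall time with no clearing */
	const double memset_ms = 160.0;  /* initcall time with memset() */
	const double gfp_zero_ms = 89.0; /* initcall time with GFP_ZERO */

	printf("memset:   %.1f MB/s gross, %.1f MB/s net\n",
	       throughput_mb_per_s(mbytes, memset_ms),
	       throughput_mb_per_s(mbytes,
				   clearing_ms(memset_ms, baseline_ms)));
	printf("GFP_ZERO: %.1f MB/s gross, %.1f MB/s net\n",
	       throughput_mb_per_s(mbytes, gfp_zero_ms),
	       throughput_mb_per_s(mbytes,
				   clearing_ms(gfp_zero_ms, baseline_ms)));
}
```

Calling print_report() from a main() is enough to reproduce the numbers;
the gross figures include the rest of the initcall, so they understate
the clearing speed itself.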

Andrew Murray

