[PATCH] Revert "arm64: Increase the max granular size"

From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] Revert "arm64: Increase the max granular size"
Date: Mon, 21 Mar 2016 17:23:01 +0000
Message-ID: <20160321172301.GP23397@arm.com>
In-Reply-To: <20160321171403.GE25466@e104818-lin.cambridge.arm.com>

On Mon, Mar 21, 2016 at 05:14:03PM +0000, Catalin Marinas wrote:
> On Fri, Mar 18, 2016 at 09:05:37PM +0000, Chalamarla, Tirumalesh wrote:
> > On 3/16/16, 2:32 AM, "linux-arm-kernel on behalf of Ganesh Mahendran" <linux-arm-kernel-bounces@lists.infradead.org on behalf of opensource.ganesh@gmail.com> wrote:
> > >Reverts commit 97303480753e ("arm64: Increase the max granular size").
> > >
> > >Commit 97303480753e ("arm64: Increase the max granular size")
> > >degrades system performance on some CPUs.
> > >
> > >We tested Wi-Fi network throughput with iperf on a Qualcomm msm8996 CPU:
> > >----------------
> > >run on host:
> > >  # iperf -s
> > >run on device:
> > >  # iperf -c <device-ip-addr> -t 100 -i 1
> > >----------------
> > >
> > >Test result:
> > >----------------
> > >with commit 97303480753e ("arm64: Increase the max granular size"):
> > >    172MBits/sec
> > >
> > >without commit 97303480753e ("arm64: Increase the max granular size"):
> > >    230MBits/sec
> > >----------------
> > >
> > >Some subsystems, such as slab and net, use L1_CACHE_SHIFT, so if we do
> > >not set this parameter correctly, it may affect system performance.
> > >
> > >So revert the commit.
> > 
> > Is there any explanation for why this is so? Maybe there is an
> > alternative to this, apart from reverting the commit.
> 
> I agree we need an explanation but in the meantime, this patch has
> caused a regression on certain systems.
> 
> > Until now, it seems L1_CACHE_SHIFT has been the max across supported
> > chips. But now we are making it 64 bytes; is there any reason why not 32?
> 
> We may have to revisit this logic and consider L1_CACHE_BYTES the
> _minimum_ of cache line sizes in arm64 systems supported by the kernel.
> Do you have any benchmarks on Cavium boards that would show significant
> degradation with 64-byte L1_CACHE_BYTES vs 128?
> 
> For non-coherent DMA, the simplest is to make ARCH_DMA_MINALIGN the
> _maximum_ of the supported systems:
> 
> diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
> index 5082b30bc2c0..4b5d7b27edaf 100644
> --- a/arch/arm64/include/asm/cache.h
> +++ b/arch/arm64/include/asm/cache.h
> @@ -18,17 +18,17 @@
>  
>  #include <asm/cachetype.h>
>  
> -#define L1_CACHE_SHIFT		7
> +#define L1_CACHE_SHIFT		6
>  #define L1_CACHE_BYTES		(1 << L1_CACHE_SHIFT)
>  
>  /*
>   * Memory returned by kmalloc() may be used for DMA, so we must make
> - * sure that all such allocations are cache aligned. Otherwise,
> - * unrelated code may cause parts of the buffer to be read into the
> - * cache before the transfer is done, causing old data to be seen by
> - * the CPU.
> + * sure that all such allocations are aligned to the maximum *known*
> + * cache line size on ARMv8 systems. Otherwise, unrelated code may cause
> + * parts of the buffer to be read into the cache before the transfer is
> + * done, causing old data to be seen by the CPU.
>   */
> -#define ARCH_DMA_MINALIGN	L1_CACHE_BYTES
> +#define ARCH_DMA_MINALIGN	(128)

Does this actually fix the reported iperf regression? My assumption was
that ARCH_DMA_MINALIGN is the problem, but I could be wrong.

Will
