* [PATCH] arm64: Increase the max granular size
@ 2015-09-22 17:59 Robert Richter
2015-09-22 18:29 ` Will Deacon
` (4 more replies)
0 siblings, 5 replies; 31+ messages in thread
From: Robert Richter @ 2015-09-22 17:59 UTC (permalink / raw)
To: linux-arm-kernel
From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Increase the standard cacheline size to avoid having locks in the same
cacheline.
Cavium's ThunderX core implements cache lines of 128 byte size. With
the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
share the same cache line, leading to a performance degradation.
Increasing the size fixes that.
Increasing the size has no negative impact on cache invalidation on
systems with a smaller cache line. There is an impact on memory usage,
but that's not too important for arm64 use cases.
Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Signed-off-by: Robert Richter <rrichter@cavium.com>
---
arch/arm64/include/asm/cache.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
index bde449936e2f..5082b30bc2c0 100644
--- a/arch/arm64/include/asm/cache.h
+++ b/arch/arm64/include/asm/cache.h
@@ -18,7 +18,7 @@
#include <asm/cachetype.h>
-#define L1_CACHE_SHIFT 6
+#define L1_CACHE_SHIFT 7
#define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)
/*
--
2.1.1
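For illustration, a minimal sketch of the false-sharing scenario the commit message above describes (the structure and field names are invented for this example, not taken from the kernel):

	#include <linux/cache.h>
	#include <linux/spinlock.h>

	/*
	 * With L1_CACHE_SHIFT == 6, ____cacheline_aligned_in_smp places the
	 * two locks only 64 bytes apart, so on a core with 128-byte cache
	 * lines they still share one hardware line and contention on one
	 * lock bounces the other between CPUs.  With L1_CACHE_SHIFT == 7
	 * each lock is padded to 128 bytes and the sharing disappears.
	 */
	struct two_locks {
		spinlock_t a ____cacheline_aligned_in_smp;
		spinlock_t b ____cacheline_aligned_in_smp;
	};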
* [PATCH] arm64: Increase the max granular size
2015-09-22 17:59 [PATCH] arm64: Increase the max granular size Robert Richter
@ 2015-09-22 18:29 ` Will Deacon
2015-09-25 14:45 ` Robert Richter
2015-10-10 17:39 ` Timur Tabi
` (3 subsequent siblings)
4 siblings, 1 reply; 31+ messages in thread
From: Will Deacon @ 2015-09-22 18:29 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Sep 22, 2015 at 06:59:48PM +0100, Robert Richter wrote:
> From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
>
> Increase the standard cacheline size to avoid having locks in the same
> cacheline.
>
> Cavium's ThunderX core implements cache lines of 128 byte size. With
> the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
> share the same cache line, leading to a performance degradation.
> Increasing the size fixes that.
Do you have an example of that happening?
> Increasing the size has no negative impact on cache invalidation on
> systems with a smaller cache line. There is an impact on memory usage,
> but that's not too important for arm64 use cases.
Do you have any before/after numbers to show the impact of this change
on other supported SoCs?
Will
* [PATCH] arm64: Increase the max granular size
2015-09-22 18:29 ` Will Deacon
@ 2015-09-25 14:45 ` Robert Richter
2015-09-25 16:31 ` Tirumalesh Chalamarla
0 siblings, 1 reply; 31+ messages in thread
From: Robert Richter @ 2015-09-25 14:45 UTC (permalink / raw)
To: linux-arm-kernel
Will,
On 22.09.15 19:29:02, Will Deacon wrote:
> On Tue, Sep 22, 2015 at 06:59:48PM +0100, Robert Richter wrote:
> > From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> >
> > Increase the standard cacheline size to avoid having locks in the same
> > cacheline.
> >
> > Cavium's ThunderX core implements cache lines of 128 byte size. With
> > the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
> > share the same cache line, leading to a performance degradation.
> > Increasing the size fixes that.
>
> Do you have an example of that happening?
I did some 'poor man's kernel build all modules benchmarking' and
could not find significant performance improvements so far (second
part with the patch reverted):
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m10.490s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.747s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.264s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m0.435s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.569s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.274s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m0.507s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m1.551s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.073s
build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m1.738s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m10.644s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.814s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.315s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.610s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.885s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.281s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.869s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.953s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.787s
build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.656s
I will check what kind of workloads this patch was written for.
Tirumalesh, any idea?
Thanks,
-Robert
>
> > Increasing the size has no negative impact on cache invalidation on
> > systems with a smaller cache line. There is an impact on memory usage,
> > but that's not too important for arm64 use cases.
>
> Do you have any before/after numbers to show the impact of this change
> on other supported SoCs?
* [PATCH] arm64: Increase the max granular size
2015-09-25 14:45 ` Robert Richter
@ 2015-09-25 16:31 ` Tirumalesh Chalamarla
0 siblings, 0 replies; 31+ messages in thread
From: Tirumalesh Chalamarla @ 2015-09-25 16:31 UTC (permalink / raw)
To: linux-arm-kernel
On 09/25/2015 07:45 AM, Robert Richter wrote:
> Will,
>
> On 22.09.15 19:29:02, Will Deacon wrote:
>> On Tue, Sep 22, 2015 at 06:59:48PM +0100, Robert Richter wrote:
>>> From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
>>>
>>> Increase the standard cacheline size to avoid having locks in the same
>>> cacheline.
>>>
>>> Cavium's ThunderX core implements cache lines of 128 byte size. With
>>> the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
>>> share the same cache line, leading to a performance degradation.
>>> Increasing the size fixes that.
>> Do you have an example of that happening?
> I did some 'poor man's kernel build all modules benchmarking' and
> could not find significant performance improvements so far (second
> part with the patch reverted):
>
> build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m10.490s
> build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.747s
> build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.264s
> build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m0.435s
> build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.569s
> build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.274s
> build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m0.507s
> build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m1.551s
> build-allmodules-4.2.0-01404-g5818d6e89783.log:real 6m59.073s
> build-allmodules-4.2.0-01404-g5818d6e89783.log:real 7m1.738s
>
> build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m10.644s
> build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.814s
> build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.315s
> build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.610s
> build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.885s
> build-allmodules-4.2.0-01406-g638c69fddc40.log:real 6m59.281s
> build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.869s
> build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.953s
> build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.787s
> build-allmodules-4.2.0-01406-g638c69fddc40.log:real 7m0.656s
>
> I will check what kind of workloads this patch was written for.
> Tirumalesh, any idea?
Mainly for workloads where the compiler optimizes based on the cache line
size. Let me write a small benchmark.
> Thanks,
>
> -Robert
>
>>> Increasing the size has no negative impact on cache invalidation on
>>> systems with a smaller cache line. There is an impact on memory usage,
>>> but that's not too important for arm64 use cases.
>> Do you have any before/after numbers to show the impact of this change
>> on other supported SoCs?
* [PATCH] arm64: Increase the max granular size
2015-09-22 17:59 [PATCH] arm64: Increase the max granular size Robert Richter
2015-09-22 18:29 ` Will Deacon
@ 2015-10-10 17:39 ` Timur Tabi
2015-10-12 9:16 ` Will Deacon
2015-10-16 19:57 ` Timur Tabi
` (2 subsequent siblings)
4 siblings, 1 reply; 31+ messages in thread
From: Timur Tabi @ 2015-10-10 17:39 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Sep 22, 2015 at 12:59 PM, Robert Richter <rric@kernel.org> wrote:
>
> -#define L1_CACHE_SHIFT 6
> +#define L1_CACHE_SHIFT 7
> #define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)
Would it be better if this were a Kconfig option, like it is on ARM32?
http://lxr.free-electrons.com/source/arch/arm/include/asm/cache.h#L7
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
* [PATCH] arm64: Increase the max granular size
2015-10-10 17:39 ` Timur Tabi
@ 2015-10-12 9:16 ` Will Deacon
0 siblings, 0 replies; 31+ messages in thread
From: Will Deacon @ 2015-10-12 9:16 UTC (permalink / raw)
To: linux-arm-kernel
On Sat, Oct 10, 2015 at 12:39:25PM -0500, Timur Tabi wrote:
> On Tue, Sep 22, 2015 at 12:59 PM, Robert Richter <rric@kernel.org> wrote:
> >
> > -#define L1_CACHE_SHIFT 6
> > +#define L1_CACHE_SHIFT 7
> > #define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)
>
> Would it be better if this were a Kconfig option, like it is on ARM32?
>
> http://lxr.free-electrons.com/source/arch/arm/include/asm/cache.h#L7
I don't think it adds anything, to be honest. We really want one kernel
that runs everywhere and we don't (yet) have the SoC variation that exists
on arch/arm/, so we may as well just keep it as big as it needs to be.
Of course, if we start to get significant divergence between the minimum
and maximum value and that in turn shows a non-trivial impact on kernel
size and/or performance, then we could consider a Kconfig option but at
that point we'd probably also need to consider whether there are alternative
ways of providing this information to the kernel.
If somebody really wants to change it for their particular kernel build,
modifying the #define isn't exactly rocket science.
Will
* [PATCH] arm64: Increase the max granular size
2015-09-22 17:59 [PATCH] arm64: Increase the max granular size Robert Richter
2015-09-22 18:29 ` Will Deacon
2015-10-10 17:39 ` Timur Tabi
@ 2015-10-16 19:57 ` Timur Tabi
2015-10-28 19:09 ` Catalin Marinas
2015-11-05 4:40 ` [PATCH] arm64: Increase the max granular size Joonsoo Kim
4 siblings, 0 replies; 31+ messages in thread
From: Timur Tabi @ 2015-10-16 19:57 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Sep 22, 2015 at 12:59 PM, Robert Richter <rric@kernel.org> wrote:
> From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
>
> Increase the standard cacheline size to avoid having locks in the same
> cacheline.
>
> Cavium's ThunderX core implements cache lines of 128 byte size. With
> the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
> share the same cache line, leading to a performance degradation.
> Increasing the size fixes that.
>
> Increasing the size has no negative impact on cache invalidation on
> systems with a smaller cache line. There is an impact on memory usage,
> but that's not too important for arm64 use cases.
>
> Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> Signed-off-by: Robert Richter <rrichter@cavium.com>
Acked-by: Timur Tabi <timur@codeaurora.org>
We need this patch, because on our silicon CTR_EL0[CWG] is set to 5, which
means that setup_processor() complains with this warning:

	cls = cache_line_size();
	if (L1_CACHE_BYTES < cls)
		pr_warn("L1_CACHE_BYTES smaller than the Cache Writeback Granule (%d < %d)\n",
			L1_CACHE_BYTES, cls);
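As a side note, CWG = 5 encodes a writeback granule of 4 << 5 = 128 bytes. A small sketch of that decoding, for illustration only (the kernel's cache_type_cwg() and cache_line_size() helpers do the equivalent):

	/* Illustration: extract CWG (CTR_EL0 bits [27:24]) and convert it to
	 * bytes the same way cache_line_size() does.
	 */
	static inline unsigned int cwg_to_bytes(void)
	{
		unsigned long ctr;

		asm volatile("mrs %0, ctr_el0" : "=r" (ctr));
		return 4U << ((ctr >> 24) & 0xf);	/* CWG = 5 -> 128 bytes */
	}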
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
* [PATCH] arm64: Increase the max granular size
2015-09-22 17:59 [PATCH] arm64: Increase the max granular size Robert Richter
` (2 preceding siblings ...)
2015-10-16 19:57 ` Timur Tabi
@ 2015-10-28 19:09 ` Catalin Marinas
[not found] ` <CAMuHMdWQygbxMXoOsbwek6DzZcr7J-C23VCK4ubbgUr+zj=giw@mail.gmail.com>
2015-11-05 4:40 ` [PATCH] arm64: Increase the max granular size Joonsoo Kim
4 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2015-10-28 19:09 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Sep 22, 2015 at 07:59:48PM +0200, Robert Richter wrote:
> From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
>
> Increase the standard cacheline size to avoid having locks in the same
> cacheline.
>
> Cavium's ThunderX core implements cache lines of 128 byte size. With
> the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
> share the same cache line, leading to a performance degradation.
> Increasing the size fixes that.
>
> Increasing the size has no negative impact on cache invalidation on
> systems with a smaller cache line. There is an impact on memory usage,
> but that's not too important for arm64 use cases.
>
> Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> Signed-off-by: Robert Richter <rrichter@cavium.com>
Applied. Thanks.
--
Catalin
* [PATCH] arm64: Increase the max granular size
[not found] ` <CAMuHMdWQygbxMXoOsbwek6DzZcr7J-C23VCK4ubbgUr+zj=giw@mail.gmail.com>
@ 2015-11-03 12:05 ` Catalin Marinas
2015-11-03 14:38 ` Catalin Marinas
0 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2015-11-03 12:05 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Nov 03, 2015 at 12:07:06PM +0100, Geert Uytterhoeven wrote:
> On Wed, Oct 28, 2015 at 8:09 PM, Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > On Tue, Sep 22, 2015 at 07:59:48PM +0200, Robert Richter wrote:
> >> From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> >>
> >> Increase the standard cacheline size to avoid having locks in the same
> >> cacheline.
> >>
> >> Cavium's ThunderX core implements cache lines of 128 byte size. With
> >> the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
> >> share the same cache line, leading to a performance degradation.
> >> Increasing the size fixes that.
> >>
> >> Increasing the size has no negative impact on cache invalidation on
> >> systems with a smaller cache line. There is an impact on memory usage,
> >> but that's not too important for arm64 use cases.
> >>
> >> Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> >> Signed-off-by: Robert Richter <rrichter@cavium.com>
> >
> > Applied. Thanks.
>
> This patch causes a BUG() on r8a7795/salvator-x, for which support is not
> yet upstream.
>
> My config (attached) uses SLAB. If I switch to SLUB, it works.
> The arm64 defconfig works, even if I switch from SLUB to SLAB.
[...]
> ------------[ cut here ]------------
> kernel BUG at mm/slab.c:2283!
> Internal error: Oops - BUG: 0 [#1] SMP
[...]
> Call trace:
> [<ffffffc00014f9b4>] __kmem_cache_create+0x21c/0x280
> [<ffffffc00068be50>] create_boot_cache+0x4c/0x80
> [<ffffffc00068bed8>] create_kmalloc_cache+0x54/0x88
> [<ffffffc00068bfc0>] create_kmalloc_caches+0x50/0xf4
> [<ffffffc00068db08>] kmem_cache_init+0x104/0x118
> [<ffffffc00067d7d8>] start_kernel+0x218/0x33c
I haven't managed to reproduce this on a Juno kernel. Could you please
add some debug printing to see what's being passed to kmalloc_slab()?
Is the freelist_size getting out of bounds?
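A sketch of the kind of debug printing being asked for, assuming the __kmem_cache_create() context discussed later in this thread (hypothetical, not an actual patch):

	/* Dump the sizes involved just before the off-slab kmalloc_slab() lookup. */
	pr_info("%s: size=%zu freelist_size=%zu cache_line_size()=%d\n",
		cachep->name, size, freelist_size, cache_line_size());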
--
Catalin
* [PATCH] arm64: Increase the max granular size
2015-11-03 12:05 ` Catalin Marinas
@ 2015-11-03 14:38 ` Catalin Marinas
2015-11-03 14:55 ` Geert Uytterhoeven
0 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2015-11-03 14:38 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Nov 03, 2015 at 12:05:05PM +0000, Catalin Marinas wrote:
> On Tue, Nov 03, 2015 at 12:07:06PM +0100, Geert Uytterhoeven wrote:
> > On Wed, Oct 28, 2015 at 8:09 PM, Catalin Marinas
> > <catalin.marinas@arm.com> wrote:
> > > On Tue, Sep 22, 2015 at 07:59:48PM +0200, Robert Richter wrote:
> > >> From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> > >>
> > >> Increase the standard cacheline size to avoid having locks in the same
> > >> cacheline.
> > >>
> > >> Cavium's ThunderX core implements cache lines of 128 byte size. With
> > >> the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
> > >> share the same cache line, leading to a performance degradation.
> > >> Increasing the size fixes that.
> > >>
> > >> Increasing the size has no negative impact on cache invalidation on
> > >> systems with a smaller cache line. There is an impact on memory usage,
> > >> but that's not too important for arm64 use cases.
> > >>
> > >> Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> > >> Signed-off-by: Robert Richter <rrichter@cavium.com>
> > >
> > > Applied. Thanks.
> >
> > This patch causes a BUG() on r8a7795/salvator-x, for which support is not
> > yet upstream.
> >
> > My config (attached) uses SLAB. If I switch to SLUB, it works.
> > The arm64 defconfig works, even if I switch from SLUB to SLAB.
> [...]
> > ------------[ cut here ]------------
> > kernel BUG at mm/slab.c:2283!
> > Internal error: Oops - BUG: 0 [#1] SMP
> [...]
> > Call trace:
> > [<ffffffc00014f9b4>] __kmem_cache_create+0x21c/0x280
> > [<ffffffc00068be50>] create_boot_cache+0x4c/0x80
> > [<ffffffc00068bed8>] create_kmalloc_cache+0x54/0x88
> > [<ffffffc00068bfc0>] create_kmalloc_caches+0x50/0xf4
> > [<ffffffc00068db08>] kmem_cache_init+0x104/0x118
> > [<ffffffc00067d7d8>] start_kernel+0x218/0x33c
>
> I haven't managed to reproduce this on a Juno kernel.
I now managed to reproduce it with your config (slightly adapted to
allow Juno). I'll look into it.
--
Catalin
* [PATCH] arm64: Increase the max granular size
2015-11-03 14:38 ` Catalin Marinas
@ 2015-11-03 14:55 ` Geert Uytterhoeven
2015-11-03 18:50 ` Catalin Marinas
0 siblings, 1 reply; 31+ messages in thread
From: Geert Uytterhoeven @ 2015-11-03 14:55 UTC (permalink / raw)
To: linux-arm-kernel
Hi Catalin,
On Tue, Nov 3, 2015 at 3:38 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Tue, Nov 03, 2015 at 12:05:05PM +0000, Catalin Marinas wrote:
>> On Tue, Nov 03, 2015 at 12:07:06PM +0100, Geert Uytterhoeven wrote:
>> > On Wed, Oct 28, 2015 at 8:09 PM, Catalin Marinas
>> > <catalin.marinas@arm.com> wrote:
>> > > On Tue, Sep 22, 2015 at 07:59:48PM +0200, Robert Richter wrote:
>> > >> From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
>> > >>
>> > >> Increase the standard cacheline size to avoid having locks in the same
>> > >> cacheline.
>> > >>
>> > >> Cavium's ThunderX core implements cache lines of 128 byte size. With
>> > >> the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
>> > >> share the same cache line, leading to a performance degradation.
>> > >> Increasing the size fixes that.
>> > >>
>> > >> Increasing the size has no negative impact on cache invalidation on
>> > >> systems with a smaller cache line. There is an impact on memory usage,
>> > >> but that's not too important for arm64 use cases.
>> > >>
>> > >> Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com>
>> > >> Signed-off-by: Robert Richter <rrichter@cavium.com>
>> > >
>> > > Applied. Thanks.
>> >
>> > This patch causes a BUG() on r8a7795/salvator-x, for which support is not
>> > yet upstream.
>> >
>> > My config (attached) uses SLAB. If I switch to SLUB, it works.
>> > The arm64 defconfig works, even if I switch from SLUB to SLAB.
>> [...]
>> > ------------[ cut here ]------------
>> > kernel BUG at mm/slab.c:2283!
>> > Internal error: Oops - BUG: 0 [#1] SMP
>> [...]
>> > Call trace:
>> > [<ffffffc00014f9b4>] __kmem_cache_create+0x21c/0x280
>> > [<ffffffc00068be50>] create_boot_cache+0x4c/0x80
>> > [<ffffffc00068bed8>] create_kmalloc_cache+0x54/0x88
>> > [<ffffffc00068bfc0>] create_kmalloc_caches+0x50/0xf4
>> > [<ffffffc00068db08>] kmem_cache_init+0x104/0x118
>> > [<ffffffc00067d7d8>] start_kernel+0x218/0x33c
>>
>> I haven't managed to reproduce this on a Juno kernel.
>
> I now managed to reproduce it with your config (slightly adapted to
> allow Juno). I'll look into it.
Good to hear that!
BTW, I see this:
freelist_size = 32
cache_line_size() = 64
It seems like the value returned by cache_line_size() in
arch/arm64/include/asm/cache.h disagrees with L1_CACHE_SHIFT == 7:
static inline int cache_line_size(void)
{
	u32 cwg = cache_type_cwg();
	return cwg ? 4 << cwg : L1_CACHE_BYTES;
}
Making cache_line_size() always return L1_CACHE_BYTES doesn't help.
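For reference, the arithmetic behind the two numbers above; the CWG value of 4 is inferred from the output, not stated in the original mail:

	#include <stdio.h>

	int main(void)
	{
		unsigned int cwg = 4;				/* presumably what cache_type_cwg() returns here */
		unsigned int cls = cwg ? 4U << cwg : 128;	/* cache_line_size(): 4 << 4 = 64 */
		unsigned int l1_cache_bytes = 1U << 7;		/* L1_CACHE_SHIFT = 7 -> 128      */

		printf("cache_line_size() = %u, L1_CACHE_BYTES = %u\n", cls, l1_cache_bytes);
		return 0;
	}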
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* [PATCH] arm64: Increase the max granular size
2015-11-03 14:55 ` Geert Uytterhoeven
@ 2015-11-03 18:50 ` Catalin Marinas
2015-11-03 23:33 ` Christoph Lameter
0 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2015-11-03 18:50 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Nov 03, 2015 at 03:55:29PM +0100, Geert Uytterhoeven wrote:
> On Tue, Nov 3, 2015 at 3:38 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Tue, Nov 03, 2015 at 12:05:05PM +0000, Catalin Marinas wrote:
> >> On Tue, Nov 03, 2015 at 12:07:06PM +0100, Geert Uytterhoeven wrote:
> >> > On Wed, Oct 28, 2015 at 8:09 PM, Catalin Marinas
> >> > <catalin.marinas@arm.com> wrote:
> >> > > On Tue, Sep 22, 2015 at 07:59:48PM +0200, Robert Richter wrote:
> >> > >> From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> >> > >>
> >> > >> Increase the standard cacheline size to avoid having locks in the same
> >> > >> cacheline.
> >> > >>
> >> > >> Cavium's ThunderX core implements cache lines of 128 byte size. With
> >> > >> the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
> >> > >> share the same cache line, leading to a performance degradation.
> >> > >> Increasing the size fixes that.
> >> > >>
> >> > >> Increasing the size has no negative impact on cache invalidation on
> >> > >> systems with a smaller cache line. There is an impact on memory usage,
> >> > >> but that's not too important for arm64 use cases.
> >> > >>
> >> > >> Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> >> > >> Signed-off-by: Robert Richter <rrichter@cavium.com>
> >> > >
> >> > > Applied. Thanks.
> >> >
> >> > This patch causes a BUG() on r8a7795/salvator-x, for which support is not
> >> > yet upstream.
> >> >
> >> > My config (attached) uses SLAB. If I switch to SLUB, it works.
> >> > The arm64 defconfig works, even if I switch from SLUB to SLAB.
> >> [...]
> >> > ------------[ cut here ]------------
> >> > kernel BUG at mm/slab.c:2283!
> >> > Internal error: Oops - BUG: 0 [#1] SMP
> >> [...]
> >> > Call trace:
> >> > [<ffffffc00014f9b4>] __kmem_cache_create+0x21c/0x280
> >> > [<ffffffc00068be50>] create_boot_cache+0x4c/0x80
> >> > [<ffffffc00068bed8>] create_kmalloc_cache+0x54/0x88
> >> > [<ffffffc00068bfc0>] create_kmalloc_caches+0x50/0xf4
> >> > [<ffffffc00068db08>] kmem_cache_init+0x104/0x118
> >> > [<ffffffc00067d7d8>] start_kernel+0x218/0x33c
> >>
> >> I haven't managed to reproduce this on a Juno kernel.
> >
> > I now managed to reproduce it with your config (slightly adapted to
> > allow Juno). I'll look into it.
>
> Good to hear that!
>
> BTW, I see this:
>
> freelist_size = 32
> cache_line_size() = 64
>
> It seems like the value returned by cache_line_size() in
> arch/arm64/include/asm/cache.h disagrees with L1_CACHE_SHIFT == 7:
>
> static inline int cache_line_size(void)
> {
> 	u32 cwg = cache_type_cwg();
> 	return cwg ? 4 << cwg : L1_CACHE_BYTES;
> }
>
> Making cache_line_size() always return L1_CACHE_BYTES doesn't help.
(cc'ing Joonsoo and Christoph; summary: slab failure with L1_CACHE_BYTES
of 128 and sizeof(kmem_cache_node) of 152)
If I revert commit 8fc9cf420b36 ("slab: make more slab management
structure off the slab") it works but I still need to figure out how
slab indices are calculated. The size_index[] array is overridden so
that 0..15 are 7 and 16..23 are 8. But the kmalloc_caches[7] has never
been populated, hence the BUG_ON. Another option may be to change
kmalloc_size and kmalloc_index to cope with KMALLOC_MIN_SIZE of 128.
I'll do some more investigation tomorrow.
--
Catalin
* [PATCH] arm64: Increase the max granular size
2015-11-03 18:50 ` Catalin Marinas
@ 2015-11-03 23:33 ` Christoph Lameter
2015-11-04 12:36 ` Catalin Marinas
0 siblings, 1 reply; 31+ messages in thread
From: Christoph Lameter @ 2015-11-03 23:33 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, 3 Nov 2015, Catalin Marinas wrote:
> (cc'ing Joonsoo and Christoph; summary: slab failure with L1_CACHE_BYTES
> of 128 and sizeof(kmem_cache_node) of 152)
Hmmm... Yes, that would mean using the 192-sized kmalloc array, which is not
a power-of-two slab. But the code looks fine to me.
> If I revert commit 8fc9cf420b36 ("slab: make more slab management
> structure off the slab") it works but I still need to figure out how
> slab indices are calculated. The size_index[] array is overridden so
> that 0..15 are 7 and 16..23 are 8. But the kmalloc_caches[7] has never
> been populated, hence the BUG_ON. Another option may be to change
> kmalloc_size and kmalloc_index to cope with KMALLOC_MIN_SIZE of 128.
>
> I'll do some more investigation tomorrow.
The commit allows off slab management for PAGE_SIZE >> 5 that is 128.
After that commit kmem_cache_create would try to allocate an off slab
management structure which is not available during early boot.
But the slab_early_init is set which should prevent the use of an off slab
management infrastructure in kmem_cache_create().
However, the failure in line 2283 shows that the OFF SLAB flag was
mistakenly set anyway!!!! Something must have cleared slab_early_init?
* [PATCH] arm64: Increase the max granular size
2015-11-03 23:33 ` Christoph Lameter
@ 2015-11-04 12:36 ` Catalin Marinas
2015-11-04 13:53 ` Christoph Lameter
0 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2015-11-04 12:36 UTC (permalink / raw)
To: linux-arm-kernel
(+ linux-mm)
On Tue, Nov 03, 2015 at 05:33:25PM -0600, Christoph Lameter wrote:
> On Tue, 3 Nov 2015, Catalin Marinas wrote:
> > (cc'ing Joonsoo and Christoph; summary: slab failure with L1_CACHE_BYTES
> > of 128 and sizeof(kmem_cache_node) of 152)
>
> Hmmm... Yes, that would mean using the 192-sized kmalloc array, which is not
> a power-of-two slab. But the code looks fine to me.
I'm not entirely sure that gets used (or even created).
kmalloc_index(152) returns 8 (INDEX_NODE==8) since KMALLOC_MIN_SIZE==128
and the "kmalloc-node" cache size is 256.
> > If I revert commit 8fc9cf420b36 ("slab: make more slab management
> > structure off the slab") it works but I still need to figure out how
> > slab indices are calculated. The size_index[] array is overridden so
> > that 0..15 are 7 and 16..23 are 8. But the kmalloc_caches[7] has never
> > been populated, hence the BUG_ON. Another option may be to change
> > kmalloc_size and kmalloc_index to cope with KMALLOC_MIN_SIZE of 128.
> >
> > I'll do some more investigation tomorrow.
>
> The commit allows off slab management for PAGE_SIZE >> 5 that is 128.
This means that the first kmalloc cache to be created, "kmalloc-128", is
off slab.
> After that commit kmem_cache_create would try to allocate an off slab
> management structure which is not available during early boot.
> But the slab_early_init is set which should prevent the use of an off slab
> management infrastructure in kmem_cache_create().
>
> However, the failure in line 2283 shows that the OFF SLAB flag was
> mistakenly set anyway!!!! Something must have cleared slab_early_init?
slab_early_init is cleared after "kmem_cache" and "kmalloc-node" caches
are successfully created. Following this, the minimum kmalloc allocation
goes off-slab when KMALLOC_MIN_SIZE == 128.
When trying to create "kmalloc-128" (via create_kmalloc_caches(),
slab_early_init is already 0), __kmem_cache_create() requires an
allocation of 32 bytes (freelist_size) which has index 7, hence exactly
the kmalloc_caches[7] we are trying to create.
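A simplified sketch of the size-to-index mapping involved, to make the circular dependency concrete (this is an illustration, not the kernel's kmalloc_index() implementation):

	#include <stddef.h>

	/* With KMALLOC_MIN_SIZE == 128, every request up to 128 bytes
	 * (including the 32-byte freelist) rounds up to index 7, i.e. the
	 * very "kmalloc-128" cache that is still being created.
	 */
	static int kmalloc_index_sketch(size_t size)
	{
		size_t s = size < 128 ? 128 : size;	/* KMALLOC_MIN_SIZE */
		int index = 0;

		while (((size_t)1 << index) < s)
			index++;
		return index;				/* 32 -> 7, 256 -> 8 */
	}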
The simplest option would be to make sure that off slab isn't allowed
for caches of KMALLOC_MIN_SIZE or smaller, with the drawback that not
only "kmalloc-128" but any other such caches will be on slab.
I think a better option would be to first check that there is a
kmalloc_caches[] entry for freelist_size before deciding to go off-slab.
See below:
-----8<------------------------------
From ce27c5c6d156522ceaff20de8a7af281cf079b6f Mon Sep 17 00:00:00 2001
From: Catalin Marinas <catalin.marinas@arm.com>
Date: Wed, 4 Nov 2015 12:19:00 +0000
Subject: [PATCH] mm: slab: Avoid BUG when KMALLOC_MIN_SIZE == (PAGE_SIZE >> 5)
The slab allocator, following commit 8fc9cf420b36 ("slab: make more slab
management structure off the slab"), tries to place slab management
off-slab when the object size is PAGE_SIZE >> 5 or larger. On arm64 with
KMALLOC_MIN_SIZE = L1_CACHE_BYTES = 128, "kmalloc-128" is the smallest
cache to be created after slab_early_init = 0. The current
__kmem_cache_create() implementation aims to place the management
structure off-slab. However, the kmalloc_slab(freelist_size) has not
been populated yet, triggering a bug on !cachep->freelist_cache.
This patch addresses the problem by keeping the management structure
on-slab if the corresponding kmalloc_caches[] is not populated yet.
Fixes: 8fc9cf420b36 ("slab: make more slab management structure off the slab")
Cc: <stable@vger.kernel.org> # 3.15+
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
mm/slab.c | 43 ++++++++++++++++++++++++-------------------
1 file changed, 24 insertions(+), 19 deletions(-)
diff --git a/mm/slab.c b/mm/slab.c
index 4fcc5dd8d5a6..d4a21736eb5d 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2246,16 +2246,33 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
if (flags & CFLGS_OFF_SLAB) {
/* really off slab. No need for manual alignment */
- freelist_size = calculate_freelist_size(cachep->num, 0);
+ size_t off_freelist_size = calculate_freelist_size(cachep->num, 0);
+
+ cachep->freelist_cache = kmalloc_slab(off_freelist_size, 0u);
+ if (ZERO_OR_NULL_PTR(cachep->freelist_cache)) {
+ /*
+ * We don't have kmalloc_caches[] populated for
+ * off_freelist_size yet. This can happen during
+ * create_kmalloc_caches() when KMALLOC_MIN_SIZE >=
+ * (PAGE_SHIFT >> 5) and CFLGS_OFF_SLAB is set. Move
+ * the cache on-slab.
+ */
+ flags &= ~CFLGS_OFF_SLAB;
+ left_over = calculate_slab_order(cachep, size, cachep->align, flags);
+ } else {
+ freelist_size = off_freelist_size;
#ifdef CONFIG_PAGE_POISONING
- /* If we're going to use the generic kernel_map_pages()
- * poisoning, then it's going to smash the contents of
- * the redzone and userword anyhow, so switch them off.
- */
- if (size % PAGE_SIZE == 0 && flags & SLAB_POISON)
- flags &= ~(SLAB_RED_ZONE | SLAB_STORE_USER);
+ /*
+ * If we're going to use the generic kernel_map_pages()
+ * poisoning, then it's going to smash the contents of
+ * the redzone and userword anyhow, so switch them off.
+ */
+ if (size % PAGE_SIZE == 0 && flags & SLAB_POISON)
+ flags &= ~(SLAB_RED_ZONE | SLAB_STORE_USER);
#endif
+ }
+
}
cachep->colour_off = cache_line_size();
@@ -2271,18 +2288,6 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
cachep->size = size;
cachep->reciprocal_buffer_size = reciprocal_value(size);
- if (flags & CFLGS_OFF_SLAB) {
- cachep->freelist_cache = kmalloc_slab(freelist_size, 0u);
- /*
- * This is a possibility for one of the kmalloc_{dma,}_caches.
- * But since we go off slab only for object size greater than
- * PAGE_SIZE/8, and kmalloc_{dma,}_caches get created
- * in ascending order,this should not happen at all.
- * But leave a BUG_ON for some lucky dude.
- */
- BUG_ON(ZERO_OR_NULL_PTR(cachep->freelist_cache));
- }
-
err = setup_cpu_cache(cachep, gfp);
if (err) {
__kmem_cache_shutdown(cachep);
* [PATCH] arm64: Increase the max granular size
2015-11-04 12:36 ` Catalin Marinas
@ 2015-11-04 13:53 ` Christoph Lameter
2015-11-04 14:54 ` Catalin Marinas
0 siblings, 1 reply; 31+ messages in thread
From: Christoph Lameter @ 2015-11-04 13:53 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, 4 Nov 2015, Catalin Marinas wrote:
> The simplest option would be to make sure that off slab isn't allowed
> for caches of KMALLOC_MIN_SIZE or smaller, with the drawback that not
> only "kmalloc-128" but any other such caches will be on slab.
The reason for an off slab configuration is denser object packing.
> I think a better option would be to first check that there is a
> kmalloc_caches[] entry for freelist_size before deciding to go off-slab.
Hmmm.. Yes seems to be an option.
Maybe we simply revert commit 8fc9cf420b36 instead? That does not seem to
make too much sense to me and the goal of the commit cannot be
accomplished on ARM. Your patch essentially reverts the effect anyways.
Smaller slabs really do not need off slab management anyways since they
will only lose a few objects per slab page.
* [PATCH] arm64: Increase the max granular size
2015-11-04 13:53 ` Christoph Lameter
@ 2015-11-04 14:54 ` Catalin Marinas
2015-11-04 15:28 ` Christoph Lameter
0 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2015-11-04 14:54 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Nov 04, 2015 at 07:53:50AM -0600, Christoph Lameter wrote:
> On Wed, 4 Nov 2015, Catalin Marinas wrote:
>
> > The simplest option would be to make sure that off slab isn't allowed
> > for caches of KMALLOC_MIN_SIZE or smaller, with the drawback that not
> > only "kmalloc-128" but any other such caches will be on slab.
>
> The reason for an off slab configuration is denser object packing.
>
> > I think a better option would be to first check that there is a
> > kmalloc_caches[] entry for freelist_size before deciding to go off-slab.
>
> Hmmm.. Yes seems to be an option.
>
> Maybe we simply revert commit 8fc9cf420b36 instead?
I'm fine with this. Also note that the arm64 commit changing
L1_CACHE_BYTES to 128 hasn't been pushed yet (it's queued for 4.4).
> That does not seem to make too much sense to me and the goal of the
> commit cannot be accomplished on ARM. Your patch essentially reverts
> the effect anyways.
In theory it only reverts the effect for the first kmalloc_cache
("kmalloc-128" in the arm64 case). Any other bigger cache which would
not be mergeable with an existing one still has the potential of
off-slab management.
> Smaller slabs really do not need off slab management anyways since they
> will only lose a few objects per slab page.
IIUC, starting with 128 slab size for a 4KB page, you have 32 objects
per page. The freelist takes 32 bytes (or 31), therefore you waste a
single slab object. However, only 1/4 of it is used for freelist and the
waste gets bigger with 256 slab size, hence the original commit.
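Spelling the numbers out as a quick calculation (assumptions: 4KB pages, one freelist index byte per object):

	#include <stdio.h>

	int main(void)
	{
		unsigned int page_size = 4096, obj_size = 128;
		unsigned int num = page_size / obj_size;	/* 32 objects per slab page */
		unsigned int freelist = num * 1;		/* ~32 bytes of freelist    */

		/* Keeping the 32-byte freelist on-slab costs a whole 128-byte
		 * object, of which only about a quarter is actually used; that
		 * is the waste discussed above, and the proportion grows for
		 * 256-byte objects.
		 */
		printf("%u objects, %u-byte freelist\n", num, freelist);
		return 0;
	}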
BTW, assuming L1_CACHE_BYTES is 512 (I don't ever see this happening but
just in theory), we potentially have the same issue. What would save us
is that INDEX_NODE would match the first "kmalloc-512" cache, so we have
it pre-populated.
--
Catalin
* [PATCH] arm64: Increase the max granular size
2015-11-04 14:54 ` Catalin Marinas
@ 2015-11-04 15:28 ` Christoph Lameter
2015-11-04 15:39 ` Catalin Marinas
0 siblings, 1 reply; 31+ messages in thread
From: Christoph Lameter @ 2015-11-04 15:28 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, 4 Nov 2015, Catalin Marinas wrote:
> BTW, assuming L1_CACHE_BYTES is 512 (I don't ever see this happening but
> just in theory), we potentially have the same issue. What would save us
> is that INDEX_NODE would match the first "kmalloc-512" cache, so we have
> it pre-populated.
Ok maybe add some BUILD_BUG_ONs to ensure that builds fail until we have
addressed that.
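One possible shape of such a check, purely as a sketch (the exact condition and its placement in mm/slab.c would still need to be decided):

	/* Hypothetical compile-time guard, e.g. in kmem_cache_init(): fail the
	 * build once KMALLOC_MIN_SIZE grows beyond what the bootstrap logic
	 * has been audited for.
	 */
	BUILD_BUG_ON(KMALLOC_MIN_SIZE > 256);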
* [PATCH] arm64: Increase the max granular size
2015-11-04 15:28 ` Christoph Lameter
@ 2015-11-04 15:39 ` Catalin Marinas
2015-11-05 4:31 ` Joonsoo Kim
0 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2015-11-04 15:39 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Nov 04, 2015 at 09:28:34AM -0600, Christoph Lameter wrote:
> On Wed, 4 Nov 2015, Catalin Marinas wrote:
>
> > BTW, assuming L1_CACHE_BYTES is 512 (I don't ever see this happening but
> > just in theory), we potentially have the same issue. What would save us
> > is that INDEX_NODE would match the first "kmalloc-512" cache, so we have
> > it pre-populated.
>
> Ok maybe add some BUILD_BUG_ONs to ensure that builds fail until we have
> addressed that.
A BUILD_BUG_ON should be fine.
Thinking some more, I think if KMALLOC_MIN_SIZE is 128, there is no gain
with off-slab management since the freelist allocation would still be
128 bytes. An alternative to reverting while still having a little
benefit of off-slab for 256 bytes objects (rather than 512 as we would
get with the revert):
diff --git a/mm/slab.c b/mm/slab.c
index 4fcc5dd8d5a6..ac32b4a0f2ec 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2212,8 +2212,8 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
* it too early on. Always use on-slab management when
* SLAB_NOLEAKTRACE to avoid recursive calls into kmemleak)
*/
- if ((size >= (PAGE_SIZE >> 5)) && !slab_early_init &&
- !(flags & SLAB_NOLEAKTRACE))
+ if ((size >= (PAGE_SIZE >> 5)) && (size > KMALLOC_MIN_SIZE) &&
+ !slab_early_init && !(flags & SLAB_NOLEAKTRACE))
/*
* Size is large, assume best to place the slab management obj
* off-slab (should allow better packing of objs).
Whichever you prefer.
--
Catalin
* [PATCH] arm64: Increase the max granular size
2015-11-04 15:39 ` Catalin Marinas
@ 2015-11-05 4:31 ` Joonsoo Kim
2015-11-05 11:50 ` [PATCH] mm: slab: Only move management objects off-slab for sizes larger than KMALLOC_MIN_SIZE Catalin Marinas
0 siblings, 1 reply; 31+ messages in thread
From: Joonsoo Kim @ 2015-11-05 4:31 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Nov 04, 2015 at 03:39:10PM +0000, Catalin Marinas wrote:
> On Wed, Nov 04, 2015 at 09:28:34AM -0600, Christoph Lameter wrote:
> > On Wed, 4 Nov 2015, Catalin Marinas wrote:
> >
> > > BTW, assuming L1_CACHE_BYTES is 512 (I don't ever see this happening but
> > > just in theory), we potentially have the same issue. What would save us
> > > is that INDEX_NODE would match the first "kmalloc-512" cache, so we have
> > > it pre-populated.
> >
> > Ok maybe add some BUILD_BUG_ONs to ensure that builds fail until we have
> > addressed that.
>
> A BUILD_BUG_ON should be fine.
>
> Thinking some more, I think if KMALLOC_MIN_SIZE is 128, there is no gain
> with off-slab management since the freelist allocation would still be
> 128 bytes. An alternative to reverting while still having a little
> benefit of off-slab for 256 bytes objects (rather than 512 as we would
> get with the revert):
>
> diff --git a/mm/slab.c b/mm/slab.c
> index 4fcc5dd8d5a6..ac32b4a0f2ec 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -2212,8 +2212,8 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
> * it too early on. Always use on-slab management when
> * SLAB_NOLEAKTRACE to avoid recursive calls into kmemleak)
> */
> - if ((size >= (PAGE_SIZE >> 5)) && !slab_early_init &&
> - !(flags & SLAB_NOLEAKTRACE))
> + if ((size >= (PAGE_SIZE >> 5)) && (size > KMALLOC_MIN_SIZE) &&
> + !slab_early_init && !(flags & SLAB_NOLEAKTRACE))
> /*
> * Size is large, assume best to place the slab management obj
> * off-slab (should allow better packing of objs).
>
> Whichever you prefer.
Hello,
I prefer this simple way. It looks like it can solve the issue on
any arbitrary configuration.
Thanks.
* [PATCH] arm64: Increase the max granular size
2015-09-22 17:59 [PATCH] arm64: Increase the max granular size Robert Richter
` (3 preceding siblings ...)
2015-10-28 19:09 ` Catalin Marinas
@ 2015-11-05 4:40 ` Joonsoo Kim
2015-11-05 10:32 ` Catalin Marinas
4 siblings, 1 reply; 31+ messages in thread
From: Joonsoo Kim @ 2015-11-05 4:40 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Sep 22, 2015 at 07:59:48PM +0200, Robert Richter wrote:
> From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
>
> Increase the standard cacheline size to avoid having locks in the same
> cacheline.
>
> Cavium's ThunderX core implements cache lines of 128 byte size. With
> the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
> share the same cache line, leading to a performance degradation.
> Increasing the size fixes that.
Besides the slab-side bug, I don't think this argument is valid.
Even if this change is applied, statically allocated spinlocks could
still share the same cache line.
If two locks should not share the same cache line, you'd better use a
compiler attribute such as ____cacheline_aligned_in_smp in the
appropriate place.
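A minimal sketch of that attribute-based approach (the lock names are invented for illustration):

	#include <linux/cache.h>
	#include <linux/spinlock.h>

	/* Each annotated lock is aligned to SMP_CACHE_BYTES, so no two of
	 * them start in the same software-assumed cache line.  Both locks
	 * still need spin_lock_init() before first use.
	 */
	static spinlock_t foo_lock ____cacheline_aligned_in_smp;
	static spinlock_t bar_lock ____cacheline_aligned_in_smp;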
Thanks.
>
> Increasing the size has no negative impact on cache invalidation on
> systems with a smaller cache line. There is an impact on memory usage,
> but that's not too important for arm64 use cases.
>
> Signed-off-by: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> Signed-off-by: Robert Richter <rrichter@cavium.com>
> ---
> arch/arm64/include/asm/cache.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
> index bde449936e2f..5082b30bc2c0 100644
> --- a/arch/arm64/include/asm/cache.h
> +++ b/arch/arm64/include/asm/cache.h
> @@ -18,7 +18,7 @@
>
> #include <asm/cachetype.h>
>
> -#define L1_CACHE_SHIFT 6
> +#define L1_CACHE_SHIFT 7
> #define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)
>
> /*
> --
> 2.1.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
* [PATCH] arm64: Increase the max granular size
2015-11-05 4:40 ` [PATCH] arm64: Increase the max granular size Joonsoo Kim
@ 2015-11-05 10:32 ` Catalin Marinas
2015-11-05 11:45 ` Joonsoo Kim
0 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2015-11-05 10:32 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Nov 05, 2015 at 01:40:14PM +0900, Joonsoo Kim wrote:
> On Tue, Sep 22, 2015 at 07:59:48PM +0200, Robert Richter wrote:
> > From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
> >
> > Increase the standard cacheline size to avoid having locks in the same
> > cacheline.
> >
> > Cavium's ThunderX core implements cache lines of 128 byte size. With
> > the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
> > share the same cache line, leading to a performance degradation.
> > Increasing the size fixes that.
>
> Besides the slab-side bug, I don't think this argument is valid.
> Even if this change is applied, statically allocated spinlocks could
> still share the same cache line.
The benchmarks didn't show any difference with or without this patch
applied. What convinced me to apply it was this email:
http://lkml.kernel.org/g/CAOZdJXUiRMAguDV+HEJqPg57MyBNqEcTyaH+ya=U93NHb-pdJA@mail.gmail.com
On ARM we have a notion of cache writeback granule (CWG) which tells us
"the maximum size of memory that can be overwritten as a result of the
eviction of a cache entry that has had a memory location in it
modified". What we actually needed was ARCH_DMA_MINALIGN to be 128
(currently defined to the L1_CACHE_BYTES value). However, this wouldn't
have fixed the KMALLOC_MIN_SIZE, unless we somehow generate different
kmalloc_caches[] and kmalloc_dma_caches[] and probably introduce a
size_dma_index[].
> If two locks should not share the same cache line, you'd better use a
> compiler attribute such as ____cacheline_aligned_in_smp in the
> appropriate place.
We could decouple SMP_CACHE_BYTES from L1_CACHE_BYTES but see above for
the other issue we had to solve.
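A rough sketch of that decoupling, purely illustrative (these are not the definitions or values arm64 actually uses):

	/* Keep DMA buffer alignment at the architectural worst-case CWG while
	 * letting structure padding (SMP_CACHE_BYTES) follow a smaller line.
	 */
	#define ARCH_DMA_MINALIGN	128	/* worst-case Cache Writeback Granule */
	#define SMP_CACHE_BYTES		64	/* padding/false-sharing granularity  */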
--
Catalin
* [PATCH] arm64: Increase the max granular size
2015-11-05 10:32 ` Catalin Marinas
@ 2015-11-05 11:45 ` Joonsoo Kim
2015-11-05 12:17 ` Catalin Marinas
0 siblings, 1 reply; 31+ messages in thread
From: Joonsoo Kim @ 2015-11-05 11:45 UTC (permalink / raw)
To: linux-arm-kernel
2015-11-05 19:32 GMT+09:00 Catalin Marinas <catalin.marinas@arm.com>:
> On Thu, Nov 05, 2015 at 01:40:14PM +0900, Joonsoo Kim wrote:
>> On Tue, Sep 22, 2015 at 07:59:48PM +0200, Robert Richter wrote:
>> > From: Tirumalesh Chalamarla <tchalamarla@cavium.com>
>> >
>> > Increase the standard cacheline size to avoid having locks in the same
>> > cacheline.
>> >
>> > Cavium's ThunderX core implements cache lines of 128 byte size. With
>> > the current granular size of 64 bytes (L1_CACHE_SHIFT=6), two locks could
>> > share the same cache line, leading to a performance degradation.
>> > Increasing the size fixes that.
>>
>> Besides the slab-side bug, I don't think this argument is valid.
>> Even if this change is applied, statically allocated spinlocks could
>> still share the same cache line.
>
> The benchmarks didn't show any difference with or without this patch
> applied. What convinced me to apply it was this email:
>
> http://lkml.kernel.org/g/CAOZdJXUiRMAguDV+HEJqPg57MyBNqEcTyaH+ya=U93NHb-pdJA@mail.gmail.com
Okay.
> On ARM we have a notion of cache writeback granule (CWG) which tells us
> "the maximum size of memory that can be overwritten as a result of the
> eviction of a cache entry that has had a memory location in it
> modified". What we actually needed was ARCH_DMA_MINALIGN to be 128
> (currently defined to the L1_CACHE_BYTES value). However, this wouldn't
> have fixed the KMALLOC_MIN_SIZE, unless we somehow generate different
> kmalloc_caches[] and kmalloc_dma_caches[] and probably introduce a
> size_dma_index[].
If we create separate kmalloc caches for dma, can we apply this alignment
requirement only to dma caches? I guess some memory allocation request
that will be used for DMA operation doesn't specify GFP_DMA because
it doesn't want the memory from ZONE_DMA. In this case, we should apply
dma alignment requirement to all types of caches.
In fact, I know someone who tried to implement this alignment separation
as you mentioned, to reduce memory waste. I first suggested this solution
to him, but now I realize that it isn't possible because of the above reason.
Am I missing something?
If it isn't possible, is there another way to reduce memory waste due to
increase of dma alignment requirement in arm64?
Thanks.
* [PATCH] mm: slab: Only move management objects off-slab for sizes larger than KMALLOC_MIN_SIZE
2015-11-05 4:31 ` Joonsoo Kim
@ 2015-11-05 11:50 ` Catalin Marinas
2015-11-05 13:31 ` Andrew Morton
2015-11-05 17:39 ` Christoph Lameter
0 siblings, 2 replies; 31+ messages in thread
From: Catalin Marinas @ 2015-11-05 11:50 UTC (permalink / raw)
To: linux-arm-kernel
Commit 8fc9cf420b36 ("slab: make more slab management structure off the
slab") enables off-slab management objects for sizes starting with
PAGE_SIZE >> 5. This means 128 bytes for a 4KB page configuration.
However, on systems with a KMALLOC_MIN_SIZE of 128 (arm64 in 4.4), such
optimisation does not make sense since the slab management allocation
would take 128 bytes anyway (even though freelist_size is 32) with the
additional overhead of another allocation.
This patch introduces an OFF_SLAB_MIN_SIZE macro which takes
KMALLOC_MIN_SIZE into account. It also solves a slab bug on arm64 where
the first kmalloc_cache to be initialised after slab_early_init = 0,
"kmalloc-128", fails to allocate off-slab management objects from the
same "kmalloc-128" cache.
Fixes: 8fc9cf420b36 ("slab: make more slab management structure off the slab")
Cc: <stable@vger.kernel.org> # 3.15+
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
mm/slab.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mm/slab.c b/mm/slab.c
index 4fcc5dd8d5a6..419b649b395e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -282,6 +282,7 @@ static void kmem_cache_node_init(struct kmem_cache_node *parent)
#define CFLGS_OFF_SLAB (0x80000000UL)
#define OFF_SLAB(x) ((x)->flags & CFLGS_OFF_SLAB)
+#define OFF_SLAB_MIN_SIZE (max_t(size_t, PAGE_SIZE >> 5, KMALLOC_MIN_SIZE + 1))
#define BATCHREFILL_LIMIT 16
/*
@@ -2212,7 +2213,7 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
* it too early on. Always use on-slab management when
* SLAB_NOLEAKTRACE to avoid recursive calls into kmemleak)
*/
- if ((size >= (PAGE_SIZE >> 5)) && !slab_early_init &&
+ if (size >= OFF_SLAB_MIN_SIZE && !slab_early_init &&
!(flags & SLAB_NOLEAKTRACE))
/*
* Size is large, assume best to place the slab management obj
@@ -2276,7 +2277,7 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
/*
* This is a possibility for one of the kmalloc_{dma,}_caches.
* But since we go off slab only for object size greater than
- * PAGE_SIZE/8, and kmalloc_{dma,}_caches get created
+ * OFF_SLAB_MIN_SIZE, and kmalloc_{dma,}_caches get created
* in ascending order,this should not happen at all.
* But leave a BUG_ON for some lucky dude.
*/
* [PATCH] arm64: Increase the max granular size
2015-11-05 11:45 ` Joonsoo Kim
@ 2015-11-05 12:17 ` Catalin Marinas
2015-11-09 7:41 ` Joonsoo Kim
0 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2015-11-05 12:17 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Nov 05, 2015 at 08:45:08PM +0900, Joonsoo Kim wrote:
> 2015-11-05 19:32 GMT+09:00 Catalin Marinas <catalin.marinas@arm.com>:
> > On ARM we have a notion of cache writeback granule (CWG) which tells us
> > "the maximum size of memory that can be overwritten as a result of the
> > eviction of a cache entry that has had a memory location in it
> > modified". What we actually needed was ARCH_DMA_MINALIGN to be 128
> > (currently defined to the L1_CACHE_BYTES value). However, this wouldn't
> > have fixed the KMALLOC_MIN_SIZE, unless we somehow generate different
> > kmalloc_caches[] and kmalloc_dma_caches[] and probably introduce a
> > size_dma_index[].
>
> If we create separate kmalloc caches for dma, can we apply this alignment
> requirement only to dma caches? I guess some memory allocation request
> that will be used for DMA operation doesn't specify GFP_DMA because
> it doesn't want the memory from ZONE_DMA. In this case, we should apply
> dma alignment requirement to all types of caches.
I think you are right. While something like swiotlb (through the
streaming DMA API) could do bounce buffering and allocate one from
ZONE_DMA, this is not guaranteed if the buffer physical address happens
to match the dma_mask. Similarly with an IOMMU, no bouncing happens but
the alignment is still required.
> If it isn't possible, is there another way to reduce memory waste due to
> increase of dma alignment requirement in arm64?
I first need to see how significant the impact is (especially for
embedded/mobile platforms).
An alternative is to leave L1_CACHE_BYTES at 64 by default but warn if
the CWG is 128 on systems with non-coherent DMA (and hope that we won't
have any). It's not really a fix, rather an assumption. Anyway, I would
very much like the same kernel image for all platforms and no Kconfig
entry for the cache line size, but if the waste is significant, we may
add one for some specific builds (like a mobile phone; or such vendors
could patch the kernel themselves).
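The "warn if the CWG is 128" alternative could look roughly like this; the coherency check is a placeholder, since knowing whether any non-coherent DMA master exists is itself part of the problem:

	/* Hypothetical boot-time sanity check, not an actual patch. */
	if (!dma_coherent && cache_line_size() > L1_CACHE_BYTES)
		pr_warn("CWG (%d) exceeds L1_CACHE_BYTES (%d) with non-coherent DMA\n",
			cache_line_size(), L1_CACHE_BYTES);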
--
Catalin
* [PATCH] mm: slab: Only move management objects off-slab for sizes larger than KMALLOC_MIN_SIZE
2015-11-05 11:50 ` [PATCH] mm: slab: Only move management objects off-slab for sizes larger than KMALLOC_MIN_SIZE Catalin Marinas
@ 2015-11-05 13:31 ` Andrew Morton
2015-11-05 16:08 ` Catalin Marinas
2015-11-05 17:39 ` Christoph Lameter
1 sibling, 1 reply; 31+ messages in thread
From: Andrew Morton @ 2015-11-05 13:31 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 5 Nov 2015 11:50:35 +0000 Catalin Marinas <catalin.marinas@arm.com> wrote:
> Commit 8fc9cf420b36 ("slab: make more slab management structure off the
> slab") enables off-slab management objects for sizes starting with
> PAGE_SIZE >> 5. This means 128 bytes for a 4KB page configuration.
> However, on systems with a KMALLOC_MIN_SIZE of 128 (arm64 in 4.4), such
> optimisation does not make sense since the slab management allocation
> would take 128 bytes anyway (even though freelist_size is 32) with the
> additional overhead of another allocation.
>
> This patch introduces an OFF_SLAB_MIN_SIZE macro which takes
> KMALLOC_MIN_SIZE into account. It also solves a slab bug on arm64 where
> the first kmalloc_cache to be initialised after slab_early_init = 0,
> "kmalloc-128", fails to allocate off-slab management objects from the
> same "kmalloc-128" cache.
That all seems to be quite minor stuff.
> Fixes: 8fc9cf420b36 ("slab: make more slab management structure off the slab")
> Cc: <stable@vger.kernel.org> # 3.15+
Yet you believe the fix should be backported.
So, the usual refrain: when fixing a bug, please describe the end-user
visible effects of that bug.
* [PATCH] mm: slab: Only move management objects off-slab for sizes larger than KMALLOC_MIN_SIZE
2015-11-05 13:31 ` Andrew Morton
@ 2015-11-05 16:08 ` Catalin Marinas
2015-11-06 13:00 ` Geert Uytterhoeven
0 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2015-11-05 16:08 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Nov 05, 2015 at 05:31:39AM -0800, Andrew Morton wrote:
> On Thu, 5 Nov 2015 11:50:35 +0000 Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> > Commit 8fc9cf420b36 ("slab: make more slab management structure off the
> > slab") enables off-slab management objects for sizes starting with
> > PAGE_SIZE >> 5. This means 128 bytes for a 4KB page configuration.
> > However, on systems with a KMALLOC_MIN_SIZE of 128 (arm64 in 4.4), such
> > optimisation does not make sense since the slab management allocation
> > would take 128 bytes anyway (even though freelist_size is 32) with the
> > additional overhead of another allocation.
> >
> > This patch introduces an OFF_SLAB_MIN_SIZE macro which takes
> > KMALLOC_MIN_SIZE into account. It also solves a slab bug on arm64 where
> > the first kmalloc_cache to be initialised after slab_early_init = 0,
> > "kmalloc-128", fails to allocate off-slab management objects from the
> > same "kmalloc-128" cache.
>
> That all seems to be quite minor stuff.
Apart from "it also solves a bug on arm64...". But I agree, the initial
commit log doesn't give any justification for cc stable.
> > Fixes: 8fc9cf420b36 ("slab: make more slab management structure off the slab")
> > Cc: <stable@vger.kernel.org> # 3.15+
>
> Yet you believe the fix should be backported.
>
> So, the usual refrain: when fixing a bug, please describe the end-user
> visible effects of that bug.
What about the following instead (unless you prefer this slightly more
intrusive fix: http://article.gmane.org/gmane.linux.ports.sh.devel/50303):
------------------8<--------------------------
From fda8f306b6941f4ddbefcbcfaa59fedef4a679a3 Mon Sep 17 00:00:00 2001
From: Catalin Marinas <catalin.marinas@arm.com>
Date: Thu, 5 Nov 2015 11:14:48 +0000
Subject: [PATCH] mm: slab: Only move management objects off-slab for sizes
larger than KMALLOC_MIN_SIZE
On systems with a KMALLOC_MIN_SIZE of 128 (arm64, some mips and powerpc
configurations defining ARCH_DMA_MINALIGN to 128), the first
kmalloc_caches[] entry to be initialised after slab_early_init = 0 is
"kmalloc-128" with index 7. Depending on the debug kernel configuration,
sizeof(struct kmem_cache) can be larger than 128 resulting in an
INDEX_NODE of 8.
Commit 8fc9cf420b36 ("slab: make more slab management structure off the
slab") enables off-slab management objects for sizes starting with
PAGE_SIZE >> 5 (128 bytes for a 4KB page configuration) and the creation
of the "kmalloc-128" cache would try to place the management objects
off-slab. However, since KMALLOC_MIN_SIZE is already 128 and
freelist_size == 32 in __kmem_cache_create(),
kmalloc_slab(freelist_size) returns NULL (kmalloc_caches[7] not
populated yet). This triggers the following bug on arm64:
[ 0.000000] kernel BUG at /work/Linux/linux-2.6-aarch64/mm/slab.c:2283!
[ 0.000000] Internal error: Oops - BUG: 0 [#1] SMP
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.3.0-rc4+ #540
[ 0.000000] Hardware name: Juno (DT)
[ 0.000000] task: ffffffc0006962b0 ti: ffffffc00068c000 task.ti: ffffffc00068c000
[ 0.000000] PC is at __kmem_cache_create+0x21c/0x280
[ 0.000000] LR is at __kmem_cache_create+0x210/0x280
[...]
[ 0.000000] Call trace:
[ 0.000000] [<ffffffc000154948>] __kmem_cache_create+0x21c/0x280
[ 0.000000] [<ffffffc000652da4>] create_boot_cache+0x48/0x80
[ 0.000000] [<ffffffc000652e2c>] create_kmalloc_cache+0x50/0x88
[ 0.000000] [<ffffffc000652f14>] create_kmalloc_caches+0x4c/0xf4
[ 0.000000] [<ffffffc000654a9c>] kmem_cache_init+0x100/0x118
[ 0.000000] [<ffffffc0006447d4>] start_kernel+0x214/0x33c
This patch introduces an OFF_SLAB_MIN_SIZE definition to avoid off-slab
management objects for sizes equal to or smaller than KMALLOC_MIN_SIZE.
Fixes: 8fc9cf420b36 ("slab: make more slab management structure off the slab")
Cc: <stable@vger.kernel.org> # 3.15+
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
mm/slab.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mm/slab.c b/mm/slab.c
index 4fcc5dd8d5a6..419b649b395e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -282,6 +282,7 @@ static void kmem_cache_node_init(struct kmem_cache_node *parent)
#define CFLGS_OFF_SLAB (0x80000000UL)
#define OFF_SLAB(x) ((x)->flags & CFLGS_OFF_SLAB)
+#define OFF_SLAB_MIN_SIZE (max_t(size_t, PAGE_SIZE >> 5, KMALLOC_MIN_SIZE + 1))
#define BATCHREFILL_LIMIT 16
/*
@@ -2212,7 +2213,7 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
* it too early on. Always use on-slab management when
* SLAB_NOLEAKTRACE to avoid recursive calls into kmemleak)
*/
- if ((size >= (PAGE_SIZE >> 5)) && !slab_early_init &&
+ if (size >= OFF_SLAB_MIN_SIZE && !slab_early_init &&
!(flags & SLAB_NOLEAKTRACE))
/*
* Size is large, assume best to place the slab management obj
@@ -2276,7 +2277,7 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
/*
* This is a possibility for one of the kmalloc_{dma,}_caches.
* But since we go off slab only for object size greater than
- * PAGE_SIZE/8, and kmalloc_{dma,}_caches get created
+ * OFF_SLAB_MIN_SIZE, and kmalloc_{dma,}_caches get created
* in ascending order,this should not happen at all.
* But leave a BUG_ON for some lucky dude.
*/
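To make the threshold change above concrete, here is a small standalone
sketch (illustrative only, not kernel code; a plain MAX macro stands in
for max_t) of why, with a 4KB page and a KMALLOC_MIN_SIZE of 128, the old
"size >= PAGE_SIZE >> 5" check sent kmalloc-128 off-slab while the new
OFF_SLAB_MIN_SIZE check keeps it on-slab:

#include <stdio.h>

#define PAGE_SIZE         4096UL
#define KMALLOC_MIN_SIZE   128UL  /* arm64 with L1_CACHE_SHIFT == 7 */
#define MAX(a, b)         ((a) > (b) ? (a) : (b))

#define OLD_THRESHOLD     (PAGE_SIZE >> 5)                           /* 128 */
#define OFF_SLAB_MIN_SIZE MAX(PAGE_SIZE >> 5, KMALLOC_MIN_SIZE + 1)  /* 129 */

int main(void)
{
	unsigned long size = 128;  /* the "kmalloc-128" cache */

	/* Old check: 128 >= 128, so the freelist would go off-slab, which
	 * needs kmalloc-128 itself -- not created yet, hence the BUG. */
	printf("old check: %s\n", size >= OLD_THRESHOLD ? "off-slab" : "on-slab");

	/* New check: 128 < 129, so the freelist stays on-slab. */
	printf("new check: %s\n", size >= OFF_SLAB_MIN_SIZE ? "off-slab" : "on-slab");
	return 0;
}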
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH] mm: slab: Only move management objects off-slab for sizes larger than KMALLOC_MIN_SIZE
2015-11-05 11:50 ` [PATCH] mm: slab: Only move management objects off-slab for sizes larger than KMALLOC_MIN_SIZE Catalin Marinas
2015-11-05 13:31 ` Andrew Morton
@ 2015-11-05 17:39 ` Christoph Lameter
1 sibling, 0 replies; 31+ messages in thread
From: Christoph Lameter @ 2015-11-05 17:39 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 5 Nov 2015, Catalin Marinas wrote:
> This patch introduces an OFF_SLAB_MIN_SIZE macro which takes
> KMALLOC_MIN_SIZE into account. It also solves a slab bug on arm64 where
> the first kmalloc_cache to be initialised after slab_early_init = 0,
> "kmalloc-128", fails to allocate off-slab management objects from the
> same "kmalloc-128" cache.
Acked-by: Christoph Lameter <cl@linux.com>
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH] mm: slab: Only move management objects off-slab for sizes larger than KMALLOC_MIN_SIZE
2015-11-05 16:08 ` Catalin Marinas
@ 2015-11-06 13:00 ` Geert Uytterhoeven
0 siblings, 0 replies; 31+ messages in thread
From: Geert Uytterhoeven @ 2015-11-06 13:00 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Nov 5, 2015 at 5:08 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> From fda8f306b6941f4ddbefcbcfaa59fedef4a679a3 Mon Sep 17 00:00:00 2001
> From: Catalin Marinas <catalin.marinas@arm.com>
> Date: Thu, 5 Nov 2015 11:14:48 +0000
> Subject: [PATCH] mm: slab: Only move management objects off-slab for sizes
> larger than KMALLOC_MIN_SIZE
>
> On systems with a KMALLOC_MIN_SIZE of 128 (arm64, some mips and powerpc
> configurations defining ARCH_DMA_MINALIGN to 128), the first
> kmalloc_caches[] entry to be initialised after slab_early_init = 0 is
> "kmalloc-128" with index 7. Depending on the debug kernel configuration,
> sizeof(struct kmem_cache) can be larger than 128 resulting in an
> INDEX_NODE of 8.
>
> Commit 8fc9cf420b36 ("slab: make more slab management structure off the
> slab") enables off-slab management objects for sizes starting with
> PAGE_SIZE >> 5 (128 bytes for a 4KB page configuration) and the creation
> of the "kmalloc-128" cache would try to place the management objects
> off-slab. However, since KMALLOC_MIN_SIZE is already 128 and
> freelist_size == 32 in __kmem_cache_create(),
> kmalloc_slab(freelist_size) returns NULL (kmalloc_caches[7] not
> populated yet). This triggers the following bug on arm64:
>
> [ 0.000000] kernel BUG at /work/Linux/linux-2.6-aarch64/mm/slab.c:2283!
> [ 0.000000] Internal error: Oops - BUG: 0 [#1] SMP
> [ 0.000000] Modules linked in:
> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.3.0-rc4+ #540
> [ 0.000000] Hardware name: Juno (DT)
> [ 0.000000] task: ffffffc0006962b0 ti: ffffffc00068c000 task.ti: ffffffc00068c000
> [ 0.000000] PC is at __kmem_cache_create+0x21c/0x280
> [ 0.000000] LR is at __kmem_cache_create+0x210/0x280
> [...]
> [ 0.000000] Call trace:
> [ 0.000000] [<ffffffc000154948>] __kmem_cache_create+0x21c/0x280
> [ 0.000000] [<ffffffc000652da4>] create_boot_cache+0x48/0x80
> [ 0.000000] [<ffffffc000652e2c>] create_kmalloc_cache+0x50/0x88
> [ 0.000000] [<ffffffc000652f14>] create_kmalloc_caches+0x4c/0xf4
> [ 0.000000] [<ffffffc000654a9c>] kmem_cache_init+0x100/0x118
> [ 0.000000] [<ffffffc0006447d4>] start_kernel+0x214/0x33c
>
> This patch introduces an OFF_SLAB_MIN_SIZE definition to avoid off-slab
> management objects for sizes equal to or smaller than KMALLOC_MIN_SIZE.
>
>
> Fixes: 8fc9cf420b36 ("slab: make more slab management structure off the slab")
> Cc: <stable@vger.kernel.org> # 3.15+
> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Thanks a lot!
For the record (the fix is already upstream):
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH] arm64: Increase the max granular size
2015-11-05 12:17 ` Catalin Marinas
@ 2015-11-09 7:41 ` Joonsoo Kim
2015-11-09 18:36 ` Catalin Marinas
0 siblings, 1 reply; 31+ messages in thread
From: Joonsoo Kim @ 2015-11-09 7:41 UTC (permalink / raw)
To: linux-arm-kernel
2015-11-05 21:17 GMT+09:00 Catalin Marinas <catalin.marinas@arm.com>:
> On Thu, Nov 05, 2015 at 08:45:08PM +0900, Joonsoo Kim wrote:
>> 2015-11-05 19:32 GMT+09:00 Catalin Marinas <catalin.marinas@arm.com>:
>> > On ARM we have a notion of cache writeback granule (CWG) which tells us
>> > "the maximum size of memory that can be overwritten as a result of the
>> > eviction of a cache entry that has had a memory location in it
>> > modified". What we actually needed was ARCH_DMA_MINALIGN to be 128
>> > (currently defined to the L1_CACHE_BYTES value). However, this wouldn't
>> > have fixed the KMALLOC_MIN_SIZE, unless we somehow generate different
>> > kmalloc_caches[] and kmalloc_dma_caches[] and probably introduce a
>> > size_dma_index[].
>>
>> If we create separate kmalloc caches for dma, can we apply this alignment
>> requirement only to dma caches? I guess some memory allocation request
>> that will be used for DMA operation doesn't specify GFP_DMA because
>> it doesn't want the memory from ZONE_DMA. In this case, we should apply
>> dma alignment requirement to all types of caches.
>
> I think you are right. While something like swiotlb (through the
> streaming DMA API) could do bounce buffering and allocate one from
> ZONE_DMA, this is not guaranteed if the buffer physical address happens
> to match the dma_mask. Similarly with an IOMMU, no bouncing happens but
> the alignment is still required.
>
>> If it isn't possible, is there another way to reduce memory waste due to
>> increase of dma alignment requirement in arm64?
>
> I first need to see how significant the impact is (especially for
> embedded/mobiles platforms).
I don't have any ARM64 device. What I have is just one report
about slab usage from one of our developers.
The report shows slab usage just after Android boot completes
on ARM64.
Total slab usage: 90 MB
kmalloc usage: 25 MB
kmalloc (<=64) usage: 7 MB
This was presumably measured without slab_nomerge, so some
kmem_cache usage may have been merged into the kmalloc (<=64) figure.
Anyway, if ARM64 increases L1_CACHE_BYTES to 128, roughly
7 MB would be wasted. I don't know how this picture varies at
runtime, but even as boot-time overhead, 7 MB looks large to me.
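A rough sketch of where that "roughly 7 MB" figure comes from, under the
assumption that everything currently served from kmalloc-64 or smaller
would move to a 128-byte minimum object size (at least doubling that
portion of the usage); this is an illustration of the reasoning, not a
measurement:

#include <stdio.h>

int main(void)
{
	double usage_le64_mb = 7.0;  /* reported kmalloc (<=64) usage */

	/*
	 * With KMALLOC_MIN_SIZE raised to 128 there is no cache smaller
	 * than 128 bytes, so every object that used to fit in a 64-byte
	 * (or smaller) slot now occupies at least 128 bytes -- at least
	 * a 2x footprint for that part of the slab usage.
	 */
	double extra_mb = usage_le64_mb * (128.0 / 64.0) - usage_le64_mb;

	printf("estimated extra usage: at least ~%.0f MB\n", extra_mb);  /* ~7 MB */
	return 0;
}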
> An alternative is to leave L1_CACHE_BYTES to 64 by default but warn if
> the CWG is 128 on systems with non-coherent DMA (and hope that we won't
> have any). It's not really fix, rather an assumption. Anyway I would
> very much like the same kernel image for all platforms and no Kconfig
> entry for the cache line size but if the waste is significant, we may
> add one for some specific builds (like a mobile phone; or such vendors
> could patch the kernel themselves).
Okay. In fact, I'm not very familiar with Android devices, so it'd be
better to ask some other Android developer how significant the impact
is on Android or other mobile devices.
Thanks.
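As a side note on the cache writeback granule mentioned in the quoted
text above: the CWG is reported by the CTR_EL0 register. Below is a
minimal decoding sketch, assuming the CTR_EL0 layout described in the
ARM ARM (CWG in bits [27:24], encoded as log2 of the granule size in
4-byte words); the helper name cwg_bytes is made up for illustration
and is not the kernel's implementation:

#include <stdio.h>
#include <stdint.h>

#define CTR_CWG_SHIFT	24
#define CTR_CWG_MASK	0xfu

/* Hypothetical helper: CWG size in bytes from a raw CTR_EL0 value. */
static unsigned int cwg_bytes(uint32_t ctr)
{
	unsigned int cwg = (ctr >> CTR_CWG_SHIFT) & CTR_CWG_MASK;

	/* CWG == 0 means "not provided"; callers must assume a maximum. */
	return cwg ? 4u << cwg : 0;
}

int main(void)
{
	printf("%u\n", cwg_bytes(0x5u << CTR_CWG_SHIFT));	/* 128, e.g. a 128-byte line */
	printf("%u\n", cwg_bytes(0x4u << CTR_CWG_SHIFT));	/* 64 */
	return 0;
}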
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH] arm64: Increase the max granular size
2015-11-09 7:41 ` Joonsoo Kim
@ 2015-11-09 18:36 ` Catalin Marinas
2015-11-10 0:19 ` Joonsoo Kim
0 siblings, 1 reply; 31+ messages in thread
From: Catalin Marinas @ 2015-11-09 18:36 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Nov 09, 2015 at 04:41:58PM +0900, Joonsoo Kim wrote:
> 2015-11-05 21:17 GMT+09:00 Catalin Marinas <catalin.marinas@arm.com>:
> > On Thu, Nov 05, 2015 at 08:45:08PM +0900, Joonsoo Kim wrote:
> >> If it isn't possible, is there another way to reduce memory waste due to
> >> increase of dma alignment requirement in arm64?
> >
> > I first need to see how significant the impact is (especially for
> > embedded/mobiles platforms).
>
> I don't have any ARM64 device. What I have is just one report
> about slab usage from one of our developers.
>
> The report shows slab usage just after Android boot completes
> on ARM64.
>
> Total slab usage: 90 MB
> kmalloc usage: 25 MB
> kmalloc (<=64) usage: 7 MB
>
> This was presumably measured without slab_nomerge, so some
> kmem_cache usage may have been merged into the kmalloc (<=64) figure.
>
> Anyway, if ARM64 increases L1_CACHE_BYTES to 128, roughly
> 7 MB would be wasted. I don't know how this picture varies at
> runtime, but even as boot-time overhead, 7 MB looks large to me.
7MB is considerable but I guess it wouldn't be all wasted with
L1_CACHE_BYTES == 128, maybe half or slightly over. It would be good to
know the other kmalloc caches, maybe up to 256.
I don't have an Android filesystem but I just tried to boot Arch
(aarch64). Immediately after boot and slab_nomerge, with 128 L1 I get:
kmalloc-128: 6624
kmalloc-256: 1488
With L1 64, I get:
kmalloc-64: 5760
kmalloc-128: 1152
kmalloc-192: 1155
kmalloc-256: 320
So that's about 1.2MB vs 0.8MB. The ratio is 3:2, though I'm not sure it
will stay the same as the slab usage increases.
It would be good to get more numbers, we could add a Kconfig option just
for specific builds while keeping the default to 128.
--
Catalin
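A quick arithmetic check of the "about 1.2MB vs 0.8MB" estimate above,
simply multiplying the reported object counts by their cache object
sizes (this assumes the listed counts are allocated objects and ignores
per-slab overhead):

#include <stdio.h>

int main(void)
{
	unsigned long bytes_128 = 6624UL * 128 + 1488UL * 256;
	unsigned long bytes_64  = 5760UL * 64 + 1152UL * 128 +
				  1155UL * 192 + 320UL * 256;

	printf("L1=128: %lu bytes (~%.2f MB)\n", bytes_128, bytes_128 / 1048576.0);
	printf("L1=64:  %lu bytes (~%.2f MB)\n", bytes_64, bytes_64 / 1048576.0);
	printf("ratio:  %.2f\n", (double)bytes_128 / bytes_64);  /* ~1.5, i.e. 3:2 */
	return 0;
}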
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH] arm64: Increase the max granular size
2015-11-09 18:36 ` Catalin Marinas
@ 2015-11-10 0:19 ` Joonsoo Kim
0 siblings, 0 replies; 31+ messages in thread
From: Joonsoo Kim @ 2015-11-10 0:19 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Nov 09, 2015 at 06:36:09PM +0000, Catalin Marinas wrote:
> On Mon, Nov 09, 2015 at 04:41:58PM +0900, Joonsoo Kim wrote:
> > 2015-11-05 21:17 GMT+09:00 Catalin Marinas <catalin.marinas@arm.com>:
> > > On Thu, Nov 05, 2015 at 08:45:08PM +0900, Joonsoo Kim wrote:
> > >> If it isn't possible, is there another way to reduce memory waste due to
> > >> increase of dma alignment requirement in arm64?
> > >
> > > I first need to see how significant the impact is (especially for
> > > embedded/mobiles platforms).
> >
> > I don't have any ARM64 device. What I have is just one report
> > about slab usage from one of our developers.
> >
> > The report shows slab usage just after Android boot completes
> > on ARM64.
> >
> > Total slab usage: 90 MB
> > kmalloc usage: 25 MB
> > kmalloc (<=64) usage: 7 MB
> >
> > This was presumably measured without slab_nomerge, so some
> > kmem_cache usage may have been merged into the kmalloc (<=64) figure.
> >
> > Anyway, if ARM64 increases L1_CACHE_BYTES to 128, roughly
> > 7 MB would be wasted. I don't know how this picture varies at
> > runtime, but even as boot-time overhead, 7 MB looks large to me.
>
> 7MB is considerable but I guess it wouldn't be all wasted with
> L1_CACHE_BYTES == 128, maybe half or slightly over. It would be good to
> know the other kmalloc caches, maybe up to 256.
>
> I don't have an Android filesystem but I just tried to boot Arch
> (aarch64). Immediately after boot and slab_nomerge, with 128 L1 I get:
>
> kmalloc-128: 6624
> kmalloc-256: 1488
>
> With L1 64, I get:
>
> kmalloc-64: 5760
> kmalloc-128: 1152
> kmalloc-192: 1155
> kmalloc-256: 320
>
> So that's about 1.2MB vs 0.8MB. The ratio is 3:2, though I'm not sure it
> will stay the same as the slab usage increases.
>
> It would be good to get more numbers, we could add a Kconfig option just
> for specific builds while keeping the default to 128.
Okay.
Thanks.
^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads: [~2015-11-10 0:19 UTC | newest]
Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-22 17:59 [PATCH] arm64: Increase the max granular size Robert Richter
2015-09-22 18:29 ` Will Deacon
2015-09-25 14:45 ` Robert Richter
2015-09-25 16:31 ` Tirumalesh Chalamarla
2015-10-10 17:39 ` Timur Tabi
2015-10-12 9:16 ` Will Deacon
2015-10-16 19:57 ` Timur Tabi
2015-10-28 19:09 ` Catalin Marinas
[not found] ` <CAMuHMdWQygbxMXoOsbwek6DzZcr7J-C23VCK4ubbgUr+zj=giw@mail.gmail.com>
2015-11-03 12:05 ` Catalin Marinas
2015-11-03 14:38 ` Catalin Marinas
2015-11-03 14:55 ` Geert Uytterhoeven
2015-11-03 18:50 ` Catalin Marinas
2015-11-03 23:33 ` Christoph Lameter
2015-11-04 12:36 ` Catalin Marinas
2015-11-04 13:53 ` Christoph Lameter
2015-11-04 14:54 ` Catalin Marinas
2015-11-04 15:28 ` Christoph Lameter
2015-11-04 15:39 ` Catalin Marinas
2015-11-05 4:31 ` Joonsoo Kim
2015-11-05 11:50 ` [PATCH] mm: slab: Only move management objects off-slab for sizes larger than KMALLOC_MIN_SIZE Catalin Marinas
2015-11-05 13:31 ` Andrew Morton
2015-11-05 16:08 ` Catalin Marinas
2015-11-06 13:00 ` Geert Uytterhoeven
2015-11-05 17:39 ` Christoph Lameter
2015-11-05 4:40 ` [PATCH] arm64: Increase the max granular size Joonsoo Kim
2015-11-05 10:32 ` Catalin Marinas
2015-11-05 11:45 ` Joonsoo Kim
2015-11-05 12:17 ` Catalin Marinas
2015-11-09 7:41 ` Joonsoo Kim
2015-11-09 18:36 ` Catalin Marinas
2015-11-10 0:19 ` Joonsoo Kim