From: Michael Ellerman <mpe@ellerman.id.au>
To: Yu Zhao <yuzhao@google.com>, Erhard Furtner <erhard_f@mailbox.org>
Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org,
	David Hildenbrand <david@redhat.com>
Subject: Re: kswapd0: page allocation failure: order:0, mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0 (Kernel v6.5.9, 32bit ppc)
Date: Thu, 06 Jun 2024 22:08:40 +1000
Message-ID: <87r0dap92v.fsf@mail.lhotse>
In-Reply-To: <CAOUHufacbbpS3ghEwsQ-pObttnQk__xo0vjpGWXNq1i-bsuiGw@mail.gmail.com>

Yu Zhao <yuzhao@google.com> writes:
> On Wed, Jun 5, 2024 at 9:12 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> David Hildenbrand <david@redhat.com> writes:
>> > On 01.06.24 08:01, Yu Zhao wrote:
>> >> On Wed, May 15, 2024 at 4:06 PM Yu Zhao <yuzhao@google.com> wrote:
>> ...
>> >>
>> >> Your system has 2GB memory and it uses zswap with zsmalloc (which is
>> >> good since it can allocate from the highmem zone) and zstd/lzo (which
>> >> doesn't matter much). Somehow -- I couldn't figure out why -- it
>> >> splits the 2GB into a 0.25GB DMA zone and a 1.75GB highmem zone:
>> >>
>> >> [    0.000000] Zone ranges:
>> >> [    0.000000]   DMA      [mem 0x0000000000000000-0x000000002fffffff]
>> >> [    0.000000]   Normal   empty
>> >> [    0.000000]   HighMem  [mem 0x0000000030000000-0x000000007fffffff]
>> >
>> > That's really odd. But we are messing with "PowerMac3,6", so I don't
>> > really know what's right or wrong ...
>>
>> The DMA zone exists because 9739ab7eda45 ("powerpc: enable a 30-bit
>> ZONE_DMA for 32-bit pmac") selects it.
>>
>> It's 768MB (not 0.25GB) because it's clamped at max_low_pfn:
>
> Right. (I meant 0.75GB.)
>
>> #ifdef CONFIG_ZONE_DMA
>>         max_zone_pfns[ZONE_DMA] = min(max_low_pfn,
>>                                       1UL << (zone_dma_bits - PAGE_SHIFT));
>> #endif
>>
>> Which comes eventually from CONFIG_LOWMEM_SIZE, which defaults to 768MB.
>
> I see. I grep'ed VMSPLIT which is used on x86 and arm but apparently
> not on powerpc.

Those VMSPLIT configs are nice; on powerpc it's all done manually :}

>> I think it's 768MB because the user:kernel split is 3G:1G, and then the
>> kernel needs some of that 1G virtual space for vmalloc/ioremap/highmem,
>> so it splits it 768M:256M.
>>
>> Then ZONE_NORMAL is empty because it is also limited to max_low_pfn:
>>
>>         max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
>>
>> The rest of RAM is highmem.
>>
>> So I think that's all behaving as expected, but I don't know 32-bit /
>> highmem stuff that well so I could be wrong.
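
As a rough cross-check of the zone arithmetic quoted above, here is a small
Python sketch that reproduces the "Zone ranges" output. It is not kernel code.
It assumes 4K pages (PAGE_SHIFT = 12), RAM starting at physical address 0, and
that max_low_pfn is simply the lowmem size in pages; zone_ranges() is a
made-up helper name.

```python
# Sketch of the zone-limit arithmetic discussed in the thread.
# Assumptions (not stated in the thread): 4K pages, RAM starts at physical 0,
# and max_low_pfn equals LOWMEM_SIZE expressed in pages.
PAGE_SHIFT = 12
ZONE_DMA_BITS = 30        # from the "30-bit ZONE_DMA" commit title
LOWMEM_SIZE = 0x30000000  # 768MB default, per the thread
RAM_SIZE = 0x80000000     # 2GB machine, per the thread


def zone_ranges(ram_size, lowmem_size, zone_dma_bits=ZONE_DMA_BITS):
    """Return {zone: (start, end_exclusive) or None} in physical bytes."""
    max_pfn = ram_size >> PAGE_SHIFT
    max_low_pfn = min(lowmem_size >> PAGE_SHIFT, max_pfn)
    limits = [
        # Same clamp as the quoted max_zone_pfns[ZONE_DMA] assignment.
        ("DMA", min(max_low_pfn, 1 << (zone_dma_bits - PAGE_SHIFT))),
        ("Normal", max_low_pfn),
        ("HighMem", max_pfn),
    ]
    zones, start = {}, 0
    for name, end in limits:
        # A zone whose limit does not extend past the previous one is empty.
        zones[name] = (start << PAGE_SHIFT, end << PAGE_SHIFT) if end > start else None
        start = max(start, end)
    return zones


for name, span in zone_ranges(RAM_SIZE, LOWMEM_SIZE).items():
    if span is None:
        print(f"{name:8s} empty")
    else:
        print(f"{name:8s} [mem {span[0]:#018x}-{span[1] - 1:#018x}]")
```

With the default 768MB lowmem the 1GB DMA limit is clamped to 768MB, so
ZONE_NORMAL has nothing left and everything above 0x30000000 is highmem.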
>
> Yes, the three zones work as intended.
>
> Erhard,
>
> Since your system only has 2GB memory, I'd try the 2G:2G split, which
> would in theory allow both the kernel and userspace to access all memory.
>
> CONFIG_LOWMEM_SIZE_BOOL=y
> CONFIG_LOWMEM_SIZE=0x7000000
>
> (Michael, please correct me if the above wouldn't work.)

It's a bit more complicated: to increase LOWMEM_SIZE you need to adjust
all the other variables to make space.

To get 2G of user virtual space I think you need:

CONFIG_ADVANCED_OPTIONS=y
CONFIG_LOWMEM_SIZE_BOOL=y
CONFIG_LOWMEM_SIZE=0x60000000
CONFIG_PAGE_OFFSET_BOOL=y
CONFIG_PAGE_OFFSET=0x90000000
CONFIG_KERNEL_START_BOOL=y
CONFIG_KERNEL_START=0x90000000
CONFIG_PHYSICAL_START=0x00000000
CONFIG_TASK_SIZE_BOOL=y
CONFIG_TASK_SIZE=0x80000000

Which results in 1.5GB of lowmem.
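
To see how those values fit together in the 4GB virtual address space, a
minimal Python sketch follows. It only does the arithmetic on the three
addresses above; layout() and its region names are made up for illustration,
and what lives in the gap below PAGE_OFFSET or above the end of lowmem is not
modelled.

```python
# Sketch: check that TASK_SIZE, PAGE_OFFSET and LOWMEM_SIZE fit in 32 bits.
# layout() and the region names are hypothetical, not kernel identifiers.
FOUR_GB = 1 << 32
MB = 1 << 20


def layout(task_size, page_offset, lowmem_size):
    """Return the size in bytes of each virtual region, or raise if they clash."""
    lowmem_end = page_offset + lowmem_size
    if task_size > page_offset:
        raise ValueError("user space overlaps the kernel linear mapping")
    if lowmem_end > FOUR_GB:
        raise ValueError("lowmem runs past the end of the 32-bit address space")
    return {
        "user": task_size,
        "gap_below_kernel": page_offset - task_size,
        "lowmem": lowmem_size,
        "above_lowmem": FOUR_GB - lowmem_end,
    }


regions = layout(task_size=0x80000000, page_offset=0x90000000,
                 lowmem_size=0x60000000)
for region, size in regions.items():
    print(f"{region:18s} {size // MB:5d} MB")
```

Raising only LOWMEM_SIZE to 0x60000000 while keeping a 3G:1G split
(PAGE_OFFSET at 0xc0000000) makes layout() raise, because the linear mapping
would end above 4GB. That is the reason the other values have to move too.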

Or if you want to map all 2G of RAM directly in the kernel without
highmem, but limit user virtual space to 1.5G:

CONFIG_ADVANCED_OPTIONS=y
CONFIG_LOWMEM_SIZE_BOOL=y
CONFIG_LOWMEM_SIZE=0x80000000
CONFIG_PAGE_OFFSET_BOOL=y
CONFIG_PAGE_OFFSET=0x70000000
CONFIG_KERNEL_START_BOOL=y
CONFIG_KERNEL_START=0x70000000
CONFIG_PHYSICAL_START=0x00000000
CONFIG_TASK_SIZE_BOOL=y
CONFIG_TASK_SIZE=0x60000000
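
For comparison, a short Python sketch of how much of a 2GB machine ends up as
lowmem versus highmem under the default and the two configs above. It assumes
RAM starts at physical address 0 and that lowmem is simply the smaller of RAM
size and LOWMEM_SIZE; split_ram() is a made-up helper.

```python
# Sketch: lowmem/highmem split of RAM for a given LOWMEM_SIZE.
# Assumes RAM starts at physical 0; split_ram() is a hypothetical helper.
MB = 1 << 20
RAM_SIZE = 0x80000000  # 2GB, per the thread


def split_ram(ram_size, lowmem_size):
    """Return (lowmem_bytes, highmem_bytes)."""
    lowmem = min(ram_size, lowmem_size)
    return lowmem, ram_size - lowmem


for label, lowmem_size in [
    ("default (768MB)", 0x30000000),
    ("2G user split", 0x60000000),
    ("no-highmem split", 0x80000000),
]:
    low, high = split_ram(RAM_SIZE, lowmem_size)
    print(f"{label:18s} lowmem {low // MB:4d} MB, highmem {high // MB:4d} MB")
```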

You can also reclaim another 256MB of virtual space if you disable
CONFIG_MODULES.

Those configs do boot on qemu, but I don't have easy access to my 32-bit
machine to test whether they boot on actual hardware.

cheers
