From: Mike Kravetz <mike.kravetz@oracle.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>, Zi Yan <ziy@nvidia.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene
Date: Mon, 18 Sep 2023 10:40:37 -0700
Message-ID: <20230918174037.GA112714@monkey>
In-Reply-To: <20230918145204.GB16104@cmpxchg.org>

On 09/18/23 10:52, Johannes Weiner wrote:
> On Mon, Sep 18, 2023 at 09:16:58AM +0200, Vlastimil Babka wrote:
> > On 9/16/23 21:57, Mike Kravetz wrote:
> > > On 09/15/23 10:16, Johannes Weiner wrote:
> > >> On Thu, Sep 14, 2023 at 04:52:38PM -0700, Mike Kravetz wrote:
> > > 
> > > With the patch below applied, a slightly different workload triggers the
> > > following warnings.  It seems related, and appears to go away when
> > > reverting the series.
> > > 
> > > [  331.595382] ------------[ cut here ]------------
> > > [  331.596665] page type is 5, passed migratetype is 1 (nr=512)
> > > [  331.598121] WARNING: CPU: 2 PID: 935 at mm/page_alloc.c:662 expand+0x1c9/0x200
> > 
> > Initially I thought this demonstrates the possible race I was suggesting in
> > reply to 6/6. But, assuming you have CONFIG_CMA, page type 5 is cma and we
> > are trying to get a MOVABLE page from a CMA page block, which is something
> > that's normally done and the pageblock stays CMA. So yeah if the warnings
> > are to stay, they need to handle this case. Maybe the same can happen with
> > HIGHATOMIC blocks?
> 
> Hm I don't think that's quite it.
> 
> CMA and HIGHATOMIC have their own freelists. When MOVABLE requests dip
> into CMA and HIGHATOMIC, we explicitly pass that migratetype to
> __rmqueue_smallest(). This takes a chunk of e.g. CMA, expands the
> remainder to the CMA freelist, then returns the page. While you get a
> different mt than requested, the freelist typing should be consistent.
> 
> In this splat, the migratetype passed to __rmqueue_smallest() is
> MOVABLE. There is no preceding warning from del_page_from_freelist()
> (Mike, correct me if I'm wrong), so we got a confirmed MOVABLE
> order-10 block from the MOVABLE list. So far so good. However, when we
> expand() the order-9 tail of this block to the MOVABLE list, it warns
> that its pageblock type is CMA.
> 
> This means we have an order-10 page where one half is MOVABLE and the
> other is CMA.
> 
> I don't see how the merging code in __free_one_page() could have done
> that. The CMA buddy would have failed the migrate_is_mergeable() test
> and we should have left it at order-9s.
> 
> I also don't see how the CMA setup could have done this because
> MIGRATE_CMA is set on the range before the pages are fed to the buddy.
> 
> Mike, could you describe the workload that is triggering this?

This 'slightly different workload' is actually a slightly different
environment.  Sorry for mis-speaking!  The slight difference is that this
environment does not use the 'alloc hugetlb gigantic pages from CMA'
(hugetlb_cma) feature that triggered the previous issue.

This is still on a 16G VM.  Kernel command line here is:
"BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.6.0-rc1-next-20230913+
root=UUID=49c13301-2555-44dc-847b-caabe1d62bdf ro console=tty0
console=ttyS0,115200 audit=0 selinux=0 transparent_hugepage=always
hugetlb_free_vmemmap=on"

The workload is just running this script:
while true; do
 echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
 echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/demote
 echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
done
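
To catch the first occurrence, the loop can be stopped as soon as the
warning shows up in the kernel log. This is a minimal sketch, assuming the
warning text matches the splat quoted above. The function names and the
use of dmesg are illustrative assumptions, not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given log file contains the
# migratetype mismatch warning quoted earlier in this thread.
saw_migratetype_warning() {
    grep -Eq 'page type is [0-9]+, passed migratetype is [0-9]+' "$1"
}

# Hypothetical wrapper around the reproducer loop: repeat the three sysfs
# writes until the warning appears in the kernel log. Needs root.
run_until_warning() {
    hp=/sys/kernel/mm/hugepages
    log=$(mktemp)
    while true; do
        echo 4 > "$hp/hugepages-1048576kB/nr_hugepages"
        echo 4 > "$hp/hugepages-1048576kB/demote"
        echo 0 > "$hp/hugepages-2048kB/nr_hugepages"
        dmesg > "$log"
        if saw_migratetype_warning "$log"; then
            rm -f "$log"
            return 0
        fi
    done
}
```

If the log already holds a warning from an earlier run, the loop stops
after the first iteration, so clearing the ring buffer first (for example
with 'dmesg -C') is probably a good idea before calling run_until_warning.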

> 
> Does this reproduce instantly and reliably?
> 

It is not 'instant' but will reproduce fairly reliably within a minute
or so.

Note that the 'echo 4 > .../hugepages-1048576kB/nr_hugepages' is going
to end up calling alloc_contig_pages -> alloc_contig_range.  Those pages
will eventually be freed via __free_pages(folio, 9).
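
For a sense of scale, the sizes involved can be worked out as below. This
is only a sketch, assuming 4 kB base pages (consistent with the
'Pages per block: 512' reported further down); the helper names are made up
for illustration.

```shell
#!/bin/sh
# Assumption: 4 kB base pages, as on x86-64.

# Number of base pages in a block of the given order.
pages_per_order() {
    echo $((1 << $1))
}

# Number of base pages covered by a huge page of the given size in kB.
kb_to_base_pages() {
    echo $(($1 / 4))
}

# Number of order-9 chunks that make up one 1048576 kB gigantic page.
order9_chunks_per_gigantic() {
    echo $(( $(kb_to_base_pages 1048576) / $(pages_per_order 9) ))
}
```

With these assumptions one gigantic page covers 262144 base pages, that
is 512 order-9 chunks, so each pass of the script moves 4 * 512 = 2048
order-9 chunks back to the buddy allocator.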

> Is there high load on the system, or is it requesting the huge page
> with not much else going on?

Only the script was running.

> Do you see compact_* history in /proc/vmstat after this triggers?

As one might expect, compact_isolated continually increases during
this run.
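
One way to watch these counters while the script runs is sketched below.
It assumes the usual 'name value' layout of /proc/vmstat; the function
names are hypothetical.

```shell
#!/bin/sh
# Print all compact_* counters from a vmstat-formatted file
# (defaults to /proc/vmstat).
compact_counters() {
    grep '^compact_' "${1:-/proc/vmstat}"
}

# Print the value of a single named counter from a vmstat-formatted file
# (defaults to /proc/vmstat).
vmstat_value() {
    awk -v name="$1" '$1 == name { print $2 }' "${2:-/proc/vmstat}"
}
```

Sampling 'vmstat_value compact_isolated' twice, a few seconds apart, should
show whether the counter is still climbing.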

> Could you please also provide /proc/zoneinfo, /proc/pagetypeinfo and
> the hugetlb_cma= parameter you're using?

As mentioned above, hugetlb_cma is not used in this environment.  Strangely
enough, this does not reproduce (easily at least) if I use hugetlb_cma as
in the previous report.

The following were collected during a run, after the WARNING had triggered.

# cat /proc/zoneinfo
Node 0, zone      DMA
  per-node stats
      nr_inactive_anon 11800
      nr_active_anon 109
      nr_inactive_file 38161
      nr_active_file 10007
      nr_unevictable 12
      nr_slab_reclaimable 2766
      nr_slab_unreclaimable 6881
      nr_isolated_anon 0
      nr_isolated_file 0
      workingset_nodes 0
      workingset_refault_anon 0
      workingset_refault_file 0
      workingset_activate_anon 0
      workingset_activate_file 0
      workingset_restore_anon 0
      workingset_restore_file 0
      workingset_nodereclaim 0
      nr_anon_pages 11750
      nr_mapped    18402
      nr_file_pages 48339
      nr_dirty     0
      nr_writeback 0
      nr_writeback_temp 0
      nr_shmem     166
      nr_shmem_hugepages 0
      nr_shmem_pmdmapped 0
      nr_file_hugepages 0
      nr_file_pmdmapped 0
      nr_anon_transparent_hugepages 6
      nr_vmscan_write 0
      nr_vmscan_immediate_reclaim 0
      nr_dirtied   14766
      nr_written   7701
      nr_throttled_written 0
      nr_kernel_misc_reclaimable 0
      nr_foll_pin_acquired 96
      nr_foll_pin_released 96
      nr_kernel_stack 1816
      nr_page_table_pages 1100
      nr_sec_page_table_pages 0
      nr_swapcached 0
  pages free     3840
        boost    0
        min      21
        low      26
        high     31
        spanned  4095
        present  3998
        managed  3840
        cma      0
        protection: (0, 1908, 7923, 7923)
      nr_free_pages 3840
      nr_zone_inactive_anon 0
      nr_zone_active_anon 0
      nr_zone_inactive_file 0
      nr_zone_active_file 0
      nr_zone_unevictable 0
      nr_zone_write_pending 0
      nr_mlock     0
      nr_bounce    0
      nr_zspages   0
      nr_free_cma  0
      numa_hit     0
      numa_miss    0
      numa_foreign 0
      numa_interleave 0
      numa_local   0
      numa_other   0
  pagesets
    cpu: 0
              count: 0
              high:  13
              batch: 1
  vm stats threshold: 6
    cpu: 1
              count: 0
              high:  13
              batch: 1
  vm stats threshold: 6
    cpu: 2
              count: 0
              high:  13
              batch: 1
  vm stats threshold: 6
    cpu: 3
              count: 0
              high:  13
              batch: 1
  vm stats threshold: 6
  node_unreclaimable:  0
  start_pfn:           1
Node 0, zone    DMA32
  pages free     495317
        boost    0
        min      2687
        low      3358
        high     4029
        spanned  1044480
        present  520156
        managed  496486
        cma      0
        protection: (0, 0, 6015, 6015)
      nr_free_pages 495317
      nr_zone_inactive_anon 0
      nr_zone_active_anon 0
      nr_zone_inactive_file 0
      nr_zone_active_file 0
      nr_zone_unevictable 0
      nr_zone_write_pending 0
      nr_mlock     0
      nr_bounce    0
      nr_zspages   0
      nr_free_cma  0
      numa_hit     0
      numa_miss    0
      numa_foreign 0
      numa_interleave 0
      numa_local   0
      numa_other   0
  pagesets
    cpu: 0
              count: 913
              high:  1679
              batch: 63
  vm stats threshold: 30
    cpu: 1
              count: 0
              high:  1679
              batch: 63
  vm stats threshold: 30
    cpu: 2
              count: 0
              high:  1679
              batch: 63
  vm stats threshold: 30
    cpu: 3
              count: 256
              high:  1679
              batch: 63
  vm stats threshold: 30
  node_unreclaimable:  0
  start_pfn:           4096
Node 0, zone   Normal
  pages free     1360836
        boost    0
        min      8473
        low      10591
        high     12709
        spanned  1572864
        present  1572864
        managed  1552266
        cma      0
        protection: (0, 0, 0, 0)
      nr_free_pages 1360836
      nr_zone_inactive_anon 11800
      nr_zone_active_anon 109
      nr_zone_inactive_file 38161
      nr_zone_active_file 10007
      nr_zone_unevictable 12
      nr_zone_write_pending 0
      nr_mlock     12
      nr_bounce    0
      nr_zspages   3
      nr_free_cma  0
      numa_hit     10623572
      numa_miss    0
      numa_foreign 0
      numa_interleave 1357
      numa_local   6902986
      numa_other   3720586
  pagesets
    cpu: 0
              count: 156
              high:  5295
              batch: 63
  vm stats threshold: 42
    cpu: 1
              count: 210
              high:  5295
              batch: 63
  vm stats threshold: 42
    cpu: 2
              count: 4956
              high:  5295
              batch: 63
  vm stats threshold: 42
    cpu: 3
              count: 1
              high:  5295
              batch: 63
  vm stats threshold: 42
  node_unreclaimable:  0
  start_pfn:           1048576
Node 0, zone  Movable
  pages free     0
        boost    0
        min      32
        low      32
        high     32
        spanned  0
        present  0
        managed  0
        cma      0
        protection: (0, 0, 0, 0)
Node 1, zone      DMA
  pages free     0
        boost    0
        min      0
        low      0
        high     0
        spanned  0
        present  0
        managed  0
        cma      0
        protection: (0, 0, 0, 0)
Node 1, zone    DMA32
  pages free     0
        boost    0
        min      0
        low      0
        high     0
        spanned  0
        present  0
        managed  0
        cma      0
        protection: (0, 0, 0, 0)
Node 1, zone   Normal
  per-node stats
      nr_inactive_anon 15381
      nr_active_anon 81
      nr_inactive_file 66550
      nr_active_file 25965
      nr_unevictable 421
      nr_slab_reclaimable 4069
      nr_slab_unreclaimable 7836
      nr_isolated_anon 0
      nr_isolated_file 0
      workingset_nodes 0
      workingset_refault_anon 0
      workingset_refault_file 0
      workingset_activate_anon 0
      workingset_activate_file 0
      workingset_restore_anon 0
      workingset_restore_file 0
      workingset_nodereclaim 0
      nr_anon_pages 15420
      nr_mapped    24331
      nr_file_pages 92978
      nr_dirty     0
      nr_writeback 0
      nr_writeback_temp 0
      nr_shmem     100
      nr_shmem_hugepages 0
      nr_shmem_pmdmapped 0
      nr_file_hugepages 0
      nr_file_pmdmapped 0
      nr_anon_transparent_hugepages 11
      nr_vmscan_write 0
      nr_vmscan_immediate_reclaim 0
      nr_dirtied   6217
      nr_written   2902
      nr_throttled_written 0
      nr_kernel_misc_reclaimable 0
      nr_foll_pin_acquired 0
      nr_foll_pin_released 0
      nr_kernel_stack 1656
      nr_page_table_pages 756
      nr_sec_page_table_pages 0
      nr_swapcached 0
  pages free     1829073
        boost    0
        min      11345
        low      14181
        high     17017
        spanned  2097152
        present  2097152
        managed  2086594
        cma      0
        protection: (0, 0, 0, 0)
      nr_free_pages 1829073
      nr_zone_inactive_anon 15381
      nr_zone_active_anon 81
      nr_zone_inactive_file 66550
      nr_zone_active_file 25965
      nr_zone_unevictable 421
      nr_zone_write_pending 0
      nr_mlock     421
      nr_bounce    0
      nr_zspages   0
      nr_free_cma  0
      numa_hit     10522401
      numa_miss    0
      numa_foreign 0
      numa_interleave 961
      numa_local   4057399
      numa_other   6465002
  pagesets
    cpu: 0
              count: 0
              high:  7090
              batch: 63
  vm stats threshold: 42
    cpu: 1
              count: 17
              high:  7090
              batch: 63
  vm stats threshold: 42
    cpu: 2
              count: 6997
              high:  7090
              batch: 63
  vm stats threshold: 42
    cpu: 3
              count: 0
              high:  7090
              batch: 63
  vm stats threshold: 42
  node_unreclaimable:  0
  start_pfn:           2621440
Node 1, zone  Movable
  pages free     0
        boost    0
        min      32
        low      32
        high     32
        spanned  0
        present  0
        managed  0
        cma      0
        protection: (0, 0, 0, 0)

# cat /proc/pagetypeinfo
Page block order: 9
Pages per block:  512

Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10 
Node    0, zone      DMA, type    Unmovable      0      0      0      0      0      0      0      0      1      0      0 
Node    0, zone      DMA, type      Movable      0      0      0      0      0      0      0      0      0      1      3 
Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone      DMA, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone      DMA, type          CMA      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone    DMA32, type    Unmovable      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone    DMA32, type      Movable      1      0      1      2      2      3      3      3      4      4    480 
Node    0, zone    DMA32, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone    DMA32, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone    DMA32, type          CMA      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone    DMA32, type      Isolate      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone   Normal, type    Unmovable    566     14     22      7      8      8      9      4      7      0      1 
Node    0, zone   Normal, type      Movable    214    299    120     53     15     10      6      6      1      4   1159 
Node    0, zone   Normal, type  Reclaimable      0      9     18     11      6      1      0      0      0      0      0 
Node    0, zone   Normal, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0 

Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate 
Node 0, zone      DMA            1            7            0            0            0            0 
Node 0, zone    DMA32            0         1016            0            0            0            0 
Node 0, zone   Normal           71         2995            6            0            0            0 
Page block order: 9
Pages per block:  512

Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10 
Node    1, zone   Normal, type    Unmovable    459     12      5      6      6      5      5      5      6      2      1 
Node    1, zone   Normal, type      Movable   1287    502    171     85     34     14     13      8      2      5   1861 
Node    1, zone   Normal, type  Reclaimable      1      5     12      6      9      3      1      1      0      1      0 
Node    1, zone   Normal, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0 
Node    1, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0 
Node    1, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      3 

Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate 
Node 1, zone   Normal          101         3977           10            0            0            8 
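
To turn the per-order counts above into page totals, something like the
following can be used. It is a rough sketch that assumes the column layout
shown here ('Node N, zone Z, type T' followed by one count per order,
starting at order 0); the function name is made up.

```shell
#!/bin/sh
# Sum the free base pages for one migrate type across all nodes and zones
# of a pagetypeinfo-formatted file (defaults to /proc/pagetypeinfo).
# Field 6 is the type name; fields 7 and up are counts for order 0, 1, ...
free_pages_for_type() {
    awk -v want="$1" '
        $1 == "Node" && $5 == "type" && $6 == want {
            for (f = 7; f <= NF; f++)
                pages += $f * 2 ^ (f - 7)
        }
        END { printf "%d\n", pages }
    ' "${2:-/proc/pagetypeinfo}"
}
```

Applied to the Node 0 DMA lines above, this gives 256 pages for Unmovable
and 3584 for Movable, which adds up to the 3840 free pages that zoneinfo
reports for that zone.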

-- 
Mike Kravetz

