linux-mm.kvack.org archive mirror
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>, Zi Yan <ziy@nvidia.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene
Date: Mon, 18 Sep 2023 10:40:37 -0700	[thread overview]
Message-ID: <20230918174037.GA112714@monkey> (raw)
In-Reply-To: <20230918145204.GB16104@cmpxchg.org>

On 09/18/23 10:52, Johannes Weiner wrote:
> On Mon, Sep 18, 2023 at 09:16:58AM +0200, Vlastimil Babka wrote:
> > On 9/16/23 21:57, Mike Kravetz wrote:
> > > On 09/15/23 10:16, Johannes Weiner wrote:
> > >> On Thu, Sep 14, 2023 at 04:52:38PM -0700, Mike Kravetz wrote:
> > > 
> > > With the patch below applied, a slightly different workload triggers the
> > > following warnings.  It seems related, and appears to go away when
> > > reverting the series.
> > > 
> > > [  331.595382] ------------[ cut here ]------------
> > > [  331.596665] page type is 5, passed migratetype is 1 (nr=512)
> > > [  331.598121] WARNING: CPU: 2 PID: 935 at mm/page_alloc.c:662 expand+0x1c9/0x200
> > 
> > Initially I thought this demonstrates the possible race I was suggesting in
> > reply to 6/6. But, assuming you have CONFIG_CMA, page type 5 is cma and we
> > are trying to get a MOVABLE page from a CMA page block, which is something
> > that's normally done and the pageblock stays CMA. So yeah if the warnings
> > are to stay, they need to handle this case. Maybe the same can happen with
> > HIGHATOMIC blocks?
> 
> Hm I don't think that's quite it.
> 
> CMA and HIGHATOMIC have their own freelists. When MOVABLE requests dip
> into CMA and HIGHATOMIC, we explicitly pass that migratetype to
> __rmqueue_smallest(). This takes a chunk of e.g. CMA, expands the
> remainder to the CMA freelist, then returns the page. While you get a
> different mt than requested, the freelist typing should be consistent.
> 
> In this splat, the migratetype passed to __rmqueue_smallest() is
> MOVABLE. There is no preceding warning from del_page_from_freelist()
> (Mike, correct me if I'm wrong), so we got a confirmed MOVABLE
> order-10 block from the MOVABLE list. So far so good. However, when we
> expand() the order-9 tail of this block to the MOVABLE list, it warns
> that its pageblock type is CMA.
> 
> This means we have an order-10 page where one half is MOVABLE and the
> other is CMA.
> 
> I don't see how the merging code in __free_one_page() could have done
> that. The CMA buddy would have failed the migrate_is_mergeable() test
> and we should have left it at order-9s.
> 
> I also don't see how the CMA setup could have done this because
> MIGRATE_CMA is set on the range before the pages are fed to the buddy.
> 
> Mike, could you describe the workload that is triggering this?

This 'slightly different workload' is actually a slightly different
environment.  Sorry for mis-speaking!  The slight difference is that this
environment does not use the 'alloc hugetlb gigantic pages from CMA'
(hugetlb_cma) feature that triggered the previous issue.

This is still on a 16G VM.  Kernel command line here is:
"BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.6.0-rc1-next-20230913+
root=UUID=49c13301-2555-44dc-847b-caabe1d62bdf ro console=tty0
console=ttyS0,115200 audit=0 selinux=0 transparent_hugepage=always
hugetlb_free_vmemmap=on"

The workload is just running this script:
while true; do
 echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
 echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/demote
 echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
done

> 
> Does this reproduce instantly and reliably?
> 

It is not 'instant' but will reproduce fairly reliably within a minute
or so.

Note that the 'echo 4 > .../hugepages-1048576kB/nr_hugepages' is going
to end up calling alloc_contig_pages -> alloc_contig_range.  Those pages
will eventually be freed via __free_pages(folio, 9).

> Is there high load on the system, or is it requesting the huge page
> with not much else going on?

Only the script was running.

> Do you see compact_* history in /proc/vmstat after this triggers?

As one might expect, compact_isolated increases continually during the run.

> Could you please also provide /proc/zoneinfo, /proc/pagetypeinfo and
> the hugetlb_cma= parameter you're using?

As mentioned above, hugetlb_cma is not used in this environment.  Strangely
enough, this does not reproduce (easily at least) if I use hugetlb_cma as
in the previous report.

The following were captured during a run, after the WARNING had triggered.

# cat /proc/zoneinfo
Node 0, zone      DMA
  per-node stats
      nr_inactive_anon 11800
      nr_active_anon 109
      nr_inactive_file 38161
      nr_active_file 10007
      nr_unevictable 12
      nr_slab_reclaimable 2766
      nr_slab_unreclaimable 6881
      nr_isolated_anon 0
      nr_isolated_file 0
      workingset_nodes 0
      workingset_refault_anon 0
      workingset_refault_file 0
      workingset_activate_anon 0
      workingset_activate_file 0
      workingset_restore_anon 0
      workingset_restore_file 0
      workingset_nodereclaim 0
      nr_anon_pages 11750
      nr_mapped    18402
      nr_file_pages 48339
      nr_dirty     0
      nr_writeback 0
      nr_writeback_temp 0
      nr_shmem     166
      nr_shmem_hugepages 0
      nr_shmem_pmdmapped 0
      nr_file_hugepages 0
      nr_file_pmdmapped 0
      nr_anon_transparent_hugepages 6
      nr_vmscan_write 0
      nr_vmscan_immediate_reclaim 0
      nr_dirtied   14766
      nr_written   7701
      nr_throttled_written 0
      nr_kernel_misc_reclaimable 0
      nr_foll_pin_acquired 96
      nr_foll_pin_released 96
      nr_kernel_stack 1816
      nr_page_table_pages 1100
      nr_sec_page_table_pages 0
      nr_swapcached 0
  pages free     3840
        boost    0
        min      21
        low      26
        high     31
        spanned  4095
        present  3998
        managed  3840
        cma      0
        protection: (0, 1908, 7923, 7923)
      nr_free_pages 3840
      nr_zone_inactive_anon 0
      nr_zone_active_anon 0
      nr_zone_inactive_file 0
      nr_zone_active_file 0
      nr_zone_unevictable 0
      nr_zone_write_pending 0
      nr_mlock     0
      nr_bounce    0
      nr_zspages   0
      nr_free_cma  0
      numa_hit     0
      numa_miss    0
      numa_foreign 0
      numa_interleave 0
      numa_local   0
      numa_other   0
  pagesets
    cpu: 0
              count: 0
              high:  13
              batch: 1
  vm stats threshold: 6
    cpu: 1
              count: 0
              high:  13
              batch: 1
  vm stats threshold: 6
    cpu: 2
              count: 0
              high:  13
              batch: 1
  vm stats threshold: 6
    cpu: 3
              count: 0
              high:  13
              batch: 1
  vm stats threshold: 6
  node_unreclaimable:  0
  start_pfn:           1
Node 0, zone    DMA32
  pages free     495317
        boost    0
        min      2687
        low      3358
        high     4029
        spanned  1044480
        present  520156
        managed  496486
        cma      0
        protection: (0, 0, 6015, 6015)
      nr_free_pages 495317
      nr_zone_inactive_anon 0
      nr_zone_active_anon 0
      nr_zone_inactive_file 0
      nr_zone_active_file 0
      nr_zone_unevictable 0
      nr_zone_write_pending 0
      nr_mlock     0
      nr_bounce    0
      nr_zspages   0
      nr_free_cma  0
      numa_hit     0
      numa_miss    0
      numa_foreign 0
      numa_interleave 0
      numa_local   0
      numa_other   0
  pagesets
    cpu: 0
              count: 913
              high:  1679
              batch: 63
  vm stats threshold: 30
    cpu: 1
              count: 0
              high:  1679
              batch: 63
  vm stats threshold: 30
    cpu: 2
              count: 0
              high:  1679
              batch: 63
  vm stats threshold: 30
    cpu: 3
              count: 256
              high:  1679
              batch: 63
  vm stats threshold: 30
  node_unreclaimable:  0
  start_pfn:           4096
Node 0, zone   Normal
  pages free     1360836
        boost    0
        min      8473
        low      10591
        high     12709
        spanned  1572864
        present  1572864
        managed  1552266
        cma      0
        protection: (0, 0, 0, 0)
      nr_free_pages 1360836
      nr_zone_inactive_anon 11800
      nr_zone_active_anon 109
      nr_zone_inactive_file 38161
      nr_zone_active_file 10007
      nr_zone_unevictable 12
      nr_zone_write_pending 0
      nr_mlock     12
      nr_bounce    0
      nr_zspages   3
      nr_free_cma  0
      numa_hit     10623572
      numa_miss    0
      numa_foreign 0
      numa_interleave 1357
      numa_local   6902986
      numa_other   3720586
  pagesets
    cpu: 0
              count: 156
              high:  5295
              batch: 63
  vm stats threshold: 42
    cpu: 1
              count: 210
              high:  5295
              batch: 63
  vm stats threshold: 42
    cpu: 2
              count: 4956
              high:  5295
              batch: 63
  vm stats threshold: 42
    cpu: 3
              count: 1
              high:  5295
              batch: 63
  vm stats threshold: 42
  node_unreclaimable:  0
  start_pfn:           1048576
Node 0, zone  Movable
  pages free     0
        boost    0
        min      32
        low      32
        high     32
        spanned  0
        present  0
        managed  0
        cma      0
        protection: (0, 0, 0, 0)
Node 1, zone      DMA
  pages free     0
        boost    0
        min      0
        low      0
        high     0
        spanned  0
        present  0
        managed  0
        cma      0
        protection: (0, 0, 0, 0)
Node 1, zone    DMA32
  pages free     0
        boost    0
        min      0
        low      0
        high     0
        spanned  0
        present  0
        managed  0
        cma      0
        protection: (0, 0, 0, 0)
Node 1, zone   Normal
  per-node stats
      nr_inactive_anon 15381
      nr_active_anon 81
      nr_inactive_file 66550
      nr_active_file 25965
      nr_unevictable 421
      nr_slab_reclaimable 4069
      nr_slab_unreclaimable 7836
      nr_isolated_anon 0
      nr_isolated_file 0
      workingset_nodes 0
      workingset_refault_anon 0
      workingset_refault_file 0
      workingset_activate_anon 0
      workingset_activate_file 0
      workingset_restore_anon 0
      workingset_restore_file 0
      workingset_nodereclaim 0
      nr_anon_pages 15420
      nr_mapped    24331
      nr_file_pages 92978
      nr_dirty     0
      nr_writeback 0
      nr_writeback_temp 0
      nr_shmem     100
      nr_shmem_hugepages 0
      nr_shmem_pmdmapped 0
      nr_file_hugepages 0
      nr_file_pmdmapped 0
      nr_anon_transparent_hugepages 11
      nr_vmscan_write 0
      nr_vmscan_immediate_reclaim 0
      nr_dirtied   6217
      nr_written   2902
      nr_throttled_written 0
      nr_kernel_misc_reclaimable 0
      nr_foll_pin_acquired 0
      nr_foll_pin_released 0
      nr_kernel_stack 1656
      nr_page_table_pages 756
      nr_sec_page_table_pages 0
      nr_swapcached 0
  pages free     1829073
        boost    0
        min      11345
        low      14181
        high     17017
        spanned  2097152
        present  2097152
        managed  2086594
        cma      0
        protection: (0, 0, 0, 0)
      nr_free_pages 1829073
      nr_zone_inactive_anon 15381
      nr_zone_active_anon 81
      nr_zone_inactive_file 66550
      nr_zone_active_file 25965
      nr_zone_unevictable 421
      nr_zone_write_pending 0
      nr_mlock     421
      nr_bounce    0
      nr_zspages   0
      nr_free_cma  0
      numa_hit     10522401
      numa_miss    0
      numa_foreign 0
      numa_interleave 961
      numa_local   4057399
      numa_other   6465002
  pagesets
    cpu: 0
              count: 0
              high:  7090
              batch: 63
  vm stats threshold: 42
    cpu: 1
              count: 17
              high:  7090
              batch: 63
  vm stats threshold: 42
    cpu: 2
              count: 6997
              high:  7090
              batch: 63
  vm stats threshold: 42
    cpu: 3
              count: 0
              high:  7090
              batch: 63
  vm stats threshold: 42
  node_unreclaimable:  0
  start_pfn:           2621440
Node 1, zone  Movable
  pages free     0
        boost    0
        min      32
        low      32
        high     32
        spanned  0
        present  0
        managed  0
        cma      0
        protection: (0, 0, 0, 0)

# cat /proc/pagetypeinfo
Page block order: 9
Pages per block:  512

Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10 
Node    0, zone      DMA, type    Unmovable      0      0      0      0      0      0      0      0      1      0      0 
Node    0, zone      DMA, type      Movable      0      0      0      0      0      0      0      0      0      1      3 
Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone      DMA, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone      DMA, type          CMA      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone    DMA32, type    Unmovable      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone    DMA32, type      Movable      1      0      1      2      2      3      3      3      4      4    480 
Node    0, zone    DMA32, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone    DMA32, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone    DMA32, type          CMA      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone    DMA32, type      Isolate      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone   Normal, type    Unmovable    566     14     22      7      8      8      9      4      7      0      1 
Node    0, zone   Normal, type      Movable    214    299    120     53     15     10      6      6      1      4   1159 
Node    0, zone   Normal, type  Reclaimable      0      9     18     11      6      1      0      0      0      0      0 
Node    0, zone   Normal, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0 
Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0 

Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate 
Node 0, zone      DMA            1            7            0            0            0            0 
Node 0, zone    DMA32            0         1016            0            0            0            0 
Node 0, zone   Normal           71         2995            6            0            0            0 
Page block order: 9
Pages per block:  512

Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10 
Node    1, zone   Normal, type    Unmovable    459     12      5      6      6      5      5      5      6      2      1 
Node    1, zone   Normal, type      Movable   1287    502    171     85     34     14     13      8      2      5   1861 
Node    1, zone   Normal, type  Reclaimable      1      5     12      6      9      3      1      1      0      1      0 
Node    1, zone   Normal, type   HighAtomic      0      0      0      0      0      0      0      0      0      0      0 
Node    1, zone   Normal, type          CMA      0      0      0      0      0      0      0      0      0      0      0 
Node    1, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      3 

Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic          CMA      Isolate 
Node 1, zone   Normal          101         3977           10            0            0            8 

-- 
Mike Kravetz

