All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qian Cai <cai@lca.pw>
To: Anshuman Khandual <anshuman.khandual@arm.com>,
	Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Oscar Salvador <osalvador@suse.de>,
	Mel Gorman <mgorman@techsingularity.net>,
	Mike Rapoport <rppt@linux.ibm.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Pavel Tatashin <pavel.tatashin@microsoft.com>,
	 linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/page_alloc: Add a reason for reserved pages in has_unmovable_pages()
Date: Fri, 04 Oct 2019 08:56:16 -0400	[thread overview]
Message-ID: <1570193776.5576.270.camel@lca.pw> (raw)
In-Reply-To: <91128b73-9a47-100b-d3de-e83f0b941e9f@arm.com>

On Fri, 2019-10-04 at 17:14 +0530, Anshuman Khandual wrote:
> 
> On 10/04/2019 04:28 PM, Michal Hocko wrote:
> > On Thu 03-10-19 13:40:57, Anshuman Khandual wrote:
> > > Having unmovable pages on a given pageblock should be reported correctly
> > > when required with REPORT_FAILURE flag. But there can be a scenario where a
> > > reserved page in the page block will get reported as a generic "unmovable"
> > > reason code. Instead this should be changed to a more appropriate reason
> > > code like "Reserved page".
> > 
> > Others have already pointed out this is just redundant but I will have a
> 
> Sure.
> 
> > more generic comment on the changelog. There is essentially no
> > information why the current state is bad/unhelpful and why the chnage is
> 
> The current state is not necessarily bad or unhelpful. I just though that it
> could be improved upon. Some how calling out explicitly only the CMA page
> failure case just felt adhoc, where as there are other reasons like HugeTLB
> immovability which might depend on other factors apart from just page flags
> (though I did not propose that originally).
> 
> > needed. All you claim is that something is a certain way and then assert
> > that it should be done differently. That is not how changelogs should
> > look like.
> > 
> 
> Okay, probably I should have explained more on why "unmovable" is less than
> adequate to capture the exact reason for specific failure cases and how
> "Reserved Page" instead would been better. But got the point, will improve.
> 

It might be a good time to rethink if it is really a good idea to dump_page()
at all inside has_unmovable_pages(). As it is right now, it is a a potential
deadlock between console vs memory offline. More details are in this thread,

https://lore.kernel.org/lkml/1568817579.5576.172.camel@lca.pw/

01: [  672.875392] WARNING: possible circular locking dependency detected       
01: [  672.875394] 5.4.0-rc1-next-20191004+ #64 Not tainted                     
01: [  672.875396] ------------------------------------------------------       
01: [  672.875398] test.sh/1724 is trying to acquire lock:                      
01: [  672.875400] 0000000052059ec0 (console_owner){-...}, at: console_unlock+0x
01: 328/0xa30                                                                   
01: [  672.875406]                                                              
01: [  672.875408] but task is already holding lock:                            
01: [  672.875409] 000000006ffd89c8 (&(&zone->lock)->rlock){-.-.}, at: start_iso
01: late_page_range+0x216/0x538                                                 
01: [  672.875415]                                                              
01: [  672.875417] which lock already depends on the new lock.                  
01: [  672.875418]                                                              
01: [  672.875419]                                                              
01: [  672.875421] the existing dependency chain (in reverse order) is:         
01: [  672.875423]                                                              
01: [  672.875424] -> #2 (&(&zone->lock)->rlock){-.-.}:                         
01: [  672.875430]        lock_acquire+0x21a/0x468                              
01: [  672.875431]        _raw_spin_lock+0x54/0x68                              
01: [  672.875433]        get_page_from_freelist+0x8b6/0x2d28                   
01: [  672.875435]        __alloc_pages_nodemask+0x246/0x658                    
01: [  672.875437]        __get_free_pages+0x34/0x78                            
01: [  672.875438]        sclp_init+0x106/0x690                                 
01: [  672.875440]        sclp_register+0x2e/0x248                              
01: [  672.875442]        sclp_rw_init+0x4a/0x70                                
01: [  672.875443]        sclp_console_init+0x4a/0x1b8                          
01: [  672.875445]        console_init+0x2c8/0x410                              
01: [  672.875447]        start_kernel+0x530/0x6a0                              
01: [  672.875448]        startup_continue+0x70/0xd0                            
01: [  672.875449]                                                              
01: [  672.875450] -> #1 (sclp_lock){-.-.}:                                     
01: [  672.875458]        lock_acquire+0x21a/0x468                              
01: [  672.875460]        _raw_spin_lock_irqsave+0xcc/0xe8                      
01: [  672.875462]        sclp_add_request+0x34/0x308                           
01: [  672.875464]        sclp_conbuf_emit+0x100/0x138                          
01: [  672.875465]        sclp_console_write+0x96/0x3b8                         
01: [  672.875467]        console_unlock+0x6dc/0xa30                            
01: [  672.875469]        vprintk_emit+0x184/0x3c8                              
01: [  672.875470]        vprintk_default+0x44/0x50                             
01: [  672.875472]        printk+0xa8/0xc0                                      
01: [  672.875473]        iommu_debugfs_setup+0xf2/0x108                        
01: [  672.875475]        iommu_init+0x6c/0x78                                  
01: [  672.875477]        do_one_initcall+0x162/0x680                           
01: [  672.875478]        kernel_init_freeable+0x4e8/0x5a8                      
01: [  672.875480]        kernel_init+0x2a/0x188                                
01: [  672.875484]        ret_from_fork+0x30/0x34                               
01: [  672.875486]        kernel_thread_starter+0x0/0xc                         
01: [  672.875487]                                                              
01: [  672.875488] -> #0 (console_owner){-...}:                                 
01: [  672.875495]        check_noncircular+0x338/0x3e0                         
01: [  672.875496]        __lock_acquire+0x1e66/0x2d88                          
01: [  672.875498]        lock_acquire+0x21a/0x468                              
01: [  672.875499]        console_unlock+0x3a6/0xa30                            
01: [  672.875501]        vprintk_emit+0x184/0x3c8                              
01: [  672.875503]        vprintk_default+0x44/0x50                             
01: [  672.875504]        printk+0xa8/0xc0                                      
01: [  672.875506]        __dump_page+0x1dc/0x710                               
01: [  672.875507]        dump_page+0x2e/0x58                                   
01: [  672.875509]        has_unmovable_pages+0x2e8/0x470                       
01: [  672.875511]        start_isolate_page_range+0x404/0x538                  
01: [  672.875513]        __offline_pages+0x22c/0x1338                          
01: [  672.875514]        memory_subsys_offline+0xa6/0xe8                       
01: [  672.875516]        device_offline+0xe6/0x118                             
01: [  672.875517]        state_store+0xf0/0x110                                
01: [  672.875519]        kernfs_fop_write+0x1bc/0x270                          
01: [  672.875521]        vfs_write+0xce/0x220                                  
01: [  672.875522]        ksys_write+0xea/0x190                                 
01: [  672.875524]        system_call+0xd8/0x2b4                                
01: [  672.875525]                                                              
01: [  672.875527] other info that might help us debug this:                    
01: [  672.875528]                                                              
01: [  672.875529] Chain exists of:                                             
01: [  672.875530]   console_owner --> sclp_lock --> &(&zone->lock)->rlock      
01: [  672.875538]                                                              
01: [  672.875540]  Possible unsafe locking scenario:                           
01: [  672.875541]                                                              
01: [  672.875543]        CPU0                    CPU1                          
01: [  672.875544]        ----                    ----                          
01: [  672.875545]   lock(&(&zone->lock)->rlock);                               
01: [  672.875549]                                lock(sclp_lock);              
01: [  672.875553]                                lock(&(&zone->lock)->rlock);  
01: [  672.875557]   lock(console_owner);                                       
01: [  672.875560]                                                              
01: [  672.875562]  *** DEADLOCK ***                                            
01: [  672.875563]                                                              
01: [  672.875564] 9 locks held by test.sh/1724:                                
01: [  672.875565]  #0: 000000000e925408 (sb_writers#4){.+.+}, at: vfs_write+0x2
01: 06/0x220                                                                    
01: [  672.875574]  #1: 0000000050aa4280 (&of->mutex){+.+.}, at: kernfs_fop_writ
01: e+0x154/0x270                                                               
01: [  672.875581]  #2: 0000000062e5c628 (kn->count#198){.+.+}, at: kernfs_fop_w
01: rite+0x16a/0x270                                                            
01: [  672.875590]  #3: 00000000523236a0 (device_hotplug_lock){+.+.}, at: lock_d
01: evice_hotplug_sysfs+0x30/0x80                                               
01: [  672.875598]  #4: 0000000062e70990 (&dev->mutex){....}, at: device_offline
01: +0x78/0x118                                                                 
01: [  672.875605]  #5: 0000000051fd36b0 (cpu_hotplug_lock.rw_sem){++++}, at: __
01: offline_pages+0xec/0x1338                                                   
01: [  672.875613]  #6: 00000000521ca470 (mem_hotplug_lock.rw_sem){++++}, at: pe
01: rcpu_down_write+0x38/0x210                                                  
01: [  672.875620]  #7: 000000006ffd89c8 (&(&zone->lock)->rlock){-.-.}, at: star
01: t_isolate_page_range+0x216/0x538                                            
01: [  672.875628]  #8: 000000005205a100 (console_lock){+.+.}, at: vprintk_emit+
01: 0x178/0x3c8                                                                 
01: [  672.875635]                                                              
01: [  672.875636] stack backtrace:                                             
01: [  672.875639] CPU: 1 PID: 1724 Comm: test.sh Not tainted 5.4.0-rc1-next-201
01: 91004+ #64                                                                  
01: [  672.875640] Hardware name: IBM 2964 N96 400 (z/VM 6.4.0)                 
01: [  672.875642] Call Trace:                                                  
01: [  672.875644] ([<00000000512ae218>] show_stack+0x110/0x1b0)                
01: [  672.875645]  [<0000000051b6d506>] dump_stack+0x126/0x178                 
01: [  672.875648]  [<00000000513a4b08>] check_noncircular+0x338/0x3e0          
01: [  672.875650]  [<00000000513aaaf6>] __lock_acquire+0x1e66/0x2d88           
01: [  672.875652]  [<00000000513a7e12>] lock_acquire+0x21a/0x468               
01: [  672.875654]  [<00000000513bb2fe>] console_unlock+0x3a6/0xa30             
01: [  672.875655]  [<00000000513bde2c>] vprintk_emit+0x184/0x3c8               
01: [  672.875657]  [<00000000513be0b4>] vprintk_default+0x44/0x50              
01: [  672.875659]  [<00000000513beb60>] printk+0xa8/0xc0                       
01: [  672.875661]  [<000000005158c364>] __dump_page+0x1dc/0x710                
01: [  672.875663]  [<000000005158c8c6>] dump_page+0x2e/0x58                    
01: [  672.875665]  [<00000000515d87c8>] has_unmovable_pages+0x2e8/0x470        
01: [  672.875667]  [<000000005167072c>] start_isolate_page_range+0x404/0x538   
01: [  672.875669]  [<0000000051b96de4>] __offline_pages+0x22c/0x1338           
01: [  672.875671]  [<0000000051908586>] memory_subsys_offline+0xa6/0xe8        
01: [  672.875673]  [<00000000518e561e>] device_offline+0xe6/0x118              
01: [  672.875675]  [<0000000051908170>] state_store+0xf0/0x110                 
01: [  672.875677]  [<0000000051796384>] kernfs_fop_write+0x1bc/0x270           
01: [  672.875679]  [<000000005168972e>] vfs_write+0xce/0x220                   
01: [  672.875681]  [<0000000051689b9a>] ksys_write+0xea/0x190                  
01: [  672.875685]  [<0000000051ba9990>] system_call+0xd8/0x2b4                 
01: [  672.875687] INFO: lockdep is turned off.


  reply	other threads:[~2019-10-04 12:56 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-03  8:10 [PATCH] mm/page_alloc: Add a reason for reserved pages in has_unmovable_pages() Anshuman Khandual
2019-10-03  9:05 ` Qian Cai
2019-10-03  9:32   ` Anshuman Khandual
2019-10-03  9:53     ` Anshuman Khandual
2019-10-03 11:19     ` Qian Cai
2019-10-03 11:32       ` Anshuman Khandual
2019-10-03 11:50         ` Qian Cai
2019-10-03 12:02           ` Anshuman Khandual
2019-10-03 12:14             ` Qian Cai
2019-10-04  8:25               ` David Hildenbrand
2020-01-14  8:19                 ` Anshuman Khandual
2020-01-14  8:30                   ` David Hildenbrand
2020-01-14  9:10                   ` Michal Hocko
2020-01-14 10:23                     ` Vlastimil Babka
2020-01-14 11:03                       ` Anshuman Khandual
2020-01-14 11:32                       ` Michal Hocko
2020-01-14 12:04                         ` [PATCH] mm, debug: always print flags in dump_page() Vlastimil Babka
2020-01-14 13:35                           ` Michal Hocko
2020-01-14 18:22                         ` [PATCH] mm/page_alloc: Add a reason for reserved pages in has_unmovable_pages() Ralph Campbell
2019-10-04 10:58 ` Michal Hocko
2019-10-04 11:44   ` Anshuman Khandual
2019-10-04 12:56     ` Qian Cai [this message]
2019-10-04 13:07       ` Michal Hocko
2019-10-04 13:30         ` Qian Cai
2019-10-04 13:38           ` Michal Hocko
2019-10-04 13:56             ` Qian Cai
2019-10-04 14:41               ` Michal Hocko
2019-10-05 21:22                 ` Andrew Morton
2019-10-05 22:38                   ` Qian Cai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1570193776.5576.270.camel@lca.pw \
    --to=cai@lca.pw \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=dan.j.williams@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=osalvador@suse.de \
    --cc=pavel.tatashin@microsoft.com \
    --cc=rppt@linux.ibm.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.