linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@suse.com>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-mm@kvack.org,
	"Luis Claudio R. Goncalves" <lgoncalv@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Petr Mladek <pmladek@suse.com>
Subject: Re: [PATCH] mm/page_alloc: Use write_seqlock_irqsave() instead write_seqlock() + local_irq_save().
Date: Thu, 22 Jun 2023 14:09:03 +0200
Message-ID: <ZJQ53/aG48VqnxSA@dhcp22.suse.cz>
In-Reply-To: <7758a46f-69a9-c585-53e0-9b1b220b75c0@I-love.SAKURA.ne.jp>

On Thu 22-06-23 19:58:33, Tetsuo Handa wrote:
> On 2023/06/22 16:18, Michal Hocko wrote:
> >>> It is explained as the first deadlock scenario in commit 1007843a9190
> >>> ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock").
> >>> We have to disable IRQ before making zonelist_update_seq.seqcount odd.
> >>>
> >>
> >> Since we must replace local_irq_save() + write_seqlock() with write_seqlock_irqsave() for
> >> CONFIG_PREEMPT_RT=y case but we must not replace local_irq_save() + write_seqlock() with
> >> write_seqlock_irqsave() for CONFIG_PREEMPT_RT=n case, the proper fix is something like below?
> > 
> > Now, I am confused. Why is write_seqlock_irqsave not allowed for !RT?
> > Let me quote scenario 1 from the changelog:
> >         write_seqlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount odd
> >         // e.g. timer interrupt handler runs at this moment
> >           some_timer_func() {
> >             kmalloc(GFP_ATOMIC) {
> >               __alloc_pages_slowpath() {
> >                 read_seqbegin(&zonelist_update_seq) {
> >                   // spins forever because zonelist_update_seq.seqcount is odd
> >                 }
> >               }
> >             }
> >           }
> >         // e.g. timer interrupt handler finishes
> >         write_sequnlock(&zonelist_update_seq); // makes zonelist_update_seq.seqcount even
> > 
> > This is clearly impossible with write_seqlock_irqsave as interrupts are
> > disabled before the lock is taken.
> 
> Well, it seems that it is "I don't want to replace" rather than "we must not replace".

OK, so this is an alternative fix rather than the proposed fix being
incorrect.
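
To make the quoted scenario 1 concrete, here is a toy single-CPU model in
plain userspace C. It is only a sketch: none of the model_* names exist in
the kernel, and it merely records whether there is ever a moment at which
the count is odd while an interrupt could still run on the same CPU.

```c
#include <stdbool.h>

/* Toy single-CPU model. None of the model_* names are kernel APIs. */
struct model_cpu {
	unsigned int seq;   /* odd while a write is in progress */
	bool irqs_enabled;  /* could an interrupt handler run right now? */
	bool hazard_seen;   /* an IRQ reader could have observed an odd seq */
};

/* Run after every step: record the window in which scenario 1 can hit. */
static void model_check(struct model_cpu *cpu)
{
	if (cpu->irqs_enabled && (cpu->seq & 1))
		cpu->hazard_seen = true;
}

static void model_set_irqs(struct model_cpu *cpu, bool enabled)
{
	cpu->irqs_enabled = enabled;
	model_check(cpu);
}

/* Stands in for both making the count odd and making it even again. */
static void model_seq_inc(struct model_cpu *cpu)
{
	cpu->seq++;
	model_check(cpu);
}

/* Plain write_seqlock()/write_sequnlock() with interrupts left enabled. */
bool model_plain_write_has_hazard(void)
{
	struct model_cpu cpu = { .seq = 0, .irqs_enabled = true, .hazard_seen = false };

	model_seq_inc(&cpu);  /* odd, IRQs on: a reader in IRQ context would spin */
	model_seq_inc(&cpu);  /* even again */
	return cpu.hazard_seen;
}

/* write_seqlock_irqsave() style ordering: IRQs off before the count goes odd. */
bool model_irqsave_write_has_hazard(void)
{
	struct model_cpu cpu = { .seq = 0, .irqs_enabled = true, .hazard_seen = false };

	model_set_irqs(&cpu, false);
	model_seq_inc(&cpu);  /* odd, but no interrupt can run on this CPU */
	model_seq_inc(&cpu);  /* even again */
	model_set_irqs(&cpu, true);
	return cpu.hazard_seen;
}
```

In this model the plain variant reports the hazard and the irqsave-style
variant does not, which is all the !RT argument needs. The model says
nothing about PREEMPT_RT.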

> I reread the thread but I couldn't find why nobody suggested write_seqlock_irqsave().
> The reason I proposed the
> 
>   local_irq_save() => printk_deferred_enter() => write_seqlock()
> 
> ordering is a precaution in case write_seqlock() involves printk() (e.g. lockdep,
> KCSAN, soft-lockup warning), in addition to "local_irq_save() before printk_deferred_enter()"
> requirement. Maybe people in that thread were happy with preserving this precaution...

Precaution is a fair argument. I am not sure it is the strongest one to
justify the ugly RT special casing though. I would propose to go with
Sebastian's patch as a clear fix and, if you really care about the
precaution, then make sure you describe the potential problems.

> You commented
> 
>   There shouldn't be any other locks (apart from hotplug) taken in that path IIRC.
> 
> at https://lkml.kernel.org/ZCrYQj+2/uMtqNBm@dhcp22.suse.cz .
> 
> If __build_all_zonelists() is already serialized by hotplug lock, we don't
> need to call spin_lock(&zonelist_update_seq.lock) and we will be able to
> replace write_seqlock(&zonelist_update_seq) with
> write_seqcount_begin(&zonelist_update_seq.seqcount) like
> cpuset_change_task_nodemask() does?

Maybe, I haven't really dug into this any deeper. One way or the other,
RT requires special IRQ handling along with the seq lock, no?
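
For reference, the difference between the two writer flavours in the
question can be sketched in userspace C. This is only an illustration
under stated assumptions: the toy_* types and functions are made up, they
are not the kernel's seqlock_t/seqcount_t, and an atomic_flag stands in
for the writer-side spinlock. The second variant assumes that some outer
lock (the hotplug lock in the question) already excludes other writers.

```c
#include <stdatomic.h>

/* Toy userspace stand-ins. These are NOT the kernel's seqcount_t/seqlock_t. */
struct toy_seqcount {
	atomic_uint sequence;  /* odd while an update is in flight */
};

struct toy_seqlock {
	struct toy_seqcount seqcount;
	atomic_flag lock;      /* writer-side lock, spinlock stand-in */
};

void toy_write_seqcount_begin(struct toy_seqcount *s)
{
	atomic_fetch_add(&s->sequence, 1);  /* even -> odd */
}

void toy_write_seqcount_end(struct toy_seqcount *s)
{
	atomic_fetch_add(&s->sequence, 1);  /* odd -> even */
}

/* seqlock flavour: take the embedded lock, then bump the count. */
void toy_write_seqlock(struct toy_seqlock *sl)
{
	while (atomic_flag_test_and_set(&sl->lock))
		;  /* spin until the previous writer is done */
	toy_write_seqcount_begin(&sl->seqcount);
}

void toy_write_sequnlock(struct toy_seqlock *sl)
{
	toy_write_seqcount_end(&sl->seqcount);
	atomic_flag_clear(&sl->lock);
}

/* Update through the embedded lock; returns the final sequence value. */
unsigned int toy_update_with_embedded_lock(int *data, int value)
{
	struct toy_seqlock sl = { .seqcount = { 0 }, .lock = ATOMIC_FLAG_INIT };

	toy_write_seqlock(&sl);
	*data = value;
	toy_write_sequnlock(&sl);
	return atomic_load(&sl.seqcount.sequence);
}

/*
 * Update with only the bare counter. The caller is assumed to hold some
 * outer lock that already excludes other writers.
 */
unsigned int toy_update_externally_serialized(int *data, int value)
{
	struct toy_seqcount sc = { 0 };

	toy_write_seqcount_begin(&sc);
	*data = value;
	toy_write_seqcount_end(&sc);
	return atomic_load(&sc.sequence);
}
```

Neither toy variant does anything about interrupts, which is the point of
my question: dropping the embedded lock removes the writer serialization
but leaves the IRQ side to the caller. If I remember correctly,
cpuset_change_task_nodemask() disables interrupts around its
write_seqcount_begin()/write_seqcount_end() pair for that reason.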

-- 
Michal Hocko
SUSE Labs

