public inbox for linux-kernel@vger.kernel.org
From: Tejun Heo <tj@kernel.org>
To: junxiao.bi@oracle.com
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	gregkh@linuxfoundation.org
Subject: Re: [PATCH] kernfs: support kernfs notify in memory recliam context
Date: Tue, 14 Nov 2023 10:24:23 -1000
Message-ID: <ZVPXd-3TshjeScek@slm.duckdns.org>
In-Reply-To: <c71f1cb7-14d6-45e4-9df1-dc9bc82deda8@oracle.com>

Hello,

On Tue, Nov 14, 2023 at 12:09:19PM -0800, junxiao.bi@oracle.com wrote:
> On 11/14/23 11:06 AM, Tejun Heo wrote:
> > On Tue, Nov 14, 2023 at 10:59:47AM -0800, Junxiao Bi wrote:
> > > kernfs notify is used in write path of md (md_write_start) to wake up
> > > userspace daemon, like "mdmon" for updating md superblock of imsm raid,
> > > md write will wait for that update done before issuing the write, if this
> > How is forward progress guaranteed for that userspace daemon? This sounds
> > like a really fragile setup.
> 
> For imsm raid, userspace daemon "mdmon" is responsible for updating raid
> metadata, kernel will use kernfs_notify to wake up the daemon anywhere
> metadata update is required. If the daemon can't move forward, writes may
> hang, but that would be a bug in the daemon?

I see. That sounds very fragile and I'm not quite sure that can ever be made
to work reliably. While memlocking everything needed and being really
judicious about which syscalls to make will probably get you pretty far,
there are things like task_work which get scheduled and executed when a
task is about to return to userspace. Those things are allowed to grab e.g.
mutexes in the kernel and allocate memory and so on, and can deadlock under
memory pressure.
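
As an illustration of the "memlocking everything needed" part only, a
userspace daemon could pin its pages with mlockall(2). This is a minimal
sketch, not code taken from mdmon; the helper name lock_daemon_memory() is
hypothetical. It does nothing about the task_work problem described above.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from mdmon): pin all current and future
 * pages of the calling process into RAM so the daemon's own memory
 * cannot be paged out under memory pressure.
 *
 * Returns 0 on success or a negative errno value on failure, e.g.
 * -EPERM or -ENOMEM when RLIMIT_MEMLOCK is too low.
 */
static int lock_daemon_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
		int err = errno;

		fprintf(stderr, "mlockall failed: %s\n", strerror(err));
		return -err;
	}
	return 0;
}
```

A daemon would call this once at startup, before it starts waiting for
notifications, and would likely treat a failure as fatal.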

Even just looking at kernfs_notify, it down_reads()
root->kernfs_supers_rwsem which might already be write-locked by somebody
who's waiting on memory allocation.

The patch you're proposing removes one link in the dependency chain but
there are many more in there. I'm not sure this is fixable. Nobody writes kernel
code thinking that userspace code can be on the memory reclaim dependency
chain.
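
The dependency chain can be sketched as a small wait-for graph. This is a
toy model, not kernel code: the node names are made up for illustration,
the NOTIFY_WORK -> MEM_ALLOC edge is only an assumed stand-in for the link
the patch targets, and the RECLAIM -> MD_WRITE edge assumes that reclaim
needs writeback through the md device. The point it shows is that dropping
one edge still leaves a cycle through the other paths.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical node names for the parties mentioned in this thread. */
enum node {
	MD_WRITE,	/* md_write_start() waiting for a metadata update */
	MDMON_UPDATE,	/* userspace daemon updating the superblock */
	NOTIFY_WORK,	/* kernfs_notify() waking the daemon */
	SUPERS_RWSEM,	/* root->kernfs_supers_rwsem */
	TASK_WORK,	/* task_work run on return to userspace */
	MEM_ALLOC,	/* somebody waiting on a memory allocation */
	RECLAIM,	/* memory reclaim */
	NR_NODES
};

/* dep[a][b] == true means "a cannot make progress until b does". */
static bool dep[NR_NODES][NR_NODES];

static void build_chain(void)
{
	memset(dep, 0, sizeof(dep));
	dep[MD_WRITE][MDMON_UPDATE] = true;
	dep[MDMON_UPDATE][NOTIFY_WORK] = true;
	dep[MDMON_UPDATE][TASK_WORK] = true;
	dep[NOTIFY_WORK][MEM_ALLOC] = true;	/* assumed: link the patch removes */
	dep[NOTIFY_WORK][SUPERS_RWSEM] = true;
	dep[SUPERS_RWSEM][MEM_ALLOC] = true;	/* write-locker waits on allocation */
	dep[TASK_WORK][MEM_ALLOC] = true;
	dep[MEM_ALLOC][RECLAIM] = true;
	dep[RECLAIM][MD_WRITE] = true;		/* assumed: reclaim writes via md */
}

/* Depth-first search: can "from" reach "target" by following dep edges? */
static bool reaches(int from, int target, bool *seen)
{
	for (int next = 0; next < NR_NODES; next++) {
		if (!dep[from][next])
			continue;
		if (next == target)
			return true;
		if (seen[next])
			continue;
		seen[next] = true;
		if (reaches(next, target, seen))
			return true;
	}
	return false;
}

/* The write can deadlock if it transitively waits on itself. */
static bool write_can_deadlock(void)
{
	bool seen[NR_NODES] = { false };

	return reaches(MD_WRITE, MD_WRITE, seen);
}
```

In this model, clearing dep[NOTIFY_WORK][MEM_ALLOC] after build_chain()
still leaves write_can_deadlock() returning true, because the rwsem and
task_work paths remain.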

Thanks.

-- 
tejun

Thread overview: 8+ messages
2023-11-14 18:59 [PATCH] kernfs: support kernfs notify in memory recliam context Junxiao Bi
2023-11-14 19:06 ` Tejun Heo
2023-11-14 20:09   ` junxiao.bi
2023-11-14 20:24     ` Tejun Heo [this message]
2023-11-14 23:53       ` junxiao.bi
2023-11-15 15:30         ` Mariusz Tkaczyk
2023-11-16 17:04           ` junxiao.bi
2023-11-17  8:36             ` Mariusz Tkaczyk
