linux-fsdevel.vger.kernel.org archive mirror
From: Jens Axboe <axboe@kernel.dk>
To: NeilBrown <neilb@suse.de>, Christian Brauner <brauner@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	Oleg Nesterov <oleg@redhat.com>,
	Chuck Lever <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nfs@vger.kernel.org
Subject: Re: [PATCH 1/2] Allow a kthread to declare that it calls task_work_run()
Date: Tue, 5 Dec 2023 15:03:46 -0700
Message-ID: <3609267c-3fcd-43d6-9b43-9f84bef029a2@kernel.dk>
In-Reply-To: <fb713388-661a-46e0-8925-6d169b46ff9c@kernel.dk>

On 12/5/23 2:58 PM, Jens Axboe wrote:
> On 12/5/23 2:28 PM, NeilBrown wrote:
>> On Tue, 05 Dec 2023, Christian Brauner wrote:
>>> On Mon, Dec 04, 2023 at 03:09:44PM -0700, Jens Axboe wrote:
>>>> On 12/4/23 2:02 PM, NeilBrown wrote:
>>>>> It isn't clear to me that _GPL is appropriate, but maybe the rules
>>>>> changed since last I looked..... are there rules?
>>>>>
>>>>> My reasoning was that the call is effectively part of the user-space
>>>>> ABI.  A user-space process can call this trivially by invoking any
>>>>> system call.  The user-space ABI is explicitly a boundary which the GPL
>>>>> does not cross.  So it doesn't seem appropriate to prevent non-GPL
>>>>> kernel code from doing something that non-GPL user-space code can
>>>>> trivially do.
>>>>
>>>> By that reasoning, basically everything in the kernel should be non-GPL
>>>> marked. And while task_work can get used by the application, it happens
>>>> only indirectly or implicitly. So I don't think this reasoning is sound
>>>> at all, it's not an exported ABI or API by itself.
>>>>
>>>> For me, the more core of an export it is, the stronger the reason it
>>>> should be GPL. FWIW, I don't think exporting task_work functionality is
> 
>>>
>>> Yeah, I'm not too fond of that part as well. I don't think we want to
>>> give modules the ability to mess with task work. This is just asking for
>>> trouble.
>>>
>>
>> Ok, maybe we need to reframe the problem then.
>>
>> Currently fput(), and hence filp_close(), take control away from kernel
>> threads in that they cannot be sure that a "close" has actually
>> completed.
>>
>> This is already a problem for nfsd.  When renaming a file, nfsd needs to
>> ensure any cached "open" that it has on the file is closed (else when
>> re-exporting an NFS filesystem it can result in a silly-rename).
>>
>> nfsd currently handles this case by calling flush_delayed_fput().  I
>> suspect you are no more happy about exporting that than you are about
>> exporting task_work_run(), but this solution isn't actually 100%
>> reliable.  If some other thread calls flush_delayed_fput() between nfsd
>> calling filp_close() and that same nfsd calling flush_delayed_fput(),
>> then the second flush can return before the first flush (in the other
>> thread) completes all the work it took on.
>>
>> What we really need - both for handling renames and for avoiding
>> possible memory exhaustion - is for nfsd to be able to reliably wait for
>> any fput() that it initiated to complete.
>>
>> How would you like the VFS to provide that service?
> 
> Since task_work happens in the context of your task already, why not
> just have a way to get it stashed into a list when final fput is done?
> This avoids all of this "let's expose task_work" and using the task list
> for that, which seems kind of pointless as you're just going to run it
> later on manually anyway.
> 
> In semi pseudo code:
> 
> bool fput_put_ref(struct file *file)
> {
> 	return atomic_long_dec_and_test(&file->f_count);
> }
> 
> void fput(struct file *file)
> {
> 	if (fput_put_ref(file)) {
> 		...
> 	}
> }
> 
> and then your nfsd_file_free() could do:
> 
> ret = filp_flush(file, id);
> if (fput_put_ref(file))
> 	llist_add(&file->f_llist, &l->to_free_llist);
> 
> or something like that, where l->to_free_llist is wherever you'd
> otherwise punt the actual freeing to.

Should probably have the put_ref or whatever helper also init the
task_work, and then reuse the list in the callback_head there. Then
whoever flushes it has to call ->func() and avoid exposing ____fput() to
random users. But you get the idea.

-- 
Jens Axboe


