From: Andrew Morton <akpm@linux-foundation.org>
To: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: jack@suse.cz, amir73il@gmail.com, willy@infradead.org,
    linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    gorcunov@virtuozzo.com
Subject: Re: [PATCH v2] inotify: Extend ioctl to allow to request id of new watch descriptor
Date: Fri, 9 Feb 2018 12:56:56 -0800
Message-ID: <20180209125656.e440e0518540d6b76ae42bc0@linux-foundation.org>
In-Reply-To: <ca006760-de72-37b3-f6fd-311c86f29b62@virtuozzo.com>
On Fri, 9 Feb 2018 18:04:54 +0300 Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
> A watch descriptor is the id of a watch created by inotify_add_watch().
> It is allocated in inotify_add_to_idr(), and the numbering starts
> from 1. Every new inotify watch obtains the next available
> number (usually old + 1), as served by idr_alloc_cyclic().
>
> The CRIU (Checkpoint/Restore In Userspace) project supports inotify
> files, and restores watch descriptors with the same numbers
> they had before the dump. Since there was no kernel support, we
> had to use a loop to add a watch with a specific descriptor id:
>
> 	while (1) {
> 		int wd;
>
> 		wd = inotify_add_watch(inotify_fd, path, mask);
> 		if (wd < 0) {
> 			break;
> 		} else if (wd == desired_wd_id) {
> 			ret = 0;
> 			break;
> 		}
>
> 		inotify_rm_watch(inotify_fd, wd);
> 	}
>
> (You may find the actual code at the link below:
> https://github.com/checkpoint-restore/criu/blob/v3.7/criu/fsnotify.c#L577)
>
> The loop is suboptimal and very expensive, but since there was no better
> kernel support, it was the only way to restore the descriptors. Fortunately,
> we had mostly met descriptors with small ids, and this approach worked
> well enough.
>
> But recently, containers using inotify with large watch descriptor ids
> have begun to appear, and this approach has stopped working altogether.
> When the descriptor id is around 0x34d71d6, the restoring process spins
> in a busy loop for a long time, the restore hangs, and the delay in
> migration from node to node is easily noticeable.
>
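To put a rough number on the cost described above, here is a toy userspace model of a cyclic id allocator. It is only a sketch, not kernel code: every toy_* name and TOY_MAX_ID are made up for illustration, and the real allocator is the kernel IDR, whose range goes up to INT_MAX. It counts how many allocations the add/remove loop needs to reach a given id, and shows that moving the cursor first gets there in one.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_ID 4096			/* hypothetical cap, for the model only */

struct toy_idr {
	bool used[TOY_MAX_ID + 1];	/* ids 1..TOY_MAX_ID */
	int next;			/* cursor, like idr::idr_next */
};

static void toy_init(struct toy_idr *t)
{
	memset(t->used, 0, sizeof(t->used));
	t->next = 1;
}

/* Hand out the first free id at or after the cursor, wrapping to 1. */
static int toy_alloc_cyclic(struct toy_idr *t)
{
	int start = (t->next >= 1 && t->next <= TOY_MAX_ID) ? t->next : 1;

	for (int n = 0; n < TOY_MAX_ID; n++) {
		int id = 1 + (start - 1 + n) % TOY_MAX_ID;

		if (!t->used[id]) {
			t->used[id] = true;
			t->next = id + 1;
			return id;
		}
	}
	return -1;			/* every id is taken */
}

static void toy_remove(struct toy_idr *t, int id)
{
	t->used[id] = false;
}

/* Models what the quoted changelog says idr_set_cursor() is used for. */
static void toy_set_cursor(struct toy_idr *t, int val)
{
	t->next = val;
}

/*
 * The add/remove loop from the changelog, against the toy allocator.
 * Returns the number of allocations needed, or -1 if 'desired' is out
 * of range or already occupied (the loop could never reach it).
 */
static long toy_restore_by_looping(struct toy_idr *t, int desired)
{
	long calls = 0;

	if (desired < 1 || desired > TOY_MAX_ID || t->used[desired])
		return -1;

	for (;;) {
		int id = toy_alloc_cyclic(t);

		calls++;
		if (id < 0)
			return -1;
		if (id == desired)
			return calls;
		toy_remove(t, id);
	}
}
```

On a fresh toy_idr, toy_restore_by_looping(&t, 1000) takes 1000 allocations, while toy_set_cursor(&t, 1000) followed by a single toy_alloc_cyclic(&t) returns 1000 directly. By the same arithmetic, and assuming a fresh inotify instance, an id of 0x34d71d6 would mean about 55 million inotify_add_watch()/inotify_rm_watch() pairs in the real loop.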
> This patch aims to solve this problem. It introduces a new ioctl,
> INOTIFY_IOC_SETNEXTWD, which allows userspace to request the number of
> the next created watch descriptor. It simply calls the idr_set_cursor()
> primitive to populate idr::idr_next, so that the next idr_alloc_cyclic()
> allocation will return this id if it is not occupied. This is the same
> approach that is used to restore some other resources from userspace.
> For example, /proc/sys/kernel/ns_last_pid works the same way for task pids.
>
> The new code is under the CONFIG_CHECKPOINT_RESTORE #define, so small
> systems may exclude it.
>
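For reference, the userspace side of the interface described above might look roughly like the sketch below. It is not taken from CRIU. The helper name add_watch_with_id() is hypothetical, the fallback #define is an assumption about how the request number is encoded in <linux/inotify.h>, and EBUSY is just my choice for reporting that the id was occupied. The id is passed by value as the ioctl argument, matching the range check on arg in the patch.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/inotify.h>
#include <sys/ioctl.h>

/*
 * Assumption: fallback for userspace headers that do not provide the
 * request number. Check it against your kernel's <linux/inotify.h>.
 */
#ifndef INOTIFY_IOC_SETNEXTWD
#define INOTIFY_IOC_SETNEXTWD	_IOW('I', 0, int32_t)
#endif

/*
 * Hypothetical helper: add a watch on 'path' and insist that it gets
 * descriptor 'desired_wd'. Returns the wd on success, -1 with errno set
 * on failure (for example when the kernel lacks the ioctl or was built
 * without CONFIG_CHECKPOINT_RESTORE).
 */
static int add_watch_with_id(int inotify_fd, const char *path,
			     uint32_t mask, int desired_wd)
{
	int wd;

	/* Ask the kernel to try 'desired_wd' for the next allocation. */
	if (ioctl(inotify_fd, INOTIFY_IOC_SETNEXTWD,
		  (unsigned long)desired_wd) < 0)
		return -1;

	wd = inotify_add_watch(inotify_fd, path, mask);
	if (wd < 0)
		return -1;

	/*
	 * The request is only a hint: if the id was occupied, a different
	 * one comes back. Undo the watch and report the mismatch.
	 */
	if (wd != desired_wd) {
		inotify_rm_watch(inotify_fd, wd);
		errno = EBUSY;
		return -1;
	}

	return wd;
}
```

A caller would use this once per watch on a freshly created inotify fd, e.g. add_watch_with_id(fd, path, mask, 0x34d71d6), instead of the add/remove loop. Note that inotify_add_watch() returns the existing descriptor when the same object is already watched through that fd, so the helper only makes sense for objects not yet watched.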
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
With a little cleanup:
--- a/fs/notify/inotify/inotify_user.c~inotify-extend-ioctl-to-allow-to-request-id-of-new-watch-descriptor-fix
+++ a/fs/notify/inotify/inotify_user.c
@@ -285,7 +285,6 @@ static int inotify_release(struct inode
 static long inotify_ioctl(struct file *file, unsigned int cmd,
 			  unsigned long arg)
 {
-	struct inotify_group_private_data *data __maybe_unused;
 	struct fsnotify_group *group;
 	struct fsnotify_event *fsn_event;
 	void __user *p;
@@ -294,7 +293,6 @@ static long inotify_ioctl(struct file *f
 	group = file->private_data;
 	p = (void __user *) arg;
-	data = &group->inotify_data;
 	pr_debug("%s: group=%p cmd=%u\n", __func__, group, cmd);
@@ -313,6 +311,9 @@ static long inotify_ioctl(struct file *f
 	case INOTIFY_IOC_SETNEXTWD:
 		ret = -EINVAL;
 		if (arg >= 1 && arg <= INT_MAX) {
+			struct inotify_group_private_data *data;
+
+			data = &group->inotify_data;
 			spin_lock(&data->idr_lock);
 			idr_set_cursor(&data->idr, (unsigned int)arg);
 			spin_unlock(&data->idr_lock);
_