* [PATCH] inotify: hide internal kernel bits from fdinfo
From: Dave Hansen @ 2015-09-21 18:45 UTC (permalink / raw)
To: dave
Cc: dave.hansen, avagin, akpm, gorcunov, xemul, eparis, john, rlove,
linux-kernel
From: Dave Hansen <dave.hansen@linux.intel.com>
There was a report that my patch:
inotify: actually check for invalid bits in sys_inotify_add_watch()
broke CRIU.
The reason is that CRIU looks up raw flags in /proc/$pid/fdinfo/*
to figure out how to rebuild inotify watches and then passes those
flags directly back in to the inotify API. One of those flags
(FS_EVENT_ON_CHILD) is set in mark->mask, but is not part of the
inotify API. It is used inside the kernel to _implement_ inotify
but it is not and has never been part of the API.
My patch above ensured that we only allow bits which are part of
the API (IN_ALL_EVENTS). This broke CRIU.
FS_EVENT_ON_CHILD is really internal to the kernel. It is set
_anyway_ on all inotify marks. So, CRIU was really just trying
to set a bit that was already set.
This patch hides that bit from fdinfo. CRIU will not see the
bit, not try to set it, and should work as before. We should not
have been exposing this bit in the first place, so this is a good
patch independent of the CRIU problem.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reported-by: Andrey Wagin <avagin@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: xemul@parallels.com
Cc: Eric Paris <eparis@redhat.com>
Cc: john@johnmccutchan.com
Cc: rlove@rlove.org
Cc: linux-kernel@vger.kernel.org
---
b/fs/notify/fdinfo.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff -puN fs/notify/fdinfo.c~fdinfo-mask fs/notify/fdinfo.c
--- a/fs/notify/fdinfo.c~fdinfo-mask 2015-09-21 10:24:01.031864268 -0700
+++ b/fs/notify/fdinfo.c 2015-09-21 10:25:04.335723826 -0700
@@ -82,9 +82,16 @@ static void inotify_fdinfo(struct seq_fi
 	inode_mark = container_of(mark, struct inotify_inode_mark, fsn_mark);
 	inode = igrab(mark->inode);
 	if (inode) {
+		/*
+		 * IN_ALL_EVENTS represents all of the mask bits
+		 * that we expose to userspace. There is at
+		 * least one bit (FS_EVENT_ON_CHILD) which is
+		 * used only internally to the kernel.
+		 */
+		u32 mask = mark->mask & IN_ALL_EVENTS;
 		seq_printf(m, "inotify wd:%x ino:%lx sdev:%x mask:%x ignored_mask:%x ",
 			   inode_mark->wd, inode->i_ino, inode->i_sb->s_dev,
-			   mark->mask, mark->ignored_mask);
+			   mask, mark->ignored_mask);
 		show_mark_fhandle(m, inode);
 		seq_putc(m, '\n');
 		iput(inode);
_
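
Editor's sketch (not part of the patch): the same masking can be applied on the userspace side by a tool that restores watches from fdinfo. The following is a minimal illustration in C, assuming the fdinfo line format printed by the seq_printf() call above. The helper names fdinfo_mask_to_api_mask() and parse_inotify_fdinfo_line() are hypothetical and are not taken from CRIU or the kernel. EXAMPLE_FS_EVENT_ON_CHILD repeats the kernel-internal bit value only so the effect is visible; it is not exported to userspace.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/inotify.h>

/*
 * Assumption: value of the kernel-internal FS_EVENT_ON_CHILD bit, copied
 * here purely for illustration. It is not part of the inotify uapi.
 */
#define EXAMPLE_FS_EVENT_ON_CHILD 0x08000000u

/* Hypothetical helper: keep only the event bits that userspace may pass. */
uint32_t fdinfo_mask_to_api_mask(uint32_t raw_mask)
{
	return raw_mask & IN_ALL_EVENTS;
}

/*
 * Hypothetical helper: parse one "inotify ..." line from
 * /proc/$pid/fdinfo/$fd and return the watch descriptor plus a mask that
 * is safe to hand back to inotify_add_watch().
 * Returns 0 on success, -1 if the line is not an inotify entry.
 */
int parse_inotify_fdinfo_line(const char *line, int *wd, uint32_t *mask)
{
	unsigned int wd_raw, sdev, raw_mask, ignored_mask;
	unsigned long ino;

	assert(line && wd && mask);

	/* All fields are printed in hex by the kernel. */
	if (sscanf(line,
		   "inotify wd:%x ino:%lx sdev:%x mask:%x ignored_mask:%x",
		   &wd_raw, &ino, &sdev, &raw_mask, &ignored_mask) != 5)
		return -1;

	*wd = (int)wd_raw;
	*mask = fdinfo_mask_to_api_mask(raw_mask);
	return 0;
}
```

On a kernel without this patch, a line carrying mask:8000fff would come back from the helper as 0xfff; on a patched kernel the mask is already clean and the helper leaves it unchanged.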
* Re: [PATCH] inotify: hide internal kernel bits from fdinfo
From: Eric Paris @ 2015-09-21 19:26 UTC (permalink / raw)
To: Dave Hansen
Cc: dave.hansen, avagin, akpm, gorcunov, xemul, john, rlove,
linux-kernel
Acked-by: Eric Paris <eparis@redhat.com>
* Re: [PATCH] inotify: hide internal kernel bits from fdinfo
From: Cyrill Gorcunov @ 2015-09-21 19:28 UTC (permalink / raw)
To: Dave Hansen
Cc: dave.hansen, avagin, akpm, xemul, eparis, john, rlove,
linux-kernel
On Mon, Sep 21, 2015 at 11:45:01AM -0700, Dave Hansen wrote:
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> There was a report that my patch:
>
> inotify: actually check for invalid bits in sys_inotify_add_watch()
>
> broke CRIU.
>
> The reason is that CRIU looks up raw flags in /proc/$pid/fdinfo/*
> to figure out how to rebuild inotify watches and then passes those
> flags directly back in to the inotify API. One of those flags
> (FS_EVENT_ON_CHILD) is set in mark->mask, but is not part of the
> inotify API. It is used inside the kernel to _implement_ inotify
> but it is not and has never been part of the API.
>
> My patch above ensured that we only allow bits which are part of
> the API (IN_ALL_EVENTS). This broke CRIU.
>
> FS_EVENT_ON_CHILD is really internal to the kernel. It is set
> _anyway_ on all inotify marks. So, CRIU was really just trying
> to set a bit that was already set.
>
> This patch hides that bit from fdinfo. CRIU will not see the
> bit, not try to set it, and should work as before. We should not
> have been exposing this bit in the first place, so this is a good
> patch independent of the CRIU problem.
>
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Reported-by: Andrey Wagin <avagin@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Thank you!
* Re: [PATCH] inotify: hide internal kernel bits from fdinfo
From: Andrey Wagin @ 2015-09-21 19:56 UTC (permalink / raw)
To: Dave Hansen
Cc: dave.hansen, Andrew Morton, Cyrill Gorcunov,
Павел Емельянов,
Eric Paris, John McCutchan, Robert Love, LKML
2015-09-21 21:45 GMT+03:00 Dave Hansen <dave@sr71.net>:
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> There was a report that my patch:
>
> inotify: actually check for invalid bits in sys_inotify_add_watch()
>
> broke CRIU.
>
> The reason is that CRIU looks up raw flags in /proc/$pid/fdinfo/*
> to figure out how to rebuild inotify watches and then passes those
> flags directly back in to the inotify API. One of those flags
> (FS_EVENT_ON_CHILD) is set in mark->mask, but is not part of the
> inotify API. It is used inside the kernel to _implement_ inotify
> but it is not and has never been part of the API.
>
> My patch above ensured that we only allow bits which are part of
> the API (IN_ALL_EVENTS). This broke CRIU.
>
> FS_EVENT_ON_CHILD is really internal to the kernel. It is set
> _anyway_ on all inotify marks. So, CRIU was really just trying
> to set a bit that was already set.
>
> This patch hides that bit from fdinfo. CRIU will not see the
> bit, not try to set it, and should work as before. We should not
> have been exposing this bit in the first place, so this is a good
> patch independent of the CRIU problem.
>
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Reported-by: Andrey Wagin <avagin@gmail.com>
Acked-by: Andrey Vagin <avagin@openvz.org>
Thanks,
Andrey