From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757258AbbIUT0L (ORCPT ); Mon, 21 Sep 2015 15:26:11 -0400
Received: from mx1.redhat.com ([209.132.183.28]:33189 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752695AbbIUT0K (ORCPT ); Mon, 21 Sep 2015 15:26:10 -0400
Message-ID: <1442863567.30986.45.camel@redhat.com>
Subject: Re: [PATCH] inotify: hide internal kernel bits from fdinfo
From: Eric Paris
To: Dave Hansen
Cc: dave.hansen@linux.intel.com, avagin@gmail.com,
	akpm@linux-foundation.org, gorcunov@openvz.org, xemul@parallels.com,
	john@johnmccutchan.com, rlove@rlove.org, linux-kernel@vger.kernel.org
Date: Mon, 21 Sep 2015 14:26:07 -0500
In-Reply-To: <20150921184501.E0313E5A@viggo.jf.intel.com>
References: <20150921184501.E0313E5A@viggo.jf.intel.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Acked-by: Eric Paris

On Mon, 2015-09-21 at 11:45 -0700, Dave Hansen wrote:
> From: Dave Hansen
> 
> There was a report that my patch:
> 
> 	inotify: actually check for invalid bits in
> sys_inotify_add_watch()
> 
> broke CRIU.
> 
> The reason is that CRIU looks up raw flags in /proc/$pid/fdinfo/*
> to figure out how to rebuild inotify watches and then passes those
> flags directly back in to the inotify API. One of those flags
> (FS_EVENT_ON_CHILD) is set in mark->mask, but is not part of the
> inotify API. It is used inside the kernel to _implement_ inotify
> but it is not and has never been part of the API.
> 
> My patch above ensured that we only allow bits which are part of
> the API (IN_ALL_EVENTS). This broke CRIU.
> 
> FS_EVENT_ON_CHILD is really internal to the kernel. It is set
> _anyway_ on all inotify marks. So, CRIU was really just trying
> to set a bit that was already set.
> 
> This patch hides that bit from fdinfo. CRIU will not see the
> bit, not try to set it, and should work as before. We should not
> have been exposing this bit in the first place, so this is a good
> patch independent of the CRIU problem.
> 
> Signed-off-by: Dave Hansen
> Reported-by: Andrey Wagin
> Cc: Andrew Morton
> Cc: Cyrill Gorcunov
> Cc: xemul@parallels.com
> Cc: Eric Paris
> Cc: john@johnmccutchan.com
> Cc: rlove@rlove.org
> Cc: linux-kernel@vger.kernel.org
> ---
> 
>  b/fs/notify/fdinfo.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff -puN fs/notify/fdinfo.c~fdinfo-mask fs/notify/fdinfo.c
> --- a/fs/notify/fdinfo.c~fdinfo-mask	2015-09-21
> 10:24:01.031864268 -0700
> +++ b/fs/notify/fdinfo.c	2015-09-21 10:25:04.335723826 -0700
> @@ -82,9 +82,16 @@ static void inotify_fdinfo(struct seq_fi
>  	inode_mark = container_of(mark, struct inotify_inode_mark,
> fsn_mark);
>  	inode = igrab(mark->inode);
>  	if (inode) {
> +		/*
> +		 * IN_ALL_EVENTS represents all of the mask bits
> +		 * that we expose to userspace. There is at
> +		 * least one bit (FS_EVENT_ON_CHILD) which is
> +		 * used only internally to the kernel.
> +		 */
> +		u32 mask = mark->mask & IN_ALL_EVENTS;
>  		seq_printf(m, "inotify wd:%x ino:%lx sdev:%x mask:%x
> ignored_mask:%x ",
>  			   inode_mark->wd, inode->i_ino, inode->i_sb
> ->s_dev,
> -			   mark->mask, mark->ignored_mask);
> +			   mask, mark->ignored_mask);
>  		show_mark_fhandle(m, inode);
>  		seq_putc(m, '\n');
>  		iput(inode);
> _
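
For anyone following along from userspace, the effect of the masking in
the hunk above can be sketched in a few lines of C. This is only an
illustration, not code from the patch: the helper name
fdinfo_visible_mask() is made up, and FS_EVENT_ON_CHILD is not exported
to userspace, so the value below is an assumption copied in for the
example. IN_ALL_EVENTS, IN_MODIFY and IN_CREATE come from
<sys/inotify.h>.

```c
#include <stdint.h>
#include <sys/inotify.h>	/* IN_ALL_EVENTS, IN_MODIFY, IN_CREATE */

/*
 * ASSUMPTION: the kernel-internal flag is not visible to userspace, so
 * its value is reproduced here purely for illustration.
 */
#define FS_EVENT_ON_CHILD_ASSUMED 0x08000000u

/*
 * Hypothetical helper mirroring the masking step in the patch: keep only
 * the bits covered by IN_ALL_EVENTS and drop everything else, including
 * the internal FS_EVENT_ON_CHILD bit.
 */
uint32_t fdinfo_visible_mask(uint32_t raw_mask)
{
	return raw_mask & IN_ALL_EVENTS;
}
```

Given a raw mask of IN_MODIFY | IN_CREATE | FS_EVENT_ON_CHILD_ASSUMED,
the helper returns IN_MODIFY | IN_CREATE, which is a value that can be
passed back to inotify_add_watch() without tripping the invalid-bits
check described above.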