From: Jan Kara <jack@suse.cz>
To: Ashish Sangwan <a.sangwan@samsung.com>
Cc: Jan Kara <jack@suse.cz>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Eric Paris <eparis@redhat.com>,
	AMIT SAHRAWAT <a.sahrawat@samsung.com>,
	Namjae Jeon <namjae.jeon@samsung.com>,
	PANKAJ MISHRA <pankaj.m@samsung.com>,
	Lino Sanfilippo <LinoSanfilippo@gmx.de>
Subject: Re: Re: Re: [PATCH] fsnotify: fix a crash due to invalid virtual address
Date: Wed, 24 Jun 2015 13:22:14 +0200	[thread overview]
Message-ID: <20150624112214.GD17849@quack.suse.cz> (raw)
In-Reply-To: <166776534.176181435140830045.JavaMail.weblogic@ep2mlwas04a>

  Hello,

On Wed 24-06-15 10:13:50, Ashish Sangwan wrote:
> > On Tue 23-06-15 12:05:51, Ashish Sangwan wrote:
> > > > Looking into this in more detail, it might be worthwhile to revisit how
> > > > mark_mutex is used, since at least fanotify and dnotify use it for more
> > > > than just protecting the group's list of marks, and untangling this
> > > > would simplify things. But that's a longer-term goal.
> > > > 
> > > > A relatively simple fix for your issue is to split group list of marks into
> > > > a list of inode marks and a list of mount marks. Then destroying becomes
> > > > much simpler because we always discard the whole list (or both of them) and
> > > > we can easily avoid problems with list corruption when dropping the
> > > > mark_mutex. I can write the patch later or you can do that if you are
> > > Sorry, I could not understand why the group's list of marks needs to be
> > > split. I was browsing through the old code, from the days when mark_mutex
> > > was not present, and it looked like this:
> > > void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group,
> > > 					 unsigned int flags)
> > > {
> > > 	struct fsnotify_mark *lmark, *mark;
> > > 	LIST_HEAD(free_list);
> > > 
> > > 	spin_lock(&group->mark_lock);
> > > 	list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list) {
> > > 		if (mark->flags & flags) {
> > > 			list_add(&mark->free_g_list, &free_list);
> > > 			list_del_init(&mark->g_list);
> > > 			fsnotify_get_mark(mark);
> > > 		}
> > > 	}
> > > 	spin_unlock(&group->mark_lock);
> > > 
> > > 	list_for_each_entry_safe(mark, lmark, &free_list, free_g_list) {
> > > 		fsnotify_destroy_mark(mark);
> > > 		fsnotify_put_mark(mark);
> > > 	}
> > > }
> > > How about using a temporary on-stack list_head like the above?
> > 
> > So we can use a temporary list_head for entries selected for destruction as
> > well. *But* in either case you have to be careful, because even the
> > temporary free_list can be modified by concurrent mark destruction, e.g.
> > from a fsnotify_free_marks_by_inode() call, since each mark is on two
> > lists - one hanging from the inode / mount and one from the notification
> > group.
> True.
> > 
> > Actually, looking into it even fsnotify_destroy_marks() doesn't seem to be
> > properly protected from the race. Sigh.
> True.
> 
> So anything calling fsnotify_destroy_mark() is not protected against
> concurrent deletion from fsnotify_clear_marks_by_group_flags(), and vice
> versa. Plus, as you rightly pointed out, inode marks and vfsmount marks
> share the same list. So even when fsnotify_clear_marks_by_group_flags() is
> removing inode marks, a parallel fsnotify_destroy_mark() on a vfsmount mark
> can cause a crash because they share the same list.

So I was looking more into the issue today, and fsnotify_destroy_marks() is
fine in the end since it holds a reference to each mark (so the mark cannot
be freed) and it also uses a dedicated free_list list_head within the mark
for its temporary list, so nobody removes any mark from that list.

> diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> index d90deaa..d83ec7d 100644
> --- a/fs/notify/mark.c
> +++ b/fs/notify/mark.c
> @@ -124,14 +124,6 @@ void fsnotify_destroy_mark_locked(struct fsnotify_mark *mark,
> 
>         spin_lock(&mark->lock);
> 
> -       /* something else already called this function on this mark */
> -       if (!(mark->flags & FSNOTIFY_MARK_FLAG_ALIVE)) {
> -               spin_unlock(&mark->lock);
> -               return;
> -       }
> -
> -       mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
> -
>         if (mark->flags & FSNOTIFY_MARK_FLAG_INODE) {
>                 inode = mark->i.inode;
>                 fsnotify_destroy_inode_mark(mark);
> @@ -188,7 +180,10 @@ void fsnotify_destroy_mark(struct fsnotify_mark *mark,
>                            struct fsnotify_group *group)
>  {
>         mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
> -       fsnotify_destroy_mark_locked(mark, group);
> +       if (mark->flags & FSNOTIFY_MARK_FLAG_ALIVE) {
> +               mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
> +               fsnotify_destroy_mark_locked(mark, group);
> +       }
>         mutex_unlock(&group->mark_mutex);
>  }
> 
> @@ -293,14 +288,27 @@ void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group,
>                                          unsigned int flags)
>  {
>         struct fsnotify_mark *lmark, *mark;
> -
> +       LIST_HEAD(free_list);
> +
>         mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
> -       list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list) {
> +       /* Pass 1 : clear the alive flag and move to free list */
> +       list_for_each_entry_safe(mark, lmark, &group->marks_list, g_list)
>                 if (mark->flags & flags) {
> +                       /*
> +                        * If the mark is present on group's mark list
> +                        * it has to be alive.
> +                        */
> +                       WARN_ON(!(mark->flags & FSNOTIFY_MARK_FLAG_ALIVE));
> +                       list_del_init(&mark->g_list);
> +                       list_add(&mark->g_list, &free_list);
> +                       mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
> +               }
> +
> +       /* Pass 2: remove mark */
> +       list_for_each_entry_safe(mark, lmark, &free_list, g_list) {
>                         fsnotify_get_mark(mark);
>                         fsnotify_destroy_mark_locked(mark, group);
>                         fsnotify_put_mark(mark);
> -               }
>         }
>         mutex_unlock(&group->mark_mutex);
>  }

So this should fix the problem as well. However, the lifetime rules are so
convoluted that I'm thinking about how to make things simpler (especially to
unify how clearing via inode / mount and via group works). When investigating
the code I also found that there are possible issues with mark->inode getting
cleared while we are delivering an event. So I'm looking into fixing all
these problems.

									Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
