Re: [PATCH 0/22] fsnotify: Avoid SRCU stalls with fanotify permission events

public inbox for linux-audit@redhat.com
 help / color / mirror / Atom feed

* Re: [PATCH 0/22] fsnotify: Avoid SRCU stalls with fanotify permission events
       [not found] <20161222091538.28702-1-jack@suse.cz>
@ 2016-12-22 20:58 ` Paul Moore
  2016-12-22 21:05   ` Amir Goldstein
       [not found] ` <20161222091538.28702-5-jack@suse.cz>
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2016-12-22 20:58 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Amir Goldstein, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
> Hello,
>
> currently, fanotify waits for response to a permission even from userspace
> process while holding fsnotify_mark_srcu lock. That has a consequence that
> when userspace process takes long to respond or does not respond at all,
> fsnotify_mark_srcu period cannot ever complete blocking reclaim of any
> notification marks and also blocking any process that did synchronize_srcu()
> on fsnotify_mark_srcu. Effectively, this eventually blocks anybody interacting
> with the notification subsystem. Miklos has some real world reports of this
> happening. Although this in principle a problem of broken userspace
> application (which futhermore has to have CAP_SYS_ADMIN in init_user_ns, so
> it is not a security problem), it is still nasty that a simple error can
> block the kernel like this.
>
> This patch set solves this problem ...
>
> Patches have survived testing with inotify/fanotify tests in LTP. I didn't test
> audit - Paul can you give these patches some testing?  Since some of the
> changes are really non-trivial, I'd welcome if someone reviewed the patch set.

I'm going to take a look at the audit related patches right now,
expect some feedback shortly.

In the meantime, if you wanted to play a bit with some simple audit
regression tests, check out the testsuite below:

* https://github.com/linux-audit/audit-testsuite

... it is still rather simplistic, but the tests in tests/file_* and
tests/exec_name should do some basic exercises of the audit code that
leverages fsnotify.  If nothing else, it should give you some ideas
about how you might stress this a bit more with audit.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/22] fsnotify: Avoid SRCU stalls with fanotify permission events
  2016-12-22 20:58 ` [PATCH 0/22] fsnotify: Avoid SRCU stalls with fanotify permission events Paul Moore
@ 2016-12-22 21:05   ` Amir Goldstein
  2016-12-22 23:04     ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Amir Goldstein @ 2016-12-22 21:05 UTC (permalink / raw)
  To: Paul Moore
  Cc: Jan Kara, linux-fsdevel, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Thu, Dec 22, 2016 at 10:58 PM, Paul Moore <paul@paul-moore.com> wrote:
> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
>> Hello,
>>
>> currently, fanotify waits for response to a permission even from userspace
>> process while holding fsnotify_mark_srcu lock. That has a consequence that
>> when userspace process takes long to respond or does not respond at all,
>> fsnotify_mark_srcu period cannot ever complete blocking reclaim of any
>> notification marks and also blocking any process that did synchronize_srcu()
>> on fsnotify_mark_srcu. Effectively, this eventually blocks anybody interacting
>> with the notification subsystem. Miklos has some real world reports of this
>> happening. Although this in principle a problem of broken userspace
>> application (which futhermore has to have CAP_SYS_ADMIN in init_user_ns, so
>> it is not a security problem), it is still nasty that a simple error can
>> block the kernel like this.
>>
>> This patch set solves this problem ...
>>
>> Patches have survived testing with inotify/fanotify tests in LTP. I didn't test
>> audit - Paul can you give these patches some testing?  Since some of the
>> changes are really non-trivial, I'd welcome if someone reviewed the patch set.
>
> I'm going to take a look at the audit related patches right now,
> expect some feedback shortly.
>
> In the meantime, if you wanted to play a bit with some simple audit
> regression tests, check out the testsuite below:
>
> * https://github.com/linux-audit/audit-testsuite
>
> ... it is still rather simplistic, but the tests in tests/file_* and
> tests/exec_name should do some basic exercises of the audit code that
> leverages fsnotify.  If nothing else, it should give you some ideas
> about how you might stress this a bit more with audit.
>


Mmm that's interesting. I was looking for a good place to start with a proper
testsuite for fsnotify.
It seems like the 2 subsystems could use the same testsuite.

I will look into it.

Thanks!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/22] fsnotify: Avoid SRCU stalls with fanotify permission events
  2016-12-22 21:05   ` Amir Goldstein
@ 2016-12-22 23:04     ` Paul Moore
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Moore @ 2016-12-22 23:04 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jan Kara, linux-fsdevel, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Thu, Dec 22, 2016 at 4:05 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Dec 22, 2016 at 10:58 PM, Paul Moore <paul@paul-moore.com> wrote:
>> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
>>> Hello,
>>>
>>> currently, fanotify waits for response to a permission even from userspace
>>> process while holding fsnotify_mark_srcu lock. That has a consequence that
>>> when userspace process takes long to respond or does not respond at all,
>>> fsnotify_mark_srcu period cannot ever complete blocking reclaim of any
>>> notification marks and also blocking any process that did synchronize_srcu()
>>> on fsnotify_mark_srcu. Effectively, this eventually blocks anybody interacting
>>> with the notification subsystem. Miklos has some real world reports of this
>>> happening. Although this in principle a problem of broken userspace
>>> application (which futhermore has to have CAP_SYS_ADMIN in init_user_ns, so
>>> it is not a security problem), it is still nasty that a simple error can
>>> block the kernel like this.
>>>
>>> This patch set solves this problem ...
>>>
>>> Patches have survived testing with inotify/fanotify tests in LTP. I didn't test
>>> audit - Paul can you give these patches some testing?  Since some of the
>>> changes are really non-trivial, I'd welcome if someone reviewed the patch set.
>>
>> I'm going to take a look at the audit related patches right now,
>> expect some feedback shortly.
>>
>> In the meantime, if you wanted to play a bit with some simple audit
>> regression tests, check out the testsuite below:
>>
>> * https://github.com/linux-audit/audit-testsuite
>>
>> ... it is still rather simplistic, but the tests in tests/file_* and
>> tests/exec_name should do some basic exercises of the audit code that
>> leverages fsnotify.  If nothing else, it should give you some ideas
>> about how you might stress this a bit more with audit.
>
> Mmm that's interesting. I was looking for a good place to start with a proper
> testsuite for fsnotify.
> It seems like the 2 subsystems could use the same testsuite.
>
> I will look into it.
>
> Thanks!

No problem, I'm glad it's helpful.

FWIW, it's based off ideas from the selinux-testsuite (link below);
the general motivation being a quick and easy regression test that can
be used to verify patches and general upstream development.

* https://github.com/SELinuxProject/selinux-testsuite

In addition to individual commit testing, I've combined both the audit
and SELinux testsuites with a semi-automated weekly kernel build to
test both the -rcX releases as well the selinux/next and audit/next
branches; it's proved quite beneficial.  In case you're curious, I did
a short presentation on it this summer (slides and video at the link
below).  If you are interested, I'm happy to talk about it further,
but perhaps in another thread - I don't want to hijack Jan's patchset
with marginally relevant testing discussion :)

* http://www.paul-moore.com/blog/d/2016/08/flock-kernel-testing.html

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 04/22] fsnotify: Remove fsnotify_duplicate_mark()
       [not found] ` <20161222091538.28702-5-jack@suse.cz>
@ 2016-12-22 23:13   ` Paul Moore
  2016-12-23 13:22     ` Jan Kara
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2016-12-22 23:13 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Amir Goldstein, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
> There are only two calls sites of fsnotify_duplicate_mark(). Those are
> in kernel/audit_tree.c and both are bogus. Vfsmount pointer is unused
> for audit tree, inode pointer and group gets set in
> fsnotify_add_mark_locked() later anyway, mask and free_mark are already
> set in alloc_chunk(). In fact, calling fsnotify_duplicate_mark() is
> actively harmful because following fsnotify_add_mark_locked() will leak
> group reference by overwriting the group pointer. So just remove the two
> calls to fsnotify_duplicate_mark() and the function.
>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/notify/mark.c                 | 12 ------------
>  include/linux/fsnotify_backend.h |  2 --
>  kernel/audit_tree.c              |  6 ++----
>  3 files changed, 2 insertions(+), 18 deletions(-)

At first glance this looks reasonable, although you keep mentioning
"fsnotify_add_mark_locked" above when untag_chunk() is calling
"fsnotify_add_mark"; I just wanted to make sure you hadn't intended to
take the mutex in the audit code instead of relying on the locking in
fsnotify_add_mark().

> diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> index d3fea0bd89e2..6043306e8e21 100644
> --- a/fs/notify/mark.c
> +++ b/fs/notify/mark.c
> @@ -510,18 +510,6 @@ void fsnotify_detach_group_marks(struct fsnotify_group *group)
>         }
>  }
>
> -void fsnotify_duplicate_mark(struct fsnotify_mark *new, struct fsnotify_mark *old)
> -{
> -       assert_spin_locked(&old->lock);
> -       new->inode = old->inode;
> -       new->mnt = old->mnt;
> -       if (old->group)
> -               fsnotify_get_group(old->group);
> -       new->group = old->group;
> -       new->mask = old->mask;
> -       new->free_mark = old->free_mark;
> -}
> -
>  /*
>   * Nothing fancy, just initialize lists and locks and counters.
>   */
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index 0cf34d6cc253..487246546ebe 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -323,8 +323,6 @@ extern void fsnotify_init_mark(struct fsnotify_mark *mark, void (*free_mark)(str
>  extern struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group, struct inode *inode);
>  /* find (and take a reference) to a mark associated with group and vfsmount */
>  extern struct fsnotify_mark *fsnotify_find_vfsmount_mark(struct fsnotify_group *group, struct vfsmount *mnt);
> -/* copy the values from old into new */
> -extern void fsnotify_duplicate_mark(struct fsnotify_mark *new, struct fsnotify_mark *old);
>  /* set the ignored_mask of a mark */
>  extern void fsnotify_set_mark_ignored_mask_locked(struct fsnotify_mark *mark, __u32 mask);
>  /* set the mask of a mark (might pin the object into memory */
> diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> index 8b1dde96a0fa..f3130eb0a4bd 100644
> --- a/kernel/audit_tree.c
> +++ b/kernel/audit_tree.c
> @@ -258,8 +258,7 @@ static void untag_chunk(struct node *p)
>         if (!new)
>                 goto Fallback;
>
> -       fsnotify_duplicate_mark(&new->mark, entry);
> -       if (fsnotify_add_mark(&new->mark, new->mark.group, new->mark.inode, NULL, 1)) {
> +       if (fsnotify_add_mark(&new->mark, entry->group, entry->inode, NULL, 1)) {
>                 fsnotify_put_mark(&new->mark);
>                 goto Fallback;
>         }
> @@ -395,8 +394,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
>                 return -ENOENT;
>         }
>
> -       fsnotify_duplicate_mark(chunk_entry, old_entry);
> -       if (fsnotify_add_mark(chunk_entry, chunk_entry->group, chunk_entry->inode, NULL, 1)) {
> +       if (fsnotify_add_mark(chunk_entry, old_entry->group, old_entry->inode, NULL, 1)) {
>                 spin_unlock(&old_entry->lock);
>                 fsnotify_put_mark(chunk_entry);
>                 fsnotify_put_mark(old_entry);
> --
> 2.10.2
>

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 05/22] audit: Fix sleep in atomic
       [not found] ` <20161222091538.28702-6-jack@suse.cz>
@ 2016-12-22 23:18   ` Paul Moore
  2016-12-23 13:24     ` Jan Kara
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2016-12-22 23:18 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Amir Goldstein, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
> Audit tree code was happily adding new notification marks while holding
> spinlocks. Since fsnotify_add_mark() acquires group->mark_mutex this can
> lead to sleeping while holding a spinlock, deadlocks due to lock
> inversion, and probably other fun. Fix the problem by acquiring
> group->mark_mutex earlier.
>
> CC: Paul Moore <paul@paul-moore.com>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  kernel/audit_tree.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)

[SIDE NOTE: this patch explains your comments and my earlier concern
about the locked/unlocked variants of fsnotify_add_mark() in
untag_chunk()]

Ouch.  Thanks for catching this ... what is your goal with these
patches, are you targeting this as a fix during the v4.10-rcX cycle?
If not, any objections if I pull this patch into the audit tree and
send this to Linus during the v4.10-rcX cycle (assuming it passes
testing, yadda yadda)?

> diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> index f3130eb0a4bd..156b6a93f4fc 100644
> --- a/kernel/audit_tree.c
> +++ b/kernel/audit_tree.c
> @@ -231,6 +231,7 @@ static void untag_chunk(struct node *p)
>         if (size)
>                 new = alloc_chunk(size);
>
> +       mutex_lock(&entry->group->mark_mutex);
>         spin_lock(&entry->lock);
>         if (chunk->dead || !entry->inode) {
>                 spin_unlock(&entry->lock);
> @@ -258,7 +259,8 @@ static void untag_chunk(struct node *p)
>         if (!new)
>                 goto Fallback;
>
> -       if (fsnotify_add_mark(&new->mark, entry->group, entry->inode, NULL, 1)) {
> +       if (fsnotify_add_mark_locked(&new->mark, entry->group, entry->inode,
> +                                    NULL, 1)) {
>                 fsnotify_put_mark(&new->mark);
>                 goto Fallback;
>         }
> @@ -309,6 +311,7 @@ static void untag_chunk(struct node *p)
>         spin_unlock(&hash_lock);
>         spin_unlock(&entry->lock);
>  out:
> +       mutex_unlock(&entry->group->mark_mutex);
>         fsnotify_put_mark(entry);
>         spin_lock(&hash_lock);
>  }
> @@ -385,17 +388,21 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
>
>         chunk_entry = &chunk->mark;
>
> +       mutex_lock(&old_entry->group->mark_mutex);
>         spin_lock(&old_entry->lock);
>         if (!old_entry->inode) {
>                 /* old_entry is being shot, lets just lie */
>                 spin_unlock(&old_entry->lock);
> +               mutex_unlock(&old_entry->group->mark_mutex);
>                 fsnotify_put_mark(old_entry);
>                 free_chunk(chunk);
>                 return -ENOENT;
>         }
>
> -       if (fsnotify_add_mark(chunk_entry, old_entry->group, old_entry->inode, NULL, 1)) {
> +       if (fsnotify_add_mark_locked(chunk_entry, old_entry->group,
> +                                    old_entry->inode, NULL, 1)) {
>                 spin_unlock(&old_entry->lock);
> +               mutex_unlock(&old_entry->group->mark_mutex);
>                 fsnotify_put_mark(chunk_entry);
>                 fsnotify_put_mark(old_entry);
>                 return -ENOSPC;
> @@ -411,6 +418,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
>                 chunk->dead = 1;
>                 spin_unlock(&chunk_entry->lock);
>                 spin_unlock(&old_entry->lock);
> +               mutex_unlock(&old_entry->group->mark_mutex);
>
>                 fsnotify_destroy_mark(chunk_entry, audit_tree_group);
>
> @@ -443,6 +451,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
>         spin_unlock(&hash_lock);
>         spin_unlock(&chunk_entry->lock);
>         spin_unlock(&old_entry->lock);
> +       mutex_unlock(&old_entry->group->mark_mutex);
>         fsnotify_destroy_mark(old_entry, audit_tree_group);
>         fsnotify_put_mark(chunk_entry); /* drop initial reference */
>         fsnotify_put_mark(old_entry); /* pair to fsnotify_find mark_entry */
> --
> 2.10.2
>



-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 06/22] audit: Abstract hash key handling
       [not found] ` <20161222091538.28702-7-jack@suse.cz>
@ 2016-12-22 23:27   ` Paul Moore
  2016-12-23 13:27     ` Jan Kara
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2016-12-22 23:27 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Amir Goldstein, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
> Audit tree currently uses inode pointer as a key into the hash table.
> Getting that from notification mark will be somewhat more difficult with
> coming fsnotify changes and there's no reason we really have to use the
> inode pointer. So abstract getting of hash key from the audit chunk and
> inode so that we can switch to a different key easily later.
>
> CC: Paul Moore <paul@paul-moore.com>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  kernel/audit_tree.c | 39 ++++++++++++++++++++++++++++-----------
>  1 file changed, 28 insertions(+), 11 deletions(-)

I have no objections with this patch in particular, but in patch 8,
are you certain that inode_to_key() and chunk_to_key() will continue
to return the same key value?

> diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> index 156b6a93f4fc..f0859828de09 100644
> --- a/kernel/audit_tree.c
> +++ b/kernel/audit_tree.c
> @@ -163,33 +163,48 @@ enum {HASH_SIZE = 128};
>  static struct list_head chunk_hash_heads[HASH_SIZE];
>  static __cacheline_aligned_in_smp DEFINE_SPINLOCK(hash_lock);
>
> -static inline struct list_head *chunk_hash(const struct inode *inode)
> +/* Function to return search key in our hash from inode. */
> +static unsigned long inode_to_key(const struct inode *inode)
>  {
> -       unsigned long n = (unsigned long)inode / L1_CACHE_BYTES;
> +       return (unsigned long)inode;
> +}
> +
> +/*
> + * Function to return search key in our hash from chunk. Key 0 is special and
> + * should never be present in the hash.
> + */
> +static unsigned long chunk_to_key(struct audit_chunk *chunk)
> +{
> +       return (unsigned long)chunk->mark.inode;
> +}
> +
> +static inline struct list_head *chunk_hash(unsigned long key)
> +{
> +       unsigned long n = key / L1_CACHE_BYTES;
>         return chunk_hash_heads + n % HASH_SIZE;
>  }
>
>  /* hash_lock & entry->lock is held by caller */
>  static void insert_hash(struct audit_chunk *chunk)
>  {
> -       struct fsnotify_mark *entry = &chunk->mark;
> +       unsigned long key = chunk_to_key(chunk);
>         struct list_head *list;
>
> -       if (!entry->inode)
> +       if (!key)
>                 return;
> -       list = chunk_hash(entry->inode);
> +       list = chunk_hash(key);
>         list_add_rcu(&chunk->hash, list);
>  }
>
>  /* called under rcu_read_lock */
>  struct audit_chunk *audit_tree_lookup(const struct inode *inode)
>  {
> -       struct list_head *list = chunk_hash(inode);
> +       unsigned long key = inode_to_key(inode);
> +       struct list_head *list = chunk_hash(key);
>         struct audit_chunk *p;
>
>         list_for_each_entry_rcu(p, list, hash) {
> -               /* mark.inode may have gone NULL, but who cares? */
> -               if (p->mark.inode == inode) {
> +               if (chunk_to_key(p) == key) {
>                         atomic_long_inc(&p->refs);
>                         return p;
>                 }
> @@ -585,7 +600,8 @@ int audit_remove_tree_rule(struct audit_krule *rule)
>
>  static int compare_root(struct vfsmount *mnt, void *arg)
>  {
> -       return d_backing_inode(mnt->mnt_root) == arg;
> +       return inode_to_key(d_backing_inode(mnt->mnt_root)) ==
> +              (unsigned long)arg;
>  }
>
>  void audit_trim_trees(void)
> @@ -620,9 +636,10 @@ void audit_trim_trees(void)
>                 list_for_each_entry(node, &tree->chunks, list) {
>                         struct audit_chunk *chunk = find_chunk(node);
>                         /* this could be NULL if the watch is dying else where... */
> -                       struct inode *inode = chunk->mark.inode;
>                         node->index |= 1U<<31;
> -                       if (iterate_mounts(compare_root, inode, root_mnt))
> +                       if (iterate_mounts(compare_root,
> +                                          (void *)chunk_to_key(chunk),
> +                                          root_mnt))
>                                 node->index &= ~(1U<<31);
>                 }
>                 spin_unlock(&hash_lock);
> --
> 2.10.2
>



-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 04/22] fsnotify: Remove fsnotify_duplicate_mark()
  2016-12-22 23:13   ` [PATCH 04/22] fsnotify: Remove fsnotify_duplicate_mark() Paul Moore
@ 2016-12-23 13:22     ` Jan Kara
  2016-12-23 14:01       ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kara @ 2016-12-23 13:22 UTC (permalink / raw)
  To: Paul Moore
  Cc: Jan Kara, linux-fsdevel, Amir Goldstein, Lino Sanfilippo,
	Miklos Szeredi, linux-audit

On Thu 22-12-16 18:13:11, Paul Moore wrote:
> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
> > There are only two calls sites of fsnotify_duplicate_mark(). Those are
> > in kernel/audit_tree.c and both are bogus. Vfsmount pointer is unused
> > for audit tree, inode pointer and group gets set in
> > fsnotify_add_mark_locked() later anyway, mask and free_mark are already
> > set in alloc_chunk(). In fact, calling fsnotify_duplicate_mark() is
> > actively harmful because following fsnotify_add_mark_locked() will leak
> > group reference by overwriting the group pointer. So just remove the two
> > calls to fsnotify_duplicate_mark() and the function.
> >
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/notify/mark.c                 | 12 ------------
> >  include/linux/fsnotify_backend.h |  2 --
> >  kernel/audit_tree.c              |  6 ++----
> >  3 files changed, 2 insertions(+), 18 deletions(-)
> 
> At first glance this looks reasonable, although you keep mentioning
> "fsnotify_add_mark_locked" above when untag_chunk() is calling
> "fsnotify_add_mark"; I just wanted to make sure you hadn't intended to
> take the mutex in the audit code instead of relying on the locking in
> fsnotify_add_mark().

No, I didn't want to take mutex in the audit code. It is just that
fsnotify_add_mark() is a thin wrapper around fsnotify_add_mark_locked() so
I was speaking about that function.

								Honza

> 
> > diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> > index d3fea0bd89e2..6043306e8e21 100644
> > --- a/fs/notify/mark.c
> > +++ b/fs/notify/mark.c
> > @@ -510,18 +510,6 @@ void fsnotify_detach_group_marks(struct fsnotify_group *group)
> >         }
> >  }
> >
> > -void fsnotify_duplicate_mark(struct fsnotify_mark *new, struct fsnotify_mark *old)
> > -{
> > -       assert_spin_locked(&old->lock);
> > -       new->inode = old->inode;
> > -       new->mnt = old->mnt;
> > -       if (old->group)
> > -               fsnotify_get_group(old->group);
> > -       new->group = old->group;
> > -       new->mask = old->mask;
> > -       new->free_mark = old->free_mark;
> > -}
> > -
> >  /*
> >   * Nothing fancy, just initialize lists and locks and counters.
> >   */
> > diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> > index 0cf34d6cc253..487246546ebe 100644
> > --- a/include/linux/fsnotify_backend.h
> > +++ b/include/linux/fsnotify_backend.h
> > @@ -323,8 +323,6 @@ extern void fsnotify_init_mark(struct fsnotify_mark *mark, void (*free_mark)(str
> >  extern struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group, struct inode *inode);
> >  /* find (and take a reference) to a mark associated with group and vfsmount */
> >  extern struct fsnotify_mark *fsnotify_find_vfsmount_mark(struct fsnotify_group *group, struct vfsmount *mnt);
> > -/* copy the values from old into new */
> > -extern void fsnotify_duplicate_mark(struct fsnotify_mark *new, struct fsnotify_mark *old);
> >  /* set the ignored_mask of a mark */
> >  extern void fsnotify_set_mark_ignored_mask_locked(struct fsnotify_mark *mark, __u32 mask);
> >  /* set the mask of a mark (might pin the object into memory */
> > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> > index 8b1dde96a0fa..f3130eb0a4bd 100644
> > --- a/kernel/audit_tree.c
> > +++ b/kernel/audit_tree.c
> > @@ -258,8 +258,7 @@ static void untag_chunk(struct node *p)
> >         if (!new)
> >                 goto Fallback;
> >
> > -       fsnotify_duplicate_mark(&new->mark, entry);
> > -       if (fsnotify_add_mark(&new->mark, new->mark.group, new->mark.inode, NULL, 1)) {
> > +       if (fsnotify_add_mark(&new->mark, entry->group, entry->inode, NULL, 1)) {
> >                 fsnotify_put_mark(&new->mark);
> >                 goto Fallback;
> >         }
> > @@ -395,8 +394,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
> >                 return -ENOENT;
> >         }
> >
> > -       fsnotify_duplicate_mark(chunk_entry, old_entry);
> > -       if (fsnotify_add_mark(chunk_entry, chunk_entry->group, chunk_entry->inode, NULL, 1)) {
> > +       if (fsnotify_add_mark(chunk_entry, old_entry->group, old_entry->inode, NULL, 1)) {
> >                 spin_unlock(&old_entry->lock);
> >                 fsnotify_put_mark(chunk_entry);
> >                 fsnotify_put_mark(old_entry);
> > --
> > 2.10.2
> >
> 
> -- 
> paul moore
> www.paul-moore.com
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 05/22] audit: Fix sleep in atomic
  2016-12-22 23:18   ` [PATCH 05/22] audit: Fix sleep in atomic Paul Moore
@ 2016-12-23 13:24     ` Jan Kara
  2016-12-23 14:17       ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kara @ 2016-12-23 13:24 UTC (permalink / raw)
  To: Paul Moore
  Cc: Jan Kara, linux-fsdevel, Amir Goldstein, Lino Sanfilippo,
	Miklos Szeredi, linux-audit

On Thu 22-12-16 18:18:36, Paul Moore wrote:
> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
> > Audit tree code was happily adding new notification marks while holding
> > spinlocks. Since fsnotify_add_mark() acquires group->mark_mutex this can
> > lead to sleeping while holding a spinlock, deadlocks due to lock
> > inversion, and probably other fun. Fix the problem by acquiring
> > group->mark_mutex earlier.
> >
> > CC: Paul Moore <paul@paul-moore.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  kernel/audit_tree.c | 13 +++++++++++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> [SIDE NOTE: this patch explains your comments and my earlier concern
> about the locked/unlocked variants of fsnotify_add_mark() in
> untag_chunk()]
> 
> Ouch.  Thanks for catching this ... what is your goal with these
> patches, are you targeting this as a fix during the v4.10-rcX cycle?
> If not, any objections if I pull this patch into the audit tree and
> send this to Linus during the v4.10-rcX cycle (assuming it passes
> testing, yadda yadda)?

Sure, go ahead. I plan these patches for the next merge window. So I can
rebase the series once you merge audit fixes...

								Honza
> 
> 
> > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> > index f3130eb0a4bd..156b6a93f4fc 100644
> > --- a/kernel/audit_tree.c
> > +++ b/kernel/audit_tree.c
> > @@ -231,6 +231,7 @@ static void untag_chunk(struct node *p)
> >         if (size)
> >                 new = alloc_chunk(size);
> >
> > +       mutex_lock(&entry->group->mark_mutex);
> >         spin_lock(&entry->lock);
> >         if (chunk->dead || !entry->inode) {
> >                 spin_unlock(&entry->lock);
> > @@ -258,7 +259,8 @@ static void untag_chunk(struct node *p)
> >         if (!new)
> >                 goto Fallback;
> >
> > -       if (fsnotify_add_mark(&new->mark, entry->group, entry->inode, NULL, 1)) {
> > +       if (fsnotify_add_mark_locked(&new->mark, entry->group, entry->inode,
> > +                                    NULL, 1)) {
> >                 fsnotify_put_mark(&new->mark);
> >                 goto Fallback;
> >         }
> > @@ -309,6 +311,7 @@ static void untag_chunk(struct node *p)
> >         spin_unlock(&hash_lock);
> >         spin_unlock(&entry->lock);
> >  out:
> > +       mutex_unlock(&entry->group->mark_mutex);
> >         fsnotify_put_mark(entry);
> >         spin_lock(&hash_lock);
> >  }
> > @@ -385,17 +388,21 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
> >
> >         chunk_entry = &chunk->mark;
> >
> > +       mutex_lock(&old_entry->group->mark_mutex);
> >         spin_lock(&old_entry->lock);
> >         if (!old_entry->inode) {
> >                 /* old_entry is being shot, lets just lie */
> >                 spin_unlock(&old_entry->lock);
> > +               mutex_unlock(&old_entry->group->mark_mutex);
> >                 fsnotify_put_mark(old_entry);
> >                 free_chunk(chunk);
> >                 return -ENOENT;
> >         }
> >
> > -       if (fsnotify_add_mark(chunk_entry, old_entry->group, old_entry->inode, NULL, 1)) {
> > +       if (fsnotify_add_mark_locked(chunk_entry, old_entry->group,
> > +                                    old_entry->inode, NULL, 1)) {
> >                 spin_unlock(&old_entry->lock);
> > +               mutex_unlock(&old_entry->group->mark_mutex);
> >                 fsnotify_put_mark(chunk_entry);
> >                 fsnotify_put_mark(old_entry);
> >                 return -ENOSPC;
> > @@ -411,6 +418,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
> >                 chunk->dead = 1;
> >                 spin_unlock(&chunk_entry->lock);
> >                 spin_unlock(&old_entry->lock);
> > +               mutex_unlock(&old_entry->group->mark_mutex);
> >
> >                 fsnotify_destroy_mark(chunk_entry, audit_tree_group);
> >
> > @@ -443,6 +451,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
> >         spin_unlock(&hash_lock);
> >         spin_unlock(&chunk_entry->lock);
> >         spin_unlock(&old_entry->lock);
> > +       mutex_unlock(&old_entry->group->mark_mutex);
> >         fsnotify_destroy_mark(old_entry, audit_tree_group);
> >         fsnotify_put_mark(chunk_entry); /* drop initial reference */
> >         fsnotify_put_mark(old_entry); /* pair to fsnotify_find mark_entry */
> > --
> > 2.10.2
> >
> 
> 
> 
> -- 
> paul moore
> www.paul-moore.com
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 06/22] audit: Abstract hash key handling
  2016-12-22 23:27   ` [PATCH 06/22] audit: Abstract hash key handling Paul Moore
@ 2016-12-23 13:27     ` Jan Kara
  2016-12-23 14:13       ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kara @ 2016-12-23 13:27 UTC (permalink / raw)
  To: Paul Moore
  Cc: Jan Kara, linux-fsdevel, Amir Goldstein, Lino Sanfilippo,
	Miklos Szeredi, linux-audit

On Thu 22-12-16 18:27:40, Paul Moore wrote:
> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
> > Audit tree currently uses inode pointer as a key into the hash table.
> > Getting that from notification mark will be somewhat more difficult with
> > coming fsnotify changes and there's no reason we really have to use the
> > inode pointer. So abstract getting of hash key from the audit chunk and
> > inode so that we can switch to a different key easily later.
> >
> > CC: Paul Moore <paul@paul-moore.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  kernel/audit_tree.c | 39 ++++++++++++++++++++++++++++-----------
> >  1 file changed, 28 insertions(+), 11 deletions(-)
> 
> I have no objections with this patch in particular, but in patch 8,
> are you certain that inode_to_key() and chunk_to_key() will continue
> to return the same key value?

Yes, that's the intention. Or better in that patch the key will no longer
be inode pointer but instead the fsnotify_list pointer. But still it would
match for chunks attached to an inode and inode itself so comparison
results should stay the same.

								Honza 
> 
> > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> > index 156b6a93f4fc..f0859828de09 100644
> > --- a/kernel/audit_tree.c
> > +++ b/kernel/audit_tree.c
> > @@ -163,33 +163,48 @@ enum {HASH_SIZE = 128};
> >  static struct list_head chunk_hash_heads[HASH_SIZE];
> >  static __cacheline_aligned_in_smp DEFINE_SPINLOCK(hash_lock);
> >
> > -static inline struct list_head *chunk_hash(const struct inode *inode)
> > +/* Function to return search key in our hash from inode. */
> > +static unsigned long inode_to_key(const struct inode *inode)
> >  {
> > -       unsigned long n = (unsigned long)inode / L1_CACHE_BYTES;
> > +       return (unsigned long)inode;
> > +}
> > +
> > +/*
> > + * Function to return search key in our hash from chunk. Key 0 is special and
> > + * should never be present in the hash.
> > + */
> > +static unsigned long chunk_to_key(struct audit_chunk *chunk)
> > +{
> > +       return (unsigned long)chunk->mark.inode;
> > +}
> > +
> > +static inline struct list_head *chunk_hash(unsigned long key)
> > +{
> > +       unsigned long n = key / L1_CACHE_BYTES;
> >         return chunk_hash_heads + n % HASH_SIZE;
> >  }
> >
> >  /* hash_lock & entry->lock is held by caller */
> >  static void insert_hash(struct audit_chunk *chunk)
> >  {
> > -       struct fsnotify_mark *entry = &chunk->mark;
> > +       unsigned long key = chunk_to_key(chunk);
> >         struct list_head *list;
> >
> > -       if (!entry->inode)
> > +       if (!key)
> >                 return;
> > -       list = chunk_hash(entry->inode);
> > +       list = chunk_hash(key);
> >         list_add_rcu(&chunk->hash, list);
> >  }
> >
> >  /* called under rcu_read_lock */
> >  struct audit_chunk *audit_tree_lookup(const struct inode *inode)
> >  {
> > -       struct list_head *list = chunk_hash(inode);
> > +       unsigned long key = inode_to_key(inode);
> > +       struct list_head *list = chunk_hash(key);
> >         struct audit_chunk *p;
> >
> >         list_for_each_entry_rcu(p, list, hash) {
> > -               /* mark.inode may have gone NULL, but who cares? */
> > -               if (p->mark.inode == inode) {
> > +               if (chunk_to_key(p) == key) {
> >                         atomic_long_inc(&p->refs);
> >                         return p;
> >                 }
> > @@ -585,7 +600,8 @@ int audit_remove_tree_rule(struct audit_krule *rule)
> >
> >  static int compare_root(struct vfsmount *mnt, void *arg)
> >  {
> > -       return d_backing_inode(mnt->mnt_root) == arg;
> > +       return inode_to_key(d_backing_inode(mnt->mnt_root)) ==
> > +              (unsigned long)arg;
> >  }
> >
> >  void audit_trim_trees(void)
> > @@ -620,9 +636,10 @@ void audit_trim_trees(void)
> >                 list_for_each_entry(node, &tree->chunks, list) {
> >                         struct audit_chunk *chunk = find_chunk(node);
> >                         /* this could be NULL if the watch is dying else where... */
> > -                       struct inode *inode = chunk->mark.inode;
> >                         node->index |= 1U<<31;
> > -                       if (iterate_mounts(compare_root, inode, root_mnt))
> > +                       if (iterate_mounts(compare_root,
> > +                                          (void *)chunk_to_key(chunk),
> > +                                          root_mnt))
> >                                 node->index &= ~(1U<<31);
> >                 }
> >                 spin_unlock(&hash_lock);
> > --
> > 2.10.2
> >
> 
> 
> 
> -- 
> paul moore
> www.paul-moore.com
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 04/22] fsnotify: Remove fsnotify_duplicate_mark()
  2016-12-23 13:22     ` Jan Kara
@ 2016-12-23 14:01       ` Paul Moore
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Moore @ 2016-12-23 14:01 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Amir Goldstein, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Fri, Dec 23, 2016 at 8:22 AM, Jan Kara <jack@suse.cz> wrote:
> On Thu 22-12-16 18:13:11, Paul Moore wrote:
>> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
>> > There are only two calls sites of fsnotify_duplicate_mark(). Those are
>> > in kernel/audit_tree.c and both are bogus. Vfsmount pointer is unused
>> > for audit tree, inode pointer and group gets set in
>> > fsnotify_add_mark_locked() later anyway, mask and free_mark are already
>> > set in alloc_chunk(). In fact, calling fsnotify_duplicate_mark() is
>> > actively harmful because following fsnotify_add_mark_locked() will leak
>> > group reference by overwriting the group pointer. So just remove the two
>> > calls to fsnotify_duplicate_mark() and the function.
>> >
>> > Signed-off-by: Jan Kara <jack@suse.cz>
>> > ---
>> >  fs/notify/mark.c                 | 12 ------------
>> >  include/linux/fsnotify_backend.h |  2 --
>> >  kernel/audit_tree.c              |  6 ++----
>> >  3 files changed, 2 insertions(+), 18 deletions(-)
>>
>> At first glance this looks reasonable, although you keep mentioning
>> "fsnotify_add_mark_locked" above when untag_chunk() is calling
>> "fsnotify_add_mark"; I just wanted to make sure you hadn't intended to
>> take the mutex in the audit code instead of relying on the locking in
>> fsnotify_add_mark().
>
> No, I didn't want to take mutex in the audit code. It is just that
> fsnotify_add_mark() is a thin wrapper around fsnotify_add_mark_locked() so
> I was speaking about that function.

Understood.  I just wanted to make sure this is what you intended
since the commit description and patch did not match.

If you end up respinning the patchset for any reason I might suggest
changing the function name in the description above.

>>
>> > diff --git a/fs/notify/mark.c b/fs/notify/mark.c
>> > index d3fea0bd89e2..6043306e8e21 100644
>> > --- a/fs/notify/mark.c
>> > +++ b/fs/notify/mark.c
>> > @@ -510,18 +510,6 @@ void fsnotify_detach_group_marks(struct fsnotify_group *group)
>> >         }
>> >  }
>> >
>> > -void fsnotify_duplicate_mark(struct fsnotify_mark *new, struct fsnotify_mark *old)
>> > -{
>> > -       assert_spin_locked(&old->lock);
>> > -       new->inode = old->inode;
>> > -       new->mnt = old->mnt;
>> > -       if (old->group)
>> > -               fsnotify_get_group(old->group);
>> > -       new->group = old->group;
>> > -       new->mask = old->mask;
>> > -       new->free_mark = old->free_mark;
>> > -}
>> > -
>> >  /*
>> >   * Nothing fancy, just initialize lists and locks and counters.
>> >   */
>> > diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
>> > index 0cf34d6cc253..487246546ebe 100644
>> > --- a/include/linux/fsnotify_backend.h
>> > +++ b/include/linux/fsnotify_backend.h
>> > @@ -323,8 +323,6 @@ extern void fsnotify_init_mark(struct fsnotify_mark *mark, void (*free_mark)(str
>> >  extern struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group, struct inode *inode);
>> >  /* find (and take a reference) to a mark associated with group and vfsmount */
>> >  extern struct fsnotify_mark *fsnotify_find_vfsmount_mark(struct fsnotify_group *group, struct vfsmount *mnt);
>> > -/* copy the values from old into new */
>> > -extern void fsnotify_duplicate_mark(struct fsnotify_mark *new, struct fsnotify_mark *old);
>> >  /* set the ignored_mask of a mark */
>> >  extern void fsnotify_set_mark_ignored_mask_locked(struct fsnotify_mark *mark, __u32 mask);
>> >  /* set the mask of a mark (might pin the object into memory */
>> > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
>> > index 8b1dde96a0fa..f3130eb0a4bd 100644
>> > --- a/kernel/audit_tree.c
>> > +++ b/kernel/audit_tree.c
>> > @@ -258,8 +258,7 @@ static void untag_chunk(struct node *p)
>> >         if (!new)
>> >                 goto Fallback;
>> >
>> > -       fsnotify_duplicate_mark(&new->mark, entry);
>> > -       if (fsnotify_add_mark(&new->mark, new->mark.group, new->mark.inode, NULL, 1)) {
>> > +       if (fsnotify_add_mark(&new->mark, entry->group, entry->inode, NULL, 1)) {
>> >                 fsnotify_put_mark(&new->mark);
>> >                 goto Fallback;
>> >         }
>> > @@ -395,8 +394,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
>> >                 return -ENOENT;
>> >         }
>> >
>> > -       fsnotify_duplicate_mark(chunk_entry, old_entry);
>> > -       if (fsnotify_add_mark(chunk_entry, chunk_entry->group, chunk_entry->inode, NULL, 1)) {
>> > +       if (fsnotify_add_mark(chunk_entry, old_entry->group, old_entry->inode, NULL, 1)) {
>> >                 spin_unlock(&old_entry->lock);
>> >                 fsnotify_put_mark(chunk_entry);
>> >                 fsnotify_put_mark(old_entry);
>> > --
>> > 2.10.2
>> >
>>
>> --
>> paul moore
>> www.paul-moore.com
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR



-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 06/22] audit: Abstract hash key handling
  2016-12-23 13:27     ` Jan Kara
@ 2016-12-23 14:13       ` Paul Moore
  2017-01-03 17:34         ` Jan Kara
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2016-12-23 14:13 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Amir Goldstein, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Fri, Dec 23, 2016 at 8:27 AM, Jan Kara <jack@suse.cz> wrote:
> On Thu 22-12-16 18:27:40, Paul Moore wrote:
>> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
>> > Audit tree currently uses inode pointer as a key into the hash table.
>> > Getting that from notification mark will be somewhat more difficult with
>> > coming fsnotify changes and there's no reason we really have to use the
>> > inode pointer. So abstract getting of hash key from the audit chunk and
>> > inode so that we can switch to a different key easily later.
>> >
>> > CC: Paul Moore <paul@paul-moore.com>
>> > Signed-off-by: Jan Kara <jack@suse.cz>
>> > ---
>> >  kernel/audit_tree.c | 39 ++++++++++++++++++++++++++++-----------
>> >  1 file changed, 28 insertions(+), 11 deletions(-)
>>
>> I have no objections with this patch in particular, but in patch 8,
>> are you certain that inode_to_key() and chunk_to_key() will continue
>> to return the same key value?
>
> Yes, that's the intention. Or better in that patch the key will no longer
> be inode pointer but instead the fsnotify_list pointer. But still it would
> match for chunks attached to an inode and inode itself so comparison
> results should stay the same.

My apologies, I probably should have been more clear.

Yes, I think we are all in agreement that the *_to_key() functions
need to return a consistent value for the same object.  My concern is
that in patch 8 these functions are using different variables (granted
they may contain the same value, and therefore evaluate to the same
key) and I worry that there is a possibility of the two variables
taking on different values and breaking the hash.  What guarantees
exist that these values will be the same?  Are there any safeguards to
prevent future patches from accidentally sidestepping these
guarantees?

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 05/22] audit: Fix sleep in atomic
  2016-12-23 13:24     ` Jan Kara
@ 2016-12-23 14:17       ` Paul Moore
  2016-12-26 16:33         ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2016-12-23 14:17 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Amir Goldstein, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Fri, Dec 23, 2016 at 8:24 AM, Jan Kara <jack@suse.cz> wrote:
> On Thu 22-12-16 18:18:36, Paul Moore wrote:
>> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
>> > Audit tree code was happily adding new notification marks while holding
>> > spinlocks. Since fsnotify_add_mark() acquires group->mark_mutex this can
>> > lead to sleeping while holding a spinlock, deadlocks due to lock
>> > inversion, and probably other fun. Fix the problem by acquiring
>> > group->mark_mutex earlier.
>> >
>> > CC: Paul Moore <paul@paul-moore.com>
>> > Signed-off-by: Jan Kara <jack@suse.cz>
>> > ---
>> >  kernel/audit_tree.c | 13 +++++++++++--
>> >  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> [SIDE NOTE: this patch explains your comments and my earlier concern
>> about the locked/unlocked variants of fsnotify_add_mark() in
>> untag_chunk()]
>>
>> Ouch.  Thanks for catching this ... what is your goal with these
>> patches, are you targeting this as a fix during the v4.10-rcX cycle?
>> If not, any objections if I pull this patch into the audit tree and
>> send this to Linus during the v4.10-rcX cycle (assuming it passes
>> testing, yadda yadda)?
>
> Sure, go ahead. I plan these patches for the next merge window. So I can
> rebase the series once you merge audit fixes...

Okay, great.  I'll merge this patch in the audit/stable-4.10 branch
for Linus but there will likely be some delays due to
holidays/vacation on my end.

Thanks again for your help fixing this, I really appreciate it.

>> > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
>> > index f3130eb0a4bd..156b6a93f4fc 100644
>> > --- a/kernel/audit_tree.c
>> > +++ b/kernel/audit_tree.c
>> > @@ -231,6 +231,7 @@ static void untag_chunk(struct node *p)
>> >         if (size)
>> >                 new = alloc_chunk(size);
>> >
>> > +       mutex_lock(&entry->group->mark_mutex);
>> >         spin_lock(&entry->lock);
>> >         if (chunk->dead || !entry->inode) {
>> >                 spin_unlock(&entry->lock);
>> > @@ -258,7 +259,8 @@ static void untag_chunk(struct node *p)
>> >         if (!new)
>> >                 goto Fallback;
>> >
>> > -       if (fsnotify_add_mark(&new->mark, entry->group, entry->inode, NULL, 1)) {
>> > +       if (fsnotify_add_mark_locked(&new->mark, entry->group, entry->inode,
>> > +                                    NULL, 1)) {
>> >                 fsnotify_put_mark(&new->mark);
>> >                 goto Fallback;
>> >         }
>> > @@ -309,6 +311,7 @@ static void untag_chunk(struct node *p)
>> >         spin_unlock(&hash_lock);
>> >         spin_unlock(&entry->lock);
>> >  out:
>> > +       mutex_unlock(&entry->group->mark_mutex);
>> >         fsnotify_put_mark(entry);
>> >         spin_lock(&hash_lock);
>> >  }
>> > @@ -385,17 +388,21 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
>> >
>> >         chunk_entry = &chunk->mark;
>> >
>> > +       mutex_lock(&old_entry->group->mark_mutex);
>> >         spin_lock(&old_entry->lock);
>> >         if (!old_entry->inode) {
>> >                 /* old_entry is being shot, lets just lie */
>> >                 spin_unlock(&old_entry->lock);
>> > +               mutex_unlock(&old_entry->group->mark_mutex);
>> >                 fsnotify_put_mark(old_entry);
>> >                 free_chunk(chunk);
>> >                 return -ENOENT;
>> >         }
>> >
>> > -       if (fsnotify_add_mark(chunk_entry, old_entry->group, old_entry->inode, NULL, 1)) {
>> > +       if (fsnotify_add_mark_locked(chunk_entry, old_entry->group,
>> > +                                    old_entry->inode, NULL, 1)) {
>> >                 spin_unlock(&old_entry->lock);
>> > +               mutex_unlock(&old_entry->group->mark_mutex);
>> >                 fsnotify_put_mark(chunk_entry);
>> >                 fsnotify_put_mark(old_entry);
>> >                 return -ENOSPC;
>> > @@ -411,6 +418,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
>> >                 chunk->dead = 1;
>> >                 spin_unlock(&chunk_entry->lock);
>> >                 spin_unlock(&old_entry->lock);
>> > +               mutex_unlock(&old_entry->group->mark_mutex);
>> >
>> >                 fsnotify_destroy_mark(chunk_entry, audit_tree_group);
>> >
>> > @@ -443,6 +451,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
>> >         spin_unlock(&hash_lock);
>> >         spin_unlock(&chunk_entry->lock);
>> >         spin_unlock(&old_entry->lock);
>> > +       mutex_unlock(&old_entry->group->mark_mutex);
>> >         fsnotify_destroy_mark(old_entry, audit_tree_group);
>> >         fsnotify_put_mark(chunk_entry); /* drop initial reference */
>> >         fsnotify_put_mark(old_entry); /* pair to fsnotify_find mark_entry */
>> > --
>> > 2.10.2
>> >

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 05/22] audit: Fix sleep in atomic
  2016-12-23 14:17       ` Paul Moore
@ 2016-12-26 16:33         ` Paul Moore
  2017-01-02 18:21           ` Jan Kara
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2016-12-26 16:33 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Amir Goldstein, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Fri, Dec 23, 2016 at 9:17 AM, Paul Moore <paul@paul-moore.com> wrote:
> On Fri, Dec 23, 2016 at 8:24 AM, Jan Kara <jack@suse.cz> wrote:
>> On Thu 22-12-16 18:18:36, Paul Moore wrote:
>>> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
>>> > Audit tree code was happily adding new notification marks while holding
>>> > spinlocks. Since fsnotify_add_mark() acquires group->mark_mutex this can
>>> > lead to sleeping while holding a spinlock, deadlocks due to lock
>>> > inversion, and probably other fun. Fix the problem by acquiring
>>> > group->mark_mutex earlier.
>>> >
>>> > CC: Paul Moore <paul@paul-moore.com>
>>> > Signed-off-by: Jan Kara <jack@suse.cz>
>>> > ---
>>> >  kernel/audit_tree.c | 13 +++++++++++--
>>> >  1 file changed, 11 insertions(+), 2 deletions(-)
>>>
>>> [SIDE NOTE: this patch explains your comments and my earlier concern
>>> about the locked/unlocked variants of fsnotify_add_mark() in
>>> untag_chunk()]
>>>
>>> Ouch.  Thanks for catching this ... what is your goal with these
>>> patches, are you targeting this as a fix during the v4.10-rcX cycle?
>>> If not, any objections if I pull this patch into the audit tree and
>>> send this to Linus during the v4.10-rcX cycle (assuming it passes
>>> testing, yadda yadda)?
>>
>> Sure, go ahead. I plan these patches for the next merge window. So I can
>> rebase the series once you merge audit fixes...
>
> Okay, great.  I'll merge this patch in the audit/stable-4.10 branch
> for Linus but there will likely be some delays due to
> holidays/vacation on my end.
>
> Thanks again for your help fixing this, I really appreciate it.

I merged this patch, as well as the "Remove fsnotify_duplicate_mark()"
patch (to make things cleaner when merging this patch) and did a quick
test using the audit-testsuite ... the test hung on the "file_create"
tests.  Unfortunately, I'm traveling right now for the holidays and
will not likely have a chance to debug this much further until after
the new year, but I thought I would mention it in case you had some
time to look into this failure.

For reference, here is the audit-testsuite again:

* https://github.com/linux-audit/audit-testsuite

... and if you have a Fedora test system, here is the Rawhide kernel I
used to test (it is basically my kernel-secnext test kernel with those
two patches mentioned above added on top):

* https://copr.fedorainfracloud.org/coprs/pcmoore/kernel-testing/build/492386

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 05/22] audit: Fix sleep in atomic
  2016-12-26 16:33         ` Paul Moore
@ 2017-01-02 18:21           ` Jan Kara
  2017-01-03 21:11             ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kara @ 2017-01-02 18:21 UTC (permalink / raw)
  To: Paul Moore
  Cc: Jan Kara, linux-fsdevel, Amir Goldstein, Lino Sanfilippo,
	Miklos Szeredi, linux-audit

[-- Attachment #1: Type: text/plain, Size: 3016 bytes --]

On Mon 26-12-16 11:33:10, Paul Moore wrote:
> On Fri, Dec 23, 2016 at 9:17 AM, Paul Moore <paul@paul-moore.com> wrote:
> > On Fri, Dec 23, 2016 at 8:24 AM, Jan Kara <jack@suse.cz> wrote:
> >> On Thu 22-12-16 18:18:36, Paul Moore wrote:
> >>> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
> >>> > Audit tree code was happily adding new notification marks while holding
> >>> > spinlocks. Since fsnotify_add_mark() acquires group->mark_mutex this can
> >>> > lead to sleeping while holding a spinlock, deadlocks due to lock
> >>> > inversion, and probably other fun. Fix the problem by acquiring
> >>> > group->mark_mutex earlier.
> >>> >
> >>> > CC: Paul Moore <paul@paul-moore.com>
> >>> > Signed-off-by: Jan Kara <jack@suse.cz>
> >>> > ---
> >>> >  kernel/audit_tree.c | 13 +++++++++++--
> >>> >  1 file changed, 11 insertions(+), 2 deletions(-)
> >>>
> >>> [SIDE NOTE: this patch explains your comments and my earlier concern
> >>> about the locked/unlocked variants of fsnotify_add_mark() in
> >>> untag_chunk()]
> >>>
> >>> Ouch.  Thanks for catching this ... what is your goal with these
> >>> patches, are you targeting this as a fix during the v4.10-rcX cycle?
> >>> If not, any objections if I pull this patch into the audit tree and
> >>> send this to Linus during the v4.10-rcX cycle (assuming it passes
> >>> testing, yadda yadda)?
> >>
> >> Sure, go ahead. I plan these patches for the next merge window. So I can
> >> rebase the series once you merge audit fixes...
> >
> > Okay, great.  I'll merge this patch in the audit/stable-4.10 branch
> > for Linus but there will likely be some delays due to
> > holidays/vacation on my end.
> >
> > Thanks again for your help fixing this, I really appreciate it.
> 
> I merged this patch, as well as the "Remove fsnotify_duplicate_mark()"
> patch (to make things cleaner when merging this patch) and did a quick
> test using the audit-testsuite ... the test hung on the "file_create"
> tests.  Unfortunately, I'm traveling right now for the holidays and
> will not likely have a chance to debug this much further until after
> the new year, but I thought I would mention it in case you had some
> time to look into this failure.
> 
> For reference, here is the audit-testsuite again:
> 
> * https://github.com/linux-audit/audit-testsuite
> 
> ... and if you have a Fedora test system, here is the Rawhide kernel I
> used to test (it is basically my kernel-secnext test kernel with those
> two patches mentioned above added on top):
> 
> * https://copr.fedorainfracloud.org/coprs/pcmoore/kernel-testing/build/492386

So I found where the problem was. Attached is a new version of the patch.
Tests from audit-testsuite fail for me but do not hang anymore. I guess the
failing is because I don't have audit or selinux configured in any way and
I'm using SUSE I guess (if there's some easy way to do that, I'd be
interested) - runtests.pl complains that I have to be root although I am...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-audit-Fix-sleep-in-atomic.patch --]
[-- Type: text/x-patch, Size: 3764 bytes --]

>From 300e01735a0b5dbf9cd32c932b77d9dc77258489 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 14 Dec 2016 14:40:05 +0100
Subject: [PATCH] audit: Fix sleep in atomic

Audit tree code was happily adding new notification marks while holding
spinlocks. Since fsnotify_add_mark() acquires group->mark_mutex this can
lead to sleeping while holding a spinlock, deadlocks due to lock
inversion, and probably other fun. Fix the problem by acquiring
group->mark_mutex earlier.

CC: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 kernel/audit_tree.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
index f3130eb0a4bd..7b44195da81b 100644
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -231,9 +231,11 @@ static void untag_chunk(struct node *p)
 	if (size)
 		new = alloc_chunk(size);
 
+	mutex_lock(&entry->group->mark_mutex);
 	spin_lock(&entry->lock);
 	if (chunk->dead || !entry->inode) {
 		spin_unlock(&entry->lock);
+		mutex_unlock(&entry->group->mark_mutex);
 		if (new)
 			free_chunk(new);
 		goto out;
@@ -251,6 +253,7 @@ static void untag_chunk(struct node *p)
 		list_del_rcu(&chunk->hash);
 		spin_unlock(&hash_lock);
 		spin_unlock(&entry->lock);
+		mutex_unlock(&entry->group->mark_mutex);
 		fsnotify_destroy_mark(entry, audit_tree_group);
 		goto out;
 	}
@@ -258,7 +261,8 @@ static void untag_chunk(struct node *p)
 	if (!new)
 		goto Fallback;
 
-	if (fsnotify_add_mark(&new->mark, entry->group, entry->inode, NULL, 1)) {
+	if (fsnotify_add_mark_locked(&new->mark, entry->group, entry->inode,
+				     NULL, 1)) {
 		fsnotify_put_mark(&new->mark);
 		goto Fallback;
 	}
@@ -292,6 +296,7 @@ static void untag_chunk(struct node *p)
 		owner->root = new;
 	spin_unlock(&hash_lock);
 	spin_unlock(&entry->lock);
+	mutex_unlock(&entry->group->mark_mutex);
 	fsnotify_destroy_mark(entry, audit_tree_group);
 	fsnotify_put_mark(&new->mark);	/* drop initial reference */
 	goto out;
@@ -308,6 +313,7 @@ static void untag_chunk(struct node *p)
 	put_tree(owner);
 	spin_unlock(&hash_lock);
 	spin_unlock(&entry->lock);
+	mutex_unlock(&entry->group->mark_mutex);
 out:
 	fsnotify_put_mark(entry);
 	spin_lock(&hash_lock);
@@ -385,17 +391,21 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
 
 	chunk_entry = &chunk->mark;
 
+	mutex_lock(&old_entry->group->mark_mutex);
 	spin_lock(&old_entry->lock);
 	if (!old_entry->inode) {
 		/* old_entry is being shot, lets just lie */
 		spin_unlock(&old_entry->lock);
+		mutex_unlock(&old_entry->group->mark_mutex);
 		fsnotify_put_mark(old_entry);
 		free_chunk(chunk);
 		return -ENOENT;
 	}
 
-	if (fsnotify_add_mark(chunk_entry, old_entry->group, old_entry->inode, NULL, 1)) {
+	if (fsnotify_add_mark_locked(chunk_entry, old_entry->group,
+				     old_entry->inode, NULL, 1)) {
 		spin_unlock(&old_entry->lock);
+		mutex_unlock(&old_entry->group->mark_mutex);
 		fsnotify_put_mark(chunk_entry);
 		fsnotify_put_mark(old_entry);
 		return -ENOSPC;
@@ -411,6 +421,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
 		chunk->dead = 1;
 		spin_unlock(&chunk_entry->lock);
 		spin_unlock(&old_entry->lock);
+		mutex_unlock(&old_entry->group->mark_mutex);
 
 		fsnotify_destroy_mark(chunk_entry, audit_tree_group);
 
@@ -443,6 +454,7 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
 	spin_unlock(&hash_lock);
 	spin_unlock(&chunk_entry->lock);
 	spin_unlock(&old_entry->lock);
+	mutex_unlock(&old_entry->group->mark_mutex);
 	fsnotify_destroy_mark(old_entry, audit_tree_group);
 	fsnotify_put_mark(chunk_entry);	/* drop initial reference */
 	fsnotify_put_mark(old_entry); /* pair to fsnotify_find mark_entry */
-- 
2.10.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 06/22] audit: Abstract hash key handling
  2016-12-23 14:13       ` Paul Moore
@ 2017-01-03 17:34         ` Jan Kara
  2017-01-05  2:06           ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kara @ 2017-01-03 17:34 UTC (permalink / raw)
  To: Paul Moore
  Cc: Jan Kara, linux-fsdevel, Amir Goldstein, Lino Sanfilippo,
	Miklos Szeredi, linux-audit

On Fri 23-12-16 09:13:55, Paul Moore wrote:
> On Fri, Dec 23, 2016 at 8:27 AM, Jan Kara <jack@suse.cz> wrote:
> > On Thu 22-12-16 18:27:40, Paul Moore wrote:
> >> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
> >> > Audit tree currently uses inode pointer as a key into the hash table.
> >> > Getting that from notification mark will be somewhat more difficult with
> >> > coming fsnotify changes and there's no reason we really have to use the
> >> > inode pointer. So abstract getting of hash key from the audit chunk and
> >> > inode so that we can switch to a different key easily later.
> >> >
> >> > CC: Paul Moore <paul@paul-moore.com>
> >> > Signed-off-by: Jan Kara <jack@suse.cz>
> >> > ---
> >> >  kernel/audit_tree.c | 39 ++++++++++++++++++++++++++++-----------
> >> >  1 file changed, 28 insertions(+), 11 deletions(-)
> >>
> >> I have no objections with this patch in particular, but in patch 8,
> >> are you certain that inode_to_key() and chunk_to_key() will continue
> >> to return the same key value?
> >
> > Yes, that's the intention. Or better in that patch the key will no longer
> > be inode pointer but instead the fsnotify_list pointer. But still it would
> > match for chunks attached to an inode and inode itself so comparison
> > results should stay the same.
> 
> My apologies, I probably should have been more clear.
> 
> Yes, I think we are all in agreement that the *_to_key() functions
> need to return a consistent value for the same object.  My concern is
> that in patch 8 these functions are using different variables (granted
> they may contain the same value, and therefore evaluate to the same
> key) and I worry that there is a possibility of the two variables
> taking on different values and breaking the hash.  What guarantees
> exist that these values will be the same?  Are there any safeguards to
> prevent future patches from accidentally sidestepping these
> guarantees?

Ah, OK, so this is more about patch 8 than patch 6. So far audit uses inode
pointer as a key - now with patch 8, there is a fsnotify_mark_list attached
to an inode if and only if there is any fsnotify_mark for that inode and
both inode->i_fsnotify_marks (used as a key in inode_to_key()) and
mark->obj_list_head (used as a key in chunk_to_key()) point to it. So keys
for an inode and chunk match if and only if the fsnotify mark in the chunk
is attached to the inode - the same as before patch 8.

The only reason why I changed audit to use a different pointer for the key
is that you need some lock protection to do mark->obj_list_head->inode
dereference and this seemed the easiest. Actually now that all the lifetime
rules have worked out, I can see we can actually use inode pointer as a key
relatively easily since mark->obj_list_head is stable once you hold a mark
reference so locking would be only intermediate step until this gets
established in the series. Would you prefer me to do that?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 05/22] audit: Fix sleep in atomic
  2017-01-02 18:21           ` Jan Kara
@ 2017-01-03 21:11             ` Paul Moore
  2017-01-04  8:50               ` Jan Kara
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Moore @ 2017-01-03 21:11 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, linux-audit, Amir Goldstein, Lino Sanfilippo,
	Miklos Szeredi

On Mon, Jan 2, 2017 at 1:21 PM, Jan Kara <jack@suse.cz> wrote:
> So I found where the problem was. Attached is a new version of the patch.
> Tests from audit-testsuite fail for me but do not hang anymore. I guess the
> failing is because I don't have audit or selinux configured in any way and
> I'm using SUSE I guess (if there's some easy way to do that, I'd be
> interested) - runtests.pl complains that I have to be root although I am...

I've never tried running the tests on SUSE, but if audit and SELinux
are in some undetermined state, then I can only imagine what wierd
test results you would get.  I'm building a test kernel as I type
this, I'll report back when I have some results.

Also, while I'm sure you've heard this before (and likely already know
better), please send patches inline, it makes review/commenting much
easier.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 05/22] audit: Fix sleep in atomic
  2017-01-03 21:11             ` Paul Moore
@ 2017-01-04  8:50               ` Jan Kara
  2017-01-05  2:14                 ` Paul Moore
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Kara @ 2017-01-04  8:50 UTC (permalink / raw)
  To: Paul Moore
  Cc: Jan Kara, linux-fsdevel, Amir Goldstein, Lino Sanfilippo,
	Miklos Szeredi, linux-audit

On Tue 03-01-17 16:11:16, Paul Moore wrote:
> On Mon, Jan 2, 2017 at 1:21 PM, Jan Kara <jack@suse.cz> wrote:
> > So I found where the problem was. Attached is a new version of the patch.
> > Tests from audit-testsuite fail for me but do not hang anymore. I guess the
> > failing is because I don't have audit or selinux configured in any way and
> > I'm using SUSE I guess (if there's some easy way to do that, I'd be
> > interested) - runtests.pl complains that I have to be root although I am...
> 
> I've never tried running the tests on SUSE, but if audit and SELinux
> are in some undetermined state, then I can only imagine what wierd
> test results you would get.

Well, the state is well determined - nothing is installed ;) I was kind of
hoping kernel support would be enough but apparently the testsuite needs
some userspace installed & configured as well.

> I'm building a test kernel as I type this, I'll report back when I have
> some results.

Thanks!

> Also, while I'm sure you've heard this before (and likely already know
> better), please send patches inline, it makes review/commenting much
> easier.

Actually, I haven't heard this for quite a long time :) and I myself
prefer attached patches (with text/plain attachment type) when they are
in a reply to another email - they are easier to extract and at least my
mail client automatically inlines them when I hit reply... Arguably the
best of both worlds is to use git-send-email with properly set threading
but I tend to forget about that option.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 06/22] audit: Abstract hash key handling
  2017-01-03 17:34         ` Jan Kara
@ 2017-01-05  2:06           ` Paul Moore
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Moore @ 2017-01-05  2:06 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Amir Goldstein, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Tue, Jan 3, 2017 at 12:34 PM, Jan Kara <jack@suse.cz> wrote:
> On Fri 23-12-16 09:13:55, Paul Moore wrote:
>> On Fri, Dec 23, 2016 at 8:27 AM, Jan Kara <jack@suse.cz> wrote:
>> > On Thu 22-12-16 18:27:40, Paul Moore wrote:
>> >> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara <jack@suse.cz> wrote:
>> >> > Audit tree currently uses inode pointer as a key into the hash table.
>> >> > Getting that from notification mark will be somewhat more difficult with
>> >> > coming fsnotify changes and there's no reason we really have to use the
>> >> > inode pointer. So abstract getting of hash key from the audit chunk and
>> >> > inode so that we can switch to a different key easily later.
>> >> >
>> >> > CC: Paul Moore <paul@paul-moore.com>
>> >> > Signed-off-by: Jan Kara <jack@suse.cz>
>> >> > ---
>> >> >  kernel/audit_tree.c | 39 ++++++++++++++++++++++++++++-----------
>> >> >  1 file changed, 28 insertions(+), 11 deletions(-)
>> >>
>> >> I have no objections with this patch in particular, but in patch 8,
>> >> are you certain that inode_to_key() and chunk_to_key() will continue
>> >> to return the same key value?
>> >
>> > Yes, that's the intention. Or better in that patch the key will no longer
>> > be inode pointer but instead the fsnotify_list pointer. But still it would
>> > match for chunks attached to an inode and inode itself so comparison
>> > results should stay the same.
>>
>> My apologies, I probably should have been more clear.
>>
>> Yes, I think we are all in agreement that the *_to_key() functions
>> need to return a consistent value for the same object.  My concern is
>> that in patch 8 these functions are using different variables (granted
>> they may contain the same value, and therefore evaluate to the same
>> key) and I worry that there is a possibility of the two variables
>> taking on different values and breaking the hash.  What guarantees
>> exist that these values will be the same?  Are there any safeguards to
>> prevent future patches from accidentally sidestepping these
>> guarantees?
>
> Ah, OK, so this is more about patch 8 than patch 6. So far audit uses inode
> pointer as a key - now with patch 8, there is a fsnotify_mark_list attached
> to an inode if and only if there is any fsnotify_mark for that inode and
> both inode->i_fsnotify_marks (used as a key in inode_to_key()) and
> mark->obj_list_head (used as a key in chunk_to_key()) point to it. So keys
> for an inode and chunk match if and only if the fsnotify mark in the chunk
> is attached to the inode - the same as before patch 8.
>
> The only reason why I changed audit to use a different pointer for the key
> is that you need some lock protection to do mark->obj_list_head->inode
> dereference and this seemed the easiest. Actually now that all the lifetime
> rules have worked out, I can see we can actually use inode pointer as a key
> relatively easily since mark->obj_list_head is stable once you hold a mark
> reference so locking would be only intermediate step until this gets
> established in the series. Would you prefer me to do that?

Unless you can think of any reason why that would be dangerous, I
think it would be more obvious and easier to maintain as a result.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 05/22] audit: Fix sleep in atomic
  2017-01-04  8:50               ` Jan Kara
@ 2017-01-05  2:14                 ` Paul Moore
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Moore @ 2017-01-05  2:14 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-fsdevel, Amir Goldstein, Lino Sanfilippo, Miklos Szeredi,
	linux-audit

On Wed, Jan 4, 2017 at 3:50 AM, Jan Kara <jack@suse.cz> wrote:
> On Tue 03-01-17 16:11:16, Paul Moore wrote:
>> On Mon, Jan 2, 2017 at 1:21 PM, Jan Kara <jack@suse.cz> wrote:
>> > So I found where the problem was. Attached is a new version of the patch.
>> > Tests from audit-testsuite fail for me but do not hang anymore. I guess the
>> > failing is because I don't have audit or selinux configured in any way and
>> > I'm using SUSE I guess (if there's some easy way to do that, I'd be
>> > interested) - runtests.pl complains that I have to be root although I am...
>>
>> I've never tried running the tests on SUSE, but if audit and SELinux
>> are in some undetermined state, then I can only imagine what wierd
>> test results you would get.
>
> Well, the state is well determined - nothing is installed ;) I was kind of
> hoping kernel support would be enough but apparently the testsuite needs
> some userspace installed & configured as well.

Heh, yes :)  If nothing is installed you likely missing all the audit
userspace tools and auditd probably isn't running; either case would
cause massive (complete?) failures in the testsuite.

>> I'm building a test kernel as I type this, I'll report back when I have
>> some results.
>
> Thanks!

Good news - I looped the testsuite a couple thousand times this
afternoon and the VM was still standing afterwards (on a 4.10-rc2 base
too!).

Let me take a closer look at your revised patch tomorrow (it is
getting late for me) and if all is well I'll send it up to Linus.

>> Also, while I'm sure you've heard this before (and likely already know
>> better), please send patches inline, it makes review/commenting much
>> easier.
>
> Actually, I haven't heard this for quite a long time :) and I myself
> prefer attached patches (with text/plain attachment type) when they are
> in a reply to another email - they are easier to extract and at least my
> mail client automatically inlines them when I hit reply... Arguably the
> best of both worlds is to use git-send-email with properly set threading
> but I tend to forget about that option.

You probably haven't heard this in some time because everyone else
isn't as cranky and stubborn as me ;)  It isn't a big deal for one
small patch with only a few interested parties, but when you get
several people discussing the patch it can be very handy to have it
inline.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-01-05  2:14 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20161222091538.28702-1-jack@suse.cz>
2016-12-22 20:58 ` [PATCH 0/22] fsnotify: Avoid SRCU stalls with fanotify permission events Paul Moore
2016-12-22 21:05   ` Amir Goldstein
2016-12-22 23:04     ` Paul Moore
     [not found] ` <20161222091538.28702-5-jack@suse.cz>
2016-12-22 23:13   ` [PATCH 04/22] fsnotify: Remove fsnotify_duplicate_mark() Paul Moore
2016-12-23 13:22     ` Jan Kara
2016-12-23 14:01       ` Paul Moore
     [not found] ` <20161222091538.28702-6-jack@suse.cz>
2016-12-22 23:18   ` [PATCH 05/22] audit: Fix sleep in atomic Paul Moore
2016-12-23 13:24     ` Jan Kara
2016-12-23 14:17       ` Paul Moore
2016-12-26 16:33         ` Paul Moore
2017-01-02 18:21           ` Jan Kara
2017-01-03 21:11             ` Paul Moore
2017-01-04  8:50               ` Jan Kara
2017-01-05  2:14                 ` Paul Moore
     [not found] ` <20161222091538.28702-7-jack@suse.cz>
2016-12-22 23:27   ` [PATCH 06/22] audit: Abstract hash key handling Paul Moore
2016-12-23 13:27     ` Jan Kara
2016-12-23 14:13       ` Paul Moore
2017-01-03 17:34         ` Jan Kara
2017-01-05  2:06           ` Paul Moore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox