On Mon 26-12-16 11:33:10, Paul Moore wrote:
> On Fri, Dec 23, 2016 at 9:17 AM, Paul Moore wrote:
> > On Fri, Dec 23, 2016 at 8:24 AM, Jan Kara wrote:
> >> On Thu 22-12-16 18:18:36, Paul Moore wrote:
> >>> On Thu, Dec 22, 2016 at 4:15 AM, Jan Kara wrote:
> >>> > Audit tree code was happily adding new notification marks while holding
> >>> > spinlocks. Since fsnotify_add_mark() acquires group->mark_mutex this can
> >>> > lead to sleeping while holding a spinlock, deadlocks due to lock
> >>> > inversion, and probably other fun. Fix the problem by acquiring
> >>> > group->mark_mutex earlier.
> >>> >
> >>> > CC: Paul Moore
> >>> > Signed-off-by: Jan Kara
> >>> > ---
> >>> > kernel/audit_tree.c | 13 +++++++++++--
> >>> > 1 file changed, 11 insertions(+), 2 deletions(-)
> >>>
> >>> [SIDE NOTE: this patch explains your comments and my earlier concern
> >>> about the locked/unlocked variants of fsnotify_add_mark() in
> >>> untag_chunk()]
> >>>
> >>> Ouch. Thanks for catching this ... what is your goal with these
> >>> patches, are you targeting this as a fix during the v4.10-rcX cycle?
> >>> If not, any objections if I pull this patch into the audit tree and
> >>> send this to Linus during the v4.10-rcX cycle (assuming it passes
> >>> testing, yadda yadda)?
> >>
> >> Sure, go ahead. I plan these patches for the next merge window. So I can
> >> rebase the series once you merge audit fixes...
> >
> > Okay, great. I'll merge this patch in the audit/stable-4.10 branch
> > for Linus but there will likely be some delays due to
> > holidays/vacation on my end.
> >
> > Thanks again for your help fixing this, I really appreciate it.
>
> I merged this patch, as well as the "Remove fsnotify_duplicate_mark()"
> patch (to make things cleaner when merging this patch) and did a quick
> test using the audit-testsuite ... the test hung on the "file_create"
> tests.
> Unfortunately, I'm traveling right now for the holidays and
> will not likely have a chance to debug this much further until after
> the new year, but I thought I would mention it in case you had some
> time to look into this failure.
>
> For reference, here is the audit-testsuite again:
>
> * https://github.com/linux-audit/audit-testsuite
>
> ... and if you have a Fedora test system, here is the Rawhide kernel I
> used to test (it is basically my kernel-secnext test kernel with those
> two patches mentioned above added on top):
>
> * https://copr.fedorainfracloud.org/coprs/pcmoore/kernel-testing/build/492386

So I found where the problem was. Attached is a new version of the patch.
Tests from audit-testsuite fail for me but do not hang anymore. I guess the
failures are because I don't have audit or SELinux configured in any way and
I'm using SUSE (if there's some easy way to do that, I'd be interested).
Also, runtests.pl complains that I have to be root although I am...

								Honza
--
Jan Kara
SUSE Labs, CR