All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Jan Kara <jack@suse.cz>, LKML <linux-kernel@vger.kernel.org>,
	xfs@oss.sgi.com, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context
Date: Wed, 4 May 2016 10:07:03 +1000	[thread overview]
Message-ID: <20160504000703.GW26977@dastard> (raw)
In-Reply-To: <20160503153823.GB4470@dhcp22.suse.cz>

On Tue, May 03, 2016 at 05:38:23PM +0200, Michal Hocko wrote:
> On Sat 30-04-16 09:40:08, Dave Chinner wrote:
> > On Fri, Apr 29, 2016 at 02:12:20PM +0200, Michal Hocko wrote:
> [...]
> > > - was it 
> > > "inconsistent {RECLAIM_FS-ON-[RW]} -> {IN-RECLAIM_FS-[WR]} usage"
> > > or a different class reports?
> > 
> > Typically that was involved, but it quite often there'd be a number
> > of locks and sometimes even interrupt stacks in an interaction
> > between 5 or 6 different processes. Lockdep covers all sorts of
> > stuff now (like fs freeze annotations as well as locks and memory
> > reclaim) so sometimes the only thing we can do is remove the
> > reclaim context from the stack and see if that makes it go away...
> 
> That is what I was thinking of. lockdep_reclaim_{disable,enable} or
> something like that to tell __lockdep_trace_alloc to not skip
> mark_held_locks(). This would effectivelly help to get rid of reclaim
> specific reports. It is hard to tell whether there would be others,
> though.

Yeah, though I suspect this would get messy having to scatter it
around the code. I can encapsulate it via internal XFS KM flags,
though, so I do think that will be a real issue.

> > > > They may have been fixed since, but I'm sceptical
> > > > of that because, generally speaking, developer testing only catches
> > > > the obvious lockdep issues. i.e. it's users that report all the
> > > > really twisty issues, and they are generally not reproducable except
> > > > under their production workloads...
> > > > 
> > > > IOWs, the absence of reports in your testing does not mean there
> > > > isn't a problem, and that is one of the biggest problems with
> > > > lockdep annotations - we have no way of ever knowing if they are
> > > > still necessary or not without exposing users to regressions and
> > > > potential deadlocks.....
> > > 
> > > I understand your points here but if we are sure that those lockdep
> > > reports are just false positives then we should rather provide an api to
> > > silence lockdep for those paths
> > 
> > I agree with this - please provide such infrastructure before we
> > need it...
> 
> Do you think a reclaim specific lockdep annotation would be sufficient?

It will help - it'll take some time to work through all the explicit
KM_NOFS calls in XFS, though, to determine if they are just working
around lockdep false positives or some other potential problem....

> I do understand your concerns and I really do not ask you to redesign
> your code. I would like make the code more maintainable and reducing the
> number of (undocumented) GFP_NOFS usage to the minimum seems to be like
> a first step. Now the direct usage of GFP_NOFS (resp. KM_NOFS) in xfs is
> not that large.

That's true, and if we can reduce them to real cases of GFP_NOFS
being needed vs annotations to silence lockdep false positives we'll
then know what problems we really need to fix...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.cz>,
	xfs@oss.sgi.com, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context
Date: Wed, 4 May 2016 10:07:03 +1000	[thread overview]
Message-ID: <20160504000703.GW26977@dastard> (raw)
In-Reply-To: <20160503153823.GB4470@dhcp22.suse.cz>

On Tue, May 03, 2016 at 05:38:23PM +0200, Michal Hocko wrote:
> On Sat 30-04-16 09:40:08, Dave Chinner wrote:
> > On Fri, Apr 29, 2016 at 02:12:20PM +0200, Michal Hocko wrote:
> [...]
> > > - was it 
> > > "inconsistent {RECLAIM_FS-ON-[RW]} -> {IN-RECLAIM_FS-[WR]} usage"
> > > or a different class reports?
> > 
> > Typically that was involved, but it quite often there'd be a number
> > of locks and sometimes even interrupt stacks in an interaction
> > between 5 or 6 different processes. Lockdep covers all sorts of
> > stuff now (like fs freeze annotations as well as locks and memory
> > reclaim) so sometimes the only thing we can do is remove the
> > reclaim context from the stack and see if that makes it go away...
> 
> That is what I was thinking of. lockdep_reclaim_{disable,enable} or
> something like that to tell __lockdep_trace_alloc to not skip
> mark_held_locks(). This would effectivelly help to get rid of reclaim
> specific reports. It is hard to tell whether there would be others,
> though.

Yeah, though I suspect this would get messy having to scatter it
around the code. I can encapsulate it via internal XFS KM flags,
though, so I do think that will be a real issue.

> > > > They may have been fixed since, but I'm sceptical
> > > > of that because, generally speaking, developer testing only catches
> > > > the obvious lockdep issues. i.e. it's users that report all the
> > > > really twisty issues, and they are generally not reproducable except
> > > > under their production workloads...
> > > > 
> > > > IOWs, the absence of reports in your testing does not mean there
> > > > isn't a problem, and that is one of the biggest problems with
> > > > lockdep annotations - we have no way of ever knowing if they are
> > > > still necessary or not without exposing users to regressions and
> > > > potential deadlocks.....
> > > 
> > > I understand your points here but if we are sure that those lockdep
> > > reports are just false positives then we should rather provide an api to
> > > silence lockdep for those paths
> > 
> > I agree with this - please provide such infrastructure before we
> > need it...
> 
> Do you think a reclaim specific lockdep annotation would be sufficient?

It will help - it'll take some time to work through all the explicit
KM_NOFS calls in XFS, though, to determine if they are just working
around lockdep false positives or some other potential problem....

> I do understand your concerns and I really do not ask you to redesign
> your code. I would like make the code more maintainable and reducing the
> number of (undocumented) GFP_NOFS usage to the minimum seems to be like
> a first step. Now the direct usage of GFP_NOFS (resp. KM_NOFS) in xfs is
> not that large.

That's true, and if we can reduce them to real cases of GFP_NOFS
being needed vs annotations to silence lockdep false positives we'll
then know what problems we really need to fix...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.cz>,
	xfs@oss.sgi.com, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context
Date: Wed, 4 May 2016 10:07:03 +1000	[thread overview]
Message-ID: <20160504000703.GW26977@dastard> (raw)
In-Reply-To: <20160503153823.GB4470@dhcp22.suse.cz>

On Tue, May 03, 2016 at 05:38:23PM +0200, Michal Hocko wrote:
> On Sat 30-04-16 09:40:08, Dave Chinner wrote:
> > On Fri, Apr 29, 2016 at 02:12:20PM +0200, Michal Hocko wrote:
> [...]
> > > - was it 
> > > "inconsistent {RECLAIM_FS-ON-[RW]} -> {IN-RECLAIM_FS-[WR]} usage"
> > > or a different class reports?
> > 
> > Typically that was involved, but it quite often there'd be a number
> > of locks and sometimes even interrupt stacks in an interaction
> > between 5 or 6 different processes. Lockdep covers all sorts of
> > stuff now (like fs freeze annotations as well as locks and memory
> > reclaim) so sometimes the only thing we can do is remove the
> > reclaim context from the stack and see if that makes it go away...
> 
> That is what I was thinking of. lockdep_reclaim_{disable,enable} or
> something like that to tell __lockdep_trace_alloc to not skip
> mark_held_locks(). This would effectivelly help to get rid of reclaim
> specific reports. It is hard to tell whether there would be others,
> though.

Yeah, though I suspect this would get messy having to scatter it
around the code. I can encapsulate it via internal XFS KM flags,
though, so I do think that will be a real issue.

> > > > They may have been fixed since, but I'm sceptical
> > > > of that because, generally speaking, developer testing only catches
> > > > the obvious lockdep issues. i.e. it's users that report all the
> > > > really twisty issues, and they are generally not reproducable except
> > > > under their production workloads...
> > > > 
> > > > IOWs, the absence of reports in your testing does not mean there
> > > > isn't a problem, and that is one of the biggest problems with
> > > > lockdep annotations - we have no way of ever knowing if they are
> > > > still necessary or not without exposing users to regressions and
> > > > potential deadlocks.....
> > > 
> > > I understand your points here but if we are sure that those lockdep
> > > reports are just false positives then we should rather provide an api to
> > > silence lockdep for those paths
> > 
> > I agree with this - please provide such infrastructure before we
> > need it...
> 
> Do you think a reclaim specific lockdep annotation would be sufficient?

It will help - it'll take some time to work through all the explicit
KM_NOFS calls in XFS, though, to determine if they are just working
around lockdep false positives or some other potential problem....

> I do understand your concerns and I really do not ask you to redesign
> your code. I would like make the code more maintainable and reducing the
> number of (undocumented) GFP_NOFS usage to the minimum seems to be like
> a first step. Now the direct usage of GFP_NOFS (resp. KM_NOFS) in xfs is
> not that large.

That's true, and if we can reduce them to real cases of GFP_NOFS
being needed vs annotations to silence lockdep false positives we'll
then know what problems we really need to fix...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2016-05-04  0:09 UTC|newest]

Thread overview: 144+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-26 11:56 [PATCH 0/2] scop GFP_NOFS api Michal Hocko
2016-04-26 11:56 ` Michal Hocko
2016-04-26 11:56 ` Michal Hocko
2016-04-26 11:56 ` Michal Hocko
2016-04-26 11:56 ` [Cluster-devel] " Michal Hocko
2016-04-26 11:56 ` [PATCH 1/2] mm: add PF_MEMALLOC_NOFS Michal Hocko
2016-04-26 11:56   ` Michal Hocko
2016-04-26 11:56   ` Michal Hocko
2016-04-26 11:56   ` Michal Hocko
2016-04-26 11:56   ` [Cluster-devel] " Michal Hocko
2016-04-26 23:07   ` Dave Chinner
2016-04-26 23:07     ` Dave Chinner
2016-04-26 23:07     ` Dave Chinner
2016-04-26 23:07     ` [Cluster-devel] " Dave Chinner
2016-04-27  7:51     ` Michal Hocko
2016-04-27  7:51       ` Michal Hocko
2016-04-27  7:51       ` Michal Hocko
2016-04-27  7:51       ` [Cluster-devel] " Michal Hocko
2016-04-27 10:53   ` Tetsuo Handa
2016-04-27 10:53     ` Tetsuo Handa
2016-04-27 11:15     ` Michal Hocko
2016-04-27 11:15       ` Michal Hocko
2016-04-27 14:44       ` Tetsuo Handa
2016-04-27 14:44         ` Tetsuo Handa
2016-04-27 20:05         ` Michal Hocko
2016-04-27 20:05           ` Michal Hocko
2016-04-27 11:54   ` [PATCH 1.1/2] xfs: abstract PF_FSTRANS to PF_MEMALLOC_NOFS Michal Hocko
2016-04-27 11:54     ` Michal Hocko
2016-04-27 11:54     ` Michal Hocko
2016-04-27 11:54     ` Michal Hocko
2016-04-27 11:54     ` [Cluster-devel] " Michal Hocko
2016-04-27 11:54     ` [PATCH 1.2/2] mm: introduce memalloc_nofs_{save,restore} API Michal Hocko
2016-04-27 11:54       ` Michal Hocko
2016-04-27 11:54       ` Michal Hocko
2016-04-27 11:54       ` Michal Hocko
2016-04-27 11:54       ` Michal Hocko
2016-04-27 11:54       ` [Cluster-devel] [PATCH 1.2/2] mm: introduce memalloc_nofs_{save, restore} API Michal Hocko
2016-04-27 13:07       ` [PATCH 1.2/2] mm: introduce memalloc_nofs_{save,restore} API Michal Hocko
2016-04-27 13:07         ` Michal Hocko
2016-04-27 13:07         ` Michal Hocko
2016-04-27 13:07         ` [Cluster-devel] [PATCH 1.2/2] mm: introduce memalloc_nofs_{save, restore} API Michal Hocko
2016-04-27 20:09       ` [PATCH 1.2/2] mm: introduce memalloc_nofs_{save,restore} API Michal Hocko
2016-04-27 20:09         ` Michal Hocko
2016-04-27 20:09         ` Michal Hocko
2016-04-27 20:09         ` [Cluster-devel] [PATCH 1.2/2] mm: introduce memalloc_nofs_{save, restore} API Michal Hocko
2016-04-27 20:30         ` [PATCH 1.2/2] mm: introduce memalloc_nofs_{save,restore} API Michal Hocko
2016-04-27 20:30           ` Michal Hocko
2016-04-27 20:30           ` Michal Hocko
2016-04-27 20:30           ` [Cluster-devel] [PATCH 1.2/2] mm: introduce memalloc_nofs_{save, restore} API Michal Hocko
2016-04-27 21:14       ` [PATCH 1.2/2] mm: introduce memalloc_nofs_{save,restore} API Michal Hocko
2016-04-27 21:14         ` Michal Hocko
2016-04-27 21:14         ` Michal Hocko
2016-04-27 21:14         ` Michal Hocko
2016-04-27 21:14         ` Michal Hocko
2016-04-27 21:14         ` [Cluster-devel] [PATCH 1.2/2] mm: introduce memalloc_nofs_{save, restore} API Michal Hocko
2016-04-27 17:41     ` [PATCH 1.1/2] xfs: abstract PF_FSTRANS to PF_MEMALLOC_NOFS Andreas Dilger
2016-04-27 17:41       ` Andreas Dilger
2016-04-27 17:41       ` [Cluster-devel] " Andreas Dilger
2016-04-27 19:43       ` Michal Hocko
2016-04-27 19:43         ` Michal Hocko
2016-04-27 19:43         ` Michal Hocko
2016-04-27 19:43         ` [Cluster-devel] " Michal Hocko
2016-04-26 11:56 ` [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context Michal Hocko
2016-04-26 11:56   ` [PATCH 2/2] mm, debug: report when GFP_NO{FS, IO} is used explicitly from memalloc_no{fs, io}_{save, restore} context Michal Hocko
2016-04-26 11:56   ` Michal Hocko
2016-04-26 11:56   ` [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context Michal Hocko
2016-04-26 11:56   ` Michal Hocko
2016-04-26 11:56   ` [Cluster-devel] [PATCH 2/2] mm, debug: report when GFP_NO{FS, IO} is used explicitly from memalloc_no{fs, io}_{save, restore} context Michal Hocko
2016-04-26 22:58   ` [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context Dave Chinner
2016-04-26 22:58     ` Dave Chinner
2016-04-26 22:58     ` Dave Chinner
2016-04-26 22:58     ` [Cluster-devel] [PATCH 2/2] mm, debug: report when GFP_NO{FS, IO} is used explicitly from memalloc_no{fs, io}_{save, restore} context Dave Chinner
2016-04-27  8:03     ` [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context Michal Hocko
2016-04-27  8:03       ` Michal Hocko
2016-04-27  8:03       ` Michal Hocko
2016-04-27  8:03       ` [Cluster-devel] [PATCH 2/2] mm, debug: report when GFP_NO{FS, IO} is used explicitly from memalloc_no{fs, io}_{save, restore} context Michal Hocko
2016-04-27 22:55       ` [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context Dave Chinner
2016-04-27 22:55         ` Dave Chinner
2016-04-27 22:55         ` Dave Chinner
2016-04-27 22:55         ` [Cluster-devel] [PATCH 2/2] mm, debug: report when GFP_NO{FS, IO} is used explicitly from memalloc_no{fs, io}_{save, restore} context Dave Chinner
2016-04-28  8:17     ` [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context Michal Hocko
2016-04-28  8:17       ` Michal Hocko
2016-04-28  8:17       ` Michal Hocko
2016-04-28 21:51       ` Dave Chinner
2016-04-28 21:51         ` Dave Chinner
2016-04-28 21:51         ` Dave Chinner
2016-04-29 12:12         ` Michal Hocko
2016-04-29 12:12           ` Michal Hocko
2016-04-29 12:12           ` Michal Hocko
2016-04-29 23:40           ` Dave Chinner
2016-04-29 23:40             ` Dave Chinner
2016-04-29 23:40             ` Dave Chinner
2016-05-03 15:38             ` Michal Hocko
2016-05-03 15:38               ` Michal Hocko
2016-05-03 15:38               ` Michal Hocko
2016-05-04  0:07               ` Dave Chinner [this message]
2016-05-04  0:07                 ` Dave Chinner
2016-05-04  0:07                 ` Dave Chinner
2016-04-29  5:35 ` [PATCH 0/2] scop GFP_NOFS api NeilBrown
2016-04-29  5:35   ` NeilBrown
2016-04-29  5:35   ` [Cluster-devel] " NeilBrown
2016-04-29 10:20   ` Steven Whitehouse
2016-04-29 10:20     ` Steven Whitehouse
2016-04-29 10:20     ` Steven Whitehouse
2016-04-29 10:20     ` Steven Whitehouse
2016-04-29 10:20     ` Steven Whitehouse
     [not found]     ` <57233571.50509-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-04-30 21:17       ` NeilBrown
2016-04-30 21:17         ` NeilBrown
2016-04-30 21:17         ` NeilBrown
2016-04-30 21:17         ` NeilBrown
2016-04-30 21:17         ` NeilBrown
2016-04-30 21:17         ` NeilBrown
2016-04-29 12:04   ` Michal Hocko
2016-04-29 12:04     ` Michal Hocko
2016-04-29 12:04     ` Michal Hocko
2016-04-29 12:04     ` [Cluster-devel] " Michal Hocko
2016-04-30  0:24     ` Dave Chinner
2016-04-30  0:24       ` Dave Chinner
2016-04-30  0:24       ` Dave Chinner
2016-04-30  0:24       ` [Cluster-devel] " Dave Chinner
2016-04-30 21:55     ` NeilBrown
2016-04-30 21:55       ` NeilBrown
2016-04-30 21:55       ` [Cluster-devel] " NeilBrown
2016-05-03 15:13       ` Michal Hocko
2016-05-03 15:13         ` Michal Hocko
2016-05-03 15:13         ` Michal Hocko
2016-05-03 15:13         ` [Cluster-devel] " Michal Hocko
2016-05-03 23:26         ` NeilBrown
2016-05-03 23:26           ` NeilBrown
2016-05-03 23:26           ` [Cluster-devel] " NeilBrown
2016-04-30  0:11   ` Dave Chinner
2016-04-30  0:11     ` Dave Chinner
2016-04-30  0:11     ` Dave Chinner
2016-04-30  0:11     ` [Cluster-devel] " Dave Chinner
2016-04-30 22:19     ` NeilBrown
2016-04-30 22:19       ` NeilBrown
2016-04-30 22:19       ` [Cluster-devel] " NeilBrown
2016-05-04  1:00       ` Dave Chinner
2016-05-04  1:00         ` Dave Chinner
2016-05-04  1:00         ` Dave Chinner
2016-05-04  1:00         ` [Cluster-devel] " Dave Chinner
2016-05-06  3:20         ` NeilBrown
2016-05-06  3:20           ` NeilBrown
2016-05-06  3:20           ` [Cluster-devel] " NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160504000703.GW26977@dastard \
    --to=david@fromorbit.com \
    --cc=akpm@linux-foundation.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.