From: Dave Chinner <david@fromorbit.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave.hansen@intel.com>,
	Rik van Riel <riel@redhat.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	linux-mm@kvack.org, rusty@rustcorp.com.au,
	Hugh Dickins <hughd@google.com>,
	linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	xfs@oss.sgi.com, Minchan Kim <minchan@kernel.org>,
	mst@redhat.com, Mel Gorman <mgorman@suse.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [RFC 2/2] x86_64: expand kernel stack to 16K
Date: Thu, 29 May 2014 07:55:18 +1000
Message-ID: <20140528215518.GM8554@dastard>
In-Reply-To: <20140528160658.GH2878@cmpxchg.org>

On Wed, May 28, 2014 at 12:06:58PM -0400, Johannes Weiner wrote:
> On Wed, May 28, 2014 at 07:13:45PM +1000, Dave Chinner wrote:
> > On Wed, May 28, 2014 at 06:37:38PM +1000, Dave Chinner wrote:
> > > [ cc XFS list ]
> > 
> > [and now that there is a complete copy on the XFS list, I'll add my 2c]
> > 
> > > On Wed, May 28, 2014 at 03:53:59PM +0900, Minchan Kim wrote:
> > > > While I was testing in-house patches under heavy memory pressure on
> > > > qemu-kvm, the 3.14 kernel crashed randomly. The cause was a kernel
> > > > stack overflow.
> > > > 
> > > > When I investigated the problem, the callstack was a little bit deeper
> > > > because reclaim functions were involved, but not the direct reclaim path.
> > > > 
> > > > I tried to trim the stack usage of some functions related to alloc/reclaim
> > > > and saved a hundred bytes or so, but the overflow didn't disappear: I hit
> > > > another overflow through a deeper callstack on the reclaim/allocator path.
> > 
> > That's a no-win situation. The stack overruns through ->writepage
> > we've been seeing with XFS over the past *4 years* are much larger
> > than a few bytes. The worst case on a virtio block device was about
> > 10.5KB of stack usage.
> > 
> > And, like this one, it came from the flusher thread as well. The
> > difference was that the allocation that triggered the reclaim path
> > you've reported occurred when 5k of the stack had already been
> > used...
> > 
> > > > Of course, we could sweep every site we have found to reduce
> > > > stack usage, but I'm not sure how long that would save the world (surely
> > > > lots of developers will add nice features that use stack again). And if
> > > > we consider another, more complex feature in the I/O layer and/or the
> > > > reclaim path, it might be better to increase the stack size. (Meanwhile,
> > > > stack usage on 64-bit machines has doubled compared to 32-bit, while the
> > > > stack has stuck at 8K. Hmm, that doesn't seem fair to me, and arm64 has
> > > > already expanded to 16K.)
> > 
> > Yup, that's all been pointed out previously. 8k stacks were never
> > large enough to fit the linux IO architecture on x86-64, but nobody
> > outside filesystem and IO developers has been willing to accept that
> > argument as valid, despite regular stack overruns and filesystems
> > having to add workaround after workaround to prevent stack overruns.
> > 
> > That's why stuff like this appears in various filesystems'
> > ->writepage:
> > 
> >         /*
> >          * Refuse to write the page out if we are called from reclaim context.
> >          *
> >          * This avoids stack overflows when called from deeply used stacks in
> >          * random callers for direct reclaim or memcg reclaim.  We explicitly
> >          * allow reclaim from kswapd as the stack usage there is relatively low.
> >          *
> >          * This should never happen except in the case of a VM regression so
> >          * warn about it.
> >          */
> >         if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) ==
> >                         PF_MEMALLOC))
> >                 goto redirty;
> > 
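A minimal userspace sketch of what that mask-and-compare tests, in case the
idiom is not obvious. The flag values below are illustrative assumptions (the
real PF_* definitions live in the kernel's include/linux/sched.h and can
differ between versions), and is_direct_reclaim() is a hypothetical helper
name, not a kernel function:

```c
#include <stdbool.h>

/* Illustrative values only; check include/linux/sched.h for the real ones. */
#define PF_MEMALLOC 0x00000800u  /* task is allocating on behalf of reclaim */
#define PF_KSWAPD   0x00040000u  /* task is the kswapd kernel thread */

/*
 * Hypothetical helper: true only when PF_MEMALLOC is set AND PF_KSWAPD is
 * clear. Masking with both bits and comparing against PF_MEMALLOC alone
 * checks "one set, the other clear" in a single comparison.
 */
static bool is_direct_reclaim(unsigned int flags)
{
    return (flags & (PF_MEMALLOC | PF_KSWAPD)) == PF_MEMALLOC;
}
```

So a task with neither flag, or kswapd itself (both flags set), passes
through; only a non-kswapd task in reclaim context takes the redirty path.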
> > That still doesn't guarantee us enough stack space to do writeback,
> > though, because memory allocation can occur when reading in metadata
> > needed to do delayed allocation, and so we could trigger GFP_NOFS
> > memory allocation from the flusher thread with 4-5k of stack already
> > consumed, so that would still overrun the stack.
> > 
> > So, a couple of years ago we started deferring half the writeback
> > stack usage to a worker thread (commit c999a22 "xfs: introduce an
> > allocation workqueue"), under the assumption that the worst stack
> > usage when we call memory allocation is around 3-3.5k of stack used.
> > We thought that would be safe, but the stack trace you've posted
> > shows that alloc_page(GFP_NOFS) can consume upwards of 5k of stack,
> > which means we're still screwed despite all the workarounds we have
> > in place.
> 
> The allocation and reclaim stack itself is only 2k per the stacktrace
> below.  What got us in this particular case is that we engaged a
> complicated block layer setup from within the allocation context in
> order to swap out a page.

The report does not have a complicated block layer setup - it's just
a swap device on a virtio device. There's no MD, no raid, no complex
transport and protocol layer, etc. It's about as simple as it gets.

> In the past we disabled filesystem ->writepage from within the
> allocation context and deferred it to kswapd for stack reasons (see
> the WARN_ON_ONCE and the comment in your above quote), but I think we
> have to go further and do the same for even swap_writepage():

I don't think that solves the problem. I've seen plenty of near
stack overflows that were caused by >3k of stack being used by
memory allocation/reclaim, plus another 1k of stack used by
scheduling while waiting.

If we have a subsystem that can put >3k on the stack at arbitrary
locations, then we really only have <5k of stack available for
callers. And when the generic code typically consumes 1-2k of stack
before we get to filesystem specific methods, we only have 3-4k of
stack left for the worst case storage path stack usage. With the
block layer and driver layers requiring 2.5-3k because they can do
memory allocation and schedule, that leaves very little for the
layers in the middle, which is arguably the most algorithmically
complex layer of the storage stack.....
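The budget above can be worked through as plain arithmetic. This is only a
sketch under stated assumptions: "k" is taken as 1024 bytes, the ">3k" for
allocation/reclaim is taken at its 3k floor (so the result is optimistic),
and fs_budget() and print_budget() are hypothetical names made up for the
example:

```c
#include <stdio.h>

/* Assumption: "k" in the text means 1024 bytes; the stack is 8k. */
enum { KB = 1024, STACK_BYTES = 8 * KB };

/* Stack left for the filesystem layers once the other consumers are subtracted. */
static int fs_budget(int alloc_reclaim, int generic, int block_driver)
{
    return STACK_BYTES - alloc_reclaim - generic - block_driver;
}

/* Print both ends of the ranges quoted in the text. */
static void print_budget(void)
{
    /* 3k alloc/reclaim, 1k generic code, 2.5k block + driver layers */
    printf("best case:  %d bytes\n", fs_budget(3 * KB, 1 * KB, 2560));
    /* 3k alloc/reclaim, 2k generic code, 3k block + driver layers */
    printf("worst case: %d bytes\n", fs_budget(3 * KB, 2 * KB, 3 * KB));
}
```

With those inputs the filesystem layers get somewhere between 0 and 1536
bytes, and that is before allocation/reclaim goes past its 3k floor.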

> > > > I guess this topic has been discussed several times, so there might be a
> > > > strong reason not to increase the kernel stack size on x86_64 that I don't
> > > > know about, so I'm Ccing the x86_64 maintainers, other MM guys and virtio
> > > > maintainers.
> > > >
> > > >          Depth    Size   Location    (51 entries)
> > > > 
> > > >    0)     7696      16   lookup_address+0x28/0x30
> > > >    1)     7680      16   _lookup_address_cpa.isra.3+0x3b/0x40
> > > >    2)     7664      24   __change_page_attr_set_clr+0xe0/0xb50
> > > >    3)     7640     392   kernel_map_pages+0x6c/0x120
> > > >    4)     7248     256   get_page_from_freelist+0x489/0x920
> > > >    5)     6992     352   __alloc_pages_nodemask+0x5e1/0xb20
> > > >    6)     6640       8   alloc_pages_current+0x10f/0x1f0
> > > >    7)     6632     168   new_slab+0x2c5/0x370
> > > >    8)     6464       8   __slab_alloc+0x3a9/0x501
> > > >    9)     6456      80   __kmalloc+0x1cb/0x200
> > > >   10)     6376     376   vring_add_indirect+0x36/0x200
> > > >   11)     6000     144   virtqueue_add_sgs+0x2e2/0x320
> > > >   12)     5856     288   __virtblk_add_req+0xda/0x1b0
> > > >   13)     5568      96   virtio_queue_rq+0xd3/0x1d0
> > > >   14)     5472     128   __blk_mq_run_hw_queue+0x1ef/0x440
> > > >   15)     5344      16   blk_mq_run_hw_queue+0x35/0x40
> > > >   16)     5328      96   blk_mq_insert_requests+0xdb/0x160
> > > >   17)     5232     112   blk_mq_flush_plug_list+0x12b/0x140
> > > >   18)     5120     112   blk_flush_plug_list+0xc7/0x220
> > > >   19)     5008      64   io_schedule_timeout+0x88/0x100
> > > >   20)     4944     128   mempool_alloc+0x145/0x170
> > > >   21)     4816      96   bio_alloc_bioset+0x10b/0x1d0
> > > >   22)     4720      48   get_swap_bio+0x30/0x90
> > > >   23)     4672     160   __swap_writepage+0x150/0x230
> > > >   24)     4512      32   swap_writepage+0x42/0x90
> 
> Without swap IO from the allocation context, the stack would have
> ended here, which would have been easily survivable.  And left the
> writeout work to kswapd, which has a much shallower stack than this:

Sure, but this is just playing whack-a-stack. We can keep slapping
band-aids and restrictions on code and make the code more complex,
constrained, convoluted and slower, or we can just increase the
stack size....

> > > >   25)     4480     320   shrink_page_list+0x676/0xa80
> > > >   26)     4160     208   shrink_inactive_list+0x262/0x4e0
> > > >   27)     3952     304   shrink_lruvec+0x3e1/0x6a0
> > > >   28)     3648      80   shrink_zone+0x3f/0x110
> > > >   29)     3568     128   do_try_to_free_pages+0x156/0x4c0
> > > >   30)     3440     208   try_to_free_pages+0xf7/0x1e0
> > > >   31)     3232     352   __alloc_pages_nodemask+0x783/0xb20
> > > >   32)     2880       8   alloc_pages_current+0x10f/0x1f0
> > > >   33)     2872     200   __page_cache_alloc+0x13f/0x160
> > > >   34)     2672      80   find_or_create_page+0x4c/0xb0
> > > >   35)     2592      80   ext4_mb_load_buddy+0x1e9/0x370
> > > >   36)     2512     176   ext4_mb_regular_allocator+0x1b7/0x460
> > > >   37)     2336     128   ext4_mb_new_blocks+0x458/0x5f0
> > > >   38)     2208     256   ext4_ext_map_blocks+0x70b/0x1010
> > > >   39)     1952     160   ext4_map_blocks+0x325/0x530
> > > >   40)     1792     384   ext4_writepages+0x6d1/0xce0
> > > >   41)     1408      16   do_writepages+0x23/0x40
> > > >   42)     1392      96   __writeback_single_inode+0x45/0x2e0
> > > >   43)     1296     176   writeback_sb_inodes+0x2ad/0x500
> > > >   44)     1120      80   __writeback_inodes_wb+0x9e/0xd0
> > > >   45)     1040     160   wb_writeback+0x29b/0x350
> > > >   46)      880     208   bdi_writeback_workfn+0x11c/0x480
> > > >   47)      672     144   process_one_work+0x1d2/0x570
> > > >   48)      528     112   worker_thread+0x116/0x370
> > > >   49)      416     240   kthread+0xf3/0x110
> > > >   50)      176     176   ret_from_fork+0x7c/0xb0
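The figures argued over in this thread can be read straight off the Depth
column of the trace: the stack used by a run of frames is the depth at the
first frame minus the depth at the frame after the last one. A small sketch
of that bookkeeping follows; the grouping of frames into three regions is one
reading of the trace, not something the tracer reports, and frames_bytes()
and print_breakdown() are hypothetical names:

```c
#include <stdio.h>

/* Depth column of the 51-entry trace, frame 0 first, plus a 0 sentinel. */
static const int depth[52] = {
    7696, 7680, 7664, 7640, 7248, 6992, 6640, 6632, 6464, 6456,   /*  0-9  */
    6376, 6000, 5856, 5568, 5472, 5344, 5328, 5232, 5120, 5008,   /* 10-19 */
    4944, 4816, 4720, 4672, 4512, 4480, 4160, 3952, 3648, 3568,   /* 20-29 */
    3440, 3232, 2880, 2872, 2672, 2592, 2512, 2336, 2208, 1952,   /* 30-39 */
    1792, 1408, 1392, 1296, 1120, 1040,  880,  672,  528,  416,   /* 40-49 */
     176,    0                                                    /* 50, sentinel */
};

/* Bytes of stack consumed by frames first..last inclusive. */
static int frames_bytes(int first, int last)
{
    return depth[first] - depth[last + 1];
}

/* Assumed grouping: swap writeout and below, alloc/reclaim, everything under it. */
static void print_breakdown(void)
{
    printf("swap writeout, block layer, virtio (0-24): %d\n", frames_bytes(0, 24));
    printf("page allocation and reclaim (25-33):       %d\n", frames_bytes(25, 33));
    printf("writeback, ext4, kthread (34-50):          %d\n", frames_bytes(34, 50));
}
```

Frames 25-33 come to 1808 bytes, which matches the "only 2k" figure for
allocation and reclaim, while frames 0-33 together come to 5024 bytes, which
is the "upwards of 5k" consumed underneath the page cache allocation.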
> > 
> > Impressive: 3 nested allocations - GFP_NOFS, GFP_NOIO and then
> > GFP_ATOMIC before the stack goes boom. XFS usually only needs 2...
> 
> Do they also usually involve swap_writepage()?

No.  Have a look at this recent thread when Dave Jones reported
trinity was busting the stack.

http://oss.sgi.com/archives/xfs/2014-02/msg00325.html

What happens when a shrinker issues IO:

http://oss.sgi.com/archives/xfs/2014-02/msg00361.html

Yes, there was an XFS problem in there that was fixed (by moving
work to a workqueue!) but the point is that swap is not the only
path through memory allocation that can consume huge amounts of
stack. That above trace also points out a path through the scheduler
of close to 1k of stack usage. That gets worse -
wait_for_completion() typically requires 1.5k of stack....

Contributing is the new blk-mq layer, which from the above stack
trace still hasn't been fixed:

http://oss.sgi.com/archives/xfs/2014-02/msg00355.html

and a lot of the stack usage is because of saved registers on each
function call:

http://oss.sgi.com/archives/xfs/2014-02/msg00470.html

And here's a good set of examples of the amount of stack certain
functions can require:

http://oss.sgi.com/archives/xfs/2014-02/msg00365.html

Am I the only person who sees a widespread problem here?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, "H. Peter Anvin" <hpa@zytor.com>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	rusty@rustcorp.com.au, mst@redhat.com,
	Dave Hansen <dave.hansen@intel.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	xfs@oss.sgi.com
Subject: Re: [RFC 2/2] x86_64: expand kernel stack to 16K
Date: Thu, 29 May 2014 07:55:18 +1000	[thread overview]
Message-ID: <20140528215518.GM8554@dastard> (raw)
In-Reply-To: <20140528160658.GH2878@cmpxchg.org>

On Wed, May 28, 2014 at 12:06:58PM -0400, Johannes Weiner wrote:
> On Wed, May 28, 2014 at 07:13:45PM +1000, Dave Chinner wrote:
> > On Wed, May 28, 2014 at 06:37:38PM +1000, Dave Chinner wrote:
> > > [ cc XFS list ]
> > 
> > [and now there is a complete copy on the XFs list, I'll add my 2c]
> > 
> > > On Wed, May 28, 2014 at 03:53:59PM +0900, Minchan Kim wrote:
> > > > While I play inhouse patches with much memory pressure on qemu-kvm,
> > > > 3.14 kernel was randomly crashed. The reason was kernel stack overflow.
> > > > 
> > > > When I investigated the problem, the callstack was a little bit deeper
> > > > by involve with reclaim functions but not direct reclaim path.
> > > > 
> > > > I tried to diet stack size of some functions related with alloc/reclaim
> > > > so did a hundred of byte but overflow was't disappeard so that I encounter
> > > > overflow by another deeper callstack on reclaim/allocator path.
> > 
> > That's a no win situation. The stack overruns through ->writepage
> > we've been seeing with XFS over the past *4 years* are much larger
> > than a few bytes. The worst case stack usage on a virtio block
> > device was about 10.5KB of stack usage.
> > 
> > And, like this one, it came from the flusher thread as well. The
> > difference was that the allocation that triggered the reclaim path
> > you've reported occurred when 5k of the stack had already been
> > used...
> > 
> > > > Of course, we might sweep every sites we have found for reducing
> > > > stack usage but I'm not sure how long it saves the world(surely,
> > > > lots of developer start to add nice features which will use stack
> > > > agains) and if we consider another more complex feature in I/O layer
> > > > and/or reclaim path, it might be better to increase stack size(
> > > > meanwhile, stack usage on 64bit machine was doubled compared to 32bit
> > > > while it have sticked to 8K. Hmm, it's not a fair to me and arm64
> > > > already expaned to 16K. )
> > 
> > Yup, that's all been pointed out previously. 8k stacks were never
> > large enough to fit the linux IO architecture on x86-64, but nobody
> > outside filesystem and IO developers has been willing to accept that
> > argument as valid, despite regular stack overruns and filesystem
> > having to add workaround after workaround to prevent stack overruns.
> > 
> > That's why stuff like this appears in various filesystem's
> > ->writepage:
> > 
> >         /*
> >          * Refuse to write the page out if we are called from reclaim context.
> >          *
> >          * This avoids stack overflows when called from deeply used stacks in
> >          * random callers for direct reclaim or memcg reclaim.  We explicitly
> >          * allow reclaim from kswapd as the stack usage there is relatively low.
> >          *
> >          * This should never happen except in the case of a VM regression so
> >          * warn about it.
> >          */
> >         if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) ==
> >                         PF_MEMALLOC))
> >                 goto redirty;
> > 
> > That still doesn't guarantee us enough stack space to do writeback,
> > though, because memory allocation can occur when reading in metadata
> > needed to do delayed allocation, and so we could trigger GFP_NOFS
> > memory allocation from the flusher thread with 4-5k of stack already
> > consumed, so that would still overrun teh stack.
> > 
> > So, a couple of years ago we started defering half the writeback
> > stack usage to a worker thread (commit c999a22 "xfs: introduce an
> > allocation workqueue"), under the assumption that the worst stack
> > usage when we call memory allocation is around 3-3.5k of stack used.
> > We thought that would be safe, but the stack trace you've posted
> > shows that alloc_page(GFP_NOFS) can consume upwards of 5k of stack,
> > which means we're still screwed despite all the workarounds we have
> > in place.
> 
> The allocation and reclaim stack itself is only 2k per the stacktrace
> below.  What got us in this particular case is that we engaged a
> complicated block layer setup from within the allocation context in
> order to swap out a page.

The report does not have a complicated block layer setup - it's just
a swap device on a virtio device. There's no MD, no raid, no complex
transport and protocol layer, etc. It's about as simple as it gets.

> In the past we disabled filesystem ->writepage from within the
> allocation context and deferred it to kswapd for stack reasons (see
> the WARN_ON_ONCE and the comment in your above quote), but I think we
> have to go further and do the same for even swap_writepage():

I don't think that solves the problem. I've seen plenty of near
stack overflows that were caused by >3k of stack being used because
of memory allocation/reclaim overhead and then scheduling.
usage and another 1k of stack scheduling waiting.

If we have a subsystem that can put >3k on the stack at arbitrary
locations, then we really only have <5k of stack available for
callers. And when the generic code typically consumes 1-2k of stack
before we get to filesystem specific methods, we only have 3-4k of
stack left for the worst case storage path stack usage. With the
block layer and driver layers requiring 2.5-3k because they can do
memory allocation and schedule, that leaves very little for the
layers in the middle, which is arguably the most algorithmically
complex layer of the storage stack.....

> > > > I guess this topic was discussed several time so there might be
> > > > strong reason not to increase kernel stack size on x86_64, for me not
> > > > knowing so Ccing x86_64 maintainers, other MM guys and virtio
> > > > maintainers.
> > > >
> > > >          Depth    Size   Location    (51 entries)
> > > > 
> > > >    0)     7696      16   lookup_address+0x28/0x30
> > > >    1)     7680      16   _lookup_address_cpa.isra.3+0x3b/0x40
> > > >    2)     7664      24   __change_page_attr_set_clr+0xe0/0xb50
> > > >    3)     7640     392   kernel_map_pages+0x6c/0x120
> > > >    4)     7248     256   get_page_from_freelist+0x489/0x920
> > > >    5)     6992     352   __alloc_pages_nodemask+0x5e1/0xb20
> > > >    6)     6640       8   alloc_pages_current+0x10f/0x1f0
> > > >    7)     6632     168   new_slab+0x2c5/0x370
> > > >    8)     6464       8   __slab_alloc+0x3a9/0x501
> > > >    9)     6456      80   __kmalloc+0x1cb/0x200
> > > >   10)     6376     376   vring_add_indirect+0x36/0x200
> > > >   11)     6000     144   virtqueue_add_sgs+0x2e2/0x320
> > > >   12)     5856     288   __virtblk_add_req+0xda/0x1b0
> > > >   13)     5568      96   virtio_queue_rq+0xd3/0x1d0
> > > >   14)     5472     128   __blk_mq_run_hw_queue+0x1ef/0x440
> > > >   15)     5344      16   blk_mq_run_hw_queue+0x35/0x40
> > > >   16)     5328      96   blk_mq_insert_requests+0xdb/0x160
> > > >   17)     5232     112   blk_mq_flush_plug_list+0x12b/0x140
> > > >   18)     5120     112   blk_flush_plug_list+0xc7/0x220
> > > >   19)     5008      64   io_schedule_timeout+0x88/0x100
> > > >   20)     4944     128   mempool_alloc+0x145/0x170
> > > >   21)     4816      96   bio_alloc_bioset+0x10b/0x1d0
> > > >   22)     4720      48   get_swap_bio+0x30/0x90
> > > >   23)     4672     160   __swap_writepage+0x150/0x230
> > > >   24)     4512      32   swap_writepage+0x42/0x90
> 
> Without swap IO from the allocation context, the stack would have
> ended here, which would have been easily survivable.  And left the
> writeout work to kswapd, which has a much shallower stack than this:

Sure, but this is just playing whack-a-stack. We can keep slapping
band-aids and restrictions on code and make the code more complex,
constrainted, convouted and slower, or we can just increase the
stack size....

> > > >   25)     4480     320   shrink_page_list+0x676/0xa80
> > > >   26)     4160     208   shrink_inactive_list+0x262/0x4e0
> > > >   27)     3952     304   shrink_lruvec+0x3e1/0x6a0
> > > >   28)     3648      80   shrink_zone+0x3f/0x110
> > > >   29)     3568     128   do_try_to_free_pages+0x156/0x4c0
> > > >   30)     3440     208   try_to_free_pages+0xf7/0x1e0
> > > >   31)     3232     352   __alloc_pages_nodemask+0x783/0xb20
> > > >   32)     2880       8   alloc_pages_current+0x10f/0x1f0
> > > >   33)     2872     200   __page_cache_alloc+0x13f/0x160
> > > >   34)     2672      80   find_or_create_page+0x4c/0xb0
> > > >   35)     2592      80   ext4_mb_load_buddy+0x1e9/0x370
> > > >   36)     2512     176   ext4_mb_regular_allocator+0x1b7/0x460
> > > >   37)     2336     128   ext4_mb_new_blocks+0x458/0x5f0
> > > >   38)     2208     256   ext4_ext_map_blocks+0x70b/0x1010
> > > >   39)     1952     160   ext4_map_blocks+0x325/0x530
> > > >   40)     1792     384   ext4_writepages+0x6d1/0xce0
> > > >   41)     1408      16   do_writepages+0x23/0x40
> > > >   42)     1392      96   __writeback_single_inode+0x45/0x2e0
> > > >   43)     1296     176   writeback_sb_inodes+0x2ad/0x500
> > > >   44)     1120      80   __writeback_inodes_wb+0x9e/0xd0
> > > >   45)     1040     160   wb_writeback+0x29b/0x350
> > > >   46)      880     208   bdi_writeback_workfn+0x11c/0x480
> > > >   47)      672     144   process_one_work+0x1d2/0x570
> > > >   48)      528     112   worker_thread+0x116/0x370
> > > >   49)      416     240   kthread+0xf3/0x110
> > > >   50)      176     176   ret_from_fork+0x7c/0xb0
> > 
> > Impressive: 3 nested allocations - GFP_NOFS, GFP_NOIO and then
> > GFP_ATOMIC before the stack goes boom. XFS usually only needs 2...
> 
> Do they also usually involve swap_writepage()?

No.  Have a look at this recent thread when Dave Jones reported
trinity was busting the stack.

http://oss.sgi.com/archives/xfs/2014-02/msg00325.html

What happens when a shrinker issues IO:

http://oss.sgi.com/archives/xfs/2014-02/msg00361.html

Yes, there was an XFS problem in there that was fixed (by moving
work to a workqueue!) but the point is that swap is not the only
path through memory allocation that can consume huge amounts of
stack. That above trace also points out a path through the scheduler
of close to 1k of stack usage. That gets worse -
wait_for_completion() typically requires 1.5k of stack....

Contributing is the new blk-mq layer, which from the above stack
trace still hasn't been fixed:

http://oss.sgi.com/archives/xfs/2014-02/msg00355.html

and a lot of the stack usage is because of saved registers on each
function call:

http://oss.sgi.com/archives/xfs/2014-02/msg00470.html

And here's a good set of examples of the amount of stack certain
functions can require:

http://oss.sgi.com/archives/xfs/2014-02/msg00365.html

Am I the only person who sees a widespread problem here?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, "H. Peter Anvin" <hpa@zytor.com>,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	rusty@rustcorp.com.au, mst@redhat.com,
	Dave Hansen <dave.hansen@intel.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	xfs@oss.sgi.com
Subject: Re: [RFC 2/2] x86_64: expand kernel stack to 16K
Date: Thu, 29 May 2014 07:55:18 +1000	[thread overview]
Message-ID: <20140528215518.GM8554@dastard> (raw)
In-Reply-To: <20140528160658.GH2878@cmpxchg.org>

On Wed, May 28, 2014 at 12:06:58PM -0400, Johannes Weiner wrote:
> On Wed, May 28, 2014 at 07:13:45PM +1000, Dave Chinner wrote:
> > On Wed, May 28, 2014 at 06:37:38PM +1000, Dave Chinner wrote:
> > > [ cc XFS list ]
> > 
> > [and now there is a complete copy on the XFs list, I'll add my 2c]
> > 
> > > On Wed, May 28, 2014 at 03:53:59PM +0900, Minchan Kim wrote:
> > > > While I play inhouse patches with much memory pressure on qemu-kvm,
> > > > 3.14 kernel was randomly crashed. The reason was kernel stack overflow.
> > > > 
> > > > When I investigated the problem, the callstack was a little bit deeper
> > > > by involve with reclaim functions but not direct reclaim path.
> > > > 
> > > > I tried to diet stack size of some functions related with alloc/reclaim
> > > > so did a hundred of byte but overflow was't disappeard so that I encounter
> > > > overflow by another deeper callstack on reclaim/allocator path.
> > 
> > That's a no win situation. The stack overruns through ->writepage
> > we've been seeing with XFS over the past *4 years* are much larger
> > than a few bytes. The worst case stack usage on a virtio block
> > device was about 10.5KB of stack usage.
> > 
> > And, like this one, it came from the flusher thread as well. The
> > difference was that the allocation that triggered the reclaim path
> > you've reported occurred when 5k of the stack had already been
> > used...
> > 
> > > > Of course, we might sweep every sites we have found for reducing
> > > > stack usage but I'm not sure how long it saves the world(surely,
> > > > lots of developer start to add nice features which will use stack
> > > > agains) and if we consider another more complex feature in I/O layer
> > > > and/or reclaim path, it might be better to increase stack size(
> > > > meanwhile, stack usage on 64bit machine was doubled compared to 32bit
> > > > while it have sticked to 8K. Hmm, it's not a fair to me and arm64
> > > > already expaned to 16K. )
> > 
> > Yup, that's all been pointed out previously. 8k stacks were never
> > large enough to fit the linux IO architecture on x86-64, but nobody
> > outside filesystem and IO developers has been willing to accept that
> > argument as valid, despite regular stack overruns and filesystem
> > having to add workaround after workaround to prevent stack overruns.
> > 
> > That's why stuff like this appears in various filesystem's
> > ->writepage:
> > 
> >         /*
> >          * Refuse to write the page out if we are called from reclaim context.
> >          *
> >          * This avoids stack overflows when called from deeply used stacks in
> >          * random callers for direct reclaim or memcg reclaim.  We explicitly
> >          * allow reclaim from kswapd as the stack usage there is relatively low.
> >          *
> >          * This should never happen except in the case of a VM regression so
> >          * warn about it.
> >          */
> >         if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) ==
> >                         PF_MEMALLOC))
> >                 goto redirty;
> > 
> > That still doesn't guarantee us enough stack space to do writeback,
> > though, because memory allocation can occur when reading in metadata
> > needed to do delayed allocation, and so we could trigger GFP_NOFS
> > memory allocation from the flusher thread with 4-5k of stack already
> > consumed, so that would still overrun teh stack.
> > 
> > So, a couple of years ago we started defering half the writeback
> > stack usage to a worker thread (commit c999a22 "xfs: introduce an
> > allocation workqueue"), under the assumption that the worst stack
> > usage when we call memory allocation is around 3-3.5k of stack used.
> > We thought that would be safe, but the stack trace you've posted
> > shows that alloc_page(GFP_NOFS) can consume upwards of 5k of stack,
> > which means we're still screwed despite all the workarounds we have
> > in place.
> 
> The allocation and reclaim stack itself is only 2k per the stacktrace
> below.  What got us in this particular case is that we engaged a
> complicated block layer setup from within the allocation context in
> order to swap out a page.

The report does not have a complicated block layer setup - it's just
a swap device on a virtio device. There's no MD, no raid, no complex
transport and protocol layer, etc. It's about as simple as it gets.

> In the past we disabled filesystem ->writepage from within the
> allocation context and deferred it to kswapd for stack reasons (see
> the WARN_ON_ONCE and the comment in your above quote), but I think we
> have to go further and do the same for even swap_writepage():

I don't think that solves the problem. I've seen plenty of near
stack overflows that were caused by >3k of stack being used because
of memory allocation/reclaim overhead and then scheduling.
usage and another 1k of stack scheduling waiting.

If we have a subsystem that can put >3k on the stack at arbitrary
locations, then we really only have <5k of stack available for
callers. And when the generic code typically consumes 1-2k of stack
before we get to filesystem specific methods, we only have 3-4k of
stack left for the worst case storage path stack usage. With the
block layer and driver layers requiring 2.5-3k because they can do
memory allocation and schedule, that leaves very little for the
layers in the middle, which is arguably the most algorithmically
complex layer of the storage stack.....

> > > > I guess this topic was discussed several time so there might be
> > > > strong reason not to increase kernel stack size on x86_64, for me not
> > > > knowing so Ccing x86_64 maintainers, other MM guys and virtio
> > > > maintainers.
> > > >
> > > >          Depth    Size   Location    (51 entries)
> > > > 
> > > >    0)     7696      16   lookup_address+0x28/0x30
> > > >    1)     7680      16   _lookup_address_cpa.isra.3+0x3b/0x40
> > > >    2)     7664      24   __change_page_attr_set_clr+0xe0/0xb50
> > > >    3)     7640     392   kernel_map_pages+0x6c/0x120
> > > >    4)     7248     256   get_page_from_freelist+0x489/0x920
> > > >    5)     6992     352   __alloc_pages_nodemask+0x5e1/0xb20
> > > >    6)     6640       8   alloc_pages_current+0x10f/0x1f0
> > > >    7)     6632     168   new_slab+0x2c5/0x370
> > > >    8)     6464       8   __slab_alloc+0x3a9/0x501
> > > >    9)     6456      80   __kmalloc+0x1cb/0x200
> > > >   10)     6376     376   vring_add_indirect+0x36/0x200
> > > >   11)     6000     144   virtqueue_add_sgs+0x2e2/0x320
> > > >   12)     5856     288   __virtblk_add_req+0xda/0x1b0
> > > >   13)     5568      96   virtio_queue_rq+0xd3/0x1d0
> > > >   14)     5472     128   __blk_mq_run_hw_queue+0x1ef/0x440
> > > >   15)     5344      16   blk_mq_run_hw_queue+0x35/0x40
> > > >   16)     5328      96   blk_mq_insert_requests+0xdb/0x160
> > > >   17)     5232     112   blk_mq_flush_plug_list+0x12b/0x140
> > > >   18)     5120     112   blk_flush_plug_list+0xc7/0x220
> > > >   19)     5008      64   io_schedule_timeout+0x88/0x100
> > > >   20)     4944     128   mempool_alloc+0x145/0x170
> > > >   21)     4816      96   bio_alloc_bioset+0x10b/0x1d0
> > > >   22)     4720      48   get_swap_bio+0x30/0x90
> > > >   23)     4672     160   __swap_writepage+0x150/0x230
> > > >   24)     4512      32   swap_writepage+0x42/0x90
> 
> Without swap IO from the allocation context, the stack would have
> ended here, which would have been easily survivable.  And left the
> writeout work to kswapd, which has a much shallower stack than this:

Sure, but this is just playing whack-a-stack. We can keep slapping
band-aids and restrictions on code and make the code more complex,
constrained, convoluted and slower, or we can just increase the
stack size....

> > > >   25)     4480     320   shrink_page_list+0x676/0xa80
> > > >   26)     4160     208   shrink_inactive_list+0x262/0x4e0
> > > >   27)     3952     304   shrink_lruvec+0x3e1/0x6a0
> > > >   28)     3648      80   shrink_zone+0x3f/0x110
> > > >   29)     3568     128   do_try_to_free_pages+0x156/0x4c0
> > > >   30)     3440     208   try_to_free_pages+0xf7/0x1e0
> > > >   31)     3232     352   __alloc_pages_nodemask+0x783/0xb20
> > > >   32)     2880       8   alloc_pages_current+0x10f/0x1f0
> > > >   33)     2872     200   __page_cache_alloc+0x13f/0x160
> > > >   34)     2672      80   find_or_create_page+0x4c/0xb0
> > > >   35)     2592      80   ext4_mb_load_buddy+0x1e9/0x370
> > > >   36)     2512     176   ext4_mb_regular_allocator+0x1b7/0x460
> > > >   37)     2336     128   ext4_mb_new_blocks+0x458/0x5f0
> > > >   38)     2208     256   ext4_ext_map_blocks+0x70b/0x1010
> > > >   39)     1952     160   ext4_map_blocks+0x325/0x530
> > > >   40)     1792     384   ext4_writepages+0x6d1/0xce0
> > > >   41)     1408      16   do_writepages+0x23/0x40
> > > >   42)     1392      96   __writeback_single_inode+0x45/0x2e0
> > > >   43)     1296     176   writeback_sb_inodes+0x2ad/0x500
> > > >   44)     1120      80   __writeback_inodes_wb+0x9e/0xd0
> > > >   45)     1040     160   wb_writeback+0x29b/0x350
> > > >   46)      880     208   bdi_writeback_workfn+0x11c/0x480
> > > >   47)      672     144   process_one_work+0x1d2/0x570
> > > >   48)      528     112   worker_thread+0x116/0x370
> > > >   49)      416     240   kthread+0xf3/0x110
> > > >   50)      176     176   ret_from_fork+0x7c/0xb0
> > 
> > Impressive: 3 nested allocations - GFP_NOFS, GFP_NOIO and then
> > GFP_ATOMIC before the stack goes boom. XFS usually only needs 2...
> 
> Do they also usually involve swap_writepage()?

No.  Have a look at this recent thread when Dave Jones reported
trinity was busting the stack.

http://oss.sgi.com/archives/xfs/2014-02/msg00325.html

What happens when a shrinker issues IO:

http://oss.sgi.com/archives/xfs/2014-02/msg00361.html

Yes, there was an XFS problem in there that was fixed (by moving
work to a workqueue!) but the point is that swap is not the only
path through memory allocation that can consume huge amounts of
stack. That above trace also points out a path through the scheduler
of close to 1k of stack usage. That gets worse -
wait_for_completion() typically requires 1.5k of stack....
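
For anyone who wants to put numbers on a particular path, here is a
minimal sketch (not part of any kernel tooling) that tallies rows in the
stack tracer format quoted earlier in this mail. The names parse_trace,
bytes_at_and_above and SAMPLE are made up for illustration, and SAMPLE
only holds entries 20-25 of the quoted trace.

```python
import re

# One stack tracer row, e.g. "  24)     4512      32   swap_writepage+0x42/0x90",
# optionally preceded by mail quoting such as "> > > > ".
ROW = re.compile(r"^(?:>\s*)*\s*(\d+)\)\s+(\d+)\s+(\d+)\s+(\S+)\s*$")

# Entries 20-25 copied from the quoted trace (illustrative subset only).
SAMPLE = """\
> > > >   20)     4944     128   mempool_alloc+0x145/0x170
> > > >   21)     4816      96   bio_alloc_bioset+0x10b/0x1d0
> > > >   22)     4720      48   get_swap_bio+0x30/0x90
> > > >   23)     4672     160   __swap_writepage+0x150/0x230
> > > >   24)     4512      32   swap_writepage+0x42/0x90
> > > >   25)     4480     320   shrink_page_list+0x676/0xa80
"""


def parse_trace(text):
    """Return (index, depth, size, symbol) tuples for every tracer row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            idx, depth, size, loc = m.groups()
            # Drop the "+offset/length" suffix to keep just the symbol name.
            rows.append((int(idx), int(depth), int(size), loc.split("+")[0]))
    return rows


def bytes_at_and_above(rows, symbol):
    """Sum the frame sizes of the first row named `symbol` and of every
    row listed before it (the functions it ended up calling)."""
    for i, (_, _, _, sym) in enumerate(rows):
        if sym == symbol:
            return sum(size for _, _, size, _ in rows[: i + 1])
    raise KeyError(symbol)


if __name__ == "__main__":
    rows = parse_trace(SAMPLE)
    print(bytes_at_and_above(rows, "swap_writepage"))
```

Fed the full quoted trace instead of the subset, the same sum for
swap_writepage comes to 7696 - 4480 = 3216 bytes, i.e. the stack
consumed from swap_writepage() up to the deepest frame.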

Contributing is the new blk-mq layer, which from the above stack
trace still hasn't been fixed:

http://oss.sgi.com/archives/xfs/2014-02/msg00355.html

and a lot of the stack usage is because of saved registers on each
function call:

http://oss.sgi.com/archives/xfs/2014-02/msg00470.html

And here's a good set of examples of the amount of stack certain
functions can require:

http://oss.sgi.com/archives/xfs/2014-02/msg00365.html

Am I the only person who sees a widespread problem here?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
