[PATCH 3.16, 4.4, 4.9] dm bufio: avoid sleeping while holding the dm

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 3.16, 4.4, 4.9] dm bufio: avoid sleeping while holding the dm_bufio lock
@ 2018-07-04 14:36 Mikulas Patocka
  2018-07-10 14:15 ` Greg Kroah-Hartman
  2018-11-10 23:14 ` Ben Hutchings
  0 siblings, 2 replies; 3+ messages in thread
From: Mikulas Patocka @ 2018-07-04 14:36 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Ben Hutchings, stable
  Cc: jing xia, David Rientjes, Guenter Roeck, Douglas Anderson,
	Mike Snitzer

This is backport of the upstream patch 
9ea61cac0b1ad0c09022f39fd97e9b99a2cfc2dc. It should be backported to 
stable kernels because this performance problem was seen on the android 
4.4 kernel.


commit 9ea61cac0b1ad0c09022f39fd97e9b99a2cfc2dc
Author: Douglas Anderson <dianders@chromium.org>
Date:   Thu Nov 17 11:24:20 2016 -0800

    dm bufio: avoid sleeping while holding the dm_bufio lock
    
    We've seen in-field reports showing _lots_ (18 in one case, 41 in
    another) of tasks all sitting there blocked on:
    
      mutex_lock+0x4c/0x68
      dm_bufio_shrink_count+0x38/0x78
      shrink_slab.part.54.constprop.65+0x100/0x464
      shrink_zone+0xa8/0x198
    
    In the two cases analyzed, we see one task that looks like this:
    
      Workqueue: kverityd verity_prefetch_io
    
      __switch_to+0x9c/0xa8
      __schedule+0x440/0x6d8
      schedule+0x94/0xb4
      schedule_timeout+0x204/0x27c
      schedule_timeout_uninterruptible+0x44/0x50
      wait_iff_congested+0x9c/0x1f0
      shrink_inactive_list+0x3a0/0x4cc
      shrink_lruvec+0x418/0x5cc
      shrink_zone+0x88/0x198
      try_to_free_pages+0x51c/0x588
      __alloc_pages_nodemask+0x648/0xa88
      __get_free_pages+0x34/0x7c
      alloc_buffer+0xa4/0x144
      __bufio_new+0x84/0x278
      dm_bufio_prefetch+0x9c/0x154
      verity_prefetch_io+0xe8/0x10c
      process_one_work+0x240/0x424
      worker_thread+0x2fc/0x424
      kthread+0x10c/0x114
    
    ...and that looks to be the one holding the mutex.
    
    The problem has been reproduced on fairly easily:
    0. Be running Chrome OS w/ verity enabled on the root filesystem
    1. Pick test patch: http://crosreview.com/412360
    2. Install launchBalloons.sh and balloon.arm from
         http://crbug.com/468342
       ...that's just a memory stress test app.
    3. On a 4GB rk3399 machine, run
         nice ./launchBalloons.sh 4 900 100000
       ...that tries to eat 4 * 900 MB of memory and keep accessing.
    4. Login to the Chrome web browser and restore many tabs
    
    With that, I've seen printouts like:
      DOUG: long bufio 90758 ms
    ...and stack trace always show's we're in dm_bufio_prefetch().
    
    The problem is that we try to allocate memory with GFP_NOIO while
    we're holding the dm_bufio lock.  Instead we should be using
    GFP_NOWAIT.  Using GFP_NOIO can cause us to sleep while holding the
    lock and that causes the above problems.
    
    The current behavior explained by David Rientjes:
    
      It will still try reclaim initially because __GFP_WAIT (or
      __GFP_KSWAPD_RECLAIM) is set by GFP_NOIO.  This is the cause of
      contention on dm_bufio_lock() that the thread holds.  You want to
      pass GFP_NOWAIT instead of GFP_NOIO to alloc_buffer() when holding a
      mutex that can be contended by a concurrent slab shrinker (if
      count_objects didn't use a trylock, this pattern would trivially
      deadlock).
    
    This change significantly increases responsiveness of the system while
    in this state.  It makes a real difference because it unblocks kswapd.
    In the bug report analyzed, kswapd was hung:
    
       kswapd0         D ffffffc000204fd8     0    72      2 0x00000000
       Call trace:
       [<ffffffc000204fd8>] __switch_to+0x9c/0xa8
       [<ffffffc00090b794>] __schedule+0x440/0x6d8
       [<ffffffc00090bac0>] schedule+0x94/0xb4
       [<ffffffc00090be44>] schedule_preempt_disabled+0x28/0x44
       [<ffffffc00090d900>] __mutex_lock_slowpath+0x120/0x1ac
       [<ffffffc00090d9d8>] mutex_lock+0x4c/0x68
       [<ffffffc000708e7c>] dm_bufio_shrink_count+0x38/0x78
       [<ffffffc00030b268>] shrink_slab.part.54.constprop.65+0x100/0x464
       [<ffffffc00030dbd8>] shrink_zone+0xa8/0x198
       [<ffffffc00030e578>] balance_pgdat+0x328/0x508
       [<ffffffc00030eb7c>] kswapd+0x424/0x51c
       [<ffffffc00023f06c>] kthread+0x10c/0x114
       [<ffffffc000203dd0>] ret_from_fork+0x10/0x40
    
    By unblocking kswapd memory pressure should be reduced.
    
    Suggested-by: David Rientjes <rientjes@google.com>
    Reviewed-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 125aedc3875f..b5d3428d109e 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -827,7 +827,8 @@ static struct dm_buffer *__alloc_buffer_wait_no_callback(struct dm_bufio_client
 	 * dm-bufio is resistant to allocation failures (it just keeps
 	 * one buffer reserved in cases all the allocations fail).
 	 * So set flags to not try too hard:
-	 *	GFP_NOIO: don't recurse into the I/O layer
+	 *	GFP_NOWAIT: don't wait; if we need to sleep we'll release our
+	 *		    mutex and wait ourselves.
 	 *	__GFP_NORETRY: don't retry and rather return failure
 	 *	__GFP_NOMEMALLOC: don't use emergency reserves
 	 *	__GFP_NOWARN: don't print a warning in case of failure
@@ -837,7 +838,7 @@ static struct dm_buffer *__alloc_buffer_wait_no_callback(struct dm_bufio_client
 	 */
 	while (1) {
 		if (dm_bufio_cache_size_latch != 1) {
-			b = alloc_buffer(c, GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
+			b = alloc_buffer(c, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
 			if (b)
 				return b;
 		}

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH 3.16, 4.4, 4.9] dm bufio: avoid sleeping while holding the dm_bufio lock
  2018-07-04 14:36 [PATCH 3.16, 4.4, 4.9] dm bufio: avoid sleeping while holding the dm_bufio lock Mikulas Patocka
@ 2018-07-10 14:15 ` Greg Kroah-Hartman
  2018-11-10 23:14 ` Ben Hutchings
  1 sibling, 0 replies; 3+ messages in thread
From: Greg Kroah-Hartman @ 2018-07-10 14:15 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Ben Hutchings, stable, jing xia, David Rientjes, Guenter Roeck,
	Douglas Anderson, Mike Snitzer

On Wed, Jul 04, 2018 at 10:36:58AM -0400, Mikulas Patocka wrote:
> This is backport of the upstream patch 
> 9ea61cac0b1ad0c09022f39fd97e9b99a2cfc2dc. It should be backported to 
> stable kernels because this performance problem was seen on the android 
> 4.4 kernel.

Now applied, thanks.

greg k-h

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 3.16, 4.4, 4.9] dm bufio: avoid sleeping while holding the dm_bufio lock
  2018-07-04 14:36 [PATCH 3.16, 4.4, 4.9] dm bufio: avoid sleeping while holding the dm_bufio lock Mikulas Patocka
  2018-07-10 14:15 ` Greg Kroah-Hartman
@ 2018-11-10 23:14 ` Ben Hutchings
  1 sibling, 0 replies; 3+ messages in thread
From: Ben Hutchings @ 2018-11-10 23:14 UTC (permalink / raw)
  To: Mikulas Patocka, Greg Kroah-Hartman, stable
  Cc: jing xia, David Rientjes, Guenter Roeck, Douglas Anderson,
	Mike Snitzer

[-- Attachment #1: Type: text/plain, Size: 6199 bytes --]

On Wed, 2018-07-04 at 10:36 -0400, Mikulas Patocka wrote:
> This is backport of the upstream patch 
> 9ea61cac0b1ad0c09022f39fd97e9b99a2cfc2dc. It should be backported to 
> stable kernels because this performance problem was seen on the android 
> 4.4 kernel.

I've finally queued this up for 3.16, thanks.

Ben.

> commit 9ea61cac0b1ad0c09022f39fd97e9b99a2cfc2dc
> Author: Douglas Anderson <dianders@chromium.org>
> Date:   Thu Nov 17 11:24:20 2016 -0800
> 
>     dm bufio: avoid sleeping while holding the dm_bufio lock
>     
>     We've seen in-field reports showing _lots_ (18 in one case, 41 in
>     another) of tasks all sitting there blocked on:
>     
>       mutex_lock+0x4c/0x68
>       dm_bufio_shrink_count+0x38/0x78
>       shrink_slab.part.54.constprop.65+0x100/0x464
>       shrink_zone+0xa8/0x198
>     
>     In the two cases analyzed, we see one task that looks like this:
>     
>       Workqueue: kverityd verity_prefetch_io
>     
>       __switch_to+0x9c/0xa8
>       __schedule+0x440/0x6d8
>       schedule+0x94/0xb4
>       schedule_timeout+0x204/0x27c
>       schedule_timeout_uninterruptible+0x44/0x50
>       wait_iff_congested+0x9c/0x1f0
>       shrink_inactive_list+0x3a0/0x4cc
>       shrink_lruvec+0x418/0x5cc
>       shrink_zone+0x88/0x198
>       try_to_free_pages+0x51c/0x588
>       __alloc_pages_nodemask+0x648/0xa88
>       __get_free_pages+0x34/0x7c
>       alloc_buffer+0xa4/0x144
>       __bufio_new+0x84/0x278
>       dm_bufio_prefetch+0x9c/0x154
>       verity_prefetch_io+0xe8/0x10c
>       process_one_work+0x240/0x424
>       worker_thread+0x2fc/0x424
>       kthread+0x10c/0x114
>     
>     ...and that looks to be the one holding the mutex.
>     
>     The problem has been reproduced on fairly easily:
>     0. Be running Chrome OS w/ verity enabled on the root filesystem
>     1. Pick test patch: http://crosreview.com/412360
>     2. Install launchBalloons.sh and balloon.arm from
>          http://crbug.com/468342
>        ...that's just a memory stress test app.
>     3. On a 4GB rk3399 machine, run
>          nice ./launchBalloons.sh 4 900 100000
>        ...that tries to eat 4 * 900 MB of memory and keep accessing.
>     4. Login to the Chrome web browser and restore many tabs
>     
>     With that, I've seen printouts like:
>       DOUG: long bufio 90758 ms
>     ...and stack trace always show's we're in dm_bufio_prefetch().
>     
>     The problem is that we try to allocate memory with GFP_NOIO while
>     we're holding the dm_bufio lock.  Instead we should be using
>     GFP_NOWAIT.  Using GFP_NOIO can cause us to sleep while holding the
>     lock and that causes the above problems.
>     
>     The current behavior explained by David Rientjes:
>     
>       It will still try reclaim initially because __GFP_WAIT (or
>       __GFP_KSWAPD_RECLAIM) is set by GFP_NOIO.  This is the cause of
>       contention on dm_bufio_lock() that the thread holds.  You want to
>       pass GFP_NOWAIT instead of GFP_NOIO to alloc_buffer() when holding a
>       mutex that can be contended by a concurrent slab shrinker (if
>       count_objects didn't use a trylock, this pattern would trivially
>       deadlock).
>     
>     This change significantly increases responsiveness of the system while
>     in this state.  It makes a real difference because it unblocks kswapd.
>     In the bug report analyzed, kswapd was hung:
>     
>        kswapd0         D ffffffc000204fd8     0    72      2 0x00000000
>        Call trace:
>        [<ffffffc000204fd8>] __switch_to+0x9c/0xa8
>        [<ffffffc00090b794>] __schedule+0x440/0x6d8
>        [<ffffffc00090bac0>] schedule+0x94/0xb4
>        [<ffffffc00090be44>] schedule_preempt_disabled+0x28/0x44
>        [<ffffffc00090d900>] __mutex_lock_slowpath+0x120/0x1ac
>        [<ffffffc00090d9d8>] mutex_lock+0x4c/0x68
>        [<ffffffc000708e7c>] dm_bufio_shrink_count+0x38/0x78
>        [<ffffffc00030b268>] shrink_slab.part.54.constprop.65+0x100/0x464
>        [<ffffffc00030dbd8>] shrink_zone+0xa8/0x198
>        [<ffffffc00030e578>] balance_pgdat+0x328/0x508
>        [<ffffffc00030eb7c>] kswapd+0x424/0x51c
>        [<ffffffc00023f06c>] kthread+0x10c/0x114
>        [<ffffffc000203dd0>] ret_from_fork+0x10/0x40
>     
>     By unblocking kswapd memory pressure should be reduced.
>     
>     Suggested-by: David Rientjes <rientjes@google.com>
>     Reviewed-by: Guenter Roeck <linux@roeck-us.net>
>     Signed-off-by: Douglas Anderson <dianders@chromium.org>
>     Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> 
> diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
> index 125aedc3875f..b5d3428d109e 100644
> --- a/drivers/md/dm-bufio.c
> +++ b/drivers/md/dm-bufio.c
> @@ -827,7 +827,8 @@ static struct dm_buffer *__alloc_buffer_wait_no_callback(struct dm_bufio_client
>          * dm-bufio is resistant to allocation failures (it just keeps
>          * one buffer reserved in cases all the allocations fail).
>          * So set flags to not try too hard:
> -        *      GFP_NOIO: don't recurse into the I/O layer
> +        *      GFP_NOWAIT: don't wait; if we need to sleep we'll release our
> +        *                  mutex and wait ourselves.
>          *      __GFP_NORETRY: don't retry and rather return failure
>          *      __GFP_NOMEMALLOC: don't use emergency reserves
>          *      __GFP_NOWARN: don't print a warning in case of failure
> @@ -837,7 +838,7 @@ static struct dm_buffer *__alloc_buffer_wait_no_callback(struct dm_bufio_client
>          */
>         while (1) {
>                 if (dm_bufio_cache_size_latch != 1) {
> -                       b = alloc_buffer(c, GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
> +                       b = alloc_buffer(c, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
>                         if (b)
>                                 return b;
>                 }
-- 
Ben Hutchings
Kids!  Bringing about Armageddon can be dangerous.  Do not attempt it
in your own home. - Terry Pratchett and Neil Gaiman, `Good Omens'



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-11-11  9:01 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-07-04 14:36 [PATCH 3.16, 4.4, 4.9] dm bufio: avoid sleeping while holding the dm_bufio lock Mikulas Patocka
2018-07-10 14:15 ` Greg Kroah-Hartman
2018-11-10 23:14 ` Ben Hutchings

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.