linux-mm.kvack.org archive mirror
* [PATCH] kasan: resched in quarantine_remove_cache()
@ 2017-03-08 15:42 Dmitry Vyukov
  2017-03-08 23:05 ` Andrew Morton
  2017-03-09  8:37 ` Andrey Ryabinin
  0 siblings, 2 replies; 3+ messages in thread
From: Dmitry Vyukov @ 2017-03-08 15:42 UTC
  To: aryabinin, linux-mm, akpm; +Cc: Dmitry Vyukov, kasan-dev, Greg Thelen

We see reported stalls/lockups in quarantine_remove_cache() on machines
with large amounts of RAM. quarantine_remove_cache() needs to scan the
whole quarantine in order to take out all objects belonging to the cache.
The quarantine is currently 1/32 of RAM, e.g. on a machine with 256GB
of memory that will be 8GB. Moreover, quarantine scanning is a walk
over an uncached linked list, which is slow.

Add cond_resched() after scanning each non-empty batch of objects.
Batches are deliberately kept to a reasonable size for quarantine_put().
On a machine with 256GB of RAM we should have ~512 non-empty batches,
each with 16MB of objects.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: kasan-dev@googlegroups.com
Cc: linux-mm@kvack.org
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Greg Thelen <gthelen@google.com>
---
 mm/kasan/quarantine.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index 075422c3cee3..3021d2976dd6 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -311,8 +311,15 @@ void quarantine_remove_cache(struct kmem_cache *cache)
 	on_each_cpu(per_cpu_remove_cache, cache, 1);
 
 	spin_lock_irqsave(&quarantine_lock, flags);
-	for (i = 0; i < QUARANTINE_BATCHES; i++)
+	for (i = 0; i < QUARANTINE_BATCHES; i++) {
+		if (qlist_empty(&global_quarantine[i]))
+			continue;
 		qlist_move_cache(&global_quarantine[i], &to_free, cache);
+		/* Scanning whole quarantine can take a while. */
+		spin_unlock_irqrestore(&quarantine_lock, flags);
+		cond_resched();
+		spin_lock_irqsave(&quarantine_lock, flags);
+	}
 	spin_unlock_irqrestore(&quarantine_lock, flags);
 
 	qlist_free_all(&to_free, cache);
-- 
2.12.0.246.ga2ecc84866-goog

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .


* Re: [PATCH] kasan: resched in quarantine_remove_cache()
  2017-03-08 15:42 [PATCH] kasan: resched in quarantine_remove_cache() Dmitry Vyukov
@ 2017-03-08 23:05 ` Andrew Morton
  2017-03-09  8:37 ` Andrey Ryabinin
  1 sibling, 0 replies; 3+ messages in thread
From: Andrew Morton @ 2017-03-08 23:05 UTC
  To: Dmitry Vyukov; +Cc: aryabinin, linux-mm, kasan-dev, Greg Thelen

On Wed,  8 Mar 2017 16:42:39 +0100 Dmitry Vyukov <dvyukov@google.com> wrote:

> We see reported stalls/lockups in quarantine_remove_cache() on machines
> with large amounts of RAM. quarantine_remove_cache() needs to scan the
> whole quarantine in order to take out all objects belonging to the cache.
> The quarantine is currently 1/32 of RAM, e.g. on a machine with 256GB
> of memory that will be 8GB. Moreover, quarantine scanning is a walk
> over an uncached linked list, which is slow.
> 
> Add cond_resched() after scanning each non-empty batch of objects.
> Batches are deliberately kept to a reasonable size for quarantine_put().
> On a machine with 256GB of RAM we should have ~512 non-empty batches,
> each with 16MB of objects.

I'll add cc:stable to this one - softlockup reports on large machines
are a pretty significant issue.


* Re: [PATCH] kasan: resched in quarantine_remove_cache()
  2017-03-08 15:42 [PATCH] kasan: resched in quarantine_remove_cache() Dmitry Vyukov
  2017-03-08 23:05 ` Andrew Morton
@ 2017-03-09  8:37 ` Andrey Ryabinin
  1 sibling, 0 replies; 3+ messages in thread
From: Andrey Ryabinin @ 2017-03-09  8:37 UTC
  To: Dmitry Vyukov, linux-mm, akpm; +Cc: kasan-dev, Greg Thelen

On 03/08/2017 06:42 PM, Dmitry Vyukov wrote:
> We see reported stalls/lockups in quarantine_remove_cache() on machines
> with large amounts of RAM. quarantine_remove_cache() needs to scan the
> whole quarantine in order to take out all objects belonging to the cache.
> The quarantine is currently 1/32 of RAM, e.g. on a machine with 256GB
> of memory that will be 8GB. Moreover, quarantine scanning is a walk
> over an uncached linked list, which is slow.
> 
> Add cond_resched() after scanning each non-empty batch of objects.
> Batches are deliberately kept to a reasonable size for quarantine_put().
> On a machine with 256GB of RAM we should have ~512 non-empty batches,
> each with 16MB of objects.
> 
> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> Cc: kasan-dev@googlegroups.com
> Cc: linux-mm@kvack.org
> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Greg Thelen <gthelen@google.com>

Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
