linux-mm.kvack.org archive mirror
* Re: arm64 flushing 255GB of vmalloc space takes too long
@ 2014-07-09 16:53 Eric Miao
From: Eric Miao @ 2014-07-09 16:53 UTC
  To: Laura Abbott
  Cc: linux-arm-kernel@lists.infradead.org,
	Linux Memory Management List, Catalin Marinas, Will Deacon,
	Russell King

On Tue, Jul 8, 2014 at 6:43 PM, Laura Abbott <lauraa@codeaurora.org> wrote:
>
> Hi,
>
> I have an arm64 target which has been observed hanging in __purge_vmap_area_lazy
> in vmalloc.c. The root cause of this 'hang' is that flush_tlb_kernel_range is
> attempting to flush 255GB of virtual address space. This takes ~2 seconds, and
> preemption is disabled for that time thanks to the purge lock. Disabling
> preemption for that long is enough to trigger a watchdog we have set up.
>
> Triggering this is fairly easy (see the sketch after this list):
> 1) Early in bootup, vmalloc > lazy_max_pages. This gives an address near the
> start of the vmalloc range.
> 2) load a module
> 3) vfree the vmalloc region from step 1
> 4) unload the module
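
As a rough test-module approximation of the vmalloc/vfree side (steps 1 and 3);
the module name and allocation size are illustrative, not from the report above:

	#include <linux/module.h>
	#include <linux/vmalloc.h>

	static int __init vmalloc_repro_init(void)
	{
		/* a large allocation made early lands near the bottom of
		 * the vmalloc range; the size here is illustrative */
		void *buf = vmalloc(512UL << 20);

		if (!buf)
			return -ENOMEM;
		/* vfree only queues the area for a lazy purge; the flush
		 * happens later, e.g. when a module unload triggers the
		 * purge, spanning from this low address up to the module
		 * mapping near the top of the address space */
		vfree(buf);
		return 0;
	}

	static void __exit vmalloc_repro_exit(void)
	{
	}

	module_init(vmalloc_repro_init);
	module_exit(vmalloc_repro_exit);
	MODULE_LICENSE("GPL");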
>
> The arm64 virtual address layout looks like
> vmalloc : 0xffffff8000000000 - 0xffffffbbffff0000   (245759 MB)
> vmemmap : 0xffffffbc02400000 - 0xffffffbc03600000   (    18 MB)
> modules : 0xffffffbffc000000 - 0xffffffc000000000   (    64 MB)
>
> and the algorithm in __purge_vmap_area_lazy flushes the entire span between the
> lowest and highest addresses queued for purging. With a vfree near the vmalloc
> base (0xffffff8000000000) and a module unload near the top of the modules area
> (0xffffffc000000000), that span is nearly the whole 256GB. Essentially, if we
> are using a reasonable amount of vmalloc space and a module unload triggers a
> vmalloc purge, we will end up triggering our watchdog.
>
> A couple of options I thought of:
> 1) Increase the timeout of our watchdog to allow the flush to occur. Nobody
> I suggested this to likes the idea; the watchdog firing generally catches
> behavior that results in poor system performance, and disabling preemption
> for that long does seem like a problem.
> 2) Change __purge_vmap_area_lazy to do less work under a spinlock. This would
> certainly have a performance impact and I don't even know if it is plausible.
> 3) Allow module unloading to trigger a vmalloc purge beforehand to help avoid
> this case. This would still be racy if another vfree came in during the time
> between the purge and the vfree but it might be good enough.
> 4) Add 'if size > threshold flush entire tlb' (I haven't profiled this yet)

We have the same problem. I'd agree with points 2 and 4; points 1 and 3 do not
actually fix the issue, since purge_vmap_area_lazy() can be called from other
paths as well.

W.r.t. the threshold for flushing the entire TLB instead of doing it page by
page: that could differ from platform to platform. And considering the cost of
a TLB flush on x86, I wonder why this isn't an issue on x86.
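
For arm64 the check could live in flush_tlb_kernel_range() itself, along these
lines (a sketch only: the cutoff value is a placeholder, and
__flush_tlb_kernel_range() stands for the existing page-by-page loop):

	/* past the cutoff, one full flush beats per-page invalidation */
	#define MAX_TLB_RANGE	(1024UL << PAGE_SHIFT)	/* placeholder cutoff */

	static inline void flush_tlb_kernel_range(unsigned long start,
						  unsigned long end)
	{
		if ((end - start) <= MAX_TLB_RANGE)
			__flush_tlb_kernel_range(start, end);
		else
			flush_tlb_all();
	}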

The whole of __purge_vmap_area_lazy() is protected by a single spinlock. I see
no reason why a mutex could not be used there instead, which would allow
preemption during this likely lengthy process.
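
In outline (a sketch, not an actual diff; purge_vmap_areas() is a hypothetical
stand-in for the locking pattern in __purge_vmap_area_lazy()):

	#include <linux/mutex.h>

	static DEFINE_MUTEX(purge_lock);  /* was: static DEFINE_SPINLOCK(purge_lock) */

	static void purge_vmap_areas(bool sync, bool force_flush)
	{
		if (!sync && !force_flush) {
			if (!mutex_trylock(&purge_lock))
				return;	/* another purge is already in flight */
		} else {
			mutex_lock(&purge_lock);	/* may sleep; stays preemptible */
		}

		/* walk the lazy list and do the TLB flush here, now preemptibly */

		mutex_unlock(&purge_lock);
	}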

The rbtree removal seems heavy too: in the worst case, __free_vmap_area() is
called lazy_max_pages() times, all under a single spinlock held across the
whole traversal, which is not necessary.
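
One way to bound the hold time, assuming the purge lock above has become a
mutex (a sketch: free_purge_list() and the batch size are illustrative, the
other names match mm/vmalloc.c of this era):

	#define PURGE_BATCH	64	/* arbitrary batch size */

	static void free_purge_list(struct list_head *valist)
	{
		struct vmap_area *va, *n_va;
		int batch = 0;

		spin_lock(&vmap_area_lock);
		list_for_each_entry_safe(va, n_va, valist, purge_list) {
			__free_vmap_area(va);	/* rbtree removal, needs the lock */
			if (++batch >= PURGE_BATCH) {
				batch = 0;
				spin_unlock(&vmap_area_lock);
				cond_resched();	/* let preemption happen */
				spin_lock(&vmap_area_lock);
			}
		}
		spin_unlock(&vmap_area_lock);
	}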

CC+ Russell, Catalin, Will.

We have a patch as below:

============================ >8 =========================


Thread overview: 9+ messages
2014-07-09 16:53 arm64 flushing 255GB of vmalloc space takes too long Eric Miao
2014-07-09 17:40 ` Catalin Marinas
2014-07-09 18:04   ` Eric Miao
2014-07-11  1:26     ` Laura Abbott
2014-07-11 12:45       ` Catalin Marinas
2014-07-23 21:25         ` Mark Salter
2014-07-24 14:24           ` Catalin Marinas
2014-07-24 14:56             ` [PATCH] arm64: fix soft lockup due to large tlb flush range Mark Salter
2014-07-24 17:47               ` Catalin Marinas
