From: David Vrabel <david.vrabel@citrix.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Len Brown <lenb@kernel.org>,
"Rafael J. Wysocki" <rjw@rjwysocki.net>,
linux-acpi@vger.kernel.org,
xen-devel <xen-devel@lists.xenproject.org>,
Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Subject: Re: vunmap() on large regions may trigger soft lockup warnings
Date: Thu, 12 Dec 2013 12:50:47 +0000
Message-ID: <52A9B127.9010501@citrix.com>
In-Reply-To: <20131211133917.dd10cb2c4360dba65d8e6ce2@linux-foundation.org>
On 11/12/13 21:39, Andrew Morton wrote:
> On Wed, 11 Dec 2013 16:58:19 +0000 David Vrabel <david.vrabel@citrix.com> wrote:
>
>> Andrew,
>>
>> Dietmar Hahn reported an issue where calling vunmap() on a large (50 GB)
>> region would trigger soft lockup warnings.
>>
>> The following patch would resolve this (by adding a cond_resched() call
>> to vunmap_pmd_range()). Almost all calls of vunmap() and
>> unmap_kernel_range() are from process context (as far as I could tell),
>> except for an ACPI driver (drivers/acpi/apei/ghes.c) that calls
>> unmap_kernel_range_noflush() from interrupt and NMI contexts.
>>
>> Can you advise on a preferred solution?
>>
>> For example, an unmap_kernel_page() function (callable from atomic
>> context) could be provided since the GHES driver only maps/unmaps a
>> single page.
>>
>> 8<-------------------------
>> mm/vmalloc: avoid soft lockup warnings when vunmap()'ing large ranges
>>
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> If vunmap() is used to unmap a large (e.g., 50 GB) region, it may take
>> sufficiently long that it triggers soft lockup warnings.
>>
>> Add a cond_resched() into vunmap_pmd_range() so the calling task may
>> be rescheduled after unmapping each PMD entry. This is how
>> zap_pmd_range() fixes the same problem for userspace mappings.
>>
>> ...
>>
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -75,6 +75,7 @@ static void vunmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end)
>>  		if (pmd_none_or_clear_bad(pmd))
>>  			continue;
>>  		vunmap_pte_range(pmd, addr, next);
>> +		cond_resched();
>>  	} while (pmd++, addr = next, addr != end);
>>  }
>
> Well that's ugly.
>
> We could redo unmap_kernel_range() so it takes an `atomic' flag then
> loops around unmapping N MB at a time, doing
>
> 	if (!atomic)
> 		cond_resched()
>
> each time. But that would require difficult tuning of N.
>
> I suppose we could just do
>
> 	if (!in_interrupt())
> 		cond_resched();
>
> in vunmap_pmd_range(), but that's pretty specific to ghes.c and doesn't
> permit unmap-inside-spinlock.
>
> So I can't immediately think of a suitable fix apart from adding a new
> unmap_kernel_range_atomic(). Then add a `bool atomic' arg to
> vunmap_page_range() and pass that all the way down.

That would work for the unmap, but looking at the GHES driver some more,
it looks like its call to ioremap_page_range() is already unsafe --
it may need to allocate a new PTE page with a non-atomic alloc in
pte_alloc_one_kernel().

Perhaps what's needed here is a pair of ioremap_page_atomic() and
iounmap_page_atomic() calls? With some prep function to ensure the PTE
pages (etc.) are preallocated.
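
To make the shape of that idea concrete, here is a rough user-space model
in C. It is only a sketch under stated assumptions, not kernel code: every
model_* name is hypothetical, the "page table" is a plain array, and calloc()
stands in for the allocation that may sleep. The point it illustrates is the
split: the prep step does all allocation up front in process context, so the
atomic map/unmap pair only ever writes an entry that already exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* User-space model only: none of these names exist in the kernel. */
#define MODEL_PTES_PER_TABLE 512

struct model_pte_table {
	uint64_t pte[MODEL_PTES_PER_TABLE];	/* 0 means "not mapped" */
};

struct model_atomic_slot {
	struct model_pte_table *table;	/* preallocated by the prep step */
	size_t index;			/* the single entry this slot owns */
};

/*
 * Prep step: the only place that allocates. In a real kernel this is
 * the part that could sleep, so it would run in process context.
 */
static int model_slot_prepare(struct model_atomic_slot *slot, size_t index)
{
	if (index >= MODEL_PTES_PER_TABLE)
		return -1;
	slot->table = calloc(1, sizeof(*slot->table));
	if (!slot->table)
		return -1;
	slot->index = index;
	return 0;
}

/* Map one page: writes the preallocated entry and never allocates. */
static bool model_map_page_atomic(struct model_atomic_slot *slot, uint64_t pfn)
{
	uint64_t *pte = &slot->table->pte[slot->index];

	if (*pte != 0)
		return false;		/* slot is already in use */
	*pte = (pfn << 12) | 1;		/* frame number plus a "present" bit */
	return true;
}

/* Unmap one page: clears the entry; no allocation, no rescheduling. */
static void model_unmap_page_atomic(struct model_atomic_slot *slot)
{
	slot->table->pte[slot->index] = 0;
}

/* Teardown: frees the table allocated by the prep step. */
static void model_slot_destroy(struct model_atomic_slot *slot)
{
	free(slot->table);
	slot->table = NULL;
}
```

A caller would run model_slot_prepare() once at init time, then use the
map/unmap pair from the atomic path. Whether the real page table levels can
be pinned this way for GHES is exactly the open question above.
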
David