From: Andrea Arcangeli <aarcange@redhat.com>
To: Alexander Graf <agraf@suse.de>
Cc: Bharat Bhushan <Bharat.Bhushan@freescale.com>,
"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
"kvm-ppc@vger.kernel.org" <kvm-ppc@vger.kernel.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
Scott Wood <scottwood@freescale.com>,
Ben Herrenschmidt <benh@kernel.crashing.org>
Subject: Re: Error in frreing hugepages with preemption enabled
Date: Tue, 03 Dec 2013 22:21:21 +0000 [thread overview]
Message-ID: <20131203222121.GB18764@redhat.com> (raw)
In-Reply-To: <EDB02D3B-6B03-4CCB-95C8-FDF2F2932E1D@suse.de>
Hi everyone,
On Fri, Nov 29, 2013 at 12:13:03PM +0100, Alexander Graf wrote:
>
> On 29.11.2013, at 05:38, Bharat Bhushan <Bharat.Bhushan@freescale.com> wrote:
>
> > Hi Alex,
> >
> > I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With allocated pages things seems to work fine but I uses hugepages for guest I see below prints when "quit" from qemu.
> >
> > (qemu) QEMU waiting for connection on: telnet:0.0.0.0:4444,server
> > qemu-system-ppc64: pci_add_option_rom: failed to find romfile "efi-virtio.rom"
> > q
> > debug_smp_processor_id: 15 callbacks suppressed
> > BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-ppc/2504
> > caller is .free_hugepd_range+0xb0/0x21c
> > CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175
> > Call Trace:
> > [c0000000fb433400] [c000000000007d38] .show_stack+0x7c/0x1cc (unreliable)
> > [c0000000fb4334d0] [c0000000005e8ce0] .dump_stack+0x9c/0xf4
> > [c0000000fb433560] [c0000000002de5ec] .debug_smp_processor_id+0x108/0x11c
> > [c0000000fb4335f0] [c000000000025e10] .free_hugepd_range+0xb0/0x21c
> > [c0000000fb433680] [c0000000000265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0
> > [c0000000fb4337a0] [c0000000000e428c] .free_pgtables+0x14c/0x158
> > [c0000000fb433840] [c0000000000ef320] .exit_mmap+0xec/0x194
> > [c0000000fb433960] [c00000000004d780] .mmput+0x64/0x124
> > [c0000000fb4339e0] [c000000000051f40] .do_exit+0x29c/0x9c8
> > [c0000000fb433ae0] [c0000000000527c8] .do_group_exit+0x50/0xc4
> > [c0000000fb433b70] [c0000000000606a0] .get_signal_to_deliver+0x21c/0x5d8
> > [c0000000fb433c70] [c000000000009b08] .do_signal+0x54/0x278
> > [c0000000fb433db0] [c000000000009e50] .do_notify_resume+0x64/0x78
> > [c0000000fb433e30] [c000000000000b44] .ret_from_except_lite+0x70/0x74
> >
> >
> > This mean that free_hugepd_range() must be called with preemption enabled.
>
> with preemption disabled.
>
> > I tried below change and this seems to work fine (I am not having expertise in this area so not sure this is correct way)
>
> Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows what to do :).
:) So I had a look at the top of this function (0xb0) in the upstream
kernel and no smp_processor_id() call is apparent, is this stock git
or a ppc tree? The first few calls seem not to call it but I may have
overlooked something. It's just quicker if somebody with vmlinux finds
the location of it.
static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshift,
unsigned long start, unsigned long end,
unsigned long floor, unsigned long ceiling)
{
pte_t *hugepte = hugepd_page(*hpdp);
int i;
unsigned long pdmask = ~((1UL << pdshift) - 1);
unsigned int num_hugepd = 1;
#ifdef CONFIG_PPC_FSL_BOOK3E
/* Note: On fsl the hpdp may be the first of several */
num_hugepd = (1 << (hugepd_shift(*hpdp) - pdshift));
#else
unsigned int shift = hugepd_shift(*hpdp);
#endif
start &= pdmask;
if (start < floor)
return;
if (ceiling) {
ceiling &= pdmask;
if (! ceiling)
return;
}
if (end - 1 > ceiling - 1)
return;
for (i = 0; i < num_hugepd; i++, hpdp++)
hpdp->pd = 0;
tlb->need_flush = 1;
#ifdef CONFIG_PPC_FSL_BOOK3E
hugepd_free(tlb, hugepte);
#else
pgtable_free_tlb(tlb, hugepte, pdshift - shift);
#endif
}
Generally smp_processor_id should never be used, exactly to avoid
problems like above with preempion enabled in .config.
Instead it should be replaced with a get_cpu()/put_cpu() pair that is
exactly meant to fix bugs like this and define proper critical
sections around the per-cpu variables.
#define get_cpu() ({ preempt_disable(); smp_processor_id(); })
#define put_cpu() preempt_enable()
After you find where that smp_processor_id() is located, you should
simply replace it with a get_cpu() and then you should insert a
put_cpu immediately after the "cpu" info is not used anymore. That
will define a proper and strict critical section around the use of the
per-cpu variables.
With a ppc vmlinux it should be immediate to find the location of
smp_processor_id but I don't have the ppc vmlinux here.
Thanks!
Andrea
>
>
> Alex
>
> >
> > diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> > index d67db4b..6bf8459 100644
> > --- a/arch/powerpc/mm/hugetlbpage.c
> > +++ b/arch/powerpc/mm/hugetlbpage.c
> > @@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
> > */
> > next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
> > #endif
> > + preempt_disable();
> > free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
> > addr, next, floor, ceiling);
> > + preempt_enable();
> > } while (addr = next, addr != end);
> >
> > start &= PUD_MASK;
> >
> >
> > Thanks
> > -Bharat
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
WARNING: multiple messages have this Message-ID (diff)
From: Andrea Arcangeli <aarcange@redhat.com>
To: Alexander Graf <agraf@suse.de>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"kvm-ppc@vger.kernel.org" <kvm-ppc@vger.kernel.org>,
Bharat Bhushan <Bharat.Bhushan@freescale.com>,
Scott Wood <scottwood@freescale.com>,
"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: Error in frreing hugepages with preemption enabled
Date: Tue, 3 Dec 2013 23:21:21 +0100 [thread overview]
Message-ID: <20131203222121.GB18764@redhat.com> (raw)
In-Reply-To: <EDB02D3B-6B03-4CCB-95C8-FDF2F2932E1D@suse.de>
Hi everyone,
On Fri, Nov 29, 2013 at 12:13:03PM +0100, Alexander Graf wrote:
>
> On 29.11.2013, at 05:38, Bharat Bhushan <Bharat.Bhushan@freescale.com> wrote:
>
> > Hi Alex,
> >
> > I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With allocated pages things seems to work fine but I uses hugepages for guest I see below prints when "quit" from qemu.
> >
> > (qemu) QEMU waiting for connection on: telnet:0.0.0.0:4444,server
> > qemu-system-ppc64: pci_add_option_rom: failed to find romfile "efi-virtio.rom"
> > q
> > debug_smp_processor_id: 15 callbacks suppressed
> > BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-ppc/2504
> > caller is .free_hugepd_range+0xb0/0x21c
> > CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175
> > Call Trace:
> > [c0000000fb433400] [c000000000007d38] .show_stack+0x7c/0x1cc (unreliable)
> > [c0000000fb4334d0] [c0000000005e8ce0] .dump_stack+0x9c/0xf4
> > [c0000000fb433560] [c0000000002de5ec] .debug_smp_processor_id+0x108/0x11c
> > [c0000000fb4335f0] [c000000000025e10] .free_hugepd_range+0xb0/0x21c
> > [c0000000fb433680] [c0000000000265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0
> > [c0000000fb4337a0] [c0000000000e428c] .free_pgtables+0x14c/0x158
> > [c0000000fb433840] [c0000000000ef320] .exit_mmap+0xec/0x194
> > [c0000000fb433960] [c00000000004d780] .mmput+0x64/0x124
> > [c0000000fb4339e0] [c000000000051f40] .do_exit+0x29c/0x9c8
> > [c0000000fb433ae0] [c0000000000527c8] .do_group_exit+0x50/0xc4
> > [c0000000fb433b70] [c0000000000606a0] .get_signal_to_deliver+0x21c/0x5d8
> > [c0000000fb433c70] [c000000000009b08] .do_signal+0x54/0x278
> > [c0000000fb433db0] [c000000000009e50] .do_notify_resume+0x64/0x78
> > [c0000000fb433e30] [c000000000000b44] .ret_from_except_lite+0x70/0x74
> >
> >
> > This mean that free_hugepd_range() must be called with preemption enabled.
>
> with preemption disabled.
>
> > I tried below change and this seems to work fine (I am not having expertise in this area so not sure this is correct way)
>
> Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows what to do :).
:) So I had a look at the top of this function (0xb0) in the upstream
kernel and no smp_processor_id() call is apparent, is this stock git
or a ppc tree? The first few calls seem not to call it but I may have
overlooked something. It's just quicker if somebody with vmlinux finds
the location of it.
static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshift,
unsigned long start, unsigned long end,
unsigned long floor, unsigned long ceiling)
{
pte_t *hugepte = hugepd_page(*hpdp);
int i;
unsigned long pdmask = ~((1UL << pdshift) - 1);
unsigned int num_hugepd = 1;
#ifdef CONFIG_PPC_FSL_BOOK3E
/* Note: On fsl the hpdp may be the first of several */
num_hugepd = (1 << (hugepd_shift(*hpdp) - pdshift));
#else
unsigned int shift = hugepd_shift(*hpdp);
#endif
start &= pdmask;
if (start < floor)
return;
if (ceiling) {
ceiling &= pdmask;
if (! ceiling)
return;
}
if (end - 1 > ceiling - 1)
return;
for (i = 0; i < num_hugepd; i++, hpdp++)
hpdp->pd = 0;
tlb->need_flush = 1;
#ifdef CONFIG_PPC_FSL_BOOK3E
hugepd_free(tlb, hugepte);
#else
pgtable_free_tlb(tlb, hugepte, pdshift - shift);
#endif
}
Generally smp_processor_id should never be used, exactly to avoid
problems like above with preempion enabled in .config.
Instead it should be replaced with a get_cpu()/put_cpu() pair that is
exactly meant to fix bugs like this and define proper critical
sections around the per-cpu variables.
#define get_cpu() ({ preempt_disable(); smp_processor_id(); })
#define put_cpu() preempt_enable()
After you find where that smp_processor_id() is located, you should
simply replace it with a get_cpu() and then you should insert a
put_cpu immediately after the "cpu" info is not used anymore. That
will define a proper and strict critical section around the use of the
per-cpu variables.
With a ppc vmlinux it should be immediate to find the location of
smp_processor_id but I don't have the ppc vmlinux here.
Thanks!
Andrea
>
>
> Alex
>
> >
> > diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> > index d67db4b..6bf8459 100644
> > --- a/arch/powerpc/mm/hugetlbpage.c
> > +++ b/arch/powerpc/mm/hugetlbpage.c
> > @@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
> > */
> > next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
> > #endif
> > + preempt_disable();
> > free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
> > addr, next, floor, ceiling);
> > + preempt_enable();
> > } while (addr = next, addr != end);
> >
> > start &= PUD_MASK;
> >
> >
> > Thanks
> > -Bharat
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
WARNING: multiple messages have this Message-ID (diff)
From: Andrea Arcangeli <aarcange@redhat.com>
To: Alexander Graf <agraf@suse.de>
Cc: Bharat Bhushan <Bharat.Bhushan@freescale.com>,
"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
"kvm-ppc@vger.kernel.org" <kvm-ppc@vger.kernel.org>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
Scott Wood <scottwood@freescale.com>,
Ben Herrenschmidt <benh@kernel.crashing.org>
Subject: Re: Error in frreing hugepages with preemption enabled
Date: Tue, 3 Dec 2013 23:21:21 +0100 [thread overview]
Message-ID: <20131203222121.GB18764@redhat.com> (raw)
In-Reply-To: <EDB02D3B-6B03-4CCB-95C8-FDF2F2932E1D@suse.de>
Hi everyone,
On Fri, Nov 29, 2013 at 12:13:03PM +0100, Alexander Graf wrote:
>
> On 29.11.2013, at 05:38, Bharat Bhushan <Bharat.Bhushan@freescale.com> wrote:
>
> > Hi Alex,
> >
> > I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With allocated pages things seems to work fine but I uses hugepages for guest I see below prints when "quit" from qemu.
> >
> > (qemu) QEMU waiting for connection on: telnet:0.0.0.0:4444,server
> > qemu-system-ppc64: pci_add_option_rom: failed to find romfile "efi-virtio.rom"
> > q
> > debug_smp_processor_id: 15 callbacks suppressed
> > BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-ppc/2504
> > caller is .free_hugepd_range+0xb0/0x21c
> > CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175
> > Call Trace:
> > [c0000000fb433400] [c000000000007d38] .show_stack+0x7c/0x1cc (unreliable)
> > [c0000000fb4334d0] [c0000000005e8ce0] .dump_stack+0x9c/0xf4
> > [c0000000fb433560] [c0000000002de5ec] .debug_smp_processor_id+0x108/0x11c
> > [c0000000fb4335f0] [c000000000025e10] .free_hugepd_range+0xb0/0x21c
> > [c0000000fb433680] [c0000000000265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0
> > [c0000000fb4337a0] [c0000000000e428c] .free_pgtables+0x14c/0x158
> > [c0000000fb433840] [c0000000000ef320] .exit_mmap+0xec/0x194
> > [c0000000fb433960] [c00000000004d780] .mmput+0x64/0x124
> > [c0000000fb4339e0] [c000000000051f40] .do_exit+0x29c/0x9c8
> > [c0000000fb433ae0] [c0000000000527c8] .do_group_exit+0x50/0xc4
> > [c0000000fb433b70] [c0000000000606a0] .get_signal_to_deliver+0x21c/0x5d8
> > [c0000000fb433c70] [c000000000009b08] .do_signal+0x54/0x278
> > [c0000000fb433db0] [c000000000009e50] .do_notify_resume+0x64/0x78
> > [c0000000fb433e30] [c000000000000b44] .ret_from_except_lite+0x70/0x74
> >
> >
> > This mean that free_hugepd_range() must be called with preemption enabled.
>
> with preemption disabled.
>
> > I tried below change and this seems to work fine (I am not having expertise in this area so not sure this is correct way)
>
> Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows what to do :).
:) So I had a look at the top of this function (0xb0) in the upstream
kernel and no smp_processor_id() call is apparent, is this stock git
or a ppc tree? The first few calls seem not to call it but I may have
overlooked something. It's just quicker if somebody with vmlinux finds
the location of it.
static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshift,
unsigned long start, unsigned long end,
unsigned long floor, unsigned long ceiling)
{
pte_t *hugepte = hugepd_page(*hpdp);
int i;
unsigned long pdmask = ~((1UL << pdshift) - 1);
unsigned int num_hugepd = 1;
#ifdef CONFIG_PPC_FSL_BOOK3E
/* Note: On fsl the hpdp may be the first of several */
num_hugepd = (1 << (hugepd_shift(*hpdp) - pdshift));
#else
unsigned int shift = hugepd_shift(*hpdp);
#endif
start &= pdmask;
if (start < floor)
return;
if (ceiling) {
ceiling &= pdmask;
if (! ceiling)
return;
}
if (end - 1 > ceiling - 1)
return;
for (i = 0; i < num_hugepd; i++, hpdp++)
hpdp->pd = 0;
tlb->need_flush = 1;
#ifdef CONFIG_PPC_FSL_BOOK3E
hugepd_free(tlb, hugepte);
#else
pgtable_free_tlb(tlb, hugepte, pdshift - shift);
#endif
}
Generally smp_processor_id should never be used, exactly to avoid
problems like above with preempion enabled in .config.
Instead it should be replaced with a get_cpu()/put_cpu() pair that is
exactly meant to fix bugs like this and define proper critical
sections around the per-cpu variables.
#define get_cpu() ({ preempt_disable(); smp_processor_id(); })
#define put_cpu() preempt_enable()
After you find where that smp_processor_id() is located, you should
simply replace it with a get_cpu() and then you should insert a
put_cpu immediately after the "cpu" info is not used anymore. That
will define a proper and strict critical section around the use of the
per-cpu variables.
With a ppc vmlinux it should be immediate to find the location of
smp_processor_id but I don't have the ppc vmlinux here.
Thanks!
Andrea
>
>
> Alex
>
> >
> > diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> > index d67db4b..6bf8459 100644
> > --- a/arch/powerpc/mm/hugetlbpage.c
> > +++ b/arch/powerpc/mm/hugetlbpage.c
> > @@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
> > */
> > next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
> > #endif
> > + preempt_disable();
> > free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
> > addr, next, floor, ceiling);
> > + preempt_enable();
> > } while (addr = next, addr != end);
> >
> > start &= PUD_MASK;
> >
> >
> > Thanks
> > -Bharat
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2013-12-03 22:21 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-11-29 4:38 Error in frreing hugepages with preemption enabled Bharat Bhushan
2013-11-29 4:38 ` Bharat Bhushan
2013-11-29 11:13 ` Alexander Graf
2013-11-29 11:13 ` Alexander Graf
2013-11-29 11:13 ` Alexander Graf
2013-12-03 22:21 ` Andrea Arcangeli [this message]
2013-12-03 22:21 ` Andrea Arcangeli
2013-12-03 22:21 ` Andrea Arcangeli
2013-12-04 2:22 ` Benjamin Herrenschmidt
2013-12-04 2:22 ` Benjamin Herrenschmidt
2013-12-04 2:22 ` Benjamin Herrenschmidt
2013-12-05 11:14 ` Bharat Bhushan
2013-12-05 11:14 ` Bharat Bhushan
2013-12-05 11:14 ` Bharat Bhushan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131203222121.GB18764@redhat.com \
--to=aarcange@redhat.com \
--cc=Bharat.Bhushan@freescale.com \
--cc=agraf@suse.de \
--cc=benh@kernel.crashing.org \
--cc=kvm-ppc@vger.kernel.org \
--cc=kvm@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=scottwood@freescale.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.