All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Alexander Graf <agraf@suse.de>
Cc: Bharat Bhushan <Bharat.Bhushan@freescale.com>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"kvm-ppc@vger.kernel.org" <kvm-ppc@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Scott Wood <scottwood@freescale.com>,
	Ben Herrenschmidt <benh@kernel.crashing.org>
Subject: Re: Error in frreing hugepages with preemption enabled
Date: Tue, 03 Dec 2013 22:21:21 +0000	[thread overview]
Message-ID: <20131203222121.GB18764@redhat.com> (raw)
In-Reply-To: <EDB02D3B-6B03-4CCB-95C8-FDF2F2932E1D@suse.de>

Hi everyone,

On Fri, Nov 29, 2013 at 12:13:03PM +0100, Alexander Graf wrote:
> 
> On 29.11.2013, at 05:38, Bharat Bhushan <Bharat.Bhushan@freescale.com> wrote:
> 
> > Hi Alex,
> > 
> > I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With allocated pages things seems to work fine but I uses hugepages for guest I see below prints when "quit" from qemu.
> > 
> > (qemu) QEMU waiting for connection on: telnet:0.0.0.0:4444,server
> > qemu-system-ppc64: pci_add_option_rom: failed to find romfile "efi-virtio.rom"
> > q
> > debug_smp_processor_id: 15 callbacks suppressed
> > BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-ppc/2504
> > caller is .free_hugepd_range+0xb0/0x21c
> > CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175
> > Call Trace:
> > [c0000000fb433400] [c000000000007d38] .show_stack+0x7c/0x1cc (unreliable)
> > [c0000000fb4334d0] [c0000000005e8ce0] .dump_stack+0x9c/0xf4
> > [c0000000fb433560] [c0000000002de5ec] .debug_smp_processor_id+0x108/0x11c
> > [c0000000fb4335f0] [c000000000025e10] .free_hugepd_range+0xb0/0x21c
> > [c0000000fb433680] [c0000000000265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0
> > [c0000000fb4337a0] [c0000000000e428c] .free_pgtables+0x14c/0x158
> > [c0000000fb433840] [c0000000000ef320] .exit_mmap+0xec/0x194
> > [c0000000fb433960] [c00000000004d780] .mmput+0x64/0x124
> > [c0000000fb4339e0] [c000000000051f40] .do_exit+0x29c/0x9c8
> > [c0000000fb433ae0] [c0000000000527c8] .do_group_exit+0x50/0xc4
> > [c0000000fb433b70] [c0000000000606a0] .get_signal_to_deliver+0x21c/0x5d8
> > [c0000000fb433c70] [c000000000009b08] .do_signal+0x54/0x278
> > [c0000000fb433db0] [c000000000009e50] .do_notify_resume+0x64/0x78
> > [c0000000fb433e30] [c000000000000b44] .ret_from_except_lite+0x70/0x74
> > 
> > 
> > This mean that free_hugepd_range() must be called with preemption enabled.
> 
> with preemption disabled.
> 
> > I tried below change and this seems to work fine (I am not having expertise in this area so not sure this is correct way)
> 
> Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows what to do :).

:) So I had a look at the top of this function (0xb0) in the upstream
kernel and no smp_processor_id() call is apparent, is this stock git
or a ppc tree? The first few calls seem not to call it but I may have
overlooked something. It's just quicker if somebody with vmlinux finds
the location of it.

static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshift,
			      unsigned long start, unsigned long end,
			      unsigned long floor, unsigned long ceiling)
{
	pte_t *hugepte = hugepd_page(*hpdp);
	int i;

	unsigned long pdmask = ~((1UL << pdshift) - 1);
	unsigned int num_hugepd = 1;

#ifdef CONFIG_PPC_FSL_BOOK3E
	/* Note: On fsl the hpdp may be the first of several */
	num_hugepd = (1 << (hugepd_shift(*hpdp) - pdshift));
#else
	unsigned int shift = hugepd_shift(*hpdp);
#endif

	start &= pdmask;
	if (start < floor)
		return;
	if (ceiling) {
		ceiling &= pdmask;
		if (! ceiling)
			return;
	}
	if (end - 1 > ceiling - 1)
		return;

	for (i = 0; i < num_hugepd; i++, hpdp++)
		hpdp->pd = 0;

	tlb->need_flush = 1;

#ifdef CONFIG_PPC_FSL_BOOK3E
	hugepd_free(tlb, hugepte);
#else
	pgtable_free_tlb(tlb, hugepte, pdshift - shift);
#endif
}

Generally smp_processor_id should never be used, exactly to avoid
problems like above with preempion enabled in .config.

Instead it should be replaced with a get_cpu()/put_cpu() pair that is
exactly meant to fix bugs like this and define proper critical
sections around the per-cpu variables.

#define get_cpu()		({ preempt_disable(); smp_processor_id(); })
#define put_cpu()		preempt_enable()

After you find where that smp_processor_id() is located, you should
simply replace it with a get_cpu() and then you should insert a
put_cpu immediately after the "cpu" info is not used anymore. That
will define a proper and strict critical section around the use of the
per-cpu variables.

With a ppc vmlinux it should be immediate to find the location of
smp_processor_id but I don't have the ppc vmlinux here.

Thanks!
Andrea

> 
> 
> Alex
> 
> > 
> > diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> > index d67db4b..6bf8459 100644
> > --- a/arch/powerpc/mm/hugetlbpage.c
> > +++ b/arch/powerpc/mm/hugetlbpage.c
> > @@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
> >                 */
> >                next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
> > #endif
> > +               preempt_disable();
> >                free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
> >                                  addr, next, floor, ceiling);
> > +               preempt_enable();
> >        } while (addr = next, addr != end);
> > 
> >        start &= PUD_MASK;
> > 
> > 
> > Thanks
> > -Bharat
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

WARNING: multiple messages have this Message-ID (diff)
From: Andrea Arcangeli <aarcange@redhat.com>
To: Alexander Graf <agraf@suse.de>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"kvm-ppc@vger.kernel.org" <kvm-ppc@vger.kernel.org>,
	Bharat Bhushan <Bharat.Bhushan@freescale.com>,
	Scott Wood <scottwood@freescale.com>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: Error in frreing hugepages with preemption enabled
Date: Tue, 3 Dec 2013 23:21:21 +0100	[thread overview]
Message-ID: <20131203222121.GB18764@redhat.com> (raw)
In-Reply-To: <EDB02D3B-6B03-4CCB-95C8-FDF2F2932E1D@suse.de>

Hi everyone,

On Fri, Nov 29, 2013 at 12:13:03PM +0100, Alexander Graf wrote:
> 
> On 29.11.2013, at 05:38, Bharat Bhushan <Bharat.Bhushan@freescale.com> wrote:
> 
> > Hi Alex,
> > 
> > I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With allocated pages things seems to work fine but I uses hugepages for guest I see below prints when "quit" from qemu.
> > 
> > (qemu) QEMU waiting for connection on: telnet:0.0.0.0:4444,server
> > qemu-system-ppc64: pci_add_option_rom: failed to find romfile "efi-virtio.rom"
> > q
> > debug_smp_processor_id: 15 callbacks suppressed
> > BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-ppc/2504
> > caller is .free_hugepd_range+0xb0/0x21c
> > CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175
> > Call Trace:
> > [c0000000fb433400] [c000000000007d38] .show_stack+0x7c/0x1cc (unreliable)
> > [c0000000fb4334d0] [c0000000005e8ce0] .dump_stack+0x9c/0xf4
> > [c0000000fb433560] [c0000000002de5ec] .debug_smp_processor_id+0x108/0x11c
> > [c0000000fb4335f0] [c000000000025e10] .free_hugepd_range+0xb0/0x21c
> > [c0000000fb433680] [c0000000000265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0
> > [c0000000fb4337a0] [c0000000000e428c] .free_pgtables+0x14c/0x158
> > [c0000000fb433840] [c0000000000ef320] .exit_mmap+0xec/0x194
> > [c0000000fb433960] [c00000000004d780] .mmput+0x64/0x124
> > [c0000000fb4339e0] [c000000000051f40] .do_exit+0x29c/0x9c8
> > [c0000000fb433ae0] [c0000000000527c8] .do_group_exit+0x50/0xc4
> > [c0000000fb433b70] [c0000000000606a0] .get_signal_to_deliver+0x21c/0x5d8
> > [c0000000fb433c70] [c000000000009b08] .do_signal+0x54/0x278
> > [c0000000fb433db0] [c000000000009e50] .do_notify_resume+0x64/0x78
> > [c0000000fb433e30] [c000000000000b44] .ret_from_except_lite+0x70/0x74
> > 
> > 
> > This mean that free_hugepd_range() must be called with preemption enabled.
> 
> with preemption disabled.
> 
> > I tried below change and this seems to work fine (I am not having expertise in this area so not sure this is correct way)
> 
> Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows what to do :).

:) So I had a look at the top of this function (0xb0) in the upstream
kernel and no smp_processor_id() call is apparent, is this stock git
or a ppc tree? The first few calls seem not to call it but I may have
overlooked something. It's just quicker if somebody with vmlinux finds
the location of it.

static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshift,
			      unsigned long start, unsigned long end,
			      unsigned long floor, unsigned long ceiling)
{
	pte_t *hugepte = hugepd_page(*hpdp);
	int i;

	unsigned long pdmask = ~((1UL << pdshift) - 1);
	unsigned int num_hugepd = 1;

#ifdef CONFIG_PPC_FSL_BOOK3E
	/* Note: On fsl the hpdp may be the first of several */
	num_hugepd = (1 << (hugepd_shift(*hpdp) - pdshift));
#else
	unsigned int shift = hugepd_shift(*hpdp);
#endif

	start &= pdmask;
	if (start < floor)
		return;
	if (ceiling) {
		ceiling &= pdmask;
		if (! ceiling)
			return;
	}
	if (end - 1 > ceiling - 1)
		return;

	for (i = 0; i < num_hugepd; i++, hpdp++)
		hpdp->pd = 0;

	tlb->need_flush = 1;

#ifdef CONFIG_PPC_FSL_BOOK3E
	hugepd_free(tlb, hugepte);
#else
	pgtable_free_tlb(tlb, hugepte, pdshift - shift);
#endif
}

Generally smp_processor_id should never be used, exactly to avoid
problems like above with preempion enabled in .config.

Instead it should be replaced with a get_cpu()/put_cpu() pair that is
exactly meant to fix bugs like this and define proper critical
sections around the per-cpu variables.

#define get_cpu()		({ preempt_disable(); smp_processor_id(); })
#define put_cpu()		preempt_enable()

After you find where that smp_processor_id() is located, you should
simply replace it with a get_cpu() and then you should insert a
put_cpu immediately after the "cpu" info is not used anymore. That
will define a proper and strict critical section around the use of the
per-cpu variables.

With a ppc vmlinux it should be immediate to find the location of
smp_processor_id but I don't have the ppc vmlinux here.

Thanks!
Andrea

> 
> 
> Alex
> 
> > 
> > diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> > index d67db4b..6bf8459 100644
> > --- a/arch/powerpc/mm/hugetlbpage.c
> > +++ b/arch/powerpc/mm/hugetlbpage.c
> > @@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
> >                 */
> >                next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
> > #endif
> > +               preempt_disable();
> >                free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
> >                                  addr, next, floor, ceiling);
> > +               preempt_enable();
> >        } while (addr = next, addr != end);
> > 
> >        start &= PUD_MASK;
> > 
> > 
> > Thanks
> > -Bharat
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

WARNING: multiple messages have this Message-ID (diff)
From: Andrea Arcangeli <aarcange@redhat.com>
To: Alexander Graf <agraf@suse.de>
Cc: Bharat Bhushan <Bharat.Bhushan@freescale.com>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"kvm-ppc@vger.kernel.org" <kvm-ppc@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Scott Wood <scottwood@freescale.com>,
	Ben Herrenschmidt <benh@kernel.crashing.org>
Subject: Re: Error in frreing hugepages with preemption enabled
Date: Tue, 3 Dec 2013 23:21:21 +0100	[thread overview]
Message-ID: <20131203222121.GB18764@redhat.com> (raw)
In-Reply-To: <EDB02D3B-6B03-4CCB-95C8-FDF2F2932E1D@suse.de>

Hi everyone,

On Fri, Nov 29, 2013 at 12:13:03PM +0100, Alexander Graf wrote:
> 
> On 29.11.2013, at 05:38, Bharat Bhushan <Bharat.Bhushan@freescale.com> wrote:
> 
> > Hi Alex,
> > 
> > I am running KVM guest with host kernel having CONFIG_PREEMPT enabled. With allocated pages things seems to work fine but I uses hugepages for guest I see below prints when "quit" from qemu.
> > 
> > (qemu) QEMU waiting for connection on: telnet:0.0.0.0:4444,server
> > qemu-system-ppc64: pci_add_option_rom: failed to find romfile "efi-virtio.rom"
> > q
> > debug_smp_processor_id: 15 callbacks suppressed
> > BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-ppc/2504
> > caller is .free_hugepd_range+0xb0/0x21c
> > CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175
> > Call Trace:
> > [c0000000fb433400] [c000000000007d38] .show_stack+0x7c/0x1cc (unreliable)
> > [c0000000fb4334d0] [c0000000005e8ce0] .dump_stack+0x9c/0xf4
> > [c0000000fb433560] [c0000000002de5ec] .debug_smp_processor_id+0x108/0x11c
> > [c0000000fb4335f0] [c000000000025e10] .free_hugepd_range+0xb0/0x21c
> > [c0000000fb433680] [c0000000000265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0
> > [c0000000fb4337a0] [c0000000000e428c] .free_pgtables+0x14c/0x158
> > [c0000000fb433840] [c0000000000ef320] .exit_mmap+0xec/0x194
> > [c0000000fb433960] [c00000000004d780] .mmput+0x64/0x124
> > [c0000000fb4339e0] [c000000000051f40] .do_exit+0x29c/0x9c8
> > [c0000000fb433ae0] [c0000000000527c8] .do_group_exit+0x50/0xc4
> > [c0000000fb433b70] [c0000000000606a0] .get_signal_to_deliver+0x21c/0x5d8
> > [c0000000fb433c70] [c000000000009b08] .do_signal+0x54/0x278
> > [c0000000fb433db0] [c000000000009e50] .do_notify_resume+0x64/0x78
> > [c0000000fb433e30] [c000000000000b44] .ret_from_except_lite+0x70/0x74
> > 
> > 
> > This mean that free_hugepd_range() must be called with preemption enabled.
> 
> with preemption disabled.
> 
> > I tried below change and this seems to work fine (I am not having expertise in this area so not sure this is correct way)
> 
> Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows what to do :).

:) So I had a look at the top of this function (0xb0) in the upstream
kernel and no smp_processor_id() call is apparent, is this stock git
or a ppc tree? The first few calls seem not to call it but I may have
overlooked something. It's just quicker if somebody with vmlinux finds
the location of it.

static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshift,
			      unsigned long start, unsigned long end,
			      unsigned long floor, unsigned long ceiling)
{
	pte_t *hugepte = hugepd_page(*hpdp);
	int i;

	unsigned long pdmask = ~((1UL << pdshift) - 1);
	unsigned int num_hugepd = 1;

#ifdef CONFIG_PPC_FSL_BOOK3E
	/* Note: On fsl the hpdp may be the first of several */
	num_hugepd = (1 << (hugepd_shift(*hpdp) - pdshift));
#else
	unsigned int shift = hugepd_shift(*hpdp);
#endif

	start &= pdmask;
	if (start < floor)
		return;
	if (ceiling) {
		ceiling &= pdmask;
		if (! ceiling)
			return;
	}
	if (end - 1 > ceiling - 1)
		return;

	for (i = 0; i < num_hugepd; i++, hpdp++)
		hpdp->pd = 0;

	tlb->need_flush = 1;

#ifdef CONFIG_PPC_FSL_BOOK3E
	hugepd_free(tlb, hugepte);
#else
	pgtable_free_tlb(tlb, hugepte, pdshift - shift);
#endif
}

Generally smp_processor_id should never be used, exactly to avoid
problems like above with preempion enabled in .config.

Instead it should be replaced with a get_cpu()/put_cpu() pair that is
exactly meant to fix bugs like this and define proper critical
sections around the per-cpu variables.

#define get_cpu()		({ preempt_disable(); smp_processor_id(); })
#define put_cpu()		preempt_enable()

After you find where that smp_processor_id() is located, you should
simply replace it with a get_cpu() and then you should insert a
put_cpu immediately after the "cpu" info is not used anymore. That
will define a proper and strict critical section around the use of the
per-cpu variables.

With a ppc vmlinux it should be immediate to find the location of
smp_processor_id but I don't have the ppc vmlinux here.

Thanks!
Andrea

> 
> 
> Alex
> 
> > 
> > diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> > index d67db4b..6bf8459 100644
> > --- a/arch/powerpc/mm/hugetlbpage.c
> > +++ b/arch/powerpc/mm/hugetlbpage.c
> > @@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
> >                 */
> >                next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
> > #endif
> > +               preempt_disable();
> >                free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
> >                                  addr, next, floor, ceiling);
> > +               preempt_enable();
> >        } while (addr = next, addr != end);
> > 
> >        start &= PUD_MASK;
> > 
> > 
> > Thanks
> > -Bharat
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

  reply	other threads:[~2013-12-03 22:21 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-29  4:38 Error in frreing hugepages with preemption enabled Bharat Bhushan
2013-11-29  4:38 ` Bharat Bhushan
2013-11-29 11:13 ` Alexander Graf
2013-11-29 11:13   ` Alexander Graf
2013-11-29 11:13   ` Alexander Graf
2013-12-03 22:21   ` Andrea Arcangeli [this message]
2013-12-03 22:21     ` Andrea Arcangeli
2013-12-03 22:21     ` Andrea Arcangeli
2013-12-04  2:22     ` Benjamin Herrenschmidt
2013-12-04  2:22       ` Benjamin Herrenschmidt
2013-12-04  2:22       ` Benjamin Herrenschmidt
2013-12-05 11:14     ` Bharat Bhushan
2013-12-05 11:14       ` Bharat Bhushan
2013-12-05 11:14       ` Bharat Bhushan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131203222121.GB18764@redhat.com \
    --to=aarcange@redhat.com \
    --cc=Bharat.Bhushan@freescale.com \
    --cc=agraf@suse.de \
    --cc=benh@kernel.crashing.org \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=scottwood@freescale.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.