All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Borkmann <daniel@iogearbox.net>
To: lkp@lists.01.org
Subject: Re: [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf
Date: Thu, 09 Mar 2017 22:32:12 +0100	[thread overview]
Message-ID: <58C1C9DC.7070509@iogearbox.net> (raw)
In-Reply-To: <58C19F67.3040509@iogearbox.net>

[-- Attachment #1: Type: text/plain, Size: 3669 bytes --]

[ + Borislav ]

On 03/09/2017 07:31 PM, Daniel Borkmann wrote:
> On 03/09/2017 07:15 PM, Linus Torvalds wrote:
>> On Thu, Mar 9, 2017 at 10:10 AM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>>>
>>> Very odd. We should always have PGE (0x0080) set in cr4 (if the CPU
>>> supports it).
>>
>> Daniel, do you see the code in probe_page_size_mask() triggering?
>>
>>          /* Enable PGE if available */
>>          if (boot_cpu_has(X86_FEATURE_PGE)) {
>>                  cr4_set_bits_and_update_boot(X86_CR4_PGE);
>>                  __supported_pte_mask |= _PAGE_GLOBAL;
>
> We do have boot_cpu_has(X86_FEATURE_PGE) and go indeed into this
> branch here. So it seems something must be clearing it later, hmm.
>
>>          } else
>>                  __supported_pte_mask &= ~_PAGE_GLOBAL;
>>
>> but maybe there's something wrong with the percpu cr4 caching?

So, I think I got a little bit further. To the printk's I added
previously in the test_setmem code that run on the problematic
kernel, I now extended them into:

   printk("static_cpu X86_FEATURE_PGE:%u\n",     static_cpu_has(X86_FEATURE_PGE));
   printk("boot_cpu   X86_FEATURE_PGE:%u\n",       boot_cpu_has(X86_FEATURE_PGE));
   printk("static_cpu X86_FEATURE_INVPCID:%u\n", static_cpu_has(X86_FEATURE_INVPCID));
   printk("boot_cpu   X86_FEATURE_INVPCID:%u\n",   boot_cpu_has(X86_FEATURE_INVPCID));

And here's what I get in the log:

"-cpu kvm64" gives:

[    8.426865] static_cpu X86_FEATURE_PGE:1
[    8.427148] boot_cpu   X86_FEATURE_PGE:0
[    8.427428] static_cpu X86_FEATURE_INVPCID:0
[    8.427732] boot_cpu   X86_FEATURE_INVPCID:0

"-cpu host" gives:

[    8.426408] static_cpu X86_FEATURE_PGE:1
[    8.426726] boot_cpu   X86_FEATURE_PGE:0
[    8.427037] static_cpu X86_FEATURE_INVPCID:1
[    8.427375] boot_cpu   X86_FEATURE_INVPCID:1

This means at that point in time static_cpu_has(X86_FEATURE_PGE) is
not the same as boot_cpu_has(X86_FEATURE_PGE).

The code that switches this off is in lguest_arch_host_init(). Right
before that, both are X86_FEATURE_PGE:1, X86_FEATURE_INVPCID:0 for
the "-cpu kvm64" case.

Then, the lguest code does:

	get_online_cpus();
	if (boot_cpu_has(X86_FEATURE_PGE)) { /* We have a broader idea of "global". */
		/* Remember that this was originally set (for cleanup). */
		cpu_had_pge = 1;
		/*
		 * adjust_pge is a helper function which sets or unsets the PGE
		 * bit on its CPU, depending on the argument (0 == unset).
		 */
		on_each_cpu(adjust_pge, (void *)0, 1);
		/* Turn off the feature in the global feature set. */
		clear_cpu_cap(&boot_cpu_data, X86_FEATURE_PGE);
	}
	put_online_cpus();

So, adjust_pge() clears X86_CR4_PGE, and boot cpu has X86_FEATURE_PGE
unset. This means, with static_cpu_has(X86_FEATURE_PGE) still 1, we
run into using cr4 for TLB flushing with no X86_CR4_PGE bit set (which
doesn't trigger the flush), whereas we should be using cr3 for flushing
from that point onwards instead.

I tried with this one, and things seem to work again:

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 6fa8594..fc5abff 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -188,7 +188,7 @@ static inline void __native_flush_tlb_single(unsigned long addr)

  static inline void __flush_tlb_all(void)
  {
-       if (static_cpu_has(X86_FEATURE_PGE))
+       if (boot_cpu_has(X86_FEATURE_PGE))
                 __flush_tlb_global();
         else
                 __flush_tlb();

Presumably coming from c109bf95992b ("x86/cpufeature: Remove cpu_has_pge")?

Thanks,
Daniel

WARNING: multiple messages have this Message-ID (diff)
From: Daniel Borkmann <daniel@iogearbox.net>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Kees Cook <keescook@chromium.org>,
	Laura Abbott <labbott@redhat.com>, Ingo Molnar <mingo@kernel.org>,
	Peter Anvin <hpa@zytor.com>,
	Fengguang Wu <fengguang.wu@intel.com>,
	Network Development <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>, LKP <lkp@01.org>,
	ast@fb.com, the arch/x86 maintainers <x86@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	bp@suse.de
Subject: Re: [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf
Date: Thu, 09 Mar 2017 22:32:12 +0100	[thread overview]
Message-ID: <58C1C9DC.7070509@iogearbox.net> (raw)
In-Reply-To: <58C19F67.3040509@iogearbox.net>

[ + Borislav ]

On 03/09/2017 07:31 PM, Daniel Borkmann wrote:
> On 03/09/2017 07:15 PM, Linus Torvalds wrote:
>> On Thu, Mar 9, 2017 at 10:10 AM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>>>
>>> Very odd. We should always have PGE (0x0080) set in cr4 (if the CPU
>>> supports it).
>>
>> Daniel, do you see the code in probe_page_size_mask() triggering?
>>
>>          /* Enable PGE if available */
>>          if (boot_cpu_has(X86_FEATURE_PGE)) {
>>                  cr4_set_bits_and_update_boot(X86_CR4_PGE);
>>                  __supported_pte_mask |= _PAGE_GLOBAL;
>
> We do have boot_cpu_has(X86_FEATURE_PGE) and go indeed into this
> branch here. So it seems something must be clearing it later, hmm.
>
>>          } else
>>                  __supported_pte_mask &= ~_PAGE_GLOBAL;
>>
>> but maybe there's something wrong with the percpu cr4 caching?

So, I think I got a little bit further. To the printk's I added
previously in the test_setmem code that run on the problematic
kernel, I now extended them into:

   printk("static_cpu X86_FEATURE_PGE:%u\n",     static_cpu_has(X86_FEATURE_PGE));
   printk("boot_cpu   X86_FEATURE_PGE:%u\n",       boot_cpu_has(X86_FEATURE_PGE));
   printk("static_cpu X86_FEATURE_INVPCID:%u\n", static_cpu_has(X86_FEATURE_INVPCID));
   printk("boot_cpu   X86_FEATURE_INVPCID:%u\n",   boot_cpu_has(X86_FEATURE_INVPCID));

And here's what I get in the log:

"-cpu kvm64" gives:

[    8.426865] static_cpu X86_FEATURE_PGE:1
[    8.427148] boot_cpu   X86_FEATURE_PGE:0
[    8.427428] static_cpu X86_FEATURE_INVPCID:0
[    8.427732] boot_cpu   X86_FEATURE_INVPCID:0

"-cpu host" gives:

[    8.426408] static_cpu X86_FEATURE_PGE:1
[    8.426726] boot_cpu   X86_FEATURE_PGE:0
[    8.427037] static_cpu X86_FEATURE_INVPCID:1
[    8.427375] boot_cpu   X86_FEATURE_INVPCID:1

This means at that point in time static_cpu_has(X86_FEATURE_PGE) is
not the same as boot_cpu_has(X86_FEATURE_PGE).

The code that switches this off is in lguest_arch_host_init(). Right
before that, both are X86_FEATURE_PGE:1, X86_FEATURE_INVPCID:0 for
the "-cpu kvm64" case.

Then, the lguest code does:

	get_online_cpus();
	if (boot_cpu_has(X86_FEATURE_PGE)) { /* We have a broader idea of "global". */
		/* Remember that this was originally set (for cleanup). */
		cpu_had_pge = 1;
		/*
		 * adjust_pge is a helper function which sets or unsets the PGE
		 * bit on its CPU, depending on the argument (0 == unset).
		 */
		on_each_cpu(adjust_pge, (void *)0, 1);
		/* Turn off the feature in the global feature set. */
		clear_cpu_cap(&boot_cpu_data, X86_FEATURE_PGE);
	}
	put_online_cpus();

So, adjust_pge() clears X86_CR4_PGE, and boot cpu has X86_FEATURE_PGE
unset. This means, with static_cpu_has(X86_FEATURE_PGE) still 1, we
run into using cr4 for TLB flushing with no X86_CR4_PGE bit set (which
doesn't trigger the flush), whereas we should be using cr3 for flushing
from that point onwards instead.

I tried with this one, and things seem to work again:

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 6fa8594..fc5abff 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -188,7 +188,7 @@ static inline void __native_flush_tlb_single(unsigned long addr)

  static inline void __flush_tlb_all(void)
  {
-       if (static_cpu_has(X86_FEATURE_PGE))
+       if (boot_cpu_has(X86_FEATURE_PGE))
                 __flush_tlb_global();
         else
                 __flush_tlb();

Presumably coming from c109bf95992b ("x86/cpufeature: Remove cpu_has_pge")?

Thanks,
Daniel

WARNING: multiple messages have this Message-ID (diff)
From: Daniel Borkmann <daniel@iogearbox.net>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Kees Cook <keescook@chromium.org>,
	Laura Abbott <labbott@redhat.com>, Ingo Molnar <mingo@kernel.org>,
	Peter Anvin <hpa@zytor.com>,
	Fengguang Wu <fengguang.wu@intel.com>,
	Network Development <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>, LKP <lkp@01.org>,
	ast@fb.com, the arch/x86 maintainers <x86@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	bp@suse.de
Subject: Re: [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf
Date: Thu, 09 Mar 2017 22:32:12 +0100	[thread overview]
Message-ID: <58C1C9DC.7070509@iogearbox.net> (raw)
In-Reply-To: <58C19F67.3040509@iogearbox.net>

[ + Borislav ]

On 03/09/2017 07:31 PM, Daniel Borkmann wrote:
> On 03/09/2017 07:15 PM, Linus Torvalds wrote:
>> On Thu, Mar 9, 2017 at 10:10 AM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>>>
>>> Very odd. We should always have PGE (0x0080) set in cr4 (if the CPU
>>> supports it).
>>
>> Daniel, do you see the code in probe_page_size_mask() triggering?
>>
>>          /* Enable PGE if available */
>>          if (boot_cpu_has(X86_FEATURE_PGE)) {
>>                  cr4_set_bits_and_update_boot(X86_CR4_PGE);
>>                  __supported_pte_mask |= _PAGE_GLOBAL;
>
> We do have boot_cpu_has(X86_FEATURE_PGE) and go indeed into this
> branch here. So it seems something must be clearing it later, hmm.
>
>>          } else
>>                  __supported_pte_mask &= ~_PAGE_GLOBAL;
>>
>> but maybe there's something wrong with the percpu cr4 caching?

So, I think I got a little bit further. To the printk's I added
previously in the test_setmem code that run on the problematic
kernel, I now extended them into:

   printk("static_cpu X86_FEATURE_PGE:%u\n",     static_cpu_has(X86_FEATURE_PGE));
   printk("boot_cpu   X86_FEATURE_PGE:%u\n",       boot_cpu_has(X86_FEATURE_PGE));
   printk("static_cpu X86_FEATURE_INVPCID:%u\n", static_cpu_has(X86_FEATURE_INVPCID));
   printk("boot_cpu   X86_FEATURE_INVPCID:%u\n",   boot_cpu_has(X86_FEATURE_INVPCID));

And here's what I get in the log:

"-cpu kvm64" gives:

[    8.426865] static_cpu X86_FEATURE_PGE:1
[    8.427148] boot_cpu   X86_FEATURE_PGE:0
[    8.427428] static_cpu X86_FEATURE_INVPCID:0
[    8.427732] boot_cpu   X86_FEATURE_INVPCID:0

"-cpu host" gives:

[    8.426408] static_cpu X86_FEATURE_PGE:1
[    8.426726] boot_cpu   X86_FEATURE_PGE:0
[    8.427037] static_cpu X86_FEATURE_INVPCID:1
[    8.427375] boot_cpu   X86_FEATURE_INVPCID:1

This means at that point in time static_cpu_has(X86_FEATURE_PGE) is
not the same as boot_cpu_has(X86_FEATURE_PGE).

The code that switches this off is in lguest_arch_host_init(). Right
before that, both are X86_FEATURE_PGE:1, X86_FEATURE_INVPCID:0 for
the "-cpu kvm64" case.

Then, the lguest code does:

	get_online_cpus();
	if (boot_cpu_has(X86_FEATURE_PGE)) { /* We have a broader idea of "global". */
		/* Remember that this was originally set (for cleanup). */
		cpu_had_pge = 1;
		/*
		 * adjust_pge is a helper function which sets or unsets the PGE
		 * bit on its CPU, depending on the argument (0 == unset).
		 */
		on_each_cpu(adjust_pge, (void *)0, 1);
		/* Turn off the feature in the global feature set. */
		clear_cpu_cap(&boot_cpu_data, X86_FEATURE_PGE);
	}
	put_online_cpus();

So, adjust_pge() clears X86_CR4_PGE, and boot cpu has X86_FEATURE_PGE
unset. This means, with static_cpu_has(X86_FEATURE_PGE) still 1, we
run into using cr4 for TLB flushing with no X86_CR4_PGE bit set (which
doesn't trigger the flush), whereas we should be using cr3 for flushing
from that point onwards instead.

I tried with this one, and things seem to work again:

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 6fa8594..fc5abff 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -188,7 +188,7 @@ static inline void __native_flush_tlb_single(unsigned long addr)

  static inline void __flush_tlb_all(void)
  {
-       if (static_cpu_has(X86_FEATURE_PGE))
+       if (boot_cpu_has(X86_FEATURE_PGE))
                 __flush_tlb_global();
         else
                 __flush_tlb();

Presumably coming from c109bf95992b ("x86/cpufeature: Remove cpu_has_pge")?

Thanks,
Daniel

  reply	other threads:[~2017-03-09 21:32 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-01 12:54 [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf Fengguang Wu
2017-03-01 12:54 ` Fengguang Wu
2017-03-02 20:23 ` Fengguang Wu
2017-03-02 20:23   ` Fengguang Wu
2017-03-02 20:40   ` Daniel Borkmann
2017-03-02 20:40     ` Daniel Borkmann
2017-03-08 19:25     ` Linus Torvalds
2017-03-08 19:25       ` Linus Torvalds
2017-03-08 22:27       ` Daniel Borkmann
2017-03-08 22:27         ` Daniel Borkmann
2017-03-08 22:36         ` Kees Cook
2017-03-08 22:36           ` Kees Cook
2017-03-08 22:51           ` Daniel Borkmann
2017-03-08 22:51             ` Daniel Borkmann
2017-03-08 23:55           ` Laura Abbott
2017-03-08 23:55             ` Laura Abbott
2017-03-09  5:36             ` Kees Cook
2017-03-09  5:36               ` Kees Cook
2017-03-09 13:04               ` Daniel Borkmann
2017-03-09 13:04                 ` Daniel Borkmann
2017-03-09 13:10                 ` Thomas Gleixner
2017-03-09 13:10                   ` Thomas Gleixner
2017-03-09 13:25                   ` Daniel Borkmann
2017-03-09 13:25                     ` Daniel Borkmann
2017-03-09 14:49                     ` Thomas Gleixner
2017-03-09 14:49                       ` Thomas Gleixner
2017-03-09 17:51                       ` Daniel Borkmann
2017-03-09 17:51                         ` Daniel Borkmann
2017-03-09 18:08                         ` David Miller
2017-03-09 18:08                           ` David Miller
2017-03-09 18:10                         ` Linus Torvalds
2017-03-09 18:10                           ` Linus Torvalds
2017-03-09 18:15                           ` Linus Torvalds
2017-03-09 18:15                             ` Linus Torvalds
2017-03-09 18:31                             ` Daniel Borkmann
2017-03-09 18:31                               ` Daniel Borkmann
2017-03-09 21:32                               ` Daniel Borkmann [this message]
2017-03-09 21:32                                 ` Daniel Borkmann
2017-03-09 21:32                                 ` Daniel Borkmann
2017-03-09 21:55                                 ` Borislav Petkov
2017-03-09 21:55                                   ` Borislav Petkov
2017-03-09 22:07                                   ` Borislav Petkov
2017-03-09 22:07                                     ` Borislav Petkov
2017-03-09 22:11                                     ` Daniel Borkmann
2017-03-09 22:11                                       ` Daniel Borkmann
2017-03-09 22:48                                       ` Borislav Petkov
2017-03-09 22:48                                         ` Borislav Petkov
2017-03-09 23:26                                         ` Linus Torvalds
2017-03-09 23:26                                           ` Linus Torvalds
2017-03-09 23:44                                           ` Borislav Petkov
2017-03-09 23:44                                             ` Borislav Petkov
2017-03-10  0:13                                             ` Daniel Borkmann
2017-03-10  0:13                                               ` Daniel Borkmann
2017-03-12 21:40                                           ` Borislav Petkov
2017-03-12 21:40                                             ` Borislav Petkov
2017-03-09 14:53                     ` Daniel Borkmann
2017-03-09 14:53                       ` Daniel Borkmann
2017-03-09 17:48                       ` Linus Torvalds
2017-03-09 17:48                         ` Linus Torvalds
2017-03-08 22:43         ` Linus Torvalds
2017-03-08 22:43           ` Linus Torvalds
2017-03-09  1:34           ` Fengguang Wu
2017-03-09  1:34             ` Fengguang Wu
2017-03-09 13:09       ` Thomas Gleixner
2017-03-09 13:09         ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=58C1C9DC.7070509@iogearbox.net \
    --to=daniel@iogearbox.net \
    --cc=lkp@lists.01.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.