public inbox for linux-kernel@vger.kernel.org
* [BUG]: Intel uncore boot warning introduced in 4.1
@ 2015-08-06  9:34 Matthew Leach
  2015-08-06 16:13 ` Matthew Leach
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Leach @ 2015-08-06  9:34 UTC
  To: Kan Liang, Ingo Molnar; +Cc: linux-kernel, linux-ia64

Hello,

Since upgrading to a 4.1 series kernel, I have been getting an odd
warning message from the kernel on boot, [1], as well as random freezes
after about 20-30 minutes of uptime.  I'm not sure if the two are
related, however.

I've bisected the kernel and found that commit [2] seems to introduce
the warning message.  I have checked on a v4.2-rc5 kernel and the
warning message is still there.

I am running a Lenovo ThinkPad T440. See [3] for /proc/cpuinfo.

Thanks,
Matt

[1]: 
resource sanity check: requesting [mem 0xfed10000-0xfed15fff], which spans more than pnp 00:01 [mem 0xfed10000-0xfed13fff]
------------[ cut here ]------------
WARNING: CPU: 1 PID: 1 at arch/x86/mm/ioremap.c:202 __ioremap_caller+0x2b0/0x3a0()
Info: mapping multiple BARs. Your kernel is fine.
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.1.0-rc6-ARCH-00156-g15c1247 #31
Hardware name: LENOVO 20B7A0HL00/20B7A0HL00, BIOS GJET79WW (2.29 ) 09/03/2014
 0000000000000000 000000008d90ee2d ffff880310117ad8 ffffffff81f31c90
 ffffffff8245b758 ffff880310117b30 ffff880310117b18 ffffffff810df57b
 00000000fed10000 ffffc90001b90000 00000000fed16000 0000000000006000
Call Trace:
 [<ffffffff81f31c90>] dump_stack+0x4c/0x6e
 [<ffffffff810df57b>] warn_slowpath_common+0x7b/0xc0
 [<ffffffff810df640>] warn_slowpath_fmt+0x50/0x70
 [<ffffffff81045040>] __ioremap_caller+0x2b0/0x3a0
 [<ffffffff810452e2>] ioremap_nocache+0x12/0x20
 [<ffffffff81025464>] snb_uncore_imc_init_box+0x74/0xb0
 [<ffffffff810236e0>] uncore_pci_probe+0xd0/0x220
 [<ffffffff814f85d0>] local_pci_probe+0x40/0xa0
 [<ffffffff814f9755>] ? pci_match_device+0xe5/0x110
 [<ffffffff814f9879>] pci_device_probe+0xf9/0x150
 [<ffffffff816f6219>] driver_probe_device+0x1f9/0x4b0
 [<ffffffff816f65a3>] __driver_attach+0x93/0xa0
 [<ffffffff816f6510>] ? __device_attach+0x40/0x40
 [<ffffffff816f4223>] bus_for_each_dev+0x73/0xc0
 [<ffffffff816f6689>] driver_attach+0x19/0x20
 [<ffffffff816f4e18>] bus_add_driver+0x168/0x240
 [<ffffffff816f6def>] driver_register+0x5f/0xf0
 [<ffffffff8259b44c>] ? uncore_types_exit+0x26/0x26
 [<ffffffff814f9af6>] __pci_register_driver+0x46/0x50
 [<ffffffff8259b514>] intel_uncore_init+0xc8/0x2ad
 [<ffffffff8259b44c>] ? uncore_types_exit+0x26/0x26
 [<ffffffff82591106>] do_one_initcall+0x195/0x1aa
 [<ffffffff82591277>] kernel_init_freeable+0x15c/0x1f8
 [<ffffffff81f2dd50>] ? rest_init+0x90/0x90
 [<ffffffff81f2dd59>] kernel_init+0x9/0xf0
 [<ffffffff81f3bb92>] ret_from_fork+0x42/0x70
 [<ffffffff81f2dd50>] ? rest_init+0x90/0x90
---[ end trace 29e0f99deb80a845 ]---

[2]: 8cf1a3de97804b047973dd44cfacdc1930da8403

[3]:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 69
model name      : Intel(R) Core(TM) i5-4300U CPU @ 1.90GHz
stepping        : 1
microcode       : 0x1c
cpu MHz         : 800.000
cache size      : 3072 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt
bugs            :
bogomips        : 4990.54
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

[repeated 3 times]
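
For reference, the arithmetic behind the "resource sanity check" line in
[1] can be sketched as below. This is a minimal standalone illustration,
not the kernel's implementation: struct mem_range, range_size() and
spans_more_than() are hypothetical names introduced here, and the ranges
are inclusive, as printed in the log.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical inclusive address range, mirroring how the log prints
 * "[mem start-end]". This is not the kernel's struct resource.
 */
struct mem_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Size in bytes of an inclusive range. */
static uint64_t range_size(struct mem_range r)
{
	return r.end - r.start + 1;
}

/*
 * True when req overlaps res but is not fully contained in it,
 * i.e. the request "spans more than" the existing resource.
 */
static bool spans_more_than(struct mem_range req, struct mem_range res)
{
	bool overlaps = req.start <= res.end && res.start <= req.end;
	bool contained = req.start >= res.start && req.end <= res.end;

	return overlaps && !contained;
}
```

With the values from the log, the request 0xfed10000-0xfed15fff covers
0x6000 bytes while pnp 00:01 covers only 0x4000 bytes
(0xfed10000-0xfed13fff), so the request runs 0x2000 bytes past the end
of the pnp resource.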



* Re: [BUG]: Intel uncore boot warning introduced in 4.1
  2015-08-06  9:34 [BUG]: Intel uncore boot warning introduced in 4.1 Matthew Leach
@ 2015-08-06 16:13 ` Matthew Leach
  2015-08-06 18:10   ` Liang, Kan
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Leach @ 2015-08-06 16:13 UTC
  To: Ingo Molnar; +Cc: Kan Liang, linux-kernel, linux-ia64

Hi Ingo,

Matthew Leach <matthew@mattleach.net> writes:

[...]

> I've bisected the kernel and found that commit [2] seems to introduce
> the warning message.  I have checked on a v4.2-rc5 kernel and the
> warning message is still there.

[...]

> [2]: 8cf1a3de97804b047973dd44cfacdc1930da8403

Apologies, I got it wrong.  The commit that is causing the issue is [1].
If I revert it, the warning goes away.  I'm also testing to see if this
is the cause of the random freezing that occurs (which I can confirm is
also happening with v4.2-rc5).

[1]: 15c1247953e8a45232ed5a5540f291d2d0a77665

Thanks,
-- 
Matt


* RE: [BUG]: Intel uncore boot warning introduced in 4.1
  2015-08-06 16:13 ` Matthew Leach
@ 2015-08-06 18:10   ` Liang, Kan
  2015-08-06 18:44     ` Matthew Leach
  2015-08-07  9:05     ` Peter Zijlstra
  0 siblings, 2 replies; 7+ messages in thread
From: Liang, Kan @ 2015-08-06 18:10 UTC
  To: Matthew Leach, Ingo Molnar
  Cc: linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org,
	eranian@google.com, 'Andi Kleen', 'Bjorn Helgaas',
	'Vince Weaver', 'Peter Zijlstra',
	'Sonny Rao'


> 
> Hi Ingo,
> 
> Matthew Leach <matthew@mattleach.net> writes:
> 
> [...]
> 
> > I've bisected the kernel and found that commit [2] seems to introduce
> > the warning message.  I have checked on a v4.2-rc5 kernel and the
> > warning message is still there.
> 
> [...]
> 
> > [2]: 8cf1a3de97804b047973dd44cfacdc1930da8403
> 
> Apologies, I got it wrong.  The commit that is causing the issue is [1].
> If I revert it, the warning goes away.  I'm also testing to see if this is the
> cause of the random freezing that occurs (which I can confirm is also
> happening with v4.2-rc5).
> 
> [1]: 15c1247953e8a45232ed5a5540f291d2d0a77665
> 

The issue may be caused by uncore box initialization.

To prevent potential problems with uncore box initialization, I once
moved uncore_box_init() out of driver initialization in commit
c05199e5a57a579fea1e8fa65e2b511ceb524ffc.

However, that caused crashes on some desktops, because the box
initialization code was moved into IPI context.

To fix the crash, we had two choices at that time:
 - Simply revert the change. That is where
   15c1247953e8a45232ed5a5540f291d2d0a77665 comes from.
 - Move uncore_box_init() out of IPI context into uncore event
   init. I provided a patch for it: https://lkml.org/lkml/2015/4/28/21
   Stephane Eranian also verified it on his platform.

At that time, we chose the first option. But it looks like there is
an issue with it now. I guess we may try the second option this time.

Matthew,

Could you please revert
15c1247953e8a45232ed5a5540f291d2d0a77665
and apply the patch https://lkml.org/lkml/2015/4/26/294?
See if it works?


Thanks,
Kan




* Re: [BUG]: Intel uncore boot warning introduced in 4.1
  2015-08-06 18:10   ` Liang, Kan
@ 2015-08-06 18:44     ` Matthew Leach
  2015-08-07  9:05     ` Peter Zijlstra
  1 sibling, 0 replies; 7+ messages in thread
From: Matthew Leach @ 2015-08-06 18:44 UTC
  To: Liang, Kan
  Cc: Ingo Molnar, linux-kernel@vger.kernel.org,
	linux-ia64@vger.kernel.org, eranian@google.com,
	'Andi Kleen', 'Bjorn Helgaas',
	'Vince Weaver', 'Peter Zijlstra',
	'Sonny Rao'

Hi Kan,

"Liang, Kan" <kan.liang@intel.com> writes:

[...]

> Matthew,
>
> Could you please revert
> 15c1247953e8a45232ed5a5540f291d2d0a77665
> and apply the patch https://lkml.org/lkml/2015/4/26/294?
> See if it works?

That works for me.  I no longer get the warning in my kernel boot log.

Thanks,
Matt


* Re: [BUG]: Intel uncore boot warning introduced in 4.1
  2015-08-06 18:10   ` Liang, Kan
  2015-08-06 18:44     ` Matthew Leach
@ 2015-08-07  9:05     ` Peter Zijlstra
  2015-08-10 13:23       ` Liang, Kan
  1 sibling, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2015-08-07  9:05 UTC
  To: Liang, Kan
  Cc: Matthew Leach, Ingo Molnar, linux-kernel@vger.kernel.org,
	linux-ia64@vger.kernel.org, eranian@google.com,
	'Andi Kleen', 'Bjorn Helgaas',
	'Vince Weaver', 'Sonny Rao'

On Thu, Aug 06, 2015 at 06:10:40PM +0000, Liang, Kan wrote:
> The issue may be caused by uncore box initialization.
> 
> To prevent potential problems with uncore box initialization, I once
> moved uncore_box_init() out of driver initialization in commit
> c05199e5a57a579fea1e8fa65e2b511ceb524ffc.
> 
> However, that caused crashes on some desktops, because the box
> initialization code was moved into IPI context.
> 
> To fix the crash, we had two choices at that time:
>  - Simply revert the change. That is where
>    15c1247953e8a45232ed5a5540f291d2d0a77665 comes from.
>  - Move uncore_box_init() out of IPI context into uncore event
>    init. I provided a patch for it: https://lkml.org/lkml/2015/4/28/21
>    Stephane Eranian also verified it on his platform.
> 
> At that time, we chose the first option. But it looks like there is
> an issue with it now. I guess we may try the second option this time.
> 
> Matthew,
> 
> Could you please revert
> 15c1247953e8a45232ed5a5540f291d2d0a77665
> and apply the patch https://lkml.org/lkml/2015/4/26/294?
> See if it works?

That patch is wrong though; how can you even publish a PMU which is not
initialized?




* RE: [BUG]: Intel uncore boot warning introduced in 4.1
  2015-08-07  9:05     ` Peter Zijlstra
@ 2015-08-10 13:23       ` Liang, Kan
  2015-09-15 13:35         ` Josh Boyer
  0 siblings, 1 reply; 7+ messages in thread
From: Liang, Kan @ 2015-08-10 13:23 UTC
  To: Peter Zijlstra
  Cc: Matthew Leach, Ingo Molnar, linux-kernel@vger.kernel.org,
	linux-ia64@vger.kernel.org, eranian@google.com,
	'Andi Kleen', 'Bjorn Helgaas',
	'Vince Weaver', 'Sonny Rao'


> On Thu, Aug 06, 2015 at 06:10:40PM +0000, Liang, Kan wrote:
> > The issue may be caused by uncore box initialization.
> >
> > To prevent potential problems with uncore box initialization, I
> > once moved uncore_box_init() out of driver initialization in
> > commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc.
> >
> > However, that caused crashes on some desktops, because the box
> > initialization code was moved into IPI context.
> >
> > To fix the crash, we had two choices at that time:
> >  - Simply revert the change. That is where
> >    15c1247953e8a45232ed5a5540f291d2d0a77665 comes from.
> >  - Move uncore_box_init() out of IPI context into uncore event
> >    init. I provided a patch for it: https://lkml.org/lkml/2015/4/28/21
> >    Stephane Eranian also verified it on his platform.
> >
> > At that time, we chose the first option. But it looks like there is
> > an issue with it now. I guess we may try the second option this time.
> >
> > Matthew,
> >
> > Could you please revert
> > 15c1247953e8a45232ed5a5540f291d2d0a77665
> > and apply the patch https://lkml.org/lkml/2015/4/26/294?
> > See if it works?
> 
> That patch is wrong though; how can you even publish a PMU which is not
> initialized?

It's initialized, just not during driver initialization.
We once encountered boot crashes caused by the uncore driver
trying to access non-existent boxes, and now this uncore
boot warning as well.
So I think it's better to move the box init code out of driver
initialization to prevent such potential boot failures.
Uncore event init should be a good place to do box init:
we initialize a box only when it has not been initialized yet
and the user tries to use an uncore event.

Thanks,
Kan
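
The on-demand scheme described above (initialize a box the first time an
event needs it, rather than at driver probe time) can be sketched
roughly as follows. This is only an illustration under stated
assumptions, not the proposed patch: struct box, box_hw_init(),
box_init_once() and event_init() are hypothetical names, the hardware
setup is reduced to a counter, and the sketch is single-threaded, so it
omits the locking or atomic flag that real kernel code would need.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an uncore box; not the kernel's struct. */
struct box {
	bool initialized;
	int init_calls;	/* counts hardware init runs, for illustration */
};

/* Placeholder for the real hardware setup (e.g. mapping registers). */
static void box_hw_init(struct box *b)
{
	b->init_calls++;
}

/* Initialize the box at most once, on first use. */
static void box_init_once(struct box *b)
{
	if (b->initialized)
		return;

	box_hw_init(b);
	b->initialized = true;
}

/*
 * Hypothetical event-init path: the box is set up here, when a user
 * first asks for an event on it, instead of during driver probe.
 */
static int event_init(struct box *b)
{
	box_init_once(b);
	return 0;
}
```

Calling event_init() repeatedly on the same box runs box_hw_init() only
once, and a box that no event ever touches is never initialized at all.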


* Re: [BUG]: Intel uncore boot warning introduced in 4.1
  2015-08-10 13:23       ` Liang, Kan
@ 2015-09-15 13:35         ` Josh Boyer
  0 siblings, 0 replies; 7+ messages in thread
From: Josh Boyer @ 2015-09-15 13:35 UTC
  To: Liang, Kan
  Cc: Peter Zijlstra, Matthew Leach, Ingo Molnar,
	linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org,
	eranian@google.com, Andi Kleen, Bjorn Helgaas, Vince Weaver,
	Sonny Rao

On Mon, Aug 10, 2015 at 9:23 AM, Liang, Kan <kan.liang@intel.com> wrote:
>
>> On Thu, Aug 06, 2015 at 06:10:40PM +0000, Liang, Kan wrote:
>> > The issue may be caused by uncore box initialization.
>> >
>> > To prevent potential problems with uncore box initialization, I
>> > once moved uncore_box_init() out of driver initialization in
>> > commit c05199e5a57a579fea1e8fa65e2b511ceb524ffc.
>> >
>> > However, that caused crashes on some desktops, because the box
>> > initialization code was moved into IPI context.
>> >
>> > To fix the crash, we had two choices at that time:
>> >  - Simply revert the change. That is where
>> >    15c1247953e8a45232ed5a5540f291d2d0a77665 comes from.
>> >  - Move uncore_box_init() out of IPI context into uncore event
>> >    init. I provided a patch for it: https://lkml.org/lkml/2015/4/28/21
>> >    Stephane Eranian also verified it on his platform.
>> >
>> > At that time, we chose the first option. But it looks like there is
>> > an issue with it now. I guess we may try the second option this time.
>> >
>> > Matthew,
>> >
>> > Could you please revert
>> > 15c1247953e8a45232ed5a5540f291d2d0a77665
>> > and apply the patch https://lkml.org/lkml/2015/4/26/294?
>> > See if it works?
>>
>> That patch is wrong though; how can you even publish a PMU which is not
>> initialized?
>
> It's initialized, just not during driver initialization.
> We once encountered boot crashes caused by the uncore driver
> trying to access non-existent boxes, and now this uncore
> boot warning as well.
> So I think it's better to move the box init code out of driver
> initialization to prevent such potential boot failures.
> Uncore event init should be a good place to do box init:
> we initialize a box only when it has not been initialized yet
> and the user tries to use an uncore event.

We're still getting reports of this in Fedora with 4.1.y kernels.  Was
there any resolution to this?

josh

