public inbox for linux-kernel@vger.kernel.org
* Re: [PATCH] DCA: fix over-warning in ioat3_dca_init
       [not found] <1399542166-42985-1-git-send-email-jet.chen@intel.com>
@ 2014-05-08 15:04 ` Alexander Duyck
  2014-05-08 15:28   ` Jet Chen
From: Alexander Duyck @ 2014-05-08 15:04 UTC
  To: Jet Chen, Dan Williams, Dave Jiang
  Cc: fengguang.wu, wei.w.lin, linux-kernel, dmaengine

I actually went to a bit of trouble to get this message added as many
BIOSes have the annoying quality of getting this wrong, and then
products are shipped and labeled as having the DCA feature when they
actually don't.  One easy way to get rid of the message is to disable
either DCA or IOAT in the BIOS since it is broken anyway.  By adding
this we at least have some visibility and it puts pressure on the BIOS
guys to get this fixed if we want to claim the platform does DCA.

I consider this to be a real BIOS bug as the DCA feature is crippled
without it.  Also moving this to just a debug message is going to make
it very difficult for us to debug this when a performance issue comes up
on a customer platform as we will have to get them to perform extra
steps in order to actually figure out what is going on with DCA.

Thanks,

Alex

On 05/08/2014 02:42 AM, Jet Chen wrote:
> We keep seeing dmesg messages like the following on our boxes:
>
> [   16.596610] WARNING: CPU: 0 PID: 457 at drivers/dma/ioat/dca.c:697 ioat3_dca_init+0x19c/0x1b0 [ioatdma]()
> [   16.609614] ioatdma 0000:00:04.0: APICID_TAG_MAP set incorrectly by BIOS, disabling DCA
> ...
> [   16.892058]  [<ffffffff8172807e>] dump_stack+0x4d/0x66
> [   16.892061]  [<ffffffff81067f7d>] warn_slowpath_common+0x7d/0xa0
> [   16.892064]  [<ffffffff81068034>] warn_slowpath_fmt_taint+0x44/0x50
> [   16.892065]  [<ffffffffa00228bc>] ioat3_dca_init+0x19c/0x1b0 [ioatdma]
> [   16.892069]  [<ffffffffa0021cd6>] ioat3_dma_probe+0x386/0x3e0 [ioatdma]
> [   16.892071]  [<ffffffffa001a192>] ioat_pci_probe+0x122/0x1b0 [ioatdma]
> [   16.892074]  [<ffffffff81329385>] local_pci_probe+0x45/0xa0
> [   16.892076]  [<ffffffff81080d34>] work_for_cpu_fn+0x14/0x20
> [   16.892077]  [<ffffffff81083c33>] process_one_work+0x183/0x490
> [   16.892079]  [<ffffffff81084bd3>] worker_thread+0x2a3/0x410
> [   16.892080]  [<ffffffff81084930>] ? rescuer_thread+0x410/0x410
> [   16.892081]  [<ffffffff8108b852>] kthread+0xd2/0xf0
> [   16.892083]  [<ffffffff8108b780>] ? kthread_create_on_node+0x180/0x180
> [   16.892085]  [<ffffffff817396bc>] ret_from_fork+0x7c/0xb0
> [   16.892091] fbcon: mgadrmfb (fb0) is primary device
> [   16.892092]  [<ffffffff8108b780>] ? kthread_create_on_node+0x180/0x180
>
> There is no need to use WARN_TAINT_ONCE to generate such a big noise if this is not a critical error for the kernel. The DCA driver could print a debug message and then quit quietly.
>
> If this is a real BIOS bug, please ignore this patch. Let's transfer this issue to the BIOS guys.
>
> Signed-off-by: Jet Chen <jet.chen@intel.com>
> ---
>  drivers/dma/ioat/dca.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/dma/ioat/dca.c b/drivers/dma/ioat/dca.c
> index 9e84d5b..c0f7971 100644
> --- a/drivers/dma/ioat/dca.c
> +++ b/drivers/dma/ioat/dca.c
> @@ -470,10 +470,7 @@ struct dca_provider *ioat2_dca_init(struct pci_dev *pdev, void __iomem *iobase)
>  	}
>  
>  	if (!dca2_tag_map_valid(ioatdca->tag_map)) {
> -		WARN_TAINT_ONCE(1, TAINT_FIRMWARE_WORKAROUND,
> -				"%s %s: APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n",
> -				dev_driver_string(&pdev->dev),
> -				dev_name(&pdev->dev));
> +		dev_dbg(&pdev->dev, "APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n");
>  		free_dca_provider(dca);
>  		return NULL;
>  	}
> @@ -691,10 +688,7 @@ struct dca_provider *ioat3_dca_init(struct pci_dev *pdev, void __iomem *iobase)
>  	}
>  
>  	if (dca3_tag_map_invalid(ioatdca->tag_map)) {
> -		WARN_TAINT_ONCE(1, TAINT_FIRMWARE_WORKAROUND,
> -				"%s %s: APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n",
> -				dev_driver_string(&pdev->dev),
> -				dev_name(&pdev->dev));
> +		dev_dbg(&pdev->dev, "APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n");
>  		free_dca_provider(dca);
>  		return NULL;
>  	}
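
The trade-off discussed in this message can be illustrated with a minimal userspace sketch in C. It is not kernel code: `struct logger`, `log_msg`, and `warn_once` are hypothetical names that only model the behaviour, assuming that debug-level messages stay hidden unless someone enables them, while a warn-once message is always shown the first time it fires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for log levels; not the kernel's definitions. */
enum log_level { LOG_WARN, LOG_DEBUG };

struct logger {
	bool debug_enabled;   /* models whether debug output was turned on */
	bool warned_once;     /* models the "once" latch of a warn-once helper */
	int visible_messages; /* number of messages that reached the log */
};

/* Emit a message; debug messages are dropped unless explicitly enabled. */
static bool log_msg(struct logger *lg, enum log_level lvl, const char *msg)
{
	assert(lg && msg);
	if (lvl == LOG_DEBUG && !lg->debug_enabled)
		return false;
	printf("%s: %s\n", lvl == LOG_WARN ? "WARNING" : "debug", msg);
	lg->visible_messages++;
	return true;
}

/* Warn at most once per logger; later calls are silently ignored. */
static bool warn_once(struct logger *lg, const char *msg)
{
	assert(lg && msg);
	if (lg->warned_once)
		return false;
	lg->warned_once = true;
	return log_msg(lg, LOG_WARN, msg);
}
```

In this model, a machine with a bad tag map shows exactly one warning by default and nothing at debug level, which is the visibility concern raised in the reply above.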


* Re: [PATCH] DCA: fix over-warning in ioat3_dca_init
  2014-05-08 15:04 ` [PATCH] DCA: fix over-warning in ioat3_dca_init Alexander Duyck
@ 2014-05-08 15:28   ` Jet Chen
  2014-05-08 15:57     ` Alexander Duyck
From: Jet Chen @ 2014-05-08 15:28 UTC
  To: Alexander Duyck, Dan Williams, Dave Jiang
  Cc: fengguang.wu, wei.w.lin, linux-kernel, dmaengine

I agree with your opinion that this is a real BIOS bug and that the warning puts pressure on the BIOS guys to get it fixed. However, this warning message interferes with our kernel boot tests and kernel performance tests. We have to disable CONFIG_INTEL_IOATDMA in our kconfig until this issue gets fixed. Until then, the code under CONFIG_INTEL_IOATDMA will not be validated in our testing system :(.
I hope this issue gets fixed soon.

Thanks,
Jet

On 05/08/2014 11:04 PM, Alexander Duyck wrote:
> I actually went to a bit of trouble to get this message added as many
> BIOSes have the annoying quality of getting this wrong, and then
> products are shipped and labeled as having the DCA feature when they
> actually don't.  One easy way to get rid of the message is to disable
> either DCA or IOAT in the BIOS since it is broken anyway.  By adding
> this we at least have some visibility and it puts pressure on the BIOS
> guys to get this fixed if we want to claim the platform does DCA.
> 
> I consider this to be a real BIOS bug as the DCA feature is crippled
> without it.  Also moving this to just a debug message is going to make
> it very difficult for us to debug this when a performance issue comes up
> on a customer platform as we will have to get them to perform extra
> steps in order to actually figure out what is going on with DCA.
> 
> Thanks,
> 
> Alex
> 
> On 05/08/2014 02:42 AM, Jet Chen wrote:
>> We keep seeing dmesg messages like the following on our boxes:
>>
>> [   16.596610] WARNING: CPU: 0 PID: 457 at drivers/dma/ioat/dca.c:697 ioat3_dca_init+0x19c/0x1b0 [ioatdma]()
>> [   16.609614] ioatdma 0000:00:04.0: APICID_TAG_MAP set incorrectly by BIOS, disabling DCA
>> ...
>> [   16.892058]  [<ffffffff8172807e>] dump_stack+0x4d/0x66
>> [   16.892061]  [<ffffffff81067f7d>] warn_slowpath_common+0x7d/0xa0
>> [   16.892064]  [<ffffffff81068034>] warn_slowpath_fmt_taint+0x44/0x50
>> [   16.892065]  [<ffffffffa00228bc>] ioat3_dca_init+0x19c/0x1b0 [ioatdma]
>> [   16.892069]  [<ffffffffa0021cd6>] ioat3_dma_probe+0x386/0x3e0 [ioatdma]
>> [   16.892071]  [<ffffffffa001a192>] ioat_pci_probe+0x122/0x1b0 [ioatdma]
>> [   16.892074]  [<ffffffff81329385>] local_pci_probe+0x45/0xa0
>> [   16.892076]  [<ffffffff81080d34>] work_for_cpu_fn+0x14/0x20
>> [   16.892077]  [<ffffffff81083c33>] process_one_work+0x183/0x490
>> [   16.892079]  [<ffffffff81084bd3>] worker_thread+0x2a3/0x410
>> [   16.892080]  [<ffffffff81084930>] ? rescuer_thread+0x410/0x410
>> [   16.892081]  [<ffffffff8108b852>] kthread+0xd2/0xf0
>> [   16.892083]  [<ffffffff8108b780>] ? kthread_create_on_node+0x180/0x180
>> [   16.892085]  [<ffffffff817396bc>] ret_from_fork+0x7c/0xb0
>> [   16.892091] fbcon: mgadrmfb (fb0) is primary device
>> [   16.892092]  [<ffffffff8108b780>] ? kthread_create_on_node+0x180/0x180
>>
>> There is no need to use WARN_TAINT_ONCE to generate such a big noise if this is not a critical error for the kernel. The DCA driver could print a debug message and then quit quietly.
>>
>> If this is a real BIOS bug, please ignore this patch. Let's transfer this issue to the BIOS guys.
>>
>> Signed-off-by: Jet Chen <jet.chen@intel.com>
>> ---
>>  drivers/dma/ioat/dca.c | 10 ++--------
>>  1 file changed, 2 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/dma/ioat/dca.c b/drivers/dma/ioat/dca.c
>> index 9e84d5b..c0f7971 100644
>> --- a/drivers/dma/ioat/dca.c
>> +++ b/drivers/dma/ioat/dca.c
>> @@ -470,10 +470,7 @@ struct dca_provider *ioat2_dca_init(struct pci_dev *pdev, void __iomem *iobase)
>>  	}
>>  
>>  	if (!dca2_tag_map_valid(ioatdca->tag_map)) {
>> -		WARN_TAINT_ONCE(1, TAINT_FIRMWARE_WORKAROUND,
>> -				"%s %s: APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n",
>> -				dev_driver_string(&pdev->dev),
>> -				dev_name(&pdev->dev));
>> +		dev_dbg(&pdev->dev, "APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n");
>>  		free_dca_provider(dca);
>>  		return NULL;
>>  	}
>> @@ -691,10 +688,7 @@ struct dca_provider *ioat3_dca_init(struct pci_dev *pdev, void __iomem *iobase)
>>  	}
>>  
>>  	if (dca3_tag_map_invalid(ioatdca->tag_map)) {
>> -		WARN_TAINT_ONCE(1, TAINT_FIRMWARE_WORKAROUND,
>> -				"%s %s: APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n",
>> -				dev_driver_string(&pdev->dev),
>> -				dev_name(&pdev->dev));
>> +		dev_dbg(&pdev->dev, "APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n");
>>  		free_dca_provider(dca);
>>  		return NULL;
>>  	}
> 

* Re: [PATCH] DCA: fix over-warning in ioat3_dca_init
  2014-05-08 15:28   ` Jet Chen
@ 2014-05-08 15:57     ` Alexander Duyck
  2014-05-08 16:13       ` Jiang, Dave
From: Alexander Duyck @ 2014-05-08 15:57 UTC
  To: Jet Chen, Dan Williams, Dave Jiang
  Cc: fengguang.wu, wei.w.lin, linux-kernel, dmaengine

On 05/08/2014 08:28 AM, Jet Chen wrote:
> I agree with your opinion that this is a real BIOS bug and that the warning puts pressure on the BIOS guys to get it fixed. However, this warning message interferes with our kernel boot tests and kernel performance tests. We have to disable CONFIG_INTEL_IOATDMA in our kconfig until this issue gets fixed. Until then, the code under CONFIG_INTEL_IOATDMA will not be validated in our testing system :(.
> I hope this issue gets fixed soon.
> 
> Thanks,
> Jet
> 

First I would recommend updating your BIOS.  If the updated BIOS also
has the issue I would recommend taking this feedback to whoever provided
the BIOS for your platform so that they can implement the fix.

If I am not mistaken some BIOSes have the option to disable DCA and/or
IOATDMA.  You might want to check yours to see if you can just disable
DCA on your platform until the issue can be resolved.

Thanks,

Alex


* Re: [PATCH] DCA: fix over-warning in ioat3_dca_init
  2014-05-08 15:57     ` Alexander Duyck
@ 2014-05-08 16:13       ` Jiang, Dave
  2014-05-08 16:30         ` Jet Chen
From: Jiang, Dave @ 2014-05-08 16:13 UTC
  To: Duyck, Alexander H
  Cc: Williams, Dan J, linux-kernel@vger.kernel.org, Wu, Fengguang,
	dmaengine@vger.kernel.org, Lin, Wei W, Chen, Jet


On Thu, 2014-05-08 at 08:57 -0700, Alexander Duyck wrote:
> On 05/08/2014 08:28 AM, Jet Chen wrote:
> > I agree with your opinion that this is a real BIOS bug and that the warning puts pressure on the BIOS guys to get it fixed. However, this warning message interferes with our kernel boot tests and kernel performance tests. We have to disable CONFIG_INTEL_IOATDMA in our kconfig until this issue gets fixed. Until then, the code under CONFIG_INTEL_IOATDMA will not be validated in our testing system :(.
> > I hope this issue gets fixed soon.
> > 
> > Thanks,
> > Jet
> > 
> 
> First I would recommend updating your BIOS.  If the updated BIOS also
> has the issue I would recommend taking this feedback to whoever provided
> the BIOS for your platform so that they can implement the fix.
> 
> If I am not mistaken some BIOSes have the option to disable DCA and/or
> IOATDMA.  You might want to check yours to see if you can just disable
> DCA on your platform until the issue can be resolved.

Disabling DCA is the preferred option. IOATDMA is functional without
DCA.

Jet,
What exactly are you attempting to test with IOATDMA? The only two
consumers of this DMA driver I know of are MDRAID and NTB. But support
for XOR/PQ ops on Xeon platforms has been removed recently for various
reasons, so it really is just NTB at the moment in the latest
kernels.

> Thanks,
> 
> Alex
> 

* Re: [PATCH] DCA: fix over-warning in ioat3_dca_init
  2014-05-08 16:13       ` Jiang, Dave
@ 2014-05-08 16:30         ` Jet Chen
From: Jet Chen @ 2014-05-08 16:30 UTC
  To: Jiang, Dave, Duyck, Alexander H
  Cc: Williams, Dan J, linux-kernel@vger.kernel.org, Wu, Fengguang,
	dmaengine@vger.kernel.org, Lin, Wei W

On 05/09/2014 12:13 AM, Jiang, Dave wrote:
> On Thu, 2014-05-08 at 08:57 -0700, Alexander Duyck wrote:
>> On 05/08/2014 08:28 AM, Jet Chen wrote:
>>> I agree with your opinion that this is a real BIOS bug and that the warning puts pressure on the BIOS guys to get it fixed. However, this warning message interferes with our kernel boot tests and kernel performance tests. We have to disable CONFIG_INTEL_IOATDMA in our kconfig until this issue gets fixed. Until then, the code under CONFIG_INTEL_IOATDMA will not be validated in our testing system :(.
>>> I hope this issue gets fixed soon.
>>>
>>> Thanks,
>>> Jet
>>>
>>
>> First I would recommend updating your BIOS.  If the updated BIOS also
>> has the issue I would recommend taking this feedback to whoever provided
>> the BIOS for your platform so that they can implement the fix.
>>
>> If I am not mistaken some BIOSes have the option to disable DCA and/or
>> IOATDMA.  You might want to check yours to see if you can just disable
>> DCA on your platform until the issue can be resolved.
> 
> Disabling DCA is the preferred option. IOATDMA is functional without
> DCA.
> 
> Jet,
> What exactly are you attempting to test with IOATDMA? The only two
> consumers of this DMA driver I know of are MDRAID and NTB. But support
> for XOR/PQ ops on Xeon platforms has been removed recently for various
> reasons, so it really is just NTB at the moment in the latest
> kernels.

We are running LKP to test kernel boot and performance. You can find more information at https://01.org/lkp/
This issue shows up in our boot testing with certain kconfigs and impacts many of our test machines. It is difficult to update the BIOS on all test boxes. Besides, we cannot yet be sure that every test box model has a working BIOS version available. We will consider disabling DCA, as you suggested.

Thanks,
Jet

> 
>> Thanks,
>>
>> Alex
>>
