From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Jay Fenlason <fenlason@redhat.com>
Cc: Santosh Rastapur <santosh@chelsio.com>,
Divy Le Ray <divy@chelsio.com>,
"David S. Miller" <davem@davemloft.net>,
netdev@vger.kernel.org,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: BUG cxgb3: Check and handle the dma mapping errors
Date: Tue, 06 Aug 2013 10:15:55 +1000
Message-ID: <5200403B.3000206@ozlabs.ru>
In-Reply-To: <20130805184107.GA2998@redhat.com>

On 08/06/2013 04:41 AM, Jay Fenlason wrote:
> On Mon, Aug 05, 2013 at 12:59:04PM +1000, Alexey Kardashevskiy wrote:
>> Hi!
>>
>> Recently I started getting multiple errors like this:
>>
>> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
>> c000001fbdaaa882 npages 1
>> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
>> c000001fbdaaa882 npages 1
>> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
>> c000001fbdaaa882 npages 1
>> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
>> c000001fbdaaa882 npages 1
>> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
>> c000001fbdaaa882 npages 1
>> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
>> c000001fbdaaa882 npages 1
>> cxgb3 0006:01:00.0: iommu_alloc failed, tbl c000000003067980 vaddr
>> c000001fbdaaa882 npages 1
>> ... and so on
>>
>> This is all happening on a PPC64 "powernv" platform machine. To trigger the
>> error state, it is enough to _flood_ ping CXGB3 card from another machine
>> (which has Emulex 10Gb NIC + Cisco switch). Just do "ping -f 172.20.1.2"
>> and wait 10-15 seconds.
>>
>>
>> The messages are coming from arch/powerpc/kernel/iommu.c and basically
>> mean that the driver requested more pages than the DMA window has which is
>> normally 1GB (there could be another possible source of errors -
>> ppc_md.tce_build callback - but on powernv platform it always succeeds).
>>
>>
>> The patch after which it broke is:
>> commit f83331bab149e29fa2c49cf102c0cd8c3f1ce9f9
>> Author: Santosh Rastapur <santosh@chelsio.com>
>> Date: Tue May 21 04:21:29 2013 +0000
>> cxgb3: Check and handle the dma mapping errors
>>
>> Any quick ideas? Thanks!
>
> That patch adds error checking to detect failed dma mapping requests.
> Before it, the code always assumed that dma mapping requests succeeded,
> whether they actually do or not, so the fact that the older kernel
> does not log errors only means that the failures are being ignored,
> and any appearance of working is through pure luck. The machine could
> have just crashed at that point.
From what I see, the patch adds a map_skb() function which is called in two
new places, so the patch does not just mechanically replace
skb_frag_dma_map() with map_skb() or something like that.
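
For illustration only, here is a minimal userspace sketch of the
map-then-unwind pattern that a helper which maps a whole packet and
checks each mapping would need. It is not the driver's code and makes no
claim about what map_skb() actually does: struct dma_window,
window_map(), window_unmap(), map_packet(), unmap_packet() and the
8-slot window size are all hypothetical, invented for this sketch. A
fixed pool of slots stands in for the DMA window, and a failed mapping
releases whatever was already mapped for that packet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an IOMMU DMA window: a fixed pool of slots. */
#define WINDOW_SLOTS 8
#define MAP_ERROR (-1)

struct dma_window {
	unsigned char used[WINDOW_SLOTS];
};

/* Reserve one slot; return its index, or MAP_ERROR if the window is full. */
static int window_map(struct dma_window *w)
{
	for (int i = 0; i < WINDOW_SLOTS; i++) {
		if (!w->used[i]) {
			w->used[i] = 1;
			return i;
		}
	}
	return MAP_ERROR;
}

/* Release a slot that was previously returned by window_map(). */
static void window_unmap(struct dma_window *w, int slot)
{
	assert(slot >= 0 && slot < WINDOW_SLOTS && w->used[slot]);
	w->used[slot] = 0;
}

/* Count the slots currently in use. */
static int window_in_use(const struct dma_window *w)
{
	int n = 0;

	for (int i = 0; i < WINDOW_SLOTS; i++)
		n += w->used[i];
	return n;
}

/*
 * Map nparts pieces of one packet (head plus fragments) and record the
 * slots. On any failure, release everything mapped so far for this
 * packet and return -1, so a failed attempt does not consume slots.
 */
static int map_packet(struct dma_window *w, int *slots, int nparts)
{
	int i;

	for (i = 0; i < nparts; i++) {
		slots[i] = window_map(w);
		if (slots[i] == MAP_ERROR)
			goto unwind;
	}
	return 0;

unwind:
	while (i-- > 0)
		window_unmap(w, slots[i]);
	return -1;
}

/* Release all slots of a packet that map_packet() mapped successfully. */
static void unmap_packet(struct dma_window *w, const int *slots, int nparts)
{
	for (int i = 0; i < nparts; i++)
		window_unmap(w, slots[i]);
}
```

In this toy model, mapping a 5-part packet into an empty window
succeeds, a second 5-part packet fails and leaves the in-use count at 5,
and the second packet can be mapped once the first has been unmapped.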
> What is the observed behavior of the system by the machine initiating
> the ping flood? Do the older and newer kernels differ in the
> percentage of pings that do not receive replies?
The other machine stops receiving replies. It uses a different adapter
(not Chelsio), and its kernel version does not really matter.
> On the newer kernel,
> when the mapping errors are detected, the packet that it is trying to
> transmit is dropped, but I'm not at all sure what happens on the older
> kernel after the dma mapping fails. As I mentioned earlier, I'm
> surprised it does not crash. Perhaps the folks from Chelsio have a
> better idea what happens after a dma mapping error is ignored?
No kernel can bypass the platform's iommu_alloc() on ppc64/powernv, so if
there had been a problem, we would have seen these messages (and yes, the
kernel would have crashed).
--
Alexey