From: jerry <jerry.lilijun@huawei.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Wei Liu <wei.liu2@citrix.com>,
"qianhuibin@huawei.com" <qianhuibin@huawei.com>,
stefano.stabellini@eu.citrix.com, xiaowei.yang@huawei.com,
wangfuhai@huawei.com, qinchuanyu@huawei.com,
xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: netback BUG_ON when using copy_skb=1
Date: Tue, 22 Oct 2013 09:18:06 +0800
Message-ID: <5265D24E.2010609@huawei.com>
In-Reply-To: <525FEFF802000078000FBD13@nat28.tlf.novell.com>
On 2013/10/17 20:11, Jan Beulich wrote:
>>>> On 17.10.13 at 12:26, jerry <jerry.lilijun@huawei.com> wrote:
>> Hi Jan,
>
> please don't top post.
>
>> In my test, the grant table copy error may cause the VM to crash.
>> The stack is as follows:
>> kernel BUG at /linux/driver/redhat6.2/xen-vnif/xen-netfront.c:372!
>> ...
>> The BUG code in xen-netfront.c xennet_tx_buf_gc() is:
>>     if (unlikely(gnttab_query_foreign_access(
>>             np->grant_tx_ref[id]) != 0)) {
>>         printk(KERN_ALERT "xennet_tx_buf_gc: warning "
>>                "-- grant still in use by backend "
>>                "domain.\n");
>>         BUG();
>>
>> My guess at the reason is as follows:
>> 1) XEN: _set_status(), called in the __gnttab_copy() hypercall path and in
>> __acquire_grant_for_copy(), fails and the grant ref is not ended,
>> so the GTF_reading bit cannot be cleared.
>> 2) Netfront: this module hits the BUG when it finds that the GTF_reading bit
>> is still set.
>
> If that was the case, this would be a hypervisor bug: a grant copy
> operation is supposed to hold the grant active only for as long as
> the copy operation takes. You'll in particular notice that
> __acquire_grant_for_copy() in its error path clears GTF_reading
> (and GTF_writing, as appropriate) again. You'd likely need to
> instrument the code to demonstrate (via a couple of extra log
> messages) what you think is not working properly here.
I have verified that GTF_reading (or GTF_writing) is indeed cleared after
__gnttab_copy() returns.
So the question is where GTF_reading gets set.
Could the hypervisor be performing a grant copy operation while netfront in
the VM is calling xennet_tx_buf_gc()?
Any ideas?
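
To make the window I am asking about concrete, here is a minimal sketch in
plain C. It is a simplified model, not Xen or netfront code: grant_flags,
copy_begin(), copy_end() and query_foreign_access() are hypothetical names,
and only the GTF_reading/GTF_writing bit values are taken from Xen's public
grant_table.h. It assumes the flags are set for the duration of a copy and
cleared afterwards, as Jan described.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as in Xen's public grant_table.h. */
#define GTF_reading (1U << 3)
#define GTF_writing (1U << 4)

/* Hypothetical stand-in for the flags word of one shared grant entry. */
static uint16_t grant_flags;

/* Models the hypervisor marking the grant as in use when a copy starts. */
static void copy_begin(int readonly)
{
    grant_flags |= readonly ? GTF_reading : (GTF_reading | GTF_writing);
}

/* Models the hypervisor releasing the grant when the copy finishes. */
static void copy_end(void)
{
    grant_flags = (uint16_t)(grant_flags & ~(GTF_reading | GTF_writing));
}

/* Models the frontend check: nonzero means "still in use by backend". */
static int query_foreign_access(void)
{
    return grant_flags & (GTF_reading | GTF_writing);
}
```

In this model, a query made between copy_begin() and copy_end() returns
nonzero even though the flags are cleared correctly once the copy ends. If
the frontend check can run in that window, it would trip the BUG without any
flag being leaked.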
>
> Jan
>
>> On 2013/10/17 16:00, Jan Beulich wrote:
>>>>>> On 17.10.13 at 09:41, jerry <jerry.lilijun@huawei.com> wrote:
>>>> But there may still be concurrency problems in my test.
>>>> If the page replacing in copy_pending_req() was done after
>>>> netif_get_page_ext() in netbk_gop_frag(), copy_gop->flags is wrongly marked
>>>> with GNTCOPY_source_gref.
>>>> Here the memory of that page in skb has been replaced with Dom0 local
>>>> memory, so the later HYPERVISOR_multicall() with GNTTABOP_copy in
>>>> netbk_rx_actions() will get errors.
>>>> The message is shown as:
>>>>
>>>> (XEN) grant_table.c:305:d0 Bad flags (0) or dom (0). (expected dom 0)
>>>>
>>>> Would you like to share some opinions?
>>>
>>> At a first glance that seems possible, but the question is - does it
>>> cause any problems other than the quoted message to be issued
>>> (and the problematic packet getting re-transmitted)? I'm asking
>>> mainly because fixing this would appear to imply adding locking to
>>> these paths - with the risk of adversely affecting performance.
>>>
>>> Jan
>>>