From: jerry <jerry.lilijun@huawei.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Wei Liu <wei.liu2@citrix.com>,
"qianhuibin@huawei.com" <qianhuibin@huawei.com>,
stefano.stabellini@eu.citrix.com, xiaowei.yang@huawei.com,
wangfuhai@huawei.com, qinchuanyu@huawei.com,
xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: netback BUG_ON when using copy_skb=1
Date: Tue, 22 Oct 2013 09:18:06 +0800
Message-ID: <5265D24E.2010609@huawei.com>
In-Reply-To: <525FEFF802000078000FBD13@nat28.tlf.novell.com>
On 2013/10/17 20:11, Jan Beulich wrote:
>>>> On 17.10.13 at 12:26, jerry <jerry.lilijun@huawei.com> wrote:
>> Hi Jan,
>
> please don't top post.
>
>> In my test, the grant table copy error may cause the VM to crash.
>> The stack is as follows:
>> kernel BUG at /linux/driver/redhat6.2/xen-vnif/xen-netfront.c:372!
>> ...
>> The BUG code in xen-netfront.c xennet_tx_buf_gc() is:
>>     if (unlikely(gnttab_query_foreign_access(
>>             np->grant_tx_ref[id]) != 0)) {
>>         printk(KERN_ALERT "xennet_tx_buf_gc: warning "
>>                "-- grant still in use by backend "
>>                "domain.\n");
>>         BUG();
>>
>> My guess is that the cause is as follows:
>> 1) Xen: _set_status(), called from the __gnttab_copy() hypercall path and
>> from __acquire_grant_for_copy(), fails and the grant reference is not
>> released, so the GTF_reading bit is never cleared.
>> 2) Netfront: the module hits the BUG() when it finds that the GTF_reading
>> bit is still set.
>
> If that was the case, this would be a hypervisor bug: a grant copy
> operation is supposed to hold the grant active only for as long as
> the copy operation takes. You'll in particular notice that
> __acquire_grant_for_copy() in its error path clears GTF_reading
> (and GTF_writing, as appropriate) again. You'd likely need to
> instrument the code to demonstrate (via a couple of extra log
> messages) what you think is not working properly here.
I have verified that GTF_reading and GTF_writing are indeed cleared after
__gnttab_copy() returns.
So the question is where GTF_reading gets set.
Could the hypervisor be performing a grant copy operation at the same time
the VM's netfront is calling xennet_tx_buf_gc()?
Any ideas?
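
To make the window I suspect more concrete, here is a minimal user-space
model. It is only a sketch, not hypervisor or netfront code: the struct and
all model_* names are made up for illustration, and it assumes (per Jan's
description) that the grant is held only for the duration of the copy. The
GTF_* values are the ones from Xen's public grant_table.h.

```c
#include <stdint.h>
#include <stdbool.h>

/* Flag values as defined in Xen's public grant_table.h. */
#define GTF_permit_access (1U << 0)
#define GTF_readonly      (1U << 2)
#define GTF_reading       (1U << 3)
#define GTF_writing       (1U << 4)

/* Hypothetical stand-in for the flags word of one v1 grant entry. */
struct model_grant {
    uint16_t flags;
};

/* Models the hypervisor marking the grant as in use when a copy starts. */
static void model_copy_begin(struct model_grant *g, bool readonly)
{
    g->flags |= GTF_reading;
    if (!readonly)
        g->flags |= GTF_writing;
}

/* Models the hypervisor releasing the grant once the copy has finished. */
static void model_copy_end(struct model_grant *g)
{
    g->flags &= (uint16_t)~(GTF_reading | GTF_writing);
}

/* Models the check netfront relies on: non-zero means "still in use". */
static int model_query_foreign_access(const struct model_grant *g)
{
    return g->flags & (GTF_reading | GTF_writing);
}
```

In this model, a query made between model_copy_begin() and model_copy_end()
returns non-zero, while a query made before or after returns zero. If the
frontend's check can land inside that interval, it would see the grant as
still in use even though the flags are cleared correctly afterwards.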
>
> Jan
>
>> On 2013/10/17 16:00, Jan Beulich wrote:
>>>>>> On 17.10.13 at 09:41, jerry <jerry.lilijun@huawei.com> wrote:
>>>> But there may still be concurrency problems in my test.
>>>> If the page replacement in copy_pending_req() happens after
>>>> netif_get_page_ext() in netbk_gop_frag(), copy_gop->flags is wrongly marked
>>>> with GNTCOPY_source_gref.
>>>> By then the page in the skb has been replaced with Dom0-local
>>>> memory, so the later HYPERVISOR_multicall() with GNTTABOP_copy in
>>>> netbk_rx_actions() fails.
>>>> The message shown is:
>>>>
>>>> (XEN) grant_table.c:305:d0 Bad flags (0) or dom (0). (expected dom 0)
>>>>
>>>> Do you have any thoughts on this?
>>>
>>> At a first glance that seems possible, but the question is - does it
>>> cause any problems other than the quoted message to be issued
>>> (and the problematic packet getting re-transmitted)? I'm asking
>>> mainly because fixing this would appear to imply adding locking to
>>> these paths - with the risk of adversely affecting performance.
>>>
>>> Jan