xen-devel.lists.xenproject.org archive mirror
From: jerry <jerry.lilijun@huawei.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Wei Liu <wei.liu2@citrix.com>,
	"qianhuibin@huawei.com" <qianhuibin@huawei.com>,
	stefano.stabellini@eu.citrix.com, xiaowei.yang@huawei.com,
	wangfuhai@huawei.com, qinchuanyu@huawei.com,
	xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: netback BUG_ON when using copy_skb=1
Date: Tue, 22 Oct 2013 09:18:06 +0800
Message-ID: <5265D24E.2010609@huawei.com>
In-Reply-To: <525FEFF802000078000FBD13@nat28.tlf.novell.com>

On 2013/10/17 20:11, Jan Beulich wrote:
>>>> On 17.10.13 at 12:26, jerry <jerry.lilijun@huawei.com> wrote:
>> Hi Jan,
> 
> please don't top post.
> 
>> In my test, the grant table copy error may cause the VM to crash.
>> The stack is as follows:
>> kernel BUG at /linux/driver/redhat6.2/xen-vnif/xen-netfront.c:372!
>> ...
>> The BUG code in xen-netfront.c xennet_tx_buf_gc() is:
>> 			if (unlikely(gnttab_query_foreign_access(
>> 				np->grant_tx_ref[id]) != 0)) {
>> 				printk(KERN_ALERT "xennet_tx_buf_gc: warning "
>> 				       "-- grant still in use by backend "
>> 				       "domain.\n");
>> 				BUG();
>>
>> My guess at the reason is as follows:
>> 1) XEN: _set_status(), called from __gnttab_copy() and
>> __acquire_grant_for_copy() in the hypercall path, fails, and access to the
>> grant ref is never ended. So the GTF_reading bit cannot be cleared.
>> 2) Netfront: this module hits BUG() when it finds that the GTF_reading bit
>> is still set.
> 
> If that was the case, this would be a hypervisor bug: a grant copy
> operation is supposed to hold the grant active only for as long as
> the copy operation takes. You'll in particular notice that
> __acquire_grant_for_copy() in its error path clears GTF_reading
> (and GTF_writing, as appropriate) again. You'd likely need to
> instrument the code to demonstrate (via a couple of extra log
> messages) what you think is not working properly here.

I have verified that GTF_reading and GTF_writing are indeed cleared after __gnttab_copy() returns.
So the question is where GTF_reading gets set.
Could the hypervisor be performing a grant copy operation while netfront in the VM is calling xennet_tx_buf_gc()?
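
To make the question concrete, here is a minimal toy model of the timing window being asked about. It is not Xen or netfront code: struct toy_grant and all toy_* helpers are made up for illustration, the two flag values are assumed to match the public grant_table.h, and the in-use test only imitates the kind of check that gnttab_query_foreign_access() performs. It merely shows that a check made between "copy begins" and "copy ends" sees GTF_reading set, while a check made afterwards does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match Xen's public grant_table.h (bits 3 and 4). */
#define GTF_reading (1U << 3)
#define GTF_writing (1U << 4)

/* Hypothetical stand-in for one shared grant entry. */
struct toy_grant {
    uint16_t flags;
};

/* Models the hypervisor marking the grant active when a copy starts. */
static void toy_copy_begin(struct toy_grant *g)
{
    g->flags |= GTF_reading;
}

/* Models the release at the end of the copy (or on its error path). */
static void toy_copy_end(struct toy_grant *g)
{
    g->flags &= (uint16_t)~GTF_reading;
}

/* Models the frontend's "still in use by backend" test. */
static bool toy_grant_in_use(const struct toy_grant *g)
{
    return (g->flags & (GTF_reading | GTF_writing)) != 0;
}

/* The frontend check lands inside the copy window. */
static bool toy_gc_check_during_copy(void)
{
    struct toy_grant g = { 0 };
    bool seen;

    toy_copy_begin(&g);
    seen = toy_grant_in_use(&g);   /* check races with the copy */
    toy_copy_end(&g);
    return seen;
}

/* The frontend check lands after the copy has finished. */
static bool toy_gc_check_after_copy(void)
{
    struct toy_grant g = { 0 };

    toy_copy_begin(&g);
    toy_copy_end(&g);
    return toy_grant_in_use(&g);
}
```

In this model toy_gc_check_during_copy() reports the grant as in use and toy_gc_check_after_copy() does not, even though the flag is always cleared once the copy ends. Whether the real code paths can interleave like this is exactly the open question.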

Any ideas?
> 
> Jan
> 
>> On 2013/10/17 16:00, Jan Beulich wrote:
>>>>>> On 17.10.13 at 09:41, jerry <jerry.lilijun@huawei.com> wrote:
>>>> But there may still be concurrency problems in my test.
>>>> If the page replacement in copy_pending_req() happens after
>>>> netif_get_page_ext() in netbk_gop_frag(), copy_gop->flags is wrongly marked
>>>> with GNTCOPY_source_gref.
>>>> By then the memory of that page in the skb has been replaced with Dom0-local
>>>> memory, so the later HYPERVISOR_multicall() with GNTTABOP_copy in
>>>> netbk_rx_actions() fails.
>>>> The message shown is:
>>>>
>>>> (XEN) grant_table.c:305:d0 Bad flags (0) or dom (0). (expected dom 0)
>>>>
>>>> Could you share your opinion on this?
>>>
>>> At a first glance that seems possible, but the question is - does it
>>> cause any problems other than the quoted message to be issued
>>> (and the problematic packet getting re-transmitted)? I'm asking
>>> mainly because fixing this would appear to imply adding locking to
>>> these paths - with the risk of adversely affecting performance.
>>>
>>> Jan
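
For reference, the interleaving described in the quoted message of 17.10 (the page being replaced in copy_pending_req() after the check in netbk_gop_frag() has already chosen GNTCOPY_source_gref) can be sketched as a plain check-then-use race. This is only a toy model under stated assumptions: the toy_* types and functions are hypothetical, and "the copy fails when the descriptor no longer matches the page" is a simplification of what the hypervisor reports with the "Bad flags" message.

```c
#include <stdbool.h>

/* Hypothetical page state: still a foreign (granted) page, or Dom0-local. */
enum toy_owner { TOY_FOREIGN, TOY_LOCAL };

struct toy_page {
    enum toy_owner owner;
};

/* Hypothetical copy descriptor; the flag stands in for GNTCOPY_source_gref. */
struct toy_copy_op {
    bool source_is_gref;
};

/* Stands in for the netif_get_page_ext() check in netbk_gop_frag(). */
static void toy_prepare_copy(const struct toy_page *pg, struct toy_copy_op *op)
{
    op->source_is_gref = (pg->owner == TOY_FOREIGN);
}

/* Stands in for the page replacement done by copy_pending_req(). */
static void toy_replace_with_local(struct toy_page *pg)
{
    pg->owner = TOY_LOCAL;
}

/* The copy succeeds only if the descriptor still matches the page. */
static bool toy_issue_copy(const struct toy_page *pg,
                           const struct toy_copy_op *op)
{
    return op->source_is_gref == (pg->owner == TOY_FOREIGN);
}

/* Runs prepare -> (optional replacement) -> copy; returns copy success. */
static bool toy_run_rx(bool replace_between)
{
    struct toy_page pg = { TOY_FOREIGN };
    struct toy_copy_op op;

    toy_prepare_copy(&pg, &op);
    if (replace_between)
        toy_replace_with_local(&pg);   /* the racy window */
    return toy_issue_copy(&pg, &op);
}
```

Here toy_run_rx(false) succeeds and toy_run_rx(true) fails, which matches the failure described above. Closing the window in the real code would presumably need the locking that Jan mentions, with the performance risk he points out.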
