From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Chentao(Boby)" <boby.chen@huawei.com>
Subject: Re: xen-blkback unmap with network retansmission will
 cause a coredump
Date: Tue, 23 Sep 2014 21:36:48 +0800
Message-ID: <54217770.9060704@huawei.com>
References: <541D5D8C.8020604@huawei.com> <541FF5D2.8030002@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <boby.chen@huawei.com>) id 1XWQHG-0004Ty-Ce
	for xen-devel@lists.xenproject.org; Tue, 23 Sep 2014 13:37:10 +0000
In-Reply-To: <541FF5D2.8030002@citrix.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: David Vrabel <david.vrabel@citrix.com>, "konrad.wilk" <konrad.wilk@oracle.com>, =?ISO-8859-1?Q?Roger_Pau_Monn=E9?= <roger.pau@citrix.com>
Cc: meiwanlong@huawei.com, mu.muyang@huawei.com, Yanqiangjun <yanqiangjun@huawei.com>, liuyongan@huawei.com, huangzhichao@huawei.com, xen-devel@lists.xenproject.org, dengguoqiang@huawei.com, zhangmin <rudy.zhangmin@huawei.com>, wu.wubin@huawei.com
List-Id: xen-devel@lists.xenproject.org



On 2014/9/22 18:11, David Vrabel wrote:
> On 20/09/14 11:57, Chentao(Boby) wrote:
>> Hi konrad and roger,
>>
>> When the xen-blkback module performs an unmap operation while an skb
>> queued for network retransmission still references the mapped page,
>> the host OS crashes.
>>
>> The crash stack looks like this:
>> <ffffffff8041133e>{do_page_fault+0x38e}
>> <ffffffff8040d9e8>{page_fault+0x28}
>> <ffffffff80223cdb>{memcpy+0xb}
>> <ffffffff802325c2>{swiotlb_tbl_map_single+0x212}
>> <ffffffff8023274a>{swiotlb_map_page+0x17a}
>> <ffffffffa03468e6>{tg3:tg3_start_xmit+0x656}
>> <ffffffff80354d14>{dev_hard_start_xmit+0x334}
>> <ffffffff803721be>{sch_direct_xmit+0x1ae}
> 
> What dom0 (backend) kernel are you using?  Which backend and what storage?
> 
The dom0 kernel is SUSE 11 SP3, 3.0.93-0.8-xen. The backend is xen-blkback and the storage is an IP SAN.

>> I searched the web and found that Citrix engineers hit this problem a
>> long time ago. As I understand it, they solved it by modifying the
>> kernel stack. Because that modification is very large, the Linux
>> kernel community has not accepted it so far. I have a tentative
>> idea: in the dispatch_rw_block_io function, if the I/O is a write
>> operation, use the grant copy hypercall instead of the grant map
>> hypercall. I tested this modification and it solves the problem.
> 
> Switching to grant copy will reduce performance significantly in many cases.
> 
> This was fixed for user space backends by replacing the foreign mapping
> with a mapping of a scratch page, when unmapping the grant.
> 
> Something similar should be done for kernel-only foreign mappings.  This
> requires a GNTOP_unmap_and_duplicate hypercall sub-op to allow efficient
> batching.
> 
> David
> 
Thanks for your reply, David.
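
To make the three options in this thread concrete (plain unmap, grant copy for writes, and replacing the foreign mapping with a scratch page on unmap), here is a toy user-space model in Python. It is only a sketch of the idea, not kernel code: GrantSlot, SCRATCH_PAGE, UnmappedPageError and every function name below are hypothetical and do not correspond to real Xen or Linux interfaces. A slot whose page is None stands in for an address with no mapping, and reading it raises an exception in place of the page fault seen in the crash stack.

```python
PAGE_SIZE = 4096


class UnmappedPageError(Exception):
    """Stands in for the page fault taken when touching an unmapped page."""


class GrantSlot:
    """One backend address slot that may point at a guest (foreign) page."""

    def __init__(self):
        self.page = None  # None models "nothing mapped at this address"

    def read(self):
        if self.page is None:
            raise UnmappedPageError("access to unmapped foreign page")
        return bytes(self.page)


# Shared, zero-filled dummy page (hypothetical name, for illustration only).
SCRATCH_PAGE = bytearray(PAGE_SIZE)


def map_grant(slot, guest_page):
    """Model of a grant map: the slot shares the guest's buffer, no copy."""
    slot.page = guest_page


def unmap_plain(slot):
    """Model of a plain unmap: the address becomes invalid."""
    slot.page = None


def unmap_to_scratch(slot):
    """Model of the scratch-page idea: the address stays valid after unmap."""
    slot.page = SCRATCH_PAGE


def copy_grant(guest_page):
    """Model of a grant copy: the backend owns a private copy of the data."""
    slot = GrantSlot()
    slot.page = bytearray(guest_page)
    return slot


def retransmit(slot):
    """Model of the network driver touching the page again on retransmit."""
    return slot.read()


if __name__ == "__main__":
    guest = bytearray(b"\xab" * PAGE_SIZE)

    slot = GrantSlot()
    map_grant(slot, guest)
    unmap_plain(slot)
    try:
        retransmit(slot)
    except UnmappedPageError as err:
        print("plain unmap:", err)

    slot = GrantSlot()
    map_grant(slot, guest)
    unmap_to_scratch(slot)
    print("scratch unmap: read", len(retransmit(slot)), "bytes without faulting")

    slot = copy_grant(guest)
    print("grant copy: read", len(retransmit(slot)), "bytes from private copy")
```

In this model the copy variant pays for duplicating every write payload, while the scratch variant keeps the shared mapping during the I/O and only swaps the page at unmap time. The model says nothing about batching cost or about how a GNTOP_unmap_and_duplicate sub-op would be implemented.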