From: "Chentao(Boby)"
Subject: Re: xen-blkback unmap with network retansmission will cause a coredump
Date: Tue, 23 Sep 2014 21:36:48 +0800
Message-ID: <54217770.9060704@huawei.com>
In-Reply-To: <541FF5D2.8030002@citrix.com>
References: <541D5D8C.8020604@huawei.com> <541FF5D2.8030002@citrix.com>
To: David Vrabel, "konrad.wilk", Roger Pau Monné
Cc: meiwanlong@huawei.com, mu.muyang@huawei.com, Yanqiangjun, liuyongan@huawei.com, huangzhichao@huawei.com, xen-devel@lists.xenproject.org, dengguoqiang@huawei.com, zhangmin, wu.wubin@huawei.com
List-Id: xen-devel@lists.xenproject.org

On 2014/9/22 18:11, David Vrabel wrote:
> On 20/09/14 11:57, Chentao(Boby) wrote:
>> Hi konrad and roger,
>>
>> When the xen-blkback module executes an unmap operation while an skb
>> queued for network retransmission still uses the mapped page, the
>> host OS crashes.
>>
>> The crash stack for this problem looks like this:
>> {do_page_fault+0x38e}
>> {page_fault+0x28} {memcpy+0xb}
>> {swiotlb_tbl_map_single+0x212}
>> {swiotlb_map_page+0x17a}
>> {tg3:tg3_start_xmit+0x656}
>> {dev_hard_start_xmit+0x334}
>> {sch_direct_xmit+0x1ae}
>
> What dom0 (backend) kernel are you using? Which backend and what storage?
>

The dom0 kernel is SUSE 11 SP3, 3.0.93-0.8-xen. The backend is
xen-blkback and the storage is IP SAN.

>> I searched the web and found that Citrix engineers met this problem a
>> long time ago. As I understand it, they solved it by modifying the
>> kernel stack. Because this modification is very large, the Linux
>> kernel community has not accepted it so far. I have an immature
>> thought: in the dispatch_rw_block_io function, if the I/O is a write
>> operation, we use the grant copy hypercall instead of the grant map
>> hypercall. I have verified my modification and it solves this problem.
>
> Switching to grant copy will reduce performance significantly in many cases.
>
> This was fixed for user space backends by replacing the foreign mapping
> with a mapping of a scratch page, when unmapping the grant.
>
> Something similar should be done for kernel-only foreign mappings. This
> requires a GNTOP_unmap_and_duplicate hypercall sub-op to allow efficient
> batching.
>
> David
>

Thanks for your reply, David.
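
For illustration only, the "copy for writes, map for reads" idea discussed above can be modelled in user space as below. This is a rough sketch, not blkback code: `guest_page`, `segment`, `prepare_segment` and `release_segment` are hypothetical names, and a plain `memcpy` stands in for the grant copy hypercall. It only shows why a copied write buffer stays valid after the guest page goes away. It says nothing about the performance cost David mentions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for a page the guest has granted to the backend. */
struct guest_page {
    unsigned char data[MODEL_PAGE_SIZE];
};

enum io_dir { IO_READ, IO_WRITE };

/* Hypothetical per-segment state; not the real blkback structures. */
struct segment {
    unsigned char *buf;  /* memory handed to the lower layers */
    int backend_owned;   /* 1 if buf is a private copy owned by the backend */
};

/*
 * Model of the proposed policy: for writes, copy the guest data into
 * backend-owned memory (memcpy stands in for the grant copy hypercall);
 * for reads, reference the guest page directly (stands in for grant map).
 * Returns 0 on success, -1 if the copy buffer cannot be allocated.
 */
static int prepare_segment(struct segment *seg, struct guest_page *gp,
                           enum io_dir dir)
{
    assert(seg != NULL && gp != NULL);

    if (dir == IO_WRITE) {
        seg->buf = malloc(MODEL_PAGE_SIZE);
        if (seg->buf == NULL)
            return -1;
        memcpy(seg->buf, gp->data, MODEL_PAGE_SIZE);
        seg->backend_owned = 1;
    } else {
        seg->buf = gp->data;
        seg->backend_owned = 0;
    }
    return 0;
}

/* Release a segment; only backend-owned copies are freed. */
static void release_segment(struct segment *seg)
{
    if (seg->backend_owned)
        free(seg->buf);
    seg->buf = NULL;
    seg->backend_owned = 0;
}
```

In this model a late reader of a write segment (for example a retransmitted skb) touches only the backend's private copy, so tearing down the guest page cannot fault it. Read segments still point at the guest page.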