From mboxrd@z Thu Jan 1 00:00:00 1970
From: annie li
Subject: Rebooting domu fails in nfs share exported from another domu on the same dom0
Date: Wed, 16 Jul 2014 16:36:29 -0400
Message-ID: <53C6E24D.7050903@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: roger.pau@citrix.com, "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

Hi,

I hit a problem in the following scenario: vm1 is running and exports an NFS share, dom0 mounts this share, and vm2 is booted from this NFS location. vm1 and vm2 are running on the same dom0.

When the bug happens, the data flow is:

    vm2 blkfront -> vm2 blkback -> loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback -> vm1 netfront

In this data flow, NFS uses direct I/O, and blkfront and blkback use grant mapping, so the page mapping works fine all the way from vm2 blkfront to vm1 netback. However, when netback does a grant copy, an error occurs in this call chain:

    __gnttab_copy -> __get_paged_frame -> get_page_from_gfn -> get_page

See get_page() in /xen/arch/x86/mm.c:

    if ( likely(owner == domain) )
        return 1;

In this condition, the source page comes from vm2, so owner is vm2, while domain is domain 0. get_page therefore returns 0, so get_page_from_gfn returns NULL and __get_paged_frame returns GNTST_bad_page. Finally, put_page is called directly in __grant_copy and the grant copy fails in netback. As a result, the write to the NFS file fails, which damages the file, and the VM can no longer be rebooted successfully.

Disabling NFS direct I/O is a possible workaround, but it carries a performance penalty. Introducing a copy anywhere between vm2 blkfront and vm1 netback would probably also help in this case. But zero-copy is best for performance, so are there any suggestions for this issue?
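
To make the failing check concrete, here is a minimal stand-alone model in C of the ownership test described above. It is only a sketch and not the Xen code: the types and names (model_domain, model_page, model_get_page, model_copy_source) and the status values are hypothetical and exist only for illustration. It assumes the source page is looked up on behalf of dom0 while the page is still owned by vm2.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the hypervisor types; NOT the real Xen structures. */
struct model_domain { int domain_id; };
struct model_page   { const struct model_domain *owner; int refcount; };

/* Illustrative status codes; the values are not taken from Xen. */
enum model_status { MODEL_OKAY = 0, MODEL_BAD_PAGE = -1 };

/* Mirrors the quoted check: take a reference only if the page's owner
 * is the domain the caller expects. */
static int model_get_page(struct model_page *pg, const struct model_domain *d)
{
    assert(pg != NULL && d != NULL);
    if (pg->owner == d) {
        pg->refcount++;
        return 1;
    }
    return 0;
}

/* Models the source lookup of a grant copy: the lookup is done on behalf
 * of the granting domain, so a page owned by any other domain is rejected. */
static enum model_status model_copy_source(struct model_page *pg,
                                           const struct model_domain *granting)
{
    return model_get_page(pg, granting) ? MODEL_OKAY : MODEL_BAD_PAGE;
}
```

With a page whose owner is vm2 and a lookup made for dom0, model_copy_source returns MODEL_BAD_PAGE and takes no reference. The same call on a page owned by dom0 succeeds, which is the case that any intermediate copy in dom0 would produce.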

This issue is pretty similar to the one reported at http://lists.xen.org/archives/html/xen-devel/2012-12/msg01722.html. Roger, did you fix this issue in your case?

Thanks
Annie