* Rebooting domu fails in nfs share exported from another domu on the same dom0
@ 2014-07-16 20:36 annie li
2014-07-17 15:49 ` Roger Pau Monné
2014-07-28 14:14 ` David Vrabel
0 siblings, 2 replies; 17+ messages in thread
From: annie li @ 2014-07-16 20:36 UTC (permalink / raw)
To: roger.pau, xen-devel@lists.xen.org
Hi
I hit a problem in the following scenario: vm1 is running and exports an
nfs share, dom0 mounts this share, and vm2 is booted from an image stored
on it. vm1 and vm2 are running on the same dom0.
When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
-> vm1 netfront.
In the above data flow, nfs uses direct io, and blkfront and blkback use
grant mapping. This makes page mapping work well all the way from vm2
blkfront to vm1 netback. However, when netback does a grant copy, the
error happens in this routine:
__gnttab_copy->__get_paged_frame->get_page_from_gfn->get_page.
See /xen/arch/x86/mm.c get_page(),
    if ( likely(owner == domain) )
        return 1;
In the above if condition, the src page is from vm2, so owner is the id
of vm2, while domain is 0 here. get_page then returns 0, hence
get_page_from_gfn returns NULL and __get_paged_frame returns
GNTST_bad_page. Finally, put_page is called in __grant_copy directly and
the grant copy fails in netback. As a result, writing to the nfs file
fails and damages it, and then the vm cannot be rebooted successfully.
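As a rough illustration of the failure described above, here is a minimal
toy model in C. It is not Xen code: the model_* names, the page structure
and the status values are assumptions made up for this sketch, and only
the owner comparison mirrors the get_page() fragment quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t domid_t;

/* Hypothetical status codes for this model. They stand in for Xen's
 * GNTST_okay and GNTST_bad_page and do not reproduce the real values. */
enum { MODEL_GNTST_OKAY = 0, MODEL_GNTST_BAD_PAGE = -1 };

/* Toy page descriptor: tracks only the owning domain and a refcount. */
struct model_page {
    domid_t owner;
    unsigned int refcount;
};

/* Models the ownership test: a reference is taken only when the page
 * is owned by the domain the caller names. */
static int model_get_page(struct model_page *pg, domid_t domain)
{
    assert(pg != NULL);
    if (pg->owner == domain) {
        pg->refcount++;
        return 1;
    }
    return 0;
}

/* Models the copy-source lookup: a failed ownership test is reported
 * as a bad page, so the copy is abandoned before any data moves. */
static int model_acquire_copy_source(struct model_page *pg, domid_t src_domid)
{
    return model_get_page(pg, src_domid) ? MODEL_GNTST_OKAY
                                         : MODEL_GNTST_BAD_PAGE;
}
```

In this model, a page owned by domain 2 (vm2) that is named as a source
belonging to domain 0 yields MODEL_GNTST_BAD_PAGE, which is the situation
reported for the netback grant copy.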
Disabling nfs direct io can be a workaround, but it causes a performance
penalty. Alternatively, introducing a copy somewhere between vm2 blkfront
and vm1 netback would probably help in this case. But zero copy is best
for performance, so are there any suggestions for this issue?
This issue is pretty similar to this one:
http://lists.xen.org/archives/html/xen-devel/2012-12/msg01722.html.
Roger, did you fix this issue in your case?
Thanks
Annie
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-16 20:36 Rebooting domu fails in nfs share exported from another domu on the same dom0 annie li
@ 2014-07-17 15:49 ` Roger Pau Monné
2014-07-17 16:56 ` annie li
2014-07-28 14:14 ` David Vrabel
1 sibling, 1 reply; 17+ messages in thread
From: Roger Pau Monné @ 2014-07-17 15:49 UTC (permalink / raw)
To: annie li, xen-devel@lists.xen.org
On 16/07/14 22:36, annie li wrote:
> Hi
>
> I hit a problem in such scenario: vm1 is running and export nfs service,
> dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
> are running on the same dom0.
>
> When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
> loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
> -> vm1 netfront.
>
> In above data flow, nfs implements direct io, blkfront and blkback uses
> grantmap. This makes page mapping works well through vm2 blkfront to vm1
> netback. However, when netback does grant copy, the error happens in
> this routine:
If it's the same error I was seeing (which I think it is), the problem
is due to trying to do a grant_copy with the PFN of a grant mapped page
in Dom0, which Xen refuses to perform.
I've never followed it up, but I think the problem should be fixed in
Linux, so that netback realises it is trying to perform a grant_copy to
a granted page, and use the grant ref instead of the PFN.
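A minimal sketch of that idea in C, under stated assumptions: the
structures below are a simplified model of a grant copy source descriptor,
and struct model_foreign_info is hypothetical bookkeeping that a backend
would have to keep for pages it has grant mapped. None of this is the
actual netback code, and whether Xen accepts the resulting operation is a
separate question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t domid_t;
typedef uint32_t grant_ref_t;

/* Model constants, named after the concepts they stand in for. */
#define MODEL_DOMID_SELF       ((domid_t)0x7FF0)
#define MODEL_COPY_SOURCE_GREF (1u << 0)

/* Simplified source half of a grant copy operation. */
struct model_copy_source {
    union {
        grant_ref_t ref;   /* used when the source is named by grant ref */
        uint64_t gmfn;     /* used when the source is a local frame */
    } u;
    domid_t domid;
    uint16_t flags;
};

/* Hypothetical per-page record: is this local frame really a grant
 * mapping of another domain's page, and if so, which grant? */
struct model_foreign_info {
    bool is_foreign;
    domid_t granting_domid;
    grant_ref_t gref;
};

/* Describe the source by grant ref when the page is a foreign mapping,
 * and by local frame number otherwise. */
static void model_fill_copy_source(struct model_copy_source *src,
                                   uint64_t local_gmfn,
                                   const struct model_foreign_info *info)
{
    assert(src != NULL && info != NULL);
    if (info->is_foreign) {
        src->u.ref = info->gref;
        src->domid = info->granting_domid;
        src->flags = MODEL_COPY_SOURCE_GREF;
    } else {
        src->u.gmfn = local_gmfn;
        src->domid = MODEL_DOMID_SELF;
        src->flags = 0;
    }
}
```

The design point is only that the decision is made per page, before the
copy is issued, from information the backend already holds.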
Roger.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-17 15:49 ` Roger Pau Monné
@ 2014-07-17 16:56 ` annie li
2014-07-18 18:53 ` Konrad Rzeszutek Wilk
0 siblings, 1 reply; 17+ messages in thread
From: annie li @ 2014-07-17 16:56 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: xen-devel@lists.xen.org
On 2014/7/17 11:49, Roger Pau Monné wrote:
> On 16/07/14 22:36, annie li wrote:
>> Hi
>>
>> I hit a problem in such scenario: vm1 is running and export nfs service,
>> dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
>> are running on the same dom0.
>>
>> When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
>> loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
>> -> vm1 netfront.
>>
>> In above data flow, nfs implements direct io, blkfront and blkback uses
>> grantmap. This makes page mapping works well through vm2 blkfront to vm1
>> netback. However, when netback does grant copy, the error happens in
>> this routine:
> If it's the same error I was seeing (which I think it is), the problem
> is due to trying to do a grant_copy with the PFN of a grant mapped page
> in Dom0, which Xen refuses to perform.
>
> I've never followed it up, but I think the problem should be fixed in
> Linux, so that netback realises it is trying to perform a grant_copy to
> a granted page, and use the grant ref instead of the PFN.
>
I guess the path is similar when using a grant ref. See the __gnttab_copy
function: the routine goes into __acquire_grant_for_copy for a grant ref,
and then __get_paged_frame->get_page_from_gfn->get_page. get_page is
where the page owner check fails.
Thanks
Annie
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-17 16:56 ` annie li
@ 2014-07-18 18:53 ` Konrad Rzeszutek Wilk
2014-07-18 19:31 ` annie li
0 siblings, 1 reply; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-18 18:53 UTC (permalink / raw)
To: annie li; +Cc: xen-devel@lists.xen.org, Roger Pau Monné
On Thu, Jul 17, 2014 at 12:56:12PM -0400, annie li wrote:
>
> On 2014/7/17 11:49, Roger Pau Monné wrote:
> >On 16/07/14 22:36, annie li wrote:
> >>Hi
> >>
> >>I hit a problem in such scenario: vm1 is running and export nfs service,
> >>dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
> >>are running on the same dom0.
> >>
> >>When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
I am a bit confused here. 'vm2 blkfront -> vm2 blkback'? Did you
mean 'dom0 blkback'?
> >>loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
> >>-> vm1 netfront.
So both netback and netfront run in the same guest? I think you
want these two swapped around (netfront -> netback).
> >>
> >>In above data flow, nfs implements direct io, blkfront and blkback uses
> >>grantmap. This makes page mapping works well through vm2 blkfront to vm1
> >>netback. However, when netback does grant copy, the error happens in
> >>this routine:
> >If it's the same error I was seeing (which I think it is), the problem
> >is due to trying to do a grant_copy with the PFN of a grant mapped page
> >in Dom0, which Xen refuses to perform.
> >
> >I've never followed it up, but I think the problem should be fixed in
> >Linux, so that netback realises it is trying to perform a grant_copy to
> >a granted page, and use the grant ref instead of the PFN.
> >
>
> I guess the routine is similar if using grant ref. See __gnttab_copy
> function,
> The routine goes into __acquire_grant_for_copy for grant ref, and then
> __get_paged_frame->get_page_from_gfn->get_page. get_page is where the
> checking page owner fails.
>
> Thanks
> Annie
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-18 18:53 ` Konrad Rzeszutek Wilk
@ 2014-07-18 19:31 ` annie li
2014-07-18 19:43 ` Konrad Rzeszutek Wilk
0 siblings, 1 reply; 17+ messages in thread
From: annie li @ 2014-07-18 19:31 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: Roger Pau Monné, xen-devel@lists.xen.org
On 2014/7/18 14:53, Konrad Rzeszutek Wilk wrote:
> On Thu, Jul 17, 2014 at 12:56:12PM -0400, annie li wrote:
>> On 2014/7/17 11:49, Roger Pau Monné wrote:
>>> On 16/07/14 22:36, annie li wrote:
>>>> Hi
>>>>
>>>> I hit a problem in such scenario: vm1 is running and export nfs service,
>>>> dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
>>>> are running on the same dom0.
>>>>
>>>> When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
> I am a bit confused here. 'vm2 blkfront -> vm2 blkback'? Did you
> mean 'dom0 blkback'?
Yes, Dom0 blkback.
>
>
>>>> loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
>>>> -> vm1 netfront.
> So both netback and netfront run in the same guest? I think you
> want these two swapped around (netfront -> netback).
No, the netfront in the above routine means the one in guest vm1, and
this is the network RX path of vm1.
vm2 is running from the nfs share exported by vm1. When vm2 writes to its
disk, the data goes to the nfs file. Finally, the data is sent over the
network to vm1, so the data path is from netback to vm1 netfront.
See the following:
vm1(nfs server) vm2
| \ /
| \ /
| \ /
Dom0(nfs client)
Thanks
Annie
>
>>>> In above data flow, nfs implements direct io, blkfront and blkback uses
>>>> grantmap. This makes page mapping works well through vm2 blkfront to vm1
>>>> netback. However, when netback does grant copy, the error happens in
>>>> this routine:
>>> If it's the same error I was seeing (which I think it is), the problem
>>> is due to trying to do a grant_copy with the PFN of a grant mapped page
>>> in Dom0, which Xen refuses to perform.
>>>
>>> I've never followed it up, but I think the problem should be fixed in
>>> Linux, so that netback realises it is trying to perform a grant_copy to
>>> a granted page, and use the grant ref instead of the PFN.
>>>
>> I guess the routine is similar if using grant ref. See __gnttab_copy
>> function,
>> The routine goes into __acquire_grant_for_copy for grant ref, and then
>> __get_paged_frame->get_page_from_gfn->get_page. get_page is where the
>> checking page owner fails.
>>
>> Thanks
>> Annie
>>
>>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-18 19:31 ` annie li
@ 2014-07-18 19:43 ` Konrad Rzeszutek Wilk
2014-07-18 20:17 ` annie li
0 siblings, 1 reply; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-18 19:43 UTC (permalink / raw)
To: annie li; +Cc: Roger Pau Monné, xen-devel@lists.xen.org
On Fri, Jul 18, 2014 at 03:31:43PM -0400, annie li wrote:
>
> On 2014/7/18 14:53, Konrad Rzeszutek Wilk wrote:
> >On Thu, Jul 17, 2014 at 12:56:12PM -0400, annie li wrote:
> >>On 2014/7/17 11:49, Roger Pau Monné wrote:
> >>>On 16/07/14 22:36, annie li wrote:
> >>>>Hi
> >>>>
> >>>>I hit a problem in such scenario: vm1 is running and export nfs service,
> >>>>dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
> >>>>are running on the same dom0.
> >>>>
> >>>>When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
> >I am a bit confused here. 'vm2 blkfront -> vm2 blkback'? Did you
> >mean 'dom0 blkback'?
>
> Yes, Dom0 blkback.
>
> >
> >
> >>>>loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
> >>>>-> vm1 netfront.
> >So both netback and netfront run in the same guest? I think you
> >want these two swapped around (netfront -> netback).
>
> No, the netfront in above routine means the one in guest vm1, and this is
> network RX path in vm1.
OK, so 'dom0 netback' then. As the netback thread is running in the
initial domain?
So a revised view is:
vm2 blkfront -> dom0 blkback -> loop -> nfs file -> nfs client
-> bridge priv1 -> vm1 vif -> dom0 netback -> vm1 netfront
So with the grant map (blkfront -> blkback) the source is dom0
and the destination is vm1. For the grant copy, the source is
dom0 and the destination is vm2, right? (or did I get my src
and dst confused?).
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-18 19:43 ` Konrad Rzeszutek Wilk
@ 2014-07-18 20:17 ` annie li
2014-07-18 20:22 ` Konrad Rzeszutek Wilk
0 siblings, 1 reply; 17+ messages in thread
From: annie li @ 2014-07-18 20:17 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xen.org, Roger Pau Monné
On 2014/7/18 15:43, Konrad Rzeszutek Wilk wrote:
> On Fri, Jul 18, 2014 at 03:31:43PM -0400, annie li wrote:
>> On 2014/7/18 14:53, Konrad Rzeszutek Wilk wrote:
>>> On Thu, Jul 17, 2014 at 12:56:12PM -0400, annie li wrote:
>>>> On 2014/7/17 11:49, Roger Pau Monné wrote:
>>>>> On 16/07/14 22:36, annie li wrote:
>>>>>> Hi
>>>>>>
>>>>>> I hit a problem in such scenario: vm1 is running and export nfs service,
>>>>>> dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
>>>>>> are running on the same dom0.
>>>>>>
>>>>>> When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
>>> I am a bit confused here. 'vm2 blkfront -> vm2 blkback'? Did you
>>> mean 'dom0 blkback'?
>> Yes, Dom0 blkback.
>>
>>>
>>>>>> loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
>>>>>> -> vm1 netfront.
>>> So both netback and netfront run in the same guest? I think you
>>> want these two swapped around (netfront -> netback).
>> No, the netfront in above routine means the one in guest vm1, and this is
>> network RX path in vm1.
> OK, so 'dom0 netback' then. As the netback thread is running in the
> initial domain?
The netback thread is running in Dom0 as normal, and the two vms are both
normal vms, not driver domains. My initial description probably caused
some confusion.
>
> So a revised view is:
>
> vm2 blkfront -> dom0 blkback -> loop -> nfs file -> nfs client
> -> bridge priv1 -> vm1 vif -> dom0 netback -> vm1 netfront
>
> So with the grant map (blkfront -> blkback) the source is dom0
> and the destination is vm1.
You mean destination is vm2 here, right?
But I am a little confused. Blkfront of vm2 sends out a request to
blkback, and the data page is allocated by vm2. So the grant map source
is vm2 and the destination is dom0. Did I miss anything here?
> For the grant copy, the source is
> dom0 and the destinatation is vm2 right? (or did I get my src
> and dst confused?).
You mean vm1 here, right?
If so, yes, source is dom0, and destination is vm1.
The grant copy code checks whether the source page belongs to dom0; if
not, it returns an error. In this situation, the source page is from vm2
and is grant mapped.
Thanks
Annie
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-18 20:17 ` annie li
@ 2014-07-18 20:22 ` Konrad Rzeszutek Wilk
2014-07-18 20:31 ` annie li
0 siblings, 1 reply; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-18 20:22 UTC (permalink / raw)
To: annie li; +Cc: xen-devel@lists.xen.org, Roger Pau Monné
On Fri, Jul 18, 2014 at 04:17:02PM -0400, annie li wrote:
>
> On 2014/7/18 15:43, Konrad Rzeszutek Wilk wrote:
> >On Fri, Jul 18, 2014 at 03:31:43PM -0400, annie li wrote:
> >>On 2014/7/18 14:53, Konrad Rzeszutek Wilk wrote:
> >>>On Thu, Jul 17, 2014 at 12:56:12PM -0400, annie li wrote:
> >>>>On 2014/7/17 11:49, Roger Pau Monné wrote:
> >>>>>On 16/07/14 22:36, annie li wrote:
> >>>>>>Hi
> >>>>>>
> >>>>>>I hit a problem in such scenario: vm1 is running and export nfs service,
> >>>>>>dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
> >>>>>>are running on the same dom0.
> >>>>>>
> >>>>>>When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
> >>>I am a bit confused here. 'vm2 blkfront -> vm2 blkback'? Did you
> >>>mean 'dom0 blkback'?
> >>Yes, Dom0 blkback.
> >>
> >>>
> >>>>>>loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
> >>>>>>-> vm1 netfront.
> >>>So both netback and netfront run in the same guest? I think you
> >>>want these two swapped around (netfront -> netback).
> >>No, the netfront in above routine means the one in guest vm1, and this is
> >>network RX path in vm1.
> >OK, so 'dom0 netback' then. As the netback thread is running in the
> >initial domain?
>
> Netback thread is running in Dom0 as normal, and the two vms are all
> normal vm, not driver domain. Probably my initial description causes
> some confusion.
>
> >
> >So a revised view is:
> >
> > vm2 blkfront -> dom0 blkback -> loop -> nfs file -> nfs client
> > -> bridge priv1 -> vm1 vif -> dom0 netback -> vm1 netfront
> >
> >So with the grant map (blkfront -> blkback) the source is dom0
> >and the destination is vm1.
>
> You mean destination is vm2 here, right?
Yes :-)
> But I am little confused. Blkfront of vm2 sends out request to
> blkback, and the data page is allocated from vm2. So the grant map
> source is vm2, destination is dom0. Anything I missed here?
>
> > For the grant copy, the source is
> >dom0 and the destinatation is vm2 right? (or did I get my src
> >and dst confused?).
>
> You mean vm1 here, right?
Heh.
> If so, yes, source is dom0, and destination is vm1.
> In grantcopy code, it checks whether the source is from dom0, if
> not, then an error is thrown out. In this situation, the source is
> from vm2 which is grant mapped.
In both cases dom0 is part of the equation. In one
case it is the destination and in another it is the source - both cases
for the same page right?
>
> Thanks
> Annie
> >
> >
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-18 20:22 ` Konrad Rzeszutek Wilk
@ 2014-07-18 20:31 ` annie li
2014-07-18 21:07 ` Konrad Rzeszutek Wilk
0 siblings, 1 reply; 17+ messages in thread
From: annie li @ 2014-07-18 20:31 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xen.org, Roger Pau Monné
On 2014/7/18 16:22, Konrad Rzeszutek Wilk wrote:
> On Fri, Jul 18, 2014 at 04:17:02PM -0400, annie li wrote:
>> On 2014/7/18 15:43, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Jul 18, 2014 at 03:31:43PM -0400, annie li wrote:
>>>> On 2014/7/18 14:53, Konrad Rzeszutek Wilk wrote:
>>>>> On Thu, Jul 17, 2014 at 12:56:12PM -0400, annie li wrote:
>>>>>> On 2014/7/17 11:49, Roger Pau Monné wrote:
>>>>>>> On 16/07/14 22:36, annie li wrote:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> I hit a problem in such scenario: vm1 is running and export nfs service,
>>>>>>>> dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
>>>>>>>> are running on the same dom0.
>>>>>>>>
>>>>>>>> When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
>>>>> I am a bit confused here. 'vm2 blkfront -> vm2 blkback'? Did you
>>>>> mean 'dom0 blkback'?
>>>> Yes, Dom0 blkback.
>>>>
>>>>>>>> loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
>>>>>>>> -> vm1 netfront.
>>>>> So both netback and netfront run in the same guest? I think you
>>>>> want these two swapped around (netfront -> netback).
>>>> No, the netfront in above routine means the one in guest vm1, and this is
>>>> network RX path in vm1.
>>> OK, so 'dom0 netback' then. As the netback thread is running in the
>>> initial domain?
>> Netback thread is running in Dom0 as normal, and the two vms are all
>> normal vm, not driver domain. Probably my initial description causes
>> some confusion.
>>
>>> So a revised view is:
>>>
>>> vm2 blkfront -> dom0 blkback -> loop -> nfs file -> nfs client
>>> -> bridge priv1 -> vm1 vif -> dom0 netback -> vm1 netfront
>>>
>>> So with the grant map (blkfront -> blkback) the source is dom0
>>> and the destination is vm1.
>> You mean destination is vm2 here, right?
> Yes :-)
>> But I am little confused. Blkfront of vm2 sends out request to
>> blkback, and the data page is allocated from vm2. So the grant map
>> source is vm2, destination is dom0. Anything I missed here?
>>
>>> For the grant copy, the source is
>>> dom0 and the destinatation is vm2 right? (or did I get my src
>>> and dst confused?).
>> You mean vm1 here, right?
> Heh.
>> If so, yes, source is dom0, and destination is vm1.
>> In grantcopy code, it checks whether the source is from dom0, if
>> not, then an error is thrown out. In this situation, the source is
>> from vm2 which is grant mapped.
> In both cases dom0 is part of the equation. In one
> case it is the destination and in another it is the source - both cases
> for the same page right?
Yes.
blkfront->blkback, dom0 is destination
netback->netfront, dom0 is the source
When doing a grant copy for netback, xen requires that the source page
belong to dom0. However, it belongs to vm2 in this case.
Thanks
Annie
>
>> Thanks
>> Annie
>>>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-18 20:31 ` annie li
@ 2014-07-18 21:07 ` Konrad Rzeszutek Wilk
2014-07-18 21:43 ` annie li
0 siblings, 1 reply; 17+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-18 21:07 UTC (permalink / raw)
To: annie li; +Cc: xen-devel@lists.xen.org, Roger Pau Monné
On Fri, Jul 18, 2014 at 04:31:13PM -0400, annie li wrote:
>
> On 2014/7/18 16:22, Konrad Rzeszutek Wilk wrote:
> >On Fri, Jul 18, 2014 at 04:17:02PM -0400, annie li wrote:
> >>On 2014/7/18 15:43, Konrad Rzeszutek Wilk wrote:
> >>>On Fri, Jul 18, 2014 at 03:31:43PM -0400, annie li wrote:
> >>>>On 2014/7/18 14:53, Konrad Rzeszutek Wilk wrote:
> >>>>>On Thu, Jul 17, 2014 at 12:56:12PM -0400, annie li wrote:
> >>>>>>On 2014/7/17 11:49, Roger Pau Monné wrote:
> >>>>>>>On 16/07/14 22:36, annie li wrote:
> >>>>>>>>Hi
> >>>>>>>>
> >>>>>>>>I hit a problem in such scenario: vm1 is running and export nfs service,
> >>>>>>>>dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
> >>>>>>>>are running on the same dom0.
> >>>>>>>>
> >>>>>>>>When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
> >>>>>I am a bit confused here. 'vm2 blkfront -> vm2 blkback'? Did you
> >>>>>mean 'dom0 blkback'?
> >>>>Yes, Dom0 blkback.
> >>>>
> >>>>>>>>loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
> >>>>>>>>-> vm1 netfront.
> >>>>>So both netback and netfront run in the same guest? I think you
> >>>>>want these two swapped around (netfront -> netback).
> >>>>No, the netfront in above routine means the one in guest vm1, and this is
> >>>>network RX path in vm1.
> >>>OK, so 'dom0 netback' then. As the netback thread is running in the
> >>>initial domain?
> >>Netback thread is running in Dom0 as normal, and the two vms are all
> >>normal vm, not driver domain. Probably my initial description causes
> >>some confusion.
> >>
> >>>So a revised view is:
> >>>
> >>> vm2 blkfront -> dom0 blkback -> loop -> nfs file -> nfs client
> >>> -> bridge priv1 -> vm1 vif -> dom0 netback -> vm1 netfront
> >>>
> >>>So with the grant map (blkfront -> blkback) the source is dom0
> >>>and the destination is vm1.
> >>You mean destination is vm2 here, right?
> >Yes :-)
> >>But I am little confused. Blkfront of vm2 sends out request to
> >>blkback, and the data page is allocated from vm2. So the grant map
> >>source is vm2, destination is dom0. Anything I missed here?
> >>
> >>> For the grant copy, the source is
> >>>dom0 and the destinatation is vm2 right? (or did I get my src
> >>>and dst confused?).
> >>You mean vm1 here, right?
> >Heh.
> >>If so, yes, source is dom0, and destination is vm1.
> >>In grantcopy code, it checks whether the source is from dom0, if
> >>not, then an error is thrown out. In this situation, the source is
> >>from vm2 which is grant mapped.
> >In both cases dom0 is part of the equation. In one
> >case it is the destination and in another it is the source - both cases
> >for the same page right?
>
> Yes.
> blkfront->blkback, dom0 is destination
> netback->netfront, dom0 is the source
>
> When doing grantcopy for netback, xen requires the source is dom0. However,
> it is vm2 in this case.
Would it be possible in the hypervisor to have an extra check where you
look to see if the source (dom0 in this case) has this page mapped
from another guest? As in, this is a bit of A->B->C transition
and we want to do A->C (B is dom0). If you figure out that the PFN
belongs to A and you are doing a copy of A's page to C's page from B (dom0)
page (and B PFN is actually mapped to be A's page), then why
not just copy from A to C directly?
Hmm. Linux kernel could actually also have this information (it does
already I think) and we could search for that if the grant copy fails
and if so alter the source and retry the grant copy. Instead of dom0
being the source it is the C guest (vm2)? Though I don't know if
the hypercall allows us to make a grant_copy on behalf of two
other guests.
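A rough sketch of the retry idea in C, with everything simulated: the
sketch_* names, the mapping table and the stand-in hypervisor function are
all assumptions made for illustration, not Xen or Linux interfaces. The
stand-in refuses a frame-addressed source when that frame is really a
mapping of another domain's page, and accepts a source named by the
granting domain and grant ref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t domid_t;
typedef uint32_t grant_ref_t;

enum { SKETCH_OKAY = 0, SKETCH_BAD_PAGE = -1 };

/* Hypothetical dom0-side record: this local frame is a grant mapping
 * of a page granted by another domain. */
struct sketch_mapping {
    uint64_t local_pfn;
    domid_t granting_domid;
    grant_ref_t gref;
};

struct sketch_source {
    bool by_gref;
    uint64_t pfn;      /* meaningful when !by_gref */
    grant_ref_t gref;  /* meaningful when by_gref */
    domid_t domid;
};

static const struct sketch_mapping *
sketch_find_mapping(const struct sketch_mapping *tbl, size_t n, uint64_t pfn)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].local_pfn == pfn)
            return &tbl[i];
    return NULL;
}

/* Stand-in for the hypervisor side of a grant copy. */
static int sketch_hypervisor_copy(const struct sketch_source *src,
                                  const struct sketch_mapping *tbl, size_t n)
{
    assert(src != NULL);
    if (src->by_gref) {
        for (size_t i = 0; i < n; i++)
            if (tbl[i].granting_domid == src->domid &&
                tbl[i].gref == src->gref)
                return SKETCH_OKAY;
        return SKETCH_BAD_PAGE;
    }
    /* Frame-addressed source: refused when the frame is foreign. */
    return sketch_find_mapping(tbl, n, src->pfn) ? SKETCH_BAD_PAGE
                                                 : SKETCH_OKAY;
}

/* Try the copy with the local frame first. On a bad-page failure, look
 * the frame up and retry with the original granter as the source. */
static int sketch_copy_with_retry(domid_t self, uint64_t pfn,
                                  const struct sketch_mapping *tbl, size_t n,
                                  bool *retried)
{
    struct sketch_source src = { .by_gref = false, .pfn = pfn,
                                 .gref = 0, .domid = self };
    const struct sketch_mapping *m;
    int rc = sketch_hypervisor_copy(&src, tbl, n);

    assert(retried != NULL);
    *retried = false;
    if (rc != SKETCH_BAD_PAGE)
        return rc;

    m = sketch_find_mapping(tbl, n, pfn);
    if (m == NULL)
        return rc;  /* defensive: nothing known about this frame */

    src.by_gref = true;
    src.gref = m->gref;
    src.domid = m->granting_domid;
    *retried = true;
    return sketch_hypervisor_copy(&src, tbl, n);
}
```

Retrying only after a failure keeps the common path unchanged, at the
cost of one wasted attempt for each foreign page.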
>
> Thanks
> Annie
>
> >
> >>Thanks
> >>Annie
> >>>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-18 21:07 ` Konrad Rzeszutek Wilk
@ 2014-07-18 21:43 ` annie li
2014-07-21 10:02 ` Wei Liu
0 siblings, 1 reply; 17+ messages in thread
From: annie li @ 2014-07-18 21:43 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: Roger Pau Monné, xen-devel@lists.xen.org
On 2014/7/18 17:07, Konrad Rzeszutek Wilk wrote:
> On Fri, Jul 18, 2014 at 04:31:13PM -0400, annie li wrote:
>> On 2014/7/18 16:22, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Jul 18, 2014 at 04:17:02PM -0400, annie li wrote:
>>>> On 2014/7/18 15:43, Konrad Rzeszutek Wilk wrote:
>>>>> On Fri, Jul 18, 2014 at 03:31:43PM -0400, annie li wrote:
>>>>>> On 2014/7/18 14:53, Konrad Rzeszutek Wilk wrote:
>>>>>>> On Thu, Jul 17, 2014 at 12:56:12PM -0400, annie li wrote:
>>>>>>>> On 2014/7/17 11:49, Roger Pau Monné wrote:
>>>>>>>>> On 16/07/14 22:36, annie li wrote:
>>>>>>>>>> Hi
>>>>>>>>>>
>>>>>>>>>> I hit a problem in such scenario: vm1 is running and export nfs service,
>>>>>>>>>> dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
>>>>>>>>>> are running on the same dom0.
>>>>>>>>>>
>>>>>>>>>> When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
>>>>>>> I am a bit confused here. 'vm2 blkfront -> vm2 blkback'? Did you
>>>>>>> mean 'dom0 blkback'?
>>>>>> Yes, Dom0 blkback.
>>>>>>
>>>>>>>>>> loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
>>>>>>>>>> -> vm1 netfront.
>>>>>>> So both netback and netfront run in the same guest? I think you
>>>>>>> want these two swapped around (netfront -> netback).
>>>>>> No, the netfront in above routine means the one in guest vm1, and this is
>>>>>> network RX path in vm1.
>>>>> OK, so 'dom0 netback' then. As the netback thread is running in the
>>>>> initial domain?
>>>> Netback thread is running in Dom0 as normal, and the two vms are all
>>>> normal vm, not driver domain. Probably my initial description causes
>>>> some confusion.
>>>>
>>>>> So a revised view is:
>>>>>
>>>>> vm2 blkfront -> dom0 blkback -> loop -> nfs file -> nfs client
>>>>> -> bridge priv1 -> vm1 vif -> dom0 netback -> vm1 netfront
>>>>>
>>>>> So with the grant map (blkfront -> blkback) the source is dom0
>>>>> and the destination is vm1.
>>>> You mean destination is vm2 here, right?
>>> Yes :-)
>>>> But I am little confused. Blkfront of vm2 sends out request to
>>>> blkback, and the data page is allocated from vm2. So the grant map
>>>> source is vm2, destination is dom0. Anything I missed here?
>>>>
>>>>> For the grant copy, the source is
>>>>> dom0 and the destinatation is vm2 right? (or did I get my src
>>>>> and dst confused?).
>>>> You mean vm1 here, right?
>>> Heh.
>>>> If so, yes, source is dom0, and destination is vm1.
>>>> In grantcopy code, it checks whether the source is from dom0, if
>>>> not, then an error is thrown out. In this situation, the source is
>>>> from vm2 which is grant mapped.
>>> In both cases dom0 is part of the equation. In one
>>> case it is the destination and in another it is the source - both cases
>>> for the same page right?
>> Yes.
>> blkfront->blkback, dom0 is destination
>> netback->netfront, dom0 is the source
>>
>> When doing grantcopy for netback, xen requires the source is dom0. However,
>> it is vm2 in this case.
> Would it be possible in the hypervisor to have an extra check where you
> look to see if the source (dom0 in this case) has this page mapped
> from another guest? As in, this is a bit of A->B->C transition
> and we want to do A->C (B is dom0). If you figure out that the PFN
> belongs to A and you are doing a copy of A's page to C's page from B (dom0)
> page (and B PFN is actually mapped to be A's page), then why
> not just copy from A to C directly?
>
> Hmm. Linux kernel could actually also have this information (it does
> already I think) and we could search for that if the grant copy fails
> and if so alter the source and retry the grant copy. Instead of dom0
> being the source it is the C guest (vm2)? Thought I don't know if
> the hypercall allows us to make a grant_copy on behalf of two
> other guests.
Oh... I remember netback did have some grant_copy code on behalf of two
guests on the same server, and it was removed later on. So I think a
grant copy A->C works for this case, with the precondition that we can
recognize that this page is mapped from another guest.
Thanks
Annie
>
>> Thanks
>> Annie
>>
>>>> Thanks
>>>> Annie
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-18 21:43 ` annie li
@ 2014-07-21 10:02 ` Wei Liu
2014-07-21 15:02 ` annie li
0 siblings, 1 reply; 17+ messages in thread
From: Wei Liu @ 2014-07-21 10:02 UTC (permalink / raw)
To: annie li; +Cc: wei.liu2, Roger Pau Monné, xen-devel@lists.xen.org
On Fri, Jul 18, 2014 at 05:43:23PM -0400, annie li wrote:
[...]
> >>
> >>When doing grantcopy for netback, xen requires the source is dom0. However,
> >>it is vm2 in this case.
> >Would it be possible in the hypervisor to have an extra check where you
> >look to see if the source (dom0 in this case) has this page mapped
> >from another guest? As in, this is a bit of A->B->C transition
> >and we want to do A->C (B is dom0). If you figure out that the PFN
> >belongs to A and you are doing a copy of A's page to C's page from B (dom0)
> >page (and B PFN is actually mapped to be A's page), then why
> >not just copy from A to C directly?
> >
> >Hmm. Linux kernel could actually also have this information (it does
> >already I think) and we could search for that if the grant copy fails
> >and if so alter the source and retry the grant copy. Instead of dom0
> >being the source it is the C guest (vm2)? Thought I don't know if
> >the hypercall allows us to make a grant_copy on behalf of two
> >other guests.
>
> Oh... I remember netback did have some grant_copy code on behalf of two
> guests on same server, and it was removed later on. So I think grantcopy
> A->C works for this case with the precondition that we can recognize this
> page is mapped from another guest.
>
I have not followed this thread closely.
The tracking facility was removed because it was dead at that time. See
43e9d19 ("xen-netback: remove page tracking facility").
However I think the latest netback with mapping scheme does have
something similar. I can see there's a "foreign_queue" check in
xenvif_gop_frag_copy. Is that not enough?
Wei.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-21 10:02 ` Wei Liu
@ 2014-07-21 15:02 ` annie li
2014-07-21 23:05 ` Wei Liu
0 siblings, 1 reply; 17+ messages in thread
From: annie li @ 2014-07-21 15:02 UTC (permalink / raw)
To: Wei Liu; +Cc: xen-devel@lists.xen.org, Roger Pau Monné
On 2014/7/21 6:02, Wei Liu wrote:
> On Fri, Jul 18, 2014 at 05:43:23PM -0400, annie li wrote:
> [...]
>>>> When doing grantcopy for netback, xen requires the source is dom0. However,
>>>> it is vm2 in this case.
>>> Would it be possible in the hypervisor to have an extra check where you
>>> look to see if the source (dom0 in this case) has this page mapped
>> >from another guest? As in, this is a bit of A->B->C transition
>>> and we want to do A->C (B is dom0). If you figure out that the PFN
>>> belongs to A and you are doing a copy of A's page to C's page from B (dom0)
>>> page (and B PFN is actually mapped to be A's page), then why
>>> not just copy from A to C directly?
>>>
>>> Hmm. Linux kernel could actually also have this information (it does
>>> already I think) and we could search for that if the grant copy fails
>>> and if so alter the source and retry the grant copy. Instead of dom0
>>> being the source it is the C guest (vm2)? Thought I don't know if
>>> the hypercall allows us to make a grant_copy on behalf of two
>>> other guests.
>> Oh... I remember netback did have some grant_copy code on behalf of two
>> guests on same server, and it was removed later on. So I think grantcopy
>> A->C works for this case with the precondition that we can recognize this
>> page is mapped from another guest.
>>
> I have not followed this thread closely.
>
> The tracking facility was removed because it was dead at that time. See
> 43e9d19 ("xen-netback: remove page tracking facility").
>
> However I think the latest netback with mapping scheme does have
> something similar. I can see there's a "foreign_queue" check in
> xenvif_gop_frag_copy. Is that not enough?
Correct, this mapping scheme does a similar thing, but it is for
communication between two vifs on the same server, where the original
page is mapped by the netback tx path. Here, the issue arises when the
page is mapped by blkback.
What I am thinking is: check whether the page is grant mapped; if it is,
do the grant copy from the source domain to dest directly instead of
Dom0->dest, since a condition check in mm.c fails in the Dom0->dest case
when the original page comes from another vm.
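A minimal sketch of that decision, in plain C. Every type and name here (page_origin, copy_source, pick_copy_source, DOMID_SELF_SKETCH) is hypothetical and only models the idea; it is not the netback code or the real grant copy interface. It assumes the backend can somehow learn the granting domain and gref of a mapped page, which is exactly the open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the real Xen definitions. */
typedef uint16_t domid_sketch_t;
#define DOMID_SELF_SKETCH ((domid_sketch_t)0x7FF0)

/* What the backend would need to know about a page it is about to copy. */
struct page_origin {
    bool           grant_mapped; /* page was mapped from another guest */
    domid_sketch_t owner;        /* granting domain, valid if grant_mapped */
    uint32_t       gref;         /* grant reference, valid if grant_mapped */
    unsigned long  local_frame;  /* dom0 frame number otherwise */
};

/* Simplified "source" half of a copy request. */
struct copy_source {
    domid_sketch_t domid;
    bool           by_gref;      /* true: use gref, false: use frame */
    uint32_t       gref;
    unsigned long  frame;
};

/*
 * If the page really belongs to another guest, name that guest and its
 * gref as the source; otherwise fall back to a local (dom0) frame.
 */
static struct copy_source pick_copy_source(const struct page_origin *p)
{
    struct copy_source src = {0};

    assert(p != NULL);
    if (p->grant_mapped) {
        src.domid   = p->owner;
        src.by_gref = true;
        src.gref    = p->gref;
    } else {
        src.domid   = DOMID_SELF_SKETCH;
        src.by_gref = false;
        src.frame   = p->local_frame;
    }
    return src;
}
```

With this shape the ownership check in the hypervisor would be made against the domain that actually owns the page, rather than against Dom0.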
Thanks
Annie
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-21 15:02 ` annie li
@ 2014-07-21 23:05 ` Wei Liu
2014-07-23 1:58 ` annie li
0 siblings, 1 reply; 17+ messages in thread
From: Wei Liu @ 2014-07-21 23:05 UTC (permalink / raw)
To: annie li; +Cc: xen-devel@lists.xen.org, Wei Liu, Roger Pau Monné
On Mon, Jul 21, 2014 at 11:02:54AM -0400, annie li wrote:
[...]
> >>>being the source it is the C guest (vm2)? Thought I don't know if
> >>>the hypercall allows us to make a grant_copy on behalf of two
> >>>other guests.
> >>Oh... I remember netback did have some grant_copy code on behalf of two
> >>guests on same server, and it was removed later on. So I think grantcopy
> >>A->C works for this case with the precondition that we can recognize this
> >>page is mapped from another guest.
> >>
> >I have not followed this thread closely.
> >
> >The tracking facility was removed because it was dead at that time. See
> >43e9d19 ("xen-netback: remove page tracking facility").
> >
> >However I think the latest netback with mapping scheme does have
> >something similar. I can see there's a "foreign_queue" check in
> >xenvif_gop_frag_copy. Is that not enough?
>
> Correct, this mapping scheme does similar thing, but it is for communication
> between two vifs on the same server, the original page is mapped by netback
> tx path. Here, this issue is caused when the page mapped by blkback.
>
I see.
> What I am thinking is: checking whether the page is mapped, if yes, then
> does grant copy from source to dest directly instead of Dom0->dest since a
> condition check fails in mm.c if the original page is from another vm for
> situation Dom0->dist .
>
The foreign frame is marked with FOREIGN_FRAME_BIT in p2m code so you
can probably make use of that? However gref is not embedded in struct
page.
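To illustrate what a foreign-frame marker gives you and what it does not, here is a small standalone sketch. The SKETCH_* macros and helper names are made up for this example; they assume the marker is the top bit of the p2m entry, which should be checked against the actual p2m code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed layout: the marker is the top bit of an unsigned long entry. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_FOREIGN_BIT   (1UL << (SKETCH_BITS_PER_LONG - 1))

/* Tag a machine frame number as belonging to another domain. */
static unsigned long mark_foreign(unsigned long mfn)
{
    assert((mfn & SKETCH_FOREIGN_BIT) == 0);
    return mfn | SKETCH_FOREIGN_BIT;
}

/* True if the entry carries the foreign marker. */
static bool is_foreign(unsigned long entry)
{
    return (entry & SKETCH_FOREIGN_BIT) != 0;
}

/* Strip the marker to recover the frame number. */
static unsigned long entry_to_mfn(unsigned long entry)
{
    return entry & ~SKETCH_FOREIGN_BIT;
}
```

The bit answers "is this frame foreign?" but carries neither the granting domain nor the gref, so a separate lookup is still needed.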
Wei.
> Thanks
> Annie
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-21 23:05 ` Wei Liu
@ 2014-07-23 1:58 ` annie li
0 siblings, 0 replies; 17+ messages in thread
From: annie li @ 2014-07-23 1:58 UTC (permalink / raw)
To: Wei Liu; +Cc: xen-devel@lists.xen.org, Roger Pau Monné
On 2014/7/21 19:05, Wei Liu wrote:
> On Mon, Jul 21, 2014 at 11:02:54AM -0400, annie li wrote:
> [...]
>>>>> being the source it is the C guest (vm2)? Thought I don't know if
>>>>> the hypercall allows us to make a grant_copy on behalf of two
>>>>> other guests.
>>>> Oh... I remember netback did have some grant_copy code on behalf of two
>>>> guests on same server, and it was removed later on. So I think grantcopy
>>>> A->C works for this case with the precondition that we can recognize this
>>>> page is mapped from another guest.
>>>>
>>> I have not followed this thread closely.
>>>
>>> The tracking facility was removed because it was dead at that time. See
>>> 43e9d19 ("xen-netback: remove page tracking facility").
>>>
>>> However I think the latest netback with mapping scheme does have
>>> something similar. I can see there's a "foreign_queue" check in
>>> xenvif_gop_frag_copy. Is that not enough?
>> Correct, this mapping scheme does similar thing, but it is for communication
>> between two vifs on the same server, the original page is mapped by netback
>> tx path. Here, this issue is caused when the page mapped by blkback.
>>
> I see.
>
>> What I am thinking is: checking whether the page is mapped, if yes, then
>> does grant copy from source to dest directly instead of Dom0->dest since a
>> condition check fails in mm.c if the original page is from another vm for
>> situation Dom0->dist .
>>
> The foreign frame is marked with FOREIGN_FRAME_BIT in p2m code so you
> can probably make use of that? However gref is not embedded in struct
> page.
Yes, no gref.
Let me look at the Xen grant table code to see whether there is a way to
get the gref for a specific page.
Thanks
Annie
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-16 20:36 Rebooting domu fails in nfs share exported from another domu on the same dom0 annie li
2014-07-17 15:49 ` Roger Pau Monné
@ 2014-07-28 14:14 ` David Vrabel
2014-07-28 16:14 ` annie li
1 sibling, 1 reply; 17+ messages in thread
From: David Vrabel @ 2014-07-28 14:14 UTC (permalink / raw)
To: annie li, roger.pau, xen-devel@lists.xen.org
On 16/07/14 21:36, annie li wrote:
> Hi
>
> I hit a problem in such scenario: vm1 is running and export nfs service,
> dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
> are running on the same dom0.
>
> When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
> loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
> -> vm1 netfront.
>
> In above data flow, nfs implements direct io, blkfront and blkback uses
> grantmap. This makes page mapping works well through vm2 blkfront to vm1
> netback. However, when netback does grant copy, the error happens in
> this routine:
> __gnttab_copy->__get_paged_frame->get_page_from_gfn->get_page.
> See /xen/arch/x86/mm.c get_page(),
> if ( likely(owner == domain) )
> return 1;
> In above if condition, the src page is from vm2, so owner is id of vm2,
> domain is 0 here. Then get_page return 0, hence get_page_from_gfn return
> NULL and __get_paged_frame return GNTST_bad_page. Finally, put_page is
> called in __grant_copy directly and grant copy fails in netback. As a
> result, writing to nfsfile fails and this results damage to nfsfile,
> then vm can not be rebooted successfully.
>
> Disable the nfs direct io can be a workaround, however, this will cause
> performance penalty. Or any copy is involved between vm2 blkfront->vm1
> netback probably helps in this case. But zerocopy is the best thing for
> performance, so any suggestions for this issue?
I planned (eventually) for the struct pages of grant mapped frames
to be marked as foreign, with the gref and original domain then accessible.
The netback specific code for dealing with foreign pages could then be
made generic.
The difficulty lies in extending struct page without actually making it
bigger and without adding Xen-specific fields into it...
Other alternatives I explored were using the guest mapping to copy
to/from instead of having to use the grant ref to find the page. But
page sharing etc. made this look like a nightmare.
David
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Rebooting domu fails in nfs share exported from another domu on the same dom0
2014-07-28 14:14 ` David Vrabel
@ 2014-07-28 16:14 ` annie li
0 siblings, 0 replies; 17+ messages in thread
From: annie li @ 2014-07-28 16:14 UTC (permalink / raw)
To: David Vrabel; +Cc: xen-devel@lists.xen.org, roger.pau
On 2014/7/28 10:14, David Vrabel wrote:
> On 16/07/14 21:36, annie li wrote:
>> Hi
>>
>> I hit a problem in such scenario: vm1 is running and export nfs service,
>> dom0 mount this nfs, and vm2 is booted in this nfs location. vm1 and vm2
>> are running on the same dom0.
>>
>> When this bug happens, the data flow is: vm2 blkfront-> vm2 blkback->
>> loop -> nfs file -> nfs client -> bridge priv1 -> vm1 vif -> vm1 netback
>> -> vm1 netfront.
>>
>> In above data flow, nfs implements direct io, blkfront and blkback uses
>> grantmap. This makes page mapping works well through vm2 blkfront to vm1
>> netback. However, when netback does grant copy, the error happens in
>> this routine:
>> __gnttab_copy->__get_paged_frame->get_page_from_gfn->get_page.
>> See /xen/arch/x86/mm.c get_page(),
>> if ( likely(owner == domain) )
>> return 1;
>> In above if condition, the src page is from vm2, so owner is id of vm2,
>> domain is 0 here. Then get_page return 0, hence get_page_from_gfn return
>> NULL and __get_paged_frame return GNTST_bad_page. Finally, put_page is
>> called in __grant_copy directly and grant copy fails in netback. As a
>> result, writing to nfsfile fails and this results damage to nfsfile,
>> then vm can not be rebooted successfully.
>>
>> Disable the nfs direct io can be a workaround, however, this will cause
>> performance penalty. Or any copy is involved between vm2 blkfront->vm1
>> netback probably helps in this case. But zerocopy is the best thing for
>> performance, so any suggestions for this issue?
> I planned (eventually) for foreign struct page's for grant mapped frames
> to be marked as such and then the gref and original domain accessible.
> The netback specific code for dealing with foreign pages could then be
> made generic.
This sounds good, if the handling of foreign pages in netback can be made generic.
>
> The difficultly lies in extending struct page without actually making it
> bigger and without adding Xen-specific fields into it...
Yes...
>
> Other alternatives I explored were using the guest mapping to copy
> to/from instead of having to use the grant ref to find the page. But
> page sharing etc. made this look like a nightmare.
What I am thinking is to add one more field named "frame" to the grant_mapping
structure, see xen/include/xen/grant_table.h. With it, we can get the
ref from a foreign page, though this probably involves some searching.
I was interrupted by other work and have not started on it yet.
For example,
 struct grant_mapping {
     u32      ref;           /* grant ref */
     u16      flags;         /* 0-4: GNTMAP_* ; 5-15: unused */
     domid_t  domid;         /* granting domain */
+    unsigned long frame;    /* grant frame */
 };
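The "searching work" could look roughly like the sketch below. It is a standalone model, not hypervisor code: sketch_grant_mapping mirrors the proposed layout with simplified types, find_mapping_by_frame is a hypothetical helper, and treating flags == 0 as an unused slot is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t sketch_domid_t;

/* Mirrors the proposed layout; field types are simplified. */
struct sketch_grant_mapping {
    uint32_t       ref;    /* grant ref */
    uint16_t       flags;  /* 0 means "slot unused" in this sketch */
    sketch_domid_t domid;  /* granting domain */
    unsigned long  frame;  /* proposed new field: mapped frame */
};

/*
 * Linear scan over the mapping table for an active entry whose frame
 * matches. Returns NULL if the frame is not grant mapped.
 */
static const struct sketch_grant_mapping *
find_mapping_by_frame(const struct sketch_grant_mapping *tbl, size_t n,
                      unsigned long frame)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].flags != 0 && tbl[i].frame == frame)
            return &tbl[i];
    }
    return NULL;
}
```

A linear scan costs O(n) per lookup; if this ended up on a hot path, an index keyed by frame would probably be needed, at the price of extra memory per mapping.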
Thanks
Annie
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2014-07-28 16:14 UTC | newest]
Thread overview: 17+ messages
2014-07-16 20:36 Rebooting domu fails in nfs share exported from another domu on the same dom0 annie li
2014-07-17 15:49 ` Roger Pau Monné
2014-07-17 16:56 ` annie li
2014-07-18 18:53 ` Konrad Rzeszutek Wilk
2014-07-18 19:31 ` annie li
2014-07-18 19:43 ` Konrad Rzeszutek Wilk
2014-07-18 20:17 ` annie li
2014-07-18 20:22 ` Konrad Rzeszutek Wilk
2014-07-18 20:31 ` annie li
2014-07-18 21:07 ` Konrad Rzeszutek Wilk
2014-07-18 21:43 ` annie li
2014-07-21 10:02 ` Wei Liu
2014-07-21 15:02 ` annie li
2014-07-21 23:05 ` Wei Liu
2014-07-23 1:58 ` annie li
2014-07-28 14:14 ` David Vrabel
2014-07-28 16:14 ` annie li