* xen-blkback unmap with network retansmission will cause a coredump
@ 2014-09-20 10:57 Chentao(Boby)
2014-09-22 10:01 ` Roger Pau Monné
2014-09-22 10:11 ` David Vrabel
0 siblings, 2 replies; 6+ messages in thread
From: Chentao(Boby) @ 2014-09-20 10:57 UTC (permalink / raw)
To: konrad.wilk, Roger Pau Monné
Cc: dengguoqiang, meiwanlong, mu.muyang, Yanqiangjun, liuyongan,
huangzhichao, xen-devel, zhangmin, wu.wubin
Hi Konrad and Roger,
When the xen-blkback module unmaps a granted page while an skb queued for network retransmission still references that mapped page, the host OS crashes.
The crash stack looks like this:
<ffffffff8041133e>{do_page_fault+0x38e}
<ffffffff8040d9e8>{page_fault+0x28}
<ffffffff80223cdb>{memcpy+0xb}
<ffffffff802325c2>{swiotlb_tbl_map_single+0x212}
<ffffffff8023274a>{swiotlb_map_page+0x17a}
<ffffffffa03468e6>{tg3:tg3_start_xmit+0x656}
<ffffffff80354d14>{dev_hard_start_xmit+0x334}
<ffffffff803721be>{sch_direct_xmit+0x1ae}
I searched online and found that Citrix engineers hit this problem a long time ago. As I understand it, they solved it by modifying the kernel stack.
Because that modification is very large, the Linux kernel community has not accepted it so far. I have a tentative idea: in the dispatch_rw_block_io function, if the I/O
is a write operation, use the grant copy hypercall instead of the grant map hypercall. I have tested this modification and it solves the problem.
What is your opinion of this modification? I look forward to your reply; any feedback is appreciated.
Best wishes.
Tao Chen
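The race described in this message can be pictured with a small toy model. The following is only a sketch in Python and not blkback code: `ToyDom0`, `grant_map`, `grant_unmap`, `grant_copy`, `retransmit`, and the addresses are all hypothetical names invented for illustration. It shows why a deferred user of a mapped page faults once the mapping is gone, and why copying the data into a dom0-owned page avoids that.

```python
class PageFault(Exception):
    """Raised when the toy model touches an address with no mapping."""


class ToyDom0:
    """Hypothetical stand-in for dom0 memory: address -> page contents."""

    def __init__(self):
        self.pages = {}

    def grant_map(self, addr, guest_page):
        # Models mapping a guest page into dom0 (the grant map path).
        self.pages[addr] = guest_page

    def grant_unmap(self, addr):
        # Models the unmap performed when the block request completes.
        del self.pages[addr]

    def grant_copy(self, addr, guest_page):
        # Models the grant copy path: dom0 keeps its own copy of the data.
        self.pages[addr] = bytes(guest_page)

    def read(self, addr):
        if addr not in self.pages:
            raise PageFault(hex(addr))
        return self.pages[addr]


def retransmit(dom0, queued_addr):
    """Models the network stack touching a page it queued earlier."""
    return dom0.read(queued_addr)


def write_with_map(dom0, guest_page, addr=0x1000):
    dom0.grant_map(addr, guest_page)
    queued_addr = addr      # the queued skb keeps referring to the mapped page
    dom0.grant_unmap(addr)  # request finished, mapping torn down
    return queued_addr


def write_with_copy(dom0, guest_page, addr=0x2000):
    dom0.grant_copy(addr, guest_page)
    return addr             # dom0-owned page, nothing to unmap
```

In this model, calling `retransmit` on the address returned by `write_with_map` raises `PageFault`, while the address returned by `write_with_copy` stays readable. The cost of the extra copy is not modelled here.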
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: xen-blkback unmap with network retansmission will cause a coredump
2014-09-20 10:57 xen-blkback unmap with network retansmission will cause a coredump Chentao(Boby)
@ 2014-09-22 10:01 ` Roger Pau Monné
2014-09-23 13:27 ` Chentao(Boby)
2014-09-22 10:11 ` David Vrabel
1 sibling, 1 reply; 6+ messages in thread
From: Roger Pau Monné @ 2014-09-22 10:01 UTC (permalink / raw)
To: Chentao(Boby), konrad.wilk
Cc: dengguoqiang, meiwanlong, mu.muyang, Yanqiangjun, liuyongan,
huangzhichao, xen-devel, zhangmin, wu.wubin
On 20/09/14 at 12:57, Chentao(Boby) wrote:
> Hi Konrad and Roger,
>
> When the xen-blkback module unmaps a granted page while an skb queued for network retransmission still references that mapped page, the host OS crashes.
> The crash stack looks like this:
> <ffffffff8041133e>{do_page_fault+0x38e}
> <ffffffff8040d9e8>{page_fault+0x28}
> <ffffffff80223cdb>{memcpy+0xb}
> <ffffffff802325c2>{swiotlb_tbl_map_single+0x212}
> <ffffffff8023274a>{swiotlb_map_page+0x17a}
> <ffffffffa03468e6>{tg3:tg3_start_xmit+0x656}
> <ffffffff80354d14>{dev_hard_start_xmit+0x334}
> <ffffffff803721be>{sch_direct_xmit+0x1ae}
>
> I searched online and found that Citrix engineers hit this problem a long time ago. As I understand it, they solved it by modifying the kernel stack.
> Because that modification is very large, the Linux kernel community has not accepted it so far. I have a tentative idea: in the dispatch_rw_block_io function, if the I/O
> is a write operation, use the grant copy hypercall instead of the grant map hypercall. I have tested this modification and it solves the problem.
>
> What is your opinion of this modification? I look forward to your reply; any feedback is appreciated.
Hello,
Yes, using grant-copy instead of grant-map is going to solve the
problem, but it also defeats the purpose of persistent grants. I'm
afraid it is going to introduce a noticeable performance penalty.
IMHO a better solution would be to use GNTTABOP_unmap_and_replace with
the scratch balloon page instead of GNTTABOP_unmap_grant_ref. See
arch/x86/xen/p2m.c m2p_remove_override for an example implementation of
this procedure.
Roger.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: xen-blkback unmap with network retansmission will cause a coredump
2014-09-22 10:01 ` Roger Pau Monné
@ 2014-09-23 13:27 ` Chentao(Boby)
2014-09-23 14:16 ` Roger Pau Monné
0 siblings, 1 reply; 6+ messages in thread
From: Chentao(Boby) @ 2014-09-23 13:27 UTC (permalink / raw)
To: Roger Pau Monné, konrad.wilk
Cc: dengguoqiang, meiwanlong, mu.muyang, Yanqiangjun, liuyongan,
huangzhichao, xen-devel, zhangmin, wu.wubin
On 2014/9/22 18:01, Roger Pau Monné wrote:
> On 20/09/14 at 12:57, Chentao(Boby) wrote:
>> Hi Konrad and Roger,
>>
>> When the xen-blkback module unmaps a granted page while an skb queued for network retransmission still references that mapped page, the host OS crashes.
>> The crash stack looks like this:
>> <ffffffff8041133e>{do_page_fault+0x38e}
>> <ffffffff8040d9e8>{page_fault+0x28}
>> <ffffffff80223cdb>{memcpy+0xb}
>> <ffffffff802325c2>{swiotlb_tbl_map_single+0x212}
>> <ffffffff8023274a>{swiotlb_map_page+0x17a}
>> <ffffffffa03468e6>{tg3:tg3_start_xmit+0x656}
>> <ffffffff80354d14>{dev_hard_start_xmit+0x334}
>> <ffffffff803721be>{sch_direct_xmit+0x1ae}
>>
>> I searched online and found that Citrix engineers hit this problem a long time ago. As I understand it, they solved it by modifying the kernel stack.
>> Because that modification is very large, the Linux kernel community has not accepted it so far. I have a tentative idea: in the dispatch_rw_block_io function, if the I/O
>> is a write operation, use the grant copy hypercall instead of the grant map hypercall. I have tested this modification and it solves the problem.
>>
>> What is your opinion of this modification? I look forward to your reply; any feedback is appreciated.
>
> Hello,
>
> Yes, using grant-copy instead of grant-map is going to solve the
> problem, but it also defeats the purpose of persistent grants. I'm
> afraid it is going to introduce a noticeable performance penalty.
>
Roger, you are right. We measured a performance penalty of more than 20% for 1M, 100% sequential writes at an I/O depth of 128
when the workload runs on a ramdisk.
> IMHO a better solution would be to use GNTTABOP_unmap_and_replace with
> the scratch balloon page instead of GNTTABOP_unmap_grant_ref. See
> arch/x86/xen/p2m.c m2p_remove_override for an example implementation of
> this procedure.
>
You mean that if we replace GNTTABOP_unmap_grant_ref with GNTTABOP_unmap_and_replace in the xen-blkback module,
the problem will be solved. Is my understanding right?
> Roger.
> .
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: xen-blkback unmap with network retansmission will cause a coredump
2014-09-23 13:27 ` Chentao(Boby)
@ 2014-09-23 14:16 ` Roger Pau Monné
0 siblings, 0 replies; 6+ messages in thread
From: Roger Pau Monné @ 2014-09-23 14:16 UTC (permalink / raw)
To: Chentao(Boby), konrad.wilk
Cc: dengguoqiang, meiwanlong, mu.muyang, Yanqiangjun, liuyongan,
huangzhichao, xen-devel, zhangmin, wu.wubin
On 23/09/14 at 15:27, Chentao(Boby) wrote:
> On 2014/9/22 18:01, Roger Pau Monné wrote:
>> On 20/09/14 at 12:57, Chentao(Boby) wrote:
>>> Hi Konrad and Roger,
>>>
>>> When the xen-blkback module unmaps a granted page while an skb queued for network retransmission still references that mapped page, the host OS crashes.
>>> The crash stack looks like this:
>>> <ffffffff8041133e>{do_page_fault+0x38e}
>>> <ffffffff8040d9e8>{page_fault+0x28}
>>> <ffffffff80223cdb>{memcpy+0xb}
>>> <ffffffff802325c2>{swiotlb_tbl_map_single+0x212}
>>> <ffffffff8023274a>{swiotlb_map_page+0x17a}
>>> <ffffffffa03468e6>{tg3:tg3_start_xmit+0x656}
>>> <ffffffff80354d14>{dev_hard_start_xmit+0x334}
>>> <ffffffff803721be>{sch_direct_xmit+0x1ae}
>>>
>>> I searched online and found that Citrix engineers hit this problem a long time ago. As I understand it, they solved it by modifying the kernel stack.
>>> Because that modification is very large, the Linux kernel community has not accepted it so far. I have a tentative idea: in the dispatch_rw_block_io function, if the I/O
>>> is a write operation, use the grant copy hypercall instead of the grant map hypercall. I have tested this modification and it solves the problem.
>>>
>>> What is your opinion of this modification? I look forward to your reply; any feedback is appreciated.
>>
>> Hello,
>>
>> Yes, using grant-copy instead of grant-map is going to solve the
>> problem, but it also defeats the purpose of persistent grants. I'm
>> afraid it is going to introduce a noticeable performance penalty.
>>
> Roger, you are right. We measured a performance penalty of more than 20% for 1M, 100% sequential writes at an I/O depth of 128
> when the workload runs on a ramdisk.
>
>> IMHO a better solution would be to use GNTTABOP_unmap_and_replace with
>> the scratch balloon page instead of GNTTABOP_unmap_grant_ref. See
>> arch/x86/xen/p2m.c m2p_remove_override for an example implementation of
>> this procedure.
>>
> You mean that if we replace GNTTABOP_unmap_grant_ref with GNTTABOP_unmap_and_replace in the xen-blkback module,
> the problem will be solved. Is my understanding right?
Well, it's not a straight replacement. You will need to issue a
multicall that bundles the grant ref replacement and a MMU operation to
update the scratch page VA to point to the MFN. This is because the
grant replace will remove the MFN from the scratch page VA.
You can find an example of how to do this in m2p_remove_override in
the Linux kernel file arch/x86/xen/p2m.c.
Another option would be to introduce a new hypercall like David
suggests, that does a replacement without redirecting <new_addr> to the
null entry, this way you should be able to avoid the multicall.
Roger.
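The two-step procedure described in this reply can be illustrated with a toy page-table model. This is a hedged sketch in Python, not kernel code: `ToyPageTable`, `unmap_and_replace`, and `unmap_with_scratch` are hypothetical names, and the model assumes only the behaviour stated above, namely that the replace operation moves the frame found under the scratch address to the grant address and leaves the scratch address unmapped. In the kernel the two steps would be bundled into one multicall; here they are simply called in sequence.

```python
class ToyPageTable:
    """Hypothetical page table: virtual address -> frame number."""

    def __init__(self):
        self.pte = {}

    def map(self, va, mfn):
        self.pte[va] = mfn

    def lookup(self, va):
        if va not in self.pte:
            raise LookupError("page fault at %#x" % va)
        return self.pte[va]

    def unmap_and_replace(self, host_va, new_va):
        # Models the replace step: host_va now points at the frame that was
        # mapped under new_va, and new_va is left without a mapping.
        self.pte[host_va] = self.pte.pop(new_va)


def unmap_with_scratch(pt, host_va, scratch_va):
    """Replace the grant mapping with the scratch frame, then restore
    the scratch address so it can be used for the next unmap."""
    scratch_mfn = pt.lookup(scratch_va)
    pt.unmap_and_replace(host_va, scratch_va)  # step 1: grant replacement
    pt.map(scratch_va, scratch_mfn)            # step 2: MMU update


pt = ToyPageTable()
pt.map(0x1000, 42)   # granted frame mapped at the request address
pt.map(0x9000, 7)    # scratch frame
unmap_with_scratch(pt, 0x1000, 0x9000)
```

After `unmap_with_scratch`, both addresses resolve to the scratch frame, so a late access to the old grant address reads harmless data instead of faulting. Skipping step 2 would leave the scratch address unmapped in this model, which is the reason given above for needing the extra MMU operation.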
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: xen-blkback unmap with network retansmission will cause a coredump
2014-09-20 10:57 xen-blkback unmap with network retansmission will cause a coredump Chentao(Boby)
2014-09-22 10:01 ` Roger Pau Monné
@ 2014-09-22 10:11 ` David Vrabel
2014-09-23 13:36 ` Chentao(Boby)
1 sibling, 1 reply; 6+ messages in thread
From: David Vrabel @ 2014-09-22 10:11 UTC (permalink / raw)
To: Chentao(Boby), konrad.wilk, Roger Pau Monné
Cc: meiwanlong, mu.muyang, Yanqiangjun, liuyongan, huangzhichao,
xen-devel, dengguoqiang, zhangmin, wu.wubin
On 20/09/14 11:57, Chentao(Boby) wrote:
> Hi Konrad and Roger,
>
> When the xen-blkback module unmaps a granted page while an skb queued
> for network retransmission still references that mapped page, the
> host OS crashes.
>
> The crash stack looks like this:
> <ffffffff8041133e>{do_page_fault+0x38e}
> <ffffffff8040d9e8>{page_fault+0x28}
> <ffffffff80223cdb>{memcpy+0xb}
> <ffffffff802325c2>{swiotlb_tbl_map_single+0x212}
> <ffffffff8023274a>{swiotlb_map_page+0x17a}
> <ffffffffa03468e6>{tg3:tg3_start_xmit+0x656}
> <ffffffff80354d14>{dev_hard_start_xmit+0x334}
> <ffffffff803721be>{sch_direct_xmit+0x1ae}
What dom0 (backend) kernel are you using? Which backend and what storage?
> I searched online and found that Citrix engineers hit this problem a
> long time ago. As I understand it, they solved it by modifying the
> kernel stack. Because that modification is very large, the Linux
> kernel community has not accepted it so far. I have a tentative idea:
> in the dispatch_rw_block_io function, if the I/O is a write
> operation, use the grant copy hypercall instead of the grant map
> hypercall. I have tested this modification and it solves the problem.
Switching to grant copy will reduce performance significantly in many cases.
This was fixed for user space backends by replacing the foreign mapping
with a mapping of a scratch page, when unmapping the grant.
Something similar should be done for kernel-only foreign mappings. This
requires a GNTOP_unmap_and_duplicate hypercall sub-op to allow efficient
batching.
David
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: xen-blkback unmap with network retansmission will cause a coredump
2014-09-22 10:11 ` David Vrabel
@ 2014-09-23 13:36 ` Chentao(Boby)
0 siblings, 0 replies; 6+ messages in thread
From: Chentao(Boby) @ 2014-09-23 13:36 UTC (permalink / raw)
To: David Vrabel, konrad.wilk, Roger Pau Monné
Cc: meiwanlong, mu.muyang, Yanqiangjun, liuyongan, huangzhichao,
xen-devel, dengguoqiang, zhangmin, wu.wubin
On 2014/9/22 18:11, David Vrabel wrote:
> On 20/09/14 11:57, Chentao(Boby) wrote:
>> Hi Konrad and Roger,
>>
>> When the xen-blkback module unmaps a granted page while an skb queued
>> for network retransmission still references that mapped page, the
>> host OS crashes.
>>
>> The crash stack looks like this:
>> <ffffffff8041133e>{do_page_fault+0x38e}
>> <ffffffff8040d9e8>{page_fault+0x28}
>> <ffffffff80223cdb>{memcpy+0xb}
>> <ffffffff802325c2>{swiotlb_tbl_map_single+0x212}
>> <ffffffff8023274a>{swiotlb_map_page+0x17a}
>> <ffffffffa03468e6>{tg3:tg3_start_xmit+0x656}
>> <ffffffff80354d14>{dev_hard_start_xmit+0x334}
>> <ffffffff803721be>{sch_direct_xmit+0x1ae}
>
> What dom0 (backend) kernel are you using? Which backend and what storage?
>
The dom0 kernel is SUSE 11 SP3 (3.0.93-0.8-xen). The backend is xen-blkback and the storage is an IP SAN.
>> I searched online and found that Citrix engineers hit this problem a
>> long time ago. As I understand it, they solved it by modifying the
>> kernel stack. Because that modification is very large, the Linux
>> kernel community has not accepted it so far. I have a tentative idea:
>> in the dispatch_rw_block_io function, if the I/O is a write
>> operation, use the grant copy hypercall instead of the grant map
>> hypercall. I have tested this modification and it solves the problem.
>
> Switching to grant copy will reduce performance significantly in many cases.
>
> This was fixed for user space backends by replacing the foreign mapping
> with a mapping of a scratch page, when unmapping the grant.
>
> Something similar should be done for kernel-only foreign mappings. This
> requires a GNTOP_unmap_and_duplicate hypercall sub-op to allow efficient
> batching.
>
> David
> .
>
Thanks for your reply, David.
^ permalink raw reply [flat|nested] 6+ messages in thread