* wbinvd optimization for pass through domain
@ 2007-11-15 3:05 Dong, Eddie
2007-11-15 7:04 ` question about io path in the front/backend tgh
2007-11-16 16:29 ` wbinvd optimization for pass through domain Keir Fraser
0 siblings, 2 replies; 11+ messages in thread
From: Dong, Eddie @ 2007-11-15 3:05 UTC (permalink / raw)
To: xen-devel
This patch optimizes WBINVD exit emulation for pass-through domains
to avoid "always wbinvd" whenever a VCPU is migrated. Instead, it executes
a host wbinvd on all host CPUs when a WBINVD VM exit occurs.
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
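
As a reading aid, the policy described above can be sketched as two small
decision functions. This is a minimal model and not the Xen code: the enum
and function names are made up for illustration, has_passthrough_dev stands
for the "pdev_list is not empty" check, and wbinvd_exiting stands for
cpu_has_wbinvd_exiting. That the resume path flushes only the pCPU the VCPU
last ran on is an assumption inferred from the active_cpu lines in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: this models the policy only, not Xen's real code. */
enum flush_action {
    FLUSH_NONE,      /* nothing to do */
    FLUSH_OLD_CPU,   /* wbinvd on the pCPU the VCPU is leaving (assumed) */
    FLUSH_LOCAL,     /* wbinvd on the current pCPU only */
    FLUSH_ALL_CPUS   /* wbinvd on every online pCPU */
};

/* Action taken when a VCPU is resumed after moving to another pCPU. */
enum flush_action on_vcpu_migrate(bool has_passthrough_dev, bool wbinvd_exiting)
{
    if (!has_passthrough_dev)
        return FLUSH_NONE;
    /* With WBINVD exiting available, the flush is deferred to the exit. */
    return wbinvd_exiting ? FLUSH_NONE : FLUSH_OLD_CPU;
}

/* Action taken when the guest's WBINVD is intercepted. */
enum flush_action on_wbinvd_exit(bool has_passthrough_dev, bool wbinvd_exiting)
{
    if (!has_passthrough_dev)
        return FLUSH_NONE;
    return wbinvd_exiting ? FLUSH_ALL_CPUS : FLUSH_LOCAL;
}
```

For a pass-through domain on hardware with WBINVD exiting,
on_vcpu_migrate(true, true) yields FLUSH_NONE and on_wbinvd_exit(true, true)
yields FLUSH_ALL_CPUS, which is the trade the patch makes.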
[-- Attachment #2: wbinvd-opt.patch --]
[-- Type: application/octet-stream, Size: 3362 bytes --]
diff -r dfca1120813f xen/arch/x86/hvm/vmx/vmcs.c
--- a/xen/arch/x86/hvm/vmx/vmcs.c Sun Nov 11 18:28:57 2007 +0000
+++ b/xen/arch/x86/hvm/vmx/vmcs.c Wed Nov 14 17:19:44 2007 +0800
@@ -766,7 +766,7 @@ void vm_resume_fail(unsigned long eflags
domain_crash_synchronous();
}
-static void flush_cache(void *info)
+void flush_cache(void *info)
{
wbinvd();
}
@@ -784,10 +784,14 @@ void vmx_do_resume(struct vcpu *v)
{
/* For pass-through domain, guest PCI-E device driver may leverage the
* "Non-Snoop" I/O, and explicitly "WBINVD" or "CFLUSH" to a RAM space.
- * In that case, if migration occurs before "WBINVD" or "CFLUSH", need
- * to maintain data consistency.
+ * Since migration may occur before "WBINVD" or "CFLUSH", we need to
+ * maintain data consistency by either:
+ * 1: flush cache (wbinvd) when the guest is scheduled out if there is
+ * no wbinvd exit, or
+ * 2: execute wbinvd on all dirty pCPUs when guest wbinvd exits.
*/
- if ( !list_empty(&(domain_hvm_iommu(v->domain)->pdev_list)) )
+ if ( !list_empty(&(domain_hvm_iommu(v->domain)->pdev_list)) &&
+ !cpu_has_wbinvd_exiting )
{
int cpu = v->arch.hvm_vmx.active_cpu;
if ( cpu != -1 )
diff -r dfca1120813f xen/arch/x86/hvm/vmx/vmx.c
--- a/xen/arch/x86/hvm/vmx/vmx.c Sun Nov 11 18:28:57 2007 +0000
+++ b/xen/arch/x86/hvm/vmx/vmx.c Wed Nov 14 17:51:04 2007 +0800
@@ -2915,14 +2915,21 @@ asmlinkage void vmx_vmexit_handler(struc
__update_guest_eip(inst_len);
if ( !list_empty(&(domain_hvm_iommu(v->domain)->pdev_list)) )
{
- wbinvd();
- /* Disable further WBINVD intercepts. */
- if ( (exit_reason == EXIT_REASON_WBINVD) &&
- (vmx_cpu_based_exec_control &
- CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) )
- __vmwrite(SECONDARY_VM_EXEC_CONTROL,
- vmx_secondary_exec_control &
- ~SECONDARY_EXEC_WBINVD_EXITING);
+ if ( cpu_has_wbinvd_exiting ) {
+ extern void flush_cache(void *info);
+
+ on_selected_cpus(cpu_online_map, flush_cache, NULL, 1, 1);
+ }
+ else {
+ wbinvd();
+ /* Disable further WBINVD intercepts. */
+ if ( (exit_reason == EXIT_REASON_WBINVD) &&
+ (vmx_cpu_based_exec_control &
+ CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) )
+ __vmwrite(SECONDARY_VM_EXEC_CONTROL,
+ vmx_secondary_exec_control &
+ ~SECONDARY_EXEC_WBINVD_EXITING);
+ }
}
break;
}
diff -r dfca1120813f xen/include/asm-x86/hvm/vmx/vmcs.h
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h Sun Nov 11 18:28:57 2007 +0000
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h Wed Nov 14 17:07:54 2007 +0800
@@ -136,6 +136,8 @@ extern u32 vmx_secondary_exec_control;
extern bool_t cpu_has_vmx_ins_outs_instr_info;
+#define cpu_has_wbinvd_exiting \
+ (vmx_secondary_exec_control & SECONDARY_EXEC_WBINVD_EXITING)
#define cpu_has_vmx_virtualize_apic_accesses \
(vmx_secondary_exec_control & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)
#define cpu_has_vmx_tpr_shadow \
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
* question about io path in the front/backend
2007-11-15 3:05 wbinvd optimization for pass through domain Dong, Eddie
@ 2007-11-15 7:04 ` tgh
2007-12-04 2:58 ` Mark Williamson
2007-11-16 16:29 ` wbinvd optimization for pass through domain Keir Fraser
1 sibling, 1 reply; 11+ messages in thread
From: tgh @ 2007-11-15 7:04 UTC (permalink / raw)
To: xen-devel
Hi,
I have read some documents and wiki pages about the split driver model in
Xen, and I am confused about the I/O path that a sys_read() takes through
the domU and dom0. Does sys_read() in the domU pass through the VFS and,
say, ext3 in the domU, and then insert a request into the request_queue of
the frontend driver?
Suppose the domU is set up with a *.img file in dom0. What do the frontend
and backend drivers do then? Does the frontend transmit the request to the
backend?
What does the backend driver do after that? Does it pass the request to the
physical driver in dom0? Or does it turn the request into a read()
operation, submit it to the VFS and, say, ext3 in dom0, and so run another
more or less complete I/O path in dom0?
If the backend passes the request to the physical driver directly, how does
it deal with the request's virtual addresses, and how does it manage the
bio buffers? Do the physical driver, the backend and the frontend share the
bio buffers in some way, or how does Xen deal with this?
I am confused about all of this. Could someone help me?
Thanks in advance
* Re: wbinvd optimization for pass through domain
2007-11-15 3:05 wbinvd optimization for pass through domain Dong, Eddie
2007-11-15 7:04 ` question about io path in the front/backend tgh
@ 2007-11-16 16:29 ` Keir Fraser
2007-11-19 5:08 ` Dong, Eddie
1 sibling, 1 reply; 11+ messages in thread
From: Keir Fraser @ 2007-11-16 16:29 UTC (permalink / raw)
To: Dong, Eddie, xen-devel
Does CLFLUSH cause vmexit? If not this seems a bit dodgy.
-- Keir
On 15/11/07 03:05, "Dong, Eddie" <eddie.dong@intel.com> wrote:
>
> This patch optimizes WBINVD exit emulation for pass-through domains
> to avoid "always wbinvd" whenever a VCPU is migrated. Instead, it executes
> a host wbinvd on all host CPUs when a WBINVD VM exit occurs.
>
>
> Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
* RE: wbinvd optimization for pass through domain
2007-11-16 16:29 ` wbinvd optimization for pass through domain Keir Fraser
@ 2007-11-19 5:08 ` Dong, Eddie
2007-11-19 11:48 ` Keir Fraser
0 siblings, 1 reply; 11+ messages in thread
From: Dong, Eddie @ 2007-11-19 5:08 UTC (permalink / raw)
To: Keir Fraser, xen-devel
xen-devel-bounces@lists.xensource.com wrote:
> Does CLFLUSH cause vmexit? If not this seems a bit dodgy.
>
CLFLUSH doesn't cause a VM exit, but it behaves differently from WBINVD.
The latter only writes back the dirty lines in the local processor,
while a CLFLUSH invalidation is broadcast throughout the cache
coherency domain.
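
The difference can be illustrated with a toy model. This is only a sketch
under simplifying assumptions: each CPU's cache is reduced to an array of
"dirty" flags, and all names (model_wbinvd, model_clflush and so on) are
hypothetical, not Xen or hardware interfaces.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS  4
#define NR_LINES 8

/* dirty[c][l] is true when CPU c holds a modified copy of cache line l. */
bool dirty[NR_CPUS][NR_LINES];

/* WBINVD: writes back and invalidates every line, but only on one CPU. */
void model_wbinvd(int cpu)
{
    for (int l = 0; l < NR_LINES; l++)
        dirty[cpu][l] = false;
}

/* CLFLUSH: flushes a single line from every CPU in the coherency domain. */
void model_clflush(int line)
{
    for (int c = 0; c < NR_CPUS; c++)
        dirty[c][line] = false;
}

/* What the patch does on a WBINVD exit: run WBINVD on every CPU. */
void model_wbinvd_all(void)
{
    for (int c = 0; c < NR_CPUS; c++)
        model_wbinvd(c);
}

/* Returns true if any CPU still holds a dirty copy of the line. */
bool line_dirty_anywhere(int line)
{
    for (int c = 0; c < NR_CPUS; c++)
        if (dirty[c][line])
            return true;
    return false;
}
```

In this model a WBINVD on CPU 1 leaves a line dirtied on CPU 0 untouched,
whereas a CLFLUSH of that line clears it wherever it lives. That is why an
unintercepted CLFLUSH is safe across migration while WBINVD needs help.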
Eddie
* Re: wbinvd optimization for pass through domain
2007-11-19 5:08 ` Dong, Eddie
@ 2007-11-19 11:48 ` Keir Fraser
0 siblings, 0 replies; 11+ messages in thread
From: Keir Fraser @ 2007-11-19 11:48 UTC (permalink / raw)
To: Dong, Eddie, xen-devel
Ah, so it does. That's all right then.
-- Keir
On 19/11/07 05:08, "Dong, Eddie" <eddie.dong@intel.com> wrote:
> xen-devel-bounces@lists.xensource.com wrote:
>> Does CLFLUSH cause vmexit? If not this seems a bit dodgy.
>>
> CLFLUSH doesn't cause a VM exit, but it behaves differently from WBINVD.
> The latter only writes back the dirty lines in the local processor,
> while a CLFLUSH invalidation is broadcast throughout the cache
> coherency domain.
>
> Eddie
* Re: question about io path in the front/backend
2007-11-15 7:04 ` question about io path in the front/backend tgh
@ 2007-12-04 2:58 ` Mark Williamson
2007-12-05 7:19 ` tgh
2007-12-07 9:25 ` tgh
0 siblings, 2 replies; 11+ messages in thread
From: Mark Williamson @ 2007-12-04 2:58 UTC (permalink / raw)
To: xen-devel; +Cc: tgh
> I have read some documents and wiki pages about the split driver model in
> Xen, and I am confused about the I/O path that a sys_read() takes through
> the domU and dom0. Does sys_read() in the domU pass through the VFS and,
> say, ext3 in the domU, and then insert a request into the request_queue
> of the frontend driver?
Sounds like you have the right idea. Requests get queued with the frontend
driver in terms of Linux structures. IO requests to satisfy these are then
placed into the shared memory ring so that the backend can find out what
we're asking for.
> Suppose the domU is set up with a *.img file in dom0. What do the
> frontend and backend drivers do then? Does the frontend transmit the
> request to the backend?
Yes, the frontend does this by putting requests into the shared memory
ring buffer, which is also accessible by the backend. The frontend then sends
an event to the backend; this causes an interrupt in the backend so that it
knows it must check the shared memory.
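
A minimal sketch of that handshake, assuming a single-producer,
single-consumer ring with free-running indices. The structure layout and
all names are hypothetical and much simpler than Xen's real ring macros;
the event channel is reduced to a counter and memory barriers are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8            /* must be a power of two */

struct blk_request {           /* hypothetical request descriptor */
    uint64_t sector;           /* first sector to read or write */
    uint32_t id;               /* lets the frontend match the response */
};

struct shared_ring {           /* lives in a page shared by both domains */
    uint32_t req_prod;         /* advanced by the frontend */
    uint32_t req_cons;         /* advanced by the backend */
    struct blk_request ring[RING_SIZE];
};

int pending_events;            /* stands in for the event channel */

/* Frontend: queue a request and notify the backend. False if ring is full. */
bool frontend_submit(struct shared_ring *r, struct blk_request req)
{
    if (r->req_prod - r->req_cons == RING_SIZE)
        return false;
    r->ring[r->req_prod & (RING_SIZE - 1)] = req;
    r->req_prod++;             /* a real ring needs a write barrier first */
    pending_events++;          /* "send an event" to the backend */
    return true;
}

/* Backend: run from the "interrupt"; fetch the next request, if any. */
bool backend_poll(struct shared_ring *r, struct blk_request *out)
{
    if (r->req_cons == r->req_prod)
        return false;
    *out = r->ring[r->req_cons & (RING_SIZE - 1)];
    r->req_cons++;
    return true;
}
```

The frontend calls frontend_submit() for each request; the backend, once
woken, calls backend_poll() in a loop until it returns false.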
> What does the backend driver do after that? Does it pass the request to
> the physical driver in dom0?
Yes. The backend responds to the interrupt by checking the shared memory for
new requests, then it maps parts of the domU's memory so that dom0 will be
able to write data into it. Then it submits requests to the Linux block IO
subsystem to fill that memory with data. The Linux block IO system
eventually sends these requests to the device driver, to do the IO directly
into the mapped domU memory.
> Or does the backend turn the request into a read() operation, submit it
> to the VFS and, say, ext3 in dom0, and so run another more or less
> complete I/O path in dom0?
If you're just exporting a phy: device to the guest, then the block IO
requests go down to the block device driver for that device and are serviced
there. E.g. if I export the IDE device phy:/dev/hda to my guest, then the IDE
driver will satisfy the IO requests directly.
Requests go backend -> block layer -> real device driver
If you're using a file: device then you have to go through the filesystem
layer... So the IO requests go backend -> block layer -> loopback block
device -> ext3 -> block layer (again) -> real device driver
If you're using blktap then the requests take a trip via userspace before
getting submitted.
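
The two paths spelled out above can be written down as data, which makes
the extra hops of a file: device easy to count. This is just a sketch: the
struct and names are hypothetical, and blktap is left out because its
layers are not listed here.

```c
#include <assert.h>
#include <string.h>

#define MAX_LAYERS 8

struct io_path {
    int nr;                          /* number of layers crossed in dom0 */
    const char *layer[MAX_LAYERS];   /* in the order a request sees them */
};

/* phy: device, as described above. */
const struct io_path phy_path = {
    3, { "backend", "block layer", "real device driver" }
};

/* file: device, as described above (ext3 is the example filesystem). */
const struct io_path file_path = {
    6, { "backend", "block layer", "loopback block device",
         "ext3", "block layer", "real device driver" }
};

/* Count how many times a request crosses the named layer. */
int count_layer(const struct io_path *p, const char *name)
{
    int hits = 0;
    for (int i = 0; i < p->nr; i++)
        if (strcmp(p->layer[i], name) == 0)
            hits++;
    return hits;
}
```

Here count_layer(&file_path, "block layer") is 2 while the phy: path
crosses the block layer once, and the file: path is three layers longer.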
> If the backend passes the request to the physical driver directly, how
> does it deal with the request's virtual addresses, and how does it manage
> the bio buffers? Do the physical driver, the backend and the frontend
> share the bio buffers in some way, or how does Xen deal with this?
I hope what I've said clarifies things a bit.
Cheers,
Mark
--
Dave: Just a question. What use is a unicyle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
* Re: question about io path in the front/backend
2007-12-04 2:58 ` Mark Williamson
@ 2007-12-05 7:19 ` tgh
2007-12-07 9:25 ` tgh
1 sibling, 0 replies; 11+ messages in thread
From: tgh @ 2007-12-05 7:19 UTC (permalink / raw)
To: Mark Williamson; +Cc: xen-devel
Thank you for your explanation.
As for phy: devices and file: devices, which performs better: the phy:
device, because of its direct path, or the file: device?
I am also confused about how the frontend's I/O requests are remapped to
the physical device in the phy: I/O path. Are the blocks of a phy: device
contiguous or not? And for a file: device the blocks are not contiguous,
is that right?
Thanks in advance
* Re: question about io path in the front/backend
2007-12-04 2:58 ` Mark Williamson
2007-12-05 7:19 ` tgh
@ 2007-12-07 9:25 ` tgh
2007-12-09 2:10 ` Mark Williamson
1 sibling, 1 reply; 11+ messages in thread
From: tgh @ 2007-12-07 9:25 UTC (permalink / raw)
To: Mark Williamson, Tim Deegan; +Cc: xen-devel
Hi,
In phy: mode, dom0 and domU share the page holding the I/O request that is
passed between the frontend and the backend, is that right? In native Linux
there are buffers and caches, such as the bio, that are shared along the
I/O path, say by the filesystem and the block driver, is that right? What
about caches and buffers in Xen's frontend/backend model, say in phy: mode:
do the frontend and backend share some cache or buffer, or none at all?
And what is the role of the grant table? Does the grant table (or the
shared page holding the I/O request or I/O data) act as the cache or buffer
of the native Linux I/O path? Or does the page shared between the frontend
and the backend serve only to transfer the data and the I/O request,
without the caching or buffering role that, for example, the bio has in
native Linux?
Thanks in advance
* Re: question about io path in the front/backend
2007-12-07 9:25 ` tgh
@ 2007-12-09 2:10 ` Mark Williamson
2007-12-09 5:14 ` Pradeep Singh
0 siblings, 1 reply; 11+ messages in thread
From: Mark Williamson @ 2007-12-09 2:10 UTC (permalink / raw)
To: tgh; +Cc: xen-devel, Tim Deegan
> In phy: mode, dom0 and domU share the page holding the I/O request that
> is passed between the frontend and the backend, is that right?
Whatever the device mode, if the PV drivers are in use then the domU grants
dom0 the right to share the pages to be filled. dom0 then maps these pages,
fills them with data, then unmaps them and notifies the domU.
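
The grant, map, fill, unmap sequence can be sketched with a toy grant
table. Everything here is hypothetical and far simpler than Xen's grant
table interface: a "grant" is just a table slot pointing at a domU page,
and "mapping" hands dom0 a pointer to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DATA_PAGE_SIZE 4096
#define NR_GRANTS      4

struct grant_entry {            /* hypothetical, not Xen's layout */
    bool in_use;                /* the domU has granted access */
    bool mapped;                /* dom0 currently has the page mapped */
    unsigned char *page;        /* the domU page being shared */
};

struct grant_entry grant_table[NR_GRANTS];

/* domU side: offer a page to dom0. Returns a grant reference, or -1. */
int domu_grant_page(unsigned char *page)
{
    for (int ref = 0; ref < NR_GRANTS; ref++) {
        if (!grant_table[ref].in_use) {
            grant_table[ref] = (struct grant_entry){ true, false, page };
            return ref;
        }
    }
    return -1;
}

/* dom0 side: map a granted page. NULL if the reference was never granted. */
unsigned char *dom0_map(int ref)
{
    if (ref < 0 || ref >= NR_GRANTS || !grant_table[ref].in_use)
        return NULL;
    grant_table[ref].mapped = true;
    return grant_table[ref].page;
}

/* dom0 side: drop the mapping again. */
void dom0_unmap(int ref)
{
    grant_table[ref].mapped = false;
}

/* dom0 side: fill the granted page with data, then unmap it. */
bool dom0_fill(int ref, const unsigned char *data, size_t len)
{
    if (len > DATA_PAGE_SIZE)
        return false;
    unsigned char *dst = dom0_map(ref);
    if (dst == NULL)
        return false;
    memcpy(dst, data, len);
    dom0_unmap(ref);
    return true;                /* the caller would now notify the domU */
}
```

The point of the sketch is the access check in dom0_map(): dom0 can only
reach pages the domU has explicitly granted, and only for as long as the
mapping exists.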
> In native Linux there are buffers and caches, such as the bio, that are
> shared along the I/O path, say by the filesystem and the block driver, is
> that right? What about caches and buffers in Xen's frontend/backend
> model, say in phy: mode: do the frontend and backend share some cache or
> buffer, or none at all?
The domU maintains its own caches and dom0 just supplies it with data. dom0
doesn't cache stuff on behalf of the domU - in fact, the block backend driver
bypasses the page cache in dom0 entirely and transfers data directly into the
domU's memory.
> And what is the role of the grant table? Does the grant table (or the
> shared page holding the I/O request or I/O data) act as the cache or
> buffer of the native Linux I/O path?
The pages which the frontend domain granted to the backend are the sources or
destinations of the IO data. They are not part of the page cache in dom0.
dom0 maps them, then creates BIOs pointing to them and submits these to the
block layer. dom0's block layer gets the data from the device and puts it
into this mapped domU memory (or takes data from memory and puts it on disk).
In the domU, the pages form part of that domain's own private page cache but
dom0 does not know about this.
> Or does the page shared between the frontend and the backend serve only
> to transfer the data and the I/O request, without the caching or
> buffering role that, for example, the bio has in native Linux?
The shared ring page between the front and back ends is just used for
transferring details of requests (like "get this data in memory, and put it
here on disk"), i.e. the shared ring contains request metadata. The pages
that are temporarily mapped from the domU into dom0 contain the actual data.
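
To make the metadata/data split concrete, here is a small sketch. The
descriptor layout and all names are assumptions made for illustration (the
real request format is not shown in this thread), the "disk" is an
in-memory array, and grant mapping is reduced to an index into an array of
domU pages.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NR_SECTORS  16
#define NR_PAGES    2
#define PAGE_BYTES  4096

enum blk_op { BLK_READ, BLK_WRITE };

/* What travels in the shared ring: a small descriptor, never the payload. */
struct ring_request {
    enum blk_op op;
    uint64_t    sector;     /* where on the virtual disk */
    uint32_t    page_ref;   /* which granted domU page holds the data */
};

unsigned char disk[NR_SECTORS][SECTOR_SIZE];     /* stand-in for the device */
unsigned char domu_pages[NR_PAGES][PAGE_BYTES];  /* pages the domU granted */

/* Backend: move one sector between the disk and the referenced domU page.
 * Returns 0 on success, -1 if the descriptor is out of range. */
int backend_service(const struct ring_request *req)
{
    if (req->sector >= NR_SECTORS || req->page_ref >= NR_PAGES)
        return -1;
    unsigned char *page = domu_pages[req->page_ref];
    if (req->op == BLK_READ)
        memcpy(page, disk[req->sector], SECTOR_SIZE);   /* disk -> domU */
    else
        memcpy(disk[req->sector], page, SECTOR_SIZE);   /* domU -> disk */
    return 0;
}
```

The descriptor is a few bytes long, while the data it refers to stays in
the domU's own page and is never copied through the ring.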
Cheers,
Mark
--
Dave: Just a question. What use is a unicyle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
* Re: question about io path in the front/backend
2007-12-09 2:10 ` Mark Williamson
@ 2007-12-09 5:14 ` Pradeep Singh
2007-12-09 19:47 ` Mark Williamson
0 siblings, 1 reply; 11+ messages in thread
From: Pradeep Singh @ 2007-12-09 5:14 UTC (permalink / raw)
To: Mark Williamson; +Cc: xen-devel, tgh, Tim Deegan
On Sun, 9 Dec 2007 02:10:54 +0000
Mark Williamson <mark.williamson@cl.cam.ac.uk> wrote:
>
> Whatever the device mode, if the PV drivers are in use then the domU
> grants dom0 the right to share the pages to be filled. dom0 then
> maps these pages, fills them with data, then unmaps them and notifies
> the domU.
>
[...]
> > What about caches and buffers in Xen's frontend/backend model, say in
> > phy: mode: do the frontend and backend share some cache or buffer, or
> > none at all?
>
> The domU maintains its own caches and dom0 just supplies it with
> data. dom0 doesn't cache stuff on behalf of the domU - in fact, the
> block backend driver bypasses the page cache in dom0 entirely and
> transfers data directly into the domU's memory.
But isn't that a performance hit?
Wouldn't caching in dom0 on behalf of the domU result in better
performance? After all, dom0 is the domain that actually does the I/O with
the disk/IO device. Keeping the domU's cache in dom0 does sound like added
complexity, I must admit. But I guess solutions to this could be found.
Thoughts?
Thanks,
Pradeep
--
heh...people do try to read my signature.
http://eagain.wordpress.com
http://emptydomain.googlepages.com
* Re: question about io path in the front/backend
2007-12-09 5:14 ` Pradeep Singh
@ 2007-12-09 19:47 ` Mark Williamson
0 siblings, 0 replies; 11+ messages in thread
From: Mark Williamson @ 2007-12-09 19:47 UTC (permalink / raw)
To: Pradeep Singh; +Cc: xen-devel, tgh, Tim Deegan
> > The domU maintains its own caches and dom0 just supplies it with
> > data. dom0 doesn't cache stuff on behalf of the domU - in fact, the
> > block backend driver bypasses the page cache in dom0 entirely and
> > transfers data directly into the domU's memory.
>
> But isn't that a performance hit?
> Wouldn't caching in dom0 on behalf of the domU result in better
> performance? After all, dom0 is the domain that actually does the I/O with
> the disk/IO device. Keeping the domU's cache in dom0 does sound like added
> complexity, I must admit. But I guess solutions to this could be found.
> Thoughts?
Thing is, the domU *knows* what data is useful to it and therefore which bits
it wants to cache and which to throw away. So in principle it can make
better decisions. Also, Linux in the domU is basically going to *want* to
cache things anyhow, so unless you actually disabled that there may not be
much point in adding another level of cache in dom0 (though I can imagine
that could be useful in some circumstances).
By not caching stuff in dom0, you also avoid the possibility of domUs
interfering so much (performance-wise) with each other and with dom0
applications by pushing relevant data out of the cache. With each domain
having its own private cache, you avoid contention for cache space.
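
The contention argument can be illustrated with a toy LRU cache. This is a
sketch under stated assumptions, not how the Linux page cache works: the
cache holds four blocks, eviction is strict LRU, and block_key() is a
made-up helper that tags each block with its domain.

```c
#include <assert.h>
#include <stdbool.h>

#define CACHE_SLOTS 4

struct lru_cache {
    int  nr;                    /* slots in use */
    long key[CACHE_SLOTS];      /* key[0] is the most recently used */
};

/* Look up a block; on a miss insert it, evicting the least recently used
 * entry if the cache is full. Returns true on a hit. */
bool cache_access(struct lru_cache *c, long key)
{
    int i;
    bool hit = false;

    for (i = 0; i < c->nr; i++) {
        if (c->key[i] == key) {
            hit = true;
            break;
        }
    }
    if (!hit) {
        if (c->nr < CACHE_SLOTS)
            c->nr++;
        i = c->nr - 1;          /* last slot: new, or the LRU victim */
    }
    /* Shift entries [0, i) down by one and put the key at the front. */
    for (; i > 0; i--)
        c->key[i] = c->key[i - 1];
    c->key[0] = key;
    return hit;
}

/* Encode (domain, block) in one key so domains never alias each other. */
long block_key(int domain, int block)
{
    return domain * 1000000L + block;
}
```

If domain 1 caches a block in a shared cache and domain 2 then streams
through eight blocks, domain 1's block is gone when it comes back for it.
With one lru_cache per domain the same access pattern leaves domain 1's
block in place.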
Finally, it's worth noting that for applications the Linux page cache is able
to make use of sharing between applications accessing the same data. For
virtual machines in Xen, if they're all using different virtual disks then
there's not really anything readily shareable - they're accessing different
data. So as things are, it's harder to get such a benefit from the page
cache than it would be for applications running in the same domain.
However...
Having private caches for each domU may be less efficient overall than having
one big shared cache, even though it ensures better performance isolation.
And if domains are using a copy-on-write block device (or something like
that) then they may actually be accessing the same data sometimes.
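To put a rough number on the duplication, here is a small sketch. It assumes unbounded caches and that equal block ids in a common base image mean identical data. The domain names and block counts are invented for illustration.

```python
def cached_copies(reads, shared):
    """Count block copies held in memory after a series of reads.

    reads: list of (domain, block_id) pairs, where block_id names a block
    of a common base image (hypothetical workload, no eviction modelled).
    """
    if shared:
        # One cache keyed by block only: identical data is stored once.
        return len({block for _domain, block in reads})
    # One cache per domain: every domain keeps its own copy.
    return len({(domain, block) for domain, block in reads})


# Three domains each read the same 100 blocks of a shared base image.
reads = [(dom, blk) for dom in ("domU1", "domU2", "domU3") for blk in range(100)]
private_total = cached_copies(reads, shared=False)
shared_total = cached_copies(reads, shared=True)
```

Here the private caches hold 300 copies between them and the shared cache holds 100, which is the memory saving that page-cache sharing aims at.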
For this reason, various people, including myself, are exploring ways of
getting domains to share their page caches in order to improve memory use and
performance - particularly in the case where they're using some kind of
shared storage.
Other VMMs (e.g. qemu, VMware, kvm, and others) do things differently, either
by explicitly sitting *on top* of the Linux page cache or by implementing a
custom page sharing mechanism (in the case of VMware ESX, which is a
hypervisor rather different to Xen). This means that those VMMs can or do
use shared caching in order to improve memory usage and performance.
Cheers,
Mark
> Thanks,
> Pradeep
>
> > > and what about
> > > the grant table's function? Does the grant table (or the shared page
> > > filled with I/O requests or I/O data) function as the cache or buffer
> > > does in the native Linux I/O path?
> >
> > The pages which the frontend domain granted to the backend are the
> > sources or destinations of the IO data. They are not part of the
> > page cache in dom0. dom0 maps them, then creates BIOs pointing to
> > them and submits these to the block layer. dom0's block layer gets
> > the data from the device and puts it into this mapped domU memory (or
> > takes data from memory and puts it on disk).
> >
> > In the domU, the pages form part of that domain's own private page
> > cache but dom0 does not know about this.
> >
> > > Or does the shared page between the front/backend act
> > > only to transfer the data and I/O requests, or does it have a
> > > cache or buffer function, like the cache or buffer (such as a bio)
> > > in native Linux?
> >
> > The shared ring page between front and back ends is just used for
> > transferring details of requests (like "get this data in memory, and
> > put it here on disk"). i.e. The shared ring contains request
> > metadata. The pages that are temporarily mapped from the domU into
> > dom0 contain the actual data.
> >
> > Cheers,
> > Mark
> >
> > > Thanks in advance
> > >
> > > Mark Williamson wrote:
> > > >> I have read some documents and the wiki about the split driver in
> > > >> Xen, and I am confused about the I/O path which a sys_read()
> > > >> takes through the domU and dom0. Does sys_read() in the domU pass
> > > >> through the VFS and, say, ext3 in the domU, and insert a request into
> > > >> the request_queue of the frontend driver? Is that right?
> > > >
> > > > Sounds like you have the right idea. Requests get queued with the
> > > > frontend driver in terms of Linux structures. IO requests to
> > > > satisfy these are then placed into the shared memory ring so that
> > > > the backend can find out what we're asking for.
> > > >
> > > >> And then, say the domU is set up with a *.img file in dom0; what
> > > >> do the frontend and backend drivers do?
> > > >> Does the frontend transmit the request to the backend? Is that right?
> > > >
> > > > Yes, the frontend does this by putting requests into the shared
> > > > memory ringbuffer which is also accessible by the backend. The
> > > > > frontend then sends an event to the backend; this causes an
> > > > interrupt in the backend so that it knows it must check the
> > > > shared memory.
> > > >
> > > >> And then what does the backend driver do? Does the backend transfer
> > > >> the request to the physical driver in dom0? Is that right?
> > > >
> > > > Yes. The backend responds to the interrupt by checking the
> > > > > shared memory for new requests, then it maps parts of the domU's
> > > > memory so that dom0 will be able to write data into it. Then it
> > > > submits requests to the Linux block IO subsystem to fill that
> > > > memory with data. The Linux block IO system eventually sends
> > > > these requests to the device driver, to do the IO directly into
> > > > the mapped domU memory.
> > > >
> > > >> Or does the backend turn the request into some
> > > >> read() operation, submit it to the VFS and, say, ext3 in
> > > >> dom0, and go through another relatively complete I/O path in dom0?
> > > >> Is that right?
> > > >
> > > > If you're just exporting a phy: device to the guest, then the
> > > > block IO requests go down to the block device driver for that
> > > > > device and are serviced there. e.g. if I export IDE device
> > > > phy:/dev/hda to my guest, then the IDE driver will satisfy the IO
> > > > requests directly. Requests go backend -> block layer -> real
> > > > device driver
> > > >
> > > > If you're using a file: device then you have to go through the
> > > > filesystem layer... So the IO requests go backend -> block layer
> > > > -> loopback block device -> ext3 -> block layer (again) -> real
> > > > device driver
> > > >
> > > > If you're using blktap then the requests take a trip via
> > > > userspace before getting submitted.
> > > >
> > > >> Or, if the backend transfers the request to the physical driver
> > > >> directly, how does the backend deal with the request's virtual
> > > >> address, and how does the backend manage the bio buffer? Do the
> > > >> physical driver, backend and frontend share the bio buffer in some
> > > >> way, or how does Xen deal with it?
> > > >
> > > > I hope what I've said clarifies things a bit.
> > > >
> > > > Cheers,
> > > > Mark
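For anyone skimming the quoted exchange, the read path it describes (request metadata on the shared ring, data going directly into granted domU pages) can be modelled very loosely as below. This is a toy sketch: SharedRing, frontend_read, backend_service and the 512-byte sector size are hypothetical names and values for illustration, not the real blkfront/blkback interfaces.

```python
from collections import deque

SECTOR_SIZE = 512  # assumed for the example; not taken from the thread


class SharedRing:
    """Stand-in for the shared ring page: carries request metadata only."""

    def __init__(self):
        self.requests = deque()
        self.responses = deque()


def frontend_read(ring, domu_pages, grant_ref, sector):
    """domU side: set up a destination page, grant it, queue the request."""
    domu_pages[grant_ref] = bytearray(SECTOR_SIZE)
    ring.requests.append({"op": "read", "sector": sector, "gref": grant_ref})
    # A real frontend would now send an event to wake the backend.


def backend_service(ring, domu_pages, disk):
    """dom0 side: runs when the event arrives; no dom0 cache is involved."""
    while ring.requests:
        req = ring.requests.popleft()
        page = domu_pages[req["gref"]]              # "map" the granted page
        start = req["sector"] * SECTOR_SIZE
        page[:] = disk[start:start + SECTOR_SIZE]   # data lands in domU memory
        ring.responses.append({"gref": req["gref"], "status": "ok"})
```

After frontend_read(ring, pages, grant_ref=7, sector=2) followed by backend_service(ring, pages, disk), pages[7] holds the bytes of sector 2 and the ring itself has only ever carried the small request and response records.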
--
Dave: Just a question. What use is a unicycle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
end of thread, other threads:[~2007-12-09 19:47 UTC | newest]
Thread overview: 11+ messages
2007-11-15 3:05 wbinvd optimization for pass through domain Dong, Eddie
2007-11-15 7:04 ` question about io path in the front/backend tgh
2007-12-04 2:58 ` Mark Williamson
2007-12-05 7:19 ` tgh
2007-12-07 9:25 ` tgh
2007-12-09 2:10 ` Mark Williamson
2007-12-09 5:14 ` Pradeep Singh
2007-12-09 19:47 ` Mark Williamson
2007-11-16 16:29 ` wbinvd optimization for pass through domain Keir Fraser
2007-11-19 5:08 ` Dong, Eddie
2007-11-19 11:48 ` Keir Fraser