From mboxrd@z Thu Jan  1 00:00:00 1970
From: Mats Petersson <mats@planetcatfish.com>
Subject: RE: Buffered IO for IO?
Date: Mon, 23 Jul 2007 21:07:32 +0100
Message-ID: <46a50aa9.03e9300a.7e16.ffffe87a@mx.google.com>
References: <46a4faf3.2134440a.482b.6698@mx.google.com>
	<BD262A443AD428499D90AF8368C4528D8A18BE@fmsmsx411.amr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <BD262A443AD428499D90AF8368C4528D8A18BE@fmsmsx411.amr.corp.
	intel.com>
List-Unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Zulauf, John" <john.zulauf@intel.com>, Keir Fraser <keir@xensource.com>, Trolle Selander <trolle.selander@gmail.com>
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

At 20:38 23/07/2007, Zulauf, John wrote:
>I'm running on an 8-core system with currently only two HVM domains
>(with currently single VCPU each). Both top on Dom0 and xm top, don't
>seem to indicate qemu-dm as the performance bottleneck.  However, I'm
>not sure about roundtrip latency through the xenstore to qemu and back.


I'm not sure where Xenstore comes into this. What I'm suggesting with a
dedicated core for Dom0 is that a DomU can never "compete" with Dom0 for
CPU time on that core. By keeping the DomUs off one core and pinning
Dom0 to that core, you guarantee that the Dom0 core never needs a
"world-switch". A world-switch incurs a lot of memory latency, which
may not show up "anywhere" in top or xm top, but still takes a lot of
time.


>As for the interrupt handling, buffered IO is on a 100ms(?) timer in
>qemu-dm, so we're not looking at a deadlock.  Buffered IO handling
>appears to handle this case as well.
>
>However, if the comport code is in a write/sleep/intr/ tight loop, this
>is going to be tragic w.r.t. performance.  (80bps!)  So it's not a clear
>win, and would need *something* (a new hvm_op to control interrupt
>generation on buffered io ops?) in order to not run the risk of being
>vastly slower.


That could easily lead to a "timeout" if the sender is expecting a 
reply within a set amount of time at a much higher speed, so yes, 
you'd need some sort of "enable/disable" functionality at the very least.

There are a few things I can think of that would make sense to do here:
1. Make a mock-up where IO-writes to ONLY 0x3F8 are buffered for 
(say) up to 16 writes.
2. Add some code to just count the number of reads/writes in a row to 
0x3F8..0x3FF ports[1].
3. Measure the average time (e.g. TSC) for a number of 0x3F8 writes 
and see how much time is spent in communicating from the IOIO handler 
until you get back to HVM-code.
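
A rough sketch of the mock-up in item 1, as a standalone model rather than actual Xen code. The names outb_buffer, flush_buffer and handle_port_access are made up for illustration, and flush_buffer only counts what it would hand over to the device model. Writes to 0x3F8 are queued, and any other access (a read, or a write to another port) flushes the queue first so that ordering is preserved. It does not address the case where the guest sleeps waiting for an IRQ after a write; that would still need a timer or an explicit enable/disable.

```c
#include <stddef.h>
#include <stdint.h>

#define COM1_DATA_PORT 0x3F8   /* the only port that is buffered */
#define MAX_BUFFERED   16      /* "up to 16 writes" from item 1 */

/* Hypothetical buffer, not an existing Xen structure. */
struct outb_buffer {
    uint8_t       data[MAX_BUFFERED];
    size_t        used;           /* bytes currently queued */
    unsigned long flushes;        /* number of non-empty flushes */
    unsigned long bytes_flushed;  /* total bytes handed over */
};

/* Stand-in for passing the queued bytes to the device model. */
static void flush_buffer(struct outb_buffer *b)
{
    if (b->used == 0)
        return;
    b->flushes++;
    b->bytes_flushed += b->used;
    b->used = 0;
}

/*
 * Returns 1 if the access was absorbed by the buffer, 0 if the caller
 * has to forward it synchronously (after the flush done here).
 */
static int handle_port_access(struct outb_buffer *b, uint16_t port,
                              int is_write, uint8_t value)
{
    if (is_write && port == COM1_DATA_PORT) {
        b->data[b->used++] = value;
        if (b->used == MAX_BUFFERED)
            flush_buffer(b);      /* buffer full: hand it over now */
        return 1;
    }

    /* Anything else must see the queued writes first. */
    flush_buffer(b);
    return 0;
}
```

Comparing bytes_flushed / flushes against the run lengths measured in item 2 would show whether 16 is a sensible depth.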

[1] Something like this:

struct {
    int direction;          // direction of the current run, -1 = none yet
    int current_run;        // length of the current run
    int no_runs;            // number of completed runs
    int count[2];           // accesses per direction, completed runs only
    int max_run_length[2];  // longest completed run per direction
} portdata[8] = { {-1}, {-1}, {-1}, {-1}, {-1}, {-1}, {-1}, {-1} };
// direction starts out as a "not valid" value.

// direction must be 0 or 1, since it is used as an array index.
void count_io_action(int portno, int direction)
{
    if ((portno & 0xFFF8) == 0x3F8) {
        int prev;

        portno &= 0x7;   // Get which port it is.
        prev = portdata[portno].direction;
        if (direction == prev)
            portdata[portno].current_run++;
        else {
            // Direction changed: account for the run that just ended.
            if (prev != -1) {
                portdata[portno].count[prev] +=
                    portdata[portno].current_run;
                portdata[portno].no_runs++;
                if (portdata[portno].max_run_length[prev] <
                    portdata[portno].current_run)
                    portdata[portno].max_run_length[prev] =
                        portdata[portno].current_run;
            }
            portdata[portno].current_run = 1;
            portdata[portno].direction = direction;
        }
    }
}

With this you can get the average run length and max run length for 
the different ports. It would tell you which of the ports are most 
often accessed.
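
To get those numbers out, something along these lines might do. It is only a sketch: port_stats repeats the counter fields so that it compiles on its own, and average_run_length and print_port_stats are illustrative names. The two directions are just labelled dir0 and dir1, since which one is the read depends on the caller's encoding. Note that a run still in progress is not counted until the direction changes.

```c
#include <stdio.h>

/* Same fields as the per-port counters, repeated to be self-contained. */
struct port_stats {
    int direction;
    int current_run;
    int no_runs;
    int count[2];
    int max_run_length[2];
};

/* Average length of completed runs, both directions together. */
double average_run_length(const struct port_stats *p)
{
    if (p->no_runs == 0)
        return 0.0;   /* nothing completed yet, avoid dividing by zero */
    return (double)(p->count[0] + p->count[1]) / p->no_runs;
}

/* One line per port, starting at 0x3F8. */
void print_port_stats(const struct port_stats *ports, int nports)
{
    for (int i = 0; i < nports; i++)
        printf("port 0x%X: runs=%d avg=%.2f max_dir0=%d max_dir1=%d\n",
               (unsigned)(0x3F8 + i), ports[i].no_runs,
               average_run_length(&ports[i]),
               ports[i].max_run_length[0], ports[i].max_run_length[1]);
}
```

The port with the largest count[0] + count[1] is the most often accessed one.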

Taking a sample of "all the 0x3Fx port accesses" for a large-ish 
number of accesses could also be beneficial (I think xentrace can do 
that for you).

--
Mats


>So, this is definitely neither obvious, easy, nor a clear win.
>
>Thanks to all.
>
>John
>
>-----Original Message-----
>From: mats petersson [mailto:mats.o.petersson@googlemail.com] On Behalf
>Of Mats Petersson
>Sent: Monday, July 23, 2007 12:01 PM
>To: Zulauf, John; Keir Fraser; Trolle Selander
>Cc: xen-devel@lists.xensource.com
>Subject: RE: [Xen-devel] Buffered IO for IO?
>
>At 19:49 23/07/2007, Zulauf, John wrote:
> >Thanks for the comments.  Frankly, I'm guessing the bulk of the time
> >in the COM port IO is VMEXIT time, and that saving qemu round-trip
> >would be a marginal effect**.
>
>I guess the question of how much of the time is spent where depends
>on the setup. One thing you may want to try, is to ensure that the
>guest domain(s) and Dom0 doesn't share the same CPU(core) - by giving
>Dom0 it's own CPU(core) to run on you eliminate the possibility that
>some other guest is still using Dom0's CPU when you want QEMU to run.
>If you have MANY HVM domains, you may also want to give more than a
>single core to Dom0.
>
> >
> >As for the read's flushing writes, this happens automatically as a
> >result of how the buffered_io page works (and assuming one sticks to
> >this design for IO buffering).  If dir == IOREQ_READ then attempt to
> >buffered the IO request will fail.  Thus, hvm_send_assist_req is
> >invoked.  When qemu catches the "notify" event of the READ it firsts
> >dispatches *all* of the buffered io requests before dispatching the
> >READ. Thus order is preserved and inb are synchronous from the vcpu
> >point of view.
>
>Yes, that's the trivial case. But what about a write to 0x3F8 (send
>data) and code that goes to sleep, waiting for an IRQ to say that the
>data has been sent? There may not be a read of any port in the serial
>port in between - thanks to Trolle for reminding me of this type of
>operation.
>
>--
>Mats
>
> >
> >As for controlling outbound FIFO depth, adding a per range
> >"max_depth" test to the "queue is full" test already in use for mmio
> >buffering would be straight forward.
> >
> >The interrupt issues are more concerning.  A one byte write "window"
> >at 3F8 doesn't seem to have this issue (c.f.)
> >ftp://ftp.phil.uni-sb.de/pub/staff/chris/The_Serial_Port
> >
> >But I agree that proxy device models are not desirable when not
> >performance critical. Regardless, they wouldn't be supported
> >directly though a simple "hvm_buffered_io_intercept" call.  This
> >would be more suited to the approach used in hvm_mmio_intercept to
> >do the lapic emulation.
> >
> >
> >John
> >
> >** For those interested, I'm looking at the performance of using
> >Windbg for Guest domain debug, and the time to do the serial port
> >based initialization of a kernel debug session. Starting a WinDBG
> >session on a Windows guest OS takes several minutes. Any suggestions
> >to optimize that process would be gladly entertained.
> >
> >
> >----------
> >From: Keir Fraser [mailto:keir@xensource.com]
> >Sent: Saturday, July 21, 2007 4:09 AM
> >To: Trolle Selander; Zulauf, John
> >Cc: xen-devel@lists.xensource.com
> >Subject: Re: [Xen-devel] Buffered IO for IO?
> >
> >Yes, it strikes me that this cannot be done safely without providing
> >a set of 'proxy device models' in the hypervisor that know when it
> >is safe to buffer and when the buffers must be flushed, according to
> >native hardware behaviour.
> >
> >  -- Keir
> >
> >On 21/7/07 11:59, "Trolle Selander" <trolle.selander@gmail.com> wrote:
> >Safety would depend on how the emulated device works. For serial
> >ports in particular, it's definitely not safe, since depending on
> >the model of UART emulated, and the settings of the UART control
> >registers, every outb may result in a serial interrupt and UART
> >register changes that will have to be processed before any further
> >io can be done.
> >It's possible that there might be some performance to be gained by
> >"upgrading" the emulated UART to a 16550A or better, and doing
> >buffered IO for the FIFO. Earlier this year I was experimenting with
> >a patch that made the qemu-dm serial emulation into a 16550A with
> >FIFO, but though the patch did fix some compatability issues with
> >software that assumed a 16550A UART in the HVM guest I'm working
> >with, serial performance actually got noticeably _worse_, so I never
> >bothered submitting it. Implementing the FIFO with buffered IO would
> >possibly make it work better, but I don't see how it could be done
> >without moving at least part of the serial device model into the
> >hypervisor, which just strikes me as more trouble than it's worth.
> >
> >/Trolle
> >
> >On 7/21/07, Keir Fraser <keir@xensource.com> wrote:
> >
> >
> >
> >On 20/7/07 22:33, "Zulauf, John" <john.zulauf@intel.com> wrote:
> >
> > > Has anyone experimented with adding Buffered IO support for "out"
> > > instructions?  Currently, the buffered io pages is only used for mmio
> > > writes (and then only to vga space).  It seems quite straight-forward
> > > to add.
> >
> >Is it safe to buffer, and hence arbitrarily delay, any I/O port write?
> >
> >  -- Keir
> >
> >
> >_______________________________________________
> >Xen-devel mailing list
> >Xen-devel@lists.xensource.com
> >http://lists.xensource.com/xen-devel
> >