From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Daniel P. Berrange" Subject: Re: xenconsoled CPU denial of service problem Date: Tue, 29 Aug 2006 16:59:00 +0100 Message-ID: <20060829155900.GC970@redhat.com> References: <20060828180224.GG862@redhat.com> Reply-To: "Daniel P. Berrange" Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="a8Wt8u1KmwUX3Y2C" Return-path: Content-Disposition: inline In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Keir Fraser Cc: xen-devel@lists.xensource.com List-Id: xen-devel@lists.xenproject.org --a8Wt8u1KmwUX3Y2C Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Mon, Aug 28, 2006 at 09:57:22PM +0100, Keir Fraser wrote: > On 28/8/06 7:02 pm, "Daniel P. Berrange" wrote: > > > Does anyone know of any alternative approach to detecting whether the fd > > for the master end of a psuedo-TTY, has a its end slave open / active ? > > Without being able to detect this I don't see any good way to avoid the DOS > > attack in the general case - only other option would be to start dropping > > data once > a certain rate, but this isn't really very desirable because > > there are (debug) scenarios in which you really do want the ability to > > capture all data. > > The protocol has flow control. If we rate-limited xenconsoled consumption of > data from each domU ring, we would limit resource consumption in dom0 and > not lose any data (since the domU will simply buffer it internally). Ah, that's very handy. Looking at the xenconsoled code I think I've worked out how to slow things down - although I'm not entirely sure I'm correctly activating the throttling - basically I'm delaying the calls to the methods xc_evtchn_notify & xc_evtchn_unmask after consuming data - is this the correct approach ? I'm attaching a patch which implements this scheme. In the buffer_append method we keep a cumulative count of how much data we've consumed. Every time if exceeds N bytes, we calculate the period over which those N bytes were received. If this period is < than a threshold (ie fast) then I insert a small delay before notifying & unmasking the event channel. The patch sets - data size 5 kb - period 200 ms - delay 200 ms ie, if we receive 5 kb of data in less than 200 ms, we delay for 200 ms before allowing more. These constants are #define'd for easy tuning, the current values are fairly conservative / low data rate. Any thoughts on what an appropriate data rate to allow from DomU is ? Finaly question - in the handle_ring_read() method is the port returned by xc_evtchn_pending guarrenteed to be same as port we already have a reference to in 'struct domain' local_port field ? If so I can remove the saved reference 'limited_port' I added in 'struct buffer' For testing purposes just pick a DomU and run while /bin/true ; do echo t > /proc/sysrq-trigger ; done Ordinarily this would cause Dom-0/xenconsoled to consume 100%. With this patch applied there is negligable CPU time consumed by Dom-0/xenconsoled. (If you increase delay to 2,000 ms you can visually see that no data is being lost as a result of the delays). Signed-off by: Daniel P. Berrange Regards, Dan. -- |=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=| |=- Perl modules: http://search.cpan.org/~danberr/ -=| |=- Projects: http://freshmeat.net/~danielpb/ -=| |=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=| --a8Wt8u1KmwUX3Y2C Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="xen-console-ratelimit-3.patch" diff -ru xen-unstable-10712/tools/console/daemon/io.c xen-unstable-10712-new/tools/console/daemon/io.c --- xen-unstable-10712/tools/console/daemon/io.c 2006-07-21 13:31:22.000000000 -0400 +++ xen-unstable-10712-new/tools/console/daemon/io.c 2006-08-29 11:35:06.000000000 -0400 @@ -37,6 +37,7 @@ #include #include #include +#include #define MAX(a, b) (((a) > (b)) ? (a) : (b)) #define MIN(a, b) (((a) < (b)) ? (a) : (b)) @@ -44,6 +45,13 @@ /* Each 10 bits takes ~ 3 digits, plus one, plus one for nul terminator. */ #define MAX_STRLEN(x) ((sizeof(x) * CHAR_BIT + CHAR_BIT-1) / 10 * 3 + 2) +/* How much data to allow in a measurement period - bytes */ +#define RATE_LIMIT_AMOUNT (5 * 1024) +/* Duration of the measurement period in - ms */ +#define RATE_LIMIT_PERIOD 200 +/* Delay before allowing a rate limited domain to continue - ms */ +#define RATE_LIMIT_DELAY 200 + struct buffer { char *data; @@ -51,6 +59,10 @@ size_t size; size_t capacity; size_t max_capacity; + size_t cumulative; + long long checkpoint; + int rate_limited; + evtchn_port_t limited_port; }; struct domain @@ -68,12 +80,28 @@ }; static struct domain *dom_head; +static long long int now; + +static int need_ratelimit(struct domain *dom, size_t consumed) +{ + dom->buffer.cumulative += consumed; + /* Rate limit if we consumed N bytes in < M milliseconds */ + if (dom->buffer.cumulative > RATE_LIMIT_AMOUNT) { + dom->buffer.cumulative = 0; + dom->buffer.rate_limited = (now - dom->buffer.checkpoint) < RATE_LIMIT_PERIOD ? 1 : 0; + dom->buffer.checkpoint = now; + + return dom->buffer.rate_limited; + } + return 0; +} static void buffer_append(struct domain *dom) { struct buffer *buffer = &dom->buffer; XENCONS_RING_IDX cons, prod, size; struct xencons_interface *intf = dom->interface; + size_t start = buffer->size; cons = intf->out_cons; prod = intf->out_prod; @@ -98,7 +126,8 @@ mb(); intf->out_cons = cons; - xc_evtchn_notify(dom->xce_handle, dom->local_port); + if (!need_ratelimit(dom, buffer->size - start)) + xc_evtchn_notify(dom->xce_handle, dom->local_port); if (buffer->max_capacity && buffer->size > buffer->max_capacity) { @@ -514,7 +543,10 @@ buffer_append(dom); - (void)xc_evtchn_unmask(dom->xce_handle, port); + if (!dom->buffer.rate_limited) + (void)xc_evtchn_unmask(dom->xce_handle, port); + else + dom->buffer.limited_port = port; } static void handle_xs(void) @@ -545,10 +577,17 @@ { fd_set readfds, writefds; int ret; + struct timeval tv; + + if (gettimeofday(&tv, NULL) < 0) + return; + now = (tv.tv_sec * 1000) + (tv.tv_usec / 1000); do { struct domain *d, *n; int max_fd = -1; + long long future = 0; + struct timeval timeout; FD_ZERO(&readfds); FD_ZERO(&writefds); @@ -557,7 +596,19 @@ max_fd = MAX(xs_fileno(xs), max_fd); for (d = dom_head; d; d = d->next) { - if (d->xce_handle != -1) { + if (d->buffer.rate_limited) { + long long ourfuture = d->buffer.checkpoint + RATE_LIMIT_DELAY; + /* Shouldn't happen, but sanity check at least 1 ms, otherwise + we could get stuck in select() without a timeout, and rate + limited */ + if (ourfuture <= now) + ourfuture = now + 1; + + /* If our timeout is sooner than any other domain's timeout */ + if (!future || + ourfuture < future) + future = ourfuture; + } else if (d->xce_handle != -1) { int evtchn_fd = xc_evtchn_fd(d->xce_handle); FD_SET(evtchn_fd, &readfds); max_fd = MAX(evtchn_fd, max_fd); @@ -573,16 +624,50 @@ } } - ret = select(max_fd + 1, &readfds, &writefds, 0, NULL); + /* If any domain has been rate limited, we need to work + out what timeout to supply to select */ + if (future) { + long long duration = (future - now); + timeout.tv_sec = duration / 1000; + timeout.tv_usec = (duration - (timeout.tv_sec * 1000)) * 1000; + } + + if ((ret = select(max_fd + 1, &readfds, &writefds, 0, future ? &timeout : NULL)) < 0) + if (errno != EINTR) + break; + + /* Update the global timestamp - not strictly neccessary, + but by keeping a cached timestamp we avoid having to + call gettimeofday (3 * num(domains)) on each iteration + which is a worthwhile optimization tradeoff against a little + inaccurracy - it assumes data processing time is negligable*/ + if (gettimeofday(&tv, NULL) < 0) + return; + now = (tv.tv_sec * 1000) + (tv.tv_usec / 1000); if (FD_ISSET(xs_fileno(xs), &readfds)) handle_xs(); for (d = dom_head; d; d = n) { n = d->next; - if (d->xce_handle != -1 && - FD_ISSET(xc_evtchn_fd(d->xce_handle), &readfds)) - handle_ring_read(d); + /* If the domain's rate limited, we need to see if its time to + unblock it, rather than checking the evt channel */ + if (d->buffer.rate_limited) { + /* We asked to wait for N ms, but if we're within 5 ms + of that unblock anyway, because select() often returns + a couple of ms earlier than the requested timeout. This + avoids wastefully select()'ing again with a 1 ms timeout */; + if ((now - d->buffer.checkpoint) > (200-5)) { + d->buffer.rate_limited = 0; + d->buffer.checkpoint = now; + xc_evtchn_notify(d->xce_handle, d->local_port); + (void)xc_evtchn_unmask(d->xce_handle, d->buffer.limited_port); + } + } else { + if (d->xce_handle != -1 && + FD_ISSET(xc_evtchn_fd(d->xce_handle), &readfds)) + handle_ring_read(d); + } if (d->tty_fd != -1) { if (FD_ISSET(d->tty_fd, &readfds)) --a8Wt8u1KmwUX3Y2C Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --a8Wt8u1KmwUX3Y2C--