From: "Daniel P. Berrange" <berrange@redhat.com>
To: Keir Fraser <Keir.Fraser@cl.cam.ac.uk>
Cc: xen-devel@lists.xensource.com
Subject: Re: xenconsoled CPU denial of service problem
Date: Tue, 29 Aug 2006 16:59:00 +0100 [thread overview]
Message-ID: <20060829155900.GC970@redhat.com> (raw)
In-Reply-To: <C1191742.EF0%Keir.Fraser@cl.cam.ac.uk>
[-- Attachment #1: Type: text/plain, Size: 2907 bytes --]
On Mon, Aug 28, 2006 at 09:57:22PM +0100, Keir Fraser wrote:
> On 28/8/06 7:02 pm, "Daniel P. Berrange" <berrange@redhat.com> wrote:
>
> > Does anyone know of any alternative approach to detecting whether the fd
> > for the master end of a psuedo-TTY, has a its end slave open / active ?
> > Without being able to detect this I don't see any good way to avoid the DOS
> > attack in the general case - only other option would be to start dropping
> > data once > a certain rate, but this isn't really very desirable because
> > there are (debug) scenarios in which you really do want the ability to
> > capture all data.
>
> The protocol has flow control. If we rate-limited xenconsoled consumption of
> data from each domU ring, we would limit resource consumption in dom0 and
> not lose any data (since the domU will simply buffer it internally).
Ah, that's very handy. Looking at the xenconsoled code I think I've worked
out how to slow things down - although I'm not entirely sure I'm correctly
activating the throttling - basically I'm delaying the calls to the methods
xc_evtchn_notify & xc_evtchn_unmask after consuming data - is this the
correct approach ?
I'm attaching a patch which implements this scheme. In the buffer_append
method we keep a cumulative count of how much data we've consumed. Every
time if exceeds N bytes, we calculate the period over which those N bytes
were received. If this period is < than a threshold (ie fast) then I insert
a small delay before notifying & unmasking the event channel.
The patch sets
- data size 5 kb
- period 200 ms
- delay 200 ms
ie, if we receive 5 kb of data in less than 200 ms, we delay for 200 ms
before allowing more. These constants are #define'd for easy tuning, the
current values are fairly conservative / low data rate. Any thoughts on
what an appropriate data rate to allow from DomU is ?
Finaly question - in the handle_ring_read() method is the port returned
by xc_evtchn_pending guarrenteed to be same as port we already have a
reference to in 'struct domain' local_port field ? If so I can remove
the saved reference 'limited_port' I added in 'struct buffer'
For testing purposes just pick a DomU and run
while /bin/true ; do echo t > /proc/sysrq-trigger ; done
Ordinarily this would cause Dom-0/xenconsoled to consume 100%. With this
patch applied there is negligable CPU time consumed by Dom-0/xenconsoled.
(If you increase delay to 2,000 ms you can visually see that no data is
being lost as a result of the delays).
Signed-off by: Daniel P. Berrange <berrange@redhat.com>
Regards,
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
[-- Attachment #2: xen-console-ratelimit-3.patch --]
[-- Type: text/plain, Size: 5582 bytes --]
diff -ru xen-unstable-10712/tools/console/daemon/io.c xen-unstable-10712-new/tools/console/daemon/io.c
--- xen-unstable-10712/tools/console/daemon/io.c 2006-07-21 13:31:22.000000000 -0400
+++ xen-unstable-10712-new/tools/console/daemon/io.c 2006-08-29 11:35:06.000000000 -0400
@@ -37,6 +37,7 @@
#include <termios.h>
#include <stdarg.h>
#include <sys/mman.h>
+#include <sys/time.h>
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
@@ -44,6 +45,13 @@
/* Each 10 bits takes ~ 3 digits, plus one, plus one for nul terminator. */
#define MAX_STRLEN(x) ((sizeof(x) * CHAR_BIT + CHAR_BIT-1) / 10 * 3 + 2)
+/* How much data to allow in a measurement period - bytes */
+#define RATE_LIMIT_AMOUNT (5 * 1024)
+/* Duration of the measurement period in - ms */
+#define RATE_LIMIT_PERIOD 200
+/* Delay before allowing a rate limited domain to continue - ms */
+#define RATE_LIMIT_DELAY 200
+
struct buffer
{
char *data;
@@ -51,6 +59,10 @@
size_t size;
size_t capacity;
size_t max_capacity;
+ size_t cumulative;
+ long long checkpoint;
+ int rate_limited;
+ evtchn_port_t limited_port;
};
struct domain
@@ -68,12 +80,28 @@
};
static struct domain *dom_head;
+static long long int now;
+
+static int need_ratelimit(struct domain *dom, size_t consumed)
+{
+ dom->buffer.cumulative += consumed;
+ /* Rate limit if we consumed N bytes in < M milliseconds */
+ if (dom->buffer.cumulative > RATE_LIMIT_AMOUNT) {
+ dom->buffer.cumulative = 0;
+ dom->buffer.rate_limited = (now - dom->buffer.checkpoint) < RATE_LIMIT_PERIOD ? 1 : 0;
+ dom->buffer.checkpoint = now;
+
+ return dom->buffer.rate_limited;
+ }
+ return 0;
+}
static void buffer_append(struct domain *dom)
{
struct buffer *buffer = &dom->buffer;
XENCONS_RING_IDX cons, prod, size;
struct xencons_interface *intf = dom->interface;
+ size_t start = buffer->size;
cons = intf->out_cons;
prod = intf->out_prod;
@@ -98,7 +126,8 @@
mb();
intf->out_cons = cons;
- xc_evtchn_notify(dom->xce_handle, dom->local_port);
+ if (!need_ratelimit(dom, buffer->size - start))
+ xc_evtchn_notify(dom->xce_handle, dom->local_port);
if (buffer->max_capacity &&
buffer->size > buffer->max_capacity) {
@@ -514,7 +543,10 @@
buffer_append(dom);
- (void)xc_evtchn_unmask(dom->xce_handle, port);
+ if (!dom->buffer.rate_limited)
+ (void)xc_evtchn_unmask(dom->xce_handle, port);
+ else
+ dom->buffer.limited_port = port;
}
static void handle_xs(void)
@@ -545,10 +577,17 @@
{
fd_set readfds, writefds;
int ret;
+ struct timeval tv;
+
+ if (gettimeofday(&tv, NULL) < 0)
+ return;
+ now = (tv.tv_sec * 1000) + (tv.tv_usec / 1000);
do {
struct domain *d, *n;
int max_fd = -1;
+ long long future = 0;
+ struct timeval timeout;
FD_ZERO(&readfds);
FD_ZERO(&writefds);
@@ -557,7 +596,19 @@
max_fd = MAX(xs_fileno(xs), max_fd);
for (d = dom_head; d; d = d->next) {
- if (d->xce_handle != -1) {
+ if (d->buffer.rate_limited) {
+ long long ourfuture = d->buffer.checkpoint + RATE_LIMIT_DELAY;
+ /* Shouldn't happen, but sanity check at least 1 ms, otherwise
+ we could get stuck in select() without a timeout, and rate
+ limited */
+ if (ourfuture <= now)
+ ourfuture = now + 1;
+
+ /* If our timeout is sooner than any other domain's timeout */
+ if (!future ||
+ ourfuture < future)
+ future = ourfuture;
+ } else if (d->xce_handle != -1) {
int evtchn_fd = xc_evtchn_fd(d->xce_handle);
FD_SET(evtchn_fd, &readfds);
max_fd = MAX(evtchn_fd, max_fd);
@@ -573,16 +624,50 @@
}
}
- ret = select(max_fd + 1, &readfds, &writefds, 0, NULL);
+ /* If any domain has been rate limited, we need to work
+ out what timeout to supply to select */
+ if (future) {
+ long long duration = (future - now);
+ timeout.tv_sec = duration / 1000;
+ timeout.tv_usec = (duration - (timeout.tv_sec * 1000)) * 1000;
+ }
+
+ if ((ret = select(max_fd + 1, &readfds, &writefds, 0, future ? &timeout : NULL)) < 0)
+ if (errno != EINTR)
+ break;
+
+ /* Update the global timestamp - not strictly neccessary,
+ but by keeping a cached timestamp we avoid having to
+ call gettimeofday (3 * num(domains)) on each iteration
+ which is a worthwhile optimization tradeoff against a little
+ inaccurracy - it assumes data processing time is negligable*/
+ if (gettimeofday(&tv, NULL) < 0)
+ return;
+ now = (tv.tv_sec * 1000) + (tv.tv_usec / 1000);
if (FD_ISSET(xs_fileno(xs), &readfds))
handle_xs();
for (d = dom_head; d; d = n) {
n = d->next;
- if (d->xce_handle != -1 &&
- FD_ISSET(xc_evtchn_fd(d->xce_handle), &readfds))
- handle_ring_read(d);
+ /* If the domain's rate limited, we need to see if its time to
+ unblock it, rather than checking the evt channel */
+ if (d->buffer.rate_limited) {
+ /* We asked to wait for N ms, but if we're within 5 ms
+ of that unblock anyway, because select() often returns
+ a couple of ms earlier than the requested timeout. This
+ avoids wastefully select()'ing again with a 1 ms timeout */;
+ if ((now - d->buffer.checkpoint) > (200-5)) {
+ d->buffer.rate_limited = 0;
+ d->buffer.checkpoint = now;
+ xc_evtchn_notify(d->xce_handle, d->local_port);
+ (void)xc_evtchn_unmask(d->xce_handle, d->buffer.limited_port);
+ }
+ } else {
+ if (d->xce_handle != -1 &&
+ FD_ISSET(xc_evtchn_fd(d->xce_handle), &readfds))
+ handle_ring_read(d);
+ }
if (d->tty_fd != -1) {
if (FD_ISSET(d->tty_fd, &readfds))
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
next prev parent reply other threads:[~2006-08-29 15:59 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-08-28 18:02 xenconsoled CPU denial of service problem Daniel P. Berrange
2006-08-28 20:57 ` Keir Fraser
2006-08-29 15:59 ` Daniel P. Berrange [this message]
2006-08-30 18:13 ` Keir Fraser
2006-09-11 14:19 ` Daniel P. Berrange
2006-10-04 13:50 ` Daniel P. Berrange
2006-10-04 14:19 ` Keir Fraser
2006-10-04 16:49 ` Anthony Liguori
2006-10-04 17:19 ` Daniel P. Berrange
2006-10-04 17:42 ` Keir Fraser
2006-10-04 17:52 ` Anthony Liguori
2006-10-04 18:10 ` Daniel P. Berrange
2006-10-04 18:19 ` Anthony Liguori
2006-08-30 18:13 ` Keir Fraser
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060829155900.GC970@redhat.com \
--to=berrange@redhat.com \
--cc=Keir.Fraser@cl.cam.ac.uk \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.