* [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 20:11 ` Anthony Liguori
[not found] ` <m2n.s.1KWGbo-001Nr9@chiark.greenend.org.uk>
2008-08-21 16:27 ` [Qemu-devel] [PATCH 02/13] add DisplayState->idle Gerd Hoffmann
` (13 subsequent siblings)
14 siblings, 2 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
This patch makes qemu handle signals better. It sets the request_shutdown
flag, making the main_loop exit and qemu taking the usual exit route, with
atexit handlers being called and so on, instead of qemu just being killed
by the signal.
To avoid calling vm_start() from the signal handler main_loop() got an
additional check so qemu_system_shutdown_request() works even when the
vm is in stopped state.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
curses.c | 2 --
sdl.c | 9 +--------
vl.c | 27 +++++++++++++++++++++++++++
3 files changed, 28 insertions(+), 10 deletions(-)
diff --git a/curses.c b/curses.c
index d09eff2..96b6e49 100644
--- a/curses.c
+++ b/curses.c
@@ -350,8 +350,6 @@ void curses_display_init(DisplayState *ds, int full_screen)
atexit(curses_atexit);
#ifndef _WIN32
- signal(SIGINT, SIG_DFL);
- signal(SIGQUIT, SIG_DFL);
#if defined(SIGWINCH) && defined(KEY_RESIZE)
/* some curses implementations provide a handler, but we
* want to be sure this is handled regardless of the library */
diff --git a/sdl.c b/sdl.c
index 0edc4a0..84a9d6d 100644
--- a/sdl.c
+++ b/sdl.c
@@ -476,10 +476,8 @@ static void sdl_refresh(DisplayState *ds)
sdl_process_key(&ev->key);
break;
case SDL_QUIT:
- if (!no_quit) {
+ if (!no_quit)
qemu_system_shutdown_request();
- vm_start(); /* In case we're paused */
- }
break;
case SDL_MOUSEMOTION:
if (gui_grab || kbd_mouse_is_absolute() ||
@@ -636,11 +634,6 @@ void sdl_display_init(DisplayState *ds, int full_screen, int no_frame)
fprintf(stderr, "Could not initialize SDL - exiting\n");
exit(1);
}
-#ifndef _WIN32
- /* NOTE: we still want Ctrl-C to work, so we undo the SDL redirections */
- signal(SIGINT, SIG_DFL);
- signal(SIGQUIT, SIG_DFL);
-#endif
ds->dpy_update = sdl_update;
ds->dpy_resize = sdl_resize;
diff --git a/vl.c b/vl.c
index fc5019f..071ee0c 100644
--- a/vl.c
+++ b/vl.c
@@ -7619,6 +7619,8 @@ static int main_loop(void)
timeout = 0;
}
} else {
+ if (shutdown_requested)
+ break;
timeout = 10;
}
#ifdef CONFIG_PROFILER
@@ -8183,6 +8185,26 @@ static BOOL WINAPI qemu_ctrl_handler(DWORD type)
#define MAX_NET_CLIENTS 32
+#ifndef _WIN32
+
+static void termsig_handler(int signal)
+{
+ qemu_system_shutdown_request();
+}
+
+void termsig_setup(void)
+{
+ struct sigaction act;
+
+ memset(&act, 0, sizeof(act));
+ act.sa_handler = termsig_handler;
+ sigaction(SIGINT, &act, NULL);
+ sigaction(SIGHUP, &act, NULL);
+ sigaction(SIGTERM, &act, NULL);
+}
+
+#endif
+
int main(int argc, char **argv)
{
#ifdef CONFIG_GDBSTUB
@@ -9071,6 +9093,11 @@ int main(int argc, char **argv)
#endif
}
+#ifndef _WIN32
+ /* must be after terminal init, SDL library changes signal handlers */
+ termsig_setup();
+#endif
+
/* Maintain compatibility with multiple stdio monitors */
if (!strcmp(monitor_device,"stdio")) {
for (i = 0; i < MAX_SERIAL_PORTS; i++) {
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-21 16:27 ` [Qemu-devel] [PATCH 01/13] Handle terminating signals Gerd Hoffmann
@ 2008-08-21 20:11 ` Anthony Liguori
[not found] ` <m2n.s.1KWGbo-001Nr9@chiark.greenend.org.uk>
1 sibling, 0 replies; 117+ messages in thread
From: Anthony Liguori @ 2008-08-21 20:11 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Gerd Hoffmann wrote:
> This patch makes qemu handle signals better. It sets the request_shutdown
> flag, making the main_loop exit and qemu taking the usual exit route, with
> atexit handlers being called and so on, instead of qemu just being killed
> by the signal.
>
> To avoid calling vm_start() from the signal handler main_loop() got an
> additional check so qemu_system_shutdown_request() works even when the
> vm is in stopped state.
>
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Applied. Thanks.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 117+ messages in thread
[parent not found: <m2n.s.1KWGbo-001Nr9@chiark.greenend.org.uk>]
* Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
[not found] ` <m2n.s.1KWGbo-001Nr9@chiark.greenend.org.uk>
@ 2008-08-26 10:44 ` Ian Jackson
2008-08-26 12:16 ` Gerd Hoffmann
0 siblings, 1 reply; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 10:44 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Anthony Liguori writes ("Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> Gerd Hoffmann wrote:
> > This patch makes qemu handle signals better. It sets the request_shutdown
> > flag, making the main_loop exit and qemu taking the usual exit route, with
> > atexit handlers being called and so on, instead of qemu just being killed
> > by the signal.
>
> Applied. Thanks.
This seems sensible to me.
I have a patch in our tree which changes the signal delivery mechanism
for aio completion to use the well known pipe-to-self trick. I had to
do this because there were some circumstances where IO completion
signals were being lost due to races.
I was planning to send this upstream. Should I enhance my change to
do the same to the termination signals too ?
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 10:44 ` Ian Jackson
@ 2008-08-26 12:16 ` Gerd Hoffmann
2008-08-26 12:36 ` Ian Jackson
0 siblings, 1 reply; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-26 12:16 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel
Ian Jackson wrote:
> I have a patch in our tree which changes the signal delivery mechanism
> for aio completion to use the well known pipe-to-self trick. I had to
> do this because there were some circumstances where IO completion
> signals were being lost due to races.
>
> I was planning to send this upstream. Should I enhance my change to
> do the same to the termination signals too ?
What this would be useful for? The termination signals usualy one-shot
anyway, so I don't see any advantage in doing the pipe-to-self trick.
cheers,
Gerd
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 12:16 ` Gerd Hoffmann
@ 2008-08-26 12:36 ` Ian Jackson
2008-08-26 12:58 ` Gerd Hoffmann
2008-08-26 13:17 ` Anthony Liguori
0 siblings, 2 replies; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 12:36 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: xen-devel, qemu-devel
Gerd Hoffmann writes ("Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> What this would be useful for? The termination signals usualy one-shot
> anyway, so I don't see any advantage in doing the pipe-to-self trick.
The point is that then they can interrupt select().
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 12:36 ` Ian Jackson
@ 2008-08-26 12:58 ` Gerd Hoffmann
2008-08-26 17:55 ` Jamie Lokier
2008-08-26 13:17 ` Anthony Liguori
1 sibling, 1 reply; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-26 12:58 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel
Ian Jackson wrote:
> Gerd Hoffmann writes ("Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
>> What this would be useful for? The termination signals usualy one-shot
>> anyway, so I don't see any advantage in doing the pipe-to-self trick.
>
> The point is that then they can interrupt select().
Sure, basically any signal can interrupt any system call. I still don't
see the point you are trying to make.
cheers,
Gerd
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 12:58 ` Gerd Hoffmann
@ 2008-08-26 17:55 ` Jamie Lokier
0 siblings, 0 replies; 117+ messages in thread
From: Jamie Lokier @ 2008-08-26 17:55 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Ian Jackson
Gerd Hoffmann wrote:
> Ian Jackson wrote:
> > Gerd Hoffmann writes ("Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> >> What this would be useful for? The termination signals usualy one-shot
> >> anyway, so I don't see any advantage in doing the pipe-to-self trick.
> >
> > The point is that then they can interrupt select().
>
> Sure, basically any signal can interrupt any system call. I still don't
> see the point you are trying to make.
Signals interrupting system calls has nothing to do with it.
The point is this sequence:
1. Check for signals / flags from signals... nope.
[[ 2. Signal arrives! Sets flag. ]]
3. Call select().
The signal occurs _after_ the main loop checks for wakeup conditons,
but _before_ select() is called. It doesn't interrupt select()
because that hasn't started yet. Result: select() blocks for a long time.
In the fixed version, the signal handler writes to the self-pipe, and
select() is called in the main loop, and returns immediately because
the self-pipe is ready to read. (The data in it is ignored, just read
to empty it.)
The other way this can be done is for the signal handler to call
siglongjmp(). But that's slower (every call to select() must call
sigsetjmp() which is a bit slow), and crashes some OSes. The
self-pipe is well known for being reliable.
-- Jamie
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 12:36 ` Ian Jackson
2008-08-26 12:58 ` Gerd Hoffmann
@ 2008-08-26 13:17 ` Anthony Liguori
2008-08-26 13:38 ` [Xen-devel] " Ian Jackson
[not found] ` <m2n.s.1KXz4u-002Qvq@chiark.greenend.org.uk>
1 sibling, 2 replies; 117+ messages in thread
From: Anthony Liguori @ 2008-08-26 13:17 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Ian Jackson wrote:
> Gerd Hoffmann writes ("Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
>
>> What this would be useful for? The termination signals usualy one-shot
>> anyway, so I don't see any advantage in doing the pipe-to-self trick.
>>
>
> The point is that then they can interrupt select().
>
The race I know of is that you may get an aio signal completion before
select but after you've already qemu_aio_poll()'d. In practice, we only
sleep for 10ms at a time in select() so the race is handled by that. If
we wanted to increase the amount of time we slept, we would have to
handle this race.
In KVM, we sleep for 1s in select() and use signalfd() to receive the
aio notifications. For older hosts, we emulate signalfd using a thread
and the pipe-to-self trick.
Regards,
Anthony Liguori
> Ian.
>
>
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 13:17 ` Anthony Liguori
@ 2008-08-26 13:38 ` Ian Jackson
2008-08-26 14:52 ` Avi Kivity
` (2 more replies)
[not found] ` <m2n.s.1KXz4u-002Qvq@chiark.greenend.org.uk>
1 sibling, 3 replies; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 13:38 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Anthony Liguori writes ("[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> The race I know of is that you may get an aio signal completion before
> select but after you've already qemu_aio_poll()'d. In practice, we only
> sleep for 10ms at a time in select() so the race is handled by that. If
> we wanted to increase the amount of time we slept, we would have to
> handle this race.
Yes. And, 10ms is too long anyway for reasonable performance. During
my merge with upstream I found that the qemu aio functionality (which
was done quite differently to the old xen ioemu) caused a severe
performance regression under some conditions because of this race.
> In KVM, we sleep for 1s in select() and use signalfd() to receive the
> aio notifications. For older hosts, we emulate signalfd using a thread
> and the pipe-to-self trick.
Why does it need a thread ? You can just write to the pipe in the
signal handler. I'll post my code.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 13:38 ` [Xen-devel] " Ian Jackson
@ 2008-08-26 14:52 ` Avi Kivity
2008-08-26 17:49 ` Jamie Lokier
2008-08-26 15:04 ` Anthony Liguori
[not found] ` <m2n.s.1KXzwZ-002RA9@chiark.greenend.org.uk>
2 siblings, 1 reply; 117+ messages in thread
From: Avi Kivity @ 2008-08-26 14:52 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Ian Jackson wrote:
>> In KVM, we sleep for 1s in select() and use signalfd() to receive the
>> aio notifications. For older hosts, we emulate signalfd using a thread
>> and the pipe-to-self trick.
>>
>
> Why does it need a thread ? You can just write to the pipe in the
> signal handler. I'll post my code.
>
Not sure that it matters, but the semantics are slightly different:
with a thread you don't get EINTR in random syscalls as the signal
thread is the only one that has the signals unblocked.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 14:52 ` Avi Kivity
@ 2008-08-26 17:49 ` Jamie Lokier
0 siblings, 0 replies; 117+ messages in thread
From: Jamie Lokier @ 2008-08-26 17:49 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Avi Kivity wrote:
> Ian Jackson wrote:
> >>In KVM, we sleep for 1s in select() and use signalfd() to receive the
> >>aio notifications. For older hosts, we emulate signalfd using a thread
> >>and the pipe-to-self trick.
> >>
> >
> >Why does it need a thread ? You can just write to the pipe in the
> >signal handler. I'll post my code.
> >
>
> Not sure that it matters, but the semantics are slightly different:
> with a thread you don't get EINTR in random syscalls as the signal
> thread is the only one that has the signals unblocked.
You shouldn't be getting spurious EINTRs if the signal handlers are
registered with SA_RESTART. I doubt if any useful host which has
threads does not have SA_RESTART.
-- Jamie
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 13:38 ` [Xen-devel] " Ian Jackson
2008-08-26 14:52 ` Avi Kivity
@ 2008-08-26 15:04 ` Anthony Liguori
2008-08-26 15:08 ` Ian Jackson
[not found] ` <m2n.s.1KXzwZ-002RA9@chiark.greenend.org.uk>
2 siblings, 1 reply; 117+ messages in thread
From: Anthony Liguori @ 2008-08-26 15:04 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Ian Jackson wrote:
> Anthony Liguori writes ("[Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
>
>> The race I know of is that you may get an aio signal completion before
>> select but after you've already qemu_aio_poll()'d. In practice, we only
>> sleep for 10ms at a time in select() so the race is handled by that. If
>> we wanted to increase the amount of time we slept, we would have to
>> handle this race.
>>
>
> Yes. And, 10ms is too long anyway for reasonable performance. During
> my merge with upstream I found that the qemu aio functionality (which
> was done quite differently to the old xen ioemu) caused a severe
> performance regression under some conditions because of this race.
>
Yeah, noticed that too, especially with qcow2.
>> In KVM, we sleep for 1s in select() and use signalfd() to receive the
>> aio notifications. For older hosts, we emulate signalfd using a thread
>> and the pipe-to-self trick.
>>
>
> Why does it need a thread ? You can just write to the pipe in the
> signal handler. I'll post my code.
>
It's a little less perfect. Your signal handler can write() to the pipe
but what happens if you get EAGAIN?
So what we do in KVM is use sigwait() within a separate thread. We
don't set O_NONBLOCK so the thread blocks if the pipe fills up which is
exactly the semantics you would want.
Below is our implementation. I'll queue up to push this change into
QEMU after I finish with the migration patches.
Regards,
Anthony Liguori
#include <sys/syscall.h>
#include <pthread.h>
struct sigfd_compat_info
{
sigset_t mask;
int fd;
};
static void *sigwait_compat(void *opaque)
{
struct sigfd_compat_info *info = opaque;
int err;
sigset_t all;
sigfillset(&all);
sigprocmask(SIG_BLOCK, &all, NULL);
do {
siginfo_t siginfo;
err = sigwaitinfo(&info->mask, &siginfo);
if (err == -1 && errno == EINTR) {
err = 0;
continue;
}
if (err > 0) {
char buffer[128];
size_t offset = 0;
memcpy(buffer, &err, sizeof(err));
while (offset < sizeof(buffer)) {
ssize_t len;
len = write(info->fd, buffer + offset,
sizeof(buffer) - offset);
if (len == -1 && errno == EINTR)
continue;
if (len <= 0) {
err = -1;
break;
}
offset += len;
}
}
} while (err >= 0);
return NULL;
}
static int kvm_signalfd_compat(const sigset_t *mask)
{
pthread_attr_t attr;
pthread_t tid;
struct sigfd_compat_info *info;
int fds[2];
info = malloc(sizeof(*info));
if (info == NULL) {
errno = ENOMEM;
return -1;
}
if (pipe(fds) == -1) {
free(info);
return -1;
}
memcpy(&info->mask, mask, sizeof(*mask));
info->fd = fds[1];
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
pthread_create(&tid, &attr, sigwait_compat, info);
pthread_attr_destroy(&attr);
return fds[0];
}
int kvm_signalfd(const sigset_t *mask)
{
#if defined(SYS_signalfd)
int ret;
ret = syscall(SYS_signalfd, -1, mask, _NSIG / 8);
if (!(ret == -1 && errno == ENOSYS))
return ret;
#endif
return kvm_signalfd_compat(mask);
}
> Ian.
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 15:04 ` Anthony Liguori
@ 2008-08-26 15:08 ` Ian Jackson
2008-08-26 15:20 ` Anthony Liguori
2008-08-26 15:21 ` Anthony Liguori
0 siblings, 2 replies; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 15:08 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Anthony Liguori writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> It's a little less perfect. Your signal handler can write() to the pipe
> but what happens if you get EAGAIN?
Then the pipe already has something in it which is fine. The purpose
of writing to it is to make sure that it is not empty. We don't count
the bytes.
> Below is our implementation. I'll queue up to push this change into
> QEMU after I finish with the migration patches.
Please consider my patch instead, which doesn't depend on threads and
is a lot smaller.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 15:08 ` Ian Jackson
@ 2008-08-26 15:20 ` Anthony Liguori
2008-08-26 15:28 ` Ian Jackson
2008-08-26 15:21 ` Anthony Liguori
1 sibling, 1 reply; 117+ messages in thread
From: Anthony Liguori @ 2008-08-26 15:20 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Ian Jackson wrote:
> Anthony Liguori writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
>
> Then the pipe already has something in it which is fine. The purpose
> of writing to it is to make sure that it is not empty. We don't count
> the bytes.
>
Then you need a pipe per-signal and you can't accurately simulate
threadfd() (since you can't communicate siginfo).
> Please consider my patch instead, which doesn't depend on threads and
> is a lot smaller.
>
I don't see threads as a problem. Are you concerned about mini-OS?
Regards,
Anthony Liguori
> Ian.
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 15:20 ` Anthony Liguori
@ 2008-08-26 15:28 ` Ian Jackson
2008-08-26 15:36 ` Anthony Liguori
2008-08-26 15:36 ` Samuel Thibault
0 siblings, 2 replies; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 15:28 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Anthony Liguori writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> Then you need a pipe per-signal and you can't accurately simulate
> threadfd() (since you can't communicate siginfo).
No, you don't need a pipe per signal. You only need multiple pipes if
you have multiple different event loops which are each capable of
handling only a subset of the signals, _and_ you're unwilling to use
an atomic flag variable (eg, cpu_interrupt) or its equivalent.
But we already have cpu_interrupt, and of course the aio completion
system has its own recording of what's going on. So that does not
apply to qemu.
All that's needed is a reliable, race-free, way of avoiding spuriously
blocking in a syscall when an event occurs between checking the state
variables (aio_error, cpu interrupt check) and the call to select().
One fd is sufficient for that.
Just to make that concrete, even if we extended my patch to use its
mechanism for SIGINT et al I don't think a second pipe would be
needed.
> I don't see threads as a problem. Are you concerned about mini-OS?
Minios certainly doesn't currently have any threads and it would
probably be a severe pain to introduce them. That's just one example
of a portability problem.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 15:28 ` Ian Jackson
@ 2008-08-26 15:36 ` Anthony Liguori
2008-08-26 17:47 ` Jamie Lokier
2008-08-26 15:36 ` Samuel Thibault
1 sibling, 1 reply; 117+ messages in thread
From: Anthony Liguori @ 2008-08-26 15:36 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Ian Jackson wrote:
> Anthony Liguori writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
>
>> Then you need a pipe per-signal and you can't accurately simulate
>> threadfd() (since you can't communicate siginfo).
>>
>
> No, you don't need a pipe per signal. You only need multiple pipes if
> you have multiple different event loops which are each capable of
> handling only a subset of the signals, _and_ you're unwilling to use
> an atomic flag variable (eg, cpu_interrupt) or its equivalent.
>
> But we already have cpu_interrupt, and of course the aio completion
> system has its own recording of what's going on. So that does not
> apply to qemu.
>
> All that's needed is a reliable, race-free, way of avoiding spuriously
> blocking in a syscall when an event occurs between checking the state
> variables (aio_error, cpu interrupt check) and the call to select().
> One fd is sufficient for that.
>
> Just to make that concrete, even if we extended my patch to use its
> mechanism for SIGINT et al I don't think a second pipe would be
> needed.
>
In KVM, we do use the signal number to determine action. We could use
globals but since we're multi-threaded, that gets pretty nasty. The
same would apply to a threaded QEMU.
>> I don't see threads as a problem. Are you concerned about mini-OS?
>>
>
> Minios certainly doesn't currently have any threads and it would
> probably be a severe pain to introduce them. That's just one example
> of a portability problem.
>
We're definitely not going to avoid threads forever in QEMU. KVM
requires threads to support multiple VCPUs. Threads are also needed to
support true SMP with TCG.
And right now, implementing a thread pool is the only sane way to get
reasonable disk IO in userspace.
I share your concerns about threading, which is why we have to use them
in a very careful way. My signalfd() patch uses them in a very isolated
way that is pretty easily verified.
You could always add proper signalfd() support to minios. You can also
certainly implement a signalfd() emulation that uses pipe(). signalfd()
is really the right solution to this problem (and it doesn't require
threads by default).
Regards,
Anthony Liguori
> Ian.
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 15:36 ` Anthony Liguori
@ 2008-08-26 17:47 ` Jamie Lokier
2008-08-26 18:03 ` Jamie Lokier
` (2 more replies)
0 siblings, 3 replies; 117+ messages in thread
From: Jamie Lokier @ 2008-08-26 17:47 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Ian Jackson, Gerd Hoffmann
Anthony Liguori wrote:
> In KVM, we do use the signal number to determine action. We could use
> globals but since we're multi-threaded, that gets pretty nasty. The
> same would apply to a threaded QEMU.
Often one pipe per select-caller is enough, with global state to
distinguish events. For signalfd() emulation, that's an array of
sig_atomic_t signal_was_hit[NSIG];
(Or just `char' - in pthreads that should be safe to write, right?
One bit is enough storage, but not thread safe.)
However, if you depend on the signal number _combined_ with which
thread receives the signal - then you're in trouble.
pthread_self() should not be called from a signal handler; it's safety
is not portable. Thread-local storage (__thread int x) is not
portable and might not be signal safe either. All you've got left is
gettid(), and that's not portable either.
"signalfd helper thread writing to pipe" wouldn't solve that either.
You'd need one _per_ thread that would have called signalfd() (if you
had it) - in order to distinguish recipients. But that helper's
handler can't tell which thread it is, and therefore which pipe to write to!
But you did say you're using signalfd(). And signalfd() doesn't
distinguish thread recipients (I'm surprised it doesn't). That whole
meander was just an interesting but irrelevant puzzle.
Leading to: why would (real) signals being used to collect AIO events
anyway, if you don't have signalfd()? If you've got a helper thread,
just call aio_suspend() in the helper thread. Then you can just
deliver the AIO completion result to the relevant data structure or
even move it off the waiting queue (pthread_mutex_lock is your
friend), then wake the main waiting thread with byte written to its
pipe.
-- Jamie
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 17:47 ` Jamie Lokier
@ 2008-08-26 18:03 ` Jamie Lokier
2008-08-26 18:50 ` Anthony Liguori
2008-08-27 9:46 ` Ian Jackson
2 siblings, 0 replies; 117+ messages in thread
From: Jamie Lokier @ 2008-08-26 18:03 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Ian Jackson, Gerd Hoffmann
Jamie Lokier wrote:
> But you did say you're using signalfd(). And signalfd() doesn't
> distinguish thread recipients (I'm surprised it doesn't). That whole
> meander was just an interesting but irrelevant puzzle.
Gah, ignore that paragraph please. Blind spot in manual reading.
-- Jamie
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 17:47 ` Jamie Lokier
2008-08-26 18:03 ` Jamie Lokier
@ 2008-08-26 18:50 ` Anthony Liguori
2008-08-27 9:46 ` Ian Jackson
2 siblings, 0 replies; 117+ messages in thread
From: Anthony Liguori @ 2008-08-26 18:50 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Ian Jackson, Gerd Hoffmann
Jamie Lokier wrote:
> Leading to: why would (real) signals being used to collect AIO events
> anyway, if you don't have signalfd()? If you've got a helper thread,
> just call aio_suspend() in the helper thread.
There is no helper thread when signalfd() is available. The helper
thread thing is only for backwards compatibility or for unices that
don't support signalfd. Plus, there are a lot of places that we use
signals currently.
> Then you can just
> deliver the AIO completion result to the relevant data structure or
> even move it off the waiting queue (pthread_mutex_lock is your
> friend), then wake the main waiting thread with byte written to its
> pipe.
>
It's the queuing with threads where things start getting complicated and
error prone.
Regards,
Anthony Liguori
> -- Jamie
>
>
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 17:47 ` Jamie Lokier
2008-08-26 18:03 ` Jamie Lokier
2008-08-26 18:50 ` Anthony Liguori
@ 2008-08-27 9:46 ` Ian Jackson
2008-08-27 16:27 ` Jamie Lokier
2 siblings, 1 reply; 117+ messages in thread
From: Ian Jackson @ 2008-08-27 9:46 UTC (permalink / raw)
To: Jamie Lokier; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Jamie Lokier writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> Leading to: why would (real) signals being used to collect AIO events
> anyway, if you don't have signalfd()? If you've got a helper thread,
If you use aio, glibc has a tendency to emulate it with thread(s).
The helper thread I was referring to in my commit message is the
glibc-created helper thread. glibc appears (in strace) to do IO in
this thread and then raises the signal you asked for (SIGUSR2 in
qemu's case).
In some of my tests with qemu that SIGUSR2 was then immediately
delivered on the helper thread even though the main program had it
blocked. I saw this in strace in a real qemu instance but I wasn't
able to reproduce it in a trivial test program. The result is that we
lose the race that sigwait() is supposed to avoid. Hence my
replacement of sigwait with the self-pipe trick.
This is not to be confused with Anthony Ligouri's helper thread, which
is doing the same job as my SIGUSR2 handler. In Anthony's setup there
will be two helper threads (assuming no signalfd and no kernel aio):
one invented by glibc to emulate aio, and one invented by Anthony to
emulate signalfd. glibc will do the aio in its helper thread, raise
SIGUSR2, which will wake up Anthony's thread, which will write to the
pipe and thus wake up the main program.
Ah, plumbing.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-27 9:46 ` Ian Jackson
@ 2008-08-27 16:27 ` Jamie Lokier
2008-08-27 16:35 ` Ian Jackson
0 siblings, 1 reply; 117+ messages in thread
From: Jamie Lokier @ 2008-08-27 16:27 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Ian Jackson wrote:
> Jamie Lokier writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> > Leading to: why would (real) signals being used to collect AIO events
> > anyway, if you don't have signalfd()? If you've got a helper thread,
>
> If you use aio, glibc has a tendency to emulate it with thread(s).
> The helper thread I was referring to in my commit message is the
> glibc-created helper thread. glibc appears (in strace) to do IO in
> this thread and then raises the signal you asked for (SIGUSR2 in
> qemu's case).
That observation implies Anthony's helper thread will not always
receive the SIGUSR2 signals - Glibc's thread may receive them instead.
So when the signalfd-emulation signal handler writes to the pipe, it
might block Glibc's AIO thread. Is that ok, Anthony?
-- Jamie
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-27 16:27 ` Jamie Lokier
@ 2008-08-27 16:35 ` Ian Jackson
2008-08-27 17:55 ` Anthony Liguori
2008-08-27 18:41 ` Jamie Lokier
0 siblings, 2 replies; 117+ messages in thread
From: Ian Jackson @ 2008-08-27 16:35 UTC (permalink / raw)
To: Jamie Lokier; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Jamie Lokier writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> Ian Jackson wrote:
> > If you use aio, glibc has a tendency to emulate it with thread(s).
> > The helper thread I was referring to in my commit message is the
> > glibc-created helper thread. glibc appears (in strace) to do IO in
> > this thread and then raises the signal you asked for (SIGUSR2 in
> > qemu's case).
>
> That observation implies Anthony's helper thread will not always
> receive the SIGUSR2 signals - Glibc's thread may receive them instead.
Yes. Err, I'm not sure what effect that will have in Anthony's patch
but it probably wouldn't be good.
As I say I haven't managed to reproduce this other than in a full
qemu-dm tree (from qemu-xen).
So it may be that it is because of something I did or something the
qemu-xen code does - although it seems hard to imagine how it could
break the signal mask in the helper thread given that that thread runs
entirely glibc code, except when a signal is being delivered (and of
course the signal mask is saved and restored for signal delivery).
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-27 16:35 ` Ian Jackson
@ 2008-08-27 17:55 ` Anthony Liguori
2008-08-27 18:41 ` Jamie Lokier
1 sibling, 0 replies; 117+ messages in thread
From: Anthony Liguori @ 2008-08-27 17:55 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Ian Jackson wrote:
> Jamie Lokier writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
>
> Yes. Err, I'm not sure what effect that will have in Anthony's patch
> but it probably wouldn't be good.
>
> As I say I haven't managed to reproduce this other than in a full
> qemu-dm tree (from qemu-xen).
>
> So it may be that it is because of something I did or something the
> qemu-xen code does - although it seems hard to imagine how it could
> break the signal mask in the helper thread given that that thread runs
> entirely glibc code, except when a signal is being delivered (and of
> course the signal mask is saved and restored for signal delivery).
>
Perhaps the signal is being consumed in one of qemu-dm's other threads
mistakenly?
What you're describing would seem like a serious bug. For a while, I
was testing a version of QEMU/KVM that slept forever in select() and I
never saw any instances of the signal not getting delivered.
Regards,
Anthony Liguori
> Ian.
>
>
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-27 16:35 ` Ian Jackson
2008-08-27 17:55 ` Anthony Liguori
@ 2008-08-27 18:41 ` Jamie Lokier
2008-08-28 9:34 ` Ian Jackson
1 sibling, 1 reply; 117+ messages in thread
From: Jamie Lokier @ 2008-08-27 18:41 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Ian Jackson wrote:
> > > If you use aio, glibc has a tendency to emulate it with thread(s).
> > > The helper thread I was referring to in my commit message is the
> > > glibc-created helper thread. glibc appears (in strace) to do IO in
> > > this thread and then raises the signal you asked for (SIGUSR2 in
> > > qemu's case).
> >
> > That observation implies Anthony's helper thread will not always
> > receive the SIGUSR2 signals - Glibc's thread may receive them instead.
>
> Yes. Err, I'm not sure what effect that will have in Anthony's patch
> but it probably wouldn't be good.
>
> As I say I haven't managed to reproduce this other than in a full
> qemu-dm tree (from qemu-xen).
>
> So it may be that it is because of something I did or something the
> qemu-xen code does - although it seems hard to imagine how it could
> break the signal mask in the helper thread given that that thread runs
> entirely glibc code, except when a signal is being delivered (and of
> course the signal mask is saved and restored for signal delivery).
It's not your fault.
A quick read of Glibc 2.3.1 source says its AIO code does not change
the signal mask when it creates an AIO handling thread.
(This might be a Glibc bug - threads you don't know about shouldn't be
receiving your signals, should they?)
That means if it wasn't masked when you issued the first AIO request,
it won't be masked in that thread.
It's quite annoying that POSIX threads doesn't provide any way to say
"mask this signal in all other threads", which you often want to do
from a library.
-- Jamie
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-27 18:41 ` Jamie Lokier
@ 2008-08-28 9:34 ` Ian Jackson
2008-09-07 2:41 ` Anthony Liguori
0 siblings, 1 reply; 117+ messages in thread
From: Ian Jackson @ 2008-08-28 9:34 UTC (permalink / raw)
To: Jamie Lokier; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Jamie Lokier writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> A quick read of Glibc 2.3.1 source says its AIO code does not change
> the signal mask when it creates an AIO handling thread.
Thanks for investigating that.
> (This might be a Glibc bug - threads you don't know about shouldn't be
> receiving your signals, should they?)
Yes, it's a bug. (I wonder why my test program didn't expose it.)
> That means if it wasn't masked when you issued the first AIO request,
> it won't be masked in that thread.
In practice since we use aio this probably means we can't rely on
signal blocking except for signals which qemu keeps blocked `nearly
all' the time. So pipe-to-self in the signal handler is the way to
go.
> It's quite annoying that POSIX threads doesn't provide any way to say
> "mask this signal in all other threads", which you often want to do
> from a library.
That too.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-28 9:34 ` Ian Jackson
@ 2008-09-07 2:41 ` Anthony Liguori
0 siblings, 0 replies; 117+ messages in thread
From: Anthony Liguori @ 2008-09-07 2:41 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Ian Jackson wrote:
> Jamie Lokier writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
>
>> A quick read of Glibc 2.3.1 source says its AIO code does not change
>> the signal mask when it creates an AIO handling thread.
>>
>
> Thanks for investigating that.
>
>
>> (This might be a Glibc bug - threads you don't know about shouldn't be
>> receiving your signals, should they?)
>>
>
> Yes, it's a bug. (I wonder why my test program didn't expose it.)
>
>
>> That means if it wasn't masked when you issued the first AIO request,
>> it won't be masked in that thread.
>>
>
> In practice since we use aio this probably means we can't rely on
> signal blocking except for signals which qemu keeps blocked `nearly
> all' the time. So pipe-to-self in the signal handler is the way to
> go.
>
In my signalfd patch, we block SIGUSR2 all the time so it shouldn't be
affected by this "bug". We use sigwait or signalfd to receive the signal.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 15:28 ` Ian Jackson
2008-08-26 15:36 ` Anthony Liguori
@ 2008-08-26 15:36 ` Samuel Thibault
1 sibling, 0 replies; 117+ messages in thread
From: Samuel Thibault @ 2008-08-26 15:36 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Ian Jackson, le Tue 26 Aug 2008 16:28:54 +0100, a écrit :
> > I don't see threads as a problem. Are you concerned about mini-OS?
>
> Minios certainly doesn't currently have any threads and it would
> probably be a severe pain to introduce them.
It has threads for its own internal uses, but it doesn't provide an
interface for them (makes everything a lot simpler and efficient of
course)
Samuel
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 15:08 ` Ian Jackson
2008-08-26 15:20 ` Anthony Liguori
@ 2008-08-26 15:21 ` Anthony Liguori
1 sibling, 0 replies; 117+ messages in thread
From: Anthony Liguori @ 2008-08-26 15:21 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Ian Jackson wrote:
> Anthony Liguori writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
>
> Then the pipe already has something in it which is fine. The purpose
> of writing to it is to make sure that it is not empty. We don't count
> the bytes.
>
BTW, I don't see your patch yet, but my mail is pretty funky right now.
Then you need a pipe per-signal and you can't accurately simulate
threadfd() (since you can't communicate siginfo).
> Please consider my patch instead, which doesn't depend on threads and
> is a lot smaller.
>
I don't see threads as a problem. Are you concerned about mini-OS?
Regards,
Anthony Liguori
> Ian.
>
^ permalink raw reply [flat|nested] 117+ messages in thread
[parent not found: <m2n.s.1KXzwZ-002RA9@chiark.greenend.org.uk>]
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
[not found] ` <m2n.s.1KXzwZ-002RA9@chiark.greenend.org.uk>
@ 2008-08-26 15:23 ` Ian Jackson
2008-08-26 15:29 ` Avi Kivity
[not found] ` <m2n.s.1KY0XT-002RJM@chiark.greenend.org.uk>
0 siblings, 2 replies; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 15:23 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Avi Kivity writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> Not sure that it matters, but the semantics are slightly different:
> with a thread you don't get EINTR in random syscalls as the signal
> thread is the only one that has the signals unblocked.
That's true but I don't think it's really relevant because we already
have other signal handlers (and we might grow further ones) so the
rest of the code has to be robust against those.
My experience suggests very strongly that we should avoid doing
multithreaded things if at all possible. Threads are less portable;
even when they are provided many of the implementations are buggy
(although less so nowadays).
There is also the way that once you have a multithreaded program, it
is much more difficult to discourage the expansion of the concurrent
functionality until the whole program is a mass of race bugs. I would
rather not open this stable door.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 15:23 ` Ian Jackson
@ 2008-08-26 15:29 ` Avi Kivity
[not found] ` <m2n.s.1KY0XT-002RJM@chiark.greenend.org.uk>
1 sibling, 0 replies; 117+ messages in thread
From: Avi Kivity @ 2008-08-26 15:29 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Ian Jackson wrote:
> Avi Kivity writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
>
>> Not sure that it matters, but the semantics are slightly different:
>> with a thread you don't get EINTR in random syscalls as the signal
>> thread is the only one that has the signals unblocked.
>>
>
> That's true but I don't think it's really relevant because we already
> have other signal handlers (and we might grow further ones) so the
> rest of the code has to be robust against those.
>
> My experience suggests very strongly that we should avoid doing
> multithreaded things if at all possible. Threads are less portable;
> even when they are provided many of the implementations are buggy
> (although less so nowadays).
>
> There is also the way that once you have a multithreaded program, it
> is much more difficult to discourage the expansion of the concurrent
> functionality until the whole program is a mass of race bugs. I would
> rather not open this stable door.
>
>
In this case the thread is only used to emulate the signalfd() system
call; it is not part of qemu proper.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 117+ messages in thread
[parent not found: <m2n.s.1KY0XT-002RJM@chiark.greenend.org.uk>]
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
[not found] ` <m2n.s.1KY0XT-002RJM@chiark.greenend.org.uk>
@ 2008-08-26 15:38 ` Ian Jackson
2008-08-26 16:12 ` Avi Kivity
0 siblings, 1 reply; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 15:38 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Avi Kivity writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> Ian Jackson wrote:
> > My experience suggests very strongly that we should avoid doing
> > multithreaded things if at all possible. Threads are less portable;
> > even when they are provided many of the implementations are buggy
> > (although less so nowadays).
> >
> > There is also the way that once you have a multithreaded program, it
> > is much more difficult to discourage the expansion of the concurrent
> > functionality until the whole program is a mass of race bugs. I would
> > rather not open this stable door.
>
> In this case the thread is only used to emulate the signalfd() system
> call; it is not part of qemu proper.
That doesn't make us any less exposed to bugs in the threading
implementation, and is only of marginal use if anything for explaining
to people why we shouldn't have a Windows-style multithreaded bug
pile.
Also, I think messing about with Linux-specific syscalls and then
emulating them is hardly a sensible way to carry on, when the
alternative is portable and simple.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 15:38 ` Ian Jackson
@ 2008-08-26 16:12 ` Avi Kivity
2008-08-26 16:14 ` Julian Seward
0 siblings, 1 reply; 117+ messages in thread
From: Avi Kivity @ 2008-08-26 16:12 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Ian Jackson wrote:
> Avi Kivity writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
>
>> Ian Jackson wrote:
>>
>>> My experience suggests very strongly that we should avoid doing
>>> multithreaded things if at all possible. Threads are less portable;
>>> even when they are provided many of the implementations are buggy
>>> (although less so nowadays).
>>>
>>> There is also the way that once you have a multithreaded program, it
>>> is much more difficult to discourage the expansion of the concurrent
>>> functionality until the whole program is a mass of race bugs. I would
>>> rather not open this stable door.
>>>
>> In this case the thread is only used to emulate the signalfd() system
>> call; it is not part of qemu proper.
>>
>
> That doesn't make us any less exposed to bugs in the threading
> implementation, and is only of marginal use if anything for explaining
> to people why we shouldn't have a Windows-style multithreaded bug
> pile.
>
>
Threading bugs in the implementation? In 2008?
Practically all serious software is multithreaded nowadays.
> Also, I think messing about with Linux-specific syscalls and then
> emulating them is hardly a sensible way to carry on, when the
> alternative is portable and simple.
This has merit; though signalfd() will be a lot faster than catching a
signal and writing to a pipe.
In particular, delivering a signal to userspace has to save the floating
point context, while signalfd() avoids it. With the signals used to
signal I/O completion, this matters.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 16:12 ` Avi Kivity
@ 2008-08-26 16:14 ` Julian Seward
2008-08-26 16:39 ` Anthony Liguori
2008-08-27 8:05 ` Avi Kivity
0 siblings, 2 replies; 117+ messages in thread
From: Julian Seward @ 2008-08-26 16:14 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
On Tuesday 26 August 2008, Avi Kivity wrote:
> Threading bugs in the implementation? In 2008?
>
> Practically all serious software is multithreaded nowadays.
That's not in itself an argument in support of writing threaded code
if you don't have to.
I spent much of the last year developing thread-checking tools
in the Valgrind framework (Helgrind), and testing them on serious
software. From that I would say that practically all serious
multithreaded software is riddled with threading bugs, mostly
data races and inconsistent lock orderings (potential deadlocks).
J
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 16:14 ` Julian Seward
@ 2008-08-26 16:39 ` Anthony Liguori
2008-08-27 8:05 ` Avi Kivity
1 sibling, 0 replies; 117+ messages in thread
From: Anthony Liguori @ 2008-08-26 16:39 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Julian Seward wrote:
> On Tuesday 26 August 2008, Avi Kivity wrote:
>
>
>> Threading bugs in the implementation? In 2008?
>>
>> Practically all serious software is multithreaded nowadays.
>>
>
> That's not in itself an argument in support of writing threaded code
> if you don't have to.
>
I know arguing about threading is almost as fun as arguing about XML,
but in this very specific, very particular case, it's pretty obvious by
simple inspection that there aren't thread safety issues.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-26 16:14 ` Julian Seward
2008-08-26 16:39 ` Anthony Liguori
@ 2008-08-27 8:05 ` Avi Kivity
2008-08-27 8:49 ` Alan Cox
1 sibling, 1 reply; 117+ messages in thread
From: Avi Kivity @ 2008-08-27 8:05 UTC (permalink / raw)
To: Julian Seward; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Julian Seward wrote:
> On Tuesday 26 August 2008, Avi Kivity wrote:
>
>
>> Threading bugs in the implementation? In 2008?
>>
>> Practically all serious software is multithreaded nowadays.
>>
>
> That's not in itself an argument in support of writing threaded code
> if you don't have to.
>
> I spent much of the last year developing thread-checking tools
> in the Valgrind framework (Helgrind), and testing them on serious
> software. From that I would say that practically all serious
> multithreaded software is riddled with threading bugs, mostly
> data races and inconsistent lock orderings (potential deadlocks).
>
Most serious software has to be threaded. The absence of nonblocking
libraries, the complexities of compositing nonblocking components when
they do exist, and the requirement for utilizing multiple cores, all
push towards threading.
I agree that it's incredibly easy to write badly threaded code. But I
don't see a way to avoid it.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-27 8:05 ` Avi Kivity
@ 2008-08-27 8:49 ` Alan Cox
2008-08-27 9:10 ` Avi Kivity
2008-08-27 9:12 ` Gleb Natapov
0 siblings, 2 replies; 117+ messages in thread
From: Alan Cox @ 2008-08-27 8:49 UTC (permalink / raw)
To: Avi Kivity; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
> I agree that it's incredibly easy to write badly threaded code. But I
> don't see a way to avoid it.
Message passing, using the right language tools (in C that unfortunately
means writing your own or going back to 60s textbooks)
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-27 8:49 ` Alan Cox
@ 2008-08-27 9:10 ` Avi Kivity
2008-08-27 9:16 ` Keir Fraser
2008-08-27 9:12 ` Gleb Natapov
1 sibling, 1 reply; 117+ messages in thread
From: Avi Kivity @ 2008-08-27 9:10 UTC (permalink / raw)
To: Alan Cox; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Alan Cox wrote:
>> I agree that it's incredibly easy to write badly threaded code. But I
>> don't see a way to avoid it.
>>
>
> Message passing, using the right language tools (in C that unfortunately
> means writing your own or going back to 60s textbooks)
>
Oh, if we're talking languages, then I'm all for pure functional
languages. If you are side effects free, you can let the runtime do all
the threading for you automatically.
I don't see that catching on though.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-27 9:10 ` Avi Kivity
@ 2008-08-27 9:16 ` Keir Fraser
0 siblings, 0 replies; 117+ messages in thread
From: Keir Fraser @ 2008-08-27 9:16 UTC (permalink / raw)
To: Avi Kivity, Alan Cox; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
On 27/8/08 10:10, "Avi Kivity" <avi@qumranet.com> wrote:
>> Message passing, using the right language tools (in C that unfortunately
>> means writing your own or going back to 60s textbooks)
>>
>
> Oh, if we're talking languages, then I'm all for pure functional
> languages. If you are side effects free, you can let the runtime do all
> the threading for you automatically.
Practical efficient implementation of this is still an open research
problem, I believe. At least as far as auto-parallelisation is concerned.
It's one of those pure-functional-language myths.
-- Keir
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
2008-08-27 8:49 ` Alan Cox
2008-08-27 9:10 ` Avi Kivity
@ 2008-08-27 9:12 ` Gleb Natapov
1 sibling, 0 replies; 117+ messages in thread
From: Gleb Natapov @ 2008-08-27 9:12 UTC (permalink / raw)
To: Alan Cox; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
On Wed, Aug 27, 2008 at 09:49:35AM +0100, Alan Cox wrote:
> > I agree that it's incredibly easy to write badly threaded code. But I
> > don't see a way to avoid it.
>
> Message passing, using the right language tools (in C that unfortunately
> means writing your own or going back to 60s textbooks)
>
Long live MPI!
--
Gleb.
^ permalink raw reply [flat|nested] 117+ messages in thread
[parent not found: <m2n.s.1KXz4u-002Qvq@chiark.greenend.org.uk>]
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals.
[not found] ` <m2n.s.1KXz4u-002Qvq@chiark.greenend.org.uk>
@ 2008-08-26 14:55 ` Ian Jackson
0 siblings, 0 replies; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 14:55 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Ian Jackson writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 01/13] Handle terminating signals."):
> Why does it need a thread ? You can just write to the pipe in the
> signal handler. I'll post my code.
Just done. As you see it's quite straightforward. We have to
save/restore errno in the signal handler as usual.
So should I provide a patch to do the same for the termination
signals ? They are much less time-critical than aio completion so
it's perhaps less important.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 02/13] add DisplayState->idle
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
2008-08-21 16:27 ` [Qemu-devel] [PATCH 01/13] Handle terminating signals Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 16:33 ` [Qemu-devel] " Samuel Thibault
2008-08-21 20:12 ` [Qemu-devel] " Anthony Liguori
2008-08-21 16:27 ` [Qemu-devel] [PATCH 03/13] add container_of() macro to osdep.h Gerd Hoffmann
` (12 subsequent siblings)
14 siblings, 2 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann, Samuel Thibault
From: Samuel Thibault <samuel.thibault@eu.citrix.com>
From: Samuel Thibault <samuel.thibault@eu.citrix.com>
Add idle field to DisplayState struct, so drivers can figure
the display is idle and take advantage of that.
The xen framebuffer driver will use this to communicate the
idle state to the guest, so it knows it can stop doing updates
to a virtual display which is invisible anyway.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
console.h | 1 +
sdl.c | 2 ++
vl.c | 2 ++
vnc.c | 3 +++
4 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/console.h b/console.h
index 561ef51..233e7ef 100644
--- a/console.h
+++ b/console.h
@@ -80,6 +80,7 @@ struct DisplayState {
void *opaque;
struct QEMUTimer *gui_timer;
uint64_t gui_timer_interval;
+ int idle;
void (*dpy_update)(struct DisplayState *s, int x, int y, int w, int h);
void (*dpy_resize)(struct DisplayState *s, int w, int h);
diff --git a/sdl.c b/sdl.c
index 84a9d6d..15427c5 100644
--- a/sdl.c
+++ b/sdl.c
@@ -524,9 +524,11 @@ static void sdl_refresh(DisplayState *ds)
if (ev->active.gain) {
/* Back to default interval */
ds->gui_timer_interval = 0;
+ ds->idle = 0;
} else {
/* Sleeping interval */
ds->gui_timer_interval = 500;
+ ds->idle = 1;
}
}
break;
diff --git a/vl.c b/vl.c
index 071ee0c..ee89107 100644
--- a/vl.c
+++ b/vl.c
@@ -5974,6 +5974,8 @@ static void dumb_display_init(DisplayState *ds)
ds->dpy_update = dumb_update;
ds->dpy_resize = dumb_resize;
ds->dpy_refresh = dumb_refresh;
+ ds->gui_timer_interval = 500;
+ ds->idle = 1;
}
/***********************************************************/
diff --git a/vnc.c b/vnc.c
index 2d17044..0d55a78 100644
--- a/vnc.c
+++ b/vnc.c
@@ -660,6 +660,7 @@ static int vnc_client_io_error(VncState *vs, int ret, int last_errno)
qemu_set_fd_handler2(vs->csock, NULL, NULL, NULL, NULL);
closesocket(vs->csock);
vs->csock = -1;
+ vs->ds->idle = 1;
buffer_reset(&vs->input);
buffer_reset(&vs->output);
vs->need_update = 0;
@@ -1920,6 +1921,7 @@ static int protocol_version(VncState *vs, uint8_t *version, size_t len)
static void vnc_connect(VncState *vs)
{
VNC_DEBUG("New client on socket %d\n", vs->csock);
+ vs->ds->idle = 0;
socket_set_nonblock(vs->csock);
qemu_set_fd_handler2(vs->csock, NULL, vnc_client_read, NULL, vs);
vnc_write(vs, "RFB 003.008\n", 12);
@@ -1959,6 +1961,7 @@ void vnc_display_init(DisplayState *ds)
exit(1);
ds->opaque = vs;
+ ds->idle = 1;
vnc_state = vs;
vs->display = NULL;
vs->password = NULL;
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [Qemu-devel] Re: [PATCH 02/13] add DisplayState->idle
2008-08-21 16:27 ` [Qemu-devel] [PATCH 02/13] add DisplayState->idle Gerd Hoffmann
@ 2008-08-21 16:33 ` Samuel Thibault
2008-08-21 20:12 ` [Qemu-devel] " Anthony Liguori
1 sibling, 0 replies; 117+ messages in thread
From: Samuel Thibault @ 2008-08-21 16:33 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: xen-devel, qemu-devel
Gerd Hoffmann, le Thu 21 Aug 2008 18:27:23 +0200, a écrit :
> From: Samuel Thibault <samuel.thibault@eu.citrix.com>
>
> From: Samuel Thibault <samuel.thibault@eu.citrix.com>
>
> Add idle field to DisplayState struct, so drivers can figure
> the display is idle and take advantage of that.
>
> The xen framebuffer driver will use this to communicate the
> idle state to the guest, so it knows it can stop doing updates
> to a virtual display which is invisible anyway.
Signed-off-by: Samuel Thibault <samuel.thibault@eu.citrix.com>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Qemu-devel] [PATCH 02/13] add DisplayState->idle
2008-08-21 16:27 ` [Qemu-devel] [PATCH 02/13] add DisplayState->idle Gerd Hoffmann
2008-08-21 16:33 ` [Qemu-devel] " Samuel Thibault
@ 2008-08-21 20:12 ` Anthony Liguori
1 sibling, 0 replies; 117+ messages in thread
From: Anthony Liguori @ 2008-08-21 20:12 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann, Samuel Thibault
Gerd Hoffmann wrote:
> From: Samuel Thibault <samuel.thibault@eu.citrix.com>
>
> From: Samuel Thibault <samuel.thibault@eu.citrix.com>
>
> Add idle field to DisplayState struct, so drivers can figure
> the display is idle and take advantage of that.
>
> The xen framebuffer driver will use this to communicate the
> idle state to the guest, so it knows it can stop doing updates
> to a virtual display which is invisible anyway.
>
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
>
>
Applied. Thanks.
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 03/13] add container_of() macro to osdep.h
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
2008-08-21 16:27 ` [Qemu-devel] [PATCH 01/13] Handle terminating signals Gerd Hoffmann
2008-08-21 16:27 ` [Qemu-devel] [PATCH 02/13] add DisplayState->idle Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 20:12 ` Anthony Liguori
2008-08-21 16:27 ` [Qemu-devel] [PATCH 04/13] move GUI_REFRESH_INTERVAL define from vl.c to console.h Gerd Hoffmann
` (11 subsequent siblings)
14 siblings, 1 reply; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
>From linux kernel sources, xen bits will use it, put it
into a place where others can see and use it too ;)
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
osdep.h | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/osdep.h b/osdep.h
index 6312e7a..09ebace 100644
--- a/osdep.h
+++ b/osdep.h
@@ -23,6 +23,10 @@
#define unlikely(x) __builtin_expect(!!(x), 0)
#endif
+#define container_of(ptr, type, member) ({ \
+ const typeof( ((type *)0)->member ) *__mptr = (ptr); \
+ (type *)( (char *)__mptr - offsetof(type,member) );})
+
#ifndef MIN
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#endif
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 04/13] move GUI_REFRESH_INTERVAL define from vl.c to console.h
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
` (2 preceding siblings ...)
2008-08-21 16:27 ` [Qemu-devel] [PATCH 03/13] add container_of() macro to osdep.h Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 20:12 ` Anthony Liguori
2008-08-21 16:27 ` [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support Gerd Hoffmann
` (10 subsequent siblings)
14 siblings, 1 reply; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
console.h | 3 +++
vl.c | 2 --
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/console.h b/console.h
index 233e7ef..069e73d 100644
--- a/console.h
+++ b/console.h
@@ -9,6 +9,9 @@
#define MOUSE_EVENT_RBUTTON 0x02
#define MOUSE_EVENT_MBUTTON 0x04
+/* in ms */
+#define GUI_REFRESH_INTERVAL 30
+
typedef void QEMUPutKBDEvent(void *opaque, int keycode);
typedef void QEMUPutMouseEvent(void *opaque, int dx, int dy, int dz, int buttons_state);
diff --git a/vl.c b/vl.c
index ee89107..3211507 100644
--- a/vl.c
+++ b/vl.c
@@ -154,8 +154,6 @@ int inet_aton(const char *cp, struct in_addr *ia);
#else
#define DEFAULT_RAM_SIZE 128
#endif
-/* in ms */
-#define GUI_REFRESH_INTERVAL 30
/* Max number of USB devices that can be specified on the commandline. */
#define MAX_USB_CMDLINE 8
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
` (3 preceding siblings ...)
2008-08-21 16:27 ` [Qemu-devel] [PATCH 04/13] move GUI_REFRESH_INTERVAL define from vl.c to console.h Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 20:18 ` Anthony Liguori
2008-08-21 16:27 ` [Qemu-devel] [PATCH 06/13] xen: backend driver core Gerd Hoffmann
` (9 subsequent siblings)
14 siblings, 1 reply; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
- configure script and build system changes.
- wind up new machine type.
- add -domid command line option.
- allow xenpv machines run without disk and kernel specified
by adding a nodisk_ok field to QEMUMachine.
- detect xen being present.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
Makefile.target | 8 +++
configure | 27 +++++++++
hw/boards.h | 3 +
hw/xen.h | 18 ++++++
hw/xen_machine_pv.c | 146 +++++++++++++++++++++++++++++++++++++++++++++++++
target-i386/machine.c | 3 +
vl.c | 15 +++++
7 files changed, 220 insertions(+), 0 deletions(-)
create mode 100644 hw/xen.h
create mode 100644 hw/xen_machine_pv.c
diff --git a/Makefile.target b/Makefile.target
index 42162c3..13ee854 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -515,6 +515,14 @@ CPPFLAGS += $(CONFIG_VNC_TLS_CFLAGS)
LIBS += $(CONFIG_VNC_TLS_LIBS)
endif
+# xen backend driver support
+XEN_OBJS := xen_machine_pv.o
+ifeq ($(CONFIG_XEN), yes)
+ OBJS += $(XEN_OBJS)
+ LIBS += $(XEN_LIBS)
+ $(XEN_OBJS) : CFLAGS += -Wall -Wmissing-prototypes -Wstrict-prototypes
+endif
+
# SCSI layer
OBJS+= lsi53c895a.o esp.o
diff --git a/configure b/configure
index 74ce443..14689d2 100755
--- a/configure
+++ b/configure
@@ -109,6 +109,7 @@ curses="yes"
aio="yes"
nptl="yes"
mixemu="no"
+xen="no"
# OS specific
targetos=`uname -s`
@@ -204,6 +205,7 @@ linux="yes"
linux_user="yes"
if [ "$cpu" = "i386" -o "$cpu" = "x86_64" ] ; then
kqemu="yes"
+ xen="yes"
audio_possible_drivers="$audio_possible_drivers fmod"
fi
;;
@@ -287,6 +289,8 @@ for opt do
;;
--disable-kqemu) kqemu="no"
;;
+ --disable-xen) xen="no"
+ ;;
--disable-brlapi) brlapi="no"
;;
--enable-profiler) profiler="yes"
@@ -425,6 +429,7 @@ echo " Available drivers: $audio_possible_drivers"
echo " --audio-card-list=LIST set list of additional emulated audio cards"
echo " Available cards: ac97 adlib cs4231a gus"
echo " --enable-mixemu enable mixer emulation"
+echo " --disable-xen disable xen backend driver support"
echo " --disable-brlapi disable BrlAPI"
echo " --disable-vnc-tls disable TLS encryption for VNC server"
echo " --disable-curses disable curses output"
@@ -686,6 +691,22 @@ else
fi
##########################################
+# xen probe
+
+if test "$xen" = "yes" ; then
+cat > $TMPC <<EOF
+#include <xenctrl.h>
+#include <xs.h>
+int main(void) { xs_daemon_open; xc_interface_open; }
+EOF
+ if $cc $ARCH_CFLAGS -c -o $TMPO $TMPC -lxenstore -lxenctrl 2> /dev/null ; then
+ :
+ else
+ xen="no"
+ fi
+fi
+
+##########################################
# SDL probe
sdl_too_old=no
@@ -946,6 +967,7 @@ if test -n "$sparc_cpu"; then
echo "Target Sparc Arch $sparc_cpu"
fi
echo "kqemu support $kqemu"
+echo "xen support $xen"
echo "brlapi support $brlapi"
echo "Documentation $build_docs"
[ ! -z "$uname_release" ] && \
@@ -1203,6 +1225,11 @@ if test "$brlapi" = "yes" ; then
echo "#define CONFIG_BRLAPI 1" >> $config_h
echo "BRLAPI_LIBS=-lbrlapi" >> $config_mak
fi
+if test "$xen" = "yes" ; then
+ echo "CONFIG_XEN=yes" >> $config_mak
+ echo "#define CONFIG_XEN 1" >> $config_h
+ echo "XEN_LIBS=-lxenstore -lxenctrl" >> $config_mak
+fi
if test "$aio" = "yes" ; then
echo "#define CONFIG_AIO 1" >> $config_h
fi
diff --git a/hw/boards.h b/hw/boards.h
index e6dd198..9af9939 100644
--- a/hw/boards.h
+++ b/hw/boards.h
@@ -30,6 +30,9 @@ extern QEMUMachine bareetraxfs_machine;
extern QEMUMachine pc_machine;
extern QEMUMachine isapc_machine;
+/* xen_machine.c */
+extern QEMUMachine xenpv_machine;
+
/* ppc.c */
extern QEMUMachine prep_machine;
extern QEMUMachine core99_machine;
diff --git a/hw/xen.h b/hw/xen.h
new file mode 100644
index 0000000..a772aad
--- /dev/null
+++ b/hw/xen.h
@@ -0,0 +1,18 @@
+#ifndef QEMU_XEN_H
+#define QEMU_XEN_H 1
+/*
+ * public xen header
+ * stuff needed outside xen-*.c, i.e. interfaces to qemu.
+ * must not depend on any xen headers being present in
+ * /usr/include/xen, so it can be included unconditionally.
+ */
+
+/* xen-machine.c */
+extern int xen_domid;
+extern int xen_present;
+extern int xen_emulate;
+extern int xen_domainbuild;
+
+int xen_detect(void);
+
+#endif /* QEMU_XEN_H */
diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
new file mode 100644
index 0000000..05f0e67
--- /dev/null
+++ b/hw/xen_machine_pv.c
@@ -0,0 +1,146 @@
+/*
+ * QEMU Xen PV Machine
+ *
+ * Copyright (c) 2007,08 Red Hat
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "hw.h"
+#include "boards.h"
+
+#include <xenctrl.h>
+
+/* -------------------------------------------------------------------- */
+/* variables */
+
+int xen_domid;
+int xen_present = -1;
+int xen_emulate;
+int xen_domainbuild;
+
+/* -------------------------------------------------------------------- */
+/* initialization code */
+
+/*
+ * Figure the environment we are running in.
+ * Returns true when xen is present, false otherwise.
+ * Also checks whenever the domain specified via -domid
+ * exists (so we can attach) or whenever it must be created.
+ */
+int xen_detect(void)
+{
+ struct xc_dominfo info;
+ int xc, rc;
+
+ if (xen_present != -1)
+ goto out;
+
+ /* running on xen? (priviledged domain) */
+ xen_present = 0;
+ xc = xc_interface_open();
+ if (-1 == xc)
+ goto out;
+ xen_present = 1;
+
+ /* does the domain exist? Or should we create one? */
+ rc = xc_domain_getinfo(xc, xen_domid, 1, &info);
+ if ((1 != rc) || (info.domid != xen_domid))
+ xen_domainbuild = 1;
+ close(xc);
+
+out:
+ return xen_present;
+}
+
+static int xen_init(void)
+{
+ if (!xen_domid) {
+ fprintf(stderr, "%s: no domid specified\n", __FUNCTION__);
+ return -1;
+ }
+
+ if (!xen_detect()) {
+ fprintf(stderr, "%s: emulating Xen\n", __FUNCTION__);
+ xen_emulate = 1;
+ }
+
+ return 0;
+}
+
+static int xen_init_pv(DisplayState *ds)
+{
+ int rc;
+
+ rc = xen_init();
+ if (rc < 0)
+ return rc;
+
+ return 0;
+}
+
+/* -------------------------------------------------------------------- */
+/* paravirtualized xen machine */
+
+static void xenpv_init(ram_addr_t ram_size, int vga_ram_size,
+ const char *boot_device, DisplayState *ds,
+ const char *kernel_filename,
+ const char *kernel_cmdline,
+ const char *initrd_filename,
+ const char *cpu_model)
+{
+ CPUState *env;
+ int rc;
+
+ rc = xen_init_pv(ds);
+ if (-1 == rc)
+ goto err;
+
+ if (xen_emulate) {
+ fprintf(stderr, "xen pv emulation not implemented yet\n");
+ goto err;
+ }
+ if (xen_domainbuild) {
+ fprintf(stderr, "xen pv domain creation not implemented yet\n");
+ goto err;
+ }
+
+ /* create dummy cpu, halted */
+ if (cpu_model == NULL) {
+#ifdef TARGET_X86_64
+ cpu_model = "qemu64";
+#else
+ cpu_model = "qemu32";
+#endif
+ }
+ env = cpu_init(cpu_model);
+ env->halted = 1;
+
+ return;
+
+err:
+ exit(1);
+}
+
+QEMUMachine xenpv_machine = {
+ .name = "xenpv",
+ .desc = "paravirtualized Xen machine",
+ .init = xenpv_init,
+ .nodisk_ok = 1,
+};
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 91dbd55..98ece17 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -9,6 +9,9 @@ void register_machines(void)
{
qemu_register_machine(&pc_machine);
qemu_register_machine(&isapc_machine);
+#ifdef CONFIG_XEN
+ qemu_register_machine(&xenpv_machine);
+#endif
}
static void cpu_put_seg(QEMUFile *f, SegmentCache *dt)
diff --git a/vl.c b/vl.c
index 3211507..ff94c6f 100644
--- a/vl.c
+++ b/vl.c
@@ -29,6 +29,7 @@
#include "hw/audiodev.h"
#include "hw/isa.h"
#include "hw/baum.h"
+#include "hw/xen.h"
#include "net.h"
#include "console.h"
#include "sysemu.h"
@@ -7780,6 +7781,9 @@ static void help(int exitcode)
"-startdate select initial date of the clock\n"
"-icount [N|auto]\n"
" Enable virtual instruction counter with 2^N clock ticks per instruction\n"
+#ifdef CONFIG_XEN
+ "-domid specify xen guest domain id\n"
+#endif
"\n"
"During emulation, the following keys are useful:\n"
"ctrl-alt-f toggle full screen\n"
@@ -7885,6 +7889,9 @@ enum {
QEMU_OPTION_startdate,
QEMU_OPTION_tb_size,
QEMU_OPTION_icount,
+#ifdef CONFIG_XEN
+ QEMU_OPTION_domid,
+#endif
};
typedef struct QEMUOption {
@@ -7973,6 +7980,9 @@ const QEMUOption qemu_options[] = {
#ifdef CONFIG_CURSES
{ "curses", 0, QEMU_OPTION_curses },
#endif
+#ifdef CONFIG_XEN
+ { "domid", HAS_ARG, QEMU_OPTION_domid },
+#endif
/* temporary options */
{ "usb", 0, QEMU_OPTION_usb },
@@ -8847,6 +8857,11 @@ int main(int argc, char **argv)
icount_time_shift = strtol(optarg, NULL, 0);
}
break;
+#ifdef CONFIG_XEN
+ case QEMU_OPTION_domid:
+ xen_domid = atoi(optarg);
+ break;
+#endif
}
}
}
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-21 16:27 ` [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support Gerd Hoffmann
@ 2008-08-21 20:18 ` Anthony Liguori
0 siblings, 0 replies; 117+ messages in thread
From: Anthony Liguori @ 2008-08-21 20:18 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Ian Jackson, Gerd Hoffmann, Samuel Thibault
Gerd Hoffmann wrote:
> - configure script and build system changes.
> - wind up new machine type.
> - add -domid command line option.
> - allow xenpv machines run without disk and kernel specified
> by adding a nodisk_ok field to QEMUMachine.
> - detect xen being present.
>
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
>
>
I'm pretty happy with up to 11 in the series but I'll hold off applying
them until they show up in qemu-dm or at least are Acked-by Ian or Samuel.
Ian/Samuel: is there any subset of these patches that you are
comfortable with or do they all need more discussion?
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 06/13] xen: backend driver core
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
` (4 preceding siblings ...)
2008-08-21 16:27 ` [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 16:27 ` [Qemu-devel] [PATCH 07/13] xen: add console backend driver Gerd Hoffmann
` (8 subsequent siblings)
14 siblings, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
This patch adds infrastructure for xen backend drivers living in qemu,
so drivers don't need to implement common stuff on their own. It's
mostly xenbus management stuff: some functions to access xentore,
setting up xenstore watches, callbacks on device discovery and state
changes, handle event channel, ...
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
Makefile.target | 2 +-
hw/xen_backend.c | 691 +++++++++++++++++++++++++++++++++++++++++++++++++++
hw/xen_backend.h | 86 +++++++
hw/xen_common.h | 34 +++
hw/xen_machine_pv.c | 7 +-
5 files changed, 818 insertions(+), 2 deletions(-)
create mode 100644 hw/xen_backend.c
create mode 100644 hw/xen_backend.h
create mode 100644 hw/xen_common.h
diff --git a/Makefile.target b/Makefile.target
index 13ee854..78a6402 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -516,7 +516,7 @@ LIBS += $(CONFIG_VNC_TLS_LIBS)
endif
# xen backend driver support
-XEN_OBJS := xen_machine_pv.o
+XEN_OBJS := xen_machine_pv.o xen_backend.o
ifeq ($(CONFIG_XEN), yes)
OBJS += $(XEN_OBJS)
LIBS += $(XEN_LIBS)
diff --git a/hw/xen_backend.c b/hw/xen_backend.c
new file mode 100644
index 0000000..e926a01
--- /dev/null
+++ b/hw/xen_backend.c
@@ -0,0 +1,691 @@
+/*
+ * xen backend driver infrastructure
+ * (c) 2008 Gerd Hoffmann <kraxel@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; under version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+/*
+ * TODO: add some xenbus / xenstore concepts overview here.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <string.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <sys/signal.h>
+
+#include <xs.h>
+#include <xenctrl.h>
+#include <xen/grant_table.h>
+
+#include "hw.h"
+#include "qemu-char.h"
+#include "xen_backend.h"
+
+/* ------------------------------------------------------------- */
+
+/* public */
+int xen_xc;
+struct xs_handle *xenstore = NULL;
+
+/* private */
+static TAILQ_HEAD(XenDeviceHead, XenDevice) xendevs = TAILQ_HEAD_INITIALIZER(xendevs);
+static int debug = 0;
+
+/* ------------------------------------------------------------- */
+
+int xenstore_write_str(const char *base, const char *node, const char *val)
+{
+ char abspath[XEN_BUFSIZE];
+
+ snprintf(abspath, sizeof(abspath), "%s/%s", base, node);
+ if (!xs_write(xenstore, 0, abspath, val, strlen(val)))
+ return -1;
+ return 0;
+}
+
+char *xenstore_read_str(const char *base, const char *node)
+{
+ char abspath[XEN_BUFSIZE];
+ unsigned int len;
+
+ snprintf(abspath, sizeof(abspath), "%s/%s", base, node);
+ return xs_read(xenstore, 0, abspath, &len);
+}
+
+int xenstore_write_int(const char *base, const char *node, int ival)
+{
+ char val[32];
+
+ snprintf(val, sizeof(val), "%d", ival);
+ return xenstore_write_str(base, node, val);
+}
+
+int xenstore_read_int(const char *base, const char *node, int *ival)
+{
+ char *val;
+ int rc = -1;
+
+ val = xenstore_read_str(base, node);
+ if (val && 1 == sscanf(val, "%d", ival))
+ rc = 0;
+ qemu_free(val);
+ return rc;
+}
+
+int xenstore_write_be_str(struct XenDevice *xendev, const char *node, const char *val)
+{
+ return xenstore_write_str(xendev->be, node, val);
+}
+
+int xenstore_write_be_int(struct XenDevice *xendev, const char *node, int ival)
+{
+ return xenstore_write_int(xendev->be, node, ival);
+}
+
+char *xenstore_read_be_str(struct XenDevice *xendev, const char *node)
+{
+ return xenstore_read_str(xendev->be, node);
+}
+
+int xenstore_read_be_int(struct XenDevice *xendev, const char *node, int *ival)
+{
+ return xenstore_read_int(xendev->be, node, ival);
+}
+
+char *xenstore_read_fe_str(struct XenDevice *xendev, const char *node)
+{
+ return xenstore_read_str(xendev->fe, node);
+}
+
+int xenstore_read_fe_int(struct XenDevice *xendev, const char *node, int *ival)
+{
+ return xenstore_read_int(xendev->fe, node, ival);
+}
+
+/* ------------------------------------------------------------- */
+
+const char *xenbus_strstate(enum xenbus_state state)
+{
+ static const char *const name[] = {
+ [ XenbusStateUnknown ] = "Unknown",
+ [ XenbusStateInitialising ] = "Initialising",
+ [ XenbusStateInitWait ] = "InitWait",
+ [ XenbusStateInitialised ] = "Initialised",
+ [ XenbusStateConnected ] = "Connected",
+ [ XenbusStateClosing ] = "Closing",
+ [ XenbusStateClosed ] = "Closed",
+ };
+ return (state < ARRAY_SIZE(name)) ? name[state] : "INVALID";
+}
+
+int xen_be_set_state(struct XenDevice *xendev, enum xenbus_state state)
+{
+ int rc;
+
+ rc = xenstore_write_be_int(xendev, "state", state);
+ if (rc < 0)
+ return rc;
+ xen_be_printf(xendev, 1, "backend state: %s -> %s\n",
+ xenbus_strstate(xendev->be_state), xenbus_strstate(state));
+ xendev->be_state = state;
+ return 0;
+}
+
+/* ------------------------------------------------------------- */
+
+struct XenDevice *xen_be_find_xendev(char *type, int dom, int dev)
+{
+ struct XenDevice *xendev;
+
+ TAILQ_FOREACH(xendev, &xendevs, next) {
+ if (xendev->dom != dom)
+ continue;
+ if (xendev->dev != dev)
+ continue;
+ if (0 != strcmp(xendev->type, type))
+ continue;
+ return xendev;
+ }
+ return NULL;
+}
+
+/*
+ * get xen backend device, allocate a new one if it doesn't exist.
+ */
+static struct XenDevice *xen_be_get_xendev(char *type, int dom, int dev,
+ struct XenDevOps *ops)
+{
+ struct XenDevice *xendev;
+ char *dom0;
+
+ xendev = xen_be_find_xendev(type, dom, dev);
+ if (xendev)
+ return xendev;
+
+ /* init new xendev */
+ xendev = qemu_mallocz(ops->size);
+ if (!xendev)
+ return NULL;
+ xendev->type = type;
+ xendev->dom = dom;
+ xendev->dev = dev;
+ xendev->ops = ops;
+
+ dom0 = xs_get_domain_path(xenstore, 0);
+ snprintf(xendev->be, sizeof(xendev->be), "%s/backend/%s/%d/%d",
+ dom0, xendev->type, xendev->dom, xendev->dev);
+ snprintf(xendev->name, sizeof(xendev->name), "%s-%d",
+ xendev->type, xendev->dev);
+ free(dom0);
+
+ xendev->debug = debug;
+ xendev->local_port = -1;
+
+ xendev->evtchndev = xc_evtchn_open();
+ if (xendev->evtchndev < 0) {
+ fprintf(stderr, "can't open evtchn device\n");
+ qemu_free(xendev);
+ return NULL;
+ }
+ fcntl(xc_evtchn_fd(xendev->evtchndev), F_SETFD, FD_CLOEXEC);
+
+ if (ops->flags & DEVOPS_FLAG_NEED_GNTDEV) {
+ xendev->gnttabdev = xc_gnttab_open();
+ if (xendev->gnttabdev < 0) {
+ fprintf(stderr, "can't open gnttab device\n");
+ xc_evtchn_close(xendev->evtchndev);
+ qemu_free(xendev);
+ return NULL;
+ }
+ } else {
+ xendev->gnttabdev = -1;
+ }
+
+ TAILQ_INSERT_TAIL(&xendevs, xendev, next);
+
+ if (xendev->ops->alloc)
+ xendev->ops->alloc(xendev);
+
+ return xendev;
+}
+
+/*
+ * release xen backend device.
+ */
+static struct XenDevice *xen_be_del_xendev(int dom, int dev)
+{
+ struct XenDevice *xendev, *xnext;
+
+ /*
+ * This is pretty much like TAILQ_FOREACH(xendev, &xendevs, next) but
+ * we save the next pointer in xnext because we might free xendev.
+ */
+ xnext = xendevs.tqh_first;
+ while (xnext) {
+ xendev = xnext;
+ xnext = xendev->next.tqe_next;
+
+ if (xendev->dom != dom)
+ continue;
+ if (xendev->dev != dev && dev != -1)
+ continue;
+
+ if (xendev->ops->free)
+ xendev->ops->free(xendev);
+
+ if (xendev->fe) {
+ char token[XEN_BUFSIZE];
+ snprintf(token, sizeof(token), "fe:%p", xendev);
+ xs_unwatch(xenstore, xendev->fe, token);
+ qemu_free(xendev->fe);
+ }
+
+ if (xendev->evtchndev >= 0)
+ xc_evtchn_close(xendev->evtchndev);
+ if (xendev->gnttabdev >= 0)
+ xc_gnttab_close(xendev->gnttabdev);
+
+ TAILQ_REMOVE(&xendevs, xendev, next);
+ qemu_free(xendev);
+ }
+ return NULL;
+}
+
+/*
+ * Sync internal data structures on xenstore updates.
+ * Node specifies the changed field. node = NULL means
+ * update all fields (used for initialization).
+ */
+static void xen_be_backend_changed(struct XenDevice *xendev, const char *node)
+{
+ if (NULL == node || 0 == strcmp(node, "online")) {
+ if (-1 == xenstore_read_be_int(xendev, "online", &xendev->online))
+ xendev->online = 0;
+ }
+
+ if (node) {
+ xen_be_printf(xendev, 2, "backend update: %s\n", node);
+ if (xendev->ops->backend_changed)
+ xendev->ops->backend_changed(xendev, node);
+ }
+}
+
+static void xen_be_frontend_changed(struct XenDevice *xendev, const char *node)
+{
+ int fe_state;
+
+ if (NULL == node || 0 == strcmp(node, "state")) {
+ if (-1 == xenstore_read_fe_int(xendev, "state", &fe_state))
+ fe_state = XenbusStateUnknown;
+ if (xendev->fe_state != fe_state)
+ xen_be_printf(xendev, 1, "frontend state: %s -> %s\n",
+ xenbus_strstate(xendev->fe_state),
+ xenbus_strstate(fe_state));
+ xendev->fe_state = fe_state;
+ }
+ if (NULL == node || 0 == strcmp(node, "protocol")) {
+ qemu_free(xendev->protocol);
+ xendev->protocol = xenstore_read_fe_str(xendev, "protocol");
+ if (xendev->protocol)
+ xen_be_printf(xendev, 1, "frontend protocol: %s\n", xendev->protocol);
+ }
+
+ if (node) {
+ xen_be_printf(xendev, 2, "frontend update: %s\n", node);
+ if (xendev->ops->frontend_changed)
+ xendev->ops->frontend_changed(xendev, node);
+ }
+}
+
+/* ------------------------------------------------------------- */
+/* Check for possible state transitions and perform them. */
+
+/*
+ * Initial xendev setup. Read frontend path, register watch for it.
+ * Should succeed once xend finished setting up the backend device.
+ *
+ * Also sets initial state (-> Initializing) when done. Which
+ * only affects the xendev->be_state variable as xenbus should
+ * already be put into that state by xend.
+ */
+static int xen_be_try_setup(struct XenDevice *xendev)
+{
+ char token[XEN_BUFSIZE];
+ int be_state;
+
+ if (-1 == xenstore_read_be_int(xendev, "state", &be_state)) {
+ xen_be_printf(xendev, 0, "reading backend state failed\n");
+ return -1;
+ }
+
+ if (be_state != XenbusStateInitialising) {
+ xen_be_printf(xendev, 0, "initial backend state is wrong (%s)\n",
+ xenbus_strstate(be_state));
+ return -1;
+ }
+
+ xendev->fe = xenstore_read_be_str(xendev, "frontend");
+ if (NULL == xendev->fe) {
+ xen_be_printf(xendev, 0, "reading frontend path failed\n");
+ return -1;
+ }
+
+ /* setup frontend watch */
+ snprintf(token, sizeof(token), "fe:%p", xendev);
+ if (!xs_watch(xenstore, xendev->fe, token)) {
+ xen_be_printf(xendev, 0, "watching frontend path (%s) failed\n",
+ xendev->fe);
+ return -1;
+ }
+ xen_be_set_state(xendev, XenbusStateInitialising);
+
+ xen_be_backend_changed(xendev, NULL);
+ xen_be_frontend_changed(xendev, NULL);
+ return 0;
+}
+
+/*
+ * Try initialize xendev. Prepare everything the backend can do
+ * without synchronizing with the frontend. Fakes hotplug-status. No
+ * hotplug involved here because this is about userspace drivers, thus
+ * there are kernel backend devices which could invoke hotplug.
+ *
+ * Goes to InitWait on success.
+ */
+static int xen_be_try_init(struct XenDevice *xendev)
+{
+ int rc = 0;
+
+ if (!xendev->online) {
+ xen_be_printf(xendev, 1, "not online\n");
+ return -1;
+ }
+
+ if (xendev->ops->init)
+ rc = xendev->ops->init(xendev);
+ if (0 != rc) {
+ xen_be_printf(xendev, 1, "init() failed\n");
+ return rc;
+ }
+
+ xenstore_write_be_str(xendev, "hotplug-status", "connected");
+ xen_be_set_state(xendev, XenbusStateInitWait);
+ return 0;
+}
+
+/*
+ * Try to connect xendev. Depends on the frontend being ready
+ * for it (shared ring and evtchn info in xenstore, state being
+ * Initialised or Connected).
+ *
+ * Goes to Connected on success.
+ */
+static int xen_be_try_connect(struct XenDevice *xendev)
+{
+ int rc = 0;
+
+ if (xendev->fe_state != XenbusStateInitialised &&
+ xendev->fe_state != XenbusStateConnected) {
+ if (xendev->ops->flags & DEVOPS_FLAG_IGNORE_STATE) {
+ xen_be_printf(xendev, 2, "frontend not ready, ignoring\n");
+ } else {
+ xen_be_printf(xendev, 2, "frontend not ready (yet)\n");
+ return -1;
+ }
+ }
+
+ if (xendev->ops->connect)
+ rc = xendev->ops->connect(xendev);
+ if (0 != rc) {
+ xen_be_printf(xendev, 0, "connect() failed\n");
+ return rc;
+ }
+
+ xen_be_set_state(xendev, XenbusStateConnected);
+ return 0;
+}
+
+/*
+ * Teardown connection.
+ *
+ * Goes to Closed when done.
+ */
+static void xen_be_disconnect(struct XenDevice *xendev, enum xenbus_state state)
+{
+ if (xendev->be_state != XenbusStateClosing &&
+ xendev->be_state != XenbusStateClosed &&
+ xendev->ops->disconnect)
+ xendev->ops->disconnect(xendev);
+ if (xendev->be_state != state)
+ xen_be_set_state(xendev, state);
+}
+
+/*
+ * Try to reset xendev, for reconnection by another frontend instance.
+ */
+static int xen_be_try_reset(struct XenDevice *xendev)
+{
+ if (xendev->fe_state != XenbusStateInitialising)
+ return -1;
+
+ xen_be_printf(xendev, 1, "device reset (for re-connect)\n");
+ xen_be_set_state(xendev, XenbusStateInitialising);
+ return 0;
+}
+
+/*
+ * state change dispatcher function
+ */
+void xen_be_check_state(struct XenDevice *xendev)
+{
+ int rc = 0;
+
+ /* frontend may request shutdown from almost anywhere */
+ if (xendev->fe_state == XenbusStateClosing ||
+ xendev->fe_state == XenbusStateClosed) {
+ xen_be_disconnect(xendev, xendev->fe_state);
+ return;
+ }
+
+ /* check for possible backend state transitions */
+ for (;;) {
+ switch (xendev->be_state) {
+ case XenbusStateUnknown:
+ rc = xen_be_try_setup(xendev);
+ break;
+ case XenbusStateInitialising:
+ rc = xen_be_try_init(xendev);
+ break;
+ case XenbusStateInitWait:
+ rc = xen_be_try_connect(xendev);
+ break;
+ case XenbusStateClosed:
+ rc = xen_be_try_reset(xendev);
+ break;
+ default:
+ rc = -1;
+ }
+ if (0 != rc)
+ break;
+ }
+}
+
+/* ------------------------------------------------------------- */
+
+static int xenstore_scan(char *type, int dom, struct XenDevOps *ops)
+{
+ struct XenDevice *xendev;
+ char path[XEN_BUFSIZE], token[XEN_BUFSIZE];
+ char **dev = NULL, *dom0;
+ unsigned int cdev, j;
+
+ /* setup watch */
+ dom0 = xs_get_domain_path(xenstore, 0);
+ snprintf(token, sizeof(token), "be:%p:%d:%p", type, dom, ops);
+ snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom);
+ free(dom0);
+ if (!xs_watch(xenstore, path, token)) {
+ fprintf(stderr, "xen be: watching backend path (%s) failed\n", path);
+ return -1;
+ }
+
+ /* look for backends */
+ dev = xs_directory(xenstore, 0, path, &cdev);
+ if (!dev)
+ return 0;
+ for (j = 0; j < cdev; j++) {
+ xendev = xen_be_get_xendev(type, dom, atoi(dev[j]), ops);
+ if (NULL == xendev)
+ continue;
+ xen_be_check_state(xendev);
+ }
+ qemu_free(dev);
+ return 0;
+}
+
+static void xenstore_update_be(char *watch, char *type, int dom,
+ struct XenDevOps *ops)
+{
+ struct XenDevice *xendev;
+ char path[XEN_BUFSIZE], *dom0;
+ unsigned int len, dev;
+
+ dom0 = xs_get_domain_path(xenstore, 0);
+ len = snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom);
+ free(dom0);
+ if (0 != strncmp(path, watch, len))
+ return;
+ if (2 != sscanf(watch+len, "/%u/%255s", &dev, path)) {
+ strcpy(path, "");
+ if (1 != sscanf(watch+len, "/%u", &dev))
+ dev = -1;
+ }
+ if (-1 == dev)
+ return;
+
+ if (0) {
+ /* FIXME: detect devices being deleted from xenstore ... */
+ xen_be_del_xendev(dom, dev);
+ }
+
+ xendev = xen_be_get_xendev(type, dom, dev, ops);
+ if (NULL != xendev) {
+ xen_be_backend_changed(xendev, path);
+ xen_be_check_state(xendev);
+ }
+}
+
+static void xenstore_update_fe(char *watch, struct XenDevice *xendev)
+{
+ char *node;
+ unsigned int len;
+
+ len = strlen(xendev->fe);
+ if (0 != strncmp(xendev->fe, watch, len))
+ return;
+ if (watch[len] != '/')
+ return;
+ node = watch + len + 1;
+
+ xen_be_frontend_changed(xendev, node);
+ xen_be_check_state(xendev);
+}
+
+static void xenstore_update(void *unused)
+{
+ char **vec = NULL;
+ intptr_t type, ops, ptr;
+ unsigned int dom, count;
+
+ vec = xs_read_watch(xenstore, &count);
+ if (NULL == vec)
+ goto cleanup;
+
+ if (3 == sscanf(vec[XS_WATCH_TOKEN], "be:%" PRIxPTR ":%d:%" PRIxPTR,
+ &type, &dom, &ops))
+ xenstore_update_be(vec[XS_WATCH_PATH], (void*)type, dom, (void*)ops);
+ if (1 == sscanf(vec[XS_WATCH_TOKEN], "fe:%" PRIxPTR, &ptr))
+ xenstore_update_fe(vec[XS_WATCH_PATH], (void*)ptr);
+
+cleanup:
+ qemu_free(vec);
+}
+
+static void xen_be_evtchn_event(void *opaque)
+{
+ struct XenDevice *xendev = opaque;
+ evtchn_port_t port;
+
+ port = xc_evtchn_pending(xendev->evtchndev);
+ if (port != xendev->local_port) {
+ xen_be_printf(xendev, 0, "xc_evtchn_pending returned %d (expected %d)\n",
+ port, xendev->local_port);
+ return;
+ }
+ xc_evtchn_unmask(xendev->evtchndev, port);
+
+ if (xendev->ops->event)
+ xendev->ops->event(xendev);
+}
+
+/* -------------------------------------------------------------------- */
+
+int xen_be_init(void)
+{
+ xenstore = xs_daemon_open();
+ if (!xenstore) {
+ fprintf(stderr, "can't connect to xenstored\n");
+ return -1;
+ }
+
+ if (qemu_set_fd_handler(xs_fileno(xenstore), xenstore_update, NULL, NULL) < 0)
+ goto err;
+
+ xen_xc = xc_interface_open();
+ if (-1 == xen_xc) {
+ fprintf(stderr, "can't open xen interface\n");
+ goto err;
+ }
+ return 0;
+
+err:
+ qemu_set_fd_handler(xs_fileno(xenstore), NULL, NULL, NULL);
+ xs_daemon_close(xenstore);
+ xenstore = NULL;
+
+ return -1;
+}
+
+int xen_be_register(char *type, struct XenDevOps *ops)
+{
+ return xenstore_scan(type, xen_domid, ops);
+}
+
+int xen_be_bind_evtchn(struct XenDevice *xendev)
+{
+ if (xendev->local_port != -1)
+ return 0;
+ xendev->local_port = xc_evtchn_bind_interdomain
+ (xendev->evtchndev, xendev->dom, xendev->remote_port);
+ if (-1 == xendev->local_port) {
+ xen_be_printf(xendev, 0, "xc_evtchn_bind_interdomain failed\n");
+ return -1;
+ }
+ xen_be_printf(xendev, 2, "bind evtchn port %d\n", xendev->local_port);
+ qemu_set_fd_handler(xc_evtchn_fd(xendev->evtchndev),
+ xen_be_evtchn_event, NULL, xendev);
+ return 0;
+}
+
+void xen_be_unbind_evtchn(struct XenDevice *xendev)
+{
+ if (xendev->local_port == -1)
+ return;
+ qemu_set_fd_handler(xc_evtchn_fd(xendev->evtchndev), NULL, NULL, NULL);
+ xc_evtchn_unbind(xendev->evtchndev, xendev->local_port);
+ xen_be_printf(xendev, 2, "unbind evtchn port %d\n", xendev->local_port);
+ xendev->local_port = -1;
+}
+
+int xen_be_send_notify(struct XenDevice *xendev)
+{
+ return xc_evtchn_notify(xendev->evtchndev, xendev->local_port);
+}
+
+/*
+ * msg_level:
+ * 0 == errors.
+ * 1 == informative debug messages.
+ * 2 == noisy debug messages.
+ * 3 == will flood your log.
+ */
+void xen_be_printf(struct XenDevice *xendev, int msg_level, const char *fmt, ...)
+{
+ va_list args;
+
+ if (msg_level > xendev->debug)
+ return;
+ fprintf(stderr, "xen be: %s: ", xendev->name);
+ va_start(args, fmt);
+ vfprintf(stderr, fmt, args);
+ va_end(args);
+}
diff --git a/hw/xen_backend.h b/hw/xen_backend.h
new file mode 100644
index 0000000..3a79ad0
--- /dev/null
+++ b/hw/xen_backend.h
@@ -0,0 +1,86 @@
+#ifndef QEMU_XEN_BACKEND_H
+#define QEMU_XEN_BACKEND_H 1
+
+#include "xen_common.h"
+
+/* ------------------------------------------------------------- */
+
+#define XEN_BUFSIZE 1024
+
+struct XenDevice;
+
+/* driver uses grant tables -> open gntdev device (xendev->gnttabdev) */
+#define DEVOPS_FLAG_NEED_GNTDEV 1
+/* don't expect frontend doing correct state transitions (aka console quirk) */
+#define DEVOPS_FLAG_IGNORE_STATE 2
+
+struct XenDevOps {
+ size_t size;
+ uint32_t flags;
+ void (*alloc)(struct XenDevice *xendev);
+ int (*init)(struct XenDevice *xendev);
+ int (*connect)(struct XenDevice *xendev);
+ void (*event)(struct XenDevice *xendev);
+ void (*disconnect)(struct XenDevice *xendev);
+ int (*free)(struct XenDevice *xendev);
+ void (*backend_changed)(struct XenDevice *xendev, const char *node);
+ void (*frontend_changed)(struct XenDevice *xendev, const char *node);
+};
+
+struct XenDevice {
+ char *type;
+ int dom;
+ int dev;
+ char name[64];
+ int debug;
+
+ enum xenbus_state be_state;
+ enum xenbus_state fe_state;
+ int online;
+ char be[XEN_BUFSIZE];
+ char *fe;
+ char *protocol;
+ int remote_port;
+ int local_port;
+
+ int evtchndev;
+ int gnttabdev;
+
+ struct XenDevOps *ops;
+ TAILQ_ENTRY(XenDevice) next;
+};
+
+/* ------------------------------------------------------------- */
+
+/* variables */
+extern int xen_xc;
+extern struct xs_handle *xenstore;
+
+/* xenstore helper functions */
+int xenstore_write_str(const char *base, const char *node, const char *val);
+int xenstore_write_int(const char *base, const char *node, int ival);
+char *xenstore_read_str(const char *base, const char *node);
+int xenstore_read_int(const char *base, const char *node, int *ival);
+
+int xenstore_write_be_str(struct XenDevice *xendev, const char *node, const char *val);
+int xenstore_write_be_int(struct XenDevice *xendev, const char *node, int ival);
+char *xenstore_read_be_str(struct XenDevice *xendev, const char *node);
+int xenstore_read_be_int(struct XenDevice *xendev, const char *node, int *ival);
+char *xenstore_read_fe_str(struct XenDevice *xendev, const char *node);
+int xenstore_read_fe_int(struct XenDevice *xendev, const char *node, int *ival);
+
+const char *xenbus_strstate(enum xenbus_state state);
+struct XenDevice *xen_be_find_xendev(char *type, int dom, int dev);
+void xen_be_check_state(struct XenDevice *xendev);
+
+/* xen backend driver bits */
+int xen_be_init(void);
+int xen_be_register(char *type, struct XenDevOps *ops);
+int xen_be_set_state(struct XenDevice *xendev, enum xenbus_state state);
+int xen_be_bind_evtchn(struct XenDevice *xendev);
+void xen_be_unbind_evtchn(struct XenDevice *xendev);
+int xen_be_send_notify(struct XenDevice *xendev);
+void xen_be_printf(struct XenDevice *xendev, int msg_level, const char *fmt, ...)
+ __attribute__ ((format(printf, 3, 4)));
+
+#endif /* QEMU_XEN_BACKEND_H */
diff --git a/hw/xen_common.h b/hw/xen_common.h
new file mode 100644
index 0000000..a9e1d26
--- /dev/null
+++ b/hw/xen_common.h
@@ -0,0 +1,34 @@
+#ifndef QEMU_XEN_COMMON_H
+#define QEMU_XEN_COMMON_H 1
+
+#include <stddef.h>
+#include <inttypes.h>
+
+#include <xenctrl.h>
+#include <xs.h>
+#include <xen/io/xenbus.h>
+
+#include "hw.h"
+#include "xen.h"
+#include "sys-queue.h" /* BSD list implementation */
+
+/*
+ * tweaks needed to build with different xen versions
+ * 0x00030205 -> 3.1.0
+ * 0x00030207 -> 3.2.0
+ * 0x00030208 -> unstable
+ */
+#include <xen/xen-compat.h>
+#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030205
+# define evtchn_port_or_error_t int
+#endif
+#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030207
+# define xc_map_foreign_pages xc_map_foreign_batch
+#endif
+#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030208
+# define xen_mb() mb()
+# define xen_rmb() rmb()
+# define xen_wmb() wmb()
+#endif
+
+#endif /* QEMU_XEN_COMMON_H */
diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
index 05f0e67..2eceb8e 100644
--- a/hw/xen_machine_pv.c
+++ b/hw/xen_machine_pv.c
@@ -25,7 +25,7 @@
#include "hw.h"
#include "boards.h"
-#include <xenctrl.h>
+#include "xen_backend.h"
/* -------------------------------------------------------------------- */
/* variables */
@@ -81,6 +81,11 @@ static int xen_init(void)
xen_emulate = 1;
}
+ if (-1 == xen_be_init()) {
+ fprintf(stderr, "%s: xen backend core setup failed\n", __FUNCTION__);
+ return -1;
+ }
+
return 0;
}
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 07/13] xen: add console backend driver.
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
` (5 preceding siblings ...)
2008-08-21 16:27 ` [Qemu-devel] [PATCH 06/13] xen: backend driver core Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 16:27 ` [Qemu-devel] [PATCH 08/13] xen: add framebuffer " Gerd Hoffmann
` (7 subsequent siblings)
14 siblings, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
This patch adds a xenconsole backend driver. It it based on current
xen-unstable code. It has been changed to make use of the common
backend driver code.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
Makefile.target | 1 +
hw/xen_backend.h | 3 +
hw/xen_console.c | 271 +++++++++++++++++++++++++++++++++++++++++++++++++++
hw/xen_machine_pv.c | 3 +
4 files changed, 278 insertions(+), 0 deletions(-)
create mode 100644 hw/xen_console.c
diff --git a/Makefile.target b/Makefile.target
index 78a6402..92ad109 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -517,6 +517,7 @@ endif
# xen backend driver support
XEN_OBJS := xen_machine_pv.o xen_backend.o
+XEN_OBJS += xen_console.o
ifeq ($(CONFIG_XEN), yes)
OBJS += $(XEN_OBJS)
LIBS += $(XEN_LIBS)
diff --git a/hw/xen_backend.h b/hw/xen_backend.h
index 3a79ad0..10bd1a6 100644
--- a/hw/xen_backend.h
+++ b/hw/xen_backend.h
@@ -83,4 +83,7 @@ int xen_be_send_notify(struct XenDevice *xendev);
void xen_be_printf(struct XenDevice *xendev, int msg_level, const char *fmt, ...)
__attribute__ ((format(printf, 3, 4)));
+/* actual backend drivers */
+struct XenDevOps xen_console_ops; /* xen_console.c */
+
#endif /* QEMU_XEN_BACKEND_H */
diff --git a/hw/xen_console.c b/hw/xen_console.c
new file mode 100644
index 0000000..2ea88f6
--- /dev/null
+++ b/hw/xen_console.c
@@ -0,0 +1,271 @@
+/*
+ * Copyright (C) International Business Machines Corp., 2005
+ * Author(s): Anthony Liguori <aliguori@us.ibm.com>
+ *
+ * Copyright (C) Red Hat 2007
+ *
+ * Xen Console
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; under version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <stdlib.h>
+#include <errno.h>
+#include <string.h>
+#include <sys/select.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <termios.h>
+#include <stdarg.h>
+#include <sys/mman.h>
+#include <xs.h>
+#include <xen/io/console.h>
+#include <xenctrl.h>
+
+#include "hw.h"
+#include "sysemu.h"
+#include "qemu-char.h"
+#include "xen_backend.h"
+
+#define dolog(val, fmt, ...) fprintf(stderr, fmt "\n", ## __VA_ARGS__)
+
+struct buffer {
+ uint8_t *data;
+ size_t consumed;
+ size_t size;
+ size_t capacity;
+ size_t max_capacity;
+};
+
+struct XenConsole {
+ struct XenDevice xendev; /* must be first */
+ struct buffer buffer;
+ char console[XEN_BUFSIZE];
+ int ring_ref;
+ void *sring;
+ CharDriverState *chr;
+ int backlog;
+};
+
+static void buffer_append(struct XenConsole *con)
+{
+ struct buffer *buffer = &con->buffer;
+ XENCONS_RING_IDX cons, prod, size;
+ struct xencons_interface *intf = con->sring;
+
+ cons = intf->out_cons;
+ prod = intf->out_prod;
+ xen_mb();
+
+ size = prod - cons;
+ if ((size == 0) || (size > sizeof(intf->out)))
+ return;
+
+ if ((buffer->capacity - buffer->size) < size) {
+ buffer->capacity += (size + 1024);
+ buffer->data = qemu_realloc(buffer->data, buffer->capacity);
+ if (buffer->data == NULL) {
+ dolog(LOG_ERR, "Memory allocation failed");
+ exit(ENOMEM);
+ }
+ }
+
+ while (cons != prod)
+ buffer->data[buffer->size++] = intf->out[
+ MASK_XENCONS_IDX(cons++, intf->out)];
+
+ xen_mb();
+ intf->out_cons = cons;
+ xen_be_send_notify(&con->xendev);
+
+ if (buffer->max_capacity &&
+ buffer->size > buffer->max_capacity) {
+ /* Discard the middle of the data. */
+
+ size_t over = buffer->size - buffer->max_capacity;
+ uint8_t *maxpos = buffer->data + buffer->max_capacity;
+
+ memmove(maxpos - over, maxpos, over);
+ buffer->data = qemu_realloc(buffer->data, buffer->max_capacity);
+ buffer->size = buffer->capacity = buffer->max_capacity;
+
+ if (buffer->consumed > buffer->max_capacity - over)
+ buffer->consumed = buffer->max_capacity - over;
+ }
+}
+
+static void buffer_advance(struct buffer *buffer, size_t len)
+{
+ buffer->consumed += len;
+ if (buffer->consumed == buffer->size) {
+ buffer->consumed = 0;
+ buffer->size = 0;
+ }
+}
+
+static int ring_free_bytes(struct XenConsole *con)
+{
+ struct xencons_interface *intf = con->sring;
+ XENCONS_RING_IDX cons, prod, space;
+
+ cons = intf->in_cons;
+ prod = intf->in_prod;
+ xen_mb();
+
+ space = prod - cons;
+ if (space > sizeof(intf->in))
+ return 0; /* ring is screwed: ignore it */
+
+ return (sizeof(intf->in) - space);
+}
+
+static int xencons_can_receive(void *opaque)
+{
+ struct XenConsole *con = opaque;
+ return ring_free_bytes(con);
+}
+
+static void xencons_receive(void *opaque, const uint8_t *buf, int len)
+{
+ struct XenConsole *con = opaque;
+ struct xencons_interface *intf = con->sring;
+ XENCONS_RING_IDX prod;
+ int i, max;
+
+ max = ring_free_bytes(con);
+ /* The can_receive() func limits this, but check again anyway */
+ if (max < len)
+ len = max;
+
+ prod = intf->in_prod;
+ for (i = 0; i < len; i++) {
+ intf->in[MASK_XENCONS_IDX(prod++, intf->in)] =
+ buf[i];
+ }
+ xen_wmb();
+ intf->in_prod = prod;
+ xen_be_send_notify(&con->xendev);
+}
+
+static void xencons_send(struct XenConsole *con)
+{
+ ssize_t len, size;
+
+ size = con->buffer.size - con->buffer.consumed;
+ len = qemu_chr_write(con->chr, con->buffer.data + con->buffer.consumed,
+ size);
+ if (len < 1) {
+ if (!con->backlog) {
+ con->backlog = 1;
+ xen_be_printf(&con->xendev, 1, "backlog piling up, nobody listening?\n");
+ }
+ } else {
+ buffer_advance(&con->buffer, len);
+ if (con->backlog && len == size) {
+ con->backlog = 0;
+ xen_be_printf(&con->xendev, 1, "backlog is gone\n");
+ }
+ }
+}
+
+/* -------------------------------------------------------------------- */
+
+static int con_init(struct XenDevice *xendev)
+{
+ struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
+ char *type, *dom;
+
+ if (!serial_hds[con->xendev.dev]) {
+ xen_be_printf(xendev, 1, "serial line %d not configured\n", con->xendev.dev);
+ return -1;
+ }
+
+ /* setup */
+ dom = xs_get_domain_path(xenstore, con->xendev.dom);
+ snprintf(con->console, sizeof(con->console), "%s/console", dom);
+ free(dom);
+ con->chr = serial_hds[con->xendev.dev];
+
+ type = xenstore_read_str(con->console, "type");
+ if (!type || 0 != strcmp(type, "ioemu")) {
+ xen_be_printf(xendev, 1, "not for me (type=%s)\n", type);
+ return -1;
+ }
+
+ return 0;
+}
+
+static int con_connect(struct XenDevice *xendev)
+{
+ struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
+ int limit;
+
+ if (-1 == xenstore_read_int(con->console, "ring-ref", &con->ring_ref))
+ return -1;
+ if (-1 == xenstore_read_int(con->console, "port", &con->xendev.remote_port))
+ return -1;
+ if (0 == xenstore_read_int(con->console, "limit", &limit))
+ con->buffer.max_capacity = limit;
+
+ con->sring = xc_map_foreign_range(xen_xc, con->xendev.dom,
+ XC_PAGE_SIZE,
+ PROT_READ|PROT_WRITE,
+ con->ring_ref);
+ if (!con->sring)
+ return -1;
+
+ xen_be_bind_evtchn(&con->xendev);
+ qemu_chr_add_handlers(con->chr, xencons_can_receive, xencons_receive,
+ NULL, con);
+
+ xen_be_printf(xendev, 1, "ring mfn %d, remote port %d, local port %d, limit %zd\n",
+ con->ring_ref,
+ con->xendev.remote_port,
+ con->xendev.local_port,
+ con->buffer.max_capacity);
+ return 0;
+}
+
+static void con_disconnect(struct XenDevice *xendev)
+{
+ struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
+
+ qemu_chr_add_handlers(con->chr, NULL, NULL, NULL, NULL);
+ xen_be_unbind_evtchn(&con->xendev);
+
+ if (con->sring) {
+ munmap(con->sring, XC_PAGE_SIZE);
+ con->sring = NULL;
+ }
+}
+
+static void con_event(struct XenDevice *xendev)
+{
+ struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
+
+ buffer_append(con);
+ if (con->buffer.size - con->buffer.consumed)
+ xencons_send(con);
+}
+
+/* -------------------------------------------------------------------- */
+
+struct XenDevOps xen_console_ops = {
+ .size = sizeof(struct XenConsole),
+ .flags = DEVOPS_FLAG_IGNORE_STATE,
+ .init = con_init,
+ .connect = con_connect,
+ .event = con_event,
+ .disconnect = con_disconnect,
+};
diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
index 2eceb8e..93f66d0 100644
--- a/hw/xen_machine_pv.c
+++ b/hw/xen_machine_pv.c
@@ -97,6 +97,9 @@ static int xen_init_pv(DisplayState *ds)
if (rc < 0)
return rc;
+ /* xenbus backend drivers */
+ xen_be_register("console", &xen_console_ops);
+
return 0;
}
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 08/13] xen: add framebuffer backend driver
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
` (6 preceding siblings ...)
2008-08-21 16:27 ` [Qemu-devel] [PATCH 07/13] xen: add console backend driver Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 16:27 ` [Qemu-devel] [PATCH 09/13] xen: add block device " Gerd Hoffmann
` (6 subsequent siblings)
14 siblings, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
This patch adds a frsamebuffer (and kbd+mouse) backend driver. It
it based on current xen-unstable code. It has been changed to make
use of the common backend driver code. It also has been changed to
compile with xen headers older than unstable (aka son-to-be 4.0).
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
Makefile.target | 2 +-
hw/xen_backend.h | 4 +
hw/xen_framebuffer.c | 975 ++++++++++++++++++++++++++++++++++++++++++++++++++
hw/xen_machine_pv.c | 5 +
4 files changed, 985 insertions(+), 1 deletions(-)
create mode 100644 hw/xen_framebuffer.c
diff --git a/Makefile.target b/Makefile.target
index 92ad109..bcf4d93 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -517,7 +517,7 @@ endif
# xen backend driver support
XEN_OBJS := xen_machine_pv.o xen_backend.o
-XEN_OBJS += xen_console.o
+XEN_OBJS += xen_console.o xen_framebuffer.o
ifeq ($(CONFIG_XEN), yes)
OBJS += $(XEN_OBJS)
LIBS += $(XEN_LIBS)
diff --git a/hw/xen_backend.h b/hw/xen_backend.h
index 10bd1a6..d8bc23a 100644
--- a/hw/xen_backend.h
+++ b/hw/xen_backend.h
@@ -85,5 +85,9 @@ void xen_be_printf(struct XenDevice *xendev, int msg_level, const char *fmt, ...
/* actual backend drivers */
struct XenDevOps xen_console_ops; /* xen_console.c */
+struct XenDevOps xen_kbdmouse_ops; /* xen_framebuffer.c */
+struct XenDevOps xen_framebuffer_ops; /* xen_framebuffer.c */
+
+void xen_set_display(int domid, DisplayState *ds);
#endif /* QEMU_XEN_BACKEND_H */
diff --git a/hw/xen_framebuffer.c b/hw/xen_framebuffer.c
new file mode 100644
index 0000000..a36d364
--- /dev/null
+++ b/hw/xen_framebuffer.c
@@ -0,0 +1,975 @@
+/*
+ * xen paravirt framebuffer backend
+ *
+ * Copyright IBM, Corp. 2005-2006
+ * Copyright Red Hat, Inc. 2006-2008
+ *
+ * Authors:
+ * Anthony Liguori <aliguori@us.ibm.com>,
+ * Markus Armbruster <armbru@redhat.com>,
+ * Daniel P. Berrange <berrange@redhat.com>,
+ * Pat Campbell <plc@novell.com>,
+ * Gerd Hoffmann <kraxel@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; under version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <stdarg.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdbool.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+
+#include <xs.h>
+#include <xenctrl.h>
+#include <xen/event_channel.h>
+#include <xen/io/xenbus.h>
+#include <xen/io/fbif.h>
+#include <xen/io/kbdif.h>
+#include <xen/io/protocols.h>
+
+#include "hw.h"
+#include "sysemu.h"
+#include "console.h"
+#include "qemu-char.h"
+#include "xen_backend.h"
+
+#ifndef BTN_LEFT
+#define BTN_LEFT 0x110 /* from <linux/input.h> */
+#endif
+
+/* -------------------------------------------------------------------- */
+
+struct common {
+ struct XenDevice xendev; /* must be first */
+ void *page;
+ DisplayState *ds;
+};
+
+struct XenInput {
+ struct common c;
+ int abs_pointer_wanted; /* Whether guest supports absolute pointer */
+ int button_state; /* Last seen pointer button state */
+ int extended;
+ QEMUPutMouseEntry *qmouse;
+};
+
+#define UP_QUEUE 8
+
+struct XenFB {
+ struct common c;
+ size_t fb_len;
+ int row_stride;
+ int depth;
+ int width;
+ int height;
+ int offset;
+ void *pixels;
+ int fbpages;
+ int feature_update;
+ int refresh_period;
+ int bug_trigger;
+ int have_console;
+
+ struct {
+ int x,y,w,h;
+ } up_rects[UP_QUEUE];
+ int up_count;
+ int up_fullscreen;
+};
+
+/* -------------------------------------------------------------------- */
+
+static int common_bind(struct common *c)
+{
+ int mfn;
+
+ if (-1 == xenstore_read_fe_int(&c->xendev, "page-ref", &mfn))
+ return -1;
+ if (-1 == xenstore_read_fe_int(&c->xendev, "event-channel", &c->xendev.remote_port))
+ return -1;
+
+ xen_be_bind_evtchn(&c->xendev);
+ xen_be_printf(&c->xendev, 1, "ring mfn %d, remote-port %d, local-port %d\n",
+ mfn, c->xendev.remote_port, c->xendev.local_port);
+
+ c->page = xc_map_foreign_range(xen_xc, c->xendev.dom,
+ XC_PAGE_SIZE,
+ PROT_READ | PROT_WRITE, mfn);
+ if (NULL == c->page)
+ return -1;
+ return 0;
+}
+
+static void common_unbind(struct common *c)
+{
+ if (c->page) {
+ munmap(c->page, XC_PAGE_SIZE);
+ c->page = NULL;
+ }
+ xen_be_unbind_evtchn(&c->xendev);
+}
+
+/* -------------------------------------------------------------------- */
+
+#if 0
+/*
+ * These two tables are not needed any more, but left in here
+ * intentionally as documentation, to show how scancode2linux[]
+ * was generated.
+ *
+ * Tables to map from scancode to Linux input layer keycode.
+ * Scancodes are hardware-specific. These maps assumes a
+ * standard AT or PS/2 keyboard which is what QEMU feeds us.
+ */
+static const unsigned char atkbd_set2_keycode[512] = {
+
+ 0, 67, 65, 63, 61, 59, 60, 88, 0, 68, 66, 64, 62, 15, 41,117,
+ 0, 56, 42, 93, 29, 16, 2, 0, 0, 0, 44, 31, 30, 17, 3, 0,
+ 0, 46, 45, 32, 18, 5, 4, 95, 0, 57, 47, 33, 20, 19, 6,183,
+ 0, 49, 48, 35, 34, 21, 7,184, 0, 0, 50, 36, 22, 8, 9,185,
+ 0, 51, 37, 23, 24, 11, 10, 0, 0, 52, 53, 38, 39, 25, 12, 0,
+ 0, 89, 40, 0, 26, 13, 0, 0, 58, 54, 28, 27, 0, 43, 0, 85,
+ 0, 86, 91, 90, 92, 0, 14, 94, 0, 79,124, 75, 71,121, 0, 0,
+ 82, 83, 80, 76, 77, 72, 1, 69, 87, 78, 81, 74, 55, 73, 70, 99,
+
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 217,100,255, 0, 97,165, 0, 0,156, 0, 0, 0, 0, 0, 0,125,
+ 173,114, 0,113, 0, 0, 0,126,128, 0, 0,140, 0, 0, 0,127,
+ 159, 0,115, 0,164, 0, 0,116,158, 0,150,166, 0, 0, 0,142,
+ 157, 0, 0, 0, 0, 0, 0, 0,155, 0, 98, 0, 0,163, 0, 0,
+ 226, 0, 0, 0, 0, 0, 0, 0, 0,255, 96, 0, 0, 0,143, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0,107, 0,105,102, 0, 0,112,
+ 110,111,108,112,106,103, 0,119, 0,118,109, 0, 99,104,119, 0,
+
+};
+
+static const unsigned char atkbd_unxlate_table[128] = {
+
+ 0,118, 22, 30, 38, 37, 46, 54, 61, 62, 70, 69, 78, 85,102, 13,
+ 21, 29, 36, 45, 44, 53, 60, 67, 68, 77, 84, 91, 90, 20, 28, 27,
+ 35, 43, 52, 51, 59, 66, 75, 76, 82, 14, 18, 93, 26, 34, 33, 42,
+ 50, 49, 58, 65, 73, 74, 89,124, 17, 41, 88, 5, 6, 4, 12, 3,
+ 11, 2, 10, 1, 9,119,126,108,117,125,123,107,115,116,121,105,
+ 114,122,112,113,127, 96, 97,120, 7, 15, 23, 31, 39, 47, 55, 63,
+ 71, 79, 86, 94, 8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 87,111,
+ 19, 25, 57, 81, 83, 92, 95, 98, 99,100,101,103,104,106,109,110
+
+};
+#endif
+
+/*
+ * for (i = 0; i < 128; i++) {
+ * scancode2linux[i] = atkbd_set2_keycode[atkbd_unxlate_table[i]];
+ * scancode2linux[i | 0x80] = atkbd_set2_keycode[atkbd_unxlate_table[i] | 0x80];
+ * }
+ */
+static const unsigned char scancode2linux[512] = {
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
+ 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
+ 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
+ 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
+ 80, 81, 82, 83, 99, 0, 86, 87, 88,117, 0, 0, 95,183,184,185,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 93, 0, 0, 89, 0, 0, 85, 91, 90, 92, 0, 94, 0,124,121, 0,
+
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 165, 0, 0, 0, 0, 0, 0, 0, 0,163, 0, 0, 96, 97, 0, 0,
+ 113,140,164, 0,166, 0, 0, 0, 0, 0,255, 0, 0, 0,114, 0,
+ 115, 0,150, 0, 0, 98,255, 99,100, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0,119,119,102,103,104, 0,105,112,106,118,107,
+ 108,109,110,111, 0, 0, 0, 0, 0, 0, 0,125,126,127,116,142,
+ 0, 0, 0,143, 0,217,156,173,128,159,158,157,155,226, 0,112,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+};
+
+/* Send an event to the keyboard frontend driver */
+static int xenfb_kbd_event(struct XenInput *xenfb,
+ union xenkbd_in_event *event)
+{
+ struct xenkbd_page *page = xenfb->c.page;
+ uint32_t prod;
+
+ if (xenfb->c.xendev.be_state != XenbusStateConnected)
+ return 0;
+ if (!page)
+ return 0;
+
+ prod = page->in_prod;
+ if (prod - page->in_cons == XENKBD_IN_RING_LEN) {
+ errno = EAGAIN;
+ return -1;
+ }
+
+ xen_mb(); /* ensure ring space available */
+ XENKBD_IN_RING_REF(page, prod) = *event;
+ xen_wmb(); /* ensure ring contents visible */
+ page->in_prod = prod + 1;
+ return xen_be_send_notify(&xenfb->c.xendev);
+}
+
+/* Send a keyboard (or mouse button) event */
+static int xenfb_send_key(struct XenInput *xenfb, bool down, int keycode)
+{
+ union xenkbd_in_event event;
+
+ memset(&event, 0, XENKBD_IN_EVENT_SIZE);
+ event.type = XENKBD_TYPE_KEY;
+ event.key.pressed = down ? 1 : 0;
+ event.key.keycode = keycode;
+
+ return xenfb_kbd_event(xenfb, &event);
+}
+
+/* Send a relative mouse movement event */
+static int xenfb_send_motion(struct XenInput *xenfb,
+ int rel_x, int rel_y, int rel_z)
+{
+ union xenkbd_in_event event;
+
+ memset(&event, 0, XENKBD_IN_EVENT_SIZE);
+ event.type = XENKBD_TYPE_MOTION;
+ event.motion.rel_x = rel_x;
+ event.motion.rel_y = rel_y;
+#if __XEN_LATEST_INTERFACE_VERSION__ >= 0x00030207
+ event.motion.rel_z = rel_z;
+#endif
+
+ return xenfb_kbd_event(xenfb, &event);
+}
+
+/* Send an absolute mouse movement event */
+static int xenfb_send_position(struct XenInput *xenfb,
+ int abs_x, int abs_y, int z)
+{
+ union xenkbd_in_event event;
+
+ memset(&event, 0, XENKBD_IN_EVENT_SIZE);
+ event.type = XENKBD_TYPE_POS;
+ event.pos.abs_x = abs_x;
+ event.pos.abs_y = abs_y;
+#if __XEN_LATEST_INTERFACE_VERSION__ == 0x00030207
+ event.pos.abs_z = z;
+#endif
+#if __XEN_LATEST_INTERFACE_VERSION__ >= 0x00030208
+ event.pos.rel_z = z;
+#endif
+
+ return xenfb_kbd_event(xenfb, &event);
+}
+
+/*
+ * Send a key event from the client to the guest OS
+ * QEMU gives us a raw scancode from an AT / PS/2 style keyboard.
+ * We have to turn this into a Linux Input layer keycode.
+ *
+ * Extra complexity from the fact that with extended scancodes
+ * (like those produced by arrow keys) this method gets called
+ * twice, but we only want to send a single event. So we have to
+ * track the '0xe0' scancode state & collapse the extended keys
+ * as needed.
+ *
+ * Wish we could just send scancodes straight to the guest which
+ * already has code for dealing with this...
+ */
+static void xenfb_key_event(void *opaque, int scancode)
+{
+ struct XenInput *xenfb = opaque;
+ int down = 1;
+
+ if (scancode == 0xe0) {
+ xenfb->extended = 1;
+ return;
+ } else if (scancode & 0x80) {
+ scancode &= 0x7f;
+ down = 0;
+ }
+ if (xenfb->extended) {
+ scancode |= 0x80;
+ xenfb->extended = 0;
+ }
+ xenfb_send_key(xenfb, down, scancode2linux[scancode]);
+}
+
+/*
+ * Send a mouse event from the client to the guest OS
+ *
+ * The QEMU mouse can be in either relative, or absolute mode.
+ * Movement is sent separately from button state, which has to
+ * be encoded as virtual key events. We also don't actually get
+ * given any button up/down events, so have to track changes in
+ * the button state.
+ */
+static void xenfb_mouse_event(void *opaque,
+ int dx, int dy, int dz, int button_state)
+{
+ struct XenInput *xenfb = opaque;
+ int i;
+
+ if (xenfb->abs_pointer_wanted)
+ xenfb_send_position(xenfb,
+ dx * (xenfb->c.ds->width - 1) / 0x7fff,
+ dy * (xenfb->c.ds->height - 1) / 0x7fff,
+ dz);
+ else
+ xenfb_send_motion(xenfb, dx, dy, dz);
+
+ for (i = 0 ; i < 8 ; i++) {
+ int lastDown = xenfb->button_state & (1 << i);
+ int down = button_state & (1 << i);
+ if (down == lastDown)
+ continue;
+
+ if (xenfb_send_key(xenfb, down, BTN_LEFT+i) < 0)
+ return;
+ }
+ xenfb->button_state = button_state;
+}
+
+static int input_init(struct XenDevice *xendev)
+{
+ struct XenInput *in = container_of(xendev, struct XenInput, c.xendev);
+
+ if (!in->c.ds) {
+ /* xen_set_display() below will set that and trigger us again */
+ xen_be_printf(xendev, 1, "ds not set (yet)\n");
+ return -1;
+ }
+
+ xenstore_write_be_int(xendev, "feature-abs-pointer", 1);
+ return 0;
+}
+
+static int input_connect(struct XenDevice *xendev)
+{
+ struct XenInput *in = container_of(xendev, struct XenInput, c.xendev);
+ int rc;
+
+ if (-1 == xenstore_read_fe_int(xendev, "request-abs-pointer", &in->abs_pointer_wanted))
+ in->abs_pointer_wanted = 0;
+
+ rc = common_bind(&in->c);
+ if (0 != rc)
+ return rc;
+
+ qemu_add_kbd_event_handler(xenfb_key_event, in);
+ in->qmouse = qemu_add_mouse_event_handler(xenfb_mouse_event, in,
+ in->abs_pointer_wanted,
+ "Xen PVFB Mouse");
+ return 0;
+}
+
+static void input_disconnect(struct XenDevice *xendev)
+{
+ struct XenInput *in = container_of(xendev, struct XenInput, c.xendev);
+
+ if (in->qmouse) {
+ qemu_remove_mouse_event_handler(in->qmouse);
+ in->qmouse = NULL;
+ }
+ qemu_add_kbd_event_handler(NULL, NULL);
+ common_unbind(&in->c);
+}
+
+static void input_event(struct XenDevice *xendev)
+{
+ struct XenInput *xenfb = container_of(xendev, struct XenInput, c.xendev);
+ struct xenkbd_page *page = xenfb->c.page;
+
+ /* We don't understand any keyboard events, so just ignore them. */
+ if (page->out_prod == page->out_cons)
+ return;
+ page->out_cons = page->out_prod;
+ xen_be_send_notify(&xenfb->c.xendev);
+}
+
+/* -------------------------------------------------------------------- */
+
+static void xenfb_copy_mfns(int mode, int count, unsigned long *dst, void *src)
+{
+ uint32_t *src32 = src;
+ uint64_t *src64 = src;
+ int i;
+
+ for (i = 0; i < count; i++)
+ dst[i] = (mode == 32) ? src32[i] : src64[i];
+}
+
+static int xenfb_map_fb(struct XenFB *xenfb)
+{
+ struct xenfb_page *page = xenfb->c.page;
+ char *protocol = xenfb->c.xendev.protocol;
+ int n_fbdirs;
+ unsigned long *pgmfns = NULL;
+ unsigned long *fbmfns = NULL;
+ void *map, *pd;
+ int mode, ret = -1;
+
+ /* default to native */
+ pd = page->pd;
+ mode = sizeof(unsigned long) * 8;
+
+ if (!protocol) {
+ /*
+ * Undefined protocol, some guesswork needed.
+ *
+ * Old frontends which don't set the protocol use
+ * one page directory only, thus pd[1] must be zero.
+ * pd[1] of the 32bit struct layout and the lower
+ * 32 bits of pd[0] of the 64bit struct layout have
+ * the same location, so we can check that ...
+ */
+ uint32_t *ptr32 = NULL;
+ uint32_t *ptr64 = NULL;
+#if defined(__i386__)
+ ptr32 = (void*)page->pd;
+ ptr64 = ((void*)page->pd) + 4;
+#elif defined(__x86_64__)
+ ptr32 = ((void*)page->pd) - 4;
+ ptr64 = (void*)page->pd;
+#endif
+ if (ptr32) {
+ if (0 == ptr32[1]) {
+ mode = 32;
+ pd = ptr32;
+ } else {
+ mode = 64;
+ pd = ptr64;
+ }
+ }
+#if defined(__x86_64__)
+ } else if (0 == strcmp(protocol, XEN_IO_PROTO_ABI_X86_32)) {
+ /* 64bit dom0, 32bit domU */
+ mode = 32;
+ pd = ((void*)page->pd) - 4;
+#elif defined(__i386__)
+ } else if (0 == strcmp(protocol, XEN_IO_PROTO_ABI_X86_64)) {
+ /* 32bit dom0, 64bit domU */
+ mode = 64;
+ pd = ((void*)page->pd) + 4;
+#endif
+ }
+
+ if (xenfb->pixels) {
+ munmap(xenfb->pixels, xenfb->fbpages * XC_PAGE_SIZE);
+ xenfb->pixels = NULL;
+ }
+
+ xenfb->fbpages = (xenfb->fb_len + (XC_PAGE_SIZE - 1)) / XC_PAGE_SIZE;
+ n_fbdirs = xenfb->fbpages * mode / 8;
+ n_fbdirs = (n_fbdirs + (XC_PAGE_SIZE - 1)) / XC_PAGE_SIZE;
+
+ pgmfns = qemu_mallocz(sizeof(unsigned long) * n_fbdirs);
+ fbmfns = qemu_mallocz(sizeof(unsigned long) * xenfb->fbpages);
+ if (!pgmfns || !fbmfns)
+ goto out;
+
+ xenfb_copy_mfns(mode, n_fbdirs, pgmfns, pd);
+ map = xc_map_foreign_pages(xen_xc, xenfb->c.xendev.dom,
+ PROT_READ, pgmfns, n_fbdirs);
+ if (map == NULL)
+ goto out;
+ xenfb_copy_mfns(mode, xenfb->fbpages, fbmfns, map);
+ munmap(map, n_fbdirs * XC_PAGE_SIZE);
+
+ xenfb->pixels = xc_map_foreign_pages(xen_xc, xenfb->c.xendev.dom,
+ PROT_READ | PROT_WRITE, fbmfns, xenfb->fbpages);
+ if (xenfb->pixels == NULL)
+ goto out;
+
+ ret = 0; /* all is fine */
+
+out:
+ qemu_free(pgmfns);
+ qemu_free(fbmfns);
+ return ret;
+}
+
+static int xenfb_configure_fb(struct XenFB *xenfb, size_t fb_len_lim,
+ int width, int height, int depth,
+ size_t fb_len, int offset, int row_stride)
+{
+ size_t mfn_sz = sizeof(*((struct xenfb_page *)0)->pd);
+ size_t pd_len = sizeof(((struct xenfb_page *)0)->pd) / mfn_sz;
+ size_t fb_pages = pd_len * XC_PAGE_SIZE / mfn_sz;
+ size_t fb_len_max = fb_pages * XC_PAGE_SIZE;
+ int max_width, max_height;
+
+ if (fb_len_lim > fb_len_max) {
+ xen_be_printf(&xenfb->c.xendev, 0, "fb size limit %zu exceeds %zu, corrected\n",
+ fb_len_lim, fb_len_max);
+ fb_len_lim = fb_len_max;
+ }
+ if (fb_len_lim && fb_len > fb_len_lim) {
+ xen_be_printf(&xenfb->c.xendev, 0, "frontend fb size %zu limited to %zu\n",
+ fb_len, fb_len_lim);
+ fb_len = fb_len_lim;
+ }
+ if (depth != 8 && depth != 16 && depth != 24 && depth != 32) {
+ xen_be_printf(&xenfb->c.xendev, 0, "can't handle frontend fb depth %d\n",
+ depth);
+ return -1;
+ }
+ if (row_stride < 0 || row_stride > fb_len) {
+ xen_be_printf(&xenfb->c.xendev, 0, "invalid frontend stride %d\n", row_stride);
+ return -1;
+ }
+ max_width = row_stride / (depth / 8);
+ if (width < 0 || width > max_width) {
+ xen_be_printf(&xenfb->c.xendev, 0, "invalid frontend width %d limited to %d\n",
+ width, max_width);
+ width = max_width;
+ }
+ if (offset < 0 || offset >= fb_len) {
+ xen_be_printf(&xenfb->c.xendev, 0, "invalid frontend offset %d (max %zu)\n",
+ offset, fb_len - 1);
+ return -1;
+ }
+ max_height = (fb_len - offset) / row_stride;
+ if (height < 0 || height > max_height) {
+ xen_be_printf(&xenfb->c.xendev, 0, "invalid frontend height %d limited to %d\n",
+ height, max_height);
+ height = max_height;
+ }
+ xenfb->fb_len = fb_len;
+ xenfb->row_stride = row_stride;
+ xenfb->depth = depth;
+ xenfb->width = width;
+ xenfb->height = height;
+ xenfb->offset = offset;
+ xen_be_printf(&xenfb->c.xendev, 1, "framebuffer %dx%dx%d offset %d stride %d\n",
+ width, height, depth, offset, row_stride);
+ return 0;
+}
+
+/* A convenient function for munging pixels between different depths */
+#define BLT(SRC_T,DST_T,RSB,GSB,BSB,RDB,GDB,BDB) \
+ for (line = y ; line < (y+h) ; line++) { \
+ SRC_T *src = (SRC_T *)(xenfb->pixels \
+ + xenfb->offset \
+ + (line * xenfb->row_stride) \
+ + (x * xenfb->depth / 8)); \
+ DST_T *dst = (DST_T *)(xenfb->c.ds->data \
+ + (line * xenfb->c.ds->linesize) \
+ + (x * xenfb->c.ds->depth / 8)); \
+ int col; \
+ const int RSS = 32 - (RSB + GSB + BSB); \
+ const int GSS = 32 - (GSB + BSB); \
+ const int BSS = 32 - (BSB); \
+ const uint32_t RSM = (~0U) << (32 - RSB); \
+ const uint32_t GSM = (~0U) << (32 - GSB); \
+ const uint32_t BSM = (~0U) << (32 - BSB); \
+ const int RDS = 32 - (RDB + GDB + BDB); \
+ const int GDS = 32 - (GDB + BDB); \
+ const int BDS = 32 - (BDB); \
+ const uint32_t RDM = (~0U) << (32 - RDB); \
+ const uint32_t GDM = (~0U) << (32 - GDB); \
+ const uint32_t BDM = (~0U) << (32 - BDB); \
+ for (col = x ; col < (x+w) ; col++) { \
+ uint32_t spix = *src; \
+ *dst = (((spix << RSS) & RSM & RDM) >> RDS) | \
+ (((spix << GSS) & GSM & GDM) >> GDS) | \
+ (((spix << BSS) & BSM & BDM) >> BDS); \
+ src = (SRC_T *) ((unsigned long) src + xenfb->depth / 8); \
+ dst = (DST_T *) ((unsigned long) dst + xenfb->c.ds->depth / 8); \
+ } \
+ }
+
+
+/* This copies data from the guest framebuffer region, into QEMU's copy
+ * NB. QEMU's copy is stored in the pixel format of a) the local X
+ * server (SDL case) or b) the current VNC client pixel format.
+ * When shifting between colour depths we preserve the MSB.
+ */
+static void xenfb_guest_copy(struct XenFB *xenfb, int x, int y, int w, int h)
+{
+ int line;
+
+ /*
+ * TODO: xen's qemu-dm seems to have some patches to
+ * make the qemu display code avoid unneeded
+ * work.
+ * - Port them over.
+ * - Put ds->shared_buf back into use then.
+ */
+ if (1 /* !xenfb->c.ds->shared_buf */) {
+ if (xenfb->depth == xenfb->c.ds->depth) { /* Perfect match can use fast path */
+ for (line = y ; line < (y+h) ; line++) {
+ memcpy(xenfb->c.ds->data + (line * xenfb->c.ds->linesize) + (x * xenfb->c.ds->depth / 8),
+ xenfb->pixels + xenfb->offset + (line * xenfb->row_stride) + (x * xenfb->depth / 8),
+ w * xenfb->depth / 8);
+ }
+ } else { /* Mismatch requires slow pixel munging */
+ /* 8 bit == r:3 g:3 b:2 */
+ /* 16 bit == r:5 g:6 b:5 */
+ /* 24 bit == r:8 g:8 b:8 */
+ /* 32 bit == r:8 g:8 b:8 (padding:8) */
+ if (xenfb->depth == 8) {
+ if (xenfb->c.ds->depth == 16) {
+ BLT(uint8_t, uint16_t, 3, 3, 2, 5, 6, 5);
+ } else if (xenfb->c.ds->depth == 32) {
+ BLT(uint8_t, uint32_t, 3, 3, 2, 8, 8, 8);
+ }
+ } else if (xenfb->depth == 16) {
+ if (xenfb->c.ds->depth == 8) {
+ BLT(uint16_t, uint8_t, 5, 6, 5, 3, 3, 2);
+ } else if (xenfb->c.ds->depth == 32) {
+ BLT(uint16_t, uint32_t, 5, 6, 5, 8, 8, 8);
+ }
+ } else if (xenfb->depth == 24 || xenfb->depth == 32) {
+ if (xenfb->c.ds->depth == 8) {
+ BLT(uint32_t, uint8_t, 8, 8, 8, 3, 3, 2);
+ } else if (xenfb->c.ds->depth == 16) {
+ BLT(uint32_t, uint16_t, 8, 8, 8, 5, 6, 5);
+ } else if (xenfb->c.ds->depth == 32) {
+ BLT(uint32_t, uint32_t, 8, 8, 8, 8, 8, 8);
+ }
+ }
+ }
+ }
+ dpy_update(xenfb->c.ds, x, y, w, h);
+}
+
+#ifdef XENFB_TYPE_REFRESH_PERIOD
+static int xenfb_queue_full(struct XenFB *xenfb)
+{
+ struct xenfb_page *page = xenfb->c.page;
+ uint32_t cons, prod;
+
+ if (!page)
+ return 1;
+
+ prod = page->in_prod;
+ cons = page->in_cons;
+ return prod - cons == XENFB_IN_RING_LEN;
+}
+
+static void xenfb_send_event(struct XenFB *xenfb, union xenfb_in_event *event)
+{
+ uint32_t prod;
+ struct xenfb_page *page = xenfb->c.page;
+
+ prod = page->in_prod;
+ /* caller ensures !xenfb_queue_full() */
+ xen_mb(); /* ensure ring space available */
+ XENFB_IN_RING_REF(page, prod) = *event;
+ xen_wmb(); /* ensure ring contents visible */
+ page->in_prod = prod + 1;
+
+ xen_be_send_notify(&xenfb->c.xendev);
+}
+
+static void xenfb_send_refresh_period(struct XenFB *xenfb, int period)
+{
+ union xenfb_in_event event;
+
+ memset(&event, 0, sizeof(event));
+ event.type = XENFB_TYPE_REFRESH_PERIOD;
+ event.refresh_period.period = period;
+ xenfb_send_event(xenfb, &event);
+}
+#endif
+
+/*
+ * Periodic update of display.
+ * Also transmit the refresh interval to the frontend.
+ *
+ * Never ever do any qemu display operations
+ * (resize, screen update) outside this function.
+ * Our screen might be inactive. When asked for
+ * an update we know it is active.
+ */
+static void xenfb_update(void *opaque)
+{
+ struct XenFB *xenfb = opaque;
+ int i;
+
+ if (xenfb->feature_update) {
+#ifdef XENFB_TYPE_REFRESH_PERIOD
+ int period;
+
+ if (xenfb_queue_full(xenfb))
+ return;
+
+ if (xenfb->c.ds->idle)
+ period = XENFB_NO_REFRESH;
+ else {
+ period = xenfb->c.ds->gui_timer_interval;
+ if (!period)
+ period = GUI_REFRESH_INTERVAL;
+ }
+
+ if (xenfb->refresh_period != period) {
+ xenfb_send_refresh_period(xenfb, period);
+ xenfb->refresh_period = period;
+ }
+#else
+ ; /* nothing */
+#endif
+ } else {
+ /* we don't get update notifications, thus use the
+ * sledge hammer approach ... */
+ xenfb->up_fullscreen = 1;
+ }
+
+ /* resize if needed */
+ if (xenfb->width != xenfb->c.ds->width || xenfb->height != xenfb->c.ds->height) {
+ xen_be_printf(&xenfb->c.xendev, 1, "update: resizing\n");
+ dpy_resize(xenfb->c.ds, xenfb->width, xenfb->height);
+ xenfb->up_fullscreen = 1;
+ }
+
+ /* run queued updates */
+ if (xenfb->up_fullscreen) {
+ xen_be_printf(&xenfb->c.xendev, 3, "update: fullscreen\n");
+ xenfb_guest_copy(xenfb, 0, 0, xenfb->width, xenfb->height);
+ } else if (xenfb->up_count) {
+ xen_be_printf(&xenfb->c.xendev, 3, "update: %d rects\n", xenfb->up_count);
+ for (i = 0; i < xenfb->up_count; i++)
+ xenfb_guest_copy(xenfb,
+ xenfb->up_rects[i].x,
+ xenfb->up_rects[i].y,
+ xenfb->up_rects[i].w,
+ xenfb->up_rects[i].h);
+ } else {
+ xen_be_printf(&xenfb->c.xendev, 3, "update: nothing\n");
+ }
+ xenfb->up_count = 0;
+ xenfb->up_fullscreen = 0;
+}
+
+/* QEMU display state changed, so refresh the framebuffer copy */
+static void xenfb_invalidate(void *opaque)
+{
+ struct XenFB *xenfb = opaque;
+ xenfb->up_fullscreen = 1;
+}
+
+static void xenfb_handle_events(struct XenFB *xenfb)
+{
+ uint32_t prod, cons;
+ struct xenfb_page *page = xenfb->c.page;
+
+ prod = page->out_prod;
+ if (prod == page->out_cons)
+ return;
+ xen_rmb(); /* ensure we see ring contents up to prod */
+ for (cons = page->out_cons; cons != prod; cons++) {
+ union xenfb_out_event *event = &XENFB_OUT_RING_REF(page, cons);
+ int x, y, w, h;
+
+ switch (event->type) {
+ case XENFB_TYPE_UPDATE:
+ if (xenfb->up_count == UP_QUEUE)
+ xenfb->up_fullscreen = 1;
+ if (xenfb->up_fullscreen)
+ break;
+ x = MAX(event->update.x, 0);
+ y = MAX(event->update.y, 0);
+ w = MIN(event->update.width, xenfb->width - x);
+ h = MIN(event->update.height, xenfb->height - y);
+ if (w < 0 || h < 0) {
+ fprintf(stderr, "xen be: %s: bogus update ignored\n",
+ xenfb->c.xendev.name);
+ break;
+ }
+ if (x != event->update.x || y != event->update.y
+ || w != event->update.width
+ || h != event->update.height) {
+ fprintf(stderr, "xen be: %s: bogus update clipped\n",
+ xenfb->c.xendev.name);
+ }
+ if (w == xenfb->width && h > xenfb->height / 2) {
+ /* scroll detector: updated more than 50% of the lines,
+ * don't bother keeping track of the rectangles then */
+ xenfb->up_fullscreen = 1;
+ } else {
+ xenfb->up_rects[xenfb->up_count].x = x;
+ xenfb->up_rects[xenfb->up_count].y = y;
+ xenfb->up_rects[xenfb->up_count].w = w;
+ xenfb->up_rects[xenfb->up_count].h = h;
+ xenfb->up_count++;
+ }
+ break;
+#ifdef XENFB_TYPE_RESIZE
+ case XENFB_TYPE_RESIZE:
+ if (xenfb_configure_fb(xenfb, xenfb->fb_len,
+ event->resize.width,
+ event->resize.height,
+ event->resize.depth,
+ xenfb->fb_len,
+ event->resize.offset,
+ event->resize.stride) < 0)
+ break;
+ xenfb_invalidate(xenfb);
+ break;
+#endif
+ }
+ }
+ xen_mb(); /* ensure we're done with ring contents */
+ page->out_cons = cons;
+}
+
+static int fb_init(struct XenDevice *xendev)
+{
+ struct XenFB *fb = container_of(xendev, struct XenFB, c.xendev);
+
+ if (!fb->c.ds) {
+ /* xen_set_display() below will set that and trigger us again */
+ xen_be_printf(xendev, 1, "ds not set (yet)\n");
+ return -1;
+ }
+
+ fb->refresh_period = -1;
+
+#ifdef XENFB_TYPE_RESIZE
+ xenstore_write_be_int(xendev, "feature-resize", 1);
+#endif
+ return 0;
+}
+
+static int fb_connect(struct XenDevice *xendev)
+{
+ struct XenFB *fb = container_of(xendev, struct XenFB, c.xendev);
+ struct xenfb_page *fb_page;
+ int videoram;
+ int rc;
+
+ if (-1 == xenstore_read_fe_int(xendev, "feature-update", &fb->feature_update))
+ fb->feature_update = 0;
+ if (fb->feature_update)
+ xenstore_write_be_int(xendev, "request-update", 1);
+
+ if (-1 == xenstore_read_fe_int(xendev, "videoram", &videoram))
+ videoram = 0;
+
+ rc = common_bind(&fb->c);
+ if (0 != rc)
+ return rc;
+
+ fb_page = fb->c.page;
+ rc = xenfb_configure_fb(fb, videoram * 1024 * 1024U,
+ fb_page->width, fb_page->height, fb_page->depth,
+ fb_page->mem_length, 0, fb_page->line_length);
+ if (0 != rc)
+ return rc;
+
+ rc = xenfb_map_fb(fb);
+ if (0 != rc)
+ return rc;
+
+ if (!fb->have_console) {
+ graphic_console_init(fb->c.ds,
+ xenfb_update,
+ xenfb_invalidate,
+ NULL,
+ NULL,
+ fb);
+ fb->have_console = 1;
+ }
+
+ xen_be_printf(xendev, 1, "feature-update=%d, videoram=%d\n",
+ fb->feature_update, videoram);
+
+ return 0;
+}
+
+static void fb_disconnect(struct XenDevice *xendev)
+{
+ struct XenFB *fb = container_of(xendev, struct XenFB, c.xendev);
+
+ /*
+ * FIXME: qemu can't un-init gfx display (yet?).
+ * Replacing the framebuffer with anonymous shared memory
+ * instead. This releases the guest pages and keeps qemu happy.
+ */
+ fb->pixels = mmap(fb->pixels, fb->fbpages * XC_PAGE_SIZE,
+ PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON,
+ -1, 0);
+ common_unbind(&fb->c);
+ fb->feature_update = 0;
+ fb->bug_trigger = 0;
+}
+
+static void fb_frontend_changed(struct XenDevice *xendev, const char *node)
+{
+ struct XenFB *fb = container_of(xendev, struct XenFB, c.xendev);
+
+ /*
+ * Set state to Connected *again* once the frontend switched
+ * to connected. We must trigger the watch a second time to
+ * workaround a frontend bug.
+ */
+ if (0 == fb->bug_trigger && 0 == strcmp(node, "state") &&
+ xendev->fe_state == XenbusStateConnected &&
+ xendev->be_state == XenbusStateConnected) {
+ xen_be_set_state(xendev, XenbusStateConnected);
+ fb->bug_trigger = 1; /* only once */
+ }
+}
+
+static void fb_event(struct XenDevice *xendev)
+{
+ struct XenFB *xenfb = container_of(xendev, struct XenFB, c.xendev);
+
+ xenfb_handle_events(xenfb);
+ xen_be_send_notify(&xenfb->c.xendev);
+}
+
+/* -------------------------------------------------------------------- */
+
+struct XenDevOps xen_kbdmouse_ops = {
+ .size = sizeof(struct XenInput),
+ .init = input_init,
+ .connect = input_connect,
+ .disconnect = input_disconnect,
+ .event = input_event,
+};
+
+struct XenDevOps xen_framebuffer_ops = {
+ .size = sizeof(struct XenFB),
+ .init = fb_init,
+ .connect = fb_connect,
+ .disconnect = fb_disconnect,
+ .event = fb_event,
+ .frontend_changed = fb_frontend_changed,
+};
+
+static void xen_set_display_type(int domid, char *type, DisplayState *ds)
+{
+ struct XenDevice *xendev;
+ struct common *c;
+
+ xendev = xen_be_find_xendev(type, domid, 0);
+ if (!xendev)
+ return;
+ c = container_of(xendev, struct common, xendev);
+ c->ds = ds;
+ xen_be_printf(xendev, 1, "ds is %p\n", ds);
+ /* retry ->init() */
+ xen_be_check_state(xendev);
+}
+
+void xen_set_display(int domid, DisplayState *ds)
+{
+ xen_set_display_type(domid, "vkbd", ds);
+ xen_set_display_type(domid, "vfb", ds);
+}
diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
index 93f66d0..d1d0a93 100644
--- a/hw/xen_machine_pv.c
+++ b/hw/xen_machine_pv.c
@@ -99,6 +99,11 @@ static int xen_init_pv(DisplayState *ds)
/* xenbus backend drivers */
xen_be_register("console", &xen_console_ops);
+ xen_be_register("vkbd", &xen_kbdmouse_ops);
+ xen_be_register("vfb", &xen_framebuffer_ops);
+
+ /* setup framebuffer */
+ xen_set_display(xen_domid, ds);
return 0;
}
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 09/13] xen: add block device backend driver.
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
` (7 preceding siblings ...)
2008-08-21 16:27 ` [Qemu-devel] [PATCH 08/13] xen: add framebuffer " Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 16:27 ` [Qemu-devel] [PATCH 10/13] xen: add net " Gerd Hoffmann
` (5 subsequent siblings)
14 siblings, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
This patch adds a block device backend driver to qemu. It is a pure
userspace implemention using the gntdev interface. It uses "qdisk" as
backend name in xenstore so it doesn't interfere with the other existing
backends (blkback aka "vbd" and tapdisk aka "tap").
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
Makefile.target | 2 +-
hw/xen_backend.h | 2 +
hw/xen_blkif.h | 103 ++++++++
hw/xen_disk.c | 693 +++++++++++++++++++++++++++++++++++++++++++++++++++
hw/xen_machine_pv.c | 1 +
sysemu.h | 2 +-
vl.c | 4 +
7 files changed, 805 insertions(+), 2 deletions(-)
create mode 100644 hw/xen_blkif.h
create mode 100644 hw/xen_disk.c
diff --git a/Makefile.target b/Makefile.target
index bcf4d93..a60ca16 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -517,7 +517,7 @@ endif
# xen backend driver support
XEN_OBJS := xen_machine_pv.o xen_backend.o
-XEN_OBJS += xen_console.o xen_framebuffer.o
+XEN_OBJS += xen_console.o xen_framebuffer.o xen_disk.o
ifeq ($(CONFIG_XEN), yes)
OBJS += $(XEN_OBJS)
LIBS += $(XEN_LIBS)
diff --git a/hw/xen_backend.h b/hw/xen_backend.h
index d8bc23a..9dbcb19 100644
--- a/hw/xen_backend.h
+++ b/hw/xen_backend.h
@@ -2,6 +2,7 @@
#define QEMU_XEN_BACKEND_H 1
#include "xen_common.h"
+#include "sysemu.h"
/* ------------------------------------------------------------- */
@@ -87,6 +88,7 @@ void xen_be_printf(struct XenDevice *xendev, int msg_level, const char *fmt, ...
struct XenDevOps xen_console_ops; /* xen_console.c */
struct XenDevOps xen_kbdmouse_ops; /* xen_framebuffer.c */
struct XenDevOps xen_framebuffer_ops; /* xen_framebuffer.c */
+struct XenDevOps xen_blkdev_ops; /* xen_disk.c */
void xen_set_display(int domid, DisplayState *ds);
diff --git a/hw/xen_blkif.h b/hw/xen_blkif.h
new file mode 100644
index 0000000..254a5fd
--- /dev/null
+++ b/hw/xen_blkif.h
@@ -0,0 +1,103 @@
+#ifndef __XEN_BLKIF_H__
+#define __XEN_BLKIF_H__
+
+#include <xen/io/ring.h>
+#include <xen/io/blkif.h>
+#include <xen/io/protocols.h>
+
+/* Not a real protocol. Used to generate ring structs which contain
+ * the elements common to all protocols only. This way we get a
+ * compiler-checkable way to use common struct elements, so we can
+ * avoid using switch(protocol) in a number of places. */
+struct blkif_common_request {
+ char dummy;
+};
+struct blkif_common_response {
+ char dummy;
+};
+
+/* i386 protocol version */
+#pragma pack(push, 4)
+struct blkif_x86_32_request {
+ uint8_t operation; /* BLKIF_OP_??? */
+ uint8_t nr_segments; /* number of segments */
+ blkif_vdev_t handle; /* only for read/write requests */
+ uint64_t id; /* private guest value, echoed in resp */
+ blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */
+ struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+};
+struct blkif_x86_32_response {
+ uint64_t id; /* copied from request */
+ uint8_t operation; /* copied from request */
+ int16_t status; /* BLKIF_RSP_??? */
+};
+typedef struct blkif_x86_32_request blkif_x86_32_request_t;
+typedef struct blkif_x86_32_response blkif_x86_32_response_t;
+#pragma pack(pop)
+
+/* x86_64 protocol version */
+struct blkif_x86_64_request {
+ uint8_t operation; /* BLKIF_OP_??? */
+ uint8_t nr_segments; /* number of segments */
+ blkif_vdev_t handle; /* only for read/write requests */
+ uint64_t __attribute__((__aligned__(8))) id;
+ blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */
+ struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+};
+struct blkif_x86_64_response {
+ uint64_t __attribute__((__aligned__(8))) id;
+ uint8_t operation; /* copied from request */
+ int16_t status; /* BLKIF_RSP_??? */
+};
+typedef struct blkif_x86_64_request blkif_x86_64_request_t;
+typedef struct blkif_x86_64_response blkif_x86_64_response_t;
+
+DEFINE_RING_TYPES(blkif_common, struct blkif_common_request, struct blkif_common_response);
+DEFINE_RING_TYPES(blkif_x86_32, struct blkif_x86_32_request, struct blkif_x86_32_response);
+DEFINE_RING_TYPES(blkif_x86_64, struct blkif_x86_64_request, struct blkif_x86_64_response);
+
+union blkif_back_rings {
+ blkif_back_ring_t native;
+ blkif_common_back_ring_t common;
+ blkif_x86_32_back_ring_t x86_32;
+ blkif_x86_64_back_ring_t x86_64;
+};
+typedef union blkif_back_rings blkif_back_rings_t;
+
+enum blkif_protocol {
+ BLKIF_PROTOCOL_NATIVE = 1,
+ BLKIF_PROTOCOL_X86_32 = 2,
+ BLKIF_PROTOCOL_X86_64 = 3,
+};
+
+static void inline blkif_get_x86_32_req(blkif_request_t *dst, blkif_x86_32_request_t *src)
+{
+ int i, n = BLKIF_MAX_SEGMENTS_PER_REQUEST;
+
+ dst->operation = src->operation;
+ dst->nr_segments = src->nr_segments;
+ dst->handle = src->handle;
+ dst->id = src->id;
+ dst->sector_number = src->sector_number;
+ if (n > src->nr_segments)
+ n = src->nr_segments;
+ for (i = 0; i < n; i++)
+ dst->seg[i] = src->seg[i];
+}
+
+static void inline blkif_get_x86_64_req(blkif_request_t *dst, blkif_x86_64_request_t *src)
+{
+ int i, n = BLKIF_MAX_SEGMENTS_PER_REQUEST;
+
+ dst->operation = src->operation;
+ dst->nr_segments = src->nr_segments;
+ dst->handle = src->handle;
+ dst->id = src->id;
+ dst->sector_number = src->sector_number;
+ if (n > src->nr_segments)
+ n = src->nr_segments;
+ for (i = 0; i < n; i++)
+ dst->seg[i] = src->seg[i];
+}
+
+#endif /* __XEN_BLKIF_H__ */
diff --git a/hw/xen_disk.c b/hw/xen_disk.c
new file mode 100644
index 0000000..5c1cbdf
--- /dev/null
+++ b/hw/xen_disk.c
@@ -0,0 +1,693 @@
+/*
+ * xen paravirt block device backend
+ *
+ * (c) Gerd Hoffmann <kraxel@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; under version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+/*
+ * FIXME: the code is designed to handle multiple outstanding
+ * requests (using aio or using threads), which isn't used right
+ * now due to limitations of the qemu block driver interface.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <string.h>
+#include <unistd.h>
+#include <signal.h>
+#include <inttypes.h>
+#include <time.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <sys/uio.h>
+
+#include <xs.h>
+#include <xenctrl.h>
+#include <xen/io/xenbus.h>
+
+#include "hw.h"
+#include "block_int.h"
+#include "qemu-char.h"
+#include "xen_blkif.h"
+#include "xen_backend.h"
+
+/* ------------------------------------------------------------- */
+
+#define BLOCK_SIZE 512
+#define IOCB_COUNT (BLKIF_MAX_SEGMENTS_PER_REQUEST + 2)
+
+struct ioreq {
+ blkif_request_t req;
+ int16_t status;
+
+ /* parsed request */
+ off_t start, end;
+ struct iovec vec[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+ int vecs;
+ int presync;
+ int postsync;
+
+ /* grant mapping */
+ uint32_t domids[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+ uint32_t refs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+ int prot;
+ void *page[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+ void *pages;
+
+ struct XenBlkDev *blkdev;
+ LIST_ENTRY(ioreq) list;
+};
+
+struct XenBlkDev {
+ struct XenDevice xendev; /* must be first */
+ char *params;
+ char *mode;
+ char *type;
+ char *dev;
+ char *devtype;
+ char *fileproto;
+ char *filename;
+ int ring_ref;
+ void *sring;
+ int64_t file_blk;
+ int64_t file_size;
+ int protocol;
+ blkif_back_rings_t rings;
+ int more_work;
+ int cnt_map;
+
+ /* request lists */
+ LIST_HEAD(inflight_head, ioreq) inflight;
+ LIST_HEAD(finished_head, ioreq) finished;
+ LIST_HEAD(freelist_head, ioreq) freelist;
+ int requests;
+
+ /* qemu block driver */
+ int index;
+ BlockDriverState *bs;
+};
+
+static int syncwrite = 0;
+static int batch_maps = 0;
+static int max_requests = 32;
+
+/* ------------------------------------------------------------- */
+
+static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
+{
+ struct ioreq *ioreq = NULL;
+
+ if (LIST_EMPTY(&blkdev->freelist)) {
+ if (blkdev->requests >= max_requests)
+ goto out;
+ /* allocate new struct */
+ ioreq = qemu_mallocz(sizeof(*ioreq));
+ if (NULL == ioreq)
+ goto out;
+ ioreq->blkdev = blkdev;
+ blkdev->requests++;
+ } else {
+ /* get one from freelist */
+ ioreq = LIST_FIRST(&blkdev->freelist);
+ LIST_REMOVE(ioreq, list);
+ }
+ LIST_INSERT_HEAD(&blkdev->inflight, ioreq, list);
+
+out:
+ return ioreq;
+}
+
+static void ioreq_finish(struct ioreq *ioreq)
+{
+ struct XenBlkDev *blkdev = ioreq->blkdev;
+
+ LIST_REMOVE(ioreq, list);
+ LIST_INSERT_HEAD(&blkdev->finished, ioreq, list);
+}
+
+static void ioreq_release(struct ioreq *ioreq)
+{
+ struct XenBlkDev *blkdev = ioreq->blkdev;
+
+ LIST_REMOVE(ioreq, list);
+ memset(ioreq, 0, sizeof(*ioreq));
+ ioreq->blkdev = blkdev;
+ LIST_INSERT_HEAD(&blkdev->freelist, ioreq, list);
+}
+
+/*
+ * translate request into iovec + start offset + end offset
+ * do sanity checks along the way
+ */
+static int ioreq_parse(struct ioreq *ioreq)
+{
+ struct XenBlkDev *blkdev = ioreq->blkdev;
+ uintptr_t mem;
+ size_t len;
+ int i;
+
+ xen_be_printf(&blkdev->xendev, 3,
+ "op %d, nr %d, handle %d, id %" PRId64 ", sector %" PRId64 "\n",
+ ioreq->req.operation, ioreq->req.nr_segments,
+ ioreq->req.handle, ioreq->req.id, ioreq->req.sector_number);
+ switch (ioreq->req.operation) {
+ case BLKIF_OP_READ:
+ ioreq->prot = PROT_WRITE; /* to memory */
+ if (BLKIF_OP_READ != ioreq->req.operation && blkdev->mode[0] != 'w') {
+ xen_be_printf(&blkdev->xendev, 0, "error: write req for ro device\n");
+ goto err;
+ }
+ break;
+ case BLKIF_OP_WRITE_BARRIER:
+ if (!syncwrite)
+ ioreq->presync = ioreq->postsync = 1;
+ /* fall through */
+ case BLKIF_OP_WRITE:
+ ioreq->prot = PROT_READ; /* from memory */
+ if (syncwrite)
+ ioreq->postsync = 1;
+ break;
+ default:
+ xen_be_printf(&blkdev->xendev, 0, "error: unknown operation (%d)\n",
+ ioreq->req.operation);
+ goto err;
+ };
+
+ ioreq->start = ioreq->end = ioreq->req.sector_number * blkdev->file_blk;
+ for (i = 0; i < ioreq->req.nr_segments; i++) {
+ if (i == BLKIF_MAX_SEGMENTS_PER_REQUEST) {
+ xen_be_printf(&blkdev->xendev, 0, "error: nr_segments too big\n");
+ goto err;
+ }
+ if (ioreq->req.seg[i].first_sect > ioreq->req.seg[i].last_sect) {
+ xen_be_printf(&blkdev->xendev, 0, "error: first > last sector\n");
+ goto err;
+ }
+ if (ioreq->req.seg[i].last_sect * BLOCK_SIZE >= XC_PAGE_SIZE) {
+ xen_be_printf(&blkdev->xendev, 0, "error: page crossing\n");
+ goto err;
+ }
+ len = (ioreq->req.seg[i].last_sect - ioreq->req.seg[i].first_sect + 1) * blkdev->file_blk;
+
+ ioreq->domids[i] = blkdev->xendev.dom;
+ ioreq->refs[i] = ioreq->req.seg[i].gref;
+ mem = ioreq->req.seg[i].first_sect * blkdev->file_blk;
+
+ ioreq->vec[i].iov_base = (void*)mem;
+ ioreq->vec[i].iov_len = len;
+ ioreq->end += len;
+ }
+ if (ioreq->end > blkdev->file_size) {
+ xen_be_printf(&blkdev->xendev, 0, "error: access beyond end of file\n");
+ goto err;
+ }
+ ioreq->vecs = i;
+ return 0;
+
+err:
+ ioreq->status = BLKIF_RSP_ERROR;
+ return -1;
+}
+
+static void ioreq_unmap(struct ioreq *ioreq)
+{
+ int gnt = ioreq->blkdev->xendev.gnttabdev;
+ int i;
+
+ if (batch_maps) {
+ if (!ioreq->pages)
+ return;
+ if (0 != xc_gnttab_munmap(gnt, ioreq->pages, ioreq->vecs))
+ xen_be_printf(&ioreq->blkdev->xendev, 0, "xc_gnttab_munmap failed: %s\n",
+ strerror(errno));
+ ioreq->blkdev->cnt_map -= ioreq->vecs;
+ ioreq->pages = NULL;
+ } else {
+ for (i = 0; i < ioreq->vecs; i++) {
+ if (!ioreq->page[i])
+ continue;
+ if (0 != xc_gnttab_munmap(gnt, ioreq->page[i], 1))
+ xen_be_printf(&ioreq->blkdev->xendev, 0, "xc_gnttab_munmap failed: %s\n",
+ strerror(errno));
+ ioreq->blkdev->cnt_map--;
+ ioreq->page[i] = NULL;
+ }
+ }
+}
+
+static int ioreq_map(struct ioreq *ioreq)
+{
+ int gnt = ioreq->blkdev->xendev.gnttabdev;
+ int i;
+
+ if (batch_maps) {
+ ioreq->pages = xc_gnttab_map_grant_refs
+ (gnt, ioreq->vecs, ioreq->domids, ioreq->refs, ioreq->prot);
+ if (NULL == ioreq->pages) {
+ xen_be_printf(&ioreq->blkdev->xendev, 0,
+ "can't map %d grant refs (%s, %d maps)\n",
+ ioreq->vecs, strerror(errno), ioreq->blkdev->cnt_map);
+ return -1;
+ }
+ for (i = 0; i < ioreq->vecs; i++)
+ ioreq->vec[i].iov_base = ioreq->pages + i * XC_PAGE_SIZE +
+ (uintptr_t)ioreq->vec[i].iov_base;
+ ioreq->blkdev->cnt_map += ioreq->vecs;
+ } else {
+ for (i = 0; i < ioreq->vecs; i++) {
+ ioreq->page[i] = xc_gnttab_map_grant_ref
+ (gnt, ioreq->domids[i], ioreq->refs[i], ioreq->prot);
+ if (NULL == ioreq->page[i]) {
+ xen_be_printf(&ioreq->blkdev->xendev, 0,
+ "can't map grant ref %d (%s, %d maps)\n",
+ ioreq->refs[i], strerror(errno), ioreq->blkdev->cnt_map);
+ ioreq_unmap(ioreq);
+ return -1;
+ }
+ ioreq->vec[i].iov_base = ioreq->page[i] + (uintptr_t)ioreq->vec[i].iov_base;
+ ioreq->blkdev->cnt_map++;
+ }
+ }
+ return 0;
+}
+
+static int ioreq_runio_qemu(struct ioreq *ioreq)
+{
+ struct XenBlkDev *blkdev = ioreq->blkdev;
+ int i, rc, len = 0;
+ off_t pos;
+
+ if (-1 == ioreq_map(ioreq))
+ goto err;
+ if (ioreq->presync)
+ bdrv_flush(blkdev->bs);
+
+ switch (ioreq->req.operation) {
+ case BLKIF_OP_READ:
+ pos = ioreq->start;
+ for (i = 0; i < ioreq->vecs; i++) {
+ rc = bdrv_read(blkdev->bs, pos / BLOCK_SIZE,
+ ioreq->vec[i].iov_base,
+ ioreq->vec[i].iov_len / BLOCK_SIZE);
+ if (rc != 0) {
+ xen_be_printf(&blkdev->xendev, 0, "rd I/O error (%p, len %zd)\n",
+ ioreq->vec[i].iov_base,
+ ioreq->vec[i].iov_len);
+ goto err;
+ }
+ len += ioreq->vec[i].iov_len;
+ pos += ioreq->vec[i].iov_len;
+ }
+ break;
+ case BLKIF_OP_WRITE:
+ case BLKIF_OP_WRITE_BARRIER:
+ pos = ioreq->start;
+ for (i = 0; i < ioreq->vecs; i++) {
+ rc = bdrv_write(blkdev->bs, pos / BLOCK_SIZE,
+ ioreq->vec[i].iov_base,
+ ioreq->vec[i].iov_len / BLOCK_SIZE);
+ if (rc != 0) {
+ xen_be_printf(&blkdev->xendev, 0, "wr I/O error (%p, len %zd)\n",
+ ioreq->vec[i].iov_base,
+ ioreq->vec[i].iov_len);
+ goto err;
+ }
+ len += ioreq->vec[i].iov_len;
+ pos += ioreq->vec[i].iov_len;
+ }
+ break;
+ default:
+ /* unknown operation (shouldn't happen -- parse catches this) */
+ goto err;
+ }
+
+ if (ioreq->postsync)
+ bdrv_flush(blkdev->bs);
+ ioreq->status = BLKIF_RSP_OKAY;
+
+ ioreq_unmap(ioreq);
+ return 0;
+
+err:
+ ioreq->status = BLKIF_RSP_ERROR;
+ return -1;
+}
+
+static int blk_send_response_one(struct ioreq *ioreq)
+{
+ struct XenBlkDev *blkdev = ioreq->blkdev;
+ int send_notify = 0;
+ int have_requests = 0;
+ blkif_response_t resp;
+ void *dst;
+
+ resp.id = ioreq->req.id;
+ resp.operation = ioreq->req.operation;
+ resp.status = ioreq->status;
+
+ /* Place on the response ring for the relevant domain. */
+ switch (blkdev->protocol) {
+ case BLKIF_PROTOCOL_NATIVE:
+ dst = RING_GET_RESPONSE(&blkdev->rings.native, blkdev->rings.native.rsp_prod_pvt);
+ break;
+ case BLKIF_PROTOCOL_X86_32:
+ dst = RING_GET_RESPONSE(&blkdev->rings.x86_32, blkdev->rings.x86_32.rsp_prod_pvt);
+ break;
+ case BLKIF_PROTOCOL_X86_64:
+ dst = RING_GET_RESPONSE(&blkdev->rings.x86_64, blkdev->rings.x86_64.rsp_prod_pvt);
+ break;
+ default:
+ dst = NULL;
+ }
+ memcpy(dst, &resp, sizeof(resp));
+ blkdev->rings.common.rsp_prod_pvt++;
+
+ RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&blkdev->rings.common, send_notify);
+ if (blkdev->rings.common.rsp_prod_pvt == blkdev->rings.common.req_cons) {
+ /*
+ * Tail check for pending requests. Allows frontend to avoid
+ * notifications if requests are already in flight (lower
+ * overheads and promotes batching).
+ */
+ RING_FINAL_CHECK_FOR_REQUESTS(&blkdev->rings.common, have_requests);
+ } else if (RING_HAS_UNCONSUMED_REQUESTS(&blkdev->rings.common)) {
+ have_requests = 1;
+ }
+
+ if (have_requests)
+ blkdev->more_work++;
+ return send_notify;
+}
+
+/* walk finished list, send outstanding responses, free requests */
+static void blk_send_response_all(struct XenBlkDev *blkdev)
+{
+ struct ioreq *ioreq;
+ int send_notify = 0;
+
+ while (!LIST_EMPTY(&blkdev->finished)) {
+ ioreq = LIST_FIRST(&blkdev->finished);
+ send_notify += blk_send_response_one(ioreq);
+ ioreq_release(ioreq);
+ }
+ if (send_notify)
+ xen_be_send_notify(&blkdev->xendev);
+}
+
+static int blk_get_request(struct XenBlkDev *blkdev, struct ioreq *ioreq, RING_IDX rc)
+{
+ switch (blkdev->protocol) {
+ case BLKIF_PROTOCOL_NATIVE:
+ memcpy(&ioreq->req, RING_GET_REQUEST(&blkdev->rings.native, rc),
+ sizeof(ioreq->req));
+ break;
+ case BLKIF_PROTOCOL_X86_32:
+ blkif_get_x86_32_req(&ioreq->req, RING_GET_REQUEST(&blkdev->rings.x86_32, rc));
+ break;
+ case BLKIF_PROTOCOL_X86_64:
+ blkif_get_x86_64_req(&ioreq->req, RING_GET_REQUEST(&blkdev->rings.x86_64, rc));
+ break;
+ }
+ return 0;
+}
+
+static void blk_handle_requests(struct XenBlkDev *blkdev)
+{
+ RING_IDX rc, rp;
+ struct ioreq *ioreq;
+
+ do {
+ blkdev->more_work = 0;
+
+ rc = blkdev->rings.common.req_cons;
+ rp = blkdev->rings.common.sring->req_prod;
+ xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
+
+ /* Limit #of requests we queue up for I/O so we ack requests
+ * faster if busy. Improves backend/frontend parallelism and
+ * reduces evchn signaling. */
+ if (rp > rc + (max_requests >> 2)) {
+ rp = rc + (max_requests >> 2);
+ blkdev->more_work++;
+ }
+
+ while ((rc != rp)) {
+ /* pull request from ring */
+ if (RING_REQUEST_CONS_OVERFLOW(&blkdev->rings.common, rc))
+ break;
+ ioreq = ioreq_start(blkdev);
+ if (NULL == ioreq) {
+ blkdev->more_work++;
+ break;
+ }
+ blk_get_request(blkdev, ioreq, rc);
+ blkdev->rings.common.req_cons = ++rc;
+
+ /* parse them */
+ if (0 != ioreq_parse(ioreq)) {
+ if (blk_send_response_one(ioreq))
+ xen_be_send_notify(&blkdev->xendev);
+ ioreq_release(ioreq);
+ continue;
+ }
+
+ /* run i/o in qemu mode */
+ ioreq_runio_qemu(ioreq);
+ ioreq_finish(ioreq);
+ }
+ blk_send_response_all(blkdev);
+
+ } while (blkdev->more_work);
+}
+
+/* ------------------------------------------------------------- */
+
+static void blk_alloc(struct XenDevice *xendev)
+{
+ struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
+
+ LIST_INIT(&blkdev->inflight);
+ LIST_INIT(&blkdev->finished);
+ LIST_INIT(&blkdev->freelist);
+}
+
+static int blk_init(struct XenDevice *xendev)
+{
+ struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
+ int mode, qflags, have_barriers, info = 0;
+ char *h;
+
+ /* read xenstore entries */
+ if (NULL == blkdev->params) {
+ blkdev->params = xenstore_read_be_str(&blkdev->xendev, "params");
+ if (NULL != (h = strchr(blkdev->params, ':'))) {
+ blkdev->fileproto = blkdev->params;
+ blkdev->filename = h+1;
+ *h = 0;
+ } else {
+ blkdev->fileproto = "<unset>";
+ blkdev->filename = blkdev->params;
+ }
+ }
+ if (NULL == blkdev->mode)
+ blkdev->mode = xenstore_read_be_str(&blkdev->xendev, "mode");
+ if (NULL == blkdev->type)
+ blkdev->type = xenstore_read_be_str(&blkdev->xendev, "type");
+ if (NULL == blkdev->dev)
+ blkdev->dev = xenstore_read_be_str(&blkdev->xendev, "dev");
+ if (NULL == blkdev->devtype)
+ blkdev->devtype = xenstore_read_be_str(&blkdev->xendev, "device-type");
+
+ /* do we have all we need? */
+ if (NULL == blkdev->params ||
+ NULL == blkdev->mode ||
+ NULL == blkdev->type ||
+ NULL == blkdev->dev)
+ return -1;
+
+ /* read-only ? */
+ if (0 == strcmp(blkdev->mode, "w")) {
+ mode = O_RDWR;
+ qflags = BDRV_O_RDWR;
+ } else {
+ mode = O_RDONLY;
+ qflags = BDRV_O_RDONLY;
+ info |= VDISK_READONLY;
+ }
+
+ /* cdrom ? */
+ if (blkdev->devtype && !strcmp(blkdev->devtype, "cdrom"))
+ info |= VDISK_CDROM;
+
+ /* init qemu block driver */
+ blkdev->index = (blkdev->xendev.dev - 202 * 256) / 16;
+ blkdev->index = drive_get_index(IF_XEN, 0, blkdev->index);
+ if (blkdev->index == -1) {
+ /* setup via xenbus -> create new block driver instance */
+ xen_be_printf(&blkdev->xendev, 2, "create new bdrv (xenbus setup)\n");
+ blkdev->bs = bdrv_new(blkdev->dev);
+ if (blkdev->bs) {
+ if (0 != bdrv_open2(blkdev->bs, blkdev->filename, qflags,
+ bdrv_find_format(blkdev->fileproto))) {
+ bdrv_delete(blkdev->bs);
+ blkdev->bs = NULL;
+ }
+ }
+ if (!blkdev->bs)
+ return -1;
+ } else {
+ /* setup via qemu cmdline -> already setup for us */
+ xen_be_printf(&blkdev->xendev, 2, "get configured bdrv (cmdline setup)\n");
+ blkdev->bs = drives_table[blkdev->index].bdrv;
+ }
+ blkdev->file_blk = BLOCK_SIZE;
+ blkdev->file_size = bdrv_getlength(blkdev->bs);
+ if (blkdev->file_size < 0) {
+ xen_be_printf(&blkdev->xendev, 1, "bdrv_getlength: %d (%s) | drv %s\n",
+ (int)blkdev->file_size, strerror(-blkdev->file_size),
+ blkdev->bs->drv ? blkdev->bs->drv->format_name : "-");
+ blkdev->file_size = 0;
+ }
+ have_barriers = blkdev->bs->drv && blkdev->bs->drv->bdrv_flush ? 1 : 0;
+
+ xen_be_printf(xendev, 1, "type \"%s\", fileproto \"%s\", filename \"%s\","
+ " size %" PRId64 " (%" PRId64 " MB)\n",
+ blkdev->type, blkdev->fileproto, blkdev->filename,
+ blkdev->file_size, blkdev->file_size >> 20);
+
+ /* fill info */
+ xenstore_write_be_int(&blkdev->xendev, "feature-barrier", have_barriers);
+ xenstore_write_be_int(&blkdev->xendev, "info", info);
+ xenstore_write_be_int(&blkdev->xendev, "sector-size", blkdev->file_blk);
+ xenstore_write_be_int(&blkdev->xendev, "sectors",
+ blkdev->file_size / blkdev->file_blk);
+ return 0;
+}
+
+static int blk_connect(struct XenDevice *xendev)
+{
+ struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
+
+ if (-1 == xenstore_read_fe_int(&blkdev->xendev, "ring-ref", &blkdev->ring_ref))
+ return -1;
+ if (-1 == xenstore_read_fe_int(&blkdev->xendev, "event-channel",
+ &blkdev->xendev.remote_port))
+ return -1;
+
+ blkdev->protocol = BLKIF_PROTOCOL_NATIVE;
+ if (blkdev->xendev.protocol) {
+ if (0 == strcmp(blkdev->xendev.protocol, XEN_IO_PROTO_ABI_X86_32))
+ blkdev->protocol = BLKIF_PROTOCOL_X86_32;
+ if (0 == strcmp(blkdev->xendev.protocol, XEN_IO_PROTO_ABI_X86_64))
+ blkdev->protocol = BLKIF_PROTOCOL_X86_64;
+ }
+
+ blkdev->sring = xc_gnttab_map_grant_ref(blkdev->xendev.gnttabdev,
+ blkdev->xendev.dom,
+ blkdev->ring_ref,
+ PROT_READ | PROT_WRITE);
+ if (!blkdev->sring)
+ return -1;
+ blkdev->cnt_map++;
+
+ switch (blkdev->protocol) {
+ case BLKIF_PROTOCOL_NATIVE:
+ {
+ blkif_sring_t *sring_native = blkdev->sring;
+ BACK_RING_INIT(&blkdev->rings.native, sring_native, XC_PAGE_SIZE);
+ break;
+ }
+ case BLKIF_PROTOCOL_X86_32:
+ {
+ blkif_x86_32_sring_t *sring_x86_32 = blkdev->sring;
+ BACK_RING_INIT(&blkdev->rings.x86_32, sring_x86_32, XC_PAGE_SIZE);
+ break;
+ }
+ case BLKIF_PROTOCOL_X86_64:
+ {
+ blkif_x86_64_sring_t *sring_x86_64 = blkdev->sring;
+ BACK_RING_INIT(&blkdev->rings.x86_64, sring_x86_64, XC_PAGE_SIZE);
+ break;
+ }
+ }
+
+ xen_be_bind_evtchn(&blkdev->xendev);
+
+ xen_be_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
+ "remote port %d, local port %d\n",
+ blkdev->xendev.protocol, blkdev->ring_ref,
+ blkdev->xendev.remote_port, blkdev->xendev.local_port);
+ return 0;
+}
+
+static void blk_disconnect(struct XenDevice *xendev)
+{
+ struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
+
+ if (blkdev->bs) {
+ if (blkdev->index == -1) {
+ /* close/delete only if we created it ourself */
+ bdrv_close(blkdev->bs);
+ bdrv_delete(blkdev->bs);
+ }
+ blkdev->bs = NULL;
+ }
+ xen_be_unbind_evtchn(&blkdev->xendev);
+
+ if (blkdev->sring) {
+ xc_gnttab_munmap(blkdev->xendev.gnttabdev, blkdev->sring, 1);
+ blkdev->cnt_map--;
+ blkdev->sring = NULL;
+ }
+}
+
+static int blk_free(struct XenDevice *xendev)
+{
+ struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
+ struct ioreq *ioreq;
+
+ while (!LIST_EMPTY(&blkdev->freelist)) {
+ ioreq = LIST_FIRST(&blkdev->freelist);
+ LIST_REMOVE(ioreq, list);
+ qemu_free(ioreq);
+ }
+
+ qemu_free(blkdev->params);
+ qemu_free(blkdev->mode);
+ return 0;
+}
+
+static void blk_event(struct XenDevice *xendev)
+{
+ struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
+ blk_handle_requests(blkdev);
+}
+
+struct XenDevOps xen_blkdev_ops = {
+ .size = sizeof(struct XenBlkDev),
+ .flags = DEVOPS_FLAG_NEED_GNTDEV,
+ .alloc = blk_alloc,
+ .init = blk_init,
+ .connect = blk_connect,
+ .disconnect = blk_disconnect,
+ .event = blk_event,
+ .free = blk_free,
+};
diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
index d1d0a93..27b85e7 100644
--- a/hw/xen_machine_pv.c
+++ b/hw/xen_machine_pv.c
@@ -101,6 +101,7 @@ static int xen_init_pv(DisplayState *ds)
xen_be_register("console", &xen_console_ops);
xen_be_register("vkbd", &xen_kbdmouse_ops);
xen_be_register("vfb", &xen_framebuffer_ops);
+ xen_be_register("qdisk", &xen_blkdev_ops);
/* setup framebuffer */
xen_set_display(xen_domid, ds);
diff --git a/sysemu.h b/sysemu.h
index b12fae0..49e75b1 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -113,7 +113,7 @@ extern unsigned int nb_prom_envs;
#endif
typedef enum {
- IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD
+ IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, IF_XEN
} BlockInterfaceType;
typedef struct DriveInfo {
diff --git a/vl.c b/vl.c
index ff94c6f..5e023e6 100644
--- a/vl.c
+++ b/vl.c
@@ -5516,6 +5516,9 @@ static int drive_init(struct drive_opt *arg, int snapshot,
} else if (!strcmp(buf, "sd")) {
type = IF_SD;
max_devs = 0;
+ } else if (!strcmp(buf, "xen")) {
+ type = IF_XEN;
+ max_devs = 0;
} else {
fprintf(stderr, "qemu: '%s' unsupported bus type '%s'\n", str, buf);
return -1;
@@ -5701,6 +5704,7 @@ static int drive_init(struct drive_opt *arg, int snapshot,
switch(type) {
case IF_IDE:
case IF_SCSI:
+ case IF_XEN:
switch(media) {
case MEDIA_DISK:
if (cyls != 0) {
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 10/13] xen: add net backend driver.
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
` (8 preceding siblings ...)
2008-08-21 16:27 ` [Qemu-devel] [PATCH 09/13] xen: add block device " Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 16:27 ` [Qemu-devel] [PATCH 11/13] xen: blk & nic configuration via cmd line Gerd Hoffmann
` (4 subsequent siblings)
14 siblings, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
This patch adds a network interface backend driver to qemu. It is a pure
userspace implemention using the gntdev interface. It uses "qnet" as
backend name in xenstore so it doesn't interfere with the netback
backend (aka "vnif").
The network backend is hooked into the corrosponding qemu vlan, i.e.
vif 0 is hooked into vlan 0. To make the packages actually arrive
somewhere you additionally have to link the vlan to the outside world
using the usual qemu command line options such as "-net tap,...".
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
Makefile.target | 2 +-
hw/xen_backend.h | 1 +
hw/xen_machine_pv.c | 1 +
hw/xen_nic.c | 382 +++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 385 insertions(+), 1 deletions(-)
create mode 100644 hw/xen_nic.c
diff --git a/Makefile.target b/Makefile.target
index a60ca16..94b5342 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -517,7 +517,7 @@ endif
# xen backend driver support
XEN_OBJS := xen_machine_pv.o xen_backend.o
-XEN_OBJS += xen_console.o xen_framebuffer.o xen_disk.o
+XEN_OBJS += xen_console.o xen_framebuffer.o xen_disk.o xen_nic.o
ifeq ($(CONFIG_XEN), yes)
OBJS += $(XEN_OBJS)
LIBS += $(XEN_LIBS)
diff --git a/hw/xen_backend.h b/hw/xen_backend.h
index 9dbcb19..9ea4421 100644
--- a/hw/xen_backend.h
+++ b/hw/xen_backend.h
@@ -89,6 +89,7 @@ struct XenDevOps xen_console_ops; /* xen_console.c */
struct XenDevOps xen_kbdmouse_ops; /* xen_framebuffer.c */
struct XenDevOps xen_framebuffer_ops; /* xen_framebuffer.c */
struct XenDevOps xen_blkdev_ops; /* xen_disk.c */
+struct XenDevOps xen_netdev_ops; /* xen_nic.c */
void xen_set_display(int domid, DisplayState *ds);
diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
index 27b85e7..e6d2bd6 100644
--- a/hw/xen_machine_pv.c
+++ b/hw/xen_machine_pv.c
@@ -102,6 +102,7 @@ static int xen_init_pv(DisplayState *ds)
xen_be_register("vkbd", &xen_kbdmouse_ops);
xen_be_register("vfb", &xen_framebuffer_ops);
xen_be_register("qdisk", &xen_blkdev_ops);
+ xen_be_register("qnic", &xen_netdev_ops);
/* setup framebuffer */
xen_set_display(xen_domid, ds);
diff --git a/hw/xen_nic.c b/hw/xen_nic.c
new file mode 100644
index 0000000..e2e6a85
--- /dev/null
+++ b/hw/xen_nic.c
@@ -0,0 +1,382 @@
+/*
+ * xen paravirt network card backend
+ *
+ * (c) Gerd Hoffmann <kraxel@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; under version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <string.h>
+#include <unistd.h>
+#include <signal.h>
+#include <inttypes.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <pthread.h>
+#include <sys/socket.h>
+#include <sys/ioctl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <sys/wait.h>
+#include <linux/if.h>
+#include <linux/if_tun.h>
+
+#include <xs.h>
+#include <xenctrl.h>
+#include <xen/io/xenbus.h>
+#include <xen/io/netif.h>
+
+#include "hw.h"
+#include "net.h"
+#include "qemu-char.h"
+#include "xen_backend.h"
+
+/* ------------------------------------------------------------- */
+
+struct XenNetDev {
+ struct XenDevice xendev; /* must be first */
+ char *mac;
+ int tx_work;
+ int tx_ring_ref;
+ int rx_ring_ref;
+ struct netif_tx_sring *txs;
+ struct netif_rx_sring *rxs;
+ netif_tx_back_ring_t tx_ring;
+ netif_rx_back_ring_t rx_ring;
+ VLANClientState *vs;
+};
+
+/* ------------------------------------------------------------- */
+
+static void net_tx_response(struct XenNetDev *netdev, netif_tx_request_t *txp, int8_t st)
+{
+ RING_IDX i = netdev->tx_ring.rsp_prod_pvt;
+ netif_tx_response_t *resp;
+ int notify;
+
+ resp = RING_GET_RESPONSE(&netdev->tx_ring, i);
+ resp->id = txp->id;
+ resp->status = st;
+
+#if 0
+ if (txp->flags & NETTXF_extra_info)
+ RING_GET_RESPONSE(&netdev->tx_ring, ++i)->status = NETIF_RSP_NULL;
+#endif
+
+ netdev->tx_ring.rsp_prod_pvt = ++i;
+ RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&netdev->tx_ring, notify);
+ if (notify)
+ xen_be_send_notify(&netdev->xendev);
+
+ if (i == netdev->tx_ring.req_cons) {
+ int more_to_do;
+ RING_FINAL_CHECK_FOR_REQUESTS(&netdev->tx_ring, more_to_do);
+ if (more_to_do)
+ netdev->tx_work++;
+ }
+}
+
+static void net_tx_error(struct XenNetDev *netdev, netif_tx_request_t *txp, RING_IDX end)
+{
+#if 0
+ /*
+ * Hmm, why netback fails everything in the ring?
+ * Should we do that even when not supporting SG and TSO?
+ */
+ RING_IDX cons = netdev->tx_ring.req_cons;
+
+ do {
+ make_tx_response(netif, txp, NETIF_RSP_ERROR);
+ if (cons >= end)
+ break;
+ txp = RING_GET_REQUEST(&netdev->tx_ring, cons++);
+ } while (1);
+ netdev->tx_ring.req_cons = cons;
+ netif_schedule_work(netif);
+ netif_put(netif);
+#else
+ net_tx_response(netdev, txp, NETIF_RSP_ERROR);
+#endif
+}
+
+static void net_tx_packets(struct XenNetDev *netdev)
+{
+ netif_tx_request_t txreq;
+ RING_IDX rc, rp;
+ void *page;
+
+ for (;;) {
+ rc = netdev->tx_ring.req_cons;
+ rp = netdev->tx_ring.sring->req_prod;
+ xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
+
+ while ((rc != rp)) {
+ if (RING_REQUEST_CONS_OVERFLOW(&netdev->tx_ring, rc))
+ break;
+ memcpy(&txreq, RING_GET_REQUEST(&netdev->tx_ring, rc), sizeof(txreq));
+ netdev->tx_ring.req_cons = ++rc;
+
+#if 1
+ /* should not happen in theory, we don't announce the *
+ * feature-{sg,gso,whatelse} flags in xenstore (yet?) */
+ if (txreq.flags & NETTXF_extra_info) {
+ xen_be_printf(&netdev->xendev, 0, "FIXME: extra info flag\n");
+ net_tx_error(netdev, &txreq, rc);
+ continue;
+ }
+ if (txreq.flags & NETTXF_more_data) {
+ xen_be_printf(&netdev->xendev, 0, "FIXME: more data flag\n");
+ net_tx_error(netdev, &txreq, rc);
+ continue;
+ }
+#endif
+
+ if (txreq.size < 14) {
+ xen_be_printf(&netdev->xendev, 0, "bad packet size: %d\n", txreq.size);
+ net_tx_error(netdev, &txreq, rc);
+ continue;
+ }
+
+ if ((txreq.offset + txreq.size) > XC_PAGE_SIZE) {
+ xen_be_printf(&netdev->xendev, 0, "error: page crossing\n");
+ net_tx_error(netdev, &txreq, rc);
+ continue;
+ }
+
+ xen_be_printf(&netdev->xendev, 3, "tx packet ref %d, off %d, len %d, flags 0x%x%s%s%s%s\n",
+ txreq.gref, txreq.offset, txreq.size, txreq.flags,
+ (txreq.flags & NETTXF_csum_blank) ? " csum_blank" : "",
+ (txreq.flags & NETTXF_data_validated) ? " data_validated" : "",
+ (txreq.flags & NETTXF_more_data) ? " more_data" : "",
+ (txreq.flags & NETTXF_extra_info) ? " extra_info" : "");
+
+ page = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev,
+ netdev->xendev.dom,
+ txreq.gref, PROT_READ);
+ if (NULL == page) {
+ xen_be_printf(&netdev->xendev, 0, "error: tx gref dereference failed (%d)\n",
+ txreq.gref);
+ net_tx_error(netdev, &txreq, rc);
+ continue;
+ }
+ if (txreq.flags & NETTXF_csum_blank)
+ net_checksum_calculate(page + txreq.offset, txreq.size);
+ qemu_send_packet(netdev->vs, page + txreq.offset, txreq.size);
+ xc_gnttab_munmap(netdev->xendev.gnttabdev, page, 1);
+ net_tx_response(netdev, &txreq, NETIF_RSP_OKAY);
+ }
+ if (!netdev->tx_work)
+ break;
+ netdev->tx_work = 0;
+ }
+}
+
+/* ------------------------------------------------------------- */
+
+static void net_rx_response(struct XenNetDev *netdev,
+ netif_rx_request_t *req, int8_t st,
+ uint16_t offset, uint16_t size,
+ uint16_t flags)
+{
+ RING_IDX i = netdev->rx_ring.rsp_prod_pvt;
+ netif_rx_response_t *resp;
+ int notify;
+
+ resp = RING_GET_RESPONSE(&netdev->rx_ring, i);
+ resp->offset = offset;
+ resp->flags = flags;
+ resp->id = req->id;
+ resp->status = (int16_t)size;
+ if (st < 0)
+ resp->status = (int16_t)st;
+
+ xen_be_printf(&netdev->xendev, 3, "rx response: idx %d, status %d, flags 0x%x\n",
+ i, resp->status, resp->flags);
+
+ netdev->rx_ring.rsp_prod_pvt = ++i;
+ RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&netdev->rx_ring, notify);
+ if (notify)
+ xen_be_send_notify(&netdev->xendev);
+}
+
+#define NET_IP_ALIGN 2
+
+static int net_rx_ok(void *opaque)
+{
+ struct XenNetDev *netdev = opaque;
+ RING_IDX rc, rp;
+
+ if (netdev->xendev.be_state != XenbusStateConnected)
+ return 0;
+
+ rc = netdev->rx_ring.req_cons;
+ rp = netdev->rx_ring.sring->req_prod;
+ xen_rmb();
+
+ if (rc == rp || RING_REQUEST_CONS_OVERFLOW(&netdev->rx_ring, rc)) {
+ xen_be_printf(&netdev->xendev, 2, "%s: no rx buffers (%d/%d)\n",
+ __FUNCTION__, rc, rp);
+ return 0;
+ }
+ return 1;
+}
+
+static void net_rx_packet(void *opaque, const uint8_t *buf, int size)
+{
+ struct XenNetDev *netdev = opaque;
+ netif_rx_request_t rxreq;
+ RING_IDX rc, rp;
+ void *page;
+
+ if (netdev->xendev.be_state != XenbusStateConnected)
+ return;
+
+ rc = netdev->rx_ring.req_cons;
+ rp = netdev->rx_ring.sring->req_prod;
+ xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
+
+ if (rc == rp || RING_REQUEST_CONS_OVERFLOW(&netdev->rx_ring, rc)) {
+ xen_be_printf(&netdev->xendev, 2, "no buffer, drop packet\n");
+ return;
+ }
+ if (size > XC_PAGE_SIZE - NET_IP_ALIGN) {
+ xen_be_printf(&netdev->xendev, 0, "packet too big (%d > %ld)",
+ size, XC_PAGE_SIZE - NET_IP_ALIGN);
+ return;
+ }
+
+ memcpy(&rxreq, RING_GET_REQUEST(&netdev->rx_ring, rc), sizeof(rxreq));
+ netdev->rx_ring.req_cons = ++rc;
+
+ page = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev,
+ netdev->xendev.dom,
+ rxreq.gref, PROT_WRITE);
+ if (NULL == page) {
+ xen_be_printf(&netdev->xendev, 0, "error: rx gref dereference failed (%d)\n",
+ rxreq.gref);
+ net_rx_response(netdev, &rxreq, NETIF_RSP_ERROR, 0, 0, 0);
+ return;
+ }
+ memcpy(page + NET_IP_ALIGN, buf, size);
+ xc_gnttab_munmap(netdev->xendev.gnttabdev, page, 1);
+ net_rx_response(netdev, &rxreq, NETIF_RSP_OKAY, NET_IP_ALIGN, size, 0);
+}
+
+/* ------------------------------------------------------------- */
+
+static int net_init(struct XenDevice *xendev)
+{
+ struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev);
+ VLANState *vlan;
+
+ /* read xenstore entries */
+ if (NULL == netdev->mac)
+ netdev->mac = xenstore_read_be_str(&netdev->xendev, "mac");
+
+ /* do we have all we need? */
+ if (NULL == netdev->mac)
+ return -1;
+
+ vlan = qemu_find_vlan(netdev->xendev.dev);
+ netdev->vs = qemu_new_vlan_client(vlan, net_rx_packet, net_rx_ok, netdev);
+ snprintf(netdev->vs->info_str, sizeof(netdev->vs->info_str),
+ "nic: xenbus vif macaddr=%s", netdev->mac);
+
+ /* fill info */
+ xenstore_write_be_int(&netdev->xendev, "feature-rx-copy", 1);
+ xenstore_write_be_int(&netdev->xendev, "feature-rx-flip", 0);
+
+ return 0;
+}
+
+static int net_connect(struct XenDevice *xendev)
+{
+ struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev);
+ int rx_copy;
+
+ if (-1 == xenstore_read_fe_int(&netdev->xendev, "tx-ring-ref",
+ &netdev->tx_ring_ref))
+ return -1;
+ if (-1 == xenstore_read_fe_int(&netdev->xendev, "rx-ring-ref",
+ &netdev->rx_ring_ref))
+ return 1;
+ if (-1 == xenstore_read_fe_int(&netdev->xendev, "event-channel",
+ &netdev->xendev.remote_port))
+ return -1;
+
+ if (-1 == xenstore_read_fe_int(&netdev->xendev, "request-rx-copy", &rx_copy))
+ rx_copy = 0;
+ if (0 == rx_copy) {
+ xen_be_printf(&netdev->xendev, 0, "frontend doesn't support rx-copy.\n");
+ return -1;
+ }
+
+ netdev->txs = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev,
+ netdev->xendev.dom,
+ netdev->tx_ring_ref,
+ PROT_READ | PROT_WRITE);
+ netdev->rxs = xc_gnttab_map_grant_ref(netdev->xendev.gnttabdev,
+ netdev->xendev.dom,
+ netdev->rx_ring_ref,
+ PROT_READ | PROT_WRITE);
+ if (!netdev->txs || !netdev->rxs)
+ return -1;
+ BACK_RING_INIT(&netdev->tx_ring, netdev->txs, XC_PAGE_SIZE);
+ BACK_RING_INIT(&netdev->rx_ring, netdev->rxs, XC_PAGE_SIZE);
+
+ xen_be_bind_evtchn(&netdev->xendev);
+
+ xen_be_printf(&netdev->xendev, 1, "ok: tx-ring-ref %d, rx-ring-ref %d, "
+ "remote port %d, local port %d\n",
+ netdev->tx_ring_ref, netdev->rx_ring_ref,
+ netdev->xendev.remote_port, netdev->xendev.local_port);
+ return 0;
+}
+
+static void net_disconnect(struct XenDevice *xendev)
+{
+ struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev);
+
+ xen_be_unbind_evtchn(&netdev->xendev);
+
+ if (netdev->txs) {
+ xc_gnttab_munmap(netdev->xendev.gnttabdev, netdev->txs, 1);
+ netdev->txs = NULL;
+ }
+ if (netdev->rxs) {
+ xc_gnttab_munmap(netdev->xendev.gnttabdev, netdev->rxs, 1);
+ netdev->rxs = NULL;
+ }
+}
+
+static void net_event(struct XenDevice *xendev)
+{
+ struct XenNetDev *netdev = container_of(xendev, struct XenNetDev, xendev);
+ net_tx_packets(netdev);
+}
+
+/* ------------------------------------------------------------- */
+
+struct XenDevOps xen_netdev_ops = {
+ .size = sizeof(struct XenNetDev),
+ .flags = DEVOPS_FLAG_NEED_GNTDEV,
+ .init = net_init,
+ .connect = net_connect,
+ .event = net_event,
+ .disconnect = net_disconnect,
+};
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 11/13] xen: blk & nic configuration via cmd line.
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
` (9 preceding siblings ...)
2008-08-21 16:27 ` [Qemu-devel] [PATCH 10/13] xen: add net " Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 16:27 ` [Qemu-devel] [PATCH 12/13] set vnc password from xenstore Gerd Hoffmann
` (3 subsequent siblings)
14 siblings, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
This patch makes qemu create backend and frontend device entries in
xenstore for devices configured on the command line. It will use
qdisk and qnic backend names, so the qemu internal backends will
be used.
Disks can be created using -drive if=xen,file=...
Nics can be created using -net nic,macaddr=...
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
Makefile.target | 2 +-
hw/xen_backend.h | 7 +++
hw/xen_devconfig.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++
hw/xen_machine_pv.c | 19 ++++++-
4 files changed, 170 insertions(+), 2 deletions(-)
create mode 100644 hw/xen_devconfig.c
diff --git a/Makefile.target b/Makefile.target
index 94b5342..7e507e4 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -516,7 +516,7 @@ LIBS += $(CONFIG_VNC_TLS_LIBS)
endif
# xen backend driver support
-XEN_OBJS := xen_machine_pv.o xen_backend.o
+XEN_OBJS := xen_machine_pv.o xen_backend.o xen_devconfig.o
XEN_OBJS += xen_console.o xen_framebuffer.o xen_disk.o xen_nic.o
ifeq ($(CONFIG_XEN), yes)
OBJS += $(XEN_OBJS)
diff --git a/hw/xen_backend.h b/hw/xen_backend.h
index 9ea4421..f591743 100644
--- a/hw/xen_backend.h
+++ b/hw/xen_backend.h
@@ -3,6 +3,8 @@
#include "xen_common.h"
#include "sysemu.h"
+#include "net.h"
+#include "block_int.h"
/* ------------------------------------------------------------- */
@@ -93,4 +95,9 @@ struct XenDevOps xen_netdev_ops; /* xen_nic.c */
void xen_set_display(int domid, DisplayState *ds);
+/* configuration (aka xenbus setup) */
+void xen_config_cleanup(void);
+int xen_config_dev_blk(DriveInfo *disk);
+int xen_config_dev_nic(NICInfo *nic);
+
#endif /* QEMU_XEN_BACKEND_H */
diff --git a/hw/xen_devconfig.c b/hw/xen_devconfig.c
new file mode 100644
index 0000000..25d28b4
--- /dev/null
+++ b/hw/xen_devconfig.c
@@ -0,0 +1,144 @@
+#include "xen_backend.h"
+
+/* ------------------------------------------------------------- */
+
+struct xs_dirs {
+ char *xs_dir;
+ TAILQ_ENTRY(xs_dirs) list;
+};
+static TAILQ_HEAD(xs_dirs_head, xs_dirs) xs_cleanup = TAILQ_HEAD_INITIALIZER(xs_cleanup);
+
+static void xen_config_cleanup_dir(char *dir)
+{
+ struct xs_dirs *d;
+
+ d = qemu_malloc(sizeof(*d));
+ if (!d)
+ return;
+ d->xs_dir = dir;
+ TAILQ_INSERT_TAIL(&xs_cleanup, d, list);
+}
+
+void xen_config_cleanup(void)
+{
+ struct xs_dirs *d;
+
+ fprintf(stderr, "xen be: %s\n", __FUNCTION__);
+ TAILQ_FOREACH(d, &xs_cleanup, list) {
+ xs_rm(xenstore, 0, d->xs_dir);
+ }
+}
+
+/* ------------------------------------------------------------- */
+
+static int xen_config_dev_mkdir(char *dev, int p)
+{
+ struct xs_permissions perms = {
+ .id = xen_domid,
+ .perms = p,
+ };
+
+ if (!xs_mkdir(xenstore, 0, dev)) {
+ fprintf(stderr, "xs_mkdir %s: failed\n", dev);
+ return -1;
+ }
+ xen_config_cleanup_dir(qemu_strdup(dev));
+
+ if (!xs_set_permissions(xenstore, 0, dev, &perms, 1)) {
+ fprintf(stderr, "%s: xs_set_permissions failed\n", __FUNCTION__);
+ return -1;
+ }
+ return 0;
+}
+
+static int xen_config_dev_dirs(char *ftype, char *btype, int vdev,
+ char *fe, char *be, int len)
+{
+ char *dom;
+
+ dom = xs_get_domain_path(xenstore, xen_domid);
+ snprintf(fe, len, "%s/device/%s/%d", dom, ftype, vdev);
+ free(dom);
+
+ dom = xs_get_domain_path(xenstore, 0);
+ snprintf(be, len, "%s/backend/%s/%d/%d", dom, btype, xen_domid, vdev);
+ free(dom);
+
+ xen_config_dev_mkdir(fe, XS_PERM_WRITE);
+ xen_config_dev_mkdir(be, XS_PERM_READ);
+ return 0;
+}
+
+static int xen_config_dev_all(char *fe, char *be)
+{
+ /* frontend */
+#if 0
+ xenstore_write_str(fe, "protocol",
+ xen_config_dev_protocol(xen));
+#endif
+ xenstore_write_int(fe, "state", XenbusStateInitialising);
+ xenstore_write_int(fe, "backend-id", 0);
+ xenstore_write_str(fe, "backend", be);
+
+ /* backend */
+ xenstore_write_str(be, "domain", qemu_name ? qemu_name : "no-name");
+ xenstore_write_int(be, "online", 1);
+ xenstore_write_int(be, "state", XenbusStateInitialising);
+ xenstore_write_int(be, "frontend-id", xen_domid);
+ xenstore_write_str(be, "frontend", fe);
+
+ return 0;
+}
+
+/* ------------------------------------------------------------- */
+
+int xen_config_dev_blk(DriveInfo *disk)
+{
+ char fe[256], be[256];
+ int vdev = 202 * 256 + 16 * disk->unit;
+ int cdrom = disk->bdrv->type == BDRV_TYPE_CDROM;
+ char *devtype = cdrom ? "cdrom" : "disk";
+ char *mode = cdrom ? "r" : "w";
+
+ snprintf(disk->bdrv->device_name, sizeof(disk->bdrv->device_name),
+ "xvd%c", 'a' + disk->unit);
+ fprintf(stderr, "xen be: config disk %d [%s]: %s\n",
+ disk->unit, disk->bdrv->device_name, disk->bdrv->filename);
+ xen_config_dev_dirs("vbd", "qdisk", vdev, fe, be, sizeof(fe));
+
+ /* frontend */
+ xenstore_write_int(fe, "virtual-device", vdev);
+ xenstore_write_str(fe, "device-type", devtype);
+
+ /* backend */
+ xenstore_write_str(be, "dev", disk->bdrv->device_name);
+ xenstore_write_str(be, "type", "file");
+ xenstore_write_str(be, "params", disk->bdrv->filename);
+ xenstore_write_str(be, "mode", mode);
+
+ /* common stuff */
+ return xen_config_dev_all(fe, be);
+}
+
+int xen_config_dev_nic(NICInfo *nic)
+{
+ char fe[256], be[256];
+ char mac[20];
+
+ snprintf(mac, sizeof(mac), "%02x:%02x:%02x:%02x:%02x:%02x",
+ nic->macaddr[0], nic->macaddr[1], nic->macaddr[2],
+ nic->macaddr[3], nic->macaddr[4], nic->macaddr[5]);
+ fprintf(stderr, "xen be: config nic %d: mac=\"%s\"\n", nic->vlan->id, mac);
+ xen_config_dev_dirs("vif", "qnic", nic->vlan->id, fe, be, sizeof(fe));
+
+ /* frontend */
+ xenstore_write_int(fe, "handle", nic->vlan->id);
+ xenstore_write_str(fe, "mac", mac);
+
+ /* backend */
+ xenstore_write_int(be, "handle", nic->vlan->id);
+ xenstore_write_str(be, "mac", mac);
+
+ /* common stuff */
+ return xen_config_dev_all(fe, be);
+}
diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
index e6d2bd6..851837c 100644
--- a/hw/xen_machine_pv.c
+++ b/hw/xen_machine_pv.c
@@ -121,7 +121,7 @@ static void xenpv_init(ram_addr_t ram_size, int vga_ram_size,
const char *cpu_model)
{
CPUState *env;
- int rc;
+ int index,i,rc;
rc = xen_init_pv(ds);
if (-1 == rc)
@@ -147,6 +147,23 @@ static void xenpv_init(ram_addr_t ram_size, int vga_ram_size,
env = cpu_init(cpu_model);
env->halted = 1;
+ /* configure disks */
+ for (i = 0; i < 16; i++) {
+ index = drive_get_index(IF_XEN, 0, i);
+ if (index == -1)
+ continue;
+ xen_config_dev_blk(drives_table + index);
+ }
+
+ /* configure nics */
+ for (i = 0; i < nb_nics; i++) {
+ if (nd_table[i].model && 0 != strcmp(nd_table[i].model, "xen"))
+ continue;
+ xen_config_dev_nic(nd_table + i);
+ }
+
+ /* config cleanup hook */
+ atexit(xen_config_cleanup);
return;
err:
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
` (10 preceding siblings ...)
2008-08-21 16:27 ` [Qemu-devel] [PATCH 11/13] xen: blk & nic configuration via cmd line Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
2008-08-21 20:14 ` Anthony Liguori
2008-08-21 20:17 ` [Qemu-devel] Re: [Xen-devel] " Daniel P. Berrange
2008-08-21 16:27 ` [Qemu-devel] [PATCH 13/13] xen: pv domain builder Gerd Hoffmann
` (2 subsequent siblings)
14 siblings, 2 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
hw/xen_machine_pv.c | 26 ++++++++++++++++++++++++++
1 files changed, 26 insertions(+), 0 deletions(-)
diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
index 851837c..9f18742 100644
--- a/hw/xen_machine_pv.c
+++ b/hw/xen_machine_pv.c
@@ -24,6 +24,7 @@
#include "hw.h"
#include "boards.h"
+#include "console.h"
#include "xen_backend.h"
@@ -110,6 +111,29 @@ static int xen_init_pv(DisplayState *ds)
return 0;
}
+static void xen_init_vnc(void)
+{
+ char xspath[256], *dom = NULL, *vm = NULL, *pw = NULL;
+ int len;
+
+ dom = xs_get_domain_path(xenstore, xen_domid);
+ snprintf(xspath, sizeof(xspath), "%s/vm", dom);
+ vm = xs_read(xenstore, 0, xspath, &len);
+ if (!vm)
+ goto out;
+ snprintf(xspath, sizeof(xspath), "%s/vncpassword", vm);
+ pw = xs_read(xenstore, 0, xspath, &len);
+ if (!pw || !strlen(pw))
+ goto out;
+ vnc_display_password(NULL, pw);
+ fprintf(stderr, "vnc password set from xenstore\n");
+
+out:
+ free(dom);
+ free(vm);
+ free(pw);
+}
+
/* -------------------------------------------------------------------- */
/* paravirtualized xen machine */
@@ -136,6 +160,8 @@ static void xenpv_init(ram_addr_t ram_size, int vga_ram_size,
goto err;
}
+ xen_init_vnc();
+
/* create dummy cpu, halted */
if (cpu_model == NULL) {
#ifdef TARGET_X86_64
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 16:27 ` [Qemu-devel] [PATCH 12/13] set vnc password from xenstore Gerd Hoffmann
@ 2008-08-21 20:14 ` Anthony Liguori
2008-08-21 20:19 ` [Xen-devel] " Daniel P. Berrange
2008-08-26 10:13 ` Ian Jackson
2008-08-21 20:17 ` [Qemu-devel] Re: [Xen-devel] " Daniel P. Berrange
1 sibling, 2 replies; 117+ messages in thread
From: Anthony Liguori @ 2008-08-21 20:14 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
I don't really like this approach. I don't want to have a totally
separate management interface just for Xend to use (via xenstore).
Can't Xend use the monitor like every other management tool?
Regards,
Anthony Liguori
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 20:14 ` Anthony Liguori
@ 2008-08-21 20:19 ` Daniel P. Berrange
2008-08-21 20:22 ` Anthony Liguori
` (2 more replies)
2008-08-26 10:13 ` Ian Jackson
1 sibling, 3 replies; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-21 20:19 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
On Thu, Aug 21, 2008 at 03:14:26PM -0500, Anthony Liguori wrote:
> I don't really like this approach. I don't want to have a totally
> separate management interface just for Xend to use (via xenstore).
>
> Can't Xend use the monitor like every other management tool?
If the monitor really isn't sufficient, then I'd suggest allowing an
open file descriptor to be specified on the -vnc arg, eg
-vnc localhost:3,passwdfd=6
And then let QEMU read the password from this file descriptor. I'd rather
XenD just used the monitor though - it'll ultimately need to use it anyway
to perform disk/network hotplug amongst other things.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 20:19 ` [Xen-devel] " Daniel P. Berrange
@ 2008-08-21 20:22 ` Anthony Liguori
2008-08-21 20:26 ` Daniel P. Berrange
` (2 more replies)
2008-08-21 20:59 ` Jamie Lokier
2008-08-21 21:29 ` Gerd Hoffmann
2 siblings, 3 replies; 117+ messages in thread
From: Anthony Liguori @ 2008-08-21 20:22 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Daniel P. Berrange wrote:
> On Thu, Aug 21, 2008 at 03:14:26PM -0500, Anthony Liguori wrote:
>
>> I don't really like this approach. I don't want to have a totally
>> separate management interface just for Xend to use (via xenstore).
>>
>> Can't Xend use the monitor like every other management tool?
>>
>
> If the monitor really isn't sufficient, then I'd suggest allowing an
> open file descriptor to be specified on the -vnc arg, eg
>
> -vnc localhost:3,passwdfd=6
>
In general, I strongly dislike passing file descriptors like this.
> And then let QEMU read the password from this file descriptor. I'd rather
> XenD just used the monitor though - it'll ultimately need to use it anyway
> to perform disk/network hotplug amongst other things.
>
If there are reasons that Xend doesn't want to interact with the monitor
(for instance, if you want to support monitor redirection), we can
certainly discuss how to enhance the monitor (like having multiple
monitor instances). Ultimately, we need to have everyone use the same
interface to interact with QEMU.
Regards,
Anthony Liguori
> Daniel
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 20:22 ` Anthony Liguori
@ 2008-08-21 20:26 ` Daniel P. Berrange
2008-08-21 21:34 ` Gerd Hoffmann
2008-08-24 8:57 ` Avi Kivity
2 siblings, 0 replies; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-21 20:26 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
On Thu, Aug 21, 2008 at 03:22:41PM -0500, Anthony Liguori wrote:
> Daniel P. Berrange wrote:
> >On Thu, Aug 21, 2008 at 03:14:26PM -0500, Anthony Liguori wrote:
> >
> >>I don't really like this approach. I don't want to have a totally
> >>separate management interface just for Xend to use (via xenstore).
> >>
> >>Can't Xend use the monitor like every other management tool?
> >>
> >
> >If the monitor really isn't sufficient, then I'd suggest allowing an
> >open file descriptor to be specified on the -vnc arg, eg
> >
> > -vnc localhost:3,passwdfd=6
> >
>
> In general, I strongly dislike passing file descriptors like this.
>
> >And then let QEMU read the password from this file descriptor. I'd rather
> >XenD just used the monitor though - it'll ultimately need to use it anyway
> >to perform disk/network hotplug amongst other things.
> >
>
> If there are reasons that Xend doesn't want to interact with the monitor
> (for instance, if you want to support monitor redirection), we can
> certainly discuss how to enhance the monitor (like having multiple
> monitor instances). Ultimately, we need to have everyone use the same
> interface to interact with QEMU.
Based on my understanding of XenD code & requirements I think it is mostly
just a historical accident that it doesn't use the monitor - quite simply
when XenD first picked up QEMU there wasn't much of use the monitor could
provide. There were no disk/network/vnc commands for it to use, so it
did the easiest thing and read the data out of xenstore, since it was
there already.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 20:22 ` Anthony Liguori
2008-08-21 20:26 ` Daniel P. Berrange
@ 2008-08-21 21:34 ` Gerd Hoffmann
2008-08-21 21:48 ` Daniel P. Berrange
2008-08-21 22:34 ` Anthony Liguori
2008-08-24 8:57 ` Avi Kivity
2 siblings, 2 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 21:34 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel, qemu-devel
Hi,
> If there are reasons that Xend doesn't want to interact with the monitor
> (for instance, if you want to support monitor redirection), we can
> certainly discuss how to enhance the monitor (like having multiple
> monitor instances).
Multiple monitor instances would be very useful anyway.
Right now there is no way to use the monitor for libvirt-managed qemu
instances because libvirt uses the monitor. Being able to both use
libvirt *and* have a monitor prompt to type commands would be great.
cheers,
Gerd
--
http://kraxel.fedorapeople.org/xenner/
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 21:34 ` Gerd Hoffmann
@ 2008-08-21 21:48 ` Daniel P. Berrange
2008-08-21 22:36 ` Anthony Liguori
` (2 more replies)
2008-08-21 22:34 ` Anthony Liguori
1 sibling, 3 replies; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-21 21:48 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: xen-devel, qemu-devel
On Thu, Aug 21, 2008 at 11:34:41PM +0200, Gerd Hoffmann wrote:
> Hi,
>
> > If there are reasons that Xend doesn't want to interact with the monitor
> > (for instance, if you want to support monitor redirection), we can
> > certainly discuss how to enhance the monitor (like having multiple
> > monitor instances).
>
> Multiple monitor instances would be very useful anyway.
>
> Right now there is no way to use the monitor for libvirt-managed qemu
> instances because libvirt uses the monitor. Being able to both use
> libvirt *and* have a monitor prompt to type commands would be great.
I disagree - that means you'll no longer be able to trust what libvirt
tells you about the VM, and libvirt won't have a guarenteed consistent
view of the VM's state because things will be changed behind its back.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 21:48 ` Daniel P. Berrange
@ 2008-08-21 22:36 ` Anthony Liguori
2008-08-24 9:03 ` Avi Kivity
2008-08-22 7:04 ` Gerd Hoffmann
2008-08-22 19:01 ` Jamie Lokier
2 siblings, 1 reply; 117+ messages in thread
From: Anthony Liguori @ 2008-08-21 22:36 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, Gerd Hoffmann, qemu-devel
Daniel P. Berrange wrote:
> On Thu, Aug 21, 2008 at 11:34:41PM +0200, Gerd Hoffmann wrote:
>
>> Hi,
>>
>> Multiple monitor instances would be very useful anyway.
>>
>> Right now there is no way to use the monitor for libvirt-managed qemu
>> instances because libvirt uses the monitor. Being able to both use
>> libvirt *and* have a monitor prompt to type commands would be great.
>>
>
> I disagree - that means you'll no longer be able to trust what libvirt
> tells you about the VM, and libvirt won't have a guarenteed consistent
> view of the VM's state because things will be changed behind its back.
>
A nice side effect of having multiple monitors is that you can implement
a proper 'select' command. This command would block and tell you about
asynchronous events (like guest CD-ROM ejecting). You could use this
mechanism to notify libvirt when things changed because of other
monitors too.
It's more work though from a libvirt perspective and you probably don't
want users screwing around in the monitor anyway.
Regards,
Anthony Liguori
> Daniel
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 22:36 ` Anthony Liguori
@ 2008-08-24 9:03 ` Avi Kivity
0 siblings, 0 replies; 117+ messages in thread
From: Avi Kivity @ 2008-08-24 9:03 UTC (permalink / raw)
To: qemu-devel
Anthony Liguori wrote:
>
> A nice side effect of having multiple monitors is that you can
> implement a proper 'select' command. This command would block and
> tell you about asynchronous events (like guest CD-ROM ejecting). You
> could use this mechanism to notify libvirt when things changed because
> of other monitors too.
I'd much prefer that a single monitor be able to receive async
notifications. It's not that difficult to implement; if notification
lines had a unique prefix the monitor client could on that.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 21:48 ` Daniel P. Berrange
2008-08-21 22:36 ` Anthony Liguori
@ 2008-08-22 7:04 ` Gerd Hoffmann
2008-08-22 19:11 ` Jamie Lokier
2008-08-22 19:01 ` Jamie Lokier
2 siblings, 1 reply; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-22 7:04 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, qemu-devel
Daniel P. Berrange wrote:
>> Multiple monitor instances would be very useful anyway.
>>
>> Right now there is no way to use the monitor for libvirt-managed qemu
>> instances because libvirt uses the monitor. Being able to both use
>> libvirt *and* have a monitor prompt to type commands would be great.
>
> I disagree - that means you'll no longer be able to trust what libvirt
> tells you about the VM, and libvirt won't have a guarenteed consistent
> view of the VM's state because things will be changed behind its back.
It is *not* the only purpose of the monitor to confuse libvirt. The
whole 'info <foo>' command family is very useful for development and
debugging purposes and it will not interfere at all with libvirt.
I can't stand your attitude to refuse any feature just because you fear
users might abuse it. That applies to almost any feature which makes
life easier for developers. Which in turn means using libvirt for
development is either a bit complicated or impossible.
cheers,
Gerd
--
http://kraxel.fedorapeople.org/xenner/
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-22 7:04 ` Gerd Hoffmann
@ 2008-08-22 19:11 ` Jamie Lokier
0 siblings, 0 replies; 117+ messages in thread
From: Jamie Lokier @ 2008-08-22 19:11 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel
Gerd Hoffmann wrote:
> Daniel P. Berrange wrote:
> >> Multiple monitor instances would be very useful anyway.
> >>
> >> Right now there is no way to use the monitor for libvirt-managed qemu
> >> instances because libvirt uses the monitor. Being able to both use
> >> libvirt *and* have a monitor prompt to type commands would be great.
> >
> > I disagree - that means you'll no longer be able to trust what libvirt
> > tells you about the VM, and libvirt won't have a guarenteed consistent
> > view of the VM's state because things will be changed behind its back.
>
> It is *not* the only purpose of the monitor to confuse libvirt. The
> whole 'info <foo>' command family is very useful for development and
> debugging purposes and it will not interfere at all with libvirt.
>
> I can't stand your attitude to refuse any feature just because you fear
> users might abuse it. That applies to almost any feature which makes
> life easier for developers. Which in turn means using libvirt for
> development is either a bit complicated or impossible.
I think it's good for libvirt to constrain and protect users, if it's
targetting regular VM users. There are too many ways to accidentally
break what can be valuable VMs.
But if that's mandatory (can't bypass it when wanted), it should
acknowledge that libvirt is _not_ suitable for people wanting to use
advanced QEMU functionality - and QEMU features must not be designed
to assume libvirt.
It would be nice to manage VMs with a tool like libvirt _and_ still
have access to the low-level stuff like the QEMU monitor, and the
command line that libvirt generates. Maybe that's asking too much and
we have to choose one or the other.
-- Jamie
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 21:48 ` Daniel P. Berrange
2008-08-21 22:36 ` Anthony Liguori
2008-08-22 7:04 ` Gerd Hoffmann
@ 2008-08-22 19:01 ` Jamie Lokier
2 siblings, 0 replies; 117+ messages in thread
From: Jamie Lokier @ 2008-08-22 19:01 UTC (permalink / raw)
To: Daniel P. Berrange, qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Daniel P. Berrange wrote:
> On Thu, Aug 21, 2008 at 11:34:41PM +0200, Gerd Hoffmann wrote:
> > Hi,
> >
> > > If there are reasons that Xend doesn't want to interact with the monitor
> > > (for instance, if you want to support monitor redirection), we can
> > > certainly discuss how to enhance the monitor (like having multiple
> > > monitor instances).
> >
> > Multiple monitor instances would be very useful anyway.
> >
> > Right now there is no way to use the monitor for libvirt-managed qemu
> > instances because libvirt uses the monitor. Being able to both use
> > libvirt *and* have a monitor prompt to type commands would be great.
>
> I disagree - that means you'll no longer be able to trust what libvirt
> tells you about the VM, and libvirt won't have a guarenteed consistent
> view of the VM's state because things will be changed behind its back.
If there were multiple monitor instances, you could filter them _all_
through libvirt, so it would always know the VM state.
That's what I do in my vM manager script. User can connect to the
manager's socket and send a monitor command. It's forwarded to QEMU,
but some filtering so the script knows what's going on, e.g. when
changing CD-ROM, starting/stopping the CPU etc.
You can't do that nicely now, though, because some commands take a
long time to execute.
Also, there's no nice way to wait for events like "tell me when the
guest reboots the CPU" at the moment.
-- Jamie
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 21:34 ` Gerd Hoffmann
2008-08-21 21:48 ` Daniel P. Berrange
@ 2008-08-21 22:34 ` Anthony Liguori
2008-08-22 8:27 ` Gerd Hoffmann
1 sibling, 1 reply; 117+ messages in thread
From: Anthony Liguori @ 2008-08-21 22:34 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: xen-devel, qemu-devel
Gerd Hoffmann wrote:
> Hi,
>
>
>> If there are reasons that Xend doesn't want to interact with the monitor
>> (for instance, if you want to support monitor redirection), we can
>> certainly discuss how to enhance the monitor (like having multiple
>> monitor instances).
>>
>
> Multiple monitor instances would be very useful anyway.
>
The trick here is to add a MonitorState context to every callback and
have term_printf() take a MonitorState as the first argument. It's a
mostly mechanical change.
> Right now there is no way to use the monitor for libvirt-managed qemu
> instances because libvirt uses the monitor. Being able to both use
> libvirt *and* have a monitor prompt to type commands would be great.
>
It would make the current monitor multiplex hackery a lot cleaner too.
Regards,
Anthony Liguori
> cheers,
> Gerd
>
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 22:34 ` Anthony Liguori
@ 2008-08-22 8:27 ` Gerd Hoffmann
0 siblings, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-22 8:27 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel, qemu-devel
Anthony Liguori wrote:
> Gerd Hoffmann wrote:
>> Multiple monitor instances would be very useful anyway.
>
> The trick here is to add a MonitorState context to every callback and
> have term_printf() take a MonitorState as the first argument. It's a
> mostly mechanical change.
The devil is in the details ...
How do we want support multiple monitors? One way would be using
multiple -monitor switches, so you can have -- say -- one pty for
libvirt and one vc for the user / developer. Another way would be
allowing multiple connects to unix/tcp sockets, which is probably a bit
more tricky due to qemu not having infrastructure for that (yet). We
also could do both ...
And when touching all monitor command functions anyway, we could also
cleanup the handler mess a bit ;)
How about this: We define *two* different callback functions:
typedef struct term_cmd_t {
[ ... ]
void (*handler_one)(MonitorState *s, const char *arg);
void (*handler_many)(MonitorState *s, int argv, char *argv[])
[ ... ]
};
The first one can be used by the monitor functions taking at most one
argument, with minimal source code changes, which is probably more than
90% of them. The remaining ones can be converted to use the second form.
cheers,
Gerd
--
http://kraxel.fedorapeople.org/xenner/
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 20:22 ` Anthony Liguori
2008-08-21 20:26 ` Daniel P. Berrange
2008-08-21 21:34 ` Gerd Hoffmann
@ 2008-08-24 8:57 ` Avi Kivity
2008-08-24 9:59 ` Jamie Lokier
2 siblings, 1 reply; 117+ messages in thread
From: Avi Kivity @ 2008-08-24 8:57 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Anthony Liguori wrote:
>>
>> -vnc localhost:3,passwdfd=6
>>
>
> In general, I strongly dislike passing file descriptors like this.
I find this very useful, and actually think we should encourage it, and
also allow passing file descriptors over the monitor (using
SCM_RIGHTS). This can help de-privilege qemu. In fact, you can run a
guest where qemu doesn't even have permissions to open the backing file.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-24 8:57 ` Avi Kivity
@ 2008-08-24 9:59 ` Jamie Lokier
2008-08-24 10:07 ` Avi Kivity
0 siblings, 1 reply; 117+ messages in thread
From: Jamie Lokier @ 2008-08-24 9:59 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Avi Kivity wrote:
> Anthony Liguori wrote:
> >> -vnc localhost:3,passwdfd=6
> >
> >In general, I strongly dislike passing file descriptors like this.
>
> I find this very useful, and actually think we should encourage it, and
> also allow passing file descriptors over the monitor (using
> SCM_RIGHTS). This can help de-privilege qemu. In fact, you can run a
> guest where qemu doesn't even have permissions to open the backing file.
You can already do this:
qemu -hdc /proc/self/fd/3 3<>MS-DOS-6.22.img
-- Jamie
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-24 9:59 ` Jamie Lokier
@ 2008-08-24 10:07 ` Avi Kivity
2008-08-24 12:48 ` Jamie Lokier
0 siblings, 1 reply; 117+ messages in thread
From: Avi Kivity @ 2008-08-24 10:07 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Jamie Lokier wrote:
> Avi Kivity wrote:
>
>> Anthony Liguori wrote:
>>
>>>> -vnc localhost:3,passwdfd=6
>>>>
>>> In general, I strongly dislike passing file descriptors like this.
>>>
>> I find this very useful, and actually think we should encourage it, and
>> also allow passing file descriptors over the monitor (using
>> SCM_RIGHTS). This can help de-privilege qemu. In fact, you can run a
>> guest where qemu doesn't even have permissions to open the backing file.
>>
>
> You can already do this:
>
> qemu -hdc /proc/self/fd/3 3<>MS-DOS-6.22.img
>
>
Not over the monitor.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-24 10:07 ` Avi Kivity
@ 2008-08-24 12:48 ` Jamie Lokier
0 siblings, 0 replies; 117+ messages in thread
From: Jamie Lokier @ 2008-08-24 12:48 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Avi Kivity wrote:
> Jamie Lokier wrote:
> >Avi Kivity wrote:
> >
> >>Anthony Liguori wrote:
> >>
> >>>> -vnc localhost:3,passwdfd=6
> >>>>
> >>>In general, I strongly dislike passing file descriptors like this.
> >>>
> >>I find this very useful, and actually think we should encourage it, and
> >>also allow passing file descriptors over the monitor (using
> >>SCM_RIGHTS). This can help de-privilege qemu. In fact, you can run a
> >>guest where qemu doesn't even have permissions to open the backing file.
> >>
> >
> >You can already do this:
> >
> > qemu -hdc /proc/self/fd/3 3<>MS-DOS-6.22.img
>
> Not over the monitor.
Oh, you're thinking of sending SCM_RIGHTS over the monitor..
Nice :-)
-- Jamie
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 20:19 ` [Xen-devel] " Daniel P. Berrange
2008-08-21 20:22 ` Anthony Liguori
@ 2008-08-21 20:59 ` Jamie Lokier
2008-08-21 21:29 ` Gerd Hoffmann
2 siblings, 0 replies; 117+ messages in thread
From: Jamie Lokier @ 2008-08-21 20:59 UTC (permalink / raw)
To: Daniel P. Berrange, qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Daniel P. Berrange wrote:
> If the monitor really isn't sufficient, then I'd suggest allowing an
> open file descriptor to be specified on the -vnc arg, eg
>
> -vnc localhost:3,passwdfd=6
If you're going to do that, pass a file name instead.
If someone wants to pass a file descriptor, they can simply write
passwd=/proc/self/fd/6.
-- Jamie
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 20:19 ` [Xen-devel] " Daniel P. Berrange
2008-08-21 20:22 ` Anthony Liguori
2008-08-21 20:59 ` Jamie Lokier
@ 2008-08-21 21:29 ` Gerd Hoffmann
2 siblings, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 21:29 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, qemu-devel
Daniel P. Berrange wrote:
> If the monitor really isn't sufficient, then I'd suggest allowing an
> open file descriptor to be specified on the -vnc arg, eg
>
> -vnc localhost:3,passwdfd=6
>
> And then let QEMU read the password from this file descriptor. I'd rather
> XenD just used the monitor though - it'll ultimately need to use it anyway
> to perform disk/network hotplug amongst other things.
I'd say using the monitor is the way to go for xend. I'd expect setting
the vnc port in xenstore (for -vncunused functionality) will be vetoed
too. Thus xend will have to use the monitor anyway to figure the vnc
port. No point in implementing a passwdfd= option then.
cheers,
Gerd
--
http://kraxel.fedorapeople.org/xenner/
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 20:14 ` Anthony Liguori
2008-08-21 20:19 ` [Xen-devel] " Daniel P. Berrange
@ 2008-08-26 10:13 ` Ian Jackson
1 sibling, 0 replies; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 10:13 UTC (permalink / raw)
To: Anthony Liguori; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Anthony Liguori writes ("[Xen-devel] Re: [Qemu-devel] [PATCH 12/13] set vnc password from xenstore."):
> I don't really like this approach. I don't want to have a totally
> separate management interface just for Xend to use (via xenstore).
I agree that the xenstore management approach is not right for qemu.
It would be better to go through a more stable interface.
> Can't Xend use the monitor like every other management tool?
I think it probably could. When I was doing my initial work to start
merging our tree with the qemu one, I didn't do this because I wanted
a drop-in replacement for the older code, and because I ran out of
time.
But as others have noted there are also two things which the monitor
doesn't do which we would need: multiple monitors, and the ability for
a monitor client to wait for events. We need multiple monitors so
that a monitor remains available to those people whose Xen
installations are already making use of it.
In the new scheme of things I envisage a xenstore <-> qemu-monitor
conversion layer. This would work much like the way xenstore.c works
in our Xen qemu tree does. It probably ought not to be part of xend,
and I think I would still like it to be compiled as part of our qemu
build.
If upstream qemu don't want a bevy of alternative management veneers
around the main qemu binary in their tree then I don't think there
would be a problem with us maintaining that indefinitely in our
version. The important thing is not which tree the code is in, but
that we should go through a stable interface intended by qemu upstream
to be used for this kind of thing.
So I would suggest that the starting point would be to provide for
multiple monitors. Then, we can gradually change the code in our
tree's xenstore.c to go via the monitor (adding functionality to the
monitor as necessary). Eventually we'll have a xenstore.c which
doesn't do anything other than go via the monitor at which point
we can either leave things, or include that upstream if desired.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* [Qemu-devel] Re: [Xen-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 16:27 ` [Qemu-devel] [PATCH 12/13] set vnc password from xenstore Gerd Hoffmann
2008-08-21 20:14 ` Anthony Liguori
@ 2008-08-21 20:17 ` Daniel P. Berrange
2008-08-21 21:25 ` Gerd Hoffmann
[not found] ` <m2n.s.1KWHfl-001Ntw@chiark.greenend.org.uk>
1 sibling, 2 replies; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-21 20:17 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: xen-devel, qemu-devel
What is the justification for this change. We already have a means
to set the VNC password from the monitor. If this isn't sufficient
we should introduce an additional method which isn't Xen specific.
I know XenD currently sends the passwords to QEMU via xenstore,
but that's not a good reason to preserve this approach. XenD
does lots of stuff via XenStore which it should be doing via
command line arguments - such as setting up disks instead of using
the -drive param.
IMHO, XenD should just set the VNC password by connecting to the
monitor. That's what its there for after all.
Daniel
On Thu, Aug 21, 2008 at 06:27:33PM +0200, Gerd Hoffmann wrote:
>
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
> hw/xen_machine_pv.c | 26 ++++++++++++++++++++++++++
> 1 files changed, 26 insertions(+), 0 deletions(-)
>
> diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
> index 851837c..9f18742 100644
> --- a/hw/xen_machine_pv.c
> +++ b/hw/xen_machine_pv.c
> @@ -24,6 +24,7 @@
>
> #include "hw.h"
> #include "boards.h"
> +#include "console.h"
>
> #include "xen_backend.h"
>
> @@ -110,6 +111,29 @@ static int xen_init_pv(DisplayState *ds)
> return 0;
> }
>
> +static void xen_init_vnc(void)
> +{
> + char xspath[256], *dom = NULL, *vm = NULL, *pw = NULL;
> + int len;
> +
> + dom = xs_get_domain_path(xenstore, xen_domid);
> + snprintf(xspath, sizeof(xspath), "%s/vm", dom);
> + vm = xs_read(xenstore, 0, xspath, &len);
> + if (!vm)
> + goto out;
> + snprintf(xspath, sizeof(xspath), "%s/vncpassword", vm);
> + pw = xs_read(xenstore, 0, xspath, &len);
> + if (!pw || !strlen(pw))
> + goto out;
> + vnc_display_password(NULL, pw);
> + fprintf(stderr, "vnc password set from xenstore\n");
> +
> +out:
> + free(dom);
> + free(vm);
> + free(pw);
> +}
> +
> /* -------------------------------------------------------------------- */
> /* paravirtualized xen machine */
>
> @@ -136,6 +160,8 @@ static void xenpv_init(ram_addr_t ram_size, int vga_ram_size,
> goto err;
> }
>
> + xen_init_vnc();
> +
> /* create dummy cpu, halted */
> if (cpu_model == NULL) {
> #ifdef TARGET_X86_64
> --
> 1.5.5.1
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* [Qemu-devel] Re: [Xen-devel] [PATCH 12/13] set vnc password from xenstore.
2008-08-21 20:17 ` [Qemu-devel] Re: [Xen-devel] " Daniel P. Berrange
@ 2008-08-21 21:25 ` Gerd Hoffmann
[not found] ` <m2n.s.1KWHfl-001Ntw@chiark.greenend.org.uk>
1 sibling, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 21:25 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, qemu-devel
Daniel P. Berrange wrote:
> What is the justification for this change. We already have a means
> to set the VNC password from the monitor. If this isn't sufficient
> we should introduce an additional method which isn't Xen specific.
Justification is that I've just implemented what qemu-dm does today, so
upstream qemu can easily replace qemu-dm.
I'm perfectly fine with dropping the patch altogether.
> IMHO, XenD should just set the VNC password by connecting to the
> monitor. That's what its there for after all.
Agreed.
cheers,
Gerd
^ permalink raw reply [flat|nested] 117+ messages in thread
[parent not found: <m2n.s.1KWHfl-001Ntw@chiark.greenend.org.uk>]
* Re: [Qemu-devel] Re: [Xen-devel] [PATCH 12/13] set vnc password from xenstore.
[not found] ` <m2n.s.1KWHfl-001Ntw@chiark.greenend.org.uk>
@ 2008-08-26 10:47 ` Ian Jackson
0 siblings, 0 replies; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 10:47 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel
Gerd Hoffmann writes ("[Qemu-devel] Re: [Xen-devel] [PATCH 12/13] set vnc password from xenstore."):
> Justification is that I've just implemented what qemu-dm does today, so
> upstream qemu can easily replace qemu-dm.
This seems to be predicated on the idea that xen upstream are just
going to drop our tree in favour of Gerd's. That seems unlikely to
me.
Rather, we would like to evolve the various branches so that they are
closer together. In particular, where changes are though to be needed
to bring qemu-xen closer to upstream, we (Xen upstream) would like to
consider each such change on its own merits. We don't intend to
import a whole pile of externally-written Xen support code to replace
what we have now.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* [Qemu-devel] [PATCH 13/13] xen: pv domain builder.
2008-08-21 16:27 [Qemu-devel] [PATCH 00/13] merge some xen bits into qemu Gerd Hoffmann
` (11 preceding siblings ...)
2008-08-21 16:27 ` [Qemu-devel] [PATCH 12/13] set vnc password from xenstore Gerd Hoffmann
@ 2008-08-21 16:27 ` Gerd Hoffmann
[not found] ` <m2n.s.1KWDD7-002TB9@chiark.greenend.org.uk>
[not found] ` <m2n.s.1KWDfC-002TCd@chiark.greenend.org.uk>
14 siblings, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-21 16:27 UTC (permalink / raw)
To: qemu-devel, xen-devel; +Cc: Gerd Hoffmann
This adds domain building support for paravirtual domains to qemu.
This allows booting xen guests directly with qemu, without Xend
and the management stack.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
Makefile.target | 2 +-
configure | 2 +-
hw/xen_backend.h | 3 +
hw/xen_devconfig.c | 29 +++++
hw/xen_domainbuild.c | 281 ++++++++++++++++++++++++++++++++++++++++++++++++++
hw/xen_domainbuild.h | 9 ++
hw/xen_machine_pv.c | 34 ++++---
7 files changed, 344 insertions(+), 16 deletions(-)
create mode 100644 hw/xen_domainbuild.c
create mode 100644 hw/xen_domainbuild.h
diff --git a/Makefile.target b/Makefile.target
index 7e507e4..47bfc77 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -516,7 +516,7 @@ LIBS += $(CONFIG_VNC_TLS_LIBS)
endif
# xen backend driver support
-XEN_OBJS := xen_machine_pv.o xen_backend.o xen_devconfig.o
+XEN_OBJS := xen_machine_pv.o xen_backend.o xen_devconfig.o xen_domainbuild.o
XEN_OBJS += xen_console.o xen_framebuffer.o xen_disk.o xen_nic.o
ifeq ($(CONFIG_XEN), yes)
OBJS += $(XEN_OBJS)
diff --git a/configure b/configure
index 14689d2..cec0b48 100755
--- a/configure
+++ b/configure
@@ -1228,7 +1228,7 @@ fi
if test "$xen" = "yes" ; then
echo "CONFIG_XEN=yes" >> $config_mak
echo "#define CONFIG_XEN 1" >> $config_h
- echo "XEN_LIBS=-lxenstore -lxenctrl" >> $config_mak
+ echo "XEN_LIBS=-lxenstore -lxenctrl -lxenguest" >> $config_mak
fi
if test "$aio" = "yes" ; then
echo "#define CONFIG_AIO 1" >> $config_h
diff --git a/hw/xen_backend.h b/hw/xen_backend.h
index f591743..70ad016 100644
--- a/hw/xen_backend.h
+++ b/hw/xen_backend.h
@@ -99,5 +99,8 @@ void xen_set_display(int domid, DisplayState *ds);
void xen_config_cleanup(void);
int xen_config_dev_blk(DriveInfo *disk);
int xen_config_dev_nic(NICInfo *nic);
+int xen_config_dev_vfb(int vdev, char *type);
+int xen_config_dev_vkbd(int vdev);
+int xen_config_dev_console(int vdev);
#endif /* QEMU_XEN_BACKEND_H */
diff --git a/hw/xen_devconfig.c b/hw/xen_devconfig.c
index 25d28b4..0d506ca 100644
--- a/hw/xen_devconfig.c
+++ b/hw/xen_devconfig.c
@@ -142,3 +142,32 @@ int xen_config_dev_nic(NICInfo *nic)
/* common stuff */
return xen_config_dev_all(fe, be);
}
+
+int xen_config_dev_vfb(int vdev, char *type)
+{
+ char fe[256], be[256];
+
+ xen_config_dev_dirs("vfb", "vfb", vdev, fe, be, sizeof(fe));
+
+ /* backend */
+ xenstore_write_str(be, "type", type);
+
+ /* common stuff */
+ return xen_config_dev_all(fe, be);
+}
+
+int xen_config_dev_vkbd(int vdev)
+{
+ char fe[256], be[256];
+
+ xen_config_dev_dirs("vkbd", "vkbd", vdev, fe, be, sizeof(fe));
+ return xen_config_dev_all(fe, be);
+}
+
+int xen_config_dev_console(int vdev)
+{
+ char fe[256], be[256];
+
+ xen_config_dev_dirs("console", "console", vdev, fe, be, sizeof(fe));
+ return xen_config_dev_all(fe, be);
+}
diff --git a/hw/xen_domainbuild.c b/hw/xen_domainbuild.c
new file mode 100644
index 0000000..deba98d
--- /dev/null
+++ b/hw/xen_domainbuild.c
@@ -0,0 +1,281 @@
+#include "xen_backend.h"
+#include "xen_domainbuild.h"
+#include "sysemu.h"
+#include "qemu-timer.h"
+
+#include <xenguest.h>
+
+/* temporary */
+static const char qemu_uuid[] = "f0c9d298-3dd9-4c20-8c90-0205ac932241";
+
+static int xenstore_domain_mkdir(char *path)
+{
+ struct xs_permissions perms = {
+ .id = xen_domid,
+ .perms = XS_PERM_WRITE,
+ };
+
+ if (!xs_mkdir(xenstore, 0, path)) {
+ fprintf(stderr, "%s: xs_mkdir %s: failed\n", __FUNCTION__, path);
+ return -1;
+ }
+ if (!xs_set_permissions(xenstore, 0, path, &perms, 1)) {
+ fprintf(stderr, "%s: xs_set_permissions failed\n", __FUNCTION__);
+ return -1;
+ }
+ return 0;
+}
+
+static int xenstore_domain_init1(const char *kernel, const char *ramdisk,
+ const char *cmdline)
+{
+ char *dom, vm[256], path[256];
+ int i;
+
+ dom = xs_get_domain_path(xenstore, xen_domid);
+ snprintf(vm, sizeof(vm), "/vm/%s", qemu_uuid);
+
+ xenstore_domain_mkdir(dom);
+
+ xenstore_write_str(vm, "image/ostype", "linux");
+ if (kernel)
+ xenstore_write_str(vm, "image/kernel", kernel);
+ if (ramdisk)
+ xenstore_write_str(vm, "image/ramdisk", ramdisk);
+ if (cmdline)
+ xenstore_write_str(vm, "image/cmdline", cmdline);
+
+ /* name + id */
+ xenstore_write_str(vm, "name", qemu_name ? qemu_name : "no-name");
+ xenstore_write_str(vm, "uuid", qemu_uuid);
+ xenstore_write_str(dom, "name", qemu_name ? qemu_name : "no-name");
+ xenstore_write_int(dom, "domid", xen_domid);
+ xenstore_write_str(dom, "vm", vm);
+
+ /* memory */
+ xenstore_write_int(dom, "memory/target", ram_size >> 10); // kB
+ xenstore_write_int(vm, "memory", ram_size >> 20); // MB
+ xenstore_write_int(vm, "maxmem", ram_size >> 20); // MB
+
+ /* cpus */
+ for (i = 0; i < smp_cpus; i++) {
+ snprintf(path, sizeof(path), "cpu/%d/availability",i);
+ xenstore_write_str(dom, path, "online");
+ }
+ xenstore_write_int(vm, "vcpu_avail", smp_cpus);
+ xenstore_write_int(vm, "vcpus", smp_cpus);
+
+ /* vnc password */
+ xenstore_write_str(vm, "vncpassword", "" /* FIXME */);
+
+ free(dom);
+ return 0;
+}
+
+static int xenstore_domain_init2(int xenstore_port, int xenstore_mfn,
+ int console_port, int console_mfn)
+{
+ char *dom;
+
+ dom = xs_get_domain_path(xenstore, xen_domid);
+
+ /* signal new domain */
+ xs_introduce_domain(xenstore,
+ xen_domid,
+ xenstore_mfn,
+ xenstore_port);
+
+ /* xenstore */
+ xenstore_write_int(dom, "store/ring-ref", xenstore_mfn);
+ xenstore_write_int(dom, "store/port", xenstore_port);
+
+ /* console */
+ xenstore_write_str(dom, "console/type", "ioemu");
+ xenstore_write_int(dom, "console/limit", 128 * 1024);
+ xenstore_write_int(dom, "console/ring-ref", console_mfn);
+ xenstore_write_int(dom, "console/port", console_port);
+ xen_config_dev_console(0);
+
+ /* devices */
+ if (1 /* FIXME */) {
+ xen_config_dev_vfb(0, "vnc");
+ xen_config_dev_vkbd(0);
+ }
+
+ free(dom);
+ return 0;
+}
+
+/* ------------------------------------------------------------- */
+
+static QEMUTimer *xen_poll;
+
+/* check domain state once per second */
+static void xen_domain_poll(void *opaque)
+{
+ struct xc_dominfo info;
+ int rc;
+
+ rc = xc_domain_getinfo(xen_xc, xen_domid, 1, &info);
+ if ((1 != rc) || (info.domid != xen_domid)) {
+ fprintf(stderr, "xen: domain %d is gone\n", xen_domid);
+ goto quit;
+ }
+ if (info.dying) {
+ fprintf(stderr, "xen: domain %d is dying (%s%s)\n", xen_domid,
+ info.crashed ? "crashed" : "",
+ info.shutdown ? "shutdown" : "");
+ goto quit;
+ }
+
+ qemu_mod_timer(xen_poll, qemu_get_clock(rt_clock) + 1000);
+ return;
+
+quit:
+ qemu_system_shutdown_request();
+ return;
+}
+
+static void xen_domain_watcher(void)
+{
+ int qemu_running = 1;
+ int fd[2], i, rc;
+ char byte;
+
+ pipe(fd);
+ if (0 != fork())
+ return; /* not child */
+
+ /* close all file handles, except stdio/out/err,
+ * our watch pipe and the xen interface handle */
+ for (i = 3; i < 256; i++) {
+ if (i == fd[0])
+ continue;
+ if (i == xen_xc)
+ continue;
+ close(i);
+ }
+
+ /* wait for qemu exiting */
+ while (qemu_running) {
+ rc = read(fd[0], &byte, 1);
+ switch (rc) {
+ case -1:
+ if (EINTR == errno)
+ continue;
+ fprintf(stderr, "%s: Huh? read error: %s\n", __FUNCTION__, strerror(errno));
+ qemu_running = 0;
+ break;
+ case 0:
+ /* EOF -> qemu exited */
+ qemu_running = 0;
+ break;
+ default:
+ fprintf(stderr, "%s: Huh? data on the watch pipe?\n", __FUNCTION__);
+ break;
+ }
+ }
+
+ /* cleanup */
+ fprintf(stderr, "%s: destroy domain %d\n", __FUNCTION__, xen_domid);
+ xc_domain_destroy(xen_xc, xen_domid);
+ _exit(0);
+}
+
+/* normal cleanup */
+static void xen_domain_cleanup(void)
+{
+ char *dom;
+
+ dom = xs_get_domain_path(xenstore, xen_domid);
+ if (dom) {
+ xs_rm(xenstore, 0, dom);
+ free(dom);
+ }
+ xs_release_domain(xenstore, xen_domid);
+}
+
+int xen_domain_build_pv(const char *kernel, const char *ramdisk,
+ const char *cmdline)
+{
+ uint32_t ssidref = 0;
+ uint32_t flags = 0;
+ xen_domain_handle_t uuid;
+ unsigned int xenstore_port = 0, console_port = 0;
+ unsigned long xenstore_mfn = 0, console_mfn = 0;
+ int rc, i, pos, val;
+
+ for (i = 0, pos = 0; i < 16; i++, pos += 2) {
+ if (i == 4 || i == 6 || i == 8 || i == 10) {
+ /* skip dashes */
+ if ('-' != qemu_uuid[pos]) {
+ fprintf(stderr, "xen: uuid parse error");
+ goto err;
+ }
+ pos++;
+ }
+ if (1 != sscanf(qemu_uuid+pos, "%02x", &val)) {
+ fprintf(stderr, "xen: uuid parse error");
+ goto err;
+ }
+ uuid[i] = val;
+ }
+
+ rc = xc_domain_create(xen_xc, ssidref, uuid, flags, &xen_domid);
+ if (rc < 0) {
+ fprintf(stderr, "xen: xc_domain_create() failed\n");
+ goto err;
+ }
+ fprintf(stderr, "xen: created domain %d\n", xen_domid);
+ atexit(xen_domain_cleanup);
+ xen_domain_watcher();
+
+ xenstore_domain_init1(kernel, ramdisk, cmdline);
+
+ rc = xc_domain_max_vcpus(xen_xc, xen_domid, smp_cpus);
+ if (rc < 0) {
+ fprintf(stderr, "xen: xc_domain_max_vcpus() failed\n");
+ goto err;
+ }
+
+ rc = xc_domain_setcpuweight(xen_xc, xen_domid, 256);
+ if (rc < 0) {
+ fprintf(stderr, "xen: xc_domain_setcpuweight() failed\n");
+ goto err;
+ }
+
+ rc = xc_domain_setmaxmem(xen_xc, xen_domid, ram_size >> 10);
+ if (rc < 0) {
+ fprintf(stderr, "xen: xc_domain_setmaxmem() failed\n");
+ goto err;
+ }
+
+ xenstore_port = xc_evtchn_alloc_unbound(xen_xc, xen_domid, 0);
+ console_port = xc_evtchn_alloc_unbound(xen_xc, xen_domid, 0);
+
+ rc = xc_linux_build(xen_xc, xen_domid, ram_size >> 20,
+ kernel, ramdisk, cmdline,
+ 0, flags,
+ xenstore_port, &xenstore_mfn,
+ console_port, &console_mfn);
+ if (rc < 0) {
+ fprintf(stderr, "xen: xc_linux_build() failed\n");
+ goto err;
+ }
+
+ xenstore_domain_init2(xenstore_port, xenstore_mfn,
+ console_port, console_mfn);
+
+ rc = xc_domain_unpause(xen_xc, xen_domid);
+ if (rc < 0) {
+ fprintf(stderr, "xen: xc_domain_unpause() failed\n");
+ goto err;
+ }
+
+ xen_poll = qemu_new_timer(rt_clock, xen_domain_poll, NULL);
+ qemu_mod_timer(xen_poll, qemu_get_clock(rt_clock) + 1000);
+ return 0;
+
+err:
+ return -1;
+}
diff --git a/hw/xen_domainbuild.h b/hw/xen_domainbuild.h
new file mode 100644
index 0000000..21ee180
--- /dev/null
+++ b/hw/xen_domainbuild.h
@@ -0,0 +1,9 @@
+#ifndef QEMU_XEN_DOMAINBUILD_H
+#define QEMU_XEN_DOMAINBUILD_H 1
+
+#include "xen_common.h"
+
+int xen_domain_build_pv(const char *kernel, const char *ramdisk,
+ const char *cmdline);
+
+#endif /* QEMU_XEN_DOMAINBUILD_H */
diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
index 9f18742..8c9f3c1 100644
--- a/hw/xen_machine_pv.c
+++ b/hw/xen_machine_pv.c
@@ -27,6 +27,7 @@
#include "console.h"
#include "xen_backend.h"
+#include "xen_domainbuild.h"
/* -------------------------------------------------------------------- */
/* variables */
@@ -90,7 +91,9 @@ static int xen_init(void)
return 0;
}
-static int xen_init_pv(DisplayState *ds)
+static int xen_init_pv(const char *kernel_filename,
+ const char *kernel_cmdline,
+ const char *initrd_filename)
{
int rc;
@@ -98,6 +101,19 @@ static int xen_init_pv(DisplayState *ds)
if (rc < 0)
return rc;
+ if (xen_emulate) {
+ fprintf(stderr, "xen pv emulation not implemented yet\n");
+ return -1;
+ }
+
+ if (xen_domainbuild) {
+ if (xen_domain_build_pv(kernel_filename, initrd_filename,
+ kernel_cmdline) < 0) {
+ fprintf(stderr, "xen pv domain creation failed\n");
+ return -1;
+ }
+ }
+
/* xenbus backend drivers */
xen_be_register("console", &xen_console_ops);
xen_be_register("vkbd", &xen_kbdmouse_ops);
@@ -105,9 +121,6 @@ static int xen_init_pv(DisplayState *ds)
xen_be_register("qdisk", &xen_blkdev_ops);
xen_be_register("qnic", &xen_netdev_ops);
- /* setup framebuffer */
- xen_set_display(xen_domid, ds);
-
return 0;
}
@@ -147,19 +160,12 @@ static void xenpv_init(ram_addr_t ram_size, int vga_ram_size,
CPUState *env;
int index,i,rc;
- rc = xen_init_pv(ds);
+ rc = xen_init_pv(kernel_filename, kernel_cmdline, initrd_filename);
if (-1 == rc)
goto err;
- if (xen_emulate) {
- fprintf(stderr, "xen pv emulation not implemented yet\n");
- goto err;
- }
- if (xen_domainbuild) {
- fprintf(stderr, "xen pv domain creation not implemented yet\n");
- goto err;
- }
-
+ /* setup framebuffer */
+ xen_set_display(xen_domid, ds);
xen_init_vnc();
/* create dummy cpu, halted */
--
1.5.5.1
^ permalink raw reply related [flat|nested] 117+ messages in thread
[parent not found: <m2n.s.1KWDD7-002TB9@chiark.greenend.org.uk>]
* Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
[not found] ` <m2n.s.1KWDD7-002TB9@chiark.greenend.org.uk>
@ 2008-08-26 10:31 ` Ian Jackson
2008-08-26 12:34 ` Gerd Hoffmann
2008-08-26 12:57 ` [Xen-devel] " Daniel P. Berrange
0 siblings, 2 replies; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 10:31 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, Gerd Hoffmann
Gerd Hoffmann writes ("[Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> +/*
> + * Figure the environment we are running in.
> + * Returns true when xen is present, false otherwise.
> + * Also checks whenever the domain specified via -domid
> + * exists (so we can attach) or whenever it must be created.
> + */
I don't think this is the right approach. The intent appears to be
that if you run a particular qemu rune, it will do a completely
different thing when running under Xen.
One bad effect of this is that it is not possible to do a software
emulation of the Xen environment if the host happens to actually be
running under Xen.
The command line, or chosen qemu executable, should specify what qemu
is supposed to do:
- supporting tasks (hardware emulation etc.)
for a real Xen domain managed by Xen tools via xenstore etc.
- software emulation of the Xen environment for the benefit
of a Xen domU but without using the Xen hypervisor (and
this should obviously work even if running under Xen0
- whatever other use cases are important; apparently Gerd
seems keen to run some code with parts of the xend toolstack
replaced
Note that of course we have also ported qemu to our embedded minios so
that it can run as a service domain.
> + "-domid specify xen guest domain id\n"
Now would be a good time to rename this option to have the word `xen'
in it.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 10:31 ` [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support Ian Jackson
@ 2008-08-26 12:34 ` Gerd Hoffmann
2008-08-26 12:57 ` [Xen-devel] " Daniel P. Berrange
1 sibling, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-26 12:34 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel
Ian Jackson wrote:
> Gerd Hoffmann writes ("[Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
>> +/*
>> + * Figure the environment we are running in.
>> + * Returns true when xen is present, false otherwise.
>> + * Also checks whenever the domain specified via -domid
>> + * exists (so we can attach) or whenever it must be created.
>> + */
>
> I don't think this is the right approach. The intent appears to be
> that if you run a particular qemu rune, it will do a completely
> different thing when running under Xen.
That is the intended default behaviour, yes. If you run on Xen, you
most likely want to use it, right? That matches behaviour of kvm and
kqemu: If support is available, it is used by default (there are
options to turn it off).
> One bad effect of this is that it is not possible to do a software
> emulation of the Xen environment if the host happens to actually be
> running under Xen.
That isn't a fundamental issue. It can trivially be fixed by adding a
command line option, which just has to set xen_present = 0.
> The command line, or chosen qemu executable, should specify what qemu
> is supposed to do:
> - supporting tasks (hardware emulation etc.)
> for a real Xen domain managed by Xen tools via xenstore etc.
> - software emulation of the Xen environment for the benefit
> of a Xen domU but without using the Xen hypervisor (and
> this should obviously work even if running under Xen0
> - whatever other use cases are important; apparently Gerd
> seems keen to run some code with parts of the xend toolstack
> replaced
We can certainly add command line options to force a specific operation
mode, see above. I don't think such an option should be required
though. IMHO qemu can and should pick sensible defaults.
>> + "-domid specify xen guest domain id\n"
>
> Now would be a good time to rename this option to have the word `xen'
> in it.
How about -xen-domid? And prefix any other xen specific options with
-xen too of course?
cheers,
Gerd
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 10:31 ` [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support Ian Jackson
2008-08-26 12:34 ` Gerd Hoffmann
@ 2008-08-26 12:57 ` Daniel P. Berrange
2008-08-26 13:20 ` Gerd Hoffmann
2008-08-26 13:24 ` Ian Jackson
1 sibling, 2 replies; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-26 12:57 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
On Tue, Aug 26, 2008 at 11:31:47AM +0100, Ian Jackson wrote:
> Gerd Hoffmann writes ("[Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> > +/*
> > + * Figure the environment we are running in.
> > + * Returns true when xen is present, false otherwise.
> > + * Also checks whenever the domain specified via -domid
> > + * exists (so we can attach) or whenever it must be created.
> > + */
>
> I don't think this is the right approach. The intent appears to be
> that if you run a particular qemu rune, it will do a completely
> different thing when running under Xen.
This particular check is doing two different jobs in one go. I think
they need to be dealt with separately.
- Check to see if running on Xen or not. This is a similar idea to
to pv_ops where we have one kernel binary, and probe at runtime
to decide whether to run in KVM, Xen, VMWare or bare metal mode.
This is important to QEMU too. Merging the source trees for all
projects using QEMU is just the first step. A single binary for
all is the holy grail, though may not be practical for Mini-OS
usage
- Check to see whether the guest domain is pre-created (ie attaching
to an existing VM created by XenD), or whether QEMU is constructing
the domain from scratch.
Probing for whether a domain ID exists in the hypervisor, and if not, then
creating it, has a nasty race condition, where XenD could launch QEMU
and then (for whatever reason) decide to kill off the domain, but QEMU
then re-creates it during its startup procedure.
If we want QEMU to explicitly created Xen domains from scratch, rather than
attaching to an existing one, then there needs to be a way to reliably
specify this behaviour via the CLI.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 12:57 ` [Xen-devel] " Daniel P. Berrange
@ 2008-08-26 13:20 ` Gerd Hoffmann
2008-08-26 13:23 ` Daniel P. Berrange
2008-08-26 13:24 ` Ian Jackson
1 sibling, 1 reply; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-26 13:20 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, Ian Jackson, qemu-devel
Daniel P. Berrange wrote:
> Probing for whether a domain ID exists in the hypervisor, and if not, then
> creating it, has a nasty race condition, where XenD could launch QEMU
> and then (for whatever reason) decide to kill off the domain, but QEMU
> then re-creates it during its startup procedure.
Oops. Yes, that needs to be addressed.
> If we want QEMU to explicitly created Xen domains from scratch, rather than
> attaching to an existing one, then there needs to be a way to reliably
> specify this behaviour via the CLI.
So we are scoring at three options for xen now:
-xen-domid <nr> specify xen domain id.
-xen-create ask qemu to create the domain.
-xen-emulate force xen emulation mode.
I'll go add these for the next round.
cheers,
Gerd
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 13:20 ` Gerd Hoffmann
@ 2008-08-26 13:23 ` Daniel P. Berrange
2008-08-26 13:33 ` Ian Jackson
0 siblings, 1 reply; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-26 13:23 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: xen-devel, Ian Jackson, qemu-devel
On Tue, Aug 26, 2008 at 03:20:06PM +0200, Gerd Hoffmann wrote:
> Daniel P. Berrange wrote:
> > Probing for whether a domain ID exists in the hypervisor, and if not, then
> > creating it, has a nasty race condition, where XenD could launch QEMU
> > and then (for whatever reason) decide to kill off the domain, but QEMU
> > then re-creates it during its startup procedure.
>
> Oops. Yes, that needs to be addressed.
>
> > If we want QEMU to explicitly created Xen domains from scratch, rather than
> > attaching to an existing one, then there needs to be a way to reliably
> > specify this behaviour via the CLI.
>
> So we are scoring at three options for xen now:
>
> -xen-domid <nr> specify xen domain id.
> -xen-create ask qemu to create the domain.
Actually I'd invert that, eg -xen-attach since common QEMU semantics
are that it is completely in charge of creation. Attaching to an externally
created VM is the xen additional semantics.
Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 13:23 ` Daniel P. Berrange
@ 2008-08-26 13:33 ` Ian Jackson
2008-08-26 14:14 ` Daniel P. Berrange
[not found] ` <m2n.s.1KXzR5-002R2A@chiark.greenend.org.uk>
0 siblings, 2 replies; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 13:33 UTC (permalink / raw)
To: Gerd Hoffmann, Daniel P. Berrange; +Cc: xen-devel, qemu-devel
Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> So we are scoring at three options for xen now:
>
> -xen-domid <nr> specify xen domain id.
> -xen-create ask qemu to create the domain.
> -xen-emulate force xen emulation mode.
No, the correct set of options are:
-xen-bare no xend on this system, qemu should create a Xen domain
without this, Xen machine types use emulated
Xen functionality
-xen-domid <nr> has slightly different meanings in the various use cases
And if you really think you must have the same executable for the xend
case:
-xen-attach used by xend; qemu is to support a Xen HVM domain
not to be used from the command line
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 13:33 ` Ian Jackson
@ 2008-08-26 14:14 ` Daniel P. Berrange
2008-08-26 14:31 ` Gerd Hoffmann
[not found] ` <m2n.s.1KXzR5-002R2A@chiark.greenend.org.uk>
1 sibling, 1 reply; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-26 14:14 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, Gerd Hoffmann, qemu-devel
On Tue, Aug 26, 2008 at 02:33:09PM +0100, Ian Jackson wrote:
> Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> > So we are scoring at three options for xen now:
> >
> > -xen-domid <nr> specify xen domain id.
> > -xen-create ask qemu to create the domain.
> > -xen-emulate force xen emulation mode.
>
> No, the correct set of options are:
>
> -xen-bare no xend on this system, qemu should create a Xen domain
> without this, Xen machine types use emulated
> Xen functionality
I'm not sure I understand what you mean here ? Are you suggesting this
for the 'Xen hypervisor present, standalone mode - ie no XenD' ? If so
I think that's not right - for standalone mode we shouldn't require any
special args for the admin. We should only require additional args for
cases where management tools like XenD or libvirt are invoking QEMU.
> -xen-domid <nr> has slightly different meanings in the various use cases
Perhaps best not to overload this then - pass domid to the -attach arg
directly, and have a -xen-create <nr> for the scenario where you want
to specify the domid for a newly created VM
> And if you really think you must have the same executable for the xend
> case:
>
> -xen-attach used by xend; qemu is to support a Xen HVM domain
> not to be used from the command line
I'll try to enumerate the scenarios here...
On KVM, using Xenner + shell : No special args
On KVM, using Xenner + libvirtd : -xen-create <domid>
On Xen HV + shell : No special args
On Xen HV + libvirtd : -xen-create <domid>
On Xen HV + XenD : -xen-attach <domid>
NB. by 'shell' I mean the admin manually launching QEMU from the shell
If neither -xen-create or -xen-attach is given, then it will create a
new domain from scratch & auto-allocate a domid. Passing both the
-xen-create and -xen-attach args is forbidden / nonsensical.
In all cases, the actual hypervisor domain ID should be exposed via the
monitor console.
Ian, did I miss any of the use cases you want addressed ? I'm leaving
mini-os out of this now, since as you say, its impossible to have one
binary deal with it, but I'll assume its similar to -xen-attach <domid>
use case in its working.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 14:14 ` Daniel P. Berrange
@ 2008-08-26 14:31 ` Gerd Hoffmann
2008-08-26 14:40 ` Daniel P. Berrange
0 siblings, 1 reply; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-26 14:31 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, Ian Jackson, qemu-devel
Daniel P. Berrange wrote:
> I'll try to enumerate the scenarios here...
>
> On KVM, using Xenner + shell : No special args
> On KVM, using Xenner + libvirtd : -xen-create <domid>
This is "xen emulation" and (according to long term plan plan) doesn't
require kvm but should work with qemu emulating things too. This means
it doesn't depend on kvm, you can do that with xen underneath too. This
is what the -xen-emulate switch is good for.
> On Xen HV + shell : No special args
> On Xen HV + libvirtd : -xen-create <domid>
> On Xen HV + XenD : -xen-attach <domid>
I like the "create" and "attach" namings as they make pretty clear what
qemu should do.
cheers,
Gerd
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 14:31 ` Gerd Hoffmann
@ 2008-08-26 14:40 ` Daniel P. Berrange
2008-08-26 14:49 ` Gerd Hoffmann
0 siblings, 1 reply; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-26 14:40 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: xen-devel, Ian Jackson, qemu-devel
On Tue, Aug 26, 2008 at 04:31:49PM +0200, Gerd Hoffmann wrote:
> Daniel P. Berrange wrote:
> > I'll try to enumerate the scenarios here...
> >
> > On KVM, using Xenner + shell : No special args
> > On KVM, using Xenner + libvirtd : -xen-create <domid>
>
> This is "xen emulation" and (according to long term plan plan) doesn't
> require kvm but should work with qemu emulating things too. This means
> it doesn't depend on kvm, you can do that with xen underneath too. This
> is what the -xen-emulate switch is good for.
Wouldn't the non-KVM case already be dealt with by the -no-kvm command
line arg to QEMU ?
> > On Xen HV + shell : No special args
> > On Xen HV + libvirtd : -xen-create <domid>
> > On Xen HV + XenD : -xen-attach <domid>
>
> I like the "create" and "attach" namings as they make pretty clear what
> qemu should do.
Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 14:40 ` Daniel P. Berrange
@ 2008-08-26 14:49 ` Gerd Hoffmann
2008-08-26 14:53 ` Daniel P. Berrange
2008-08-26 14:53 ` Ian Jackson
0 siblings, 2 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-26 14:49 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, Ian Jackson, qemu-devel
Daniel P. Berrange wrote:
> On Tue, Aug 26, 2008 at 04:31:49PM +0200, Gerd Hoffmann wrote:
>> Daniel P. Berrange wrote:
>>> I'll try to enumerate the scenarios here...
>>>
>>> On KVM, using Xenner + shell : No special args
>>> On KVM, using Xenner + libvirtd : -xen-create <domid>
>> This is "xen emulation" and (according to long term plan plan) doesn't
>> require kvm but should work with qemu emulating things too. This means
>> it doesn't depend on kvm, you can do that with xen underneath too. This
>> is what the -xen-emulate switch is good for.
>
> Wouldn't the non-KVM case already be dealt with by the -no-kvm command
> line arg to QEMU ?
But when running on xen we want being able to qemu tell it should *not*
use the xen hypervisor but emulate things, thus -xen-emulate.
cheers,
Gerd
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 14:49 ` Gerd Hoffmann
@ 2008-08-26 14:53 ` Daniel P. Berrange
2008-08-26 19:24 ` Gerd Hoffmann
2008-08-26 14:53 ` Ian Jackson
1 sibling, 1 reply; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-26 14:53 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: xen-devel, Ian Jackson, qemu-devel
On Tue, Aug 26, 2008 at 04:49:05PM +0200, Gerd Hoffmann wrote:
> Daniel P. Berrange wrote:
> > On Tue, Aug 26, 2008 at 04:31:49PM +0200, Gerd Hoffmann wrote:
> >> Daniel P. Berrange wrote:
> >>> I'll try to enumerate the scenarios here...
> >>>
> >>> On KVM, using Xenner + shell : No special args
> >>> On KVM, using Xenner + libvirtd : -xen-create <domid>
> >> This is "xen emulation" and (according to long term plan plan) doesn't
> >> require kvm but should work with qemu emulating things too. This means
> >> it doesn't depend on kvm, you can do that with xen underneath too. This
> >> is what the -xen-emulate switch is good for.
> >
> > Wouldn't the non-KVM case already be dealt with by the -no-kvm command
> > line arg to QEMU ?
>
> But when running on xen we want being able to qemu tell it should *not*
> use the xen hypervisor but emulate things, thus -xen-emulate.
Oh, i see what you mean. Existing naming practice would suggest naming
it -no-xenhv or just -no-xen
All these '-no-XXX' args are getting a little silly though. I'd rather
tell QEMU what I *do* want, than what I don't want. A generic arg
'-accelerator none|xen|kvm|kqemu' might be worth considering in the future.
That can be a separate discussion from these patches though - we should
just follow the -no-XXX naming for now to minimize diffs
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 14:53 ` Daniel P. Berrange
@ 2008-08-26 19:24 ` Gerd Hoffmann
0 siblings, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-26 19:24 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, Ian Jackson, qemu-devel
Daniel P. Berrange wrote:
> On Tue, Aug 26, 2008 at 04:49:05PM +0200, Gerd Hoffmann wrote:
>> But when running on xen we want being able to qemu tell it should *not*
>> use the xen hypervisor but emulate things, thus -xen-emulate.
>
> Oh, i see what you mean. Existing naming practice would suggest naming
> it -no-xenhv or just -no-xen
Well, as you are going to run a xen guest -no-xen wouldn't be very
intuitive IMHO. -no-xenhv is only slightly better.
> All these '-no-XXX' args are getting a little silly though. I'd rather
> tell QEMU what I *do* want, than what I don't want. A generic arg
> '-accelerator none|xen|kvm|kqemu' might be worth considering in the future.
Agreed.
cheers,
Gerd
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 14:49 ` Gerd Hoffmann
2008-08-26 14:53 ` Daniel P. Berrange
@ 2008-08-26 14:53 ` Ian Jackson
2008-08-27 8:51 ` Gerd Hoffmann
1 sibling, 1 reply; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 14:53 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: xen-devel, qemu-devel
Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> But when running on xen we want being able to qemu tell it should *not*
> use the xen hypervisor but emulate things, thus -xen-emulate.
That should be the default. It is not safe to use the Xen hypervisor
directly if there is a management toolstack which expects to mediate
domain creation.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 14:53 ` Ian Jackson
@ 2008-08-27 8:51 ` Gerd Hoffmann
2008-08-27 9:53 ` Ian Jackson
0 siblings, 1 reply; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-27 8:51 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel
Ian Jackson wrote:
> Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
>> But when running on xen we want being able to qemu tell it should *not*
>> use the xen hypervisor but emulate things, thus -xen-emulate.
>
> That should be the default. It is not safe to use the Xen hypervisor
> directly if there is a management toolstack which expects to mediate
> domain creation.
Ok. I think we have three cases then:
(1) xen emulation using xenner (default, with or without kvm).
(2) running on xen, with xend creating the domain, and
(3) running on xen, with qemu creating the domain.
For (2) and (3) we need command line switches to enable these modes.
Specifying a domain ID makes sense in all three cases though. So I'd
keep the -xen-domid switch, like this:
-xen-domid <nr> specify xen domain id.
-xen-attach attach to specified domain (created by xend).
-xen-create create the domain directly.
Comments?
cheers,
Gerd
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-27 8:51 ` Gerd Hoffmann
@ 2008-08-27 9:53 ` Ian Jackson
2008-08-27 9:56 ` Daniel P. Berrange
0 siblings, 1 reply; 117+ messages in thread
From: Ian Jackson @ 2008-08-27 9:53 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: xen-devel, qemu-devel
Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> Ok. I think we have three cases then:
>
> (1) xen emulation using xenner (default, with or without kvm).
> (2) running on xen, with xend creating the domain, and
> (3) running on xen, with qemu creating the domain.
>
> For (2) and (3) we need command line switches to enable these modes.
> Specifying a domain ID makes sense in all three cases though. So I'd
> keep the -xen-domid switch, like this:
>
> -xen-domid <nr> specify xen domain id.
> -xen-attach attach to specified domain (created by xend).
> -xen-create create the domain directly.
>
> Comments?
That sounds reasonable. Can we please make sure that the usage
message for -xen-create and -xen-attach makes it clear that a user
should not supply these options on a Xen system using the Xen
management toolstack ? (Which I think includes a system using xm and
also a system using libvirt, although perhaps Daniel Berrange will
correct me.)
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-27 9:53 ` Ian Jackson
@ 2008-08-27 9:56 ` Daniel P. Berrange
2008-08-27 10:00 ` Ian Jackson
0 siblings, 1 reply; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-27 9:56 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, Gerd Hoffmann, qemu-devel
On Wed, Aug 27, 2008 at 10:53:37AM +0100, Ian Jackson wrote:
> Gerd Hoffmann writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> > Ok. I think we have three cases then:
> >
> > (1) xen emulation using xenner (default, with or without kvm).
> > (2) running on xen, with xend creating the domain, and
> > (3) running on xen, with qemu creating the domain.
> >
> > For (2) and (3) we need command line switches to enable these modes.
> > Specifying a domain ID makes sense in all three cases though. So I'd
> > keep the -xen-domid switch, like this:
> >
> > -xen-domid <nr> specify xen domain id.
> > -xen-attach attach to specified domain (created by xend).
> > -xen-create create the domain directly.
> >
> > Comments?
>
> That sounds reasonable. Can we please make sure that the usage
> message for -xen-create and -xen-attach makes it clear that a user
> should not supply these options on a Xen system using the Xen
> management toolstack ? (Which I think includes a system using xm and
> also a system using libvirt, although perhaps Daniel Berrange will
> correct me.)
There's no requirement from libvirt itself - just whatever infrastructure
libvirt is using. So just a message about XenD/xm would be sufficient.
Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-27 9:56 ` Daniel P. Berrange
@ 2008-08-27 10:00 ` Ian Jackson
2008-08-27 10:23 ` Daniel P. Berrange
0 siblings, 1 reply; 117+ messages in thread
From: Ian Jackson @ 2008-08-27 10:00 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, Gerd Hoffmann, qemu-devel
Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> There's no requirement from libvirt itself - just whatever infrastructure
> libvirt is using. So just a message about XenD/xm would be sufficient.
Right, but I just wanted to avoid the situation where a naive user
sees `not for xm/xend systems' and thinks `that's not me because I'm
using libvirt'.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-27 10:00 ` Ian Jackson
@ 2008-08-27 10:23 ` Daniel P. Berrange
2008-08-27 10:37 ` Gerd Hoffmann
0 siblings, 1 reply; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-27 10:23 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, Gerd Hoffmann, qemu-devel
On Wed, Aug 27, 2008 at 11:00:16AM +0100, Ian Jackson wrote:
> Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> > There's no requirement from libvirt itself - just whatever infrastructure
> > libvirt is using. So just a message about XenD/xm would be sufficient.
>
> Right, but I just wanted to avoid the situation where a naive user
> sees `not for xm/xend systems' and thinks `that's not me because I'm
> using libvirt'.
They could be right in that thinking though - if using libvirt's QEMU
backend, instead of XenD backend, then it'd be fine to launch VMs
manually. Only the presence of XenD places constraints on usage.
Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-27 10:23 ` Daniel P. Berrange
@ 2008-08-27 10:37 ` Gerd Hoffmann
2008-08-27 10:44 ` Daniel P. Berrange
0 siblings, 1 reply; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-27 10:37 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, Ian Jackson, qemu-devel
Daniel P. Berrange wrote:
> On Wed, Aug 27, 2008 at 11:00:16AM +0100, Ian Jackson wrote:
>> Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
>>> There's no requirement from libvirt itself - just whatever infrastructure
>>> libvirt is using. So just a message about XenD/xm would be sufficient.
>> Right, but I just wanted to avoid the situation where a naive user
>> sees `not for xm/xend systems' and thinks `that's not me because I'm
>> using libvirt'.
>
> They could be right in that thinking though - if using libvirt's QEMU
> backend, instead of XenD backend, then it'd be fine to launch VMs
> manually. Only the presence of XenD places constraints on usage.
What constrains btw?
I'd expect with libvirt managing domains via qemu -xen-create and xend
managing domains via -xen-attach basically run into the same class of
problems if the user starts booting domains manually.
The most obvious issue would be is that domain IDs are in use by the
user-started domains which the management stack thinks are unused and
thus might be allocated for new domains. Likewise, both management
stacks will find domains in xenstore they don't know anything about.
So I don't see a fundamental difference between libvirt and xend here.
cheers,
Gerd
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-27 10:37 ` Gerd Hoffmann
@ 2008-08-27 10:44 ` Daniel P. Berrange
2008-08-27 11:01 ` Gerd Hoffmann
0 siblings, 1 reply; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-27 10:44 UTC (permalink / raw)
To: Gerd Hoffmann; +Cc: xen-devel, Ian Jackson, qemu-devel
On Wed, Aug 27, 2008 at 12:37:42PM +0200, Gerd Hoffmann wrote:
> Daniel P. Berrange wrote:
> > On Wed, Aug 27, 2008 at 11:00:16AM +0100, Ian Jackson wrote:
> >> Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> >>> There's no requirement from libvirt itself - just whatever infrastructure
> >>> libvirt is using. So just a message about XenD/xm would be sufficient.
> >> Right, but I just wanted to avoid the situation where a naive user
> >> sees `not for xm/xend systems' and thinks `that's not me because I'm
> >> using libvirt'.
> >
> > They could be right in that thinking though - if using libvirt's QEMU
> > backend, instead of XenD backend, then it'd be fine to launch VMs
> > manually. Only the presence of XenD places constraints on usage.
>
> What constrains btw?
If you start an Xen guest directly with QEMU, that'll be seen by XenD
and bad things will start to happen. At best, XenD will tear down your
manually created domain if it sees it before you write enough info
into xenstore.
> I'd expect with libvirt managing domains via qemu -xen-create and xend
> managing domains via -xen-attach basically run into the same class of
> problems if the user starts booting domains manually.
>
> The most obvious issue would be is that domain IDs are in use by the
> user-started domains which the management stack thinks are unused and
> thus might be allocated for new domains. Likewise, both management
> stacks will find domains in xenstore they don't know anything about.
>
> So I don't see a fundamental difference between libvirt and xend here.
In the context of the QEMU driver in libvirt, we don't currently care what
the actual Xen hypervisor domain ID is - we make up our own domain ID
space, and have no need for it to map to hypervisor IDs. Thus, externally
created QEMU instances can use the HV too and won't clash. In fact libvirt
won't even care if the VM is using the HV or not - it is just another QEMU
process - all the xenstore/xen HV interaction is self contained inside
QEMU.
In the context of the Xen driver in libvirt, we're going via XenD and
don't create VMs manually.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-27 10:44 ` Daniel P. Berrange
@ 2008-08-27 11:01 ` Gerd Hoffmann
0 siblings, 0 replies; 117+ messages in thread
From: Gerd Hoffmann @ 2008-08-27 11:01 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, Ian Jackson, qemu-devel
Daniel P. Berrange wrote:
>>> They could be right in that thinking though - if using libvirt's QEMU
>>> backend, instead of XenD backend, then it'd be fine to launch VMs
>>> manually. Only the presence of XenD places constraints on usage.
>> What constrains btw?
>
> If you start an Xen guest directly with QEMU, that'll be seen by XenD
> and bad things will start to happen. At best, XenD will tear down your
> manually created domain if it sees it before you write enough info
> into xenstore.
Didn't saw that happen.
>> I'd expect with libvirt managing domains via qemu -xen-create and xend
>> managing domains via -xen-attach basically run into the same class of
>> problems if the user starts booting domains manually.
>>
>> The most obvious issue would be is that domain IDs are in use by the
>> user-started domains which the management stack thinks are unused and
>> thus might be allocated for new domains. Likewise, both management
>> stacks will find domains in xenstore they don't know anything about.
>>
>> So I don't see a fundamental difference between libvirt and xend here.
>
> In the context of the QEMU driver in libvirt, we don't currently care what
> the actual Xen hypervisor domain ID is - we make up our own domain ID
> space, and have no need for it to map to hypervisor IDs.
They should be identical as that would be less confusing to the user.
Having two ID spaces, both called domain ID, is a very bad idea IMHO.
> In fact libvirt
> won't even care if the VM is using the HV or not - it is just another QEMU
> process - all the xenstore/xen HV interaction is self contained inside
> QEMU.
Yep. But even in that case it would be good if libvirt and the guest os
have the same idea about what the domain ID is.
cheers,
Gerd
^ permalink raw reply [flat|nested] 117+ messages in thread
[parent not found: <m2n.s.1KXzR5-002R2A@chiark.greenend.org.uk>]
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
[not found] ` <m2n.s.1KXzR5-002R2A@chiark.greenend.org.uk>
@ 2008-08-26 15:05 ` Ian Jackson
2008-08-26 15:09 ` Daniel P. Berrange
0 siblings, 1 reply; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 15:05 UTC (permalink / raw)
To: Daniel P. Berrange, qemu-devel; +Cc: xen-devel, Ian Jackson, Gerd Hoffmann
Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> On Tue, Aug 26, 2008 at 02:33:09PM +0100, Ian Jackson wrote:
> > -xen-bare no xend on this system, qemu should create a Xen domain
> > without this, Xen machine types use emulated
> > Xen functionality
>
> I'm not sure I understand what you mean here ? Are you suggesting this
> for the 'Xen hypervisor present, standalone mode - ie no XenD' ?
Yes.
> If so I think that's not right - for standalone mode we shouldn't
> require any special args for the admin. We should only require
> additional args for cases where management tools like XenD or
> libvirt are invoking QEMU.
The usual case for a Xen installation is that xend is running. In
that situation, it is wrong for qemu to create a domain directly.
That will interfere with xend's management of the whole system.
So that means that it is not right to detect that we are running under
Xen and if so create a domain by default. That should only be done if
the person invoking qemu knows that there is no xend and therefore
that creating the domain directly is sensible.
> I'll try to enumerate the scenarios here...
>
> On KVM, using Xenner + shell : No special args
> On KVM, using Xenner + libvirtd : -xen-create <domid>
> On Xen HV + shell : No special args
> On Xen HV + libvirtd : -xen-create <domid>
> On Xen HV + XenD : -xen-attach <domid>
You've missed out entirely software emulation of a Xen environment.
AIUI this is pretty mucbh Xenner without KVM.
And see my comments above about running under Xen. The Xen hypervisor
is not a system facility which user processes are expected to make use
of without negotiating with the prevailing Xen management stack. This
is unlike KVM, where all KVM's callers can be oblivious to each other.
If you run qemu to invoke a Xen domU image it should not disturb any
existing actual Xen installation, and the behaviour and outcome should
not depend on whether the whole lot is running under a real Xen (with
xend et al). It is fine to use KVM if available because that's just a
performance improvement.
This is because the way to start a Xen domU, as stated in the Xen
documentation etc., is via the Xen management tools. If a user tries
to do it via qemu then they are either making a mistake, or intending
to run an emulated Xen environment. (Note that they may be running
qemu on a Xen domU in which case creating a domain isn't even
possible.)
If we make qemu actually create a guest by direct hypercalls, the Xen
management stack will be disrupted ...
> If neither -xen-create or -xen-attach is given, then it will create a
> new domain from scratch & auto-allocate a domid. Passing both the
> -xen-create and -xen-attach args is forbidden / nonsensical.
... so this behaviour should not be the default.
> Ian, did I miss any of the use cases you want addressed ? I'm leaving
> mini-os out of this now, since as you say, its impossible to have one
> binary deal with it, but I'll assume its similar to -xen-attach <domid>
> use case in its working.
Fairly similar.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 15:05 ` Ian Jackson
@ 2008-08-26 15:09 ` Daniel P. Berrange
0 siblings, 0 replies; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-26 15:09 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
On Tue, Aug 26, 2008 at 04:05:01PM +0100, Ian Jackson wrote:
> Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> > I'll try to enumerate the scenarios here...
> >
> > On KVM, using Xenner + shell : No special args
> > On KVM, using Xenner + libvirtd : -xen-create <domid>
> > On Xen HV + shell : No special args
> > On Xen HV + libvirtd : -xen-create <domid>
> > On Xen HV + XenD : -xen-attach <domid>
>
> You've missed out entirely software emulation of a Xen environment.
> AIUI this is pretty mucbh Xenner without KVM.
>
> And see my comments above about running under Xen. The Xen hypervisor
> is not a system facility which user processes are expected to make use
> of without negotiating with the prevailing Xen management stack. This
> is unlike KVM, where all KVM's callers can be oblivious to each other.
>
> If you run qemu to invoke a Xen domU image it should not disturb any
> existing actual Xen installation, and the behaviour and outcome should
> not depend on whether the whole lot is running under a real Xen (with
> xend et al). It is fine to use KVM if available because that's just a
> performance improvement.
>
> This is because the way to start a Xen domU, as stated in the Xen
> documentation etc., is via the Xen management tools. If a user tries
> to do it via qemu then they are either making a mistake, or intending
> to run an emulated Xen environment. (Note that they may be running
> qemu on a Xen domU in which case creating a domain isn't even
> possible.)
>
> If we make qemu actually create a guest by direct hypercalls, the Xen
> management stack will be disrupted ...
Ok, in that case dis-regard my sugestion of '-no-xen' in the other mail
I just sent. Clearly the negative style args won't work in the xen case,
and we have to go for a positive switch - maybe it really is worth doing
an explicit '-accelerator none|kvm|xen|kqemu' right away for this
patchset...
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 12:57 ` [Xen-devel] " Daniel P. Berrange
2008-08-26 13:20 ` Gerd Hoffmann
@ 2008-08-26 13:24 ` Ian Jackson
2008-08-26 13:50 ` Daniel P. Berrange
1 sibling, 1 reply; 117+ messages in thread
From: Ian Jackson @ 2008-08-26 13:24 UTC (permalink / raw)
To: Daniel P. Berrange; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> This particular check is doing two different jobs in one go. I think
> they need to be dealt with separately.
>
> - Check to see if running on Xen or not. This is a similar idea to
> to pv_ops where we have one kernel binary, and probe at runtime
> to decide whether to run in KVM, Xen, VMWare or bare metal mode.
I don't think this is correct for the Xen case. Certainly it is not
correct for a qemu invoked by the Xen management tools, and it is also
not correct for a qemu invoked by hand on a system which already has
xend etc.
If you're trying to run qemu under Xen but standalone without xend
then you're going to have to tell qemu that that's what you want.
Otherwise if you try to run the same qemu under Xen but _with_ xend,
qemu in dom0 will create a domain bypassing xend and cause all sorts
of trouble.
Autodetection of this kind of thing is fragile, and in the Xen
toolstack case also unnecessary (since the Xen toolstack can invoke
qemu with appropriate options to tell it only to do the emulation
parts).
So I think the default mode of operation for qemu instructed to
provide a Xen domU environment should be 100% software, or via KVM if
KVM is available. KVM is designed as a facility to be used by user
processes without negotiation with the system administrator so it is
fine to autodetect that and use it.
The Xen hypercall and domain management interfaces are not like that:
the interfaces provided by the hypervisor et al should be used only by
cooperation with the Xen management stack.
> This is important to QEMU too. Merging the source trees for all
> projects using QEMU is just the first step. A single binary for
> all is the holy grail, though may not be practical for Mini-OS
> usage
I'm certainly in favour of closer integration. If qemu upstream are
willing to take all of our code, including the parts which serve modes
of operation and features which qemu upstream don't agree are needed,
then excellent.
But I think that for now it would be easier to concentrate on
providing interfaces to allow the qemu-xen tree to mainly be
additional code which goes through well-defined public interfaces to
the rest of qemu. (From this point of view it's a shame that qemu
itself isn't more like a library than an executable.)
Certainly the minios port can't be the same executable. The qemu code
is statically linked with minios into what amounts to a monolithic PV
guest kernel.
> If we want QEMU to explicitly created Xen domains from scratch, rather than
> attaching to an existing one, then there needs to be a way to reliably
> specify this behaviour via the CLI.
Yes.
And this should not be done by default because on a system with xend,
you don't want to be messing with xend's idea of the running domains.
Ian.
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support
2008-08-26 13:24 ` Ian Jackson
@ 2008-08-26 13:50 ` Daniel P. Berrange
0 siblings, 0 replies; 117+ messages in thread
From: Daniel P. Berrange @ 2008-08-26 13:50 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel, qemu-devel, Gerd Hoffmann
On Tue, Aug 26, 2008 at 02:24:10PM +0100, Ian Jackson wrote:
> Daniel P. Berrange writes ("Re: [Xen-devel] Re: [Qemu-devel] [PATCH 05/13] xen: groundwork for xen support"):
> > This particular check is doing two different jobs in one go. I think
> > they need to be dealt with separately.
> >
> > - Check to see if running on Xen or not. This is a similar idea to
> > to pv_ops where we have one kernel binary, and probe at runtime
> > to decide whether to run in KVM, Xen, VMWare or bare metal mode.
>
> I don't think this is correct for the Xen case. Certainly it is not
> correct for a qemu invoked by the Xen management tools, and it is also
> not correct for a qemu invoked by hand on a system which already has
> xend etc.
>
> If you're trying to run qemu under Xen but standalone without xend
> then you're going to have to tell qemu that that's what you want.
> Otherwise if you try to run the same qemu under Xen but _with_ xend,
> qemu in dom0 will create a domain bypassing xend and cause all sorts
> of trouble.
Yes, mixing manually created VMs & XenD managed VMs is not a use case
I was thinking we should try to address. The core use case for creating
VMs on the Xen hypervisor, is that you want to avoid the python stack,
and thus you aren't likely to have XenD around anyway.
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
^ permalink raw reply [flat|nested] 117+ messages in thread
[parent not found: <m2n.s.1KWDfC-002TCd@chiark.greenend.org.uk>]