From: Namhyung Kim <namhyung@kernel.org>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Tony Luck" <tony.luck@intel.com>,
"Kees Cook" <keescook@chromium.org>,
kvm@vger.kernel.org, "Radim Krčmář" <rkrcmar@redhat.com>,
"Anton Vorontsov" <anton@enomsg.org>,
LKML <linux-kernel@vger.kernel.org>,
"Steven Rostedt" <rostedt@goodmis.org>,
qemu-devel@nongnu.org, "Minchan Kim" <minchan@kernel.org>,
"Anthony Liguori" <aliguori@amazon.com>,
"Colin Cross" <ccross@android.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
virtualization@lists.linux-foundation.org,
"Ingo Molnar" <mingo@kernel.org>
Subject: Re: [PATCH 6/7] qemu: Implement virtio-pstore device
Date: Thu, 28 Jul 2016 14:39:53 +0900 [thread overview]
Message-ID: <20160728053953.GB4371@sejong> (raw)
In-Reply-To: <20160728023254-mutt-send-email-mst@kernel.org>
On Thu, Jul 28, 2016 at 03:02:54AM +0300, Michael S. Tsirkin wrote:
> On Thu, Jul 28, 2016 at 12:08:30AM +0900, Namhyung Kim wrote:
> > Add virtio pstore device to allow kernel log files saved on the host.
> > It will save the log files on the directory given by pstore device
> > option.
> >
> > $ qemu-system-x86_64 -device virtio-pstore,directory=dir-xx ...
> >
> > (guest) # echo c > /proc/sysrq-trigger
>
> So if the point is handling system crashes, I suspect
> virtio is the wrong protocol to use. ATM it's rather
> elaborate for performance, it's likely not to work when
> linux is crashing. I think you want something very very
> simple that will still work when you crash. Like maybe
> a serial device?
Well, I dont' know. As you know, the kernel oops dump is already sent
to serial device but it's rather slow. As I wrote in the cover
letter, enabling ftrace_dump_on_oops makes it even worse.. Also
pstore saves the (compressed) binary data so I thought it'd be better
to have a dedicated IO channel.
I know that we can't rely on anything in kernel when the kernel is
crashing. In the virtualization environment, I think it'd be great if
it can dump the crash data in the host directly so I chose the virtio.
I thought the virtio is a relatively thin layer and independent from
other kernel features. Even if it doesn't guarantee to work 100%,
it's worth trying IMHO.
>
> > $ ls dir-xx
> > dmesg-1.enc.z dmesg-2.enc.z
> >
> > The log files are usually compressed using zlib. Users can see the log
> > messages directly on the host or on the guest (using pstore filesystem).
>
> So this lacks all management tools that a regular storage device
> has, and these are necessary if people are to use this
> in production.
>
> For example, some kind of provision for limiting the amount of host disk
> this can consume seems called for. Rate-limiting disk writes
> on host also seems necessary. Handling host disk errors always
> propagates error to guest but an option to e.g. stop vm might
> be useful to avoid corruption.
Yes, it needs that kind of management. I'd like to look at the
existing implementation. But basically I thought it as a debugging
feature and it won't produce much data for the default setting. Maybe
I can limit the file size not to exceed the buffer size and keep up to
pre-configured number of files for each type. Now host disk error
will be propagated to guest.
>
> >
> > The 'directory' property is required for virtio-pstore device to work.
> > It also adds 'bufsize' and 'console' (boolean) properties.
>
> No idea what these do. Seem to be RW by guest but have no effect
> otherwise.
The 'directory' is a pathname which it saves the data files. The
'bufsize' sets the size of buffer used by pstore. The 'console'
enables saving pstore console type data.
>
>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Cc: Anton Vorontsov <anton@enomsg.org>
> > Cc: Colin Cross <ccross@android.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Tony Luck <tony.luck@intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Minchan Kim <minchan@kernel.org>
> > Cc: kvm@vger.kernel.org
> > Cc: qemu-devel@nongnu.org
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
[SNIP]
> > diff --git a/hw/virtio/virtio-pstore.c b/hw/virtio/virtio-pstore.c
> > new file mode 100644
> > index 0000000..2ca7786
> > --- /dev/null
> > +++ b/hw/virtio/virtio-pstore.c
> > @@ -0,0 +1,477 @@
> > +/*
> > + * Virtio Pstore Device
> > + *
> > + * Copyright (C) 2016 LG Electronics
> > + *
> > + * Authors:
> > + * Namhyung Kim <namhyung@gmail.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2.
>
> We generally ask new code to be v2 or later.
Ok, will update.
>
> > See
> > + * the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include <stdio.h>
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/iov.h"
> > +#include "qemu-common.h"
> > +#include "qemu/cutils.h"
> > +#include "qemu/error-report.h"
> > +#include "sysemu/kvm.h"
> > +#include "qapi/visitor.h"
> > +#include "qapi-event.h"
> > +#include "trace.h"
> > +
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-bus.h"
> > +#include "hw/virtio/virtio-access.h"
> > +#include "hw/virtio/virtio-pstore.h"
> > +
> > +
> > +static void virtio_pstore_to_filename(VirtIOPstore *s, char *buf, size_t sz,
> > + struct virtio_pstore_req *req)
> > +{
> > + const char *basename;
> > + unsigned long long id = 0;
> > + unsigned int flags = le32_to_cpu(req->flags);
> > +
> > + switch (le16_to_cpu(req->type)) {
> > + case VIRTIO_PSTORE_TYPE_DMESG:
> > + basename = "dmesg";
> > + id = s->id++;
> > + break;
> > + case VIRTIO_PSTORE_TYPE_CONSOLE:
> > + basename = "console";
> > + if (s->console_id) {
> > + id = s->console_id;
> > + } else {
> > + id = s->console_id = s->id++;
> > + }
> > + break;
> > + default:
> > + basename = "unknown";
> > + break;
> > + }
> > +
> > + snprintf(buf, sz, "%s/%s-%llu%s", s->directory, basename, id,
> > + flags & VIRTIO_PSTORE_FL_COMPRESSED ? ".enc.z" : "");
> > +}
> > +
> > +static void virtio_pstore_from_filename(VirtIOPstore *s, char *name,
> > + char *buf, size_t sz,
> > + struct virtio_pstore_fileinfo *info)
> > +{
> > + snprintf(buf, sz, "%s/%s", s->directory, name);
>
> if this does not fit, buf will not be 0 terminated and can
> generally be corrupted.
Will change it to use malloc instead.
>
>
> > +
> > + if (g_str_has_prefix(name, "dmesg-")) {
> > + info->type = VIRTIO_PSTORE_TYPE_DMESG;
> > + name += strlen("dmesg-");
> > + } else if (g_str_has_prefix(name, "console-")) {
> > + info->type = VIRTIO_PSTORE_TYPE_CONSOLE;
> > + name += strlen("console-");
> > + } else if (g_str_has_prefix(name, "unknown-")) {
> > + info->type = VIRTIO_PSTORE_TYPE_UNKNOWN;
> > + name += strlen("unknown-");
> > + }
> > +
> > + qemu_strtoull(name, NULL, 0, &info->id);
> > +
> > + info->flags = 0;
> > + if (g_str_has_suffix(name, ".enc.z")) {
> > + info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
> > + }
> > +}
[SNIP]
> > +static ssize_t virtio_pstore_do_write(VirtIOPstore *s, struct iovec *out_sg,
> > + unsigned int out_num,
> > + struct virtio_pstore_req *req)
> > +{
> > + char path[PATH_MAX];
> > + int fd;
> > + ssize_t len;
> > + unsigned short type;
> > + int flags = O_WRONLY | O_CREAT;
> > +
> > + /* we already consume the req */
> > + iov_discard_front(&out_sg, &out_num, sizeof(*req));
> > +
> > + virtio_pstore_to_filename(s, path, sizeof(path), req);
> > +
> > + type = le16_to_cpu(req->type);
> > +
> > + if (type == VIRTIO_PSTORE_TYPE_DMESG) {
> > + flags |= O_TRUNC;
> > + } else if (type == VIRTIO_PSTORE_TYPE_CONSOLE) {
> > + flags |= O_APPEND;
> > + }
> > +
> > + fd = open(path, flags, 0644);
> > + if (fd < 0) {
> > + error_report("cannot open %s", path);
> > + return -1;
> > + }
> > + len = writev(fd, out_sg, out_num);
> > + close(fd);
> > +
> > + return len;
>
> All this is blocking VM until host io completes.
Hmm.. I don't know about the internals of qemu. So does it make guest
stop? If so, that's what I want to do for _DMESG. :) As it's called
only on kernel oops I think it's admittable. But for _CONSOLE, it
needs to do asynchronously. Maybe I can add a thread to do the work.
>
>
> > +}
> > +
> > +static ssize_t virtio_pstore_do_close(VirtIOPstore *s)
> > +{
> > + if (s->dirp == NULL) {
> > + return 0;
> > + }
> > +
> > + closedir(s->dirp);
> > + s->dirp = NULL;
> > +
> > + return 0;
> > +}
> > +
> > +static ssize_t virtio_pstore_do_erase(VirtIOPstore *s,
> > + struct virtio_pstore_req *req)
> > +{
> > + char path[PATH_MAX];
> > +
> > + virtio_pstore_to_filename(s, path, sizeof(path), req);
> > +
> > + return unlink(path);
> > +}
> > +
> > +static void virtio_pstore_handle_io(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > + VirtIOPstore *s = VIRTIO_PSTORE(vdev);
> > + VirtQueueElement *elem;
> > + struct virtio_pstore_req req;
> > + struct virtio_pstore_res res;
> > + ssize_t len = 0;
> > + int ret;
> > +
> > + for (;;) {
> > + elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> > + if (!elem) {
> > + return;
> > + }
> > +
> > + if (elem->out_num < 1 || elem->in_num < 1) {
> > + error_report("request or response buffer is missing");
> > + exit(1);
> > + }
> > +
> > + len = iov_to_buf(elem->out_sg, elem->out_num, 0, &req, sizeof(req));
> > + if (len != (ssize_t)sizeof(req)) {
> > + error_report("invalid request size: %ld", (long)len);
> > + exit(1);
> > + }
> > + res.cmd = req.cmd;
> > + res.type = req.type;
> > +
> > + switch (le16_to_cpu(req.cmd)) {
> > + case VIRTIO_PSTORE_CMD_OPEN:
> > + ret = virtio_pstore_do_open(s);
> > + break;
> > + case VIRTIO_PSTORE_CMD_READ:
> > + ret = virtio_pstore_do_read(s, elem->in_sg, elem->in_num, &res);
> > + if (ret > 0) {
> > + len = ret;
> > + ret = 0;
> > + }
> > + break;
> > + case VIRTIO_PSTORE_CMD_WRITE:
> > + ret = virtio_pstore_do_write(s, elem->out_sg, elem->out_num, &req);
> > + break;
> > + case VIRTIO_PSTORE_CMD_CLOSE:
> > + ret = virtio_pstore_do_close(s);
> > + break;
> > + case VIRTIO_PSTORE_CMD_ERASE:
> > + ret = virtio_pstore_do_erase(s, &req);
> > + break;
> > + default:
> > + ret = -1;
> > + break;
> > + }
> > +
> > + res.ret = ret;
> > +
> > + iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > + virtqueue_push(vq, elem, sizeof(res) + len);
>
> this is wrong - len should be # of bytes written into guest memory.
???
All command except READ only write the virtio_pstore_ret so len is 0.
For READ, virtio_pstore_do_read() returns the actual bytes it wrote
(minus sizeof(res) of course), so I think it's fine, no?
Anyway, thanks for your review!
Thanks,
Namhyung
>
> > +
> > + virtio_notify(vdev, vq);
> > + g_free(elem);
> > +
> > + if (ret < 0) {
> > + return;
> > + }
> > + }
> > +}
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
WARNING: multiple messages have this Message-ID (diff)
From: Namhyung Kim <namhyung@kernel.org>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org,
virtualization@lists.linux-foundation.org,
LKML <linux-kernel@vger.kernel.org>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Radim Krčmář" <rkrcmar@redhat.com>,
"Anthony Liguori" <aliguori@amazon.com>,
"Anton Vorontsov" <anton@enomsg.org>,
"Colin Cross" <ccross@android.com>,
"Kees Cook" <keescook@chromium.org>,
"Tony Luck" <tony.luck@intel.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Ingo Molnar" <mingo@kernel.org>,
"Minchan Kim" <minchan@kernel.org>
Subject: Re: [PATCH 6/7] qemu: Implement virtio-pstore device
Date: Thu, 28 Jul 2016 14:39:53 +0900 [thread overview]
Message-ID: <20160728053953.GB4371@sejong> (raw)
In-Reply-To: <20160728023254-mutt-send-email-mst@kernel.org>
On Thu, Jul 28, 2016 at 03:02:54AM +0300, Michael S. Tsirkin wrote:
> On Thu, Jul 28, 2016 at 12:08:30AM +0900, Namhyung Kim wrote:
> > Add virtio pstore device to allow kernel log files saved on the host.
> > It will save the log files on the directory given by pstore device
> > option.
> >
> > $ qemu-system-x86_64 -device virtio-pstore,directory=dir-xx ...
> >
> > (guest) # echo c > /proc/sysrq-trigger
>
> So if the point is handling system crashes, I suspect
> virtio is the wrong protocol to use. ATM it's rather
> elaborate for performance, it's likely not to work when
> linux is crashing. I think you want something very very
> simple that will still work when you crash. Like maybe
> a serial device?
Well, I dont' know. As you know, the kernel oops dump is already sent
to serial device but it's rather slow. As I wrote in the cover
letter, enabling ftrace_dump_on_oops makes it even worse.. Also
pstore saves the (compressed) binary data so I thought it'd be better
to have a dedicated IO channel.
I know that we can't rely on anything in kernel when the kernel is
crashing. In the virtualization environment, I think it'd be great if
it can dump the crash data in the host directly so I chose the virtio.
I thought the virtio is a relatively thin layer and independent from
other kernel features. Even if it doesn't guarantee to work 100%,
it's worth trying IMHO.
>
> > $ ls dir-xx
> > dmesg-1.enc.z dmesg-2.enc.z
> >
> > The log files are usually compressed using zlib. Users can see the log
> > messages directly on the host or on the guest (using pstore filesystem).
>
> So this lacks all management tools that a regular storage device
> has, and these are necessary if people are to use this
> in production.
>
> For example, some kind of provision for limiting the amount of host disk
> this can consume seems called for. Rate-limiting disk writes
> on host also seems necessary. Handling host disk errors always
> propagates error to guest but an option to e.g. stop vm might
> be useful to avoid corruption.
Yes, it needs that kind of management. I'd like to look at the
existing implementation. But basically I thought it as a debugging
feature and it won't produce much data for the default setting. Maybe
I can limit the file size not to exceed the buffer size and keep up to
pre-configured number of files for each type. Now host disk error
will be propagated to guest.
>
> >
> > The 'directory' property is required for virtio-pstore device to work.
> > It also adds 'bufsize' and 'console' (boolean) properties.
>
> No idea what these do. Seem to be RW by guest but have no effect
> otherwise.
The 'directory' is a pathname which it saves the data files. The
'bufsize' sets the size of buffer used by pstore. The 'console'
enables saving pstore console type data.
>
>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Cc: Anton Vorontsov <anton@enomsg.org>
> > Cc: Colin Cross <ccross@android.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Tony Luck <tony.luck@intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Minchan Kim <minchan@kernel.org>
> > Cc: kvm@vger.kernel.org
> > Cc: qemu-devel@nongnu.org
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
[SNIP]
> > diff --git a/hw/virtio/virtio-pstore.c b/hw/virtio/virtio-pstore.c
> > new file mode 100644
> > index 0000000..2ca7786
> > --- /dev/null
> > +++ b/hw/virtio/virtio-pstore.c
> > @@ -0,0 +1,477 @@
> > +/*
> > + * Virtio Pstore Device
> > + *
> > + * Copyright (C) 2016 LG Electronics
> > + *
> > + * Authors:
> > + * Namhyung Kim <namhyung@gmail.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2.
>
> We generally ask new code to be v2 or later.
Ok, will update.
>
> > See
> > + * the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include <stdio.h>
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/iov.h"
> > +#include "qemu-common.h"
> > +#include "qemu/cutils.h"
> > +#include "qemu/error-report.h"
> > +#include "sysemu/kvm.h"
> > +#include "qapi/visitor.h"
> > +#include "qapi-event.h"
> > +#include "trace.h"
> > +
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-bus.h"
> > +#include "hw/virtio/virtio-access.h"
> > +#include "hw/virtio/virtio-pstore.h"
> > +
> > +
> > +static void virtio_pstore_to_filename(VirtIOPstore *s, char *buf, size_t sz,
> > + struct virtio_pstore_req *req)
> > +{
> > + const char *basename;
> > + unsigned long long id = 0;
> > + unsigned int flags = le32_to_cpu(req->flags);
> > +
> > + switch (le16_to_cpu(req->type)) {
> > + case VIRTIO_PSTORE_TYPE_DMESG:
> > + basename = "dmesg";
> > + id = s->id++;
> > + break;
> > + case VIRTIO_PSTORE_TYPE_CONSOLE:
> > + basename = "console";
> > + if (s->console_id) {
> > + id = s->console_id;
> > + } else {
> > + id = s->console_id = s->id++;
> > + }
> > + break;
> > + default:
> > + basename = "unknown";
> > + break;
> > + }
> > +
> > + snprintf(buf, sz, "%s/%s-%llu%s", s->directory, basename, id,
> > + flags & VIRTIO_PSTORE_FL_COMPRESSED ? ".enc.z" : "");
> > +}
> > +
> > +static void virtio_pstore_from_filename(VirtIOPstore *s, char *name,
> > + char *buf, size_t sz,
> > + struct virtio_pstore_fileinfo *info)
> > +{
> > + snprintf(buf, sz, "%s/%s", s->directory, name);
>
> if this does not fit, buf will not be 0 terminated and can
> generally be corrupted.
Will change it to use malloc instead.
>
>
> > +
> > + if (g_str_has_prefix(name, "dmesg-")) {
> > + info->type = VIRTIO_PSTORE_TYPE_DMESG;
> > + name += strlen("dmesg-");
> > + } else if (g_str_has_prefix(name, "console-")) {
> > + info->type = VIRTIO_PSTORE_TYPE_CONSOLE;
> > + name += strlen("console-");
> > + } else if (g_str_has_prefix(name, "unknown-")) {
> > + info->type = VIRTIO_PSTORE_TYPE_UNKNOWN;
> > + name += strlen("unknown-");
> > + }
> > +
> > + qemu_strtoull(name, NULL, 0, &info->id);
> > +
> > + info->flags = 0;
> > + if (g_str_has_suffix(name, ".enc.z")) {
> > + info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
> > + }
> > +}
[SNIP]
> > +static ssize_t virtio_pstore_do_write(VirtIOPstore *s, struct iovec *out_sg,
> > + unsigned int out_num,
> > + struct virtio_pstore_req *req)
> > +{
> > + char path[PATH_MAX];
> > + int fd;
> > + ssize_t len;
> > + unsigned short type;
> > + int flags = O_WRONLY | O_CREAT;
> > +
> > + /* we already consume the req */
> > + iov_discard_front(&out_sg, &out_num, sizeof(*req));
> > +
> > + virtio_pstore_to_filename(s, path, sizeof(path), req);
> > +
> > + type = le16_to_cpu(req->type);
> > +
> > + if (type == VIRTIO_PSTORE_TYPE_DMESG) {
> > + flags |= O_TRUNC;
> > + } else if (type == VIRTIO_PSTORE_TYPE_CONSOLE) {
> > + flags |= O_APPEND;
> > + }
> > +
> > + fd = open(path, flags, 0644);
> > + if (fd < 0) {
> > + error_report("cannot open %s", path);
> > + return -1;
> > + }
> > + len = writev(fd, out_sg, out_num);
> > + close(fd);
> > +
> > + return len;
>
> All this is blocking VM until host io completes.
Hmm.. I don't know about the internals of qemu. So does it make guest
stop? If so, that's what I want to do for _DMESG. :) As it's called
only on kernel oops I think it's admittable. But for _CONSOLE, it
needs to do asynchronously. Maybe I can add a thread to do the work.
>
>
> > +}
> > +
> > +static ssize_t virtio_pstore_do_close(VirtIOPstore *s)
> > +{
> > + if (s->dirp == NULL) {
> > + return 0;
> > + }
> > +
> > + closedir(s->dirp);
> > + s->dirp = NULL;
> > +
> > + return 0;
> > +}
> > +
> > +static ssize_t virtio_pstore_do_erase(VirtIOPstore *s,
> > + struct virtio_pstore_req *req)
> > +{
> > + char path[PATH_MAX];
> > +
> > + virtio_pstore_to_filename(s, path, sizeof(path), req);
> > +
> > + return unlink(path);
> > +}
> > +
> > +static void virtio_pstore_handle_io(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > + VirtIOPstore *s = VIRTIO_PSTORE(vdev);
> > + VirtQueueElement *elem;
> > + struct virtio_pstore_req req;
> > + struct virtio_pstore_res res;
> > + ssize_t len = 0;
> > + int ret;
> > +
> > + for (;;) {
> > + elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> > + if (!elem) {
> > + return;
> > + }
> > +
> > + if (elem->out_num < 1 || elem->in_num < 1) {
> > + error_report("request or response buffer is missing");
> > + exit(1);
> > + }
> > +
> > + len = iov_to_buf(elem->out_sg, elem->out_num, 0, &req, sizeof(req));
> > + if (len != (ssize_t)sizeof(req)) {
> > + error_report("invalid request size: %ld", (long)len);
> > + exit(1);
> > + }
> > + res.cmd = req.cmd;
> > + res.type = req.type;
> > +
> > + switch (le16_to_cpu(req.cmd)) {
> > + case VIRTIO_PSTORE_CMD_OPEN:
> > + ret = virtio_pstore_do_open(s);
> > + break;
> > + case VIRTIO_PSTORE_CMD_READ:
> > + ret = virtio_pstore_do_read(s, elem->in_sg, elem->in_num, &res);
> > + if (ret > 0) {
> > + len = ret;
> > + ret = 0;
> > + }
> > + break;
> > + case VIRTIO_PSTORE_CMD_WRITE:
> > + ret = virtio_pstore_do_write(s, elem->out_sg, elem->out_num, &req);
> > + break;
> > + case VIRTIO_PSTORE_CMD_CLOSE:
> > + ret = virtio_pstore_do_close(s);
> > + break;
> > + case VIRTIO_PSTORE_CMD_ERASE:
> > + ret = virtio_pstore_do_erase(s, &req);
> > + break;
> > + default:
> > + ret = -1;
> > + break;
> > + }
> > +
> > + res.ret = ret;
> > +
> > + iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > + virtqueue_push(vq, elem, sizeof(res) + len);
>
> this is wrong - len should be # of bytes written into guest memory.
???
All command except READ only write the virtio_pstore_ret so len is 0.
For READ, virtio_pstore_do_read() returns the actual bytes it wrote
(minus sizeof(res) of course), so I think it's fine, no?
Anyway, thanks for your review!
Thanks,
Namhyung
>
> > +
> > + virtio_notify(vdev, vq);
> > + g_free(elem);
> > +
> > + if (ret < 0) {
> > + return;
> > + }
> > + }
> > +}
WARNING: multiple messages have this Message-ID (diff)
From: Namhyung Kim <namhyung@kernel.org>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: kvm@vger.kernel.org, qemu-devel@nongnu.org,
virtualization@lists.linux-foundation.org,
LKML <linux-kernel@vger.kernel.org>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Radim Krčmář" <rkrcmar@redhat.com>,
"Anthony Liguori" <aliguori@amazon.com>,
"Anton Vorontsov" <anton@enomsg.org>,
"Colin Cross" <ccross@android.com>,
"Kees Cook" <keescook@chromium.org>,
"Tony Luck" <tony.luck@intel.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Ingo Molnar" <mingo@kernel.org>,
"Minchan Kim" <minchan@kernel.org>
Subject: Re: [Qemu-devel] [PATCH 6/7] qemu: Implement virtio-pstore device
Date: Thu, 28 Jul 2016 14:39:53 +0900 [thread overview]
Message-ID: <20160728053953.GB4371@sejong> (raw)
In-Reply-To: <20160728023254-mutt-send-email-mst@kernel.org>
On Thu, Jul 28, 2016 at 03:02:54AM +0300, Michael S. Tsirkin wrote:
> On Thu, Jul 28, 2016 at 12:08:30AM +0900, Namhyung Kim wrote:
> > Add virtio pstore device to allow kernel log files saved on the host.
> > It will save the log files on the directory given by pstore device
> > option.
> >
> > $ qemu-system-x86_64 -device virtio-pstore,directory=dir-xx ...
> >
> > (guest) # echo c > /proc/sysrq-trigger
>
> So if the point is handling system crashes, I suspect
> virtio is the wrong protocol to use. ATM it's rather
> elaborate for performance, it's likely not to work when
> linux is crashing. I think you want something very very
> simple that will still work when you crash. Like maybe
> a serial device?
Well, I dont' know. As you know, the kernel oops dump is already sent
to serial device but it's rather slow. As I wrote in the cover
letter, enabling ftrace_dump_on_oops makes it even worse.. Also
pstore saves the (compressed) binary data so I thought it'd be better
to have a dedicated IO channel.
I know that we can't rely on anything in kernel when the kernel is
crashing. In the virtualization environment, I think it'd be great if
it can dump the crash data in the host directly so I chose the virtio.
I thought the virtio is a relatively thin layer and independent from
other kernel features. Even if it doesn't guarantee to work 100%,
it's worth trying IMHO.
>
> > $ ls dir-xx
> > dmesg-1.enc.z dmesg-2.enc.z
> >
> > The log files are usually compressed using zlib. Users can see the log
> > messages directly on the host or on the guest (using pstore filesystem).
>
> So this lacks all management tools that a regular storage device
> has, and these are necessary if people are to use this
> in production.
>
> For example, some kind of provision for limiting the amount of host disk
> this can consume seems called for. Rate-limiting disk writes
> on host also seems necessary. Handling host disk errors always
> propagates error to guest but an option to e.g. stop vm might
> be useful to avoid corruption.
Yes, it needs that kind of management. I'd like to look at the
existing implementation. But basically I thought it as a debugging
feature and it won't produce much data for the default setting. Maybe
I can limit the file size not to exceed the buffer size and keep up to
pre-configured number of files for each type. Now host disk error
will be propagated to guest.
>
> >
> > The 'directory' property is required for virtio-pstore device to work.
> > It also adds 'bufsize' and 'console' (boolean) properties.
>
> No idea what these do. Seem to be RW by guest but have no effect
> otherwise.
The 'directory' is a pathname which it saves the data files. The
'bufsize' sets the size of buffer used by pstore. The 'console'
enables saving pstore console type data.
>
>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Cc: Anton Vorontsov <anton@enomsg.org>
> > Cc: Colin Cross <ccross@android.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Tony Luck <tony.luck@intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Minchan Kim <minchan@kernel.org>
> > Cc: kvm@vger.kernel.org
> > Cc: qemu-devel@nongnu.org
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
[SNIP]
> > diff --git a/hw/virtio/virtio-pstore.c b/hw/virtio/virtio-pstore.c
> > new file mode 100644
> > index 0000000..2ca7786
> > --- /dev/null
> > +++ b/hw/virtio/virtio-pstore.c
> > @@ -0,0 +1,477 @@
> > +/*
> > + * Virtio Pstore Device
> > + *
> > + * Copyright (C) 2016 LG Electronics
> > + *
> > + * Authors:
> > + * Namhyung Kim <namhyung@gmail.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2.
>
> We generally ask new code to be v2 or later.
Ok, will update.
>
> > See
> > + * the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include <stdio.h>
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/iov.h"
> > +#include "qemu-common.h"
> > +#include "qemu/cutils.h"
> > +#include "qemu/error-report.h"
> > +#include "sysemu/kvm.h"
> > +#include "qapi/visitor.h"
> > +#include "qapi-event.h"
> > +#include "trace.h"
> > +
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-bus.h"
> > +#include "hw/virtio/virtio-access.h"
> > +#include "hw/virtio/virtio-pstore.h"
> > +
> > +
> > +static void virtio_pstore_to_filename(VirtIOPstore *s, char *buf, size_t sz,
> > + struct virtio_pstore_req *req)
> > +{
> > + const char *basename;
> > + unsigned long long id = 0;
> > + unsigned int flags = le32_to_cpu(req->flags);
> > +
> > + switch (le16_to_cpu(req->type)) {
> > + case VIRTIO_PSTORE_TYPE_DMESG:
> > + basename = "dmesg";
> > + id = s->id++;
> > + break;
> > + case VIRTIO_PSTORE_TYPE_CONSOLE:
> > + basename = "console";
> > + if (s->console_id) {
> > + id = s->console_id;
> > + } else {
> > + id = s->console_id = s->id++;
> > + }
> > + break;
> > + default:
> > + basename = "unknown";
> > + break;
> > + }
> > +
> > + snprintf(buf, sz, "%s/%s-%llu%s", s->directory, basename, id,
> > + flags & VIRTIO_PSTORE_FL_COMPRESSED ? ".enc.z" : "");
> > +}
> > +
> > +static void virtio_pstore_from_filename(VirtIOPstore *s, char *name,
> > + char *buf, size_t sz,
> > + struct virtio_pstore_fileinfo *info)
> > +{
> > + snprintf(buf, sz, "%s/%s", s->directory, name);
>
> if this does not fit, buf will not be 0 terminated and can
> generally be corrupted.
Will change it to use malloc instead.
>
>
> > +
> > + if (g_str_has_prefix(name, "dmesg-")) {
> > + info->type = VIRTIO_PSTORE_TYPE_DMESG;
> > + name += strlen("dmesg-");
> > + } else if (g_str_has_prefix(name, "console-")) {
> > + info->type = VIRTIO_PSTORE_TYPE_CONSOLE;
> > + name += strlen("console-");
> > + } else if (g_str_has_prefix(name, "unknown-")) {
> > + info->type = VIRTIO_PSTORE_TYPE_UNKNOWN;
> > + name += strlen("unknown-");
> > + }
> > +
> > + qemu_strtoull(name, NULL, 0, &info->id);
> > +
> > + info->flags = 0;
> > + if (g_str_has_suffix(name, ".enc.z")) {
> > + info->flags |= VIRTIO_PSTORE_FL_COMPRESSED;
> > + }
> > +}
[SNIP]
> > +static ssize_t virtio_pstore_do_write(VirtIOPstore *s, struct iovec *out_sg,
> > + unsigned int out_num,
> > + struct virtio_pstore_req *req)
> > +{
> > + char path[PATH_MAX];
> > + int fd;
> > + ssize_t len;
> > + unsigned short type;
> > + int flags = O_WRONLY | O_CREAT;
> > +
> > + /* we already consume the req */
> > + iov_discard_front(&out_sg, &out_num, sizeof(*req));
> > +
> > + virtio_pstore_to_filename(s, path, sizeof(path), req);
> > +
> > + type = le16_to_cpu(req->type);
> > +
> > + if (type == VIRTIO_PSTORE_TYPE_DMESG) {
> > + flags |= O_TRUNC;
> > + } else if (type == VIRTIO_PSTORE_TYPE_CONSOLE) {
> > + flags |= O_APPEND;
> > + }
> > +
> > + fd = open(path, flags, 0644);
> > + if (fd < 0) {
> > + error_report("cannot open %s", path);
> > + return -1;
> > + }
> > + len = writev(fd, out_sg, out_num);
> > + close(fd);
> > +
> > + return len;
>
> All this is blocking VM until host io completes.
Hmm.. I don't know about the internals of qemu. So does it make guest
stop? If so, that's what I want to do for _DMESG. :) As it's called
only on kernel oops I think it's admittable. But for _CONSOLE, it
needs to do asynchronously. Maybe I can add a thread to do the work.
>
>
> > +}
> > +
> > +static ssize_t virtio_pstore_do_close(VirtIOPstore *s)
> > +{
> > + if (s->dirp == NULL) {
> > + return 0;
> > + }
> > +
> > + closedir(s->dirp);
> > + s->dirp = NULL;
> > +
> > + return 0;
> > +}
> > +
> > +static ssize_t virtio_pstore_do_erase(VirtIOPstore *s,
> > + struct virtio_pstore_req *req)
> > +{
> > + char path[PATH_MAX];
> > +
> > + virtio_pstore_to_filename(s, path, sizeof(path), req);
> > +
> > + return unlink(path);
> > +}
> > +
> > +static void virtio_pstore_handle_io(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > + VirtIOPstore *s = VIRTIO_PSTORE(vdev);
> > + VirtQueueElement *elem;
> > + struct virtio_pstore_req req;
> > + struct virtio_pstore_res res;
> > + ssize_t len = 0;
> > + int ret;
> > +
> > + for (;;) {
> > + elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
> > + if (!elem) {
> > + return;
> > + }
> > +
> > + if (elem->out_num < 1 || elem->in_num < 1) {
> > + error_report("request or response buffer is missing");
> > + exit(1);
> > + }
> > +
> > + len = iov_to_buf(elem->out_sg, elem->out_num, 0, &req, sizeof(req));
> > + if (len != (ssize_t)sizeof(req)) {
> > + error_report("invalid request size: %ld", (long)len);
> > + exit(1);
> > + }
> > + res.cmd = req.cmd;
> > + res.type = req.type;
> > +
> > + switch (le16_to_cpu(req.cmd)) {
> > + case VIRTIO_PSTORE_CMD_OPEN:
> > + ret = virtio_pstore_do_open(s);
> > + break;
> > + case VIRTIO_PSTORE_CMD_READ:
> > + ret = virtio_pstore_do_read(s, elem->in_sg, elem->in_num, &res);
> > + if (ret > 0) {
> > + len = ret;
> > + ret = 0;
> > + }
> > + break;
> > + case VIRTIO_PSTORE_CMD_WRITE:
> > + ret = virtio_pstore_do_write(s, elem->out_sg, elem->out_num, &req);
> > + break;
> > + case VIRTIO_PSTORE_CMD_CLOSE:
> > + ret = virtio_pstore_do_close(s);
> > + break;
> > + case VIRTIO_PSTORE_CMD_ERASE:
> > + ret = virtio_pstore_do_erase(s, &req);
> > + break;
> > + default:
> > + ret = -1;
> > + break;
> > + }
> > +
> > + res.ret = ret;
> > +
> > + iov_from_buf(elem->in_sg, elem->in_num, 0, &res, sizeof(res));
> > + virtqueue_push(vq, elem, sizeof(res) + len);
>
> this is wrong - len should be # of bytes written into guest memory.
???
All command except READ only write the virtio_pstore_ret so len is 0.
For READ, virtio_pstore_do_read() returns the actual bytes it wrote
(minus sizeof(res) of course), so I think it's fine, no?
Anyway, thanks for your review!
Thanks,
Namhyung
>
> > +
> > + virtio_notify(vdev, vq);
> > + g_free(elem);
> > +
> > + if (ret < 0) {
> > + return;
> > + }
> > + }
> > +}
next prev parent reply other threads:[~2016-07-28 5:39 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-27 15:08 [RFC/PATCHSET 0/7] virtio: Implement virtio pstore device (v2) Namhyung Kim
2016-07-27 15:08 ` [Qemu-devel] " Namhyung Kim
2016-07-27 15:08 ` [PATCH 1/7] pstore: Split pstore fragile flags Namhyung Kim
2016-07-27 15:08 ` [Qemu-devel] " Namhyung Kim
2016-07-27 15:08 ` Namhyung Kim
2016-07-27 15:08 ` [PATCH 2/7] pstore/ram: Set pstore flags dynamically Namhyung Kim
2016-07-27 15:08 ` [Qemu-devel] " Namhyung Kim
2016-07-27 15:08 ` Namhyung Kim
2016-07-27 15:08 ` [PATCH 3/7] pstore: Manage buffer position for async write Namhyung Kim
2016-07-27 15:08 ` [Qemu-devel] " Namhyung Kim
2016-07-27 15:08 ` Namhyung Kim
2016-07-27 15:08 ` [PATCH 4/7] virtio: Basic implementation of virtio pstore driver Namhyung Kim
2016-07-27 15:08 ` Namhyung Kim
2016-07-27 15:08 ` [Qemu-devel] " Namhyung Kim
2016-07-27 15:08 ` [PATCH 5/7] virtio-pstore: Support PSTORE_TYPE_CONSOLE Namhyung Kim
2016-07-27 15:08 ` [Qemu-devel] " Namhyung Kim
2016-07-27 15:08 ` Namhyung Kim
2016-07-27 15:08 ` [PATCH 6/7] qemu: Implement virtio-pstore device Namhyung Kim
2016-07-27 15:08 ` Namhyung Kim
2016-07-27 15:08 ` [Qemu-devel] " Namhyung Kim
2016-07-28 0:02 ` Michael S. Tsirkin
2016-07-28 0:02 ` [Qemu-devel] " Michael S. Tsirkin
2016-07-28 0:02 ` Michael S. Tsirkin
2016-07-28 5:39 ` Namhyung Kim [this message]
2016-07-28 5:39 ` [Qemu-devel] " Namhyung Kim
2016-07-28 5:39 ` Namhyung Kim
2016-07-28 12:56 ` Stefan Hajnoczi
2016-07-28 12:56 ` [Qemu-devel] " Stefan Hajnoczi
2016-07-28 13:08 ` Daniel P. Berrange
2016-07-28 13:08 ` Daniel P. Berrange
2016-07-30 8:38 ` Namhyung Kim
2016-07-30 8:38 ` Namhyung Kim
2016-08-01 9:21 ` Daniel P. Berrange
2016-08-01 9:21 ` Daniel P. Berrange
2016-08-01 15:34 ` Namhyung Kim
2016-08-01 15:34 ` Namhyung Kim
2016-07-28 12:56 ` Stefan Hajnoczi
2016-07-28 14:38 ` Steven Rostedt
2016-07-28 14:38 ` Steven Rostedt
2016-07-28 14:38 ` [Qemu-devel] " Steven Rostedt
2016-07-28 13:22 ` Daniel P. Berrange
2016-07-28 13:22 ` Daniel P. Berrange
2016-07-30 8:57 ` Namhyung Kim
2016-07-30 8:57 ` Namhyung Kim
2016-08-01 9:24 ` Daniel P. Berrange
2016-08-01 9:24 ` Daniel P. Berrange
2016-08-01 15:52 ` Namhyung Kim
2016-08-01 15:52 ` Namhyung Kim
2016-07-27 15:08 ` [PATCH 7/7] kvmtool: " Namhyung Kim
2016-07-27 15:08 ` Namhyung Kim
2016-07-27 15:08 ` [Qemu-devel] " Namhyung Kim
2016-07-27 22:18 ` [RFC/PATCHSET 0/7] virtio: Implement virtio pstore device (v2) Michael S. Tsirkin
2016-07-27 22:18 ` [Qemu-devel] " Michael S. Tsirkin
2016-07-27 22:18 ` Michael S. Tsirkin
2016-07-28 2:46 ` Namhyung Kim
2016-07-28 2:46 ` Namhyung Kim
2016-07-28 2:46 ` [Qemu-devel] " Namhyung Kim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160728053953.GB4371@sejong \
--to=namhyung@kernel.org \
--cc=aliguori@amazon.com \
--cc=anton@enomsg.org \
--cc=ccross@android.com \
--cc=keescook@chromium.org \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=minchan@kernel.org \
--cc=mingo@kernel.org \
--cc=mst@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=rkrcmar@redhat.com \
--cc=rostedt@goodmis.org \
--cc=tony.luck@intel.com \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.