From: Hans de Goede <hdegoede@redhat.com>
To: Amit Shah <amit.shah@redhat.com>
Cc: stable@kernel.org,
Virtualization List <virtualization@lists.linux-foundation.org>
Subject: Re: [PATCH] virtio: console: Don't block entire guest if host doesn't read data
Date: Tue, 19 Oct 2010 09:23:00 +0200
Message-ID: <4CBD4754.5030504@redhat.com>
In-Reply-To: <20101019071021.GB2505@amit-laptop.redhat.com>
Hi,
On 10/19/2010 09:10 AM, Amit Shah wrote:
> On (Tue) Oct 19 2010 [08:55:16], Hans de Goede wrote:
>> Hi,
>>
>> On 10/19/2010 07:45 AM, Amit Shah wrote:
>>> If the host is slow in reading data or doesn't read data at all,
>>> blocking write calls not only blocked the program that called write()
>>> but the entire guest itself.
>>>
>>> To overcome this, let's not block till the host signals it has given
>>> back the virtio ring element we passed it. Instead, send the buffer to
>>> the host and return to userspace. This operation then becomes similar
>>> to how non-blocking writes work, so let's use the existing code for this
>>> path as well.
>>>
>>> This code change also ensures blocking write calls do get blocked if
>>> there's not enough room in the virtio ring as well as they don't return
>>> -EAGAIN to userspace.
>>>
>>> Signed-off-by: Amit Shah<amit.shah@redhat.com>
>>> CC: stable@kernel.org
>>> ---
>>> drivers/char/virtio_console.c | 17 ++++++++++++++---
>>> 1 files changed, 14 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
>>> index c810481..0f69c5e 100644
>>> --- a/drivers/char/virtio_console.c
>>> +++ b/drivers/char/virtio_console.c
>>> @@ -459,9 +459,12 @@ static ssize_t send_buf(struct port *port, void *in_buf, size_t in_count,
>>>
>>> /*
>>> * Wait till the host acknowledges it pushed out the data we
>>> - * sent. This is done for ports in blocking mode or for data
>>> - * from the hvc_console; the tty operations are performed with
>>> - * spinlocks held so we can't sleep here.
>>> + * sent. This is done for data from the hvc_console; the tty
>>> + * operations are performed with spinlocks held so we can't
>>> + * sleep here. An alternative would be to copy the data to a
>>> + * buffer and relax the spinning requirement. The downside is
>>> + * we need to kmalloc a GFP_ATOMIC buffer each time the
>>> + * console driver writes something out.
>>> */
>>> while (!virtqueue_get_buf(out_vq,&len))
>>> cpu_relax();
>>> @@ -626,6 +629,14 @@ static ssize_t port_fops_write(struct file *filp, const char __user *ubuf,
>>> goto free_buf;
>>> }
>>>
>>> + /*
>>> + * We now ask send_buf() to not spin for generic ports -- we
>>> + * can re-use the same code path that non-blocking file
>>> + * descriptors take for blocking file descriptors since the
>>> + * wait is already done and we're certain the write will go
>>> + * through to the host.
>>> + */
>>> + nonblock = true;
>>> ret = send_buf(port, buf, count, nonblock);
>>>
>>> if (nonblock&& ret> 0)
>>
>> 1) Hmm, this changes the code to kfree the buffer, but only if the send_buf
>> succeeded (which it always should given we did a will_block check first).
>
> The change is to *not* free the buffer. It will be freed later when the
> host indicates it's done with it (happens in reclaim_consumed_buffers()).
>
Ah, thanks for explaining that.
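
To check my understanding of the ownership rule, here is a minimal userspace
sketch of it. It is not the driver code: toy_ring, toy_send_buf(),
toy_reclaim_consumed(), toy_write() and RING_SIZE are all names made up for
this illustration. The only thing it models is that a queued buffer belongs
to the ring until the host hands it back, and that the write path frees the
buffer itself only when queuing failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define RING_SIZE 4	/* hypothetical ring depth, not the driver's value */

/* Toy stand-in for the out virtqueue: buffers the host still owns. */
struct toy_ring {
	void *slot[RING_SIZE];
	size_t in_flight;	/* queued and not yet reclaimed */
	size_t freed;		/* counter kept for illustration only */
};

/* Queue a buffer without waiting for the host. On success the ring owns buf. */
static bool toy_send_buf(struct toy_ring *r, void *buf)
{
	if (r->in_flight == RING_SIZE)
		return false;	/* no room: the caller keeps ownership */
	r->slot[r->in_flight++] = buf;
	return true;
}

/* The host is done with up to 'done' buffers: only now are they freed. */
static size_t toy_reclaim_consumed(struct toy_ring *r, size_t done)
{
	size_t i, n;

	assert(r->in_flight <= RING_SIZE);
	n = done < r->in_flight ? done : r->in_flight;
	for (i = 0; i < n; i++) {
		free(r->slot[i]);
		r->freed++;
	}
	/* Shift the buffers that are still in flight to the front. */
	memmove(r->slot, r->slot + n, (r->in_flight - n) * sizeof(r->slot[0]));
	r->in_flight -= n;
	return n;
}

/* Write path: copy, queue, return. Free here only if queuing failed. */
static long toy_write(struct toy_ring *r, const char *data, size_t count)
{
	void *buf = malloc(count);

	if (!buf)
		return -1;
	memcpy(buf, data, count);
	if (!toy_send_buf(r, buf)) {
		free(buf);
		return -1;
	}
	return (long)count;
}
```

In this model a successful toy_write() never frees anything; the free happens
later, in toy_reclaim_consumed(), which is how I now read the patch.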
>> I cannot help but notice that the data was not freed on a blocking fd
>> before this patch, but is freed now. And I see nothing in send_buf to make
>> it take ownership of the buffer / free it in the blocking case, and not take
>> ownership in the non-blocking case. Moreover, if anything I would expect send_buf
>> to take ownership in the non-blocking case (as the data is not directly
>> consumed there), and not take ownership in the blocking case, but the check
>> is the reverse. Also, why is the buffer not freed if the write failed? That
>> makes no sense.
>
> The buffer used to be freed in the blocking case, as we knew for certain
> the host was done with the buffer. Now it's not, we'll free it later.
>
>> 2) Assuming that things are changed so that send_buf does take ownership of the
>> buffer in the nonblocking case, shouldn't the buffer then be allocated
>> with GFP_ATOMIC?
>
> Why? We're not called from irq context.
>
Ok, my bad.
>> 3) This patch will cause processes filling the virtqueue fast enough to block
>> to never wake up again, due to a missing waitqueue wakeup, see:
>> https://bugzilla.redhat.com/show_bug.cgi?id=643750
>
> Doesn't happen in my testcase, but this patch shouldn't cause that
> problem if it exists -- it's a problem that exists even now for
> nonblocking ports. So if such a bug exists, it needs to be fixed
> independently.
First of all, let's agree that this is a real problem: there is simply nothing
waking the waitqueue where fops_write (or poll) blocks when buffers become
available in out_vq. It may be hard to come up with a test case which fills
the queue fast enough to hit this scenario, but it is very real.

I agree it is an independent problem and should be fixed in a separate
patch, but that patch should be part of the same set and come *before*
this one, as this patch now extends the problem to ports opened in blocking
mode too.
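
To make the missing wake-up concrete, here is a small userspace model using
pthreads. It is only a sketch under stated assumptions, not a proposed patch:
toy_port, toy_blocking_write(), toy_host_returned_buffers() and toy_writer()
are hypothetical names, and the condition variable merely stands in for the
port's waitqueue. The point is the broadcast in toy_host_returned_buffers():
without it, a writer that went to sleep on a full ring is never woken.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct toy_port {
	pthread_mutex_t lock;
	pthread_cond_t waitq;	/* stands in for the port's waitqueue */
	size_t free_slots;	/* room left in the out ring */
	size_t written;		/* completed writes, for illustration */
};

static void toy_port_init(struct toy_port *p, size_t slots)
{
	int ret;

	ret = pthread_mutex_init(&p->lock, NULL);
	assert(ret == 0);
	ret = pthread_cond_init(&p->waitq, NULL);
	assert(ret == 0);
	(void)ret;
	p->free_slots = slots;
	p->written = 0;
}

/* Blocking write: sleep until the ring has room, then take one slot. */
static void toy_blocking_write(struct toy_port *p)
{
	pthread_mutex_lock(&p->lock);
	while (p->free_slots == 0)
		pthread_cond_wait(&p->waitq, &p->lock);
	p->free_slots--;
	p->written++;
	pthread_mutex_unlock(&p->lock);
}

/* The host gave back 'n' buffers: make room and wake any sleeping writer. */
static void toy_host_returned_buffers(struct toy_port *p, size_t n)
{
	pthread_mutex_lock(&p->lock);
	p->free_slots += n;
	pthread_cond_broadcast(&p->waitq);	/* the wake-up that is missing */
	pthread_mutex_unlock(&p->lock);
}

/* Thread body: three blocking writes in a row. */
static void *toy_writer(void *arg)
{
	struct toy_port *p = arg;
	int i;

	for (i = 0; i < 3; i++)
		toy_blocking_write(p);
	return NULL;
}
```

With one initial slot and two buffers handed back, a thread running
toy_writer() completes all three writes. Drop the broadcast and the same
thread can sleep forever once the ring fills up, which is the hang I am
describing.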
BTW, many thanks for working on this, it is appreciated :)
Regards,
Hans