From: Hans de Goede <hdegoede@redhat.com>
To: Amit Shah <amit.shah@redhat.com>
Cc: stable@kernel.org,
Virtualization List <virtualization@lists.linux-foundation.org>
Subject: Re: [PATCH] virtio: console: Don't block entire guest if host doesn't read data
Date: Tue, 19 Oct 2010 09:23:00 +0200 [thread overview]
Message-ID: <4CBD4754.5030504@redhat.com> (raw)
In-Reply-To: <20101019071021.GB2505@amit-laptop.redhat.com>
Hi,
On 10/19/2010 09:10 AM, Amit Shah wrote:
> On (Tue) Oct 19 2010 [08:55:16], Hans de Goede wrote:
>> Hi,
>>
>> On 10/19/2010 07:45 AM, Amit Shah wrote:
>>> If the host is slow in reading data or doesn't read data at all,
>>> blocking write calls not only blocked the program that called write()
>>> but the entire guest itself.
>>>
>>> To overcome this, let's not block till the host signals it has given
>>> back the virtio ring element we passed it. Instead, send the buffer to
>>> the host and return to userspace. This operation then becomes similar
>>> to how non-blocking writes work, so let's use the existing code for this
>>> path as well.
>>>
>>> This code change also ensures blocking write calls do get blocked if
>>> there's not enough room in the virtio ring as well as they don't return
>>> -EAGAIN to userspace.
>>>
>>> Signed-off-by: Amit Shah<amit.shah@redhat.com>
>>> CC: stable@kernel.org
>>> ---
>>> drivers/char/virtio_console.c | 17 ++++++++++++++---
>>> 1 files changed, 14 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
>>> index c810481..0f69c5e 100644
>>> --- a/drivers/char/virtio_console.c
>>> +++ b/drivers/char/virtio_console.c
>>> @@ -459,9 +459,12 @@ static ssize_t send_buf(struct port *port, void *in_buf, size_t in_count,
>>>
>>> /*
>>> * Wait till the host acknowledges it pushed out the data we
>>> - * sent. This is done for ports in blocking mode or for data
>>> - * from the hvc_console; the tty operations are performed with
>>> - * spinlocks held so we can't sleep here.
>>> + * sent. This is done for data from the hvc_console; the tty
>>> + * operations are performed with spinlocks held so we can't
>>> + * sleep here. An alternative would be to copy the data to a
>>> + * buffer and relax the spinning requirement. The downside is
>>> + * we need to kmalloc a GFP_ATOMIC buffer each time the
>>> + * console driver writes something out.
>>> */
>>> while (!virtqueue_get_buf(out_vq,&len))
>>> cpu_relax();
>>> @@ -626,6 +629,14 @@ static ssize_t port_fops_write(struct file *filp, const char __user *ubuf,
>>> goto free_buf;
>>> }
>>>
>>> + /*
>>> + * We now ask send_buf() to not spin for generic ports -- we
>>> + * can re-use the same code path that non-blocking file
>>> + * descriptors take for blocking file descriptors since the
>>> + * wait is already done and we're certain the write will go
>>> + * through to the host.
>>> + */
>>> + nonblock = true;
>>> ret = send_buf(port, buf, count, nonblock);
>>>
>>> if (nonblock&& ret> 0)
>>
>> 1) Hmm, this changes the code to kfree the buffer, but only if the send_buf
>> succeeded (which it always should given we did a will_block check first).
>
> The change is to *not* free the buffer. It will be freed later when the
> host indicates it's done with it (happens in reclaim_consumed_buffers()).
>
Ah, thanks for explaining that.
>> I cannot help but notice that the data was not freed on a blocking fd
>> before this patch, but is freed now. And I see nothing in send_buf to make
>> it take ownership of the buffer / free it in the blocking case, and not take
>> ownership in the blocking case. More over if anything I would expect send_buf
>> to take ownership in the non blocking case (as the data is not directly
>> consumed there), and not take owner ship in the blocking case, but the check
>> is the reverse. Also why is the buffer not freed if the write failed, that
>> makes no sense.
>
> The buffer used to be freed in the blocking case, as we knew for certain
> the host was done with the buffer. Now it's not, we'll free it later.
>
>> 2) Assuming that things are changed so that send_buf does take ownership of the
>> buffer in the nonblocking case, shouldn't the buffer then be allocated
>> with GPF_ATOMIC ?
>
> Why? We're not called from irq context.
>
Ok, my bad.
>> 3) This patch will cause processes filling the virtqueue fast enough to block
>> to never wake up again, due to a missing waitqueue wakeup, see:
>> https://bugzilla.redhat.com/show_bug.cgi?id=643750
>
> Doesn't happen in my testcase, but this patch shouldn't cause that
> problem if it exists -- it's a problem that exists even now for
> nonblocking ports. So if such a bug exists, it needs to be fixed
> independently.
First of all lets agree that this is a real problem, there is simply nothing
waking the waitqueue were fops_write (or poll) block on when buffers become
available in out_vq, it may be hard to come up with a test case which fills
the queue fast enough to hit this scenario, but it is very real.
I agree it is an independent problem, and should be fixed in a separate
patch, but that patch should be part of the same set and become *before*
this one, as this patch now extends the problem to ports opened in blocking
mode too.
BTW, many thanks for working on this, it is appreciated :)
Regards,
Hans
next prev parent reply other threads:[~2010-10-19 7:23 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-10-19 5:45 [PATCH] virtio: console: Don't block entire guest if host doesn't read data Amit Shah
2010-10-19 6:55 ` Hans de Goede
2010-10-19 6:57 ` Hans de Goede
2010-10-19 7:13 ` Amit Shah
2010-10-19 7:10 ` Amit Shah
2010-10-19 7:23 ` Hans de Goede [this message]
2010-10-19 7:32 ` Amit Shah
2010-10-19 8:03 ` Hans de Goede
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4CBD4754.5030504@redhat.com \
--to=hdegoede@redhat.com \
--cc=amit.shah@redhat.com \
--cc=stable@kernel.org \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).