virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: Hans de Goede <hdegoede@redhat.com>
To: Amit Shah <amit.shah@redhat.com>
Cc: stable@kernel.org,
	Virtualization List <virtualization@lists.linux-foundation.org>
Subject: Re: [PATCH] virtio: console: Don't block entire guest if host doesn't read data
Date: Tue, 19 Oct 2010 09:23:00 +0200	[thread overview]
Message-ID: <4CBD4754.5030504@redhat.com> (raw)
In-Reply-To: <20101019071021.GB2505@amit-laptop.redhat.com>

Hi,

On 10/19/2010 09:10 AM, Amit Shah wrote:
> On (Tue) Oct 19 2010 [08:55:16], Hans de Goede wrote:
>> Hi,
>>
>> On 10/19/2010 07:45 AM, Amit Shah wrote:
>>> If the host is slow in reading data or doesn't read data at all,
>>> blocking write calls not only blocked the program that called write()
>>> but the entire guest itself.
>>>
>>> To overcome this, let's not block till the host signals it has given
>>> back the virtio ring element we passed it.  Instead, send the buffer to
>>> the host and return to userspace.  This operation then becomes similar
>>> to how non-blocking writes work, so let's use the existing code for this
>>> path as well.
>>>
>>> This code change also ensures blocking write calls do get blocked if
>>> there's not enough room in the virtio ring as well as they don't return
>>> -EAGAIN to userspace.
>>>
>>> Signed-off-by: Amit Shah<amit.shah@redhat.com>
>>> CC: stable@kernel.org
>>> ---
>>>   drivers/char/virtio_console.c |   17 ++++++++++++++---
>>>   1 files changed, 14 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
>>> index c810481..0f69c5e 100644
>>> --- a/drivers/char/virtio_console.c
>>> +++ b/drivers/char/virtio_console.c
>>> @@ -459,9 +459,12 @@ static ssize_t send_buf(struct port *port, void *in_buf, size_t in_count,
>>>
>>>   	/*
>>>   	 * Wait till the host acknowledges it pushed out the data we
>>> -	 * sent.  This is done for ports in blocking mode or for data
>>> -	 * from the hvc_console; the tty operations are performed with
>>> -	 * spinlocks held so we can't sleep here.
>>> +	 * sent.  This is done for data from the hvc_console; the tty
>>> +	 * operations are performed with spinlocks held so we can't
>>> +	 * sleep here.  An alternative would be to copy the data to a
>>> +	 * buffer and relax the spinning requirement.  The downside is
>>> +	 * we need to kmalloc a GFP_ATOMIC buffer each time the
>>> +	 * console driver writes something out.
>>>   	 */
>>>   	while (!virtqueue_get_buf(out_vq,&len))
>>>   		cpu_relax();
>>> @@ -626,6 +629,14 @@ static ssize_t port_fops_write(struct file *filp, const char __user *ubuf,
>>>   		goto free_buf;
>>>   	}
>>>
>>> +	/*
>>> +	 * We now ask send_buf() to not spin for generic ports -- we
>>> +	 * can re-use the same code path that non-blocking file
>>> +	 * descriptors take for blocking file descriptors since the
>>> +	 * wait is already done and we're certain the write will go
>>> +	 * through to the host.
>>> +	 */
>>> +	nonblock = true;
>>>   	ret = send_buf(port, buf, count, nonblock);
>>>
>>>   	if (nonblock&&   ret>   0)
>>
>> 1) Hmm, this changes the code to kfree the buffer, but only if the send_buf
>>     succeeded (which it always should given we did a will_block check first).
>
> The change is to *not* free the buffer.  It will be freed later when the
> host indicates it's done with it (happens in reclaim_consumed_buffers()).
>

Ah, thanks for explaining that.

>>     I cannot help but notice that the data was not freed on a blocking fd
>>     before this patch, but is freed now. And I see nothing in send_buf to make
>>     it take ownership of the buffer / free it in the blocking case, and not take
>>     ownership in the blocking case. More over if anything I would expect send_buf
>>     to take ownership in the non blocking case (as the data is not directly
>>     consumed there), and not take owner ship in the blocking case, but the check
>>     is the reverse. Also why is the buffer not freed if the write failed, that
>>     makes no sense.
>
> The buffer used to be freed in the blocking case, as we knew for certain
> the host was done with the buffer.  Now it's not, we'll free it later.
>
>> 2) Assuming that things are changed so that send_buf does take ownership of the
>>     buffer in the nonblocking case, shouldn't the buffer then be allocated
>>     with GPF_ATOMIC ?
>
> Why?  We're not called from irq context.
>

Ok, my bad.

>> 3) This patch will cause processes filling the virtqueue fast enough to block
>>     to never wake up again, due to a missing waitqueue wakeup, see:
>>     https://bugzilla.redhat.com/show_bug.cgi?id=643750
>
> Doesn't happen in my testcase, but this patch shouldn't cause that
> problem if it exists -- it's a problem that exists even now for
> nonblocking ports.  So if such a bug exists, it needs to be fixed
> independently.

First of all lets agree that this is a real problem, there is simply nothing
waking the waitqueue were fops_write (or poll) block on when buffers become
available in out_vq, it may be hard to come up with a test case which fills
the queue fast enough to hit this scenario, but it is very real.

I agree it is an independent problem, and should be fixed in a separate
patch, but that patch should be part of the same set and become *before*
this one, as this patch now extends the problem to ports opened in blocking
mode too.

BTW, many thanks for working on this, it is appreciated :)

Regards,

Hans

  reply	other threads:[~2010-10-19  7:23 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-19  5:45 [PATCH] virtio: console: Don't block entire guest if host doesn't read data Amit Shah
2010-10-19  6:55 ` Hans de Goede
2010-10-19  6:57   ` Hans de Goede
2010-10-19  7:13     ` Amit Shah
2010-10-19  7:10   ` Amit Shah
2010-10-19  7:23     ` Hans de Goede [this message]
2010-10-19  7:32       ` Amit Shah
2010-10-19  8:03         ` Hans de Goede

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4CBD4754.5030504@redhat.com \
    --to=hdegoede@redhat.com \
    --cc=amit.shah@redhat.com \
    --cc=stable@kernel.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).