From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51928)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jasowang@redhat.com>) id 1YsRS2-0008Ew-W2
	for qemu-devel@nongnu.org; Wed, 13 May 2015 03:51:36 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jasowang@redhat.com>) id 1YsRS0-0005Df-76
	for qemu-devel@nongnu.org; Wed, 13 May 2015 03:51:34 -0400
Received: from mx1.redhat.com ([209.132.183.28]:55557)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jasowang@redhat.com>) id 1YsRS0-0005DU-0P
	for qemu-devel@nongnu.org; Wed, 13 May 2015 03:51:32 -0400
Message-ID: <5553027B.7070100@redhat.com>
Date: Wed, 13 May 2015 15:51:23 +0800
From: Jason Wang <jasowang@redhat.com>
MIME-Version: 1.0
References: <20150428071225-mutt-send-email-mst@redhat.com>	<1430201600.5354.0@smtp.corp.redhat.com>	<20150428085941-mutt-send-email-mst@redhat.com>	<20150428100415.377222a3.cornelia.huck@de.ibm.com>	<20150428100706-mutt-send-email-mst@redhat.com>	<20150428124007.443a6555.cornelia.huck@de.ibm.com>	<20150428124510-mutt-send-email-mst@redhat.com>	<20150428133951.78b9f7e3.cornelia.huck@de.ibm.com>	<20150428143914-mutt-send-email-mst@redhat.com>	<20150428153337.0ec1f5b6.cornelia.huck@de.ibm.com>
	<20150428163601-mutt-send-email-mst@redhat.com>
In-Reply-To: <20150428163601-mutt-send-email-mst@redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH V7 08/16] virtio: introduce bus specific
 queue limit
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: "Michael S. Tsirkin" <mst@redhat.com>, Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>, Richard Henderson <rth@twiddle.net>, Paolo Bonzini <pbonzini@redhat.com>, qemu-devel@nongnu.org, Alexander Graf <agraf@suse.de>


On 04/28/2015 10:40 PM, Michael S. Tsirkin wrote:
> On Tue, Apr 28, 2015 at 03:33:37PM +0200, Cornelia Huck wrote:
>> On Tue, 28 Apr 2015 14:47:11 +0200
>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>
>>> On Tue, Apr 28, 2015 at 01:39:51PM +0200, Cornelia Huck wrote:
>>>> On Tue, 28 Apr 2015 12:55:40 +0200
>>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>>
>>>>> On Tue, Apr 28, 2015 at 12:40:07PM +0200, Cornelia Huck wrote:
>>>>>> On Tue, 28 Apr 2015 10:16:04 +0200
>>>>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>>>>
>>>>>>> On Tue, Apr 28, 2015 at 10:04:15AM +0200, Cornelia Huck wrote:
>>>>>>>> On Tue, 28 Apr 2015 09:14:07 +0200
>>>>>>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>>>>>>
>>>>>>>>> On Tue, Apr 28, 2015 at 02:13:20PM +0800, Jason Wang wrote:
>>>>>>>>>>
>>>>>>>>>> On Tue, Apr 28, 2015 at 1:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>>>>>> On Tue, Apr 28, 2015 at 11:14:04AM +0800, Jason Wang wrote:
>>>>>>>>>>>> On Mon, Apr 27, 2015 at 7:05 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>>>>>>>> On Thu, Apr 23, 2015 at 02:21:41PM +0800, Jason Wang wrote:
>>>>>>>>>>>>>> This patch introduces a bus specific queue limitation. It will be
>>>>>>>>>>>>>> useful for increasing the limit for one of the bus without disturbing
>>>>>>>>>>>>>> other buses.
>>>>>>>>>>>>>> Cc: Michael S. Tsirkin <mst@redhat.com>
>>>>>>>>>>>>>> Cc: Alexander Graf <agraf@suse.de>
>>>>>>>>>>>>>> Cc: Richard Henderson <rth@twiddle.net>
>>>>>>>>>>>>>> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
>>>>>>>>>>>>>> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
>>>>>>>>>>>>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>>>>>>>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>>>>>>>>>> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
>>>>>>>>>>>>> Is this still needed if you drop the attempt to
>>>>>>>>>>>>> keep the limit around for old machine types?
>>>>>>>>>>>> If we agree to drop, we probably need transport specific macro.
>>>>>>>>>>> You mean just rename VIRTIO_PCI_QUEUE_MAX to VIRTIO_QUEUE_MAX?
>>>>>>>>>>> Fine, why not.
>>>>>>>>>> I mean keeping VIRTIO_PCI_QUEUE_MAX for pci only and just increase pci
>>>>>>>>>> limit. And introduce e.g VIRTIO_PCI_QUEUE_CCW for ccw and keep it as 64.
>>>>>>>>>> Since to my understanding, it's not safe to increase the limit for all other
>>>>>>>>>> transports which was pointed out by Cornelia in V1:
>>>>>>>>>> http://permalink.gmane.org/gmane.comp.emulators.qemu/318245.
>>>>>>>>> I think all you need is add a check to CCW_CMD_SET_IND:
>>>>>>>>> limit to 64 for legacy interrupts only.
>>>>>>>> It isn't that easy.
>>>>>>>>
>>>>>>>> What is easy is to add a check to the guest driver that fails setup for
>>>>>>>> devices with more than 64 queues not using adapter interrupts.
>>>>>>>>
>>>>>>>> On the host side, we're lacking information when interpreting
>>>>>>>> CCW_CMD_SET_IND (the command does not contain a queue count, and the
>>>>>>>> actual number of virtqueues is not readily available.)
>>>>>>> Why isn't it available? All devices call virtio_add_queue
>>>>>>> as appropriate. Just fail legacy adaptors.
>>>>>> Because we don't know what the guest is going to use? It is free to
>>>>>> use per-subchannel indicators, even if it is operating in virtio-1 mode.
>>>>>>>> We also can't
>>>>>>>> fence off when setting up the vqs, as this happens before we know which
>>>>>>>> kind of indicators the guest wants to use.
>>>>>>>>
>>>>>>>> More importantly, we haven't even speced what we want to do in this
>>>>>>>> case. Do we want to reject SET_IND for devices with more than 64
>>>>>>>> queues? (Probably yes.)
>>>>>>>>
>>>>>>>> All this involves more work, and I'd prefer to do Jason's changes
>>>>>>>> instead as this gives us some more time to figure this out properly.
>>>>>>>>
>>>>>>>> And we haven't even considered s390-virtio yet, which I really want to
>>>>>>>> touch as little as possible :)
>>>>>>> Well this patch does touch it anyway :)
>>>>>> But only small, self-evident changes.
>>>>>>
>>>>> Sorry, I don't see what you are trying to say.
>>>>> There's no chance legacy interrupts work with > 64 queues.
>>>>> Guests should have validated the # of queues, and not
>>>>> attempted to use >64 queues. Looks like there's no
>>>>> such validation in guest, right?
>>>> I have no idea whether > 64 queues would work with s390-virtio - it
>>>> might well work, but I'm not willing to extend any effort to verifying
>>>> that.
>>> Well this doesn't mean we won't make any changes, ever,
>>> just so we can reduce verification costs.
>>> Let's make the change everywhere, if we see issues
>>> we'll backtrack.
>> I don't like possibly breaking things with a seeing eye. And I know
>> that some virtio-ccw setups will break.
>>
>>>>> Solution - don't specify this configuration with legacy guests.
>>>>>
>>>>> Modern guests work so there's value in supporting such
>>>>> configuration in QEMU, I don't see why we must deny it in QEMU.
>>>> What is "legacy guest" in your context? A guest running with the legacy
>>>> transport or a guest using ccw but not virtio-1? A ccw guest using
>>>> adapter interrupts but not virtio-1 should be fine.
>>> A guest not using adapter interrupts.
>> There's nothing about that that's per-guest. It is a choice per-device.
>> In fact, the Linux guest driver falls back to classic interrupts if it
>> fails to setup adapter interrupts for a device - and this might happen
>> for large guests when the host adapter routing table is full.
>>
>>>>>>> For s390 just check and fail at init if you like.
>>>>>> What about devices that may change their number of queues? I'd really
>>>>>> prefer large queue numbers to be fenced off in the individual
>>>>>> devices, and for that they need to be able to grab a transport-specific
>>>>>> queue limit.
>>>>> This is why I don't want bus specific limits in core,
>>>>> it just makes it too easy to sweep dirt under the carpet.
>>>>> s390 is legacy - fine, but don't perpetuate the issue
>>>>> in devices.
>>>> What is "swept under the carpet" here? A device can have min(max queues
>>>> from transport, max queues from device type) queues. I think it's
>>>> easier to refuse instantiating with too many queues per device type (as
>>>> most will be fine with 64 queues), so I don't want that code in the
>>>> transport (beyond making the limit available).
>>>>
>>>> For s390 I'd like in the end:
>>>> - s390-virtio: legacy - keep it working as best-can-do, so I'd prefer
>>>>   to keep it at 64 queues, even if more might work
>>>> - virtio-ccw, devices in legacy or virtio-1 mode: works with adapter
>>>>   interrupts, so let's fence off setting per-subchannel indicators if a
>>>>   device has more than 64 queues (needs work and a well thought-out
>>>>   rejection mechanism)
>>>>
>>>> That's _in the end_: I'd like to keep ccw at 64 queues _for now_ so
>>>> that we don't have a rushed interface change - and at the same time, I
>>>> don't want to hold off pci. Makes sense?
>>> If you want to fail configurations with > 64 queues in ccw or s390,
>>> that's fine by me. I don't want work arounds for these bugs in virtio
>>> core though. So transports should not have a say in how many queues can
>>> be supported, but they can fail configurations they can't support if
>>> they want to.
>> Eh, isn't that a contradiction? Failing a configuration means that the
>> transport does indeed have a say?
> I'm fine with general capability that lets transport check device
> and fail init, for whatever reason.
> E.g. can we teach plugged callback to fail?

Looks like we can (and for s390, we need to add a callback just for
checking this). That just moves the transport specific limit into
k->device_plugged (my patch checks k->queue_max). I don't see an obvious
difference.
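
To illustrate the comparison, here is a minimal standalone sketch of the
two approaches. It is not QEMU code: the Transport struct, both check
helpers and ccw_device_plugged() are hypothetical names made up for this
example, and the only value taken from the discussion is the limit of 64
queues for ccw.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified transport model; not QEMU's actual types. */
typedef struct Transport Transport;
struct Transport {
    const char *name;
    /* Approach A: the transport exposes a limit the core can compare against. */
    int queue_max;
    /* Approach B: the transport may veto a device when it is plugged.
     * Returns 0 on success, -1 to fail initialization. May be NULL. */
    int (*device_plugged)(const Transport *t, int nqueues);
};

/* Approach A: generic code checks the queue count against queue_max. */
static int check_with_queue_max(const Transport *t, int nqueues)
{
    assert(t != NULL);
    return nqueues <= t->queue_max ? 0 : -1;
}

/* Approach B: the limit lives inside the transport's own callback.
 * The value 64 is the ccw limit discussed in this thread. */
static int ccw_device_plugged(const Transport *t, int nqueues)
{
    assert(t != NULL);
    return nqueues <= 64 ? 0 : -1;
}

/* Approach B: generic code only asks the transport whether plugging works. */
static int check_with_plugged(const Transport *t, int nqueues)
{
    assert(t != NULL);
    if (t->device_plugged == NULL) {
        return 0; /* transport has no objection */
    }
    return t->device_plugged(t, nqueues);
}
```

With a transport initialized as { "ccw", 64, ccw_device_plugged }, both
helpers accept 64 queues and reject 65. In this sketch the only thing
that changes is where the number 64 is written down.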