qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Questions about usability mess that caused by differentiating address based on devices types
@ 2017-11-14  8:25 Dong Jia Shi
  2017-11-14 10:50 ` Cornelia Huck
  0 siblings, 1 reply; 5+ messages in thread
From: Dong Jia Shi @ 2017-11-14  8:25 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, bjsdjshi, fiuczy, bwalk, shalini, pasic, borntraeger

Dear Conny,

Good day!

Just now, our Libvirt folks pointed out a "usability mess" in the
design that differentiates addresses based on device classes (real |
virtual). The complaints are mainly about the "s390-squash-mcss"
property and the restrictions that virtual devices must be defined in
the special 0xFE css and real devices in a non-0xFE css.

We had some discussions internally, but failed to settle the matter.
Since we think this is about the architecture, I am forwarding our
arguments and questions here, as a representative, to ask for your help:

1. What benefit do we get from putting virtual devices in css 0xFE?

2. Since we can accept squashing virtual devices into css 0, can we also
accept not treating 0xFE as a special css?
Then we could remove the cssid validation restrictions for each type of
device. We could even drop s390-squash-mcss and just allow the user to
define any device in any css.

3. If we have to keep the squash property, then squashing is somewhat
like saying "I don't care about the cssid", so is it possible for us not
to check the cssid in the device devno?
Libvirt would benefit from this when automatically generating the
addresses.

4. The error message for a devno conflict is not helpful. For the
following case:
  -M s390-ccw-virtio,s390-squash-mcss=on \
  -drive file=/dev/disk/by-path/ccw-0.0.3f3e,if=none,id=drive-virtio-disk1,format=raw \
  -device virtio-blk-ccw,devno=fe.0.2222,scsi=off,drive=drive-virtio-disk1,id=virtio-disk1,bootindex=0 \
  -device vfio-ccw,devno=0.0.2222,sysfsdev=/sys/devices/css0/0.0.013f/6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920 \

We get this error message:
qemu-system-s390x: -device vfio-ccw,devno=0.0.2222,sysfsdev=/sys/devices/css0/0.0.013f/6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920: Device 0.0.2222 already exists

By checking 0.0.2222 on the cmd line, users cannot easily find the root
cause, which is the squashing. So, if we have to keep the squash
property, we could improve this message by adding a hint.
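
To illustrate the collision described above, here is a minimal Python
sketch. It is not QEMU code: the class `CcwBus`, the helper names and the
wording of the hint are all hypothetical, and it assumes that squashing
simply rewrites every cssid to css 0, as the error message above suggests.

```python
import re

DEFAULT_CSSID = 0x0  # assumption: css that squashing maps devices into

_DEVNO_RE = re.compile(r"([0-9a-f]{1,2})\.([0-9a-f])\.([0-9a-f]{4})",
                       re.IGNORECASE)


def parse_devno(text):
    """Split 'cssid.ssid.devno' (hex fields) into three integers."""
    match = _DEVNO_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"Invalid devno: {text}")
    return tuple(int(field, 16) for field in match.groups())


def format_devno(cssid, ssid, devno):
    return f"{cssid:x}.{ssid:x}.{devno:04x}"


class CcwBus:
    """Hypothetical stand-in for the machine's device address bookkeeping."""

    def __init__(self, squash_mcss):
        self.squash_mcss = squash_mcss
        self.by_effective = {}  # effective address -> address as requested

    def add(self, requested):
        cssid, ssid, devno = parse_devno(requested)
        requested = format_devno(cssid, ssid, devno)
        if self.squash_mcss:
            cssid = DEFAULT_CSSID  # every device ends up in the default css
        effective = format_devno(cssid, ssid, devno)
        earlier = self.by_effective.get(effective)
        if earlier is not None:
            message = f"Device {effective} already exists"
            # Only add the hint when squashing changed one of the addresses.
            if self.squash_mcss and (earlier != effective
                                     or requested != effective):
                message += (f" (requested {requested} collides with {earlier}"
                            " because s390-squash-mcss=on maps both into css "
                            f"{DEFAULT_CSSID:x})")
            raise ValueError(message)
        self.by_effective[effective] = requested
        return effective


bus = CcwBus(squash_mcss=True)
bus.add("fe.0.2222")      # the virtio-blk-ccw device from the example
try:
    bus.add("0.0.2222")   # the vfio-ccw device from the example
except ValueError as err:
    print(err)
```

With squashing off, the same two addresses do not collide in this sketch,
which is why the plain "already exists" message is confusing.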

To sum up, we got the feeling that, this mess is not only for Libvirt
but also for QEMU cmd line users. And we are wondering if there is some
way to improve it.

Thanks,

-- 
Dong Jia Shi


* Re: [Qemu-devel] Questions about usability mess that caused by differentiating address based on devices types
  2017-11-14  8:25 [Qemu-devel] Questions about usability mess that caused by differentiating address based on devices types Dong Jia Shi
@ 2017-11-14 10:50 ` Cornelia Huck
  2017-11-14 11:17   ` Christian Borntraeger
  2017-11-21  6:51   ` Dong Jia Shi
  0 siblings, 2 replies; 5+ messages in thread
From: Cornelia Huck @ 2017-11-14 10:50 UTC (permalink / raw)
  To: Dong Jia Shi
  Cc: qemu-devel, fiuczy, bwalk, shalini, pasic, borntraeger,
	qemu-s390x

On Tue, 14 Nov 2017 16:25:47 +0800
Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> wrote:

> Dear Conny,
> 
> Good day!
> 
> Just now, our Libvirt folks pointed out a "usability mess" for the
> design of differentiating address based on devices classes (real |
> virtual). The complaints are mainly about the "s390-squash-mcss"
> property and restrictions to define virtual device in the special 0xFE
> css, and define real devices in non-0xFE.
> 
> We have some discussions internally, but failed to get it cleared. As we
> think this is about the architecture, so hereby, I as a representative,
> forward our arguments and questions here to ask you for help:
> 
> 1. What benefit do we get to put virtual devices in css 0xFE?

Some background here (for the benefit of innocent bystanders):

In the past, I had been involved with some cases where Linux guests
under z/VM died after a customer followed the recommended procedure to
vary off a path before applying service. That path was supposed to be a
path to a disk; unfortunately, z/VM had mapped all kinds of virtual
paths to it, including the only path to the console device. Oops.

With that in mind, we wanted to make sure that qemu would not be
susceptible to the same problem; IOW, we wanted to make sure that
chpids etc. were not mashed together for devices that did not have
anything to do with each other. At one point in time, the idea came up
to use a reserved css for virtio devices, which was deemed an elegant
solution as 'real' devices were still something far in the future. (And
I was under the delusion that we would have MCSS-E support in Linux by
then; that has not happened...)

So the basic idea of css 0xfe is: Maintain a clear separation between
devices emulated by qemu and pass-through devices (a more divisive
separation than by simply separating chpids).
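
The restriction that follows from this idea can be sketched in a few
lines of Python. This is only an illustration under stated assumptions,
not QEMU's actual check: `check_cssid` and the error wording are
hypothetical, and 0xff is treated as unusable solely because it is
mentioned elsewhere in this thread as reserved.

```python
VIRTUAL_CSSID = 0xFE   # css reserved for devices emulated by qemu
RESERVED_CSSID = 0xFF  # assumption: reserved for special usage, unusable


def check_cssid(cssid, is_virtual):
    """Raise ValueError if the cssid does not fit the device class."""
    if not 0 <= cssid < RESERVED_CSSID:
        raise ValueError(f"cssid {cssid:x} is out of range or reserved")
    # Virtual devices must use 0xfe; all other devices must not.
    if is_virtual != (cssid == VIRTUAL_CSSID):
        kind = "virtual" if is_virtual else "non-virtual"
        raise ValueError(f"cssid {cssid:x} not valid for {kind} devices")


check_cssid(0xFE, is_virtual=True)    # e.g. virtio-blk-ccw at fe.0.2222
check_cssid(0x00, is_virtual=False)   # e.g. vfio-ccw at 0.0.2222
```

Opening up the csses for all devices would amount to dropping the second
check while keeping the range check.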

> 
> 2. Since we could accept squashing virtual devices into css 0, can we
> accept not treating 0xFE as a special css?

Using css 0xfe seemed like a good idea; but as things worked out
differently in the meantime, it seems it causes more problems right now
than it avoids.

> So that we can remove the restrictions for the cssid validation for each
> type of device. Even we could drop the s390-squash-mcss, and just allow
> the user to define any device in any css.

Opening up the different csses for all devices might help, but we need
to be careful:
- We still want to keep the chpids separated. Probably not a problem
  right now.
- We need to be able to point to a default css, especially as there are
  no MCSS-E capable OSs around yet.
- You need to double check if there are further restrictions on the
  allowed css ids. (I know that 0xff is reserved for special usage as
  well; but I can't find out more.)
- Backwards compatibility and migration: We certainly don't want old
  setups to break, and compat machines need to force the old scheme.

All best tested out via a prototype :)

> 
> 3. If we have to keep the squash property, then when squashing, it's
> somewhat like "I don't care for the cssid", so is it possible for us to
> not check the cssid in the device devno?
> Libvirt would benefit from this when automatically generating the
> addresses.

I think we still need to keep the squashing around for compatibility,
but we may be able to give it the chop for something like 3.0.

(And we probably need to keep the existing restrictions in compat mode.)

> 
> 4. Error message for devno conflict is not helpful. For the following
> case:
>   -M s390-ccw-virtio,s390-squash-mcss=on \
>   -drive file=/dev/disk/by-path/ccw-0.0.3f3e,if=none,id=drive-virtio-disk1,format=raw \
>   -device virtio-blk-ccw,devno=fe.0.2222,scsi=off,drive=drive-virtio-disk1,id=virtio-disk1,bootindex=0 \
>   -device vfio-ccw,devno=0.0.2222,sysfsdev=/sys/devices/css0/0.0.013f/6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920 \
> 
> We get this error message:
> qemu-system-s390x: -device vfio-ccw,devno=0.0.2222,sysfsdev=/sys/devices/css0/0.0.013f/6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920: Device 0.0.2222 already exists
> 
> By checking 0.0.2222 from the cmd line, users can not find out the root
> cause - squashing, easily. So, if we have to keep the squash property,
> we could improve this message by adding a hint.

That's probably a change that can be done quickly, without any compat
implications, right?

> 
> To sum up, we got the feeling that, this mess is not only for Libvirt
> but also for QEMU cmd line users. And we are wondering if there is some
> way to improve it.

Using css 0xfe seems to be an idea that turned out not to be as useful
as we hoped it would be. Maybe the right way forward is indeed to open
up the csses for all devices (although there might be a case for
putting non-virtual devices not into 0xfe by default and instead making
0 the default css).

Another thing: Should libvirt give its users enough rope to hang
themselves by allowing to create domains with devices all over the
channel subsystem images?


* Re: [Qemu-devel] Questions about usability mess that caused by differentiating address based on devices types
  2017-11-14 10:50 ` Cornelia Huck
@ 2017-11-14 11:17   ` Christian Borntraeger
  2017-11-21  6:51   ` Dong Jia Shi
  1 sibling, 0 replies; 5+ messages in thread
From: Christian Borntraeger @ 2017-11-14 11:17 UTC (permalink / raw)
  To: Cornelia Huck, Dong Jia Shi
  Cc: qemu-devel, fiuczy, bwalk, shalini, pasic, qemu-s390x


On 11/14/2017 11:50 AM, Cornelia Huck wrote:
> On Tue, 14 Nov 2017 16:25:47 +0800
> Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> wrote:
> 
>> Dear Conny,
>>
>> Good day!
>>
>> Just now, our Libvirt folks pointed out a "usability mess" for the
>> design of differentiating address based on devices classes (real |
>> virtual). The complaints are mainly about the "s390-squash-mcss"
>> property and restrictions to define virtual device in the special 0xFE
>> css, and define real devices in non-0xFE.
>>
>> We have some discussions internally, but failed to get it cleared. As we
>> think this is about the architecture, so hereby, I as a representative,
>> forward our arguments and questions here to ask you for help:
>>
>> 1. What benefit do we get to put virtual devices in css 0xFE?
> 
> Some background here (for the benefit of innocent bystanders):
> 
> In the past, I had been involved with some cases where Linux guests
> under z/VM died after a customer followed the recommended procedure to
> vary off a path before applying service. That path was supposed to be a
> path to a disk; unfortunately, z/VM had mapped all kinds of virtual
> paths to it, including the only path to the console device. Oops.
> 
> With that in mind, we wanted to make sure that qemu would not be
> susceptible to the same problem; IOW, we wanted to make sure that
> chpids etc. were not mashed together for devices that did not have
> anything to do with each other. At one point in time, the idea came up
> to use a reserved css for virtio devices, which was deemed an elegant
> solution as 'real' devices were still something far in the future. (And
> I was under the delusion that we would have MCSS-E support in Linux by
> then; that has not happened...)

Yes, and I think we can assume that MCSS-E will not come in the
foreseeable future.
> 
> So the basic idea of css 0xfe is: Maintain a clear separation between
> devices emulated by qemu and pass-through devices (a more divisive
> separation than by simply separating chpids).

> 
>>
>> 2. Since we could accept squashing virtual devices into css 0, can we
>> accept not treating 0xFE as a special css?
> 
> Using css 0xfe seemed like a good idea; but as things worked out
> differently in the meantime, it seems it causes more problems right now
> than it avoids.
> 
>> So that we can remove the restrictions for the cssid validation for each
>> type of device. Even we could drop the s390-squash-mcss, and just allow
>> the user to define any device in any css.
> 
> Opening up the different csses for all devices might help, but we need
> to be careful:
> - We still want to keep the chpids separated. Probably not a problem
>   right now.
> - We need to be able to point to a default css, especially as there are
>   no MCSS-E capable OSs around yet.
> - You need to double check if there are further restrictions on the
>   allowed css ids. (I know that 0xff is reserved for special usage as
>   well; but I can't find out more.)
> - Backwards compatibility and migration: We certainly don't want old
>   setups to break, and compat machines need to force the old scheme.
> 
> All best tested out via a prototype :)

I never looked into the details, but my expectation for the squash thing
was that it actually allows to add real devices into the virtual channel
subsystem. So maybe we should really do a prototype.

> 
>>
>> 3. If we have to keep the squash property, then when squashing, it's
>> somewhat like "I don't care for the cssid", so is it possible for us to
>> not check the cssid in the device devno?
>> Libvirt would benefit from this when automatically generating the
>> addresses.
> 
> I think we still need to keep the squashing around for compatibility,
> but we may be able to give it the chop for something like 3.0.
> 
> (And we probably need to keep the existing restrictions in compat mode.)
> 
>>
>> 4. Error message for devno conflict is not helpful. For the following
>> case:
>>   -M s390-ccw-virtio,s390-squash-mcss=on \
>>   -drive file=/dev/disk/by-path/ccw-0.0.3f3e,if=none,id=drive-virtio-disk1,format=raw \
>>   -device virtio-blk-ccw,devno=fe.0.2222,scsi=off,drive=drive-virtio-disk1,id=virtio-disk1,bootindex=0 \
>>   -device vfio-ccw,devno=0.0.2222,sysfsdev=/sys/devices/css0/0.0.013f/6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920 \
>>
>> We get this error message:
>> qemu-system-s390x: -device vfio-ccw,devno=0.0.2222,sysfsdev=/sys/devices/css0/0.0.013f/6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920: Device 0.0.2222 already exists
>>
>> By checking 0.0.2222 from the cmd line, users can not find out the root
>> cause - squashing, easily. So, if we have to keep the squash property,
>> we could improve this message by adding a hint.
> 
> That's probably a change that can be done quickly, without any compat
> implications, right?
> 
>>
>> To sum up, we got the feeling that, this mess is not only for Libvirt
>> but also for QEMU cmd line users. And we are wondering if there is some
>> way to improve it.
> 
> Using css 0xfe seems to be an idea that turned out not to be as useful
> as we hoped it would be. Maybe the right way forward is indeed to open
> up the csses for all devices (although there might be a case for
> putting non-virtual devices not into 0xfe by default and instead making
> 0 the default css).

yes. 
> 
> Another thing: Should libvirt give its users enough rope to hang
> themselves by allowing to create domains with devices all over the
> channel subsystem images?


* Re: [Qemu-devel] Questions about usability mess that caused by differentiating address based on devices types
  2017-11-14 10:50 ` Cornelia Huck
  2017-11-14 11:17   ` Christian Borntraeger
@ 2017-11-21  6:51   ` Dong Jia Shi
  2017-11-21 10:15     ` Cornelia Huck
  1 sibling, 1 reply; 5+ messages in thread
From: Dong Jia Shi @ 2017-11-21  6:51 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Dong Jia Shi, qemu-devel, fiuczy, bwalk, shalini, pasic,
	borntraeger, qemu-s390x

* Cornelia Huck <cohuck@redhat.com> [2017-11-14 11:50:14 +0100]:

Hallo Conny,

After spending some time, just some updates for this one.

> On Tue, 14 Nov 2017 16:25:47 +0800
> Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> wrote:
> 
> > Dear Conny,
> > 
> > Good day!
> > 
> > Just now, our Libvirt folks pointed out a "usability mess" for the
> > design of differentiating address based on devices classes (real |
> > virtual). The complaints are mainly about the "s390-squash-mcss"
> > property and restrictions to define virtual device in the special 0xFE
> > css, and define real devices in non-0xFE.
> > 
> > We have some discussions internally, but failed to get it cleared. As we
> > think this is about the architecture, so hereby, I as a representative,
> > forward our arguments and questions here to ask you for help:
> > 
> > 1. What benefit do we get to put virtual devices in css 0xFE?
> 
> Some background here (for the benefit of innocent bystanders):
:)

> 
> In the past, I had been involved with some cases where Linux guests
> under z/VM died after a customer followed the recommended procedure to
> vary off a path before applying service. That path was supposed to be a
> path to a disk; unfortunately, z/VM had mapped all kinds of virtual
> paths to it, including the only path to the console device. Oops.
> 
> With that in mind, we wanted to make sure that qemu would not be
> susceptible to the same problem; IOW, we wanted to make sure that
> chpids etc. were not mashed together for devices that did not have
> anything to do with each other. At one point in time, the idea came up
> to use a reserved css for virtio devices, which was deemed an elegant
> solution as 'real' devices were still something far in the future. (And
> I was under the delusion that we would have MCSS-E support in Linux by
> then; that has not happened...)
> 
> So the basic idea of css 0xfe is: Maintain a clear separation between
> devices emulated by qemu and pass-through devices (a more divisive
> separation than by simply separating chpids).
> 
Thanks for the information. I think everybody is now clear about the
background.

[I sometimes find it a pleasure to listen to your stories. Clear and
interesting.]

> > 
> > 2. Since we could accept squashing virtual devices into css 0, can we
> > accept not treating 0xFE as a special css?
> 
> Using css 0xfe seemed like a good idea; but as things worked out
> differently in the meantime, it seems it causes more problems right now
> than it avoids.
> 
Have to agree. In particular after knowing the background.

> > So that we can remove the restrictions for the cssid validation for each
> > type of device. Even we could drop the s390-squash-mcss, and just allow
> > the user to define any device in any css.
> 
> Opening up the different csses for all devices might help, but we need
> to be careful:
> - We still want to keep the chpids separated. Probably not a problem
>   right now.
> - We need to be able to point to a default css, especially as there are
>   no MCSS-E capable OSs around yet.
> - You need to double check if there are further restrictions on the
>   allowed css ids. (I know that 0xff is reserved for special usage as
>   well; but I can't find out more.)
> - Backwards compatibility and migration: We certainly don't want old
>   setups to break, and compat machines need to force the old scheme.
> 
> All best tested out via a prototype :)
> 
After a round of internal discussion, Halil now has a prototype. I think
he will soon post his patch reflecting our internal agreement, and we can
continue the discussion based on that.

> > 
> > 3. If we have to keep the squash property, then when squashing, it's
> > somewhat like "I don't care for the cssid", so is it possible for us to
> > not check the cssid in the device devno?
> > Libvirt would benefit from this when automatically generating the
> > addresses.
> 
> I think we still need to keep the squashing around for compatibility,
> but we may be able to give it the chop for something like 3.0.
> 
> (And we probably need to keep the existing restrictions in compat mode.)
> 
Can we just drop the squash property right after we have opened up all
the csses?
We do not support LGM for vfio-ccw, and there has been no libvirt user
so far. So what other case could stop us from dropping it?

> > 
> > 4. Error message for devno conflict is not helpful. For the following
> > case:
> >   -M s390-ccw-virtio,s390-squash-mcss=on \
> >   -drive file=/dev/disk/by-path/ccw-0.0.3f3e,if=none,id=drive-virtio-disk1,format=raw \
> >   -device virtio-blk-ccw,devno=fe.0.2222,scsi=off,drive=drive-virtio-disk1,id=virtio-disk1,bootindex=0 \
> >   -device vfio-ccw,devno=0.0.2222,sysfsdev=/sys/devices/css0/0.0.013f/6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920 \
> > 
> > We get this error message:
> > qemu-system-s390x: -device vfio-ccw,devno=0.0.2222,sysfsdev=/sys/devices/css0/0.0.013f/6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920: Device 0.0.2222 already exists
> > 
> > By checking 0.0.2222 from the cmd line, users can not find out the root
> > cause - squashing, easily. So, if we have to keep the squash property,
> > we could improve this message by adding a hint.
> 
> That's probably a change that can be done quickly, without any compat
> implications, right?
Right. I will put this on hold until we have a final agreement.

> 
> > 
> > To sum up, we got the feeling that, this mess is not only for Libvirt
> > but also for QEMU cmd line users. And we are wondering if there is some
> > way to improve it.
> 
> Using css 0xfe seems to be an idea that turned out not to be as useful
> as we hoped it would be. Maybe the right way forward is indeed to open
> up the csses for all devices (although there might be a case for
> putting non-virtual devices not into 0xfe by default and instead making
> 0 the default css).
> 
> Another thing: Should libvirt give its users enough rope to hang
> themselves by allowing to create domains with devices all over the
> channel subsystem images?
> 
I think the commit message of Halil's patch will show you the idea.
Let's wait a moment for that.

Thanks!

-- 
Dong Jia Shi


* Re: [Qemu-devel] Questions about usability mess that caused by differentiating address based on devices types
  2017-11-21  6:51   ` Dong Jia Shi
@ 2017-11-21 10:15     ` Cornelia Huck
  0 siblings, 0 replies; 5+ messages in thread
From: Cornelia Huck @ 2017-11-21 10:15 UTC (permalink / raw)
  To: Dong Jia Shi
  Cc: qemu-devel, fiuczy, bwalk, shalini, pasic, borntraeger,
	qemu-s390x

On Tue, 21 Nov 2017 14:51:26 +0800
Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> wrote:

> * Cornelia Huck <cohuck@redhat.com> [2017-11-14 11:50:14 +0100]:
> 
> Hallo Conny,
> 
> After spending some time, just some updates for this one.
> 
> > On Tue, 14 Nov 2017 16:25:47 +0800
> > Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> wrote:

> > > So that we can remove the restrictions for the cssid validation for each
> > > type of device. Even we could drop the s390-squash-mcss, and just allow
> > > the user to define any device in any css.  
> > 
> > Opening up the different csses for all devices might help, but we need
> > to be careful:
> > - We still want to keep the chpids separated. Probably not a problem
> >   right now.
> > - We need to be able to point to a default css, especially as there are
> >   no MCSS-E capable OSs around yet.
> > - You need to double check if there are further restrictions on the
> >   allowed css ids. (I know that 0xff is reserved for special usage as
> >   well; but I can't find out more.)
> > - Backwards compatibility and migration: We certainly don't want old
> >   setups to break, and compat machines need to force the old scheme.
> > 
> > All best tested out via a prototype :)
> >   
> After a round of internal discussion, Halil now has a prototype. I think
> sooner he will post his patch with our internal agreement, and we can
> continuing talking based on that then.

Please post upstream as soon as it makes sense. Makes discussion
easier :)

> 
> > > 
> > > 3. If we have to keep the squash property, then when squashing, it's
> > > somewhat like "I don't care for the cssid", so is it possible for us to
> > > not check the cssid in the device devno?
> > > Libvirt would benefit from this when automatically generating the
> > > addresses.  
> > 
> > I think we still need to keep the squashing around for compatibility,
> > but we may be able to give it the chop for something like 3.0.
> > 
> > (And we probably need to keep the existing restrictions in compat mode.)
> >   
> Can we just drop the squash property, right after we opened up all the
> csses?
> We do not support LGM for vfio-ccw, and there is no libvirt user until
> now. So what else case could it be to stop us from dropping it?

IIRC, we still require a non-0xfe cssid for non-virtual devices with
the squash parameter; they are just mapped into the default css later.
Just dropping the parameter would break existing command lines, and
accepting but ignoring it makes a previously working commandline
suddenly non-working (not all devices visible to guest).
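
The visibility problem can be sketched as follows. This is a rough
Python model under stated assumptions, not QEMU behaviour verbatim: the
function `visible_to_guest` is hypothetical, the guest is assumed to lack
MCSS-E support and therefore to see only the default css, and which css
is the default is passed in as a parameter rather than asserted.

```python
def visible_to_guest(devices, default_cssid, squash):
    """Addresses a guest without MCSS-E support would see.

    Such a guest only looks at the default css. Each device is a
    (cssid, ssid, devno) tuple as given on the command line.
    """
    visible = []
    for cssid, ssid, devno in devices:
        if squash:
            cssid = default_cssid  # mapped into the default css
        if cssid == default_cssid:
            visible.append(f"{cssid:x}.{ssid:x}.{devno:04x}")
    return visible


devices = [(0xFE, 0, 0x1111),   # virtual device
           (0x00, 0, 0x2222)]   # pass-through device

# Parameter honoured: both devices land in the default css.
print(visible_to_guest(devices, default_cssid=0x00, squash=True))
# Parameter accepted but ignored: one device silently disappears.
print(visible_to_guest(devices, default_cssid=0x00, squash=False))
```

Whichever css is the default, a command line that mixes 0xfe and non-0xfe
devices loses part of its devices in this model once squashing is
silently ignored.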

We either need to follow the proper deprecation procedure (which means
that the parameter will stay around for two releases), or kill it in a
3.0 release (if that comes earlier). But we should be able to get rid
of it at least in the long run.

> > > To sum up, we got the feeling that, this mess is not only for Libvirt
> > > but also for QEMU cmd line users. And we are wondering if there is some
> > > way to improve it.  
> > 
> > Using css 0xfe seems to be an idea that turned out not to be as useful
> > as we hoped it would be. Maybe the right way forward is indeed to open
> > up the csses for all devices (although there might be a case for
> > putting non-virtual devices not into 0xfe by default and instead making
> > 0 the default css).
> > 
> > Another thing: Should libvirt give its users enough rope to hang
> > themselves by allowing to create domains with devices all over the
> > channel subsystem images?
> >   
> I think the commit message of Halil's patch will show you the idea.
> Let's wait for some moment.

Yup, let's see what he comes up with.

