From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: qemu-s390x@nongnu.org, Cornelia Huck <cohuck@redhat.com>,
qemu-devel@nongnu.org, Cindy Lu <lulu@redhat.com>
Subject: Re: [BUG] vhost-vdpa: qemu-system-s390x crashes with second virtio-net-ccw device
Date: Mon, 27 Jul 2020 09:16:00 -0400
Message-ID: <20200727091422-mutt-send-email-mst@kernel.org>
In-Reply-To: <c574a3c6-239f-9e8a-ee0d-de56c795a8cc@redhat.com>
On Mon, Jul 27, 2020 at 08:44:09PM +0800, Jason Wang wrote:
>
> On 2020/7/27 7:43 PM, Michael S. Tsirkin wrote:
> > On Mon, Jul 27, 2020 at 04:51:23PM +0800, Jason Wang wrote:
> > > On 2020/7/27 4:41 PM, Cornelia Huck wrote:
> > > > On Mon, 27 Jul 2020 15:38:12 +0800
> > > > Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > > On 2020/7/27 2:43 PM, Cornelia Huck wrote:
> > > > > > On Sat, 25 Jul 2020 08:40:07 +0800
> > > > > > Jason Wang <jasowang@redhat.com> wrote:
> > > > > > > On 2020/7/24 11:34 PM, Cornelia Huck wrote:
> > > > > > > > On Fri, 24 Jul 2020 11:17:57 -0400
> > > > > > > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote:
> > > > > > > > > > On Fri, 24 Jul 2020 09:30:58 -0400
> > > > > > > > > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > > > > > > > On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote:
> > > > > > > > > > > > When I start qemu with a second virtio-net-ccw device (i.e. adding
> > > > > > > > > > > > -device virtio-net-ccw in addition to the autogenerated device), I get
> > > > > > > > > > > > a segfault. gdb points to
> > > > > > > > > > > >
> > > > > > > > > > > > #0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>,
> > > > > > > > > > > > config=0x55d6ad9e3f80 "RT") at /home/cohuck/git/qemu/hw/net/virtio-net.c:146
> > > > > > > > > > > > 146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> > > > > > > > > > > >
> > > > > > > > > > > > (backtrace doesn't go further)
> > > > > > > > > > The core was incomplete, but running under gdb directly shows that it
> > > > > > > > > > is just a bog-standard config space access (first for that device).
> > > > > > > > > >
> > > > > > > > > > The cause of the crash is that nc->peer is not set... no idea how that
> > > > > > > > > > can happen, not that familiar with that part of QEMU. (Should the code
> > > > > > > > > > check, or is that really something that should not happen?)
> > > > > > > > > >
> > > > > > > > > > What I don't understand is why it is set correctly for the first,
> > > > > > > > > > autogenerated virtio-net-ccw device, but not for the second one, and
> > > > > > > > > > why virtio-net-pci doesn't show these problems. The only difference
> > > > > > > > > > between -ccw and -pci that comes to my mind here is that config space
> > > > > > > > > > accesses for ccw are done via an asynchronous operation, so timing
> > > > > > > > > > might be different.
> > > > > > > > > Hopefully Jason has an idea. Could you post a full command line
> > > > > > > > > please? Do you need a working guest to trigger this? Does this trigger
> > > > > > > > > on an x86 host?
> > > > > > > > Yes, it does trigger with tcg-on-x86 as well. I've been using
> > > > > > > >
> > > > > > > > s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu qemu,zpci=on
> > > > > > > > -m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001
> > > > > > > > -drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0
> > > > > > > > -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1
> > > > > > > > -device virtio-net-ccw
> > > > > > > >
> > > > > > > > It seems it needs the guest actually doing something with the nics; I
> > > > > > > > cannot reproduce the crash if I use the old advent calendar moon buggy
> > > > > > > > image and just add a virtio-net-ccw device.
> > > > > > > >
> > > > > > > > (I don't think it's a problem with my local build, as I see the problem
> > > > > > > > both on my laptop and on an LPAR.)
> > > > > > > It looks to me like we forgot to check for the existence of the peer.
> > > > > > >
> > > > > > > Please try the attached patch to see if it works.
> > > > > > Thanks, that patch gets my guest up and running again. So, FWIW,
> > > > > >
> > > > > > Tested-by: Cornelia Huck <cohuck@redhat.com>
> > > > > >
> > > > > > Any idea why this did not hit with virtio-net-pci (or the autogenerated
> > > > > > virtio-net-ccw device)?
> > > > > It can be hit with virtio-net-pci as well (just start without peer).
> > > > Hm, I had not been able to reproduce the crash with a 'naked' -device
> > > > virtio-net-pci. But checking seems to be the right idea anyway.
> > > Sorry for being unclear. I meant that, for the networking part, you just
> > > need to start without a peer, and you need a real guest (any Linux) that
> > > tries to access the config space of virtio-net.
> > >
> > > Thanks
> > A pxe guest will do it, but that doesn't support ccw, right?
>
>
> Yes, it depends on the cli actually.
>
>
> >
> > I'm still unclear why this triggers with ccw but not pci -
> > any idea?
>
>
> I haven't tested pxe, but I can reproduce this with pci (just start a Linux
> guest without a peer).
>
> Thanks
>
Might be a good addition to a unit test. Not sure what the test would
do exactly: just make sure the guest runs? Looks like a lot of work
for an empty test ... maybe we can at least poke at the guest config
with qtest commands.
--
MST
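
The crash discussed in this thread comes from dereferencing nc->peer when
no backend is attached to the NIC. The attached patch is not reproduced
in this message, so the following is only a minimal sketch of the kind of
guard being discussed, not the actual fix. All Demo* types and demo_*
functions are hypothetical stand-ins and are not QEMU's real structures
or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's net client types (assumption:
 * the real structures carry many more fields than shown here). */
typedef enum {
    DEMO_DRIVER_TAP,
    DEMO_DRIVER_VHOST_VDPA,
} DemoDriverType;

typedef struct DemoNetClientInfo {
    DemoDriverType type;
} DemoNetClientInfo;

typedef struct DemoNetClientState {
    const DemoNetClientInfo *info;
    struct DemoNetClientState *peer;   /* NULL when no backend is attached */
} DemoNetClientState;

/* Returns true only when a peer exists and it is a vhost-vdpa backend.
 * The unguarded form, nc->peer->info->type, dereferences a NULL peer. */
static bool demo_peer_is_vhost_vdpa(const DemoNetClientState *nc)
{
    return nc && nc->peer && nc->peer->info &&
           nc->peer->info->type == DEMO_DRIVER_VHOST_VDPA;
}

/* Small self-check covering the peerless case from the bug report. */
static void demo_self_check(void)
{
    DemoNetClientInfo nic_info = { DEMO_DRIVER_TAP };
    DemoNetClientInfo vdpa_info = { DEMO_DRIVER_VHOST_VDPA };
    DemoNetClientState backend = { &vdpa_info, NULL };
    DemoNetClientState nic = { &nic_info, NULL };

    /* A NIC added with a bare -device and no backend has no peer. */
    assert(!demo_peer_is_vhost_vdpa(&nic));

    /* Once a vhost-vdpa backend is attached, the check succeeds. */
    nic.peer = &backend;
    assert(demo_peer_is_vhost_vdpa(&nic));
}
```

Calling demo_self_check() exercises both the peerless and the attached
case. Whether a config space read on a peerless device should skip the
vhost-vdpa path silently or report an error is a design decision this
sketch does not settle.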
Thread overview: 14+ messages
2020-07-24 13:27 [BUG] vhost-vdpa: qemu-system-s390x crashes with second virtio-net-ccw device Cornelia Huck
2020-07-24 13:30 ` Michael S. Tsirkin
2020-07-24 14:56 ` Cornelia Huck
2020-07-24 15:17 ` Michael S. Tsirkin
2020-07-24 15:34 ` Cornelia Huck
2020-07-25 0:40 ` Jason Wang
2020-07-27 6:43 ` Cornelia Huck
2020-07-27 7:38 ` Jason Wang
2020-07-27 8:41 ` Cornelia Huck
2020-07-27 8:51 ` Jason Wang
2020-07-27 11:43 ` Michael S. Tsirkin
2020-07-27 12:44 ` Jason Wang
2020-07-27 13:16 ` Michael S. Tsirkin [this message]
2020-07-28 4:10 ` Jason Wang