* [Qemu-devel] rbd: Possible regression in 2.9 RCs
@ 2017-03-30 23:42 Alexandru Avadanii
2017-03-31 0:39 ` Eric Blake
0 siblings, 1 reply; 7+ messages in thread
From: Alexandru Avadanii @ 2017-03-30 23:42 UTC (permalink / raw)
To: qemu-devel@nongnu.org; +Cc: svc-armband
Hi,
While testing out 2.9.0-rc2 on AArch64, we noticed a possible regression in QEMU, related to parsing -drive 'file=rbd/...':
> "conf option 6789 has no value".
Instance logs [1].
Occasionally, we get "conf option too long", with the same effect.
We bisected this manually between 2.8.0 (working ok with the above cmd) and 2.9.0-rc2, and the problematic change seems to be the merge point [2].
The problem is not arch-specific, i.e. it applies to x86 too, but our set up is based on AArch64, so I attached the logs I had lying around.
I would digg deeper into this myself, but I'm quite burnt out for today.
BR,
Alex
[1] http://paste.openstack.org/show/604938/
[2] https://github.com/qemu/qemu/commit/9a81b79
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] rbd: Possible regression in 2.9 RCs 2017-03-30 23:42 [Qemu-devel] rbd: Possible regression in 2.9 RCs Alexandru Avadanii @ 2017-03-31 0:39 ` Eric Blake 2017-03-31 0:52 ` Alexandru Avadanii 2017-03-31 1:05 ` Alexandru Avadanii 0 siblings, 2 replies; 7+ messages in thread From: Eric Blake @ 2017-03-31 0:39 UTC (permalink / raw) To: Alexandru Avadanii, qemu-devel@nongnu.org Cc: svc-armband, Jeff Cody, Markus Armbruster [-- Attachment #1: Type: text/plain, Size: 2019 bytes --] On 03/30/2017 06:42 PM, Alexandru Avadanii wrote: > Hi, > While testing out 2.9.0-rc2 on AArch64, we noticed a possible regression in QEMU, related to parsing -drive 'file=rbd/...': >> "conf option 6789 has no value". > > Instance logs [1]. Pastebins don't last forever; it helps to paste the actual error message in the email for archival purposes, and to make it easier for readers to see your problem without having to chase URLs: 2017-03-30T20:02:27.499695Z qemu-system-aarch64: -drive file=rbd:volumes/volume-ea141b5c-cdb3-4765-910d-e7008b209a70:id=compute:key=AQAVkvxXAAAAABAA9ZxWFYdRmV+DSwKr7BKKXg==:auth_supported=cephx\;none:mon_host=192.168.1.2\:6789,format=raw,if=none,id=drive-virtio-disk0,serial=ea141b5c-cdb3-4765-910d-e7008b209a70,cache=writeback: conf option 6789 has no value Looking at that, the only instance of 6789 that I see is the 'mon_host=192.168.1.2\:6789,' portion. I bet what is happening is that we are mis-parsing the string, and trying to treat it as a key-values pair. In other words, it's probably an unintended regression introduced in the range of 7830f909..0a55679b by Jeff [3] or in Markus' cleanups between f51c363c..2836284d [4]. On the bright side, we still have time to fix it before 2.9 goes final, now that you called it to our attention. > Occasionally, we get "conf option too long", with the same effect. > > We bisected this manually between 2.8.0 (working ok with the above cmd) and 2.9.0-rc2, and the problematic change seems to be the merge point [2]. I suspect you didn't run the bisect quite correctly, as that merge point has nothing to do with block/rbd.c. > > [1] http://paste.openstack.org/show/604938/ > [2] https://github.com/qemu/qemu/commit/9a81b79 > [3] https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg07506.html [4] https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05565.html -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] rbd: Possible regression in 2.9 RCs 2017-03-31 0:39 ` Eric Blake @ 2017-03-31 0:52 ` Alexandru Avadanii 2017-03-31 1:05 ` Alexandru Avadanii 1 sibling, 0 replies; 7+ messages in thread From: Alexandru Avadanii @ 2017-03-31 0:52 UTC (permalink / raw) To: Eric Blake, qemu-devel@nongnu.org Cc: svc-armband, Jeff Cody, Markus Armbruster Hi, Eric, Thank you for looking into this! > -----Original Message----- > From: Eric Blake [mailto:eblake@redhat.com] > Sent: Friday, March 31, 2017 3:40 AM > To: Alexandru Avadanii; qemu-devel@nongnu.org > Cc: svc-armband; Jeff Cody; Markus Armbruster > Subject: Re: [Qemu-devel] rbd: Possible regression in 2.9 RCs > > On 03/30/2017 06:42 PM, Alexandru Avadanii wrote: > > Hi, > > While testing out 2.9.0-rc2 on AArch64, we noticed a possible regression in > QEMU, related to parsing -drive 'file=rbd/...': > >> "conf option 6789 has no value". > > > > Instance logs [1]. > > Pastebins don't last forever; it helps to paste the actual error message in the > email for archival purposes, and to make it easier for readers to see your > problem without having to chase URLs: > > 2017-03-30T20:02:27.499695Z qemu-system-aarch64: -drive > file=rbd:volumes/volume-ea141b5c-cdb3-4765-910d- > e7008b209a70:id=compute:key=AQAVkvxXAAAAABAA9ZxWFYdRmV+DSwKr > 7BKKXg==:auth_supported=cephx\;none:mon_host=192.168.1.2\:6789,form > at=raw,if=none,id=drive-virtio-disk0,serial=ea141b5c-cdb3-4765-910d- > e7008b209a70,cache=writeback: > conf option 6789 has no value > > Looking at that, the only instance of 6789 that I see is the > 'mon_host=192.168.1.2\:6789,' portion. I bet what is happening is that we > are mis-parsing the string, and trying to treat it as a key-values pair. In other > words, it's probably an unintended regression introduced in the range of > 7830f909..0a55679b by Jeff [3] or in Markus' cleanups between > f51c363c..2836284d [4]. > > On the bright side, we still have time to fix it before 2.9 goes final, now that > you called it to our attention. > > > Occasionally, we get "conf option too long", with the same effect. > > > > We bisected this manually between 2.8.0 (working ok with the above cmd) > and 2.9.0-rc2, and the problematic change seems to be the merge point [2]. > > I suspect you didn't run the bisect quite correctly, as that merge point has > nothing to do with block/rbd.c. I suspect that too, sorry. I'll redo this tomorrow and get back. > > > > > [1] http://paste.openstack.org/show/604938/ > > [2] https://github.com/qemu/qemu/commit/9a81b79 > > > > [3] https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg07506.html > [4] https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05565.html > > -- > Eric Blake eblake redhat com +1-919-301-3266 > Libvirt virtualization library http://libvirt.org ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] rbd: Possible regression in 2.9 RCs 2017-03-31 0:39 ` Eric Blake 2017-03-31 0:52 ` Alexandru Avadanii @ 2017-03-31 1:05 ` Alexandru Avadanii 2017-03-31 1:22 ` Eric Blake 1 sibling, 1 reply; 7+ messages in thread From: Alexandru Avadanii @ 2017-03-31 1:05 UTC (permalink / raw) To: Eric Blake, qemu-devel@nongnu.org Cc: svc-armband, Jeff Cody, Markus Armbruster c7cacb3e7a2e9fdf929c993b98268e4179147cbb is the first bad commit block/rbd: parse all options via bdrv_parse_filename > -----Original Message----- > From: Alexandru Avadanii > Sent: Friday, March 31, 2017 3:52 AM > To: 'Eric Blake'; qemu-devel@nongnu.org > Cc: svc-armband; Jeff Cody; Markus Armbruster > Subject: RE: [Qemu-devel] rbd: Possible regression in 2.9 RCs > > Hi, Eric, > Thank you for looking into this! > > > -----Original Message----- > > From: Eric Blake [mailto:eblake@redhat.com] > > Sent: Friday, March 31, 2017 3:40 AM > > To: Alexandru Avadanii; qemu-devel@nongnu.org > > Cc: svc-armband; Jeff Cody; Markus Armbruster > > Subject: Re: [Qemu-devel] rbd: Possible regression in 2.9 RCs > > > > On 03/30/2017 06:42 PM, Alexandru Avadanii wrote: > > > Hi, > > > While testing out 2.9.0-rc2 on AArch64, we noticed a possible > > > regression in > > QEMU, related to parsing -drive 'file=rbd/...': > > >> "conf option 6789 has no value". > > > > > > Instance logs [1]. > > > > Pastebins don't last forever; it helps to paste the actual error > > message in the email for archival purposes, and to make it easier for > > readers to see your problem without having to chase URLs: > > > > 2017-03-30T20:02:27.499695Z qemu-system-aarch64: -drive > > file=rbd:volumes/volume-ea141b5c-cdb3-4765-910d- > > > e7008b209a70:id=compute:key=AQAVkvxXAAAAABAA9ZxWFYdRmV+DSwKr > > > 7BKKXg==:auth_supported=cephx\;none:mon_host=192.168.1.2\:6789,form > > at=raw,if=none,id=drive-virtio-disk0,serial=ea141b5c-cdb3-4765-910d- > > e7008b209a70,cache=writeback: > > conf option 6789 has no value > > > > Looking at that, the only instance of 6789 that I see is the > > 'mon_host=192.168.1.2\:6789,' portion. I bet what is happening is > > that we are mis-parsing the string, and trying to treat it as a > > key-values pair. In other words, it's probably an unintended > > regression introduced in the range of 7830f909..0a55679b by Jeff [3] > > or in Markus' cleanups between f51c363c..2836284d [4]. > > > > On the bright side, we still have time to fix it before 2.9 goes > > final, now that you called it to our attention. > > > > > Occasionally, we get "conf option too long", with the same effect. > > > > > > We bisected this manually between 2.8.0 (working ok with the above > > > cmd) > > and 2.9.0-rc2, and the problematic change seems to be the merge point [2]. > > > > I suspect you didn't run the bisect quite correctly, as that merge > > point has nothing to do with block/rbd.c. > > I suspect that too, sorry. I'll redo this tomorrow and get back. > > > > > > > > > [1] http://paste.openstack.org/show/604938/ > > > [2] https://github.com/qemu/qemu/commit/9a81b79 > > > > > > > [3] > > https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg07506.html > > [4] > > https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05565.html > > > > -- > > Eric Blake eblake redhat com +1-919-301-3266 > > Libvirt virtualization library http://libvirt.org ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] rbd: Possible regression in 2.9 RCs 2017-03-31 1:05 ` Alexandru Avadanii @ 2017-03-31 1:22 ` Eric Blake 2017-03-31 2:08 ` Eric Blake 0 siblings, 1 reply; 7+ messages in thread From: Eric Blake @ 2017-03-31 1:22 UTC (permalink / raw) To: Alexandru Avadanii, qemu-devel@nongnu.org Cc: svc-armband, Jeff Cody, Markus Armbruster [-- Attachment #1: Type: text/plain, Size: 1187 bytes --] On 03/30/2017 08:05 PM, Alexandru Avadanii wrote: > c7cacb3e7a2e9fdf929c993b98268e4179147cbb is the first bad commit > block/rbd: parse all options via bdrv_parse_filename Yep, my bisect finished about 2 minutes after your email on the same spot. I'm working on a patch. I can reproduce the problem with a mere: ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio -drive 'file=rbd:volumes/volume-ea141b5c-cdb3-4765-910d-e7008b209a70:id=compute:key=AQAVkvxXAAAAABAA9ZxWFYdRmV+DSwKr7BKKXg==:auth_supported=cephx\;none:mon_host=192.168.1.2\:6789,format=raw,if=none,id=drive-virtio-disk0,serial=ea141b5c-cdb3-4765-910d-e7008b209a70,cache=writeback' the good behavior (on my setup) just hangs trying to connect to a non-existent machine, the bad behavior gets rather-badly misparsed (splitting the escaped : in the host:port portion as if the port were the next key-value pair) resulting in an instant error message. I don't have an actual RBD setup for testing the fix, but will cc you on the patch that I propose once I have something. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] rbd: Possible regression in 2.9 RCs 2017-03-31 1:22 ` Eric Blake @ 2017-03-31 2:08 ` Eric Blake 2017-03-31 3:54 ` Jeff Cody 0 siblings, 1 reply; 7+ messages in thread From: Eric Blake @ 2017-03-31 2:08 UTC (permalink / raw) To: Alexandru Avadanii, qemu-devel@nongnu.org Cc: Jeff Cody, Markus Armbruster, svc-armband [-- Attachment #1: Type: text/plain, Size: 2454 bytes --] On 03/30/2017 08:22 PM, Eric Blake wrote: > On 03/30/2017 08:05 PM, Alexandru Avadanii wrote: >> c7cacb3e7a2e9fdf929c993b98268e4179147cbb is the first bad commit >> block/rbd: parse all options via bdrv_parse_filename > > Yep, my bisect finished about 2 minutes after your email on the same > spot. I'm working on a patch. I can reproduce the problem with a mere: > > ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio > -drive > 'file=rbd:volumes/volume-ea141b5c-cdb3-4765-910d-e7008b209a70:id=compute:key=AQAVkvxXAAAAABAA9ZxWFYdRmV+DSwKr7BKKXg==:auth_supported=cephx\;none:mon_host=192.168.1.2\:6789,format=raw,if=none,id=drive-virtio-disk0,serial=ea141b5c-cdb3-4765-910d-e7008b209a70,cache=writeback' > > the good behavior (on my setup) just hangs trying to connect to a > non-existent machine, the bad behavior gets rather-badly misparsed > (splitting the escaped : in the host:port portion as if the port were > the next key-value pair) resulting in an instant error message. I don't > have an actual RBD setup for testing the fix, but will cc you on the > patch that I propose once I have something. I found the culprit, but the patch is taking me longer. We are unescaping key="mon_host", value="192.168.1.2\:6789" when we first parse the key/value pairs in qemu_rbd_parse_filename(), then we slam the unescaped values back into a single string with ':' separators resulting in "mon_host=192.168.1.2:6789" for stuffing through the QDict, then we pass that string BACK through qemu_rbd_next_tok() inside qemu_rbd_set_keypairs(), and since we no longer have the \: escape, we are trying to treat 6789 as a new key on the second pass. Moral of the story: don't parse stuff twice. My patch will be to use a QList instead of a QString for the hidden "=keyvalue-pairs" object that we use to pass things around between parsing the filename and actually passing parameters to RBD, matching the fact that we already have this telling comment: /* FIXME: This is pretty ugly, and not the right way to do this. * These should be contained in a structure, and then * passed explicitly as individual key/value pairs to * rados. Consider this legacy code that needs to be * updated. */ -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 604 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [Qemu-devel] rbd: Possible regression in 2.9 RCs 2017-03-31 2:08 ` Eric Blake @ 2017-03-31 3:54 ` Jeff Cody 0 siblings, 0 replies; 7+ messages in thread From: Jeff Cody @ 2017-03-31 3:54 UTC (permalink / raw) To: Eric Blake Cc: Alexandru Avadanii, qemu-devel@nongnu.org, Markus Armbruster, svc-armband On Thu, Mar 30, 2017 at 09:08:20PM -0500, Eric Blake wrote: > On 03/30/2017 08:22 PM, Eric Blake wrote: > > On 03/30/2017 08:05 PM, Alexandru Avadanii wrote: > >> c7cacb3e7a2e9fdf929c993b98268e4179147cbb is the first bad commit > >> block/rbd: parse all options via bdrv_parse_filename > > > > Yep, my bisect finished about 2 minutes after your email on the same > > spot. I'm working on a patch. I can reproduce the problem with a mere: > > > > ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio > > -drive > > 'file=rbd:volumes/volume-ea141b5c-cdb3-4765-910d-e7008b209a70:id=compute:key=AQAVkvxXAAAAABAA9ZxWFYdRmV+DSwKr7BKKXg==:auth_supported=cephx\;none:mon_host=192.168.1.2\:6789,format=raw,if=none,id=drive-virtio-disk0,serial=ea141b5c-cdb3-4765-910d-e7008b209a70,cache=writeback' > > > > the good behavior (on my setup) just hangs trying to connect to a > > non-existent machine, the bad behavior gets rather-badly misparsed > > (splitting the escaped : in the host:port portion as if the port were > > the next key-value pair) resulting in an instant error message. I don't > > have an actual RBD setup for testing the fix, but will cc you on the > > patch that I propose once I have something. > > I found the culprit, but the patch is taking me longer. > > We are unescaping key="mon_host", value="192.168.1.2\:6789" when we > first parse the key/value pairs in qemu_rbd_parse_filename(), then we > slam the unescaped values back into a single string with ':' separators > resulting in "mon_host=192.168.1.2:6789" for stuffing through the QDict, > then we pass that string BACK through qemu_rbd_next_tok() inside > qemu_rbd_set_keypairs(), and since we no longer have the \: escape, we > are trying to treat 6789 as a new key on the second pass. Moral of the > story: don't parse stuff twice. > > My patch will be to use a QList instead of a QString for the hidden > "=keyvalue-pairs" object that we use to pass things around between > parsing the filename and actually passing parameters to RBD, matching > the fact that we already have this telling comment: > > /* FIXME: This is pretty ugly, and not the right way to do this. > * These should be contained in a structure, and then > * passed explicitly as individual key/value pairs to > * rados. Consider this legacy code that needs to be > * updated. */ Ugh. Yep, I just got done coming to the same conclusion, only to hop on the email list to see that you did as well. A QList sounds like a good choice. Jeff ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2017-03-31 3:55 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-03-30 23:42 [Qemu-devel] rbd: Possible regression in 2.9 RCs Alexandru Avadanii 2017-03-31 0:39 ` Eric Blake 2017-03-31 0:52 ` Alexandru Avadanii 2017-03-31 1:05 ` Alexandru Avadanii 2017-03-31 1:22 ` Eric Blake 2017-03-31 2:08 ` Eric Blake 2017-03-31 3:54 ` Jeff Cody
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).