* [Qemu-devel] [RFC] block: Removed coroutine ownership assumption @ 2012-06-22 6:44 Peter A. G. Crosthwaite 2012-06-22 7:49 ` Kevin Wolf 2012-06-22 7:50 ` Jan Kiszka 0 siblings, 2 replies; 38+ messages in thread From: Peter A. G. Crosthwaite @ 2012-06-22 6:44 UTC (permalink / raw) To: qemu-devel, kwolf, stefanha Cc: peter.crosthwaite, edgar.iglesias, john.williams The block layer assumes that it is the only user of coroutines - The qemu_in_coroutine() is used to determine if a function is in one of the block layers coroutines, which is flawed. I.E. If a client (e.g. a device or a machine model) of the block layer uses couroutine itself, the block layer will identify the callers coroutines as its own, and may falsely yield the calling coroutine (instead of creating its own to yield). AFAICT, there are no conflicts in the QEMU master here yet, but its kind of an issue, as anyone who comes along and used coroutines and the block layer together is going to run into some very obscure and hard to debug race conditions. Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com> --- block.c | 8 ++++---- 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index 0acdcac..b50af15 100644 --- a/block.c +++ b/block.c @@ -380,7 +380,7 @@ int bdrv_create(BlockDriver *drv, const char* filename, return -ENOTSUP; } - if (qemu_in_coroutine()) { + if (0) { /* Fast-path if already in coroutine context */ bdrv_create_co_entry(&cco); } else { @@ -1590,7 +1590,7 @@ static int bdrv_rw_co(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, bdrv_io_limits_disable(bs); } - if (qemu_in_coroutine()) { + if (0) { /* Fast-path if already in coroutine context */ bdrv_rw_co_entry(&rwco); } else { @@ -3813,7 +3813,7 @@ int bdrv_flush(BlockDriverState *bs) .ret = NOT_DONE, }; - if (qemu_in_coroutine()) { + if (0) { /* Fast-path if already in coroutine context */ bdrv_flush_co_entry(&rwco); } else { @@ -3874,7 +3874,7 @@ int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors) .ret = NOT_DONE, }; - if (qemu_in_coroutine()) { + if (0) { /* Fast-path if already in coroutine context */ bdrv_discard_co_entry(&rwco); } else { -- 1.7.3.2 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 6:44 [Qemu-devel] [RFC] block: Removed coroutine ownership assumption Peter A. G. Crosthwaite @ 2012-06-22 7:49 ` Kevin Wolf 2012-06-22 8:20 ` Peter Crosthwaite 2012-06-22 7:50 ` Jan Kiszka 1 sibling, 1 reply; 38+ messages in thread From: Kevin Wolf @ 2012-06-22 7:49 UTC (permalink / raw) To: Peter A. G. Crosthwaite Cc: edgar.iglesias, qemu-devel, john.williams, stefanha Am 22.06.2012 08:44, schrieb Peter A. G. Crosthwaite: > The block layer assumes that it is the only user of coroutines - > The qemu_in_coroutine() is used to determine if a function is in one of the > block layers coroutines, which is flawed. I.E. If a client (e.g. a device or > a machine model) of the block layer uses couroutine itself, the block layer > will identify the callers coroutines as its own, and may falsely yield the > calling coroutine (instead of creating its own to yield). > > AFAICT, there are no conflicts in the QEMU master here yet, but its kind of an > issue, as anyone who comes along and used coroutines and the block layer > together is going to run into some very obscure and hard to debug race > conditions. > > Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com> What does your coroutine caller look like that this is a problem? Does it make assumptions about the number of yields or anything like that? The assumption here is not that the block layer owns the coroutine, but that any code running in a coroutine can yield at any time as long at it makes sure that eventually the coroutine is reentered. Just like if you were running in a thread, you certainly wouldn't assume that the block layer has to create its own thread if it could get preempted, would you? Can you post some example code that explains the race conditions you're talking about? Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 7:49 ` Kevin Wolf @ 2012-06-22 8:20 ` Peter Crosthwaite 2012-06-22 8:31 ` Peter Maydell 2012-06-22 8:53 ` Kevin Wolf 0 siblings, 2 replies; 38+ messages in thread From: Peter Crosthwaite @ 2012-06-22 8:20 UTC (permalink / raw) To: Kevin Wolf; +Cc: edgar.iglesias, qemu-devel, john.williams, stefanha On Fri, Jun 22, 2012 at 5:49 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 22.06.2012 08:44, schrieb Peter A. G. Crosthwaite: >> The block layer assumes that it is the only user of coroutines - >> The qemu_in_coroutine() is used to determine if a function is in one of the >> block layers coroutines, which is flawed. I.E. If a client (e.g. a device or >> a machine model) of the block layer uses couroutine itself, the block layer >> will identify the callers coroutines as its own, and may falsely yield the >> calling coroutine (instead of creating its own to yield). >> >> AFAICT, there are no conflicts in the QEMU master here yet, but its kind of an >> issue, as anyone who comes along and used coroutines and the block layer >> together is going to run into some very obscure and hard to debug race >> conditions. >> >> Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com> > > What does your coroutine caller look like that this is a problem? Its a machine model that instantiated some block devices concurrently. Theres some chicken-and-egg issues with the instantiation such that they have the happen concurrently. One device instantiates a block device (pflash_cfi_01) from coroutine context. block then check qemu_in_coroutine() which returns true so it uses my coroutine for its inner workings, whereas if it were a normal machine model it would kick of its own coroutine to do its block stuff. Does > it make assumptions about the number of yields or anything like that? Yes it does, but I thought that was the point of coroutines vs threads? coroutines you control yield behaviour explicitly whearas thread you cant? > > The assumption here is not that the block layer owns the coroutine, but > that any code running in a coroutine can yield Yield to who tho? A coroutine yield transfers control back to the context that spawned it. For correct functionality, the block layer should yielf back to whatever context called it no? Which would mean that the the block layer when yielding should transfer back to pflash_cfi_01_register. But because it was called from a coroutine in the first place, it yields control all the way back my machine model (which then breaks). at any time as long at it > makes sure that eventually the coroutine is reentered. Just like if you > were running in a thread, you certainly wouldn't assume that the block > layer has to create its own thread if it could get preempted, would you? > > Can you post some example code that explains the race conditions you're > talking about? > > Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 8:20 ` Peter Crosthwaite @ 2012-06-22 8:31 ` Peter Maydell 2012-06-22 8:34 ` Stefan Hajnoczi 2012-06-22 8:53 ` Kevin Wolf 1 sibling, 1 reply; 38+ messages in thread From: Peter Maydell @ 2012-06-22 8:31 UTC (permalink / raw) To: Peter Crosthwaite Cc: Kevin Wolf, edgar.iglesias, qemu-devel, john.williams, stefanha On 22 June 2012 09:20, Peter Crosthwaite <peter.crosthwaite@petalogix.com> wrote: > Its a machine model that instantiated some block devices concurrently. > Theres some chicken-and-egg issues with the instantiation such that > they have the happen concurrently. One device instantiates a block > device (pflash_cfi_01) from coroutine context. block then check > qemu_in_coroutine() which returns true so it uses my coroutine for its > inner workings, whereas if it were a normal machine model it would > kick of its own coroutine to do its block stuff. I think the problem here is trying to instantiate bits of the machine model inside coroutines -- that just sounds wrong to me. Model and device instantiate should be a straightforward single threaded bit of code; if it isn't then something needs to be straightened out so you don't have your chicken-and-egg problem. -- PMM ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 8:31 ` Peter Maydell @ 2012-06-22 8:34 ` Stefan Hajnoczi 0 siblings, 0 replies; 38+ messages in thread From: Stefan Hajnoczi @ 2012-06-22 8:34 UTC (permalink / raw) To: Peter Maydell Cc: Kevin Wolf, stefanha, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams On Fri, Jun 22, 2012 at 9:31 AM, Peter Maydell <peter.maydell@linaro.org> wrote: > On 22 June 2012 09:20, Peter Crosthwaite > <peter.crosthwaite@petalogix.com> wrote: >> Its a machine model that instantiated some block devices concurrently. >> Theres some chicken-and-egg issues with the instantiation such that >> they have the happen concurrently. One device instantiates a block >> device (pflash_cfi_01) from coroutine context. block then check >> qemu_in_coroutine() which returns true so it uses my coroutine for its >> inner workings, whereas if it were a normal machine model it would >> kick of its own coroutine to do its block stuff. > > I think the problem here is trying to instantiate bits of the > machine model inside coroutines -- that just sounds wrong to me. > Model and device instantiate should be a straightforward single > threaded bit of code; if it isn't then something needs to be > straightened out so you don't have your chicken-and-egg problem. Yes, I agree with Peter here. Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 8:20 ` Peter Crosthwaite 2012-06-22 8:31 ` Peter Maydell @ 2012-06-22 8:53 ` Kevin Wolf 2012-06-22 10:59 ` Peter Crosthwaite 1 sibling, 1 reply; 38+ messages in thread From: Kevin Wolf @ 2012-06-22 8:53 UTC (permalink / raw) To: Peter Crosthwaite; +Cc: edgar.iglesias, qemu-devel, john.williams, stefanha Am 22.06.2012 10:20, schrieb Peter Crosthwaite: > On Fri, Jun 22, 2012 at 5:49 PM, Kevin Wolf <kwolf@redhat.com> wrote: >> Am 22.06.2012 08:44, schrieb Peter A. G. Crosthwaite: >>> The block layer assumes that it is the only user of coroutines - >>> The qemu_in_coroutine() is used to determine if a function is in one of the >>> block layers coroutines, which is flawed. I.E. If a client (e.g. a device or >>> a machine model) of the block layer uses couroutine itself, the block layer >>> will identify the callers coroutines as its own, and may falsely yield the >>> calling coroutine (instead of creating its own to yield). >>> >>> AFAICT, there are no conflicts in the QEMU master here yet, but its kind of an >>> issue, as anyone who comes along and used coroutines and the block layer >>> together is going to run into some very obscure and hard to debug race >>> conditions. >>> >>> Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com> >> >> What does your coroutine caller look like that this is a problem? > > Its a machine model that instantiated some block devices concurrently. > Theres some chicken-and-egg issues with the instantiation such that > they have the happen concurrently. One device instantiates a block > device (pflash_cfi_01) from coroutine context. block then check > qemu_in_coroutine() which returns true so it uses my coroutine for its > inner workings, whereas if it were a normal machine model it would > kick of its own coroutine to do its block stuff. > > Does >> it make assumptions about the number of yields or anything like that? > > Yes it does, but I thought that was the point of coroutines vs > threads? coroutines you control yield behaviour explicitly whearas > thread you cant? Well, I can see your point, although today no code in qemu makes use of the property that the caller could in theory know where the coroutine yields. I think it's also dangerous because there is a non-obvious dependency of the caller on the internals of the coroutine. A simple innocent looking refactoring of the coroutine might break assumptions that are made here. Other code in qemu that uses coroutines only makes use of the fact that coroutines can only be interrupted at known points, so synchronisation between multiple coroutines becomes easier. Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 8:53 ` Kevin Wolf @ 2012-06-22 10:59 ` Peter Crosthwaite [not found] ` <CAEgOgz4+5FsFUZT796chTADOXcRps0+74T4whfmdEsh_dO96VA@mail.gmail.com> 0 siblings, 1 reply; 38+ messages in thread From: Peter Crosthwaite @ 2012-06-22 10:59 UTC (permalink / raw) To: Kevin Wolf; +Cc: edgar.iglesias, qemu-devel, john.williams, stefanha On Fri, Jun 22, 2012 at 6:53 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 22.06.2012 10:20, schrieb Peter Crosthwaite: >> On Fri, Jun 22, 2012 at 5:49 PM, Kevin Wolf <kwolf@redhat.com> wrote: >>> Am 22.06.2012 08:44, schrieb Peter A. G. Crosthwaite: >>>> The block layer assumes that it is the only user of coroutines - >>>> The qemu_in_coroutine() is used to determine if a function is in one of the >>>> block layers coroutines, which is flawed. I.E. If a client (e.g. a device or >>>> a machine model) of the block layer uses couroutine itself, the block layer >>>> will identify the callers coroutines as its own, and may falsely yield the >>>> calling coroutine (instead of creating its own to yield). >>>> >>>> AFAICT, there are no conflicts in the QEMU master here yet, but its kind of an >>>> issue, as anyone who comes along and used coroutines and the block layer >>>> together is going to run into some very obscure and hard to debug race >>>> conditions. >>>> >>>> Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com> >>> >>> What does your coroutine caller look like that this is a problem? >> >> Its a machine model that instantiated some block devices concurrently. >> Theres some chicken-and-egg issues with the instantiation such that >> they have the happen concurrently. One device instantiates a block >> device (pflash_cfi_01) from coroutine context. block then check >> qemu_in_coroutine() which returns true so it uses my coroutine for its >> inner workings, whereas if it were a normal machine model it would >> kick of its own coroutine to do its block stuff. >> >> Does >>> it make assumptions about the number of yields or anything like that? >> >> Yes it does, but I thought that was the point of coroutines vs >> threads? coroutines you control yield behaviour explicitly whearas >> thread you cant? > > Well, I can see your point, although today no code in qemu makes use of > the property that the caller could in theory know where the coroutine > yields. I think it's also dangerous because there is a non-obvious > dependency of the caller on the internals of the coroutine. A simple > innocent looking refactoring of the coroutine might break assumptions > that are made here. > > Other code in qemu that uses coroutines only makes use of the fact that > coroutines can only be interrupted at known points, So within the block layer, will the block coroutines yield anywhere or only at a qemu_coroutine_yield() site? Would the block layer break if a couroutine could be pre-empted by the OS anywhere? So lets continue this to an example of multiple clients of qemu-coroutine. The block layer is one client. It defines three possible yields points (qemu_co_yield() sites) in block.c. Lets call them A,B and C. The co-routine policy is that your thread of control will not be pre-empted until locations A, B or C are reached (I.E. coroutines can only be interrupted at known points). Alls fair and it works. Now here is where it breaks. I have a device or machine model or whatever (lets call it foo) that uses co-routines. It decides that it wants its coroutines to stop at only at points D,E and F for ease of synchronization. If those coroutines however make calls into to the block layer, the block layer will think its in its own co-routine and potentially yield at any of points A,B and C. Thing is, my foo system doesn't care about the block layer implementation, so it shouldnt have awareness of the fact it used co-routines too. But because it talks to block, it can get yielded as any call into the block API which is awkward considering the point of coroutines is they can only be interrupted at known points. You have essentially defeated the purpose or coroutines in the first place. Foo's coroutines are behaving like preemptible threads (while blocks are behaving like coroutines). The problem is solved with a simple piece of policy - dont yield someone elses co-routine. Thats this patch. Note that this is an RFC intended to start discussion - it needs a real implementation. Something along the lines of keeping track of the block layer coroutines and checking if in one of those specifically, rather then a blanket call to qemu_in_coroutine(). But I have this patch in my workarounds branch until we figure out a real solution. Regards, Peter so synchronisation > between multiple coroutines becomes easier. > > Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <CAEgOgz4+5FsFUZT796chTADOXcRps0+74T4whfmdEsh_dO96VA@mail.gmail.com>]
[parent not found: <CAJSP0QWVJTb9jPZC_mdWpd4OwLz4VOuxGBZ_=2M4HNeexEC96Q@mail.gmail.com>]
[parent not found: <CAFEAcA_cX_jxZMSjjT=yBA1Hmf2VpWbGyDBZ8z9mqq5rVNsWWw@mail.gmail.com>]
[parent not found: <CAJSP0QV=BsRhdD_tVE9Cav-bhGiF-R3+Ab2aTtun6nSoPh0EmQ@mail.gmail.com>]
[parent not found: <m3vcidw3v3.fsf@blackfin.pond.sub.org>]
[parent not found: <CAEgOgz6Mai7PvBkHwN0EjhFsAFMU8W=V1B1X0ZLN3c1YHRaWKA@mail.gmail.com>]
[parent not found: <CAJSP0QWJcuEOmxhoAceqU2WYQkGT+fsizf-kdx_irq97j3pw4Q@mail.gmail.com>]
[parent not found: <CAEgOgz51jauHzvEk0Pk+oNQ0qPekjKatvNivv4vxQsQR_6nOVQ@mail.gmail.com>]
[parent not found: <CAJSP0QXnaw2HV7M+U=0S-ApGGnrGtt1w0P5w5gpmv7h-7bMs9g@mail.gmail.com>]
[parent not found: <CAEgOgz65h2baE0ufvSgfow-B5fGVwJKgNwsf3C2r65HNGdiQxg@mail.gmail.com>]
[parent not found: <4FED6638.90703@redhat.com>]
[parent not found: <CAEgOgz6sUREnwNuiSM=344ZTNq_4xMJDFU29Sn+6dTeVww4rhA@mail.gmail.com>]
[parent not found: <m3y5n61hl5.fsf@blackfin.pond.sub.org>]
[parent not found: <CAEgOgz7oPPsMexuzsYwc2LOGGOC4EM9NvHjXAp0TT2T8kOyirQ@mail.gmail.com>]
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption [not found] ` <CAEgOgz7oPPsMexuzsYwc2LOGGOC4EM9NvHjXAp0TT2T8kOyirQ@mail.gmail.com> @ 2012-07-02 8:50 ` Stefan Hajnoczi 2012-07-02 8:57 ` Peter Crosthwaite 0 siblings, 1 reply; 38+ messages in thread From: Stefan Hajnoczi @ 2012-07-02 8:50 UTC (permalink / raw) To: Peter Crosthwaite Cc: Kevin Wolf, Peter Maydell, stefanha, qemu-devel, Markus Armbruster, edgar.iglesias, john.williams On Fri, Jun 29, 2012 at 12:51 PM, Peter Crosthwaite <peter.crosthwaite@petalogix.com> wrote: > BTW Yielding is one thing, but the elephant in the room here is > resumption of the coroutine. When AIO yields my coroutine I i need to > talk to AIO to get it unyielded (Stefans propsoed edit to my code). > What happens when tommorow something in QOM, or a device model or is > implemented with coros too? how do I know who yielded my routines and > what API to call re-enter it? Going back to what Kevin said, the qemu_aio_wait() isn't block layer specific and there will never be a requirement to call any other magic functions. QEMU is event-driven and you must pump events. If you call into another subsystem, be prepared to pump events so that I/O can make progress. It's an assumption that is so central to QEMU architecture that I don't see it as a problem. Once the main loop is running then the event loop is taken care of for you. But during startup you're in a different environment and need to pump events yourself. If we drop bdrv_read()/bdrv_write() this won't change. You'll have to call bdrv_co_readv()/bdrv_co_writev() and pump events. Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 8:50 ` Stefan Hajnoczi @ 2012-07-02 8:57 ` Peter Crosthwaite 2012-07-02 9:04 ` Kevin Wolf 0 siblings, 1 reply; 38+ messages in thread From: Peter Crosthwaite @ 2012-07-02 8:57 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Kevin Wolf, Peter Maydell, stefanha, qemu-devel, Markus Armbruster, edgar.iglesias, john.williams On Mon, Jul 2, 2012 at 6:50 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: > On Fri, Jun 29, 2012 at 12:51 PM, Peter Crosthwaite > <peter.crosthwaite@petalogix.com> wrote: >> BTW Yielding is one thing, but the elephant in the room here is >> resumption of the coroutine. When AIO yields my coroutine I i need to >> talk to AIO to get it unyielded (Stefans propsoed edit to my code). >> What happens when tommorow something in QOM, or a device model or is >> implemented with coros too? how do I know who yielded my routines and >> what API to call re-enter it? > > Going back to what Kevin said, the qemu_aio_wait() isn't block layer > specific and there will never be a requirement to call any other magic > functions. > > QEMU is event-driven and you must pump events. If you call into > another subsystem, be prepared to pump events so that I/O can make > progress. It's an assumption that is so central to QEMU architecture > that I don't see it as a problem. > > Once the main loop is running then the event loop is taken care of for > you. But during startup you're in a different environment and need to > pump events yourself. > > If we drop bdrv_read()/bdrv_write() this won't change. You'll have to > call bdrv_co_readv()/bdrv_co_writev() and pump events. > Just tracing bdrv_aio_read(), It bypasses the fastpath logic: . So a conversion of pflash to bdrv_aio_readv is a possible solution here. bdrv_aio_read -> bdrv_co_aio_rw_vector : static BlockDriverAIOCB *bdrv_co_aio_rw_vector (..) { Coroutine *co; BlockDriverAIOCBCoroutine *acb; ... co = qemu_coroutine_create(bdrv_co_do_rw); qemu_coroutine_enter(co, acb); return &acb->common; } No conditional on the qemu_coroutine_create. So it will always create a new coroutine for its work which will solve my problem. All I need to do is pump events once at the end of machine model creation. Any my coroutines will never yield or get queued by block/AIO. Sound like a solution? Regards, Peter > Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 8:57 ` Peter Crosthwaite @ 2012-07-02 9:04 ` Kevin Wolf 2012-07-02 9:42 ` Peter Crosthwaite 0 siblings, 1 reply; 38+ messages in thread From: Kevin Wolf @ 2012-07-02 9:04 UTC (permalink / raw) To: Peter Crosthwaite Cc: Peter Maydell, stefanha, Stefan Hajnoczi, qemu-devel, Markus Armbruster, edgar.iglesias, john.williams Am 02.07.2012 10:57, schrieb Peter Crosthwaite: > On Mon, Jul 2, 2012 at 6:50 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: >> On Fri, Jun 29, 2012 at 12:51 PM, Peter Crosthwaite >> <peter.crosthwaite@petalogix.com> wrote: >>> BTW Yielding is one thing, but the elephant in the room here is >>> resumption of the coroutine. When AIO yields my coroutine I i need to >>> talk to AIO to get it unyielded (Stefans propsoed edit to my code). >>> What happens when tommorow something in QOM, or a device model or is >>> implemented with coros too? how do I know who yielded my routines and >>> what API to call re-enter it? >> >> Going back to what Kevin said, the qemu_aio_wait() isn't block layer >> specific and there will never be a requirement to call any other magic >> functions. >> >> QEMU is event-driven and you must pump events. If you call into >> another subsystem, be prepared to pump events so that I/O can make >> progress. It's an assumption that is so central to QEMU architecture >> that I don't see it as a problem. >> >> Once the main loop is running then the event loop is taken care of for >> you. But during startup you're in a different environment and need to >> pump events yourself. >> >> If we drop bdrv_read()/bdrv_write() this won't change. You'll have to >> call bdrv_co_readv()/bdrv_co_writev() and pump events. >> > > Just tracing bdrv_aio_read(), It bypasses the fastpath logic: > > . So a conversion of pflash to bdrv_aio_readv is a possible solution here. > > bdrv_aio_read -> bdrv_co_aio_rw_vector : > > static BlockDriverAIOCB *bdrv_co_aio_rw_vector (..) { > Coroutine *co; > BlockDriverAIOCBCoroutine *acb; > > ... > > co = qemu_coroutine_create(bdrv_co_do_rw); > qemu_coroutine_enter(co, acb); > > return &acb->common; > } > > No conditional on the qemu_coroutine_create. So it will always create > a new coroutine for its work which will solve my problem. All I need > to do is pump events once at the end of machine model creation. Any my > coroutines will never yield or get queued by block/AIO. Sound like a > solution? If you don't need the read data in your initialisation code, then yes, that would work. bdrv_aio_* will always create a new coroutine. I just assumed that you wanted to use the data right away, and then using the AIO functions wouldn't have made much sense. Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 9:04 ` Kevin Wolf @ 2012-07-02 9:42 ` Peter Crosthwaite 2012-07-02 10:01 ` Kevin Wolf 0 siblings, 1 reply; 38+ messages in thread From: Peter Crosthwaite @ 2012-07-02 9:42 UTC (permalink / raw) To: Kevin Wolf Cc: Peter Maydell, stefanha, Stefan Hajnoczi, qemu-devel, Markus Armbruster, edgar.iglesias, john.williams On Mon, Jul 2, 2012 at 7:04 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 02.07.2012 10:57, schrieb Peter Crosthwaite: >> On Mon, Jul 2, 2012 at 6:50 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote: >>> On Fri, Jun 29, 2012 at 12:51 PM, Peter Crosthwaite >>> <peter.crosthwaite@petalogix.com> wrote: >>>> BTW Yielding is one thing, but the elephant in the room here is >>>> resumption of the coroutine. When AIO yields my coroutine I i need to >>>> talk to AIO to get it unyielded (Stefans propsoed edit to my code). >>>> What happens when tommorow something in QOM, or a device model or is >>>> implemented with coros too? how do I know who yielded my routines and >>>> what API to call re-enter it? >>> >>> Going back to what Kevin said, the qemu_aio_wait() isn't block layer >>> specific and there will never be a requirement to call any other magic >>> functions. >>> >>> QEMU is event-driven and you must pump events. If you call into >>> another subsystem, be prepared to pump events so that I/O can make >>> progress. It's an assumption that is so central to QEMU architecture >>> that I don't see it as a problem. >>> >>> Once the main loop is running then the event loop is taken care of for >>> you. But during startup you're in a different environment and need to >>> pump events yourself. >>> >>> If we drop bdrv_read()/bdrv_write() this won't change. You'll have to >>> call bdrv_co_readv()/bdrv_co_writev() and pump events. >>> >> >> Just tracing bdrv_aio_read(), It bypasses the fastpath logic: >> >> . So a conversion of pflash to bdrv_aio_readv is a possible solution here. >> >> bdrv_aio_read -> bdrv_co_aio_rw_vector : >> >> static BlockDriverAIOCB *bdrv_co_aio_rw_vector (..) { >> Coroutine *co; >> BlockDriverAIOCBCoroutine *acb; >> >> ... >> >> co = qemu_coroutine_create(bdrv_co_do_rw); >> qemu_coroutine_enter(co, acb); >> >> return &acb->common; >> } >> >> No conditional on the qemu_coroutine_create. So it will always create >> a new coroutine for its work which will solve my problem. All I need >> to do is pump events once at the end of machine model creation. Any my >> coroutines will never yield or get queued by block/AIO. Sound like a >> solution? > > If you don't need the read data in your initialisation code, definately not :) Just as long as the read data is there by the time the machine goes live. Whats the current policy with bdrv_read()ing from init functions anyway? Several devices in qemu have init functions that read the entire storage into a buffer (then the guest just talks to the buffer rather than the backing store). Pflash (pflash_cfi_01.c) is the device that is causing me interference here and it works exactly like this. If we make the bdrv_read() aio though, how do you ensure that it has completed before the guest talks to the device? Will this just happen at the end of machine_init anyways? Can we put a one liner in the machine init framework that pumps all AIO events then just mass covert all these bdrv_reads (in init functions) to bdrv_aio_read with a nop completion callback? then yes, > that would work. bdrv_aio_* will always create a new coroutine. I just > assumed that you wanted to use the data right away, and then using the > AIO functions wouldn't have made much sense. You'd get a small performance increase no? Your machine init continues on while your I/O happens rather than being synchronous so there is motivation beyond my situation. Regards, Peter > > Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 9:42 ` Peter Crosthwaite @ 2012-07-02 10:01 ` Kevin Wolf 2012-07-02 10:18 ` Peter Maydell ` (2 more replies) 0 siblings, 3 replies; 38+ messages in thread From: Kevin Wolf @ 2012-07-02 10:01 UTC (permalink / raw) To: Peter Crosthwaite Cc: Peter Maydell, stefanha, Stefan Hajnoczi, qemu-devel, Markus Armbruster, edgar.iglesias, john.williams Am 02.07.2012 11:42, schrieb Peter Crosthwaite: > On Mon, Jul 2, 2012 at 7:04 PM, Kevin Wolf <kwolf@redhat.com> wrote: >> Am 02.07.2012 10:57, schrieb Peter Crosthwaite: >>> No conditional on the qemu_coroutine_create. So it will always create >>> a new coroutine for its work which will solve my problem. All I need >>> to do is pump events once at the end of machine model creation. Any my >>> coroutines will never yield or get queued by block/AIO. Sound like a >>> solution? >> >> If you don't need the read data in your initialisation code, > > definately not :) Just as long as the read data is there by the time > the machine goes live. Whats the current policy with bdrv_read()ing > from init functions anyway? Several devices in qemu have init > functions that read the entire storage into a buffer (then the guest > just talks to the buffer rather than the backing store). Reading from block devices during device initialisation breaks migration, so I'd like to see it go away wherever possible. Reading in the whole image file doesn't sound like something for which a good excuse exists, you can do that as well during the first access. > Pflash (pflash_cfi_01.c) is the device that is causing me interference > here and it works exactly like this. If we make the bdrv_read() aio > though, how do you ensure that it has completed before the guest talks > to the device? Will this just happen at the end of machine_init > anyways? Can we put a one liner in the machine init framework that > pumps all AIO events then just mass covert all these bdrv_reads (in > init functions) to bdrv_aio_read with a nop completion callback? The initialisation function of the device can wait at its end for all AIOs to return. I wouldn't want to encourage more block layer use during the initialisation phase by supporting it in the infrastructure. > then yes, >> that would work. bdrv_aio_* will always create a new coroutine. I just >> assumed that you wanted to use the data right away, and then using the >> AIO functions wouldn't have made much sense. > > You'd get a small performance increase no? Your machine init continues > on while your I/O happens rather than being synchronous so there is > motivation beyond my situation. Yeah, as long as the next statement isn't "while (!returned) qemu_aio_wait();", which it is in the common case. Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 10:01 ` Kevin Wolf @ 2012-07-02 10:18 ` Peter Maydell 2012-07-02 10:44 ` Kevin Wolf 2012-07-02 10:18 ` Peter Crosthwaite 2012-07-12 13:42 ` Markus Armbruster 2 siblings, 1 reply; 38+ messages in thread From: Peter Maydell @ 2012-07-02 10:18 UTC (permalink / raw) To: Kevin Wolf Cc: stefanha, Stefan Hajnoczi, Markus Armbruster, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams On 2 July 2012 11:01, Kevin Wolf <kwolf@redhat.com> wrote: > Reading from block devices during device initialisation breaks > migration, so I'd like to see it go away wherever possible. Reading in > the whole image file doesn't sound like something for which a good > excuse exists, you can do that as well during the first access. It's much nicer to be able to produce an error message ("file doesn't exist", "file is too short for this flash device") at device startup rather than miles later on at first access, and pulling in a 64K file at startup is a simple implementation. Why complicate things by adding code for "if this is the first access then read in the file"? I would have thought migration would be simpler with a "read whole file at startup" implementation, because in that case the required migration state is always "this block of memory", rather than "sometimes this block of memory and sometimes a flag which says we haven't initialised the memory and will need to do a file read". -- PMM ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 10:18 ` Peter Maydell @ 2012-07-02 10:44 ` Kevin Wolf 2012-07-02 10:59 ` Peter Maydell 2012-07-02 11:12 ` Peter Crosthwaite 0 siblings, 2 replies; 38+ messages in thread From: Kevin Wolf @ 2012-07-02 10:44 UTC (permalink / raw) To: Peter Maydell Cc: stefanha, Stefan Hajnoczi, Markus Armbruster, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams Am 02.07.2012 12:18, schrieb Peter Maydell: > On 2 July 2012 11:01, Kevin Wolf <kwolf@redhat.com> wrote: >> Reading from block devices during device initialisation breaks >> migration, so I'd like to see it go away wherever possible. Reading in >> the whole image file doesn't sound like something for which a good >> excuse exists, you can do that as well during the first access. > > It's much nicer to be able to produce an error message ("file > doesn't exist", "file is too short for this flash device") at > device startup rather than miles later on at first access, "file doesn't exist" is an error that occurs for the backend (-drive if=none), not for the -device, so you shouldn't have to deal with that at all. > and pulling in a 64K file at startup is a simple implementation. > Why complicate things by adding code for "if this is the first > access then read in the file"? Because then it works. :-) Migration works more or less like this: 1. Destination creates device model based on command line 2. RAM is copied live, source keeps running 3. Source stops, device state is transferred 4. Destination starts running the VM Reading from a block device is meaningful the earliest in step 3, because at earlier points the guest still runs on the source and can overwrite the data on the block device. If you're reading in the whole image, you're doing it in step 1, so your data will be outdated by the time the migration completes. And this description doesn't even take things like cache coherency for image files into consideration. > I would have thought migration would be simpler with a "read > whole file at startup" implementation, because in that case > the required migration state is always "this block of memory", > rather than "sometimes this block of memory and sometimes a > flag which says we haven't initialised the memory and will > need to do a file read". Oh, so you're actually migrating the whole content of the flash device instead of using the same file on the migration destination? What's the advantage over rereading the file on the destination? You still need access to the file there, don't you? The approach with migrating the block device content probably works for your 64k flash, but what about my hard disk? Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 10:44 ` Kevin Wolf @ 2012-07-02 10:59 ` Peter Maydell 2012-07-02 11:03 ` Peter Crosthwaite 2012-07-02 11:12 ` Peter Crosthwaite 1 sibling, 1 reply; 38+ messages in thread From: Peter Maydell @ 2012-07-02 10:59 UTC (permalink / raw) To: Kevin Wolf Cc: stefanha, Stefan Hajnoczi, Markus Armbruster, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams On 2 July 2012 11:44, Kevin Wolf <kwolf@redhat.com> wrote: > Am 02.07.2012 12:18, schrieb Peter Maydell: >> Why complicate things by adding code for "if this is the first >> access then read in the file"? > > Because then it works. :-) > > Migration works more or less like this: > > 1. Destination creates device model based on command line > 2. RAM is copied live, source keeps running > 3. Source stops, device state is transferred > 4. Destination starts running the VM > > Reading from a block device is meaningful the earliest in step 3, > because at earlier points the guest still runs on the source and can > overwrite the data on the block device. If you're reading in the whole > image, you're doing it in step 1, so your data will be outdated by the > time the migration completes. For pflash_cfi01 migration will work because although we read the file (slightly pointlessly) in step 1, we will get the correct contents transferred over in step 2, because we call vmstate_register_ram() in device init. We need to migrate flash contents like that anyway, to handle the case of "no backing file, flash starts empty and gets whatever the guest writes to it for as long as the guest is alive". > The approach with migrating the block device content probably works for > your 64k flash, but what about my hard disk? I'm not claiming this is a good approach for everything, just for some things. -- PMM ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 10:59 ` Peter Maydell @ 2012-07-02 11:03 ` Peter Crosthwaite 0 siblings, 0 replies; 38+ messages in thread From: Peter Crosthwaite @ 2012-07-02 11:03 UTC (permalink / raw) To: Peter Maydell Cc: Kevin Wolf, stefanha, Stefan Hajnoczi, Markus Armbruster, qemu-devel, edgar.iglesias, john.williams On Mon, Jul 2, 2012 at 8:59 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > On 2 July 2012 11:44, Kevin Wolf <kwolf@redhat.com> wrote: >> Am 02.07.2012 12:18, schrieb Peter Maydell: >>> Why complicate things by adding code for "if this is the first >>> access then read in the file"? >> >> Because then it works. :-) >> >> Migration works more or less like this: >> >> 1. Destination creates device model based on command line >> 2. RAM is copied live, source keeps running >> 3. Source stops, device state is transferred >> 4. Destination starts running the VM >> >> Reading from a block device is meaningful the earliest in step 3, >> because at earlier points the guest still runs on the source and can >> overwrite the data on the block device. If you're reading in the whole >> image, you're doing it in step 1, so your data will be outdated by the >> time the migration completes. > > For pflash_cfi01 migration will work because although we read > the file (slightly pointlessly) in step 1, we will get the correct > contents transferred over in step 2, because we call vmstate_register_ram() > in device init. > > We need to migrate flash contents like that anyway, to handle the > case of "no backing file, Yeah these little flashes also tend to support a mode where there is no backing (bdrv) file at all. If all your doing is testing smoke testing a driver then you can boot with no -drive args and the device model knows to just implement the buffer and init it to something sensible. If there needs to be a device-size buffer to support this behaviour in non-bdrv mode then in might as well be there for bdrv mode as well. Regards, Peter flash starts empty and gets whatever the > guest writes to it for as long as the guest is alive". > >> The approach with migrating the block device content probably works for >> your 64k flash, but what about my hard disk? > > I'm not claiming this is a good approach for everything, just for > some things. > > -- PMM ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 10:44 ` Kevin Wolf 2012-07-02 10:59 ` Peter Maydell @ 2012-07-02 11:12 ` Peter Crosthwaite 2012-07-02 11:19 ` Kevin Wolf 1 sibling, 1 reply; 38+ messages in thread From: Peter Crosthwaite @ 2012-07-02 11:12 UTC (permalink / raw) To: Kevin Wolf Cc: Peter Maydell, stefanha, Stefan Hajnoczi, Markus Armbruster, qemu-devel, edgar.iglesias, john.williams On Mon, Jul 2, 2012 at 8:44 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 02.07.2012 12:18, schrieb Peter Maydell: >> On 2 July 2012 11:01, Kevin Wolf <kwolf@redhat.com> wrote: >>> Reading from block devices during device initialisation breaks >>> migration, so I'd like to see it go away wherever possible. Reading in >>> the whole image file doesn't sound like something for which a good >>> excuse exists, you can do that as well during the first access. >> >> It's much nicer to be able to produce an error message ("file >> doesn't exist", "file is too short for this flash device") at >> device startup rather than miles later on at first access, > > "file doesn't exist" is an error that occurs for the backend (-drive > if=none), not for the -device, so you shouldn't have to deal with that > at all. > >> and pulling in a 64K file at startup is a simple implementation. >> Why complicate things by adding code for "if this is the first >> access then read in the file"? > > Because then it works. :-) > > Migration works more or less like this: > > 1. Destination creates device model based on command line > 2. RAM is copied live, source keeps running > 3. Source stops, device state is transferred > 4. Destination starts running the VM > > Reading from a block device is meaningful the earliest in step 3, > because at earlier points the guest still runs on the source and can > overwrite the data on the block device. If you're reading in the whole > image, you're doing it in step 1, so your data will be outdated by the > time the migration completes. > I feel like theres a "two birds with one stone" solution here, by making bdrv_aio_read just yield until step 3? Just an if (..) somewhere in the bdrv framework that says "while not ready for migration qemu_coroutine_yield()". You implement the postponed bdrv_read that you want, but you also get rid of the bdrv_read()s that everyone hates without having the rewrite all the small flashes with if-first-read-load-all logic. Regards, Peter > And this description doesn't even take things like cache coherency for > image files into consideration. > >> I would have thought migration would be simpler with a "read >> whole file at startup" implementation, because in that case >> the required migration state is always "this block of memory", >> rather than "sometimes this block of memory and sometimes a >> flag which says we haven't initialised the memory and will >> need to do a file read". > > Oh, so you're actually migrating the whole content of the flash device > instead of using the same file on the migration destination? What's the > advantage over rereading the file on the destination? You still need > access to the file there, don't you? > > The approach with migrating the block device content probably works for > your 64k flash, but what about my hard disk? > > Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 11:12 ` Peter Crosthwaite @ 2012-07-02 11:19 ` Kevin Wolf 2012-07-02 11:25 ` Peter Crosthwaite 0 siblings, 1 reply; 38+ messages in thread From: Kevin Wolf @ 2012-07-02 11:19 UTC (permalink / raw) To: Peter Crosthwaite Cc: Peter Maydell, stefanha, Stefan Hajnoczi, Markus Armbruster, qemu-devel, edgar.iglesias, john.williams Am 02.07.2012 13:12, schrieb Peter Crosthwaite: > On Mon, Jul 2, 2012 at 8:44 PM, Kevin Wolf <kwolf@redhat.com> wrote: >> Am 02.07.2012 12:18, schrieb Peter Maydell: >>> On 2 July 2012 11:01, Kevin Wolf <kwolf@redhat.com> wrote: >>>> Reading from block devices during device initialisation breaks >>>> migration, so I'd like to see it go away wherever possible. Reading in >>>> the whole image file doesn't sound like something for which a good >>>> excuse exists, you can do that as well during the first access. >>> >>> It's much nicer to be able to produce an error message ("file >>> doesn't exist", "file is too short for this flash device") at >>> device startup rather than miles later on at first access, >> >> "file doesn't exist" is an error that occurs for the backend (-drive >> if=none), not for the -device, so you shouldn't have to deal with that >> at all. >> >>> and pulling in a 64K file at startup is a simple implementation. >>> Why complicate things by adding code for "if this is the first >>> access then read in the file"? >> >> Because then it works. :-) >> >> Migration works more or less like this: >> >> 1. Destination creates device model based on command line >> 2. RAM is copied live, source keeps running >> 3. Source stops, device state is transferred >> 4. Destination starts running the VM >> >> Reading from a block device is meaningful the earliest in step 3, >> because at earlier points the guest still runs on the source and can >> overwrite the data on the block device. If you're reading in the whole >> image, you're doing it in step 1, so your data will be outdated by the >> time the migration completes. >> > > I feel like theres a "two birds with one stone" solution here, by > making bdrv_aio_read just yield until step 3? Just an if (..) > somewhere in the bdrv framework that says "while not ready for > migration qemu_coroutine_yield()". You implement the postponed > bdrv_read that you want, but you also get rid of the bdrv_read()s that > everyone hates without having the rewrite all the small flashes with > if-first-read-load-all logic. Or we could just have a second "late init" callback into the devices where such requests could be issued. That would feel a bit cleaner. Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 11:19 ` Kevin Wolf @ 2012-07-02 11:25 ` Peter Crosthwaite 0 siblings, 0 replies; 38+ messages in thread From: Peter Crosthwaite @ 2012-07-02 11:25 UTC (permalink / raw) To: Kevin Wolf Cc: Peter Maydell, stefanha, Stefan Hajnoczi, Markus Armbruster, qemu-devel, edgar.iglesias, john.williams On Mon, Jul 2, 2012 at 9:19 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 02.07.2012 13:12, schrieb Peter Crosthwaite: >> On Mon, Jul 2, 2012 at 8:44 PM, Kevin Wolf <kwolf@redhat.com> wrote: >>> Am 02.07.2012 12:18, schrieb Peter Maydell: >>>> On 2 July 2012 11:01, Kevin Wolf <kwolf@redhat.com> wrote: >>>>> Reading from block devices during device initialisation breaks >>>>> migration, so I'd like to see it go away wherever possible. Reading in >>>>> the whole image file doesn't sound like something for which a good >>>>> excuse exists, you can do that as well during the first access. >>>> >>>> It's much nicer to be able to produce an error message ("file >>>> doesn't exist", "file is too short for this flash device") at >>>> device startup rather than miles later on at first access, >>> >>> "file doesn't exist" is an error that occurs for the backend (-drive >>> if=none), not for the -device, so you shouldn't have to deal with that >>> at all. >>> >>>> and pulling in a 64K file at startup is a simple implementation. >>>> Why complicate things by adding code for "if this is the first >>>> access then read in the file"? >>> >>> Because then it works. :-) >>> >>> Migration works more or less like this: >>> >>> 1. Destination creates device model based on command line >>> 2. RAM is copied live, source keeps running >>> 3. Source stops, device state is transferred >>> 4. Destination starts running the VM >>> >>> Reading from a block device is meaningful the earliest in step 3, >>> because at earlier points the guest still runs on the source and can >>> overwrite the data on the block device. If you're reading in the whole >>> image, you're doing it in step 1, so your data will be outdated by the >>> time the migration completes. >>> >> >> I feel like theres a "two birds with one stone" solution here, by >> making bdrv_aio_read just yield until step 3? Just an if (..) >> somewhere in the bdrv framework that says "while not ready for >> migration qemu_coroutine_yield()". You implement the postponed >> bdrv_read that you want, but you also get rid of the bdrv_read()s that >> everyone hates without having the rewrite all the small flashes with >> if-first-read-load-all logic. > > Or we could just have a second "late init" callback into the devices > where such requests could be issued. That would feel a bit cleaner. > I disagree. Then device authors have to take into account all these rules about what gets inited when. If you have a single fn and the block layer just kicks off a coroutine that schedules the work for when it should happen then everything is clean an simple out in device model land. Hide the complexity from the devices and implement it in one place in the block layer rather than in N different devices. Regards, Peter > Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 10:01 ` Kevin Wolf 2012-07-02 10:18 ` Peter Maydell @ 2012-07-02 10:18 ` Peter Crosthwaite 2012-07-02 10:52 ` Kevin Wolf 2012-07-12 13:42 ` Markus Armbruster 2 siblings, 1 reply; 38+ messages in thread From: Peter Crosthwaite @ 2012-07-02 10:18 UTC (permalink / raw) To: Kevin Wolf Cc: Peter Maydell, stefanha, Stefan Hajnoczi, qemu-devel, Markus Armbruster, edgar.iglesias, john.williams On Mon, Jul 2, 2012 at 8:01 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 02.07.2012 11:42, schrieb Peter Crosthwaite: >> On Mon, Jul 2, 2012 at 7:04 PM, Kevin Wolf <kwolf@redhat.com> wrote: >>> Am 02.07.2012 10:57, schrieb Peter Crosthwaite: >>>> No conditional on the qemu_coroutine_create. So it will always create >>>> a new coroutine for its work which will solve my problem. All I need >>>> to do is pump events once at the end of machine model creation. Any my >>>> coroutines will never yield or get queued by block/AIO. Sound like a >>>> solution? >>> >>> If you don't need the read data in your initialisation code, >> >> definately not :) Just as long as the read data is there by the time >> the machine goes live. Whats the current policy with bdrv_read()ing >> from init functions anyway? Several devices in qemu have init >> functions that read the entire storage into a buffer (then the guest >> just talks to the buffer rather than the backing store). > > Reading from block devices during device initialisation breaks > migration, so I'd like to see it go away wherever possible. Reading in > the whole image file doesn't sound like something for which a good > excuse exists, Makes sense for small devices on embedded platforms. You end up with a very simple and clean device model. The approach is totally broken for HDDs but it does make some sense for the little embedded flashes where you can get away with caching the entire device storage in RAM for the lifetime of the system. > you can do that as well during the first access. > Kind of painful though to change this for a lot of existing devices. Its a reasonable policy for new devices, but can we just fix the init->bdrv_read() calls in place for the moment? >> Pflash (pflash_cfi_01.c) is the device that is causing me interference >> here and it works exactly like this. If we make the bdrv_read() aio >> though, how do you ensure that it has completed before the guest talks >> to the device? Will this just happen at the end of machine_init >> anyways? Can we put a one liner in the machine init framework that >> pumps all AIO events then just mass covert all these bdrv_reads (in >> init functions) to bdrv_aio_read with a nop completion callback? > > The initialisation function of the device can wait at its end for all > AIOs to return. You lose the performance increase discussed below however, as instead of the init function returning to continue on with the machine init, you block on disk IO. I wouldn't want to encourage more block layer use during > the initialisation phase by supporting it in the infrastructure. > >> then yes, >>> that would work. bdrv_aio_* will always create a new coroutine. I just >>> assumed that you wanted to use the data right away, and then using the >>> AIO functions wouldn't have made much sense. >> >> You'd get a small performance increase no? Your machine init continues >> on while your I/O happens rather than being synchronous so there is >> motivation beyond my situation. > > Yeah, as long as the next statement isn't "while (!returned) > qemu_aio_wait();", which it is in the common case. > > Kevin Regards, Peter ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 10:18 ` Peter Crosthwaite @ 2012-07-02 10:52 ` Kevin Wolf 2012-07-02 10:57 ` Peter Crosthwaite 0 siblings, 1 reply; 38+ messages in thread From: Kevin Wolf @ 2012-07-02 10:52 UTC (permalink / raw) To: Peter Crosthwaite Cc: Peter Maydell, stefanha, Stefan Hajnoczi, qemu-devel, Markus Armbruster, edgar.iglesias, john.williams Am 02.07.2012 12:18, schrieb Peter Crosthwaite: > On Mon, Jul 2, 2012 at 8:01 PM, Kevin Wolf <kwolf@redhat.com> wrote: >> Am 02.07.2012 11:42, schrieb Peter Crosthwaite: >>> On Mon, Jul 2, 2012 at 7:04 PM, Kevin Wolf <kwolf@redhat.com> wrote: >>>> Am 02.07.2012 10:57, schrieb Peter Crosthwaite: >>>>> No conditional on the qemu_coroutine_create. So it will always create >>>>> a new coroutine for its work which will solve my problem. All I need >>>>> to do is pump events once at the end of machine model creation. Any my >>>>> coroutines will never yield or get queued by block/AIO. Sound like a >>>>> solution? >>>> >>>> If you don't need the read data in your initialisation code, >>> >>> definately not :) Just as long as the read data is there by the time >>> the machine goes live. Whats the current policy with bdrv_read()ing >>> from init functions anyway? Several devices in qemu have init >>> functions that read the entire storage into a buffer (then the guest >>> just talks to the buffer rather than the backing store). >> >> Reading from block devices during device initialisation breaks >> migration, so I'd like to see it go away wherever possible. Reading in >> the whole image file doesn't sound like something for which a good >> excuse exists, > > Makes sense for small devices on embedded platforms. You end up with a > very simple and clean device model. The approach is totally broken for > HDDs but it does make some sense for the little embedded flashes where > you can get away with caching the entire device storage in RAM for the > lifetime of the system. It kind of works for read-only devices, it's just ugly there. With writeable devices it makes the VM unmigratable. >> you can do that as well during the first access. >> > > Kind of painful though to change this for a lot of existing devices. > Its a reasonable policy for new devices, but can we just fix the > init->bdrv_read() calls in place for the moment? Sure, but you asked for the policy and the policy is "only with good reasons". ;-) >>> Pflash (pflash_cfi_01.c) is the device that is causing me interference >>> here and it works exactly like this. If we make the bdrv_read() aio >>> though, how do you ensure that it has completed before the guest talks >>> to the device? Will this just happen at the end of machine_init >>> anyways? Can we put a one liner in the machine init framework that >>> pumps all AIO events then just mass covert all these bdrv_reads (in >>> init functions) to bdrv_aio_read with a nop completion callback? >> >> The initialisation function of the device can wait at its end for all >> AIOs to return. > > You lose the performance increase discussed below however, as instead > of the init function returning to continue on with the machine init, > you block on disk IO. Do you think it really matters? I guess it might if you have two such devices that take each a second or so to load, but otherwise there isn't much to parallelise. Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 10:52 ` Kevin Wolf @ 2012-07-02 10:57 ` Peter Crosthwaite 2012-07-02 11:04 ` Kevin Wolf 0 siblings, 1 reply; 38+ messages in thread From: Peter Crosthwaite @ 2012-07-02 10:57 UTC (permalink / raw) To: Kevin Wolf Cc: Peter Maydell, stefanha, Stefan Hajnoczi, qemu-devel, Markus Armbruster, edgar.iglesias, john.williams On Mon, Jul 2, 2012 at 8:52 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 02.07.2012 12:18, schrieb Peter Crosthwaite: >> On Mon, Jul 2, 2012 at 8:01 PM, Kevin Wolf <kwolf@redhat.com> wrote: >>> Am 02.07.2012 11:42, schrieb Peter Crosthwaite: >>>> On Mon, Jul 2, 2012 at 7:04 PM, Kevin Wolf <kwolf@redhat.com> wrote: >>>>> Am 02.07.2012 10:57, schrieb Peter Crosthwaite: >>>>>> No conditional on the qemu_coroutine_create. So it will always create >>>>>> a new coroutine for its work which will solve my problem. All I need >>>>>> to do is pump events once at the end of machine model creation. Any my >>>>>> coroutines will never yield or get queued by block/AIO. Sound like a >>>>>> solution? >>>>> >>>>> If you don't need the read data in your initialisation code, >>>> >>>> definately not :) Just as long as the read data is there by the time >>>> the machine goes live. Whats the current policy with bdrv_read()ing >>>> from init functions anyway? Several devices in qemu have init >>>> functions that read the entire storage into a buffer (then the guest >>>> just talks to the buffer rather than the backing store). >>> >>> Reading from block devices during device initialisation breaks >>> migration, so I'd like to see it go away wherever possible. Reading in >>> the whole image file doesn't sound like something for which a good >>> excuse exists, >> >> Makes sense for small devices on embedded platforms. You end up with a >> very simple and clean device model. The approach is totally broken for >> HDDs but it does make some sense for the little embedded flashes where >> you can get away with caching the entire device storage in RAM for the >> lifetime of the system. > > It kind of works for read-only devices, it's just ugly there. With > writeable devices it makes the VM unmigratable. > Isnt it just a simple case of syncing the buffer with the backing store on pre_save? Regards, Peter >>> you can do that as well during the first access. >>> >> >> Kind of painful though to change this for a lot of existing devices. >> Its a reasonable policy for new devices, but can we just fix the >> init->bdrv_read() calls in place for the moment? > > Sure, but you asked for the policy and the policy is "only with good > reasons". ;-) > >>>> Pflash (pflash_cfi_01.c) is the device that is causing me interference >>>> here and it works exactly like this. If we make the bdrv_read() aio >>>> though, how do you ensure that it has completed before the guest talks >>>> to the device? Will this just happen at the end of machine_init >>>> anyways? Can we put a one liner in the machine init framework that >>>> pumps all AIO events then just mass covert all these bdrv_reads (in >>>> init functions) to bdrv_aio_read with a nop completion callback? >>> >>> The initialisation function of the device can wait at its end for all >>> AIOs to return. >> >> You lose the performance increase discussed below however, as instead >> of the init function returning to continue on with the machine init, >> you block on disk IO. > > Do you think it really matters? I guess it might if you have two such > devices that take each a second or so to load, but otherwise there isn't > much to parallelise. > > Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 10:57 ` Peter Crosthwaite @ 2012-07-02 11:04 ` Kevin Wolf 0 siblings, 0 replies; 38+ messages in thread From: Kevin Wolf @ 2012-07-02 11:04 UTC (permalink / raw) To: Peter Crosthwaite Cc: Peter Maydell, stefanha, Stefan Hajnoczi, qemu-devel, Markus Armbruster, edgar.iglesias, john.williams Am 02.07.2012 12:57, schrieb Peter Crosthwaite: > On Mon, Jul 2, 2012 at 8:52 PM, Kevin Wolf <kwolf@redhat.com> wrote: >> Am 02.07.2012 12:18, schrieb Peter Crosthwaite: >>> On Mon, Jul 2, 2012 at 8:01 PM, Kevin Wolf <kwolf@redhat.com> wrote: >>>> Am 02.07.2012 11:42, schrieb Peter Crosthwaite: >>>>> On Mon, Jul 2, 2012 at 7:04 PM, Kevin Wolf <kwolf@redhat.com> wrote: >>>>>> Am 02.07.2012 10:57, schrieb Peter Crosthwaite: >>>>>>> No conditional on the qemu_coroutine_create. So it will always create >>>>>>> a new coroutine for its work which will solve my problem. All I need >>>>>>> to do is pump events once at the end of machine model creation. Any my >>>>>>> coroutines will never yield or get queued by block/AIO. Sound like a >>>>>>> solution? >>>>>> >>>>>> If you don't need the read data in your initialisation code, >>>>> >>>>> definately not :) Just as long as the read data is there by the time >>>>> the machine goes live. Whats the current policy with bdrv_read()ing >>>>> from init functions anyway? Several devices in qemu have init >>>>> functions that read the entire storage into a buffer (then the guest >>>>> just talks to the buffer rather than the backing store). >>>> >>>> Reading from block devices during device initialisation breaks >>>> migration, so I'd like to see it go away wherever possible. Reading in >>>> the whole image file doesn't sound like something for which a good >>>> excuse exists, >>> >>> Makes sense for small devices on embedded platforms. You end up with a >>> very simple and clean device model. The approach is totally broken for >>> HDDs but it does make some sense for the little embedded flashes where >>> you can get away with caching the entire device storage in RAM for the >>> lifetime of the system. >> >> It kind of works for read-only devices, it's just ugly there. With >> writeable devices it makes the VM unmigratable. >> > > Isnt it just a simple case of syncing the buffer with the backing > store on pre_save? Not if the buffer is only read during initialisation, which has long happened on the destination when pre_save runs. So you'd need to either migrate the whole block device content as suggested by PMM (works only for really small devices) or have a reopen in post_load. Both sounds rather hackish to me and likely doesn't work for block devices in general. Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-02 10:01 ` Kevin Wolf 2012-07-02 10:18 ` Peter Maydell 2012-07-02 10:18 ` Peter Crosthwaite @ 2012-07-12 13:42 ` Markus Armbruster 2012-07-13 1:21 ` Peter Crosthwaite 2 siblings, 1 reply; 38+ messages in thread From: Markus Armbruster @ 2012-07-12 13:42 UTC (permalink / raw) To: Kevin Wolf Cc: Peter Maydell, stefanha, Stefan Hajnoczi, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams Kevin Wolf <kwolf@redhat.com> writes: > Am 02.07.2012 11:42, schrieb Peter Crosthwaite: >> On Mon, Jul 2, 2012 at 7:04 PM, Kevin Wolf <kwolf@redhat.com> wrote: >>> Am 02.07.2012 10:57, schrieb Peter Crosthwaite: >>>> No conditional on the qemu_coroutine_create. So it will always create >>>> a new coroutine for its work which will solve my problem. All I need >>>> to do is pump events once at the end of machine model creation. Any my >>>> coroutines will never yield or get queued by block/AIO. Sound like a >>>> solution? >>> >>> If you don't need the read data in your initialisation code, >> >> definately not :) Just as long as the read data is there by the time >> the machine goes live. Whats the current policy with bdrv_read()ing >> from init functions anyway? Several devices in qemu have init >> functions that read the entire storage into a buffer (then the guest >> just talks to the buffer rather than the backing store). > > Reading from block devices during device initialisation breaks > migration, so I'd like to see it go away wherever possible. Reading in > the whole image file doesn't sound like something for which a good > excuse exists, you can do that as well during the first access. I just stumbled over another problem case: encrypted images. Encrypted images specified on the command line get their keys from the monitor. -S is forced automatically. You can then use block_passwd to set keys. cont asks for keys still missing. However, device initialization happens *before* you get access to the monitor. Therefore, your init method runs without a key. It would be nice if the monitor was available before device initialization. Patches welcome. [...] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-12 13:42 ` Markus Armbruster @ 2012-07-13 1:21 ` Peter Crosthwaite 2012-07-13 8:33 ` Markus Armbruster 0 siblings, 1 reply; 38+ messages in thread From: Peter Crosthwaite @ 2012-07-13 1:21 UTC (permalink / raw) To: Markus Armbruster Cc: Kevin Wolf, Peter Maydell, stefanha, Stefan Hajnoczi, qemu-devel, edgar.iglesias, john.williams On Thu, Jul 12, 2012 at 11:42 PM, Markus Armbruster <armbru@redhat.com> wrote: > Kevin Wolf <kwolf@redhat.com> writes: > >> Am 02.07.2012 11:42, schrieb Peter Crosthwaite: >>> On Mon, Jul 2, 2012 at 7:04 PM, Kevin Wolf <kwolf@redhat.com> wrote: >>>> Am 02.07.2012 10:57, schrieb Peter Crosthwaite: >>>>> No conditional on the qemu_coroutine_create. So it will always create >>>>> a new coroutine for its work which will solve my problem. All I need >>>>> to do is pump events once at the end of machine model creation. Any my >>>>> coroutines will never yield or get queued by block/AIO. Sound like a >>>>> solution? >>>> >>>> If you don't need the read data in your initialisation code, >>> >>> definately not :) Just as long as the read data is there by the time >>> the machine goes live. Whats the current policy with bdrv_read()ing >>> from init functions anyway? Several devices in qemu have init >>> functions that read the entire storage into a buffer (then the guest >>> just talks to the buffer rather than the backing store). >> >> Reading from block devices during device initialisation breaks >> migration, so I'd like to see it go away wherever possible. Reading in >> the whole image file doesn't sound like something for which a good >> excuse exists, you can do that as well during the first access. > > I just stumbled over another problem case: encrypted images. > > Encrypted images specified on the command line get their keys from the > monitor. -S is forced automatically. You can then use block_passwd to > set keys. cont asks for keys still missing. > > However, device initialization happens *before* you get access to the > monitor. Therefore, your init method runs without a key. > > It would be nice if the monitor was available before device > initialization. Patches welcome. > Hi Markus, I consolidated this discussion into a new RFC which proposes a few solutions the the bdrv_read() from init() problem. http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00237.html Are you able to comment on those ideas WRT your latest thoughts? Regards, Peter > [...] ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-07-13 1:21 ` Peter Crosthwaite @ 2012-07-13 8:33 ` Markus Armbruster 0 siblings, 0 replies; 38+ messages in thread From: Markus Armbruster @ 2012-07-13 8:33 UTC (permalink / raw) To: Peter Crosthwaite Cc: Kevin Wolf, Peter Maydell, stefanha, Stefan Hajnoczi, qemu-devel, edgar.iglesias, john.williams Peter Crosthwaite <peter.crosthwaite@petalogix.com> writes: > Hi Markus, > > I consolidated this discussion into a new RFC which proposes a few > solutions the the bdrv_read() from init() problem. > > http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00237.html > > Are you able to comment on those ideas WRT your latest thoughts? Done. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 6:44 [Qemu-devel] [RFC] block: Removed coroutine ownership assumption Peter A. G. Crosthwaite 2012-06-22 7:49 ` Kevin Wolf @ 2012-06-22 7:50 ` Jan Kiszka 2012-06-22 8:00 ` Peter Crosthwaite 1 sibling, 1 reply; 38+ messages in thread From: Jan Kiszka @ 2012-06-22 7:50 UTC (permalink / raw) To: Peter A. G. Crosthwaite Cc: kwolf, edgar.iglesias, qemu-devel, john.williams, stefanha On 2012-06-22 08:44, Peter A. G. Crosthwaite wrote: > The block layer assumes that it is the only user of coroutines - > The qemu_in_coroutine() is used to determine if a function is in one of the > block layers coroutines, which is flawed. I.E. If a client (e.g. a device or > a machine model) of the block layer uses couroutine itself, the block layer > will identify the callers coroutines as its own, and may falsely yield the > calling coroutine (instead of creating its own to yield). > > AFAICT, there are no conflicts in the QEMU master here yet, but its kind of an > issue, as anyone who comes along and used coroutines and the block layer > together is going to run into some very obscure and hard to debug race > conditions. Not sure if I understood the intention yet: Is this supposed to fix an issue with the current usage of coroutines or to extend their usage beyond that? In the latter case, please don't do this. We'd rather like to get rid of them long term. Jan > > Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com> > --- > block.c | 8 ++++---- > 1 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/block.c b/block.c > index 0acdcac..b50af15 100644 > --- a/block.c > +++ b/block.c > @@ -380,7 +380,7 @@ int bdrv_create(BlockDriver *drv, const char* filename, > return -ENOTSUP; > } > > - if (qemu_in_coroutine()) { > + if (0) { > /* Fast-path if already in coroutine context */ > bdrv_create_co_entry(&cco); > } else { > @@ -1590,7 +1590,7 @@ static int bdrv_rw_co(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, > bdrv_io_limits_disable(bs); > } > > - if (qemu_in_coroutine()) { > + if (0) { > /* Fast-path if already in coroutine context */ > bdrv_rw_co_entry(&rwco); > } else { > @@ -3813,7 +3813,7 @@ int bdrv_flush(BlockDriverState *bs) > .ret = NOT_DONE, > }; > > - if (qemu_in_coroutine()) { > + if (0) { > /* Fast-path if already in coroutine context */ > bdrv_flush_co_entry(&rwco); > } else { > @@ -3874,7 +3874,7 @@ int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors) > .ret = NOT_DONE, > }; > > - if (qemu_in_coroutine()) { > + if (0) { > /* Fast-path if already in coroutine context */ > bdrv_discard_co_entry(&rwco); > } else { > -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 7:50 ` Jan Kiszka @ 2012-06-22 8:00 ` Peter Crosthwaite 2012-06-22 8:06 ` Kevin Wolf 2012-06-22 8:16 ` Peter Maydell 0 siblings, 2 replies; 38+ messages in thread From: Peter Crosthwaite @ 2012-06-22 8:00 UTC (permalink / raw) To: Jan Kiszka; +Cc: kwolf, edgar.iglesias, qemu-devel, john.williams, stefanha On Fri, Jun 22, 2012 at 5:50 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: > On 2012-06-22 08:44, Peter A. G. Crosthwaite wrote: >> The block layer assumes that it is the only user of coroutines - >> The qemu_in_coroutine() is used to determine if a function is in one of the >> block layers coroutines, which is flawed. I.E. If a client (e.g. a device or >> a machine model) of the block layer uses couroutine itself, the block layer >> will identify the callers coroutines as its own, and may falsely yield the >> calling coroutine (instead of creating its own to yield). >> >> AFAICT, there are no conflicts in the QEMU master here yet, but its kind of an >> issue, as anyone who comes along and used coroutines and the block layer >> together is going to run into some very obscure and hard to debug race >> conditions. > > Not sure if I understood the intention yet: Is this supposed to fix an > issue with the current usage of coroutines or to extend their usage > beyond that? In the latter case, please don't do this. We'd rather like > to get rid of them long term. > My extended usage, which is under development and not ready for the list. But are you saying qemu-coroutine is deprecated? If so Ill just re-impelement my work with threads, mutexes and condition vars, but coroutines are the most natural way of doing it. > Jan > >> >> Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com> >> --- >> block.c | 8 ++++---- >> 1 files changed, 4 insertions(+), 4 deletions(-) >> >> diff --git a/block.c b/block.c >> index 0acdcac..b50af15 100644 >> --- a/block.c >> +++ b/block.c >> @@ -380,7 +380,7 @@ int bdrv_create(BlockDriver *drv, const char* filename, >> return -ENOTSUP; >> } >> >> - if (qemu_in_coroutine()) { >> + if (0) { >> /* Fast-path if already in coroutine context */ >> bdrv_create_co_entry(&cco); >> } else { >> @@ -1590,7 +1590,7 @@ static int bdrv_rw_co(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, >> bdrv_io_limits_disable(bs); >> } >> >> - if (qemu_in_coroutine()) { >> + if (0) { >> /* Fast-path if already in coroutine context */ >> bdrv_rw_co_entry(&rwco); >> } else { >> @@ -3813,7 +3813,7 @@ int bdrv_flush(BlockDriverState *bs) >> .ret = NOT_DONE, >> }; >> >> - if (qemu_in_coroutine()) { >> + if (0) { >> /* Fast-path if already in coroutine context */ >> bdrv_flush_co_entry(&rwco); >> } else { >> @@ -3874,7 +3874,7 @@ int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors) >> .ret = NOT_DONE, >> }; >> >> - if (qemu_in_coroutine()) { >> + if (0) { >> /* Fast-path if already in coroutine context */ >> bdrv_discard_co_entry(&rwco); >> } else { >> > > -- > Siemens AG, Corporate Technology, CT T DE IT 1 > Corporate Competence Center Embedded Linux > ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 8:00 ` Peter Crosthwaite @ 2012-06-22 8:06 ` Kevin Wolf 2012-06-22 8:16 ` Peter Maydell 1 sibling, 0 replies; 38+ messages in thread From: Kevin Wolf @ 2012-06-22 8:06 UTC (permalink / raw) To: Peter Crosthwaite Cc: Jan Kiszka, edgar.iglesias, qemu-devel, john.williams, stefanha Am 22.06.2012 10:00, schrieb Peter Crosthwaite: > On Fri, Jun 22, 2012 at 5:50 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: >> On 2012-06-22 08:44, Peter A. G. Crosthwaite wrote: >>> The block layer assumes that it is the only user of coroutines - >>> The qemu_in_coroutine() is used to determine if a function is in one of the >>> block layers coroutines, which is flawed. I.E. If a client (e.g. a device or >>> a machine model) of the block layer uses couroutine itself, the block layer >>> will identify the callers coroutines as its own, and may falsely yield the >>> calling coroutine (instead of creating its own to yield). >>> >>> AFAICT, there are no conflicts in the QEMU master here yet, but its kind of an >>> issue, as anyone who comes along and used coroutines and the block layer >>> together is going to run into some very obscure and hard to debug race >>> conditions. >> >> Not sure if I understood the intention yet: Is this supposed to fix an >> issue with the current usage of coroutines or to extend their usage >> beyond that? In the latter case, please don't do this. We'd rather like >> to get rid of them long term. >> > > My extended usage, which is under development and not ready for the > list. But are you saying qemu-coroutine is deprecated? If so Ill just > re-impelement my work with threads, mutexes and condition vars, but > coroutines are the most natural way of doing it. I don't think we're going to drop them as long as they are useful for someone, and definitely not anytime soon. Possibly at some point, when the block layer is converted to threads, we could make gthread the only (or the default) backend, which will impact performance a bit, but works on more platforms. Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 8:00 ` Peter Crosthwaite 2012-06-22 8:06 ` Kevin Wolf @ 2012-06-22 8:16 ` Peter Maydell 2012-06-22 8:23 ` Peter Crosthwaite ` (3 more replies) 1 sibling, 4 replies; 38+ messages in thread From: Peter Maydell @ 2012-06-22 8:16 UTC (permalink / raw) To: Peter Crosthwaite Cc: kwolf, stefanha, Jan Kiszka, qemu-devel, edgar.iglesias, john.williams On 22 June 2012 09:00, Peter Crosthwaite <peter.crosthwaite@petalogix.com> wrote: > On Fri, Jun 22, 2012 at 5:50 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: >> Not sure if I understood the intention yet: Is this supposed to fix an >> issue with the current usage of coroutines or to extend their usage >> beyond that? In the latter case, please don't do this. We'd rather like >> to get rid of them long term. > My extended usage, which is under development and not ready for the > list. But are you saying qemu-coroutine is deprecated? If so Ill just > re-impelement my work with threads, mutexes and condition vars, but > coroutines are the most natural way of doing it. Basically coroutines are nastily unportable and we've had a set of problems with them (as witness the fact that we have three separate backend implementations!). There is supposedly some sort of migration plan for getting them out of the block layer eventually; they're a kind of halfway house for avoiding synchronous I/O there AIUI. I would much prefer not to see any further use of them in new code. Fundamentally C doesn't support coroutines and it's much better to just admit that IMHO and use more idiomatic design approaches instead. -- PMM ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 8:16 ` Peter Maydell @ 2012-06-22 8:23 ` Peter Crosthwaite 2012-06-22 8:33 ` Stefan Hajnoczi ` (2 subsequent siblings) 3 siblings, 0 replies; 38+ messages in thread From: Peter Crosthwaite @ 2012-06-22 8:23 UTC (permalink / raw) To: Peter Maydell Cc: kwolf, stefanha, Jan Kiszka, qemu-devel, edgar.iglesias, john.williams On Fri, Jun 22, 2012 at 6:16 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > On 22 June 2012 09:00, Peter Crosthwaite > <peter.crosthwaite@petalogix.com> wrote: >> On Fri, Jun 22, 2012 at 5:50 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: >>> Not sure if I understood the intention yet: Is this supposed to fix an >>> issue with the current usage of coroutines or to extend their usage >>> beyond that? In the latter case, please don't do this. We'd rather like >>> to get rid of them long term. > >> My extended usage, which is under development and not ready for the >> list. But are you saying qemu-coroutine is deprecated? If so Ill just >> re-impelement my work with threads, mutexes and condition vars, but >> coroutines are the most natural way of doing it. > > Basically coroutines are nastily unportable and we've had a set > of problems with them (as witness the fact that we have three > separate backend implementations!). There is supposedly some sort > of migration plan for getting them out of the block layer eventually; > they're a kind of halfway house for avoiding synchronous I/O there > AIUI. I would much prefer not to see any further use of them in new > code. Fundamentally C doesn't support coroutines and it's much better > to just admit that IMHO and use more idiomatic design approaches > instead. > Sounds depracted to me :) Can we add some comments to coroutine? > -- PMM ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 8:16 ` Peter Maydell 2012-06-22 8:23 ` Peter Crosthwaite @ 2012-06-22 8:33 ` Stefan Hajnoczi 2012-06-22 8:45 ` Kevin Wolf 2012-06-22 8:48 ` Markus Armbruster 3 siblings, 0 replies; 38+ messages in thread From: Stefan Hajnoczi @ 2012-06-22 8:33 UTC (permalink / raw) To: Peter Maydell Cc: kwolf, stefanha, Jan Kiszka, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams On Fri, Jun 22, 2012 at 9:16 AM, Peter Maydell <peter.maydell@linaro.org> wrote: > On 22 June 2012 09:00, Peter Crosthwaite > <peter.crosthwaite@petalogix.com> wrote: >> On Fri, Jun 22, 2012 at 5:50 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: >>> Not sure if I understood the intention yet: Is this supposed to fix an >>> issue with the current usage of coroutines or to extend their usage >>> beyond that? In the latter case, please don't do this. We'd rather like >>> to get rid of them long term. > >> My extended usage, which is under development and not ready for the >> list. But are you saying qemu-coroutine is deprecated? If so Ill just >> re-impelement my work with threads, mutexes and condition vars, but >> coroutines are the most natural way of doing it. > > Basically coroutines are nastily unportable and we've had a set > of problems with them (as witness the fact that we have three > separate backend implementations!). There is supposedly some sort > of migration plan for getting them out of the block layer eventually; > they're a kind of halfway house for avoiding synchronous I/O there > AIUI. I would much prefer not to see any further use of them in new > code. Fundamentally C doesn't support coroutines and it's much better > to just admit that IMHO and use more idiomatic design approaches > instead. If you avoid coroutines and write asynchronous code it will be *harder* to convert to threads! IMO it's better to make use of coroutines for now and convert to threads as soon as the global mutex has been pushed out into device emulation. Yes, they require retargeting to different host platforms, but we have that now. Stefan ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 8:16 ` Peter Maydell 2012-06-22 8:23 ` Peter Crosthwaite 2012-06-22 8:33 ` Stefan Hajnoczi @ 2012-06-22 8:45 ` Kevin Wolf 2012-06-22 8:48 ` Markus Armbruster 3 siblings, 0 replies; 38+ messages in thread From: Kevin Wolf @ 2012-06-22 8:45 UTC (permalink / raw) To: Peter Maydell Cc: stefanha, Jan Kiszka, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams Am 22.06.2012 10:16, schrieb Peter Maydell: > On 22 June 2012 09:00, Peter Crosthwaite > <peter.crosthwaite@petalogix.com> wrote: >> On Fri, Jun 22, 2012 at 5:50 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: >>> Not sure if I understood the intention yet: Is this supposed to fix an >>> issue with the current usage of coroutines or to extend their usage >>> beyond that? In the latter case, please don't do this. We'd rather like >>> to get rid of them long term. > >> My extended usage, which is under development and not ready for the >> list. But are you saying qemu-coroutine is deprecated? If so Ill just >> re-impelement my work with threads, mutexes and condition vars, but >> coroutines are the most natural way of doing it. > > Basically coroutines are nastily unportable and we've had a set > of problems with them (as witness the fact that we have three > separate backend implementations!). There is supposedly some sort > of migration plan for getting them out of the block layer eventually; > they're a kind of halfway house for avoiding synchronous I/O there > AIUI. I would much prefer not to see any further use of them in new > code. Fundamentally C doesn't support coroutines and it's much better > to just admit that IMHO and use more idiomatic design approaches > instead. Our default coroutine implementation isn't really as portable as we wish, but while we use this in order to achieve the best performance, long term we could make gthread the default backend and keep coroutines where they are useful. The gthread backend is slower than the ucontext/sigaltstack ones, but it shouldn't make a difference in performance compared to using threads and putting exactly the same locking in place open-coded like Peter mentioned as an alternative approach. Kevin ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 8:16 ` Peter Maydell ` (2 preceding siblings ...) 2012-06-22 8:45 ` Kevin Wolf @ 2012-06-22 8:48 ` Markus Armbruster 2012-06-22 9:06 ` Peter Maydell 3 siblings, 1 reply; 38+ messages in thread From: Markus Armbruster @ 2012-06-22 8:48 UTC (permalink / raw) To: Peter Maydell Cc: kwolf, stefanha, Jan Kiszka, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams Peter Maydell <peter.maydell@linaro.org> writes: > On 22 June 2012 09:00, Peter Crosthwaite > <peter.crosthwaite@petalogix.com> wrote: >> On Fri, Jun 22, 2012 at 5:50 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote: >>> Not sure if I understood the intention yet: Is this supposed to fix an >>> issue with the current usage of coroutines or to extend their usage >>> beyond that? In the latter case, please don't do this. We'd rather like >>> to get rid of them long term. > >> My extended usage, which is under development and not ready for the >> list. But are you saying qemu-coroutine is deprecated? If so Ill just >> re-impelement my work with threads, mutexes and condition vars, but >> coroutines are the most natural way of doing it. > > Basically coroutines are nastily unportable and we've had a set > of problems with them (as witness the fact that we have three > separate backend implementations!). There is supposedly some sort > of migration plan for getting them out of the block layer eventually; > they're a kind of halfway house for avoiding synchronous I/O there > AIUI. I would much prefer not to see any further use of them in new > code. Fundamentally C doesn't support coroutines and it's much better > to just admit that IMHO and use more idiomatic design approaches > instead. I think you're overstating your case. People have been doing coroutines in C for decades, on a huge range of machines. They wouldn't have done so if it wasn't worth their while. Not what I'd call a "fundamental" problem. In my opinion, coroutines have been useful for us so far. Whether they remain useful, or serve us just as a stepping stone towards general threads remains to be seen. As Kevin said, there are no concrete plans to convert the block layer away from coroutines. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 8:48 ` Markus Armbruster @ 2012-06-22 9:06 ` Peter Maydell 2012-06-22 12:04 ` Markus Armbruster 0 siblings, 1 reply; 38+ messages in thread From: Peter Maydell @ 2012-06-22 9:06 UTC (permalink / raw) To: Markus Armbruster Cc: kwolf, stefanha, Jan Kiszka, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams On 22 June 2012 09:48, Markus Armbruster <armbru@redhat.com> wrote: > In my opinion, coroutines have been useful for us so far. Whether they > remain useful, or serve us just as a stepping stone towards general > threads remains to be seen. >From my point of view I've seen a whole pile of problems and not really any advantages... I particularly think it's a really bad idea to have a complex and potentially race-condition-prone bit of infrastructure implemented three different ways rather than having one implementation used everywhere -- it's just asking for obscure bugs on the non-x86 hosts. Really it just breaks the general rule I prefer to follow that you should write your code in the 'mainstream' of an API/platform; if you head too close to the shallows you're liable to hit a rock. -- PMM ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 9:06 ` Peter Maydell @ 2012-06-22 12:04 ` Markus Armbruster 2012-06-22 12:30 ` Peter Maydell 0 siblings, 1 reply; 38+ messages in thread From: Markus Armbruster @ 2012-06-22 12:04 UTC (permalink / raw) To: Peter Maydell Cc: kwolf, stefanha, Jan Kiszka, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams Peter Maydell <peter.maydell@linaro.org> writes: > On 22 June 2012 09:48, Markus Armbruster <armbru@redhat.com> wrote: >> In my opinion, coroutines have been useful for us so far. Whether they >> remain useful, or serve us just as a stepping stone towards general >> threads remains to be seen. > >>From my point of view I've seen a whole pile of problems and not > really any advantages... Advantages over what? > I particularly think it's a really bad > idea to have a complex and potentially race-condition-prone bit > of infrastructure implemented three different ways rather than > having one implementation used everywhere -- it's just asking > for obscure bugs on the non-x86 hosts. Fair point, but it's an implementation problem, not a fundamental problem with coroutines. You *can* implement coroutines portably, e.g. on top of gthread. But there's a portability / speed tradeoff. Kevin already explained we chose speed over portability initially, and that choice is open to revision. > Really it just breaks the general rule I prefer to follow that > you should write your code in the 'mainstream' of an API/platform; > if you head too close to the shallows you're liable to hit a rock. It's a good rule. Like for most rules, there are exceptions. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 12:04 ` Markus Armbruster @ 2012-06-22 12:30 ` Peter Maydell 2012-06-22 13:36 ` Markus Armbruster 0 siblings, 1 reply; 38+ messages in thread From: Peter Maydell @ 2012-06-22 12:30 UTC (permalink / raw) To: Markus Armbruster Cc: kwolf, stefanha, Jan Kiszka, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams On 22 June 2012 13:04, Markus Armbruster <armbru@redhat.com> wrote: > Peter Maydell <peter.maydell@linaro.org> writes: > >> On 22 June 2012 09:48, Markus Armbruster <armbru@redhat.com> wrote: >>> In my opinion, coroutines have been useful for us so far. Whether they >>> remain useful, or serve us just as a stepping stone towards general >>> threads remains to be seen. >> >>>From my point of view I've seen a whole pile of problems and not >> really any advantages... > > Advantages over what? Over what we had before we had coroutines. I know there are advantages, I'm just saying that personally I've been largely on the downside rather than the upside. >> I particularly think it's a really bad >> idea to have a complex and potentially race-condition-prone bit >> of infrastructure implemented three different ways rather than >> having one implementation used everywhere -- it's just asking >> for obscure bugs on the non-x86 hosts. > > Fair point, but it's an implementation problem, not a fundamental > problem with coroutines. You *can* implement coroutines portably, > e.g. on top of gthread. If you're implementing them on top of separate threads then you just have an obfuscated API to threads. -- PMM ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Qemu-devel] [RFC] block: Removed coroutine ownership assumption 2012-06-22 12:30 ` Peter Maydell @ 2012-06-22 13:36 ` Markus Armbruster 0 siblings, 0 replies; 38+ messages in thread From: Markus Armbruster @ 2012-06-22 13:36 UTC (permalink / raw) To: Peter Maydell Cc: kwolf, stefanha, Jan Kiszka, qemu-devel, Peter Crosthwaite, edgar.iglesias, john.williams Peter Maydell <peter.maydell@linaro.org> writes: > On 22 June 2012 13:04, Markus Armbruster <armbru@redhat.com> wrote: >> Peter Maydell <peter.maydell@linaro.org> writes: >> >>> On 22 June 2012 09:48, Markus Armbruster <armbru@redhat.com> wrote: >>>> In my opinion, coroutines have been useful for us so far. Whether they >>>> remain useful, or serve us just as a stepping stone towards general >>>> threads remains to be seen. >>> >>>>From my point of view I've seen a whole pile of problems and not >>> really any advantages... >> >> Advantages over what? > > Over what we had before we had coroutines. I know there are > advantages, I'm just saying that personally I've been largely > on the downside rather than the upside. Granted. Just saying that there's an upside, too :) >>> I particularly think it's a really bad >>> idea to have a complex and potentially race-condition-prone bit >>> of infrastructure implemented three different ways rather than >>> having one implementation used everywhere -- it's just asking >>> for obscure bugs on the non-x86 hosts. >> >> Fair point, but it's an implementation problem, not a fundamental >> problem with coroutines. You *can* implement coroutines portably, >> e.g. on top of gthread. > > If you're implementing them on top of separate threads then > you just have an obfuscated API to threads. No, what you really have is yet another means to express control flow. The fact that it can be implemented on top of threads doesn't necessarily make it an obfuscated API to threads. Look, a while loop isn't an obfuscated API to continuations, either. It's less general, but when it's all you need, it's easier to use. When all you need is coroutines, not having to worry about preemption makes them easier to use in my experience. ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2012-07-13 8:35 UTC | newest] Thread overview: 38+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-06-22 6:44 [Qemu-devel] [RFC] block: Removed coroutine ownership assumption Peter A. G. Crosthwaite 2012-06-22 7:49 ` Kevin Wolf 2012-06-22 8:20 ` Peter Crosthwaite 2012-06-22 8:31 ` Peter Maydell 2012-06-22 8:34 ` Stefan Hajnoczi 2012-06-22 8:53 ` Kevin Wolf 2012-06-22 10:59 ` Peter Crosthwaite [not found] ` <CAEgOgz4+5FsFUZT796chTADOXcRps0+74T4whfmdEsh_dO96VA@mail.gmail.com> [not found] ` <CAJSP0QWVJTb9jPZC_mdWpd4OwLz4VOuxGBZ_=2M4HNeexEC96Q@mail.gmail.com> [not found] ` <CAFEAcA_cX_jxZMSjjT=yBA1Hmf2VpWbGyDBZ8z9mqq5rVNsWWw@mail.gmail.com> [not found] ` <CAJSP0QV=BsRhdD_tVE9Cav-bhGiF-R3+Ab2aTtun6nSoPh0EmQ@mail.gmail.com> [not found] ` <m3vcidw3v3.fsf@blackfin.pond.sub.org> [not found] ` <CAEgOgz6Mai7PvBkHwN0EjhFsAFMU8W=V1B1X0ZLN3c1YHRaWKA@mail.gmail.com> [not found] ` <CAJSP0QWJcuEOmxhoAceqU2WYQkGT+fsizf-kdx_irq97j3pw4Q@mail.gmail.com> [not found] ` <CAEgOgz51jauHzvEk0Pk+oNQ0qPekjKatvNivv4vxQsQR_6nOVQ@mail.gmail.com> [not found] ` <CAJSP0QXnaw2HV7M+U=0S-ApGGnrGtt1w0P5w5gpmv7h-7bMs9g@mail.gmail.com> [not found] ` <CAEgOgz65h2baE0ufvSgfow-B5fGVwJKgNwsf3C2r65HNGdiQxg@mail.gmail.com> [not found] ` <4FED6638.90703@redhat.com> [not found] ` <CAEgOgz6sUREnwNuiSM=344ZTNq_4xMJDFU29Sn+6dTeVww4rhA@mail.gmail.com> [not found] ` <m3y5n61hl5.fsf@blackfin.pond.sub.org> [not found] ` <CAEgOgz7oPPsMexuzsYwc2LOGGOC4EM9NvHjXAp0TT2T8kOyirQ@mail.gmail.com> 2012-07-02 8:50 ` Stefan Hajnoczi 2012-07-02 8:57 ` Peter Crosthwaite 2012-07-02 9:04 ` Kevin Wolf 2012-07-02 9:42 ` Peter Crosthwaite 2012-07-02 10:01 ` Kevin Wolf 2012-07-02 10:18 ` Peter Maydell 2012-07-02 10:44 ` Kevin Wolf 2012-07-02 10:59 ` Peter Maydell 2012-07-02 11:03 ` Peter Crosthwaite 2012-07-02 11:12 ` Peter Crosthwaite 2012-07-02 11:19 ` Kevin Wolf 2012-07-02 11:25 ` Peter Crosthwaite 2012-07-02 10:18 ` Peter Crosthwaite 2012-07-02 10:52 ` Kevin Wolf 2012-07-02 10:57 ` Peter Crosthwaite 2012-07-02 11:04 ` Kevin Wolf 2012-07-12 13:42 ` Markus Armbruster 2012-07-13 1:21 ` Peter Crosthwaite 2012-07-13 8:33 ` Markus Armbruster 2012-06-22 7:50 ` Jan Kiszka 2012-06-22 8:00 ` Peter Crosthwaite 2012-06-22 8:06 ` Kevin Wolf 2012-06-22 8:16 ` Peter Maydell 2012-06-22 8:23 ` Peter Crosthwaite 2012-06-22 8:33 ` Stefan Hajnoczi 2012-06-22 8:45 ` Kevin Wolf 2012-06-22 8:48 ` Markus Armbruster 2012-06-22 9:06 ` Peter Maydell 2012-06-22 12:04 ` Markus Armbruster 2012-06-22 12:30 ` Peter Maydell 2012-06-22 13:36 ` Markus Armbruster
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).