* [Drbd-dev] Re: [DRBD-cvs] r1743 - in branches/drbd-0.7: . drbd [not found] <20050131111921.EE6F83BE6D@garcon.linbit.com> @ 2005-02-01 16:48 ` Lars Marowsky-Bree 2005-02-01 17:36 ` Philipp Reisner 0 siblings, 1 reply; 6+ messages in thread From: Lars Marowsky-Bree @ 2005-02-01 16:48 UTC (permalink / raw) To: drbd-dev On 2005-01-31T12:19:21, svn@svn.drbd.org wrote: > Modified: > branches/drbd-0.7/ChangeLog > branches/drbd-0.7/drbd.spec.in > branches/drbd-0.7/drbd/drbd_actlog.c > branches/drbd-0.7/drbd/drbd_compat_wrappers.h > Log: > No longer leak of BIOs. Good goal, but ... > Modified: branches/drbd-0.7/drbd/drbd_actlog.c > =================================================================== > --- branches/drbd-0.7/drbd/drbd_actlog.c 2005-01-31 09:20:39 UTC (rev 1742) > +++ branches/drbd-0.7/drbd/drbd_actlog.c 2005-01-31 11:19:19 UTC (rev 1743) > @@ -68,8 +68,6 @@ > struct completion event; > int ok; > > - bio_get(bio); > - > bio->bi_bdev = mdev->md_bdev; > bio->bi_sector = sector; > bio_add_page(bio, page, size, 0); I think this is a bug. If you look at the comments in linux/include/linux/bio.h, you'll find that this is just how the bio is supposed to be used: /* * get a reference to a bio, so it won't disappear. the intended use is * something like: * * bio_get(bio); * submit_bio(rw, bio); * if (bio->bi_flags ...) * do_something * bio_put(bio); * * without the bio_get(), it could potentially complete I/O before submit_bio * returns. and then bio would be freed memory when if (bio->bi_flags ...) * runs */ #define bio_get(bio) atomic_inc(&(bio)->bi_cnt) The fix I did was not the right way (removing the bio embedding was much better), but it also wasn't completely wrong. Sincerely, Lars Marowsky-Brée <lmb@suse.de> -- High Availability & Clustering SUSE Labs, Research and Development SUSE LINUX Products GmbH - A Novell Business ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Drbd-dev] Re: [DRBD-cvs] r1743 - in branches/drbd-0.7: . drbd 2005-02-01 16:48 ` [Drbd-dev] Re: [DRBD-cvs] r1743 - in branches/drbd-0.7: . drbd Lars Marowsky-Bree @ 2005-02-01 17:36 ` Philipp Reisner 2005-02-02 10:13 ` Lars Marowsky-Bree 0 siblings, 1 reply; 6+ messages in thread From: Philipp Reisner @ 2005-02-01 17:36 UTC (permalink / raw) To: drbd-dev Am Dienstag, 1. Februar 2005 17:48 schrieb Lars Marowsky-Bree: > On 2005-01-31T12:19:21, svn@svn.drbd.org wrote: > > Modified: > > branches/drbd-0.7/ChangeLog > > branches/drbd-0.7/drbd.spec.in > > branches/drbd-0.7/drbd/drbd_actlog.c > > branches/drbd-0.7/drbd/drbd_compat_wrappers.h > > Log: > > No longer leak of BIOs. > > Good goal, but ... > > > Modified: branches/drbd-0.7/drbd/drbd_actlog.c > > =================================================================== > > --- branches/drbd-0.7/drbd/drbd_actlog.c 2005-01-31 09:20:39 UTC (rev > > 1742) +++ branches/drbd-0.7/drbd/drbd_actlog.c 2005-01-31 11:19:19 UTC > > (rev 1743) @@ -68,8 +68,6 @@ > > struct completion event; > > int ok; > > > > - bio_get(bio); > > - > > bio->bi_bdev = mdev->md_bdev; > > bio->bi_sector = sector; > > bio_add_page(bio, page, size, 0); > > I think this is a bug. If you look at the comments in > linux/include/linux/bio.h, you'll find that this is just how the bio is > supposed to be used: > > /* > * get a reference to a bio, so it won't disappear. the intended use is > * something like: > * > * bio_get(bio); > * submit_bio(rw, bio); > * if (bio->bi_flags ...) > * do_something > * bio_put(bio); > * > * without the bio_get(), it could potentially complete I/O before > submit_bio * returns. and then bio would be freed memory when if > (bio->bi_flags ...) * runs > */ > #define bio_get(bio) atomic_inc(&(bio)->bi_cnt) > > The fix I did was not the right way (removing the bio embedding was much > better), but it also wasn't completely wrong. > I did not look at the comments, I read the code. And what I read there was: a BIO has after bio_init() a refcount (bi_cnt) of 1. It is freed when the refcount drops to zero. bio_alloc(); do something with it bio_put(); is right. What you did is: bio_alloc(); bio_get(); // increase to 2 bio_put(); // decreat to 1 and let it live forever... -Philipp -- : Dipl-Ing Philipp Reisner Tel +43-1-8178292-50 : : LINBIT Information Technologies GmbH Fax +43-1-8178292-82 : : Schönbrunnerstr 244, 1120 Vienna, Austria http://www.linbit.com : ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Drbd-dev] Re: [DRBD-cvs] r1743 - in branches/drbd-0.7: . drbd 2005-02-01 17:36 ` Philipp Reisner @ 2005-02-02 10:13 ` Lars Marowsky-Bree 2005-02-02 17:59 ` Philipp Reisner 0 siblings, 1 reply; 6+ messages in thread From: Lars Marowsky-Bree @ 2005-02-02 10:13 UTC (permalink / raw) To: Philipp Reisner, drbd-dev On 2005-02-01T18:36:43, Philipp Reisner <philipp.reisner@linbit.com> wrote: > I did not look at the comments, I read the code. And what I read there was: > a BIO has after bio_init() a refcount (bi_cnt) of 1. It is freed when the > refcount drops to zero. > > bio_alloc(); > do something with it "submit_bio()" is slightly different from just "something". > bio_put(); > > is right. > > What you did is: > > bio_alloc(); > bio_get(); // increase to 2 > bio_put(); // decreat to 1 > > and let it live forever... The cleanup at the end of the bio processing however should drop the bio reference count too. Which is why the comment in bio.h says what it is. Either the comment is wrong (and Jens's very same fix for the md code, too), or you have introduced a race condition into the actlog. Sincerely, Lars Marowsky-Brée <lmb@suse.de> -- High Availability & Clustering SUSE Labs, Research and Development SUSE LINUX Products GmbH - A Novell Business ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Drbd-dev] Re: [DRBD-cvs] r1743 - in branches/drbd-0.7: . drbd 2005-02-02 10:13 ` Lars Marowsky-Bree @ 2005-02-02 17:59 ` Philipp Reisner 2005-02-03 8:38 ` Jens Axboe 2005-02-03 9:15 ` Lars Marowsky-Bree 0 siblings, 2 replies; 6+ messages in thread From: Philipp Reisner @ 2005-02-02 17:59 UTC (permalink / raw) To: drbd-dev; +Cc: Jens Axboe Am Mittwoch, 2. Februar 2005 11:13 schrieb Lars Marowsky-Bree: > On 2005-02-01T18:36:43, Philipp Reisner <philipp.reisner@linbit.com> wrote: > > I did not look at the comments, I read the code. And what I read there > > was: a BIO has after bio_init() a refcount (bi_cnt) of 1. It is freed > > when the refcount drops to zero. > > > > bio_alloc(); > > do something with it > > "submit_bio()" is slightly different from just "something". in which regard ? We should not free it while we do something with it.. that is true for something and for submit_bio() :) > > bio_put(); > > > > is right. > > > > What you did is: > > > > bio_alloc(); > > bio_get(); // increase to 2 > > bio_put(); // decreat to 1 > > > > and let it live forever... > > The cleanup at the end of the bio processing however should drop the bio > reference count too. Which is why the comment in bio.h says what it is. > Either the comment is wrong (and Jens's very same fix for the md code, > too), or you have introduced a race condition into the actlog. Why do you not simply read the code ? Right in case you return from the function that alocated the bio, then it is a good idea to use the reference counting to make the completion handler to free it. This would be something like this: STATIC int _drbd_md_async_page_io(...) { struct bio *bio = bio_alloc(GFP_KERNEL, 1); bio->bi_end_io = drbd_md_io_complete_a; submit_bio(rw, bio); return; } You see, there is no single bio_get() or bio_put() ... What refcount will the bio have ? Right it will have one (1), so the bio lives on... After some while, when the disk was so nice to write our request the bio completion function would be called: int drbd_md_io_complete_a(struct bio *bio, unsigned int bytes_done, int error) { if (bio->bi_size) return 1; put_bio(bio); return 0; } And here we would decrease the BIOs refcount to 0 and free it. BUT the function we talk about has SYNCHRONOUS semantics! Otherwise it would have never worked with a BIO on the stack... Read carefull: STATIC int _drbd_md_sync_page_io(drbd_dev *mdev, struct page *page, sector_t sector, int rw, int size) { struct bio *bio = bio_alloc(GFP_KERNEL, 1); bio->bi_end_io = drbd_md_io_complete; wait_for_completion(&event); bio_put(bio); return ok; } and the corresponding io-completion function: int drbd_md_io_complete(struct bio *bio, unsigned int bytes_done, int error) { if (bio->bi_size) return 1; complete((struct completion*)bio->bi_private); return 0; } So our _drbd_md_sync_page_io() sleeps in wait_for_completion() until the whole action is carried out, and then ... in the same scope where I allocated the bio I free it by calling bio_put() --- I could do it on the stack -- But I should use bio_init() the... Jens: Please tell Lars that I am right on how the refcounting works. And, BTW, is there any reason to not have a BIO on the stack ? Besides people return from the function before the completion callback was called ? In other words: Might something below me increase the refcount of the BIO and keep the reference even after my IO completion function was called ? -Philipp ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Drbd-dev] Re: [DRBD-cvs] r1743 - in branches/drbd-0.7: . drbd 2005-02-02 17:59 ` Philipp Reisner @ 2005-02-03 8:38 ` Jens Axboe 2005-02-03 9:15 ` Lars Marowsky-Bree 1 sibling, 0 replies; 6+ messages in thread From: Jens Axboe @ 2005-02-03 8:38 UTC (permalink / raw) To: Philipp Reisner; +Cc: drbd-dev On Wed, Feb 02 2005, Philipp Reisner wrote: > Am Mittwoch, 2. Februar 2005 11:13 schrieb Lars Marowsky-Bree: > > On 2005-02-01T18:36:43, Philipp Reisner <philipp.reisner@linbit.com> wrote: > > > I did not look at the comments, I read the code. And what I read there > > > was: a BIO has after bio_init() a refcount (bi_cnt) of 1. It is freed > > > when the refcount drops to zero. > > > > > > bio_alloc(); > > > do something with it > > > > "submit_bio()" is slightly different from just "something". > > in which regard ? We should not free it while we do something > with it.. that is true for something and for submit_bio() :) > > > > bio_put(); > > > > > > is right. > > > > > > What you did is: > > > > > > bio_alloc(); > > > bio_get(); // increase to 2 > > > bio_put(); // decreat to 1 > > > > > > and let it live forever... > > > > The cleanup at the end of the bio processing however should drop the bio > > reference count too. Which is why the comment in bio.h says what it is. > > Either the comment is wrong (and Jens's very same fix for the md code, > > too), or you have introduced a race condition into the actlog. > > Why do you not simply read the code ? > > Right in case you return from the function that alocated the bio, then > it is a good idea to use the reference counting to make the completion > handler to free it. This would be something like this: > > STATIC int _drbd_md_async_page_io(...) > { > struct bio *bio = bio_alloc(GFP_KERNEL, 1); > > bio->bi_end_io = drbd_md_io_complete_a; > submit_bio(rw, bio); > > return; > } > > You see, there is no single bio_get() or bio_put() ... What refcount > will the bio have ? Right it will have one (1), so the bio lives on... That's because you don't touch the bio after submit_bio(), if you did you would need to grab a reference to it. > After some while, when the disk was so nice to write our request the > bio completion function would be called: > > int drbd_md_io_complete_a(struct bio *bio, unsigned int bytes_done, int error) > { > if (bio->bi_size) > return 1; > > put_bio(bio); > return 0; > } > > And here we would decrease the BIOs refcount to 0 and free it. > > BUT the function we talk about has SYNCHRONOUS semantics! Otherwise > it would have never worked with a BIO on the stack... Read carefull: > > STATIC int _drbd_md_sync_page_io(drbd_dev *mdev, struct page *page, > sector_t sector, int rw, int size) > { > struct bio *bio = bio_alloc(GFP_KERNEL, 1); > > bio->bi_end_io = drbd_md_io_complete; > > wait_for_completion(&event); > > bio_put(bio); > return ok; > } > > and the corresponding io-completion function: > > int drbd_md_io_complete(struct bio *bio, unsigned int bytes_done, int error) > { > if (bio->bi_size) > return 1; > > complete((struct completion*)bio->bi_private); > return 0; > } That matching pair is fine. > So our _drbd_md_sync_page_io() sleeps in wait_for_completion() until the > whole action is carried out, and then ... in the same scope where I allocated > the bio I free it by calling bio_put() --- > I could do it on the stack -- But I should use bio_init() the... > > Jens: > Please tell Lars that I am right on how the refcounting works. It seems you are. The md fix is actually leaky, since the bio_end_io doesn't put the bio. An oversight on my behalf. It's only called only called on init so no continual leak. > And, BTW, is there any reason to not have a BIO on the stack ? > Besides people return from the function before the completion callback > was called ? > > In other words: Might something below me increase the refcount of the > BIO and keep the reference even after my IO completion function was > called ? That's a good point. On-stack objects really interfer with reference counting, such as used by bios. We had trouble with this and struct requests in the future as well, so I would really discourage on-stack bios completely. I'll make sure that there are no in-kernel uses of this soonish, probably with a little WARN_ON() to catch out of tree uses as well. -- Jens Axboe ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Drbd-dev] Re: [DRBD-cvs] r1743 - in branches/drbd-0.7: . drbd 2005-02-02 17:59 ` Philipp Reisner 2005-02-03 8:38 ` Jens Axboe @ 2005-02-03 9:15 ` Lars Marowsky-Bree 1 sibling, 0 replies; 6+ messages in thread From: Lars Marowsky-Bree @ 2005-02-03 9:15 UTC (permalink / raw) To: Philipp Reisner, drbd-dev; +Cc: Jens Axboe On 2005-02-02T18:59:10, Philipp Reisner <philipp.reisner@linbit.com> wrote: > Why do you not simply read the code ? I did, but because the comments in the code and those I got from Jens were contradictionary with what you said, and I wanted to figure out who was right, and I'm known to misunderstand code. But Jens confirmed in a private discussion that you're right ;-) Normally, the end_io functions would drop their reference count. Alas, because drbd and md were converted from on stack bios, those didn't. So yes, the additional bio_get seems to be wrong indeed, or a bio_put() should be added to the end_io. Thanks for catching this, this means we'll have another bugfix queued soon... Sincerely, Lars Marowsky-Brée <lmb@suse.de> -- High Availability & Clustering SUSE Labs, Research and Development SUSE LINUX Products GmbH - A Novell Business ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2005-02-03 9:15 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20050131111921.EE6F83BE6D@garcon.linbit.com>
2005-02-01 16:48 ` [Drbd-dev] Re: [DRBD-cvs] r1743 - in branches/drbd-0.7: . drbd Lars Marowsky-Bree
2005-02-01 17:36 ` Philipp Reisner
2005-02-02 10:13 ` Lars Marowsky-Bree
2005-02-02 17:59 ` Philipp Reisner
2005-02-03 8:38 ` Jens Axboe
2005-02-03 9:15 ` Lars Marowsky-Bree
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox