* [linux-lvm] PATCH: bdi_congested-core.patch (was: Re: IO scheduler, queue depth, nr_requests)
From: Miquel van Smoorenburg @ 2004-02-26 11:27 UTC
To: Jens Axboe
Cc: Andrew Morton, thornber, piggin, linux-lvm

According to Jens Axboe:
> On Thu, Feb 26 2004, Andrew Morton wrote:
> > Miquel van Smoorenburg <miquels@cistron.nl> wrote:
> > > Well, I'm looking for guidance from Jens first, since I'm not sure if
> > > he wants the first solution (congestion testing passed down), the second
> > > (congestion setting passed up) or both.
> >
> > The first patch looks nice. Why do we need the "pass it up" feature?
>
> Miquel, the newer patches do tend to get too complicated. I suppose I'm
> fine with the first patch as well, modulo the dm part (I'd suggest
> merging it without and talking to Joe about doing it properly).

Okay here you go. I added bdi_rw_congested() for code in xfs and ext2
that calls both bdi_read_congested() and bdi_write_congested() in
a row, and it was "free" anyway.

bdi_congested-core.patch

--- linux-2.6.3.orig/include/linux/backing-dev.h	2004-02-04 04:43:38.000000000 +0100
+++ linux-2.6.3-congested_fn/include/linux/backing-dev.h	2004-02-26 16:17:49.000000000 +0100
@@ -20,10 +20,14 @@ enum bdi_state {
 	BDI_unused,		/* Available bits start here */
 };
 
+typedef int (congested_fn)(void *, int);
+
 struct backing_dev_info {
 	unsigned long ra_pages;	/* max readahead in PAGE_CACHE_SIZE units */
 	unsigned long state;	/* Always use atomic bitops on this */
 	int memory_backed;	/* Cannot clean pages with writepage */
+	congested_fn *congested_fn; /* Function pointer if device is md/dm */
+	void *congested_data;	/* Pointer to aux data for congested func */
 };
 
 extern struct backing_dev_info default_backing_dev_info;
@@ -32,14 +36,27 @@ int writeback_acquire(struct backing_dev
 int writeback_in_progress(struct backing_dev_info *bdi);
 void writeback_release(struct backing_dev_info *bdi);
 
+static inline int bdi_congested(struct backing_dev_info *bdi, int bdi_bits)
+{
+	if (bdi->congested_fn)
+		return bdi->congested_fn(bdi->congested_data, bdi_bits);
+	return (bdi->state & bdi_bits);
+}
+
 static inline int bdi_read_congested(struct backing_dev_info *bdi)
 {
-	return test_bit(BDI_read_congested, &bdi->state);
+	return bdi_congested(bdi, 1 << BDI_read_congested);
 }
 
 static inline int bdi_write_congested(struct backing_dev_info *bdi)
 {
-	return test_bit(BDI_write_congested, &bdi->state);
+	return bdi_congested(bdi, 1 << BDI_write_congested);
+}
+
+static inline int bdi_rw_congested(struct backing_dev_info *bdi)
+{
+	return bdi_congested(bdi, (1 << BDI_read_congested)|
+				  (1 << BDI_write_congested));
 }
 
 #endif /* _LINUX_BACKING_DEV_H */
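
The dispatch in bdi_congested() from the core patch can be tried outside the kernel. What follows is only a minimal userspace sketch, not kernel code: the struct is cut down to the fields the patch touches, the bit positions are illustrative values rather than the real enum bdi_state, and always_congested() is a hypothetical callback invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit positions only; the real values come from enum bdi_state. */
enum { BDI_write_congested = 1, BDI_read_congested = 2 };

typedef int (congested_fn)(void *, int);

/* Cut-down stand-in for the kernel structure, keeping only what the patch uses. */
struct backing_dev_info {
	unsigned long state;		/* congestion bits for a plain device */
	congested_fn *congested_fn;	/* set only for stacked (md/dm) devices */
	void *congested_data;		/* opaque argument passed to congested_fn */
};

/* Same logic as the patch: ask the callback if there is one, else test the bits. */
static int bdi_congested(const struct backing_dev_info *bdi, int bdi_bits)
{
	assert(bdi != NULL);
	if (bdi->congested_fn)
		return bdi->congested_fn(bdi->congested_data, bdi_bits);
	return (int)(bdi->state & (unsigned long)bdi_bits);
}

/* Hypothetical callback: reports every requested bit as congested. */
static int always_congested(void *data, int bdi_bits)
{
	(void)data;
	return bdi_bits;
}
```

A device without a callback answers from its own state word, while a device with congested_fn set ignores state entirely. Note that the result is a bit mask, not a 0/1 flag, so callers should only test it for zero or non-zero.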

* [linux-lvm] PATCH: bdi_congested-dm.patch (was: Re: IO scheduler, queue depth, nr_requests)
From: Miquel van Smoorenburg @ 2004-02-26 11:28 UTC
To: thornber
Cc: Jens Axboe, Andrew Morton, piggin, linux-lvm

According to Miquel van Smoorenburg:
> Okay here you go.
> bdi_congested-core.patch

Here's the dm part:

bdi_congested-dm.patch

--- linux-2.6.3.orig/drivers/md/dm.h	2004-02-04 04:43:45.000000000 +0100
+++ linux-2.6.3-congested_fn/drivers/md/dm.h	2004-02-26 14:22:41.000000000 +0100
@@ -115,6 +115,7 @@ struct list_head *dm_table_get_devices(s
 int dm_table_get_mode(struct dm_table *t);
 void dm_table_suspend_targets(struct dm_table *t);
 void dm_table_resume_targets(struct dm_table *t);
+int dm_table_any_congested(struct dm_table *t, int bdi_bits);
 
 /*-----------------------------------------------------------------
  * A registry of target types.
--- linux-2.6.3.orig/drivers/md/dm-table.c	2004-02-04 04:44:59.000000000 +0100
+++ linux-2.6.3-congested_fn/drivers/md/dm-table.c	2004-02-26 14:22:30.000000000 +0100
@@ -857,6 +857,20 @@ void dm_table_resume_targets(struct dm_t
 	}
 }
 
+int dm_table_any_congested(struct dm_table *t, int bdi_bits)
+{
+	struct list_head *d, *devices;
+	int r = 0;
+
+	devices = dm_table_get_devices(t);
+	for (d = devices->next; d != devices; d = d->next) {
+		struct dm_dev *dd = list_entry(d, struct dm_dev, list);
+		request_queue_t *q = bdev_get_queue(dd->bdev);
+		r |= bdi_congested(&q->backing_dev_info, bdi_bits);
+	}
+
+	return r;
+}
 
 EXPORT_SYMBOL(dm_get_device);
 EXPORT_SYMBOL(dm_put_device);
--- linux-2.6.3.orig/drivers/md/dm.c	2004-02-22 13:52:15.000000000 +0100
+++ linux-2.6.3-congested_fn/drivers/md/dm.c	2004-02-26 14:26:57.000000000 +0100
@@ -526,6 +526,18 @@ static int dm_request(request_queue_t *q
 	return 0;
 }
 
+static int dm_any_congested(void *congested_data, int bdi_bits)
+{
+	int r;
+	struct mapped_device *md = congested_data;
+
+	down_read(&md->lock);
+	r = dm_table_any_congested(md->map, bdi_bits);
+	up_read(&md->lock);
+
+	return r;
+}
+
 /*-----------------------------------------------------------------
  * A bitset is used to keep track of allocated minor numbers.
  *---------------------------------------------------------------*/
@@ -608,6 +620,8 @@ static struct mapped_device *alloc_dev(u
 	}
 
 	md->queue->queuedata = md;
+	md->queue->backing_dev_info.congested_fn = dm_any_congested;
+	md->queue->backing_dev_info.congested_data = md;
 	blk_queue_make_request(md->queue, dm_request);
 
 	md->io_pool = mempool_create(MIN_IOS, mempool_alloc_slab,
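
The dm patch makes a stacked device report congestion whenever any underlying device does, by OR-ing the results together. The following is a rough userspace sketch of that aggregation, under loud assumptions: struct toy_table and toy_table_any_congested() are hypothetical stand-ins for struct dm_table and dm_table_any_congested(), a plain array replaces the kernel's device list, the TOY_* bit values are made up, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up bit masks for the sketch; not the kernel's values. */
#define TOY_READ_BIT	(1 << 0)
#define TOY_WRITE_BIT	(1 << 1)

typedef int (congested_fn)(void *, int);

struct backing_dev_info {
	unsigned long state;
	congested_fn *congested_fn;
	void *congested_data;
};

static int bdi_congested(const struct backing_dev_info *bdi, int bdi_bits)
{
	assert(bdi != NULL);
	if (bdi->congested_fn)
		return bdi->congested_fn(bdi->congested_data, bdi_bits);
	return (int)(bdi->state & (unsigned long)bdi_bits);
}

/* Hypothetical stand-in for a dm table: an array of underlying devices. */
struct toy_table {
	struct backing_dev_info **devs;
	size_t ndevs;
};

/* OR together the answers of all underlying devices, as the dm patch does. */
static int toy_table_any_congested(void *data, int bdi_bits)
{
	struct toy_table *t = data;
	int r = 0;

	for (size_t i = 0; i < t->ndevs; i++)
		r |= bdi_congested(t->devs[i], bdi_bits);

	return r;
}
```

Because each child is queried through bdi_congested() rather than by reading its state word directly, the scheme nests: a stacked device built on top of another stacked device simply recurses through the lower callback.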

* [linux-lvm] Re: PATCH: bdi_congested-dm.patch (was: Re: IO scheduler, queue depth, nr_requests)
From: Joe Thornber @ 2004-02-26 11:41 UTC
To: Miquel van Smoorenburg
Cc: thornber, Jens Axboe, Andrew Morton, piggin, linux-lvm

On Thu, Feb 26, 2004 at 05:29:52PM +0100, Miquel van Smoorenburg wrote:
> According to Miquel van Smoorenburg:
> > Okay here you go.
> > bdi_congested-core.patch
>
> Here's the dm part:
>
> bdi_congested-dm.patch

Looks good to me.

- Joe

* [linux-lvm] Re: PATCH: bdi_congested-dm.patch (was: Re: IO scheduler, queue depth, nr_requests)
From: Andrew Morton @ 2004-02-26 17:35 UTC
To: Miquel van Smoorenburg
Cc: thornber, axboe, piggin, linux-lvm

Miquel van Smoorenburg <miquels@cistron.nl> wrote:
>
> +	down_read(&md->lock);
> +	r = dm_table_any_congested(md->map, bdi_bits);
> +	up_read(&md->lock);

This can deadlock if anyone ever performs a __GFP_IO memory allocation
while holding md->lock. We could enter page reclaim with the lock held,
come back in here and take it again.

That's an instant deadlock if we were holding it for write and a
super-rare deadlock if we were holding it for read (requires some other
process to come in and run down_write() in between the two down_read()s).

A quick audit shows that we're OK, I think. What does
dm_table_suspend_targets() do?

But still, this restriction needs to be understood, documented and adhered
to in future DM development.

If it gets really sticky in here it would be acceptable to do
down_read_trylock() and return 0 if it fails - the congestion code is only
advisory and 99% is good enough.

I like these patches btw - I was always worried about what the
stacked-device implementation of queue congestion would look like, and this
is nice and simple.