* [linux-lvm] PATCH: bdi_congested-core.patch (was: Re: IO scheduler, queue depth, nr_requests)
[not found] ` <20040226105120.GC7580@suse.de>
@ 2004-02-26 11:27 ` Miquel van Smoorenburg
2004-02-26 11:28 ` [linux-lvm] PATCH: bdi_congested-dm.patch " Miquel van Smoorenburg
0 siblings, 1 reply; 4+ messages in thread
From: Miquel van Smoorenburg @ 2004-02-26 11:27 UTC (permalink / raw)
To: Jens Axboe; +Cc: Andrew Morton, thornber, piggin, linux-lvm
According to Jens Axboe:
> On Thu, Feb 26 2004, Andrew Morton wrote:
> > Miquel van Smoorenburg <miquels@cistron.nl> wrote:
> > > Well, I'm looking for guidance from Jens first, since I'm not sure if
> > > he wants the first solution (congestion testing passed down), the second
> > > (congestion setting passed up) or both.
> >
> > The first patch looks nice. Why do we need the "pass it up" feature?
>
> Miquel, the newer patches do tend to get too complicated. I suppose I'm
> fine with the first patch as well, modulo the dm part (I'd suggest
> merging it without and talking to Joe about doing it properly).

Okay here you go. I added bdi_rw_congested() for code in xfs and ext2
that calls both bdi_read_congested() and bdi_write_congested() in
a row, and it was "free" anyway.

bdi_congested-core.patch

--- linux-2.6.3.orig/include/linux/backing-dev.h 2004-02-04 04:43:38.000000000 +0100
+++ linux-2.6.3-congested_fn/include/linux/backing-dev.h 2004-02-26 16:17:49.000000000 +0100
@@ -20,10 +20,14 @@ enum bdi_state {
 	BDI_unused,		/* Available bits start here */
 };

+typedef int (congested_fn)(void *, int);
+
 struct backing_dev_info {
 	unsigned long ra_pages;	/* max readahead in PAGE_CACHE_SIZE units */
 	unsigned long state;	/* Always use atomic bitops on this */
 	int memory_backed;	/* Cannot clean pages with writepage */
+	congested_fn *congested_fn; /* Function pointer if device is md/dm */
+	void *congested_data;	/* Pointer to aux data for congested func */
 };

 extern struct backing_dev_info default_backing_dev_info;
@@ -32,14 +36,27 @@ int writeback_acquire(struct backing_dev
 int writeback_in_progress(struct backing_dev_info *bdi);
 void writeback_release(struct backing_dev_info *bdi);

+static inline int bdi_congested(struct backing_dev_info *bdi, int bdi_bits)
+{
+	if (bdi->congested_fn)
+		return bdi->congested_fn(bdi->congested_data, bdi_bits);
+	return (bdi->state & bdi_bits);
+}
+
 static inline int bdi_read_congested(struct backing_dev_info *bdi)
 {
-	return test_bit(BDI_read_congested, &bdi->state);
+	return bdi_congested(bdi, 1 << BDI_read_congested);
 }

 static inline int bdi_write_congested(struct backing_dev_info *bdi)
 {
-	return test_bit(BDI_write_congested, &bdi->state);
+	return bdi_congested(bdi, 1 << BDI_write_congested);
+}
+
+static inline int bdi_rw_congested(struct backing_dev_info *bdi)
+{
+	return bdi_congested(bdi, (1 << BDI_read_congested)|
+				  (1 << BDI_write_congested));
 }

 #endif /* _LINUX_BACKING_DEV_H */
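
To see the dispatch in the core patch outside the kernel, here is a minimal userspace sketch. It assumes the 2.6.3 ordering of enum bdi_state (BDI_pdflush, BDI_write_congested, BDI_read_congested), trims struct backing_dev_info down to the three fields involved, and uses a hypothetical callback, mask_congested(), that is not part of the patch. It is only meant to illustrate how a single bdi_rw_congested() call can stand in for a read test followed by a write test.

```c
#include <assert.h>
#include <stddef.h>

/* Bit numbers; the ordering is assumed to match 2.6.3's enum bdi_state. */
enum bdi_state {
	BDI_pdflush,
	BDI_write_congested,
	BDI_read_congested,
	BDI_unused,
};

typedef int (congested_fn)(void *, int);

/* Trimmed-down model of the patched structure (userspace only). */
struct backing_dev_info {
	unsigned long state;
	congested_fn *congested_fn;	/* set by stacked drivers, else NULL */
	void *congested_data;		/* opaque argument for congested_fn */
};

/* Ask the callback if there is one, otherwise test our own state bits. */
static int bdi_congested(struct backing_dev_info *bdi, int bdi_bits)
{
	assert(bdi != NULL);
	if (bdi->congested_fn)
		return bdi->congested_fn(bdi->congested_data, bdi_bits);
	return (int)(bdi->state & (unsigned long)bdi_bits);
}

static int bdi_read_congested(struct backing_dev_info *bdi)
{
	return bdi_congested(bdi, 1 << BDI_read_congested);
}

static int bdi_write_congested(struct backing_dev_info *bdi)
{
	return bdi_congested(bdi, 1 << BDI_write_congested);
}

/* One call, and at most one callback invocation, for both directions. */
static int bdi_rw_congested(struct backing_dev_info *bdi)
{
	return bdi_congested(bdi, (1 << BDI_read_congested) |
				  (1 << BDI_write_congested));
}

/* Hypothetical callback: congested_data points at an int bit mask. */
static int mask_congested(void *data, int bdi_bits)
{
	const int *mask = data;

	return *mask & bdi_bits;
}
```

A caller that used to write bdi_read_congested(bdi) || bdi_write_congested(bdi) can test bdi_rw_congested(bdi) instead. The return value is a bit mask, so callers should only compare it against zero.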
* [linux-lvm] PATCH: bdi_congested-dm.patch (was: Re: IO scheduler, queue depth, nr_requests)
2004-02-26 11:27 ` [linux-lvm] PATCH: bdi_congested-core.patch (was: Re: IO scheduler, queue depth, nr_requests) Miquel van Smoorenburg
@ 2004-02-26 11:28 ` Miquel van Smoorenburg
2004-02-26 11:41 ` [linux-lvm] " Joe Thornber
2004-02-26 17:35 ` Andrew Morton
0 siblings, 2 replies; 4+ messages in thread
From: Miquel van Smoorenburg @ 2004-02-26 11:28 UTC (permalink / raw)
To: thornber; +Cc: Jens Axboe, Andrew Morton, piggin, linux-lvm
According to Miquel van Smoorenburg:
> Okay here you go.
> bdi_congested-core.patch

Here's the dm part:

bdi_congested-dm.patch

--- linux-2.6.3.orig/drivers/md/dm.h 2004-02-04 04:43:45.000000000 +0100
+++ linux-2.6.3-congested_fn/drivers/md/dm.h 2004-02-26 14:22:41.000000000 +0100
@@ -115,6 +115,7 @@ struct list_head *dm_table_get_devices(s
 int dm_table_get_mode(struct dm_table *t);
 void dm_table_suspend_targets(struct dm_table *t);
 void dm_table_resume_targets(struct dm_table *t);
+int dm_table_any_congested(struct dm_table *t, int bdi_bits);

 /*-----------------------------------------------------------------
  * A registry of target types.
--- linux-2.6.3.orig/drivers/md/dm-table.c 2004-02-04 04:44:59.000000000 +0100
+++ linux-2.6.3-congested_fn/drivers/md/dm-table.c 2004-02-26 14:22:30.000000000 +0100
@@ -857,6 +857,20 @@ void dm_table_resume_targets(struct dm_t
 	}
 }

+int dm_table_any_congested(struct dm_table *t, int bdi_bits)
+{
+	struct list_head *d, *devices;
+	int r = 0;
+
+	devices = dm_table_get_devices(t);
+	for (d = devices->next; d != devices; d = d->next) {
+		struct dm_dev *dd = list_entry(d, struct dm_dev, list);
+		request_queue_t *q = bdev_get_queue(dd->bdev);
+		r |= bdi_congested(&q->backing_dev_info, bdi_bits);
+	}
+
+	return r;
+}

 EXPORT_SYMBOL(dm_get_device);
 EXPORT_SYMBOL(dm_put_device);
--- linux-2.6.3.orig/drivers/md/dm.c 2004-02-22 13:52:15.000000000 +0100
+++ linux-2.6.3-congested_fn/drivers/md/dm.c 2004-02-26 14:26:57.000000000 +0100
@@ -526,6 +526,18 @@ static int dm_request(request_queue_t *q
 	return 0;
 }

+static int dm_any_congested(void *congested_data, int bdi_bits)
+{
+	int r;
+	struct mapped_device *md = congested_data;
+
+	down_read(&md->lock);
+	r = dm_table_any_congested(md->map, bdi_bits);
+	up_read(&md->lock);
+
+	return r;
+}
+
 /*-----------------------------------------------------------------
  * A bitset is used to keep track of allocated minor numbers.
  *---------------------------------------------------------------*/
@@ -608,6 +620,8 @@ static struct mapped_device *alloc_dev(u
 	}

 	md->queue->queuedata = md;
+	md->queue->backing_dev_info.congested_fn = dm_any_congested;
+	md->queue->backing_dev_info.congested_data = md;
 	blk_queue_make_request(md->queue, dm_request);

 	md->io_pool = mempool_create(MIN_IOS, mempool_alloc_slab,
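
The aggregation that dm_table_any_congested() performs can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: struct toy_table, toy_table_any_congested(), toy_table_attach() and the TOY_* bit values are hypothetical stand-ins for the dm table, its device list and the BDI_* bits, and no locking is modelled. The point it illustrates is that the stacked device keeps no congestion state of its own and simply ORs together the answers of the devices beneath it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values standing in for 1 << BDI_*_congested. */
#define TOY_WRITE_CONGESTED	(1 << 1)
#define TOY_READ_CONGESTED	(1 << 2)

typedef int (congested_fn)(void *, int);

/* Trimmed-down model of the patched structure (userspace only). */
struct backing_dev_info {
	unsigned long state;
	congested_fn *congested_fn;
	void *congested_data;
};

static int bdi_congested(struct backing_dev_info *bdi, int bdi_bits)
{
	assert(bdi != NULL);
	if (bdi->congested_fn)
		return bdi->congested_fn(bdi->congested_data, bdi_bits);
	return (int)(bdi->state & (unsigned long)bdi_bits);
}

/* Hypothetical stand-in for a dm table: an array of underlying devices. */
struct toy_table {
	struct backing_dev_info **devs;
	size_t ndevs;
};

/* OR together the requested bits of every underlying device. */
static int toy_table_any_congested(void *data, int bdi_bits)
{
	struct toy_table *t = data;
	size_t i;
	int r = 0;

	for (i = 0; i < t->ndevs; i++)
		r |= bdi_congested(t->devs[i], bdi_bits);

	return r;
}

/* Mirror of the two assignments added to alloc_dev() in the patch. */
static void toy_table_attach(struct backing_dev_info *top, struct toy_table *t)
{
	top->state = 0;
	top->congested_fn = toy_table_any_congested;
	top->congested_data = t;
}
```

Because each child is queried through bdi_congested() rather than by reading its state word directly, a child that is itself a stacked device gets its own callback invoked, so nesting works without extra code.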
* [linux-lvm] Re: PATCH: bdi_congested-dm.patch (was: Re: IO scheduler, queue depth, nr_requests)
2004-02-26 11:28 ` [linux-lvm] PATCH: bdi_congested-dm.patch " Miquel van Smoorenburg
@ 2004-02-26 11:41 ` Joe Thornber
2004-02-26 17:35 ` Andrew Morton
1 sibling, 0 replies; 4+ messages in thread
From: Joe Thornber @ 2004-02-26 11:41 UTC (permalink / raw)
To: Miquel van Smoorenburg
Cc: thornber, Jens Axboe, Andrew Morton, piggin, linux-lvm
On Thu, Feb 26, 2004 at 05:29:52PM +0100, Miquel van Smoorenburg wrote:
> According to Miquel van Smoorenburg:
> > Okay here you go.
> > bdi_congested-core.patch
>
> Here's the dm part:
>
> bdi_congested-dm.patch

Looks good to me.

- Joe
* [linux-lvm] Re: PATCH: bdi_congested-dm.patch (was: Re: IO scheduler, queue depth, nr_requests)
2004-02-26 11:28 ` [linux-lvm] PATCH: bdi_congested-dm.patch " Miquel van Smoorenburg
2004-02-26 11:41 ` [linux-lvm] " Joe Thornber
@ 2004-02-26 17:35 ` Andrew Morton
1 sibling, 0 replies; 4+ messages in thread
From: Andrew Morton @ 2004-02-26 17:35 UTC (permalink / raw)
To: Miquel van Smoorenburg; +Cc: thornber, axboe, piggin, linux-lvm
Miquel van Smoorenburg <miquels@cistron.nl> wrote:
>
> + down_read(&md->lock);
> + r = dm_table_any_congested(md->map, bdi_bits);
> + up_read(&md->lock);

This can deadlock if anyone ever performs a __GFP_IO memory allocation
while holding md->lock.

We could enter page reclaim with the lock held, come back in here and take
it again. That's an instant deadlock if we were holding it for write and a
super-rare deadlock if we were holding it for read (requires some other
process to come in and run down_write() in between the two down_read()s).

A quick audit shows that we're OK, I think. What does
dm_table_suspend_targets() do? But still, this restriction needs to be
understood, documented and adhered to in future DM development.

If it gets really sticky in here it would be acceptable to do
down_read_trylock() and return 0 if it fails - the congestion code is only
advisory and 99% is good enough.

I like these patches btw - I was always worried about what the
stacked-device implementation of queue congestion would look like, and this
is nice and simple.