Linux LVM users
* [linux-lvm] PATCH: bdi_congested-core.patch (was: Re: IO scheduler, queue depth, nr_requests)
From: Miquel van Smoorenburg @ 2004-02-26 11:27 UTC
  To: Jens Axboe; +Cc: Andrew Morton, thornber, piggin, linux-lvm

According to Jens Axboe:
> On Thu, Feb 26 2004, Andrew Morton wrote:
> > Miquel van Smoorenburg <miquels@cistron.nl> wrote:
> > > Well, I'm looking for guidance from Jens first, since I'm not sure if
> > > he wants the first solution (congestion testing passed down), the second
> > > (congestion setting passed up) or both.
> > 
> > The first patch looks nice.  Why do we need the "pass it up" feature?
> 
> Miquel, the newer patches do tend to get too complicated. I suppose I'm
> fine with the first patch as well, modulo the dm part (I'd suggest
> merging it without and talking to Joe about doing it properly).

Okay here you go. I added bdi_rw_congested() for code in xfs and ext2
that calls both bdi_read_congested() and bdi_write_congested() in
a row, and it was "free" anyway.

bdi_congested-core.patch

--- linux-2.6.3.orig/include/linux/backing-dev.h	2004-02-04 04:43:38.000000000 +0100
+++ linux-2.6.3-congested_fn/include/linux/backing-dev.h	2004-02-26 16:17:49.000000000 +0100
@@ -20,10 +20,14 @@ enum bdi_state {
 	BDI_unused,		/* Available bits start here */
 };
 
+typedef int (congested_fn)(void *, int);
+
 struct backing_dev_info {
 	unsigned long ra_pages;	/* max readahead in PAGE_CACHE_SIZE units */
 	unsigned long state;	/* Always use atomic bitops on this */
 	int memory_backed;	/* Cannot clean pages with writepage */
+	congested_fn *congested_fn; /* Function pointer if device is md/dm */
+	void *congested_data;	/* Pointer to aux data for congested func */
 };
 
 extern struct backing_dev_info default_backing_dev_info;
@@ -32,14 +36,27 @@ int writeback_acquire(struct backing_dev
 int writeback_in_progress(struct backing_dev_info *bdi);
 void writeback_release(struct backing_dev_info *bdi);
 
+static inline int bdi_congested(struct backing_dev_info *bdi, int bdi_bits)
+{
+	if (bdi->congested_fn)
+		return bdi->congested_fn(bdi->congested_data, bdi_bits);
+	return (bdi->state & bdi_bits);
+}
+
 static inline int bdi_read_congested(struct backing_dev_info *bdi)
 {
-	return test_bit(BDI_read_congested, &bdi->state);
+	return bdi_congested(bdi, 1 << BDI_read_congested);
 }
 
 static inline int bdi_write_congested(struct backing_dev_info *bdi)
 {
-	return test_bit(BDI_write_congested, &bdi->state);
+	return bdi_congested(bdi, 1 << BDI_write_congested);
+}
+
+static inline int bdi_rw_congested(struct backing_dev_info *bdi)
+{
+	return bdi_congested(bdi, (1 << BDI_read_congested)|
+				  (1 << BDI_write_congested));
 }
 
 #endif		/* _LINUX_BACKING_DEV_H */
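
The dispatch added to bdi_congested() can be modelled outside the kernel. The following is a minimal userspace sketch, not the kernel code: the `model_` names, the two-value enum and the `model_fixed_fn` callback are assumptions made for illustration. It shows that a device without a callback answers from its own state bits, while a device with a callback defers to it entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit numbers only; the real enum bdi_state has other members. */
enum model_bdi_state { MODEL_read_congested, MODEL_write_congested };

typedef int (model_congested_fn)(void *, int);

struct model_bdi {
	unsigned long state;              /* congestion bits of this device */
	model_congested_fn *congested_fn; /* set only for stacked devices */
	void *congested_data;             /* opaque argument for the callback */
};

/* Same shape as bdi_congested() in the patch: callback first, else own bits. */
static int model_bdi_congested(struct model_bdi *bdi, int bdi_bits)
{
	assert(bdi != NULL);
	if (bdi->congested_fn)
		return bdi->congested_fn(bdi->congested_data, bdi_bits);
	return (int)(bdi->state & (unsigned long)bdi_bits);
}

/* One call that tests both directions, like bdi_rw_congested(). */
static int model_bdi_rw_congested(struct model_bdi *bdi)
{
	return model_bdi_congested(bdi, (1 << MODEL_read_congested) |
					(1 << MODEL_write_congested));
}

/* Hypothetical callback: reports the bits the caller stored in *data. */
static int model_fixed_fn(void *data, int bdi_bits)
{
	return *(int *)data & bdi_bits;
}
```

Note that once a callback is installed, the state word of the stacked device itself is no longer consulted; only the callback's answer counts.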


* [linux-lvm] PATCH: bdi_congested-dm.patch (was: Re: IO scheduler, queue depth, nr_requests)
From: Miquel van Smoorenburg @ 2004-02-26 11:28 UTC
  To: thornber; +Cc: Jens Axboe, Andrew Morton, piggin, linux-lvm

According to Miquel van Smoorenburg:
> Okay here you go.
> bdi_congested-core.patch

Here's the dm part:

bdi_congested-dm.patch

--- linux-2.6.3.orig/drivers/md/dm.h	2004-02-04 04:43:45.000000000 +0100
+++ linux-2.6.3-congested_fn/drivers/md/dm.h	2004-02-26 14:22:41.000000000 +0100
@@ -115,6 +115,7 @@ struct list_head *dm_table_get_devices(s
 int dm_table_get_mode(struct dm_table *t);
 void dm_table_suspend_targets(struct dm_table *t);
 void dm_table_resume_targets(struct dm_table *t);
+int dm_table_any_congested(struct dm_table *t, int bdi_bits);
 
 /*-----------------------------------------------------------------
  * A registry of target types.
--- linux-2.6.3.orig/drivers/md/dm-table.c	2004-02-04 04:44:59.000000000 +0100
+++ linux-2.6.3-congested_fn/drivers/md/dm-table.c	2004-02-26 14:22:30.000000000 +0100
@@ -857,6 +857,20 @@ void dm_table_resume_targets(struct dm_t
 	}
 }
 
+int dm_table_any_congested(struct dm_table *t, int bdi_bits)
+{
+	struct list_head *d, *devices;
+	int r = 0;
+ 
+	devices = dm_table_get_devices(t);
+	for (d = devices->next; d != devices; d = d->next) {
+		struct dm_dev *dd = list_entry(d, struct dm_dev, list);
+		request_queue_t *q = bdev_get_queue(dd->bdev);
+		r |= bdi_congested(&q->backing_dev_info, bdi_bits);
+	}
+ 
+	return r;
+}
 
 EXPORT_SYMBOL(dm_get_device);
 EXPORT_SYMBOL(dm_put_device);
--- linux-2.6.3.orig/drivers/md/dm.c	2004-02-22 13:52:15.000000000 +0100
+++ linux-2.6.3-congested_fn/drivers/md/dm.c	2004-02-26 14:26:57.000000000 +0100
@@ -526,6 +526,18 @@ static int dm_request(request_queue_t *q
 	return 0;
 }
 
+static int dm_any_congested(void *congested_data, int bdi_bits)
+{
+	int r;
+	struct mapped_device *md = congested_data;
+ 
+	down_read(&md->lock);
+	r = dm_table_any_congested(md->map, bdi_bits);
+	up_read(&md->lock);
+ 
+	return r;
+}
+
 /*-----------------------------------------------------------------
  * A bitset is used to keep track of allocated minor numbers.
  *---------------------------------------------------------------*/
@@ -608,6 +620,8 @@ static struct mapped_device *alloc_dev(u
 	}
 
 	md->queue->queuedata = md;
+	md->queue->backing_dev_info.congested_fn = dm_any_congested;
+	md->queue->backing_dev_info.congested_data = md;
 	blk_queue_make_request(md->queue, dm_request);
 
 	md->io_pool = mempool_create(MIN_IOS, mempool_alloc_slab,
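
The aggregation in dm_table_any_congested() amounts to OR-ing the answers of every underlying device. A rough userspace sketch of that idea follows; the `model_` types and the singly linked list are assumptions standing in for struct dm_dev and the kernel list helpers, and no locking is modelled.

```c
#include <assert.h>
#include <stddef.h>

struct model_bdi {
	unsigned long state;	/* congestion bits of one lower device */
};

/* Hypothetical stand-in for the dm device list; not the kernel's list_head. */
struct model_dev {
	struct model_bdi bdi;
	struct model_dev *next;
};

/*
 * Report the union of the requested congestion bits over all lower
 * devices: the stacked device counts as congested if any member is.
 */
static int model_table_any_congested(const struct model_dev *devices,
				     int bdi_bits)
{
	int r = 0;

	assert(bdi_bits != 0);	/* asking for no bits is a caller bug */
	for (const struct model_dev *d = devices; d != NULL; d = d->next)
		r |= (int)(d->bdi.state & (unsigned long)bdi_bits);
	return r;
}
```

The result is the set of matching bits rather than a strict 0/1 value, which mirrors the `r |= ...` accumulation in the patch; callers only test it for non-zero.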


* [linux-lvm] Re: PATCH: bdi_congested-dm.patch (was: Re: IO scheduler, queue depth, nr_requests)
From: Joe Thornber @ 2004-02-26 11:41 UTC
  To: Miquel van Smoorenburg
  Cc: thornber, Jens Axboe, Andrew Morton, piggin, linux-lvm

On Thu, Feb 26, 2004 at 05:29:52PM +0100, Miquel van Smoorenburg wrote:
> According to Miquel van Smoorenburg:
> > Okay here you go.
> > bdi_congested-core.patch
> 
> Here's the dm part:
> 
> bdi_congested-dm.patch

Looks good to me.

- Joe


* [linux-lvm] Re: PATCH: bdi_congested-dm.patch (was: Re: IO scheduler, queue depth, nr_requests)
From: Andrew Morton @ 2004-02-26 17:35 UTC
  To: Miquel van Smoorenburg; +Cc: thornber, axboe, piggin, linux-lvm

Miquel van Smoorenburg <miquels@cistron.nl> wrote:
>
> +	down_read(&md->lock);
> +	r = dm_table_any_congested(md->map, bdi_bits);
> +	up_read(&md->lock);

This can deadlock if anyone ever performs a __GFP_IO memory allocation
while holding md->lock.

We could enter page reclaim with the lock held, come back in here and take
it again.  That's an instant deadlock if we were holding it for write and a
super-rare deadlock if we were holding it for read (requires some other
process to come in and run down_write() in between the two down_read()s).

A quick audit shows that we're OK, I think.  What does
dm_table_suspend_targets() do?  But still, this restriction needs to be
understood, documented and adhered to in future DM development.

If it gets really sticky in here it would be acceptable to do
down_read_trylock() and return 0 if it fails - the congestion code is only
advisory and 99% is good enough.


I like these patches btw - I was always worried about what the
stacked-device implementation of queue congestion would look like, and this
is nice and simple.
