From: Jens Axboe <jens.axboe@oracle.com>
To: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: Arnd Bergmann <arnd@arndb.de>,
Geoff Levand <geoffrey.levand@am.sony.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Linux Kernel Development <linux-kernel@vger.kernel.org>,
Jim Paris <jim@jtan.com>,
Linux/PPC Development <linuxppc-dev@ozlabs.org>,
linux-mtd@lists.infradead.org,
Vivien Chappelier <vivien.chappelier@free.fr>,
David Woodhouse <dwmw2@infradead.org>,
Cell Broadband Engine OSS Development <cbe-oss-dev@ozlabs.org>
Subject: Re: [PATCH/RFC] ps3/block: Add ps3vram-ng driver for accessing video RAM as block device
Date: Fri, 6 Mar 2009 08:46:39 +0100
Message-ID: <20090306074639.GN11787@kernel.dk>
In-Reply-To: <alpine.LRH.2.00.0903051325450.2618@vixen.sonytel.be>
On Thu, Mar 05 2009, Geert Uytterhoeven wrote:
> On Thu, 5 Mar 2009, Jens Axboe wrote:
> > On Thu, Mar 05 2009, Geert Uytterhoeven wrote:
> > > On Thu, 5 Mar 2009, Jens Axboe wrote:
> > > > On Wed, Mar 04 2009, Geert Uytterhoeven wrote:
> > > > > Below is the rewrite of the PS3 Video RAM Storage Driver as a plain block
> > > > > device, as requested by Arnd Bergmann.
>
> > > > I'd rewrite this as a ->make_request_fn handler instead. Then you can
> > > > get rid of the kernel thread. IOW, change
> > > >
> > > > queue = blk_init_queue(ps3vram_request, &priv->lock);
> > > >
> > > > to
> > > >
> > > > queue = blk_alloc_queue(GFP_KERNEL);
> > > > blk_queue_make_request(queue, ps3vram_make_request);
> > >
> > > Thanks, I didn't know that part...
> > >
> > > > Add error handling of course, and call blk_queue_max_*() to set your
> > > > limits for this device.
> > >
> > > I took out the blk_queue_max_*() calls (compared to ps3disk.c), as
> > > none of the limits apply, and the defaults are fine.
> > >
> > > Is that OK, or is it better to make it explicit?
> >
> > I think it's always good to make it explicit. Plus for this case you
> > definitely need it, as blk_init_queue() won't do it for you anymore.
>
> blk_queue_make_request() does it for me, too:
>
> void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn)
> {
> ...
> blk_queue_max_phys_segments(q, MAX_PHYS_SEGMENTS);
> blk_queue_max_hw_segments(q, MAX_HW_SEGMENTS);
> ...
> blk_queue_max_segment_size(q, MAX_SEGMENT_SIZE);
> ...
> blk_queue_max_sectors(q, SAFE_MAX_SECTORS);
> ...
> }
>
> struct request_queue *
> blk_init_queue_node(request_fn_proc *rfn, spinlock_t *lock, int node_id)
> {
> ...
> blk_queue_max_segment_size(q, MAX_SEGMENT_SIZE);
>
> blk_queue_max_hw_segments(q, MAX_HW_SEGMENTS);
> blk_queue_max_phys_segments(q, MAX_PHYS_SEGMENTS);
> ...
> }
Indeed, there's some duplicated code in blk_init_queue_node(), I'll make
sure to get rid of that!
> > > > Then add a ps3vram_make_request() ala:
> > >
> > > > static void ps3vram_do_request(struct request_queue *q, struct bio *bio)
> > > > {
>
> > > > }
> > > >
> > > > I just typed it here, so if it doesn't compile you get to keep the
> > > > pieces :-)
> > >
> > > OK, I'll give it a try...
> > >
> > > BTW, does this mean the `simple' way, which I used based on LDD3, is
> > > deprecated?
> >
> > Depends.. It's obviously not a very effective approach, since you punt
> > to a thread for each request. But if you need the IO scheduler helping
> > you with merging and sorting (for a rotational device), it still has
> > some merit. For this particular case, the ->make_request_fn approach is
> > much better.
>
> Without the thread, performance indeed increased.
>
> But then I noticed ps3vram_make_request() may be called concurrently,
> so I had to add a mutex to avoid data corruption. This slows the
> driver down, and in the end, the version with a thread turns out to be
> ca. 1% faster. The version without a thread is about 50 lines less
> code, though.
That is correct, ->make_request_fn may get reentered. I'm not surprised
that performance dropped if you just shoved everything under a mutex.
You could be a little smarter and queue concurrent bios for processing
when the current one is complete, though. There are several approaches
there that would be a lot faster than going all the way through the IO
stack and scheduler just to avoid concurrency.
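
One way to picture the "queue concurrent bios" idea is the rough userspace
sketch below. It is only an illustration under stated assumptions, not the
driver's code: struct fake_bio, struct fake_dev, fake_do_bio() and
fake_make_request() are hypothetical names, and a pthread mutex stands in
for whatever short lock a real driver would use. A caller that finds the
list non-empty just appends its bio and returns; the caller that found it
empty processes bios one at a time until the list drains, so the slow part
never runs concurrently and nobody sleeps on a lock while it runs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio: one unit of I/O work. */
struct fake_bio {
	struct fake_bio *next;
	int sector;
};

/* Hypothetical per-device state. */
struct fake_dev {
	pthread_mutex_t lock;	/* protects head/tail, held only briefly */
	struct fake_bio *head;
	struct fake_bio *tail;
	int processed;		/* touched only by the current drainer */
	int last_sector;
};

#define FAKE_DEV_INIT { PTHREAD_MUTEX_INITIALIZER, NULL, NULL, 0, -1 }

/* The slow part: runs without the lock, but never concurrently. */
static void fake_do_bio(struct fake_dev *dev, struct fake_bio *bio)
{
	dev->processed++;
	dev->last_sector = bio->sector;
}

/* Entry point that may be called from several threads at once. */
static void fake_make_request(struct fake_dev *dev, struct fake_bio *bio)
{
	int busy;

	bio->next = NULL;

	pthread_mutex_lock(&dev->lock);
	busy = dev->head != NULL;	/* non-empty list == someone is draining */
	if (dev->tail)
		dev->tail->next = bio;
	else
		dev->head = bio;
	dev->tail = bio;
	pthread_mutex_unlock(&dev->lock);

	if (busy)
		return;		/* the current drainer will pick this bio up */

	/* We are the drainer; the bio being processed is always the head. */
	while (bio) {
		fake_do_bio(dev, bio);

		pthread_mutex_lock(&dev->lock);
		assert(dev->head == bio);
		/* Pop only after the work is done, so the list stays
		 * non-empty (and thus "busy") while we are working. */
		dev->head = bio->next;
		if (!dev->head)
			dev->tail = NULL;
		bio = dev->head;
		pthread_mutex_unlock(&dev->lock);
	}
}
```

The lock is only held for the list manipulation, so a second caller pays
for a list append instead of waiting for the whole transfer to finish.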
--
Jens Axboe