From: Andrew Morton <akpm@linux-foundation.org>
To: Kent Overstreet <koverstreet@google.com>
Cc: linux-kernel@vger.kernel.org, linux-aio@kvack.org,
linux-fsdevel@vger.kernel.org, zab@redhat.com, bcrl@kvack.org,
jmoyer@redhat.com, axboe@kernel.dk, viro@zeniv.linux.org.uk,
tytso@mit.edu
Subject: Re: [PATCH 14/32] aio: Make aio_read_evt() more efficient, convert to hrtimers
Date: Thu, 3 Jan 2013 15:19:20 -0800
Message-ID: <20130103151920.ae731c2c.akpm@linux-foundation.org>
In-Reply-To: <1356573611-18590-16-git-send-email-koverstreet@google.com>
On Wed, 26 Dec 2012 17:59:52 -0800
Kent Overstreet <koverstreet@google.com> wrote:
> Previously, aio_read_event() pulled a single completion off the
> ringbuffer at a time, locking and unlocking each time. Changed it to
> pull off as many events as it can at a time, and copy them directly to
> userspace.
>
> This also fixes a bug where if copying the event to userspace failed,
> we'd lose the event.
>
> Also convert it to wait_event_interruptible_hrtimeout(), which
> simplifies it quite a bit.
>
> ...
>
> -static int aio_read_evt(struct kioctx *ioctx, struct io_event *ent)
> +static int aio_read_events_ring(struct kioctx *ctx,
> + struct io_event __user *event, long nr)
> {
> - struct aio_ring_info *info = &ioctx->ring_info;
> + struct aio_ring_info *info = &ctx->ring_info;
> struct aio_ring *ring;
> - unsigned long head;
> - int ret = 0;
> + unsigned head, pos;
> + int ret = 0, copy_ret;
> +
> + if (!mutex_trylock(&info->ring_lock)) {
> + __set_current_state(TASK_RUNNING);
> + mutex_lock(&info->ring_lock);
> + }
You're not big on showing your homework, I see :(
I agree that calling mutex_lock() in state TASK_[UN]INTERRUPTIBLE is at
least poor practice. Assuming this is what the code is trying to do.
But if aio_read_events_ring() is indeed called in state
TASK_[UN]INTERRUPTIBLE then the effect of the above code is to put the
task into an *unknown* state.
IOW, I don't have the foggiest clue what you're trying to do here and
you owe us all a code comment. At least.
> ring = kmap_atomic(info->ring_pages[0]);
> - pr_debug("h%u t%u m%u\n", ring->head, ring->tail, ring->nr);
> + head = ring->head;
> + kunmap_atomic(ring);
> +
> + pr_debug("h%u t%u m%u\n", head, info->tail, info->nr);
>
> - if (ring->head == ring->tail)
> + if (head == info->tail)
> goto out;
>
> - spin_lock(&info->ring_lock);
> -
> - head = ring->head % info->nr;
> - if (head != ring->tail) {
> - struct io_event *evp = aio_ring_event(info, head);
> - *ent = *evp;
> - head = (head + 1) % info->nr;
> - smp_mb(); /* finish reading the event before updatng the head */
> - ring->head = head;
> - ret = 1;
> - put_aio_ring_event(evp);
> + __set_current_state(TASK_RUNNING);
> +
> + while (ret < nr) {
> + unsigned i = (head < info->tail ? info->tail : info->nr) - head;
> + struct io_event *ev;
> + struct page *page;
> +
> + if (head == info->tail)
> + break;
> +
> + i = min_t(int, i, nr - ret);
> + i = min_t(int, i, AIO_EVENTS_PER_PAGE -
> + ((head + AIO_EVENTS_OFFSET) % AIO_EVENTS_PER_PAGE));
min_t() is kernel shorthand for "I screwed up my types". Methinks
`ret' should have long type. Or, better, unsigned (negative makes no
sense). And when a C programmer sees a variable called "i" he thinks
it has type "int", so that guy should be renamed.
Can we please clean all this up?
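A minimal user-space sketch of the sort of cleanup asked for above, not the patch's actual fix: every count is kept as plain unsigned, so an ordinary minimum works with no casts, and the batch size gets a descriptive name instead of "i". The names chunk_len(), min_u(), EVENTS_PER_PAGE and EVENTS_OFFSET are hypothetical stand-ins; in the kernel the two constants derive from PAGE_SIZE and the sizes of the ring structures, and the values used here are illustrative only.

```c
#include <assert.h>

/* Hypothetical layout constants, chosen only for illustration. */
#define EVENTS_PER_PAGE 128u
#define EVENTS_OFFSET   1u

/* Plain unsigned minimum: no type juggling needed once all operands agree. */
static unsigned min_u(unsigned a, unsigned b)
{
	return a < b ? a : b;
}

/*
 * Number of events that can be copied in one contiguous chunk.
 *
 * head, tail: ring indices, both in [0, ring_nr)
 * ring_nr:    number of slots in the ring
 * wanted:     total events the caller asked for
 * copied:     events already copied on earlier iterations
 */
static unsigned chunk_len(unsigned head, unsigned tail, unsigned ring_nr,
			  unsigned wanted, unsigned copied)
{
	unsigned avail, page_left;

	assert(head < ring_nr && tail < ring_nr);

	/* Ring empty, or the caller already has everything it asked for. */
	if (head == tail || copied >= wanted)
		return 0;

	/* Events up to the tail, or up to the end of the ring if it wraps. */
	avail = (head < tail ? tail : ring_nr) - head;

	/* Events left on the page that holds slot `head`. */
	page_left = EVENTS_PER_PAGE - (head + EVENTS_OFFSET) % EVENTS_PER_PAGE;

	return min_u(min_u(avail, wanted - copied), page_left);
}
```

With these illustrative constants, a reader at head 250 in a 256-slot ring whose tail has wrapped to 5 has six events before the end of the ring, but only five fit on the current page, so the chunk is five. A real function would still need a signed return value, or a separate error path, to report -EFAULT.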
> + pos = head + AIO_EVENTS_OFFSET;
> + page = info->ring_pages[pos / AIO_EVENTS_PER_PAGE];
> + pos %= AIO_EVENTS_PER_PAGE;
> +
> + ev = kmap(page);
> + copy_ret = copy_to_user(event + ret, ev + pos, sizeof(*ev) * i);
> + kunmap(page);
> +
> + if (unlikely(copy_ret)) {
> + ret = -EFAULT;
> + goto out;
> + }
> +
> + ret += i;
> + head += i;
> + head %= info->nr;
> }
> - spin_unlock(&info->ring_lock);
>
> -out:
> + ring = kmap_atomic(info->ring_pages[0]);
> + ring->head = head;
> kunmap_atomic(ring);
> - pr_debug("%d h%u t%u\n", ret, ring->head, ring->tail);
> +
> + pr_debug("%d h%u t%u\n", ret, head, info->tail);
> +out:
> + mutex_unlock(&info->ring_lock);
> +
> return ret;
> }
>
> ...
>