From: Stefan Hajnoczi <stefanha@redhat.com>
To: Aarushi Mehta <mehta.aaru20@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
Maxim Levitsky <mlevitsk@redhat.com>,
Sergio Lopez <slp@redhat.com>,
qemu-block@nongnu.org, Markus Armbruster <armbru@redhat.com>,
qemu-devel@nongnu.org, saket.sinha89@gmail.com,
Max Reitz <mreitz@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>, Fam Zheng <fam@euphon.net>,
Stefan Hajnoczi <stefan@redhat.com>,
Julia Suvorova <jusual@mail.ru>
Subject: Re: [Qemu-devel] [PATCH v9 16/17] block/io_uring: adds fd registration
Date: Fri, 2 Aug 2019 09:03:41 +0100 [thread overview]
Message-ID: <20190802080341.GD4227@stefanha-x1.localdomain> (raw)
In-Reply-To: <20190801234031.29561-17-mehta.aaru20@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 4984 bytes --]
On Fri, Aug 02, 2019 at 05:10:30AM +0530, Aarushi Mehta wrote:
The fd lifecycle/leak issue remains. After a drive is removed the
kernel still has a reference to the file. If this repeats many times
our process will run out of open files.
A callback is required to unregister the file descriptor in
block/file-posix.c:
static void raw_aio_detach_aio_context(BlockDriverState *bs)
{
#ifdef CONFIG_LINUX_IO_URING
BDRVRawState *s = bs->opaque;
LuringState *luring;
luring = aio_get_linux_io_uring(bdrv_get_aio_context(bs));
if (luring && s->fd >= 0) {
luring_fd_unregister(luring, s->fd);
}
#endif
}
I think this should eliminate fd leaks, but please test it. You can use
drive_add/drive_del and device_add/device_del to hotplug and unplug
-drive and -device objects on the HMP monitor. Use "ls -l /proc/PID/fd"
to see the list of currently open files.
> + g_hash_table_insert(lookup, GINT_TO_POINTER(fd), GINT_TO_POINTER(nr));
> + trace_luring_fd_register(fd, nr);
This trace event can be made even more useful by including
io_uring_register_files()'s return value so we know whether the kernel
accepted fd_array[] or not.
> + return io_uring_register_files(ring, fd_reg->fd_array, nr + 1);
> +}
> +/**
> + * luring_fd_unregister:
> + *
> + * Unregisters file descriptors, TODO: error handling
> + */
> +static void luring_fd_unregister(LuringState *s)
> +{
> + io_uring_unregister_files(&s->ring);
> + g_hash_table_unref(s->fd_reg.fd_lookup);
> + g_free(s->fd_reg.fd_array);
Please use 4-space indentation.
Missing s->fd_reg.fd_array = NULL so that the next g_realloc_n()
allocates a fresh array instead of trying to reallocate a freed pointer.
> +}
> +
> +/**
> + * luring_fd_lookup:
> + *
> + * Used to lookup fd index in registered array at submission time
> + * If the lookup table has not been created or the fd is not in the table,
> + * the fd is registered.
> + *
> + * If registration errors, the hash is cleared and the fd used directly
> + *
> + * Unregistering is done at luring_detach_aio_context
> + */
> +static int luring_fd_lookup(LuringState *s, int fd)
> +{
> + int ret;
> + void *index;
> + GHashTable *lookup;
> +
> + if (!s->fd_reg.fd_lookup) {
> + s->fd_reg.fd_lookup = g_hash_table_new_full(g_direct_hash,
> + g_direct_equal,
> + g_free, g_free);
> + luring_fd_register(&s->ring, &s->fd_reg, fd);
> + }
This if statement can be eliminated:
static void luring_fd_init(LuringState *s)
{
s->fd_reg.fd_lookup = g_hash_table_new_full(g_direct_hash,
g_direct_equal,
g_free, g_free);
}
static void luring_fd_cleanup(LuringState *s)
{
io_uring_unregister_files(&s->ring);
g_hash_table_unref(s->fd_reg.fd_lookup);
g_free(s->fd_reg.fd_array);
s->fd_reg.fd_array = NULL;
}
Call luring_fd_init() from luring_attach_aio_context() and call
luring_fd_cleanup() from luring_detach_aio_context(). This makes
luring_fd_lookup() simpler and gives a nice symmetry to attach/detach.
luring_fd_cleanup() is just luring_fd_unregister() renamed.
> + lookup = s->fd_reg.fd_lookup;
> + index = g_hash_table_lookup(lookup, GINT_TO_POINTER(fd));
> +
> + if (index < 0) {
> + ret = luring_fd_register(&s->ring, &s->fd_reg, fd);
> +
> + if (ret < 0) {
> + if (ret == -ENOMEM || ret == -EMFILE ||
> + ret == -ENXIO) {
> + return ret;
> + } else {
> + /* Should not reach here */
> + g_hash_table_remove_all(lookup);
> + g_free(s->fd_reg.fd_array);
> + return ret;
I suggest making luring_fd_register() clean up after itself when an
error occurs. Then you can change this code to:
if (ret < 0) {
return ret;
}
It's usually convenient for a function to clean up after itself instead
of relying on the caller to do it since only the function knows exactly
what state has been modified so far.
The luring_fd_register() code becomes:
ret = io_uring_register_files(ring, fd_reg->fd_array, nr + 1);
if (ret == -ENOMEM || ret == -EMFILE || ret == -ENXIO) {
/* Leave fd_array[] alone, fd will be overwritten next time anyway */
g_hash_table_remove(lookup, GINT_TO_POINTER(fd));
} else if (ret < 0) {
/* A more severe error, clear out all registered fds */
g_hash_table_remove_all(lookup);
g_free(s->fd_reg.fd_array);
s->fd_reg.fd_array = NULL;
}
return ret;
> + }
> + }
> + index = g_hash_table_lookup(lookup, GINT_TO_POINTER(fd));
One final idea: make luring_fd_register() return the index on success so
callers don't need to look up the key again. In luring_fd_register():
if (ret < 0) {
return ret;
} else {
return nr;
}
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2019-08-02 8:04 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-01 23:40 [Qemu-devel] [PATCH v9 00/17] Add support for io_uring Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 01/17] configure: permit use of io_uring Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 02/17] qapi/block-core: add option for io_uring Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 03/17] block/block: add BDRV flag " Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 04/17] block/io_uring: implements interfaces " Aarushi Mehta
2019-08-07 11:35 ` Julia Suvorova via Qemu-devel
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 05/17] stubs: add stubs for io_uring interface Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 06/17] util/async: add aio interfaces for io_uring Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 07/17] blockdev: adds bdrv_parse_aio to use io_uring Aarushi Mehta
2019-08-07 11:44 ` Julia Suvorova via Qemu-devel
2019-08-07 12:05 ` Aarushi Mehta
2019-08-07 12:49 ` Julia Suvorova via Qemu-devel
2019-10-04 15:06 ` Stefan Hajnoczi
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 08/17] block/file-posix.c: extend " Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 09/17] block: add trace events for io_uring Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 10/17] block/io_uring: adds userspace completion polling Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 11/17] qemu-io: adds option to use aio engine Aarushi Mehta
2019-08-02 6:40 ` Stefan Hajnoczi
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 12/17] qemu-img: adds option to use aio engine for benchmarking Aarushi Mehta
2019-08-02 6:42 ` Stefan Hajnoczi
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 13/17] qemu-nbd: adds option for aio engines Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 14/17] tests/qemu-iotests: enable testing with aio options Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 15/17] tests/qemu-iotests: use AIOMODE with various tests Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 16/17] block/io_uring: adds fd registration Aarushi Mehta
2019-08-02 8:03 ` Stefan Hajnoczi [this message]
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 17/17] block/io_uring: enable kernel submission polling Aarushi Mehta
2019-08-02 8:08 ` Stefan Hajnoczi
2019-08-05 8:29 ` Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190802080341.GD4227@stefanha-x1.localdomain \
--to=stefanha@redhat.com \
--cc=armbru@redhat.com \
--cc=fam@euphon.net \
--cc=jusual@mail.ru \
--cc=kwolf@redhat.com \
--cc=mehta.aaru20@gmail.com \
--cc=mlevitsk@redhat.com \
--cc=mreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=saket.sinha89@gmail.com \
--cc=slp@redhat.com \
--cc=stefan@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).