qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Aarushi Mehta <mehta.aaru20@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	Sergio Lopez <slp@redhat.com>,
	qemu-block@nongnu.org, Markus Armbruster <armbru@redhat.com>,
	qemu-devel@nongnu.org, saket.sinha89@gmail.com,
	Max Reitz <mreitz@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Fam Zheng <fam@euphon.net>,
	Stefan Hajnoczi <stefan@redhat.com>,
	Julia Suvorova <jusual@mail.ru>
Subject: Re: [Qemu-devel] [PATCH v9 16/17] block/io_uring: adds fd registration
Date: Fri, 2 Aug 2019 09:03:41 +0100	[thread overview]
Message-ID: <20190802080341.GD4227@stefanha-x1.localdomain> (raw)
In-Reply-To: <20190801234031.29561-17-mehta.aaru20@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4984 bytes --]

On Fri, Aug 02, 2019 at 05:10:30AM +0530, Aarushi Mehta wrote:

The fd lifecycle/leak issue remains.  After a drive is removed the
kernel still has a reference to the file.  If this repeats many times
our process will run out of open files.

A callback is required to unregister the file descriptor in
block/file-posix.c:

  static void raw_aio_detach_aio_context(BlockDriverState *bs)
  {
  #ifdef CONFIG_LINUX_IO_URING
      BDRVRawState *s = bs->opaque;
      LuringState *luring;

      luring = aio_get_linux_io_uring(bdrv_get_aio_context(bs));

      if (luring && s->fd >= 0) {
          luring_fd_unregister(luring, s->fd);
      }
  #endif
  }

I think this should eliminate fd leaks, but please test it.  You can use
drive_add/drive_del and device_add/device_del to hotplug and unplug
-drive and -device objects on the HMP monitor.  Use "ls -l /proc/PID/fd"
to see the list of currently open files.

> +    g_hash_table_insert(lookup, GINT_TO_POINTER(fd), GINT_TO_POINTER(nr));
> +    trace_luring_fd_register(fd, nr);

This trace event can be made even more useful by including
io_uring_register_files()'s return value so we know whether the kernel
accepted fd_array[] or not.

> +    return io_uring_register_files(ring, fd_reg->fd_array, nr + 1);
> +}
> +/**
> + * luring_fd_unregister:
> + *
> + * Unregisters file descriptors, TODO: error handling
> + */
> +static void luring_fd_unregister(LuringState *s)
> +{
> +        io_uring_unregister_files(&s->ring);
> +        g_hash_table_unref(s->fd_reg.fd_lookup);
> +        g_free(s->fd_reg.fd_array);

Please use 4-space indentation.

Missing s->fd_reg.fd_array = NULL so that the next g_realloc_n()
allocates a fresh array instead of trying to reallocate a freed pointer.

> +}
> +
> +/**
> + * luring_fd_lookup:
> + *
> + * Used to lookup fd index in registered array at submission time
> + * If the lookup table has not been created or the fd is not in the table,
> + * the fd is registered.
> + *
> + * If registration errors, the hash is cleared and the fd used directly
> + *
> + * Unregistering is done at luring_detach_aio_context
> + */
> +static int luring_fd_lookup(LuringState *s, int fd)
> +{
> +    int ret;
> +    void *index;
> +    GHashTable *lookup;
> +
> +    if (!s->fd_reg.fd_lookup) {
> +        s->fd_reg.fd_lookup = g_hash_table_new_full(g_direct_hash,
> +                                                    g_direct_equal,
> +                                                    g_free, g_free);
> +        luring_fd_register(&s->ring, &s->fd_reg, fd);
> +    }

This if statement can be eliminated:

  static void luring_fd_init(LuringState *s)
  {
      s->fd_reg.fd_lookup = g_hash_table_new_full(g_direct_hash,
                                                  g_direct_equal,
						  g_free, g_free);
  }

  static void luring_fd_cleanup(LuringState *s)
  {
      io_uring_unregister_files(&s->ring);
      g_hash_table_unref(s->fd_reg.fd_lookup);
      g_free(s->fd_reg.fd_array);
      s->fd_reg.fd_array = NULL;
  }

Call luring_fd_init() from luring_attach_aio_context() and call
luring_fd_cleanup() from luring_detach_aio_context().  This makes
luring_fd_lookup() simpler and gives a nice symmetry to attach/detach.

luring_fd_cleanup() is just luring_fd_unregister() renamed.

> +    lookup = s->fd_reg.fd_lookup;
> +    index = g_hash_table_lookup(lookup, GINT_TO_POINTER(fd));
> +
> +    if (index < 0) {
> +        ret = luring_fd_register(&s->ring, &s->fd_reg, fd);
> +
> +        if (ret < 0) {
> +            if (ret == -ENOMEM || ret == -EMFILE ||
> +                ret == -ENXIO) {
> +                return ret;
> +            } else {
> +                /* Should not reach here */
> +                g_hash_table_remove_all(lookup);
> +                g_free(s->fd_reg.fd_array);
> +                return ret;

I suggest making luring_fd_register() clean up after itself when an
error occurs.  Then you can change this code to:

  if (ret < 0) {
      return ret;
  }

It's usually convenient for a function to clean up after itself instead
of relying on the caller to do it since only the function knows exactly
what state has been modified so far.

The luring_fd_register() code becomes:

  ret = io_uring_register_files(ring, fd_reg->fd_array, nr + 1);
  if (ret == -ENOMEM || ret == -EMFILE || ret == -ENXIO) {
      /* Leave fd_array[] alone, fd will be overwritten next time anyway */
      g_hash_table_remove(lookup, GINT_TO_POINTER(fd));
  } else if (ret < 0) {
      /* A more severe error, clear out all registered fds */
      g_hash_table_remove_all(lookup);
      g_free(s->fd_reg.fd_array);
      s->fd_reg.fd_array = NULL;
  }
  return ret;

> +            }
> +        }
> +        index = g_hash_table_lookup(lookup, GINT_TO_POINTER(fd));

One final idea: make luring_fd_register() return the index on success so
callers don't need to look up the key again.  In luring_fd_register():

  if (ret < 0) {
      return ret;
  } else {
      return nr;
  }

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2019-08-02  8:04 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-01 23:40 [Qemu-devel] [PATCH v9 00/17] Add support for io_uring Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 01/17] configure: permit use of io_uring Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 02/17] qapi/block-core: add option for io_uring Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 03/17] block/block: add BDRV flag " Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 04/17] block/io_uring: implements interfaces " Aarushi Mehta
2019-08-07 11:35   ` Julia Suvorova via Qemu-devel
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 05/17] stubs: add stubs for io_uring interface Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 06/17] util/async: add aio interfaces for io_uring Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 07/17] blockdev: adds bdrv_parse_aio to use io_uring Aarushi Mehta
2019-08-07 11:44   ` Julia Suvorova via Qemu-devel
2019-08-07 12:05     ` Aarushi Mehta
2019-08-07 12:49       ` Julia Suvorova via Qemu-devel
2019-10-04 15:06         ` Stefan Hajnoczi
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 08/17] block/file-posix.c: extend " Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 09/17] block: add trace events for io_uring Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 10/17] block/io_uring: adds userspace completion polling Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 11/17] qemu-io: adds option to use aio engine Aarushi Mehta
2019-08-02  6:40   ` Stefan Hajnoczi
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 12/17] qemu-img: adds option to use aio engine for benchmarking Aarushi Mehta
2019-08-02  6:42   ` Stefan Hajnoczi
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 13/17] qemu-nbd: adds option for aio engines Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 14/17] tests/qemu-iotests: enable testing with aio options Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 15/17] tests/qemu-iotests: use AIOMODE with various tests Aarushi Mehta
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 16/17] block/io_uring: adds fd registration Aarushi Mehta
2019-08-02  8:03   ` Stefan Hajnoczi [this message]
2019-08-01 23:40 ` [Qemu-devel] [PATCH v9 17/17] block/io_uring: enable kernel submission polling Aarushi Mehta
2019-08-02  8:08   ` Stefan Hajnoczi
2019-08-05  8:29   ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190802080341.GD4227@stefanha-x1.localdomain \
    --to=stefanha@redhat.com \
    --cc=armbru@redhat.com \
    --cc=fam@euphon.net \
    --cc=jusual@mail.ru \
    --cc=kwolf@redhat.com \
    --cc=mehta.aaru20@gmail.com \
    --cc=mlevitsk@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=saket.sinha89@gmail.com \
    --cc=slp@redhat.com \
    --cc=stefan@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).