All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Armbruster <armbru@redhat.com>
To: "Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	 "quintela@redhat.com" <quintela@redhat.com>,
	 "peterx@redhat.com" <peterx@redhat.com>,
	"leobras@redhat.com" <leobras@redhat.com>
Subject: Re: [PATCH 39/52] migration/rdma: Convert qemu_rdma_write_one() to Error
Date: Thu, 28 Sep 2023 08:49:02 +0200	[thread overview]
Message-ID: <87ttre3i8x.fsf@pond.sub.org> (raw)
In-Reply-To: <87zg17dejj.fsf@pond.sub.org> (Markus Armbruster's message of "Wed, 27 Sep 2023 13:46:40 +0200")

Let me try to map solutions.

Markus Armbruster <armbru@redhat.com> writes:

> migration/rdma.c uses errno directly or via perror() after the following
> functions:
>
> * poll()
>
>   POSIX specifies errno is set on error.  Good.

Nothing wrong, nothing to do.

> * rdma_get_cm_event(), rdma_connect(), rdma_get_cm_event()
>
>   Manual page promises "if an error occurs, errno will be set".  Good.

Nothing wrong, nothing to do.

> * ibv_open_device()
>
>   Manual page does not mention errno.  Using it seems ill-advised.
>
>   qemu_rdma_broken_ipv6_kernel() recovers from EPERM by trying the next
>   device.  Wrong if ibv_open_device() doesn't actually set errno.
>
>   What is to be done here?

1. Investigate whether ibv_open_device() sets errno.  I can't spare time
   for that.

2. Add a comment pointing out the problem, in the hope somebody
   investigates later.

3. Do nothing.

> * ibv_reg_mr()
>
>   Manual page does not mention errno.  Using it seems ill-advised.
>
>   qemu_rdma_reg_whole_ram_blocks() and qemu_rdma_register_and_get_keys()
>   recover from errno = ENOTSUP by retrying with modified @access
>   argument.  Wrong if ibv_reg_mr() doesn't actually set errno.
>
>   What is to be done here?

Likewise.

> * ibv_get_cq_event()
>
>   Manual page does not mention errno.  Using it seems ill-advised.
>
>   qemu_rdma_block_for_wrid() calls perror().  Removed in PATCH 48.  Good
>   enough.

1. Add a comment pointing out the problem, remove it in PATCH 48.

2. Nothing wrong after the series, nothing to do.

> * ibv_post_send()
>
>   Manual page has the function return "the value of errno on failure".
>   Sounds like it sets errno to the value it returns.  However, the
>   rdma-core repository defines it as
>
>     static inline int ibv_post_send(struct ibv_qp *qp, struct ibv_send_wr *wr,
>                                     struct ibv_send_wr **bad_wr)
>     {
>             return qp->context->ops.post_send(qp, wr, bad_wr);
>     }
>
>   and at least one of the methods fails without setting errno:
>
>     static int mana_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
>                               struct ibv_send_wr **bad)
>     {
>             /* This version of driver supports RAW QP only.
>              * Posting WR is done directly in the application.
>              */
>             return EOPNOTSUPP;
>     }
>
>   qemu_rdma_write_one() calls perror().  PATCH 39 (this one) replaces it
>   by error_setg(), not error_setg_errno().  Seems prudent, but should be
>   called out in the commit message.

1. Add a comment pointing out the problem, remove it in PATCH 39.

2. Pass @ret, not @errno to error_setg_errno().

3. Nothing wrong after the series, nothing to do.

Since 2. is easy, let's do it.

> * ibv_advise_mr()
>
>   Manual page has the function return "the value of errno on failure".
>   Sounds like it sets errno to the value it returns, but my findings for
>   ibv_post_send() make me doubt it.
>
>   qemu_rdma_advise_prefetch_mr() traces strerror(errno).  Could be
>   misleading.  Drop that part?

1. Change sterror(errno) to strerror(ret)

2. Drop strerror(errno)

3. Do nothing.

Since 1. is easy, let's do it.

> * ibv_dereg_mr()
>
>   Manual page has the function return "the value of errno on failure".
>   Sounds like it sets errno to the value it returns, but my findings for
>   ibv_post_send() make me doubt it.
>
>   qemu_rdma_unregister_waiting() calls perror().  Removed in PATCH 51.
>   Good enough.

1. Add a comment pointing out the problem, remove it in PATCH 51.

2. Nothing wrong after the series, nothing to do.

> * qemu_get_cm_event_timeout()
>
>   Can fail without setting errno.
>
>   qemu_rdma_connect() calls perror().  Removed in PATCH 45.  Good
>   enough.
>
> Thoughts?

Considering all of the above...  I'd like to stick a patch documenting
problematic errno use early in the series, and fix all the easy ones
later in the series, leaving just the two difficult ones in
qemu_rdma_broken_ipv6_kernel() and qemu_rdma_reg_whole_ram_blocks().

Makes sense?

> [...]
>
> [*] https://github.com/linux-rdma/rdma-core.git
>     commit 55fa316b4b18f258d8ac1ceb4aa5a7a35b094dcf



  reply	other threads:[~2023-09-28  6:50 UTC|newest]

Thread overview: 174+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-18 14:41 [PATCH 00/52] migration/rdma: Error handling fixes Markus Armbruster
2023-09-18 14:41 ` [PATCH 01/52] migration/rdma: Clean up qemu_rdma_poll()'s return type Markus Armbruster
2023-09-18 16:50   ` Fabiano Rosas
2023-09-21  8:52   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 02/52] migration/rdma: Clean up qemu_rdma_data_init()'s " Markus Armbruster
2023-09-18 16:51   ` Fabiano Rosas
2023-09-21  8:52   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 03/52] migration/rdma: Clean up rdma_delete_block()'s " Markus Armbruster
2023-09-18 16:53   ` Fabiano Rosas
2023-09-21  8:53   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 04/52] migration/rdma: Drop fragile wr_id formatting Markus Armbruster
2023-09-18 17:01   ` Fabiano Rosas
2023-09-21  8:54   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 05/52] migration/rdma: Consistently use uint64_t for work request IDs Markus Armbruster
2023-09-18 17:03   ` Fabiano Rosas
2023-09-19  5:39   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 06/52] migration/rdma: Clean up two more harmless signed vs. unsigned issues Markus Armbruster
2023-09-18 17:10   ` Fabiano Rosas
2023-09-20 13:11     ` Markus Armbruster
2023-09-18 14:41 ` [PATCH 07/52] migration/rdma: Give qio_channel_rdma_source_funcs internal linkage Markus Armbruster
2023-09-18 17:11   ` Fabiano Rosas
2023-09-21  9:00   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 08/52] migration/rdma: Fix qemu_rdma_accept() to return failure on errors Markus Armbruster
2023-09-18 17:15   ` Fabiano Rosas
2023-09-21  9:07   ` Zhijian Li (Fujitsu)
2023-09-28 10:53     ` Markus Armbruster
2023-09-18 14:41 ` [PATCH 09/52] migration/rdma: Put @errp parameter last Markus Armbruster
2023-09-18 17:17   ` Fabiano Rosas
2023-09-21  9:08   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 10/52] migration/rdma: Eliminate error_propagate() Markus Armbruster
2023-09-18 17:20   ` Fabiano Rosas
2023-09-21  9:31   ` Zhijian Li (Fujitsu)
2023-09-27 16:20   ` Eric Blake
2023-09-27 19:02     ` Markus Armbruster
2023-09-18 14:41 ` [PATCH 11/52] migration/rdma: Drop rdma_add_block() error handling Markus Armbruster
2023-09-18 17:32   ` Fabiano Rosas
2023-09-21  9:39   ` Zhijian Li (Fujitsu)
2023-09-21 11:15     ` Markus Armbruster
2023-09-22  7:49       ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 12/52] migration/rdma: Drop qemu_rdma_search_ram_block() " Markus Armbruster
2023-09-18 17:35   ` Fabiano Rosas
2023-09-20 13:11     ` Markus Armbruster
2023-09-22  7:50   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 13/52] migration/rdma: Make qemu_rdma_buffer_mergable() return bool Markus Armbruster
2023-09-18 17:36   ` Fabiano Rosas
2023-09-22  7:51   ` Zhijian Li (Fujitsu)
2023-09-27 16:26   ` Eric Blake
2023-09-27 19:04     ` Markus Armbruster
2023-09-18 14:41 ` [PATCH 14/52] migration/rdma: Use bool for two RDMAContext flags Markus Armbruster
2023-09-18 17:37   ` Fabiano Rosas
2023-09-22  7:54   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 15/52] migration/rdma: Ditch useless numeric error codes in error messages Markus Armbruster
2023-09-18 18:47   ` Fabiano Rosas
2023-09-22  8:44   ` Zhijian Li (Fujitsu)
2023-09-22  9:43     ` Markus Armbruster
2023-09-18 14:41 ` [PATCH 16/52] migration/rdma: Fix io_writev(), io_readv() methods to obey contract Markus Armbruster
2023-09-18 18:57   ` Fabiano Rosas
2023-09-22  8:59   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 17/52] migration/rdma: Replace dangerous macro CHECK_ERROR_STATE() Markus Armbruster
2023-09-18 18:57   ` Fabiano Rosas
2023-09-22  9:01   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 18/52] migration/rdma: Fix qemu_rdma_broken_ipv6_kernel() to set error Markus Armbruster
2023-09-18 19:00   ` Fabiano Rosas
2023-09-22  9:10   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 19/52] migration/rdma: Fix qemu_get_cm_event_timeout() to always " Markus Armbruster
2023-09-19 16:02   ` Peter Xu
2023-09-22  9:12   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 20/52] migration/rdma: Drop dead qemu_rdma_data_init() code for !@host_port Markus Armbruster
2023-09-19 16:02   ` Peter Xu
2023-09-20 13:13     ` Markus Armbruster
2023-09-18 14:41 ` [PATCH 21/52] migration/rdma: Fix QEMUFileHooks method return values Markus Armbruster
2023-09-25  4:08   ` Zhijian Li (Fujitsu)
2023-09-25  6:36     ` Markus Armbruster
2023-09-25  7:03       ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 22/52] migration/rdma: Fix rdma_getaddrinfo() error checking Markus Armbruster
2023-09-25  5:21   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 23/52] migration/rdma: Clean up qemu_rdma_wait_comp_channel()'s error value Markus Armbruster
2023-09-25  5:43   ` Zhijian Li (Fujitsu)
2023-09-25  6:55     ` Markus Armbruster
2023-09-18 14:41 ` [PATCH 24/52] migration/rdma: Return -1 instead of negative errno code Markus Armbruster
2023-09-26 10:15   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 25/52] migration/rdma: Dumb down remaining int error values to -1 Markus Armbruster
2023-09-26 10:16   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 26/52] migration/rdma: Replace int error_state by bool errored Markus Armbruster
2023-09-25  6:17   ` Zhijian Li (Fujitsu)
2023-09-25  7:09     ` Markus Armbruster
2023-09-26 10:18       ` Zhijian Li (Fujitsu)
2023-09-27 17:38   ` Eric Blake
2023-09-27 19:04     ` Markus Armbruster
2023-09-18 14:41 ` [PATCH 27/52] migration/rdma: Drop superfluous assignments to @ret Markus Armbruster
2023-09-25  6:20   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 28/52] migration/rdma: Check negative error values the same way everywhere Markus Armbruster
2023-09-25  6:26   ` Zhijian Li (Fujitsu)
2023-09-25  7:29     ` Markus Armbruster
2023-10-04 16:32       ` Juan Quintela
2023-10-04 16:57         ` Peter Xu
2023-10-05  5:45           ` Markus Armbruster
2023-10-05 14:52             ` Peter Xu
2023-10-05 16:06               ` Markus Armbruster
2023-09-18 14:41 ` [PATCH 29/52] migration/rdma: Plug a memory leak and improve a message Markus Armbruster
2023-09-25  6:31   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 30/52] migration/rdma: Delete inappropriate error_report() in macro ERROR() Markus Armbruster
2023-09-25  6:35   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 31/52] migration/rdma: Retire " Markus Armbruster
2023-09-25  7:31   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 32/52] migration/rdma: Fix error handling around rdma_getaddrinfo() Markus Armbruster
2023-09-25  7:32   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 33/52] migration/rdma: Drop "@errp is clear" guards around error_setg() Markus Armbruster
2023-09-25  7:32   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 34/52] migration/rdma: Convert qemu_rdma_exchange_recv() to Error Markus Armbruster
2023-09-26  1:37   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 35/52] migration/rdma: Convert qemu_rdma_exchange_send() " Markus Armbruster
2023-09-26  1:42   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 36/52] migration/rdma: Convert qemu_rdma_exchange_get_response() " Markus Armbruster
2023-09-26  1:45   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 37/52] migration/rdma: Convert qemu_rdma_reg_whole_ram_blocks() " Markus Armbruster
2023-09-26  1:51   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 38/52] migration/rdma: Convert qemu_rdma_write_flush() " Markus Armbruster
2023-09-26  1:56   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 39/52] migration/rdma: Convert qemu_rdma_write_one() " Markus Armbruster
2023-09-26  5:24   ` Zhijian Li (Fujitsu)
2023-09-26  5:50   ` Zhijian Li (Fujitsu)
2023-09-26  5:55     ` Zhijian Li (Fujitsu)
2023-09-26  9:26       ` Markus Armbruster
2023-09-27 11:46         ` Markus Armbruster
2023-09-28  6:49           ` Markus Armbruster [this message]
2023-10-07  3:40           ` Zhijian Li (Fujitsu)
2023-10-16 12:11             ` Jason Gunthorpe
2023-10-17  5:22               ` Zhijian Li (Fujitsu)
2023-09-26  6:40     ` Markus Armbruster
2023-09-18 14:41 ` [PATCH 40/52] migration/rdma: Convert qemu_rdma_write() " Markus Armbruster
2023-09-26  5:25   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 41/52] migration/rdma: Convert qemu_rdma_post_send_control() " Markus Armbruster
2023-09-26  5:29   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 42/52] migration/rdma: Convert qemu_rdma_post_recv_control() " Markus Armbruster
2023-09-26  5:31   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 43/52] migration/rdma: Convert qemu_rdma_alloc_pd_cq() " Markus Armbruster
2023-09-26  5:43   ` Zhijian Li (Fujitsu)
2023-09-26  6:41     ` Markus Armbruster
2023-09-26  6:55       ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 44/52] migration/rdma: Silence qemu_rdma_resolve_host() Markus Armbruster
2023-09-26  5:44   ` Zhijian Li (Fujitsu)
2023-09-18 14:41 ` [PATCH 45/52] migration/rdma: Silence qemu_rdma_connect() Markus Armbruster
2023-09-26  6:00   ` Zhijian Li (Fujitsu)
2023-09-18 14:42 ` [PATCH 46/52] migration/rdma: Silence qemu_rdma_reg_control() Markus Armbruster
2023-09-26  6:00   ` Zhijian Li (Fujitsu)
2023-09-18 14:42 ` [PATCH 47/52] migration/rdma: Don't report received completion events as error Markus Armbruster
2023-09-26  6:06   ` Zhijian Li (Fujitsu)
2023-09-18 14:42 ` [PATCH 48/52] migration/rdma: Silence qemu_rdma_block_for_wrid() Markus Armbruster
2023-09-26  6:17   ` Zhijian Li (Fujitsu)
2023-09-18 14:42 ` [PATCH 49/52] migration/rdma: Silence qemu_rdma_register_and_get_keys() Markus Armbruster
2023-09-26  6:21   ` Zhijian Li (Fujitsu)
2023-09-18 14:42 ` [PATCH 50/52] migration/rdma: Silence qemu_rdma_cleanup() Markus Armbruster
2023-09-26 10:12   ` Zhijian Li (Fujitsu)
2023-09-26 11:52     ` Markus Armbruster
2023-09-27  1:41       ` Zhijian Li (Fujitsu)
2023-09-27  5:30         ` Markus Armbruster
2023-09-18 14:42 ` [PATCH 51/52] migration/rdma: Use error_report() & friends instead of stderr Markus Armbruster
2023-09-26  6:35   ` Zhijian Li (Fujitsu)
2023-09-27 12:16     ` Markus Armbruster
2023-09-18 14:42 ` [PATCH 52/52] migration/rdma: Fix how we show device details on open Markus Armbruster
2023-09-26  6:49   ` Zhijian Li (Fujitsu)
2023-09-26  9:19     ` Markus Armbruster
2023-09-19 16:49 ` [PATCH 00/52] migration/rdma: Error handling fixes Peter Xu
2023-09-19 18:29   ` Daniel P. Berrangé
2023-09-19 18:57     ` Fabiano Rosas
2023-09-20 13:22     ` Markus Armbruster
2023-10-04 18:00     ` Juan Quintela
2023-10-05  7:14       ` Daniel P. Berrangé
2023-10-31 10:25         ` Juan Quintela
2023-09-21  8:27   ` Zhijian Li (Fujitsu)
2023-09-22 15:21     ` Peter Xu
2023-09-25  8:06       ` Zhijian Li (Fujitsu)
2023-09-28 11:08 ` Markus Armbruster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ttre3i8x.fsf@pond.sub.org \
    --to=armbru@redhat.com \
    --cc=leobras@redhat.com \
    --cc=lizhijian@fujitsu.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.