* Re: [PATCH v2 1/1] nbd: trace long NBD operations
2022-05-30 10:39 [PATCH v2 1/1] nbd: trace long NBD operations Denis V. Lunev
@ 2022-06-02 10:32 ` Vladimir Sementsov-Ogievskiy
2022-06-06 14:36 ` Denis V. Lunev
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-06-02 10:32 UTC (permalink / raw)
To: Denis V. Lunev, qemu-block, qemu-devel
Cc: Eric Blake, Kevin Wolf, Hanna Reitz, Paolo Bonzini
On 5/30/22 13:39, Denis V. Lunev wrote:
> At the moment there are 2 sources of lengthy operations if configured:
> * open connection, which could retry inside and
> * reconnect of already opened connection
> These operations could be quite lengthy and cumbersome to catch thus
> it would be quite natural to add trace points for them.
>
> This patch is based on the original downstream work made by Vladimir.
>
> Signed-off-by: Denis V. Lunev<den@openvz.org>
> CC: Eric Blake<eblake@redhat.com>
> CC: Vladimir Sementsov-Ogievskiy<vsementsov@yandex-team.ru>
> CC: Kevin Wolf<kwolf@redhat.com>
> CC: Hanna Reitz<hreitz@redhat.com>
> CC: Paolo Bonzini<pbonzini@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2 1/1] nbd: trace long NBD operations
2022-05-30 10:39 [PATCH v2 1/1] nbd: trace long NBD operations Denis V. Lunev
2022-06-02 10:32 ` Vladimir Sementsov-Ogievskiy
@ 2022-06-06 14:36 ` Denis V. Lunev
2022-06-13 11:29 ` Denis V. Lunev
2022-06-14 7:48 ` Vladimir Sementsov-Ogievskiy
3 siblings, 0 replies; 5+ messages in thread
From: Denis V. Lunev @ 2022-06-06 14:36 UTC (permalink / raw)
To: qemu-block, qemu-devel
Cc: Eric Blake, Vladimir Sementsov-Ogievskiy, Kevin Wolf, Hanna Reitz,
Paolo Bonzini
On 30.05.2022 12:39, Denis V. Lunev wrote:
> At the moment there are 2 sources of lengthy operations if configured:
> * open connection, which could retry inside and
> * reconnect of already opened connection
> These operations could be quite lengthy and cumbersome to catch thus
> it would be quite natural to add trace points for them.
>
> This patch is based on the original downstream work made by Vladimir.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Eric Blake <eblake@redhat.com>
> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Hanna Reitz <hreitz@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/nbd.c | 6 +++++-
> block/trace-events | 2 ++
> nbd/client-connection.c | 2 ++
> nbd/trace-events | 3 +++
> 4 files changed, 12 insertions(+), 1 deletion(-)
>
> Changes from v1:
> - %d -> %u for in_flights
> - renamed open trace-point + fixed timeout type
> - moved trace_nbd_reconnect_attempt_result above
>
> diff --git a/block/nbd.c b/block/nbd.c
> index 512a53bf73..19e773d602 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -371,6 +371,7 @@ static bool nbd_client_connecting(BDRVNBDState *s)
> /* Called with s->requests_lock taken. */
> static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
> {
> + int ret;
> bool blocking = s->state == NBD_CLIENT_CONNECTING_WAIT;
>
> /*
> @@ -380,6 +381,8 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
> assert(nbd_client_connecting(s));
> assert(s->in_flight == 1);
>
> + trace_nbd_reconnect_attempt(s->bs->in_flight);
> +
> if (blocking && !s->reconnect_delay_timer) {
> /*
> * It's the first reconnect attempt after switching to
> @@ -401,7 +404,8 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
> }
>
> qemu_mutex_unlock(&s->requests_lock);
> - nbd_co_do_establish_connection(s->bs, blocking, NULL);
> + ret = nbd_co_do_establish_connection(s->bs, blocking, NULL);
> + trace_nbd_reconnect_attempt_result(ret, s->bs->in_flight);
> qemu_mutex_lock(&s->requests_lock);
>
> /*
> diff --git a/block/trace-events b/block/trace-events
> index 549090d453..48dbf10c66 100644
> --- a/block/trace-events
> +++ b/block/trace-events
> @@ -172,6 +172,8 @@ nbd_read_reply_entry_fail(int ret, const char *err) "ret = %d, err: %s"
> nbd_co_request_fail(uint64_t from, uint32_t len, uint64_t handle, uint16_t flags, uint16_t type, const char *name, int ret, const char *err) "Request failed { .from = %" PRIu64", .len = %" PRIu32 ", .handle = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) } ret = %d, err: %s"
> nbd_client_handshake(const char *export_name) "export '%s'"
> nbd_client_handshake_success(const char *export_name) "export '%s'"
> +nbd_reconnect_attempt(unsigned in_flight) "in_flight %u"
> +nbd_reconnect_attempt_result(int ret, unsigned in_flight) "ret %d in_flight %u"
>
> # ssh.c
> ssh_restart_coroutine(void *co) "co=%p"
> diff --git a/nbd/client-connection.c b/nbd/client-connection.c
> index 2a632931c3..0c5f917efa 100644
> --- a/nbd/client-connection.c
> +++ b/nbd/client-connection.c
> @@ -23,6 +23,7 @@
> */
>
> #include "qemu/osdep.h"
> +#include "trace.h"
>
> #include "block/nbd.h"
>
> @@ -210,6 +211,7 @@ static void *connect_thread_func(void *opaque)
> object_unref(OBJECT(conn->sioc));
> conn->sioc = NULL;
> if (conn->do_retry && !conn->detached) {
> + trace_nbd_connect_thread_sleep(timeout);
> qemu_mutex_unlock(&conn->mutex);
>
> sleep(timeout);
> diff --git a/nbd/trace-events b/nbd/trace-events
> index c4919a2dd5..b7032ca277 100644
> --- a/nbd/trace-events
> +++ b/nbd/trace-events
> @@ -73,3 +73,6 @@ nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *n
> nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len) "Payload received: handle = %" PRIu64 ", len = %" PRIu32
> nbd_co_receive_align_compliance(const char *op, uint64_t from, uint32_t len, uint32_t align) "client sent non-compliant unaligned %s request: from=0x%" PRIx64 ", len=0x%" PRIx32 ", align=0x%" PRIx32
> nbd_trip(void) "Reading request"
> +
> +# client-connection.c
> +nbd_connect_thread_sleep(uint64_t timeout) "timeout %" PRIu64
ping
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2 1/1] nbd: trace long NBD operations
2022-05-30 10:39 [PATCH v2 1/1] nbd: trace long NBD operations Denis V. Lunev
2022-06-02 10:32 ` Vladimir Sementsov-Ogievskiy
2022-06-06 14:36 ` Denis V. Lunev
@ 2022-06-13 11:29 ` Denis V. Lunev
2022-06-14 7:48 ` Vladimir Sementsov-Ogievskiy
3 siblings, 0 replies; 5+ messages in thread
From: Denis V. Lunev @ 2022-06-13 11:29 UTC (permalink / raw)
To: qemu-block, qemu-devel
Cc: Eric Blake, Vladimir Sementsov-Ogievskiy, Kevin Wolf, Hanna Reitz,
Paolo Bonzini
On 30.05.2022 12:39, Denis V. Lunev wrote:
> At the moment there are 2 sources of lengthy operations if configured:
> * open connection, which could retry inside and
> * reconnect of already opened connection
> These operations could be quite lengthy and cumbersome to catch thus
> it would be quite natural to add trace points for them.
>
> This patch is based on the original downstream work made by Vladimir.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Eric Blake <eblake@redhat.com>
> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Hanna Reitz <hreitz@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/nbd.c | 6 +++++-
> block/trace-events | 2 ++
> nbd/client-connection.c | 2 ++
> nbd/trace-events | 3 +++
> 4 files changed, 12 insertions(+), 1 deletion(-)
>
> Changes from v1:
> - %d -> %u for in_flights
> - renamed open trace-point + fixed timeout type
> - moved trace_nbd_reconnect_attempt_result above
>
> diff --git a/block/nbd.c b/block/nbd.c
> index 512a53bf73..19e773d602 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -371,6 +371,7 @@ static bool nbd_client_connecting(BDRVNBDState *s)
> /* Called with s->requests_lock taken. */
> static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
> {
> + int ret;
> bool blocking = s->state == NBD_CLIENT_CONNECTING_WAIT;
>
> /*
> @@ -380,6 +381,8 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
> assert(nbd_client_connecting(s));
> assert(s->in_flight == 1);
>
> + trace_nbd_reconnect_attempt(s->bs->in_flight);
> +
> if (blocking && !s->reconnect_delay_timer) {
> /*
> * It's the first reconnect attempt after switching to
> @@ -401,7 +404,8 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
> }
>
> qemu_mutex_unlock(&s->requests_lock);
> - nbd_co_do_establish_connection(s->bs, blocking, NULL);
> + ret = nbd_co_do_establish_connection(s->bs, blocking, NULL);
> + trace_nbd_reconnect_attempt_result(ret, s->bs->in_flight);
> qemu_mutex_lock(&s->requests_lock);
>
> /*
> diff --git a/block/trace-events b/block/trace-events
> index 549090d453..48dbf10c66 100644
> --- a/block/trace-events
> +++ b/block/trace-events
> @@ -172,6 +172,8 @@ nbd_read_reply_entry_fail(int ret, const char *err) "ret = %d, err: %s"
> nbd_co_request_fail(uint64_t from, uint32_t len, uint64_t handle, uint16_t flags, uint16_t type, const char *name, int ret, const char *err) "Request failed { .from = %" PRIu64", .len = %" PRIu32 ", .handle = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) } ret = %d, err: %s"
> nbd_client_handshake(const char *export_name) "export '%s'"
> nbd_client_handshake_success(const char *export_name) "export '%s'"
> +nbd_reconnect_attempt(unsigned in_flight) "in_flight %u"
> +nbd_reconnect_attempt_result(int ret, unsigned in_flight) "ret %d in_flight %u"
>
> # ssh.c
> ssh_restart_coroutine(void *co) "co=%p"
> diff --git a/nbd/client-connection.c b/nbd/client-connection.c
> index 2a632931c3..0c5f917efa 100644
> --- a/nbd/client-connection.c
> +++ b/nbd/client-connection.c
> @@ -23,6 +23,7 @@
> */
>
> #include "qemu/osdep.h"
> +#include "trace.h"
>
> #include "block/nbd.h"
>
> @@ -210,6 +211,7 @@ static void *connect_thread_func(void *opaque)
> object_unref(OBJECT(conn->sioc));
> conn->sioc = NULL;
> if (conn->do_retry && !conn->detached) {
> + trace_nbd_connect_thread_sleep(timeout);
> qemu_mutex_unlock(&conn->mutex);
>
> sleep(timeout);
> diff --git a/nbd/trace-events b/nbd/trace-events
> index c4919a2dd5..b7032ca277 100644
> --- a/nbd/trace-events
> +++ b/nbd/trace-events
> @@ -73,3 +73,6 @@ nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *n
> nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len) "Payload received: handle = %" PRIu64 ", len = %" PRIu32
> nbd_co_receive_align_compliance(const char *op, uint64_t from, uint32_t len, uint32_t align) "client sent non-compliant unaligned %s request: from=0x%" PRIx64 ", len=0x%" PRIx32 ", align=0x%" PRIx32
> nbd_trip(void) "Reading request"
> +
> +# client-connection.c
> +nbd_connect_thread_sleep(uint64_t timeout) "timeout %" PRIu64
ping v2
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2 1/1] nbd: trace long NBD operations
2022-05-30 10:39 [PATCH v2 1/1] nbd: trace long NBD operations Denis V. Lunev
` (2 preceding siblings ...)
2022-06-13 11:29 ` Denis V. Lunev
@ 2022-06-14 7:48 ` Vladimir Sementsov-Ogievskiy
3 siblings, 0 replies; 5+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2022-06-14 7:48 UTC (permalink / raw)
To: Denis V. Lunev, qemu-block, qemu-devel
Cc: Eric Blake, Kevin Wolf, Hanna Reitz, Paolo Bonzini
On 5/30/22 13:39, Denis V. Lunev wrote:
> At the moment there are 2 sources of lengthy operations if configured:
> * open connection, which could retry inside and
> * reconnect of already opened connection
> These operations could be quite lengthy and cumbersome to catch thus
> it would be quite natural to add trace points for them.
>
> This patch is based on the original downstream work made by Vladimir.
>
> Signed-off-by: Denis V. Lunev<den@openvz.org>
> CC: Eric Blake<eblake@redhat.com>
> CC: Vladimir Sementsov-Ogievskiy<vsementsov@yandex-team.ru>
> CC: Kevin Wolf<kwolf@redhat.com>
> CC: Hanna Reitz<hreitz@redhat.com>
> CC: Paolo Bonzini<pbonzini@redhat.com>
Thanks, applied to my block branch at https://gitlab.com/vsementsov/qemu/-/commits/block
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 5+ messages in thread