* [PATCH 0/2] requeue request if only one connection is configured @ 2020-02-19 6:31 Hou Pu 2020-02-19 6:31 ` [PATCH 1/2] nbd: enable replace socket " Hou Pu 2020-02-19 6:31 ` [PATCH 2/2] nbd: requeue command if the soecket is changed Hou Pu 0 siblings, 2 replies; 5+ messages in thread From: Hou Pu @ 2020-02-19 6:31 UTC (permalink / raw) To: josef, axboe, mchristi; +Cc: linux-block, nbd, Hou Pu Hello, NBD server could be upgraded if we have multiple connections. But if we have only one connection, after we take down NBD server, all inflight IO could finally timeout and return error. These patches fix this using current reconfiguration framework. I noticed that Mike has following patchset nbd: local daemon restart support https://lore.kernel.org/linux-block/5DD41C49.3080209@redhat.com/ It add another netlink interface (NBD_ATTR_SWAP_SOCKETS) and requeue request immediately after recongirure/swap socket. It do not need to wait for timeout to fire and requeue in timeout handler, which seems more like an improvement. Let fix this in current framework first. Hou Pu (2): nbd: enable replace socket if only one connection is configured nbd: requeue command if the soecket is changed drivers/block/nbd.c | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-) -- 2.11.0 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/2] nbd: enable replace socket if only one connection is configured 2020-02-19 6:31 [PATCH 0/2] requeue request if only one connection is configured Hou Pu @ 2020-02-19 6:31 ` Hou Pu 2020-02-25 6:32 ` Mike Christie 2020-02-19 6:31 ` [PATCH 2/2] nbd: requeue command if the soecket is changed Hou Pu 1 sibling, 1 reply; 5+ messages in thread From: Hou Pu @ 2020-02-19 6:31 UTC (permalink / raw) To: josef, axboe, mchristi; +Cc: linux-block, nbd, Hou Pu Nbd server with multiple connections could be upgraded since 560bc4b (nbd: handle dead connections). But if only one conncection is configured, after we take down nbd server, all inflight IO would finally timeout and return error. We could requeue them like what we do with multiple connections and wait for new socket in submit path. Signed-off-by: Hou Pu <houpu@bytedance.com> --- drivers/block/nbd.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 78181908f0df..8e348c9c49a4 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -395,16 +395,19 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req, } config = nbd->config; - if (config->num_connections > 1) { + if (config->num_connections > 1 || + (config->num_connections == 1 && nbd->tag_set.timeout)) { dev_err_ratelimited(nbd_to_dev(nbd), "Connection timed out, retrying (%d/%d alive)\n", atomic_read(&config->live_connections), config->num_connections); /* * Hooray we have more connections, requeue this IO, the submit - * path will put it on a real connection. + * path will put it on a real connection. Or if only one + * connection is configured, the submit path will wait util + * a new connection is reconfigured or util dead timeout. */ - if (config->socks && config->num_connections > 1) { + if (config->socks) { if (cmd->index < config->num_connections) { struct nbd_sock *nsock = config->socks[cmd->index]; @@ -747,8 +750,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index) * and let the timeout stuff handle resubmitting * this request onto another connection. */ - if (nbd_disconnected(config) || - config->num_connections <= 1) { + if (nbd_disconnected(config)) { cmd->status = BLK_STS_IOERR; goto out; } @@ -825,7 +827,7 @@ static int find_fallback(struct nbd_device *nbd, int index) if (config->num_connections <= 1) { dev_err_ratelimited(disk_to_dev(nbd->disk), - "Attempted send on invalid socket\n"); + "Dead connection, failed to find a fallback\n"); return new_index; } -- 2.11.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] nbd: enable replace socket if only one connection is configured 2020-02-19 6:31 ` [PATCH 1/2] nbd: enable replace socket " Hou Pu @ 2020-02-25 6:32 ` Mike Christie 2020-02-25 14:10 ` Hou Pu 0 siblings, 1 reply; 5+ messages in thread From: Mike Christie @ 2020-02-25 6:32 UTC (permalink / raw) To: Hou Pu, josef, axboe; +Cc: linux-block, nbd, Hou Pu On 02/19/2020 12:31 AM, Hou Pu wrote: > Nbd server with multiple connections could be upgraded since > 560bc4b (nbd: handle dead connections). But if only one conncection > is configured, after we take down nbd server, all inflight IO > would finally timeout and return error. We could requeue them > like what we do with multiple connections and wait for new socket > in submit path. > > Signed-off-by: Hou Pu <houpu@bytedance.com> > --- > drivers/block/nbd.c | 14 ++++++++------ > 1 file changed, 8 insertions(+), 6 deletions(-) > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c > index 78181908f0df..8e348c9c49a4 100644 > --- a/drivers/block/nbd.c > +++ b/drivers/block/nbd.c > @@ -395,16 +395,19 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req, > } > config = nbd->config; > > - if (config->num_connections > 1) { > + if (config->num_connections > 1 || > + (config->num_connections == 1 && nbd->tag_set.timeout)) { > dev_err_ratelimited(nbd_to_dev(nbd), > "Connection timed out, retrying (%d/%d alive)\n", > atomic_read(&config->live_connections), > config->num_connections); > /* > * Hooray we have more connections, requeue this IO, the submit > - * path will put it on a real connection. > + * path will put it on a real connection. Or if only one > + * connection is configured, the submit path will wait util > + * a new connection is reconfigured or util dead timeout. > */ > - if (config->socks && config->num_connections > 1) { > + if (config->socks) { > if (cmd->index < config->num_connections) { > struct nbd_sock *nsock = > config->socks[cmd->index]; > @@ -747,8 +750,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index) > * and let the timeout stuff handle resubmitting > * this request onto another connection. > */ > - if (nbd_disconnected(config) || > - config->num_connections <= 1) { > + if (nbd_disconnected(config)) { I think you need to update the comment right above this chunk. It still mentions num_connections=1 working differently. > cmd->status = BLK_STS_IOERR; > goto out; > } > @@ -825,7 +827,7 @@ static int find_fallback(struct nbd_device *nbd, int index) > > if (config->num_connections <= 1) { > dev_err_ratelimited(disk_to_dev(nbd->disk), > - "Attempted send on invalid socket\n"); > + "Dead connection, failed to find a fallback\n"); > return new_index; > } > > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] nbd: enable replace socket if only one connection is configured 2020-02-25 6:32 ` Mike Christie @ 2020-02-25 14:10 ` Hou Pu 0 siblings, 0 replies; 5+ messages in thread From: Hou Pu @ 2020-02-25 14:10 UTC (permalink / raw) To: Mike Christie; +Cc: Josef Bacik, axboe, linux-block, nbd, Hou Pu On Tue, Feb 25, 2020 at 2:32 PM Mike Christie <mchristi@redhat.com> wrote: > > On 02/19/2020 12:31 AM, Hou Pu wrote: > > Nbd server with multiple connections could be upgraded since > > 560bc4b (nbd: handle dead connections). But if only one conncection > > is configured, after we take down nbd server, all inflight IO > > would finally timeout and return error. We could requeue them > > like what we do with multiple connections and wait for new socket > > in submit path. > > > > Signed-off-by: Hou Pu <houpu@bytedance.com> > > --- > > drivers/block/nbd.c | 14 ++++++++------ > > 1 file changed, 8 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c > > index 78181908f0df..8e348c9c49a4 100644 > > --- a/drivers/block/nbd.c > > +++ b/drivers/block/nbd.c > > @@ -395,16 +395,19 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req, > > } > > config = nbd->config; > > > > - if (config->num_connections > 1) { > > + if (config->num_connections > 1 || > > + (config->num_connections == 1 && nbd->tag_set.timeout)) { > > dev_err_ratelimited(nbd_to_dev(nbd), > > "Connection timed out, retrying (%d/%d alive)\n", > > atomic_read(&config->live_connections), > > config->num_connections); > > /* > > * Hooray we have more connections, requeue this IO, the submit > > - * path will put it on a real connection. > > + * path will put it on a real connection. Or if only one > > + * connection is configured, the submit path will wait util > > + * a new connection is reconfigured or util dead timeout. > > */ > > - if (config->socks && config->num_connections > 1) { > > + if (config->socks) { > > if (cmd->index < config->num_connections) { > > struct nbd_sock *nsock = > > config->socks[cmd->index]; > > @@ -747,8 +750,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index) > > * and let the timeout stuff handle resubmitting > > * this request onto another connection. > > */ > > - if (nbd_disconnected(config) || > > - config->num_connections <= 1) { > > + if (nbd_disconnected(config)) { > > I think you need to update the comment right above this chunk. It still > mentions num_connections=1 working differently. Thanks. Will change the comment in v2. > > > > cmd->status = BLK_STS_IOERR; > > goto out; > > } > > @@ -825,7 +827,7 @@ static int find_fallback(struct nbd_device *nbd, int index) > > > > if (config->num_connections <= 1) { > > dev_err_ratelimited(disk_to_dev(nbd->disk), > > - "Attempted send on invalid socket\n"); > > + "Dead connection, failed to find a fallback\n"); > > return new_index; > > } > > > > > ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 2/2] nbd: requeue command if the soecket is changed 2020-02-19 6:31 [PATCH 0/2] requeue request if only one connection is configured Hou Pu 2020-02-19 6:31 ` [PATCH 1/2] nbd: enable replace socket " Hou Pu @ 2020-02-19 6:31 ` Hou Pu 1 sibling, 0 replies; 5+ messages in thread From: Hou Pu @ 2020-02-19 6:31 UTC (permalink / raw) To: josef, axboe, mchristi; +Cc: linux-block, nbd, Hou Pu In commit 2da22da5734 (nbd: fix zero cmd timeout handling v2), it is allowed to reset timer when it fires if tag_set.timeout is set to zero. If the server is shutdown and a new socket is reconfigured, the request should be requeued to be processed by new server instead of waiting for response from the old one. Signed-off-by: Hou Pu <houpu@bytedance.com> --- drivers/block/nbd.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 8e348c9c49a4..7bbc5477066c 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -434,12 +434,22 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req, * Userspace sets timeout=0 to disable socket disconnection, * so just warn and reset the timer. */ + struct nbd_sock *nsock = config->socks[cmd->index]; cmd->retries++; dev_info(nbd_to_dev(nbd), "Possible stuck request %p: control (%s@%llu,%uB). Runtime %u seconds\n", req, nbdcmd_to_ascii(req_to_nbd_cmd_type(req)), (unsigned long long)blk_rq_pos(req) << 9, blk_rq_bytes(req), (req->timeout / HZ) * cmd->retries); + mutex_lock(&nsock->tx_lock); + if (cmd->cookie != nsock->cookie) { + nbd_requeue_cmd(cmd); + mutex_unlock(&nsock->tx_lock); + mutex_unlock(&cmd->lock); + nbd_config_put(nbd); + return BLK_EH_DONE; + } + mutex_unlock(&nsock->tx_lock); mutex_unlock(&cmd->lock); nbd_config_put(nbd); return BLK_EH_RESET_TIMER; -- 2.11.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-02-25 14:11 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-02-19 6:31 [PATCH 0/2] requeue request if only one connection is configured Hou Pu 2020-02-19 6:31 ` [PATCH 1/2] nbd: enable replace socket " Hou Pu 2020-02-25 6:32 ` Mike Christie 2020-02-25 14:10 ` Hou Pu 2020-02-19 6:31 ` [PATCH 2/2] nbd: requeue command if the soecket is changed Hou Pu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox