From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4D8A93126C8 for ; Tue, 2 Dec 2025 12:21:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764678072; cv=none; b=GHpto4vvFDPPQ86JfE2LijNlnaek4nYFSPMirjOQztpakSC7MHR+jf3ETPyi7ctAUca2P9oj7OdlzWsB04NAvOYXDPaHKq00FFxC8q1BtnCEeQy7Us6K8yjP7PlU40tZySMr2APSFm4Bf/Su6YKpDxFQ+cLk6LQUxvbYDQARrfs= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764678072; c=relaxed/simple; bh=ICmHdJOJKZcTbZzVPWf3rdpfyXqO6+9oka/qVmzx1jw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ahZLYmPlTYta2NpUDCfa6qhrPXHvpCNmVgfQzPzQbH1dNhW1gTEEYglDrhjnvs5QL7Huh9RtXRmr/4ccI0vGThIAUz/gNoy3Qk95G1m0kE2raOuttS/3Q36zoKanfH3g+pP/bdC0WJ0yP2wAuldV0UBy3C+CJYz9ADx1cbpg1VI= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=NfAufbIP; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="NfAufbIP" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1764678069; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GRDxJWEGzJrZPD7eMQduq+pkbcsx1lJwLmpPLiR/kVA=; b=NfAufbIPPGZh0bPhqATVDAazc4ftyK4b6pS3imSpxtZc7g238+8lzFUg6ozAkkFuHke/Mh PfM8VDtzSBOc5F1c166TANxVVZQiicaHscoArU/o7Ed9Gjoki5DQS9C0q2+A78hiu2kkSb 9U2kNEv5uI3xE7YPg5Nhe2cLHwP6Vm8= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-26-Qzg6K7x_Pry0oLVgj2EKlg-1; Tue, 02 Dec 2025 07:21:07 -0500 X-MC-Unique: Qzg6K7x_Pry0oLVgj2EKlg-1 X-Mimecast-MFC-AGG-ID: Qzg6K7x_Pry0oLVgj2EKlg_1764678066 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id C9FF41956094; Tue, 2 Dec 2025 12:21:06 +0000 (UTC) Received: from localhost (unknown [10.72.116.20]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 6FB7A30001A4; Tue, 2 Dec 2025 12:21:05 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org Cc: Caleb Sander Mateos , Uday Shankar , Ming Lei Subject: [PATCH V5 10/21] ublk: add new feature UBLK_F_BATCH_IO Date: Tue, 2 Dec 2025 20:19:04 +0800 Message-ID: <20251202121917.1412280-11-ming.lei@redhat.com> In-Reply-To: <20251202121917.1412280-1-ming.lei@redhat.com> References: <20251202121917.1412280-1-ming.lei@redhat.com> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 Add new feature UBLK_F_BATCH_IO which replaces the following two per-io commands: - UBLK_U_IO_FETCH_REQ - UBLK_U_IO_COMMIT_AND_FETCH_REQ with three per-queue batch io uring_cmd: - UBLK_U_IO_PREP_IO_CMDS - UBLK_U_IO_COMMIT_IO_CMDS - UBLK_U_IO_FETCH_IO_CMDS Then ublk can deliver batch io commands to ublk server in single multishort uring_cmd, also allows to prepare & commit multiple commands in batch style via single uring_cmd, communication cost is reduced a lot. This feature also doesn't limit task context any more for all supported commands, so any allowed uring_cmd can be issued in any task context. ublk server implementation becomes much easier. Meantime load balance becomes much easier to support with this feature. The command `UBLK_U_IO_FETCH_IO_CMDS` can be issued from multiple task contexts, so each task can adjust this command's buffer length or number of inflight commands for controlling how much load is handled by current task. Later, priority parameter will be added to command `UBLK_U_IO_FETCH_IO_CMDS` for improving load balance support. UBLK_U_IO_NEED_GET_DATA isn't supported in batch io yet, but it may be enabled in future via its batch pair. Signed-off-by: Ming Lei --- drivers/block/ublk_drv.c | 53 +++++++++++++++++++++++++++++------ include/uapi/linux/ublk_cmd.h | 16 +++++++++++ 2 files changed, 61 insertions(+), 8 deletions(-) diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index 3865edabe1e6..034420e8df55 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -74,7 +74,8 @@ | UBLK_F_AUTO_BUF_REG \ | UBLK_F_QUIESCE \ | UBLK_F_PER_IO_DAEMON \ - | UBLK_F_BUF_REG_OFF_DAEMON) + | UBLK_F_BUF_REG_OFF_DAEMON \ + | UBLK_F_BATCH_IO) #define UBLK_F_ALL_RECOVERY_FLAGS (UBLK_F_USER_RECOVERY \ | UBLK_F_USER_RECOVERY_REISSUE \ @@ -331,12 +332,12 @@ static void ublk_batch_dispatch(struct ublk_queue *ubq, static inline bool ublk_dev_support_batch_io(const struct ublk_device *ub) { - return false; + return ub->dev_info.flags & UBLK_F_BATCH_IO; } static inline bool ublk_support_batch_io(const struct ublk_queue *ubq) { - return false; + return ubq->flags & UBLK_F_BATCH_IO; } static inline void ublk_io_lock(struct ublk_io *io) @@ -3462,6 +3463,35 @@ static int ublk_validate_batch_fetch_cmd(struct ublk_batch_io_data *data, return 0; } +static int ublk_handle_non_batch_cmd(struct io_uring_cmd *cmd, + unsigned int issue_flags) +{ + const struct ublksrv_io_cmd *ub_cmd = io_uring_sqe_cmd(cmd->sqe); + struct ublk_device *ub = cmd->file->private_data; + unsigned tag = READ_ONCE(ub_cmd->tag); + unsigned q_id = READ_ONCE(ub_cmd->q_id); + unsigned index = READ_ONCE(ub_cmd->addr); + struct ublk_queue *ubq; + struct ublk_io *io; + + if (cmd->cmd_op == UBLK_U_IO_UNREGISTER_IO_BUF) + return ublk_unregister_io_buf(cmd, ub, index, issue_flags); + + if (q_id >= ub->dev_info.nr_hw_queues) + return -EINVAL; + + if (tag >= ub->dev_info.queue_depth) + return -EINVAL; + + if (cmd->cmd_op != UBLK_U_IO_REGISTER_IO_BUF) + return -EOPNOTSUPP; + + ubq = ublk_get_queue(ub, q_id); + io = &ubq->ios[tag]; + return ublk_register_io_buf(cmd, ub, q_id, tag, io, index, + issue_flags); +} + static int ublk_ch_batch_io_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags) { @@ -3509,7 +3539,8 @@ static int ublk_ch_batch_io_uring_cmd(struct io_uring_cmd *cmd, ret = ublk_handle_batch_fetch_cmd(&data); break; default: - ret = -EOPNOTSUPP; + ret = ublk_handle_non_batch_cmd(cmd, issue_flags); + break; } out: return ret; @@ -4179,10 +4210,9 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header) */ ub->dev_info.flags &= UBLK_F_ALL; - ub->dev_info.flags |= UBLK_F_CMD_IOCTL_ENCODE | - UBLK_F_URING_CMD_COMP_IN_TASK | - UBLK_F_PER_IO_DAEMON | - UBLK_F_BUF_REG_OFF_DAEMON; + /* So far, UBLK_F_PER_IO_DAEMON won't be exposed for BATCH_IO */ + if (ublk_dev_support_batch_io(ub)) + ub->dev_info.flags &= ~UBLK_F_PER_IO_DAEMON; /* GET_DATA isn't needed any more with USER_COPY or ZERO COPY */ if (ub->dev_info.flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY | @@ -4540,6 +4570,13 @@ static int ublk_wait_for_idle_io(struct ublk_device *ub, unsigned int elapsed = 0; int ret; + /* + * For UBLK_F_BATCH_IO ublk server can get notified with existing + * or new fetch command, so needn't wait any more + */ + if (ublk_dev_support_batch_io(ub)) + return 0; + while (elapsed < timeout_ms && !signal_pending(current)) { unsigned int queues_cancelable = 0; int i; diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h index cd894c1d188e..a64b7ee578fb 100644 --- a/include/uapi/linux/ublk_cmd.h +++ b/include/uapi/linux/ublk_cmd.h @@ -335,6 +335,22 @@ */ #define UBLK_F_BUF_REG_OFF_DAEMON (1ULL << 14) + +/* + * Support the following commands for delivering & committing io command + * in batch. + * + * - UBLK_U_IO_PREP_IO_CMDS + * - UBLK_U_IO_COMMIT_IO_CMDS + * - UBLK_U_IO_FETCH_IO_CMDS + * - UBLK_U_IO_REGISTER_IO_BUF + * - UBLK_U_IO_UNREGISTER_IO_BUF + * + * The existing UBLK_U_IO_FETCH_REQ, UBLK_U_IO_COMMIT_AND_FETCH_REQ and + * UBLK_U_IO_NEED_GET_DATA uring_cmd are not supported for this feature. + */ +#define UBLK_F_BATCH_IO (1ULL << 15) + /* device state */ #define UBLK_S_DEV_DEAD 0 #define UBLK_S_DEV_LIVE 1 -- 2.47.0