All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dongsheng Yang <dongsheng.yang@linux.dev>
To: axboe@kernel.dk, dan.j.williams@intel.com,
	gregory.price@memverge.com, John@groves.net,
	Jonathan.Cameron@Huawei.com, bbhushan2@marvell.com,
	chaitanyak@nvidia.com, rdunlap@infradead.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-cxl@vger.kernel.org, linux-bcache@vger.kernel.org,
	Dongsheng Yang <dongsheng.yang@linux.dev>
Subject: [PATCH v2 1/8] cbd: introduce cbd_transport
Date: Wed, 18 Sep 2024 10:18:14 +0000	[thread overview]
Message-ID: <20240918101821.681118-2-dongsheng.yang@linux.dev> (raw)
In-Reply-To: <20240918101821.681118-1-dongsheng.yang@linux.dev>

cbd_transport represents the layout of the entire shared memory, as shown below.

	+-------------------------------------------------------------------------------------------------------------------------------+
	|                           cbd transport                                                                                       |
	+--------------------+-----------------------+-----------------------+----------------------+-----------------------------------+
	|                    |       hosts           |      backends         |       blkdevs        |        channels                   |
	| cbd transport info +----+----+----+--------+----+----+----+--------+----+----+----+-------+-------+-------+-------+-----------+
	|                    |    |    |    |  ...   |    |    |    |  ...   |    |    |    |  ...  |       |       |       |   ...     |
	+--------------------+----+----+----+--------+----+----+----+--------+----+----+----+-------+---+---+---+---+-------+-----------+
													|       |
													|       |
													|       |
													|       |
		  +-------------------------------------------------------------------------------------+       |
		  |                                                                                             |
		  |                                                                                             |
		  v                                                                                             |
	    +-----------------------------------------------------------+                                       |
	    |                 channel segment                           |                                       |
	    +--------------------+--------------------------------------+                                       |
	    |    channel meta    |              channel data            |                                       |
	    +---------+----------+--------------------------------------+                                       |
		      |                                                                                         |
		      |                                                                                         |
		      |                                                                                         |
		      v                                                                                         |
	    +----------------------------------------------------------+                                        |
	    |                 channel meta                             |                                        |
	    +-----------+--------------+-------------------------------+                                        |
	    | meta ctrl |  comp ring   |       cmd ring                |                                        |
	    +-----------+--------------+-------------------------------+                                        |
														|
														|
														|
		   +--------------------------------------------------------------------------------------------+
		   |
		   |
		   |
		   v
	     +----------------------------------------------------------+
	     |                cache segment                             |
	     +-----------+----------------------------------------------+
	     |   info    |               data                           |
	     +-----------+----------------------------------------------+

The shared memory is divided into five regions:

	a) Transport_info:
	    Information about the overall transport, including the layout
	of the transport.

	b) Hosts:
	    Each host wishing to utilize this transport needs to register
	its own information within a host entry in this region.

	c) Backends:
	    Starting a backend on a host requires filling in information
	in a backend entry within this region.

	d) Blkdevs:
	    Once a backend is established, it can be mapped to CBD device
	on any associated host. The information about the blkdevs is then
	filled into the blkdevs region.

	e) Segments:
	    This is the actual data communication area, where communication
	between blkdev and backend occurs. Each queue of a block device uses
	a channel, and each backend has a corresponding handler interacting
	with this queue.

	f) Channel segment:
	    Channel is one type of segment, is further divided into meta and
	data regions.
	    The meta region includes subm rings and comp rings.
	The blkdev converts upper-layer requests into cbd_se and fills
	them into the subm ring. The handler accepts the cbd_se from
	the subm ring and sends them to the local actual block device
	of the backend (e.g., sda). After completion, the results are
	formed into cbd_ce and filled into the comp ring. The blkdev
	then receives the cbd_ce and returns the results to the upper-layer
	IO sender.

	g) Cache segment:
	    Cache segment is another type of segment, when cache enabled
	for a backend, transport will allocate cache segments to this
	backend.

Signed-off-by: Dongsheng Yang <dongsheng.yang@linux.dev>
---
 drivers/block/cbd/cbd_internal.h  | 1193 +++++++++++++++++++++++++++++
 drivers/block/cbd/cbd_transport.c |  957 +++++++++++++++++++++++
 2 files changed, 2150 insertions(+)
 create mode 100644 drivers/block/cbd/cbd_internal.h
 create mode 100644 drivers/block/cbd/cbd_transport.c

diff --git a/drivers/block/cbd/cbd_internal.h b/drivers/block/cbd/cbd_internal.h
new file mode 100644
index 000000000000..236dbcb62906
--- /dev/null
+++ b/drivers/block/cbd/cbd_internal.h
@@ -0,0 +1,1193 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _CBD_INTERNAL_H
+#define _CBD_INTERNAL_H
+
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/module.h>
+#include <linux/blk-mq.h>
+#include <asm/byteorder.h>
+#include <asm/types.h>
+#include <linux/types.h>
+#include <linux/delay.h>
+#include <linux/fs.h>
+#include <linux/dax.h>
+#include <linux/blkdev.h>
+#include <linux/slab.h>
+#include <linux/parser.h>
+#include <linux/idr.h>
+#include <linux/workqueue.h>
+#include <linux/uuid.h>
+#include <linux/bitfield.h>
+#include <linux/crc32.h>
+#include <linux/hashtable.h>
+
+/*
+ * CBD (CXL Block Device) provides two usage scenarios: single-host and multi-hosts.
+ *
+ * (1) Single-host scenario, CBD can use a pmem device as a cache for block devices,
+ * providing a caching mechanism specifically designed for persistent memory.
+ *
+ *	+-----------------------------------------------------------------+
+ *	|                         single-host                             |
+ *	+-----------------------------------------------------------------+
+ *	|                                                                 |
+ *	|                                                                 |
+ *	|                                                                 |
+ *	|                                                                 |
+ *	|                                                                 |
+ *	|                        +-----------+     +------------+         |
+ *	|                        | /dev/cbd0 |     | /dev/cbd1  |         |
+ *	|                        |           |     |            |         |
+ *	|  +---------------------|-----------|-----|------------|-------+ |
+ *	|  |                     |           |     |            |       | |
+ *	|  |      /dev/pmem0     | cbd0 cache|     | cbd1 cache |       | |
+ *	|  |                     |           |     |            |       | |
+ *	|  +---------------------|-----------|-----|------------|-------+ |
+ *	|                        |+---------+|     |+----------+|         |
+ *	|                        ||/dev/sda ||     || /dev/sdb ||         |
+ *	|                        |+---------+|     |+----------+|         |
+ *	|                        +-----------+     +------------+         |
+ *	+-----------------------------------------------------------------+
+ *
+ * (2) Multi-hosts scenario, CBD also provides a cache while taking advantage of
+ * shared memory features, allowing users to access block devices on other nodes across
+ * different hosts.
+ *
+ * As shared memory is supported in CXL3.0 spec, we can transfer data via CXL shared memory.
+ * CBD use CXL shared memory to transfer data between node-1 and node-2.
+ *
+ *	+--------------------------------------------------------------------------------------------------------+
+ *	|                                           multi-hosts                                                  |
+ *	+--------------------------------------------------------------------------------------------------------+
+ *	|                                                                                                        |
+ *	|                                                                                                        |
+ *	| +-------------------------------+                               +------------------------------------+ |
+ *	| |          node-1               |                               |              node-2                | |
+ *	| +-------------------------------+                               +------------------------------------+ |
+ *	| |                               |                               |                                    | |
+ *	| |                       +-------+                               +---------+                          | |
+ *	| |                       | cbd0  |                               | backend0+------------------+       | |
+ *	| |                       +-------+                               +---------+                  |       | |
+ *	| |                       | pmem0 |                               | pmem0   |                  v       | |
+ *	| |               +-------+-------+                               +---------+----+     +---------------+ |
+ *	| |               |    cxl driver |                               | cxl driver   |     |  /dev/sda     | |
+ *	| +---------------+--------+------+                               +-----+--------+-----+---------------+ |
+ *	|                          |                                            |                                |
+ *	|                          |                                            |                                |
+ *	|                          |        CXL                         CXL     |                                |
+ *	|                          +----------------+               +-----------+                                |
+ *	|                                           |               |                                            |
+ *	|                                           |               |                                            |
+ *	|                                           |               |                                            |
+ *	|                 +-------------------------+---------------+--------------------------+                 |
+ *	|                 |                         +---------------+                          |                 |
+ *	|                 | shared memory device    |  cbd0 cache   |                          |                 |
+ *	|                 |                         +---------------+                          |                 |
+ *	|                 +--------------------------------------------------------------------+                 |
+ *	|                                                                                                        |
+ *	+--------------------------------------------------------------------------------------------------------+
+ */
+
+#define cbd_err(fmt, ...)							\
+	pr_err("cbd: %s:%u " fmt, __func__, __LINE__, ##__VA_ARGS__)
+#define cbd_info(fmt, ...)							\
+	pr_info("cbd: %s:%u " fmt, __func__, __LINE__, ##__VA_ARGS__)
+#define cbd_debug(fmt, ...)							\
+	pr_debug("cbd: %s:%u " fmt, __func__, __LINE__, ##__VA_ARGS__)
+
+#define cbdt_err(transport, fmt, ...)						\
+	cbd_err("cbd_transport%u: " fmt,					\
+		 transport->id, ##__VA_ARGS__)
+#define cbdt_info(transport, fmt, ...)						\
+	cbd_info("cbd_transport%u: " fmt,					\
+		 transport->id, ##__VA_ARGS__)
+#define cbdt_debug(transport, fmt, ...)						\
+	cbd_debug("cbd_transport%u: " fmt,					\
+		 transport->id, ##__VA_ARGS__)
+
+#define cbdb_err(backend, fmt, ...)						\
+	cbdt_err(backend->cbdt, "backend%d: " fmt,				\
+		 backend->backend_id, ##__VA_ARGS__)
+#define cbdb_info(backend, fmt, ...)						\
+	cbdt_info(backend->cbdt, "backend%d: " fmt,				\
+		 backend->backend_id, ##__VA_ARGS__)
+#define cbdbdebug(backend, fmt, ...)						\
+	cbdt_debug(backend->cbdt, "backend%d: " fmt,				\
+		 backend->backend_id, ##__VA_ARGS__)
+
+#define cbd_handler_err(handler, fmt, ...)					\
+	cbdb_err(handler->cbdb, "handler%d: " fmt,				\
+		 handler->channel.seg_id, ##__VA_ARGS__)
+#define cbd_handler_info(handler, fmt, ...)					\
+	cbdb_info(handler->cbdb, "handler%d: " fmt,				\
+		 handler->channel.seg_id, ##__VA_ARGS__)
+#define cbd_handler_debug(handler, fmt, ...)					\
+	cbdb_debug(handler->cbdb, "handler%d: " fmt,				\
+		 handler->channel.seg_id, ##__VA_ARGS__)
+
+#define cbd_blk_err(dev, fmt, ...)						\
+	cbdt_err(dev->cbdt, "cbd%d: " fmt,					\
+		 dev->mapped_id, ##__VA_ARGS__)
+#define cbd_blk_info(dev, fmt, ...)						\
+	cbdt_info(dev->cbdt, "cbd%d: " fmt,					\
+		 dev->mapped_id, ##__VA_ARGS__)
+#define cbd_blk_debug(dev, fmt, ...)						\
+	cbdt_debug(dev->cbdt, "cbd%d: " fmt,					\
+		 dev->mapped_id, ##__VA_ARGS__)
+
+#define cbd_queue_err(queue, fmt, ...)						\
+	cbd_blk_err(queue->cbd_blkdev, "queue%d: " fmt,			\
+		     queue->channel.seg_id, ##__VA_ARGS__)
+#define cbd_queue_info(queue, fmt, ...)						\
+	cbd_blk_info(queue->cbd_blkdev, "queue%d: " fmt,			\
+		     queue->channel.seg_id, ##__VA_ARGS__)
+#define cbd_queue_debug(queue, fmt, ...)					\
+	cbd_blk_debug(queue->cbd_blkdev, "queue%d: " fmt,			\
+		     queue->channel.seg_id, ##__VA_ARGS__)
+
+#define cbd_channel_err(channel, fmt, ...)					\
+	cbdt_err(channel->cbdt, "channel%d: " fmt,				\
+		 channel->seg_id, ##__VA_ARGS__)
+#define cbd_channel_info(channel, fmt, ...)					\
+	cbdt_info(channel->cbdt, "channel%d: " fmt,				\
+		 channel->seg_id, ##__VA_ARGS__)
+#define cbd_channel_debug(channel, fmt, ...)					\
+	cbdt_debug(channel->cbdt, "channel%d: " fmt,				\
+		 channel->seg_id, ##__VA_ARGS__)
+
+#define cbd_cache_err(cache, fmt, ...)						\
+	cbdt_err(cache->cbdt, "cache%d: " fmt,					\
+		 cache->cache_id, ##__VA_ARGS__)
+#define cbd_cache_info(cache, fmt, ...)						\
+	cbdt_info(cache->cbdt, "cache%d: " fmt,					\
+		 cache->cache_id, ##__VA_ARGS__)
+#define cbd_cache_debug(cache, fmt, ...)					\
+	cbdt_debug(cache->cbdt, "cache%d: " fmt,				\
+		 cache->cache_id, ##__VA_ARGS__)
+
+#define CBD_KB	(1024)
+#define CBD_MB	(CBD_KB * CBD_KB)
+
+#define CBD_TRANSPORT_MAX	1024
+#define CBD_PATH_LEN	512
+#define CBD_NAME_LEN	32
+
+#define CBD_QUEUES_MAX		128
+
+#define CBD_PART_SHIFT 4
+#define CBD_DRV_NAME "cbd"
+#define CBD_DEV_NAME_LEN 32
+
+#define CBD_HB_INTERVAL		msecs_to_jiffies(5000) /* 5s */
+#define CBD_HB_TIMEOUT		(30 * 1000) /* 30s */
+
+/*
+ * CBD transport layout:
+ *
+ *	+-------------------------------------------------------------------------------------------------------------------------------+
+ *	|                           cbd transport                                                                                       |
+ *	+--------------------+-----------------------+-----------------------+----------------------+-----------------------------------+
+ *	|                    |       hosts           |      backends         |       blkdevs        |        channels                   |
+ *	| cbd transport info +----+----+----+--------+----+----+----+--------+----+----+----+-------+-------+-------+-------+-----------+
+ *	|                    |    |    |    |  ...   |    |    |    |  ...   |    |    |    |  ...  |       |       |       |   ...     |
+ *	+--------------------+----+----+----+--------+----+----+----+--------+----+----+----+-------+---+---+---+---+-------+-----------+
+ *	                                                                                                |       |
+ *	                                                                                                |       |
+ *	                                                                                                |       |
+ *	                                                                                                |       |
+ *	          +-------------------------------------------------------------------------------------+       |
+ *	          |                                                                                             |
+ *	          |                                                                                             |
+ *	          v                                                                                             |
+ *	    +-----------------------------------------------------------+                                       |
+ *	    |                 channel segment                           |                                       |
+ *	    +--------------------+--------------------------------------+                                       |
+ *	    |    channel meta    |              channel data            |                                       |
+ *	    +---------+----------+--------------------------------------+                                       |
+ *	              |                                                                                         |
+ *	              |                                                                                         |
+ *	              |                                                                                         |
+ *	              v                                                                                         |
+ *	    +----------------------------------------------------------+                                        |
+ *	    |                 channel meta                             |                                        |
+ *	    +-----------+--------------+-------------------------------+                                        |
+ *	    | meta ctrl |  comp ring   |       cmd ring                |                                        |
+ *	    +-----------+--------------+-------------------------------+                                        |
+ *	                                                                                                        |
+ *	                                                                                                        |
+ *	                                                                                                        |
+ *	           +--------------------------------------------------------------------------------------------+
+ *	           |
+ *	           |
+ *	           |
+ *	           v
+ *	     +----------------------------------------------------------+
+ *	     |                cache segment                             |
+ *	     +-----------+----------------------------------------------+
+ *	     |   info    |               data                           |
+ *	     +-----------+----------------------------------------------+
+ */
+
+/* cbd segment */
+#define CBDT_SEG_SIZE		(16 * 1024 * 1024)
+
+/* cbd channel seg */
+#define CBDC_META_SIZE		(4 * 1024 * 1024)
+#define CBDC_SUBMR_RESERVED	sizeof(struct cbd_se)
+#define CBDC_CMPR_RESERVED	sizeof(struct cbd_ce)
+
+#define CBDC_DATA_ALIGH		4096
+#define CBDC_DATA_RESERVED	CBDC_DATA_ALIGH
+
+#define CBDC_CTRL_OFF		0
+#define CBDC_CTRL_SIZE		PAGE_SIZE
+#define CBDC_COMPR_OFF		(CBDC_CTRL_OFF + CBDC_CTRL_SIZE)
+#define CBDC_COMPR_SIZE		(sizeof(struct cbd_ce) * 1024)
+#define CBDC_SUBMR_OFF		(CBDC_COMPR_OFF + CBDC_COMPR_SIZE)
+#define CBDC_SUBMR_SIZE		(CBDC_META_SIZE - CBDC_SUBMR_OFF)
+
+#define CBDC_DATA_OFF		CBDC_META_SIZE
+#define CBDC_DATA_SIZE		(CBDT_SEG_SIZE - CBDC_META_SIZE)
+
+#define CBDC_UPDATE_SUBMR_HEAD(head, used, size) (head = ((head % size) + used) % size)
+#define CBDC_UPDATE_SUBMR_TAIL(tail, used, size) (tail = ((tail % size) + used) % size)
+
+#define CBDC_UPDATE_COMPR_HEAD(head, used, size) (head = ((head % size) + used) % size)
+#define CBDC_UPDATE_COMPR_TAIL(tail, used, size) (tail = ((tail % size) + used) % size)
+
+/* cbd transport */
+#define CBD_TRANSPORT_MAGIC		0x65B05EFA96C596EFULL
+#define CBD_TRANSPORT_VERSION		1
+
+#define CBDT_INFO_OFF			0
+#define CBDT_INFO_SIZE			PAGE_SIZE
+
+#define CBDT_HOST_INFO_SIZE		round_up(sizeof(struct cbd_host_info), PAGE_SIZE)
+#define CBDT_BACKEND_INFO_SIZE		round_up(sizeof(struct cbd_backend_info), PAGE_SIZE)
+#define CBDT_BLKDEV_INFO_SIZE		round_up(sizeof(struct cbd_blkdev_info), PAGE_SIZE)
+
+#define CBD_TRASNPORT_SIZE_MIN		(512 * 1024 * 1024)
+
+/*
+ * CBD structure diagram:
+ *
+ *	                                        +--------------+
+ *	                                        | cbd_transport|                                               +----------+
+ *	                                        +--------------+                                               | cbd_host |
+ *	                                        |              |                                               +----------+
+ *	                                        |   host       +---------------------------------------------->|          |
+ *	                   +--------------------+   backends   |                                               | hostname |
+ *	                   |                    |   devices    +------------------------------------------+    |          |
+ *	                   |                    |              |                                          |    +----------+
+ *	                   |                    +--------------+                                          |
+ *	                   |                                                                              |
+ *	                   |                                                                              |
+ *	                   |                                                                              |
+ *	                   |                                                                              |
+ *	                   |                                                                              |
+ *	                   v                                                                              v
+ *	             +------------+     +-----------+     +------+                                  +-----------+      +-----------+     +------+
+ *	             | cbd_backend+---->|cbd_backend+---->| NULL |                                  | cbd_blkdev+----->| cbd_blkdev+---->| NULL |
+ *	             +------------+     +-----------+     +------+                                  +-----------+      +-----------+     +------+
+ *	+------------+  cbd_cache |     |  handlers |                                        +------+  queues   |      |  queues   |
+ *	|            |            |     +-----------+                                        |      |           |      +-----------+
+ *	|     +------+  handlers  |                                                          |      |           |
+ *	|     |      +------------+                                                          |      | cbd_cache +-------------------------------------+
+ *	|     |                                                                              |      +-----------+                                     |
+ *	|     |                                                                              |                                                        |
+ *	|     |      +-------------+       +-------------+           +------+                |      +-----------+      +-----------+     +------+     |
+ *	|     +----->| cbd_handler +------>| cbd_handler +---------->| NULL |                +----->| cbd_queue +----->| cbd_queue +---->| NULL |     |
+ *	|            +-------------+       +-------------+           +------+                       +-----------+      +-----------+     +------+     |
+ *	|     +------+ channel     |       |   channel   |                                   +------+  channel  |      |  channel  |                  |
+ *	|     |      +-------------+       +-------------+                                   |      +-----------+      +-----------+                  |
+ *	|     |                                                                              |                                                        |
+ *	|     |                                                                              |                                                        |
+ *	|     |                                                                              |                                                        |
+ *	|     |                                                                              v                                                        |
+ *	|     |                                                        +-----------------------+                                                      |
+ *	|     +------------------------------------------------------->|      cbd_channel      |                                                      |
+ *	|                                                              +-----------------------+                                                      |
+ *	|                                                              | channel_id            |                                                      |
+ *	|                                                              | cmdr (cmd ring)       |                                                      |
+ *	|                                                              | compr (complete ring) |                                                      |
+ *	|                                                              | data (data area)      |                                                      |
+ *	|                                                              |                       |                                                      |
+ *	|                                                              +-----------------------+                                                      |
+ *	|                                                                                                                                             |
+ *	|                                                 +-----------------------------+                                                             |
+ *	+------------------------------------------------>|         cbd_cache           |<------------------------------------------------------------+
+ *	                                                  +-----------------------------+
+ *	                                                  |     cache_wq                |
+ *	                                                  |     cache_tree              |
+ *	                                                  |     segments[]              |
+ *	                                                  +-----------------------------+
+ */
+
+#define CBD_DEVICE(OBJ)					\
+struct cbd_## OBJ ##_device {				\
+	struct device dev;				\
+	struct cbd_transport *cbdt;			\
+	struct cbd_## OBJ ##_info *OBJ##_info;		\
+};							\
+							\
+struct cbd_## OBJ ##s_device {				\
+	struct device OBJ ##s_dev;			\
+	struct cbd_## OBJ ##_device OBJ ##_devs[];	\
+}
+
+/* cbd_worker_cfg*/
+struct cbd_worker_cfg {
+	u32			busy_retry_cur;
+	u32			busy_retry_count;
+	u32			busy_retry_max;
+	u32			busy_retry_min;
+	u64			busy_retry_interval;
+};
+
+static inline void cbdwc_init(struct cbd_worker_cfg *cfg)
+{
+	/* init cbd_worker_cfg with default values */
+	cfg->busy_retry_cur = 0;
+	cfg->busy_retry_count = 100;
+	cfg->busy_retry_max = cfg->busy_retry_count * 2;
+	cfg->busy_retry_min = 0;
+	cfg->busy_retry_interval = 1;			/* 1us */
+}
+
+/* reset retry_cur and increase busy_retry_count */
+static inline void cbdwc_hit(struct cbd_worker_cfg *cfg)
+{
+	u32 delta;
+
+	cfg->busy_retry_cur = 0;
+
+	if (cfg->busy_retry_count == cfg->busy_retry_max)
+		return;
+
+	/* retry_count increase by 1/16 */
+	delta = cfg->busy_retry_count >> 4;
+	if (!delta)
+		delta = (cfg->busy_retry_max + cfg->busy_retry_min) >> 1;
+
+	cfg->busy_retry_count += delta;
+
+	if (cfg->busy_retry_count > cfg->busy_retry_max)
+		cfg->busy_retry_count = cfg->busy_retry_max;
+}
+
+/* reset retry_cur and decrease busy_retry_count */
+static inline void cbdwc_miss(struct cbd_worker_cfg *cfg)
+{
+	u32 delta;
+
+	cfg->busy_retry_cur = 0;
+
+	if (cfg->busy_retry_count == cfg->busy_retry_min)
+		return;
+
+	/* retry_count decrease by 1/16 */
+	delta = cfg->busy_retry_count >> 4;
+	if (!delta)
+		delta = cfg->busy_retry_count;
+
+	cfg->busy_retry_count -= delta;
+}
+
+static inline bool cbdwc_need_retry(struct cbd_worker_cfg *cfg)
+{
+	if (++cfg->busy_retry_cur < cfg->busy_retry_count) {
+		cpu_relax();
+		fsleep(cfg->busy_retry_interval);
+		return true;
+	}
+
+	return false;
+}
+
+/* cbd_transport */
+#define CBDT_INFO_F_BIGENDIAN		(1 << 0)
+#define CBDT_INFO_F_CRC			(1 << 1)
+
+#ifdef CONFIG_CBD_MULTIHOST
+#define CBDT_HOSTS_MAX			16
+#else
+#define CBDT_HOSTS_MAX			1
+#endif /*CONFIG_CBD_MULTIHOST*/
+
+struct cbd_transport_info {
+	__le64 magic;
+	__le16 version;
+	__le16 flags;
+
+	u32 hosts_registered;
+
+	u64 host_area_off;
+	u32 host_info_size;
+	u32 host_num;
+
+	u64 backend_area_off;
+	u32 backend_info_size;
+	u32 backend_num;
+
+	u64 blkdev_area_off;
+	u32 blkdev_info_size;
+	u32 blkdev_num;
+
+	u64 segment_area_off;
+	u32 segment_size;
+	u32 segment_num;
+};
+
+struct cbd_transport {
+	u16	id;
+	struct device device;
+	struct mutex lock;
+	struct mutex adm_lock;
+
+	struct cbd_transport_info *transport_info;
+
+	struct cbd_host *host;
+	struct list_head backends;
+	struct list_head devices;
+
+	struct cbd_hosts_device *cbd_hosts_dev;
+	struct cbd_segments_device *cbd_segments_dev;
+	struct cbd_backends_device *cbd_backends_dev;
+	struct cbd_blkdevs_device *cbd_blkdevs_dev;
+
+	struct dax_device *dax_dev;
+	struct file *bdev_file;
+};
+
+struct cbdt_register_options {
+	char hostname[CBD_NAME_LEN];
+	char path[CBD_PATH_LEN];
+	u16 format:1;
+	u16 force:1;
+	u16 unused:15;
+};
+
+struct cbd_blkdev;
+struct cbd_backend;
+struct cbd_backend_io;
+struct cbd_cache;
+
+int cbdt_register(struct cbdt_register_options *opts);
+int cbdt_unregister(u32 transport_id);
+
+struct cbd_host_info *cbdt_get_host_info(struct cbd_transport *cbdt, u32 id);
+struct cbd_backend_info *cbdt_get_backend_info(struct cbd_transport *cbdt, u32 id);
+struct cbd_blkdev_info *cbdt_get_blkdev_info(struct cbd_transport *cbdt, u32 id);
+struct cbd_segment_info *cbdt_get_segment_info(struct cbd_transport *cbdt, u32 id);
+static inline struct cbd_channel_info *cbdt_get_channel_info(struct cbd_transport *cbdt, u32 id)
+{
+	return (struct cbd_channel_info *)cbdt_get_segment_info(cbdt, id);
+}
+
+int cbdt_get_empty_host_id(struct cbd_transport *cbdt, u32 *id);
+int cbdt_get_empty_backend_id(struct cbd_transport *cbdt, u32 *id);
+int cbdt_get_empty_blkdev_id(struct cbd_transport *cbdt, u32 *id);
+int cbdt_get_empty_segment_id(struct cbd_transport *cbdt, u32 *id);
+
+void cbdt_add_backend(struct cbd_transport *cbdt, struct cbd_backend *cbdb);
+void cbdt_del_backend(struct cbd_transport *cbdt, struct cbd_backend *cbdb);
+struct cbd_backend *cbdt_get_backend(struct cbd_transport *cbdt, u32 id);
+void cbdt_add_blkdev(struct cbd_transport *cbdt, struct cbd_blkdev *blkdev);
+void cbdt_del_blkdev(struct cbd_transport *cbdt, struct cbd_blkdev *blkdev);
+struct cbd_blkdev *cbdt_get_blkdev(struct cbd_transport *cbdt, u32 id);
+
+struct page *cbdt_page(struct cbd_transport *cbdt, u64 transport_off, u32 *page_off);
+void cbdt_zero_range(struct cbd_transport *cbdt, void *pos, u32 size);
+
+/* cbd_host */
+CBD_DEVICE(host);
+
+enum cbd_host_state {
+	cbd_host_state_none	= 0,
+	cbd_host_state_running,
+	cbd_host_state_removing
+};
+
+struct cbd_host_info {
+	u8	state;
+	u64	alive_ts;
+	char	hostname[CBD_NAME_LEN];
+};
+
+struct cbd_host {
+	u32			host_id;
+	struct cbd_transport	*cbdt;
+
+	struct cbd_host_device	*dev;
+	struct cbd_host_info	*host_info;
+	struct delayed_work	hb_work; /* heartbeat work */
+};
+
+int cbd_host_register(struct cbd_transport *cbdt, char *hostname);
+int cbd_host_unregister(struct cbd_transport *cbdt);
+int cbd_host_clear(struct cbd_transport *cbdt, u32 host_id);
+bool cbd_host_info_is_alive(struct cbd_host_info *info);
+
+/* cbd_segment */
+CBD_DEVICE(segment);
+
+enum cbd_segment_state {
+	cbd_segment_state_none		= 0,
+	cbd_segment_state_running,
+	cbd_segment_state_removing
+};
+
+enum cbd_seg_type {
+	cbds_type_none = 0,
+	cbds_type_channel,
+	cbds_type_cache
+};
+
+static inline const char *cbds_type_str(enum cbd_seg_type type)
+{
+	if (type == cbds_type_channel)
+		return "channel";
+	else if (type == cbds_type_cache)
+		return "cache";
+
+	return "Unknown";
+}
+
+struct cbd_segment_info {
+	u8 state;
+	u8 type;
+	u8 ref;
+	u32 next_seg;
+	u64 alive_ts;
+};
+
+struct cbd_seg_pos {
+	struct cbd_segment *segment;
+	u32 off;
+};
+
+struct cbd_seg_ops {
+	void (*sanitize_pos)(struct cbd_seg_pos *pos);
+};
+
+struct cbds_init_options {
+	u32 seg_id;
+	enum cbd_seg_type type;
+	u32 data_off;
+	struct cbd_seg_ops *seg_ops;
+	void *priv_data;
+};
+
+struct cbd_segment {
+	struct cbd_transport		*cbdt;
+	struct cbd_segment		*next;
+
+	u32				seg_id;
+	struct cbd_segment_info		*segment_info;
+	struct cbd_seg_ops		*seg_ops;
+
+	void				*data;
+	u32				data_size;
+
+	void				*priv_data;
+
+	struct delayed_work		hb_work; /* heartbeat work */
+};
+
+int cbd_segment_clear(struct cbd_transport *cbdt, u32 segment_id);
+void cbd_segment_init(struct cbd_transport *cbdt, struct cbd_segment *segment,
+		      struct cbds_init_options *options);
+void cbd_segment_exit(struct cbd_segment *segment);
+bool cbd_segment_info_is_alive(struct cbd_segment_info *info);
+void cbds_copy_to_bio(struct cbd_segment *segment,
+		u32 data_off, u32 data_len, struct bio *bio, u32 bio_off);
+void cbds_copy_from_bio(struct cbd_segment *segment,
+		u32 data_off, u32 data_len, struct bio *bio, u32 bio_off);
+u32 cbd_seg_crc(struct cbd_segment *segment, u32 data_off, u32 data_len);
+int cbds_map_pages(struct cbd_segment *segment, struct cbd_backend_io *io);
+int cbds_pos_advance(struct cbd_seg_pos *seg_pos, u32 len);
+void cbds_copy_data(struct cbd_seg_pos *dst_pos,
+		struct cbd_seg_pos *src_pos, u32 len);
+
+/* cbd_channel */
+
+enum cbdc_blkdev_state {
+	cbdc_blkdev_state_none		= 0,
+	cbdc_blkdev_state_running,
+	cbdc_blkdev_state_stopped,
+};
+
+enum cbdc_backend_state {
+	cbdc_backend_state_none		= 0,
+	cbdc_backend_state_running,
+	cbdc_backend_state_stopped,
+};
+
+struct cbd_channel_info {
+	struct cbd_segment_info seg_info;	/* must be the first member */
+	u8	blkdev_state;
+	u32	blkdev_id;
+
+	u8	backend_state;
+	u32	backend_id;
+
+	u32	polling:1;
+
+	u32	submr_head;
+	u32	submr_tail;
+
+	u32	compr_head;
+	u32	compr_tail;
+};
+
+struct cbd_channel {
+	u32				seg_id;
+	struct cbd_segment		segment;
+
+	struct cbd_channel_info		*channel_info;
+
+	struct cbd_transport		*cbdt;
+
+	void				*submr;
+	void				*compr;
+
+	u32				submr_size;
+	u32				compr_size;
+
+	u32				data_size;
+	u32				data_head;
+	u32				data_tail;
+
+	spinlock_t			submr_lock;
+	spinlock_t			compr_lock;
+};
+
+void cbd_channel_init(struct cbd_channel *channel, struct cbd_transport *cbdt, u32 seg_id);
+void cbd_channel_exit(struct cbd_channel *channel);
+void cbdc_copy_from_bio(struct cbd_channel *channel,
+		u32 data_off, u32 data_len, struct bio *bio, u32 bio_off);
+void cbdc_copy_to_bio(struct cbd_channel *channel,
+		u32 data_off, u32 data_len, struct bio *bio, u32 bio_off);
+u32 cbd_channel_crc(struct cbd_channel *channel, u32 data_off, u32 data_len);
+int cbdc_map_pages(struct cbd_channel *channel, struct cbd_backend_io *io);
+int cbd_get_empty_channel_id(struct cbd_transport *cbdt, u32 *id);
+ssize_t cbd_channel_seg_detail_show(struct cbd_channel_info *channel_info, char *buf);
+
+/* cbd cache */
+struct cbd_cache_seg_info {
+	struct cbd_segment_info segment_info;	/* first member */
+	u32 flags;
+	u32 next_cache_seg_id;		/* index in cache->segments */
+	u64 gen;
+};
+
+#define CBD_CACHE_SEG_FLAGS_HAS_NEXT	(1 << 0)
+#define CBD_CACHE_SEG_FLAGS_WB_DONE	(1 << 1)
+#define CBD_CACHE_SEG_FLAGS_GC_DONE	(1 << 2)
+
+enum cbd_cache_blkdev_state {
+	cbd_cache_blkdev_state_none = 0,
+	cbd_cache_blkdev_state_running,
+	cbd_cache_blkdev_state_removing
+};
+
+struct cbd_cache_segment {
+	struct cbd_cache	*cache;
+	u32			cache_seg_id;	/* index in cache->segments */
+	u32			used;
+	spinlock_t		gen_lock;
+	struct cbd_cache_seg_info *cache_seg_info;
+	struct cbd_segment	segment;
+	atomic_t		refs;
+};
+
+struct cbd_cache_pos {
+	struct cbd_cache_segment *cache_seg;
+	u32		seg_off;
+};
+
+struct cbd_cache_pos_onmedia {
+	u32 cache_seg_id;
+	u32 seg_off;
+};
+
+struct cbd_cache_info {
+	u8	blkdev_state;
+	u32	blkdev_id;
+
+	u32	seg_id;
+	u32	n_segs;
+
+	u32	used_segs;
+	u16	gc_percent;
+
+	struct cbd_cache_pos_onmedia key_tail_pos;
+	struct cbd_cache_pos_onmedia dirty_tail_pos;
+};
+
+struct cbd_cache_tree {
+	struct rb_root			root;
+	spinlock_t			tree_lock;
+};
+
+struct cbd_cache_data_head {
+	spinlock_t			data_head_lock;
+	struct cbd_cache_pos		head_pos;
+};
+
+struct cbd_cache_key {
+	struct cbd_cache *cache;
+	struct cbd_cache_tree *cache_tree;
+	struct kref ref;
+
+	struct rb_node rb_node;
+	struct list_head list_node;
+
+	u64		off;
+	u32		len;
+	u64		flags;
+
+	struct cbd_cache_pos	cache_pos;
+
+	u64		seg_gen;
+#ifdef CONFIG_CBD_CRC
+	u32	data_crc;
+#endif
+};
+
+#define CBD_CACHE_KEY_FLAGS_EMPTY	(1 << 0)
+#define CBD_CACHE_KEY_FLAGS_CLEAN	(1 << 1)
+
+struct cbd_cache_key_onmedia {
+	u64	off;
+	u32	len;
+
+	u32	flags;
+
+	u32	cache_seg_id;
+	u32	cache_seg_off;
+
+	u64	seg_gen;
+#ifdef CONFIG_CBD_CRC
+	u32	data_crc;
+#endif
+};
+
+struct cbd_cache_kset_onmedia {
+	u32	crc;
+	u64	magic;
+	u64	flags;
+	u32	key_num;
+	struct cbd_cache_key_onmedia	data[];
+};
+
+#define CBD_KSET_FLAGS_LAST	(1 << 0)
+
+#define CBD_KSET_MAGIC		0x676894a64e164f1aULL
+
+struct cbd_cache_kset {
+	struct cbd_cache		*cache;
+	spinlock_t			kset_lock;
+	struct delayed_work		flush_work;
+	struct cbd_cache_kset_onmedia	kset_onmedia;
+};
+
+enum cbd_cache_state {
+	cbd_cache_state_none = 0,
+	cbd_cache_state_running,
+	cbd_cache_state_stopping
+};
+
+struct cbd_cache {
+	struct cbd_transport		*cbdt;
+	struct cbd_cache_info		*cache_info;
+	u32				cache_id;	/* same with related backend->backend_id */
+
+	u32				n_heads;
+	struct cbd_cache_data_head	*data_heads;
+
+	spinlock_t			key_head_lock;
+	struct cbd_cache_pos		key_head;
+	u32				n_ksets;
+	struct cbd_cache_kset		*ksets;
+
+	struct cbd_cache_pos		key_tail;
+	struct cbd_cache_pos		dirty_tail;
+
+	struct kmem_cache		*key_cache;
+	u32				n_trees;
+	struct cbd_cache_tree		*cache_trees;
+	struct work_struct		clean_work;
+
+	spinlock_t			miss_read_reqs_lock;
+	struct list_head		miss_read_reqs;
+	struct work_struct		miss_read_end_work;
+
+	struct workqueue_struct		*cache_wq;
+
+	struct file			*bdev_file;
+	u64				dev_size;
+	struct delayed_work		writeback_work;
+	struct delayed_work		gc_work;
+	struct bio_set			*bioset;
+
+	struct kmem_cache		*req_cache;
+
+	u32				state:8;
+	u32				init_keys:1;
+	u32				start_writeback:1;
+	u32				start_gc:1;
+
+	u32				n_segs;
+	unsigned long			*seg_map;
+	u32				last_cache_seg;
+	spinlock_t			seg_map_lock;
+	struct cbd_cache_segment	segments[]; /* should be the last member */
+};
+
+struct cbd_request;
+struct cbd_cache_opts {
+	struct cbd_cache_info *cache_info;
+	bool alloc_segs;
+	bool start_writeback;
+	bool start_gc;
+	bool init_keys;
+	u64 dev_size;
+	u32 n_paral;
+	struct file *bdev_file;	/* needed for start_writeback is true */
+};
+
+struct cbd_cache *cbd_cache_alloc(struct cbd_transport *cbdt,
+				  struct cbd_cache_opts *opts);
+void cbd_cache_destroy(struct cbd_cache *cache);
+int cbd_cache_handle_req(struct cbd_cache *cache, struct cbd_request *cbd_req);
+
+/* cbd_handler */
+struct cbd_handler {
+	struct cbd_backend	*cbdb;
+	struct cbd_channel_info *channel_info;
+
+	struct cbd_channel	channel;
+	spinlock_t		compr_lock;
+
+	u32			se_to_handle;
+	u64			req_tid_expected;
+
+	u32			polling:1;
+
+	struct delayed_work	handle_work;
+	struct cbd_worker_cfg	handle_worker_cfg;
+
+	struct hlist_node	hash_node;
+	struct bio_set		bioset;
+};
+
+void cbd_handler_destroy(struct cbd_handler *handler);
+int cbd_handler_create(struct cbd_backend *cbdb, u32 seg_id);
+void cbd_handler_notify(struct cbd_handler *handler);
+
+/* cbd_backend */
+CBD_DEVICE(backend);
+
+enum cbd_backend_state {
+	cbd_backend_state_none	= 0,
+	cbd_backend_state_running,
+	cbd_backend_state_removing
+};
+
+#define CBDB_BLKDEV_COUNT_MAX	1
+
+struct cbd_backend_info {
+	u8			state;
+	u32			host_id;
+	u32			blkdev_count;
+	u64			alive_ts;
+	u64			dev_size; /* nr_sectors */
+	struct cbd_cache_info	cache_info;
+
+	char			path[CBD_PATH_LEN];
+};
+
+struct cbd_backend_io {
+	struct cbd_se		*se;
+	u64			off;
+	u32			len;
+	struct bio		*bio;
+	struct cbd_handler	*handler;
+};
+
+#define CBD_BACKENDS_HANDLER_BITS	7
+
+struct cbd_backend {
+	u32			backend_id;
+	char			path[CBD_PATH_LEN];
+	struct cbd_transport	*cbdt;
+	struct cbd_backend_info *backend_info;
+	spinlock_t		lock;
+
+	struct block_device	*bdev;
+	struct file		*bdev_file;
+
+	struct workqueue_struct	*task_wq;
+	struct delayed_work	state_work;
+	struct delayed_work	hb_work; /* heartbeat work */
+
+	struct list_head	node; /* cbd_transport->backends */
+	DECLARE_HASHTABLE(handlers_hash, CBD_BACKENDS_HANDLER_BITS);
+
+	struct cbd_backend_device *backend_device;
+
+	struct kmem_cache	*backend_io_cache;
+
+	struct cbd_cache	*cbd_cache;
+	struct device		cache_dev;
+	bool			cache_dev_registered;
+};
+
+int cbd_backend_start(struct cbd_transport *cbdt, char *path, u32 backend_id, u32 cache_segs);
+int cbd_backend_stop(struct cbd_transport *cbdt, u32 backend_id);
+int cbd_backend_clear(struct cbd_transport *cbdt, u32 backend_id);
+int cbdb_add_handler(struct cbd_backend *cbdb, struct cbd_handler *handler);
+void cbdb_del_handler(struct cbd_backend *cbdb, struct cbd_handler *handler);
+bool cbd_backend_info_is_alive(struct cbd_backend_info *info);
+bool cbd_backend_cache_on(struct cbd_backend_info *backend_info);
+void cbd_backend_notify(struct cbd_backend *cbdb, u32 seg_id);
+
+/* cbd_queue */
+enum cbd_op {
+	CBD_OP_WRITE = 0,
+	CBD_OP_READ,
+	CBD_OP_FLUSH,
+};
+
+struct cbd_se {
+#ifdef CONFIG_CBD_CRC
+	u32			se_crc;		/* should be the first member */
+	u32			data_crc;
+#endif
+	u32			op;
+	u32			flags;
+	u64			req_tid;
+
+	u64			offset;
+	u32			len;
+
+	u32			data_off;
+	u32			data_len;
+};
+
+struct cbd_ce {
+#ifdef CONFIG_CBD_CRC
+	u32		ce_crc;		/* should be the first member */
+	u32		data_crc;
+#endif
+	u64		req_tid;
+	u32		result;
+	u32		flags;
+};
+
+#ifdef CONFIG_CBD_CRC
+static inline u32 cbd_se_crc(struct cbd_se *se)
+{
+	return crc32(0, (void *)se + 4, sizeof(*se) - 4);
+}
+
+static inline u32 cbd_ce_crc(struct cbd_ce *ce)
+{
+	return crc32(0, (void *)ce + 4, sizeof(*ce) - 4);
+}
+#endif
+
+struct cbd_request {
+	struct cbd_queue	*cbdq;
+
+	struct cbd_se		*se;
+	struct cbd_ce		*ce;
+	struct request		*req;
+
+	u64			off;
+	struct bio		*bio;
+	u32			bio_off;
+	spinlock_t		lock; /* race between cache and complete_work to access bio */
+
+	enum cbd_op		op;
+	u64			req_tid;
+	struct list_head	inflight_reqs_node;
+
+	u32			data_off;
+	u32			data_len;
+
+	struct work_struct	work;
+
+	struct kref		ref;
+	int			ret;
+	struct cbd_request	*parent;
+
+	void			*priv_data;
+	void (*end_req)(struct cbd_request *cbd_req, void *priv_data);
+};
+
+struct cbd_cache_req {
+	struct cbd_cache	*cache;
+	enum cbd_op		op;
+	struct work_struct	work;
+};
+
+#define CBD_SE_FLAGS_DONE	1
+
+static inline bool cbd_se_flags_test(struct cbd_se *se, u32 bit)
+{
+	return (se->flags & bit);
+}
+
+static inline void cbd_se_flags_set(struct cbd_se *se, u32 bit)
+{
+	se->flags |= bit;
+}
+
+enum cbd_queue_state {
+	cbd_queue_state_none	= 0,
+	cbd_queue_state_running,
+	cbd_queue_state_removing
+};
+
+struct cbd_queue {
+	struct cbd_blkdev	*cbd_blkdev;
+	u32			index;
+	struct list_head	inflight_reqs;
+	spinlock_t		inflight_reqs_lock;
+	u64			req_tid;
+
+	u64			*released_extents;
+
+	struct cbd_channel_info	*channel_info;
+	struct cbd_channel	channel;
+
+	atomic_t		state;
+
+	struct delayed_work	complete_work;
+	struct cbd_worker_cfg	complete_worker_cfg;
+};
+
+int cbd_queue_start(struct cbd_queue *cbdq);
+void cbd_queue_stop(struct cbd_queue *cbdq);
+extern const struct blk_mq_ops cbd_mq_ops;
+int cbd_queue_req_to_backend(struct cbd_request *cbd_req);
+void cbd_req_get(struct cbd_request *cbd_req);
+void cbd_req_put(struct cbd_request *cbd_req, int ret);
+void cbd_queue_advance(struct cbd_queue *cbdq, struct cbd_request *cbd_req);
+
+/* cbd_blkdev */
+CBD_DEVICE(blkdev);
+
+enum cbd_blkdev_state {
+	cbd_blkdev_state_none	= 0,
+	cbd_blkdev_state_running,
+	cbd_blkdev_state_removing
+};
+
+struct cbd_blkdev_info {
+	u8	state;
+	u64	alive_ts;
+	u32	backend_id;
+	u32	host_id;
+	u32	mapped_id;
+};
+
+struct cbd_blkdev {
+	u32			blkdev_id; /* index in transport blkdev area */
+	u32			backend_id;
+	int			mapped_id; /* id in block device such as: /dev/cbd0 */
+
+	struct cbd_backend	*backend; /* reference to backend if blkdev and backend on the same host */
+
+	int			major;		/* blkdev assigned major */
+	int			minor;
+	struct gendisk		*disk;		/* blkdev's gendisk and rq */
+
+	struct mutex		lock;
+	unsigned long		open_count;	/* protected by lock */
+
+	struct list_head	node;
+	struct delayed_work	hb_work; /* heartbeat work */
+
+	/* Block layer tags. */
+	struct blk_mq_tag_set	tag_set;
+
+	uint32_t		num_queues;
+	struct cbd_queue	*queues;
+
+	u64			dev_size;
+
+	struct workqueue_struct	*task_wq;
+
+	struct cbd_blkdev_device *blkdev_dev;
+	struct cbd_blkdev_info *blkdev_info;
+
+	struct cbd_transport *cbdt;
+
+	struct cbd_cache	*cbd_cache;
+};
+
+int cbd_blkdev_init(void);
+void cbd_blkdev_exit(void);
+int cbd_blkdev_start(struct cbd_transport *cbdt, u32 backend_id, u32 queues);
+int cbd_blkdev_stop(struct cbd_transport *cbdt, u32 devid, bool force);
+int cbd_blkdev_clear(struct cbd_transport *cbdt, u32 devid);
+bool cbd_blkdev_info_is_alive(struct cbd_blkdev_info *info);
+
+extern struct workqueue_struct	*cbd_wq;
+
+#define cbd_setup_device(DEV, PARENT, TYPE, fmt, ...)		\
+do {								\
+	device_initialize(DEV);					\
+	device_set_pm_not_required(DEV);			\
+	dev_set_name(DEV, fmt, ##__VA_ARGS__);			\
+	DEV->parent = PARENT;					\
+	DEV->type = TYPE;					\
+								\
+	ret = device_add(DEV);					\
+} while (0)
+
+#define CBD_OBJ_HEARTBEAT(OBJ)								\
+static void OBJ##_hb_workfn(struct work_struct *work)					\
+{											\
+	struct cbd_##OBJ *obj = container_of(work, struct cbd_##OBJ, hb_work.work);	\
+	struct cbd_##OBJ##_info *info = obj->OBJ##_info;				\
+											\
+	info->alive_ts = ktime_get_real();						\
+											\
+	queue_delayed_work(cbd_wq, &obj->hb_work, CBD_HB_INTERVAL);			\
+}											\
+											\
+bool cbd_##OBJ##_info_is_alive(struct cbd_##OBJ##_info *info)				\
+{											\
+	ktime_t oldest, ts;								\
+											\
+	ts = info->alive_ts;								\
+	oldest = ktime_sub_ms(ktime_get_real(), CBD_HB_TIMEOUT);			\
+											\
+	if (ktime_after(ts, oldest))							\
+		return true;								\
+											\
+	return false;									\
+}											\
+											\
+static ssize_t cbd_##OBJ##_alive_show(struct device *dev,				\
+			       struct device_attribute *attr,				\
+			       char *buf)						\
+{											\
+	struct cbd_##OBJ##_device *_dev;						\
+											\
+	_dev = container_of(dev, struct cbd_##OBJ##_device, dev);			\
+											\
+	if (cbd_##OBJ##_info_is_alive(_dev->OBJ##_info))				\
+		return sprintf(buf, "true\n");						\
+											\
+	return sprintf(buf, "false\n");							\
+}											\
+											\
+static DEVICE_ATTR(alive, 0400, cbd_##OBJ##_alive_show, NULL)
+
+#endif /* _CBD_INTERNAL_H */
diff --git a/drivers/block/cbd/cbd_transport.c b/drivers/block/cbd/cbd_transport.c
new file mode 100644
index 000000000000..30dc4cd6c77a
--- /dev/null
+++ b/drivers/block/cbd/cbd_transport.c
@@ -0,0 +1,957 @@
+#include <linux/pfn_t.h>
+#include "cbd_internal.h"
+
+#define CBDT_OBJ(OBJ, OBJ_SIZE)							\
+extern struct device_type cbd_##OBJ##_type;					\
+extern struct device_type cbd_##OBJ##s_type;					\
+										\
+static int cbd_##OBJ##s_init(struct cbd_transport *cbdt)			\
+{										\
+	struct cbd_##OBJ##s_device *devs;					\
+	struct cbd_##OBJ##_device *cbd_dev;					\
+	struct device *dev;							\
+	int i;									\
+	int ret;								\
+										\
+	u32 memsize = struct_size(devs, OBJ##_devs,				\
+			cbdt->transport_info->OBJ##_num);			\
+	devs = kzalloc(memsize, GFP_KERNEL);					\
+	if (!devs) {								\
+		return -ENOMEM;							\
+	}									\
+										\
+	dev = &devs->OBJ##s_dev;						\
+	device_initialize(dev);							\
+	device_set_pm_not_required(dev);					\
+	dev_set_name(dev, "cbd_" #OBJ "s");					\
+	dev->parent = &cbdt->device;						\
+	dev->type = &cbd_##OBJ##s_type;						\
+	ret = device_add(dev);							\
+	if (ret) {								\
+		goto devs_free;							\
+	}									\
+										\
+	for (i = 0; i < cbdt->transport_info->OBJ##_num; i++) {			\
+		cbd_dev = &devs->OBJ##_devs[i];					\
+		dev = &cbd_dev->dev;						\
+										\
+		cbd_dev->cbdt = cbdt;						\
+		cbd_dev->OBJ##_info = cbdt_get_##OBJ##_info(cbdt, i);		\
+		device_initialize(dev);						\
+		device_set_pm_not_required(dev);				\
+		dev_set_name(dev, #OBJ "%u", i);				\
+		dev->parent = &devs->OBJ##s_dev;				\
+		dev->type = &cbd_##OBJ##_type;					\
+										\
+		ret = device_add(dev);						\
+		if (ret) {							\
+			i--;							\
+			goto del_device;					\
+		}								\
+	}									\
+	cbdt->cbd_##OBJ##s_dev = devs;						\
+										\
+	return 0;								\
+del_device:									\
+	for (; i >= 0; i--) {							\
+		cbd_dev = &devs->OBJ##_devs[i];					\
+		dev = &cbd_dev->dev;						\
+		device_del(dev);						\
+	}									\
+devs_free:									\
+	kfree(devs);								\
+	return ret;								\
+}										\
+										\
+static void cbd_##OBJ##s_exit(struct cbd_transport *cbdt)			\
+{										\
+	struct cbd_##OBJ##s_device *devs = cbdt->cbd_##OBJ##s_dev;		\
+	struct device *dev;							\
+	int i;									\
+										\
+	if (!devs)								\
+		return;								\
+										\
+	for (i = 0; i < cbdt->transport_info->OBJ##_num; i++) {			\
+		struct cbd_##OBJ##_device *cbd_dev = &devs->OBJ##_devs[i];	\
+		dev = &cbd_dev->dev;						\
+										\
+		device_del(dev);						\
+	}									\
+										\
+	device_del(&devs->OBJ##s_dev);						\
+										\
+	kfree(devs);								\
+	cbdt->cbd_##OBJ##s_dev = NULL;						\
+										\
+	return;									\
+}										\
+										\
+static inline struct cbd_##OBJ##_info						\
+*__get_##OBJ##_info(struct cbd_transport *cbdt, u32 id)				\
+{										\
+	struct cbd_transport_info *info = cbdt->transport_info;			\
+	void *start = cbdt->transport_info;					\
+										\
+	start += info->OBJ##_area_off;						\
+										\
+	return start + ((u64)info->OBJ_SIZE * id);				\
+}										\
+										\
+struct cbd_##OBJ##_info								\
+*cbdt_get_##OBJ##_info(struct cbd_transport *cbdt, u32 id)			\
+{										\
+	struct cbd_##OBJ##_info *info;						\
+										\
+	mutex_lock(&cbdt->lock);						\
+	info = __get_##OBJ##_info(cbdt, id);					\
+	mutex_unlock(&cbdt->lock);						\
+										\
+	return info;								\
+}										\
+										\
+int cbdt_get_empty_##OBJ##_id(struct cbd_transport *cbdt, u32 *id)		\
+{										\
+	struct cbd_transport_info *info = cbdt->transport_info;			\
+	struct cbd_##OBJ##_info *_info;						\
+	int ret = 0;								\
+	int i;									\
+										\
+	mutex_lock(&cbdt->lock);						\
+	for (i = 0; i < info->OBJ##_num; i++) {					\
+		_info = __get_##OBJ##_info(cbdt, i);				\
+		if (_info->state == cbd_##OBJ##_state_none) {			\
+			cbdt_zero_range(cbdt, _info, info->OBJ_SIZE);			\
+			*id = i;						\
+			goto out;						\
+		}								\
+	}									\
+										\
+	cbdt_err(cbdt, "No available " #OBJ "_id found.");			\
+	ret = -ENOENT;								\
+out:										\
+	mutex_unlock(&cbdt->lock);						\
+										\
+	return ret;								\
+}
+
+CBDT_OBJ(host, host_info_size);
+CBDT_OBJ(backend, backend_info_size);
+CBDT_OBJ(blkdev, blkdev_info_size);
+CBDT_OBJ(segment, segment_size);
+
+static struct cbd_transport *cbd_transports[CBD_TRANSPORT_MAX];
+static DEFINE_IDA(cbd_transport_id_ida);
+static DEFINE_MUTEX(cbd_transport_mutex);
+
+extern struct bus_type cbd_bus_type;
+extern struct device cbd_root_dev;
+
+static ssize_t cbd_myhost_show(struct device *dev,
+			       struct device_attribute *attr,
+			       char *buf)
+{
+	struct cbd_transport *cbdt;
+	struct cbd_host *host;
+
+	cbdt = container_of(dev, struct cbd_transport, device);
+
+	host = cbdt->host;
+	if (!host)
+		return 0;
+
+	return sprintf(buf, "%d\n", host->host_id);
+}
+
+static DEVICE_ATTR(my_host_id, 0400, cbd_myhost_show, NULL);
+
+enum {
+	CBDT_ADM_OPT_ERR		= 0,
+	CBDT_ADM_OPT_OP,
+	CBDT_ADM_OPT_FORCE,
+	CBDT_ADM_OPT_PATH,
+	CBDT_ADM_OPT_BID,
+	CBDT_ADM_OPT_DID,
+	CBDT_ADM_OPT_QUEUES,
+	CBDT_ADM_OPT_HID,
+	CBDT_ADM_OPT_SID,
+	CBDT_ADM_OPT_CACHE_SIZE,
+};
+
+enum {
+	CBDT_ADM_OP_B_START,
+	CBDT_ADM_OP_B_STOP,
+	CBDT_ADM_OP_B_CLEAR,
+	CBDT_ADM_OP_DEV_START,
+	CBDT_ADM_OP_DEV_STOP,
+	CBDT_ADM_OP_DEV_CLEAR,
+	CBDT_ADM_OP_H_CLEAR,
+	CBDT_ADM_OP_S_CLEAR,
+};
+
+static const char *const adm_op_names[] = {
+	[CBDT_ADM_OP_B_START] = "backend-start",
+	[CBDT_ADM_OP_B_STOP] = "backend-stop",
+	[CBDT_ADM_OP_B_CLEAR] = "backend-clear",
+	[CBDT_ADM_OP_DEV_START] = "dev-start",
+	[CBDT_ADM_OP_DEV_STOP] = "dev-stop",
+	[CBDT_ADM_OP_DEV_CLEAR] = "dev-clear",
+	[CBDT_ADM_OP_H_CLEAR] = "host-clear",
+	[CBDT_ADM_OP_S_CLEAR] = "segment-clear",
+};
+
+static const match_table_t adm_opt_tokens = {
+	{ CBDT_ADM_OPT_OP,		"op=%s"	},
+	{ CBDT_ADM_OPT_FORCE,		"force=%u" },
+	{ CBDT_ADM_OPT_PATH,		"path=%s" },
+	{ CBDT_ADM_OPT_BID,		"backend_id=%u" },
+	{ CBDT_ADM_OPT_DID,		"dev_id=%u" },
+	{ CBDT_ADM_OPT_QUEUES,		"queues=%u" },
+	{ CBDT_ADM_OPT_HID,		"host_id=%u" },
+	{ CBDT_ADM_OPT_SID,		"segment_id=%u" },
+	{ CBDT_ADM_OPT_CACHE_SIZE,	"cache_size=%u" },	/* unit is MiB */
+	{ CBDT_ADM_OPT_ERR,		NULL	}
+};
+
+
+struct cbd_adm_options {
+	u16 op;
+	u16 force:1;
+	u32 backend_id;
+	union {
+		struct host_options {
+			u32 hid;
+		} host;
+		struct backend_options {
+			char path[CBD_PATH_LEN];
+			u64 cache_size_M;
+		} backend;
+		struct segment_options {
+			u32 sid;
+		} segment;
+		struct blkdev_options {
+			u32 devid;
+			u32 queues;
+		} blkdev;
+	};
+};
+
+static int parse_adm_options(struct cbd_transport *cbdt,
+		char *buf,
+		struct cbd_adm_options *opts)
+{
+	substring_t args[MAX_OPT_ARGS];
+	char *o, *p;
+	int token, ret = 0;
+
+	o = buf;
+
+	while ((p = strsep(&o, ",\n")) != NULL) {
+		if (!*p)
+			continue;
+
+		token = match_token(p, adm_opt_tokens, args);
+		switch (token) {
+		case CBDT_ADM_OPT_OP:
+			ret = match_string(adm_op_names, ARRAY_SIZE(adm_op_names), args[0].from);
+			if (ret < 0) {
+				cbdt_err(cbdt, "unknown op: '%s'\n", args[0].from);
+				ret = -EINVAL;
+				break;
+			}
+			opts->op = ret;
+			break;
+		case CBDT_ADM_OPT_PATH:
+			if (match_strlcpy(opts->backend.path, &args[0],
+				CBD_PATH_LEN) == 0) {
+				ret = -EINVAL;
+				break;
+			}
+			break;
+		case CBDT_ADM_OPT_FORCE:
+			if (match_uint(args, &token) || token != 1) {
+				ret = -EINVAL;
+				goto out;
+			}
+			opts->force = 1;
+			break;
+		case CBDT_ADM_OPT_BID:
+			if (match_uint(args, &token)) {
+				ret = -EINVAL;
+				goto out;
+			}
+			opts->backend_id = token;
+			break;
+		case CBDT_ADM_OPT_DID:
+			if (match_uint(args, &token)) {
+				ret = -EINVAL;
+				goto out;
+			}
+			opts->blkdev.devid = token;
+			break;
+		case CBDT_ADM_OPT_QUEUES:
+			if (match_uint(args, &token)) {
+				ret = -EINVAL;
+				goto out;
+			}
+			opts->blkdev.queues = token;
+			break;
+		case CBDT_ADM_OPT_HID:
+			if (match_uint(args, &token)) {
+				ret = -EINVAL;
+				goto out;
+			}
+			opts->host.hid = token;
+			break;
+		case CBDT_ADM_OPT_SID:
+			if (match_uint(args, &token)) {
+				ret = -EINVAL;
+				goto out;
+			}
+			opts->segment.sid = token;
+			break;
+		case CBDT_ADM_OPT_CACHE_SIZE:
+			if (match_uint(args, &token)) {
+				ret = -EINVAL;
+				goto out;
+			}
+			opts->backend.cache_size_M = token;
+			break;
+		default:
+			cbdt_err(cbdt, "unknown parameter or missing value '%s'\n", p);
+			ret = -EINVAL;
+			goto out;
+		}
+	}
+
+out:
+	return ret;
+}
+
+void cbdt_zero_range(struct cbd_transport *cbdt, void *pos, u32 size)
+{
+	memset(pos, 0, size);
+}
+
+static void segments_format(struct cbd_transport *cbdt)
+{
+	u32 i;
+	struct cbd_segment_info *seg_info;
+
+	for (i = 0; i < cbdt->transport_info->segment_num; i++) {
+		seg_info = cbdt_get_segment_info(cbdt, i);
+		cbdt_zero_range(cbdt, seg_info, sizeof(struct cbd_segment_info));
+	}
+}
+
+static int cbd_transport_format(struct cbd_transport *cbdt, bool force)
+{
+	struct cbd_transport_info *info = cbdt->transport_info;
+	u64 transport_dev_size;
+	u32 seg_size;
+	u32 nr_segs;
+	u64 magic;
+	u16 flags = 0;
+
+	magic = le64_to_cpu(info->magic);
+	if (magic && !force)
+		return -EEXIST;
+
+	transport_dev_size = bdev_nr_bytes(file_bdev(cbdt->bdev_file));
+	if (transport_dev_size < CBD_TRASNPORT_SIZE_MIN) {
+		cbdt_err(cbdt, "dax device is too small, required at least %u",
+				CBD_TRASNPORT_SIZE_MIN);
+		return -ENOSPC;
+	}
+
+	memset(info, 0, sizeof(*info));
+
+	info->magic = cpu_to_le64(CBD_TRANSPORT_MAGIC);
+	info->version = cpu_to_le16(CBD_TRANSPORT_VERSION);
+#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
+	flags |= CBDT_INFO_F_BIGENDIAN;
+#endif
+#ifdef CONFIG_CBD_CRC
+	flags |= CBDT_INFO_F_CRC;
+#endif
+	info->flags = cpu_to_le16(flags);
+
+	info->hosts_registered = 0;
+	/*
+	 * Try to fully utilize all available space,
+	 * assuming host:blkdev:backend:segment = 1:1:1:1
+	 */
+	seg_size = (CBDT_HOST_INFO_SIZE + CBDT_BACKEND_INFO_SIZE +
+			CBDT_BLKDEV_INFO_SIZE + CBDT_SEG_SIZE);
+	nr_segs = (transport_dev_size - CBDT_INFO_SIZE) / seg_size;
+
+	info->host_area_off = CBDT_INFO_OFF + CBDT_INFO_SIZE;
+	info->host_info_size = CBDT_HOST_INFO_SIZE;
+	info->host_num = nr_segs;
+
+	info->backend_area_off = info->host_area_off + (info->host_info_size * info->host_num);
+	info->backend_info_size = CBDT_BACKEND_INFO_SIZE;
+	info->backend_num = nr_segs;
+
+	info->blkdev_area_off = info->backend_area_off + (info->backend_info_size * info->backend_num);
+	info->blkdev_info_size = CBDT_BLKDEV_INFO_SIZE;
+	info->blkdev_num = nr_segs;
+
+	info->segment_area_off = info->blkdev_area_off + (info->blkdev_info_size * info->blkdev_num);
+	info->segment_size = CBDT_SEG_SIZE;
+	info->segment_num = nr_segs;
+
+	cbdt_zero_range(cbdt, (void *)info + info->host_area_off,
+			     info->segment_area_off - info->host_area_off);
+
+	segments_format(cbdt);
+
+	return 0;
+}
+
+/*
+ * Any transport metadata allocation or reclaim should be in the
+ * control operation rutine
+ *
+ * All transport space allocation and deallocation should occur within the control flow,
+ * specifically within `adm_store()`, so that all transport space allocation
+ * and deallocation are managed within this function. This prevents other processes
+ * from involving transport space allocation and deallocation. By making `adm_store`
+ * exclusive, we can manage space effectively. For a single-host scenario, `adm_lock`
+ * can ensure mutual exclusion of `adm_store`. However, in a multi-host scenario,
+ * we need a distributed lock to guarantee that all `adm_store` calls are mutually exclusive.
+ *
+ * TODO: Is there a way to lock the CXL shared memory device?
+ */
+static ssize_t adm_store(struct device *dev,
+			struct device_attribute *attr,
+			const char *ubuf,
+			size_t size)
+{
+	int ret;
+	char *buf;
+	struct cbd_adm_options opts = { 0 };
+	struct cbd_transport *cbdt;
+
+	opts.backend_id = U32_MAX;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	cbdt = container_of(dev, struct cbd_transport, device);
+
+	buf = kmemdup(ubuf, size + 1, GFP_KERNEL);
+	if (IS_ERR(buf)) {
+		cbdt_err(cbdt, "failed to dup buf for adm option: %d", (int)PTR_ERR(buf));
+		return PTR_ERR(buf);
+	}
+	buf[size] = '\0';
+	ret = parse_adm_options(cbdt, buf, &opts);
+	if (ret < 0) {
+		kfree(buf);
+		return ret;
+	}
+	kfree(buf);
+
+	mutex_lock(&cbdt->adm_lock);
+	switch (opts.op) {
+	case CBDT_ADM_OP_B_START:
+		u32 cache_segs = 0;
+
+		if (opts.backend.cache_size_M > 0)
+			cache_segs = DIV_ROUND_UP(opts.backend.cache_size_M,
+					cbdt->transport_info->segment_size / CBD_MB);
+
+		ret = cbd_backend_start(cbdt, opts.backend.path, opts.backend_id, cache_segs);
+		break;
+	case CBDT_ADM_OP_B_STOP:
+		ret = cbd_backend_stop(cbdt, opts.backend_id);
+		break;
+	case CBDT_ADM_OP_B_CLEAR:
+		ret = cbd_backend_clear(cbdt, opts.backend_id);
+		break;
+	case CBDT_ADM_OP_DEV_START:
+		if (opts.blkdev.queues > CBD_QUEUES_MAX) {
+			mutex_unlock(&cbdt->adm_lock);
+			cbdt_err(cbdt, "invalid queues = %u, larger than max %u\n",
+					opts.blkdev.queues, CBD_QUEUES_MAX);
+			return -EINVAL;
+		}
+		ret = cbd_blkdev_start(cbdt, opts.backend_id, opts.blkdev.queues);
+		break;
+	case CBDT_ADM_OP_DEV_STOP:
+		ret = cbd_blkdev_stop(cbdt, opts.blkdev.devid, opts.force);
+		break;
+	case CBDT_ADM_OP_DEV_CLEAR:
+		ret = cbd_blkdev_clear(cbdt, opts.blkdev.devid);
+		break;
+	case CBDT_ADM_OP_H_CLEAR:
+		ret = cbd_host_clear(cbdt, opts.host.hid);
+		break;
+	case CBDT_ADM_OP_S_CLEAR:
+		ret = cbd_segment_clear(cbdt, opts.segment.sid);
+		break;
+	default:
+		mutex_unlock(&cbdt->adm_lock);
+		cbdt_err(cbdt, "invalid op: %d\n", opts.op);
+		return -EINVAL;
+	}
+	mutex_unlock(&cbdt->adm_lock);
+
+	if (ret < 0)
+		return ret;
+
+	return size;
+}
+
+static DEVICE_ATTR_WO(adm);
+
+static ssize_t cbd_transport_info(struct cbd_transport *cbdt, char *buf)
+{
+	struct cbd_transport_info *info = cbdt->transport_info;
+	ssize_t ret;
+
+	mutex_lock(&cbdt->lock);
+	info = cbdt->transport_info;
+	mutex_unlock(&cbdt->lock);
+
+	ret = sprintf(buf, "magic: 0x%llx\n"
+			"version: %u\n"
+			"flags: %x\n\n"
+			"hosts_registered: %u\n"
+			"host_area_off: %llu\n"
+			"bytes_per_host_info: %u\n"
+			"host_num: %u\n\n"
+			"backend_area_off: %llu\n"
+			"bytes_per_backend_info: %u\n"
+			"backend_num: %u\n\n"
+			"blkdev_area_off: %llu\n"
+			"bytes_per_blkdev_info: %u\n"
+			"blkdev_num: %u\n\n"
+			"segment_area_off: %llu\n"
+			"bytes_per_segment: %u\n"
+			"segment_num: %u\n",
+			le64_to_cpu(info->magic),
+			le16_to_cpu(info->version),
+			le16_to_cpu(info->flags),
+			info->hosts_registered,
+			info->host_area_off,
+			info->host_info_size,
+			info->host_num,
+			info->backend_area_off,
+			info->backend_info_size,
+			info->backend_num,
+			info->blkdev_area_off,
+			info->blkdev_info_size,
+			info->blkdev_num,
+			info->segment_area_off,
+			info->segment_size,
+			info->segment_num);
+
+	return ret;
+}
+
+static ssize_t cbd_info_show(struct device *dev,
+			       struct device_attribute *attr,
+			       char *buf)
+{
+	struct cbd_transport *cbdt;
+
+	cbdt = container_of(dev, struct cbd_transport, device);
+
+	return cbd_transport_info(cbdt, buf);
+}
+static DEVICE_ATTR(info, 0400, cbd_info_show, NULL);
+
+static struct attribute *cbd_transport_attrs[] = {
+	&dev_attr_adm.attr,
+	&dev_attr_info.attr,
+	&dev_attr_my_host_id.attr,
+	NULL
+};
+
+static struct attribute_group cbd_transport_attr_group = {
+	.attrs = cbd_transport_attrs,
+};
+
+static const struct attribute_group *cbd_transport_attr_groups[] = {
+	&cbd_transport_attr_group,
+	NULL
+};
+
+static void cbd_transport_release(struct device *dev)
+{
+}
+
+const struct device_type cbd_transport_type = {
+	.name		= "cbd_transport",
+	.groups		= cbd_transport_attr_groups,
+	.release	= cbd_transport_release,
+};
+
+static int
+cbd_dax_notify_failure(
+	struct dax_device	*dax_devp,
+	u64			offset,
+	u64			len,
+	int			mf_flags)
+{
+
+	pr_err("%s: dax_devp %llx offset %llx len %lld mf_flags %x\n",
+	       __func__, (u64)dax_devp, (u64)offset, (u64)len, mf_flags);
+	return -EOPNOTSUPP;
+}
+
+const struct dax_holder_operations cbd_dax_holder_ops = {
+	.notify_failure		= cbd_dax_notify_failure,
+};
+
+static struct cbd_transport *cbdt_alloc(void)
+{
+	struct cbd_transport *cbdt;
+	int ret;
+
+	cbdt = kzalloc(sizeof(struct cbd_transport), GFP_KERNEL);
+	if (!cbdt)
+		return NULL;
+
+	mutex_init(&cbdt->lock);
+	mutex_init(&cbdt->adm_lock);
+	INIT_LIST_HEAD(&cbdt->backends);
+	INIT_LIST_HEAD(&cbdt->devices);
+
+	ret = ida_simple_get(&cbd_transport_id_ida, 0, CBD_TRANSPORT_MAX,
+				GFP_KERNEL);
+	if (ret < 0)
+		goto cbdt_free;
+
+	cbdt->id = ret;
+	cbd_transports[cbdt->id] = cbdt;
+
+	return cbdt;
+
+cbdt_free:
+	kfree(cbdt);
+	return NULL;
+}
+
+static void cbdt_destroy(struct cbd_transport *cbdt)
+{
+	cbd_transports[cbdt->id] = NULL;
+	ida_simple_remove(&cbd_transport_id_ida, cbdt->id);
+	kfree(cbdt);
+}
+
+static int cbdt_dax_init(struct cbd_transport *cbdt, char *path)
+{
+	struct dax_device *dax_dev = NULL;
+	struct file *bdev_file = NULL;
+	long access_size;
+	void *kaddr;
+	u64 start_off = 0;
+	int ret;
+	int id;
+
+	bdev_file = bdev_file_open_by_path(path, BLK_OPEN_READ | BLK_OPEN_WRITE, cbdt, NULL);
+	if (IS_ERR(bdev_file)) {
+		cbdt_err(cbdt, "%s: failed blkdev_get_by_path(%s)\n", __func__, path);
+		ret = PTR_ERR(bdev_file);
+		goto err;
+	}
+
+	dax_dev = fs_dax_get_by_bdev(file_bdev(bdev_file), &start_off,
+				     cbdt,
+				     &cbd_dax_holder_ops);
+	if (IS_ERR(dax_dev)) {
+		cbdt_err(cbdt, "%s: unable to get daxdev from bdev_file\n", __func__);
+		ret = -ENODEV;
+		goto fput;
+	}
+
+	id = dax_read_lock();
+	access_size = dax_direct_access(dax_dev, 0, 1, DAX_ACCESS, &kaddr, NULL);
+	if (access_size != 1) {
+		dax_read_unlock(id);
+		ret = -EINVAL;
+		goto dax_put;
+	}
+
+	cbdt->bdev_file = bdev_file;
+	cbdt->dax_dev = dax_dev;
+	cbdt->transport_info = (struct cbd_transport_info *)kaddr;
+	dax_read_unlock(id);
+
+	return 0;
+
+dax_put:
+	fs_put_dax(dax_dev, cbdt);
+fput:
+	fput(bdev_file);
+err:
+	return ret;
+}
+
+static void cbdt_dax_release(struct cbd_transport *cbdt)
+{
+	if (cbdt->dax_dev)
+		fs_put_dax(cbdt->dax_dev, cbdt);
+
+	if (cbdt->bdev_file)
+		fput(cbdt->bdev_file);
+}
+
+static int cbd_transport_init(struct cbd_transport *cbdt)
+{
+	struct device *dev;
+
+	dev = &cbdt->device;
+	device_initialize(dev);
+	device_set_pm_not_required(dev);
+	dev->bus = &cbd_bus_type;
+	dev->type = &cbd_transport_type;
+	dev->parent = &cbd_root_dev;
+
+	dev_set_name(&cbdt->device, "transport%d", cbdt->id);
+
+	return device_add(&cbdt->device);
+}
+
+
+static int cbdt_validate(struct cbd_transport *cbdt)
+{
+	u16 flags;
+
+	if (le64_to_cpu(cbdt->transport_info->magic) != CBD_TRANSPORT_MAGIC) {
+		cbdt_err(cbdt, "unexpected magic: %llx\n",
+				le64_to_cpu(cbdt->transport_info->magic));
+		return -EINVAL;
+	}
+
+	flags = le16_to_cpu(cbdt->transport_info->flags);
+#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
+	if (!(flags & CBDT_INFO_F_BIGENDIAN)) {
+		cbdt_err(cbdt, "transport is not big endian\n");
+		return -EINVAL;
+	}
+#else
+	if (flags & CBDT_INFO_F_BIGENDIAN) {
+		cbdt_err(cbdt, "transport is big endian\n");
+		return -EINVAL;
+	}
+#endif
+
+#ifndef CONFIG_CBD_CRC
+	if (flags & CBDT_INFO_F_CRC) {
+		cbdt_err(cbdt, "transport expects CBD_CRC enabled.\n");
+		return -ENOTSUPP;
+	}
+#endif
+
+	return 0;
+}
+
+int cbdt_unregister(u32 tid)
+{
+	struct cbd_transport *cbdt;
+
+	cbdt = cbd_transports[tid];
+	if (!cbdt) {
+		pr_err("tid: %u, is not registered\n", tid);
+		return -EINVAL;
+	}
+
+	mutex_lock(&cbdt->lock);
+	if (!list_empty(&cbdt->backends) || !list_empty(&cbdt->devices)) {
+		mutex_unlock(&cbdt->lock);
+		return -EBUSY;
+	}
+	mutex_unlock(&cbdt->lock);
+
+	cbd_blkdevs_exit(cbdt);
+	cbd_segments_exit(cbdt);
+	cbd_backends_exit(cbdt);
+	cbd_hosts_exit(cbdt);
+
+	cbd_host_unregister(cbdt);
+	cbdt->transport_info->hosts_registered--;
+
+	device_unregister(&cbdt->device);
+	cbdt_dax_release(cbdt);
+	cbdt_destroy(cbdt);
+	module_put(THIS_MODULE);
+
+	return 0;
+}
+
+static bool cbdt_register_allowed(struct cbd_transport *cbdt)
+{
+	struct cbd_transport_info *transport_info;
+
+	transport_info = cbdt->transport_info;
+
+	if (transport_info->hosts_registered >= CBDT_HOSTS_MAX) {
+		cbdt_err(cbdt, "too many hosts registered: %u (max %u).",
+				transport_info->hosts_registered,
+				CBDT_HOSTS_MAX);
+		return false;
+	}
+
+	return true;
+}
+
+int cbdt_register(struct cbdt_register_options *opts)
+{
+	struct cbd_transport *cbdt;
+	int ret;
+
+	if (!try_module_get(THIS_MODULE))
+		return -ENODEV;
+
+	if (!strstr(opts->path, "/dev/pmem")) {
+		pr_err("%s: path (%s) is not pmem\n",
+		       __func__, opts->path);
+		ret = -EINVAL;
+		goto module_put;
+	}
+
+	cbdt = cbdt_alloc();
+	if (!cbdt) {
+		ret = -ENOMEM;
+		goto module_put;
+	}
+
+	ret = cbdt_dax_init(cbdt, opts->path);
+	if (ret)
+		goto cbdt_destroy;
+
+	if (!cbdt_register_allowed(cbdt)) {
+		ret = -EINVAL;
+		goto dax_release;
+	}
+
+	if (opts->format) {
+		ret = cbd_transport_format(cbdt, opts->force);
+		if (ret < 0)
+			goto dax_release;
+	}
+
+	ret = cbdt_validate(cbdt);
+	if (ret)
+		goto dax_release;
+
+	ret = cbd_transport_init(cbdt);
+	if (ret)
+		goto dax_release;
+
+	ret = cbd_host_register(cbdt, opts->hostname);
+	if (ret)
+		goto dev_unregister;
+
+	if (cbd_hosts_init(cbdt) || cbd_backends_init(cbdt) ||
+	    cbd_segments_init(cbdt) || cbd_blkdevs_init(cbdt)) {
+		ret = -ENOMEM;
+		goto devs_exit;
+	}
+
+	cbdt->transport_info->hosts_registered++;
+
+	return 0;
+
+devs_exit:
+	cbd_blkdevs_exit(cbdt);
+	cbd_segments_exit(cbdt);
+	cbd_backends_exit(cbdt);
+	cbd_hosts_exit(cbdt);
+
+	cbd_host_unregister(cbdt);
+dev_unregister:
+	device_unregister(&cbdt->device);
+dax_release:
+	cbdt_dax_release(cbdt);
+cbdt_destroy:
+	cbdt_destroy(cbdt);
+module_put:
+	module_put(THIS_MODULE);
+
+	return ret;
+}
+
+void cbdt_add_backend(struct cbd_transport *cbdt, struct cbd_backend *cbdb)
+{
+	mutex_lock(&cbdt->lock);
+	list_add(&cbdb->node, &cbdt->backends);
+	mutex_unlock(&cbdt->lock);
+}
+
+void cbdt_del_backend(struct cbd_transport *cbdt, struct cbd_backend *cbdb)
+{
+	if (list_empty(&cbdb->node))
+		return;
+
+	mutex_lock(&cbdt->lock);
+	list_del_init(&cbdb->node);
+	mutex_unlock(&cbdt->lock);
+}
+
+struct cbd_backend *cbdt_get_backend(struct cbd_transport *cbdt, u32 id)
+{
+	struct cbd_backend *backend;
+
+	mutex_lock(&cbdt->lock);
+	list_for_each_entry(backend, &cbdt->backends, node) {
+		if (backend->backend_id == id)
+			goto out;
+	}
+	backend = NULL;
+out:
+	mutex_unlock(&cbdt->lock);
+	return backend;
+}
+
+void cbdt_add_blkdev(struct cbd_transport *cbdt, struct cbd_blkdev *blkdev)
+{
+	mutex_lock(&cbdt->lock);
+	list_add(&blkdev->node, &cbdt->devices);
+	mutex_unlock(&cbdt->lock);
+}
+
+void cbdt_del_blkdev(struct cbd_transport *cbdt, struct cbd_blkdev *blkdev)
+{
+	if (list_empty(&blkdev->node))
+		return;
+
+	mutex_lock(&cbdt->lock);
+	list_del_init(&blkdev->node);
+	mutex_unlock(&cbdt->lock);
+}
+
+struct cbd_blkdev *cbdt_get_blkdev(struct cbd_transport *cbdt, u32 id)
+{
+	struct cbd_blkdev *dev;
+
+	mutex_lock(&cbdt->lock);
+	list_for_each_entry(dev, &cbdt->devices, node) {
+		if (dev->blkdev_id == id)
+			goto out;
+	}
+	dev = NULL;
+out:
+	mutex_unlock(&cbdt->lock);
+	return dev;
+}
+
+struct page *cbdt_page(struct cbd_transport *cbdt, u64 transport_off, u32 *page_off)
+{
+	long access_size;
+	pfn_t pfn;
+
+	access_size = dax_direct_access(cbdt->dax_dev, transport_off >> PAGE_SHIFT,
+					1, DAX_ACCESS, NULL, &pfn);
+	if (access_size < 0)
+		return NULL;
+
+	if (page_off)
+		*page_off = transport_off & PAGE_MASK;
+
+	return pfn_t_to_page(pfn);
+}
-- 
2.34.1


  reply	other threads:[~2024-09-18 10:18 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-18 10:18 [PATCH v2 0/8] Introduce CBD (CXL Block Device) Dongsheng Yang
2024-09-18 10:18 ` Dongsheng Yang [this message]
2024-09-18 10:18 ` [PATCH v2 2/8] cbd: introduce cbd_host Dongsheng Yang
2024-09-18 10:18 ` [PATCH v2 3/8] cbd: introduce cbd_segment Dongsheng Yang
2024-09-18 10:18 ` [PATCH v2 4/8] cbd: introduce cbd_channel Dongsheng Yang
2024-09-18 10:18 ` [PATCH v2 5/8] cbd: introduce cbd_cache Dongsheng Yang
2024-09-18 10:18 ` [PATCH v2 6/8] cbd: introduce cbd_blkdev Dongsheng Yang
2024-09-18 10:18 ` [PATCH v2 7/8] cbd: introduce cbd_backend Dongsheng Yang
2024-09-18 10:18 ` [PATCH v2 8/8] block: Init for CBD(CXL Block Device) module Dongsheng Yang
2024-09-24 16:35   ` Randy Dunlap
2024-09-25  1:48     ` Dongsheng Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240918101821.681118-2-dongsheng.yang@linux.dev \
    --to=dongsheng.yang@linux.dev \
    --cc=John@groves.net \
    --cc=Jonathan.Cameron@Huawei.com \
    --cc=axboe@kernel.dk \
    --cc=bbhushan2@marvell.com \
    --cc=chaitanyak@nvidia.com \
    --cc=dan.j.williams@intel.com \
    --cc=gregory.price@memverge.com \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rdunlap@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.