From mboxrd@z Thu Jan  1 00:00:00 1970
From: hch@lst.de (Christoph Hellwig)
Date: Thu, 17 May 2018 14:47:14 +0200
Subject: [PATCH V5] nvmet: add simple file backed ns support
In-Reply-To: <20180517054558.5648-1-chaitanya.kulkarni@wdc.com>
References: <20180517054558.5648-1-chaitanya.kulkarni@wdc.com>
Message-ID: <20180517124714.GA31470@lst.de>

On Thu, May 17, 2018 at 01:45:58AM -0400, Chaitanya Kulkarni wrote:
> This patch adds simple file backed namespace support for
> NVMeOF target.
> 
> In current implementation NVMeOF supports block
> device backed namespaces on the target side.
> With this implementation regular file(s) can be used
> to initialize the namespace(s) via target side
> configfs interface.
> 
> For admin smart log command on the target side, we only
> use block device backed namespaces to compute target
> side smart log.
> 
> The new file io-cmd-file.c is responsible for handling
> the code for I/O commands when ns is file backed. Also,
> we introduce mempools based slow path for file backed
> ns.

Please use ~73 characters per line for the changelog instead of
wrapping earlier.

> +static int nvmet_ns_enable_blkdev(struct nvmet_ns *ns)

> +static int nvmet_ns_enable_file(struct nvmet_ns *ns)

I think these should be moved to io-cmd.c and io-cmd-file.c
respectively, including adding little helpers for close, so that the
code dealing with file structures or block devices is relatively
isolated now that we split up the implementation files.

Sagi - what do you think about renaming io-cmd.c to io-cmd-bdev.c
while we're at it?

> +	ns->cachep = kmem_cache_create("nvmet-fs",

What about nvmet-bvec as the name?
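To illustrate the split, here is a completely untested sketch of what
the isolated file-side helpers could look like (names and details are
made up, adjust as needed):

	static int nvmet_file_ns_enable(struct nvmet_ns *ns)
	{
		ns->file = filp_open(ns->device_path,
				O_RDWR | O_LARGEFILE, 0);
		if (IS_ERR(ns->file))
			return PTR_ERR(ns->file);

		ns->size = i_size_read(ns->file->f_mapping->host);
		return 0;
	}

	static void nvmet_file_ns_disable(struct nvmet_ns *ns)
	{
		if (ns->file) {
			fput(ns->file);
			ns->file = NULL;
		}
	}

with an equivalent pair for the block device side in io-cmd.c, so
ns-enable.c (or core.c) only dispatches between the two.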
> +	if (req->ns->file)
> +		return nvmet_parse_io_cmd_file(req);
> +	else if (req->ns->bdev)
> +		return nvmet_parse_io_cmd_blkdev(req);
> +
> +	WARN_ON_ONCE("failed to parse io cmd: invalid ns handle");
> +
> +	return NVME_SC_INVALID_OPCODE | NVME_SC_DNR;

I'd just leave it as an if/else; things will blow up nicely if this
invariant ever gets broken:

	if (req->ns->file)
		return nvmet_parse_io_cmd_file(req);
	else
		return nvmet_parse_io_cmd_blkdev(req);

> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.

You can remove this whole boilerplate given that you have an SPDX tag.

> +typedef ssize_t (*call_iter_t) (struct kiocb *, struct iov_iter *);
> +typedef void (*ki_complete_t)(struct kiocb *iocb, long ret, long ret2);

No need for these typedefs.

> +	if ((bv_cnt == NVMET_MAX_BVEC) || ((nr_bvec - 1) == 0)) {

No need for all but the outer braces.

> +		init_completion(&req->f.io_complete);
> +
> +		ret = nvmet_issue_bvec(req, pos, bv_cnt, len, io_done);
> +		if (ret == -EIOCBQUEUED)
> +			wait_for_completion(&req->f.io_complete);

Instead of using your own completion, the synchronous case should just
not set a ki_complete callback at all, which means the file system
will handle the I/O synchronously for you.
That probably means you can share the actual low-level routine for
doing I/O, just don't set ki_complete in the is_sync case, and have
some sort of outer loop chunking up into nr_bvec sized chunks.

> +	req->f.bvec = req->inline_bvec;
> +	if (nr_bvec > NVMET_MAX_INLINE_BIOVEC) {
> +		req->f.bvec = kmalloc_array(nr_bvec, sizeof(struct bio_vec),
> +				GFP_KERNEL);
> +		/* fallback */
> +		if (!req->f.bvec) {
> +			req->f.bvec = mempool_alloc(req->ns->mempool,
> +					GFP_KERNEL);
> +			if (!req->f.bvec)
> +				goto out;
> +			slow_path = true;
> +		}
> +	}
> +
> +	req->f.bvec = mempool_alloc(req->ns->mempool,
> +			GFP_KERNEL);
> +	if (!req->f.bvec)
> +		goto out;
> +	slow_path = true;

This seems to always hit the slow path.  How about this instead:

	if (nr_bvec > NVMET_MAX_INLINE_BIOVEC) {
		req->f.bvec = kmalloc_array(nr_bvec, sizeof(struct bio_vec),
				GFP_KERNEL);
	} else {
		req->f.bvec = req->inline_bvec;
	}

	/* fallback under memory pressure */
	if (unlikely(!req->f.bvec)) {
		req->f.bvec = mempool_alloc(req->ns->mempool, GFP_KERNEL);
		if (nr_bvec > NVMET_MAX_MEMPOOL_BVEC)
			is_sync = true;
	}

Also I think NVMET_MAX_BVEC should be NVMET_MAX_MEMPOOL_BVEC, and the
actual I/O functions really don't need most of the arguments as far as
I can tell.

mempool_alloc with a GFP_KERNEL argument can't fail, it will just
block forever, so no need to check the return value there.

> -static inline u32 nvmet_rw_len(struct nvmet_req *req)
> +inline u32 nvmet_rw_len(struct nvmet_req *req)
>  {
>  	return ((u32)le16_to_cpu(req->cmd->rw.length) + 1) <<
>  			req->ns->blksize_shift;

I'd keep this as an inline and just move it to nvmet.h.

> +#define NVMET_MAX_BVEC		16
> +#define NVMET_MPOOL_MIN_OBJ	NVMET_MAX_BVEC

Even if both of these are 16 they aren't really related, so please
define NVMET_MPOOL_MIN_OBJ directly.  Also with the move of the enable
function these can be kept in io-cmd-file.c.
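To make the ki_complete point above concrete, an entirely untested
sketch of such a shared submission helper (names, fields and
signatures are made up):

	static ssize_t nvmet_file_submit_bvec(struct nvmet_req *req,
			loff_t pos, unsigned long nr_segs, size_t count)
	{
		struct kiocb *iocb = &req->f.iocb;
		struct iov_iter iter;
		int rw = req->cmd->rw.opcode == nvme_cmd_write ?
				WRITE : READ;

		iov_iter_bvec(&iter, ITER_BVEC | rw, req->f.bvec,
				nr_segs, count);

		iocb->ki_pos = pos;
		iocb->ki_filp = req->ns->file;
		/* only async I/O gets a completion callback */
		if (!req->f.is_sync)
			iocb->ki_complete = nvmet_file_io_done;

		if (rw == WRITE)
			return call_write_iter(req->ns->file, iocb, &iter);
		return call_read_iter(req->ns->file, iocb, &iter);
	}

With no ki_complete set the file system completes the I/O before
returning, so the caller can just check the return value in the sync
case instead of waiting on a completion.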
>  /* Helper Macros when NVMe error is NVME_SC_CONNECT_INVALID_PARAM
>   * The 16 bit shift is to set IATTR bit to 1, which means offending
> @@ -43,6 +45,7 @@ struct nvmet_ns {
>  	struct list_head	dev_link;
>  	struct percpu_ref	ref;
>  	struct block_device	*bdev;
> +	struct file		*file;
>  	u32			nsid;
>  	u32			blksize_shift;
>  	loff_t			size;
> @@ -57,6 +60,8 @@ struct nvmet_ns {
>  	struct config_group	group;
>  
>  	struct completion	disable_done;
> +	mempool_t		*mempool;
> +	struct kmem_cache	*cachep;

I'd name these bvec_pool and bvec_cache.