From mboxrd@z Thu Jan 1 00:00:00 1970
From: axboe@fb.com (Jens Axboe)
Date: Thu, 22 Jan 2015 11:59:01 -0700
Subject: [PATCH] NVMe: avoid kmalloc/kfree for smaller IO
In-Reply-To:
References: <20150122043805.GA12546@kernel.dk>
Message-ID: <54C14875.2040806@fb.com>

On 01/22/2015 10:26 AM, Keith Busch wrote:
> On Wed, 21 Jan 2015, Jens Axboe wrote:
>> Currently we allocate an nvme_iod for each IO, which holds the
>> sg list, prps, and other IO related info. Set a threshold of
>> 2 pages and/or 8KB of data, below which we can just embed this
>> in the per-command pdu in blk-mq. For any IO at or below
>> NVME_INT_PAGES and NVME_INT_BYTES, we save a kmalloc and kfree.
>>
>> For higher IOPS, this saves up to 1% of CPU time.
>>
>> Signed-off-by: Jens Axboe
>>
>> ----
>
>> +/*
>> + * Max size of iod being embedded in the request payload
>> + */
>> +#define NVME_INT_PAGES 2
>> +#define NVME_INT_BYTES (NVME_INT_PAGES * PAGE_CACHE_SIZE)
>
> I think the above needs to use what the device thinks a page size,
> right? If there's a mismatched host-device page size, nvme_setup_prps
> could end up accessing a non-existent prp list.
>
> #define NVME_INT_BYTES(dev) (NVME_INT_PAGES * dev->page_size)

Good point, I missed that aspect of it. I'll make that change and repost.

--
Jens Axboe