From mboxrd@z Thu Jan 1 00:00:00 1970
From: willy@linux.intel.com (Matthew Wilcox)
Date: Fri, 17 May 2013 08:57:00 -0400
Subject: [PATCH] Call nvme_process_cq from submit path
Message-ID: <20130517125700.GN6057@linux.intel.com>

There are a couple of good reasons to process the CQ from the submit path.
One is that if the SQ is full, processing the CQ may free up some slots
in the SQ.  Another is that, if the interrupt mitigation is configured
correctly, we may be able to avoid receiving an interrupt if we process
the completion before the timer fires.

Processing an empty CQ should be cheap; it's two loads from the nvmeq
data structure, a 64-bit load from the CQ and three compares.  I'd be
intrigued to see if anyone can measure a performance decrease from doing
this with any workload.

NB: this isn't the patch I'd be committing; I'll move nvme_process_cq
above nvme_make_request, but that move makes this look like a Big Deal
when at its heart, it's a one-line patch.

diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 8efdfaa..f55d666 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -708,6 +708,8 @@ static int nvme_submit_bio_queue(struct nvme_queue *nvmeq, struct nvme_ns *ns,
 	return result;
 }
 
+static irqreturn_t nvme_process_cq(struct nvme_queue *nvmeq);
+
 static void nvme_make_request(struct request_queue *q, struct bio *bio)
 {
 	struct nvme_ns *ns = q->queuedata;
@@ -722,7 +724,7 @@ static void nvme_make_request(struct request_queue *q, struct bio *bio)
 		add_wait_queue(&nvmeq->sq_full, &nvmeq->sq_cong_wait);
 		bio_list_add(&nvmeq->sq_cong, bio);
 	}
-
+	nvme_process_cq(nvmeq);
 	spin_unlock_irq(&nvmeq->q_lock);
 	put_nvmeq(nvmeq);
 }
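
For readers who want to see where the "two loads, one CQ load, three
compares" estimate for an empty CQ could come from, here is a minimal
user-space sketch of phase-tag polling. It is not the driver's code: the
names model_cqe, model_queue and model_process_cq are hypothetical, the
entry layout is simplified, and it assumes the usual NVMe convention that
bit 0 of the status field is the phase tag, which flips each time the
controller wraps around the ring.

```c
#include <stdint.h>

/* Hypothetical, simplified completion entry; only status matters here. */
struct model_cqe {
	uint32_t result;
	uint16_t sq_head;
	uint16_t command_id;
	uint16_t status;	/* bit 0: phase tag (assumed layout) */
};

/* Hypothetical stand-in for the per-queue state kept by the driver. */
struct model_queue {
	struct model_cqe *cqes;	/* completion ring */
	uint16_t q_depth;	/* number of entries in the ring */
	uint16_t cq_head;	/* next entry to inspect */
	uint16_t cq_phase;	/* phase value that marks a new entry */
};

/* Consume all pending completions; return how many were consumed. */
static int model_process_cq(struct model_queue *q)
{
	uint16_t head = q->cq_head;	/* load 1 from the queue struct */
	uint16_t phase = q->cq_phase;	/* load 2 from the queue struct */
	int handled = 0;

	for (;;) {
		/* One load from the ring itself. */
		struct model_cqe cqe = q->cqes[head];

		/* Compare 1: a stale phase tag means nothing new here. */
		if ((cqe.status & 1) != phase)
			break;

		/* Advance, flipping the expected phase on wrap-around. */
		if (++head == q->q_depth) {
			head = 0;
			phase = !phase;
		}
		handled++;	/* a real driver would complete the command */
	}

	/* Compares 2 and 3: nothing moved, so the queue was empty. */
	if (head == q->cq_head && phase == q->cq_phase)
		return 0;

	q->cq_head = head;
	q->cq_phase = phase;
	return handled;
}
```

In this model an empty queue leaves the function after the first phase
check and the two "did anything move" comparisons, without writing any
state back. Called on a zeroed ring with cq_phase set to 1, it returns 0
and leaves cq_head untouched.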