linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] lightnvm: pblk bug fixes for 4.14
@ 2017-09-06 15:01 Javier González
  2017-09-06 15:01 ` [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc Javier González
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Javier González @ 2017-09-06 15:01 UTC (permalink / raw)
  To: mb, axboe; +Cc: linux-block, linux-kernel, Javier González

Hi Jens,

Here are the pblk bug fixes for this window.

The patches apply on your for-4.14/block, and you can be found at:
  https://github.com/OpenChannelSSD/linux/tree/pblk.for-4.14

Thanks,
Javier

Javier González (6):
  lightnvm: pblk: check for failed mempool alloc.
  lightnvm: pblk: initialize debug stat counter
  lightnvm: pblk: use right flag for GC allocation
  lightnvm: pblk: free padded entries in write buffer
  lightnvm: pblk: fix write I/O sync stat
  lightnvm: pblk: avoid deadlock on low LUN config

 drivers/lightnvm/pblk-core.c  |  5 ++++-
 drivers/lightnvm/pblk-init.c  |  3 ++-
 drivers/lightnvm/pblk-read.c  |  7 +++++--
 drivers/lightnvm/pblk-rl.c    |  6 ++++++
 drivers/lightnvm/pblk-write.c | 16 ++++++++++++----
 drivers/lightnvm/pblk.h       |  2 ++
 6 files changed, 31 insertions(+), 8 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc.
  2017-09-06 15:01 [PATCH 0/6] lightnvm: pblk bug fixes for 4.14 Javier González
@ 2017-09-06 15:01 ` Javier González
  2017-09-06 15:08   ` Johannes Thumshirn
  2017-09-06 15:01 ` [PATCH 2/6] lightnvm: pblk: initialize debug stat counter Javier González
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Javier González @ 2017-09-06 15:01 UTC (permalink / raw)
  To: mb, axboe
  Cc: linux-block, linux-kernel, Javier González,
	Matias Bjørling

Check for failed mempool allocations and act accordingly.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
---
 drivers/lightnvm/pblk-core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index 81501644fb15..acb07bbcb416 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -165,6 +165,8 @@ struct nvm_rq *pblk_alloc_rqd(struct pblk *pblk, int rw)
 	}
 
 	rqd = mempool_alloc(pool, GFP_KERNEL);
+	if (!rqd)
+		return NULL;
 	memset(rqd, 0, rq_size);
 
 	return rqd;
@@ -1478,6 +1480,8 @@ int pblk_blk_erase_async(struct pblk *pblk, struct ppa_addr ppa)
 	int err;
 
 	rqd = mempool_alloc(pblk->g_rq_pool, GFP_KERNEL);
+	if (!rqd)
+		return -ENOMEM;
 	memset(rqd, 0, pblk_g_rq_size);
 
 	pblk_setup_e_rq(pblk, rqd, ppa);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/6] lightnvm: pblk: initialize debug stat counter
  2017-09-06 15:01 [PATCH 0/6] lightnvm: pblk bug fixes for 4.14 Javier González
  2017-09-06 15:01 ` [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc Javier González
@ 2017-09-06 15:01 ` Javier González
  2017-09-06 15:15   ` Johannes Thumshirn
  2017-09-06 15:01 ` [PATCH 3/6] lightnvm: pblk: use right flag for GC allocation Javier González
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Javier González @ 2017-09-06 15:01 UTC (permalink / raw)
  To: mb, axboe
  Cc: linux-block, linux-kernel, Javier González,
	Matias Bjørling

Initialize the stat counter for garbage collected reads.

Fixes: a4bd217b43268 ("lightnvm: physical block device (pblk) target")
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
---
 drivers/lightnvm/pblk-init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index 1b0f61233c21..8459434edd89 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -944,6 +944,7 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
 	atomic_long_set(&pblk->recov_writes, 0);
 	atomic_long_set(&pblk->recov_writes, 0);
 	atomic_long_set(&pblk->recov_gc_writes, 0);
+	atomic_long_set(&pblk->recov_gc_reads, 0);
 #endif
 
 	atomic_long_set(&pblk->read_failed, 0);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/6] lightnvm: pblk: use right flag for GC allocation
  2017-09-06 15:01 [PATCH 0/6] lightnvm: pblk bug fixes for 4.14 Javier González
  2017-09-06 15:01 ` [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc Javier González
  2017-09-06 15:01 ` [PATCH 2/6] lightnvm: pblk: initialize debug stat counter Javier González
@ 2017-09-06 15:01 ` Javier González
  2017-09-06 15:19   ` Johannes Thumshirn
  2017-09-06 15:01 ` [PATCH 4/6] lightnvm: pblk: free padded entries in write buffer Javier González
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Javier González @ 2017-09-06 15:01 UTC (permalink / raw)
  To: mb, axboe
  Cc: linux-block, linux-kernel, Javier González,
	Matias Bjørling

The data buffer for the GC path allocates virtual memory through
vmalloc. When this change was introduced, a flag signaling kmalloc'ed
memory was wrongly introduced. Use the right flag when creating a bio
from this buffer.

Fixes: de54e703a422 ("lightnvm: pblk: use vmalloc for GC data buffer")
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
---
 drivers/lightnvm/pblk-read.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
index d682e89e6493..ee8efb55b330 100644
--- a/drivers/lightnvm/pblk-read.c
+++ b/drivers/lightnvm/pblk-read.c
@@ -499,7 +499,7 @@ int pblk_submit_read_gc(struct pblk *pblk, u64 *lba_list, void *data,
 
 	data_len = (*secs_to_gc) * geo->sec_size;
 	bio = pblk_bio_map_addr(pblk, data, *secs_to_gc, data_len,
-						PBLK_KMALLOC_META, GFP_KERNEL);
+						PBLK_VMALLOC_META, GFP_KERNEL);
 	if (IS_ERR(bio)) {
 		pr_err("pblk: could not allocate GC bio (%lu)\n", PTR_ERR(bio));
 		goto err_free_dma;
@@ -519,7 +519,7 @@ int pblk_submit_read_gc(struct pblk *pblk, u64 *lba_list, void *data,
 	if (ret) {
 		bio_endio(bio);
 		pr_err("pblk: GC read request failed\n");
-		goto err_free_dma;
+		goto err_free_bio;
 	}
 
 	if (!wait_for_completion_io_timeout(&wait,
@@ -541,10 +541,13 @@ int pblk_submit_read_gc(struct pblk *pblk, u64 *lba_list, void *data,
 	atomic_long_sub(*secs_to_gc, &pblk->inflight_reads);
 #endif
 
+	bio_put(bio);
 out:
 	nvm_dev_dma_free(dev->parent, rqd.meta_list, rqd.dma_meta_list);
 	return NVM_IO_OK;
 
+err_free_bio:
+	bio_put(bio);
 err_free_dma:
 	nvm_dev_dma_free(dev->parent, rqd.meta_list, rqd.dma_meta_list);
 	return NVM_IO_ERR;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/6] lightnvm: pblk: free padded entries in write buffer
  2017-09-06 15:01 [PATCH 0/6] lightnvm: pblk bug fixes for 4.14 Javier González
                   ` (2 preceding siblings ...)
  2017-09-06 15:01 ` [PATCH 3/6] lightnvm: pblk: use right flag for GC allocation Javier González
@ 2017-09-06 15:01 ` Javier González
  2017-09-06 15:22   ` Johannes Thumshirn
  2017-09-06 15:01 ` [PATCH 5/6] lightnvm: pblk: fix write I/O sync stat Javier González
  2017-09-06 15:01 ` [PATCH 6/6] lightnvm: pblk: avoid deadlock on low LUN config Javier González
  5 siblings, 1 reply; 19+ messages in thread
From: Javier González @ 2017-09-06 15:01 UTC (permalink / raw)
  To: mb, axboe
  Cc: linux-block, linux-kernel, Javier González,
	Matias Bjørling

When a REQ_FLUSH reaches pblk, the bio cannot be directly completed.
Instead, data on the write buffer is flushed and the bio is completed on
the completion pah. This might require some sectors to be padded in
order to guarantee a successful write.

This patch fixes a memory leak on the padded pages. A consequence of
this bad free was that internal bios not containing data (only a flush)
were not being completed.

Fixes: a4bd217b4326 ("lightnvm: physical block device (pblk) target")
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
---
 drivers/lightnvm/pblk-core.c  |  1 -
 drivers/lightnvm/pblk-write.c | 14 +++++++++++---
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index acb07bbcb416..84bff271abd8 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -192,7 +192,6 @@ void pblk_bio_free_pages(struct pblk *pblk, struct bio *bio, int off,
 
 	WARN_ON(off + nr_pages != bio->bi_vcnt);
 
-	bio_advance(bio, off * PBLK_EXPOSED_PAGE_SIZE);
 	for (i = off; i < nr_pages + off; i++) {
 		bv = bio->bi_io_vec[i];
 		mempool_free(bv.bv_page, pblk->page_pool);
diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
index 3ad9e56d2473..6acb4a92435f 100644
--- a/drivers/lightnvm/pblk-write.c
+++ b/drivers/lightnvm/pblk-write.c
@@ -25,13 +25,20 @@ static unsigned long pblk_end_w_bio(struct pblk *pblk, struct nvm_rq *rqd,
 	unsigned long ret;
 	int i;
 
-	for (i = 0; i < c_ctx->nr_valid; i++) {
+	i = 0;
+	do {
 		struct pblk_w_ctx *w_ctx;
 
 		w_ctx = pblk_rb_w_ctx(&pblk->rwb, c_ctx->sentry + i);
 		while ((original_bio = bio_list_pop(&w_ctx->bios)))
 			bio_endio(original_bio);
-	}
+
+		i++;
+	} while (i < c_ctx->nr_valid);
+
+	if (c_ctx->nr_padded)
+		pblk_bio_free_pages(pblk, rqd->bio, c_ctx->nr_valid,
+							c_ctx->nr_padded);
 
 #ifdef CONFIG_NVM_DEBUG
 	atomic_long_add(c_ctx->nr_valid, &pblk->sync_writes);
@@ -521,7 +528,8 @@ static void pblk_free_write_rqd(struct pblk *pblk, struct nvm_rq *rqd)
 	struct bio *bio = rqd->bio;
 
 	if (c_ctx->nr_padded)
-		pblk_bio_free_pages(pblk, bio, rqd->nr_ppas, c_ctx->nr_padded);
+		pblk_bio_free_pages(pblk, bio, c_ctx->nr_valid,
+							c_ctx->nr_padded);
 }
 
 static int pblk_submit_write(struct pblk *pblk)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/6] lightnvm: pblk: fix write I/O sync stat
  2017-09-06 15:01 [PATCH 0/6] lightnvm: pblk bug fixes for 4.14 Javier González
                   ` (3 preceding siblings ...)
  2017-09-06 15:01 ` [PATCH 4/6] lightnvm: pblk: free padded entries in write buffer Javier González
@ 2017-09-06 15:01 ` Javier González
  2017-09-06 15:23   ` Johannes Thumshirn
  2017-09-06 15:01 ` [PATCH 6/6] lightnvm: pblk: avoid deadlock on low LUN config Javier González
  5 siblings, 1 reply; 19+ messages in thread
From: Javier González @ 2017-09-06 15:01 UTC (permalink / raw)
  To: mb, axboe
  Cc: linux-block, linux-kernel, Javier González,
	Matias Bjørling

Fix stat counter to collect the right number of I/Os being synced on the
completion path.

Fixes: 0880a9aa2d91f ("lightnvm: pblk: delete redundant buffer pointer")
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
---
 drivers/lightnvm/pblk-write.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
index 6acb4a92435f..b48d52b2ae38 100644
--- a/drivers/lightnvm/pblk-write.c
+++ b/drivers/lightnvm/pblk-write.c
@@ -41,7 +41,7 @@ static unsigned long pblk_end_w_bio(struct pblk *pblk, struct nvm_rq *rqd,
 							c_ctx->nr_padded);
 
 #ifdef CONFIG_NVM_DEBUG
-	atomic_long_add(c_ctx->nr_valid, &pblk->sync_writes);
+	atomic_long_add(rqd->nr_ppas, &pblk->sync_writes);
 #endif
 
 	ret = pblk_rb_sync_advance(&pblk->rwb, c_ctx->nr_valid);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 6/6] lightnvm: pblk: avoid deadlock on low LUN config
  2017-09-06 15:01 [PATCH 0/6] lightnvm: pblk bug fixes for 4.14 Javier González
                   ` (4 preceding siblings ...)
  2017-09-06 15:01 ` [PATCH 5/6] lightnvm: pblk: fix write I/O sync stat Javier González
@ 2017-09-06 15:01 ` Javier González
  5 siblings, 0 replies; 19+ messages in thread
From: Javier González @ 2017-09-06 15:01 UTC (permalink / raw)
  To: mb, axboe
  Cc: linux-block, linux-kernel, Javier González,
	Matias Bjørling

On low LUN configurations, make sure not to send bios that are bigger
than the buffer size.

Fixes: a4bd217b4326 ("lightnvm: physical block device (pblk) target")
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <matias@cnexlabs.com>
---
 drivers/lightnvm/pblk-init.c | 2 +-
 drivers/lightnvm/pblk-rl.c   | 6 ++++++
 drivers/lightnvm/pblk.h      | 2 ++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index 8459434edd89..12c05aebac16 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -46,7 +46,7 @@ static int pblk_rw_io(struct request_queue *q, struct pblk *pblk,
 	 * user I/Os. Unless stalled, the rate limiter leaves at least 256KB
 	 * available for user I/O.
 	 */
-	if (unlikely(pblk_get_secs(bio) >= pblk_rl_sysfs_rate_show(&pblk->rl)))
+	if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
 		blk_queue_split(q, &bio);
 
 	return pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
index 2e6a5361baf0..596bdec433c3 100644
--- a/drivers/lightnvm/pblk-rl.c
+++ b/drivers/lightnvm/pblk-rl.c
@@ -178,6 +178,11 @@ int pblk_rl_sysfs_rate_show(struct pblk_rl *rl)
 	return rl->rb_user_max;
 }
 
+int pblk_rl_max_io(struct pblk_rl *rl)
+{
+	return rl->rb_max_io;
+}
+
 static void pblk_rl_u_timer(unsigned long data)
 {
 	struct pblk_rl *rl = (struct pblk_rl *)data;
@@ -214,6 +219,7 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
 	/* To start with, all buffer is available to user I/O writers */
 	rl->rb_budget = budget;
 	rl->rb_user_max = budget;
+	rl->rb_max_io = budget >> 1;
 	rl->rb_gc_max = 0;
 	rl->rb_state = PBLK_RL_HIGH;
 
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 67e623bd5c2d..a2753f9da830 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -267,6 +267,7 @@ struct pblk_rl {
 	int rb_gc_max;		/* Max buffer entries available for GC I/O */
 	int rb_gc_rsv;		/* Reserved buffer entries for GC I/O */
 	int rb_state;		/* Rate-limiter current state */
+	int rb_max_io;		/* Maximum size for an I/O giving the config */
 
 	atomic_t rb_user_cnt;	/* User I/O buffer counter */
 	atomic_t rb_gc_cnt;	/* GC I/O buffer counter */
@@ -844,6 +845,7 @@ int pblk_rl_gc_may_insert(struct pblk_rl *rl, int nr_entries);
 void pblk_rl_gc_in(struct pblk_rl *rl, int nr_entries);
 void pblk_rl_out(struct pblk_rl *rl, int nr_user, int nr_gc);
 int pblk_rl_sysfs_rate_show(struct pblk_rl *rl);
+int pblk_rl_max_io(struct pblk_rl *rl);
 void pblk_rl_free_lines_inc(struct pblk_rl *rl, struct pblk_line *line);
 void pblk_rl_free_lines_dec(struct pblk_rl *rl, struct pblk_line *line);
 void pblk_rl_set_space_limit(struct pblk_rl *rl, int entries_left);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc.
  2017-09-06 15:01 ` [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc Javier González
@ 2017-09-06 15:08   ` Johannes Thumshirn
  2017-09-06 15:09     ` Jens Axboe
  0 siblings, 1 reply; 19+ messages in thread
From: Johannes Thumshirn @ 2017-09-06 15:08 UTC (permalink / raw)
  To: Javier González
  Cc: mb, axboe, linux-block, linux-kernel, Javier González,
	Matias Bjørling

On Wed, Sep 06, 2017 at 05:01:01PM +0200, Javier Gonz�lez wrote:
> Check for failed mempool allocations and act accordingly.

Are you sure it is needed? Quoting from mempool_alloc()s Documentation:
"[...] Note that due to preallocation, this function *never* fails when called
from process contexts. (it might fail if called from an IRQ context.) [...]"


Byte,
	Johannes

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc.
  2017-09-06 15:08   ` Johannes Thumshirn
@ 2017-09-06 15:09     ` Jens Axboe
  2017-09-06 15:12       ` Javier González
  2017-09-06 15:12       ` Johannes Thumshirn
  0 siblings, 2 replies; 19+ messages in thread
From: Jens Axboe @ 2017-09-06 15:09 UTC (permalink / raw)
  To: Johannes Thumshirn, Javier González
  Cc: mb, linux-block, linux-kernel, Javier González,
	Matias Bjørling

On 09/06/2017 09:08 AM, Johannes Thumshirn wrote:
> On Wed, Sep 06, 2017 at 05:01:01PM +0200, Javier González wrote:
>> Check for failed mempool allocations and act accordingly.
> 
> Are you sure it is needed? Quoting from mempool_alloc()s Documentation:
> "[...] Note that due to preallocation, this function *never* fails when called
> from process contexts. (it might fail if called from an IRQ context.) [...]"

It's not needed, mempool() will never fail if __GFP_WAIT is set in the
mask. The use case here is GFP_KERNEL, which does include __GFP_WAIT.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc.
  2017-09-06 15:09     ` Jens Axboe
@ 2017-09-06 15:12       ` Javier González
  2017-09-06 15:13         ` Jens Axboe
  2017-09-06 15:12       ` Johannes Thumshirn
  1 sibling, 1 reply; 19+ messages in thread
From: Javier González @ 2017-09-06 15:12 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Johannes Thumshirn, Matias Bjørling, linux-block,
	linux-kernel, Matias Bjørling

[-- Attachment #1: Type: text/plain, Size: 844 bytes --]

> On 6 Sep 2017, at 17.09, Jens Axboe <axboe@kernel.dk> wrote:
> 
> On 09/06/2017 09:08 AM, Johannes Thumshirn wrote:
>> On Wed, Sep 06, 2017 at 05:01:01PM +0200, Javier González wrote:
>>> Check for failed mempool allocations and act accordingly.
>> 
>> Are you sure it is needed? Quoting from mempool_alloc()s Documentation:
>> "[...] Note that due to preallocation, this function *never* fails when called
>> from process contexts. (it might fail if called from an IRQ context.) [...]"
> 
> It's not needed, mempool() will never fail if __GFP_WAIT is set in the
> mask. The use case here is GFP_KERNEL, which does include __GFP_WAIT.

Thanks for the clarification. Do you just drop the patch, or do you want
me to re-send the series?

I see that I do this other places, I'll clean it up for next window.

Thanks,
Javier

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc.
  2017-09-06 15:09     ` Jens Axboe
  2017-09-06 15:12       ` Javier González
@ 2017-09-06 15:12       ` Johannes Thumshirn
  1 sibling, 0 replies; 19+ messages in thread
From: Johannes Thumshirn @ 2017-09-06 15:12 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Javier González, mb, linux-block, linux-kernel,
	Javier González, Matias Bjørling

On Wed, Sep 06, 2017 at 09:09:29AM -0600, Jens Axboe wrote:
> On 09/06/2017 09:08 AM, Johannes Thumshirn wrote:
> > On Wed, Sep 06, 2017 at 05:01:01PM +0200, Javier Gonz�lez wrote:
> >> Check for failed mempool allocations and act accordingly.
> > 
> > Are you sure it is needed? Quoting from mempool_alloc()s Documentation:
> > "[...] Note that due to preallocation, this function *never* fails when called
> > from process contexts. (it might fail if called from an IRQ context.) [...]"
> 
> It's not needed, mempool() will never fail if __GFP_WAIT is set in the
> mask. The use case here is GFP_KERNEL, which does include __GFP_WAIT.

Exactly. Maybe I shouldn't have it phrased as a question though... 

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc.
  2017-09-06 15:12       ` Javier González
@ 2017-09-06 15:13         ` Jens Axboe
  2017-09-06 15:14           ` Javier González
  2017-09-06 15:20           ` Jens Axboe
  0 siblings, 2 replies; 19+ messages in thread
From: Jens Axboe @ 2017-09-06 15:13 UTC (permalink / raw)
  To: Javier González
  Cc: Johannes Thumshirn, Matias Bjørling, linux-block,
	linux-kernel, Matias Bjørling

On 09/06/2017 09:12 AM, Javier González wrote:
>> On 6 Sep 2017, at 17.09, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 09/06/2017 09:08 AM, Johannes Thumshirn wrote:
>>> On Wed, Sep 06, 2017 at 05:01:01PM +0200, Javier González wrote:
>>>> Check for failed mempool allocations and act accordingly.
>>>
>>> Are you sure it is needed? Quoting from mempool_alloc()s Documentation:
>>> "[...] Note that due to preallocation, this function *never* fails when called
>>> from process contexts. (it might fail if called from an IRQ context.) [...]"
>>
>> It's not needed, mempool() will never fail if __GFP_WAIT is set in the
>> mask. The use case here is GFP_KERNEL, which does include __GFP_WAIT.
> 
> Thanks for the clarification. Do you just drop the patch, or do you want
> me to re-send the series?

No need to resend. I'll pick up the others in a day or two, once people
have had some time to go over them.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc.
  2017-09-06 15:13         ` Jens Axboe
@ 2017-09-06 15:14           ` Javier González
  2017-09-06 15:20           ` Jens Axboe
  1 sibling, 0 replies; 19+ messages in thread
From: Javier González @ 2017-09-06 15:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Johannes Thumshirn, Matias Bjørling, linux-block,
	linux-kernel, Matias Bjørling

[-- Attachment #1: Type: text/plain, Size: 1094 bytes --]

> On 6 Sep 2017, at 17.13, Jens Axboe <axboe@kernel.dk> wrote:
> 
> On 09/06/2017 09:12 AM, Javier González wrote:
>>> On 6 Sep 2017, at 17.09, Jens Axboe <axboe@kernel.dk> wrote:
>>> 
>>> On 09/06/2017 09:08 AM, Johannes Thumshirn wrote:
>>>> On Wed, Sep 06, 2017 at 05:01:01PM +0200, Javier González wrote:
>>>>> Check for failed mempool allocations and act accordingly.
>>>> 
>>>> Are you sure it is needed? Quoting from mempool_alloc()s Documentation:
>>>> "[...] Note that due to preallocation, this function *never* fails when called
>>>> from process contexts. (it might fail if called from an IRQ context.) [...]"
>>> 
>>> It's not needed, mempool() will never fail if __GFP_WAIT is set in the
>>> mask. The use case here is GFP_KERNEL, which does include __GFP_WAIT.
>> 
>> Thanks for the clarification. Do you just drop the patch, or do you want
>> me to re-send the series?
> 
> No need to resend. I'll pick up the others in a day or two, once people
> have had some time to go over them.
> 

Thanks. And apologies for the delay on the patches...

Javier

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/6] lightnvm: pblk: initialize debug stat counter
  2017-09-06 15:01 ` [PATCH 2/6] lightnvm: pblk: initialize debug stat counter Javier González
@ 2017-09-06 15:15   ` Johannes Thumshirn
  0 siblings, 0 replies; 19+ messages in thread
From: Johannes Thumshirn @ 2017-09-06 15:15 UTC (permalink / raw)
  To: Javier González
  Cc: mb, axboe, linux-block, linux-kernel, Javier González,
	Matias Bjørling

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/6] lightnvm: pblk: use right flag for GC allocation
  2017-09-06 15:01 ` [PATCH 3/6] lightnvm: pblk: use right flag for GC allocation Javier González
@ 2017-09-06 15:19   ` Johannes Thumshirn
  0 siblings, 0 replies; 19+ messages in thread
From: Johannes Thumshirn @ 2017-09-06 15:19 UTC (permalink / raw)
  To: Javier González
  Cc: mb, axboe, linux-block, linux-kernel, Javier González,
	Matias Bjørling

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc.
  2017-09-06 15:13         ` Jens Axboe
  2017-09-06 15:14           ` Javier González
@ 2017-09-06 15:20           ` Jens Axboe
  2017-09-06 18:28             ` Javier González
  1 sibling, 1 reply; 19+ messages in thread
From: Jens Axboe @ 2017-09-06 15:20 UTC (permalink / raw)
  To: Javier González
  Cc: Johannes Thumshirn, Matias Bjørling, linux-block,
	linux-kernel, Matias Bjørling

On 09/06/2017 09:13 AM, Jens Axboe wrote:
> On 09/06/2017 09:12 AM, Javier González wrote:
>>> On 6 Sep 2017, at 17.09, Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> On 09/06/2017 09:08 AM, Johannes Thumshirn wrote:
>>>> On Wed, Sep 06, 2017 at 05:01:01PM +0200, Javier González wrote:
>>>>> Check for failed mempool allocations and act accordingly.
>>>>
>>>> Are you sure it is needed? Quoting from mempool_alloc()s Documentation:
>>>> "[...] Note that due to preallocation, this function *never* fails when called
>>>> from process contexts. (it might fail if called from an IRQ context.) [...]"
>>>
>>> It's not needed, mempool() will never fail if __GFP_WAIT is set in the
>>> mask. The use case here is GFP_KERNEL, which does include __GFP_WAIT.
>>
>> Thanks for the clarification. Do you just drop the patch, or do you want
>> me to re-send the series?
> 
> No need to resend. I'll pick up the others in a day or two, once people
> have had some time to go over them.

I took a quick look at your mempool usage, and I'm not sure it's
correct.  For a mempool to work, you have to ensure that you provide a
forward progress guarantee. With that guarantee, you know that if you do
end up sleeping on allocation, you already have items inflight that will
be freed when that operation completes. In other words, all allocations
must have a defined and finite life time, as any allocation can
potentially sleep/block for that life time. You can't allocate something
and hold on to it forever, then you are violating the terms of agreement
that makes a mempool work.

The first one that caught my eye is pblk->page_pool. You have this loop:

for (i = 0; i < nr_pages; i++) {                                        
        page = mempool_alloc(pblk->page_pool, flags);                   
        if (!page)                                                      
                goto err;                                               
                                                                        
        ret = bio_add_pc_page(q, bio, page, PBLK_EXPOSED_PAGE_SIZE, 0); 
        if (ret != PBLK_EXPOSED_PAGE_SIZE) {                            
                pr_err("pblk: could not add page to bio\n");            
                mempool_free(page, pblk->page_pool);                    
                goto err;                                               
        }                                                               
}          

which looks suspect. This mempool is created with a reserve pool of
PAGE_POOL_SIZE (16) members. Do we know if the bio has 16 pages or less?
If not, then this is broken and can deadlock forever.

You have a lot of mempool usage in the code, would probably not hurt to
audit all of them.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/6] lightnvm: pblk: free padded entries in write buffer
  2017-09-06 15:01 ` [PATCH 4/6] lightnvm: pblk: free padded entries in write buffer Javier González
@ 2017-09-06 15:22   ` Johannes Thumshirn
  0 siblings, 0 replies; 19+ messages in thread
From: Johannes Thumshirn @ 2017-09-06 15:22 UTC (permalink / raw)
  To: Javier González
  Cc: mb, axboe, linux-block, linux-kernel, Javier González,
	Matias Bjørling

On Wed, Sep 06, 2017 at 05:01:04PM +0200, Javier Gonz�lez wrote:
> the completion pah. This might require some sectors to be padded in
                 ^ path

Looks good otherwise,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/6] lightnvm: pblk: fix write I/O sync stat
  2017-09-06 15:01 ` [PATCH 5/6] lightnvm: pblk: fix write I/O sync stat Javier González
@ 2017-09-06 15:23   ` Johannes Thumshirn
  0 siblings, 0 replies; 19+ messages in thread
From: Johannes Thumshirn @ 2017-09-06 15:23 UTC (permalink / raw)
  To: Javier González
  Cc: mb, axboe, linux-block, linux-kernel, Javier González,
	Matias Bjørling


Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc.
  2017-09-06 15:20           ` Jens Axboe
@ 2017-09-06 18:28             ` Javier González
  0 siblings, 0 replies; 19+ messages in thread
From: Javier González @ 2017-09-06 18:28 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Johannes Thumshirn, Matias Bjørling, linux-block,
	Linux Kernel Mailing List, Matias Bjørling

[-- Attachment #1: Type: text/plain, Size: 3573 bytes --]

> On 6 Sep 2017, at 17.20, Jens Axboe <axboe@kernel.dk> wrote:
> 
> On 09/06/2017 09:13 AM, Jens Axboe wrote:
>> On 09/06/2017 09:12 AM, Javier González wrote:
>>>> On 6 Sep 2017, at 17.09, Jens Axboe <axboe@kernel.dk> wrote:
>>>> 
>>>> On 09/06/2017 09:08 AM, Johannes Thumshirn wrote:
>>>>> On Wed, Sep 06, 2017 at 05:01:01PM +0200, Javier González wrote:
>>>>>> Check for failed mempool allocations and act accordingly.
>>>>> 
>>>>> Are you sure it is needed? Quoting from mempool_alloc()s Documentation:
>>>>> "[...] Note that due to preallocation, this function *never* fails when called
>>>>> from process contexts. (it might fail if called from an IRQ context.) [...]"
>>>> 
>>>> It's not needed, mempool() will never fail if __GFP_WAIT is set in the
>>>> mask. The use case here is GFP_KERNEL, which does include __GFP_WAIT.
>>> 
>>> Thanks for the clarification. Do you just drop the patch, or do you want
>>> me to re-send the series?
>> 
>> No need to resend. I'll pick up the others in a day or two, once people
>> have had some time to go over them.
> 
> I took a quick look at your mempool usage, and I'm not sure it's
> correct.  For a mempool to work, you have to ensure that you provide a
> forward progress guarantee. With that guarantee, you know that if you do
> end up sleeping on allocation, you already have items inflight that will
> be freed when that operation completes. In other words, all allocations
> must have a defined and finite life time, as any allocation can
> potentially sleep/block for that life time. You can't allocate something
> and hold on to it forever, then you are violating the terms of agreement
> that makes a mempool work.

I understood the part of guaranteeing the number of inflight items to
keep the mempool active without waiting, but I must admit that I assumed
that the mempool would resize when getting pressure and that the penalty
would be increased latency, not the mempool giving up and causing a
deadlock.

> 
> The first one that caught my eye is pblk->page_pool. You have this loop:
> 
> for (i = 0; i < nr_pages; i++) {
>        page = mempool_alloc(pblk->page_pool, flags);
>        if (!page)
>                goto err;
> 
>        ret = bio_add_pc_page(q, bio, page, PBLK_EXPOSED_PAGE_SIZE, 0);
>        if (ret != PBLK_EXPOSED_PAGE_SIZE) {
>                pr_err("pblk: could not add page to bio\n");
>                mempool_free(page, pblk->page_pool);
>                goto err;
>        }
> }
> 
> which looks suspect. This mempool is created with a reserve pool of
> PAGE_POOL_SIZE (16) members. Do we know if the bio has 16 pages or less?
> If not, then this is broken and can deadlock forever.

I can see that in this case, the 16 elements do not hold. In the read
path, we can guarantee that a read will be <= 64 sectors (4KB pages), so
this is definitely wrong. I'll fix it tomorrow.

Since we are at it, I have for some time wondered what's the right way
to balance the mempool sizes so that we are a good citizen and perform
at the same time. I don't see a lot of code using mempool_resize to tune
the min_nr based on load. For example, in our write path, the numbers
are easy to calculate, but on the read path I completely
over-dimensioned the mempool to ensure not having to wait for the
completion path. Any good rule of thumb here?

> You have a lot of mempool usage in the code, would probably not hurt to
> audit all of them.

Yes. I will take a look and add comments to the sizes.

Thanks Jens,
Javier

[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-09-06 18:28 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-09-06 15:01 [PATCH 0/6] lightnvm: pblk bug fixes for 4.14 Javier González
2017-09-06 15:01 ` [PATCH 1/6] lightnvm: pblk: check for failed mempool alloc Javier González
2017-09-06 15:08   ` Johannes Thumshirn
2017-09-06 15:09     ` Jens Axboe
2017-09-06 15:12       ` Javier González
2017-09-06 15:13         ` Jens Axboe
2017-09-06 15:14           ` Javier González
2017-09-06 15:20           ` Jens Axboe
2017-09-06 18:28             ` Javier González
2017-09-06 15:12       ` Johannes Thumshirn
2017-09-06 15:01 ` [PATCH 2/6] lightnvm: pblk: initialize debug stat counter Javier González
2017-09-06 15:15   ` Johannes Thumshirn
2017-09-06 15:01 ` [PATCH 3/6] lightnvm: pblk: use right flag for GC allocation Javier González
2017-09-06 15:19   ` Johannes Thumshirn
2017-09-06 15:01 ` [PATCH 4/6] lightnvm: pblk: free padded entries in write buffer Javier González
2017-09-06 15:22   ` Johannes Thumshirn
2017-09-06 15:01 ` [PATCH 5/6] lightnvm: pblk: fix write I/O sync stat Javier González
2017-09-06 15:23   ` Johannes Thumshirn
2017-09-06 15:01 ` [PATCH 6/6] lightnvm: pblk: avoid deadlock on low LUN config Javier González

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).