Reliable write support.

public inbox for linux-mmc@vger.kernel.org
 help / color / mirror / Atom feed

* Reliable write support.
@ 2011-03-24 21:22 Andrei Warkentin
  2011-03-24 21:22 ` [comments] MMC: " Andrei Warkentin
  0 siblings, 1 reply; 14+ messages in thread
From: Andrei Warkentin @ 2011-03-24 21:22 UTC (permalink / raw)
  To: linux-mmc

Hi group,

As part of enabling maximum-reliability (at cost of performance) workarounds for MMC media whose
controllers employ a scheme, where there is usually a small (flash-page sized) and a large (erase-block sized) buffer,
I needed reliable write support. As per Arnd's feedback, I've split the reliable write support out by itself. I believe
other people were going to look at this as well (https://wiki.linaro.org/WorkingGroups/KernelConsolidation/Specs/StoragePerfEMMC).

The patch provides a mechanism by which reliable writes can be enabled if they are supported, and enables them for REQ_FUA/REQ_META write transfers.
REQ_FUA/REQ_FLUSH is a replacement for barrier bios, which were broken before anyway (couldn't handle partial completions, which are what's
going to happen for legacy reliable write support). REQ_FUA is a flag a bio submitted from the filesystem and will make sure that I/O completion
for this request is only signaled after the data has been committed to non-volatile storage. REQ_META are supposed to be used by filesystems to
mark bios that contain metadata.

The patch is against linux-next and was tested with a Toshiba MMC08G (legacy reliable write support) eMMC on x64.

ToC:
[comments] MMC: Reliable write support.

Thanks ahead of time,
A

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [comments] MMC: Reliable write support.
  2011-03-24 21:22 Reliable write support Andrei Warkentin
@ 2011-03-24 21:22 ` Andrei Warkentin
  2011-03-25 15:14   ` Arnd Bergmann
  0 siblings, 1 reply; 14+ messages in thread
From: Andrei Warkentin @ 2011-03-24 21:22 UTC (permalink / raw)
  To: linux-mmc; +Cc: Andrei Warkentin

This is a request-for-comments patch. Please provide your feedback.

Allows reliable writes to be used for MMC writes. Reliable writes are used
to service write REQ_FUA/REQ_META requests. Handles both the legacy and the enhanced
reliable write support in MMC cards.

Beyond REQ_FUA/REQ_META, this was meant to be used by a following patch that aimed
to reduce write amplification issues in cards employing a small (usually flash page-sized)
buffer and a large (usually erase-block sized) buffer, at the expense of performance.

Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
---
 drivers/mmc/card/block.c |   85 +++++++++++++++++++++++++++++++++++++++++++--
 drivers/mmc/core/mmc.c   |    5 +++
 include/linux/mmc/card.h |    2 +
 include/linux/mmc/mmc.h  |    4 ++
 4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 61d233a..712fe96 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -48,6 +48,10 @@ MODULE_ALIAS("mmc:block");
 #endif
 #define MODULE_PARAM_PREFIX "mmcblk."
 
+#define REL_WRITES_SUPPORTED(card) (mmc_card_mmc((card)) &&	\
+    (((card)->ext_csd.rel_param & EXT_CSD_WR_REL_PARAM_EN) ||	\
+     ((card)->ext_csd.rel_sectors)))
+
 static DEFINE_MUTEX(block_mutex);
 
 /*
@@ -331,6 +335,59 @@ out:
 	return err ? 0 : 1;
 }
 
+static int mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_blk_data *md = mq->data;
+
+	/*
+	   No-op, only service this because we need REQ_FUA
+	   for reliable writes.
+	*/
+	spin_lock_irq(&md->lock);
+	__blk_end_request_all(req, 0);
+	spin_unlock_irq(&md->lock);
+
+	return 1;
+}
+
+/*
+ * Reformat current write as a reliable write, supporting
+ * both legacy and the enhanced reliable write MMC cards.
+ * In each transfer we'll handle only as much as a single
+ * reliable write can handle, thus finish the request in
+ * partial completions.
+ */
+static inline int mmc_apply_rel_rw(struct mmc_blk_request *brq,
+				   struct mmc_card *card,
+				   struct request *req)
+{
+	int err;
+	struct mmc_command set_count;
+
+	if (!(card->ext_csd.rel_param &
+	      EXT_CSD_WR_REL_PARAM_EN)) {
+
+		/* Legacy mode imposes restrictions on transfers. */
+		if (!IS_ALIGNED(brq->cmd.arg, card->ext_csd.rel_sectors))
+			brq->data.blocks = 1;
+
+		if (brq->data.blocks > card->ext_csd.rel_sectors)
+			brq->data.blocks = card->ext_csd.rel_sectors;
+		else if (brq->data.blocks != card->ext_csd.rel_sectors)
+			brq->data.blocks = 1;
+	}
+
+	memset(&set_count, 0, sizeof(struct mmc_command));
+	set_count.opcode = MMC_SET_BLOCK_COUNT;
+	set_count.arg = brq->data.blocks | (1 << 31);
+	set_count.flags = MMC_RSP_R1 | MMC_CMD_AC;
+	err = mmc_wait_for_cmd(card->host, &set_count, 0);
+	if (err)
+		printk(KERN_ERR "%s: error %d SET_BLOCK_COUNT\n",
+		       req->rq_disk->disk_name, err);
+	return err;
+}
+
 static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_blk_data *md = mq->data;
@@ -338,6 +395,15 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 	struct mmc_blk_request brq;
 	int ret = 1, disable_multi = 0;
 
+	/*
+	  Reliable writes are used to implement Forced Unit Access and
+	  REQ_META accesses, and it's supported only on MMCs.
+	 */
+	bool do_rel_wr = ((req->cmd_flags & REQ_FUA) ||
+			  (req->cmd_flags & REQ_META)) &&
+		(rq_data_dir(req) == WRITE) &&
+		REL_WRITES_SUPPORTED(card);
+
 	mmc_claim_host(card->host);
 
 	do {
@@ -374,12 +440,14 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		if (disable_multi && brq.data.blocks > 1)
 			brq.data.blocks = 1;
 
-		if (brq.data.blocks > 1) {
+		if (brq.data.blocks > 1 || do_rel_wr) {
 			/* SPI multiblock writes terminate using a special
-			 * token, not a STOP_TRANSMISSION request.
+			 * token, not a STOP_TRANSMISSION request. Reliable
+			 * writes use SET_BLOCK_COUNT and do not use a
+			 * STOP_TRANSMISSION request either.
 			 */
-			if (!mmc_host_is_spi(card->host)
-					|| rq_data_dir(req) == READ)
+			if ((!mmc_host_is_spi(card->host) && !do_rel_wr) ||
+			    rq_data_dir(req) == READ)
 				brq.mrq.stop = &brq.stop;
 			readcmd = MMC_READ_MULTIPLE_BLOCK;
 			writecmd = MMC_WRITE_MULTIPLE_BLOCK;
@@ -396,6 +464,11 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 			brq.data.flags |= MMC_DATA_WRITE;
 		}
 
+		if (do_rel_wr) {
+			if (mmc_apply_rel_rw(&brq, card, req))
+				goto cmd_err;
+		}
+
 		mmc_set_data_timeout(&brq.data, card);
 
 		brq.data.sg = mq->sg;
@@ -565,6 +638,8 @@ static int mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 			return mmc_blk_issue_secdiscard_rq(mq, req);
 		else
 			return mmc_blk_issue_discard_rq(mq, req);
+	} else if (req->cmd_flags & REQ_FLUSH) {
+		return mmc_blk_issue_flush(mq, req);
 	} else {
 		return mmc_blk_issue_rw_rq(mq, req);
 	}
@@ -622,6 +697,8 @@ static struct mmc_blk_data *mmc_blk_alloc(struct mmc_card *card)
 	md->disk->queue = md->queue.queue;
 	md->disk->driverfs_dev = &card->dev;
 	set_disk_ro(md->disk, md->read_only);
+	if (REL_WRITES_SUPPORTED(card))
+		blk_queue_flush(md->queue.queue, REQ_FLUSH | REQ_FUA);
 
 	/*
 	 * As discussed on lkml, GENHD_FL_REMOVABLE should:
diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index 14e95f3..1b1e142 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -299,6 +299,8 @@ static int mmc_read_ext_csd(struct mmc_card *card)
 			ext_csd[EXT_CSD_ERASE_TIMEOUT_MULT];
 		card->ext_csd.hc_erase_size =
 			ext_csd[EXT_CSD_HC_ERASE_GRP_SIZE] << 10;
+
+		card->ext_csd.rel_sectors = ext_csd[EXT_CSD_REL_WR_SEC_C];
 	}
 
 	if (card->ext_csd.rev >= 4) {
@@ -350,6 +352,9 @@ static int mmc_read_ext_csd(struct mmc_card *card)
 			ext_csd[EXT_CSD_TRIM_MULT];
 	}
 
+	if (card->ext_csd.rev >= 5)
+		card->ext_csd.rel_param = ext_csd[EXT_CSD_WR_REL_PARAM];
+
 	if (ext_csd[EXT_CSD_ERASED_MEM_CONT])
 		card->erased_byte = 0xFF;
 	else
diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
index adb4888..959e3d8 100644
--- a/include/linux/mmc/card.h
+++ b/include/linux/mmc/card.h
@@ -45,6 +45,8 @@ struct mmc_ext_csd {
 	u8			rev;
 	u8			erase_group_def;
 	u8			sec_feature_support;
+	u8			rel_sectors;
+	u8			rel_param;
 	unsigned int		sa_timeout;		/* Units: 100ns */
 	unsigned int		hs_max_dtr;
 	unsigned int		sectors;
diff --git a/include/linux/mmc/mmc.h b/include/linux/mmc/mmc.h
index 264ba54..44a8157 100644
--- a/include/linux/mmc/mmc.h
+++ b/include/linux/mmc/mmc.h
@@ -255,6 +255,7 @@ struct _mmc_csd {
 
 #define EXT_CSD_PARTITION_ATTRIBUTE	156	/* R/W */
 #define EXT_CSD_PARTITION_SUPPORT	160	/* RO */
+#define EXT_CSD_WR_REL_PARAM		166	/* RO */
 #define EXT_CSD_ERASE_GROUP_DEF		175	/* R/W */
 #define EXT_CSD_ERASED_MEM_CONT		181	/* RO */
 #define EXT_CSD_BUS_WIDTH		183	/* R/W */
@@ -264,6 +265,7 @@ struct _mmc_csd {
 #define EXT_CSD_CARD_TYPE		196	/* RO */
 #define EXT_CSD_SEC_CNT			212	/* RO, 4 bytes */
 #define EXT_CSD_S_A_TIMEOUT		217	/* RO */
+#define EXT_CSD_REL_WR_SEC_C		222	/* RO */
 #define EXT_CSD_HC_WP_GRP_SIZE		221	/* RO */
 #define EXT_CSD_ERASE_TIMEOUT_MULT	223	/* RO */
 #define EXT_CSD_HC_ERASE_GRP_SIZE	224	/* RO */
@@ -276,6 +278,8 @@ struct _mmc_csd {
  * EXT_CSD field definitions
  */
 
+#define EXT_CSD_WR_REL_PARAM_EN		(1<<2)
+
 #define EXT_CSD_CMD_SET_NORMAL		(1<<0)
 #define EXT_CSD_CMD_SET_SECURE		(1<<1)
 #define EXT_CSD_CMD_SET_CPSECURE	(1<<2)
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [comments] MMC: Reliable write support.
  2011-03-24 21:22 ` [comments] MMC: " Andrei Warkentin
@ 2011-03-25 15:14   ` Arnd Bergmann
  2011-03-26  7:17     ` Andrei Warkentin
  2011-03-29  0:50     ` Andrei Warkentin
  0 siblings, 2 replies; 14+ messages in thread
From: Arnd Bergmann @ 2011-03-25 15:14 UTC (permalink / raw)
  To: Andrei Warkentin; +Cc: linux-mmc

On Thursday 24 March 2011, Andrei Warkentin wrote:
> This is a request-for-comments patch. Please provide your feedback.
> 
> Allows reliable writes to be used for MMC writes. Reliable writes are used
> to service write REQ_FUA/REQ_META requests. Handles both the legacy and the enhanced
> reliable write support in MMC cards.
> 
> Beyond REQ_FUA/REQ_META, this was meant to be used by a following patch that aimed
> to reduce write amplification issues in cards employing a small (usually flash page-sized)
> buffer and a large (usually erase-block sized) buffer, at the expense of performance.

Looks good to me, but I don't really understand some of the block layer
specifics here. One question:

> +static int mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
> +{
> +	struct mmc_blk_data *md = mq->data;
> +
> +	/*
> +	   No-op, only service this because we need REQ_FUA
> +	   for reliable writes.
> +	*/
> +	spin_lock_irq(&md->lock);
> +	__blk_end_request_all(req, 0);
> +	spin_unlock_irq(&md->lock);
> +
> +	return 1;
> +}

How does this work when you have a flush that does not directly follow
a REQ_FUA or REQ_META request? I would assume that we still need to
flush in some way, which you don't seem to do here.

	Arnd

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [comments] MMC: Reliable write support.
  2011-03-25 15:14   ` Arnd Bergmann
@ 2011-03-26  7:17     ` Andrei Warkentin
  2011-03-26  7:22       ` Andrei Warkentin
  2011-03-29  0:50     ` Andrei Warkentin
  1 sibling, 1 reply; 14+ messages in thread
From: Andrei Warkentin @ 2011-03-26  7:17 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-mmc

Hi Arnd,

On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday 24 March 2011, Andrei Warkentin wrote:
>>
>> Beyond REQ_FUA/REQ_META, this was meant to be used by a following patch that aimed
>> to reduce write amplification issues in cards employing a small (usually flash page-sized)
>> buffer and a large (usually erase-block sized) buffer, at the expense of performance.
>
> Looks good to me, but I don't really understand some of the block layer
> specifics here. One question:
> How does this work when you have a flush that does not directly follow
> a REQ_FUA or REQ_META request? I would assume that we still need to
> flush in some way, which you don't seem to do here.

In much the same way as before - REQ_FLUSH does nothing as it did
nothing before. The only reason I'm handing REQ_FLUSH, is because you
can't register for REQ_FUA without registering for REQ_FLUSH, which
does make sense - after all, if REQ_FUA is used to ensure data is
committed immediately, then by default it isn't, and you need a flush
operation if you want to commit the data at a particular checkpoint
You are correct in stating that it is a likely issue, but it is an
issue even without this patch.  Unfortunately, there is nothing (that
I saw) in the latest JEDEC MMC specification that says how a card's
internal buffers might be forced to be flushed to non-volatile
storage. Clearly, any recent card which tries to play optimization
games in order to reduce write amplification and flash wearing is
going to have an internal buffer, so the inability to handle REQ_FLUSH
is already a potential issue for these cards.

Looking at the 4.41 JEDEC spec, CMD0, CMD15 and hardware reset are all
non-ways to ensure flush. In fact they will terminate the programming
operation resulting in an unknown state.

I'm working with a flash vendor to see if I missed something or if
there is some way to flush internal buffers, and if that way is
generic enough to apply to all cards.

A

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [comments] MMC: Reliable write support.
  2011-03-26  7:17     ` Andrei Warkentin
@ 2011-03-26  7:22       ` Andrei Warkentin
  0 siblings, 0 replies; 14+ messages in thread
From: Andrei Warkentin @ 2011-03-26  7:22 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-mmc

On Sat, Mar 26, 2011 at 2:17 AM, Andrei Warkentin <andreiw@motorola.com> wrote:
> Hi Arnd,
>
> On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> On Thursday 24 March 2011, Andrei Warkentin wrote:
>>>
>>> Beyond REQ_FUA/REQ_META, this was meant to be used by a following patch that aimed
>>> to reduce write amplification issues in cards employing a small (usually flash page-sized)
>>> buffer and a large (usually erase-block sized) buffer, at the expense of performance.
>>
>> Looks good to me, but I don't really understand some of the block layer
>> specifics here. One question:
>> How does this work when you have a flush that does not directly follow
>> a REQ_FUA or REQ_META request? I would assume that we still need to
>> flush in some way, which you don't seem to do here.
>
> In much the same way as before - REQ_FLUSH does nothing as it did
> nothing before. The only reason I'm handing REQ_FLUSH, is because you
> can't register for REQ_FUA without registering for REQ_FLUSH, which
> does make sense - after all, if REQ_FUA is used to ensure data is
> committed immediately, then by default it isn't, and you need a flush
> operation if you want to commit the data at a particular checkpoint
> You are correct in stating that it is a likely issue, but it is an
> issue even without this patch.  Unfortunately, there is nothing (that
> I saw) in the latest JEDEC MMC specification that says how a card's
> internal buffers might be forced to be flushed to non-volatile
> storage. Clearly, any recent card which tries to play optimization
> games in order to reduce write amplification and flash wearing is
> going to have an internal buffer, so the inability to handle REQ_FLUSH
> is already a potential issue for these cards.
>
> Looking at the 4.41 JEDEC spec, CMD0, CMD15 and hardware reset are all
> non-ways to ensure flush. In fact they will terminate the programming
> operation resulting in an unknown state.
>
> I'm working with a flash vendor to see if I missed something or if
> there is some way to flush internal buffers, and if that way is
> generic enough to apply to all cards.
>
> A
>

Sorry, I totally forgot to add. REQ_FLUSH means - flush all write back
caches internal to the disk hardware.

A

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [comments] MMC: Reliable write support.
  2011-03-25 15:14   ` Arnd Bergmann
  2011-03-26  7:17     ` Andrei Warkentin
@ 2011-03-29  0:50     ` Andrei Warkentin
  2011-03-29  7:01       ` Arnd Bergmann
  1 sibling, 1 reply; 14+ messages in thread
From: Andrei Warkentin @ 2011-03-29  0:50 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-mmc

On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> +static int mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
>> +{
>> +     struct mmc_blk_data *md = mq->data;
>> +
>> +     /*
>> +        No-op, only service this because we need REQ_FUA
>> +        for reliable writes.
>> +     */
>> +     spin_lock_irq(&md->lock);
>> +     __blk_end_request_all(req, 0);
>> +     spin_unlock_irq(&md->lock);
>> +
>> +     return 1;
>> +}
>
> How does this work when you have a flush that does not directly follow
> a REQ_FUA or REQ_META request? I would assume that we still need to
> flush in some way, which you don't seem to do here.
>

Arnd,

I confirmed with two MMC vendors that there is no "flush". Once the
DAT transfer completes, the data is stored on non-volatile storage as
soon as the busy status is cleared.

Reliable writes are still "more reliable" because if the DAT transfer
is interrupted (power or reset through CMD0/CMD15 or hw pin for eMMC),
you have predictable flash contents. So it makes sense to map REQ_FUA
to it (and REQ_META, I would guess).

What do you think?
A

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [comments] MMC: Reliable write support.
  2011-03-29  0:50     ` Andrei Warkentin
@ 2011-03-29  7:01       ` Arnd Bergmann
  2011-03-29 22:44         ` Andrei Warkentin
  0 siblings, 1 reply; 14+ messages in thread
From: Arnd Bergmann @ 2011-03-29  7:01 UTC (permalink / raw)
  To: Andrei Warkentin; +Cc: linux-mmc

On Tuesday 29 March 2011, Andrei Warkentin wrote:
> On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> 
> I confirmed with two MMC vendors that there is no "flush". Once the
> DAT transfer completes, the data is stored on non-volatile storage as
> soon as the busy status is cleared.
> 
> Reliable writes are still "more reliable" because if the DAT transfer
> is interrupted (power or reset through CMD0/CMD15 or hw pin for eMMC),
> you have predictable flash contents. So it makes sense to map REQ_FUA
> to it (and REQ_META, I would guess).

Yes, sounds good.

So I guess on MLC flash, a reliable write will go to a flash page
that does not have data in any of its paired pages.

	Arnd

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [comments] MMC: Reliable write support.
  2011-03-29  7:01       ` Arnd Bergmann
@ 2011-03-29 22:44         ` Andrei Warkentin
  2011-03-30 12:05           ` Arnd Bergmann
  0 siblings, 1 reply; 14+ messages in thread
From: Andrei Warkentin @ 2011-03-29 22:44 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-mmc

On Tue, Mar 29, 2011 at 2:01 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Tuesday 29 March 2011, Andrei Warkentin wrote:
>> On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>>
>> I confirmed with two MMC vendors that there is no "flush". Once the
>> DAT transfer completes, the data is stored on non-volatile storage as
>> soon as the busy status is cleared.
>>
>> Reliable writes are still "more reliable" because if the DAT transfer
>> is interrupted (power or reset through CMD0/CMD15 or hw pin for eMMC),
>> you have predictable flash contents. So it makes sense to map REQ_FUA
>> to it (and REQ_META, I would guess).
>
> Yes, sounds good.
>
> So I guess on MLC flash, a reliable write will go to a flash page
> that does not have data in any of its paired pages.

Should I resubmit the patch?

A

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [comments] MMC: Reliable write support.
  2011-03-29 22:44         ` Andrei Warkentin
@ 2011-03-30 12:05           ` Arnd Bergmann
  2011-03-30 22:38             ` Chris Ball
  0 siblings, 1 reply; 14+ messages in thread
From: Arnd Bergmann @ 2011-03-30 12:05 UTC (permalink / raw)
  To: Andrei Warkentin; +Cc: linux-mmc, Chris Ball

On Wednesday 30 March 2011, Andrei Warkentin wrote:
> On Tue, Mar 29, 2011 at 2:01 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Tuesday 29 March 2011, Andrei Warkentin wrote:
> >> On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >>
> >> I confirmed with two MMC vendors that there is no "flush". Once the
> >> DAT transfer completes, the data is stored on non-volatile storage as
> >> soon as the busy status is cleared.
> >>
> >> Reliable writes are still "more reliable" because if the DAT transfer
> >> is interrupted (power or reset through CMD0/CMD15 or hw pin for eMMC),
> >> you have predictable flash contents. So it makes sense to map REQ_FUA
> >> to it (and REQ_META, I would guess).
> >
> > Yes, sounds good.
> >
> > So I guess on MLC flash, a reliable write will go to a flash page
> > that does not have data in any of its paired pages.
> 
> Should I resubmit the patch?

I think the patch was ok, you can add my

Reviewed-by: Arnd Bergmann <arnd@arndb.de>

Chris, what is your opinion on the patch?

	Arnd

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [comments] MMC: Reliable write support.
  2011-03-30 12:05           ` Arnd Bergmann
@ 2011-03-30 22:38             ` Chris Ball
  2011-03-31 20:39               ` Andrei Warkentin
  2011-03-31 23:40               ` [PATCH] " Andrei Warkentin
  0 siblings, 2 replies; 14+ messages in thread
From: Chris Ball @ 2011-03-30 22:38 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Andrei Warkentin, linux-mmc

Hi,

On Wed, Mar 30 2011, Arnd Bergmann wrote:
> On Wednesday 30 March 2011, Andrei Warkentin wrote:
>> On Tue, Mar 29, 2011 at 2:01 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > On Tuesday 29 March 2011, Andrei Warkentin wrote:
>> >> On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> >>
>> >> I confirmed with two MMC vendors that there is no "flush". Once the
>> >> DAT transfer completes, the data is stored on non-volatile storage as
>> >> soon as the busy status is cleared.
>> >>
>> >> Reliable writes are still "more reliable" because if the DAT transfer
>> >> is interrupted (power or reset through CMD0/CMD15 or hw pin for eMMC),
>> >> you have predictable flash contents. So it makes sense to map REQ_FUA
>> >> to it (and REQ_META, I would guess).
>> >
>> > Yes, sounds good.
>> >
>> > So I guess on MLC flash, a reliable write will go to a flash page
>> > that does not have data in any of its paired pages.
>> 
>> Should I resubmit the patch?
>
> I think the patch was ok, you can add my
>
> Reviewed-by: Arnd Bergmann <arnd@arndb.de>
>
> Chris, what is your opinion on the patch?

Looks unobjectionable to me, I'd take it with your ACK -- the only
thought I had was whether it might make sense to add a test to mmc_test
to check that reliable writes make it out successfully with the same
data they went in with; it would be awful if there was a data loss bug
in the code that we didn't catch because we aren't choosing to use
REQ_FUA/REQ_META.

Then I wondered if there are *other* types of request and data integrity
that we should be growing mmc_test to handle, and then I was wondering
whether this is the realm of mmc_test at all.  Any thoughts on that?  :)

I'd also apply the indentation/cleanup patch below, rebase against
mmc-next, and shorten the commit message to 74 chars.  (Andrei, please
check the below over for correctness in case I missed something.)

Thanks,

- Chris.

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 712fe96..91a6767 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -340,9 +340,9 @@ static int mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
 	struct mmc_blk_data *md = mq->data;
 
 	/*
-	   No-op, only service this because we need REQ_FUA
-	   for reliable writes.
-	*/
+	 * No-op, only service this because we need REQ_FUA for reliable
+	 * writes.
+	 */
 	spin_lock_irq(&md->lock);
 	__blk_end_request_all(req, 0);
 	spin_unlock_irq(&md->lock);
@@ -364,16 +364,14 @@ static inline int mmc_apply_rel_rw(struct mmc_blk_request *brq,
 	int err;
 	struct mmc_command set_count;
 
-	if (!(card->ext_csd.rel_param &
-	      EXT_CSD_WR_REL_PARAM_EN)) {
-
+	if (!(card->ext_csd.rel_param & EXT_CSD_WR_REL_PARAM_EN)) {
 		/* Legacy mode imposes restrictions on transfers. */
 		if (!IS_ALIGNED(brq->cmd.arg, card->ext_csd.rel_sectors))
 			brq->data.blocks = 1;
 
 		if (brq->data.blocks > card->ext_csd.rel_sectors)
 			brq->data.blocks = card->ext_csd.rel_sectors;
-		else if (brq->data.blocks != card->ext_csd.rel_sectors)
+		else if (brq->data.blocks < card->ext_csd.rel_sectors)
 			brq->data.blocks = 1;
 	}
 
@@ -396,8 +394,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 	int ret = 1, disable_multi = 0;
 
 	/*
-	  Reliable writes are used to implement Forced Unit Access and
-	  REQ_META accesses, and it's supported only on MMCs.
+	 * Reliable writes are used to implement Forced Unit Access and
+	 * REQ_META accesses, and are supported only on MMCs.
 	 */
 	bool do_rel_wr = ((req->cmd_flags & REQ_FUA) ||
 			  (req->cmd_flags & REQ_META)) &&
@@ -464,10 +462,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 			brq.data.flags |= MMC_DATA_WRITE;
 		}
 
-		if (do_rel_wr) {
-			if (mmc_apply_rel_rw(&brq, card, req))
-				goto cmd_err;
-		}
+		if (do_rel_wr && mmc_apply_rel_rw(&brq, card, req))
+			goto cmd_err;
 
 		mmc_set_data_timeout(&brq.data, card);
 
-- 
Chris Ball   <cjb@laptop.org>   <http://printf.net/>
One Laptop Per Child

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [comments] MMC: Reliable write support.
  2011-03-30 22:38             ` Chris Ball
@ 2011-03-31 20:39               ` Andrei Warkentin
  2011-03-31 20:58                 ` Chris Ball
  2011-03-31 23:40               ` [PATCH] " Andrei Warkentin
  1 sibling, 1 reply; 14+ messages in thread
From: Andrei Warkentin @ 2011-03-31 20:39 UTC (permalink / raw)
  To: Chris Ball; +Cc: Arnd Bergmann, linux-mmc

Hi Chris,

On Wed, Mar 30, 2011 at 5:38 PM, Chris Ball <cjb@laptop.org> wrote:
> Looks unobjectionable to me, I'd take it with your ACK -- the only
> thought I had was whether it might make sense to add a test to mmc_test
> to check that reliable writes make it out successfully with the same
> data they went in with; it would be awful if there was a data loss bug
> in the code that we didn't catch because we aren't choosing to use
> REQ_FUA/REQ_META.
>
> Then I wondered if there are *other* types of request and data integrity
> that we should be growing mmc_test to handle, and then I was wondering
> whether this is the realm of mmc_test at all.  Any thoughts on that?  :)

I took a look at mmc_test. It seems like it was meant more to test
cards, rather than block.c functionality, as it issues all MMC
requests by itself, instead of submitting via block layer. It would be
separately an interesting an idea to implement an MMC
reliable write test to mmc_test, but it wouldn't help much with bugs
in block.c. What do you think?

A

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [comments] MMC: Reliable write support.
  2011-03-31 20:39               ` Andrei Warkentin
@ 2011-03-31 20:58                 ` Chris Ball
  0 siblings, 0 replies; 14+ messages in thread
From: Chris Ball @ 2011-03-31 20:58 UTC (permalink / raw)
  To: Andrei Warkentin; +Cc: Arnd Bergmann, linux-mmc

Hi,

On Thu, Mar 31 2011, Andrei Warkentin wrote:
> I took a look at mmc_test. It seems like it was meant more to test
> cards, rather than block.c functionality, as it issues all MMC
> requests by itself, instead of submitting via block layer. It would be
> separately an interesting an idea to implement an MMC reliable write
> test to mmc_test, but it wouldn't help much with bugs in block.c.
> What do you think?

Ah, that's right.  I guess this is just calling for a userspace test
suite, and there's nothing more to be done in the kernel..

- Chris.
-- 
Chris Ball   <cjb@laptop.org>   <http://printf.net/>
One Laptop Per Child

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] MMC: Reliable write support.
  2011-03-30 22:38             ` Chris Ball
  2011-03-31 20:39               ` Andrei Warkentin
@ 2011-03-31 23:40               ` Andrei Warkentin
  2011-04-01  0:47                 ` Chris Ball
  1 sibling, 1 reply; 14+ messages in thread
From: Andrei Warkentin @ 2011-03-31 23:40 UTC (permalink / raw)
  To: linux-mmc; +Cc: Andrei Warkentin

Allows reliable writes to be used for MMC writes. Reliable writes are used
to service write REQ_FUA/REQ_META requests. Handles both the legacy and
the enhanced reliable write support in MMC cards.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
---
 drivers/mmc/card/block.c |   81 +++++++++++++++++++++++++++++++++++++++++++--
 drivers/mmc/core/mmc.c   |    5 +++
 include/linux/mmc/card.h |    2 +
 include/linux/mmc/mmc.h  |    4 ++
 4 files changed, 88 insertions(+), 4 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 61d233a..91a6767 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -48,6 +48,10 @@ MODULE_ALIAS("mmc:block");
 #endif
 #define MODULE_PARAM_PREFIX "mmcblk."
 
+#define REL_WRITES_SUPPORTED(card) (mmc_card_mmc((card)) &&	\
+    (((card)->ext_csd.rel_param & EXT_CSD_WR_REL_PARAM_EN) ||	\
+     ((card)->ext_csd.rel_sectors)))
+
 static DEFINE_MUTEX(block_mutex);
 
 /*
@@ -331,6 +335,57 @@ out:
 	return err ? 0 : 1;
 }
 
+static int mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_blk_data *md = mq->data;
+
+	/*
+	 * No-op, only service this because we need REQ_FUA for reliable
+	 * writes.
+	 */
+	spin_lock_irq(&md->lock);
+	__blk_end_request_all(req, 0);
+	spin_unlock_irq(&md->lock);
+
+	return 1;
+}
+
+/*
+ * Reformat current write as a reliable write, supporting
+ * both legacy and the enhanced reliable write MMC cards.
+ * In each transfer we'll handle only as much as a single
+ * reliable write can handle, thus finish the request in
+ * partial completions.
+ */
+static inline int mmc_apply_rel_rw(struct mmc_blk_request *brq,
+				   struct mmc_card *card,
+				   struct request *req)
+{
+	int err;
+	struct mmc_command set_count;
+
+	if (!(card->ext_csd.rel_param & EXT_CSD_WR_REL_PARAM_EN)) {
+		/* Legacy mode imposes restrictions on transfers. */
+		if (!IS_ALIGNED(brq->cmd.arg, card->ext_csd.rel_sectors))
+			brq->data.blocks = 1;
+
+		if (brq->data.blocks > card->ext_csd.rel_sectors)
+			brq->data.blocks = card->ext_csd.rel_sectors;
+		else if (brq->data.blocks < card->ext_csd.rel_sectors)
+			brq->data.blocks = 1;
+	}
+
+	memset(&set_count, 0, sizeof(struct mmc_command));
+	set_count.opcode = MMC_SET_BLOCK_COUNT;
+	set_count.arg = brq->data.blocks | (1 << 31);
+	set_count.flags = MMC_RSP_R1 | MMC_CMD_AC;
+	err = mmc_wait_for_cmd(card->host, &set_count, 0);
+	if (err)
+		printk(KERN_ERR "%s: error %d SET_BLOCK_COUNT\n",
+		       req->rq_disk->disk_name, err);
+	return err;
+}
+
 static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 {
 	struct mmc_blk_data *md = mq->data;
@@ -338,6 +393,15 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 	struct mmc_blk_request brq;
 	int ret = 1, disable_multi = 0;
 
+	/*
+	 * Reliable writes are used to implement Forced Unit Access and
+	 * REQ_META accesses, and are supported only on MMCs.
+	 */
+	bool do_rel_wr = ((req->cmd_flags & REQ_FUA) ||
+			  (req->cmd_flags & REQ_META)) &&
+		(rq_data_dir(req) == WRITE) &&
+		REL_WRITES_SUPPORTED(card);
+
 	mmc_claim_host(card->host);
 
 	do {
@@ -374,12 +438,14 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 		if (disable_multi && brq.data.blocks > 1)
 			brq.data.blocks = 1;
 
-		if (brq.data.blocks > 1) {
+		if (brq.data.blocks > 1 || do_rel_wr) {
 			/* SPI multiblock writes terminate using a special
-			 * token, not a STOP_TRANSMISSION request.
+			 * token, not a STOP_TRANSMISSION request. Reliable
+			 * writes use SET_BLOCK_COUNT and do not use a
+			 * STOP_TRANSMISSION request either.
 			 */
-			if (!mmc_host_is_spi(card->host)
-					|| rq_data_dir(req) == READ)
+			if ((!mmc_host_is_spi(card->host) && !do_rel_wr) ||
+			    rq_data_dir(req) == READ)
 				brq.mrq.stop = &brq.stop;
 			readcmd = MMC_READ_MULTIPLE_BLOCK;
 			writecmd = MMC_WRITE_MULTIPLE_BLOCK;
@@ -396,6 +462,9 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req)
 			brq.data.flags |= MMC_DATA_WRITE;
 		}
 
+		if (do_rel_wr && mmc_apply_rel_rw(&brq, card, req))
+			goto cmd_err;
+
 		mmc_set_data_timeout(&brq.data, card);
 
 		brq.data.sg = mq->sg;
@@ -565,6 +634,8 @@ static int mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
 			return mmc_blk_issue_secdiscard_rq(mq, req);
 		else
 			return mmc_blk_issue_discard_rq(mq, req);
+	} else if (req->cmd_flags & REQ_FLUSH) {
+		return mmc_blk_issue_flush(mq, req);
 	} else {
 		return mmc_blk_issue_rw_rq(mq, req);
 	}
@@ -622,6 +693,8 @@ static struct mmc_blk_data *mmc_blk_alloc(struct mmc_card *card)
 	md->disk->queue = md->queue.queue;
 	md->disk->driverfs_dev = &card->dev;
 	set_disk_ro(md->disk, md->read_only);
+	if (REL_WRITES_SUPPORTED(card))
+		blk_queue_flush(md->queue.queue, REQ_FLUSH | REQ_FUA);
 
 	/*
 	 * As discussed on lkml, GENHD_FL_REMOVABLE should:
diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index 2d48800..caba751 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -300,6 +300,8 @@ static int mmc_read_ext_csd(struct mmc_card *card)
 			ext_csd[EXT_CSD_ERASE_TIMEOUT_MULT];
 		card->ext_csd.hc_erase_size =
 			ext_csd[EXT_CSD_HC_ERASE_GRP_SIZE] << 10;
+
+		card->ext_csd.rel_sectors = ext_csd[EXT_CSD_REL_WR_SEC_C];
 	}
 
 	if (card->ext_csd.rev >= 4) {
@@ -351,6 +353,9 @@ static int mmc_read_ext_csd(struct mmc_card *card)
 			ext_csd[EXT_CSD_TRIM_MULT];
 	}
 
+	if (card->ext_csd.rev >= 5)
+		card->ext_csd.rel_param = ext_csd[EXT_CSD_WR_REL_PARAM];
+
 	if (ext_csd[EXT_CSD_ERASED_MEM_CONT])
 		card->erased_byte = 0xFF;
 	else
diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
index 557b732..c4e96fa 100644
--- a/include/linux/mmc/card.h
+++ b/include/linux/mmc/card.h
@@ -45,6 +45,8 @@ struct mmc_ext_csd {
 	u8			rev;
 	u8			erase_group_def;
 	u8			sec_feature_support;
+	u8			rel_sectors;
+	u8			rel_param;
 	u8			bootconfig;
 	unsigned int		sa_timeout;		/* Units: 100ns */
 	unsigned int		hs_max_dtr;
diff --git a/include/linux/mmc/mmc.h b/include/linux/mmc/mmc.h
index b5ec88f..390aa6e 100644
--- a/include/linux/mmc/mmc.h
+++ b/include/linux/mmc/mmc.h
@@ -255,6 +255,7 @@ struct _mmc_csd {
 
 #define EXT_CSD_PARTITION_ATTRIBUTE	156	/* R/W */
 #define EXT_CSD_PARTITION_SUPPORT	160	/* RO */
+#define EXT_CSD_WR_REL_PARAM		166	/* RO */
 #define EXT_CSD_ERASE_GROUP_DEF		175	/* R/W */
 #define EXT_CSD_BOOT_CONFIG		179	/* R/W */
 #define EXT_CSD_ERASED_MEM_CONT		181	/* RO */
@@ -265,6 +266,7 @@ struct _mmc_csd {
 #define EXT_CSD_CARD_TYPE		196	/* RO */
 #define EXT_CSD_SEC_CNT			212	/* RO, 4 bytes */
 #define EXT_CSD_S_A_TIMEOUT		217	/* RO */
+#define EXT_CSD_REL_WR_SEC_C		222	/* RO */
 #define EXT_CSD_HC_WP_GRP_SIZE		221	/* RO */
 #define EXT_CSD_ERASE_TIMEOUT_MULT	223	/* RO */
 #define EXT_CSD_HC_ERASE_GRP_SIZE	224	/* RO */
@@ -277,6 +279,8 @@ struct _mmc_csd {
  * EXT_CSD field definitions
  */
 
+#define EXT_CSD_WR_REL_PARAM_EN		(1<<2)
+
 #define EXT_CSD_CMD_SET_NORMAL		(1<<0)
 #define EXT_CSD_CMD_SET_SECURE		(1<<1)
 #define EXT_CSD_CMD_SET_CPSECURE	(1<<2)
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] MMC: Reliable write support.
  2011-03-31 23:40               ` [PATCH] " Andrei Warkentin
@ 2011-04-01  0:47                 ` Chris Ball
  0 siblings, 0 replies; 14+ messages in thread
From: Chris Ball @ 2011-04-01  0:47 UTC (permalink / raw)
  To: Andrei Warkentin; +Cc: linux-mmc

Hi,

On Thu, Mar 31 2011, Andrei Warkentin wrote:
> Allows reliable writes to be used for MMC writes. Reliable writes are used
> to service write REQ_FUA/REQ_META requests. Handles both the legacy and
> the enhanced reliable write support in MMC cards.
>
> Reviewed-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Andrei Warkentin <andreiw@motorola.com>

Thanks, Andrei, pushed to mmc-next for .40.

- Chris.
-- 
Chris Ball   <cjb@laptop.org>   <http://printf.net/>
One Laptop Per Child

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2011-04-01  0:42 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-24 21:22 Reliable write support Andrei Warkentin
2011-03-24 21:22 ` [comments] MMC: " Andrei Warkentin
2011-03-25 15:14   ` Arnd Bergmann
2011-03-26  7:17     ` Andrei Warkentin
2011-03-26  7:22       ` Andrei Warkentin
2011-03-29  0:50     ` Andrei Warkentin
2011-03-29  7:01       ` Arnd Bergmann
2011-03-29 22:44         ` Andrei Warkentin
2011-03-30 12:05           ` Arnd Bergmann
2011-03-30 22:38             ` Chris Ball
2011-03-31 20:39               ` Andrei Warkentin
2011-03-31 20:58                 ` Chris Ball
2011-03-31 23:40               ` [PATCH] " Andrei Warkentin
2011-04-01  0:47                 ` Chris Ball

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox