* Reliable write support. @ 2011-03-24 21:22 Andrei Warkentin 2011-03-24 21:22 ` [comments] MMC: " Andrei Warkentin 0 siblings, 1 reply; 14+ messages in thread From: Andrei Warkentin @ 2011-03-24 21:22 UTC (permalink / raw) To: linux-mmc Hi group, As part of enabling maximum-reliability (at cost of performance) workarounds for MMC media whose controllers employ a scheme, where there is usually a small (flash-page sized) and a large (erase-block sized) buffer, I needed reliable write support. As per Arnd's feedback, I've split the reliable write support out by itself. I believe other people were going to look at this as well (https://wiki.linaro.org/WorkingGroups/KernelConsolidation/Specs/StoragePerfEMMC). The patch provides a mechanism by which reliable writes can be enabled if they are supported, and enables them for REQ_FUA/REQ_META write transfers. REQ_FUA/REQ_FLUSH is a replacement for barrier bios, which were broken before anyway (couldn't handle partial completions, which are what's going to happen for legacy reliable write support). REQ_FUA is a flag a bio submitted from the filesystem and will make sure that I/O completion for this request is only signaled after the data has been committed to non-volatile storage. REQ_META are supposed to be used by filesystems to mark bios that contain metadata. The patch is against linux-next and was tested with a Toshiba MMC08G (legacy reliable write support) eMMC on x64. ToC: [comments] MMC: Reliable write support. Thanks ahead of time, A ^ permalink raw reply [flat|nested] 14+ messages in thread
* [comments] MMC: Reliable write support. 2011-03-24 21:22 Reliable write support Andrei Warkentin @ 2011-03-24 21:22 ` Andrei Warkentin 2011-03-25 15:14 ` Arnd Bergmann 0 siblings, 1 reply; 14+ messages in thread From: Andrei Warkentin @ 2011-03-24 21:22 UTC (permalink / raw) To: linux-mmc; +Cc: Andrei Warkentin This is a request-for-comments patch. Please provide your feedback. Allows reliable writes to be used for MMC writes. Reliable writes are used to service write REQ_FUA/REQ_META requests. Handles both the legacy and the enhanced reliable write support in MMC cards. Beyond REQ_FUA/REQ_META, this was meant to be used by a following patch that aimed to reduce write amplification issues in cards employing a small (usually flash page-sized) buffer and a large (usually erase-block sized) buffer, at the expense of performance. Signed-off-by: Andrei Warkentin <andreiw@motorola.com> --- drivers/mmc/card/block.c | 85 +++++++++++++++++++++++++++++++++++++++++++-- drivers/mmc/core/mmc.c | 5 +++ include/linux/mmc/card.h | 2 + include/linux/mmc/mmc.h | 4 ++ 4 files changed, 92 insertions(+), 4 deletions(-) diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c index 61d233a..712fe96 100644 --- a/drivers/mmc/card/block.c +++ b/drivers/mmc/card/block.c @@ -48,6 +48,10 @@ MODULE_ALIAS("mmc:block"); #endif #define MODULE_PARAM_PREFIX "mmcblk." +#define REL_WRITES_SUPPORTED(card) (mmc_card_mmc((card)) && \ + (((card)->ext_csd.rel_param & EXT_CSD_WR_REL_PARAM_EN) || \ + ((card)->ext_csd.rel_sectors))) + static DEFINE_MUTEX(block_mutex); /* @@ -331,6 +335,59 @@ out: return err ? 0 : 1; } +static int mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req) +{ + struct mmc_blk_data *md = mq->data; + + /* + No-op, only service this because we need REQ_FUA + for reliable writes. + */ + spin_lock_irq(&md->lock); + __blk_end_request_all(req, 0); + spin_unlock_irq(&md->lock); + + return 1; +} + +/* + * Reformat current write as a reliable write, supporting + * both legacy and the enhanced reliable write MMC cards. + * In each transfer we'll handle only as much as a single + * reliable write can handle, thus finish the request in + * partial completions. + */ +static inline int mmc_apply_rel_rw(struct mmc_blk_request *brq, + struct mmc_card *card, + struct request *req) +{ + int err; + struct mmc_command set_count; + + if (!(card->ext_csd.rel_param & + EXT_CSD_WR_REL_PARAM_EN)) { + + /* Legacy mode imposes restrictions on transfers. */ + if (!IS_ALIGNED(brq->cmd.arg, card->ext_csd.rel_sectors)) + brq->data.blocks = 1; + + if (brq->data.blocks > card->ext_csd.rel_sectors) + brq->data.blocks = card->ext_csd.rel_sectors; + else if (brq->data.blocks != card->ext_csd.rel_sectors) + brq->data.blocks = 1; + } + + memset(&set_count, 0, sizeof(struct mmc_command)); + set_count.opcode = MMC_SET_BLOCK_COUNT; + set_count.arg = brq->data.blocks | (1 << 31); + set_count.flags = MMC_RSP_R1 | MMC_CMD_AC; + err = mmc_wait_for_cmd(card->host, &set_count, 0); + if (err) + printk(KERN_ERR "%s: error %d SET_BLOCK_COUNT\n", + req->rq_disk->disk_name, err); + return err; +} + static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req) { struct mmc_blk_data *md = mq->data; @@ -338,6 +395,15 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req) struct mmc_blk_request brq; int ret = 1, disable_multi = 0; + /* + Reliable writes are used to implement Forced Unit Access and + REQ_META accesses, and it's supported only on MMCs. + */ + bool do_rel_wr = ((req->cmd_flags & REQ_FUA) || + (req->cmd_flags & REQ_META)) && + (rq_data_dir(req) == WRITE) && + REL_WRITES_SUPPORTED(card); + mmc_claim_host(card->host); do { @@ -374,12 +440,14 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req) if (disable_multi && brq.data.blocks > 1) brq.data.blocks = 1; - if (brq.data.blocks > 1) { + if (brq.data.blocks > 1 || do_rel_wr) { /* SPI multiblock writes terminate using a special - * token, not a STOP_TRANSMISSION request. + * token, not a STOP_TRANSMISSION request. Reliable + * writes use SET_BLOCK_COUNT and do not use a + * STOP_TRANSMISSION request either. */ - if (!mmc_host_is_spi(card->host) - || rq_data_dir(req) == READ) + if ((!mmc_host_is_spi(card->host) && !do_rel_wr) || + rq_data_dir(req) == READ) brq.mrq.stop = &brq.stop; readcmd = MMC_READ_MULTIPLE_BLOCK; writecmd = MMC_WRITE_MULTIPLE_BLOCK; @@ -396,6 +464,11 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req) brq.data.flags |= MMC_DATA_WRITE; } + if (do_rel_wr) { + if (mmc_apply_rel_rw(&brq, card, req)) + goto cmd_err; + } + mmc_set_data_timeout(&brq.data, card); brq.data.sg = mq->sg; @@ -565,6 +638,8 @@ static int mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req) return mmc_blk_issue_secdiscard_rq(mq, req); else return mmc_blk_issue_discard_rq(mq, req); + } else if (req->cmd_flags & REQ_FLUSH) { + return mmc_blk_issue_flush(mq, req); } else { return mmc_blk_issue_rw_rq(mq, req); } @@ -622,6 +697,8 @@ static struct mmc_blk_data *mmc_blk_alloc(struct mmc_card *card) md->disk->queue = md->queue.queue; md->disk->driverfs_dev = &card->dev; set_disk_ro(md->disk, md->read_only); + if (REL_WRITES_SUPPORTED(card)) + blk_queue_flush(md->queue.queue, REQ_FLUSH | REQ_FUA); /* * As discussed on lkml, GENHD_FL_REMOVABLE should: diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c index 14e95f3..1b1e142 100644 --- a/drivers/mmc/core/mmc.c +++ b/drivers/mmc/core/mmc.c @@ -299,6 +299,8 @@ static int mmc_read_ext_csd(struct mmc_card *card) ext_csd[EXT_CSD_ERASE_TIMEOUT_MULT]; card->ext_csd.hc_erase_size = ext_csd[EXT_CSD_HC_ERASE_GRP_SIZE] << 10; + + card->ext_csd.rel_sectors = ext_csd[EXT_CSD_REL_WR_SEC_C]; } if (card->ext_csd.rev >= 4) { @@ -350,6 +352,9 @@ static int mmc_read_ext_csd(struct mmc_card *card) ext_csd[EXT_CSD_TRIM_MULT]; } + if (card->ext_csd.rev >= 5) + card->ext_csd.rel_param = ext_csd[EXT_CSD_WR_REL_PARAM]; + if (ext_csd[EXT_CSD_ERASED_MEM_CONT]) card->erased_byte = 0xFF; else diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h index adb4888..959e3d8 100644 --- a/include/linux/mmc/card.h +++ b/include/linux/mmc/card.h @@ -45,6 +45,8 @@ struct mmc_ext_csd { u8 rev; u8 erase_group_def; u8 sec_feature_support; + u8 rel_sectors; + u8 rel_param; unsigned int sa_timeout; /* Units: 100ns */ unsigned int hs_max_dtr; unsigned int sectors; diff --git a/include/linux/mmc/mmc.h b/include/linux/mmc/mmc.h index 264ba54..44a8157 100644 --- a/include/linux/mmc/mmc.h +++ b/include/linux/mmc/mmc.h @@ -255,6 +255,7 @@ struct _mmc_csd { #define EXT_CSD_PARTITION_ATTRIBUTE 156 /* R/W */ #define EXT_CSD_PARTITION_SUPPORT 160 /* RO */ +#define EXT_CSD_WR_REL_PARAM 166 /* RO */ #define EXT_CSD_ERASE_GROUP_DEF 175 /* R/W */ #define EXT_CSD_ERASED_MEM_CONT 181 /* RO */ #define EXT_CSD_BUS_WIDTH 183 /* R/W */ @@ -264,6 +265,7 @@ struct _mmc_csd { #define EXT_CSD_CARD_TYPE 196 /* RO */ #define EXT_CSD_SEC_CNT 212 /* RO, 4 bytes */ #define EXT_CSD_S_A_TIMEOUT 217 /* RO */ +#define EXT_CSD_REL_WR_SEC_C 222 /* RO */ #define EXT_CSD_HC_WP_GRP_SIZE 221 /* RO */ #define EXT_CSD_ERASE_TIMEOUT_MULT 223 /* RO */ #define EXT_CSD_HC_ERASE_GRP_SIZE 224 /* RO */ @@ -276,6 +278,8 @@ struct _mmc_csd { * EXT_CSD field definitions */ +#define EXT_CSD_WR_REL_PARAM_EN (1<<2) + #define EXT_CSD_CMD_SET_NORMAL (1<<0) #define EXT_CSD_CMD_SET_SECURE (1<<1) #define EXT_CSD_CMD_SET_CPSECURE (1<<2) -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [comments] MMC: Reliable write support. 2011-03-24 21:22 ` [comments] MMC: " Andrei Warkentin @ 2011-03-25 15:14 ` Arnd Bergmann 2011-03-26 7:17 ` Andrei Warkentin 2011-03-29 0:50 ` Andrei Warkentin 0 siblings, 2 replies; 14+ messages in thread From: Arnd Bergmann @ 2011-03-25 15:14 UTC (permalink / raw) To: Andrei Warkentin; +Cc: linux-mmc On Thursday 24 March 2011, Andrei Warkentin wrote: > This is a request-for-comments patch. Please provide your feedback. > > Allows reliable writes to be used for MMC writes. Reliable writes are used > to service write REQ_FUA/REQ_META requests. Handles both the legacy and the enhanced > reliable write support in MMC cards. > > Beyond REQ_FUA/REQ_META, this was meant to be used by a following patch that aimed > to reduce write amplification issues in cards employing a small (usually flash page-sized) > buffer and a large (usually erase-block sized) buffer, at the expense of performance. Looks good to me, but I don't really understand some of the block layer specifics here. One question: > +static int mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req) > +{ > + struct mmc_blk_data *md = mq->data; > + > + /* > + No-op, only service this because we need REQ_FUA > + for reliable writes. > + */ > + spin_lock_irq(&md->lock); > + __blk_end_request_all(req, 0); > + spin_unlock_irq(&md->lock); > + > + return 1; > +} How does this work when you have a flush that does not directly follow a REQ_FUA or REQ_META request? I would assume that we still need to flush in some way, which you don't seem to do here. Arnd ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [comments] MMC: Reliable write support. 2011-03-25 15:14 ` Arnd Bergmann @ 2011-03-26 7:17 ` Andrei Warkentin 2011-03-26 7:22 ` Andrei Warkentin 2011-03-29 0:50 ` Andrei Warkentin 1 sibling, 1 reply; 14+ messages in thread From: Andrei Warkentin @ 2011-03-26 7:17 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linux-mmc Hi Arnd, On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote: > On Thursday 24 March 2011, Andrei Warkentin wrote: >> >> Beyond REQ_FUA/REQ_META, this was meant to be used by a following patch that aimed >> to reduce write amplification issues in cards employing a small (usually flash page-sized) >> buffer and a large (usually erase-block sized) buffer, at the expense of performance. > > Looks good to me, but I don't really understand some of the block layer > specifics here. One question: > How does this work when you have a flush that does not directly follow > a REQ_FUA or REQ_META request? I would assume that we still need to > flush in some way, which you don't seem to do here. In much the same way as before - REQ_FLUSH does nothing as it did nothing before. The only reason I'm handing REQ_FLUSH, is because you can't register for REQ_FUA without registering for REQ_FLUSH, which does make sense - after all, if REQ_FUA is used to ensure data is committed immediately, then by default it isn't, and you need a flush operation if you want to commit the data at a particular checkpoint You are correct in stating that it is a likely issue, but it is an issue even without this patch. Unfortunately, there is nothing (that I saw) in the latest JEDEC MMC specification that says how a card's internal buffers might be forced to be flushed to non-volatile storage. Clearly, any recent card which tries to play optimization games in order to reduce write amplification and flash wearing is going to have an internal buffer, so the inability to handle REQ_FLUSH is already a potential issue for these cards. Looking at the 4.41 JEDEC spec, CMD0, CMD15 and hardware reset are all non-ways to ensure flush. In fact they will terminate the programming operation resulting in an unknown state. I'm working with a flash vendor to see if I missed something or if there is some way to flush internal buffers, and if that way is generic enough to apply to all cards. A ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [comments] MMC: Reliable write support. 2011-03-26 7:17 ` Andrei Warkentin @ 2011-03-26 7:22 ` Andrei Warkentin 0 siblings, 0 replies; 14+ messages in thread From: Andrei Warkentin @ 2011-03-26 7:22 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linux-mmc On Sat, Mar 26, 2011 at 2:17 AM, Andrei Warkentin <andreiw@motorola.com> wrote: > Hi Arnd, > > On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote: >> On Thursday 24 March 2011, Andrei Warkentin wrote: >>> >>> Beyond REQ_FUA/REQ_META, this was meant to be used by a following patch that aimed >>> to reduce write amplification issues in cards employing a small (usually flash page-sized) >>> buffer and a large (usually erase-block sized) buffer, at the expense of performance. >> >> Looks good to me, but I don't really understand some of the block layer >> specifics here. One question: >> How does this work when you have a flush that does not directly follow >> a REQ_FUA or REQ_META request? I would assume that we still need to >> flush in some way, which you don't seem to do here. > > In much the same way as before - REQ_FLUSH does nothing as it did > nothing before. The only reason I'm handing REQ_FLUSH, is because you > can't register for REQ_FUA without registering for REQ_FLUSH, which > does make sense - after all, if REQ_FUA is used to ensure data is > committed immediately, then by default it isn't, and you need a flush > operation if you want to commit the data at a particular checkpoint > You are correct in stating that it is a likely issue, but it is an > issue even without this patch. Unfortunately, there is nothing (that > I saw) in the latest JEDEC MMC specification that says how a card's > internal buffers might be forced to be flushed to non-volatile > storage. Clearly, any recent card which tries to play optimization > games in order to reduce write amplification and flash wearing is > going to have an internal buffer, so the inability to handle REQ_FLUSH > is already a potential issue for these cards. > > Looking at the 4.41 JEDEC spec, CMD0, CMD15 and hardware reset are all > non-ways to ensure flush. In fact they will terminate the programming > operation resulting in an unknown state. > > I'm working with a flash vendor to see if I missed something or if > there is some way to flush internal buffers, and if that way is > generic enough to apply to all cards. > > A > Sorry, I totally forgot to add. REQ_FLUSH means - flush all write back caches internal to the disk hardware. A ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [comments] MMC: Reliable write support. 2011-03-25 15:14 ` Arnd Bergmann 2011-03-26 7:17 ` Andrei Warkentin @ 2011-03-29 0:50 ` Andrei Warkentin 2011-03-29 7:01 ` Arnd Bergmann 1 sibling, 1 reply; 14+ messages in thread From: Andrei Warkentin @ 2011-03-29 0:50 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linux-mmc On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote: >> +static int mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req) >> +{ >> + struct mmc_blk_data *md = mq->data; >> + >> + /* >> + No-op, only service this because we need REQ_FUA >> + for reliable writes. >> + */ >> + spin_lock_irq(&md->lock); >> + __blk_end_request_all(req, 0); >> + spin_unlock_irq(&md->lock); >> + >> + return 1; >> +} > > How does this work when you have a flush that does not directly follow > a REQ_FUA or REQ_META request? I would assume that we still need to > flush in some way, which you don't seem to do here. > Arnd, I confirmed with two MMC vendors that there is no "flush". Once the DAT transfer completes, the data is stored on non-volatile storage as soon as the busy status is cleared. Reliable writes are still "more reliable" because if the DAT transfer is interrupted (power or reset through CMD0/CMD15 or hw pin for eMMC), you have predictable flash contents. So it makes sense to map REQ_FUA to it (and REQ_META, I would guess). What do you think? A ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [comments] MMC: Reliable write support. 2011-03-29 0:50 ` Andrei Warkentin @ 2011-03-29 7:01 ` Arnd Bergmann 2011-03-29 22:44 ` Andrei Warkentin 0 siblings, 1 reply; 14+ messages in thread From: Arnd Bergmann @ 2011-03-29 7:01 UTC (permalink / raw) To: Andrei Warkentin; +Cc: linux-mmc On Tuesday 29 March 2011, Andrei Warkentin wrote: > On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > I confirmed with two MMC vendors that there is no "flush". Once the > DAT transfer completes, the data is stored on non-volatile storage as > soon as the busy status is cleared. > > Reliable writes are still "more reliable" because if the DAT transfer > is interrupted (power or reset through CMD0/CMD15 or hw pin for eMMC), > you have predictable flash contents. So it makes sense to map REQ_FUA > to it (and REQ_META, I would guess). Yes, sounds good. So I guess on MLC flash, a reliable write will go to a flash page that does not have data in any of its paired pages. Arnd ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [comments] MMC: Reliable write support. 2011-03-29 7:01 ` Arnd Bergmann @ 2011-03-29 22:44 ` Andrei Warkentin 2011-03-30 12:05 ` Arnd Bergmann 0 siblings, 1 reply; 14+ messages in thread From: Andrei Warkentin @ 2011-03-29 22:44 UTC (permalink / raw) To: Arnd Bergmann; +Cc: linux-mmc On Tue, Mar 29, 2011 at 2:01 AM, Arnd Bergmann <arnd@arndb.de> wrote: > On Tuesday 29 March 2011, Andrei Warkentin wrote: >> On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote: >> >> I confirmed with two MMC vendors that there is no "flush". Once the >> DAT transfer completes, the data is stored on non-volatile storage as >> soon as the busy status is cleared. >> >> Reliable writes are still "more reliable" because if the DAT transfer >> is interrupted (power or reset through CMD0/CMD15 or hw pin for eMMC), >> you have predictable flash contents. So it makes sense to map REQ_FUA >> to it (and REQ_META, I would guess). > > Yes, sounds good. > > So I guess on MLC flash, a reliable write will go to a flash page > that does not have data in any of its paired pages. Should I resubmit the patch? A ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [comments] MMC: Reliable write support. 2011-03-29 22:44 ` Andrei Warkentin @ 2011-03-30 12:05 ` Arnd Bergmann 2011-03-30 22:38 ` Chris Ball 0 siblings, 1 reply; 14+ messages in thread From: Arnd Bergmann @ 2011-03-30 12:05 UTC (permalink / raw) To: Andrei Warkentin; +Cc: linux-mmc, Chris Ball On Wednesday 30 March 2011, Andrei Warkentin wrote: > On Tue, Mar 29, 2011 at 2:01 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > On Tuesday 29 March 2011, Andrei Warkentin wrote: > >> On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote: > >> > >> I confirmed with two MMC vendors that there is no "flush". Once the > >> DAT transfer completes, the data is stored on non-volatile storage as > >> soon as the busy status is cleared. > >> > >> Reliable writes are still "more reliable" because if the DAT transfer > >> is interrupted (power or reset through CMD0/CMD15 or hw pin for eMMC), > >> you have predictable flash contents. So it makes sense to map REQ_FUA > >> to it (and REQ_META, I would guess). > > > > Yes, sounds good. > > > > So I guess on MLC flash, a reliable write will go to a flash page > > that does not have data in any of its paired pages. > > Should I resubmit the patch? I think the patch was ok, you can add my Reviewed-by: Arnd Bergmann <arnd@arndb.de> Chris, what is your opinion on the patch? Arnd ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [comments] MMC: Reliable write support. 2011-03-30 12:05 ` Arnd Bergmann @ 2011-03-30 22:38 ` Chris Ball 2011-03-31 20:39 ` Andrei Warkentin 2011-03-31 23:40 ` [PATCH] " Andrei Warkentin 0 siblings, 2 replies; 14+ messages in thread From: Chris Ball @ 2011-03-30 22:38 UTC (permalink / raw) To: Arnd Bergmann; +Cc: Andrei Warkentin, linux-mmc Hi, On Wed, Mar 30 2011, Arnd Bergmann wrote: > On Wednesday 30 March 2011, Andrei Warkentin wrote: >> On Tue, Mar 29, 2011 at 2:01 AM, Arnd Bergmann <arnd@arndb.de> wrote: >> > On Tuesday 29 March 2011, Andrei Warkentin wrote: >> >> On Fri, Mar 25, 2011 at 10:14 AM, Arnd Bergmann <arnd@arndb.de> wrote: >> >> >> >> I confirmed with two MMC vendors that there is no "flush". Once the >> >> DAT transfer completes, the data is stored on non-volatile storage as >> >> soon as the busy status is cleared. >> >> >> >> Reliable writes are still "more reliable" because if the DAT transfer >> >> is interrupted (power or reset through CMD0/CMD15 or hw pin for eMMC), >> >> you have predictable flash contents. So it makes sense to map REQ_FUA >> >> to it (and REQ_META, I would guess). >> > >> > Yes, sounds good. >> > >> > So I guess on MLC flash, a reliable write will go to a flash page >> > that does not have data in any of its paired pages. >> >> Should I resubmit the patch? > > I think the patch was ok, you can add my > > Reviewed-by: Arnd Bergmann <arnd@arndb.de> > > Chris, what is your opinion on the patch? Looks unobjectionable to me, I'd take it with your ACK -- the only thought I had was whether it might make sense to add a test to mmc_test to check that reliable writes make it out successfully with the same data they went in with; it would be awful if there was a data loss bug in the code that we didn't catch because we aren't choosing to use REQ_FUA/REQ_META. Then I wondered if there are *other* types of request and data integrity that we should be growing mmc_test to handle, and then I was wondering whether this is the realm of mmc_test at all. Any thoughts on that? :) I'd also apply the indentation/cleanup patch below, rebase against mmc-next, and shorten the commit message to 74 chars. (Andrei, please check the below over for correctness in case I missed something.) Thanks, - Chris. diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c index 712fe96..91a6767 100644 --- a/drivers/mmc/card/block.c +++ b/drivers/mmc/card/block.c @@ -340,9 +340,9 @@ static int mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req) struct mmc_blk_data *md = mq->data; /* - No-op, only service this because we need REQ_FUA - for reliable writes. - */ + * No-op, only service this because we need REQ_FUA for reliable + * writes. + */ spin_lock_irq(&md->lock); __blk_end_request_all(req, 0); spin_unlock_irq(&md->lock); @@ -364,16 +364,14 @@ static inline int mmc_apply_rel_rw(struct mmc_blk_request *brq, int err; struct mmc_command set_count; - if (!(card->ext_csd.rel_param & - EXT_CSD_WR_REL_PARAM_EN)) { - + if (!(card->ext_csd.rel_param & EXT_CSD_WR_REL_PARAM_EN)) { /* Legacy mode imposes restrictions on transfers. */ if (!IS_ALIGNED(brq->cmd.arg, card->ext_csd.rel_sectors)) brq->data.blocks = 1; if (brq->data.blocks > card->ext_csd.rel_sectors) brq->data.blocks = card->ext_csd.rel_sectors; - else if (brq->data.blocks != card->ext_csd.rel_sectors) + else if (brq->data.blocks < card->ext_csd.rel_sectors) brq->data.blocks = 1; } @@ -396,8 +394,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req) int ret = 1, disable_multi = 0; /* - Reliable writes are used to implement Forced Unit Access and - REQ_META accesses, and it's supported only on MMCs. + * Reliable writes are used to implement Forced Unit Access and + * REQ_META accesses, and are supported only on MMCs. */ bool do_rel_wr = ((req->cmd_flags & REQ_FUA) || (req->cmd_flags & REQ_META)) && @@ -464,10 +462,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req) brq.data.flags |= MMC_DATA_WRITE; } - if (do_rel_wr) { - if (mmc_apply_rel_rw(&brq, card, req)) - goto cmd_err; - } + if (do_rel_wr && mmc_apply_rel_rw(&brq, card, req)) + goto cmd_err; mmc_set_data_timeout(&brq.data, card); -- Chris Ball <cjb@laptop.org> <http://printf.net/> One Laptop Per Child ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [comments] MMC: Reliable write support. 2011-03-30 22:38 ` Chris Ball @ 2011-03-31 20:39 ` Andrei Warkentin 2011-03-31 20:58 ` Chris Ball 2011-03-31 23:40 ` [PATCH] " Andrei Warkentin 1 sibling, 1 reply; 14+ messages in thread From: Andrei Warkentin @ 2011-03-31 20:39 UTC (permalink / raw) To: Chris Ball; +Cc: Arnd Bergmann, linux-mmc Hi Chris, On Wed, Mar 30, 2011 at 5:38 PM, Chris Ball <cjb@laptop.org> wrote: > Looks unobjectionable to me, I'd take it with your ACK -- the only > thought I had was whether it might make sense to add a test to mmc_test > to check that reliable writes make it out successfully with the same > data they went in with; it would be awful if there was a data loss bug > in the code that we didn't catch because we aren't choosing to use > REQ_FUA/REQ_META. > > Then I wondered if there are *other* types of request and data integrity > that we should be growing mmc_test to handle, and then I was wondering > whether this is the realm of mmc_test at all. Any thoughts on that? :) I took a look at mmc_test. It seems like it was meant more to test cards, rather than block.c functionality, as it issues all MMC requests by itself, instead of submitting via block layer. It would be separately an interesting an idea to implement an MMC reliable write test to mmc_test, but it wouldn't help much with bugs in block.c. What do you think? A ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [comments] MMC: Reliable write support. 2011-03-31 20:39 ` Andrei Warkentin @ 2011-03-31 20:58 ` Chris Ball 0 siblings, 0 replies; 14+ messages in thread From: Chris Ball @ 2011-03-31 20:58 UTC (permalink / raw) To: Andrei Warkentin; +Cc: Arnd Bergmann, linux-mmc Hi, On Thu, Mar 31 2011, Andrei Warkentin wrote: > I took a look at mmc_test. It seems like it was meant more to test > cards, rather than block.c functionality, as it issues all MMC > requests by itself, instead of submitting via block layer. It would be > separately an interesting an idea to implement an MMC reliable write > test to mmc_test, but it wouldn't help much with bugs in block.c. > What do you think? Ah, that's right. I guess this is just calling for a userspace test suite, and there's nothing more to be done in the kernel.. - Chris. -- Chris Ball <cjb@laptop.org> <http://printf.net/> One Laptop Per Child ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] MMC: Reliable write support. 2011-03-30 22:38 ` Chris Ball 2011-03-31 20:39 ` Andrei Warkentin @ 2011-03-31 23:40 ` Andrei Warkentin 2011-04-01 0:47 ` Chris Ball 1 sibling, 1 reply; 14+ messages in thread From: Andrei Warkentin @ 2011-03-31 23:40 UTC (permalink / raw) To: linux-mmc; +Cc: Andrei Warkentin Allows reliable writes to be used for MMC writes. Reliable writes are used to service write REQ_FUA/REQ_META requests. Handles both the legacy and the enhanced reliable write support in MMC cards. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrei Warkentin <andreiw@motorola.com> --- drivers/mmc/card/block.c | 81 +++++++++++++++++++++++++++++++++++++++++++-- drivers/mmc/core/mmc.c | 5 +++ include/linux/mmc/card.h | 2 + include/linux/mmc/mmc.h | 4 ++ 4 files changed, 88 insertions(+), 4 deletions(-) diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c index 61d233a..91a6767 100644 --- a/drivers/mmc/card/block.c +++ b/drivers/mmc/card/block.c @@ -48,6 +48,10 @@ MODULE_ALIAS("mmc:block"); #endif #define MODULE_PARAM_PREFIX "mmcblk." +#define REL_WRITES_SUPPORTED(card) (mmc_card_mmc((card)) && \ + (((card)->ext_csd.rel_param & EXT_CSD_WR_REL_PARAM_EN) || \ + ((card)->ext_csd.rel_sectors))) + static DEFINE_MUTEX(block_mutex); /* @@ -331,6 +335,57 @@ out: return err ? 0 : 1; } +static int mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req) +{ + struct mmc_blk_data *md = mq->data; + + /* + * No-op, only service this because we need REQ_FUA for reliable + * writes. + */ + spin_lock_irq(&md->lock); + __blk_end_request_all(req, 0); + spin_unlock_irq(&md->lock); + + return 1; +} + +/* + * Reformat current write as a reliable write, supporting + * both legacy and the enhanced reliable write MMC cards. + * In each transfer we'll handle only as much as a single + * reliable write can handle, thus finish the request in + * partial completions. + */ +static inline int mmc_apply_rel_rw(struct mmc_blk_request *brq, + struct mmc_card *card, + struct request *req) +{ + int err; + struct mmc_command set_count; + + if (!(card->ext_csd.rel_param & EXT_CSD_WR_REL_PARAM_EN)) { + /* Legacy mode imposes restrictions on transfers. */ + if (!IS_ALIGNED(brq->cmd.arg, card->ext_csd.rel_sectors)) + brq->data.blocks = 1; + + if (brq->data.blocks > card->ext_csd.rel_sectors) + brq->data.blocks = card->ext_csd.rel_sectors; + else if (brq->data.blocks < card->ext_csd.rel_sectors) + brq->data.blocks = 1; + } + + memset(&set_count, 0, sizeof(struct mmc_command)); + set_count.opcode = MMC_SET_BLOCK_COUNT; + set_count.arg = brq->data.blocks | (1 << 31); + set_count.flags = MMC_RSP_R1 | MMC_CMD_AC; + err = mmc_wait_for_cmd(card->host, &set_count, 0); + if (err) + printk(KERN_ERR "%s: error %d SET_BLOCK_COUNT\n", + req->rq_disk->disk_name, err); + return err; +} + static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req) { struct mmc_blk_data *md = mq->data; @@ -338,6 +393,15 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req) struct mmc_blk_request brq; int ret = 1, disable_multi = 0; + /* + * Reliable writes are used to implement Forced Unit Access and + * REQ_META accesses, and are supported only on MMCs. + */ + bool do_rel_wr = ((req->cmd_flags & REQ_FUA) || + (req->cmd_flags & REQ_META)) && + (rq_data_dir(req) == WRITE) && + REL_WRITES_SUPPORTED(card); + mmc_claim_host(card->host); do { @@ -374,12 +438,14 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req) if (disable_multi && brq.data.blocks > 1) brq.data.blocks = 1; - if (brq.data.blocks > 1) { + if (brq.data.blocks > 1 || do_rel_wr) { /* SPI multiblock writes terminate using a special - * token, not a STOP_TRANSMISSION request. + * token, not a STOP_TRANSMISSION request. Reliable + * writes use SET_BLOCK_COUNT and do not use a + * STOP_TRANSMISSION request either. */ - if (!mmc_host_is_spi(card->host) - || rq_data_dir(req) == READ) + if ((!mmc_host_is_spi(card->host) && !do_rel_wr) || + rq_data_dir(req) == READ) brq.mrq.stop = &brq.stop; readcmd = MMC_READ_MULTIPLE_BLOCK; writecmd = MMC_WRITE_MULTIPLE_BLOCK; @@ -396,6 +462,9 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *req) brq.data.flags |= MMC_DATA_WRITE; } + if (do_rel_wr && mmc_apply_rel_rw(&brq, card, req)) + goto cmd_err; + mmc_set_data_timeout(&brq.data, card); brq.data.sg = mq->sg; @@ -565,6 +634,8 @@ static int mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req) return mmc_blk_issue_secdiscard_rq(mq, req); else return mmc_blk_issue_discard_rq(mq, req); + } else if (req->cmd_flags & REQ_FLUSH) { + return mmc_blk_issue_flush(mq, req); } else { return mmc_blk_issue_rw_rq(mq, req); } @@ -622,6 +693,8 @@ static struct mmc_blk_data *mmc_blk_alloc(struct mmc_card *card) md->disk->queue = md->queue.queue; md->disk->driverfs_dev = &card->dev; set_disk_ro(md->disk, md->read_only); + if (REL_WRITES_SUPPORTED(card)) + blk_queue_flush(md->queue.queue, REQ_FLUSH | REQ_FUA); /* * As discussed on lkml, GENHD_FL_REMOVABLE should: diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c index 2d48800..caba751 100644 --- a/drivers/mmc/core/mmc.c +++ b/drivers/mmc/core/mmc.c @@ -300,6 +300,8 @@ static int mmc_read_ext_csd(struct mmc_card *card) ext_csd[EXT_CSD_ERASE_TIMEOUT_MULT]; card->ext_csd.hc_erase_size = ext_csd[EXT_CSD_HC_ERASE_GRP_SIZE] << 10; + + card->ext_csd.rel_sectors = ext_csd[EXT_CSD_REL_WR_SEC_C]; } if (card->ext_csd.rev >= 4) { @@ -351,6 +353,9 @@ static int mmc_read_ext_csd(struct mmc_card *card) ext_csd[EXT_CSD_TRIM_MULT]; } + if (card->ext_csd.rev >= 5) + card->ext_csd.rel_param = ext_csd[EXT_CSD_WR_REL_PARAM]; + if (ext_csd[EXT_CSD_ERASED_MEM_CONT]) card->erased_byte = 0xFF; else diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h index 557b732..c4e96fa 100644 --- a/include/linux/mmc/card.h +++ b/include/linux/mmc/card.h @@ -45,6 +45,8 @@ struct mmc_ext_csd { u8 rev; u8 erase_group_def; u8 sec_feature_support; + u8 rel_sectors; + u8 rel_param; u8 bootconfig; unsigned int sa_timeout; /* Units: 100ns */ unsigned int hs_max_dtr; diff --git a/include/linux/mmc/mmc.h b/include/linux/mmc/mmc.h index b5ec88f..390aa6e 100644 --- a/include/linux/mmc/mmc.h +++ b/include/linux/mmc/mmc.h @@ -255,6 +255,7 @@ struct _mmc_csd { #define EXT_CSD_PARTITION_ATTRIBUTE 156 /* R/W */ #define EXT_CSD_PARTITION_SUPPORT 160 /* RO */ +#define EXT_CSD_WR_REL_PARAM 166 /* RO */ #define EXT_CSD_ERASE_GROUP_DEF 175 /* R/W */ #define EXT_CSD_BOOT_CONFIG 179 /* R/W */ #define EXT_CSD_ERASED_MEM_CONT 181 /* RO */ @@ -265,6 +266,7 @@ struct _mmc_csd { #define EXT_CSD_CARD_TYPE 196 /* RO */ #define EXT_CSD_SEC_CNT 212 /* RO, 4 bytes */ #define EXT_CSD_S_A_TIMEOUT 217 /* RO */ +#define EXT_CSD_REL_WR_SEC_C 222 /* RO */ #define EXT_CSD_HC_WP_GRP_SIZE 221 /* RO */ #define EXT_CSD_ERASE_TIMEOUT_MULT 223 /* RO */ #define EXT_CSD_HC_ERASE_GRP_SIZE 224 /* RO */ @@ -277,6 +279,8 @@ struct _mmc_csd { * EXT_CSD field definitions */ +#define EXT_CSD_WR_REL_PARAM_EN (1<<2) + #define EXT_CSD_CMD_SET_NORMAL (1<<0) #define EXT_CSD_CMD_SET_SECURE (1<<1) #define EXT_CSD_CMD_SET_CPSECURE (1<<2) -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] MMC: Reliable write support. 2011-03-31 23:40 ` [PATCH] " Andrei Warkentin @ 2011-04-01 0:47 ` Chris Ball 0 siblings, 0 replies; 14+ messages in thread From: Chris Ball @ 2011-04-01 0:47 UTC (permalink / raw) To: Andrei Warkentin; +Cc: linux-mmc Hi, On Thu, Mar 31 2011, Andrei Warkentin wrote: > Allows reliable writes to be used for MMC writes. Reliable writes are used > to service write REQ_FUA/REQ_META requests. Handles both the legacy and > the enhanced reliable write support in MMC cards. > > Reviewed-by: Arnd Bergmann <arnd@arndb.de> > Signed-off-by: Andrei Warkentin <andreiw@motorola.com> Thanks, Andrei, pushed to mmc-next for .40. - Chris. -- Chris Ball <cjb@laptop.org> <http://printf.net/> One Laptop Per Child ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2011-04-01 0:42 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-03-24 21:22 Reliable write support Andrei Warkentin 2011-03-24 21:22 ` [comments] MMC: " Andrei Warkentin 2011-03-25 15:14 ` Arnd Bergmann 2011-03-26 7:17 ` Andrei Warkentin 2011-03-26 7:22 ` Andrei Warkentin 2011-03-29 0:50 ` Andrei Warkentin 2011-03-29 7:01 ` Arnd Bergmann 2011-03-29 22:44 ` Andrei Warkentin 2011-03-30 12:05 ` Arnd Bergmann 2011-03-30 22:38 ` Chris Ball 2011-03-31 20:39 ` Andrei Warkentin 2011-03-31 20:58 ` Chris Ball 2011-03-31 23:40 ` [PATCH] " Andrei Warkentin 2011-04-01 0:47 ` Chris Ball
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox