From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755138AbZERPbT (ORCPT ); Mon, 18 May 2009 11:31:19 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751961AbZERPbK (ORCPT ); Mon, 18 May 2009 11:31:10 -0400 Received: from cs20.apochromatic.org ([204.152.189.161]:63220 "EHLO cs20.apochromatic.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751422AbZERPbJ (ORCPT ); Mon, 18 May 2009 11:31:09 -0400 X-Greylist: delayed 667 seconds by postgrey-1.27 at vger.kernel.org; Mon, 18 May 2009 11:31:09 EDT Date: Mon, 18 May 2009 16:20:01 +0100 From: Matt Fleming To: Wolfgang =?iso-8859-1?Q?M=FCes?= Cc: Pierre Ossman , Andrew Morton , David Brownell , Mike Frysinger , linux-kernel@vger.kernel.org Subject: Re: [PATCH] mmc_spi: do propper retry managment in the block layer - rewritten Message-ID: <20090518152001.GA29472@console-pimps.org> References: <200905181521.38959.wolfgang.mues@auerswald.de> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <200905181521.38959.wolfgang.mues@auerswald.de> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, May 18, 2009 at 02:21:38PM +0100, Wolfgang Mües wrote: > From: Wolfgang Muees > > o This patch adds a propper retry managment for reading > and writing data blocks for mmc and mmc_spi. Blocks are > retransmitted if the cause of failure might be a transmission > fault. After a permanent failure, we try to read all other > pending sectors, but terminate on write. > This patch was tested with induced transmission errors > by ESD pulses (and survived). > > Signed-off-by: Wolfgang Muees > When you say "proper retry management", what problem are you having with the current code? Your changelog tells me what you changed but not why you changed it. > --- > diff -uprN 2_6_29_rc7_patch_EILSEQ/drivers/mmc/card/block.c 2_6_29_rc7_patch_retries/drivers/mmc/card/block.c > --- 2_6_29_rc7_patch_EILSEQ/drivers/mmc/card/block.c 2009-03-09 17:10:55.000000000 +0100 > +++ 2_6_29_rc7_patch_retries/drivers/mmc/card/block.c 2009-05-18 15:00:41.000000000 +0200 It's always a good idea to make your changes against the latest kernel version to make sure that your patches apply cleanly. They applied OK this time, but they might not in future. > @@ -230,12 +230,14 @@ static int mmc_blk_issue_rq(struct mmc_q > struct mmc_card *card = md->queue.card; > struct mmc_blk_request brq; > int ret = 1, disable_multi = 0; > + int retries = 3; > > mmc_claim_host(card->host); > > do { > struct mmc_command cmd; > u32 readcmd, writecmd, status = 0; > + int error = 0; > > memset(&brq, 0, sizeof(struct mmc_blk_request)); > brq.mrq.cmd = &brq.cmd; > @@ -251,10 +253,8 @@ static int mmc_blk_issue_rq(struct mmc_q > brq.stop.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC; > brq.data.blocks = req->nr_sectors; > > - /* > - * After a read error, we redo the request one sector at a time > - * in order to accurately determine which sectors can be read > - * successfully. > + /* After an transmission error, we switch to single block > + * transfer until the problem is solved or we fail. > */ Why did you change this comment? They seem to say roughly the same things to me, only the first one is more descriptive. I think you really need to combine the two, /* * After a transmission error, we redo the request one sector * at a time in order to accurately determine which sectors * can be transmitted successfully. We keep retrying until * the transmission succeeds or we exceed our retry count. */ > if (disable_multi && brq.data.blocks > 1) > brq.data.blocks = 1; > @@ -262,10 +262,12 @@ static int mmc_blk_issue_rq(struct mmc_q > if (brq.data.blocks > 1) { > /* SPI multiblock writes terminate using a special > * token, not a STOP_TRANSMISSION request. > + * Here, this request is set for ALL types of > + * hosts, so that we can use it in the lower > + * layers if the data transfer stage has failed > + * and the card is not able to accept the token. > */ > - if (!mmc_host_is_spi(card->host) > - || rq_data_dir(req) == READ) > - brq.mrq.stop = &brq.stop; > + brq.mrq.stop = &brq.stop; > readcmd = MMC_READ_MULTIPLE_BLOCK; > writecmd = MMC_WRITE_MULTIPLE_BLOCK; > } else { Hmm.. is it OK to abuse the STOP_TRANSMISSION request like this? Does SPI multiblock write still terminate with this patch? > @@ -311,52 +313,71 @@ static int mmc_blk_issue_rq(struct mmc_q > mmc_wait_for_req(card->host, &brq.mrq); > > mmc_queue_bounce_post(mq); > - > - /* > - * Check for errors here, but don't jump to cmd_err > - * until later as we need to wait for the card to leave > - * programming mode even when things go wrong. > - */ > - if (brq.cmd.error || brq.data.error || brq.stop.error) { > - if (brq.data.blocks > 1 && rq_data_dir(req) == READ) { > - /* Redo read one sector at a time */ > - printk(KERN_WARNING "%s: retrying using single " > - "block read\n", req->rq_disk->disk_name); > - disable_multi = 1; > - continue; > - } > - status = get_card_status(card, req); > - } > - > +#if 0 > + printk(KERN_INFO "%s transfer sector %d count %d\n", > + (rq_data_dir(req) == READ) ? "Read" : "Write", > + (int)req->sector, (int)brq.data.blocks); > +#endif This should probably be using pr_debug(). That way you can remove the #if 0. Or delete it altogether. [...] > diff -uprN 2_6_29_rc7_patch_EILSEQ/drivers/mmc/host/mmc_spi.c 2_6_29_rc7_patch_retries/drivers/mmc/host/mmc_spi.c > --- 2_6_29_rc7_patch_EILSEQ/drivers/mmc/host/mmc_spi.c 2009-05-14 12:49:42.000000000 +0200 > +++ 2_6_29_rc7_patch_retries/drivers/mmc/host/mmc_spi.c 2009-05-14 15:01:15.000000000 +0200 > @@ -1087,7 +1089,15 @@ static void mmc_spi_request(struct mmc_h > status = mmc_spi_command_send(host, mrq, mrq->cmd, mrq->data != NULL); > if (status == 0 && mrq->data) { > mmc_spi_data_do(host, mrq->cmd, mrq->data, mrq->data->blksz); > - if (mrq->stop) > + /* filter-out the stop command for multiblock writes, > + * only if the data stage has no transmission error. > + * If the data stage has a transmission error, send the > + * STOP command because there is a great chance that the > + * SPI stop token was not accepted by the card. > + */ > + if (mrq->stop && > + ((mrq->data->flags & MMC_DATA_READ) > + || mrq->data->error)) > status = mmc_spi_command_send(host, mrq, mrq->stop, 0); > else > mmc_cs_off(host); Again, does this still work for SPI? Have you tested it?