From: ezequiel.garcia@free-electrons.com (Ezequiel Garcia)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v3 12/28] mtd: nand: pxa3xx: Use a completion to signal device ready
Date: Tue, 5 Nov 2013 21:28:00 -0300
Message-ID: <20131106002759.GF11759@localhost>
In-Reply-To: <20131105195136.GS20061@ld-irv-0074.broadcom.com>

On Tue, Nov 05, 2013 at 11:51:36AM -0800, Brian Norris wrote:
> On Tue, Nov 05, 2013 at 09:55:19AM -0300, Ezequiel Garcia wrote:
> > Apparently, the expected behavior of the waitfunc() NAND chip call
> > is to wait for the device to be READY (this is a standard chip line).
> > However, the current implementation does almost nothing, which opens
> > a possibility to issue a command to a non-ready device.
> > 
> > Fix this by adding a new completion to wait for the ready event to arrive.
> > 
> > Because the "is ready" flag is cleared from the controller status
> > register, that state needs to be stored in the driver, and because the
> > field is accessed from an interrupt handler, it needs to be of an
> > atomic type.
> > 
> > Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
> > ---
> >  drivers/mtd/nand/pxa3xx_nand.c | 45 +++++++++++++++++++++++++++++-------------
> >  1 file changed, 31 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/mtd/nand/pxa3xx_nand.c b/drivers/mtd/nand/pxa3xx_nand.c
> > index 2fb0f38..e198c94 100644
> > --- a/drivers/mtd/nand/pxa3xx_nand.c
> > +++ b/drivers/mtd/nand/pxa3xx_nand.c
> > @@ -35,6 +35,7 @@
> >  
> >  #include <linux/platform_data/mtd-nand-pxa3xx.h>
> >  
> > +#define NAND_DEV_READY_TIMEOUT  50
> >  #define	CHIP_DELAY_TIMEOUT	(2 * HZ/10)
> >  #define NAND_STOP_DELAY		(2 * HZ/50)
> >  #define PAGE_CHUNK_SIZE		(2048)
> > @@ -166,7 +167,7 @@ struct pxa3xx_nand_info {
> >  	struct clk		*clk;
> >  	void __iomem		*mmio_base;
> >  	unsigned long		mmio_phys;
> > -	struct completion	cmd_complete;
> > +	struct completion	cmd_complete, dev_ready;
> 
> I still kinda think this could be consolidated into one completion,
> under which cmdfunc() performs all the necessary waiting (for both the
> "command complete" and "device ready" signals), but I think this is not
> a big advantage right now, considering your code is not too complex.
> 

Quite frankly, I'd love such a solution (given how much complexity this
adds), but I can't see how to do it. Maybe by restarting the command-done
completion at the end of cmdfunc() so that waitfunc() could re-use it?

> >  
> >  	unsigned int 		buf_start;
> >  	unsigned int		buf_count;
> > @@ -196,7 +197,13 @@ struct pxa3xx_nand_info {
> >  	int			use_ecc;	/* use HW ECC ? */
> >  	int			use_dma;	/* use DMA ? */
> >  	int			use_spare;	/* use spare ? */
> > -	int			is_ready;
> > +
> > +	/*
> > +	 * The is_ready flag is accessed from several places,
> > +	 * including an interrupt handler. We need an atomic
> > +	 * type to avoid races.
> > +	 */
> > +	atomic_t		is_ready;
> 
> I believe your handling of this 'is_ready' bit is a little unwise, as
> you are actually creating extra concurrency that is unneeded. I'll
> summarize what you're doing with this 'is_ready' field:
> 
>   cmdfunc() -> sets info->is_ready=0 for appropriate commands
>             -> kicks off the hardware
> 
> The following two sequences may then occur concurrently:
> 
>   (1) pxa3xx_nand_irq -> ready interrupt occurs
>                       -> set info->is_ready=1
>                       -> signal 'dev_ready' completion
> 
>   (2) waitfunc -> check info->is_ready, if it is 0...
>                   |_ ... wait for the dev_ready completion
> 
> Instead of setting info->is_ready=1 under (1), you could set it in (2),
> after the completion (or timeout). This avoids the concurrency, since
> cmdfunc() and waitfunc() are sequential. This also avoids a benign race
> in which you may not even call wait_for_completion() in (2) in the
> following scenario:
> 
>   * Suppose waitfunc is delayed a long time
>   * The IRQ handler...
>     - receives the 'ready' interrupt
>     - sets info->is_ready=1
>     - calls complete(&info->dev_ready)
>   * waitfunc() finally executes
>     - because info->is_ready==1, it skips the wait_for_completion...(),
>       leaving your completion unbalanced
> 

Right.
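
To make that scenario concrete, here is a minimal userspace sketch. It is
not driver code: it reduces a completion to its event counter, and every
model_* name is made up for illustration. It replays the sequence you
describe and shows that one complete() is left unconsumed.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only: a completion reduced to its event counter.
 * All model_* names are hypothetical and are not kernel APIs.
 */
struct model_completion {
	unsigned int done;	/* complete() calls not yet consumed by a waiter */
};

struct model_info {
	struct model_completion dev_ready;
	int is_ready;
};

/* IRQ path as in this patch: set the flag, then signal the completion. */
static void model_irq_ready(struct model_info *info)
{
	info->is_ready = 1;
	info->dev_ready.done++;
}

/*
 * waitfunc() path as in this patch. Returns true if it consumed an event.
 * A real waiter would block until done > 0; the model only checks the counter.
 */
static bool model_waitfunc(struct model_info *info)
{
	if (info->is_ready)
		return false;		/* wait skipped, nothing consumed */
	if (info->dev_ready.done == 0)
		return false;		/* the driver would block here */
	info->dev_ready.done--;
	return true;
}

/* Replays the delayed-waitfunc scenario; returns the leftover event count. */
unsigned int model_delayed_waitfunc(void)
{
	struct model_info info = { .dev_ready = { .done = 0 }, .is_ready = 0 };
	bool consumed;

	model_irq_ready(&info);			/* 'ready' interrupt arrives first */
	consumed = model_waitfunc(&info);	/* waitfunc() runs late */

	assert(!consumed);			/* the wait was skipped */
	return info.dev_ready.done;
}
```

In this model, model_delayed_waitfunc() returns 1: the event signalled by
the IRQ path is still pending, so a later wait on the same completion
would return immediately unless the completion is reset first.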

> This influences a comment I have below regarding your re-initialization
> of the completion struct.
> 
> >  
> >  	unsigned int		fifo_size;	/* max. data size in the FIFO */
> >  	unsigned int		data_size;	/* data to be read from FIFO */
> > @@ -478,7 +485,7 @@ static void start_data_dma(struct pxa3xx_nand_info *info)
> >  static irqreturn_t pxa3xx_nand_irq(int irq, void *devid)
> >  {
> >  	struct pxa3xx_nand_info *info = devid;
> > -	unsigned int status, is_completed = 0;
> > +	unsigned int status, is_completed = 0, is_ready = 0;
> >  	unsigned int ready, cmd_done;
> >  
> >  	if (info->cs == 0) {
> > @@ -514,8 +521,9 @@ static irqreturn_t pxa3xx_nand_irq(int irq, void *devid)
> >  		is_completed = 1;
> >  	}
> >  	if (status & ready) {
> > -		info->is_ready = 1;
> > +		atomic_set(&info->is_ready, 1);
> 
> According to my suggestions, I don't think you need to set
> info->is_ready=1 here.
> 

Well, I'm not sure there's any other place where this can be done, as it's
the only place where the status register is read. After this, the register
is cleared and the information is lost. (On the other hand, see below.)

I guess that if we could simply read the status at any time, this would be
much simpler (iow, we would read the status register in waitfunc).
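
A small userspace sketch of why the handler has to latch the bit,
assuming the status register is write-1-to-clear (which is what the
nand_writel(info, NDSR, status) acknowledgement in the handler suggests).
The bit positions and all model_* names are hypothetical, not the real
NDSR layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this model. */
#define MODEL_SR_CMD_DONE	(1u << 0)
#define MODEL_SR_READY		(1u << 1)

struct model_ctrl {
	uint32_t status;	/* models a write-1-to-clear status register */
	int ready_latched;	/* driver-side copy of the READY event */
};

static uint32_t model_read_status(const struct model_ctrl *c)
{
	return c->status;
}

/* Writing 1 to a bit clears it; writing 0 leaves it unchanged. */
static void model_write_status(struct model_ctrl *c, uint32_t val)
{
	c->status &= ~val;
}

/* IRQ handler: read once, latch what matters, then acknowledge. */
static void model_irq(struct model_ctrl *c)
{
	uint32_t status = model_read_status(c);

	if (status & MODEL_SR_READY)
		c->ready_latched = 1;

	model_write_status(c, status);	/* the ack wipes the READY bit */
}

/* Returns the latched flag after the hardware bit has been cleared. */
int model_ready_survives_ack(void)
{
	struct model_ctrl c = {
		.status = MODEL_SR_CMD_DONE | MODEL_SR_READY,
		.ready_latched = 0,
	};

	model_irq(&c);
	assert(model_read_status(&c) == 0);	/* nothing left to read later */
	return c.ready_latched;
}
```

After model_irq() the register reads 0, so anything that runs later (such
as waitfunc) can only learn about READY from the latched copy or from a
completion signalled by the handler.
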

> >  		info->state = STATE_READY;
> > +		is_ready = 1;
> >  	}
> >  
> >  	if (status & NDSR_WRCMDREQ) {
> > @@ -544,6 +552,8 @@ static irqreturn_t pxa3xx_nand_irq(int irq, void *devid)
> >  	nand_writel(info, NDSR, status);
> >  	if (is_completed)
> >  		complete(&info->cmd_complete);
> > +	if (is_ready)
> > +		complete(&info->dev_ready);
> >  NORMAL_IRQ_EXIT:
> >  	return IRQ_HANDLED;
> >  }
> > @@ -574,7 +584,6 @@ static int prepare_command_pool(struct pxa3xx_nand_info *info, int command,
> >  	info->oob_size		= 0;
> >  	info->use_ecc		= 0;
> >  	info->use_spare		= 1;
> > -	info->is_ready		= 0;
> >  	info->retcode		= ERR_NONE;
> >  	if (info->cs != 0)
> >  		info->ndcb0 = NDCB0_CSEL;
> > @@ -747,6 +756,8 @@ static void pxa3xx_nand_cmdfunc(struct mtd_info *mtd, unsigned command,
> >  	exec_cmd = prepare_command_pool(info, command, column, page_addr);
> >  	if (exec_cmd) {
> >  		init_completion(&info->cmd_complete);
> > +		init_completion(&info->dev_ready);
> 
> Do you really need to init the completions each time you run a command?
> AIUI, the only reason you would need to do this is if you aren't
> matching up your calls to complete() and wait_for_completion*()
> properly, so that you simply dodge the issue and reset the completion
> count each time. This might be a result of the complexity of your
> 2-completion signalling design.
> 

Exactly, I need to reset any previous state.

> > +		atomic_set(&info->is_ready, 0);
> >  		pxa3xx_nand_start(info);
> >  
> >  		ret = wait_for_completion_timeout(&info->cmd_complete,
> > @@ -859,21 +870,27 @@ static int pxa3xx_nand_waitfunc(struct mtd_info *mtd, struct nand_chip *this)
> >  {
> >  	struct pxa3xx_nand_host *host = mtd->priv;
> >  	struct pxa3xx_nand_info *info = host->info_data;
> > +	int ret;
> > +
> > +	/* Need to wait? */
> > +	if (!atomic_read(&info->is_ready)) {
> 
> This read of info->is_ready will no longer need to be atomic, because
> you never modify it from the IRQ context--only from the cmdfunc() and
> waitfunc().
> 

I see.

> > +		ret = wait_for_completion_timeout(&info->dev_ready,
> > +				CHIP_DELAY_TIMEOUT);
> 
> I think you can just do a (non-atomic) info->is_ready=1 here.
> 

The problem is: not every cmdfunc() is followed by a waitfunc(), right?
And even if it were, cmdfunc() can be accessed from other places
(such as this same driver), so how do you guarantee that cmdfunc() and
waitfunc() are called in pairs?

Or maybe you're right and this is not an issue? I'm doing some tests
now and will report back later.
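
For reference while testing, here is a userspace sketch of the variant
you suggest, under the assumption that cmdfunc() keeps re-initializing
the completion as it does in this patch. The completion is again reduced
to a counter and all model_* names are hypothetical. It only checks the
bookkeeping for an unpaired cmdfunc(); it does not model timing, so it
says nothing about a READY interrupt from the first command arriving
after the second cmdfunc() has already reset the completion.

```c
#include <assert.h>
#include <stdbool.h>

struct model_info {
	unsigned int dev_ready_done;	/* event counter of the dev_ready completion */
	int is_ready;			/* touched only by cmdfunc()/waitfunc() */
};

/* cmdfunc(): reset the completion and the flag, then start the command. */
static void model_cmdfunc(struct model_info *info)
{
	info->dev_ready_done = 0;	/* stands in for init_completion() */
	info->is_ready = 0;
}

/* IRQ handler: only signals; it no longer writes is_ready. */
static void model_irq_ready(struct model_info *info)
{
	info->dev_ready_done++;
}

/*
 * waitfunc(): consume the event if needed, then mark the device ready.
 * Returns false where the driver would block or time out.
 */
static bool model_waitfunc(struct model_info *info)
{
	if (!info->is_ready) {
		if (info->dev_ready_done == 0)
			return false;
		info->dev_ready_done--;
		info->is_ready = 1;
	}
	return true;
}

/* cmdfunc() without waitfunc(), then a paired call; returns leftover events. */
unsigned int model_unpaired_cmdfunc(void)
{
	struct model_info info = { .dev_ready_done = 0, .is_ready = 0 };
	bool ok;

	model_cmdfunc(&info);		/* first command ... */
	model_irq_ready(&info);		/* ... becomes ready, but nobody waits */

	model_cmdfunc(&info);		/* second command discards the stale event */
	model_irq_ready(&info);
	ok = model_waitfunc(&info);	/* paired wait consumes exactly one event */

	assert(ok && info.is_ready == 1);
	return info.dev_ready_done;
}
```

In this model, model_unpaired_cmdfunc() returns 0: the reset in cmdfunc()
absorbs the event left behind by the unpaired call. Whether that holds in
the real driver is exactly what the tests need to show.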

> > +		if (!ret) {
> > +			dev_err(&info->pdev->dev, "Ready time out!!!\n");
> > +			return NAND_STATUS_FAIL;
> > +		}
> > +	}
> >  
> >  	/* pxa3xx_nand_send_command has waited for command complete */
> >  	if (this->state == FL_WRITING || this->state == FL_ERASING) {
> >  		if (info->retcode == ERR_NONE)
> >  			return 0;
> > -		else {
> > -			/*
> > -			 * any error make it return 0x01 which will tell
> > -			 * the caller the erase and write fail
> > -			 */
> > -			return 0x01;
> > -		}
> > +		else
> > +			return NAND_STATUS_FAIL;
> >  	}
> >  
> > -	return 0;
> > +	return NAND_STATUS_READY;
> >  }
> >  
> >  static int pxa3xx_nand_config_flash(struct pxa3xx_nand_info *info,
> > @@ -1026,7 +1043,7 @@ static int pxa3xx_nand_sensing(struct pxa3xx_nand_info *info)
> >  		return ret;
> >  
> >  	chip->cmdfunc(mtd, NAND_CMD_RESET, 0, 0);
> > -	if (info->is_ready)
> > +	if (atomic_read(&info->is_ready))
> 
> Does this need to wait on the dev_ready completion? I'm not sure I can
> guarantee there is no race on info->is_ready here, but then that means
> I'm not sure the code is correct even with the atomic read (which I
> don't think will be necessary, according to my suggestion above).
> 

Hm... yes, this should probably wait on the completion instead of
accessing is_ready.
-- 
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com
