* [PATCH] gpio/omap: fix invalid context restore of gpio bank-0
@ 2012-06-29 17:22 Jon Hunter
2012-06-29 20:27 ` Franky Lin
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Jon Hunter @ 2012-06-29 17:22 UTC (permalink / raw)
To: linux-arm-kernel
Currently the gpio _runtime_resume/suspend functions call the
get_context_loss_count() platform function if the function is populated for
a gpio bank. This function is used to determine if the gpio bank logic state
needs to be restored due to a power transition. This function will be populated
for all banks, but it should only be called for banks that have the
"loses_context" variable set. It is pointless to call this if loses_context is
false as we know the context will never be lost and will not need restoring.

For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
never lose context. We found that get_context_loss_count() was being called
for bank-0 during the probe and returning 1 instead of 0, indicating that the
context had been lost. This was causing the context restore function to be
called at probe time for this bank and, because the context had never been
saved, it was restoring an invalid state. This ultimately resulted in a crash [1].

There are multiple bugs here that need to be addressed ...

1. Why does the always-on power domain return a context loss count of 1? This
   needs to be fixed in the power domain code. However, the gpio driver should
   not assume the loss count is 0 to begin with.
2. The OMAP gpio driver should never be calling get_context_loss_count() for a
   gpio bank in an always-on domain. This is pointless and adds unnecessary
   overhead.
3. The OMAP gpio driver assumes that the initial power domain context loss count
   will be 0 at the time the gpio driver is probed. However, it could be
   possible that this is not the case and an invalid context restore could be
   performed during the probe. To avoid this otherwise only populated the
   get_context_loss_count() function pointer after the initial call to
   pm_runtime_get() has occurred. This will ensure that the first
   pm_runtime_put() initialises the loss count correctly.

This patch addresses issues 2 and 3 above.

[1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
Cc: Franky Lin <frankyl@broadcom.com>
Reported-by: Franky Lin <frankyl@broadcom.com>
Signed-off-by: Jon Hunter <jon-hunter@ti.com>
---
drivers/gpio/gpio-omap.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
index c4ed172..f13fc9c 100644
--- a/drivers/gpio/gpio-omap.c
+++ b/drivers/gpio/gpio-omap.c
@@ -1081,7 +1081,6 @@ static int __devinit omap_gpio_probe(struct platform_device *pdev)
 	bank->is_mpuio = pdata->is_mpuio;
 	bank->non_wakeup_gpios = pdata->non_wakeup_gpios;
 	bank->loses_context = pdata->loses_context;
-	bank->get_context_loss_count = pdata->get_context_loss_count;
 	bank->regs = pdata->regs;
 #ifdef CONFIG_OF_GPIO
 	bank->chip.of_node = of_node_get(node);
@@ -1135,6 +1134,9 @@ static int __devinit omap_gpio_probe(struct platform_device *pdev)
 	omap_gpio_chip_init(bank);
 	omap_gpio_show_rev(bank);
 
+	if (bank->loses_context)
+		bank->get_context_loss_count = pdata->get_context_loss_count;
+
 	pm_runtime_put(bank->dev);
 
 	list_add_tail(&bank->node, &omap_gpio_list);
--
1.7.9.5
* [PATCH] gpio/omap: fix invalid context restore of gpio bank-0
2012-06-29 17:22 [PATCH] gpio/omap: fix invalid context restore of gpio bank-0 Jon Hunter
@ 2012-06-29 20:27 ` Franky Lin
2012-06-30 4:18 ` Shilimkar, Santosh
2012-07-02 18:07 ` Kevin Hilman
2 siblings, 0 replies; 10+ messages in thread
From: Franky Lin @ 2012-06-29 20:27 UTC (permalink / raw)
To: linux-arm-kernel
On 06/29/2012 10:22 AM, Jon Hunter wrote:
> Currently the gpio _runtime_resume/suspend functions are calling the
> get_context_loss_count() platform function if the function is populated for
> a gpio bank. This function is used to determine if the gpio bank logic state
> needs to be restored due to a power transition. This function will be populated
> for all banks, but it should only be called for banks that have the
> "loses_context" variable set. It is pointless to call this if loses_context is
> false as we know the context will never be lost and will not need restoring.
>
> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
> never lose context. We found that the get_context_loss_count() was being called
> for bank-0 during the probe and returning 1 instead of 0 indicating that the
> context had been lost. This was causing the context restore function to be
> called at probe time for this bank and because the context had never been saved,
> was restoring an invalid state. This ultimately resulted in a crash [1].
>
> There are multiple bugs here that need to be addressed ...
>
> 1. Why the always-on power domain returns a context loss count of 1? This needs
> to be fixed in the power domain code. However, the gpio driver should not
> assume the loss count is 0 to begin with.
> 2. The omap gpio driver should never be calling get_context_loss_count for a
> gpio bank in a always-on domain. This is pointless and adds unneccessary
> overhead.
> 3. The OMAP gpio driver assumes that the initial power domain context loss count
> will be 0 at the time the gpio driver is probed. However, it could be
> possible that this is not the case and an invalid context restore could be
> performed during the probe. To avoid this otherwise only populated the
> get_context_loss_count() function pointer after the initial call to
> pm_runtime_get() has occurred. This will ensure that the first
> pm_runtime_put() initialised the loss count correctly.
>
> This patch addresses issues 2 and 3 above.
>
> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>
> Cc: Grant Likely <grant.likely@secretlab.ca>
> Cc: Linus Walleij <linus.walleij@stericsson.com>
> Cc: Kevin Hilman <khilman@ti.com>
> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
> Cc: Franky Lin <frankyl@broadcom.com>
>
> Reported-by: Franky Lin <frankyl@broadcom.com>
> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
> ---
Tested-by: Franky Lin <frankyl@broadcom.com>
* [PATCH] gpio/omap: fix invalid context restore of gpio bank-0
2012-06-29 17:22 [PATCH] gpio/omap: fix invalid context restore of gpio bank-0 Jon Hunter
2012-06-29 20:27 ` Franky Lin
@ 2012-06-30 4:18 ` Shilimkar, Santosh
2012-07-01 8:45 ` Tony Lindgren
2012-07-02 18:07 ` Kevin Hilman
2 siblings, 1 reply; 10+ messages in thread
From: Shilimkar, Santosh @ 2012-06-30 4:18 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Jun 29, 2012 at 10:52 PM, Jon Hunter <jon-hunter@ti.com> wrote:
> Currently the gpio _runtime_resume/suspend functions are calling the
> get_context_loss_count() platform function if the function is populated for
> a gpio bank. This function is used to determine if the gpio bank logic state
> needs to be restored due to a power transition. This function will be populated
> for all banks, but it should only be called for banks that have the
> "loses_context" variable set. It is pointless to call this if loses_context is
> false as we know the context will never be lost and will not need restoring.
>
> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
> never lose context. We found that the get_context_loss_count() was being called
> for bank-0 during the probe and returning 1 instead of 0 indicating that the
> context had been lost. This was causing the context restore function to be
> called at probe time for this bank and because the context had never been saved,
> was restoring an invalid state. This ultimately resulted in a crash [1].
>
> There are multiple bugs here that need to be addressed ...
>
> 1. Why the always-on power domain returns a context loss count of 1? This needs
>    to be fixed in the power domain code. However, the gpio driver should not
>    assume the loss count is 0 to begin with.
Indeed. GPIO driver should not assume the value.
> 2. The omap gpio driver should never be calling get_context_loss_count for a
>    gpio bank in a always-on domain. This is pointless and adds unneccessary
>    overhead.
Makes sense too.
> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>    will be 0 at the time the gpio driver is probed. However, it could be
>    possible that this is not the case and an invalid context restore could be
>    performed during the probe. To avoid this otherwise only populated the
>    get_context_loss_count() function pointer after the initial call to
>    pm_runtime_get() has occurred. This will ensure that the first
>    pm_runtime_put() initialised the loss count correctly.
>
> This patch addresses issues 2 and 3 above.
>
> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>
> Cc: Grant Likely <grant.likely@secretlab.ca>
> Cc: Linus Walleij <linus.walleij@stericsson.com>
> Cc: Kevin Hilman <khilman@ti.com>
> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
> Cc: Franky Lin <frankyl@broadcom.com>
>
> Reported-by: Franky Lin <frankyl@broadcom.com>
> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
> ---
Thanks Jon for sorting this out. Patch looks good to me.
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
* [PATCH] gpio/omap: fix invalid context restore of gpio bank-0
2012-06-30 4:18 ` Shilimkar, Santosh
@ 2012-07-01 8:45 ` Tony Lindgren
2012-07-02 18:22 ` Jon Hunter
0 siblings, 1 reply; 10+ messages in thread
From: Tony Lindgren @ 2012-07-01 8:45 UTC (permalink / raw)
To: linux-arm-kernel
* Shilimkar, Santosh <santosh.shilimkar@ti.com> [120629 21:23]:
> On Fri, Jun 29, 2012 at 10:52 PM, Jon Hunter <jon-hunter@ti.com> wrote:
> > Currently the gpio _runtime_resume/suspend functions are calling the
> > get_context_loss_count() platform function if the function is populated for
> > a gpio bank. This function is used to determine if the gpio bank logic state
> > needs to be restored due to a power transition. This function will be populated
> > for all banks, but it should only be called for banks that have the
> > "loses_context" variable set. It is pointless to call this if loses_context is
> > false as we know the context will never be lost and will not need restoring.
> >
> > For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
> > never lose context. We found that the get_context_loss_count() was being called
> > for bank-0 during the probe and returning 1 instead of 0 indicating that the
> > context had been lost. This was causing the context restore function to be
> > called at probe time for this bank and because the context had never been saved,
> > was restoring an invalid state. This ultimately resulted in a crash [1].
> >
> > There are multiple bugs here that need to be addressed ...
> >
> > 1. Why the always-on power domain returns a context loss count of 1? This needs
> >    to be fixed in the power domain code. However, the gpio driver should not
> >    assume the loss count is 0 to begin with.
> Indeed. GPIO driver should not assume the value.
>
> > 2. The omap gpio driver should never be calling get_context_loss_count for a
> >    gpio bank in a always-on domain. This is pointless and adds unneccessary
> >    overhead.
> Make sense too.
>
> > 3. The OMAP gpio driver assumes that the initial power domain context loss count
> >    will be 0 at the time the gpio driver is probed. However, it could be
> >    possible that this is not the case and an invalid context restore could be
> >    performed during the probe. To avoid this otherwise only populated the
> >    get_context_loss_count() function pointer after the initial call to
> >    pm_runtime_get() has occurred. This will ensure that the first
> >    pm_runtime_put() initialised the loss count correctly.
> >
> > This patch addresses issues 2 and 3 above.
Should this one be Cc: stable? If this is a regression, then the
regression-causing commit should be mentioned.
Tony
* [PATCH] gpio/omap: fix invalid context restore of gpio bank-0
2012-07-01 8:45 ` Tony Lindgren
@ 2012-07-02 18:22 ` Jon Hunter
0 siblings, 0 replies; 10+ messages in thread
From: Jon Hunter @ 2012-07-02 18:22 UTC (permalink / raw)
To: linux-arm-kernel
On 07/01/2012 03:45 AM, Tony Lindgren wrote:
> * Shilimkar, Santosh <santosh.shilimkar@ti.com> [120629 21:23]:
>> On Fri, Jun 29, 2012 at 10:52 PM, Jon Hunter <jon-hunter@ti.com> wrote:
>>> Currently the gpio _runtime_resume/suspend functions are calling the
>>> get_context_loss_count() platform function if the function is populated for
>>> a gpio bank. This function is used to determine if the gpio bank logic state
>>> needs to be restored due to a power transition. This function will be populated
>>> for all banks, but it should only be called for banks that have the
>>> "loses_context" variable set. It is pointless to call this if loses_context is
>>> false as we know the context will never be lost and will not need restoring.
>>>
>>> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
>>> never lose context. We found that the get_context_loss_count() was being called
>>> for bank-0 during the probe and returning 1 instead of 0 indicating that the
>>> context had been lost. This was causing the context restore function to be
>>> called at probe time for this bank and because the context had never been saved,
>>> was restoring an invalid state. This ultimately resulted in a crash [1].
>>>
>>> There are multiple bugs here that need to be addressed ...
>>>
>>> 1. Why the always-on power domain returns a context loss count of 1? This needs
>>> to be fixed in the power domain code. However, the gpio driver should not
>>> assume the loss count is 0 to begin with.
>> Indeed. GPIO driver should not assume the value.
>>
>>> 2. The omap gpio driver should never be calling get_context_loss_count for a
>>> gpio bank in a always-on domain. This is pointless and adds unneccessary
>>> overhead.
>> Make sense too.
>>
>>> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>>> will be 0 at the time the gpio driver is probed. However, it could be
>>> possible that this is not the case and an invalid context restore could be
>>> performed during the probe. To avoid this otherwise only populated the
>>> get_context_loss_count() function pointer after the initial call to
>>> pm_runtime_get() has occurred. This will ensure that the first
>>> pm_runtime_put() initialised the loss count correctly.
>>>
>>> This patch addresses issues 2 and 3 above.
>
> Should this one be Cc: stable? If this is a regression, then the regression
> causing commit should be mentioned.
So that raises a good point. Looking at the stable branch (3.4.4), it is
missing 3 other fixes too [1][2][3]. So this particular problem would
not have been exposed there; however, I am wondering if there are other
problems lingering there.
This is a regression exposed by [2]. I should add that to the changelog.
Cheers
Jon
[1]
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b3c64bc30af67ed328a8d919e41160942b870451
[2]
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1b1287032df3a69d3ef9a486b444f4ffcca50d01
[3]
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=22770de11cb13e7120f973bca6c800de371a6717
* [PATCH] gpio/omap: fix invalid context restore of gpio bank-0
2012-06-29 17:22 [PATCH] gpio/omap: fix invalid context restore of gpio bank-0 Jon Hunter
2012-06-29 20:27 ` Franky Lin
2012-06-30 4:18 ` Shilimkar, Santosh
@ 2012-07-02 18:07 ` Kevin Hilman
2012-07-02 18:26 ` Jon Hunter
2 siblings, 1 reply; 10+ messages in thread
From: Kevin Hilman @ 2012-07-02 18:07 UTC (permalink / raw)
To: linux-arm-kernel
+ Neil Brown
Hi Jon,
Jon Hunter <jon-hunter@ti.com> writes:
> Currently the gpio _runtime_resume/suspend functions are calling the
> get_context_loss_count() platform function if the function is populated for
> a gpio bank. This function is used to determine if the gpio bank logic state
> needs to be restored due to a power transition. This function will be populated
> for all banks, but it should only be called for banks that have the
> "loses_context" variable set. It is pointless to call this if loses_context is
> false as we know the context will never be lost and will not need restoring.
>
> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
> never lose context. We found that the get_context_loss_count() was being called
> for bank-0 during the probe and returning 1 instead of 0 indicating that the
> context had been lost. This was causing the context restore function to be
> called at probe time for this bank and because the context had never been saved,
> was restoring an invalid state. This ultimately resulted in a crash [1].
>
> There are multiple bugs here that need to be addressed ...
>
> 1. Why the always-on power domain returns a context loss count of 1? This needs
> to be fixed in the power domain code. However, the gpio driver should not
> assume the loss count is 0 to begin with.
> 2. The omap gpio driver should never be calling get_context_loss_count for a
> gpio bank in a always-on domain. This is pointless and adds unneccessary
> overhead.
> 3. The OMAP gpio driver assumes that the initial power domain context loss count
> will be 0 at the time the gpio driver is probed. However, it could be
> possible that this is not the case and an invalid context restore could be
> performed during the probe. To avoid this otherwise only populated the
The 'To avoid this...' sentence here doesn't read well. Looks like you
need to:
s/otherwise//
s/populated/populate/
?
> get_context_loss_count() function pointer after the initial call to
> pm_runtime_get() has occurred. This will ensure that the first
> pm_runtime_put() initialised the loss count correctly.
>
> This patch addresses issues 2 and 3 above.
> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>
> Cc: Grant Likely <grant.likely@secretlab.ca>
> Cc: Linus Walleij <linus.walleij@stericsson.com>
> Cc: Kevin Hilman <khilman@ti.com>
> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
> Cc: Franky Lin <frankyl@broadcom.com>
>
> Reported-by: Franky Lin <frankyl@broadcom.com>
> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
Thanks for digging into this bug, Jon. The same bug was brought up by
Neil Brown (Cc'd) in a different thread.
Neil, it looks to me that this fix will address the problems you were
seeing as well. Care to test, and respond with your ack/tested-by if it
works for you? Thanks.
Kevin
> ---
> drivers/gpio/gpio-omap.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
> index c4ed172..f13fc9c 100644
> --- a/drivers/gpio/gpio-omap.c
> +++ b/drivers/gpio/gpio-omap.c
> @@ -1081,7 +1081,6 @@ static int __devinit omap_gpio_probe(struct platform_device *pdev)
> bank->is_mpuio = pdata->is_mpuio;
> bank->non_wakeup_gpios = pdata->non_wakeup_gpios;
> bank->loses_context = pdata->loses_context;
> - bank->get_context_loss_count = pdata->get_context_loss_count;
> bank->regs = pdata->regs;
> #ifdef CONFIG_OF_GPIO
> bank->chip.of_node = of_node_get(node);
> @@ -1135,6 +1134,9 @@ static int __devinit omap_gpio_probe(struct platform_device *pdev)
> omap_gpio_chip_init(bank);
> omap_gpio_show_rev(bank);
>
> + if (bank->loses_context)
> + bank->get_context_loss_count = pdata->get_context_loss_count;
> +
> pm_runtime_put(bank->dev);
>
> list_add_tail(&bank->node, &omap_gpio_list);
* [PATCH] gpio/omap: fix invalid context restore of gpio bank-0
2012-07-02 18:07 ` Kevin Hilman
@ 2012-07-02 18:26 ` Jon Hunter
2012-07-02 23:34 ` NeilBrown
0 siblings, 1 reply; 10+ messages in thread
From: Jon Hunter @ 2012-07-02 18:26 UTC (permalink / raw)
To: linux-arm-kernel
On 07/02/2012 01:07 PM, Kevin Hilman wrote:
> + Neil Brown
>
> Hi Jon,
>
> Jon Hunter <jon-hunter@ti.com> writes:
>
>> Currently the gpio _runtime_resume/suspend functions are calling the
>> get_context_loss_count() platform function if the function is populated for
>> a gpio bank. This function is used to determine if the gpio bank logic state
>> needs to be restored due to a power transition. This function will be populated
>> for all banks, but it should only be called for banks that have the
>> "loses_context" variable set. It is pointless to call this if loses_context is
>> false as we know the context will never be lost and will not need restoring.
>>
>> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
>> never lose context. We found that the get_context_loss_count() was being called
>> for bank-0 during the probe and returning 1 instead of 0 indicating that the
>> context had been lost. This was causing the context restore function to be
>> called at probe time for this bank and because the context had never been saved,
>> was restoring an invalid state. This ultimately resulted in a crash [1].
>>
>> There are multiple bugs here that need to be addressed ...
>>
>> 1. Why the always-on power domain returns a context loss count of 1? This needs
>> to be fixed in the power domain code. However, the gpio driver should not
>> assume the loss count is 0 to begin with.
>> 2. The omap gpio driver should never be calling get_context_loss_count for a
>> gpio bank in a always-on domain. This is pointless and adds unneccessary
>> overhead.
>> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>> will be 0 at the time the gpio driver is probed. However, it could be
>> possible that this is not the case and an invalid context restore could be
>> performed during the probe. To avoid this otherwise only populated the
>
> The 'To avoid this...' sentence here doesn't read well. Looks like you
> need to:
>
> s/otherwise//
Yes, I meant to have dropped "otherwise" here. Thanks!
> s/populated/populate/
Yes that too! I must have re-worded and screwed it up royally :-(
> ?
>
>> get_context_loss_count() function pointer after the initial call to
>> pm_runtime_get() has occurred. This will ensure that the first
>> pm_runtime_put() initialised the loss count correctly.
>>
>> This patch addresses issues 2 and 3 above.
>> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>>
>> Cc: Grant Likely <grant.likely@secretlab.ca>
>> Cc: Linus Walleij <linus.walleij@stericsson.com>
>> Cc: Kevin Hilman <khilman@ti.com>
>> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
>> Cc: Franky Lin <frankyl@broadcom.com>
>>
>> Reported-by: Franky Lin <frankyl@broadcom.com>
>> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
>
> Thanks for digging inot this bug Jon. The same bug was brought up by
> Neil Brown (Cc'd) in a different thread.
>
> Neil, it looks to me that this fix will address the problems you were
> seeing as well. Care to test, and respond with your ack/tested-by if it
> works for you? Thanks.
Neil, let me know your thoughts and, if you are OK with it, I can clean up
the changelog and re-send.
Cheers
Jon
* [PATCH] gpio/omap: fix invalid context restore of gpio bank-0
2012-07-02 18:26 ` Jon Hunter
@ 2012-07-02 23:34 ` NeilBrown
2012-07-03 0:05 ` Kevin Hilman
0 siblings, 1 reply; 10+ messages in thread
From: NeilBrown @ 2012-07-02 23:34 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 2 Jul 2012 13:26:38 -0500 Jon Hunter <jon-hunter@ti.com> wrote:
>
> On 07/02/2012 01:07 PM, Kevin Hilman wrote:
> > + Neil Brown
> >
> > Hi Jon,
> >
> > Jon Hunter <jon-hunter@ti.com> writes:
> >
> >> Currently the gpio _runtime_resume/suspend functions are calling the
> >> get_context_loss_count() platform function if the function is populated for
> >> a gpio bank. This function is used to determine if the gpio bank logic state
> >> needs to be restored due to a power transition. This function will be populated
> >> for all banks, but it should only be called for banks that have the
> >> "loses_context" variable set. It is pointless to call this if loses_context is
> >> false as we know the context will never be lost and will not need restoring.
> >>
> >> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
> >> never lose context. We found that the get_context_loss_count() was being called
> >> for bank-0 during the probe and returning 1 instead of 0 indicating that the
> >> context had been lost. This was causing the context restore function to be
> >> called at probe time for this bank and because the context had never been saved,
> >> was restoring an invalid state. This ultimately resulted in a crash [1].
> >>
> >> There are multiple bugs here that need to be addressed ...
> >>
> >> 1. Why the always-on power domain returns a context loss count of 1? This needs
> >> to be fixed in the power domain code. However, the gpio driver should not
> >> assume the loss count is 0 to begin with.
> >> 2. The omap gpio driver should never be calling get_context_loss_count for a
> >> gpio bank in a always-on domain. This is pointless and adds unneccessary
> >> overhead.
> >> 3. The OMAP gpio driver assumes that the initial power domain context loss count
> >> will be 0 at the time the gpio driver is probed. However, it could be
> >> possible that this is not the case and an invalid context restore could be
> >> performed during the probe. To avoid this otherwise only populated the
> >
> > The 'To avoid this...' sentence here doesn't read well. Looks like you
> > need to:
> >
> > s/otherwise//
>
> Yes, I meant to have dropped "otherwise" here. Thanks!
>
> > s/populated/populate/
>
> Yes that too! I must have re-worded and screwed it up royally :-(
>
> > ?
> >
> >> get_context_loss_count() function pointer after the initial call to
> >> pm_runtime_get() has occurred. This will ensure that the first
> >> pm_runtime_put() initialised the loss count correctly.
> >>
> >> This patch addresses issues 2 and 3 above.
> >> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
> >>
> >> Cc: Grant Likely <grant.likely@secretlab.ca>
> >> Cc: Linus Walleij <linus.walleij@stericsson.com>
> >> Cc: Kevin Hilman <khilman@ti.com>
> >> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
> >> Cc: Franky Lin <frankyl@broadcom.com>
> >>
> >> Reported-by: Franky Lin <frankyl@broadcom.com>
> >> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
> >
> > Thanks for digging inot this bug Jon. The same bug was brought up by
> > Neil Brown (Cc'd) in a different thread.
> >
> > Neil, it looks to me that this fix will address the problems you were
> > seeing as well. Care to test, and respond with your ack/tested-by if it
> > works for you? Thanks.
>
> Neil let me know your thoughts and if you are ok, I can clean-up the
> changelog and re-send.
Yes, works for me and looks sensible.
Tested-by: NeilBrown <neilb@suse.de>
Thanks,
NeilBrown
* [PATCH] gpio/omap: fix invalid context restore of gpio bank-0
2012-07-02 23:34 ` NeilBrown
@ 2012-07-03 0:05 ` Kevin Hilman
2012-07-03 0:20 ` Jon Hunter
0 siblings, 1 reply; 10+ messages in thread
From: Kevin Hilman @ 2012-07-03 0:05 UTC (permalink / raw)
To: linux-arm-kernel
NeilBrown <neilb@suse.de> writes:
> On Mon, 2 Jul 2012 13:26:38 -0500 Jon Hunter <jon-hunter@ti.com> wrote:
>
>>
>> On 07/02/2012 01:07 PM, Kevin Hilman wrote:
>> > + Neil Brown
>> >
>> > Hi Jon,
>> >
>> > Jon Hunter <jon-hunter@ti.com> writes:
>> >
>> >> Currently the gpio _runtime_resume/suspend functions are calling the
>> >> get_context_loss_count() platform function if the function is populated for
>> >> a gpio bank. This function is used to determine if the gpio bank logic state
>> >> needs to be restored due to a power transition. This function will be populated
>> >> for all banks, but it should only be called for banks that have the
>> >> "loses_context" variable set. It is pointless to call this if loses_context is
>> >> false as we know the context will never be lost and will not need restoring.
>> >>
>> >> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
>> >> never lose context. We found that the get_context_loss_count() was being called
>> >> for bank-0 during the probe and returning 1 instead of 0 indicating that the
>> >> context had been lost. This was causing the context restore function to be
>> >> called at probe time for this bank and because the context had never been saved,
>> >> was restoring an invalid state. This ultimately resulted in a crash [1].
>> >>
>> >> There are multiple bugs here that need to be addressed ...
>> >>
>> >> 1. Why the always-on power domain returns a context loss count of 1? This needs
>> >> to be fixed in the power domain code. However, the gpio driver should not
>> >> assume the loss count is 0 to begin with.
>> >> 2. The omap gpio driver should never be calling get_context_loss_count for a
>> >> gpio bank in a always-on domain. This is pointless and adds unneccessary
>> >> overhead.
>> >> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>> >> will be 0 at the time the gpio driver is probed. However, it could be
>> >> possible that this is not the case and an invalid context restore could be
>> >> performed during the probe. To avoid this otherwise only populated the
>> >
>> > The 'To avoid this...' sentence here doesn't read well. Looks like you
>> > need to:
>> >
>> > s/otherwise//
>>
>> Yes, I meant to have dropped "otherwise" here. Thanks!
>>
>> > s/populated/populate/
>>
>> Yes that too! I must have re-worded and screwed it up royally :-(
>>
>> > ?
>> >
>> >> get_context_loss_count() function pointer after the initial call to
>> >> pm_runtime_get() has occurred. This will ensure that the first
>> >> pm_runtime_put() initialised the loss count correctly.
>> >>
>> >> This patch addresses issues 2 and 3 above.
>> >> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>> >>
>> >> Cc: Grant Likely <grant.likely@secretlab.ca>
>> >> Cc: Linus Walleij <linus.walleij@stericsson.com>
>> >> Cc: Kevin Hilman <khilman@ti.com>
>> >> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
>> >> Cc: Franky Lin <frankyl@broadcom.com>
>> >>
>> >> Reported-by: Franky Lin <frankyl@broadcom.com>
>> >> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
>> >
>> > Thanks for digging inot this bug Jon. The same bug was brought up by
>> > Neil Brown (Cc'd) in a different thread.
>> >
>> > Neil, it looks to me that this fix will address the problems you were
>> > seeing as well. Care to test, and respond with your ack/tested-by if it
>> > works for you? Thanks.
>>
>> Neil let me know your thoughts and if you are ok, I can clean-up the
>> changelog and re-send.
>
> Yes, works for me and looks sensible.
>
> Tested-by: NeilBrown <neilb@suse.de>
>
Great! Thanks for testing.
Jon, please make the minor changelog edits, collect the reviewed-by and
tested-by tags and repost. I'll then queue this up for Grant.
Based on your earlier comments, this only affects v3.5, so no
need to push it into stable, correct?
Kevin
* [PATCH] gpio/omap: fix invalid context restore of gpio bank-0
2012-07-03 0:05 ` Kevin Hilman
@ 2012-07-03 0:20 ` Jon Hunter
0 siblings, 0 replies; 10+ messages in thread
From: Jon Hunter @ 2012-07-03 0:20 UTC (permalink / raw)
To: linux-arm-kernel
On 07/02/2012 07:05 PM, Kevin Hilman wrote:
> NeilBrown <neilb@suse.de> writes:
>
>> On Mon, 2 Jul 2012 13:26:38 -0500 Jon Hunter <jon-hunter@ti.com> wrote:
>>
>>>
>>> On 07/02/2012 01:07 PM, Kevin Hilman wrote:
>>>> + Neil Brown
>>>>
>>>> Hi Jon,
>>>>
>>>> Jon Hunter <jon-hunter@ti.com> writes:
>>>>
>>>>> Currently the gpio _runtime_resume/suspend functions are calling the
>>>>> get_context_loss_count() platform function if the function is populated for
>>>>> a gpio bank. This function is used to determine if the gpio bank logic state
>>>>> needs to be restored due to a power transition. This function will be populated
>>>>> for all banks, but it should only be called for banks that have the
>>>>> "loses_context" variable set. It is pointless to call this if loses_context is
>>>>> false as we know the context will never be lost and will not need restoring.
>>>>>
>>>>> For all OMAP2+ devices gpio bank-0 is in an always-on power domain and so will
>>>>> never lose context. We found that the get_context_loss_count() was being called
>>>>> for bank-0 during the probe and returning 1 instead of 0 indicating that the
>>>>> context had been lost. This was causing the context restore function to be
>>>>> called at probe time for this bank and because the context had never been saved,
>>>>> was restoring an invalid state. This ultimately resulted in a crash [1].
>>>>>
>>>>> There are multiple bugs here that need to be addressed ...
>>>>>
>>>>> 1. Why the always-on power domain returns a context loss count of 1? This needs
>>>>> to be fixed in the power domain code. However, the gpio driver should not
>>>>> assume the loss count is 0 to begin with.
>>>>> 2. The omap gpio driver should never be calling get_context_loss_count for a
>>>>> gpio bank in an always-on domain. This is pointless and adds unnecessary
>>>>> overhead.
>>>>> 3. The OMAP gpio driver assumes that the initial power domain context loss count
>>>>> will be 0 at the time the gpio driver is probed. However, it could be
>>>>> possible that this is not the case and an invalid context restore could be
>>>>> performed during the probe. To avoid this otherwise only populated the
>>>>
>>>> The 'To avoid this...' sentence here doesn't read well. Looks like you
>>>> need to:
>>>>
>>>> s/otherwise//
>>>
>>> Yes, I meant to have dropped "otherwise" here. Thanks!
>>>
>>>> s/populated/populate/
>>>
>>> Yes that too! I must have re-worded and screwed it up royally :-(
>>>
>>>> ?
>>>>
>>>>> get_context_loss_count() function pointer after the initial call to
>>>>> pm_runtime_get() has occurred. This will ensure that the first
>>>>> pm_runtime_put() initialised the loss count correctly.
>>>>>
>>>>> This patch addresses issues 2 and 3 above.
>>>>> [1] http://marc.info/?l=linux-omap&m=134065775323775&w=2
>>>>>
>>>>> Cc: Grant Likely <grant.likely@secretlab.ca>
>>>>> Cc: Linus Walleij <linus.walleij@stericsson.com>
>>>>> Cc: Kevin Hilman <khilman@ti.com>
>>>>> Cc: Tarun Kanti DebBarma <tarun.kanti@ti.com>
>>>>> Cc: Franky Lin <frankyl@broadcom.com>
>>>>>
>>>>> Reported-by: Franky Lin <frankyl@broadcom.com>
>>>>> Signed-off-by: Jon Hunter <jon-hunter@ti.com>
>>>>
>>>> Thanks for digging into this bug Jon. The same bug was brought up by
>>>> Neil Brown (Cc'd) in a different thread.
>>>>
>>>> Neil, it looks to me that this fix will address the problems you were
>>>> seeing as well. Care to test, and respond with your ack/tested-by if it
>>>> works for you? Thanks.
>>>
>>> Neil, let me know your thoughts and, if you are OK, I can clean up the
>>> changelog and re-send.
>>
>> Yes, works for me and looks sensible.
>>
>> Tested-by: NeilBrown <neilb@suse.de>
>>
>
> Great! Thanks for testing.
>
> Jon, please make the minor changelog edits, collect the reviewed-by and
> tested-by tags and repost. I'll then queue this up for Grant.

OK, will do that tomorrow.

> Based on your earlier comments, this only affects v3.5, so no
> need to push it into stable, correct?

As far as I can tell. However, I am not sure whether any of the other fixes
should be backported.

Cheers
Jon
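
For readers following the changelog quoted above, issues 2 and 3 can be
illustrated with a small stand-alone model. This is only a sketch and not the
actual patch: the struct, the sketch_* functions and fake_loss_count are
hypothetical names made up for illustration. Only loses_context and
get_context_loss_count() come from the thread, and the calls to
sketch_runtime_resume()/sketch_runtime_suspend() merely stand in for the
initial pm_runtime_get() and the first pm_runtime_put().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the driver's per-bank state. */
struct gpio_bank_sketch {
	bool loses_context;                  /* false for an always-on bank such as bank-0 */
	int context_loss_count;              /* count recorded at the last suspend */
	int (*get_context_loss_count)(void); /* left NULL until after the first resume */
	int restores;                        /* number of context restores performed */
};

/* Mimics a power domain that already reports a loss count of 1 at probe. */
static int fake_loss_count = 1;

static int fake_get_context_loss_count(void)
{
	return fake_loss_count;
}

static void sketch_runtime_suspend(struct gpio_bank_sketch *bank)
{
	/* Issue 2: only query the loss count for banks that can lose context. */
	if (bank->loses_context && bank->get_context_loss_count)
		bank->context_loss_count = bank->get_context_loss_count();
}

static void sketch_runtime_resume(struct gpio_bank_sketch *bank)
{
	if (bank->loses_context && bank->get_context_loss_count) {
		/* Restore only if the count changed since the last suspend. */
		if (bank->get_context_loss_count() != bank->context_loss_count)
			bank->restores++;
	}
}

void sketch_probe(struct gpio_bank_sketch *bank, bool loses_context)
{
	bank->loses_context = loses_context;
	bank->context_loss_count = 0;
	bank->get_context_loss_count = NULL;
	bank->restores = 0;

	/* Stands in for the initial pm_runtime_get(): pointer is still NULL,
	 * so no restore of a never-saved context can happen here. */
	sketch_runtime_resume(bank);

	/* Issue 3: populate the pointer only after the first resume. */
	if (bank->loses_context)
		bank->get_context_loss_count = fake_get_context_loss_count;

	/* Stands in for the first pm_runtime_put(): records the real count. */
	sketch_runtime_suspend(bank);
}
```

In this model a bank with loses_context == false never queries the loss count
at all, and a bank that can lose context performs its first restore only after
the count has changed relative to the value recorded at the first suspend.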
end of thread, other threads:[~2012-07-03 0:20 UTC | newest]
Thread overview: 10+ messages
2012-06-29 17:22 [PATCH] gpio/omap: fix invalid context restore of gpio bank-0 Jon Hunter
2012-06-29 20:27 ` Franky Lin
2012-06-30 4:18 ` Shilimkar, Santosh
2012-07-01 8:45 ` Tony Lindgren
2012-07-02 18:22 ` Jon Hunter
2012-07-02 18:07 ` Kevin Hilman
2012-07-02 18:26 ` Jon Hunter
2012-07-02 23:34 ` NeilBrown
2012-07-03 0:05 ` Kevin Hilman
2012-07-03 0:20 ` Jon Hunter