* Re: [PATCH 2/5] mwifiex: use spinlock for 'mwifiex_processing' in shutdown_drv
From: Dmitry Torokhov @ 2016-10-24 23:47 UTC (permalink / raw)
To: Amitkumar Karwar
Cc: linux-wireless, Cathy Luo, Nishant Sarmukadam, briannorris
In-Reply-To: <1477318892-22877-2-git-send-email-akarwar@marvell.com>
On Mon, Oct 24, 2016 at 07:51:29PM +0530, Amitkumar Karwar wrote:
> This variable is guarded by spinlock at all other places. This patch
> takes care of missing spinlock usage in mwifiex_shutdown_drv().
>
> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> ---
> drivers/net/wireless/marvell/mwifiex/init.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c
> index 82839d9..8e5e424 100644
> --- a/drivers/net/wireless/marvell/mwifiex/init.c
> +++ b/drivers/net/wireless/marvell/mwifiex/init.c
> @@ -670,11 +670,14 @@ mwifiex_shutdown_drv(struct mwifiex_adapter *adapter)
>
> adapter->hw_status = MWIFIEX_HW_STATUS_CLOSING;
> /* wait for mwifiex_process to complete */
> + spin_lock_irqsave(&adapter->main_proc_lock, flags);
> if (adapter->mwifiex_processing) {
> + spin_unlock_irqrestore(&adapter->main_proc_lock, flags);
> mwifiex_dbg(adapter, WARN,
> "main process is still running\n");
> return ret;
> }
> + spin_unlock_irqrestore(&adapter->main_proc_lock, flags);
What happens if adapter->mwifiex_processing will become true here?
>
> /* cancel current command */
> if (adapter->curr_cmd) {
> --
> 1.9.1
>
--
Dmitry
^ permalink raw reply
* Re: [PATCH 2/5] mwifiex: use spinlock for 'mwifiex_processing' in shutdown_drv
From: Brian Norris @ 2016-10-24 23:55 UTC (permalink / raw)
To: Dmitry Torokhov
Cc: Amitkumar Karwar, linux-wireless, Cathy Luo, Nishant Sarmukadam,
briannorris
In-Reply-To: <20161024234720.GB15034@dtor-ws>
Hi,
On Mon, Oct 24, 2016 at 04:47:20PM -0700, Dmitry Torokhov wrote:
> On Mon, Oct 24, 2016 at 07:51:29PM +0530, Amitkumar Karwar wrote:
> > This variable is guarded by spinlock at all other places. This patch
> > takes care of missing spinlock usage in mwifiex_shutdown_drv().
> >
> > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> > ---
> > drivers/net/wireless/marvell/mwifiex/init.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c
> > index 82839d9..8e5e424 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/init.c
> > +++ b/drivers/net/wireless/marvell/mwifiex/init.c
> > @@ -670,11 +670,14 @@ mwifiex_shutdown_drv(struct mwifiex_adapter *adapter)
> >
> > adapter->hw_status = MWIFIEX_HW_STATUS_CLOSING;
> > /* wait for mwifiex_process to complete */
> > + spin_lock_irqsave(&adapter->main_proc_lock, flags);
> > if (adapter->mwifiex_processing) {
> > + spin_unlock_irqrestore(&adapter->main_proc_lock, flags);
> > mwifiex_dbg(adapter, WARN,
> > "main process is still running\n");
> > return ret;
> > }
> > + spin_unlock_irqrestore(&adapter->main_proc_lock, flags);
>
> What happens if adapter->mwifiex_processing will become true here?
That's why I commented:
"I'm not quite sure why we have this check in the first place; if the
main process is still running, then don't we have bigger problems here
anyway?"
I think the answer is, we really better NOT see that become true. That
means that we've fired off more interrupts and began processing them.
But all callers should have disabled interrupts and stopped or flushed
the queue at this point, AFAICT.
So I expect the intention here is to be essentially an assert(), without
actually making it fatal. Maybe there's a better way to handle this? I
can't really think of a good one right now.
Brian
> >
> > /* cancel current command */
> > if (adapter->curr_cmd) {
> > --
> > 1.9.1
> >
>
> --
> Dmitry
^ permalink raw reply
* Re: [PATCH 2/5] mwifiex: use spinlock for 'mwifiex_processing' in shutdown_drv
From: Dmitry Torokhov @ 2016-10-24 23:57 UTC (permalink / raw)
To: Brian Norris
Cc: Amitkumar Karwar, linux-wireless, Cathy Luo, Nishant Sarmukadam
In-Reply-To: <20161024191914.GB968@localhost>
On Mon, Oct 24, 2016 at 12:19:15PM -0700, Brian Norris wrote:
> On Mon, Oct 24, 2016 at 07:51:29PM +0530, Amitkumar Karwar wrote:
> > This variable is guarded by spinlock at all other places. This patch
> > takes care of missing spinlock usage in mwifiex_shutdown_drv().
> >
> > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> > ---
> > drivers/net/wireless/marvell/mwifiex/init.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c
> > index 82839d9..8e5e424 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/init.c
> > +++ b/drivers/net/wireless/marvell/mwifiex/init.c
> > @@ -670,11 +670,14 @@ mwifiex_shutdown_drv(struct mwifiex_adapter *adapter)
> >
> > adapter->hw_status = MWIFIEX_HW_STATUS_CLOSING;
> > /* wait for mwifiex_process to complete */
> > + spin_lock_irqsave(&adapter->main_proc_lock, flags);
> > if (adapter->mwifiex_processing) {
>
> I'm not quite sure why we have this check in the first place; if the
> main process is still running, then don't we have bigger problems here
> anyway? But this is definitely the "right" way to check this flag, so:
No, this is definitely not right way to check it. What exactly does this
spinlock protect? It seems that the intent is to make sure we do not
call mwifiex_shutdown_drv() while mwifiex_main_process() is executing.
It even says above that we are "waiting" for it (but we do not, we may
bail out or we may not, depends on luck).
To implement this properly you not only need to take a lock and check
the flag, but keep the lock until mwifiex_shutdown_drv() is done, or
use some other way to let mwifiex_main_process() know it should not
access the adapter while mwifiex_shutdown_drv() is running.
NACK.
Thanks.
>
> Reviewed-by: Brian Norris <briannorris@chromium.org>
>
> > + spin_unlock_irqrestore(&adapter->main_proc_lock, flags);
> > mwifiex_dbg(adapter, WARN,
> > "main process is still running\n");
> > return ret;
> > }
> > + spin_unlock_irqrestore(&adapter->main_proc_lock, flags);
> >
> > /* cancel current command */
> > if (adapter->curr_cmd) {
> > --
> > 1.9.1
> >
--
Dmitry
^ permalink raw reply
* Re: [PATCH 5/5] mwifiex: wait for firmware dump completion in remove_card
From: Dmitry Torokhov @ 2016-10-25 0:14 UTC (permalink / raw)
To: Amitkumar Karwar
Cc: linux-wireless, Cathy Luo, Nishant Sarmukadam, briannorris,
Xinming Hu
In-Reply-To: <1477318892-22877-5-git-send-email-akarwar@marvell.com>
On Mon, Oct 24, 2016 at 07:51:32PM +0530, Amitkumar Karwar wrote:
> From: Xinming Hu <huxm@marvell.com>
>
> This patch ensures to wait for firmware dump completion in
> mwifiex_remove_card().
>
> For sdio interface, reset_trigger variable is used to identify
> if mwifiex_sdio_remove() is called by sdio_work during reset or
> the call is from sdio subsystem.
>
> This patch fixes a kernel crash observed during reboot when firmware
> dump operation is in process.
>
> Signed-off-by: Xinming Hu <huxm@marvell.com>
> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> ---
> drivers/net/wireless/marvell/mwifiex/pcie.c | 2 ++
> drivers/net/wireless/marvell/mwifiex/sdio.c | 15 ++++++++++++++-
> 2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
> index 986bf07..4512e86 100644
> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> @@ -528,6 +528,8 @@ static void mwifiex_pcie_remove(struct pci_dev *pdev)
> if (!adapter || !adapter->priv_num)
> return;
>
> + cancel_work_sync(&pcie_work);
> +
> if (user_rmmod && !adapter->mfg_mode) {
> #ifdef CONFIG_PM_SLEEP
> if (adapter->is_suspended)
> diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
> index 4cad1c2..f974260 100644
> --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> @@ -46,6 +46,15 @@
> */
> static u8 user_rmmod;
>
> +/* reset_trigger variable is used to identify if mwifiex_sdio_remove()
> + * is called by sdio_work during reset or the call is from sdio subsystem.
> + * We will cancel sdio_work only if the call is from sdio subsystem.
> + */
> +static u8 reset_triggered;
It would be really great if the driver supported multiple devices. IOW
please avoid module-globals.
> +
> +static void mwifiex_sdio_work(struct work_struct *work);
> +static DECLARE_WORK(sdio_work, mwifiex_sdio_work);
> +
> static struct mwifiex_if_ops sdio_ops;
> static unsigned long iface_work_flags;
>
> @@ -289,6 +298,9 @@ mwifiex_sdio_remove(struct sdio_func *func)
> if (!adapter || !adapter->priv_num)
> return;
>
> + if (!reset_triggered)
> + cancel_work_sync(&sdio_work);
> +
> mwifiex_dbg(adapter, INFO, "info: SDIO func num=%d\n", func->num);
>
> if (user_rmmod && !adapter->mfg_mode) {
> @@ -2290,7 +2302,9 @@ static void mwifiex_recreate_adapter(struct sdio_mmc_card *card)
> * discovered and initializes them from scratch.
> */
>
> + reset_triggered = 1;
> mwifiex_sdio_remove(func);
> + reset_triggered = 0;
What exactly happens if I trigger mwifiex_sdio_remove() from sysfs at
the same time this code is running?
>
> /* power cycle the adapter */
> sdio_claim_host(func);
> @@ -2621,7 +2635,6 @@ static void mwifiex_sdio_work(struct work_struct *work)
> mwifiex_sdio_card_reset_work(save_adapter);
> }
>
> -static DECLARE_WORK(sdio_work, mwifiex_sdio_work);
> /* This function resets the card */
> static void mwifiex_sdio_card_reset(struct mwifiex_adapter *adapter)
> {
> --
> 1.9.1
>
--
Dmitry
^ permalink raw reply
* Re: [PATCH 3/5] mwifiex: do not free firmware dump memory in shutdown_drv
From: Dmitry Torokhov @ 2016-10-25 0:17 UTC (permalink / raw)
To: Amitkumar Karwar
Cc: linux-wireless, Cathy Luo, Nishant Sarmukadam, briannorris,
Xinming Hu
In-Reply-To: <1477318892-22877-3-git-send-email-akarwar@marvell.com>
On Mon, Oct 24, 2016 at 07:51:30PM +0530, Amitkumar Karwar wrote:
> From: Xinming Hu <huxm@marvell.com>
>
> mwifiex_upload_device_dump() already takes care of freeing firmware dump
> memory. Doing the same thing in mwifiex_shutdown_drv() is redundant.
>
> Signed-off-by: Xinming Hu <huxm@marvell.com>
> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> ---
> drivers/net/wireless/marvell/mwifiex/init.c | 19 -------------------
> 1 file changed, 19 deletions(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c
> index 8e5e424..365efb8 100644
> --- a/drivers/net/wireless/marvell/mwifiex/init.c
> +++ b/drivers/net/wireless/marvell/mwifiex/init.c
> @@ -407,8 +407,6 @@ static void mwifiex_free_lock_list(struct mwifiex_adapter *adapter)
> static void
> mwifiex_adapter_cleanup(struct mwifiex_adapter *adapter)
> {
> - int idx;
> -
> if (!adapter) {
> pr_err("%s: adapter is NULL\n", __func__);
> return;
> @@ -426,23 +424,6 @@ mwifiex_adapter_cleanup(struct mwifiex_adapter *adapter)
> mwifiex_dbg(adapter, INFO, "info: free cmd buffer\n");
> mwifiex_free_cmd_buffer(adapter);
>
> - for (idx = 0; idx < adapter->num_mem_types; idx++) {
> - struct memory_type_mapping *entry =
> - &adapter->mem_type_mapping_tbl[idx];
> -
> - if (entry->mem_ptr) {
> - vfree(entry->mem_ptr);
> - entry->mem_ptr = NULL;
> - }
> - entry->mem_size = 0;
> - }
> -
> - if (adapter->drv_info_dump) {
> - vfree(adapter->drv_info_dump);
> - adapter->drv_info_dump = NULL;
> - adapter->drv_info_size = 0;
> - }
Why do you even keep the pointer to dump memory in the adapter
structure? You allocate it in mwifiex_drv_info_dump() and immediately
use it in mwifiex_upload_device_dump(). Why not simply pass the pointer
between the functions?
Thanks.
--
Dmitry
^ permalink raw reply
* Re: [PATCH v4 1/3] mwifiex: reset card->adapter during device unregister
From: Brian Norris @ 2016-10-25 0:51 UTC (permalink / raw)
To: Amitkumar Karwar
Cc: linux-wireless@vger.kernel.org, Cathy Luo, Nishant Sarmukadam,
rajatja@google.com, Xinming Hu, abhishekbh@google.com,
Dmitry Torokhov
In-Reply-To: <b1231f275ecc4e45a9c98dadb1d585f1@SC-EXCH04.marvell.com>
Hi Amit,
On Thu, Oct 20, 2016 at 01:11:31PM +0000, Amitkumar Karwar wrote:
> > From: Brian Norris [mailto:briannorris@chromium.org]
> > Sent: Tuesday, October 11, 2016 5:53 AM
> > To: Amitkumar Karwar
> > Cc: linux-wireless@vger.kernel.org; Cathy Luo; Nishant Sarmukadam;
> > rajatja@google.com; Xinming Hu; abhishekbh@google.com; Dmitry Torokhov
> > Subject: Re: [PATCH v4 1/3] mwifiex: reset card->adapter during device
> > unregister
> >
> > On Mon, Oct 10, 2016 at 01:53:32PM -0700, Brian Norris wrote:
> > > On Thu, Oct 06, 2016 at 11:36:24PM +0530, Amitkumar Karwar wrote:
> > > > From: Xinming Hu <huxm@marvell.com>
> > > >
> > > > card->adapter gets initialized during device registration.
> > > > As it's not cleared, we may end up accessing invalid memory in some
> > > > corner cases. This patch fixes the problem.
> > > >
> > > > Signed-off-by: Xinming Hu <huxm@marvell.com>
> > > > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> > > > ---
> > > > v4: Same as v1, v2, v3
> > > > ---
> > > > drivers/net/wireless/marvell/mwifiex/pcie.c | 1 +
> > > > drivers/net/wireless/marvell/mwifiex/sdio.c | 1 +
> > > > 2 files changed, 2 insertions(+)
> > > >
> > > > diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > index f1eeb73..ba9e068 100644
> > > > --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > @@ -3042,6 +3042,7 @@ static void mwifiex_unregister_dev(struct
> > mwifiex_adapter *adapter)
> > > > pci_disable_msi(pdev);
> > > > }
> > > > }
> > > > + card->adapter = NULL;
> > > > }
> > > >
> > > > /* This function initializes the PCI-E host memory space, WCB
> > rings, etc.
> > > > diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > index 8718950..4cad1c2 100644
> > > > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > @@ -2066,6 +2066,7 @@ mwifiex_unregister_dev(struct mwifiex_adapter
> > *adapter)
> > > > struct sdio_mmc_card *card = adapter->card;
> > > >
> > > > if (adapter->card) {
> > > > + card->adapter = NULL;
> > > > sdio_claim_host(card->func);
> > > > sdio_disable_func(card->func);
> > > > sdio_release_host(card->func);
> > >
> > > As discussed on v1, I had qualms about the raciness between
> > > reads/writes of card->adapter, but I believe we:
> > > (a) can't have any command activity while writing the ->adapter field
> > > (either we're just init'ing the device, or we've disabled interrupts
> > > and are tearing it down) and
> > > (b) can't have a race between suspend()/resume() and unregister_dev(),
> > > since unregister_dev() is called from device remove() (which should
> > > not be concurrent with suspend()).
> > >
> > > Also, I thought you had the same problem in usb.c, but in fact, you
> > > fixed that ages ago here:
> > >
> > > 353d2a69ea26 mwifiex: fix issues in driver unload path for USB
> > > chipsets
> > >
> > > Would be nice if fixes were bettery synchronized across the three
> > > interface drivers you support. We seem to be discovering unnecessary
> > > divergence on a few points recently.
> > >
> > > At any rate:
> > >
> > > Reviewed-by: Brian Norris <briannorris@chromium.org>
> > > Tested-by: Brian Norris <briannorris@chromium.org>
> >
> > Dmitry helped me re-realize my original qualms:
> >
> > mwifiex_unregister_dev() is called in the failure path for your async FW
> > request, and so it may race with suspend(). So I retract my Reviewed-by.
> > Sorry.
>
> Thanks for your comments.
>
> Actually description for this patch was ambiguous and incorrect. Sorry
> for that. This patch doesn't fix any race. In fact, we don't have a
> race between init and remove threads due to semaphore usage as per
> design. This patch just adds missing "card->adapter=NULL" so that when
> teardown/remove thread starts after init failure, it won't try freeing
> already freed things.
So to be clear, you'r talking about mwifiex_fw_dpc(), which in the error path
has:
if (adapter->if_ops.unregister_dev)
adapter->if_ops.unregister_dev(adapter); <--- POINT A: This is where you want to set ->adapter = NULL
...
if (init_failed)
mwifiex_free_adapter(adapter);
up(sem); <--- POINT B: This is where you release the semaphore, which is supposed to guarantee that remove() isn't happening
return;
}
But you *do* have a race between the above code and the remove code in some
cases. Particularly, see this:
static void mwifiex_pcie_remove(struct pci_dev *pdev)
{
struct pcie_service_card *card;
struct mwifiex_adapter *adapter;
struct mwifiex_private *priv;
card = pci_get_drvdata(pdev);
if (!card)
return;
adapter = card->adapter; <--- POINT C: This can execute at the same time as unregister_dev()
if (!adapter || !adapter->priv_num)
return;
if (user_rmmod && !adapter->mfg_mode) {
#ifdef CONFIG_PM_SLEEP
if (adapter->is_suspended)
mwifiex_pcie_resume(&pdev->dev);
#endif
mwifiex_deauthenticate_all(adapter);
priv = mwifiex_get_priv(adapter, MWIFIEX_BSS_ROLE_ANY);
mwifiex_disable_auto_ds(priv);
mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
}
mwifiex_remove_card(card->adapter, &add_remove_card_sem); <--- POINT D: You only grab the semaphore here
}
So IIUC, you have a race **even here, where you claim the semaphore
should protect you** such that the adapter might be freed, but you still
can access it anywhere between C and D. i.e., you can see this:
Thread 1 Thread 2
(1) POINT C (retrieve adapter != NULL)
(2) POINT A (set adapter NULL)
(3) POINT B (adapter has been freed)
(3) ....Keep accessing freed adapter structure
(4) POINT D - acquire semaphore, but we're too late
Step 3 is an error, and AFAICT, that's exactly what you're trying to
solve in this patch. It essentially comes down to the same fact: you're
getting a reference to the adapter structure *without* any protection at
all. Your add_remove_card_sem is *almost* the right thing to resolve
this, but you still don't have the ordering quite right.
Maybe you could solve this all by acquiring the semaphore in suspend(),
remove(), shutdown(), before you ever acquire the card->adapter? If you
do that first (to fix the race), and only *then* do you submit the
$subject patch, then you might have fixed all the races in question (at
least between FW init and {suspend,resume,remove,shutdown} -- you have
plenty of other races still, both known and unknown).
[[ Also, a side note on POINT D: it's possible that even though you're
retrieving card->adapter in the argument to that function call, it's
possible the compiler will notice that this is redundant with the
retrieval at POINT C and just use that one. But even if not, there's a
window between "retrieve function argument" and "grab semaphore in
mwifiex_remove_card()". ]]
---
BTW I'm gonna sidetrack us here... what's up with this code, in pcie.c?
if (!down_interruptible(&add_remove_card_sem))
up(&add_remove_card_sem);
I can understand that you want to grab the semaphore for the remove code, but
why do you want to immediately release it? If you release it, that must
mean that someone else is trying to get it. And they won't notice that
you're tearing down the device... (Also, why aren't you handling the
interrupted case?)
> I have updated patch description in v5 series.
>
> You are right. We may have a race between init failure and suspend
> thread. I have prepared 4th patch in my v5 series to address this.
>
> >
> > I'm going to look into converting to asynchronous device probing, which
> > might remove the need for async FW request, and would therefore resolve
> > both patch 1 and 3's races without any additional complicated hacks. But
> > I'm not sure if that will satisfy all mwifiex users well enough. I'll
> > have to give it a little more thought. Any thoughts from your side,
> > Amit?
> >
>
> This is not needed.
Yeah, I ended up deciding this really wasn't that great of a solution.
For one, you also need to do firmware reloads for your PCIe reset
handling, and this happens in parallel to any device suspend. So:
(a) I'd bet that has plenty of race conditions and
(b) that means we can't just "load the firmware synchronously once and
forget about it"
> 4th patch in V5 series would take care of this race.
No, patch 4 does not handle this race, and it doesn't handle the race I
point out above either. You still can get a pointer card->adapter and
then see another thread subsequently free() it out from under you.
> I will be posting v5 series shortly.
Nak to both v4 and v5. They're not substantially different.
Brian
^ permalink raw reply
* Re: [PATCH v5 1/4] mwifiex: reset card->adapter during device unregister
From: Brian Norris @ 2016-10-25 1:00 UTC (permalink / raw)
To: Amitkumar Karwar
Cc: linux-wireless, Cathy Luo, Nishant Sarmukadam, briannorris,
dmitry.torokhov, Xinming Hu
In-Reply-To: <1476969979-28554-1-git-send-email-akarwar@marvell.com>
On Thu, Oct 20, 2016 at 06:56:16PM +0530, Amitkumar Karwar wrote:
> From: Xinming Hu <huxm@marvell.com>
>
> card->adapter gets initialized in mwifiex_register_dev(). As it's not
> cleared in mwifiex_unregister_dev(), we may end up accessing the memory
> which is already free in below scenario.
>
> Scenario: Driver initialization is failed due to incorrect firmware or
> some other reason. Meanwhile device reboot/unload occurs.
>
> Please note that we have 'add_remove_card_sem' semaphore. So if there
> is a race betweem init and remove threads, they will execute one after
> another.
I argued in v4 [1] that this is false, and therefore this patch isn't
really correct. Carrying the NACK here.
Brian
[1] https://patchwork.kernel.org/patch/9365209/
> This patch ensures that "card->adapter" is set to NULL when
> all cleanup is performed in init failure thread. Later remove thread
> can return immediately if "card->adapter" is NULL
>
> Signed-off-by: Xinming Hu <huxm@marvell.com>
> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> ---
> v4: Same as v1, v2, v3
> v5: Patch description is updated to get clear picture. There is no race
> for init and remove threads as per design. This patch just adds missing
> "card->adapter= NULL" change to avoid accessing already freed memory which
> leads to a crash.
> ---
> drivers/net/wireless/marvell/mwifiex/pcie.c | 1 +
> drivers/net/wireless/marvell/mwifiex/sdio.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
> index 063c707..302ffd1 100644
> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> @@ -3021,6 +3021,7 @@ static void mwifiex_unregister_dev(struct mwifiex_adapter *adapter)
> if (card->msi_enable)
> pci_disable_msi(pdev);
> }
> + card->adapter = NULL;
> }
> }
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
> index 8718950..4cad1c2 100644
> --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> @@ -2066,6 +2066,7 @@ mwifiex_unregister_dev(struct mwifiex_adapter *adapter)
> struct sdio_mmc_card *card = adapter->card;
>
> if (adapter->card) {
> + card->adapter = NULL;
> sdio_claim_host(card->func);
> sdio_disable_func(card->func);
> sdio_release_host(card->func);
> --
> 1.9.1
>
^ permalink raw reply
* Re: compex wle900vx (ath10k) problem on 4.4.24 / armv7
From: Oliver Zemann @ 2016-10-25 5:53 UTC (permalink / raw)
To: Matthias Klein, Michal Kazior; +Cc: linux-wireless
In-Reply-To: <45bffe7d-8f27-ef26-b7f9-78709e52652f@gmail.com>
> I created a patch which should work for 4.4.24 (at least for arch
> linux arm it applied successful)
>
> Just compiling the kernel... lets see what happens
Did not work. First, there was a + which was wrong, guess i have this
simply overseen. After fixing that i got:
drivers/pci/pcie/portdrv_pci.c: In function 'pcie_port_runtime_suspend':
drivers/pci/pcie/portdrv_pci.c:98:24: error: 'struct pci_dev' has no
member named 'bridge_d3'
return to_pci_dev(dev)->bridge_d3 ? 0 : -EBUSY;
^~
drivers/pci/pcie/portdrv_pci.c: In function 'pcie_port_runtime_idle':
drivers/pci/pcie/portdrv_pci.c:113:24: error: 'struct pci_dev' has no
member named 'bridge_d3'
return to_pci_dev(dev)->bridge_d3 ? 0 : -EBUSY;
^~
drivers/pci/pcie/portdrv_pci.c:114:1: warning: control reaches end of
non-void function [-Wreturn-type]
}
^
drivers/pci/pcie/portdrv_pci.c: In function 'pcie_port_runtime_suspend':
drivers/pci/pcie/portdrv_pci.c:99:1: warning: control reaches end of
non-void function [-Wreturn-type]
}
^
make[3]: *** [scripts/Makefile.build:259:
drivers/pci/pcie/portdrv_pci.o] Error 1
make[2]: *** [scripts/Makefile.build:403: drivers/pci/pcie] Error 2
make[1]: *** [scripts/Makefile.build:403: drivers/pci] Error 2
make: *** [Makefile:959: drivers] Error 2
==> ERROR: A failure occurred in build().
Aborting...
A simple cherry pick does not work in that case.
^ permalink raw reply
* Re: Bayesian rate control
From: Johannes Berg @ 2016-10-25 7:14 UTC (permalink / raw)
To: Björn Smedman; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <CAGp19xdgsoyx9Sa=uas1OEiFoJdzFChmt20++9e7fJK=Z1ujWQ@mail.gmail.com>
On Mon, 2016-10-24 at 22:17 +0200, Björn Smedman wrote:
>
> Are there any cards out there that support VHT and use s/w rate
> control (Minstrel)? Just in case I run out of search space. :P
Maybe something that's usable with brcmsmac? Not sure I'd recommend
buying Broadcom NICs though at this point, with them having been bought
out and all (and their upstream support is somewhat spotty, so you
might have a hard time finding the right NIC)
johannes
^ permalink raw reply
* Re: [PATCH] staging: rtl8712: Free memory and return failure when kmalloc fails
From: Greg KH @ 2016-10-25 9:03 UTC (permalink / raw)
To: Souptick Joarder; +Cc: linux-wireless, Larry.Finger, florian.c.schilhabel
In-Reply-To: <20161020065932.GA21705@symbol-HP-ZBook-15>
On Thu, Oct 20, 2016 at 12:29:33PM +0530, Souptick Joarder wrote:
> This patch is added to free memory and return failure when kmalloc fails
I'm sorry, but I can not parse that sentance. Can you rephrase this a
bit better? What exactly are you doing here?
>
> Signed-off-by: Souptick joarder <jrdr.linux@gmail.com>
> ---
> drivers/staging/rtl8712/os_intfs.c | 3 ++-
> drivers/staging/rtl8712/rtl871x_cmd.c | 5 ++++-
> drivers/staging/rtl8712/rtl871x_xmit.c | 5 ++++-
> 3 files changed, 10 insertions(+), 3 deletions(-)
Any reason why you didn't cc: the driverdevel mailing list? I doubt
linux-wireless cares about staging drivers :(
>
> diff --git a/drivers/staging/rtl8712/os_intfs.c b/drivers/staging/rtl8712/os_intfs.c
> index cbe4de0..aab3141 100644
> --- a/drivers/staging/rtl8712/os_intfs.c
> +++ b/drivers/staging/rtl8712/os_intfs.c
> @@ -313,7 +313,8 @@ u8 r8712_init_drv_sw(struct _adapter *padapter)
> return _FAIL;
> if (r8712_init_mlme_priv(padapter) == _FAIL)
> return _FAIL;
> - _r8712_init_xmit_priv(&padapter->xmitpriv, padapter);
> + if ((_r8712_init_xmit_priv(&padapter->xmitpriv, padapter)) != _SUCCESS)
> + return _FAIL;
You don't have to unwind anything that r8712_init_mlme_priv() did?
> _r8712_init_recv_priv(&padapter->recvpriv, padapter);
> memset((unsigned char *)&padapter->securitypriv, 0,
> sizeof(struct security_priv));
thanks,
greg k-h
^ permalink raw reply
* pull-request: iwlwifi 2016-10-25
From: Luca Coelho @ 2016-10-25 9:38 UTC (permalink / raw)
To: kvalo; +Cc: linux-wireless, linuxwifi
[-- Attachment #1: Type: text/plain, Size: 2456 bytes --]
Hi Kalle,
Here are 7 patches intended for 4.9. They fix some small issues, as
described in the tag itself. I sent them out for review on 2016-10-19,
and pushed to a pending branch. Kbuildbot didn't find any issues.
Let me know if everything's fine (or not). :)
Luca.
The following changes since commit 1ea2643961b0d1b8d0e4a11af5aa69b0f92d0533:
ath6kl: add Dell OEM SDIO I/O for the Venue 8 Pro (2016-10-13 14:16:33 +0300)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes.git tags/iwlwifi-for-kalle-2016-10-25
for you to fetch changes up to 5a143db8c4a28dab6423cb6197e9f1389da375f2:
iwlwifi: mvm: fix netdetect starting/stopping for unified images (2016-10-19 09:54:45 +0300)
----------------------------------------------------------------
* some fixes for suspend/resume with unified FW images;
* a fix for a false-positive lockdep report;
* a fix for multi-queue that caused an unnecessary 1 second latency;
* a fix for an ACPI parsing bug that caused a misleading error message;
----------------------------------------------------------------
Haim Dreyfuss (1):
iwlwifi: mvm: comply with fw_restart mod param on suspend
Johannes Berg (1):
iwlwifi: pcie: mark command queue lock with separate lockdep class
Luca Coelho (4):
iwlwifi: mvm: use ssize_t for len in iwl_debugfs_mem_read()
iwlwifi: mvm: fix d3_test with unified D0/D3 images
iwlwifi: pcie: fix SPLC structure parsing
iwlwifi: mvm: fix netdetect starting/stopping for unified images
Sara Sharon (1):
iwlwifi: mvm: wake the wait queue when the RX sync counter is zero
drivers/net/wireless/intel/iwlwifi/mvm/d3.c | 49 +++++++++++++++++++++++++++++++++++++-----------
drivers/net/wireless/intel/iwlwifi/mvm/debugfs.c | 4 ++--
drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 3 +--
drivers/net/wireless/intel/iwlwifi/mvm/mvm.h | 1 +
drivers/net/wireless/intel/iwlwifi/mvm/ops.c | 1 +
drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c | 3 ++-
drivers/net/wireless/intel/iwlwifi/mvm/scan.c | 33 +++++++++++++++++++++++++++------
drivers/net/wireless/intel/iwlwifi/pcie/drv.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++-------------------------------
drivers/net/wireless/intel/iwlwifi/pcie/tx.c | 8 ++++++++
9 files changed, 128 insertions(+), 53 deletions(-)
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply
* pull-request: iwlwifi-next 2016-10-25
From: Luca Coelho @ 2016-10-25 10:04 UTC (permalink / raw)
To: kvalo; +Cc: linux-wireless, linuxwifi
[-- Attachment #1: Type: text/plain, Size: 2966 bytes --]
Hi Kalle,
Here's my first pull-request intended for 4.10. It contains the usual
improvements and bugfixes, with the major thing being that we have
finally enabled dynamic queue allocation. Some more details in the tag
description.
These patches were sent out for review on 2016-10-19, and pushed to a
pending branch. Kbuildbot didn't find any issues.
Let me know if everything's fine (or not). :)
Luca.
The following changes since commit 15b95a15950238eff4d7f24be1716086eea67835:
Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git (2016-09-28 16:37:33 +0300)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next.git tags/iwlwifi-next-for-kalle-2016-10-25
for you to fetch changes up to a048dcc71e701cb1771b2e0aca23181a45658023:
iwlwifi: mvm: use dev_coredumpsg() (2016-10-19 12:46:37 +0300)
----------------------------------------------------------------
* Finalize and enable dynamic queue allocation;
* Use dev_coredumpmsg() to prevent locking the driver;
* Small fix to pass the AID to the FW;
* Use FW PS decisions with multi-queue;
----------------------------------------------------------------
Aviya Erenfeld (1):
iwlwifi: mvm: use dev_coredumpsg()
Emmanuel Grumbach (1):
iwlwifi: mvm: tell the firmware about the AID of the peer
Johannes Berg (1):
iwlwifi: mvm: use firmware station PM notification for AP_LINK_PS
Liad Kaufman (5):
iwlwifi: mvm: update txq metadata to current owner
iwlwifi: mvm: fix reserved txq freeing
iwlwifi: mvm: support MONITOR vif in DQA mode
iwlwifi: mvm: fix dqa deferred frames marking
iwlwifi: mvm: operate in dqa mode
Sara Sharon (1):
iwlwifi: mvm: assign cab queue to the correct station
Sharon Dvir (1):
iwlwifi: pcie: give a meaningful name to interrupt request
drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h | 2 ++
drivers/net/wireless/intel/iwlwifi/mvm/fw-api-rx.h | 26 ++++++++++++++++++++
drivers/net/wireless/intel/iwlwifi/mvm/fw-api-sta.h | 9 ++++---
drivers/net/wireless/intel/iwlwifi/mvm/fw-api.h | 1 +
drivers/net/wireless/intel/iwlwifi/mvm/fw-dbg.c | 100 +++++++++++++++++++++++++++++++++++++++++----------------------------------
drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c | 24 +++++++++---------
drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
drivers/net/wireless/intel/iwlwifi/mvm/mvm.h | 6 ++---
drivers/net/wireless/intel/iwlwifi/mvm/ops.c | 3 +++
drivers/net/wireless/intel/iwlwifi/mvm/sta.c | 37 ++++++++++++++++++++++------
drivers/net/wireless/intel/iwlwifi/mvm/sta.h | 1 +
drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 29 +++++++++++++++++++++-
12 files changed, 249 insertions(+), 75 deletions(-)
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply
* RE: [PATCH] cw1200: fix bogus maybe-uninitialized warning
From: David Laight @ 2016-10-25 13:24 UTC (permalink / raw)
To: 'Arnd Bergmann', Solomon Peachy, Kalle Valo
Cc: Johannes Berg, linux-wireless@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20161024154215.2863586-1-arnd@arndb.de>
RnJvbTogT2YgQXJuZCBCZXJnbWFubg0KPiBTZW50OiAyNCBPY3RvYmVyIDIwMTYgMTY6NDINCiAN
Cj4gT24geDg2LCB0aGUgY3cxMjAwIGRyaXZlciBwcm9kdWNlcyBhIHJhdGhlciBzaWxseSB3YXJu
aW5nIGFib3V0IHRoZQ0KPiBwb3NzaWJsZSB1c2Ugb2YgdGhlICdyZXQnIHZhcmlhYmxlIHdpdGhv
dXQgYW4gaW5pdGlhbGl6YXRpb24NCj4gcHJlc3VtYWJseSBhZnRlciBiZWluZyBjb25mdXNlZCBi
eSB0aGUgYXJjaGl0ZWN0dXJlIHNwZWNpZmljIGRlZmluaXRpb24NCj4gb2YgV0FSTl9PTjoNCj4g
DQo+IGRyaXZlcnMvbmV0L3dpcmVsZXNzL3N0L2N3MTIwMC93c20uYzogSW4gZnVuY3Rpb24gd3Nt
X2hhbmRsZV9yeDoNCj4gZHJpdmVycy9uZXQvd2lyZWxlc3Mvc3QvY3cxMjAwL3dzbS5jOjE0NTc6
OTogZXJyb3I6IHJldCBtYXkgYmUgdXNlZCB1bmluaXRpYWxpemVkIGluIHRoaXMgZnVuY3Rpb24g
Wy0NCj4gV2Vycm9yPW1heWJlLXVuaW5pdGlhbGl6ZWRdDQo+IA0KPiBBcyB0aGUgZHJpdmVyIGp1
c3QgY2hlY2tzIHRoZSBzYW1lIHZhcmlhYmxlIHR3aWNlIGhlcmUsIHdlIGNhbiBzaW1wbGlmeQ0K
PiBpdCBieSByZW1vdmluZyB0aGUgc2Vjb25kIGNvbmRpdGlvbiwgd2hpY2ggbWFrZXMgaXQgbW9y
ZSByZWFkYWJsZSBhbmQNCj4gYXZvaWRzIHRoZSB3YXJuaW5nLg0KPiANCj4gU2lnbmVkLW9mZi1i
eTogQXJuZCBCZXJnbWFubiA8YXJuZEBhcm5kYi5kZT4NCj4gLS0tDQo+ICBkcml2ZXJzL25ldC93
aXJlbGVzcy9zdC9jdzEyMDAvd3NtLmMgfCAxNSArKysrKysrLS0tLS0tLS0NCj4gIDEgZmlsZSBj
aGFuZ2VkLCA3IGluc2VydGlvbnMoKyksIDggZGVsZXRpb25zKC0pDQo+IA0KPiBkaWZmIC0tZ2l0
IGEvZHJpdmVycy9uZXQvd2lyZWxlc3Mvc3QvY3cxMjAwL3dzbS5jIGIvZHJpdmVycy9uZXQvd2ly
ZWxlc3Mvc3QvY3cxMjAwL3dzbS5jDQo+IGluZGV4IDY4MGQ2MGVhYmM3NS4uMDk0ZTY2MzdhZGUy
IDEwMDY0NA0KPiAtLS0gYS9kcml2ZXJzL25ldC93aXJlbGVzcy9zdC9jdzEyMDAvd3NtLmMNCj4g
KysrIGIvZHJpdmVycy9uZXQvd2lyZWxlc3Mvc3QvY3cxMjAwL3dzbS5jDQo+IEBAIC0zODUsMTQg
KzM4NSwxMyBAQCBzdGF0aWMgaW50IHdzbV9tdWx0aV90eF9jb25maXJtKHN0cnVjdCBjdzEyMDBf
Y29tbW9uICpwcml2LA0KPiAgCWlmIChXQVJOX09OKGNvdW50IDw9IDApKQ0KPiAgCQlyZXR1cm4g
LUVJTlZBTDsNCj4gDQo+IC0JaWYgKGNvdW50ID4gMSkgew0KPiAtCQkvKiBXZSBhbHJlYWR5IHJl
bGVhc2VkIG9uZSBidWZmZXIsIG5vdyBmb3IgdGhlIHJlc3QgKi8NCj4gLQkJcmV0ID0gd3NtX3Jl
bGVhc2VfdHhfYnVmZmVyKHByaXYsIGNvdW50IC0gMSk7DQo+IC0JCWlmIChyZXQgPCAwKQ0KPiAt
CQkJcmV0dXJuIHJldDsNCj4gLQkJZWxzZSBpZiAocmV0ID4gMCkNCj4gLQkJCWN3MTIwMF9iaF93
YWtldXAocHJpdik7DQo+IC0JfQ0KPiArCS8qIFdlIGFscmVhZHkgcmVsZWFzZWQgb25lIGJ1ZmZl
ciwgbm93IGZvciB0aGUgcmVzdCAqLw0KPiArCXJldCA9IHdzbV9yZWxlYXNlX3R4X2J1ZmZlcihw
cml2LCBjb3VudCAtIDEpOw0KPiArCWlmIChyZXQgPCAwKQ0KPiArCQlyZXR1cm4gcmV0Ow0KPiAr
DQo+ICsJaWYgKHJldCA+IDApDQo+ICsJCWN3MTIwMF9iaF93YWtldXAocHJpdik7DQoNClRoYXQg
ZG9lc24ndCBsb29rIGVxdWl2YWxlbnQgdG8gbWUgKHdoZW4gY291bnQgPT0gMSkuDQoNCj4gDQo+
ICAJY3cxMjAwX2RlYnVnX3R4ZWRfbXVsdGkocHJpdiwgY291bnQpOw0KPiAgCWZvciAoaSA9IDA7
IGkgPCBjb3VudDsgKytpKSB7DQoNCkNvbnZlcnQgdGhpcyBsb29wIGludG8gYSBkbyAuLi4gd2hp
bGUgc28gdGhlIGJvZHkgZXhlY3V0ZXMgYXQgbGVhc3Qgb25jZS4NCg0KCURhdmlkDQoNCg==
^ permalink raw reply
* RE: [PATCH v4 1/3] mwifiex: reset card->adapter during device unregister
From: Amitkumar Karwar @ 2016-10-25 15:12 UTC (permalink / raw)
To: Brian Norris
Cc: linux-wireless@vger.kernel.org, Cathy Luo, Nishant Sarmukadam,
rajatja@google.com, Xinming Hu, abhishekbh@google.com,
Dmitry Torokhov
In-Reply-To: <20161025005150.GA84310@google.com>
Hi Brian,
Thanks for review.
> From: Brian Norris [mailto:briannorris@chromium.org]
> Sent: Tuesday, October 25, 2016 6:22 AM
> To: Amitkumar Karwar
> Cc: linux-wireless@vger.kernel.org; Cathy Luo; Nishant Sarmukadam;
> rajatja@google.com; Xinming Hu; abhishekbh@google.com; Dmitry Torokhov
> Subject: Re: [PATCH v4 1/3] mwifiex: reset card->adapter during device
> unregister
>
> Hi Amit,
>
> On Thu, Oct 20, 2016 at 01:11:31PM +0000, Amitkumar Karwar wrote:
> > > From: Brian Norris [mailto:briannorris@chromium.org]
> > > Sent: Tuesday, October 11, 2016 5:53 AM
> > > To: Amitkumar Karwar
> > > Cc: linux-wireless@vger.kernel.org; Cathy Luo; Nishant Sarmukadam;
> > > rajatja@google.com; Xinming Hu; abhishekbh@google.com; Dmitry
> > > Torokhov
> > > Subject: Re: [PATCH v4 1/3] mwifiex: reset card->adapter during
> > > device unregister
> > >
> > > On Mon, Oct 10, 2016 at 01:53:32PM -0700, Brian Norris wrote:
> > > > On Thu, Oct 06, 2016 at 11:36:24PM +0530, Amitkumar Karwar wrote:
> > > > > From: Xinming Hu <huxm@marvell.com>
> > > > >
> > > > > card->adapter gets initialized during device registration.
> > > > > As it's not cleared, we may end up accessing invalid memory in
> > > > > some corner cases. This patch fixes the problem.
> > > > >
> > > > > Signed-off-by: Xinming Hu <huxm@marvell.com>
> > > > > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> > > > > ---
> > > > > v4: Same as v1, v2, v3
> > > > > ---
> > > > > drivers/net/wireless/marvell/mwifiex/pcie.c | 1 +
> > > > > drivers/net/wireless/marvell/mwifiex/sdio.c | 1 +
> > > > > 2 files changed, 2 insertions(+)
> > > > >
> > > > > diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > index f1eeb73..ba9e068 100644
> > > > > --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > @@ -3042,6 +3042,7 @@ static void mwifiex_unregister_dev(struct
> > > mwifiex_adapter *adapter)
> > > > > pci_disable_msi(pdev);
> > > > > }
> > > > > }
> > > > > + card->adapter = NULL;
> > > > > }
> > > > >
> > > > > /* This function initializes the PCI-E host memory space, WCB
> > > rings, etc.
> > > > > diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > index 8718950..4cad1c2 100644
> > > > > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > @@ -2066,6 +2066,7 @@ mwifiex_unregister_dev(struct
> > > > > mwifiex_adapter
> > > *adapter)
> > > > > struct sdio_mmc_card *card = adapter->card;
> > > > >
> > > > > if (adapter->card) {
> > > > > + card->adapter = NULL;
> > > > > sdio_claim_host(card->func);
> > > > > sdio_disable_func(card->func);
> > > > > sdio_release_host(card->func);
> > > >
> > > > As discussed on v1, I had qualms about the raciness between
> > > > reads/writes of card->adapter, but I believe we:
> > > > (a) can't have any command activity while writing the ->adapter
> > > > field (either we're just init'ing the device, or we've disabled
> > > > interrupts and are tearing it down) and
> > > > (b) can't have a race between suspend()/resume() and
> > > > unregister_dev(), since unregister_dev() is called from device
> > > > remove() (which should not be concurrent with suspend()).
> > > >
> > > > Also, I thought you had the same problem in usb.c, but in fact,
> > > > you fixed that ages ago here:
> > > >
> > > > 353d2a69ea26 mwifiex: fix issues in driver unload path for USB
> > > > chipsets
> > > >
> > > > Would be nice if fixes were bettery synchronized across the three
> > > > interface drivers you support. We seem to be discovering
> > > > unnecessary divergence on a few points recently.
> > > >
> > > > At any rate:
> > > >
> > > > Reviewed-by: Brian Norris <briannorris@chromium.org>
> > > > Tested-by: Brian Norris <briannorris@chromium.org>
> > >
> > > Dmitry helped me re-realize my original qualms:
> > >
> > > mwifiex_unregister_dev() is called in the failure path for your
> > > async FW request, and so it may race with suspend(). So I retract my
> Reviewed-by.
> > > Sorry.
> >
> > Thanks for your comments.
> >
> > Actually description for this patch was ambiguous and incorrect. Sorry
> > for that. This patch doesn't fix any race. In fact, we don't have a
> > race between init and remove threads due to semaphore usage as per
> > design. This patch just adds missing "card->adapter=NULL" so that when
> > teardown/remove thread starts after init failure, it won't try freeing
> > already freed things.
>
> So to be clear, you'r talking about mwifiex_fw_dpc(), which in the error
> path
> has:
>
> if (adapter->if_ops.unregister_dev)
> adapter->if_ops.unregister_dev(adapter); <--- POINT A:
> This is where you want to set ->adapter = NULL ...
> if (init_failed)
> mwifiex_free_adapter(adapter);
> up(sem); <--- POINT B: This is where you release the semaphore,
> which is supposed to guarantee that remove() isn't happening
> return;
> }
>
> But you *do* have a race between the above code and the remove code in
> some cases. Particularly, see this:
>
> static void mwifiex_pcie_remove(struct pci_dev *pdev) {
> struct pcie_service_card *card;
> struct mwifiex_adapter *adapter;
> struct mwifiex_private *priv;
>
> card = pci_get_drvdata(pdev);
> if (!card)
> return;
>
> adapter = card->adapter; <--- POINT C: This can execute at the
> same time as unregister_dev()
> if (!adapter || !adapter->priv_num)
> return;
>
> if (user_rmmod && !adapter->mfg_mode) { #ifdef CONFIG_PM_SLEEP
> if (adapter->is_suspended)
> mwifiex_pcie_resume(&pdev->dev); #endif
>
> mwifiex_deauthenticate_all(adapter);
>
> priv = mwifiex_get_priv(adapter, MWIFIEX_BSS_ROLE_ANY);
>
> mwifiex_disable_auto_ds(priv);
>
> mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
> }
>
> mwifiex_remove_card(card->adapter, &add_remove_card_sem); <---
> POINT D: You only grab the semaphore here }
>
> So IIUC, you have a race **even here, where you claim the semaphore
> should protect you** such that the adapter might be freed, but you still
> can access it anywhere between C and D. i.e., you can see this:
>
> Thread 1 Thread 2
> (1) POINT C (retrieve adapter !=
> NULL)
> (2) POINT A (set adapter NULL)
> (3) POINT B (adapter has been freed)
> (3) ....Keep accessing freed
> adapter structure
> (4) POINT D - acquire semaphore,
> but we're too late
>
> Step 3 is an error, and AFAICT, that's exactly what you're trying to
> solve in this patch. It essentially comes down to the same fact: you're
> getting a reference to the adapter structure *without* any protection at
> all. Your add_remove_card_sem is *almost* the right thing to resolve
> this, but you still don't have the ordering quite right.
Code between POINT C and POINT D won't come into picture for "init + reboot" scenario(Case 1 below) which we are talking here. Reason is "user_rmmod" flag will be false.
Basically we have 3 teardown cases
1) System shutdown (or manually remove wifi card) -- Only mwifiex_pcie_remove() gets called.
2) User unloading the driver -- mwifiex_pcie_cleanup_module() followed by mwifiex_pcie_remove() gets called.
3) Chip isn't connected. User unloaded the driver -- Only mwifiex_pcie_cleanup_module() gets
called.
In case 2, we have already waited for semaphore in mwifiex_pcie_cleanup_module(). So by the time we execute mwifiex_pcie_remove(), "card->adapter" is NULL.
Case 3, doesn't use "card->adapter"
>
> Maybe you could solve this all by acquiring the semaphore in suspend(),
> remove(), shutdown(), before you ever acquire the card->adapter? If you
> do that first (to fix the race), and only *then* do you submit the
> $subject patch, then you might have fixed all the races in question (at
> least between FW init and {suspend,resume,remove,shutdown} -- you have
> plenty of other races still, both known and unknown).
"add_remove_card_sem" is used only for init and teardown threads as per our design.
V5 4/4 takes care of race between "init fail + suspend". Please let us know if you see any other race situation.
>
> [[ Also, a side note on POINT D: it's possible that even though you're
> retrieving card->adapter in the argument to that function call, it's
> possible the compiler will notice that this is redundant with the
> retrieval at POINT C and just use that one. But even if not, there's a
> window between "retrieve function argument" and "grab semaphore in
> mwifiex_remove_card()". ]]
I had not considered that compiler will notice this is redundant and use "card->adapter" assigned at POINT C.
But this window is very small compared to the window between in "card->adapter = NULL" and releasing semaphore in other(mwifiex_fw_dpc()) thread.
Even if "card->adapter" is set to NULL in mwifiex_fw_dpc() after we checked it at POINT C, we will return from mwifiex_remove_card() without grabing semaphore. mwifiex_fw_dpc() wouldn't have released the semaphore at that point of time. mwifiex_fw_dpc() will take care of performing cleanup.
>
> ---
>
> BTW I'm gonna sidetrack us here... what's up with this code, in pcie.c?
>
> if (!down_interruptible(&add_remove_card_sem))
> up(&add_remove_card_sem);
>
> I can understand that you want to grab the semaphore for the remove
> code, but why do you want to immediately release it? If you release it,
> that must mean that someone else is trying to get it. And they won't
> notice that you're tearing down the device...
I don't see any problem in releasing the semaphore here. Only other thread which can acquire this semaphore is "init" thread.
"init" thread won't get called when user is unloading the driver.
The purpose of immediately releasing it is we just wanted to wait until init is completed if it's running.
>(Also, why aren't you
> handling the interrupted case?)
The cleanup performed in this function is done without touching hardware OR referring adapter/card etc. It can be done even for the interrupted case.
>
> > I have updated patch description in v5 series.
> >
> > You are right. We may have a race between init failure and suspend
> > thread. I have prepared 4th patch in my v5 series to address this.
> >
> > >
> > > I'm going to look into converting to asynchronous device probing,
> > > which might remove the need for async FW request, and would
> > > therefore resolve both patch 1 and 3's races without any additional
> > > complicated hacks. But I'm not sure if that will satisfy all mwifiex
> > > users well enough. I'll have to give it a little more thought. Any
> > > thoughts from your side, Amit?
> > >
> >
> > This is not needed.
>
> Yeah, I ended up deciding this really wasn't that great of a solution.
> For one, you also need to do firmware reloads for your PCIe reset
> handling, and this happens in parallel to any device suspend. So:
> (a) I'd bet that has plenty of race conditions and
> (b) that means we can't just "load the firmware synchronously once and
> forget about it"
>
> > 4th patch in V5 series would take care of this race.
>
> No, patch 4 does not handle this race, and it doesn't handle the race I
> point out above either. You still can get a pointer card->adapter and
> then see another thread subsequently free() it out from under you.
We have used spinlock in v5 4/4 to guard "card->adapter".
Please help me understand if it doesn't handle the race between "init fail" + suspend.
Regards,
Amitkumar Karwar.
^ permalink raw reply
* RE: [PATCH 2/5] mwifiex: use spinlock for 'mwifiex_processing' in shutdown_drv
From: Amitkumar Karwar @ 2016-10-25 16:00 UTC (permalink / raw)
To: Brian Norris, Dmitry Torokhov
Cc: linux-wireless@vger.kernel.org, Cathy Luo, Nishant Sarmukadam,
briannorris@google.com
In-Reply-To: <20161024235555.GA7745@localhost>
Hi Brian/Dmitry,
> From: Brian Norris [mailto:briannorris@chromium.org]
> Sent: Tuesday, October 25, 2016 5:26 AM
> To: Dmitry Torokhov
> Cc: Amitkumar Karwar; linux-wireless@vger.kernel.org; Cathy Luo; Nishant
> Sarmukadam; briannorris@google.com
> Subject: Re: [PATCH 2/5] mwifiex: use spinlock for 'mwifiex_processing'
> in shutdown_drv
>
> Hi,
>
> On Mon, Oct 24, 2016 at 04:47:20PM -0700, Dmitry Torokhov wrote:
> > On Mon, Oct 24, 2016 at 07:51:29PM +0530, Amitkumar Karwar wrote:
> > > This variable is guarded by spinlock at all other places. This patch
> > > takes care of missing spinlock usage in mwifiex_shutdown_drv().
> > >
> > > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> > > ---
> > > drivers/net/wireless/marvell/mwifiex/init.c | 3 +++
> > > 1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/net/wireless/marvell/mwifiex/init.c
> > > b/drivers/net/wireless/marvell/mwifiex/init.c
> > > index 82839d9..8e5e424 100644
> > > --- a/drivers/net/wireless/marvell/mwifiex/init.c
> > > +++ b/drivers/net/wireless/marvell/mwifiex/init.c
> > > @@ -670,11 +670,14 @@ mwifiex_shutdown_drv(struct mwifiex_adapter
> > > *adapter)
> > >
> > > adapter->hw_status = MWIFIEX_HW_STATUS_CLOSING;
> > > /* wait for mwifiex_process to complete */
> > > + spin_lock_irqsave(&adapter->main_proc_lock, flags);
> > > if (adapter->mwifiex_processing) {
> > > + spin_unlock_irqrestore(&adapter->main_proc_lock, flags);
> > > mwifiex_dbg(adapter, WARN,
> > > "main process is still running\n");
> > > return ret;
> > > }
> > > + spin_unlock_irqrestore(&adapter->main_proc_lock, flags);
> >
> > What happens if adapter->mwifiex_processing will become true here?
>
> That's why I commented:
>
> "I'm not quite sure why we have this check in the first place; if the
> main process is still running, then don't we have bigger problems here
> anyway?"
If mwifiex_processing is true here, it means main_process is still running. We have set "adapter->hw_status = MWIFIEX_HW_STATUS_CLOSING" in mwifiex_shutdown_drv().
Now we will wait until main_process() gets completed.
---------------------
if (mwifiex_shutdown_drv(adapter) == -EINPROGRESS)
wait_event_interruptible(adapter->init_wait_q,
adapter->init_wait_q_woken);
--------------
We have following logic at the end of main_process()
-------
exit_main_proc:
if (adapter->hw_status == MWIFIEX_HW_STATUS_CLOSING)
mwifiex_shutdown_drv(adapter);
--------
Regards,
Amitkumar Karwar
^ permalink raw reply
* RE: [PATCH 2/5] mwifiex: use spinlock for 'mwifiex_processing' in shutdown_drv
From: Amitkumar Karwar @ 2016-10-25 16:11 UTC (permalink / raw)
To: Dmitry Torokhov, Brian Norris
Cc: linux-wireless@vger.kernel.org, Cathy Luo, Nishant Sarmukadam
In-Reply-To: <20161024235746.GC15034@dtor-ws>
Hi Dmitry,
> From: Dmitry Torokhov [mailto:dmitry.torokhov@gmail.com]
> Sent: Tuesday, October 25, 2016 5:28 AM
> To: Brian Norris
> Cc: Amitkumar Karwar; linux-wireless@vger.kernel.org; Cathy Luo; Nishant
> Sarmukadam
> Subject: Re: [PATCH 2/5] mwifiex: use spinlock for 'mwifiex_processing'
> in shutdown_drv
>
> On Mon, Oct 24, 2016 at 12:19:15PM -0700, Brian Norris wrote:
> > On Mon, Oct 24, 2016 at 07:51:29PM +0530, Amitkumar Karwar wrote:
> > > This variable is guarded by spinlock at all other places. This patch
> > > takes care of missing spinlock usage in mwifiex_shutdown_drv().
> > >
> > > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> > > ---
> > > drivers/net/wireless/marvell/mwifiex/init.c | 3 +++
> > > 1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/net/wireless/marvell/mwifiex/init.c
> > > b/drivers/net/wireless/marvell/mwifiex/init.c
> > > index 82839d9..8e5e424 100644
> > > --- a/drivers/net/wireless/marvell/mwifiex/init.c
> > > +++ b/drivers/net/wireless/marvell/mwifiex/init.c
> > > @@ -670,11 +670,14 @@ mwifiex_shutdown_drv(struct mwifiex_adapter
> > > *adapter)
> > >
> > > adapter->hw_status = MWIFIEX_HW_STATUS_CLOSING;
> > > /* wait for mwifiex_process to complete */
> > > + spin_lock_irqsave(&adapter->main_proc_lock, flags);
> > > if (adapter->mwifiex_processing) {
> >
> > I'm not quite sure why we have this check in the first place; if the
> > main process is still running, then don't we have bigger problems here
> > anyway? But this is definitely the "right" way to check this flag, so:
>
> No, this is definitely not right way to check it. What exactly does this
> spinlock protect? It seems that the intent is to make sure we do not
> call mwifiex_shutdown_drv() while mwifiex_main_process() is executing.
> It even says above that we are "waiting" for it (but we do not, we may
> bail out or we may not, depends on luck).
>
> To implement this properly you not only need to take a lock and check
> the flag, but keep the lock until mwifiex_shutdown_drv() is done, or use
> some other way to let mwifiex_main_process() know it should not access
> the adapter while mwifiex_shutdown_drv() is running.
>
> NACK.
>
As I explained in other email, here is the sequence.
1) We find mwifiex_processing is true in mwifiex_shitdown_drv(). "-EINPROGRESS" is returned by the function and hw_status state is changed to MWIFIEX_HW_STATUS_CLOSING.
2) We wait until main_process is completed.
if (mwifiex_shutdown_drv(adapter) == -EINPROGRESS)
wait_event_interruptible(adapter->init_wait_q,
adapter->init_wait_q_woken);
3) We have following check at the end of main_process()
exit_main_proc:
if (adapter->hw_status == MWIFIEX_HW_STATUS_CLOSING)
mwifiex_shutdown_drv(adapter);
4) Here at the end of mwifiex_shutdown_drv(), we are setting "adapter->init_wait_q_woken" and waking up the thread in point (2)
Regards,
Amitkumar
^ permalink raw reply
* RE: [PATCH 5/5] mwifiex: wait for firmware dump completion in remove_card
From: Amitkumar Karwar @ 2016-10-25 16:20 UTC (permalink / raw)
To: Dmitry Torokhov
Cc: linux-wireless@vger.kernel.org, Cathy Luo, Nishant Sarmukadam,
briannorris@google.com, Xinming Hu
In-Reply-To: <20161025001405.GE15034@dtor-ws>
Hi Dmitry,
> From: Dmitry Torokhov [mailto:dmitry.torokhov@gmail.com]
> Sent: Tuesday, October 25, 2016 5:44 AM
> To: Amitkumar Karwar
> Cc: linux-wireless@vger.kernel.org; Cathy Luo; Nishant Sarmukadam;
> briannorris@google.com; Xinming Hu
> Subject: Re: [PATCH 5/5] mwifiex: wait for firmware dump completion in
> remove_card
>
> On Mon, Oct 24, 2016 at 07:51:32PM +0530, Amitkumar Karwar wrote:
> > From: Xinming Hu <huxm@marvell.com>
> >
> > This patch ensures to wait for firmware dump completion in
> > mwifiex_remove_card().
> >
> > For sdio interface, reset_trigger variable is used to identify if
> > mwifiex_sdio_remove() is called by sdio_work during reset or the call
> > is from sdio subsystem.
> >
> > This patch fixes a kernel crash observed during reboot when firmware
> > dump operation is in process.
> >
> > Signed-off-by: Xinming Hu <huxm@marvell.com>
> > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> > ---
> > drivers/net/wireless/marvell/mwifiex/pcie.c | 2 ++
> > drivers/net/wireless/marvell/mwifiex/sdio.c | 15 ++++++++++++++-
> > 2 files changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > index 986bf07..4512e86 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > @@ -528,6 +528,8 @@ static void mwifiex_pcie_remove(struct pci_dev
> *pdev)
> > if (!adapter || !adapter->priv_num)
> > return;
> >
> > + cancel_work_sync(&pcie_work);
> > +
> > if (user_rmmod && !adapter->mfg_mode) { #ifdef CONFIG_PM_SLEEP
> > if (adapter->is_suspended)
> > diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > index 4cad1c2..f974260 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > @@ -46,6 +46,15 @@
> > */
> > static u8 user_rmmod;
> >
> > +/* reset_trigger variable is used to identify if
> > +mwifiex_sdio_remove()
> > + * is called by sdio_work during reset or the call is from sdio
> subsystem.
> > + * We will cancel sdio_work only if the call is from sdio subsystem.
> > + */
> > +static u8 reset_triggered;
>
> It would be really great if the driver supported multiple devices. IOW
> please avoid module-globals.
You are right. As Brian pointed out in other email, I will refactor the code and get rid of global variable.
>
> > +
> > +static void mwifiex_sdio_work(struct work_struct *work); static
> > +DECLARE_WORK(sdio_work, mwifiex_sdio_work);
> > +
> > static struct mwifiex_if_ops sdio_ops; static unsigned long
> > iface_work_flags;
> >
> > @@ -289,6 +298,9 @@ mwifiex_sdio_remove(struct sdio_func *func)
> > if (!adapter || !adapter->priv_num)
> > return;
> >
> > + if (!reset_triggered)
> > + cancel_work_sync(&sdio_work);
> > +
> > mwifiex_dbg(adapter, INFO, "info: SDIO func num=%d\n", func->num);
> >
> > if (user_rmmod && !adapter->mfg_mode) { @@ -2290,7 +2302,9 @@
> static
> > void mwifiex_recreate_adapter(struct sdio_mmc_card *card)
> > * discovered and initializes them from scratch.
> > */
> >
> > + reset_triggered = 1;
> > mwifiex_sdio_remove(func);
> > + reset_triggered = 0;
>
> What exactly happens if I trigger mwifiex_sdio_remove() from sysfs at
> the same time this code is running?
Yes. There would be a race. It will be avoided with Brian's code refactoring approach.
Regards,
Amitkumar
^ permalink raw reply
* RE: [PATCH 3/5] mwifiex: do not free firmware dump memory in shutdown_drv
From: Amitkumar Karwar @ 2016-10-25 16:23 UTC (permalink / raw)
To: Dmitry Torokhov
Cc: linux-wireless@vger.kernel.org, Cathy Luo, Nishant Sarmukadam,
briannorris@google.com, Xinming Hu
In-Reply-To: <20161025001716.GF15034@dtor-ws>
Hi Dmitry,
> From: Dmitry Torokhov [mailto:dmitry.torokhov@gmail.com]
> Sent: Tuesday, October 25, 2016 5:47 AM
> To: Amitkumar Karwar
> Cc: linux-wireless@vger.kernel.org; Cathy Luo; Nishant Sarmukadam;
> briannorris@google.com; Xinming Hu
> Subject: Re: [PATCH 3/5] mwifiex: do not free firmware dump memory in
> shutdown_drv
>
> On Mon, Oct 24, 2016 at 07:51:30PM +0530, Amitkumar Karwar wrote:
> > From: Xinming Hu <huxm@marvell.com>
> >
> > mwifiex_upload_device_dump() already takes care of freeing firmware
> > dump memory. Doing the same thing in mwifiex_shutdown_drv() is
> redundant.
> >
> > Signed-off-by: Xinming Hu <huxm@marvell.com>
> > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> > ---
> > drivers/net/wireless/marvell/mwifiex/init.c | 19 -------------------
> > 1 file changed, 19 deletions(-)
> >
> > diff --git a/drivers/net/wireless/marvell/mwifiex/init.c
> > b/drivers/net/wireless/marvell/mwifiex/init.c
> > index 8e5e424..365efb8 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/init.c
> > +++ b/drivers/net/wireless/marvell/mwifiex/init.c
> > @@ -407,8 +407,6 @@ static void mwifiex_free_lock_list(struct
> > mwifiex_adapter *adapter) static void mwifiex_adapter_cleanup(struct
> > mwifiex_adapter *adapter) {
> > - int idx;
> > -
> > if (!adapter) {
> > pr_err("%s: adapter is NULL\n", __func__);
> > return;
> > @@ -426,23 +424,6 @@ mwifiex_adapter_cleanup(struct mwifiex_adapter
> *adapter)
> > mwifiex_dbg(adapter, INFO, "info: free cmd buffer\n");
> > mwifiex_free_cmd_buffer(adapter);
> >
> > - for (idx = 0; idx < adapter->num_mem_types; idx++) {
> > - struct memory_type_mapping *entry =
> > - &adapter->mem_type_mapping_tbl[idx];
> > -
> > - if (entry->mem_ptr) {
> > - vfree(entry->mem_ptr);
> > - entry->mem_ptr = NULL;
> > - }
> > - entry->mem_size = 0;
> > - }
> > -
> > - if (adapter->drv_info_dump) {
> > - vfree(adapter->drv_info_dump);
> > - adapter->drv_info_dump = NULL;
> > - adapter->drv_info_size = 0;
> > - }
>
> Why do you even keep the pointer to dump memory in the adapter
> structure? You allocate it in mwifiex_drv_info_dump() and immediately
> use it in mwifiex_upload_device_dump(). Why not simply pass the pointer
> between the functions?
>
Thanks. This makes sense. I will pass the pointer and get rid of adapter variable in updated version.
Regards,
Amitkumar
^ permalink raw reply
* RE: [PATCH 5/5] mwifiex: wait for firmware dump completion in remove_card
From: Amitkumar Karwar @ 2016-10-25 16:30 UTC (permalink / raw)
To: Brian Norris
Cc: linux-wireless@vger.kernel.org, Cathy Luo, Nishant Sarmukadam,
dmitry.torokhov@gmail.com, Xinming Hu
In-Reply-To: <20161024202332.GD968@localhost>
Hi Brian,
> From: Brian Norris [mailto:briannorris@chromium.org]
> Sent: Tuesday, October 25, 2016 1:54 AM
> To: Amitkumar Karwar
> Cc: linux-wireless@vger.kernel.org; Cathy Luo; Nishant Sarmukadam;
> dmitry.torokhov@gmail.com; Xinming Hu
> Subject: Re: [PATCH 5/5] mwifiex: wait for firmware dump completion in
> remove_card
>
> On Mon, Oct 24, 2016 at 07:51:32PM +0530, Amitkumar Karwar wrote:
> > From: Xinming Hu <huxm@marvell.com>
> >
> > This patch ensures to wait for firmware dump completion in
> > mwifiex_remove_card().
> >
> > For sdio interface, reset_trigger variable is used to identify if
> > mwifiex_sdio_remove() is called by sdio_work during reset or the call
> > is from sdio subsystem.
> >
> > This patch fixes a kernel crash observed during reboot when firmware
> > dump operation is in process.
> >
> > Signed-off-by: Xinming Hu <huxm@marvell.com>
> > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> > ---
> > drivers/net/wireless/marvell/mwifiex/pcie.c | 2 ++
> > drivers/net/wireless/marvell/mwifiex/sdio.c | 15 ++++++++++++++-
> > 2 files changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > index 986bf07..4512e86 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > @@ -528,6 +528,8 @@ static void mwifiex_pcie_remove(struct pci_dev
> *pdev)
> > if (!adapter || !adapter->priv_num)
> > return;
> >
> > + cancel_work_sync(&pcie_work);
>
> Is there a good reason you have to cancel the work? What if you were
> just to finish it (i.e., flush_work())?
I assume if work is running, cancel_work_sync() will wait until completion. Otherwise it will cancel queued work. Firmware dump takes 20-30 seconds to complete. I think, it's ok to cancel it if work is just queued and not running yet. Reboot taking significant time due to our wait in remove handler may not be a good experience from end user point of view.
If you prefer flush_work(), I can use the same.
>
> In any case, I think this is fine, so for the PCIe bits:
>
> Reviewed-by: Brian Norris <briannorris@chromium.org>
>
> > +
> > if (user_rmmod && !adapter->mfg_mode) { #ifdef CONFIG_PM_SLEEP
> > if (adapter->is_suspended)
> > diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > index 4cad1c2..f974260 100644
> > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > @@ -46,6 +46,15 @@
> > */
> > static u8 user_rmmod;
> >
> > +/* reset_trigger variable is used to identify if
> > +mwifiex_sdio_remove()
> > + * is called by sdio_work during reset or the call is from sdio
> subsystem.
> > + * We will cancel sdio_work only if the call is from sdio subsystem.
> > + */
> > +static u8 reset_triggered;
> > +
> > +static void mwifiex_sdio_work(struct work_struct *work); static
> > +DECLARE_WORK(sdio_work, mwifiex_sdio_work);
> > +
> > static struct mwifiex_if_ops sdio_ops; static unsigned long
> > iface_work_flags;
> >
> > @@ -289,6 +298,9 @@ mwifiex_sdio_remove(struct sdio_func *func)
> > if (!adapter || !adapter->priv_num)
> > return;
> >
> > + if (!reset_triggered)
> > + cancel_work_sync(&sdio_work);
> > +
> > mwifiex_dbg(adapter, INFO, "info: SDIO func num=%d\n", func->num);
> >
> > if (user_rmmod && !adapter->mfg_mode) { @@ -2290,7 +2302,9 @@
> static
> > void mwifiex_recreate_adapter(struct sdio_mmc_card *card)
> > * discovered and initializes them from scratch.
> > */
> >
> > + reset_triggered = 1;
> > mwifiex_sdio_remove(func);
> > + reset_triggered = 0;
>
> Boy that's ugly! Couldn't you just create something like
> __mwifiex_sdio_remove(), which does everything except the
> cancel_work_sync()? Then you'd do this for the .remove() callback:
>
> cancel_work_sync(&sdio_work);
> __mwifiex_sdio_remove(func);
>
> and for mwifiex_recreate_adapter() you'd just call
> __mwifiex_sdio_remove()? The static variable that simply serves as a
> (non-reentrant) function call parameter seems like a really poor
> alternative to this simple refactoring.
Thanks a lot for the suggestion.
I will use this approach in updated version.
Regards,
Amitkumar
^ permalink raw reply
* Re: [PATCH 2/5] mwifiex: use spinlock for 'mwifiex_processing' in shutdown_drv
From: Dmitry Torokhov @ 2016-10-25 16:35 UTC (permalink / raw)
To: Amitkumar Karwar
Cc: Brian Norris, linux-wireless@vger.kernel.org, Cathy Luo,
Nishant Sarmukadam
In-Reply-To: <91f0f4390ac14afc9e4f3498d1b79c78@SC-EXCH04.marvell.com>
On Tue, Oct 25, 2016 at 04:11:14PM +0000, Amitkumar Karwar wrote:
> Hi Dmitry,
>
> > From: Dmitry Torokhov [mailto:dmitry.torokhov@gmail.com]
> > Sent: Tuesday, October 25, 2016 5:28 AM
> > To: Brian Norris
> > Cc: Amitkumar Karwar; linux-wireless@vger.kernel.org; Cathy Luo; Nishant
> > Sarmukadam
> > Subject: Re: [PATCH 2/5] mwifiex: use spinlock for 'mwifiex_processing'
> > in shutdown_drv
> >
> > On Mon, Oct 24, 2016 at 12:19:15PM -0700, Brian Norris wrote:
> > > On Mon, Oct 24, 2016 at 07:51:29PM +0530, Amitkumar Karwar wrote:
> > > > This variable is guarded by spinlock at all other places. This patch
> > > > takes care of missing spinlock usage in mwifiex_shutdown_drv().
> > > >
> > > > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> > > > ---
> > > > drivers/net/wireless/marvell/mwifiex/init.c | 3 +++
> > > > 1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/drivers/net/wireless/marvell/mwifiex/init.c
> > > > b/drivers/net/wireless/marvell/mwifiex/init.c
> > > > index 82839d9..8e5e424 100644
> > > > --- a/drivers/net/wireless/marvell/mwifiex/init.c
> > > > +++ b/drivers/net/wireless/marvell/mwifiex/init.c
> > > > @@ -670,11 +670,14 @@ mwifiex_shutdown_drv(struct mwifiex_adapter
> > > > *adapter)
> > > >
> > > > adapter->hw_status = MWIFIEX_HW_STATUS_CLOSING;
> > > > /* wait for mwifiex_process to complete */
> > > > + spin_lock_irqsave(&adapter->main_proc_lock, flags);
> > > > if (adapter->mwifiex_processing) {
> > >
> > > I'm not quite sure why we have this check in the first place; if the
> > > main process is still running, then don't we have bigger problems here
> > > anyway? But this is definitely the "right" way to check this flag, so:
> >
> > No, this is definitely not right way to check it. What exactly does this
> > spinlock protect? It seems that the intent is to make sure we do not
> > call mwifiex_shutdown_drv() while mwifiex_main_process() is executing.
> > It even says above that we are "waiting" for it (but we do not, we may
> > bail out or we may not, depends on luck).
> >
> > To implement this properly you not only need to take a lock and check
> > the flag, but keep the lock until mwifiex_shutdown_drv() is done, or use
> > some other way to let mwifiex_main_process() know it should not access
> > the adapter while mwifiex_shutdown_drv() is running.
> >
> > NACK.
> >
>
> As I explained in other email, here is the sequence.
> 1) We find mwifiex_processing is true in mwifiex_shitdown_drv(). "-EINPROGRESS" is returned by the function and hw_status state is changed to MWIFIEX_HW_STATUS_CLOSING.
> 2) We wait until main_process is completed.
>
> if (mwifiex_shutdown_drv(adapter) == -EINPROGRESS)
> wait_event_interruptible(adapter->init_wait_q,
> adapter->init_wait_q_woken);
>
> 3) We have following check at the end of main_process()
>
> exit_main_proc:
> if (adapter->hw_status == MWIFIEX_HW_STATUS_CLOSING)
> mwifiex_shutdown_drv(adapter);
>
> 4) Here at the end of mwifiex_shutdown_drv(), we are setting "adapter->init_wait_q_woken" and waking up the thread in point (2)
1. We do not find mwifiex_processing to be true
2. We proceed to try and shut down the driver
3. Interrupt is raised in the meantime, kicking work item
4. mwifiex_main_process() sees that adapter->hw_status is
MWIFIEX_HW_STATUS_CLOSING and jumps to exit_main_proc
5. mwifiex_main_process() calls into mwifiex_shutdown_drv() that is now
racing with another copy of the same.
It seems to me that mwifiex_main_process() is [almost] always used from
a workqueue. Can you instead of sprinkling spinlocks ensure that
mwifiex_shutdown_drv():
1. Inhibits scheduling of mwifiex_main_process()
2. Does cancel_work_sync(...) to ensure that mwifiex_main_process() does
not run
3. Continues shutting down the driver.
Alternatively, do these have to be spinlocks? Can you make them mutexes
and take them for the duration of mwifiex_main_process() and
mwifiex_shutdown_drv() and others, as needed?
Thanks.
--
Dmitry
^ permalink raw reply
* Re: [PATCH v4 1/3] mwifiex: reset card->adapter during device unregister
From: Brian Norris @ 2016-10-25 16:54 UTC (permalink / raw)
To: Amitkumar Karwar
Cc: linux-wireless@vger.kernel.org, Cathy Luo, Nishant Sarmukadam,
rajatja@google.com, Xinming Hu, abhishekbh@google.com,
Dmitry Torokhov
In-Reply-To: <933ec70e65a6470ca0654d1a5b5c4496@SC-EXCH04.marvell.com>
Hi Amit,
On Tue, Oct 25, 2016 at 03:12:44PM +0000, Amitkumar Karwar wrote:
> > From: Brian Norris [mailto:briannorris@chromium.org]
> > Sent: Tuesday, October 25, 2016 6:22 AM
> > To: Amitkumar Karwar
> >
> > On Thu, Oct 20, 2016 at 01:11:31PM +0000, Amitkumar Karwar wrote:
> > > > From: Brian Norris [mailto:briannorris@chromium.org]
> > > > Sent: Tuesday, October 11, 2016 5:53 AM
> > > > To: Amitkumar Karwar
> > > >
> > > > On Mon, Oct 10, 2016 at 01:53:32PM -0700, Brian Norris wrote:
> > > > > On Thu, Oct 06, 2016 at 11:36:24PM +0530, Amitkumar Karwar wrote:
> > > > > > diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > > b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > > index f1eeb73..ba9e068 100644
> > > > > > --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > > +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > > @@ -3042,6 +3042,7 @@ static void mwifiex_unregister_dev(struct
> > > > mwifiex_adapter *adapter)
> > > > > > pci_disable_msi(pdev);
> > > > > > }
> > > > > > }
> > > > > > + card->adapter = NULL;
> > > > > > }
> > > > > >
> > > > > > /* This function initializes the PCI-E host memory space, WCB
> > > > rings, etc.
> > > > > > diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > > b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > > index 8718950..4cad1c2 100644
> > > > > > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > > @@ -2066,6 +2066,7 @@ mwifiex_unregister_dev(struct
> > > > > > mwifiex_adapter
> > > > *adapter)
> > > > > > struct sdio_mmc_card *card = adapter->card;
> > > > > >
> > > > > > if (adapter->card) {
> > > > > > + card->adapter = NULL;
> > > > > > sdio_claim_host(card->func);
> > > > > > sdio_disable_func(card->func);
> > > > > > sdio_release_host(card->func);
> > > > >
[...]
> > > Thanks for your comments.
> > >
> > > Actually description for this patch was ambiguous and incorrect. Sorry
> > > for that. This patch doesn't fix any race. In fact, we don't have a
> > > race between init and remove threads due to semaphore usage as per
> > > design. This patch just adds missing "card->adapter=NULL" so that when
> > > teardown/remove thread starts after init failure, it won't try freeing
> > > already freed things.
> >
> > So to be clear, you'r talking about mwifiex_fw_dpc(), which in the error
> > path
> > has:
> >
> > if (adapter->if_ops.unregister_dev)
> > adapter->if_ops.unregister_dev(adapter); <--- POINT A:
> > This is where you want to set ->adapter = NULL ...
> > if (init_failed)
> > mwifiex_free_adapter(adapter);
> > up(sem); <--- POINT B: This is where you release the semaphore,
> > which is supposed to guarantee that remove() isn't happening
> > return;
> > }
> >
> > But you *do* have a race between the above code and the remove code in
> > some cases. Particularly, see this:
> >
> > static void mwifiex_pcie_remove(struct pci_dev *pdev) {
> > struct pcie_service_card *card;
> > struct mwifiex_adapter *adapter;
> > struct mwifiex_private *priv;
> >
> > card = pci_get_drvdata(pdev);
> > if (!card)
> > return;
> >
> > adapter = card->adapter; <--- POINT C: This can execute at the
> > same time as unregister_dev()
> > if (!adapter || !adapter->priv_num)
> > return;
> >
> > if (user_rmmod && !adapter->mfg_mode) { #ifdef CONFIG_PM_SLEEP
> > if (adapter->is_suspended)
> > mwifiex_pcie_resume(&pdev->dev); #endif
> >
> > mwifiex_deauthenticate_all(adapter);
> >
> > priv = mwifiex_get_priv(adapter, MWIFIEX_BSS_ROLE_ANY);
> >
> > mwifiex_disable_auto_ds(priv);
> >
> > mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
> > }
> >
> > mwifiex_remove_card(card->adapter, &add_remove_card_sem); <---
> > POINT D: You only grab the semaphore here }
> >
> > So IIUC, you have a race **even here, where you claim the semaphore
> > should protect you** such that the adapter might be freed, but you still
> > can access it anywhere between C and D. i.e., you can see this:
> >
> > Thread 1 Thread 2
> > (1) POINT C (retrieve adapter !=
> > NULL)
> > (2) POINT A (set adapter NULL)
> > (3) POINT B (adapter has been freed)
> > (3) ....Keep accessing freed
> > adapter structure
> > (4) POINT D - acquire semaphore,
> > but we're too late
> >
> > Step 3 is an error, and AFAICT, that's exactly what you're trying to
> > solve in this patch. It essentially comes down to the same fact: you're
> > getting a reference to the adapter structure *without* any protection at
> > all. Your add_remove_card_sem is *almost* the right thing to resolve
> > this, but you still don't have the ordering quite right.
>
> Code between POINT C and POINT D won't come into picture for "init +
> reboot" scenario(Case 1 below) which we are talking here. Reason is
> "user_rmmod" flag will be false.
Wait, but doesn't mwifiex_pcie_shutdown() set 'user_rmmod = 1'? And
whether or not user_rmmod is set, you still have a problem. See below.
> Basically we have 3 teardown cases
> 1) System shutdown (or manually remove wifi card) -- Only mwifiex_pcie_remove() gets called.
This is wrong, due to above. But that isn't key either. Still see below.
> 2) User unloading the driver -- mwifiex_pcie_cleanup_module() followed by mwifiex_pcie_remove() gets called.
This case is sort of OK, because you grab the semaphore first in this
function. (You then release it, but that's a different problem. Probably
not going to cause problems, but it still feels wrong.)
> 3) Chip isn't connected. User unloaded the driver -- Only mwifiex_pcie_cleanup_module() gets
> called.
>
> In case 2, we have already waited for semaphore in
> mwifiex_pcie_cleanup_module(). So by the time we execute
> mwifiex_pcie_remove(), "card->adapter" is NULL.
Yes, that's (mostly) fine.
> Case 3, doesn't use "card->adapter"
That case is fine.
But you haven't addressed 1 properly. Whether or not you have user_rmmod
= 1, you still have this code that uses adapter:
if (!adapter || !adapter->priv_num)
return;
The ->priv_num dereference can be a user-after-free.
You also have the problem below which you argue is "very small." That's
not a real argument. At large enough scale, "small" problems become
noticeable. And with sufficiently complicated compilers and/or
out-of-order parallel processors, races can become issues later, even if
they aren't today. So wrong code is wrong.
> > Maybe you could solve this all by acquiring the semaphore in suspend(),
> > remove(), shutdown(), before you ever acquire the card->adapter? If you
> > do that first (to fix the race), and only *then* do you submit the
> > $subject patch, then you might have fixed all the races in question (at
> > least between FW init and {suspend,resume,remove,shutdown} -- you have
> > plenty of other races still, both known and unknown).
>
> "add_remove_card_sem" is used only for init and teardown threads as
> per our design. V5 4/4 takes care of race between "init fail +
> suspend". Please let us know if you see any other race situation.
No, patch 4 does not handle the problem. It's almost exactly the same
problem as I'm trying to describe here. But it's becoming more and more
clear that you do not understand the problem here. Let's focus on this
patch, and then we can try and shift our gaze to your other problems.
> >
> > [[ Also, a side note on POINT D: it's possible that even though you're
> > retrieving card->adapter in the argument to that function call, it's
> > possible the compiler will notice that this is redundant with the
> > retrieval at POINT C and just use that one. But even if not, there's a
> > window between "retrieve function argument" and "grab semaphore in
> > mwifiex_remove_card()". ]]
>
> I had not considered that compiler will notice this is redundant and
> use "card->adapter" assigned at POINT C.
Compiler optimization isn't really the point here. Even if the compiler
doesn't coalesce these dereferences, you have a problem.
> But this window is very small compared to the window between in
> "card->adapter = NULL" and releasing semaphore in
> other(mwifiex_fw_dpc()) thread.
I can't even listen to that argument. A "small window" of incorrectness
is still incorrect.
> Even if "card->adapter" is set to NULL in mwifiex_fw_dpc() after we
> checked it at POINT C, we will return from mwifiex_remove_card()
> without grabing semaphore. mwifiex_fw_dpc() wouldn't have released the
> semaphore at that point of time.
Did you read my sequence? Remember things are happening concurrently.
But I claimed that we can have this overall interleaving of events:
C -> A -> B -> D
Remember that 'adapter' is freed between A and B, and the semaphore is
released right at B. So D is free to both grab the semaphore and
dereference the 'adapter' pointer that we read in POINT C.
Note that this all works exactly the same even if POINT C is instead
moved to POINT C-prime at the end of mwifiex_pcie_remove():
mwifiex_remove_card(card->adapter, <--- POINT C-prime - where we read card->adapter again
&add_remove_card_sem);
The key point that I hope you're understanding here is you have no
synchronization between the end of the mwifiex_fw_dpc() thread and the
start of grabbing your 'adapter' pointer. So even if you do some kind of
synchronization *afterward*, it's pointless -- you already are planning
on using a pointer to freed memory.
> mwifiex_fw_dpc() will take care of performing cleanup.
> >
> > ---
> >
> > BTW I'm gonna sidetrack us here... what's up with this code, in pcie.c?
> >
> > if (!down_interruptible(&add_remove_card_sem))
> > up(&add_remove_card_sem);
> >
> > I can understand that you want to grab the semaphore for the remove
> > code, but why do you want to immediately release it? If you release it,
> > that must mean that someone else is trying to get it. And they won't
> > notice that you're tearing down the device...
>
> I don't see any problem in releasing the semaphore here. Only other
> thread which can acquire this semaphore is "init" thread.
> "init" thread won't get called when user is unloading the driver.
If no one can acquire it, then why are you releasing it? Seems
unnecessary, and borderline wrong.
> The purpose of immediately releasing it is we just wanted to wait
> until init is completed if it's running.
That's not a reason for releasing it. That's a reason for acquiring it.
> >(Also, why aren't you
> > handling the interrupted case?)
>
> The cleanup performed in this function is done without touching
> hardware OR referring adapter/card etc. It can be done even for the
> interrupted case.
Uh, but if you're interrupted, that means you *didn't* wait for the
other thread to complete. So all the safety about use-after-free that
we're trying to work on gets thrown away...
Seems like a case where either you don't use the interruptible version,
or else you should return an error like -EBUSY.
> > > I have updated patch description in v5 series.
> > >
> > > You are right. We may have a race between init failure and suspend
> > > thread. I have prepared 4th patch in my v5 series to address this.
> > >
> > > >
> > > > I'm going to look into converting to asynchronous device probing,
> > > > which might remove the need for async FW request, and would
> > > > therefore resolve both patch 1 and 3's races without any additional
> > > > complicated hacks. But I'm not sure if that will satisfy all mwifiex
> > > > users well enough. I'll have to give it a little more thought. Any
> > > > thoughts from your side, Amit?
> > > >
> > >
> > > This is not needed.
> >
> > Yeah, I ended up deciding this really wasn't that great of a solution.
> > For one, you also need to do firmware reloads for your PCIe reset
> > handling, and this happens in parallel to any device suspend. So:
> > (a) I'd bet that has plenty of race conditions and
> > (b) that means we can't just "load the firmware synchronously once and
> > forget about it"
> >
> > > 4th patch in V5 series would take care of this race.
> >
> > No, patch 4 does not handle this race, and it doesn't handle the race I
> > point out above either. You still can get a pointer card->adapter and
> > then see another thread subsequently free() it out from under you.
>
> We have used spinlock in v5 4/4 to guard "card->adapter".
> Please help me understand if it doesn't handle the race between "init fail" + suspend.
I think you do. But I'll have to get to that later. We have enough
problems here.
Brian
^ permalink raw reply
* Re: [PATCH v4 1/3] mwifiex: reset card->adapter during device unregister
From: Dmitry Torokhov @ 2016-10-25 16:56 UTC (permalink / raw)
To: Amitkumar Karwar
Cc: Brian Norris, linux-wireless@vger.kernel.org, Cathy Luo,
Nishant Sarmukadam, rajatja@google.com, Xinming Hu,
abhishekbh@google.com
In-Reply-To: <933ec70e65a6470ca0654d1a5b5c4496@SC-EXCH04.marvell.com>
On Tue, Oct 25, 2016 at 03:12:44PM +0000, Amitkumar Karwar wrote:
> Hi Brian,
>
> Thanks for review.
>
> > From: Brian Norris [mailto:briannorris@chromium.org]
> > Sent: Tuesday, October 25, 2016 6:22 AM
> > To: Amitkumar Karwar
> > Cc: linux-wireless@vger.kernel.org; Cathy Luo; Nishant Sarmukadam;
> > rajatja@google.com; Xinming Hu; abhishekbh@google.com; Dmitry Torokhov
> > Subject: Re: [PATCH v4 1/3] mwifiex: reset card->adapter during device
> > unregister
> >
> > Hi Amit,
> >
> > On Thu, Oct 20, 2016 at 01:11:31PM +0000, Amitkumar Karwar wrote:
> > > > From: Brian Norris [mailto:briannorris@chromium.org]
> > > > Sent: Tuesday, October 11, 2016 5:53 AM
> > > > To: Amitkumar Karwar
> > > > Cc: linux-wireless@vger.kernel.org; Cathy Luo; Nishant Sarmukadam;
> > > > rajatja@google.com; Xinming Hu; abhishekbh@google.com; Dmitry
> > > > Torokhov
> > > > Subject: Re: [PATCH v4 1/3] mwifiex: reset card->adapter during
> > > > device unregister
> > > >
> > > > On Mon, Oct 10, 2016 at 01:53:32PM -0700, Brian Norris wrote:
> > > > > On Thu, Oct 06, 2016 at 11:36:24PM +0530, Amitkumar Karwar wrote:
> > > > > > From: Xinming Hu <huxm@marvell.com>
> > > > > >
> > > > > > card->adapter gets initialized during device registration.
> > > > > > As it's not cleared, we may end up accessing invalid memory in
> > > > > > some corner cases. This patch fixes the problem.
> > > > > >
> > > > > > Signed-off-by: Xinming Hu <huxm@marvell.com>
> > > > > > Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
> > > > > > ---
> > > > > > v4: Same as v1, v2, v3
> > > > > > ---
> > > > > > drivers/net/wireless/marvell/mwifiex/pcie.c | 1 +
> > > > > > drivers/net/wireless/marvell/mwifiex/sdio.c | 1 +
> > > > > > 2 files changed, 2 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > > b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > > index f1eeb73..ba9e068 100644
> > > > > > --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > > +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > > > > > @@ -3042,6 +3042,7 @@ static void mwifiex_unregister_dev(struct
> > > > mwifiex_adapter *adapter)
> > > > > > pci_disable_msi(pdev);
> > > > > > }
> > > > > > }
> > > > > > + card->adapter = NULL;
> > > > > > }
> > > > > >
> > > > > > /* This function initializes the PCI-E host memory space, WCB
> > > > rings, etc.
> > > > > > diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > > b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > > index 8718950..4cad1c2 100644
> > > > > > --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > > +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> > > > > > @@ -2066,6 +2066,7 @@ mwifiex_unregister_dev(struct
> > > > > > mwifiex_adapter
> > > > *adapter)
> > > > > > struct sdio_mmc_card *card = adapter->card;
> > > > > >
> > > > > > if (adapter->card) {
> > > > > > + card->adapter = NULL;
> > > > > > sdio_claim_host(card->func);
> > > > > > sdio_disable_func(card->func);
> > > > > > sdio_release_host(card->func);
> > > > >
> > > > > As discussed on v1, I had qualms about the raciness between
> > > > > reads/writes of card->adapter, but I believe we:
> > > > > (a) can't have any command activity while writing the ->adapter
> > > > > field (either we're just init'ing the device, or we've disabled
> > > > > interrupts and are tearing it down) and
> > > > > (b) can't have a race between suspend()/resume() and
> > > > > unregister_dev(), since unregister_dev() is called from device
> > > > > remove() (which should not be concurrent with suspend()).
> > > > >
> > > > > Also, I thought you had the same problem in usb.c, but in fact,
> > > > > you fixed that ages ago here:
> > > > >
> > > > > 353d2a69ea26 mwifiex: fix issues in driver unload path for USB
> > > > > chipsets
> > > > >
> > > > > Would be nice if fixes were bettery synchronized across the three
> > > > > interface drivers you support. We seem to be discovering
> > > > > unnecessary divergence on a few points recently.
> > > > >
> > > > > At any rate:
> > > > >
> > > > > Reviewed-by: Brian Norris <briannorris@chromium.org>
> > > > > Tested-by: Brian Norris <briannorris@chromium.org>
> > > >
> > > > Dmitry helped me re-realize my original qualms:
> > > >
> > > > mwifiex_unregister_dev() is called in the failure path for your
> > > > async FW request, and so it may race with suspend(). So I retract my
> > Reviewed-by.
> > > > Sorry.
> > >
> > > Thanks for your comments.
> > >
> > > Actually description for this patch was ambiguous and incorrect. Sorry
> > > for that. This patch doesn't fix any race. In fact, we don't have a
> > > race between init and remove threads due to semaphore usage as per
> > > design. This patch just adds missing "card->adapter=NULL" so that when
> > > teardown/remove thread starts after init failure, it won't try freeing
> > > already freed things.
> >
> > So to be clear, you'r talking about mwifiex_fw_dpc(), which in the error
> > path
> > has:
> >
> > if (adapter->if_ops.unregister_dev)
> > adapter->if_ops.unregister_dev(adapter); <--- POINT A:
> > This is where you want to set ->adapter = NULL ...
> > if (init_failed)
> > mwifiex_free_adapter(adapter);
> > up(sem); <--- POINT B: This is where you release the semaphore,
> > which is supposed to guarantee that remove() isn't happening
> > return;
> > }
> >
> > But you *do* have a race between the above code and the remove code in
> > some cases. Particularly, see this:
> >
> > static void mwifiex_pcie_remove(struct pci_dev *pdev) {
> > struct pcie_service_card *card;
> > struct mwifiex_adapter *adapter;
> > struct mwifiex_private *priv;
> >
> > card = pci_get_drvdata(pdev);
> > if (!card)
> > return;
> >
> > adapter = card->adapter; <--- POINT C: This can execute at the
> > same time as unregister_dev()
> > if (!adapter || !adapter->priv_num)
> > return;
> >
> > if (user_rmmod && !adapter->mfg_mode) { #ifdef CONFIG_PM_SLEEP
> > if (adapter->is_suspended)
> > mwifiex_pcie_resume(&pdev->dev); #endif
> >
> > mwifiex_deauthenticate_all(adapter);
> >
> > priv = mwifiex_get_priv(adapter, MWIFIEX_BSS_ROLE_ANY);
> >
> > mwifiex_disable_auto_ds(priv);
> >
> > mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
> > }
> >
> > mwifiex_remove_card(card->adapter, &add_remove_card_sem); <---
> > POINT D: You only grab the semaphore here }
> >
> > So IIUC, you have a race **even here, where you claim the semaphore
> > should protect you** such that the adapter might be freed, but you still
> > can access it anywhere between C and D. i.e., you can see this:
> >
> > Thread 1 Thread 2
> > (1) POINT C (retrieve adapter !=
> > NULL)
> > (2) POINT A (set adapter NULL)
> > (3) POINT B (adapter has been freed)
> > (3) ....Keep accessing freed
> > adapter structure
> > (4) POINT D - acquire semaphore,
> > but we're too late
> >
> > Step 3 is an error, and AFAICT, that's exactly what you're trying to
> > solve in this patch. It essentially comes down to the same fact: you're
> > getting a reference to the adapter structure *without* any protection at
> > all. Your add_remove_card_sem is *almost* the right thing to resolve
> > this, but you still don't have the ordering quite right.
>
> Code between POINT C and POINT D won't come into picture for "init + reboot" scenario(Case 1 below) which we are talking here. Reason is "user_rmmod" flag will be false.
>
> Basically we have 3 teardown cases
> 1) System shutdown (or manually remove wifi card) -- Only mwifiex_pcie_remove() gets called.
> 2) User unloading the driver -- mwifiex_pcie_cleanup_module() followed by mwifiex_pcie_remove() gets called.
> 3) Chip isn't connected. User unloaded the driver -- Only mwifiex_pcie_cleanup_module() gets
> called.
>
> In case 2, we have already waited for semaphore in mwifiex_pcie_cleanup_module(). So by the time we execute mwifiex_pcie_remove(), "card->adapter" is NULL.
> Case 3, doesn't use "card->adapter"
Case 4: echo "0000:03:00.0" > /sys/bus/pci/drivers/mwifiex/unbind
This does not play with the semaphore, does not resume device, does
not deauthenticate calls, etc, races happily with another thread.
By the way, "saving" adapter for the dump is not nice if you unbind it
in the mean time. Your pcie_work may be executing after
mwifiex_pcie_remove() is called.
Thanks.
--
Dmitry
^ permalink raw reply
* Re: [PATCH] cw1200: fix bogus maybe-uninitialized warning
From: Solomon Peachy @ 2016-10-25 18:13 UTC (permalink / raw)
To: David Laight
Cc: 'Arnd Bergmann', Kalle Valo, Johannes Berg,
linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB0209E9A@AcuExch.aculab.com>
[-- Attachment #1: Type: text/plain, Size: 839 bytes --]
On Tue, Oct 25, 2016 at 01:24:55PM +0000, David Laight wrote:
> > - if (count > 1) {
> > - /* We already released one buffer, now for the rest */
> > - ret = wsm_release_tx_buffer(priv, count - 1);
> > - if (ret < 0)
> > - return ret;
> > - else if (ret > 0)
> > - cw1200_bh_wakeup(priv);
> > - }
> > + /* We already released one buffer, now for the rest */
> > + ret = wsm_release_tx_buffer(priv, count - 1);
> > + if (ret < 0)
> > + return ret;
> > +
> > + if (ret > 0)
> > + cw1200_bh_wakeup(priv);
>
> That doesn't look equivalent to me (when count == 1).
I concur, this patch should not be applied in its current form.
- Solomon
--
Solomon Peachy pizza at shaftnet dot org
Delray Beach, FL ^^ (email/xmpp) ^^
Quidquid latine dictum sit, altum viditur.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 163 bytes --]
^ permalink raw reply
* Re: [PATCH] cw1200: fix bogus maybe-uninitialized warning
From: Arnd Bergmann @ 2016-10-25 20:19 UTC (permalink / raw)
To: David Laight
Cc: Solomon Peachy, Kalle Valo, Johannes Berg,
linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB0209E9A@AcuExch.aculab.com>
On Tuesday, October 25, 2016 1:24:55 PM CEST David Laight wrote:
> > diff --git a/drivers/net/wireless/st/cw1200/wsm.c b/drivers/net/wireless/st/cw1200/wsm.c
> > index 680d60eabc75..094e6637ade2 100644
> > --- a/drivers/net/wireless/st/cw1200/wsm.c
> > +++ b/drivers/net/wireless/st/cw1200/wsm.c
> > @@ -385,14 +385,13 @@ static int wsm_multi_tx_confirm(struct cw1200_common *priv,
> > if (WARN_ON(count <= 0))
> > return -EINVAL;
> >
> > - if (count > 1) {
> > - /* We already released one buffer, now for the rest */
> > - ret = wsm_release_tx_buffer(priv, count - 1);
> > - if (ret < 0)
> > - return ret;
> > - else if (ret > 0)
> > - cw1200_bh_wakeup(priv);
> > - }
> > + /* We already released one buffer, now for the rest */
> > + ret = wsm_release_tx_buffer(priv, count - 1);
> > + if (ret < 0)
> > + return ret;
> > +
> > + if (ret > 0)
> > + cw1200_bh_wakeup(priv);
>
> That doesn't look equivalent to me (when count == 1).
Ah, that's what I missed, thanks for pointing that out!
> >
> > cw1200_debug_txed_multi(priv, count);
> > for (i = 0; i < count; ++i) {
>
> Convert this loop into a do ... while so the body executes at least once.
Good idea. Version 2 coming now.
Arnd
^ permalink raw reply
* [PATCH v2] cw1200: fix bogus maybe-uninitialized warning
From: Arnd Bergmann @ 2016-10-25 20:21 UTC (permalink / raw)
To: Solomon Peachy, Kalle Valo
Cc: David Laight, Arnd Bergmann, Johannes Berg, linux-wireless,
netdev, linux-kernel
On x86, the cw1200 driver produces a rather silly warning about the
possible use of the 'ret' variable without an initialization
presumably after being confused by the architecture specific definition
of WARN_ON:
drivers/net/wireless/st/cw1200/wsm.c: In function ‘wsm_handle_rx’:
drivers/net/wireless/st/cw1200/wsm.c:1457:9: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
We have already checked that 'count' is larger than 0 here, so
we know that 'ret' is initialized. Changing the 'for' loop
into do/while also makes this clear to the compiler.
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
drivers/net/wireless/st/cw1200/wsm.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
v2: rewrite based on David Laight's suggestion, the first version
was completely wrong.
diff --git a/drivers/net/wireless/st/cw1200/wsm.c b/drivers/net/wireless/st/cw1200/wsm.c
index 680d60eabc75..ed93bf3474ec 100644
--- a/drivers/net/wireless/st/cw1200/wsm.c
+++ b/drivers/net/wireless/st/cw1200/wsm.c
@@ -379,7 +379,6 @@ static int wsm_multi_tx_confirm(struct cw1200_common *priv,
{
int ret;
int count;
- int i;
count = WSM_GET32(buf);
if (WARN_ON(count <= 0))
@@ -395,11 +394,10 @@ static int wsm_multi_tx_confirm(struct cw1200_common *priv,
}
cw1200_debug_txed_multi(priv, count);
- for (i = 0; i < count; ++i) {
+ do {
ret = wsm_tx_confirm(priv, buf, link_id);
- if (ret)
- return ret;
- }
+ } while (!ret && --count);
+
return ret;
underflow:
--
2.9.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox