From mboxrd@z Thu Jan  1 00:00:00 1970
From: Michal Kazior <michal.kazior@tieto.com>
Date: Fri, 14 Jun 2013 13:46:32 +0200
Subject: [ath9k-devel] [PATCH 5/6] ath10k: wait for CE to drain when
 shutting down
In-Reply-To: <87ip1hopd9.fsf@kamboji.qca.qualcomm.com>
References: <1371040066-17631-1-git-send-email-michal.kazior@tieto.com>
	<1371040066-17631-6-git-send-email-michal.kazior@tieto.com>
	<87ip1hopd9.fsf@kamboji.qca.qualcomm.com>
Message-ID: <51BB0298.2070807@tieto.com>
List-Id: <ath9k-devel.lists.ath9k.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ath9k-devel@lists.ath9k.org

On 13/06/13 20:08, Kalle Valo wrote:
> Michal Kazior <michal.kazior@tieto.com> writes:
>
>> ath10k_pci_process_ce() is used to process
>> completions. Only one thread can do that though.
>>
>> If one thread starts handling completions then the
>> other one (i.e. possibly PCI shutdown) would exit
>> immediately and free up memory while completions
>> are being processed leading to corruption.
>>
>> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
>
> [...]
>
>> +/* This function assumes no new data is going to be submitted/completed. It is
>> + * mainly intended to flush out completions when stopping the device. */
>> +static void ath10k_pci_wait_for_ce_drain(struct ath10k *ar)
>> +{
>> +	struct ath10k_pci *ar_pci = ath10k_pci_priv(ar);
>> +	int ret;
>> +
>> +	ret = wait_event_timeout(ar_pci->compl_wq, ({
>> +			bool processing;
>> +			spin_lock_bh(&ar_pci->compl_lock);
>> +			processing = ar_pci->compl_processing;
>> +			spin_unlock_bh(&ar_pci->compl_lock);
>> +			(!processing);
>> +		}), 5*HZ);
>> +	if (ret == 0)
>> +		ath10k_warn("timed out while waiting for completions to be processed\n");
>> +}
>
> This looks like a hack to me. Wouldn't it be better to make sure
> that all threads/tasklets are stopped, for example with tasklet_kill()
> and cancel_work_sync()? (And of course first making sure that we don't
> fire new instances). That way we could be sure that there is no other
> thread running while we shut down.

Apparently I can't reproduce this bug anymore. I can't really locate
where the other thread could be coming from. Maybe an interrupt? We
don't unregister interrupt handlers at the point where the issue happens
(we only stop CE interrupts via registers).

Perhaps we can drop this patch for now and look for a proper solution 
later on?
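
For whenever this gets revisited, the ordering suggested above (stop
new instances first, then wait for the running ones, and only then free
memory) can be modelled in userspace. This is only an analogy under
stated assumptions, not ath10k code: pthread_join() stands in for
tasklet_kill()/cancel_work_sync(), and struct demo_dev, demo_start()
and demo_stop() are hypothetical names made up for the sketch.

```c
/* Userspace analogy only: a pthread stands in for the completion
 * tasklet/work item. All names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_dev {
	pthread_t worker;        /* stands in for the completion handler */
	atomic_bool stop;        /* set to prevent further processing */
	atomic_ulong processed;  /* completions handled so far */
	unsigned long *ring;     /* memory that teardown will free */
	unsigned long ring_len;
};

/* Keeps touching the ring until asked to stop. */
void *demo_worker(void *arg)
{
	struct demo_dev *dev = arg;
	unsigned long i = 0;

	while (!atomic_load(&dev->stop)) {
		dev->ring[i % dev->ring_len]++;
		atomic_fetch_add(&dev->processed, 1);
		i++;
	}
	return NULL;
}

/* Allocates the ring and starts the worker. Returns 0 on success. */
int demo_start(struct demo_dev *dev, unsigned long ring_len)
{
	dev->ring = calloc(ring_len, sizeof(*dev->ring));
	if (!dev->ring)
		return -1;
	dev->ring_len = ring_len;
	atomic_init(&dev->stop, false);
	atomic_init(&dev->processed, 0);
	if (pthread_create(&dev->worker, NULL, demo_worker, dev) != 0) {
		free(dev->ring);
		dev->ring = NULL;
		return -1;
	}
	return 0;
}

/* Teardown in the suggested order. Returns 0 if the ring contents
 * match the number of processed completions, -1 otherwise. */
int demo_stop(struct demo_dev *dev)
{
	unsigned long sum = 0, i;

	/* 1. Make sure no new work is started. */
	atomic_store(&dev->stop, true);
	/* 2. Wait for the in-flight handler to finish (the userspace
	 *    counterpart of tasklet_kill()/cancel_work_sync()). */
	pthread_join(dev->worker, NULL);
	/* 3. Nobody else can touch the ring now, so check and free it. */
	for (i = 0; i < dev->ring_len; i++)
		sum += dev->ring[i];
	free(dev->ring);
	dev->ring = NULL;
	return sum == atomic_load(&dev->processed) ? 0 : -1;
}
```

A caller would do demo_start(&dev, 8) followed by demo_stop(&dev). The
point of the ordering is that the free in step 3 cannot race with the
handler, so no timeout-based wait is needed.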


-- Pozdrawiam / Best regards, Michal Kazior.