From: Andreas Fenkart <andreas.fenkart@streamunlimited.com>
To: Bing Zhao <bzhao@marvell.com>
Cc: Andreas Fenkart <andreas.fenkart@streamunlimited.com>,
"linville@tuxdriver.com" <linville@tuxdriver.com>,
"linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>,
"daniel@zonque.org" <daniel@zonque.org>,
Yogesh Powar <yogeshp@marvell.com>,
Avinash Patil <patila@marvell.com>
Subject: Re: [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID.
Date: Wed, 3 Apr 2013 13:35:54 +0200 [thread overview]
Message-ID: <20130403113554.GA14785@blumentopf> (raw)
In-Reply-To: <477F20668A386D41ADCC57781B1F70430D9DDAB197@SC-VEXCH1.marvell.com>
Hi Bing,
On Tue, Apr 02, 2013 at 07:40:53PM -0700, Bing Zhao wrote:
>
> > Using NO_PKT_PRIO_TID and tx_pkts_queued to check for an empty state, can
> > lead to a contradictory state, resulting in an infinite loop.
> > Currently queueing and dequeuing of packets is not synchronized, and can
> > happen concurrently. While tx_pkts_queued is incremented when adding a
> > packet, max prio is set to NO_PKT when the WMM list is empty. If a packet
> > is added right after the check for empty, but before setting max prio to
> > NO_PKT, that packet is trapped and creates an infinite loop.
> > Because of the new packet, tx_pkts_queued is at least 1, indicating wmm
> > lists are not empty. Opposing that max prio is NO_PKT, which means "skip
> > this wmm queue, it has no packets". The infinite loop results, because the
> > main loop checks the wmm lists for not empty via tx_pkts_queued, but when
> > dequeing uses max_prio to see if it can skip a list. This will never end,
> > unless a new packet is added which will restore max prio to the level of
> > the trapped packet.
> > The solution here is to rely on tx_pkts_queued solely for checking wmm
> > queue to be empty, and drop the NO_PKT define. It does not address the
> > locking issue.
> >
> > Signed-off-by: Andreas Fenkart <andreas.fenkart@streamunlimited.com>
>
> With this patch (1/6) applied, I'm getting soft-lockup watchdog:
>
> BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:37]
My bad here, should be like this when patch is applied first:
@@ -919,8 +919,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct
mwifiex_adapter *adapter,
do {
priv_tmp = bssprio_node->priv;
- hqp = &priv_tmp->wmm.highest_queued_prio;
+ if (atomic_read(&priv_tmp->wmm.tx_pkts_queued) == 0)
+ goto skip_bss;
+
+ /* iterate over the WMM queues of the BSS */
+ hqp = &priv_tmp->wmm.highest_queued_prio;
for (i = atomic_read(hqp); i >= LOW_PRIO_TID; --i) {
tid_ptr = &(priv_tmp)->wmm.
@@ -980,12 +984,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
} while (ptr != head);
}
- /* No packet at any TID for this priv. Mark as
such
- * to skip checking TIDs for this priv (until
pkt is
- * added).
- */
- atomic_set(hqp, NO_PKT_PRIO_TID);
-
+skip_bss:
/* Get next bss priority node */
bssprio_node = list_first_entry(&bssprio_node->list,
struct mwifiex_bss_prio_node,
That said, yes I developed the pathset the other way round. First
cleaned up, until I knew how to fix the bug best. Then pulled the fix
in front of the cleanup patches and -- mea culpa -- didn't test the
patches individually. Sorry again.
Also found issue here, which could be a problem without patch 6/6:
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -688,13 +688,13 @@ mwifiex_wmm_add_buf_txqueue(struct mwifiex_private *priv,
ra_list->total_pkts_size += skb->len;
ra_list->pkt_count++;
- atomic_inc(&priv->wmm.tx_pkts_queued);
-
if (atomic_read(&priv->wmm.highest_queued_prio) <
tos_to_tid_inv[tid_down])
atomic_set(&priv->wmm.highest_queued_prio,
tos_to_tid_inv[tid_down]);
+ atomic_inc(&priv->wmm.tx_pkts_queued);
+
How should I proceed? Can I reorder patches to match my development
cycle, which is? 2-5;1;6 or more verbosely cleanup first followed
by bug fix and proper locking last
Or should keep the order as is, but fix patch 1, and propagate changes
through patch 2 till 6?
rgds,
Andi
next prev parent reply other threads:[~2013-04-03 11:36 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-03-19 9:52 mwifiex: infinite loop in mwifiex_main_process Andreas Fenkart
2013-03-19 22:37 ` Bing Zhao
2013-04-02 0:05 ` Andreas Fenkart
2013-04-02 0:08 ` [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID Andreas Fenkart
2013-04-02 0:08 ` [PATCH 2/6] mwifiex: bug: wrong list in list_empty check Andreas Fenkart
2013-04-02 0:08 ` [PATCH 3/6] mwifiex: remove unused tid_tbl_lock from mwifiex_tid_tbl Andreas Fenkart
2013-04-02 0:08 ` [PATCH 4/6] mwifiex: replace ra_list_curr by list rotation Andreas Fenkart
2013-04-02 0:08 ` [PATCH 5/6] mwifiex: rework round robin scheduling of bss nodes Andreas Fenkart
2013-04-02 0:08 ` [PATCH 6/6] mwifiex: hold proper locks when accessing ra_list / bss_prio lists Andreas Fenkart
2013-04-03 2:40 ` [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID Bing Zhao
2013-04-03 11:35 ` Andreas Fenkart [this message]
2013-04-03 18:37 ` Bing Zhao
2013-04-04 20:57 ` Andreas Fenkart
2013-04-04 21:01 ` [PATCH 1/4] mwifiex: bug: wrong list in list_empty check Andreas Fenkart
2013-04-04 21:01 ` [PATCH 2/4] mwifiex: remove unused tid_tbl_lock from mwifiex_tid_tbl Andreas Fenkart
2013-04-04 22:33 ` Bing Zhao
2013-04-04 21:01 ` [PATCH 3/4] mwifiex: bug: remove NO_PKT_PRIO_TID Andreas Fenkart
2013-04-04 22:34 ` Bing Zhao
2013-04-04 21:01 ` [PATCH 4/4] mwifiex: bug: hold proper locks when accessing ra_list / bss_prio lists Andreas Fenkart
2013-04-04 22:38 ` Bing Zhao
2013-04-04 22:29 ` [PATCH 1/4] mwifiex: bug: wrong list in list_empty check Bing Zhao
2013-04-04 21:08 ` [PATCH 1/2] mwifiex: replace ra_list_curr by list rotation Andreas Fenkart
2013-04-04 21:08 ` [PATCH 2/2] mwifiex: rework round robin scheduling of bss nodes Andreas Fenkart
2013-04-04 22:56 ` [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID Bing Zhao
2013-04-05 8:27 ` Andreas Fenkart
2013-04-08 18:19 ` Bing Zhao
2013-04-11 11:51 ` [PATCH v3 0/2] wmm queues handling simplificatons Andreas Fenkart
2013-04-11 11:51 ` [PATCH 1/2] mwifiex: replace ra_list_curr by list rotation Andreas Fenkart
2013-04-11 18:42 ` Bing Zhao
2013-04-11 11:51 ` [PATCH 2/2] mwifiex: rework round robin scheduling of bss nodes Andreas Fenkart
2013-04-11 18:43 ` Bing Zhao
2013-04-23 18:33 ` [PATCH v3 0/2] wmm queues handling simplificatons Bing Zhao
2013-04-23 18:48 ` John W. Linville
2013-04-23 18:51 ` Bing Zhao
2013-04-02 18:16 ` mwifiex: infinite loop in mwifiex_main_process Bing Zhao
2013-04-02 19:35 ` Andreas Fenkart
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130403113554.GA14785@blumentopf \
--to=andreas.fenkart@streamunlimited.com \
--cc=bzhao@marvell.com \
--cc=daniel@zonque.org \
--cc=linux-wireless@vger.kernel.org \
--cc=linville@tuxdriver.com \
--cc=patila@marvell.com \
--cc=yogeshp@marvell.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.