From: "Gustavo F. Padovan" <padovan@profusion.mobi>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Marcel Holtmann <marcel@holtmann.org>,
Vinicius Costa Gomes <vinicius.gomes@openbossa.org>,
Keith Packard <keithp@keithp.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-bluetooth@vger.kernel.org
Subject: Re: 2.6.39-rc2 regression: X201s fails to resume b77dcf8460ae57d4eb9fd3633eb4f97b8fb20716
Date: Tue, 12 Apr 2011 15:09:33 -0300 [thread overview]
Message-ID: <20110412180933.GA2836@joana> (raw)
In-Reply-To: <20110411222504.GE2195@joana>
* Gustavo F. Padovan <padovan@profusion.mobi> [2011-04-11 19:25:04 -0300]:
> * Thomas Gleixner <tglx@linutronix.de> [2011-04-12 00:19:32 +0200]:
>=20
> > On Tue, 12 Apr 2011, Thomas Gleixner wrote:
> > > On Mon, 11 Apr 2011, Marcel Holtmann wrote:
> > >=20
> > > > Hi Thomas,
> > > >=20
> > > > > > > > Can the bluetooth folks please have a look at that ASAP? Th=
e obvious
> > > > > > > > fast fix for Linus tree is to revert the second hunk for no=
w, but this
> > > > > > > > needs to be fixed proper.
> > > > > > >=20
> > > > > > > Who will submit this patch? I'd rather have your name on it s=
o that
> > > > > > > people come complain at you...
> > > > > >=20
> > > > > > I took a shot at it and just sent a patch (also attached for co=
nvenience)=20
> > > > > > that should solve the problem.
> > > > >=20
> > > > > Aaarg. No. That patch reverts both hunks.
> > > > >=20
> > > > > --- a/net/bluetooth/hci_core.c
> > > > > +++ b/net/bluetooth/hci_core.c
> > > > > @@ -586,9 +586,6 @@ static int hci_dev_do_close(struct hci_dev *h=
dev)
> > > > > hci_req_cancel(hdev, ENODEV);
> > > > > hci_req_lock(hdev);
> > > > > =20
> > > > > - /* Stop timer, it might be running */
> > > > > - del_timer_sync(&hdev->cmd_timer);
> > > > > -
> > > > > if (!test_and_clear_bit(HCI_UP, &hdev->flags)) {
> > > > > hci_req_unlock(hdev);
> > > > > return 0;
> > > > >=20
> > > > > As I said before you need that first hunk to stay for the case wh=
ere
> > > > > there is no device up and you return via the !HCI_UP check. You j=
ust
> > > > > moved back to the state before as the stupid timer is active for
> > > > > whatever reason even when HCI_UP is not set.
> > > >=20
> > > > if I read this right then we have the case that we arm this timer f=
or no
> > > > real reason. A device in !HCI_UP should have nothing running. Certa=
inly
> > > > not the cmd_timer since it will never process any commands.
> > > >=20
> > > > According to Gustavo, the problem is really in the hci_reset logic =
were
> > > > we arm the timer even when shutting down the device.
> > >=20
> > > The reason why the original patch was sent is, that the timer was
> > > running when the thing went out via the !HCI_UP path, which caused the
> > > whole thing to explode in the first place. I had no time to figure out
> > > why, but moving the del_timer_sync above the
> > > if (!test_and_clear_bit(HCI_UP, &hdev->flags)) solved it.
> >=20
> > Oops. Hit send too fast.
> >=20
> > Then it broke the resume on Keith machine and reverting just the hunk
> > which disarms the timer in the=20
> >=20
> > if (hdev->sent_cmd) {
> >=20
> > path made both scenarios working. So there are two problems:
> >=20
> > 1) Why do we need the del_timer_sync() above the !HCI_UP check
>=20
> That is still a mysterious to me, the real bug the hiding here. I'm tryin=
g to
> track this down but no luck yet.
>=20
> >=20
> > 2) Why gets the timer rearmed after that
>=20
> It is armed at each HCI command we send. In the close path we send out an=
HCI
> RESET command that rearms it.
I applied v2 patch from Vin=EDcius that fix all the symptoms. Now we have m=
ore time
to find the real cause of this bug. However I still have no idea, I'm not a=
ble
to reproduce it.
--=20
Gustavo F. Padovan
http://profusion.mobi
WARNING: multiple messages have this Message-ID (diff)
From: "Gustavo F. Padovan" <padovan@profusion.mobi>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Marcel Holtmann <marcel@holtmann.org>,
Vinicius Costa Gomes <vinicius.gomes@openbossa.org>,
Keith Packard <keithp@keithp.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-bluetooth@vger.kernel.org
Subject: Re: 2.6.39-rc2 regression: X201s fails to resume b77dcf8460ae57d4eb9fd3633eb4f97b8fb20716
Date: Tue, 12 Apr 2011 15:09:33 -0300 [thread overview]
Message-ID: <20110412180933.GA2836@joana> (raw)
In-Reply-To: <20110411222504.GE2195@joana>
* Gustavo F. Padovan <padovan@profusion.mobi> [2011-04-11 19:25:04 -0300]:
> * Thomas Gleixner <tglx@linutronix.de> [2011-04-12 00:19:32 +0200]:
>
> > On Tue, 12 Apr 2011, Thomas Gleixner wrote:
> > > On Mon, 11 Apr 2011, Marcel Holtmann wrote:
> > >
> > > > Hi Thomas,
> > > >
> > > > > > > > Can the bluetooth folks please have a look at that ASAP? The obvious
> > > > > > > > fast fix for Linus tree is to revert the second hunk for now, but this
> > > > > > > > needs to be fixed proper.
> > > > > > >
> > > > > > > Who will submit this patch? I'd rather have your name on it so that
> > > > > > > people come complain at you...
> > > > > >
> > > > > > I took a shot at it and just sent a patch (also attached for convenience)
> > > > > > that should solve the problem.
> > > > >
> > > > > Aaarg. No. That patch reverts both hunks.
> > > > >
> > > > > --- a/net/bluetooth/hci_core.c
> > > > > +++ b/net/bluetooth/hci_core.c
> > > > > @@ -586,9 +586,6 @@ static int hci_dev_do_close(struct hci_dev *hdev)
> > > > > hci_req_cancel(hdev, ENODEV);
> > > > > hci_req_lock(hdev);
> > > > >
> > > > > - /* Stop timer, it might be running */
> > > > > - del_timer_sync(&hdev->cmd_timer);
> > > > > -
> > > > > if (!test_and_clear_bit(HCI_UP, &hdev->flags)) {
> > > > > hci_req_unlock(hdev);
> > > > > return 0;
> > > > >
> > > > > As I said before you need that first hunk to stay for the case where
> > > > > there is no device up and you return via the !HCI_UP check. You just
> > > > > moved back to the state before as the stupid timer is active for
> > > > > whatever reason even when HCI_UP is not set.
> > > >
> > > > if I read this right then we have the case that we arm this timer for no
> > > > real reason. A device in !HCI_UP should have nothing running. Certainly
> > > > not the cmd_timer since it will never process any commands.
> > > >
> > > > According to Gustavo, the problem is really in the hci_reset logic were
> > > > we arm the timer even when shutting down the device.
> > >
> > > The reason why the original patch was sent is, that the timer was
> > > running when the thing went out via the !HCI_UP path, which caused the
> > > whole thing to explode in the first place. I had no time to figure out
> > > why, but moving the del_timer_sync above the
> > > if (!test_and_clear_bit(HCI_UP, &hdev->flags)) solved it.
> >
> > Oops. Hit send too fast.
> >
> > Then it broke the resume on Keith machine and reverting just the hunk
> > which disarms the timer in the
> >
> > if (hdev->sent_cmd) {
> >
> > path made both scenarios working. So there are two problems:
> >
> > 1) Why do we need the del_timer_sync() above the !HCI_UP check
>
> That is still a mysterious to me, the real bug the hiding here. I'm trying to
> track this down but no luck yet.
>
> >
> > 2) Why gets the timer rearmed after that
>
> It is armed at each HCI command we send. In the close path we send out an HCI
> RESET command that rearms it.
I applied v2 patch from Vinícius that fix all the symptoms. Now we have more time
to find the real cause of this bug. However I still have no idea, I'm not able
to reproduce it.
--
Gustavo F. Padovan
http://profusion.mobi
next prev parent reply other threads:[~2011-04-12 18:09 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-06 7:44 2.6.39-rc2 regression: X201s fails to resume b77dcf8460ae57d4eb9fd3633eb4f97b8fb20716 Keith Packard
2011-04-06 22:31 ` Thomas Gleixner
2011-04-08 21:44 ` Thomas Gleixner
2011-04-08 23:13 ` Keith Packard
2011-04-09 0:08 ` Vinicius Costa Gomes
2011-04-09 2:03 ` Keith Packard
2011-04-11 15:58 ` Gustavo F. Padovan
2011-04-11 16:13 ` Thomas Gleixner
2011-04-11 16:12 ` Thomas Gleixner
2011-04-11 21:40 ` Marcel Holtmann
2011-04-11 22:16 ` Thomas Gleixner
2011-04-11 22:19 ` Thomas Gleixner
2011-04-11 22:25 ` Gustavo F. Padovan
2011-04-12 18:09 ` Gustavo F. Padovan [this message]
2011-04-12 18:09 ` Gustavo F. Padovan
2011-04-12 18:28 ` Thomas Gleixner
2011-04-11 16:34 ` Alan Cox
2011-04-11 16:59 ` Vinicius Gomes
2011-04-21 21:50 ` Keith Packard
2011-04-21 22:59 ` Gustavo F. Padovan
2011-04-22 0:04 ` Keith Packard
2011-04-23 16:37 ` Gustavo F. Padovan
2011-05-03 7:27 ` Stefan Seyfried
2011-05-03 14:48 ` Keith Packard
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110412180933.GA2836@joana \
--to=padovan@profusion.mobi \
--cc=keithp@keithp.com \
--cc=linux-bluetooth@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=marcel@holtmann.org \
--cc=tglx@linutronix.de \
--cc=vinicius.gomes@openbossa.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.