All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Heitor Alves de Siqueira" <halves@igalia.com>
To: "Luiz Augusto von Dentz" <luiz.dentz@gmail.com>
Cc: "Hillf Danton" <hdanton@sina.com>,
	"Marcel Holtmann" <marcel@holtmann.org>,
	"Gustavo Padovan" <padovan@profusion.mobi>,
	<linux-bluetooth@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<kernel-dev@igalia.com>, <syzkaller-bugs@googlegroups.com>,
	<syzbot+97721dd81f792e838ba0@syzkaller.appspotmail.com>
Subject: Re: [PATCH] Bluetooth: hci_core: Don't queue tx_work while draining workqueue
Date: Mon, 18 May 2026 12:21:46 -0300	[thread overview]
Message-ID: <DILWLB58NRY9.1AS5Z2W6EHAIG@igalia.com> (raw)
In-Reply-To: <CABBYNZKA3FsovfdWEeZYqFjjNCabjfa7uM48mq+r3G0PTb_JUg@mail.gmail.com>

Hi Luiz,

On Fri May 15, 2026 at 10:33 AM -03, Luiz Augusto von Dentz wrote:
> Hi Heitor,
>
> On Thu, May 14, 2026 at 10:54 AM Heitor Alves de Siqueira
> <halves@igalia.com> wrote:
>>
>> On Wed May 13, 2026 at 11:04 PM -03, Hillf Danton wrote:
>> > On Wed, 13 May 2026 15:55:23 -0300 Heitor Alves de Siqueira wrote:
>> >> Syzbot reported a warning when L2CAP calls queue_work() on the hdev
>> >> workqueue while it's being drained. This can happen during device reset or
>> >> close paths for hci_send_acl(), hci_send_sco() and hci_send_iso().
>> >>
>> >> The workqueue is drained in hci_dev_do_reset() and in hci_dev_close_sync():
>> >>   - hci_dev_close_sync() clears the HCI_UP bit before draining
>> >>   - hci_dev_do_reset() sets HCI_CMD_DRAIN_WORKQUEUE before draining
>> >>
>> >> Add these checks before queuing tx_work, and free the SKB if it's not
>> >> queued for transmission.
>> >>
>> >> Fixes: 3eff45eaf817 ("Bluetooth: convert tx_task to workqueue")
>> >> Reported-by: syzbot+97721dd81f792e838ba0@syzkaller.appspotmail.com
>> >> Closes: https://syzkaller.appspot.com/bug?extid=97721dd81f792e838ba0
>> >> Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com>
>> >> ---
>> >>  net/bluetooth/hci_core.c | 18 ++++++++++++++++++
>> >>  1 file changed, 18 insertions(+)
>> >>
>> >> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
>> >> index c46c1236ebfa..5d5f8ad7d1a8 100644
>> >> --- a/net/bluetooth/hci_core.c
>> >> +++ b/net/bluetooth/hci_core.c
>> >> @@ -3278,6 +3278,12 @@ void hci_send_acl(struct hci_chan *chan, struct sk_buff *skb, __u16 flags)
>> >>
>> >>      BT_DBG("%s chan %p flags 0x%4.4x", hdev->name, chan, flags);
>> >>
>> >> +    if (!test_bit(HCI_UP, &hdev->flags) ||
>> >> +        hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE)) {
>> >> +            kfree_skb(skb);
>> >> +            return;
>> >> +    }
>> >> +
>> >>      hci_queue_acl(chan, &chan->data_q, skb, flags);
>> >>
>> >>      queue_work(hdev->workqueue, &hdev->tx_work);
>> >>
>> > What you add is not enough, go and see how HCI_CMD_DRAIN_WORKQUEUE is
>> > checked in hci_cmd_work(), and in hci_dev_do_reset() for why.
>>
>> I see, I missed the RCU guards for the device flags. Sorry about that,
>> I'll add them to v2.
>> Thanks for the catch!
>
> Actually this whole thing might be because we try to clean the
> workqueue without actually closing the hdev, so I suspect that if we
> just remove all the code from hci_dev_do_reset with
> hci_dev_do_close+hci_dev_do_open, it might work better and align with
> how things work over the MGMT interface:
>
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index c46c1236ebfa..6782bbc9b6a7 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -539,47 +539,13 @@ static int hci_dev_do_reset(struct hci_dev *hdev)
>
>         hci_req_sync_lock(hdev);
>
> -       /* Drop queues */
> -       skb_queue_purge(&hdev->rx_q);
> -       skb_queue_purge(&hdev->cmd_q);
> -
> -       /* Cancel these to avoid queueing non-chained pending work */
> -       hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
> -       /* Wait for
> -        *
> -        *    if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
> -        *        queue_delayed_work(&hdev->{cmd,ncmd}_timer)
> -        *
> -        * inside RCU section to see the flag or complete scheduling.
> -        */
> -       synchronize_rcu();
> -       /* Explicitly cancel works in case scheduled after setting the flag. */
> -       cancel_delayed_work(&hdev->cmd_timer);
> -       cancel_delayed_work(&hdev->ncmd_timer);
> -
> -       /* Avoid potential lockdep warnings from the *_flush() calls by
> -        * ensuring the workqueue is empty up front.
> -        */
> -       drain_workqueue(hdev->workqueue);
> -
> -       hci_dev_lock(hdev);
> -       hci_inquiry_cache_flush(hdev);
> -       hci_conn_hash_flush(hdev);
> -       hci_dev_unlock(hdev);
> -
> -       if (hdev->flush)
> -               hdev->flush(hdev);
> -
> -       hci_dev_clear_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
> -
> -       atomic_set(&hdev->cmd_cnt, 1);
> -       hdev->acl_cnt = 0;
> -       hdev->sco_cnt = 0;
> -       hdev->le_cnt = 0;
> -       hdev->iso_cnt = 0;
> +       ret = hci_dev_close_sync(hdev);
> +       if (ret)
> +               goto done;
>
> -       ret = hci_reset_sync(hdev);
> +       ret = hci_dev_open_sync(hdev);
>
> +done:
>         hci_req_sync_unlock(hdev);
>         return ret;
>  }
>
> Seem to work here and as a side effect we get notified over the MGMT
> when a user uses hciconfig hci0 reset:
>
> # tools/hciconfig hci0 reset
> bluetoothd[34]: src/shared/mgmt.c:can_read_data() [0x0000] event 0x0006
> bluetoothd[34]: src/adapter.c:new_settings_callback() Settings: 0x01fc0ac0
> bluetoothd[34]: src/adapter.c:settings_changed() Changed settings: 0x00000001
> bluetoothd[34]: src/adapter.c:settings_changed() Pending settings: 0x00000000
> bluetoothd[34]: src/adapter.c:cancel_passive_scanning()
> bluetoothd[34]: src/adapter.c:adapter_stop() adapter /org/bluez/hci0
> has been disabled
> bluetoothd[34]: src/shared/mgmt.c:can_read_data() [0x0000] event 0x0008
> bluetoothd[34]: src/shared/mgmt.c:can_read_data() [0x0000] event 0x0006
> bluetoothd[34]: src/adapter.c:new_settings_callback() Settings: 0x01fc0ac1
> bluetoothd[34]: src/adapter.c:settings_changed() Changed settings: 0x00000001
> bluetoothd[34]: src/adapter.c:settings_changed() Pending settings: 0x00000000
> bluetoothd[34]: src/adapter.c:adapter_start() adapter /org/bluez/hci0
> has been enabled
> bluetoothd[34]: src/adapter.c:trigger_passive_scanning()
> bluetoothd[34]: [signal] org.freedesktop.DBus.Properties.PropertiesChanged

Thank you for the review, and for the suggestion! There seems to be a
number of similar syzbot reports for the hdev workqueue, so maybe the
close+open approach is a simpler solution indeed. I've originally
considered implementing a wrapper/helper for workqueue submission in
hci_core.c, but if we can eliminate the race condition altogether that'd
be even better.

My only concern is that there seem to be slight differences between
what hci_dev_do_reset() does now and what we'd get with
hci_dev_close_sync()+hci_dev_open_sync() (e.g. we wouldn't use
HCI_CMD_DRAIN_WORKQUEUE anymore). Would this alternative approach be
suitable for the stable kernels? And if not, do you think the checks
from v1 would be appropriate, in that case?

  reply	other threads:[~2026-05-18 15:21 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-13 18:55 [PATCH] Bluetooth: hci_core: Don't queue tx_work while draining workqueue Heitor Alves de Siqueira
2026-05-13 20:45 ` bluez.test.bot
2026-05-14  2:04 ` [PATCH] " Hillf Danton
2026-05-14 14:53   ` Heitor Alves de Siqueira
2026-05-15 13:33     ` Luiz Augusto von Dentz
2026-05-18 15:21       ` Heitor Alves de Siqueira [this message]
2026-05-18 20:48         ` Luiz Augusto von Dentz
2026-05-20 14:26           ` Heitor Alves de Siqueira

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DILWLB58NRY9.1AS5Z2W6EHAIG@igalia.com \
    --to=halves@igalia.com \
    --cc=hdanton@sina.com \
    --cc=kernel-dev@igalia.com \
    --cc=linux-bluetooth@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luiz.dentz@gmail.com \
    --cc=marcel@holtmann.org \
    --cc=padovan@profusion.mobi \
    --cc=syzbot+97721dd81f792e838ba0@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.