From: Petr Machata <petrm@mellanox.com>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>,
David Miller <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
Eric Dumazet <eric.dumazet@gmail.com>,
Jiri Pirko <jiri@mellanox.com>,
Ido Schimmel <idosch@mellanox.com>
Subject: Re: [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks
Date: Wed, 08 Jul 2020 18:21:12 +0200 [thread overview]
Message-ID: <87y2nugepz.fsf@mellanox.com> (raw)
In-Reply-To: <873662i3rc.fsf@mellanox.com>
Petr Machata <petrm@mellanox.com> writes:
> Cong Wang <xiyou.wangcong@gmail.com> writes:
>
>> On Tue, Jul 7, 2020 at 8:22 AM Petr Machata <petrm@mellanox.com> wrote:
>>>
>>>
>>> Cong Wang <xiyou.wangcong@gmail.com> writes:
>>>
>>> > On Fri, Jun 26, 2020 at 3:46 PM Petr Machata <petrm@mellanox.com> wrote:
>>> >> The function tcf_qevent_handle() should be invoked when qdisc hits the
>>> >> "interesting event" corresponding to a block. This function releases root
>>> >> lock for the duration of executing the attached filters, to allow packets
>>> >> generated through user actions (notably mirred) to be reinserted to the
>>> >> same qdisc tree.
>>> >
>>> > Are you sure releasing the root lock in the middle of an enqueue operation
>>> > is a good idea? I mean, it seems racy with qdisc change or reset path,
>>> > for example, __red_change() could update some RED parameters
>>> > immediately after you release the root lock.
>>>
>>> So I had mulled over this for a while. If packets are enqueued or
>>> dequeued while the lock is released, maybe the packet under
>>> consideration should have been hard_marked instead of prob_marked, or
>>> vice versa. (I can't really go to not marked at all, because the fact
>>> that we ran the qevent is a very observable commitment to put the packet
>>> in the queue with marking.) I figured that is not such a big deal.
>>>
>>> Regarding a configuration change, for a brief period after the change, a
>>> few not-yet-pushed packets could have been enqueued with ECN marking
>>> even as I e.g. disabled ECN. This does not seem like a big deal either,
>>> these are transient effects.
>>
>> Hmm, let's see:
>>
>> 1. red_enqueue() caches a pointer to child qdisc, child = q->qdisc
>> 2. root lock is released by tcf_qevent_handle().
>> 3. __red_change() acquires the root lock and then changes child
>> qdisc with a new one
>> 4. The old child qdisc is put by qdisc_put()
>> 5. tcf_qevent_handle() acquires the root lock again, and still uses
>> the cached but now freed old child qdisc.
>>
>> Isn't this a problem?
>
> I missed this. It is a problem, destroy gets called right away and then
> enqueue would use invalid data.
>
> Also qdisc_graft() calls cops->graft, which locks the tree and swaps the
> qdisc pointes, and then qdisc_put()s the original one. So dropping the
> root lock can lead to destruction of the qdisc whose enqueue is
> currently executed.
>
> So that's no good. The lock has to be held throughout.
>
>> Even if it really is not, why do we make tcf_qevent_handle() callers'
>> life so hard? They have to be very careful with race conditions like this.
>
> Do you have another solution in mind here? I think the deadlock (in both
> classification and qevents) is an issue, but really don't know how to
> avoid it except by dropping the lock.
Actually I guess I could qdisc_refcount_inc() the current qdisc so that
it doesn't go away. Then when enqueing I could access the child
directly, not relying on the now-obsolete cache from the beginning of
the enqueue function. I suppose that a similar approach could be used in
other users of tcf_classify() as well. What do you think?
next prev parent reply other threads:[~2020-07-08 16:21 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-26 22:45 [PATCH net-next v1 0/5] TC: Introduce qevents Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 1/5] net: sched: Pass root lock to Qdisc_ops.enqueue Petr Machata
2020-07-06 19:21 ` Cong Wang
2020-07-07 15:25 ` Petr Machata
2020-07-07 19:41 ` Cong Wang
2020-06-26 22:45 ` [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks Petr Machata
2020-07-06 19:48 ` Cong Wang
2020-07-07 15:22 ` Petr Machata
2020-07-07 19:13 ` Cong Wang
2020-07-08 12:35 ` Petr Machata
2020-07-08 16:21 ` Petr Machata [this message]
2020-07-08 19:09 ` Cong Wang
2020-07-08 19:04 ` Cong Wang
2020-07-08 21:04 ` Petr Machata
2020-07-09 0:13 ` Petr Machata
2020-07-09 19:37 ` Cong Wang
2020-07-10 14:40 ` Petr Machata
2020-07-14 2:51 ` Cong Wang
2020-07-14 9:12 ` Petr Machata
2020-07-07 19:48 ` Cong Wang
2020-07-08 9:19 ` Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 3/5] net: sched: sch_red: Split init and change callbacks Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 4/5] net: sched: sch_red: Add qevents "early_drop" and "mark" Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 5/5] selftests: forwarding: Add a RED test for SW datapath Petr Machata
2020-06-26 22:45 ` [PATCH iproute2-next v1 1/4] uapi: pkt_sched: Add two new RED attributes Petr Machata
2020-06-26 22:45 ` [PATCH iproute2-next v1 2/4] tc: Add helpers to support qevent handling Petr Machata
2020-06-26 22:45 ` [PATCH iproute2-next v1 3/4] man: tc: Describe qevents Petr Machata
2020-06-26 22:45 ` [PATCH iproute2-next v1 4/4] tc: q_red: Add support for qevents "mark" and "early_drop" Petr Machata
2020-06-26 22:56 ` [PATCH net-next v1 0/5] TC: Introduce qevents Stephen Hemminger
2020-06-29 13:21 ` Petr Machata
2020-06-30 0:15 ` David Miller
-- strict thread matches above, loose matches on Subject: below --
2020-06-27 21:06 [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87y2nugepz.fsf@mellanox.com \
--to=petrm@mellanox.com \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=idosch@mellanox.com \
--cc=jiri@mellanox.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.