netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
To: Vladimir Oltean <vladimir.oltean@nxp.com>, netdev@vger.kernel.org
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Jiri Pirko <jiri@resnulli.us>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Voon Weifeng <weifeng.voon@intel.com>,
	Vladimir Oltean <olteanv@gmail.com>,
	Kurt Kanzenbach <kurt@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 net 1/2] net/sched: taprio: avoid disabling offload when it was never enabled
Date: Thu, 15 Sep 2022 14:52:29 -0700	[thread overview]
Message-ID: <877d24gvaa.fsf@intel.com> (raw)
In-Reply-To: <20220915100802.2308279-2-vladimir.oltean@nxp.com>

Vladimir Oltean <vladimir.oltean@nxp.com> writes:

> In an incredibly strange API design decision, qdisc->destroy() gets
> called even if qdisc->init() never succeeded, not exclusively since
> commit 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation"),
> but apparently also earlier (in the case of qdisc_create_dflt()).
>
> The taprio qdisc does not fully acknowledge this when it attempts full
> offload, because it starts off with q->flags = TAPRIO_FLAGS_INVALID in
> taprio_init(), then it replaces q->flags with TCA_TAPRIO_ATTR_FLAGS
> parsed from netlink (in taprio_change(), tail called from taprio_init()).
>
> But in taprio_destroy(), we call taprio_disable_offload(), and this
> determines what to do based on FULL_OFFLOAD_IS_ENABLED(q->flags).
>
> But looking at the implementation of FULL_OFFLOAD_IS_ENABLED()
> (a bitwise check of bit 1 in q->flags), it is invalid to call this macro
> on q->flags when it contains TAPRIO_FLAGS_INVALID, because that is set
> to U32_MAX, and therefore FULL_OFFLOAD_IS_ENABLED() will return true on
> an invalid set of flags.
>
> As a result, it is possible to crash the kernel if user space forces an
> error between setting q->flags = TAPRIO_FLAGS_INVALID, and the calling
> of taprio_enable_offload(). This is because drivers do not expect the
> offload to be disabled when it was never enabled.
>
> The error that we force here is to attach taprio as a non-root qdisc,
> but instead as child of an mqprio root qdisc:
>
> $ tc qdisc add dev swp0 root handle 1: \
> 	mqprio num_tc 8 map 0 1 2 3 4 5 6 7 \
> 	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
> $ tc qdisc replace dev swp0 parent 1:1 \
> 	taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
> 	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
> 	sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
> 	flags 0x0 clockid CLOCK_TAI
> Unable to handle kernel paging request at virtual address fffffffffffffff8
> [fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
> Internal error: Oops: 96000004 [#1] PREEMPT SMP
> Call trace:
>  taprio_dump+0x27c/0x310
>  vsc9959_port_setup_tc+0x1f4/0x460
>  felix_port_setup_tc+0x24/0x3c
>  dsa_slave_setup_tc+0x54/0x27c
>  taprio_disable_offload.isra.0+0x58/0xe0
>  taprio_destroy+0x80/0x104
>  qdisc_create+0x240/0x470
>  tc_modify_qdisc+0x1fc/0x6b0
>  rtnetlink_rcv_msg+0x12c/0x390
>  netlink_rcv_skb+0x5c/0x130
>  rtnetlink_rcv+0x1c/0x2c
>
> Fix this by keeping track of the operations we made, and undo the
> offload only if we actually did it.
>
> I've added "bool offloaded" inside a 4 byte hole between "int clockid"
> and "atomic64_t picos_per_byte". Now the first cache line looks like
> below:
>
> $ pahole -C taprio_sched net/sched/sch_taprio.o
> struct taprio_sched {
>         struct Qdisc * *           qdiscs;               /*     0     8 */
>         struct Qdisc *             root;                 /*     8     8 */
>         u32                        flags;                /*    16     4 */
>         enum tk_offsets            tk_offset;            /*    20     4 */
>         int                        clockid;              /*    24     4 */
>         bool                       offloaded;            /*    28     1 */
>
>         /* XXX 3 bytes hole, try to pack */
>
>         atomic64_t                 picos_per_byte;       /*    32     0 */
>
>         /* XXX 8 bytes hole, try to pack */
>
>         spinlock_t                 current_entry_lock;   /*    40     0 */
>
>         /* XXX 8 bytes hole, try to pack */
>
>         struct sched_entry *       current_entry;        /*    48     8 */
>         struct sched_gate_list *   oper_sched;           /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>
> Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading")
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---

Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>


Cheers,
-- 
Vinicius

  reply	other threads:[~2022-09-15 21:52 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-15 10:08 [PATCH v2 net 0/2] Fixes for tc-taprio software mode Vladimir Oltean
2022-09-15 10:08 ` [PATCH v2 net 1/2] net/sched: taprio: avoid disabling offload when it was never enabled Vladimir Oltean
2022-09-15 21:52   ` Vinicius Costa Gomes [this message]
2022-09-15 10:08 ` [PATCH v2 net 2/2] net/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo child qdiscs Vladimir Oltean
2022-09-15 21:54   ` Vinicius Costa Gomes
2023-02-22 14:03   ` Kurt Kanzenbach
2023-02-22 14:25     ` Vladimir Oltean
2023-02-23 13:03       ` Greg Kroah-Hartman
2023-02-23 13:18         ` Vladimir Oltean
2022-09-20 19:00 ` [PATCH v2 net 0/2] Fixes for tc-taprio software mode patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877d24gvaa.fsf@intel.com \
    --to=vinicius.gomes@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=kuba@kernel.org \
    --cc=kurt@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=pabeni@redhat.com \
    --cc=vladimir.oltean@nxp.com \
    --cc=weifeng.voon@intel.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).