All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vinicius Costa Gomes <vinicius.gomes@intel.com>
To: Vladimir Oltean <vladimir.oltean@nxp.com>, netdev@vger.kernel.org
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Jiri Pirko <jiri@resnulli.us>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Voon Weifeng <weifeng.voon@intel.com>,
	Vladimir Oltean <olteanv@gmail.com>,
	Kurt Kanzenbach <kurt@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 net 1/2] net/sched: taprio: avoid disabling offload when it was never enabled
Date: Thu, 15 Sep 2022 14:52:29 -0700	[thread overview]
Message-ID: <877d24gvaa.fsf@intel.com> (raw)
In-Reply-To: <20220915100802.2308279-2-vladimir.oltean@nxp.com>

Vladimir Oltean <vladimir.oltean@nxp.com> writes:

> In an incredibly strange API design decision, qdisc->destroy() gets
> called even if qdisc->init() never succeeded, not exclusively since
> commit 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation"),
> but apparently also earlier (in the case of qdisc_create_dflt()).
>
> The taprio qdisc does not fully acknowledge this when it attempts full
> offload, because it starts off with q->flags = TAPRIO_FLAGS_INVALID in
> taprio_init(), then it replaces q->flags with TCA_TAPRIO_ATTR_FLAGS
> parsed from netlink (in taprio_change(), tail called from taprio_init()).
>
> But in taprio_destroy(), we call taprio_disable_offload(), and this
> determines what to do based on FULL_OFFLOAD_IS_ENABLED(q->flags).
>
> But looking at the implementation of FULL_OFFLOAD_IS_ENABLED()
> (a bitwise check of bit 1 in q->flags), it is invalid to call this macro
> on q->flags when it contains TAPRIO_FLAGS_INVALID, because that is set
> to U32_MAX, and therefore FULL_OFFLOAD_IS_ENABLED() will return true on
> an invalid set of flags.
>
> As a result, it is possible to crash the kernel if user space forces an
> error between setting q->flags = TAPRIO_FLAGS_INVALID, and the calling
> of taprio_enable_offload(). This is because drivers do not expect the
> offload to be disabled when it was never enabled.
>
> The error that we force here is to attach taprio as a non-root qdisc,
> but instead as child of an mqprio root qdisc:
>
> $ tc qdisc add dev swp0 root handle 1: \
> 	mqprio num_tc 8 map 0 1 2 3 4 5 6 7 \
> 	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
> $ tc qdisc replace dev swp0 parent 1:1 \
> 	taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
> 	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
> 	sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
> 	flags 0x0 clockid CLOCK_TAI
> Unable to handle kernel paging request at virtual address fffffffffffffff8
> [fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
> Internal error: Oops: 96000004 [#1] PREEMPT SMP
> Call trace:
>  taprio_dump+0x27c/0x310
>  vsc9959_port_setup_tc+0x1f4/0x460
>  felix_port_setup_tc+0x24/0x3c
>  dsa_slave_setup_tc+0x54/0x27c
>  taprio_disable_offload.isra.0+0x58/0xe0
>  taprio_destroy+0x80/0x104
>  qdisc_create+0x240/0x470
>  tc_modify_qdisc+0x1fc/0x6b0
>  rtnetlink_rcv_msg+0x12c/0x390
>  netlink_rcv_skb+0x5c/0x130
>  rtnetlink_rcv+0x1c/0x2c
>
> Fix this by keeping track of the operations we made, and undo the
> offload only if we actually did it.
>
> I've added "bool offloaded" inside a 4 byte hole between "int clockid"
> and "atomic64_t picos_per_byte". Now the first cache line looks like
> below:
>
> $ pahole -C taprio_sched net/sched/sch_taprio.o
> struct taprio_sched {
>         struct Qdisc * *           qdiscs;               /*     0     8 */
>         struct Qdisc *             root;                 /*     8     8 */
>         u32                        flags;                /*    16     4 */
>         enum tk_offsets            tk_offset;            /*    20     4 */
>         int                        clockid;              /*    24     4 */
>         bool                       offloaded;            /*    28     1 */
>
>         /* XXX 3 bytes hole, try to pack */
>
>         atomic64_t                 picos_per_byte;       /*    32     0 */
>
>         /* XXX 8 bytes hole, try to pack */
>
>         spinlock_t                 current_entry_lock;   /*    40     0 */
>
>         /* XXX 8 bytes hole, try to pack */
>
>         struct sched_entry *       current_entry;        /*    48     8 */
>         struct sched_gate_list *   oper_sched;           /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>
> Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading")
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---

Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>


Cheers,
-- 
Vinicius

  reply	other threads:[~2022-09-15 21:52 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-15 10:08 [PATCH v2 net 0/2] Fixes for tc-taprio software mode Vladimir Oltean
2022-09-15 10:08 ` [PATCH v2 net 1/2] net/sched: taprio: avoid disabling offload when it was never enabled Vladimir Oltean
2022-09-15 21:52   ` Vinicius Costa Gomes [this message]
2022-09-15 10:08 ` [PATCH v2 net 2/2] net/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo child qdiscs Vladimir Oltean
2022-09-15 21:54   ` Vinicius Costa Gomes
2023-02-22 14:03   ` Kurt Kanzenbach
2023-02-22 14:25     ` Vladimir Oltean
2023-02-23 13:03       ` Greg Kroah-Hartman
2023-02-23 13:18         ` Vladimir Oltean
2022-09-20 19:00 ` [PATCH v2 net 0/2] Fixes for tc-taprio software mode patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877d24gvaa.fsf@intel.com \
    --to=vinicius.gomes@intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=kuba@kernel.org \
    --cc=kurt@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=pabeni@redhat.com \
    --cc=vladimir.oltean@nxp.com \
    --cc=weifeng.voon@intel.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.