netdev.vger.kernel.org archive mirror
From: Oleksij Rempel <o.rempel@pengutronix.de>
To: Andrew Lunn <andrew@lunn.ch>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>,
	Vladimir Oltean <olteanv@gmail.com>,
	Oleksij Rempel <linux@rempel-privat.de>,
	netdev@vger.kernel.org, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, horms@kernel.org,
	nbd@nbd.name, sean.wang@mediatek.com, Mark-MC.Lee@mediatek.com,
	lorenzo.bianconi83@gmail.com
Subject: Re: [RFC net-next 0/5] Add ETS and TBF Qdisc offload for Airoha EN7581 SoC
Date: Tue, 17 Dec 2024 10:38:21 +0100
Message-ID: <Z2FGjeyawnhABnRb@pengutronix.de>
In-Reply-To: <f8e74e29-f4b0-4e38-8701-a4364d68230f@lunn.ch>

On Tue, Dec 17, 2024 at 12:24:13AM +0100, Andrew Lunn wrote:
> > Considering patch [0], we are still offloading the Qdisc on the provided
> > DSA switch port (e.g. LANx) via the port_setup_tc() callback available in
> > dsa_user_setup_qdisc(), but we are introducing even the ndo_setup_tc_conduit()
> > callback in order to use the hw Qdisc capabilities available on the mac chip
> > (e.g. EN7581) for the routed traffic from WAN to LANx. We will still apply
> > the Qdisc defined on LANx for L2 traffic from LANy to LANx. Agree?
> 
> I've not read all the details, so i could be getting something
> wrong. But let me point out the basics. Offloading is used to
> accelerate what Linux already supports in software. So forget about
> your hardware. How would i configure a bunch of e1000e cards connected
> to a software bridge to do what you want?
> 
> There is no conduit interface in this, so i would not expect to
> explicitly configure a conduit interface. Maybe the offloading needs
> to implicitly configure the conduit, but that should be all hidden
> away from the user. But given the software bridge has no concept of a
> conduit, i doubt it.
> 
> It could well be our model does not map to the hardware too well,
> leaving some bits unusable, but there is not much you can do about
> that, that is the Linux model, accelerate what Linux supports in
> software.

Hi,

You are absolutely correct that offloading should accelerate what Linux already
supports in software, and we need to respect this model. However, I’d like to
step back for a moment to clarify the underlying problem before focusing too
much on solutions.

### The Core Problem: Flow Control Limitations

1. **QoS and Flow Control:** 

   At the heart of proper QoS implementation lies flow control. Flow control
   mechanisms exist at various levels:

   - MAC-level signaling (e.g., pause frames)

   - Queue management (e.g., stopping queues when the hardware is congested)

   The typical Linux driver uses flow control signaling from the MAC (e.g.,
   stopping queues) to coordinate traffic, and depending on the Qdisc, this
   flow control can propagate up to user space applications.

2. **Challenges with DSA:**

   In DSA, we lose direct **flow control communication** between:

   - The host MAC

   - The MAC of a DSA user port

   While internal flow control within the switch may still work, it does not
   extend to the host. Specifically:

   - Pause frames often affect **all priorities** and are not granular enough
     for low-latency applications.

   - The signaling from the MAC of the DSA user port to the host is either
     **not supported** or is **disabled** (often through device tree
     configuration).

### Why This Matters for QoS

For traffic flowing **from the host** to DSA user ports:

- Without proper flow control, congestion cannot be communicated back to the
  host, leading to buffer overruns and degraded QoS.  

- To address this, we need to compensate for the lack of flow control signaling
  by applying traffic limits (or shaping).
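
To make the effect concrete, here is a minimal sketch of a toy model, not a
description of any real switch. It assumes a host that offers 1 Gbit/s towards
a 100 Mbit/s user port with a fixed egress buffer; the function name, the
buffer size and the 1 ms time step are all made up for illustration.

```python
def simulate(offered_mbit, port_mbit, buffer_kbit, shaper_mbit=None, ticks=1000):
    """Return (delivered_kbit, dropped_kbit) after `ticks` steps of 1 ms.

    1 Mbit/s equals 1 kbit per millisecond, so rates map directly to kbit/tick.
    """
    backlog = 0.0    # kbit queued in the (hypothetical) switch egress buffer
    delivered = 0.0
    dropped = 0.0
    for _ in range(ticks):
        arriving = offered_mbit
        if shaper_mbit is not None:
            # Host-side limit: the excess never reaches the switch. It stays
            # on the host, where the Qdisc can apply backpressure.
            arriving = min(arriving, shaper_mbit)
        accepted = min(arriving, buffer_kbit - backlog)
        dropped += arriving - accepted      # buffer overrun in the switch
        backlog += accepted
        sent = min(backlog, port_mbit)      # user port drains at link speed
        backlog -= sent
        delivered += sent
    return delivered, dropped


unshaped = simulate(1000, 100, buffer_kbit=512)
shaped = simulate(1000, 100, buffer_kbit=512, shaper_mbit=100)
print("unshaped (delivered, dropped):", unshaped)
print("shaped   (delivered, dropped):", shaped)
```

Both runs deliver the same amount of data, but only the unshaped one loses
traffic inside the switch, where the host has no way to notice it.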

### Approach: Applying Limits on the Conduit Interface

One way to solve this is by applying traffic shaping or limits directly on the
**conduit MAC**. However, this approach has significant complexity:

1. **Hardware-Specific Details:**

   We would need deep hardware knowledge to set up traffic filters or dissectors
   at the conduit level. This includes:

   - Parsing **CPU tags** specific to the switch in use.  

   - Applying port-specific rules, some of which depend on **user port link
     speed**.

2. **Admin Burden:**

   Forcing network administrators to configure conduit-specific filters
   manually increases complexity and goes against the existing DSA abstractions,
   which are already well-integrated into the kernel.


### How Things Can Be Implemented

To address QoS for host-to-user port traffic in DSA, I see two possible
approaches:

#### 1. Apply Rules on the Conduit Port (Using `dst_port`)

In this approach, rules are applied to the **conduit interface**, and specific
user ports are matched using **port indices**.

# Conduit interface
tc qdisc add dev conduit0 clsact

# Match traffic for user port 1 (e.g., lan0)
tc filter add dev conduit0 egress flower dst_port 1 \
    action police rate 50mbit burst 5k drop

# Match traffic for user port 2 (e.g., lan1)
tc filter add dev conduit0 egress flower dst_port 2 \
    action police rate 30mbit burst 3k drop
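
For readers less familiar with the `rate`/`burst` pair used above, the
following is a rough sketch of a token-bucket policer. It is a simplified
model, not the kernel's implementation. The class name is made up, and it
assumes that `50mbit` means 50,000,000 bit/s and `5k` means 5120 bytes.

```python
class Policer:
    """Toy token-bucket policer: packets exceeding the bucket are dropped."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0          # refill rate in bytes per second
        self.burst = burst_bytes            # bucket depth in bytes
        self.tokens = float(burst_bytes)    # bucket starts full
        self.last = 0.0                     # time of the last update (seconds)

    def allow(self, now, size):
        """Return True if a packet of `size` bytes conforms at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill, but never beyond the configured burst.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False                        # "drop": the packet is discarded


p = Policer(rate_bps=50_000_000, burst_bytes=5 * 1024)
# Four 1500-byte packets sent back to back at t=0: the bucket covers three.
print([p.allow(0.0, 1500) for _ in range(4)])
# One millisecond later the bucket has refilled and traffic passes again.
print(p.allow(0.001, 1500))
```

The point to note is that a policer never queues: whatever does not fit into
the bucket is lost, so the sender only learns about it through packet loss.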

#### 2. Apply Rules Directly on the User Ports (With Conduit Marker)

In this approach, rules are applied **directly to the user-facing DSA ports**
(e.g., `lan0`, `lan1`) with a **conduit-specific marker**. The kernel resolves
the mapping internally.

# Apply rules with conduit marker for user ports  
tc qdisc add dev lan0 root tbf rate 50mbit burst 5k conduit-only  
tc qdisc add dev lan1 root tbf rate 30mbit burst 3k conduit-only  

Here:

- **`conduit-only`**: A proposed marker (flag), not an existing `tc` option,
  indicating that the rule applies specifically to **host-to-port traffic** and
  not to L2-forwarded traffic within the switch.
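
As an illustration of what a TBF-style limit does to host-to-port traffic,
here is a small sketch of token-bucket shaping. It models only the generic
shaping behaviour, not the proposed `conduit-only` flag and not the kernel's
TBF code. The function name is made up. It assumes `30mbit` is 30,000,000
bit/s, `3k` is 3072 bytes, and that no packet is larger than the burst.

```python
def tbf_departures(arrivals, rate_bps, burst_bytes):
    """Return departure times for FIFO token-bucket shaping.

    `arrivals` is a list of (time_s, size_bytes) tuples sorted by time.
    Packets are delayed, never dropped (the queue is unbounded in this toy).
    """
    rate = rate_bps / 8.0            # bytes per second
    tokens = float(burst_bytes)      # bucket starts full
    clock = 0.0                      # time of the last departure
    out = []
    for t, size in arrivals:
        start = max(t, clock)        # FIFO: wait for the previous packet
        tokens = min(burst_bytes, tokens + (start - clock) * rate)
        if tokens < size:
            # Wait until enough tokens have accumulated for this packet.
            start += (size - tokens) / rate
            tokens = float(size)
        tokens -= size
        clock = start
        out.append(start)
    return out


# Four 1500-byte packets handed to the shaper at the same instant.
burst = [(0.0, 1500)] * 4
for i, t in enumerate(tbf_departures(burst, 30_000_000, 3 * 1024)):
    print(f"packet {i}: departs at {t * 1e6:.1f} us")
```

The first two packets fit into the burst and leave immediately, while the
remaining ones are paced at the configured rate. Because the excess waits in a
host-side queue instead of being discarded, the backlog stays visible to the
Qdisc layer on the host.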

### Recommendation

The second approach (**user port-based with `conduit-only` marker**) is cleaner
and more intuitive. It avoids exposing hardware details like port indices while
letting the kernel handle conduit-specific behavior transparently.

Best regards,  
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

