All of lore.kernel.org
 help / color / mirror / Atom feed
From: Petr Machata <petrm@mellanox.com>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Jiri Pirko <jiri@mellanox.com>,
	Ido Schimmel <idosch@mellanox.com>
Subject: Re: [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks
Date: Fri, 10 Jul 2020 16:40:21 +0200	[thread overview]
Message-ID: <875zavh1re.fsf@mellanox.com> (raw)
In-Reply-To: <CAM_iQpU-fh9Saaxo+6juONn+Xd891sUhgaaoht0Bkn2ssAEm8A@mail.gmail.com>


Cong Wang <xiyou.wangcong@gmail.com> writes:

> On Wed, Jul 8, 2020 at 5:13 PM Petr Machata <petrm@mellanox.com> wrote:
>>
>>
>> Petr Machata <petrm@mellanox.com> writes:
>>
>> > Cong Wang <xiyou.wangcong@gmail.com> writes:
>> >
>> > I'll think about it some more. For now I will at least fix the lack of
>> > locking.
>>
>> I guess I could store smp_processor_id() that acquired the lock in
>> struct qdisc_skb_head. Do a trylock instead of lock, and on fail check
>> the stored value. I'll need to be careful about the race between
>> unsuccessful trylock and the test, and about making sure CPU ID doesn't
>> change after it is read. I'll probe this tomorrow.
>
> Like __netif_tx_lock(), right? Seems doable.

Good to see it actually used, I wasn't sure if the idea made sense :)

Unfortunately it is not enough.

Consider two threads (A, B) and two netdevices (eth0, eth1):

- "A" takes eth0's root lock and proceeds to classification
- "B" takes eth1's root lock and proceeds to classification
- "A" invokes mirror to eth1, waits on lock held by "B"
- "B" invakes mirror to eth0, waits on lock held by "A"
- Some say they are still waiting to this day.

So one option that I see is to just stash the mirrored packet in a queue
instead of delivering it right away:

- s/netif_receive_skb/netif_rx/ in act_mirred

- Reuse the RX queue for TX packets as well, differentiating the two by
  a bit in SKB CB. Then process_backlog() would call either
  __netif_receive_skb() or dev_queue_transmit().

- Drop mirred_rec_level guard.

This seems to work, but I might be missing something non-obvious, such
as CB actually being used for something already in that context. I would
really rather not introduce a second backlog queue just for mirred
though.

Since mirred_rec_level does not kick in anymore, the same packet can end
up being forwarded from the backlog queue, to the qdisc, and back to the
backlog queue, forever. But that seems OK, that's what the admin
configured, so that's what's happening.

If this is not a good idea for some reason, this might work as well:

- Convert the current root lock to an rw lock. Convert all current
  lockers to write lock (which should be safe), except of enqueue, which
  will take read lock. That will allow many concurrent threads to enter
  enqueue, or one thread several times, but it will exclude all other
  users.

  So this guards configuration access to the qdisc tree, makes sure
  qdiscs don't go away from under one's feet.

- Introduce another spin lock to guard the private data of the qdisc
  tree, counters etc., things that even two concurrent enqueue
  operations shouldn't trample on. Enqueue takes this spin lock after
  read-locking the root lock. act_mirred drops it before injecting the
  packet and takes it again afterwards.

Any opinions y'all?

  reply	other threads:[~2020-07-10 14:40 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-26 22:45 [PATCH net-next v1 0/5] TC: Introduce qevents Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 1/5] net: sched: Pass root lock to Qdisc_ops.enqueue Petr Machata
2020-07-06 19:21   ` Cong Wang
2020-07-07 15:25     ` Petr Machata
2020-07-07 19:41       ` Cong Wang
2020-06-26 22:45 ` [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks Petr Machata
2020-07-06 19:48   ` Cong Wang
2020-07-07 15:22     ` Petr Machata
2020-07-07 19:13       ` Cong Wang
2020-07-08 12:35         ` Petr Machata
2020-07-08 16:21           ` Petr Machata
2020-07-08 19:09             ` Cong Wang
2020-07-08 19:04           ` Cong Wang
2020-07-08 21:04             ` Petr Machata
2020-07-09  0:13               ` Petr Machata
2020-07-09 19:37                 ` Cong Wang
2020-07-10 14:40                   ` Petr Machata [this message]
2020-07-14  2:51                     ` Cong Wang
2020-07-14  9:12                       ` Petr Machata
2020-07-07 19:48   ` Cong Wang
2020-07-08  9:19     ` Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 3/5] net: sched: sch_red: Split init and change callbacks Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 4/5] net: sched: sch_red: Add qevents "early_drop" and "mark" Petr Machata
2020-06-26 22:45 ` [PATCH net-next v1 5/5] selftests: forwarding: Add a RED test for SW datapath Petr Machata
2020-06-26 22:45 ` [PATCH iproute2-next v1 1/4] uapi: pkt_sched: Add two new RED attributes Petr Machata
2020-06-26 22:45   ` [PATCH iproute2-next v1 2/4] tc: Add helpers to support qevent handling Petr Machata
2020-06-26 22:45   ` [PATCH iproute2-next v1 3/4] man: tc: Describe qevents Petr Machata
2020-06-26 22:45   ` [PATCH iproute2-next v1 4/4] tc: q_red: Add support for qevents "mark" and "early_drop" Petr Machata
2020-06-26 22:56 ` [PATCH net-next v1 0/5] TC: Introduce qevents Stephen Hemminger
2020-06-29 13:21   ` Petr Machata
2020-06-30  0:15 ` David Miller
  -- strict thread matches above, loose matches on Subject: below --
2020-06-27 21:06 [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=875zavh1re.fsf@mellanox.com \
    --to=petrm@mellanox.com \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=idosch@mellanox.com \
    --cc=jiri@mellanox.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.