All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Jiri Pirko <jiri@resnulli.us>,
	jhs@mojatatu.com, xiyou.wangcong@gmail.com, davem@davemloft.net
Cc: netdev@vger.kernel.org, mlxsw@mellanox.com
Subject: Re: Qdisc->u32_node - licence to kill
Date: Mon, 07 Aug 2017 10:47:14 -0700	[thread overview]
Message-ID: <5988A7A2.3090200@gmail.com> (raw)
In-Reply-To: <20170807164100.GK2085@nanopsycho.orion>

On 08/07/2017 09:41 AM, Jiri Pirko wrote:
> Hi Jamal/Cong/David/all.
> 
> Digging in the u32 code deeper now. I need to get rid of tp->q for shared
> blocks, but I found out about this:
> 
> struct Qdisc {
> 	......
>         void                    *u32_node;
> 	......
> };
> 
> Yeah, ugly. u32 uses it to store some shared data, tp_c. It actually
> stores a linked list of all hashtables added to one qdiscs.
> 
> So basically what you have is, you have 1 root ht per prio/pref. Then
> you can have multiple hts, linked from any other ht, does not matter in
> which prio/pref they are.
> 

We can create arbitrary hash tables here independent of prio/pref via
TCA_U32_DIVISOR. Then these can be linked to other hash tables via
TCA_U32_LINK commands.

prio/pref does not really play any part here from my reading, except as
a further specifier in the walk callbacks. Making it a useful filter on
dump operations.

> Do I understand that correctly that prio/pref only has meaning if
> linking does not take place, because if there is linking, the prio/pref
> of inserted rule is simply ignored?

I think even then the prio/pref meaning is dubious, from u32_change,

                for (pins = rtnl_dereference(*ins); pins;
                     ins = &pins->next, pins = rtnl_dereference(*ins))
                        if (TC_U32_NODE(handle) < TC_U32_NODE(pins->handle))
                                break;

I think the list insert is done via handle not via prio/pref.

> 
> That is the most confusing thing I saw in net/sched/ so far. 
> Is this a bug? Sounds like one.
> 

I don't think this is a bug at very least I don't see how we can
change it without breaking users. I know people depend on the hash map
capabilities and linking logic.

> Did someone introduce *u32_node (formerly static struct tc_u_common
> *u32_list;) just to allow this weirdness?
> 
> Can I just remove this shared tp_c and make the linking to other
> hashtables only possible within the same prio/pref? That would make
> sense to me.
> 

The idea to make linking hash tables only possible within the same
prio/pref will break existing programs. We can't do this its part of
UAPI now and people depend on it.

> Thanks.
> 
> Jiri
> 

  reply	other threads:[~2017-08-07 17:47 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-07 16:41 Qdisc->u32_node - licence to kill Jiri Pirko
2017-08-07 17:47 ` John Fastabend [this message]
2017-08-07 19:06   ` Jiri Pirko
2017-08-07 19:54     ` John Fastabend
2017-08-07 23:21       ` Cong Wang
2017-08-09 12:40         ` Jamal Hadi Salim
2017-08-09 12:48           ` Jiri Pirko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5988A7A2.3090200@gmail.com \
    --to=john.fastabend@gmail.com \
    --cc=davem@davemloft.net \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=mlxsw@mellanox.com \
    --cc=netdev@vger.kernel.org \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.