* Re: [PATCH] net:mv88e6xxx: dispose irq mapping
From: Andrew Lunn @ 2016-12-06 15:08 UTC (permalink / raw)
To: Volodymyr Bendiuga; +Cc: vivien.didelot, f.fainelli, netdev
In-Reply-To: <7717215e-dddd-14af-196f-e638b5c60649@westermo.se>
On Tue, Dec 06, 2016 at 04:00:27PM +0100, Volodymyr Bendiuga wrote:
> Hi Andrew,
>
> thanks for you comments and time taken to review. I will make a new
>
> version which will indicate net-next tree.
>
> About mv88e6xxx_remove(), this function will not be called
Correct, which is why i said:
> and the error path of mv88e6xxx_probe().
which will be called on EPRODE_DEFFER.
> irq_dispose_mapping() is also called in mv88e6xxx_g2_irq_free(), so
> I put it in a similar
The mapping for chip->device_irq is made in mv88e6xxx_g2_irq_setup(),
so the disposal of the mapping logically goes in
mv88e6xxx_g2_irq_free().
Andrew
^ permalink raw reply
* Re: Avoid deadlock situation due to use of xmit_lock
From: David Miller @ 2016-12-06 15:06 UTC (permalink / raw)
To: LinoSanfilippo
Cc: bh74.an, ks.giri, vipul.pandya, peppe.cavallaro, alexandre.torgue,
linux-kernel, netdev
In-Reply-To: <1480719966-12839-1-git-send-email-LinoSanfilippo@gmx.de>
From: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Date: Sat, 3 Dec 2016 00:06:04 +0100
> after stumbling over a potential deadlock situation in the altera driver
> (see http://marc.info/?l=linux-netdev&m=148054615230447&w=2), I checked
> all other ethernet drivers for the same issue and actually found it in 2
> more, namely stmmac, and sxgbe. Please see the commit messages for a
> description of the problem.
> These 2 patches fix the concerning drivers.
First of all, I don't want to apply these patches without proper testing
and ACKs from the individual driver maintainers.
For both of these drivers, this situation only exists because the TX
path uses the unnecessary ->tx_lock. This private lock should be
removed completely and the driver should use the lock the mid-layer
already holds in the transmit path and take it in the TX reclaim path
instead of the private ->tx_lock.
^ permalink raw reply
* Re: [PATCH net-next] phy: add phy fixup unregister functions
From: David Miller @ 2016-12-06 15:01 UTC (permalink / raw)
To: Woojung.Huh; +Cc: f.fainelli, andrew, netdev
In-Reply-To: <9235D6609DB808459E95D78E17F2E43D40975526@CHN-SV-EXMX02.mchp-main.com>
From: <Woojung.Huh@microchip.com>
Date: Fri, 2 Dec 2016 23:04:13 +0000
> From: Woojung Huh <woojung.huh@microchip.com>
>
> Add functions to unregister phy fixup for modules.
>
> phy_unregister_fixup(const char *bus_id, u32 phy_uid, u32 phy_uid_mask)
> Unregister phy fixup matches bus_id, phy_uid and phy_uid_mask
> from phy_fixup_list.
> Return 0 when find matched one and remove from the list.
> Return -ENODEV when fail to find it on the list.
>
> phy_unregister_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask)
> Unregister phy fixup from phy_fixup_list.
> Use it for fixup registered by phy_register_fixup_for_uid()
> Return 0 when find matched one and remove from the list.
> Return -ENODEV when fail to find it on the list.
>
> phy_unregister_fixup_for_id(const char *bus_id)
> Unregister phy fixup from phy_fixup_list.
> Use it for fixup registered by phy_register_fixup_for_id()
> Return 0 when find matched one and remove from the list.
> Return -ENODEV when fail to find it on the list.
>
> Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Please submit this along with whatever will be the first user(s) of these
new interfaces.
^ permalink raw reply
* Re: [PATCH] net:dsa:mv88e6xxx: delete timer and cancel ppu_work
From: Andrew Lunn @ 2016-12-06 14:22 UTC (permalink / raw)
To: Volodymyr Bendiuga
Cc: vivien.didelot, f.fainelli, netdev, volodymyr.bendiuga,
Jonas Johansson
In-Reply-To: <1481029062-16668-1-git-send-email-volodymyr.bendiuga@westermo.se>
On Tue, Dec 06, 2016 at 01:57:42PM +0100, Volodymyr Bendiuga wrote:
> Signed-off-by: Jonas Johansson <jonas.johansson@westermo.se>
> Signed-off-by: Volodymyr Bendiuga <volodymyr.bendiuga@westermo.se>
> ---
> drivers/net/dsa/mv88e6xxx/chip.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index ca453f3..4212fb6 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -4528,8 +4528,11 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
> out_mdio:
> mv88e6xxx_mdio_unregister(chip);
> out_g2_irq:
> - if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_G2_INT) && chip->irq > 0)
> + if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_G2_INT) && chip->irq > 0) {
> mv88e6xxx_g2_irq_free(chip);
> + del_timer(&chip->ppu_timer);
> + cancel_work_sync(&chip->ppu_work);
> + }
Why do this here, inside this if statement?
Vivien has also just reworked the PPU code. Please take a look at his
patches and see if they fix the issue.
Thanks
Andrew
^ permalink raw reply
* Re: [PATCH] net/udp: do not touch skb->peeked unless really needed
From: Eric Dumazet @ 2016-12-06 14:35 UTC (permalink / raw)
To: Paolo Abeni; +Cc: David Miller, netdev, Willem de Bruijn
In-Reply-To: <1481026212.6225.43.camel@redhat.com>
On Tue, 2016-12-06 at 13:10 +0100, Paolo Abeni wrote:
> Please ignore the above dumb comment. I misread the 'skip condition'.
>
> I'm fine with the patch in its current form.
>
> Acked-by: Paolo Abeni <pabeni@redhat.com>
No worries, I prefer having multiple eyes on this stuff before doing the
next step ;)
Thanks !
^ permalink raw reply
* Re: [PATCH] net/udp: do not touch skb->peeked unless really needed
From: Eric Dumazet @ 2016-12-06 14:34 UTC (permalink / raw)
To: Paolo Abeni; +Cc: David Miller, netdev, Willem de Bruijn
In-Reply-To: <1481017989.6225.21.camel@redhat.com>
On Tue, 2016-12-06 at 10:53 +0100, Paolo Abeni wrote:
> Hi Eric,
> I don't understand why we can avoid setting skb->peek if len > 0. I
> think that will change the kernel behavior if:
> - peek with offset is set
> - 3 skbs with len > 0 are enqueued
> - the u/s peek (with offset) the second one
> - the u/s disable peeking with offset and peeks 2 more skbs.
>
> With the current code in the last step the u/s is going to peek the 1#
> and the 3# skbs, after this patch will peek the 1# and the 2#. Am I
> missing something ? Probably the new behavior is more correct, but still
> is a change.
>
> I gave this a run in my test bed on top of your udp-related patches I
> see additional ~3 improvement in the udp flood scenario, and a bit more
> in the un-contended scenario.
>
> Thank you,
MSG_PEEK always grab the first skb in queue, regardless of its
skb->peeked status.
Unless an offset (given in bytes) is given. Then we skip X bytes in the
queue, again regardless of skb->peeked bit.
skb->peeked is _needed_ to avoid peeking the same 0-byte skb over and
over, since user land could not skip it, as the offset to skip skbs is
computed by sum ( lengthes). An infinite loop of recvmsg() would happen,
stuck on the same skb.
For regular non 0-bytes payload, we can skip over them without even
looking at skb->peeked.
Initially, skb->peeked was only a (bad) way to make sure UDP would
increment its stats only for non peeked messages. We can implement that
using at MSG_PEEK flag.
^ permalink raw reply
* Re: [PATCH]net:switchdev:add judgement and proper return vlaue to make switchdev_port_vlan_fill() more robust
From: David Miller @ 2016-12-06 14:32 UTC (permalink / raw)
To: jiri; +Cc: cxdx2006, netdev, linux-kernel, feng.deng
In-Reply-To: <20161206123225.GK1984@nanopsycho>
From: Jiri Pirko <jiri@resnulli.us>
Date: Tue, 6 Dec 2016 13:32:25 +0100
> Tue, Dec 06, 2016 at 05:46:36AM CET, cxdx2006@gmail.com wrote:
>>From: Feng Deng<cxdx2006@gmail.com>
>>
>>1.add judgement to make sure switchdev_port_vlan_fill() could
>> return right, even when switchdev_port_vlan_dump_put() failed ,
>> which will make switchdev_port_vlan_fill() more robust
>>2.add proper return vlaue at the end of the switchdev_port_vlan_fill()
>>
>>Signed-off-by: Feng Deng <feng.deng@cortina-access.com>
>
> Please don't send patches like this. You are wasting everyone's time.
+1
^ permalink raw reply
* Re: [PATCH] dsa:mv88e6xxx: allow address 0x1 in smi_init
From: Andrew Lunn @ 2016-12-06 14:27 UTC (permalink / raw)
To: Volodymyr Bendiuga; +Cc: vivien.didelot, f.fainelli, netdev, volodymyr.bendiuga
In-Reply-To: <1481029592-19496-1-git-send-email-volodymyr.bendiuga@westermo.se>
On Tue, Dec 06, 2016 at 02:06:32PM +0100, Volodymyr Bendiuga wrote:
> Westermo devices use this address.
I think we can have a better change log entry than that. How about:
Some devices, such as the mv88e6097 do have ADDR[0] external and so it
is possible to configure the device to use SMI address 0x1. Remove the
restriction, as there are boards using this address.
Thanks
Andrew
>
> Signed-off-by: Volodymyr Bendiuga <volodymyr.bendiuga@westermo.se>
> ---
> drivers/net/dsa/mv88e6xxx/chip.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 4212fb6..05942e3 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -4253,10 +4253,6 @@ static void mv88e6xxx_phy_destroy(struct mv88e6xxx_chip *chip)
> static int mv88e6xxx_smi_init(struct mv88e6xxx_chip *chip,
> struct mii_bus *bus, int sw_addr)
> {
> - /* ADDR[0] pin is unavailable externally and considered zero */
> - if (sw_addr & 0x1)
> - return -EINVAL;
> -
> if (sw_addr == 0)
> chip->smi_ops = &mv88e6xxx_smi_single_chip_ops;
> else if (mv88e6xxx_has(chip, MV88E6XXX_FLAGS_MULTI_CHIP))
> --
> 2.7.4
>
^ permalink raw reply
* Re: [PATCH v2 net-next 1/2] flow dissector: ICMP support
From: Simon Horman @ 2016-12-06 14:23 UTC (permalink / raw)
To: Jiri Pirko
Cc: Tom Herbert, David Miller, Linux Kernel Network Developers,
Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, Jamal Hadi Salim,
Jiri Pirko
In-Reply-To: <20161206122634.GH1984@nanopsycho>
On Tue, Dec 06, 2016 at 01:26:34PM +0100, Jiri Pirko wrote:
> Tue, Dec 06, 2016 at 11:51:46AM CET, simon.horman@netronome.com wrote:
> >On Mon, Dec 05, 2016 at 09:29:45AM -0800, Tom Herbert wrote:
> >> On Mon, Dec 5, 2016 at 6:23 AM, Simon Horman <simon.horman@netronome.com> wrote:
> >> > On Sat, Dec 03, 2016 at 10:52:32AM -0800, Tom Herbert wrote:
> >> >> On Sat, Dec 3, 2016 at 2:49 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> >> >> > Fri, Dec 02, 2016 at 09:31:41PM CET, simon.horman@netronome.com wrote:
> >> >> >>Allow dissection of ICMP(V6) type and code. This re-uses transport layer
> >> >> >>port dissection code as although ICMP is not a transport protocol and their
> >> >> >>type and code are not ports this allows sharing of both code and storage.
> >> >> >>
> >> >> >>Signed-off-by: Simon Horman <simon.horman@netronome.com>
> >> >
> >> > ...
> >> >
> >> >> > Digging into this a bit more. I think it would be much nice not to mix
> >> >> > up l4 ports and icmp stuff.
> >> >> >
> >> >> > How about to have FLOW_DISSECTOR_KEY_ICMP
> >> >> > and
> >> >> > struct flow_dissector_key_icmp {
> >> >> > u8 type;
> >> >> > u8 code;
> >> >> > };
> >> >> >
> >> >> > The you can make this structure and struct flow_dissector_key_ports into
> >> >> > an union in struct flow_keys.
> >> >> >
> >> >> > Looks much cleaner to me.
> >> >
> >> > Hi Jiri,
> >> >
> >> > I wondered about that too. The main reason that I didn't do this the first
> >> > time around is that I wasn't sure what to do to differentiate between ports
> >> > and icmp in fl_init_dissector() given that using a union would could to
> >> > mask bits being set in both if they are set in either.
> >> >
> >> > I've taken a stab at addressing that below along with implementing your
> >> > suggestion. If the treatment in fl_init_dissector() is acceptable
> >> > perhaps something similar should be done for FLOW_DISSECTOR_KEY_IPV[46]_ADDRS
> >> > too?
> >> >
> >> >> I agree, this patch adds to many conditionals into the fast path for
> >> >> ICMP handling. Neither is there much point in using type and code as
> >> >> input to the packet hash.
> >> >
> >> > Hi Tom,
> >> >
> >> > I'm not sure that I follow this. A packet may be ICMP or not regardless of
> >> > if FLOW_DISSECTOR_KEY_PORT (and now FLOW_DISSECTOR_KEY_ICMP) port is set or
> >> > not. I'd appreciate some guidance here.
> >> >
> >> > First-pass at implementing Jiri's suggestion follows:
> >> >
> >
> >...
> >
> >> > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> >> > index 0584b4bb4390..33e5fa2b3e87 100644
> >> > --- a/net/core/flow_dissector.c
> >> > +++ b/net/core/flow_dissector.c
> >
> >...
> >
> >> > @@ -975,6 +983,10 @@ static const struct flow_dissector_key flow_keys_dissector_keys[] = {
> >> > .offset = offsetof(struct flow_keys, ports),
> >> > },
> >> > {
> >> > + .key_id = FLOW_DISSECTOR_KEY_ICMP,
> >> > + .offset = offsetof(struct flow_keys, icmp),
> >> > + },
> >>
> >> This is the problem I was referring to. This adds ICMP to the keys for
> >> computing the skb hash and the ICMP type and code are in a union with
> >> port numbers so they are in the range over which the hash is computed.
> >> This means that we are treating type and code as some sort of flow and
> >> and so different type/code between the same src/dst could follow
> >> different paths in ECMP. For the purposes of computing a packet hash I
> >> don't think we want ICMP information included. This might be a matter
> >> of not putting FLOW_DISSECTOR_KEY_ICMP in flow_keys_dissector_keys,
> >> where we need this information we could define another structure that
> >> has all the same fields as in flow_keys_dissector_keys and include
> >> FLOW_DISSECTOR_KEY_ICMP. Alternatively, the flow_dissector_key_icmp
> >> could also be outside of the area that is used in the hash: that is no
> >> in flow_keys_hash_start(keys) to flow_keys_hash_length(keys), keyval);
> >
> >...
> >
> >> > diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> >> > index c96b2a349779..f4ba69bd362f 100644
> >> > --- a/net/sched/cls_flower.c
> >> > +++ b/net/sched/cls_flower.c
> >> > @@ -38,7 +38,10 @@ struct fl_flow_key {
> >> > struct flow_dissector_key_ipv4_addrs ipv4;
> >> > struct flow_dissector_key_ipv6_addrs ipv6;
> >> > };
> >> > - struct flow_dissector_key_ports tp;
> >> > + union {
> >> > + struct flow_dissector_key_ports tp;
> >> > + struct flow_dissector_key_icmp icmp;
> >> > + };
> >>
> >> When an ICMP packet is encapsulated within UDP then icmp overwrites
> >> valid port information with is a stronger signal of the flow (unless
> >> FLOW_DISSECTOR_F_STOP_AT_ENCAP is set). This is another reason not to
> >> use a union with ports.
> >
> >...
> >
> >Hi Tom,
> >
> >thanks for your explanation. I think I have a clearer picture now.
> >
> >I have reworked things to try to address your concerns.
> >In particular:
> >
> >* FLOW_DISSECTOR_KEY_ICMP is no longer added to flow_keys_dissector_keys.
> > I don't think it is needed at all for the use-case I currently have in
> > mind, which is classifying packets using tc_flower. And adding it there
> > was an error on my part.
>
> I was just about to ask why you are adding it there. Good.
>
>
> >
> >* I stopped using a union for ports and icmp. At the very least this seems
> > to make it easier to reason about things as we no longer need to consider
> > that a port value may in fact be an ICMP type or code.
>
> This should be within csl_flower code. You can easily have it as a union
> in struct fl_flow_key.
Ok, will do.
> > This seems to allow us avoid adding a number of is_icmp checks (as I believe
> > you pointed out earlier). And should also address the problem you state
> > immediately above.
>
> I don't understand the check you are talking about. The union just allow
> to share the mem. I don't see any checks needed.
I meant the checks that the patchset adds were in bond_main.c and
other places before accessing the ports fields. I now see we can avoid them
while still using a union.
> >* I have also placed icmp outside of the range flow_keys_hash_start(keys)
> > to flow_keys_hash_length(keys), keyval). This is because I now see no
> > value of having it inside that range and it both avoids any chance of
> > contaminating the hash with ICMP values and hashing over unwanted
> > (hopefully zero) values.
>
> Okay, now I'm confused again. You don't want to add this to
> flow_keys_dissector_keys. Why would you?
Sorry, I think it was me that was being confused. I'll drop that change.
I agree it should not be necessary.
^ permalink raw reply
* Re: [PATCH] net:mv88e6xxx: dispose irq mapping
From: Andrew Lunn @ 2016-12-06 14:19 UTC (permalink / raw)
To: Volodymyr Bendiuga; +Cc: vivien.didelot, f.fainelli, netdev
In-Reply-To: <1481018743-27208-1-git-send-email-volodymyr.bendiuga@westermo.se>
On Tue, Dec 06, 2016 at 11:05:43AM +0100, Volodymyr Bendiuga wrote:
> If this is not done, then it is not possible to map
> irq next time, after EPROBE_DEFER.
>
> Signed-off-by: Volodymyr Bendiuga <volodymyr.bendiuga@westermo.se>
Hi Volodymyr
Thanks for your patches.
Please could you include in the subject line which tree it is for. See
Documentation/networking/netdev-FAQ.txt
> ---
> drivers/net/dsa/mv88e6xxx/chip.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 05942e3..12e7d38 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -420,6 +420,7 @@ static void mv88e6xxx_g1_irq_free(struct mv88e6xxx_chip *chip)
> mv88e6xxx_g1_write(chip, GLOBAL_CONTROL, mask);
>
> free_irq(chip->irq, chip);
> + irq_dispose_mapping(chip->irq);
This seems like the wrong place to do this.
The mapping is created by chip->irq = of_irq_get(np, 0); in
mv88e6xxx_probe(). So the correct place to dispose of this mapping would be in
mv88e6xxx_remove() and the error path of mv88e6xxx_probe().
Thanks
Andrew
^ permalink raw reply
* Re: [PATCH net-next 2/3] net/sched: cls_flower: Add support for matching on flags
From: Or Gerlitz @ 2016-12-06 14:02 UTC (permalink / raw)
To: Jiri Pirko; +Cc: David S. Miller, netdev, Jiri Pirko, Roi Dayan, Hadar Har-Zion
In-Reply-To: <20161206124006.GL1984@nanopsycho>
On 12/6/2016 2:40 PM, Jiri Pirko wrote:
> Tue, Dec 06, 2016 at 01:32:58PM CET, ogerlitz@mellanox.com wrote:
>> Add UAPI to provide set of flags for matching, where the flags
>> provided from user-space are mapped to flow-dissector flags.
>>
>> The 1st flag allows to match on whether the packet is an
>> IP fragment and corresponds to the FLOW_DIS_IS_FRAGMENT flag.
>>
>> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
>> Reviewed-by: Paul Blakey <paulb@mellanox.com>
>> ---
>> include/uapi/linux/pkt_cls.h | 7 ++++
>> net/sched/cls_flower.c | 83 ++++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 90 insertions(+)
>>
>> diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
>> index 86786d4..f114277 100644
>> --- a/include/uapi/linux/pkt_cls.h
>> +++ b/include/uapi/linux/pkt_cls.h
>> @@ -457,11 +457,18 @@ enum {
>> TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK, /* be16 */
>> TCA_FLOWER_KEY_ENC_UDP_DST_PORT, /* be16 */
>> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK, /* be16 */
>> +
>> + TCA_FLOWER_KEY_FLAGS, /* be32 */
>> + TCA_FLOWER_KEY_FLAGS_MASK, /* be32 */
>> __TCA_FLOWER_MAX,
>> };
>>
>> #define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1)
>>
>> +enum {
>> + TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = BIT(0),
>> +};
>> +
>> /* Match-all classifier */
>>
>> enum {
>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>> index c5cea78..d9f4124 100644
>> --- a/net/sched/cls_flower.c
>> +++ b/net/sched/cls_flower.c
>> @@ -383,6 +383,8 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = {
>> [TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK] = { .type = NLA_U16 },
>> [TCA_FLOWER_KEY_ENC_UDP_DST_PORT] = { .type = NLA_U16 },
>> [TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK] = { .type = NLA_U16 },
>> + [TCA_FLOWER_KEY_FLAGS] = { .type = NLA_U32 },
>> + [TCA_FLOWER_KEY_FLAGS_MASK] = { .type = NLA_U32 },
>> };
>>
>> static void fl_set_key_val(struct nlattr **tb,
>> @@ -417,6 +419,40 @@ static void fl_set_key_vlan(struct nlattr **tb,
>> }
>> }
>>
>> +static void set_flags(u32 flower_key, u32 flower_mask,
> Use some prefix for helpers like this, or at least __
okay, will use fl_set_key_flags here
>
>
>> + u32 *dissector_key, u32 *dissector_mask,
>> + u32 flower_flag_bit, u32 dissector_flag_bit)
>> +{
>> + if (flower_mask & flower_flag_bit) {
>> + *dissector_mask |= dissector_flag_bit;
>> + if (flower_key & flower_flag_bit)
>> + *dissector_key |= dissector_flag_bit;
>> + }
>> +}
>> +
>> +static void fl_set_key_flags(struct nlattr **tb,
>> + u32 *flags_key,
>> + u32 *flags_mask)
>> +{
>> + u32 key, mask;
>> +
>> + if (!tb[TCA_FLOWER_KEY_FLAGS])
>> + return;
>> +
>> + key = be32_to_cpu(nla_get_u32(tb[TCA_FLOWER_KEY_FLAGS]));
>> +
>> + if (!tb[TCA_FLOWER_KEY_FLAGS_MASK])
>> + mask = ~0;
>> + else
>> + mask = be32_to_cpu(nla_get_u32(tb[TCA_FLOWER_KEY_FLAGS_MASK]));
>> +
>> + *flags_key = 0;
>> + *flags_mask = 0;
>> +
>> + set_flags(key, mask, flags_key, flags_mask,
>> + TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT, FLOW_DIS_IS_FRAGMENT);
>> +}
>> +
>> static int fl_set_key(struct net *net, struct nlattr **tb,
>> struct fl_flow_key *key, struct fl_flow_key *mask)
>> {
>> @@ -543,6 +579,8 @@ static int fl_set_key(struct net *net, struct nlattr **tb,
>> &mask->enc_tp.dst, TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK,
>> sizeof(key->enc_tp.dst));
>>
>> + fl_set_key_flags(tb, &key->control.flags, &mask->control.flags);
>> +
>> return 0;
>> }
>>
>> @@ -877,6 +915,48 @@ static int fl_dump_key_vlan(struct sk_buff *skb,
>> return 0;
>> }
>>
>> +static void get_flags(u32 dissector_key, u32 dissector_mask,
> Same here.
and fl_get_key_flags here
>
>> + u32 *flower_key, u32 *flower_mask,
>> + u32 flower_flag_bit, u32 dissector_flag_bit)
>> +{
>> + if (dissector_mask & dissector_flag_bit) {
>> + *flower_mask |= flower_flag_bit;
>> + if (dissector_key & dissector_flag_bit)
>> + *flower_key |= flower_flag_bit;
>> + }
>> +}
>> +
>> +static int fl_dump_key_flags(struct sk_buff *skb,
>> + u32 flags_key,
>> + u32 flags_mask)
> over-wrapped :)
okay,
>
>> +{
>> + u32 key, mask;
>> + __be32 _key, _mask;
>> + int err;
>> +
>> + if (!memchr_inv(&flags_mask, 0, sizeof(flags_mask)))
>> + return 0;
>> +
>> + key = 0;
> ^ remove the extra space
okay
>
>> + mask = 0;
>> +
>> + get_flags(flags_key, flags_mask, &key, &mask,
>> + TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT, FLOW_DIS_IS_FRAGMENT);
>> +
>> + _key = cpu_to_be32(key);
> ^ remove the extra space
NP
>
>> + _mask = cpu_to_be32(mask);
>> +
>> + err = nla_put(skb, TCA_FLOWER_KEY_FLAGS, 4, &_key);
>> + if (err)
>> + return err;
>> +
>> + err = nla_put(skb, TCA_FLOWER_KEY_FLAGS_MASK, 4, &_mask);
> Just return nla_put...
sure
^ permalink raw reply
* [PATCH net-next v2 1/1] driver: ipvlan: Free ipvl_port directly with kfree instead of kfree_rcu
From: fgao @ 2016-12-06 13:54 UTC (permalink / raw)
To: davem, maheshb, edumazet, netdev, gfree.wind
From: Gao Feng <gfree.wind@gmail.com>
There is no one which may reference the ipvlan port when free it in
ipvlan_port_create and ipvlan_port_destroy. So it is unnecessary to
use kfree_rcu.
Signed-off-by: Gao Feng <gfree.wind@gmail.com>
---
v2: Remove the rcu of ipvl_port directly
v1: Initial version
drivers/net/ipvlan/ipvlan.h | 1 -
drivers/net/ipvlan/ipvlan_main.c | 4 ++--
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 05a62d2..031093e 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -97,7 +97,6 @@ struct ipvl_port {
struct work_struct wq;
struct sk_buff_head backlog;
int count;
- struct rcu_head rcu;
};
static inline struct ipvl_port *ipvlan_port_get_rcu(const struct net_device *d)
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index c6aa667..44ceebc 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -128,7 +128,7 @@ static int ipvlan_port_create(struct net_device *dev)
return 0;
err:
- kfree_rcu(port, rcu);
+ kfree(port);
return err;
}
@@ -145,7 +145,7 @@ static void ipvlan_port_destroy(struct net_device *dev)
netdev_rx_handler_unregister(dev);
cancel_work_sync(&port->wq);
__skb_queue_purge(&port->backlog);
- kfree_rcu(port, rcu);
+ kfree(port);
}
#define IPVLAN_FEATURES \
--
1.9.1
^ permalink raw reply related
* Re: [PATCH]net:sched:release lock before tcf_dump_walker() normal return to avoid deadlock
From: Jamal Hadi Salim @ 2016-12-06 13:50 UTC (permalink / raw)
To: Feng Deng, David S. Miller
Cc: netdev, linux-kernel, feng.deng, Roman Mashak, Cong Wang
In-Reply-To: <CAEX1niWim7BSki6uRtihwbJ3Adw4Qwt8MR1aki8UA4_x7V9UTA@mail.gmail.com>
On 16-12-06 12:36 AM, Feng Deng wrote:
> From: Feng Deng<cxdx2006@gmail.com>
>
> release lock before tcf_dump_walker() normal return to avoid deadlock
>
/Scratching my head.
I am probably missing something obvious.
What are the condition under which this deadlock will happen?
Do you have a testcase we can try?
cheers,
jamal
^ permalink raw reply
* [PATCH] net: stmmac: do not call phy_ethtool_ksettings_set from atomic context
From: Niklas Cassel @ 2016-12-06 13:47 UTC (permalink / raw)
To: Giuseppe Cavallaro, Alexandre Torgue; +Cc: Niklas Cassel, netdev, linux-kernel
From: Niklas Cassel <niklas.cassel@axis.com>
>From what I can tell, spin_lock(&priv->lock) is not needed, since the
phy_ethtool_ksettings_set call is not given the priv struct.
phy_start_aneg takes the phydev->lock. Calls to phy_adjust_link
from phy_state_machine also takes the phydev->lock.
[ 13.718319] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:97
[ 13.726717] in_atomic(): 1, irqs_disabled(): 0, pid: 1307, name: ethtool
[ 13.742115] Hardware name: Axis ARTPEC-6 Platform
[ 13.746829] [<80110568>] (unwind_backtrace) from [<8010c2bc>] (show_stack+0x18/0x1c)
[ 13.754575] [<8010c2bc>] (show_stack) from [<80433484>] (dump_stack+0x80/0xa0)
[ 13.761801] [<80433484>] (dump_stack) from [<80145428>] (___might_sleep+0x108/0x170)
[ 13.769554] [<80145428>] (___might_sleep) from [<806c9b50>] (mutex_lock+0x24/0x44)
[ 13.777128] [<806c9b50>] (mutex_lock) from [<8050cbc0>] (phy_start_aneg+0x1c/0x13c)
[ 13.784783] [<8050cbc0>] (phy_start_aneg) from [<8050d338>] (phy_ethtool_ksettings_set+0x98/0xd0)
[ 13.793656] [<8050d338>] (phy_ethtool_ksettings_set) from [<80517adc>] (stmmac_ethtool_set_link_ksettings+0xa0/0xb4)
[ 13.804184] [<80517adc>] (stmmac_ethtool_set_link_ksettings) from [<805c5138>] (ethtool_set_settings+0xd4/0x13c)
[ 13.814358] [<805c5138>] (ethtool_set_settings) from [<805c9718>] (dev_ethtool+0x13c4/0x211c)
[ 13.822882] [<805c9718>] (dev_ethtool) from [<805dc7c0>] (dev_ioctl+0x480/0x8e0)
[ 13.830291] [<805dc7c0>] (dev_ioctl) from [<80260e34>] (do_vfs_ioctl+0x94/0xa00)
[ 13.837699] [<80260e34>] (do_vfs_ioctl) from [<802617dc>] (SyS_ioctl+0x3c/0x60)
[ 13.845011] [<802617dc>] (SyS_ioctl) from [<801088bc>] (__sys_trace_return+0x0/0x10)
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index d5a8122b6033..6ab7e2bdcadd 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -406,9 +406,7 @@ stmmac_ethtool_set_link_ksettings(struct net_device *dev,
return 0;
}
- spin_lock(&priv->lock);
rc = phy_ethtool_ksettings_set(phy, cmd);
- spin_unlock(&priv->lock);
return rc;
}
--
2.1.4
^ permalink raw reply related
* Re: [PATCH v4 09/13] net: ethernet: ti: cpts: rework initialization/deinitialization
From: Richard Cochran @ 2016-12-06 13:40 UTC (permalink / raw)
To: Grygorii Strashko
Cc: David S. Miller, netdev-u79uwXL29TY76Z2rM5mHXA, Mugunthan V N,
Sekhar Nori, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-omap-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA, Murali Karicheri, Wingman Kwok,
Thomas Gleixner
In-Reply-To: <20161205200525.16664-10-grygorii.strashko-l0cyMroinI0@public.gmane.org>
On Mon, Dec 05, 2016 at 02:05:21PM -0600, Grygorii Strashko wrote:
> @@ -372,34 +354,27 @@ void cpts_tx_timestamp(struct cpts *cpts, struct sk_buff *skb)
> }
> EXPORT_SYMBOL_GPL(cpts_tx_timestamp);
>
> -int cpts_register(struct device *dev, struct cpts *cpts,
> - u32 mult, u32 shift)
> +int cpts_register(struct cpts *cpts)
> {
> int err, i;
>
> - cpts->info = cpts_info;
> - spin_lock_init(&cpts->lock);
> -
> - cpts->cc.read = cpts_systim_read;
> - cpts->cc.mask = CLOCKSOURCE_MASK(32);
> - cpts->cc_mult = mult;
> - cpts->cc.mult = mult;
> - cpts->cc.shift = shift;
> -
> INIT_LIST_HEAD(&cpts->events);
> INIT_LIST_HEAD(&cpts->pool);
> for (i = 0; i < CPTS_MAX_EVENTS; i++)
> list_add(&cpts->pool_data[i].list, &cpts->pool);
>
> - cpts_clk_init(dev, cpts);
> + clk_enable(cpts->refclk);
> +
> cpts_write32(cpts, CPTS_EN, control);
> cpts_write32(cpts, TS_PEND_EN, int_enable);
>
> + /* reinitialize cc.mult to original value as it can be modified
> + * by cpts_ptp_adjfreq().
> + */
> + cpts->cc.mult = cpts->cc_mult;
This still isn't quite right. First of all, you shouldn't clobber the
learned cc.mult value in cpts_register(). Presumably, if PTP had been
run on this port before, then the learned frequency is approximately
correct, and it should be left alone.
[ BTW, resetting the timecounter here makes no sense either. Why
reset the clock just because the interface goes down? ]
Secondly, you have made the initialization order of these fields hard
to follow. With the whole series applied:
probe()
cpts_create()
cpts_of_parse()
{
/* Set cc_mult but not cc.mult! */
set cc_mult
set cc.shift
}
cpts_calc_mult_shift()
{
/* Set them both. */
cpts->cc_mult = mult;
cpts->cc.mult = mult;
cpts->cc.shift = shift;
}
/* later on */
cpts_register()
cpts->cc.mult = cpts->cc_mult;
There is no need for such complexity. Simply set cc.mult in
cpts_create() _once_, immediately after the call to
cpts_calc_mult_shift().
You can remove the assignment from cpts_calc_mult_shift() and
cpts_register().
Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 0/5] cpsw: add per channel shaper configuration
From: Ivan Khoronzhuk @ 2016-12-06 13:35 UTC (permalink / raw)
To: Grygorii Strashko; +Cc: netdev, linux-kernel, mugunthanvnm, linux-omap
In-Reply-To: <4640ce88-75b0-8508-05f0-849e8fd4927f@ti.com>
On Mon, Dec 05, 2016 at 02:33:40PM -0600, Grygorii Strashko wrote:
> Hi Ivan,
Hi, Grygorii
I've sent patch that fixes issue in question.
https://lkml.org/lkml/2016/12/5/811
Could you please review it. Also I'm preparing several patches
to sophisticate cases with unexpected speed value.
>
> On 11/29/2016 09:00 AM, Ivan Khoronzhuk wrote:
> > This series is intended to allow user to set rate for per channel
> > shapers at cpdma level. This patchset doesn't have impact on performance.
> > The rate can be set with:
> >
> > echo 100 > /sys/class/net/ethX/queues/tx-0/tx_maxrate
> >
> > Tested on am572xx
> > Based on net-next/master
> >
> > Ivan Khoronzhuk (5):
> > net: ethernet: ti: davinci_cpdma: add weight function for channels
> > net: ethernet: ti: davinci_cpdma: add set rate for a channel
> > net: ethernet: ti: cpsw: add .ndo to set per-queue rate
> > net: ethernet: ti: cpsw: optimize end of poll cycle
> > net: ethernet: ti: cpsw: split tx budget according between channels
> >
> > drivers/net/ethernet/ti/cpsw.c | 264 +++++++++++++++----
> > drivers/net/ethernet/ti/davinci_cpdma.c | 453 ++++++++++++++++++++++++++++----
> > drivers/net/ethernet/ti/davinci_cpdma.h | 6 +
> > 3 files changed, 624 insertions(+), 99 deletions(-)
> >
>
>
> I've just tried net-next on BBB and got below back-trace:
> INIT: Entering runlevel: 5
> Configuring network interfaces... [ 15.018356] net eth0: initializing cpsw version 1.12 (0)
> [ 15.120153] SMSC LAN8710/LAN8720 4a101000.mdio:00: attached PHY driver [SMSC LAN8710/LAN8720] (mii_bus:phy_addr=4a101000.mdio:00, irq=-1)
> [ 15.138578] Division by zero in kernel.
> [ 15.142667] CPU: 0 PID: 755 Comm: ifconfig Not tainted 4.9.0-rc7-01617-g6ea3f00 #5
> [ 15.150277] Hardware name: Generic AM33XX (Flattened Device Tree)
> [ 15.156399] Backtrace:
> [ 15.158898] [<c010bbf8>] (dump_backtrace) from [<c010beb4>] (show_stack+0x18/0x1c)
> [ 15.166508] r7:00000000 r6:600f0013 r5:00000000 r4:c0d395d0
> [ 15.172200] [<c010be9c>] (show_stack) from [<c03ce6cc>] (dump_stack+0x8c/0xa0)
> [ 15.179460] [<c03ce640>] (dump_stack) from [<c010bd60>] (__div0+0x1c/0x20)
> [ 15.186368] r7:00000000 r6:ddf1c010 r5:00000001 r4:00000001
> [ 15.192068] [<c010bd44>] (__div0) from [<c03cd17c>] (Ldiv0+0x8/0x14)
> [ 15.198503] [<bf07ba38>] (cpsw_split_budget [ti_cpsw]) from [<bf07ee8c>] (cpsw_ndo_open+0x4b8/0x5e4 [ti_cpsw])
> [ 15.208554] r10:ddf1c010 r9:00000000 r8:00000000 r7:ddf1c010 r6:dcc88000 r5:dcc88500
> [ 15.216418] r4:ddf1c0b0
> [ 15.218985] [<bf07e9d4>] (cpsw_ndo_open [ti_cpsw]) from [<c0731c68>] (__dev_open+0xb0/0x114)
> [ 15.227466] r10:00000000 r9:00000000 r8:00000000 r7:dcc88030 r6:bf080364 r5:00000000
> [ 15.235330] r4:dcc88000
> [ 15.237880] [<c0731bb8>] (__dev_open) from [<c0731f24>] (__dev_change_flags+0x9c/0x14c)
> [ 15.245923] r7:00001002 r6:00000001 r5:00001043 r4:dcc88000
> [ 15.251613] [<c0731e88>] (__dev_change_flags) from [<c0731ff4>] (dev_change_flags+0x20/0x50)
> [ 15.260093] r9:00000000 r8:00000000 r7:dcbabf0c r6:00001002 r5:dcc8813c r4:dcc88000
> [ 15.267886] [<c0731fd4>] (dev_change_flags) from [<c07a3168>] (devinet_ioctl+0x6d4/0x794)
> [ 15.276105] r9:00000000 r8:bee8cc64 r7:dcbabf0c r6:00000000 r5:dc921e80 r4:00000000
> [ 15.283891] [<c07a2a94>] (devinet_ioctl) from [<c07a54bc>] (inet_ioctl+0x19c/0x1c8)
> [ 15.291587] r10:00000000 r9:dc920000 r8:bee8cc64 r7:c0d64bc0 r6:bee8cc64 r5:dd2728e0
> [ 15.299450] r4:00008914
> [ 15.302002] [<c07a5320>] (inet_ioctl) from [<c07112dc>] (sock_ioctl+0x14c/0x300)
> [ 15.309443] [<c0711190>] (sock_ioctl) from [<c0241868>] (do_vfs_ioctl+0xa8/0x98c)
> [ 15.316962] r7:00000003 r6:ddf0a780 r5:dd2728e0 r4:bee8cc64
> [ 15.322653] [<c02417c0>] (do_vfs_ioctl) from [<c0242188>] (SyS_ioctl+0x3c/0x64)
> [ 15.330000] r10:00000000 r9:dc920000 r8:bee8cc64 r7:00008914 r6:ddf0a780 r5:00000003
> [ 15.337864] r4:ddf0a780
> [ 15.340416] [<c024214c>] (SyS_ioctl) from [<c0107d60>] (ret_fast_syscall+0x0/0x3c)
> [ 15.348024] r9:dc920000 r8:c0107f24 r7:00000036 r6:bee8cf4d r5:bee8ce4c r4:000949f0
> [ 15.361174] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
> udhcpc (v1.23.1) started
>
>
> --
> regards,
> -grygorii
^ permalink raw reply
* Re: [PATCH net-next 1/3] net/flow_dissector: Enable matching on flags for TC filter consumers
From: Jiri Pirko @ 2016-12-06 12:43 UTC (permalink / raw)
To: Or Gerlitz
Cc: David S. Miller, netdev, Jiri Pirko, Roi Dayan, Hadar Har-Zion,
Tom Herbert
In-Reply-To: <1481027579-23195-2-git-send-email-ogerlitz@mellanox.com>
Tue, Dec 06, 2016 at 01:32:57PM CET, ogerlitz@mellanox.com wrote:
>This is a pre-step to allow TC classifiers that makes use of the
>flow-dissector (e.g Flower) to match on flow-dissector flags which
>are located in the control struct.
>
>Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
>---
> include/net/flow_dissector.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
>index c4f3166..a679e6c 100644
>--- a/include/net/flow_dissector.h
>+++ b/include/net/flow_dissector.h
>@@ -154,8 +154,8 @@ struct flow_dissector {
> };
>
> struct flow_keys {
>+#define FLOW_KEYS_HASH_START_FIELD control
> struct flow_dissector_key_control control;
>-#define FLOW_KEYS_HASH_START_FIELD basic
How is this hashing related to cls_flower dissections? What am I
missing?
^ permalink raw reply
* [PATCH RFC v1 2/2] ethtool: add support to specify ring_cookie as VF/queue
From: Jacob Keller @ 2016-12-06 12:40 UTC (permalink / raw)
To: John W . Linville; +Cc: netdev, Jacob Keller
In-Reply-To: <20161206124031.32346-1-jacob.e.keller@intel.com>
Add support to allow users to specify the ring_cookie as a VF/queue
value, ie: '0/25' means direct to the physical function queue 25, while
'3/10' means to direct to VF id 3, queue 10. It is driver dependent to
determine what the VF id is, excepting that 0 indicates the physical
function. This allows the user to more easily specify the matching
notation as it is understood by the kernel and some drivers. Since no
driver exists today with over 2,147,483,647 queues, this notation will
simply fail on drivers which don't recognize the new partitioning.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
ethtool.8.in | 3 ++-
rxclass.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++-------
2 files changed, 53 insertions(+), 8 deletions(-)
diff --git a/ethtool.8.in b/ethtool.8.in
index 5c36c06385f6..cb165fa4c77a 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -826,7 +826,8 @@ Specifies the Rx queue to send packets to, or some other action.
nokeep;
lB l.
-1 Drop the matched flow
-0 or higher Rx queue to route the flow
+0 or higher Route flow to specific Rx queue on the physical function
+N/M Route flow to Rx queue M of specific virtual function N (0 indicates the physical function)
.TE
.TP
.BI loc \ N
diff --git a/rxclass.c b/rxclass.c
index 7f0e765d3b47..de25550529b4 100644
--- a/rxclass.c
+++ b/rxclass.c
@@ -614,6 +614,7 @@ typedef enum {
OPT_IP4,
OPT_IP6,
OPT_MAC,
+ OPT_ACTION,
} rule_opt_type_t;
#define NFC_FLAG_RING 0x001
@@ -654,7 +655,7 @@ static const struct rule_opts rule_nfc_tcp_ip4[] = {
{ "dst-port", OPT_BE16, NFC_FLAG_DPORT,
offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip4_spec.pdst),
offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip4_spec.pdst) },
- { "action", OPT_U64, NFC_FLAG_RING,
+ { "action", OPT_ACTION, NFC_FLAG_RING,
offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
{ "loc", OPT_U32, NFC_FLAG_LOC,
offsetof(struct ethtool_rx_flow_spec, location), -1 },
@@ -685,7 +686,7 @@ static const struct rule_opts rule_nfc_esp_ip4[] = {
{ "spi", OPT_BE32, NFC_FLAG_SPI,
offsetof(struct ethtool_rx_flow_spec, h_u.esp_ip4_spec.spi),
offsetof(struct ethtool_rx_flow_spec, m_u.esp_ip4_spec.spi) },
- { "action", OPT_U64, NFC_FLAG_RING,
+ { "action", OPT_ACTION, NFC_FLAG_RING,
offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
{ "loc", OPT_U32, NFC_FLAG_LOC,
offsetof(struct ethtool_rx_flow_spec, location), -1 },
@@ -728,7 +729,7 @@ static const struct rule_opts rule_nfc_usr_ip4[] = {
{ "dst-port", OPT_BE16, NFC_FLAG_DPORT,
offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip4_spec.l4_4_bytes) + 2,
offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip4_spec.l4_4_bytes) + 2 },
- { "action", OPT_U64, NFC_FLAG_RING,
+ { "action", OPT_ACTION, NFC_FLAG_RING,
offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
{ "loc", OPT_U32, NFC_FLAG_LOC,
offsetof(struct ethtool_rx_flow_spec, location), -1 },
@@ -762,7 +763,7 @@ static const struct rule_opts rule_nfc_tcp_ip6[] = {
{ "dst-port", OPT_BE16, NFC_FLAG_DPORT,
offsetof(struct ethtool_rx_flow_spec, h_u.tcp_ip6_spec.pdst),
offsetof(struct ethtool_rx_flow_spec, m_u.tcp_ip6_spec.pdst) },
- { "action", OPT_U64, NFC_FLAG_RING,
+ { "action", OPT_ACTION, NFC_FLAG_RING,
offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
{ "loc", OPT_U32, NFC_FLAG_LOC,
offsetof(struct ethtool_rx_flow_spec, location), -1 },
@@ -793,7 +794,7 @@ static const struct rule_opts rule_nfc_esp_ip6[] = {
{ "spi", OPT_BE32, NFC_FLAG_SPI,
offsetof(struct ethtool_rx_flow_spec, h_u.esp_ip6_spec.spi),
offsetof(struct ethtool_rx_flow_spec, m_u.esp_ip6_spec.spi) },
- { "action", OPT_U64, NFC_FLAG_RING,
+ { "action", OPT_ACTION, NFC_FLAG_RING,
offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
{ "loc", OPT_U32, NFC_FLAG_LOC,
offsetof(struct ethtool_rx_flow_spec, location), -1 },
@@ -836,7 +837,7 @@ static const struct rule_opts rule_nfc_usr_ip6[] = {
{ "dst-port", OPT_BE16, NFC_FLAG_DPORT,
offsetof(struct ethtool_rx_flow_spec, h_u.usr_ip6_spec.l4_4_bytes) + 2,
offsetof(struct ethtool_rx_flow_spec, m_u.usr_ip6_spec.l4_4_bytes) + 2 },
- { "action", OPT_U64, NFC_FLAG_RING,
+ { "action", OPT_ACTION, NFC_FLAG_RING,
offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
{ "loc", OPT_U32, NFC_FLAG_LOC,
offsetof(struct ethtool_rx_flow_spec, location), -1 },
@@ -864,7 +865,7 @@ static const struct rule_opts rule_nfc_ether[] = {
{ "proto", OPT_BE16, NFC_FLAG_PROTO,
offsetof(struct ethtool_rx_flow_spec, h_u.ether_spec.h_proto),
offsetof(struct ethtool_rx_flow_spec, m_u.ether_spec.h_proto) },
- { "action", OPT_U64, NFC_FLAG_RING,
+ { "action", OPT_ACTION, NFC_FLAG_RING,
offsetof(struct ethtool_rx_flow_spec, ring_cookie), -1 },
{ "loc", OPT_U32, NFC_FLAG_LOC,
offsetof(struct ethtool_rx_flow_spec, location), -1 },
@@ -948,6 +949,40 @@ static int rxclass_get_ether(char *str, unsigned char *val)
return 0;
}
+static int rxclass_get_action(char *str, unsigned long long *val)
+{
+ const long long max_queue = ~0ULL >> 32;
+ const long long max_vf = ~0ULL >> 8;
+ char *end_vf, *end_queue;
+ unsigned long long queue, vf;
+
+ /* If there is no '/' just read the full value like a u64 */
+ if (!strchr(str, '/'))
+ return rxclass_get_ulong(str, val, 64);
+
+ errno = 0;
+
+ vf = strtoull(str, &end_vf, 0);
+
+ /*
+ * Ensure that we consume everything up to the slash, and that there
+ * is more than the empty string following the slash.
+ */
+ if ((*end_vf != '/') || !(*(end_vf + 1)) || errno || (vf > max_vf))
+ return -1;
+
+ errno = 0;
+
+ queue = strtoull(end_vf + 1, &end_queue, 0);
+
+ if (*end_queue || errno || (queue > max_queue))
+ return -1;
+
+ *val = (vf << ETHTOOL_RX_FLOW_SPEC_RING_VF_OFF) | queue;
+
+ return 0;
+}
+
static int rxclass_get_val(char *str, unsigned char *p, u32 *flags,
const struct rule_opts *opt)
{
@@ -1070,6 +1105,14 @@ static int rxclass_get_val(char *str, unsigned char *p, u32 *flags,
memcpy(&p[opt->moffset], &mask, ETH_ALEN);
break;
}
+ case OPT_ACTION: {
+ unsigned long long val;
+ err = rxclass_get_action(str, &val);
+ if (err)
+ return -1;
+ *(u64 *)&p[opt->offset] = (u64)val;
+ break;
+ }
case OPT_NONE:
default:
return -1;
@@ -1183,6 +1226,7 @@ static int rxclass_get_mask(char *str, unsigned char *p,
memcpy(&p[opt->moffset], val, ETH_ALEN);
break;
}
+ case OPT_ACTION:
case OPT_NONE:
default:
return -1;
--
2.11.0.rc2.152.g4d04e67
^ permalink raw reply related
* [PATCH RFC v1 1/2] rxclass: use ethtool_get_flow_spec_ring to display queue and VF
From: Jacob Keller @ 2016-12-06 12:40 UTC (permalink / raw)
To: John W . Linville; +Cc: netdev, Jacob Keller
In-Reply-To: <20161206124031.32346-1-jacob.e.keller@intel.com>
Recent kernels have made the ring_cookie store both the Queue index as
well as a VF identifier. This allows for drivers to direct traffic to
specific VF queues, without having the user have to understand the
physical queue layout. Add support to display this notation so that
users do not have to manually parse the value.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
rxclass.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/rxclass.c b/rxclass.c
index c7bfebaf6e22..7f0e765d3b47 100644
--- a/rxclass.c
+++ b/rxclass.c
@@ -247,11 +247,19 @@ static void rxclass_print_nfc_rule(struct ethtool_rx_flow_spec *fsp)
rxclass_print_nfc_spec_ext(fsp);
- if (fsp->ring_cookie != RX_CLS_FLOW_DISC)
- fprintf(stdout, "\tAction: Direct to queue %llu\n",
- fsp->ring_cookie);
- else
+ if (fsp->ring_cookie != RX_CLS_FLOW_DISC) {
+ u64 vf = ethtool_get_flow_spec_ring_vf(fsp->ring_cookie);
+ u64 queue = ethtool_get_flow_spec_ring(fsp->ring_cookie);
+
+ if (vf)
+ fprintf(stdout, "\tAction: Direct to queue %llu\n",
+ queue);
+ else
+ fprintf(stdout, "\tAction: Direct to VF %llu queue %llu\n",
+ vf, queue);
+ } else {
fprintf(stdout, "\tAction: Drop\n");
+ }
fprintf(stdout, "\n");
}
--
2.11.0.rc2.152.g4d04e67
^ permalink raw reply related
* [PATCH RFC v1 0/2] ethtool: add support for VF in ring_cookie
From: Jacob Keller @ 2016-12-06 12:40 UTC (permalink / raw)
To: John W . Linville; +Cc: netdev, Jacob Keller
The ring_cookie value for an ntuple filter is used by the driver to
determine what queue to direct the traffic towards. Since the main
function has cnotrol over all queues it is possible to use this to
direct traffic to a queue on a virtual function rather than a queue on
the physical function. However, this was cumbersome to do if you did not
know exactly how the device laid out virtual function queues.
To alleviate this, the kernel side of the ethtool API gained a couple of
functions used to partition the ring_cookie value into a queue index,
and a virtual function index. It was assumed that no device was likely
to have more than 2^32 queues, and currently no device has more than 2^8
virtual functions, so the split was chosen such that the lower 32 bits
of the ring_cookie represent the queue, and the next 8 bits would
represent a virtual function id.
This is useful, but lacked support on the ethtool command line side. It
is possible to simply use hex notation, and to "know" that the driver
works in this manner. However, this makes use of and knowledge of the
feature not widespread.
This series attempts to fix that with two patches. First, we make use of
the access functions when displaying the queue. Since no device
currently supports queues beyond 2^32 it is unlikely that any driver
which does not honor this partitioning of the ring_cookie would have its
index incorrectly interpreted. This makes it so that the user will see
the queue index and virtual function index split apart instead of being
displayed as a combined base-10 number.
Additionally, the second patch adds support for a new notation for the
action. Instead of just using a number, we add support for the special
syntax "X/Y" where X would be a number from 0-256 which is the virtual
function index, while Y would be a number from 0 to 2^32 which is the
queue index. This makes it much easier to write the encoded ring_cookie
value.
This series is sent as an RFC so that others may comment on the design
of the new syntax, and offer alternative suggestions. Some of the ones I
have considered:
a) just leave the second part alone, and display the VF+queue split, but
do not provide an easier syntax for writing the split value. This is
easy to do, but leaves the feature as a sort of hidden gem.
Additionally, the user would most easily do this by writing it as hex
notation such as
action 0x100000005
However, this is cumbersome. It can be partially alleviated by using
shell scripting on top of ethtool, but again is somewhat of a pain.
b) actually adding a new key work so that you would do something like
"queue x" and "vf y" or similar. However, this is much more difficult to
correctly implement. The current implementation of handling the values
assumes that each type argument is followed by a single argument which
fits into a well defined section of the structure. Because the structure
is not specified as le64 or be64, we can't use a union or similar to
make it correct (since placement in the union ordering would depend on
byte ordering of the field!) so we would have to re-architect the
handling of values to allow multiple keywords to combine into a single
value. This seemed incredibly cumbersome.
The disadvantage of the new syntax is that it *is* new syntax that we
have to parse ourselves, and that may not be obvious to users. However,
we can document it, and the current syntax will work for all drivers,
and the new syntax will generally only work on drivers which use the
ring_cookie partitioning in this manner (as no current driver has queues
beyond 2^32, as was decided when the partitioning scheme was added to
the ethtool header file).
I like the use of / but am open to alternative suggestions for a simple
sequence. Note that we cannot use spaces, as this would move into the
argv solution of (b) which I do not have time to implement, and would be
a much more massive project.
Jacob Keller (2):
rxclass: use ethtool_get_flow_spec_ring to display queue and VF
ethtool: add support to specify ring_cookie as VF/queue
ethtool.8.in | 3 ++-
rxclass.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++---------
2 files changed, 65 insertions(+), 12 deletions(-)
--
2.11.0.rc2.152.g4d04e67
^ permalink raw reply
* Re: [PATCH net-next 2/3] net/sched: cls_flower: Add support for matching on flags
From: Jiri Pirko @ 2016-12-06 12:40 UTC (permalink / raw)
To: Or Gerlitz; +Cc: David S. Miller, netdev, Jiri Pirko, Roi Dayan, Hadar Har-Zion
In-Reply-To: <1481027579-23195-3-git-send-email-ogerlitz@mellanox.com>
Tue, Dec 06, 2016 at 01:32:58PM CET, ogerlitz@mellanox.com wrote:
>Add UAPI to provide set of flags for matching, where the flags
>provided from user-space are mapped to flow-dissector flags.
>
>The 1st flag allows to match on whether the packet is an
>IP fragment and corresponds to the FLOW_DIS_IS_FRAGMENT flag.
>
>Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
>Reviewed-by: Paul Blakey <paulb@mellanox.com>
>---
> include/uapi/linux/pkt_cls.h | 7 ++++
> net/sched/cls_flower.c | 83 ++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 90 insertions(+)
>
>diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
>index 86786d4..f114277 100644
>--- a/include/uapi/linux/pkt_cls.h
>+++ b/include/uapi/linux/pkt_cls.h
>@@ -457,11 +457,18 @@ enum {
> TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK, /* be16 */
> TCA_FLOWER_KEY_ENC_UDP_DST_PORT, /* be16 */
> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK, /* be16 */
>+
>+ TCA_FLOWER_KEY_FLAGS, /* be32 */
>+ TCA_FLOWER_KEY_FLAGS_MASK, /* be32 */
> __TCA_FLOWER_MAX,
> };
>
> #define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1)
>
>+enum {
>+ TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = BIT(0),
>+};
>+
> /* Match-all classifier */
>
> enum {
>diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>index c5cea78..d9f4124 100644
>--- a/net/sched/cls_flower.c
>+++ b/net/sched/cls_flower.c
>@@ -383,6 +383,8 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = {
> [TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK] = { .type = NLA_U16 },
> [TCA_FLOWER_KEY_ENC_UDP_DST_PORT] = { .type = NLA_U16 },
> [TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK] = { .type = NLA_U16 },
>+ [TCA_FLOWER_KEY_FLAGS] = { .type = NLA_U32 },
>+ [TCA_FLOWER_KEY_FLAGS_MASK] = { .type = NLA_U32 },
> };
>
> static void fl_set_key_val(struct nlattr **tb,
>@@ -417,6 +419,40 @@ static void fl_set_key_vlan(struct nlattr **tb,
> }
> }
>
>+static void set_flags(u32 flower_key, u32 flower_mask,
Use some prefix for helpers like this, or at least __
>+ u32 *dissector_key, u32 *dissector_mask,
>+ u32 flower_flag_bit, u32 dissector_flag_bit)
>+{
>+ if (flower_mask & flower_flag_bit) {
>+ *dissector_mask |= dissector_flag_bit;
>+ if (flower_key & flower_flag_bit)
>+ *dissector_key |= dissector_flag_bit;
>+ }
>+}
>+
>+static void fl_set_key_flags(struct nlattr **tb,
>+ u32 *flags_key,
>+ u32 *flags_mask)
>+{
>+ u32 key, mask;
>+
>+ if (!tb[TCA_FLOWER_KEY_FLAGS])
>+ return;
>+
>+ key = be32_to_cpu(nla_get_u32(tb[TCA_FLOWER_KEY_FLAGS]));
>+
>+ if (!tb[TCA_FLOWER_KEY_FLAGS_MASK])
>+ mask = ~0;
>+ else
>+ mask = be32_to_cpu(nla_get_u32(tb[TCA_FLOWER_KEY_FLAGS_MASK]));
>+
>+ *flags_key = 0;
>+ *flags_mask = 0;
>+
>+ set_flags(key, mask, flags_key, flags_mask,
>+ TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT, FLOW_DIS_IS_FRAGMENT);
>+}
>+
> static int fl_set_key(struct net *net, struct nlattr **tb,
> struct fl_flow_key *key, struct fl_flow_key *mask)
> {
>@@ -543,6 +579,8 @@ static int fl_set_key(struct net *net, struct nlattr **tb,
> &mask->enc_tp.dst, TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK,
> sizeof(key->enc_tp.dst));
>
>+ fl_set_key_flags(tb, &key->control.flags, &mask->control.flags);
>+
> return 0;
> }
>
>@@ -877,6 +915,48 @@ static int fl_dump_key_vlan(struct sk_buff *skb,
> return 0;
> }
>
>+static void get_flags(u32 dissector_key, u32 dissector_mask,
Same here.
>+ u32 *flower_key, u32 *flower_mask,
>+ u32 flower_flag_bit, u32 dissector_flag_bit)
>+{
>+ if (dissector_mask & dissector_flag_bit) {
>+ *flower_mask |= flower_flag_bit;
>+ if (dissector_key & dissector_flag_bit)
>+ *flower_key |= flower_flag_bit;
>+ }
>+}
>+
>+static int fl_dump_key_flags(struct sk_buff *skb,
>+ u32 flags_key,
>+ u32 flags_mask)
over-wrapped :)
>+{
>+ u32 key, mask;
>+ __be32 _key, _mask;
>+ int err;
>+
>+ if (!memchr_inv(&flags_mask, 0, sizeof(flags_mask)))
>+ return 0;
>+
>+ key = 0;
^ remove the extra space
>+ mask = 0;
>+
>+ get_flags(flags_key, flags_mask, &key, &mask,
>+ TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT, FLOW_DIS_IS_FRAGMENT);
>+
>+ _key = cpu_to_be32(key);
^ remove the extra space
>+ _mask = cpu_to_be32(mask);
>+
>+ err = nla_put(skb, TCA_FLOWER_KEY_FLAGS, 4, &_key);
>+ if (err)
>+ return err;
>+
>+ err = nla_put(skb, TCA_FLOWER_KEY_FLAGS_MASK, 4, &_mask);
Just return nla_put...
>+ if (err)
>+ return err;
>+
>+ return 0;
>+}
>+
> static int fl_dump(struct net *net, struct tcf_proto *tp, unsigned long fh,
> struct sk_buff *skb, struct tcmsg *t)
> {
>@@ -1012,6 +1092,9 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, unsigned long fh,
> sizeof(key->enc_tp.dst)))
> goto nla_put_failure;
>
>+ if (fl_dump_key_flags(skb, key->control.flags, mask->control.flags))
>+ goto nla_put_failure;
>+
> nla_put_u32(skb, TCA_FLOWER_FLAGS, f->flags);
>
> if (tcf_exts_dump(skb, &f->exts))
>--
>2.3.7
>
^ permalink raw reply
* [PATCH net-next 0/3] net/sched: cls_flower: Add support for matching on dissection flags
From: Or Gerlitz @ 2016-12-06 12:32 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Jiri Pirko, Roi Dayan, Hadar Har-Zion, Or Gerlitz
This series add the UAPI to provide set of flags for matching, where the
flags provided from user-space are mapped to flow-dissector flags.
The 1st flag allows to match on whether the packet is an
IP fragment and corresponds to the FLOW_DIS_IS_FRAGMENT flag.
Or
Or Gerlitz (3):
net/flow_dissector: Enable matching on flags for TC filter consumers
net/sched: cls_flower: Add support for matching on flags
net/mlx5e: Offload TC matching on packets being IP fragments
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 12 ++++
include/net/flow_dissector.h | 2 +-
include/uapi/linux/pkt_cls.h | 7 +++
net/sched/cls_flower.c | 83 +++++++++++++++++++++++++
4 files changed, 103 insertions(+), 1 deletion(-)
--
2.3.7
^ permalink raw reply
* [PATCH net-next 3/3] net/mlx5e: Offload TC matching on packets being IP fragments
From: Or Gerlitz @ 2016-12-06 12:32 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Jiri Pirko, Roi Dayan, Hadar Har-Zion, Or Gerlitz
In-Reply-To: <1481027579-23195-1-git-send-email-ogerlitz@mellanox.com>
Enable offloading of matching on packets being fragments.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index f07ef8c..f8829b5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -31,6 +31,7 @@
*/
#include <net/flow_dissector.h>
+#include <net/sch_generic.h>
#include <net/pkt_cls.h>
#include <net/tc_act/tc_gact.h>
#include <net/tc_act/tc_skbedit.h>
@@ -363,7 +364,18 @@ static int __parse_cls_flower(struct mlx5e_priv *priv,
skb_flow_dissector_target(f->dissector,
FLOW_DISSECTOR_KEY_CONTROL,
f->key);
+
+ struct flow_dissector_key_control *mask =
+ skb_flow_dissector_target(f->dissector,
+ FLOW_DISSECTOR_KEY_CONTROL,
+ f->mask);
addr_type = key->addr_type;
+
+ if (mask->flags & FLOW_DIS_IS_FRAGMENT) {
+ MLX5_SET(fte_match_set_lyr_2_4, headers_c, frag, 1);
+ MLX5_SET(fte_match_set_lyr_2_4, headers_v, frag,
+ key->flags & FLOW_DIS_IS_FRAGMENT);
+ }
}
if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_BASIC)) {
--
2.3.7
^ permalink raw reply related
* [PATCH net-next 1/3] net/flow_dissector: Enable matching on flags for TC filter consumers
From: Or Gerlitz @ 2016-12-06 12:32 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Jiri Pirko, Roi Dayan, Hadar Har-Zion, Or Gerlitz,
Tom Herbert
In-Reply-To: <1481027579-23195-1-git-send-email-ogerlitz@mellanox.com>
This is a pre-step to allow TC classifiers that makes use of the
flow-dissector (e.g Flower) to match on flow-dissector flags which
are located in the control struct.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
include/net/flow_dissector.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index c4f3166..a679e6c 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -154,8 +154,8 @@ struct flow_dissector {
};
struct flow_keys {
+#define FLOW_KEYS_HASH_START_FIELD control
struct flow_dissector_key_control control;
-#define FLOW_KEYS_HASH_START_FIELD basic
struct flow_dissector_key_basic basic;
struct flow_dissector_key_tags tags;
struct flow_dissector_key_vlan vlan;
--
2.3.7
Cc: Tom Herbert <tom@herbertland.com>
^ permalink raw reply related
* [PATCH net-next 2/3] net/sched: cls_flower: Add support for matching on flags
From: Or Gerlitz @ 2016-12-06 12:32 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev, Jiri Pirko, Roi Dayan, Hadar Har-Zion, Or Gerlitz
In-Reply-To: <1481027579-23195-1-git-send-email-ogerlitz@mellanox.com>
Add UAPI to provide set of flags for matching, where the flags
provided from user-space are mapped to flow-dissector flags.
The 1st flag allows to match on whether the packet is an
IP fragment and corresponds to the FLOW_DIS_IS_FRAGMENT flag.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
---
include/uapi/linux/pkt_cls.h | 7 ++++
net/sched/cls_flower.c | 83 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 90 insertions(+)
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 86786d4..f114277 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -457,11 +457,18 @@ enum {
TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK, /* be16 */
TCA_FLOWER_KEY_ENC_UDP_DST_PORT, /* be16 */
TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK, /* be16 */
+
+ TCA_FLOWER_KEY_FLAGS, /* be32 */
+ TCA_FLOWER_KEY_FLAGS_MASK, /* be32 */
__TCA_FLOWER_MAX,
};
#define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1)
+enum {
+ TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = BIT(0),
+};
+
/* Match-all classifier */
enum {
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index c5cea78..d9f4124 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -383,6 +383,8 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = {
[TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK] = { .type = NLA_U16 },
[TCA_FLOWER_KEY_ENC_UDP_DST_PORT] = { .type = NLA_U16 },
[TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK] = { .type = NLA_U16 },
+ [TCA_FLOWER_KEY_FLAGS] = { .type = NLA_U32 },
+ [TCA_FLOWER_KEY_FLAGS_MASK] = { .type = NLA_U32 },
};
static void fl_set_key_val(struct nlattr **tb,
@@ -417,6 +419,40 @@ static void fl_set_key_vlan(struct nlattr **tb,
}
}
+static void set_flags(u32 flower_key, u32 flower_mask,
+ u32 *dissector_key, u32 *dissector_mask,
+ u32 flower_flag_bit, u32 dissector_flag_bit)
+{
+ if (flower_mask & flower_flag_bit) {
+ *dissector_mask |= dissector_flag_bit;
+ if (flower_key & flower_flag_bit)
+ *dissector_key |= dissector_flag_bit;
+ }
+}
+
+static void fl_set_key_flags(struct nlattr **tb,
+ u32 *flags_key,
+ u32 *flags_mask)
+{
+ u32 key, mask;
+
+ if (!tb[TCA_FLOWER_KEY_FLAGS])
+ return;
+
+ key = be32_to_cpu(nla_get_u32(tb[TCA_FLOWER_KEY_FLAGS]));
+
+ if (!tb[TCA_FLOWER_KEY_FLAGS_MASK])
+ mask = ~0;
+ else
+ mask = be32_to_cpu(nla_get_u32(tb[TCA_FLOWER_KEY_FLAGS_MASK]));
+
+ *flags_key = 0;
+ *flags_mask = 0;
+
+ set_flags(key, mask, flags_key, flags_mask,
+ TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT, FLOW_DIS_IS_FRAGMENT);
+}
+
static int fl_set_key(struct net *net, struct nlattr **tb,
struct fl_flow_key *key, struct fl_flow_key *mask)
{
@@ -543,6 +579,8 @@ static int fl_set_key(struct net *net, struct nlattr **tb,
&mask->enc_tp.dst, TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK,
sizeof(key->enc_tp.dst));
+ fl_set_key_flags(tb, &key->control.flags, &mask->control.flags);
+
return 0;
}
@@ -877,6 +915,48 @@ static int fl_dump_key_vlan(struct sk_buff *skb,
return 0;
}
+static void get_flags(u32 dissector_key, u32 dissector_mask,
+ u32 *flower_key, u32 *flower_mask,
+ u32 flower_flag_bit, u32 dissector_flag_bit)
+{
+ if (dissector_mask & dissector_flag_bit) {
+ *flower_mask |= flower_flag_bit;
+ if (dissector_key & dissector_flag_bit)
+ *flower_key |= flower_flag_bit;
+ }
+}
+
+static int fl_dump_key_flags(struct sk_buff *skb,
+ u32 flags_key,
+ u32 flags_mask)
+{
+ u32 key, mask;
+ __be32 _key, _mask;
+ int err;
+
+ if (!memchr_inv(&flags_mask, 0, sizeof(flags_mask)))
+ return 0;
+
+ key = 0;
+ mask = 0;
+
+ get_flags(flags_key, flags_mask, &key, &mask,
+ TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT, FLOW_DIS_IS_FRAGMENT);
+
+ _key = cpu_to_be32(key);
+ _mask = cpu_to_be32(mask);
+
+ err = nla_put(skb, TCA_FLOWER_KEY_FLAGS, 4, &_key);
+ if (err)
+ return err;
+
+ err = nla_put(skb, TCA_FLOWER_KEY_FLAGS_MASK, 4, &_mask);
+ if (err)
+ return err;
+
+ return 0;
+}
+
static int fl_dump(struct net *net, struct tcf_proto *tp, unsigned long fh,
struct sk_buff *skb, struct tcmsg *t)
{
@@ -1012,6 +1092,9 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, unsigned long fh,
sizeof(key->enc_tp.dst)))
goto nla_put_failure;
+ if (fl_dump_key_flags(skb, key->control.flags, mask->control.flags))
+ goto nla_put_failure;
+
nla_put_u32(skb, TCA_FLOWER_FLAGS, f->flags);
if (tcf_exts_dump(skb, &f->exts))
--
2.3.7
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox