Netdev List

* Re: [PATCH V2] mlx5: Fix formats with line continuation whitespace
From: Joe Perches @ 2019-08-02 18:09 UTC
  To: Doug Ledford, Leon Romanovsky
  Cc: Saeed Mahameed, David S. Miller, netdev, linux-rdma, linux-kernel
In-Reply-To: <ac8361beee5dd80ad6546328dd7457bb6ee1ca5a.camel@redhat.com>

On Tue, 2018-11-06 at 16:34 -0500, Doug Ledford wrote:
> On Thu, 2018-11-01 at 09:34 +0200, Leon Romanovsky wrote:
> > On Thu, Nov 01, 2018 at 12:24:08AM -0700, Joe Perches wrote:
> > > The line continuations unintentionally add whitespace so
> > > instead use coalesced formats to remove the whitespace.
> > > 
> > > Signed-off-by: Joe Perches <joe@perches.com>
> > > ---
> > > 
> > > v2: Remove excess space after %u
> > > 
> > >  drivers/net/ethernet/mellanox/mlx5/core/rl.c | 6 ++----
> > >  1 file changed, 2 insertions(+), 4 deletions(-)
> > > 
> > 
> > Thanks,
> > Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
> 
> Applied, thanks.

Still not upstream.  How long does it take?

* Re: [PATCH v4 4/4] net: phy: realtek: configure RTL8211E LEDs
From: Andrew Lunn @ 2019-08-02 18:18 UTC
  To: Matthias Kaehlcke
  Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
	Heiner Kallweit, netdev, devicetree, linux-kernel,
	Douglas Anderson
In-Reply-To: <20190801190759.28201-5-mka@chromium.org>

On Thu, Aug 01, 2019 at 12:07:59PM -0700, Matthias Kaehlcke wrote:
> Configure the RTL8211E LEDs behavior when the device tree property
> 'realtek,led-modes' is specified.
> 
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>

Hi Matthias

I was thinking more of adding a new driver call to the PHY driver API
to configure an LED. Something like:

rtl8211e_config_leds(phydev, int led, struct phy_led_config cfg);

It would be called by the phylib core after config_init(). But also,
thinking ahead to generic linux LED support, it could be called later
to reconfigure the LEDs to use a different trigger. The standard LED
sysfs interface would be used.

      Andrew

* Re: [PATCH v4 1/4] dt-bindings: net: phy: Add subnode for LED configuration
From: Matthias Kaehlcke @ 2019-08-02 18:27 UTC
  To: Andrew Lunn
  Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
	Heiner Kallweit, netdev, devicetree, linux-kernel,
	Douglas Anderson
In-Reply-To: <20190802165755.GM2099@lunn.ch>

On Fri, Aug 02, 2019 at 06:57:55PM +0200, Andrew Lunn wrote:
> On Thu, Aug 01, 2019 at 12:07:56PM -0700, Matthias Kaehlcke wrote:
> > The LED behavior of some Ethernet PHYs is configurable. Add an
> > optional 'leds' subnode with a child node for each LED to be
> > configured. The binding aims to be compatible with the common
> > LED binding (see devicetree/bindings/leds/common.txt).
> > 
> > A LED can be configured to be 'on' when a link with a certain speed
> > is active, or to blink on RX/TX activity. For the configuration to
> > be effective it needs to be supported by the hardware and the
> > corresponding PHY driver.
> > 
> > Suggested-by: Andrew Lunn <andrew@lunn.ch>
> > Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> > ---
> > Changes in v4:
> > - patch added to the series
> > ---
> >  .../devicetree/bindings/net/ethernet-phy.yaml | 47 +++++++++++++++++++
> >  1 file changed, 47 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/net/ethernet-phy.yaml b/Documentation/devicetree/bindings/net/ethernet-phy.yaml
> > index f70f18ff821f..81c5aacc89a5 100644
> > --- a/Documentation/devicetree/bindings/net/ethernet-phy.yaml
> > +++ b/Documentation/devicetree/bindings/net/ethernet-phy.yaml
> > @@ -153,6 +153,38 @@ properties:
> >        Delay after the reset was deasserted in microseconds. If
> >        this property is missing the delay will be skipped.
> >  
> > +patternProperties:
> > +  "^leds$":
> > +    type: object
> > +    description:
> > +      Subnode with configuration of the PHY LEDs.
> > +
> > +    patternProperties:
> > +      "^led@[0-9]+$":
> > +        type: object
> > +        description:
> > +          Subnode with the configuration of a single PHY LED.
> > +
> > +    properties:
> > +      reg:
> > +        description:
> > +          The ID number of the LED, typically corresponds to a hardware ID.
> > +        $ref: "/schemas/types.yaml#/definitions/uint32"
> > +
> > +      linux,default-trigger:
> > +        description:
> > +          This parameter, if present, is a string specifying the trigger
> > +          assigned to the LED. Supported triggers are:
> > +            "phy_link_10m_active" - LED will be on when a 10Mb/s link is active
> > +            "phy_link_100m_active" - LED will be on when a 100Mb/s link is active
> > +            "phy_link_1g_active" - LED will be on when a 1Gb/s link is active
> > +            "phy_link_10g_active" - LED will be on when a 10Gb/s link is active
> > +            "phy_activity" - LED will blink when data is received or transmitted
> 
> Matthias
> 
> We should think a bit more about these names.
> 
> I can see in future needing 1G link, but it blinks off when there is
> active traffic? So phy_link_1g_active could be confusing, and very similar to
> phy_link_1g_activity?

Agreed, 'active' vs. 'activity' can be confusing; let's avoid that.

> So maybe 

> > +            "phy_link_10m" - LED will be solid on when a 10Mb/s link is active
> > +            "phy_link_100m" - LED will be solid on when a 100Mb/s link is active
> > +            "phy_link_1g" - LED will be solid on when a 1Gb/s link is active
> 
> etc.
>
> And then in the future we can have
> 
>                "phy_link_1g_activity" - LED will be on when a 1Gb/s
>                                         link is active and blink off
>                                         with activity.

Sounds good to me.

> What other use cases do we have? I don't want to support everything,
> but we should be able to represent the most common modes without the
> names getting too confusing.

Initially I planned to support configuring an LED to be solid for
multiple link speeds; however, that could become a bit messy with
string-based triggers unless we limit the possible combinations. My
expertise in network land is limited, so I'm not sure whether that's an
important or realistic use case.
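
To illustrate the concern, here is a rough sketch (not part of the patch)
of how trigger names could map onto a mode bitmask. Everything in it is an
assumption for illustration: the PHY_LED_* bits, the table and
phy_led_trigger_to_modes() are hypothetical names, and the strings are the
ones proposed in this thread. Each combination of speeds would need its own
name and table entry, which is where it gets messy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mode bits; not taken from any existing header. */
enum phy_led_mode_bits {
	PHY_LED_LINK_10M  = 1u << 0,
	PHY_LED_LINK_100M = 1u << 1,
	PHY_LED_LINK_1G   = 1u << 2,
	PHY_LED_LINK_10G  = 1u << 3,
	PHY_LED_ACTIVITY  = 1u << 4,
};

struct phy_led_trigger_name {
	const char *name;
	unsigned int modes;
};

/* One entry per trigger string; combined modes need their own names. */
static const struct phy_led_trigger_name phy_led_triggers[] = {
	{ "phy_link_10m",         PHY_LED_LINK_10M },
	{ "phy_link_100m",        PHY_LED_LINK_100M },
	{ "phy_link_1g",          PHY_LED_LINK_1G },
	{ "phy_link_10g",         PHY_LED_LINK_10G },
	{ "phy_activity",         PHY_LED_ACTIVITY },
	{ "phy_link_1g_activity", PHY_LED_LINK_1G | PHY_LED_ACTIVITY },
};

/* Returns the mode bits for a trigger name, or 0 if the name is unknown. */
static unsigned int phy_led_trigger_to_modes(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(phy_led_triggers) / sizeof(phy_led_triggers[0]); i++) {
		if (strcmp(name, phy_led_triggers[i].name) == 0)
			return phy_led_triggers[i].modes;
	}
	return 0;
}
```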

* Re: [PATCH V2] mlx5: Fix formats with line continuation whitespace
From: Saeed Mahameed @ 2019-08-02 18:32 UTC
  To: joe@perches.com, dledford@redhat.com, leon@kernel.org
  Cc: davem@davemloft.net, netdev@vger.kernel.org,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <f2b2559865e8bd59202e14b837a522a801d498e2.camel@perches.com>

On Fri, 2019-08-02 at 11:09 -0700, Joe Perches wrote:
> On Tue, 2018-11-06 at 16:34 -0500, Doug Ledford wrote:
> > On Thu, 2018-11-01 at 09:34 +0200, Leon Romanovsky wrote:
> > > On Thu, Nov 01, 2018 at 12:24:08AM -0700, Joe Perches wrote:
> > > > The line continuations unintentionally add whitespace so
> > > > instead use coalesced formats to remove the whitespace.
> > > > 
> > > > Signed-off-by: Joe Perches <joe@perches.com>
> > > > ---
> > > > 
> > > > v2: Remove excess space after %u
> > > > 
> > > >  drivers/net/ethernet/mellanox/mlx5/core/rl.c | 6 ++----
> > > >  1 file changed, 2 insertions(+), 4 deletions(-)
> > > > 
> > > 
> > > Thanks,
> > > Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
> > 
> > Applied, thanks.
> 
> Still not upstream.  How long does it take?
> 

Doug, Leon, this patch still applies. Let me know what happened here,
and whether you want me to apply it to one of my branches.

Thanks,
Saeed.


* Re: [PATCH bpf-next v4 2/2] xsk: support BPF_EXIST and BPF_NOEXIST flags in XSKMAP
From: Jonathan Lemon @ 2019-08-02 18:28 UTC
  To: Björn Töpel
  Cc: ast, daniel, netdev, Björn Töpel, magnus.karlsson,
	bruce.richardson, songliubraving, bpf
In-Reply-To: <20190802081154.30962-3-bjorn.topel@gmail.com>



On 2 Aug 2019, at 1:11, Björn Töpel wrote:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> The XSKMAP did not honor the BPF_EXIST/BPF_NOEXIST flags when updating
> an entry. This patch addresses that.
>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Reviewed-by: Jonathan Lemon <jonathan.lemon@gmail.com>

* Re: [PATCH bpf-next v4 1/2] xsk: remove AF_XDP socket from map when the socket is released
From: Jonathan Lemon @ 2019-08-02 18:28 UTC
  To: Björn Töpel
  Cc: ast, daniel, netdev, Björn Töpel, magnus.karlsson,
	bruce.richardson, songliubraving, bpf
In-Reply-To: <20190802081154.30962-2-bjorn.topel@gmail.com>



On 2 Aug 2019, at 1:11, Björn Töpel wrote:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> When an AF_XDP socket is released/closed the XSKMAP still holds a
> reference to the socket in a "released" state. The socket will still
> use the netdev queue resource, and block newly created sockets from
> attaching to that queue, but no user application can access the
> fill/complete/rx/tx queues. This results in that all applications need
> to explicitly clear the map entry from the old "zombie state"
> socket. This should be done automatically.
>
> In this patch, the sockets tracks, and have a reference to, which maps
> it resides in. When the socket is released, it will remove itself from
> all maps.
>
> Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Reviewed-by: Jonathan Lemon <jonathan.lemon@gmail.com>

* Re: [PATCH] net/mlx4_core: Use refcount_t for refcount
From: Saeed Mahameed @ 2019-08-02 18:38 UTC
  To: hslester96@gmail.com
  Cc: linux-kernel@vger.kernel.org, davem@davemloft.net, Tariq Toukan,
	linux-rdma@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <CANhBUQ1chO0Q6wHJwbKMvp6LkD7qLBRw57xwf1QkBAKaewHs5w@mail.gmail.com>

On Sat, 2019-08-03 at 00:10 +0800, Chuhong Yuan wrote:
> On Fri, Aug 2, 2019 at 8:10 PM, Chuhong Yuan <hslester96@gmail.com> wrote:
> > refcount_t is better for reference counters since its
> > implementation can prevent overflows.
> > So convert atomic_t ref counters to refcount_t.
> > 
> > Also convert refcount from 0-based to 1-based.
> > 
> 
> It seems that directly converting refcount from 0-based
> to 1-based is infeasible.
> I am sorry for this mistake.

Just curious: why not keep it 0-based and use refcount_t?

The refcount API should have the same semantics as the atomic_t API, no?
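
For context, one known difference is that refcount_inc() treats a count of
zero as a possible use-after-free and warns rather than incrementing
normally, so a 0-based counter cannot be converted one-to-one. A minimal
userspace model of that difference, assuming simplified semantics:
model_refcount and the helpers below are hypothetical stand-ins, not the
kernel implementation (which also saturates the counter).

```c
#include <stdbool.h>

/* Hypothetical stand-in for refcount_t. */
struct model_refcount {
	unsigned int refs;
};

/*
 * Models refcount_inc(): a count of zero means the object may already
 * have been freed, so the increment is refused. The kernel would WARN
 * here; this model just reports failure.
 */
static bool model_refcount_inc(struct model_refcount *r)
{
	if (r->refs == 0)
		return false;
	r->refs++;
	return true;
}

/* Models atomic_inc(): zero is an ordinary value, no check is made. */
static void model_atomic_inc(int *v)
{
	(*v)++;
}
```

In this model a 0-based "no users yet" state works with the atomic helper
but is rejected by the refcount helper.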

* Re: [v2,1/2] tools: bpftool: add net attach command to attach XDP on interface
From: Jakub Kicinski @ 2019-08-02 18:39 UTC
  To: Daniel T. Lee; +Cc: Daniel Borkmann, Alexei Starovoitov, netdev
In-Reply-To: <CAEKGpzhsjMuf+DtN3pDVYMxJa5o2e=-3AeWbHFiFoMoXCkgsNg@mail.gmail.com>

On Fri, 2 Aug 2019 14:02:29 +0900, Daniel T. Lee wrote:
> On Fri, Aug 2, 2019 at 8:36 AM Jakub Kicinski  wrote:
> > On Thu,  1 Aug 2019 17:11:32 +0900, Daniel T. Lee wrote:  
> > > By this commit, using `bpftool net attach`, user can attach XDP prog on
> > > interface. New type of enum 'net_attach_type' has been made, as stated at
> > > cover-letter, the meaning of 'attach' is, prog will be attached on interface.
> > >
> > > BPF prog will be attached through libbpf 'bpf_set_link_xdp_fd'.
> > >
> > > Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
> > > ---
> > > Changes in v2:
> > >   - command 'load' changed to 'attach' for the consistency
> > >   - 'NET_ATTACH_TYPE_XDP_DRIVE' changed to 'NET_ATTACH_TYPE_XDP_DRIVER'
> > >
> > >  tools/bpf/bpftool/net.c | 107 +++++++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 106 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
> > > index 67e99c56bc88..f3b57660b303 100644
> > > --- a/tools/bpf/bpftool/net.c
> > > +++ b/tools/bpf/bpftool/net.c
> > > @@ -55,6 +55,35 @@ struct bpf_attach_info {
> > >       __u32 flow_dissector_id;
> > >  };
> > >
> > > +enum net_attach_type {
> > > +     NET_ATTACH_TYPE_XDP,
> > > +     NET_ATTACH_TYPE_XDP_GENERIC,
> > > +     NET_ATTACH_TYPE_XDP_DRIVER,
> > > +     NET_ATTACH_TYPE_XDP_OFFLOAD,
> > > +     __MAX_NET_ATTACH_TYPE
> > > +};
> > > +
> > > +static const char * const attach_type_strings[] = {
> > > +     [NET_ATTACH_TYPE_XDP] = "xdp",
> > > +     [NET_ATTACH_TYPE_XDP_GENERIC] = "xdpgeneric",
> > > +     [NET_ATTACH_TYPE_XDP_DRIVER] = "xdpdrv",
> > > +     [NET_ATTACH_TYPE_XDP_OFFLOAD] = "xdpoffload",
> > > +     [__MAX_NET_ATTACH_TYPE] = NULL,  
> >
> > Not sure if the terminator is necessary,
> > ARRAY_SIZE(attach_type_strings) should suffice?  
> 
> Yes, ARRAY_SIZE is fine though. But I was just trying to make below
> 'parse_attach_type' consistent with 'parse_attach_type' from the 'prog.c'.
> At 'prog.c', It has same terminator at 'attach_type_strings'.
> 
> Should I change it or keep it?

Oh well, I guess there is some precedent for that :S

A quick grep for const char * const reveals we have around 7 non-NULL
terminated arrays, and 2 NULL-terminated ones. Plus the NULL-terminated
ones don't align the '=' sign, while most of the others do.

It's not a big deal; my preference is for not NULL-terminating here,
and aligning '='.
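
A minimal standalone sketch of that variant, assuming an exact-match
lookup: parse_attach_type_sketch() is a hypothetical name, it uses strcmp()
and returns -1 on failure, whereas the patch uses bpftool's is_prefix() and
returns __MAX_NET_ATTACH_TYPE. ARRAY_SIZE() is defined locally so the
sketch builds outside the bpftool tree.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

enum net_attach_type {
	NET_ATTACH_TYPE_XDP,
	NET_ATTACH_TYPE_XDP_GENERIC,
	NET_ATTACH_TYPE_XDP_DRIVER,
	NET_ATTACH_TYPE_XDP_OFFLOAD,
};

/* No NULL terminator; the '=' signs are aligned. */
static const char * const attach_type_strings[] = {
	[NET_ATTACH_TYPE_XDP]		= "xdp",
	[NET_ATTACH_TYPE_XDP_GENERIC]	= "xdpgeneric",
	[NET_ATTACH_TYPE_XDP_DRIVER]	= "xdpdrv",
	[NET_ATTACH_TYPE_XDP_OFFLOAD]	= "xdpoffload",
};

/* Returns the matching array index, or -1 if str names no attach type. */
static int parse_attach_type_sketch(const char *str)
{
	size_t i;

	/* The array length bounds the loop, so no sentinel entry is needed. */
	for (i = 0; i < ARRAY_SIZE(attach_type_strings); i++) {
		if (attach_type_strings[i] &&
		    strcmp(str, attach_type_strings[i]) == 0)
			return (int)i;
	}
	return -1;
}
```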

> > > +     NEXT_ARG();
> > > +     if (!REQ_ARGS(1))
> > > +             return -EINVAL;  
> >
> > Error message needed here.
> >  
> 
> Actually it provides error message like:
> Error: 'xdp' needs at least 1 arguments, 0 found
> 
> are you suggesting that any additional error message is necessary?

Ah, sorry, I missed REQ_ARGS() there!

> > > +             return -EINVAL;
> > > +     }  
> >
> > Please require the dev keyword before the interface name.
> > That'll make it feel closer to prog load syntax.  
> 
> If adding the dev keyword before interface name, will it be too long to type in?

I think it's probably muscle memory for most. Plus we have excellent
bash completions.

> and also `bpftool prog` use extra keyword (such as dev) when it is
> optional keyword.
> 
>        bpftool prog dump jited  PROG [{ file FILE | opcodes | linum }]
>        bpftool prog pin   PROG FILE
>        bpftool prog { load | loadall } OBJ  PATH \
> 
> as you can see here, FILE uses optional keyword 'file' when the
> argument is optional.

Not sure I follow 🤔

>        bpftool prog { load | loadall } OBJ  PATH \
>                          [type TYPE] [dev NAME] \
>                          [map { idx IDX | name NAME } MAP]\
>                          [pinmaps MAP_DIR]
> 
> Yes, bpftool prog load has dev keyword with it,
> 
> but first, like previous, the argument is optional so i think it is
> unnecessary to use optional keyword 'dev'.

The keyword should not be optional if device name is specified.
Maybe lack of coffee on my side..

> and secondly, 'bpftool net attach' isn't really related to 'bpftool prog load'.
>
> At previous version patch, I was using word 'load' instead of
> 'attach', since XDP program is not
> considered as 'BPF_PROG_ATTACH', so it might give a confusion. However
> by the last patch discussion,
> word 'load' has been replaced to 'attach'.
> 
> Keeping the consistency is very important, but I was just wondering
> about making command
> similar to 'bpftool prog load' syntax.

In case of TC the device argument is optional. You may specify it, or
you can refer to TC blocks instead. So for that reason alone I think
it'll be much cleaner to require dev before the interface name.

> > > +     return 0;
> > > +}
> > > +
> > > +static int do_attach_detach_xdp(int *progfd, enum net_attach_type *attach_type,
> > > +                             int *ifindex)
> > > +{
> > > +     __u32 flags;
> > > +     int err;
> > > +
> > > +     flags = XDP_FLAGS_UPDATE_IF_NOEXIST;  
> >
> > Please add this as an option so that user can decide whether overwrite
> > is allowed or not.  
> 
> Adding force flag to bpftool seems necessary.
> I will add an optional argument for this.

Right, I was wondering if we want to call it force, though? force is
sort of a reuse of iproute2 concept. But it's kind of hard to come up
with names.

Just to be sure - I mean something like:

bpftool net attach xdp id xyz dev ethN noreplace

Rather than:

bpftool -f net attach xdp id xyz dev ethN

> > > +     if (*attach_type == NET_ATTACH_TYPE_XDP_GENERIC)
> > > +             flags |= XDP_FLAGS_SKB_MODE;
> > > +     if (*attach_type == NET_ATTACH_TYPE_XDP_DRIVER)
> > > +             flags |= XDP_FLAGS_DRV_MODE;
> > > +     if (*attach_type == NET_ATTACH_TYPE_XDP_OFFLOAD)
> > > +             flags |= XDP_FLAGS_HW_MODE;
> > > +
> > > +     err = bpf_set_link_xdp_fd(*ifindex, *progfd, flags);
> > > +
> > > +     return err;  
> >
> > no need for the err variable here.  
> 
> My apologies, but I'm not sure why err variable isn't needed at here.
> AFAIK, 'bpf_set_link_xdp_fd' from libbpf returns the netlink_recv result,
> and in order to propagate error, err variable is necessary, I guess?

	return bpf_set_link_xdp_fd(*ifindex, *progfd, flags);

Is what I meant.
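
Putting the points above together, a rough sketch of the flag selection
with a noreplace keyword. Assumptions: xdp_attach_flags() and its noreplace
parameter are hypothetical; the XDP_FLAGS_* values mirror
include/uapi/linux/if_link.h and are redefined here only so the sketch
builds standalone.

```c
#include <stdbool.h>

/* Mirrors include/uapi/linux/if_link.h; redefined for a standalone build. */
#define XDP_FLAGS_UPDATE_IF_NOEXIST	(1U << 0)
#define XDP_FLAGS_SKB_MODE		(1U << 1)
#define XDP_FLAGS_DRV_MODE		(1U << 2)
#define XDP_FLAGS_HW_MODE		(1U << 3)

enum net_attach_type {
	NET_ATTACH_TYPE_XDP,
	NET_ATTACH_TYPE_XDP_GENERIC,
	NET_ATTACH_TYPE_XDP_DRIVER,
	NET_ATTACH_TYPE_XDP_OFFLOAD,
};

/*
 * Hypothetical helper: builds the flags word for an XDP attach.
 * noreplace == true refuses to overwrite an already attached program.
 */
static unsigned int xdp_attach_flags(enum net_attach_type type, bool noreplace)
{
	unsigned int flags = noreplace ? XDP_FLAGS_UPDATE_IF_NOEXIST : 0;

	switch (type) {
	case NET_ATTACH_TYPE_XDP:
		break;	/* no explicit mode bit */
	case NET_ATTACH_TYPE_XDP_GENERIC:
		flags |= XDP_FLAGS_SKB_MODE;
		break;
	case NET_ATTACH_TYPE_XDP_DRIVER:
		flags |= XDP_FLAGS_DRV_MODE;
		break;
	case NET_ATTACH_TYPE_XDP_OFFLOAD:
		flags |= XDP_FLAGS_HW_MODE;
		break;
	}

	/* Returned directly, without an intermediate variable for the result. */
	return flags;
}
```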

> > > +}
> > > +
> > > +static int do_attach(int argc, char **argv)
> > > +{
> > > +     enum net_attach_type attach_type;
> > > +     int err, progfd, ifindex;
> > > +
> > > +     err = parse_attach_args(argc, argv, &progfd, &attach_type, &ifindex);
> > > +     if (err)
> > > +             return err;  
> >
> > Probably not the best idea to move this out into a helper.  
> 
> Again, just trying to make consistent with 'prog.c'.
> 
> But clearly it has differences with do_attach/detach from 'prog.c'.
> From it, it uses the same parse logic 'parse_attach_detach_args' since
> the two command 'bpftool prog attach/detach' uses the same argument format.
> 
> However, in here, 'bpftool net' attach and detach requires different number of
> argument, so function for parse argument has been defined separately.
> The situation is little bit different, but keeping argument parse logic as an
> helper, I think it's better in terms of consistency.

Well they won't share the same arguments if you add the keyword for
controlling IF_NOEXIST :(

> About the moving parse logic to a helper, I was trying to keep command
> entry (do_attach)
> as simple as possible. Parse all the argument in command entry will
> make function longer
> and might make harder to understand what it does.
> 
> And I'm not pretty sure that argument parse logic will stays the same
> after other attachment
> type comes in. What I mean is, the argument count or type might be
> added and to fulfill
> all that specific cases, the code might grow larger.
> 
> So for the consistency, simplicity and extensibility, I prefer to keep
> it as a helper.
> 
> > > +     if (is_prefix("xdp", attach_type_strings[attach_type]))
> > > +             err = do_attach_detach_xdp(&progfd, &attach_type, &ifindex);  
> >
> > Hm. We either need an error to be reported if it's not xdp or since we
> > only accept XDP now perhaps the if() is superfluous?  
> 
> Well, if the attach_type isn't xdp, the error will be occurred from
> the argument parse,
> Will it be necessary to reinforce with error logic to make it more secure?

Hm. it should already be fine, no? For non-xdp parse_attach_type() will
return __MAX_NET_ATTACH_TYPE, then parsing returns EINVAL and we exit.
Not sure I follow.

* Re: [PATCH 06/34] drm/i915: convert put_page() to put_user_page*()
From: John Hubbard @ 2019-08-02 18:48 UTC
  To: Joonas Lahtinen, Andrew Morton, john.hubbard
  Cc: Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen,
	Ira Weiny, Jan Kara, Jason Gunthorpe, Jérôme Glisse,
	LKML, amd-gfx, ceph-devel, devel, devel, dri-devel, intel-gfx,
	kvm, linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel, Jani Nikula, Rodrigo Vivi, David Airlie
In-Reply-To: <156473756254.19842.12384378926183716632@jlahtine-desk.ger.corp.intel.com>

On 8/2/19 2:19 AM, Joonas Lahtinen wrote:
> Quoting john.hubbard@gmail.com (2019-08-02 05:19:37)
>> From: John Hubbard <jhubbard@nvidia.com>
>>
>> For pages that were retained via get_user_pages*(), release those pages
>> via the new put_user_page*() routines, instead of via put_page() or
>> release_pages().
>>
>> This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
>> ("mm: introduce put_user_page*(), placeholder versions").
>>
>> Note that this effectively changes the code's behavior in
>> i915_gem_userptr_put_pages(): it now calls set_page_dirty_lock(),
>> instead of set_page_dirty(). This is probably more accurate.
> 
> We've already fixed this in drm-tip where the current code uses
> set_page_dirty_lock().
> 
> This would conflict with our tree. Rodrigo is handling
> drm-intel-next for 5.4, so you guys want to coordinate how
> to merge.
> 

Hi Joonas, Rodrigo,

First of all, I apologize for the API breakage: put_user_pages_dirty_lock()
has an additional "dirty" parameter.

In order to deal with the merge problem, I'll drop this patch from my series,
and I'd recommend that the drm-intel-next take the following approach:

1) For now, s/put_page/put_user_page/ in i915_gem_userptr_put_pages(),
and fix up the set_page_dirty() --> set_page_dirty_lock() issue, like this
(based against linux.git):

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 528b61678334..94721cc0093b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -664,10 +664,10 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,

         for_each_sgt_page(page, sgt_iter, pages) {
                 if (obj->mm.dirty)
-                       set_page_dirty(page);
+                       set_page_dirty_lock(page);

                 mark_page_accessed(page);
-               put_page(page);
+               put_user_page(page);
         }
         obj->mm.dirty = false;


That will leave you with your original set_page_dirty_lock() calls
and everything works properly.

2) Next cycle, move to the new put_user_pages_dirty_lock().

thanks,
-- 
John Hubbard
NVIDIA


> Regards, Joonas
> 

* Re: [PATCH 16/34] drivers/tee: convert put_page() to put_user_page*()
From: John Hubbard @ 2019-08-02 18:51 UTC
  To: Jens Wiklander, john.hubbard
  Cc: Andrew Morton, Christoph Hellwig, Dan Williams, Dave Chinner,
	Dave Hansen, Ira Weiny, Jan Kara, Jason Gunthorpe,
	Jérôme Glisse, LKML, amd-gfx, ceph-devel, devel, devel,
	dri-devel, intel-gfx, kvm, Linux ARM, linux-block,
	open list:HARDWARE RANDOM NUMBER GENERATOR CORE, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel
In-Reply-To: <CAHUa44G++iiwU62jj7QH=V3sr4z26sf007xrwWLPw6AAeMLAEw@mail.gmail.com>

On 8/1/19 11:29 PM, Jens Wiklander wrote:
> On Fri, Aug 2, 2019 at 4:20 AM <john.hubbard@gmail.com> wrote:
>>
>> From: John Hubbard <jhubbard@nvidia.com>
>>
>> For pages that were retained via get_user_pages*(), release those pages
>> via the new put_user_page*() routines, instead of via put_page() or
>> release_pages().
>>
>> This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
>> ("mm: introduce put_user_page*(), placeholder versions").
>>
>> Cc: Jens Wiklander <jens.wiklander@linaro.org>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>   drivers/tee/tee_shm.c | 10 ++--------
>>   1 file changed, 2 insertions(+), 8 deletions(-)
> 
> Acked-by: Jens Wiklander <jens.wiklander@linaro.org>
> 
> I suppose you're taking this via your own tree or such.
> 

Hi Jens,

Thanks for the ACK! I'm expecting that Andrew will take this through his
-mm tree, unless he pops up and says otherwise.

thanks,
-- 
John Hubbard
NVIDIA

* Re: [PATCH net 2/4] tcp: tcp_fragment() should apply sane memory limits
From: Bernd @ 2019-08-02 19:02 UTC
  To: netdev

Hello,

While analyzing a packet capture of an aborted upload I came across an
odd trace where a sender was not responding to a duplicate SACK but kept
sending further segments until it stalled.

It took me some time to remember this fix; the problems actually
started after the security fix was applied.

I see a high counter for TCPWqueueTooBig - and I don’t think that’s an
actual attack.

Is it possible to trigger the limit on connections with big windows,
large send buffers and dropped segments? If so, what would be the
plan? It does not look like it is configurable. The trace seems to
have 100 (filled) inflight segments.

Gruss
Bernd
-- 
http://bernd.eckenfels.net

* Re: [PATCH V2] mlx5: Fix formats with line continuation whitespace
From: Doug Ledford @ 2019-08-02 19:08 UTC
  To: Saeed Mahameed, joe@perches.com, leon@kernel.org
  Cc: davem@davemloft.net, netdev@vger.kernel.org,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <910f77ed7f2923206adc8927204c6d759ec18d20.camel@mellanox.com>

On Fri, 2019-08-02 at 18:32 +0000, Saeed Mahameed wrote:
> On Fri, 2019-08-02 at 11:09 -0700, Joe Perches wrote:
> > On Tue, 2018-11-06 at 16:34 -0500, Doug Ledford wrote:
> > > On Thu, 2018-11-01 at 09:34 +0200, Leon Romanovsky wrote:
> > > > On Thu, Nov 01, 2018 at 12:24:08AM -0700, Joe Perches wrote:
> > > > > The line continuations unintentionally add whitespace so
> > > > > instead use coalesced formats to remove the whitespace.
> > > > > 
> > > > > Signed-off-by: Joe Perches <joe@perches.com>
> > > > > ---
> > > > > 
> > > > > v2: Remove excess space after %u
> > > > > 
> > > > >  drivers/net/ethernet/mellanox/mlx5/core/rl.c | 6 ++----
> > > > >  1 file changed, 2 insertions(+), 4 deletions(-)
> > > > > 
> > > > 
> > > > Thanks,
> > > > Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
> > > 
> > > Applied, thanks.
> > 
> > Still not upstream.  How long does it take?
> > 
> 
> Doug, Leon, this patch still applies. Let me know what happened here,
> and whether you want me to apply it to one of my branches.

I'm not entirely sure what happened here.  Obviously I said I had taken
it, which I don't do under my normal workflow until I've actually
applied and build-tested the patch.  For it to not make it into the tree
means that I probably applied it to my wip/dl-for-next branch, but prior
to moving it to for-next, I might have had a rebase and it got lost in
the shuffle or something like that.  My apologies for letting it slip
through the cracks.  Anyway, I pulled the patch from patchwork, applied
it, and pushed it to k.o.

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

* Re: [PATCH net 2/4] tcp: tcp_fragment() should apply sane memory limits
From: Neal Cardwell @ 2019-08-02 19:14 UTC
  To: Bernd; +Cc: netdev, Eric Dumazet
In-Reply-To: <CABOR3+yUiu1BzCojFQFADUKc5BT2-Ew_j7KFNpjP8WoMYZ+SMA@mail.gmail.com>

On Fri, Aug 2, 2019 at 3:03 PM Bernd <ecki@zusammenkunft.net> wrote:
>
> Hello,
>
> While analyzing a packet capture of an aborted upload I came across an
> odd trace where a sender was not responding to a duplicate SACK but kept
> sending further segments until it stalled.
>
> It took me some time to remember this fix; the problems actually
> started after the security fix was applied.
>
> I see a high counter for TCPWqueueTooBig - and I don’t think that’s an
> actual attack.
>
> Is it possible to trigger the limit on connections with big windows,
> large send buffers and dropped segments? If so, what would be the
> plan? It does not look like it is configurable. The trace seems to
> have 100 (filled) inflight segments.
>
> Gruss
> Bernd
> --
> http://bernd.eckenfels.net

What's the exact kernel version you are using?

Eric submitted a patch recently that may address your issue:
   tcp: be more careful in tcp_fragment()
  https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=b617158dc096709d8600c53b6052144d12b89fab

Would you be able to test your workload with that commit
cherry-picked, and see if the issue still occurs?

That commit was targeted to many stable releases, so you may be able
to pick up that fix from a stable branch.

cheers,
neal

* Re: [PATCH 00/34] put_user_pages(): miscellaneous call sites
From: John Hubbard @ 2019-08-02 19:14 UTC
  To: Jan Kara, Matthew Wilcox
  Cc: Michal Hocko, john.hubbard, Andrew Morton, Christoph Hellwig,
	Dan Williams, Dave Chinner, Dave Hansen, Ira Weiny,
	Jason Gunthorpe, Jérôme Glisse, LKML, amd-gfx,
	ceph-devel, devel, devel, dri-devel, intel-gfx, kvm,
	linux-arm-kernel, linux-block, linux-crypto, linux-fbdev,
	linux-fsdevel, linux-media, linux-mm, linux-nfs, linux-rdma,
	linux-rpi-kernel, linux-xfs, netdev, rds-devel, sparclinux, x86,
	xen-devel
In-Reply-To: <20190802145227.GQ25064@quack2.suse.cz>

On 8/2/19 7:52 AM, Jan Kara wrote:
> On Fri 02-08-19 07:24:43, Matthew Wilcox wrote:
>> On Fri, Aug 02, 2019 at 02:41:46PM +0200, Jan Kara wrote:
>>> On Fri 02-08-19 11:12:44, Michal Hocko wrote:
>>>> On Thu 01-08-19 19:19:31, john.hubbard@gmail.com wrote:
>>>> [...]
>>>>> 2) Convert all of the call sites for get_user_pages*(), to
>>>>> invoke put_user_page*(), instead of put_page(). This involves dozens of
>>>>> call sites, and will take some time.
>>>>
>>>> How do we make sure this is the case and it will remain the case in the
>>>> future? There must be some automagic to enforce/check that. It is simply
>>>> not manageable to do it every now and then because then 3) will simply
>>>> be never safe.
>>>>
>>>> Have you considered coccinelle or some other scripted way to do the
>>>> transition? I have no idea how to deal with future changes that would
>>>> break the balance though.

Hi Michal,

Yes, I've thought about it, and coccinelle falls a bit short (it's not smart
enough to know which put_page()'s to convert). However, there is a debug
option planned: a yet-to-be-posted commit [1] uses struct page extensions
(obviously protected by CONFIG_DEBUG_GET_USER_PAGES_REFERENCES) to add
a redundant counter. That allows:

void __put_page(struct page *page)
{
	...
	/* Someone called put_page() instead of put_user_page() */
	WARN_ON_ONCE(atomic_read(&page_ext->pin_count) > 0);

>>>
>>> Yeah, that's why I've been suggesting at LSF/MM that we may need to create
>>> a gup wrapper - say vaddr_pin_pages() - and track which sites dropping
>>> references got converted by using this wrapper instead of gup. The
>>> counterpart would then be more logically named as unpin_page() or whatever
>>> instead of put_user_page().  Sure this is not completely foolproof (you can
>>> create new callsite using vaddr_pin_pages() and then just drop refs using
>>> put_page()) but I suppose it would be a high enough barrier for missed
>>> conversions... Thoughts?

The debug option above is still a bit simplistic in its implementation (and maybe
not taking full advantage of the data it has), but I think it's preferable,
because it monitors the "core" and WARNs.

Instead of the wrapper, I'm thinking: documentation and the passage of time,
plus the debug option (perhaps enhanced--probably once I post it someone will
notice opportunities), yes?

>>
>> I think the API we really need is get_user_bvec() / put_user_bvec(),
>> and I know Christoph has been putting some work into that.  That avoids
>> doing refcount operations on hundreds of pages if the page in question is
>> a huge page.  Once people are switched over to that, they won't be tempted
>> to manually call put_page() on the individual constituent pages of a bvec.
> 
> Well, get_user_bvec() is certainly a good API for one class of users but
> just looking at the above series, you'll see there are *many* places that
> just don't work with bvecs at all and you need something for those.
> 

Yes, there are quite a few places that don't involve _bvec, as we can see
right here. So we need something. Andrew asked for a debug option some time
ago, and several people (Dave Hansen, Dan Williams, Jerome) had the idea
of vmap-ing gup pages separately, so you can definitely tell where each
page came from. I'm hoping not to have to go to that level of complexity
though.


[1] "mm/gup: debug tracking of get_user_pages() references" :
https://github.com/johnhubbard/linux/commit/21ff7d6161ec2a14d3f9d17c98abb00cc969d4d6

thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply

* [PATCH net 0/2] Fix batched event generation for vlan action
From: Roman Mashak @ 2019-08-02 19:16 UTC (permalink / raw)
  To: davem; +Cc: netdev, kernel, jhs, xiyou.wangcong, jiri, Roman Mashak

When adding or deleting a batch of entries, the kernel sends up to
TCA_ACT_MAX_PRIO (defined as 32 in the kernel) entries in an event to user
space. However, it does not consider that the action sizes may vary and
require different skb sizes.

For example, consider the following script adding 32 entries with all
supported vlan parameters (in order to maximize the netlink message size):

% cat tc-batch.sh
TC="sudo /mnt/iproute2.git/tc/tc"

$TC actions flush action vlan
for i in `seq 1 $1`;
do
   cmd="action vlan push protocol 802.1q id 4094 priority 7 pipe \
               index $i cookie aabbccddeeff112233445566778800a1 "
   args=$args$cmd
done
$TC actions add $args
%
% ./tc-batch.sh 32
Error: Failed to fill netlink attributes while adding TC action.
We have an error talking to the kernel
%

Patch 1 adds a callback to the tc_action_ops of the vlan action, which
calculates the action size and passes it to tcf_add_notify()/tcf_del_notify().

Patch 2 updates the TDC test suite with relevant vlan test cases.
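
To illustrate the sizing that patch 1 relies on, here is a minimal
userspace sketch of how padded netlink attribute sizes add up. It is not
the kernel code: the sketch_* names are hypothetical stand-ins for the
kernel's NLA_ALIGN()/nla_total_size() helpers, and the struct tc_vlan
term from the patch is left out because its size depends on the kernel
headers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed constants: attributes are padded to 4 bytes and carry a
 * 4-byte header (u16 length + u16 type), as in the netlink UAPI. */
#define SKETCH_NLA_ALIGNTO 4u
#define SKETCH_NLA_HDRLEN  4u

/* Hypothetical stand-in for nla_total_size(): header + payload, padded. */
size_t sketch_nla_total_size(size_t payload)
{
	size_t len = SKETCH_NLA_HDRLEN + payload;

	/* The attribute length field is 16 bits wide. */
	assert(len <= 0xffffu);
	return (len + SKETCH_NLA_ALIGNTO - 1) & ~(size_t)(SKETCH_NLA_ALIGNTO - 1);
}

/* Space taken by the three vlan attributes listed in patch 1. */
size_t sketch_vlan_attrs_size(void)
{
	return sketch_nla_total_size(2)   /* u16 vlan id */
	     + sketch_nla_total_size(2)   /* u16 vlan protocol */
	     + sketch_nla_total_size(1);  /* u8 vlan priority */
}
```

Under these assumptions each small attribute is padded to 8 bytes, so the
three attributes add 24 bytes per action on top of the base structure,
which is why a batch of 32 actions needs a larger skb than the default
estimate.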

Roman Mashak (2):
  net sched: update vlan action for batched events operations
  tc-testing: updated vlan action tests with batch create/delete

 net/sched/act_vlan.c                               |  9 +++
 .../tc-testing/tc-tests/actions/vlan.json          | 94 ++++++++++++++++++++++
 2 files changed, 103 insertions(+)

-- 
2.7.4

^ permalink raw reply

* [PATCH net 1/2] net sched: update vlan action for batched events operations
From: Roman Mashak @ 2019-08-02 19:16 UTC (permalink / raw)
  To: davem; +Cc: netdev, kernel, jhs, xiyou.wangcong, jiri, Roman Mashak
In-Reply-To: <1564773407-26209-1-git-send-email-mrv@mojatatu.com>

Add get_fill_size() routine used to calculate the action size
when building a batch of events.

Fixes: c7e2b9689 ("sched: introduce vlan action")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
---
 net/sched/act_vlan.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index 9269d350fb8a..e0c97267bccb 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -306,6 +306,14 @@ static int tcf_vlan_search(struct net *net, struct tc_action **a, u32 index)
 	return tcf_idr_search(tn, a, index);
 }
 
+static size_t tcf_vlan_get_fill_size(const struct tc_action *act)
+{
+	return nla_total_size(sizeof(struct tc_vlan))
+		+ nla_total_size(sizeof(u16)) /* TCA_VLAN_PUSH_VLAN_ID */
+		+ nla_total_size(sizeof(u16)) /* TCA_VLAN_PUSH_VLAN_PROTOCOL */
+		+ nla_total_size(sizeof(u8)); /* TCA_VLAN_PUSH_VLAN_PRIORITY */
+}
+
 static struct tc_action_ops act_vlan_ops = {
 	.kind		=	"vlan",
 	.id		=	TCA_ID_VLAN,
@@ -315,6 +323,7 @@ static struct tc_action_ops act_vlan_ops = {
 	.init		=	tcf_vlan_init,
 	.cleanup	=	tcf_vlan_cleanup,
 	.walk		=	tcf_vlan_walker,
+	.get_fill_size	=	tcf_vlan_get_fill_size,
 	.lookup		=	tcf_vlan_search,
 	.size		=	sizeof(struct tcf_vlan),
 };
-- 
2.7.4


^ permalink raw reply related

* [PATCH net 2/2] tc-testing: updated vlan action tests with batch create/delete
From: Roman Mashak @ 2019-08-02 19:16 UTC (permalink / raw)
  To: davem; +Cc: netdev, kernel, jhs, xiyou.wangcong, jiri, Roman Mashak
In-Reply-To: <1564773407-26209-1-git-send-email-mrv@mojatatu.com>

Update TDC tests with cases verifying the ability of TC to install or delete
batches of vlan actions.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
---
 .../tc-testing/tc-tests/actions/vlan.json          | 94 ++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/vlan.json b/tools/testing/selftests/tc-testing/tc-tests/actions/vlan.json
index cc7c7d758008..6503b1ce091f 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/vlan.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/vlan.json
@@ -713,5 +713,99 @@
         "teardown": [
             "$TC actions flush action vlan"
         ]
+    },
+    {
+        "id": "294e",
+        "name": "Add batch of 32 vlan push actions with cookie",
+        "category": [
+            "actions",
+            "vlan"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action vlan",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "bash -c \"for i in \\`seq 1 32\\`; do cmd=\\\"action vlan push protocol 802.1q id 4094 priority 7 pipe index \\$i cookie aabbccddeeff112233445566778800a1 \\\"; args=\"\\$args\\$cmd\"; done && $TC actions add \\$args\"",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action vlan",
+        "matchPattern": "^[ \t]+index [0-9]+ ref",
+        "matchCount": "32",
+        "teardown": [
+            "$TC actions flush action vlan"
+        ]
+    },
+    {
+        "id": "56f7",
+        "name": "Delete batch of 32 vlan push actions",
+        "category": [
+            "actions",
+            "vlan"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action vlan",
+                0,
+                1,
+                255
+            ],
+            "bash -c \"for i in \\`seq 1 32\\`; do cmd=\\\"action vlan push protocol 802.1q id 4094 priority 7 pipe index \\$i \\\"; args=\\\"\\$args\\$cmd\\\"; done && $TC actions add \\$args\""
+        ],
+        "cmdUnderTest": "bash -c \"for i in \\`seq 1 32\\`; do cmd=\\\"action vlan index \\$i \\\"; args=\"\\$args\\$cmd\"; done && $TC actions del \\$args\"",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action vlan",
+        "matchPattern": "^[ \t]+index [0-9]+ ref",
+        "matchCount": "0",
+        "teardown": []
+    },
+    {
+        "id": "759f",
+        "name": "Add batch of 32 vlan pop actions with cookie",
+        "category": [
+            "actions",
+            "vlan"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action vlan",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "bash -c \"for i in \\`seq 1 32\\`; do cmd=\\\"action vlan pop continue index \\$i cookie aabbccddeeff112233445566778800a1 \\\"; args=\"\\$args\\$cmd\"; done && $TC actions add \\$args\"",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action vlan",
+        "matchPattern": "^[ \t]+index [0-9]+ ref",
+        "matchCount": "32",
+        "teardown": [
+            "$TC actions flush action vlan"
+        ]
+    },
+    {
+        "id": "c84a",
+        "name": "Delete batch of 32 vlan pop actions",
+        "category": [
+            "actions",
+            "vlan"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action vlan",
+                0,
+                1,
+                255
+            ],
+            "bash -c \"for i in \\`seq 1 32\\`; do cmd=\\\"action vlan pop index \\$i \\\"; args=\\\"\\$args\\$cmd\\\"; done && $TC actions add \\$args\""
+        ],
+        "cmdUnderTest": "bash -c \"for i in \\`seq 1 32\\`; do cmd=\\\"action vlan index \\$i \\\"; args=\"\\$args\\$cmd\"; done && $TC actions del \\$args\"",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions list action vlan",
+        "matchPattern": "^[ \t]+index [0-9]+ ref",
+        "matchCount": "0",
+        "teardown": []
     }
 ]
-- 
2.7.4


^ permalink raw reply related

* Re: [net-next 01/12] net/mlx5: E-Switch, add ingress rate support
From: Saeed Mahameed @ 2019-08-02 19:22 UTC (permalink / raw)
  To: alexei.starovoitov@gmail.com
  Cc: Eli Cohen, davem@davemloft.net, netdev@vger.kernel.org,
	Paul Blakey
In-Reply-To: <CAADnVQ+VOSYxbF9RiMJx4kY9bxJCS+Tsf97nsOnRLvi2r6RCog@mail.gmail.com>

On Fri, 2019-08-02 at 10:37 -0700, Alexei Starovoitov wrote:
> On Thu, Aug 1, 2019 at 6:30 PM Saeed Mahameed <saeedm@mellanox.com>
> wrote:
> > From: Eli Cohen <eli@mellanox.com>
> > 
> > Use the scheduling elements to implement ingress rate limiter on an
> > eswitch port's ingress traffic. Since the ingress of eswitch port is
> > the egress of VF port, we control eswitch ingress by controlling VF
> > egress.
> 
> Looks like the patch is only passing args to firmware which is doing
> the magic.
> Can you please describe what is the algorithm there?
> Is it configurable?

Hi Alexei, 

I am not sure how much detail you are looking for, but let me share
some of what I know:

From a previous submission for legacy mode sriov vf bw limit, where we 
introduced the FW configuration API and the legacy sriov use case: 
https://patchwork.kernel.org/patch/9404655/

So basically the algorithm is Deficit Weighted Round Robin (DWRR)
between the agents; we can control the BW allocation/weight of each agent
(vf vport).

Quoting the commit message from the above link:

"The TSAR implements a Deficit Weighted Round Robin (DWRR) between the
agents. Each agent attached to the TSAR is assigned with a Weight. An
agent is awarded with transmission tokens according to its Weight, and
charged with transmission Tokens according to the amount of data it has
transmitted. Effectively, the Weight (relative to the other agents’
Weight) defines the percentage of the BW an agent will receive,
assuming it has enough data to sustain this BW.

This arbitration scheme is work-preserving, meaning that an agent not
using the entire BW it was allocated, hands over the excess BW, to be
redistributed among the other agents. Each agent will receive
additional BW according to its Weight."
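
As a rough illustration of the scheme quoted above, here is a minimal
sketch of textbook deficit weighted round robin. It is an assumption-laden
model, not the firmware's implementation: the struct, the function names,
the fixed packet size and the quantum are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical agent state for the sketch. */
struct sketch_agent {
	unsigned int weight;        /* relative share of the bandwidth */
	unsigned int backlog_pkts;  /* packets waiting to be sent */
	unsigned long deficit;      /* unused credit, in bytes */
	unsigned long sent_bytes;   /* bytes transmitted so far */
};

/* One scheduling round: every agent is visited once. */
void sketch_dwrr_round(struct sketch_agent *a, size_t n,
		       unsigned long quantum, unsigned long pkt_bytes)
{
	assert(quantum > 0 && pkt_bytes > 0);

	for (size_t i = 0; i < n; i++) {
		if (a[i].backlog_pkts == 0) {
			/* Idle agents keep no credit; their turn is skipped. */
			a[i].deficit = 0;
			continue;
		}
		/* Award credit in proportion to the weight. */
		a[i].deficit += quantum * a[i].weight;
		/* Charge the agent for every packet it transmits. */
		while (a[i].backlog_pkts > 0 && a[i].deficit >= pkt_bytes) {
			a[i].backlog_pkts--;
			a[i].deficit -= pkt_bytes;
			a[i].sent_bytes += pkt_bytes;
		}
		if (a[i].backlog_pkts == 0)
			a[i].deficit = 0;
	}
}

/* Run a number of rounds back to back. */
void sketch_dwrr_run(struct sketch_agent *a, size_t n, unsigned long quantum,
		     unsigned long pkt_bytes, unsigned int rounds)
{
	while (rounds--)
		sketch_dwrr_round(a, n, quantum, pkt_bytes);
}
```

In this model two backlogged agents with weights 1 and 3 end up sending
in a 1:3 ratio, and an agent that runs out of data simply stops consuming
turns, so the remaining agents use the link according to their weights.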

Thanks,
Saeed.

^ permalink raw reply

* Re: [PATCH 20/34] xen: convert put_page() to put_user_page*()
From: John Hubbard @ 2019-08-02 19:25 UTC (permalink / raw)
  To: Weiny, Ira, Juergen Gross, john.hubbard@gmail.com, Andrew Morton
  Cc: devel@driverdev.osuosl.org, Dave Chinner, Christoph Hellwig,
	Williams, Dan J, x86@kernel.org, linux-mm@kvack.org, Dave Hansen,
	amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	intel-gfx@lists.freedesktop.org,
	linux-arm-kernel@lists.infradead.org,
	linux-rpi-kernel@lists.infradead.org, devel@lists.orangefs.org,
	xen-devel@lists.xenproject.org, Boris Ostrovsky,
	rds-devel@oss.oracle.com, Jérôme Glisse, Jan Kara,
	ceph-devel@vger.kernel.org, kvm@vger.kernel.org,
	linux-block@vger.kernel.org, linux-crypto@vger.kernel.org,
	linux-fbdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, LKML,
	linux-media@vger.kernel.org, linux-nfs@vger.kernel.org,
	linux-rdma@vger.kernel.org, linux-xfs@vger.kernel.org,
	netdev@vger.kernel.org, sparclinux@vger.kernel.org,
	Jason Gunthorpe
In-Reply-To: <2807E5FD2F6FDA4886F6618EAC48510E79E66216@CRSMSX101.amr.corp.intel.com>

On 8/2/19 9:09 AM, Weiny, Ira wrote:
>>
>> On 02.08.19 07:48, John Hubbard wrote:
>>> On 8/1/19 9:36 PM, Juergen Gross wrote:
>>>> On 02.08.19 04:19, john.hubbard@gmail.com wrote:
>>>>> From: John Hubbard <jhubbard@nvidia.com>
>>> ...
>>> If that's not the case (both here, and in 3 or 4 other patches in this
>>> series), then as you said, I should add NULL checks to put_user_pages()
>>> and put_user_pages_dirty_lock().
>>
>> In this case it is not correct, but can easily be handled. The NULL case can
>> occur only in an error case with the pages array filled partially or not at all.
>>
>> I'd prefer something like the attached patch here.
> 
> I'm not an expert in this code and have not looked at it carefully but that patch does seem to be the better fix than forcing NULL checks on everyone.
> 

OK, I'll use Juergen's approach, and also check for that pattern in the
other patches.


thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply

* Re: [PATCH 00/34] put_user_pages(): miscellaneous call sites
From: John Hubbard @ 2019-08-02 19:33 UTC (permalink / raw)
  To: Peter Zijlstra, john.hubbard
  Cc: Andrew Morton, Christoph Hellwig, Dan Williams, Dave Chinner,
	Dave Hansen, Ira Weiny, Jan Kara, Jason Gunthorpe,
	Jérôme Glisse, LKML, amd-gfx, ceph-devel, devel, devel,
	dri-devel, intel-gfx, kvm, linux-arm-kernel, linux-block,
	linux-crypto, linux-fbdev, linux-fsdevel, linux-media, linux-mm,
	linux-nfs, linux-rdma, linux-rpi-kernel, linux-xfs, netdev,
	rds-devel, sparclinux, x86, xen-devel
In-Reply-To: <20190802080554.GD2332@hirez.programming.kicks-ass.net>

On 8/2/19 1:05 AM, Peter Zijlstra wrote:
> On Thu, Aug 01, 2019 at 07:16:19PM -0700, john.hubbard@gmail.com wrote:
> 
>> This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
>> ("mm: introduce put_user_page*(), placeholder versions"). That commit
>> has an extensive description of the problem and the planned steps to
>> solve it, but the highlights are:
> 
> That is one horridly mangled Changelog there :-/ It looks like it's
> partially duplicated.

Yeah. It took so long to merge that I think I was no longer able to
actually see the commit description, after N readings. sigh

> 
> Anyway; no objections to any of that, but I just wanted to mention that
> there are other problems with long term pinning that haven't been
> mentioned, notably they inhibit compaction.
> 
> A long time ago I proposed an interface to mark pages as pinned, such
> that we could run compaction before we actually did the pinning.
> 

This is all heading toward marking pages as pinned, so we should finally
get there.  I'll post the RFC for tracking pinned pages shortly.


thanks,
-- 
John Hubbard
NVIDIA

^ permalink raw reply

* [PATCH ethtool] ethtool: dump nested registers
From: Vivien Didelot @ 2019-08-02 19:34 UTC (permalink / raw)
  To: netdev; +Cc: f.fainelli, andrew, davem, linville, cphealy, Vivien Didelot

Usually kernel drivers set the regs->len value to the same length as
info->regdump_len, which was used for the allocation. In the case where
regs->len is smaller than the allocated info->regdump_len length,
we may assume that the dump contains a nested set of registers.

This becomes handy for kernel drivers to expose registers of an
underlying network conduit unfortunately not exposed to userspace,
as found in network switching equipment for example.

This patch adds support for recursing into the dump operation if there
is enough room for a nested ethtool_drvinfo structure containing a
valid driver name, followed by an ethtool_regs structure like this:

    0      regs->len                        info->regdump_len
    v              v                                        v
    +--------------+-----------------+--------------+-- - --+
    | ethtool_regs | ethtool_drvinfo | ethtool_regs |       |
    +--------------+-----------------+--------------+-- - --+
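
The layout above can be walked as in the following minimal sketch. It
uses simplified, hypothetical stand-ins (sketch_drvinfo, sketch_regs_hdr)
rather than the real ethtool_drvinfo and ethtool_regs structures, and it
adds two checks that are assumptions of the sketch, not part of this
patch: the nested driver name must be non-empty and NUL-terminated, and
the nested length must fit in the remaining space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the ethtool structures. */
struct sketch_drvinfo { char driver[32]; };
struct sketch_regs_hdr { uint32_t len; };

/*
 * buf holds total bytes of dump data; the outer registers occupy the
 * first outer_len bytes. Returns 1 and fills the outputs when a nested
 * drvinfo + regs header fits behind the outer registers, 0 otherwise.
 */
int sketch_find_nested(const uint8_t *buf, size_t total, size_t outer_len,
		       struct sketch_drvinfo *info,
		       struct sketch_regs_hdr *hdr, size_t *data_off)
{
	size_t need = sizeof(*info) + sizeof(*hdr);

	assert(buf && info && hdr && data_off);

	/* Same idea as the size test in dump_regs(): room must be left over. */
	if (outer_len > total || total - outer_len <= need)
		return 0;

	/* Copy out instead of casting, to avoid unaligned accesses. */
	memcpy(info, buf + outer_len, sizeof(*info));
	memcpy(hdr, buf + outer_len + sizeof(*info), sizeof(*hdr));
	*data_off = outer_len + need;

	/* Extra checks assumed by this sketch. */
	if (info->driver[0] == '\0' ||
	    memchr(info->driver, '\0', sizeof(info->driver)) == NULL)
		return 0;
	if (hdr->len > total - *data_off)
		return 0;

	return 1;
}
```

A caller would dump the outer registers first, then call the helper and,
on success, dump hdr->len bytes starting at data_off under the nested
driver name.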

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
---
 ethtool.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index 05fe05a08..c0e2903c5 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -1245,7 +1245,7 @@ static int dump_regs(int gregs_dump_raw, int gregs_dump_hex,
 
 	if (gregs_dump_raw) {
 		fwrite(regs->data, regs->len, 1, stdout);
-		return 0;
+		goto nested;
 	}
 
 	if (!gregs_dump_hex)
@@ -1253,7 +1253,7 @@ static int dump_regs(int gregs_dump_raw, int gregs_dump_hex,
 			if (!strncmp(driver_list[i].name, info->driver,
 				     ETHTOOL_BUSINFO_LEN)) {
 				if (driver_list[i].func(info, regs) == 0)
-					return 0;
+					goto nested;
 				/* This version (or some other
 				 * variation in the dump format) is
 				 * not handled; fall back to hex
@@ -1263,6 +1263,15 @@ static int dump_regs(int gregs_dump_raw, int gregs_dump_hex,
 
 	dump_hex(stdout, regs->data, regs->len, 0);
 
+nested:
+	/* Recurse dump if some drvinfo and regs structures are nested */
+	if (info->regdump_len > regs->len + sizeof(*info) + sizeof(*regs)) {
+		info = (struct ethtool_drvinfo *)(&regs->data[0] + regs->len);
+		regs = (struct ethtool_regs *)(&regs->data[0] + regs->len + sizeof(*info));
+
+		return dump_regs(gregs_dump_raw, gregs_dump_hex, info, regs);
+	}
+
 	return 0;
 }
 
-- 
2.22.0


^ permalink raw reply related

* [PATCH net-next] net: dsa: dump CPU port regs through master
From: Vivien Didelot @ 2019-08-02 19:34 UTC (permalink / raw)
  To: netdev; +Cc: f.fainelli, andrew, davem, linville, cphealy, Vivien Didelot
In-Reply-To: <20190802193455.17126-1-vivien.didelot@gmail.com>

Merge the CPU port registers dump into the master interface registers
dump through ethtool, by nesting the ethtool_drvinfo and ethtool_regs
structures of the CPU port into the dump.

drvinfo->regdump_len will contain the full data length, while regs->len
will contain only the master interface registers dump length.

This allows, for example, dumping the CPU port registers on a ZII Dev
C board like this:

    # ethtool -d eth1
    0x004:                                              0x00000000
    0x008:                                              0x0a8000aa
    0x010:                                              0x01000000
    0x014:                                              0x00000000
    0x024:                                              0xf0000102
    0x040:                                              0x6d82c800
    0x044:                                              0x00000020
    0x064:                                              0x40000000
    0x084: RCR (Receive Control Register)               0x47c00104
        MAX_FL (Maximum frame length)                   1984
        FCE (Flow control enable)                       0
        BC_REJ (Broadcast frame reject)                 0
        PROM (Promiscuous mode)                         0
        DRT (Disable receive on transmit)               0
        LOOP (Internal loopback)                        0
    0x0c4: TCR (Transmit Control Register)              0x00000004
        RFC_PAUSE (Receive frame control pause)         0
        TFC_PAUSE (Transmit frame control pause)        0
        FDEN (Full duplex enable)                       1
        HBC (Heartbeat control)                         0
        GTS (Graceful transmit stop)                    0
    0x0e4:                                              0x76735d6d
    0x0e8:                                              0x7e9e8808
    0x0ec:                                              0x00010000
    .
    .
    .
    88E6352  Switch Port Registers
    ------------------------------
    00: Port Status                            0x4d04
          Pause Enabled                        0
          My Pause                             1
          802.3 PHY Detected                   0
          Link Status                          Up
          Duplex                               Full
          Speed                                100 or 200 Mbps
          EEE Enabled                          0
          Transmitter Paused                   0
          Flow Control                         0
          Config Mode                          0x4
    01: Physical Control                       0x003d
          RGMII Receive Timing Control         Default
          RGMII Transmit Timing Control        Default
          200 BASE Mode                        100
          Flow Control's Forced value          0
          Force Flow Control                   0
          Link's Forced value                  Up
          Force Link                           1
          Duplex's Forced value                Full
          Force Duplex                         1
          Force Speed                          100 or 200 Mbps
    .
    .
    .

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
---
 net/dsa/master.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/net/dsa/master.c b/net/dsa/master.c
index 4b52f8bac5e1..a8e52c9967f4 100644
--- a/net/dsa/master.c
+++ b/net/dsa/master.c
@@ -8,6 +8,70 @@
 
 #include "dsa_priv.h"
 
+static int dsa_master_get_regs_len(struct net_device *dev)
+{
+	struct dsa_port *cpu_dp = dev->dsa_ptr;
+	const struct ethtool_ops *ops = cpu_dp->orig_ethtool_ops;
+	struct dsa_switch *ds = cpu_dp->ds;
+	int port = cpu_dp->index;
+	int ret = 0;
+	int len;
+
+	if (ops->get_regs_len) {
+		len = ops->get_regs_len(dev);
+		if (len < 0)
+			return len;
+		ret += len;
+	}
+
+	ret += sizeof(struct ethtool_drvinfo);
+	ret += sizeof(struct ethtool_regs);
+
+	if (ds->ops->get_regs_len) {
+		len = ds->ops->get_regs_len(ds, port);
+		if (len < 0)
+			return len;
+		ret += len;
+	}
+
+	return ret;
+}
+
+static void dsa_master_get_regs(struct net_device *dev,
+				struct ethtool_regs *regs, void *data)
+{
+	struct dsa_port *cpu_dp = dev->dsa_ptr;
+	const struct ethtool_ops *ops = cpu_dp->orig_ethtool_ops;
+	struct dsa_switch *ds = cpu_dp->ds;
+	struct ethtool_drvinfo *cpu_info;
+	struct ethtool_regs *cpu_regs;
+	int port = cpu_dp->index;
+	int len;
+
+	if (ops->get_regs_len && ops->get_regs) {
+		len = ops->get_regs_len(dev);
+		if (len < 0)
+			return;
+		regs->len = len;
+		ops->get_regs(dev, regs, data);
+		data += regs->len;
+	}
+
+	cpu_info = (struct ethtool_drvinfo *)data;
+	strlcpy(cpu_info->driver, "dsa", sizeof(cpu_info->driver));
+	data += sizeof(*cpu_info);
+	cpu_regs = (struct ethtool_regs *)data;
+	data += sizeof(*cpu_regs);
+
+	if (ds->ops->get_regs_len && ds->ops->get_regs) {
+		len = ds->ops->get_regs_len(ds, port);
+		if (len < 0)
+			return;
+		cpu_regs->len = len;
+		ds->ops->get_regs(ds, port, cpu_regs, data);
+	}
+}
+
 static void dsa_master_get_ethtool_stats(struct net_device *dev,
 					 struct ethtool_stats *stats,
 					 uint64_t *data)
@@ -147,6 +211,8 @@ static int dsa_master_ethtool_setup(struct net_device *dev)
 	if (cpu_dp->orig_ethtool_ops)
 		memcpy(ops, cpu_dp->orig_ethtool_ops, sizeof(*ops));
 
+	ops->get_regs_len = dsa_master_get_regs_len;
+	ops->get_regs = dsa_master_get_regs;
 	ops->get_sset_count = dsa_master_get_sset_count;
 	ops->get_ethtool_stats = dsa_master_get_ethtool_stats;
 	ops->get_strings = dsa_master_get_strings;
-- 
2.22.0


^ permalink raw reply related

* Re: [PATCH v4 4/4] net: phy: realtek: configure RTL8211E LEDs
From: Matthias Kaehlcke @ 2019-08-02 19:40 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
	Heiner Kallweit, netdev, devicetree, linux-kernel,
	Douglas Anderson
In-Reply-To: <20190802181840.GP2099@lunn.ch>

Hi Andrew,

On Fri, Aug 02, 2019 at 08:18:40PM +0200, Andrew Lunn wrote:
> On Thu, Aug 01, 2019 at 12:07:59PM -0700, Matthias Kaehlcke wrote:
> > Configure the RTL8211E LEDs behavior when the device tree property
> > 'realtek,led-modes' is specified.

note to self: update commit message

> > Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> 
> Hi Matthias
> 
> I was more thinking of adding a new driver call to the PHY driver API,
> to configure an LED. Something like
> 
> rtl8211e_config_leds(phydev, int led, struct phy_led_config cfg);

I guess it should be singular ('_config_led') if it configures a single
LED.

> It would be called by the phylib core after config_init(). But also,
> thinking ahead to generic linux LED support, it could be called later
> to reconfigure the LEDs to use a different trigger. The standard LED
> sysfs interface would be used.

I'll look into the phylib part.

Thanks

Matthias

^ permalink raw reply

* Re: [net-next 01/12] net/mlx5: E-Switch, add ingress rate support
From: Alexei Starovoitov @ 2019-08-02 19:44 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Eli Cohen, davem@davemloft.net, netdev@vger.kernel.org,
	Paul Blakey
In-Reply-To: <b2c77010e96b5fdb6693e5cf0a46a2017f389b44.camel@mellanox.com>

On Fri, Aug 02, 2019 at 07:22:21PM +0000, Saeed Mahameed wrote:
> On Fri, 2019-08-02 at 10:37 -0700, Alexei Starovoitov wrote:
> > On Thu, Aug 1, 2019 at 6:30 PM Saeed Mahameed <saeedm@mellanox.com>
> > wrote:
> > > From: Eli Cohen <eli@mellanox.com>
> > > 
> > > > Use the scheduling elements to implement ingress rate limiter on an
> > > > eswitch port's ingress traffic. Since the ingress of eswitch port is
> > > > the egress of VF port, we control eswitch ingress by controlling VF
> > > > egress.
> > 
> > Looks like the patch is only passing args to firmware which is doing
> > the magic.
> > Can you please describe what is the algorithm there?
> > Is it configurable?
> 
> Hi Alexei, 
> 
> I am not sure how much detail you are looking for, but let me share
> some of what I know:
> 
> From a previous submission for legacy mode sriov vf bw limit, where we 
> introduced the FW configuration API and the legacy sriov use case: 
> https://patchwork.kernel.org/patch/9404655/
> 
> So basically the algorithm is Deficit Weighted Round Robin (DWRR)
> between the agents; we can control the BW allocation/weight of each agent
> (vf vport).

The commit log of this patch says nothing about DWRR.
It is also not using any of the APIs that were provided by that
earlier patch.
What is going on?


^ permalink raw reply

* [PATCH] isdn: hysdn: fix code style error from checkpatch
From: Ricardo Bruno Lopes da Silva @ 2019-08-02 19:50 UTC (permalink / raw)
  To: isdn; +Cc: gregkh, netdev, devel, linux-kernel, lkcamp

Fix the error below from checkpatch.

WARNING: Block comments use * on subsequent lines
+/***********************************************************
+

Signed-off-by: Ricardo Bruno Lopes da Silva <ricardo6142@gmail.com>
---
 Hi! This is my first patch, and I am learning how to contribute to the Linux
kernel. Let me know if you have any suggestions.

Thanks, 
Ricardo Bruno

 drivers/staging/isdn/hysdn/hycapi.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/staging/isdn/hysdn/hycapi.c b/drivers/staging/isdn/hysdn/hycapi.c
index a2c15cd7b..b7ba28d40 100644
--- a/drivers/staging/isdn/hysdn/hycapi.c
+++ b/drivers/staging/isdn/hysdn/hycapi.c
@@ -107,11 +107,8 @@ hycapi_remove_ctr(struct capi_ctr *ctrl)
 	card->hyctrlinfo = NULL;
 }
 
-/***********************************************************
-
-Queue a CAPI-message to the controller.
+/* Queue a CAPI-message to the controller. */
 
-***********************************************************/
 
 static void
 hycapi_sendmsg_internal(struct capi_ctr *ctrl, struct sk_buff *skb)
-- 
2.20.1


^ permalink raw reply related
