* Re: [PATCH net-next 1/2] bnxt_en: do not call napi_hash_add()
From: Michael Chan @ 2016-11-08 21:25 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1478632013.17367.23.camel@edumazet-glaptop3.roam.corp.google.com>
On Tue, Nov 8, 2016 at 11:06 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> This is automatically done from netif_napi_add(), and we want to not
> export napi_hash_add() anymore in the following patch.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Michael Chan <michael.chan@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
^ permalink raw reply
* [iproute PATCH 0/2] Resend: Simplify and enhance vf_info parsing
From: Phil Sutter @ 2016-11-08 21:29 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
This patch series got lost in a discussion about whether the code the
first patch removes is necessary or not - static analysis as well as my
tests showed it is not. Therefore resending this with updated
description of patch 1 to contain the discussion's gist.
Phil Sutter (2):
ipaddress: Simplify vf_info parsing
ipaddress: Print IFLA_VF_QUERY_RSS_EN setting
ip/ipaddress.c | 52 ++++++++++++++++++----------------------------------
1 file changed, 18 insertions(+), 34 deletions(-)
--
2.10.0
^ permalink raw reply
* [iproute PATCH 2/2] ipaddress: Print IFLA_VF_QUERY_RSS_EN setting
From: Phil Sutter @ 2016-11-08 21:29 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20161108212912.27258-1-phil@nwl.cc>
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
ip/ipaddress.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index df0f1b9c94c58..c9f769fb748e4 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -405,6 +405,14 @@ static void print_vfinfo(FILE *fp, struct rtattr *vfinfo)
fprintf(fp, ", trust %s",
vf_trust->setting ? "on" : "off");
}
+ if (vf[IFLA_VF_RSS_QUERY_EN]) {
+ struct ifla_vf_rss_query_en *rss_query =
+ RTA_DATA(vf[IFLA_VF_RSS_QUERY_EN]);
+
+ if (rss_query->setting != -1)
+ fprintf(fp, ", query_rss %s",
+ rss_query->setting ? "on" : "off");
+ }
if (vf[IFLA_VF_STATS] && show_stats)
print_vf_stats64(fp, vf[IFLA_VF_STATS]);
}
--
2.10.0
^ permalink raw reply related
* [iproute PATCH 1/2] ipaddress: Simplify vf_info parsing
From: Phil Sutter @ 2016-11-08 21:29 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20161108212912.27258-1-phil@nwl.cc>
Commit 7b8179c780a1a ("iproute2: Add new command to ip link to
enable/disable VF spoof check") tried to add support for
IFLA_VF_SPOOFCHK in a backwards-compatible manner, but aparently overdid
it: parse_rtattr_nested() handles missing attributes perfectly fine in
that it will leave the relevant field unassigned so calling code can
just compare against NULL. There is no need to layback from the previous
(IFLA_VF_TX_RATE) attribute to the next to check if IFLA_VF_SPOOFCHK is
present or not. To the contrary, it establishes a potentially incorrect
assumption of these two attributes directly following each other which
may not be the case (although up to now, kernel aligns them this way).
This patch cleans up the code to adhere to the common way of checking
for attribute existence. It has been tested to return correct results
regardless of whether the kernel exports IFLA_VF_SPOOFCHK or not.
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
ip/ipaddress.c | 44 ++++++++++----------------------------------
1 file changed, 10 insertions(+), 34 deletions(-)
diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index 7f05258f43453..df0f1b9c94c58 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -322,10 +322,7 @@ static void print_vfinfo(FILE *fp, struct rtattr *vfinfo)
{
struct ifla_vf_mac *vf_mac;
struct ifla_vf_tx_rate *vf_tx_rate;
- struct ifla_vf_spoofchk *vf_spoofchk;
- struct ifla_vf_link_state *vf_linkstate;
struct rtattr *vf[IFLA_VF_MAX + 1] = {};
- struct rtattr *tmp;
SPRINT_BUF(b1);
@@ -339,31 +336,6 @@ static void print_vfinfo(FILE *fp, struct rtattr *vfinfo)
vf_mac = RTA_DATA(vf[IFLA_VF_MAC]);
vf_tx_rate = RTA_DATA(vf[IFLA_VF_TX_RATE]);
- /* Check if the spoof checking vf info type is supported by
- * this kernel.
- */
- tmp = (struct rtattr *)((char *)vf[IFLA_VF_TX_RATE] +
- vf[IFLA_VF_TX_RATE]->rta_len);
-
- if (tmp->rta_type != IFLA_VF_SPOOFCHK)
- vf_spoofchk = NULL;
- else
- vf_spoofchk = RTA_DATA(vf[IFLA_VF_SPOOFCHK]);
-
- if (vf_spoofchk) {
- /* Check if the link state vf info type is supported by
- * this kernel.
- */
- tmp = (struct rtattr *)((char *)vf[IFLA_VF_SPOOFCHK] +
- vf[IFLA_VF_SPOOFCHK]->rta_len);
-
- if (tmp->rta_type != IFLA_VF_LINK_STATE)
- vf_linkstate = NULL;
- else
- vf_linkstate = RTA_DATA(vf[IFLA_VF_LINK_STATE]);
- } else
- vf_linkstate = NULL;
-
fprintf(fp, "%s vf %d MAC %s", _SL_, vf_mac->vf,
ll_addr_n2a((unsigned char *)&vf_mac->mac,
ETH_ALEN, 0, b1, sizeof(b1)));
@@ -407,14 +379,18 @@ static void print_vfinfo(FILE *fp, struct rtattr *vfinfo)
if (vf_rate->min_tx_rate)
fprintf(fp, ", min_tx_rate %dMbps", vf_rate->min_tx_rate);
}
+ if (vf[IFLA_VF_SPOOFCHK]) {
+ struct ifla_vf_spoofchk *vf_spoofchk =
+ RTA_DATA(vf[IFLA_VF_SPOOFCHK]);
- if (vf_spoofchk && vf_spoofchk->setting != -1) {
- if (vf_spoofchk->setting)
- fprintf(fp, ", spoof checking on");
- else
- fprintf(fp, ", spoof checking off");
+ if (vf_spoofchk->setting != -1)
+ fprintf(fp, ", spoof checking %s",
+ vf_spoofchk->setting ? "on" : "off");
}
- if (vf_linkstate) {
+ if (vf[IFLA_VF_LINK_STATE]) {
+ struct ifla_vf_link_state *vf_linkstate =
+ RTA_DATA(vf[IFLA_VF_LINK_STATE]);
+
if (vf_linkstate->link_state == IFLA_VF_LINK_STATE_AUTO)
fprintf(fp, ", link-state auto");
else if (vf_linkstate->link_state == IFLA_VF_LINK_STATE_ENABLE)
--
2.10.0
^ permalink raw reply related
* Re: [RFC PATCH net-next] net: ethtool: add support for forward error correction modes
From: Vidya Sagar Ravipati @ 2016-11-08 21:33 UTC (permalink / raw)
To: Gal Pressman
Cc: David Miller, John W. Linville, Saeed Mahameed, Gal Pressman,
netdev, Oded Wertheim, Ariel Almog, Yuval Mintz, Tzahi Oved,
Roee Shapiro, Aviad Raveh, Dustin Byford, Roopa Prabhu
In-Reply-To: <25b7fce6-49f9-bd54-5c12-c72298d107c1@gmail.com>
On Thu, Nov 3, 2016 at 6:24 AM, Gal Pressman <galp.dev@gmail.com> wrote:
>
>
> On 25/10/2016 05:50, Vidya Sagar Ravipati wrote:
>> SET FEC option:
>> root@tor: ethtool --set-fec swp1 encoding [off | RS | BaseR | auto] autoneg [off | on]
>>
>> Encoding: Types of encoding
>> Off : Turning off any encoding
>> RS : enforcing RS-FEC encoding on supported speeds
>> BaseR : enforcing Base R encoding on supported speeds
>> Auto : Default FEC settings for divers , and would represent
>
> divers? :)
Drivers :)
>
>> asking the hardware to essentially go into a best effort mode.
>>
>> Here are a few examples of what we would expect if encoding=auto:
>> - if autoneg is on, we are expecting FEC to be negotiated as on or off
>> as long as protocol supports it
>> - if the hardware is capable of detecting the FEC encoding on it's
>> receiver it will reconfigure its encoder to match
>> - in absence of the above, the configuration would be set to IEEE
>> defaults.
>
> Not sure I follow, why do we need an autoneg option if encoding type can be set to auto?
Auto is one of the FEC configuration modes which indicates the
drivers to set the IEEE defaults based on speed/duplex combination of
the port. i.e. RS FEC mode will be set in case of 100G/full
Auto negotiation is the configuration for the link to negotiate the
FEC capabilities and different FEC modes with other endpoint using
encoded bits D44:47 in base link code word.
^ permalink raw reply
* Re: [PATCH] bpf: Remove unused but set variables
From: Alexei Starovoitov @ 2016-11-08 21:45 UTC (permalink / raw)
To: Tobias Klauser; +Cc: Alexei Starovoitov, netdev, Josef Bacik
In-Reply-To: <20161108154028.13334-1-tklauser@distanz.ch>
On Tue, Nov 08, 2016 at 04:40:28PM +0100, Tobias Klauser wrote:
> Remove the unused but set variables min_set and max_set in
> adjust_reg_min_max_vals to fix the following warning when building with
> 'W=1':
>
> kernel/bpf/verifier.c:1483:7: warning: variable ‘min_set’ set but not used [-Wunused-but-set-variable]
>
> There is no warning about max_set being unused, but since it is only
> used in the assignment of min_set it can be removed as well.
>
> They were introduced in commit 484611357c19 ("bpf: allow access into map
> value arrays") but seem to have never been used.
>
> Cc: Josef Bacik <jbacik@fb.com>
> Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
> ---
Acked-by: Alexei Starovoitov <ast@kernel.org>
^ permalink raw reply
* [PATCH net-next 0/3] PHC frequency fine tuning
From: Richard Cochran @ 2016-11-08 21:49 UTC (permalink / raw)
To: netdev
Cc: David Miller, Jacob Keller, Jeff Kirsher, John Stultz,
Manfred Rudigier, Stefan Sørensen, Thomas Gleixner,
Ulrik De Bie, intel-wired-lan
This series expands the PTP Hardware Clock subsystem by adding a
method that passes the frequency tuning word to the the drivers
without dropping the low order bits. Keeping those bits is useful for
drivers whose frequency resolution is higher than 1 ppb.
The appended script (below) runs a simple demonstration of the
improvement. This test needs two Intel i210 PCIe cards installed in
the same PC, with their SDP0 pins connected by copper wire. Measuring
the estimated offset (from the ptp4l servo) and the true offset (from
the PPS) over one hour yields the following statistics.
| | Est. Before | Est. After | True Before | True After |
|--------+---------------+---------------+---------------+---------------|
| min | -5.200000e+01 | -1.600000e+01 | -3.100000e+01 | -1.000000e+00 |
| max | +5.700000e+01 | +2.500000e+01 | +8.500000e+01 | +4.000000e+01 |
| pk-pk: | +1.090000e+02 | +4.100000e+01 | +1.160000e+02 | +4.100000e+01 |
| mean | +6.472222e-02 | +1.277778e-02 | +2.422083e+01 | +1.826083e+01 |
| stddev | +1.158006e+01 | +4.581982e+00 | +1.207708e+01 | +4.981435e+00 |
Here the numbers in units of nanoseconds, and the ~20 nanosecond PPS
offset is due to input/output delays on the i210's external interface
logic.
With the series applied, both the peak to peak error and the standard
deviation improve by a factor of more than two. These two graphs show
the improvement nicely.
http://linuxptp.sourceforge.net/fine-tuning/fine-est.png
http://linuxptp.sourceforge.net/fine-tuning/fine-tru.png
Thanks,
Richard
Richard Cochran (3):
ptp: Introduce a high resolution frequency adjustment method.
ptp: igb: Use the high resolution frequency method.
ptp: dp83640: Use the high resolution frequency method.
drivers/net/ethernet/intel/igb/igb_ptp.c | 16 ++++++++--------
drivers/net/phy/dp83640.c | 14 +++++++-------
drivers/ptp/ptp_clock.c | 5 ++++-
include/linux/ptp_clock_kernel.h | 8 ++++++++
4 files changed, 27 insertions(+), 16 deletions(-)
--
2.1.4
---
#!/bin/sh
set -e
set -x
killall ptp4l || true
DUR=3600
ETHA=eth6
ETHB=eth3
DEVA=/dev/ptp`ethtool -T $ETHA | awk '/PTP/ {print $4}'`
DEVB=/dev/ptp`ethtool -T $ETHB | awk '/PTP/ {print $4}'`
testptp -d $DEVA -p 0
for x in $DEVA $DEVB; do
testptp -d $x -f 0
testptp -d $x -s
done
testptp -d $DEVA -L 0,2 # periodic output
testptp -d $DEVB -L 0,1 # external time stamp
testptp -d $DEVA -p 2000000000
ptp4l -m -q -2 -i $ETHA > log.master &
ptp4l -m -q -2 -i $ETHB -s > log.slave &
sleep 60
testptp -d $DEVB -e $DUR > log.pps
tail -n $DUR log.slave > log.est
killall ptp4l
^ permalink raw reply
* [PATCH net-next 1/3] ptp: Introduce a high resolution frequency adjustment method.
From: Richard Cochran @ 2016-11-08 21:49 UTC (permalink / raw)
To: netdev
Cc: David Miller, Jacob Keller, Jeff Kirsher, John Stultz,
Manfred Rudigier, Stefan Sørensen, Thomas Gleixner,
Ulrik De Bie, intel-wired-lan
In-Reply-To: <cover.1478526333.git.richardcochran@gmail.com>
The internal PTP Hardware Clock (PHC) interface limits the resolution for
frequency adjustments to one part per billion. However, some hardware
devices allow finer adjustment, and making use of the increased resolution
improves synchronization measurably on such devices.
This patch adds an alternative method that allows finer frequency tuning
by passing the scaled ppm value to PHC drivers. This value comes from
user space, and it has a resolution of about 0.015 ppb. We also deprecate
the older method, anticipating its removal once existing drivers have been
converted over.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Suggested-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
---
drivers/ptp/ptp_clock.c | 5 ++++-
include/linux/ptp_clock_kernel.h | 8 ++++++++
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c
index 86280b7..9c13381 100644
--- a/drivers/ptp/ptp_clock.c
+++ b/drivers/ptp/ptp_clock.c
@@ -153,7 +153,10 @@ static int ptp_clock_adjtime(struct posix_clock *pc, struct timex *tx)
s32 ppb = scaled_ppm_to_ppb(tx->freq);
if (ppb > ops->max_adj || ppb < -ops->max_adj)
return -ERANGE;
- err = ops->adjfreq(ops, ppb);
+ if (ops->adjfine)
+ err = ops->adjfine(ops, tx->freq);
+ else
+ err = ops->adjfreq(ops, ppb);
ptp->dialed_frequency = tx->freq;
} else if (tx->modes == 0) {
tx->freq = ptp->dialed_frequency;
diff --git a/include/linux/ptp_clock_kernel.h b/include/linux/ptp_clock_kernel.h
index 5ad54fc6..b76d47a 100644
--- a/include/linux/ptp_clock_kernel.h
+++ b/include/linux/ptp_clock_kernel.h
@@ -58,7 +58,14 @@ struct system_device_crosststamp;
*
* clock operations
*
+ * @adjfine: Adjusts the frequency of the hardware clock.
+ * parameter scaled_ppm: Desired frequency offset from
+ * nominal frequency in parts per million, but with a
+ * 16 bit binary fractional field.
+ *
* @adjfreq: Adjusts the frequency of the hardware clock.
+ * This method is deprecated. New drivers should implement
+ * the @adjfine method instead.
* parameter delta: Desired frequency offset from nominal frequency
* in parts per billion
*
@@ -108,6 +115,7 @@ struct ptp_clock_info {
int n_pins;
int pps;
struct ptp_pin_desc *pin_config;
+ int (*adjfine)(struct ptp_clock_info *ptp, long scaled_ppm);
int (*adjfreq)(struct ptp_clock_info *ptp, s32 delta);
int (*adjtime)(struct ptp_clock_info *ptp, s64 delta);
int (*gettime64)(struct ptp_clock_info *ptp, struct timespec64 *ts);
--
2.1.4
^ permalink raw reply related
* [PATCH net-next 2/3] ptp: igb: Use the high resolution frequency method.
From: Richard Cochran @ 2016-11-08 21:49 UTC (permalink / raw)
To: netdev
Cc: David Miller, Jacob Keller, Jeff Kirsher, John Stultz,
Manfred Rudigier, Stefan Sørensen, Thomas Gleixner,
Ulrik De Bie, intel-wired-lan
In-Reply-To: <cover.1478526333.git.richardcochran@gmail.com>
The 82580 and related devices offer a frequency resolution of about
0.029 ppb. This patch lets users of the device benefit from the
increased frequency resolution when tuning the clock.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
drivers/net/ethernet/intel/igb/igb_ptp.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index a7895c4..c30eea8 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -226,7 +226,7 @@ static int igb_ptp_adjfreq_82576(struct ptp_clock_info *ptp, s32 ppb)
return 0;
}
-static int igb_ptp_adjfreq_82580(struct ptp_clock_info *ptp, s32 ppb)
+static int igb_ptp_adjfine_82580(struct ptp_clock_info *ptp, long scaled_ppm)
{
struct igb_adapter *igb = container_of(ptp, struct igb_adapter,
ptp_caps);
@@ -235,13 +235,13 @@ static int igb_ptp_adjfreq_82580(struct ptp_clock_info *ptp, s32 ppb)
u64 rate;
u32 inca;
- if (ppb < 0) {
+ if (scaled_ppm < 0) {
neg_adj = 1;
- ppb = -ppb;
+ scaled_ppm = -scaled_ppm;
}
- rate = ppb;
- rate <<= 26;
- rate = div_u64(rate, 1953125);
+ rate = scaled_ppm;
+ rate <<= 13;
+ rate = div_u64(rate, 15625);
inca = rate & INCVALUE_MASK;
if (neg_adj)
@@ -1103,7 +1103,7 @@ void igb_ptp_init(struct igb_adapter *adapter)
adapter->ptp_caps.max_adj = 62499999;
adapter->ptp_caps.n_ext_ts = 0;
adapter->ptp_caps.pps = 0;
- adapter->ptp_caps.adjfreq = igb_ptp_adjfreq_82580;
+ adapter->ptp_caps.adjfine = igb_ptp_adjfine_82580;
adapter->ptp_caps.adjtime = igb_ptp_adjtime_82576;
adapter->ptp_caps.gettime64 = igb_ptp_gettime_82576;
adapter->ptp_caps.settime64 = igb_ptp_settime_82576;
@@ -1131,7 +1131,7 @@ void igb_ptp_init(struct igb_adapter *adapter)
adapter->ptp_caps.n_pins = IGB_N_SDP;
adapter->ptp_caps.pps = 1;
adapter->ptp_caps.pin_config = adapter->sdp_config;
- adapter->ptp_caps.adjfreq = igb_ptp_adjfreq_82580;
+ adapter->ptp_caps.adjfine = igb_ptp_adjfine_82580;
adapter->ptp_caps.adjtime = igb_ptp_adjtime_i210;
adapter->ptp_caps.gettime64 = igb_ptp_gettime_i210;
adapter->ptp_caps.settime64 = igb_ptp_settime_i210;
--
2.1.4
^ permalink raw reply related
* [PATCH net-next 3/3] ptp: dp83640: Use the high resolution frequency method.
From: Richard Cochran @ 2016-11-08 21:49 UTC (permalink / raw)
To: netdev
Cc: David Miller, Jacob Keller, Jeff Kirsher, John Stultz,
Manfred Rudigier, Stefan Sørensen, Thomas Gleixner,
Ulrik De Bie, intel-wired-lan
In-Reply-To: <cover.1478526333.git.richardcochran@gmail.com>
The dp83640 has a frequency resolution of about 0.029 ppb.
This patch lets users of the device benefit from the
increased frequency resolution when tuning the clock.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
---
drivers/net/phy/dp83640.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index 7a240fc..e2460a5 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -375,7 +375,7 @@ static int periodic_output(struct dp83640_clock *clock,
/* ptp clock methods */
-static int ptp_dp83640_adjfreq(struct ptp_clock_info *ptp, s32 ppb)
+static int ptp_dp83640_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
{
struct dp83640_clock *clock =
container_of(ptp, struct dp83640_clock, caps);
@@ -384,13 +384,13 @@ static int ptp_dp83640_adjfreq(struct ptp_clock_info *ptp, s32 ppb)
int neg_adj = 0;
u16 hi, lo;
- if (ppb < 0) {
+ if (scaled_ppm < 0) {
neg_adj = 1;
- ppb = -ppb;
+ scaled_ppm = -scaled_ppm;
}
- rate = ppb;
- rate <<= 26;
- rate = div_u64(rate, 1953125);
+ rate = scaled_ppm;
+ rate <<= 13;
+ rate = div_u64(rate, 15625);
hi = (rate >> 16) & PTP_RATE_HI_MASK;
if (neg_adj)
@@ -1035,7 +1035,7 @@ static void dp83640_clock_init(struct dp83640_clock *clock, struct mii_bus *bus)
clock->caps.n_per_out = N_PER_OUT;
clock->caps.n_pins = DP83640_N_PINS;
clock->caps.pps = 0;
- clock->caps.adjfreq = ptp_dp83640_adjfreq;
+ clock->caps.adjfine = ptp_dp83640_adjfine;
clock->caps.adjtime = ptp_dp83640_adjtime;
clock->caps.gettime64 = ptp_dp83640_gettime;
clock->caps.settime64 = ptp_dp83640_settime;
--
2.1.4
^ permalink raw reply related
* Re: [PATCH net-next 0/3] PHC frequency fine tuning
From: Keller, Jacob E @ 2016-11-08 21:56 UTC (permalink / raw)
To: netdev@vger.kernel.org, richardcochran@gmail.com
Cc: tglx@linutronix.de, Manfred.Rudigier@omicron.at,
ulrik.debie-os@e2big.org, stefan.sorensen@spectralink.com,
davem@davemloft.net, Kirsher, Jeffrey T, john.stultz@linaro.org,
intel-wired-lan@lists.osuosl.org
In-Reply-To: <cover.1478526333.git.richardcochran@gmail.com>
On Tue, 2016-11-08 at 22:49 +0100, Richard Cochran wrote:
> This series expands the PTP Hardware Clock subsystem by adding a
> method that passes the frequency tuning word to the the drivers
> without dropping the low order bits. Keeping those bits is useful
> for
> drivers whose frequency resolution is higher than 1 ppb.
>
Makes sense.
> The appended script (below) runs a simple demonstration of the
> improvement. This test needs two Intel i210 PCIe cards installed in
> the same PC, with their SDP0 pins connected by copper
> wire. Measuring
> the estimated offset (from the ptp4l servo) and the true offset (from
> the PPS) over one hour yields the following statistics.
>
> >
> > | Est. Before | Est. After | True Before | True
> > After |
> > --------+---------------+---------------+---------------+--------
> > -------|
> > min | -5.200000e+01 | -1.600000e+01 | -3.100000e+01 |
> > -1.000000e+00 |
> > max | +5.700000e+01 | +2.500000e+01 | +8.500000e+01 |
> > +4.000000e+01 |
> > pk-pk: | +1.090000e+02 | +4.100000e+01 | +1.160000e+02 |
> > +4.100000e+01 |
> > mean | +6.472222e-02 | +1.277778e-02 | +2.422083e+01 |
> > +1.826083e+01 |
> > stddev | +1.158006e+01 | +4.581982e+00 | +1.207708e+01 |
> > +4.981435e+00 |
>
> Here the numbers in units of nanoseconds, and the ~20 nanosecond PPS
> offset is due to input/output delays on the i210's external interface
> logic.
>
> With the series applied, both the peak to peak error and the standard
> deviation improve by a factor of more than two. These two graphs
> show
> the improvement nicely.
>
> http://linuxptp.sourceforge.net/fine-tuning/fine-est.png
>
> http://linuxptp.sourceforge.net/fine-tuning/fine-tru.png
>
Wow, nice! I'll take a look at the actual patches in a few minutes, but
this is a really nice improvement!
Thanks,
Jake
>
> Thanks,
> Richard
>
> Richard Cochran (3):
> ptp: Introduce a high resolution frequency adjustment method.
> ptp: igb: Use the high resolution frequency method.
> ptp: dp83640: Use the high resolution frequency method.
>
> drivers/net/ethernet/intel/igb/igb_ptp.c | 16 ++++++++--------
> drivers/net/phy/dp83640.c | 14 +++++++-------
> drivers/ptp/ptp_clock.c | 5 ++++-
> include/linux/ptp_clock_kernel.h | 8 ++++++++
> 4 files changed, 27 insertions(+), 16 deletions(-)
>
^ permalink raw reply
* [PATCH] tools lib bpf: fix maps resolution
From: Wang Nan @ 2016-11-08 21:57 UTC (permalink / raw)
To: eric
Cc: linux-kernel, netdev, pi3orama, Wang Nan, Alexei Starovoitov,
Arnaldo Carvalho de Melo, Li Zefan
From: Eric Leblond <eric@regit.org>
It is not correct to assimilate the elf data of the maps section
to an array of map definition. In fact the sizes differ. The
offset provided in the symbol section has to be used instead.
This patch fixes a bug causing a elf with two maps not to load
correctly.
Wang Nan added:
This patch requires a name for each BPF map, so array of BPF maps is
not allowed. This restriction is reasonable, because kernel verifier
forbid indexing BPF map from such array unless the index is a fixed
value, but if the index is fixed why not merging it into name?
For example:
Program like this:
...
unsigned long cpu = get_smp_processor_id();
int *pval = map_lookup_elem(&map_array[cpu], &key);
...
Generates bytecode like this:
0: (b7) r1 = 0
1: (63) *(u32 *)(r10 -4) = r1
2: (b7) r1 = 680997
3: (63) *(u32 *)(r10 -8) = r1
4: (85) call 8
5: (67) r0 <<= 4
6: (18) r1 = 0x112dd000
8: (0f) r0 += r1
9: (bf) r2 = r10
10: (07) r2 += -4
11: (bf) r1 = r0
12: (85) call 1
Where instruction 8 is the computation, 8 and 11 render r1 to an invalid
value for function map_lookup_elem, causes verifier report error.
Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
[Merge bpf_object__init_maps_name into bpf_object__init_maps
Fix segfault for buggy BPF script
Validate obj->maps
]
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
---
tools/lib/bpf/libbpf.c | 142 ++++++++++++++++++++++++++++++++++---------------
1 file changed, 98 insertions(+), 44 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b699aea..96a2b2f 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -185,6 +185,7 @@ struct bpf_program {
struct bpf_map {
int fd;
char *name;
+ size_t offset;
struct bpf_map_def def;
void *priv;
bpf_map_clear_priv_t clear_priv;
@@ -513,57 +514,106 @@ bpf_object__init_kversion(struct bpf_object *obj,
}
static int
-bpf_object__init_maps(struct bpf_object *obj, void *data,
- size_t size)
+bpf_object__validate_maps(struct bpf_object *obj)
{
- size_t nr_maps;
int i;
- nr_maps = size / sizeof(struct bpf_map_def);
- if (!data || !nr_maps) {
- pr_debug("%s doesn't need map definition\n",
- obj->path);
+ /*
+ * If there's only 1 map, the only error case should have been
+ * catched in bpf_object__init_maps().
+ */
+ if (!obj->maps || !obj->nr_maps || (obj->nr_maps == 1))
return 0;
- }
- pr_debug("maps in %s: %zd bytes\n", obj->path, size);
+ for (i = 1; i < obj->nr_maps; i++) {
+ const struct bpf_map *a = &obj->maps[i - 1];
+ const struct bpf_map *b = &obj->maps[i];
- obj->maps = calloc(nr_maps, sizeof(obj->maps[0]));
- if (!obj->maps) {
- pr_warning("alloc maps for object failed\n");
- return -ENOMEM;
+ if (b->offset - a->offset < sizeof(struct bpf_map_def)) {
+ pr_warning("corrupted map section in %s: map \"%s\" too small\n",
+ obj->path, a->name);
+ return -EINVAL;
+ }
}
- obj->nr_maps = nr_maps;
-
- for (i = 0; i < nr_maps; i++) {
- struct bpf_map_def *def = &obj->maps[i].def;
+ return 0;
+}
- /*
- * fill all fd with -1 so won't close incorrect
- * fd (fd=0 is stdin) when failure (zclose won't close
- * negative fd)).
- */
- obj->maps[i].fd = -1;
+static int compare_bpf_map(const void *_a, const void *_b)
+{
+ const struct bpf_map *a = _a;
+ const struct bpf_map *b = _b;
- /* Save map definition into obj->maps */
- *def = ((struct bpf_map_def *)data)[i];
- }
- return 0;
+ return a->offset - b->offset;
}
static int
-bpf_object__init_maps_name(struct bpf_object *obj)
+bpf_object__init_maps(struct bpf_object *obj)
{
- int i;
+ int i, map_idx, nr_maps = 0;
+ Elf_Scn *scn;
+ Elf_Data *data;
Elf_Data *symbols = obj->efile.symbols;
- if (!symbols || obj->efile.maps_shndx < 0)
+ if (obj->efile.maps_shndx < 0)
+ return -EINVAL;
+ if (!symbols)
+ return -EINVAL;
+
+ scn = elf_getscn(obj->efile.elf, obj->efile.maps_shndx);
+ if (scn)
+ data = elf_getdata(scn, NULL);
+ if (!scn || !data) {
+ pr_warning("failed to get Elf_Data from map section %d\n",
+ obj->efile.maps_shndx);
return -EINVAL;
+ }
+ /*
+ * Count number of maps. Each map has a name.
+ * Array of maps is not supported: only the first element is
+ * considered.
+ *
+ * TODO: Detect array of map and report error.
+ */
for (i = 0; i < symbols->d_size / sizeof(GElf_Sym); i++) {
GElf_Sym sym;
- size_t map_idx;
+
+ if (!gelf_getsym(symbols, i, &sym))
+ continue;
+ if (sym.st_shndx != obj->efile.maps_shndx)
+ continue;
+ nr_maps++;
+ }
+
+ /* Alloc obj->maps and fill nr_maps. */
+ pr_debug("maps in %s: %d maps in %zd bytes\n", obj->path,
+ nr_maps, data->d_size);
+
+ if (!nr_maps)
+ return 0;
+
+ obj->maps = calloc(nr_maps, sizeof(obj->maps[0]));
+ if (!obj->maps) {
+ pr_warning("alloc maps for object failed\n");
+ return -ENOMEM;
+ }
+ obj->nr_maps = nr_maps;
+
+ /*
+ * fill all fd with -1 so won't close incorrect
+ * fd (fd=0 is stdin) when failure (zclose won't close
+ * negative fd)).
+ */
+ for (i = 0; i < nr_maps; i++)
+ obj->maps[i].fd = -1;
+
+ /*
+ * Fill obj->maps using data in "maps" section.
+ */
+ for (i = 0, map_idx = 0; i < symbols->d_size / sizeof(GElf_Sym); i++) {
+ GElf_Sym sym;
const char *map_name;
+ struct bpf_map_def *def;
if (!gelf_getsym(symbols, i, &sym))
continue;
@@ -573,21 +623,27 @@ bpf_object__init_maps_name(struct bpf_object *obj)
map_name = elf_strptr(obj->efile.elf,
obj->efile.strtabidx,
sym.st_name);
- map_idx = sym.st_value / sizeof(struct bpf_map_def);
- if (map_idx >= obj->nr_maps) {
- pr_warning("index of map \"%s\" is buggy: %zu > %zu\n",
- map_name, map_idx, obj->nr_maps);
- continue;
+ obj->maps[map_idx].offset = sym.st_value;
+ if (sym.st_value + sizeof(struct bpf_map_def) > data->d_size) {
+ pr_warning("corrupted maps section in %s: last map \"%s\" too small\n",
+ obj->path, map_name);
+ return -EINVAL;
}
+
obj->maps[map_idx].name = strdup(map_name);
if (!obj->maps[map_idx].name) {
pr_warning("failed to alloc map name\n");
return -ENOMEM;
}
- pr_debug("map %zu is \"%s\"\n", map_idx,
+ pr_debug("map %d is \"%s\"\n", map_idx,
obj->maps[map_idx].name);
+ def = (struct bpf_map_def *)(data->d_buf + sym.st_value);
+ obj->maps[map_idx].def = *def;
+ map_idx++;
}
- return 0;
+
+ qsort(obj->maps, obj->nr_maps, sizeof(obj->maps[0]), compare_bpf_map);
+ return bpf_object__validate_maps(obj);
}
static int bpf_object__elf_collect(struct bpf_object *obj)
@@ -645,11 +701,9 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
err = bpf_object__init_kversion(obj,
data->d_buf,
data->d_size);
- else if (strcmp(name, "maps") == 0) {
- err = bpf_object__init_maps(obj, data->d_buf,
- data->d_size);
+ else if (strcmp(name, "maps") == 0)
obj->efile.maps_shndx = idx;
- } else if (sh.sh_type == SHT_SYMTAB) {
+ else if (sh.sh_type == SHT_SYMTAB) {
if (obj->efile.symbols) {
pr_warning("bpf: multiple SYMTAB in %s\n",
obj->path);
@@ -698,7 +752,7 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
return LIBBPF_ERRNO__FORMAT;
}
if (obj->efile.maps_shndx >= 0)
- err = bpf_object__init_maps_name(obj);
+ err = bpf_object__init_maps(obj);
out:
return err;
}
@@ -807,7 +861,7 @@ bpf_object__create_maps(struct bpf_object *obj)
zclose(obj->maps[j].fd);
return err;
}
- pr_debug("create map: fd=%d\n", *pfd);
+ pr_debug("create map %s: fd=%d\n", obj->maps[i].name, *pfd);
}
return 0;
--
2.10.1
^ permalink raw reply related
* Re: [PATCH net-next 2/3] ptp: igb: Use the high resolution frequency method.
From: Keller, Jacob E @ 2016-11-08 22:02 UTC (permalink / raw)
To: netdev@vger.kernel.org, richardcochran@gmail.com
Cc: tglx@linutronix.de, Manfred.Rudigier@omicron.at,
ulrik.debie-os@e2big.org, stefan.sorensen@spectralink.com,
davem@davemloft.net, Kirsher, Jeffrey T, john.stultz@linaro.org,
intel-wired-lan@lists.osuosl.org
In-Reply-To: <b1d732324e0b1960b294e90ca5eb2a31b6559188.1478526333.git.richardcochran@gmail.com>
On Tue, 2016-11-08 at 22:49 +0100, Richard Cochran wrote:
> The 82580 and related devices offer a frequency resolution of about
> 0.029 ppb. This patch lets users of the device benefit from the
> increased frequency resolution when tuning the clock.
>
> Signed-off-by: Richard Cochran <richardcochran@gmail.com>
> ---
> drivers/net/ethernet/intel/igb/igb_ptp.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c
> b/drivers/net/ethernet/intel/igb/igb_ptp.c
> index a7895c4..c30eea8 100644
> --- a/drivers/net/ethernet/intel/igb/igb_ptp.c
> +++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
> @@ -226,7 +226,7 @@ static int igb_ptp_adjfreq_82576(struct
> ptp_clock_info *ptp, s32 ppb)
> return 0;
> }
>
> -static int igb_ptp_adjfreq_82580(struct ptp_clock_info *ptp, s32
> ppb)
> +static int igb_ptp_adjfine_82580(struct ptp_clock_info *ptp, long
> scaled_ppm)
> {
> struct igb_adapter *igb = container_of(ptp, struct
> igb_adapter,
> ptp_caps);
> @@ -235,13 +235,13 @@ static int igb_ptp_adjfreq_82580(struct
> ptp_clock_info *ptp, s32 ppb)
> u64 rate;
> u32 inca;
>
> - if (ppb < 0) {
> + if (scaled_ppm < 0) {
> neg_adj = 1;
> - ppb = -ppb;
> + scaled_ppm = -scaled_ppm;
> }
> - rate = ppb;
> - rate <<= 26;
> - rate = div_u64(rate, 1953125);
> + rate = scaled_ppm;
> + rate <<= 13;
> + rate = div_u64(rate, 15625);
>
I'm curious how you generate the new math here, since this can be
tricky, and I could use more examples in order to port to some of the
other drivers implementations. I'm not quit sure how to handle the
value when the lower 16 bits are fractional.
Thanks,
Jake
> inca = rate & INCVALUE_MASK;
> if (neg_adj)
> @@ -1103,7 +1103,7 @@ void igb_ptp_init(struct igb_adapter *adapter)
> adapter->ptp_caps.max_adj = 62499999;
> adapter->ptp_caps.n_ext_ts = 0;
> adapter->ptp_caps.pps = 0;
> - adapter->ptp_caps.adjfreq = igb_ptp_adjfreq_82580;
> + adapter->ptp_caps.adjfine = igb_ptp_adjfine_82580;
> adapter->ptp_caps.adjtime = igb_ptp_adjtime_82576;
> adapter->ptp_caps.gettime64 = igb_ptp_gettime_82576;
> adapter->ptp_caps.settime64 = igb_ptp_settime_82576;
> @@ -1131,7 +1131,7 @@ void igb_ptp_init(struct igb_adapter *adapter)
> adapter->ptp_caps.n_pins = IGB_N_SDP;
> adapter->ptp_caps.pps = 1;
> adapter->ptp_caps.pin_config = adapter->sdp_config;
> - adapter->ptp_caps.adjfreq = igb_ptp_adjfreq_82580;
> + adapter->ptp_caps.adjfine = igb_ptp_adjfine_82580;
> adapter->ptp_caps.adjtime = igb_ptp_adjtime_i210;
> adapter->ptp_caps.gettime64 = igb_ptp_gettime_i210;
> adapter->ptp_caps.settime64 = igb_ptp_settime_i210;
^ permalink raw reply
* Re: [PATCH net-next 2/3] ptp: igb: Use the high resolution frequency method.
From: Keller, Jacob E @ 2016-11-08 22:04 UTC (permalink / raw)
To: netdev@vger.kernel.org, richardcochran@gmail.com
Cc: tglx@linutronix.de, Manfred.Rudigier@omicron.at,
ulrik.debie-os@e2big.org, stefan.sorensen@spectralink.com,
davem@davemloft.net, Kirsher, Jeffrey T, john.stultz@linaro.org,
intel-wired-lan@lists.osuosl.org
In-Reply-To: <b1d732324e0b1960b294e90ca5eb2a31b6559188.1478526333.git.richardcochran@gmail.com>
On Tue, 2016-11-08 at 22:49 +0100, Richard Cochran wrote:
> The 82580 and related devices offer a frequency resolution of about
> 0.029 ppb. This patch lets users of the device benefit from the
> increased frequency resolution when tuning the clock.
>
> Signed-off-by: Richard Cochran <richardcochran@gmail.com>
> ---
Additionally, what about min/max frequency check? Wouldn't this need to
be updated for the new adjfine operation?
Thanks,
Jake
^ permalink raw reply
* [PATCH v5] Net Driver: Add Cypress GX3 VID=04b4 PID=3610.
From: chris.roth @ 2016-11-08 22:08 UTC (permalink / raw)
To: linux-usb, netdev, linux-kernel, artjom.simon, gregkh
Cc: Allan Chou, Chris Roth
In-Reply-To: <CAPZMiRa0kYyPFQ77L5y4NgxxH85iuMAjODV2iCzd-5OjyGfNuw@mail.gmail.com>
From: Allan Chou <allan@asix.com.tw>
Add support for Cypress GX3 SuperSpeed to Gigabit Ethernet
Bridge Controller (Vendor=04b4 ProdID=3610).
Patch verified on x64 linux kernel 4.7.4, 4.8.6, 4.9-rc4 systems
with the Kensington SD4600P USB-C Universal Dock with Power,
which uses the Cypress GX3 SuperSpeed to Gigabit Ethernet Bridge
Controller.
A similar patch was signed-off and tested-by Allan Chou
<allan@asix.com.tw> on 2015-12-01.
Allan verified his similar patch on x86 Linux kernel 4.1.6 system
with Cypress GX3 SuperSpeed to Gigabit Ethernet Bridge Controller.
Tested-by: Allan Chou <allan@asix.com.tw>
Tested-by: Chris Roth <chris.roth@usask.ca>
Tested-by: Artjom Simon <artjom.simon@gmail.com>
Signed-off-by: Allan Chou <allan@asix.com.tw>
Signed-off-by: Chris Roth <chris.roth@usask.ca>
---
Changes in v4, v5:
- Add verification of patch on 4.8.6, 4.9-rc4 (v4)
- Add tester Artjom Simon <artjom.simon@gmail.com> (v4)
- Reformat spaces to tabs (v5)
drivers/net/usb/ax88179_178a.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
index e6338c1..8a6675d 100644
--- a/drivers/net/usb/ax88179_178a.c
+++ b/drivers/net/usb/ax88179_178a.c
@@ -1656,6 +1656,19 @@ static const struct driver_info ax88178a_info = {
.tx_fixup = ax88179_tx_fixup,
};
+static const struct driver_info cypress_GX3_info = {
+ .description = "Cypress GX3 SuperSpeed to Gigabit Ethernet Controller",
+ .bind = ax88179_bind,
+ .unbind = ax88179_unbind,
+ .status = ax88179_status,
+ .link_reset = ax88179_link_reset,
+ .reset = ax88179_reset,
+ .stop = ax88179_stop,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88179_rx_fixup,
+ .tx_fixup = ax88179_tx_fixup,
+};
+
static const struct driver_info dlink_dub1312_info = {
.description = "D-Link DUB-1312 USB 3.0 to Gigabit Ethernet Adapter",
.bind = ax88179_bind,
@@ -1718,6 +1731,10 @@ static const struct usb_device_id products[] = {
USB_DEVICE(0x0b95, 0x178a),
.driver_info = (unsigned long)&ax88178a_info,
}, {
+ /* Cypress GX3 SuperSpeed to Gigabit Ethernet Bridge Controller */
+ USB_DEVICE(0x04b4, 0x3610),
+ .driver_info = (unsigned long)&cypress_GX3_info,
+}, {
/* D-Link DUB-1312 USB 3.0 to Gigabit Ethernet Adapter */
USB_DEVICE(0x2001, 0x4a00),
.driver_info = (unsigned long)&dlink_dub1312_info,
--
2.7.4
^ permalink raw reply related
* Re: [PATCH 7/8] tools lib bpf: fix maps resolution
From: Wangnan (F) @ 2016-11-08 22:08 UTC (permalink / raw)
To: Eric Leblond, netdev; +Cc: linux-kernel, ast
In-Reply-To: <20161016211834.11732-8-eric@regit.org>
Hi Eric,
During testing this patch I find a segfault, please see inline comment.
In addition, since both the BPF map array and map names should be done
after symbol table is collected, merging bpf_object__init_maps and
bpf_object__init_maps_name would be a good practice, making code
simpler.
So I prepare a new patch. Please have a look at:
http://lkml.kernel.org/g/20161108215734.28905-1-wangnan0@huawei.com
New version ensure not crashing in any case user provides a corrupted
maps section, including array of bpf maps, maps with different definition
structures and very short map definition.
Thank you.
On 2016/10/16 14:18, Eric Leblond wrote:
> It is not correct to assimilate the elf data of the maps section
> to an array of map definition. In fact the sizes differ. The
> offset provided in the symbol section has to be used instead.
>
> This patch fixes a bug causing a elf with two maps not to load
> correctly.
>
> Signed-off-by: Eric Leblond <eric@regit.org>
> ---
> tools/lib/bpf/libbpf.c | 50 +++++++++++++++++++++++++++++++++++---------------
> 1 file changed, 35 insertions(+), 15 deletions(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 1fe4532..f72628b 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -186,6 +186,7 @@ struct bpf_program {
> struct bpf_map {
> int fd;
> char *name;
> + size_t offset;
> struct bpf_map_def def;
> void *priv;
> bpf_map_clear_priv_t clear_priv;
> @@ -529,13 +530,6 @@ bpf_object__init_maps(struct bpf_object *obj, void *data,
>
> pr_debug("maps in %s: %zd bytes\n", obj->path, size);
>
> - obj->maps = calloc(nr_maps, sizeof(obj->maps[0]));
> - if (!obj->maps) {
> - pr_warning("alloc maps for object failed\n");
> - return -ENOMEM;
> - }
> - obj->nr_maps = nr_maps;
> -
> for (i = 0; i < nr_maps; i++) {
> struct bpf_map_def *def = &obj->maps[i].def;
>
> @@ -547,23 +541,42 @@ bpf_object__init_maps(struct bpf_object *obj, void *data,
> obj->maps[i].fd = -1;
>
> /* Save map definition into obj->maps */
> - *def = ((struct bpf_map_def *)data)[i];
> + *def = *(struct bpf_map_def *)(data + obj->maps[i].offset);
> }
Here, nr_maps is still size / sizeof(struct bpf_map_def), so obj->maps[i]
can be invalid.
> return 0;
> }
>
> static int
> -bpf_object__init_maps_name(struct bpf_object *obj)
> +bpf_object__init_maps_symbol(struct bpf_object *obj)
> {
> int i;
> + int nr_maps = 0;
> Elf_Data *symbols = obj->efile.symbols;
> + size_t map_idx = 0;
>
> if (!symbols || obj->efile.maps_shndx < 0)
> return -EINVAL;
>
> + /* get the number of maps */
> + for (i = 0; i < symbols->d_size / sizeof(GElf_Sym); i++) {
> + GElf_Sym sym;
> +
> + if (!gelf_getsym(symbols, i, &sym))
> + continue;
> + if (sym.st_shndx != obj->efile.maps_shndx)
> + continue;
> + nr_maps++;
> + }
> +
> + obj->maps = calloc(nr_maps, sizeof(obj->maps[0]));
> + if (!obj->maps) {
> + pr_warning("alloc maps for object failed\n");
> + return -ENOMEM;
> + }
> + obj->nr_maps = nr_maps;
> +
> for (i = 0; i < symbols->d_size / sizeof(GElf_Sym); i++) {
> GElf_Sym sym;
> - size_t map_idx;
> const char *map_name;
>
> if (!gelf_getsym(symbols, i, &sym))
> @@ -574,12 +587,12 @@ bpf_object__init_maps_name(struct bpf_object *obj)
> map_name = elf_strptr(obj->efile.elf,
> obj->efile.strtabidx,
> sym.st_name);
> - map_idx = sym.st_value / sizeof(struct bpf_map_def);
> if (map_idx >= obj->nr_maps) {
> pr_warning("index of map \"%s\" is buggy: %zu > %zu\n",
> map_name, map_idx, obj->nr_maps);
> continue;
> }
> + obj->maps[map_idx].offset = sym.st_value;
> obj->maps[map_idx].name = strdup(map_name);
> if (!obj->maps[map_idx].name) {
> pr_warning("failed to alloc map name\n");
> @@ -587,6 +600,7 @@ bpf_object__init_maps_name(struct bpf_object *obj)
> }
> pr_debug("map %zu is \"%s\"\n", map_idx,
> obj->maps[map_idx].name);
> + map_idx++;
> }
> return 0;
> }
> @@ -647,8 +661,6 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
> data->d_buf,
> data->d_size);
> else if (strcmp(name, "maps") == 0) {
> - err = bpf_object__init_maps(obj, data->d_buf,
> - data->d_size);
> obj->efile.maps_shndx = idx;
> } else if (sh.sh_type == SHT_SYMTAB) {
> if (obj->efile.symbols) {
> @@ -698,8 +710,16 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
> pr_warning("Corrupted ELF file: index of strtab invalid\n");
> return LIBBPF_ERRNO__FORMAT;
> }
> - if (obj->efile.maps_shndx >= 0)
> - err = bpf_object__init_maps_name(obj);
> + if (obj->efile.maps_shndx >= 0) {
> + Elf_Data *data;
> + err = bpf_object__init_maps_symbol(obj);
> + if (err)
> + goto out;
> +
> + scn = elf_getscn(elf, obj->efile.maps_shndx);
> + data = elf_getdata(scn, 0);
> + err = bpf_object__init_maps(obj, data->d_buf, data->d_size);
> + }
> out:
> return err;
> }
^ permalink raw reply
* [PATCH] net: ipv4: ip_send_unicast_reply should set oif only if it is L3 master
From: David Ahern @ 2016-11-08 22:50 UTC (permalink / raw)
To: netdev; +Cc: David Ahern
Lorenzo noted an Android unit test failed due to commit e0d56fdd7342:
"The expectation in the test was that the RST replying to a SYN sent to a
closed port should be generated with oif=0. In other words it should not
prefer the interface where the SYN came in on, but instead should follow
whatever the routing table says it should do."
Since this a change in behavior, revert the change to
ip_send_unicast_reply such that the oif in the flow is set to the skb_iif
only if skb_iif is an L3 master.
Fixes: e0d56fdd7342 ("net: l3mdev: remove redundant calls")
Reported-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
net/ipv4/ip_output.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 49714010ac2e..9403fa3850be 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1577,7 +1577,8 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
}
oif = arg->bound_dev_if;
- oif = oif ? : skb->skb_iif;
+ if (!oif && netif_index_is_l3_master(net, skb->skb_iif))
+ oif = skb->skb_iif;
flowi4_init_output(&fl4, oif,
IP4_REPLY_MARK(net, skb->mark),
--
2.1.4
^ permalink raw reply related
* Re: [Patch net] ipvs: use IPVS_CMD_ATTR_MAX for family.maxattr
From: Pablo Neira Ayuso @ 2016-11-08 22:53 UTC (permalink / raw)
To: Simon Horman; +Cc: Cong Wang, netdev, netfilter-devel, andreyknvl
In-Reply-To: <20161104105843.GC24863@verge.net.au>
On Fri, Nov 04, 2016 at 11:58:44AM +0100, Simon Horman wrote:
> On Thu, Nov 03, 2016 at 05:14:03PM -0700, Cong Wang wrote:
> > family.maxattr is the max index for policy[], the size of
> > ops[] is determined with ARRAY_SIZE().
> >
> > Reported-by: Andrey Konovalov <andreyknvl@google.com>
> > Tested-by: Andrey Konovalov <andreyknvl@google.com>
> > Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
>
>
> Signed-off-by: Simon Horman <horms@verge.net.au>
>
> Pablo, can you take this one into nf?
Applied, thanks!
^ permalink raw reply
* Re: [iproute PATCH 1/2] ipaddress: Simplify vf_info parsing
From: Greg @ 2016-11-08 23:05 UTC (permalink / raw)
To: Phil Sutter; +Cc: Stephen Hemminger, netdev
In-Reply-To: <20161108212912.27258-2-phil@nwl.cc>
On Tue, 2016-11-08 at 22:29 +0100, Phil Sutter wrote:
> Commit 7b8179c780a1a ("iproute2: Add new command to ip link to
> enable/disable VF spoof check") tried to add support for
> IFLA_VF_SPOOFCHK in a backwards-compatible manner, but aparently overdid
> it: parse_rtattr_nested() handles missing attributes perfectly fine in
> that it will leave the relevant field unassigned so calling code can
> just compare against NULL. There is no need to layback from the previous
> (IFLA_VF_TX_RATE) attribute to the next to check if IFLA_VF_SPOOFCHK is
> present or not. To the contrary, it establishes a potentially incorrect
> assumption of these two attributes directly following each other which
> may not be the case (although up to now, kernel aligns them this way).
>
> This patch cleans up the code to adhere to the common way of checking
> for attribute existence. It has been tested to return correct results
> regardless of whether the kernel exports IFLA_VF_SPOOFCHK or not.
>
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
> ip/ipaddress.c | 44 ++++++++++----------------------------------
> 1 file changed, 10 insertions(+), 34 deletions(-)
>
> diff --git a/ip/ipaddress.c b/ip/ipaddress.c
> index 7f05258f43453..df0f1b9c94c58 100644
> --- a/ip/ipaddress.c
> +++ b/ip/ipaddress.c
> @@ -322,10 +322,7 @@ static void print_vfinfo(FILE *fp, struct rtattr *vfinfo)
> {
> struct ifla_vf_mac *vf_mac;
> struct ifla_vf_tx_rate *vf_tx_rate;
> - struct ifla_vf_spoofchk *vf_spoofchk;
> - struct ifla_vf_link_state *vf_linkstate;
> struct rtattr *vf[IFLA_VF_MAX + 1] = {};
> - struct rtattr *tmp;
>
> SPRINT_BUF(b1);
>
> @@ -339,31 +336,6 @@ static void print_vfinfo(FILE *fp, struct rtattr *vfinfo)
> vf_mac = RTA_DATA(vf[IFLA_VF_MAC]);
> vf_tx_rate = RTA_DATA(vf[IFLA_VF_TX_RATE]);
>
> - /* Check if the spoof checking vf info type is supported by
> - * this kernel.
> - */
> - tmp = (struct rtattr *)((char *)vf[IFLA_VF_TX_RATE] +
> - vf[IFLA_VF_TX_RATE]->rta_len);
> -
> - if (tmp->rta_type != IFLA_VF_SPOOFCHK)
> - vf_spoofchk = NULL;
> - else
> - vf_spoofchk = RTA_DATA(vf[IFLA_VF_SPOOFCHK]);
> -
> - if (vf_spoofchk) {
> - /* Check if the link state vf info type is supported by
> - * this kernel.
> - */
> - tmp = (struct rtattr *)((char *)vf[IFLA_VF_SPOOFCHK] +
> - vf[IFLA_VF_SPOOFCHK]->rta_len);
> -
> - if (tmp->rta_type != IFLA_VF_LINK_STATE)
> - vf_linkstate = NULL;
> - else
> - vf_linkstate = RTA_DATA(vf[IFLA_VF_LINK_STATE]);
> - } else
> - vf_linkstate = NULL;
> -
> fprintf(fp, "%s vf %d MAC %s", _SL_, vf_mac->vf,
> ll_addr_n2a((unsigned char *)&vf_mac->mac,
> ETH_ALEN, 0, b1, sizeof(b1)));
> @@ -407,14 +379,18 @@ static void print_vfinfo(FILE *fp, struct rtattr *vfinfo)
> if (vf_rate->min_tx_rate)
> fprintf(fp, ", min_tx_rate %dMbps", vf_rate->min_tx_rate);
> }
> + if (vf[IFLA_VF_SPOOFCHK]) {
> + struct ifla_vf_spoofchk *vf_spoofchk =
> + RTA_DATA(vf[IFLA_VF_SPOOFCHK]);
>
> - if (vf_spoofchk && vf_spoofchk->setting != -1) {
> - if (vf_spoofchk->setting)
> - fprintf(fp, ", spoof checking on");
> - else
> - fprintf(fp, ", spoof checking off");
> + if (vf_spoofchk->setting != -1)
> + fprintf(fp, ", spoof checking %s",
> + vf_spoofchk->setting ? "on" : "off");
I wrote some of this code at a time when I was pretty new to Linux
kernel net programming and I really just didn't understand it. It
appears you're doing it more correctly than I.
Thanks for cleaning it up.
Reviewed-by: Greg Rose <grose@lightfleet.com>
- Greg
> }
> - if (vf_linkstate) {
> + if (vf[IFLA_VF_LINK_STATE]) {
> + struct ifla_vf_link_state *vf_linkstate =
> + RTA_DATA(vf[IFLA_VF_LINK_STATE]);
> +
> if (vf_linkstate->link_state == IFLA_VF_LINK_STATE_AUTO)
> fprintf(fp, ", link-state auto");
> else if (vf_linkstate->link_state == IFLA_VF_LINK_STATE_ENABLE)
^ permalink raw reply
* Re: [PATCH] net/netfilter: Fix use uninitialized warn in nft_range_eval()
From: Pablo Neira Ayuso @ 2016-11-08 23:13 UTC (permalink / raw)
To: Shuah Khan
Cc: kaber, kadlec, davem, netfilter-devel, coreteam, netdev,
linux-kernel
In-Reply-To: <20161107154114.26803-4-shuahkh@osg.samsung.com>
On Mon, Nov 07, 2016 at 08:41:14AM -0700, Shuah Khan wrote:
> Fix the following warn:
>
> CC [M] net/netfilter/nft_range.o
> 8601,8605c9105
> net/netfilter/nft_range.c: In function ‘nft_range_eval’:
> net/netfilter/nft_range.c:45:5: warning: ‘mismatch’ may be used uninitialized in this function [-Wmaybe-uninitialized]
> if (mismatch)
> ^
You probably using an old tree snapshot? This was already fixed by:
commit d2e4d593516e877f1f6fb40031eb495f36606e16
Author: Arnd Bergmann <arnd@arndb.de>
Date: Tue Oct 18 00:05:30 2016 +0200
netfilter: nf_tables: avoid uninitialized variable warning
The newly added nft_range_eval() function handles the two possible
nft range operations, but as the compiler warning points out,
any unexpected value would lead to the 'mismatch' variable being
used without being initialized:
^ permalink raw reply
* Re: [PATCH net v2 2/4] net: ethernet: ti: cpsw: fix device and of_node leaks
From: Grygorii Strashko @ 2016-11-08 23:19 UTC (permalink / raw)
To: Johan Hovold, Florian Fainelli, Mugunthan V N, Yisen Zhuang,
Salil Mehta, David S. Miller
Cc: netdev, linux-kernel, linux-omap
In-Reply-To: <1478194822-29545-3-git-send-email-johan@kernel.org>
On 11/03/2016 12:40 PM, Johan Hovold wrote:
> Make sure to drop the references taken by of_get_child_by_name() and
> bus_find_device() before returning from cpsw_phy_sel().
>
> Note that holding a reference to the cpsw-phy-sel device does not
> prevent the devres-managed private data from going away.
>
> Fixes: 5892cd135e16 ("drivers: net: cpsw-phy-sel: Add new driver...")
> Cc: Mugunthan V N <mugunthanvnm@ti.com>
> Cc: Grygorii Strashko <grygorii.strashko@ti.com>
> Cc: linux-omap@vger.kernel.org
> Signed-off-by: Johan Hovold <johan@kernel.org>
> ---
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
> drivers/net/ethernet/ti/cpsw-phy-sel.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/ethernet/ti/cpsw-phy-sel.c b/drivers/net/ethernet/ti/cpsw-phy-sel.c
> index 054a8dd23dae..ba1e45ff6aae 100644
> --- a/drivers/net/ethernet/ti/cpsw-phy-sel.c
> +++ b/drivers/net/ethernet/ti/cpsw-phy-sel.c
> @@ -176,9 +176,12 @@ void cpsw_phy_sel(struct device *dev, phy_interface_t phy_mode, int slave)
> }
>
> dev = bus_find_device(&platform_bus_type, NULL, node, match);
> + of_node_put(node);
> priv = dev_get_drvdata(dev);
>
> priv->cpsw_phy_sel(priv, phy_mode, slave);
> +
> + put_device(dev);
> }
> EXPORT_SYMBOL_GPL(cpsw_phy_sel);
>
>
--
regards,
-grygorii
^ permalink raw reply
* Re: [PATCH] net: ethernet: ti: davinci_cpdma: free memory while channel destroy
From: Grygorii Strashko @ 2016-11-08 23:22 UTC (permalink / raw)
To: Ivan Khoronzhuk, mugunthanvnm, netdev; +Cc: linux-omap, linux-kernel
In-Reply-To: <1478610965-25288-1-git-send-email-ivan.khoronzhuk@linaro.org>
On 11/08/2016 07:16 AM, Ivan Khoronzhuk wrote:
> While create/destroy channel operation memory is not freed. It was
> supposed that memory is freed while driver remove. But a channel
> can be created and destroyed many times while changing number of
> channels with ethtool.
>
> Based on net-next/master
^?
>
> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
> ---
> drivers/net/ethernet/ti/davinci_cpdma.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
> index 05afc05..07fc92d 100644
> --- a/drivers/net/ethernet/ti/davinci_cpdma.c
> +++ b/drivers/net/ethernet/ti/davinci_cpdma.c
> @@ -586,7 +586,7 @@ int cpdma_chan_destroy(struct cpdma_chan *chan)
> cpdma_chan_stop(chan);
> ctlr->channels[chan->chan_num] = NULL;
> ctlr->chan_num--;
> -
> + devm_kfree(ctlr->dev, chan);
> cpdma_chan_split_pool(ctlr);
>
> spin_unlock_irqrestore(&ctlr->lock, flags);
>
--
regards,
-grygorii
^ permalink raw reply
* Re: [PATCH v2 3/6] qedi: Add QLogic FastLinQ offload iSCSI driver framework.
From: Arun Easi @ 2016-11-08 20:04 UTC (permalink / raw)
To: kbuild test robot
Cc: Manish Rangankar, kbuild-all, Martin K. Petersen, James Bottomley,
lduncan, cleech, linux-scsi, netdev, QLogic-Storage-Upstream,
Yuval Mintz
In-Reply-To: <201611081805.KzpHTNxg%fengguang.wu@intel.com>
[ Sending on behalf of Manish to cover for the time difference. ]
Hi Martin, James,
I would like to request your input on this kbuild test error on the
series, where they compile fine together, but is not bisectable.
qedi is the new iSCSI driver, which we are trying to submit, for our 41000
series CNA. This patch series were broken up into logical blocks for
review purpose, but were not made to compile individually. It is our
impression that this is acceptable for SCSI and all the initial "qedi"
patches will be squashed and committed as a single commit. Please let us
know if we are mistaken, and if so, we will post another series
with this taken care of.
FYI, this series accompany additions to the common core module, "qed",
that goes under drivers/net/. The patches for the qed module compiles fine
individually and so is bisectable.
In regards to the additional warnings brought out by kbuild test on
"PATCH v2 6/6" and "PATCH v2 3/6", we will post a v3 with the fixes.
Regards,
-Arun
On Tue, 8 Nov 2016, 2:52am -0000, kbuild test robot wrote:
> Hi Manish,
>
> [auto build test ERROR on net-next/master]
> [also build test ERROR on v4.9-rc4]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
>
> url: https://github.com/0day-ci/linux/commits/Manish-Rangankar/qed-Add-support-for-hardware-offloaded-iSCSI/20161108-180027
> config: ia64-allmodconfig (attached as .config)
> compiler: ia64-linux-gcc (GCC) 6.2.0
> reproduce:
> wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> make.cross ARCH=ia64
>
> Note: the linux-review/Manish-Rangankar/qed-Add-support-for-hardware-offloaded-iSCSI/20161108-180027 HEAD dd4d1d0e0785d20cdcfdf9b2c792c564a79b2de2 builds fine.
> It only hurts bisectibility.
>
^ permalink raw reply
* Re: [PATCH v2 3/6] qedi: Add QLogic FastLinQ offload iSCSI driver framework.
From: Martin K. Petersen @ 2016-11-08 23:49 UTC (permalink / raw)
To: Arun Easi
Cc: kbuild test robot, Manish Rangankar, kbuild-all,
Martin K. Petersen, James Bottomley, lduncan, cleech, linux-scsi,
netdev, QLogic-Storage-Upstream, Yuval Mintz
In-Reply-To: <alpine.LRH.2.00.1611081129440.28058@mvluser05.qlc.com>
>>>>> "Arun" == Arun Easi <arun.easi@cavium.com> writes:
Arun,
Arun> qedi is the new iSCSI driver, which we are trying to submit, for
Arun> our 41000 series CNA. This patch series were broken up into
Arun> logical blocks for review purpose, but were not made to compile
Arun> individually. It is our impression that this is acceptable for
Arun> SCSI and all the initial "qedi" patches will be squashed and
Arun> committed as a single commit. Please let us know if we are
Arun> mistaken, and if so, we will post another series with this taken
Arun> care of.
It's fine to post the patches split up to ease the review process. But
whatever we commit must obviously be bisectable.
--
Martin K. Petersen Oracle Linux Engineering
^ permalink raw reply
* Re: [RESEND][PATCH v4] cgroup: Use CAP_SYS_RESOURCE to allow a process to migrate other tasks between cgroups
From: Andy Lutomirski @ 2016-11-08 23:51 UTC (permalink / raw)
To: John Stultz, Alexei Starovoitov, Mickaël Salaün,
Daniel Mack, David S. Miller, kafai, fw, Harald Hoyer,
Network Development, Sargun Dhillon, Pablo Neira Ayuso
Cc: lkml, Tejun Heo, Li Zefan, Jonathan Corbet,
open list:CONTROL GROUP (CGROUP), Android Kernel Team,
Rom Lemarchand, Colin Cross, Dmitry Shmidt, Todd Kjos,
Christian Poetzsch, Amit Pundir, Dmitry Torokhov, Kees Cook,
Serge E . Hallyn, Linux API
In-Reply-To: <1478647728-30357-1-git-send-email-john.stultz@linaro.org>
On Tue, Nov 8, 2016 at 3:28 PM, John Stultz <john.stultz@linaro.org> wrote:
> This patch adds logic to allows a process to migrate other tasks
> between cgroups if they have CAP_SYS_RESOURCE.
>
> In Android (where this feature originated), the ActivityManager tracks
> various application states (TOP_APP, FOREGROUND, BACKGROUND, SYSTEM,
> etc), and then as applications change states, the SchedPolicy logic
> will migrate the application tasks between different cgroups used
> to control the different application states (for example, there is a
> background cpuset cgroup which can limit background tasks to stay
> on one low-power cpu, and the bg_non_interactive cpuctrl cgroup can
> then further limit those background tasks to a small percentage of
> that one cpu's cpu time).
>
> However, for security reasons, Android doesn't want to make the
> system_server (the process that runs the ActivityManager and
> SchedPolicy logic), run as root. So in the Android common.git
> kernel, they have some logic to allow cgroups to loosen their
> permissions so CAP_SYS_NICE tasks can migrate other tasks between
> cgroups.
>
> I feel the approach taken there overloads CAP_SYS_NICE a bit much
> for non-android environments.
>
> So this patch, as suggested by Michael Kerrisk, simply adds a
> check for CAP_SYS_RESOURCE.
>
> I've tested this with AOSP master, and this seems to work well
> as Zygote and system_server already use CAP_SYS_RESOURCE. I've
> also submitted patches against the android-4.4 kernel to change
> it to use CAP_SYS_RESOURCE, and the Android developers just merged
> it.
>
I hate to say it, but I think I may see a problem. Current
developments are afoot to make cgroups do more than resource control.
For example, there's Landlock and there's Daniel's ingress/egress
filter thing. Current cgroup controllers can mostly just DoS their
controlled processes. These new controllers (or controller-like
things) can exfiltrate data and change semantics.
Does anyone have a security model in mind for these controllers and
the cgroups that they're attached to? I'm reasonably confident that
CAP_SYS_RESOURCE is not the answer...
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox