* [PATCH net-next v6 2/6] Documentation/bindings: net: ocelot: document the PTP ready IRQ
From: Antoine Tenart @ 2019-08-12 14:45 UTC (permalink / raw)
To: davem, richardcochran, alexandre.belloni, UNGLinuxDriver
Cc: Antoine Tenart, netdev, thomas.petazzoni, allan.nielsen, andrew
In-Reply-To: <20190812144537.14497-1-antoine.tenart@bootlin.com>
One additional interrupt needs to be described within the Ocelot device
tree node: the PTP ready one. This patch documents the binding needed to
do so.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---
Documentation/devicetree/bindings/net/mscc-ocelot.txt | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/Documentation/devicetree/bindings/net/mscc-ocelot.txt b/Documentation/devicetree/bindings/net/mscc-ocelot.txt
index 4d05a3b0f786..3b6290b45ce5 100644
--- a/Documentation/devicetree/bindings/net/mscc-ocelot.txt
+++ b/Documentation/devicetree/bindings/net/mscc-ocelot.txt
@@ -17,9 +17,10 @@ Required properties:
- "ana"
- "portX" with X from 0 to the number of last port index available on that
switch
-- interrupts: Should contain the switch interrupts for frame extraction and
- frame injection
-- interrupt-names: should contain the interrupt names: "xtr", "inj"
+- interrupts: Should contain the switch interrupts for frame extraction,
+ frame injection and PTP ready.
+- interrupt-names: should contain the interrupt names: "xtr", "inj". Can contain
+ "ptp_rdy" which is optional due to backward compatibility.
- ethernet-ports: A container for child nodes representing switch ports.
The ethernet-ports container has the following properties
@@ -63,8 +64,8 @@ Example:
"port2", "port3", "port4", "port5", "port6",
"port7", "port8", "port9", "port10", "qsys",
"ana";
- interrupts = <21 22>;
- interrupt-names = "xtr", "inj";
+ interrupts = <18 21 22>;
+ interrupt-names = "ptp_rdy", "xtr", "inj";
ethernet-ports {
#address-cells = <1>;
--
2.21.0
^ permalink raw reply related
* [PATCH net-next v6 6/6] net: mscc: PTP Hardware Clock (PHC) support
From: Antoine Tenart @ 2019-08-12 14:45 UTC (permalink / raw)
To: davem, richardcochran, alexandre.belloni, UNGLinuxDriver
Cc: Antoine Tenart, netdev, thomas.petazzoni, allan.nielsen, andrew
In-Reply-To: <20190812144537.14497-1-antoine.tenart@bootlin.com>
This patch adds support for PTP Hardware Clock (PHC) to the Ocelot
switch for both PTP 1-step and 2-step modes.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
---
drivers/net/ethernet/mscc/ocelot.c | 401 ++++++++++++++++++++++-
drivers/net/ethernet/mscc/ocelot.h | 39 +++
drivers/net/ethernet/mscc/ocelot_board.c | 112 ++++++-
3 files changed, 539 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index 6932e615d4b0..4d1bce4389c7 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -14,6 +14,7 @@
#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/phy.h>
+#include <linux/ptp_clock_kernel.h>
#include <linux/skbuff.h>
#include <linux/iopoll.h>
#include <net/arp.h>
@@ -538,7 +539,7 @@ static int ocelot_port_stop(struct net_device *dev)
*/
static int ocelot_gen_ifh(u32 *ifh, struct frame_info *info)
{
- ifh[0] = IFH_INJ_BYPASS;
+ ifh[0] = IFH_INJ_BYPASS | ((0x1ff & info->rew_op) << 21);
ifh[1] = (0xf00 & info->port) >> 8;
ifh[2] = (0xff & info->port) << 24;
ifh[3] = (info->tag_type << 16) | info->vid;
@@ -548,6 +549,7 @@ static int ocelot_gen_ifh(u32 *ifh, struct frame_info *info)
static int ocelot_port_xmit(struct sk_buff *skb, struct net_device *dev)
{
+ struct skb_shared_info *shinfo = skb_shinfo(skb);
struct ocelot_port *port = netdev_priv(dev);
struct ocelot *ocelot = port->ocelot;
u32 val, ifh[IFH_LEN];
@@ -566,6 +568,14 @@ static int ocelot_port_xmit(struct sk_buff *skb, struct net_device *dev)
info.port = BIT(port->chip_port);
info.tag_type = IFH_TAG_TYPE_C;
info.vid = skb_vlan_tag_get(skb);
+
+ /* Check if timestamping is needed */
+ if (ocelot->ptp && shinfo->tx_flags & SKBTX_HW_TSTAMP) {
+ info.rew_op = port->ptp_cmd;
+ if (port->ptp_cmd == IFH_REW_OP_TWO_STEP_PTP)
+ info.rew_op |= (port->ts_id % 4) << 3;
+ }
+
ocelot_gen_ifh(ifh, &info);
for (i = 0; i < IFH_LEN; i++)
@@ -596,11 +606,58 @@ static int ocelot_port_xmit(struct sk_buff *skb, struct net_device *dev)
dev->stats.tx_packets++;
dev->stats.tx_bytes += skb->len;
- dev_kfree_skb_any(skb);
+ if (ocelot->ptp && shinfo->tx_flags & SKBTX_HW_TSTAMP &&
+ port->ptp_cmd == IFH_REW_OP_TWO_STEP_PTP) {
+ struct ocelot_skb *oskb =
+ kzalloc(sizeof(struct ocelot_skb), GFP_ATOMIC);
+
+ if (unlikely(!oskb))
+ goto out;
+
+ skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
+
+ oskb->skb = skb;
+ oskb->id = port->ts_id % 4;
+ port->ts_id++;
+
+ list_add_tail(&oskb->head, &port->skbs);
+
+ return NETDEV_TX_OK;
+ }
+
+out:
+ dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
}
+void ocelot_get_hwtimestamp(struct ocelot *ocelot, struct timespec64 *ts)
+{
+ unsigned long flags;
+ u32 val;
+
+ spin_lock_irqsave(&ocelot->ptp_clock_lock, flags);
+
+ /* Read current PTP time to get seconds */
+ val = ocelot_read_rix(ocelot, PTP_PIN_CFG, TOD_ACC_PIN);
+
+ val &= ~(PTP_PIN_CFG_SYNC | PTP_PIN_CFG_ACTION_MASK | PTP_PIN_CFG_DOM);
+ val |= PTP_PIN_CFG_ACTION(PTP_PIN_ACTION_SAVE);
+ ocelot_write_rix(ocelot, val, PTP_PIN_CFG, TOD_ACC_PIN);
+ ts->tv_sec = ocelot_read_rix(ocelot, PTP_PIN_TOD_SEC_LSB, TOD_ACC_PIN);
+
+ /* Read packet HW timestamp from FIFO */
+ val = ocelot_read(ocelot, SYS_PTP_TXSTAMP);
+ ts->tv_nsec = SYS_PTP_TXSTAMP_PTP_TXSTAMP(val);
+
+ /* Sec has incremented since the ts was registered */
+ if ((ts->tv_sec & 0x1) != !!(val & SYS_PTP_TXSTAMP_PTP_TXSTAMP_SEC))
+ ts->tv_sec--;
+
+ spin_unlock_irqrestore(&ocelot->ptp_clock_lock, flags);
+}
+EXPORT_SYMBOL(ocelot_get_hwtimestamp);
+
static int ocelot_mc_unsync(struct net_device *dev, const unsigned char *addr)
{
struct ocelot_port *port = netdev_priv(dev);
@@ -917,6 +974,97 @@ static int ocelot_get_port_parent_id(struct net_device *dev,
return 0;
}
+static int ocelot_hwstamp_get(struct ocelot_port *port, struct ifreq *ifr)
+{
+ struct ocelot *ocelot = port->ocelot;
+
+ return copy_to_user(ifr->ifr_data, &ocelot->hwtstamp_config,
+ sizeof(ocelot->hwtstamp_config)) ? -EFAULT : 0;
+}
+
+static int ocelot_hwstamp_set(struct ocelot_port *port, struct ifreq *ifr)
+{
+ struct ocelot *ocelot = port->ocelot;
+ struct hwtstamp_config cfg;
+
+ if (copy_from_user(&cfg, ifr->ifr_data, sizeof(cfg)))
+ return -EFAULT;
+
+ /* reserved for future extensions */
+ if (cfg.flags)
+ return -EINVAL;
+
+ /* Tx type sanity check */
+ switch (cfg.tx_type) {
+ case HWTSTAMP_TX_ON:
+ port->ptp_cmd = IFH_REW_OP_TWO_STEP_PTP;
+ break;
+ case HWTSTAMP_TX_ONESTEP_SYNC:
+ /* IFH_REW_OP_ONE_STEP_PTP updates the correctional field, we
+ * need to update the origin time.
+ */
+ port->ptp_cmd = IFH_REW_OP_ORIGIN_PTP;
+ break;
+ case HWTSTAMP_TX_OFF:
+ port->ptp_cmd = 0;
+ break;
+ default:
+ return -ERANGE;
+ }
+
+ mutex_lock(&ocelot->ptp_lock);
+
+ switch (cfg.rx_filter) {
+ case HWTSTAMP_FILTER_NONE:
+ break;
+ case HWTSTAMP_FILTER_ALL:
+ case HWTSTAMP_FILTER_SOME:
+ case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+ case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
+ case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
+ case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
+ case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
+ case HWTSTAMP_FILTER_PTP_V2_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
+ break;
+ default:
+ mutex_unlock(&ocelot->ptp_lock);
+ return -ERANGE;
+ }
+
+ /* Commit back the result & save it */
+ memcpy(&ocelot->hwtstamp_config, &cfg, sizeof(cfg));
+ mutex_unlock(&ocelot->ptp_lock);
+
+ return copy_to_user(ifr->ifr_data, &cfg, sizeof(cfg)) ? -EFAULT : 0;
+}
+
+static int ocelot_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
+{
+ struct ocelot_port *port = netdev_priv(dev);
+ struct ocelot *ocelot = port->ocelot;
+
+ /* The function is only used for PTP operations for now */
+ if (!ocelot->ptp)
+ return -EOPNOTSUPP;
+
+ switch (cmd) {
+ case SIOCSHWTSTAMP:
+ return ocelot_hwstamp_set(port, ifr);
+ case SIOCGHWTSTAMP:
+ return ocelot_hwstamp_get(port, ifr);
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
static const struct net_device_ops ocelot_port_netdev_ops = {
.ndo_open = ocelot_port_open,
.ndo_stop = ocelot_port_stop,
@@ -933,6 +1081,7 @@ static const struct net_device_ops ocelot_port_netdev_ops = {
.ndo_set_features = ocelot_set_features,
.ndo_get_port_parent_id = ocelot_get_port_parent_id,
.ndo_setup_tc = ocelot_setup_tc,
+ .ndo_do_ioctl = ocelot_ioctl,
};
static void ocelot_get_strings(struct net_device *netdev, u32 sset, u8 *data)
@@ -1014,12 +1163,37 @@ static int ocelot_get_sset_count(struct net_device *dev, int sset)
return ocelot->num_stats;
}
+static int ocelot_get_ts_info(struct net_device *dev,
+ struct ethtool_ts_info *info)
+{
+ struct ocelot_port *ocelot_port = netdev_priv(dev);
+ struct ocelot *ocelot = ocelot_port->ocelot;
+
+ if (!ocelot->ptp)
+ return ethtool_op_get_ts_info(dev, info);
+
+ info->phc_index = ocelot->ptp_clock ?
+ ptp_clock_index(ocelot->ptp_clock) : -1;
+ info->so_timestamping |= SOF_TIMESTAMPING_TX_SOFTWARE |
+ SOF_TIMESTAMPING_RX_SOFTWARE |
+ SOF_TIMESTAMPING_SOFTWARE |
+ SOF_TIMESTAMPING_TX_HARDWARE |
+ SOF_TIMESTAMPING_RX_HARDWARE |
+ SOF_TIMESTAMPING_RAW_HARDWARE;
+ info->tx_types = BIT(HWTSTAMP_TX_OFF) | BIT(HWTSTAMP_TX_ON) |
+ BIT(HWTSTAMP_TX_ONESTEP_SYNC);
+ info->rx_filters = BIT(HWTSTAMP_FILTER_NONE) | BIT(HWTSTAMP_FILTER_ALL);
+
+ return 0;
+}
+
static const struct ethtool_ops ocelot_ethtool_ops = {
.get_strings = ocelot_get_strings,
.get_ethtool_stats = ocelot_get_ethtool_stats,
.get_sset_count = ocelot_get_sset_count,
.get_link_ksettings = phy_ethtool_get_link_ksettings,
.set_link_ksettings = phy_ethtool_set_link_ksettings,
+ .get_ts_info = ocelot_get_ts_info,
};
static int ocelot_port_attr_stp_state_set(struct ocelot_port *ocelot_port,
@@ -1629,6 +1803,196 @@ struct notifier_block ocelot_switchdev_blocking_nb __read_mostly = {
};
EXPORT_SYMBOL(ocelot_switchdev_blocking_nb);
+int ocelot_ptp_gettime64(struct ptp_clock_info *ptp, struct timespec64 *ts)
+{
+ struct ocelot *ocelot = container_of(ptp, struct ocelot, ptp_info);
+ unsigned long flags;
+ time64_t s;
+ u32 val;
+ s64 ns;
+
+ spin_lock_irqsave(&ocelot->ptp_clock_lock, flags);
+
+ val = ocelot_read_rix(ocelot, PTP_PIN_CFG, TOD_ACC_PIN);
+ val &= ~(PTP_PIN_CFG_SYNC | PTP_PIN_CFG_ACTION_MASK | PTP_PIN_CFG_DOM);
+ val |= PTP_PIN_CFG_ACTION(PTP_PIN_ACTION_SAVE);
+ ocelot_write_rix(ocelot, val, PTP_PIN_CFG, TOD_ACC_PIN);
+
+ s = ocelot_read_rix(ocelot, PTP_PIN_TOD_SEC_MSB, TOD_ACC_PIN) & 0xffff;
+ s <<= 32;
+ s += ocelot_read_rix(ocelot, PTP_PIN_TOD_SEC_LSB, TOD_ACC_PIN);
+ ns = ocelot_read_rix(ocelot, PTP_PIN_TOD_NSEC, TOD_ACC_PIN);
+
+ spin_unlock_irqrestore(&ocelot->ptp_clock_lock, flags);
+
+ /* Deal with negative values */
+ if (ns >= 0x3ffffff0 && ns <= 0x3fffffff) {
+ s--;
+ ns &= 0xf;
+ ns += 999999984;
+ }
+
+ set_normalized_timespec64(ts, s, ns);
+ return 0;
+}
+EXPORT_SYMBOL(ocelot_ptp_gettime64);
+
+static int ocelot_ptp_settime64(struct ptp_clock_info *ptp,
+ const struct timespec64 *ts)
+{
+ struct ocelot *ocelot = container_of(ptp, struct ocelot, ptp_info);
+ unsigned long flags;
+ u32 val;
+
+ spin_lock_irqsave(&ocelot->ptp_clock_lock, flags);
+
+ val = ocelot_read_rix(ocelot, PTP_PIN_CFG, TOD_ACC_PIN);
+ val &= ~(PTP_PIN_CFG_SYNC | PTP_PIN_CFG_ACTION_MASK | PTP_PIN_CFG_DOM);
+ val |= PTP_PIN_CFG_ACTION(PTP_PIN_ACTION_IDLE);
+
+ ocelot_write_rix(ocelot, val, PTP_PIN_CFG, TOD_ACC_PIN);
+
+ ocelot_write_rix(ocelot, lower_32_bits(ts->tv_sec), PTP_PIN_TOD_SEC_LSB,
+ TOD_ACC_PIN);
+ ocelot_write_rix(ocelot, upper_32_bits(ts->tv_sec), PTP_PIN_TOD_SEC_MSB,
+ TOD_ACC_PIN);
+ ocelot_write_rix(ocelot, ts->tv_nsec, PTP_PIN_TOD_NSEC, TOD_ACC_PIN);
+
+ val = ocelot_read_rix(ocelot, PTP_PIN_CFG, TOD_ACC_PIN);
+ val &= ~(PTP_PIN_CFG_SYNC | PTP_PIN_CFG_ACTION_MASK | PTP_PIN_CFG_DOM);
+ val |= PTP_PIN_CFG_ACTION(PTP_PIN_ACTION_LOAD);
+
+ ocelot_write_rix(ocelot, val, PTP_PIN_CFG, TOD_ACC_PIN);
+
+ spin_unlock_irqrestore(&ocelot->ptp_clock_lock, flags);
+ return 0;
+}
+
+static int ocelot_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
+{
+ if (delta > -(NSEC_PER_SEC / 2) && delta < (NSEC_PER_SEC / 2)) {
+ struct ocelot *ocelot = container_of(ptp, struct ocelot, ptp_info);
+ unsigned long flags;
+ u32 val;
+
+ spin_lock_irqsave(&ocelot->ptp_clock_lock, flags);
+
+ val = ocelot_read_rix(ocelot, PTP_PIN_CFG, TOD_ACC_PIN);
+ val &= ~(PTP_PIN_CFG_SYNC | PTP_PIN_CFG_ACTION_MASK | PTP_PIN_CFG_DOM);
+ val |= PTP_PIN_CFG_ACTION(PTP_PIN_ACTION_IDLE);
+
+ ocelot_write_rix(ocelot, val, PTP_PIN_CFG, TOD_ACC_PIN);
+
+ ocelot_write_rix(ocelot, 0, PTP_PIN_TOD_SEC_LSB, TOD_ACC_PIN);
+ ocelot_write_rix(ocelot, 0, PTP_PIN_TOD_SEC_MSB, TOD_ACC_PIN);
+ ocelot_write_rix(ocelot, delta, PTP_PIN_TOD_NSEC, TOD_ACC_PIN);
+
+ val = ocelot_read_rix(ocelot, PTP_PIN_CFG, TOD_ACC_PIN);
+ val &= ~(PTP_PIN_CFG_SYNC | PTP_PIN_CFG_ACTION_MASK | PTP_PIN_CFG_DOM);
+ val |= PTP_PIN_CFG_ACTION(PTP_PIN_ACTION_DELTA);
+
+ ocelot_write_rix(ocelot, val, PTP_PIN_CFG, TOD_ACC_PIN);
+
+ spin_unlock_irqrestore(&ocelot->ptp_clock_lock, flags);
+ } else {
+ /* Fall back using ocelot_ptp_settime64 which is not exact. */
+ struct timespec64 ts;
+ u64 now;
+
+ ocelot_ptp_gettime64(ptp, &ts);
+
+ now = ktime_to_ns(timespec64_to_ktime(ts));
+ ts = ns_to_timespec64(now + delta);
+
+ ocelot_ptp_settime64(ptp, &ts);
+ }
+ return 0;
+}
+
+static int ocelot_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
+{
+ struct ocelot *ocelot = container_of(ptp, struct ocelot, ptp_info);
+ u32 unit = 0, direction = 0;
+ unsigned long flags;
+ u64 adj = 0;
+
+ spin_lock_irqsave(&ocelot->ptp_clock_lock, flags);
+
+ if (!scaled_ppm)
+ goto disable_adj;
+
+ if (scaled_ppm < 0) {
+ direction = PTP_CFG_CLK_ADJ_CFG_DIR;
+ scaled_ppm = -scaled_ppm;
+ }
+
+ adj = PSEC_PER_SEC << 16;
+ do_div(adj, scaled_ppm);
+ do_div(adj, 1000);
+
+ /* If the adjustment value is too large, use ns instead */
+ if (adj >= (1L << 30)) {
+ unit = PTP_CFG_CLK_ADJ_FREQ_NS;
+ do_div(adj, 1000);
+ }
+
+ /* Still too big */
+ if (adj >= (1L << 30))
+ goto disable_adj;
+
+ ocelot_write(ocelot, unit | adj, PTP_CLK_CFG_ADJ_FREQ);
+ ocelot_write(ocelot, PTP_CFG_CLK_ADJ_CFG_ENA | direction,
+ PTP_CLK_CFG_ADJ_CFG);
+
+ spin_unlock_irqrestore(&ocelot->ptp_clock_lock, flags);
+ return 0;
+
+disable_adj:
+ ocelot_write(ocelot, 0, PTP_CLK_CFG_ADJ_CFG);
+
+ spin_unlock_irqrestore(&ocelot->ptp_clock_lock, flags);
+ return 0;
+}
+
+static struct ptp_clock_info ocelot_ptp_clock_info = {
+ .owner = THIS_MODULE,
+ .name = "ocelot ptp",
+ .max_adj = 0x7fffffff,
+ .n_alarm = 0,
+ .n_ext_ts = 0,
+ .n_per_out = 0,
+ .n_pins = 0,
+ .pps = 0,
+ .gettime64 = ocelot_ptp_gettime64,
+ .settime64 = ocelot_ptp_settime64,
+ .adjtime = ocelot_ptp_adjtime,
+ .adjfine = ocelot_ptp_adjfine,
+};
+
+static int ocelot_init_timestamp(struct ocelot *ocelot)
+{
+ ocelot->ptp_info = ocelot_ptp_clock_info;
+ ocelot->ptp_clock = ptp_clock_register(&ocelot->ptp_info, ocelot->dev);
+ if (IS_ERR(ocelot->ptp_clock))
+ return PTR_ERR(ocelot->ptp_clock);
+ /* Check if PHC support is missing at the configuration level */
+ if (!ocelot->ptp_clock)
+ return 0;
+
+ ocelot_write(ocelot, SYS_PTP_CFG_PTP_STAMP_WID(30), SYS_PTP_CFG);
+ ocelot_write(ocelot, 0xffffffff, ANA_TABLES_PTP_ID_LOW);
+ ocelot_write(ocelot, 0xffffffff, ANA_TABLES_PTP_ID_HIGH);
+
+ ocelot_write(ocelot, PTP_CFG_MISC_PTP_EN, PTP_CFG_MISC);
+
+ /* There is no device reconfiguration, PTP Rx stamping is always
+ * enabled.
+ */
+ ocelot->hwtstamp_config.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
+
+ return 0;
+}
+
int ocelot_probe_port(struct ocelot *ocelot, u8 port,
void __iomem *regs,
struct phy_device *phy)
@@ -1661,6 +2025,8 @@ int ocelot_probe_port(struct ocelot *ocelot, u8 port,
ocelot_mact_learn(ocelot, PGID_CPU, dev->dev_addr, ocelot_port->pvid,
ENTRYTYPE_LOCKED);
+ INIT_LIST_HEAD(&ocelot_port->skbs);
+
err = register_netdev(dev);
if (err) {
dev_err(ocelot->dev, "register_netdev failed\n");
@@ -1684,7 +2050,7 @@ EXPORT_SYMBOL(ocelot_probe_port);
int ocelot_init(struct ocelot *ocelot)
{
u32 port;
- int i, cpu = ocelot->num_phys_ports;
+ int i, ret, cpu = ocelot->num_phys_ports;
char queue_name[32];
ocelot->lags = devm_kcalloc(ocelot->dev, ocelot->num_phys_ports,
@@ -1699,6 +2065,8 @@ int ocelot_init(struct ocelot *ocelot)
return -ENOMEM;
mutex_init(&ocelot->stats_lock);
+ mutex_init(&ocelot->ptp_lock);
+ spin_lock_init(&ocelot->ptp_clock_lock);
snprintf(queue_name, sizeof(queue_name), "%s-stats",
dev_name(ocelot->dev));
ocelot->stats_queue = create_singlethread_workqueue(queue_name);
@@ -1812,16 +2180,43 @@ int ocelot_init(struct ocelot *ocelot)
INIT_DELAYED_WORK(&ocelot->stats_work, ocelot_check_stats_work);
queue_delayed_work(ocelot->stats_queue, &ocelot->stats_work,
OCELOT_STATS_CHECK_DELAY);
+
+ if (ocelot->ptp) {
+ ret = ocelot_init_timestamp(ocelot);
+ if (ret) {
+ dev_err(ocelot->dev,
+ "Timestamp initialization failed\n");
+ return ret;
+ }
+ }
+
return 0;
}
EXPORT_SYMBOL(ocelot_init);
void ocelot_deinit(struct ocelot *ocelot)
{
+ struct list_head *pos, *tmp;
+ struct ocelot_port *port;
+ struct ocelot_skb *entry;
+ int i;
+
cancel_delayed_work(&ocelot->stats_work);
destroy_workqueue(ocelot->stats_queue);
mutex_destroy(&ocelot->stats_lock);
ocelot_ace_deinit();
+
+ for (i = 0; i < ocelot->num_phys_ports; i++) {
+ port = ocelot->ports[i];
+
+ list_for_each_safe(pos, tmp, &port->skbs) {
+ entry = list_entry(pos, struct ocelot_skb, head);
+
+ list_del(pos);
+ dev_kfree_skb_any(entry->skb);
+ kfree(entry);
+ }
+ }
}
EXPORT_SYMBOL(ocelot_deinit);
diff --git a/drivers/net/ethernet/mscc/ocelot.h b/drivers/net/ethernet/mscc/ocelot.h
index 515dee6fa8a6..e40773c01a44 100644
--- a/drivers/net/ethernet/mscc/ocelot.h
+++ b/drivers/net/ethernet/mscc/ocelot.h
@@ -11,9 +11,11 @@
#include <linux/bitops.h>
#include <linux/etherdevice.h>
#include <linux/if_vlan.h>
+#include <linux/net_tstamp.h>
#include <linux/phy.h>
#include <linux/phy/phy.h>
#include <linux/platform_device.h>
+#include <linux/ptp_clock_kernel.h>
#include <linux/regmap.h>
#include "ocelot_ana.h"
@@ -39,6 +41,8 @@
#define OCELOT_STATS_CHECK_DELAY (2 * HZ)
+#define OCELOT_PTP_QUEUE_SZ 128
+
#define IFH_LEN 4
struct frame_info {
@@ -46,6 +50,8 @@ struct frame_info {
u16 port;
u16 vid;
u8 tag_type;
+ u16 rew_op;
+ u32 timestamp; /* rew_val */
};
#define IFH_INJ_BYPASS BIT(31)
@@ -54,6 +60,12 @@ struct frame_info {
#define IFH_TAG_TYPE_C 0
#define IFH_TAG_TYPE_S 1
+#define IFH_REW_OP_NOOP 0x0
+#define IFH_REW_OP_DSCP 0x1
+#define IFH_REW_OP_ONE_STEP_PTP 0x2
+#define IFH_REW_OP_TWO_STEP_PTP 0x3
+#define IFH_REW_OP_ORIGIN_PTP 0x5
+
#define OCELOT_SPEED_2500 0
#define OCELOT_SPEED_1000 1
#define OCELOT_SPEED_100 2
@@ -401,6 +413,13 @@ enum ocelot_regfield {
REGFIELD_MAX
};
+enum ocelot_clk_pins {
+ ALT_PPS_PIN = 1,
+ EXT_CLK_PIN,
+ ALT_LDST_PIN,
+ TOD_ACC_PIN
+};
+
struct ocelot_multicast {
struct list_head list;
unsigned char addr[ETH_ALEN];
@@ -450,6 +469,13 @@ struct ocelot {
u64 *stats;
struct delayed_work stats_work;
struct workqueue_struct *stats_queue;
+
+ u8 ptp:1;
+ struct ptp_clock *ptp_clock;
+ struct ptp_clock_info ptp_info;
+ struct hwtstamp_config hwtstamp_config;
+ struct mutex ptp_lock; /* Protects the PTP interface state */
+ spinlock_t ptp_clock_lock; /* Protects the PTP clock */
};
struct ocelot_port {
@@ -473,6 +499,16 @@ struct ocelot_port {
struct phy *serdes;
struct ocelot_port_tc tc;
+
+ u8 ptp_cmd;
+ struct list_head skbs;
+ u8 ts_id;
+};
+
+struct ocelot_skb {
+ struct list_head head;
+ struct sk_buff *skb;
+ u8 id;
};
u32 __ocelot_read_ix(struct ocelot *ocelot, u32 reg, u32 offset);
@@ -517,4 +553,7 @@ extern struct notifier_block ocelot_netdevice_nb;
extern struct notifier_block ocelot_switchdev_nb;
extern struct notifier_block ocelot_switchdev_blocking_nb;
+int ocelot_ptp_gettime64(struct ptp_clock_info *ptp, struct timespec64 *ts);
+void ocelot_get_hwtimestamp(struct ocelot *ocelot, struct timespec64 *ts);
+
#endif
diff --git a/drivers/net/ethernet/mscc/ocelot_board.c b/drivers/net/ethernet/mscc/ocelot_board.c
index df8d15994a89..b063eb78fa0c 100644
--- a/drivers/net/ethernet/mscc/ocelot_board.c
+++ b/drivers/net/ethernet/mscc/ocelot_board.c
@@ -31,6 +31,8 @@ static int ocelot_parse_ifh(u32 *_ifh, struct frame_info *info)
info->len = OCELOT_BUFFER_CELL_SZ * wlen + llen - 80;
+ info->timestamp = IFH_EXTRACT_BITFIELD64(ifh[0], 21, 32);
+
info->port = IFH_EXTRACT_BITFIELD64(ifh[1], 43, 4);
info->tag_type = IFH_EXTRACT_BITFIELD64(ifh[1], 16, 1);
@@ -92,13 +94,14 @@ static irqreturn_t ocelot_xtr_irq_handler(int irq, void *arg)
return IRQ_NONE;
do {
- struct sk_buff *skb;
+ struct skb_shared_hwtstamps *shhwtstamps;
+ u64 tod_in_ns, full_ts_in_ns;
+ struct frame_info info = {};
struct net_device *dev;
- u32 *buf;
+ u32 ifh[4], val, *buf;
+ struct timespec64 ts;
int sz, len, buf_len;
- u32 ifh[4];
- u32 val;
- struct frame_info info;
+ struct sk_buff *skb;
for (i = 0; i < IFH_LEN; i++) {
err = ocelot_rx_frame_word(ocelot, grp, true, &ifh[i]);
@@ -145,6 +148,22 @@ static irqreturn_t ocelot_xtr_irq_handler(int irq, void *arg)
break;
}
+ if (ocelot->ptp) {
+ ocelot_ptp_gettime64(&ocelot->ptp_info, &ts);
+
+ tod_in_ns = ktime_set(ts.tv_sec, ts.tv_nsec);
+ if ((tod_in_ns & 0xffffffff) < info.timestamp)
+ full_ts_in_ns = (((tod_in_ns >> 32) - 1) << 32) |
+ info.timestamp;
+ else
+ full_ts_in_ns = (tod_in_ns & GENMASK_ULL(63, 32)) |
+ info.timestamp;
+
+ shhwtstamps = skb_hwtstamps(skb);
+ memset(shhwtstamps, 0, sizeof(struct skb_shared_hwtstamps));
+ shhwtstamps->hwtstamp = full_ts_in_ns;
+ }
+
/* Everything we see on an interface that is in the HW bridge
* has already been forwarded.
*/
@@ -164,6 +183,66 @@ static irqreturn_t ocelot_xtr_irq_handler(int irq, void *arg)
return IRQ_HANDLED;
}
+static irqreturn_t ocelot_ptp_rdy_irq_handler(int irq, void *arg)
+{
+ int budget = OCELOT_PTP_QUEUE_SZ;
+ struct ocelot *ocelot = arg;
+
+ while (budget--) {
+ struct skb_shared_hwtstamps shhwtstamps;
+ struct list_head *pos, *tmp;
+ struct sk_buff *skb = NULL;
+ struct ocelot_skb *entry;
+ struct ocelot_port *port;
+ struct timespec64 ts;
+ u32 val, id, txport;
+
+ val = ocelot_read(ocelot, SYS_PTP_STATUS);
+
+ /* Check if a timestamp can be retrieved */
+ if (!(val & SYS_PTP_STATUS_PTP_MESS_VLD))
+ break;
+
+ WARN_ON(val & SYS_PTP_STATUS_PTP_OVFL);
+
+ /* Retrieve the ts ID and Tx port */
+ id = SYS_PTP_STATUS_PTP_MESS_ID_X(val);
+ txport = SYS_PTP_STATUS_PTP_MESS_TXPORT_X(val);
+
+ /* Retrieve its associated skb */
+ port = ocelot->ports[txport];
+
+ list_for_each_safe(pos, tmp, &port->skbs) {
+ entry = list_entry(pos, struct ocelot_skb, head);
+ if (entry->id != id)
+ continue;
+
+ skb = entry->skb;
+
+ list_del(pos);
+ kfree(entry);
+ }
+
+ /* Next ts */
+ ocelot_write(ocelot, SYS_PTP_NXT_PTP_NXT, SYS_PTP_NXT);
+
+ if (unlikely(!skb))
+ continue;
+
+ /* Get the h/w timestamp */
+ ocelot_get_hwtimestamp(ocelot, &ts);
+
+ /* Set the timestamp into the skb */
+ memset(&shhwtstamps, 0, sizeof(shhwtstamps));
+ shhwtstamps.hwtstamp = ktime_set(ts.tv_sec, ts.tv_nsec);
+ skb_tstamp_tx(skb, &shhwtstamps);
+
+ dev_kfree_skb_any(skb);
+ }
+
+ return IRQ_HANDLED;
+}
+
static const struct of_device_id mscc_ocelot_match[] = {
{ .compatible = "mscc,vsc7514-switch" },
{ }
@@ -172,12 +251,12 @@ MODULE_DEVICE_TABLE(of, mscc_ocelot_match);
static int mscc_ocelot_probe(struct platform_device *pdev)
{
- int err, irq;
- unsigned int i;
struct device_node *np = pdev->dev.of_node;
struct device_node *ports, *portnp;
+ int err, irq_xtr, irq_ptp_rdy;
struct ocelot *ocelot;
struct regmap *hsio;
+ unsigned int i;
u32 val;
struct {
@@ -232,16 +311,29 @@ static int mscc_ocelot_probe(struct platform_device *pdev)
if (err)
return err;
- irq = platform_get_irq_byname(pdev, "xtr");
- if (irq < 0)
+ irq_xtr = platform_get_irq_byname(pdev, "xtr");
+ if (irq_xtr < 0)
return -ENODEV;
- err = devm_request_threaded_irq(&pdev->dev, irq, NULL,
+ err = devm_request_threaded_irq(&pdev->dev, irq_xtr, NULL,
ocelot_xtr_irq_handler, IRQF_ONESHOT,
"frame extraction", ocelot);
if (err)
return err;
+ irq_ptp_rdy = platform_get_irq_byname(pdev, "ptp_rdy");
+ if (irq_ptp_rdy > 0 && ocelot->targets[PTP]) {
+ err = devm_request_threaded_irq(&pdev->dev, irq_ptp_rdy, NULL,
+ ocelot_ptp_rdy_irq_handler,
+ IRQF_ONESHOT, "ptp ready",
+ ocelot);
+ if (err)
+ return err;
+
+ /* Both the PTP interrupt and the PTP bank are available */
+ ocelot->ptp = 1;
+ }
+
regmap_field_write(ocelot->regfields[SYS_RESET_CFG_MEM_INIT], 1);
regmap_field_write(ocelot->regfields[SYS_RESET_CFG_MEM_ENA], 1);
--
2.21.0
^ permalink raw reply related
* [PATCH bpf] s390/bpf: fix lcgr instruction encoding
From: Ilya Leoshkevich @ 2019-08-12 15:03 UTC (permalink / raw)
To: bpf, netdev; +Cc: gor, heiko.carstens, Ilya Leoshkevich
"masking, test in bounds 3" fails on s390, because
BPF_ALU64_IMM(BPF_NEG, BPF_REG_2, 0) ignores the top 32 bits of
BPF_REG_2. The reason is that JIT emits lcgfr instead of lcgr.
The associated comment indicates that the code was intended to emit lcgr
in the first place, it's just that the wrong opcode was used.
Fix by using the correct opcode.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
arch/s390/net/bpf_jit_comp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index e636728ab452..6299156f9738 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -863,7 +863,7 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
break;
case BPF_ALU64 | BPF_NEG: /* dst = -dst */
/* lcgr %dst,%dst */
- EMIT4(0xb9130000, dst_reg, dst_reg);
+ EMIT4(0xb9030000, dst_reg, dst_reg);
break;
/*
* BPF_FROM_BE/LE
--
2.21.0
^ permalink raw reply related
* Re: [net-next 01/15] ice: Implement ethtool ops for channels
From: Nguyen, Anthony L @ 2019-08-12 15:07 UTC (permalink / raw)
To: Kirsher, Jeffrey T, jakub.kicinski@netronome.com
Cc: nhorman@redhat.com, davem@davemloft.net, netdev@vger.kernel.org,
Bowers, AndrewX, sassmann@redhat.com, Tieman, Henry W
In-Reply-To: <20190809141518.55fe7f8a@cakuba.netronome.com>
[-- Attachment #1: Type: text/plain, Size: 974 bytes --]
On Fri, 2019-08-09 at 14:15 -0700, Jakub Kicinski wrote:
> On Fri, 9 Aug 2019 11:31:25 -0700, Jeff Kirsher wrote:
> > From: Henry Tieman <henry.w.tieman@intel.com>
> >
> > Add code to query and set the number of queues on the primary
> > VSI for a PF. This is accessed from the 'ethtool -l' and 'ethtool
> > -L'
> > commands, respectively.
> >
> > Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
> > Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
> > Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
> > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>
> If you're using the same IRQ vector for RX and TX queue the channel
> counts as combined. Looks like you are counting RX and TX separately
> here. That's incorrect.
Hi Jakub,
The ice driver can support asymmetric queues. We report these
seperately, as opposed to combined, so that the user can specify a
different number of Rx and Tx queues.
Thanks,
Tony
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 3277 bytes --]
^ permalink raw reply
* Re: [BUG] fec mdio times out under system stress
From: Fabio Estevam @ 2019-08-12 15:10 UTC (permalink / raw)
To: Russell King - ARM Linux admin
Cc: moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE, netdev,
Andrew Lunn, Florian Fainelli, Heiner Kallweit
In-Reply-To: <20190811133707.GC13294@shell.armlinux.org.uk>
Hi Russell,
On Sun, Aug 11, 2019 at 10:37 AM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> Hi Fabio,
>
> When I woke up this morning, I found that one of the Hummingboards
> had gone offline (as in, lost network link) during the night.
> Investigating, I find that the system had gone into OOM, and at
> that time, triggered an unrelated:
>
> [4111697.698776] fec 2188000.ethernet eth0: MDIO read timeout
> [4111697.712996] MII_DATA: 0x6006796d
> [4111697.729415] MII_SPEED: 0x0000001a
> [4111697.745232] IEVENT: 0x00000000
> [4111697.745242] IMASK: 0x0a8000aa
> [4111698.002233] Atheros 8035 ethernet 2188000.ethernet-1:00: PHY state change RUNNING -> HALTED
> [4111698.009882] fec 2188000.ethernet eth0: Link is Down
>
> This is on a dual-core iMX6.
>
> It looks like the read actually completed (since MII_DATA contains
> the register data) but we somehow lost the interrupt (or maybe
> received the interrupt after wait_for_completion_timeout() timed
> out.)
>
> From what I can see, the OOM events happened on CPU1, CPU1 was
> allocated the FEC interrupt, and the PHY polling that suffered the
> MDIO timeout was on CPU0.
>
> Given that IEVENT is zero, it seems that CPU1 had read serviced the
> interrupt, but it is not clear how far through processing that it
> was - it may be that fec_enet_interrupt() had been delayed by the
> OOM condition.
>
> This seems rather fragile - as the system slowing down due to OOM
> triggers the network to completely collapse by phylib taking the
> PHY offline, making the system inaccessible except through the
> console.
>
> In my case, even serial console wasn't operational (except for
> magic sysrq). Not sure what agetty was playing at... so the only
> way I could recover any information from the system was to connect
> the HDMI and plug in a USB keyboard.
>
> Any thoughts on how FEC MDIO accesses could be made more robust?
Sorry for the delay. I am currently on vacation with limited e-mail access.
I think it is worth trying Andrew's suggestion to increase FEC_MII_TIMEOUT.
Thanks
^ permalink raw reply
* Re: [patch net-next rfc 3/7] net: rtnetlink: add commands to add and delete alternative ifnames
From: Roopa Prabhu @ 2019-08-12 15:13 UTC (permalink / raw)
To: Jiri Pirko
Cc: David Ahern, netdev, David Miller, Jakub Kicinski,
Stephen Hemminger, dcbw, Michal Kubecek, Andrew Lunn, parav,
Saeed Mahameed, mlxsw
In-Reply-To: <20190812083139.GA2428@nanopsycho>
On Mon, Aug 12, 2019 at 1:31 AM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Mon, Aug 12, 2019 at 03:37:26AM CEST, dsahern@gmail.com wrote:
> >On 8/11/19 7:34 PM, David Ahern wrote:
> >> On 8/10/19 12:30 AM, Jiri Pirko wrote:
> >>> Could you please write me an example message of add/remove?
> >>
> >> altnames are for existing netdevs, yes? existing netdevs have an id and
> >> a name - 2 existing references for identifying the existing netdev for
> >> which an altname will be added. Even using the altname as the main
> >> 'handle' for a setlink change, I see no reason why the GETLINK api can
> >> not take an the IFLA_ALT_IFNAME and return the full details of the
> >> device if the altname is unique.
> >>
> >> So, what do the new RTM commands give you that you can not do with
> >> RTM_*LINK?
> >>
> >
> >
> >To put this another way, the ALT_NAME is an attribute of an object - a
> >LINK. It is *not* a separate object which requires its own set of
> >commands for manipulating.
>
> Okay, again, could you provide example of a message to add/remove
> altname using existing setlink message? Thanks!
Will the below work ?... just throwing an example for discussion:
make the name list a nested list
IFLA_ALT_NAMES
IFLA_ALT_NAME_OP /* ADD or DEL used with setlink */
IFLA_ALT_NAME
IFLA_ALT_NAME_LIST
With RTM_NEWLINK you can specify a list to set and unset
With RTM_SETLINK you can specify an individual name with a add or del op
notifications will always be RTM_NEWLINK with the full list.
The nested attribute can be structured differently.
Only thing is i am worried about increasing the size of link dump and
notification msgs.
What is the limit on the number of names again ?
^ permalink raw reply
* Re: [patch net-next rfc 3/7] net: rtnetlink: add commands to add and delete alternative ifnames
From: Roopa Prabhu @ 2019-08-12 15:21 UTC (permalink / raw)
To: Michal Kubecek
Cc: netdev, Jiri Pirko, David Miller, Jakub Kicinski,
Stephen Hemminger, David Ahern, dcbw, Andrew Lunn, parav,
Saeed Mahameed, mlxsw
In-Reply-To: <20190811221027.GD30089@unicorn.suse.cz>
On Sun, Aug 11, 2019 at 3:10 PM Michal Kubecek <mkubecek@suse.cz> wrote:
>
> On Sat, Aug 10, 2019 at 12:39:31PM -0700, Roopa Prabhu wrote:
> > On Sat, Aug 10, 2019 at 8:50 AM Michal Kubecek <mkubecek@suse.cz> wrote:
> > >
> > > On Sat, Aug 10, 2019 at 06:46:57AM -0700, Roopa Prabhu wrote:
> > > > On Fri, Aug 9, 2019 at 8:46 AM Michal Kubecek <mkubecek@suse.cz> wrote:
> > > > >
> > > > > On Fri, Aug 09, 2019 at 08:40:25AM -0700, Roopa Prabhu wrote:
> > > > > > to that point, I am also not sure why we have a new API For multiple
> > > > > > names. I mean why support more than two names (existing old name and
> > > > > > a new name to remove the length limitation) ?
> > > > >
> > > > > One use case is to allow "predictable names" from udev/systemd to work
> > > > > the way do for e.g. block devices, see
> > > > >
> > > > > http://lkml.kernel.org/r/20190628162716.GF29149@unicorn.suse.cz
> > > > >
> > > >
> > > > thanks for the link. don't know the details about alternate block
> > > > device names. Does user-space generate multiple and assign them to a
> > > > kernel object as proposed in this series ?. is there a limit to number
> > > > of names ?. my understanding of 'predictable names' was still a single
> > > > name but predictable structure to the name.
> > >
> > > It is a single name but IMHO mostly because we can only have one name.
> > > For block devices, udev uses symlinks to create multiple aliases based
> > > on different naming schemes, e.g.
> > >
> > > mike@lion:~> find -L /dev/disk/ -samefile /dev/sda2 -exec ls -l {} +
> > > lrwxrwxrwx 1 root root 10 srp 5 21:47 /dev/disk/by-id/ata-WDC_WD30EFRX-68AX9N0_WD-WMC1T3114933-part2 -> ../../sda2
> > > lrwxrwxrwx 1 root root 10 srp 5 21:47 /dev/disk/by-id/scsi-SATA_WDC_WD30EFRX-68A_WD-WMC1T3114933-part2 -> ../../sda2
> > > lrwxrwxrwx 1 root root 10 srp 5 21:47 /dev/disk/by-id/scsi-SATA_WDC_WD30EFRX-68_WD-WMC1T3114933-part2 -> ../../sda2
> > > lrwxrwxrwx 1 root root 10 srp 5 21:47 /dev/disk/by-id/scsi-0ATA_WDC_WD30EFRX-68A_WD-WMC1T3114933-part2 -> ../../sda2
> > > lrwxrwxrwx 1 root root 10 srp 5 21:47 /dev/disk/by-id/scsi-1ATA_WDC_WD30EFRX-68AX9N0_WD-WMC1T3114933-part2 -> ../../sda2
> > > lrwxrwxrwx 1 root root 10 srp 5 21:47 /dev/disk/by-id/scsi-350014ee6589cfea0-part2 -> ../../sda2
> > > lrwxrwxrwx 1 root root 10 srp 5 21:47 /dev/disk/by-id/wwn-0x50014ee6589cfea0-part2 -> ../../sda2
> > > lrwxrwxrwx 1 root root 10 srp 5 21:47 /dev/disk/by-partlabel/root2 -> ../../sda2
> > > lrwxrwxrwx 1 root root 10 srp 5 21:47 /dev/disk/by-partuuid/71affb47-a93b-40fd-8986-d2e227e1b39d -> ../../sda2
> > > lrwxrwxrwx 1 root root 10 srp 5 21:47 /dev/disk/by-path/pci-0000:00:11.0-ata-1-part2 -> ../../sda2
> > > lrwxrwxrwx 1 root root 10 srp 5 21:47 /dev/disk/by-path/pci-0000:00:11.0-scsi-0:0:0:0-part2 -> ../../sda2
> > >
> > > Few years ago, udev even dropped support for renaming block and
> > > character devices (NAME="...") so that it now keeps kernel name and only
> > > creates symlinks to it. Recent versions only allow NAME="..." for
> > > network devices.
> >
> > ok thanks for the details. This looks like names that are structured
> > on hardware info which could fall into devlinks scope and they point
> > to a single name.
> > We should think about keeping them under devlink (by-id, by-mac etc).
> > It already can recognize network interfaces by id.
>
> Not all of them are hardware based, there are also links based on
> filesystem label or UUID. But my point is rather that udev creates
> multiple links so that any of them can be used in any place where
> a block device is to be identified.
>
> As network devices can have only one name, udev drops kernel provided
> name completely and replaces it with name following one naming scheme.
> Thus we have to know which naming scheme is going to be used and make
> sure it does not change. With multiple alternative names, we could also
> have all udev provided names at once (and also the original one from
> kernel).
ok, understand the use-case.
But, Its hard for me to understand how udev is going to manage this
list of names without structure to them.
Plus how is udev going to distinguish its own names from user given name ?.
I thought this list was giving an opportunity to use the long name
everywhere else.
But if this is going to be managed by udev with a bunch of structured
names, I don't understand how the rest of the system is going to use
these names.
Maybe we should just call this a udev managed list of names.
(again, i think the best way to do this for udev is to provide the
symlink like facility via devlink or any other infra).
^ permalink raw reply
* Re: [RFC PATCH v7] rtl8xxxu: Improve TX performance of RTL8723BU on rtl8xxxu driver
From: Jes Sorensen @ 2019-08-12 15:21 UTC (permalink / raw)
To: Kalle Valo
Cc: Chris Chiu, davem, linux-wireless, netdev, linux-kernel, linux,
Daniel Drake
In-Reply-To: <87wofibgk7.fsf@kamboji.qca.qualcomm.com>
On 8/12/19 10:32 AM, Kalle Valo wrote:
> Jes Sorensen <jes.sorensen@gmail.com> writes:
>> Looks good to me! Nice work! I am actually very curious if this will
>> improve performance 8192eu as well.
>>
>> Ideally I'd like to figure out how to make host controlled rates work,
>> but in all my experiments with that, I never really got it to work well.
>>
>> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
>
> This is marked as RFC so I'm not sure what's the plan. Should I apply
> this?
I think it's at a point where it's worth applying - I kinda wish I had
had time to test it, but I won't be near my stash of USB dongles for a
little while.
Cheers,
Jes
^ permalink raw reply
* Re: [PATCH v2 0/3] Fix three issues found by syzbot
From: David Miller @ 2019-08-12 15:25 UTC (permalink / raw)
To: ying.xue
Cc: netdev, jon.maloy, hdanton, tipc-discussion, syzkaller-bugs,
jakub.kicinski
In-Reply-To: <1565595162-1383-1-git-send-email-ying.xue@windriver.com>
From: Ying Xue <ying.xue@windriver.com>
Date: Mon, 12 Aug 2019 15:32:39 +0800
> Ying Xue (3):
> tipc: fix memory leak issue
> tipc: fix memory leak issue
Please make the subject lines for these two patches unique. Perhaps
mention what part of the tipc code has the memory leak you are fixing.
Thanks.
^ permalink raw reply
* Re: [PATCH net] netdevsim: Restore per-network namespace accounting for fib entries
From: David Miller @ 2019-08-12 15:28 UTC (permalink / raw)
To: jiri; +Cc: dsahern, netdev, dsahern
In-Reply-To: <20190812083635.GB2428@nanopsycho>
From: Jiri Pirko <jiri@resnulli.us>
Date: Mon, 12 Aug 2019 10:36:35 +0200
> I understand it with real devices, but dummy testing device, who's
> purpose is just to test API. Why?
Because you'll break all of the wonderful testing infrastructure
people like David have created.
^ permalink raw reply
* Re: [patch net-next rfc 3/7] net: rtnetlink: add commands to add and delete alternative ifnames
From: Stephen Hemminger @ 2019-08-12 15:40 UTC (permalink / raw)
To: Jiri Pirko
Cc: David Ahern, Roopa Prabhu, netdev, David Miller, Jakub Kicinski,
Stephen Hemminger, dcbw, Michal Kubecek, Andrew Lunn, parav,
Saeed Mahameed, mlxsw
In-Reply-To: <20190812083139.GA2428@nanopsycho>
On Mon, 12 Aug 2019 10:31:39 +0200
Jiri Pirko <jiri@resnulli.us> wrote:
> Mon, Aug 12, 2019 at 03:37:26AM CEST, dsahern@gmail.com wrote:
> >On 8/11/19 7:34 PM, David Ahern wrote:
> >> On 8/10/19 12:30 AM, Jiri Pirko wrote:
> >>> Could you please write me an example message of add/remove?
> >>
> >> altnames are for existing netdevs, yes? existing netdevs have an id and
> >> a name - 2 existing references for identifying the existing netdev for
> >> which an altname will be added. Even using the altname as the main
> >> 'handle' for a setlink change, I see no reason why the GETLINK api can
> >> not take an the IFLA_ALT_IFNAME and return the full details of the
> >> device if the altname is unique.
> >>
> >> So, what do the new RTM commands give you that you can not do with
> >> RTM_*LINK?
> >>
> >
> >
> >To put this another way, the ALT_NAME is an attribute of an object - a
> >LINK. It is *not* a separate object which requires its own set of
> >commands for manipulating.
>
> Okay, again, could you provide example of a message to add/remove
> altname using existing setlink message? Thanks!
The existing IFALIAS takes an empty name to do deletion.
^ permalink raw reply
* Re: [patch net-next rfc 3/7] net: rtnetlink: add commands to add and delete alternative ifnames
From: Michal Kubecek @ 2019-08-12 15:43 UTC (permalink / raw)
To: netdev
Cc: Roopa Prabhu, Jiri Pirko, David Miller, Jakub Kicinski,
Stephen Hemminger, David Ahern, dcbw, Andrew Lunn, parav,
Saeed Mahameed, mlxsw
In-Reply-To: <CAJieiUjQNHdgHCsQqMHO3u3DtGRBgZfx0Wo3aUj0LyUm_YJ5Eg@mail.gmail.com>
On Mon, Aug 12, 2019 at 08:21:31AM -0700, Roopa Prabhu wrote:
> On Sun, Aug 11, 2019 at 3:10 PM Michal Kubecek <mkubecek@suse.cz> wrote:
> >
> > Not all of them are hardware based, there are also links based on
> > filesystem label or UUID. But my point is rather that udev creates
> > multiple links so that any of them can be used in any place where
> > a block device is to be identified.
> >
> > As network devices can have only one name, udev drops kernel provided
> > name completely and replaces it with name following one naming scheme.
> > Thus we have to know which naming scheme is going to be used and make
> > sure it does not change. With multiple alternative names, we could also
> > have all udev provided names at once (and also the original one from
> > kernel).
>
> ok, understand the use-case.
> But, Its hard for me to understand how udev is going to manage this
> list of names without structure to them.
> Plus how is udev going to distinguish its own names from user given name ?.
>
> I thought this list was giving an opportunity to use the long name
> everywhere else.
> But if this is going to be managed by udev with a bunch of structured
> names, I don't understand how the rest of the system is going to use
> these names.
>
> Maybe we should just call this a udev managed list of names.
>
> (again, i think the best way to do this for udev is to provide the
> symlink like facility via devlink or any other infra).
I certainly didn't want to suggest for alternative names to be managed
by udev. What I meant was that supporting multiple alternative names
would allow udev to create its names based on e.g. device bus address,
BIOS/UEFI slot number, MAC address etc. But it would still be up to
admins if they want to create their own names.
Michal
^ permalink raw reply
* Re: [PATCH rdma-next 0/4] Add XRQ and SRQ support to DEVX interface
From: Doug Ledford @ 2019-08-12 15:43 UTC (permalink / raw)
To: Leon Romanovsky, Jason Gunthorpe
Cc: RDMA mailing list, Edward Srouji, Saeed Mahameed, Yishai Hadas,
linux-netdev
In-Reply-To: <20190808101059.GC28049@mtr-leonro.mtl.com>
[-- Attachment #1: Type: text/plain, Size: 601 bytes --]
On Thu, 2019-08-08 at 10:11 +0000, Leon Romanovsky wrote:
> On Thu, Aug 08, 2019 at 11:43:54AM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > Hi,
> >
> > This small series extends DEVX interface with SRQ and XRQ legacy
> > commands.
>
> Sorry for typo in cover letter, there is no SRQ here.
Series looks fine to me. Are you planning on the first two via mlx5-
next and the remainder via RDMA tree?
--
Doug Ledford <dledford@redhat.com>
GPG KeyID: B826A3330E572FDD
Fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCH bpf] s390/bpf: fix lcgr instruction encoding
From: Vasily Gorbik @ 2019-08-12 15:46 UTC (permalink / raw)
To: Ilya Leoshkevich; +Cc: bpf, netdev, heiko.carstens
In-Reply-To: <20190812150332.98109-1-iii@linux.ibm.com>
On Mon, Aug 12, 2019 at 05:03:32PM +0200, Ilya Leoshkevich wrote:
> "masking, test in bounds 3" fails on s390, because
> BPF_ALU64_IMM(BPF_NEG, BPF_REG_2, 0) ignores the top 32 bits of
> BPF_REG_2. The reason is that JIT emits lcgfr instead of lcgr.
> The associated comment indicates that the code was intended to emit lcgr
> in the first place, it's just that the wrong opcode was used.
>
> Fix by using the correct opcode.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
> arch/s390/net/bpf_jit_comp.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> index e636728ab452..6299156f9738 100644
> --- a/arch/s390/net/bpf_jit_comp.c
> +++ b/arch/s390/net/bpf_jit_comp.c
> @@ -863,7 +863,7 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
> break;
> case BPF_ALU64 | BPF_NEG: /* dst = -dst */
> /* lcgr %dst,%dst */
> - EMIT4(0xb9130000, dst_reg, dst_reg);
> + EMIT4(0xb9030000, dst_reg, dst_reg);
> break;
> /*
> * BPF_FROM_BE/LE
> --
> 2.21.0
>
Please add
Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
or whatever it should be. With that:
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
^ permalink raw reply
* Re: [PATCH v3 14/17] fm10k: no need to check return value of debugfs_create functions
From: Jeff Kirsher @ 2019-08-12 15:46 UTC (permalink / raw)
To: Greg Kroah-Hartman, netdev; +Cc: David S. Miller, intel-wired-lan
In-Reply-To: <20190810101732.26612-15-gregkh@linuxfoundation.org>
[-- Attachment #1: Type: text/plain, Size: 655 bytes --]
On Sat, 2019-08-10 at 12:17 +0200, Greg Kroah-Hartman wrote:
> When calling debugfs functions, there is no need to ever check the
> return value. The function can work or not, but the code logic
> should
> never do something different based on this.
>
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
> drivers/net/ethernet/intel/fm10k/fm10k_debugfs.c | 2 --
> 1 file changed, 2 deletions(-)
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCH v3 15/17] i40e: no need to check return value of debugfs_create functions
From: Jeff Kirsher @ 2019-08-12 15:48 UTC (permalink / raw)
To: Greg Kroah-Hartman, netdev; +Cc: David S. Miller, intel-wired-lan
In-Reply-To: <20190810101732.26612-16-gregkh@linuxfoundation.org>
[-- Attachment #1: Type: text/plain, Size: 692 bytes --]
On Sat, 2019-08-10 at 12:17 +0200, Greg Kroah-Hartman wrote:
> When calling debugfs functions, there is no need to ever check the
> return value. The function can work or not, but the code logic
> should
> never do something different based on this.
>
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
> .../net/ethernet/intel/i40e/i40e_debugfs.c | 22 ++++-------------
> --
> 1 file changed, 4 insertions(+), 18 deletions(-)
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCH v3 16/17] ixgbe: no need to check return value of debugfs_create functions
From: Jeff Kirsher @ 2019-08-12 15:49 UTC (permalink / raw)
To: Greg Kroah-Hartman, netdev; +Cc: David S. Miller, intel-wired-lan
In-Reply-To: <20190810101732.26612-17-gregkh@linuxfoundation.org>
[-- Attachment #1: Type: text/plain, Size: 692 bytes --]
On Sat, 2019-08-10 at 12:17 +0200, Greg Kroah-Hartman wrote:
> When calling debugfs functions, there is no need to ever check the
> return value. The function can work or not, but the code logic
> should
> never do something different based on this.
>
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
> .../net/ethernet/intel/ixgbe/ixgbe_debugfs.c | 22 +++++----------
> ----
> 1 file changed, 5 insertions(+), 17 deletions(-)
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCHv2 net 0/2] Add netdev_level_ratelimited to avoid netdev msg flush
From: Thomas Falcon @ 2019-08-12 15:56 UTC (permalink / raw)
To: Hangbin Liu, David Miller; +Cc: netdev, joe
In-Reply-To: <20190812073733.GV18865@dhcp-12-139.nay.redhat.com>
On 8/12/19 2:37 AM, Hangbin Liu wrote:
> On Sun, Aug 11, 2019 at 09:08:20PM -0700, David Miller wrote:
>> From: Hangbin Liu <liuhangbin@gmail.com>
>> Date: Fri, 9 Aug 2019 08:29:39 +0800
>>
>>> ibmveth 30000003 env3: h_multicast_ctrl rc=4 when adding an entry to the filter table
>>> error when add more thann 256 memberships in one multicast group. I haven't
>>> found this issue on other driver. It looks like an ibm driver issue and need
>>> to be fixed separately.
>> You need to root cause and fix the reason this message appears so much.
>>
>> Once I let you rate limit the message you will have zero incentive to
>> fix the real problem and fix it.
> Sorry, I'm not familiar with ibmveth driver...
>
> Hi Thomas,
>
> Would you please help check why this issue happens
Hi, thanks for reporting this. I was able to recreate this on my own
system. The virtual ethernet's multicast filter list size is limited,
and the driver will check that there is available space before adding
entries. The problem is that the size is encoded as big endian, but the
driver does not convert it for little endian systems after retrieving it
from the device tree. As a result the driver is requesting more than
the hypervisor can allow and getting this error in reply. I will submit
a patch to correct this soon.
Thanks again,
Tom
> Thanks
> Hangbbin
>
^ permalink raw reply
* Re: [patch net-next rfc 3/7] net: rtnetlink: add commands to add and delete alternative ifnames
From: David Ahern @ 2019-08-12 16:01 UTC (permalink / raw)
To: Jiri Pirko
Cc: Roopa Prabhu, netdev, David Miller, Jakub Kicinski,
Stephen Hemminger, dcbw, Michal Kubecek, Andrew Lunn, parav,
Saeed Mahameed, mlxsw
In-Reply-To: <20190812083139.GA2428@nanopsycho>
On 8/12/19 2:31 AM, Jiri Pirko wrote:
> Mon, Aug 12, 2019 at 03:37:26AM CEST, dsahern@gmail.com wrote:
>> On 8/11/19 7:34 PM, David Ahern wrote:
>>> On 8/10/19 12:30 AM, Jiri Pirko wrote:
>>>> Could you please write me an example message of add/remove?
>>>
>>> altnames are for existing netdevs, yes? existing netdevs have an id and
>>> a name - 2 existing references for identifying the existing netdev for
>>> which an altname will be added. Even using the altname as the main
>>> 'handle' for a setlink change, I see no reason why the GETLINK api can
>>> not take an the IFLA_ALT_IFNAME and return the full details of the
>>> device if the altname is unique.
>>>
>>> So, what do the new RTM commands give you that you can not do with
>>> RTM_*LINK?
>>>
>>
>>
>> To put this another way, the ALT_NAME is an attribute of an object - a
>> LINK. It is *not* a separate object which requires its own set of
>> commands for manipulating.
>
> Okay, again, could you provide example of a message to add/remove
> altname using existing setlink message? Thanks!
>
Examples from your cover letter with updates
$ ip link set dummy0 altname someothername
$ ip link set dummy0 altname someotherveryveryveryverylongname
$ ip link set dummy0 del altname someothername
$ ip link set dummy0 del altname someotherveryveryveryverylongname
This syntactic sugar to what is really happening:
RTM_NEWLINK, dummy0, IFLA_ALT_IFNAME
if you are allowing many alt names, then yes, you need a flag to say
delete this specific one which is covered by Roopa's nested suggestion.
^ permalink raw reply
* Re: [PATCH bpf] s390/bpf: fix lcgr instruction encoding
From: Daniel Borkmann @ 2019-08-12 16:06 UTC (permalink / raw)
To: Ilya Leoshkevich, bpf, netdev; +Cc: gor, heiko.carstens
In-Reply-To: <20190812150332.98109-1-iii@linux.ibm.com>
On 8/12/19 5:03 PM, Ilya Leoshkevich wrote:
> "masking, test in bounds 3" fails on s390, because
> BPF_ALU64_IMM(BPF_NEG, BPF_REG_2, 0) ignores the top 32 bits of
> BPF_REG_2. The reason is that JIT emits lcgfr instead of lcgr.
> The associated comment indicates that the code was intended to emit lcgr
> in the first place, it's just that the wrong opcode was used.
>
> Fix by using the correct opcode.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Applied, thanks!
^ permalink raw reply
* [PATCH v2 2/2] Documentation: sphinx: Don't parse socket() as identifier reference
From: Jonathan Neuschäfer @ 2019-08-12 16:07 UTC (permalink / raw)
To: linux-doc
Cc: Jonathan Neuschäfer, Jonathan Corbet, Alexei Starovoitov,
Daniel Borkmann, David S. Miller, Jakub Kicinski,
Jesper Dangaard Brouer, John Fastabend, Mauro Carvalho Chehab,
linux-kernel, netdev, xdp-newbies, bpf
In-Reply-To: <20190812160708.32172-1-j.neuschaefer@gmx.net>
With the introduction of Documentation/sphinx/automarkup.py, socket() is
parsed as a reference to the in-kernel definition of socket. Sphinx then
decides that struct socket is a good match, which is usually not
intended, when the syscall is meant instead. This was observed in
Documentation/networking/af_xdp.rst.
Prevent socket() from being misinterpreted by adding it to the Skipfuncs
list in automarkup.py.
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
---
v2:
- block socket() in Documentation/sphinx/automarkup.py, as suggested by
Jonathan Corbet
v1:
- https://lore.kernel.org/lkml/20190810121738.19587-1-j.neuschaefer@gmx.net/
---
Documentation/sphinx/automarkup.py | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/Documentation/sphinx/automarkup.py b/Documentation/sphinx/automarkup.py
index a8798369e8f7..5b6119ff69f4 100644
--- a/Documentation/sphinx/automarkup.py
+++ b/Documentation/sphinx/automarkup.py
@@ -26,7 +26,8 @@ RE_function = re.compile(r'([\w_][\w\d_]+\(\))')
# just don't even try with these names.
#
Skipfuncs = [ 'open', 'close', 'read', 'write', 'fcntl', 'mmap',
- 'select', 'poll', 'fork', 'execve', 'clone', 'ioctl']
+ 'select', 'poll', 'fork', 'execve', 'clone', 'ioctl',
+ 'socket' ]
#
# Find all occurrences of function() and try to replace them with
--
2.20.1
^ permalink raw reply related
* Re: [PATCH v2 2/2] Documentation: sphinx: Don't parse socket() as identifier reference
From: Mauro Carvalho Chehab @ 2019-08-12 16:11 UTC (permalink / raw)
To: Jonathan Neuschäfer
Cc: linux-doc, Jonathan Corbet, Alexei Starovoitov, Daniel Borkmann,
David S. Miller, Jakub Kicinski, Jesper Dangaard Brouer,
John Fastabend, linux-kernel, netdev, xdp-newbies, bpf
In-Reply-To: <20190812160708.32172-2-j.neuschaefer@gmx.net>
Em Mon, 12 Aug 2019 18:07:05 +0200
Jonathan Neuschäfer <j.neuschaefer@gmx.net> escreveu:
> With the introduction of Documentation/sphinx/automarkup.py, socket() is
> parsed as a reference to the in-kernel definition of socket. Sphinx then
> decides that struct socket is a good match, which is usually not
> intended, when the syscall is meant instead. This was observed in
> Documentation/networking/af_xdp.rst.
>
> Prevent socket() from being misinterpreted by adding it to the Skipfuncs
> list in automarkup.py.
>
> Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
> ---
>
> v2:
> - block socket() in Documentation/sphinx/automarkup.py, as suggested by
> Jonathan Corbet
>
> v1:
> - https://lore.kernel.org/lkml/20190810121738.19587-1-j.neuschaefer@gmx.net/
> ---
> Documentation/sphinx/automarkup.py | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/sphinx/automarkup.py b/Documentation/sphinx/automarkup.py
> index a8798369e8f7..5b6119ff69f4 100644
> --- a/Documentation/sphinx/automarkup.py
> +++ b/Documentation/sphinx/automarkup.py
> @@ -26,7 +26,8 @@ RE_function = re.compile(r'([\w_][\w\d_]+\(\))')
> # just don't even try with these names.
> #
> Skipfuncs = [ 'open', 'close', 'read', 'write', 'fcntl', 'mmap',
> - 'select', 'poll', 'fork', 'execve', 'clone', 'ioctl']
> + 'select', 'poll', 'fork', 'execve', 'clone', 'ioctl',
> + 'socket' ]
Both patches sound OK on my eyes. Yet, I would just fold them into
a single one.
In any case:
Reviewed-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
>
> #
> # Find all occurrences of function() and try to replace them with
> --
> 2.20.1
>
Thanks,
Mauro
^ permalink raw reply
* Re: [PATCH net-next] net: openvswitch: Set OvS recirc_id from tc chain index
From: Pravin Shelar @ 2019-08-12 16:18 UTC (permalink / raw)
To: Paul Blakey
Cc: Linux Kernel Network Developers, David S. Miller, Justin Pettit,
Simon Horman, Marcelo Ricardo Leitner, Vlad Buslov, Jiri Pirko,
Roi Dayan, Yossi Kuperman, Rony Efraim, Oz Shlomo
In-Reply-To: <b5342e56-4baa-97ab-8694-2f48d012afca@mellanox.com>
On Sun, Aug 11, 2019 at 3:46 AM Paul Blakey <paulb@mellanox.com> wrote:
>
>
> On 8/8/2019 11:53 PM, Pravin Shelar wrote:
> > On Wed, Aug 7, 2019 at 5:08 AM Paul Blakey <paulb@mellanox.com> wrote:
> >> Offloaded OvS datapath rules are translated one to one to tc rules,
> >> for example the following simplified OvS rule:
> >>
> >> recirc_id(0),in_port(dev1),eth_type(0x0800),ct_state(-trk) actions:ct(),recirc(2)
> >>
> >> Will be translated to the following tc rule:
> >>
> >> $ tc filter add dev dev1 ingress \
> >> prio 1 chain 0 proto ip \
> >> flower tcp ct_state -trk \
> >> action ct pipe \
> >> action goto chain 2
> >>
> >> Received packets will first travel though tc, and if they aren't stolen
> >> by it, like in the above rule, they will continue to OvS datapath.
> >> Since we already did some actions (action ct in this case) which might
> >> modify the packets, and updated action stats, we would like to continue
> >> the proccessing with the correct recirc_id in OvS (here recirc_id(2))
> >> where we left off.
> >>
> >> To support this, introduce a new skb extension for tc, which
> >> will be used for translating tc chain to ovs recirc_id to
> >> handle these miss cases. Last tc chain index will be set
> >> by tc goto chain action and read by OvS datapath.
> >>
> >> Signed-off-by: Paul Blakey <paulb@mellanox.com>
> >> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
> >> Acked-by: Jiri Pirko <jiri@mellanox.com>
> >> ---
> >> include/linux/skbuff.h | 13 +++++++++++++
> >> include/net/sch_generic.h | 5 ++++-
> >> net/core/skbuff.c | 6 ++++++
> >> net/openvswitch/flow.c | 9 +++++++++
> >> net/sched/Kconfig | 13 +++++++++++++
> >> net/sched/act_api.c | 1 +
> >> net/sched/cls_api.c | 12 ++++++++++++
> >> 7 files changed, 58 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> >> index 3aef8d8..fb2a792 100644
> >> --- a/include/linux/skbuff.h
> >> +++ b/include/linux/skbuff.h
> >> @@ -279,6 +279,16 @@ struct nf_bridge_info {
> >> };
> >> #endif
> >>
> >> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
> >> +/* Chain in tc_skb_ext will be used to share the tc chain with
> >> + * ovs recirc_id. It will be set to the current chain by tc
> >> + * and read by ovs to recirc_id.
> >> + */
> >> +struct tc_skb_ext {
> >> + __u32 chain;
> >> +};
> >> +#endif
> >> +
> >> struct sk_buff_head {
> >> /* These two members must be first. */
> >> struct sk_buff *next;
> >> @@ -4050,6 +4060,9 @@ enum skb_ext_id {
> >> #ifdef CONFIG_XFRM
> >> SKB_EXT_SEC_PATH,
> >> #endif
> >> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
> >> + TC_SKB_EXT,
> >> +#endif
> >> SKB_EXT_NUM, /* must be last */
> >> };
> >>
> >> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> >> index 6b6b012..871feea 100644
> >> --- a/include/net/sch_generic.h
> >> +++ b/include/net/sch_generic.h
> >> @@ -275,7 +275,10 @@ struct tcf_result {
> >> unsigned long class;
> >> u32 classid;
> >> };
> >> - const struct tcf_proto *goto_tp;
> >> + struct {
> >> + const struct tcf_proto *goto_tp;
> >> + u32 goto_index;
> >> + };
> >>
> >> /* used in the skb_tc_reinsert function */
> >> struct {
> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >> index ea8e8d3..2b40b5a 100644
> >> --- a/net/core/skbuff.c
> >> +++ b/net/core/skbuff.c
> >> @@ -4087,6 +4087,9 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
> >> #ifdef CONFIG_XFRM
> >> [SKB_EXT_SEC_PATH] = SKB_EXT_CHUNKSIZEOF(struct sec_path),
> >> #endif
> >> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
> >> + [TC_SKB_EXT] = SKB_EXT_CHUNKSIZEOF(struct tc_skb_ext),
> >> +#endif
> >> };
> >>
> >> static __always_inline unsigned int skb_ext_total_length(void)
> >> @@ -4098,6 +4101,9 @@ static __always_inline unsigned int skb_ext_total_length(void)
> >> #ifdef CONFIG_XFRM
> >> skb_ext_type_len[SKB_EXT_SEC_PATH] +
> >> #endif
> >> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
> >> + skb_ext_type_len[TC_SKB_EXT] +
> >> +#endif
> >> 0;
> >> }
> >>
> >> diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> >> index bc89e16..0287ead 100644
> >> --- a/net/openvswitch/flow.c
> >> +++ b/net/openvswitch/flow.c
> >> @@ -816,6 +816,9 @@ static int key_extract_mac_proto(struct sk_buff *skb)
> >> int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
> >> struct sk_buff *skb, struct sw_flow_key *key)
> >> {
> >> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
> >> + struct tc_skb_ext *tc_ext;
> >> +#endif
> >> int res, err;
> >>
> >> /* Extract metadata from packet. */
> >> @@ -848,7 +851,13 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
> >> if (res < 0)
> >> return res;
> >> key->mac_proto = res;
> >> +
> >> +#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
> >> + tc_ext = skb_ext_find(skb, TC_SKB_EXT);
> >> + key->recirc_id = tc_ext ? tc_ext->chain : 0;
> >> +#else
> >> key->recirc_id = 0;
> >> +#endif
> >>
> > Most of cases the config would be turned on, so the ifdef is not that
> > useful. Can you add static key to avoid searching the skb-ext in non
> > offload cases.
>
> Hi,
>
> What do you mean by a static key?
>
https://www.kernel.org/doc/Documentation/static-keys.txt
Static key can be enabled when a flow is added to the tc filter.
^ permalink raw reply
* Re: [patch net-next rfc 3/7] net: rtnetlink: add commands to add and delete alternative ifnames
From: Roopa Prabhu @ 2019-08-12 16:23 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Jiri Pirko, David Ahern, netdev, David Miller, Jakub Kicinski,
Stephen Hemminger, dcbw, Michal Kubecek, Andrew Lunn, parav,
Saeed Mahameed, mlxsw
In-Reply-To: <20190812084039.2fbd1f01@hermes.lan>
On Mon, Aug 12, 2019 at 8:40 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Mon, 12 Aug 2019 10:31:39 +0200
> Jiri Pirko <jiri@resnulli.us> wrote:
>
> > Mon, Aug 12, 2019 at 03:37:26AM CEST, dsahern@gmail.com wrote:
> > >On 8/11/19 7:34 PM, David Ahern wrote:
> > >> On 8/10/19 12:30 AM, Jiri Pirko wrote:
> > >>> Could you please write me an example message of add/remove?
> > >>
> > >> altnames are for existing netdevs, yes? existing netdevs have an id and
> > >> a name - 2 existing references for identifying the existing netdev for
> > >> which an altname will be added. Even using the altname as the main
> > >> 'handle' for a setlink change, I see no reason why the GETLINK api can
> > >> not take an the IFLA_ALT_IFNAME and return the full details of the
> > >> device if the altname is unique.
> > >>
> > >> So, what do the new RTM commands give you that you can not do with
> > >> RTM_*LINK?
> > >>
> > >
> > >
> > >To put this another way, the ALT_NAME is an attribute of an object - a
> > >LINK. It is *not* a separate object which requires its own set of
> > >commands for manipulating.
> >
> > Okay, again, could you provide example of a message to add/remove
> > altname using existing setlink message? Thanks!
>
> The existing IFALIAS takes an empty name to do deletion.
yes, if its a single alt name, keeping it similar to IFALIAS will help
^ permalink raw reply
* Re: [PATCH v5 18/29] compat_ioctl: move rfcomm handlers into driver
From: Marcel Holtmann @ 2019-08-12 16:29 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Alexander Viro, linux-fsdevel, linux-kernel, Johan Hedberg,
David S. Miller, Mauro Carvalho Chehab, linux-bluetooth, netdev
In-Reply-To: <20190730195819.901457-6-arnd@arndb.de>
Hi Arnd,
> All these ioctl commands are compatible, so we can handle
> them with a trivial wrapper in rfcomm/sock.c and remove
> the listing in fs/compat_ioctl.c.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> fs/compat_ioctl.c | 6 ------
> net/bluetooth/rfcomm/sock.c | 14 ++++++++++++--
> 2 files changed, 12 insertions(+), 8 deletions(-)
I think it is best if this series is applied as a whole. So whoever takes it
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Regards
Marcel
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox