* Re: [net PATCH 1/2] virtio_net: cap mtu when XDP programs are running
From: Michael S. Tsirkin @ 2017-01-05 3:18 UTC (permalink / raw)
To: John Fastabend; +Cc: jasowang, john.r.fastabend, netdev
In-Reply-To: <20170105031118.2636.82374.stgit@john-Precision-Tower-5810>
On Wed, Jan 04, 2017 at 07:11:18PM -0800, John Fastabend wrote:
> XDP programs can not consume multiple pages so we cap the MTU to
> avoid this case. Virtio-net however only checks the MTU at XDP
> program load and does not block MTU changes after the program
> has loaded.
Do drivers really have to tweak max mtu all the time?
Seems strange, I would say drivers just report device caps
and net core enforces rules.
Can't net core do these checks?
>
> This patch sets/clears the max_mtu value at XDP load/unload time.
>
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> ---
> drivers/net/virtio_net.c | 26 ++++++++++++++++++++++----
> 1 file changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 4a10500..261103d9 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1699,11 +1699,28 @@ static void virtnet_init_settings(struct net_device *dev)
> .set_settings = virtnet_set_settings,
> };
>
> +#define MIN_MTU ETH_MIN_MTU
> +#define MAX_MTU ETH_MAX_MTU
> +
> +static unsigned long int virtnet_xdp_mtu(struct bpf_prog *prog,
> + struct virtnet_info *vi)
> +{
> + if (!prog && virtio_has_feature(vi->vdev, VIRTIO_NET_F_MTU))
> + return virtio_cread16(vi->vdev,
> + offsetof(struct virtio_net_config, mtu));
> + else if (!prog)
> + return ETH_MAX_MTU;
> + else if (vi->mergeable_rx_bufs)
> + return PAGE_SIZE - sizeof(struct padded_vnet_hdr);
> + else
> + return GOOD_PACKET_LEN;
> +}
> +
> static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
> {
> - unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
> struct virtnet_info *vi = netdev_priv(dev);
> struct bpf_prog *old_prog;
> + unsigned long int max_sz;
> u16 xdp_qp = 0, curr_qp;
> int i, err;
>
> @@ -1720,6 +1737,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
> return -EINVAL;
> }
>
> + max_sz = virtnet_xdp_mtu(prog, vi);
> if (dev->mtu > max_sz) {
> netdev_warn(dev, "XDP requires MTU less than %lu\n", max_sz);
> return -EINVAL;
> @@ -1748,6 +1766,9 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
> virtnet_set_queues(vi, curr_qp);
> return PTR_ERR(prog);
> }
> + dev->max_mtu = max_sz;
> + } else {
> + dev->max_mtu = ETH_MAX_MTU;
> }
>
> vi->xdp_queue_pairs = xdp_qp;
> @@ -2133,9 +2154,6 @@ static bool virtnet_validate_features(struct virtio_device *vdev)
> return true;
> }
>
> -#define MIN_MTU ETH_MIN_MTU
> -#define MAX_MTU ETH_MAX_MTU
> -
> static int virtnet_probe(struct virtio_device *vdev)
> {
> int i, err;
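
For readers following the quoted patch, the MTU selection it introduces can be modelled in userspace roughly as below. This is only a sketch: the sketch_* names, the 4 KiB page size and the hdr_len field (standing in for sizeof(struct padded_vnet_hdr)) are assumptions made for illustration, not driver code.

```c
#include <stdbool.h>

/* Illustrative constants; the real values come from kernel headers. */
#define SKETCH_ETH_MAX_MTU     0xFFFFUL /* mirrors ETH_MAX_MTU */
#define SKETCH_PAGE_SIZE       4096UL   /* assumption: 4 KiB pages */
#define SKETCH_GOOD_PACKET_LEN 1518UL   /* assumption: ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN */

/* Hypothetical stand-in for the parts of struct virtnet_info used here. */
struct sketch_vi {
	bool has_mtu_feature;     /* VIRTIO_NET_F_MTU negotiated */
	unsigned long device_mtu; /* value the device reports in config space */
	bool mergeable_rx_bufs;
	unsigned long hdr_len;    /* stands in for sizeof(struct padded_vnet_hdr) */
};

/* Same decision order as virtnet_xdp_mtu() in the quoted patch. */
unsigned long sketch_xdp_max_mtu(bool xdp_loaded, const struct sketch_vi *vi)
{
	if (!xdp_loaded)
		return vi->has_mtu_feature ? vi->device_mtu : SKETCH_ETH_MAX_MTU;
	if (vi->mergeable_rx_bufs)
		return SKETCH_PAGE_SIZE - vi->hdr_len; /* packet must fit in one page */
	return SKETCH_GOOD_PACKET_LEN;
}

/* Models the bound the net core applies once dev->max_mtu is set. */
int sketch_change_mtu(unsigned long new_mtu, unsigned long max_mtu)
{
	return new_mtu > max_mtu ? -1 : 0;
}
```

With an XDP program loaded and mergeable buffers, the cap in this model is the page size minus the header length, so a later request for a jumbo MTU is refused instead of silently producing multi-page packets.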
^ permalink raw reply
* Re: [PATCH net-next] packet: fix panic in __packet_set_timestamp on tpacket_v3 in tx mode
From: David Miller @ 2017-01-05 4:51 UTC (permalink / raw)
To: daniel; +Cc: sowmini.varadhan, willemb, netdev
In-Reply-To: <1483580068-13854-1-git-send-email-daniel@iogearbox.net>
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu, 5 Jan 2017 02:34:28 +0100
> When TX timestamping is in use with TPACKET_V3's TX ring, then we'll
> hit the BUG() in __packet_set_timestamp() when ring buffer slot is
> returned to user space via tpacket_destruct_skb(). This is due to v3
> being assumed as unreachable here, but since 7f953ab2ba46 ("af_packet:
> TX_RING support for TPACKET_V3") it's not anymore. Fix it by filling
> the timestamp back into the ring slot.
>
> Fixes: 7f953ab2ba46 ("af_packet: TX_RING support for TPACKET_V3")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Applied, thanks.
^ permalink raw reply
* [PATCH net-next] cxgb4: Synchronize access to mailbox
From: Hariprasad Shenai @ 2017-01-05 5:53 UTC (permalink / raw)
To: netdev; +Cc: davem, leedom, nirranjan, ganeshgr, Hariprasad Shenai
The issue arises when multiple threads attempt to use
the mailbox facility at the same time.
When DCB operations and interface up/down are run in a loop every
0.1 sec, we observed mailbox collisions. With the present code, one of
the two commands fails, since we don't queue the second
command.
To overcome this issue, add a queue for mailbox access.
Whenever a mailbox command is issued, add it to the queue. If it's at
the head, issue the mailbox command; otherwise wait for the existing command
to complete. A command usually takes less than a millisecond to
complete.
Also time out of the loop if the command under execution takes
too long to run.
In practice, mailbox access collisions are going to be
very rare, since no one runs such an abusive script.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 8 ++++
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 3 ++
drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 59 ++++++++++++++++++++++++-
3 files changed, 69 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 0bce1bf9ca0f..78a852c72f5d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -782,6 +782,10 @@ struct vf_info {
bool pf_set_mac;
};
+struct mbox_list {
+ struct list_head list;
+};
+
struct adapter {
void __iomem *regs;
void __iomem *bar2;
@@ -844,6 +848,10 @@ struct adapter {
struct work_struct db_drop_task;
bool tid_release_task_busy;
+ /* lock for mailbox cmd list */
+ spinlock_t mbox_lock;
+ struct mbox_list mlist;
+
/* support for mailbox command/reply logging */
#define T4_OS_LOG_MBOX_CMDS 256
struct mbox_cmd_log *mbox_log;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 6f951877430b..34ceb3518dd4 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -4707,6 +4707,9 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
spin_lock_init(&adapter->stats_lock);
spin_lock_init(&adapter->tid_release_lock);
spin_lock_init(&adapter->win0_lock);
+ spin_lock_init(&adapter->mbox_lock);
+
+ INIT_LIST_HEAD(&adapter->mlist.list);
INIT_WORK(&adapter->tid_release_task, process_tid_release_list);
INIT_WORK(&adapter->db_full_task, process_db_full);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index e8139514d32c..7ac6ea531b0f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -284,6 +284,7 @@ int t4_wr_mbox_meat_timeout(struct adapter *adap, int mbox, const void *cmd,
1, 1, 3, 5, 10, 10, 20, 50, 100, 200
};
+ struct mbox_list entry;
u16 access = 0;
u16 execute = 0;
u32 v;
@@ -311,11 +312,61 @@ int t4_wr_mbox_meat_timeout(struct adapter *adap, int mbox, const void *cmd,
timeout = -timeout;
}
+ /* Queue ourselves onto the mailbox access list. When our entry is at
+ * the front of the list, we have rights to access the mailbox. So we
+ * wait [for a while] till we're at the front [or bail out with an
+ * EBUSY] ...
+ */
+ spin_lock(&adap->mbox_lock);
+ list_add_tail(&entry.list, &adap->mlist.list);
+ spin_unlock(&adap->mbox_lock);
+
+ delay_idx = 0;
+ ms = delay[0];
+
+ for (i = 0; ; i += ms) {
+ /* If we've waited too long, return a busy indication. This
+ * really ought to be based on our initial position in the
+ * mailbox access list but this is a start. We very rarely
+ * contend on access to the mailbox ...
+ */
+ if (i > FW_CMD_MAX_TIMEOUT) {
+ spin_lock(&adap->mbox_lock);
+ list_del(&entry.list);
+ spin_unlock(&adap->mbox_lock);
+ ret = -EBUSY;
+ t4_record_mbox(adap, cmd, size, access, ret);
+ return ret;
+ }
+
+ /* If we're at the head, break out and start the mailbox
+ * protocol.
+ */
+ if (list_first_entry(&adap->mlist.list, struct mbox_list,
+ list) == &entry)
+ break;
+
+ /* Delay for a bit before checking again ... */
+ if (sleep_ok) {
+ ms = delay[delay_idx]; /* last element may repeat */
+ if (delay_idx < ARRAY_SIZE(delay) - 1)
+ delay_idx++;
+ msleep(ms);
+ } else {
+ mdelay(ms);
+ }
+ }
+
+ /* Loop trying to get ownership of the mailbox. Return an error
+ * if we can't gain ownership.
+ */
v = MBOWNER_G(t4_read_reg(adap, ctl_reg));
for (i = 0; v == MBOX_OWNER_NONE && i < 3; i++)
v = MBOWNER_G(t4_read_reg(adap, ctl_reg));
-
if (v != MBOX_OWNER_DRV) {
+ spin_lock(&adap->mbox_lock);
+ list_del(&entry.list);
+ spin_unlock(&adap->mbox_lock);
ret = (v == MBOX_OWNER_FW) ? -EBUSY : -ETIMEDOUT;
t4_record_mbox(adap, cmd, MBOX_LEN, access, ret);
return ret;
@@ -366,6 +417,9 @@ int t4_wr_mbox_meat_timeout(struct adapter *adap, int mbox, const void *cmd,
execute = i + ms;
t4_record_mbox(adap, cmd_rpl,
MBOX_LEN, access, execute);
+ spin_lock(&adap->mbox_lock);
+ list_del(&entry.list);
+ spin_unlock(&adap->mbox_lock);
return -FW_CMD_RETVAL_G((int)res);
}
}
@@ -375,6 +429,9 @@ int t4_wr_mbox_meat_timeout(struct adapter *adap, int mbox, const void *cmd,
dev_err(adap->pdev_dev, "command %#x in mailbox %d timed out\n",
*(const u8 *)cmd, mbox);
t4_report_fw_error(adap);
+ spin_lock(&adap->mbox_lock);
+ list_del(&entry.list);
+ spin_unlock(&adap->mbox_lock);
return ret;
}
--
2.3.4
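
The queueing scheme described in this patch (add yourself at the tail, proceed only when you are at the head, back off while waiting) can be sketched in plain C as follows. It is a single-threaded model under stated assumptions: the sketch_* helpers imitate the kernel's list_head API rather than use it, and the locking that the patch does with mbox_lock is left out.

```c
#include <stddef.h>
#include <stdbool.h>

/* Minimal intrusive doubly linked list, modelled on the kernel's list_head. */
struct sketch_list {
	struct sketch_list *prev, *next;
};

void sketch_list_init(struct sketch_list *head)
{
	head->prev = head->next = head;
}

/* Append a waiter at the tail of the mailbox queue. */
void sketch_list_add_tail(struct sketch_list *entry, struct sketch_list *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Remove a waiter once its command finished, failed, or timed out. */
void sketch_list_del(struct sketch_list *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->prev = entry->next = entry;
}

/* A waiter may touch the mailbox only while its entry is first in the queue. */
bool sketch_is_at_head(const struct sketch_list *entry,
		       const struct sketch_list *head)
{
	return head->next == entry;
}

/* Back-off schedule from the patch: walk the table, then repeat the last element. */
int sketch_next_delay_ms(size_t *idx)
{
	static const int delay[] = { 1, 1, 3, 5, 10, 10, 20, 50, 100, 200 };
	int ms = delay[*idx];

	if (*idx < sizeof(delay) / sizeof(delay[0]) - 1)
		(*idx)++;
	return ms;
}
```

Every exit path has to remove the entry again, which is why the patch repeats the list_del() sequence before each return.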
^ permalink raw reply related
* Re: [PATCH nf-next 4/4] netfilter: merge ctinfo into nfct pointer storage area
From: kbuild test robot @ 2017-01-05 6:03 UTC (permalink / raw)
To: Florian Westphal; +Cc: kbuild-all, netfilter-devel, netdev, Florian Westphal
In-Reply-To: <1483544150-10686-5-git-send-email-fw@strlen.de>
[-- Attachment #1: Type: text/plain, Size: 2017 bytes --]
Hi Florian,
[auto build test WARNING on nf-next/master]
url: https://github.com/0day-ci/linux/commits/Florian-Westphal/netfilter-skbuff-merge-nfctinfo-bits-and-nfct-pointer/20170105-133727
base: https://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git master
config: x86_64-randconfig-x002-201701 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All warnings (new ones prefixed by >>):
net/ipv4/netfilter/nf_defrag_ipv4.c: In function 'ipv4_conntrack_defrag':
>> net/ipv4/netfilter/nf_defrag_ipv4.c:78:2: warning: this 'if' clause does not guard... [-Wmisleading-indentation]
if (skb->_nfct && !nf_ct_is_template((struct nf_conn *) skb_nfct(skb)));
^~
net/ipv4/netfilter/nf_defrag_ipv4.c:79:3: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the 'if'
return NF_ACCEPT;
^~~~~~
vim +/if +78 net/ipv4/netfilter/nf_defrag_ipv4.c
62 }
63
64 static unsigned int ipv4_conntrack_defrag(void *priv,
65 struct sk_buff *skb,
66 const struct nf_hook_state *state)
67 {
68 struct sock *sk = skb->sk;
69
70 if (sk && sk_fullsock(sk) && (sk->sk_family == PF_INET) &&
71 inet_sk(sk)->nodefrag)
72 return NF_ACCEPT;
73
74 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
75 #if !IS_ENABLED(CONFIG_NF_NAT)
76 /* Previously seen (loopback)? Ignore. Do this before
77 fragment check. */
> 78 if (skb->_nfct && !nf_ct_is_template((struct nf_conn *) skb_nfct(skb)));
79 return NF_ACCEPT;
80 #endif
81 #endif
82 /* Gather fragments. */
83 if (ip_is_fragment(ip_hdr(skb))) {
84 enum ip_defrag_users user =
85 nf_ct_defrag_user(state->hook, skb);
86
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24700 bytes --]
^ permalink raw reply
* [PATCHv2 1/1] r8169: fix the typo in the comment
From: Zhu Yanjun @ 2017-01-05 8:02 UTC (permalink / raw)
To: nic_swsd, netdev; +Cc: Zhu Yanjun
From the Realtek data sheet, the PID0 should be bit 0.
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
---
Change from v1 to v2:
change the commit header.
drivers/net/ethernet/realtek/r8169.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 44389c9..8f1623b 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -696,7 +696,7 @@ enum rtl_tx_desc_bit_1 {
enum rtl_rx_desc_bit {
/* Rx private */
PID1 = (1 << 18), /* Protocol ID bit 1/2 */
- PID0 = (1 << 17), /* Protocol ID bit 2/2 */
+ PID0 = (1 << 17), /* Protocol ID bit 0/2 */
#define RxProtoUDP (PID1)
#define RxProtoTCP (PID0)
--
2.7.4
^ permalink raw reply related
* Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
From: Simon Horman @ 2017-01-05 8:03 UTC (permalink / raw)
To: Paul Blakey
Cc: Jamal Hadi Salim, John Fastabend, David S. Miller, netdev,
Jiri Pirko, Hadar Hen Zion, Or Gerlitz, Roi Dayan, Roman Mashak
In-Reply-To: <786655e9-de6a-29e9-a043-207afedcedc2@mellanox.com>
On Wed, Jan 04, 2017 at 01:45:28PM +0200, Paul Blakey wrote:
> On 04/01/2017 12:14, Simon Horman wrote:
> >On Tue, Jan 03, 2017 at 02:22:05PM +0200, Paul Blakey wrote:
> >>
> >>On 03/01/2017 13:44, Jamal Hadi Salim wrote:
> >>>On 17-01-02 11:33 PM, John Fastabend wrote:
> >>>>On 17-01-02 05:22 PM, Jamal Hadi Salim wrote:
> >>>[..]
> >>>>>Like all cookie semantics it is for storing state. The receiver
> >>>>>(kernel) is to just store it and not interpret it. Reading it back
> >>>>>simplifies what the user has to do for their processing.
> >>>>>
> >>>>>>The tuple <ifindex:qdisc:prio:handle> really should be unique why
> >>>>>>not use this for system wide mappings?
> >>>>>>
> >>>>>I think on a single machine should be enough, however:
> >>>>>typically the user wants to define the value in a manner that
> >>>>>in a distributed system it is unique. It would be trickier to
> >>>>>do so with well defined values such as above.
> >>>>>
> >>>>Just extend the tuple <hostname:ifindex:qdisc:prio:handle> that
> >>>>should be unique in the domain of hostname's, or use some other domain
> >>>>wide machine identifier.
> >>>>
> >>>May work for the case of filter identification. The nice thing about
> >>>allowing cookies is you can let the user define their
> >>>own scheme.
> >>>
> >>>>Although actions can be shared so the cookie can be shared across
> >>>>filters. Maybe it's useful but it doesn't uniquely identify a filter
> >>>>in the shared case but the user would have to specify that case
> >>>>so maybe it's not important.
> >>>>
> >>>Note: the action cookies and filter cookies are unrelated/orthogonal.
> >>>Their basic concept of stashing something in the cookie to help improve
> >>>what user space does (in our case millions of actions of which some are
> >>>used for accounting) is similar.
> >>>I have no objections to the flow cookies; my main concern was it should
> >>>be applicable to all classifiers not just flower. And the arbitrary size
> >>>of the cookie that you pointed out is questionable.
> >>>
> >>>cheers,
> >>>jamal
> >>
> >>Hi all,
> >>Our use case is replacing OVS rules with TC filters for HW offload, and
> >>you are right the cookie would
> >>have saved us the mapping from OVS rule ufid to the tc filter handle/prio...
> >>that was generated for it.
> >>It also was going to be used to store other info like which OVS output port
> >>corresponds to the ifindex,
> >Possibly off-topic but I am curious to know why you need to store the port.
> >My possibly naïve assumption is that a filter is attached to the netdev
> >corresponding to the input port and mirred or other actions are used to output
> >to netdevs corresponding to output ports.
>
> Right, it's for the output ports. OVS uses ovs port numbers and mirred action
> uses the device ifindex, so there is a need
> to translate it back to OVS port on dump.
Understood, that is a tedious abstraction to support.
But I don't see an easy way around it at this time.
If I read Jamal's emails correctly he is working on per-action cookies.
They may be better than per-flow cookies for storing the OvS port number
(though not the UUID of the flow).
...
^ permalink raw reply
* [PATCHv2 1/1] r8169: fix the typo in the comment
From: yanjun.zhu @ 2017-01-05 7:54 UTC (permalink / raw)
To: nic_swsd, netdev; +Cc: Zhu Yanjun
From: Zhu Yanjun <yanjun.zhu@oracle.com>
From the Realtek data sheet, the PID0 should be bit 0.
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
---
Change from v1 to v2:
change the commit header.
drivers/net/ethernet/realtek/r8169.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 44389c9..8f1623b 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -696,7 +696,7 @@ enum rtl_tx_desc_bit_1 {
enum rtl_rx_desc_bit {
/* Rx private */
PID1 = (1 << 18), /* Protocol ID bit 1/2 */
- PID0 = (1 << 17), /* Protocol ID bit 2/2 */
+ PID0 = (1 << 17), /* Protocol ID bit 0/2 */
#define RxProtoUDP (PID1)
#define RxProtoTCP (PID0)
--
2.7.4
^ permalink raw reply related
* [PATCH] net: ethoc: Remove unused members from struct ethoc
From: Tobias Klauser @ 2017-01-05 8:16 UTC (permalink / raw)
To: netdev; +Cc: davem, f.fainelli, thierry.reding, andrew, colin.king, tremyfr
The io_region_size and dma_alloc members of struct ethoc are only
written but never read, so they might as well be removed.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
---
drivers/net/ethernet/ethoc.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/net/ethernet/ethoc.c b/drivers/net/ethernet/ethoc.c
index 45abc81f6f55..63e5e14174ee 100644
--- a/drivers/net/ethernet/ethoc.c
+++ b/drivers/net/ethernet/ethoc.c
@@ -180,8 +180,6 @@ MODULE_PARM_DESC(buffer_size, "DMA buffer allocation size");
* struct ethoc - driver-private device structure
* @iobase: pointer to I/O memory region
* @membase: pointer to buffer memory region
- * @dma_alloc: dma allocated buffer size
- * @io_region_size: I/O memory region size
* @num_bd: number of buffer descriptors
* @num_tx: number of send buffers
* @cur_tx: last send buffer written
@@ -199,8 +197,6 @@ MODULE_PARM_DESC(buffer_size, "DMA buffer allocation size");
struct ethoc {
void __iomem *iobase;
void __iomem *membase;
- int dma_alloc;
- resource_size_t io_region_size;
bool big_endian;
unsigned int num_bd;
@@ -1096,8 +1092,6 @@ static int ethoc_probe(struct platform_device *pdev)
/* setup driver-private data */
priv = netdev_priv(netdev);
priv->netdev = netdev;
- priv->dma_alloc = 0;
- priv->io_region_size = resource_size(mmio);
priv->iobase = devm_ioremap_nocache(&pdev->dev, netdev->base_addr,
resource_size(mmio));
@@ -1127,7 +1121,6 @@ static int ethoc_probe(struct platform_device *pdev)
goto free;
}
netdev->mem_end = netdev->mem_start + buffer_size;
- priv->dma_alloc = buffer_size;
}
priv->big_endian = pdata ? pdata->big_endian :
--
2.11.0
^ permalink raw reply related
* Re: [PATCH nf-next 4/4] netfilter: merge ctinfo into nfct pointer storage area
From: kbuild test robot @ 2017-01-05 8:31 UTC (permalink / raw)
To: Florian Westphal; +Cc: kbuild-all, netfilter-devel, netdev, Florian Westphal
In-Reply-To: <1483544150-10686-5-git-send-email-fw@strlen.de>
[-- Attachment #1: Type: text/plain, Size: 1725 bytes --]
Hi Florian,
[auto build test WARNING on nf-next/master]
url: https://github.com/0day-ci/linux/commits/Florian-Westphal/netfilter-skbuff-merge-nfctinfo-bits-and-nfct-pointer/20170105-133727
base: https://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git master
config: x86_64-randconfig-b0-01051551 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All warnings (new ones prefixed by >>):
net/ipv4/netfilter/nf_dup_ipv4.c: In function 'nf_dup_ipv4':
>> net/ipv4/netfilter/nf_dup_ipv4.c:56: warning: unused variable 'untracked'
vim +/untracked +56 net/ipv4/netfilter/nf_dup_ipv4.c
40 fl4.flowi4_flags = FLOWI_FLAG_KNOWN_NH;
41 rt = ip_route_output_key(net, &fl4);
42 if (IS_ERR(rt))
43 return false;
44
45 skb_dst_drop(skb);
46 skb_dst_set(skb, &rt->dst);
47 skb->dev = rt->dst.dev;
48 skb->protocol = htons(ETH_P_IP);
49
50 return true;
51 }
52
53 void nf_dup_ipv4(struct net *net, struct sk_buff *skb, unsigned int hooknum,
54 const struct in_addr *gw, int oif)
55 {
> 56 struct nf_conn *untracked;
57 struct iphdr *iph;
58
59 if (this_cpu_read(nf_skb_duplicated))
60 return;
61 /*
62 * Copy the skb, and route the copy. Will later return %XT_CONTINUE for
63 * the original skb, which should continue on its way as if nothing has
64 * happened. The copy should be independently delivered to the gateway.
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28934 bytes --]
^ permalink raw reply
* Re: [PATCH] phy state machine: failsafe leave invalid RUNNING state
From: Zefir Kurtisi @ 2017-01-05 9:23 UTC (permalink / raw)
To: Florian Fainelli, netdev; +Cc: andrew
In-Reply-To: <b482e869-cc00-8b19-f18d-eb0a2d2ba67d@gmail.com>
On 01/04/2017 10:44 PM, Florian Fainelli wrote:
> On 01/04/2017 08:10 AM, Zefir Kurtisi wrote:
>> On 01/04/2017 04:30 PM, Florian Fainelli wrote:
>>>
>>>
>>> On 01/04/2017 07:27 AM, Zefir Kurtisi wrote:
>>>> On 01/04/2017 04:13 PM, Florian Fainelli wrote:
>>>>>
>>>>>
>>>>> On 01/04/2017 07:04 AM, Zefir Kurtisi wrote:
>>>>>> While in RUNNING state, phy_state_machine() checks for link changes by
>>>>>> comparing phydev->link before and after calling phy_read_status().
>>>>>> This works as long as it is guaranteed that phydev->link is never
>>>>>> changed outside the phy_state_machine().
>>>>>>
>>>>>> If in some setups this happens, it causes the state machine to miss
>>>>>> a link loss and remain RUNNING despite phydev->link being 0.
>>>>>>
>>>>>> This has been observed running a dsa setup with a process continuously
>>>>>> polling the link states over ethtool each second (SNMPD RFC-1213
>>>>>> agent). Disconnecting the link on a phy followed by a ETHTOOL_GSET
>>>>>> causes dsa_slave_get_settings() / dsa_slave_get_link_ksettings() to
>>>>>> call phy_read_status() and with that modify the link status - and
>>>>>> with that bricking the phy state machine.
>>>>>
>>>>> That's the interesting part of the analysis, how does this brick the PHY
>>>>> state machine? Is the PHY driver changing the link status in the
>>>>> read_status callback that it implements?
>>>>>
>>>> phydev->read_status points to genphy_read_status(), where the first call goes to
>>>> genphy_update_link() which updates the link status.
>>>>
>>>> Thereafter phy_state_machine():RUNNING won't be able to detect the link loss
>>>> anymore unless the link state changes again.
>>>>
>>>>
>>>> I was trying to figure out if there is a rule that forbids changing phydev->link
>>>> from outside the state machine, but found several places where it happens (either
>>>> directly, or over genphy_read_status() or over genphy_update_link()).
>>>>
>>>> Curious how this did not show up before, since within the dsa setup it is very
>>>> easy to trigger:
>>>> a) physically disconnect link
>>>> b) within one second run ethtool ethX
>>>
>>> You need to be more specific here about what "the dsa setup" is, drivers
>>> involved, which ports of the switch you are seeing this with (user
>>> facing, CPU port, DSA port?) etc.
>>>
>> I am working on top of LEDE and with that at kernel 4.4.21 - alas I checked the
>> related source files and believe the effect should be reproducible with HEAD.
>>
>> The setup is as follows:
>> mv88e6321:
>> * ports 0+1 connected to fibre-optics transceivers at fixed 100 Mbps
>> * port 4 is CPU port
>> * custom phy driver (replacement for marvell.ko) only populated with
>> * .config_init to
>> * set fixed speed for ports 0+1 (when in FO mode)
>> * run genphy_config_init() for all other modes (here: CPU port)
>> * .config_aneg=genphy_config_aneg, .read_status=genphy_read_status
>>
>>
>> To my understanding, the exact setup is irrelevant - to reproduce the issue it is
>> enough to have a means of running genphy_update_link() (as done in e.g.
>> mediatek/mtk_eth_soc.c, dsa/slave.c), or genphy_read_status() (as done in e.g.
>> hisilicon/hns/hns_enet.c) or phy_read_status() (as done in e.g.
>> ethernet/ti/netcp_ethss.c, ethernet/aeroflex/greth.c, etc.). In the observed
>> drivers it is mostly implemented in the ETHTOOL_GSET execution path.
>>
>> Once you get the link state updated outside the phy state machine, it remains in
>> invalid RUNNING. To prevent that invalid state, to my understanding upper layer
>> drivers (Ethernet, dsa) must not modify link-states in any case (including calling
>> the functions noted above), or we need the proposed fail-safe mechanism to prevent
>> getting stuck.
>
> OK, I see the code path involved now, sorry -ENOCOFFEE when I initially
> responded. Yes, clearly, we should not be mangling the PHY device's link
> by calling genphy_read_status(). At first glance, none of the users
> below should be doing what they are doing, but let's kick a separate
> patch series to collect feedback from the driver writers.
>
> Thanks!
>
Ok, thanks for taking time.
The kbuild test robot error is due to 'struct device dev' having been removed from
phy_device struct since 4.4.21. Does it make sense to provide a v2 fixing that, or
do you expect that this fail-safe mechanism is not needed once all Ethernet/dsa
drivers are fixed?
I think it won't hurt to add the check simply to ensure that it got fixed and the
issue is not popping up thereafter.
Cheers,
Zefir
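
The failure mode discussed in this thread can be reproduced in a toy model. The sketch below is an assumption-laden simplification: the sketch_* names are invented, the real state machine has more states, and the "failsafe" branch only illustrates the idea of a level check rather than the actual patch.

```c
enum sketch_state { SKETCH_RUNNING, SKETCH_NOLINK };

struct sketch_phy {
	int link;                /* cached link status, like phydev->link */
	enum sketch_state state;
};

/* Stands in for phy_read_status(): refreshes the cached link from hardware. */
void sketch_read_status(struct sketch_phy *phy, int hw_link)
{
	phy->link = hw_link;
}

/* One poll of the RUNNING state; 'failsafe' enables the extra level check. */
void sketch_poll(struct sketch_phy *phy, int hw_link, int failsafe)
{
	int old_link = phy->link;

	if (phy->state != SKETCH_RUNNING)
		return;
	sketch_read_status(phy, hw_link);
	if (old_link != phy->link && !phy->link)
		phy->state = SKETCH_NOLINK; /* edge detected: link went down */
	else if (failsafe && !phy->link)
		phy->state = SKETCH_NOLINK; /* level check: RUNNING without link */
}
```

If something outside the poll calls sketch_read_status() after the cable is pulled, the next poll sees old and new link both at 0, so the edge comparison never fires; only the level check moves the model out of RUNNING.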
^ permalink raw reply
* Re: [PATCH net-next V2 3/3] tun: rx batching
From: Stefan Hajnoczi @ 2017-01-05 9:27 UTC (permalink / raw)
To: Jason Wang; +Cc: netdev, virtualization, linux-kernel, kvm, mst
In-Reply-To: <73da2ef8-2454-5614-d637-0ce7c5287433@redhat.com>
[-- Attachment #1.1: Type: text/plain, Size: 830 bytes --]
On Wed, Jan 04, 2017 at 11:03:32AM +0800, Jason Wang wrote:
> On 2017年01月03日 21:33, Stefan Hajnoczi wrote:
> > On Wed, Dec 28, 2016 at 04:09:31PM +0800, Jason Wang wrote:
> > > +static int tun_rx_batched(struct tun_file *tfile, struct sk_buff *skb,
> > > + int more)
> > > +{
> > > + struct sk_buff_head *queue = &tfile->sk.sk_write_queue;
> > > + struct sk_buff_head process_queue;
> > > + int qlen;
> > > + bool rcv = false;
> > > +
> > > + spin_lock(&queue->lock);
> > Should this be spin_lock_bh()? Below and in tun_get_user() there are
> > explicit local_bh_disable() calls so I guess BHs can interrupt us here
> > and this would deadlock.
>
> sk_write_queue is accessed only in this function, which runs under process
> context, so no need for spin_lock_bh() here.
I see, thanks!
Stefan
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]
^ permalink raw reply
* Re: [PATCH] sh_eth: R8A7740 supports packet shecksumming
From: Sergei Shtylyov @ 2017-01-05 9:33 UTC (permalink / raw)
To: netdev, linux-renesas-soc, David Miller
In-Reply-To: <1871204.P0zXJRoIfd@wasted.cogentembedded.com>
Hello!
Oops, typo in the subject, "shecksumming". David, should I resend?
MBR, Sergei
^ permalink raw reply
* Re: [net PATCH 1/2] virtio_net: cap mtu when XDP programs are running
From: Jason Wang @ 2017-01-05 9:34 UTC (permalink / raw)
To: Michael S. Tsirkin, John Fastabend; +Cc: john.r.fastabend, netdev
In-Reply-To: <20170105051641-mutt-send-email-mst@kernel.org>
On 2017年01月05日 11:18, Michael S. Tsirkin wrote:
> On Wed, Jan 04, 2017 at 07:11:18PM -0800, John Fastabend wrote:
>> XDP programs can not consume multiple pages so we cap the MTU to
>> avoid this case. Virtio-net however only checks the MTU at XDP
>> program load and does not block MTU changes after the program
>> has loaded.
> Do drivers really have to tweak max mtu all the time?
> Seems strange, I would say drivers just report device caps
> and net core enforces rules.
> Can't net core do these checks?
I think this needs host co-operation, at least this patch prevents user
from misconfiguring mtu in guest.
^ permalink raw reply
* [PATCH] net: xilinx: emaclite: Remove xemaclite_remove_ndev()
From: Tobias Klauser @ 2017-01-05 9:41 UTC (permalink / raw)
To: netdev; +Cc: michal.simek, soren.brinkmann, davem
xemaclite_remove_ndev() is a simple wrapper around free_netdev()
checking for NULL before the call. All possible paths calling
it are guaranteed to pass a non-NULL argument, so rather call
free_netdev() directly.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
---
drivers/net/ethernet/xilinx/xilinx_emaclite.c | 18 ++----------------
1 file changed, 2 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 93dc10b10c09..97dcc0bd5a85 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -1029,20 +1029,6 @@ static int xemaclite_send(struct sk_buff *orig_skb, struct net_device *dev)
}
/**
- * xemaclite_remove_ndev - Free the network device
- * @ndev: Pointer to the network device to be freed
- *
- * This function un maps the IO region of the Emaclite device and frees the net
- * device.
- */
-static void xemaclite_remove_ndev(struct net_device *ndev)
-{
- if (ndev) {
- free_netdev(ndev);
- }
-}
-
-/**
* get_bool - Get a parameter from the OF device
* @ofdev: Pointer to OF device structure
* @s: Property to be retrieved
@@ -1172,7 +1158,7 @@ static int xemaclite_of_probe(struct platform_device *ofdev)
return 0;
error:
- xemaclite_remove_ndev(ndev);
+ free_netdev(ndev);
return rc;
}
@@ -1204,7 +1190,7 @@ static int xemaclite_of_remove(struct platform_device *of_dev)
of_node_put(lp->phy_node);
lp->phy_node = NULL;
- xemaclite_remove_ndev(ndev);
+ free_netdev(ndev);
return 0;
}
--
2.11.0
^ permalink raw reply related
* [PATCH v3 net-next] net:mv88e6xxx: use g2 interrupt for 6097 chip
From: Volodymyr Bendiuga @ 2017-01-05 9:44 UTC (permalink / raw)
To: andrew, vivien.didelot, f.fainelli, netdev, volodymyr.bendiuga
Cc: Volodymyr Bendiuga
This chip needs MV88E6XXX_FLAG_G2_INT
Signed-off-by: Volodymyr Bendiuga <volodymyr.bendiuga@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---
drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index dcb1b81..474e715 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -568,6 +568,7 @@ enum mv88e6xxx_cap {
(MV88E6XXX_FLAG_G1_ATU_FID | \
MV88E6XXX_FLAG_G1_VTU_FID | \
MV88E6XXX_FLAG_GLOBAL2 | \
+ MV88E6XXX_FLAG_G2_INT | \
MV88E6XXX_FLAG_G2_MGMT_EN_2X | \
MV88E6XXX_FLAG_G2_MGMT_EN_0X | \
MV88E6XXX_FLAG_G2_POT | \
--
2.7.4
^ permalink raw reply related
* [PATCH net-next V2 1/3] net/skbuff: Introduce skb_mac_offset()
From: Amir Vadai @ 2017-01-05 9:54 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Jiri Pirko, Or Gerlitz, Hadar Har-Zion, Amir Vadai
In-Reply-To: <20170105095454.32644-1-amir@vadai.me>
Introduce skb_mac_offset() that could be used to get mac header offset.
Signed-off-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
---
include/linux/skbuff.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index b53c0cfd417e..3d8f81f39c2b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2178,6 +2178,11 @@ static inline unsigned char *skb_mac_header(const struct sk_buff *skb)
return skb->head + skb->mac_header;
}
+static inline int skb_mac_offset(const struct sk_buff *skb)
+{
+ return skb_mac_header(skb) - skb->data;
+}
+
static inline int skb_mac_header_was_set(const struct sk_buff *skb)
{
return skb->mac_header != (typeof(skb->mac_header))~0U;
--
2.11.0
^ permalink raw reply related
* [PATCH net-next V2 0/3] net/sched: act_pedit: Use offset relative to conventional network headers
From: Amir Vadai @ 2017-01-05 9:54 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Jiri Pirko, Or Gerlitz, Hadar Har-Zion, Amir Vadai
Hi Dave,
This is a respin of the patchset. V1 was sent but didn't make it into 4.10.
You asked me [1] why I used specific header names instead of layers (L2, L3...),
and I explained that it is on purpose: this extra information is planned to be used
by hardware drivers to offload the action.
Some FW/HW parser APIs need the specific header type (e.g.
IPv4 or IPv6, TCP or UDP) and not only the networking level (e.g. network or transport).
Enhancing the UAPI to specify the header type allows the same flows to be
set in both SW and HW.
This patchset also makes pedit more robust. Currently a field's offset is specified
relative to the IP header, with negative offsets used for
MAC layer fields.
This series enables the user to set the offset relative to the relevant header.
The series reuses existing fields, so backward UAPI
compatibility is kept.
Usage example:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
flower \
ip_proto tcp \
dst_port 80 \
action \
pedit munge ip ttl add 0xff \
pedit munge tcp dport set 8080 \
pipe action mirred egress redirect dev veth0
This forwards traffic destined to TCP port 80, rewriting the
destination port to 8080 and decreasing the TTL by one.
I've uploaded a draft for the userspace [2] to make it easier to review and
test the patchset.
[1] - http://patchwork.ozlabs.org/patch/700909/
[2] - git: https://bitbucket.org/av42/iproute2.git
branch: pedit
Patchset was tested and applied on top of upstream commit 57ea884b0dcf
("packet: fix panic in __packet_set_timestamp on tpacket_v3 in tx mode")
Thanks,
Amir
Amir Vadai (3):
net/skbuff: Introduce skb_mac_offset()
net/act_pedit: Support using offset relative to the conventional
network headers
net/act_pedit: Introduce 'add' operation
include/linux/skbuff.h | 5 +++
include/uapi/linux/tc_act/tc_pedit.h | 27 ++++++++++++
net/sched/act_pedit.c | 81 ++++++++++++++++++++++++++++++------
3 files changed, 100 insertions(+), 13 deletions(-)
--
2.11.0
^ permalink raw reply
* [PATCH net-next V2 2/3] net/act_pedit: Support using offset relative to the conventional network headers
From: Amir Vadai @ 2017-01-05 9:54 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Jiri Pirko, Or Gerlitz, Hadar Har-Zion, Amir Vadai
In-Reply-To: <20170105095454.32644-1-amir@vadai.me>
Extend pedit so the user can set an offset relative to a specific network
header. This makes it possible to work with more complex header
schemes (vs the simple IPv4 case) where a fixed offset relative
to the network header is not enough. It also prepares for
hardware offloading of pedit.
The header type is embedded in the 8 MSB of the u32 key->shift, which
were never used until now. Therefore backward compatibility is
kept.
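The encoding can be sketched in standalone C as follows. The macros and enum are copied from the patch; pedit_shift_pack() is a hypothetical helper added only for illustration and is not part of the UAPI. A legacy key that carries only a small shift value decodes as PEDIT_HDR_TYPE_RAW, which is how existing users keep working.

```c
#include <stdint.h>

#define PEDIT_TYPE_SHIFT 24
#define PEDIT_TYPE_MASK  0xff

#define PEDIT_TYPE_GET(_val) \
	(((_val) >> PEDIT_TYPE_SHIFT) & PEDIT_TYPE_MASK)
#define PEDIT_SHIFT_GET(_val) ((_val) & 0xff)

enum pedit_header_type {
	PEDIT_HDR_TYPE_RAW = 0,

	PEDIT_HDR_TYPE_ETH = 1,
	PEDIT_HDR_TYPE_IP4 = 2,
	PEDIT_HDR_TYPE_IP6 = 3,
	PEDIT_HDR_TYPE_TCP = 4,
	PEDIT_HDR_TYPE_UDP = 5,
};

/*
 * Hypothetical helper (not in the patch): place the header type in the
 * top 8 bits and the legacy shift amount in the low 8 bits.
 */
static uint32_t pedit_shift_pack(enum pedit_header_type htype, uint8_t shift)
{
	return ((uint32_t)htype << PEDIT_TYPE_SHIFT) | shift;
}
```

For example, packing PEDIT_HDR_TYPE_TCP with a shift of 2 and decoding it again with the two GET macros returns the same pair of values.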
Usage example:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
flower \
ip_proto tcp \
dst_port 80 \
action pedit munge tcp dport set 8080 pipe \
action mirred egress redirect dev veth0
Will forward TCP traffic whose original destination port is 80, while
modifying the destination port to 8080.
Signed-off-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
---
include/uapi/linux/tc_act/tc_pedit.h | 17 ++++++++++
net/sched/act_pedit.c | 65 +++++++++++++++++++++++++++++-------
2 files changed, 70 insertions(+), 12 deletions(-)
diff --git a/include/uapi/linux/tc_act/tc_pedit.h b/include/uapi/linux/tc_act/tc_pedit.h
index 6389959a5157..604e6729ad38 100644
--- a/include/uapi/linux/tc_act/tc_pedit.h
+++ b/include/uapi/linux/tc_act/tc_pedit.h
@@ -32,4 +32,21 @@ struct tc_pedit_sel {
};
#define tc_pedit tc_pedit_sel
+#define PEDIT_TYPE_SHIFT 24
+#define PEDIT_TYPE_MASK 0xff
+
+#define PEDIT_TYPE_GET(_val) \
+ (((_val) >> PEDIT_TYPE_SHIFT) & PEDIT_TYPE_MASK)
+#define PEDIT_SHIFT_GET(_val) ((_val) & 0xff)
+
+enum pedit_header_type {
+ PEDIT_HDR_TYPE_RAW = 0,
+
+ PEDIT_HDR_TYPE_ETH = 1,
+ PEDIT_HDR_TYPE_IP4 = 2,
+ PEDIT_HDR_TYPE_IP6 = 3,
+ PEDIT_HDR_TYPE_TCP = 4,
+ PEDIT_HDR_TYPE_UDP = 5,
+};
+
#endif
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index b27c4daec88f..4b9c7184c752 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -119,18 +119,45 @@ static bool offset_valid(struct sk_buff *skb, int offset)
return true;
}
+static int pedit_skb_hdr_offset(struct sk_buff *skb,
+ enum pedit_header_type htype, int *hoffset)
+{
+ int ret = -1;
+
+ switch (htype) {
+ case PEDIT_HDR_TYPE_ETH:
+ if (skb_mac_header_was_set(skb)) {
+ *hoffset = skb_mac_offset(skb);
+ ret = 0;
+ }
+ break;
+ case PEDIT_HDR_TYPE_RAW:
+ case PEDIT_HDR_TYPE_IP4:
+ case PEDIT_HDR_TYPE_IP6:
+ *hoffset = skb_network_offset(skb);
+ ret = 0;
+ break;
+ case PEDIT_HDR_TYPE_TCP:
+ case PEDIT_HDR_TYPE_UDP:
+ if (skb_transport_header_was_set(skb)) {
+ *hoffset = skb_transport_offset(skb);
+ ret = 0;
+ }
+ break;
+ };
+
+ return ret;
+}
+
static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
struct tcf_result *res)
{
struct tcf_pedit *p = to_pedit(a);
int i;
- unsigned int off;
if (skb_unclone(skb, GFP_ATOMIC))
return p->tcf_action;
- off = skb_network_offset(skb);
-
spin_lock(&p->tcf_lock);
tcf_lastuse_update(&p->tcf_tm);
@@ -141,20 +168,32 @@ static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
for (i = p->tcfp_nkeys; i > 0; i--, tkey++) {
u32 *ptr, _data;
int offset = tkey->off;
+ int hoffset;
+ int rc;
+ enum pedit_header_type htype =
+ PEDIT_TYPE_GET(tkey->shift);
+
+ rc = pedit_skb_hdr_offset(skb, htype, &hoffset);
+ if (rc) {
+ pr_info("tc filter pedit bad header type specified (0x%x)\n",
+ htype);
+ goto bad;
+ }
if (tkey->offmask) {
char *d, _d;
- if (!offset_valid(skb, off + tkey->at)) {
+ if (!offset_valid(skb, hoffset + tkey->at)) {
pr_info("tc filter pedit 'at' offset %d out of bounds\n",
- off + tkey->at);
+ hoffset + tkey->at);
goto bad;
}
- d = skb_header_pointer(skb, off + tkey->at, 1,
- &_d);
+ d = skb_header_pointer(skb,
+ hoffset + tkey->at,
+ 1, &_d);
if (!d)
goto bad;
- offset += (*d & tkey->offmask) >> tkey->shift;
+ offset += (*d & tkey->offmask) >> PEDIT_SHIFT_GET(tkey->shift);
}
if (offset % 4) {
@@ -163,19 +202,21 @@ static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
goto bad;
}
- if (!offset_valid(skb, off + offset)) {
+ if (!offset_valid(skb, hoffset + offset)) {
pr_info("tc filter pedit offset %d out of bounds\n",
- offset);
+ hoffset + offset);
goto bad;
}
- ptr = skb_header_pointer(skb, off + offset, 4, &_data);
+ ptr = skb_header_pointer(skb,
+ hoffset + offset,
+ 4, &_data);
if (!ptr)
goto bad;
/* just do it, baby */
*ptr = ((*ptr & tkey->mask) ^ tkey->val);
if (ptr == &_data)
- skb_store_bits(skb, off + offset, ptr, 4);
+ skb_store_bits(skb, hoffset + offset, ptr, 4);
}
goto done;
--
2.11.0
^ permalink raw reply related
* [PATCH net-next V2 3/3] net/act_pedit: Introduce 'add' operation
From: Amir Vadai @ 2017-01-05 9:54 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Jiri Pirko, Or Gerlitz, Hadar Har-Zion, Amir Vadai
In-Reply-To: <20170105095454.32644-1-amir@vadai.me>
This command can be used to increment or decrement fields.
The command type is embedded in unused bits of the existing shift field,
therefore UAPI backward compatibility is kept.
For example, to forward any TCP packet and decrease its TTL:
$ tc filter add dev enp0s9 protocol ip parent ffff: \
flower ip_proto tcp \
action pedit munge ip ttl add 0xff pipe \
action mirred egress redirect dev veth0
In the example above, adding 0xff to this u8 field is actually
decreasing it by one, since the operation is masked.
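The arithmetic can be checked with a small standalone sketch. pedit_apply_add() and pedit_apply_set() are hypothetical names that mirror the two branches of the patched tcf_pedit(); the sketch assumes, for simplicity, that the edited u8 field sits in the low byte of a host-order 32-bit word, which is not where the TTL lives in a real IP header. As in act_pedit, mask selects the bits that are preserved, so ~mask covers the field being edited.

```c
#include <stdint.h>

/*
 * PEDIT_CMD_ADD: add val to the whole word, keep only the field bits
 * (~mask) of the sum, then merge them with the preserved bits (mask).
 * Any carry out of the field is discarded by the masking.
 */
static uint32_t pedit_apply_add(uint32_t word, uint32_t mask, uint32_t val)
{
	uint32_t field = (word + val) & ~mask;

	return (word & mask) ^ field;
}

/* PEDIT_CMD_SET: the original behaviour, val replaces the field bits. */
static uint32_t pedit_apply_set(uint32_t word, uint32_t mask, uint32_t val)
{
	return (word & mask) ^ val;
}
```

With word 0x11223340, mask 0xffffff00 and val 0xff, the add path gives 0x1122333f: the low byte goes from 0x40 to 0x3f and the upper bytes are untouched. A field that is already 0 wraps around to 0xff.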
Signed-off-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
---
include/uapi/linux/tc_act/tc_pedit.h | 10 ++++++++++
net/sched/act_pedit.c | 16 +++++++++++++++-
2 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/tc_act/tc_pedit.h b/include/uapi/linux/tc_act/tc_pedit.h
index 604e6729ad38..80028cd0bb1b 100644
--- a/include/uapi/linux/tc_act/tc_pedit.h
+++ b/include/uapi/linux/tc_act/tc_pedit.h
@@ -35,8 +35,13 @@ struct tc_pedit_sel {
#define PEDIT_TYPE_SHIFT 24
#define PEDIT_TYPE_MASK 0xff
+#define PEDIT_CMD_SHIFT 16
+#define PEDIT_CMD_MASK 0xff
+
#define PEDIT_TYPE_GET(_val) \
(((_val) >> PEDIT_TYPE_SHIFT) & PEDIT_TYPE_MASK)
+#define PEDIT_CMD_GET(_val) \
+ (((_val) >> PEDIT_CMD_SHIFT) & PEDIT_CMD_MASK)
#define PEDIT_SHIFT_GET(_val) ((_val) & 0xff)
enum pedit_header_type {
@@ -49,4 +54,9 @@ enum pedit_header_type {
PEDIT_HDR_TYPE_UDP = 5,
};
+enum pedit_cmd {
+ PEDIT_CMD_SET = 0,
+ PEDIT_CMD_ADD = 1,
+};
+
#endif
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index 4b9c7184c752..aa137d51bf7f 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -169,6 +169,7 @@ static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
u32 *ptr, _data;
int offset = tkey->off;
int hoffset;
+ u32 val;
int rc;
enum pedit_header_type htype =
PEDIT_TYPE_GET(tkey->shift);
@@ -214,7 +215,20 @@ static int tcf_pedit(struct sk_buff *skb, const struct tc_action *a,
if (!ptr)
goto bad;
/* just do it, baby */
- *ptr = ((*ptr & tkey->mask) ^ tkey->val);
+ switch (PEDIT_CMD_GET(tkey->shift)) {
+ case PEDIT_CMD_SET:
+ val = tkey->val;
+ break;
+ case PEDIT_CMD_ADD:
+ val = (*ptr + tkey->val) & ~tkey->mask;
+ break;
+ default:
+ pr_info("tc filter pedit bad command (%d)\n",
+ PEDIT_CMD_GET(tkey->shift));
+ goto bad;
+ }
+
+ *ptr = ((*ptr & tkey->mask) ^ val);
if (ptr == &_data)
skb_store_bits(skb, hoffset + offset, ptr, 4);
}
--
2.11.0
^ permalink raw reply related
* [PATCH v2 net-next] net:dsa: check for EPROBE_DEFER from dsa_dst_parse()
From: Volodymyr Bendiuga @ 2017-01-05 10:10 UTC (permalink / raw)
To: andrew, vivien.didelot, f.fainelli, davem, netdev,
volodymyr.bendiuga
Cc: Volodymyr Bendiuga
Since there can be multiple dsa switches stacked together but
not all of the devicetree nodes may be available at the time of calling
dsa_dst_parse(), it can return EPROBE_DEFER. When this
happens, only the last dsa switch has to be deleted by
dsa_dst_del_ds(), not the whole list, because the next time Linux
comes back to this function it will try to add only the last dsa
switch, the one which returned EPROBE_DEFER.
Signed-off-by: Volodymyr Bendiuga <volodymyr.bendiuga@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---
net/dsa/dsa2.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 7924c92..a799718 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -673,8 +673,14 @@ static int _dsa_register_switch(struct dsa_switch *ds, struct device_node *np)
}
err = dsa_dst_parse(dst);
- if (err)
+ if (err) {
+ if (err == -EPROBE_DEFER) {
+ dsa_dst_del_ds(dst, ds, ds->index);
+ return err;
+ }
+
goto out_del_dst;
+ }
err = dsa_dst_apply(dst);
if (err) {
--
2.7.4
^ permalink raw reply related
* Re: [PATCH] stmmac: Enable Clause 45 PHYs in GMAC4 (eQOS)
From: Joao Pinto @ 2017-01-05 10:15 UTC (permalink / raw)
To: Kweh, Hock Leong, Joao Pinto, davem@davemloft.net; +Cc: netdev@vger.kernel.org
In-Reply-To: <F54AEECA5E2B9541821D670476DAE19C5A91819D@PGSMSX102.gar.corp.intel.com>
On 1/5/2017 at 1:37 AM, Kweh, Hock Leong wrote:
>> -----Original Message-----
>> From: Joao Pinto [mailto:Joao.Pinto@synopsys.com]
>> Sent: Wednesday, January 04, 2017 10:36 PM
>> To: davem@davemloft.net
>> Cc: Kweh, Hock Leong <hock.leong.kweh@intel.com>; netdev@vger.kernel.org;
>> Joao Pinto <Joao.Pinto@synopsys.com>
>> Subject: [PATCH] stmmac: Enable Clause 45 PHYs in GMAC4 (eQOS)
>>
>> The eQOS IP Core (best known in stmmac as gmac4) has a register that must be
>> set if using a Clause 45 PHY. If this register is not set, the PHY won't work.
>> This patch will have no impact in setups using Clause 22 PHYs.
>>
>> Signed-off-by: Joao Pinto <jpinto@synopsys.com>
>
> Hi Joao,
>
> This is not working on our environment. We are using the 4-ETH-4-MGB-101 plugin card.
>
> Regards,
> Wilson
Hi Wilson and David,
I am using a different PHY and I only get it detecting the link with that bit
set. Thanks for your feedback, going to dig a bit more!
Joao
>
>> ---
>> drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
>> index b0344c2..676ae3c 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
>> @@ -41,6 +41,7 @@
>> #define MII_GMAC4_GOC_SHIFT 2
>> #define MII_GMAC4_WRITE (1 << MII_GMAC4_GOC_SHIFT)
>> #define MII_GMAC4_READ (3 << MII_GMAC4_GOC_SHIFT)
>> +#define MII_CLAUSE45_PHY (1 << 1)
>>
>> static int stmmac_mdio_busy_wait(void __iomem *ioaddr, unsigned int mii_addr)
>> {
>> @@ -125,7 +126,7 @@ static int stmmac_mdio_write(struct mii_bus *bus, int phyaddr, int phyreg,
>> value |= (priv->clk_csr << priv->hw->mii.clk_csr_shift)
>> & priv->hw->mii.clk_csr_mask;
>> if (priv->plat->has_gmac4)
>> - value |= MII_GMAC4_WRITE;
>> + value |= MII_GMAC4_WRITE | MII_CLAUSE45_PHY;
>> else
>> value |= MII_WRITE;
>>
>> --
>> 2.9.3
>
^ permalink raw reply
* [PATCH v4] net: ethernet: faraday: To support device tree usage.
From: Greentime Hu @ 2017-01-05 10:23 UTC (permalink / raw)
To: f.fainelli, netdev, devicetree, andrew, linux-kernel, jiri,
jonas.jensen, davem, arnd
Signed-off-by: Greentime Hu <green.hu@gmail.com>
---
Changes in v4:
- Use the same binding document to describe the same faraday ethernet controller and add faraday to vendor-prefixes.txt.
Changes in v3:
- Nothing changed in this patch but I have committed andestech to vendor-prefixes.txt.
Changes in v2:
- Change atmac100_of_ids to ftmac100_of_ids
---
.../net/{moxa,moxart-mac.txt => faraday,ftmac.txt} | 7 +++++--
.../devicetree/bindings/vendor-prefixes.txt | 1 +
drivers/net/ethernet/faraday/ftmac100.c | 7 +++++++
3 files changed, 13 insertions(+), 2 deletions(-)
rename Documentation/devicetree/bindings/net/{moxa,moxart-mac.txt => faraday,ftmac.txt} (68%)
diff --git a/Documentation/devicetree/bindings/net/moxa,moxart-mac.txt b/Documentation/devicetree/bindings/net/faraday,ftmac.txt
similarity index 68%
rename from Documentation/devicetree/bindings/net/moxa,moxart-mac.txt
rename to Documentation/devicetree/bindings/net/faraday,ftmac.txt
index 583418b..be4f55e 100644
--- a/Documentation/devicetree/bindings/net/moxa,moxart-mac.txt
+++ b/Documentation/devicetree/bindings/net/faraday,ftmac.txt
@@ -1,8 +1,11 @@
-MOXA ART Ethernet Controller
+Faraday Ethernet Controller
Required properties:
-- compatible : Must be "moxa,moxart-mac"
+- compatible : Must contain "faraday,ftmac", as well as one of
+ the SoC specific identifiers:
+ "andestech,atmac100"
+ "moxa,moxart-mac"
- reg : Should contain register location and length
- interrupts : Should contain the mac interrupt number
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 16d3b5e..489c336 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -102,6 +102,7 @@ everest Everest Semiconductor Co. Ltd.
everspin Everspin Technologies, Inc.
excito Excito
ezchip EZchip Semiconductor
+faraday Faraday Technology Corporation
fcs Fairchild Semiconductor
firefly Firefly
focaltech FocalTech Systems Co.,Ltd
diff --git a/drivers/net/ethernet/faraday/ftmac100.c b/drivers/net/ethernet/faraday/ftmac100.c
index dce5f7b..5d70ee9 100644
--- a/drivers/net/ethernet/faraday/ftmac100.c
+++ b/drivers/net/ethernet/faraday/ftmac100.c
@@ -1172,11 +1172,17 @@ static int __exit ftmac100_remove(struct platform_device *pdev)
return 0;
}
+static const struct of_device_id ftmac100_of_ids[] = {
+ { .compatible = "andestech,atmac100" },
+ { }
+};
+
static struct platform_driver ftmac100_driver = {
.probe = ftmac100_probe,
.remove = __exit_p(ftmac100_remove),
.driver = {
.name = DRV_NAME,
+ .of_match_table = ftmac100_of_ids
},
};
@@ -1200,3 +1206,4 @@ static void __exit ftmac100_exit(void)
MODULE_AUTHOR("Po-Yu Chuang <ratbert@faraday-tech.com>");
MODULE_DESCRIPTION("FTMAC100 driver");
MODULE_LICENSE("GPL");
+MODULE_DEVICE_TABLE(of, ftmac100_of_ids);
--
1.7.9.5
^ permalink raw reply related
* [PATCH/RFC v2 net-next] ravb: unmap descriptors when freeing rings
From: Simon Horman @ 2017-01-05 10:43 UTC (permalink / raw)
To: David Miller, Sergei Shtylyov; +Cc: Magnus Damm, netdev, linux-renesas-soc
From: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
"swiotlb buffer is full" errors occur after repeated initialisation of a
device, e.g. suspend/resume or ip link set up/down. This is because memory
mapped using dma_map_single() in ravb_ring_format() and ravb_start_xmit()
is not released. Resolve this problem by unmapping descriptors when
freeing rings.
Note, ravb_tx_free() is moved but not otherwise modified by this patch.
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
[simon: reworked]
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
--
v1 [Kazuya Mizuguchi]
v2 [Simon Horman]
* As suggested by Sergei Shtylyov
- Use dma_mapping_error() and rx_desc->ds_cc when unmapping RX descriptors;
this is consistent with the way that they are mapped
- Use ravb_tx_free() to clear TX descriptors
* Reduce scope of new local variable
---
drivers/net/ethernet/renesas/ravb_main.c | 89 ++++++++++++++++++--------------
1 file changed, 51 insertions(+), 38 deletions(-)
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 92d7692c840d..1797c48e3176 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -179,6 +179,44 @@ static struct mdiobb_ops bb_ops = {
.get_mdio_data = ravb_get_mdio_data,
};
+/* Free TX skb function for AVB-IP */
+static int ravb_tx_free(struct net_device *ndev, int q)
+{
+ struct ravb_private *priv = netdev_priv(ndev);
+ struct net_device_stats *stats = &priv->stats[q];
+ struct ravb_tx_desc *desc;
+ int free_num = 0;
+ int entry;
+ u32 size;
+
+ for (; priv->cur_tx[q] - priv->dirty_tx[q] > 0; priv->dirty_tx[q]++) {
+ entry = priv->dirty_tx[q] % (priv->num_tx_ring[q] *
+ NUM_TX_DESC);
+ desc = &priv->tx_ring[q][entry];
+ if (desc->die_dt != DT_FEMPTY)
+ break;
+ /* Descriptor type must be checked before all other reads */
+ dma_rmb();
+ size = le16_to_cpu(desc->ds_tagl) & TX_DS;
+ /* Free the original skb. */
+ if (priv->tx_skb[q][entry / NUM_TX_DESC]) {
+ dma_unmap_single(ndev->dev.parent, le32_to_cpu(desc->dptr),
+ size, DMA_TO_DEVICE);
+ /* Last packet descriptor? */
+ if (entry % NUM_TX_DESC == NUM_TX_DESC - 1) {
+ entry /= NUM_TX_DESC;
+ dev_kfree_skb_any(priv->tx_skb[q][entry]);
+ priv->tx_skb[q][entry] = NULL;
+ stats->tx_packets++;
+ }
+ free_num++;
+ }
+ stats->tx_bytes += size;
+ desc->die_dt = DT_EEMPTY;
+ }
+ return free_num;
+}
+
/* Free skb's and DMA buffers for Ethernet AVB */
static void ravb_ring_free(struct net_device *ndev, int q)
{
@@ -207,6 +245,18 @@ static void ravb_ring_free(struct net_device *ndev, int q)
priv->tx_align[q] = NULL;
if (priv->rx_ring[q]) {
+ for (i = 0; i < priv->num_rx_ring[q]; i++) {
+ struct ravb_ex_rx_desc *rx_desc = &priv->rx_ring[q][i];
+
+ if (!dma_mapping_error(ndev->dev.parent,
+ rx_desc->dptr)) {
+ dma_unmap_single(ndev->dev.parent,
+ le32_to_cpu(rx_desc->dptr),
+ PKT_BUF_SZ,
+ DMA_FROM_DEVICE);
+ rx_desc->ds_cc = cpu_to_le16(0);
+ }
+ }
ring_size = sizeof(struct ravb_ex_rx_desc) *
(priv->num_rx_ring[q] + 1);
dma_free_coherent(ndev->dev.parent, ring_size, priv->rx_ring[q],
@@ -215,6 +265,7 @@ static void ravb_ring_free(struct net_device *ndev, int q)
}
if (priv->tx_ring[q]) {
+ ravb_tx_free(ndev, q);
ring_size = sizeof(struct ravb_tx_desc) *
(priv->num_tx_ring[q] * NUM_TX_DESC + 1);
dma_free_coherent(ndev->dev.parent, ring_size, priv->tx_ring[q],
@@ -431,44 +482,6 @@ static int ravb_dmac_init(struct net_device *ndev)
return 0;
}
-/* Free TX skb function for AVB-IP */
-static int ravb_tx_free(struct net_device *ndev, int q)
-{
- struct ravb_private *priv = netdev_priv(ndev);
- struct net_device_stats *stats = &priv->stats[q];
- struct ravb_tx_desc *desc;
- int free_num = 0;
- int entry;
- u32 size;
-
- for (; priv->cur_tx[q] - priv->dirty_tx[q] > 0; priv->dirty_tx[q]++) {
- entry = priv->dirty_tx[q] % (priv->num_tx_ring[q] *
- NUM_TX_DESC);
- desc = &priv->tx_ring[q][entry];
- if (desc->die_dt != DT_FEMPTY)
- break;
- /* Descriptor type must be checked before all other reads */
- dma_rmb();
- size = le16_to_cpu(desc->ds_tagl) & TX_DS;
- /* Free the original skb. */
- if (priv->tx_skb[q][entry / NUM_TX_DESC]) {
- dma_unmap_single(ndev->dev.parent, le32_to_cpu(desc->dptr),
- size, DMA_TO_DEVICE);
- /* Last packet descriptor? */
- if (entry % NUM_TX_DESC == NUM_TX_DESC - 1) {
- entry /= NUM_TX_DESC;
- dev_kfree_skb_any(priv->tx_skb[q][entry]);
- priv->tx_skb[q][entry] = NULL;
- stats->tx_packets++;
- }
- free_num++;
- }
- stats->tx_bytes += size;
- desc->die_dt = DT_EEMPTY;
- }
- return free_num;
-}
-
static void ravb_get_tx_tstamp(struct net_device *ndev)
{
struct ravb_private *priv = netdev_priv(ndev);
--
2.7.0.rc3.207.g0ac5344
^ permalink raw reply related
* [PATCH] net: stmmac: fix maxmtu assignment to be within valid range
From: Kweh, Hock Leong @ 2017-01-05 10:47 UTC (permalink / raw)
To: David S. Miller, Joao Pinto, Giuseppe CAVALLARO,
seraphin.bonnaffe, Jarod Wilson
Cc: Alexandre TORGUE, Joachim Eastwood, Niklas Cassel, Johan Hovold,
pavel, Kweh, Hock Leong, lars.persson, netdev, LKML
From: "Kweh, Hock Leong" <hock.leong.kweh@intel.com>
The maxmtu value obtained from the devicetree is not checked for validity.
Add a check to ensure the assignment is only
made when maxmtu is within the valid range.
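The patched condition can be sketched as a standalone function. pick_max_mtu() and its parameter names are hypothetical and used only for illustration; the sketch also assumes all values are unsigned, which simplifies the types used in the driver.

```c
/*
 * Hypothetical helper mirroring the patched condition: the platform
 * value only lowers the device maximum when it is also at least the
 * device minimum. Otherwise the device maximum is left unchanged.
 */
static unsigned int pick_max_mtu(unsigned int dev_max, unsigned int dev_min,
				 unsigned int plat_maxmtu)
{
	if (plat_maxmtu < dev_max && plat_maxmtu >= dev_min)
		return plat_maxmtu;

	return dev_max;
}
```

Under these assumptions, a platform value of 0 no longer overrides the device maximum, while a valid smaller value (for example 1500 against a device maximum of 9000) still does.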
Signed-off-by: Kweh, Hock Leong <hock.leong.kweh@intel.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 39eb7a6..683d59f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3319,7 +3319,8 @@ int stmmac_dvr_probe(struct device *device,
ndev->max_mtu = JUMBO_LEN;
else
ndev->max_mtu = SKB_MAX_HEAD(NET_SKB_PAD + NET_IP_ALIGN);
- if (priv->plat->maxmtu < ndev->max_mtu)
+ if ((priv->plat->maxmtu < ndev->max_mtu) &&
+ (priv->plat->maxmtu >= ndev->min_mtu))
ndev->max_mtu = priv->plat->maxmtu;
if (flow_ctrl)
--
1.7.9.5
^ permalink raw reply related
* SIOCSIWFREQ while in NL80211_IFTYPE_STATION
From: Jorge Ramirez @ 2017-01-05 11:02 UTC (permalink / raw)
To: netdev, Daniel Lezcano
Hello all,
I am running a single wlan0 interface in managed mode (no aliases, no
other wireless interfaces).
The association with the AP still hasn't happened.
I noticed that if trying to change the frequency to one of the valid
values, the driver returns EBUSY.
The call stack is
cfg80211_wext_siwfreq
-->cfg80211_mgd_wext_siwfreq
--->cfg80211_set_monitor_channel (notice call to set 'monitor' channel
in managed mode)
----> fails with EBUSY
Is it therefore the expected behavior to fail under the above circumstances
(managed mode && single wlan0 interface && no association)?
And if it is, could you please clarify when it would be valid to change
the frequency in managed mode?
many thanks in advance for the help,
Jorge
^ permalink raw reply