* [PATCH net-next 2/4] geneve: cleanup hard coded value for Ethernet header length
From: Alexey Kodanev @ 2018-04-19 12:42 UTC
To: netdev; +Cc: David Miller, Alexey Kodanev
In-Reply-To: <1524141752-25789-1-git-send-email-alexey.kodanev@oracle.com>
Use ETH_HLEN instead of the hard-coded 14 and introduce two new macros,
GENEVE_IPV4_HLEN and GENEVE_IPV6_HLEN, each of which sums the Ethernet
header length, the corresponding IP header length and GENEVE_BASE_HLEN.
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
---
drivers/net/geneve.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 45acdc9..b650f84 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -36,6 +36,8 @@
#define GENEVE_VER 0
#define GENEVE_BASE_HLEN (sizeof(struct udphdr) + sizeof(struct genevehdr))
+#define GENEVE_IPV4_HLEN (ETH_HLEN + sizeof(struct iphdr) + GENEVE_BASE_HLEN)
+#define GENEVE_IPV6_HLEN (ETH_HLEN + sizeof(struct ipv6hdr) + GENEVE_BASE_HLEN)
/* per-network namespace private data for this module */
struct geneve_net {
@@ -826,8 +828,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
return PTR_ERR(rt);
if (skb_dst(skb)) {
- int mtu = dst_mtu(&rt->dst) - sizeof(struct iphdr) -
- GENEVE_BASE_HLEN - info->options_len - 14;
+ int mtu = dst_mtu(&rt->dst) - GENEVE_IPV4_HLEN -
+ info->options_len;
skb_dst_update_pmtu(skb, mtu);
}
@@ -872,8 +874,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
return PTR_ERR(dst);
if (skb_dst(skb)) {
- int mtu = dst_mtu(dst) - sizeof(struct ipv6hdr) -
- GENEVE_BASE_HLEN - info->options_len - 14;
+ int mtu = dst_mtu(dst) - GENEVE_IPV6_HLEN - info->options_len;
skb_dst_update_pmtu(skb, mtu);
}
--
1.8.3.1
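As a rough illustration of the arithmetic the two new macros fold together, here is a minimal userspace sketch. The SK_-prefixed names and the geneve_inner_mtu() helper are hypothetical and not part of the patch; the header sizes are hard-coded on the assumption of an IPv4 header without options and a GENEVE base header of 8 bytes, whereas the kernel takes them from sizeof() on the real structs.

```c
#include <assert.h>

/* Header sizes in bytes. These are assumptions for a userspace sketch;
 * the kernel derives them from ETH_HLEN and sizeof() on the real structs. */
enum {
	SK_ETH_HLEN    = 14, /* Ethernet header (ETH_HLEN) */
	SK_IPV4_HLEN   = 20, /* struct iphdr without IP options */
	SK_IPV6_HLEN   = 40, /* struct ipv6hdr */
	SK_UDP_HLEN    = 8,  /* struct udphdr */
	SK_GENEVE_HLEN = 8   /* struct genevehdr, options excluded */
};

/* Mirrors the shape of the macros in the patch, with hypothetical names. */
#define SK_GENEVE_BASE_HLEN (SK_UDP_HLEN + SK_GENEVE_HLEN)
#define SK_GENEVE_IPV4_HLEN (SK_ETH_HLEN + SK_IPV4_HLEN + SK_GENEVE_BASE_HLEN)
#define SK_GENEVE_IPV6_HLEN (SK_ETH_HLEN + SK_IPV6_HLEN + SK_GENEVE_BASE_HLEN)

/* Hypothetical helper: MTU left for the inner frame after subtracting the
 * fixed encapsulation overhead and the variable-length GENEVE options. */
static int geneve_inner_mtu(int path_mtu, int options_len, int is_ipv6)
{
	int overhead = is_ipv6 ? SK_GENEVE_IPV6_HLEN : SK_GENEVE_IPV4_HLEN;

	assert(options_len >= 0);
	return path_mtu - overhead - options_len;
}
```

Under these assumed sizes the fixed overhead is 50 bytes over IPv4 and 70 bytes over IPv6, so a 1500-byte path MTU with no options leaves 1450 and 1430 bytes respectively.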
* [PATCH net-next 1/4] geneve: remove white-space before '#if IS_ENABLED(CONFIG_IPV6)'
From: Alexey Kodanev @ 2018-04-19 12:42 UTC
To: netdev; +Cc: David Miller, Alexey Kodanev
In-Reply-To: <1524141752-25789-1-git-send-email-alexey.kodanev@oracle.com>
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
---
drivers/net/geneve.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index b919e89..45acdc9 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1261,7 +1261,7 @@ static int geneve_nl2info(struct nlattr *tb[], struct nlattr *data[],
}
if (data[IFLA_GENEVE_REMOTE6]) {
- #if IS_ENABLED(CONFIG_IPV6)
+#if IS_ENABLED(CONFIG_IPV6)
if (changelink && (ip_tunnel_info_af(info) == AF_INET)) {
attrtype = IFLA_GENEVE_REMOTE6;
goto change_notsup;
--
1.8.3.1
* [PATCH net-next 0/4] geneve: verify user specified MTU or adjust with a lower device
From: Alexey Kodanev @ 2018-04-19 12:42 UTC
To: netdev; +Cc: David Miller, Alexey Kodanev
The first two patches don't introduce any functional changes and
contain minor cleanups for code readability.
The last one adds a new function, geneve_link_config(), similar to the
ones in other tunnels. The function is used on new link creation or
when the 'remote' parameter is changed. It adjusts a user-specified MTU
or, if it finds a lower device, tunes the tunnel MTU based on it.
Alexey Kodanev (4):
geneve: remove white-space before '#if IS_ENABLED(CONFIG_IPV6)'
geneve: cleanup hard coded value for Ethernet header length
geneve: check MTU for a minimum in geneve_change_mtu()
geneve: configure MTU based on a lower device
drivers/net/geneve.c | 72 ++++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 61 insertions(+), 11 deletions(-)
--
1.8.3.1
* Re: [RFC PATCH ghak32 V2 09/13] audit: add containerid support for config/feature/user records
From: Richard Guy Briggs @ 2018-04-19 12:31 UTC
To: Paul Moore
Cc: cgroups, containers, linux-api, Linux-Audit Mailing List,
linux-fsdevel, LKML, netdev, ebiederm, luto, jlayton, carlos,
dhowells, viro, simo, Eric Paris, serge
In-Reply-To: <CAHC9VhQ-i5oA48sXXnN2fP06t5=9-NMoY0bKcGXorQw2k=CK0Q@mail.gmail.com>
On 2018-04-18 21:27, Paul Moore wrote:
> On Fri, Mar 16, 2018 at 5:00 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > Add container ID auxiliary records to configuration change, feature set change
> > and user generated standalone records.
> >
> > Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> > ---
> > kernel/audit.c | 50 ++++++++++++++++++++++++++++++++++++++++----------
> > kernel/auditfilter.c | 5 ++++-
> > 2 files changed, 44 insertions(+), 11 deletions(-)
> >
> > diff --git a/kernel/audit.c b/kernel/audit.c
> > index b238be5..08662b4 100644
> > --- a/kernel/audit.c
> > +++ b/kernel/audit.c
> > @@ -400,8 +400,9 @@ static int audit_log_config_change(char *function_name, u32 new, u32 old,
> > {
> > struct audit_buffer *ab;
> > int rc = 0;
> > + struct audit_context *context = audit_alloc_local();
>
> We should be able to use current->audit_context here right? If we
> can't for every caller, perhaps we pass an audit_context as an
> argument and only allocate a local context when the passed
> audit_context is NULL.
>
> Also, if you're not comfortable always using current, just pass the
> audit_context as you do with audit_log_common_recv_msg().
As mentioned in the tree/watch/mark patch, this is all obsoleted by
making the AUDIT_CONFIG_CHANGE record a SYSCALL auxiliary record.
This review would have been more helpful a month and a half ago.
> > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
> > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
> > if (unlikely(!ab))
> > return rc;
> > audit_log_format(ab, "%s=%u old=%u", function_name, new, old);
> > @@ -411,6 +412,8 @@ static int audit_log_config_change(char *function_name, u32 new, u32 old,
> > allow_changes = 0; /* Something weird, deny request */
> > audit_log_format(ab, " res=%d", allow_changes);
> > audit_log_end(ab);
> > + audit_log_container_info(context, "config", audit_get_containerid(current));
> > + audit_free_context(context);
> > return rc;
> > }
> >
> > @@ -1058,7 +1061,8 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
> > return err;
> > }
> >
> > -static void audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type)
> > +static void audit_log_common_recv_msg(struct audit_context *context,
> > + struct audit_buffer **ab, u16 msg_type)
> > {
> > uid_t uid = from_kuid(&init_user_ns, current_uid());
> > pid_t pid = task_tgid_nr(current);
> > @@ -1068,7 +1072,7 @@ static void audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type)
> > return;
> > }
> >
> > - *ab = audit_log_start(NULL, GFP_KERNEL, msg_type);
> > + *ab = audit_log_start(context, GFP_KERNEL, msg_type);
> > if (unlikely(!*ab))
> > return;
> > audit_log_format(*ab, "pid=%d uid=%u", pid, uid);
> > @@ -1097,11 +1101,12 @@ static void audit_log_feature_change(int which, u32 old_feature, u32 new_feature
> > u32 old_lock, u32 new_lock, int res)
> > {
> > struct audit_buffer *ab;
> > + struct audit_context *context = audit_alloc_local();
>
> So I know based on the other patch we are currently discussing that we
> can use current here ...
>
> > if (audit_enabled == AUDIT_OFF)
> > return;
> >
> > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_FEATURE_CHANGE);
> > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_FEATURE_CHANGE);
> > if (!ab)
> > return;
> > audit_log_task_info(ab, current);
> > @@ -1109,6 +1114,8 @@ static void audit_log_feature_change(int which, u32 old_feature, u32 new_feature
> > audit_feature_names[which], !!old_feature, !!new_feature,
> > !!old_lock, !!new_lock, res);
> > audit_log_end(ab);
> > + audit_log_container_info(context, "feature", audit_get_containerid(current));
> > + audit_free_context(context);
> > }
> >
> > static int audit_set_feature(struct sk_buff *skb)
> > @@ -1337,13 +1344,15 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> >
> > err = audit_filter(msg_type, AUDIT_FILTER_USER);
> > if (err == 1) { /* match or error */
> > + struct audit_context *context = audit_alloc_local();
>
> I'm pretty sure we can use current here.
>
> > err = 0;
> > if (msg_type == AUDIT_USER_TTY) {
> > err = tty_audit_push();
> > if (err)
> > break;
> > }
> > - audit_log_common_recv_msg(&ab, msg_type);
> > + audit_log_common_recv_msg(context, &ab, msg_type);
> > if (msg_type != AUDIT_USER_TTY)
> > audit_log_format(ab, " msg='%.*s'",
> > AUDIT_MESSAGE_TEXT_MAX,
> > @@ -1359,6 +1368,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> > audit_log_n_untrustedstring(ab, data, size);
> > }
> > audit_log_end(ab);
> > + audit_log_container_info(context, "user",
> > + audit_get_containerid(current));
> > + audit_free_context(context);
> > }
> > break;
> > case AUDIT_ADD_RULE:
> > @@ -1366,9 +1378,14 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> > if (nlmsg_len(nlh) < sizeof(struct audit_rule_data))
> > return -EINVAL;
> > if (audit_enabled == AUDIT_LOCKED) {
> > - audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE);
> > + struct audit_context *context = audit_alloc_local();
>
> Pretty sure current can be used here too. In fact I think everywhere
> where we are processing commands from netlink we can use current as I
> believe the entire netlink stack is processed in the context of the
> caller.
>
> > + audit_log_common_recv_msg(context, &ab, AUDIT_CONFIG_CHANGE);
> > audit_log_format(ab, " audit_enabled=%d res=0", audit_enabled);
> > audit_log_end(ab);
> > + audit_log_container_info(context, "config",
> > + audit_get_containerid(current));
> > + audit_free_context(context);
> > return -EPERM;
> > }
> > err = audit_rule_change(msg_type, seq, data, nlmsg_len(nlh));
> > @@ -1376,17 +1393,23 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> > case AUDIT_LIST_RULES:
> > err = audit_list_rules_send(skb, seq);
> > break;
> > - case AUDIT_TRIM:
> > + case AUDIT_TRIM: {
> > + struct audit_context *context = audit_alloc_local();
>
> Same.
>
> > audit_trim_trees();
> > - audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE);
> > + audit_log_common_recv_msg(context, &ab, AUDIT_CONFIG_CHANGE);
> > audit_log_format(ab, " op=trim res=1");
> > audit_log_end(ab);
> > + audit_log_container_info(context, "config",
> > + audit_get_containerid(current));
> > + audit_free_context(context);
> > break;
> > + }
> > case AUDIT_MAKE_EQUIV: {
> > void *bufp = data;
> > u32 sizes[2];
> > size_t msglen = nlmsg_len(nlh);
> > char *old, *new;
> > + struct audit_context *context = audit_alloc_local();
>
> Same.
>
> > err = -EINVAL;
> > if (msglen < 2 * sizeof(u32))
> > @@ -1408,7 +1431,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> > /* OK, here comes... */
> > err = audit_tag_tree(old, new);
> >
> > - audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE);
> > + audit_log_common_recv_msg(context, &ab, AUDIT_CONFIG_CHANGE);
> >
> > audit_log_format(ab, " op=make_equiv old=");
> > audit_log_untrustedstring(ab, old);
> > @@ -1418,6 +1441,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> > audit_log_end(ab);
> > kfree(old);
> > kfree(new);
> > + audit_log_container_info(context, "config",
> > + audit_get_containerid(current));
> > + audit_free_context(context);
> > break;
> > }
> > case AUDIT_SIGNAL_INFO:
> > @@ -1459,6 +1485,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> > struct audit_tty_status s, old;
> > struct audit_buffer *ab;
> > unsigned int t;
> > + struct audit_context *context = audit_alloc_local();
>
> Same.
>
> > memset(&s, 0, sizeof(s));
> > /* guard against past and future API changes */
> > @@ -1477,12 +1504,15 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> > old.enabled = t & AUDIT_TTY_ENABLE;
> > old.log_passwd = !!(t & AUDIT_TTY_LOG_PASSWD);
> >
> > - audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE);
> > + audit_log_common_recv_msg(context, &ab, AUDIT_CONFIG_CHANGE);
> > audit_log_format(ab, " op=tty_set old-enabled=%d new-enabled=%d"
> > " old-log_passwd=%d new-log_passwd=%d res=%d",
> > old.enabled, s.enabled, old.log_passwd,
> > s.log_passwd, !err);
> > audit_log_end(ab);
> > + audit_log_container_info(context, "config",
> > + audit_get_containerid(current));
> > + audit_free_context(context);
> > break;
> > }
> > default:
> > diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
> > index c4c8746..5f7f4d6 100644
> > --- a/kernel/auditfilter.c
> > +++ b/kernel/auditfilter.c
> > @@ -1109,11 +1109,12 @@ static void audit_log_rule_change(char *action, struct audit_krule *rule, int re
> > struct audit_buffer *ab;
> > uid_t loginuid = from_kuid(&init_user_ns, audit_get_loginuid(current));
> > unsigned int sessionid = audit_get_sessionid(current);
> > + struct audit_context *context = audit_alloc_local();
> >
> > if (!audit_enabled)
> > return;
>
> Well, first I think we should be able to get rid of the local context,
> but if for some reason we can't use current->audit_context then do the
> allocation after the audit_enabled check.
>
> > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
> > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
> > if (!ab)
> > return;
> > audit_log_format(ab, "auid=%u ses=%u" ,loginuid, sessionid);
> > @@ -1122,6 +1123,8 @@ static void audit_log_rule_change(char *action, struct audit_krule *rule, int re
> > audit_log_key(ab, rule->filterkey);
> > audit_log_format(ab, " list=%d res=%d", rule->listnr, res);
> > audit_log_end(ab);
> > + audit_log_container_info(context, "config", audit_get_containerid(current));
> > + audit_free_context(context);
> > }
>
> --
> paul moore
> www.paul-moore.com
- RGB
--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635
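For readers following the review comments quoted in this message, the alternative Paul describes (take an audit_context from the caller and allocate a local one only when it is NULL) can be sketched in userspace roughly as follows. Every sk_-prefixed name is hypothetical, the struct is a stand-in for the kernel's audit_context, and the reply above notes that this approach was superseded anyway, so treat it only as an illustration of the pattern.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for struct audit_context; the real one is kernel-internal. */
struct sk_audit_context {
	int is_local; /* 1 when allocated only for a standalone record */
};

/* Hypothetical counterpart of audit_alloc_local(). */
static struct sk_audit_context *sk_alloc_local(void)
{
	struct sk_audit_context *ctx = calloc(1, sizeof(*ctx));

	if (ctx)
		ctx->is_local = 1;
	return ctx;
}

/* Use the caller's context when one is passed; fall back to a local
 * context otherwise and free it before returning.
 * Returns 1 if a local context was used, 0 if the caller's context was
 * used, and -1 if the fallback allocation failed. */
static int sk_log_config_change(struct sk_audit_context *passed)
{
	struct sk_audit_context *ctx = passed;
	int used_local = 0;

	if (!ctx) {
		ctx = sk_alloc_local();
		if (!ctx)
			return -1;
		used_local = 1;
	}

	/* ... the record and its container info would be logged against ctx ... */
	assert(ctx->is_local == used_local);

	if (used_local)
		free(ctx);
	return used_local;
}
```

The point of the pattern is that ownership stays unambiguous: the function frees only what it allocated itself.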
* Re: [PATCH v2 net 3/3] virtio_net: sparse annotation fix
From: Jason Wang @ 2018-04-19 12:27 UTC
To: Michael S. Tsirkin, linux-kernel
Cc: Mikulas Patocka, Eric Dumazet, David Miller, Thomas Huth,
Cornelia Huck, virtualization, netdev
In-Reply-To: <1524115776-334953-4-git-send-email-mst@redhat.com>
On 2018年04月19日 13:30, Michael S. Tsirkin wrote:
> offloads is a buffer in virtio format, should use
> the __virtio64 tag.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> drivers/net/virtio_net.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index f84fe04..c5b11f2 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -155,7 +155,7 @@ struct control_buf {
> u8 promisc;
> u8 allmulti;
> __virtio16 vid;
> - u64 offloads;
> + __virtio64 offloads;
> };
>
> struct virtnet_info {
Acked-by: Jason Wang <jasowang@redhat.com>
* Re: [PATCH bpf-next v3 5/8] bpf: add documentation for eBPF helpers (33-41)
From: Daniel Borkmann @ 2018-04-19 12:27 UTC
To: Quentin Monnet, ast; +Cc: netdev, oss-drivers, linux-doc, linux-man
In-Reply-To: <20180417143438.7018-6-quentin.monnet@netronome.com>
On 04/17/2018 04:34 PM, Quentin Monnet wrote:
> Add documentation for eBPF helper functions to bpf.h user header file.
> This documentation can be parsed with the Python script provided in
> another commit of the patch series, in order to provide a RST document
> that can later be converted into a man page.
>
> The objective is to make the documentation easily understandable and
> accessible to all eBPF developers, including beginners.
>
> This patch contains descriptions for the following helper functions, all
> written by Daniel:
>
> - bpf_get_hash_recalc()
> - bpf_skb_change_tail()
> - bpf_skb_pull_data()
> - bpf_csum_update()
> - bpf_set_hash_invalid()
> - bpf_get_numa_node_id()
> - bpf_set_hash()
> - bpf_skb_adjust_room()
> - bpf_xdp_adjust_meta()
>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
> ---
> include/uapi/linux/bpf.h | 155 +++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 155 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index d748f65a8f58..3a40f5debac2 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -965,9 +965,164 @@ union bpf_attr {
> * Return
> * 0 on success, or a negative error in case of failure.
> *
> + * u32 bpf_get_hash_recalc(struct sk_buff *skb)
> + * Description
> + * Retrieve the hash of the packet, *skb*\ **->hash**. If it is
> + * not set, in particular if the hash was cleared due to mangling,
> + * recompute this hash. Later accesses to the hash can be done
> + * directly with *skb*\ **->hash**.
> + *
> + * Calling **bpf_set_hash_invalid**\ (), changing a packet
> + * prototype with **bpf_skb_change_proto**\ (), or calling
> + * **bpf_skb_store_bytes**\ () with the
> + * **BPF_F_INVALIDATE_HASH** are actions susceptible to clear
> + * the hash and to trigger a new computation for the next call to
> + * **bpf_get_hash_recalc**\ ().
> + * Return
> + * The 32-bit hash.
> + *
> * u64 bpf_get_current_task(void)
> * Return
> * A pointer to the current task struct.
> + *
> + * int bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags)
> + * Description
> + * Resize (trim or grow) the packet associated to *skb* to the
> + * new *len*. The *flags* are reserved for future usage, and must
> + * be left at zero.
> + *
> + * The basic idea is that the helper performs the needed work to
> + * change the size of the packet, then the eBPF program rewrites
> + * the rest via helpers like **bpf_skb_store_bytes**\ (),
> + * **bpf_l3_csum_replace**\ (), **bpf_l3_csum_replace**\ ()
> + * and others. This helper is a slow path utility intended for
> + * replies with control messages. And because it is targeted for
> + * slow path, the helper itself can afford to be slow: it
> + * implicitly linearizes, unclones and drops offloads from the
> + * *skb*.
> + *
> + * A call to this helper is susceptible to change data from the
> + * packet. Therefore, at load time, all checks on pointers
> + * previously done by the verifier are invalidated and must be
> + * performed again.
> + * Return
> + * 0 on success, or a negative error in case of failure.
> + *
> + * int bpf_skb_pull_data(struct sk_buff *skb, u32 len)
> + * Description
> + * Pull in non-linear data in case the *skb* is non-linear and not
> + * all of *len* are part of the linear section. Make *len* bytes
> + * from *skb* readable and writable. If a zero value is passed for
> + * *len*, then the whole length of the *skb* is pulled.
> + *
> + * This helper is only needed for reading and writing with direct
> + * packet access.
> + *
> + * For direct packet access, when testing that offsets to access
> + * are within packet boundaries (test on *skb*\ **->data_end**)
> + * fails, programs just bail out, or, in the direct read case, use
I would add here why it can fail: either due to invalid offsets, or because
the requested data sits in non-linear parts of the skb. In the latter case,
either bpf_skb_load_bytes() can be used as you mentioned, or the data can be
pulled in via bpf_skb_pull_data().
> + * **bpf_skb_load_bytes()** as an alternative to overcome this
> + * limitation. If such data sits in non-linear parts, it is
> + * possible to pull them in once with the new helper, retest and
> + * eventually access them.
You do this here, but slightly rearranging this paragraph to explain why one
would use either of the helpers might improve the reading flow.
> + * At the same time, this also makes sure the skb is uncloned,
> + * which is a necessary condition for direct write. As this needs
> + * to be an invariant for the write part only, the verifier
> + * detects writes and adds a prologue that is calling
> + * **bpf_skb_pull_data()** to effectively unclone the skb from the
> + * very beginning in case it is indeed cloned.
> + *
> + * A call to this helper is susceptible to change data from the
> + * packet. Therefore, at load time, all checks on pointers
> + * previously done by the verifier are invalidated and must be
> + * performed again.
> + * Return
> + * 0 on success, or a negative error in case of failure.
> + *
> + * s64 bpf_csum_update(struct sk_buff *skb, __wsum csum)
> + * Description
> + * Add the checksum *csum* into *skb*\ **->csum** in case the
> + * driver fed us an IP checksum. Return an error otherwise. This
It's not the IP checksum specifically (if that is what you meant); it's when
the driver propagates CHECKSUM_COMPLETE to the skb, where the device has
supplied the checksum of the whole packet in skb->csum. At TC ingress time,
this covers everything from the network header offset to the end of the skb,
since the mac header has already been pulled out of skb->csum. The main use
case is indeed direct packet access.
> + * header is intended to be used in combination with
> + * **bpf_csum_diff()** helper, in particular when the checksum
> + * needs to be updated after data has been written into the packet
> + * through direct packet access.
> + * Return
> + * The checksum on success, or a negative error code in case of
> + * failure.
> + *
> + * void bpf_set_hash_invalid(struct sk_buff *skb)
> + * Description
> + * Invalidate the current *skb*\ **->hash**. It can be used after
> + * mangling on headers through direct packet access, in order to
> + * indicate that the hash is outdated and to trigger a
> + * recalculation the next time the kernel tries to access this
> + * hash.
[...] hash or through the helper bpf_get_hash_recalc().
> + *
> + * int bpf_get_numa_node_id(void)
> + * Description
> + * Return the id of the current NUMA node. The primary use case
> + * for this helper is the selection of sockets for the local NUMA
> + * node, when the program is attached to sockets using the
> + * **SO_ATTACH_REUSEPORT_EBPF** option (see also **socket(7)**).
I would mention that this is also available for other program types,
similarly to the bpf_get_smp_processor_id() helper. (Otherwise one might read
this as not being the case.)
> + * Return
> + * The id of current NUMA node.
> + *
> + * u32 bpf_set_hash(struct sk_buff *skb, u32 hash)
> + * Description
> + * Set the full hash for *skb* (set the field *skb*\ **->hash**)
> + * to value *hash*.
> + * Return
> + * 0
> + *
> + * int bpf_skb_adjust_room(struct sk_buff *skb, u32 len_diff, u32 mode, u64 flags)
> + * Description
> + * Grow or shrink the room for data in the packet associated to
> + * *skb* by *len_diff*, and according to the selected *mode*.
> + *
> + * There is a single supported mode at this time:
> + *
> + * * **BPF_ADJ_ROOM_NET**: Adjust room at the network layer
> + * (room space is added or removed below the layer 3 header).
> + *
> + * All values for *flags* are reserved for future usage, and must
> + * be left at zero.
> + *
> + * A call to this helper is susceptible to change data from the
> + * packet. Therefore, at load time, all checks on pointers
> + * previously done by the verifier are invalidated and must be
> + * performed again.
> + * Return
> + * 0 on success, or a negative error in case of failure.
> + *
> + * int bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
> + * Description
> + * Adjust the address pointed by *xdp_md*\ **->data_meta** by
> + * *delta* (which can be positive or negative). Note that this
> + * operation modifies the address stored in *xdp_md*\ **->data**,
> + * so the latter must be loaded only after the helper has been
> + * called.
> + *
> + * The use of *xdp_md*\ **->data_meta** is optional and programs
> + * are not required to use it. The rationale is that when the
> + * packet is processed with XDP (e.g. as DoS filter), it is
> + * possible to push further meta data along with it before passing
> + * to the stack, and to give the guarantee that an ingress eBPF
> + * program attached as a TC classifier on the same device can pick
> + * this up for further post-processing. Since TC works with socket
> + * buffers, it remains possible to set from XDP the **mark** or
> + * **priority** pointers, or other pointers for the socket buffer.
> + * Having this scratch space generic and programmable allows for
> + * more flexibility as the user is free to store whatever meta
> + * data they need.
> + *
> + * A call to this helper is susceptible to change data from the
> + * packet. Therefore, at load time, all checks on pointers
> + * previously done by the verifier are invalidated and must be
> + * performed again.
> + * Return
> + * 0 on success, or a negative error in case of failure.
> */
> #define __BPF_FUNC_MAPPER(FN) \
> FN(unspec), \
>
* Re: [PATCH v2 net 2/3] virtio_net: fix adding vids on big-endian
From: Jason Wang @ 2018-04-19 12:26 UTC
To: Michael S. Tsirkin, linux-kernel
Cc: Mikulas Patocka, Eric Dumazet, David Miller, Thomas Huth,
Cornelia Huck, virtualization, netdev
In-Reply-To: <1524115776-334953-3-git-send-email-mst@redhat.com>
On 2018年04月19日 13:30, Michael S. Tsirkin wrote:
> Programming vids (adding or removing them) still passes
> guest-endian values in the DMA buffer. That's wrong
> if guest is big-endian and when virtio 1 is enabled.
>
> Note: this is on top of a previous patch:
> virtio_net: split out ctrl buffer
>
> Fixes: 9465a7a6f ("virtio_net: enable v1.0 support")
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> drivers/net/virtio_net.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 3d0eff53..f84fe04 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -154,7 +154,7 @@ struct control_buf {
> struct virtio_net_ctrl_mq mq;
> u8 promisc;
> u8 allmulti;
> - u16 vid;
> + __virtio16 vid;
> u64 offloads;
> };
>
> @@ -1718,7 +1718,7 @@ static int virtnet_vlan_rx_add_vid(struct net_device *dev,
> struct virtnet_info *vi = netdev_priv(dev);
> struct scatterlist sg;
>
> - vi->ctrl->vid = vid;
> + vi->ctrl->vid = cpu_to_virtio16(vi->vdev, vid);
> sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid));
>
> if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
> @@ -1733,7 +1733,7 @@ static int virtnet_vlan_rx_kill_vid(struct net_device *dev,
> struct virtnet_info *vi = netdev_priv(dev);
> struct scatterlist sg;
>
> - vi->ctrl->vid = vid;
> + vi->ctrl->vid = cpu_to_virtio16(vi->vdev, vid);
> sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid));
>
> if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
Acked-by: Jason Wang <jasowang@redhat.com>
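To make the byte-order issue in the quoted patch concrete, here is a small userspace sketch of the kind of conversion cpu_to_virtio16() is understood to perform: little-endian layout when virtio 1 has been negotiated, guest-native layout for legacy devices. The function sk_put_virtio16() and its is_virtio_1 flag are hypothetical stand-ins; the kernel helper takes the virtio device and derives the mode from its features.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for cpu_to_virtio16(): write val into a 2-byte
 * buffer the way the device is expected to read it.
 *   is_virtio_1 != 0: always little-endian, regardless of guest CPU.
 *   is_virtio_1 == 0: legacy device, guest-native byte order. */
static void sk_put_virtio16(uint8_t out[2], uint16_t val, int is_virtio_1)
{
	assert(out != NULL);

	if (is_virtio_1) {
		out[0] = (uint8_t)(val & 0xff); /* low byte first */
		out[1] = (uint8_t)(val >> 8);
	} else {
		memcpy(out, &val, sizeof(val)); /* native layout, unchanged */
	}
}
```

On a little-endian guest both branches produce the same bytes, which is consistent with the commit message saying the bug only shows up on big-endian guests with virtio 1 enabled.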
* Re: [PATCH v2 net 1/3] virtio_net: split out ctrl buffer
From: Jason Wang @ 2018-04-19 12:26 UTC
To: Michael S. Tsirkin, linux-kernel
Cc: Thomas Huth, Eric Dumazet, netdev, Cornelia Huck, virtualization,
Mikulas Patocka, David Miller
In-Reply-To: <1524115776-334953-2-git-send-email-mst@redhat.com>
On 2018年04月19日 13:30, Michael S. Tsirkin wrote:
> When sending control commands, virtio net sets up several buffers for
> DMA. The buffers are all part of the net device which means it's
> actually allocated by kvmalloc so it's in theory (on extreme memory
> pressure) possible to get a vmalloc'ed buffer which on some platforms
> means we can't DMA there.
>
> Fix up by moving the DMA buffers into a separate structure.
>
> Reported-by: Mikulas Patocka <mpatocka@redhat.com>
> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>
> Changes from v1:
> build fix
>
> drivers/net/virtio_net.c | 68 +++++++++++++++++++++++++++---------------------
> 1 file changed, 39 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 7b187ec..3d0eff53 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -147,6 +147,17 @@ struct receive_queue {
> struct xdp_rxq_info xdp_rxq;
> };
>
> +/* Control VQ buffers: protected by the rtnl lock */
> +struct control_buf {
> + struct virtio_net_ctrl_hdr hdr;
> + virtio_net_ctrl_ack status;
> + struct virtio_net_ctrl_mq mq;
> + u8 promisc;
> + u8 allmulti;
> + u16 vid;
> + u64 offloads;
> +};
> +
> struct virtnet_info {
> struct virtio_device *vdev;
> struct virtqueue *cvq;
> @@ -192,14 +203,7 @@ struct virtnet_info {
> struct hlist_node node;
> struct hlist_node node_dead;
>
> - /* Control VQ buffers: protected by the rtnl lock */
> - struct virtio_net_ctrl_hdr ctrl_hdr;
> - virtio_net_ctrl_ack ctrl_status;
> - struct virtio_net_ctrl_mq ctrl_mq;
> - u8 ctrl_promisc;
> - u8 ctrl_allmulti;
> - u16 ctrl_vid;
> - u64 ctrl_offloads;
> + struct control_buf *ctrl;
>
> /* Ethtool settings */
> u8 duplex;
> @@ -1454,25 +1458,25 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> /* Caller should know better */
> BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
>
> - vi->ctrl_status = ~0;
> - vi->ctrl_hdr.class = class;
> - vi->ctrl_hdr.cmd = cmd;
> + vi->ctrl->status = ~0;
> + vi->ctrl->hdr.class = class;
> + vi->ctrl->hdr.cmd = cmd;
> /* Add header */
> - sg_init_one(&hdr, &vi->ctrl_hdr, sizeof(vi->ctrl_hdr));
> + sg_init_one(&hdr, &vi->ctrl->hdr, sizeof(vi->ctrl->hdr));
> sgs[out_num++] = &hdr;
>
> if (out)
> sgs[out_num++] = out;
>
> /* Add return status. */
> - sg_init_one(&stat, &vi->ctrl_status, sizeof(vi->ctrl_status));
> + sg_init_one(&stat, &vi->ctrl->status, sizeof(vi->ctrl->status));
> sgs[out_num] = &stat;
>
> BUG_ON(out_num + 1 > ARRAY_SIZE(sgs));
> virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, vi, GFP_ATOMIC);
>
> if (unlikely(!virtqueue_kick(vi->cvq)))
> - return vi->ctrl_status == VIRTIO_NET_OK;
> + return vi->ctrl->status == VIRTIO_NET_OK;
>
> /* Spin for a response, the kick causes an ioport write, trapping
> * into the hypervisor, so the request should be handled immediately.
> @@ -1481,7 +1485,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> !virtqueue_is_broken(vi->cvq))
> cpu_relax();
>
> - return vi->ctrl_status == VIRTIO_NET_OK;
> + return vi->ctrl->status == VIRTIO_NET_OK;
> }
>
> static int virtnet_set_mac_address(struct net_device *dev, void *p)
> @@ -1593,8 +1597,8 @@ static int _virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs)
> if (!vi->has_cvq || !virtio_has_feature(vi->vdev, VIRTIO_NET_F_MQ))
> return 0;
>
> - vi->ctrl_mq.virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs);
> - sg_init_one(&sg, &vi->ctrl_mq, sizeof(vi->ctrl_mq));
> + vi->ctrl->mq.virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs);
> + sg_init_one(&sg, &vi->ctrl->mq, sizeof(vi->ctrl->mq));
>
> if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ,
> VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET, &sg)) {
> @@ -1653,22 +1657,22 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_RX))
> return;
>
> - vi->ctrl_promisc = ((dev->flags & IFF_PROMISC) != 0);
> - vi->ctrl_allmulti = ((dev->flags & IFF_ALLMULTI) != 0);
> + vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0);
> + vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0);
>
> - sg_init_one(sg, &vi->ctrl_promisc, sizeof(vi->ctrl_promisc));
> + sg_init_one(sg, &vi->ctrl->promisc, sizeof(vi->ctrl->promisc));
>
> if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX,
> VIRTIO_NET_CTRL_RX_PROMISC, sg))
> dev_warn(&dev->dev, "Failed to %sable promisc mode.\n",
> - vi->ctrl_promisc ? "en" : "dis");
> + vi->ctrl->promisc ? "en" : "dis");
>
> - sg_init_one(sg, &vi->ctrl_allmulti, sizeof(vi->ctrl_allmulti));
> + sg_init_one(sg, &vi->ctrl->allmulti, sizeof(vi->ctrl->allmulti));
>
> if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_RX,
> VIRTIO_NET_CTRL_RX_ALLMULTI, sg))
> dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n",
> - vi->ctrl_allmulti ? "en" : "dis");
> + vi->ctrl->allmulti ? "en" : "dis");
>
> uc_count = netdev_uc_count(dev);
> mc_count = netdev_mc_count(dev);
> @@ -1714,8 +1718,8 @@ static int virtnet_vlan_rx_add_vid(struct net_device *dev,
> struct virtnet_info *vi = netdev_priv(dev);
> struct scatterlist sg;
>
> - vi->ctrl_vid = vid;
> - sg_init_one(&sg, &vi->ctrl_vid, sizeof(vi->ctrl_vid));
> + vi->ctrl->vid = vid;
> + sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid));
>
> if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
> VIRTIO_NET_CTRL_VLAN_ADD, &sg))
> @@ -1729,8 +1733,8 @@ static int virtnet_vlan_rx_kill_vid(struct net_device *dev,
> struct virtnet_info *vi = netdev_priv(dev);
> struct scatterlist sg;
>
> - vi->ctrl_vid = vid;
> - sg_init_one(&sg, &vi->ctrl_vid, sizeof(vi->ctrl_vid));
> + vi->ctrl->vid = vid;
> + sg_init_one(&sg, &vi->ctrl->vid, sizeof(vi->ctrl->vid));
>
> if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
> VIRTIO_NET_CTRL_VLAN_DEL, &sg))
> @@ -2126,9 +2130,9 @@ static int virtnet_restore_up(struct virtio_device *vdev)
> static int virtnet_set_guest_offloads(struct virtnet_info *vi, u64 offloads)
> {
> struct scatterlist sg;
> - vi->ctrl_offloads = cpu_to_virtio64(vi->vdev, offloads);
> + vi->ctrl->offloads = cpu_to_virtio64(vi->vdev, offloads);
>
> - sg_init_one(&sg, &vi->ctrl_offloads, sizeof(vi->ctrl_offloads));
> + sg_init_one(&sg, &vi->ctrl->offloads, sizeof(vi->ctrl->offloads));
>
> if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_GUEST_OFFLOADS,
> VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET, &sg)) {
> @@ -2351,6 +2355,7 @@ static void virtnet_free_queues(struct virtnet_info *vi)
>
> kfree(vi->rq);
> kfree(vi->sq);
> + kfree(vi->ctrl);
> }
>
> static void _free_receive_bufs(struct virtnet_info *vi)
> @@ -2543,6 +2548,9 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
> {
> int i;
>
> + vi->ctrl = kzalloc(sizeof(*vi->ctrl), GFP_KERNEL);
> + if (!vi->ctrl)
> + goto err_ctrl;
> vi->sq = kzalloc(sizeof(*vi->sq) * vi->max_queue_pairs, GFP_KERNEL);
> if (!vi->sq)
> goto err_sq;
> @@ -2571,6 +2579,8 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
> err_rq:
> kfree(vi->sq);
> err_sq:
> + kfree(vi->ctrl);
> +err_ctrl:
> return -ENOMEM;
> }
>
Acked-by: Jason Wang <jasowang@redhat.com>
^ permalink raw reply
* Re: [RFC PATCH ghak32 V2 07/13] audit: add container aux record to watch/tree/mark
From: Richard Guy Briggs @ 2018-04-19 12:24 UTC (permalink / raw)
To: Paul Moore
Cc: cgroups, containers, linux-api, Linux-Audit Mailing List,
linux-fsdevel, LKML, netdev, ebiederm, luto, jlayton, carlos,
dhowells, viro, simo, Eric Paris, serge
In-Reply-To: <CAHC9VhTzp-r2TFytt1zTEpeGK=O5dEnLPFw-CdsM1ttpY0a30g@mail.gmail.com>
On 2018-04-18 20:42, Paul Moore wrote:
> On Fri, Mar 16, 2018 at 5:00 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > Add container ID auxiliary record to mark, watch and tree rule
> > configuration standalone records.
> >
> > Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> > ---
> > kernel/audit_fsnotify.c | 5 ++++-
> > kernel/audit_tree.c | 5 ++++-
> > kernel/audit_watch.c | 33 +++++++++++++++++++--------------
> > 3 files changed, 27 insertions(+), 16 deletions(-)
> >
> > diff --git a/kernel/audit_fsnotify.c b/kernel/audit_fsnotify.c
> > index 52f368b..18c110d 100644
> > --- a/kernel/audit_fsnotify.c
> > +++ b/kernel/audit_fsnotify.c
> > @@ -124,10 +124,11 @@ static void audit_mark_log_rule_change(struct audit_fsnotify_mark *audit_mark, c
> > {
> > struct audit_buffer *ab;
> > struct audit_krule *rule = audit_mark->rule;
> > + struct audit_context *context = audit_alloc_local();
> >
> > if (!audit_enabled)
> > return;
>
> Move the audit_alloc_local() after the audit_enabled check.
Already fixed in V3 as previously warned, by making all
AUDIT_CONFIG_CHANGE records SYSCALL auxiliary records.
> > - ab = audit_log_start(NULL, GFP_NOFS, AUDIT_CONFIG_CHANGE);
> > + ab = audit_log_start(context, GFP_NOFS, AUDIT_CONFIG_CHANGE);
> > if (unlikely(!ab))
> > return;
> > audit_log_format(ab, "auid=%u ses=%u op=%s",
> > @@ -138,6 +139,8 @@ static void audit_mark_log_rule_change(struct audit_fsnotify_mark *audit_mark, c
> > audit_log_key(ab, rule->filterkey);
> > audit_log_format(ab, " list=%d res=1", rule->listnr);
> > audit_log_end(ab);
> > + audit_log_container_info(context, "config", audit_get_containerid(current));
> > + audit_free_context(context);
> > }
> >
> > void audit_remove_mark(struct audit_fsnotify_mark *audit_mark)
> > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> > index 67e6956..7c085be 100644
> > --- a/kernel/audit_tree.c
> > +++ b/kernel/audit_tree.c
> > @@ -496,8 +496,9 @@ static int tag_chunk(struct inode *inode, struct audit_tree *tree)
> > static void audit_tree_log_remove_rule(struct audit_krule *rule)
> > {
> > struct audit_buffer *ab;
> > + struct audit_context *context = audit_alloc_local();
>
> Sort of independent of the audit container ID work, but shouldn't we
> have an audit_enabled check here?
Same.
> > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
> > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
> > if (unlikely(!ab))
> > return;
> > audit_log_format(ab, "op=remove_rule");
> > @@ -506,6 +507,8 @@ static void audit_tree_log_remove_rule(struct audit_krule *rule)
> > audit_log_key(ab, rule->filterkey);
> > audit_log_format(ab, " list=%d res=1", rule->listnr);
> > audit_log_end(ab);
> > + audit_log_container_info(context, "config", audit_get_containerid(current));
> > + audit_free_context(context);
> > }
> >
> > static void kill_rules(struct audit_tree *tree)
> > diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c
> > index 9eb8b35..60d75a2 100644
> > --- a/kernel/audit_watch.c
> > +++ b/kernel/audit_watch.c
> > @@ -238,20 +238,25 @@ static struct audit_watch *audit_dupe_watch(struct audit_watch *old)
> >
> > static void audit_watch_log_rule_change(struct audit_krule *r, struct audit_watch *w, char *op)
> > {
> > - if (audit_enabled) {
> > - struct audit_buffer *ab;
> > - ab = audit_log_start(NULL, GFP_NOFS, AUDIT_CONFIG_CHANGE);
> > - if (unlikely(!ab))
> > - return;
> > - audit_log_format(ab, "auid=%u ses=%u op=%s",
> > - from_kuid(&init_user_ns, audit_get_loginuid(current)),
> > - audit_get_sessionid(current), op);
> > - audit_log_format(ab, " path=");
> > - audit_log_untrustedstring(ab, w->path);
> > - audit_log_key(ab, r->filterkey);
> > - audit_log_format(ab, " list=%d res=1", r->listnr);
> > - audit_log_end(ab);
> > - }
> > + struct audit_buffer *ab;
> > + struct audit_context *context = audit_alloc_local();
> > +
> > + if (!audit_enabled)
> > + return;
>
> Same as above, do the allocation after the audit_enabled check.
Same.
> > + ab = audit_log_start(context, GFP_NOFS, AUDIT_CONFIG_CHANGE);
> > + if (unlikely(!ab))
> > + return;
> > + audit_log_format(ab, "auid=%u ses=%u op=%s",
> > + from_kuid(&init_user_ns, audit_get_loginuid(current)),
> > + audit_get_sessionid(current), op);
> > + audit_log_format(ab, " path=");
> > + audit_log_untrustedstring(ab, w->path);
> > + audit_log_key(ab, r->filterkey);
> > + audit_log_format(ab, " list=%d res=1", r->listnr);
> > + audit_log_end(ab);
> > + audit_log_container_info(context, "config", audit_get_containerid(current));
> > + audit_free_context(context);
> > }
>
> --
> paul moore
> www.paul-moore.com
- RGB
--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635
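The ordering Paul asks for above (check audit_enabled first, allocate the local context only afterwards) can be sketched in userspace as follows. This is a minimal illustration, not the V3 code: struct ctx, ctx_alloc(), ctx_free(), audit_enabled_flag and live_contexts are hypothetical stand-ins for the kernel objects, chosen so that a leaked context becomes visible as a non-zero counter.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the real audit API. */
struct ctx {
	int unused;
};

static bool audit_enabled_flag;	/* stands in for audit_enabled */
static int live_contexts;	/* number of contexts allocated but not freed */

static struct ctx *ctx_alloc(void)
{
	struct ctx *c = calloc(1, sizeof(*c));

	if (c)
		live_contexts++;
	return c;
}

static void ctx_free(struct ctx *c)
{
	if (c) {
		live_contexts--;
		free(c);
	}
}

/* Returns 1 if a record would have been logged, 0 otherwise. */
static int log_rule_change(void)
{
	struct ctx *c;

	/* Early exit happens before anything is allocated. */
	if (!audit_enabled_flag)
		return 0;

	c = ctx_alloc();
	if (!c)
		return 0;

	/* ... format and emit the record here ... */

	ctx_free(c);
	return 1;
}
```

With the check placed first, the disabled path never touches the allocator, so there is nothing to release on that path.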
^ permalink raw reply
* Re: [PATCH] net: phy: TLK10X initial driver submission
From: Miguel Ojeda @ 2018-04-19 12:24 UTC (permalink / raw)
To: Måns Andersson
Cc: Rob Herring, Mark Rutland, Andrew Lunn, Florian Fainelli,
Network Development, devicetree, linux-kernel
In-Reply-To: <20180419082816.109338-1-mans.andersson@nibe.se>
On Thu, Apr 19, 2018 at 10:28 AM, Måns Andersson <mans.andersson@nibe.se> wrote:
> From: Mans Andersson <mans.andersson@nibe.se>
>
> Add support for the TI TLK105 and TLK106 10/100Mbit ethernet phys.
>
Hi Mans,
Some quick notes.
> In addition the TLK10X needs to be removed from DP83848 driver as the
> power back off support is added here for this device.
>
> Datasheet:
> http://www.ti.com/lit/gpn/tlk106
Missing Signed-off-by line.
> ---
> .../devicetree/bindings/net/ti,tlk10x.txt | 27 +++
> drivers/net/phy/Kconfig | 5 +
> drivers/net/phy/Makefile | 1 +
> drivers/net/phy/dp83848.c | 3 -
> drivers/net/phy/tlk10x.c | 209 +++++++++++++++++++++
> 5 files changed, 242 insertions(+), 3 deletions(-)
> create mode 100644 Documentation/devicetree/bindings/net/ti,tlk10x.txt
> create mode 100644 drivers/net/phy/tlk10x.c
>
> diff --git a/Documentation/devicetree/bindings/net/ti,tlk10x.txt b/Documentation/devicetree/bindings/net/ti,tlk10x.txt
> new file mode 100644
> index 0000000..371d0d7
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/ti,tlk10x.txt
> @@ -0,0 +1,27 @@
> +* Texas Instruments - TLK105 / TLK106 ethernet PHYs
> +
> +Required properties:
> + - reg - The ID number for the phy, usually a small integer
> +
> +Optional properties:
> + - ti,power-back-off - Power Back Off Level
> + Please refer to data sheet chapter 8.6 and TI Application
> + Note SLLA3228
> + 0 - Normal Operation
> + 1 - Level 1 (up to 140m cable between TLK link partners)
> + 2 - Level 2 (up to 100m cable between TLK link partners)
> + 3 - Level 3 (up to 80m cable between TLK link partners)
> +
> +Default child nodes are standard Ethernet PHY device
> +nodes as described in Documentation/devicetree/bindings/net/phy.txt
> +
> +Example:
> +
> + ethernet-phy@0 {
> + reg = <0>;
> + ti,power-back-off = <2>;
> + };
> +
> +Datasheets and documentation can be found at:
> +http://www.ti.com/lit/gpn/tlk106
> +http://www.ti.com/lit/an/slla328/slla328.pdf
> diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
> index bdfbabb..c980240 100644
> --- a/drivers/net/phy/Kconfig
> +++ b/drivers/net/phy/Kconfig
> @@ -295,6 +295,11 @@ config DP83867_PHY
> ---help---
> Currently supports the DP83867 PHY.
>
> +config TLK10X_PHY
> + tristate "Texas Instruments TLK10x PHY"
> + ---help---
> + Supports the TLK105 and TLK106 PHYs.
> +
> config FIXED_PHY
> tristate "MDIO Bus/PHY emulation with fixed speed/link PHYs"
> depends on PHYLIB
> diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
> index 01acbcb..37e4e02 100644
> --- a/drivers/net/phy/Makefile
> +++ b/drivers/net/phy/Makefile
> @@ -79,5 +79,6 @@ obj-$(CONFIG_ROCKCHIP_PHY) += rockchip.o
> obj-$(CONFIG_SMSC_PHY) += smsc.o
> obj-$(CONFIG_STE10XP) += ste10Xp.o
> obj-$(CONFIG_TERANETICS_PHY) += teranetics.o
> +obj-$(CONFIG_TLK10X_PHY) += tlk10x.o
> obj-$(CONFIG_VITESSE_PHY) += vitesse.o
> obj-$(CONFIG_XILINX_GMII2RGMII) += xilinx_gmii2rgmii.o
> diff --git a/drivers/net/phy/dp83848.c b/drivers/net/phy/dp83848.c
> index cd09c3a..435f401 100644
> --- a/drivers/net/phy/dp83848.c
> +++ b/drivers/net/phy/dp83848.c
> @@ -19,7 +19,6 @@
> #define TI_DP83848C_PHY_ID 0x20005ca0
> #define TI_DP83620_PHY_ID 0x20005ce0
> #define NS_DP83848C_PHY_ID 0x20005c90
> -#define TLK10X_PHY_ID 0x2000a210
>
> /* Registers */
> #define DP83848_MICR 0x11 /* MII Interrupt Control Register */
> @@ -78,7 +77,6 @@ static struct mdio_device_id __maybe_unused dp83848_tbl[] = {
> { TI_DP83848C_PHY_ID, 0xfffffff0 },
> { NS_DP83848C_PHY_ID, 0xfffffff0 },
> { TI_DP83620_PHY_ID, 0xfffffff0 },
> - { TLK10X_PHY_ID, 0xfffffff0 },
> { }
> };
> MODULE_DEVICE_TABLE(mdio, dp83848_tbl);
> @@ -105,7 +103,6 @@ static struct phy_driver dp83848_driver[] = {
> DP83848_PHY_DRIVER(TI_DP83848C_PHY_ID, "TI DP83848C 10/100 Mbps PHY"),
> DP83848_PHY_DRIVER(NS_DP83848C_PHY_ID, "NS DP83848C 10/100 Mbps PHY"),
> DP83848_PHY_DRIVER(TI_DP83620_PHY_ID, "TI DP83620 10/100 Mbps PHY"),
> - DP83848_PHY_DRIVER(TLK10X_PHY_ID, "TI TLK10X 10/100 Mbps PHY"),
> };
> module_phy_driver(dp83848_driver);
>
> diff --git a/drivers/net/phy/tlk10x.c b/drivers/net/phy/tlk10x.c
> new file mode 100644
> index 0000000..1efc81e
> --- /dev/null
> +++ b/drivers/net/phy/tlk10x.c
> @@ -0,0 +1,209 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/**
> + * Driver for the Texas Instruments TLK105 / TLK106
> + *
> + * Copyright (C) 2018 NIBE Industrier AB - http://www.nibe.com
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
Since you are using the SPDX id, please remove the license text (which
is actually wrong: it seems you have cut the v2+ version and then
removed the last sentence of the first paragraph? :-).
> + */
> +
> +#include <linux/module.h>
> +#include <linux/phy.h>
> +#include <linux/of.h>
> +
> +#define TLK10X_PHY_ID 0x2000a210
> +
> +/* Registers */
> +#define TLK10X_MICR 0x11 /* MII Interrupt Control Reg */
> +#define TLK10X_MISR 0x12 /* MII Interrupt Status Reg */
> +#define TLK10X_REGCR 0x0d /* Register Control Register */
> +#define TLK10X_ADDAR 0x0e /* Data Register */
> +#define TLK10X_PWRBOCR 0xae /* Power Backoff Register */
> +
> +/* MICR Register Fields */
> +#define TLK10X_MICR_INT_OE BIT(0) /* Interrupt Output Enable */
> +#define TLK10X_MICR_INTEN BIT(1) /* Interrupt Enable */
> +
> +/* MISR Register Fields */
> +#define TLK10X_MISR_RHF_INT_EN BIT(0) /* Receive Error Counter */
> +#define TLK10X_MISR_FHF_INT_EN BIT(1) /* False Carrier Counter */
> +#define TLK10X_MISR_ANC_INT_EN BIT(2) /* Auto-negotiation complete */
> +#define TLK10X_MISR_DUP_INT_EN BIT(3) /* Duplex Status */
> +#define TLK10X_MISR_SPD_INT_EN BIT(4) /* Speed status */
> +#define TLK10X_MISR_LINK_INT_EN BIT(5) /* Link status */
> +#define TLK10X_MISR_ED_INT_EN BIT(6) /* Energy detect */
> +#define TLK10X_MISR_LQM_INT_EN BIT(7) /* Link Quality Monitor */
> +
> +/* PWRBOCR Register Fields */
> +#define TLK10X_PWRBOCR_MASK 0xe0 /* Power Backoff Mask */
> +
> +#define TLK10X_INT_EN_MASK \
> + (TLK10X_MISR_ANC_INT_EN | \
> + TLK10X_MISR_DUP_INT_EN | \
> + TLK10X_MISR_SPD_INT_EN | \
> + TLK10X_MISR_LINK_INT_EN)
> +
> +struct tlk10x_private {
> + int pwrbo_level;
> +};
> +
> +static int tlk10x_read(struct phy_device *phydev, int reg)
> +{
> + if (reg & ~0x1f) {
0x1f or ~0x1f should probably have a #define name.
> + /* Extended register */
> + phy_write(phydev, TLK10X_REGCR, 0x001F);
> + phy_write(phydev, TLK10X_ADDAR, reg);
> + phy_write(phydev, TLK10X_REGCR, 0x401F);
> + reg = TLK10X_ADDAR;
> + }
> +
> + return phy_read(phydev, reg);
> +}
> +
> +static int tlk10x_write(struct phy_device *phydev, int reg, int val)
> +{
> + if (reg & ~0x1f) {
Ditto.
> + /* Extended register */
> + phy_write(phydev, TLK10X_REGCR, 0x001F);
> + phy_write(phydev, TLK10X_ADDAR, reg);
> + phy_write(phydev, TLK10X_REGCR, 0x401F);
> + reg = TLK10X_ADDAR;
> + }
> +
> + return phy_write(phydev, reg, val);
> +}
> +
> +#ifdef CONFIG_OF_MDIO
Maybe you want the #ifdef inside the function body, so there is a single definition.
> +static int tlk10x_of_init(struct phy_device *phydev)
> +{
> + struct tlk10x_private *tlk10x = phydev->priv;
> + struct device *dev = &phydev->mdio.dev;
> + struct device_node *of_node = dev->of_node;
> + int ret;
> +
> + if (!of_node)
> + return 0;
> +
> + ret = of_property_read_u32(of_node, "ti,power-back-off",
> + &tlk10x->pwrbo_level);
> + if (ret) {
> + dev_err(dev, "missing ti,power-back-off property");
> + tlk10x->pwrbo_level = 0;
> + }
> +
> + return 0;
> +}
> +#else
> +static int tlk10x_of_init(struct phy_device *phydev)
> +{
> + return 0;
> +}
> +#endif /* CONFIG_OF_MDIO */
> +
> +static int tlk10x_config_init(struct phy_device *phydev)
> +{
> + int ret, reg;
> + struct tlk10x_private *tlk10x;
> +
> + ret = genphy_config_init(phydev);
> + if (ret < 0)
> + return ret;
> +
> + if (!phydev->priv) {
> + tlk10x = devm_kzalloc(&phydev->mdio.dev, sizeof(*tlk10x),
> + GFP_KERNEL);
> + if (!tlk10x)
> + return -ENOMEM;
> +
> + phydev->priv = tlk10x;
> + ret = tlk10x_of_init(phydev);
> + if (ret)
> + return ret;
> + } else {
> + tlk10x = (struct tlk10x_private *)phydev->priv;
> + }
> +
> + // Power back off
> + if (tlk10x->pwrbo_level < 0 || tlk10x->pwrbo_level > 3)
> + tlk10x->pwrbo_level = 0;
> + reg = tlk10x_read(phydev, TLK10X_PWRBOCR);
> + reg = ((reg & ~TLK10X_PWRBOCR_MASK)
> + | (tlk10x->pwrbo_level << 6));
Maybe the 6 should have a name, or maybe a bigger macro for this would clarify.
> + ret = tlk10x_write(phydev, TLK10X_PWRBOCR, reg);
> + if (ret < 0) {
> + dev_err(&phydev->mdio.dev,
> + "unable to set power back-off (err=%d)\n", ret);
> + return ret;
> + }
> + dev_info(&phydev->mdio.dev, "power back-off set to level %d\n",
> + tlk10x->pwrbo_level);
> +
> + return 0;
> +}
> +
> +static int tlk10x_ack_interrupt(struct phy_device *phydev)
> +{
> + int err = tlk10x_read(phydev, TLK10X_MISR);
Following the style of the rest of the file, shouldn't this be:
    if (err < 0)
            return err;
    return 0;
?
> +
> + return err < 0 ? err : 0;
> +}
> +
> +static int tlk10x_config_intr(struct phy_device *phydev)
> +{
> + int control, ret;
> +
> + control = tlk10x_read(phydev, TLK10X_MICR);
> + if (control < 0)
> + return control;
> +
> + if (phydev->interrupts == PHY_INTERRUPT_ENABLED) {
> + control |= TLK10X_MICR_INT_OE;
> + control |= TLK10X_MICR_INTEN;
> +
> + ret = tlk10x_write(phydev, TLK10X_MISR, TLK10X_INT_EN_MASK);
> + if (ret < 0)
> + return ret;
> + } else {
> + control &= ~TLK10X_MICR_INTEN;
> + }
> +
> + return tlk10x_write(phydev, TLK10X_MICR, control);
> +}
> +
> +static struct phy_driver tlk10x_driver[] = {
> + {
> + .phy_id = TLK10X_PHY_ID,
> + .phy_id_mask = 0xfffffff0,
> + .name = "TI TLK10X 10/100 Mbps PHY",
> + .features = PHY_BASIC_FEATURES,
> + .flags = PHY_HAS_INTERRUPT,
> +
> + .config_init = tlk10x_config_init,
> + .soft_reset = genphy_soft_reset,
> +
> + /* IRQ related */
> + .ack_interrupt = tlk10x_ack_interrupt,
> + .config_intr = tlk10x_config_intr,
> +
> + .suspend = genphy_suspend,
> + .resume = genphy_resume,
> + },
> +};
> +module_phy_driver(tlk10x_driver);
> +
> +static struct mdio_device_id __maybe_unused tlk10x_tbl[] = {
> + { TLK10X_PHY_ID, 0xfffffff0 },
> + { }
> +};
> +MODULE_DEVICE_TABLE(mdio, tlk10x_tbl);
> +
> +MODULE_DESCRIPTION("Texas Instruments TLK105 / TLK106 PHY driver");
> +MODULE_AUTHOR("Mans Andersson <mans.andersson@nibe.se>");
> +MODULE_LICENSE("GPL");
Should be "GPL v2".
Cheers,
Miguel
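As one possible reading of the comments above about the bare 0x1f and the bare shift by 6, the constants could be given names and the power back-off update wrapped in a small helper. The sketch below is userspace C for illustration only. TLK10X_STD_REG_MASK, TLK10X_PWRBOCR_SHIFT, TLK10X_PWRBO_LEVEL_MAX and both helper functions are hypothetical names; only TLK10X_PWRBOCR_MASK and the numeric values come from the patch, and none of it has been checked against the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names; only the numeric values are taken from the patch. */
#define TLK10X_STD_REG_MASK	0x1f	/* registers 0x00..0x1f: direct access */
#define TLK10X_PWRBOCR_MASK	0xe0	/* field mask as defined in the patch */
#define TLK10X_PWRBOCR_SHIFT	6	/* shift used by the patch */
#define TLK10X_PWRBO_LEVEL_MAX	3

/* True when the register lies outside the directly addressable range. */
static bool tlk10x_is_extended_reg(int reg)
{
	return (reg & ~TLK10X_STD_REG_MASK) != 0;
}

/* Clear the back-off field in regval and insert the (clamped) level. */
static uint16_t tlk10x_pwrbo_update(uint16_t regval, int level)
{
	if (level < 0 || level > TLK10X_PWRBO_LEVEL_MAX)
		level = 0;

	return (uint16_t)((regval & ~TLK10X_PWRBOCR_MASK) |
			  ((unsigned int)level << TLK10X_PWRBOCR_SHIFT));
}
```

Both tlk10x_read() and tlk10x_write() could then share tlk10x_is_extended_reg(), and tlk10x_config_init() would call the update helper instead of open-coding the mask and shift.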
^ permalink raw reply
* Re: [PATCH v4 1/9] net-next: phy: new Asix Electronics PHY driver
From: Andrew Lunn @ 2018-04-19 12:21 UTC (permalink / raw)
To: Michael Schmitz
Cc: netdev, fthain, geert, f.fainelli, linux-m68k, Michael.Karcher
In-Reply-To: <1524103526-12240-2-git-send-email-schmitzmic@gmail.com>
On Thu, Apr 19, 2018 at 02:05:18PM +1200, Michael Schmitz wrote:
> The Asix Electronics PHY found on the X-Surf 100 Amiga Zorro network
> card by Individual Computers is buggy, and needs the reset bit toggled
> as workaround to make a PHY soft reset succeed.
>
> Add workaround driver just for this special case.
>
> Suggested in xsurf100 patch series review by Andrew Lunn <andrew@lunn.ch>
>
> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
^ permalink raw reply
* Re: [PATCH] net: phy: marvell: clear wol event before setting it
From: Andrew Lunn @ 2018-04-19 12:18 UTC (permalink / raw)
To: Jisheng Zhang
Cc: Florian Fainelli, David S. Miller, netdev, linux-kernel,
Jingju Hou
In-Reply-To: <20180419160232.519d15be@xhacker.debian>
On Thu, Apr 19, 2018 at 04:02:32PM +0800, Jisheng Zhang wrote:
> From: Jingju Hou <Jingju.Hou@synaptics.com>
>
> If WOL event happened once, the LED[2] interrupt pin will not be
> cleared unless reading the CSISR register. So clear the WOL event
> before enabling it.
>
> Signed-off-by: Jingju Hou <Jingju.Hou@synaptics.com>
> Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
> ---
> drivers/net/phy/marvell.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
> index c22e8e383247..b6abe1cbc84b 100644
> --- a/drivers/net/phy/marvell.c
> +++ b/drivers/net/phy/marvell.c
> @@ -115,6 +115,9 @@
> /* WOL Event Interrupt Enable */
> #define MII_88E1318S_PHY_CSIER_WOL_EIE BIT(7)
>
> +/* Copper Specific Interrupt Status Register */
> +#define MII_88E1318S_PHY_CSISR 0x13
> +
> /* LED Timer Control Register */
> #define MII_88E1318S_PHY_LED_TCR 0x12
> #define MII_88E1318S_PHY_LED_TCR_FORCE_INT BIT(15)
> @@ -1393,6 +1396,12 @@ static int m88e1318_set_wol(struct phy_device *phydev,
> if (err < 0)
> goto error;
>
> + /* If WOL event happened once, the LED[2] interrupt pin
> + * will not be cleared unless reading the CSISR register.
> + * So clear the WOL event first before enabling it.
> + */
> + phy_read(phydev, MII_88E1318S_PHY_CSISR);
> +
Hi Jisheng
The problem with this is that you could be clearing a real interrupt,
such as link down/up. If interrupts are in use, I think the normal
interrupt handling will clear the WOL interrupt? So can you make this
read conditional on !phy_interrupt_is_valid()?
Andrew
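The conditional read suggested above might look roughly like the following. This is a userspace sketch with mocked types so that it compiles on its own: struct phy_device, phy_read() and phy_interrupt_is_valid() here are simplified stand-ins for the helpers in <linux/phy.h>, the mock of phy_interrupt_is_valid() reflects an assumption about how the kernel helper behaves, and clear_stale_wol_event() is a hypothetical name.

```c
#include <stdbool.h>

/* Userspace stand-ins; in the kernel these come from <linux/phy.h>. */
#define PHY_POLL		-1
#define PHY_IGNORE_INTERRUPT	-2
#define MII_88E1318S_PHY_CSISR	0x13

struct phy_device {
	int irq;
	int csisr_reads;	/* mock only: counts reads of the status register */
};

/* Assumed behaviour of the kernel helper: true when a real IRQ is wired up. */
static bool phy_interrupt_is_valid(struct phy_device *phydev)
{
	return phydev->irq != PHY_POLL && phydev->irq != PHY_IGNORE_INTERRUPT;
}

/* Mock register read that only records accesses to CSISR. */
static int phy_read(struct phy_device *phydev, int regnum)
{
	if (regnum == MII_88E1318S_PHY_CSISR)
		phydev->csisr_reads++;
	return 0;
}

/*
 * Read (and thereby clear) the status register only when no interrupt
 * handler exists that would otherwise consume the pending events.
 */
static int clear_stale_wol_event(struct phy_device *phydev)
{
	if (phy_interrupt_is_valid(phydev))
		return 0;

	return phy_read(phydev, MII_88E1318S_PHY_CSISR);
}
```

In polled mode the stale WOL event is cleared; with a valid interrupt the register is left alone so that pending link events are not lost.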
^ permalink raw reply
* Re: [RFC PATCH ghak32 V2 04/13] audit: add containerid filtering
From: Richard Guy Briggs @ 2018-04-19 12:17 UTC (permalink / raw)
To: Paul Moore
Cc: cgroups, containers, linux-api, Linux-Audit Mailing List,
linux-fsdevel, LKML, netdev, ebiederm, luto, jlayton, carlos,
dhowells, viro, simo, Eric Paris, serge
In-Reply-To: <CAHC9VhRVGTCVJxG3Etcs-aOpr71A7xGsn5VPhskUG35rmQ7WUw@mail.gmail.com>
On 2018-04-18 20:24, Paul Moore wrote:
> On Fri, Mar 16, 2018 at 5:00 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> > Implement container ID filtering using the AUDIT_CONTAINERID field name
> > to send an 8-character string representing a u64 since the value field
> > is only u32.
> >
> > Sending it as two u32 was considered, but gathering and comparing two
> > fields was more complex.
>
> My only worry here is that you aren't really sending a string in the
> ASCII sense, you are sending an 8 byte buffer (that better be NUL
> terminated) that happens to be an unsigned 64-bit integer. To be
> clear, I'm okay with that (it's protected by AUDIT_CONTAINERID), and
> the code is okay with that, I just want us to pause for a minute and
> make sure that is an okay thing to do long term.
I already went through that process and warned of it 7 weeks ago. As
already noted, that was preferable to two separate u32 fields that
depend on each other, which would make comparisons more complicated.
Using two separate fields to configure the rule could be gated for
validity, with the result then stored in a special rule field, but I
wasn't keen on that approach.
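To make the point under discussion concrete, here is a small userspace sketch of carrying a u64 through an 8-byte buffer whose length is validated on the receiving side. containerid_pack() and containerid_unpack() are hypothetical helpers, not functions from the patch. The sketch uses memcpy() rather than a pointer cast, and the bytes are in host byte order, so the buffer may contain zero bytes and cannot be handled with C string functions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the id into buf, return the number of bytes used. */
static size_t containerid_pack(unsigned char *buf, uint64_t id)
{
	memcpy(buf, &id, sizeof(id));
	return sizeof(id);
}

/*
 * Hypothetical helper: accept the buffer only if it is exactly 8 bytes,
 * mirroring the f->val != sizeof(u64) check in the patch.
 * Returns 0 on success, -1 on a length mismatch.
 */
static int containerid_unpack(const unsigned char *buf, size_t len,
			      uint64_t *id)
{
	if (len != sizeof(*id))
		return -1;

	memcpy(id, buf, sizeof(*id));
	return 0;
}
```

The length check is what keeps the "string" interpretation safe here: anything other than exactly eight bytes is rejected before the value is read.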
> > The feature indicator is AUDIT_FEATURE_BITMAP_CONTAINERID_FILTER.
> >
> > This requires support from userspace to be useful.
> > See: https://github.com/linux-audit/audit-userspace/issues/40
> > Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> > ---
> > include/linux/audit.h | 1 +
> > include/uapi/linux/audit.h | 5 ++++-
> > kernel/audit.h | 1 +
> > kernel/auditfilter.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++
> > kernel/auditsc.c | 3 +++
> > 5 files changed, 56 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/audit.h b/include/linux/audit.h
> > index 3acbe9d..f10ca1b 100644
> > --- a/include/linux/audit.h
> > +++ b/include/linux/audit.h
> > @@ -76,6 +76,7 @@ struct audit_field {
> > u32 type;
> > union {
> > u32 val;
> > + u64 val64;
> > kuid_t uid;
> > kgid_t gid;
> > struct {
> > diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> > index e83ccbd..8443a8f 100644
> > --- a/include/uapi/linux/audit.h
> > +++ b/include/uapi/linux/audit.h
> > @@ -262,6 +262,7 @@
> > #define AUDIT_LOGINUID_SET 24
> > #define AUDIT_SESSIONID 25 /* Session ID */
> > #define AUDIT_FSTYPE 26 /* FileSystem Type */
> > +#define AUDIT_CONTAINERID 27 /* Container ID */
> >
> > /* These are ONLY useful when checking
> > * at syscall exit time (AUDIT_AT_EXIT). */
> > @@ -342,6 +343,7 @@ enum {
> > #define AUDIT_FEATURE_BITMAP_SESSIONID_FILTER 0x00000010
> > #define AUDIT_FEATURE_BITMAP_LOST_RESET 0x00000020
> > #define AUDIT_FEATURE_BITMAP_FILTER_FS 0x00000040
> > +#define AUDIT_FEATURE_BITMAP_CONTAINERID_FILTER 0x00000080
> >
> > #define AUDIT_FEATURE_BITMAP_ALL (AUDIT_FEATURE_BITMAP_BACKLOG_LIMIT | \
> > AUDIT_FEATURE_BITMAP_BACKLOG_WAIT_TIME | \
> > @@ -349,7 +351,8 @@ enum {
> > AUDIT_FEATURE_BITMAP_EXCLUDE_EXTEND | \
> > AUDIT_FEATURE_BITMAP_SESSIONID_FILTER | \
> > AUDIT_FEATURE_BITMAP_LOST_RESET | \
> > - AUDIT_FEATURE_BITMAP_FILTER_FS)
> > + AUDIT_FEATURE_BITMAP_FILTER_FS | \
> > + AUDIT_FEATURE_BITMAP_CONTAINERID_FILTER)
> >
> > /* deprecated: AUDIT_VERSION_* */
> > #define AUDIT_VERSION_LATEST AUDIT_FEATURE_BITMAP_ALL
> > diff --git a/kernel/audit.h b/kernel/audit.h
> > index 214e149..aaa651a 100644
> > --- a/kernel/audit.h
> > +++ b/kernel/audit.h
> > @@ -234,6 +234,7 @@ static inline int audit_hash_ino(u32 ino)
> >
> > extern int audit_match_class(int class, unsigned syscall);
> > extern int audit_comparator(const u32 left, const u32 op, const u32 right);
> > +extern int audit_comparator64(const u64 left, const u32 op, const u64 right);
> > extern int audit_uid_comparator(kuid_t left, u32 op, kuid_t right);
> > extern int audit_gid_comparator(kgid_t left, u32 op, kgid_t right);
> > extern int parent_len(const char *path);
> > diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
> > index d7a807e..c4c8746 100644
> > --- a/kernel/auditfilter.c
> > +++ b/kernel/auditfilter.c
> > @@ -410,6 +410,7 @@ static int audit_field_valid(struct audit_entry *entry, struct audit_field *f)
> > /* FALL THROUGH */
> > case AUDIT_ARCH:
> > case AUDIT_FSTYPE:
> > + case AUDIT_CONTAINERID:
> > if (f->op != Audit_not_equal && f->op != Audit_equal)
> > return -EINVAL;
> > break;
> > @@ -584,6 +585,14 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
> > }
> > entry->rule.exe = audit_mark;
> > break;
> > + case AUDIT_CONTAINERID:
> > + if (f->val != sizeof(u64))
> > + goto exit_free;
> > + str = audit_unpack_string(&bufp, &remain, f->val);
> > + if (IS_ERR(str))
> > + goto exit_free;
> > + f->val64 = ((u64 *)str)[0];
> > + break;
> > }
> > }
> >
> > @@ -666,6 +675,11 @@ static struct audit_rule_data *audit_krule_to_data(struct audit_krule *krule)
> > data->buflen += data->values[i] =
> > audit_pack_string(&bufp, audit_mark_path(krule->exe));
> > break;
> > + case AUDIT_CONTAINERID:
> > + data->buflen += data->values[i] = sizeof(u64);
> > + for (i = 0; i < sizeof(u64); i++)
> > + ((char *)bufp)[i] = ((char *)&f->val64)[i];
> > + break;
> > case AUDIT_LOGINUID_SET:
> > if (krule->pflags & AUDIT_LOGINUID_LEGACY && !f->val) {
> > data->fields[i] = AUDIT_LOGINUID;
> > @@ -752,6 +766,10 @@ static int audit_compare_rule(struct audit_krule *a, struct audit_krule *b)
> > if (!gid_eq(a->fields[i].gid, b->fields[i].gid))
> > return 1;
> > break;
> > + case AUDIT_CONTAINERID:
> > + if (a->fields[i].val64 != b->fields[i].val64)
> > + return 1;
> > + break;
> > default:
> > if (a->fields[i].val != b->fields[i].val)
> > return 1;
> > @@ -1210,6 +1228,31 @@ int audit_comparator(u32 left, u32 op, u32 right)
> > }
> > }
> >
> > +int audit_comparator64(u64 left, u32 op, u64 right)
> > +{
> > + switch (op) {
> > + case Audit_equal:
> > + return (left == right);
> > + case Audit_not_equal:
> > + return (left != right);
> > + case Audit_lt:
> > + return (left < right);
> > + case Audit_le:
> > + return (left <= right);
> > + case Audit_gt:
> > + return (left > right);
> > + case Audit_ge:
> > + return (left >= right);
> > + case Audit_bitmask:
> > + return (left & right);
> > + case Audit_bittest:
> > + return ((left & right) == right);
> > + default:
> > + BUG();
> > + return 0;
> > + }
> > +}
> > +
> > int audit_uid_comparator(kuid_t left, u32 op, kuid_t right)
> > {
> > switch (op) {
> > @@ -1348,6 +1391,10 @@ int audit_filter(int msgtype, unsigned int listtype)
> > result = audit_comparator(audit_loginuid_set(current),
> > f->op, f->val);
> > break;
> > + case AUDIT_CONTAINERID:
> > + result = audit_comparator64(audit_get_containerid(current),
> > + f->op, f->val64);
> > + break;
> > case AUDIT_MSGTYPE:
> > result = audit_comparator(msgtype, f->op, f->val);
> > break;
> > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> > index 65be110..2bba324 100644
> > --- a/kernel/auditsc.c
> > +++ b/kernel/auditsc.c
> > @@ -614,6 +614,9 @@ static int audit_filter_rules(struct task_struct *tsk,
> > case AUDIT_LOGINUID_SET:
> > result = audit_comparator(audit_loginuid_set(tsk), f->op, f->val);
> > break;
> > + case AUDIT_CONTAINERID:
> > + result = audit_comparator64(audit_get_containerid(tsk), f->op, f->val64);
> > + break;
> > case AUDIT_SUBJ_USER:
> > case AUDIT_SUBJ_ROLE:
> > case AUDIT_SUBJ_TYPE:
> > --
> > 1.8.3.1
> >
>
>
>
> --
> paul moore
> www.paul-moore.com
- RGB
--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635
^ permalink raw reply
* Re: [PATCH net 3/3] net: sched: ife: check on metadata length
From: Jamal Hadi Salim @ 2018-04-19 12:10 UTC (permalink / raw)
To: yotam gigi, Alexander Aring
Cc: davem, Cong Wang, Jiří Pírko, Yuval Mintz, netdev,
kernel
In-Reply-To: <CANnrxJhUdk6s9_oRRyV+iC7Q_NzAFk5b9=FW5oGtuOuiFdHFvg@mail.gmail.com>
On 19/04/18 01:37 AM, yotam gigi wrote:
> On Thu, Apr 19, 2018 at 12:35 AM, Alexander Aring <aring@mojatatu.com> wrote:
>> This patch checks if the sk buffer is available to dereference the ife
>> header. If not, NULL will be returned to signal a malformed ife packet.
>> This avoids crashing the kernel from outside.
>>
>> Signed-off-by: Alexander Aring <aring@mojatatu.com>
>
> Reviewed-by: Yotam Gigi <yotam.gi@gmail.com>
>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
cheers,
jamal
^ permalink raw reply
* Re: [PATCH net 2/3] net: sched: ife: handle malformed tlv length
From: Jamal Hadi Salim @ 2018-04-19 12:09 UTC (permalink / raw)
To: yotam gigi, Alexander Aring
Cc: davem, Cong Wang, Jiří Pírko, Yuval Mintz, netdev,
kernel
In-Reply-To: <CANnrxJidq70VAmDza63cEkpd80c=VCxn6hg=m4Ko5oXYML82Ag@mail.gmail.com>
On 19/04/18 01:37 AM, yotam gigi wrote:
> On Thu, Apr 19, 2018 at 12:35 AM, Alexander Aring <aring@mojatatu.com> wrote:
>> There is currently no handling to check for an invalid tlv length. This
>> patch adds such handling to avoid killing the kernel with a malformed
>> ife packet.
>
> That's very important. Thanks for that!
>
>>
>> Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
cheers,
jamal
^ permalink raw reply
* Re: [PATCH] net: phy: TLK10X initial driver submission
From: Andrew Lunn @ 2018-04-19 12:09 UTC (permalink / raw)
To: Måns Andersson
Cc: Rob Herring, Mark Rutland, Florian Fainelli, netdev, devicetree,
linux-kernel
In-Reply-To: <20180419082816.109338-1-mans.andersson@nibe.se>
On Thu, Apr 19, 2018 at 10:28:16AM +0200, Måns Andersson wrote:
> From: Mans Andersson <mans.andersson@nibe.se>
>
> Add support for the TI TLK105 and TLK106 10/100Mbit ethernet phys.
>
> In addition the TLK10X needs to be removed from DP83848 driver as the
> power back off support is added here for this device.
>
> Datasheet:
> http://www.ti.com/lit/gpn/tlk106
> ---
> .../devicetree/bindings/net/ti,tlk10x.txt | 27 +++
> drivers/net/phy/Kconfig | 5 +
> drivers/net/phy/Makefile | 1 +
> drivers/net/phy/dp83848.c | 3 -
> drivers/net/phy/tlk10x.c | 209 +++++++++++++++++++++
> 5 files changed, 242 insertions(+), 3 deletions(-)
> create mode 100644 Documentation/devicetree/bindings/net/ti,tlk10x.txt
> create mode 100644 drivers/net/phy/tlk10x.c
>
> diff --git a/Documentation/devicetree/bindings/net/ti,tlk10x.txt b/Documentation/devicetree/bindings/net/ti,tlk10x.txt
> new file mode 100644
> index 0000000..371d0d7
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/ti,tlk10x.txt
> @@ -0,0 +1,27 @@
> +* Texas Instruments - TLK105 / TLK106 ethernet PHYs
> +
> +Required properties:
> + - reg - The ID number for the phy, usually a small integer
> +
> +Optional properties:
> + - ti,power-back-off - Power Back Off Level
> + Please refer to data sheet chapter 8.6 and TI Application
> + Note SLLA3228
> + 0 - Normal Operation
> + 1 - Level 1 (up to 140m cable between TLK link partners)
> + 2 - Level 2 (up to 100m cable between TLK link partners)
> + 3 - Level 3 (up to 80m cable between TLK link partners)
Hi Måns
Device tree is all about board properties. In most cases, power back
off is not a board property, since it depends on the cable length
and the peer board. If however, your board has two PHYs back to back,
say to connect to an Ethernet switch, that would be a valid board
property.
How are you using this?
I know of others who would like such a configuration. Marvell PHYs can
do something similar. I've always suggested adding a PHY tunable. Pass
the cable length in meters and let the PHY driver pick the nearest it
can do, rounding up. The Marvell PHYs also support measuring the cable
length as part of the cable diagnostics. So it would be good to
reserve a configuration value to mean 'auto' - measure the cable and
then pick the best power back off. Quickly scanning the data sheet, I
see that this PHY also has the ability to measure the cable length.
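A rough sketch of the tunable idea described above, in plain C rather than kernel code. The function names, the CABLE_LEN_AUTO value and the table layout are assumptions made for illustration; only the cable limits (80 m, 100 m, 140 m) come from the quoted binding.

```c
#include <stddef.h>

/* Hypothetical reserved value meaning "measure the cable, then choose". */
#define CABLE_LEN_AUTO 0u

struct pwrbo_step {
	unsigned int max_len_m;	/* longest cable this level supports */
	int level;		/* power back-off level from the binding */
};

/* Limits as listed in the proposed binding, shortest cable first. */
static const struct pwrbo_step pwrbo_steps[] = {
	{  80, 3 },
	{ 100, 2 },
	{ 140, 1 },
};

/* Round the length up to the nearest supported limit, return its level. */
static int pwrbo_level_for_cable(unsigned int len_m)
{
	size_t i;

	for (i = 0; i < sizeof(pwrbo_steps) / sizeof(pwrbo_steps[0]); i++) {
		if (len_m <= pwrbo_steps[i].max_len_m)
			return pwrbo_steps[i].level;
	}
	return 0;	/* longer than any limit: normal operation */
}

/* 'measured_m' stands in for a cable diagnostics result (assumption). */
static int pwrbo_level_select(unsigned int requested_m, unsigned int measured_m)
{
	if (requested_m == CABLE_LEN_AUTO)
		return pwrbo_level_for_cable(measured_m);
	return pwrbo_level_for_cable(requested_m);
}
```

With this mapping a request for 90 m rounds up to the 100 m setting (level 2), and anything beyond 140 m falls back to normal operation.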
> +static int tlk10x_read(struct phy_device *phydev, int reg)
> +{
> + if (reg & ~0x1f) {
> + /* Extended register */
> + phy_write(phydev, TLK10X_REGCR, 0x001F);
> + phy_write(phydev, TLK10X_ADDAR, reg);
> + phy_write(phydev, TLK10X_REGCR, 0x401F);
> + reg = TLK10X_ADDAR;
> + }
> +
> + return phy_read(phydev, reg);
> +}
> +
> +static int tlk10x_write(struct phy_device *phydev, int reg, int val)
> +{
> + if (reg & ~0x1f) {
> + /* Extended register */
> + phy_write(phydev, TLK10X_REGCR, 0x001F);
> + phy_write(phydev, TLK10X_ADDAR, reg);
> + phy_write(phydev, TLK10X_REGCR, 0x401F);
> + reg = TLK10X_ADDAR;
> + }
> +
> + return phy_write(phydev, reg, val);
> +}
This looks to be phy_read_mmd() and phy_write_mmd(). If so, please use
them, they get the locking correct.
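For reference, the three-write sequence in the quoted helpers is the usual indirect MMD access through registers 13 and 14, with 0x1f as the device address. The user-space sketch below models that sequence against a toy register file. It is an illustration only: struct fake_phy and the fake_phy_* and mmd_* helpers are hypothetical, and the register address 0x42 used in the usage note is arbitrary. In the driver itself, phy_read_mmd(phydev, devad, regnum) and phy_write_mmd(phydev, devad, regnum, val) would take the place of all of this.

```c
#include <stdint.h>

#define MII_MMD_CTRL	0x0d	/* MMD access control register */
#define MII_MMD_DATA	0x0e	/* MMD access address/data register */
#define MMD_FUNC_DATA	0x4000	/* function: data, no post increment */

/* Toy PHY model, only for illustrating the access sequence. */
struct fake_phy {
	uint16_t ctrl;		/* last value written to MII_MMD_CTRL */
	uint16_t addr;		/* latched extended register address */
	uint16_t ext[256];	/* extended register space */
};

static void fake_phy_write(struct fake_phy *phy, int reg, uint16_t val)
{
	if (reg == MII_MMD_CTRL)
		phy->ctrl = val;
	else if (reg == MII_MMD_DATA && (phy->ctrl & MMD_FUNC_DATA))
		phy->ext[phy->addr & 0xff] = val;	/* data phase */
	else if (reg == MII_MMD_DATA)
		phy->addr = val;			/* address phase */
}

static uint16_t fake_phy_read(struct fake_phy *phy, int reg)
{
	if (reg == MII_MMD_DATA && (phy->ctrl & MMD_FUNC_DATA))
		return phy->ext[phy->addr & 0xff];
	return 0;
}

/* Select extended register 'regnum' of device 'devad' for data access. */
static void mmd_select(struct fake_phy *phy, int devad, uint16_t regnum)
{
	fake_phy_write(phy, MII_MMD_CTRL, devad);	/* address function */
	fake_phy_write(phy, MII_MMD_DATA, regnum);
	fake_phy_write(phy, MII_MMD_CTRL, devad | MMD_FUNC_DATA);
}

static uint16_t mmd_read(struct fake_phy *phy, int devad, uint16_t regnum)
{
	mmd_select(phy, devad, regnum);
	return fake_phy_read(phy, MII_MMD_DATA);
}

static void mmd_write(struct fake_phy *phy, int devad, uint16_t regnum,
		      uint16_t val)
{
	mmd_select(phy, devad, regnum);
	fake_phy_write(phy, MII_MMD_DATA, val);
}
```

Writing a value with mmd_write(&phy, 0x1f, 0x42, val) and reading it back with mmd_read(&phy, 0x1f, 0x42) goes through the same 0x001F / address / 0x401F steps as the quoted code. The sketch does no locking at all, which is the part the kernel helpers are said to get right.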
> +#ifdef CONFIG_OF_MDIO
> +static int tlk10x_of_init(struct phy_device *phydev)
> +{
> + struct tlk10x_private *tlk10x = phydev->priv;
> + struct device *dev = &phydev->mdio.dev;
> + struct device_node *of_node = dev->of_node;
> + int ret;
> +
> + if (!of_node)
> + return 0;
> +
> + ret = of_property_read_u32(of_node, "ti,power-back-off",
> + &tlk10x->pwrbo_level);
> + if (ret) {
> + dev_err(dev, "missing ti,power-back-off property");
> + tlk10x->pwrbo_level = 0;
> + }
If we decide to accept this, you should do range checking, and return
-EINVAL if the value is out of range.
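A minimal sketch of such a check, written as stand-alone C. The helper name and the TLK10X_PWRBO_LEVEL_MAX macro are assumptions, not part of the patch; the upper bound of 3 is taken from the levels listed in the binding.

```c
#include <errno.h>
#include <stdint.h>

#define TLK10X_PWRBO_LEVEL_MAX 3	/* highest level listed in the binding */

/* Return 0 if 'level' is a valid power back-off level, -EINVAL otherwise. */
static int tlk10x_check_pwrbo_level(uint32_t level)
{
	if (level > TLK10X_PWRBO_LEVEL_MAX)
		return -EINVAL;
	return 0;
}
```

The property parser could then propagate the error instead of silently falling back to level 0.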
> +static int tlk10x_config_init(struct phy_device *phydev)
> +{
> + int ret, reg;
> + struct tlk10x_private *tlk10x;
> +
> + ret = genphy_config_init(phydev);
> + if (ret < 0)
> + return ret;
> +
> + if (!phydev->priv) {
> + tlk10x = devm_kzalloc(&phydev->mdio.dev, sizeof(*tlk10x),
> + GFP_KERNEL);
> + if (!tlk10x)
> + return -ENOMEM;
> +
> + phydev->priv = tlk10x;
> + ret = tlk10x_of_init(phydev);
> + if (ret)
> + return ret;
> + } else {
> + tlk10x = (struct tlk10x_private *)phydev->priv;
> + }
This allocation should be done in .probe
> +
> + // Power back off
> + if (tlk10x->pwrbo_level < 0 || tlk10x->pwrbo_level > 3)
> + tlk10x->pwrbo_level = 0;
> + reg = tlk10x_read(phydev, TLK10X_PWRBOCR);
> + reg = ((reg & ~TLK10X_PWRBOCR_MASK)
> + | (tlk10x->pwrbo_level << 6));
> + ret = tlk10x_write(phydev, TLK10X_PWRBOCR, reg);
> + if (ret < 0) {
> + dev_err(&phydev->mdio.dev,
> + "unable to set power back-off (err=%d)\n", ret);
> + return ret;
> + }
> + dev_info(&phydev->mdio.dev, "power back-off set to level %d\n",
> + tlk10x->pwrbo_level);
> +
> + return 0;
> +}
Andrew
^ permalink raw reply
* Re: [PATCH net 1/3] net: sched: ife: signal not finding metaid
From: Jamal Hadi Salim @ 2018-04-19 12:08 UTC (permalink / raw)
To: yotam gigi, Alexander Aring
Cc: davem, Cong Wang, Jiří Pírko, Yuval Mintz, netdev,
kernel
In-Reply-To: <CANnrxJjLvzoDiMWmv0Ad-O44N-Vc=8Jjm2KrEKxiLt9a9fcNmA@mail.gmail.com>
On 19/04/18 01:37 AM, yotam gigi wrote:
> On Thu, Apr 19, 2018 at 12:35 AM, Alexander Aring <aring@mojatatu.com> wrote:
>> We need to record stats for received metadata that we dont know how
>> to process. Have find_decode_metaid() return -ENOENT to capture this.
>
> Agree.
>
>>
>> Signed-off-by: Alexander Aring <aring@mojatatu.com>
>
> Reviewed-by: Yotam Gigi <yotam.gi@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
^ permalink raw reply
* Re: [PATCH] net: phy: marvell: clear wol event before setting it
From: Andrew Lunn @ 2018-04-19 11:33 UTC (permalink / raw)
To: Bhadram Varka
Cc: Jisheng Zhang, Florian Fainelli, David S. Miller,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Jingju Hou
In-Reply-To: <1c3a4b66-b10c-adc9-b7e4-57b46f5c86e5@nvidia.com>
> >IIRC, the phy irq isn't necessary for WOL. The phy interrupt pin isn't
> >necessarily taken as "interrupt"
> Please correct me if I am wrong. In this case how the system will wake up
> from the SC7.There has to be wake capable irq/gpio pin to do this operation.
>
Hi Bhadram
I've seen implementations where the line from the PHY is connected to
the power supply. It simply turns the power on. No interrupt needed.
Andrew
^ permalink raw reply
* [PATCH] net: deal wrong skb and failure ret from __tcp_retransmit_skb
From: Liu, Changcheng @ 2018-04-19 11:26 UTC (permalink / raw)
To: davem, kuznet, yoshfuji; +Cc: netdev, akpm
Hit the panic below because skb is NULL. Warn about the invalid skb first.
If __tcp_retransmit_skb() returns a failure, e.g. -EAGAIN, no further
action is needed in tcp_retransmit_skb().
gdb vmlinux
Reading symbols from vmlinux...done.
(gdb) p &((struct tcp_skb_cb *) \
&(((struct sk_buff *)0)->cb[0]))->tcp_gso_segs
$1 = (u16 *) 0x30 <irq_stack_union+48>
[ 9040.917533] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[ 9040.926279] IP: tcp_retransmit_skb+0x5c/0xc0
[ 9040.931043] PGD 0 P4D 0
[ 9040.933865] Oops: 0000 [#1] PREEMPT SMP PTI
[ 9040.972151] RIP: 0010:tcp_retransmit_skb+0x5c/0xc0
[ 9040.977496] RSP: 0018:ffff8802bec83e40 EFLAGS: 00010202
[ 9041.062527] Call Trace:
[ 9041.065250] <IRQ>
[ 9041.067489] tcp_retransmit_timer+0x481/0x820
[ 9041.077697] tcp_write_timer_handler+0xe9/0x230
[ 9041.082751] tcp_write_timer+0x75/0x80
[ 9041.086932] call_timer_fn+0x29/0x150
[ 9041.091018] run_timer_softirq+0x411/0x460
[ 9041.105017] __do_softirq+0x115/0x311
[ 9041.109103] irq_exit+0xb0/0xc0
[ 9041.112605] smp_apic_timer_interrupt+0x67/0x140
Signed-off-by: Liu Changcheng <changcheng.liu@intel.com>
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 383cac0..545b9b3 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2920,7 +2920,10 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
{
struct tcp_sock *tp = tcp_sk(sk);
- int err = __tcp_retransmit_skb(sk, skb, segs);
+ int err = 0;
+
+ WARN_ONCE(!skb, "sk_buff is NULL\n");
+ err = __tcp_retransmit_skb(sk, skb, segs);
if (err == 0) {
#if FASTRETRANS_DEBUG > 0
@@ -2935,6 +2938,8 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
if (!tp->retrans_stamp)
tp->retrans_stamp = tcp_skb_timestamp(skb);
+ } else {
+ return err;
}
if (tp->undo_retrans < 0)
--
2.7.4
^ permalink raw reply related
* Re: [PATCH 3/3] ath10k: Support ethtool gstats2 API.
From: kbuild test robot @ 2018-04-19 11:19 UTC (permalink / raw)
To: greearb; +Cc: kbuild-all, netdev, linux-wireless, ath10k, Ben Greear
In-Reply-To: <1524016176-3881-3-git-send-email-greearb@candelatech.com>
[-- Attachment #1: Type: text/plain, Size: 3474 bytes --]
Hi Ben,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on mac80211/master]
[also build test ERROR on v4.17-rc1 next-20180419]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/greearb-candelatech-com/ethtool-Support-ETHTOOL_GSTATS2-command/20180419-105301
base: https://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211.git master
config: x86_64-randconfig-ne0-04191514 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All errors (new ones prefixed by >>):
>> drivers/net/wireless/ath/ath10k/mac.c:7706:21: error: 'ath10k_debug_get_et_stats2' undeclared here (not in a function)
.get_et_stats2 = ath10k_debug_get_et_stats2,
^~~~~~~~~~~~~~~~~~~~~~~~~~
vim +/ath10k_debug_get_et_stats2 +7706 drivers/net/wireless/ath/ath10k/mac.c
7672
7673 static const struct ieee80211_ops ath10k_ops = {
7674 .tx = ath10k_mac_op_tx,
7675 .wake_tx_queue = ath10k_mac_op_wake_tx_queue,
7676 .start = ath10k_start,
7677 .stop = ath10k_stop,
7678 .config = ath10k_config,
7679 .add_interface = ath10k_add_interface,
7680 .remove_interface = ath10k_remove_interface,
7681 .configure_filter = ath10k_configure_filter,
7682 .bss_info_changed = ath10k_bss_info_changed,
7683 .set_coverage_class = ath10k_mac_op_set_coverage_class,
7684 .hw_scan = ath10k_hw_scan,
7685 .cancel_hw_scan = ath10k_cancel_hw_scan,
7686 .set_key = ath10k_set_key,
7687 .set_default_unicast_key = ath10k_set_default_unicast_key,
7688 .sta_state = ath10k_sta_state,
7689 .conf_tx = ath10k_conf_tx,
7690 .remain_on_channel = ath10k_remain_on_channel,
7691 .cancel_remain_on_channel = ath10k_cancel_remain_on_channel,
7692 .set_rts_threshold = ath10k_set_rts_threshold,
7693 .set_frag_threshold = ath10k_mac_op_set_frag_threshold,
7694 .flush = ath10k_flush,
7695 .tx_last_beacon = ath10k_tx_last_beacon,
7696 .set_antenna = ath10k_set_antenna,
7697 .get_antenna = ath10k_get_antenna,
7698 .reconfig_complete = ath10k_reconfig_complete,
7699 .get_survey = ath10k_get_survey,
7700 .set_bitrate_mask = ath10k_mac_op_set_bitrate_mask,
7701 .sta_rc_update = ath10k_sta_rc_update,
7702 .offset_tsf = ath10k_offset_tsf,
7703 .ampdu_action = ath10k_ampdu_action,
7704 .get_et_sset_count = ath10k_debug_get_et_sset_count,
7705 .get_et_stats = ath10k_debug_get_et_stats,
> 7706 .get_et_stats2 = ath10k_debug_get_et_stats2,
7707 .get_et_strings = ath10k_debug_get_et_strings,
7708 .add_chanctx = ath10k_mac_op_add_chanctx,
7709 .remove_chanctx = ath10k_mac_op_remove_chanctx,
7710 .change_chanctx = ath10k_mac_op_change_chanctx,
7711 .assign_vif_chanctx = ath10k_mac_op_assign_vif_chanctx,
7712 .unassign_vif_chanctx = ath10k_mac_op_unassign_vif_chanctx,
7713 .switch_vif_chanctx = ath10k_mac_op_switch_vif_chanctx,
7714 .sta_pre_rcu_remove = ath10k_mac_op_sta_pre_rcu_remove,
7715 .sta_statistics = ath10k_sta_statistics,
7716
7717 CFG80211_TESTMODE_CMD(ath10k_tm_cmd)
7718
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 30420 bytes --]
^ permalink raw reply
* Re: [PATCH bpf-next v3 4/8] bpf: add documentation for eBPF helpers (23-32)
From: Daniel Borkmann @ 2018-04-19 11:16 UTC (permalink / raw)
To: Quentin Monnet, ast; +Cc: netdev, oss-drivers, linux-doc, linux-man
In-Reply-To: <20180417143438.7018-5-quentin.monnet@netronome.com>
On 04/17/2018 04:34 PM, Quentin Monnet wrote:
> Add documentation for eBPF helper functions to bpf.h user header file.
> This documentation can be parsed with the Python script provided in
> another commit of the patch series, in order to provide a RST document
> that can later be converted into a man page.
>
> The objective is to make the documentation easily understandable and
> accessible to all eBPF developers, including beginners.
>
> This patch contains descriptions for the following helper functions, all
> written by Daniel:
>
> - bpf_get_prandom_u32()
> - bpf_get_smp_processor_id()
> - bpf_get_cgroup_classid()
> - bpf_get_route_realm()
> - bpf_skb_load_bytes()
> - bpf_csum_diff()
> - bpf_skb_get_tunnel_opt()
> - bpf_skb_set_tunnel_opt()
> - bpf_skb_change_proto()
> - bpf_skb_change_type()
>
> v3:
> - bpf_get_prandom_u32(): Fix helper name :(. Add description, including
> a note on the internal random state.
> - bpf_get_smp_processor_id(): Add description, including a note on the
> processor id remaining stable during program run.
> - bpf_get_cgroup_classid(): State that CONFIG_CGROUP_NET_CLASSID is
> required to use the helper. Add a reference to related documentation.
> State that placing a task in net_cls controller disables cgroup-bpf.
> - bpf_get_route_realm(): State that CONFIG_CGROUP_NET_CLASSID is
> required to use this helper.
> - bpf_skb_load_bytes(): Fix comment on current use cases for the helper.
>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
> ---
> include/uapi/linux/bpf.h | 152 +++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 152 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c59bf5b28164..d748f65a8f58 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -483,6 +483,23 @@ union bpf_attr {
> * The number of bytes written to the buffer, or a negative error
> * in case of failure.
> *
> + * u32 bpf_get_prandom_u32(void)
> + * Description
> + * Get a pseudo-random number. Note that this helper uses its own
> + * pseudo-random internal state, and cannot be used to infer the
> + * seed of other random functions in the kernel.
We should still add that this prng is not cryptographically secure.
> + * Return
> + * A random 32-bit unsigned value.
> + *
> + * u32 bpf_get_smp_processor_id(void)
> + * Description
> + * Get the SMP (Symmetric multiprocessing) processor id. Note that
Nit: s/Symmetric/symmetric/ ?
> + * all programs run with preemption disabled, which means that the
> + * SMP processor id is stable during all the execution of the
> + * program.
> + * Return
> + * The SMP id of the processor running the program.
> + *
> * int bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void *from, u32 len, u64 flags)
> * Description
> * Store *len* bytes from address *from* into the packet
> @@ -615,6 +632,27 @@ union bpf_attr {
> * Return
> * 0 on success, or a negative error in case of failure.
> *
> + * u32 bpf_get_cgroup_classid(struct sk_buff *skb)
> + * Description
> + * Retrieve the classid for the current task, i.e. for the
> + * net_cls (network classifier) cgroup to which *skb* belongs.
> + *
> + * This helper is only available is the kernel was compiled with
> + * the **CONFIG_CGROUP_NET_CLASSID** configuration option set to
> + * "**y**" or to "**m**".
> + *
> + * Note that placing a task into the net_cls controller completely
> + * disables the execution of eBPF programs with the cgroup.
I'm not sure I follow the above sentence, what do you mean by that?
I would definitely also add here that this helper is limited to cgroups v1
only, and that it works on clsact TC egress hook but not the ingress one.
> + * Also note that, in the above description, the "network
> + * classifier" cgroup does not designate a generic classifier, but
> + * a particular mechanism that provides an interface to tag
> + * network packets with a specific class identifier. See also the
The "generic classifier" part is a bit strange to parse. I would probably
leave the first part out and explain that this provides a means to tag
packets based on a user-provided ID for all traffic coming from the tasks
belonging to the related cgroup.
> + * related kernel documentation, available from the Linux sources
> + * in file *Documentation/cgroup-v1/net_cls.txt*.
> + * Return
> + * The classid, or 0 for the default unconfigured classid.
> + *
> * int bpf_skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci)
> * Description
> * Push a *vlan_tci* (VLAN tag control information) of protocol
> @@ -734,6 +772,16 @@ union bpf_attr {
> * are **TC_ACT_REDIRECT** on success or **TC_ACT_SHOT** on
> * error.
> *
> + * u32 bpf_get_route_realm(struct sk_buff *skb)
> + * Description
> + * Retrieve the realm or the route, that is to say the
> + * **tclassid** field of the destination for the *skb*. This
> + * helper is available only if the kernel was compiled with
> + * **CONFIG_IP_ROUTE_CLASSID** configuration option.
Could mention that this is a user-provided tag similar to the net_cls
case with cgroups, except that here it applies to routes (dst entries),
which hold this tag.
Also, should say that this works with clsact TC egress hook or alternatively
conventional classful egress qdiscs, but not on TC ingress. In case of clsact
TC egress hook this has the advantage that the dst entry has not been dropped
yet in the xmit path. Therefore, the dst entry does not need to be artificially
held via netif_keep_dst() for a classful qdisc until the skb is freed.
> + * Return
> + * The realm of the route for the packet associated to *sdb*, or 0
Typo: sdb
> + * if none was found.
> + *
> * int bpf_perf_event_output(struct pt_reg *ctx, struct bpf_map *map, u64 flags, void *data, u64 size)
> * Description
> * Write raw *data* blob into a special BPF perf event held by
> @@ -770,6 +818,23 @@ union bpf_attr {
> * Return
> * 0 on success, or a negative error in case of failure.
> *
> + * int bpf_skb_load_bytes(const struct sk_buff *skb, u32 offset, void *to, u32 len)
> + * Description
> + * This helper was provided as an easy way to load data from a
> + * packet. It can be used to load *len* bytes from *offset* from
> + * the packet associated to *skb*, into the buffer pointed by
> + * *to*.
> + *
> + * Since Linux 4.7, usage of this helper has mostly been replaced
> + * by "direct packet access", enabling packet data to be
> + * manipulated with *skb*\ **->data** and *skb*\ **->data_end**
> + * pointing respectively to the first byte of packet data and to
> + * the byte after the last byte of packet data. However, it
> + * remains useful if one wishes to read large quantities of data
> + * at once from a packet.
I would add: s/at once from a packet/at once from a packet into the BPF stack/
> + * Return
> + * 0 on success, or a negative error in case of failure.
> + *
> * int bpf_get_stackid(struct pt_reg *ctx, struct bpf_map *map, u64 flags)
> * Description
> * Walk a user or a kernel stack and return its id. To achieve
> @@ -813,6 +878,93 @@ union bpf_attr {
> * The positive or null stack id on success, or a negative error
> * in case of failure.
> *
> + * s64 bpf_csum_diff(__be32 *from, u32 from_size, __be32 *to, u32 to_size, __wsum seed)
> + * Description
> + * Compute a checksum difference, from the raw buffer pointed by
> + * *from*, of length *from_size* (that must be a multiple of 4),
> + * towards the raw buffer pointed by *to*, of size *to_size*
> + * (same remark). An optional *seed* can be added to the value.
Wrt seed, we should explicitly mention that this can be cascaded but also that
this helper works in combination with the l3/l4 csum ones where you feed in this
diff coming from bpf_csum_diff().
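To illustrate the cascading, here is a user-space model of the arithmetic only. It is not the kernel implementation: the function names are made up, sizes are given in 32-bit words rather than bytes, and byte order is ignored. It assumes the diff is a ones-complement sum of the inverted *from* words plus the *to* words, starting from *seed*.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones-complement addition of two 32-bit values (end-around carry). */
static uint32_t csum_add32(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)(s & 0xffffffffu) + (uint32_t)(s >> 32);
}

/* Model of a checksum diff: subtract 'from', add 'to', start at 'seed'. */
static uint32_t csum_diff_words(const uint32_t *from, size_t from_words,
				const uint32_t *to, size_t to_words,
				uint32_t seed)
{
	uint32_t sum = seed;
	size_t i;

	for (i = 0; i < from_words; i++)
		sum = csum_add32(sum, ~from[i]);	/* remove old data */
	for (i = 0; i < to_words; i++)
		sum = csum_add32(sum, to[i]);		/* add new data */
	return sum;
}

/* Fold a 32-bit accumulator down to 16 bits (no final inversion). */
static uint16_t csum_fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

Calling csum_diff_words() once with both buffers gives the same result as calling it for *from* alone and then feeding that result as the seed of a second call for *to* alone, which is the cascading property. In a BPF program the resulting diff would then be handed to the l3/l4 csum helpers, as noted above.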
> + * This is flexible enough to be used in several ways:
> + *
> + * * With *from_size* == 0, *to_size* > 0 and *seed* set to
> + * checksum, it can be used when pushing new data.
> + * * With *from_size* > 0, *to_size* == 0 and *seed* set to
> + * checksum, it can be used when removing data from a packet.
> + * * With *from_size* > 0, *to_size* > 0 and *seed* set to 0, it
> + * can be used to compute a diff. Note that *from_size* and
> + * *to_size* do not need to be equal.
> + * Return
> + * The checksum result, or a negative error code in case of
> + * failure.
> + *
> + * int bpf_skb_get_tunnel_opt(struct sk_buff *skb, u8 *opt, u32 size)
> + * Description
> + * Retrieve tunnel options metadata for the packet associated to
> + * *skb*, and store the raw tunnel option data to the buffer *opt*
> + * of *size*.
> + * Return
> + * The size of the option data retrieved.
> + *
> + * int bpf_skb_set_tunnel_opt(struct sk_buff *skb, u8 *opt, u32 size)
> + * Description
> + * Set tunnel options metadata for the packet associated to *skb*
> + * to the option data contained in the raw buffer *opt* of *size*.
The same remark about collect metadata that I made earlier applies here. As a
particular example, this can be used in combination with geneve, where it
allows pushing and retrieving (bpf_skb_get_tunnel_opt() case) arbitrary TLVs
from the BPF program, allowing for full customization.
> + * Return
> + * 0 on success, or a negative error in case of failure.
> + *
> + * int bpf_skb_change_proto(struct sk_buff *skb, __be16 proto, u64 flags)
> + * Description
> + * Change the protocol of the *skb* to *proto*. Currently
> + * supported are transition from IPv4 to IPv6, and from IPv6 to
> + * IPv4. The helper takes care of the groundwork for the
> + * transition, including resizing the socket buffer. The eBPF
> + * program is expected to fill the new headers, if any, via
> + * **skb_store_bytes**\ () and to recompute the checksums with
> + * **bpf_l3_csum_replace**\ () and **bpf_l4_csum_replace**\
> + * ().
Could mention the main use case for NAT64 out of a BPF program.
> + *
> + * Internally, the GSO type is marked as dodgy so that headers are
> + * checked and segments are recalculated by the GSO/GRO engine.
> + * The size for GSO target is adapted as well.
> + *
> + * All values for *flags* are reserved for future usage, and must
> + * be left at zero.
> + *
> + * A call to this helper is susceptible to change data from the
> + * packet. Therefore, at load time, all checks on pointers
> + * previously done by the verifier are invalidated and must be
> + * performed again.
> + * Return
> + * 0 on success, or a negative error in case of failure.
> + *
> + * int bpf_skb_change_type(struct sk_buff *skb, u32 type)
> + * Description
> + * Change the packet type for the packet associated to *skb*. This
> + * comes down to setting *skb*\ **->pkt_type** to *type*, except
> + * the eBPF program does not have a write access to *skb*\
> + * **->pkt_type** beside this helper. Using a helper here allows
> + * for graceful handling of errors.
> + *
> + * The major use case is to change incoming *skb*s to
> + * **PACKET_HOST** in a programmatic way instead of having to
> + * recirculate via **redirect**\ (..., **BPF_F_INGRESS**), for
> + * example.
> + *
> + * Note that *type* only allows certain values. At this time, they
> + * are:
> + *
> + * **PACKET_HOST**
> + * Packet is for us.
> + * **PACKET_BROADCAST**
> + * Send packet to all.
> + * **PACKET_MULTICAST**
> + * Send packet to group.
> + * **PACKET_OTHERHOST**
> + * Send packet to someone else.
> + * Return
> + * 0 on success, or a negative error in case of failure.
> + *
> * u64 bpf_get_current_task(void)
> * Return
> * A pointer to the current task struct.
>
^ permalink raw reply
* Re: [PATCH net-next v3] team: account for oper state
From: Jiri Pirko @ 2018-04-19 11:16 UTC (permalink / raw)
To: George Wilkie; +Cc: netdev
In-Reply-To: <20180419103414.542-1-gwilkie@vyatta.att-mail.com>
Thu, Apr 19, 2018 at 12:34:14PM CEST, gwilkie@vyatta.att-mail.com wrote:
>Account for operational state when determining port linkup state,
>as per Documentation/networking/operstates.txt.
>
>Signed-off-by: George Wilkie <gwilkie@vyatta.att-mail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
^ permalink raw reply
* Re: [PATCH] bpf, x86_32: add eBPF JIT compiler for ia32 (x86_32)
From: Thomas Gleixner @ 2018-04-19 11:11 UTC (permalink / raw)
To: Wang YanQing
Cc: daniel, ast, illusionist.neo, mingo, hpa, x86, netdev,
linux-kernel
In-Reply-To: <20180418093118.GA4184@udknight>
On Wed, 18 Apr 2018, Wang YanQing wrote:
> @@ -0,0 +1,147 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* bpf_jit.S : BPF JIT helper functions
Please do not add these file names to the top level comment. They provide
no value and just become stale when the file gets moved/renamed.
> + *
> + * Copyright (C) 2018 Wang YanQing (udknight@gmail.com)
> + * Copyright (C) 2011 Eric Dumazet (eric.dumazet@gmail.com)
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; version 2
> + * of the License.
You already have the License Identifier, so you don't need the boilerplate
text.
Thanks,
tglx
^ permalink raw reply
* [PATCH net 6/6] s390/qeth: use Read device to query hypervisor for MAC
From: Julian Wiedmann @ 2018-04-19 10:52 UTC (permalink / raw)
To: David Miller
Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
Stefan Raspl, Ursula Braun, Julian Wiedmann
In-Reply-To: <20180419105211.83935-1-jwi@linux.ibm.com>
From: Julian Wiedmann <jwi@linux.vnet.ibm.com>
For z/VM NICs, qeth needs to consider which of the three CCW devices in
an MPC group it uses for requesting a managed MAC address.
On the Base device, the hypervisor returns a default MAC which is
pre-assigned when creating the NIC (this MAC is also returned by the
READ MAC primitive). Querying any other device results in the allocation
of an additional MAC address.
For consistency with READ MAC and to avoid using up more addresses than
necessary, it is preferable to use the NIC's default MAC. So switch the
diag26c over to using a NIC's Read device, which should always be
identical to the Base device.
Fixes: ec61bd2fd2a2 ("s390/qeth: use diag26c to get MAC address on L2")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
---
drivers/s390/net/qeth_core_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 9b22d5d496ae..dffd820731f2 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -4835,7 +4835,7 @@ int qeth_vm_request_mac(struct qeth_card *card)
goto out;
}
- ccw_device_get_id(CARD_DDEV(card), &id);
+ ccw_device_get_id(CARD_RDEV(card), &id);
request->resp_buf_len = sizeof(*response);
request->resp_version = DIAG26C_VERSION2;
request->op_code = DIAG26C_GET_MAC;
--
2.13.5
^ permalink raw reply related
* [PATCH net 5/6] s390/qeth: fix request-side race during cmd IO timeout
From: Julian Wiedmann @ 2018-04-19 10:52 UTC (permalink / raw)
To: David Miller
Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
Stefan Raspl, Ursula Braun, Julian Wiedmann
In-Reply-To: <20180419105211.83935-1-jwi@linux.ibm.com>
From: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Submitting a cmd IO request (usually on the WRITE device, but for IDX
also on the READ device) is currently done with ccw_device_start()
and a manual timeout in the caller.
On timeout, the caller cleans up the related resources (eg. IO buffer).
But 1) the IO might still be active and utilize those resources, and
2) when the IO completes, qeth_irq() will attempt to clean up the
same resources again.
Instead of introducing additional resource locking, switch to
ccw_device_start_timeout() to ensure IO termination after timeout, and
let the IRQ handler alone deal with cleaning up after a request.
This also removes a stray write->irq_pending reset from
clear_ipacmd_list(). The routine doesn't terminate any pending IO on
the WRITE device, so this should be handled properly via IO timeout
in the IRQ handler.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
---
drivers/s390/net/qeth_core_main.c | 51 ++++++++++++++++++++-------------------
drivers/s390/net/qeth_core_mpc.h | 12 +++++++++
drivers/s390/net/qeth_l2_main.c | 4 +--
3 files changed, 40 insertions(+), 27 deletions(-)
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 9a08b545d018..9b22d5d496ae 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -706,7 +706,6 @@ void qeth_clear_ipacmd_list(struct qeth_card *card)
qeth_put_reply(reply);
}
spin_unlock_irqrestore(&card->lock, flags);
- atomic_set(&card->write.irq_pending, 0);
}
EXPORT_SYMBOL_GPL(qeth_clear_ipacmd_list);
@@ -1098,14 +1097,9 @@ static void qeth_irq(struct ccw_device *cdev, unsigned long intparm,
{
int rc;
int cstat, dstat;
+ struct qeth_cmd_buffer *iob = NULL;
struct qeth_channel *channel;
struct qeth_card *card;
- struct qeth_cmd_buffer *iob;
-
- if (__qeth_check_irb_error(cdev, intparm, irb))
- return;
- cstat = irb->scsw.cmd.cstat;
- dstat = irb->scsw.cmd.dstat;
card = CARD_FROM_CDEV(cdev);
if (!card)
@@ -1123,6 +1117,19 @@ static void qeth_irq(struct ccw_device *cdev, unsigned long intparm,
channel = &card->data;
QETH_CARD_TEXT(card, 5, "data");
}
+
+ if (qeth_intparm_is_iob(intparm))
+ iob = (struct qeth_cmd_buffer *) __va((addr_t)intparm);
+
+ if (__qeth_check_irb_error(cdev, intparm, irb)) {
+ /* IO was terminated, free its resources. */
+ if (iob)
+ qeth_release_buffer(iob->channel, iob);
+ atomic_set(&channel->irq_pending, 0);
+ wake_up(&card->wait_q);
+ return;
+ }
+
atomic_set(&channel->irq_pending, 0);
if (irb->scsw.cmd.fctl & (SCSW_FCTL_CLEAR_FUNC))
@@ -1146,6 +1153,10 @@ static void qeth_irq(struct ccw_device *cdev, unsigned long intparm,
/* we don't have to handle this further */
intparm = 0;
}
+
+ cstat = irb->scsw.cmd.cstat;
+ dstat = irb->scsw.cmd.dstat;
+
if ((dstat & DEV_STAT_UNIT_EXCEP) ||
(dstat & DEV_STAT_UNIT_CHECK) ||
(cstat)) {
@@ -1184,11 +1195,8 @@ static void qeth_irq(struct ccw_device *cdev, unsigned long intparm,
channel->state == CH_STATE_UP)
__qeth_issue_next_read(card);
- if (intparm) {
- iob = (struct qeth_cmd_buffer *) __va((addr_t)intparm);
- if (iob->callback)
- iob->callback(iob->channel, iob);
- }
+ if (iob && iob->callback)
+ iob->callback(iob->channel, iob);
out:
wake_up(&card->wait_q);
@@ -1859,8 +1867,8 @@ static int qeth_idx_activate_get_answer(struct qeth_channel *channel,
atomic_cmpxchg(&channel->irq_pending, 0, 1) == 0);
QETH_DBF_TEXT(SETUP, 6, "noirqpnd");
spin_lock_irqsave(get_ccwdev_lock(channel->ccwdev), flags);
- rc = ccw_device_start(channel->ccwdev,
- &channel->ccw, (addr_t) iob, 0, 0);
+ rc = ccw_device_start_timeout(channel->ccwdev, &channel->ccw,
+ (addr_t) iob, 0, 0, QETH_TIMEOUT);
spin_unlock_irqrestore(get_ccwdev_lock(channel->ccwdev), flags);
if (rc) {
@@ -1877,7 +1885,6 @@ static int qeth_idx_activate_get_answer(struct qeth_channel *channel,
if (channel->state != CH_STATE_UP) {
rc = -ETIME;
QETH_DBF_TEXT_(SETUP, 2, "3err%d", rc);
- qeth_clear_cmd_buffers(channel);
} else
rc = 0;
return rc;
@@ -1931,8 +1938,8 @@ static int qeth_idx_activate_channel(struct qeth_channel *channel,
atomic_cmpxchg(&channel->irq_pending, 0, 1) == 0);
QETH_DBF_TEXT(SETUP, 6, "noirqpnd");
spin_lock_irqsave(get_ccwdev_lock(channel->ccwdev), flags);
- rc = ccw_device_start(channel->ccwdev,
- &channel->ccw, (addr_t) iob, 0, 0);
+ rc = ccw_device_start_timeout(channel->ccwdev, &channel->ccw,
+ (addr_t) iob, 0, 0, QETH_TIMEOUT);
spin_unlock_irqrestore(get_ccwdev_lock(channel->ccwdev), flags);
if (rc) {
@@ -1953,7 +1960,6 @@ static int qeth_idx_activate_channel(struct qeth_channel *channel,
QETH_DBF_MESSAGE(2, "%s IDX activate timed out\n",
dev_name(&channel->ccwdev->dev));
QETH_DBF_TEXT_(SETUP, 2, "2err%d", -ETIME);
- qeth_clear_cmd_buffers(channel);
return -ETIME;
}
return qeth_idx_activate_get_answer(channel, idx_reply_cb);
@@ -2155,8 +2161,8 @@ int qeth_send_control_data(struct qeth_card *card, int len,
QETH_CARD_TEXT(card, 6, "noirqpnd");
spin_lock_irqsave(get_ccwdev_lock(card->write.ccwdev), flags);
- rc = ccw_device_start(card->write.ccwdev, &card->write.ccw,
- (addr_t) iob, 0, 0);
+ rc = ccw_device_start_timeout(CARD_WDEV(card), &card->write.ccw,
+ (addr_t) iob, 0, 0, event_timeout);
spin_unlock_irqrestore(get_ccwdev_lock(card->write.ccwdev), flags);
if (rc) {
QETH_DBF_MESSAGE(2, "%s qeth_send_control_data: "
@@ -2188,8 +2194,6 @@ int qeth_send_control_data(struct qeth_card *card, int len,
}
}
- if (reply->rc == -EIO)
- goto error;
rc = reply->rc;
qeth_put_reply(reply);
return rc;
@@ -2200,9 +2204,6 @@ int qeth_send_control_data(struct qeth_card *card, int len,
list_del_init(&reply->list);
spin_unlock_irqrestore(&reply->card->lock, flags);
atomic_inc(&reply->received);
-error:
- atomic_set(&card->write.irq_pending, 0);
- qeth_release_buffer(iob->channel, iob);
rc = reply->rc;
qeth_put_reply(reply);
return rc;
diff --git a/drivers/s390/net/qeth_core_mpc.h b/drivers/s390/net/qeth_core_mpc.h
index 619f897b4bb0..f4d1ec0b8f5a 100644
--- a/drivers/s390/net/qeth_core_mpc.h
+++ b/drivers/s390/net/qeth_core_mpc.h
@@ -35,6 +35,18 @@ extern unsigned char IPA_PDU_HEADER[];
#define QETH_HALT_CHANNEL_PARM -11
#define QETH_RCD_PARM -12
+static inline bool qeth_intparm_is_iob(unsigned long intparm)
+{
+ switch (intparm) {
+ case QETH_CLEAR_CHANNEL_PARM:
+ case QETH_HALT_CHANNEL_PARM:
+ case QETH_RCD_PARM:
+ case 0:
+ return false;
+ }
+ return true;
+}
+
/*****************************************************************************/
/* IP Assist related definitions */
/*****************************************************************************/
diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index 830ca56a62e5..2f1c13226a7a 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -1346,8 +1346,8 @@ static int qeth_osn_send_control_data(struct qeth_card *card, int len,
qeth_prepare_control_data(card, len, iob);
QETH_CARD_TEXT(card, 6, "osnoirqp");
spin_lock_irqsave(get_ccwdev_lock(card->write.ccwdev), flags);
- rc = ccw_device_start(card->write.ccwdev, &card->write.ccw,
- (addr_t) iob, 0, 0);
+ rc = ccw_device_start_timeout(CARD_WDEV(card), &card->write.ccw,
+ (addr_t) iob, 0, 0, QETH_IPA_TIMEOUT);
spin_unlock_irqrestore(get_ccwdev_lock(card->write.ccwdev), flags);
if (rc) {
QETH_DBF_MESSAGE(2, "qeth_osn_send_control_data: "
--
2.13.5
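
For readers who want to try the new qeth_intparm_is_iob() helper outside the kernel tree, here is a minimal userspace sketch. It assumes that intparm values which are neither 0 nor one of the reserved *_PARM codes carry a buffer address. The QETH_HALT_CHANNEL_PARM and QETH_RCD_PARM values are taken from the hunk context above. The value for QETH_CLEAR_CHANNEL_PARM is not visible in the hunk and is assumed to be -10. The function intparm_selfcheck() is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Values for HALT and RCD come from the hunk context above.
 * QETH_CLEAR_CHANNEL_PARM is not shown there; -10 is an assumption. */
#define QETH_CLEAR_CHANNEL_PARM (-10)
#define QETH_HALT_CHANNEL_PARM  (-11)
#define QETH_RCD_PARM           (-12)

/* Same logic as the helper added by the patch: reserved codes and 0
 * are not buffers, everything else is treated as an iob address. */
static inline bool qeth_intparm_is_iob(unsigned long intparm)
{
	switch (intparm) {
	case QETH_CLEAR_CHANNEL_PARM:
	case QETH_HALT_CHANNEL_PARM:
	case QETH_RCD_PARM:
	case 0:
		return false;
	}
	return true;
}

/* Hypothetical self-check, not part of the patch. */
static void intparm_selfcheck(void)
{
	/* Reserved codes are converted to unsigned long on the way in. */
	assert(!qeth_intparm_is_iob((unsigned long)QETH_CLEAR_CHANNEL_PARM));
	assert(!qeth_intparm_is_iob((unsigned long)QETH_HALT_CHANNEL_PARM));
	assert(!qeth_intparm_is_iob((unsigned long)QETH_RCD_PARM));
	assert(!qeth_intparm_is_iob(0));
	/* Any other value is assumed to be a buffer address. */
	assert(qeth_intparm_is_iob(0x1000UL));
}
```

The negative case labels are converted to unsigned long before comparison, so the switch matches the same bit patterns that the interrupt handler would see in intparm.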