* [PATCH] rfs: call sock_rps_record_flow() in tcp_splice_read()
From: Changli Gao @ 2010-07-13 7:00 UTC (permalink / raw)
To: David S. Miller
Cc: David S. Miller, Alexey Kuznetsov, Pekka Savola (ipv6),
James Morris, Hideaki YOSHIFUJI, Patrick McHardy, Tom Herbert,
netdev, Changli Gao
rfs: call sock_rps_record_flow() in tcp_splice_read()
Call sock_rps_record_flow() in tcp_splice_read(), so that applications using
splice(2) or sendfile(2) can utilize RFS.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
net/ipv4/tcp.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9fce8a8..86b9f67 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -608,6 +608,7 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
ssize_t spliced;
int ret;
+ sock_rps_record_flow(sk);
/*
* We can't seek on a socket input
*/
^ permalink raw reply related
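For readers unfamiliar with what recording a flow buys here, the following is a minimal userspace sketch of the idea behind RFS, not the kernel code. The struct flow_table type and the helpers flow_table_init(), record_flow() and steer_cpu() are hypothetical names made up for this illustration. The assumption is that the syscall path stores the current CPU in a table indexed by the flow hash, and that the receive path later looks the hash up to pick the CPU where the consumer last ran.

```c
#include <assert.h>
#include <stdint.h>

#define FLOW_TABLE_SIZE 8u      /* hypothetical size, must be a power of two */
#define NO_CPU 0xffffu          /* marker for "no CPU recorded yet" */

/* Hypothetical stand-in for a socket flow table: one CPU id per hash bucket. */
struct flow_table {
    uint32_t mask;
    uint16_t cpu[FLOW_TABLE_SIZE];
};

static void flow_table_init(struct flow_table *t)
{
    uint32_t i;

    /* Masking with (size - 1) only works for power-of-two sizes. */
    assert((FLOW_TABLE_SIZE & (FLOW_TABLE_SIZE - 1u)) == 0);

    t->mask = FLOW_TABLE_SIZE - 1u;
    for (i = 0; i < FLOW_TABLE_SIZE; i++)
        t->cpu[i] = NO_CPU;
}

/*
 * Model of the recording step done from a receive-side syscall such as
 * recvmsg() or, with the patch above, splice(): remember which CPU the
 * consuming task is running on. A hash of 0 means "no hash known" and
 * is ignored.
 */
static void record_flow(struct flow_table *t, uint32_t rxhash, uint16_t this_cpu)
{
    if (rxhash != 0)
        t->cpu[rxhash & t->mask] = this_cpu;
}

/* Model of the steering step: which CPU should process this flow's packets? */
static uint16_t steer_cpu(const struct flow_table *t, uint32_t rxhash)
{
    return t->cpu[rxhash & t->mask];
}
```

Under this model, a read path that never calls record_flow() leaves its bucket at NO_CPU, so the flow gets no steering hint. That is the gap the one-line patch is meant to close for the splice path.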
* Re: [PATCH repost] sched: export sched_set/getaffinity to modules
From: Sridhar Samudrala @ 2010-07-13 6:59 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Oleg Nesterov, Peter Zijlstra, Tejun Heo, Ingo Molnar, netdev,
lkml, kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev,
Jiri Kosina, Thomas Gleixner, Andi Kleen
In-Reply-To: <20100704090005.GA8078@redhat.com>
On 7/4/2010 2:00 AM, Michael S. Tsirkin wrote:
> On Fri, Jul 02, 2010 at 11:06:37PM +0200, Oleg Nesterov wrote:
>
>> On 07/02, Peter Zijlstra wrote:
>>
>>> On Fri, 2010-07-02 at 11:01 -0700, Sridhar Samudrala wrote:
>>>
>>>> Does it (Tejun's kthread_clone() patch) also inherit the
>>>> cgroup of the caller?
>>>>
>>> Of course, it's a simple do_fork() which inherits everything just as you
>>> would expect from a similar sys_clone()/sys_fork() call.
>>>
>> Yes. And I'm afraid it can inherit more than we want. IIUC, this is called
>> from ioctl(), right?
>>
>> Then the new thread becomes the natural child of the caller, and it shares
>> ->mm with the parent. And files, dup_fd() without CLONE_FS.
>>
>> Signals. Say, if you send SIGKILL to this new thread, it can't sleep in
>> TASK_INTERRUPTIBLE or KILLABLE after that. And this SIGKILL can be sent
>> just because the parent gets SIGQUIT or another coredumpable signal.
>> Or the new thread can receive SIGSTOP via ^Z.
>>
>> Perhaps this is OK, I do not know. Just to remind that kernel_thread()
>> is merely clone(CLONE_VM).
>>
>> Oleg.
>>
>
> Right. Doing this might break things like flush. The signal and exit
> behaviour needs to be examined carefully. I am also unsure whether
> using such threads might be more expensive than inheriting kthreadd.
>
>
Should we just leave it to userspace to set the cgroup/cpumask after
qemu starts the guest and the vhost threads?
Thanks
Sridhar
^ permalink raw reply
* Re: [PATCH] netfilter: xtables: userspace notification target
From: Changli Gao @ 2010-07-13 6:18 UTC (permalink / raw)
To: Samuel Ortiz
Cc: Patrick McHardy, David S. Miller, netdev, netfilter-devel,
Luciano Coelho
In-Reply-To: <20100713001115.GA3751@sortiz-mobl>
On Tue, Jul 13, 2010 at 8:11 AM, Samuel Ortiz <sameo@linux.intel.com> wrote:
>
> The userspace notification Xtables target sends a netlink notification
> whenever a packet hits the target. Notifications have a label attribute
> for userspace to match it against a previously set rule. The rules also
> take a --all option to switch between sending a notification for all
> packets or for the first one only.
> Userspace can also send a netlink message to toggle this switch while the
> target is in place. This target uses the netfilter netlink framework.
>
> This target combined with various matches (quota, rateest, etc..) allows
> userspace to make decisions on interfaces handling. One could for example
> decide to switch between power saving modes depending on estimated rate
> thresholds.
>
It is much like the following iptables rules.
iptables -N log_and_drop
iptables -A log_and_drop -j NFLOG --nflog-group 1 --nflog-prefix "log_and_drop"
iptables -A log_and_drop -j DROP
...
iptables ... -m quota --quota-bytes 20000 -j log_and_drop
...
> include/linux/netfilter/Kbuild | 1 +
> include/linux/netfilter/nfnetlink.h | 5 +-
> include/linux/netfilter/nfnetlink_compat.h | 1 +
> include/linux/netfilter/xt_NFNOTIF.h | 55 +++++
> net/netfilter/Kconfig | 17 ++
> net/netfilter/Makefile | 1 +
> net/netfilter/xt_NFNOTIF.c | 300 ++++++++++++++++++++++++++++
> 7 files changed, 379 insertions(+), 1 deletions(-)
> create mode 100644 include/linux/netfilter/xt_NFNOTIF.h
> create mode 100644 net/netfilter/xt_NFNOTIF.c
>
> diff --git a/include/linux/netfilter/Kbuild b/include/linux/netfilter/Kbuild
> index bb103f4..1b80b27 100644
> --- a/include/linux/netfilter/Kbuild
> +++ b/include/linux/netfilter/Kbuild
> @@ -12,6 +12,7 @@ header-y += xt_IDLETIMER.h
> header-y += xt_LED.h
> header-y += xt_MARK.h
> header-y += xt_NFLOG.h
> +header-y += xt_NFNOTIF.h
> header-y += xt_NFQUEUE.h
> header-y += xt_RATEEST.h
> header-y += xt_SECMARK.h
> diff --git a/include/linux/netfilter/nfnetlink.h b/include/linux/netfilter/nfnetlink.h
> index 361d6b5..e336f03 100644
> --- a/include/linux/netfilter/nfnetlink.h
> +++ b/include/linux/netfilter/nfnetlink.h
> @@ -18,6 +18,8 @@ enum nfnetlink_groups {
> #define NFNLGRP_CONNTRACK_EXP_UPDATE NFNLGRP_CONNTRACK_EXP_UPDATE
> NFNLGRP_CONNTRACK_EXP_DESTROY,
> #define NFNLGRP_CONNTRACK_EXP_DESTROY NFNLGRP_CONNTRACK_EXP_DESTROY
> + NFNLGRP_NFNOTIF,
> +#define NFNLGRP_NFNOTIF NFNLGRP_NFNOTIF
> __NFNLGRP_MAX,
> };
> #define NFNLGRP_MAX (__NFNLGRP_MAX - 1)
> @@ -47,7 +49,8 @@ struct nfgenmsg {
> #define NFNL_SUBSYS_QUEUE 3
> #define NFNL_SUBSYS_ULOG 4
> #define NFNL_SUBSYS_OSF 5
> -#define NFNL_SUBSYS_COUNT 6
> +#define NFNL_SUBSYS_NFNOTIF 6
> +#define NFNL_SUBSYS_COUNT 7
>
> #ifdef __KERNEL__
>
> diff --git a/include/linux/netfilter/nfnetlink_compat.h b/include/linux/netfilter/nfnetlink_compat.h
> index ffb9503..dca8ab2 100644
> --- a/include/linux/netfilter/nfnetlink_compat.h
> +++ b/include/linux/netfilter/nfnetlink_compat.h
> @@ -13,6 +13,7 @@
> #define NF_NETLINK_CONNTRACK_EXP_NEW 0x00000008
> #define NF_NETLINK_CONNTRACK_EXP_UPDATE 0x00000010
> #define NF_NETLINK_CONNTRACK_EXP_DESTROY 0x00000020
> +#define NF_NETLINK_NFNOTIF 0x00000040
>
> /* Generic structure for encapsulation optional netfilter information.
> * It is reminiscent of sockaddr, but with sa_family replaced
> diff --git a/include/linux/netfilter/xt_NFNOTIF.h b/include/linux/netfilter/xt_NFNOTIF.h
> new file mode 100644
> index 0000000..8fae827
> --- /dev/null
> +++ b/include/linux/netfilter/xt_NFNOTIF.h
> @@ -0,0 +1,55 @@
> +/*
> + * linux/include/linux/netfilter/xt_NFNOTIF.h
> + *
> + * Header file for Xtables notification target module.
> + *
> + * Copyright (C) 2010 Intel Corporation
> + * Samuel Ortiz <samuel.ortiz@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License version
> + * 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> + * 02110-1301, USA.
> + */
> +
> +#ifndef _XT_NFNOTIF_H
> +#define _XT_NFNOTIF_H
> +
> +#include <linux/types.h>
> +
> +enum nfnotif_msg_type {
> + NFNOTIF_TG_MSG_PACKETS,
> +
> + NFNOTIF_TG_MSG_MAX
> +};
> +
> +enum nfnotif_attr_type {
> + NFNOTIF_TG_ATTR_UNSPEC,
> + NFNOTIF_TG_ATTR_LABEL,
> + NFNOTIF_TG_ATTR_SEND_NOTIF,
> +
> + __NFNOTIF_TG_ATTR_AFTER_LAST
> +};
> +#define NFNOTIF_TG_ATTR_MAX (__NFNOTIF_TG_ATTR_AFTER_LAST - 1)
> +
> +#define MAX_NFNOTIF_LABEL_SIZE 31
> +
> +struct nfnotif_tg_info {
> + __u8 all_packets;
> +
> + char label[MAX_NFNOTIF_LABEL_SIZE];
> +
> + /* for kernel module internal use only */
> + struct nfnotif_tg *notif __attribute((aligned(8)));
> +};
> +
> +#endif
> diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
> index aa2f106..0e2de36 100644
> --- a/net/netfilter/Kconfig
> +++ b/net/netfilter/Kconfig
> @@ -469,6 +469,23 @@ config NETFILTER_XT_TARGET_NFQUEUE
>
> To compile it as a module, choose M here. If unsure, say N.
>
> +config NETFILTER_XT_TARGET_NFNOTIF
> + tristate '"NFNOTIF" target Support'
> + depends on NETFILTER_ADVANCED
> + select NETFILTER_NETLINK
> + help
> +
> + This option adds the `NFNOTIF' target, which allows to send
> + netfilter netlink messages when packets hit the target.
> +
> + This target comes with an option to specify if one wants all
> + packets hitting the target to trigger the netlink message
> + transmission, or only the first one.
> + It also listen on its netfilter netlink subsystem for messages
> + allowing to reset the above option.
> +
> + To compile it as a module, choose M here. If unsure, say N.
> +
> config NETFILTER_XT_TARGET_NOTRACK
> tristate '"NOTRACK" target support'
> depends on IP_NF_RAW || IP6_NF_RAW
> diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
> index e28420a..5d9c9e9 100644
> --- a/net/netfilter/Makefile
> +++ b/net/netfilter/Makefile
> @@ -62,6 +62,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP) += xt_TCPOPTSTRIP.o
> obj-$(CONFIG_NETFILTER_XT_TARGET_TEE) += xt_TEE.o
> obj-$(CONFIG_NETFILTER_XT_TARGET_TRACE) += xt_TRACE.o
> obj-$(CONFIG_NETFILTER_XT_TARGET_IDLETIMER) += xt_IDLETIMER.o
> +obj-$(CONFIG_NETFILTER_XT_TARGET_NFNOTIF) += xt_NFNOTIF.o
>
> # matches
> obj-$(CONFIG_NETFILTER_XT_MATCH_CLUSTER) += xt_cluster.o
> diff --git a/net/netfilter/xt_NFNOTIF.c b/net/netfilter/xt_NFNOTIF.c
> new file mode 100644
> index 0000000..e6e906b
> --- /dev/null
> +++ b/net/netfilter/xt_NFNOTIF.c
> @@ -0,0 +1,300 @@
> +/*
> + * linux/net/netfilter/xt_NFNOTIF.c
> + *
> + * Copyright (C) 2010 Intel Corporation
> + * Samuel Ortiz <samuel.ortiz@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License version
> + * 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> + * 02110-1301, USA.
> + *
> + */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include <linux/module.h>
> +#include <linux/list.h>
> +#include <linux/mutex.h>
> +#include <linux/netfilter.h>
> +#include <linux/netfilter/x_tables.h>
> +#include <linux/netfilter/nfnetlink.h>
> +#include <linux/netfilter/xt_NFNOTIF.h>
> +
> +struct nfnotif_tg {
> + struct list_head entry;
> + struct work_struct work;
> +
> + char *label;
> + __u8 all_packets;
> + struct net *net;
> +
> + __u8 send_notif;
> +
> + unsigned int refcnt;
> +};
> +
> +static LIST_HEAD(nfnotif_tg_list);
> +static DEFINE_MUTEX(list_mutex);
> +
> +static int __nfnotif_tg_netlink_send(struct nfnotif_tg *nfnotif)
> +{
> + struct nlmsghdr *nlh;
> + struct nfgenmsg *nfmsg;
> + struct sk_buff *skb;
> + struct net *net = nfnotif->net;
> + unsigned int type;
> + int flags;
> +
> + type = NFNL_SUBSYS_NFNOTIF << 8;
> + flags = NLM_F_CREATE;
> +
> + skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
> + if (skb == NULL)
> + goto error_out;
> +
> + nlh = nlmsg_put(skb, 0, 0, type, sizeof(*nfmsg), flags);
> + if (nlh == NULL)
> + goto nlmsg_put_failure;
> +
> + nfmsg = nlmsg_data(nlh);
> + nfmsg->version = NFNETLINK_V0;
> + nfmsg->res_id = 0;
> +
> + NLA_PUT_STRING(skb, NFNOTIF_TG_ATTR_LABEL, nfnotif->label);
> +
> + nlmsg_end(skb, nlh);
> +
> + return nfnetlink_send(skb, net, 0, NFNLGRP_NFNOTIF, 0, GFP_KERNEL);
> +
> +nla_put_failure:
> + nlmsg_cancel(skb, nlh);
> +
> +nlmsg_put_failure:
> + kfree_skb(skb);
> +
> +error_out:
> + return nfnetlink_set_err(net, 0, 0, -ENOBUFS);
> +}
> +
> +static void nfnotif_tg_work(struct work_struct *work)
> +{
> + struct nfnotif_tg *notif = container_of(work, struct nfnotif_tg, work);
> +
> +
> + if (__nfnotif_tg_netlink_send(notif) < 0)
> + pr_debug("Could not send notification");
> +
> + if (!notif->all_packets)
> + notif->send_notif = 0;
> +}
> +
> +static struct nfnotif_tg *__nfnotif_tg_find_by_label(const char *label)
> +{
> + struct nfnotif_tg *entry;
> +
> + BUG_ON(!label);
> +
> + list_for_each_entry(entry, &nfnotif_tg_list, entry) {
> + if (!strcmp(label, entry->label))
> + return entry;
> + }
> +
> + return NULL;
> +}
> +
> +static int nfnotif_tg_create(struct nfnotif_tg_info *info)
> +{
> + info->notif = kmalloc(sizeof(*info->notif), GFP_KERNEL);
> + if (!info->notif) {
> + pr_debug("Couldn't allocate notification\n");
> + return -ENOMEM;
> + }
> +
> + info->notif->label = kstrdup(info->label, GFP_KERNEL);
> + if (!info->notif->label) {
> + pr_debug("Couldn't allocate label\n");
> + kfree(info->notif);
> + return -ENOMEM;
> + }
> +
> + info->notif->all_packets = info->all_packets;
> + info->notif->send_notif = 1;
> +
> + list_add(&info->notif->entry, &nfnotif_tg_list);
> +
> + info->notif->refcnt = 1;
> +
> + INIT_WORK(&info->notif->work, nfnotif_tg_work);
> +
> + return 0;
> +}
> +
> +static unsigned int nfnotif_tg_target(struct sk_buff *skb,
> + const struct xt_action_param *par)
> +{
> + const struct nfnotif_tg_info *info = par->targinfo;
> +
> + BUG_ON(!info->notif);
> +
> + if (!info->notif->send_notif)
> + return XT_CONTINUE;
> +
> + pr_debug("Sending notification for %s\n", info->label);
> +
> + schedule_work(&info->notif->work);
> +
Why do you use another kernel activity, a kernel thread, here? Netlink
messages can be sent in atomic context.
> + return XT_CONTINUE;
> +}
> +
> +static int nfnotif_tg_checkentry(const struct xt_tgchk_param *par)
> +{
> + struct nfnotif_tg_info *info = par->targinfo;
> + int ret;
> +
> + pr_debug("Checkentry targinfo %s\n", info->label);
> +
> + if (info->label[0] == '\0' ||
> + strnlen(info->label,
> + MAX_NFNOTIF_LABEL_SIZE) == MAX_NFNOTIF_LABEL_SIZE) {
> + pr_debug("Label is empty or not nul-terminated\n");
> + return -EINVAL;
> + }
> +
> + mutex_lock(&list_mutex);
> +
> + info->notif = __nfnotif_tg_find_by_label(info->label);
> + if (info->notif) {
> + info->notif->refcnt++;
> +
> + pr_debug("Increased refcnt for %s to %u\n",
> + info->label, info->notif->refcnt);
> + } else {
> + ret = nfnotif_tg_create(info);
> + if (ret < 0) {
> + pr_debug("Failed to create notification\n");
> + mutex_unlock(&list_mutex);
> + return ret;
> + }
> + }
> +
> + info->notif->net = par->net;
> +
> + mutex_unlock(&list_mutex);
> + return 0;
> +}
> +
> +static void nfnotif_tg_destroy(const struct xt_tgdtor_param *par)
> +{
> + const struct nfnotif_tg_info *info = par->targinfo;
> +
> + pr_debug("Destroy targinfo %s\n", info->label);
> +
> + mutex_lock(&list_mutex);
> +
> + if (--info->notif->refcnt == 0) {
> + pr_debug("Deleting notification %s\n", info->label);
> +
> + list_del(&info->notif->entry);
> + kfree(info->notif->label);
> + kfree(info->notif);
> + }
> +
> + mutex_unlock(&list_mutex);
> +}
> +
> +static struct xt_target nfnotif_tg __read_mostly = {
> + .name = "NFNOTIF",
> + .family = NFPROTO_UNSPEC,
> + .target = nfnotif_tg_target,
> + .targetsize = sizeof(struct nfnotif_tg_info),
> + .checkentry = nfnotif_tg_checkentry,
> + .destroy = nfnotif_tg_destroy,
> + .me = THIS_MODULE,
> +};
> +
> +static int nfnotif_msg_send_notif(struct sock *nfnl, struct sk_buff *skb,
> + const struct nlmsghdr *nlh,
> + const struct nlattr * const attrs[])
> +{
> + struct nfnotif_tg *notif;
> + char *label;
> + u8 send_notif;
> +
> + if (attrs[NFNOTIF_TG_ATTR_LABEL] == NULL ||
> + attrs[NFNOTIF_TG_ATTR_SEND_NOTIF] == NULL)
> + return -EINVAL;
> +
> + label = nla_data(attrs[NFNOTIF_TG_ATTR_LABEL]);
> + send_notif = nla_get_u8(attrs[NFNOTIF_TG_ATTR_SEND_NOTIF]);
> +
> + pr_debug("Label %s send %d\n", label, send_notif);
> +
> + notif = __nfnotif_tg_find_by_label(label);
> + if (notif == NULL)
> + return -EINVAL;
> +
> + notif->send_notif = send_notif;
> +
> + return 0;
> +}
> +
> +
> +static const struct nla_policy nfnotif_nla_policy[NFNOTIF_TG_ATTR_MAX + 1] = {
> + [NFNOTIF_TG_ATTR_LABEL] = { .type = NLA_NUL_STRING },
> + [NFNOTIF_TG_ATTR_SEND_NOTIF] = { .type = NLA_U8 },
> +};
> +
> +static const struct nfnl_callback nfnotif_cb[NFNOTIF_TG_MSG_MAX] = {
> + [NFNOTIF_TG_MSG_PACKETS] = { .call = nfnotif_msg_send_notif,
> + .attr_count = NFNOTIF_TG_ATTR_MAX,
> + .policy = nfnotif_nla_policy },
> +};
> +
> +static const struct nfnetlink_subsystem nfnotif_subsys = {
> + .name = "nfnotif",
> + .subsys_id = NFNL_SUBSYS_NFNOTIF,
> + .cb_count = NFNOTIF_TG_MSG_MAX,
> + .cb = nfnotif_cb,
> +};
> +
> +static int __init nfnotif_tg_init(void)
> +{
> + int ret;
> +
> + ret = nfnetlink_subsys_register(&nfnotif_subsys);
> + if (ret < 0) {
> + pr_err("%s: Cannot register with nfnetlink\n", __func__);
> + return ret;
> + }
> +
> + ret = xt_register_target(&nfnotif_tg);
> + if (ret < 0) {
> + pr_err("%s: Cannot register target\n", __func__);
> + nfnetlink_subsys_unregister(&nfnotif_subsys);
> + }
> +
> + return ret;
> +}
> +
> +static void __exit nfnotif_tg_exit(void)
> +{
> + nfnetlink_subsys_unregister(&nfnotif_subsys);
> + xt_unregister_target(&nfnotif_tg);
> +}
> +
> +module_init(nfnotif_tg_init);
> +module_exit(nfnotif_tg_exit);
> +
> +MODULE_AUTHOR("Samuel Ortiz <samuel.ortiz@intel.com>");
> +MODULE_DESCRIPTION("Xtables: userspace notification");
> +MODULE_LICENSE("GPL v2");
> --
> 1.7.1
>
> --
> Intel Open Source Technology Centre
> http://oss.intel.com/
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
Regards,
Changli Gao(xiaosuo@gmail.com)
^ permalink raw reply
* Re: [PATCH] netfilter: xtables: userspace notification target
From: Jan Engelhardt @ 2010-07-13 5:56 UTC (permalink / raw)
To: Samuel Ortiz
Cc: Patrick McHardy, David S. Miller, netdev, netfilter-devel,
Luciano Coelho
In-Reply-To: <20100713001115.GA3751@sortiz-mobl>
On Tuesday 2010-07-13 02:11, Samuel Ortiz wrote:
>
>The userspace notification Xtables target sends a netlink notification
>whenever a packet hits the target. Notifications have a label attribute
>for userspace to match it against a previously set rule. The rules also
>take a --all option to switch between sending a notification for all
>packets or for the first one only.
>Userspace can also send a netlink message to toggle this switch while the
>target is in place. This target uses the netfilter netlink framework.
Would it not make sense to modify that module?
Sounds an awful lot like NFQUEUE without passing the payload :)
>+++ b/net/netfilter/xt_NFNOTIF.c
>+struct nfnotif_tg {
>+ struct list_head entry;
>+ struct work_struct work;
>+
>+ char *label;
>+ __u8 all_packets;
>+ struct net *net;
>+
>+ __u8 send_notif;
>+
>+ unsigned int refcnt;
>+};
Has unnecessary padding holes.
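The padding remark can be illustrated with a small userspace sketch. This is only one possible reordering and an assumption on my part, since no layout was suggested above. The struct list_head and struct work_struct definitions below are simplified hypothetical stand-ins for the kernel types, and nfnotif_tg_orig, nfnotif_tg_reordered and bytes_saved() are names invented for the comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel types (assumed shapes). */
struct list_head {
    struct list_head *next, *prev;
};

struct work_struct {
    long data;
    struct list_head entry;
    void (*func)(struct work_struct *);
};

struct net;     /* only used through a pointer */

/* Field order as posted: each __u8 sits in front of a wider member. */
struct nfnotif_tg_orig {
    struct list_head entry;
    struct work_struct work;
    char *label;
    uint8_t all_packets;        /* hole follows, up to pointer alignment */
    struct net *net;
    uint8_t send_notif;         /* hole follows, up to int alignment */
    unsigned int refcnt;
};

/* Pointers first, then the int, then both byte-sized flags together. */
struct nfnotif_tg_reordered {
    struct list_head entry;
    struct work_struct work;
    char *label;
    struct net *net;
    unsigned int refcnt;
    uint8_t all_packets;
    uint8_t send_notif;
};

/* Bytes saved per object by the reordering on the current ABI. */
static size_t bytes_saved(void)
{
    assert(sizeof(struct nfnotif_tg_orig) >=
           sizeof(struct nfnotif_tg_reordered));
    return sizeof(struct nfnotif_tg_orig) -
           sizeof(struct nfnotif_tg_reordered);
}
```

On common 32-bit and 64-bit ABIs the reordered struct should come out smaller, because the two one-byte fields share the tail of the struct instead of each forcing its own alignment gap. In a kernel tree, a tool such as pahole would report the real layout.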
^ permalink raw reply
* Re: iproute, batch-cmds, and mac-vlans.
From: Ben Greear @ 2010-07-13 5:32 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: NetDev
In-Reply-To: <20100712221958.2a87b0e3@nehalam>
On 07/12/2010 10:19 PM, Stephen Hemminger wrote:
> On Mon, 12 Jul 2010 21:49:20 -0700
> Ben Greear<greearb@candelatech.com> wrote:
>
>> After too much time debugging, I finally realized that the ip
>> tool was truncating my command because the mac-vlan device name
>> had a '#' in it.
>>
>> ]# cat /tmp/foo.txt
>> ru add to 10.99.21.1 iif eth0#0 lookup local pref 11
>>
>>
>> # IP tool has some hacked up debugging code
>> ]# ip -batch /tmp/foo.txt
>> argc: 4
>> arg -:to:-
>> arg -:iif:-
>> WARNING: Using TABLE_MAIN in iprule_modify, table_ok: 0 cmd: 32
>>
>>
>> So, it acts on eth0 instead of eth0#0, and silently ignores the 'lookup local pref 11'.
>>
>> I understand that it is trying to parse # as comments, but would you
>> all be interested in a patch that allowed ignoring '#' except
>> when it is the first non-whitespace character on a line, and maybe
>> when preceded by whitespace? This would of course have the possibility
>> of breaking someone's script somewhere, so it could be enabled with
>> a new command line arg, perhaps.
>
> Putting # in device name just sounds like a bad idea.
It's been the standard naming for mac-vlans since we started supporting them.
In case you change your mind, this patch seems to work, though I can't figure out
how to trigger the second bit of code in the while loop, so it may not be right.
I'll move my iproute2 tree to github in case someone else wants to give
it a try.
diff --git a/lib/utils.c b/lib/utils.c
index a60d884..ad8e1ac 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -25,7 +25,7 @@
#include <linux/pkt_sched.h>
#include <time.h>
#include <sys/time.h>
-
+#include <ctype.h>
#include "utils.h"
@@ -708,8 +708,23 @@ ssize_t getcmdline(char **linep, size_t *lenp, FILE *in)
++cmdlineno;
cp = strchr(*linep, '#');
- if (cp)
- *cp = '\0';
+
+ /* We don't want to treat the # in the middle of a word as
+ * a comment..makes batch commands dealing with mac-vlans: eth0#1
+ * silently do the wrong thing. So, tighten up the # syntax a bit.
+ *
+ * # at start of line comments rest of line
+ * # preceded by a whitespace character comments rest of line.
+ */
+ while (cp) {
+ if (cp &&
+ ((cp == *linep) /* starts line */
+ || ((cp > *linep) && isspace(*(cp - 1))))) { /* follows space */
+ *cp = '\0';
+ break;
+ }
+ cp = strchr(cp+1, '#');
+ }
while ((cp = strstr(*linep, "\\\n")) != NULL) {
char *line1 = NULL;
@@ -725,9 +740,16 @@ ssize_t getcmdline(char **linep, size_t *lenp, FILE *in)
*cp = 0;
cp = strchr(line1, '#');
- if (cp)
- *cp = '\0';
-
+ while (cp) {
+ if (cp &&
+ ((cp == line1) /* starts line */
+ || ((cp > line1) && isspace(*(cp - 1))))) { /* follows space */
+ *cp = '\0';
+ break;
+ }
+ cp = strchr(cp+1, '#');
+ }
+
*lenp = strlen(*linep) + strlen(line1) + 1;
*linep = realloc(*linep, *lenp);
if (!*linep) {
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply related
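The comment rule proposed in the patch above can be sketched as a standalone helper, which may make it easier to test outside of iproute2. This is a minimal sketch under the stated rule, not the code from lib/utils.c; the function name strip_comment() is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Truncate `line` at the first '#' that starts a comment.
 *
 * A '#' starts a comment only if it is the first character of the line
 * or is preceded by a whitespace character. A '#' in the middle of a
 * word, as in the device name "eth0#0", is left alone.
 *
 * Returns a pointer to where the comment began, or NULL if the line
 * contained no comment.
 */
static char *strip_comment(char *line)
{
    char *cp;

    assert(line != NULL);

    for (cp = strchr(line, '#'); cp != NULL; cp = strchr(cp + 1, '#')) {
        /* The cast keeps isspace() well defined for bytes above 127. */
        if (cp == line || isspace((unsigned char)cp[-1])) {
            *cp = '\0';
            return cp;
        }
    }
    return NULL;
}
```

With this rule, the batch line "ru add to 10.99.21.1 iif eth0#0 lookup local pref 11" is left untouched, while "ru add to 10.99.21.1 # note" is cut just before the '#'. The same helper could serve both the first line and each continuation line, which would avoid duplicating the loop.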
* Re: iproute, batch-cmds, and mac-vlans.
From: Stephen Hemminger @ 2010-07-13 5:19 UTC (permalink / raw)
To: Ben Greear; +Cc: NetDev
In-Reply-To: <4C3BF050.7040206@candelatech.com>
On Mon, 12 Jul 2010 21:49:20 -0700
Ben Greear <greearb@candelatech.com> wrote:
> After too much time debugging, I finally realized that the ip
> tool was truncating my command because the mac-vlan device name
> had a '#' in it.
>
> ]# cat /tmp/foo.txt
> ru add to 10.99.21.1 iif eth0#0 lookup local pref 11
>
>
> # IP tool has some hacked up debugging code
> ]# ip -batch /tmp/foo.txt
> argc: 4
> arg -:to:-
> arg -:iif:-
> WARNING: Using TABLE_MAIN in iprule_modify, table_ok: 0 cmd: 32
>
>
> So, it acts on eth0 instead of eth0#0, and silently ignores the 'lookup local pref 11'.
>
> I understand that it is trying to parse # as comments, but would you
> all be interested in a patch that allowed ignoring '#' except
> when it is the first non-whitespace character on a line, and maybe
> when preceded by whitespace? This would of course have the possibility
> of breaking someone's script somewhere, so it could be enabled with
> a new command line arg, perhaps.
Putting # in device name just sounds like a bad idea.
^ permalink raw reply
* Re: [Pv-drivers] RFC: Network Plugin Architecture (NPA) for vmxnet3
From: Stephen Hemminger @ 2010-07-13 5:16 UTC (permalink / raw)
To: Shreyas Bhatewara
Cc: Christoph Hellwig, Pankaj Thakkar, pv-drivers@vmware.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org
In-Reply-To: <1278990388.32650.22.camel@eng-rhel5-64>
On Mon, 12 Jul 2010 20:06:28 -0700
Shreyas Bhatewara <sbhatewara@vmware.com> wrote:
>
> On Thu, 2010-05-06 at 13:21 -0700, Christoph Hellwig wrote:
> > On Wed, May 05, 2010 at 10:52:53AM -0700, Stephen Hemminger wrote:
> > > Let me put it bluntly. Any design that allows external code to run
> > > in the kernel is not going to be accepted. Out of tree kernel modules are enough
> > > of a pain already, why do you expect the developers to add another
> > > interface.
> >
> > Exactly. Until our friends at VMware get this basic fact it's useless
> > to continue arguing.
> >
> > Pankaj and Dmitry: you're fine to waste your time on this, but it's not
> > going to go anywhere until you address that fundamental problem. The
> > first thing you need to fix in your architecture is to integrate the VF
> > function code into the kernel tree, and we can work from there.
> >
> > Please post patches doing this if you want to resume the discussion.
> >
> > _______________________________________________
> > Pv-drivers mailing list
> > Pv-drivers@vmware.com
> > http://mailman2.vmware.com/mailman/listinfo/pv-drivers
>
>
> As discussed, following is the patch to give you an idea
> about implementation of NPA for vmxnet3 driver. Although the
> patch is big, I have verified it with checkpatch.pl. It gave
> 0 errors / warnings.
>
> Signed-off-by: Matthieu Bucchaineri <matthieu@vmware.com>
> Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>
I think the concept won't fly.
But you should really at least try running checkpatch to make sure
the style conforms.
--
^ permalink raw reply
* iproute, batch-cmds, and mac-vlans.
From: Ben Greear @ 2010-07-13 4:49 UTC (permalink / raw)
To: NetDev
After too much time debugging, I finally realized that the ip
tool was truncating my command because the mac-vlan device name
had a '#' in it.
]# cat /tmp/foo.txt
ru add to 10.99.21.1 iif eth0#0 lookup local pref 11
# IP tool has some hacked up debugging code
]# ip -batch /tmp/foo.txt
argc: 4
arg -:to:-
arg -:iif:-
WARNING: Using TABLE_MAIN in iprule_modify, table_ok: 0 cmd: 32
So, it acts on eth0 instead of eth0#0, and silently ignores the 'lookup local pref 11'.
I understand that it is trying to parse # as comments, but would you
all be interested in a patch that allowed ignoring '#' except
when it is the first non-whitespace character on a line, and maybe
when preceded by whitespace? This would of course have the possibility
of breaking someone's script somewhere, so it could be enabled with
a new command line arg, perhaps.
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply
* Re: [PATCH] hso: remove driver version
From: David Miller @ 2010-07-13 4:21 UTC (permalink / raw)
To: f.aben; +Cc: gregkh, linux-usb, netdev
In-Reply-To: <1278942125.23105.39.camel@filip-linux>
From: Filip Aben <f.aben@option.com>
Date: Mon, 12 Jul 2010 15:42:05 +0200
> This patch removes the driver version from the driver. This version
> hasn't changed since the driver's inclusion in the kernel and is a
> source of confusion for some customers.
>
> Signed-off-by: Filip Aben <f.aben@option.com>
Applied, thanks.
^ permalink raw reply
* Re: [REGRESSION,BISECTED] Panic on ifup
From: David Miller @ 2010-07-13 4:20 UTC (permalink / raw)
To: timo.teras; +Cc: linux, netdev
In-Reply-To: <4C3ABDC3.3000408@iki.fi>
From: Timo Teräs <timo.teras@iki.fi>
Date: Mon, 12 Jul 2010 10:01:23 +0300
> And here goes the patch (which I've only compile tested so far).
Timo, since this tested positively for the reporter, please make
a formal submission once you are happy with the patch.
Thanks.
^ permalink raw reply
* Re: [PATCH 32/36] net/irda: Remove unnecessary casts of private_data
From: David Miller @ 2010-07-13 4:14 UTC (permalink / raw)
To: joe; +Cc: trivial, linux-kernel, samuel, netdev
In-Reply-To: <0f791bb0d02971908720f0d09b4d8ff9140bb6d0.1278967121.git.joe@perches.com>
From: Joe Perches <joe@perches.com>
Date: Mon, 12 Jul 2010 13:50:24 -0700
> Signed-off-by: Joe Perches <joe@perches.com>
Applied.
^ permalink raw reply
* Re: [PATCH 13/36] drivers/net/caif: Remove unnecessary casts of private_data
From: David Miller @ 2010-07-13 4:13 UTC (permalink / raw)
To: joe; +Cc: trivial, linux-kernel, netdev
In-Reply-To: <2fadd3261e91b96050bf8193efacee08842ef9eb.1278967120.git.joe@perches.com>
From: Joe Perches <joe@perches.com>
Date: Mon, 12 Jul 2010 13:50:05 -0700
> Signed-off-by: Joe Perches <joe@perches.com>
Applied.
^ permalink raw reply
* Re: [PATCH 31/36] net/core: Remove unnecessary casts of private_data
From: David Miller @ 2010-07-13 4:14 UTC (permalink / raw)
To: joe; +Cc: trivial, linux-kernel, netdev
In-Reply-To: <217e404708d336ed1ad3254c5ea6626ee94b6a3d.1278967121.git.joe@perches.com>
From: Joe Perches <joe@perches.com>
Date: Mon, 12 Jul 2010 13:50:23 -0700
> Signed-off-by: Joe Perches <joe@perches.com>
Applied.
^ permalink raw reply
* Re: [PATCH 10/36] drivers/isdn: Remove unnecessary casts of private_data
From: David Miller @ 2010-07-13 4:13 UTC (permalink / raw)
To: joe; +Cc: trivial, linux-kernel, isdn, netdev
In-Reply-To: <3dced1fd71a6f9b10f10cb2f47c88e3767a6e6cf.1278967120.git.joe@perches.com>
From: Joe Perches <joe@perches.com>
Date: Mon, 12 Jul 2010 13:50:02 -0700
> Signed-off-by: Joe Perches <joe@perches.com>
Applied.
^ permalink raw reply
* Re: [PATCH net-next] drivers/scsi: Remove warnings after vsprintf %pV introduction
From: David Miller @ 2010-07-13 3:37 UTC (permalink / raw)
To: James.Bottomley
Cc: joe, sfr, netdev, linux-next, linux-kernel, gregkh, matthew,
linux-scsi
In-Reply-To: <1278923239.32614.1.camel@mulgrave.site>
From: James Bottomley <James.Bottomley@suse.de>
Date: Mon, 12 Jul 2010 04:27:18 -0400
> What's the other 60% of the patch about? the strange addition of
> scsi_show_extd_sense_args() and all the KERN_CONT bits?
Well the new function is for logical separation, and KERN_CONT
is what is supposed to be at the front of every printk format
that continues a partially-printed line.
^ permalink raw reply
* Re: [PATCH] fec: use interrupt for MDIO completion indication
From: David Miller @ 2010-07-13 3:36 UTC (permalink / raw)
To: baruch; +Cc: netdev, linux-arm-kernel, kernel, bryan.wu, gerg
In-Reply-To: <006416d38a8e51ba8dd8631613a991528dc7976a.1278918594.git.baruch@tkos.co.il>
From: Baruch Siach <baruch@tkos.co.il>
Date: Mon, 12 Jul 2010 10:12:51 +0300
> With the move to phylib (commit e6b043d) I was seeing sporadic "MDIO write
> timeout" messages. Measure of the actual time spent showed latency times of
> more than 1600us.
>
> This patch uses the MII event indication of the FEC hardware to detect
> completion of MDIO transactions.
>
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Applied.
^ permalink raw reply
* Re: [PATCH 06/12] net: autoconvert trivial BKL users to private mutex
From: David Miller @ 2010-07-13 3:35 UTC (permalink / raw)
To: arnd; +Cc: linux-kernel, jkacur, fweisbec, netdev
In-Reply-To: <1278883143-29035-7-git-send-email-arnd@arndb.de>
From: Arnd Bergmann <arnd@arndb.de>
Date: Sun, 11 Jul 2010 23:18:57 +0200
> All these files use the big kernel lock in a trivial
> way to serialize their private file operations,
> typically resulting from an earlier semi-automatic
> pushdown from VFS.
>
> None of these drivers appears to want to lock against
> other code, and they all use the BKL as the top-level
> lock in their file operations, meaning that there
> is no lock-order inversion problem.
>
> Consequently, we can remove the BKL completely,
> replacing it with a per-file mutex in every case.
> Using a scripted approach means we can avoid
> typos.
...
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Applied.
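
Note: the conversion described in the quoted message can be sketched in
userspace C. This is only an analogy under stated assumptions: pthread
mutexes stand in for the BKL and for the kernel mutex API, "per-file" is
read as "one static mutex per source file", and every name below
(big_lock, demo_mutex, demo_ioctl_old, demo_ioctl_new) is hypothetical
and does not come from the patch.

```c
#include <assert.h>
#include <pthread.h>

/* Before: one global lock shared by all drivers (stand-in for the BKL). */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* After: a mutex private to this source file (hypothetical name). */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Driver-private state that the file operations serialize access to. */
static int demo_state;

/* Old style: serialize on the global lock. Returns the previous state. */
static int demo_ioctl_old(int val)
{
    int prev;

    pthread_mutex_lock(&big_lock);
    prev = demo_state;
    demo_state = val;
    pthread_mutex_unlock(&big_lock);
    return prev;
}

/* New style: same body, but the lock is private to this file. */
static int demo_ioctl_new(int val)
{
    int prev;

    assert(val >= 0);   /* illustrative precondition only */
    pthread_mutex_lock(&demo_mutex);
    prev = demo_state;
    demo_state = val;
    pthread_mutex_unlock(&demo_mutex);
    return prev;
}
```

Because the lock is the outermost one taken in each file operation and
nothing outside the file depends on it, swapping the global lock for a
private one does not change lock ordering, which is the argument the
quoted message makes.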
^ permalink raw reply
* Re: [PATCH 02/12] isdn: autoconvert trivial BKL users to private mutex
From: David Miller @ 2010-07-13 3:35 UTC (permalink / raw)
To: arnd; +Cc: linux-kernel, jkacur, fweisbec, isdn, netdev
In-Reply-To: <1278883143-29035-3-git-send-email-arnd@arndb.de>
From: Arnd Bergmann <arnd@arndb.de>
Date: Sun, 11 Jul 2010 23:18:53 +0200
> All these files use the big kernel lock in a trivial
> way to serialize their private file operations,
> typically resulting from an earlier semi-automatic
> pushdown from VFS.
>
> None of these drivers appears to want to lock against
> other code, and they all use the BKL as the top-level
> lock in their file operations, meaning that there
> is no lock-order inversion problem.
>
> Consequently, we can remove the BKL completely,
> replacing it with a per-file mutex in every case.
> Using a scripted approach means we can avoid
> typos.
...
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Applied.
^ permalink raw reply
* Re: [PATCH net-next-2.6] net: sock_free() optimizations
From: David Miller @ 2010-07-13 3:35 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1278837917.2538.138.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 11 Jul 2010 10:45:17 +0200
> Avoid two extra instructions in sock_free(), to reload
> skb->truesize and skb->sk
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied, thanks Eric.
^ permalink raw reply
* Re: [PATCH 2/2] inet, inet6: make tcp_sendmsg() and tcp_sendpage() through inet_sendmsg() and inet_sendpage()
From: David Miller @ 2010-07-13 3:35 UTC (permalink / raw)
To: xiaosuo; +Cc: kuznet, pekkas, jmorris, yoshfuji, kaber, therbert, netdev
In-Reply-To: <1278830515-22422-1-git-send-email-xiaosuo@gmail.com>
From: Changli Gao <xiaosuo@gmail.com>
Date: Sun, 11 Jul 2010 14:41:55 +0800
> inet, inet6: make tcp_sendmsg() and tcp_sendpage() through inet_sendmsg() and
> inet_sendpage()
>
> A new boolean flag, no_autobind, is added to struct proto to avoid the autobind
> calls when the protocol is TCP. Then sock_rps_record_flow() is called in
> TCP's sendmsg() and sendpage() paths.
>
> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Applied.
^ permalink raw reply
* Re: [PATCH 1/2] net: cleanups
From: David Miller @ 2010-07-13 3:35 UTC (permalink / raw)
To: xiaosuo; +Cc: netdev
In-Reply-To: <1278830466-22386-1-git-send-email-xiaosuo@gmail.com>
From: Changli Gao <xiaosuo@gmail.com>
Date: Sun, 11 Jul 2010 14:41:06 +0800
> net: cleanups
>
> remove useless blanks.
>
> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Applied.
^ permalink raw reply
* Re: [patch] 9p: strlen() doesn't count the terminator
From: David Miller @ 2010-07-13 3:34 UTC (permalink / raw)
To: akpm
Cc: error27, ericvh, adkulkar, jvrao, linux-kernel, tilman, netdev,
kernel-janitors
In-Reply-To: <20100712130458.bb8ae751.akpm@linux-foundation.org>
From: Andrew Morton <akpm@linux-foundation.org>
Date: Mon, 12 Jul 2010 13:04:58 -0700
> This bug doesn't strike me as serious enough to warrant backporting the fix
> into -stable. What was your thinking there?
Meanwhile I'll queue this up to net-next-2.6, thanks.
^ permalink raw reply
* Re: [patch] isdn: fix strlen() usage
From: David Miller @ 2010-07-13 3:34 UTC (permalink / raw)
To: error27; +Cc: isdn, shemminger, hohndel, jkosina, netdev, kernel-janitors
In-Reply-To: <20100710143111.GA19184@bicker>
From: Dan Carpenter <error27@gmail.com>
Date: Sat, 10 Jul 2010 16:31:11 +0200
> There was a missing "else" statement so the original code overflowed if
> ->master->name was too long. Also the ->slave and ->master buffers can
> hold names with 9 characters and a NUL, so I cleaned it up to allow
> another character.
>
> Signed-off-by: Dan Carpenter <error27@gmail.com>
Applied.
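
Note: the size arithmetic in the quoted message (a buffer that holds 9
characters plus the terminating NUL) can be illustrated with a small
userspace sketch. It shows only the bounds check, not the missing "else"
the patch also fixes. NAME_BUF_SIZE and copy_name() are hypothetical
names chosen for this example and do not come from the isdn code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size: 9 characters plus the terminating NUL. */
#define NAME_BUF_SIZE 10

/*
 * Copy src into dst only if it fits, terminator included.
 * Returns 0 on success, -1 if src is too long (dst is left untouched).
 */
static int copy_name(char dst[NAME_BUF_SIZE], const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);              /* strlen() does not count the '\0' */
    if (len + 1 > NAME_BUF_SIZE)    /* so the space needed is len + 1 */
        return -1;
    memcpy(dst, src, len + 1);      /* copy the terminator as well */
    return 0;
}
```

With this check a 9-character name is accepted and a 10-character name
is rejected. Comparing strlen() directly against the buffer size would
be off by one in one direction or the other.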
^ permalink raw reply
* Re: [PATCH net-next-2.6][atm]: remove IRQF_DISABLED in combination with IRQF_SHARED
From: David Miller @ 2010-07-13 3:34 UTC (permalink / raw)
To: chas; +Cc: netdev
In-Reply-To: <201007101342.o6ADg2wx019157@cmf.nrl.navy.mil>
From: "chas williams - CONTRACTOR" <chas@cmf.nrl.navy.mil>
Date: Sat, 10 Jul 2010 09:42:02 -0400
> From: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
>
> commit 2fae0433cf7febc4b69ff58730de1ca412846743
> Author: chas williams - CONTRACTOR <chas@relax.cmf.nrl.navy.mil>
> Date: Sat Jul 10 09:38:05 2010 -0400
>
> [atm]: remove IRQF_DISABLED in combination with IRQF_SHARED
>
> Signed-off-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Applied.
^ permalink raw reply
* Re: [PATCH 3/3] xtsonic: free irq if sonic_open() fails
From: David Miller @ 2010-07-13 3:33 UTC (permalink / raw)
To: segooon; +Cc: kernel-janitors, u.kleine-koenig, tj, joe, netdev
In-Reply-To: <1278759704-7777-1-git-send-email-segooon@gmail.com>
From: Kulikov Vasiliy <segooon@gmail.com>
Date: Sat, 10 Jul 2010 15:01:44 +0400
> xtsonic_open() doesn't check the sonic_open() return code. If it returns an
> error, we must free the requested IRQ.
>
> Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Applied.
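
Note: the error-path pattern the quoted message describes (release the
IRQ when the inner open fails) can be sketched in userspace C. All of
the fake_* names and struct fake_dev are hypothetical stand-ins for
request_irq(), free_irq(), sonic_open() and the device structure; this
is not the xtsonic code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device: tracks whether the IRQ is currently held. */
struct fake_dev {
    bool irq_held;
    int open_result;    /* value the fake inner open will return */
};

static int fake_request_irq(struct fake_dev *dev)
{
    dev->irq_held = true;
    return 0;
}

static void fake_free_irq(struct fake_dev *dev)
{
    dev->irq_held = false;
}

static int fake_inner_open(struct fake_dev *dev)
{
    return dev->open_result;
}

/* Outer open: acquire the IRQ, then undo that if the inner open fails. */
static int fake_open(struct fake_dev *dev)
{
    int ret;

    assert(dev != NULL);
    ret = fake_request_irq(dev);
    if (ret)
        return ret;

    ret = fake_inner_open(dev);
    if (ret) {
        fake_free_irq(dev);     /* do not leak the IRQ on failure */
        return ret;
    }
    return 0;
}
```

The point is that the error code from the inner call is both checked and
propagated, and the resource acquired earlier is released before
returning.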
^ permalink raw reply