* Re: [PATCH 01/11] Removing dead RT2800PCI_SOC
From: Bartlomiej Zolnierkiewicz @ 2010-07-15 8:41 UTC (permalink / raw)
To: Felix Fietkau
Cc: John W. Linville, Ivo Van Doorn, Christoph Egger,
Gertjan van Wingerde, Helmut Schaa,
linux-wireless-u79uwXL29TY76Z2rM5mHXA,
users-poMEt7QlJxcwIE2E9O76wjtx2kNaKg5H,
netdev-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
vamos-dev-DSeRVwK2yFR7NQr57o4jwX20dTPRyWU8FLXUG6abMr4,
Luis Correia
In-Reply-To: <4C3DCD5C.1080705-p3rKhJxN3npAfugRpC6u6w@public.gmane.org>
On Wednesday 14 July 2010 04:44:44 pm Felix Fietkau wrote:
> On 2010-07-14 3:15 PM, John W. Linville wrote:
> > On Wed, Jul 14, 2010 at 02:52:14PM +0200, Ivo Van Doorn wrote:
> >> On Wed, Jul 14, 2010 at 2:46 PM, Luis Correia <luis.f.correia-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >> > On Wed, Jul 14, 2010 at 13:39, Christoph Egger <siccegge-n+aIp8eCc/CzQB+pC5nmwQ@public.gmane.org> wrote:
> >> >> While RT2800PCI_SOC exists in Kconfig, it depends on either
> >> >> RALINK_RT288X or RALINK_RT305X which are both not available in Kconfig
> >> >> so all Code depending on that can't ever be selected and, if there's
> >> >> no plan to add these options, should be cleaned up
> >> >>
> >> >> Signed-off-by: Christoph Egger <siccegge-n+aIp8eCc/CzQB+pC5nmwQ@public.gmane.org>
> >> >
> >> > NAK,
> >> >
> >> > this is not dead code, it is needed for the Ralink System-on-Chip
> >> > Platform devices.
> >> >
> >> > While I can't fix Kconfig errors and the current KConfig file may be
> >> > wrong, this code cannot and will not be deleted.
> >>
> >> When the config option was introduced, the config options RALINK_RT288X and
> >> RALINK_RT305X were supposed to be merged as well soon after by somebody (Felix?)
> >>
> >> But since testing is done on SoC boards by Helmut and Felix, I assume the code
> >> isn't dead but actually in use.
> >
> > Perhaps Helmut and Felix can send us the missing code?
> The missing code is a MIPS platform port, which is currently being
> maintained in OpenWrt, but is not ready for upstream submission yet.
> I'm not working on this code at the moment, but I think it will be
> submitted once it's ready.
People are using automatic scripts to catch unused config options nowadays
so the issue is quite likely to come back again sooner or later..
Would it be possible to improve situation somehow till the missing parts
get merged? Maybe by adding a tiny comment documenting RT2800PCI_SOC
situation to Kconfig (if the config option itself really cannot be removed)
until all code is ready etc.?
I bet that Christoph would be willing to update his patch if you ask him
nicely..
Thanks,
--
Bartlomiej Zolnierkiewicz
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH] netfilter: xtables: userspace notification target
From: Patrick McHardy @ 2010-07-15 9:05 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: Luciano Coelho, Changli Gao, Samuel Ortiz, David S. Miller,
netdev@vger.kernel.org, netfilter-devel@vger.kernel.org
In-Reply-To: <4C3DE715.8070502@netfilter.org>
Am 14.07.2010 18:34, schrieb Pablo Neira Ayuso:
> Hi Luciano,
>
> On 14/07/10 14:22, Luciano Coelho wrote:
>> On Wed, 2010-07-14 at 13:48 +0200, ext Patrick McHardy wrote:
>>> If you're using connection tracking, you can use conntrack marks
>>> to avoid sending more than a single message:
>>>
>>> iptables ... -m connmark --mark 0x1/0x1 -j RETURN
>>> iptables ... -j NFLOG ...
>>> iptables ... -j CONNMARK --set-mark 0x1/0x1
>>
>> Cool, thanks.
>>
>> It seems that there are lots of possibilities to get this to work, but
>> this is starting to get quite complex. I would still prefer having the
>> NFNOTIF module included, since we would be able to do what we want in a
>> very simple way. It's also probably much more efficient that using
>> several rules, which would increase the CPU usage considerably (in our
>> device we are already reaching the limit of a reasonable CPU resource
>> usage with high throughput WLAN connections).
Its hard to believe that a connmark match filtering out notifications
would require more CPU time than doing the same in a new target module.
>> While I agree that it is possible to achieve the NFNOTIF functionality
>> with existing modules, I still think there is a "niche" for such module,
>> because it is very simple, has a very clear purpose and would make the
>> ruleset simpler and more efficient.
>>
>> Does this make any sense?
>
> I don't think that the NFNOTIF infrastructure fulfill the policy for
> inclusion. It seems to me like something quite specific for your needs.
> It is simple, yes, but we already have this feature into the kernel. I
> don't think that this will reduce CPU usage considerably with regards to
> the NFLOG way.
>
> I would still prefer adding the once-per-matching notification feature
> to NFLOG than these extra lines in the kernel, Patrick?
I agree with Pablo.
^ permalink raw reply
* Re: [PATCH] netfilter: xtables: userspace notification target
From: Luciano Coelho @ 2010-07-15 9:18 UTC (permalink / raw)
To: ext Patrick McHardy
Cc: Pablo Neira Ayuso, Changli Gao, Samuel Ortiz, David S. Miller,
netdev@vger.kernel.org, netfilter-devel@vger.kernel.org
In-Reply-To: <4C3ECF6D.50409@trash.net>
On Thu, 2010-07-15 at 11:05 +0200, ext Patrick McHardy wrote:
> Am 14.07.2010 18:34, schrieb Pablo Neira Ayuso:
> >> It seems that there are lots of possibilities to get this to work, but
> >> this is starting to get quite complex. I would still prefer having the
> >> NFNOTIF module included, since we would be able to do what we want in a
> >> very simple way. It's also probably much more efficient that using
> >> several rules, which would increase the CPU usage considerably (in our
> >> device we are already reaching the limit of a reasonable CPU resource
> >> usage with high throughput WLAN connections).
>
> Its hard to believe that a connmark match filtering out notifications
> would require more CPU time than doing the same in a new target module.
Okay, you have convinced me. :) I studied connmark a bit more and now I
realize that it won't take more CPU. In the solution with connmark that
you proposed the packets coming from a connection that is already marked
will be quickly returned to normal processing, so it will be fairly
efficient and certainly not more CPU hungry than the NFNOTIF.
> >> While I agree that it is possible to achieve the NFNOTIF functionality
> >> with existing modules, I still think there is a "niche" for such module,
> >> because it is very simple, has a very clear purpose and would make the
> >> ruleset simpler and more efficient.
> >>
> >> Does this make any sense?
> >
> > I don't think that the NFNOTIF infrastructure fulfill the policy for
> > inclusion. It seems to me like something quite specific for your needs.
> > It is simple, yes, but we already have this feature into the kernel. I
> > don't think that this will reduce CPU usage considerably with regards to
> > the NFLOG way.
> >
> > I would still prefer adding the once-per-matching notification feature
> > to NFLOG than these extra lines in the kernel, Patrick?
>
> I agree with Pablo.
I have to admit that you're right here again. I think that it will not
be necessary to make this change in the NFLOG, since the connmark
solution is actually pretty clear too. If needed, I'll make this simple
change in the NFLOG module and submit.
Thanks for your help.
--
Cheers,
Luca.
^ permalink raw reply
* Re: [PATCHv2] netfilter: add CHECKSUM target
From: Patrick McHardy @ 2010-07-15 9:36 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: David S. Miller, Jan Engelhardt, Randy Dunlap, netfilter-devel,
netfilter, coreteam, linux-kernel, netdev, kvm, herbert
In-Reply-To: <20100711150623.GA7420@redhat.com>
Am 11.07.2010 17:06, schrieb Michael S. Tsirkin:
> +#ifndef _IPT_CHECKSUM_TARGET_H
> +#define _IPT_CHECKSUM_TARGET_H
> +
> +#define XT_CHECKSUM_OP_FILL 0x01 /* fill in checksum in IP header */
> +
> +struct xt_CHECKSUM_info {
> + u_int8_t operation; /* bitset of operations */
Please use __u8 in public header files.
> +};
> +
> +#endif /* _IPT_CHECKSUM_TARGET_H */
> diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
> index 8593a77..1cf4852 100644
> --- a/net/netfilter/Kconfig
> +++ b/net/netfilter/Kconfig
> @@ -294,7 +294,7 @@ endif # NF_CONNTRACK
> config NETFILTER_TPROXY
> tristate "Transparent proxying support (EXPERIMENTAL)"
> depends on EXPERIMENTAL
> - depends on IP_NF_MANGLE
> + depends on IP_NF_MANGLE || IP6_NF_MANGLE
This does not seem to belong into this patch.
> depends on NETFILTER_ADVANCED
> help
> This option enables transparent proxying support, that is,
> @@ -347,6 +347,21 @@ config NETFILTER_XT_CONNMARK
>
> comment "Xtables targets"
>
> +config NETFILTER_XT_TARGET_CHECKSUM
> + tristate "CHECKSUM target support"
> + depends on NETFILTER_ADVANCED
> + ---help---
> + This option adds a `CHECKSUM' target, which can be used in the iptables mangle
> + table.
You should add a dependency on the mangle table then.
> +
> + You can use this target to compute and fill in the checksum in
> + a packet that lacks a checksum. This is particularly useful,
> + if you need to work around old applications such as dhcp clients,
> + that do not work well with checksum offloads, but don't want to disable
> + checksum offload in your device.
> +
> + To compile it as a module, choose M here. If unsure, say N.
> +
> config NETFILTER_XT_TARGET_CLASSIFY
> tristate '"CLASSIFY" target support'
> depends on NETFILTER_ADVANCED
> diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
> index 14e3a8f..8eb541d 100644
> --- a/net/netfilter/Makefile
> +++ b/net/netfilter/Makefile
> @@ -45,6 +45,7 @@ obj-$(CONFIG_NETFILTER_XT_MARK) += xt_mark.o
> obj-$(CONFIG_NETFILTER_XT_CONNMARK) += xt_connmark.o
>
> # targets
> +obj-$(CONFIG_NETFILTER_XT_TARGET_CHECKSUM) += xt_CHECKSUM.o
> obj-$(CONFIG_NETFILTER_XT_TARGET_CLASSIFY) += xt_CLASSIFY.o
> obj-$(CONFIG_NETFILTER_XT_TARGET_CONNSECMARK) += xt_CONNSECMARK.o
> obj-$(CONFIG_NETFILTER_XT_TARGET_CT) += xt_CT.o
> diff --git a/net/netfilter/xt_CHECKSUM.c b/net/netfilter/xt_CHECKSUM.c
> new file mode 100644
> index 0000000..0fee1a7
> --- /dev/null
> +++ b/net/netfilter/xt_CHECKSUM.c
> @@ -0,0 +1,72 @@
> +/* iptables module for the packet checksum mangling
> + *
> + * (C) 2002 by Harald Welte <laforge@netfilter.org>
> + * (C) 2010 Red Hat, Inc.
> + *
> + * Author: Michael S. Tsirkin <mst@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> +*/
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +#include <linux/in.h>
Do you really need in.h?
> +#include <linux/module.h>
> +#include <linux/skbuff.h>
> +#include <linux/netdevice.h>
> +
> +#include <linux/netfilter/x_tables.h>
> +#include <linux/netfilter/xt_CHECKSUM.h>
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Michael S. Tsirkin <mst@redhat.com>");
> +MODULE_DESCRIPTION("Xtables: checksum modification");
> +MODULE_ALIAS("ipt_CHECKSUM");
> +MODULE_ALIAS("ip6t_CHECKSUM");
> +
> +static unsigned int
> +checksum_tg(struct sk_buff *skb, const struct xt_action_param *par)
> +{
> + if (skb->ip_summed == CHECKSUM_PARTIAL)
> + skb_checksum_help(skb);
> +
> + return XT_CONTINUE;
> +}
> +
> +static int checksum_tg_check(const struct xt_tgchk_param *par)
> +{
> + const struct xt_CHECKSUM_info *einfo = par->targinfo;
> +
> + if (einfo->operation & ~XT_CHECKSUM_OP_FILL) {
> + pr_info("unsupported CHECKSUM operation %x\n", einfo->operation);
> + return -EINVAL;
> + }
> + if (!einfo->operation) {
> + pr_info("no CHECKSUM operation enabled\n");
> + return -EINVAL;
> + }
> + return 0;
> +}
> +
> +static struct xt_target checksum_tg_reg __read_mostly = {
> + .name = "CHECKSUM",
> + .family = NFPROTO_UNSPEC,
> + .target = checksum_tg,
> + .targetsize = sizeof(struct xt_CHECKSUM_info),
> + .table = "mangle",
> + .checkentry = checksum_tg_check,
> + .me = THIS_MODULE,
> +};
> +
> +static int __init checksum_tg_init(void)
> +{
> + return xt_register_target(&checksum_tg_reg);
> +}
> +
> +static void __exit checksum_tg_exit(void)
> +{
> + xt_unregister_target(&checksum_tg_reg);
> +}
> +
> +module_init(checksum_tg_init);
> +module_exit(checksum_tg_exit);
^ permalink raw reply
* Re: [PATCH] CAN: Add Flexcan CAN controller driver
From: Wolfgang Grandegger @ 2010-07-15 9:37 UTC (permalink / raw)
To: Marc Kleine-Budde
Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1279144811-12251-1-git-send-email-mkl-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
Hi Marc,
On 07/15/2010 12:00 AM, Marc Kleine-Budde wrote:
> From: Sascha Hauer <s.hauer-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
>
> This core is found on some Freescale SoCs and also some Coldfire
> SoCs. Support for Coldfire is missing though at the moment as
> they have an older revision of the core which does not have RX FIFO
> support.
>
> Signed-off-by: Sascha Hauer <s.hauer-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
> Signed-off-by: Marc Kleine-Budde <mkl-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
> ---
>
> Changes to prev version:
> * The is now GPLv2 (only) as no one complained.
>
> The patch applies to current net-next-2.6/master.
> If there aren't any objections please consider applying this patch.
> Wolfgang, can I an Acked-by?
I already had a look to the previous version and I realized that
accessing reg_esr is racy. I will dig out my notes and come up with a
full review later today or tomorrow.
Thanks,
Wolfgang.
^ permalink raw reply
* Re: [PATCHv3] extensions: libxt_CHECKSUM extension
From: Patrick McHardy @ 2010-07-15 9:39 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: David S. Miller, Jan Engelhardt, Randy Dunlap, netfilter-devel,
netfilter, coreteam, linux-kernel, netdev, kvm, herbert
In-Reply-To: <20100711151453.GA7563@redhat.com>
Am 11.07.2010 17:14, schrieb Michael S. Tsirkin:
> diff --git a/extensions/libxt_CHECKSUM.c b/extensions/libxt_CHECKSUM.c
> new file mode 100644
> index 0000000..00fbd8f
> --- /dev/null
> +++ b/extensions/libxt_CHECKSUM.c
> @@ -0,0 +1,99 @@
> +/* Shared library add-on to xtables for CHECKSUM
> + *
> + * (C) 2002 by Harald Welte <laforge@gnumonks.org>
> + * (C) 2010 by Red Hat, Inc
> + * Author: Michael S. Tsirkin <mst@redhat.com>
> + *
> + * This program is distributed under the terms of GNU GPL v2, 1991
> + *
> + * libxt_CHECKSUM.c borrowed some bits from libipt_ECN.c
> + *
> + * $Id$
Please no CVS IDs.
> + */
> +#include <stdio.h>
> +#include <string.h>
> +#include <stdlib.h>
> +#include <getopt.h>
> +
> +#include <xtables.h>
> +#include <linux/netfilter/xt_CHECKSUM.h>
> +
> +static void CHECKSUM_help(void)
> +{
> + printf(
> +"CHECKSUM target options\n"
> +" --checksum-fill Fill in packet checksum.\n");
> +}
> +
> +static const struct option CHECKSUM_opts[] = {
> + { "checksum-fill", 0, NULL, 'F' },
> + { .name = NULL }
> +};
> +
> +static int CHECKSUM_parse(int c, char **argv, int invert, unsigned int *flags,
> + const void *entry, struct xt_entry_target **target)
> +{
> + struct xt_CHECKSUM_info *einfo
> + = (struct xt_CHECKSUM_info *)(*target)->data;
> +
> + switch (c) {
> + case 'F':
> + if (*flags)
> + xtables_error(PARAMETER_PROBLEM,
> + "CHECKSUM target: Only use --checksum-fill ONCE!");
There is a helper function called xtables_param_act for checking double
arguments and printing a standarized error message.
> + einfo->operation = XT_CHECKSUM_OP_FILL;
> + *flags |= XT_CHECKSUM_OP_FILL;
> + break;
> + default:
> + return 0;
> + }
> +
> + return 1;
> +}
> +
^ permalink raw reply
* Re: [PATCH] CAN: Add Flexcan CAN controller driver
From: Marc Kleine-Budde @ 2010-07-15 10:06 UTC (permalink / raw)
To: Wolfgang Grandegger
Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4C3ED6F5.7040606-5Yr1BZd7O62+XT7JhA+gdA@public.gmane.org>
[-- Attachment #1.1: Type: text/plain, Size: 1646 bytes --]
Hey Wolfgang,
Wolfgang Grandegger wrote:
>> This core is found on some Freescale SoCs and also some Coldfire
>> SoCs. Support for Coldfire is missing though at the moment as
>> they have an older revision of the core which does not have RX FIFO
>> support.
>>
>> Signed-off-by: Sascha Hauer <s.hauer-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
>> Signed-off-by: Marc Kleine-Budde <mkl-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
>> ---
>>
>> Changes to prev version:
>> * The is now GPLv2 (only) as no one complained.
>>
>> The patch applies to current net-next-2.6/master.
>> If there aren't any objections please consider applying this patch.
>> Wolfgang, can I an Acked-by?
>
> I already had a look to the previous version and I realized that
> accessing reg_esr is racy. I will dig out my notes and come up with a
> full review later today or tomorrow.
Let me have a look....
I think I should remove the read reg_esr in "flexcan_irq_err()" and use
the value read in "flexcan_irq()"
Without the fix, there's a race window that we loose some error bits
from the interrupt handler to the napi function, because the error bits
are cleared on read.
Regarding the upcoming improvement of the can error frames upon state
changes, I'd first like to get this driver merged and then discuss a
solution for the error frames.
cheers, Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 260 bytes --]
[-- Attachment #2: Type: text/plain, Size: 188 bytes --]
_______________________________________________
Socketcan-core mailing list
Socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org
https://lists.berlios.de/mailman/listinfo/socketcan-core
^ permalink raw reply
* Re: several messages
From: Jan Engelhardt @ 2010-07-15 10:17 UTC (permalink / raw)
To: Patrick McHardy
Cc: Michael S. Tsirkin, David S. Miller, Randy Dunlap,
netfilter-devel, netfilter, coreteam, linux-kernel, netdev, kvm,
herbert
In-Reply-To: <4C3ED764.3060301@trash.net>
On Thursday 2010-07-15 11:36, Patrick McHardy wrote:
>> +struct xt_CHECKSUM_info {
>> + u_int8_t operation; /* bitset of operations */
>
>Please use __u8 in public header files.
>
>> +#include <linux/in.h>
>
>Do you really need in.h?
>
>> + * $Id$
>
>Please no CVS IDs.
>
>> + switch (c) {
>> + case 'F':
>> + if (*flags)
>> + xtables_error(PARAMETER_PROBLEM,
>> + "CHECKSUM target: Only use --checksum-fill ONCE!");
>
>There is a helper function called xtables_param_act for checking double
>arguments and printing a standarized error message.
I took care of these for Xt-a.
^ permalink raw reply
* Re: [PATCH] CAN: Add Flexcan CAN controller driver
From: Wolfgang Grandegger @ 2010-07-15 10:21 UTC (permalink / raw)
To: Marc Kleine-Budde
Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4C3EDDB1.5040109-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
On 07/15/2010 12:06 PM, Marc Kleine-Budde wrote:
> Hey Wolfgang,
>
> Wolfgang Grandegger wrote:
>>> This core is found on some Freescale SoCs and also some Coldfire
>>> SoCs. Support for Coldfire is missing though at the moment as
>>> they have an older revision of the core which does not have RX FIFO
>>> support.
>>>
>>> Signed-off-by: Sascha Hauer <s.hauer-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
>>> Signed-off-by: Marc Kleine-Budde <mkl-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
>>> ---
>>>
>>> Changes to prev version:
>>> * The is now GPLv2 (only) as no one complained.
>>>
>>> The patch applies to current net-next-2.6/master.
>>> If there aren't any objections please consider applying this patch.
>>> Wolfgang, can I an Acked-by?
>>
>> I already had a look to the previous version and I realized that
>> accessing reg_esr is racy. I will dig out my notes and come up with a
>> full review later today or tomorrow.
>
> Let me have a look....
>
> I think I should remove the read reg_esr in "flexcan_irq_err()" and use
> the value read in "flexcan_irq()"
>
> Without the fix, there's a race window that we loose some error bits
> from the interrupt handler to the napi function, because the error bits
> are cleared on read.
Right, that is my concern. reg_esr should only be read and handled
*once*. Currently it's read *3* times! I would read it in the ISR and
handle it in the NAPI poll function, including state change
notification. But I need a closer look...
> Regarding the upcoming improvement of the can error frames upon state
> changes, I'd first like to get this driver merged and then discuss a
> solution for the error frames.
Fine for me. I'm currently preparing a RFC patch for SJA1000 and MSCAN.
Wolfgang.
^ permalink raw reply
* Re: Raise initial congestion window size / speedup slow start?
From: Alan Cox @ 2010-07-15 10:33 UTC (permalink / raw)
To: Hagen Paul Pfeifer
Cc: David Miller, rick.jones2, lists, davidsen, linux-kernel, netdev
In-Reply-To: <20100714221301.GI6682@nuttenaction>
On Thu, 15 Jul 2010 00:13:01 +0200
Hagen Paul Pfeifer <hagen@jauu.net> wrote:
> * David Miller | 2010-07-14 14:55:47 [-0700]:
>
> >Although section 3 of RFC 5681 is a great text, it does not say at all
> >that increasing the initial CWND would lead to fairness issues.
>
> Because it is only one side of the medal, probing conservative the available
> link capacity in conjunction with n simultaneous probing TCP/SCTP/DCCP
> instances is another.
>
> >To be honest, I think google's proposal holds a lot of weight. If
> >over time link sizes and speeds are increasing (they are) then nudging
> >the initial CWND every so often is a legitimate proposal. Were
> >someone to claim that utilization is lower than it could be because of
> >the currenttly specified initial CWND, I would have no problem
> >believing them.
> >
> >And I'm happy to make Linux use an increased value once it has
> >traction in the standardization community.
>
> Currently I know no working link capacity probing approach, without active
> network feedback, to conservatively probing the available link capacity with a
> high CWND. I am curious about any future trends.
Given perfect information from the network nodes you still need to
traverse the network each direction and then return an answer which means
with a 0.5sec end to end time as in the original posting causality itself
demands 1.5 seconds to get an answer which is itself incomplete and
obsolete.
Causality isn't showing any signs of going away soon.
^ permalink raw reply
* Re: [PATCH -mmotm 12/30] selinux: tag avc cache alloc as non-critical
From: Xiaotian Feng @ 2010-07-15 11:51 UTC (permalink / raw)
To: Mitchell Erblich
Cc: linux-mm, linux-nfs, netdev, riel, cl, a.p.zijlstra, linux-kernel,
lwang, penberg, akpm, davem
In-Reply-To: <6E38A74E-B033-4D2A-9620-2A8BDF9E0AD1@earthlink.net>
On 07/13/2010 06:55 PM, Mitchell Erblich wrote:
>
> On Jul 13, 2010, at 3:19 AM, Xiaotian Feng wrote:
>
>> From 6c3a91091b2910c23908a9f9953efcf3df14e522 Mon Sep 17 00:00:00 2001
>> From: Xiaotian Feng<dfeng@redhat.com>
>> Date: Tue, 13 Jul 2010 11:02:41 +0800
>> Subject: [PATCH 12/30] selinux: tag avc cache alloc as non-critical
>>
>> Failing to allocate a cache entry will only harm performance not correctness.
>> Do not consume valuable reserve pages for something like that.
>>
>> Signed-off-by: Peter Zijlstra<a.p.zijlstra@chello.nl>
>> Signed-off-by: Suresh Jayaraman<sjayaraman@suse.de>
>> Signed-off-by: Xiaotian Feng<dfeng@redhat.com>
>> ---
>> security/selinux/avc.c | 2 +-
>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/security/selinux/avc.c b/security/selinux/avc.c
>> index 3662b0f..9029395 100644
>> --- a/security/selinux/avc.c
>> +++ b/security/selinux/avc.c
>> @@ -284,7 +284,7 @@ static struct avc_node *avc_alloc_node(void)
>> {
>> struct avc_node *node;
>>
>> - node = kmem_cache_zalloc(avc_node_cachep, GFP_ATOMIC);
>> + node = kmem_cache_zalloc(avc_node_cachep, GFP_ATOMIC|__GFP_NOMEMALLOC);
>> if (!node)
>> goto out;
>>
>> --
>> 1.7.1.1
>>
>
> Why not just replace GFP_ATOMIC with GFP_NOWAIT?
>
> This would NOT consume the valuable last pages.
But replace GFP_ATOMIC with GFP_NOWAIT can not prevent avc_alloc_node
consume reserved pages.
>
> Mitchell Erblich
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply
* [PATCHv4] netfilter: add CHECKSUM target
From: Michael S. Tsirkin @ 2010-07-15 11:52 UTC (permalink / raw)
To: Daniel P. Berrange, Jes Sorensen
Cc: Patrick McHardy, Michael S. Tsirkin, David S. Miller,
Jan Engelhardt, Randy Dunlap, netfilter-devel, netfilter,
coreteam, linux-kernel, netdev
This adds a `CHECKSUM' target, which can be used in the iptables mangle
table.
You can use this target to compute and fill in the checksum in
a packet that lacks a checksum. This is particularly useful,
if you need to work around old applications such as dhcp clients,
that do not work well with checksum offloads, but don't want to
disable checksum offload in your device.
The problem happens in the field with virtualized applications.
For reference, see Red Hat bz 605555, as well as
http://www.spinics.net/lists/kvm/msg37660.html
Typical expected use (helps old dhclient binary running in a VM):
iptables -A POSTROUTING -t mangle -p udp --dport bootpc \
-j CHECKSUM --checksum-fill
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Includes fixes by Jan Engelhardt <jengelh@medozas.de>
---
Changes from v3: fix issues pointed out by Patrick McHardy:
fixed by Jan Engelhardt
use __u8 in public header files
remove in.h
fixed by me
kill change that does not belong into this patch
add a dependency on the mangle table
include/linux/netfilter/xt_CHECKSUM.h | 18 ++++++++
net/netfilter/Kconfig | 16 +++++++
net/netfilter/Makefile | 1 +
net/netfilter/xt_CHECKSUM.c | 70 +++++++++++++++++++++++++++++++++
4 files changed, 105 insertions(+), 0 deletions(-)
create mode 100644 include/linux/netfilter/xt_CHECKSUM.h
create mode 100644 net/netfilter/xt_CHECKSUM.c
diff --git a/include/linux/netfilter/xt_CHECKSUM.h b/include/linux/netfilter/xt_CHECKSUM.h
new file mode 100644
index 0000000..3b4fb77
--- /dev/null
+++ b/include/linux/netfilter/xt_CHECKSUM.h
@@ -0,0 +1,18 @@
+/* Header file for iptables ipt_CHECKSUM target
+ *
+ * (C) 2002 by Harald Welte <laforge@gnumonks.org>
+ * (C) 2010 Red Hat Inc
+ * Author: Michael S. Tsirkin <mst@redhat.com>
+ *
+ * This software is distributed under GNU GPL v2, 1991
+*/
+#ifndef _IPT_CHECKSUM_TARGET_H
+#define _IPT_CHECKSUM_TARGET_H
+
+#define XT_CHECKSUM_OP_FILL 0x01 /* fill in checksum in IP header */
+
+struct xt_CHECKSUM_info {
+ __u8 operation; /* bitset of operations */
+};
+
+#endif /* _IPT_CHECKSUM_TARGET_H */
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 8593a77..16b2078 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -347,6 +347,22 @@ config NETFILTER_XT_CONNMARK
comment "Xtables targets"
+config NETFILTER_XT_TARGET_CHECKSUM
+ tristate "CHECKSUM target support"
+ depends on IP_NF_MANGLE || IP6_NF_MANGLE
+ depends on NETFILTER_ADVANCED
+ ---help---
+ This option adds a `CHECKSUM' target, which can be used in the iptables mangle
+ table.
+
+ You can use this target to compute and fill in the checksum in
+ a packet that lacks a checksum. This is particularly useful,
+ if you need to work around old applications such as dhcp clients,
+ that do not work well with checksum offloads, but don't want to disable
+ checksum offload in your device.
+
+ To compile it as a module, choose M here. If unsure, say N.
+
config NETFILTER_XT_TARGET_CLASSIFY
tristate '"CLASSIFY" target support'
depends on NETFILTER_ADVANCED
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 14e3a8f..8eb541d 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -45,6 +45,7 @@ obj-$(CONFIG_NETFILTER_XT_MARK) += xt_mark.o
obj-$(CONFIG_NETFILTER_XT_CONNMARK) += xt_connmark.o
# targets
+obj-$(CONFIG_NETFILTER_XT_TARGET_CHECKSUM) += xt_CHECKSUM.o
obj-$(CONFIG_NETFILTER_XT_TARGET_CLASSIFY) += xt_CLASSIFY.o
obj-$(CONFIG_NETFILTER_XT_TARGET_CONNSECMARK) += xt_CONNSECMARK.o
obj-$(CONFIG_NETFILTER_XT_TARGET_CT) += xt_CT.o
diff --git a/net/netfilter/xt_CHECKSUM.c b/net/netfilter/xt_CHECKSUM.c
new file mode 100644
index 0000000..0f642ef
--- /dev/null
+++ b/net/netfilter/xt_CHECKSUM.c
@@ -0,0 +1,70 @@
+/* iptables module for the packet checksum mangling
+ *
+ * (C) 2002 by Harald Welte <laforge@netfilter.org>
+ * (C) 2010 Red Hat, Inc.
+ *
+ * Author: Michael S. Tsirkin <mst@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+*/
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/module.h>
+#include <linux/skbuff.h>
+
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter/xt_CHECKSUM.h>
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Michael S. Tsirkin <mst@redhat.com>");
+MODULE_DESCRIPTION("Xtables: checksum modification");
+MODULE_ALIAS("ipt_CHECKSUM");
+MODULE_ALIAS("ip6t_CHECKSUM");
+
+static unsigned int
+checksum_tg(struct sk_buff *skb, const struct xt_action_param *par)
+{
+ if (skb->ip_summed == CHECKSUM_PARTIAL)
+ skb_checksum_help(skb);
+
+ return XT_CONTINUE;
+}
+
+static int checksum_tg_check(const struct xt_tgchk_param *par)
+{
+ const struct xt_CHECKSUM_info *einfo = par->targinfo;
+
+ if (einfo->operation & ~XT_CHECKSUM_OP_FILL) {
+ pr_info("unsupported CHECKSUM operation %x\n", einfo->operation);
+ return -EINVAL;
+ }
+ if (!einfo->operation) {
+ pr_info("no CHECKSUM operation enabled\n");
+ return -EINVAL;
+ }
+ return 0;
+}
+
+static struct xt_target checksum_tg_reg __read_mostly = {
+ .name = "CHECKSUM",
+ .family = NFPROTO_UNSPEC,
+ .target = checksum_tg,
+ .targetsize = sizeof(struct xt_CHECKSUM_info),
+ .table = "mangle",
+ .checkentry = checksum_tg_check,
+ .me = THIS_MODULE,
+};
+
+static int __init checksum_tg_init(void)
+{
+ return xt_register_target(&checksum_tg_reg);
+}
+
+static void __exit checksum_tg_exit(void)
+{
+ xt_unregister_target(&checksum_tg_reg);
+}
+
+module_init(checksum_tg_init);
+module_exit(checksum_tg_exit);
--
1.7.2.rc0.14.g41c1c
^ permalink raw reply related
* [PATCHv4] extensions: libxt_CHECKSUM extension
From: Michael S. Tsirkin @ 2010-07-15 11:52 UTC (permalink / raw)
To: Daniel P. Berrange, Jes Sorensen
Cc: Patrick McHardy, Michael S. Tsirkin, David S. Miller,
Jan Engelhardt, Randy Dunlap, netfilter-devel, netfilter,
coreteam, linux-kernel, netdev
In-Reply-To: <20100715114434.GA6415@redhat.com>
This adds a `CHECKSUM' target, which can be used in the iptables mangle
table.
You can use this target to compute and fill in the checksum in
a packet that lacks a checksum. This is particularly useful,
if you need to work around old applications such as dhcp clients,
that do not work well with checksum offloads, but don't want to disable
checksum offload in your device.
The problem happens in the field with virtualized applications.
For reference, see Red Hat bz 605555, as well as
http://www.spinics.net/lists/kvm/msg37660.html
Typical expected use (helps old dhclient binary running in a VM):
iptables -A POSTROUTING -t mangle -p udp --dport bootpc \
-j CHECKSUM --checksum-fill
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Includes fixes by Jan Engelhardt <jengelh@medozas.de>
---
Changes from v3: fixed issues from review by Patrick McHardy
Applied these fixes by Jan Engelhardt
Remove $$ cvs tags
Use xtables_param_act
extensions/libxt_CHECKSUM.c | 96 +++++++++++++++++++++++++++++++++++++++++
extensions/libxt_CHECKSUM.man | 8 +++
2 files changed, 104 insertions(+), 0 deletions(-)
create mode 100644 extensions/libxt_CHECKSUM.c
create mode 100644 extensions/libxt_CHECKSUM.man
diff --git a/extensions/libxt_CHECKSUM.c b/extensions/libxt_CHECKSUM.c
new file mode 100644
index 0000000..9a24443
--- /dev/null
+++ b/extensions/libxt_CHECKSUM.c
@@ -0,0 +1,96 @@
+/* Shared library add-on to xtables for CHECKSUM
+ *
+ * (C) 2002 by Harald Welte <laforge@gnumonks.org>
+ * (C) 2010 by Red Hat, Inc
+ * Author: Michael S. Tsirkin <mst@redhat.com>
+ *
+ * This program is distributed under the terms of GNU GPL v2, 1991
+ *
+ * libxt_CHECKSUM.c borrowed some bits from libipt_ECN.c
+ */
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <getopt.h>
+
+#include <xtables.h>
+#include <linux/netfilter/xt_CHECKSUM.h>
+
+static void CHECKSUM_help(void)
+{
+ printf(
+"CHECKSUM target options\n"
+" --checksum-fill Fill in packet checksum.\n");
+}
+
+static const struct option CHECKSUM_opts[] = {
+ { "checksum-fill", 0, NULL, 'F' },
+ { .name = NULL }
+};
+
+static int CHECKSUM_parse(int c, char **argv, int invert, unsigned int *flags,
+ const void *entry, struct xt_entry_target **target)
+{
+ struct xt_CHECKSUM_info *einfo
+ = (struct xt_CHECKSUM_info *)(*target)->data;
+
+ switch (c) {
+ case 'F':
+ xtables_param_act(XTF_ONLY_ONCE, "CHECKSUM", "--checksum-fill",
+ *flags & XT_CHECKSUM_OP_FILL);
+ einfo->operation = XT_CHECKSUM_OP_FILL;
+ *flags |= XT_CHECKSUM_OP_FILL;
+ break;
+ default:
+ return 0;
+ }
+
+ return 1;
+}
+
+static void CHECKSUM_check(unsigned int flags)
+{
+ if (!flags)
+ xtables_error(PARAMETER_PROBLEM,
+ "CHECKSUM target: Parameter --checksum-fill is required");
+}
+
+static void CHECKSUM_print(const void *ip, const struct xt_entry_target *target,
+ int numeric)
+{
+ const struct xt_CHECKSUM_info *einfo =
+ (const struct xt_CHECKSUM_info *)target->data;
+
+ printf("CHECKSUM ");
+
+ if (einfo->operation & XT_CHECKSUM_OP_FILL)
+ printf("fill ");
+}
+
+static void CHECKSUM_save(const void *ip, const struct xt_entry_target *target)
+{
+ const struct xt_CHECKSUM_info *einfo =
+ (const struct xt_CHECKSUM_info *)target->data;
+
+ if (einfo->operation & XT_CHECKSUM_OP_FILL)
+ printf("--checksum-fill ");
+}
+
+static struct xtables_target checksum_tg_reg = {
+ .name = "CHECKSUM",
+ .version = XTABLES_VERSION,
+ .family = NFPROTO_UNSPEC,
+ .size = XT_ALIGN(sizeof(struct xt_CHECKSUM_info)),
+ .userspacesize = XT_ALIGN(sizeof(struct xt_CHECKSUM_info)),
+ .help = CHECKSUM_help,
+ .parse = CHECKSUM_parse,
+ .final_check = CHECKSUM_check,
+ .print = CHECKSUM_print,
+ .save = CHECKSUM_save,
+ .extra_opts = CHECKSUM_opts,
+};
+
+void _init(void)
+{
+ xtables_register_target(&checksum_tg_reg);
+}
diff --git a/extensions/libxt_CHECKSUM.man b/extensions/libxt_CHECKSUM.man
new file mode 100644
index 0000000..92ae700
--- /dev/null
+++ b/extensions/libxt_CHECKSUM.man
@@ -0,0 +1,8 @@
+This target allows to selectively work around broken/old applications.
+It can only be used in the mangle table.
+.TP
+\fB\-\-checksum\-fill\fP
+Compute and fill in the checksum in a packet that lacks a checksum.
+This is particularly useful, if you need to work around old applications
+such as dhcp clients, that do not work well with checksum offloads,
+but don't want to disable checksum offload in your device.
--
1.7.2.rc0.14.g41c1c
^ permalink raw reply related
* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
From: Lennart Schulte @ 2010-07-15 11:58 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Tejun Heo, David S. Miller, lkml, netdev@vger.kernel.org,
Fehrmann, Henning, Carsten Aulbert, Eric Dumazet
In-Reply-To: <alpine.DEB.2.00.1007111825510.15736@melkinpaasi.cs.helsinki.fi>
I'm testing new reordering algorithms in a virtual testbed, that is the
nodes are emulated with xen and all the network parameters can be tuned
with queues.
With one of the algorithms I also got tracebacks which include
tcp_xmit_retransmit_queue. It only happens with ACK loss. The kernel
version however is 2.6.31.
When I read this thread I tried the debug patch and got the following:
[ 2754.413150] NULL head, pkts 0
[ 2754.413156] Errors caught so far 1
Hope that is of any help.
^ permalink raw reply
* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
From: Eric Dumazet @ 2010-07-15 12:05 UTC (permalink / raw)
To: Lennart Schulte
Cc: Ilpo Järvinen, Tejun Heo, David S. Miller, lkml,
netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert
In-Reply-To: <4C3EF7EA.2040900@nets.rwth-aachen.de>
Le jeudi 15 juillet 2010 à 13:58 +0200, Lennart Schulte a écrit :
> I'm testing new reordering algorithms in a virtual testbed, that is the
> nodes are emulated with xen and all the network parameters can be tuned
> with queues.
> With one of the algorithms I also got tracebacks which include
> tcp_xmit_retransmit_queue. It only happens with ACK loss. The kernel
> version however is 2.6.31.
> When I read this thread I tried the debug patch and got the following:
>
> [ 2754.413150] NULL head, pkts 0
> [ 2754.413156] Errors caught so far 1
>
> Hope that is of any help.
Not sure I understand.
Are you saying you reproduce same tcp_xmit_retransmit_queue() bug, with
a special tc qdisc/class droppping some ACK frames ?
Could it be some sched problem and incorrect return codes in case of
congestion ?
^ permalink raw reply
* [PATCH] vhost-net: avoid flush under lock
From: Michael S. Tsirkin @ 2010-07-15 12:19 UTC (permalink / raw)
Cc: Michael S. Tsirkin, Sridhar Samudrala, David S. Miller,
Arnd Bergmann, Paul E. McKenney, kvm, virtualization, netdev,
linux-kernel
We flush under vq mutex when changing backends.
This creates a deadlock as workqueue being flushed
needs this lock as well.
https://bugzilla.redhat.com/show_bug.cgi?id=612421
Drop the vq mutex before flush: we have the device mutex
which is sufficient to prevent another ioctl from touching
the vq.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
drivers/vhost/net.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 28d7786..50df58e6 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -534,11 +534,16 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
rcu_assign_pointer(vq->private_data, sock);
vhost_net_enable_vq(n, vq);
done:
+ mutex_unlock(&vq->mutex);
+
if (oldsock) {
vhost_net_flush_vq(n, index);
fput(oldsock->file);
}
+ mutex_unlock(&n->dev.mutex);
+ return 0;
+
err_vq:
mutex_unlock(&vq->mutex);
err:
--
1.7.2.rc0.14.g41c1c
^ permalink raw reply related
* [PATCH 1/3] drivers: isdn: use kernel macros to convert hex digit
From: Andy Shevchenko @ 2010-07-15 12:37 UTC (permalink / raw)
To: linux-kernel; +Cc: Andy Shevchenko, Karsten Keil, Tilman Schmidt, netdev
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: Tilman Schmidt <tilman@imap.cc>
Cc: netdev@vger.kernel.org
---
drivers/isdn/capi/capidrv.c | 7 ++-----
drivers/isdn/hisax/q931.c | 13 ++-----------
2 files changed, 4 insertions(+), 16 deletions(-)
diff --git a/drivers/isdn/capi/capidrv.c b/drivers/isdn/capi/capidrv.c
index bf55ed5..2978bda 100644
--- a/drivers/isdn/capi/capidrv.c
+++ b/drivers/isdn/capi/capidrv.c
@@ -1450,12 +1450,9 @@ static void handle_dtrace_data(capidrv_contr *card,
}
for (p = data, end = data+len; p < end; p++) {
- u8 w;
PUTBYTE_TO_STATUS(card, ' ');
- w = (*p >> 4) & 0xf;
- PUTBYTE_TO_STATUS(card, (w < 10) ? '0'+w : 'A'-10+w);
- w = *p & 0xf;
- PUTBYTE_TO_STATUS(card, (w < 10) ? '0'+w : 'A'-10+w);
+ PUTBYTE_TO_STATUS(card, hex_asc_hi(*p));
+ PUTBYTE_TO_STATUS(card, hex_asc_lo(*p));
}
PUTBYTE_TO_STATUS(card, '\n');
diff --git a/drivers/isdn/hisax/q931.c b/drivers/isdn/hisax/q931.c
index 8b853d5..c0771f9 100644
--- a/drivers/isdn/hisax/q931.c
+++ b/drivers/isdn/hisax/q931.c
@@ -1152,20 +1152,11 @@ QuickHex(char *txt, u_char * p, int cnt)
{
register int i;
register char *t = txt;
- register u_char w;
for (i = 0; i < cnt; i++) {
*t++ = ' ';
- w = (p[i] >> 4) & 0x0f;
- if (w < 10)
- *t++ = '0' + w;
- else
- *t++ = 'A' - 10 + w;
- w = p[i] & 0x0f;
- if (w < 10)
- *t++ = '0' + w;
- else
- *t++ = 'A' - 10 + w;
+ *t++ = hex_asc_hi(p[i]);
+ *t++ = hex_asc_lo(p[i]);
}
*t++ = 0;
return (t - txt);
--
1.7.1.1
^ permalink raw reply related
* Re: [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages
From: Xiaotian Feng @ 2010-07-15 12:37 UTC (permalink / raw)
To: Pekka Enberg
Cc: linux-mm, linux-nfs, netdev, riel, cl, a.p.zijlstra, linux-kernel,
lwang, akpm, davem
In-Reply-To: <AANLkTilj5GrhbRJZfSsfXP1v9cQSRlARFmxpys1vUelr@mail.gmail.com>
On 07/14/2010 04:33 AM, Pekka Enberg wrote:
> Hi Xiaotian!
>
> I would actually prefer that the SLAB, SLOB, and SLUB changes were in
> separate patches to make reviewing easier.
>
> Looking at SLUB:
>
> On Tue, Jul 13, 2010 at 1:17 PM, Xiaotian Feng<dfeng@redhat.com> wrote:
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 7bb7940..7a5d6dc 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -27,6 +27,8 @@
>> #include<linux/memory.h>
>> #include<linux/math64.h>
>> #include<linux/fault-inject.h>
>> +#include "internal.h"
>> +
>>
>> /*
>> * Lock order:
>> @@ -1139,7 +1141,8 @@ static void setup_object(struct kmem_cache *s, struct page *page,
>> s->ctor(object);
>> }
>>
>> -static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
>> +static
>> +struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node, int *reserve)
>> {
>> struct page *page;
>> void *start;
>> @@ -1153,6 +1156,8 @@ static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
>> if (!page)
>> goto out;
>>
>> + *reserve = page->reserve;
>> +
>> inc_slabs_node(s, page_to_nid(page), page->objects);
>> page->slab = s;
>> page->flags |= 1<< PG_slab;
>> @@ -1606,10 +1611,20 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>> {
>> void **object;
>> struct page *new;
>> + int reserve;
>>
>> /* We handle __GFP_ZERO in the caller */
>> gfpflags&= ~__GFP_ZERO;
>>
>> + if (unlikely(c->reserve)) {
>> + /*
>> + * If the current slab is a reserve slab and the current
>> + * allocation context does not allow access to the reserves we
>> + * must force an allocation to test the current levels.
>> + */
>> + if (!(gfp_to_alloc_flags(gfpflags)& ALLOC_NO_WATERMARKS))
>> + goto grow_slab;
>
> OK, so assume that:
>
> (1) c->reserve is set to one
>
> (2) GFP flags don't allow dipping into the reserves
>
> (3) we've managed to free enough pages so normal
> allocations are fine
>
> (4) the page from reserves is not yet empty
>
> we will call flush_slab() and put the "emergency page" on partial list
> and clear c->reserve. This effectively means that now some other
> allocation can fetch the partial page and start to use it. Is this OK?
> Who makes sure the emergency reserves are large enough for the next
> out-of-memory condition where we swap over NFS?
>
Good catch. I'm just wondering if above check is necessary. For
"emergency page", we don't set c->freelist. How can we get a
reserved slab, if GPF flags don't allow dipping into reserves?
>> + }
>> if (!c->page)
>> goto new_slab;
>>
>> @@ -1623,8 +1638,8 @@ load_freelist:
>> object = c->page->freelist;
>> if (unlikely(!object))
>> goto another_slab;
>> - if (unlikely(SLABDEBUG&& PageSlubDebug(c->page)))
>> - goto debug;
>> + if (unlikely(SLABDEBUG&& PageSlubDebug(c->page) || c->reserve))
>> + goto slow_path;
>>
>> c->freelist = get_freepointer(s, object);
>> c->page->inuse = c->page->objects;
>> @@ -1646,16 +1661,18 @@ new_slab:
>> goto load_freelist;
>> }
>>
>> +grow_slab:
>> if (gfpflags& __GFP_WAIT)
>> local_irq_enable();
>>
>> - new = new_slab(s, gfpflags, node);
>> + new = new_slab(s, gfpflags, node,&reserve);
>>
>> if (gfpflags& __GFP_WAIT)
>> local_irq_disable();
>>
>> if (new) {
>> c = __this_cpu_ptr(s->cpu_slab);
>> + c->reserve = reserve;
>> stat(s, ALLOC_SLAB);
>> if (c->page)
>> flush_slab(s, c);
>> @@ -1667,10 +1684,20 @@ new_slab:
>> if (!(gfpflags& __GFP_NOWARN)&& printk_ratelimit())
>> slab_out_of_memory(s, gfpflags, node);
>> return NULL;
>> -debug:
>> - if (!alloc_debug_processing(s, c->page, object, addr))
>> +
>> +slow_path:
>> + if (!c->reserve&& !alloc_debug_processing(s, c->page, object, addr))
>> goto another_slab;
>>
>> + /*
>> + * Avoid the slub fast path in slab_alloc() by not setting
>> + * c->freelist and the fast path in slab_free() by making
>> + * node_match() fail by setting c->node to -1.
>> + *
>> + * We use this for for debug and reserve checks which need
>> + * to be done for each allocation.
>> + */
>> +
>> c->page->inuse++;
>> c->page->freelist = get_freepointer(s, object);
>> c->node = -1;
>> @@ -2095,10 +2122,11 @@ static void early_kmem_cache_node_alloc(gfp_t gfpflags, int node)
>> struct page *page;
>> struct kmem_cache_node *n;
>> unsigned long flags;
>> + int reserve;
>>
>> BUG_ON(kmalloc_caches->size< sizeof(struct kmem_cache_node));
>>
>> - page = new_slab(kmalloc_caches, gfpflags, node);
>> + page = new_slab(kmalloc_caches, gfpflags, node,&reserve);
>>
>> BUG_ON(!page);
>> if (page_to_nid(page) != node) {
>> --
>> 1.7.1.1
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org. For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email:<a href=mailto:"dont@kvack.org"> email@kvack.org</a>
>>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply
* [PATCH 2/3] drivers: isdn: remove custom strtoul()
From: Andy Shevchenko @ 2010-07-15 12:37 UTC (permalink / raw)
To: linux-kernel; +Cc: Andy Shevchenko, Karsten Keil, netdev
In-Reply-To: <1279197440-6585-1-git-send-email-andy.shevchenko@gmail.com>
In this case we safe to use strict_strtoul().
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: netdev@vger.kernel.org
---
drivers/isdn/hysdn/hysdn_proclog.c | 36 +++++++-----------------------------
1 files changed, 7 insertions(+), 29 deletions(-)
diff --git a/drivers/isdn/hysdn/hysdn_proclog.c b/drivers/isdn/hysdn/hysdn_proclog.c
index 7003698..2ee93d0 100644
--- a/drivers/isdn/hysdn/hysdn_proclog.c
+++ b/drivers/isdn/hysdn/hysdn_proclog.c
@@ -16,6 +16,7 @@
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/mutex.h>
+#include <linux/kernel.h>
#include "hysdn_defs.h"
@@ -155,9 +156,8 @@ static ssize_t
hysdn_log_write(struct file *file, const char __user *buf, size_t count, loff_t * off)
{
unsigned long u = 0;
- int found = 0;
- unsigned char *cp, valbuf[128];
- long base = 10;
+ int rc;
+ unsigned char valbuf[128];
hysdn_card *card = file->private_data;
if (count > (sizeof(valbuf) - 1))
@@ -166,32 +166,10 @@ hysdn_log_write(struct file *file, const char __user *buf, size_t count, loff_t
return (-EFAULT); /* copy failed */
valbuf[count] = 0; /* terminating 0 */
- cp = valbuf;
- if ((count > 2) && (valbuf[0] == '0') && (valbuf[1] == 'x')) {
- cp += 2; /* pointer after hex modifier */
- base = 16;
- }
- /* scan the input for debug flags */
- while (*cp) {
- if ((*cp >= '0') && (*cp <= '9')) {
- found = 1;
- u *= base; /* adjust to next digit */
- u += *cp++ - '0';
- continue;
- }
- if (base != 16)
- break; /* end of number */
-
- if ((*cp >= 'a') && (*cp <= 'f')) {
- found = 1;
- u *= base; /* adjust to next digit */
- u += *cp++ - 'a' + 10;
- continue;
- }
- break; /* terminated */
- }
- if (found) {
+ rc = strict_strtoul(valbuf, 0, &u);
+
+ if (rc == 0) {
card->debug_flags = u; /* remember debug flags */
hysdn_addlog(card, "debug set to 0x%lx", card->debug_flags);
}
--
1.7.1.1
^ permalink raw reply related
* [PATCH 3/3] drivers: isdn: get rid of custom strtoul()
From: Andy Shevchenko @ 2010-07-15 12:37 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Hansjoerg Lipp, Tilman Schmidt, Karsten Keil,
gigaset307x-common, netdev
In-Reply-To: <1279197440-6585-1-git-send-email-andy.shevchenko@gmail.com>
There were two methods isdn_gethex() and isdn_getnum() which are custom
implementations of strtoul(). Get rid of them in regard to
strict_strtoul() kernel's function.
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Hansjoerg Lipp <hjlipp@web.de>
Cc: Tilman Schmidt <tilman@imap.cc>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: gigaset307x-common@lists.sourceforge.net
Cc: netdev@vger.kernel.org
---
drivers/isdn/gigaset/ev-layer.c | 80 ++++++++++-----------------------------
1 files changed, 20 insertions(+), 60 deletions(-)
diff --git a/drivers/isdn/gigaset/ev-layer.c b/drivers/isdn/gigaset/ev-layer.c
index a230ba7..a141876 100644
--- a/drivers/isdn/gigaset/ev-layer.c
+++ b/drivers/isdn/gigaset/ev-layer.c
@@ -385,64 +385,18 @@ static const struct zsau_resp_t {
{NULL, ZSAU_UNKNOWN}
};
-/*
- * Get integer from char-pointer
- */
-static int isdn_getnum(char *p)
-{
- int v = -1;
-
- gig_dbg(DEBUG_EVENT, "string: %s", p);
-
- while (*p >= '0' && *p <= '9')
- v = ((v < 0) ? 0 : (v * 10)) + (int) ((*p++) - '0');
- if (*p)
- v = -1; /* invalid Character */
- return v;
-}
-
-/*
- * Get integer from char-pointer
- */
-static int isdn_gethex(char *p)
-{
- int v = 0;
- int c;
-
- gig_dbg(DEBUG_EVENT, "string: %s", p);
-
- if (!*p)
- return -1;
-
- do {
- if (v > (INT_MAX - 15) / 16)
- return -1;
- c = *p;
- if (c >= '0' && c <= '9')
- c -= '0';
- else if (c >= 'a' && c <= 'f')
- c -= 'a' - 10;
- else if (c >= 'A' && c <= 'F')
- c -= 'A' - 10;
- else
- return -1;
- v = v * 16 + c;
- } while (*++p);
-
- return v;
-}
-
/* retrieve CID from parsed response
* returns 0 if no CID, -1 if invalid CID, or CID value 1..65535
*/
static int cid_of_response(char *s)
{
- int cid;
+ unsigned long cid;
+ int rc;
if (s[-1] != ';')
return 0; /* no CID separator */
- cid = isdn_getnum(s);
- if (cid < 0)
+ rc = strict_strtoul(s, 10, &cid);
+ if (rc)
return 0; /* CID not numeric */
if (cid < 1 || cid > 65535)
return -1; /* CID out of range */
@@ -612,21 +566,27 @@ void gigaset_handle_modem_response(struct cardstate *cs)
case RT_ZCAU:
event->parameter = -1;
if (curarg + 1 < params) {
- i = isdn_gethex(argv[curarg]);
- j = isdn_gethex(argv[curarg + 1]);
- if (i >= 0 && i < 256 && j >= 0 && j < 256)
- event->parameter = (unsigned) i << 8
- | j;
- curarg += 2;
+ unsigned long type, value;
+
+ i = strict_strtoul(argv[curarg++], 16, &type);
+ j = strict_strtoul(argv[curarg++], 16, &value);
+
+ if (i == 0 && type < 256 &&
+ j == 0 && value < 256)
+ event->parameter = (type << 8) | value;
} else
curarg = params - 1;
break;
case RT_NUMBER:
+ event->parameter = -1;
if (curarg < params) {
- event->parameter = isdn_getnum(argv[curarg]);
- ++curarg;
- } else
- event->parameter = -1;
+ unsigned long res;
+ int rc;
+
+ rc = strict_strtoul(argv[curarg++], 10, &res);
+ if (rc == 0)
+ event->parameter = res;
+ }
gig_dbg(DEBUG_EVENT, "parameter==%d", event->parameter);
break;
}
--
1.7.1.1
^ permalink raw reply related
* Re: [PATCH] vhost-net: avoid flush under lock
From: Michael S. Tsirkin @ 2010-07-15 12:37 UTC (permalink / raw)
To: Sridhar Samudrala, David S. Miller, Arnd Bergmann,
Paul E. McKenney, kvm
In-Reply-To: <20100715121912.GA7176@redhat.com>
On Thu, Jul 15, 2010 at 03:19:12PM +0300, Michael S. Tsirkin wrote:
> We flush under vq mutex when changing backends.
> This creates a deadlock as workqueue being flushed
> needs this lock as well.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=612421
>
> Drop the vq mutex before flush: we have the device mutex
> which is sufficient to prevent another ioctl from touching
> the vq.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Dave, just to clarify, I'll send pull request to merge it through my tree,
there's no need for you to bother with this.
> ---
> drivers/vhost/net.c | 5 +++++
> 1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 28d7786..50df58e6 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -534,11 +534,16 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd)
> rcu_assign_pointer(vq->private_data, sock);
> vhost_net_enable_vq(n, vq);
> done:
> + mutex_unlock(&vq->mutex);
> +
> if (oldsock) {
> vhost_net_flush_vq(n, index);
> fput(oldsock->file);
> }
>
> + mutex_unlock(&n->dev.mutex);
> + return 0;
> +
> err_vq:
> mutex_unlock(&vq->mutex);
> err:
> --
> 1.7.2.rc0.14.g41c1c
^ permalink raw reply
* Re: [PATCH RFC] act_cpu: packet distributing
From: jamal @ 2010-07-15 12:48 UTC (permalink / raw)
To: Changli Gao
Cc: Eric Dumazet, David S. Miller, Patrick McHardy, Tom Herbert,
netdev
In-Reply-To: <AANLkTimPmXWyAspJqnya7aM_vDRNzDdJJf2bl7CYrAXN@mail.gmail.com>
On Wed, 2010-07-14 at 12:17 +0800, Changli Gao wrote:
> Thanks, I'll try. It is a write critical section, and for me it is
> difficult to convert this lock to RCU. Could you show me some
> examples?
RCU maybe a little trickier here Eric. Actions could be shared i.e.
example, it is possible to have a policer action restricting rates for a
group of flows across multiple netdevices etc. Since action stats get
written to by different CPUs concurrently. It could be probably done if
one was to implement per-cpu stats which get summed-up when user space
asks.
cheers,
jamal
^ permalink raw reply
* Re: oops in tcp_xmit_retransmit_queue() w/ v2.6.32.15
From: Lennart Schulte @ 2010-07-15 12:55 UTC (permalink / raw)
To: Eric Dumazet
Cc: Ilpo Järvinen, Tejun Heo, David S. Miller, lkml,
netdev@vger.kernel.org, Fehrmann, Henning, Carsten Aulbert
In-Reply-To: <1279195528.2496.2.camel@edumazet-laptop>
Since tcp_xmit_retransmit_queue also gets skb == NULL I'm pretty sure it
is the same bug.
Up to now I only experienced the problem with ACK loss (without ACK loss
the test ran about 30min without problems, with ACK loss it had paniced
within 10min).
The data sender only has a HTB queue for traffic shaping (set to 20
Mbit/s). The ACK loss is done by another router.
The setup looks like this. This way it seems to be the most realistic.
o sender with HTB
|
|
o netem queue for forward path delay
|
o netem queue for a queue limit
|
o netem queue for backward path delay
|
o netem queue for ACK loss
|
|
o receiver with HTB
Perhaps now it is a little big clearer.
On 15.07.2010 14:05, Eric Dumazet wrote:
> Le jeudi 15 juillet 2010 à 13:58 +0200, Lennart Schulte a écrit :
>
>> I'm testing new reordering algorithms in a virtual testbed, that is the
>> nodes are emulated with xen and all the network parameters can be tuned
>> with queues.
>> With one of the algorithms I also got tracebacks which include
>> tcp_xmit_retransmit_queue. It only happens with ACK loss. The kernel
>> version however is 2.6.31.
>> When I read this thread I tried the debug patch and got the following:
>>
>> [ 2754.413150] NULL head, pkts 0
>> [ 2754.413156] Errors caught so far 1
>>
>> Hope that is of any help.
>>
> Not sure I understand.
>
> Are you saying you reproduce same tcp_xmit_retransmit_queue() bug, with
> a special tc qdisc/class droppping some ACK frames ?
>
> Could it be some sched problem and incorrect return codes in case of
> congestion ?
>
^ permalink raw reply
* [RFC] Question about tcp_sendmsg()
From: Eric Dumazet @ 2010-07-15 13:14 UTC (permalink / raw)
To: David Miller, Ilpo Järvinen, Krishna Kumar; +Cc: netdev
While investigating for various bug reports in tcp stack, I looked at
commit def87cf42069a (tcp: Slightly optimize tcp_sendmsg)
One question I have is that the
sg = sk->sk_route_caps & NETIF_F_SG;
is now done at the beginning of tcp_sendmsg(), and kept in sg variable
for the whole tcp_sendmsg() duration, even if task has to wait for
space.
Previously sk->sk_route_caps & NETIF_F_SG was done in select_size()
itself.
I am wondering if this can have a side effect, if SG capability changes
while a thread has to wait in sk_stream_wait_memory(), and socket route
changes (sk_route_caps flips NETIF_F_SG bit)
Thanks
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9fce8a8..8a4d9bd 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1093,6 +1093,7 @@ wait_for_memory:
goto do_error;
mss_now = tcp_send_mss(sk, &size_goal, flags);
+ sg = sk->sk_route_caps & NETIF_F_SG;
}
}
^ permalink raw reply related
* Question about way that NICs deliver packets to the kernel
From: Junchang Wang @ 2010-07-15 14:24 UTC (permalink / raw)
To: romieu, netdev
Hi list,
My understand of the way that NICs deliver packets to the kernel is
as follows. Correct me if any of this is wrong. Thanks.
1) The device buffer is fixed. When the kernel is acknowledged arrival of a
new packet, it dynamically allocate a new skb and copy the packet into it.
For example, 8139too.
2) The device buffer is mapped by streaming DMA. When the kernel is
acknowledged arrival of a new packet, it unmaps the region previously mapped.
Obviously, there is NO memcpy operation. Additional cost is streaming DMA
map/unmap operations. For example, e100 and e1000.
Here comes my question:
1) Is there a principle indicating which one is better? Is streaming DMA
map/unmap operations more expensive than memcpy operation?
2) Why does r8169 bias towards the first approach even if it support both? I
convert r8169 to the second one and get a 5% performance boost. Below is result
running netperf TCP_STREAM test with 1.6K byte packet length.
scheme 1 scheme 2 Imp.
r8169 683M 718M 5%
The following patch shows what I did:
diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 239d7ef..707876f 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -4556,15 +4556,9 @@ static int rtl8169_rx_interrupt(struct net_device *dev,
rtl8169_rx_csum(skb, desc);
- if (rtl8169_try_rx_copy(&skb, tp, pkt_size, addr)) {
- pci_dma_sync_single_for_device(pdev, addr,
- pkt_size, PCI_DMA_FROMDEVICE);
- rtl8169_mark_to_asic(desc, tp->rx_buf_sz);
- } else {
- pci_unmap_single(pdev, addr, tp->rx_buf_sz,
- PCI_DMA_FROMDEVICE);
- tp->Rx_skbuff[entry] = NULL;
- }
+ pci_unmap_single(pdev, addr, tp->rx_buf_sz,
+ PCI_DMA_FROMDEVICE);
+ tp->Rx_skbuff[entry] = NULL;
skb_put(skb, pkt_size);
skb->protocol = eth_type_trans(skb, dev);
Thanks in advance.
--Junchang
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox