From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Jan Engelhardt <jengelh@medozas.de>
Cc: netfilter-devel@vger.kernel.org, kaber@trash.net
Subject: Re: [PATCH] netfilter: xtables: add cluster match
Date: Sat, 14 Feb 2009 21:42:39 +0100 [thread overview]
Message-ID: <49972CBF.3040503@netfilter.org> (raw)
In-Reply-To: <alpine.LSU.2.00.0902142112590.28575@fbirervta.pbzchgretzou.qr>
Jan Engelhardt wrote:
> On Saturday 2009-02-14 20:29, Pablo Neira Ayuso wrote:
>
>> This patch adds the iptables cluster match. This match can be used
>> to deploy gateway and back-end load-sharing clusters.
>
> All of this nice text (below) should go into libxt_cluster.man :)
Indeed, that will go into the manpage. I have kept the iptables part
until you finish with Jamal's libxtables renaming, to avoid any
clashing. Anyway, I don't think that adding this information to the
manpage is a reason to trim the patch description, actually I think that
this patch deserves this long description and I have seen longer
descriptions in the kernel changelog :)
>> Assuming that all the nodes see all packets (see below for an
>> example on how to do that if your switch does not allow this), the
>> cluster match decides if this node has to handle a packet given:
>>
>> jhash(source IP) % total_nodes == node_id
>>
>> For related connections, the master conntrack is used. The following
>> is an example of its use to deploy a gateway cluster composed of two
>> nodes (where this is the node 1):
>>
>> iptables -I PREROUTING -t mangle -i eth1 -m cluster \
>> --cluster-total-nodes 2 --cluster-local-node 1 \
> [...]
>> echo +2 > /proc/sys/net/netfilter/cluster/$PROC_NAME
>>
>> BTW, some final notes:
>>
>> * This match mangles the skbuff pkt_type in case that it detects
>> PACKET_MULTICAST for a non-multicast address. This may be done in
>> a PKTTYPE target for this sole purpose.
>> * This match supersedes the CLUSTERIP target.
>
> The supersedes statement should also go into Kconfig.
Yes, but I was also waiting for Patrick to tell how to proceed with
this. I think that we also have to add CLUSTERIP to the
schedule-for-removal list.
>> @@ -0,0 +1,21 @@
>> +#ifndef _XT_CLUSTER_MATCH_H
>> +#define _XT_CLUSTER_MATCH_H
>> +
>> +struct proc_dir_entry;
>> +
>> +enum xt_cluster_flags {
>> + XT_CLUSTER_F_INV = 0,
>> +};
>
> Hm, that should be XT_CLUSTER_F_INV = 1 << 0.
Indeed.
>> +config NETFILTER_XT_MATCH_CLUSTER
>> + tristate '"cluster" match support'
>> + depends on NETFILTER_ADVANCED
>
> xt_cluster depends on NF_CONNTRACK too.
Right.
>> + ---help---
>> + This option allows you to build work-load-sharing clusters of
>> + network servers/stateful firewalls without having a dedicated
>> + load-balancing router/server/switch. Basically, this match returns
>> + true when the packet must be handled by this cluster node. Thus,
>> + all nodes see all packets and this match decides which node handles
>> + what packets. The work-load sharing algorithm is based on source
>> + address hashing.
>> +
>> + If you say Y here, try `iptables -m cluster --help` for
>> + more information.
>
> Somehow this gives the impression that Y is the only logical choice,
> when indeed, M would work too.
>
>> +struct xt_cluster_internal {
>> + unsigned long node_mask;
>
> I think I raised concern about this, but can't remember.
> Should not this be of type nodemask_t instead? Or ... is this
> "node" perhaps not describing a NUMA node, but a cluster node?
> Some comments would be appreciated, along the lines of
You mentioned cpumask_t last time, but this is neither related to NUMA
nor a CPU mask, so I prefer to keep using a long for this thing,
moreover test and set bit operations use long.
> /**
> * @node_mask: cluster node mask
> */
>
>> + struct proc_dir_entry *proc;
>> + atomic_t use;
>> +};
>
>> +static inline bool
>> +xt_cluster_is_multicast_addr(const struct sk_buff *skb, int family)
>
> Cosmetic warrants uint8_t family.
>
>> +static bool
>> +xt_cluster_mt(const struct sk_buff *skb, const struct xt_match_param *par)
>> +{
>> + struct sk_buff *pskb = (struct sk_buff *)skb;
>> + const struct xt_cluster_match_info *info = par->matchinfo;
>> + const struct xt_cluster_internal *internal = info->data;
>> + const struct nf_conn *ct;
>> + enum ip_conntrack_info ctinfo;
>> + unsigned long hash;
>> + bool inv = !!(info->flags & XT_CLUSTER_F_INV);
>> +
>> + if (!xt_cluster_is_multicast_addr(skb, par->family) &&
>> + skb->pkt_type == PACKET_MULTICAST) {
>> + pskb->pkt_type = PACKET_HOST;
>> + }
>
> {} are redundant here
When lines are splitted, I have seen these redundant {} for readability.
>> + if (ct->master)
>> + hash = xt_cluster_hash(ct->master, info);
>> + else
>> + hash = xt_cluster_hash(ct, info);
>> +
>> + return test_bit(hash, &internal->node_mask) ^ inv;
>
> Since inv is just once used, I would "inline" it, aka
> return test_bit(hash, &internal->node_mask) ^
> !!(info->flags & XT_CLUSTER_MATCH_INV);
Makes sense.
>> +static int xt_cluster_seq_show(struct seq_file *s, void *v)
>> +{
>> + unsigned long *mask = v;
>> + seq_printf(s, "0x%.8lx\n", *mask);
>
> '.' does not make sense with non-string,non-floating point numbers
> (though it is a stdc feature it seems). I'd say "0x%08lx", for clarity.
OK
>> + case '-':
>> + new_node_id = simple_strtoul(buffer+1, NULL, 10);
>> + if (!new_node_id || new_node_id > sizeof(info->node_mask)*8)
>> + return -EIO;
>> + printk(KERN_NOTICE "cluster: deleting node %u\n", new_node_id);
>> + clear_bit(new_node_id-1, &info->node_mask);
>> + break;
>> + default:
>> + return -EIO;
>
> EINVAL, I'd say.
Other /proc-related code uses this error, but indeed I prefer EINVAL.
>> +static bool
>> +xt_cluster_proc_entry_exist(struct proc_dir_entry *dir, const char *name)
>> +{
>> + struct proc_dir_entry *tmp;
>> +
>> + for (tmp = dir->subdir; tmp; tmp = tmp->next) {
>> + if (strcmp(tmp->name, name) == 0)
>> + return true;
>> + }
>
> -{}
Same comment as above, it's there for readability.
> Looks good so far.
Thanks for the review.
--
"Los honestos son inadaptados sociales" -- Les Luthiers
next prev parent reply other threads:[~2009-02-14 20:42 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-02-14 19:29 [PATCH] netfilter: xtables: add cluster match Pablo Neira Ayuso
2009-02-14 20:28 ` Jan Engelhardt
2009-02-14 20:42 ` Pablo Neira Ayuso [this message]
2009-02-14 22:31 ` Jan Engelhardt
2009-02-14 22:32 ` Jan Engelhardt
2009-02-16 10:56 ` Patrick McHardy
2009-02-16 14:01 ` Pablo Neira Ayuso
2009-02-16 14:03 ` Patrick McHardy
2009-02-16 14:30 ` Pablo Neira Ayuso
2009-02-16 15:01 ` Patrick McHardy
2009-02-16 15:14 ` Pablo Neira Ayuso
2009-02-16 15:10 ` Patrick McHardy
2009-02-16 15:27 ` Pablo Neira Ayuso
2009-02-17 10:46 ` Pablo Neira Ayuso
2009-02-17 10:50 ` Patrick McHardy
2009-02-17 13:50 ` Pablo Neira Ayuso
2009-02-17 19:45 ` Vincent Bernat
2009-02-18 10:14 ` Patrick McHardy
2009-02-18 10:13 ` Patrick McHardy
2009-02-18 11:06 ` Pablo Neira Ayuso
2009-02-18 11:14 ` Patrick McHardy
2009-02-18 17:20 ` Vincent Bernat
2009-02-18 17:25 ` Patrick McHardy
2009-02-18 18:38 ` Pablo Neira Ayuso
2009-02-16 17:17 ` Jan Engelhardt
2009-02-16 17:13 ` Jan Engelhardt
2009-02-16 17:16 ` Patrick McHardy
2009-02-16 17:22 ` Jan Engelhardt
-- strict thread matches above, loose matches on Subject: below --
2009-02-16 9:23 Pablo Neira Ayuso
2009-02-16 9:31 ` Pablo Neira Ayuso
2009-02-16 12:13 ` Jan Engelhardt
2009-02-16 12:17 ` Patrick McHardy
2009-02-16 9:32 Pablo Neira Ayuso
2009-02-19 23:14 Pablo Neira Ayuso
2009-02-20 9:24 ` Patrick McHardy
2009-02-20 13:15 ` Pablo Neira Ayuso
2009-02-20 13:48 ` Patrick McHardy
2009-02-20 16:52 ` Pablo Neira Ayuso
2009-02-20 20:50 Pablo Neira Ayuso
2009-02-20 20:56 ` Pablo Neira Ayuso
2009-02-23 10:13 Pablo Neira Ayuso
2009-02-24 13:46 ` Patrick McHardy
2009-02-24 14:05 ` Pablo Neira Ayuso
2009-02-24 14:06 ` Patrick McHardy
2009-02-24 23:13 ` Pablo Neira Ayuso
2009-02-25 5:52 ` Patrick McHardy
2009-02-25 9:42 ` Pablo Neira Ayuso
2009-02-25 10:20 ` Patrick McHardy
2009-03-16 16:11 ` Patrick McHardy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49972CBF.3040503@netfilter.org \
--to=pablo@netfilter.org \
--cc=jengelh@medozas.de \
--cc=kaber@trash.net \
--cc=netfilter-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.