From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Jan Engelhardt <jengelh@medozas.de>
Cc: netfilter-devel@vger.kernel.org, kaber@trash.net
Subject: Re: [PATCH] netfilter: xtables: add cluster match
Date: Sat, 14 Feb 2009 21:42:39 +0100 [thread overview]
Message-ID: <49972CBF.3040503@netfilter.org> (raw)
In-Reply-To: <alpine.LSU.2.00.0902142112590.28575@fbirervta.pbzchgretzou.qr>
Jan Engelhardt wrote:
> On Saturday 2009-02-14 20:29, Pablo Neira Ayuso wrote:
>
>> This patch adds the iptables cluster match. This match can be used
>> to deploy gateway and back-end load-sharing clusters.
>
> All of this nice text (below) should go into libxt_cluster.man :)
Indeed, that will go into the manpage. I have kept the iptables part
until you finish with Jamal's libxtables renaming, to avoid any
clashing. Anyway, I don't think that adding this information to the
manpage is a reason to trim the patch description, actually I think that
this patch deserves this long description and I have seen longer
descriptions in the kernel changelog :)
>> Assuming that all the nodes see all packets (see below for an
>> example on how to do that if your switch does not allow this), the
>> cluster match decides if this node has to handle a packet given:
>>
>> jhash(source IP) % total_nodes == node_id
>>
>> For related connections, the master conntrack is used. The following
>> is an example of its use to deploy a gateway cluster composed of two
>> nodes (where this is the node 1):
>>
>> iptables -I PREROUTING -t mangle -i eth1 -m cluster \
>> --cluster-total-nodes 2 --cluster-local-node 1 \
> [...]
>> echo +2 > /proc/sys/net/netfilter/cluster/$PROC_NAME
>>
>> BTW, some final notes:
>>
>> * This match mangles the skbuff pkt_type in case that it detects
>> PACKET_MULTICAST for a non-multicast address. This may be done in
>> a PKTTYPE target for this sole purpose.
>> * This match supersedes the CLUSTERIP target.
>
> The supersedes statement should also go into Kconfig.
Yes, but I was also waiting for Patrick to tell how to proceed with
this. I think that we also have to add CLUSTERIP to the
schedule-for-removal list.
>> @@ -0,0 +1,21 @@
>> +#ifndef _XT_CLUSTER_MATCH_H
>> +#define _XT_CLUSTER_MATCH_H
>> +
>> +struct proc_dir_entry;
>> +
>> +enum xt_cluster_flags {
>> + XT_CLUSTER_F_INV = 0,
>> +};
>
> Hm, that should be XT_CLUSTER_F_INV = 1 << 0.
Indeed.
>> +config NETFILTER_XT_MATCH_CLUSTER
>> + tristate '"cluster" match support'
>> + depends on NETFILTER_ADVANCED
>
> xt_cluster depends on NF_CONNTRACK too.
Right.
>> + ---help---
>> + This option allows you to build work-load-sharing clusters of
>> + network servers/stateful firewalls without having a dedicated
>> + load-balancing router/server/switch. Basically, this match returns
>> + true when the packet must be handled by this cluster node. Thus,
>> + all nodes see all packets and this match decides which node handles
>> + what packets. The work-load sharing algorithm is based on source
>> + address hashing.
>> +
>> + If you say Y here, try `iptables -m cluster --help` for
>> + more information.
>
> Somehow this gives the impression that Y is the only logical choice,
> when indeed, M would work too.
>
>> +struct xt_cluster_internal {
>> + unsigned long node_mask;
>
> I think I raised concern about this, but can't remember.
> Should not this be of type nodemask_t instead? Or ... is this
> "node" perhaps not describing a NUMA node, but a cluster node?
> Some comments would be appreciated, along the lines of
You mentioned cpumask_t last time, but this is neither related to NUMA
nor a CPU mask, so I prefer to keep using a long for this thing,
moreover test and set bit operations use long.
> /**
> * @node_mask: cluster node mask
> */
>
>> + struct proc_dir_entry *proc;
>> + atomic_t use;
>> +};
>
>> +static inline bool
>> +xt_cluster_is_multicast_addr(const struct sk_buff *skb, int family)
>
> Cosmetic warrants uint8_t family.
>
>> +static bool
>> +xt_cluster_mt(const struct sk_buff *skb, const struct xt_match_param *par)
>> +{
>> + struct sk_buff *pskb = (struct sk_buff *)skb;
>> + const struct xt_cluster_match_info *info = par->matchinfo;
>> + const struct xt_cluster_internal *internal = info->data;
>> + const struct nf_conn *ct;
>> + enum ip_conntrack_info ctinfo;
>> + unsigned long hash;
>> + bool inv = !!(info->flags & XT_CLUSTER_F_INV);
>> +
>> + if (!xt_cluster_is_multicast_addr(skb, par->family) &&
>> + skb->pkt_type == PACKET_MULTICAST) {
>> + pskb->pkt_type = PACKET_HOST;
>> + }
>
> {} are redundant here
When lines are splitted, I have seen these redundant {} for readability.
>> + if (ct->master)
>> + hash = xt_cluster_hash(ct->master, info);
>> + else
>> + hash = xt_cluster_hash(ct, info);
>> +
>> + return test_bit(hash, &internal->node_mask) ^ inv;
>
> Since inv is just once used, I would "inline" it, aka
> return test_bit(hash, &internal->node_mask) ^
> !!(info->flags & XT_CLUSTER_MATCH_INV);
Makes sense.
>> +static int xt_cluster_seq_show(struct seq_file *s, void *v)
>> +{
>> + unsigned long *mask = v;
>> + seq_printf(s, "0x%.8lx\n", *mask);
>
> '.' does not make sense with non-string,non-floating point numbers
> (though it is a stdc feature it seems). I'd say "0x%08lx", for clarity.
OK
>> + case '-':
>> + new_node_id = simple_strtoul(buffer+1, NULL, 10);
>> + if (!new_node_id || new_node_id > sizeof(info->node_mask)*8)
>> + return -EIO;
>> + printk(KERN_NOTICE "cluster: deleting node %u\n", new_node_id);
>> + clear_bit(new_node_id-1, &info->node_mask);
>> + break;
>> + default:
>> + return -EIO;
>
> EINVAL, I'd say.
Other /proc-related code uses this error, but indeed I prefer EINVAL.
>> +static bool
>> +xt_cluster_proc_entry_exist(struct proc_dir_entry *dir, const char *name)
>> +{
>> + struct proc_dir_entry *tmp;
>> +
>> + for (tmp = dir->subdir; tmp; tmp = tmp->next) {
>> + if (strcmp(tmp->name, name) == 0)
>> + return true;
>> + }
>
> -{}
Same comment as above, it's there for readability.
> Looks good so far.
Thanks for the review.
--
"Los honestos son inadaptados sociales" -- Les Luthiers
next prev parent reply other threads:[~2009-02-14 20:42 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-02-14 19:29 [PATCH] netfilter: xtables: add cluster match Pablo Neira Ayuso
2009-02-14 20:28 ` Jan Engelhardt
2009-02-14 20:42 ` Pablo Neira Ayuso [this message]
2009-02-14 22:31 ` Jan Engelhardt
2009-02-14 22:32 ` Jan Engelhardt
2009-02-16 10:56 ` Patrick McHardy
2009-02-16 14:01 ` Pablo Neira Ayuso
2009-02-16 14:03 ` Patrick McHardy
2009-02-16 14:30 ` Pablo Neira Ayuso
2009-02-16 15:01 ` Patrick McHardy
2009-02-16 15:14 ` Pablo Neira Ayuso
2009-02-16 15:10 ` Patrick McHardy
2009-02-16 15:27 ` Pablo Neira Ayuso
2009-02-17 10:46 ` Pablo Neira Ayuso
2009-02-17 10:50 ` Patrick McHardy
2009-02-17 13:50 ` Pablo Neira Ayuso
2009-02-17 19:45 ` Vincent Bernat
2009-02-18 10:14 ` Patrick McHardy
2009-02-18 10:13 ` Patrick McHardy
2009-02-18 11:06 ` Pablo Neira Ayuso
2009-02-18 11:14 ` Patrick McHardy
2009-02-18 17:20 ` Vincent Bernat
2009-02-18 17:25 ` Patrick McHardy
2009-02-18 18:38 ` Pablo Neira Ayuso
2009-02-16 17:17 ` Jan Engelhardt
2009-02-16 17:13 ` Jan Engelhardt
2009-02-16 17:16 ` Patrick McHardy
2009-02-16 17:22 ` Jan Engelhardt
-- strict thread matches above, loose matches on Subject: below --
2009-02-16 9:23 Pablo Neira Ayuso
2009-02-16 9:31 ` Pablo Neira Ayuso
2009-02-16 12:13 ` Jan Engelhardt
2009-02-16 12:17 ` Patrick McHardy
2009-02-16 9:32 Pablo Neira Ayuso
2009-02-19 23:14 Pablo Neira Ayuso
2009-02-20 9:24 ` Patrick McHardy
2009-02-20 13:15 ` Pablo Neira Ayuso
2009-02-20 13:48 ` Patrick McHardy
2009-02-20 16:52 ` Pablo Neira Ayuso
2009-02-20 20:50 Pablo Neira Ayuso
2009-02-20 20:56 ` Pablo Neira Ayuso
2009-02-23 10:13 Pablo Neira Ayuso
2009-02-24 13:46 ` Patrick McHardy
2009-02-24 14:05 ` Pablo Neira Ayuso
2009-02-24 14:06 ` Patrick McHardy
2009-02-24 23:13 ` Pablo Neira Ayuso
2009-02-25 5:52 ` Patrick McHardy
2009-02-25 9:42 ` Pablo Neira Ayuso
2009-02-25 10:20 ` Patrick McHardy
2009-03-16 16:11 ` Patrick McHardy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49972CBF.3040503@netfilter.org \
--to=pablo@netfilter.org \
--cc=jengelh@medozas.de \
--cc=kaber@trash.net \
--cc=netfilter-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).