[PATCH][RFC]: ingress socket filter by mark

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH][RFC]: ingress socket filter by mark
@ 2009-10-18 12:42 jamal
  2009-10-18 17:28 ` Eric Dumazet
  0 siblings, 1 reply; 5+ messages in thread
From: jamal @ 2009-10-18 12:42 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Atis Elsts, eric.dumazet, Maciej Żenczykowski

[-- Attachment #1: Type: text/plain, Size: 960 bytes --]

Maciej forced me to dig into this ;->

at the socket level if a packet arrives with a different mark than
what we bind to, drop it. I have tested this patch and it drops a packet
with mismatching mark.

There are several approaches - and i think the patch suggestion i have
made here maybe too strict. I assume that if someone binds to a mark,
they want to not only send packets with that mark but receive
only if that mark is set. 
A looser check would be something along the line accept as well if mark
is not set i.e
if (sk->sk_mark && skb->mark && sk->sk_mark != skb->mark)

Alternatively i could add one bit in the socket flags and have it so
that check is made only if app has been explicit:
if (sock_flag(sk, SOCK_CHK_SOMARK) && sk->sk_mark != skb->mark) drop

Another approach  is to set sock filter from app. I dont like this
approach because it will be the least usable from app level and would be
the least simple from kernel level.

cheers,
jamal

[-- Attachment #2: filt-sock-m --]
[-- Type: text/x-patch, Size: 375 bytes --]

diff --git a/net/core/filter.c b/net/core/filter.c
index d1d779c..6fcf577 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -85,6 +85,9 @@ int sk_filter(struct sock *sk, struct sk_buff *skb)
 	if (err)
 		return err;

+	if (sk->sk_mark && sk->sk_mark != skb->mark)
+		return -EPERM;
+
 	rcu_read_lock_bh();
 	filter = rcu_dereference(sk->sk_filter);
 	if (filter) {

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH][RFC]: ingress socket filter by mark
  2009-10-18 12:42 [PATCH][RFC]: ingress socket filter by mark jamal
@ 2009-10-18 17:28 ` Eric Dumazet
  2009-10-18 20:28   ` jamal
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2009-10-18 17:28 UTC (permalink / raw)
  To: hadi; +Cc: netdev, David Miller, Atis Elsts, Maciej Z.enczykowski

jamal a écrit :
> Maciej forced me to dig into this ;->
> 
> at the socket level if a packet arrives with a different mark than
> what we bind to, drop it. I have tested this patch and it drops a packet
> with mismatching mark.
> 
> There are several approaches - and i think the patch suggestion i have
> made here maybe too strict. I assume that if someone binds to a mark,
> they want to not only send packets with that mark but receive
> only if that mark is set. 
> A looser check would be something along the line accept as well if mark
> is not set i.e
> if (sk->sk_mark && skb->mark && sk->sk_mark != skb->mark)
> 
> Alternatively i could add one bit in the socket flags and have it so
> that check is made only if app has been explicit:
> if (sock_flag(sk, SOCK_CHK_SOMARK) && sk->sk_mark != skb->mark) drop
> 
> Another approach  is to set sock filter from app. I dont like this
> approach because it will be the least usable from app level and would be
> the least simple from kernel level.
> 
> cheers,
> jamal
> 

I vote for extending BPF, and not adding the price of a compare
for each packet. Only users wanting mark filtering should pay the price.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][RFC]: ingress socket filter by mark
  2009-10-18 17:28 ` Eric Dumazet
@ 2009-10-18 20:28   ` jamal
  2009-10-18 23:09     ` Maciej Żenczykowski
  0 siblings, 1 reply; 5+ messages in thread
From: jamal @ 2009-10-18 20:28 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, David Miller, Atis Elsts, Maciej Z.enczykowski

[-- Attachment #1: Type: text/plain, Size: 385 bytes --]

On Sun, 2009-10-18 at 19:28 +0200, Eric Dumazet wrote:

> I vote for extending BPF, and not adding the price of a compare
> for each packet. Only users wanting mark filtering should pay the price.

To be honest it nagged me as well;->
So here's a basic patch stolen from a patch i just saw that you
posted;-> I still havent tested. Let me know if it looks reasonable...

cheers,
jamal

[-- Attachment #2: filt-sock-m-2 --]
[-- Type: text/x-patch, Size: 778 bytes --]

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 1354aaf..909193e 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -123,7 +123,8 @@ struct sock_fprog	/* Required for SO_ATTACH_FILTER. */
 #define SKF_AD_IFINDEX 	8
 #define SKF_AD_NLATTR	12
 #define SKF_AD_NLATTR_NEST	16
-#define SKF_AD_MAX	20
+#define SKF_AD_MARK 	20
+#define SKF_AD_MAX	24
 #define SKF_NET_OFF   (-0x100000)
 #define SKF_LL_OFF    (-0x200000)
 
diff --git a/net/core/filter.c b/net/core/filter.c
index d1d779c..e3987e1 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -303,6 +303,9 @@ load_b:
 		case SKF_AD_IFINDEX:
 			A = skb->dev->ifindex;
 			continue;
+		case SKF_AD_MARK:
+			A = skb->mark;
+			continue;
 		case SKF_AD_NLATTR: {
 			struct nlattr *nla;
 

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH][RFC]: ingress socket filter by mark
  2009-10-18 20:28   ` jamal
@ 2009-10-18 23:09     ` Maciej Żenczykowski
  2009-10-19 12:12       ` jamal
  0 siblings, 1 reply; 5+ messages in thread
From: Maciej Żenczykowski @ 2009-10-18 23:09 UTC (permalink / raw)
  To: hadi; +Cc: Eric Dumazet, netdev, David Miller, Atis Elsts

I haven't looked at the patches, but I do not believe requiring
marking to be symmetric to be a good idea.

Example:
- a relatively complex router/load balancer setup
- normal (no mark) packets get routed/load balanced to destinations
which are healthy
- a separate job which health checks the destinations (and updates the
no mark routing table on destination health state transitions) - it
uses socket marking to select a separate routing table which throws
all load at a specific destination host.

ie. the socket marking is used to explicitly direct load at a specific
destination host.
Obviously only the transmit path is affected.  There is no marking
happening on the receive path.

Another example would be tunneling. I'd envision something like:
ip tunnel add vtun0 mode gre remote ... local ... tos inherit ttl 64 mark 0x1234

ip rule add fwmark 0x1234 lookup 250
ip route add 192.168.1.0/24 dev eth0 table 250
ip route add default via 192.168.1.1 dev eth0 table 250
ip route add default dev vtun0 table main

(obviously this is just an example and not fully fleshed out,
furthermore ip tunnel doesn't currently support mark, nor do the
tunnel modules themselves)

If you do want to allow explicit incoming mark filtering use another
socket option (SO_MARK_RCV) and a different/new socket field.

>> I vote for extending BPF, and not adding the price of a compare
>> for each packet. Only users wanting mark filtering should pay the price.

I agree that being able to filter on mark in bpf makes a lot of sense.
I wonder if we're not hitting the filters potentially before the mark
is set though (on receive at least)...
I'm nowhere near sure but I think packets get diverted/cloned to
tcpdump before they hit the ip stack (and thus potentially get marked
by ip(6)table mangle rules)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH][RFC]: ingress socket filter by mark
  2009-10-18 23:09     ` Maciej Żenczykowski
@ 2009-10-19 12:12       ` jamal
  0 siblings, 0 replies; 5+ messages in thread
From: jamal @ 2009-10-19 12:12 UTC (permalink / raw)
  To: Maciej Żenczykowski; +Cc: Eric Dumazet, netdev, David Miller, Atis Elsts

On Sun, 2009-10-18 at 16:09 -0700, Maciej Żenczykowski wrote:
> 
> I agree that being able to filter on mark in bpf makes a lot of sense.

I agree as well - i posted a patch yesterday; i just tested it and it
works so i will formally post it shortly.

> I wonder if we're not hitting the filters potentially before the mark
> is set though (on receive at least)...
> I'm nowhere near sure but I think packets get diverted/cloned to
> tcpdump before they hit the ip stack (and thus potentially get marked
> by ip(6)table mangle rules)

There are many ways to mark the packets before they get to the socket.
tc ingress provides at least two ways (ipt action and recently posted
patch by me on skbedit); iptables as well.

cheers,
jamal


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2009-10-19 12:15 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-18 12:42 [PATCH][RFC]: ingress socket filter by mark jamal
2009-10-18 17:28 ` Eric Dumazet
2009-10-18 20:28   ` jamal
2009-10-18 23:09     ` Maciej Żenczykowski
2009-10-19 12:12       ` jamal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).