* Re: tc ipt action [not found] ` <50C9B4BB.9060609@mojatatu.com> @ 2012-12-15 21:19 ` Jamal Hadi Salim 2012-12-15 23:06 ` Jan Engelhardt 2012-12-16 0:27 ` tc ipt action Pablo Neira Ayuso 0 siblings, 2 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-15 21:19 UTC (permalink / raw) To: Yury Stankevich, shemonc; +Cc: netdev@vger.kernel.org, pablo, netfilter-devel Yury, I took a brief look and run some quick tests on ubuntu 12.04. I am going to be lazy and try and involve the netfilter folks. It seems that if you left out the args to CONNMARK (includes other targets like MARK etc) you will succeed - but you get default values. Example, the following should work for tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 action ipt -j CONNMARK \ action mirred egress redirect dev ifb0 Here is what the output looks like when you dont pass the parameters. ------- j@ubuntu:~$ sudo tc filter show dev eth0 parent ffff: filter protocol ip pref 1 u32 filter protocol ip pref 1 u32 fh 800: ht divisor 1 filter protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:15 match 0a000015/ffffffff at 12 action order 1: tablename: mangle hook: NF_IP_PRE_ROUTING target MARK and 0xffffffff index 2 ref 1 bind 1 filter protocol ip pref 49149 u32 filter protocol ip pref 49149 u32 fh 804: ht divisor 1 filter protocol ip pref 49149 u32 fh 804::800 order 2048 key ht 804 bkt 0 flowid 1:12 match 00000000/00000000 at 0 action order 33: tablename: mangle hook: NF_IP_PRE_ROUTING target CONNMARK and 0x0 index 123 ref 1 bind 1 ---------------- Pablo, Hasan Chowdhury tells me this broke after iptable 1.4.10 Hasan also sent me a small patch to fake "xt" instead of "ipt" - but i think there's more than meets the eye here; some interface we are using to talk to xtables on user space seems to have changed. cheers, jamal On 12-12-13 05:58 AM, Jamal Hadi Salim wrote: > Yury, > > This appears to be an ABI breakage on iptables/netfilter side. > I will look at it (and hopefully fix it) over the weekend. > > cheers, > jamal > > On 12-12-09 07:20 AM, Yury Stankevich wrote: >> Hello, >> >> i not sure this is correct list, please advise if not. >> >> i'm trying to use ipt action, and got a problem: >> >> #tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 >> action ipt -j CONNMARK --restore-mark action mirred egress redirect >> dev ifb0 >> -> bad action type ipt >> >> from strace: >> open("/usr/lib/tc//m_gact.so", O_RDONLY) = -1 ENOENT (No such file or >> directory) >> write(2, "bad action type ipt\n", 20bad action type ipt >> >> well. i'm trying to use xt: >> #tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 >> action xt -j CONNMARK --restore-mark action mirred egress redirect dev >> ifb0 >> xt: unrecognized option '--restore-mark' >> >> from strace: >> open("/lib/xtables/libxt_CONNMARK.so", O_RDONLY) = 4 >> read(4, >> "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200\6\0\0004\0\0\0"..., >> 512) = 512 >> fstat64(4, {st_mode=S_IFREG|0644, st_size=9756, ...}) = 0 >> mmap2(NULL, 12548, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) >> = 0xf76f3000 >> mmap2(0xf76f5000, 8192, PROT_READ|PROT_WRITE, >> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0x1) = 0xf76f5000 >> close(4) = 0 >> mprotect(0xf76f5000, 4096, PROT_READ) = 0 >> socket(PF_INET, SOCK_RAW, IPPROTO_RAW) = 4 >> fcntl64(4, F_SETFD, FD_CLOEXEC) = 0 >> lstat64("/proc/net/ip_tables_names", {st_mode=S_IFREG|0440, st_size=0, >> ...}) = 0 >> statfs64("/proc/net/ip_tables_names", 84, {f_type="PROC_SUPER_MAGIC", >> f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, >> f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0 >> getsockopt(4, SOL_IP, 0x43 /* IP_??? */, >> "CONNMARK\0\367\f\300\0\0\0po\367l8p\367\364/p\367:}\302\1", [30]) = 0 >> close(4) = 0 >> write(2, "xt: unrecognized option '--resto"..., 41xt: unrecognized >> option '--restore-mark' >> >> so... i make something wrong or this is a bug ? >> >> ps: 3.6.8 kernel 64 bit kernel with 32 bit userspace, iproute 20121001 >> from debian-experimental, >> module act_ipt is loaded. >> pps: please, cc me in reply. >> >> > ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-15 21:19 ` tc ipt action Jamal Hadi Salim @ 2012-12-15 23:06 ` Jan Engelhardt 2012-12-16 0:26 ` Jan Engelhardt ` (2 more replies) 2012-12-16 0:27 ` tc ipt action Pablo Neira Ayuso 1 sibling, 3 replies; 64+ messages in thread From: Jan Engelhardt @ 2012-12-15 23:06 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Yury Stankevich, shemonc, netdev@vger.kernel.org, pablo, netfilter-devel On Saturday 2012-12-15 22:19, Jamal Hadi Salim wrote: > > Example, the following should work for > tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 > action ipt -j CONNMARK \ > action mirred egress redirect dev ifb0 If I try that command (substituting ipt->xt and eth0->dummy0, ifb0->dummy1), all I get is the dreaded "Invalid argument". So the kernel rejected the command, which could indicate that userspace construction might have been ok. # tc filter add dev dummy0 parent ffff: protocol ip u32 match u32 0 0 \ action xt -j CONNMARK action mirred egress redirect dev dummy1 tablename: mangle hook: NF_IP_PRE_ROUTING target: CONNMARK and 0x0 index 0 RTNETLINK answers: Invalid argument We have an error talking to the kernel > Pablo, Hasan Chowdhury tells me this broke after iptable 1.4.10 > Hasan also sent me a small patch to fake "xt" instead of "ipt" - but i think > there's more than meets the eye here; some interface we are using to talk to > xtables on user space seems to have changed. What was the last combination that worked? ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-15 23:06 ` Jan Engelhardt @ 2012-12-16 0:26 ` Jan Engelhardt 2012-12-16 0:32 ` [PATCH] build: unbreak linkage of m_xt.so Jan Engelhardt 2012-12-16 10:22 ` tc ipt action Jamal Hadi Salim [not found] ` <CAASe=fQT2pVOK0uctdaKL+aOrF8nYeTMfoF15kmd-rC02+7Vnw@mail.gmail.com> 2 siblings, 1 reply; 64+ messages in thread From: Jan Engelhardt @ 2012-12-16 0:26 UTC (permalink / raw) To: vapier Cc: Jamal Hadi Salim, Yury Stankevich, shemonc, netdev@vger.kernel.org, Pablo Neira Ayuso, Netfilter Developer Mailing List On Sunday 2012-12-16 00:06, Jan Engelhardt wrote: >On Saturday 2012-12-15 22:19, Jamal Hadi Salim wrote: >> >> Example, the following should work for >> tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 >> action ipt -j CONNMARK \ >> action mirred egress redirect dev ifb0 > >If I try that command (substituting ipt->xt and eth0->dummy0, >ifb0->dummy1), all I get is the dreaded "Invalid argument". >So the kernel rejected the command, which could indicate that >userspace construction might have been ok. > ># tc filter add dev dummy0 parent ffff: protocol ip u32 match u32 0 0 \ >action xt -j CONNMARK action mirred egress redirect dev dummy1 > >tablename: mangle hook: NF_IP_PRE_ROUTING > target: CONNMARK and 0x0 index 0 >RTNETLINK answers: Invalid argument >We have an error talking to the kernel > >> Pablo, Hasan Chowdhury tells me this broke after iptable 1.4.10 >> Hasan also sent me a small patch to fake "xt" instead of "ipt" - but i think >> there's more than meets the eye here; some interface we are using to talk to >> xtables on user space seems to have changed. > >What was the last combination that worked? For added fun, it works even less in iproute2-3.7.0. commit e4fc4ada3317ea94452576add25981279d705b14 Author: Mike Frysinger <vapier@gentoo.org> Date: Thu Nov 8 11:41:17 2012 -0500 allow pkg-config to be customized Rather than hard coding `pkg-config`, use ${PKG_CONFIG} so people can override it to their specific version (like when cross-compiling). This is the same way the upstream pkg-config code works. Signed-off-by: Mike Frysinger <vapier@gentoo.org> broke it by causing tc/m_xt.so to no longer link against libxtables.so, leading to: # tc [above parameters] tc: symbol lookup error: /usr/lib64/tc//m_xt.so: undefined symbol: xtables_init_all (Makefiles being simpler than $other_buildsys? A distant reality!) ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] build: unbreak linkage of m_xt.so 2012-12-16 0:26 ` Jan Engelhardt @ 2012-12-16 0:32 ` Jan Engelhardt 2012-12-16 10:30 ` Jamal Hadi Salim ` (2 more replies) 0 siblings, 3 replies; 64+ messages in thread From: Jan Engelhardt @ 2012-12-16 0:32 UTC (permalink / raw) To: stephen.hemminger Cc: vapier, netdev, jhs, urykhy, shemonc, pablo, netfilter-devel Commit v3.7.0~10 caused the variable new PKG_CONFIG variable never to be present at the time of calling make, leading to tc/m_xt.so not linked with -lxtables (result from pkg-config xtables --libs), that in turn leading to tc: symbol lookup error: /usr/lib64/tc//m_xt.so: undefined symbol: xtables_init_all Fixing that. Signed-off-by: Jan Engelhardt <jengelh@inai.de> --- configure | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/configure b/configure index 9912114..573ee55 100755 --- a/configure +++ b/configure @@ -4,7 +4,6 @@ INCLUDE=${1:-"$PWD/include"} : ${PKG_CONFIG:=pkg-config} : ${CC=gcc} -echo "PKG_CONFIG:=${PKG_CONFIG}" >>Config # Make a temp directory in build tree. TMPDIR=$(mktemp -d config.XXXXXX) @@ -224,6 +223,7 @@ rm -f $TMPDIR/ipsettest.c $TMPDIR/ipsettest } echo "# Generated config based on" $INCLUDE >Config +echo "PKG_CONFIG:=${PKG_CONFIG}" >>Config echo "TC schedulers" -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [PATCH] build: unbreak linkage of m_xt.so 2012-12-16 0:32 ` [PATCH] build: unbreak linkage of m_xt.so Jan Engelhardt @ 2012-12-16 10:30 ` Jamal Hadi Salim 2012-12-16 17:03 ` Jamal Hadi Salim 2012-12-16 22:02 ` Mike Frysinger 2012-12-18 17:21 ` Stephen Hemminger 2 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 10:30 UTC (permalink / raw) To: Jan Engelhardt Cc: stephen.hemminger, vapier, netdev, urykhy, shemonc, pablo, netfilter-devel On 12-12-15 07:32 PM, Jan Engelhardt wrote: > Commit v3.7.0~10 caused the variable new PKG_CONFIG variable never > to be present at the time of calling make, leading to tc/m_xt.so > not linked with -lxtables (result from pkg-config xtables --libs), > that in turn leading to > > tc: symbol lookup error: /usr/lib64/tc//m_xt.so: undefined symbol: > xtables_init_all > Yep - run into this problem, scratching my head thinking something wrong with my environment with latest iproute2 git tree. I hacked mine to just always include xtables in LDLIBS. > Fixing that. > > Signed-off-by: Jan Engelhardt <jengelh@inai.de> I can confirm it builds fine for me now if i take out the hack I had and use this patch. Acked-by: Jamal Hadi Salim <hadi@mojatatu.com> Stephen, I think this patch would be equivalent of "stable fix". cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] build: unbreak linkage of m_xt.so 2012-12-16 10:30 ` Jamal Hadi Salim @ 2012-12-16 17:03 ` Jamal Hadi Salim 2012-12-16 17:43 ` Jan Engelhardt 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 17:03 UTC (permalink / raw) To: Jan Engelhardt Cc: stephen.hemminger, vapier, netdev, urykhy, shemonc, pablo, netfilter-devel On 12-12-16 05:30 AM, Jamal Hadi Salim wrote: > I can confirm it builds fine for me now if i take out the hack I had and > use this patch. Sorry, I take what i said back and went back to explicitly adding -l xtables. The problem is still the intepretation of tc/Makefile. Here's the compile output. ---- gcc -Wall -Wstrict-prototypes -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -DCONFIG_GACT -DCONFIG_GACT_PROB -DIPT_LIB_DIR=\"/lib/xtables\" -DYY_NO_INPUT -Wl,-export-dynamic -shared -fpic -o m_xt.so m_xt.c $(pkg-config xtables --cflags --libs) ---- Note the missing expansion. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] build: unbreak linkage of m_xt.so 2012-12-16 17:03 ` Jamal Hadi Salim @ 2012-12-16 17:43 ` Jan Engelhardt 2012-12-16 18:05 ` Jamal Hadi Salim 0 siblings, 1 reply; 64+ messages in thread From: Jan Engelhardt @ 2012-12-16 17:43 UTC (permalink / raw) To: Jamal Hadi Salim Cc: stephen.hemminger, vapier, netdev, urykhy, shemonc, pablo, netfilter-devel On Sunday 2012-12-16 18:03, Jamal Hadi Salim wrote: > On 12-12-16 05:30 AM, Jamal Hadi Salim wrote: > >> I can confirm it builds fine for me now if i take out the hack I had and >> use this patch. > > > Sorry, I take what i said back and went back to explicitly adding -l xtables. > The problem is still the intepretation of tc/Makefile. Here's the compile > output. > ---- > gcc -Wall -Wstrict-prototypes -O2 -I../include -DRESOLVE_HOSTNAMES > -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -DCONFIG_GACT > -DCONFIG_GACT_PROB -DIPT_LIB_DIR=\"/lib/xtables\" -DYY_NO_INPUT > -Wl,-export-dynamic -shared -fpic -o m_xt.so m_xt.c $(pkg-config xtables > --cflags --libs) > ---- I saw the same during make, _but_, on running `ldd tc/m_xt.so`, I got a libxtables.so entry, so I thought I was fine. "$() "is something for the shell to expand, not make. See this testcase. $ make echo $(pkg-config xtables --cflags --libs) -I/usr/include/iptables-1.4.16.3 -lxtables $ cat Makefile a: echo $$(pkg-config xtables --cflags --libs) ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] build: unbreak linkage of m_xt.so 2012-12-16 17:43 ` Jan Engelhardt @ 2012-12-16 18:05 ` Jamal Hadi Salim 0 siblings, 0 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 18:05 UTC (permalink / raw) To: Jan Engelhardt Cc: stephen.hemminger, vapier, netdev, urykhy, shemonc, pablo, netfilter-devel On 12-12-16 12:43 PM, Jan Engelhardt wrote: > On Sunday 2012-12-16 18:03, Jamal Hadi Salim wrote: > > > I saw the same during make, _but_, on running `ldd tc/m_xt.so`, I got a > libxtables.so entry, so I thought I was fine. > Sorry, you are right. Without your patch that doesnt happen. I had removed the global dlopen while debugging, so it was using the wrong m_xt.so So my Ack is back on;-> cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] build: unbreak linkage of m_xt.so 2012-12-16 0:32 ` [PATCH] build: unbreak linkage of m_xt.so Jan Engelhardt 2012-12-16 10:30 ` Jamal Hadi Salim @ 2012-12-16 22:02 ` Mike Frysinger 2012-12-18 17:21 ` Stephen Hemminger 2 siblings, 0 replies; 64+ messages in thread From: Mike Frysinger @ 2012-12-16 22:02 UTC (permalink / raw) To: Jan Engelhardt Cc: stephen.hemminger, netdev, jhs, urykhy, shemonc, pablo, netfilter-devel [-- Attachment #1: Type: Text/Plain, Size: 792 bytes --] On Saturday 15 December 2012 19:32:48 Jan Engelhardt wrote: > --- a/configure > +++ b/configure > @@ -4,7 +4,6 @@ > INCLUDE=${1:-"$PWD/include"} > > : ${PKG_CONFIG:=pkg-config} > : ${CC=gcc} > > -echo "PKG_CONFIG:=${PKG_CONFIG}" >>Config > > # Make a temp directory in build tree. > TMPDIR=$(mktemp -d config.XXXXXX) > @@ -224,6 +223,7 @@ rm -f $TMPDIR/ipsettest.c $TMPDIR/ipsettest > } > > echo "# Generated config based on" $INCLUDE >Config > +echo "PKG_CONFIG:=${PKG_CONFIG}" >>Config > > echo "TC schedulers" the use of un-indented shell functions makes the code read in a way it doesn't actually execute. i'd suggest moving this logic into a function to match existing style rather than simply moving the Config write. i'll post a patch. -mike [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] build: unbreak linkage of m_xt.so 2012-12-16 0:32 ` [PATCH] build: unbreak linkage of m_xt.so Jan Engelhardt 2012-12-16 10:30 ` Jamal Hadi Salim 2012-12-16 22:02 ` Mike Frysinger @ 2012-12-18 17:21 ` Stephen Hemminger 2012-12-18 18:47 ` Mike Frysinger 2 siblings, 1 reply; 64+ messages in thread From: Stephen Hemminger @ 2012-12-18 17:21 UTC (permalink / raw) To: Jan Engelhardt Cc: stephen.hemminger, vapier, netdev, jhs, urykhy, shemonc, pablo, netfilter-devel On Sun, 16 Dec 2012 01:32:48 +0100 Jan Engelhardt <jengelh@inai.de> wrote: > Commit v3.7.0~10 caused the variable new PKG_CONFIG variable never > to be present at the time of calling make, leading to tc/m_xt.so > not linked with -lxtables (result from pkg-config xtables --libs), > that in turn leading to > > tc: symbol lookup error: /usr/lib64/tc//m_xt.so: undefined symbol: > xtables_init_all > > Fixing that. > > Signed-off-by: Jan Engelhardt <jengelh@inai.de> > --- > configure | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/configure b/configure > index 9912114..573ee55 100755 > --- a/configure > +++ b/configure > @@ -4,7 +4,6 @@ > INCLUDE=${1:-"$PWD/include"} > : ${PKG_CONFIG:=pkg-config} > : ${CC=gcc} > -echo "PKG_CONFIG:=${PKG_CONFIG}" >>Config > > # Make a temp directory in build tree. > TMPDIR=$(mktemp -d config.XXXXXX) > @@ -224,6 +223,7 @@ rm -f $TMPDIR/ipsettest.c $TMPDIR/ipsettest > } > > echo "# Generated config based on" $INCLUDE >Config > +echo "PKG_CONFIG:=${PKG_CONFIG}" >>Config > > echo "TC schedulers" > Ok, manually did the diff (conflicted with other previous changes). ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] build: unbreak linkage of m_xt.so 2012-12-18 17:21 ` Stephen Hemminger @ 2012-12-18 18:47 ` Mike Frysinger 2012-12-20 0:03 ` Stephen Hemminger 0 siblings, 1 reply; 64+ messages in thread From: Mike Frysinger @ 2012-12-18 18:47 UTC (permalink / raw) To: Stephen Hemminger Cc: Jan Engelhardt, stephen.hemminger, netdev, jhs, urykhy, shemonc, pablo, netfilter-devel [-- Attachment #1: Type: Text/Plain, Size: 1346 bytes --] On Tuesday 18 December 2012 12:21:30 Stephen Hemminger wrote: > On Sun, 16 Dec 2012 01:32:48 +0100 Jan Engelhardt wrote: > > Commit v3.7.0~10 caused the variable new PKG_CONFIG variable never > > to be present at the time of calling make, leading to tc/m_xt.so > > not linked with -lxtables (result from pkg-config xtables --libs), > > that in turn leading to > > > > tc: symbol lookup error: /usr/lib64/tc//m_xt.so: undefined symbol: > > xtables_init_all > > > > Fixing that. > > > > --- a/configure > > +++ b/configure > > @@ -4,7 +4,6 @@ > > INCLUDE=${1:-"$PWD/include"} > > : ${PKG_CONFIG:=pkg-config} > > : ${CC=gcc} > > -echo "PKG_CONFIG:=${PKG_CONFIG}" >>Config > > > > # Make a temp directory in build tree. > > TMPDIR=$(mktemp -d config.XXXXXX) > > @@ -224,6 +223,7 @@ rm -f $TMPDIR/ipsettest.c $TMPDIR/ipsettest > > } > > > > echo "# Generated config based on" $INCLUDE >Config > > +echo "PKG_CONFIG:=${PKG_CONFIG}" >>Config > > echo "TC schedulers" > > Ok, manually did the diff (conflicted with other previous changes). this patch is no longer necessary one you merged my: configure: move toolchain init to a function it's actually undesirable to apply this after that since it makes the configure script less clear again ... sorry if my commit message wasn't obvious. -mike [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] build: unbreak linkage of m_xt.so 2012-12-18 18:47 ` Mike Frysinger @ 2012-12-20 0:03 ` Stephen Hemminger 0 siblings, 0 replies; 64+ messages in thread From: Stephen Hemminger @ 2012-12-20 0:03 UTC (permalink / raw) To: Mike Frysinger Cc: Jan Engelhardt, stephen.hemminger, netdev, jhs, urykhy, shemonc, pablo, netfilter-devel [-- Attachment #1: Type: text/plain, Size: 393 bytes --] On Tue, 18 Dec 2012 13:47:58 -0500 Mike Frysinger <vapier@gentoo.org> wrote: > this patch is no longer necessary one you merged my: > configure: move toolchain init to a function > > it's actually undesirable to apply this after that since it makes the configure > script less clear again ... > > sorry if my commit message wasn't obvious. > -mike ok, went back to old way. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-15 23:06 ` Jan Engelhardt 2012-12-16 0:26 ` Jan Engelhardt @ 2012-12-16 10:22 ` Jamal Hadi Salim [not found] ` <CAASe=fQT2pVOK0uctdaKL+aOrF8nYeTMfoF15kmd-rC02+7Vnw@mail.gmail.com> 2 siblings, 0 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 10:22 UTC (permalink / raw) To: Jan Engelhardt Cc: Yury Stankevich, shemonc, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-15 06:06 PM, Jan Engelhardt wrote: > If I try that command (substituting ipt->xt and eth0->dummy0, > ifb0->dummy1), all I get is the dreaded "Invalid argument". > So the kernel rejected the command, which could indicate that > userspace construction might have been ok. > > # tc filter add dev dummy0 parent ffff: protocol ip u32 match u32 0 0 \ > action xt -j CONNMARK action mirred egress redirect dev dummy1 > > tablename: mangle hook: NF_IP_PRE_ROUTING > target: CONNMARK and 0x0 index 0 > RTNETLINK answers: Invalid argument > We have an error talking to the kernel > No problem sending it to the kernel here on ubuntu 12.04. I also upgraded to current linus git tree, same result. The problem is the parameters are not accepted in user space as you can see for connmark and what gets sent (eg CONNMARK and 0x0) doesnt seem sensible. > What was the last combination that worked? First time this got reported to me (or i got CCed on the problem) - I am told it broke after iptables 1.4.11. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
[parent not found: <CAASe=fQT2pVOK0uctdaKL+aOrF8nYeTMfoF15kmd-rC02+7Vnw@mail.gmail.com>]
* Re: tc ipt action [not found] ` <CAASe=fQT2pVOK0uctdaKL+aOrF8nYeTMfoF15kmd-rC02+7Vnw@mail.gmail.com> @ 2012-12-16 16:48 ` Jamal Hadi Salim 2012-12-16 18:59 ` Jamal Hadi Salim 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 16:48 UTC (permalink / raw) To: Hasan Chowdhury Cc: Jan Engelhardt, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-16 10:17 AM, Hasan Chowdhury wrote: > > > [Hasan**] I thought "xt" is a supported action kind for > iproute2-3.6.0. Besides with a default compilation it compiled m_xt.c > (not m_ipt.c ) linked with shared object m_xt.so > It is - but my hope is not to change the interface to existing scripts. One approach is rename "xt" to "ipt" and make the old vs new ways mutually exclusive based on Config options. But that will add more to baggage of all sorts of workarounds for 13 versions of iptables changing interfaces. My goal was to have a staged way to kill the old way but maintain the same command interface. I was hoping not to change the kernel but based on your patch, that may be the best place to place warnings about deprecating APIs (so maybe i will add support for "xt" and warn about "ipt"). Would you be able to test that kernel patch? > the workaround exits in the patch for file m_action (see the changes > there ) as netlink reply from kernel for this tc u32 action xt command > comes as .kind = "ipt" instead of xt (assumed act_ipt.c in kernle > is not aware of new xt extensions .) The most important part of your patch that i missed is you took care of some of the new API changes Pablo mentioned. I am suspicious of one of them: why call xtables_options_xfrm(). Pablo/Jan, could you please look at Hasan's patch in m_xt.c? Also, your patch doesnt compile for me. Can you please provide a version against the latest iproute2 git tree? cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 16:48 ` Jamal Hadi Salim @ 2012-12-16 18:59 ` Jamal Hadi Salim 2012-12-16 19:13 ` Jan Engelhardt 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 18:59 UTC (permalink / raw) To: Hasan Chowdhury Cc: Jan Engelhardt, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel Hasan, I can confirm that xtables_options_xfrm() works. Just suspicious of that call, "xfrm" seems too specific to xfrm subsystem but generic enough. Dont bother resending the patch, it works right now, I am just waiting for confirmation from Pablo/Jan and i will proceed from there. I am also still struggling with whether to add your intermediate solution to rename the xt->ipt. I think i will go ahead and add a kernel change. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 18:59 ` Jamal Hadi Salim @ 2012-12-16 19:13 ` Jan Engelhardt 2012-12-16 20:36 ` Jamal Hadi Salim 2012-12-16 20:41 ` [PATCH] iproute2: act_ipt fix xtables breakage Jamal Hadi Salim 0 siblings, 2 replies; 64+ messages in thread From: Jan Engelhardt @ 2012-12-16 19:13 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Hasan Chowdhury, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel On Sunday 2012-12-16 19:59, Jamal Hadi Salim wrote: > Hasan, > > I can confirm that xtables_options_xfrm() works. > Just suspicious of that call, "xfrm" seems too specific to xfrm > subsystem but generic enough. Heh, nah. "xfrm" is one of these pictogram-based abbreviations like "xing" (the thing they paint on roads/signs), apparently standing for "trans"form and "cross"ing, respectively, though 'x' is ambiguous and open to a lot of other interpretations. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 19:13 ` Jan Engelhardt @ 2012-12-16 20:36 ` Jamal Hadi Salim 2012-12-16 20:41 ` [PATCH] iproute2: act_ipt fix xtables breakage Jamal Hadi Salim 1 sibling, 0 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 20:36 UTC (permalink / raw) To: Jan Engelhardt Cc: Hasan Chowdhury, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-16 02:13 PM, Jan Engelhardt wrote: > > "xfrm" is one of these pictogram-based abbreviations like "xing" (the > thing they paint on roads/signs), apparently standing for "trans"form > and "cross"ing, respectively, though 'x' is ambiguous and open to a lot > of other interpretations. Ok, In that case i'll push half of Hasan's patch. I have a kernel change that works with it. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] iproute2: act_ipt fix xtables breakage 2012-12-16 19:13 ` Jan Engelhardt 2012-12-16 20:36 ` Jamal Hadi Salim @ 2012-12-16 20:41 ` Jamal Hadi Salim 2012-12-17 12:30 ` RFC [PATCH] iproute2: temporary solution to fix xt breakage Jamal Hadi Salim 1 sibling, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 20:41 UTC (permalink / raw) To: Stephen Hemminger Cc: Hasan Chowdhury, Jan Engelhardt, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel [-- Attachment #1: Type: text/plain, Size: 307 bytes --] Attached. I am going to send a kernel patch as well. There is an "intermediate solution" from Hasan which doesnt require the kernel change. It changes the kernel endpoint to "ipt". I am conflicted because it is a quick hack while otoh forcing people to upgrade kernel is a usability issue. cheers, jamal [-- Attachment #2: patch-xt --] [-- Type: text/plain, Size: 4163 bytes --] commit ff707eaa1fd51435958ae2afcd09d4b3600160b4 Author: Hasan Chowdhury <shemonc@gmail.com> Date: Sun Dec 16 14:09:38 2012 -0500 Fixes breakage with xtables API starting with version 1.4.10 Signed-off-by: Hasan Chowdhury <shemonc@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> diff --git a/tc/m_xt.c b/tc/m_xt.c index bcc4d75..e1c3f38 100644 --- a/tc/m_xt.c +++ b/tc/m_xt.c @@ -118,6 +118,7 @@ static int parse_ipt(struct action_util *a,int *argc_p, struct xtables_target *m = NULL; struct ipt_entry fw; struct rtattr *tail; + int c; int rargc = *argc_p; char **argv = *argv_p; @@ -126,6 +127,7 @@ static int parse_ipt(struct action_util *a,int *argc_p, int size = 0; int iok = 0, ok = 0; __u32 hook = 0, index = 0; + struct option *opts = NULL; xtables_init_all(&tcipt_globals, NFPROTO_IPV4); set_lib_dir(); @@ -158,14 +160,22 @@ static int parse_ipt(struct action_util *a,int *argc_p, printf(" %s error \n", m->name); return -1; } - tcipt_globals.opts = - xtables_merge_options( #if (XTABLES_VERSION_CODE >= 6) - tcipt_globals.orig_opts, + opts = xtables_options_xfrm(tcipt_globals.orig_opts, + tcipt_globals.opts, + m->x6_options, + &m->option_offset); +#else + opts = xtables_merge_options(tcipt_globals.orig_opts, + tcipt_globals.opts, + m->extra_opts, + &m->option_offset); #endif - tcipt_globals.opts, - m->extra_opts, - &m->option_offset); + if (opts == NULL) { + fprintf(stderr, " failed to find aditional options for target %s\n\n", optarg); + return -1; + } else + tcipt_globals.opts = opts; } else { fprintf(stderr," failed to find target %s\n\n", optarg); return -1; @@ -175,17 +185,21 @@ static int parse_ipt(struct action_util *a,int *argc_p, default: memset(&fw, 0, sizeof (fw)); - if (m) { - m->parse(c - m->option_offset, argv, 0, - &m->tflags, NULL, &m->t); +#if (XTABLES_VERSION_CODE >= 6) + if (m != NULL && m->x6_parse != NULL ) { + xtables_option_tpcall(c, argv, 0 , m, NULL); +#else + if (m != NULL && m->parse != NULL ) { + m->parse(c - m->option_offset, argv, 0, &m->tflags, + NULL, &m->t); +#endif } else { - fprintf(stderr," failed to find target %s\n\n", optarg); + fprintf(stderr,"failed to find target %s\n\n", optarg); return -1; } ok++; break; - } } @@ -208,8 +222,13 @@ static int parse_ipt(struct action_util *a,int *argc_p, } /* check that we passed the correct parameters to the target */ +#if (XTABLES_VERSION_CODE >= 6) + if (m) + xtables_option_tfcall(m); +#else if (m && m->final_check) m->final_check(m->tflags); +#endif { struct tcmsg *t = NLMSG_DATA(n); @@ -271,6 +290,7 @@ print_ipt(struct action_util *au,FILE * f, struct rtattr *arg) { struct rtattr *tb[TCA_IPT_MAX + 1]; struct xt_entry_target *t = NULL; + struct option *opts = NULL; if (arg == NULL) return -1; @@ -309,14 +329,22 @@ print_ipt(struct action_util *au,FILE * f, struct rtattr *arg) return -1; } - tcipt_globals.opts = - xtables_merge_options( #if (XTABLES_VERSION_CODE >= 6) - tcipt_globals.orig_opts, + opts = xtables_options_xfrm(tcipt_globals.orig_opts, + tcipt_globals.opts, + m->x6_options, + &m->option_offset); +#else + opts = xtables_merge_options(tcipt_globals.orig_opts, + tcipt_globals.opts, + m->extra_opts, + &m->option_offset); #endif - tcipt_globals.opts, - m->extra_opts, - &m->option_offset); + if (opts == NULL) { + fprintf(stderr, " failed to find aditional options for target %s\n\n", optarg); + return -1; + } else + tcipt_globals.opts = opts; } else { fprintf(stderr, " failed to find target %s\n\n", t->u.user.name); @@ -355,4 +383,3 @@ struct action_util xt_action_util = { .parse_aopt = parse_ipt, .print_aopt = print_ipt, }; - ^ permalink raw reply related [flat|nested] 64+ messages in thread
* RFC [PATCH] iproute2: temporary solution to fix xt breakage 2012-12-16 20:41 ` [PATCH] iproute2: act_ipt fix xtables breakage Jamal Hadi Salim @ 2012-12-17 12:30 ` Jamal Hadi Salim 2012-12-17 16:12 ` Stephen Hemminger [not found] ` <CAASe=fRuJdtisEvp7uo=PHwN3nKHqsYDW4Om1gk2MK-vyNvBrA@mail.gmail.com> 0 siblings, 2 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-17 12:30 UTC (permalink / raw) To: Stephen Hemminger Cc: Hasan Chowdhury, Jan Engelhardt, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel [-- Attachment #1: Type: text/plain, Size: 704 bytes --] On 12-12-16 03:41 PM, Jamal Hadi Salim wrote: > > There is an "intermediate solution" from Hasan which doesnt require > the kernel change. It changes the kernel endpoint to "ipt". I am > conflicted because it is a quick hack while otoh forcing people to > upgrade kernel is a usability issue. > Attached. Author is Hasan - I didnt sign it because i am looking for feedback and i find it distasteful but it solves the problem. This is needed until we have a proper fix in the kernel propagated. Once that kernel change is ubiquitous this change is noise and a maintanance pain. I am making it hard to even turn it on (i.e someone knowledgeable will have to compile with CONFIG_XT_HACK) cheers, jamal [-- Attachment #2: p1 --] [-- Type: text/plain, Size: 1092 bytes --] diff --git a/tc/m_action.c b/tc/m_action.c index 1fe2431..fa9a7c8 100644 --- a/tc/m_action.c +++ b/tc/m_action.c @@ -209,10 +209,17 @@ done0: tail = NLMSG_TAIL(n); addattr_l(n, MAX_MSG, ++prio, NULL, 0); + /*XXX: hack to work around old kernels, newer xtables */ +#ifdef CONFIG_XT_HACK + if (strncmp(k,"xt",2)==0) + addattr_l(n, MAX_MSG, TCA_ACT_KIND, "ipt" , strlen("ipt") + 1); + else + addattr_l(n, MAX_MSG, TCA_ACT_KIND, k, strlen(k) + 1); +#else addattr_l(n, MAX_MSG, TCA_ACT_KIND, k, strlen(k) + 1); +#endif ret = a->parse_aopt(a,&argc, &argv, TCA_ACT_OPTIONS, n); - if (ret < 0) { fprintf(stderr,"bad action parsing\n"); goto bad_val; @@ -259,7 +266,15 @@ tc_print_one_action(FILE * f, struct rtattr *arg) } + /*XXX: hack to work around old kernels, newer xtables */ +#ifdef CONFIG_XT_HACK + if (strcmp(RTA_DATA(tb[TCA_ACT_KIND]), "ipt")==0) + a = get_action_kind("xt"); + else + a = get_action_kind(RTA_DATA(tb[TCA_ACT_KIND])); +#else a = get_action_kind(RTA_DATA(tb[TCA_ACT_KIND])); +#endif if (NULL == a) return err; ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: RFC [PATCH] iproute2: temporary solution to fix xt breakage 2012-12-17 12:30 ` RFC [PATCH] iproute2: temporary solution to fix xt breakage Jamal Hadi Salim @ 2012-12-17 16:12 ` Stephen Hemminger 2012-12-19 11:36 ` Jamal Hadi Salim [not found] ` <CAASe=fRuJdtisEvp7uo=PHwN3nKHqsYDW4Om1gk2MK-vyNvBrA@mail.gmail.com> 1 sibling, 1 reply; 64+ messages in thread From: Stephen Hemminger @ 2012-12-17 16:12 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Hasan Chowdhury, Jan Engelhardt, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel On Mon, 17 Dec 2012 07:30:41 -0500 Jamal Hadi Salim <jhs@mojatatu.com> wrote: > On 12-12-16 03:41 PM, Jamal Hadi Salim wrote: > > > > There is an "intermediate solution" from Hasan which doesnt require > > the kernel change. It changes the kernel endpoint to "ipt". I am > > conflicted because it is a quick hack while otoh forcing people to > > upgrade kernel is a usability issue. > > > > > Attached. Author is Hasan - I didnt sign it because i am looking for > feedback and i find it distasteful but it solves the problem. > This is needed until we have a proper fix in the kernel propagated. > Once that kernel change is ubiquitous this change is noise and a > maintanance pain. I am making it hard to even turn it on > (i.e someone knowledgeable will have to compile with CONFIG_XT_HACK) > > cheers, > jamal > > Maybe xtables should have stable API/ABI and use shim routines there? ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: RFC [PATCH] iproute2: temporary solution to fix xt breakage 2012-12-17 16:12 ` Stephen Hemminger @ 2012-12-19 11:36 ` Jamal Hadi Salim 0 siblings, 0 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-19 11:36 UTC (permalink / raw) To: Stephen Hemminger Cc: Hasan Chowdhury, Jan Engelhardt, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-17 11:12 AM, Stephen Hemminger wrote: > > Maybe xtables should have stable API/ABI and use shim routines there? Thats the general direction being taken now with this last changes... cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
[parent not found: <CAASe=fRuJdtisEvp7uo=PHwN3nKHqsYDW4Om1gk2MK-vyNvBrA@mail.gmail.com>]
* Re: RFC [PATCH] iproute2: temporary solution to fix xt breakage [not found] ` <CAASe=fRuJdtisEvp7uo=PHwN3nKHqsYDW4Om1gk2MK-vyNvBrA@mail.gmail.com> @ 2012-12-18 12:28 ` Jamal Hadi Salim [not found] ` <CAASe=fR6Hm2dxp=1wDchtrzqnaH6qacHpg2wrsqLfmGpPbQ9Fg@mail.gmail.com> 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-18 12:28 UTC (permalink / raw) To: Hasan Chowdhury Cc: Stephen Hemminger, Jan Engelhardt, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-17 11:10 AM, Hasan Chowdhury wrote: > Juts noticed , the attached patch does not have the modification for > m_xt.c , without it ,I guess this patch is not going to work. Thats in the first patch i sent out which I hope Stephen will apply right away. This one depends on that. So you must apply that patch, then this. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
[parent not found: <CAASe=fR6Hm2dxp=1wDchtrzqnaH6qacHpg2wrsqLfmGpPbQ9Fg@mail.gmail.com>]
* Re: RFC [PATCH] iproute2: temporary solution to fix xt breakage [not found] ` <CAASe=fR6Hm2dxp=1wDchtrzqnaH6qacHpg2wrsqLfmGpPbQ9Fg@mail.gmail.com> @ 2012-12-19 11:44 ` Jamal Hadi Salim 2012-12-19 11:56 ` [PATCH] pkt_sched: act_xt support new Xtables interface Jamal Hadi Salim 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-19 11:44 UTC (permalink / raw) To: Hasan Chowdhury Cc: Stephen Hemminger, Jan Engelhardt, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-18 09:45 AM, Hasan Chowdhury wrote: > Hi Jamal, > > Thanks for all the help and the information. I will keep tune myself so > when the proper path from kernel side will show up I will integrate it > into my system to test it. > Yikes. I guess i never posted that? Will do it shortly. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-19 11:44 ` Jamal Hadi Salim @ 2012-12-19 11:56 ` Jamal Hadi Salim 2012-12-19 15:52 ` Jan Engelhardt ` (2 more replies) 0 siblings, 3 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-19 11:56 UTC (permalink / raw) To: Hasan Chowdhury Cc: Stephen Hemminger, Jan Engelhardt, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel [-- Attachment #1: Type: text/plain, Size: 485 bytes --] To be applied pending more testing. Attached. Sorry, I thought I had sent this out over the weekend. I have done basic testing with a single mark and sending pings to update stats which can then displayed for the mark. Hasan/Yury, if you test this please use the latest iproute2 with only the first patch I posted (originally from Hasan). Hasan please use that patch not your version - if theres anything wrong we can find out sooner before the patch becomes final. cheers, jamal [-- Attachment #2: xt-p1 --] [-- Type: text/plain, Size: 10173 bytes --] commit 82330cc874429c63bd0e476e413a79ebab3da350 Author: Jamal Hadi Salim <hadi@mojatatu.com> Date: Wed Dec 19 06:23:28 2012 -0500 Fix iptables/xtables ABI changes. We will eventually replace act_ipt with act_xt since only very few targets still support the old xtables interface Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> diff --git a/net/sched/Kconfig b/net/sched/Kconfig index 235e01a..1693973 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -578,12 +578,25 @@ config NET_ACT_MIRRED config NET_ACT_IPT tristate "IPtables targets" depends on NET_CLS_ACT && NETFILTER && IP_NF_IPTABLES + select NET_ACT_XT ---help--- Say Y here to be able to invoke iptables targets after successful - classification. + classification. Better yet choose NET_ACT_XT since this version + will eventually be obsoleted. To compile this code as a module, choose M here: the module will be called act_ipt. +config NET_ACT_XT + tristate "New IPtables targets" + depends on NET_CLS_ACT && NETFILTER && IP_NF_IPTABLES + ---help--- + Say Y here to be able to invoke iptables targets after successful + classification using the new xtables mechanism. This mechanism + will eventually replace NET_ACT_IPT + + To compile this code as a module, choose M here: the + module will be called act_xt. + config NET_ACT_NAT tristate "Stateless NAT" diff --git a/net/sched/Makefile b/net/sched/Makefile index 978cbf0..10a1136 100644 --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -11,6 +11,7 @@ obj-$(CONFIG_NET_ACT_POLICE) += act_police.o obj-$(CONFIG_NET_ACT_GACT) += act_gact.o obj-$(CONFIG_NET_ACT_MIRRED) += act_mirred.o obj-$(CONFIG_NET_ACT_IPT) += act_ipt.o +obj-$(CONFIG_NET_ACT_XT) += act_xt.o obj-$(CONFIG_NET_ACT_NAT) += act_nat.o obj-$(CONFIG_NET_ACT_PEDIT) += act_pedit.o obj-$(CONFIG_NET_ACT_SIMP) += act_simple.o diff --git a/net/sched/act_xt.c b/net/sched/act_xt.c new file mode 100644 index 0000000..589cfe6 --- /dev/null +++ b/net/sched/act_xt.c @@ -0,0 +1,324 @@ +/* + * net/sched/act_xt.c iptables target interface + * + *TODO: Add other tables. For now we only support the ipv4 table targets + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + * Copyright: Jamal Hadi Salim (2002-12) + */ + +#include <linux/types.h> +#include <linux/kernel.h> +#include <linux/string.h> +#include <linux/errno.h> +#include <linux/skbuff.h> +#include <linux/rtnetlink.h> +#include <linux/module.h> +#include <linux/init.h> +#include <linux/slab.h> +#include <net/netlink.h> +#include <net/pkt_sched.h> +#include <linux/tc_act/tc_ipt.h> +#include <net/tc_act/tc_ipt.h> + +#include <linux/netfilter_ipv4/ip_tables.h> + +#define IPT_TAB_MASK 15 +static struct tcf_common *tcf_ipt_ht[IPT_TAB_MASK + 1]; +static u32 ipt_idx_gen; +static DEFINE_RWLOCK(ipt_lock); + +static struct tcf_hashinfo ipt_hash_info = { + .htab = tcf_ipt_ht, + .hmask = IPT_TAB_MASK, + .lock = &ipt_lock, +}; + +static int ipt_init_target(struct xt_entry_target *t, char *table, + unsigned int hook) +{ + struct xt_tgchk_param par; + struct xt_target *target; + int ret = 0; + + target = xt_request_find_target(AF_INET, t->u.user.name, + t->u.user.revision); + if (IS_ERR(target)) + return PTR_ERR(target); + + t->u.kernel.target = target; + par.table = table; + par.entryinfo = NULL; + par.target = target; + par.targinfo = t->data; + par.hook_mask = hook; + par.family = NFPROTO_IPV4; + + ret = xt_check_target(&par, t->u.target_size - sizeof(*t), 0, false); + if (ret < 0) { + module_put(t->u.kernel.target->me); + return ret; + } + return 0; +} + +static void ipt_destroy_target(struct xt_entry_target *t) +{ + struct xt_tgdtor_param par = { + .target = t->u.kernel.target, + .targinfo = t->data, + }; + if (par.target->destroy != NULL) + par.target->destroy(&par); + module_put(par.target->me); +} + +static int tcf_ipt_release(struct tcf_ipt *ipt, int bind) +{ + int ret = 0; + if (ipt) { + if (bind) + ipt->tcf_bindcnt--; + ipt->tcf_refcnt--; + if (ipt->tcf_bindcnt <= 0 && ipt->tcf_refcnt <= 0) { + ipt_destroy_target(ipt->tcfi_t); + kfree(ipt->tcfi_tname); + kfree(ipt->tcfi_t); + tcf_hash_destroy(&ipt->common, &ipt_hash_info); + ret = ACT_P_DELETED; + } + } + return ret; +} + +static const struct nla_policy ipt_policy[TCA_IPT_MAX + 1] = { + [TCA_IPT_TABLE] = {.type = NLA_STRING,.len = IFNAMSIZ}, + [TCA_IPT_HOOK] = {.type = NLA_U32}, + [TCA_IPT_INDEX] = {.type = NLA_U32}, + [TCA_IPT_TARG] = {.len = sizeof(struct xt_entry_target)}, +}; + +static int tcf_ipt_init(struct nlattr *nla, struct nlattr *est, + struct tc_action *a, int ovr, int bind) +{ + struct nlattr *tb[TCA_IPT_MAX + 1]; + struct tcf_ipt *ipt; + struct tcf_common *pc; + struct xt_entry_target *td, *t; + char *tname; + int ret = 0, err; + u32 hook = 0; + u32 index = 0; + + if (nla == NULL) + return -EINVAL; + + err = nla_parse_nested(tb, TCA_IPT_MAX, nla, ipt_policy); + if (err < 0) + return err; + + if (tb[TCA_IPT_HOOK] == NULL) + return -EINVAL; + if (tb[TCA_IPT_TARG] == NULL) + return -EINVAL; + + td = (struct xt_entry_target *)nla_data(tb[TCA_IPT_TARG]); + if (nla_len(tb[TCA_IPT_TARG]) < td->u.target_size) + return -EINVAL; + + if (tb[TCA_IPT_INDEX] != NULL) + index = nla_get_u32(tb[TCA_IPT_INDEX]); + + pc = tcf_hash_check(index, a, bind, &ipt_hash_info); + if (!pc) { + pc = tcf_hash_create(index, est, a, sizeof(*ipt), bind, + &ipt_idx_gen, &ipt_hash_info); + if (IS_ERR(pc)) + return PTR_ERR(pc); + ret = ACT_P_CREATED; + } else { + if (!ovr) { + tcf_ipt_release(to_ipt(pc), bind); + return -EEXIST; + } + } + ipt = to_ipt(pc); + + hook = nla_get_u32(tb[TCA_IPT_HOOK]); + + err = -ENOMEM; + tname = kmalloc(IFNAMSIZ, GFP_KERNEL); + if (unlikely(!tname)) + goto err1; + if (tb[TCA_IPT_TABLE] == NULL || + nla_strlcpy(tname, tb[TCA_IPT_TABLE], IFNAMSIZ) >= IFNAMSIZ) + strcpy(tname, "mangle"); + + t = kmemdup(td, td->u.target_size, GFP_KERNEL); + if (unlikely(!t)) + goto err2; + + err = ipt_init_target(t, tname, hook); + if (err < 0) + goto err3; + + spin_lock_bh(&ipt->tcf_lock); + if (ret != ACT_P_CREATED) { + ipt_destroy_target(ipt->tcfi_t); + kfree(ipt->tcfi_tname); + kfree(ipt->tcfi_t); + } + ipt->tcfi_tname = tname; + ipt->tcfi_t = t; + ipt->tcfi_hook = hook; + spin_unlock_bh(&ipt->tcf_lock); + if (ret == ACT_P_CREATED) + tcf_hash_insert(pc, &ipt_hash_info); + return ret; + +err3: + kfree(t); +err2: + kfree(tname); +err1: + if (ret == ACT_P_CREATED) { + if (est) + gen_kill_estimator(&pc->tcfc_bstats, + &pc->tcfc_rate_est); + kfree_rcu(pc, tcfc_rcu); + } + return err; +} + +static int tcf_ipt_cleanup(struct tc_action *a, int bind) +{ + struct tcf_ipt *ipt = a->priv; + return tcf_ipt_release(ipt, bind); +} + +static int tcf_ipt(struct sk_buff *skb, const struct tc_action *a, + struct tcf_result *res) +{ + int ret = 0, result = 0; + struct tcf_ipt *ipt = a->priv; + struct xt_action_param par; + + if (skb_cloned(skb)) { + if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) + return TC_ACT_UNSPEC; + } + + spin_lock(&ipt->tcf_lock); + + ipt->tcf_tm.lastuse = jiffies; + bstats_update(&ipt->tcf_bstats, skb); + + /* yes, we have to worry about both in and out dev + * worry later - danger - this API seems to have changed + * from earlier kernels + */ + par.in = skb->dev; + par.out = NULL; + par.hooknum = ipt->tcfi_hook; + par.target = ipt->tcfi_t->u.kernel.target; + par.targinfo = ipt->tcfi_t->data; + ret = par.target->target(skb, &par); + + switch (ret) { + case NF_ACCEPT: + result = TC_ACT_OK; + break; + case NF_DROP: + result = TC_ACT_SHOT; + ipt->tcf_qstats.drops++; + break; + case XT_CONTINUE: + result = TC_ACT_PIPE; + break; + default: + net_notice_ratelimited + ("tc filter: Bogus netfilter code %d assume ACCEPT\n", ret); + result = TC_POLICE_OK; + break; + } + spin_unlock(&ipt->tcf_lock); + return result; + +} + +static int tcf_ipt_dump(struct sk_buff *skb, struct tc_action *a, int bind, + int ref) +{ + unsigned char *b = skb_tail_pointer(skb); + struct tcf_ipt *ipt = a->priv; + struct xt_entry_target *t; + struct tcf_t tm; + struct tc_cnt c; + + /* for simple targets kernel size == user size + * user name = target name + * for foolproof you need to not assume this + */ + + t = kmemdup(ipt->tcfi_t, ipt->tcfi_t->u.user.target_size, GFP_ATOMIC); + if (unlikely(!t)) + goto nla_put_failure; + + c.bindcnt = ipt->tcf_bindcnt - bind; + c.refcnt = ipt->tcf_refcnt - ref; + strcpy(t->u.user.name, ipt->tcfi_t->u.kernel.target->name); + + if (nla_put(skb, TCA_IPT_TARG, ipt->tcfi_t->u.user.target_size, t) || + nla_put_u32(skb, TCA_IPT_INDEX, ipt->tcf_index) || + nla_put_u32(skb, TCA_IPT_HOOK, ipt->tcfi_hook) || + nla_put(skb, TCA_IPT_CNT, sizeof(struct tc_cnt), &c) || + nla_put_string(skb, TCA_IPT_TABLE, ipt->tcfi_tname)) + goto nla_put_failure; + tm.install = jiffies_to_clock_t(jiffies - ipt->tcf_tm.install); + tm.lastuse = jiffies_to_clock_t(jiffies - ipt->tcf_tm.lastuse); + tm.expires = jiffies_to_clock_t(ipt->tcf_tm.expires); + if (nla_put(skb, TCA_IPT_TM, sizeof(tm), &tm)) + goto nla_put_failure; + kfree(t); + return skb->len; + +nla_put_failure: + nlmsg_trim(skb, b); + kfree(t); + return -1; +} + +static struct tc_action_ops act_ipt_ops = { + .kind = "xt", + .hinfo = &ipt_hash_info, + .type = TCA_ACT_IPT, + .capab = TCA_CAP_NONE, + .owner = THIS_MODULE, + .act = tcf_ipt, + .dump = tcf_ipt_dump, + .cleanup = tcf_ipt_cleanup, + .lookup = tcf_hash_search, + .init = tcf_ipt_init, + .walk = tcf_generic_walker +}; + +MODULE_AUTHOR("Jamal Hadi Salim(2002-12)"); +MODULE_DESCRIPTION("New Iptables target actions"); +MODULE_LICENSE("GPL"); + +static int __init ipt_init_module(void) +{ + return tcf_register_action(&act_ipt_ops); +} + +static void __exit ipt_cleanup_module(void) +{ + tcf_unregister_action(&act_ipt_ops); +} + +module_init(ipt_init_module); +module_exit(ipt_cleanup_module); ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-19 11:56 ` [PATCH] pkt_sched: act_xt support new Xtables interface Jamal Hadi Salim @ 2012-12-19 15:52 ` Jan Engelhardt 2012-12-19 23:05 ` Jamal Hadi Salim [not found] ` <CAASe=fQZGwjM_2PStRE0tje33Doi6TuwJJ3p7x-SRcwq3mQvRg@mail.gmail.com> 2012-12-20 8:54 ` Yury Stankevich 2 siblings, 1 reply; 64+ messages in thread From: Jan Engelhardt @ 2012-12-19 15:52 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Hasan Chowdhury, Stephen Hemminger, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel On Wednesday 2012-12-19 12:56, Jamal Hadi Salim wrote: > > To be applied pending more testing. > > Attached. Sorry, I thought I had sent this out over the weekend. > I have done basic testing with a single mark and sending pings to > update stats which can then displayed for the mark. > > diffstat xt-p1 > Kconfig | 15 ++ > Makefile | 1 > act_xt.c | 324 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 339 insertions(+), 1 deletion(-) Humm... that's a huge patch for what seems to be equal to act_ipt.c Let's do a cross-diff: --- act_ipt.c 2012-10-25 19:49:25.372191795 +0200 +++ act_xt.c 2012-12-19 16:48:22.052419730 +0100 @@ -2 +2 @@ - * net/sched/ipt.c iptables target interface + * net/sched/act_xt.c iptables target interface @@ -11 +11 @@ - * Copyright: Jamal Hadi Salim (2002-4) + * Copyright: Jamal Hadi Salim (2002-12) @@ -30 +29,0 @@ - @@ -42 +41,2 @@ static struct tcf_hashinfo ipt_hash_info -static int ipt_init_target(struct xt_entry_target *t, char *table, unsigned int hook) +static int ipt_init_target(struct xt_entry_target *t, char *table, + unsigned int hook) @@ -243,2 +243,2 @@ static int tcf_ipt(struct sk_buff *skb, - net_notice_ratelimited("tc filter: Bogus netfilter code %d assume ACCEPT\n", - ret); + net_notice_ratelimited + ("tc filter: Bogus netfilter code %d assume ACCEPT\n", ret); @@ -253 +253,2 @@ static int tcf_ipt(struct sk_buff *skb, -static int tcf_ipt_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) +static int tcf_ipt_dump(struct sk_buff *skb, struct tc_action *a, int bind, + int ref) @@ -295 +296 @@ static struct tc_action_ops act_ipt_ops - .kind = "ipt", + .kind = "xt", @@ -308,2 +309,2 @@ static struct tc_action_ops act_ipt_ops -MODULE_AUTHOR("Jamal Hadi Salim(2002-4)"); -MODULE_DESCRIPTION("Iptables target actions"); +MODULE_AUTHOR("Jamal Hadi Salim(2002-12)"); +MODULE_DESCRIPTION("New Iptables target actions"); Is that [the set of hunks] all? Then I would instead suggest something like: diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c index 58fb3c7..f92a007 100644 --- a/net/sched/act_ipt.c +++ b/net/sched/act_ipt.c @@ -305,18 +305,43 @@ static struct tc_action_ops act_ipt_ops = { .walk = tcf_generic_walker }; +static struct tc_action_ops act_xt_ops = { + .kind = "xt", + .hinfo = &ipt_hash_info, + .type = TCA_ACT_IPT, + .capab = TCA_CAP_NONE, + .owner = THIS_MODULE, + .act = tcf_ipt, + .dump = tcf_ipt_dump, + .cleanup = tcf_ipt_cleanup, + .lookup = tcf_hash_search, + .init = tcf_ipt_init, + .walk = tcf_generic_walker +}; + MODULE_AUTHOR("Jamal Hadi Salim(2002-4)"); MODULE_DESCRIPTION("Iptables target actions"); MODULE_LICENSE("GPL"); +MODULE_ALIAS("act_xt"); static int __init ipt_init_module(void) { - return tcf_register_action(&act_ipt_ops); + int ret; + ret = tcf_register_action(&act_ipt_ops); + if (ret < 0) + return ret; + ret = tcf_register_action(&xt_ipt_ops); + if (ret < 0) { + tcf_unregister_action(&act_ipt_ops); + return ret; + } + return 0; } static void __exit ipt_cleanup_module(void) { tcf_unregister_action(&act_ipt_ops); + tcf_unregister_action(&act_xt_ops); } module_init(ipt_init_module); ^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-19 15:52 ` Jan Engelhardt @ 2012-12-19 23:05 ` Jamal Hadi Salim 0 siblings, 0 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-19 23:05 UTC (permalink / raw) To: Jan Engelhardt Cc: Hasan Chowdhury, Stephen Hemminger, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-19 10:52 AM, Jan Engelhardt wrote: > > > Humm... that's a huge patch for what seems to be equal to act_ipt.c > Let's do a cross-diff: > I was thinking of our little discussion when doing that. The one reason i separated the two is so when the time is right you can patch on top of only act_xt.c and eventually act_ipt.c will die.. Does changes on top of act_xt.c sound palatable to you? Otherwise, you are right - it is overkill cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
[parent not found: <CAASe=fQZGwjM_2PStRE0tje33Doi6TuwJJ3p7x-SRcwq3mQvRg@mail.gmail.com>]
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface [not found] ` <CAASe=fQZGwjM_2PStRE0tje33Doi6TuwJJ3p7x-SRcwq3mQvRg@mail.gmail.com> @ 2012-12-19 23:00 ` Jamal Hadi Salim 0 siblings, 0 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-19 23:00 UTC (permalink / raw) To: Hasan Chowdhury Cc: Stephen Hemminger, Jan Engelhardt, Yury Stankevich, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-19 10:51 AM, Hasan Chowdhury wrote: > Hi Jamal, > I will test it once I get some opportunity , but think I like to know > even before any testing > > 1. What will be the new procedure to compile iproute2 after the patch > apply (any new library or any configuration that needs to be adjusted ) git pull Stephens latest tree. Apply patch 1 and compile. When you are done compiling an m_xt.so will sit in the tc directory; unfortunate that we are putting out this shared libs. Backup your distro version and copy this over to that location. On ubuntu 12.04: sudo cp tc/m_xt.so /usr/lib/tc/m_xt.so will do it. > 2. tc filter add dev eth0 protocol ip parent 1: prio 3 u32 match ip src > 192.168.0.0/16 <http://192.168.0.0/16> flowid 1:1 action xt -j MARK > --set-mark 3 > > > is this still a valid command ? > Indeed it is. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-19 11:56 ` [PATCH] pkt_sched: act_xt support new Xtables interface Jamal Hadi Salim 2012-12-19 15:52 ` Jan Engelhardt [not found] ` <CAASe=fQZGwjM_2PStRE0tje33Doi6TuwJJ3p7x-SRcwq3mQvRg@mail.gmail.com> @ 2012-12-20 8:54 ` Yury Stankevich 2012-12-20 12:35 ` Jamal Hadi Salim 2 siblings, 1 reply; 64+ messages in thread From: Yury Stankevich @ 2012-12-20 8:54 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel 19.12.2012 15:56, Jamal Hadi Salim пишет: > Hasan/Yury, if you test this please use the latest iproute2 with only > the first patch I posted (originally from Hasan). Hasan please use that > patch not your version - if theres anything wrong we can find out sooner > before the patch becomes final. Hello, 3.7.1 kernel with 3.7.0 iproute, patch-xt, xt-p1 + linkage fix was applyed command successfully performed, but actually doesn't work. command: tc filter add dev $dev parent ffff: protocol ip u32 match u32 0 0 \ action xt -j CONNMARK --restore-mark \ action mirred egress redirect dev ifb0 then i use filter: tc filter add dev ifb0 protocol ip parent 1: prio 2 handle 0xa fw flowid 1:102 iptables line: iptable -t mangle -A POSTROUTING -p tcp --dport 80 -m connmark --mark 0 -m connbytes --connbytes 204800: --connbytes-dir both --connbytes-mode bytes -j CONNMARK --set-mark 0xa once i run a test to download 300K file, from iptables counters i can see that rule in POSTROUTING is triggered, but from `tc -s qdisc show dev ifb0` i see that no packets was sent to 1:102 flow. btw, tc -p -s filter show dev ifb0 parent 1: do not show stats `(rule hit 416 success 0)` for this (filter protocol ip pref 2 fw handle 0xa classid 1:102) rule. -- Linux registered user #402966 // pub 1024D/E99AF373 <pgp.mit.edu> -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-20 8:54 ` Yury Stankevich @ 2012-12-20 12:35 ` Jamal Hadi Salim 2012-12-20 14:59 ` Yury Stankevich 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-20 12:35 UTC (permalink / raw) To: Yury Stankevich Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel Could be your setup. I didnt do a lot of testing but from my notes (running different kernel at the moment): #try to point to everything (no iptables setup) tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 flowid 23:23 action xt -j CONNMARK --restore-mark #let it run for a 1 sec then display with tc -s filter show dev eth0 parent ffff: ---- filter protocol ip pref 49152 u32 filter protocol ip pref 49152 u32 fh 800: ht divisor 1 filter protocol ip pref 49152 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 23:23 match 00000000/00000000 at 0 action order 1: tablename: mangle hook: NF_IP_PRE_ROUTING target CONNMARK restore index 1 ref 1 bind 1 installed 3 sec used 1 sec Action statistics: Sent 280 bytes 4 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 ---- cheers, jamal On 12-12-20 03:54 AM, Yury Stankevich wrote: > 19.12.2012 15:56, Jamal Hadi Salim пишет: >> Hasan/Yury, if you test this please use the latest iproute2 with only >> the first patch I posted (originally from Hasan). Hasan please use that >> patch not your version - if theres anything wrong we can find out sooner >> before the patch becomes final. > > Hello, > 3.7.1 kernel with 3.7.0 iproute, > patch-xt, xt-p1 + linkage fix was applyed > command successfully performed, but actually doesn't work. > > command: > tc filter add dev $dev parent ffff: protocol ip u32 match u32 0 0 \ > action xt -j CONNMARK --restore-mark \ > action mirred egress redirect dev ifb0 > then i use filter: > > tc filter add dev ifb0 protocol ip parent 1: prio 2 handle 0xa fw flowid > 1:102 > > iptables line: > iptable -t mangle -A POSTROUTING -p tcp --dport 80 -m connmark --mark 0 > -m connbytes --connbytes 204800: --connbytes-dir both --connbytes-mode > bytes -j CONNMARK --set-mark 0xa > > once i run a test to download 300K file, > from iptables counters i can see that rule in POSTROUTING is triggered, > but from `tc -s qdisc show dev ifb0` i see that no packets was sent to > 1:102 flow. > > btw, > tc -p -s filter show dev ifb0 parent 1: > do not show stats `(rule hit 416 success 0)` for this (filter protocol > ip pref 2 fw handle 0xa classid 1:102) rule. > > > -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-20 12:35 ` Jamal Hadi Salim @ 2012-12-20 14:59 ` Yury Stankevich 2012-12-21 13:03 ` Jamal Hadi Salim 0 siblings, 1 reply; 64+ messages in thread From: Yury Stankevich @ 2012-12-20 14:59 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel interesting, #tc -s filter show dev usb0 parent ffff: filter protocol ip pref 49152 u32 filter protocol ip pref 49152 u32 fh 800: ht divisor 1 filter protocol ip pref 49152 u32 fh 800::800 order 2048 key ht 800 bkt 0 terminal flowid ??? (rule hit 707 success 707) match 00000000/00000000 at 0 (success 707 ) action order 1: tablename: mangle hook: NF_IP_PRE_ROUTING target CONNMARK restore index 5 ref 1 bind 1 installed 394 sec used 11 sec Action statistics: Sent 783783 bytes 707 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 action order 2: mirred (Egress Redirect to device ifb0) stolen index 5 ref 1 bind 1 installed 394 sec used 11 sec Action statistics: Sent 783783 bytes 707 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 so, looks like packets was sent to CONNMARK target. but... i make a iptables rule to log packets with 0xa mark: Chain PREROUTING (policy ACCEPT 1308 packets, 848K bytes) pkts bytes target prot opt in out source destination 0 0 NFLOG all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0xa nflog-group 1 Chain POSTROUTING (policy ACCEPT 1240 packets, 550K bytes) pkts bytes target prot opt in out source destination 1 40 CONNMARK tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 connmark match 0x0 connbytes 204800 connbytes mode bytes connbytes direction both CONNMARK set 0xa idea is: i run downloading, rule from POSTROUTING must fire if i download more than ~200K, tc filter call to CONNMARK restore, must restore mark (0xa) for packets belong to this connection. so i expect, that PREROUTING rule must notice the restored mark, but it doesn't. maybe i miss something ? 20.12.2012 16:35, Jamal Hadi Salim пишет: > > Could be your setup. I didnt do a lot of testing but > from my notes (running different kernel at the moment): > > #try to point to everything (no iptables setup) > tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 flowid > 23:23 action xt -j CONNMARK --restore-mark > #let it run for a 1 sec then display with > tc -s filter show dev eth0 parent ffff: > > ---- > filter protocol ip pref 49152 u32 > filter protocol ip pref 49152 u32 fh 800: ht divisor 1 > filter protocol ip pref 49152 u32 fh 800::800 order 2048 key ht 800 bkt > 0 flowid 23:23 > match 00000000/00000000 at 0 > action order 1: tablename: mangle hook: NF_IP_PRE_ROUTING > target CONNMARK restore > index 1 ref 1 bind 1 installed 3 sec used 1 sec > Action statistics: > Sent 280 bytes 4 pkt (dropped 0, overlimits 0 requeues 0) > backlog 0b 0p requeues 0 > ---- > > cheers, > jamal > > On 12-12-20 03:54 AM, Yury Stankevich wrote: >> 19.12.2012 15:56, Jamal Hadi Salim пишет: >>> Hasan/Yury, if you test this please use the latest iproute2 with only >>> the first patch I posted (originally from Hasan). Hasan please use that >>> patch not your version - if theres anything wrong we can find out sooner >>> before the patch becomes final. >> >> Hello, >> 3.7.1 kernel with 3.7.0 iproute, >> patch-xt, xt-p1 + linkage fix was applyed >> command successfully performed, but actually doesn't work. >> >> command: >> tc filter add dev $dev parent ffff: protocol ip u32 match u32 0 0 \ >> action xt -j CONNMARK --restore-mark \ >> action mirred egress redirect dev ifb0 >> then i use filter: >> >> tc filter add dev ifb0 protocol ip parent 1: prio 2 handle 0xa fw flowid >> 1:102 >> >> iptables line: >> iptable -t mangle -A POSTROUTING -p tcp --dport 80 -m connmark --mark 0 >> -m connbytes --connbytes 204800: --connbytes-dir both --connbytes-mode >> bytes -j CONNMARK --set-mark 0xa >> >> once i run a test to download 300K file, >> from iptables counters i can see that rule in POSTROUTING is triggered, >> but from `tc -s qdisc show dev ifb0` i see that no packets was sent to >> 1:102 flow. >> >> btw, >> tc -p -s filter show dev ifb0 parent 1: >> do not show stats `(rule hit 416 success 0)` for this (filter protocol >> ip pref 2 fw handle 0xa classid 1:102) rule. >> >> >> > -- Linux registered user #402966 // pub 1024D/E99AF373 <pgp.mit.edu> ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-20 14:59 ` Yury Stankevich @ 2012-12-21 13:03 ` Jamal Hadi Salim 2012-12-21 13:13 ` Yury Stankevich 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-21 13:03 UTC (permalink / raw) To: Yury Stankevich Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-20 09:59 AM, Yury Stankevich wrote: > interesting, > > #tc -s filter show dev usb0 parent ffff: Given you are adding this on ingress - the settings you have will happen before pre-routing hook. If you did things at egress - the setting will take effect after post-routing. So take a closer look at those details they look like your source of issues.. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-21 13:03 ` Jamal Hadi Salim @ 2012-12-21 13:13 ` Yury Stankevich 2012-12-21 13:50 ` Jamal Hadi Salim 0 siblings, 1 reply; 64+ messages in thread From: Yury Stankevich @ 2012-12-21 13:13 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel 21.12.2012 17:03, Jamal Hadi Salim пишет: > On 12-12-20 09:59 AM, Yury Stankevich wrote: >> interesting, >> >> #tc -s filter show dev usb0 parent ffff: > > > Given you are adding this on ingress - the settings you have will > happen before pre-routing hook. > If you did things at egress - the setting will take effect after > post-routing. So take a closer look at those details they look > like your source of issues.. sure, i use it ingress, so, i need to use tc xt action to get mark on the packet, before filter on ifb will run. prerouting rule, in turn, used to test if mark was actually restored. in practice: 1. prerouting rule - is not fired. so, no packets with mark was seen. 2. filter on ifb - do not pass traffic to flow configured. looks like `CONNMARK --restore` is not really called. -- Linux registered user #402966 // pub 1024D/E99AF373 <pgp.mit.edu> -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-21 13:13 ` Yury Stankevich @ 2012-12-21 13:50 ` Jamal Hadi Salim 2012-12-21 14:14 ` Yury Stankevich 2012-12-21 14:35 ` Jan Engelhardt 0 siblings, 2 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-21 13:50 UTC (permalink / raw) To: Yury Stankevich Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-21 08:13 AM, Yury Stankevich wrote: > sure, > i use it ingress, > so, i need to use tc xt action > to get mark on the packet, before filter on ifb will run. Ok. So does ifb see it? > prerouting rule, in turn, used to test if mark was actually restored. No experience with connmark, but - in order to restore something has to store it, correct? > in practice: > 1. prerouting rule - is not fired. so, no packets with mark was seen. > 2. filter on ifb - do not pass traffic to flow configured. > looks like `CONNMARK --restore` is not really called. > My suspicion is that it is not set to begin with... cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-21 13:50 ` Jamal Hadi Salim @ 2012-12-21 14:14 ` Yury Stankevich 2012-12-22 13:19 ` Jamal Hadi Salim 2012-12-21 14:35 ` Jan Engelhardt 1 sibling, 1 reply; 64+ messages in thread From: Yury Stankevich @ 2012-12-21 14:14 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel well. let me describe whole picture i want to achieve 1. use htb/sfq on ingress. i got a traffic, and use few u32 filters to direct it to 3 flows, priority, interactive, and bulk. as http normally pass to interactive flow, i want to move long donwloads to the bulk one. how i trying to find these downloads: iptables -t mangle -A POSTROUTING -p tcp --dport 80 -m connmark --mark 0 -m connbytes --connbytes 204800: --connbytes-dir both --connbytes-mode bytes -j CONNMARK --set-mark 0xa so, http connection where more than 200K downloaded, must got a connection mark. since ingress traffic hits qos before netfilter, i use xt action, to copy connection mark, to a packet. (action xt -j CONNMARK --restore-mark ) from this moment, i expect packet must have a restored mark after this, i can use high priority tc filter .. handle 0xa fw flowid 1:102 to direct packet with mark 0xa to 1:102 flow (bulk). now about a problem. 1. i run http download, once i get 200K - i can see that rule in POSTROUTING is triggered and connection mark is installed (iptables -L -n -v mangle -- can show number of packets matched by rule) 2. i see to tc stats for my flows, and i see, that packets still going to interactive flow, not bulk as i expect. 3. from tc -s filter show dev usb0 parent ffff: filter protocol ip pref 49152 u32 filter protocol ip pref 49152 u32 fh 800: ht divisor 1 filter protocol ip pref 49152 u32 fh 800::800 order 2048 key ht 800 bkt 0 terminal flowid ??? (rule hit 707 success 707) match 00000000/00000000 at 0 (success 707 ) action order 1: tablename: mangle hook: NF_IP_PRE_ROUTING target CONNMARK restore index 5 ref 1 bind 1 installed 394 sec used 11 sec Action statistics: Sent 783783 bytes 707 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 action order 2: mirred (Egress Redirect to device ifb0) stolen index 5 ref 1 bind 1 installed 394 sec used 11 sec Action statistics: Sent 783783 bytes 707 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 i can see that packets must reach xt action. 4. lets try to check packets mark with iptables, if mark restored by xt action - i must be able to match it in prerouting rule. iptables -t mangle -A PREROUTING -m mark --mark 0xa -j NFLOG --nflog-group 1 but this rule not macthesd - so, no mark is restored by xt action. maybe im completely wrong here, and such mode can't work for some reason ? 21.12.2012 17:50, Jamal Hadi Salim пишет: > On 12-12-21 08:13 AM, Yury Stankevich wrote: > >> sure, >> i use it ingress, >> so, i need to use tc xt action >> to get mark on the packet, before filter on ifb will run. > > Ok. So does ifb see it? > >> prerouting rule, in turn, used to test if mark was actually restored. > > No experience with connmark, but - in order to restore something has > to store it, correct? > >> in practice: >> 1. prerouting rule - is not fired. so, no packets with mark was seen. >> 2. filter on ifb - do not pass traffic to flow configured. >> looks like `CONNMARK --restore` is not really called. >> > > My suspicion is that it is not set to begin with... > > cheers, > jamal > -- Linux registered user #402966 // pub 1024D/E99AF373 <pgp.mit.edu> ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-21 14:14 ` Yury Stankevich @ 2012-12-22 13:19 ` Jamal Hadi Salim 2012-12-22 13:43 ` Jan Engelhardt 2012-12-22 13:58 ` Yury Stankevich 0 siblings, 2 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-22 13:19 UTC (permalink / raw) To: Yury Stankevich Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-21 09:14 AM, Yury Stankevich wrote: > > well. > let me describe whole picture i want to achieve > I think i got what you are trying to do Yury. Clever. From the description Jan provided in his response, I dont think this used to work at all. Are you saying it worked before? Having said that, what you are doing sounds so useful that we need to make it work ;-> But it appears like we need a brand new action for it, something like GetMarkFromConntrack. Jan, I am assuming (on ingress only) we need to call "something" to give us the nfct then grab the skb->mark from nfct. On egress, I am assuming the skb->mark is already set if connmark is to be used... Am i correct? If yes, then this action will only be useful at ingress. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-22 13:19 ` Jamal Hadi Salim @ 2012-12-22 13:43 ` Jan Engelhardt 2012-12-22 13:56 ` Jamal Hadi Salim 2012-12-22 13:58 ` Yury Stankevich 1 sibling, 1 reply; 64+ messages in thread From: Jan Engelhardt @ 2012-12-22 13:43 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Yury Stankevich, Hasan Chowdhury, Stephen Hemminger, netdev@vger.kernel.org, pablo, netfilter-devel On Saturday 2012-12-22 14:19, Jamal Hadi Salim wrote: > > Having said that, what you are doing sounds so useful > that we need to make it work ;-> But it appears like > we need a brand new action for it, something like > GetMarkFromConntrack. Jan, I am assuming (on ingress only) > we need to call "something" to give us the nfct then > grab the skb->mark from nfct. Looking up CT before ingress would mean the entire "raw" table needs to be moved before ingress. But with classic ip_tables, calling a table requires a lot of setup (basically ip_rcv). > On egress, > I am assuming the skb->mark is already set if connmark > is to be used... Am i correct? All new skbs (i.e. those that did not loop due to IPsec, for example) received through __netif_receive_skb should start out with skb->mark=0, which is why CONNMARK --restore-mark is needed to copy skb->mark=ct->mark. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-22 13:43 ` Jan Engelhardt @ 2012-12-22 13:56 ` Jamal Hadi Salim 0 siblings, 0 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-22 13:56 UTC (permalink / raw) To: Jan Engelhardt Cc: Yury Stankevich, Hasan Chowdhury, Stephen Hemminger, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-22 08:43 AM, Jan Engelhardt wrote: > > Looking up CT before ingress would mean the entire "raw" > table needs to be moved before ingress. But with classic > ip_tables, calling a table requires a lot of setup > (basically ip_rcv). Scanning the code: Would it not work if i only passed it IP packets (the tc classifier can check) and then for v4 i do something like ipv4_conntrack_in() with pre-routing as the hook to update the skb? > All new skbs (i.e. those that did not loop due to IPsec, for example) > received through __netif_receive_skb should start out with > skb->mark=0, which is why CONNMARK --restore-mark is needed > to copy skb->mark=ct->mark. I may be overthinking this: are you saying connmark should do the copying to skb->mark instead of some action? Earlier you said conmark depends on presence of skb->nfct. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-22 13:19 ` Jamal Hadi Salim 2012-12-22 13:43 ` Jan Engelhardt @ 2012-12-22 13:58 ` Yury Stankevich 2012-12-22 14:04 ` Florian Westphal 2012-12-22 14:09 ` Jamal Hadi Salim 1 sibling, 2 replies; 64+ messages in thread From: Yury Stankevich @ 2012-12-22 13:58 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel 22.12.2012 17:19, Jamal Hadi Salim пишет: > From the description Jan provided in his response, I dont > think this used to work at all. Are you saying it worked before? no. i'm trying if this can work, alas. it can't. > Having said that, what you are doing sounds so useful > that we need to make it work ;-> But it appears like > we need a brand new action for it, something like > GetMarkFromConntrack. maybe ifb device can be made more friendly to iptables ? for a sample, run some (or all?) nefilter hooks before qdisc, like on a normal interface ? -- Linux registered user #402966 // pub 1024D/E99AF373 <pgp.mit.edu> -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-22 13:58 ` Yury Stankevich @ 2012-12-22 14:04 ` Florian Westphal 2012-12-22 14:09 ` Jamal Hadi Salim 1 sibling, 0 replies; 64+ messages in thread From: Florian Westphal @ 2012-12-22 14:04 UTC (permalink / raw) To: Yury Stankevich Cc: Jamal Hadi Salim, Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel Yury Stankevich <urykhy@gmail.com> wrote: > 22.12.2012 17:19, Jamal Hadi Salim пишет: > > From the description Jan provided in his response, I dont > > think this used to work at all. Are you saying it worked before? > > no. > i'm trying if this can work, alas. it can't. Yury, what are you trying to accomplish? Is there a particular reason why you want to use ingress shaping instead of pure policing? I ask, because you could try to use hashlimit match to do rate policing via netfilter. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-22 13:58 ` Yury Stankevich 2012-12-22 14:04 ` Florian Westphal @ 2012-12-22 14:09 ` Jamal Hadi Salim 2012-12-24 11:34 ` Jamal Hadi Salim 1 sibling, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-22 14:09 UTC (permalink / raw) To: Yury Stankevich Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-22 08:58 AM, Yury Stankevich wrote: > i'm trying if this can work, alas. it can't. Now i want it to work ;-> So dont give up yet. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-22 14:09 ` Jamal Hadi Salim @ 2012-12-24 11:34 ` Jamal Hadi Salim 2012-12-24 11:49 ` Felix Fietkau 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-24 11:34 UTC (permalink / raw) To: Yury Stankevich, Felix Fietkau Cc: Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel Some good news Yury. I am told Felix Fietkau <nbd@openwrt.org> (on CC) actually already solved this issue and it is a feature in openwrt. I cant find the code. Felix - Yury is trying to retrieve skb->mark fields from netfilter connmark. My understanding is you have written such an action. Can you please point us to it - and any reason you havent submitted this for inclusion in kernel proper? cheers, jamal On 12-12-22 09:09 AM, Jamal Hadi Salim wrote: > On 12-12-22 08:58 AM, Yury Stankevich wrote: > >> i'm trying if this can work, alas. it can't. > > Now i want it to work ;-> So dont give up yet. > > > cheers, > jamal > ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-24 11:34 ` Jamal Hadi Salim @ 2012-12-24 11:49 ` Felix Fietkau 2012-12-24 12:19 ` Jamal Hadi Salim 2012-12-24 13:12 ` Pablo Neira Ayuso 0 siblings, 2 replies; 64+ messages in thread From: Felix Fietkau @ 2012-12-24 11:49 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Yury Stankevich, Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel On 2012-12-24 12:34 PM, Jamal Hadi Salim wrote: > > Some good news Yury. > I am told Felix Fietkau <nbd@openwrt.org> (on CC) actually > already solved this issue and it is a feature in openwrt. I > cant find the code. > > Felix - Yury is trying to retrieve skb->mark fields from > netfilter connmark. My understanding is you have written > such an action. Can you please point us to it - and any > reason you havent submitted this for inclusion in kernel > proper? After I added it as an experiment, I got distracted with other projects again and forgot about submitting it. Take a look at the code - if the approach is reasonable, I'll submit this thing for inclusion soon. - Felix --- /dev/null +++ b/net/sched/act_connmark.c @@ -0,0 +1,137 @@ +/* + * Copyright (c) 2011 Felix Fietkau <nbd@openwrt.org> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <linux/module.h> +#include <linux/init.h> +#include <linux/kernel.h> +#include <linux/skbuff.h> +#include <linux/rtnetlink.h> +#include <linux/pkt_cls.h> +#include <linux/ip.h> +#include <linux/ipv6.h> +#include <net/netlink.h> +#include <net/pkt_sched.h> +#include <net/act_api.h> + +#include <net/netfilter/nf_conntrack.h> +#include <net/netfilter/nf_conntrack_core.h> + +#define TCA_ACT_CONNMARK 20 + +#define CONNMARK_TAB_MASK 3 +static struct tcf_common *tcf_connmark_ht[CONNMARK_TAB_MASK + 1]; +static u32 connmark_idx_gen; +static DEFINE_RWLOCK(connmark_lock); + +static struct tcf_hashinfo connmark_hash_info = { + .htab = tcf_connmark_ht, + .hmask = CONNMARK_TAB_MASK, + .lock = &connmark_lock, +}; + +static int tcf_connmark(struct sk_buff *skb, const struct tc_action *a, + struct tcf_result *res) +{ + struct nf_conn *c; + enum ip_conntrack_info ctinfo; + int proto; + int r; + + if (skb->protocol == htons(ETH_P_IP)) { + if (skb->len < sizeof(struct iphdr)) + goto out; + proto = PF_INET; + } else if (skb->protocol == htons(ETH_P_IPV6)) { + if (skb->len < sizeof(struct ipv6hdr)) + goto out; + proto = PF_INET6; + } else + goto out; + + r = nf_conntrack_in(dev_net(skb->dev), proto, NF_INET_PRE_ROUTING, skb); + if (r != NF_ACCEPT) + goto out; + + c = nf_ct_get(skb, &ctinfo); + if (!c) + goto out; + + skb->mark = c->mark; + nf_conntrack_put(skb->nfct); + skb->nfct = NULL; + +out: + return TC_ACT_PIPE; +} + +static int tcf_connmark_init(struct nlattr *nla, struct nlattr *est, + struct tc_action *a, int ovr, int bind) +{ + struct tcf_common *pc; + + pc = tcf_hash_create(0, est, a, sizeof(*pc), bind, + &connmark_idx_gen, &connmark_hash_info); + if (IS_ERR(pc)) + return PTR_ERR(pc); + + tcf_hash_insert(pc, &connmark_hash_info); + + return ACT_P_CREATED; +} + +static inline int tcf_connmark_cleanup(struct tc_action *a, int bind) +{ + if (a->priv) + return tcf_hash_release(a->priv, bind, &connmark_hash_info); + return 0; +} + +static inline int tcf_connmark_dump(struct sk_buff *skb, struct tc_action *a, + int bind, int ref) +{ + return skb->len; +} + +static struct tc_action_ops act_connmark_ops = { + .kind = "connmark", + .hinfo = &connmark_hash_info, + .type = TCA_ACT_CONNMARK, + .capab = TCA_CAP_NONE, + .owner = THIS_MODULE, + .act = tcf_connmark, + .dump = tcf_connmark_dump, + .cleanup = tcf_connmark_cleanup, + .init = tcf_connmark_init, + .walk = tcf_generic_walker, +}; + +MODULE_AUTHOR("Felix Fietkau <nbd@openwrt.org>"); +MODULE_DESCRIPTION("Connection tracking mark restoring"); +MODULE_LICENSE("GPL"); + +static int __init connmark_init_module(void) +{ + return tcf_register_action(&act_connmark_ops); +} + +static void __exit connmark_cleanup_module(void) +{ + tcf_unregister_action(&act_connmark_ops); +} + +module_init(connmark_init_module); +module_exit(connmark_cleanup_module); --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -670,6 +670,19 @@ config NET_ACT_CSUM To compile this code as a module, choose M here: the module will be called act_csum. +config NET_ACT_CONNMARK + tristate "Connection Tracking Marking" + depends on NET_CLS_ACT + depends on NF_CONNTRACK + depends on NF_CONNTRACK_MARK + ---help--- + Say Y here to restore the connmark from a scheduler action + + If unsure, say N. + + To compile this code as a module, choose M here: the + module will be called act_connmark. + config NET_CLS_IND bool "Incoming device classification" depends on NET_CLS_U32 || NET_CLS_FW --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -16,6 +16,7 @@ obj-$(CONFIG_NET_ACT_PEDIT) += act_pedit obj-$(CONFIG_NET_ACT_SIMP) += act_simple.o obj-$(CONFIG_NET_ACT_SKBEDIT) += act_skbedit.o obj-$(CONFIG_NET_ACT_CSUM) += act_csum.o +obj-$(CONFIG_NET_ACT_CONNMARK) += act_connmark.o obj-$(CONFIG_NET_SCH_FIFO) += sch_fifo.o obj-$(CONFIG_NET_SCH_CBQ) += sch_cbq.o obj-$(CONFIG_NET_SCH_HTB) += sch_htb.o ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-24 11:49 ` Felix Fietkau @ 2012-12-24 12:19 ` Jamal Hadi Salim 2012-12-24 13:12 ` Pablo Neira Ayuso 1 sibling, 0 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-24 12:19 UTC (permalink / raw) To: Felix Fietkau Cc: Yury Stankevich, Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-24 06:49 AM, Felix Fietkau wrote: > > After I added it as an experiment, I got distracted with other projects > again and forgot about submitting it. Take a look at the code - if the > approach is reasonable, I'll submit this thing for inclusion soon. > Excellent ;-> Simple and elegant. Usable as is - some minor comments. First nitpick: The name is not very reflective, how about: GetMarkFromConntrack or something along those lines? > +static int tcf_connmark(struct sk_buff *skb, const struct tc_action *a, > + struct tcf_result *res) > +{ > + struct nf_conn *c; > + enum ip_conntrack_info ctinfo; > + int proto; > + int r; > + > + if (skb->protocol == htons(ETH_P_IP)) { > + if (skb->len < sizeof(struct iphdr)) > + goto out; > + proto = PF_INET; > + } else if (skb->protocol == htons(ETH_P_IPV6)) { > + if (skb->len < sizeof(struct ipv6hdr)) > + goto out; > + proto = PF_INET6; > + } else > + goto out; > + I would have said that this action is probably also not useful for egress qdisc path since skb->mark would already be set. It maybe worth checking skb->tc_verd and skipping overhead of nf_conntrack_in() call. Look at act_mirred for such a check. > + r = nf_conntrack_in(dev_net(skb->dev), proto, NF_INET_PRE_ROUTING, skb); > + if (r != NF_ACCEPT) > + goto out; > + > + c = nf_ct_get(skb, &ctinfo); > + if (!c) > + goto out; > + > + skb->mark = c->mark; > + nf_conntrack_put(skb->nfct); > + skb->nfct = NULL; > + > +out: > + return TC_ACT_PIPE; Ok, perhaps set tcf_action in (iproute2) user space to TC_ACT_PIPE then just return policy->tcf_action here. Even better is to have a different TC_ACT_XXX returned for failure vs success... Your success path becomes TC_ACT_PIPE and let the user program the failure branch optionally. This would allow for branching to different actions if success/failure, example: if mark is found { if mark is 0xa redirect to ifb0 else redirect to ifb1 } else set mark to 3 then redirect to ifb9 etc. Not sure if that made sense. I am under the influence of nyquil ;-> cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-24 11:49 ` Felix Fietkau 2012-12-24 12:19 ` Jamal Hadi Salim @ 2012-12-24 13:12 ` Pablo Neira Ayuso 2012-12-24 14:05 ` Jamal Hadi Salim 1 sibling, 1 reply; 64+ messages in thread From: Pablo Neira Ayuso @ 2012-12-24 13:12 UTC (permalink / raw) To: Felix Fietkau Cc: Jamal Hadi Salim, Yury Stankevich, Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, netfilter-devel Hi Felix, On Mon, Dec 24, 2012 at 12:49:16PM +0100, Felix Fietkau wrote: > On 2012-12-24 12:34 PM, Jamal Hadi Salim wrote: > > > > Some good news Yury. > > I am told Felix Fietkau <nbd@openwrt.org> (on CC) actually > > already solved this issue and it is a feature in openwrt. I > > cant find the code. > > > > Felix - Yury is trying to retrieve skb->mark fields from > > netfilter connmark. My understanding is you have written > > such an action. Can you please point us to it - and any > > reason you havent submitted this for inclusion in kernel > > proper? > After I added it as an experiment, I got distracted with other projects > again and forgot about submitting it. Take a look at the code - if the > approach is reasonable, I'll submit this thing for inclusion soon. > > - Felix > > --- /dev/null > +++ b/net/sched/act_connmark.c > @@ -0,0 +1,137 @@ > +/* > + * Copyright (c) 2011 Felix Fietkau <nbd@openwrt.org> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > + * more details. > + * > + * You should have received a copy of the GNU General Public License along with > + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple > + * Place - Suite 330, Boston, MA 02111-1307 USA. > + */ > + > +#include <linux/module.h> > +#include <linux/init.h> > +#include <linux/kernel.h> > +#include <linux/skbuff.h> > +#include <linux/rtnetlink.h> > +#include <linux/pkt_cls.h> > +#include <linux/ip.h> > +#include <linux/ipv6.h> > +#include <net/netlink.h> > +#include <net/pkt_sched.h> > +#include <net/act_api.h> > + > +#include <net/netfilter/nf_conntrack.h> > +#include <net/netfilter/nf_conntrack_core.h> > + > +#define TCA_ACT_CONNMARK 20 > + > +#define CONNMARK_TAB_MASK 3 > +static struct tcf_common *tcf_connmark_ht[CONNMARK_TAB_MASK + 1]; > +static u32 connmark_idx_gen; > +static DEFINE_RWLOCK(connmark_lock); > + > +static struct tcf_hashinfo connmark_hash_info = { > + .htab = tcf_connmark_ht, > + .hmask = CONNMARK_TAB_MASK, > + .lock = &connmark_lock, > +}; > + > +static int tcf_connmark(struct sk_buff *skb, const struct tc_action *a, > + struct tcf_result *res) > +{ > + struct nf_conn *c; > + enum ip_conntrack_info ctinfo; > + int proto; > + int r; > + > + if (skb->protocol == htons(ETH_P_IP)) { > + if (skb->len < sizeof(struct iphdr)) > + goto out; > + proto = PF_INET; > + } else if (skb->protocol == htons(ETH_P_IPV6)) { > + if (skb->len < sizeof(struct ipv6hdr)) > + goto out; > + proto = PF_INET6; > + } else > + goto out; > + > + r = nf_conntrack_in(dev_net(skb->dev), proto, NF_INET_PRE_ROUTING, skb); conntrack needs to see defragmented packets, you have to call nf_defrag_ipv4 / _ipv6 respectively before that. This also changes the semantics of the raw table in iptables since it will now see packet with conntrack already attached. So this would also break -j CT --notrack. This needs more thinking. I can appreciate the value of calling conntrack from different points of the packet traversal, but there are a couple of thing we have to resolve before allowing that. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-24 13:12 ` Pablo Neira Ayuso @ 2012-12-24 14:05 ` Jamal Hadi Salim 2012-12-24 18:19 ` Pablo Neira Ayuso 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-24 14:05 UTC (permalink / raw) To: Pablo Neira Ayuso Cc: Felix Fietkau, Yury Stankevich, Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, netfilter-devel Hi Pablo, On 12-12-24 08:12 AM, Pablo Neira Ayuso wrote: > > conntrack needs to see defragmented packets, you have to call > nf_defrag_ipv4 / _ipv6 respectively before that. > This should not be too hard to do - although my thinking says this should be a separate action. > This also changes the semantics of the raw table in iptables since it > will now see packet with conntrack already attached. So this would > also break -j CT --notrack. > Is there a flag we can check which says a flow is not to be tracked? Doesnt nf_conntrack_in() fail if --no track is set? > This needs more thinking. I can appreciate the value of calling > conntrack from different points of the packet traversal, but there are > a couple of thing we have to resolve before allowing that. There is user need for this Pablo - as you can see from what Felix deployed it seems to be used a lot more wider audience dependency. What do we need to do to get this to work properly? cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-24 14:05 ` Jamal Hadi Salim @ 2012-12-24 18:19 ` Pablo Neira Ayuso 2012-12-26 23:10 ` Pablo Neira Ayuso 0 siblings, 1 reply; 64+ messages in thread From: Pablo Neira Ayuso @ 2012-12-24 18:19 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Felix Fietkau, Yury Stankevich, Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, netfilter-devel Hi Jamal, On Mon, Dec 24, 2012 at 09:05:42AM -0500, Jamal Hadi Salim wrote: > On 12-12-24 08:12 AM, Pablo Neira Ayuso wrote: > > > >conntrack needs to see defragmented packets, you have to call > >nf_defrag_ipv4 / _ipv6 respectively before that. > > > > This should not be too hard to do - although my thinking says this > should be a separate action. > > >This also changes the semantics of the raw table in iptables since it > >will now see packet with conntrack already attached. So this would > >also break -j CT --notrack. > > Is there a flag we can check which says a flow is not to be tracked? > Doesnt nf_conntrack_in() fail if --no track is set? The notrack dummy conntrack (consider it a flag) is attached in prerouting raw table. By attaching conntracks at ingress, the notrack flag will be ignored. Note that this also breaks conntrack templates via -j CT, that allows us to set custom conntrack timeouts, zones and helpers at prerouting raw. Basically, ct templates are attached via -j CT, this template is munched by nf_conntrack_in, which adds the corresponding ct features based on the template information. > >This needs more thinking. I can appreciate the value of calling > >conntrack from different points of the packet traversal, but there are > >a couple of thing we have to resolve before allowing that. > > There is user need for this Pablo - as you can see from what Felix > deployed it seems to be used a lot more wider audience dependency. > What do we need to do to get this to work properly? The conntrack code needs to be generalized to allow creating conntrack with features all at once (so we can remove the template infrastructure). Even after that, we'll still have that -j CT rules will be ignored if you're using, let's name it, act_ct from ingress to attach the conntrack to it. With the current approach you're using, people will see conntracks in the iptables raw table, that breaks the current semantics. We'll have the netfilter workshop by Q1/Q2 2013 (still TBA), I think this is material for discussion in it. cheers, Pablo ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-24 18:19 ` Pablo Neira Ayuso @ 2012-12-26 23:10 ` Pablo Neira Ayuso 0 siblings, 0 replies; 64+ messages in thread From: Pablo Neira Ayuso @ 2012-12-26 23:10 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Felix Fietkau, Yury Stankevich, Hasan Chowdhury, Stephen Hemminger, Jan Engelhardt, netdev@vger.kernel.org, netfilter-devel On Mon, Dec 24, 2012 at 07:19:43PM +0100, Pablo Neira Ayuso wrote: > Hi Jamal, > > On Mon, Dec 24, 2012 at 09:05:42AM -0500, Jamal Hadi Salim wrote: > > On 12-12-24 08:12 AM, Pablo Neira Ayuso wrote: > > > > > >conntrack needs to see defragmented packets, you have to call > > >nf_defrag_ipv4 / _ipv6 respectively before that. > > > > > > > This should not be too hard to do - although my thinking says this > > should be a separate action. > > > > >This also changes the semantics of the raw table in iptables since it > > >will now see packet with conntrack already attached. So this would > > >also break -j CT --notrack. > > > > Is there a flag we can check which says a flow is not to be tracked? > > Doesnt nf_conntrack_in() fail if --no track is set? > > The notrack dummy conntrack (consider it a flag) is attached in > prerouting raw table. By attaching conntracks at ingress, the notrack > flag will be ignored. Note that this also breaks conntrack templates > via -j CT, that allows us to set custom conntrack timeouts, zones and > helpers at prerouting raw. > > Basically, ct templates are attached via -j CT, this template is > munched by nf_conntrack_in, which adds the corresponding ct features > based on the template information. I'm still spinning around this and I don't come with some easy solution that doesn't break the existing semantics. One possibility can be to drop the ct reference after leaving ingress, so the lookup happens again in prerouting after the raw table to attach it again and no ct is seen in the raw table but: 1) it's suboptimal in case users have rules using ct at ingress and in iptables. 2) the conntrack template infrastructure needs to be reworked/replaced by something more flexible to attach features to conntracks, so we can still attach features for conntrack entries that were created at ingress (so helpers / custom timeouts / notrack don't break). ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-21 13:50 ` Jamal Hadi Salim 2012-12-21 14:14 ` Yury Stankevich @ 2012-12-21 14:35 ` Jan Engelhardt 2012-12-21 15:45 ` Eric Dumazet 1 sibling, 1 reply; 64+ messages in thread From: Jan Engelhardt @ 2012-12-21 14:35 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Yury Stankevich, Hasan Chowdhury, Stephen Hemminger, netdev@vger.kernel.org, pablo, netfilter-devel On Friday 2012-12-21 14:50, Jamal Hadi Salim wrote: > On 12-12-21 08:13 AM, Yury Stankevich wrote: >> i use it ingress, >> so, i need to use tc xt action >> to get mark on the packet, before filter on ifb will run. >> prerouting rule, in turn, used to test if mark was actually restored. > > No experience with connmark, but - in order to restore something has > to store it, correct? The bigger problem here, if I see __netif_receive_skb right, is that when ingress rules run, skb->nfct is still unset, thereby the CONNMARK action is a no-op. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-21 14:35 ` Jan Engelhardt @ 2012-12-21 15:45 ` Eric Dumazet 2012-12-22 13:42 ` Jamal Hadi Salim 0 siblings, 1 reply; 64+ messages in thread From: Eric Dumazet @ 2012-12-21 15:45 UTC (permalink / raw) To: Jan Engelhardt Cc: Jamal Hadi Salim, Yury Stankevich, Hasan Chowdhury, Stephen Hemminger, netdev@vger.kernel.org, pablo, netfilter-devel On Fri, 2012-12-21 at 15:35 +0100, Jan Engelhardt wrote: > > The bigger problem here, if I see __netif_receive_skb right, is that > when ingress rules run, skb->nfct is still unset, thereby the > CONNMARK action is a no-op. Right, ingress is performed before IP/netfilter stack. This reminds me this might be the reason we have skb_reset_transport_header(skb); in __netif_receive_skb(), while its not very logical. (Yes, sorry for being off topic, but I am referring to http://www.spinics.net/lists/netdev/msg214662.html ) ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH] pkt_sched: act_xt support new Xtables interface 2012-12-21 15:45 ` Eric Dumazet @ 2012-12-22 13:42 ` Jamal Hadi Salim 0 siblings, 0 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-22 13:42 UTC (permalink / raw) To: Eric Dumazet Cc: Jan Engelhardt, Yury Stankevich, Hasan Chowdhury, Stephen Hemminger, netdev@vger.kernel.org, pablo, netfilter-devel On 12-12-21 10:45 AM, Eric Dumazet wrote: > On Fri, 2012-12-21 at 15:35 +0100, Jan Engelhardt wrote: > > This reminds me this might be the reason we have > skb_reset_transport_header(skb); > in __netif_receive_skb(), while its not very logical. > You seem to have nailed the egress part finally. That has been a constant battle. At one point the standard answer was "turn off TSO" ;-> > (Yes, sorry for being off topic, but I am referring to > http://www.spinics.net/lists/netdev/msg214662.html ) I think the skb_reset_transport_header() when Acme made a major overhaul to replace direct pointer access. For this reason i think your second option seems preferable. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-15 21:19 ` tc ipt action Jamal Hadi Salim 2012-12-15 23:06 ` Jan Engelhardt @ 2012-12-16 0:27 ` Pablo Neira Ayuso 2012-12-16 0:59 ` Jan Engelhardt 2012-12-16 10:26 ` Jamal Hadi Salim 1 sibling, 2 replies; 64+ messages in thread From: Pablo Neira Ayuso @ 2012-12-16 0:27 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel Hi Jamal! On Sat, Dec 15, 2012 at 04:19:29PM -0500, Jamal Hadi Salim wrote: > Yury, > > I took a brief look and run some quick tests on ubuntu 12.04. I am going > to be lazy and try and involve the netfilter folks. > It seems that if you left out the args to CONNMARK (includes other > targets like MARK etc) you will succeed - but you get default > values. > > > Example, the following should work for > tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 > action ipt -j CONNMARK \ > action mirred egress redirect dev ifb0 > > Here is what the output looks like when you dont pass the parameters. > > ------- > j@ubuntu:~$ sudo tc filter show dev eth0 parent ffff: > filter protocol ip pref 1 u32 > filter protocol ip pref 1 u32 fh 800: ht divisor 1 > filter protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt > 0 flowid 1:15 > match 0a000015/ffffffff at 12 > action order 1: tablename: mangle hook: NF_IP_PRE_ROUTING > target MARK and 0xffffffff > index 2 ref 1 bind 1 > > filter protocol ip pref 49149 u32 > filter protocol ip pref 49149 u32 fh 804: ht divisor 1 > filter protocol ip pref 49149 u32 fh 804::800 order 2048 key ht 804 > bkt 0 flowid 1:12 > match 00000000/00000000 at 0 > action order 33: tablename: mangle hook: NF_IP_PRE_ROUTING > target CONNMARK and 0x0 > index 123 ref 1 bind 1 > ---------------- > > Pablo, Hasan Chowdhury tells me this broke after iptable 1.4.10 > Hasan also sent me a small patch to fake "xt" instead of "ipt" - but > i think there's more than meets the eye here; some interface we are > using to talk to xtables on user space seems to have changed. The binary interface was broken in 1.4.11 with the guided option parser: commit 7299fa4b615d7f7ee12cde444266f6b31f667f9f Author: Jan Engelhardt <jengelh@medozas.de> Date: Sun Mar 6 15:54:58 2011 +0100 libxt_CONNMARK: use guided option parser You need a patch to use the new interface to stay in sync with current iptables libraries. I'll make it for tc and send it to you. BTW, I think it would be good if we find the way to check for libxtables current version (see iptables/configure.ac), so you can know that we broke binary compatibility again. Cheers, Pablo ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 0:27 ` tc ipt action Pablo Neira Ayuso @ 2012-12-16 0:59 ` Jan Engelhardt 2012-12-16 10:43 ` Jamal Hadi Salim 2012-12-16 10:26 ` Jamal Hadi Salim 1 sibling, 1 reply; 64+ messages in thread From: Jan Engelhardt @ 2012-12-16 0:59 UTC (permalink / raw) To: Pablo Neira Ayuso Cc: Jamal Hadi Salim, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On Sunday 2012-12-16 01:27, Pablo Neira Ayuso wrote: >On Sat, Dec 15, 2012 at 04:19:29PM -0500, Jamal Hadi Salim wrote: >> Example, the following should work for >> tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 >> action ipt -j CONNMARK \ >> action mirred egress redirect dev ifb0 >> >commit 7299fa4b615d7f7ee12cde444266f6b31f667f9f > libxt_CONNMARK: use guided option parser > >BTW, I think it would be good if we find the way to check for >libxtables current version (see iptables/configure.ac), so you can >know that we broke binary compatibility again. For the C level, there is XTABLES_VERSION_CODE. #if XTABLES_VERSION_CODE >= 6 if (m != NULL && m->x6_parse != NULL) m->x6_parse(...) #else else if (m != NULL && m->parse != NULL) m->parse(...) ... We can also export this through pkgconfig, similar to how downstream users are to discover the plugin dir (`pkg-config xtables --variable libdir`). ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 0:59 ` Jan Engelhardt @ 2012-12-16 10:43 ` Jamal Hadi Salim 2012-12-16 17:21 ` Jan Engelhardt 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 10:43 UTC (permalink / raw) To: Jan Engelhardt Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On 12-12-15 07:59 PM, Jan Engelhardt wrote: > > For the C level, there is XTABLES_VERSION_CODE. > > #if XTABLES_VERSION_CODE >= 6 > if (m != NULL && m->x6_parse != NULL) > m->x6_parse(...) > #else > else if (m != NULL && m->parse != NULL) > m->parse(...) > ... > I think you are suggesting this to be done in tc. That would make it easier to fix. IMO, it is easier to keep backward compat if you left the old APIs around for a period of time and maybe log a warning that they will be deprecated over a period of time (sort of like kernel approach to changing APIs). BTW: another interface that seems to have changed that we need is m->final_check(). cheers, jamal > We can also export this through pkgconfig, similar to how > downstream users are to discover the plugin dir > (`pkg-config xtables --variable libdir`). > ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 10:43 ` Jamal Hadi Salim @ 2012-12-16 17:21 ` Jan Engelhardt 2012-12-16 17:47 ` Jamal Hadi Salim 0 siblings, 1 reply; 64+ messages in thread From: Jan Engelhardt @ 2012-12-16 17:21 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On Sunday 2012-12-16 11:43, Jamal Hadi Salim wrote: > On 12-12-15 07:59 PM, Jan Engelhardt wrote: >> > >> For the C level, there is XTABLES_VERSION_CODE. >> >> #if XTABLES_VERSION_CODE >= 6 >> if (m != NULL && m->x6_parse != NULL) >> m->x6_parse(...) >> #else >> else if (m != NULL && m->parse != NULL) >> m->parse(...) >> ... >> > > I think you are suggesting this to be done in tc. That would make it easier to > fix. > IMO, it is easier to keep backward compat if you left the old > APIs around for a period of time As you can see, the old ->parse() that goes back to libxtables.so.0 still remains. It's just that... only 5 out of 99 plugins still come with an old parse function. [m_xt] -> [libxtables] <- (plugins: libxt_*.so) > and maybe log a warning that they > will be deprecated over a period of time (sort of like kernel approach to > changing APIs). old parse has not entered any deprecation stage yet, since there are still plugins out there (both the 5 and external ones) that make use of it. Right now, both parse and x6_parse are valid. > BTW: another interface that seems to have changed that we > need is m->final_check(). Yes, all those with an x6_ prefix are new. Mh, I already dream of plans reducing m_xt to something that does not require libxtables.so anymore. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 17:21 ` Jan Engelhardt @ 2012-12-16 17:47 ` Jamal Hadi Salim 2012-12-16 18:59 ` Jan Engelhardt 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 17:47 UTC (permalink / raw) To: Jan Engelhardt Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On 12-12-16 12:21 PM, Jan Engelhardt wrote: > > > As you can see, the old ->parse() that goes back to libxtables.so.0 > still remains. It's just that... only 5 out of 99 plugins still come > with an old parse function. > I see. So calling m->XXX may not be a wise long term solution. Hasan's patch has a call to xtables_option_tpcall(), if that is the right interface I hope that going forward if any of the m->parseXX changes you will take care of hiding all that stuff. > old parse has not entered any deprecation stage yet, since there are still > plugins out there (both the 5 and external ones) that make use of it. > Right now, both parse and x6_parse are valid. > True - but we are getting broken because we are using a call that only 5 or so users provide. It would have saved us time if we got the a good log message instead of checking for if !m->parse() > Yes, all those with an x6_ prefix are new. > Mh, I already dream of plans reducing m_xt to something that > does not require libxtables.so anymore. > That would be nice - but someone is going to have to link to libxtables, no? cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 17:47 ` Jamal Hadi Salim @ 2012-12-16 18:59 ` Jan Engelhardt 2012-12-16 20:35 ` Jamal Hadi Salim 0 siblings, 1 reply; 64+ messages in thread From: Jan Engelhardt @ 2012-12-16 18:59 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On Sunday 2012-12-16 18:47, Jamal Hadi Salim wrote: > >> old parse has not entered any deprecation stage yet, since there are still >> plugins out there (both the 5 and external ones) that make use of it. >> Right now, both parse and x6_parse are valid. > > True - but we are getting broken because we are using a call that only 5 or so > users provide. It would have saved us time if we got the > a good log message instead of checking for if !m->parse() A certainly safe bet would be to write, at the top of tc/m_xt.c, #if XTABLES_VERSION_CODE > 9 # error Someone call the guy who changed iptables and \ make him fix it^W^W^W^W say you need help. #endif Then I would automatically notify myself of "oh I need fix that too" when I feed any new releases of {iptables, iproute} through the Open Build Service. >> Yes, all those with an x6_ prefix are new. >> Mh, I already dream of plans reducing m_xt to something that >> does not require libxtables.so anymore. > > That would be nice - but someone is going to have to link to libxtables, no? I hope the complete opposite. The following is a rough, it-compiles, untested never-run, draft of a module. The vision here is that userspace only ever sends a chain name to the kernel. The git tree/branch for it is git://git.inai.de/linux xt2-pktsched commit 42c559c148cbbc22bf2cc29f2ad08bc330891838 net_sched: act: new action to call into Xtables2 chains include/net/netfilter/xt_core.h | 8 ++ include/uapi/linux/tc_act/tc_ipt.h | 2 + net/netfilter/xt_core.c | 3 +- net/sched/Kconfig | 9 ++ net/sched/Makefile | 1 + net/sched/act_xtables.c | 238 ++++++++++++++++++++++++++++++++++++ 6 files changed, 260 insertions(+), 1 deletion(-) ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 18:59 ` Jan Engelhardt @ 2012-12-16 20:35 ` Jamal Hadi Salim 2012-12-16 21:21 ` Jan Engelhardt 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 20:35 UTC (permalink / raw) To: Jan Engelhardt Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On 12-12-16 01:59 PM, Jan Engelhardt wrote: > > > A certainly safe bet would be to write, at the top of tc/m_xt.c, > > #if XTABLES_VERSION_CODE > 9 > # error Someone call the guy who changed iptables and \ > make him fix it^W^W^W^W say you need help. > #endif > Excellent idea ;-> > The following is a rough, it-compiles, untested never-run, draft of a > module. The vision here is that userspace only ever sends a chain > name to the kernel. The git tree/branch for it is > > git://git.inai.de/linux xt2-pktsched > > commit 42c559c148cbbc22bf2cc29f2ad08bc330891838 > I'll look at it later - very slow connection at the moment so cloning will take a while. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 20:35 ` Jamal Hadi Salim @ 2012-12-16 21:21 ` Jan Engelhardt 2012-12-17 12:58 ` Jamal Hadi Salim 0 siblings, 1 reply; 64+ messages in thread From: Jan Engelhardt @ 2012-12-16 21:21 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On Sunday 2012-12-16 21:35, Jamal Hadi Salim wrote: >> git://git.inai.de/linux xt2-pktsched >> commit 42c559c148cbbc22bf2cc29f2ad08bc330891838 > > I'll look at it later - very slow connection at the moment so cloning will take > a while. If you have a preexisting clone of any linux tree, you can utilize `git remote add ...` to only grab the deltas. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 21:21 ` Jan Engelhardt @ 2012-12-17 12:58 ` Jamal Hadi Salim 2012-12-17 13:28 ` Jan Engelhardt 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-17 12:58 UTC (permalink / raw) To: Jan Engelhardt Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On 12-12-16 04:21 PM, Jan Engelhardt wrote: > If you have a preexisting clone of any linux tree, you can utilize > `git remote add ...` to only grab the deltas. It downloaded eventually. So looking at this quickly, basic question: is xtables2 different API wise from what we do today in act_ipt? Second: Are chain names unique system wide? i.e at the moment we send a hook and table selection? The patch i have currently for the kernel tries to pursue an approach that maximizes code reuse - depending on your answer I may go the approach of having a separate act_xt and hope you can build on top of that. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-17 12:58 ` Jamal Hadi Salim @ 2012-12-17 13:28 ` Jan Engelhardt 2012-12-18 13:23 ` Jamal Hadi Salim 0 siblings, 1 reply; 64+ messages in thread From: Jan Engelhardt @ 2012-12-17 13:28 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On Monday 2012-12-17 13:58, Jamal Hadi Salim wrote: > On 12-12-16 04:21 PM, Jan Engelhardt wrote: > >> If you have a preexisting clone of any linux tree, you can utilize >> `git remote add ...` to only grab the deltas. > >It downloaded eventually. So looking at this quickly, basic >question: is xtables2 different API wise from what we do today in >act_ipt? AFAICS, (one instance of) act_ipt today directly invokes (exactly one instance of) a target. With act_xt2 as drafted, it instead invokes a chain, which would 1. leave the construction of the target data and calling it to the subsystems they conceptually belong to - the packet filter 2. lets you do matches, jumps and all that. >Second: Are chain names unique system wide? Good thing you ask. Chain names are unique within a netns, and this act_xtables.c draft looks at the packet to get to know its netns, so that seems fine. However, your question also leads to looking at whether TC Actions themselves are sufficiently netns-ified, and it seems this is _not_ the case. Am I right in the observation that variables like "tcf_ipt_ht" are in fact global rather tha per-netns? ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-17 13:28 ` Jan Engelhardt @ 2012-12-18 13:23 ` Jamal Hadi Salim 2012-12-18 13:58 ` Jan Engelhardt 0 siblings, 1 reply; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-18 13:23 UTC (permalink / raw) To: Jan Engelhardt Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On 12-12-17 08:28 AM, Jan Engelhardt wrote: > > On Monday 2012-12-17 13:58, Jamal Hadi Salim wrote: > AFAICS, (one instance of) act_ipt today directly invokes (exactly one > instance of) a target. Design intent. You can have the same target instance used by specifying the same index on the command line. >With act_xt2 as drafted, it instead invokes a chain, which would > > 1. leave the construction of the target data and calling it > to the subsystems they conceptually belong to - the packet filter > 2. lets you do matches, jumps and all that. > I like #2. For #1 as long as it doesnt deviate from desire to have one or more instances of targets, we should be fine. > Good thing you ask. Chain names are unique within a netns, and this > act_xtables.c draft looks at the packet to get to know its netns, so > that seems fine. My motivation for that question: Is it possible to ignore the hook and tablename and just use the chain name? > However, your question also leads to looking at whether TC Actions > themselves are sufficiently netns-ified, and it seems this is _not_ > the case. Am I right in the observation that variables like > "tcf_ipt_ht" are in fact global rather tha per-netns? In general we dont need to worry about netns since actions are attached to the filters which are dependent on qdiscs which are dependent on netdevs which are per netns. I believe actions (not filters or qdiscs) have a way where this can be circumvented in one scenario (I can configure them bypassing the filter interface). Thanks for bringing this up - I will look at it. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-18 13:23 ` Jamal Hadi Salim @ 2012-12-18 13:58 ` Jan Engelhardt 2012-12-19 11:43 ` Jamal Hadi Salim 0 siblings, 1 reply; 64+ messages in thread From: Jan Engelhardt @ 2012-12-18 13:58 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On Tuesday 2012-12-18 14:23, Jamal Hadi Salim wrote: > On 12-12-17 08:28 AM, Jan Engelhardt wrote: >> >> With act_xt2 as drafted, it instead invokes a chain, which would >> >> 1. leave the construction of the target data and calling it >> to the subsystems they conceptually belong to - the packet filter >> 2. lets you do matches, jumps and all that. > >I like #2. For #1 as long as it doesnt deviate from desire to have >one or more instances of targets, we should be fine. Chains can store multiple targets, so no loss. >> Good thing you ask. Chain names are unique within a netns, and this >> act_xtables.c draft looks at the packet to get to know its netns, so >> that seems fine. > > My motivation for that question: > Is it possible to ignore the hook and tablename and just use the chain > name? 1. table First, I think some targets need to relax their restrictions, such as with xt_DSCP. Then, only a handful of extensions remain: CT, <all NATs>, TPROXY and REJECT. Would anyone want to call these from act_ipt? I doubt it. :) 2. hooks Extensions with hook limit: <NAT>, TPROXY, REJECT, CLASSIFY. Again, I don't quite see the value of attempting to NAT from act_ipt. CLASSIFY {c|sh?}ould be relaxed, unless I am missing something. ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-18 13:58 ` Jan Engelhardt @ 2012-12-19 11:43 ` Jamal Hadi Salim 0 siblings, 0 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-19 11:43 UTC (permalink / raw) To: Jan Engelhardt Cc: Pablo Neira Ayuso, Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel On 12-12-18 08:58 AM, Jan Engelhardt wrote: > > Chains can store multiple targets, so no loss. Nice. > 1. table > > First, I think some targets need to relax their restrictions, such as > with xt_DSCP. Saw your other patch to get rid of mangle hardcoding. > Then, only a handful of extensions remain: CT, <all NATs>, > TPROXY and REJECT. Would anyone want to call these from act_ipt? > I doubt it. :) > Tempted to say tproxy. > 2. hooks > > Extensions with hook limit: <NAT>, TPROXY, REJECT, CLASSIFY. > Again, I don't quite see the value of attempting to NAT from act_ipt. > CLASSIFY {c|sh?}ould be relaxed, unless I am missing something. > I could live with that. It would be an improvement over whats there today. I would prefer however for this to be an improvement over act_xt.c i posted as opposed to have even more interfaces for xt. We've suffered enough already ;-> i.e add your patches on top. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: tc ipt action 2012-12-16 0:27 ` tc ipt action Pablo Neira Ayuso 2012-12-16 0:59 ` Jan Engelhardt @ 2012-12-16 10:26 ` Jamal Hadi Salim 1 sibling, 0 replies; 64+ messages in thread From: Jamal Hadi Salim @ 2012-12-16 10:26 UTC (permalink / raw) To: Pablo Neira Ayuso Cc: Yury Stankevich, shemonc, netdev@vger.kernel.org, netfilter-devel Hi Pablo, On 12-12-15 07:27 PM, Pablo Neira Ayuso wrote: > The binary interface was broken in 1.4.11 with the guided option > parser: Ah. Thanks that would explain it. > You need a patch to use the new interface to stay in sync with current > iptables libraries. I'll make it for tc and send it to you. > Much thanks. I just scanned it and things have changed; old way used to take multiparams. New one a single struct, so would have taken much longer for me to resolve. > BTW, I think it would be good if we find the way to check for > libxtables current version (see iptables/configure.ac), so you can > know that we broke binary compatibility again. > will do. cheers, jamal ^ permalink raw reply [flat|nested] 64+ messages in thread
end of thread, other threads:[~2012-12-26 23:10 UTC | newest]
Thread overview: 64+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <50C4821D.5090206@gmail.com>
     [not found] ` <50C9B4BB.9060609@mojatatu.com>
2012-12-15 21:19   ` tc ipt action Jamal Hadi Salim
2012-12-15 23:06     ` Jan Engelhardt
2012-12-16  0:26       ` Jan Engelhardt
2012-12-16  0:32         ` [PATCH] build: unbreak linkage of m_xt.so Jan Engelhardt
2012-12-16 10:30           ` Jamal Hadi Salim
2012-12-16 17:03             ` Jamal Hadi Salim
2012-12-16 17:43               ` Jan Engelhardt
2012-12-16 18:05                 ` Jamal Hadi Salim
2012-12-16 22:02           ` Mike Frysinger
2012-12-18 17:21           ` Stephen Hemminger
2012-12-18 18:47             ` Mike Frysinger
2012-12-20  0:03               ` Stephen Hemminger
2012-12-16 10:22       ` tc ipt action Jamal Hadi Salim
     [not found]       ` <CAASe=fQT2pVOK0uctdaKL+aOrF8nYeTMfoF15kmd-rC02+7Vnw@mail.gmail.com>
2012-12-16 16:48         ` Jamal Hadi Salim
2012-12-16 18:59           ` Jamal Hadi Salim
2012-12-16 19:13             ` Jan Engelhardt
2012-12-16 20:36               ` Jamal Hadi Salim
2012-12-16 20:41               ` [PATCH] iproute2: act_ipt fix xtables breakage Jamal Hadi Salim
2012-12-17 12:30                 ` RFC [PATCH] iproute2: temporary solution to fix xt breakage Jamal Hadi Salim
2012-12-17 16:12                   ` Stephen Hemminger
2012-12-19 11:36                     ` Jamal Hadi Salim
     [not found]                   ` <CAASe=fRuJdtisEvp7uo=PHwN3nKHqsYDW4Om1gk2MK-vyNvBrA@mail.gmail.com>
2012-12-18 12:28                     ` Jamal Hadi Salim
     [not found]                       ` <CAASe=fR6Hm2dxp=1wDchtrzqnaH6qacHpg2wrsqLfmGpPbQ9Fg@mail.gmail.com>
2012-12-19 11:44                         ` Jamal Hadi Salim
2012-12-19 11:56                           ` [PATCH] pkt_sched: act_xt support new Xtables interface Jamal Hadi Salim
2012-12-19 15:52                             ` Jan Engelhardt
2012-12-19 23:05                               ` Jamal Hadi Salim
     [not found]                             ` <CAASe=fQZGwjM_2PStRE0tje33Doi6TuwJJ3p7x-SRcwq3mQvRg@mail.gmail.com>
2012-12-19 23:00                               ` Jamal Hadi Salim
2012-12-20  8:54                             ` Yury Stankevich
2012-12-20 12:35                               ` Jamal Hadi Salim
2012-12-20 14:59                                 ` Yury Stankevich
2012-12-21 13:03                                   ` Jamal Hadi Salim
2012-12-21 13:13                                     ` Yury Stankevich
2012-12-21 13:50                                       ` Jamal Hadi Salim
2012-12-21 14:14                                         ` Yury Stankevich
2012-12-22 13:19                                           ` Jamal Hadi Salim
2012-12-22 13:43                                             ` Jan Engelhardt
2012-12-22 13:56                                               ` Jamal Hadi Salim
2012-12-22 13:58                                             ` Yury Stankevich
2012-12-22 14:04                                               ` Florian Westphal
2012-12-22 14:09                                               ` Jamal Hadi Salim
2012-12-24 11:34                                                 ` Jamal Hadi Salim
2012-12-24 11:49                                                   ` Felix Fietkau
2012-12-24 12:19                                                     ` Jamal Hadi Salim
2012-12-24 13:12                                                     ` Pablo Neira Ayuso
2012-12-24 14:05                                                       ` Jamal Hadi Salim
2012-12-24 18:19                                                         ` Pablo Neira Ayuso
2012-12-26 23:10                                                           ` Pablo Neira Ayuso
2012-12-21 14:35                                         ` Jan Engelhardt
2012-12-21 15:45                                           ` Eric Dumazet
2012-12-22 13:42                                             ` Jamal Hadi Salim
2012-12-16  0:27     ` tc ipt action Pablo Neira Ayuso
2012-12-16  0:59       ` Jan Engelhardt
2012-12-16 10:43         ` Jamal Hadi Salim
2012-12-16 17:21           ` Jan Engelhardt
2012-12-16 17:47             ` Jamal Hadi Salim
2012-12-16 18:59               ` Jan Engelhardt
2012-12-16 20:35                 ` Jamal Hadi Salim
2012-12-16 21:21                   ` Jan Engelhardt
2012-12-17 12:58                     ` Jamal Hadi Salim
2012-12-17 13:28                       ` Jan Engelhardt
2012-12-18 13:23                         ` Jamal Hadi Salim
2012-12-18 13:58                           ` Jan Engelhardt
2012-12-19 11:43                             ` Jamal Hadi Salim
2012-12-16 10:26       ` Jamal Hadi Salim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).