* [NET]: should explicitely initialize atomic_t field in struct dst_ops
From: Eric Dumazet @ 2008-01-24 15:11 UTC (permalink / raw)
To: David Miller; +Cc: netdev@vger.kernel.org
All but one struct dst_ops static initializations miss explicit
initialization of entries field.
As this field is atomic_t, we should use ATOMIC_INIT(0), and not
rely on atomic_t implementation.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
net/ipv4/route.c | 2 ++
net/ipv4/xfrm4_policy.c | 1 +
net/ipv6/route.c | 2 ++
net/ipv6/xfrm6_policy.c | 1 +
4 files changed, 6 insertions(+)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 896c768..163086b 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -169,6 +169,7 @@ static struct dst_ops ipv4_dst_ops = {
.update_pmtu = ip_rt_update_pmtu,
.local_out = ip_local_out,
.entry_size = sizeof(struct rtable),
+ .entries = ATOMIC_INIT(0),
};
#define ECN_OR_COST(class) TC_PRIO_##class
@@ -2498,6 +2499,7 @@ static struct dst_ops ipv4_dst_blackhole_ops = {
.check = ipv4_dst_check,
.update_pmtu = ipv4_rt_blackhole_update_pmtu,
.entry_size = sizeof(struct rtable),
+ .entries = ATOMIC_INIT(0),
};
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index 3783e3e..10ed704 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -247,6 +247,7 @@ static struct dst_ops xfrm4_dst_ops = {
.local_out = __ip_local_out,
.gc_thresh = 1024,
.entry_size = sizeof(struct xfrm_dst),
+ .entries = ATOMIC_INIT(0),
};
static struct xfrm_policy_afinfo xfrm4_policy_afinfo = {
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4004c5f..8d669bb 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -107,6 +107,7 @@ static struct dst_ops ip6_dst_ops = {
.update_pmtu = ip6_rt_update_pmtu,
.local_out = ip6_local_out,
.entry_size = sizeof(struct rt6_info),
+ .entries = ATOMIC_INIT(0),
};
static void ip6_rt_blackhole_update_pmtu(struct dst_entry *dst, u32 mtu)
@@ -120,6 +121,7 @@ static struct dst_ops ip6_dst_blackhole_ops = {
.check = ip6_dst_check,
.update_pmtu = ip6_rt_blackhole_update_pmtu,
.entry_size = sizeof(struct rt6_info),
+ .entries = ATOMIC_INIT(0),
};
struct rt6_info ip6_null_entry = {
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index c25a6b5..7d20199 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -272,6 +272,7 @@ static struct dst_ops xfrm6_dst_ops = {
.local_out = __ip6_local_out,
.gc_thresh = 1024,
.entry_size = sizeof(struct xfrm_dst),
+ .entries = ATOMIC_INIT(0),
};
static struct xfrm_policy_afinfo xfrm6_policy_afinfo = {
^ permalink raw reply related
* Re: [PATCH net-2.6.25] Add packet filtering based on process's security context.
From: Paul Moore @ 2008-01-24 15:03 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: netdev, davem, linux-security-module, netfilter-devel
In-Reply-To: <200801242047.JEI35479.OJLFHMtOOFQFVS@I-love.SAKURA.ne.jp>
On Thursday 24 January 2008 6:47:55 am Tetsuo Handa wrote:
> Are there any remaining questions/problems about this patch?
> If none, I want this patch applied to net-2.6.25 tree.
Hello,
Taking into consideration that there are no current in-tree users of
this patch and the only known user of this functionality is TOMOYO,
which is still dealing with some unresolved VFS issues, I suggest not
merging this patch at the current time. My recommendation is to
continue to work on resolving the VFS issues (which it appears you are
working on) and then submitting all of the required TOMOYO changes at
once.
As a general rule, removing functionality from the kernel tends to be
much more difficult then adding it.
--
paul moore
linux security @ hp
^ permalink raw reply
* Re: [Bugme-new] [Bug 9808] New: system hung with htb QoS
From: Andrew Morton @ 2008-01-24 14:11 UTC (permalink / raw)
To: netdev; +Cc: bilias, bugme-daemon, Auke Kok, Jesse Brandeburg
In-Reply-To: <bug-9808-10286@http.bugzilla.kernel.org/>
> On Thu, 24 Jan 2008 03:03:11 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9808
>
> Summary: system hung with htb QoS
> Product: Networking
> Version: 2.5
> KernelVersion: 2.6.23.9
> Platform: All
> OS/Version: Linux
> Tree: Fedora
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Netfilter/Iptables
> AssignedTo: networking_netfilter-iptables@kernel-bugs.osdl.org
> ReportedBy: bilias@edu.physics.uoc.gr
>
>
> Hi,
>
> I've setup QoS on my ftp server to limit outgoing traffic. Apparently the
> server
> stops responding (no output no keyboard) in an unpredictable manner. Sometimes
> it
> takes an hour, sometimes up to 4 days for the system to hung.
>
> I have attached my QoS startup script, dmesg output,
> lspci -vvv, iptables that interact with QoS.
>
> I'm also receiving this quite often:
> Jan 15 12:23:17 ftp kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit
> Hang
> Jan 15 12:23:17 ftp kernel: Tx Queue <0>
> Jan 15 12:23:17 ftp kernel: TDH <2a>
> Jan 15 12:23:17 ftp kernel: TDT <17>
> Jan 15 12:23:17 ftp kernel: next_to_use <17>
> Jan 15 12:23:17 ftp kernel: next_to_clean <2a>
> Jan 15 12:23:17 ftp kernel: buffer_info[next_to_clean]
> Jan 15 12:23:17 ftp kernel: time_stamp <5798144>
> Jan 15 12:23:17 ftp kernel: next_to_watch <2d>
> Jan 15 12:23:17 ftp kernel: jiffies <57988ef>
> Jan 15 12:23:17 ftp kernel: next_to_watch.status <0>
> Jan 15 12:23:19 ftp kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit
> Hang
>
> Today for the first time (after applying options to e1000 driver in
> modprobe.conf) I got a kernel panic:
>
> BUG: unable to handle kernel paging request at virtual address a0379120
> EIP: 0060: [<c05db2dc>] Not Tainted VLI
> EIP is at ip_rcv+0x286/0x4ba
> Kernel panic - not syncing: Fatal exception in interrupt
>
> This is what I wrote on paper cause there wasn't logged anywhere.
> Usually it hungs without a kernel panic.
>
> System in Fedoca Core 8 up2date
> 2.6.23.9-85.fc8PAE
> 2x Intel(R) Xeon(TM) CPU 3.20GHz
> 4G RAM
>
> Without the QoS loaded system never hungs. It must be related to this. However
> the e1000 error I'm receiving must have to do with the e1000 driver. I've seen
> this bug in the past that's why I tried to apply the options in modprobe.conf
>
> any help will be appreciated
> thanx in advance
>
> Giannis
>
> QoS startup script:
> # default WAN limit
> LIMIT="80mbit"
> LOW_LIMIT="50mbit"
>
> start() {
> echo -n "Starting QoS: (WAN limit set to ${LIMIT})"
> tc qdisc del dev eth0 root 2> /dev/null > /dev/null
> tc qdisc del dev eth0 ingress 2> /dev/null > /dev/null
> ADD_CLASS="tc class add dev eth0 "
> ###### uplink
> # install root HTB, point default traffic to 1:25
> tc qdisc add dev eth0 root handle 1: htb default 25
>
> tc class add dev eth0 parent 1: classid 1:1 htb rate 1000mbit
> # class for outgoing SYN packets + Minimize-Delay TOS
> ${ADD_CLASS} parent 1:1 classid 1:11 htb rate 2mbit ceil 5mbit prio 1
> # class for internal LAN traffic
> ${ADD_CLASS} parent 1:1 classid 1:12 htb rate 500mbit ceil 800mbit prio 2
> # class for WAN traffic
> ${ADD_CLASS} parent 1:1 classid 1:2 htb rate ${LIMIT} ceil ${LIMIT} prio 3
> # class for WAN http traffic
> ${ADD_CLASS} parent 1:2 classid 1:24 htb rate 30mbit ceil ${LIMIT} prio 4
> # default class, rest WAN traffic
> ${ADD_CLASS} parent 1:2 classid 1:25 htb rate 20mbit ceil ${LIMIT} prio 5
>
> tc filter add dev eth0 protocol ip parent 1:0 prio 1 handle 1 fw flowid 1:11
> tc filter add dev eth0 protocol ip parent 1:0 prio 2 handle 2 fw flowid 1:12
> tc filter add dev eth0 protocol ip parent 1:0 prio 4 u32 \
> match ip sport 80 0xffff flowid 1:24
>
> tc qdisc add dev eth0 parent 1:11 handle 11: sfq perturb 10
> tc qdisc add dev eth0 parent 1:12 handle 12: sfq perturb 10
> tc qdisc add dev eth0 parent 1:24 handle 24: sfq perturb 10
> tc qdisc add dev eth0 parent 1:25 handle 25: sfq perturb 10
>
> echo
> }
>
> stop() {
> echo -n "Stopping QoS: "
> tc qdisc del dev eth0 root 2> /dev/null > /dev/null
> tc qdisc del dev eth0 ingress 2> /dev/null > /dev/null
> echo
> }
>
> -------------------
> QoS startup script: http://www.edu.physics.uoc.gr/~bilias/ftp/QoS
> dmesg: http://www.edu.physics.uoc.gr/~bilias/ftp/dmesg
> lspci -vvv: http://www.edu.physics.uoc.gr/~bilias/ftp/lspci
> iptables for QoS: http://www.edu.physics.uoc.gr/~bilias/ftp/iptables
>
> modprobe.conf options for e1000:
> options e1000 XsumRX=0 Speed=1000 Duplex=2 InterruptThrottleRate=0
> FlowControl=3 RxDescriptors=4096 TxDescriptors=4096 RxIntDelay=0 TxIntDelay=0
>
>
> --
> Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply
* [PATCH net-2.6.25][NETNS]: Fix race between put_net() and netlink_kernel_create().
From: Pavel Emelyanov @ 2008-01-24 13:15 UTC (permalink / raw)
To: David Miller; +Cc: Denis Lunev, Linux Netdev List, devel, Alexey Dobriyan
The comment about "race free view of the set of network
namespaces" was a bit hasty. Look (there even can be only
one CPU, as discovered by Alexey Dobriyan and Denis Lunev):
put_net()
if (atomic_dec_and_test(&net->refcnt))
/* true */
__put_net(net);
queue_work(...);
/*
* note: the net now has refcnt 0, but still in
* the global list of net namespaces
*/
== re-schedule ==
register_pernet_subsys(&some_ops);
register_pernet_operations(&some_ops);
(*some_ops)->init(net);
/*
* we call netlink_kernel_create() here
* in some places
*/
netlink_kernel_create();
sk_alloc();
get_net(net); /* refcnt = 1 */
/*
* now we drop the net refcount not to
* block the net namespace exit in the
* future (or this can be done on the
* error path)
*/
put_net(sk->sk_net);
if (atomic_dec_and_test(&...))
/*
* true. BOOOM! The net is
* scheduled for release twice
*/
When thinking on this problem, I decided, that getting and
putting the net in init callback is wrong. If some init
callback needs to have a refcount-less reference on the struct
net, _it_ has to be careful himself, rather than relying on
the infrastructure to handle this correctly.
In case of netlink_kernel_create(), the problem is that the
sk_alloc() gets the given namespace, but passing the info
that we don't want to get it inside this call is too heavy.
Instead, I propose to crate the socket inside an init_net
namespace and then re-attach it to the desired one right
after the socket is created.
After doing this, we also have to be careful on error paths
not to drop the reference on the namespace, we didn't get
the one on.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis Lunev <den@openvz.org>
---
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 6b178e1..ff9fb6b 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1344,6 +1344,22 @@ static void netlink_data_ready(struct sock *sk, int len)
* queueing.
*/
+static void __netlink_release(struct sock *sk)
+{
+ /*
+ * Last sock_put should drop referrence to sk->sk_net. It has already
+ * been dropped in netlink_kernel_create. Taking referrence to stopping
+ * namespace is not an option.
+ * Take referrence to a socket to remove it from netlink lookup table
+ * _alive_ and after that destroy it in the context of init_net.
+ */
+
+ sock_hold(sk);
+ sock_release(sk->sk_socket);
+ sk->sk_net = get_net(&init_net);
+ sock_put(sk);
+}
+
struct sock *
netlink_kernel_create(struct net *net, int unit, unsigned int groups,
void (*input)(struct sk_buff *skb),
@@ -1362,8 +1378,18 @@ netlink_kernel_create(struct net *net, int unit, unsigned int groups,
if (sock_create_lite(PF_NETLINK, SOCK_DGRAM, unit, &sock))
return NULL;
- if (__netlink_create(net, sock, cb_mutex, unit) < 0)
- goto out_sock_release;
+ /*
+ * We have to just have a reference on the net from sk, but don't
+ * get_net it. Besides, we cannot get and then put the net here.
+ * So we create one inside init_net and the move it to net.
+ */
+
+ if (__netlink_create(&init_net, sock, cb_mutex, unit) < 0)
+ goto out_sock_release_nosk;
+
+ sk = sock->sk;
+ put_net(sk->sk_net);
+ sk->sk_net = net;
if (groups < 32)
groups = 32;
@@ -1372,7 +1398,6 @@ netlink_kernel_create(struct net *net, int unit, unsigned int groups,
if (!listeners)
goto out_sock_release;
- sk = sock->sk;
sk->sk_data_ready = netlink_data_ready;
if (input)
nlk_sk(sk)->netlink_rcv = input;
@@ -1395,14 +1420,14 @@ netlink_kernel_create(struct net *net, int unit, unsigned int groups,
nl_table[unit].registered++;
}
netlink_table_ungrab();
-
- /* Do not hold an extra referrence to a namespace as this socket is
- * internal to a namespace and does not prevent it to stop. */
- put_net(net);
return sk;
out_sock_release:
kfree(listeners);
+ __netlink_release(sk);
+ return NULL;
+
+out_sock_release_nosk:
sock_release(sock);
return NULL;
}
@@ -1415,18 +1440,7 @@ netlink_kernel_release(struct sock *sk)
if (sk == NULL || sk->sk_socket == NULL)
return;
- /*
- * Last sock_put should drop referrence to sk->sk_net. It has already
- * been dropped in netlink_kernel_create. Taking referrence to stopping
- * namespace is not an option.
- * Take referrence to a socket to remove it from netlink lookup table
- * _alive_ and after that destroy it in the context of init_net.
- */
- sock_hold(sk);
- sock_release(sk->sk_socket);
-
- sk->sk_net = get_net(&init_net);
- sock_put(sk);
+ __netlink_release(sk);
}
EXPORT_SYMBOL(netlink_kernel_release);
^ permalink raw reply related
* Re: 2.6.24-rc8-mm1 : net tcp_input.c warnings
From: Kamalesh Babulal @ 2008-01-24 13:11 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Dave Young, Kamalesh Babulal, Krishna Kumar2,
Denys Fedoryshchenko, David Miller, LKML, Netdev, Andrew Morton
In-Reply-To: <Pine.LNX.4.64.0801241108400.31652@kivilampi-30.cs.helsinki.fi>
On Thu, Jan 24, 2008 at 11:54:18AM +0200, Ilpo Järvinen wrote:
> On Thu, 24 Jan 2008, Dave Young wrote:
>
> Hi Dave (& others),
>
> > Thanks.
>
> Thanks a lot, I was first to ignore all these because they occurred
> with newreno, but looked again... :-/
>
> > New warning trigged with your debug patch:
>
> This was probably with the earlier one I sent to you because there's still
> this case remaining which itself is valid:
>
> > P: 5 L: 0 vs 0 S: 0 vs 1 w: 2044790889-2044796616 (0)
>
> ...snip... this is still ok state (S+L <= P):
>
> > P: 5 L: 0 vs 0 S: 0 vs 3 w: 2044790889-2044796616 (0)
> > TCP wq(s) <
> > TCP wq(h) +++h+<
> > l0 s3 f0 p5 seq: su2044790889 hs2044795029 sn2044796616
> > ------------[ cut here ]------------
> > WARNING: at net/ipv4/tcp_input.c:2169 tcp_mark_head_lost+0x122/0x150()
> > Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event
> > snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss eeprom e100 psmouse
> > snd_hda_intel snd_pcm snd_timer btusb bluetooth serio_raw snd 3c59x sg
> > evdev thermal soundcore rtc_cmos snd_page_alloc rtc_core rtc_lib
> > i2c_i801 processor button intel_agp dcdbas pcspkr agpgart
> > Pid: 0, comm: swapper Not tainted 2.6.24-rc8-mm1 #8
> > [<c0132100>] ? have_callable_console+0x20/0x30
> > [<c0131844>] warn_on_slowpath+0x54/0x80
> > [<c03ffe54>] ? tcp_print_queue+0x1a4/0x230
> > [<c0132438>] ? vprintk+0x308/0x320
.
.
<snip>
.
.
> > ---[ end trace 14b601818e6903ac ]---
>
> ...But this no longer is, and even more, L: 5 is not valid state at this
> point all (should only happen if we went to RTO but it would reset S to
> zero with newreno):
>
> > P: 5 L: 5 vs 5 S: 0 vs 3 w: 2044790889-2044796616 (0)
> > TCP wq(s) LLLLl<
> > TCP wq(h) +++h+<
> > l5 s3 f0 p5 seq: su2044790889 hs2044795029 sn2044796616
>
> Surprisingly, it was the first time the WARN_ON for left_out returned
> correct location. This also explains why the patch I sent to Krishna
> didn't print anything (it didn't end up into printing because I forgot
> to add L+S>P check into to the state checking if).
>
> ...so please, could you (others than Denys) try this patch, it should
> solve the issue. And Denys, could you confirm (and if necessary double
> check) that the kernel you saw this similar problem with is the pure
> Linus' mainline, i.e., without any net-2.6.25 or mm bits please, if so,
> that problem persists. And anyway, there were some fackets_out related
> problems reported as well and this doesn't help for that but I think I've
> lost track of who was seeing it due to large number of reports :-), could
> somebody refresh my memory because I currently don't have time to dig it
> up from archives (at least on this week).
>
>
> --
> i.
>
> --
> [PATCH] [TCP]: NewReno must count every skb while marking losses
>
> NewReno should add cnt per skb (as with FACK) instead of depending
> on SACKED_ACKED bits which won't be set with it at all.
> Effectively, NewReno should always exists after the first
> iteration anyway (or immediately if there's already head in
> lost_out.
>
> This was fixed earlier in net-2.6.25 but got reverted among other
> stuff and I didn't notice that this is still necessary (actually
> wasn't even considering this case while trying to figure out the
> reports because I lived with different kind of code than it in
> reality was).
>
> This should solve the WARN_ONs in TCP code that as a result of
> this triggered multiple times in every place we check for this
> invariant.
>
> Special thanks to Dave Young <hidave.darkstar@gmail.com> and
> Krishna Kumar2 <krkumar2@in.ibm.com> for trying with my debug
> patches.
Hi,
Thanks, after applying the patch the warning is not seen.
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> ---
> net/ipv4/tcp_input.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 295490e..aa409a5 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2156,7 +2156,7 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int fast_rexmit)
> tp->lost_skb_hint = skb;
> tp->lost_cnt_hint = cnt;
>
> - if (tcp_is_fack(tp) ||
> + if (tcp_is_fack(tp) || tcp_is_reno(tp) ||
> (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED))
> cnt += tcp_skb_pcount(skb);
>
> --
> 1.5.2.2
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
^ permalink raw reply
* [PATCH 5/5] netns netfilter: put table module on netns stop
From: Alexey Dobriyan @ 2008-01-24 12:30 UTC (permalink / raw)
To: kaber; +Cc: netdev, netfilter-devel, devel
When number of entries exceeds number of initial entries, foo-tables code
will pin table module. But during table unregister on netns stop,
that additional pin was forgotten.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
---
net/ipv4/netfilter/arp_tables.c | 3 +++
net/ipv4/netfilter/ip_tables.c | 3 +++
net/ipv6/netfilter/ip6_tables.c | 3 +++
3 files changed, 9 insertions(+)
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -1773,6 +1773,7 @@ void arpt_unregister_table(struct arpt_table *table)
{
struct xt_table_info *private;
void *loc_cpu_entry;
+ struct module *table_owner = table->me;
private = xt_unregister_table(table);
@@ -1780,6 +1781,8 @@ void arpt_unregister_table(struct arpt_table *table)
loc_cpu_entry = private->entries[raw_smp_processor_id()];
ARPT_ENTRY_ITERATE(loc_cpu_entry, private->size,
cleanup_entry, NULL);
+ if (private->number > private->initial_entries)
+ module_put(table_owner);
xt_free_table_info(private);
}
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -2095,12 +2095,15 @@ void ipt_unregister_table(struct xt_table *table)
{
struct xt_table_info *private;
void *loc_cpu_entry;
+ struct module *table_owner = table->me;
private = xt_unregister_table(table);
/* Decrease module usage counts and free resources */
loc_cpu_entry = private->entries[raw_smp_processor_id()];
IPT_ENTRY_ITERATE(loc_cpu_entry, private->size, cleanup_entry, NULL);
+ if (private->number > private->initial_entries)
+ module_put(table_owner);
xt_free_table_info(private);
}
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -2120,12 +2120,15 @@ void ip6t_unregister_table(struct xt_table *table)
{
struct xt_table_info *private;
void *loc_cpu_entry;
+ struct module *table_owner = table->me;
private = xt_unregister_table(table);
/* Decrease module usage counts and free resources */
loc_cpu_entry = private->entries[raw_smp_processor_id()];
IP6T_ENTRY_ITERATE(loc_cpu_entry, private->size, cleanup_entry, NULL);
+ if (private->number > private->initial_entries)
+ module_put(table_owner);
xt_free_table_info(private);
}
^ permalink raw reply
* [PATCH 4/5] netns netfilter: per-netns arp_tables FILTER
From: Alexey Dobriyan @ 2008-01-24 12:29 UTC (permalink / raw)
To: kaber; +Cc: netdev, netfilter-devel, devel
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
---
include/net/netns/ipv4.h | 1
net/ipv4/netfilter/arptable_filter.c | 38 +++++++++++++++++++++++++----------
2 files changed, 29 insertions(+), 10 deletions(-)
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -32,6 +32,7 @@ struct netns_ipv4 {
struct xt_table *iptable_filter;
struct xt_table *iptable_mangle;
struct xt_table *iptable_raw;
+ struct xt_table *arptable_filter;
#endif
};
#endif
--- a/net/ipv4/netfilter/arptable_filter.c
+++ b/net/ipv4/netfilter/arptable_filter.c
@@ -20,7 +20,7 @@ static struct
struct arpt_replace repl;
struct arpt_standard entries[3];
struct arpt_error term;
-} initial_table __initdata = {
+} initial_table __net_initdata = {
.repl = {
.name = "filter",
.valid_hooks = FILTER_VALID_HOOKS,
@@ -45,7 +45,7 @@ static struct
.term = ARPT_ERROR_INIT,
};
-static struct arpt_table __packet_filter = {
+static struct arpt_table packet_filter = {
.name = "filter",
.valid_hooks = FILTER_VALID_HOOKS,
.lock = RW_LOCK_UNLOCKED,
@@ -53,7 +53,6 @@ static struct arpt_table __packet_filter = {
.me = THIS_MODULE,
.af = NF_ARP,
};
-static struct arpt_table *packet_filter;
/* The work comes in here from netfilter.c */
static unsigned int arpt_hook(unsigned int hook,
@@ -62,7 +61,7 @@ static unsigned int arpt_hook(unsigned int hook,
const struct net_device *out,
int (*okfn)(struct sk_buff *))
{
- return arpt_do_table(skb, hook, in, out, packet_filter);
+ return arpt_do_table(skb, hook, in, out, init_net.ipv4.arptable_filter);
}
static struct nf_hook_ops arpt_ops[] __read_mostly = {
@@ -86,14 +85,33 @@ static struct nf_hook_ops arpt_ops[] __read_mostly = {
},
};
+static int __net_init arptable_filter_net_init(struct net *net)
+{
+ /* Register table */
+ net->ipv4.arptable_filter =
+ arpt_register_table(net, &packet_filter, &initial_table.repl);
+ if (IS_ERR(net->ipv4.arptable_filter))
+ return PTR_ERR(net->ipv4.arptable_filter);
+ return 0;
+}
+
+static void __net_exit arptable_filter_net_exit(struct net *net)
+{
+ arpt_unregister_table(net->ipv4.arptable_filter);
+}
+
+static struct pernet_operations arptable_filter_net_ops = {
+ .init = arptable_filter_net_init,
+ .exit = arptable_filter_net_exit,
+};
+
static int __init arptable_filter_init(void)
{
int ret;
- /* Register table */
- packet_filter = arpt_register_table(&init_net, &__packet_filter, &initial_table.repl);
- if (IS_ERR(packet_filter))
- return PTR_ERR(packet_filter);
+ ret = register_pernet_subsys(&arptable_filter_net_ops);
+ if (ret < 0)
+ return ret;
ret = nf_register_hooks(arpt_ops, ARRAY_SIZE(arpt_ops));
if (ret < 0)
@@ -101,14 +119,14 @@ static int __init arptable_filter_init(void)
return ret;
cleanup_table:
- arpt_unregister_table(packet_filter);
+ unregister_pernet_subsys(&arptable_filter_net_ops);
return ret;
}
static void __exit arptable_filter_fini(void)
{
nf_unregister_hooks(arpt_ops, ARRAY_SIZE(arpt_ops));
- arpt_unregister_table(packet_filter);
+ unregister_pernet_subsys(&arptable_filter_net_ops);
}
module_init(arptable_filter_init);
^ permalink raw reply
* [PATCH 3/5] netns netfilter: per-netns arp_tables
From: Alexey Dobriyan @ 2008-01-24 12:28 UTC (permalink / raw)
To: kaber; +Cc: netdev, netfilter-devel, devel
* Propagate netns from userspace.
* arpt_register_table() registers table in supplied netns.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
---
include/linux/netfilter_arp/arp_tables.h | 3 +
net/ipv4/netfilter/arp_tables.c | 55 +++++++++++++++++--------------
net/ipv4/netfilter/arptable_filter.c | 2 -
3 files changed, 34 insertions(+), 26 deletions(-)
--- a/include/linux/netfilter_arp/arp_tables.h
+++ b/include/linux/netfilter_arp/arp_tables.h
@@ -271,7 +271,8 @@ struct arpt_error
xt_register_target(tgt); })
#define arpt_unregister_target(tgt) xt_unregister_target(tgt)
-extern struct arpt_table *arpt_register_table(struct arpt_table *table,
+extern struct arpt_table *arpt_register_table(struct net *net,
+ struct arpt_table *table,
const struct arpt_replace *repl);
extern void arpt_unregister_table(struct arpt_table *table);
extern unsigned int arpt_do_table(struct sk_buff *skb,
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -22,6 +22,7 @@
#include <linux/mutex.h>
#include <linux/err.h>
#include <net/compat.h>
+#include <net/sock.h>
#include <asm/uaccess.h>
#include <linux/netfilter/x_tables.h>
@@ -850,7 +851,7 @@ static int compat_table_info(const struct xt_table_info *info,
}
#endif
-static int get_info(void __user *user, int *len, int compat)
+static int get_info(struct net *net, void __user *user, int *len, int compat)
{
char name[ARPT_TABLE_MAXNAMELEN];
struct arpt_table *t;
@@ -870,7 +871,7 @@ static int get_info(void __user *user, int *len, int compat)
if (compat)
xt_compat_lock(NF_ARP);
#endif
- t = try_then_request_module(xt_find_table_lock(&init_net, NF_ARP, name),
+ t = try_then_request_module(xt_find_table_lock(net, NF_ARP, name),
"arptable_%s", name);
if (t && !IS_ERR(t)) {
struct arpt_getinfo info;
@@ -908,7 +909,8 @@ static int get_info(void __user *user, int *len, int compat)
return ret;
}
-static int get_entries(struct arpt_get_entries __user *uptr, int *len)
+static int get_entries(struct net *net, struct arpt_get_entries __user *uptr,
+ int *len)
{
int ret;
struct arpt_get_entries get;
@@ -926,7 +928,7 @@ static int get_entries(struct arpt_get_entries __user *uptr, int *len)
return -EINVAL;
}
- t = xt_find_table_lock(&init_net, NF_ARP, get.name);
+ t = xt_find_table_lock(net, NF_ARP, get.name);
if (t && !IS_ERR(t)) {
struct xt_table_info *private = t->private;
duprintf("t->private->number = %u\n",
@@ -947,7 +949,8 @@ static int get_entries(struct arpt_get_entries __user *uptr, int *len)
return ret;
}
-static int __do_replace(const char *name, unsigned int valid_hooks,
+static int __do_replace(struct net *net, const char *name,
+ unsigned int valid_hooks,
struct xt_table_info *newinfo,
unsigned int num_counters,
void __user *counters_ptr)
@@ -966,7 +969,7 @@ static int __do_replace(const char *name, unsigned int valid_hooks,
goto out;
}
- t = try_then_request_module(xt_find_table_lock(&init_net, NF_ARP, name),
+ t = try_then_request_module(xt_find_table_lock(net, NF_ARP, name),
"arptable_%s", name);
if (!t || IS_ERR(t)) {
ret = t ? PTR_ERR(t) : -ENOENT;
@@ -1019,7 +1022,7 @@ static int __do_replace(const char *name, unsigned int valid_hooks,
return ret;
}
-static int do_replace(void __user *user, unsigned int len)
+static int do_replace(struct net *net, void __user *user, unsigned int len)
{
int ret;
struct arpt_replace tmp;
@@ -1053,7 +1056,7 @@ static int do_replace(void __user *user, unsigned int len)
duprintf("arp_tables: Translated table\n");
- ret = __do_replace(tmp.name, tmp.valid_hooks, newinfo,
+ ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo,
tmp.num_counters, tmp.counters);
if (ret)
goto free_newinfo_untrans;
@@ -1080,7 +1083,8 @@ static inline int add_counter_to_entry(struct arpt_entry *e,
return 0;
}
-static int do_add_counters(void __user *user, unsigned int len, int compat)
+static int do_add_counters(struct net *net, void __user *user, unsigned int len,
+ int compat)
{
unsigned int i;
struct xt_counters_info tmp;
@@ -1132,7 +1136,7 @@ static int do_add_counters(void __user *user, unsigned int len, int compat)
goto free;
}
- t = xt_find_table_lock(&init_net, NF_ARP, name);
+ t = xt_find_table_lock(net, NF_ARP, name);
if (!t || IS_ERR(t)) {
ret = t ? PTR_ERR(t) : -ENOENT;
goto free;
@@ -1435,7 +1439,8 @@ struct compat_arpt_replace {
struct compat_arpt_entry entries[0];
};
-static int compat_do_replace(void __user *user, unsigned int len)
+static int compat_do_replace(struct net *net, void __user *user,
+ unsigned int len)
{
int ret;
struct compat_arpt_replace tmp;
@@ -1471,7 +1476,7 @@ static int compat_do_replace(void __user *user, unsigned int len)
duprintf("compat_do_replace: Translated table\n");
- ret = __do_replace(tmp.name, tmp.valid_hooks, newinfo,
+ ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo,
tmp.num_counters, compat_ptr(tmp.counters));
if (ret)
goto free_newinfo_untrans;
@@ -1494,11 +1499,11 @@ static int compat_do_arpt_set_ctl(struct sock *sk, int cmd, void __user *user,
switch (cmd) {
case ARPT_SO_SET_REPLACE:
- ret = compat_do_replace(user, len);
+ ret = compat_do_replace(sk->sk_net, user, len);
break;
case ARPT_SO_SET_ADD_COUNTERS:
- ret = do_add_counters(user, len, 1);
+ ret = do_add_counters(sk->sk_net, user, len, 1);
break;
default:
@@ -1584,7 +1589,8 @@ struct compat_arpt_get_entries {
struct compat_arpt_entry entrytable[0];
};
-static int compat_get_entries(struct compat_arpt_get_entries __user *uptr,
+static int compat_get_entries(struct net *net,
+ struct compat_arpt_get_entries __user *uptr,
int *len)
{
int ret;
@@ -1604,7 +1610,7 @@ static int compat_get_entries(struct compat_arpt_get_entries __user *uptr,
}
xt_compat_lock(NF_ARP);
- t = xt_find_table_lock(&init_net, NF_ARP, get.name);
+ t = xt_find_table_lock(net, NF_ARP, get.name);
if (t && !IS_ERR(t)) {
struct xt_table_info *private = t->private;
struct xt_table_info info;
@@ -1641,10 +1647,10 @@ static int compat_do_arpt_get_ctl(struct sock *sk, int cmd, void __user *user,
switch (cmd) {
case ARPT_SO_GET_INFO:
- ret = get_info(user, len, 1);
+ ret = get_info(sk->sk_net, user, len, 1);
break;
case ARPT_SO_GET_ENTRIES:
- ret = compat_get_entries(user, len);
+ ret = compat_get_entries(sk->sk_net, user, len);
break;
default:
ret = do_arpt_get_ctl(sk, cmd, user, len);
@@ -1662,11 +1668,11 @@ static int do_arpt_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned
switch (cmd) {
case ARPT_SO_SET_REPLACE:
- ret = do_replace(user, len);
+ ret = do_replace(sk->sk_net, user, len);
break;
case ARPT_SO_SET_ADD_COUNTERS:
- ret = do_add_counters(user, len, 0);
+ ret = do_add_counters(sk->sk_net, user, len, 0);
break;
default:
@@ -1686,11 +1692,11 @@ static int do_arpt_get_ctl(struct sock *sk, int cmd, void __user *user, int *len
switch (cmd) {
case ARPT_SO_GET_INFO:
- ret = get_info(user, len, 0);
+ ret = get_info(sk->sk_net, user, len, 0);
break;
case ARPT_SO_GET_ENTRIES:
- ret = get_entries(user, len);
+ ret = get_entries(sk->sk_net, user, len);
break;
case ARPT_SO_GET_REVISION_TARGET: {
@@ -1719,7 +1725,8 @@ static int do_arpt_get_ctl(struct sock *sk, int cmd, void __user *user, int *len
return ret;
}
-struct arpt_table *arpt_register_table(struct arpt_table *table,
+struct arpt_table *arpt_register_table(struct net *net,
+ struct arpt_table *table,
const struct arpt_replace *repl)
{
int ret;
@@ -1749,7 +1756,7 @@ struct arpt_table *arpt_register_table(struct arpt_table *table,
if (ret != 0)
goto out_free;
- new_table = xt_register_table(&init_net, table, &bootstrap, newinfo);
+ new_table = xt_register_table(net, table, &bootstrap, newinfo);
if (IS_ERR(new_table)) {
ret = PTR_ERR(new_table);
goto out_free;
--- a/net/ipv4/netfilter/arptable_filter.c
+++ b/net/ipv4/netfilter/arptable_filter.c
@@ -91,7 +91,7 @@ static int __init arptable_filter_init(void)
int ret;
/* Register table */
- packet_filter = arpt_register_table(&__packet_filter, &initial_table.repl);
+ packet_filter = arpt_register_table(&init_net, &__packet_filter, &initial_table.repl);
if (IS_ERR(packet_filter))
return PTR_ERR(packet_filter);
^ permalink raw reply
* [PATCH 2/5] netns netfilter: per-netns IPv6 FILTER, MANGLE, RAW
From: Alexey Dobriyan @ 2008-01-24 12:27 UTC (permalink / raw)
To: kaber; +Cc: netdev, netfilter-devel, devel
Now it's possible to list and manipulate per-netns ip6tables rules.
Filtering decisions are based on init_net's table so far.
P.S.: remove init_net check in inet6_create() to see the effect
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
---
include/net/netns/ipv6.h | 6 +++++
net/ipv6/netfilter/ip6table_filter.c | 40 +++++++++++++++++++++++++----------
net/ipv6/netfilter/ip6table_mangle.c | 40 +++++++++++++++++++++++++----------
net/ipv6/netfilter/ip6table_raw.c | 38 ++++++++++++++++++++++++---------
4 files changed, 92 insertions(+), 32 deletions(-)
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -31,5 +31,10 @@ struct netns_ipv6 {
struct ipv6_devconf *devconf_all;
struct ipv6_devconf *devconf_dflt;
struct netns_frags frags;
+#ifdef CONFIG_NETFILTER
+ struct xt_table *ip6table_filter;
+ struct xt_table *ip6table_mangle;
+ struct xt_table *ip6table_raw;
+#endif
};
#endif
--- a/net/ipv6/netfilter/ip6table_filter.c
+++ b/net/ipv6/netfilter/ip6table_filter.c
@@ -26,7 +26,7 @@ static struct
struct ip6t_replace repl;
struct ip6t_standard entries[3];
struct ip6t_error term;
-} initial_table __initdata = {
+} initial_table __net_initdata = {
.repl = {
.name = "filter",
.valid_hooks = FILTER_VALID_HOOKS,
@@ -51,14 +51,13 @@ static struct
.term = IP6T_ERROR_INIT, /* ERROR */
};
-static struct xt_table __packet_filter = {
+static struct xt_table packet_filter = {
.name = "filter",
.valid_hooks = FILTER_VALID_HOOKS,
.lock = RW_LOCK_UNLOCKED,
.me = THIS_MODULE,
.af = AF_INET6,
};
-static struct xt_table *packet_filter;
/* The work comes in here from netfilter.c. */
static unsigned int
@@ -68,7 +67,7 @@ ip6t_hook(unsigned int hook,
const struct net_device *out,
int (*okfn)(struct sk_buff *))
{
- return ip6t_do_table(skb, hook, in, out, packet_filter);
+ return ip6t_do_table(skb, hook, in, out, init_net.ipv6.ip6table_filter);
}
static unsigned int
@@ -88,7 +87,7 @@ ip6t_local_out_hook(unsigned int hook,
}
#endif
- return ip6t_do_table(skb, hook, in, out, packet_filter);
+ return ip6t_do_table(skb, hook, in, out, init_net.ipv6.ip6table_filter);
}
static struct nf_hook_ops ip6t_ops[] __read_mostly = {
@@ -119,6 +118,26 @@ static struct nf_hook_ops ip6t_ops[] __read_mostly = {
static int forward = NF_ACCEPT;
module_param(forward, bool, 0000);
+static int __net_init ip6table_filter_net_init(struct net *net)
+{
+ /* Register table */
+ net->ipv6.ip6table_filter =
+ ip6t_register_table(net, &packet_filter, &initial_table.repl);
+ if (IS_ERR(net->ipv6.ip6table_filter))
+ return PTR_ERR(net->ipv6.ip6table_filter);
+ return 0;
+}
+
+static void __net_exit ip6table_filter_net_exit(struct net *net)
+{
+ ip6t_unregister_table(net->ipv6.ip6table_filter);
+}
+
+static struct pernet_operations ip6table_filter_net_ops = {
+ .init = ip6table_filter_net_init,
+ .exit = ip6table_filter_net_exit,
+};
+
static int __init ip6table_filter_init(void)
{
int ret;
@@ -131,10 +150,9 @@ static int __init ip6table_filter_init(void)
/* Entry 1 is the FORWARD hook */
initial_table.entries[1].target.verdict = -forward - 1;
- /* Register table */
- packet_filter = ip6t_register_table(&init_net, &__packet_filter, &initial_table.repl);
- if (IS_ERR(packet_filter))
- return PTR_ERR(packet_filter);
+ ret = register_pernet_subsys(&ip6table_filter_net_ops);
+ if (ret < 0)
+ return ret;
/* Register hooks */
ret = nf_register_hooks(ip6t_ops, ARRAY_SIZE(ip6t_ops));
@@ -144,14 +162,14 @@ static int __init ip6table_filter_init(void)
return ret;
cleanup_table:
- ip6t_unregister_table(packet_filter);
+ unregister_pernet_subsys(&ip6table_filter_net_ops);
return ret;
}
static void __exit ip6table_filter_fini(void)
{
nf_unregister_hooks(ip6t_ops, ARRAY_SIZE(ip6t_ops));
- ip6t_unregister_table(packet_filter);
+ unregister_pernet_subsys(&ip6table_filter_net_ops);
}
module_init(ip6table_filter_init);
--- a/net/ipv6/netfilter/ip6table_mangle.c
+++ b/net/ipv6/netfilter/ip6table_mangle.c
@@ -26,7 +26,7 @@ static struct
struct ip6t_replace repl;
struct ip6t_standard entries[5];
struct ip6t_error term;
-} initial_table __initdata = {
+} initial_table __net_initdata = {
.repl = {
.name = "mangle",
.valid_hooks = MANGLE_VALID_HOOKS,
@@ -57,14 +57,13 @@ static struct
.term = IP6T_ERROR_INIT, /* ERROR */
};
-static struct xt_table __packet_mangler = {
+static struct xt_table packet_mangler = {
.name = "mangle",
.valid_hooks = MANGLE_VALID_HOOKS,
.lock = RW_LOCK_UNLOCKED,
.me = THIS_MODULE,
.af = AF_INET6,
};
-static struct xt_table *packet_mangler;
/* The work comes in here from netfilter.c. */
static unsigned int
@@ -74,7 +73,7 @@ ip6t_route_hook(unsigned int hook,
const struct net_device *out,
int (*okfn)(struct sk_buff *))
{
- return ip6t_do_table(skb, hook, in, out, packet_mangler);
+ return ip6t_do_table(skb, hook, in, out, init_net.ipv6.ip6table_mangle);
}
static unsigned int
@@ -109,7 +108,7 @@ ip6t_local_hook(unsigned int hook,
/* flowlabel and prio (includes version, which shouldn't change either */
flowlabel = *((u_int32_t *)ipv6_hdr(skb));
- ret = ip6t_do_table(skb, hook, in, out, packet_mangler);
+ ret = ip6t_do_table(skb, hook, in, out, init_net.ipv6.ip6table_mangle);
if (ret != NF_DROP && ret != NF_STOLEN
&& (memcmp(&ipv6_hdr(skb)->saddr, &saddr, sizeof(saddr))
@@ -159,14 +158,33 @@ static struct nf_hook_ops ip6t_ops[] __read_mostly = {
},
};
+static int __net_init ip6table_mangle_net_init(struct net *net)
+{
+ /* Register table */
+ net->ipv6.ip6table_mangle =
+ ip6t_register_table(net, &packet_mangler, &initial_table.repl);
+ if (IS_ERR(net->ipv6.ip6table_mangle))
+ return PTR_ERR(net->ipv6.ip6table_mangle);
+ return 0;
+}
+
+static void __net_exit ip6table_mangle_net_exit(struct net *net)
+{
+ ip6t_unregister_table(net->ipv6.ip6table_mangle);
+}
+
+static struct pernet_operations ip6table_mangle_net_ops = {
+ .init = ip6table_mangle_net_init,
+ .exit = ip6table_mangle_net_exit,
+};
+
static int __init ip6table_mangle_init(void)
{
int ret;
- /* Register table */
- packet_mangler = ip6t_register_table(&init_net, &__packet_mangler, &initial_table.repl);
- if (IS_ERR(packet_mangler))
- return PTR_ERR(packet_mangler);
+ ret = register_pernet_subsys(&ip6table_mangle_net_ops);
+ if (ret < 0)
+ return ret;
/* Register hooks */
ret = nf_register_hooks(ip6t_ops, ARRAY_SIZE(ip6t_ops));
@@ -176,14 +194,14 @@ static int __init ip6table_mangle_init(void)
return ret;
cleanup_table:
- ip6t_unregister_table(packet_mangler);
+ unregister_pernet_subsys(&ip6table_mangle_net_ops);
return ret;
}
static void __exit ip6table_mangle_fini(void)
{
nf_unregister_hooks(ip6t_ops, ARRAY_SIZE(ip6t_ops));
- ip6t_unregister_table(packet_mangler);
+ unregister_pernet_subsys(&ip6table_mangle_net_ops);
}
module_init(ip6table_mangle_init);
--- a/net/ipv6/netfilter/ip6table_raw.c
+++ b/net/ipv6/netfilter/ip6table_raw.c
@@ -13,7 +13,7 @@ static struct
struct ip6t_replace repl;
struct ip6t_standard entries[2];
struct ip6t_error term;
-} initial_table __initdata = {
+} initial_table __net_initdata = {
.repl = {
.name = "raw",
.valid_hooks = RAW_VALID_HOOKS,
@@ -35,14 +35,13 @@ static struct
.term = IP6T_ERROR_INIT, /* ERROR */
};
-static struct xt_table __packet_raw = {
+static struct xt_table packet_raw = {
.name = "raw",
.valid_hooks = RAW_VALID_HOOKS,
.lock = RW_LOCK_UNLOCKED,
.me = THIS_MODULE,
.af = AF_INET6,
};
-static struct xt_table *packet_raw;
/* The work comes in here from netfilter.c. */
static unsigned int
@@ -52,7 +51,7 @@ ip6t_hook(unsigned int hook,
const struct net_device *out,
int (*okfn)(struct sk_buff *))
{
- return ip6t_do_table(skb, hook, in, out, packet_raw);
+ return ip6t_do_table(skb, hook, in, out, init_net.ipv6.ip6table_raw);
}
static struct nf_hook_ops ip6t_ops[] __read_mostly = {
@@ -72,14 +71,33 @@ static struct nf_hook_ops ip6t_ops[] __read_mostly = {
},
};
+static int __net_init ip6table_raw_net_init(struct net *net)
+{
+ /* Register table */
+ net->ipv6.ip6table_raw =
+ ip6t_register_table(net, &packet_raw, &initial_table.repl);
+ if (IS_ERR(net->ipv6.ip6table_raw))
+ return PTR_ERR(net->ipv6.ip6table_raw);
+ return 0;
+}
+
+static void __net_exit ip6table_raw_net_exit(struct net *net)
+{
+ ip6t_unregister_table(net->ipv6.ip6table_raw);
+}
+
+static struct pernet_operations ip6table_raw_net_ops = {
+ .init = ip6table_raw_net_init,
+ .exit = ip6table_raw_net_exit,
+};
+
static int __init ip6table_raw_init(void)
{
int ret;
- /* Register table */
- packet_raw = ip6t_register_table(&init_net, &__packet_raw, &initial_table.repl);
- if (IS_ERR(packet_raw))
- return PTR_ERR(packet_raw);
+ ret = register_pernet_subsys(&ip6table_raw_net_ops);
+ if (ret < 0)
+ return ret;
/* Register hooks */
ret = nf_register_hooks(ip6t_ops, ARRAY_SIZE(ip6t_ops));
@@ -89,14 +107,14 @@ static int __init ip6table_raw_init(void)
return ret;
cleanup_table:
- ip6t_unregister_table(packet_raw);
+ unregister_pernet_subsys(&ip6table_raw_net_ops);
return ret;
}
static void __exit ip6table_raw_fini(void)
{
nf_unregister_hooks(ip6t_ops, ARRAY_SIZE(ip6t_ops));
- ip6t_unregister_table(packet_raw);
+ unregister_pernet_subsys(&ip6table_raw_net_ops);
}
module_init(ip6table_raw_init);
^ permalink raw reply
* [PATCH 1/5] netns netfilter: per-netns ip6tables
From: Alexey Dobriyan @ 2008-01-24 12:23 UTC (permalink / raw)
To: kaber; +Cc: netdev, netfilter-devel, devel
* Propagate netns from userspace down to xt_find_table_lock()
* Register ip6 tables in netns (modules still use init_net)
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
---
include/linux/netfilter_ipv6/ip6_tables.h | 3 +
net/ipv6/netfilter/ip6_tables.c | 50 +++++++++++++++---------------
net/ipv6/netfilter/ip6table_filter.c | 2 -
net/ipv6/netfilter/ip6table_mangle.c | 2 -
net/ipv6/netfilter/ip6table_raw.c | 2 -
5 files changed, 31 insertions(+), 28 deletions(-)
--- a/include/linux/netfilter_ipv6/ip6_tables.h
+++ b/include/linux/netfilter_ipv6/ip6_tables.h
@@ -305,7 +305,8 @@ ip6t_get_target(struct ip6t_entry *e)
#include <linux/init.h>
extern void ip6t_init(void) __init;
-extern struct xt_table *ip6t_register_table(struct xt_table *table,
+extern struct xt_table *ip6t_register_table(struct net *net,
+ struct xt_table *table,
const struct ip6t_replace *repl);
extern void ip6t_unregister_table(struct xt_table *table);
extern unsigned int ip6t_do_table(struct sk_buff *skb,
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -1118,7 +1118,7 @@ static int compat_table_info(const struct xt_table_info *info,
}
#endif
-static int get_info(void __user *user, int *len, int compat)
+static int get_info(struct net *net, void __user *user, int *len, int compat)
{
char name[IP6T_TABLE_MAXNAMELEN];
struct xt_table *t;
@@ -1138,7 +1138,7 @@ static int get_info(void __user *user, int *len, int compat)
if (compat)
xt_compat_lock(AF_INET6);
#endif
- t = try_then_request_module(xt_find_table_lock(&init_net, AF_INET6, name),
+ t = try_then_request_module(xt_find_table_lock(net, AF_INET6, name),
"ip6table_%s", name);
if (t && !IS_ERR(t)) {
struct ip6t_getinfo info;
@@ -1178,7 +1178,7 @@ static int get_info(void __user *user, int *len, int compat)
}
static int
-get_entries(struct ip6t_get_entries __user *uptr, int *len)
+get_entries(struct net *net, struct ip6t_get_entries __user *uptr, int *len)
{
int ret;
struct ip6t_get_entries get;
@@ -1196,7 +1196,7 @@ get_entries(struct ip6t_get_entries __user *uptr, int *len)
return -EINVAL;
}
- t = xt_find_table_lock(&init_net, AF_INET6, get.name);
+ t = xt_find_table_lock(net, AF_INET6, get.name);
if (t && !IS_ERR(t)) {
struct xt_table_info *private = t->private;
duprintf("t->private->number = %u\n", private->number);
@@ -1217,7 +1217,7 @@ get_entries(struct ip6t_get_entries __user *uptr, int *len)
}
static int
-__do_replace(const char *name, unsigned int valid_hooks,
+__do_replace(struct net *net, const char *name, unsigned int valid_hooks,
struct xt_table_info *newinfo, unsigned int num_counters,
void __user *counters_ptr)
{
@@ -1235,7 +1235,7 @@ __do_replace(const char *name, unsigned int valid_hooks,
goto out;
}
- t = try_then_request_module(xt_find_table_lock(&init_net, AF_INET6, name),
+ t = try_then_request_module(xt_find_table_lock(net, AF_INET6, name),
"ip6table_%s", name);
if (!t || IS_ERR(t)) {
ret = t ? PTR_ERR(t) : -ENOENT;
@@ -1288,7 +1288,7 @@ __do_replace(const char *name, unsigned int valid_hooks,
}
static int
-do_replace(void __user *user, unsigned int len)
+do_replace(struct net *net, void __user *user, unsigned int len)
{
int ret;
struct ip6t_replace tmp;
@@ -1322,7 +1322,7 @@ do_replace(void __user *user, unsigned int len)
duprintf("ip_tables: Translated table\n");
- ret = __do_replace(tmp.name, tmp.valid_hooks, newinfo,
+ ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo,
tmp.num_counters, tmp.counters);
if (ret)
goto free_newinfo_untrans;
@@ -1358,7 +1358,7 @@ add_counter_to_entry(struct ip6t_entry *e,
}
static int
-do_add_counters(void __user *user, unsigned int len, int compat)
+do_add_counters(struct net *net, void __user *user, unsigned int len, int compat)
{
unsigned int i;
struct xt_counters_info tmp;
@@ -1410,7 +1410,7 @@ do_add_counters(void __user *user, unsigned int len, int compat)
goto free;
}
- t = xt_find_table_lock(&init_net, AF_INET6, name);
+ t = xt_find_table_lock(net, AF_INET6, name);
if (!t || IS_ERR(t)) {
ret = t ? PTR_ERR(t) : -ENOENT;
goto free;
@@ -1815,7 +1815,7 @@ out_unlock:
}
static int
-compat_do_replace(void __user *user, unsigned int len)
+compat_do_replace(struct net *net, void __user *user, unsigned int len)
{
int ret;
struct compat_ip6t_replace tmp;
@@ -1852,7 +1852,7 @@ compat_do_replace(void __user *user, unsigned int len)
duprintf("compat_do_replace: Translated table\n");
- ret = __do_replace(tmp.name, tmp.valid_hooks, newinfo,
+ ret = __do_replace(net, tmp.name, tmp.valid_hooks, newinfo,
tmp.num_counters, compat_ptr(tmp.counters));
if (ret)
goto free_newinfo_untrans;
@@ -1876,11 +1876,11 @@ compat_do_ip6t_set_ctl(struct sock *sk, int cmd, void __user *user,
switch (cmd) {
case IP6T_SO_SET_REPLACE:
- ret = compat_do_replace(user, len);
+ ret = compat_do_replace(sk->sk_net, user, len);
break;
case IP6T_SO_SET_ADD_COUNTERS:
- ret = do_add_counters(user, len, 1);
+ ret = do_add_counters(sk->sk_net, user, len, 1);
break;
default:
@@ -1929,7 +1929,8 @@ compat_copy_entries_to_user(unsigned int total_size, struct xt_table *table,
}
static int
-compat_get_entries(struct compat_ip6t_get_entries __user *uptr, int *len)
+compat_get_entries(struct net *net, struct compat_ip6t_get_entries __user *uptr,
+ int *len)
{
int ret;
struct compat_ip6t_get_entries get;
@@ -1950,7 +1951,7 @@ compat_get_entries(struct compat_ip6t_get_entries __user *uptr, int *len)
}
xt_compat_lock(AF_INET6);
- t = xt_find_table_lock(&init_net, AF_INET6, get.name);
+ t = xt_find_table_lock(net, AF_INET6, get.name);
if (t && !IS_ERR(t)) {
struct xt_table_info *private = t->private;
struct xt_table_info info;
@@ -1986,10 +1987,10 @@ compat_do_ip6t_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
switch (cmd) {
case IP6T_SO_GET_INFO:
- ret = get_info(user, len, 1);
+ ret = get_info(sk->sk_net, user, len, 1);
break;
case IP6T_SO_GET_ENTRIES:
- ret = compat_get_entries(user, len);
+ ret = compat_get_entries(sk->sk_net, user, len);
break;
default:
ret = do_ip6t_get_ctl(sk, cmd, user, len);
@@ -2008,11 +2009,11 @@ do_ip6t_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
switch (cmd) {
case IP6T_SO_SET_REPLACE:
- ret = do_replace(user, len);
+ ret = do_replace(sk->sk_net, user, len);
break;
case IP6T_SO_SET_ADD_COUNTERS:
- ret = do_add_counters(user, len, 0);
+ ret = do_add_counters(sk->sk_net, user, len, 0);
break;
default:
@@ -2033,11 +2034,11 @@ do_ip6t_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
switch (cmd) {
case IP6T_SO_GET_INFO:
- ret = get_info(user, len, 0);
+ ret = get_info(sk->sk_net, user, len, 0);
break;
case IP6T_SO_GET_ENTRIES:
- ret = get_entries(user, len);
+ ret = get_entries(sk->sk_net, user, len);
break;
case IP6T_SO_GET_REVISION_MATCH:
@@ -2074,7 +2075,8 @@ do_ip6t_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
return ret;
}
-struct xt_table *ip6t_register_table(struct xt_table *table, const struct ip6t_replace *repl)
+struct xt_table *ip6t_register_table(struct net *net, struct xt_table *table,
+ const struct ip6t_replace *repl)
{
int ret;
struct xt_table_info *newinfo;
@@ -2101,7 +2103,7 @@ struct xt_table *ip6t_register_table(struct xt_table *table, const struct ip6t_r
if (ret != 0)
goto out_free;
- new_table = xt_register_table(&init_net, table, &bootstrap, newinfo);
+ new_table = xt_register_table(net, table, &bootstrap, newinfo);
if (IS_ERR(new_table)) {
ret = PTR_ERR(new_table);
goto out_free;
--- a/net/ipv6/netfilter/ip6table_filter.c
+++ b/net/ipv6/netfilter/ip6table_filter.c
@@ -132,7 +132,7 @@ static int __init ip6table_filter_init(void)
initial_table.entries[1].target.verdict = -forward - 1;
/* Register table */
- packet_filter = ip6t_register_table(&__packet_filter, &initial_table.repl);
+ packet_filter = ip6t_register_table(&init_net, &__packet_filter, &initial_table.repl);
if (IS_ERR(packet_filter))
return PTR_ERR(packet_filter);
--- a/net/ipv6/netfilter/ip6table_mangle.c
+++ b/net/ipv6/netfilter/ip6table_mangle.c
@@ -164,7 +164,7 @@ static int __init ip6table_mangle_init(void)
int ret;
/* Register table */
- packet_mangler = ip6t_register_table(&__packet_mangler, &initial_table.repl);
+ packet_mangler = ip6t_register_table(&init_net, &__packet_mangler, &initial_table.repl);
if (IS_ERR(packet_mangler))
return PTR_ERR(packet_mangler);
--- a/net/ipv6/netfilter/ip6table_raw.c
+++ b/net/ipv6/netfilter/ip6table_raw.c
@@ -77,7 +77,7 @@ static int __init ip6table_raw_init(void)
int ret;
/* Register table */
- packet_raw = ip6t_register_table(&__packet_raw, &initial_table.repl);
+ packet_raw = ip6t_register_table(&init_net, &__packet_raw, &initial_table.repl);
if (IS_ERR(packet_raw))
return PTR_ERR(packet_raw);
^ permalink raw reply
* Re: [Lksctp-developers] [PATCH] SCTP: Fix kernel panic while received AUTH chunk with BAD shared key identifier
From: Neil Horman @ 2008-01-24 12:16 UTC (permalink / raw)
To: Wei Yongjun; +Cc: netdev, Vlad Yasevich, lksctp-developers
In-Reply-To: <4795A960.7000700@cn.fujitsu.com>
On Tue, Jan 22, 2008 at 05:29:20PM +0900, Wei Yongjun wrote:
>
>
> This patch fix this problem.
>
> Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
>
> --- a/net/sctp/auth.c 2008-01-21 00:03:25.000000000 -0500
> +++ b/net/sctp/auth.c 2008-01-21 21:31:47.000000000 -0500
> @@ -420,15 +420,15 @@ struct sctp_shared_key *sctp_auth_get_sh
> const struct sctp_association *asoc,
> __u16 key_id)
> {
> - struct sctp_shared_key *key = NULL;
> + struct sctp_shared_key *key;
>
> /* First search associations set of endpoint pair shared keys */
> key_for_each(key, &asoc->endpoint_shared_keys) {
> if (key->key_id == key_id)
> - break;
> + return key;
> }
>
> - return key;
> + return NULL;
> }
>
> /*
>
FWIW, Ack from me. The assignment of NULL to key can safely be removed, since
key_for_each (which is just list_for_each_entry under the covers does an initial
assignment to key anyway).
If the endpoint_shared_keys list is empty, or if the key_id being requested does
not exist, the function as it currently stands returns the actuall list_head (in
this case endpoint_shared_keys. Since that list_head isn't surrounded by an
actuall data structure, the last iteration through list_for_each_entry will do a
container_of on key, and we wind up returning a bogus pointer, instead of NULL,
as we should. Wei's patch corrects that.
Regards
Neil
Acked-by: Neil Horman <nhorman@tuxdriver.com>
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Lksctp-developers mailing list
> Lksctp-developers@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/lksctp-developers
^ permalink raw reply
* Re: [PATCH net-2.6.25] Add packet filtering based on process's security context.
From: Tetsuo Handa @ 2008-01-24 11:47 UTC (permalink / raw)
To: netdev, davem; +Cc: linux-security-module, netfilter-devel
In-Reply-To: <200801230016.EGG34399.QFtVHFSJFMOOLO@I-love.SAKURA.ne.jp>
Hello.
Are there any remaining questions/problems about this patch?
If none, I want this patch applied to net-2.6.25 tree.
Regards.
---------------
This patch modifies security_socket_post_accept() and introduces
security_socket_post_recv_datagram() LSM hooks.
Currently, security_socket_post_accept() is called *after* fd_install()
at sys_accept(). This means that userland process might access accept()ed
but not yet properly labeled socket.
I believe this security_socket_post_accept() should be called *before*
fd_install() for the three reasons.
* The userland process will always access proper context labeled by
security_socket_post_accept() rather than default context labeled by
security_inode_alloc() (called from alloc_inode() from new_inode() from
sock_alloc() from sys_accept()). This gives a security module
a chance to perform access control based on proper context.
* If security_socket_post_accept() failed to assign proper context
(e.g. -ENOMEM), the accept()ed socket can't have proper context.
Use of void for security_socket_post_accept() deprives a security module
of a chance to abandon the accept()ed socket.
* A security module can perform connection filtering based on
process's security context.
If security_socket_post_accept() is called after fd_install(),
it is too late to do so because the accept()ed socket is already visible
to userland processes.
Currently, there is no way to directly map security context from incoming
packet to user process. This is because the creator or owner of a socket is
not always the receiver of an incoming packet. The userland process who
receives the incoming packet is not known until a process calls sys_recvmsg().
So, I want to add a LSM hook to give a security module a chance to control
after the recipient of the incoming packet is known.
Signed-off-by: Kentaro Takeda <takedakn@nttdata.co.jp>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
include/linux/security.h | 34 +++++++++++++++++++++++++++++-----
net/core/datagram.c | 29 ++++++++++++++++++++++++++++-
net/socket.c | 7 +++++--
security/dummy.c | 13 ++++++++++---
security/security.c | 10 ++++++++--
5 files changed, 80 insertions(+), 13 deletions(-)
--- net-2.6.25.orig/include/linux/security.h
+++ net-2.6.25/include/linux/security.h
@@ -781,8 +781,12 @@ struct request_sock;
* @socket_post_accept:
* This hook allows a security module to copy security
* information into the newly created socket's inode.
+ * This hook also allows a security module to filter connections
+ * from unwanted peers based on the process accepting this connection.
+ * The connection will be aborted if this hook returns nonzero.
* @sock contains the listening socket structure.
* @newsock contains the newly created server socket for connection.
+ * Return 0 if permission is granted.
* @socket_sendmsg:
* Check permission before transmitting a message to another socket.
* @sock contains the socket structure.
@@ -796,6 +800,15 @@ struct request_sock;
* @size contains the size of message structure.
* @flags contains the operational flags.
* Return 0 if permission is granted.
+ * @socket_post_recv_datagram:
+ * Check permission after receiving a datagram.
+ * This hook allows a security module to filter packets
+ * from unwanted peers based on the process receiving this datagram.
+ * The packet will be discarded if this hook returns nonzero.
+ * @sk contains the socket.
+ * @skb contains the socket buffer.
+ * @flags contains the operational flags.
+ * Return 0 if permission is granted.
* @socket_getsockname:
* Check permission before the local address (name) of the socket object
* @sock is retrieved.
@@ -1387,12 +1400,13 @@ struct security_operations {
struct sockaddr * address, int addrlen);
int (*socket_listen) (struct socket * sock, int backlog);
int (*socket_accept) (struct socket * sock, struct socket * newsock);
- void (*socket_post_accept) (struct socket * sock,
- struct socket * newsock);
+ int (*socket_post_accept) (struct socket *sock, struct socket *newsock);
int (*socket_sendmsg) (struct socket * sock,
struct msghdr * msg, int size);
int (*socket_recvmsg) (struct socket * sock,
struct msghdr * msg, int size, int flags);
+ int (*socket_post_recv_datagram) (struct sock *sk, struct sk_buff *skb,
+ unsigned int flags);
int (*socket_getsockname) (struct socket * sock);
int (*socket_getpeername) (struct socket * sock);
int (*socket_getsockopt) (struct socket * sock, int level, int optname);
@@ -2297,10 +2311,12 @@ int security_socket_bind(struct socket *
int security_socket_connect(struct socket *sock, struct sockaddr *address, int addrlen);
int security_socket_listen(struct socket *sock, int backlog);
int security_socket_accept(struct socket *sock, struct socket *newsock);
-void security_socket_post_accept(struct socket *sock, struct socket *newsock);
+int security_socket_post_accept(struct socket *sock, struct socket *newsock);
int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size);
int security_socket_recvmsg(struct socket *sock, struct msghdr *msg,
int size, int flags);
+int security_socket_post_recv_datagram(struct sock *sk, struct sk_buff *skb,
+ unsigned int flags);
int security_socket_getsockname(struct socket *sock);
int security_socket_getpeername(struct socket *sock);
int security_socket_getsockopt(struct socket *sock, int level, int optname);
@@ -2376,9 +2392,10 @@ static inline int security_socket_accept
return 0;
}
-static inline void security_socket_post_accept(struct socket * sock,
- struct socket * newsock)
+static inline int security_socket_post_accept(struct socket *sock,
+ struct socket *newsock)
{
+ return 0;
}
static inline int security_socket_sendmsg(struct socket * sock,
@@ -2394,6 +2411,13 @@ static inline int security_socket_recvms
return 0;
}
+static inline int security_socket_post_recv_datagram(struct sock *sk,
+ struct sk_buff *skb,
+ unsigned int flags)
+{
+ return 0;
+}
+
static inline int security_socket_getsockname(struct socket * sock)
{
return 0;
--- net-2.6.25.orig/net/socket.c
+++ net-2.6.25/net/socket.c
@@ -1454,13 +1454,16 @@ asmlinkage long sys_accept(int fd, struc
goto out_fd;
}
+ /* Filter connections from unwanted peers. */
+ err = security_socket_post_accept(sock, newsock);
+ if (err)
+ goto out_fd;
+
/* File flags are not inherited via accept() unlike another OSes. */
fd_install(newfd, newfile);
err = newfd;
- security_socket_post_accept(sock, newsock);
-
out_put:
fput_light(sock->file, fput_needed);
out:
--- net-2.6.25.orig/security/dummy.c
+++ net-2.6.25/security/dummy.c
@@ -748,10 +748,10 @@ static int dummy_socket_accept (struct s
return 0;
}
-static void dummy_socket_post_accept (struct socket *sock,
- struct socket *newsock)
+static int dummy_socket_post_accept(struct socket *sock,
+ struct socket *newsock)
{
- return;
+ return 0;
}
static int dummy_socket_sendmsg (struct socket *sock, struct msghdr *msg,
@@ -766,6 +766,12 @@ static int dummy_socket_recvmsg (struct
return 0;
}
+static int dummy_socket_post_recv_datagram(struct sock *sk, struct sk_buff *skb,
+ unsigned int flags)
+{
+ return 0;
+}
+
static int dummy_socket_getsockname (struct socket *sock)
{
return 0;
@@ -1099,6 +1105,7 @@ void security_fixup_ops (struct security
set_to_dummy_if_null(ops, socket_post_accept);
set_to_dummy_if_null(ops, socket_sendmsg);
set_to_dummy_if_null(ops, socket_recvmsg);
+ set_to_dummy_if_null(ops, socket_post_recv_datagram);
set_to_dummy_if_null(ops, socket_getsockname);
set_to_dummy_if_null(ops, socket_getpeername);
set_to_dummy_if_null(ops, socket_setsockopt);
--- net-2.6.25.orig/net/core/datagram.c
+++ net-2.6.25/net/core/datagram.c
@@ -55,6 +55,7 @@
#include <net/checksum.h>
#include <net/sock.h>
#include <net/tcp_states.h>
+#include <linux/security.h>
/*
* Is a socket 'connection oriented' ?
@@ -148,6 +149,7 @@ struct sk_buff *__skb_recv_datagram(stru
{
struct sk_buff *skb;
long timeo;
+ unsigned long cpu_flags;
/*
* Caller is allowed not to check sk->sk_err before skb_recv_datagram()
*/
@@ -165,7 +167,6 @@ struct sk_buff *__skb_recv_datagram(stru
* Look at current nfs client by the way...
* However, this function was corrent in any case. 8)
*/
- unsigned long cpu_flags;
spin_lock_irqsave(&sk->sk_receive_queue.lock, cpu_flags);
skb = skb_peek(&sk->sk_receive_queue);
@@ -179,6 +180,14 @@ struct sk_buff *__skb_recv_datagram(stru
}
spin_unlock_irqrestore(&sk->sk_receive_queue.lock, cpu_flags);
+ /* Filter packets from unwanted peers. */
+ if (skb) {
+ error = security_socket_post_recv_datagram(sk, skb,
+ flags);
+ if (error)
+ goto force_dequeue;
+ }
+
if (skb)
return skb;
@@ -191,6 +200,24 @@ struct sk_buff *__skb_recv_datagram(stru
return NULL;
+force_dequeue:
+ /* Drop this packet because LSM says "Don't pass it to the caller". */
+ if (!(flags & MSG_PEEK))
+ goto no_peek;
+ /*
+ * If this packet is MSG_PEEK'ed, dequeue it forcibly
+ * so that this packet won't prevent the caller from picking up
+ * next packet.
+ */
+ spin_lock_irqsave(&sk->sk_receive_queue.lock, cpu_flags);
+ if (skb == skb_peek(&sk->sk_receive_queue)) {
+ __skb_unlink(skb, &sk->sk_receive_queue);
+ atomic_dec(&skb->users);
+ /* Do I have something to do with skb->peeked ? */
+ }
+ spin_unlock_irqrestore(&sk->sk_receive_queue.lock, cpu_flags);
+no_peek:
+ kfree_skb(skb);
no_packet:
*err = error;
return NULL;
--- net-2.6.25.orig/security/security.c
+++ net-2.6.25/security/security.c
@@ -869,9 +869,9 @@ int security_socket_accept(struct socket
return security_ops->socket_accept(sock, newsock);
}
-void security_socket_post_accept(struct socket *sock, struct socket *newsock)
+int security_socket_post_accept(struct socket *sock, struct socket *newsock)
{
- security_ops->socket_post_accept(sock, newsock);
+ return security_ops->socket_post_accept(sock, newsock);
}
int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size)
@@ -885,6 +885,12 @@ int security_socket_recvmsg(struct socke
return security_ops->socket_recvmsg(sock, msg, size, flags);
}
+int security_socket_post_recv_datagram(struct sock *sk, struct sk_buff *skb,
+ unsigned int flags)
+{
+ return security_ops->socket_post_recv_datagram(sk, skb, flags);
+}
+
int security_socket_getsockname(struct socket *sock)
{
return security_ops->socket_getsockname(sock);
^ permalink raw reply
* [XFRM]: constify 'struct xfrm_type'
From: Eric Dumazet @ 2008-01-24 11:26 UTC (permalink / raw)
To: David Miller; +Cc: netdev@vger.kernel.org
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
include/net/xfrm.h | 8 ++++----
net/ipv4/ah4.c | 2 +-
net/ipv4/esp4.c | 2 +-
net/ipv4/ipcomp.c | 2 +-
net/ipv4/xfrm4_tunnel.c | 2 +-
net/ipv6/ah6.c | 2 +-
net/ipv6/esp6.c | 2 +-
net/ipv6/ipcomp6.c | 2 +-
net/ipv6/mip6.c | 4 ++--
net/ipv6/xfrm6_tunnel.c | 2 +-
net/xfrm/xfrm_state.c | 16 ++++++++--------
11 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 5ebb9ba..02d9b1e 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -201,7 +201,7 @@ struct xfrm_state
/* Reference to data common to all the instances of this
* transformer. */
- struct xfrm_type *type;
+ const struct xfrm_type *type;
struct xfrm_mode *inner_mode;
struct xfrm_mode *outer_mode;
@@ -278,7 +278,7 @@ struct xfrm_state_afinfo {
unsigned int proto;
unsigned int eth_proto;
struct module *owner;
- struct xfrm_type *type_map[IPPROTO_MAX];
+ const struct xfrm_type *type_map[IPPROTO_MAX];
struct xfrm_mode *mode_map[XFRM_MODE_MAX];
int (*init_flags)(struct xfrm_state *x);
void (*init_tempsel)(struct xfrm_state *x, struct flowi *fl,
@@ -321,8 +321,8 @@ struct xfrm_type
u32 (*get_mtu)(struct xfrm_state *, int size);
};
-extern int xfrm_register_type(struct xfrm_type *type, unsigned short family);
-extern int xfrm_unregister_type(struct xfrm_type *type, unsigned short family);
+extern int xfrm_register_type(const struct xfrm_type *type, unsigned short family);
+extern int xfrm_unregister_type(const struct xfrm_type *type, unsigned short family);
struct xfrm_mode {
/*
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 3003503..4463eeb 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -216,10 +216,10 @@ static void xfrm_state_unlock_afinfo(struct xfrm_state_afinfo *afinfo)
write_unlock_bh(&xfrm_state_afinfo_lock);
}
-int xfrm_register_type(struct xfrm_type *type, unsigned short family)
+int xfrm_register_type(const struct xfrm_type *type, unsigned short family)
{
struct xfrm_state_afinfo *afinfo = xfrm_state_lock_afinfo(family);
- struct xfrm_type **typemap;
+ const struct xfrm_type **typemap;
int err = 0;
if (unlikely(afinfo == NULL))
@@ -235,10 +235,10 @@ int xfrm_register_type(struct xfrm_type *type, unsigned short family)
}
EXPORT_SYMBOL(xfrm_register_type);
-int xfrm_unregister_type(struct xfrm_type *type, unsigned short family)
+int xfrm_unregister_type(const struct xfrm_type *type, unsigned short family)
{
struct xfrm_state_afinfo *afinfo = xfrm_state_lock_afinfo(family);
- struct xfrm_type **typemap;
+ const struct xfrm_type **typemap;
int err = 0;
if (unlikely(afinfo == NULL))
@@ -254,11 +254,11 @@ int xfrm_unregister_type(struct xfrm_type *type, unsigned short family)
}
EXPORT_SYMBOL(xfrm_unregister_type);
-static struct xfrm_type *xfrm_get_type(u8 proto, unsigned short family)
+static const struct xfrm_type *xfrm_get_type(u8 proto, unsigned short family)
{
struct xfrm_state_afinfo *afinfo;
- struct xfrm_type **typemap;
- struct xfrm_type *type;
+ const struct xfrm_type **typemap;
+ const struct xfrm_type *type;
int modload_attempted = 0;
retry:
@@ -281,7 +281,7 @@ retry:
return type;
}
-static void xfrm_put_type(struct xfrm_type *type)
+static void xfrm_put_type(const struct xfrm_type *type)
{
module_put(type->owner);
}
diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c
index d76803a..9d4555e 100644
--- a/net/ipv4/ah4.c
+++ b/net/ipv4/ah4.c
@@ -300,7 +300,7 @@ static void ah_destroy(struct xfrm_state *x)
}
-static struct xfrm_type ah_type =
+static const struct xfrm_type ah_type =
{
.description = "AH4",
.owner = THIS_MODULE,
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 28ea5c7..602779d 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -448,7 +448,7 @@ error:
return -EINVAL;
}
-static struct xfrm_type esp_type =
+static const struct xfrm_type esp_type =
{
.description = "ESP4",
.owner = THIS_MODULE,
diff --git a/net/ipv4/ipcomp.c b/net/ipv4/ipcomp.c
index f4af99a..b271f87 100644
--- a/net/ipv4/ipcomp.c
+++ b/net/ipv4/ipcomp.c
@@ -434,7 +434,7 @@ error:
goto out;
}
-static struct xfrm_type ipcomp_type = {
+static const struct xfrm_type ipcomp_type = {
.description = "IPCOMP4",
.owner = THIS_MODULE,
.proto = IPPROTO_COMP,
diff --git a/net/ipv4/xfrm4_tunnel.c b/net/ipv4/xfrm4_tunnel.c
index 3268451..87f77e4 100644
--- a/net/ipv4/xfrm4_tunnel.c
+++ b/net/ipv4/xfrm4_tunnel.c
@@ -38,7 +38,7 @@ static void ipip_destroy(struct xfrm_state *x)
{
}
-static struct xfrm_type ipip_type = {
+static const struct xfrm_type ipip_type = {
.description = "IPIP",
.owner = THIS_MODULE,
.proto = IPPROTO_IPIP,
diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c
index fb0d07a..379c8e0 100644
--- a/net/ipv6/ah6.c
+++ b/net/ipv6/ah6.c
@@ -515,7 +515,7 @@ static void ah6_destroy(struct xfrm_state *x)
kfree(ahp);
}
-static struct xfrm_type ah6_type =
+static const struct xfrm_type ah6_type =
{
.description = "AH6",
.owner = THIS_MODULE,
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 5bd5292..e3fd9fe 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -395,7 +395,7 @@ error:
return -EINVAL;
}
-static struct xfrm_type esp6_type =
+static const struct xfrm_type esp6_type =
{
.description = "ESP6",
.owner = THIS_MODULE,
diff --git a/net/ipv6/ipcomp6.c b/net/ipv6/ipcomp6.c
index b276d04..c8f38e0 100644
--- a/net/ipv6/ipcomp6.c
+++ b/net/ipv6/ipcomp6.c
@@ -450,7 +450,7 @@ error:
goto out;
}
-static struct xfrm_type ipcomp6_type =
+static const struct xfrm_type ipcomp6_type =
{
.description = "IPCOMP6",
.owner = THIS_MODULE,
diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c
index 49d3966..cd8a5bd 100644
--- a/net/ipv6/mip6.c
+++ b/net/ipv6/mip6.c
@@ -330,7 +330,7 @@ static void mip6_destopt_destroy(struct xfrm_state *x)
{
}
-static struct xfrm_type mip6_destopt_type =
+static const struct xfrm_type mip6_destopt_type =
{
.description = "MIP6DESTOPT",
.owner = THIS_MODULE,
@@ -462,7 +462,7 @@ static void mip6_rthdr_destroy(struct xfrm_state *x)
{
}
-static struct xfrm_type mip6_rthdr_type =
+static const struct xfrm_type mip6_rthdr_type =
{
.description = "MIP6RT",
.owner = THIS_MODULE,
diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c
index fae90ff..639fe8a 100644
--- a/net/ipv6/xfrm6_tunnel.c
+++ b/net/ipv6/xfrm6_tunnel.c
@@ -319,7 +319,7 @@ static void xfrm6_tunnel_destroy(struct xfrm_state *x)
xfrm6_tunnel_free_spi((xfrm_address_t *)&x->props.saddr);
}
-static struct xfrm_type xfrm6_tunnel_type = {
+static const struct xfrm_type xfrm6_tunnel_type = {
.description = "IP6IP6",
.owner = THIS_MODULE,
.proto = IPPROTO_IPV6,
^ permalink raw reply related
* Re: 2.6.24-rc8-mm1 : net tcp_input.c warnings
From: Krishna Kumar2 @ 2008-01-24 10:46 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Andrew Morton, David Miller, Denys Fedoryshchenko, Dave Young,
Kamalesh Babulal, LKML, Netdev, netdev-owner
In-Reply-To: <Pine.LNX.4.64.0801241108400.31652@kivilampi-30.cs.helsinki.fi>
Hi Ilpo,
I have tried parallel iperfs with this patch and don't get any more
warnings.
I will run overnight to be sure.
thanks,
- KK
netdev-owner@vger.kernel.org wrote on 01/24/2008 03:24:18 PM:
> On Thu, 24 Jan 2008, Dave Young wrote:
>
> Hi Dave (& others),
>
> > Thanks.
>
> Thanks a lot, I was first to ignore all these because they occurred
> with newreno, but looked again... :-/
>
> > New warning trigged with your debug patch:
>
> This was probably with the earlier one I sent to you because there's
still
> this case remaining which itself is valid:
>
> > P: 5 L: 0 vs 0 S: 0 vs 1 w: 2044790889-2044796616 (0)
>
> ...snip... this is still ok state (S+L <= P):
>
> > P: 5 L: 0 vs 0 S: 0 vs 3 w: 2044790889-2044796616 (0)
> > TCP wq(s) <
> > TCP wq(h) +++h+<
> > l0 s3 f0 p5 seq: su2044790889 hs2044795029 sn2044796616
> > ------------[ cut here ]------------
> > WARNING: at net/ipv4/tcp_input.c:2169 tcp_mark_head_lost+0x122/0x150()
> > Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event
> > snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss eeprom e100 psmouse
> > snd_hda_intel snd_pcm snd_timer btusb bluetooth serio_raw snd 3c59x sg
> > evdev thermal soundcore rtc_cmos snd_page_alloc rtc_core rtc_lib
> > i2c_i801 processor button intel_agp dcdbas pcspkr agpgart
> > Pid: 0, comm: swapper Not tainted 2.6.24-rc8-mm1 #8
> > [<c0132100>] ? have_callable_console+0x20/0x30
> > [<c0131844>] warn_on_slowpath+0x54/0x80
> > [<c03ffe54>] ? tcp_print_queue+0x1a4/0x230
> > [<c0132438>] ? vprintk+0x308/0x320
> > [<c0132438>] ? vprintk+0x308/0x320
> > [<c0132438>] ? vprintk+0x308/0x320
> > [<c03ffff6>] ? tcp_verify_wq+0x116/0x1c0
> > [<c03f6052>] tcp_mark_head_lost+0x122/0x150
> > [<c03f60ca>] tcp_update_scoreboard+0x4a/0x190
> > [<c03f6e7a>] tcp_fastretrans_alert+0x4da/0x700
> > [<c03f7e63>] tcp_ack+0x1b3/0x3a0
> > [<c03fa21b>] tcp_rcv_established+0x3eb/0x710
> > [<c0401c05>] tcp_v4_do_rcv+0xe5/0x100
> > [<c04021fb>] tcp_v4_rcv+0x5db/0x660
> > [<c0401fa7>] ? tcp_v4_rcv+0x387/0x660
> > [<c03e5eed>] ? ip_local_deliver_finish+0x2d/0x1d0
> > [<c03e5f44>] ip_local_deliver_finish+0x84/0x1d0
> > [<c03e5eed>] ? ip_local_deliver_finish+0x2d/0x1d0
> > [<c0156b97>] ? __lock_release+0x47/0x70
> > [<c03e6147>] ip_local_deliver+0xb7/0xc0
> > [<c03e6202>] ip_rcv_finish+0xb2/0x3c0
> > [<c03c01d8>] ? sock_def_readable+0x48/0xa0
> > [<c03be061>] ? sock_queue_rcv_skb+0xb1/0x1a0
> > [<c03be0a7>] ? sock_queue_rcv_skb+0xf7/0x1a0
> > [<c03e669f>] ip_rcv+0x18f/0x290
> > [<c042fb10>] ? packet_rcv_spkt+0xd0/0x130
> > [<c03c8da6>] netif_receive_skb+0x2b6/0x330
> > [<c03c8c17>] ? netif_receive_skb+0x127/0x330
> > [<c03c8ea3>] ? process_backlog+0x83/0x100
> > [<c03c8eae>] process_backlog+0x8e/0x100
> > [<c03c90bc>] net_rx_action+0x13c/0x230
> > [<c03c8fd9>] ? net_rx_action+0x59/0x230
> > [<c0136f63>] __do_softirq+0x93/0x120
> > [<c013706a>] do_softirq+0x7a/0x80
> > [<c0137135>] irq_exit+0x65/0x90
> > [<c01078b1>] do_IRQ+0x41/0x80
> > [<c0155659>] ? trace_hardirqs_on+0xb9/0x130
> > [<c01059ca>] common_interrupt+0x2e/0x34
> > [<c0103390>] ? mwait_idle_with_hints+0x40/0x50
> > [<c01033a0>] ? mwait_idle+0x0/0x20
> > [<c01033b2>] mwait_idle+0x12/0x20
> > [<c0103141>] cpu_idle+0x61/0x110
> > [<c04339fd>] rest_init+0x5d/0x60
> > [<c05a47fa>] start_kernel+0x1fa/0x260
> > [<c05a4190>] ? unknown_bootoption+0x0/0x130
> > =======================
> > ---[ end trace 14b601818e6903ac ]---
> > ------------[ cut here ]------------
> > WARNING: at net/ipv4/tcp_ipv4.c:197 tcp_verify_wq+0x1b6/0x1c0()
> > Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event
> > snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss eeprom e100 psmouse
> > snd_hda_intel snd_pcm snd_timer btusb bluetooth serio_raw snd 3c59x sg
> > evdev thermal soundcore rtc_cmos snd_page_alloc rtc_core rtc_lib
> > i2c_i801 processor button intel_agp dcdbas pcspkr agpgart
> > Pid: 0, comm: swapper Not tainted 2.6.24-rc8-mm1 #8
> > [<c0132100>] ? have_callable_console+0x20/0x30
> > [<c0131844>] warn_on_slowpath+0x54/0x80
> > [<c01317da>] ? print_oops_end_marker+0x2a/0x30
> > [<c0131849>] ? warn_on_slowpath+0x59/0x80
> > [<c03ffe54>] ? tcp_print_queue+0x1a4/0x230
> > [<c0132438>] ? vprintk+0x308/0x320
> > [<c0132438>] ? vprintk+0x308/0x320
> > [<c0400096>] tcp_verify_wq+0x1b6/0x1c0
> > [<c03ffff6>] ? tcp_verify_wq+0x116/0x1c0
> > [<c03f5ffc>] tcp_mark_head_lost+0xcc/0x150
> > [<c03f60ca>] tcp_update_scoreboard+0x4a/0x190
> > [<c03f6e7a>] tcp_fastretrans_alert+0x4da/0x700
> > [<c03f7e63>] tcp_ack+0x1b3/0x3a0
> > [<c03fa21b>] tcp_rcv_established+0x3eb/0x710
> > [<c0401c05>] tcp_v4_do_rcv+0xe5/0x100
> > [<c04021fb>] tcp_v4_rcv+0x5db/0x660
> > [<c0401fa7>] ? tcp_v4_rcv+0x387/0x660
> > [<c03e5eed>] ? ip_local_deliver_finish+0x2d/0x1d0
> > [<c03e5f44>] ip_local_deliver_finish+0x84/0x1d0
> > [<c03e5eed>] ? ip_local_deliver_finish+0x2d/0x1d0
> > [<c0156b97>] ? __lock_release+0x47/0x70
> > [<c03e6147>] ip_local_deliver+0xb7/0xc0
> > [<c03e6202>] ip_rcv_finish+0xb2/0x3c0
> > [<c03c01d8>] ? sock_def_readable+0x48/0xa0
> > [<c03be061>] ? sock_queue_rcv_skb+0xb1/0x1a0
> > [<c03be0a7>] ? sock_queue_rcv_skb+0xf7/0x1a0
> > [<c03e669f>] ip_rcv+0x18f/0x290
> > [<c042fb10>] ? packet_rcv_spkt+0xd0/0x130
> > [<c03c8da6>] netif_receive_skb+0x2b6/0x330
> > [<c03c8c17>] ? netif_receive_skb+0x127/0x330
> > [<c03c8ea3>] ? process_backlog+0x83/0x100
> > [<c03c8eae>] process_backlog+0x8e/0x100
> > [<c03c90bc>] net_rx_action+0x13c/0x230
> > [<c03c8fd9>] ? net_rx_action+0x59/0x230
> > [<c0136f63>] __do_softirq+0x93/0x120
> > [<c013706a>] do_softirq+0x7a/0x80
> > [<c0137135>] irq_exit+0x65/0x90
> > [<c01078b1>] do_IRQ+0x41/0x80
> > [<c0155659>] ? trace_hardirqs_on+0xb9/0x130
> > [<c01059ca>] common_interrupt+0x2e/0x34
> > [<c0103390>] ? mwait_idle_with_hints+0x40/0x50
> > [<c01033a0>] ? mwait_idle+0x0/0x20
> > [<c01033b2>] mwait_idle+0x12/0x20
> > [<c0103141>] cpu_idle+0x61/0x110
> > [<c04339fd>] rest_init+0x5d/0x60
> > [<c05a47fa>] start_kernel+0x1fa/0x260
> > [<c05a4190>] ? unknown_bootoption+0x0/0x130
> > =======================
> > ---[ end trace 14b601818e6903ac ]---
>
> ...But this no longer is, and even more, L: 5 is not valid state at this
> point all (should only happen if we went to RTO but it would reset S to
> zero with newreno):
>
> > P: 5 L: 5 vs 5 S: 0 vs 3 w: 2044790889-2044796616 (0)
> > TCP wq(s) LLLLl<
> > TCP wq(h) +++h+<
> > l5 s3 f0 p5 seq: su2044790889 hs2044795029 sn2044796616
>
> Surprisingly, it was the first time the WARN_ON for left_out returned
> correct location. This also explains why the patch I sent to Krishna
> didn't print anything (it didn't end up into printing because I forgot
> to add L+S>P check into to the state checking if).
>
> ...so please, could you (others than Denys) try this patch, it should
> solve the issue. And Denys, could you confirm (and if necessary double
> check) that the kernel you saw this similar problem with is the pure
> Linus' mainline, i.e., without any net-2.6.25 or mm bits please, if so,
> that problem persists. And anyway, there were some fackets_out related
> problems reported as well and this doesn't help for that but I think I've
> lost track of who was seeing it due to large number of reports :-), could
> somebody refresh my memory because I currently don't have time to dig it
> up from archives (at least on this week).
>
>
> --
> i.
>
> --
> [PATCH] [TCP]: NewReno must count every skb while marking losses
>
> NewReno should add cnt per skb (as with FACK) instead of depending
> on SACKED_ACKED bits which won't be set with it at all.
> Effectively, NewReno should always exists after the first
> iteration anyway (or immediately if there's already head in
> lost_out.
>
> This was fixed earlier in net-2.6.25 but got reverted among other
> stuff and I didn't notice that this is still necessary (actually
> wasn't even considering this case while trying to figure out the
> reports because I lived with different kind of code than it in
> reality was).
>
> This should solve the WARN_ONs in TCP code that as a result of
> this triggered multiple times in every place we check for this
> invariant.
>
> Special thanks to Dave Young <hidave.darkstar@gmail.com> and
> Krishna Kumar2 <krkumar2@in.ibm.com> for trying with my debug
> patches.
>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> ---
> net/ipv4/tcp_input.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 295490e..aa409a5 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2156,7 +2156,7 @@ static void tcp_mark_head_lost(struct sock *sk, int
> packets, int fast_rexmit)
> tp->lost_skb_hint = skb;
> tp->lost_cnt_hint = cnt;
>
> - if (tcp_is_fack(tp) ||
> + if (tcp_is_fack(tp) || tcp_is_reno(tp) ||
> (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED))
> cnt += tcp_skb_pcount(skb);
>
> --
> 1.5.2.2
^ permalink raw reply
* [PATCH UCC TDM 2/3 ]Updated: UCC TDM driver for QE based MPC83xx platforms.
From: Poonam_Aggrwal-b10812 @ 2008-01-24 10:35 UTC (permalink / raw)
To: kumar.gala, akpm, linux-kernel, netdev, rubini, linuxppc-dev
Cc: michael.barkowski, kim.phillips, ashish.kalra, timur, rich.cutler,
b10812
Incorporated Stephen's comments.
From: Poonam Agarwal-b10812 <b10812@freescale.com>
The UCC TDM driver basically multiplexes and demultiplexes data from
different channels. It can interface with for example SLIC kind of devices
to receive TDM data demultiplex it and send to upper modules. At the
transmit end it receives data for different channels multiplexes it and
sends them on the TDM channel. It internally uses TSA( Time Slot Assigner)
which does multiplexing and demultiplexing, UCC to perform SDMA between
host buffers and the TSA, CMX to connect TSA to UCC.
It can be used by a kernel module which can call tdm_register_client to
get access to a TDM device.
The driver is right now a misc driver with no subsystem as such.
There can be a platform independent TDM layer which is planned to be
done after this. TDM bus sort of thing.
The dts file keeps a track of the TDM devices present on the board.
Depending on them the TDM driver initializes those many driver instances
while coming up.
The driver on the upper level can plug to more than one tdm clients
depending on the availablity of TDM devices. At every new request of a TDM
client to bind with a TDM device, a free driver instance is allocated to
the client.
The interface can be described as follows.
tdm_register_client(struct tdm_client *)
This API returns a pointer to the structure tdm_client which is of
type
struct tdm_client {
u32 client_id;
u32 (*tdm_read)(u32 client_id, short chn_id, short
*pcm_buffer, short len);
u32 (*tdm_write)(u32 client_id, short chn_id, short
*pcm_buffer, short len);
wait_queue_head_t *wakeup_event;
}
It consists of:
- client_id: It is basically to identify the particular TDM
device/driver instance.
- tdm_read: It is a function pointer returned by the TDM driver to be
used to read TDM data from a particular TDM channel.
- tdm_write: It is a function pointer returned by the TDM driver to be
used to write TDM data to a particular TDM channel.
- wakeup_event: It is address of a wait_queue event on which the client
keeps on sleeping, and the TDM driver wakes it up periodically. The driver
is configured to
wake up the client after every 10ms.
Once the TDM client gets registered to a TDM driver instance and a TDM
device, it interfaces with the driver using tdm_read, tdm_write and
wakeup_event.
This driver will run on MPC8323E-RDB platforms.
Signed-off-by: Poonam Aggrwal <b10812@freescale.com>
Signed-off-by: Ashish Kalra <ashish.kalra@freescale.com>
Signed-off-by: Kim Phillips <Kim.Phillips@freescale.com>
Signed-off-by: Michael Barkowski <michael.barkowski@freescale.com>
---
drivers/misc/Kconfig | 14 +
drivers/misc/Makefile | 1 +
drivers/misc/ucc_tdm.c | 1017 ++++++++++++++++++++++++++++++++++++++++++++++++
drivers/misc/ucc_tdm.h | 221 +++++++++++
4 files changed, 1253 insertions(+), 0 deletions(-)
create mode 100644 drivers/misc/ucc_tdm.c
create mode 100644 drivers/misc/ucc_tdm.h
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index b5e67c0..628b14b 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -232,4 +232,18 @@ config ATMEL_SSC
If unsure, say N.
+config UCC_TDM
+ bool "Freescale UCC TDM Driver"
+ depends on QUICC_ENGINE && UCC_FAST
+ default n
+ ---help---
+ The TDM driver is for UCC based TDM devices for example, TDM device on
+ MPC832x RDB. Select it to run PowerVoIP on MPC832x RDB board.
+ The TDM driver can interface with SLIC kind of devices to transmit
+ and receive TDM samples. The TDM driver receives Time Division
+ multiplexed samples(for different channels) from the SLIC device,
+ demutiplexes them and sends them to the upper layers. At the transmit
+ end the TDM drivers receives samples for different channels, it
+ multiplexes them and sends them to the SLIC device.
+
endif # MISC_DEVICES
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 87f2685..6f0c49d 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_SONY_LAPTOP) += sony-laptop.o
obj-$(CONFIG_THINKPAD_ACPI) += thinkpad_acpi.o
obj-$(CONFIG_FUJITSU_LAPTOP) += fujitsu-laptop.o
obj-$(CONFIG_EEPROM_93CX6) += eeprom_93cx6.o
+obj-$(CONFIG_UCC_TDM) += ucc_tdm.o
diff --git a/drivers/misc/ucc_tdm.c b/drivers/misc/ucc_tdm.c
new file mode 100644
index 0000000..2181132
--- /dev/null
+++ b/drivers/misc/ucc_tdm.c
@@ -0,0 +1,1017 @@
+/*
+ * drivers/misc/ucc_tdm.c
+ *
+ * UCC Based Linux TDM Driver
+ * This driver is designed to support UCC based TDM for PowerPC processors.
+ * This driver can interface with SLIC device to run VOIP kind of
+ * applications.
+ *
+ * Author: Ashish Kalra & Poonam Aggrwal
+ *
+ * Copyright (c) 2007 Freescale Semiconductor, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ */
+
+#include <linux/autoconf.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/interrupt.h>
+#include <linux/time.h>
+#include <linux/skbuff.h>
+#include <linux/proc_fs.h>
+#include <linux/delay.h>
+#include <linux/dma-mapping.h>
+#include <linux/string.h>
+#include <linux/irq.h>
+#include <linux/of_platform.h>
+#include <linux/io.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
+
+#include <asm/immap_qe.h>
+#include <asm/qe.h>
+#include <asm/ucc.h>
+#include <asm/ucc_fast.h>
+#include <asm/ucc_slow.h>
+
+#include "ucc_tdm.h"
+#define DRV_DESC "Freescale QE UCC TDM Driver"
+#define DRV_NAME "ucc_tdm"
+
+
+/*
+ * define the following #define if snooping or hardware-based cache coherency
+ * is disabled on the UCC transparent controller.This flag enables
+ * software-based cache-coherency support by explicitly flushing data cache
+ * contents after setting up the TDM output buffer(s) and invalidating the
+ * data cache contents before the TDM input buffer(s) are read.
+ */
+#undef UCC_CACHE_SNOOPING_DISABLED
+
+#define MAX_NUM_TDM_DEVICES 8
+
+static struct tdm_ctrl *tdm_ctrl[MAX_NUM_TDM_DEVICES];
+
+static int num_tdm_devices;
+static int num_tdm_clients;
+
+static struct ucc_tdm_info utdm_primary_info = {
+ .uf_info = {
+ .tsa = 1,
+ .cdp = 1,
+ .cds = 1,
+ .ctsp = 1,
+ .ctss = 1,
+ .revd = 1,
+ .urfs = 0x128,
+ .utfs = 0x128,
+ .utfet = 0,
+ .utftt = 0x128,
+ .ufpt = 256,
+ .ttx_trx = UCC_FAST_GUMR_TRANSPARENT_TTX_TRX_TRANSPARENT,
+ .tenc = UCC_FAST_TX_ENCODING_NRZ,
+ .renc = UCC_FAST_RX_ENCODING_NRZ,
+ .tcrc = UCC_FAST_16_BIT_CRC,
+ .synl = UCC_FAST_SYNC_LEN_NOT_USED,
+ },
+ .ucc_busy = 0,
+};
+
+static struct ucc_tdm_info utdm_info[8];
+
+static void dump_siram(struct tdm_ctrl *tdm_c)
+{
+#ifdef DEBUG
+ int i;
+ u16 phy_num_ts;
+
+ phy_num_ts = tdm_c->physical_num_ts;
+
+ pr_debug("SI TxRAM dump\n");
+ /* each slot entry in SI RAM is of 2 bytes */
+ for (i = 0; i < phy_num_ts * 2; i++)
+ pr_debug("%x ", in_8(&qe_immr->sir.tx[i]));
+ pr_debug("\nSI RxRAM dump\n");
+ for (i = 0; i < phy_num_ts * 2; i++)
+ pr_debug("%x ", in_8(&qe_immr->sir.rx[i]));
+ pr_debug("\n");
+#endif
+}
+
+static void dump_ucc(struct tdm_ctrl *tdm_c)
+{
+#ifdef DEBUG
+ struct ucc_transparent_pram *ucc_pram;
+
+ ucc_pram = tdm_c->ucc_pram;
+
+ pr_debug("%s Dumping UCC Registers\n", __FUNCTION__);
+ ucc_fast_dump_regs(tdm_c->uf_private);
+ pr_debug("%s Dumping UCC Parameter RAM\n", __FUNCTION__);
+ pr_debug("rbase = 0x%x\n", in_be32(&ucc_pram->rbase));
+ pr_debug("rbptr = 0x%x\n", in_be32(&ucc_pram->rbptr));
+ pr_debug("mrblr = 0x%x\n", in_be16(&ucc_pram->mrblr));
+ pr_debug("rbdlen = 0x%x\n", in_be16(&ucc_pram->rbdlen));
+ pr_debug("rbdstat = 0x%x\n", in_be16(&ucc_pram->rbdstat));
+ pr_debug("rstate = 0x%x\n", in_be32(&ucc_pram->rstate));
+ pr_debug("rdptr = 0x%x\n", in_be32(&ucc_pram->rdptr));
+ pr_debug("tbase = 0x%x\n", in_be32(&ucc_pram->tbase));
+ pr_debug("tbptr = 0x%x\n", in_be32(&ucc_pram->tbptr));
+ pr_debug("tbdlen = 0x%x\n", in_be16(&ucc_pram->tbdlen));
+ pr_debug("tbdstat = 0x%x\n", in_be16(&ucc_pram->tbdstat));
+ pr_debug("tstate = 0x%x\n", in_be32(&ucc_pram->tstate));
+ pr_debug("tdptr = 0x%x\n", in_be32(&ucc_pram->tdptr));
+#endif
+}
+
+/*
+ * For use when a framing bit is not present
+ * Program current-route SI ram
+ * Set SIxRAM TDMx
+ * Entries must be in units of 8.
+ * SIR_UCC -> Channel Select
+ * SIR_CNT -> Number of bits or bytes
+ * SIR_BYTE -> Byte or Bit resolution
+ * SIR_LAST -> Indicates last entry in SIxRAM
+ * SIR_IDLE -> The Tx data pin is Tri-stated and the Rx data pin is
+ * ignored
+ */
+static void set_siram(struct tdm_ctrl *tdm_c, enum comm_dir dir)
+{
+ const u16 *mask;
+ u16 temp_mask = 1;
+ u16 siram_code = 0;
+ u32 i, j, k;
+ u32 ucc;
+ u32 phy_num_ts;
+
+ phy_num_ts = tdm_c->physical_num_ts;
+ ucc = tdm_c->ut_info->uf_info.ucc_num;
+
+ if (dir == COMM_DIR_RX)
+ mask = tdm_c->rx_mask;
+ else
+ mask = tdm_c->tx_mask;
+ k = 0;
+ j = 0;
+ for (i = 0; i < phy_num_ts; i++) {
+ if ((mask[k] & temp_mask) == temp_mask)
+ siram_code = SIR_UCC(ucc) | SIR_CNT(0) | SIR_BYTE;
+ else
+ siram_code = SIR_IDLE | SIR_CNT(0) | SIR_BYTE;
+ if (dir == COMM_DIR_RX)
+ out_be16((u16 *)&qe_immr->sir.rx[i * 2], siram_code);
+ else
+ out_be16((u16 *)&qe_immr->sir.tx[i * 2], siram_code);
+ temp_mask = temp_mask << 1;
+ j++;
+ if (j >= 16) {
+ j = 0;
+ temp_mask = 0x0001;
+ k++;
+ }
+ }
+ siram_code = siram_code | SIR_LAST;
+
+ if (dir == COMM_DIR_RX)
+ out_be16((u16 *)&qe_immr->sir.rx[(phy_num_ts - 1) * 2],
+ siram_code);
+ else
+ out_be16((u16 *)&qe_immr->sir.tx[(phy_num_ts - 1) * 2],
+ siram_code);
+}
+
+static void config_si(struct tdm_ctrl *tdm_c)
+{
+ u8 rxsyncdelay, txsyncdelay, tdm_port;
+ u16 sixmr_val = 0;
+ u32 tdma_mode_off;
+ u16 *si1_tdm_mode_reg;
+
+ tdm_port = tdm_c->tdm_port;
+
+ set_siram(tdm_c, COMM_DIR_RX);
+
+ set_siram(tdm_c, COMM_DIR_TX);
+
+ rxsyncdelay = tdm_c->cfg_ctrl.rx_fr_sync_delay;
+ txsyncdelay = tdm_c->cfg_ctrl.tx_fr_sync_delay;
+ if (tdm_c->cfg_ctrl.com_pin)
+ sixmr_val |= SIMODE_CRT;
+ if (tdm_c->cfg_ctrl.fr_sync_level == 1)
+ sixmr_val |= SIMODE_SL;
+ if (tdm_c->cfg_ctrl.clk_edge == 1)
+ sixmr_val |= SIMODE_CE;
+ if (tdm_c->cfg_ctrl.fr_sync_edge == 1)
+ sixmr_val |= SIMODE_FE;
+ sixmr_val |= (SIMODE_TFSD(txsyncdelay) | SIMODE_RFSD(rxsyncdelay));
+
+ tdma_mode_off = SI_TDM_MODE_REGISTER_OFFSET * tdm_c->tdm_port;
+
+ si1_tdm_mode_reg = (u8 *)&qe_immr->si1 + tdma_mode_off;
+ out_be16(si1_tdm_mode_reg, sixmr_val);
+
+ dump_siram(tdm_c);
+}
+
+static int tdm_init(struct tdm_ctrl *tdm_c)
+{
+ u32 tdm_port, ucc, act_num_ts;
+ int ret, i, err;
+ u32 cecr_subblock;
+ u32 pram_offset;
+ u32 rxbdt_offset;
+ u32 txbdt_offset;
+ u32 rx_ucode_buf_offset, tx_ucode_buf_offset;
+ u16 bd_status, bd_len;
+ enum qe_clock clock;
+ struct qe_bd __iomem *rx_bd, *tx_bd;
+
+ tdm_port = tdm_c->tdm_port;
+ ucc = tdm_c->ut_info->uf_info.ucc_num;
+ act_num_ts = tdm_c->cfg_ctrl.active_num_ts;
+
+ /*
+ * TDM Tx and Rx CLKs = 2048 KHz.
+ */
+ if (strstr(tdm_c->ut_info->uf_info.tdm_tx_clk, "BRG")) {
+ clock = qe_clock_source(tdm_c->ut_info->uf_info.tdm_tx_clk);
+ err = qe_setbrg(clock, 2048000, 1);
+ if (err < 0) {
+ printk(KERN_ERR "%s: Failed to set %s\n", __FUNCTION__,
+ tdm_c->ut_info->uf_info.tdm_tx_clk);
+ return err;
+ }
+ }
+ if (strstr(tdm_c->ut_info->uf_info.tdm_rx_clk, "BRG")) {
+ clock = qe_clock_source(tdm_c->ut_info->uf_info.tdm_rx_clk);
+ err = qe_setbrg(clock, 2048000, 1);
+ if (err < 0) {
+ printk(KERN_ERR "%s: Failed to set %s\n", __FUNCTION__,
+ tdm_c->ut_info->uf_info.tdm_rx_clk);
+ return err;
+ }
+ }
+ /*
+ * TDM FSyncs = 4 KHz.
+ */
+ if (strstr(tdm_c->ut_info->uf_info.tdm_tx_sync, "BRG")) {
+ clock = qe_clock_source(tdm_c->ut_info->uf_info.tdm_tx_sync);
+ err = qe_setbrg(clock, 4000, 1);
+ if (err < 0) {
+ printk(KERN_ERR "%s: Failed to set %s\n", __FUNCTION__,
+ tdm_c->ut_info->uf_info.tdm_tx_sync);
+ return err;
+ }
+ }
+ if (strstr(tdm_c->ut_info->uf_info.tdm_rx_sync, "BRG")) {
+ clock = qe_clock_source(tdm_c->ut_info->uf_info.tdm_rx_sync);
+ err = qe_setbrg(clock, 4000, 1);
+ if (err < 0) {
+ printk(KERN_ERR "%s: Failed to set %s\n", __FUNCTION__,
+ tdm_c->ut_info->uf_info.tdm_rx_sync);
+ return err;
+ }
+ }
+
+ tdm_c->ut_info->uf_info.uccm_mask = (u32)
+ ((UCC_TRANS_UCCE_RXB | UCC_TRANS_UCCE_BSY) << 16);
+
+ if (ucc_fast_init(&(tdm_c->ut_info->uf_info), &tdm_c->uf_private)) {
+ printk(KERN_ERR "%s: Failed to init uccf\n", __FUNCTION__);
+ return -ENOMEM;
+ }
+
+ ucc_fast_disable(tdm_c->uf_private, COMM_DIR_RX | COMM_DIR_TX);
+
+ /* Write to QE CECR, UCCx channel to Stop Transmission */
+ cecr_subblock = ucc_fast_get_qe_cr_subblock(ucc);
+ qe_issue_cmd(QE_STOP_TX, cecr_subblock,
+ (u8) QE_CR_PROTOCOL_UNSPECIFIED, 0);
+
+ pram_offset = qe_muram_alloc(UCC_TRANSPARENT_PRAM_SIZE,
+ ALIGNMENT_OF_UCC_SLOW_PRAM);
+ if (IS_ERR_VALUE(pram_offset)) {
+ printk(KERN_ERR "%s: Cannot allocate MURAM memory for"
+ " transparent UCC\n", __FUNCTION__);
+ ret = -ENOMEM;
+ goto pram_alloc_error;
+ }
+
+ cecr_subblock = ucc_fast_get_qe_cr_subblock(ucc);
+ qe_issue_cmd(QE_ASSIGN_PAGE_TO_DEVICE, cecr_subblock,
+ QE_CR_PROTOCOL_UNSPECIFIED, pram_offset);
+
+ tdm_c->ucc_pram = qe_muram_addr(pram_offset);
+ tdm_c->ucc_pram_offset = pram_offset;
+
+ /*
+ * zero-out pram, this will also ensure RSTATE, TSTATE are cleared, also
+ * DISFC & CRCEC counters will be initialized.
+ */
+ memset(tdm_c->ucc_pram, 0, sizeof(struct ucc_transparent_pram));
+
+ /* rbase, tbase alignment is 8. */
+ rxbdt_offset = qe_muram_alloc(NR_BUFS * sizeof(struct qe_bd),
+ QE_ALIGNMENT_OF_BD);
+ if (IS_ERR_VALUE(rxbdt_offset)) {
+ printk(KERN_ERR "%s: Cannot allocate MURAM memory for RxBDs\n",
+ __FUNCTION__);
+ ret = -ENOMEM;
+ goto rxbd_alloc_error;
+ }
+ txbdt_offset = qe_muram_alloc(NR_BUFS * sizeof(struct qe_bd),
+ QE_ALIGNMENT_OF_BD);
+ if (IS_ERR_VALUE(txbdt_offset)) {
+ printk(KERN_ERR "%s: Cannot allocate MURAM memory for TxBDs\n",
+ __FUNCTION__);
+ ret = -ENOMEM;
+ goto txbd_alloc_error;
+ }
+ tdm_c->tx_bd = qe_muram_addr(txbdt_offset);
+ tdm_c->rx_bd = qe_muram_addr(rxbdt_offset);
+
+ tdm_c->tx_bd_offset = txbdt_offset;
+ tdm_c->rx_bd_offset = rxbdt_offset;
+
+ rx_bd = tdm_c->rx_bd;
+ tx_bd = tdm_c->tx_bd;
+
+ out_be32(&tdm_c->ucc_pram->rbase, (u32) immrbar_virt_to_phys(rx_bd));
+ out_be32(&tdm_c->ucc_pram->tbase, (u32) immrbar_virt_to_phys(tx_bd));
+
+ for (i = 0; i < NR_BUFS - 1; i++) {
+ bd_status = (u16) ((R_E | R_CM | R_I) >> 16);
+ bd_len = 0;
+ out_be16(&rx_bd->length, bd_len);
+ out_be16(&rx_bd->status, bd_status);
+ out_be32(&rx_bd->buf,
+ tdm_c->dma_input_addr + i * SAMPLE_DEPTH * act_num_ts);
+ rx_bd += 1;
+
+ bd_status = (u16) ((T_R | T_CM) >> 16);
+ bd_len = SAMPLE_DEPTH * act_num_ts;
+ out_be16(&tx_bd->length, bd_len);
+ out_be16(&tx_bd->status, bd_status);
+ out_be32(&tx_bd->buf,
+ tdm_c->dma_output_addr + i * SAMPLE_DEPTH * act_num_ts);
+ tx_bd += 1;
+ }
+
+ bd_status = (u16) ((R_E | R_CM | R_I | R_W) >> 16);
+ bd_len = 0;
+ out_be16(&rx_bd->length, bd_len);
+ out_be16(&rx_bd->status, bd_status);
+ out_be32(&rx_bd->buf,
+ tdm_c->dma_input_addr + i * SAMPLE_DEPTH * act_num_ts);
+
+ bd_status = (u16) ((T_R | T_CM | T_W) >> 16);
+ bd_len = SAMPLE_DEPTH * act_num_ts;
+ out_be16(&tx_bd->length, bd_len);
+ out_be16(&tx_bd->status, bd_status);
+ out_be32(&tx_bd->buf,
+ tdm_c->dma_output_addr + i * SAMPLE_DEPTH * act_num_ts);
+
+ config_si(tdm_c);
+
+ setbits32(&qe_immr->ic.qimr, (0x80000000UL >> ucc));
+
+ rx_ucode_buf_offset = qe_muram_alloc(32, 32);
+ if (IS_ERR_VALUE(rx_ucode_buf_offset)) {
+ printk(KERN_ERR "%s: Cannot allocate MURAM mem for Rx"
+ " ucode buf\n", __FUNCTION__);
+ ret = -ENOMEM;
+ goto rxucode_buf_alloc_error;
+ }
+
+ tx_ucode_buf_offset = qe_muram_alloc(32, 32);
+ if (IS_ERR_VALUE(tx_ucode_buf_offset)) {
+ printk(KERN_ERR "%s: Cannot allocate MURAM mem for Tx"
+ " ucode buf\n", __FUNCTION__);
+ ret = -ENOMEM;
+ goto txucode_buf_alloc_error;
+ }
+ out_be16(&tdm_c->ucc_pram->riptr, (u16) rx_ucode_buf_offset);
+ out_be16(&tdm_c->ucc_pram->tiptr, (u16) tx_ucode_buf_offset);
+
+ tdm_c->rx_ucode_buf_offset = rx_ucode_buf_offset;
+ tdm_c->tx_ucode_buf_offset = tx_ucode_buf_offset;
+
+ /*
+ * set the receive buffer descriptor maximum size to be
+ * SAMPLE_DEPTH * number of active RX channels
+ */
+ out_be16(&tdm_c->ucc_pram->mrblr, (u16) SAMPLE_DEPTH * act_num_ts);
+
+ /*
+ * enable snooping and BE byte ordering on the UCC pram's
+ * tstate & rstate registers.
+ */
+ out_be32(&tdm_c->ucc_pram->tstate, 0x30000000UL);
+ out_be32(&tdm_c->ucc_pram->rstate, 0x30000000UL);
+
+ /*Put UCC transparent controller into serial interface mode. */
+ out_be32(&tdm_c->uf_regs->upsmr, 0);
+
+ /* Reset TX and RX for UCCx */
+ cecr_subblock = ucc_fast_get_qe_cr_subblock(ucc);
+ qe_issue_cmd(QE_INIT_TX_RX, cecr_subblock,
+ (u8) QE_CR_PROTOCOL_UNSPECIFIED, 0);
+
+ return 0;
+
+txucode_buf_alloc_error:
+ qe_muram_free(rx_ucode_buf_offset);
+rxucode_buf_alloc_error:
+ qe_muram_free(txbdt_offset);
+txbd_alloc_error:
+ qe_muram_free(rxbdt_offset);
+rxbd_alloc_error:
+ qe_muram_free(pram_offset);
+pram_alloc_error:
+ ucc_fast_free(tdm_c->uf_private);
+ return ret;
+}
+
+static void tdm_deinit(struct tdm_ctrl *tdm_c)
+{
+ qe_muram_free(tdm_c->rx_ucode_buf_offset);
+ qe_muram_free(tdm_c->tx_ucode_buf_offset);
+
+ if (tdm_c->rx_bd_offset) {
+ qe_muram_free(tdm_c->rx_bd_offset);
+ tdm_c->rx_bd = NULL;
+ tdm_c->rx_bd_offset = 0;
+ }
+ if (tdm_c->tx_bd_offset) {
+ qe_muram_free(tdm_c->tx_bd_offset);
+ tdm_c->tx_bd = NULL;
+ tdm_c->tx_bd_offset = 0;
+ }
+ if (tdm_c->ucc_pram_offset) {
+ qe_muram_free(tdm_c->ucc_pram_offset);
+ tdm_c->ucc_pram = NULL;
+ tdm_c->ucc_pram_offset = 0;
+ }
+}
+
+
+static irqreturn_t tdm_isr(int irq, void *dev_id)
+{
+ u8 *input_tdm_buffer, *output_tdm_buffer;
+ u32 txb, rxb;
+ u32 ucc;
+ register u32 ucce = 0;
+ struct tdm_ctrl *tdm_c;
+ tdm_c = (struct tdm_ctrl *)dev_id;
+
+ tdm_c->tdm_icnt++;
+ ucc = tdm_c->ut_info->uf_info.ucc_num;
+ input_tdm_buffer = tdm_c->tdm_input_data;
+ output_tdm_buffer = tdm_c->tdm_output_data;
+
+ if (in_be32(tdm_c->uf_private->p_ucce) &
+ (UCC_TRANS_UCCE_BSY << 16)) {
+ out_be32(tdm_c->uf_private->p_ucce,
+ (UCC_TRANS_UCCE_BSY << 16));
+ pr_info("%s: From tdm isr busy interrupt\n",
+ __FUNCTION__);
+ dump_ucc(tdm_c);
+
+ return IRQ_HANDLED;
+ }
+
+ if (tdm_c->tdm_flag == 1) {
+ /* track phases for Rx/Tx */
+ tdm_c->phase_rx += 1;
+ if (tdm_c->phase_rx == MAX_PHASE)
+ tdm_c->phase_rx = 0;
+
+ tdm_c->phase_tx += 1;
+ if (tdm_c->phase_tx == MAX_PHASE)
+ tdm_c->phase_tx = 0;
+
+#ifdef CONFIG_TDM_HW_LB_TSA_SLIC
+ {
+ u32 temp_rx, temp_tx, phase_tx, phase_rx;
+ int i;
+ phase_rx = tdm_c->phase_rx;
+ phase_tx = tdm_c->phase_tx;
+ if (phase_rx == 0)
+ phase_rx = MAX_PHASE;
+ else
+ phase_rx -= 1;
+ if (phase_tx == 0)
+ phase_tx = MAX_PHASE;
+ else
+ phase_tx -= 1;
+ temp_rx = phase_rx * SAMPLE_DEPTH * ACTIVE_CH;
+ temp_tx = phase_tx * SAMPLE_DEPTH * ACTIVE_CH;
+
+ /*check if loopback received data on TS0 is correct. */
+ pr_debug("%s: check if loopback received data on TS0"
+ " is correct\n", __FUNCTION__);
+ pr_debug("%d,%d ", phase_rx, phase_tx);
+ for (i = 0; i < 8; i++)
+ pr_debug("%1d,%1d ",
+ input_tdm_buffer[temp_rx + i],
+ output_tdm_buffer[temp_tx + i]);
+ pr_debug("\n");
+ }
+#endif
+
+ /* schedule BH */
+ wake_up_interruptible(&tdm_c->wakeup_event);
+ } else {
+ if (tdm_c->tdm_icnt == STUTTER_INT_CNT) {
+ txb = in_be32(&tdm_c->ucc_pram->tbptr) -
+ in_be32(&tdm_c->ucc_pram->tbase);
+ rxb = in_be32(&tdm_c->ucc_pram->rbptr) -
+ in_be32(&tdm_c->ucc_pram->rbase);
+ tdm_c->phase_tx = txb / sizeof(struct qe_bd);
+ tdm_c->phase_rx = rxb / sizeof(struct qe_bd);
+
+#ifdef CONFIG_TDM_HW_LB_TSA_SLIC
+ tdm_c->phase_tx = tdm_c->phase_rx;
+#endif
+
+ /* signal "stuttering" period is over */
+ tdm_c->tdm_flag = 1;
+
+ pr_debug("%s: stuttering period is over\n",
+ __FUNCTION__);
+
+ if (in_be32(tdm_c->uf_private->p_ucce) &
+ (UCC_TRANS_UCCE_TXE << 16)) {
+ u32 cecr_subblock;
+ out_be32(tdm_c->uf_private->p_ucce,
+ (UCC_TRANS_UCCE_TXE << 16));
+ pr_debug("%s: From tdm isr txe interrupt\n",
+ __FUNCTION__);
+
+ cecr_subblock =
+ ucc_fast_get_qe_cr_subblock(ucc);
+ qe_issue_cmd(QE_RESTART_TX, cecr_subblock,
+ (u8) QE_CR_PROTOCOL_UNSPECIFIED,
+ 0);
+ }
+ }
+ }
+
+ ucce = (in_be32(tdm_c->uf_private->p_ucce)
+ & in_be32(tdm_c->uf_private->p_uccm));
+
+ out_be32(tdm_c->uf_private->p_ucce, ucce);
+
+ return IRQ_HANDLED;
+}
+
+static int tdm_start(struct tdm_ctrl *tdm_c)
+{
+ if (request_irq(tdm_c->ut_info->uf_info.irq, tdm_isr,
+ 0, "tdm", tdm_c)) {
+ printk(KERN_ERR "%s: request_irq for tdm_isr failed\n",
+ __FUNCTION__);
+ return -ENODEV;
+ }
+
+ ucc_fast_enable(tdm_c->uf_private, COMM_DIR_RX | COMM_DIR_TX);
+
+ pr_info("%s 16-bit linear pcm mode active with"
+ " slots 0 & 2\n", __FUNCTION__);
+
+ dump_siram(tdm_c);
+ dump_ucc(tdm_c);
+
+ setbits8(&(qe_immr->si1.siglmr1_h), (0x1 << tdm_c->tdm_port));
+ pr_info("%s UCC based TDM enabled\n", __FUNCTION__);
+
+ return 0;
+}
+
+static void tdm_stop(struct tdm_ctrl *tdm_c)
+{
+ u32 port, si;
+ u32 ucc;
+ u32 cecr_subblock;
+
+ port = tdm_c->tdm_port;
+ si = tdm_c->si;
+ ucc = tdm_c->ut_info->uf_info.ucc_num;
+ cecr_subblock = ucc_fast_get_qe_cr_subblock(ucc);
+
+ qe_issue_cmd(QE_GRACEFUL_STOP_TX, cecr_subblock,
+ (u8) QE_CR_PROTOCOL_UNSPECIFIED, 0);
+ qe_issue_cmd(QE_CLOSE_RX_BD, cecr_subblock,
+ (u8) QE_CR_PROTOCOL_UNSPECIFIED, 0);
+
+ clrbits8(&qe_immr->si1.siglmr1_h, (0x1 << port));
+ ucc_fast_disable(tdm_c->uf_private, COMM_DIR_RX);
+ ucc_fast_disable(tdm_c->uf_private, COMM_DIR_TX);
+ free_irq(tdm_c->ut_info->uf_info.irq, tdm_c);
+}
+
+
+static void config_tdm(struct tdm_ctrl *tdm_c)
+{
+ u32 i, j, k;
+
+ j = 0;
+ k = 0;
+
+ /* Set Mask Bits */
+ for (i = 0; i < ACTIVE_CH; i++) {
+ tdm_c->tx_mask[k] |= (1 << j);
+ tdm_c->rx_mask[k] |= (1 << j);
+ j++;
+ if (j >= 16) {
+ j = 0;
+ k++;
+ }
+ }
+ /* physical number of slots in a frame */
+ tdm_c->physical_num_ts = NUM_TS;
+
+ /* common receive and transmit pins */
+ tdm_c->cfg_ctrl.com_pin = 1;
+
+ /* L1R/TSYNC active logic "1" */
+ tdm_c->cfg_ctrl.fr_sync_level = 0;
+
+ /*
+ * TX data on rising edge of clock
+ * RX data on falling edge
+ */
+ tdm_c->cfg_ctrl.clk_edge = 0;
+
+ /* Frame sync sampled on falling edge */
+ tdm_c->cfg_ctrl.fr_sync_edge = 0;
+
+ /* no bit delay */
+ tdm_c->cfg_ctrl.rx_fr_sync_delay = 0;
+
+ /* no bit delay */
+ tdm_c->cfg_ctrl.tx_fr_sync_delay = 0;
+
+#ifndef CONFIG_TDM_HW_LB_TSA_SLIC
+ if (tdm_c->leg_slic) {
+ /* Need 1 bit delay for Legrity SLIC */
+ tdm_c->cfg_ctrl.rx_fr_sync_delay = 1;
+ tdm_c->cfg_ctrl.tx_fr_sync_delay = 1;
+ pr_info("%s Delay for Legerity!\n", __FUNCTION__);
+ }
+#endif
+
+ tdm_c->cfg_ctrl.active_num_ts = ACTIVE_CH;
+}
+
+static void tdm_read(u32 client_id, short chn_id, short *pcm_buffer,
+ short len)
+{
+ int i;
+ u32 phase_rx;
+ /* point to where to start for the current phase data processing */
+ u32 temp_rx;
+
+ struct tdm_ctrl *tdm_c = tdm_ctrl[client_id];
+
+ u16 *input_tdm_buffer =
+ (u16 *)tdm_c->tdm_input_data;
+
+ phase_rx = tdm_c->phase_rx;
+ if (phase_rx == 0)
+ phase_rx = MAX_PHASE;
+ else
+ phase_rx -= 1;
+
+ temp_rx = phase_rx * SAMPLE_DEPTH * EFF_ACTIVE_CH;
+
+#ifdef UCC_CACHE_SNOOPING_DISABLED
+ flush_dcache_range((size_t) &input_tdm_buffer[temp_rx],
+ (size_t) &input_tdm_buffer[temp_rx +
+ SAMPLE_DEPTH * ACTIVE_CH]);
+#endif
+ for (i = 0; i < len; i++)
+ pcm_buffer[i] =
+ input_tdm_buffer[i * EFF_ACTIVE_CH + temp_rx + chn_id];
+
+}
+
+static void tdm_write(u32 client_id, short chn_id, short *pcm_buffer,
+ short len)
+{
+ int i;
+ int phase_tx;
+ u32 txb;
+ /* point to where to start for the current phase data processing */
+ int temp_tx;
+ struct tdm_ctrl *tdm_c = tdm_ctrl[client_id];
+
+ u16 *output_tdm_buffer;
+ output_tdm_buffer = (u16 *)tdm_c->tdm_output_data;
+ txb = in_be32(&tdm_c->ucc_pram->tbptr) -
+ in_be32(&tdm_c->ucc_pram->tbase);
+ phase_tx = txb / sizeof(struct qe_bd);
+
+ if (phase_tx == 0)
+ phase_tx = MAX_PHASE;
+ else
+ phase_tx -= 1;
+
+ temp_tx = phase_tx * SAMPLE_DEPTH * EFF_ACTIVE_CH;
+
+ for (i = 0; i < len; i++)
+ output_tdm_buffer[i * EFF_ACTIVE_CH + temp_tx + chn_id] =
+ pcm_buffer[i];
+
+#ifdef UCC_CACHE_SNOOPING_DISABLED
+ flush_dcache_range((size_t) &output_tdm_buffer[temp_tx],
+ (size_t) &output_tdm_buffer[temp_tx + SAMPLE_DEPTH *
+ ACTIVE_CH]);
+#endif
+}
+
+
+static int tdm_register_client(struct tdm_client *tdm_client)
+{
+ u32 i;
+ if (num_tdm_clients == num_tdm_devices) {
+ printk(KERN_ERR "all TDM devices busy\n");
+ return -EBUSY;
+ }
+
+ for (i = 0; i < num_tdm_devices; i++) {
+ if (!tdm_ctrl[i]->device_busy) {
+ tdm_ctrl[i]->device_busy = 1;
+ break;
+ }
+ }
+ num_tdm_clients++;
+ tdm_client->client_id = i;
+ tdm_client->tdm_read = tdm_read;
+ tdm_client->tdm_write = tdm_write;
+ tdm_client->wakeup_event =
+ &(tdm_ctrl[i]->wakeup_event);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(tdm_register_client);
+
+static int tdm_deregister_client(struct tdm_client *tdm_client)
+{
+ num_tdm_clients--;
+ tdm_ctrl[tdm_client->client_id]->device_busy = 0;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(tdm_deregister_client);
+
+static int ucc_tdm_probe(struct of_device *ofdev,
+ const struct of_device_id *match)
+{
+ struct device_node *np = ofdev->node;
+ struct resource res;
+ const unsigned int *prop;
+ u32 ucc_num, device_num, err, ret = 0;
+ struct device_node *np_tmp;
+ dma_addr_t physaddr;
+ void *tdm_buff;
+ struct ucc_tdm_info *ut_info;
+
+ prop = of_get_property(np, "device-id", NULL);
+ if (prop == NULL) {
+ printk(KERN_ERR "ucc_tdm: device-id missing\n");
+ return -ENODEV;
+ }
+
+ ucc_num = *prop - 1;
+ if ((ucc_num < 0) || (ucc_num > 7))
+ return -ENODEV;
+
+ ut_info = &utdm_info[ucc_num];
+ if (ut_info->ucc_busy) {
+ printk(KERN_ERR "ucc_tdm: UCC in use by another TDM driver"
+ "instance\n");
+ return -EBUSY;
+ }
+ if (num_tdm_devices == MAX_NUM_TDM_DEVICES) {
+ printk(KERN_ERR "ucc_tdm: All TDM devices already"
+ " initialized\n");
+ return -ENODEV;
+ }
+
+ ut_info->ucc_busy = 1;
+ tdm_ctrl[num_tdm_devices++] =
+ kzalloc(sizeof(struct tdm_ctrl), GFP_KERNEL);
+ if (!tdm_ctrl[num_tdm_devices - 1]) {
+ printk(KERN_ERR "ucc_tdm: no memory to allocate for"
+ " tdm control structure\n");
+ num_tdm_devices--;
+ return -ENOMEM;
+ }
+ device_num = num_tdm_devices - 1;
+
+ tdm_ctrl[device_num]->device = &ofdev->dev;
+ tdm_ctrl[device_num]->ut_info = ut_info;
+
+ tdm_ctrl[device_num]->ut_info->uf_info.ucc_num = ucc_num;
+
+ prop = of_get_property(np, "fsl,tdm-num", NULL);
+ if (prop == NULL) {
+ ret = -EINVAL;
+ goto get_property_error;
+ }
+
+ tdm_ctrl[device_num]->tdm_port = *prop - 1;
+
+ if (tdm_ctrl[device_num]->tdm_port > 3) {
+ ret = -EINVAL;
+ goto get_property_error;
+ }
+
+ prop = of_get_property(np, "fsl,si-num", NULL);
+ if (prop == NULL) {
+ ret = -EINVAL;
+ goto get_property_error;
+ }
+
+ tdm_ctrl[device_num]->si = *prop - 1;
+
+ tdm_ctrl[device_num]->ut_info->uf_info.tdm_tx_clk =
+ of_get_property(np, "fsl,tdm-tx-clk", NULL);
+ if (tdm_ctrl[device_num]->ut_info->uf_info.tdm_tx_clk == NULL) {
+ ret = -EINVAL;
+ goto get_property_error;
+ }
+
+ tdm_ctrl[device_num]->ut_info->uf_info.tdm_rx_clk =
+ of_get_property(np, "fsl,tdm-rx-clk", NULL);
+ if (tdm_ctrl[device_num]->ut_info->uf_info.tdm_rx_clk == NULL) {
+ ret = -EINVAL;
+ goto get_property_error;
+ }
+
+ tdm_ctrl[device_num]->ut_info->uf_info.tdm_tx_sync =
+ of_get_property(np, "fsl,tdm-tx-sync", NULL);
+ if (tdm_ctrl[device_num]->ut_info->uf_info.tdm_tx_sync == NULL) {
+ ret = -EINVAL;
+ goto get_property_error;
+ }
+
+ tdm_ctrl[device_num]->ut_info->uf_info.tdm_rx_sync =
+ of_get_property(np, "fsl,tdm-rx-sync", NULL);
+ if (tdm_ctrl[device_num]->ut_info->uf_info.tdm_rx_sync == NULL) {
+ ret = -EINVAL;
+ goto get_property_error;
+ }
+
+ tdm_ctrl[device_num]->ut_info->uf_info.irq =
+ irq_of_parse_and_map(np, 0);
+ err = of_address_to_resource(np, 0, &res);
+ if (err) {
+ ret = -EINVAL;
+ goto get_property_error;
+ }
+ tdm_ctrl[device_num]->ut_info->uf_info.regs = res.start;
+ tdm_ctrl[device_num]->uf_regs = of_iomap(np, 0);
+
+ np_tmp = NULL;
+ np_tmp = of_find_compatible_node(np_tmp, "slic", "legerity-slic");
+ if (np_tmp != NULL) {
+ tdm_ctrl[device_num]->leg_slic = 1;
+ of_node_put(np_tmp);
+ } else
+ tdm_ctrl[device_num]->leg_slic = 0;
+
+ config_tdm(tdm_ctrl[device_num]);
+
+ tdm_buff = dma_alloc_coherent(NULL, 2 * NR_BUFS * SAMPLE_DEPTH *
+ tdm_ctrl[device_num]->cfg_ctrl.active_num_ts,
+ &physaddr, GFP_KERNEL);
+ if (!tdm_buff) {
+ printk(KERN_ERR "ucc-tdm: could not allocate buffer"
+ "descriptors\n");
+ ret = -ENOMEM;
+ goto alloc_error;
+ }
+
+ tdm_ctrl[device_num]->tdm_input_data = tdm_buff;
+ tdm_ctrl[device_num]->dma_input_addr = physaddr;
+
+ tdm_ctrl[device_num]->tdm_output_data = tdm_buff + NR_BUFS *
+ SAMPLE_DEPTH * tdm_ctrl[device_num]->cfg_ctrl.active_num_ts;
+ tdm_ctrl[device_num]->dma_output_addr = physaddr + NR_BUFS *
+ SAMPLE_DEPTH * tdm_ctrl[device_num]->cfg_ctrl.active_num_ts;
+
+ init_waitqueue_head(&(tdm_ctrl[device_num]->wakeup_event));
+
+ ret = tdm_init(tdm_ctrl[device_num]);
+ if (ret != 0)
+ goto tdm_init_error;
+
+ ret = tdm_start(tdm_ctrl[device_num]);
+ if (ret != 0)
+ goto tdm_start_error;
+
+ dev_set_drvdata(&(ofdev->dev), tdm_ctrl[device_num]);
+
+ pr_info("%s UCC based tdm module installed\n", __FUNCTION__);
+ return 0;
+
+tdm_start_error:
+ tdm_deinit(tdm_ctrl[device_num]);
+tdm_init_error:
+ dma_free_coherent(NULL, 2 * NR_BUFS * SAMPLE_DEPTH *
+ tdm_ctrl[device_num]->cfg_ctrl.active_num_ts,
+ tdm_ctrl[device_num]->tdm_input_data,
+ tdm_ctrl[device_num]->dma_input_addr);
+
+alloc_error:
+ irq_dispose_mapping(tdm_ctrl[device_num]->ut_info->uf_info.irq);
+ iounmap(tdm_ctrl[device_num]->uf_regs);
+
+get_property_error:
+ num_tdm_devices--;
+ kfree(tdm_ctrl[device_num]);
+ ut_info->ucc_busy = 0;
+ return ret;
+}
+
+static int ucc_tdm_remove(struct of_device *ofdev)
+{
+ struct tdm_ctrl *tdm_c;
+ struct ucc_tdm_info *ut_info;
+ u32 ucc_num;
+
+ tdm_c = dev_get_drvdata(&(ofdev->dev));
+ dev_set_drvdata(&(ofdev->dev), NULL);
+ ucc_num = tdm_c->ut_info->uf_info.ucc_num;
+ ut_info = &utdm_info[ucc_num];
+ tdm_stop(tdm_c);
+ tdm_deinit(tdm_c);
+
+ ucc_fast_free(tdm_c->uf_private);
+
+ dma_free_coherent(NULL, 2 * NR_BUFS * SAMPLE_DEPTH *
+ tdm_c->cfg_ctrl.active_num_ts,
+ tdm_c->tdm_input_data,
+ tdm_c->dma_input_addr);
+
+ irq_dispose_mapping(tdm_c->ut_info->uf_info.irq);
+ iounmap(tdm_c->uf_regs);
+
+ num_tdm_devices--;
+ kfree(tdm_c);
+
+ ut_info->ucc_busy = 0;
+
+ pr_info("%s UCC based tdm module uninstalled\n", __FUNCTION__);
+ return 0;
+}
+
+const struct of_device_id ucc_tdm_match[] = {
+ { .type = "tdm", .compatible = "fsl,ucc-tdm", },
+ {},
+};
+
+MODULE_DEVICE_TABLE(of, ucc_tdm_match);
+
+static struct of_platform_driver ucc_tdm_driver = {
+ .name = DRV_NAME,
+ .match_table = ucc_tdm_match,
+ .probe = ucc_tdm_probe,
+ .remove = ucc_tdm_remove,
+ .driver = {
+ .name = DRV_NAME,
+ .owner = THIS_MODULE,
+ },
+};
+
+static int __init ucc_tdm_init(void)
+{
+ u32 i;
+
+ pr_info("ucc_tdm: " DRV_DESC "\n");
+ for (i = 0; i < 8; i++)
+ memcpy(&(utdm_info[i]), &utdm_primary_info,
+ sizeof(utdm_primary_info));
+
+ return of_register_platform_driver(&ucc_tdm_driver);
+}
+
+static void __exit ucc_tdm_exit(void)
+{
+ of_unregister_platform_driver(&ucc_tdm_driver);
+}
+
+module_init(ucc_tdm_init);
+module_exit(ucc_tdm_exit);
+MODULE_AUTHOR("Freescale Semiconductor, Inc");
+MODULE_DESCRIPTION(DRV_DESC);
+MODULE_LICENSE("GPL");
diff --git a/drivers/misc/ucc_tdm.h b/drivers/misc/ucc_tdm.h
new file mode 100644
index 0000000..eaf2848
--- /dev/null
+++ b/drivers/misc/ucc_tdm.h
@@ -0,0 +1,221 @@
+/*
+ * drivers/misc/ucc_tdm.h
+ *
+ * UCC Based Linux TDM Driver
+ * This driver is designed to support UCC based TDM for PowerPC processors.
+ * This driver can interface with SLIC device to run VOIP kind of
+ * applications.
+ *
+ * Author: Ashish Kalra & Poonam Aggrwal
+ *
+ * Copyright (c) 2007 Freescale Semiconductor, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ */
+
+#ifndef TDM_H
+#define TDM_H
+
+#define NUM_TS 8
+#define ACTIVE_CH 8
+
+/* SAMPLE_DEPTH is the sample depth is the number of frames before
+ * an interrupt. Must be a multiple of 4
+ */
+#define SAMPLE_DEPTH 80
+
+/* define the number of Rx interrupts to go by for initial stuttering */
+#define STUTTER_INT_CNT 1
+
+/* BMRx Field Descriptions to specify tstate and rstate in UCC parameter RAM*/
+#define EN_BUS_SNOOPING 0x20
+#define BE_BO 0x10
+
+/* UPSMR Register for Transparent UCC controller Bit definitions*/
+#define NBO 0x00000000 /* Normal Mode 1 bit of data per clock */
+
+/* SI Mode register bit definitions */
+#define NORMAL_OPERATION 0x0000
+#define AUTO_ECHO 0x0400
+#define INTERNAL_LB 0x0800
+#define CONTROL_LB 0x0c00
+#define SIMODE_CRT (0x8000 >> 9)
+#define SIMODE_SL (0x8000 >> 10)
+#define SIMODE_CE (0x8000 >> 11)
+#define SIMODE_FE (0x8000 >> 12)
+#define SIMODE_GM (0x8000 >> 13)
+#define SIMODE_TFSD(val) (val)
+#define SIMODE_RFSD(val) ((val) << 8)
+
+#define SI_TDM_MODE_REGISTER_OFFSET 0
+
+#define R_CM 0x02000000
+#define T_CM 0x02000000
+
+#define SET_RX_SI_RAM(n, val) \
+ out_be16((u16 *)&qe_immr->sir.rx[(n)*2], (u16)(val))
+
+#define SET_TX_SI_RAM(n, val) \
+ out_be16((u16 *)&qe_immr->sir.tx[(n)*2], (u16)(val))
+
+/* SI RAM entries */
+#define SIR_LAST 0x0001
+#define SIR_CNT(n) ((n) << 2)
+#define SIR_BYTE 0x0002
+#define SIR_BIT 0x0000
+#define SIR_IDLE 0
+#define SIR_UCC(uccx) (((uccx+9)) << 5)
+
+/* BRGC Register Bit definitions */
+#define BRGC_RESET (0x1<<17)
+#define BRGC_EN (0x1<<16)
+#define BRGC_EXTC_QE (0x00<<14)
+#define BRGC_EXTC_CLK3 (0x01<<14)
+#define BRGC_EXTC_CLK5 (0x01<<15)
+#define BRGC_EXTC_CLK9 (0x01<<14)
+#define BRGC_EXTC_CLK11 (0x01<<14)
+#define BRGC_EXTC_CLK13 (0x01<<14)
+#define BRGC_EXTC_CLK15 (0x01<<15)
+#define BRGC_ATB (0x1<<13)
+#define BRGC_DIV16 (0x1)
+
+/* structure representing UCC transparent parameter RAM */
+struct ucc_transparent_pram {
+ __be16 riptr;
+ __be16 tiptr;
+ __be16 res0;
+ __be16 mrblr;
+ __be32 rstate;
+ __be32 rbase;
+ __be16 rbdstat;
+ __be16 rbdlen;
+ __be32 rdptr;
+ __be32 tstate;
+ __be32 tbase;
+ __be16 tbdstat;
+ __be16 tbdlen;
+ __be32 tdptr;
+ __be32 rbptr;
+ __be32 tbptr;
+ __be32 rcrc;
+ __be32 res1;
+ __be32 tcrc;
+ __be32 res2;
+ __be32 res3;
+ __be32 c_mask;
+ __be32 c_pres;
+ __be16 disfc;
+ __be16 crcec;
+ __be32 res4[4];
+ __be16 ts_tmp;
+ __be16 tmp_mb;
+};
+
+#define UCC_TRANSPARENT_PRAM_SIZE 0x100
+
+struct tdm_cfg {
+ u8 com_pin; /* Common receive and transmit pins
+ * 0 = separate pins
+ * 1 = common pins
+ */
+
+ u8 fr_sync_level; /* SLx bit Frame Sync Polarity
+ * 0 = L1R/TSYNC active logic "1"
+ * 1 = L1R/TSYNC active logic "0"
+ */
+
+ u8 clk_edge; /* CEx bit Tx Rx Clock Edge
+ * 0 = TX data on rising edge of clock
+ * RX data on falling edge
+ * 1 = TX data on falling edge of clock
+ * RX data on rising edge
+ */
+
+ u8 fr_sync_edge; /* FEx bit Frame sync edge
+ * Determine when the sync pulses are sampled
+ * 0 = Falling edge
+ * 1 = Rising edge
+ */
+
+ u8 rx_fr_sync_delay; /* TFSDx/RFSDx bits Frame Sync Delay
+ * 00 = no bit delay
+ * 01 = 1 bit delay
+ * 10 = 2 bit delay
+ * 11 = 3 bit delay
+ */
+
+ u8 tx_fr_sync_delay; /* TFSDx/RFSDx bits Frame Sync Delay
+ * 00 = no bit delay
+ * 01 = 1 bit delay
+ * 10 = 2 bit delay
+ * 11 = 3 bit delay
+ */
+
+ u8 active_num_ts; /* Number of active time slots in TDM
+ * assume same active Rx/Tx time slots
+ */
+};
+
+struct ucc_tdm_info {
+ struct ucc_fast_info uf_info;
+ u32 ucc_busy;
+};
+
+struct tdm_ctrl {
+ u32 device_busy;
+ struct device *device;
+ struct ucc_fast_private *uf_private;
+ struct ucc_tdm_info *ut_info;
+ u32 tdm_port; /* port for this tdm:TDMA,TDMB,TDMC,TDMD */
+ u32 si; /* serial interface: 0 or 1 */
+ struct ucc_fast __iomem *uf_regs; /* UCC Fast registers */
+ u16 rx_mask[8]; /* Active Receive channels LSB is ch0 */
+ u16 tx_mask[8]; /* Active Transmit channels LSB is ch0 */
+ /* Only channels less than the number of FRAME_SIZE are implemented */
+ struct tdm_cfg cfg_ctrl; /* Signaling controls configuration */
+ u8 *tdm_input_data; /* buffer used for Rx by the tdm */
+ u8 *tdm_output_data; /* buffer used for Tx by the tdm */
+
+ dma_addr_t dma_input_addr; /* dma mapped buffer for TDM Rx */
+ dma_addr_t dma_output_addr; /* dma mapped buffer for TDM Tx */
+ u16 physical_num_ts; /* physical number of timeslots in the tdm
+ frame */
+ u32 phase_rx; /* cycles through 0, 1, 2 */
+ u32 phase_tx; /* cycles through 0, 1, 2 */
+ /*
+ * the following two variables are for dealing with "stutter" problem
+ * "stutter" period is about 20 frames or so, varies depending active
+ * channel num depending on the sample depth, the code should let a
+ * few Rx interrupts go by
+ */
+ u32 tdm_icnt;
+ u32 tdm_flag;
+ struct ucc_transparent_pram __iomem *ucc_pram;
+ struct qe_bd __iomem *tx_bd;
+ struct qe_bd __iomem *rx_bd;
+ u32 ucc_pram_offset;
+ u32 tx_bd_offset;
+ u32 rx_bd_offset;
+ u32 rx_ucode_buf_offset;
+ u32 tx_ucode_buf_offset;
+ bool leg_slic;
+ wait_queue_head_t wakeup_event;
+};
+
+struct tdm_client {
+ u32 client_id;
+ void (*tdm_read)(u32 client_id, short chn_id,
+ short *pcm_buffer, short len);
+ void (*tdm_write)(u32 client_id, short chn_id,
+ short *pcm_buffer, short len);
+ wait_queue_head_t *wakeup_event;
+ };
+
+#define MAX_PHASE 1
+#define NR_BUFS 2
+#define EFF_ACTIVE_CH ACTIVE_CH / 2
+
+#endif
--
1.5.2.4
^ permalink raw reply related
* [PATCH 1/1][NETNS] Add missing initialization of nl_info.nl_net in rtm_to_fib6_config()
From: Benjamin Thery @ 2008-01-24 10:32 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Denis V. Lunev, Daniel Lezcano, Benjamin Thery
In-Reply-To: <20080124103220.265806220@theryb.frec.bull.fr>
[-- Attachment #1: route6-finalize-add-netns-to-nl_info-structure.patch --]
[-- Type: text/plain, Size: 730 bytes --]
Add missing initialization of the new nl_info.nl_net field in
rtm_to_fib6_config(). This will be needed the store network namespace
associated to the fib6_config struct.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
---
net/ipv6/route.c | 1 +
1 file changed, 1 insertion(+)
Index: net-2.6.25/net/ipv6/route.c
===================================================================
--- net-2.6.25.orig/net/ipv6/route.c
+++ net-2.6.25/net/ipv6/route.c
@@ -1955,6 +1955,7 @@ static int rtm_to_fib6_config(struct sk_
cfg->fc_nlinfo.pid = NETLINK_CB(skb).pid;
cfg->fc_nlinfo.nlh = nlh;
+ cfg->fc_nlinfo.nl_net = skb->sk->sk_net;
if (tb[RTA_GATEWAY]) {
nla_memcpy(&cfg->fc_gateway, tb[RTA_GATEWAY], 16);
--
^ permalink raw reply
* [PATCH UCC TDM 1/3 Updated] Platform changes for UCC TDM driver for MPC8323eRDB. Also includes related QE changes and dts entries.
From: Poonam_Aggrwal-b10812 @ 2008-01-24 10:30 UTC (permalink / raw)
To: kumar.gala, akpm, linux-kernel, netdev, rubini, linuxppc-dev
Cc: michael.barkowski, kim.phillips, ashish.kalra, timur, rich.cutler,
b10812
Thanks Stephen for your comments, incorporated them.
From: Poonam Aggrwal <b10812@freescale.com>
This patch makes necessary changes in the QE and UCC framework to support
TDM. It also adds support to configure the BRG properly through device
tree entries. Includes the device tree changes for UCC TDM driver as well.
It also includes device tree entries for UCC TDM driver.
Tested on MPC8323ERDB platform.
Signed-off-by: Poonam Aggrwal <b10812@freescale.com>
Signed-off-by: Ashish Kalra <ashish.kalra@freescale.com>
Signed-off-by: Kim Phillips <Kim.Phillips@freescale.com>
Signed-off-by: Michael Barkowski <michael.barkowski@freescale.com>
---
arch/powerpc/boot/dts/mpc832x_rdb.dts | 58 +++++++
arch/powerpc/sysdev/qe_lib/qe.c | 184 +++++++++++++++++++++--
arch/powerpc/sysdev/qe_lib/ucc.c | 265 +++++++++++++++++++++++++++++++++
arch/powerpc/sysdev/qe_lib/ucc_fast.c | 37 +++++
include/asm-powerpc/qe.h | 8 +
include/asm-powerpc/ucc.h | 4 +
include/asm-powerpc/ucc_fast.h | 4 +
7 files changed, 548 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/boot/dts/mpc832x_rdb.dts b/arch/powerpc/boot/dts/mpc832x_rdb.dts
index 388c8a7..c0e6283 100644
--- a/arch/powerpc/boot/dts/mpc832x_rdb.dts
+++ b/arch/powerpc/boot/dts/mpc832x_rdb.dts
@@ -105,6 +105,17 @@
device_type = "par_io";
num-ports = <7>;
+ ucc1pio:ucc_pin@01 {
+ pio-map = <
+ /* port pin dir open_drain assignment has_irq */
+ 0 e 2 0 1 0 /* CLK11 */
+ 3 16 1 0 2 0 /* BRG9 */
+ 3 1b 1 0 2 0 /* BRG3 */
+ 0 0 3 0 2 0 /* TDMATxD0 */
+ 0 4 3 0 2 0 /* TDMARxD0 */
+ 3 1b 2 0 1 0>; /* CLK1 */
+ };
+
ucc2pio:ucc_pin@02 {
pio-map = <
/* port pin dir open_drain assignment has_irq */
@@ -169,6 +180,36 @@
};
};
+ clocks {
+ compatible = "fsl,cpm-clocks";
+ /* clock freqs in Hz(for CLK1~CLK24).
+ * CLK11 is 1024KHz,
+ * all other clocks unused
+ * #clock-cells define number of cells
+ * used by the clock-frequency.
+ * right now only #clock cells=1 is
+ * implemented. Provision is there to
+ * handle frequencies >4Gig
+ */
+ #clock-cells = <1>;
+ clock-frequency = <0 0 0 0 0 0
+ 0 0 0 0 d#1024000 0
+ 0 0 0 0 0 0
+ 0 0 0 0 0 0>;
+ };
+
+ brg@640 {
+ compatible = "fsl,cpm-brg";
+ /* input clock sources for all the 16 BRGs.
+ * 1-24 for CLK1 to CLK24.
+ * BRG9 uses CLK11,BRG1 and BRG2-8 use
+ * the QE clock.
+ */
+ fsl,brg-sources = <0 0 0 0 0 0 0 0
+ b 0 0 0 0 0 0 0>;
+ reg = <640 7f>;
+ };
+
spi@4c0 {
device_type = "spi";
compatible = "fsl_spi";
@@ -187,6 +228,23 @@
mode = "cpu";
};
+ ucc@2000 {
+ device_type = "tdm";
+ compatible = "fsl,ucc-tdm";
+ model = "UCC";
+ device-id = <1>;
+ fsl,tdm-num = <1>;
+ fsl,si-num = <1>;
+ fsl,tdm-tx-clk = "CLK1";
+ fsl,tdm-rx-clk = "CLK1";
+ fsl,tdm-tx-sync = "BRG9";
+ fsl,tdm-rx-sync = "BRG9";
+ reg = <2000 200>;
+ interrupts = <20>;
+ interrupt-parent = <&qeic>;
+ pio-handle = <&ucc1pio>;
+ };
+
ucc@3000 {
device_type = "network";
compatible = "ucc_geth";
diff --git a/arch/powerpc/sysdev/qe_lib/qe.c b/arch/powerpc/sysdev/qe_lib/qe.c
index 1df3b4a..9b9971d 100644
--- a/arch/powerpc/sysdev/qe_lib/qe.c
+++ b/arch/powerpc/sysdev/qe_lib/qe.c
@@ -149,20 +149,170 @@ EXPORT_SYMBOL(qe_issue_cmd);
*/
static unsigned int brg_clk = 0;
-unsigned int get_brg_clk(void)
+u32 get_brg_clk(enum qe_clock brgclk, enum qe_clock *brg_source)
{
- struct device_node *qe;
- if (brg_clk)
- return brg_clk;
+ struct device_node *qe, *brg, *clocks;
+ enum qe_clock brg_src;
+ u32 brg_input_freq = 0;
+ u32 brg_num;
+ int ret;
+ const unsigned int *prop;
- qe = of_find_node_by_type(NULL, "qe");
- if (qe) {
+ *brg_source = 0;
+
+ brg_num = brgclk - QE_BRG1;
+ brg = of_find_compatible_node(NULL, NULL, "fsl,cpm-brg");
+ if (!brg) {
+ printk(KERN_ERR "%s: no brg node in device tree\n",
+ __FUNCTION__);
+ return -EINVAL;
+ }
+ unsigned int size;
+ prop = of_get_property(brg, "fsl,brg-sources", &size);
+ of_node_put(brg);
+
+ if (!prop) {
+ printk(KERN_ERR "%s: invalid fsl,brg-sources in device tree \n"
+ , __FUNCTION__);
+ return -EINVAL;
+ }
+ brg_src = *(prop + brg_num);
+ if (brg_src == 0) {
+ *brg_source = 0;
+ if (brg_clk > 0)
+ return brg_clk;
+ qe = of_find_node_by_type(NULL, "qe");
+ if (!qe) {
+ printk(KERN_ERR "%s: no qe node in device tree \n"
+ , __FUNCTION__);
+ ret = EINVAL;
+ goto err;
+ }
unsigned int size;
- const u32 *prop = of_get_property(qe, "brg-frequency", &size);
- brg_clk = *prop;
+ prop = of_get_property(qe, "brg-frequency", &size);
+ if (!prop) {
+ printk(KERN_ERR "%s: QE brg-frequency"
+ "not present in device tree\n", __FUNCTION__);
+ ret = -EINVAL;
+ of_node_put(qe);
+ goto err;
+ }
+ if (*prop) {
+ of_node_put(qe);
+ brg_clk = *prop;
+ return *prop;
+ }
+ /*
+ * Older versions of U-Boot do not initialize
+ * the brg-frequency property, so in this case
+ * we assume the BRG frequency is half the QE
+ * bus frequency.
+ */
+ prop = of_get_property(qe, "bus-frequency", NULL);
of_node_put(qe);
- };
- return brg_clk;
+ if (!prop) {
+ printk(KERN_ERR "%s:QE bus-frequency not"
+ " present in device tree\n", __FUNCTION__);
+ ret = -EINVAL;
+ goto err;
+ }
+ if (*prop) {
+ brg_clk = *prop / 2;
+ return brg_clk;
+ }
+ printk(KERN_ERR "%s: invalid QE bus-frequency in device tree\n",
+ __FUNCTION__);
+ ret = -EINVAL;
+ goto err;
+ } else {
+ *brg_source = brg_src + QE_CLK1 - 1;
+ clocks = of_find_compatible_node(NULL, NULL, "fsl,cpm-clocks");
+ if (!clocks) {
+ printk(KERN_ERR "%s: no clocks node in device tree \n",
+ __FUNCTION__);
+ ret = -EINVAL;
+ goto err;
+ }
+ prop = of_get_property(clocks, "#clock-cells", &size);
+ /*
+ * clock-cells = 1 only supported right now.
+ */
+ if (!prop || *prop != 1) {
+ printk(KERN_ERR "%s: invalid #clock-cells value"
+ "in device tree \n", __FUNCTION__);
+ of_node_put(clocks);
+ ret = -EINVAL;
+ goto err;
+ }
+
+ prop = of_get_property(clocks, "clock-frequency", &size);
+ if (!prop) {
+ printk(KERN_ERR "%s:no #clock-frequency prop in"
+ "device tree\n", __FUNCTION__);
+ of_node_put(clocks);
+ ret = -EINVAL;
+ goto err;
+ }
+ brg_input_freq = *(prop+(brg_src - 1));
+ of_node_put(clocks);
+ return brg_input_freq;
+ }
+err:
+ return ret;
+}
+
+u32 qe_brg_src(int brg_num, enum qe_clock brg_src)
+{
+ u32 clock_bits, shift;
+
+ clock_bits = 0;
+
+ switch (brg_num) {
+ case 1:
+ case 2:
+ case 5:
+ case 6:
+ switch (brg_src) {
+ case QE_CLK3: clock_bits = 1; break;
+ case QE_CLK5: clock_bits = 2; break;
+ default: break;
+ }
+ break;
+ case 3:
+ case 4:
+ case 7:
+ case 8:
+ switch (brg_src) {
+ case QE_CLK9: clock_bits = 1; break;
+ case QE_CLK15: clock_bits = 2; break;
+ default: break;
+ }
+ break;
+ case 9:
+ case 10:
+ switch (brg_src) {
+ case QE_CLK11: clock_bits = 1; break;
+ default: break;
+ }
+ break;
+ case 11:
+ case 15:
+ case 16:
+ switch (brg_src) {
+ case QE_CLK13: clock_bits = 1; break;
+ default: break;
+ }
+ break;
+ default: clock_bits = 0; break;
+ }
+ shift = 14;
+
+ if (!clock_bits)
+ return -ENOENT;
+
+ clock_bits <<= shift;
+
+ return clock_bits;
}
/* Program the BRG to the given sampling rate and multiplier
@@ -177,11 +327,18 @@ int qe_setbrg(enum qe_clock brg, unsigned int rate, unsigned int multiplier)
{
u32 divisor, tempval;
u32 div16 = 0;
+ u32 brg_clock;
+ enum qe_clock brgsrc;
+ u32 src_bits = 0;
if ((brg < QE_BRG1) || (brg > QE_BRG16))
return -EINVAL;
- divisor = get_brg_clk() / (rate * multiplier);
+ brg_clock = get_brg_clk(brg, &brgsrc);
+ if (brg_clock < 0)
+ return -EINVAL;
+
+ divisor = brg_clock / (rate * multiplier);
if (divisor > QE_BRGC_DIVISOR_MAX + 1) {
div16 = QE_BRGC_DIV16;
@@ -194,8 +351,11 @@ int qe_setbrg(enum qe_clock brg, unsigned int rate, unsigned int multiplier)
if (!div16 && (divisor & 1))
divisor++;
+ if (brgsrc > 0)
+ src_bits = qe_brg_src(brg - QE_BRG1 + 1, brgsrc);
+
tempval = ((divisor - 1) << QE_BRGC_DIVISOR_SHIFT) |
- QE_BRGC_ENABLE | div16;
+ QE_BRGC_ENABLE | div16 | src_bits;
out_be32(&qe_immr->brg.brgc[brg - QE_BRG1], tempval);
diff --git a/arch/powerpc/sysdev/qe_lib/ucc.c b/arch/powerpc/sysdev/qe_lib/ucc.c
index 0e348d9..f2de0ed 100644
--- a/arch/powerpc/sysdev/qe_lib/ucc.c
+++ b/arch/powerpc/sysdev/qe_lib/ucc.c
@@ -213,3 +213,268 @@ int ucc_set_qe_mux_rxtx(unsigned int ucc_num, enum qe_clock clock,
return 0;
}
+
+int ucc_set_tdm_rxtx_clk(int tdm_num, char *clk_src, enum comm_dir mode)
+{
+ enum qe_clock clock;
+ u32 clock_bits, shift;
+ struct qe_mux *qe_mux_reg = NULL;
+
+ clock_bits = 0;
+ qe_mux_reg = &qe_immr->qmx;
+
+ if ((tdm_num > 3 || tdm_num < 0))
+ return -EINVAL;
+
+ /* The communications direction must be RX or TX */
+ if (!((mode == COMM_DIR_RX) || (mode == COMM_DIR_TX)))
+ return -EINVAL;
+
+ clock = qe_clock_source(clk_src);
+ switch (mode) {
+ case COMM_DIR_RX:
+ switch (tdm_num) {
+ case 0:
+ switch (clock) {
+ case QE_BRG3: clock_bits = 1; break;
+ case QE_BRG4: clock_bits = 2; break;
+ case QE_CLK1: clock_bits = 4; break;
+ case QE_CLK2: clock_bits = 5; break;
+ case QE_CLK3: clock_bits = 6; break;
+ case QE_CLK8: clock_bits = 7; break;
+ default: break;
+ }
+ shift = 28;
+ break;
+ case 1:
+ switch (clock) {
+ case QE_BRG3: clock_bits = 1; break;
+ case QE_BRG4: clock_bits = 2; break;
+ case QE_CLK1: clock_bits = 4; break;
+ case QE_CLK2: clock_bits = 5; break;
+ case QE_CLK5: clock_bits = 6; break;
+ case QE_CLK10: clock_bits = 7; break;
+ default: break;
+ }
+ shift = 24;
+ break;
+ case 2:
+ switch (clock) {
+ case QE_BRG3: clock_bits = 1; break;
+ case QE_BRG4: clock_bits = 2; break;
+ case QE_CLK1: clock_bits = 4; break;
+ case QE_CLK2: clock_bits = 5; break;
+ case QE_CLK7: clock_bits = 6; break;
+ case QE_CLK12: clock_bits = 7; break;
+ default: break;
+ }
+ shift = 20;
+ break;
+ case 3:
+ switch (clock) {
+ case QE_BRG3: clock_bits = 1; break;
+ case QE_BRG4: clock_bits = 2; break;
+ case QE_CLK1: clock_bits = 4; break;
+ case QE_CLK2: clock_bits = 5; break;
+ case QE_CLK9: clock_bits = 6; break;
+ case QE_CLK14: clock_bits = 7; break;
+ default: break;
+ }
+ shift = 16;
+ break;
+ default:
+ break;
+ }
+ break;
+ case COMM_DIR_TX:
+ switch (tdm_num) {
+ case 0:
+ switch (clock) {
+ case QE_BRG3: clock_bits = 1; break;
+ case QE_BRG4: clock_bits = 2; break;
+ case QE_CLK1: clock_bits = 4; break;
+ case QE_CLK2: clock_bits = 5; break;
+ case QE_CLK4: clock_bits = 6; break;
+ case QE_CLK9: clock_bits = 7; break;
+ default: break;
+ }
+ shift = 12;
+ break;
+ case 1:
+ switch (clock) {
+ case QE_BRG3: clock_bits = 1; break;
+ case QE_BRG4: clock_bits = 2; break;
+ case QE_CLK1: clock_bits = 4; break;
+ case QE_CLK2: clock_bits = 5; break;
+ case QE_CLK6: clock_bits = 6; break;
+ case QE_CLK11: clock_bits = 7; break;
+ default: break;
+ }
+ shift = 8;
+ break;
+ case 2:
+ switch (clock) {
+ case QE_BRG3: clock_bits = 1; break;
+ case QE_BRG4: clock_bits = 2; break;
+ case QE_CLK1: clock_bits = 4; break;
+ case QE_CLK2: clock_bits = 5; break;
+ case QE_CLK8: clock_bits = 6; break;
+ case QE_CLK13: clock_bits = 7; break;
+ default: break;
+ }
+ shift = 4;
+ break;
+ case 3:
+ switch (clock) {
+ case QE_BRG3: clock_bits = 1; break;
+ case QE_BRG4: clock_bits = 2; break;
+ case QE_CLK1: clock_bits = 4; break;
+ case QE_CLK2: clock_bits = 5; break;
+ case QE_CLK10: clock_bits = 6; break;
+ case QE_CLK15: clock_bits = 7; break;
+ default: break;
+ }
+ shift = 0;
+ break;
+ default:
+ break;
+ }
+ break;
+ default:
+ break;
+ }
+
+ if (!clock_bits)
+ return -ENOENT;
+
+ clock_bits <<= shift;
+
+ qe_mux_reg->cmxsi1cr_l |= clock_bits;
+
+ return 0;
+}
+
+int ucc_set_tdm_rxtx_sync(int tdm_num, char *sync_src, enum comm_dir mode)
+{
+ enum qe_clock clock;
+ u32 shift, clock_bits;
+ struct qe_mux *qe_mux_reg = NULL;
+ int source;
+
+ source = -1;
+ qe_mux_reg = &qe_immr->qmx;
+
+ if ((tdm_num > 3 || tdm_num < 0))
+ return -EINVAL;
+
+ /* The communications direction must be RX or TX */
+ if (!((mode == COMM_DIR_RX) || (mode == COMM_DIR_TX)))
+ return -EINVAL;
+
+ switch (mode) {
+ case COMM_DIR_RX:
+ if (strcasecmp("RSYNC", sync_src) == 0) {
+ source = 0;
+ shift = 0;
+ break;
+ }
+ clock = qe_clock_source(sync_src);
+ switch (tdm_num) {
+ case 0:
+ switch (clock) {
+ case QE_BRG9: source = 1; break;
+ case QE_BRG10: source = 2; break;
+ default: source = -1; break;
+ }
+ shift = 30;
+ break;
+ case 1:
+ switch (clock) {
+ case QE_BRG9: source = 1; break;
+ case QE_BRG10: source = 2; break;
+ default: source = -1; break;
+ }
+ shift = 28;
+ break;
+ case 2:
+ switch (clock) {
+ case QE_BRG9: source = 1; break;
+ case QE_BRG11: source = 2; break;
+ default: source = -1; break;
+ }
+ shift = 26;
+ break;
+ case 3:
+ switch (clock) {
+ case QE_BRG9: source = 1; break;
+ case QE_BRG11: source = 2; break;
+ default: source = -1; break;
+ }
+ shift = 24;
+ break;
+ default:
+ source = -1;
+ break;
+ }
+ break;
+ case COMM_DIR_TX:
+ if (strcasecmp("TSYNC", sync_src) == 0) {
+ source = 0;
+ shift = 0;
+ break;
+ }
+ clock = qe_clock_source(sync_src);
+ switch (tdm_num) {
+ case 0:
+ switch (clock) {
+ case QE_BRG9: source = 1; break;
+ case QE_BRG10: source = 2; break;
+ default: source = -1; break;
+ }
+ shift = 14;
+ break;
+ case 1:
+ switch (clock) {
+ case QE_BRG9: source = 1; break;
+ case QE_BRG10: source = 2; break;
+ default: source = -1; break;
+ }
+ shift = 12;
+ break;
+ case 2:
+ switch (clock) {
+ case QE_BRG9: source = 1; break;
+ case QE_BRG11: source = 2; break;
+ default: source = -1; break;
+ }
+ shift = 10;
+ break;
+ case 3:
+ switch (clock) {
+ case QE_BRG9: source = 1; break;
+ case QE_BRG11: source = 2; break;
+ default: source = -1; break;
+ }
+ shift = 8;
+ break;
+ default:
+ source = -1;
+ break;
+ }
+ break;
+ default:
+ source = -1;
+ break;
+ }
+
+ if (source == -1)
+ return -ENOENT;
+
+ clock_bits = (u32) source;
+ clock_bits <<= shift;
+
+
+ qe_mux_reg->cmxsi1syr |= clock_bits;
+
+ return 0;
+}
diff --git a/arch/powerpc/sysdev/qe_lib/ucc_fast.c b/arch/powerpc/sysdev/qe_lib/ucc_fast.c
index 3223acb..9c8559f 100644
--- a/arch/powerpc/sysdev/qe_lib/ucc_fast.c
+++ b/arch/powerpc/sysdev/qe_lib/ucc_fast.c
@@ -327,6 +327,43 @@ int ucc_fast_init(struct ucc_fast_info * uf_info, struct ucc_fast_private ** ucc
ucc_fast_free(uccf);
return -EINVAL;
}
+ } else {
+ /* TDM Rx clock routing */
+ if ((uf_info->tdm_rx_clk != NULL) &&
+ ucc_set_tdm_rxtx_clk(uf_info->ucc_num,
+ uf_info->tdm_rx_clk, COMM_DIR_RX)) {
+ printk(KERN_ERR "%s: illegal value for TDM RX clock",
+ __FUNCTION__);
+ ucc_fast_free(uccf);
+ return -EINVAL;
+ }
+ /* TDM Tx clock routing */
+ if ((uf_info->tdm_tx_clk != NULL) &&
+ ucc_set_tdm_rxtx_clk(uf_info->ucc_num,
+ uf_info->tdm_tx_clk, COMM_DIR_TX)) {
+ printk(KERN_ERR "%s: illegal value for TDM TX clock",
+ __FUNCTION__);
+ ucc_fast_free(uccf);
+ return -EINVAL;
+ }
+ /* TDM Rx sync routing */
+ if ((uf_info->tdm_rx_sync != NULL) &&
+ ucc_set_tdm_rxtx_sync(uf_info->ucc_num,
+ uf_info->tdm_rx_sync, COMM_DIR_RX)) {
+ printk(KERN_ERR "%s: illegal value for TDM RX"
+ "Frame sync", __FUNCTION__);
+ ucc_fast_free(uccf);
+ return -EINVAL;
+ }
+ /* TDM Tx sync routing */
+ if ((uf_info->tdm_tx_sync != NULL) &&
+ ucc_set_tdm_rxtx_sync(uf_info->ucc_num,
+ uf_info->tdm_tx_sync, COMM_DIR_TX)) {
+ printk(KERN_ERR "%s: illegal value for TDM TX"
+ "Frame sync", __FUNCTION__);
+ ucc_fast_free(uccf);
+ return -EINVAL;
+ }
}
/* Set interrupt mask register at UCC level. */
diff --git a/include/asm-powerpc/qe.h b/include/asm-powerpc/qe.h
index bcf60be..51de236 100644
--- a/include/asm-powerpc/qe.h
+++ b/include/asm-powerpc/qe.h
@@ -497,6 +497,14 @@ struct ucc_slow_pram {
#define UCC_GETH_UCCE_RXF1 0x00000002
#define UCC_GETH_UCCE_RXF0 0x00000001
+/* Transparent UCC Event Register (UCCE) */
+#define UCC_TRANS_UCCE_GRA 0x0080
+#define UCC_TRANS_UCCE_TXE 0x0010
+#define UCC_TRANS_UCCE_RXF 0x0008
+#define UCC_TRANS_UCCE_BSY 0x0004
+#define UCC_TRANS_UCCE_TXB 0x0002
+#define UCC_TRANS_UCCE_RXB 0x0001
+
/* UPSMR, when used as a UART */
#define UCC_UART_UPSMR_FLC 0x8000
#define UCC_UART_UPSMR_SL 0x4000
diff --git a/include/asm-powerpc/ucc.h b/include/asm-powerpc/ucc.h
index 46b09ba..153db97 100644
--- a/include/asm-powerpc/ucc.h
+++ b/include/asm-powerpc/ucc.h
@@ -42,6 +42,10 @@ int ucc_set_qe_mux_mii_mng(unsigned int ucc_num);
int ucc_set_qe_mux_rxtx(unsigned int ucc_num, enum qe_clock clock,
enum comm_dir mode);
+int ucc_set_tdm_rxtx_clk(int tdm_num, char *clk_src, enum comm_dir mode);
+
+int ucc_set_tdm_rxtx_sync(int tdm_num, char *clk_src, enum comm_dir mode);
+
int ucc_mux_set_grant_tsa_bkpt(unsigned int ucc_num, int set, u32 mask);
/* QE MUX clock routing for UCC
diff --git a/include/asm-powerpc/ucc_fast.h b/include/asm-powerpc/ucc_fast.h
index f529f70..cf35553 100644
--- a/include/asm-powerpc/ucc_fast.h
+++ b/include/asm-powerpc/ucc_fast.h
@@ -152,6 +152,10 @@ struct ucc_fast_info {
enum ucc_fast_rx_decoding_method renc;
enum ucc_fast_transparent_tcrc tcrc;
enum ucc_fast_sync_len synl;
+ const char *tdm_rx_clk;
+ const char *tdm_tx_clk;
+ const char *tdm_rx_sync;
+ const char *tdm_tx_sync;
};
struct ucc_fast_private {
--
1.5.2.4
^ permalink raw reply related
* Re: 2.6.24-rc8-mm1 : net tcp_input.c warnings
From: Ilpo Järvinen @ 2008-01-24 10:24 UTC (permalink / raw)
To: Netdev
Cc: Dave Young, Kamalesh Babulal, Krishna Kumar2,
Denys Fedoryshchenko, David Miller, LKML, Andrew Morton
In-Reply-To: <Pine.LNX.4.64.0801241108400.31652@kivilampi-30.cs.helsinki.fi>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 10779 bytes --]
On Thu, 24 Jan 2008, Ilpo Järvinen wrote:
> And anyway, there were some fackets_out related
> problems reported as well and this doesn't help for that but I think I've
> lost track of who was seeing it due to large number of reports :-), could
> somebody refresh my memory because I currently don't have time to dig it
> up from archives (at least on this week).
Here's the updated debug patch for net-2.6.25/mm for tracking
fackets_out inconsistencies (it won't work for trees which don't
include net-2.6.25, mm does of course :-)).
I hope I got it into good shape this time to avoid spurious stacktraces
but still maintaining 100% accuracy, but it's not a simple oneliner so I
might have missed something... :-)
I'd suggest that people trying with this first apply the newreno fix of
the previous mail to avoid already-fixed case from triggering.
--
i.
--
[PATCH] [TCP]: debug S+L (for net-2.5.26 / mm, incompatible with mainline)
---
include/net/tcp.h | 5 ++-
net/ipv4/tcp_input.c | 18 +++++++-
net/ipv4/tcp_ipv4.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++++
net/ipv4/tcp_output.c | 23 +++++++--
4 files changed, 165 insertions(+), 8 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7de4ea3..552aa71 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -272,6 +272,9 @@ DECLARE_SNMP_STAT(struct tcp_mib, tcp_statistics);
#define TCP_ADD_STATS_BH(field, val) SNMP_ADD_STATS_BH(tcp_statistics, field, val)
#define TCP_ADD_STATS_USER(field, val) SNMP_ADD_STATS_USER(tcp_statistics, field, val)
+extern void tcp_print_queue(struct sock *sk);
+extern void tcp_verify_wq(struct sock *sk);
+
extern void tcp_v4_err(struct sk_buff *skb, u32);
extern void tcp_shutdown (struct sock *sk, int how);
@@ -768,7 +771,7 @@ static inline __u32 tcp_current_ssthresh(const struct sock *sk)
}
/* Use define here intentionally to get WARN_ON location shown at the caller */
-#define tcp_verify_left_out(tp) WARN_ON(tcp_left_out(tp) > tp->packets_out)
+#define tcp_verify_left_out(tp) tcp_verify_wq((struct sock *)tp)
extern void tcp_enter_cwr(struct sock *sk, const int set_ssthresh);
extern __u32 tcp_init_cwnd(struct tcp_sock *tp, struct dst_entry *dst);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 19c449f..c897c93 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1426,8 +1426,10 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb,
int first_sack_index;
if (!tp->sacked_out) {
- if (WARN_ON(tp->fackets_out))
+ if (WARN_ON(tp->fackets_out)) {
+ tcp_verify_left_out(tp);
tp->fackets_out = 0;
+ }
tcp_highest_sack_reset(sk);
}
@@ -2136,6 +2138,8 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int fast_rexmit)
struct sk_buff *skb;
int cnt;
+ tcp_verify_left_out(tp);
+
BUG_TRAP(packets <= tp->packets_out);
if (tp->lost_skb_hint) {
skb = tp->lost_skb_hint;
@@ -2501,6 +2505,8 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
(tcp_fackets_out(tp) > tp->reordering));
int fast_rexmit = 0;
+ tcp_verify_left_out(tp);
+
if (WARN_ON(!tp->packets_out && tp->sacked_out))
tp->sacked_out = 0;
if (WARN_ON(!tp->sacked_out && tp->fackets_out))
@@ -2645,6 +2651,10 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
if (do_lost || (tcp_is_fack(tp) && tcp_head_timedout(sk)))
tcp_update_scoreboard(sk, fast_rexmit);
tcp_cwnd_down(sk, flag);
+
+ WARN_ON(tcp_write_queue_head(sk) == NULL);
+ WARN_ON(!tp->packets_out);
+
tcp_xmit_retransmit_queue(sk);
}
@@ -2848,6 +2858,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets)
tcp_clear_all_retrans_hints(tp);
}
+ tcp_verify_left_out(tp);
+
if (skb && (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED))
flag |= FLAG_SACK_RENEGING;
@@ -3175,6 +3187,8 @@ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
prior_fackets = tp->fackets_out;
prior_in_flight = tcp_packets_in_flight(tp);
+ tcp_verify_left_out(tp);
+
if (!(flag & FLAG_SLOWPATH) && after(ack, prior_snd_una)) {
/* Window is constant, pure forward advance.
* No more checks are required.
@@ -3237,6 +3251,8 @@ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
if ((flag & FLAG_FORWARD_PROGRESS) || !(flag & FLAG_NOT_DUP))
dst_confirm(sk->sk_dst_cache);
+ tcp_verify_left_out(tp);
+
return 1;
no_queue:
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 9aea88b..e6e3ad5 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -108,6 +108,133 @@ struct inet_hashinfo __cacheline_aligned tcp_hashinfo = {
.lhash_wait = __WAIT_QUEUE_HEAD_INITIALIZER(tcp_hashinfo.lhash_wait),
};
+void tcp_print_queue(struct sock *sk)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct sk_buff *skb;
+ char s[50+1];
+ char h[50+1];
+ int idx = 0;
+ int i;
+
+ i = 0;
+ tcp_for_write_queue(skb, sk) {
+ if (skb == tcp_send_head(sk))
+ printk(KERN_ERR "head %u %p\n", i, skb);
+ else
+ printk(KERN_ERR "skb %u %p\n", i, skb);
+ i++;
+ }
+
+ tcp_for_write_queue(skb, sk) {
+ if (skb == tcp_send_head(sk))
+ break;
+
+ for (i = 0; i < tcp_skb_pcount(skb); i++) {
+ if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) {
+ s[idx] = 'S';
+ if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
+ s[idx] = 'B';
+
+ } else if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST) {
+ s[idx] = 'L';
+ } else {
+ s[idx] = ' ';
+ }
+ if (s[idx] != ' ' && skb->len < tp->mss_cache)
+ s[idx] += 'a' - 'A';
+
+ if (i == 0) {
+ if (skb == tcp_highest_sack(sk))
+ h[idx] = 'h';
+ else
+ h[idx] = '+';
+ } else {
+ h[idx] = '-';
+ }
+
+ if (++idx >= 50) {
+ s[idx] = 0;
+ h[idx] = 0;
+ printk(KERN_ERR "TCP wq(s) %s\n", s);
+ printk(KERN_ERR "TCP wq(h) %s\n", h);
+ idx = 0;
+ }
+ }
+ }
+ if (idx) {
+ s[idx] = '<';
+ s[idx+1] = 0;
+ h[idx] = '<';
+ h[idx+1] = 0;
+ printk(KERN_ERR "TCP wq(s) %s\n", s);
+ printk(KERN_ERR "TCP wq(h) %s\n", h);
+ }
+ printk(KERN_ERR "l%u s%u f%u p%u seq: su%u hs%u sn%u\n",
+ tp->lost_out, tp->sacked_out, tp->fackets_out, tp->packets_out,
+ tp->snd_una, tcp_highest_sack_seq(tp), tp->snd_nxt);
+}
+
+void tcp_verify_wq(struct sock *sk)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ u32 lost = 0;
+ u32 sacked = 0;
+ u32 packets = 0;
+ u32 fackets = 0;
+ int hs_valid = 0;
+ struct sk_buff *skb;
+
+ tcp_for_write_queue(skb, sk) {
+ if (skb == tcp_send_head(sk))
+ break;
+
+ if ((fackets == packets) && (skb == tp->highest_sack))
+ hs_valid = 1;
+
+ packets += tcp_skb_pcount(skb);
+
+ if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) {
+ sacked += tcp_skb_pcount(skb);
+ if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
+ printk(KERN_ERR "Sacked bitmap S+L: %u %u-%u/%u\n",
+ TCP_SKB_CB(skb)->sacked,
+ TCP_SKB_CB(skb)->end_seq - tp->snd_una,
+ TCP_SKB_CB(skb)->seq - tp->snd_una,
+ tp->snd_una);
+ fackets = packets;
+ hs_valid = 0;
+ }
+ if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
+ lost += tcp_skb_pcount(skb);
+ }
+
+ if ((fackets == packets) && (tp->highest_sack == tcp_send_head(sk)))
+ hs_valid = 1;
+
+ if ((lost != tp->lost_out) ||
+ (tcp_is_sack(tp) && (sacked != tp->sacked_out)) ||
+ ((sacked || (tcp_is_sack(tp) && tp->sacked_out)) && !hs_valid) ||
+ (packets != tp->packets_out) ||
+ (fackets != tp->fackets_out) ||
+ tcp_left_out(tp) > tp->packets_out) {
+ printk(KERN_ERR "P: %u L: %u vs %u S: %u vs %u F: %u vs %u w: %u-%u (%u)\n",
+ tp->packets_out,
+ lost, tp->lost_out,
+ sacked, tp->sacked_out,
+ fackets, tp->fackets_out,
+ tp->snd_una, tp->snd_nxt,
+ tp->rx_opt.sack_ok);
+ tcp_print_queue(sk);
+ }
+
+ WARN_ON(lost != tp->lost_out);
+ WARN_ON(tcp_is_sack(tp) && (sacked != tp->sacked_out));
+ WARN_ON(packets != tp->packets_out);
+ WARN_ON(fackets != tp->fackets_out);
+ WARN_ON(tcp_left_out(tp) > tp->packets_out);
+}
+
static int tcp_v4_get_port(struct sock *sk, unsigned short snum)
{
return inet_csk_get_port(&tcp_hashinfo, sk, snum,
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 89f0188..8fb6628 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -779,10 +779,9 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
tp->lost_out -= diff;
/* Adjust Reno SACK estimate. */
- if (tcp_is_reno(tp) && diff > 0) {
+ if (tcp_is_reno(tp) && diff > 0)
tcp_dec_pcount_approx_int(&tp->sacked_out, diff);
- tcp_verify_left_out(tp);
- }
+
tcp_adjust_fackets_out(sk, skb, diff);
}
@@ -790,6 +789,8 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
skb_header_release(buff);
tcp_insert_write_queue_after(skb, buff, sk);
+ tcp_verify_left_out(tp);
+
return 0;
}
@@ -1463,6 +1464,7 @@ static int tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle)
} else if (result > 0) {
sent_pkts = 1;
}
+ tcp_verify_left_out(tp);
while ((skb = tcp_send_head(sk))) {
unsigned int limit;
@@ -1764,6 +1766,7 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb,
tcp_clear_retrans_hints_partial(tp);
sk_wmem_free_skb(sk, next_skb);
+ tcp_verify_left_out(tp);
}
/* Do a simple retransmit without using the backoff mechanisms in
@@ -1795,13 +1798,13 @@ void tcp_simple_retransmit(struct sock *sk)
}
}
+ tcp_verify_left_out(tp);
+
tcp_clear_all_retrans_hints(tp);
if (!lost)
return;
- tcp_verify_left_out(tp);
-
/* Don't muck with the congestion window here.
* Reason is that we do not increase amount of _data_
* in network, but units changed and effective
@@ -1888,6 +1891,8 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
tcp_init_nondata_skb(skb, TCP_SKB_CB(skb)->end_seq - 1,
TCP_SKB_CB(skb)->flags);
skb->ip_summed = CHECKSUM_NONE;
+
+ tcp_verify_left_out(tp);
}
}
@@ -1970,8 +1975,10 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
* packet to be MSS sized and all the
* packet counting works out.
*/
- if (tcp_packets_in_flight(tp) >= tp->snd_cwnd)
+ if (tcp_packets_in_flight(tp) >= tp->snd_cwnd) {
+ tcp_verify_left_out(tp);
return;
+ }
if (sacked & TCPCB_LOST) {
if (!(sacked & (TCPCB_SACKED_ACKED|TCPCB_SACKED_RETRANS))) {
@@ -1997,6 +2004,8 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
}
}
+ tcp_verify_left_out(tp);
+
/* OK, demanded retransmission is finished. */
/* Forward retransmissions are possible only during Recovery. */
@@ -2054,6 +2063,8 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
NET_INC_STATS_BH(LINUX_MIB_TCPFORWARDRETRANS);
}
+
+ tcp_verify_left_out(tp);
}
/* Send a fin. The caller locks the socket for us. This cannot be
--
1.5.2.2
^ permalink raw reply related
* Re: My 802.3ad is my bond
From: Steven Whitehouse @ 2008-01-24 10:06 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: netdev
In-Reply-To: <26889.1201108405@death>
Hi,
On Wed, 2008-01-23 at 09:13 -0800, Jay Vosburgh wrote:
> Steven Whitehouse <swhiteho@redhat.com> wrote:
> [...]
> >This commit: ece95f7fefe3afae19e641e1b3f5e64b00d5b948 seems to have
> >caused a problem with parsing bond arguments as now only the numeric
> >arguments seem to work (in modprobe.conf) and specifying 802.3ad fails.
> >When I revert that patch in my local tree all seems ok.
>
> Thanks for the report; I know what the problem here is. I'll
> get a fix out.
>
> >Also I notice that one of my two NICs now reports this:
> >
> >bonding: bond0: link status definitely down for interface eth0,
> >disabling it
> >bonding: bond0: Interface eth0 is already enslaved!
> >bond0.5: no IPv6 routers present
> >
> >which I think is also new with this set of bonding updates, before it
> >used to use both interfaces ok. I've not worked out which of the other
> >patches causes this so far, but I can if its helpful,
>
> That would be helpful, as would some more details: e.g., the
> various options passed to bonding, the complete dmesg log, contents of
> /proc/net/bonding/bond0 [or whatever your interface is called], and
> anything else you think would be helpful.
>
> -J
>
I've been through all the patches now, reverting each in turn and it
seems that I still have the problem. Thats rather odd as I'm sure that I
didn't have only one interface working before. When I get a moment I'll
try and work out whats going on here,
Steve.
^ permalink raw reply
* Re: 2.6.24-rc8-mm1 : net tcp_input.c warnings
From: Ilpo Järvinen @ 2008-01-24 9:54 UTC (permalink / raw)
To: Dave Young, Kamalesh Babulal, Krishna Kumar2,
Denys Fedoryshchenko, David Miller <
Cc: LKML, Netdev, Andrew Morton
In-Reply-To: <a8e1da0801231842m128d8261oc14135716a0a2d85@mail.gmail.com>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 8416 bytes --]
On Thu, 24 Jan 2008, Dave Young wrote:
Hi Dave (& others),
> Thanks.
Thanks a lot, I was first to ignore all these because they occurred
with newreno, but looked again... :-/
> New warning trigged with your debug patch:
This was probably with the earlier one I sent to you because there's still
this case remaining which itself is valid:
> P: 5 L: 0 vs 0 S: 0 vs 1 w: 2044790889-2044796616 (0)
...snip... this is still ok state (S+L <= P):
> P: 5 L: 0 vs 0 S: 0 vs 3 w: 2044790889-2044796616 (0)
> TCP wq(s) <
> TCP wq(h) +++h+<
> l0 s3 f0 p5 seq: su2044790889 hs2044795029 sn2044796616
> ------------[ cut here ]------------
> WARNING: at net/ipv4/tcp_input.c:2169 tcp_mark_head_lost+0x122/0x150()
> Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event
> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss eeprom e100 psmouse
> snd_hda_intel snd_pcm snd_timer btusb bluetooth serio_raw snd 3c59x sg
> evdev thermal soundcore rtc_cmos snd_page_alloc rtc_core rtc_lib
> i2c_i801 processor button intel_agp dcdbas pcspkr agpgart
> Pid: 0, comm: swapper Not tainted 2.6.24-rc8-mm1 #8
> [<c0132100>] ? have_callable_console+0x20/0x30
> [<c0131844>] warn_on_slowpath+0x54/0x80
> [<c03ffe54>] ? tcp_print_queue+0x1a4/0x230
> [<c0132438>] ? vprintk+0x308/0x320
> [<c0132438>] ? vprintk+0x308/0x320
> [<c0132438>] ? vprintk+0x308/0x320
> [<c03ffff6>] ? tcp_verify_wq+0x116/0x1c0
> [<c03f6052>] tcp_mark_head_lost+0x122/0x150
> [<c03f60ca>] tcp_update_scoreboard+0x4a/0x190
> [<c03f6e7a>] tcp_fastretrans_alert+0x4da/0x700
> [<c03f7e63>] tcp_ack+0x1b3/0x3a0
> [<c03fa21b>] tcp_rcv_established+0x3eb/0x710
> [<c0401c05>] tcp_v4_do_rcv+0xe5/0x100
> [<c04021fb>] tcp_v4_rcv+0x5db/0x660
> [<c0401fa7>] ? tcp_v4_rcv+0x387/0x660
> [<c03e5eed>] ? ip_local_deliver_finish+0x2d/0x1d0
> [<c03e5f44>] ip_local_deliver_finish+0x84/0x1d0
> [<c03e5eed>] ? ip_local_deliver_finish+0x2d/0x1d0
> [<c0156b97>] ? __lock_release+0x47/0x70
> [<c03e6147>] ip_local_deliver+0xb7/0xc0
> [<c03e6202>] ip_rcv_finish+0xb2/0x3c0
> [<c03c01d8>] ? sock_def_readable+0x48/0xa0
> [<c03be061>] ? sock_queue_rcv_skb+0xb1/0x1a0
> [<c03be0a7>] ? sock_queue_rcv_skb+0xf7/0x1a0
> [<c03e669f>] ip_rcv+0x18f/0x290
> [<c042fb10>] ? packet_rcv_spkt+0xd0/0x130
> [<c03c8da6>] netif_receive_skb+0x2b6/0x330
> [<c03c8c17>] ? netif_receive_skb+0x127/0x330
> [<c03c8ea3>] ? process_backlog+0x83/0x100
> [<c03c8eae>] process_backlog+0x8e/0x100
> [<c03c90bc>] net_rx_action+0x13c/0x230
> [<c03c8fd9>] ? net_rx_action+0x59/0x230
> [<c0136f63>] __do_softirq+0x93/0x120
> [<c013706a>] do_softirq+0x7a/0x80
> [<c0137135>] irq_exit+0x65/0x90
> [<c01078b1>] do_IRQ+0x41/0x80
> [<c0155659>] ? trace_hardirqs_on+0xb9/0x130
> [<c01059ca>] common_interrupt+0x2e/0x34
> [<c0103390>] ? mwait_idle_with_hints+0x40/0x50
> [<c01033a0>] ? mwait_idle+0x0/0x20
> [<c01033b2>] mwait_idle+0x12/0x20
> [<c0103141>] cpu_idle+0x61/0x110
> [<c04339fd>] rest_init+0x5d/0x60
> [<c05a47fa>] start_kernel+0x1fa/0x260
> [<c05a4190>] ? unknown_bootoption+0x0/0x130
> =======================
> ---[ end trace 14b601818e6903ac ]---
> ------------[ cut here ]------------
> WARNING: at net/ipv4/tcp_ipv4.c:197 tcp_verify_wq+0x1b6/0x1c0()
> Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event
> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss eeprom e100 psmouse
> snd_hda_intel snd_pcm snd_timer btusb bluetooth serio_raw snd 3c59x sg
> evdev thermal soundcore rtc_cmos snd_page_alloc rtc_core rtc_lib
> i2c_i801 processor button intel_agp dcdbas pcspkr agpgart
> Pid: 0, comm: swapper Not tainted 2.6.24-rc8-mm1 #8
> [<c0132100>] ? have_callable_console+0x20/0x30
> [<c0131844>] warn_on_slowpath+0x54/0x80
> [<c01317da>] ? print_oops_end_marker+0x2a/0x30
> [<c0131849>] ? warn_on_slowpath+0x59/0x80
> [<c03ffe54>] ? tcp_print_queue+0x1a4/0x230
> [<c0132438>] ? vprintk+0x308/0x320
> [<c0132438>] ? vprintk+0x308/0x320
> [<c0400096>] tcp_verify_wq+0x1b6/0x1c0
> [<c03ffff6>] ? tcp_verify_wq+0x116/0x1c0
> [<c03f5ffc>] tcp_mark_head_lost+0xcc/0x150
> [<c03f60ca>] tcp_update_scoreboard+0x4a/0x190
> [<c03f6e7a>] tcp_fastretrans_alert+0x4da/0x700
> [<c03f7e63>] tcp_ack+0x1b3/0x3a0
> [<c03fa21b>] tcp_rcv_established+0x3eb/0x710
> [<c0401c05>] tcp_v4_do_rcv+0xe5/0x100
> [<c04021fb>] tcp_v4_rcv+0x5db/0x660
> [<c0401fa7>] ? tcp_v4_rcv+0x387/0x660
> [<c03e5eed>] ? ip_local_deliver_finish+0x2d/0x1d0
> [<c03e5f44>] ip_local_deliver_finish+0x84/0x1d0
> [<c03e5eed>] ? ip_local_deliver_finish+0x2d/0x1d0
> [<c0156b97>] ? __lock_release+0x47/0x70
> [<c03e6147>] ip_local_deliver+0xb7/0xc0
> [<c03e6202>] ip_rcv_finish+0xb2/0x3c0
> [<c03c01d8>] ? sock_def_readable+0x48/0xa0
> [<c03be061>] ? sock_queue_rcv_skb+0xb1/0x1a0
> [<c03be0a7>] ? sock_queue_rcv_skb+0xf7/0x1a0
> [<c03e669f>] ip_rcv+0x18f/0x290
> [<c042fb10>] ? packet_rcv_spkt+0xd0/0x130
> [<c03c8da6>] netif_receive_skb+0x2b6/0x330
> [<c03c8c17>] ? netif_receive_skb+0x127/0x330
> [<c03c8ea3>] ? process_backlog+0x83/0x100
> [<c03c8eae>] process_backlog+0x8e/0x100
> [<c03c90bc>] net_rx_action+0x13c/0x230
> [<c03c8fd9>] ? net_rx_action+0x59/0x230
> [<c0136f63>] __do_softirq+0x93/0x120
> [<c013706a>] do_softirq+0x7a/0x80
> [<c0137135>] irq_exit+0x65/0x90
> [<c01078b1>] do_IRQ+0x41/0x80
> [<c0155659>] ? trace_hardirqs_on+0xb9/0x130
> [<c01059ca>] common_interrupt+0x2e/0x34
> [<c0103390>] ? mwait_idle_with_hints+0x40/0x50
> [<c01033a0>] ? mwait_idle+0x0/0x20
> [<c01033b2>] mwait_idle+0x12/0x20
> [<c0103141>] cpu_idle+0x61/0x110
> [<c04339fd>] rest_init+0x5d/0x60
> [<c05a47fa>] start_kernel+0x1fa/0x260
> [<c05a4190>] ? unknown_bootoption+0x0/0x130
> =======================
> ---[ end trace 14b601818e6903ac ]---
...But this no longer is, and even more, L: 5 is not valid state at this
point all (should only happen if we went to RTO but it would reset S to
zero with newreno):
> P: 5 L: 5 vs 5 S: 0 vs 3 w: 2044790889-2044796616 (0)
> TCP wq(s) LLLLl<
> TCP wq(h) +++h+<
> l5 s3 f0 p5 seq: su2044790889 hs2044795029 sn2044796616
Surprisingly, it was the first time the WARN_ON for left_out returned
correct location. This also explains why the patch I sent to Krishna
didn't print anything (it didn't end up into printing because I forgot
to add L+S>P check into to the state checking if).
...so please, could you (others than Denys) try this patch, it should
solve the issue. And Denys, could you confirm (and if necessary double
check) that the kernel you saw this similar problem with is the pure
Linus' mainline, i.e., without any net-2.6.25 or mm bits please, if so,
that problem persists. And anyway, there were some fackets_out related
problems reported as well and this doesn't help for that but I think I've
lost track of who was seeing it due to large number of reports :-), could
somebody refresh my memory because I currently don't have time to dig it
up from archives (at least on this week).
--
i.
--
[PATCH] [TCP]: NewReno must count every skb while marking losses
NewReno should add cnt per skb (as with FACK) instead of depending
on SACKED_ACKED bits which won't be set with it at all.
Effectively, NewReno should always exists after the first
iteration anyway (or immediately if there's already head in
lost_out.
This was fixed earlier in net-2.6.25 but got reverted among other
stuff and I didn't notice that this is still necessary (actually
wasn't even considering this case while trying to figure out the
reports because I lived with different kind of code than it in
reality was).
This should solve the WARN_ONs in TCP code that as a result of
this triggered multiple times in every place we check for this
invariant.
Special thanks to Dave Young <hidave.darkstar@gmail.com> and
Krishna Kumar2 <krkumar2@in.ibm.com> for trying with my debug
patches.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
net/ipv4/tcp_input.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 295490e..aa409a5 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2156,7 +2156,7 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int fast_rexmit)
tp->lost_skb_hint = skb;
tp->lost_cnt_hint = cnt;
- if (tcp_is_fack(tp) ||
+ if (tcp_is_fack(tp) || tcp_is_reno(tp) ||
(TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED))
cnt += tcp_skb_pcount(skb);
--
1.5.2.2
^ permalink raw reply related
* Re: [resend][PATCH] Introducing socket mark socket option
From: Patrick McHardy @ 2008-01-24 9:43 UTC (permalink / raw)
To: Laszlo Attila Toth; +Cc: Netfilter Developer Mailing List, netdev, linux-arch
In-Reply-To: <1201167495912-git-send-email-panther@balabit.hu>
Laszlo Attila Toth wrote:
> A userspace program may wish to set the mark for each packets its send
> without using the netfilter MARK target. Changing the mark can be used
> for mark based routing without netfilter or for packet filtering.
>
> It requires CAP_NET_ADMIN capability.
Looks good to me.
^ permalink raw reply
* [resend][PATCH] Introducing socket mark socket option
From: Laszlo Attila Toth @ 2008-01-24 9:38 UTC (permalink / raw)
To: Patrick McHardy
Cc: Netfilter Developer Mailing List, netdev, linux-arch,
Laszlo Attila Toth
In-Reply-To: <47974D07.5040202@trash.net>
A userspace program may wish to set the mark for each packets its send
without using the netfilter MARK target. Changing the mark can be used
for mark based routing without netfilter or for packet filtering.
It requires CAP_NET_ADMIN capability.
Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
---
include/asm-alpha/socket.h | 2 ++
include/asm-arm/socket.h | 2 ++
include/asm-avr32/socket.h | 2 ++
include/asm-blackfin/socket.h | 3 +++
include/asm-cris/socket.h | 2 ++
include/asm-frv/socket.h | 2 ++
include/asm-h8300/socket.h | 2 ++
include/asm-ia64/socket.h | 2 ++
include/asm-m32r/socket.h | 2 ++
include/asm-m68k/socket.h | 2 ++
include/asm-mips/socket.h | 2 ++
include/asm-parisc/socket.h | 2 ++
include/asm-powerpc/socket.h | 2 ++
include/asm-s390/socket.h | 2 ++
include/asm-sh/socket.h | 2 ++
include/asm-sparc/socket.h | 2 ++
include/asm-sparc64/socket.h | 1 +
include/asm-v850/socket.h | 2 ++
include/asm-x86/socket.h | 2 ++
include/asm-xtensa/socket.h | 2 ++
include/net/route.h | 2 ++
include/net/sock.h | 2 ++
net/core/sock.c | 11 +++++++++++
net/ipv4/ip_output.c | 3 +++
net/ipv4/raw.c | 2 ++
net/ipv6/ip6_output.c | 2 ++
net/ipv6/raw.c | 3 +++
27 files changed, 65 insertions(+), 0 deletions(-)
diff --git a/include/asm-alpha/socket.h b/include/asm-alpha/socket.h
index 1fede7f..08c9793 100644
--- a/include/asm-alpha/socket.h
+++ b/include/asm-alpha/socket.h
@@ -60,4 +60,6 @@
#define SO_SECURITY_ENCRYPTION_TRANSPORT 20
#define SO_SECURITY_ENCRYPTION_NETWORK 21
+#define SO_MARK 36
+
#endif /* _ASM_SOCKET_H */
diff --git a/include/asm-arm/socket.h b/include/asm-arm/socket.h
index 65a1a64..6817be9 100644
--- a/include/asm-arm/socket.h
+++ b/include/asm-arm/socket.h
@@ -52,4 +52,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* _ASM_SOCKET_H */
diff --git a/include/asm-avr32/socket.h b/include/asm-avr32/socket.h
index a0d0507..35863f2 100644
--- a/include/asm-avr32/socket.h
+++ b/include/asm-avr32/socket.h
@@ -52,4 +52,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* __ASM_AVR32_SOCKET_H */
diff --git a/include/asm-blackfin/socket.h b/include/asm-blackfin/socket.h
index 5213c96..2ca702e 100644
--- a/include/asm-blackfin/socket.h
+++ b/include/asm-blackfin/socket.h
@@ -50,4 +50,7 @@
#define SO_PASSSEC 34
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+
+#define SO_MARK 36
+
#endif /* _ASM_SOCKET_H */
diff --git a/include/asm-cris/socket.h b/include/asm-cris/socket.h
index 5b18dfd..9df0ca8 100644
--- a/include/asm-cris/socket.h
+++ b/include/asm-cris/socket.h
@@ -54,6 +54,8 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* _ASM_SOCKET_H */
diff --git a/include/asm-frv/socket.h b/include/asm-frv/socket.h
index a823bef..e51ca67 100644
--- a/include/asm-frv/socket.h
+++ b/include/asm-frv/socket.h
@@ -52,5 +52,7 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* _ASM_SOCKET_H */
diff --git a/include/asm-h8300/socket.h b/include/asm-h8300/socket.h
index 39911d8..da2520d 100644
--- a/include/asm-h8300/socket.h
+++ b/include/asm-h8300/socket.h
@@ -52,4 +52,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* _ASM_SOCKET_H */
diff --git a/include/asm-ia64/socket.h b/include/asm-ia64/socket.h
index 9e42ce4..d5ef0aa 100644
--- a/include/asm-ia64/socket.h
+++ b/include/asm-ia64/socket.h
@@ -61,4 +61,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* _ASM_IA64_SOCKET_H */
diff --git a/include/asm-m32r/socket.h b/include/asm-m32r/socket.h
index 793d5d3..9a0e200 100644
--- a/include/asm-m32r/socket.h
+++ b/include/asm-m32r/socket.h
@@ -52,4 +52,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* _ASM_M32R_SOCKET_H */
diff --git a/include/asm-m68k/socket.h b/include/asm-m68k/socket.h
index 6d21b90..dbc64e9 100644
--- a/include/asm-m68k/socket.h
+++ b/include/asm-m68k/socket.h
@@ -52,4 +52,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* _ASM_SOCKET_H */
diff --git a/include/asm-mips/socket.h b/include/asm-mips/socket.h
index 9594568..63f6025 100644
--- a/include/asm-mips/socket.h
+++ b/include/asm-mips/socket.h
@@ -73,6 +73,8 @@ To add: #define SO_REUSEPORT 0x0200 /* Allow local address and port reuse. */
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#ifdef __KERNEL__
/** sock_type - Socket types
diff --git a/include/asm-parisc/socket.h b/include/asm-parisc/socket.h
index 99e868f..69a7a0d 100644
--- a/include/asm-parisc/socket.h
+++ b/include/asm-parisc/socket.h
@@ -52,4 +52,6 @@
#define SO_PEERSEC 0x401d
#define SO_PASSSEC 0x401e
+#define SO_MARK 0x401f
+
#endif /* _ASM_SOCKET_H */
diff --git a/include/asm-powerpc/socket.h b/include/asm-powerpc/socket.h
index 403e9fd..f5a4e16 100644
--- a/include/asm-powerpc/socket.h
+++ b/include/asm-powerpc/socket.h
@@ -59,4 +59,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* _ASM_POWERPC_SOCKET_H */
diff --git a/include/asm-s390/socket.h b/include/asm-s390/socket.h
index 1161ebe..c786ab6 100644
--- a/include/asm-s390/socket.h
+++ b/include/asm-s390/socket.h
@@ -60,4 +60,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* _ASM_SOCKET_H */
diff --git a/include/asm-sh/socket.h b/include/asm-sh/socket.h
index c48d6fc..6d4bf65 100644
--- a/include/asm-sh/socket.h
+++ b/include/asm-sh/socket.h
@@ -52,4 +52,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* __ASM_SH_SOCKET_H */
diff --git a/include/asm-sparc/socket.h b/include/asm-sparc/socket.h
index 7c14239..2e2bd0b 100644
--- a/include/asm-sparc/socket.h
+++ b/include/asm-sparc/socket.h
@@ -52,6 +52,8 @@
#define SO_TIMESTAMPNS 0x0021
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 0x0022
+
/* Security levels - as per NRL IPv6 - don't actually do anything */
#define SO_SECURITY_AUTHENTICATION 0x5001
#define SO_SECURITY_ENCRYPTION_TRANSPORT 0x5002
diff --git a/include/asm-sparc64/socket.h b/include/asm-sparc64/socket.h
index 986441d..44a625a 100644
--- a/include/asm-sparc64/socket.h
+++ b/include/asm-sparc64/socket.h
@@ -57,4 +57,5 @@
#define SO_SECURITY_ENCRYPTION_TRANSPORT 0x5002
#define SO_SECURITY_ENCRYPTION_NETWORK 0x5004
+#define SO_MARK 0x0022
#endif /* _ASM_SOCKET_H */
diff --git a/include/asm-v850/socket.h b/include/asm-v850/socket.h
index a4c2493..e199a2b 100644
--- a/include/asm-v850/socket.h
+++ b/include/asm-v850/socket.h
@@ -52,4 +52,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* __V850_SOCKET_H__ */
diff --git a/include/asm-x86/socket.h b/include/asm-x86/socket.h
index 99ca648..80af9c4 100644
--- a/include/asm-x86/socket.h
+++ b/include/asm-x86/socket.h
@@ -52,4 +52,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* _ASM_SOCKET_H */
diff --git a/include/asm-xtensa/socket.h b/include/asm-xtensa/socket.h
index 1f5aeac..6100682 100644
--- a/include/asm-xtensa/socket.h
+++ b/include/asm-xtensa/socket.h
@@ -63,4 +63,6 @@
#define SO_TIMESTAMPNS 35
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
+#define SO_MARK 36
+
#endif /* _XTENSA_SOCKET_H */
diff --git a/include/net/route.h b/include/net/route.h
index 4eabf00..fcc6d5b 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -27,6 +27,7 @@
#include <net/dst.h>
#include <net/inetpeer.h>
#include <net/flow.h>
+#include <net/sock.h>
#include <linux/in_route.h>
#include <linux/rtnetlink.h>
#include <linux/route.h>
@@ -149,6 +150,7 @@ static inline int ip_route_connect(struct rtable **rp, __be32 dst,
int flags)
{
struct flowi fl = { .oif = oif,
+ .mark = sk->sk_mark,
.nl_u = { .ip4_u = { .daddr = dst,
.saddr = src,
.tos = tos } },
diff --git a/include/net/sock.h b/include/net/sock.h
index 9023244..e3fb4c0 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -262,6 +262,8 @@ struct sock {
__u32 sk_sndmsg_off;
int sk_write_pending;
void *sk_security;
+ __u32 sk_mark;
+ /* XXX 4 bytes hole on 64 bit */
void (*sk_state_change)(struct sock *sk);
void (*sk_data_ready)(struct sock *sk, int bytes);
void (*sk_write_space)(struct sock *sk);
diff --git a/net/core/sock.c b/net/core/sock.c
index 1c4b1cd..433715f 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -667,6 +667,13 @@ set_rcvbuf:
else
clear_bit(SOCK_PASSSEC, &sock->flags);
break;
+ case SO_MARK:
+ if (!capable(CAP_NET_ADMIN))
+ ret = -EPERM;
+ else {
+ sk->sk_mark = val;
+ }
+ break;
/* We implement the SO_SNDLOWAT etc to
not be settable (1003.1g 5.3) */
@@ -836,6 +843,10 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
case SO_PEERSEC:
return security_socket_getpeersec_stream(sock, optval, optlen, len);
+ case SO_MARK:
+ v.val = sk->sk_mark;
+ break;
+
default:
return -ENOPROTOOPT;
}
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 4fad239..1295881 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -168,6 +168,7 @@ int ip_build_and_send_pkt(struct sk_buff *skb, struct sock *sk,
}
skb->priority = sk->sk_priority;
+ skb->mark = sk->sk_mark;
/* Send it out. */
return ip_local_out(skb);
@@ -385,6 +386,7 @@ packet_routed:
(skb_shinfo(skb)->gso_segs ?: 1) - 1);
skb->priority = sk->sk_priority;
+ skb->mark = sk->sk_mark;
return ip_local_out(skb);
@@ -1282,6 +1284,7 @@ int ip_push_pending_frames(struct sock *sk)
iph->daddr = rt->rt_dst;
skb->priority = sk->sk_priority;
+ skb->mark = sk->sk_mark;
skb->dst = dst_clone(&rt->u.dst);
if (iph->protocol == IPPROTO_ICMP)
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 85c0869..f863c3d 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -352,6 +352,7 @@ static int raw_send_hdrinc(struct sock *sk, void *from, size_t length,
skb_reserve(skb, hh_len);
skb->priority = sk->sk_priority;
+ skb->mark = sk->sk_mark;
skb->dst = dst_clone(&rt->u.dst);
skb_reset_network_header(skb);
@@ -544,6 +545,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
{
struct flowi fl = { .oif = ipc.oif,
+ .mark = sk->sk_mark,
.nl_u = { .ip4_u =
{ .daddr = daddr,
.saddr = saddr,
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 9922be2..a8a0c53 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -257,6 +257,7 @@ int ip6_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl,
ipv6_addr_copy(&hdr->daddr, first_hop);
skb->priority = sk->sk_priority;
+ skb->mark = sk->sk_mark;
mtu = dst_mtu(dst);
if ((skb->len <= mtu) || ipfragok || skb_is_gso(skb)) {
@@ -1437,6 +1438,7 @@ int ip6_push_pending_frames(struct sock *sk)
ipv6_addr_copy(&hdr->daddr, final_dst);
skb->priority = sk->sk_priority;
+ skb->mark = sk->sk_mark;
skb->dst = dst_clone(&rt->u.dst);
IP6_INC_STATS(rt->rt6i_idev, IPSTATS_MIB_OUTREQUESTS);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 4d88055..d61c63d 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -641,6 +641,7 @@ static int rawv6_send_hdrinc(struct sock *sk, void *from, int length,
skb_reserve(skb, hh_len);
skb->priority = sk->sk_priority;
+ skb->mark = sk->sk_mark;
skb->dst = dst_clone(&rt->u.dst);
skb_put(skb, length);
@@ -767,6 +768,8 @@ static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
*/
memset(&fl, 0, sizeof(fl));
+ fl.mark = sk->sk_mark;
+
if (sin6) {
if (addr_len < SIN6_LEN_RFC2133)
return -EINVAL;
--
1.5.2.5
^ permalink raw reply related
* Re: [IPV4 0/9] TRIE performance patches
From: Robert Olsson @ 2008-01-24 9:36 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Robert Olsson, David Miller, netdev
In-Reply-To: <20080123154933.233c4909@deepthought>
Stephen Hemminger writes:
> Dumping by prefix is possible, but unless 32x slower. Dumping in
> address order is just as logical. Like I said, I'm investigating what
> quagga handles.
How about taking a snapshot to in address order (as you did) to some
allocated memory, returning from that memory in prefix order? This would
solve the -EBUSY too and give a consistent view of the routing table at
the time for the dump/snapshot.
Cheers
--ro
^ permalink raw reply
* Re: bluetooth : lockdep warning on rfcomm
From: Dave Young @ 2008-01-24 9:25 UTC (permalink / raw)
To: LKML; +Cc: Netdev, Marcel Holtmann, David Miller, bluez-devel
In-Reply-To: <a8e1da0801231902x5ab7691bq5a30e839f6c8cb71@mail.gmail.com>
On Jan 24, 2008 11:02 AM, Dave Young <hidave.darkstar@gmail.com> wrote:
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.24-rc8-mm1 #8
> ---------------------------------------------
> bluepush/3213 is trying to acquire lock:
> (sk_lock-AF_BLUETOOTH){--..}, at: [<f8978c80>]
> l2cap_sock_bind+0x40/0x100 [l2cap]
>
> but task is already holding lock:
> (sk_lock-AF_BLUETOOTH){--..}, at: [<f894a31e>]
> rfcomm_sock_connect+0x3e/0xe0 [rfcomm]
>
> other info that might help us debug this:
> 2 locks held by bluepush/3213:
> #0: (sk_lock-AF_BLUETOOTH){--..}, at: [<f894a31e>]
> rfcomm_sock_connect+0x3e/0xe0 [rfcomm]
> #1: (rfcomm_mutex){--..}, at: [<f8947556>] rfcomm_dlc_open+0x26/0x60 [rfcomm]
>
> stack backtrace:
> Pid: 3213, comm: bluepush Not tainted 2.6.24-rc8-mm1 #8
> [<c0132128>] ? printk+0x18/0x20
> [<c0154437>] print_deadlock_bug+0xc7/0xe0
> [<c01544bc>] check_deadlock+0x6c/0x80
> [<c01548fc>] validate_chain+0x14c/0x320
> [<c0156221>] __lock_acquire+0x1c1/0x730
> [<c0156d89>] lock_acquire+0x79/0xb0
> [<f8978c80>] ? l2cap_sock_bind+0x40/0x100 [l2cap]
> [<c03c05f5>] lock_sock_nested+0x55/0x70
> [<f8978c80>] ? l2cap_sock_bind+0x40/0x100 [l2cap]
> [<f8978c80>] l2cap_sock_bind+0x40/0x100 [l2cap]
> [<c03bdb4a>] kernel_bind+0xa/0x10
> [<f8947afc>] rfcomm_session_create+0x4c/0x110 [rfcomm]
> [<f8947509>] __rfcomm_dlc_open+0x129/0x150 [rfcomm]
> [<f8947568>] rfcomm_dlc_open+0x38/0x60 [rfcomm]
> [<f894a396>] rfcomm_sock_connect+0xb6/0xe0 [rfcomm]
> [<c03bcd39>] sys_connect+0x99/0xd0
> [<c010f509>] ? cache_add_dev+0x39/0x1a0
> [<c015323d>] ? put_lock_stats+0xd/0x30
> [<c01532c0>] ? lock_release_holdtime+0x60/0x80
> [<c018e86c>] ? fget+0x7c/0x100
> [<c0156b97>] ? __lock_release+0x47/0x70
> [<c018e86c>] ? fget+0x7c/0x100
> [<c02611b7>] ? copy_from_user+0x37/0x70
> [<c03bd855>] sys_socketcall+0xa5/0x230
> [<c0155659>] ? trace_hardirqs_on+0xb9/0x130
> [<c010501b>] ? restore_nocheck+0x12/0x15
>
While fixing this issue, another locking dependency confused me. Are
rfcomm_dev_lock and &d->lock in same lock class?
The warnings as following:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.24-rc8-mm1 #8
-------------------------------------------------------
krfcommd/2912 is trying to acquire lock:
(rfcomm_dev_lock){-.--}, at: [<f89d5bf2>]
rfcomm_dev_state_change+0x92/0x160 [rfcomm]
but task is already holding lock:
(&d->lock){--..}, at: [<f89d15d3>] __rfcomm_dlc_close+0x43/0xd0 [rfcomm]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&d->lock){--..}:
[<c0153b13>] add_lock_to_list+0x33/0x70
[<c01545a3>] check_prev_add+0xd3/0x200
[<f89d5311>] rfcomm_dev_add+0x191/0x300 [rfcomm]
[<c0154765>] check_prevs_add+0x95/0xe0
[<c01549ef>] validate_chain+0x23f/0x320
[<c0156221>] __lock_acquire+0x1c1/0x730
[<c0155539>] mark_held_locks+0x39/0x80
[<c0156d89>] lock_acquire+0x79/0xb0
[<f89d5311>] rfcomm_dev_add+0x191/0x300 [rfcomm]
[<c0436749>] _spin_lock+0x39/0x80
[<f89d5311>] rfcomm_dev_add+0x191/0x300 [rfcomm]
[<f89d5b10>] rfcomm_dev_data_ready+0x0/0x50 [rfcomm]
[<f89d5311>] rfcomm_dev_add+0x191/0x300 [rfcomm]
[<f89d565e>] rfcomm_create_dev+0x6e/0x100 [rfcomm]
[<c03c05fa>] lock_sock_nested+0x5a/0x70
[<f89d5ae3>] rfcomm_dev_ioctl+0x33/0x60 [rfcomm]
[<f89d4c8c>] rfcomm_sock_ioctl+0x2c/0x50 [rfcomm]
[<c03bc001>] sock_ioctl+0xc1/0x220
[<c03bbf40>] sock_ioctl+0x0/0x220
[<c019a366>] vfs_ioctl+0x76/0x90
[<c019a616>] do_vfs_ioctl+0x56/0x140
[<c019a762>] sys_ioctl+0x62/0x70
[<c0104fba>] syscall_call+0x7/0xb
[<ffffffff>] 0xffffffff
-> #0 (rfcomm_dev_lock){-.--}:
[<c0154504>] check_prev_add+0x34/0x200
[<c012a9ab>] default_wake_function+0xb/0x10
[<c0154765>] check_prevs_add+0x95/0xe0
[<c01549ef>] validate_chain+0x23f/0x320
[<c0156221>] __lock_acquire+0x1c1/0x730
[<c0156d89>] lock_acquire+0x79/0xb0
[<f89d5bf2>] rfcomm_dev_state_change+0x92/0x160 [rfcomm]
[<c04361e9>] _read_lock+0x39/0x80
[<f89d5bf2>] rfcomm_dev_state_change+0x92/0x160 [rfcomm]
[<f89d5bf2>] rfcomm_dev_state_change+0x92/0x160 [rfcomm]
[<f89d15e8>] __rfcomm_dlc_close+0x58/0xd0 [rfcomm]
[<f89d2523>] rfcomm_recv_ua+0x73/0x140 [rfcomm]
[<f89d3151>] rfcomm_recv_frame+0x171/0x1e0 [rfcomm]
[<c0155659>] trace_hardirqs_on+0xb9/0x130
[<c0436a39>] _spin_unlock_irqrestore+0x39/0x70
[<f89d3447>] rfcomm_run+0xe7/0x590 [rfcomm]
[<c0120000>] hpet_legacy_next_event+0x20/0x50
[<f89d3360>] rfcomm_run+0x0/0x590 [rfcomm]
[<c0146e0c>] kthread+0x5c/0xa0
[<c0146db0>] kthread+0x0/0xa0
[<c0105c17>] kernel_thread_helper+0x7/0x10
[<ffffffff>] 0xffffffff
other info that might help us debug this:
2 locks held by krfcommd/2912:
#0: (rfcomm_mutex){--..}, at: [<f89d33db>] rfcomm_run+0x7b/0x590 [rfcomm]
#1: (&d->lock){--..}, at: [<f89d15d3>] __rfcomm_dlc_close+0x43/0xd0 [rfcomm]
stack backtrace:
Pid: 2912, comm: krfcommd Not tainted 2.6.24-rc8-mm1 #8
[<c0132128>] ? printk+0x18/0x20
[<c0153cff>] print_circular_bug_tail+0x6f/0x80
[<c0154504>] check_prev_add+0x34/0x200
[<c012a9ab>] ? default_wake_function+0xb/0x10
[<c0154765>] check_prevs_add+0x95/0xe0
[<c01549ef>] validate_chain+0x23f/0x320
[<c0156221>] __lock_acquire+0x1c1/0x730
[<c0156d89>] lock_acquire+0x79/0xb0
[<f89d5bf2>] ? rfcomm_dev_state_change+0x92/0x160 [rfcomm]
[<c04361e9>] _read_lock+0x39/0x80
[<f89d5bf2>] ? rfcomm_dev_state_change+0x92/0x160 [rfcomm]
[<f89d5bf2>] rfcomm_dev_state_change+0x92/0x160 [rfcomm]
[<f89d15e8>] __rfcomm_dlc_close+0x58/0xd0 [rfcomm]
[<f89d2523>] rfcomm_recv_ua+0x73/0x140 [rfcomm]
[<f89d3151>] rfcomm_recv_frame+0x171/0x1e0 [rfcomm]
[<c0155659>] ? trace_hardirqs_on+0xb9/0x130
[<c0436a39>] ? _spin_unlock_irqrestore+0x39/0x70
[<f89d3447>] rfcomm_run+0xe7/0x590 [rfcomm]
[<c0120000>] ? hpet_legacy_next_event+0x20/0x50
[<f89d3360>] ? rfcomm_run+0x0/0x590 [rfcomm]
[<c0146e0c>] kthread+0x5c/0xa0
[<c0146db0>] ? kthread+0x0/0xa0
[<c0105c17>] kernel_thread_helper+0x7/0x10
=======================
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
^ permalink raw reply
* Re: [Bugme-new] [Bug 9806] New: (tun dev) Impossible to deassert IFF_ONE_QUEUE or IFF_NO_PI
From: Andrew Morton @ 2008-01-24 8:33 UTC (permalink / raw)
To: nwfilardo; +Cc: bugme-daemon, maxk, vtun, netdev
In-Reply-To: <bug-9806-10286@http.bugzilla.kernel.org/>
> On Wed, 23 Jan 2008 13:13:13 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9806
>
> Summary: (tun dev) Impossible to deassert IFF_ONE_QUEUE or
> IFF_NO_PI
> Product: Drivers
> Version: 2.5
> KernelVersion: 2.6.23
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Network
> AssignedTo: jgarzik@pobox.com
> ReportedBy: nwfilardo@gmail.com
>
>
> Problem Description:
>
> The TUN/TAP driver only permits one-way transitions of IFF_NO_PI or
> IFF_ONE_QUEUE during the lifetime of a tap/tun interface. Note that
> tun_set_iff contains
>
> 541 if (ifr->ifr_flags & IFF_NO_PI)
> 542 tun->flags |= TUN_NO_PI;
> 543
> 544 if (ifr->ifr_flags & IFF_ONE_QUEUE)
> 545 tun->flags |= TUN_ONE_QUEUE;
>
> This is easily fixed by adding else branches which clear these bits.
>
> Steps to reproduce:
>
> This is easily reproduced by setting an interface persistant using tunctl then
> attempting to open it as IFF_TAP or IFF_TUN, without asserting the IFF_NO_PI
> flag. The ioctl() will succeed and the ifr.flags word is not modified, but the
> interface remains in IFF_NO_PI mode (as it was set by tunctl).
>
Thanks. Could you please submit the patch via email? Send it to
all recipients of this email.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox