* Re: bridge/netfilter: regression in 2.6.39.1
@ 2011-06-06 12:49 ` Alexander Holler
0 siblings, 0 replies; 43+ messages in thread
From: Alexander Holler @ 2011-06-06 12:49 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Neil Horman, linux-kernel, David Miller, Herbert Xu, netdev
Am 06.06.2011 14:12, schrieb Eric Dumazet:
> Le lundi 06 juin 2011 à 13:48 +0200, Alexander Holler a écrit :
>> Am 06.06.2011 13:15, schrieb Neil Horman:
>>> On Fri, Jun 03, 2011 at 09:21:06PM +0200, Alexander Holler wrote:
>>>> Hello,
>>>>
>>>> I'm getting a oops in the bridge code in br_change_mtu() with
>>>> 2.6.39.1. The patch below seems to fix that.
>>>>
>>>> I'm not sure about the usage of dst_cow_metrics_generic() in
>>>> fake_dst_ops, but after having a quick look at it seems to be ok to
>>>> use that here.
>>>>
>>>> Regards,
>>>>
>>>> Alexander
>>>>
>>> How did the flags of the dst entry on which we're callnig dst_entry_write_ptr
>>> wind up getting the READ_ONLY flag set on it? I don't see how we'er falling
>>> into that clause in which we call cow_metrics when we call dst_metric_set. It
>>> seems like that flag is set erroneously. perhaps we should just update
>>> fake_rtable.dst to have the correct flags?
>>> Neil
>>
>> It is set by that change:
>>
>> --------
>> @@ -124,7 +128,7 @@ void br_netfilter_rtable_init(struct net_bridge *br)
>> atomic_set(&rt->dst.__refcnt, 1);
>> rt->dst.dev = br->dev;
>> rt->dst.path =&rt->dst;
>> - dst_metric_set(&rt->dst, RTAX_MTU, 1500);
>> + dst_init_metrics(&rt->dst, br_dst_default_metrics, true);
>> rt->dst.flags = DST_NOXFRM;
>> rt->dst.ops =&fake_dst_ops;
>> }
>> --------
>>
>> The true in dst_init_metrics() is responsible for that flag.
>>
>
> You are aware this change fixed an oops ?
No, I'm not aware of this. I know almost nothing about what all that
stuff is doing. For me that change above just introduced an oops through
an immediate NULL pointer dereference in br_change_mtu().
> read_only in this context means : In case this must be written, we make
> a COW first
> (allocate a piece of memory, copy the source in it before applying any
> change)
>
> It would be nice you send us the stack trace, so that we can have a clue
> of whats going on.
Here is the text as found in my first mail:
----
Commit 42923465fb8d025a2b5153f2e7ab1e6e1058bf00 does here what it
should prevent, it introduces NULL a dereference.
The above commit uses dst_init_metrics() which sets the metrics as
read only. As result br_change_mtu() dies in dst_metric_set()
which calls dst_metrics_write_ptr() which calls
dst->ops->cow_metrics() if the metrics are read only.
I don't have a system with 2.6.39.1 or 3.0-rcN ready which writes an
oops to somewhere else than the screen. Just set up a bridge and change
the MTU for the IF, that will trigger the oops.
----
Here is the oops:
----
[ 136.546023] BUG: unable to handle kernel NULL pointer dereference at
(null)
[ 136.546038] IP: [< (null)>] (null)
[ 136.546046] *pde = 00000000
[ 136.546052] Oops: 0000 [#1] SMP
[ 136.546059] last sysfs file: /sys/devices/virtual/net/mybridge/uevent
[ 136.546066] Modules linked in: sit tunnel4 tun iwlagn sch_sfq bridge
stp llc nfs lockd auth_rpcgss nfs_acl sunrpc ipt_LOG xt_recent xt_state
iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
xt_addrtype xt_dscp xt_iprange xt_DSCP xt_set ip_set nfnetlink ip6t_LOG
xt_limit ip6table_filter xt_string xt_owner xt_multiport xt_hashlimit
xt_conntrack xt_NFQUEUE xt_mark xt_connmark nf_conntrack ip6_tables
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss
snd_mixer_oss uinput ipv6 dm_mod nvidiafb fb_ddc i2c_algo_bit vgastate
nvidia(P) arc4 uvcvideo videodev ecb snd_hda_codec_realtek iwl4965
snd_hda_intel iwl_legacy snd_hda_codec uhci_hcd firewire_ohci mac80211
sdhci_pci snd_hwdep snd_pcm sdhci ehci_hcd intel_agp firewire_core
cfg80211 snd_timer asus_laptop snd e1000e usbcore sparse_keymap rfkill
tpm_infineon crc_itu_t intel_gtt mmc_core iTCO_wdt tpm_tis ata_piix
agpgart tpm video soundcore sg tpm_bios joydev snd_page_alloc [last
unloaded: microcode]
[ 136.546235]
[ 136.546243] Pid: 8415, comm: ip Tainted: P
2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn
/V1Sn
[ 136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
[ 136.546268] EIP is at 0x0
[ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
[ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
[ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80
task.ti=f15c2000)
[ 136.546297] Stack:
[ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80
ffffffa1 f15c3bbc
[ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80
ffffffa6 f15c3be4
[ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae
00000000 00000000
[ 136.546343] Call Trace:
[ 136.546359] [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
[ 136.546372] [<c12cb9c8>] dev_set_mtu+0x38/0x80
[ 136.546381] [<c12da347>] do_setlink+0x1a7/0x860
[ 136.546390] [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
[ 136.546400] [<c11d6cae>] ? nla_parse+0x6e/0xb0
[ 136.546409] [<c12db931>] rtnl_newlink+0x361/0x510
[ 136.546420] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[ 136.546429] [<c1362762>] ? error_code+0x5a/0x60
[ 136.546438] [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
[ 136.546446] [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
[ 136.546454] [<c12db180>] ? __rtnl_unlock+0x20/0x20
[ 136.546463] [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
[ 136.546471] [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
[ 136.546479] [<c12edafa>] netlink_unicast+0x23a/0x280
[ 136.546487] [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
[ 136.546497] [<c12bb828>] sock_sendmsg+0xc8/0x100
[ 136.546508] [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
[ 136.546517] [<c11d0602>] ? _copy_from_user+0x42/0x60
[ 136.546525] [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
[ 136.546534] [<c12bd805>] sys_sendmsg+0x1c5/0x200
[ 136.546542] [<c10c2150>] ? __do_fault+0x310/0x410
[ 136.546549] [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
[ 136.546557] [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
[ 136.546565] [<c12bd1af>] ? sys_getsockname+0x7f/0x90
[ 136.546574] [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
[ 136.546582] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[ 136.546589] [<c10233b3>] ? do_page_fault+0x173/0x3d0
[ 136.546596] [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
[ 136.546605] [<c12bdd83>] sys_socketcall+0x293/0x2d0
[ 136.546614] [<c13629d0>] sysenter_do_call+0x12/0x26
[ 136.546619] Code: Bad EIP value.
[ 136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
[ 136.546645] CR2: 0000000000000000
[ 136.546652] ---[ end trace 6909b560e78934fa ]---
----
And if you want the oops on your machine (uisng 2.6.39.1 or 3.0-rcN):
----
brctl addbr mybridge
ip link set mybridge mtu 1234
oops
----
Regards,
Alexander
^ permalink raw reply [flat|nested] 43+ messages in thread* Re: bridge/netfilter: regression in 2.6.39.1
2011-06-06 12:49 ` Alexander Holler
@ 2011-06-06 13:13 ` Neil Horman
-1 siblings, 0 replies; 43+ messages in thread
From: Neil Horman @ 2011-06-06 13:13 UTC (permalink / raw)
To: Alexander Holler
Cc: Eric Dumazet, linux-kernel, David Miller, Herbert Xu, netdev
On Mon, Jun 06, 2011 at 02:49:04PM +0200, Alexander Holler wrote:
> Am 06.06.2011 14:12, schrieb Eric Dumazet:
> >Le lundi 06 juin 2011 à 13:48 +0200, Alexander Holler a écrit :
> >>Am 06.06.2011 13:15, schrieb Neil Horman:
> >>>On Fri, Jun 03, 2011 at 09:21:06PM +0200, Alexander Holler wrote:
> >>>>Hello,
> >>>>
> >>>>I'm getting a oops in the bridge code in br_change_mtu() with
> >>>>2.6.39.1. The patch below seems to fix that.
> >>>>
> >>>>I'm not sure about the usage of dst_cow_metrics_generic() in
> >>>>fake_dst_ops, but after having a quick look at it seems to be ok to
> >>>>use that here.
> >>>>
> >>>>Regards,
> >>>>
> >>>>Alexander
> >>>>
> >>>How did the flags of the dst entry on which we're callnig dst_entry_write_ptr
> >>>wind up getting the READ_ONLY flag set on it? I don't see how we'er falling
> >>>into that clause in which we call cow_metrics when we call dst_metric_set. It
> >>>seems like that flag is set erroneously. perhaps we should just update
> >>>fake_rtable.dst to have the correct flags?
> >>>Neil
> >>
> >>It is set by that change:
> >>
> >>--------
> >>@@ -124,7 +128,7 @@ void br_netfilter_rtable_init(struct net_bridge *br)
> >> atomic_set(&rt->dst.__refcnt, 1);
> >> rt->dst.dev = br->dev;
> >> rt->dst.path =&rt->dst;
> >>- dst_metric_set(&rt->dst, RTAX_MTU, 1500);
> >>+ dst_init_metrics(&rt->dst, br_dst_default_metrics, true);
> >> rt->dst.flags = DST_NOXFRM;
> >> rt->dst.ops =&fake_dst_ops;
> >> }
> >>--------
> >>
> >>The true in dst_init_metrics() is responsible for that flag.
> >>
> >
> >You are aware this change fixed an oops ?
>
> No, I'm not aware of this. I know almost nothing about what all that
> stuff is doing. For me that change above just introduced an oops
> through an immediate NULL pointer dereference in br_change_mtu().
>
> >read_only in this context means : In case this must be written, we make
> >a COW first
> >(allocate a piece of memory, copy the source in it before applying any
> >change)
> >
> >It would be nice you send us the stack trace, so that we can have a clue
> >of whats going on.
>
> Here is the text as found in my first mail:
> ----
> Commit 42923465fb8d025a2b5153f2e7ab1e6e1058bf00 does here what it
> should prevent, it introduces NULL a dereference.
>
> The above commit uses dst_init_metrics() which sets the metrics as
> read only. As result br_change_mtu() dies in dst_metric_set()
> which calls dst_metrics_write_ptr() which calls
> dst->ops->cow_metrics() if the metrics are read only.
>
> I don't have a system with 2.6.39.1 or 3.0-rcN ready which writes an
> oops to somewhere else than the screen. Just set up a bridge and
> change the MTU for the IF, that will trigger the oops.
> ----
>
> Here is the oops:
>
> ----
> [ 136.546023] BUG: unable to handle kernel NULL pointer dereference
> at (null)
> [ 136.546038] IP: [< (null)>] (null)
> [ 136.546046] *pde = 00000000
> [ 136.546052] Oops: 0000 [#1] SMP
> [ 136.546059] last sysfs file: /sys/devices/virtual/net/mybridge/uevent
> [ 136.546066] Modules linked in: sit tunnel4 tun iwlagn sch_sfq
> bridge stp llc nfs lockd auth_rpcgss nfs_acl sunrpc ipt_LOG
> xt_recent xt_state iptable_filter iptable_nat nf_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_addrtype xt_dscp xt_iprange
> xt_DSCP xt_set ip_set nfnetlink ip6t_LOG xt_limit ip6table_filter
> xt_string xt_owner xt_multiport xt_hashlimit xt_conntrack xt_NFQUEUE
> xt_mark xt_connmark nf_conntrack ip6_tables snd_seq_oss
> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
> uinput ipv6 dm_mod nvidiafb fb_ddc i2c_algo_bit vgastate nvidia(P)
> arc4 uvcvideo videodev ecb snd_hda_codec_realtek iwl4965
> snd_hda_intel iwl_legacy snd_hda_codec uhci_hcd firewire_ohci
> mac80211 sdhci_pci snd_hwdep snd_pcm sdhci ehci_hcd intel_agp
> firewire_core cfg80211 snd_timer asus_laptop snd e1000e usbcore
> sparse_keymap rfkill tpm_infineon crc_itu_t intel_gtt mmc_core
> iTCO_wdt tpm_tis ata_piix agpgart tpm video soundcore sg tpm_bios
> joydev snd_page_alloc [last unloaded: microcode]
> [ 136.546235]
> [ 136.546243] Pid: 8415, comm: ip Tainted: P
> 2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn
> /V1Sn
> [ 136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
> [ 136.546268] EIP is at 0x0
> [ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
> [ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
> [ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80
> task.ti=f15c2000)
> [ 136.546297] Stack:
> [ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8
> f80d1b80 ffffffa1 f15c3bbc
> [ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000
> f80d1b80 ffffffa6 f15c3be4
> [ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc
> c11d6cae 00000000 00000000
> [ 136.546343] Call Trace:
> [ 136.546359] [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
> [ 136.546372] [<c12cb9c8>] dev_set_mtu+0x38/0x80
> [ 136.546381] [<c12da347>] do_setlink+0x1a7/0x860
> [ 136.546390] [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
> [ 136.546400] [<c11d6cae>] ? nla_parse+0x6e/0xb0
> [ 136.546409] [<c12db931>] rtnl_newlink+0x361/0x510
> [ 136.546420] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546429] [<c1362762>] ? error_code+0x5a/0x60
> [ 136.546438] [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
> [ 136.546446] [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
> [ 136.546454] [<c12db180>] ? __rtnl_unlock+0x20/0x20
> [ 136.546463] [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
> [ 136.546471] [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
> [ 136.546479] [<c12edafa>] netlink_unicast+0x23a/0x280
> [ 136.546487] [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
> [ 136.546497] [<c12bb828>] sock_sendmsg+0xc8/0x100
> [ 136.546508] [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
> [ 136.546517] [<c11d0602>] ? _copy_from_user+0x42/0x60
> [ 136.546525] [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
> [ 136.546534] [<c12bd805>] sys_sendmsg+0x1c5/0x200
> [ 136.546542] [<c10c2150>] ? __do_fault+0x310/0x410
> [ 136.546549] [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
> [ 136.546557] [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
> [ 136.546565] [<c12bd1af>] ? sys_getsockname+0x7f/0x90
> [ 136.546574] [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
> [ 136.546582] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546589] [<c10233b3>] ? do_page_fault+0x173/0x3d0
> [ 136.546596] [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
> [ 136.546605] [<c12bdd83>] sys_socketcall+0x293/0x2d0
> [ 136.546614] [<c13629d0>] sysenter_do_call+0x12/0x26
> [ 136.546619] Code: Bad EIP value.
> [ 136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
> [ 136.546645] CR2: 0000000000000000
> [ 136.546652] ---[ end trace 6909b560e78934fa ]---
> ----
>
> And if you want the oops on your machine (uisng 2.6.39.1 or 3.0-rcN):
>
> ----
> brctl addbr mybridge
> ip link set mybridge mtu 1234
> oops
> ----
>
Ok, this makes sense to me now, thanks. The change to the dst initalization to
mark our fake routing table as read only means we need a cow_metrics method to
copy the metrics before we we can compete the dst_metric_set in br_change_mtu.
Thats fine, but given that its really a fake routing table with only one dst
entry which (I think) is only written under the rtnl lock, why not just modify
the dst_init_metrics call so that its not marked as read-only?
Regards
Neil
> Regards,
>
> Alexander
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
@ 2011-06-06 13:13 ` Neil Horman
0 siblings, 0 replies; 43+ messages in thread
From: Neil Horman @ 2011-06-06 13:13 UTC (permalink / raw)
To: Alexander Holler
Cc: Eric Dumazet, linux-kernel, David Miller, Herbert Xu, netdev
On Mon, Jun 06, 2011 at 02:49:04PM +0200, Alexander Holler wrote:
> Am 06.06.2011 14:12, schrieb Eric Dumazet:
> >Le lundi 06 juin 2011 à 13:48 +0200, Alexander Holler a écrit :
> >>Am 06.06.2011 13:15, schrieb Neil Horman:
> >>>On Fri, Jun 03, 2011 at 09:21:06PM +0200, Alexander Holler wrote:
> >>>>Hello,
> >>>>
> >>>>I'm getting a oops in the bridge code in br_change_mtu() with
> >>>>2.6.39.1. The patch below seems to fix that.
> >>>>
> >>>>I'm not sure about the usage of dst_cow_metrics_generic() in
> >>>>fake_dst_ops, but after having a quick look at it seems to be ok to
> >>>>use that here.
> >>>>
> >>>>Regards,
> >>>>
> >>>>Alexander
> >>>>
> >>>How did the flags of the dst entry on which we're callnig dst_entry_write_ptr
> >>>wind up getting the READ_ONLY flag set on it? I don't see how we'er falling
> >>>into that clause in which we call cow_metrics when we call dst_metric_set. It
> >>>seems like that flag is set erroneously. perhaps we should just update
> >>>fake_rtable.dst to have the correct flags?
> >>>Neil
> >>
> >>It is set by that change:
> >>
> >>--------
> >>@@ -124,7 +128,7 @@ void br_netfilter_rtable_init(struct net_bridge *br)
> >> atomic_set(&rt->dst.__refcnt, 1);
> >> rt->dst.dev = br->dev;
> >> rt->dst.path =&rt->dst;
> >>- dst_metric_set(&rt->dst, RTAX_MTU, 1500);
> >>+ dst_init_metrics(&rt->dst, br_dst_default_metrics, true);
> >> rt->dst.flags = DST_NOXFRM;
> >> rt->dst.ops =&fake_dst_ops;
> >> }
> >>--------
> >>
> >>The true in dst_init_metrics() is responsible for that flag.
> >>
> >
> >You are aware this change fixed an oops ?
>
> No, I'm not aware of this. I know almost nothing about what all that
> stuff is doing. For me that change above just introduced an oops
> through an immediate NULL pointer dereference in br_change_mtu().
>
> >read_only in this context means : In case this must be written, we make
> >a COW first
> >(allocate a piece of memory, copy the source in it before applying any
> >change)
> >
> >It would be nice you send us the stack trace, so that we can have a clue
> >of whats going on.
>
> Here is the text as found in my first mail:
> ----
> Commit 42923465fb8d025a2b5153f2e7ab1e6e1058bf00 does here what it
> should prevent, it introduces NULL a dereference.
>
> The above commit uses dst_init_metrics() which sets the metrics as
> read only. As result br_change_mtu() dies in dst_metric_set()
> which calls dst_metrics_write_ptr() which calls
> dst->ops->cow_metrics() if the metrics are read only.
>
> I don't have a system with 2.6.39.1 or 3.0-rcN ready which writes an
> oops to somewhere else than the screen. Just set up a bridge and
> change the MTU for the IF, that will trigger the oops.
> ----
>
> Here is the oops:
>
> ----
> [ 136.546023] BUG: unable to handle kernel NULL pointer dereference
> at (null)
> [ 136.546038] IP: [< (null)>] (null)
> [ 136.546046] *pde = 00000000
> [ 136.546052] Oops: 0000 [#1] SMP
> [ 136.546059] last sysfs file: /sys/devices/virtual/net/mybridge/uevent
> [ 136.546066] Modules linked in: sit tunnel4 tun iwlagn sch_sfq
> bridge stp llc nfs lockd auth_rpcgss nfs_acl sunrpc ipt_LOG
> xt_recent xt_state iptable_filter iptable_nat nf_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_addrtype xt_dscp xt_iprange
> xt_DSCP xt_set ip_set nfnetlink ip6t_LOG xt_limit ip6table_filter
> xt_string xt_owner xt_multiport xt_hashlimit xt_conntrack xt_NFQUEUE
> xt_mark xt_connmark nf_conntrack ip6_tables snd_seq_oss
> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
> uinput ipv6 dm_mod nvidiafb fb_ddc i2c_algo_bit vgastate nvidia(P)
> arc4 uvcvideo videodev ecb snd_hda_codec_realtek iwl4965
> snd_hda_intel iwl_legacy snd_hda_codec uhci_hcd firewire_ohci
> mac80211 sdhci_pci snd_hwdep snd_pcm sdhci ehci_hcd intel_agp
> firewire_core cfg80211 snd_timer asus_laptop snd e1000e usbcore
> sparse_keymap rfkill tpm_infineon crc_itu_t intel_gtt mmc_core
> iTCO_wdt tpm_tis ata_piix agpgart tpm video soundcore sg tpm_bios
> joydev snd_page_alloc [last unloaded: microcode]
> [ 136.546235]
> [ 136.546243] Pid: 8415, comm: ip Tainted: P
> 2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn
> /V1Sn
> [ 136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
> [ 136.546268] EIP is at 0x0
> [ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
> [ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
> [ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80
> task.ti=f15c2000)
> [ 136.546297] Stack:
> [ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8
> f80d1b80 ffffffa1 f15c3bbc
> [ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000
> f80d1b80 ffffffa6 f15c3be4
> [ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc
> c11d6cae 00000000 00000000
> [ 136.546343] Call Trace:
> [ 136.546359] [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
> [ 136.546372] [<c12cb9c8>] dev_set_mtu+0x38/0x80
> [ 136.546381] [<c12da347>] do_setlink+0x1a7/0x860
> [ 136.546390] [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
> [ 136.546400] [<c11d6cae>] ? nla_parse+0x6e/0xb0
> [ 136.546409] [<c12db931>] rtnl_newlink+0x361/0x510
> [ 136.546420] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546429] [<c1362762>] ? error_code+0x5a/0x60
> [ 136.546438] [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
> [ 136.546446] [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
> [ 136.546454] [<c12db180>] ? __rtnl_unlock+0x20/0x20
> [ 136.546463] [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
> [ 136.546471] [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
> [ 136.546479] [<c12edafa>] netlink_unicast+0x23a/0x280
> [ 136.546487] [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
> [ 136.546497] [<c12bb828>] sock_sendmsg+0xc8/0x100
> [ 136.546508] [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
> [ 136.546517] [<c11d0602>] ? _copy_from_user+0x42/0x60
> [ 136.546525] [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
> [ 136.546534] [<c12bd805>] sys_sendmsg+0x1c5/0x200
> [ 136.546542] [<c10c2150>] ? __do_fault+0x310/0x410
> [ 136.546549] [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
> [ 136.546557] [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
> [ 136.546565] [<c12bd1af>] ? sys_getsockname+0x7f/0x90
> [ 136.546574] [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
> [ 136.546582] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546589] [<c10233b3>] ? do_page_fault+0x173/0x3d0
> [ 136.546596] [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
> [ 136.546605] [<c12bdd83>] sys_socketcall+0x293/0x2d0
> [ 136.546614] [<c13629d0>] sysenter_do_call+0x12/0x26
> [ 136.546619] Code: Bad EIP value.
> [ 136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
> [ 136.546645] CR2: 0000000000000000
> [ 136.546652] ---[ end trace 6909b560e78934fa ]---
> ----
>
> And if you want the oops on your machine (uisng 2.6.39.1 or 3.0-rcN):
>
> ----
> brctl addbr mybridge
> ip link set mybridge mtu 1234
> oops
> ----
>
Ok, this makes sense to me now, thanks. The change to the dst initalization to
mark our fake routing table as read only means we need a cow_metrics method to
copy the metrics before we we can compete the dst_metric_set in br_change_mtu.
Thats fine, but given that its really a fake routing table with only one dst
entry which (I think) is only written under the rtnl lock, why not just modify
the dst_init_metrics call so that its not marked as read-only?
Regards
Neil
> Regards,
>
> Alexander
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
2011-06-06 13:13 ` Neil Horman
@ 2011-06-06 13:18 ` Eric Dumazet
-1 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2011-06-06 13:18 UTC (permalink / raw)
To: Neil Horman
Cc: Alexander Holler, linux-kernel, David Miller, Herbert Xu, netdev
Le lundi 06 juin 2011 à 09:13 -0400, Neil Horman a écrit :
> Ok, this makes sense to me now, thanks. The change to the dst initalization to
> mark our fake routing table as read only means we need a cow_metrics method to
> copy the metrics before we we can compete the dst_metric_set in br_change_mtu.
> Thats fine, but given that its really a fake routing table with only one dst
> entry which (I think) is only written under the rtnl lock, why not just modify
> the dst_init_metrics call so that its not marked as read-only?
It _is_ read-only, its even a "const", so if you do that you'll trap if
kernel const pages are RO
Please check commit 0972ddb237 for a proper way to deal with this.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
@ 2011-06-06 13:18 ` Eric Dumazet
0 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2011-06-06 13:18 UTC (permalink / raw)
To: Neil Horman
Cc: Alexander Holler, linux-kernel, David Miller, Herbert Xu, netdev
Le lundi 06 juin 2011 à 09:13 -0400, Neil Horman a écrit :
> Ok, this makes sense to me now, thanks. The change to the dst initalization to
> mark our fake routing table as read only means we need a cow_metrics method to
> copy the metrics before we we can compete the dst_metric_set in br_change_mtu.
> Thats fine, but given that its really a fake routing table with only one dst
> entry which (I think) is only written under the rtnl lock, why not just modify
> the dst_init_metrics call so that its not marked as read-only?
It _is_ read-only, its even a "const", so if you do that you'll trap if
kernel const pages are RO
Please check commit 0972ddb237 for a proper way to deal with this.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
2011-06-06 12:49 ` Alexander Holler
@ 2011-06-06 13:16 ` Eric Dumazet
-1 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2011-06-06 13:16 UTC (permalink / raw)
To: Alexander Holler
Cc: Neil Horman, linux-kernel, David Miller, Herbert Xu, netdev
Le lundi 06 juin 2011 à 14:49 +0200, Alexander Holler a écrit :
> Am 06.06.2011 14:12, schrieb Eric Dumazet:
> > Le lundi 06 juin 2011 à 13:48 +0200, Alexander Holler a écrit :
> >> Am 06.06.2011 13:15, schrieb Neil Horman:
> >>> On Fri, Jun 03, 2011 at 09:21:06PM +0200, Alexander Holler wrote:
> >>>> Hello,
> >>>>
> >>>> I'm getting a oops in the bridge code in br_change_mtu() with
> >>>> 2.6.39.1. The patch below seems to fix that.
> >>>>
> >>>> I'm not sure about the usage of dst_cow_metrics_generic() in
> >>>> fake_dst_ops, but after having a quick look at it seems to be ok to
> >>>> use that here.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Alexander
> >>>>
> >>> How did the flags of the dst entry on which we're callnig dst_entry_write_ptr
> >>> wind up getting the READ_ONLY flag set on it? I don't see how we'er falling
> >>> into that clause in which we call cow_metrics when we call dst_metric_set. It
> >>> seems like that flag is set erroneously. perhaps we should just update
> >>> fake_rtable.dst to have the correct flags?
> >>> Neil
> >>
> >> It is set by that change:
> >>
> >> --------
> >> @@ -124,7 +128,7 @@ void br_netfilter_rtable_init(struct net_bridge *br)
> >> atomic_set(&rt->dst.__refcnt, 1);
> >> rt->dst.dev = br->dev;
> >> rt->dst.path =&rt->dst;
> >> - dst_metric_set(&rt->dst, RTAX_MTU, 1500);
> >> + dst_init_metrics(&rt->dst, br_dst_default_metrics, true);
> >> rt->dst.flags = DST_NOXFRM;
> >> rt->dst.ops =&fake_dst_ops;
> >> }
> >> --------
> >>
> >> The true in dst_init_metrics() is responsible for that flag.
> >>
> >
> > You are aware this change fixed an oops ?
>
> No, I'm not aware of this. I know almost nothing about what all that
> stuff is doing. For me that change above just introduced an oops through
> an immediate NULL pointer dereference in br_change_mtu().
>
> > read_only in this context means : In case this must be written, we make
> > a COW first
> > (allocate a piece of memory, copy the source in it before applying any
> > change)
> >
> > It would be nice you send us the stack trace, so that we can have a clue
> > of whats going on.
>
> Here is the text as found in my first mail:
> ----
> Commit 42923465fb8d025a2b5153f2e7ab1e6e1058bf00 does here what it
> should prevent, it introduces NULL a dereference.
>
Hmm, this commit uncovers a previous bug, introduced in commit
62fa8a846d7d.
> The above commit uses dst_init_metrics() which sets the metrics as
> read only. As result br_change_mtu() dies in dst_metric_set()
> which calls dst_metrics_write_ptr() which calls
> dst->ops->cow_metrics() if the metrics are read only.
>
> I don't have a system with 2.6.39.1 or 3.0-rcN ready which writes an
> oops to somewhere else than the screen. Just set up a bridge and change
> the MTU for the IF, that will trigger the oops.
> ----
>
> Here is the oops:
>
> ----
> [ 136.546023] BUG: unable to handle kernel NULL pointer dereference at
> (null)
> [ 136.546038] IP: [< (null)>] (null)
> [ 136.546046] *pde = 00000000
> [ 136.546052] Oops: 0000 [#1] SMP
> [ 136.546059] last sysfs file: /sys/devices/virtual/net/mybridge/uevent
> [ 136.546066] Modules linked in: sit tunnel4 tun iwlagn sch_sfq bridge
> stp llc nfs lockd auth_rpcgss nfs_acl sunrpc ipt_LOG xt_recent xt_state
> iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
> xt_addrtype xt_dscp xt_iprange xt_DSCP xt_set ip_set nfnetlink ip6t_LOG
> xt_limit ip6table_filter xt_string xt_owner xt_multiport xt_hashlimit
> xt_conntrack xt_NFQUEUE xt_mark xt_connmark nf_conntrack ip6_tables
> snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss
> snd_mixer_oss uinput ipv6 dm_mod nvidiafb fb_ddc i2c_algo_bit vgastate
> nvidia(P) arc4 uvcvideo videodev ecb snd_hda_codec_realtek iwl4965
> snd_hda_intel iwl_legacy snd_hda_codec uhci_hcd firewire_ohci mac80211
> sdhci_pci snd_hwdep snd_pcm sdhci ehci_hcd intel_agp firewire_core
> cfg80211 snd_timer asus_laptop snd e1000e usbcore sparse_keymap rfkill
> tpm_infineon crc_itu_t intel_gtt mmc_core iTCO_wdt tpm_tis ata_piix
> agpgart tpm video soundcore sg tpm_bios joydev snd_page_alloc [last
> unloaded: microcode]
> [ 136.546235]
> [ 136.546243] Pid: 8415, comm: ip Tainted: P
> 2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn
> /V1Sn
> [ 136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
> [ 136.546268] EIP is at 0x0
> [ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
> [ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
> [ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80
> task.ti=f15c2000)
> [ 136.546297] Stack:
> [ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80
> ffffffa1 f15c3bbc
> [ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80
> ffffffa6 f15c3be4
> [ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae
> 00000000 00000000
> [ 136.546343] Call Trace:
> [ 136.546359] [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
> [ 136.546372] [<c12cb9c8>] dev_set_mtu+0x38/0x80
> [ 136.546381] [<c12da347>] do_setlink+0x1a7/0x860
> [ 136.546390] [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
> [ 136.546400] [<c11d6cae>] ? nla_parse+0x6e/0xb0
> [ 136.546409] [<c12db931>] rtnl_newlink+0x361/0x510
> [ 136.546420] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546429] [<c1362762>] ? error_code+0x5a/0x60
> [ 136.546438] [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
> [ 136.546446] [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
> [ 136.546454] [<c12db180>] ? __rtnl_unlock+0x20/0x20
> [ 136.546463] [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
> [ 136.546471] [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
> [ 136.546479] [<c12edafa>] netlink_unicast+0x23a/0x280
> [ 136.546487] [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
> [ 136.546497] [<c12bb828>] sock_sendmsg+0xc8/0x100
> [ 136.546508] [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
> [ 136.546517] [<c11d0602>] ? _copy_from_user+0x42/0x60
> [ 136.546525] [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
> [ 136.546534] [<c12bd805>] sys_sendmsg+0x1c5/0x200
> [ 136.546542] [<c10c2150>] ? __do_fault+0x310/0x410
> [ 136.546549] [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
> [ 136.546557] [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
> [ 136.546565] [<c12bd1af>] ? sys_getsockname+0x7f/0x90
> [ 136.546574] [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
> [ 136.546582] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546589] [<c10233b3>] ? do_page_fault+0x173/0x3d0
> [ 136.546596] [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
> [ 136.546605] [<c12bdd83>] sys_socketcall+0x293/0x2d0
> [ 136.546614] [<c13629d0>] sysenter_do_call+0x12/0x26
> [ 136.546619] Code: Bad EIP value.
> [ 136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
> [ 136.546645] CR2: 0000000000000000
> [ 136.546652] ---[ end trace 6909b560e78934fa ]---
> ----
>
> And if you want the oops on your machine (uisng 2.6.39.1 or 3.0-rcN):
>
> ----
> brctl addbr mybridge
> ip link set mybridge mtu 1234
> oops
> ----
Nice, now please submit a patch with 0972ddb237 as a guideline.
BTW, you could also check other struct dst_ops methods for
bridge/netfilter:
- What about ADVMSS or things like that, see commit 214f45c9
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
@ 2011-06-06 13:16 ` Eric Dumazet
0 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2011-06-06 13:16 UTC (permalink / raw)
To: Alexander Holler
Cc: Neil Horman, linux-kernel, David Miller, Herbert Xu, netdev
Le lundi 06 juin 2011 à 14:49 +0200, Alexander Holler a écrit :
> Am 06.06.2011 14:12, schrieb Eric Dumazet:
> > Le lundi 06 juin 2011 à 13:48 +0200, Alexander Holler a écrit :
> >> Am 06.06.2011 13:15, schrieb Neil Horman:
> >>> On Fri, Jun 03, 2011 at 09:21:06PM +0200, Alexander Holler wrote:
> >>>> Hello,
> >>>>
> >>>> I'm getting a oops in the bridge code in br_change_mtu() with
> >>>> 2.6.39.1. The patch below seems to fix that.
> >>>>
> >>>> I'm not sure about the usage of dst_cow_metrics_generic() in
> >>>> fake_dst_ops, but after having a quick look at it seems to be ok to
> >>>> use that here.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Alexander
> >>>>
> >>> How did the flags of the dst entry on which we're callnig dst_entry_write_ptr
> >>> wind up getting the READ_ONLY flag set on it? I don't see how we'er falling
> >>> into that clause in which we call cow_metrics when we call dst_metric_set. It
> >>> seems like that flag is set erroneously. perhaps we should just update
> >>> fake_rtable.dst to have the correct flags?
> >>> Neil
> >>
> >> It is set by that change:
> >>
> >> --------
> >> @@ -124,7 +128,7 @@ void br_netfilter_rtable_init(struct net_bridge *br)
> >> atomic_set(&rt->dst.__refcnt, 1);
> >> rt->dst.dev = br->dev;
> >> rt->dst.path =&rt->dst;
> >> - dst_metric_set(&rt->dst, RTAX_MTU, 1500);
> >> + dst_init_metrics(&rt->dst, br_dst_default_metrics, true);
> >> rt->dst.flags = DST_NOXFRM;
> >> rt->dst.ops =&fake_dst_ops;
> >> }
> >> --------
> >>
> >> The true in dst_init_metrics() is responsible for that flag.
> >>
> >
> > You are aware this change fixed an oops ?
>
> No, I'm not aware of this. I know almost nothing about what all that
> stuff is doing. For me that change above just introduced an oops through
> an immediate NULL pointer dereference in br_change_mtu().
>
> > read_only in this context means : In case this must be written, we make
> > a COW first
> > (allocate a piece of memory, copy the source in it before applying any
> > change)
> >
> > It would be nice you send us the stack trace, so that we can have a clue
> > of whats going on.
>
> Here is the text as found in my first mail:
> ----
> Commit 42923465fb8d025a2b5153f2e7ab1e6e1058bf00 does here what it
> should prevent, it introduces NULL a dereference.
>
Hmm, this commit uncovers a previous bug, introduced in commit
62fa8a846d7d.
> The above commit uses dst_init_metrics() which sets the metrics as
> read only. As result br_change_mtu() dies in dst_metric_set()
> which calls dst_metrics_write_ptr() which calls
> dst->ops->cow_metrics() if the metrics are read only.
>
> I don't have a system with 2.6.39.1 or 3.0-rcN ready which writes an
> oops to somewhere else than the screen. Just set up a bridge and change
> the MTU for the IF, that will trigger the oops.
> ----
>
> Here is the oops:
>
> ----
> [ 136.546023] BUG: unable to handle kernel NULL pointer dereference at
> (null)
> [ 136.546038] IP: [< (null)>] (null)
> [ 136.546046] *pde = 00000000
> [ 136.546052] Oops: 0000 [#1] SMP
> [ 136.546059] last sysfs file: /sys/devices/virtual/net/mybridge/uevent
> [ 136.546066] Modules linked in: sit tunnel4 tun iwlagn sch_sfq bridge
> stp llc nfs lockd auth_rpcgss nfs_acl sunrpc ipt_LOG xt_recent xt_state
> iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
> xt_addrtype xt_dscp xt_iprange xt_DSCP xt_set ip_set nfnetlink ip6t_LOG
> xt_limit ip6table_filter xt_string xt_owner xt_multiport xt_hashlimit
> xt_conntrack xt_NFQUEUE xt_mark xt_connmark nf_conntrack ip6_tables
> snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss
> snd_mixer_oss uinput ipv6 dm_mod nvidiafb fb_ddc i2c_algo_bit vgastate
> nvidia(P) arc4 uvcvideo videodev ecb snd_hda_codec_realtek iwl4965
> snd_hda_intel iwl_legacy snd_hda_codec uhci_hcd firewire_ohci mac80211
> sdhci_pci snd_hwdep snd_pcm sdhci ehci_hcd intel_agp firewire_core
> cfg80211 snd_timer asus_laptop snd e1000e usbcore sparse_keymap rfkill
> tpm_infineon crc_itu_t intel_gtt mmc_core iTCO_wdt tpm_tis ata_piix
> agpgart tpm video soundcore sg tpm_bios joydev snd_page_alloc [last
> unloaded: microcode]
> [ 136.546235]
> [ 136.546243] Pid: 8415, comm: ip Tainted: P
> 2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn
> /V1Sn
> [ 136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
> [ 136.546268] EIP is at 0x0
> [ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
> [ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
> [ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80
> task.ti=f15c2000)
> [ 136.546297] Stack:
> [ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80
> ffffffa1 f15c3bbc
> [ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80
> ffffffa6 f15c3be4
> [ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae
> 00000000 00000000
> [ 136.546343] Call Trace:
> [ 136.546359] [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
> [ 136.546372] [<c12cb9c8>] dev_set_mtu+0x38/0x80
> [ 136.546381] [<c12da347>] do_setlink+0x1a7/0x860
> [ 136.546390] [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
> [ 136.546400] [<c11d6cae>] ? nla_parse+0x6e/0xb0
> [ 136.546409] [<c12db931>] rtnl_newlink+0x361/0x510
> [ 136.546420] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546429] [<c1362762>] ? error_code+0x5a/0x60
> [ 136.546438] [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
> [ 136.546446] [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
> [ 136.546454] [<c12db180>] ? __rtnl_unlock+0x20/0x20
> [ 136.546463] [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
> [ 136.546471] [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
> [ 136.546479] [<c12edafa>] netlink_unicast+0x23a/0x280
> [ 136.546487] [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
> [ 136.546497] [<c12bb828>] sock_sendmsg+0xc8/0x100
> [ 136.546508] [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
> [ 136.546517] [<c11d0602>] ? _copy_from_user+0x42/0x60
> [ 136.546525] [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
> [ 136.546534] [<c12bd805>] sys_sendmsg+0x1c5/0x200
> [ 136.546542] [<c10c2150>] ? __do_fault+0x310/0x410
> [ 136.546549] [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
> [ 136.546557] [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
> [ 136.546565] [<c12bd1af>] ? sys_getsockname+0x7f/0x90
> [ 136.546574] [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
> [ 136.546582] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546589] [<c10233b3>] ? do_page_fault+0x173/0x3d0
> [ 136.546596] [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
> [ 136.546605] [<c12bdd83>] sys_socketcall+0x293/0x2d0
> [ 136.546614] [<c13629d0>] sysenter_do_call+0x12/0x26
> [ 136.546619] Code: Bad EIP value.
> [ 136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
> [ 136.546645] CR2: 0000000000000000
> [ 136.546652] ---[ end trace 6909b560e78934fa ]---
> ----
>
> And if you want the oops on your machine (uisng 2.6.39.1 or 3.0-rcN):
>
> ----
> brctl addbr mybridge
> ip link set mybridge mtu 1234
> oops
> ----
Nice, now please submit a patch with 0972ddb237 as a guideline.
BTW, you could also check other struct dst_ops methods for
bridge/netfilter:
- What about ADVMSS or things like that, see commit 214f45c9
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
2011-06-06 13:16 ` Eric Dumazet
@ 2011-06-06 13:29 ` Alexander Holler
-1 siblings, 0 replies; 43+ messages in thread
From: Alexander Holler @ 2011-06-06 13:29 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Neil Horman, linux-kernel, David Miller, Herbert Xu, netdev
> Nice, now please submit a patch with 0972ddb237 as a guideline.
>
> BTW, you could also check other struct dst_ops methods for
> bridge/netfilter:
Sorry, but I prefer to submit patches I understand by myself and for
stuff I know something about. The patch in my first mail was just meant
as a quick fix (which seemed to work here).
So even if I might be able to construct a working patch using commit
0972ddb237, I don't think I should do that.
Regards,
Alexander
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
@ 2011-06-06 13:29 ` Alexander Holler
0 siblings, 0 replies; 43+ messages in thread
From: Alexander Holler @ 2011-06-06 13:29 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Neil Horman, linux-kernel, David Miller, Herbert Xu, netdev
> Nice, now please submit a patch with 0972ddb237 as a guideline.
>
> BTW, you could also check other struct dst_ops methods for
> bridge/netfilter:
Sorry, but I prefer to submit patches I understand by myself and for
stuff I know something about. The patch in my first mail was just meant
as a quick fix (which seemed to work here).
So even if I might be able to construct a working patch using commit
0972ddb237, I don't think I should do that.
Regards,
Alexander
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
2011-06-06 13:29 ` Alexander Holler
@ 2011-06-06 14:26 ` Eric Dumazet
-1 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2011-06-06 14:26 UTC (permalink / raw)
To: Alexander Holler
Cc: Neil Horman, linux-kernel, David Miller, Herbert Xu, netdev
From: Alexander Holler <holler@ahsoftware.de>
Le lundi 06 juin 2011 à 15:29 +0200, Alexander Holler a écrit :
> > Nice, now please submit a patch with 0972ddb237 as a guideline.
> >
> > BTW, you could also check other struct dst_ops methods for
> > bridge/netfilter:
>
> Sorry, but I prefer to submit patches I understand by myself and for
> stuff I know something about. The patch in my first mail was just meant
> as a quick fix (which seemed to work here).
>
> So even if I might be able to construct a working patch using commit
> 0972ddb237, I don't think I should do that.
OK, I'll do it for you then, I am surprised you dont understand my
review / suggestion.
One patch submitter is supposed to followup and send a new version to
take into account reviews/comments, not wait that eventually everybody
says "OK, lets take it as is"
[PATCH] bridge: provide a cow_metrics method for fake_ops
Like in commit 0972ddb237 (provide cow_metrics() methods to blackhole
dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as
well.
This fixes a regression coming from commits 62fa8a846d7d (net: Implement
read-only protection and COW'ing of metrics.) and 33eb9873a28 (bridge:
initialize fake_rtable metrics)
ip link set mybridge mtu 1234
-->
[ 136.546243] Pid: 8415, comm: ip Tainted: P
2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn
/V1Sn
[ 136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
[ 136.546268] EIP is at 0x0
[ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
[ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
[ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80
task.ti=f15c2000)
[ 136.546297] Stack:
[ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80
ffffffa1 f15c3bbc
[ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80
ffffffa6 f15c3be4
[ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae
00000000 00000000
[ 136.546343] Call Trace:
[ 136.546359] [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
[ 136.546372] [<c12cb9c8>] dev_set_mtu+0x38/0x80
[ 136.546381] [<c12da347>] do_setlink+0x1a7/0x860
[ 136.546390] [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
[ 136.546400] [<c11d6cae>] ? nla_parse+0x6e/0xb0
[ 136.546409] [<c12db931>] rtnl_newlink+0x361/0x510
[ 136.546420] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[ 136.546429] [<c1362762>] ? error_code+0x5a/0x60
[ 136.546438] [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
[ 136.546446] [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
[ 136.546454] [<c12db180>] ? __rtnl_unlock+0x20/0x20
[ 136.546463] [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
[ 136.546471] [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
[ 136.546479] [<c12edafa>] netlink_unicast+0x23a/0x280
[ 136.546487] [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
[ 136.546497] [<c12bb828>] sock_sendmsg+0xc8/0x100
[ 136.546508] [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
[ 136.546517] [<c11d0602>] ? _copy_from_user+0x42/0x60
[ 136.546525] [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
[ 136.546534] [<c12bd805>] sys_sendmsg+0x1c5/0x200
[ 136.546542] [<c10c2150>] ? __do_fault+0x310/0x410
[ 136.546549] [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
[ 136.546557] [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
[ 136.546565] [<c12bd1af>] ? sys_getsockname+0x7f/0x90
[ 136.546574] [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
[ 136.546582] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[ 136.546589] [<c10233b3>] ? do_page_fault+0x173/0x3d0
[ 136.546596] [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
[ 136.546605] [<c12bdd83>] sys_socketcall+0x293/0x2d0
[ 136.546614] [<c13629d0>] sysenter_do_call+0x12/0x26
[ 136.546619] Code: Bad EIP value.
[ 136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
[ 136.546645] CR2: 0000000000000000
[ 136.546652] ---[ end trace 6909b560e78934fa ]---
Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Neil Horman <nhorman@tuxdriver.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
---
net/bridge/br_netfilter.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 3fa1231..23b43d2 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -104,10 +104,16 @@ static void fake_update_pmtu(struct dst_entry *dst, u32 mtu)
{
}
+static u32 *fake_cow_metrics(struct dst_entry *dst, unsigned long old)
+{
+ return NULL;
+}
+
static struct dst_ops fake_dst_ops = {
.family = AF_INET,
.protocol = cpu_to_be16(ETH_P_IP),
.update_pmtu = fake_update_pmtu,
+ .cow_metrics = fake_cow_metrics,
};
/*
^ permalink raw reply related [flat|nested] 43+ messages in thread* Re: bridge/netfilter: regression in 2.6.39.1
@ 2011-06-06 14:26 ` Eric Dumazet
0 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2011-06-06 14:26 UTC (permalink / raw)
To: Alexander Holler
Cc: Neil Horman, linux-kernel, David Miller, Herbert Xu, netdev
From: Alexander Holler <holler@ahsoftware.de>
Le lundi 06 juin 2011 à 15:29 +0200, Alexander Holler a écrit :
> > Nice, now please submit a patch with 0972ddb237 as a guideline.
> >
> > BTW, you could also check other struct dst_ops methods for
> > bridge/netfilter:
>
> Sorry, but I prefer to submit patches I understand by myself and for
> stuff I know something about. The patch in my first mail was just meant
> as a quick fix (which seemed to work here).
>
> So even if I might be able to construct a working patch using commit
> 0972ddb237, I don't think I should do that.
OK, I'll do it for you then, I am surprised you dont understand my
review / suggestion.
One patch submitter is supposed to followup and send a new version to
take into account reviews/comments, not wait that eventually everybody
says "OK, lets take it as is"
[PATCH] bridge: provide a cow_metrics method for fake_ops
Like in commit 0972ddb237 (provide cow_metrics() methods to blackhole
dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as
well.
This fixes a regression coming from commits 62fa8a846d7d (net: Implement
read-only protection and COW'ing of metrics.) and 33eb9873a28 (bridge:
initialize fake_rtable metrics)
ip link set mybridge mtu 1234
-->
[ 136.546243] Pid: 8415, comm: ip Tainted: P
2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn
/V1Sn
[ 136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
[ 136.546268] EIP is at 0x0
[ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
[ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
[ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80
task.ti=f15c2000)
[ 136.546297] Stack:
[ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80
ffffffa1 f15c3bbc
[ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80
ffffffa6 f15c3be4
[ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae
00000000 00000000
[ 136.546343] Call Trace:
[ 136.546359] [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
[ 136.546372] [<c12cb9c8>] dev_set_mtu+0x38/0x80
[ 136.546381] [<c12da347>] do_setlink+0x1a7/0x860
[ 136.546390] [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
[ 136.546400] [<c11d6cae>] ? nla_parse+0x6e/0xb0
[ 136.546409] [<c12db931>] rtnl_newlink+0x361/0x510
[ 136.546420] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[ 136.546429] [<c1362762>] ? error_code+0x5a/0x60
[ 136.546438] [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
[ 136.546446] [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
[ 136.546454] [<c12db180>] ? __rtnl_unlock+0x20/0x20
[ 136.546463] [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
[ 136.546471] [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
[ 136.546479] [<c12edafa>] netlink_unicast+0x23a/0x280
[ 136.546487] [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
[ 136.546497] [<c12bb828>] sock_sendmsg+0xc8/0x100
[ 136.546508] [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
[ 136.546517] [<c11d0602>] ? _copy_from_user+0x42/0x60
[ 136.546525] [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
[ 136.546534] [<c12bd805>] sys_sendmsg+0x1c5/0x200
[ 136.546542] [<c10c2150>] ? __do_fault+0x310/0x410
[ 136.546549] [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
[ 136.546557] [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
[ 136.546565] [<c12bd1af>] ? sys_getsockname+0x7f/0x90
[ 136.546574] [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
[ 136.546582] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[ 136.546589] [<c10233b3>] ? do_page_fault+0x173/0x3d0
[ 136.546596] [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
[ 136.546605] [<c12bdd83>] sys_socketcall+0x293/0x2d0
[ 136.546614] [<c13629d0>] sysenter_do_call+0x12/0x26
[ 136.546619] Code: Bad EIP value.
[ 136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
[ 136.546645] CR2: 0000000000000000
[ 136.546652] ---[ end trace 6909b560e78934fa ]---
Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Neil Horman <nhorman@tuxdriver.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
---
net/bridge/br_netfilter.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 3fa1231..23b43d2 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -104,10 +104,16 @@ static void fake_update_pmtu(struct dst_entry *dst, u32 mtu)
{
}
+static u32 *fake_cow_metrics(struct dst_entry *dst, unsigned long old)
+{
+ return NULL;
+}
+
static struct dst_ops fake_dst_ops = {
.family = AF_INET,
.protocol = cpu_to_be16(ETH_P_IP),
.update_pmtu = fake_update_pmtu,
+ .cow_metrics = fake_cow_metrics,
};
/*
^ permalink raw reply related [flat|nested] 43+ messages in thread* Re: bridge/netfilter: regression in 2.6.39.1
2011-06-06 14:26 ` Eric Dumazet
@ 2011-06-06 15:32 ` Neil Horman
-1 siblings, 0 replies; 43+ messages in thread
From: Neil Horman @ 2011-06-06 15:32 UTC (permalink / raw)
To: Eric Dumazet
Cc: Alexander Holler, linux-kernel, David Miller, Herbert Xu, netdev
On Mon, Jun 06, 2011 at 04:26:03PM +0200, Eric Dumazet wrote:
> From: Alexander Holler <holler@ahsoftware.de>
>
> Le lundi 06 juin 2011 à 15:29 +0200, Alexander Holler a écrit :
> > > Nice, now please submit a patch with 0972ddb237 as a guideline.
> > >
> > > BTW, you could also check other struct dst_ops methods for
> > > bridge/netfilter:
> >
> > Sorry, but I prefer to submit patches I understand by myself and for
> > stuff I know something about. The patch in my first mail was just meant
> > as a quick fix (which seemed to work here).
> >
> > So even if I might be able to construct a working patch using commit
> > 0972ddb237, I don't think I should do that.
>
> OK, I'll do it for you then, I am surprised you dont understand my
> review / suggestion.
>
> One patch submitter is supposed to followup and send a new version to
> take into account reviews/comments, not wait that eventually everybody
> says "OK, lets take it as is"
>
>
> [PATCH] bridge: provide a cow_metrics method for fake_ops
>
> Like in commit 0972ddb237 (provide cow_metrics() methods to blackhole
> dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as
> well.
>
> This fixes a regression coming from commits 62fa8a846d7d (net: Implement
> read-only protection and COW'ing of metrics.) and 33eb9873a28 (bridge:
> initialize fake_rtable metrics)
>
> ip link set mybridge mtu 1234
> -->
> [ 136.546243] Pid: 8415, comm: ip Tainted: P
> 2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn
> /V1Sn
> [ 136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
> [ 136.546268] EIP is at 0x0
> [ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
> [ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
> [ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80
> task.ti=f15c2000)
> [ 136.546297] Stack:
> [ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80
> ffffffa1 f15c3bbc
> [ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80
> ffffffa6 f15c3be4
> [ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae
> 00000000 00000000
> [ 136.546343] Call Trace:
> [ 136.546359] [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
> [ 136.546372] [<c12cb9c8>] dev_set_mtu+0x38/0x80
> [ 136.546381] [<c12da347>] do_setlink+0x1a7/0x860
> [ 136.546390] [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
> [ 136.546400] [<c11d6cae>] ? nla_parse+0x6e/0xb0
> [ 136.546409] [<c12db931>] rtnl_newlink+0x361/0x510
> [ 136.546420] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546429] [<c1362762>] ? error_code+0x5a/0x60
> [ 136.546438] [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
> [ 136.546446] [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
> [ 136.546454] [<c12db180>] ? __rtnl_unlock+0x20/0x20
> [ 136.546463] [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
> [ 136.546471] [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
> [ 136.546479] [<c12edafa>] netlink_unicast+0x23a/0x280
> [ 136.546487] [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
> [ 136.546497] [<c12bb828>] sock_sendmsg+0xc8/0x100
> [ 136.546508] [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
> [ 136.546517] [<c11d0602>] ? _copy_from_user+0x42/0x60
> [ 136.546525] [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
> [ 136.546534] [<c12bd805>] sys_sendmsg+0x1c5/0x200
> [ 136.546542] [<c10c2150>] ? __do_fault+0x310/0x410
> [ 136.546549] [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
> [ 136.546557] [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
> [ 136.546565] [<c12bd1af>] ? sys_getsockname+0x7f/0x90
> [ 136.546574] [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
> [ 136.546582] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546589] [<c10233b3>] ? do_page_fault+0x173/0x3d0
> [ 136.546596] [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
> [ 136.546605] [<c12bdd83>] sys_socketcall+0x293/0x2d0
> [ 136.546614] [<c13629d0>] sysenter_do_call+0x12/0x26
> [ 136.546619] Code: Bad EIP value.
> [ 136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
> [ 136.546645] CR2: 0000000000000000
> [ 136.546652] ---[ end trace 6909b560e78934fa ]---
>
> Signed-off-by: Alexander Holler <holler@ahsoftware.de>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Neil Horman <nhorman@tuxdriver.com>
> CC: Herbert Xu <herbert@gondor.apana.org.au>
> ---
> net/bridge/br_netfilter.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> index 3fa1231..23b43d2 100644
> --- a/net/bridge/br_netfilter.c
> +++ b/net/bridge/br_netfilter.c
> @@ -104,10 +104,16 @@ static void fake_update_pmtu(struct dst_entry *dst, u32 mtu)
> {
> }
>
> +static u32 *fake_cow_metrics(struct dst_entry *dst, unsigned long old)
> +{
> + return NULL;
> +}
> +
> static struct dst_ops fake_dst_ops = {
> .family = AF_INET,
> .protocol = cpu_to_be16(ETH_P_IP),
> .update_pmtu = fake_update_pmtu,
> + .cow_metrics = fake_cow_metrics,
> };
>
> /*
>
>
Not to drag this out further, but since you illustrated the correct way to do
this with the blackhole_ops test, and this modification now gives us two
instances of that case, would it perhaps be better to just do this in
dst_metrics_write_ptr:
return dst->ops->cow_metrics ? return dst->ops->cow_metrics(dst, p) : NULL;
Then we could eliminate the two functions that do nothing be retun NULL (along
with their respective call instructions), and save any future users from having
to remember to include a dummy cow_metrics method if they happen to set the read
only flag on thier dst_ops?
Neil
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 43+ messages in thread* Re: bridge/netfilter: regression in 2.6.39.1
@ 2011-06-06 15:32 ` Neil Horman
0 siblings, 0 replies; 43+ messages in thread
From: Neil Horman @ 2011-06-06 15:32 UTC (permalink / raw)
To: Eric Dumazet
Cc: Alexander Holler, linux-kernel, David Miller, Herbert Xu, netdev
On Mon, Jun 06, 2011 at 04:26:03PM +0200, Eric Dumazet wrote:
> From: Alexander Holler <holler@ahsoftware.de>
>
> Le lundi 06 juin 2011 à 15:29 +0200, Alexander Holler a écrit :
> > > Nice, now please submit a patch with 0972ddb237 as a guideline.
> > >
> > > BTW, you could also check other struct dst_ops methods for
> > > bridge/netfilter:
> >
> > Sorry, but I prefer to submit patches I understand by myself and for
> > stuff I know something about. The patch in my first mail was just meant
> > as a quick fix (which seemed to work here).
> >
> > So even if I might be able to construct a working patch using commit
> > 0972ddb237, I don't think I should do that.
>
> OK, I'll do it for you then, I am surprised you dont understand my
> review / suggestion.
>
> One patch submitter is supposed to followup and send a new version to
> take into account reviews/comments, not wait that eventually everybody
> says "OK, lets take it as is"
>
>
> [PATCH] bridge: provide a cow_metrics method for fake_ops
>
> Like in commit 0972ddb237 (provide cow_metrics() methods to blackhole
> dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as
> well.
>
> This fixes a regression coming from commits 62fa8a846d7d (net: Implement
> read-only protection and COW'ing of metrics.) and 33eb9873a28 (bridge:
> initialize fake_rtable metrics)
>
> ip link set mybridge mtu 1234
> -->
> [ 136.546243] Pid: 8415, comm: ip Tainted: P
> 2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc. V1Sn
> /V1Sn
> [ 136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
> [ 136.546268] EIP is at 0x0
> [ 136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
> [ 136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
> [ 136.546285] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [ 136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80
> task.ti=f15c2000)
> [ 136.546297] Stack:
> [ 136.546301] f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80
> ffffffa1 f15c3bbc
> [ 136.546315] c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80
> ffffffa6 f15c3be4
> [ 136.546329] 00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae
> 00000000 00000000
> [ 136.546343] Call Trace:
> [ 136.546359] [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
> [ 136.546372] [<c12cb9c8>] dev_set_mtu+0x38/0x80
> [ 136.546381] [<c12da347>] do_setlink+0x1a7/0x860
> [ 136.546390] [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
> [ 136.546400] [<c11d6cae>] ? nla_parse+0x6e/0xb0
> [ 136.546409] [<c12db931>] rtnl_newlink+0x361/0x510
> [ 136.546420] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546429] [<c1362762>] ? error_code+0x5a/0x60
> [ 136.546438] [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
> [ 136.546446] [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
> [ 136.546454] [<c12db180>] ? __rtnl_unlock+0x20/0x20
> [ 136.546463] [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
> [ 136.546471] [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
> [ 136.546479] [<c12edafa>] netlink_unicast+0x23a/0x280
> [ 136.546487] [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
> [ 136.546497] [<c12bb828>] sock_sendmsg+0xc8/0x100
> [ 136.546508] [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
> [ 136.546517] [<c11d0602>] ? _copy_from_user+0x42/0x60
> [ 136.546525] [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
> [ 136.546534] [<c12bd805>] sys_sendmsg+0x1c5/0x200
> [ 136.546542] [<c10c2150>] ? __do_fault+0x310/0x410
> [ 136.546549] [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
> [ 136.546557] [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
> [ 136.546565] [<c12bd1af>] ? sys_getsockname+0x7f/0x90
> [ 136.546574] [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
> [ 136.546582] [<c1023240>] ? vmalloc_sync_all+0x100/0x100
> [ 136.546589] [<c10233b3>] ? do_page_fault+0x173/0x3d0
> [ 136.546596] [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
> [ 136.546605] [<c12bdd83>] sys_socketcall+0x293/0x2d0
> [ 136.546614] [<c13629d0>] sysenter_do_call+0x12/0x26
> [ 136.546619] Code: Bad EIP value.
> [ 136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
> [ 136.546645] CR2: 0000000000000000
> [ 136.546652] ---[ end trace 6909b560e78934fa ]---
>
> Signed-off-by: Alexander Holler <holler@ahsoftware.de>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Neil Horman <nhorman@tuxdriver.com>
> CC: Herbert Xu <herbert@gondor.apana.org.au>
> ---
> net/bridge/br_netfilter.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> index 3fa1231..23b43d2 100644
> --- a/net/bridge/br_netfilter.c
> +++ b/net/bridge/br_netfilter.c
> @@ -104,10 +104,16 @@ static void fake_update_pmtu(struct dst_entry *dst, u32 mtu)
> {
> }
>
> +static u32 *fake_cow_metrics(struct dst_entry *dst, unsigned long old)
> +{
> + return NULL;
> +}
> +
> static struct dst_ops fake_dst_ops = {
> .family = AF_INET,
> .protocol = cpu_to_be16(ETH_P_IP),
> .update_pmtu = fake_update_pmtu,
> + .cow_metrics = fake_cow_metrics,
> };
>
> /*
>
>
Not to drag this out further, but since you illustrated the correct way to do
this with the blackhole_ops test, and this modification now gives us two
instances of that case, would it perhaps be better to just do this in
dst_metrics_write_ptr:
return dst->ops->cow_metrics ? return dst->ops->cow_metrics(dst, p) : NULL;
Then we could eliminate the two functions that do nothing be retun NULL (along
with their respective call instructions), and save any future users from having
to remember to include a dummy cow_metrics method if they happen to set the read
only flag on thier dst_ops?
Neil
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 43+ messages in thread* Re: bridge/netfilter: regression in 2.6.39.1
2011-06-06 15:32 ` Neil Horman
@ 2011-06-06 16:11 ` Eric Dumazet
-1 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2011-06-06 16:11 UTC (permalink / raw)
To: Neil Horman
Cc: Alexander Holler, linux-kernel, David Miller, Herbert Xu, netdev
Le lundi 06 juin 2011 à 11:32 -0400, Neil Horman a écrit :
> Not to drag this out further, but since you illustrated the correct way to do
> this with the blackhole_ops test, and this modification now gives us two
> instances of that case, would it perhaps be better to just do this in
> dst_metrics_write_ptr:
>
> return dst->ops->cow_metrics ? return dst->ops->cow_metrics(dst, p) : NULL;
>
> Then we could eliminate the two functions that do nothing be retun NULL (along
> with their respective call instructions), and save any future users from having
> to remember to include a dummy cow_metrics method if they happen to set the read
> only flag on thier dst_ops?
Well, I prefer how David coded the thing.
We can add selective traces where we want.
Having a default behavior might give much more work to find a bug in
this area. A NULL pointer access gives us an immediate indication.
Its a bit late to add an "if (dst->ops->cow_metrics)" test now that we
covered all call sites ;)
But we probably have more bugs elsewhere, because of many dst changes in
2.6.39
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
@ 2011-06-06 16:11 ` Eric Dumazet
0 siblings, 0 replies; 43+ messages in thread
From: Eric Dumazet @ 2011-06-06 16:11 UTC (permalink / raw)
To: Neil Horman
Cc: Alexander Holler, linux-kernel, David Miller, Herbert Xu, netdev
Le lundi 06 juin 2011 à 11:32 -0400, Neil Horman a écrit :
> Not to drag this out further, but since you illustrated the correct way to do
> this with the blackhole_ops test, and this modification now gives us two
> instances of that case, would it perhaps be better to just do this in
> dst_metrics_write_ptr:
>
> return dst->ops->cow_metrics ? return dst->ops->cow_metrics(dst, p) : NULL;
>
> Then we could eliminate the two functions that do nothing be retun NULL (along
> with their respective call instructions), and save any future users from having
> to remember to include a dummy cow_metrics method if they happen to set the read
> only flag on thier dst_ops?
Well, I prefer how David coded the thing.
We can add selective traces where we want.
Having a default behavior might give much more work to find a bug in
this area. A NULL pointer access gives us an immediate indication.
Its a bit late to add an "if (dst->ops->cow_metrics)" test now that we
covered all call sites ;)
But we probably have more bugs elsewhere, because of many dst changes in
2.6.39
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
2011-06-06 16:11 ` Eric Dumazet
@ 2011-06-06 17:07 ` Neil Horman
-1 siblings, 0 replies; 43+ messages in thread
From: Neil Horman @ 2011-06-06 17:07 UTC (permalink / raw)
To: Eric Dumazet
Cc: Alexander Holler, linux-kernel, David Miller, Herbert Xu, netdev
On Mon, Jun 06, 2011 at 06:11:45PM +0200, Eric Dumazet wrote:
> Le lundi 06 juin 2011 à 11:32 -0400, Neil Horman a écrit :
>
> > Not to drag this out further, but since you illustrated the correct way to do
> > this with the blackhole_ops test, and this modification now gives us two
> > instances of that case, would it perhaps be better to just do this in
> > dst_metrics_write_ptr:
> >
> > return dst->ops->cow_metrics ? return dst->ops->cow_metrics(dst, p) : NULL;
> >
> > Then we could eliminate the two functions that do nothing be retun NULL (along
> > with their respective call instructions), and save any future users from having
> > to remember to include a dummy cow_metrics method if they happen to set the read
> > only flag on thier dst_ops?
>
> Well, I prefer how David coded the thing.
> We can add selective traces where we want.
>
> Having a default behavior might give much more work to find a bug in
> this area. A NULL pointer access gives us an immediate indication.
>
> Its a bit late to add an "if (dst->ops->cow_metrics)" test now that we
> covered all call sites ;)
>
> But we probably have more bugs elsewhere, because of many dst changes in
> 2.6.39
Ok, sounds reasonable to me.
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
@ 2011-06-06 17:07 ` Neil Horman
0 siblings, 0 replies; 43+ messages in thread
From: Neil Horman @ 2011-06-06 17:07 UTC (permalink / raw)
To: Eric Dumazet
Cc: Alexander Holler, linux-kernel, David Miller, Herbert Xu, netdev
On Mon, Jun 06, 2011 at 06:11:45PM +0200, Eric Dumazet wrote:
> Le lundi 06 juin 2011 à 11:32 -0400, Neil Horman a écrit :
>
> > Not to drag this out further, but since you illustrated the correct way to do
> > this with the blackhole_ops test, and this modification now gives us two
> > instances of that case, would it perhaps be better to just do this in
> > dst_metrics_write_ptr:
> >
> > return dst->ops->cow_metrics ? return dst->ops->cow_metrics(dst, p) : NULL;
> >
> > Then we could eliminate the two functions that do nothing be retun NULL (along
> > with their respective call instructions), and save any future users from having
> > to remember to include a dummy cow_metrics method if they happen to set the read
> > only flag on thier dst_ops?
>
> Well, I prefer how David coded the thing.
> We can add selective traces where we want.
>
> Having a default behavior might give much more work to find a bug in
> this area. A NULL pointer access gives us an immediate indication.
>
> Its a bit late to add an "if (dst->ops->cow_metrics)" test now that we
> covered all call sites ;)
>
> But we probably have more bugs elsewhere, because of many dst changes in
> 2.6.39
Ok, sounds reasonable to me.
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: bridge/netfilter: regression in 2.6.39.1
2011-06-06 14:26 ` Eric Dumazet
(?)
(?)
@ 2011-06-07 7:52 ` David Miller
-1 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2011-06-07 7:52 UTC (permalink / raw)
To: eric.dumazet; +Cc: holler, nhorman, linux-kernel, herbert, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 06 Jun 2011 16:26:03 +0200
> [PATCH] bridge: provide a cow_metrics method for fake_ops
>
> Like in commit 0972ddb237 (provide cow_metrics() methods to blackhole
> dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as
> well.
>
> This fixes a regression coming from commits 62fa8a846d7d (net: Implement
> read-only protection and COW'ing of metrics.) and 33eb9873a28 (bridge:
> initialize fake_rtable metrics)
>
> ip link set mybridge mtu 1234
> -->
...
> Signed-off-by: Alexander Holler <holler@ahsoftware.de>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied, thanks everyone.
^ permalink raw reply [flat|nested] 43+ messages in thread