Netdev List

* Re: [PATCH net-2.6 v2 3/3] ixgbe: Look inside vlan when determining offload protocol.
From: David Miller @ 2010-11-12 20:24 UTC
  To: jesse
  Cc: netdev, hzheng, jeffrey.t.kirsher, alexander.h.duyck,
	jesse.brandeburg
In-Reply-To: <1289519279-20641-3-git-send-email-jesse@nicira.com>

From: Jesse Gross <jesse@nicira.com>
Date: Thu, 11 Nov 2010 15:47:59 -0800

> From: Hao Zheng <hzheng@nicira.com>
> 
> Currently the skb->protocol field is used to set up various
> offloading parameters on transmit for the correct protocol.
> However, if vlan offloading is disabled or otherwise not used,
> the protocol field will be ETH_P_8021Q, not the actual protocol.
> This will cause the offloading to be performed incorrectly,
> even though the hardware is capable of looking inside vlan tags.
> Instead, look inside the header if necessary to determine the
> correct protocol type.
> 
> To some extent this fixes a regression from 2.6.36 because it
> was previously not possible to disable vlan offloading and this
> error case was not exposed.
> 
> Signed-off-by: Hao Zheng <hzheng@nicira.com>
> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> CC: Alex Duyck <alexander.h.duyck@intel.com>
> CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Signed-off-by: Jesse Gross <jesse@nicira.com>

Applied.

* Re: [PATCH net-2.6 v2 2/3] bnx2x: Look inside vlan when determining checksum proto.
From: David Miller @ 2010-11-12 20:24 UTC
  To: jesse; +Cc: netdev, hzheng, eilong
In-Reply-To: <1289519279-20641-2-git-send-email-jesse@nicira.com>

From: Jesse Gross <jesse@nicira.com>
Date: Thu, 11 Nov 2010 15:47:58 -0800

> From: Hao Zheng <hzheng@nicira.com>
> 
> Currently the skb->protocol field is used to set up checksum
> offloading on transmit for the correct protocol.  However, if
> vlan offloading is disabled or otherwise not used, the protocol
> field will be ETH_P_8021Q, not the actual protocol.  This will
> cause the checksum to be computed incorrectly, even though the
> hardware is capable of looking inside vlan tags.  Instead,
> look inside the header if necessary to determine the correct
> protocol type.
> 
> To some extent this fixes a regression from 2.6.36 because it
> was previously not possible to disable vlan offloading and this
> error case was not exposed.
> 
> Signed-off-by: Hao Zheng <hzheng@nicira.com>
> CC: Eilon Greenstein <eilong@broadcom.com>
> Signed-off-by: Jesse Gross <jesse@nicira.com>

Applied.

* Re: [PATCH net-2.6 v2 1/3] vlan: Add function to retrieve EtherType from vlan packets.
From: David Miller @ 2010-11-12 20:24 UTC
  To: jesse; +Cc: netdev, hzheng
In-Reply-To: <1289519279-20641-1-git-send-email-jesse@nicira.com>

From: Jesse Gross <jesse@nicira.com>
Date: Thu, 11 Nov 2010 15:47:57 -0800

> From: Hao Zheng <hzheng@nicira.com>
> 
> Depending on how a packet is vlan tagged (i.e. hardware accelerated or
> not), the encapsulated protocol is stored in different locations.  This
> provides a consistent method of accessing that protocol, which is needed
> by drivers, security checks, etc.
> 
> Signed-off-by: Hao Zheng <hzheng@nicira.com>
> Signed-off-by: Jesse Gross <jesse@nicira.com>

Applied.
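
For illustration, here is a minimal userspace sketch of the lookup the
patch description above talks about: finding the encapsulated EtherType
whether or not the frame carries an in-band 802.1Q tag.  It is not the
helper the patch adds.  The names frame_inner_ethertype() and
read_be16() and the constants are hypothetical, and the sketch assumes a
raw Ethernet frame buffer with at most one tag present in the frame
data.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_8021Q 0x8100u /* same value as the kernel's ETH_P_8021Q */
#define ETH_HDR_LEN     14u     /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define VLAN_TAG_LEN    4u      /* TPID (2) + TCI (2) */

/* Read a 16-bit big-endian (network order) value. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: return the encapsulated EtherType of a raw
 * Ethernet frame in host byte order, looking past one 802.1Q tag if
 * it is present.  Returns 0 if the frame is too short to tell.
 */
static uint16_t frame_inner_ethertype(const uint8_t *frame, size_t len)
{
	uint16_t type;

	if (len < ETH_HDR_LEN)
		return 0;

	/* Untagged frame: the EtherType sits right after the two MACs. */
	type = read_be16(frame + 12);
	if (type != ETHERTYPE_8021Q)
		return type;

	if (len < ETH_HDR_LEN + VLAN_TAG_LEN)
		return 0;

	/* Tagged: bytes 12-13 TPID, 14-15 TCI, 16-17 inner EtherType. */
	return read_be16(frame + 16);
}
```

A caller would pass the start of the Ethernet header and the number of
valid bytes, then compare the result against values such as 0x0800
(IPv4) or 0x86dd (IPv6) before choosing offload parameters.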

* Fwd: WARNING: at net/ipv6/ip6_fib.c:1172 fib6_del+0x4f6/0x5a0() 2.6.35.7
From: Alexey Dobriyan @ 2010-11-12 20:21 UTC
  To: netdev; +Cc: udovdh

----- Forwarded message from Udo van den Heuvel <udovdh@xs4all.nl> -----

From: Udo van den Heuvel <udovdh@xs4all.nl>
To: linux-kernel@vger.kernel.org
Subject: WARNING: at net/ipv6/ip6_fib.c:1172 fib6_del+0x4f6/0x5a0()  2.6.35.7
Date: Thu, 11 Nov 2010 17:56:50 +0100

Hello,

Should apps like NetworkManager cause stuff like in the logs added below?
What happened?
How to fix?

Kind regards,
Udo

------------[ cut here ]------------
WARNING: at net/ipv6/ip6_fib.c:1172 fib6_del+0x4f6/0x5a0()
Hardware name:
Modules linked in: sit tunnel4 nls_utf8 isofs vfat fat usb_storage
vboxnetadp vboxnetflt vboxdrv radeon ttm drm_kms_helper drm fb fbdev
i2c_algo_bit cfbcopyarea cfbimgblt cfbfillrect rfcomm sco bridge stp llc
bnep l2cap nfsd nfs_acl auth_rpcgss exportfs eeprom it87 hwmon_vid lockd
sunrpc powernow_k8 mperf ipt_REJECT iptable_filter ipt_MASQUERADE
iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables
nf_conntrack_netbios_ns ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables x_tables binfmt_misc dm_mirror
dm_region_hash dm_log ppdev snd_hda_codec_atihdmi snd_hda_codec_realtek
pwc snd_seq snd_seq_device videodev parport_pc snd_hda_intel
snd_hda_codec k10temp snd_pcm ohci1394 snd_timer btusb ieee1394 snd
v4l1_compat v4l2_compat_ioctl32 i2c_piix4 usblp evdev sg parport
bluetooth button snd_page_alloc sr_mod cdrom ide_pci_generic atiixp
pata_atiixp ehci_hcd ohci_hcd sata_sil24 floppy [last unloaded:
scsi_wait_scan]
Pid: 19225, comm: NetworkManager Not tainted 2.6.35.7 #10
Call Trace:
 [<ffffffff8103e02b>] ? warn_slowpath_common+0x7b/0xc0
 [<ffffffff8137b7b6>] ? fib6_del+0x4f6/0x5a0
 [<ffffffff8137b860>] ? fib6_clean_node+0x0/0xc0
 [<ffffffff8137a9c0>] ? fib6_prune_clone+0x0/0x20
 [<ffffffff8137b8bc>] ? fib6_clean_node+0x5c/0xc0
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137acd6>] ? fib6_walk_continue+0x156/0x170
 [<ffffffff8137ad41>] ? fib6_walk+0x51/0xb0
 [<ffffffff813a8533>] ? _raw_write_lock_bh+0x13/0x30
 [<ffffffff8137b177>] ? fib6_clean_all+0x97/0xe0
 [<ffffffff8137b860>] ? fib6_clean_node+0x0/0xc0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137a8e6>] ? rt6_ifdown+0x26/0xc0
 [<ffffffff81373b88>] ? addrconf_ifdown+0x48/0x4e0
 [<ffffffff81043cdd>] ? local_bh_enable_ip+0x4d/0xb0
 [<ffffffff81374405>] ? addrconf_notify+0xe5/0x910
 [<ffffffff8104a013>] ? lock_timer_base+0x33/0x70
 [<ffffffff813a84b2>] ? _raw_spin_unlock_irqrestore+0x12/0x40
 [<ffffffff8104a8bd>] ? mod_timer+0x14d/0x250
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff8105acf6>] ? notifier_call_chain+0x46/0x70
 [<ffffffff812f91d5>] ? __dev_notify_flags+0x65/0x90
 [<ffffffff812f923b>] ? dev_change_flags+0x3b/0x70
 [<ffffffff81304f09>] ? do_setlink+0x189/0x740
 [<ffffffff811c1da0>] ? nla_parse+0x30/0x100
 [<ffffffff81305b90>] ? rtnetlink_rcv_msg+0x0/0x270
 [<ffffffff81305597>] ? rtnl_setlink+0xd7/0x120
 [<ffffffff81305b90>] ? rtnetlink_rcv_msg+0x0/0x270
 [<ffffffff813153e9>] ? netlink_rcv_skb+0x89/0xb0
 [<ffffffff81305b7f>] ? rtnetlink_rcv+0x1f/0x30
 [<ffffffff81314fe5>] ? netlink_unicast+0x2a5/0x2f0
 [<ffffffff8131592c>] ? netlink_sendmsg+0x1fc/0x300
 [<ffffffff812e559e>] ? sock_sendmsg+0xfe/0x110
 [<ffffffff8110e3a9>] ? dquot_file_open+0x19/0x60
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff8108d709>] ? file_read_actor+0x179/0x1b0
 [<ffffffff811ae452>] ? cpumask_any_but+0x22/0x40
 [<ffffffff8102b27c>] ? flush_tlb_page+0x5c/0xe0
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff810c0972>] ? fget_light+0xb2/0xf0
 [<ffffffff812e7946>] ? move_addr_to_kernel+0x46/0x70
 [<ffffffff812f185a>] ? verify_iovec+0x6a/0xc0
 [<ffffffff812e58bb>] ? sys_sendmsg+0x23b/0x380
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81026829>] ? do_page_fault+0x199/0x3b0
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff810c0972>] ? fget_light+0xb2/0xf0
 [<ffffffff8107a9d8>] ? audit_syscall_entry+0x1b8/0x1e0
 [<ffffffff810023eb>] ? system_call_fastpath+0x16/0x1b
---[ end trace 286af05228c1bc82 ]---
fib6_clean_node: del failed: rt=ffff880037a84200@98c20d5400000000 err=-2
------------[ cut here ]------------
WARNING: at net/ipv6/ip6_fib.c:1172 fib6_del+0x4f6/0x5a0()
Hardware name:
Modules linked in: sit tunnel4 nls_utf8 isofs vfat fat usb_storage
vboxnetadp vboxnetflt vboxdrv radeon ttm drm_kms_helper drm fb fbdev
i2c_algo_bit cfbcopyarea cfbimgblt cfbfillrect rfcomm sco bridge stp llc
bnep l2cap nfsd nfs_acl auth_rpcgss exportfs eeprom it87 hwmon_vid lockd
sunrpc powernow_k8 mperf ipt_REJECT iptable_filter ipt_MASQUERADE
iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables
nf_conntrack_netbios_ns ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables x_tables binfmt_misc dm_mirror
dm_region_hash dm_log ppdev snd_hda_codec_atihdmi snd_hda_codec_realtek
pwc snd_seq snd_seq_device videodev parport_pc snd_hda_intel
snd_hda_codec k10temp snd_pcm ohci1394 snd_timer btusb ieee1394 snd
v4l1_compat v4l2_compat_ioctl32 i2c_piix4 usblp evdev sg parport
bluetooth button snd_page_alloc sr_mod cdrom ide_pci_generic atiixp
pata_atiixp ehci_hcd ohci_hcd sata_sil24 floppy [last unloaded:
scsi_wait_scan]
Pid: 19225, comm: NetworkManager Tainted: G        W   2.6.35.7 #10
Call Trace:
 [<ffffffff8103e02b>] ? warn_slowpath_common+0x7b/0xc0
 [<ffffffff8137b7b6>] ? fib6_del+0x4f6/0x5a0
 [<ffffffff813a5a16>] ? printk+0x40/0x4a
 [<ffffffff8137b8bc>] ? fib6_clean_node+0x5c/0xc0
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137acd6>] ? fib6_walk_continue+0x156/0x170
 [<ffffffff8137ad41>] ? fib6_walk+0x51/0xb0
 [<ffffffff813a8533>] ? _raw_write_lock_bh+0x13/0x30
 [<ffffffff8137b177>] ? fib6_clean_all+0x97/0xe0
 [<ffffffff8137b860>] ? fib6_clean_node+0x0/0xc0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137a8e6>] ? rt6_ifdown+0x26/0xc0
 [<ffffffff81373b88>] ? addrconf_ifdown+0x48/0x4e0
 [<ffffffff81043cdd>] ? local_bh_enable_ip+0x4d/0xb0
 [<ffffffff81374405>] ? addrconf_notify+0xe5/0x910
 [<ffffffff8104a013>] ? lock_timer_base+0x33/0x70
 [<ffffffff813a84b2>] ? _raw_spin_unlock_irqrestore+0x12/0x40
 [<ffffffff8104a8bd>] ? mod_timer+0x14d/0x250
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff8105acf6>] ? notifier_call_chain+0x46/0x70
 [<ffffffff812f91d5>] ? __dev_notify_flags+0x65/0x90
 [<ffffffff812f923b>] ? dev_change_flags+0x3b/0x70
 [<ffffffff81304f09>] ? do_setlink+0x189/0x740
 [<ffffffff811c1da0>] ? nla_parse+0x30/0x100
 [<ffffffff81305b90>] ? rtnetlink_rcv_msg+0x0/0x270
 [<ffffffff81305597>] ? rtnl_setlink+0xd7/0x120
 [<ffffffff81305b90>] ? rtnetlink_rcv_msg+0x0/0x270
 [<ffffffff813153e9>] ? netlink_rcv_skb+0x89/0xb0
 [<ffffffff81305b7f>] ? rtnetlink_rcv+0x1f/0x30
 [<ffffffff81314fe5>] ? netlink_unicast+0x2a5/0x2f0
 [<ffffffff8131592c>] ? netlink_sendmsg+0x1fc/0x300
 [<ffffffff812e559e>] ? sock_sendmsg+0xfe/0x110
 [<ffffffff8110e3a9>] ? dquot_file_open+0x19/0x60
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff8108d709>] ? file_read_actor+0x179/0x1b0
 [<ffffffff811ae452>] ? cpumask_any_but+0x22/0x40
 [<ffffffff8102b27c>] ? flush_tlb_page+0x5c/0xe0
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff810c0972>] ? fget_light+0xb2/0xf0
 [<ffffffff812e7946>] ? move_addr_to_kernel+0x46/0x70
 [<ffffffff812f185a>] ? verify_iovec+0x6a/0xc0
 [<ffffffff812e58bb>] ? sys_sendmsg+0x23b/0x380
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81026829>] ? do_page_fault+0x199/0x3b0
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff810c0972>] ? fget_light+0xb2/0xf0
 [<ffffffff8107a9d8>] ? audit_syscall_entry+0x1b8/0x1e0
 [<ffffffff810023eb>] ? system_call_fastpath+0x16/0x1b
---[ end trace 286af05228c1bc83 ]---
fib6_clean_node: del failed: rt=ffff88012ae4b200@620aa8c000000000 err=-2
r8169 0000:03:00.0: eth0: link up
eth0: no IPv6 routers present
------------[ cut here ]------------
WARNING: at net/ipv6/ip6_fib.c:1172 fib6_del+0x4f6/0x5a0()
Hardware name:
Modules linked in: sit tunnel4 nls_utf8 isofs vfat fat usb_storage
vboxnetadp vboxnetflt vboxdrv radeon ttm drm_kms_helper drm fb fbdev
i2c_algo_bit cfbcopyarea cfbimgblt cfbfillrect rfcomm sco bridge stp llc
bnep l2cap nfsd nfs_acl auth_rpcgss exportfs eeprom it87 hwmon_vid lockd
sunrpc powernow_k8 mperf ipt_REJECT iptable_filter ipt_MASQUERADE
iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables
nf_conntrack_netbios_ns ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables x_tables binfmt_misc dm_mirror
dm_region_hash dm_log ppdev snd_hda_codec_atihdmi snd_hda_codec_realtek
pwc snd_seq snd_seq_device videodev parport_pc snd_hda_intel
snd_hda_codec k10temp snd_pcm ohci1394 snd_timer btusb ieee1394 snd
v4l1_compat v4l2_compat_ioctl32 i2c_piix4 usblp evdev sg parport
bluetooth button snd_page_alloc sr_mod cdrom ide_pci_generic atiixp
pata_atiixp ehci_hcd ohci_hcd sata_sil24 floppy [last unloaded:
scsi_wait_scan]
Pid: 19514, comm: NetworkManager Tainted: G        W   2.6.35.7 #10
Call Trace:
 [<ffffffff8103e02b>] ? warn_slowpath_common+0x7b/0xc0
 [<ffffffff8137b7b6>] ? fib6_del+0x4f6/0x5a0
 [<ffffffff810386af>] ? load_balance+0xbf/0x6b0
 [<ffffffff8103201f>] ? dequeue_task_fair+0x4f/0x1c0
 [<ffffffff8137b8bc>] ? fib6_clean_node+0x5c/0xc0
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137acd6>] ? fib6_walk_continue+0x156/0x170
 [<ffffffff8137ad41>] ? fib6_walk+0x51/0xb0
 [<ffffffff813a8533>] ? _raw_write_lock_bh+0x13/0x30
 [<ffffffff8137b177>] ? fib6_clean_all+0x97/0xe0
 [<ffffffff8137b860>] ? fib6_clean_node+0x0/0xc0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137a8e6>] ? rt6_ifdown+0x26/0xc0
 [<ffffffff81373b88>] ? addrconf_ifdown+0x48/0x4e0
 [<ffffffff81043cdd>] ? local_bh_enable_ip+0x4d/0xb0
 [<ffffffff81374405>] ? addrconf_notify+0xe5/0x910
 [<ffffffff8104a013>] ? lock_timer_base+0x33/0x70
 [<ffffffff813a84b2>] ? _raw_spin_unlock_irqrestore+0x12/0x40
 [<ffffffff8104a8bd>] ? mod_timer+0x14d/0x250
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff8105acf6>] ? notifier_call_chain+0x46/0x70
 [<ffffffff812f91d5>] ? __dev_notify_flags+0x65/0x90
 [<ffffffff812f923b>] ? dev_change_flags+0x3b/0x70
 [<ffffffff81304f09>] ? do_setlink+0x189/0x740
 [<ffffffff811c1da0>] ? nla_parse+0x30/0x100
 [<ffffffff81305b90>] ? rtnetlink_rcv_msg+0x0/0x270
 [<ffffffff81305597>] ? rtnl_setlink+0xd7/0x120
 [<ffffffff81305b90>] ? rtnetlink_rcv_msg+0x0/0x270
 [<ffffffff813153e9>] ? netlink_rcv_skb+0x89/0xb0
 [<ffffffff81305b7f>] ? rtnetlink_rcv+0x1f/0x30
 [<ffffffff81314fe5>] ? netlink_unicast+0x2a5/0x2f0
 [<ffffffff8131592c>] ? netlink_sendmsg+0x1fc/0x300
 [<ffffffff812e559e>] ? sock_sendmsg+0xfe/0x110
 [<ffffffff8110e3a9>] ? dquot_file_open+0x19/0x60
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff8108d709>] ? file_read_actor+0x179/0x1b0
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff812e7946>] ? move_addr_to_kernel+0x46/0x70
 [<ffffffff812f185a>] ? verify_iovec+0x6a/0xc0
 [<ffffffff812e58bb>] ? sys_sendmsg+0x23b/0x380
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035c8f>] ? add_preempt_count+0x5f/0xb0
 [<ffffffff81043c7e>] ? local_bh_disable+0xe/0x20
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff810d1d2f>] ? d_kill+0x5f/0x80
 [<ffffffff8107a9d8>] ? audit_syscall_entry+0x1b8/0x1e0
 [<ffffffff810023eb>] ? system_call_fastpath+0x16/0x1b
---[ end trace 286af05228c1bc84 ]---
fib6_clean_node: del failed: rt=ffff88012ae4b200@620aa8c000000000 err=-2
fib6_clean_node: del failed: rt=ffff8800250edd00@(null) err=-2
fib6_clean_node: del failed: rt=ffff88012d877840@(null) err=-2
------------[ cut here ]------------
WARNING: at net/ipv6/ip6_fib.c:1172 fib6_del+0x4f6/0x5a0()
Hardware name:
Modules linked in: sit tunnel4 nls_utf8 isofs vfat fat usb_storage
vboxnetadp vboxnetflt vboxdrv radeon ttm drm_kms_helper drm fb fbdev
i2c_algo_bit cfbcopyarea cfbimgblt cfbfillrect rfcomm sco bridge stp llc
bnep l2cap nfsd nfs_acl auth_rpcgss exportfs eeprom it87 hwmon_vid lockd
sunrpc powernow_k8 mperf ipt_REJECT iptable_filter ipt_MASQUERADE
iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables
nf_conntrack_netbios_ns ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables x_tables binfmt_misc dm_mirror
dm_region_hash dm_log ppdev snd_hda_codec_atihdmi snd_hda_codec_realtek
pwc snd_seq snd_seq_device videodev parport_pc snd_hda_intel
snd_hda_codec k10temp snd_pcm ohci1394 snd_timer btusb ieee1394 snd
v4l1_compat v4l2_compat_ioctl32 i2c_piix4 usblp evdev sg parport
bluetooth button snd_page_alloc sr_mod cdrom ide_pci_generic atiixp
pata_atiixp ehci_hcd ohci_hcd sata_sil24 floppy [last unloaded:
scsi_wait_scan]
Pid: 19678, comm: ip Tainted: G        W   2.6.35.7 #10
Call Trace:
 [<ffffffff8103e02b>] ? warn_slowpath_common+0x7b/0xc0
 [<ffffffff8137b7b6>] ? fib6_del+0x4f6/0x5a0
 [<ffffffff8137b860>] ? fib6_clean_node+0x0/0xc0
 [<ffffffff8137a9c0>] ? fib6_prune_clone+0x0/0x20
 [<ffffffff8137b8bc>] ? fib6_clean_node+0x5c/0xc0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137acd6>] ? fib6_walk_continue+0x156/0x170
 [<ffffffff8137ad41>] ? fib6_walk+0x51/0xb0
 [<ffffffff813a8533>] ? _raw_write_lock_bh+0x13/0x30
 [<ffffffff8137b177>] ? fib6_clean_all+0x97/0xe0
 [<ffffffff8137b860>] ? fib6_clean_node+0x0/0xc0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137a8e6>] ? rt6_ifdown+0x26/0xc0
 [<ffffffff81373b88>] ? addrconf_ifdown+0x48/0x4e0
 [<ffffffff813741f6>] ? inet6_addr_del+0x106/0x130
 [<ffffffff81305b90>] ? rtnetlink_rcv_msg+0x0/0x270
 [<ffffffff81374285>] ? inet6_rtm_deladdr+0x65/0x70
 [<ffffffff813153e9>] ? netlink_rcv_skb+0x89/0xb0
 [<ffffffff81305b7f>] ? rtnetlink_rcv+0x1f/0x30
 [<ffffffff81314fe5>] ? netlink_unicast+0x2a5/0x2f0
 [<ffffffff8131592c>] ? netlink_sendmsg+0x1fc/0x300
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff812e559e>] ? sock_sendmsg+0xfe/0x110
 [<ffffffff810a3f2a>] ? __do_fault+0x40a/0x520
 [<ffffffff812e6cc8>] ? move_addr_to_user+0x88/0xa0
 [<ffffffff812e6e2d>] ? __sys_recvmsg+0x14d/0x290
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff812e5622>] ? sockfd_lookup_light+0x22/0x80
 [<ffffffff812e7a8d>] ? sys_sendto+0x11d/0x180
 [<ffffffff810ab0ca>] ? vma_link+0xaa/0x110
 [<ffffffff810ac5bd>] ? do_brk+0x2fd/0x330
 [<ffffffff8107a9d8>] ? audit_syscall_entry+0x1b8/0x1e0
 [<ffffffff810023eb>] ? system_call_fastpath+0x16/0x1b
---[ end trace 286af05228c1bc85 ]---
fib6_clean_node: del failed: rt=ffff880028faa340@ffff8800250a6e40 err=-2
------------[ cut here ]------------
WARNING: at net/ipv6/ip6_fib.c:1172 fib6_del+0x4f6/0x5a0()
Hardware name:
Modules linked in: sit tunnel4 nls_utf8 isofs vfat fat usb_storage
vboxnetadp vboxnetflt vboxdrv radeon ttm drm_kms_helper drm fb fbdev
i2c_algo_bit cfbcopyarea cfbimgblt cfbfillrect rfcomm sco bridge stp llc
bnep l2cap nfsd nfs_acl auth_rpcgss exportfs eeprom it87 hwmon_vid lockd
sunrpc powernow_k8 mperf ipt_REJECT iptable_filter ipt_MASQUERADE
iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables
nf_conntrack_netbios_ns ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables x_tables binfmt_misc dm_mirror
dm_region_hash dm_log ppdev snd_hda_codec_atihdmi snd_hda_codec_realtek
pwc snd_seq snd_seq_device videodev parport_pc snd_hda_intel
snd_hda_codec k10temp snd_pcm ohci1394 snd_timer btusb ieee1394 snd
v4l1_compat v4l2_compat_ioctl32 i2c_piix4 usblp evdev sg parport
bluetooth button snd_page_alloc sr_mod cdrom ide_pci_generic atiixp
pata_atiixp ehci_hcd ohci_hcd sata_sil24 floppy [last unloaded:
scsi_wait_scan]
Pid: 19678, comm: ip Tainted: G        W   2.6.35.7 #10
Call Trace:
 [<ffffffff8103e02b>] ? warn_slowpath_common+0x7b/0xc0
 [<ffffffff8137b7b6>] ? fib6_del+0x4f6/0x5a0
 [<ffffffff813a5a16>] ? printk+0x40/0x4a
 [<ffffffff8137b8bc>] ? fib6_clean_node+0x5c/0xc0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137acd6>] ? fib6_walk_continue+0x156/0x170
 [<ffffffff8137ad41>] ? fib6_walk+0x51/0xb0
 [<ffffffff813a8533>] ? _raw_write_lock_bh+0x13/0x30
 [<ffffffff8137b177>] ? fib6_clean_all+0x97/0xe0
 [<ffffffff8137b860>] ? fib6_clean_node+0x0/0xc0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137a8e6>] ? rt6_ifdown+0x26/0xc0
 [<ffffffff81373b88>] ? addrconf_ifdown+0x48/0x4e0
 [<ffffffff813741f6>] ? inet6_addr_del+0x106/0x130
 [<ffffffff81305b90>] ? rtnetlink_rcv_msg+0x0/0x270
 [<ffffffff81374285>] ? inet6_rtm_deladdr+0x65/0x70
 [<ffffffff813153e9>] ? netlink_rcv_skb+0x89/0xb0
 [<ffffffff81305b7f>] ? rtnetlink_rcv+0x1f/0x30
 [<ffffffff81314fe5>] ? netlink_unicast+0x2a5/0x2f0
 [<ffffffff8131592c>] ? netlink_sendmsg+0x1fc/0x300
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff812e559e>] ? sock_sendmsg+0xfe/0x110
 [<ffffffff810a3f2a>] ? __do_fault+0x40a/0x520
 [<ffffffff812e6cc8>] ? move_addr_to_user+0x88/0xa0
 [<ffffffff812e6e2d>] ? __sys_recvmsg+0x14d/0x290
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff812e5622>] ? sockfd_lookup_light+0x22/0x80
 [<ffffffff812e7a8d>] ? sys_sendto+0x11d/0x180
 [<ffffffff810ab0ca>] ? vma_link+0xaa/0x110
 [<ffffffff810ac5bd>] ? do_brk+0x2fd/0x330
 [<ffffffff8107a9d8>] ? audit_syscall_entry+0x1b8/0x1e0
 [<ffffffff810023eb>] ? system_call_fastpath+0x16/0x1b
---[ end trace 286af05228c1bc86 ]---
fib6_clean_node: del failed: rt=ffff8800ce662cc0@0100007f00000000 err=-2
------------[ cut here ]------------
WARNING: at net/ipv6/ip6_fib.c:1172 fib6_del+0x4f6/0x5a0()
Hardware name:
Modules linked in: sit tunnel4 nls_utf8 isofs vfat fat usb_storage
vboxnetadp vboxnetflt vboxdrv radeon ttm drm_kms_helper drm fb fbdev
i2c_algo_bit cfbcopyarea cfbimgblt cfbfillrect rfcomm sco bridge stp llc
bnep l2cap nfsd nfs_acl auth_rpcgss exportfs eeprom it87 hwmon_vid lockd
sunrpc powernow_k8 mperf ipt_REJECT iptable_filter ipt_MASQUERADE
iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables
nf_conntrack_netbios_ns ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables x_tables binfmt_misc dm_mirror
dm_region_hash dm_log ppdev snd_hda_codec_atihdmi snd_hda_codec_realtek
pwc snd_seq snd_seq_device videodev parport_pc snd_hda_intel
snd_hda_codec k10temp snd_pcm ohci1394 snd_timer btusb ieee1394 snd
v4l1_compat v4l2_compat_ioctl32 i2c_piix4 usblp evdev sg parport
bluetooth button snd_page_alloc sr_mod cdrom ide_pci_generic atiixp
pata_atiixp ehci_hcd ohci_hcd sata_sil24 floppy [last unloaded:
scsi_wait_scan]
Pid: 19679, comm: ip Tainted: G        W   2.6.35.7 #10
Call Trace:
 [<ffffffff8103e02b>] ? warn_slowpath_common+0x7b/0xc0
 [<ffffffff8137b7b6>] ? fib6_del+0x4f6/0x5a0
 [<ffffffff810386af>] ? load_balance+0xbf/0x6b0
 [<ffffffff8103201f>] ? dequeue_task_fair+0x4f/0x1c0
 [<ffffffff8137b8bc>] ? fib6_clean_node+0x5c/0xc0
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137acd6>] ? fib6_walk_continue+0x156/0x170
 [<ffffffff8137ad41>] ? fib6_walk+0x51/0xb0
 [<ffffffff813a8533>] ? _raw_write_lock_bh+0x13/0x30
 [<ffffffff8137b177>] ? fib6_clean_all+0x97/0xe0
 [<ffffffff8137b860>] ? fib6_clean_node+0x0/0xc0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137a8e6>] ? rt6_ifdown+0x26/0xc0
 [<ffffffff81373b88>] ? addrconf_ifdown+0x48/0x4e0
 [<ffffffff81043cdd>] ? local_bh_enable_ip+0x4d/0xb0
 [<ffffffff81374405>] ? addrconf_notify+0xe5/0x910
 [<ffffffff8137b860>] ? fib6_clean_node+0x0/0xc0
 [<ffffffff8137a9e0>] ? fib6_age+0x0/0x80
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff8105acf6>] ? notifier_call_chain+0x46/0x70
 [<ffffffff812f91d5>] ? __dev_notify_flags+0x65/0x90
 [<ffffffff812f923b>] ? dev_change_flags+0x3b/0x70
 [<ffffffff81304f09>] ? do_setlink+0x189/0x740
 [<ffffffff811c1da0>] ? nla_parse+0x30/0x100
 [<ffffffff813061e9>] ? rtnl_newlink+0x3e9/0x4f0
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff813a84b2>] ? _raw_spin_unlock_irqrestore+0x12/0x40
 [<ffffffff812f28ba>] ? __skb_recv_datagram+0xca/0x280
 [<ffffffff81305c15>] ? rtnetlink_rcv_msg+0x85/0x270
 [<ffffffff81305b90>] ? rtnetlink_rcv_msg+0x0/0x270
 [<ffffffff813153e9>] ? netlink_rcv_skb+0x89/0xb0
 [<ffffffff81305b7f>] ? rtnetlink_rcv+0x1f/0x30
 [<ffffffff81314fe5>] ? netlink_unicast+0x2a5/0x2f0
 [<ffffffff8131592c>] ? netlink_sendmsg+0x1fc/0x300
 [<ffffffff812e559e>] ? sock_sendmsg+0xfe/0x110
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff8108e18d>] ? find_get_page+0x6d/0xd0
 [<ffffffff8108eb19>] ? filemap_fault+0x99/0x400
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff810a3f2a>] ? __do_fault+0x40a/0x520
 [<ffffffff812e7946>] ? move_addr_to_kernel+0x46/0x70
 [<ffffffff812f185a>] ? verify_iovec+0x6a/0xc0
 [<ffffffff812e58bb>] ? sys_sendmsg+0x23b/0x380
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff81026829>] ? do_page_fault+0x199/0x3b0
 [<ffffffff810ab0ca>] ? vma_link+0xaa/0x110
 [<ffffffff810ac5bd>] ? do_brk+0x2fd/0x330
 [<ffffffff8107a9d8>] ? audit_syscall_entry+0x1b8/0x1e0
 [<ffffffff810023eb>] ? system_call_fastpath+0x16/0x1b
---[ end trace 286af05228c1bc87 ]---
fib6_clean_node: del failed: rt=ffff880028faa340@ffff8800250a6e40 err=-2
------------[ cut here ]------------
WARNING: at net/ipv6/ip6_fib.c:1172 fib6_del+0x4f6/0x5a0()
Hardware name:
Modules linked in: sit tunnel4 nls_utf8 isofs vfat fat usb_storage
vboxnetadp vboxnetflt vboxdrv radeon ttm drm_kms_helper drm fb fbdev
i2c_algo_bit cfbcopyarea cfbimgblt cfbfillrect rfcomm sco bridge stp llc
bnep l2cap nfsd nfs_acl auth_rpcgss exportfs eeprom it87 hwmon_vid lockd
sunrpc powernow_k8 mperf ipt_REJECT iptable_filter ipt_MASQUERADE
iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables
nf_conntrack_netbios_ns ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables x_tables binfmt_misc dm_mirror
dm_region_hash dm_log ppdev snd_hda_codec_atihdmi snd_hda_codec_realtek
pwc snd_seq snd_seq_device videodev parport_pc snd_hda_intel
snd_hda_codec k10temp snd_pcm ohci1394 snd_timer btusb ieee1394 snd
v4l1_compat v4l2_compat_ioctl32 i2c_piix4 usblp evdev sg parport
bluetooth button snd_page_alloc sr_mod cdrom ide_pci_generic atiixp
pata_atiixp ehci_hcd ohci_hcd sata_sil24 floppy [last unloaded:
scsi_wait_scan]
Pid: 19679, comm: ip Tainted: G        W   2.6.35.7 #10
Call Trace:
 [<ffffffff8103e02b>] ? warn_slowpath_common+0x7b/0xc0
 [<ffffffff8137b7b6>] ? fib6_del+0x4f6/0x5a0
 [<ffffffff810386af>] ? load_balance+0xbf/0x6b0
 [<ffffffff813a5a16>] ? printk+0x40/0x4a
 [<ffffffff8137b8bc>] ? fib6_clean_node+0x5c/0xc0
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137acd6>] ? fib6_walk_continue+0x156/0x170
 [<ffffffff8137ad41>] ? fib6_walk+0x51/0xb0
 [<ffffffff813a8533>] ? _raw_write_lock_bh+0x13/0x30
 [<ffffffff8137b177>] ? fib6_clean_all+0x97/0xe0
 [<ffffffff8137b860>] ? fib6_clean_node+0x0/0xc0
 [<ffffffff81377130>] ? fib6_ifdown+0x0/0x30
 [<ffffffff8137a8e6>] ? rt6_ifdown+0x26/0xc0
 [<ffffffff81373b88>] ? addrconf_ifdown+0x48/0x4e0
 [<ffffffff81043cdd>] ? local_bh_enable_ip+0x4d/0xb0
 [<ffffffff81374405>] ? addrconf_notify+0xe5/0x910
 [<ffffffff8137b860>] ? fib6_clean_node+0x0/0xc0
 [<ffffffff8137a9e0>] ? fib6_age+0x0/0x80
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff8105acf6>] ? notifier_call_chain+0x46/0x70
 [<ffffffff812f91d5>] ? __dev_notify_flags+0x65/0x90
 [<ffffffff812f923b>] ? dev_change_flags+0x3b/0x70
 [<ffffffff81304f09>] ? do_setlink+0x189/0x740
 [<ffffffff811c1da0>] ? nla_parse+0x30/0x100
 [<ffffffff813061e9>] ? rtnl_newlink+0x3e9/0x4f0
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff813a84b2>] ? _raw_spin_unlock_irqrestore+0x12/0x40
 [<ffffffff812f28ba>] ? __skb_recv_datagram+0xca/0x280
 [<ffffffff81305c15>] ? rtnetlink_rcv_msg+0x85/0x270
 [<ffffffff81305b90>] ? rtnetlink_rcv_msg+0x0/0x270
 [<ffffffff813153e9>] ? netlink_rcv_skb+0x89/0xb0
 [<ffffffff81305b7f>] ? rtnetlink_rcv+0x1f/0x30
 [<ffffffff81314fe5>] ? netlink_unicast+0x2a5/0x2f0
 [<ffffffff8131592c>] ? netlink_sendmsg+0x1fc/0x300
 [<ffffffff812e559e>] ? sock_sendmsg+0xfe/0x110
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff8108e18d>] ? find_get_page+0x6d/0xd0
 [<ffffffff8108eb19>] ? filemap_fault+0x99/0x400
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff810a3f2a>] ? __do_fault+0x40a/0x520
 [<ffffffff812e7946>] ? move_addr_to_kernel+0x46/0x70
 [<ffffffff812f185a>] ? verify_iovec+0x6a/0xc0
 [<ffffffff812e58bb>] ? sys_sendmsg+0x23b/0x380
 [<ffffffff81034349>] ? get_parent_ip+0x9/0x20
 [<ffffffff81035bff>] ? sub_preempt_count+0x7f/0xb0
 [<ffffffff81026829>] ? do_page_fault+0x199/0x3b0
 [<ffffffff810ab0ca>] ? vma_link+0xaa/0x110
 [<ffffffff810ac5bd>] ? do_brk+0x2fd/0x330
 [<ffffffff8107a9d8>] ? audit_syscall_entry+0x1b8/0x1e0
 [<ffffffff810023eb>] ? system_call_fastpath+0x16/0x1b
---[ end trace 286af05228c1bc88 ]---
fib6_clean_node: del failed: rt=ffff8800ce662cc0@0100007f00000000 err=-2
fib6_clean_node: del failed: rt=ffff880028faad40@(null) err=-2
lo: Disabled Privacy Extensions
r8169 0000:03:00.0: eth0: link up
usb0: no IPv6 routers present
fib6_clean_node: del failed: rt=ffff880028faa5c0@(null) err=-2
fib6_clean_node: del failed: rt=ffff880028faa5c0@(null) err=-2
fib6_clean_node: del failed: rt=ffff880028faa5c0@(null) err=-2

----- End forwarded message -----

* Re: [PATCH 4/10] Fix leaking of kernel heap addresses in net/
From: Alexey Dobriyan @ 2010-11-12 20:18 UTC
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20101112083315.096dfaa3@nehalam>

On Fri, Nov 12, 2010 at 08:33:15AM -0800, Stephen Hemminger wrote:
> Also, the whole idea needs to be under a config option, so only
> the paranoid idiots turn it on.

It would be fun if something broke because ffff8800bcd498c0
became something else. :-)

* Re: [PATCH] dlm: Handle application limited situations properly.
From: David Teigland @ 2010-11-12 20:03 UTC
  To: David Miller; +Cc: ccaulfie, cluster-devel, netdev, linux-kernel
In-Reply-To: <20101110.215639.189706684.davem@davemloft.net>

On Wed, Nov 10, 2010 at 09:56:39PM -0800, David Miller wrote:
> 
> In the normal regime where an application uses non-blocking I/O
> writes on a socket, they will handle -EAGAIN and use poll() to
> wait for send space.
> 
> They don't actually sleep on the socket I/O write.
> 
> But kernel level RPC layers that do socket I/O operations directly
> and key off of -EAGAIN on the write() to "try again later" don't
> use poll(), they instead have their own sleeping mechanism and
> rely upon ->sk_write_space() to trigger the wakeup.
> 
> So they do effectively sleep on the write(), but this mechanism
> alone does not let the socket layers know what's going on.
> 
> Therefore they must emulate what would have happened, otherwise
> TCP cannot possibly see that the connection is application window
> size limited.
> 
> Handle this, therefore, like SUNRPC by setting SOCK_NOSPACE and
> bumping ->sk_write_pending as needed when we hit the send buffer
> limits.
> 
> This should make TCP send buffer size auto-tuning and the
> ->sk_write_space() callback invocations actually happen.

Thanks, pushed to
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm.git#next

* Re: [RFC PATCH] network: return errors if we know tcp_connect failed
From: David Miller @ 2010-11-12 19:28 UTC
  To: kuznet; +Cc: eparis, netdev, linux-kernel, pekkas, jmorris, yoshfuji, kaber
In-Reply-To: <20101112174620.GA16544@ms2.inr.ac.ru>

From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Date: Fri, 12 Nov 2010 20:46:20 +0300

> The only loophole is ICMP error in the same case as yours. In
> _violation_ of specs linux immediately aborts unestablished connect
> on an icmp error. IMHO that thing which you suggest is correct (of
> course, provided you filter out transient errors and react only to
> EPERM or something like this). It was not done because it was
> expected firewall rule prescribing immediate abort is configured
> with "--reject-with icmp-port-unreachable", otherwise the rule
> orders real blackhole.

The idea to signal on -EPERM might be OK, but if that's also
what things like "-m statistic" and friends end up reporting
then we still cannot do it.

^ permalink raw reply

* Re: a problem tcp_v4_err()
From: David Miller @ 2010-11-12 19:22 UTC (permalink / raw)
  To: eric.dumazet
  Cc: kuznet, kaber, equinox, eparis, hzhong, netdev, linux-kernel,
	pekkas, jmorris, yoshfuji, paul.moore, damian
In-Reply-To: <1289586803.3185.275.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 12 Nov 2010 19:33:23 +0100

> I CC Damian Lukowski in my previous answer (and this one too)

Probably the safest fix is this:

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 8f8527d..69ccbc1 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -415,6 +415,9 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
 		    !icsk->icsk_backoff)
 			break;
 
+		if (sock_owned_by_user(sk))
+			break;
+
 		icsk->icsk_backoff--;
 		inet_csk(sk)->icsk_rto = __tcp_set_rto(tp) <<
 					 icsk->icsk_backoff;
@@ -429,11 +432,6 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
 		if (remaining) {
 			inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
 						  remaining, TCP_RTO_MAX);
-		} else if (sock_owned_by_user(sk)) {
-			/* RTO revert clocked out retransmission,
-			 * but socket is locked. Will defer. */
-			inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
-						  HZ/20, TCP_RTO_MAX);
 		} else {
 			/* RTO revert clocked out retransmission.
 			 * Will retransmit now */
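
The effect of moving the sock_owned_by_user() test can be illustrated with a userspace toy model. This is a sketch only: toy_tcp_sock and toy_revert_backoff are hypothetical names, and the fields merely stand in for the backoff and RTO state touched in the diff above. The point it shows is the ordering: the ownership check comes before any state is modified, so a socket that is currently owned by a user context is left untouched.

```c
#include <stdbool.h>

/* Toy stand-in for the state involved. Illustrative names only. */
struct toy_tcp_sock {
    bool     owned_by_user; /* models sock_owned_by_user(sk) */
    unsigned backoff;       /* models the backoff counter */
    unsigned base_rto;      /* models the un-backed-off RTO */
    unsigned rto;           /* models the current RTO */
};

/* Returns true if the backoff was reverted by one step,
 * false if the function bailed out without changing anything. */
bool toy_revert_backoff(struct toy_tcp_sock *sk)
{
    if (!sk->backoff)
        return false;

    /* Check ownership before modifying any state. */
    if (sk->owned_by_user)
        return false;

    sk->backoff--;
    sk->rto = sk->base_rto << sk->backoff;
    return true;
}
```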

^ permalink raw reply related

* Re: [PATCH] NET: sunrpc, remove unneeded NULL tests
From: J. Bruce Fields @ 2010-11-12 19:14 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: davem-fT/PcQaiUtIeIZ0/mPfg9Q, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	jirislaby-Re5JQEeQqe8AvxtiuMwx3w, Neil Brown,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA, Trond Myklebust
In-Reply-To: <1288180325-20009-1-git-send-email-jslaby-AlSwsSmVLrQ@public.gmane.org>

Sorry for the slow response, this one's for Trond if it hasn't already
been handled.

--b.

On Wed, Oct 27, 2010 at 01:52:05PM +0200, Jiri Slaby wrote:
> Stanse found that req in xprt_reserve_xprt is dereferenced prior to
> its test against NULL. If that's the case, the checks are unnecessary,
> so remove them.
> 
> The alternative is not to dereference it before the test. The patch
> is to point out the problem, you have to decide.
> 
> Signed-off-by: Jiri Slaby <jslaby-AlSwsSmVLrQ@public.gmane.org>
> Cc: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
> Cc: Neil Brown <neilb-l3A5Bk7waGM@public.gmane.org>
> Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> ---
>  net/sunrpc/xprt.c |    8 +++-----
>  1 files changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 4c8f18a..5355c71 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -202,10 +202,8 @@ int xprt_reserve_xprt(struct rpc_task *task)
>  		goto out_sleep;
>  	}
>  	xprt->snd_task = task;
> -	if (req) {
> -		req->rq_bytes_sent = 0;
> -		req->rq_ntrans++;
> -	}
> +	req->rq_bytes_sent = 0;
> +	req->rq_ntrans++;
>  	return 1;
>  
>  out_sleep:
> @@ -213,7 +211,7 @@ out_sleep:
>  			task->tk_pid, xprt);
>  	task->tk_timeout = 0;
>  	task->tk_status = -EAGAIN;
> -	if (req && req->rq_ntrans)
> +	if (req->rq_ntrans)
>  		rpc_sleep_on(&xprt->resend, task, NULL);
>  	else
>  		rpc_sleep_on(&xprt->sending, task, NULL);
> -- 
> 1.7.3.1
> 
> 

^ permalink raw reply

* Re: [PATCH] atomic: add atomic_inc_not_zero_hint()
From: Christoph Lameter @ 2010-11-12 19:14 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Eric Dumazet, Andrew Morton, linux-kernel, David Miller, netdev,
	Arnaldo Carvalho de Melo, Ingo Molnar, Andi Kleen, Nick Piggin
In-Reply-To: <20101105195101.GC15561@linux.vnet.ibm.com>


Would prefetchw() be too much overhead?

^ permalink raw reply

* Re: [PATCH 0/2] netfilter: netfilter fixes
From: David Miller @ 2010-11-12 19:07 UTC (permalink / raw)
  To: kaber; +Cc: netfilter-devel, netdev
In-Reply-To: <1289575172-7272-1-git-send-email-kaber@trash.net>

From: kaber@trash.net
Date: Fri, 12 Nov 2010 16:19:30 +0100

> Hi Dave,
> 
> The following two patches fix some netfilter bugs:
> 
> - missing parentheses in NF_HOOK_COND, breaking error propagation for
>   dropped packets. From Eric Paris.
> 
> - incorrect checking for overlapping fragments in IPv6 conntrack
>   reassembly. From Shan Wei
> 
> Please apply or pull from:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6.git

Pulled, thanks Patrick.
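
The "missing parentheses" class of bug mentioned in the first item can be sketched in plain C. This is a generic illustration, assuming the problem is the usual precedence of == over =; toy_hook and the two callers are hypothetical and are not the NF_HOOK_COND macro itself. Without the inner parentheses the variable receives the result of the comparison (0 or 1) instead of the hook's return value, so an error code is lost.

```c
#include <errno.h>

/* Hypothetical hook: returns 1 to continue, or a negative errno
 * when the packet is dropped. */
int toy_hook(int verdict)
{
    return verdict;
}

/* Parsed as ret = (toy_hook(verdict) == 1), so ret is only ever 0 or 1
 * and a negative errno from the hook never reaches the caller. */
int call_without_parens(int verdict)
{
    int ret;

    if ((ret = toy_hook(verdict) == 1))
        return 100; /* stands in for running the continuation */
    return ret;
}

/* The assignment happens first, then the comparison, so the hook's
 * error code is propagated. */
int call_with_parens(int verdict)
{
    int ret;

    if ((ret = toy_hook(verdict)) == 1)
        return 100; /* stands in for running the continuation */
    return ret;
}
```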

^ permalink raw reply

* Re: [PATCH] rtnetlink: Fix message size calculation for link messages
From: David Miller @ 2010-11-12 18:53 UTC (permalink / raw)
  To: kaber; +Cc: netdev
In-Reply-To: <4CDCF032.7040802@trash.net>

From: Patrick McHardy <kaber@trash.net>
Date: Fri, 12 Nov 2010 08:43:46 +0100

> On 12.11.2010 02:47, Thomas Graf wrote:
>> nlmsg_total_size() calculates the length of a netlink message
>> including header and alignment. nla_total_size() calculates the
>> space an individual attribute consumes which was meant to be used
>> in this context.
>> 
>> Also, ensure to account for the attribute header for the
>> IFLA_INFO_XSTATS attribute as implementations of get_xstats_size()
>> seem to assume that we do so.
>> 
>> The addition of two message headers minus the missing attribute
>> header resulted in a calculated message size that was larger than
>> required. Therefore we never risked running out of skb tailroom.
>> 
>> Signed-off-by: Thomas Graf <tgraf@infradead.org>
>> Cc: Patrick McHardy <kaber@trash.net>
> 
> Looks good to me, thanks Thomas.
> 
> Acked-by: Patrick McHardy <kaber@trash.net>

Applied, thanks.
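
The difference between the two size helpers discussed above can be worked through with a small userspace sketch. The TOY_ constants and toy_ functions are stand-ins written for this example, assuming the usual 4-byte alignment, a 16-byte netlink message header and a 4-byte attribute header; they are not the kernel's definitions.

```c
#include <stddef.h>

/* Assumed layout values for this sketch. */
#define TOY_ALIGNTO      ((size_t)4)
#define TOY_ALIGN(len)   (((len) + TOY_ALIGNTO - 1) & ~(TOY_ALIGNTO - 1))
#define TOY_NLMSG_HDRLEN ((size_t)16) /* message header, already aligned */
#define TOY_NLA_HDRLEN   ((size_t)4)  /* attribute header, already aligned */

/* Space for a whole message: message header + payload, padded. */
size_t toy_nlmsg_total_size(size_t payload)
{
    return TOY_ALIGN(TOY_NLMSG_HDRLEN + payload);
}

/* Space for one attribute: attribute header + payload, padded. */
size_t toy_nla_total_size(size_t payload)
{
    return TOY_ALIGN(TOY_NLA_HDRLEN + payload);
}
```

Under these assumptions, sizing an 8-byte attribute with the message helper reserves 24 bytes where 12 are needed, which matches the changelog's remark that the old calculation overestimated rather than underestimated.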

^ permalink raw reply

* Re: a problem tcp_v4_err()
From: Eric Dumazet @ 2010-11-12 18:33 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Patrick McHardy, David Lamparter, Eric Paris, Hua Zhong, netdev,
	linux-kernel, davem, pekkas, jmorris, yoshfuji, paul.moore,
	Damian Lukowski
In-Reply-To: <20101112182959.GA20459@ms2.inr.ac.ru>

On Friday, 12 November 2010 at 21:29 +0300, Alexey Kuznetsov wrote:
> Hello!
> 
> On Fri, Nov 12, 2010 at 07:12:58PM +0100, Eric Dumazet wrote:
> > I see socket is locked around line 368,
> > 
> >         bh_lock_sock(sk);
> >         /* If too many ICMPs get dropped on busy
> >          * servers this needs to be solved differently.
> >          */
> >         if (sock_owned_by_user(sk))
> >                 NET_INC_STATS_BH(net, LINUX_MIB_LOCKDROPPEDICMPS);
> > 
> > 
> > Hmm, maybe some goto is missing ;)
> 
> It is not missing; sock_owned_by_user() is checked later, when some operation which
> cannot be done without the lock is required. It was done to save the error in sk_err_soft
> even when the socket is locked.
> 
> This code also _understands_ this: look at
> 
>                 } else if (sock_owned_by_user(sk)) {
>                         /* RTO revert clocked out retransmission,
>                          * but socket is locked. Will defer. */
>                         inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
>                                                   HZ/20, TCP_RTO_MAX);
> 
> but somehow it considers the manipulations with rto/backoff/write_queue as safe.
> Seems, they are not.

Indeed, right you are, I came to the same conclusion.

I CC Damian Lukowski in my previous answer (and this one too)

Thanks !

^ permalink raw reply

* Re: [PATCH 4/10] Fix leaking of kernel heap addresses in net/
From: Stephen Hemminger @ 2010-11-12 18:33 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Eric Dumazet, David Miller, socketcan, kuznet, urs.thuermann,
	yoshfuji, kaber, jmorris, remi.denis-courmont, pekkas, sri,
	vladislav.yasevich, tj, lizf, joe, hadi, ebiederm, adobriyan,
	jpirko, johannes.berg, daniel.lezcano, xemul, socketcan-core,
	netdev, linux-sctp, torvalds
In-Reply-To: <1289582682.3090.323.camel@Dan>

On Fri, 12 Nov 2010 12:24:42 -0500
Dan Rosenberg <drosenberg@vsecurity.com> wrote:

> 
> > 
> > Also, the whole idea needs to be under a config option, so only
> > the paranoid idiots turn it on.
> 
> If that's what's necessary to get it accepted, I'm willing to do that.
> But when a solution does not negatively impact usability or performance
> and improves security, even in a small way, why should it not be enabled
> by default?  Of course it's my responsibility to first propose a
> solution that is acceptable from a usability/debugging standpoint, but
> assuming that can be achieved, I don't really see what the problem is.
> There's a difference between being a "paranoid idiot" and wanting to
> protect users from unnecessary exposure.

See the earlier discussion about automatically running crypto tests on boot,
which caused Linus to flame. This is more intrusive, and is not something
most developers would want; but it might make sense in a production
environment.


^ permalink raw reply

* Re: a problem tcp_v4_err()
From: Alexey Kuznetsov @ 2010-11-12 18:31 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Patrick McHardy, David Lamparter, Eric Paris, Hua Zhong, netdev,
	linux-kernel, davem, pekkas, jmorris, yoshfuji, paul.moore,
	Damian Lukowski
In-Reply-To: <1289586477.3185.273.camel@edumazet-laptop>

Hello!

> Oh well, it seems you are right (backlog processing)

Exactly.

Alexey

^ permalink raw reply

* Re: a problem tcp_v4_err()
From: Alexey Kuznetsov @ 2010-11-12 18:29 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Patrick McHardy, David Lamparter, Eric Paris, Hua Zhong, netdev,
	linux-kernel, davem, pekkas, jmorris, yoshfuji, paul.moore
In-Reply-To: <1289585578.3185.268.camel@edumazet-laptop>

Hello!

On Fri, Nov 12, 2010 at 07:12:58PM +0100, Eric Dumazet wrote:
> I see socket is locked around line 368,
> 
>         bh_lock_sock(sk);
>         /* If too many ICMPs get dropped on busy
>          * servers this needs to be solved differently.
>          */
>         if (sock_owned_by_user(sk))
>                 NET_INC_STATS_BH(net, LINUX_MIB_LOCKDROPPEDICMPS);
> 
> 
> Hmm, maybe some goto is missing ;)

It is not missing; sock_owned_by_user() is checked later, when some operation which
cannot be done without the lock is required. It was done to save the error in sk_err_soft
even when the socket is locked.

This code also _understands_ this: look at

                } else if (sock_owned_by_user(sk)) {
                        /* RTO revert clocked out retransmission,
                         * but socket is locked. Will defer. */
                        inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
                                                  HZ/20, TCP_RTO_MAX);

but somehow it considers the manipulations with rto/backoff/write_queue as safe.
Seems, they are not.

Alexey

^ permalink raw reply

* Re: a problem tcp_v4_err()
From: Eric Dumazet @ 2010-11-12 18:27 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Patrick McHardy, David Lamparter, Eric Paris, Hua Zhong, netdev,
	linux-kernel, davem, pekkas, jmorris, yoshfuji, paul.moore,
	Damian Lukowski
In-Reply-To: <1289586114.3185.271.camel@edumazet-laptop>

On Friday, 12 November 2010 at 19:21 +0100, Eric Dumazet wrote:
> On Friday, 12 November 2010 at 19:12 +0100, Eric Dumazet wrote:
> > On Friday, 12 November 2010 at 20:57 +0300, Alexey Kuznetsov wrote:
> > > Hello!
> > > 
> > > I looked at tcp_v4_err() and found something strange. Quite non-trivial operations
> > > are performed on unlocked sockets. It looks like at least this BUG_ON():
> > > 
> > >                 skb = tcp_write_queue_head(sk);
> > >                 BUG_ON(!skb);
> > > 
> > > can be easily triggered.
> > > 
> > > Do I miss something?
> > > 
> > 
> > Hi Alexey !
> > 
> > I see socket is locked around line 368,
> > 
> >         bh_lock_sock(sk);
> >         /* If too many ICMPs get dropped on busy
> >          * servers this needs to be solved differently.
> >          */
> >         if (sock_owned_by_user(sk))
> >                 NET_INC_STATS_BH(net, LINUX_MIB_LOCKDROPPEDICMPS);
> > 
> > 
> > Hmm, maybe some goto is missing ;)
> > 
> 
> Well, goto is not missing.
> 
> Why do you think BUG_ON(!skb) can be triggered ?
> 
> We test before :
> 
> 	if (seq != tp->snd_una  || !icsk->icsk_retransmits ||
> 		!icsk->icsk_backoff)
> 		break;
> 
> So a concurrent user only can add new skb(s) in the (non empty) queue ?
> 
> 

Oh well, it seems you are right (backlog processing)

The bug was introduced in commit f1ecd5d9e736660 (Revert Backoff [v3]:
Revert RTO on ICMP destination unreachable) from Damian Lukowski.

^ permalink raw reply

* Re: a problem tcp_v4_err()
From: Eric Dumazet @ 2010-11-12 18:21 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Patrick McHardy, David Lamparter, Eric Paris, Hua Zhong, netdev,
	linux-kernel, davem, pekkas, jmorris, yoshfuji, paul.moore
In-Reply-To: <1289585578.3185.268.camel@edumazet-laptop>

On Friday, 12 November 2010 at 19:12 +0100, Eric Dumazet wrote:
> On Friday, 12 November 2010 at 20:57 +0300, Alexey Kuznetsov wrote:
> > Hello!
> > 
> > I looked at tcp_v4_err() and found something strange. Quite non-trivial operations
> > are performed on unlocked sockets. It looks like at least this BUG_ON():
> > 
> >                 skb = tcp_write_queue_head(sk);
> >                 BUG_ON(!skb);
> > 
> > can be easily triggered.
> > 
> > Do I miss something?
> > 
> 
> Hi Alexey !
> 
> I see socket is locked around line 368,
> 
>         bh_lock_sock(sk);
>         /* If too many ICMPs get dropped on busy
>          * servers this needs to be solved differently.
>          */
>         if (sock_owned_by_user(sk))
>                 NET_INC_STATS_BH(net, LINUX_MIB_LOCKDROPPEDICMPS);
> 
> 
> Hmm, maybe some goto is missing ;)
> 

Well, goto is not missing.

Why do you think BUG_ON(!skb) can be triggered ?

We test before :

	if (seq != tp->snd_una  || !icsk->icsk_retransmits ||
		!icsk->icsk_backoff)
		break;

So a concurrent user only can add new skb(s) in the (non empty) queue ?

^ permalink raw reply

* Re: a problem tcp_v4_err()
From: Eric Dumazet @ 2010-11-12 18:12 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Patrick McHardy, David Lamparter, Eric Paris, Hua Zhong, netdev,
	linux-kernel, davem, pekkas, jmorris, yoshfuji, paul.moore
In-Reply-To: <20101112175715.GB16544@ms2.inr.ac.ru>

On Friday, 12 November 2010 at 20:57 +0300, Alexey Kuznetsov wrote:
> Hello!
> 
> I looked at tcp_v4_err() and found something strange. Quite non-trivial operations
> are performed on unlocked sockets. It looks like at least this BUG_ON():
> 
>                 skb = tcp_write_queue_head(sk);
>                 BUG_ON(!skb);
> 
> can be easily triggered.
> 
> Do I miss something?
> 

Hi Alexey !

I see socket is locked around line 368,

        bh_lock_sock(sk);
        /* If too many ICMPs get dropped on busy
         * servers this needs to be solved differently.
         */
        if (sock_owned_by_user(sk))
                NET_INC_STATS_BH(net, LINUX_MIB_LOCKDROPPEDICMPS);


Hmm, maybe some goto is missing ;)

^ permalink raw reply

* a problem tcp_v4_err()
From: Alexey Kuznetsov @ 2010-11-12 17:57 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: David Lamparter, Eric Dumazet, Eric Paris, Hua Zhong, netdev,
	linux-kernel, davem, pekkas, jmorris, yoshfuji, paul.moore
In-Reply-To: <4CDD7145.8070606@trash.net>

Hello!

I looked at tcp_v4_err() and found something strange. Quite non-trivial operations
are performed on unlocked sockets. It looks like at least this BUG_ON():

                skb = tcp_write_queue_head(sk);
                BUG_ON(!skb);

can be easily triggered.

Do I miss something?

Alexey

^ permalink raw reply

* Re: [RFC PATCH] network: return errors if we know tcp_connect failed
From: Alexey Kuznetsov @ 2010-11-12 17:46 UTC (permalink / raw)
  To: Eric Paris; +Cc: netdev, linux-kernel, davem, pekkas, jmorris, yoshfuji, kaber
In-Reply-To: <20101111210341.31350.86916.stgit@paris.rdu.redhat.com>

Hello!

On Thu, Nov 11, 2010 at 04:03:41PM -0500, Eric Paris wrote:
> immediately when it calls connect().  Is this wrong?  Is this bad to tell
> userspace more quickly what happened?  Does passing this error code back up
> the stack here break something else?  Why do some functions seem to pay
> attention to tcp_transmit_skb() return codes and some functions just ignore
> it?

Essentially, the return value of tcp_transmit_skb() is always ignored.
It is used only for accounting and for some optimization of retransmission behaviour.
Generally, TCP does not react to errors coming from outside the TCP protocol.

The only loophole is ICMP error in the same case as yours. In _violation_ of specs
linux immediately aborts unestablished connect on an icmp error. IMHO that thing
which you suggest is correct (of course, provided you filter out transient errors and react only
to EPERM or something like this). It was not done because it was expected
firewall rule prescribing immediate abort is configured with "--reject-with icmp-port-unreachable",
otherwise the rule orders real blackhole.
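
The filtering idea in the paragraph above can be sketched as a small policy function. This is a hypothetical illustration: toy_is_permanent_send_error and toy_connect_result are names invented for this example, and treating only -EPERM as permanent is an assumption taken from the "react only to EPERM or something like this" remark. The thread leaves open whether -EPERM is specific enough, so this is not a proposed kernel change.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical policy: only -EPERM counts as a permanent failure.
 * Everything else (for example -ENOBUFS or congestion-style errors)
 * is treated as transient. */
bool toy_is_permanent_send_error(int err)
{
    switch (err) {
    case -EPERM:
        return true;
    default:
        return false;
    }
}

/* Maps the result of sending the first SYN to what a connect-like call
 * would report: permanent errors are passed up, transient ones are
 * swallowed so that the normal retransmission path can retry. */
int toy_connect_result(int xmit_err)
{
    return toy_is_permanent_send_error(xmit_err) ? xmit_err : 0;
}
```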

Alexey

^ permalink raw reply

* Re: [PATCH 4/10] Fix leaking of kernel heap addresses in net/
From: Dan Rosenberg @ 2010-11-12 17:24 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Eric Dumazet, David Miller, socketcan, kuznet, urs.thuermann,
	yoshfuji, kaber, jmorris, remi.denis-courmont, pekkas, sri,
	vladislav.yasevich, tj, lizf, joe, hadi, ebiederm, adobriyan,
	jpirko, johannes.berg, daniel.lezcano, xemul, socketcan-core,
	netdev, linux-sctp, torvalds
In-Reply-To: <20101112083315.096dfaa3@nehalam>

> 
> Also, the whole idea needs to be under a config option, so only
> the paranoid idiots turn it on.

If that's what's necessary to get it accepted, I'm willing to do that.
But when a solution does not negatively impact usability or performance
and improves security, even in a small way, why should it not be enabled
by default?  Of course it's my responsibility to first propose a
solution that is acceptable from a usability/debugging standpoint, but
assuming that can be achieved, I don't really see what the problem is.
There's a difference between being a "paranoid idiot" and wanting to
protect users from unnecessary exposure.

-Dan

^ permalink raw reply

* Re: [RFC PATCH] network: return errors if we know tcp_connect failed
From: Patrick McHardy @ 2010-11-12 16:54 UTC (permalink / raw)
  To: David Lamparter
  Cc: Eric Dumazet, Eric Paris, Hua Zhong, netdev, linux-kernel, davem,
	kuznet, pekkas, jmorris, yoshfuji, paul.moore
In-Reply-To: <20101112163543.GB122902@jupiter.n2.diac24.net>

On 12.11.2010 17:35, David Lamparter wrote:
> On Fri, Nov 12, 2010 at 05:15:32PM +0100, Eric Dumazet wrote:
>> On Friday, 12 November 2010 at 11:08 -0500, Eric Paris wrote:
>>
>>> 2) What should the generic TCP code (tcp_connect()) do if the skb failed
>>> to send.  Should it return error codes back up the stack somehow or
>>> should they continue to be ignored?  Obviously continuing to just ignore
>>> information we have doesn't make me happy (otherwise I wouldn't have
>>> started scratching this itch).  But the point about ENOBUFS is well
>>> taken.  Maybe I should make tcp_connect(), or the caller to
>>> tcp_connect() more intelligent about specific error codes?
>>>
>>> I'm looking for a path forward.  If SELinux is rejecting the SYN packets
>>> on connect() I want to pass that info to userspace rather than just
>>> hanging.  What's the best way to accomplish that?
>>>
>>
>> Eric, if you can differentiate a permanent reject, instead of a
>> temporary one (congestion, or rate limiting, or ENOBUF, or ...), then
>> yes, you could make tcp_connect() report to user the permanent error,
>> and ignore the temporary one.

Indeed. We could even make the NF_DROP return value configurable
by encoding it in the verdict.

> If the netfilter targets DROP/REJECT match the NF_DROP/NF_REJECT
> counterparts, which i guess they do but i didn't read the source ;),
> then SELinux should use NF_REJECT in my opinion.

There is no NF_REJECT.

> NF_DROP does exactly what the name says, it drops the packet aka
> basically puts it in /dev/null. As with writing to /dev/null, you don't
> get an error for that. Even more, if in the meantime the DROP rule does
> not match anymore, the 2nd or 3rd SYN from the connect() can come
> through and establish a connection (think of "-m statistic" & co.)
> 
> This is very different from REJECT.

Returning NF_DROP results in -EPERM getting reported back. As Eric
noticed, this is ignored for SYN packets.

> If REJECT doesn't immediately get reported to the application, that *is*
> a bug, but last time i checked i got EPERM immediately. I would fix
> SELinux to use the same mechanism.

NF_DROP returns -EPERM, the REJECT targets send packets to reject
a connection. Whether this is reported immediately depends on the
error and the protocol in question. Using a TCP reset immediately
resets the connection.

^ permalink raw reply

* Re: [RFC PATCH] network: return errors if we know tcp_connect failed
From: Eric Paris @ 2010-11-12 16:53 UTC (permalink / raw)
  To: David Lamparter
  Cc: Eric Dumazet, Hua Zhong, netdev, linux-kernel, davem, kuznet,
	pekkas, jmorris, yoshfuji, kaber, paul.moore
In-Reply-To: <20101112163543.GB122902@jupiter.n2.diac24.net>

On Fri, 2010-11-12 at 17:35 +0100, David Lamparter wrote:
> On Fri, Nov 12, 2010 at 05:15:32PM +0100, Eric Dumazet wrote:
> > On Friday, 12 November 2010 at 11:08 -0500, Eric Paris wrote:
> > 
> > > 2) What should the generic TCP code (tcp_connect()) do if the skb failed
> > > to send.  Should it return error codes back up the stack somehow or
> > > should they continue to be ignored?  Obviously continuing to just ignore
> > > information we have doesn't make me happy (otherwise I wouldn't have
> > > started scratching this itch).  But the point about ENOBUFS is well
> > > taken.  Maybe I should make tcp_connect(), or the caller to
> > > tcp_connect() more intelligent about specific error codes?
> > > 
> > > I'm looking for a path forward.  If SELinux is rejecting the SYN packets
> > > on connect() I want to pass that info to userspace rather than just
> > > hanging.  What's the best way to accomplish that?
> > > 
> > 
> > Eric, if you can differentiate a permanent reject, instead of a
> > temporary one (congestion, or rate limiting, or ENOBUF, or ...), then
> > yes, you could make tcp_connect() report to user the permanent error,
> > and ignore the temporary one.
> 
> If the netfilter targets DROP/REJECT match the NF_DROP/NF_REJECT
> counterparts, which i guess they do but i didn't read the source ;),
> then SELinux should use NF_REJECT in my opinion.

As it stands today there is no NF_REJECT.  NF_DROP is the only (related)
permitted return value from a netfilter hook.  Maybe I need to change
that fact though.

> NF_DROP does exactly what the name says, it drops the packet aka
> basically puts it in /dev/null. As with writing to /dev/null, you don't
> get an error for that. Even more, if in the meantime the DROP rule does
> not match anymore, the 2nd or 3rd SYN from the connect() can come
> through and establish a connection (think of "-m statistic" & co.)
> 
> This is very different from REJECT.
> 
> If REJECT doesn't immediately get reported to the application, that *is*
> a bug, but last time i checked i got EPERM immediately. I would fix
> SELinux to use the same mechanism.

I haven't looked at what -j REJECT does (or was intended to do) but it
most certainly does not return an error to sys_connect().  Try it out.

iptables -A OUTPUT -p tcp --dport 80 -j REJECT
links www.google.com

it just hangs on 'making connection'  (exact same for -j DROP)

If everyone agrees that's the wrong behavior (for -j REJECT) I'll work
on fixing that (however is appropriate) and will change the SELinux code
if needed after we've fixed the -j REJECT code.  Obviously there are
problems with my original way to fix the lack of error returns (namely
that I would immediately return EACCES for DROP as well as REJECT).

I'm glad to hear that others seem to believe the current code is buggy
and I'm not completely off my rocker to think that applications should
be able to learn somehow that things fell down...

-Eric

^ permalink raw reply

* Re: [RFC PATCH] network: return errors if we know tcp_connect failed
From: David Lamparter @ 2010-11-12 16:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Paris, Hua Zhong, netdev, linux-kernel, davem, kuznet,
	pekkas, jmorris, yoshfuji, kaber, paul.moore
In-Reply-To: <1289578532.3185.265.camel@edumazet-laptop>

On Fri, Nov 12, 2010 at 05:15:32PM +0100, Eric Dumazet wrote:
> On Friday, 12 November 2010 at 11:08 -0500, Eric Paris wrote:
> 
> > 2) What should the generic TCP code (tcp_connect()) do if the skb failed
> > to send.  Should it return error codes back up the stack somehow or
> > should they continue to be ignored?  Obviously continuing to just ignore
> > information we have doesn't make me happy (otherwise I wouldn't have
> > started scratching this itch).  But the point about ENOBUFS is well
> > taken.  Maybe I should make tcp_connect(), or the caller to
> > tcp_connect() more intelligent about specific error codes?
> > 
> > I'm looking for a path forward.  If SELinux is rejecting the SYN packets
> > on connect() I want to pass that info to userspace rather than just
> > hanging.  What's the best way to accomplish that?
> > 
> 
> Eric, if you can differentiate a permanent reject, instead of a
> temporary one (congestion, or rate limiting, or ENOBUF, or ...), then
> yes, you could make tcp_connect() report to user the permanent error,
> and ignore the temporary one.

If the netfilter targets DROP/REJECT match the NF_DROP/NF_REJECT
counterparts, which i guess they do but i didn't read the source ;),
then SELinux should use NF_REJECT in my opinion.

NF_DROP does exactly what the name says, it drops the packet aka
basically puts it in /dev/null. As with writing to /dev/null, you don't
get an error for that. Even more, if in the meantime the DROP rule does
not match anymore, the 2nd or 3rd SYN from the connect() can come
through and establish a connection (think of "-m statistic" & co.)

This is very different from REJECT.

If REJECT doesn't immediately get reported to the application, that *is*
a bug, but last time i checked i got EPERM immediately. I would fix
SELinux to use the same mechanism.

-David

^ permalink raw reply
