* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Eric Dumazet @ 2011-09-08 5:59 UTC (permalink / raw)
To: Tim Chen
Cc: Yan, Zheng, sedat.dilek@gmail.com, Yan, Zheng,
netdev@vger.kernel.org, davem@davemloft.net, sfr@canb.auug.org.au,
jirislaby@gmail.com, Shi, Alex, Valdis Kletnieks
In-Reply-To: <1315429583.2361.3.camel@schen9-mobl>
Le mercredi 07 septembre 2011 à 14:06 -0700, Tim Chen a écrit :
> On Thu, 2011-09-08 at 08:27 +0800, Yan, Zheng wrote:
>
> > > err = -EPIPE;
> > > out_err:
> > > - if (skb == NULL)
> > > + if (!steal_refs)
> > > scm_destroy(siocb->scm);
> >
> > I think we should call scm_release() here in the case of
> > steal_refs == true. Otherwise siocb->scm->fp may leak.
>
> Yan Zheng,
>
> I've updated the patch. If it looks good to you now, can you add your
> Signed-off-by to the patch. Pending Sedat's testing on this patch,
> I think it is good to go.
>
> Tim
>
> ---
> Commit 0856a30409 (Scm: Remove unnecessary pid & credential references
> in Unix socket's send and receive path) introduced a use-after-free bug.
> The sent skbs from unix_stream_sendmsg could be consumed and destructed
> by the receive side, removing all references to the credentials,
> before the send side has finished sending out all
> packets. However, send side could continue to consturct new packets in the
> stream, using credentials that have lost its last reference and been
> freed.
>
> In this fix, we don't steal the reference to credentials we have obtained
> in scm_send at beginning of unix_stream_sendmsg, till we've reached
> the last packet. This fixes the problem in commit 0856a30409.
>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> Reported-by: Jiri Slaby <jirislaby@gmail.com>
> Tested-by: Sedat Dilek <sedat.dilek@googlemail.com>
> Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
> ---
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 136298c..47780dc 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -1383,10 +1383,11 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb)
> }
>
> static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb,
> - bool send_fds, bool ref)
> + bool send_fds, bool steal_refs)
> {
> int err = 0;
> - if (ref) {
> +
> + if (!steal_refs) {
> UNIXCB(skb).pid = get_pid(scm->pid);
> UNIXCB(skb).cred = get_cred(scm->cred);
> } else {
> @@ -1458,7 +1459,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
> if (skb == NULL)
> goto out;
>
> - err = unix_scm_to_skb(siocb->scm, skb, true, false);
> + err = unix_scm_to_skb(siocb->scm, skb, true, true);
> if (err < 0)
> goto out_free;
> max_level = err + 1;
> @@ -1581,6 +1582,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
> int sent = 0;
> struct scm_cookie tmp_scm;
> bool fds_sent = false;
> + bool steal_refs = false;
> int max_level;
>
> if (NULL == siocb->scm)
> @@ -1642,11 +1644,14 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
> size = min_t(int, size, skb_tailroom(skb));
>
>
> - /* Only send the fds and no ref to pid in the first buffer */
> - err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, fds_sent);
> + /* Only send the fds in first buffer
> + * Last buffer can steal our references to pid/cred
> + */
> + steal_refs = (sent + size >= len);
> + err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, steal_refs);
> if (err < 0) {
> kfree_skb(skb);
> - goto out;
> + goto out_err;
> }
> max_level = err + 1;
> fds_sent = true;
> @@ -1654,7 +1659,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
> err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
> if (err) {
> kfree_skb(skb);
> - goto out;
> + goto out_err;
> }
>
> unix_state_lock(other);
> @@ -1671,7 +1676,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
> sent += size;
> }
>
> - if (skb)
> + if (steal_refs)
> scm_release(siocb->scm);
> else
> scm_destroy(siocb->scm);
> @@ -1687,9 +1692,10 @@ pipe_err:
> send_sig(SIGPIPE, current, 0);
> err = -EPIPE;
> out_err:
> - if (skb == NULL)
> + if (steal_refs)
> + scm_release(siocb->scm);
> + else
> scm_destroy(siocb->scm);
> -out:
> siocb->scm = NULL;
> return sent ? : err;
> }
>
>
>
I dont think this patch is good.
Sedat traces have nothing to do with af_unix.
Once unix_scm_to_skb() was called and successful, and steal_refs is true
our refs are attached to this skb. They will be released by
skb_free(skb). Same for fp : They either were sent in a previous skb or
this one.
This is why the "goto out;" was OK.
^ permalink raw reply
* Re: [RFC] interface for outgoing packet
From: Eric Dumazet @ 2011-09-08 6:09 UTC (permalink / raw)
To: Viral Mehta; +Cc: netdev
In-Reply-To: <CANX6DanC8DJsAZj4dV3oZo3DnLV6AyxBDm82vHaJm3C51aRqEg@mail.gmail.com>
Le mercredi 07 septembre 2011 à 20:24 -0400, Viral Mehta a écrit :
> Hi All,
>
> How reasonably I can assume that,
>
> on whatever interface, I received the last packet
> for a particular Socket Connection, the same interface will be
> used to SEND the next packet for same socket connection?
>
No
> I am collecting the logs on one of my test machines.
> And it shows, I can assume all the time that "interface" which received packet
> for some socket connection, the same will be used to send packet for
> that connection
>
> But, I am not sure if I am missing some scenarios
> or setup (for e.g., bonding) where it can be wrong ?
>
> It would be more help if some one can shed some light.
By default, a socket is not bound to one interface, so you cant assume
this.
Check SO_BINDTODEVICE socket option if you need it.
^ permalink raw reply
* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Eric Dumazet @ 2011-09-08 6:21 UTC (permalink / raw)
To: Tim Chen
Cc: davem@davemloft.net, sedat.dilek@gmail.com, Yan, Zheng,
netdev@vger.kernel.org, sfr@canb.auug.org.au, jirislaby@gmail.com,
Shi, Alex, Valdis Kletnieks, Yan, Zheng
In-Reply-To: <1315430115.2361.11.camel@schen9-mobl>
Le mercredi 07 septembre 2011 à 14:15 -0700, Tim Chen a écrit :
> Oops, I've forgotten to add Eric's previous Signed-off-by in my latest
> patch log. David, please add that when you pick up the patch.
> Once Yan Zheng added his sign off and Sedat tested the patch, I think it
> will be good to go.
Tim, as soon as you post another patch version, you must remove all
prior Signed-off-by, and only add yours.
So it was fine to do so.
By the way your last version introduce a new bug, so I would rather NACK
it ;)
^ permalink raw reply
* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Yan, Zheng @ 2011-09-08 6:22 UTC (permalink / raw)
To: Eric Dumazet
Cc: Tim Chen, sedat.dilek@gmail.com, Yan, Zheng,
netdev@vger.kernel.org, davem@davemloft.net, sfr@canb.auug.org.au,
jirislaby@gmail.com, Shi, Alex, Valdis Kletnieks
In-Reply-To: <1315461572.2532.7.camel@edumazet-laptop>
On 09/08/2011 01:59 PM, Eric Dumazet wrote:
> Le mercredi 07 septembre 2011 à 14:06 -0700, Tim Chen a écrit :
>> On Thu, 2011-09-08 at 08:27 +0800, Yan, Zheng wrote:
>>
>>>> err = -EPIPE;
>>>> out_err:
>>>> - if (skb == NULL)
>>>> + if (!steal_refs)
>>>> scm_destroy(siocb->scm);
>>>
>>> I think we should call scm_release() here in the case of
>>> steal_refs == true. Otherwise siocb->scm->fp may leak.
>>
>> Yan Zheng,
>>
>> I've updated the patch. If it looks good to you now, can you add your
>> Signed-off-by to the patch. Pending Sedat's testing on this patch,
>> I think it is good to go.
>>
>> Tim
>>
>> ---
>> Commit 0856a30409 (Scm: Remove unnecessary pid & credential references
>> in Unix socket's send and receive path) introduced a use-after-free bug.
>> The sent skbs from unix_stream_sendmsg could be consumed and destructed
>> by the receive side, removing all references to the credentials,
>> before the send side has finished sending out all
>> packets. However, send side could continue to consturct new packets in the
>> stream, using credentials that have lost its last reference and been
>> freed.
>>
>> In this fix, we don't steal the reference to credentials we have obtained
>> in scm_send at beginning of unix_stream_sendmsg, till we've reached
>> the last packet. This fixes the problem in commit 0856a30409.
>>
>> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
>> Reported-by: Jiri Slaby <jirislaby@gmail.com>
>> Tested-by: Sedat Dilek <sedat.dilek@googlemail.com>
>> Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
>> ---
>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>> index 136298c..47780dc 100644
>> --- a/net/unix/af_unix.c
>> +++ b/net/unix/af_unix.c
>> @@ -1383,10 +1383,11 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb)
>> }
>>
>> static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb,
>> - bool send_fds, bool ref)
>> + bool send_fds, bool steal_refs)
>> {
>> int err = 0;
>> - if (ref) {
>> +
>> + if (!steal_refs) {
>> UNIXCB(skb).pid = get_pid(scm->pid);
>> UNIXCB(skb).cred = get_cred(scm->cred);
>> } else {
>> @@ -1458,7 +1459,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
>> if (skb == NULL)
>> goto out;
>>
>> - err = unix_scm_to_skb(siocb->scm, skb, true, false);
>> + err = unix_scm_to_skb(siocb->scm, skb, true, true);
>> if (err < 0)
>> goto out_free;
>> max_level = err + 1;
>> @@ -1581,6 +1582,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
>> int sent = 0;
>> struct scm_cookie tmp_scm;
>> bool fds_sent = false;
>> + bool steal_refs = false;
>> int max_level;
>>
>> if (NULL == siocb->scm)
>> @@ -1642,11 +1644,14 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
>> size = min_t(int, size, skb_tailroom(skb));
>>
>>
>> - /* Only send the fds and no ref to pid in the first buffer */
>> - err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, fds_sent);
>> + /* Only send the fds in first buffer
>> + * Last buffer can steal our references to pid/cred
>> + */
>> + steal_refs = (sent + size >= len);
>> + err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, steal_refs);
>> if (err < 0) {
>> kfree_skb(skb);
>> - goto out;
>> + goto out_err;
>> }
>> max_level = err + 1;
>> fds_sent = true;
>> @@ -1654,7 +1659,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
>> err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
>> if (err) {
>> kfree_skb(skb);
>> - goto out;
>> + goto out_err;
>> }
>>
>> unix_state_lock(other);
>> @@ -1671,7 +1676,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
>> sent += size;
>> }
>>
>> - if (skb)
>> + if (steal_refs)
>> scm_release(siocb->scm);
>> else
>> scm_destroy(siocb->scm);
>> @@ -1687,9 +1692,10 @@ pipe_err:
>> send_sig(SIGPIPE, current, 0);
>> err = -EPIPE;
>> out_err:
>> - if (skb == NULL)
>> + if (steal_refs)
>> + scm_release(siocb->scm);
>> + else
>> scm_destroy(siocb->scm);
>> -out:
>> siocb->scm = NULL;
>> return sent ? : err;
>> }
>>
>>
>>
>
> I dont think this patch is good.
>
> Sedat traces have nothing to do with af_unix.
>
> Once unix_scm_to_skb() was called and successful, and steal_refs is true
> our refs are attached to this skb. They will be released by
> skb_free(skb). Same for fp : They either were sent in a previous skb or
> this one.
>
> This is why the "goto out;" was OK.
>
I don't think so. unix_scm_to_skb() calls unix_attach_fds(), it
always duplicates scm->fp.
Regards
^ permalink raw reply
* [PATCH] net: phy: Add config option to specify external switch port to be used if switch is used as PHY
From: Lambrecht Jürgen @ 2011-09-08 6:54 UTC (permalink / raw)
To: netdev@vger.kernel.org; +Cc: linux-embedded@vger.kernel.org
Hello,
In our embedded designs, this is a useful patch. Maybe it can be useful
for somebody else too.
Or maybe there are already better solutions?
I know I could also write a driver for our switch, but that is too much
effort just to select the active port.
Kind regards,
Jürgen
In embedded design, instead of a PHY, sometimes a switch is used that
behaves as a PHY through its MII port. For example to use a daisy
chain network configuration instead of an expensive star config.
In that case, many phy ports are available, but only 1 should
be used
to check link status, and not the first one available as is
the case
without this configuration (that is, set to its default value 0).
So this options specifies the switch port number to be used
to check
link status, because if the link is down, no data is sent by the
TCP/IP stack.
Signed-off-by: Jürgen Lambrecht <J.Lambrecht@televic.com>
---
drivers/net/phy/Kconfig | 14 ++++++++++++++
drivers/net/phy/mdio_bus.c | 9 +++++++++
2 files changed, 23 insertions(+), 0 deletions(-)
diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index a702443..554561f 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -13,6 +13,20 @@ menuconfig PHYLIB
if PHYLIB
+config SWITCH_PHY
+ int "External switch port to be used if switch is used as PHY"
+ default "0"
+ help
+ In embedded design, instead of a PHY, sometimes a switch is
used that
+ behaves as a PHY through its MII port. For example to use a daisy
+ chain network configuration instead of an expensive star config.
+ In that case, many phy ports are available, but only 1 should
be used
+ to check link status, and not the first one available as is
the case
+ without this configuration (that is, set to its default value 0).
+ So this options specifies the switch port number to be used to
check
+ link status, because if the link is down, no data is sent by the
+ TCP/IP stack.
+
comment "MII PHY device drivers"
config MARVELL_PHY
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 6c58da2..016437a 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -112,7 +112,14 @@ int mdiobus_register(struct mii_bus *bus)
if (bus->reset)
bus->reset(bus);
+ /* The config below is always availble with CONFIG_PHYLIB. If 0, the
+ behavior is as before without this patch (or P0 of the switch is
+ taken because it is the first one found). */
+#if CONFIG_SWITCH_PHY
+ i = CONFIG_SWITCH_PHY;
+#else
for (i = 0; i < PHY_MAX_ADDR; i++) {
+#endif
if ((bus->phy_mask & (1 << i)) == 0) {
struct phy_device *phydev;
@@ -122,7 +129,9 @@ int mdiobus_register(struct mii_bus *bus)
goto error;
}
}
+#if !CONFIG_SWITCH_PHY
}
+#endif
bus->state = MDIOBUS_REGISTERED;
pr_info("%s: probed\n", bus->name);
--
1.7.1
^ permalink raw reply related
* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Sedat Dilek @ 2011-09-08 7:02 UTC (permalink / raw)
To: Eric Dumazet
Cc: Tim Chen, Yan, Zheng, Yan, Zheng, netdev@vger.kernel.org,
davem@davemloft.net, sfr@canb.auug.org.au, jirislaby@gmail.com,
Shi, Alex, Valdis Kletnieks
In-Reply-To: <1315461572.2532.7.camel@edumazet-laptop>
[-- Attachment #1: Type: text/plain, Size: 5903 bytes --]
On Thu, Sep 8, 2011 at 7:59 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mercredi 07 septembre 2011 à 14:06 -0700, Tim Chen a écrit :
>> On Thu, 2011-09-08 at 08:27 +0800, Yan, Zheng wrote:
>>
>> > > err = -EPIPE;
>> > > out_err:
>> > > - if (skb == NULL)
>> > > + if (!steal_refs)
>> > > scm_destroy(siocb->scm);
>> >
>> > I think we should call scm_release() here in the case of
>> > steal_refs == true. Otherwise siocb->scm->fp may leak.
>>
>> Yan Zheng,
>>
>> I've updated the patch. If it looks good to you now, can you add your
>> Signed-off-by to the patch. Pending Sedat's testing on this patch,
>> I think it is good to go.
>>
>> Tim
>>
>> ---
>> Commit 0856a30409 (Scm: Remove unnecessary pid & credential references
>> in Unix socket's send and receive path) introduced a use-after-free bug.
>> The sent skbs from unix_stream_sendmsg could be consumed and destructed
>> by the receive side, removing all references to the credentials,
>> before the send side has finished sending out all
>> packets. However, send side could continue to consturct new packets in the
>> stream, using credentials that have lost its last reference and been
>> freed.
>>
>> In this fix, we don't steal the reference to credentials we have obtained
>> in scm_send at beginning of unix_stream_sendmsg, till we've reached
>> the last packet. This fixes the problem in commit 0856a30409.
>>
>> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
>> Reported-by: Jiri Slaby <jirislaby@gmail.com>
>> Tested-by: Sedat Dilek <sedat.dilek@googlemail.com>
>> Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
>> ---
>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>> index 136298c..47780dc 100644
>> --- a/net/unix/af_unix.c
>> +++ b/net/unix/af_unix.c
>> @@ -1383,10 +1383,11 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb)
>> }
>>
>> static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb,
>> - bool send_fds, bool ref)
>> + bool send_fds, bool steal_refs)
>> {
>> int err = 0;
>> - if (ref) {
>> +
>> + if (!steal_refs) {
>> UNIXCB(skb).pid = get_pid(scm->pid);
>> UNIXCB(skb).cred = get_cred(scm->cred);
>> } else {
>> @@ -1458,7 +1459,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
>> if (skb == NULL)
>> goto out;
>>
>> - err = unix_scm_to_skb(siocb->scm, skb, true, false);
>> + err = unix_scm_to_skb(siocb->scm, skb, true, true);
>> if (err < 0)
>> goto out_free;
>> max_level = err + 1;
>> @@ -1581,6 +1582,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
>> int sent = 0;
>> struct scm_cookie tmp_scm;
>> bool fds_sent = false;
>> + bool steal_refs = false;
>> int max_level;
>>
>> if (NULL == siocb->scm)
>> @@ -1642,11 +1644,14 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
>> size = min_t(int, size, skb_tailroom(skb));
>>
>>
>> - /* Only send the fds and no ref to pid in the first buffer */
>> - err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, fds_sent);
>> + /* Only send the fds in first buffer
>> + * Last buffer can steal our references to pid/cred
>> + */
>> + steal_refs = (sent + size >= len);
>> + err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, steal_refs);
>> if (err < 0) {
>> kfree_skb(skb);
>> - goto out;
>> + goto out_err;
>> }
>> max_level = err + 1;
>> fds_sent = true;
>> @@ -1654,7 +1659,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
>> err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
>> if (err) {
>> kfree_skb(skb);
>> - goto out;
>> + goto out_err;
>> }
>>
>> unix_state_lock(other);
>> @@ -1671,7 +1676,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
>> sent += size;
>> }
>>
>> - if (skb)
>> + if (steal_refs)
>> scm_release(siocb->scm);
>> else
>> scm_destroy(siocb->scm);
>> @@ -1687,9 +1692,10 @@ pipe_err:
>> send_sig(SIGPIPE, current, 0);
>> err = -EPIPE;
>> out_err:
>> - if (skb == NULL)
>> + if (steal_refs)
>> + scm_release(siocb->scm);
>> + else
>> scm_destroy(siocb->scm);
>> -out:
>> siocb->scm = NULL;
>> return sent ? : err;
>> }
>>
>>
>>
>
> I dont think this patch is good.
>
> Sedat traces have nothing to do with af_unix.
>
> Once unix_scm_to_skb() was called and successful, and steal_refs is true
> our refs are attached to this skb. They will be released by
> skb_free(skb). Same for fp : They either were sent in a previous skb or
> this one.
>
> This is why the "goto out;" was OK.
>
Good morning,
/me sees so many patches :-).
Yes, I guess the patch by Eric has nothing to do with seen
call-traces, adding "irqpoll" to Kernel command line seems to fix them
(my uptime: approx. 10:30, Eric's proposal patch in my last
patch-series is attached).
- Sedat -
[-- Attachment #2: unix-stream-Fix-use-after-free-crashes-by-edumazet.patch --]
[-- Type: text/x-diff, Size: 2709 bytes --]
Commit 0856a30409 (Scm: Remove unnecessary pid & credential references
in Unix socket's send and receive path) introduced a use-after-free bug.
The sent skbs from unix_stream_sendmsg could be consumed and destructed
by the receive side, removing all references to the credentials,
before the send side has finished sending out all
packets. However, send side could continue to consturct new packets in the
stream, using credentials that have lost its last reference and been
freed.
In this fix, we don't steal the reference to credentials we have obtained
in scm_send at beginning of unix_stream_sendmsg, till we've reached
the last packet. This fixes the problem in commit 0856a30409.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Reported-by: Jiri Slaby <jirislaby@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@googlemail.com>
Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
---
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 136298c..4a324a0 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1383,10 +1383,11 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb)
}
static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb,
- bool send_fds, bool ref)
+ bool send_fds, bool steal_refs)
{
int err = 0;
- if (ref) {
+
+ if (!steal_refs) {
UNIXCB(skb).pid = get_pid(scm->pid);
UNIXCB(skb).cred = get_cred(scm->cred);
} else {
@@ -1458,7 +1459,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
if (skb == NULL)
goto out;
- err = unix_scm_to_skb(siocb->scm, skb, true, false);
+ err = unix_scm_to_skb(siocb->scm, skb, true, true);
if (err < 0)
goto out_free;
max_level = err + 1;
@@ -1581,6 +1582,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
int sent = 0;
struct scm_cookie tmp_scm;
bool fds_sent = false;
+ bool steal_refs = false;
int max_level;
if (NULL == siocb->scm)
@@ -1642,8 +1644,11 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
size = min_t(int, size, skb_tailroom(skb));
- /* Only send the fds and no ref to pid in the first buffer */
- err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, fds_sent);
+ /* Only send the fds in first buffer
+ * Last buffer can steal our references to pid/cred
+ */
+ steal_refs = (sent + size >= len);
+ err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, steal_refs);
if (err < 0) {
kfree_skb(skb);
goto out;
@@ -1671,7 +1676,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
sent += size;
}
- if (skb)
+ if (steal_refs)
scm_release(siocb->scm);
else
scm_destroy(siocb->scm);
^ permalink raw reply related
* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Eric Dumazet @ 2011-09-08 7:11 UTC (permalink / raw)
To: Yan, Zheng
Cc: Tim Chen, sedat.dilek@gmail.com, Yan, Zheng,
netdev@vger.kernel.org, davem@davemloft.net, sfr@canb.auug.org.au,
jirislaby@gmail.com, Shi, Alex, Valdis Kletnieks
In-Reply-To: <4E685F19.6030407@intel.com>
Le jeudi 08 septembre 2011 à 14:22 +0800, Yan, Zheng a écrit :
> I don't think so. unix_scm_to_skb() calls unix_attach_fds(), it
> always duplicates scm->fp.
What a mess. This code is a nightmare.
Part of the mess comes from scm_destroy() and scm_release() duplication.
We should have scm_destroy() only, as before, and NULLify scm->pid/cred
in unix_scm_to_skb() when we steal references.
It makes more sense and keeps things clearer.
include/net/scm.h | 9 ---------
net/unix/af_unix.c | 27 +++++++++++++++------------
2 files changed, 15 insertions(+), 21 deletions(-)
diff --git a/include/net/scm.h b/include/net/scm.h
index 68e1e48..2a5b42f 100644
--- a/include/net/scm.h
+++ b/include/net/scm.h
@@ -78,15 +78,6 @@ static __inline__ void scm_destroy(struct scm_cookie *scm)
__scm_destroy(scm);
}
-static __inline__ void scm_release(struct scm_cookie *scm)
-{
- /* keep ref on pid and cred */
- scm->pid = NULL;
- scm->cred = NULL;
- if (scm->fp)
- __scm_destroy(scm);
-}
-
static __inline__ int scm_send(struct socket *sock, struct msghdr *msg,
struct scm_cookie *scm)
{
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index e6d9d10..1fa270a 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1379,15 +1379,18 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb)
}
static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb,
- bool send_fds, bool ref)
+ bool send_fds, bool steal_refs)
{
int err = 0;
- if (ref) {
+
+ if (!steal_refs) {
UNIXCB(skb).pid = get_pid(scm->pid);
UNIXCB(skb).cred = get_cred(scm->cred);
} else {
UNIXCB(skb).pid = scm->pid;
UNIXCB(skb).cred = scm->cred;
+ scm->pid = NULL;
+ scm->cred = NULL;
}
UNIXCB(skb).fp = NULL;
if (scm->fp && send_fds)
@@ -1454,7 +1457,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
if (skb == NULL)
goto out;
- err = unix_scm_to_skb(siocb->scm, skb, true, false);
+ err = unix_scm_to_skb(siocb->scm, skb, true, true);
if (err < 0)
goto out_free;
max_level = err + 1;
@@ -1550,7 +1553,7 @@ restart:
unix_state_unlock(other);
other->sk_data_ready(other, len);
sock_put(other);
- scm_release(siocb->scm);
+ scm_destroy(siocb->scm);
return len;
out_unlock:
@@ -1577,6 +1580,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
int sent = 0;
struct scm_cookie tmp_scm;
bool fds_sent = false;
+ bool steal_refs = false;
int max_level;
if (NULL == siocb->scm)
@@ -1638,8 +1642,11 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
size = min_t(int, size, skb_tailroom(skb));
- /* Only send the fds and no ref to pid in the first buffer */
- err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, fds_sent);
+ /* Only send the fds in first buffer
+ * Last buffer can steal our references to pid/cred
+ */
+ steal_refs = (sent + size >= len);
+ err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, steal_refs);
if (err < 0) {
kfree_skb(skb);
goto out;
@@ -1667,10 +1674,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
sent += size;
}
- if (skb)
- scm_release(siocb->scm);
- else
- scm_destroy(siocb->scm);
+ scm_destroy(siocb->scm);
siocb->scm = NULL;
return sent;
@@ -1683,8 +1687,7 @@ pipe_err:
send_sig(SIGPIPE, current, 0);
err = -EPIPE;
out_err:
- if (skb == NULL)
- scm_destroy(siocb->scm);
+ scm_destroy(siocb->scm);
out:
siocb->scm = NULL;
return sent ? : err;
^ permalink raw reply related
* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Yan, Zheng @ 2011-09-08 7:23 UTC (permalink / raw)
To: Eric Dumazet
Cc: Tim Chen, sedat.dilek@gmail.com, Yan, Zheng,
netdev@vger.kernel.org, davem@davemloft.net, sfr@canb.auug.org.au,
jirislaby@gmail.com, Shi, Alex, Valdis Kletnieks
In-Reply-To: <1315465919.2532.19.camel@edumazet-laptop>
On 09/08/2011 03:11 PM, Eric Dumazet wrote:
> Le jeudi 08 septembre 2011 à 14:22 +0800, Yan, Zheng a écrit :
>
>> I don't think so. unix_scm_to_skb() calls unix_attach_fds(), it
>> always duplicates scm->fp.
>
> What a mess. This code is a nightmare.
>
> Part of the mess comes from scm_destroy() and scm_release() duplication.
>
> We should have scm_destroy() only, as before, and NULLify scm->pid/cred
> in unix_scm_to_skb() when we steal references.
>
> It makes more sense and keeps things clearer.
>
>
> include/net/scm.h | 9 ---------
> net/unix/af_unix.c | 27 +++++++++++++++------------
> 2 files changed, 15 insertions(+), 21 deletions(-)
>
>
> diff --git a/include/net/scm.h b/include/net/scm.h
> index 68e1e48..2a5b42f 100644
> --- a/include/net/scm.h
> +++ b/include/net/scm.h
> @@ -78,15 +78,6 @@ static __inline__ void scm_destroy(struct scm_cookie *scm)
> __scm_destroy(scm);
> }
>
> -static __inline__ void scm_release(struct scm_cookie *scm)
> -{
> - /* keep ref on pid and cred */
> - scm->pid = NULL;
> - scm->cred = NULL;
> - if (scm->fp)
> - __scm_destroy(scm);
> -}
> -
> static __inline__ int scm_send(struct socket *sock, struct msghdr *msg,
> struct scm_cookie *scm)
> {
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index e6d9d10..1fa270a 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -1379,15 +1379,18 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb)
> }
>
> static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb,
> - bool send_fds, bool ref)
> + bool send_fds, bool steal_refs)
> {
> int err = 0;
> - if (ref) {
> +
> + if (!steal_refs) {
> UNIXCB(skb).pid = get_pid(scm->pid);
> UNIXCB(skb).cred = get_cred(scm->cred);
> } else {
> UNIXCB(skb).pid = scm->pid;
> UNIXCB(skb).cred = scm->cred;
> + scm->pid = NULL;
> + scm->cred = NULL;
> }
> UNIXCB(skb).fp = NULL;
> if (scm->fp && send_fds)
> @@ -1454,7 +1457,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
> if (skb == NULL)
> goto out;
>
> - err = unix_scm_to_skb(siocb->scm, skb, true, false);
> + err = unix_scm_to_skb(siocb->scm, skb, true, true);
> if (err < 0)
> goto out_free;
> max_level = err + 1;
> @@ -1550,7 +1553,7 @@ restart:
> unix_state_unlock(other);
> other->sk_data_ready(other, len);
> sock_put(other);
> - scm_release(siocb->scm);
> + scm_destroy(siocb->scm);
> return len;
>
> out_unlock:
> @@ -1577,6 +1580,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
> int sent = 0;
> struct scm_cookie tmp_scm;
> bool fds_sent = false;
> + bool steal_refs = false;
> int max_level;
>
> if (NULL == siocb->scm)
> @@ -1638,8 +1642,11 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
> size = min_t(int, size, skb_tailroom(skb));
>
>
> - /* Only send the fds and no ref to pid in the first buffer */
> - err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, fds_sent);
> + /* Only send the fds in first buffer
> + * Last buffer can steal our references to pid/cred
> + */
> + steal_refs = (sent + size >= len);
> + err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, steal_refs);
> if (err < 0) {
> kfree_skb(skb);
> goto out;
> @@ -1667,10 +1674,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
> sent += size;
> }
>
> - if (skb)
> - scm_release(siocb->scm);
> - else
> - scm_destroy(siocb->scm);
> + scm_destroy(siocb->scm);
> siocb->scm = NULL;
>
> return sent;
> @@ -1683,8 +1687,7 @@ pipe_err:
> send_sig(SIGPIPE, current, 0);
> err = -EPIPE;
> out_err:
> - if (skb == NULL)
> - scm_destroy(siocb->scm);
> + scm_destroy(siocb->scm);
> out:
> siocb->scm = NULL;
> return sent ? : err;
>
>
This code looks great, except "goto out;" is still there. I think we should replace it with "goto out_err;" :)
Regards
^ permalink raw reply
* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Eric Dumazet @ 2011-09-08 7:33 UTC (permalink / raw)
To: Yan, Zheng
Cc: Tim Chen, sedat.dilek@gmail.com, Yan, Zheng,
netdev@vger.kernel.org, davem@davemloft.net, sfr@canb.auug.org.au,
jirislaby@gmail.com, Shi, Alex, Valdis Kletnieks
In-Reply-To: <4E686D71.30603@intel.com>
Le jeudi 08 septembre 2011 à 15:23 +0800, Yan, Zheng a écrit :
> This code looks great, except "goto out;" is still there. I think we should replace it with "goto out_err;" :)
>
Indeed, you're right, thanks
include/net/scm.h | 9 ---------
net/unix/af_unix.c | 32 +++++++++++++++++---------------
2 files changed, 17 insertions(+), 24 deletions(-)
diff --git a/include/net/scm.h b/include/net/scm.h
index 68e1e48..2a5b42f 100644
--- a/include/net/scm.h
+++ b/include/net/scm.h
@@ -78,15 +78,6 @@ static __inline__ void scm_destroy(struct scm_cookie *scm)
__scm_destroy(scm);
}
-static __inline__ void scm_release(struct scm_cookie *scm)
-{
- /* keep ref on pid and cred */
- scm->pid = NULL;
- scm->cred = NULL;
- if (scm->fp)
- __scm_destroy(scm);
-}
-
static __inline__ int scm_send(struct socket *sock, struct msghdr *msg,
struct scm_cookie *scm)
{
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index e6d9d10..c8a08ba 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1379,15 +1379,18 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb)
}
static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb,
- bool send_fds, bool ref)
+ bool send_fds, bool steal_refs)
{
int err = 0;
- if (ref) {
+
+ if (!steal_refs) {
UNIXCB(skb).pid = get_pid(scm->pid);
UNIXCB(skb).cred = get_cred(scm->cred);
} else {
UNIXCB(skb).pid = scm->pid;
UNIXCB(skb).cred = scm->cred;
+ scm->pid = NULL;
+ scm->cred = NULL;
}
UNIXCB(skb).fp = NULL;
if (scm->fp && send_fds)
@@ -1454,7 +1457,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
if (skb == NULL)
goto out;
- err = unix_scm_to_skb(siocb->scm, skb, true, false);
+ err = unix_scm_to_skb(siocb->scm, skb, true, true);
if (err < 0)
goto out_free;
max_level = err + 1;
@@ -1550,7 +1553,7 @@ restart:
unix_state_unlock(other);
other->sk_data_ready(other, len);
sock_put(other);
- scm_release(siocb->scm);
+ scm_destroy(siocb->scm);
return len;
out_unlock:
@@ -1577,6 +1580,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
int sent = 0;
struct scm_cookie tmp_scm;
bool fds_sent = false;
+ bool steal_refs = false;
int max_level;
if (NULL == siocb->scm)
@@ -1638,11 +1642,14 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
size = min_t(int, size, skb_tailroom(skb));
- /* Only send the fds and no ref to pid in the first buffer */
- err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, fds_sent);
+ /* Only send the fds in first buffer
+ * Last buffer can steal our references to pid/cred
+ */
+ steal_refs = (sent + size >= len);
+ err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, steal_refs);
if (err < 0) {
kfree_skb(skb);
- goto out;
+ goto out_err;
}
max_level = err + 1;
fds_sent = true;
@@ -1650,7 +1657,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
if (err) {
kfree_skb(skb);
- goto out;
+ goto out_err;
}
unix_state_lock(other);
@@ -1667,10 +1674,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
sent += size;
}
- if (skb)
- scm_release(siocb->scm);
- else
- scm_destroy(siocb->scm);
+ scm_destroy(siocb->scm);
siocb->scm = NULL;
return sent;
@@ -1683,9 +1687,7 @@ pipe_err:
send_sig(SIGPIPE, current, 0);
err = -EPIPE;
out_err:
- if (skb == NULL)
- scm_destroy(siocb->scm);
-out:
+ scm_destroy(siocb->scm);
siocb->scm = NULL;
return sent ? : err;
}
^ permalink raw reply related
* e1000e: NIC not working (afer resume?)
From: Frederik Himpe @ 2011-09-08 7:45 UTC (permalink / raw)
To: netdev
I have a Dell Latitude E6400 which has a network card supported by the
e1000e driver. Often (I think after a suspend/resume cycle), the network
card does not work at all: the NIC is correctly seen by ifconfig, but
running ethtool just returns: "No such device". dhclient -v gives the
impression that it's correctly sending out DHCPDISCOVER packets on the
NIC, but a tcpdump running on the same machine does not see any packets
going out.
I'm using Debian's 3.0.0-3 kernel (corresponding with Linux 3.0.3).
Full lspci, .config and dmesg output at
http://artipc10.vub.ac.be/~frederik/e1000e/
Here is some relevant summary.
How can I find out what is going wrong?
# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:21:70:e1:bb:4c
UP BROADCAST PROMISC MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Interrupt:22 Memory:f6ae0000-f6b00000
# ethtool eth0
Settings for eth0:
Cannot get device settings: No such device
Cannot get wake-on-lan settings: No such device
Cannot get message level: No such device
Cannot get link status: No such device
No data available
# lspci -vvnn
00:19.0 Ethernet controller [0200]: Intel Corporation 82567LM Gigabit Network Connection [8086:10f5] (rev 03)
Subsystem: Dell Device [1028:0233]
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 22
Region 0: Memory at f6ae0000 (32-bit, non-prefetchable) [disabled] [size=128K]
Region 1: Memory at f6adb000 (32-bit, non-prefetchable) [disabled] [size=4K]
Region 2: I/O ports at efe0 [disabled] [size=32]
Capabilities: [c8] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D3 NoSoftRst- PME-Enable+ DSel=0 DScale=1 PME+
Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 00000000fee0300c Data: 4182
Capabilities: [e0] PCI Advanced Features
AFCap: TP+ FLR+
AFCtrl: FLR-
AFStatus: TP-
Kernel driver in use: e1000e
# dmesg | grep -E "e1000e|eth0"
[ 1.027437] e1000e: Intel(R) PRO/1000 Network Driver - 1.3.10-k2
[ 1.027441] e1000e: Copyright(c) 1999 - 2011 Intel Corporation.
[ 1.027480] e1000e 0000:00:19.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
[ 1.027491] e1000e 0000:00:19.0: setting latency timer to 64
[ 1.027605] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[ 1.231440] e1000e 0000:00:19.0: eth0: (PCI Express:2.5GT/s:Width x1) 00:21:70:e1:bb:4c
[ 1.231444] e1000e 0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
[ 1.231470] e1000e 0000:00:19.0: eth0: MAC: 7, PHY: 8, PBA No: 1004FF-0FF
[ 22.896268] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[ 22.952097] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[ 22.954132] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 24.504903] e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx
[ 24.506413] e1000e 0000:00:19.0: eth0: 10/100 speed: disabling TSO
[ 24.508402] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 34.788022] eth0: no IPv6 routers present
[ 41.444922] e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx
[ 41.446325] e1000e 0000:00:19.0: eth0: 10/100 speed: disabling TSO
[25136.918393] e1000e 0000:00:19.0: PME# enabled
[25142.488050] e1000e 0000:00:19.0: BAR 0: set to [mem 0xf6ae0000-0xf6afffff] (PCI address [0xf6ae0000-0xf6afffff])
[25142.488058] e1000e 0000:00:19.0: BAR 1: set to [mem 0xf6adb000-0xf6adbfff] (PCI address [0xf6adb000-0xf6adbfff])
[25142.488066] e1000e 0000:00:19.0: BAR 2: set to [io 0xefe0-0xefff] (PCI address [0xefe0-0xefff])
[25142.488085] e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
[25142.488110] e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100107)
[25142.488167] e1000e 0000:00:19.0: PME# disabled
[25142.510966] e1000e 0000:00:19.0: PCI INT A disabled
[25142.510971] e1000e 0000:00:19.0: PME# enabled
[25143.468668] e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
[25143.468689] e1000e 0000:00:19.0: restoring config space at offset 0x6 (was 0x1, writing 0xefe1)
[25143.468696] e1000e 0000:00:19.0: restoring config space at offset 0x5 (was 0x0, writing 0xf6adb000)
[25143.468703] e1000e 0000:00:19.0: restoring config space at offset 0x4 (was 0x0, writing 0xf6ae0000)
[25143.468713] e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100107)
[25143.471149] e1000e 0000:00:19.0: PME# disabled
[25143.471277] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[25145.276441] e1000e 0000:00:19.0: PME# enabled
[25146.082051] e1000e 0000:00:19.0: BAR 0: set to [mem 0xf6ae0000-0xf6afffff] (PCI address [0xf6ae0000-0xf6afffff])
[25146.082076] e1000e 0000:00:19.0: BAR 1: set to [mem 0xf6adb000-0xf6adbfff] (PCI address [0xf6adb000-0xf6adbfff])
[25146.082090] e1000e 0000:00:19.0: BAR 2: set to [io 0xefe0-0xefff] (PCI address [0xefe0-0xefff])
[25146.082126] e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
[25146.082169] e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100107)
[25146.082280] e1000e 0000:00:19.0: PME# disabled
[25146.176605] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[25146.232228] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[25146.234003] ADDRCONF(NETDEV_UP): eth0: link is not ready
[25147.328875] e1000e 0000:00:19.0: PME# enabled
[26261.896126] e1000e 0000:00:19.0: BAR 0: set to [mem 0xf6ae0000-0xf6afffff] (PCI address [0xf6ae0000-0xf6afffff])
[26261.896140] e1000e 0000:00:19.0: BAR 1: set to [mem 0xf6adb000-0xf6adbfff] (PCI address [0xf6adb000-0xf6adbfff])
[26261.896149] e1000e 0000:00:19.0: BAR 2: set to [io 0xefe0-0xefff] (PCI address [0xefe0-0xefff])
[26261.896169] e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
[26261.896196] e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100107)
[26261.896241] e1000e 0000:00:19.0: PME# disabled
[26261.896316] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[26261.972148] e1000e 0000:00:19.0: PME# enabled
[26268.192160] e1000e 0000:00:19.0: BAR 0: set to [mem 0xf6ae0000-0xf6afffff] (PCI address [0xf6ae0000-0xf6afffff])
[26268.192168] e1000e 0000:00:19.0: BAR 1: set to [mem 0xf6adb000-0xf6adbfff] (PCI address [0xf6adb000-0xf6adbfff])
[26268.192175] e1000e 0000:00:19.0: BAR 2: set to [io 0xefe0-0xefff] (PCI address [0xefe0-0xefff])
[26268.192194] e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
[26268.192219] e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100107)
[26268.192274] e1000e 0000:00:19.0: PME# disabled
[26268.213875] e1000e 0000:00:19.0: PME# enabled
[26269.172481] e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
[26269.172497] e1000e 0000:00:19.0: restoring config space at offset 0x6 (was 0x1, writing 0xefe1)
[26269.172502] e1000e 0000:00:19.0: restoring config space at offset 0x5 (was 0x0, writing 0xf6adb000)
[26269.172507] e1000e 0000:00:19.0: restoring config space at offset 0x4 (was 0x0, writing 0xf6ae0000)
[26269.172515] e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100107)
[26269.174262] e1000e 0000:00:19.0: PME# disabled
[26269.174339] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[26272.384806] e1000e 0000:00:19.0: PME# enabled
[26274.688310] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26275.482490] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26276.277343] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26277.075194] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26277.870807] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26278.667881] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26279.463492] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26280.258982] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26281.056598] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26281.855386] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26282.649838] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26283.445674] e1000e 0000:00:19.0: eth0: Error reading PHY register
[26289.848036] e1000e 0000:00:19.0: BAR 0: set to [mem 0xf6ae0000-0xf6afffff] (PCI address [0xf6ae0000-0xf6afffff])
[26289.848044] e1000e 0000:00:19.0: BAR 1: set to [mem 0xf6adb000-0xf6adbfff] (PCI address [0xf6adb000-0xf6adbfff])
[26289.848050] e1000e 0000:00:19.0: BAR 2: set to [io 0xefe0-0xefff] (PCI address [0xefe0-0xefff])
[26289.848067] e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
[26289.848090] e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100107)
[26289.848146] e1000e 0000:00:19.0: PME# disabled
[26289.940279] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[26289.996092] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[26289.996761] ADDRCONF(NETDEV_UP): eth0: link is not ready
[26291.088343] e1000e 0000:00:19.0: PME# enabled
[26900.272077] e1000e 0000:00:19.0: BAR 0: set to [mem 0xf6ae0000-0xf6afffff] (PCI address [0xf6ae0000-0xf6afffff])
[26900.272096] e1000e 0000:00:19.0: BAR 1: set to [mem 0xf6adb000-0xf6adbfff] (PCI address [0xf6adb000-0xf6adbfff])
[26900.272111] e1000e 0000:00:19.0: BAR 2: set to [io 0xefe0-0xefff] (PCI address [0xefe0-0xefff])
[26900.272142] e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
[26900.272180] e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100107)
[26900.272250] e1000e 0000:00:19.0: PME# disabled
[26900.272379] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[26900.274426] e1000e 0000:00:19.0: eth0: MAC Wakeup cause - Link Status Change
[26905.560584] e1000e: Intel(R) PRO/1000 Network Driver - 1.3.10-k2
[26905.560591] e1000e: Copyright(c) 1999 - 2011 Intel Corporation.
[26905.560650] e1000e 0000:00:19.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22
[26905.560665] e1000e 0000:00:19.0: setting latency timer to 64
[26905.560836] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[26905.744547] e1000e 0000:00:19.0: eth0: (PCI Express:2.5GT/s:Width x1) 00:21:70:e1:bb:4c
[26905.744552] e1000e 0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
[26905.744581] e1000e 0000:00:19.0: eth0: MAC: 7, PHY: 8, PBA No: 1004FF-0FF
[26905.744594] e1000e 0000:00:19.0: PME# enabled
[26905.880141] e1000e 0000:00:19.0: BAR 0: set to [mem 0xf6ae0000-0xf6afffff] (PCI address [0xf6ae0000-0xf6afffff])
[26905.880149] e1000e 0000:00:19.0: BAR 1: set to [mem 0xf6adb000-0xf6adbfff] (PCI address [0xf6adb000-0xf6adbfff])
[26905.880155] e1000e 0000:00:19.0: BAR 2: set to [io 0xefe0-0xefff] (PCI address [0xefe0-0xefff])
[26905.880172] e1000e 0000:00:19.0: restoring config space at offset 0xf (was 0x100, writing 0x10a)
[26905.880194] e1000e 0000:00:19.0: restoring config space at offset 0x1 (was 0x100000, writing 0x100107)
[26905.880247] e1000e 0000:00:19.0: PME# disabled
[26905.968320] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[26906.024395] e1000e 0000:00:19.0: irq 43 for MSI/MSI-X
[26906.025907] ADDRCONF(NETDEV_UP): eth0: link is not ready
[26907.112415] e1000e 0000:00:19.0: PME# enabled
[27308.975856] device eth0 entered promiscuous mode
[27326.414382] device eth0 left promiscuous mode
[27336.815362] device eth0 entered promiscuous mode
* Right after start up: machine is in docking station, network is working
fine.
* At 25136: resume machine, not in docking station, no network cable
plugged in, so network connection was not tested.
* At 26261: resume machine, in docking station, network cable plugged in,
but network connection is not working
* At 26900: rmmod e1000e && modprobe e1000e: network still not working
--
Frederik Himpe
^ permalink raw reply
* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Jiri Slaby @ 2011-09-08 7:56 UTC (permalink / raw)
To: Eric Dumazet
Cc: Yan, Zheng, Tim Chen, sedat.dilek@gmail.com, Yan, Zheng,
netdev@vger.kernel.org, davem@davemloft.net, sfr@canb.auug.org.au,
Shi, Alex, Valdis Kletnieks
In-Reply-To: <1315465919.2532.19.camel@edumazet-laptop>
On 09/08/2011 09:11 AM, Eric Dumazet wrote:
> Le jeudi 08 septembre 2011 à 14:22 +0800, Yan, Zheng a écrit :
>
>> I don't think so. unix_scm_to_skb() calls unix_attach_fds(), it
>> always duplicates scm->fp.
>
> What a mess. This code is a nightmare.
>
> Part of the mess comes from scm_destroy() and scm_release() duplication.
>
> We should have scm_destroy() only, as before, and NULLify scm->pid/cred
> in unix_scm_to_skb() when we steal references.
This patch works for me. I haven't tried the out_err fixup from the
followup, but I assume I won't spot a difference as those are fail paths
anyway...
thanks,
--
js
^ permalink raw reply
* Re: [PATCH] net: phy: Add config option to specify external switch port to be used if switch is used as PHY
From: Florian Fainelli @ 2011-09-08 8:39 UTC (permalink / raw)
To: Lambrecht Jürgen
Cc: netdev@vger.kernel.org, linux-embedded@vger.kernel.org
In-Reply-To: <4E68668F.9060008@televic.com>
Hello Jurgen,
On Thursday 08 September 2011 08:54:07 Lambrecht Jürgen wrote:
> Hello,
>
> In our embedded designs, this is a useful patch. Maybe it can be useful
> for somebody else too.
> Or maybe there are already better solutions?
> I know I could also write a driver for our switch, but that is too much
> effort just to select the active port.
This is not going to work well with all switches out there. You could use the
fixed-PHY driver to make your ethernet driver see the link as always up between
the MAC and switch CPU port.
A better solution would be to have proper switch drivers and user-space, which
reminds me that we (OpenWrt) should at some point propose our switch drivers
[1] for review.
[1]:
https://dev.openwrt.org/browser/trunk/target/linux/generic/files/drivers/net/phy/
>
> Kind regards,
> Jürgen
>
> In embedded design, instead of a PHY, sometimes a switch is used that
> behaves as a PHY through its MII port. For example to use a
> daisy chain network configuration instead of an expensive star config. In
> that case, many phy ports are available, but only 1 should be used
> to check link status, and not the first one available as is
> the case
> without this configuration (that is, set to its default value
> 0). So this options specifies the switch port number to be used to check
> link status, because if the link is down, no data is sent by the
> TCP/IP stack.
>
> Signed-off-by: Jürgen Lambrecht <J.Lambrecht@televic.com>
> ---
> drivers/net/phy/Kconfig | 14 ++++++++++++++
> drivers/net/phy/mdio_bus.c | 9 +++++++++
> 2 files changed, 23 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
> index a702443..554561f 100644
> --- a/drivers/net/phy/Kconfig
> +++ b/drivers/net/phy/Kconfig
> @@ -13,6 +13,20 @@ menuconfig PHYLIB
>
> if PHYLIB
>
> +config SWITCH_PHY
> + int "External switch port to be used if switch is used as PHY"
> + default "0"
> + help
> + In embedded design, instead of a PHY, sometimes a switch is
> used that
> + behaves as a PHY through its MII port. For example to use a daisy
> + chain network configuration instead of an expensive star config.
> + In that case, many phy ports are available, but only 1 should
> be used
> + to check link status, and not the first one available as is
> the case
> + without this configuration (that is, set to its default value 0).
> + So this options specifies the switch port number to be used to
> check
> + link status, because if the link is down, no data is sent by the
> + TCP/IP stack.
> +
> comment "MII PHY device drivers"
>
> config MARVELL_PHY
> diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
> index 6c58da2..016437a 100644
> --- a/drivers/net/phy/mdio_bus.c
> +++ b/drivers/net/phy/mdio_bus.c
> @@ -112,7 +112,14 @@ int mdiobus_register(struct mii_bus *bus)
> if (bus->reset)
> bus->reset(bus);
>
> + /* The config below is always availble with CONFIG_PHYLIB. If 0,
> the + behavior is as before without this patch (or P0 of the
> switch is + taken because it is the first one found). */
> +#if CONFIG_SWITCH_PHY
> + i = CONFIG_SWITCH_PHY;
> +#else
> for (i = 0; i < PHY_MAX_ADDR; i++) {
> +#endif
> if ((bus->phy_mask & (1 << i)) == 0) {
> struct phy_device *phydev;
>
> @@ -122,7 +129,9 @@ int mdiobus_register(struct mii_bus *bus)
> goto error;
> }
> }
> +#if !CONFIG_SWITCH_PHY
> }
> +#endif
>
> bus->state = MDIOBUS_REGISTERED;
> pr_info("%s: probed\n", bus->name);
> --
> 1.7.1
> N�����r��y����b�X��ǧv�^�){.n�+���z�^�)����w*\x1fjg���\x1e�����ݢj/���z�ޖ��2�ޙ����
> &�)ߡ�a��\x7f��\x1e�G���h�\x0f�j:+v���w��٥
--
Florian
^ permalink raw reply
* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Sedat Dilek @ 2011-09-08 8:43 UTC (permalink / raw)
To: Jiri Slaby
Cc: Eric Dumazet, Yan, Zheng, Tim Chen, Yan, Zheng,
netdev@vger.kernel.org, davem@davemloft.net, sfr@canb.auug.org.au,
Shi, Alex, Valdis Kletnieks
In-Reply-To: <4E687511.3040107@gmail.com>
On Thu, Sep 8, 2011 at 9:56 AM, Jiri Slaby <jirislaby@gmail.com> wrote:
> On 09/08/2011 09:11 AM, Eric Dumazet wrote:
>> Le jeudi 08 septembre 2011 à 14:22 +0800, Yan, Zheng a écrit :
>>
>>> I don't think so. unix_scm_to_skb() calls unix_attach_fds(), it
>>> always duplicates scm->fp.
>>
>> What a mess. This code is a nightmare.
>>
>> Part of the mess comes from scm_destroy() and scm_release() duplication.
>>
>> We should have scm_destroy() only, as before, and NULLify scm->pid/cred
>> in unix_scm_to_skb() when we steal references.
>
> This patch works for me. I haven't tried the out_err fixup from the
> followup, but I assume I won't spot a difference as those are fail paths
> anyway...
>
> thanks,
> --
> js
>
I have tested the same patch here (before shopping, but can test the
"final" patch, too.).
- Sedat -
^ permalink raw reply
* Re: [PATCH 08/11] netlink: implement memory mapped sendmsg()
From: Patrick McHardy @ 2011-09-08 9:33 UTC (permalink / raw)
To: Michał Mirosław; +Cc: davem, netfilter-devel, netdev
In-Reply-To: <20110907200319.GA29545@rere.qmqm.pl>
Am 07.09.2011 22:03, schrieb Michał Mirosław:
> On Wed, Sep 07, 2011 at 05:22:00PM +0200, Patrick McHardy wrote:
>> On 04.09.2011 18:18, Michał Mirosław wrote:
>>> On Sat, Sep 03, 2011 at 07:26:08PM +0200, kaber@trash.net wrote:
>>>> From: Patrick McHardy <kaber@trash.net>
>>>>
>>>> Add support for memory mapped sendmsg() to netlink. Userspace queued to
>>>> be processed frames into the TX ring and invokes sendmsg with
>>>> msg.iov.iov_base = NULL to trigger processing of all pending messages.
>>>>
>>>> Since the kernel usually performs full message validation before beginning
>>>> processing, userspace must be prevented from modifying the message
>>>> contents while the kernel is processing them. In order to do so, the
>>>> frames contents are copied to an allocated skb in case the the ring is
>>>> mapped more than once or the file descriptor is shared (f.i. through
>>>> AF_UNIX file descriptor passing).
>>>>
>>>> Otherwise an skb without a data area is allocated, the data pointer set
>>>> to point to the data area of the ring frame and the skb is processed.
>>>> Once the skb is freed, the destructor releases the frame back to userspace
>>>> by setting the status to NL_MMAP_STATUS_UNUSED.
>>>
>>> Is this protected from threads? Like: one thread waits on sendmsg() and
>>> another (same process) changes the buffer.
>> Yes, if the ring is mapped multiple times (or the file descriptor
>> is changed), the contents are copied to an allocated skb.
>
> I mean:
>
> [1] mmap()
> [1] fill buffers
> [1] pthread_create() [creates: 2]
> [1] sendmsg() starts
> [2] modify buffers
> [1] sendmsg() returns
>
> So: no multiple mmaps, and no touching of the fd. I haven't dug into
> filesystem layer to see if threads affect file->f_count, but there
> sure are no multiple mappings here.
If CLONE_VM is given to clone(), the mapping is visible in both
threads and thus we have multiple mappings (vma_ops->open() is
invoked through clone()). Without CLONE_VM, the second thread
can't access the ring unless it mmap()s it itself, in case we'd
also have multiple mappings.
The file descriptor check is only meant for the case that
the fd is passed to a second process through AF_UNIX, the
first process invokes sendmsg(), sendmsg() checks for multiple
mappings and the second process invokes mmap() after that.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH] net: phy: Add config option to specify external switch port to be used if switch is used as PHY
From: Lambrecht Jürgen @ 2011-09-08 10:00 UTC (permalink / raw)
To: Florian Fainelli; +Cc: netdev@vger.kernel.org, linux-embedded@vger.kernel.org
In-Reply-To: <201109081039.35272.florian@openwrt.org>
On 09/08/2011 10:39 AM, Florian Fainelli wrote:
>
> Hello Jurgen,
>
> On Thursday 08 September 2011 08:54:07 Lambrecht Jürgen wrote:
> > Hello,
> >
> > In our embedded designs, this is a useful patch. Maybe it can be useful
> > for somebody else too.
> > Or maybe there are already better solutions?
> > I know I could also write a driver for our switch, but that is too much
> > effort just to select the active port.
>
> This is not going to work well with all switches out there. You could
> use the
>
Do not all switches follow the basic MII register map with room for 31
phy's?
>
> fixed-PHY driver to make your ethernet driver see the link as always
> up between
> the MAC and switch CPU port.
>
Indeed, I tried to, but it didn't work. (would be my preferred solution)
I juist enabled FIXED_PHY in menuconfig (and kept MII, NET_ETHERNET and
FEC (for my iMX cpu); also PHYLIB is on that makes mdio_bus.c compile).
I checked the architecture file for mpc866ads and didn't find any init
for it, but maybe I need to initialize fixed-PHY somewhere?
However, it could be interesting sometimes from application side to know
if the real external link is up, then fixed-PHY is not ok.
>
>
> A better solution would be to have proper switch drivers and
> user-space, which
> reminds me that we (OpenWrt) should at some point propose our switch
> drivers
> [1] for review.
>
> [1]:
> https://dev.openwrt.org/browser/trunk/target/linux/generic/files/drivers/net/phy/
>
Just had a fast look to the Marvell switch driver. Nice to know it is there.
>
>
> >
> > Kind regards,
> > Jürgen
> >
> > In embedded design, instead of a PHY, sometimes a switch is used that
> > behaves as a PHY through its MII port. For example to use a
> > daisy chain network configuration instead of an expensive star
> config. In
> > that case, many phy ports are available, but only 1 should be used
> > to check link status, and not the first one available as is
> > the case
> > without this configuration (that is, set to its default value
> > 0). So this options specifies the switch port number to be used to check
> > link status, because if the link is down, no data is sent
> by the
> > TCP/IP stack.
> >
> > Signed-off-by: Jürgen Lambrecht <J.Lambrecht@televic.com>
> > ---
>
[snip]
>
> --
> Florian
>
--
Jürgen Lambrecht
R&D Associate
Tel: +32 (0)51 303045 Fax: +32 (0)51 310670
http://www.televic-rail.com
Televic Rail NV - Leo Bekaertlaan 1 - 8870 Izegem - Belgium
Company number 0825.539.581 - RPR Kortrijk
^ permalink raw reply
* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Sedat Dilek @ 2011-09-08 10:05 UTC (permalink / raw)
To: Tim Chen
Cc: Eric Dumazet, Yan, Zheng, netdev@vger.kernel.org,
davem@davemloft.net, sfr@canb.auug.org.au, jirislaby@gmail.com,
alex.shi
In-Reply-To: <1315339157.2576.3079.camel@schen9-DESK>
On Tue, Sep 6, 2011 at 9:59 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> On Tue, 2011-09-06 at 21:43 +0200, Eric Dumazet wrote:
>> Le mardi 06 septembre 2011 à 12:33 -0700, Tim Chen a écrit :
>>
>> > Yes, I think locking the sendmsg for the entire duration of
>> > unix_stream_sendmsg makes a lot of sense. It simplifies the logic a lot
>> > more. I'll try to cook something up in the next couple of days.
>>
>> Thats not really possible, we cant hold a spinlock and call
>> sock_alloc_send_skb() and/or memcpy_fromiovec(), wich might sleep.
>>
>> You would need to prepare the full skb list, then :
>> - stick the ref on the last skb of the list.
>>
>> Transfert the whole skb list in other->sk_receive_queue in one go,
>> instead of one after another.
>>
>> Unfortunately, this would break streaming (big send(), and another
>> thread doing the receive)
>>
>> Listen, I am wondering why hackbench even triggers SCM code. This is
>> really odd. We should not have a _single_ pid/cred ref/unref at all.
>>
>
> Hackbench triggers the code because it has a bunch of threads sending
> msgs on UNIX socket.
>>
>
# lsof | grep socket | wc -l
198
Aprrox 200 sockets in usage here, can you post your hackbench line, please?
I would compare hackbench results with and without new improvements in SCM code.
- Sedat -
[...]
>
> Tim
>
^ permalink raw reply
* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Sedat Dilek @ 2011-09-08 9:59 UTC (permalink / raw)
To: Eric Dumazet
Cc: Yan, Zheng, Tim Chen, Yan, Zheng, netdev@vger.kernel.org,
davem@davemloft.net, sfr@canb.auug.org.au, jirislaby@gmail.com,
Shi, Alex, Valdis Kletnieks
In-Reply-To: <1315467184.2532.22.camel@edumazet-laptop>
On Thu, Sep 8, 2011 at 9:33 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le jeudi 08 septembre 2011 à 15:23 +0800, Yan, Zheng a écrit :
>
>> This code looks great, except "goto out;" is still there. I think we should replace it with "goto out_err;" :)
>>
>
> Indeed, you're right, thanks
>
> include/net/scm.h | 9 ---------
> net/unix/af_unix.c | 32 +++++++++++++++++---------------
> 2 files changed, 17 insertions(+), 24 deletions(-)
>
> diff --git a/include/net/scm.h b/include/net/scm.h
> index 68e1e48..2a5b42f 100644
> --- a/include/net/scm.h
> +++ b/include/net/scm.h
> @@ -78,15 +78,6 @@ static __inline__ void scm_destroy(struct scm_cookie *scm)
> __scm_destroy(scm);
> }
>
> -static __inline__ void scm_release(struct scm_cookie *scm)
> -{
> - /* keep ref on pid and cred */
> - scm->pid = NULL;
> - scm->cred = NULL;
> - if (scm->fp)
> - __scm_destroy(scm);
> -}
> -
> static __inline__ int scm_send(struct socket *sock, struct msghdr *msg,
> struct scm_cookie *scm)
> {
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index e6d9d10..c8a08ba 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -1379,15 +1379,18 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb)
> }
>
> static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb,
> - bool send_fds, bool ref)
> + bool send_fds, bool steal_refs)
> {
> int err = 0;
> - if (ref) {
> +
> + if (!steal_refs) {
> UNIXCB(skb).pid = get_pid(scm->pid);
> UNIXCB(skb).cred = get_cred(scm->cred);
> } else {
> UNIXCB(skb).pid = scm->pid;
> UNIXCB(skb).cred = scm->cred;
> + scm->pid = NULL;
> + scm->cred = NULL;
> }
> UNIXCB(skb).fp = NULL;
> if (scm->fp && send_fds)
> @@ -1454,7 +1457,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
> if (skb == NULL)
> goto out;
>
> - err = unix_scm_to_skb(siocb->scm, skb, true, false);
> + err = unix_scm_to_skb(siocb->scm, skb, true, true);
> if (err < 0)
> goto out_free;
> max_level = err + 1;
> @@ -1550,7 +1553,7 @@ restart:
> unix_state_unlock(other);
> other->sk_data_ready(other, len);
> sock_put(other);
> - scm_release(siocb->scm);
> + scm_destroy(siocb->scm);
> return len;
>
> out_unlock:
> @@ -1577,6 +1580,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
> int sent = 0;
> struct scm_cookie tmp_scm;
> bool fds_sent = false;
> + bool steal_refs = false;
> int max_level;
>
> if (NULL == siocb->scm)
> @@ -1638,11 +1642,14 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
> size = min_t(int, size, skb_tailroom(skb));
>
>
> - /* Only send the fds and no ref to pid in the first buffer */
> - err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, fds_sent);
> + /* Only send the fds in first buffer
> + * Last buffer can steal our references to pid/cred
> + */
> + steal_refs = (sent + size >= len);
> + err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, steal_refs);
> if (err < 0) {
> kfree_skb(skb);
> - goto out;
> + goto out_err;
> }
> max_level = err + 1;
> fds_sent = true;
> @@ -1650,7 +1657,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
> err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
> if (err) {
> kfree_skb(skb);
> - goto out;
> + goto out_err;
> }
>
> unix_state_lock(other);
> @@ -1667,10 +1674,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
> sent += size;
> }
>
> - if (skb)
> - scm_release(siocb->scm);
> - else
> - scm_destroy(siocb->scm);
> + scm_destroy(siocb->scm);
> siocb->scm = NULL;
>
> return sent;
> @@ -1683,9 +1687,7 @@ pipe_err:
> send_sig(SIGPIPE, current, 0);
> err = -EPIPE;
> out_err:
> - if (skb == NULL)
> - scm_destroy(siocb->scm);
> -out:
> + scm_destroy(siocb->scm);
> siocb->scm = NULL;
> return sent ? : err;
> }
>
I have tested this fixup patch on i386.
Can we have a separate patch with corrected descriptive text?
Thanks to all involved people.
- Sedat -
^ permalink raw reply
* Re: [PATCH] net: phy: Add config option to specify external switch port to be used if switch is used as PHY
From: Francois Romieu @ 2011-09-08 10:13 UTC (permalink / raw)
To: Lambrecht Jürgen
Cc: netdev@vger.kernel.org, linux-embedded@vger.kernel.org
In-Reply-To: <4E68668F.9060008@televic.com>
Lambrecht Jürgen <J.Lambrecht@TELEVIC.com> :
[...]
> diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
> index 6c58da2..016437a 100644
> --- a/drivers/net/phy/mdio_bus.c
> +++ b/drivers/net/phy/mdio_bus.c
> @@ -112,7 +112,14 @@ int mdiobus_register(struct mii_bus *bus)
> if (bus->reset)
> bus->reset(bus);
>
> + /* The config below is always availble with CONFIG_PHYLIB. If 0, the
> + behavior is as before without this patch (or P0 of the switch is
> + taken because it is the first one found). */
> +#if CONFIG_SWITCH_PHY
> + i = CONFIG_SWITCH_PHY;
> +#else
> for (i = 0; i < PHY_MAX_ADDR; i++) {
> +#endif
> if ((bus->phy_mask & (1 << i)) == 0) {
> struct phy_device *phydev;
This config option may help your platform but it is nowhere reusable on a
slightly different one (each mii_bus behavior is changed).
Which driver(s) do you use that you can not set phy_mask directly ?
--
Ueimor
^ permalink raw reply
* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
From: Michael S. Tsirkin @ 2011-09-08 11:08 UTC (permalink / raw)
To: Roopa Prabhu
Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
eric.dumazet, mchan, kvm
In-Reply-To: <CA8D9EAC.33A98%roprabhu@cisco.com>
On Wed, Sep 07, 2011 at 10:20:28PM -0700, Roopa Prabhu wrote:
> On 9/7/11 5:34 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
> > On Tue, Sep 06, 2011 at 03:35:40PM -0700, Roopa Prabhu wrote:
> >> This patch is an attempt at providing address filtering support for macvtap
> >> devices in PASSTHRU mode. Its still a work in progress.
> >> Briefly tested for basic functionality. Wanted to get some feedback on the
> >> direction before proceeding.
> >>
> >
> > Good work, thanks.
> >
>
> Thanks.
>
> >> I have hopefully CC'ed all concerned people.
> >
> > kvm crowd might also be interested.
> > Try using ./scripts/get_maintainer.pl as well.
> >
> Thanks for the tip. Expanded CC list a bit more.
>
> >> PASSTHRU mode today sets the lowerdev in promiscous mode. In PASSTHRU mode
> >> there is a 1-1 mapping between macvtap device and physical nic or VF. And all
> >> filtering is done in lowerdev hw. The lowerdev does not need to be in
> >> promiscous mode as long as the guest filters are passed down to the lowerdev.
> >> This patch tries to remove the need for putting the lowerdev in promiscous
> >> mode.
> >> I have also referred to the thread below where TUNSETTXFILTER was mentioned
> >> in
> >> this context:
> >> http://patchwork.ozlabs.org/patch/69297/
> >>
> >> This patch basically passes the addresses got by TUNSETTXFILTER to macvlan
> >> lowerdev.
> >>
> >> I have looked at previous work and discussions on this for qemu-kvm
> >> by Michael Tsirkin, Alex Williamson and Dragos Tatulea
> >> http://patchwork.ozlabs.org/patch/78595/
> >> http://patchwork.ozlabs.org/patch/47160/
> >> https://patchwork.kernel.org/patch/474481/
> >>
> >> Redhat bugzilla by Michael Tsirkin:
> >> https://bugzilla.redhat.com/show_bug.cgi?id=655013
> >>
> >> I used Michael's qemu-kvm patch for testing the changes with KVM
> >>
> >> I would like to cover both MAC and vlan filtering in this work.
> >>
> >> Open Questions/Issues:
> >> - There is a need for vlan filtering to complete the patch. It will require
> >> a new tap ioctl cmd for vlans.
> >> Some ideas on this are:
> >>
> >> a) TUNSETVLANFILTER: This will entail we send the whole vlan bitmap filter
> >> (similar to tun_filter for addresses). Passing the vlan id's to lower
> >> device will mean going thru the whole list of vlans every time.
> >>
> >> OR
> >>
> >> b) TUNSETVLAN with vlan id and flag to set/unset
> >>
> >> Does option 'b' sound ok ?
> >>
> >> - In this implementation we make the macvlan address list same as the address
> >> list that came in the filter with TUNSETTXFILTER. This will not cover cases
> >> where the macvlan device needs to have other addresses that are not
> >> necessarily in the filter. Is this a problem ?
> >
> > What cases do you have in mind?
> >
> This patch targets only macvlan PASSTHRU mode and for PASSTHRU mode I don't
> see a problem with uc/mc address list being the same in all the stacked
> netdevs in the path. I called that out above to make sure I was not missing
> any case in PASSTHRU mode where this might be invalid. Otherwise I don't see
> a problem in the simple PASSTHRU use case this patch supports.
>
> >> - The patch currently only supports passing of IFF_PROMISC and IFF_MULTICAST
> >> filter flags to lowerdev
> >>
> >> This patch series implements the following
> >> 01/3 - macvlan: Add support for unicast filtering in macvlan
> >> 02/3 - macvlan: Add function to set addr filter on lower device in passthru
> >> mode
> >> 03/3 - macvtap: Add support for TUNSETTXFILTER
> >>
> >> Please comment. Thanks.
> >>
> >> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
> >> Signed-off-by: Christian Benvenuti <benve@cisco.com>
> >> Signed-off-by: David Wang <dwang2@cisco.com>
> >
> > The security isn't lower than with promisc, so I don't see
> > a problem with this as such.
> >
> > There are more features we'll want down the road though,
> > so let's see whether the interface will be able to
> > satisfy them in a backwards compatible way before we
> > set it in stone. Here's what I came up with:
> >
> > How will the filtering table be partitioned within guests?
>
> Since this patch supports macvlan PASSTHRU mode only, in which the lower
> device has 1-1 mapping to the guest nic, it does not require any
> partitioning of filtering table within guests. Unless I missed understanding
> something.
> If the lower device were being shared by multiple guest network interfaces
> (non PASSTHRU mode), only then we will need to maintain separate filter
> tables for each guest network interface in macvlan and forward the pkt to
> respective guest interface after a filter lookup. This could affect
> performance too I think.
Not with hardware filtering support. Which is where we'd need to
partition the host nic mac table between guests.
> I chose to support PASSTHRU Mode only at first because its simpler and all
> code additions are in control path only.
I agree. It would be a bit silly to have a dedicated interface
for passthough and a completely separate one for
non passthrough.
> >
> > A way to limit what the guest can do would also be useful.
> > How can this be done? selinux?
>
> I vaguely remember a thread on the same context.. had a suggestion to
> maintain pre-approved address lists and allow guest filter registration of
> only those addresses for security. This seemed reasonable. Plus the ability
> to support additional address registration from guest could be made
> configurable (One of your ideas again from prior work).
>
> I am not an selinux expert, but I am thinking we can use it to only allow or
> disallow access or operations to the macvtap device. (?). I will check more
> on this.
We'd have to have a way to revoke that as well.
> >
> > Any thoughts on spoofing filtering?
>
> I can only think of checking addresses against an allowed address list.
> Don't know of any other ways. Any hints ?
Hardware (esp SRIOV) often has ways to do this check, too.
>
> In any case I am assuming all the protection/security measures should be
> taken at the layer calling the TUNSETTXFILTER ie..In macvtap virtualization
> use case its libvirt or qemu-kvm. No ?
Ideally we'd have a way to separate these capabilities, so that libvirt
can override qemu.
> >
> > Would it be possible to make the filtering programmable
> > using netlink, e.g. ethtool, ip, or some such?
>
> Should be possible via ethtool or ip calling ioctl TUNSETTXFILTER. Are you
> thinking of macvlan having a netlink interface to set filter and not ioctl
> ?. Sure.
Yes.
> But I was thinking the point of implementing TUNSETTXFILTER was to
> maintain compatibility with the generic tap interface that does the same
> thing.
Yes. OTOH I don't think anyone uses that ATM so it might not
be important if it's not a good fit.
E.g. we could notify libvirt and have it use netlink for us
if we like that better.
> And having both the netlink op and ioctl interface might not be clean ?.
No idea.
> Sorry if I misunderstood your question.
>
> > That would make this useful for bridged setups besides
> > macvtap/virtualization.
> >
>
> Thanks for the comments.
^ permalink raw reply
* (unknown),
From: DEACONNESS CLARA BENZ @ 2011-09-08 23:34 UTC (permalink / raw)
[-- Attachment #1: Type: text/plain, Size: 27 bytes --]
KINDLY DOWNLOAD ATTACHMENT
[-- Attachment #2: CLARA BENZ.txt --]
[-- Type: application/octet-stream, Size: 1159 bytes --]
I am happy to finally contact you on your new email id as i have been trying to reach you
regarding your 800,000USD Draft which i have with me.But due to the fact that i couldnt
reach you i then decided to deposite the draft with Fedex courier service United Kingdom(England).
for them to deliver it to you.I travelled out of the country for some official duties and
i will be staying for 3months.
Kindly contact them immediately by clicking reply to so that they can let you know when delivery will be
made to you.here is there contact number Tel- +447010039919(Mr Anthny moore).The tracking number will
be provided to you by Fedex.You are to send your mobile details.
You are requried to pay $85 USD as secuity deposite as they(Fedex) have refused to
accept the amount because they dont know when you will contact them and so decided to avoid any
sort of demurages.Make sure you confirm your postal address and telephone numbers to them when contacting them.
All other chrages including delivey and premiun charges have been paid already by me.Thanks once more and do contact
them as soon as you receive this email.fedex212unit@qatar.io
^ permalink raw reply
* Re: [PATCH] net: phy: Add config option to specify external switch port to be used if switch is used as PHY
From: Lambrecht Jürgen @ 2011-09-08 11:59 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev@vger.kernel.org, linux-embedded@vger.kernel.org
In-Reply-To: <20110908101315.GA27909@electric-eye.fr.zoreil.com>
On 09/08/2011 12:13 PM, Francois Romieu wrote:
>
> Lambrecht Jürgen <J.Lambrecht@TELEVIC.com> :
> [...]
> > diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
> > index 6c58da2..016437a 100644
> > --- a/drivers/net/phy/mdio_bus.c
> > +++ b/drivers/net/phy/mdio_bus.c
> > @@ -112,7 +112,14 @@ int mdiobus_register(struct mii_bus *bus)
> > if (bus->reset)
> > bus->reset(bus);
> >
> > + /* The config below is always availble with CONFIG_PHYLIB.
> If 0, the
> > + behavior is as before without this patch (or P0 of the
> switch is
> > + taken because it is the first one found). */
> > +#if CONFIG_SWITCH_PHY
> > + i = CONFIG_SWITCH_PHY;
> > +#else
> > for (i = 0; i < PHY_MAX_ADDR; i++) {
> > +#endif
> > if ((bus->phy_mask & (1 << i)) == 0) {
> > struct phy_device *phydev;
>
> This config option may help your platform but it is nowhere reusable on a
> slightly different one (each mii_bus behavior is changed).
>
> Which driver(s) do you use that you can not set phy_mask directly ?
>
The HW driver is 'FEC' for iMX Ethernet. For the PHY, just MII and PHYLIB.
I am rather new to linux, didn't knew phy_mask. Checked it now, and is
not set in fec.c.
You mean then to patch drivers/net/fec.c in the same way (as my current
patch) to set the phy_mask instead (via menuconfig, or in the platform
init)?
Thanks for your reply,
Jürgen
>
>
> --
> Ueimor
>
--
Jürgen Lambrecht
R&D Associate
Tel: +32 (0)51 303045 Fax: +32 (0)51 310670
http://www.televic-rail.com
Televic Rail NV - Leo Bekaertlaan 1 - 8870 Izegem - Belgium
Company number 0825.539.581 - RPR Kortrijk
^ permalink raw reply
* Re: [PATCH v2 3/9] socket: initial cgroup code.
From: Glauber Costa @ 2011-09-08 12:41 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: linux-kernel, xemul, netdev, linux-mm, Eric W. Biederman,
containers, David S. Miller
In-Reply-To: <20110908053558.GA9464@shutemov.name>
On 09/08/2011 02:35 AM, Kirill A. Shutemov wrote:
> On Thu, Sep 08, 2011 at 01:54:03AM -0300, Glauber Costa wrote:
>> On 09/07/2011 07:17 PM, Kirill A. Shutemov wrote:
>>> On Wed, Sep 07, 2011 at 01:23:13AM -0300, Glauber Costa wrote:
>>>> We aim to control the amount of kernel memory pinned at any
>>>> time by tcp sockets. To lay the foundations for this work,
>>>> this patch adds a pointer to the kmem_cgroup to the socket
>>>> structure.
>>>>
>>>> Signed-off-by: Glauber Costa<glommer@parallels.com>
>>>> CC: David S. Miller<davem@davemloft.net>
>>>> CC: Hiroyouki Kamezawa<kamezawa.hiroyu@jp.fujitsu.com>
>>>> CC: Eric W. Biederman<ebiederm@xmission.com>
>>>> ---
>>>> include/linux/kmem_cgroup.h | 29 +++++++++++++++++++++++++++++
>>>> include/net/sock.h | 2 ++
>>>> net/core/sock.c | 5 ++---
>>>> 3 files changed, 33 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/include/linux/kmem_cgroup.h b/include/linux/kmem_cgroup.h
>>>> index 0e4a74b..77076d8 100644
>>>> --- a/include/linux/kmem_cgroup.h
>>>> +++ b/include/linux/kmem_cgroup.h
>>>> @@ -49,5 +49,34 @@ static inline struct kmem_cgroup *kcg_from_task(struct task_struct *tsk)
>>>> return NULL;
>>>> }
>>>> #endif /* CONFIG_CGROUP_KMEM */
>>>> +
>>>> +#ifdef CONFIG_INET
>>>
>>> Will it break something if you define the helpers even if CONFIG_INET
>>> is not defined?
>>> It will be much cleaner. You can reuse ifdef CONFIG_CGROUP_KMEM in this
>>> case.
>>
>> The helpers inside CONFIG_INET are needed for the network code,
>> regardless of kmem cgroup is defined or not, not the other way around.
>>
>> So I could remove CONFIG_INET, but I can't possibly move it inside
>> CONFIG_CGROUP_KMEM. So this buy us nothing.
>
> You can define empty under CONFIG_CGROUP_KMEM's #else, can't you?
> Like with kcg_from_cgroup()/kcg_from_task().
>
Do you really think it is cleaner?
Why would I define empty something that is not empty at all?
Look again. Most of those helpers would be the exact same with or
without CONFIG_CGROUP_KMEM . The others, very few differences. If
CONFIG_INET bothers you, I can remove it altogether, making it
unconditional. But moving it inside CONFIG_CGROUP_KMEM makes no sense.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply
* [PATCH net-next v3] af_unix: Fix use-after-free crashes
From: Eric Dumazet @ 2011-09-08 13:21 UTC (permalink / raw)
To: sedat.dilek
Cc: Yan, Zheng, Tim Chen, Yan, Zheng, netdev@vger.kernel.org,
davem@davemloft.net, sfr@canb.auug.org.au, jirislaby@gmail.com,
Shi, Alex, Valdis Kletnieks
In-Reply-To: <CA+icZUWH8953m4qs74JLhNwrzJM0eMfE8aizJ1XtuVFFTtLKqg@mail.gmail.com>
Le jeudi 08 septembre 2011 à 11:59 +0200, Sedat Dilek a écrit :
> I have tested this fixup patch on i386.
> Can we have a separate patch with corrected descriptive text?
>
> Thanks to all involved people.
Here it is :
[PATCH net-next v3] af_unix: Fix use-after-free crashes
Commit 0856a30409 (Scm: Remove unnecessary pid & credential references
in Unix socket's send and receive path) introduced an use-after-free
bug.
We are allowed to steal the references to pid/cred only in the last skb
sent from unix_stream_sendmsg(), because first skbs might be consumed by
the receiver before we finish our sendmsg() call.
Remove scm_release() helper, since its cleaner to clear pid/cred fields
in unix_scm_to_skb() when we steal them.
Based on prior patches from Yan Zheng and Tim Chen
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Jiri Slaby <jirislaby@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@googlemail.com>
Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
---
include/net/scm.h | 9 ---------
net/unix/af_unix.c | 32 +++++++++++++++++---------------
2 files changed, 17 insertions(+), 24 deletions(-)
diff --git a/include/net/scm.h b/include/net/scm.h
index 68e1e48..2a5b42f 100644
--- a/include/net/scm.h
+++ b/include/net/scm.h
@@ -78,15 +78,6 @@ static __inline__ void scm_destroy(struct scm_cookie *scm)
__scm_destroy(scm);
}
-static __inline__ void scm_release(struct scm_cookie *scm)
-{
- /* keep ref on pid and cred */
- scm->pid = NULL;
- scm->cred = NULL;
- if (scm->fp)
- __scm_destroy(scm);
-}
-
static __inline__ int scm_send(struct socket *sock, struct msghdr *msg,
struct scm_cookie *scm)
{
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index e6d9d10..c8a08ba 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1379,15 +1379,18 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb)
}
static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb,
- bool send_fds, bool ref)
+ bool send_fds, bool steal_refs)
{
int err = 0;
- if (ref) {
+
+ if (!steal_refs) {
UNIXCB(skb).pid = get_pid(scm->pid);
UNIXCB(skb).cred = get_cred(scm->cred);
} else {
UNIXCB(skb).pid = scm->pid;
UNIXCB(skb).cred = scm->cred;
+ scm->pid = NULL;
+ scm->cred = NULL;
}
UNIXCB(skb).fp = NULL;
if (scm->fp && send_fds)
@@ -1454,7 +1457,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
if (skb == NULL)
goto out;
- err = unix_scm_to_skb(siocb->scm, skb, true, false);
+ err = unix_scm_to_skb(siocb->scm, skb, true, true);
if (err < 0)
goto out_free;
max_level = err + 1;
@@ -1550,7 +1553,7 @@ restart:
unix_state_unlock(other);
other->sk_data_ready(other, len);
sock_put(other);
- scm_release(siocb->scm);
+ scm_destroy(siocb->scm);
return len;
out_unlock:
@@ -1577,6 +1580,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
int sent = 0;
struct scm_cookie tmp_scm;
bool fds_sent = false;
+ bool steal_refs = false;
int max_level;
if (NULL == siocb->scm)
@@ -1638,11 +1642,14 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
size = min_t(int, size, skb_tailroom(skb));
- /* Only send the fds and no ref to pid in the first buffer */
- err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, fds_sent);
+ /* Only send the fds in first buffer
+ * Last buffer can steal our references to pid/cred
+ */
+ steal_refs = (sent + size >= len);
+ err = unix_scm_to_skb(siocb->scm, skb, !fds_sent, steal_refs);
if (err < 0) {
kfree_skb(skb);
- goto out;
+ goto out_err;
}
max_level = err + 1;
fds_sent = true;
@@ -1650,7 +1657,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
if (err) {
kfree_skb(skb);
- goto out;
+ goto out_err;
}
unix_state_lock(other);
@@ -1667,10 +1674,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
sent += size;
}
- if (skb)
- scm_release(siocb->scm);
- else
- scm_destroy(siocb->scm);
+ scm_destroy(siocb->scm);
siocb->scm = NULL;
return sent;
@@ -1683,9 +1687,7 @@ pipe_err:
send_sig(SIGPIPE, current, 0);
err = -EPIPE;
out_err:
- if (skb == NULL)
- scm_destroy(siocb->scm);
-out:
+ scm_destroy(siocb->scm);
siocb->scm = NULL;
return sent ? : err;
}
^ permalink raw reply related
* Re: [PATCH -next v2] unix stream: Fix use-after-free crashes
From: Eric Dumazet @ 2011-09-08 13:28 UTC (permalink / raw)
To: Tim Chen
Cc: Yan, Zheng, Yan, Zheng, netdev@vger.kernel.org,
davem@davemloft.net, sfr@canb.auug.org.au, jirislaby@gmail.com,
sedat.dilek@gmail.com, Shi, Alex, Valdis Kletnieks
In-Reply-To: <1315430766.2532.1.camel@edumazet-laptop>
Le mercredi 07 septembre 2011 à 23:26 +0200, Eric Dumazet a écrit :
> Le mercredi 07 septembre 2011 à 05:01 -0700, Tim Chen a écrit :
> > Eric, are you planning to do a fast path patch that doesn't do pid ref
> > for the case where CONFIG_PID_NS is not set?
> >
>
> Yes, I'll try to cook a patch.
Thinking a bit more on this issue, I really believe we should not stick
pid/cred in skbs sent from a write() system call.
That would break following use case :
An application uses a write(fd) and expects a receiver using recvmsg()
to get process credentials (SCM_CREDENTIALS)
This is currently working, but not documented (man unix says ancillary
data are sent with sendmsg())
If everybody agrees, I can send a patch for this : This would speedup
write()/read() af_unix by an order of magnitude.
^ permalink raw reply
* Re: [PATCH 45/62] net: irq: Remove IRQF_DISABLED
From: Yong Zhang @ 2011-09-08 13:41 UTC (permalink / raw)
To: Thomas Gleixner
Cc: perex, dhowells, prakity, netdev, chunkeey, sonic.zhang,
klaus.kudielka, t.sailer, khilman, eric.dumazet, jhovold, cyril,
rmk+kernel, mingo, jpr, adobriyan, jreuter, cbe-oss-dev,
martinez.javier, samuel, vapier, grundler, tklauser,
lucas.demarchi, u.kleine-koenig, olof, uclinux-dist-devel,
linux-hams, leitao, blogic, shawn.guo, nico, geoff, jkosina,
linux-wireless, linux-kernel, ralf, ralph.hempel, jdmas
In-Reply-To: <alpine.LFD.2.02.1109072000290.2723@ionos>
On Wed, Sep 07, 2011 at 08:03:14PM +0200, Thomas Gleixner wrote:
> On Wed, 7 Sep 2011, David Miller wrote:
>
> > From: Thomas Gleixner <tglx@linutronix.de>
> > Date: Wed, 7 Sep 2011 19:32:39 +0200 (CEST)
> >
> > > On Wed, 7 Sep 2011, David Miller wrote:
> > >
> > >> From: Yong Zhang <yong.zhang0@gmail.com>
> > >> Date: Wed, 7 Sep 2011 16:10:42 +0800
> > >>
> > >> > This flag is a NOOP and can be removed now.
> > >> >
> > >> > Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
> > >>
> > >> I have the same concerns here as I had for the sparc case.
> > >>
> > >> Some of these drivers might be using IRQF_DISABLED to make sure the
> > >> IRQ cannot be delivered until it is explicitly enabled via an
> > >> enable_irq() call.
> > >>
> > >> How is that being accomodated now?
> > >
> > > Again IRQF_DISABLED never ever had that functionality.
> >
> > My bad.
> >
> > But what if these interrupts want interrupts disabled during their
> > interrupt handler, for other reasons?
>
> We run ALL interrupt handlers with irqs disabled always.
>
> > This has the potential to break tons of stuff, especially on the
> > really old chips which almost no developers have any more but some
> > user might try to use.
>
> It won't. We removed IRQF_DISABLED from kernel/irq/* long ago
Yeah, and this is why we say it's a NOOP now :)
Thanks,
Yong
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox