From: Alex Sidorenko <alexandre.sidorenko@hpe.com>
To: netdev@vger.kernel.org
Subject: A bug in team driver
Date: Tue, 04 Oct 2016 18:30:08 -0400 [thread overview]
Message-ID: <3444639.ILgt5kU9OR@zbook> (raw)
The problem was found on RHEL7.2 but is still present in the latest upstream kernel (according to visual sources inspection).
While using roundrobin runner we have noticed that after sending on team0 about 2.1 billion packets we started seeing 50% packet drop on team0
(according to 'netstat -i'). This number suggested 'signed int' overflow and indeed, inspecting the sources I have noticed the following in
drivers/net/team/team_mode_roundrobin.c
---------------------------------------
struct rr_priv {
unsigned int sent_packets; <--------- unsigned int
};
static struct rr_priv *rr_priv(struct team *team)
{
return (struct rr_priv *) &team->mode_priv;
}
static bool rr_transmit(struct team *team, struct sk_buff *skb)
{
struct team_port *port;
int port_index;
port_index = team_num_to_port_index(team,
rr_priv(team)->sent_packets++);
---
we have 'unsigned int sent_packets' but we call team_num_to_port_index where 'num' is 'int'
include/linux/if_team.h
-----------------------
static inline int team_num_to_port_index(struct team *team, int num) <-- signed int
{
int en_port_count = ACCESS_ONCE(team->en_port_count);
if (unlikely(!en_port_count))
return 0;
return num % en_port_count;
}
As soon as sent_packets becomes larger than MAXINT (=2**31-1), team_num_to_port_index() can return negative number as num becomes negative and remainder
(num % en_port_count) is either 0 or negative. This leads to looking up incorrect hash-bucket and dropping packets.
We have easily duplicated this in roundrobin mode with two ports. After reaching 2**31 packets sent on team0 every second packet was dropped.
Rebuilding the kernel after changing
team_num_to_port_index(struct team *team, int num) -> team_num_to_port_index(struct team *team, unsigned int num)
and running the test again does not show packet drop anymore.
The same subroutine is used in team_mode_loadbalance.c:lb_hash_select_tx_port but we pass 'unsigned char hash' to team_num_to_port_index(), so there should be no overflow. I did not test that mode in my tests.
Regards,
Alex
--
------------------------------------------------------------------
Alex Sidorenko email: asid@hpe.com
ERT Linux Hewlett-Packard Enterprise (Canada)
------------------------------------------------------------------
next reply other threads:[~2016-10-04 22:30 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-10-04 22:30 Alex Sidorenko [this message]
2016-10-05 3:08 ` A bug in team driver Eric Dumazet
2016-10-05 13:06 ` [PATCH net] Fixing a bug in team driver due to incorrect 'unsigned int' to 'int' conversion Alex Sidorenko
2016-10-05 19:37 ` Eric Dumazet
2016-10-06 5:19 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3444639.ILgt5kU9OR@zbook \
--to=alexandre.sidorenko@hpe.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox