* gcc 4.8.3 miscompiles drivers/net/ethernet/freescale/fec_main.c ?!
@ 2014-09-10 9:09 David Jander
2014-09-10 12:32 ` Mikael Pettersson
From: David Jander @ 2014-09-10 9:09 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
I am seeing a strange problem when building a recent kernel with gcc-4.8.3 for
armv7-a that contains the following patch:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=bfd4ecdd87d350e19457fe0d02fa1e046774c44e
Unfortunately I am not good enough at reading ARM assembly output from GCC to
understand what's going wrong, so I am asking for help.
I started noticing Ethernet packet loss on an i.MX6 board after upgrading the
kernel from 3.16-rc-something to latest mainline. The problem is very easy to
reproduce so I started git-bisecting. Git bisect gave me the above patch as
the culprit, and indeed: Without the patch a flood-ping goes fine (just one
dot on screen, no lost packets). I apply the patch and the dots start filling
the screen instantly.
I am compiling the kernel using Pengutronix's OSELAS toolchain version
2013.12.1, which is based on linaro gcc-4.8.3 without any relevant patches
AFAIK.
Compiling with -O2 breaks the code, while -Os seems to produce a correctly
working kernel.
I decided to make changes to the code and see if I could find other ways to
"fix" the problem, and I got the following result:
The above mentioned patch introduces the static function fec_enet_hwtstamp()
near line 1068 of fec_main.c. If I make an exact copy of this function, where
I only change the name (e.g. fec_enet_hwtstamp2), and change one of the two
places this function is called to instead use the other name, GCC inlines both
copies and the problem disappears!
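The experiment described above can be sketched roughly as follows. This is only a self-contained illustration under stated assumptions, not the actual fec_main.c code: the struct, the function bodies and every demo_* name are hypothetical stand-ins. It shows the duplicated static helper with one caller each, plus a possibly more direct way to run the same experiment using GCC's noinline and always_inline function attributes instead of relying on the inlining heuristics.

```c
/* Hypothetical illustration only; none of these names exist in fec_main.c. */
#include <assert.h>
#include <stdint.h>

/* Made-up descriptor type standing in for a buffer descriptor. */
struct demo_bd {
	uint32_t ts;	/* raw timestamp field */
};

/* "Original" static helper. With two callers, the compiler may decide
 * to emit one out-of-line copy instead of inlining it. */
static uint64_t demo_hwtstamp(const struct demo_bd *bd, uint64_t base)
{
	return base + bd->ts;
}

/* Exact copy under another name. With one caller per function, GCC
 * will typically inline both copies at -O2. */
static uint64_t demo_hwtstamp2(const struct demo_bd *bd, uint64_t base)
{
	return base + bd->ts;
}

/* Same body, but the inlining decision is forced rather than guessed. */
static __attribute__((noinline)) uint64_t
demo_hwtstamp_out(const struct demo_bd *bd, uint64_t base)
{
	return base + bd->ts;
}

static inline __attribute__((always_inline)) uint64_t
demo_hwtstamp_in(const struct demo_bd *bd, uint64_t base)
{
	return base + bd->ts;
}

/* Two call sites, standing in for the two places the helper is used. */
uint64_t demo_tx_path(const struct demo_bd *bd, uint64_t base)
{
	return demo_hwtstamp(bd, base);
}

uint64_t demo_rx_path(const struct demo_bd *bd, uint64_t base)
{
	return demo_hwtstamp2(bd, base);	/* switched to the copy */
}

/* All variants must compute the same value; only code generation differs. */
void demo_check_variants_agree(void)
{
	struct demo_bd bd = { .ts = 5 };

	assert(demo_tx_path(&bd, 10) == 15);
	assert(demo_rx_path(&bd, 10) == 15);
	assert(demo_hwtstamp_out(&bd, 10) == 15);
	assert(demo_hwtstamp_in(&bd, 10) == 15);
}
```

If a forced-inline build behaves differently from a forced-noinline build of otherwise identical source, that narrows things down to code generation or timing around the call, but by itself it does not tell a compiler bug apart from a latent race in the driver.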
Since I am neither very good with GCC internals nor very familiar with this
piece of code in fec_main.c, I am asking here for help in hunting down the
real bug, which I suspect is in GCC... but I want to know for sure.
Best regards,
--
David Jander
Protonic Holland.
* gcc 4.8.3 miscompiles drivers/net/ethernet/freescale/fec_main.c ?!
2014-09-10 9:09 gcc 4.8.3 miscompiles drivers/net/ethernet/freescale/fec_main.c ?! David Jander
@ 2014-09-10 12:32 ` Mikael Pettersson
2014-09-10 14:49 ` David Jander
From: Mikael Pettersson @ 2014-09-10 12:32 UTC (permalink / raw)
To: linux-arm-kernel
David Jander writes:
>
> Hi,
>
> I am seeing a strange problem when building a recent kernel with gcc-4.8.3 for
> armv7-a that contains the following patch:
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=bfd4ecdd87d350e19457fe0d02fa1e046774c44e
>
> Unfortunately I am not good enough at reading ARM assembly output from GCC to
> understand whats going wrong, so I am asking for help.
>
> I started noticing ethernet packet loss on a i.MX6 board after upgrading the
> kernel from 3.16-rc-something to latest mainline. The problem is very easy to
> reproduce so I started git-bisecting. Git bisect gave me the above patch as
> the culprit, and indeed: Without the patch a flood-ping goes fine (just one
> dot on screen, no lost packets). I apply the patch and the dots start filling
> the screen instantly.
>
> I am compiling the kernel using Pengutronix's OSELAS toolchain version
> 2013.12.1, which is based on linaro gcc-4.8.3 without any relevant patches
> AFAIK.
Linaro's toolchain is itself heavily modified compared to FSF gcc-4.8.3,
so first please try a pure vanilla FSF gcc-4.8.3, and then a likewise
vanilla gcc-4.9.1. If those also cause the malfunction, then you have
proof for a bug in upstream gcc (or possibly undefined code in the kernel),
otherwise the bug is likely Linaro's.
> Compiling with -O2 breaks the code, while -Os seems to produce a correctly
> working kernel.
>
> I decided to make changes to the code and see if I could find other ways to
> "fix" the problem, and I got the following result:
>
> The above mentioned patch introduces the static function fec_enet_hwtstamp()
> near line 1068 of fec_main.c. If I make an exact copy of this function, where
> I only change the name (e.g. fec_enet_hwtstamp2), and change one of the two
> places this function is called to instead use the other name, GCC inlines both
> copies and the problem disappears!
>
> Since I am not very good at GCC internals nor do I know this piece of code in
> fec_main.c very well, I am asking here for help in hunting down the real bug,
> which I suspect is in GCC... but I want to know for sure.
>
> Best regards,
>
> --
> David Jander
> Protonic Holland.
>
* gcc 4.8.3 miscompiles drivers/net/ethernet/freescale/fec_main.c ?!
2014-09-10 12:32 ` Mikael Pettersson
@ 2014-09-10 14:49 ` David Jander
2014-09-16 11:42 ` David Jander
From: David Jander @ 2014-09-10 14:49 UTC (permalink / raw)
To: linux-arm-kernel
(included Michael Olbrich in CC)...
On Wed, 10 Sep 2014 14:32:20 +0200
Mikael Pettersson <mikpelinux@gmail.com> wrote:
> David Jander writes:
> >
> > Hi,
> >
> > I am seeing a strange problem when building a recent kernel with
> > gcc-4.8.3 for armv7-a that contains the following patch:
> >
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=bfd4ecdd87d350e19457fe0d02fa1e046774c44e
> >
> > Unfortunately I am not good enough at reading ARM assembly output from
> > GCC to understand whats going wrong, so I am asking for help.
> >
> > I started noticing ethernet packet loss on a i.MX6 board after upgrading
> > the kernel from 3.16-rc-something to latest mainline. The problem is very
> > easy to reproduce so I started git-bisecting. Git bisect gave me the
> > above patch as the culprit, and indeed: Without the patch a flood-ping
> > goes fine (just one dot on screen, no lost packets). I apply the patch
> > and the dots start filling the screen instantly.
> >
> > I am compiling the kernel using Pengutronix's OSELAS toolchain version
> > 2013.12.1, which is based on linaro gcc-4.8.3 without any relevant patches
> > AFAIK.
>
> Linaro's toolchain is itself heavily modified compared to FSF gcc-4.8.3,
> so first please try a pure vanilla FSF gcc-4.8.3, and then a likewise
> vanilla gcc-4.9.1. If those also cause the malfunction, then you have
> proof for a bug in upstream gcc (or possibly undefined code in the kernel),
> otherwise the bug is likely Linaro's.
Thanks. I will try to build gcc-4.8.3 from vanilla FSF sources and try to
reproduce the problem there. Do you think there is a chance this is still a
kernel bug?
I have assembly output of both working and broken cases (inlined and
non-inlined function). I can post them here or send to anyone who wants to
try to make sense of it....
> > Compiling with -O2 breaks the code, while -Os seems to produce a correctly
> > working kernel.
> >
> > I decided to make changes to the code and see if I could find other ways
> > to "fix" the problem, and I got the following result:
> >
> > The above mentioned patch introduces the static function
> > fec_enet_hwtstamp() near line 1068 of fec_main.c. If I make an exact copy
> > of this function, where I only change the name (e.g. fec_enet_hwtstamp2),
> > and change one of the two places this function is called to instead use
> > the other name, GCC inlines both copies and the problem disappears!
> >
> > Since I am not very good at GCC internals nor do I know this piece of
> > code in fec_main.c very well, I am asking here for help in hunting down
> > the real bug, which I suspect is in GCC... but I want to know for sure.
> >
> > Best regards,
> >
> > --
> > David Jander
> > Protonic Holland.
Best regards,
--
David Jander
Protonic Holland.
* gcc 4.8.3 miscompiles drivers/net/ethernet/freescale/fec_main.c ?!
2014-09-10 14:49 ` David Jander
@ 2014-09-16 11:42 ` David Jander
From: David Jander @ 2014-09-16 11:42 UTC (permalink / raw)
To: linux-arm-kernel
Hi Mikael,
On Wed, 10 Sep 2014 16:49:20 +0200
David Jander <david@protonic.nl> wrote:
> On Wed, 10 Sep 2014 14:32:20 +0200
> Mikael Pettersson <mikpelinux@gmail.com> wrote:
>
> > David Jander writes:
> > >
> > > Hi,
> > >
> > > I am seeing a strange problem when building a recent kernel with
> > > gcc-4.8.3 for armv7-a that contains the following patch:
> > >
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=bfd4ecdd87d350e19457fe0d02fa1e046774c44e
> > >
> > > Unfortunately I am not good enough at reading ARM assembly output from
> > > GCC to understand whats going wrong, so I am asking for help.
> > >
> > > I started noticing ethernet packet loss on a i.MX6 board after upgrading
> > > the kernel from 3.16-rc-something to latest mainline. The problem is
> > > very easy to reproduce so I started git-bisecting. Git bisect gave me
> > > the above patch as the culprit, and indeed: Without the patch a
> > > flood-ping goes fine (just one dot on screen, no lost packets). I apply
> > > the patch and the dots start filling the screen instantly.
> > >
> > > I am compiling the kernel using Pengutronix's OSELAS toolchain version
> > > 2013.12.1, which is based on linaro gcc-4.8.3 without any relevant
> > > patches AFAIK.
> >
> > Linaro's toolchain is itself heavily modified compared to FSF gcc-4.8.3,
> > so first please try a pure vanilla FSF gcc-4.8.3, and then a likewise
> > vanilla gcc-4.9.1. If those also cause the malfunction, then you have
> > proof for a bug in upstream gcc (or possibly undefined code in the kernel),
> > otherwise the bug is likely Linaro's.
>
> Thanks. I will try to build gcc-4.8.3 from vanilla FSF sources and try to
> reproduce the problem there. Do you think there is a chance this is still a
> kernel bug?
> I have assembly output of both working and broken cases (inlined and
> non-inlined function). I can post them here or send to anyone who wants to
> try to make sense of it....
This is getting weird:
I have built both vanilla FSF toolchains: binutils-2.24 with gcc-4.8.3 and
with gcc-4.9.1.
I can't quite make sense of the results, other than to conclude that there
seems to be a race condition in fec_main.c in the Linux kernel:
gcc-4.8.3 with -O2: fails the same as with OSELAS.Toolchain/linaro gcc-4.8.3
gcc-4.8.3 with -Os: seems to work correctly
gcc-4.9.1 with -O2: loses packets, but much less often than with gcc-4.8.3 -O2
gcc-4.9.1 with -Os: fails even worse than with gcc-4.8.3 -O2
Any suggestion on how to make sense of this?
To me this looks slightly more like a kernel bug than a compiler bug...
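As a generic illustration of how a race can produce results that vary with compiler version and optimization level, and explicitly not as a diagnosis of fec_main.c, consider a descriptor whose payload fields are only valid once a status flag has been set. The sketch below uses C11 atomics in plain userspace C; the kernel has its own barrier primitives instead, and every demo_* name here is hypothetical. Without the acquire/release ordering shown, the compiler (or the CPU) would be free to move the read of len ahead of the status check, and whether that actually happens can depend on instruction scheduling and inlining choices.

```c
/* Hypothetical userspace sketch; not taken from the FEC driver. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BD_READY 0x1u

/* Made-up descriptor: len is only meaningful while READY is set. */
struct demo_desc {
	uint32_t len;
	_Atomic uint32_t status;
};

/* Producer side (stands in for the hardware or another CPU):
 * write the payload first, then publish it with release ordering. */
static void demo_publish(struct demo_desc *d, uint32_t len)
{
	d->len = len;
	atomic_store_explicit(&d->status, DEMO_BD_READY,
			      memory_order_release);
}

/* Consumer side: the acquire load keeps the read of len from being
 * moved before the status check. */
static bool demo_consume(struct demo_desc *d, uint32_t *len_out)
{
	uint32_t st = atomic_load_explicit(&d->status,
					   memory_order_acquire);

	if (!(st & DEMO_BD_READY))
		return false;

	*len_out = d->len;
	/* Hand the descriptor back; release keeps the read of len above
	 * from being ordered after this store. */
	atomic_store_explicit(&d->status, 0, memory_order_release);
	return true;
}

/* Single-threaded walk through the protocol. */
void demo_self_check(void)
{
	struct demo_desc d = { 0 };
	uint32_t len = 0;

	assert(!demo_consume(&d, &len));	/* nothing published yet */
	demo_publish(&d, 64);
	assert(demo_consume(&d, &len));
	assert(len == 64);
	assert(!demo_consume(&d, &len));	/* already consumed */
}
```

A single-threaded check like this only exercises the logic; it cannot demonstrate the race itself. The point is that code missing such ordering can appear to work with one compiler and optimization level and fail with another, which would be consistent with the mixed results above.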
> > > Compiling with -O2 breaks the code, while -Os seems to produce a
> > > correctly working kernel.
> > >
> > > I decided to make changes to the code and see if I could find other ways
> > > to "fix" the problem, and I got the following result:
> > >
> > > The above mentioned patch introduces the static function
> > > fec_enet_hwtstamp() near line 1068 of fec_main.c. If I make an exact
> > > copy of this function, where I only change the name (e.g.
> > > fec_enet_hwtstamp2), and change one of the two places this function is
> > > called to instead use the other name, GCC inlines both copies and the
> > > problem disappears!
> > >
> > > Since I am not very good at GCC internals nor do I know this piece of
> > > code in fec_main.c very well, I am asking here for help in hunting down
> > > the real bug, which I suspect is in GCC... but I want to know for sure.
> > >
Best regards,
--
David Jander
Protonic Holland.