netdev.vger.kernel.org archive mirror
* nf_nat_pptp 4.12.3 kernel lockup/reboot
@ 2017-07-24 14:12 Denys Fedoryshchenko
  2017-07-24 16:19 ` Florian Westphal
  0 siblings, 1 reply; 8+ messages in thread
From: Denys Fedoryshchenko @ 2017-07-24 14:12 UTC
  To: Linux Kernel Network Developers; +Cc: Florian Westphal

Hi,

I am trying to upgrade from kernel 4.11.8 to 4.12.3 (the machine is a
NAT/router handling approx. 2 Gbps of PPPoE user traffic) and noticed that
after a while the server reboots (I have set it to reboot on panic, etc.).
I can't run a serial console, and there is nothing in pstore / netconsole.
The best I got is a very short message about a soft lockup in IPMI, but as
storage is very limited there, it is nearly useless.

From preliminary testing (I can't do much, as it's production), it seems
the following lines cause the issue; they worked in 4.11.8 and no longer
work in 4.12.3.

iptables -t raw -A PREROUTING -p tcp -m tcp --dport 1723 -j CT --helper pptp
iptables -t raw -A PREROUTING -p tcp -m tcp --sport 1723 -j CT --helper pptp

(there are no solid examples for helpers; I am not sure the second line is
necessary)

I will try to do more tests tonight (lockdep debugging, etc.), but maybe
someone has an idea of what might be wrong?


* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
  2017-07-24 14:12 nf_nat_pptp 4.12.3 kernel lockup/reboot Denys Fedoryshchenko
@ 2017-07-24 16:19 ` Florian Westphal
  2017-07-24 16:20   ` Florian Westphal
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2017-07-24 16:19 UTC
  To: Denys Fedoryshchenko; +Cc: Linux Kernel Network Developers, Florian Westphal

Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
> Hi,
> 
> I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
> approx 2gbps of pppoe users traffic) and noticed that after while server
> rebooting(i have set reboot on panic and etc).
> I can't run serial console, and in pstore / netconsole there is nothing.
> Best i got is some very short message about softlockup in ipmi, but as
> storage very limited there - it is near useless.
> 
> By preliminary testing (can't do it much, as it's production) - it seems
> following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.

Wild guess here, does this help?

diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct, struct nf_conn *tmpl,
                help = nf_ct_helper_ext_add(ct, helper, flags);
                if (help == NULL)
                        return -ENOMEM;
+              	if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
+                       return -ENOMEM;
        } else {
                /* We only allow helper re-assignment of the same sort since
                 * we cannot reallocate the helper extension area.


* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
  2017-07-24 16:19 ` Florian Westphal
@ 2017-07-24 16:20   ` Florian Westphal
  2017-07-25  7:27     ` Denys Fedoryshchenko
                       ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Florian Westphal @ 2017-07-24 16:20 UTC
  To: Florian Westphal; +Cc: Denys Fedoryshchenko, Linux Kernel Network Developers

Florian Westphal <fw@strlen.de> wrote:
> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
> > Hi,
> > 
> > I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
> > approx 2gbps of pppoe users traffic) and noticed that after while server
> > rebooting(i have set reboot on panic and etc).
> > I can't run serial console, and in pstore / netconsole there is nothing.
> > Best i got is some very short message about softlockup in ipmi, but as
> > storage very limited there - it is near useless.
> > 
> > By preliminary testing (can't do it much, as it's production) - it seems
> > following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
> 
> Wild guess here, does this help?
> 
> diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
> --- a/net/netfilter/nf_conntrack_helper.c
> +++ b/net/netfilter/nf_conntrack_helper.c
> @@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct, struct nf_conn *tmpl,
>                 help = nf_ct_helper_ext_add(ct, helper, flags);
>                 if (help == NULL)
>                         return -ENOMEM;
> +              	if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));

sigh, stupid typo, should be no ';' at the end above.
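
For readers unfamiliar with why the trailing ';' matters: in C, `if (cond);`
has an empty statement as its body, so the `return` on the next line runs
unconditionally. Below is a minimal userspace sketch of that effect. The names
`fake_ext_add()`, `assign_with_stray_semicolon()` and `assign_fixed()` are
hypothetical stand-ins invented for this illustration; this is not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an allocation helper: NULL means failure. */
static void *fake_ext_add(int should_succeed)
{
	static int slot;

	return should_succeed ? &slot : NULL;
}

/* Buggy form: the ';' ends the if statement, so the return always runs. */
int assign_with_stray_semicolon(int should_succeed)
{
	if (!fake_ext_add(should_succeed));
		return -ENOMEM;
	return 0;	/* never reached */
}

/* Intended form: return -ENOMEM only when the allocation failed. */
int assign_fixed(int should_succeed)
{
	if (!fake_ext_add(should_succeed))
		return -ENOMEM;
	return 0;
}

/* Self-check that records the behaviour described above. */
void check_semicolon_demo(void)
{
	assert(assign_with_stray_semicolon(1) == -ENOMEM);
	assert(assign_with_stray_semicolon(0) == -ENOMEM);
	assert(assign_fixed(1) == 0);
	assert(assign_fixed(0) == -ENOMEM);
}
```

With the stray ';' the function reports -ENOMEM even when the allocation
succeeded, which is why the test patch has to be applied without it. Compilers
can usually flag this pattern with warnings such as -Wempty-body or
-Wmisleading-indentation.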


* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
  2017-07-24 16:20   ` Florian Westphal
@ 2017-07-25  7:27     ` Denys Fedoryshchenko
  2017-07-27  6:29     ` Denys Fedoryshchenko
  2017-08-25  2:58     ` Denys Fedoryshchenko
  2 siblings, 0 replies; 8+ messages in thread
From: Denys Fedoryshchenko @ 2017-07-25  7:27 UTC
  To: Florian Westphal; +Cc: Linux Kernel Network Developers

On 2017-07-24 19:20, Florian Westphal wrote:
> Florian Westphal <fw@strlen.de> wrote:
>> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> > Hi,
>> >
>> > I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
>> > approx 2gbps of pppoe users traffic) and noticed that after while server
>> > rebooting(i have set reboot on panic and etc).
>> > I can't run serial console, and in pstore / netconsole there is nothing.
>> > Best i got is some very short message about softlockup in ipmi, but as
>> > storage very limited there - it is near useless.
>> >
>> > By preliminary testing (can't do it much, as it's production) - it seems
>> > following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
>> 
>> Wild guess here, does this help?
>> 
>> diff --git a/net/netfilter/nf_conntrack_helper.c 
>> b/net/netfilter/nf_conntrack_helper.c
>> --- a/net/netfilter/nf_conntrack_helper.c
>> +++ b/net/netfilter/nf_conntrack_helper.c
>> @@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct, 
>> struct nf_conn *tmpl,
>>                 help = nf_ct_helper_ext_add(ct, helper, flags);
>>                 if (help == NULL)
>>                         return -ENOMEM;
>> +              	if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
> 
> sigh, stupid typo, should be no ';' at the end above.

Tested; it no longer seems to hang (before, it was hanging within
10 minutes).
I will probably wait for a full 24h testing cycle.


* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
  2017-07-24 16:20   ` Florian Westphal
  2017-07-25  7:27     ` Denys Fedoryshchenko
@ 2017-07-27  6:29     ` Denys Fedoryshchenko
  2017-08-25  2:58     ` Denys Fedoryshchenko
  2 siblings, 0 replies; 8+ messages in thread
From: Denys Fedoryshchenko @ 2017-07-27  6:29 UTC
  To: Florian Westphal; +Cc: Linux Kernel Network Developers

On 2017-07-24 19:20, Florian Westphal wrote:
> Florian Westphal <fw@strlen.de> wrote:
>> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> > Hi,
>> >
>> > I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
>> > approx 2gbps of pppoe users traffic) and noticed that after while server
>> > rebooting(i have set reboot on panic and etc).
>> > I can't run serial console, and in pstore / netconsole there is nothing.
>> > Best i got is some very short message about softlockup in ipmi, but as
>> > storage very limited there - it is near useless.
>> >
>> > By preliminary testing (can't do it much, as it's production) - it seems
>> > following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
>> 
>> Wild guess here, does this help?
>> 
>> diff --git a/net/netfilter/nf_conntrack_helper.c 
>> b/net/netfilter/nf_conntrack_helper.c
>> --- a/net/netfilter/nf_conntrack_helper.c
>> +++ b/net/netfilter/nf_conntrack_helper.c
>> @@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct, 
>> struct nf_conn *tmpl,
>>                 help = nf_ct_helper_ext_add(ct, helper, flags);
>>                 if (help == NULL)
>>                         return -ENOMEM;
>> +              	if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
> 
> sigh, stupid typo, should be no ';' at the end above.

Tested-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>

Tested, and no more hangs for 2 days; definitely an improvement.
Any chance it will go into stable 4.12.x and the new kernel?

Thank you very much!


* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
  2017-07-24 16:20   ` Florian Westphal
  2017-07-25  7:27     ` Denys Fedoryshchenko
  2017-07-27  6:29     ` Denys Fedoryshchenko
@ 2017-08-25  2:58     ` Denys Fedoryshchenko
  2017-08-25  5:21       ` Florian Westphal
  2 siblings, 1 reply; 8+ messages in thread
From: Denys Fedoryshchenko @ 2017-08-25  2:58 UTC
  To: Florian Westphal; +Cc: Linux Kernel Network Developers

On 2017-07-24 19:20, Florian Westphal wrote:
> Florian Westphal <fw@strlen.de> wrote:
>> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> > Hi,
>> >
>> > I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
>> > approx 2gbps of pppoe users traffic) and noticed that after while server
>> > rebooting(i have set reboot on panic and etc).
>> > I can't run serial console, and in pstore / netconsole there is nothing.
>> > Best i got is some very short message about softlockup in ipmi, but as
>> > storage very limited there - it is near useless.
>> >
>> > By preliminary testing (can't do it much, as it's production) - it seems
>> > following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
>> 
>> Wild guess here, does this help?
>> 
>> diff --git a/net/netfilter/nf_conntrack_helper.c 
>> b/net/netfilter/nf_conntrack_helper.c
>> --- a/net/netfilter/nf_conntrack_helper.c
>> +++ b/net/netfilter/nf_conntrack_helper.c
>> @@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct, 
>> struct nf_conn *tmpl,
>>                 help = nf_ct_helper_ext_add(ct, helper, flags);
>>                 if (help == NULL)
>>                         return -ENOMEM;
>> +              	if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
> 
> sigh, stupid typo, should be no ';' at the end above.
Sorry, are there any plans to push this to the 4.12 stable queue?


* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
  2017-08-25  2:58     ` Denys Fedoryshchenko
@ 2017-08-25  5:21       ` Florian Westphal
  2017-08-25  7:15         ` Denys Fedoryshchenko
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2017-08-25  5:21 UTC
  To: Denys Fedoryshchenko; +Cc: Florian Westphal, Linux Kernel Network Developers

Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
> >>> I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
> >>> approx 2gbps of pppoe users traffic) and noticed that after while server
> >>> rebooting(i have set reboot on panic and etc).
> >>> I can't run serial console, and in pstore / netconsole there is nothing.
> >>> Best i got is some very short message about softlockup in ipmi, but as
> >>> storage very limited there - it is near useless.
> >>>
> >>> By preliminary testing (can't do it much, as it's production) - it seems
> >>> following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
> >>
> >>Wild guess here, does this help?
> >>
> >>diff --git a/net/netfilter/nf_conntrack_helper.c
> >>b/net/netfilter/nf_conntrack_helper.c
> >>--- a/net/netfilter/nf_conntrack_helper.c
> >>+++ b/net/netfilter/nf_conntrack_helper.c
> >>@@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
> >>struct nf_conn *tmpl,
> >>                help = nf_ct_helper_ext_add(ct, helper, flags);
> >>                if (help == NULL)
> >>                        return -ENOMEM;
> >>+              	if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
> >
> >sigh, stupid typo, should be no ';' at the end above.
> Sorry, is there any plans to push this to 4.12 stable queue?

No, sorry. This patch adds the extension for all connections
that use a helper, but the NAT extension is only used/required by the pptp
helper (and masquerade).

The thing is that this patch should not be needed. I will have
to review pptp again; maybe I missed a case where the extension is not
added.

Do you happen to have an oops backtrace?

That might speed this up a bit.
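
To illustrate the objection above (the test patch attaches the NAT extension to
every connection that uses a helper, although only pptp needs it), here is a
small userspace model. All types and functions in it (`model_conn`,
`model_assign_unconditional()`, `model_assign_selective()`, `model_has_nat()`)
are hypothetical and invented for this sketch. It does not use the kernel's
conntrack API, and the "selective" variant is an assumption for comparison, not
a fix proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of per-connection extensions; NOT the kernel's types. */
enum model_ext {
	MODEL_EXT_HELPER = 1 << 0,
	MODEL_EXT_NAT    = 1 << 1,
};

struct model_conn {
	unsigned int exts;	/* bitmask of attached extensions */
};

bool model_has_nat(const struct model_conn *c)
{
	return (c->exts & MODEL_EXT_NAT) != 0;
}

/* Models the test patch: any helper assignment also attaches NAT. */
void model_assign_unconditional(struct model_conn *c, const char *helper)
{
	(void)helper;	/* the helper name makes no difference here */
	c->exts |= MODEL_EXT_HELPER | MODEL_EXT_NAT;
}

/* Assumed narrower variant: attach NAT only for the pptp helper.
 * Masquerade is not a helper and is outside this model. */
void model_assign_selective(struct model_conn *c, const char *helper)
{
	c->exts |= MODEL_EXT_HELPER;
	if (strcmp(helper, "pptp") == 0)
		c->exts |= MODEL_EXT_NAT;
}

/* Self-check: "ftp" only gains NAT in the unconditional variant. */
void model_self_check(void)
{
	struct model_conn a = { 0 }, b = { 0 }, c = { 0 };

	model_assign_unconditional(&a, "ftp");
	model_assign_selective(&b, "ftp");
	model_assign_selective(&c, "pptp");
	assert(model_has_nat(&a));
	assert(!model_has_nat(&b));
	assert(model_has_nat(&c));
}
```

In this model, a connection using a helper such as "ftp" carries the extra NAT
extension only under the unconditional variant, which is the overhead the reply
above objects to.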


* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
  2017-08-25  5:21       ` Florian Westphal
@ 2017-08-25  7:15         ` Denys Fedoryshchenko
  0 siblings, 0 replies; 8+ messages in thread
From: Denys Fedoryshchenko @ 2017-08-25  7:15 UTC
  To: Florian Westphal; +Cc: Linux Kernel Network Developers

On 2017-08-25 08:21, Florian Westphal wrote:
> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> >>> I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
>> >>> approx 2gbps of pppoe users traffic) and noticed that after while server
>> >>> rebooting(i have set reboot on panic and etc).
>> >>> I can't run serial console, and in pstore / netconsole there is nothing.
>> >>> Best i got is some very short message about softlockup in ipmi, but as
>> >>> storage very limited there - it is near useless.
>> >>>
>> >>> By preliminary testing (can't do it much, as it's production) - it seems
>> >>> following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
>> >>
>> >>Wild guess here, does this help?
>> >>
>> >>diff --git a/net/netfilter/nf_conntrack_helper.c
>> >>b/net/netfilter/nf_conntrack_helper.c
>> >>--- a/net/netfilter/nf_conntrack_helper.c
>> >>+++ b/net/netfilter/nf_conntrack_helper.c
>> >>@@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
>> >>struct nf_conn *tmpl,
>> >>                help = nf_ct_helper_ext_add(ct, helper, flags);
>> >>                if (help == NULL)
>> >>                        return -ENOMEM;
>> >>+              	if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
>> >
>> >sigh, stupid typo, should be no ';' at the end above.
>> Sorry, is there any plans to push this to 4.12 stable queue?
> 
> No, sorry, this patch adds the extension for all connections
> that use a helper, but the nat extension is only used/required by pptp
> helper (and masquerade).
> 
> Thing is that this patch should not be needed, I will have
> to review pptp again, maybe i missed a case where the extension is not
> added.
> 
> Do you happen to have an oops backtrace?
> 
> That might speed this up a bit.
There is nothing in netconsole, and also nothing in ERST pstore; I found
the cause just by guessing.
The machine is also totally headless (no screen, no serial console).
I can try to attach a USB serial adapter for a serial console, but I am not
sure it will help.
If there is any other way to catch it, I can try, but as it's a
production server, I can't "crash it" more than once per day.

