* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
2017-07-24 16:20 ` Florian Westphal
@ 2017-07-25 7:27 ` Denys Fedoryshchenko
2017-07-27 6:29 ` Denys Fedoryshchenko
2017-08-25 2:58 ` Denys Fedoryshchenko
2 siblings, 0 replies; 8+ messages in thread
From: Denys Fedoryshchenko @ 2017-07-25 7:27 UTC (permalink / raw)
To: Florian Westphal; +Cc: Linux Kernel Network Developers
On 2017-07-24 19:20, Florian Westphal wrote:
> Florian Westphal <fw@strlen.de> wrote:
>> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> > Hi,
>> >
>> > I am trying to upgrade the kernel from 4.11.8 to 4.12.3 (it is a NAT/router
>> > handling approx. 2 Gbps of PPPoE user traffic) and noticed that after a
>> > while the server reboots (I have set reboot on panic, etc.).
>> > I can't run a serial console, and there is nothing in pstore / netconsole.
>> > The best I got is a very short message about a soft lockup in IPMI, but as
>> > storage is very limited there, it is nearly useless.
>> >
>> > From preliminary testing (I can't do much of it, as it's production), it
>> > seems the following lines cause the issue; they worked in 4.11.8 but no
>> > longer work in 4.12.3.
>>
>> Wild guess here, does this help?
>>
>> diff --git a/net/netfilter/nf_conntrack_helper.c
>> b/net/netfilter/nf_conntrack_helper.c
>> --- a/net/netfilter/nf_conntrack_helper.c
>> +++ b/net/netfilter/nf_conntrack_helper.c
>> @@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
>> struct nf_conn *tmpl,
>> help = nf_ct_helper_ext_add(ct, helper, flags);
>> if (help == NULL)
>> return -ENOMEM;
>> + if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
>
> sigh, stupid typo, should be no ';' at the end above.
Tested; it no longer seems to hang (before, it was hanging within
10 minutes).
I will probably wait for a full 24h testing cycle.
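For readers wondering why the trailing ';' matters: in C, `if (cond);` is a complete statement with an empty body, so whatever follows it runs unconditionally. Below is a minimal userspace sketch of that effect. It assumes the line after the posted `if` is `return -ENOMEM;` (the quoted diff does not show it), and it uses a hypothetical `toy_ext_add()` stand-in rather than the kernel's `nf_ct_ext_add()`.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an extension allocation that may fail.
 * This is NOT the kernel's nf_ct_ext_add(); it only models "NULL on failure". */
static void *toy_ext_add(int succeed)
{
    static int slot;
    return succeed ? (void *)&slot : NULL;
}

/* As first posted: the ';' terminates the if with an empty body,
 * so the return below it is executed on every call.
 * (The return line itself is an assumption; the diff does not show it.) */
static int assign_with_typo(int succeed)
{
    if (!toy_ext_add(succeed));
        return -ENOMEM;
    return 0; /* never reached */
}

/* With the ';' removed, the error return only runs on failure. */
static int assign_fixed(int succeed)
{
    if (!toy_ext_add(succeed))
        return -ENOMEM;
    return 0;
}
```

Under that assumption, `assign_with_typo()` reports an error whether or not the allocation succeeded, while `assign_fixed()` only does so on a real failure. Compilers can flag this pattern (for example via GCC's `-Wempty-body` and `-Wmisleading-indentation`).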
* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
2017-07-24 16:20 ` Florian Westphal
2017-07-25 7:27 ` Denys Fedoryshchenko
@ 2017-07-27 6:29 ` Denys Fedoryshchenko
2017-08-25 2:58 ` Denys Fedoryshchenko
2 siblings, 0 replies; 8+ messages in thread
From: Denys Fedoryshchenko @ 2017-07-27 6:29 UTC (permalink / raw)
To: Florian Westphal; +Cc: Linux Kernel Network Developers
On 2017-07-24 19:20, Florian Westphal wrote:
> Florian Westphal <fw@strlen.de> wrote:
>> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> > Hi,
>> >
>> > I am trying to upgrade the kernel from 4.11.8 to 4.12.3 (it is a NAT/router
>> > handling approx. 2 Gbps of PPPoE user traffic) and noticed that after a
>> > while the server reboots (I have set reboot on panic, etc.).
>> > I can't run a serial console, and there is nothing in pstore / netconsole.
>> > The best I got is a very short message about a soft lockup in IPMI, but as
>> > storage is very limited there, it is nearly useless.
>> >
>> > From preliminary testing (I can't do much of it, as it's production), it
>> > seems the following lines cause the issue; they worked in 4.11.8 but no
>> > longer work in 4.12.3.
>>
>> Wild guess here, does this help?
>>
>> diff --git a/net/netfilter/nf_conntrack_helper.c
>> b/net/netfilter/nf_conntrack_helper.c
>> --- a/net/netfilter/nf_conntrack_helper.c
>> +++ b/net/netfilter/nf_conntrack_helper.c
>> @@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
>> struct nf_conn *tmpl,
>> help = nf_ct_helper_ext_add(ct, helper, flags);
>> if (help == NULL)
>> return -ENOMEM;
>> + if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
>
> sigh, stupid typo, should be no ';' at the end above.
Tested-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Tested, and there have been no more hangs for 2 days; definitely an improvement.
Is there any chance it will go into stable 4.12.x and the next kernel release?
Thank you very much!
* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
2017-07-24 16:20 ` Florian Westphal
2017-07-25 7:27 ` Denys Fedoryshchenko
2017-07-27 6:29 ` Denys Fedoryshchenko
@ 2017-08-25 2:58 ` Denys Fedoryshchenko
2017-08-25 5:21 ` Florian Westphal
2 siblings, 1 reply; 8+ messages in thread
From: Denys Fedoryshchenko @ 2017-08-25 2:58 UTC (permalink / raw)
To: Florian Westphal; +Cc: Linux Kernel Network Developers
On 2017-07-24 19:20, Florian Westphal wrote:
> Florian Westphal <fw@strlen.de> wrote:
>> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> > Hi,
>> >
>> > I am trying to upgrade the kernel from 4.11.8 to 4.12.3 (it is a NAT/router
>> > handling approx. 2 Gbps of PPPoE user traffic) and noticed that after a
>> > while the server reboots (I have set reboot on panic, etc.).
>> > I can't run a serial console, and there is nothing in pstore / netconsole.
>> > The best I got is a very short message about a soft lockup in IPMI, but as
>> > storage is very limited there, it is nearly useless.
>> >
>> > From preliminary testing (I can't do much of it, as it's production), it
>> > seems the following lines cause the issue; they worked in 4.11.8 but no
>> > longer work in 4.12.3.
>>
>> Wild guess here, does this help?
>>
>> diff --git a/net/netfilter/nf_conntrack_helper.c
>> b/net/netfilter/nf_conntrack_helper.c
>> --- a/net/netfilter/nf_conntrack_helper.c
>> +++ b/net/netfilter/nf_conntrack_helper.c
>> @@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
>> struct nf_conn *tmpl,
>> help = nf_ct_helper_ext_add(ct, helper, flags);
>> if (help == NULL)
>> return -ENOMEM;
>> + if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
>
> sigh, stupid typo, should be no ';' at the end above.
Sorry, are there any plans to push this to the 4.12 stable queue?
* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
2017-08-25 2:58 ` Denys Fedoryshchenko
@ 2017-08-25 5:21 ` Florian Westphal
2017-08-25 7:15 ` Denys Fedoryshchenko
0 siblings, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2017-08-25 5:21 UTC (permalink / raw)
To: Denys Fedoryshchenko; +Cc: Florian Westphal, Linux Kernel Network Developers
Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
> >>> I am trying to upgrade the kernel from 4.11.8 to 4.12.3 (it is a NAT/router
> >>> handling approx. 2 Gbps of PPPoE user traffic) and noticed that after a
> >>> while the server reboots (I have set reboot on panic, etc.).
> >>> I can't run a serial console, and there is nothing in pstore / netconsole.
> >>> The best I got is a very short message about a soft lockup in IPMI, but as
> >>> storage is very limited there, it is nearly useless.
> >>>
> >>> From preliminary testing (I can't do much of it, as it's production), it
> >>> seems the following lines cause the issue; they worked in 4.11.8 but no
> >>> longer work in 4.12.3.
> >>
> >>Wild guess here, does this help?
> >>
> >>diff --git a/net/netfilter/nf_conntrack_helper.c
> >>b/net/netfilter/nf_conntrack_helper.c
> >>--- a/net/netfilter/nf_conntrack_helper.c
> >>+++ b/net/netfilter/nf_conntrack_helper.c
> >>@@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
> >>struct nf_conn *tmpl,
> >> help = nf_ct_helper_ext_add(ct, helper, flags);
> >> if (help == NULL)
> >> return -ENOMEM;
> >>+ if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
> >
> >sigh, stupid typo, should be no ';' at the end above.
> Sorry, is there any plans to push this to 4.12 stable queue?
No, sorry, this patch adds the extension for all connections
that use a helper, but the nat extension is only used/required by pptp
helper (and masquerade).
The thing is, this patch should not be needed. I will have
to review pptp again; maybe I missed a case where the extension is not
added.
Do you happen to have an oops backtrace?
That might speed this up a bit.
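To illustrate the objection above (the test patch attaches the NAT extension to every connection that gets a helper, although only some helpers use it), here is a toy userspace model. It is not kernel code and not a proposed fix: `struct toy_helper`, `struct toy_conn`, and the `needs_nat_ext` flag are all hypothetical names invented for this sketch.

```c
#include <stdbool.h>

/* Hypothetical model types; none of these exist in the kernel. */
struct toy_helper {
    const char *name;
    bool needs_nat_ext; /* assumed flag, e.g. true for a pptp-like helper */
};

struct toy_conn {
    bool has_helper_ext;
    bool has_nat_ext;
};

/* Models the test patch: every connection with a helper
 * also gets the NAT extension, whether it needs it or not. */
static void toy_assign_all(struct toy_conn *ct, const struct toy_helper *h)
{
    (void)h; /* helper type is ignored in this variant */
    ct->has_helper_ext = true;
    ct->has_nat_ext = true;
}

/* Models the narrower behaviour described in the reply:
 * only helpers that actually use the NAT extension get one. */
static void toy_assign_selective(struct toy_conn *ct, const struct toy_helper *h)
{
    ct->has_helper_ext = true;
    if (h->needs_nat_ext)
        ct->has_nat_ext = true;
}
```

In this model, a connection handled by a helper with `needs_nat_ext == false` ends up carrying a NAT extension only in the `toy_assign_all()` variant, which is the extra per-connection cost that makes the patch unsuitable as a general fix.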
* Re: nf_nat_pptp 4.12.3 kernel lockup/reboot
2017-08-25 5:21 ` Florian Westphal
@ 2017-08-25 7:15 ` Denys Fedoryshchenko
0 siblings, 0 replies; 8+ messages in thread
From: Denys Fedoryshchenko @ 2017-08-25 7:15 UTC (permalink / raw)
To: Florian Westphal; +Cc: Linux Kernel Network Developers
On 2017-08-25 08:21, Florian Westphal wrote:
> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> >>> I am trying to upgrade the kernel from 4.11.8 to 4.12.3 (it is a NAT/router
>> >>> handling approx. 2 Gbps of PPPoE user traffic) and noticed that after a
>> >>> while the server reboots (I have set reboot on panic, etc.).
>> >>> I can't run a serial console, and there is nothing in pstore / netconsole.
>> >>> The best I got is a very short message about a soft lockup in IPMI, but as
>> >>> storage is very limited there, it is nearly useless.
>> >>>
>> >>> From preliminary testing (I can't do much of it, as it's production), it
>> >>> seems the following lines cause the issue; they worked in 4.11.8 but no
>> >>> longer work in 4.12.3.
>> >>
>> >>Wild guess here, does this help?
>> >>
>> >>diff --git a/net/netfilter/nf_conntrack_helper.c
>> >>b/net/netfilter/nf_conntrack_helper.c
>> >>--- a/net/netfilter/nf_conntrack_helper.c
>> >>+++ b/net/netfilter/nf_conntrack_helper.c
>> >>@@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
>> >>struct nf_conn *tmpl,
>> >> help = nf_ct_helper_ext_add(ct, helper, flags);
>> >> if (help == NULL)
>> >> return -ENOMEM;
>> >>+ if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
>> >
>> >sigh, stupid typo, should be no ';' at the end above.
>> Sorry, is there any plans to push this to 4.12 stable queue?
>
> No, sorry, this patch adds the extension for all connections
> that use a helper, but the nat extension is only used/required by pptp
> helper (and masquerade).
>
> Thing is that this patch should not be needed, I will have
> to review pptp again, maybe i missed a case where the extension is not
> added.
>
> Do you happen to have an oops backtrace?
>
> That might speed this up a bit.
There is nothing in netconsole and nothing in ERST pstore either; I found
the cause just by guessing.
It is also totally headless (no screen, no serial console).
I can try to attach a USB serial adapter for a serial console, but I am not
sure it will help.
If there is any other way to catch it, I can try it, but as it's a
production server, I can't "crash it" more than once per day.