Patrick McHardy wrote: > Patrick McHardy wrote: >> BORBELY Zoltan wrote: >>> On Tue, Nov 18, 2008 at 12:07:20PM +0100, Patrick McHardy wrote: >>>>> --- /tmp/nf_conntrack_netlink.c-orig 2008-09-29 >>>>> 23:28:55.000000000 +0200 >>>>> +++ /tmp/nf_conntrack_netlink.c 2008-09-29 23:29:11.000000000 +0200 >>>>> @@ -1177,8 +1177,8 @@ >>>>> ct->master = master_ct; >>>>> } >>>>> - add_timer(&ct->timeout); >>>>> nf_conntrack_hash_insert(ct); >>>>> + add_timer(&ct->timeout); >>>>> rcu_read_unlock(); >>>> That code looks very fishy. We should be holding the conntrack lock, >>>> otherwise the addition is not only racy against the timer, but also >>>> against addition of identical conntracks. Let me look into what >>>> happened here. >>> >>> We have experienced a lot of kernel crashes, _every time_ in the >>> death_by_timeout() function while we were trying to add a new conntrack >>> entry from userspace via netlink (attached the disassembled version >>> of the function, ===> points to the EIP upon the crash). There was a >>> possibility, that we tried to add conntrack entries with zero timeout >>> value, maybe it's necessary to trigger this crash. The previous patch >>> has definitly solved the problem for us. >>> >>> I've got photos from various crashes, but it takes a little time to >>> find them. Please let me know if you want to see them. >> >> Thats not necessary, the problem is pretty obvious, I was mainly >> wondering at what point we broke it. >> >> I'll send you a patch soon. > > Could you try whether this patch fixes the problem? > > Pablo, do you recall the reason why the lock isn't held in > ctnetlink_create_conntrack()? The creation is done under the nfnl_mutex so that requests to create identical entries cannot race. Of course, this is not enough to avoid the race with the timer if we set a very small timer for a conntrack :(. AFAICS, we don't need to enclose the whole conntrack creation path. Would you prefer the patch attached? This patch should apply fine to 2.6.28-rc. I can send this patch to -stable. BTW, this patch may conflict with my patch enqueued for 2.6.29 that adds userspace reporting, let me know if I can give you a hand in some way (like sending it on top of this one or yours again, whatever). -- "Los honestos son inadaptados sociales" -- Les Luthiers