linux-kernel.vger.kernel.org archive mirror
From: Nikolay Borisov <kernel@kyup.com>
To: Tejun Heo <tj@kernel.org>
Cc: "Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>,
	SiteGround Operations <operations@siteground.com>
Subject: Re: corruption causing crash in __queue_work
Date: Thu, 10 Dec 2015 11:28:02 +0200
Message-ID: <566945A2.1050208@kyup.com>
In-Reply-To: <20151209162744.GN30240@mtj.duckdns.org>



On 12/09/2015 06:27 PM, Tejun Heo wrote:
> Hello,
> 
> On Wed, Dec 09, 2015 at 06:23:15PM +0200, Nikolay Borisov wrote:
>> I think we are seeing this at least daily on at least 1 server (we have
>> multiple servers like that). So adding printk's would likely be the way
>> to go, anything in particular you might be interested in knowing? I see
>> RCU stuff around so might be tricky race condition.
> 
> Printing out the workqueue's pointer, name, pwq's pointer, the node
> being installed for and the installed pointer should give us enough
> clues.  There's RCU involved but the pointers shouldn't be becoming
> NULLs unless we're installing NULL ptrs.

The debug patch has been rolled out on one server, and several more
are in progress. Here is what it prints:

WQ: ffff88046f00ba00 (events_unbound) old_pwq:           (null) new_pwq: ffff88046f00d300 node: 0
WQ: ffff88046f00be00 (events_power_efficient) old_pwq:           (null) new_pwq: ffff88046f00d400 node: 0
WQ: ffff88046d71c000 (events_freezable_power_) old_pwq:           (null) new_pwq: ffff88046f00d500 node: 0
WQ: ffff88046ce9ca00 (khelper) old_pwq:           (null) new_pwq: ffff88046f00d600 node: 0
WQ: ffff88046ce9c000 (netns) old_pwq:           (null) new_pwq: ffff88046f00d700 node: 0
WQ: ffff88046ce9d400 (perf) old_pwq:           (null) new_pwq: ffff88046f00d800 node: 0
WQ: ffff88046c408000 (writeback) old_pwq:           (null) new_pwq: ffff88046c800000 node: 0
WQ: ffff88046c409200 (kacpi_hotplug) old_pwq:           (null) new_pwq: ffff88046c42e200 node: 0
WQ: ffff880468455600 (scsi_tmf_0) old_pwq:           (null) new_pwq: ffff88046c801f00 node: 0
WQ: ffff8804687f4400 (scsi_tmf_1) old_pwq:           (null) new_pwq: ffff88046caa6700 node: 0
WQ: ffff8804687f4c00 (scsi_tmf_2) old_pwq:           (null) new_pwq: ffff88046caa6900 node: 0
WQ: ffff8804687f5400 (scsi_tmf_3) old_pwq:           (null) new_pwq: ffff88046caa6b00 node: 0
WQ: ffff8804687f5c00 (scsi_tmf_4) old_pwq:           (null) new_pwq: ffff88046caa6d00 node: 0
WQ: ffff8804687f6400 (scsi_tmf_5) old_pwq:           (null) new_pwq: ffff88046caa7000 node: 0
WQ: ffff8804687f6c00 (scsi_tmf_6) old_pwq:           (null) new_pwq: ffff88046caa7300 node: 0
WQ: ffff880467964000 (kdmremove) old_pwq:           (null) new_pwq: ffff880467a3c800 node: 0
WQ: ffff880467965000 (deferwq) old_pwq:           (null) new_pwq: ffff880467a3c100 node: 0
WQ: ffff8804669bc600 (ib_addr) old_pwq:           (null) new_pwq: ffff88046845a600 node: 0
WQ: ffff88007d167e00 (qib0_0) old_pwq:           (null) new_pwq: ffff880466c19800 node: 0
WQ: ffff88007d165a00 (qib0_1) old_pwq:           (null) new_pwq: ffff880466c18e00 node: 0
WQ: ffff88007d165200 (ib_mad1) old_pwq:           (null) new_pwq: ffff880466c19d00 node: 0
WQ: ffff8804665d2000 (ib_mad2) old_pwq:           (null) new_pwq: ffff880466c18a00 node: 0
WQ: ffff8804667d7600 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff880469806100 node: 0
WQ: ffff880079a9fc00 (edac-poller) old_pwq:           (null) new_pwq: ffff88007d5ebf00 node: 0
WQ: ffff88046b47cc00 (kvm-irqfd-cleanup) old_pwq:           (null) new_pwq: ffff8804651f0f00 node: 0
WQ: ffff8804694baa00 (kloopd0) old_pwq:           (null) new_pwq: ffff88046949d100 node: 0
WQ: ffff880079a9cc00 (kloopd1) old_pwq:           (null) new_pwq: ffff8804698cb900 node: 0
WQ: ffff88046809dc00 (kloopd2) old_pwq:           (null) new_pwq: ffff88046957aa00 node: 0
WQ: ffff88046809c000 (kloopd3) old_pwq:           (null) new_pwq: ffff8804650acc00 node: 0
WQ: ffff880466f3b000 (kloopd4) old_pwq:           (null) new_pwq: ffff880469575900 node: 0
WQ: ffff88046809e800 (kloopd5) old_pwq:           (null) new_pwq: ffff880469888200 node: 0
WQ: ffff88046809de00 (kloopd6) old_pwq:           (null) new_pwq: ffff880469827400 node: 0
WQ: ffff88007d5f1c00 (dm_bufio_cache) old_pwq:           (null) new_pwq: ffff8804673dda00 node: 0
WQ: ffff88046c42a400 (dm-thin) old_pwq:           (null) new_pwq: ffff880079955100 node: 0
WQ: ffff8804672d0800 (dm-thin) old_pwq:           (null) new_pwq: ffff88046baed800 node: 0
WQ: ffff88046993fa00 (dm-thin) old_pwq:           (null) new_pwq: ffff8804650ff100 node: 0
WQ: ffff88046993d400 (dm-thin) old_pwq:           (null) new_pwq: ffff88046949d600 node: 0
WQ: ffff88046993e400 (dm-thin) old_pwq:           (null) new_pwq: ffff88046b833000 node: 0
WQ: ffff880466466400 (dm-thin) old_pwq:           (null) new_pwq: ffff88007da60d00 node: 0
WQ: ffff88046b3eb200 (dm-thin) old_pwq:           (null) new_pwq: ffff88046633d200 node: 0
WQ: ffff8804672d0600 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff880079955400 node: 0
WQ: ffff88046b3eb600 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff880465684900 node: 0
WQ: ffff88046c42a400 (dm-thin) old_pwq:           (null) new_pwq: ffff8800799ee900 node: 0
WQ: ffff880466f39a00 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff880469849e00 node: 0
WQ: ffff880467b0cc00 (dm-thin) old_pwq:           (null) new_pwq: ffff88007d52fa00 node: 0
WQ: ffff8804672d4e00 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff88046ca07f00 node: 0
WQ: ffff880079a9ca00 (dm-thin) old_pwq:           (null) new_pwq: ffff8802d1be9e00 node: 0
WQ: ffff880466175000 (dm-thin) old_pwq:           (null) new_pwq: ffff8802d8efec00 node: 0
WQ: ffff880403f28400 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff8802e224dd00 node: 0
WQ: ffff880403f29a00 (dm-thin) old_pwq:           (null) new_pwq: ffff880465685300 node: 0
WQ: ffff8804672d6c00 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff880466d69300 node: 0
WQ: ffff880466f3ba00 (dm-thin) old_pwq:           (null) new_pwq: ffff880469576500 node: 0
WQ: ffff8804672d4600 (dm-thin) old_pwq:           (null) new_pwq: ffff8802d1a1ee00 node: 0
WQ: ffff8803ccf5c200 (ext4-rsv-conversion) old_pwq:           (null) new_pwq: ffff8804657b3200 node: 0
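
For anyone who wants to sift through a longer capture of this output, here
is a minimal sketch of a parser. It is not part of the debug patch: the
names LINE_RE, parse_line and summarize are made up for illustration, and
it assumes every line keeps exactly the "WQ: ... old_pwq: ... new_pwq: ...
node: N" layout shown above. It flags any install of a NULL new_pwq and
any workqueue pointer that shows up more than once.

```python
import re
from collections import defaultdict

# Matches one line of the debug output in the format printed above.
# The field layout is an assumption taken from that output only.
LINE_RE = re.compile(
    r"WQ: (?P<wq>[0-9a-f]+) \((?P<name>[^)]*)\) "
    r"old_pwq:\s+(?P<old>\(null\)|[0-9a-f]+) "
    r"new_pwq:\s+(?P<new>\(null\)|[0-9a-f]+) "
    r"node: (?P<node>-?\d+)"
)


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["node"] = int(rec["node"])
    # Represent the kernel's "(null)" as Python None.
    for key in ("old", "new"):
        if rec[key] == "(null)":
            rec[key] = None
    return rec


def summarize(lines):
    """Collect NULL installs and workqueue pointers seen more than once."""
    null_installs = []
    by_wq = defaultdict(list)
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # skip unrelated dmesg lines
        if rec["new"] is None:
            null_installs.append(rec)
        by_wq[rec["wq"]].append(rec)
    repeated = {wq: recs for wq, recs in by_wq.items() if len(recs) > 1}
    return null_installs, repeated


if __name__ == "__main__":
    import sys

    nulls, repeated = summarize(sys.stdin)
    for rec in nulls:
        print("NULL new_pwq installed:", rec["wq"], rec["name"])
    for wq, recs in repeated.items():
        print("WQ seen", len(recs), "times:", wq, recs[0]["name"])
```

Fed the output above, this reports no NULL new_pwq installs. It does flag
ffff88046c42a400 (dm-thin), which appears twice with different new_pwq
values; that may simply be the address being reused after a workqueue was
destroyed, but I have not confirmed it.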

Is this format OK? I also observed the exact same crash
on a machine running a 4.1.12 kernel.

> 
> Thanks.
> 
