public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Thomas Fjellstrom <thomas@fjellstrom.ca>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Subject: Re: recent issues with heavy delete's causing soft lockups
Date: Wed, 21 Nov 2018 14:25:54 -0700	[thread overview]
Message-ID: <1876163.D0WNILC8ec@natasha> (raw)
In-Reply-To: <baa9f4e4-34e2-1112-5d01-ce86c1b51005@kernel.dk>

On Friday, November 2, 2018 2:37:08 PM MST Jens Axboe wrote:
> On 11/2/18 2:32 PM, Thomas Fjellstrom wrote:
> > On Saturday, October 27, 2018 1:20:10 PM MDT Jens Axboe wrote:
> >> On Oct 27, 2018, at 12:40 PM, Thomas Fjellstrom <thomas@fjellstrom.ca>
> > 
> > [snip]
> > 
> >> Can you try 4.19? A patch went in since 4.18 that fixes a starvation
> >> issue
> >> around requeue conditions, which SATA is the one to most often hit.
> >> 
> >> Jens
> > 
> > I just had to do a clean, and I have the mq kernel options I mentioned in
> > my previous mail enabled. (mq should be disabled) and it appears to still
> > be causing issues. current io scheduler appears to be cfq, and it took
> > that "make clean" about 4 minutes, a lot of that time was spent with
> > plasma, intelij, and chrome all starved of IO.
> > 
> > I did switch to a terminal and checked iostat -d 1, and it showed very
> > little actual io for the time I was looking at it.
> > 
> > I have no idea what's going on.
> 
> If you're using cfq, then it's not using mq at all. Maybe do something ala:

Yeah, I switched off mq to test. I mentioned it in a previous mail.

> # perf record -ag -- sleep 10
> 
> while the slowdown is happening and then do perf report -g --no-children and
> see if that yields anything interesting. Sounds like time is being spent
> elsewhere and you aren't actually waiting on IO.

Ok, with the 4.19.1 kernel from linux-stable I've managed to catch the issue 
during real use, rather than just a dd command.

I should note that I have swap turned off, so I'm not sure what 
the "swapper" process in the below log is doing.

I also see the problem with swap enabled. But right now I'd rather 
certain apps die rather than slow the entire system down.

I also have a perf report -t log if that'd be helpful. It shows a lot of "use" 
in do_idle/acpi_idle_do_entry though I presume that's actual real idle time, 
not actual use.  The next most eye catching item in the -t log is chrome spending
17% of its time in glibc's free function.

(the top 100~ lines from perf report -g)

# Total Lost Samples: 0
#
# Samples: 456K of event 'cycles'
# Event count (approx.): 136347735217
#
# Overhead  Command          Shared Object                           Symbol                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
# ........  ...............  ......................................  .........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
#
    25.64%  swapper          [kernel.kallsyms]                       [k] acpi_idle_do_entry
            |
            ---0xffffffffa16000d4
               |          
               |--22.23%--start_secondary
               |          cpu_startup_entry
               |          do_idle
               |          cpuidle_enter_state
               |          acpi_idle_enter
               |          acpi_idle_do_entry
               |          
                --3.41%--start_kernel
                          cpu_startup_entry
                          do_idle
                          cpuidle_enter_state
                          acpi_idle_enter
                          acpi_idle_do_entry

     0.61%  swapper          [kernel.kallsyms]                       [k] apic_timer_interrupt
            |
            ---0xffffffffa16000d4
               |          
                --0.52%--start_secondary
                          cpu_startup_entry
                          do_idle
                          |          
                           --0.52%--cpuidle_enter_state

     0.54%  chrome           chrome                                  [.] _fini
     0.42%  swapper          [kernel.kallsyms]                       [k] native_sched_clock
     0.41%  swapper          [kernel.kallsyms]                       [k] menu_select
     0.40%  swapper          [kernel.kallsyms]                       [k] check_preemption_disabled
     0.35%  http.so          libQt5Core.so.5.11.2                    [.] QTranslatorPrivate::do_translate
     0.35%  swapper          [kernel.kallsyms]                       [k] x86_pmu_disable_all
     0.32%  TaskSchedulerFo  [kernel.kallsyms]                       [k] osq_lock
     0.31%  Chrome_IOThread  chrome                                  [.] _fini
     0.30%  chrome           libpthread-2.27.so                      [.] __pthread_mutex_lock
     0.29%  swapper          [kernel.kallsyms]                       [k] _raw_spin_lock_irqsave
     0.28%  swapper          [kernel.kallsyms]                       [k] read_tsc
     0.26%  chrome           libpthread-2.27.so                      [.] __pthread_mutex_unlock_usercnt
     0.26%  swapper          [kernel.kallsyms]                       [k] reschedule_interrupt
     0.24%  swapper          [kernel.kallsyms]                       [k] _raw_spin_lock
     0.24%  swapper          [kernel.kallsyms]                       [k] __sched_text_start
     0.24%  swapper          [kernel.kallsyms]                       [k] native_load_gs_index
     0.23%  swapper          [kernel.kallsyms]                       [k] __switch_to
     0.22%  swapper          [kernel.kallsyms]                       [k] do_idle
     0.21%  TaskSchedulerFo  [kernel.kallsyms]                       [k] mutex_lock
     0.21%  swapper          [kernel.kallsyms]                       [k] cpuidle_enter_state
     0.21%  TaskSchedulerFo  chrome                                  [.] 0x000000000306c000
     0.20%  chrome           [kernel.kallsyms]                       [k] native_sched_clock
     0.20%  TaskSchedulerFo  [kernel.kallsyms]                       [k] mutex_unlock
     0.18%  chrome           [kernel.kallsyms]                       [k] entry_SYSCALL_64
     0.18%  thumbnail.so     ld-2.27.so                              [.] do_lookup_x
     0.17%  Xorg             [kernel.kallsyms]                       [k] delay_tsc
     0.17%  rm               [ext4]                                  [k] ext4_mark_iloc_dirty
     0.16%  swapper          [kernel.kallsyms]                       [k] update_blocked_averages
     0.16%  chrome           [kernel.kallsyms]                       [k] check_preemption_disabled
     0.15%  swapper          [kernel.kallsyms]                       [k] update_load_avg
     0.15%  swapper          [kernel.kallsyms]                       [k] interrupt_entry
     0.15%  swapper          [kernel.kallsyms]                       [k] ktime_get
     0.15%  swapper          [kernel.kallsyms]                       [k] switch_mm_irqs_off
     0.15%  TaskSchedulerFo  [kernel.kallsyms]                       [k] __mutex_lock.isra.5
     0.14%  rm               [kernel.kallsyms]                       [k] check_preemption_disabled
     0.14%  TaskSchedulerFo  chrome                                  [.] 0x000000000306c009
     0.13%  swapper          [kernel.kallsyms]                       [k] __update_load_avg_se
     0.13%  chrome           libc-2.27.so                            [.] __memcpy_ssse3
     0.13%  swapper          [kernel.kallsyms]                       [k] __update_load_avg_cfs_rq
     0.12%  http.so          libQt5Core.so.5.11.2                    [.] QCoreApplicationPrivate::sendPostedEvents
     0.12%  rm               [kernel.kallsyms]                       [k] __find_get_block
     0.12%  swapper          [kernel.kallsyms]                       [k] timerqueue_add
     0.12%  swapper          [kernel.kallsyms]                       [k] acpi_idle_enter
     0.12%  apt-cache        libz.so.1.2.11                          [.] adler32_z
     0.12%  swapper          [kernel.kallsyms]                       [k] rcu_dynticks_eqs_exit
     0.12%  Xorg             [radeon]                                [k] cail_reg_read
     0.12%  swapper          [kernel.kallsyms]                       [k] trace_hardirqs_off
     0.11%  swapper          [kernel.kallsyms]                       [k] set_next_entity
     0.11%  swapper          [kernel.kallsyms]                       [k] _raw_spin_unlock_irqrestore
     0.11%  http.so          libQt5Core.so.5.11.2                    [.] QCoreApplication::translate
     0.11%  http.so          [kernel.kallsyms]                       [k] __switch_to
     0.11%  Chrome_ChildIOT  chrome                                  [.] _fini
     0.11%  chrome           [kernel.kallsyms]                       [k] __fget
     0.10%  swapper          [kernel.kallsyms]                       [k] __hrtimer_next_event_base
     0.10%  http.so          [kernel.kallsyms]                       [k] native_load_gs_index
     0.10%  swapper          [kernel.kallsyms]                       [k] rcu_check_callbacks
     0.10%  drkonqi          ld-2.27.so                              [.] do_lookup_x
     0.10%  TaskSchedulerFo  chrome                                  [.] 0x000000000306e42b
     0.10%  http.so          [kernel.kallsyms]                       [k] native_sched_clock
     0.10%  swapper          [kernel.kallsyms]                       [k] x86_pmu_enable_all
     0.10%  swapper          [kernel.kallsyms]                       [k] find_busiest_group
     0.10%  radeon_cs:0      [kernel.kallsyms]                       [k] refcount_sub_and_test_checked
     0.10%  http.so          [vdso]                                  [.] 0x00000000000008d9


Thanks,

-- 
Thomas Fjellstrom
thomas@fjellstrom.ca




      reply	other threads:[~2018-11-21 21:26 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-27 18:40 recent issues with heavy delete's causing soft lockups Thomas Fjellstrom
2018-10-27 19:20 ` Jens Axboe
2018-11-02 18:25   ` Thomas Fjellstrom
2018-11-02 20:32   ` Thomas Fjellstrom
2018-11-02 20:37     ` Jens Axboe
2018-11-21 21:25       ` Thomas Fjellstrom [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1876163.D0WNILC8ec@natasha \
    --to=thomas@fjellstrom.ca \
    --cc=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox