Flexible I/O Tester development
From: Jens Axboe <axboe@kernel.dk>
To: "Elliott, Robert (Server Storage)" <Elliott@hp.com>,
	"stephenmcameron@gmail.com" <stephenmcameron@gmail.com>
Cc: "fio@vger.kernel.org" <fio@vger.kernel.org>
Subject: Re: fio main thread got stuck over the weekend
Date: Fri, 12 Dec 2014 21:49:35 -0700
Message-ID: <548BC55F.9020706@kernel.dk>
In-Reply-To: <94D0CD8314A33A4D9D801C0FE68B40295940B8A0@G4W3202.americas.hpqcorp.net>

On 12/12/2014 01:32 PM, Elliott, Robert (Server Storage) wrote:
>
>
>> -----Original Message-----
>> From: fio-owner@vger.kernel.org [mailto:fio-owner@vger.kernel.org] On
>> Behalf Of Jens Axboe
>> Sent: Friday, 22 August, 2014 2:11 PM
>> To: scameron@beardog.cce.hp.com
> ...
>> On 2014-08-22 14:09, scameron@beardog.cce.hp.com wrote:
>>> On Fri, Aug 22, 2014 at 02:04:34PM -0500, Jens Axboe wrote:
>>>> On 2014-08-11 11:04, scameron@beardog.cce.hp.com wrote:
>>>>> On Mon, Aug 11, 2014 at 10:44:23AM -0500, scameron@beardog.cce.hp.com
>>>>> wrote:
>>>>>>
> ...
>>>>>
>>>>> From eta.c:
>>>>>
>>>>> void print_thread_status(void)
>>>>> {
>>>>>           struct jobs_eta *je;
>>>>>           size_t size;
>>>>>
>>>>>           je = get_jobs_eta(0, &size);
>>>>>           if (je)
>>>>>                   display_thread_status(je);
>>>>>
>>>>>           free(je);
>>>>> }
>>>>>
>>>>> Maybe je is coming back NULL? That would probably come from the
>>>>> return value of calc_thread_status(), and at a glance I'm not sure
>>>>> what calc_thread_status() is doing.
>>>>
>>>> I'll take a look at this next week, been away at a conference since
>>>> last weekend.
>>>
>>> Ok.  Meantime, I had to reclaim the machine for testing, so I no longer
>>> have it just sitting there to debug, and I have not seen the problem
>>> again that I know of.
>>
>> Clearly a hardware issue :-)
>>
>> --
>> Jens Axboe
>
> Rerunning a multi-day job to test out the 64-bit counter fixes,
> I just saw the same thing after about 2 days - eta updates stop,
> although IO is still running.
>
> Jobs: 210 (f=210): [r(98),X(14),r(112)] [31.5% done] [2388MB/0KB/0KB /s] [4891K/0/0 iops] [eta 01d:17h:05m:24s]
>
> I notice that get_jobs_eta makes a malloc() call without
> checking for NULL - maybe that happened?

If that happened, the frontend would crash, so I don't think that's too 
likely. But the patch is still sane, of course :-)
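
As a rough illustration of the check being discussed (not the actual fio
patch; struct jobs_eta_sketch, alloc_jobs_eta_sketch() and
print_status_sketch() are hypothetical names, and the real struct jobs_eta
has a different layout), an allocation helper can report failure to its
caller instead of letting a NULL pointer be dereferenced:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for fio's struct jobs_eta; the real layout differs. */
struct jobs_eta_sketch {
	unsigned int nr_running;
	unsigned int nr_threads;
	char run_str[];		/* one status character per thread, plus NUL */
};

/* Allocate a zeroed record. Returns NULL and sets *size to 0 on failure. */
static struct jobs_eta_sketch *alloc_jobs_eta_sketch(unsigned int nr_threads,
						     size_t *size)
{
	size_t bytes = sizeof(struct jobs_eta_sketch) + (size_t)nr_threads + 1;
	struct jobs_eta_sketch *je = malloc(bytes);

	if (!je) {
		/* Without this check, the memset() below would crash. */
		*size = 0;
		return NULL;
	}

	memset(je, 0, bytes);
	je->nr_threads = nr_threads;
	*size = bytes;
	return je;
}

/* Caller shaped like the quoted print_thread_status(): skip output on NULL. */
static int print_status_sketch(unsigned int nr_threads)
{
	size_t size;
	struct jobs_eta_sketch *je = alloc_jobs_eta_sketch(nr_threads, &size);

	if (!je)
		return -1;

	printf("threads=%u, record=%zu bytes\n", je->nr_threads, size);
	free(je);
	return 0;
}
```

With a check like this, an allocation failure would show up as a skipped
status update rather than a crash, which is a different symptom from the
frontend simply going quiet while IO continues.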

Is this close to when it stopped last time as well?

If you have it running, it would be great to do a gdb attach and see 
what the frontend is up to (or where it might be stuck)...

-- 
Jens Axboe



