From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from merlin.infradead.org ([205.233.59.134]:56197 "EHLO
	merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751251Ab2DRHX5 (ORCPT );
	Wed, 18 Apr 2012 03:23:57 -0400
Message-ID: <4F8E6BFD.10408@kernel.dk>
Date: Wed, 18 Apr 2012 09:23:41 +0200
From: Jens Axboe
MIME-Version: 1.0
Subject: Re: segfault runninng fio against 2048 jobs
References: <0FEAAA2D70C89D49A62173478A6C4A5D02DECC2A@XYUS-EX22.xyus.xyratex.com>
In-Reply-To: <0FEAAA2D70C89D49A62173478A6C4A5D02DECC2A@XYUS-EX22.xyus.xyratex.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: fio-owner@vger.kernel.org
List-Id: fio@vger.kernel.org
To: Roger Sibert
Cc: fio@vger.kernel.org

On 04/17/2012 11:05 PM, Roger Sibert wrote:
> Hello Everyone,
>
> I am using a 2.0x variant ran across a couple of things, one of which
> looks to be as designed and the other was a segfault in fio.
>
> My original job file had 4800 entries which exceeds the max limit.
> (error: maximum number of jobs (2048) reached) The question I have
> here , is there a reason the limit can't be raised to handle larger
> job files?

There's no inherent limit in fio that causes this, it was done to avoid
errors on platforms where shared memory segments were more limited. A
check now reveals that thread_data is around 15KB, which means that the
segment is around 30MB in total. You should be safe to bump the

#define REAL_MAX_JOBS 2048

in fio.h to something bigger. In fact I should just make it bigger, we
scale it down these days if we see errors.

> Reducing the job file to the max re-running it jumped straight to the initial print screen and then to a segfault. (Segmentation fault (core dumped))
>
> Doing a quick look gave me
>
> [root@localhost std-testing]# gdb fio core.9582
> GNU gdb (GDB) CentOS (7.0.1-42.el5.centos)
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> ...
> Reading symbols from /root/fio-test/std-testing/fio...done.
> [New Thread 9583]
> [New Thread 9582]
>
> warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fff213fd000
> Core was generated by `./fio --output=1.log 1.inp'.
> Program terminated with signal 11, Segmentation fault.
> #0 0x00000000004167b0 in display_thread_status (je=) at eta.c:416
> 416 eta.c: No such file or directory.
> in eta.c
> (gdb) quit
>
> I reduced the job count down to about 33 and re-started the run which I am waiting to finish so I can re-compile fio with whatever extra flags and to whatever code level are requested. Currently file gives me:
> fio: ELF 64-bit LSB executable, AMD x86-64, version 1 (GNU/Linux), for GNU/Linux 2.6.15, statically linked, not stripped
> Which is running on a CentOS box
> Linux localhost.localdomain 2.6.18-308.1.1.el5 #1 SMP Wed Mar 7 04:16:51 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

There's not enough information here to help you out, I'm afraid. What
fio version are you running? What job did you run that caused this
failure?

-- 
Jens Axboe
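
Archive note: the segment-size arithmetic in the reply (about 15KB per thread_data times 2048 job slots, giving roughly 30MB) can be sketched as below. This is only a back-of-the-envelope illustration, not fio code. It assumes the segment is simply one thread_data per job slot with no other overhead, and the 15KB figure is the approximate value quoted in the message and may differ between fio versions and builds. The function name segment_size_mb is hypothetical.

```python
# Rough estimate of the shared memory segment size for a given job limit.
# Assumption: segment size ~= max_jobs * sizeof(thread_data), nothing else.

THREAD_DATA_KB = 15    # approximate thread_data size quoted in the message
REAL_MAX_JOBS = 2048   # value of the #define in fio.h quoted in the message


def segment_size_mb(max_jobs: int, thread_data_kb: float = THREAD_DATA_KB) -> float:
    """Return the approximate segment size in MB for max_jobs job slots."""
    return max_jobs * thread_data_kb / 1024


if __name__ == "__main__":
    # The current limit, and the 4800-entry job file from the original report.
    for jobs in (REAL_MAX_JOBS, 4800):
        print(f"{jobs} jobs -> ~{segment_size_mb(jobs):.1f} MB")
```

Under these assumptions the 2048-job limit works out to about 30MB, matching the figure in the reply, and a limit large enough for the 4800-entry job file would need roughly 70MB.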