linux-btrace.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH]  Increased limits to allow for large system runs
@ 2009-01-22 18:55 Alan D. Brunelle
  2009-01-23 10:02 ` Jens Axboe
  0 siblings, 1 reply; 2+ messages in thread
From: Alan D. Brunelle @ 2009-01-22 18:55 UTC (permalink / raw)
  To: linux-btrace

[-- Attachment #1: Type: text/plain, Size: 0 bytes --]



[-- Attachment #2: 0001-Increased-limits-to-allow-for-large-system-runs.patch --]
[-- Type: text/x-diff, Size: 4161 bytes --]

On 16-way w/ 104 disks and a 32-way w/ 96 disks, I was getting:

$ sudo blktrace -b 1024 -n 8 -I ../files
./cciss_c1d6.blktrace.10: Too many open files
Failed to start worker threads

Due to the nature of our N(cpus) X N(devices) order of file opens, and
our N(cpus) X N(devices) X N(buffers) X (buffer size) amount of mmaps()
going on we're exceeding both the RLIMIT_NOFILE and RLIMIT_MEMLOCK
limits.

This patch raises limits for RLIMIT_NOFILE and RLIMIT_MEMLOCK to
"infinity", and allows blktrace to handle the large(ish) systems. (If
these settings fail, we "guestimate" about how much we really need.)

There is still an underlying blktrace and/or kernel problem: The
directory /sys/kernel/debug/block/<DSF> where <DSF> is the device that
encountered the limit is left behind (not cleaned up correctly). This
stops blktrace from running a second time (even on another device):

$  ls /sys/kernel/debug/block
cciss_c1d6
$ sudo blktrace /dev/sda
BLKTRACESETUP: No such file or directory
Failed to start trace on /dev/sda

and requires a reboot. (Looking into that next, as this patch - whilst
stopping the original problem from happening - does not address the
secondary problem. And there may be some other ways for the secondary
problem to still occur...)

I also fixed a warning concerning ftruncate's return value being
ignored.

Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
---
 blktrace.c |   65 +++++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/blktrace.c b/blktrace.c
index 7e27f14..afcc42f 100644
--- a/blktrace.c
+++ b/blktrace.c
@@ -43,6 +43,7 @@
 #include <arpa/inet.h>
 #include <netdb.h>
 #include <sys/sendfile.h>
+#include <sys/resource.h>
 
 #include "blktrace.h"
 #include "barrier.h"
@@ -347,6 +348,51 @@ static int net_connects;
 
 static int *net_out_fd;
 
+/*
+ * For large(-ish) systems, we run into real issues in our
+ * N(devs) X N(cpus) algorithms if we are being limited by arbitrary
+ * resource constraints.
+ *
+ * We try to set our limits to infinity, if that fails, we guestimate a max
+ * needed and try that.
+ */
+static int increase_limit(int r, rlim_t val)
+{
+	struct rlimit rlim;
+
+	rlim.rlim_cur = rlim.rlim_max = RLIM_INFINITY;
+	if (setrlimit(r, &rlim) < 0) {
+		rlim.rlim_cur = rlim.rlim_max = val;
+		if (setrlimit(r, &rlim) < 0) {
+			perror(r == RLIMIT_NOFILE ? "NOFILE" : "MEMLOCK");
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ *
+ * For the number of files: we need N(devs) X N(cpus) for:
+ *	o  ioctl's
+ *	o  read from /sys/kernel/debug/...
+ *	o  write to blktrace output file
+ *	o  Add some misc. extras - we'll muliply by 4 instead of 3
+ *
+ * For the memory locked, we know we need at least
+ *		N(devs) X N(cpus) X N(buffers) X buffer-size
+ * 	we double that for misc. extras
+ */
+static int increase_limits(void)
+{
+	rlim_t nofile_lim = 4 * ndevs * ncpus;
+	rlim_t memlock_lim = 2 * ndevs * ncpus * buf_nr * buf_size;
+
+	return increase_limit(RLIMIT_NOFILE, nofile_lim) != 0 ||
+	       increase_limit(RLIMIT_MEMLOCK, memlock_lim) != 0;
+}
+
 static void handle_sigint(__attribute__((__unused__)) int sig)
 {
 	struct device_information *dip;
@@ -659,7 +705,9 @@ static void tip_ftrunc_final(struct thread_information *tip)
 		if (tip->fs_buf)
 			munmap(tip->fs_buf, tip->fs_buf_len);
 
-		ftruncate(ofd, tip->fs_size);
+		if (ftruncate(ofd, tip->fs_size) < 0)
+			fprintf(stderr, "Ignoring error: ftruncate:  %d/%s\n",
+				errno, strerror(errno));
 	}
 }
 
@@ -1924,6 +1972,15 @@ int main(int argc, char *argv[])
 		return 1;
 	}
 
+	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	if (ncpus < 0) {
+		fprintf(stderr, "sysconf(_SC_NPROCESSORS_ONLN) failed\n");
+		return 1;
+	}
+
+	if (increase_limits() != 0)
+		return 1;
+
 	if (act_mask_tmp != 0)
 		act_mask = act_mask_tmp;
 
@@ -1949,12 +2006,6 @@ int main(int argc, char *argv[])
 		return 0;
 	}
 
-	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
-	if (ncpus < 0) {
-		fprintf(stderr, "sysconf(_SC_NPROCESSORS_ONLN) failed\n");
-		return 1;
-	}
-
 	signal(SIGINT, handle_sigint);
 	signal(SIGHUP, handle_sigint);
 	signal(SIGTERM, handle_sigint);
-- 
1.5.6.3


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH]  Increased limits to allow for large system runs
  2009-01-22 18:55 [PATCH] Increased limits to allow for large system runs Alan D. Brunelle
@ 2009-01-23 10:02 ` Jens Axboe
  0 siblings, 0 replies; 2+ messages in thread
From: Jens Axboe @ 2009-01-23 10:02 UTC (permalink / raw)
  To: linux-btrace

On Thu, Jan 22 2009, Alan D. Brunelle wrote:

> On 16-way w/ 104 disks and a 32-way w/ 96 disks, I was getting:
> 
> $ sudo blktrace -b 1024 -n 8 -I ../files
> ./cciss_c1d6.blktrace.10: Too many open files
> Failed to start worker threads
> 
> Due to the nature of our N(cpus) X N(devices) order of file opens, and
> our N(cpus) X N(devices) X N(buffers) X (buffer size) amount of mmaps()
> going on we're exceeding both the RLIMIT_NOFILE and RLIMIT_MEMLOCK
> limits.
> 
> This patch raises limits for RLIMIT_NOFILE and RLIMIT_MEMLOCK to
> "infinity", and allows blktrace to handle the large(ish) systems. (If
> these settings fail, we "guestimate" about how much we really need.)

Thanks Alan, I pushed it out.

> There is still an underlying blktrace and/or kernel problem: The
> directory /sys/kernel/debug/block/<DSF> where <DSF> is the device that
> encountered the limit is left behind (not cleaned up correctly). This
> stops blktrace from running a second time (even on another device):
> 
> $  ls /sys/kernel/debug/block
> cciss_c1d6
> $ sudo blktrace /dev/sda
> BLKTRACESETUP: No such file or directory
> Failed to start trace on /dev/sda
> 
> and requires a reboot. (Looking into that next, as this patch - whilst
> stopping the original problem from happening - does not address the
> secondary problem. And there may be some other ways for the secondary
> problem to still occur...)

Would be nice if you have time to get to the bottom of that!

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2009-01-23 10:02 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-22 18:55 [PATCH] Increased limits to allow for large system runs Alan D. Brunelle
2009-01-23 10:02 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).