* Re: [PATCH net 0/5] tcp: more robust ooo handling
From: David Miller @ 2018-07-23 19:03 UTC (permalink / raw)
  To: edumazet; +Cc: juha-matti.tilli, ycheng, soheil, netdev, eric.dumazet
In-Reply-To: <20180723162821.11556-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Mon, 23 Jul 2018 09:28:16 -0700

> Juha-Matti Tilli reported that malicious peers could inject tiny
> packets in out_of_order_queue, forcing very expensive calls
> to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for
> every incoming packet.
> 
> With tcp_rmem[2] default of 6MB, the ooo queue could
> contain ~7000 nodes.
> 
> This patch series makes sure we cut cpu cycles enough to
> render the attack not critical.
> 
> We might in the future go further, like disconnecting
> or black-holing proven malicious flows.

Sucky...

It took me a while to understand the sums_tiny logic; every
time I read that function I forget that we reset all of the
state and restart the loop after a coalesce inside the loop.

Series applied, and queued up for -stable.

Thanks!


* [ovmf test] 125523: all pass - PUSHED
From: osstest service owner @ 2018-07-23 20:06 UTC (permalink / raw)
  To: xen-devel, osstest-admin

flight 125523 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/125523/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf                 549ae85ce1b00228c3abcf6a9e4022c4f4fba5ed
baseline version:
 ovmf                 d9e206d4bf9124fe526baaa0ec56a7d2316ca6b3

Last test of basis   125510  2018-07-23 00:40:46 Z    0 days
Testing same since   125513  2018-07-23 05:27:10 Z    0 days    3 attempts

------------------------------------------------------------
People who touched revisions under test:
  Star Zeng <star.zeng@intel.com>
  Supreeth Venkatesh <supreeth.venkatesh@arm.com>
  Yonghong Zhu <yonghong.zhu@intel.com>
  Yunhua Feng <yunhuax.feng@intel.com>

jobs:
 build-amd64-xsm                                              pass    
 build-i386-xsm                                               pass    
 build-amd64                                                  pass    
 build-i386                                                   pass    
 build-amd64-libvirt                                          pass    
 build-i386-libvirt                                           pass    
 build-amd64-pvops                                            pass    
 build-i386-pvops                                             pass    
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         pass    
 test-amd64-i386-xl-qemuu-ovmf-amd64                          pass    


------------------------------------------------------------
sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   d9e206d4bf..549ae85ce1  549ae85ce1b00228c3abcf6a9e4022c4f4fba5ed -> xen-tested-master

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [PATCH] sched/numa: do not balance tasks onto isolated cpus
From: kbuild test robot @ 2018-07-23 20:06 UTC (permalink / raw)
  To: Chen Lin
  Cc: kbuild-all, mingo, peterz, linux-kernel, jiang.biao2,
	zhong.weidong, tan.hu, Chen Lin, Tan Hu
In-Reply-To: <1532324370-80651-1-git-send-email-chen.lin130@zte.com.cn>

[-- Attachment #1: Type: text/plain, Size: 1848 bytes --]

Hi Chen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/sched/core]
[also build test ERROR on v4.18-rc6 next-20180723]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Chen-Lin/sched-numa-do-not-balance-tasks-onto-isolated-cpus/20180724-031803
config: x86_64-randconfig-x008-201829 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   kernel//sched/fair.c: In function 'task_numa_find_cpu':
>> kernel//sched/fair.c:1726:46: error: 'cpu_isolated_map' undeclared (first use in this function); did you mean 'cpu_core_map'?
                        || cpumask_test_cpu(cpu, cpu_isolated_map))
                                                 ^~~~~~~~~~~~~~~~
                                                 cpu_core_map
   kernel//sched/fair.c:1726:46: note: each undeclared identifier is reported only once for each function it appears in

vim +1726 kernel//sched/fair.c

  1717	
  1718	static void task_numa_find_cpu(struct task_numa_env *env,
  1719					long taskimp, long groupimp)
  1720	{
  1721		int cpu;
  1722	
  1723		for_each_cpu(cpu, cpumask_of_node(env->dst_nid)) {
  1724			/* Skip this CPU if the source task cannot migrate */
  1725			if ((!cpumask_test_cpu(cpu, &env->p->cpus_allowed))
> 1726	                    || cpumask_test_cpu(cpu, cpu_isolated_map))
  1727				continue;
  1728	
  1729			env->dst_cpu = cpu;
  1730			task_numa_compare(env, taskimp, groupimp);
  1731		}
  1732	}
  1733	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 27462 bytes --]


* [Buildroot] [PATCH 0/3] perl-net-ssh2 fixes
From: Thomas De Schampheleire @ 2018-07-23 20:06 UTC (permalink / raw)
  To: buildroot

These are three related fixes to perl-net-ssh2, found while fixing the
autobuild issue
http://autobuild.buildroot.net/results/6ee18e7dd17f168c52f79e49cb5e94cf3aa3df1a/

Best regards,
Thomas

Thomas De Schampheleire (3):
  perl-net-ssh2: add missing dependency on zlib
  perl-net-ssh2: avoid build system inspecting host paths
  perl-net-ssh2: add support for libgcrypt crypto backend

 package/perl-net-ssh2/Config.in        |  9 ++++++++-
 package/perl-net-ssh2/perl-net-ssh2.mk | 10 +++++++++-
 2 files changed, 17 insertions(+), 2 deletions(-)

-- 
2.16.4


* [Buildroot] [PATCH 1/3] perl-net-ssh2: add missing dependency on zlib
From: Thomas De Schampheleire @ 2018-07-23 20:06 UTC (permalink / raw)
  To: buildroot
In-Reply-To: <20180723200628.2256-1-thomas.de_schampheleire@nokia.com>

perl-net-ssh2 requires zlib. When using the openssl backend to libssh2, this
dependency is implicit via openssl, but when using the libgcrypt backend the
dependency is missing.

Signed-off-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
---
 package/perl-net-ssh2/Config.in        | 1 +
 package/perl-net-ssh2/perl-net-ssh2.mk | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/package/perl-net-ssh2/Config.in b/package/perl-net-ssh2/Config.in
index 07c42ee5b1..8f359b7015 100644
--- a/package/perl-net-ssh2/Config.in
+++ b/package/perl-net-ssh2/Config.in
@@ -2,6 +2,7 @@ config BR2_PACKAGE_PERL_NET_SSH2
 	bool "perl-net-ssh2"
 	depends on !BR2_STATIC_LIBS
 	select BR2_PACKAGE_LIBSSH2
+	select BR2_PACKAGE_ZLIB
 	help
 	  Support for the SSH 2 protocol via libssh2.
 
diff --git a/package/perl-net-ssh2/perl-net-ssh2.mk b/package/perl-net-ssh2/perl-net-ssh2.mk
index 6d84deb284..b174fa6210 100644
--- a/package/perl-net-ssh2/perl-net-ssh2.mk
+++ b/package/perl-net-ssh2/perl-net-ssh2.mk
@@ -9,6 +9,6 @@ PERL_NET_SSH2_SOURCE = Net-SSH2-$(PERL_NET_SSH2_VERSION).tar.gz
 PERL_NET_SSH2_SITE = $(BR2_CPAN_MIRROR)/authors/id/S/SA/SALVA
 PERL_NET_SSH2_LICENSE = Artistic or GPL-1.0+
 PERL_NET_SSH2_LICENSE_FILES = README
-PERL_NET_SSH2_DEPENDENCIES = libssh2
+PERL_NET_SSH2_DEPENDENCIES = libssh2 zlib
 
 $(eval $(perl-package))
-- 
2.16.4


* [Buildroot] [PATCH 2/3] perl-net-ssh2: avoid build system inspecting host paths
From: Thomas De Schampheleire @ 2018-07-23 20:06 UTC (permalink / raw)
  To: buildroot
In-Reply-To: <20180723200628.2256-1-thomas.de_schampheleire@nokia.com>

During investigation of adding gcrypt support in perl-net-ssh2, it became
clear that its build system is trying to find libraries via host search
paths, i.e. /usr/lib64/ etc.

This can be avoided by explicitly passing a 'lib' and 'inc' path.

Signed-off-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
---
 package/perl-net-ssh2/perl-net-ssh2.mk | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/package/perl-net-ssh2/perl-net-ssh2.mk b/package/perl-net-ssh2/perl-net-ssh2.mk
index b174fa6210..77d39edef5 100644
--- a/package/perl-net-ssh2/perl-net-ssh2.mk
+++ b/package/perl-net-ssh2/perl-net-ssh2.mk
@@ -10,5 +10,9 @@ PERL_NET_SSH2_SITE = $(BR2_CPAN_MIRROR)/authors/id/S/SA/SALVA
 PERL_NET_SSH2_LICENSE = Artistic or GPL-1.0+
 PERL_NET_SSH2_LICENSE_FILES = README
 PERL_NET_SSH2_DEPENDENCIES = libssh2 zlib
+# build system will use host search paths by default
+PERL_NET_SSH2_CONF_OPTS += \
+	lib="$(STAGING_DIR)/usr/lib" \
+	inc="$(STAGING_DIR)/usr/include"
 
 $(eval $(perl-package))
-- 
2.16.4


* [Buildroot] [PATCH 3/3] perl-net-ssh2: add support for libgcrypt crypto backend
From: Thomas De Schampheleire @ 2018-07-23 20:06 UTC (permalink / raw)
  To: buildroot
In-Reply-To: <20180723200628.2256-1-thomas.de_schampheleire@nokia.com>

Fix usage of libgcrypt as crypto backend to libssh2, when building
perl-net-ssh2. In order to achieve that, we need to use 'depends on' for the
libssh2 backends, which means the user will first need to enable libssh2 and
one of the supported backends before being able to enable perl-net-ssh2.

Fixes
http://autobuild.buildroot.net/results/6ee18e7dd17f168c52f79e49cb5e94cf3aa3df1a/

Signed-off-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
---
 package/perl-net-ssh2/Config.in        | 8 +++++++-
 package/perl-net-ssh2/perl-net-ssh2.mk | 4 ++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/package/perl-net-ssh2/Config.in b/package/perl-net-ssh2/Config.in
index 8f359b7015..4dfd6e1dd3 100644
--- a/package/perl-net-ssh2/Config.in
+++ b/package/perl-net-ssh2/Config.in
@@ -1,12 +1,18 @@
 config BR2_PACKAGE_PERL_NET_SSH2
 	bool "perl-net-ssh2"
 	depends on !BR2_STATIC_LIBS
-	select BR2_PACKAGE_LIBSSH2
+	depends on BR2_PACKAGE_LIBSSH2_OPENSSL || BR2_PACKAGE_LIBSSH2_LIBGCRYPT
 	select BR2_PACKAGE_ZLIB
 	help
 	  Support for the SSH 2 protocol via libssh2.
 
+	  Note: only the OpenSSL and Libgcrypt backends of libssh2 are
+	  supported.
+
 	  https://metacpan.org/release/Net-SSH2
 
 comment "perl-net-ssh2 needs a toolchain w/ dynamic library"
 	depends on BR2_STATIC_LIBS
+
+comment "perl-net-ssh2 needs libssh2 with OpenSSL or Libgcrypt backend"
+	depends on !(BR2_PACKAGE_LIBSSH2_OPENSSL || BR2_PACKAGE_LIBSSH2_LIBGCRYPT)
diff --git a/package/perl-net-ssh2/perl-net-ssh2.mk b/package/perl-net-ssh2/perl-net-ssh2.mk
index 77d39edef5..ebd5803826 100644
--- a/package/perl-net-ssh2/perl-net-ssh2.mk
+++ b/package/perl-net-ssh2/perl-net-ssh2.mk
@@ -15,4 +15,8 @@ PERL_NET_SSH2_CONF_OPTS += \
 	lib="$(STAGING_DIR)/usr/lib" \
 	inc="$(STAGING_DIR)/usr/include"
 
+ifeq ($(BR2_PACKAGE_LIBSSH2_LIBGCRYPT),y)
+PERL_NET_SSH2_CONF_OPTS += gcrypt
+endif
+
 $(eval $(perl-package))
-- 
2.16.4


* Re: [PATCH] sched/numa: do not balance tasks onto isolated cpus
From: kbuild test robot @ 2018-07-23 20:06 UTC (permalink / raw)
  To: Chen Lin
  Cc: kbuild-all, mingo, peterz, linux-kernel, jiang.biao2,
	zhong.weidong, tan.hu, Chen Lin, Tan Hu
In-Reply-To: <1532324370-80651-1-git-send-email-chen.lin130@zte.com.cn>

[-- Attachment #1: Type: text/plain, Size: 2576 bytes --]

Hi Chen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/sched/core]
[also build test ERROR on v4.18-rc6 next-20180723]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Chen-Lin/sched-numa-do-not-balance-tasks-onto-isolated-cpus/20180724-031803
config: i386-randconfig-x008-201829 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   kernel/sched/core.c: In function 'migrate_swap':
>> kernel/sched/core.c:1283:46: error: 'cpu_isolated_map' undeclared (first use in this function); did you mean 'cpu_core_map'?
                || cpumask_test_cpu(arg.dst_cpu, cpu_isolated_map))
                                                 ^~~~~~~~~~~~~~~~
                                                 cpu_core_map
   kernel/sched/core.c:1283:46: note: each undeclared identifier is reported only once for each function it appears in

vim +1283 kernel/sched/core.c

  1256	
  1257	/*
  1258	 * Cross migrate two tasks
  1259	 */
  1260	int migrate_swap(struct task_struct *cur, struct task_struct *p)
  1261	{
  1262		struct migration_swap_arg arg;
  1263		int ret = -EINVAL;
  1264	
  1265		arg = (struct migration_swap_arg){
  1266			.src_task = cur,
  1267			.src_cpu = task_cpu(cur),
  1268			.dst_task = p,
  1269			.dst_cpu = task_cpu(p),
  1270		};
  1271	
  1272		if (arg.src_cpu == arg.dst_cpu)
  1273			goto out;
  1274	
  1275		/*
  1276		 * These three tests are all lockless; this is OK since all of them
  1277		 * will be re-checked with proper locks held further down the line.
  1278		 */
  1279		if (!cpu_active(arg.src_cpu) || !cpu_active(arg.dst_cpu))
  1280			goto out;
  1281	
  1282		if ((!cpumask_test_cpu(arg.dst_cpu, &arg.src_task->cpus_allowed))
> 1283	            || cpumask_test_cpu(arg.dst_cpu, cpu_isolated_map))
  1284			goto out;
  1285	
  1286		if ((!cpumask_test_cpu(arg.src_cpu, &arg.dst_task->cpus_allowed))
  1287	            || cpumask_test_cpu(arg.src_cpu, cpu_isolated_map))
  1288			goto out;
  1289	
  1290		trace_sched_swap_numa(cur, arg.src_cpu, p, arg.dst_cpu);
  1291		ret = stop_two_cpus(arg.dst_cpu, arg.src_cpu, migrate_swap_stop, &arg);
  1292	
  1293	out:
  1294		return ret;
  1295	}
  1296	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25146 bytes --]


* Changing non-volatile access to volatile in counter examples
From: Imre Palik @ 2018-07-23 20:07 UTC (permalink / raw)
  To: paulmck; +Cc: perfbook


This series changes some of the counter examples to use volatile accesses, to
prevent overly eager compilers/linkers from optimising out necessary reads/writes.
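
As a minimal sketch of the idea (not code taken from the series): a plain
counter++ in a loop may legally be merged by the compiler into a single
addition, or kept in a register, whereas a volatile access has to be performed
each time. The READ_ONCE()/WRITE_ONCE() macros below are simplified stand-ins
that assume a GCC/Clang-style __typeof__; the real definitions in the
CodeSamples api.h may differ. count_demo() is a hypothetical helper added only
for illustration.

```c
#include <assert.h>

/* Simplified stand-ins (assumption): a volatile cast makes the compiler
 * perform exactly one real load or store for each use. */
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

unsigned long counter;

/* One volatile load followed by one volatile store per call. Note that
 * this is still not an atomic read-modify-write. */
static inline void inc_count(void)
{
	WRITE_ONCE(counter, READ_ONCE(counter) + 1);
}

static inline unsigned long read_count(void)
{
	return READ_ONCE(counter);
}

/* Hypothetical helper: increment n times, then read the counter back. */
static unsigned long count_demo(int n)
{
	int i;

	assert(n >= 0);
	counter = 0;
	for (i = 0; i < n; i++)
		inc_count();
	return read_count();
}
```

Calling count_demo(1000) from a single thread returns 1000 either way; the
difference is only in how many loads and stores the compiler has to emit,
which is what the performance tests are supposed to measure.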



* [PATCH 1/4] Changing counttorture defaults.
From: Imre Palik @ 2018-07-23 20:07 UTC (permalink / raw)
  To: paulmck; +Cc: perfbook, Palik, Imre
In-Reply-To: <1532376448-15103-1-git-send-email-imrep.amz@gmail.com>

From: "Palik, Imre" <imrep.amz@gmail.com>

As the counter implementations are supposed to implement write-mostly
parallelism, this patch changes the default behaviour of counttorture to reflect
it.

Signed-off-by: Imre Palik <imrep.amz@gmail.com>
---
 CodeSamples/count/counttorture.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/CodeSamples/count/counttorture.h b/CodeSamples/count/counttorture.h
index 3276d68..ff0dd72 100644
--- a/CodeSamples/count/counttorture.h
+++ b/CodeSamples/count/counttorture.h
@@ -164,19 +164,19 @@ void perftestrun(int nthreads, int nreaders, int nupdaters)
 	exit(EXIT_SUCCESS);
 }

-void perftest(int nreaders, int cpustride)
+void perftest(int nwriters, int cpustride)
 {
 	int i;
 	long arg;

-	perftestinit(nreaders + 1);
-	for (i = 0; i < nreaders; i++) {
+	perftestinit(nwriters + 1);
+	for (i = 0; i < nwriters; i++) {
 		arg = (long)(i * cpustride);
-		create_thread(count_read_perf_test, (void *)arg);
+		create_thread(count_update_perf_test, (void *)arg);
 	}
 	arg = (long)(i * cpustride);
-	create_thread(count_update_perf_test, (void *)arg);
-	perftestrun(i + 1, nreaders, 1);
+	create_thread(count_read_perf_test, (void *)arg);
+	perftestrun(i + 1, 1, nwriters);
 }

 void rperftest(int nreaders, int cpustride)
-- 
2.7.4



* [PATCH 2/4] Making the counter implementations safer
From: Imre Palik @ 2018-07-23 20:07 UTC (permalink / raw)
  To: paulmck; +Cc: perfbook, Palik, Imre
In-Reply-To: <1532376448-15103-1-git-send-email-imrep.amz@gmail.com>

From: "Palik, Imre" <imrep.amz@gmail.com>

Relevant parts of some of the counter implementations were prone to be optimised
out by an overly eager compiler/linker.

This patch makes the compiler's task easier by declaring big parts of the
implementation inline, and then fixes the issue.

Some barriers were also removed from the counttorture framework, as a proper
multithreaded implementation should provide its own ordering guarantees.
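
For reference, a rough sketch of what a harness-side barrier() does for a
plain increment. The barrier() definition is the usual GCC/Clang compiler
barrier and is an assumption here, as are the names plain_counter, inc_plain()
and run_with_barrier(); none of them are taken from counttorture.h.

```c
#include <assert.h>

/* Compiler-only barrier (GCC/Clang extended asm, an assumption here).
 * The "memory" clobber tells the compiler that memory may have been
 * read or written, so it cannot keep globals cached in registers
 * across this point. No CPU instruction is emitted. */
#define barrier() __asm__ __volatile__("" : : : "memory")

unsigned long plain_counter;	/* hypothetical example variable */

/* Plain increment: without help, repeated calls may be merged. */
static inline void inc_plain(void)
{
	plain_counter++;
}

/* Old torture-loop shape: the harness, not the counter, supplies the
 * barrier that keeps the loop from collapsing into a single addition. */
static unsigned long run_with_barrier(int n)
{
	int i;

	assert(n >= 0);
	plain_counter = 0;
	for (i = 0; i < n; i++) {
		inc_plain();
		barrier();
	}
	return plain_counter;
}
```

With the barrier in the harness, even a counter implementation that forgets
its own volatile accesses appears to behave, which hides exactly the kind of
problem described above.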

Signed-off-by: Imre Palik <imrep.amz@gmail.com>
---
 CodeSamples/count/Makefile              | 30 ++++++++++++++++--------------
 CodeSamples/count/count_atomic.c        |  6 +++---
 CodeSamples/count/count_end.c           |  9 +++++----
 CodeSamples/count/count_end_rcu.c       |  7 ++++---
 CodeSamples/count/count_nonatomic.c     |  8 ++++----
 CodeSamples/count/count_stack.c         |  9 +++++----
 CodeSamples/count/count_stat.c          |  6 +++---
 CodeSamples/count/count_stat_atomic.c   |  6 +++---
 CodeSamples/count/count_stat_eventual.c |  2 +-
 CodeSamples/count/count_tstat.c         |  3 ++-
 CodeSamples/count/counttorture.h        |  2 --
 11 files changed, 46 insertions(+), 42 deletions(-)

diff --git a/CodeSamples/count/Makefile b/CodeSamples/count/Makefile
index eacdb57..481eb3f 100644
--- a/CodeSamples/count/Makefile
+++ b/CodeSamples/count/Makefile
@@ -43,49 +43,51 @@ else
 all: $(PROGS)
 endif

+CC?=cc
+
 include $(top)/recipes.mk

 count_atomic: count_atomic.c ../api.h counttorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_atomic count_atomic.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_atomic count_atomic.c -lpthread

 count_end: count_end.c ../api.h counttorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_end count_end.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_end count_end.c -lpthread

 count_end_rcu: count_end_rcu.c ../api.h counttorture.h $(RCU_SRCS)
-	cc $(GCC_ARGS) $(CFLAGS) -o count_end_rcu count_end_rcu.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_end_rcu count_end_rcu.c -lpthread

 count_lim: count_lim.c ../api.h limtorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_lim count_lim.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_lim count_lim.c -lpthread

 count_lim_app: count_lim_app.c ../api.h limtorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_lim_app count_lim_app.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_lim_app count_lim_app.c -lpthread

 count_lim_atomic: count_lim_atomic.c ../api.h limtorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_lim_atomic count_lim_atomic.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_lim_atomic count_lim_atomic.c -lpthread

 count_lim_sig: count_lim_sig.c ../api.h limtorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_lim_sig count_lim_sig.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_lim_sig count_lim_sig.c -lpthread

 count_limd: count_limd.c ../api.h limtorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_limd count_limd.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_limd count_limd.c -lpthread

 count_nonatomic: count_nonatomic.c ../api.h counttorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_nonatomic count_nonatomic.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_nonatomic count_nonatomic.c -lpthread

 count_stack: count_stack.c ../api.h counttorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_stack count_stack.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_stack count_stack.c -lpthread

 count_stat: count_stat.c ../api.h counttorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_stat count_stat.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_stat count_stat.c -lpthread

 count_stat_atomic: count_stat_atomic.c ../api.h counttorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_stat_atomic count_stat_atomic.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_stat_atomic count_stat_atomic.c -lpthread

 count_stat_eventual: count_stat_eventual.c ../api.h counttorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_stat_eventual count_stat_eventual.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_stat_eventual count_stat_eventual.c -lpthread

 count_tstat: count_tstat.c ../api.h counttorture.h
-	cc $(GCC_ARGS) $(CFLAGS) -o count_tstat count_tstat.c -lpthread
+	$(CC) $(GCC_ARGS) $(CFLAGS) -o count_tstat count_tstat.c -lpthread

 clean:
 	rm -f $(PROGS)
diff --git a/CodeSamples/count/count_atomic.c b/CodeSamples/count/count_atomic.c
index 0457aa1..fc73717 100644
--- a/CodeSamples/count/count_atomic.c
+++ b/CodeSamples/count/count_atomic.c
@@ -27,16 +27,16 @@ void inc_count(void)
 	atomic_inc(&counter);
 }

-long read_count(void)
+__inline__ long read_count(void)
 {
 	return atomic_read(&counter);
 }

-void count_init(void)
+__inline__ void count_init(void)
 {
 }

-void count_cleanup(void)
+__inline__ void count_cleanup(void)
 {
 }

diff --git a/CodeSamples/count/count_end.c b/CodeSamples/count/count_end.c
index a39c0c9..00335f2 100644
--- a/CodeSamples/count/count_end.c
+++ b/CodeSamples/count/count_end.c
@@ -26,9 +26,10 @@ unsigned long *counterp[NR_THREADS] = { NULL };
 unsigned long finalcount = 0;
 DEFINE_SPINLOCK(final_mutex);

-void inc_count(void)
+__inline__ void inc_count(void)
 {
-	counter++;
+	WRITE_ONCE(counter,
+		   READ_ONCE(counter) + 1);
 }

 unsigned long read_count(void)
@@ -45,7 +46,7 @@ unsigned long read_count(void)
 	return sum;
 }

-void count_init(void)
+__inline__ void count_init(void)
 {
 }

@@ -68,7 +69,7 @@ void count_unregister_thread(int nthreadsexpected)
 	spin_unlock(&final_mutex);
 }

-void count_cleanup(void)
+__inline__ void count_cleanup(void)
 {
 }

diff --git a/CodeSamples/count/count_end_rcu.c b/CodeSamples/count/count_end_rcu.c
index e6614b9..ca0a392 100644
--- a/CodeSamples/count/count_end_rcu.c
+++ b/CodeSamples/count/count_end_rcu.c
@@ -32,9 +32,10 @@ unsigned long __thread counter = 0;
 struct countarray *countarrayp = NULL;
 DEFINE_SPINLOCK(final_mutex);

-void inc_count(void)
+__inline__ void inc_count(void)
 {
-	counter++;
+	WRITE_ONCE(counter,
+		   READ_ONCE(counter) + 1);
 }

 unsigned long read_count(void)
@@ -94,7 +95,7 @@ void count_unregister_thread(int nthreadsexpected)
 	free(capold);
 }

-void count_cleanup(void)
+__inline__ void count_cleanup(void)
 {
 }

diff --git a/CodeSamples/count/count_nonatomic.c b/CodeSamples/count/count_nonatomic.c
index 868b0fe..90979c5 100644
--- a/CodeSamples/count/count_nonatomic.c
+++ b/CodeSamples/count/count_nonatomic.c
@@ -23,21 +23,21 @@

 unsigned long counter = 0;

-void inc_count(void)
+__inline__ void inc_count(void)
 {
 	counter++;
 }

-unsigned long read_count(void)
+__inline__ unsigned long read_count(void)
 {
 	return counter;
 }

-void count_init(void)
+__inline__ void count_init(void)
 {
 }

-void count_cleanup(void)
+__inline__ void count_cleanup(void)
 {
 }

diff --git a/CodeSamples/count/count_stack.c b/CodeSamples/count/count_stack.c
index aa6185d..975db48 100644
--- a/CodeSamples/count/count_stack.c
+++ b/CodeSamples/count/count_stack.c
@@ -26,9 +26,10 @@ unsigned long *counterp[NR_THREADS] = { NULL };
 unsigned long finalcount = 0;
 DEFINE_SPINLOCK(final_mutex);

-void inc_count(void)
+__inline__ void inc_count(void)
 {
-	(*counter)++;
+	WRITE_ONCE(*counter,
+		   READ_ONCE(*counter) + 1);
 }

 unsigned long read_count(void)
@@ -45,7 +46,7 @@ unsigned long read_count(void)
 	return sum;
 }

-void count_init(void)
+__inline__ void count_init(void)
 {
 }

@@ -69,7 +70,7 @@ void count_unregister_thread(int nthreadsexpected)
 	spin_unlock(&final_mutex);
 }

-void count_cleanup(void)
+__inline__ void count_cleanup(void)
 {
 }

diff --git a/CodeSamples/count/count_stat.c b/CodeSamples/count/count_stat.c
index b483022..1d72f99 100644
--- a/CodeSamples/count/count_stat.c
+++ b/CodeSamples/count/count_stat.c
@@ -27,7 +27,7 @@ void inc_count(void)
 	__get_thread_var(counter)++;
 }

-unsigned long read_count(void)
+__inline__ unsigned long read_count(void)
 {
 	int t;
 	unsigned long sum = 0;
@@ -37,11 +37,11 @@ unsigned long read_count(void)
 	return sum;
 }

-void count_init(void)
+__inline__ void count_init(void)
 {
 }

-void count_cleanup(void)
+__inline__ void count_cleanup(void)
 {
 }

diff --git a/CodeSamples/count/count_stat_atomic.c b/CodeSamples/count/count_stat_atomic.c
index 732ab6d..d1ff10b 100644
--- a/CodeSamples/count/count_stat_atomic.c
+++ b/CodeSamples/count/count_stat_atomic.c
@@ -27,7 +27,7 @@ void inc_count(void)
 	atomic_inc(&__get_thread_var(counter));
 }

-unsigned long read_count(void)
+__inline__ unsigned long read_count(void)
 {
 	int t;
 	unsigned long sum = 0;
@@ -37,11 +37,11 @@ unsigned long read_count(void)
 	return sum;
 }

-void count_init(void)
+__inline__ void count_init(void)
 {
 }

-void count_cleanup(void)
+__inline__ void count_cleanup(void)
 {
 }

diff --git a/CodeSamples/count/count_stat_eventual.c b/CodeSamples/count/count_stat_eventual.c
index 2b23dbd..324bc24 100644
--- a/CodeSamples/count/count_stat_eventual.c
+++ b/CodeSamples/count/count_stat_eventual.c
@@ -31,7 +31,7 @@ void inc_count(void)
 		   READ_ONCE(__get_thread_var(counter)) + 1);
 }

-unsigned long read_count(void)
+__inline__ unsigned long read_count(void)
 {
 	return READ_ONCE(global_count);
 }
diff --git a/CodeSamples/count/count_tstat.c b/CodeSamples/count/count_tstat.c
index 1fa4e52..59e4025 100644
--- a/CodeSamples/count/count_tstat.c
+++ b/CodeSamples/count/count_tstat.c
@@ -29,7 +29,8 @@ DEFINE_SPINLOCK(final_mutex);

 void inc_count(void)
 {
-	counter++;
+	WRITE_ONCE(counter,
+		   READ_ONCE(counter) + 1);
 }

 unsigned long read_count(void)  /* known failure with counttorture! */
diff --git a/CodeSamples/count/counttorture.h b/CodeSamples/count/counttorture.h
index ff0dd72..bdfc7d4 100644
--- a/CodeSamples/count/counttorture.h
+++ b/CodeSamples/count/counttorture.h
@@ -86,7 +86,6 @@ void *count_read_perf_test(void *arg)
 	while (READ_ONCE(goflag) == GOFLAG_RUN) {
 		for (i = COUNT_READ_RUN; i > 0; i--) {
 			j += read_count();
-			barrier();
 		}
 		n_reads_local += COUNT_READ_RUN;
 	}
@@ -110,7 +109,6 @@ void *count_update_perf_test(void *arg)
 	while (READ_ONCE(goflag) == GOFLAG_RUN) {
 		for (i = COUNT_UPDATE_RUN; i > 0; i--) {
 			inc_count();
-			barrier();
 		}
 		n_updates_local += COUNT_UPDATE_RUN;
 	}
-- 
2.7.4



* [PATCH 3/4] Updating count.tex with new counter code
From: Imre Palik @ 2018-07-23 20:07 UTC (permalink / raw)
  To: paulmck; +Cc: perfbook, Palik, Imre
In-Reply-To: <1532376448-15103-1-git-send-email-imrep.amz@gmail.com>

From: "Palik, Imre" <imrep.amz@gmail.com>

count.tex now reflects the changes made to the counter implementations to
restrain overly eager compilers.

Signed-off-by: Imre Palik <imrep.amz@gmail.com>
---
 count/count.tex | 93 +++++++++++++++++++++++++++++----------------------------
 1 file changed, 47 insertions(+), 46 deletions(-)

diff --git a/count/count.tex b/count/count.tex
index 561256a..82d4a7f 100644
--- a/count/count.tex
+++ b/count/count.tex
@@ -1003,41 +1003,42 @@ comes at the cost of the additional thread running \co{eventual()}.
   5 
   6 void inc_count(void)
   7 {
-  8   counter++;
-  9 }
- 10 
- 11 long read_count(void)
- 12 {
- 13   int t;
- 14   long sum;
- 15 
- 16   spin_lock(&final_mutex);
- 17   sum = finalcount;
- 18   for_each_thread(t)
- 19     if (counterp[t] != NULL)
- 20       sum += *counterp[t];
- 21   spin_unlock(&final_mutex);
- 22   return sum;
- 23 }
- 24 
- 25 void count_register_thread(void)
- 26 {
- 27   int idx = smp_thread_id();
- 28 
- 29   spin_lock(&final_mutex);
- 30   counterp[idx] = &counter;
- 31   spin_unlock(&final_mutex);
- 32 }
- 33 
- 34 void count_unregister_thread(int nthreadsexpected)
- 35 {
- 36   int idx = smp_thread_id();
- 37 
- 38   spin_lock(&final_mutex);
- 39   finalcount += counter;
- 40   counterp[idx] = NULL;
- 41   spin_unlock(&final_mutex);
- 42 }
+  8   WRITE_ONCE(counter,
+  9              READ_ONCE(counter) + 1);
+ 10 }
+ 11 
+ 12 long read_count(void)
+ 13 {
+ 14   int t;
+ 15   long sum;
+ 16 
+ 17   spin_lock(&final_mutex);
+ 18   sum = finalcount;
+ 19   for_each_thread(t)
+ 20     if (counterp[t] != NULL)
+ 21       sum += *counterp[t];
+ 22   spin_unlock(&final_mutex);
+ 23   return sum;
+ 24 }
+ 25 
+ 26 void count_register_thread(void)
+ 27 {
+ 28   int idx = smp_thread_id();
+ 29 
+ 30   spin_lock(&final_mutex);
+ 31   counterp[idx] = &counter;
+ 32   spin_unlock(&final_mutex);
+ 33 }
+ 34 
+ 35 void count_unregister_thread(int nthreadsexpected)
+ 36 {
+ 37   int idx = smp_thread_id();
+ 38 
+ 39   spin_lock(&final_mutex);
+ 40   finalcount += counter;
+ 41   counterp[idx] = NULL;
+ 42   spin_unlock(&final_mutex);
+ 43 }
 \end{verbbox}
 }
 \centering
@@ -1105,18 +1106,18 @@ value of the counter and exiting threads.
 } \QuickQuizEnd

 The \co{inc_count()} function used by updaters is quite simple, as can
-be seen on lines~6-9.
+be seen on lines~6-10.

 The \co{read_count()} function used by readers is a bit more complex.
-Line~16 acquires a lock to exclude exiting threads, and line~21 releases
+Line~17 acquires a lock to exclude exiting threads, and line~22 releases
 it.
-Line~17 initializes the sum to the count accumulated by those threads that
-have already exited, and lines~18-20 sum the counts being accumulated
+Line~18 initializes the sum to the count accumulated by those threads that
+have already exited, and lines~19-21 sum the counts being accumulated
 by threads currently running.
-Finally, line~22 returns the sum.
+Finally, line~23 returns the sum.

 \QuickQuiz{}
-	Doesn't the check for \co{NULL} on line~19 of
+	Doesn't the check for \co{NULL} on line~20 of
 	Listing~\ref{lst:count:Per-Thread Statistical Counters}
 	add extra branch mispredictions?
 	Why not have a variable set permanently to zero, and point
@@ -1156,7 +1157,7 @@ Finally, line~22 returns the sum.
 	\co{inc_count()} fastpath.
 } \QuickQuizEnd

-Lines~25-32 show the \co{count_register_thread()} function, which
+Lines~26-33 show the \co{count_register_thread()} function, which
 must be called by each thread before its first use of this counter.
 This function simply sets up this thread's element of the \co{counterp[]}
 array to point to its per-thread \co{counter} variable.
@@ -1177,14 +1178,14 @@ array to point to its per-thread \co{counter} variable.
 	a hundred or so CPUs, there is no need to get fancy.
 } \QuickQuizEnd

-Lines~34-42 show the \co{count_unregister_thread()} function, which
+Lines~35-43 show the \co{count_unregister_thread()} function, which
 must be called prior to exit by each thread that previously called
 \co{count_register_thread()}.
-Line~38 acquires the lock, and line~41 releases it, thus excluding any
+Line~39 acquires the lock, and line~42 releases it, thus excluding any
 calls to \co{read_count()} as well as other calls to
 \co{count_unregister_thread()}.
-Line~39 adds this thread's \co{counter} to the global \co{finalcount},
-and then line~40 \co{NULL}s out its \co{counterp[]} array entry.
+Line~40 adds this thread's \co{counter} to the global \co{finalcount},
+and then line~41 \co{NULL}s out its \co{counterp[]} array entry.
 A subsequent call to \co{read_count()} will see the exiting thread's
 count in the global \co{finalcount}, and will skip the exiting thread
 when sequencing through the \co{counterp[]} array, thus obtaining
-- 
2.7.4
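
For readers following the prose in the hunks above without the listing to
hand, here is a minimal sketch of the per-thread statistical counter that
the text describes. It is not the book's listing. The MAX_THREADS bound and
the explicit idx parameter are assumptions made for illustration, and C11
relaxed atomics stand in for whatever access primitives the real code uses.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 128	/* hypothetical bound, not taken from the patch */

/* Per-thread count; only the owning thread ever writes it. */
static _Thread_local _Atomic unsigned long counter;
/* One slot per registered thread, NULL when the slot is unused. */
static _Atomic unsigned long *counterp[MAX_THREADS];
/* Sum of the counts of threads that have already exited. */
static unsigned long finalcount;
static pthread_mutex_t final_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Fastpath: no lock and no read-modify-write instruction. */
static void inc_count(void)
{
	unsigned long c = atomic_load_explicit(&counter, memory_order_relaxed);

	atomic_store_explicit(&counter, c + 1, memory_order_relaxed);
}

/* Start from the exited threads' total, then add each live thread. */
static unsigned long read_count(void)
{
	unsigned long sum;
	int t;

	pthread_mutex_lock(&final_mutex);
	sum = finalcount;
	for (t = 0; t < MAX_THREADS; t++)
		if (counterp[t] != NULL)
			sum += atomic_load_explicit(counterp[t],
						    memory_order_relaxed);
	pthread_mutex_unlock(&final_mutex);
	return sum;
}

/* Must be called by each thread before its first inc_count(). */
static void count_register_thread(int idx)
{
	pthread_mutex_lock(&final_mutex);
	counterp[idx] = &counter;
	pthread_mutex_unlock(&final_mutex);
}

/* Must be called before exit by each thread that registered. */
static void count_unregister_thread(int idx)
{
	pthread_mutex_lock(&final_mutex);
	finalcount += atomic_load_explicit(&counter, memory_order_relaxed);
	counterp[idx] = NULL;
	pthread_mutex_unlock(&final_mutex);
}
```

Because count_unregister_thread() folds the exiting thread's count into
finalcount and clears its counterp[] slot under the same lock that
read_count() holds, a concurrent reader sees that count exactly once:
either through the per-thread pointer or through finalcount, never both.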



* [PATCH 4/4] Regenerating the atomic counter graph on a more modern CPU
From: Imre Palik @ 2018-07-23 20:07 UTC (permalink / raw)
  To: paulmck; +Cc: perfbook, Palik, Imre
In-Reply-To: <1532376448-15103-1-git-send-email-imrep.amz@gmail.com>

From: "Palik, Imre" <imrep.amz@gmail.com>

Regenerate the atomic counter graph on a Kaby Lake CPU and update the
text to match.

Signed-off-by: Imre Palik <imrep.amz@gmail.com>
---
 CodeSamples/count/atomic.eps    | 483 +++++++++++++----------
 CodeSamples/count/atomic125.eps | 856 +++++++++++++++++-----------------------
 CodeSamples/count/atomic125.png | Bin 3337 -> 3189 bytes
 count/count.tex                 |  12 +-
 defer/rcuintro.tex              |   2 +-
 locking/locking.tex             |   2 +-
 6 files changed, 658 insertions(+), 697 deletions(-)

diff --git a/CodeSamples/count/atomic.eps b/CodeSamples/count/atomic.eps
index 5e3ed2b..c2fc079 100644
--- a/CodeSamples/count/atomic.eps
+++ b/CodeSamples/count/atomic.eps
@@ -1,7 +1,7 @@
 %!PS-Adobe-2.0
 %%Title: Is Parallel Programming Hard, And, If So, What Can You Do About It?
-%%Creator: gnuplot 4.4 patchlevel 0
-%%CreationDate: Wed Jan  5 07:20:50 2011
+%%Creator: gnuplot 5.0 patchlevel 3
+%%CreationDate: Mon Jul 16 22:02:17 2018
 %%DocumentFonts: (atend)
 %%BoundingBox: 50 95 302 355
 %%Orientation: Portrait
@@ -20,42 +20,29 @@ gnudict begin
 /Dashlength 1 def
 /Landscape false def
 /Level1 false def
+/Level3 false def
 /Rounded false def
 /ClipToBoundingBox false def
+/SuppressPDFMark false def
 /TransparentPatterns false def
 /gnulinewidth 5.000 def
 /userlinewidth gnulinewidth def
 /Gamma 1.0 def
+/BackgroundColor {-1.000 -1.000 -1.000} def
 %
 /vshift -33 def
 /dl1 {
-  10.0 Dashlength mul mul
+  10.0 Dashlength userlinewidth gnulinewidth div mul mul mul
   Rounded { currentlinewidth 0.75 mul sub dup 0 le { pop 0.01 } if } if
 } def
 /dl2 {
-  10.0 Dashlength mul mul
+  10.0 Dashlength userlinewidth gnulinewidth div mul mul mul
   Rounded { currentlinewidth 0.75 mul add } if
 } def
 /hpt_ 31.5 def
 /vpt_ 31.5 def
 /hpt hpt_ def
 /vpt vpt_ def
-Level1 {} {
-/SDict 10 dict def
-systemdict /pdfmark known not {
-  userdict /pdfmark systemdict /cleartomark get put
-} if
-SDict begin [
-  /Title (Is Parallel Programming Hard, And, If So, What Can You Do About It?)
-  /Subject (gnuplot plot)
-  /Creator (gnuplot 4.4 patchlevel 0)
-  /Author (paulmck)
-%  /Producer (gnuplot)
-%  /Keywords ()
-  /CreationDate (Wed Jan  5 07:20:50 2011)
-  /DOCINFO pdfmark
-end
-} ifelse
 /doclip {
   ClipToBoundingBox {
     newpath 50 50 moveto 302 50 lineto 302 410 lineto 50 410 lineto closepath
@@ -63,7 +50,7 @@ end
   } if
 } def
 %
-% Gnuplot Prolog Version 4.4 (January 2010)
+% Gnuplot Prolog Version 5.1 (Oct 2015)
 %
 %/SuppressPDFMark true def
 %
@@ -75,15 +62,16 @@ end
 /Z {closepath} bind def
 /C {setrgbcolor} bind def
 /f {rlineto fill} bind def
+/g {setgray} bind def
 /Gshow {show} def   % May be redefined later in the file to support UTF-8
 /vpt2 vpt 2 mul def
 /hpt2 hpt 2 mul def
 /Lshow {currentpoint stroke M 0 vshift R 
-	Blacktext {gsave 0 setgray show grestore} {show} ifelse} def
+	Blacktext {gsave 0 setgray textshow grestore} {textshow} ifelse} def
 /Rshow {currentpoint stroke M dup stringwidth pop neg vshift R
-	Blacktext {gsave 0 setgray show grestore} {show} ifelse} def
+	Blacktext {gsave 0 setgray textshow grestore} {textshow} ifelse} def
 /Cshow {currentpoint stroke M dup stringwidth pop -2 div vshift R 
-	Blacktext {gsave 0 setgray show grestore} {show} ifelse} def
+	Blacktext {gsave 0 setgray textshow grestore} {textshow} ifelse} def
 /UP {dup vpt_ mul /vpt exch def hpt_ mul /hpt exch def
   /hpt2 hpt 2 mul def /vpt2 vpt 2 mul def} def
 /DL {Color {setrgbcolor Solid {pop []} if 0 setdash}
@@ -96,7 +84,8 @@ end
 	dup 1 lt {pop 1} if 10 mul /udl exch def} def
 /PL {stroke userlinewidth setlinewidth
 	Rounded {1 setlinejoin 1 setlinecap} if} def
-% Default Line colors
+3.8 setmiterlimit
+% Classic Line colors (version 5.0)
 /LCw {1 1 1} def
 /LCb {0 0 0} def
 /LCa {0 0 0} def
@@ -109,19 +98,21 @@ end
 /LC6 {0 0 0} def
 /LC7 {1 0.3 0} def
 /LC8 {0.5 0.5 0.5} def
-% Default Line Types
+% Default dash patterns (version 5.0)
+/LTB {BL [] LCb DL} def
 /LTw {PL [] 1 setgray} def
-/LTb {BL [] LCb DL} def
+/LTb {PL [] LCb DL} def
 /LTa {AL [1 udl mul 2 udl mul] 0 setdash LCa setrgbcolor} def
 /LT0 {PL [] LC0 DL} def
-/LT1 {PL [4 dl1 2 dl2] LC1 DL} def
-/LT2 {PL [2 dl1 3 dl2] LC2 DL} def
-/LT3 {PL [1 dl1 1.5 dl2] LC3 DL} def
-/LT4 {PL [6 dl1 2 dl2 1 dl1 2 dl2] LC4 DL} def
-/LT5 {PL [3 dl1 3 dl2 1 dl1 3 dl2] LC5 DL} def
-/LT6 {PL [2 dl1 2 dl2 2 dl1 6 dl2] LC6 DL} def
-/LT7 {PL [1 dl1 2 dl2 6 dl1 2 dl2 1 dl1 2 dl2] LC7 DL} def
-/LT8 {PL [2 dl1 2 dl2 2 dl1 2 dl2 2 dl1 2 dl2 2 dl1 4 dl2] LC8 DL} def
+/LT1 {PL [2 dl1 3 dl2] LC1 DL} def
+/LT2 {PL [1 dl1 1.5 dl2] LC2 DL} def
+/LT3 {PL [6 dl1 2 dl2 1 dl1 2 dl2] LC3 DL} def
+/LT4 {PL [1 dl1 2 dl2 6 dl1 2 dl2 1 dl1 2 dl2] LC4 DL} def
+/LT5 {PL [4 dl1 2 dl2] LC5 DL} def
+/LT6 {PL [1.5 dl1 1.5 dl2 1.5 dl1 1.5 dl2 1.5 dl1 6 dl2] LC6 DL} def
+/LT7 {PL [3 dl1 3 dl2 1 dl1 3 dl2] LC7 DL} def
+/LT8 {PL [2 dl1 2 dl2 2 dl1 6 dl2] LC8 DL} def
+/SL {[] 0 setdash} def
 /Pnt {stroke [] 0 setdash gsave 1 setlinecap M 0 0 V stroke grestore} def
 /Dia {stroke [] 0 setdash 2 copy vpt add M
   hpt neg vpt neg V hpt vpt neg V
@@ -328,7 +319,8 @@ end
 /PatternFill {gsave /PFa [ 9 2 roll ] def
   PFa 0 get PFa 2 get 2 div add PFa 1 get PFa 3 get 2 div add translate
   PFa 2 get -2 div PFa 3 get -2 div PFa 2 get PFa 3 get Rec
-  gsave 1 setgray fill grestore clip
+  TransparentPatterns {} {gsave 1 setgray fill grestore} ifelse
+  clip
   currentlinewidth 0.5 mul setlinewidth
   /PFs PFa 2 get dup mul PFa 3 get dup mul add sqrt def
   0 0 M PFa 5 get rotate PFs -2 div dup translate
@@ -342,9 +334,14 @@ end
 %
 /languagelevel where
  {pop languagelevel} {1} ifelse
- 2 lt
-	{/InterpretLevel1 true def}
-	{/InterpretLevel1 Level1 def}
+dup 2 lt
+	{/InterpretLevel1 true def
+	 /InterpretLevel3 false def}
+	{/InterpretLevel1 Level1 def
+	 2 gt
+	    {/InterpretLevel3 Level3 def}
+	    {/InterpretLevel3 false def}
+	 ifelse }
  ifelse
 %
 % PostScript level 2 pattern fill definitions
@@ -433,16 +430,17 @@ Level1 {Level1PatternFill} {Level2PatternFill} ifelse
 /Symbol-Oblique /Symbol findfont [1 0 .167 1 0 0] makefont
 dup length dict begin {1 index /FID eq {pop pop} {def} ifelse} forall
 currentdict end definefont pop
+%
 /MFshow {
    { dup 5 get 3 ge
      { 5 get 3 eq {gsave} {grestore} ifelse }
      {dup dup 0 get findfont exch 1 get scalefont setfont
      [ currentpoint ] exch dup 2 get 0 exch R dup 5 get 2 ne {dup dup 6
-     get exch 4 get {Gshow} {stringwidth pop 0 R} ifelse }if dup 5 get 0 eq
+     get exch 4 get {textshow} {stringwidth pop 0 R} ifelse }if dup 5 get 0 eq
      {dup 3 get {2 get neg 0 exch R pop} {pop aload pop M} ifelse} {dup 5
      get 1 eq {dup 2 get exch dup 3 get exch 6 get stringwidth pop -2 div
      dup 0 R} {dup 6 get stringwidth pop -2 div 0 R 6 get
-     show 2 index {aload pop M neg 3 -1 roll neg R pop pop} {pop pop pop
+     textshow 2 index {aload pop M neg 3 -1 roll neg R pop pop} {pop pop pop
      pop aload pop M} ifelse }ifelse }ifelse }
      ifelse }
    forall} def
@@ -1850,6 +1848,51 @@ ba6d2a8e73b10c6de72a8e8f3bfeab8344fe6622fadd5d01457e31e8facd74cb
 cleartomark
 {restore}if
 %%EndProcSet
+Level1 SuppressPDFMark or 
+{} {
+/SDict 10 dict def
+systemdict /pdfmark known not {
+  userdict /pdfmark systemdict /cleartomark get put
+} if
+SDict begin [
+  /Title (Is Parallel Programming Hard, And, If So, What Can You Do About It?)
+  /Subject (gnuplot plot)
+  /Creator (gnuplot 5.0 patchlevel 3)
+  /Author (imre)
+%  /Producer (gnuplot)
+%  /Keywords ()
+  /CreationDate (Mon Jul 16 22:02:17 2018)
+  /DOCINFO pdfmark
+end
+} ifelse
+%
+% Support for boxed text - Ethan A Merritt May 2005
+%
+/InitTextBox { userdict /TBy2 3 -1 roll put userdict /TBx2 3 -1 roll put
+           userdict /TBy1 3 -1 roll put userdict /TBx1 3 -1 roll put
+	   /Boxing true def } def
+/ExtendTextBox { Boxing
+    { gsave dup false charpath pathbbox
+      dup TBy2 gt {userdict /TBy2 3 -1 roll put} {pop} ifelse
+      dup TBx2 gt {userdict /TBx2 3 -1 roll put} {pop} ifelse
+      dup TBy1 lt {userdict /TBy1 3 -1 roll put} {pop} ifelse
+      dup TBx1 lt {userdict /TBx1 3 -1 roll put} {pop} ifelse
+      grestore } if } def
+/PopTextBox { newpath TBx1 TBxmargin sub TBy1 TBymargin sub M
+               TBx1 TBxmargin sub TBy2 TBymargin add L
+	       TBx2 TBxmargin add TBy2 TBymargin add L
+	       TBx2 TBxmargin add TBy1 TBymargin sub L closepath } def
+/DrawTextBox { PopTextBox stroke /Boxing false def} def
+/FillTextBox { gsave PopTextBox 1 1 1 setrgbcolor fill grestore /Boxing false def} def
+0 0 0 0 InitTextBox
+/TBxmargin 20 def
+/TBymargin 20 def
+/Boxing false def
+/textshow { ExtendTextBox Gshow } def
+%
+% redundant definitions for compatibility with prologue.ps older than 5.0.2
+/LTB {BL [] LCb DL} def
+/LTb {PL [] LCb DL} def
 end
 %%EndProlog
 %%Page: 1 1
@@ -1861,402 +1904,436 @@ doclip
 0 setgray
 newpath
 (NimbusSanL-Regu) findfont 100 scalefont setfont
+BackgroundColor 0 lt 3 1 roll 0 lt exch 0 lt or or not {gsave BackgroundColor C clippath fill grestore} if
 1.000 UL
 LTb
-550 990 M
+LCb setrgbcolor
+490 975 M
 63 0 V
-1756 0 R
+1786 0 R
 -63 0 V
 stroke
-490 990 M
+430 975 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 0)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-550 1192 M
+LCb setrgbcolor
+490 1181 M
 63 0 V
-1756 0 R
+1786 0 R
 -63 0 V
 stroke
-490 1192 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 100)]
+430 1181 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 20)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-550 1394 M
+LCb setrgbcolor
+490 1386 M
 63 0 V
-1756 0 R
+1786 0 R
 -63 0 V
 stroke
-490 1394 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 200)]
+430 1386 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 40)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-550 1597 M
+LCb setrgbcolor
+490 1592 M
 63 0 V
-1756 0 R
+1786 0 R
 -63 0 V
 stroke
-490 1597 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 300)]
+430 1592 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 60)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-550 1799 M
+LCb setrgbcolor
+490 1797 M
 63 0 V
-1756 0 R
+1786 0 R
 -63 0 V
 stroke
-490 1799 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 400)]
+430 1797 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 80)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-550 2001 M
+LCb setrgbcolor
+490 2003 M
 63 0 V
-1756 0 R
+1786 0 R
 -63 0 V
 stroke
-490 2001 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 500)]
+430 2003 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 100)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-550 2203 M
+LCb setrgbcolor
+490 2208 M
 63 0 V
-1756 0 R
+1786 0 R
 -63 0 V
 stroke
-490 2203 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 600)]
+430 2208 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 120)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-550 2406 M
+LCb setrgbcolor
+490 2414 M
 63 0 V
-1756 0 R
+1786 0 R
 -63 0 V
 stroke
-490 2406 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 700)]
+430 2414 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 140)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-550 2608 M
+LCb setrgbcolor
+490 2619 M
 63 0 V
-1756 0 R
+1786 0 R
 -63 0 V
 stroke
-490 2608 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 800)]
+430 2619 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 160)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-550 2810 M
+LCb setrgbcolor
+490 2825 M
 63 0 V
-1756 0 R
+1786 0 R
 -63 0 V
 stroke
-490 2810 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 900)]
+430 2825 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 180)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-550 990 M
+LCb setrgbcolor
+490 975 M
 0 63 V
-0 1757 R
+0 1787 R
 0 -63 V
 stroke
-550 890 M
+490 875 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 1)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-810 990 M
+LCb setrgbcolor
+754 975 M
 0 63 V
-0 1757 R
+0 1787 R
 0 -63 V
 stroke
-810 890 M
+754 875 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 2)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-1070 990 M
+LCb setrgbcolor
+1018 975 M
 0 63 V
-0 1757 R
+0 1787 R
 0 -63 V
 stroke
-1070 890 M
+1018 875 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 3)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-1330 990 M
+LCb setrgbcolor
+1282 975 M
 0 63 V
-0 1757 R
+0 1787 R
 0 -63 V
 stroke
-1330 890 M
+1282 875 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 4)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-1589 990 M
+LCb setrgbcolor
+1547 975 M
 0 63 V
-0 1757 R
+0 1787 R
 0 -63 V
 stroke
-1589 890 M
+1547 875 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 5)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-1849 990 M
+LCb setrgbcolor
+1811 975 M
 0 63 V
-0 1757 R
+0 1787 R
 0 -63 V
 stroke
-1849 890 M
+1811 875 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 6)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-2109 990 M
+LCb setrgbcolor
+2075 975 M
 0 63 V
-0 1757 R
+0 1787 R
 0 -63 V
 stroke
-2109 890 M
+2075 875 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 7)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-2369 990 M
+LCb setrgbcolor
+2339 975 M
 0 63 V
-0 1757 R
+0 1787 R
 0 -63 V
 stroke
-2369 890 M
+2339 875 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 8)]
 ] -33.3 MCshow
 1.000 UL
 LTb
+LCb setrgbcolor
 1.000 UL
-LTb
-550 2810 N
-550 990 L
-1819 0 V
-0 1820 V
--1819 0 V
+LTB
+LCb setrgbcolor
+490 2825 N
+490 975 L
+1849 0 V
+0 1850 V
+-1849 0 V
 Z stroke
+1.000 UP
+1.000 UL
+LTb
 LCb setrgbcolor
-140 1900 M
+LCb setrgbcolor
+80 1900 M
 currentpoint gsave translate -270 rotate 0 0 moveto
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 (Time Per Increment \(nanoseconds\))]
 ] -33.3 MCshow
 grestore
 LTb
 LCb setrgbcolor
-1459 740 M
+1414 725 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 (Number of CPUs \(Threads\))]
 ] -33.3 MCshow
 LTb
-1.000 UP
-1.000 UL
-LTb
 % Begin plot #1
 1.000 UP
 1.000 UL
-LT0
+LTb
+LCb setrgbcolor
 /NimbusSanL-Regu findfont 100 scalefont setfont
-550 1079 M
-0 2 V
--31 -2 R
-62 0 V
--62 2 R
-62 0 V
-229 118 R
+/vshift -33 def
+490 1019 M
 0 10 V
--31 -10 R
-62 0 V
--62 10 R
+0 -10 R
+31 0 V
+-31 10 R
+31 0 V
+233 335 R
+0 28 V
+-31 -28 R
 62 0 V
-229 189 R
-0 94 V
--31 -94 R
+-62 28 R
 62 0 V
--62 94 R
+233 102 R
+0 112 V
+987 1494 M
 62 0 V
-229 205 R
-0 97 V
--31 -97 R
+-62 112 R
 62 0 V
--62 97 R
+233 3 R
+0 199 V
+-31 -199 R
 62 0 V
-228 64 R
-0 80 V
--31 -80 R
+-62 199 R
 62 0 V
--62 80 R
+234 69 R
+0 133 V
+-31 -133 R
 62 0 V
-229 96 R
-0 281 V
--31 -281 R
+-62 133 R
 62 0 V
--62 281 R
+233 -1 R
+0 241 V
+-31 -241 R
 62 0 V
-229 11 R
-0 149 V
--31 -149 R
+-62 241 R
 62 0 V
--62 149 R
+233 -56 R
+0 238 V
+-31 -238 R
 62 0 V
-229 27 R
-0 144 V
--31 -144 R
+-62 238 R
 62 0 V
--62 144 R
-62 0 V
-550 1080 Pls
-810 1204 Pls
-1070 1454 Pls
-1330 1739 Pls
-1589 1904 Pls
-1849 2163 Pls
-2109 2398 Pls
-2369 2588 Pls
+233 63 R
+0 200 V
+-31 -200 R
+31 0 V
+-31 200 R
+31 0 V
+490 1023 Pls
+754 1378 Pls
+1018 1569 Pls
+1282 1696 Pls
+1547 1934 Pls
+1811 2119 Pls
+2075 2300 Pls
+2339 2625 Pls
 % End plot #1
 % Begin plot #2
 1.000 UL
+LTb
 LT1
+LCb setrgbcolor
 /NimbusSanL-Regu findfont 100 scalefont setfont
-550 1080 M
-260 124 V
-260 250 V
-260 285 V
-259 165 V
-260 259 V
-260 235 V
-260 190 V
+490 1023 M
+264 355 V
+264 191 V
+264 127 V
+265 238 V
+264 185 V
+264 181 V
+264 325 V
 % End plot #2
 % Begin plot #3
 stroke
+LTb
 LT2
+LCb setrgbcolor
 /NimbusSanL-Regu findfont 100 scalefont setfont
-550 1008 M
-18 0 V
+490 1066 M
 19 0 V
 18 0 V
-18 0 V
 19 0 V
-18 0 V
 19 0 V
 18 0 V
-18 0 V
 19 0 V
-18 0 V
-18 0 V
 19 0 V
 18 0 V
 19 0 V
-18 0 V
-18 0 V
 19 0 V
 18 0 V
-18 0 V
 19 0 V
-18 0 V
 19 0 V
 18 0 V
-18 0 V
 19 0 V
-18 0 V
-18 0 V
 19 0 V
-18 0 V
 19 0 V
 18 0 V
-18 0 V
 19 0 V
-18 0 V
-18 0 V
 19 0 V
 18 0 V
 19 0 V
-18 0 V
-18 0 V
 19 0 V
 18 0 V
-18 0 V
 19 0 V
-18 0 V
 19 0 V
 18 0 V
-18 0 V
 19 0 V
-18 0 V
-18 0 V
 19 0 V
 18 0 V
 19 0 V
-18 0 V
-18 0 V
 19 0 V
 18 0 V
-18 0 V
 19 0 V
-18 0 V
 19 0 V
 18 0 V
-18 0 V
 19 0 V
-18 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
 18 0 V
+19 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
+19 0 V
 18 0 V
+19 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
 18 0 V
+19 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
 18 0 V
+19 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
 18 0 V
+19 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
 18 0 V
+19 0 V
+19 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
 18 0 V
+19 0 V
+19 0 V
 18 0 V
 19 0 V
+19 0 V
 18 0 V
+19 0 V
 % End plot #3
 stroke
+2.000 UL
 LTb
-550 2810 N
-550 990 L
-1819 0 V
-0 1820 V
--1819 0 V
+LCb setrgbcolor
+1.000 UL
+LTB
+LCb setrgbcolor
+490 2825 N
+490 975 L
+1849 0 V
+0 1850 V
+-1849 0 V
 Z stroke
 1.000 UP
 1.000 UL
 LTb
+LCb setrgbcolor
 stroke
 grestore
 end
diff --git a/CodeSamples/count/atomic125.eps b/CodeSamples/count/atomic125.eps
index 179f3d4..1e28a27 100644
--- a/CodeSamples/count/atomic125.eps
+++ b/CodeSamples/count/atomic125.eps
@@ -1,7 +1,7 @@
 %!PS-Adobe-2.0
 %%Title: Is Parallel Programming Hard, And, If So, What Can You Do About It?
-%%Creator: gnuplot 4.4 patchlevel 0
-%%CreationDate: Wed Jan  5 07:20:51 2011
+%%Creator: gnuplot 5.0 patchlevel 3
+%%CreationDate: Mon Jul 16 22:02:17 2018
 %%DocumentFonts: (atend)
 %%BoundingBox: 50 95 302 355
 %%Orientation: Portrait
@@ -20,42 +20,29 @@ gnudict begin
 /Dashlength 1 def
 /Landscape false def
 /Level1 false def
+/Level3 false def
 /Rounded false def
 /ClipToBoundingBox false def
+/SuppressPDFMark false def
 /TransparentPatterns false def
 /gnulinewidth 5.000 def
 /userlinewidth gnulinewidth def
 /Gamma 1.0 def
+/BackgroundColor {-1.000 -1.000 -1.000} def
 %
 /vshift -33 def
 /dl1 {
-  10.0 Dashlength mul mul
+  10.0 Dashlength userlinewidth gnulinewidth div mul mul mul
   Rounded { currentlinewidth 0.75 mul sub dup 0 le { pop 0.01 } if } if
 } def
 /dl2 {
-  10.0 Dashlength mul mul
+  10.0 Dashlength userlinewidth gnulinewidth div mul mul mul
   Rounded { currentlinewidth 0.75 mul add } if
 } def
 /hpt_ 31.5 def
 /vpt_ 31.5 def
 /hpt hpt_ def
 /vpt vpt_ def
-Level1 {} {
-/SDict 10 dict def
-systemdict /pdfmark known not {
-  userdict /pdfmark systemdict /cleartomark get put
-} if
-SDict begin [
-  /Title (Is Parallel Programming Hard, And, If So, What Can You Do About It?)
-  /Subject (gnuplot plot)
-  /Creator (gnuplot 4.4 patchlevel 0)
-  /Author (paulmck)
-%  /Producer (gnuplot)
-%  /Keywords ()
-  /CreationDate (Wed Jan  5 07:20:51 2011)
-  /DOCINFO pdfmark
-end
-} ifelse
 /doclip {
   ClipToBoundingBox {
     newpath 50 50 moveto 302 50 lineto 302 410 lineto 50 410 lineto closepath
@@ -63,7 +50,7 @@ end
   } if
 } def
 %
-% Gnuplot Prolog Version 4.4 (January 2010)
+% Gnuplot Prolog Version 5.1 (Oct 2015)
 %
 %/SuppressPDFMark true def
 %
@@ -75,15 +62,16 @@ end
 /Z {closepath} bind def
 /C {setrgbcolor} bind def
 /f {rlineto fill} bind def
+/g {setgray} bind def
 /Gshow {show} def   % May be redefined later in the file to support UTF-8
 /vpt2 vpt 2 mul def
 /hpt2 hpt 2 mul def
 /Lshow {currentpoint stroke M 0 vshift R 
-	Blacktext {gsave 0 setgray show grestore} {show} ifelse} def
+	Blacktext {gsave 0 setgray textshow grestore} {textshow} ifelse} def
 /Rshow {currentpoint stroke M dup stringwidth pop neg vshift R
-	Blacktext {gsave 0 setgray show grestore} {show} ifelse} def
+	Blacktext {gsave 0 setgray textshow grestore} {textshow} ifelse} def
 /Cshow {currentpoint stroke M dup stringwidth pop -2 div vshift R 
-	Blacktext {gsave 0 setgray show grestore} {show} ifelse} def
+	Blacktext {gsave 0 setgray textshow grestore} {textshow} ifelse} def
 /UP {dup vpt_ mul /vpt exch def hpt_ mul /hpt exch def
   /hpt2 hpt 2 mul def /vpt2 vpt 2 mul def} def
 /DL {Color {setrgbcolor Solid {pop []} if 0 setdash}
@@ -96,7 +84,8 @@ end
 	dup 1 lt {pop 1} if 10 mul /udl exch def} def
 /PL {stroke userlinewidth setlinewidth
 	Rounded {1 setlinejoin 1 setlinecap} if} def
-% Default Line colors
+3.8 setmiterlimit
+% Classic Line colors (version 5.0)
 /LCw {1 1 1} def
 /LCb {0 0 0} def
 /LCa {0 0 0} def
@@ -109,19 +98,21 @@ end
 /LC6 {0 0 0} def
 /LC7 {1 0.3 0} def
 /LC8 {0.5 0.5 0.5} def
-% Default Line Types
+% Default dash patterns (version 5.0)
+/LTB {BL [] LCb DL} def
 /LTw {PL [] 1 setgray} def
-/LTb {BL [] LCb DL} def
+/LTb {PL [] LCb DL} def
 /LTa {AL [1 udl mul 2 udl mul] 0 setdash LCa setrgbcolor} def
 /LT0 {PL [] LC0 DL} def
-/LT1 {PL [4 dl1 2 dl2] LC1 DL} def
-/LT2 {PL [2 dl1 3 dl2] LC2 DL} def
-/LT3 {PL [1 dl1 1.5 dl2] LC3 DL} def
-/LT4 {PL [6 dl1 2 dl2 1 dl1 2 dl2] LC4 DL} def
-/LT5 {PL [3 dl1 3 dl2 1 dl1 3 dl2] LC5 DL} def
-/LT6 {PL [2 dl1 2 dl2 2 dl1 6 dl2] LC6 DL} def
-/LT7 {PL [1 dl1 2 dl2 6 dl1 2 dl2 1 dl1 2 dl2] LC7 DL} def
-/LT8 {PL [2 dl1 2 dl2 2 dl1 2 dl2 2 dl1 2 dl2 2 dl1 4 dl2] LC8 DL} def
+/LT1 {PL [2 dl1 3 dl2] LC1 DL} def
+/LT2 {PL [1 dl1 1.5 dl2] LC2 DL} def
+/LT3 {PL [6 dl1 2 dl2 1 dl1 2 dl2] LC3 DL} def
+/LT4 {PL [1 dl1 2 dl2 6 dl1 2 dl2 1 dl1 2 dl2] LC4 DL} def
+/LT5 {PL [4 dl1 2 dl2] LC5 DL} def
+/LT6 {PL [1.5 dl1 1.5 dl2 1.5 dl1 1.5 dl2 1.5 dl1 6 dl2] LC6 DL} def
+/LT7 {PL [3 dl1 3 dl2 1 dl1 3 dl2] LC7 DL} def
+/LT8 {PL [2 dl1 2 dl2 2 dl1 6 dl2] LC8 DL} def
+/SL {[] 0 setdash} def
 /Pnt {stroke [] 0 setdash gsave 1 setlinecap M 0 0 V stroke grestore} def
 /Dia {stroke [] 0 setdash 2 copy vpt add M
   hpt neg vpt neg V hpt vpt neg V
@@ -328,7 +319,8 @@ end
 /PatternFill {gsave /PFa [ 9 2 roll ] def
   PFa 0 get PFa 2 get 2 div add PFa 1 get PFa 3 get 2 div add translate
   PFa 2 get -2 div PFa 3 get -2 div PFa 2 get PFa 3 get Rec
-  gsave 1 setgray fill grestore clip
+  TransparentPatterns {} {gsave 1 setgray fill grestore} ifelse
+  clip
   currentlinewidth 0.5 mul setlinewidth
   /PFs PFa 2 get dup mul PFa 3 get dup mul add sqrt def
   0 0 M PFa 5 get rotate PFs -2 div dup translate
@@ -342,9 +334,14 @@ end
 %
 /languagelevel where
  {pop languagelevel} {1} ifelse
- 2 lt
-	{/InterpretLevel1 true def}
-	{/InterpretLevel1 Level1 def}
+dup 2 lt
+	{/InterpretLevel1 true def
+	 /InterpretLevel3 false def}
+	{/InterpretLevel1 Level1 def
+	 2 gt
+	    {/InterpretLevel3 Level3 def}
+	    {/InterpretLevel3 false def}
+	 ifelse }
  ifelse
 %
 % PostScript level 2 pattern fill definitions
@@ -433,16 +430,17 @@ Level1 {Level1PatternFill} {Level2PatternFill} ifelse
 /Symbol-Oblique /Symbol findfont [1 0 .167 1 0 0] makefont
 dup length dict begin {1 index /FID eq {pop pop} {def} ifelse} forall
 currentdict end definefont pop
+%
 /MFshow {
    { dup 5 get 3 ge
      { 5 get 3 eq {gsave} {grestore} ifelse }
      {dup dup 0 get findfont exch 1 get scalefont setfont
      [ currentpoint ] exch dup 2 get 0 exch R dup 5 get 2 ne {dup dup 6
-     get exch 4 get {Gshow} {stringwidth pop 0 R} ifelse }if dup 5 get 0 eq
+     get exch 4 get {textshow} {stringwidth pop 0 R} ifelse }if dup 5 get 0 eq
      {dup 3 get {2 get neg 0 exch R pop} {pop aload pop M} ifelse} {dup 5
      get 1 eq {dup 2 get exch dup 3 get exch 6 get stringwidth pop -2 div
      dup 0 R} {dup 6 get stringwidth pop -2 div 0 R 6 get
-     show 2 index {aload pop M neg 3 -1 roll neg R pop pop} {pop pop pop
+     textshow 2 index {aload pop M neg 3 -1 roll neg R pop pop} {pop pop pop
      pop aload pop M} ifelse }ifelse }ifelse }
      ifelse }
    forall} def
@@ -1850,6 +1848,51 @@ ba6d2a8e73b10c6de72a8e8f3bfeab8344fe6622fadd5d01457e31e8facd74cb
 cleartomark
 {restore}if
 %%EndProcSet
+Level1 SuppressPDFMark or 
+{} {
+/SDict 10 dict def
+systemdict /pdfmark known not {
+  userdict /pdfmark systemdict /cleartomark get put
+} if
+SDict begin [
+  /Title (Is Parallel Programming Hard, And, If So, What Can You Do About It?)
+  /Subject (gnuplot plot)
+  /Creator (gnuplot 5.0 patchlevel 3)
+  /Author (imre)
+%  /Producer (gnuplot)
+%  /Keywords ()
+  /CreationDate (Mon Jul 16 22:02:17 2018)
+  /DOCINFO pdfmark
+end
+} ifelse
+%
+% Support for boxed text - Ethan A Merritt May 2005
+%
+/InitTextBox { userdict /TBy2 3 -1 roll put userdict /TBx2 3 -1 roll put
+           userdict /TBy1 3 -1 roll put userdict /TBx1 3 -1 roll put
+	   /Boxing true def } def
+/ExtendTextBox { Boxing
+    { gsave dup false charpath pathbbox
+      dup TBy2 gt {userdict /TBy2 3 -1 roll put} {pop} ifelse
+      dup TBx2 gt {userdict /TBx2 3 -1 roll put} {pop} ifelse
+      dup TBy1 lt {userdict /TBy1 3 -1 roll put} {pop} ifelse
+      dup TBx1 lt {userdict /TBx1 3 -1 roll put} {pop} ifelse
+      grestore } if } def
+/PopTextBox { newpath TBx1 TBxmargin sub TBy1 TBymargin sub M
+               TBx1 TBxmargin sub TBy2 TBymargin add L
+	       TBx2 TBxmargin add TBy2 TBymargin add L
+	       TBx2 TBxmargin add TBy1 TBymargin sub L closepath } def
+/DrawTextBox { PopTextBox stroke /Boxing false def} def
+/FillTextBox { gsave PopTextBox 1 1 1 setrgbcolor fill grestore /Boxing false def} def
+0 0 0 0 InitTextBox
+/TBxmargin 20 def
+/TBymargin 20 def
+/Boxing false def
+/textshow { ExtendTextBox Gshow } def
+%
+% redundant definitions for compatibility with prologue.ps older than 5.0.2
+/LTB {BL [] LCb DL} def
+/LTb {PL [] LCb DL} def
 end
 %%EndProlog
 %%Page: 1 1
@@ -1861,595 +1904,436 @@ doclip
 0 setgray
 newpath
 (NimbusSanL-Regu) findfont 100 scalefont setfont
+BackgroundColor 0 lt 3 1 roll 0 lt exch 0 lt or or not {gsave BackgroundColor C clippath fill grestore} if
 1.000 UL
 LTb
-670 1050 M
+LCb setrgbcolor
+490 975 M
 63 0 V
-1636 0 R
+1786 0 R
 -63 0 V
 stroke
-610 1050 M
+430 975 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 0)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-670 1239 M
+LCb setrgbcolor
+490 1181 M
 63 0 V
-1636 0 R
+1786 0 R
 -63 0 V
 stroke
-610 1239 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 5000)]
+430 1181 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 20)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-670 1428 M
+LCb setrgbcolor
+490 1386 M
 63 0 V
-1636 0 R
+1786 0 R
 -63 0 V
 stroke
-610 1428 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 10000)]
+430 1386 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 40)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-670 1617 M
+LCb setrgbcolor
+490 1592 M
 63 0 V
-1636 0 R
+1786 0 R
 -63 0 V
 stroke
-610 1617 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 15000)]
+430 1592 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 60)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-670 1806 M
+LCb setrgbcolor
+490 1797 M
 63 0 V
-1636 0 R
+1786 0 R
 -63 0 V
 stroke
-610 1806 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 20000)]
+430 1797 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 80)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-670 1994 M
+LCb setrgbcolor
+490 2003 M
 63 0 V
-1636 0 R
+1786 0 R
 -63 0 V
 stroke
-610 1994 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 25000)]
+430 2003 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 100)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-670 2183 M
+LCb setrgbcolor
+490 2208 M
 63 0 V
-1636 0 R
+1786 0 R
 -63 0 V
 stroke
-610 2183 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 30000)]
+430 2208 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 120)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-670 2372 M
+LCb setrgbcolor
+490 2414 M
 63 0 V
-1636 0 R
+1786 0 R
 -63 0 V
 stroke
-610 2372 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 35000)]
+430 2414 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 140)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-670 2561 M
+LCb setrgbcolor
+490 2619 M
 63 0 V
-1636 0 R
+1786 0 R
 -63 0 V
 stroke
-610 2561 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 40000)]
+430 2619 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 160)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-670 2750 M
+LCb setrgbcolor
+490 2825 M
 63 0 V
-1636 0 R
+1786 0 R
 -63 0 V
 stroke
-610 2750 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 45000)]
+430 2825 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 180)]
 ] -33.3 MRshow
 1.000 UL
 LTb
-670 1050 M
+LCb setrgbcolor
+490 975 M
 0 63 V
-0 1637 R
+0 1787 R
 0 -63 V
 stroke
-670 950 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 0)]
+490 875 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 1)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-913 1050 M
+LCb setrgbcolor
+754 975 M
 0 63 V
-0 1637 R
+0 1787 R
 0 -63 V
 stroke
-913 950 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 20)]
+754 875 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 2)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-1155 1050 M
+LCb setrgbcolor
+1018 975 M
 0 63 V
-0 1637 R
+0 1787 R
 0 -63 V
 stroke
-1155 950 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 40)]
+1018 875 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 3)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-1398 1050 M
+LCb setrgbcolor
+1282 975 M
 0 63 V
-0 1637 R
+0 1787 R
 0 -63 V
 stroke
-1398 950 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 60)]
+1282 875 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 4)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-1641 1050 M
+LCb setrgbcolor
+1547 975 M
 0 63 V
-0 1637 R
+0 1787 R
 0 -63 V
 stroke
-1641 950 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 80)]
+1547 875 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 5)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-1884 1050 M
+LCb setrgbcolor
+1811 975 M
 0 63 V
-0 1637 R
+0 1787 R
 0 -63 V
 stroke
-1884 950 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 100)]
+1811 875 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 6)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-2126 1050 M
+LCb setrgbcolor
+2075 975 M
 0 63 V
-0 1637 R
+0 1787 R
 0 -63 V
 stroke
-2126 950 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 120)]
+2075 875 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 7)]
 ] -33.3 MCshow
 1.000 UL
 LTb
-2369 1050 M
+LCb setrgbcolor
+2339 975 M
 0 63 V
-0 1637 R
+0 1787 R
 0 -63 V
 stroke
-2369 950 M
-[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 140)]
+2339 875 M
+[ [(NimbusSanL-Regu) 100.0 0.0 true true 0 ( 8)]
 ] -33.3 MCshow
 1.000 UL
 LTb
+LCb setrgbcolor
 1.000 UL
-LTb
-670 2750 N
-0 -1700 V
-1699 0 V
-0 1700 V
--1699 0 V
+LTB
+LCb setrgbcolor
+490 2825 N
+490 975 L
+1849 0 V
+0 1850 V
+-1849 0 V
 Z stroke
+1.000 UP
+1.000 UL
+LTb
 LCb setrgbcolor
-140 1900 M
+LCb setrgbcolor
+80 1900 M
 currentpoint gsave translate -270 rotate 0 0 moveto
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 (Time Per Increment \(ns\))]
 ] -33.3 MCshow
 grestore
 LTb
 LCb setrgbcolor
-1519 800 M
+1414 725 M
 [ [(NimbusSanL-Regu) 100.0 0.0 true true 0 (Number of CPUs \(Threads\))]
 ] -33.3 MCshow
 LTb
-1.000 UP
-1.000 UL
-LTb
 % Begin plot #1
 1.000 UP
 1.000 UL
-LT0
+LTb
+LCb setrgbcolor
 /NimbusSanL-Regu findfont 100 scalefont setfont
-682 1051 M
-0 1 V
--31 -1 R
-62 0 V
--62 1 R
-62 0 V
-18 12 R
-0 3 V
--31 -3 R
-62 0 V
--62 3 R
-62 0 V
-17 12 R
-0 6 V
--31 -6 R
-62 0 V
--62 6 R
-62 0 V
-18 20 R
-0 6 V
--31 -6 R
-62 0 V
--62 6 R
-62 0 V
-17 27 R
-0 9 V
--31 -9 R
-62 0 V
--62 9 R
-62 0 V
-18 32 R
-0 12 V
--31 -12 R
-62 0 V
--62 12 R
-62 0 V
-17 26 R
-0 14 V
--31 -14 R
-62 0 V
--62 14 R
-62 0 V
-18 32 R
-0 30 V
--31 -30 R
-62 0 V
--62 30 R
-62 0 V
-17 29 R
-0 46 V
--31 -46 R
-62 0 V
--62 46 R
-62 0 V
-18 6 R
-0 45 V
--31 -45 R
-62 0 V
--62 45 R
-62 0 V
-18 42 R
-0 48 V
--31 -48 R
-62 0 V
--62 48 R
-62 0 V
-17 24 R
-0 39 V
--31 -39 R
-62 0 V
--62 39 R
-62 0 V
-18 22 R
-0 58 V
--31 -58 R
-62 0 V
--62 58 R
-62 0 V
-17 24 R
-0 32 V
--31 -32 R
-62 0 V
--62 32 R
-62 0 V
-18 29 R
-0 82 V
--31 -82 R
-62 0 V
--62 82 R
-62 0 V
-17 -33 R
-0 74 V
--31 -74 R
-62 0 V
--62 74 R
-62 0 V
-18 -5 R
-0 79 V
--31 -79 R
-62 0 V
--62 79 R
-62 0 V
-17 -13 R
-0 83 V
--31 -83 R
-62 0 V
-stroke 1538 1921 M
--62 83 R
-62 0 V
-18 -32 R
-0 72 V
--31 -72 R
-62 0 V
--62 72 R
-62 0 V
-17 -12 R
-0 85 V
--31 -85 R
-62 0 V
--62 85 R
-62 0 V
-18 -31 R
-0 55 V
--31 -55 R
-62 0 V
--62 55 R
-62 0 V
-18 -12 R
-0 107 V
--31 -107 R
-62 0 V
--62 107 R
-62 0 V
-17 -48 R
-0 81 V
--31 -81 R
-62 0 V
--62 81 R
-62 0 V
-18 -31 R
-0 69 V
--31 -69 R
-62 0 V
--62 69 R
-62 0 V
-17 -36 R
-0 90 V
--31 -90 R
-62 0 V
--62 90 R
-62 0 V
-18 -51 R
-0 79 V
--31 -79 R
-62 0 V
--62 79 R
-62 0 V
-17 0 R
-0 47 V
--31 -47 R
-62 0 V
--62 47 R
-62 0 V
-18 -221 R
-0 217 V
--31 -217 R
-62 0 V
--62 217 R
-62 0 V
-17 31 R
-0 83 V
--31 -83 R
-62 0 V
--62 83 R
-62 0 V
-18 -291 R
-0 304 V
--31 -304 R
-62 0 V
--62 304 R
-62 0 V
-17 -44 R
-0 104 V
--31 -104 R
-62 0 V
--62 104 R
-62 0 V
-18 -16 R
-0 73 V
--31 -73 R
-62 0 V
--62 73 R
-62 0 V
-682 1052 Pls
-731 1066 Pls
-779 1081 Pls
-828 1108 Pls
-876 1143 Pls
-925 1185 Pls
-973 1224 Pls
-1022 1276 Pls
-1070 1339 Pls
-1119 1404 Pls
-1168 1479 Pls
-1216 1548 Pls
-1265 1630 Pls
-1313 1690 Pls
-1362 1764 Pls
-1410 1823 Pls
-1459 1888 Pls
-1507 1969 Pls
-1556 2005 Pls
-1604 2071 Pls
-1653 2123 Pls
-1702 2171 Pls
-1750 2223 Pls
-1799 2276 Pls
-1847 2336 Pls
-1896 2353 Pls
-1944 2416 Pls
-1993 2352 Pls
-2041 2515 Pls
-2090 2453 Pls
-2138 2572 Pls
-2187 2650 Pls
+/vshift -33 def
+490 1019 M
+0 10 V
+0 -10 R
+31 0 V
+-31 10 R
+31 0 V
+233 335 R
+0 28 V
+-31 -28 R
+62 0 V
+-62 28 R
+62 0 V
+233 102 R
+0 112 V
+987 1494 M
+62 0 V
+-62 112 R
+62 0 V
+233 3 R
+0 199 V
+-31 -199 R
+62 0 V
+-62 199 R
+62 0 V
+234 69 R
+0 133 V
+-31 -133 R
+62 0 V
+-62 133 R
+62 0 V
+233 -1 R
+0 241 V
+-31 -241 R
+62 0 V
+-62 241 R
+62 0 V
+233 -56 R
+0 238 V
+-31 -238 R
+62 0 V
+-62 238 R
+62 0 V
+233 63 R
+0 200 V
+-31 -200 R
+31 0 V
+-31 200 R
+31 0 V
+490 1023 Pls
+754 1378 Pls
+1018 1569 Pls
+1282 1696 Pls
+1547 1934 Pls
+1811 2119 Pls
+2075 2300 Pls
+2339 2625 Pls
 % End plot #1
 % Begin plot #2
 1.000 UL
+LTb
 LT1
+LCb setrgbcolor
 /NimbusSanL-Regu findfont 100 scalefont setfont
-682 1052 M
-49 14 V
-48 15 V
-49 27 V
-48 35 V
-49 42 V
-48 39 V
-49 52 V
-48 63 V
-49 65 V
-49 75 V
-48 69 V
-49 82 V
-48 60 V
-49 74 V
-48 59 V
-49 65 V
-48 81 V
-49 36 V
-48 66 V
-49 52 V
-49 48 V
-48 52 V
-49 53 V
-48 60 V
-49 17 V
-48 63 V
-49 -64 V
-48 163 V
-49 -62 V
-48 119 V
-49 78 V
+490 1023 M
+264 355 V
+264 191 V
+264 127 V
+265 238 V
+264 185 V
+264 181 V
+264 325 V
 % End plot #2
 % Begin plot #3
 stroke
+LTb
 LT2
+LCb setrgbcolor
 /NimbusSanL-Regu findfont 100 scalefont setfont
-682 1050 M
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
-15 0 V
-15 0 V
-16 0 V
-15 0 V
-15 0 V
+490 1066 M
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
+19 0 V
+18 0 V
+19 0 V
 % End plot #3
 stroke
+2.000 UL
 LTb
-670 2750 N
-0 -1700 V
-1699 0 V
-0 1700 V
--1699 0 V
+LCb setrgbcolor
+1.000 UL
+LTB
+LCb setrgbcolor
+490 2825 N
+490 975 L
+1849 0 V
+0 1850 V
+-1849 0 V
 Z stroke
 1.000 UP
 1.000 UL
 LTb
+LCb setrgbcolor
 stroke
 grestore
 end
diff --git a/CodeSamples/count/atomic125.png b/CodeSamples/count/atomic125.png
index f2b860ae4d0615f7580a5320ee7305cfd0777739..8cb327e16335a307735e8f9423a973fa5b7362aa 100644
GIT binary patch
delta 2747
zcma)8i96KWAO3!42E*8jBBCpmtu)y}mM9fj%9?dZi=m8=NX)3vVk?ZnL>G;aEjuyO
z?S^a55;ZlZG$y+-2($Q&?)?LP=XuWayq|O4=RN21c|YfQ&O?P2`LuoDf$IJH!1F*}
zm}~9SH|aPG005#PXWYX8K=SXc31(_ZDguDmiuFkg=Sbq*R|$M+<4)mo*~E#?+|7MH
zD6ECGtmcRG#$VH}XT_#p<zlvFz$8V|<3N0R8YqPQoK0d+Osy}$@<SzPQ=8o~ce;$i
zA)TW=Ep#X0V$Rdxn1%fGa(pxeT`zJ_&p-D3VkTT|Tu7w<DUf0TtQX9v3^6F^8(dkF
zvk=4P*xc&zuI#CXU7T`a_Wu)L1w5ZzR^)75=+2Gn?Xg;W##SwCcn1FDOsSxL!=~RF
zr*A3-R-s><dS5$=VHD;l=P@6@PJjBFzG*sC!NR>=@m%lEwzDzb-43U-B8hy?CiU`S
zfJf?Z`HXjd#&Xk!=6dDiFlxlycUQ%G4pnV`JifX7PC&5|mfj6W4hSK9ZJkI`%G$UR
zYF9`7tk2bj`RokY3&zVH>pyk&SU%EPR^vtgd>yL1GL|&8Jm$}spmNJ?5_<`K4oa?O
zZF7a&jIXta$4*x*v94n>_&G<s^EhQ(5$<tdXvOMQ=H^{i3q3q~#FJs}w+4`A5Mh?W
zAKZ0GoNjEN=2dT0FX2ihr<5xvd+H?Frv!oKpK#0Ey#I}Pzv#V#sS96qspfkyrmuaU
z7j5m#WtSpGs*(%;m`)3sN_9fLt4`_oRK;1YLWVPXz9t0NJri0a!)Yi|qSafitMkT(
z*sG|+VV3itExX~KTW?`7^Zun{^u<iwTlqUi(O5ZRo}5r6v1z@St_&r(?iApbPYt@Q
zF%qh{LIf>|36SapLQK=Qxhz>qZL@kL<BeLRE+66Rfs%c*G_F5f+v9J5{h+mJSM%8S
z`UTah|DvHezafPiqP`9!CAo)^w4{y<Zvh{r_t~l|wZp#)6|WP@Fb~33p|dr1RKzJX
zhthU5?Hamh__44vbF(5Y1#~io`(lJQq@UWSLdLSRQ$acCLgnCP_t-ihldV3g{ublR
z-c8Q%43#Sg#p`&7EGv$fv>&{j)pP$P@x=;YpI8Bjr1DG;xx>#JEkwz0Q@ZXrxc7=i
z%Mb07Bd-Xq7#W>hd=xfc@s9gaNtexJ`r#oiuP#;1O-1u6nr9$UgU(JoF{^FZN_KqE
z<$gP=Fz3WiKsy3k`L8@}(qDVtEzDaL_t$XY8|loN;i>Gje<AyM|F;Xp*?Z6vHY%FT
zN=rR|zsb`+54xfT&~k60e%*pQ8l5#&hT=O+nK4*$Xx=q*bLr>19#7=4A<lj2l2F2<
zt9RIKC?kiBnY*au=sr#a<JC`X+|;r<jyzM=8_8SL!oBNEO3-*U@G|XI(F_9IZ56C%
z>zC~M;*B0>-T^p$ez$~7ko}pQ)}55|mmu;O`_84|L?ERg%SZ*iUI<7QhF3OnOfB1x
zYS>F=&HQgYm>YDSahRoXSWD`p9LCwHk=dgSkv~sz*&F_L)UUr}a21Ap?nRbW!<BsA
zf-9$@2;xgGD*YlPqP3+dxcZ}Ar=f(vU<UyD$VA?6w2BD<#nrC?za+P|Rqnjn_3JnB
z!AA$7EVs1>wF}{=;{f4WF7^?lT;C(}bw)%UMoS<>qP`>P=bRM*xuO<O0^bx)mx8Zr
zteGqXa30)~gCPrWqY2i@^;)<S0kou&kpv-RL5O!qVEtvmabJ%31~#BU@t0rJ^Y=)k
zD+GH2P)&m_W(SmnSCpgR9(gZ{KpMHP_q%*uRTa?gh({`6@t;E-2f6q?JC1TFMMcWJ
zgQFD){xH8^8TlN|T7I)8(}AmFwlPF~4WB>=nwVeNK!UE_!ft3iPhnE$*C7|$tQaWH
zbOGE(OT6A`;7XQGxe2(3o9TnbBg6wJSK6d8D6Vz}aAMe9fzzd@MUXDEiz$HQpSJ}|
z5?8RIbm>xq2^Jx+pWiD(QE|TB?u=c!oF^b$)SV}9*<g4-us3pj-W~W<<fu!tkAeS@
z>x^zE)0X_@z@&h%GUtHy_YYAZ7tLT(?1WWKe3tl?)aYD5c$sCS;_%J~wpiRMPXERd
z^Ej5nGGk2`#f}`NJW<1et%M9`Y><5?Q<!&|AhUfqy5>9IM6GC5XyMj23T^n9ut}_W
zEJWUEDq$oO5@{l3H0JJW3XwmzE{NL_tVb%3wmk-fGg|y^qLa%9)yP2_0t3yYDdEz`
zLv#8c9mB3kAJ<Vp6&!t*4@>hf(D!dPLHtn`K~x&6YF}v~J8U_QrG%^!^KWD0Yi|xE
z3kv6H4x&Croy9rrk%A$K381ZK12qpw7@Gfn6768L#VCi!S8{l#fyBjAx}-^~*}b?E
zkzbQaVz#xweN+4Vs|}J89)9BSRdGr=^+_V!FLeh5+3;-@A(6)8Ps|)YnnA{%g>BIC
zGQ!2b@N%#j!++YPKjqKyEs0CL+bJ}!wtaUR5R!T3pm>R%##A%66^LE(NEe77XINV&
z9zwl?|3Mwem3hkdvGo@eqYfa6z1KT{14k5+(gJ0N2*4?4w4-C`Ih!DXb0b%5_h?@f
z`1I07jyOF$NC1SL8S+RMP0|riJb<&y5epWP#6VjYF|^?6lnw|L2mTi0nc_-(jwB13
zLpkS2%!>!!Fk~s&OGRJ5v;+i`G)HwXUaH3W2g?k=q)C&{>&aBTOHhIb-Ypk6#n!|%
zjQ`UsYHinD_75@bx!^@@eD97N?omeSmz4>DNu`S_KQ<Z_D8p4GVHmdR=;2^jf|<|2
zI9$-SIX`aW8u|LRfi5mu5c$5m15*po&P*VIoizg`Z7>9Re&ed(N-Tam!D*p8*QNAJ
zQ$iv{kR#<2y08#9q|=QT)V@5oH)0z_#ddQh6Oi<q4&S^gWe5j<l#!O_#8$Uc_)4iA
z8f!Ptv0))txRV(+{ziQKoX~}bW3<-ucJ~j8(j#eY{{dK9rAKzga509j`g(kM+Gb;5
z@HHYqO6GW5H%mtF7c$q{d}6qHqls&AJH~x#*(&0nip-zZZ9CW~c|6&s=UR0d-i=DF
z-u_ITh^Ov&%uOk2Jgnt%;es_FQdPWyXvLydZNpTM@2iD!)9q1@y#1S~J#W10j(%;r
zM^ZetY8oHUc;X1@OuvX=W?iEh?QEfcFYvY+Nc>BA=6%7SK|@9D4Ev$~aluUlV`JrG
zQ^nlXzbCU2qpRl=P%nbYn5*MxHUprrML$n%)HXjH^32lhRfMR=o$$1KmkWAf`5~VM
zcp5eB`x6&1-oWVyQM##djp?UQvVrG>hGosQUdUP_Z8pjycw-&i)(n%p_Hln+9^Sls
zcHj!xXuK=N-@(X>-xnj*xFPqECQFHXR+E9Ccd<TCThKz?>SA{j5cF?mup6w=g)#)4
zq6}SI3%}6)As0Nz9h^pW&N^@Wz+C$D!;}%y^jBCtKI-=TT@(yC&PLEX=c8`VMoayY
zQjrRJ_4mc~-0pta$K*N%%DULl_o*o<5CD+1o6ts+fwf}MxXG>jt*z`&R#^Jp{U3(j
BSH=JU

delta 2897
zcmX9;c_37K8$M^oHttxap(a~R*1EEkQP#mta>r0w)F%vX(#o}qoKvJ_#Dox{B2<b#
zJA+Y{sVrrenCaTGk9`|6-*La+U%&VHE${Zc&l2Ij0z(bl{a#%SJlG7zQ2^k#Y}+H2
zr$W-F2k3rp_7TLs8h_d#FARS3?u08sem3QkE~XlGcivnT6R{Ku5}t#J35B3I)a%f}
z7tW5(7Hk-Y-7+Ix17<mcLx&@fE!7Pzxw~od_VT^+3yTN%9wmPYgcthqk|;Soa;!Tt
zYaWz)PqnU}vNkQL?$9_Kk|a{-S}}~-eU}|uYuB22`=Tw-_HNK4e}%pCRT*K@%<t&L
z{>B?|>5mj1%=s-WUYD|Ir3A`BL;n2BMt-h&Sp-pD@jk6a;q)x&o0yaDF!PMHzH1kp
zd)iuGk{c?j+gOSY50obKe7|J-lRh^e+=anh*=#Q7HLQ~1Mlef3V@Lk#y%uRNO`#&U
zm}etEr}j+-41Mp@)0d#VOzm0#(QGVO>alCB?TS5)CG?Z0@l}O5i27fxXlvU~OS`iS
z4^Pc!gte&}i4a!upf&^htl%9TJF3K(7YiP`8pDThaeVO-dD<R0@K9Z{fN}#{DZjvm
z>=Q0U9y*(NiUFMQ{Qk_otkVSBA+G0-%icP4c!vsyR=n5WCzt*f&y;AF3>gmB@zQto
ziPTNz%QjP+RUx6-?>1^y+lQ9S(U!oFvxy?{FMZekj@l{fa(lL_(Ci8w0VHW=tpwIP
z89$d+Hw3i2c=dL_3j2dWlzly7X-aY#rIS-*#TGWLD<9;Byz3nR=-4w4rszB8$i>xA
z%V(;>bM%8R$Q$?Cqy1}|eq{pD!LLcgo_7;?h?V-6Q$yhFvh$min6EAZ!b1UuCe_}>
z@gD73%w=@Euk0FI=^%rq^wGECUG-XxH3Rajq7+Tt)l>#CuxABD7|J`h=nbI@FyqM(
z_LZUwd=lM#O^L;#@<t$ug3Z%qpF|IX`919(3}29GzdhpoHjcVH*>Ah{eld&z7V;mz
zOa}QbFX?l}+He(~#c`}en9%h2_r|6c2Tck`q|m%X=ZoDS{}eFUzO=7<+U@g8=WwI2
zgL5wNK=fCv+nFi1d%&53b2k%yZEFars2X(3Rm5}DF*KlLFgyHb>lM`5WaH!ut^%(f
zvPN2E58uVz7!@M<5|v`SS>*%WNB|m-_ZV@#mnk_9@T|&HJ(vpSS$`ck@YRe`VfE}K
z!wb4?d=(MBC_DZ7r&9jsn6oqDh!ucW3NfcG|BTGs#*&>Mga1aaTAEZ@0IG`H)$fEI
zZ=qKSs+(S9A5N|Jw2GFhSbY2%bmIg}-$W5U{Nt&BGOSyZ=H-g$K~am5mpy{43@3py
z<k#kec$D<V#j?TPi3P+=eHW!RCcH@nJo7FXpSdpJ^oa1}Vzn9@Y=jY8;0sK|9QR9s
z1Wh6^9bFT`8*xG&l{ksSN^+MH?(6vAScC})?)Dyz7lKQsa`zY1K143di})@Gd~5&*
zO8`2jX`(pVGH}R*k=>+x&rlO44qyDptPlp>jW6vJ=`;ed?tdS!iW2Xd`UYOd&?Izo
zJ*kj!gmDs((NDLwEo!=>w3GEs`CiakI2c1Tkvw?+{38<a3kB{e)KFtShb7-(o{(?3
zJEI7Hl&*(`J54D0_G9w=Q3R_XOGlw2v=2^muy~4R$@<B2KdgQ^@lkIhT_ClUM#FA6
ze_QZ@53!qZNkFu$KAim4iKc<!?wvXrGrcW)bA8u-#AE2*)vSTwuLI~CZb)eZj*rkc
z%S;j<4d?)INTw~k3k`xaJS!8Y2Zyy<5O6HnJ<GRORKO_-8WuMzr3%5!D?=XPrcu8A
zZB$tn;RUJACWcjv`1c)yZ318<ZmS%^=K<nr0HQ>gPkdXsHy9?xP6wiS9w*wL;~iI=
z(eZnbqP2nkZyqf>Cl*h;@RH9a%DIw>>uhV=G&8Z8{V0OP!O)#P-s=T-4eP4f>J)G+
zzz@_e(i$&>q<tv>+B@~U<}lCN)+T7np@j~-EBr%{xc_<aa&;K|s|_S?MoB4kf+`X;
z;5jPfyf8~0&MiErukszNz_CvD<`87kPzU!Qn6-YXGFQrc>>$E$#V)wX2}XrTE6NT1
zRE3p6(>v4$?=8^Uu;%+uMzs5q$ALC?i0)x}L!wd@*3Z|z`?!iT8+rOR5Y4EPHnlTh
zx7wBq2fw4W9+Qzm*J_RXgr4c17tA6<nshGF;5E2{r9H=`AkAT<#gaN<lLE}%11L}G
zF(o?4go?&e83#Znj`cMM{4uRm*eqd30X)6iW%4sfbycD~t}U#JB0wLCEtRXo*r^Oh
zGp-_tB`&ZgWR`_z>hymh{~p(YMehPzJ@|C`ZxnkfW`;gcJMey(5I_MO+#8H&eSCzH
zZwkc2YoR(KD+v%Q6g=qcsoE8)ny&_8X)ZFW0?e+eM3q~Y5IwknI_ss(O9#Y`PmCTJ
znIwbQ=f@y`F2f2yrmALNqo04_(XHJ?)R8<uRIgN^@hz-5daoI(7sq0!DvzPXe!)K^
zXta+nvZv7mcaa=*`1um}gKtT~9i?gi`JjN@-yd?q@rg62fuVVC7}QxtZ^izxjvoJ_
z6G9Q{ztQ7G822N4qUs5#t4QEYqJ<Eo5Xd-b0ml4ff0A(j|NUF^cq=R@jf5h?2#mda
zTU81yMF=_63Pz4N0G?;VsEeD`=!aV@jrK%^?5)N6+XH_b&cno#P#uUGNn>cwjPzs>
z3e3uRmTwVki2b<1OXl@`Cn1KWwgyP+zRB^FXW|1g04S-a#iunBeSPj(Dh1w<MRYT?
z0Nradvq2()8@1h0BJEf=zuLmf)MM-|$ruXR$?{3#rfhOzfyOdFDqDR}l#2tg+|$H-
zXZ*ZyQTO}5Hti-+Ta4GeDQhtw#3a*(J8HL@GbXh!a?Alx?R|@>sZSFg(u5&!ioi-q
zCT|r!zJU<3x@HMD5KqVa+(4$=^iHk^b7V%LR)A+E923>!kGpO4{Y*?*_n}4JLL`v!
zp`4)mIDIn^c|C)OKlJLKsRC6vmnXrkUaD?BG!ndiwc~FSTz`O}-HvkM+u5b>r8Ova
zjn@L$Upae&!eK&VI3fiVV(tJkmx}}<40Oy6br&6K^=E<4uye~|&3f`Tr(RAmX1kfR
z2Anv;#{dBA_IMJ{;W+rH3<NE;JoR&~+;!4<!C58iSjEsYg`yL6Ya8P&R4+<I?vEsM
z*S4)Ar)~aCmS!~=T9F)U@uHXcOvDMj_oUj9f>4OB4CW6mM!dRztu1Y%?4)xlQU5;m
z)&85!skw1VWkp4s>)l4dN}ZmDB==JFl2Ezqt9ZV^eskyGLbnZRC4kWJS${?5nONWU
zt9chPii&$hpdjfKA8#9RUI5|dWm4Uev$1%ykGVEH{xPMb6?L|#S?efy%%9_)(&x7G
znldy|HujM~wQ1Q!1pZx}+_zU;GSA~Ec9s8G$Tf-^F4+ur)XUWCa?82qn7LN|`FGe(
z?T<MxAeU_F_1aX_*P!E1;0a+gI@U{Rh16H<syIEDm_4rj+GhXZ8<6Q6AHj}B-O(z3
z(zrp}n!}SzVJjRZ==0FWjNFZfo++EquRD^7_$w+2Kk%Skda1g@yas82@3hWI7w7pH
zlM8JjgE2!HE`LOq^m45QXtm#n2bE^^`s3Uf)WMU|xhAIKz6x*ZXPy8x-t$IfHW{PW
zW;t2-1O1+k1rIsloq?|g@Su{=>tJZ#TqYI(EM$eiK*Y|kdKGb*Tlq@ZTH7CaXGOdH
EKlOoE^Z)<=

diff --git a/count/count.tex b/count/count.tex
index 82d4a7f..b9cd888 100644
--- a/count/count.tex
+++ b/count/count.tex
@@ -267,8 +267,8 @@ accuracies far greater than 50\,\% are almost always necessary.
 \begin{figure}[tb]
 \centering
 \resizebox{2.5in}{!}{\includegraphics{CodeSamples/count/atomic}}
-\caption{Atomic Increment Scalability on Nehalem}
-\label{fig:count:Atomic Increment Scalability on Nehalem}
+\caption{Atomic Increment Scalability on Kaby Lake}
+\label{fig:count:Atomic Increment Scalability on Kaby Lake}
 \end{figure}

 The straightforward way to count accurately is to use atomic operations,
@@ -296,7 +296,7 @@ This poor performance should not be a surprise, given the discussion in
 Chapter~\ref{chp:Hardware and its Habits},
 nor should it be a surprise that the performance of atomic increment
 gets slower as the number of CPUs and threads increase, as shown in
-Figure~\ref{fig:count:Atomic Increment Scalability on Nehalem}.
+Figure~\ref{fig:count:Atomic Increment Scalability on Kaby Lake}.
 In this figure, the horizontal dashed line resting on the x~axis
 is the ideal performance that would be achieved
 by a perfectly scalable algorithm: with such an algorithm, a given
@@ -365,7 +365,7 @@ global variable, the cache line containing that variable must
 circulate among all the CPUs, as shown by the red arrows.
 Such circulation will take significant time, resulting in
 the poor performance seen in
-Figure~\ref{fig:count:Atomic Increment Scalability on Nehalem},
+Figure~\ref{fig:count:Atomic Increment Scalability on Kaby Lake},
 which might be thought of as shown in
 Figure~\ref{fig:count:Waiting to Count}.

@@ -3444,7 +3444,7 @@ courtesy of eventual consistency.
 	``Use the right tool for the job.''

 	As can be seen from
-	Figure~\ref{fig:count:Atomic Increment Scalability on Nehalem},
+	Figure~\ref{fig:count:Atomic Increment Scalability on Kaby Lake},
 	single-variable atomic increment need not apply for any job
 	involving heavy use of parallel updates.
 	In contrast, the algorithms shown in
@@ -3730,7 +3730,7 @@ Summarizing the summary:
 \item	Different levels of performance and scalability will affect
 	algorithm and data-structure design, as do a large number of
 	other factors.
-	Figure~\ref{fig:count:Atomic Increment Scalability on Nehalem}
+	Figure~\ref{fig:count:Atomic Increment Scalability on Kaby Lake}
 	illustrates this point:  Atomic increment might be completely
 	acceptable for a two-CPU system, but be completely inadequate for an
 	eight-CPU system.
diff --git a/defer/rcuintro.tex b/defer/rcuintro.tex
index 784c4cf..6e909c9 100644
--- a/defer/rcuintro.tex
+++ b/defer/rcuintro.tex
@@ -76,7 +76,7 @@ the figure.
 But how can we tell when the readers are finished?

 It is tempting to consider a reference-counting scheme, but
-Figure~\ref{fig:count:Atomic Increment Scalability on Nehalem}
+Figure~\ref{fig:count:Atomic Increment Scalability on Kaby Lake}
 in
 Chapter~\ref{chp:Counting}
 shows that this can also result in long delays, just as can
diff --git a/locking/locking.tex b/locking/locking.tex
index 7e93bdb..5db2358 100644
--- a/locking/locking.tex
+++ b/locking/locking.tex
@@ -1578,7 +1578,7 @@ Either way, line~22 releases the root \co{rcu_node} structure's
 	but only for relatively small numbers of CPUs.
 	To see why it is problematic in systems with many hundreds of
 	CPUs, look at
-	Figure~\ref{fig:count:Atomic Increment Scalability on Nehalem}
+	Figure~\ref{fig:count:Atomic Increment Scalability on Kaby Lake}
 	and extrapolate the delay from eight to 1,000 CPUs.
 } \QuickQuizEnd

-- 
2.7.4


^ permalink raw reply related

* [Intel-gfx] [PATCH i-g-t 1/4] lib: Don't assert all KMS drivers support edid_override
From: Chris Wilson @ 2018-07-23 20:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev

edid_override is an i915.ko debugfs feature; just skip any KMS test that
depends on being able to override the EDID.
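
As a rough illustration of the behavioural change (this is not igt code; the
enum and both function names are made up for the sketch), a missing debugfs
file used to count as a failure and is now meant to count as a skip:

```c
#include <assert.h>

/* Hypothetical outcome codes, for illustration only (not igt's API). */
enum outcome { OUTCOME_PASS, OUTCOME_SKIP, OUTCOME_FAIL };

/*
 * Old behaviour, in the spirit of igt_assert(debugfs_fd != -1):
 * a driver without edid_override makes the test fail.
 */
enum outcome outcome_with_assert(int debugfs_fd)
{
	assert(debugfs_fd >= -1); /* open-style return: -1 or a valid fd */
	return debugfs_fd == -1 ? OUTCOME_FAIL : OUTCOME_PASS;
}

/*
 * New behaviour, in the spirit of igt_require(debugfs_fd != -1):
 * the same condition only skips the test.
 */
enum outcome outcome_with_require(int debugfs_fd)
{
	assert(debugfs_fd >= -1); /* open-style return: -1 or a valid fd */
	return debugfs_fd == -1 ? OUTCOME_SKIP : OUTCOME_PASS;
}
```

With a valid descriptor both variants behave the same; they differ only when
the debugfs file cannot be opened.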

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107337
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/igt_kms.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/igt_kms.c b/lib/igt_kms.c
index 476a78623..c9e00c3bd 100644
--- a/lib/igt_kms.c
+++ b/lib/igt_kms.c
@@ -913,7 +913,7 @@ void kmstest_force_edid(int drm_fd, drmModeConnector *connector,
 	debugfs_fd = igt_debugfs_open(drm_fd, path, O_WRONLY | O_TRUNC);
 	free(path);
 
-	igt_assert(debugfs_fd != -1);
+	igt_require(debugfs_fd != -1);
 
 	if (length == 0)
 		ret = write(debugfs_fd, "reset", 5);
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related

* [Intel-gfx] [PATCH i-g-t 2/4] igt/gem_tiled_fence_blits: Remove libdrm_intel dependence
From: Chris Wilson @ 2018-07-23 20:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev
In-Reply-To: <20180723200736.29508-1-chris@chris-wilson.co.uk>

Modernise the test to use igt's ioctl library as opposed to the
antiquated libdrm_intel.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/gem_tiled_fence_blits.c | 188 ++++++++++++++++++++--------------
 1 file changed, 110 insertions(+), 78 deletions(-)

diff --git a/tests/gem_tiled_fence_blits.c b/tests/gem_tiled_fence_blits.c
index 693e96cec..5c1e1a68a 100644
--- a/tests/gem_tiled_fence_blits.c
+++ b/tests/gem_tiled_fence_blits.c
@@ -42,54 +42,38 @@
  */
 
 #include "igt.h"
-#include <stdlib.h>
-#include <stdio.h>
-#include <string.h>
-#include <fcntl.h>
-#include <inttypes.h>
-#include <errno.h>
-#include <sys/stat.h>
-#include <sys/time.h>
-
-#include <drm.h>
-
-#include "intel_bufmgr.h"
-
-static drm_intel_bufmgr *bufmgr;
-struct intel_batchbuffer *batch;
-enum {width=512, height=512};
-static const int bo_size = width * height * 4;
+#include "igt_x86.h"
+
+enum { width = 512, height = 512 };
 static uint32_t linear[width * height];
+static const int bo_size = sizeof(linear);
 
-static drm_intel_bo *
-create_bo(int fd, uint32_t start_val)
+static uint32_t create_bo(int fd, uint32_t start_val)
 {
-	drm_intel_bo *bo;
-	uint32_t tiling = I915_TILING_X;
-	int ret, i;
+	uint32_t handle;
+	uint32_t *ptr;
 
-	bo = drm_intel_bo_alloc(bufmgr, "tiled bo", bo_size, 4096);
-	ret = drm_intel_bo_set_tiling(bo, &tiling, width * 4);
-	igt_assert_eq(ret, 0);
-	igt_assert(tiling == I915_TILING_X);
+	handle = gem_create(fd, bo_size);
+	gem_set_tiling(fd, handle, I915_TILING_X, width * 4);
 
 	/* Fill the BO with dwords starting at start_val */
-	for (i = 0; i < width * height; i++)
-		linear[i] = start_val++;
-
-	gem_write(fd, bo->handle, 0, linear, sizeof(linear));
+	ptr = gem_mmap__gtt(fd, handle, bo_size, PROT_WRITE);
+	for (int i = 0; i < width * height; i++)
+		ptr[i] = start_val++;
+	munmap(ptr, bo_size);
 
-	return bo;
+	return handle;
 }
 
-static void
-check_bo(int fd, drm_intel_bo *bo, uint32_t start_val)
+static void check_bo(int fd, uint32_t handle, uint32_t start_val)
 {
-	int i;
+	uint32_t *ptr;
 
-	gem_read(fd, bo->handle, 0, linear, sizeof(linear));
+	ptr = gem_mmap__gtt(fd, handle, bo_size, PROT_READ);
+	igt_memcpy_from_wc(linear, ptr, bo_size);
+	munmap(ptr, bo_size);
 
-	for (i = 0; i < width * height; i++) {
+	for (int i = 0; i < width * height; i++) {
 		igt_assert_f(linear[i] == start_val,
 			     "Expected 0x%08x, found 0x%08x "
 			     "at offset 0x%08x\n",
@@ -98,73 +82,122 @@ check_bo(int fd, drm_intel_bo *bo, uint32_t start_val)
 	}
 }
 
+static uint32_t
+create_batch(int fd, struct drm_i915_gem_relocation_entry *reloc)
+{
+	const int gen = intel_gen(intel_get_drm_devid(fd));
+	const bool has_64b_reloc = gen >= 8;
+	uint32_t *batch;
+	uint32_t handle;
+	uint32_t pitch;
+	int i = 0;
+
+	handle = gem_create(fd, 4096);
+	batch = gem_mmap__cpu(fd, handle, 0, 4096, PROT_WRITE);
+
+	batch[i] = (XY_SRC_COPY_BLT_CMD |
+		    XY_SRC_COPY_BLT_WRITE_ALPHA |
+		    XY_SRC_COPY_BLT_WRITE_RGB);
+	if (gen >= 4) {
+		batch[i] |= (XY_SRC_COPY_BLT_SRC_TILED |
+			     XY_SRC_COPY_BLT_DST_TILED);
+		pitch = width;
+	} else {
+		pitch = 4 * width;
+	}
+	batch[i++] |= 6 + 2 * has_64b_reloc;
+
+	batch[i++] = 3 << 24 | 0xcc << 16 | pitch;
+	batch[i++] = 0; /* dst (x1, y1) */
+	batch[i++] = height << 16 | width; /* dst (x2 y2) */
+	reloc[0].offset = sizeof(*batch) * i;
+	reloc[0].read_domains = I915_GEM_DOMAIN_RENDER;
+	reloc[0].write_domain = I915_GEM_DOMAIN_RENDER;
+	batch[i++] = 0;
+	if (has_64b_reloc)
+		batch[i++] = 0;
+
+	batch[i++] = 0; /* src (x1, y1) */
+	batch[i++] = pitch;
+	reloc[1].offset = sizeof(*batch) * i;
+	reloc[1].read_domains = I915_GEM_DOMAIN_RENDER;
+	batch[i++] = 0;
+	if (has_64b_reloc)
+		batch[i++] = 0;
+
+	batch[i++] = MI_BATCH_BUFFER_END;
+	munmap(batch, 4096);
+
+	return handle;
+}
+
 static void run_test(int fd, int count)
 {
-	drm_intel_bo **bo;
-	uint32_t *bo_start_val;
+	struct drm_i915_gem_relocation_entry reloc[2];
+	struct drm_i915_gem_exec_object2 obj[3];
+	struct drm_i915_gem_execbuffer2 eb;
+	uint32_t *bo, *bo_start_val;
 	uint32_t start = 0;
-	int i;
+
+	memset(reloc, 0, sizeof(reloc));
+	memset(obj, 0, sizeof(obj));
+	obj[2].handle = create_batch(fd, reloc);
+	obj[2].relocs_ptr = to_user_pointer(reloc);
+	obj[2].relocation_count = ARRAY_SIZE(reloc);
+
+	memset(&eb, 0, sizeof(eb));
+	eb.buffers_ptr = to_user_pointer(obj);
+	eb.buffer_count = ARRAY_SIZE(obj);
+	if (intel_gen(intel_get_drm_devid(fd)) >= 6)
+		eb.flags = I915_EXEC_BLT;
 
 	count |= 1;
 	igt_info("Using %d 1MiB buffers\n", count);
 
-	bo = malloc(count * sizeof(*bo));
-	bo_start_val = malloc(count * sizeof(*bo_start_val));
-	igt_assert(bo && bo_start_val);
-
-	bufmgr = drm_intel_bufmgr_gem_init(fd, 4096);
-	drm_intel_bufmgr_gem_enable_reuse(bufmgr);
-	batch = intel_batchbuffer_alloc(bufmgr, intel_get_drm_devid(fd));
+	bo = malloc(count * (sizeof(*bo) + sizeof(*bo_start_val)));
+	igt_assert(bo);
+	bo_start_val = bo + count;
 
-	for (i = 0; i < count; i++) {
+	for (int i = 0; i < count; i++) {
 		bo[i] = create_bo(fd, start);
 		bo_start_val[i] = start;
-
-		/*
-		igt_info("Creating bo %d\n", i);
-		check_bo(bo[i], bo_start_val[i]);
-		*/
-
 		start += width * height;
 	}
 
-	for (i = 0; i < count; i++) {
-		int src = count - i - 1;
-		intel_copy_bo(batch, bo[i], bo[src], bo_size);
-		bo_start_val[i] = bo_start_val[src];
+	for (int dst = 0; dst < count; dst++) {
+		int src = count - dst - 1;
+
+		if (src == dst)
+			continue;
+
+		reloc[0].target_handle = obj[0].handle = bo[dst];
+		reloc[1].target_handle = obj[1].handle = bo[src];
+
+		gem_execbuf(fd, &eb);
+		bo_start_val[dst] = bo_start_val[src];
 	}
 
-	for (i = 0; i < count * 4; i++) {
+	for (int i = 0; i < count * 4; i++) {
 		int src = random() % count;
 		int dst = random() % count;
 
 		if (src == dst)
 			continue;
 
-		intel_copy_bo(batch, bo[dst], bo[src], bo_size);
-		bo_start_val[dst] = bo_start_val[src];
+		reloc[0].target_handle = obj[0].handle = bo[dst];
+		reloc[1].target_handle = obj[1].handle = bo[src];
 
-		/*
-		check_bo(bo[dst], bo_start_val[dst]);
-		igt_info("%d: copy bo %d to %d\n", i, src, dst);
-		*/
+		gem_execbuf(fd, &eb);
+		bo_start_val[dst] = bo_start_val[src];
 	}
 
-	for (i = 0; i < count; i++) {
-		/*
-		igt_info("check %d\n", i);
-		*/
+	for (int i = 0; i < count; i++) {
 		check_bo(fd, bo[i], bo_start_val[i]);
-
-		drm_intel_bo_unreference(bo[i]);
-		bo[i] = NULL;
+		gem_close(fd, bo[i]);
 	}
-
-	intel_batchbuffer_free(batch);
-	drm_intel_bufmgr_destroy(bufmgr);
-
-	free(bo_start_val);
 	free(bo);
+
+	gem_close(fd, obj[2].handle);
 }
 
 #define MAX_32b ((1ull << 32) - 4096)
@@ -178,9 +211,8 @@ igt_main
 		igt_require_gem(fd);
 	}
 
-	igt_subtest("basic") {
+	igt_subtest("basic")
 		run_test (fd, 2);
-	}
 
 	/* the rest of the tests are too long for simulation */
 	igt_skip_on_simulation();
-- 
2.18.0

^ permalink raw reply related

* [Intel-gfx] [PATCH i-g-t 3/4] igt/gem_exec_schedule: Trim deep runtime
From: Chris Wilson @ 2018-07-23 20:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev
In-Reply-To: <20180723200736.29508-1-chris@chris-wilson.co.uk>

Trim the runtime for emitting the deep dependency tree, while keeping it
full of umpteen thousand requests.
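
The request count chosen in the diff below can be sketched in isolation. This
is only an illustration: pick_nreq() and its parameter names are hypothetical,
and the real code takes its inputs from gem_measure_ring_inflight(), XS,
MAX_CONTEXTS and the priority range:

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the arithmetic in the patch: divide the
 * measured ring capacity by 4 * xs, multiply by the number of contexts,
 * then clamp the result to max_req (MAX_PRIO - MIN_PRIO in the test).
 */
unsigned int pick_nreq(unsigned int ring_inflight, unsigned int xs,
		       unsigned int max_contexts, unsigned int max_req)
{
	unsigned int nreq;

	assert(xs > 0); /* avoid dividing by zero */

	/* Integer division happens first, so small rings can yield 0. */
	nreq = ring_inflight / (4 * xs) * max_contexts;
	if (nreq > max_req)
		nreq = max_req;

	return nreq;
}
```

For example, with made-up inputs, pick_nreq(1024, 8, 16, 2046) gives 512,
while a larger product such as pick_nreq(4096, 8, 1024, 2046) is clamped to
2046.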

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/gem_exec_schedule.c | 83 +++++++++++++++++++++++++++++++++------
 1 file changed, 70 insertions(+), 13 deletions(-)

diff --git a/tests/gem_exec_schedule.c b/tests/gem_exec_schedule.c
index 43ea97e61..0462ce84f 100644
--- a/tests/gem_exec_schedule.c
+++ b/tests/gem_exec_schedule.c
@@ -748,21 +748,29 @@ static void preemptive_hang(int fd, unsigned ring)
 static void deep(int fd, unsigned ring)
 {
 #define XS 8
-	const unsigned int nreq = MAX_PRIO - MIN_PRIO;
-	const unsigned size = ALIGN(4*nreq, 4096);
+	const unsigned int max_req = MAX_PRIO - MIN_PRIO;
+	const unsigned size = ALIGN(4*max_req, 4096);
 	struct timespec tv = {};
 	IGT_CORK_HANDLE(cork);
+	unsigned int nreq;
 	uint32_t plug;
 	uint32_t result, dep[XS];
 	uint32_t expected = 0;
 	uint32_t *ptr;
 	uint32_t *ctx;
+	int dep_nreq;
+	int n;
 
 	ctx = malloc(sizeof(*ctx) * MAX_CONTEXTS);
-	for (int n = 0; n < MAX_CONTEXTS; n++) {
+	for (n = 0; n < MAX_CONTEXTS; n++) {
 		ctx[n] = gem_context_create(fd);
 	}
 
+	nreq = gem_measure_ring_inflight(fd, ring, 0) / (4 * XS) * MAX_CONTEXTS;
+	if (nreq > max_req)
+		nreq = max_req;
+	igt_info("Using %d requests (prio range %d)\n", nreq, max_req);
+
 	result = gem_create(fd, size);
 	for (int m = 0; m < XS; m ++)
 		dep[m] = gem_create(fd, size);
@@ -774,7 +782,7 @@ static void deep(int fd, unsigned ring)
 		const uint32_t bbe = MI_BATCH_BUFFER_END;
 
 		memset(obj, 0, sizeof(obj));
-		for (int n = 0; n < XS; n++)
+		for (n = 0; n < XS; n++)
 			obj[n].handle = dep[n];
 		obj[XS].handle = result;
 		obj[XS+1].handle = gem_create(fd, 4096);
@@ -784,7 +792,7 @@ static void deep(int fd, unsigned ring)
 		execbuf.buffers_ptr = to_user_pointer(obj);
 		execbuf.buffer_count = XS + 2;
 		execbuf.flags = ring;
-		for (int n = 0; n < MAX_CONTEXTS; n++) {
+		for (n = 0; n < MAX_CONTEXTS; n++) {
 			execbuf.rsvd1 = ctx[n];
 			gem_execbuf(fd, &execbuf);
 		}
@@ -795,15 +803,62 @@ static void deep(int fd, unsigned ring)
 	plug = igt_cork_plug(&cork, fd);
 
 	/* Create a deep dependency chain, with a few branches */
-	for (int n = 0; n < nreq && igt_seconds_elapsed(&tv) < 8; n++) {
-		uint32_t context = ctx[n % MAX_CONTEXTS];
-		gem_context_set_priority(fd, context, MAX_PRIO - nreq + n);
+	for (n = 0; n < nreq && igt_seconds_elapsed(&tv) < 2; n++) {
+		const int gen = intel_gen(intel_get_drm_devid(fd));
+		struct drm_i915_gem_exec_object2 obj[3];
+		struct drm_i915_gem_relocation_entry reloc;
+		struct drm_i915_gem_execbuffer2 eb = {
+			.buffers_ptr = to_user_pointer(obj),
+			.buffer_count = 3,
+			.flags = ring | (gen < 6 ? I915_EXEC_SECURE : 0),
+			.rsvd1 = ctx[n % MAX_CONTEXTS],
+		};
+		uint32_t batch[16];
+		int i;
+
+		memset(obj, 0, sizeof(obj));
+		obj[0].handle = plug;
+
+		memset(&reloc, 0, sizeof(reloc));
+		reloc.presumed_offset = 0;
+		reloc.offset = sizeof(uint32_t);
+		reloc.delta = sizeof(uint32_t) * n;
+		reloc.read_domains = I915_GEM_DOMAIN_RENDER;
+		reloc.write_domain = I915_GEM_DOMAIN_RENDER;
+		obj[2].handle = gem_create(fd, 4096);
+		obj[2].relocs_ptr = to_user_pointer(&reloc);
+		obj[2].relocation_count = 1;
+
+		i = 0;
+		batch[i] = MI_STORE_DWORD_IMM | (gen < 6 ? 1 << 22 : 0);
+		if (gen >= 8) {
+			batch[++i] = reloc.delta;
+			batch[++i] = 0;
+		} else if (gen >= 4) {
+			batch[++i] = 0;
+			batch[++i] = reloc.delta;
+			reloc.offset += sizeof(uint32_t);
+		} else {
+			batch[i]--;
+			batch[++i] = reloc.delta;
+		}
+		batch[++i] = eb.rsvd1;
+		batch[++i] = MI_BATCH_BUFFER_END;
+		gem_write(fd, obj[2].handle, 0, batch, sizeof(batch));
 
-		for (int m = 0; m < XS; m++)
-			store_dword(fd, context, ring, dep[m], 4*n, context, plug, I915_GEM_DOMAIN_INSTRUCTION);
+		gem_context_set_priority(fd, eb.rsvd1, MAX_PRIO - nreq + n);
+		for (int m = 0; m < XS; m++) {
+			obj[1].handle = dep[m];
+			reloc.target_handle = obj[1].handle;
+			gem_execbuf(fd, &eb);
+		}
+		gem_close(fd, obj[2].handle);
 	}
+	igt_info("First deptree: %d requests [%.3fs]\n",
+		 n * XS, 1e-9*igt_nsec_elapsed(&tv));
+	dep_nreq = n;
 
-	for (int n = 0; n < nreq && igt_seconds_elapsed(&tv) < 6; n++) {
+	for (n = 0; n < nreq && igt_seconds_elapsed(&tv) < 4; n++) {
 		uint32_t context = ctx[n % MAX_CONTEXTS];
 		gem_context_set_priority(fd, context, MAX_PRIO - nreq + n);
 
@@ -813,12 +868,14 @@ static void deep(int fd, unsigned ring)
 		}
 		expected = context;
 	}
+	igt_info("Second deptree: %d requests [%.3fs]\n",
+		 n * XS, 1e-9*igt_nsec_elapsed(&tv));
 
 	unplug_show_queue(fd, &cork, ring);
 	gem_close(fd, plug);
 	igt_require(expected); /* too slow */
 
-	for (int n = 0; n < MAX_CONTEXTS; n++)
+	for (n = 0; n < MAX_CONTEXTS; n++)
 		gem_context_destroy(fd, ctx[n]);
 
 	for (int m = 0; m < XS; m++) {
@@ -827,7 +884,7 @@ static void deep(int fd, unsigned ring)
 				I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
 		gem_close(fd, dep[m]);
 
-		for (int n = 0; n < nreq; n++)
+		for (n = 0; n < dep_nreq; n++)
 			igt_assert_eq_u32(ptr[n], ctx[n % MAX_CONTEXTS]);
 		munmap(ptr, size);
 	}
-- 
2.18.0

^ permalink raw reply related

* [Intel-gfx] [PATCH i-g-t 4/4] igt/gem_exec_capture: Capture many, many objects
From: Chris Wilson @ 2018-07-23 20:07 UTC (permalink / raw)
  To: intel-gfx; +Cc: igt-dev
In-Reply-To: <20180723200736.29508-1-chris@chris-wilson.co.uk>

Exercise O(N^2) behaviour in reading the error state, and push it to the
extreme.

Reported-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/gem_exec_capture.c | 156 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 153 insertions(+), 3 deletions(-)
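
(Illustrative note, not part of the patch.) The object count chosen by
many() in the hunk below is the smaller of a quarter of the aperture size
in MiB and a quarter of the available RAM in MB. A minimal sketch of that
sizing step, assuming the two inputs are already known;
capture_object_count() is a hypothetical name and does not exist in IGT:

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the sizing step in many(): both inputs are
 * reduced to a quarter of their size in MiB, and the smaller one wins.
 */
static uint64_t capture_object_count(uint64_t aperture_bytes,
				     uint64_t avail_ram_mb)
{
	uint64_t gtt = (aperture_bytes >> 20) / 4;
	uint64_t ram = avail_ram_mb / 4;

	return gtt < ram ? gtt : ram;
}
```

Since each captured object is 2 MiB (2 << 20), the resulting set covers
roughly half of whichever resource is smaller.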

diff --git a/tests/gem_exec_capture.c b/tests/gem_exec_capture.c
index 2dc06ce43..6cc175551 100644
--- a/tests/gem_exec_capture.c
+++ b/tests/gem_exec_capture.c
@@ -23,6 +23,7 @@
 
 #include "igt.h"
 #include "igt_device.h"
+#include "igt_rand.h"
 #include "igt_sysfs.h"
 
 #define LOCAL_OBJECT_CAPTURE (1 << 7)
@@ -57,7 +58,7 @@ static void check_error_state(int dir, struct drm_i915_gem_exec_object2 *obj)
 	igt_assert(found);
 }
 
-static void __capture(int fd, int dir, unsigned ring, uint32_t target)
+static void __capture1(int fd, int dir, unsigned ring, uint32_t target)
 {
 	const int gen = intel_gen(intel_get_drm_devid(fd));
 	struct drm_i915_gem_exec_object2 obj[4];
@@ -167,10 +168,149 @@ static void capture(int fd, int dir, unsigned ring)
 	uint32_t handle;
 
 	handle = gem_create(fd, 4096);
-	__capture(fd, dir, ring, handle);
+	__capture1(fd, dir, ring, handle);
 	gem_close(fd, handle);
 }
 
+static void __captureN(int fd, int dir, unsigned ring,
+		       unsigned int size, int count, unsigned int flags)
+#define RANDOM 0x1
+{
+	const int gen = intel_gen(intel_get_drm_devid(fd));
+	struct drm_i915_gem_exec_object2 *obj;
+	struct drm_i915_gem_relocation_entry reloc[2];
+	struct drm_i915_gem_execbuffer2 execbuf;
+	uint32_t *batch, *seqno;
+	int i;
+
+	obj = calloc(count + 2, sizeof(*obj));
+	igt_assert(obj);
+
+	obj[0].handle = gem_create(fd, 4096);
+	for (i = 0; i < count; i++) {
+		obj[i + 1].handle = gem_create(fd, size);
+		obj[i + 1].flags =
+			LOCAL_OBJECT_CAPTURE | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+		if (flags & RANDOM) {
+			uint32_t *ptr;
+
+			ptr = gem_mmap__cpu(fd, obj[i + 1].handle,
+					    0, size, PROT_WRITE);
+			for (unsigned int n = 0; n < size / sizeof(*ptr); n++)
+				ptr[n] = hars_petruska_f54_1_random_unsafe();
+			munmap(ptr, size);
+		}
+	}
+
+	obj[count + 1].handle = gem_create(fd, 4096);
+	obj[count + 1].relocs_ptr = (uintptr_t)reloc;
+	obj[count + 1].relocation_count = ARRAY_SIZE(reloc);
+
+	memset(reloc, 0, sizeof(reloc));
+	reloc[0].target_handle = obj[count + 1].handle; /* recurse */
+	reloc[0].presumed_offset = 0;
+	reloc[0].offset = 5*sizeof(uint32_t);
+	reloc[0].delta = 0;
+	reloc[0].read_domains = I915_GEM_DOMAIN_COMMAND;
+	reloc[0].write_domain = 0;
+
+	reloc[1].target_handle = obj[0].handle; /* breadcrumb */
+	reloc[1].presumed_offset = 0;
+	reloc[1].offset = sizeof(uint32_t);
+	reloc[1].delta = 0;
+	reloc[1].read_domains = I915_GEM_DOMAIN_RENDER;
+	reloc[1].write_domain = I915_GEM_DOMAIN_RENDER;
+
+	seqno = gem_mmap__wc(fd, obj[0].handle, 0, 4096, PROT_READ);
+	gem_set_domain(fd, obj[0].handle,
+			I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
+
+	batch = gem_mmap__cpu(fd, obj[count + 1].handle, 0, 4096, PROT_WRITE);
+	gem_set_domain(fd, obj[count + 1].handle,
+			I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
+
+	i = 0;
+	batch[i] = MI_STORE_DWORD_IMM | (gen < 6 ? 1 << 22 : 0);
+	if (gen >= 8) {
+		batch[++i] = 0;
+		batch[++i] = 0;
+	} else if (gen >= 4) {
+		batch[++i] = 0;
+		batch[++i] = 0;
+		reloc[1].offset += sizeof(uint32_t);
+	} else {
+		batch[i]--;
+		batch[++i] = 0;
+	}
+	batch[++i] = 0xc0ffee;
+	if (gen < 3)
+		batch[++i] = MI_NOOP;
+
+	batch[++i] = MI_BATCH_BUFFER_START; /* not crashed? try again! */
+	if (gen >= 8) {
+		batch[i] |= 1 << 8 | 1;
+		batch[++i] = 0;
+		batch[++i] = 0;
+	} else if (gen >= 6) {
+		batch[i] |= 1 << 8;
+		batch[++i] = 0;
+	} else {
+		batch[i] |= 2 << 6;
+		batch[++i] = 0;
+		if (gen < 4) {
+			batch[i] |= 1;
+			reloc[0].delta = 1;
+		}
+	}
+	munmap(batch, 4096);
+
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = (uintptr_t)obj;
+	execbuf.buffer_count = count + 2;
+	execbuf.flags = ring;
+	if (gen > 3 && gen < 6)
+		execbuf.flags |= I915_EXEC_SECURE;
+	gem_execbuf(fd, &execbuf);
+
+	/* Wait for the request to start */
+	while (*(volatile uint32_t *)seqno != 0xc0ffee)
+		igt_assert(gem_bo_busy(fd, obj[0].handle));
+	munmap(seqno, 4096);
+
+	igt_force_gpu_reset(fd);
+
+	gem_sync(fd, obj[count + 1].handle);
+	gem_close(fd, obj[count + 1].handle);
+	for (i = 0; i < count; i++)
+		gem_close(fd, obj[i + 1].handle);
+	gem_close(fd, obj[0].handle);
+}
+
+static void many(int fd, int dir, unsigned int flags)
+{
+	uint64_t ram, gtt;
+	unsigned long count;
+	char *error;
+
+	gtt = (gem_aperture_size(fd) >> 20) / 4;
+	ram = intel_get_avail_ram_mb() / 4;
+	igt_debug("Available objects in GTT:%"PRIu64", RAM:%"PRIu64"\n",
+		  gtt, ram);
+
+	count = min(gtt, ram);
+	igt_require(count > 1);
+
+	intel_require_memory(count, 2 << 20, CHECK_RAM);
+
+	__captureN(fd, dir, 0, 2 << 20, count, flags);
+
+	error = igt_sysfs_get(dir, "error");
+	igt_sysfs_set(dir, "error", "Begone!");
+
+	igt_assert(error);
+	igt_debug("%s\n", error);
+}
+
 static void userptr(int fd, int dir)
 {
 	uint32_t handle;
@@ -179,7 +319,7 @@ static void userptr(int fd, int dir)
 	igt_assert(posix_memalign(&ptr, 4096, 4096) == 0);
 	igt_require(__gem_userptr(fd, ptr, 4096, 0, 0, &handle) == 0);
 
-	__capture(fd, dir, 0, handle);
+	__capture1(fd, dir, 0, handle);
 
 	gem_close(fd, handle);
 	free(ptr);
@@ -236,6 +376,16 @@ igt_main
 		}
 	}
 
+	igt_subtest_f("many-zero") {
+		igt_require(gem_can_store_dword(fd, 0));
+		many(fd, dir, 0);
+	}
+
+	igt_subtest_f("many-random") {
+		igt_require(gem_can_store_dword(fd, 0));
+		many(fd, dir, RANDOM);
+	}
+
 	/* And check we can read from different types of objects */
 
 	igt_subtest_f("userptr") {
-- 
2.18.0

^ permalink raw reply related

* Re: [PATCH] ethdev: move sanity checks to non-debug paths
From: Andrew Rybchenko @ 2018-07-23 20:07 UTC (permalink / raw)
  To: Ananyev, Konstantin, Matan Azrad, Aaron Conole
  Cc: dev@dpdk.org, Yigit, Ferruh, Marcelo Leitner, Shahaf Shuler,
	Ori Kam, Thomas Monjalon
In-Reply-To: <2601191342CEEE43887BDE71AB977258DF51B4D9@irsmsx105.ger.corp.intel.com>

On 23.07.2018 17:19, Ananyev, Konstantin wrote:
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
>> Sent: Monday, July 23, 2018 1:14 PM
>> To: Aaron Conole <aconole@redhat.com>
>> Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>; Marcelo Leitner <mleitner@redhat.com>; Shahaf Shuler
>> <shahafs@mellanox.com>; Ori Kam <orika@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>
>> Subject: Re: [dpdk-dev] [PATCH] ethdev: move sanity checks to non-debug paths
>>
>>
>> Hi Aaron
>> From: Aaron Conole
>>> Sent: Monday, July 23, 2018 2:52 PM
>>> To: Matan Azrad <matan@mellanox.com>
>>> Cc: dev@dpdk.org; Ferruh Yigit <ferruh.yigit@intel.com>; Marcelo Leitner
>>> <mleitner@redhat.com>; Shahaf Shuler <shahafs@mellanox.com>; Ori Kam
>>> <orika@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>
>>> Subject: Re: [dpdk-dev] [PATCH] ethdev: move sanity checks to non-debug paths
>>>
>>> Matan Azrad <matan@mellanox.com> writes:
>>>
>>>> Hi Aaron
>>>>
>>>> From: Aaron Conole
>>>>> These checks would have prevented a reported crash in the field.  If
>>>>> a user builds without ETHDEV_DEBUG, it should make their application
>>>>> more stable, not less.
>>>>>
>>>>> Many of these functions immediately dereference arrays based on the
>>>>> passed in values, so the sanity checks are quite important.
>>>>>
>>>> These functions are datapath functions.
>>>> Do you really want to add 3 more checks + calculations to each burst call?
>>>> Did you check the performance impact?
>>>> I think that performance numbers must be added for the discussion of this patch.
>>>
>>> I'll dig up performance numbers - but performance doesn't mean anything if
>>> the application isn't running any longer due to crash.
>> Yes, I understand your point, but think about it: if we are going to defend against every user mistake, it will cost a lot.
>> For example, in the Tx path, adding checks for each mbuf pointer and for mbuf data validity would be very expensive.
>>
>> I think the best way is to check for the common user mistakes in DEBUG mode, to help with application debugging, and that's it.
> +1
> The problem is that the user provided invalid input parameters.
> Just adding extra checks inside data-path functions wouldn't solve it.
> Konstantin

+1, I agree with Matan and Konstantin
So, NACK
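
For readers following the argument, here is a minimal sketch of the two
policies under discussion: argument checks compiled in only for debug
builds, versus checks that always run on the data path. Everything in it
(TOY_ETHDEV_DEBUG, struct toy_dev, toy_rx_burst() and friends) is a
hypothetical stand-in, not the DPDK API:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PORTS 4

/* Hypothetical device record; not the real rte_eth_dev layout. */
struct toy_dev {
	uint16_t nb_rx_queues;
	uint16_t (*rx_burst)(uint16_t queue_id, uint16_t nb_pkts);
};

static struct toy_dev toy_devices[TOY_MAX_PORTS];

/* Stub driver callback: pretends every requested packet was received. */
static uint16_t toy_stub_rx(uint16_t queue_id, uint16_t nb_pkts)
{
	(void)queue_id;
	return nb_pkts;
}

/* The three checks: port in range, callback set, queue in range. */
static inline int toy_rx_args_valid(uint16_t port_id, uint16_t queue_id)
{
	if (port_id >= TOY_MAX_PORTS)
		return 0;
	if (toy_devices[port_id].rx_burst == NULL)
		return 0;
	return queue_id < toy_devices[port_id].nb_rx_queues;
}

/* Current policy: checks exist only when built with -DTOY_ETHDEV_DEBUG. */
static inline uint16_t toy_rx_burst(uint16_t port_id, uint16_t queue_id,
				    uint16_t nb_pkts)
{
#ifdef TOY_ETHDEV_DEBUG
	if (!toy_rx_args_valid(port_id, queue_id))
		return 0;
#endif
	return toy_devices[port_id].rx_burst(queue_id, nb_pkts);
}

/* Proposed policy: the same checks run on every burst call. */
static inline uint16_t toy_rx_burst_checked(uint16_t port_id,
					    uint16_t queue_id,
					    uint16_t nb_pkts)
{
	if (!toy_rx_args_valid(port_id, queue_id))
		return 0;
	return toy_devices[port_id].rx_burst(queue_id, nb_pkts);
}
```

With valid arguments both variants behave the same; they differ only in
the per-call cost of the checks and in what happens on invalid input,
which is the trade-off argued above.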

>>>>> The logs are left as DEBUG only.
>>>>>
>>>>> Cc: Marcelo Leitner <mleitner@redhat.com>
>>>>> Signed-off-by: Aaron Conole <aconole@redhat.com>
>>>>> ---
>>>>>   lib/librte_ethdev/rte_ethdev.h | 29 +++++++++++++----------------
>>>>>   1 file changed, 13 insertions(+), 16 deletions(-)
>>>>>
>>>>> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
>>>>> index f5f593b31..bfd6a3406 100644
>>>>> --- a/lib/librte_ethdev/rte_ethdev.h
>>>>> +++ b/lib/librte_ethdev/rte_ethdev.h
>>>>> @@ -3805,15 +3805,16 @@ rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
>>>>>   	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>>>>>   	uint16_t nb_rx;
>>>>>
>>>>> -#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>>   	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
>>>>>   	RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
>>>>>
>>>>>   	if (queue_id >= dev->data->nb_rx_queues) {
>>>>> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>>   		RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n", queue_id);
>>>>> +#endif
>>>>>   		return 0;
>>>>>   	}
>>>>> -#endif
>>>>> +
>>>>>   	nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
>>>>>   				     rx_pkts, nb_pkts);
>>>>>
>>>>> @@ -3928,14 +3929,12 @@ rte_eth_rx_descriptor_status(uint16_t port_id, uint16_t queue_id,
>>>>>   	struct rte_eth_dev *dev;
>>>>>   	void *rxq;
>>>>>
>>>>> -#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>>   	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>>>>> -#endif
>>>>>   	dev = &rte_eth_devices[port_id];
>>>>> -#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>> +
>>>>>   	if (queue_id >= dev->data->nb_rx_queues)
>>>>>   		return -ENODEV;
>>>>> -#endif
>>>>> +
>>>>>   	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_descriptor_status, -ENOTSUP);
>>>>>   	rxq = dev->data->rx_queues[queue_id];
>>>>>
>>>>> @@ -3985,14 +3984,12 @@ static inline int rte_eth_tx_descriptor_status(uint16_t port_id,
>>>>>   	struct rte_eth_dev *dev;
>>>>>   	void *txq;
>>>>>
>>>>> -#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>>   	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>>>>> -#endif
>>>>>   	dev = &rte_eth_devices[port_id];
>>>>> -#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>> +
>>>>>   	if (queue_id >= dev->data->nb_tx_queues)
>>>>>   		return -ENODEV;
>>>>> -#endif
>>>>> +
>>>>>   	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_descriptor_status, -ENOTSUP);
>>>>>   	txq = dev->data->tx_queues[queue_id];
>>>>>
>>>>> @@ -4071,15 +4068,15 @@ rte_eth_tx_burst(uint16_t port_id, uint16_t queue_id,
>>>>>   {
>>>>>   	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>>>>>
>>>>> -#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>>   	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
>>>>>   	RTE_FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, 0);
>>>>>
>>>>>   	if (queue_id >= dev->data->nb_tx_queues) {
>>>>> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>>   		RTE_ETHDEV_LOG(ERR, "Invalid TX queue_id=%u\n", queue_id);
>>>>> +#endif
>>>>>   		return 0;
>>>>>   	}
>>>>> -#endif
>>>>>
>>>>>   #ifdef RTE_ETHDEV_RXTX_CALLBACKS
>>>>>   	struct rte_eth_rxtx_callback *cb = dev->pre_tx_burst_cbs[queue_id];
>>>>> @@ -4160,23 +4157,23 @@ rte_eth_tx_prepare(uint16_t port_id, uint16_t queue_id,
>>>>>   {
>>>>>   	struct rte_eth_dev *dev;
>>>>>
>>>>> -#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>>   	if (!rte_eth_dev_is_valid_port(port_id)) {
>>>>> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>>   		RTE_ETHDEV_LOG(ERR, "Invalid TX port_id=%u\n", port_id);
>>>>> +#endif
>>>>>   		rte_errno = -EINVAL;
>>>>>   		return 0;
>>>>>   	}
>>>>> -#endif
>>>>>
>>>>>   	dev = &rte_eth_devices[port_id];
>>>>>
>>>>> -#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>>   	if (queue_id >= dev->data->nb_tx_queues) {
>>>>> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
>>>>>   		RTE_ETHDEV_LOG(ERR, "Invalid TX queue_id=%u\n", queue_id);
>>>>> +#endif
>>>>>   		rte_errno = -EINVAL;
>>>>>   		return 0;
>>>>>   	}
>>>>> -#endif
>>>>>
>>>>>   	if (!dev->tx_pkt_prepare)
>>>>>   		return nb_pkts;
>>>>> --
>>>>> 2.14.3

^ permalink raw reply

* Re: [PATCH v9 1/2] regulator: dt-bindings: add QCOM RPMh regulator bindings
From: Doug Anderson @ 2018-07-23 20:09 UTC (permalink / raw)
  To: Mark Brown, David Collins
  Cc: Liam Girdwood, Rob Herring, Mark Rutland, linux-arm-msm,
	Linux ARM, devicetree, LKML, Rajendra Nayak, Stephen Boyd,
	Matthias Kaehlcke
In-Reply-To: <56b9d5c38cfe46da1228c54f001a49773c3e31dc.1531531808.git.collinsd@codeaurora.org>

Hi Mark,

On Fri, Jul 13, 2018 at 6:50 PM, David Collins <collinsd@codeaurora.org> wrote:
> Introduce bindings for RPMh regulator devices found on some
> Qualcomm Technologies, Inc. SoCs.  These devices allow a given
> processor within the SoC to make PMIC regulator requests which
> are aggregated within the RPMh hardware block along with requests
> from other processors in the SoC to determine the final PMIC
> regulator hardware state.
>
> Signed-off-by: David Collins <collinsd@codeaurora.org>
> Reviewed-by: Rob Herring <robh@kernel.org>
> Reviewed-by: Douglas Anderson <dianders@chromium.org>
> ---
>  .../bindings/regulator/qcom,rpmh-regulator.txt     | 160 +++++++++++++++++++++
>  .../dt-bindings/regulator/qcom,rpmh-regulator.h    |  36 +++++
>  2 files changed, 196 insertions(+)

I know you are still looking for time to review the RPMh-regulator
driver and that's fine.  One idea I had though: if the bindings look
OK to you and are less controversial, is there any chance they could
land in the meantime?

Specifically it would be very handy to be able to post up device tree
files that refer to regulators and even get those landed, but they
can't land without the bindings.

If that's not possible then no worries, but I figured I'd check.


-Doug

^ permalink raw reply

* [PATCH] drm/amdgpu: implement harvesting support for UVD 7.2
From: Alex Deucher @ 2018-07-23 20:09 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Alex Deucher

Properly handle cases where one or more instances of the IP
block may be harvested.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c       | 10 ++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c | 13 +++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c       | 11 +++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h       |  5 +++
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c         | 56 +++++++++++++++++++++++++--
 5 files changed, 86 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 258b6f73cbdf..f4d379cd4e47 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -348,8 +348,11 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 			break;
 		case AMDGPU_HW_IP_UVD:
 			type = AMD_IP_BLOCK_TYPE_UVD;
-			for (i = 0; i < adev->uvd.num_uvd_inst; i++)
+			for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
+				if (adev->uvd.harvest_config & (1 << i))
+					continue;
 				ring_mask |= ((adev->uvd.inst[i].ring.ready ? 1 : 0) << i);
+			}
 			ib_start_alignment = 64;
 			ib_size_alignment = 64;
 			break;
@@ -362,11 +365,14 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 			break;
 		case AMDGPU_HW_IP_UVD_ENC:
 			type = AMD_IP_BLOCK_TYPE_UVD;
-			for (i = 0; i < adev->uvd.num_uvd_inst; i++)
+			for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
+				if (adev->uvd.harvest_config & (1 << i))
+					continue;
 				for (j = 0; j < adev->uvd.num_enc_rings; j++)
 					ring_mask |=
 					((adev->uvd.inst[i].ring_enc[j].ready ? 1 : 0) <<
 					(j + i * adev->uvd.num_enc_rings));
+			}
 			ib_start_alignment = 64;
 			ib_size_alignment = 64;
 			break;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
index ea9850c9224d..bb88411d7c35 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_queue_mgr.c
@@ -219,7 +219,7 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
 			 u32 hw_ip, u32 instance, u32 ring,
 			 struct amdgpu_ring **out_ring)
 {
-	int r, ip_num_rings;
+	int i, r, ip_num_rings = 0;
 	struct amdgpu_queue_mapper *mapper = &mgr->mapper[hw_ip];
 
 	if (!adev || !mgr || !out_ring)
@@ -248,14 +248,21 @@ int amdgpu_queue_mgr_map(struct amdgpu_device *adev,
 		ip_num_rings = adev->sdma.num_instances;
 		break;
 	case AMDGPU_HW_IP_UVD:
-		ip_num_rings = adev->uvd.num_uvd_inst;
+		for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
+			if (!(adev->uvd.harvest_config & (1 << i)))
+				ip_num_rings++;
+		}
 		break;
 	case AMDGPU_HW_IP_VCE:
 		ip_num_rings = adev->vce.num_rings;
 		break;
 	case AMDGPU_HW_IP_UVD_ENC:
+		for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
+			if (!(adev->uvd.harvest_config & (1 << i)))
+				ip_num_rings++;
+		}
 		ip_num_rings =
-			adev->uvd.num_enc_rings * adev->uvd.num_uvd_inst;
+			adev->uvd.num_enc_rings * ip_num_rings;
 		break;
 	case AMDGPU_HW_IP_VCN_DEC:
 		ip_num_rings = 1;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index 80b5c453f8c1..a07548c99ab8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -255,7 +255,8 @@ int amdgpu_uvd_sw_init(struct amdgpu_device *adev)
 		bo_size += AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8);
 
 	for (j = 0; j < adev->uvd.num_uvd_inst; j++) {
-
+		if (adev->uvd.harvest_config & (1 << j))
+			continue;
 		r = amdgpu_bo_create_kernel(adev, bo_size, PAGE_SIZE,
 					    AMDGPU_GEM_DOMAIN_VRAM, &adev->uvd.inst[j].vcpu_bo,
 					    &adev->uvd.inst[j].gpu_addr, &adev->uvd.inst[j].cpu_addr);
@@ -309,6 +310,8 @@ int amdgpu_uvd_sw_fini(struct amdgpu_device *adev)
 				 &adev->uvd.entity);
 
 	for (j = 0; j < adev->uvd.num_uvd_inst; ++j) {
+		if (adev->uvd.harvest_config & (1 << j))
+			continue;
 		kfree(adev->uvd.inst[j].saved_bo);
 
 		amdgpu_bo_free_kernel(&adev->uvd.inst[j].vcpu_bo,
@@ -344,6 +347,8 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
 	}
 
 	for (j = 0; j < adev->uvd.num_uvd_inst; ++j) {
+		if (adev->uvd.harvest_config & (1 << j))
+			continue;
 		if (adev->uvd.inst[j].vcpu_bo == NULL)
 			continue;
 
@@ -366,6 +371,8 @@ int amdgpu_uvd_resume(struct amdgpu_device *adev)
 	int i;
 
 	for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
+		if (adev->uvd.harvest_config & (1 << i))
+			continue;
 		if (adev->uvd.inst[i].vcpu_bo == NULL)
 			return -EINVAL;
 
@@ -1160,6 +1167,8 @@ static void amdgpu_uvd_idle_work_handler(struct work_struct *work)
 	unsigned fences = 0, i, j;
 
 	for (i = 0; i < adev->uvd.num_uvd_inst; ++i) {
+		if (adev->uvd.harvest_config & (1 << i))
+			continue;
 		fences += amdgpu_fence_count_emitted(&adev->uvd.inst[i].ring);
 		for (j = 0; j < adev->uvd.num_enc_rings; ++j) {
 			fences += amdgpu_fence_count_emitted(&adev->uvd.inst[i].ring_enc[j]);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
index 66872286ab12..9cf42454ba81 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h
@@ -46,8 +46,12 @@ struct amdgpu_uvd_inst {
 	struct amdgpu_ring	ring_enc[AMDGPU_MAX_UVD_ENC_RINGS];
 	struct amdgpu_irq_src	irq;
 	uint32_t                srbm_soft_reset;
+	uint32_t                instance;
 };
 
+#define AMDGPU_UVD_HARVEST_UVD0 (1 << 0)
+#define AMDGPU_UVD_HARVEST_UVD1 (1 << 1)
+
 struct amdgpu_uvd {
 	const struct firmware	*fw;	/* UVD firmware */
 	unsigned		fw_version;
@@ -61,6 +65,7 @@ struct amdgpu_uvd {
 	atomic_t		handles[AMDGPU_MAX_UVD_HANDLES];
 	struct drm_sched_entity entity;
 	struct delayed_work	idle_work;
+	unsigned		harvest_config;
 };
 
 int amdgpu_uvd_sw_init(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index db5f3d78ab12..8179317be750 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -41,6 +41,12 @@
 #include "mmhub/mmhub_1_0_sh_mask.h"
 #include "ivsrcid/uvd/irqsrcs_uvd_7_0.h"
 
+#define mmUVD_PG0_CC_UVD_HARVESTING                                                                    0x00c7
+#define mmUVD_PG0_CC_UVD_HARVESTING_BASE_IDX                                                           1
+//UVD_PG0_CC_UVD_HARVESTING
+#define UVD_PG0_CC_UVD_HARVESTING__UVD_DISABLE__SHIFT                                                         0x1
+#define UVD_PG0_CC_UVD_HARVESTING__UVD_DISABLE_MASK                                                           0x00000002L
+
 #define UVD7_MAX_HW_INSTANCES_VEGA20			2
 
 static void uvd_v7_0_set_ring_funcs(struct amdgpu_device *adev);
@@ -370,10 +376,25 @@ static int uvd_v7_0_enc_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 static int uvd_v7_0_early_init(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	if (adev->asic_type == CHIP_VEGA20)
+
+	if (adev->asic_type == CHIP_VEGA20) {
+		u32 harvest;
+		int i;
+
 		adev->uvd.num_uvd_inst = UVD7_MAX_HW_INSTANCES_VEGA20;
-	else
+		for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
+			harvest = RREG32_SOC15(UVD, i, mmUVD_PG0_CC_UVD_HARVESTING);
+			if (harvest & UVD_PG0_CC_UVD_HARVESTING__UVD_DISABLE_MASK) {
+				adev->uvd.harvest_config |= 1 << i;
+			}
+		}
+		if (adev->uvd.harvest_config == (AMDGPU_UVD_HARVEST_UVD0 |
+						 AMDGPU_UVD_HARVEST_UVD1))
+			/* both instances are harvested, disable the block */
+			return -ENOENT;
+	} else {
 		adev->uvd.num_uvd_inst = 1;
+	}
 
 	if (amdgpu_sriov_vf(adev))
 		adev->uvd.num_enc_rings = 1;
@@ -393,6 +414,8 @@ static int uvd_v7_0_sw_init(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
 	for (j = 0; j < adev->uvd.num_uvd_inst; j++) {
+		if (adev->uvd.harvest_config & (1 << j))
+			continue;
 		/* UVD TRAP */
 		r = amdgpu_irq_add_id(adev, amdgpu_ih_clientid_uvds[j], UVD_7_0__SRCID__UVD_SYSTEM_MESSAGE_INTERRUPT, &adev->uvd.inst[j].irq);
 		if (r)
@@ -425,6 +448,8 @@ static int uvd_v7_0_sw_init(void *handle)
 		return r;
 
 	for (j = 0; j < adev->uvd.num_uvd_inst; j++) {
+		if (adev->uvd.harvest_config & (1 << j))
+			continue;
 		if (!amdgpu_sriov_vf(adev)) {
 			ring = &adev->uvd.inst[j].ring;
 			sprintf(ring->name, "uvd<%d>", j);
@@ -472,6 +497,8 @@ static int uvd_v7_0_sw_fini(void *handle)
 		return r;
 
 	for (j = 0; j < adev->uvd.num_uvd_inst; ++j) {
+		if (adev->uvd.harvest_config & (1 << j))
+			continue;
 		for (i = 0; i < adev->uvd.num_enc_rings; ++i)
 			amdgpu_ring_fini(&adev->uvd.inst[j].ring_enc[i]);
 	}
@@ -500,6 +527,8 @@ static int uvd_v7_0_hw_init(void *handle)
 		goto done;
 
 	for (j = 0; j < adev->uvd.num_uvd_inst; ++j) {
+		if (adev->uvd.harvest_config & (1 << j))
+			continue;
 		ring = &adev->uvd.inst[j].ring;
 
 		if (!amdgpu_sriov_vf(adev)) {
@@ -579,8 +608,11 @@ static int uvd_v7_0_hw_fini(void *handle)
 		DRM_DEBUG("For SRIOV client, shouldn't do anything.\n");
 	}
 
-	for (i = 0; i < adev->uvd.num_uvd_inst; ++i)
+	for (i = 0; i < adev->uvd.num_uvd_inst; ++i) {
+		if (adev->uvd.harvest_config & (1 << i))
+			continue;
 		adev->uvd.inst[i].ring.ready = false;
+	}
 
 	return 0;
 }
@@ -623,6 +655,8 @@ static void uvd_v7_0_mc_resume(struct amdgpu_device *adev)
 	int i;
 
 	for (i = 0; i < adev->uvd.num_uvd_inst; ++i) {
+		if (adev->uvd.harvest_config & (1 << i))
+			continue;
 		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
 			WREG32_SOC15(UVD, i, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW,
 				lower_32_bits(adev->firmware.ucode[AMDGPU_UCODE_ID_UVD].mc_addr));
@@ -695,6 +729,8 @@ static int uvd_v7_0_mmsch_start(struct amdgpu_device *adev,
 	WREG32_SOC15(VCE, 0, mmVCE_MMSCH_VF_MAILBOX_RESP, 0);
 
 	for (i = 0; i < adev->uvd.num_uvd_inst; ++i) {
+		if (adev->uvd.harvest_config & (1 << i))
+			continue;
 		WDOORBELL32(adev->uvd.inst[i].ring_enc[0].doorbell_index, 0);
 		adev->wb.wb[adev->uvd.inst[i].ring_enc[0].wptr_offs] = 0;
 		adev->uvd.inst[i].ring_enc[0].wptr = 0;
@@ -751,6 +787,8 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device *adev)
 		init_table += header->uvd_table_offset;
 
 		for (i = 0; i < adev->uvd.num_uvd_inst; ++i) {
+			if (adev->uvd.harvest_config & (1 << i))
+				continue;
 			ring = &adev->uvd.inst[i].ring;
 			ring->wptr = 0;
 			size = AMDGPU_GPU_PAGE_ALIGN(adev->uvd.fw->size + 4);
@@ -890,6 +928,8 @@ static int uvd_v7_0_start(struct amdgpu_device *adev)
 	int i, j, k, r;
 
 	for (k = 0; k < adev->uvd.num_uvd_inst; ++k) {
+		if (adev->uvd.harvest_config & (1 << k))
+			continue;
 		/* disable DPG */
 		WREG32_P(SOC15_REG_OFFSET(UVD, k, mmUVD_POWER_STATUS), 0,
 				~UVD_POWER_STATUS__UVD_PG_MODE_MASK);
@@ -902,6 +942,8 @@ static int uvd_v7_0_start(struct amdgpu_device *adev)
 	uvd_v7_0_mc_resume(adev);
 
 	for (k = 0; k < adev->uvd.num_uvd_inst; ++k) {
+		if (adev->uvd.harvest_config & (1 << k))
+			continue;
 		ring = &adev->uvd.inst[k].ring;
 		/* disable clock gating */
 		WREG32_P(SOC15_REG_OFFSET(UVD, k, mmUVD_CGC_CTRL), 0,
@@ -1069,6 +1111,8 @@ static void uvd_v7_0_stop(struct amdgpu_device *adev)
 	uint8_t i = 0;
 
 	for (i = 0; i < adev->uvd.num_uvd_inst; ++i) {
+		if (adev->uvd.harvest_config & (1 << i))
+			continue;
 		/* force RBC into idle state */
 		WREG32_SOC15(UVD, i, mmUVD_RBC_RB_CNTL, 0x11010101);
 
@@ -1756,6 +1800,8 @@ static void uvd_v7_0_set_ring_funcs(struct amdgpu_device *adev)
 	int i;
 
 	for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
+		if (adev->uvd.harvest_config & (1 << i))
+			continue;
 		adev->uvd.inst[i].ring.funcs = &uvd_v7_0_ring_vm_funcs;
 		adev->uvd.inst[i].ring.me = i;
 		DRM_INFO("UVD(%d) is enabled in VM mode\n", i);
@@ -1767,6 +1813,8 @@ static void uvd_v7_0_set_enc_ring_funcs(struct amdgpu_device *adev)
 	int i, j;
 
 	for (j = 0; j < adev->uvd.num_uvd_inst; j++) {
+		if (adev->uvd.harvest_config & (1 << j))
+			continue;
 		for (i = 0; i < adev->uvd.num_enc_rings; ++i) {
 			adev->uvd.inst[j].ring_enc[i].funcs = &uvd_v7_0_enc_ring_vm_funcs;
 			adev->uvd.inst[j].ring_enc[i].me = j;
@@ -1786,6 +1834,8 @@ static void uvd_v7_0_set_irq_funcs(struct amdgpu_device *adev)
 	int i;
 
 	for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
+		if (adev->uvd.harvest_config & (1 << i))
+			continue;
 		adev->uvd.inst[i].irq.num_types = adev->uvd.num_enc_rings + 1;
 		adev->uvd.inst[i].irq.funcs = &uvd_v7_0_irq_funcs;
 	}
-- 
2.13.6

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related
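
For readers following the patch above, the harvest bitmask pattern it
applies in every per-instance loop can be modelled outside the kernel.
What follows is a minimal user-space sketch, not the driver code: the
helper names (fake_read_harvest_reg, build_harvest_config,
count_active) and the array of fake register values are assumptions
made for illustration, and only the disable mask value 0x2 is taken
from the patch. It shows how one bit per instance is set from the
harvesting register, and why a counter of non-harvested instances has
to start from zero before it is incremented.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as UVD_PG0_CC_UVD_HARVESTING__UVD_DISABLE_MASK in the patch. */
#define UVD_DISABLE_MASK 0x00000002u

/*
 * Hypothetical stand-in for RREG32_SOC15(): returns a fake
 * per-instance harvesting register value from a caller-supplied array.
 */
static uint32_t fake_read_harvest_reg(const uint32_t *regs, unsigned int inst)
{
	return regs[inst];
}

/* Set bit i in the result when instance i reports itself as disabled. */
static unsigned int build_harvest_config(const uint32_t *regs,
					 unsigned int num_inst)
{
	unsigned int cfg = 0, i;

	assert(num_inst <= 32);	/* one bit per instance in a 32-bit mask */
	for (i = 0; i < num_inst; i++) {
		if (fake_read_harvest_reg(regs, i) & UVD_DISABLE_MASK)
			cfg |= 1u << i;
	}
	return cfg;
}

/* Count instances whose bit is clear, i.e. that are not harvested. */
static unsigned int count_active(unsigned int cfg, unsigned int num_inst)
{
	unsigned int n = 0, i;	/* the counter must start at zero */

	for (i = 0; i < num_inst; i++) {
		if (!(cfg & (1u << i)))
			n++;
	}
	return n;
}
```

With fake register values { 0x0, 0x2 } for two instances,
build_harvest_config() sets only bit 1, so count_active() reports one
usable instance. A mask with both bits set leaves none, which is the
case where the patch returns -ENOENT from early_init.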

