* Performance issues in copy_user_generic() in x86_64
From: Herton R. Krzesinski @ 2025-03-14 17:53 UTC
  To: x86
  Cc: tglx, mingo, bp, dave.hansen, hpa, linux-kernel, torvalds,
	olichtne, atomasov, aokuliar

Hello,

I recently received two reports of a performance loss in copy_user_generic()
after the updates to the user copy functions on x86_64, seen when benchmarking
with iperf3. I believe the alignment of writes to 8 bytes, previously done
through the old ALIGN_DESTINATION macro, was helping in some cases, and that
the performance drop became noticeable once it was removed. Some performance
testing I did appears to corroborate this theory.
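
To make the alignment idea concrete, here is a minimal sketch of the
arithmetic involved. It assumes the destination is aligned by copying a few
leading bytes one at a time until the address is a multiple of 8; the helper
name head_bytes is hypothetical and is not taken from the kernel source.

```shell
#!/bin/bash
# Hypothetical helper (illustration only, not kernel code): number of leading
# bytes to copy individually so that the destination becomes 8-byte aligned.
head_bytes() {
	local dst=$(( $1 ))              # accepts decimal or 0x-prefixed hex
	echo $(( (8 - (dst & 7)) & 7 ))  # 0 when the address is already aligned
}

head_bytes 0x1000   # prints 0: already aligned
head_bytes 0x1003   # prints 5: 0x1003 + 5 = 0x1008
```

After those leading bytes, the remaining stores would all land on 8-byte
boundaries, which is the property the theory above relies on.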

Please take a look at the following email, which contains the patch, and check
whether everything is sane. I already did some testing, as explained in the
patch changelog. I used the following scripts to run the tests; I wrote them
just to get the job done and get some results, so there is nothing fancy about
them.

---- bench.sh
#!/bin/bash

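dir=$1
mkdir -p "$dir"

for cpu in 19 21 23 none; do
	# Flush dirty pages and drop caches before each run.
	sync
	echo 3 > /proc/sys/vm/drop_caches
	# Optionally pin the iperf3 server to one CPU; "none" leaves it unpinned.
	cpu_opt=""
	if [ "$cpu" != "none" ]; then
		cpu_opt="taskset -c $cpu"
	fi
	# $cpu_opt is intentionally unquoted so that it splits into separate words.
	$cpu_opt iperf3 -D -s -B 127.0.0.1 -p 12000
	# The client always runs on CPU 17. Redirect stdout first so that stderr
	# also ends up in the result file.
	perf stat -o "$dir/stat.$cpu.txt" taskset -c 17 iperf3 -c 127.0.0.1 -b 0/1000 -V -n 50G --repeating-payload -l 16384 -p 12000 --cport 12001 > "$dir/stat-$cpu.txt" 2>&1
	# Append the perf stat summary to the iperf3 output and drop the temporary file.
	cat "$dir/stat.$cpu.txt" >> "$dir/stat-$cpu.txt"
	rm -f "$dir/stat.$cpu.txt"
	killall iperf3
done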
----

---- stat.sh
#!/bin/bash

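dir=$1
printf "            %4s  %13s %12s %12s %11s\n" "CPU" "RATE     " "SYS     " "TIME    " "sender-receiver"

for cpu in 19 21 23 none; do
	# Elapsed time and system time come from the perf stat summary.
	time=$(grep 'seconds time elapsed' "$dir/stat-$cpu.txt" | awk '{ print $1 }')
	sys=$(grep 'seconds sys' "$dir/stat-$cpu.txt" | awk '{ print $1 }')
	# Sender rate (value and unit) and CPU utilization come from the iperf3 output.
	rate=$(grep ' sender' "$dir/stat-$cpu.txt" | awk '{ print $7 $8 }')
	cpuu=$(grep 'CPU Utilization' "$dir/stat-$cpu.txt" | awk '{ printf "%s-%s\n", $4, $7 }')

	# Pass the values as arguments so they are never interpreted as format strings.
	printf "Server bind %4s: %s %s %s %s\n" "$cpu" "$rate" "$sys" "$time" "$cpuu"
done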
----

Example of a test run:
nice -n -20 ./bench.sh align
./stat.sh align
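
To compare two runs side by side, something like the following could be used.
This is only a sketch: the script name compare.sh and the second result
directory (called "base" here) are hypothetical, and the rate is extracted
with the same grep/awk pattern that stat.sh uses.

```shell
#!/bin/bash
# compare.sh (hypothetical): show the sender rate of two result directories
# produced by bench.sh, one line per server CPU binding.

# Print the sender rate for one result directory and one CPU binding,
# using the same extraction as stat.sh.
sender_rate() {
	local dir=$1 cpu=$2
	grep ' sender' "$dir/stat-$cpu.txt" | awk '{ print $7 $8 }'
}

# Print both runs next to each other for every server binding.
compare_runs() {
	local a=$1 b=$2 cpu
	for cpu in 19 21 23 none; do
		printf "Server bind %4s: %s=%s %s=%s\n" "$cpu" \
			"$a" "$(sender_rate "$a" "$cpu")" \
			"$b" "$(sender_rate "$b" "$cpu")"
	done
}

# Run only when two directories are given, e.g.: ./compare.sh base align
if [ "$#" -eq 2 ]; then
	compare_runs "$1" "$2"
fi
```

Both directories are assumed to have been filled by bench.sh on the same
machine, one per kernel under test.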



Thread overview: 13+ messages
2025-03-14 17:53 Performance issues in copy_user_generic() in x86_64 Herton R. Krzesinski
2025-03-14 17:53 ` [PATCH] x86: add back the alignment of the destination to 8 bytes in copy_user_generic() Herton R. Krzesinski
2025-03-14 19:06   ` Linus Torvalds
2025-03-14 20:33     ` Herton Krzesinski
2025-03-16 10:58       ` Ingo Molnar
2025-03-16 11:09         ` Ingo Molnar
2025-03-17 13:18           ` Herton Krzesinski
2025-03-18 21:59           ` David Laight
2025-03-18 22:50             ` Herton Krzesinski
2025-03-19 13:07               ` David Laight
2025-03-17 13:16     ` David Laight
2025-03-17 21:29       ` Linus Torvalds
2025-03-17 22:32         ` David Laight
