From: Andrea Arcangeli <aarcange@redhat.com>
To: Alex Shi <alex.shi@intel.com>
Cc: Alex Shi <lkml.alex@gmail.com>,
Petr Holasek <pholasek@redhat.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Hillf Danton <dhillf@gmail.com>, Dan Smith <danms@us.ibm.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
Paul Turner <pjt@google.com>,
Suresh Siddha <suresh.b.siddha@intel.com>,
Mike Galbraith <efault@gmx.de>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Lai Jiangshan <laijs@cn.fujitsu.com>,
Bharata B Rao <bharata.rao@gmail.com>,
Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
Rik van Riel <riel@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
Christoph Lameter <cl@linux.com>,
"Chen, Tim C" <tim.c.chen@intel.com>
Subject: Re: AutoNUMA15
Date: Tue, 26 Jun 2012 14:03:25 +0200
Message-ID: <20120626120325.GA25956@redhat.com>
In-Reply-To: <4FE96A3A.2080307@intel.com>
[-- Attachment #1: Type: text/plain, Size: 4437 bytes --]
On Tue, Jun 26, 2012 at 03:52:26PM +0800, Alex Shi wrote:
> Could you like to give a url for the benchmarks?
I posted them to lkml a few months ago; I'm attaching them here. There
is a more polished version around that I haven't had time to test
yet, so for now I'm attaching the old version that I'm still using to
verify the regressions.
If you edit the .c files so that the hard/inverse binds match your CPU
layout, and then build first with -DHARD_BIND and later with
-DHARD_BIND -DINVERSE_BIND (in the attached code INVERSE_BIND only
takes effect when HARD_BIND is also defined), you can measure the
hardware NUMA effects on your machine. numactl --hardware will give
you the topology to check if the code is ok for your hardware.
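As a rough aid for that topology check, here is a minimal sketch that is not part of the attached benchmarks. The helper name parse_cpulist is hypothetical. It parses a CPU list in the format used by /sys/devices/system/node/nodeN/cpulist (for example "0-5,12-17") into a bitmask, so the result can be compared with the CPU ranges hard-coded in numa01.c and numa02.c. It assumes at most 64 CPUs.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of numa01.c/numa02.c: parse a sysfs-style
 * CPU list such as "0-5,12-17" into a 64-bit mask with bit n set for CPU n.
 * Returns 0 on success, -1 on malformed input or a CPU number above 63.
 */
int parse_cpulist(const char *list, uint64_t *mask)
{
	const char *s = list;
	char *end;

	*mask = 0;
	while (*s && *s != '\n') {
		unsigned long lo, hi, cpu;

		if (*s < '0' || *s > '9')
			return -1;
		errno = 0;
		lo = strtoul(s, &end, 10);
		if (errno)
			return -1;
		hi = lo;
		if (*end == '-') {
			/* a range "lo-hi": parse the upper bound */
			s = end + 1;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtoul(s, &end, 10);
			if (errno)
				return -1;
		}
		if (hi < lo || hi > 63)
			return -1;
		for (cpu = lo; cpu <= hi; cpu++)
			*mask |= UINT64_C(1) << cpu;
		if (*end == ',')
			end++;
		else if (*end && *end != '\n')
			return -1;
		s = end;
	}
	return 0;
}
```

The attached files bind node 0 to CPUs 0-5 and 12-17 and node 1 to CPUs 6-11 and 18-23. If the masks parsed from node0/cpulist and node1/cpulist on your machine differ from those ranges, the CPU_SET() loops need editing before building with -DHARD_BIND.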
> memory). find the openjdk has about 2% regression, while jrockit has no
A 2% regression is, in the worst case, the cost of the NUMA hinting
page faults (or, in the best case, a measurement error) when you get
no benefit from the vastly increased NUMA affinity.
You can reduce that overhead to below 1% by multiplying
/sys/kernel/mm/autonuma/knuma_scand/scan_sleep_millisecs and
/sys/kernel/mm/autonuma/knuma_scand/scan_sleep_pass_millisecs by 2 or 3.
The latter especially: setting it to 15000 will reduce the overhead by 1%.
The current AutoNUMA defaults are hyper aggressive. With benchmarks
that run for several minutes you can easily reduce AutoNUMA
aggressiveness to pay a lower fixed cost in the NUMA hinting page
faults without reducing overall performance.
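The tuning described above can be sketched in C as follows. This is only an illustration under stated assumptions: the helper name scale_tunable is hypothetical, the knuma_scand files exist only on kernels carrying the AutoNUMA patches, and writing to them requires root. The function reads the current integer from a file, multiplies it by a factor, and writes the result back.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read an unsigned integer from a sysfs-style file,
 * multiply it by `factor`, and write the result back to the same file.
 * Returns the new value, or -1 on any I/O or parse error.
 */
long scale_tunable(const char *path, unsigned int factor)
{
	unsigned long val;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (fscanf(f, "%lu", &val) != 1) {
		fclose(f);
		return -1;
	}
	fclose(f);

	val *= factor;

	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fprintf(f, "%lu\n", val) < 0) {
		fclose(f);
		return -1;
	}
	/* sysfs reports a rejected value when the buffer is flushed */
	if (fclose(f) != 0)
		return -1;
	return (long)val;
}
```

A call such as scale_tunable("/sys/kernel/mm/autonuma/knuma_scand/scan_sleep_pass_millisecs", 3) would triple the pass sleep. The plain echo into the same file does the same job when the target value is already known.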
The boost when you use AutoNUMA is >20%, sometimes as high as 100%, so
the 2% is lost in the noise, but over time we should reduce it
(especially with a hypervisor-tuned profile for cloud nodes that only
run virtual machines, which in turn have fairly constant loads, so
there's no need to react that fast).
> the testing user 2 instances, each of them are pinned to a node. some
> setting is here:
Ok, the problem is that you must not pin anything. If you hard pin,
AutoNUMA won't do anything on those processes.
It is impossible to run faster than raw hard pinning: AutoNUMA also
has to migrate memory, while hard pinning avoids all memory
migrations.
AutoNUMA aims to achieve performance as close to hard pinning as
possible without having to use hard pinning; that's the whole point.
So this explains why you measure a 2% regression or no difference:
with hard pins used at all times, only the AutoNUMA worst case overhead
can be measured (and I explained above how it can be reduced).
A plan I can suggest for this benchmark is this:
1) "upstream default"
   - no hugetlbfs (AutoNUMA cannot migrate hugetlbfs memory)
   - no hard pinning of CPUs or memory to nodes
   - CONFIG_AUTONUMA=n
   - CONFIG_TRANSPARENT_HUGEPAGE=y
2) "autonuma"
   - no hugetlbfs (AutoNUMA cannot migrate hugetlbfs memory)
   - no hard pinning of CPUs or memory to nodes
   - CONFIG_AUTONUMA=y
   - CONFIG_AUTONUMA_DEFAULT_ENABLED=y
   - CONFIG_TRANSPARENT_HUGEPAGE=y
3) "autonuma lower numa hinting page fault overhead"
   - no hugetlbfs (AutoNUMA cannot migrate hugetlbfs memory)
   - no hard pinning of CPUs or memory to nodes
   - CONFIG_AUTONUMA=y
   - CONFIG_AUTONUMA_DEFAULT_ENABLED=y
   - CONFIG_TRANSPARENT_HUGEPAGE=y
   - echo 15000 >/sys/kernel/mm/autonuma/knuma_scand/scan_sleep_pass_millisecs
4) "upstream hard pinning and transparent hugepage"
   - hard pinning of CPUs or memory to nodes
   - CONFIG_AUTONUMA=n
   - CONFIG_TRANSPARENT_HUGEPAGE=y
5) "upstream hard pinning and hugetlbfs"
   - hugetlbfs
   - hard pinning of CPUs or memory to nodes
   - CONFIG_AUTONUMA=n
   - CONFIG_TRANSPARENT_HUGEPAGE=y (y/n won't matter if you use hugetlbfs)
Then you can compare 1/2/3/4/5.
The minimum to make a meaningful comparison is 1 vs 2. The next best
comparison is 1 vs 2 vs 4 (4 is a very useful reference too, because
the closer AutoNUMA gets to 4 the better: beating 1 is trivial, while
getting very close to 4 is harder because 4 isn't migrating any
memory). Running 3 and 5 is optional; I mentioned 5 only because you
wanted to run with hugetlbfs and not just THP.
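One way to put numbers on the 1 vs 2 vs 4 comparison is sketched below. The function names and both formulas are assumptions made for illustration, not something defined in this thread: the first expresses the gain of one configuration over another from wall-clock times, the second expresses how much of the distance between configuration 1 and configuration 4 a run of configuration 2 has covered.

```c
/*
 * Hypothetical helpers for comparing benchmark runs by wall-clock time.
 * Lower times are better in both functions.
 */

/* Throughput gain of the candidate over the baseline, in percent. */
double gain_percent(double baseline_secs, double candidate_secs)
{
	if (candidate_secs <= 0.0)
		return 0.0;
	return (baseline_secs / candidate_secs - 1.0) * 100.0;
}

/*
 * Share of the gap between "upstream default" (1) and "hard pinning" (4)
 * that the "autonuma" run (2) has closed, in percent. 100 means the
 * autonuma run matched hard pinning; 0 means it matched the default.
 */
double gap_closed_percent(double default_secs, double autonuma_secs,
			  double pinned_secs)
{
	double gap = default_secs - pinned_secs;

	if (gap <= 0.0)
		return 0.0;
	return (default_secs - autonuma_secs) / gap * 100.0;
}
```

Both functions take the times measured on your own runs; no measured values are implied here.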
> jrockit use hugetlb and its options:
hugetlbfs should be disabled when AutoNUMA is enabled because AutoNUMA
won't try to migrate hugetlbfs memory, not that it makes any
difference if the memory is hard pinned. THP should deliver the same
performance as hugetlbfs for the JVM, and THP memory can be migrated
by AutoNUMA (as can mmapped non-shared pagecache, not just anon
memory).
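Before a THP run it can help to confirm which THP mode is active. The sketch below is an illustration with a hypothetical helper name, active_mode. It assumes the usual format of /sys/kernel/mm/transparent_hugepage/enabled, where the active choice is shown in square brackets (for example "always [madvise] never"), and extracts the bracketed word from such a line.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the bracketed (active) word from a line such
 * as "always [madvise] never" into `out`.
 * Returns 0 on success, -1 if there is no bracketed word or `out` is
 * too small to hold it plus the terminating NUL.
 */
int active_mode(const char *line, char *out, size_t outlen)
{
	const char *lb = strchr(line, '[');
	const char *rb;
	size_t len;

	if (!lb)
		return -1;
	rb = strchr(lb, ']');
	if (!rb)
		return -1;
	len = (size_t)(rb - lb - 1);
	if (len + 1 > outlen)
		return -1;
	memcpy(out, lb + 1, len);
	out[len] = '\0';
	return 0;
}
```

Reading the first line of the sysfs file with fgets() and passing it to active_mode() is enough to see which mode is selected before a THP run.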
Thanks a lot, and looking forward to seeing how things go when you
remove the hard pins.
Andrea
[-- Attachment #2: numa01.c --]
[-- Type: text/x-c, Size: 3046 bytes --]
/*
 *  Copyright (C) 2012  Red Hat, Inc.
 *
 *  This work is licensed under the terms of the GNU GPL, version 2. See
 *  the COPYING file in the top-level directory.
 */
#define _GNU_SOURCE
#include <pthread.h>
#include <strings.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <numaif.h>
#include <sched.h>
#include <time.h>
#include <sys/wait.h>
#include <sys/file.h>
//#define KVM
#ifndef KVM
#define THREADS 12
#define SIZE (3UL*1024*1024*1024)
#else
#define THREADS 4
#define SIZE (200*1024*1024)
#endif
//#define THREAD_ALLOC
#ifdef THREAD_ALLOC
#define THREAD_SIZE (SIZE/THREADS)
#else
#define THREAD_SIZE SIZE
#endif
//#define HARD_BIND
//#define INVERSE_BIND
//#define NO_BIND_FORCE_SAME_NODE
static char *p_global;
static unsigned long nodemask_global;
void *thread(void * arg)
{
	char *p = arg;
	int i;
	/* nr is the number of passes each thread makes over its buffer */
#ifndef KVM
#ifndef THREAD_ALLOC
	int nr = 50;
#else
	int nr = 1000;
#endif
#else
	int nr = 500;
#endif
#ifdef NO_BIND_FORCE_SAME_NODE
	/* fault the whole buffer in on the single node in nodemask_global */
	if (set_mempolicy(MPOL_BIND, &nodemask_global, 3) < 0)
		perror("set_mempolicy"), printf("%lu\n", nodemask_global),
			exit(1);
#endif
	bzero(p_global, SIZE);
#ifdef NO_BIND_FORCE_SAME_NODE
	if (set_mempolicy(MPOL_DEFAULT, NULL, 3) < 0)
		perror("set_mempolicy"), exit(1);
#endif
	for (i = 0; i < nr; i++) {
#if 1
		bzero(p, THREAD_SIZE);
#else
		memcpy(p, p+THREAD_SIZE/2, THREAD_SIZE/2);
#endif
		/* compiler barrier between passes */
		asm volatile("" : : : "memory");
	}
	return NULL;
}
int main()
{
	int i;
	pthread_t pthread[THREADS];
	char *p;
	pid_t pid;
	cpu_set_t cpumask;
	int f;
	unsigned long nodemask;

	/* nodemask 1 or 2 (node 0 or node 1), chosen from the clock parity */
	nodemask_global = (time(NULL) & 1) + 1;
	f = creat("lock", 0400);
	if (f < 0)
		perror("creat"), exit(1);
	if (unlink("lock") < 0)
		perror("unlink"), exit(1);
	/* two processes: the child targets node 0, the parent node 1 */
	if ((pid = fork()) < 0)
		perror("fork"), exit(1);
	p_global = p = malloc(SIZE);
	if (!p)
		perror("malloc"), exit(1);
	/*
	 * The CPU numbers below are hard-coded for a 2-node, 24-CPU
	 * layout; edit them to match your hardware.
	 */
	CPU_ZERO(&cpumask);
	if (!pid) {
		nodemask = 1;
		for (i = 0; i < 6; i++)
			CPU_SET(i, &cpumask);
#if 1
		for (i = 12; i < 18; i++)
			CPU_SET(i, &cpumask);
#else
		for (i = 6; i < 12; i++)
			CPU_SET(i, &cpumask);
#endif
	} else {
		nodemask = 2;
		for (i = 6; i < 12; i++)
			CPU_SET(i, &cpumask);
		for (i = 18; i < 24; i++)
			CPU_SET(i, &cpumask);
	}
#ifdef INVERSE_BIND
	/* swap the memory node so CPUs and memory end up on different nodes */
	if (nodemask == 1)
		nodemask = 2;
	else if (nodemask == 2)
		nodemask = 1;
#endif
#if 0
	if (pid)
		goto skip;
#endif
#ifdef HARD_BIND
	if (sched_setaffinity(0, sizeof(cpumask), &cpumask) < 0)
		perror("sched_setaffinity"), exit(1);
#endif
#ifdef HARD_BIND
	if (set_mempolicy(MPOL_BIND, &nodemask, 3) < 0)
		perror("set_mempolicy"), printf("%lu\n", nodemask), exit(1);
#endif
#if 0
	bzero(p, SIZE);
#endif
	for (i = 0; i < THREADS; i++) {
		char *_p = p;
#ifdef THREAD_ALLOC
		_p += THREAD_SIZE * i;
#endif
		if (pthread_create(&pthread[i], NULL, thread, _p) != 0)
			perror("pthread_create"), exit(1);
	}
	for (i = 0; i < THREADS; i++)
		if (pthread_join(pthread[i], NULL) != 0)
			perror("pthread_join"), exit(1);
#if 1
skip:
#endif
	/* the parent waits for the child before exiting */
	if (pid)
		if (wait(NULL) < 0)
			perror("wait"), exit(1);
	return 0;
}
[-- Attachment #3: numa02.c --]
[-- Type: text/x-c, Size: 2138 bytes --]
/*
 *  Copyright (C) 2012  Red Hat, Inc.
 *
 *  This work is licensed under the terms of the GNU GPL, version 2. See
 *  the COPYING file in the top-level directory.
 */
#define _GNU_SOURCE
#include <pthread.h>
#include <strings.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <numaif.h>
#include <sched.h>
#include <sys/wait.h>
#include <sys/file.h>
//#define KVM
#ifndef KVM
#ifndef SMT
#define THREADS 24
#else
#define THREADS 12
#endif
#define SIZE (1UL*1024*1024*1024)
#else
#ifndef SMT
#define THREADS 8
#else
#define THREADS 4
#endif
#define SIZE (500*1024*1024)
#endif
#define TOTALSIZE (4UL*1024*1024*1024*200)
#define THREAD_SIZE (SIZE/THREADS)
//#define HARD_BIND
//#define INVERSE_BIND
static void *thread(void * arg)
{
	char *p = arg;
	int i;
	for (i = 0; i < TOTALSIZE/SIZE; i++) {
		bzero(p, THREAD_SIZE);
		asm volatile("" : : : "memory");
	}
	return NULL;
}
#ifdef HARD_BIND
/*
 * Bind the calling thread's CPUs and memory policy to one node.
 * The CPU numbers are hard-coded for a 2-node, 24-CPU layout;
 * edit them to match your hardware.
 */
static void bind(int node)
{
	int i;
	unsigned long nodemask;
	cpu_set_t cpumask;
	CPU_ZERO(&cpumask);
	if (!node) {
		nodemask = 1;
		for (i = 0; i < 6; i++)
			CPU_SET(i, &cpumask);
		for (i = 12; i < 18; i++)
			CPU_SET(i, &cpumask);
	} else {
		nodemask = 2;
		for (i = 6; i < 12; i++)
			CPU_SET(i, &cpumask);
		for (i = 18; i < 24; i++)
			CPU_SET(i, &cpumask);
	}
	if (sched_setaffinity(0, sizeof(cpumask), &cpumask) < 0)
		perror("sched_setaffinity"), exit(1);
	if (set_mempolicy(MPOL_BIND, &nodemask, 3) < 0)
		perror("set_mempolicy"), printf("%lu\n", nodemask), exit(1);
}
#else
/* without HARD_BIND, bind() is a no-op and placement is left to the kernel */
static void bind(int node) {}
#endif
int main()
{
	int i;
	pthread_t pthread[THREADS];
	char *p;
	pid_t pid;
	int f;
	p = malloc(SIZE);
	if (!p)
		perror("malloc"), exit(1);
	/* with HARD_BIND: first half of the buffer on node 0, second on node 1 */
	bind(0);
	bzero(p, SIZE/2);
	bind(1);
	bzero(p+SIZE/2, SIZE/2);
	for (i = 0; i < THREADS; i++) {
		char *_p = p + THREAD_SIZE * i;
		/* the new thread inherits the affinity set by bind() here */
#ifdef INVERSE_BIND
		bind(i < THREADS/2);
#else
		bind(i >= THREADS/2);
#endif
		if (pthread_create(&pthread[i], NULL, thread, _p) != 0)
			perror("pthread_create"), exit(1);
	}
	for (i = 0; i < THREADS; i++)
		if (pthread_join(pthread[i], NULL) != 0)
			perror("pthread_join"), exit(1);
	return 0;
}