From: Jesper Dangaard Brouer <brouer@redhat.com>
To: linux-mm@kvack.org, Christoph Lameter <cl@linux.com>,
akpm@linux-foundation.org
Cc: aravinda@linux.vnet.ibm.com, iamjoonsoo.kim@lge.com,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org,
Jesper Dangaard Brouer <brouer@redhat.com>
Subject: [PATCH V2 3/3] slub: build detached freelist with look-ahead
Date: Mon, 24 Aug 2015 02:59:27 +0200 [thread overview]
Message-ID: <20150824005911.2947.50857.stgit@localhost> (raw)
In-Reply-To: <20150824005727.2947.36065.stgit@localhost>
This change is a more advanced use of detached freelist. The bulk
free array is scanned is a progressive manor with a limited look-ahead
facility.
To maintain the same performance level, as the previous simple
implementation, the look-ahead have been limited to only 3 objects.
This number have been determined my experimental micro benchmarking.
For performance the free loop in kmem_cache_free_bulk have been
significantly reorganized, with a focus on making the branches more
predictable for the compiler. E.g. the per CPU c->freelist is also
build as a detached freelist, even-though it would be just as fast as
freeing directly to it, but it save creating an unpredictable branch.
Another benefit of this change is that kmem_cache_free_bulk() runs
mostly with IRQs enabled. The local IRQs are only disabled when
updating the per CPU c->freelist. This should please Thomas Gleixner.
Pitfall(1): Removed kmem debug support.
Pitfall(2): No BUG_ON() freeing NULL pointers, but the algorithm
handles and skips these NULL pointers.
Compare against previous patch:
There is some fluctuation in the benchmarks between runs. To counter
this I've run some specific[1] bulk sizes, repeated 100 times and run
dmesg through Rusty's "stats"[2] tool.
Command line:
sudo dmesg -c ;\
for x in `seq 100`; do \
modprobe slab_bulk_test02 bulksz=48 loops=100000 && rmmod slab_bulk_test02; \
echo $x; \
sleep 0.${RANDOM} ;\
done; \
dmesg | stats
Results:
bulk size:16, average: +2.01 cycles
Prev: between 19-52 (average: 22.65 stddev:+/-6.9)
This: between 19-67 (average: 24.67 stddev:+/-9.9)
bulk size:48, average: +1.54 cycles
Prev: between 23-45 (average: 27.88 stddev:+/-4)
This: between 24-41 (average: 29.42 stddev:+/-3.7)
bulk size:144, average: +1.73 cycles
Prev: between 44-76 (average: 60.31 stddev:+/-7.7)
This: between 49-80 (average: 62.04 stddev:+/-7.3)
bulk size:512, average: +8.94 cycles
Prev: between 50-68 (average: 60.11 stddev: +/-4.3)
This: between 56-80 (average: 69.05 stddev: +/-5.2)
bulk size:2048, average: +26.81 cycles
Prev: between 61-73 (average: 68.10 stddev:+/-2.9)
This: between 90-104(average: 94.91 stddev:+/-2.1)
[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/slab_bulk_test02.c
[2] https://github.com/rustyrussell/stats
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
bulk- Fallback - Bulk API
1 - 64 cycles(tsc) 16.144 ns - 47 cycles(tsc) 11.931 - improved 26.6%
2 - 57 cycles(tsc) 14.397 ns - 29 cycles(tsc) 7.368 - improved 49.1%
3 - 55 cycles(tsc) 13.797 ns - 24 cycles(tsc) 6.003 - improved 56.4%
4 - 53 cycles(tsc) 13.500 ns - 22 cycles(tsc) 5.543 - improved 58.5%
8 - 52 cycles(tsc) 13.008 ns - 20 cycles(tsc) 5.047 - improved 61.5%
16 - 51 cycles(tsc) 12.763 ns - 20 cycles(tsc) 5.015 - improved 60.8%
30 - 50 cycles(tsc) 12.743 ns - 20 cycles(tsc) 5.062 - improved 60.0%
32 - 51 cycles(tsc) 12.908 ns - 20 cycles(tsc) 5.089 - improved 60.8%
34 - 87 cycles(tsc) 21.936 ns - 28 cycles(tsc) 7.006 - improved 67.8%
48 - 79 cycles(tsc) 19.840 ns - 31 cycles(tsc) 7.755 - improved 60.8%
64 - 86 cycles(tsc) 21.669 ns - 68 cycles(tsc) 17.203 - improved 20.9%
128 - 101 cycles(tsc) 25.340 ns - 72 cycles(tsc) 18.195 - improved 28.7%
158 - 112 cycles(tsc) 28.152 ns - 73 cycles(tsc) 18.372 - improved 34.8%
250 - 110 cycles(tsc) 27.727 ns - 73 cycles(tsc) 18.430 - improved 33.6%
---
mm/slub.c | 138 ++++++++++++++++++++++++++++++++++++++++---------------------
1 file changed, 90 insertions(+), 48 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 40e4b5926311..49ae96f45670 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2763,71 +2763,113 @@ struct detached_freelist {
int cnt;
};
-/* Note that interrupts must be enabled when calling this function. */
-void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
+/*
+ * This function extract objects belonging to the same page, and
+ * builds a detached freelist directly within the given page/objects.
+ * This can happen without any need for synchronization, because the
+ * objects are owned by running process. The freelist is build up as
+ * a single linked list in the objects. The idea is, that this
+ * detached freelist can then be bulk transferred to the real
+ * freelist(s), but only requiring a single synchronization primitive.
+ */
+static inline int build_detached_freelist(
+ struct kmem_cache *s, size_t size, void **p,
+ struct detached_freelist *df, int start_index)
{
- struct kmem_cache_cpu *c;
struct page *page;
int i;
- /* Opportunistically delay updating page->freelist, hoping
- * next free happen to same page. Start building the freelist
- * in the page, but keep local stack ptr to freelist. If
- * successful several object can be transferred to page with a
- * single cmpxchg_double.
- */
- struct detached_freelist df = {0};
+ int lookahead = 0;
+ void *object;
- local_irq_disable();
- c = this_cpu_ptr(s->cpu_slab);
+ /* Always re-init detached_freelist */
+ do {
+ object = p[start_index];
+ if (object) {
+ /* Start new delayed freelist */
+ df->page = virt_to_head_page(object);
+ df->tail_object = object;
+ set_freepointer(s, object, NULL);
+ df->freelist = object;
+ df->cnt = 1;
+ p[start_index] = NULL; /* mark object processed */
+ } else {
+ df->page = NULL; /* Handle NULL ptr in array */
+ }
+ start_index++;
+ } while (!object && start_index < size);
- for (i = 0; i < size; i++) {
- void *object = p[i];
+ for (i = start_index; i < size; i++) {
+ object = p[i];
- BUG_ON(!object);
- /* kmem cache debug support */
- s = cache_from_obj(s, object);
- if (unlikely(!s))
- goto exit;
- slab_free_hook(s, object);
+ if (!object)
+ continue; /* Skip processed objects */
page = virt_to_head_page(object);
- if (page == df.page) {
- /* Oppotunity to delay real free */
- set_freepointer(s, object, df.freelist);
- df.freelist = object;
- df.cnt++;
- } else if (c->page == page) {
- /* Fastpath: local CPU free */
- set_freepointer(s, object, c->freelist);
- c->freelist = object;
+ /* df->page is always set at this point */
+ if (page == df->page) {
+ /* Oppotunity build freelist */
+ set_freepointer(s, object, df->freelist);
+ df->freelist = object;
+ df->cnt++;
+ p[i] = NULL; /* mark object processed */
+ if (!lookahead)
+ start_index++;
} else {
- /* Slowpath: Flush delayed free */
- if (df.page) {
+ /* Limit look ahead search */
+ if (++lookahead >= 3)
+ return start_index;
+ continue;
+ }
+ }
+ return start_index;
+}
+
+/* Note that interrupts must be enabled when calling this function. */
+void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
+{
+ struct kmem_cache_cpu *c;
+ int iterator = 0;
+ struct detached_freelist df;
+
+ BUG_ON(!size);
+
+ /* Per CPU ptr may change afterwards */
+ c = this_cpu_ptr(s->cpu_slab);
+
+ while (likely(iterator < size)) {
+ iterator = build_detached_freelist(s, size, p, &df, iterator);
+ if (likely(df.page)) {
+ redo:
+ if (c->page == df.page) {
+ /*
+ * Local CPU free require disabling
+ * IRQs. It is possible to miss the
+ * oppotunity and instead free to
+ * page->freelist, but it does not
+ * matter as page->freelist will
+ * eventually be transferred to
+ * c->freelist
+ */
+ local_irq_disable();
+ c = this_cpu_ptr(s->cpu_slab); /* reload */
+ if (c->page != df.page) {
+ local_irq_enable();
+ goto redo;
+ }
+ /* Bulk transfer to CPU c->freelist */
+ set_freepointer(s, df.tail_object, c->freelist);
+ c->freelist = df.freelist;
+
c->tid = next_tid(c->tid);
local_irq_enable();
+ } else {
+ /* Bulk transfer to page->freelist */
__slab_free(s, df.page, df.tail_object,
_RET_IP_, df.freelist, df.cnt);
- local_irq_disable();
- c = this_cpu_ptr(s->cpu_slab);
}
- /* Start new round of delayed free */
- df.page = page;
- df.tail_object = object;
- set_freepointer(s, object, NULL);
- df.freelist = object;
- df.cnt = 1;
}
}
-exit:
- c->tid = next_tid(c->tid);
- local_irq_enable();
-
- /* Flush detached freelist */
- if (df.page) {
- __slab_free(s, df.page, df.tail_object,
- _RET_IP_, df.freelist, df.cnt);
- }
}
EXPORT_SYMBOL(kmem_cache_free_bulk);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: linux-mm@kvack.org, Christoph Lameter <cl@linux.com>,
akpm@linux-foundation.org
Cc: aravinda@linux.vnet.ibm.com, iamjoonsoo.kim@lge.com,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org,
Jesper Dangaard Brouer <brouer@redhat.com>
Subject: [PATCH V2 3/3] slub: build detached freelist with look-ahead
Date: Mon, 24 Aug 2015 02:59:27 +0200 [thread overview]
Message-ID: <20150824005911.2947.50857.stgit@localhost> (raw)
In-Reply-To: <20150824005727.2947.36065.stgit@localhost>
This change is a more advanced use of detached freelist. The bulk
free array is scanned is a progressive manor with a limited look-ahead
facility.
To maintain the same performance level, as the previous simple
implementation, the look-ahead have been limited to only 3 objects.
This number have been determined my experimental micro benchmarking.
For performance the free loop in kmem_cache_free_bulk have been
significantly reorganized, with a focus on making the branches more
predictable for the compiler. E.g. the per CPU c->freelist is also
build as a detached freelist, even-though it would be just as fast as
freeing directly to it, but it save creating an unpredictable branch.
Another benefit of this change is that kmem_cache_free_bulk() runs
mostly with IRQs enabled. The local IRQs are only disabled when
updating the per CPU c->freelist. This should please Thomas Gleixner.
Pitfall(1): Removed kmem debug support.
Pitfall(2): No BUG_ON() freeing NULL pointers, but the algorithm
handles and skips these NULL pointers.
Compare against previous patch:
There is some fluctuation in the benchmarks between runs. To counter
this I've run some specific[1] bulk sizes, repeated 100 times and run
dmesg through Rusty's "stats"[2] tool.
Command line:
sudo dmesg -c ;\
for x in `seq 100`; do \
modprobe slab_bulk_test02 bulksz=48 loops=100000 && rmmod slab_bulk_test02; \
echo $x; \
sleep 0.${RANDOM} ;\
done; \
dmesg | stats
Results:
bulk size:16, average: +2.01 cycles
Prev: between 19-52 (average: 22.65 stddev:+/-6.9)
This: between 19-67 (average: 24.67 stddev:+/-9.9)
bulk size:48, average: +1.54 cycles
Prev: between 23-45 (average: 27.88 stddev:+/-4)
This: between 24-41 (average: 29.42 stddev:+/-3.7)
bulk size:144, average: +1.73 cycles
Prev: between 44-76 (average: 60.31 stddev:+/-7.7)
This: between 49-80 (average: 62.04 stddev:+/-7.3)
bulk size:512, average: +8.94 cycles
Prev: between 50-68 (average: 60.11 stddev: +/-4.3)
This: between 56-80 (average: 69.05 stddev: +/-5.2)
bulk size:2048, average: +26.81 cycles
Prev: between 61-73 (average: 68.10 stddev:+/-2.9)
This: between 90-104(average: 94.91 stddev:+/-2.1)
[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/slab_bulk_test02.c
[2] https://github.com/rustyrussell/stats
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
bulk- Fallback - Bulk API
1 - 64 cycles(tsc) 16.144 ns - 47 cycles(tsc) 11.931 - improved 26.6%
2 - 57 cycles(tsc) 14.397 ns - 29 cycles(tsc) 7.368 - improved 49.1%
3 - 55 cycles(tsc) 13.797 ns - 24 cycles(tsc) 6.003 - improved 56.4%
4 - 53 cycles(tsc) 13.500 ns - 22 cycles(tsc) 5.543 - improved 58.5%
8 - 52 cycles(tsc) 13.008 ns - 20 cycles(tsc) 5.047 - improved 61.5%
16 - 51 cycles(tsc) 12.763 ns - 20 cycles(tsc) 5.015 - improved 60.8%
30 - 50 cycles(tsc) 12.743 ns - 20 cycles(tsc) 5.062 - improved 60.0%
32 - 51 cycles(tsc) 12.908 ns - 20 cycles(tsc) 5.089 - improved 60.8%
34 - 87 cycles(tsc) 21.936 ns - 28 cycles(tsc) 7.006 - improved 67.8%
48 - 79 cycles(tsc) 19.840 ns - 31 cycles(tsc) 7.755 - improved 60.8%
64 - 86 cycles(tsc) 21.669 ns - 68 cycles(tsc) 17.203 - improved 20.9%
128 - 101 cycles(tsc) 25.340 ns - 72 cycles(tsc) 18.195 - improved 28.7%
158 - 112 cycles(tsc) 28.152 ns - 73 cycles(tsc) 18.372 - improved 34.8%
250 - 110 cycles(tsc) 27.727 ns - 73 cycles(tsc) 18.430 - improved 33.6%
---
mm/slub.c | 138 ++++++++++++++++++++++++++++++++++++++++---------------------
1 file changed, 90 insertions(+), 48 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 40e4b5926311..49ae96f45670 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2763,71 +2763,113 @@ struct detached_freelist {
int cnt;
};
-/* Note that interrupts must be enabled when calling this function. */
-void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
+/*
+ * This function extract objects belonging to the same page, and
+ * builds a detached freelist directly within the given page/objects.
+ * This can happen without any need for synchronization, because the
+ * objects are owned by running process. The freelist is build up as
+ * a single linked list in the objects. The idea is, that this
+ * detached freelist can then be bulk transferred to the real
+ * freelist(s), but only requiring a single synchronization primitive.
+ */
+static inline int build_detached_freelist(
+ struct kmem_cache *s, size_t size, void **p,
+ struct detached_freelist *df, int start_index)
{
- struct kmem_cache_cpu *c;
struct page *page;
int i;
- /* Opportunistically delay updating page->freelist, hoping
- * next free happen to same page. Start building the freelist
- * in the page, but keep local stack ptr to freelist. If
- * successful several object can be transferred to page with a
- * single cmpxchg_double.
- */
- struct detached_freelist df = {0};
+ int lookahead = 0;
+ void *object;
- local_irq_disable();
- c = this_cpu_ptr(s->cpu_slab);
+ /* Always re-init detached_freelist */
+ do {
+ object = p[start_index];
+ if (object) {
+ /* Start new delayed freelist */
+ df->page = virt_to_head_page(object);
+ df->tail_object = object;
+ set_freepointer(s, object, NULL);
+ df->freelist = object;
+ df->cnt = 1;
+ p[start_index] = NULL; /* mark object processed */
+ } else {
+ df->page = NULL; /* Handle NULL ptr in array */
+ }
+ start_index++;
+ } while (!object && start_index < size);
- for (i = 0; i < size; i++) {
- void *object = p[i];
+ for (i = start_index; i < size; i++) {
+ object = p[i];
- BUG_ON(!object);
- /* kmem cache debug support */
- s = cache_from_obj(s, object);
- if (unlikely(!s))
- goto exit;
- slab_free_hook(s, object);
+ if (!object)
+ continue; /* Skip processed objects */
page = virt_to_head_page(object);
- if (page == df.page) {
- /* Oppotunity to delay real free */
- set_freepointer(s, object, df.freelist);
- df.freelist = object;
- df.cnt++;
- } else if (c->page == page) {
- /* Fastpath: local CPU free */
- set_freepointer(s, object, c->freelist);
- c->freelist = object;
+ /* df->page is always set at this point */
+ if (page == df->page) {
+ /* Oppotunity build freelist */
+ set_freepointer(s, object, df->freelist);
+ df->freelist = object;
+ df->cnt++;
+ p[i] = NULL; /* mark object processed */
+ if (!lookahead)
+ start_index++;
} else {
- /* Slowpath: Flush delayed free */
- if (df.page) {
+ /* Limit look ahead search */
+ if (++lookahead >= 3)
+ return start_index;
+ continue;
+ }
+ }
+ return start_index;
+}
+
+/* Note that interrupts must be enabled when calling this function. */
+void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
+{
+ struct kmem_cache_cpu *c;
+ int iterator = 0;
+ struct detached_freelist df;
+
+ BUG_ON(!size);
+
+ /* Per CPU ptr may change afterwards */
+ c = this_cpu_ptr(s->cpu_slab);
+
+ while (likely(iterator < size)) {
+ iterator = build_detached_freelist(s, size, p, &df, iterator);
+ if (likely(df.page)) {
+ redo:
+ if (c->page == df.page) {
+ /*
+ * Local CPU free require disabling
+ * IRQs. It is possible to miss the
+ * oppotunity and instead free to
+ * page->freelist, but it does not
+ * matter as page->freelist will
+ * eventually be transferred to
+ * c->freelist
+ */
+ local_irq_disable();
+ c = this_cpu_ptr(s->cpu_slab); /* reload */
+ if (c->page != df.page) {
+ local_irq_enable();
+ goto redo;
+ }
+ /* Bulk transfer to CPU c->freelist */
+ set_freepointer(s, df.tail_object, c->freelist);
+ c->freelist = df.freelist;
+
c->tid = next_tid(c->tid);
local_irq_enable();
+ } else {
+ /* Bulk transfer to page->freelist */
__slab_free(s, df.page, df.tail_object,
_RET_IP_, df.freelist, df.cnt);
- local_irq_disable();
- c = this_cpu_ptr(s->cpu_slab);
}
- /* Start new round of delayed free */
- df.page = page;
- df.tail_object = object;
- set_freepointer(s, object, NULL);
- df.freelist = object;
- df.cnt = 1;
}
}
-exit:
- c->tid = next_tid(c->tid);
- local_irq_enable();
-
- /* Flush detached freelist */
- if (df.page) {
- __slab_free(s, df.page, df.tail_object,
- _RET_IP_, df.freelist, df.cnt);
- }
}
EXPORT_SYMBOL(kmem_cache_free_bulk);
next prev parent reply other threads:[~2015-08-24 0:59 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-08-24 0:58 [PATCH V2 0/3] slub: introducing detached freelist Jesper Dangaard Brouer
2015-08-24 0:58 ` Jesper Dangaard Brouer
2015-08-24 0:58 ` [PATCH V2 1/3] slub: extend slowpath __slab_free() to handle bulk free Jesper Dangaard Brouer
2015-08-24 0:58 ` Jesper Dangaard Brouer
2015-08-24 0:59 ` [PATCH V2 2/3] slub: optimize bulk slowpath free by detached freelist Jesper Dangaard Brouer
2015-08-24 0:59 ` Jesper Dangaard Brouer
2015-08-24 0:59 ` Jesper Dangaard Brouer [this message]
2015-08-24 0:59 ` [PATCH V2 3/3] slub: build detached freelist with look-ahead Jesper Dangaard Brouer
2015-09-04 17:00 ` [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API Jesper Dangaard Brouer
2015-09-04 17:00 ` Jesper Dangaard Brouer
2015-09-04 17:00 ` [RFC PATCH 1/3] net: introduce kfree_skb_bulk() user of kmem_cache_free_bulk() Jesper Dangaard Brouer
2015-09-04 18:47 ` Tom Herbert
2015-09-07 8:41 ` Jesper Dangaard Brouer
2015-09-07 8:41 ` Jesper Dangaard Brouer
2015-09-07 16:25 ` Tom Herbert
2015-09-07 20:14 ` Jesper Dangaard Brouer
2015-09-07 20:14 ` Jesper Dangaard Brouer
2015-09-08 21:01 ` David Miller
2015-09-04 17:01 ` [RFC PATCH 2/3] net: NIC helper API for building array of skbs to free Jesper Dangaard Brouer
2015-09-04 17:01 ` Jesper Dangaard Brouer
2015-09-04 17:01 ` [RFC PATCH 3/3] ixgbe: bulk free SKBs during TX completion cleanup cycle Jesper Dangaard Brouer
2015-09-04 18:09 ` [RFC PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API Alexander Duyck
2015-09-04 18:55 ` Christoph Lameter
2015-09-04 20:39 ` Alexander Duyck
2015-09-04 23:45 ` Christoph Lameter
2015-09-05 11:18 ` Jesper Dangaard Brouer
2015-09-08 17:32 ` Christoph Lameter
2015-09-08 17:32 ` Christoph Lameter
2015-09-09 12:59 ` Jesper Dangaard Brouer
2015-09-09 14:08 ` Christoph Lameter
2015-09-09 14:08 ` Christoph Lameter
2015-09-07 8:16 ` Jesper Dangaard Brouer
2015-09-07 21:23 ` Alexander Duyck
2015-09-07 21:23 ` Alexander Duyck
2015-09-16 10:02 ` Experiences with slub bulk use-case for network stack Jesper Dangaard Brouer
2015-09-16 10:02 ` Jesper Dangaard Brouer
2015-09-16 15:13 ` Christoph Lameter
2015-09-17 20:17 ` Jesper Dangaard Brouer
2015-09-17 23:57 ` Christoph Lameter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150824005911.2947.50857.stgit@localhost \
--to=brouer@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=aravinda@linux.vnet.ibm.com \
--cc=cl@linux.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=paulmck@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.