BPF List
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@meta.com>
To: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: bpf <bpf@vger.kernel.org>, Kui-Feng Lee <kuifeng@meta.com>,
	Manu Bretelle <chantr4@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Jakub Sitnicki <jakub@cloudflare.com>
Subject: Re: selftest sock_fields failed on s390x with latest llvm17
Date: Mon, 15 May 2023 08:27:15 -0700	[thread overview]
Message-ID: <5a57547e-6865-1026-60db-bb6b2ca70e34@meta.com> (raw)
In-Reply-To: <35094b7c6fbe9843638a3695d56a92e42f3cfe4c.camel@linux.ibm.com>



On 5/15/23 12:55 AM, Ilya Leoshkevich wrote:
> On Sun, 2023-05-14 at 09:58 -0700, Yonghong Song wrote:
>>
>>
>> On 5/13/23 1:24 AM, Ilya Leoshkevich wrote:
>>> On Fri, 2023-05-12 at 21:13 -0700, Yonghong Song wrote:
>>>>
>>>>
>>>> On 5/12/23 7:40 AM, Ilya Leoshkevich wrote:
>>>>> On Wed, 2023-05-03 at 21:46 +0200, Ilya Leoshkevich wrote:
>>>>>> On Wed, 2023-05-03 at 12:35 -0700, Yonghong Song wrote:
>>>>>>> Hi, Ilya,
>>>>>>>
>>>>>>> BPF CI ([1]) detected a s390x failure when bpf program is
>>>>>>> compiled
>>>>>>> with
>>>>>>> latest llvm17 on bpf-next branch. To reproduce the issue,
>>>>>>> just
>>>>>>> run
>>>>>>> normal './test_progs -j'. The failure log looks like below:
>>>>>>>
>>>>>>> Notice: Success: 341/3015, Skipped: 29, Failed: 1
>>>>>>> Error: #191 sock_fields
>>>>>>>       Error: #191 sock_fields
>>>>>>>       create_netns:PASS:create netns 0 nsec
>>>>>>>       create_netns:PASS:bring up lo 0 nsec
>>>>>>>      
>>>>>>> serial_test_sock_fields:PASS:test_sock_fields__open_and_loa
>>>>>>> d 0
>>>>>>> nsec
>>>>>>>      
>>>>>>> serial_test_sock_fields:PASS:attach_cgroup(egress_read_sock
>>>>>>> _fie
>>>>>>> lds)
>>>>>>> 0
>>>>>>> nsec
>>>>>>>      
>>>>>>> serial_test_sock_fields:PASS:attach_cgroup(ingress_read_soc
>>>>>>> k_fi
>>>>>>> elds
>>>>>>> )
>>>>>>> 0 nsec
>>>>>>>      
>>>>>>> serial_test_sock_fields:PASS:attach_cgroup(read_sk_dst_port
>>>>>>> 0
>>>>>>> nsec
>>>>>>>       test:PASS:getsockname(listen_fd) 0 nsec
>>>>>>>       test:PASS:getsockname(cli_fd) 0 nsec
>>>>>>>       test:PASS:accept(listen_fd) 0 nsec
>>>>>>>      
>>>>>>> init_sk_storage:PASS:bpf_map_update_elem(sk_pkt_out_cnt_fd)
>>>>>>> 0
>>>>>>> nsec
>>>>>>>      
>>>>>>> init_sk_storage:PASS:bpf_map_update_elem(sk_pkt_out_cnt10_f
>>>>>>> d) 0
>>>>>>> nsec
>>>>>>>       test:PASS:send(accept_fd) 0 nsec
>>>>>>>       test:PASS:recv(cli_fd) 0 nsec
>>>>>>>       test:PASS:send(accept_fd) 0 nsec
>>>>>>>       test:PASS:recv(cli_fd) 0 nsec
>>>>>>>       test:PASS:recv(accept_fd) for fin 0 nsec
>>>>>>>       test:PASS:recv(cli_fd) for fin 0 nsec
>>>>>>>      
>>>>>>> check_sk_pkt_out_cnt:PASS:bpf_map_lookup_elem(sk_pkt_out_cn
>>>>>>> t,
>>>>>>> &accept_fd) 0 nsec
>>>>>>>      
>>>>>>> check_sk_pkt_out_cnt:PASS:bpf_map_lookup_elem(sk_pkt_out_cn
>>>>>>> t,
>>>>>>> &cli_fd) 0 nsec
>>>>>>>       check_result:PASS:bpf_map_lookup_elem(linum_map_fd) 0
>>>>>>> nsec
>>>>>>>       check_result:PASS:bpf_map_lookup_elem(linum_map_fd) 0
>>>>>>> nsec
>>>>>>>       check_result:PASS:bpf_map_lookup_elem(linum_map_fd,
>>>>>>> READ_SK_DST_PORT_IDX) 0 nsec
>>>>>>>       check_result:FAIL:failure in read_sk_dst_port on line
>>>>>>> unexpected
>>>>>>> failure in read_sk_dst_port on line: actual 297 != expected
>>>>>>> 0
>>>>>>>       listen_sk: state:10 bound_dev_if:0 family:10 type:1
>>>>>>> protocol:6
>>>>>>> mark:0
>>>>>>> priority:0 src_ip4:7f000006(127.0.0.6) src_ip6:0:0:0:1(::1)
>>>>>>> src_port:51966 dst_ip4:0(0.0.0.0) dst_ip6:0:0:0:0(::)
>>>>>>> dst_port:0
>>>>>>>       srv_sk: state:9 bound_dev_if:0 family:10 type:1
>>>>>>> protocol:6
>>>>>>> mark:0
>>>>>>> priority:0 src_ip4:7f000006(127.0.0.6) src_ip6:0:0:0:1(::1)
>>>>>>> src_port:51966 dst_ip4:7f000006(127.0.0.6)
>>>>>>> dst_ip6:0:0:0:1(::1)
>>>>>>> dst_port:38030
>>>>>>>       cli_sk: state:5 bound_dev_if:0 family:10 type:1
>>>>>>> protocol:6
>>>>>>> mark:0
>>>>>>> priority:0 src_ip4:7f000006(127.0.0.6) src_ip6:0:0:0:1(::1)
>>>>>>> src_port:38030 dst_ip4:0(0.0.0.0) dst_ip6:0:0:0:1(::1)
>>>>>>> dst_port:51966
>>>>>>>       listen_tp: snd_cwnd:10 srtt_us:0 rtt_min:4294967295
>>>>>>> snd_ssthresh:2147483647 rcv_nxt:0 snd_nxt:0 snd:una:0
>>>>>>> mss_cache:536
>>>>>>> ecn_flags:0 rate_delivered:0 rate_interval_us:0
>>>>>>> packets_out:0
>>>>>>> retrans_out:0 total_retrans:0 segs_in:0 data_segs_in:0
>>>>>>> segs_out:0
>>>>>>> data_segs_out:0 lost_out:0 sacked_out:0 bytes_received:0
>>>>>>> bytes_acked:0
>>>>>>>       srv_tp: snd_cwnd:10 srtt_us:3904 rtt_min:272
>>>>>>> snd_ssthresh:2147483647
>>>>>>> rcv_nxt:648617715 snd_nxt:4218065810 snd:una:4218065810
>>>>>>> mss_cache:32768
>>>>>>> ecn_flags:0 rate_delivered:1 rate_interval_us:272
>>>>>>> packets_out:0
>>>>>>> retrans_out:0 total_retrans:0 segs_in:5 data_segs_in:0
>>>>>>> segs_out:3
>>>>>>> data_segs_out:2 lost_out:0 sacked_out:0 bytes_received:1
>>>>>>> bytes_acked:22
>>>>>>>       cli_tp: snd_cwnd:10 srtt_us:6035 rtt_min:730
>>>>>>> snd_ssthresh:2147483647
>>>>>>> rcv_nxt:4218065811 snd_nxt:648617715 snd:una:648617715
>>>>>>> mss_cache:32768
>>>>>>> ecn_flags:0 rate_delivered:1 rate_interval_us:925
>>>>>>> packets_out:0
>>>>>>> retrans_out:0 total_retrans:0 segs_in:4 data_segs_in:2
>>>>>>> segs_out:6
>>>>>>> data_segs_out:0 lost_out:0 sacked_out:0 bytes_received:23
>>>>>>> bytes_acked:2
>>>>>>>       check_result:PASS:listen_sk 0 nsec
>>>>>>>       check_result:PASS:srv_sk 0 nsec
>>>>>>>       check_result:PASS:srv_tp 0 nsec
>>>>>>>
>>>>>>> If bpf program is compiled with llvm16, the test passed
>>>>>>> according
>>>>>>> to
>>>>>>> a CI run.
>>>>>>>
>>>>>>> I don't have s390x environment to debug this. Could you
>>>>>>> help
>>>>>>> debug
>>>>>>> it?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>>       [1]
>>>>>>> https://github.com/kernel-patches/vmtest/actions/runs/4866851496/jobs/8679080985?pr=224#step:6:7645
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> thank for letting me know.
>>>>>> I will look into this.
>>>>>>
>>>>>> Best regards,
>>>>>> Ilya
>>>>>
>>>>> In the meantime the issue was fixed by:
>>>>>
>>>>> commit 141be5c062ecf22bd287afffd310e8ac4711444a
>>>>> Author: Shoaib Meenai <smeenai@fb.com>
>>>>> Date:   Fri May 5 14:18:12 2023 -0700
>>>>>
>>>>>        Revert "Reland [Pipeline] Don't limit ArgumentPromotion
>>>>> to -
>>>>> O3"
>>>>>        
>>>>>        This reverts commit
>>>>> 6f29d1adf29820daae9ea7a01ae2588b67735b9e.
>>>>>        
>>>>>        https://reviews.llvm.org/D149768    is causing size
>>>>> regressions
>>>>> for -
>>>>> Oz
>>>>>        with FullLTO, and I'm reverting that one while
>>>>> investigating.
>>>>> This
>>>>>        commit depends on that one, so it needs to be reverted as
>>>>> well.
>>>>
>>>> The transformtion "Don't limit ArgumentPromotion to -O3" is
>>>> temporarily
>>>> reverted. But it could be reverted again once the issue is
>>>> resolved.
>>>> So it is a good idea to resolve the issue in the kernel.
>>>>
>>>>>
>>>>> But looking at the codegen differences:
>>>>>
>>>>> $ diff -u <(sed -e s/[0-9]*://g pass.s) <(sed -e s/[0-9]*://g
>>>>> fail.s)
>>>>>
>>>>> -pass.o:        file format elf64-bpf
>>>>> +fail.o:        file format elf64-bpf
>>>>>
>>>>> -00000000000002c8 <sk_dst_port__load_half>
>>>>> -       69 11 00 30 00 00 00 00 r1 = *(u16 *)(r1 + 48)
>>>>> +00000000000002c0 <sk_dst_port__load_half>
>>>>> +       54 10 00 00 00 00 ff ff w1 &= 65535
>>>>>            b4 00 00 00 00 00 00 01 w0 = 1
>>>>>            16 10 00 01 00 00 ca fe if w1 == 51966 goto +1
>>>>> <LBB6_2>
>>>>>            b4 00 00 00 00 00 00 00 w0 = 0
>>>>>
>>>>> This is what ArgumentPromotion is supposed to do, so that's
>>>>> okay so
>>>>> far. However, further down below we have:
>>>>>
>>>>>     Disassembly of section cgroup_skb/egress:
>>>>>
>>>>> -       bf 16 00 00 00 00 00 00 r1 = r6
>>>>> +       61 76 00 30 00 00 00 00 r7 = *(u32 *)(r6 + 48)
>>>>> +       bc 17 00 00 00 00 00 00 w1 = w7
>>>>>            85 01 00 00 00 00 00 53 call sk_dst_port__load_word
>>>>>
>>>>> ...
>>>>>
>>>>> -       bf 16 00 00 00 00 00 00 r1 = r6
>>>>> +       74 70 00 00 00 00 00 10 w7 >>= 16
>>>>> +       bc 17 00 00 00 00 00 00 w1 = w7
>>>>>            85 01 00 00 00 00 00 57 call sk_dst_port__load_half
>>>>>
>>>>> so there is no 16-bit load anymore, instead, the result from
>>>>> the
>>>>> earlier 32-bit load is reused. However, on BE these kinds of
>>>>> loads
>>>>> for this particular field are not consistent at the moment -
>>>>> see
>>>>> [1]
>>>>> and the previous discussions.
>>>>>
>>>>> De-facto we have the following results:
>>>>>
>>>>> - int load: 0x0000cafe
>>>>> - short load: 0xcafe
>>>>
>>>> So 'De-facto' means the above is the expected result.
>>>>
>>>>>
>>>>> On a consistent BE we should have rather had:
>>>>>
>>>>> - int load: 0x0000cafe
>>>>> - short load: 0
>>>>
>>>> What does 'consistent BE' mean here? Does it mean the expected
>>>> result from the source code itself?
>>>
>>> I should not have called the de-facto example "BE" at all: it's
>>> rather
>>> "mixed endianness" or "weird endianness" or something along these
>>> lines.
>>>
>>> On "consistent BE" or simply "BE" properties like
>>>
>>> *(uint32_t *)p = (*(uint16_t *)p << 16) | *(uint16_t *)(p + 2);
>>>
>>> hold. This is currently not the case for bpf_sock.dst_port.
>>>
>>> We compile with -mbig-endian, so we promise to the compiler that
>>> the
>>> machine is big-endian, and the compiler expects the above to hold
>>> for
>>> any p. Unfortunately when p points to bpf_sock.dst_port, this is
>>> not
>>> the case.
>>
>> If I understand correctly, *(uint32_t *)p to get the
>> bpf_sock.dst_port
>> is for backward compatibility. But the real u32 read by compiler will
>> do (*(uint16_t *)p << 16) | *(uint16_t *)(p + 2) which is not the
>> same as expected *(uint32_t *)p so we have problem here.
>>
>>>
>>> The property above is important for the correctness of the
>>> load/store
>>> tearing transformations. What we have here is not exactly tearing,
>>> but
>>> is quite close.
>>>
>>>>> Clang, of course, expects a consistent BE and optimizes around
>>>>> that.
>>>>>
>>>>> This was a conscious tradeoff Jakub and I have agreed on in
>>>>> order
>>>>> to
>>>>> keep the quirky behavior from the past. Given what's happening
>>>>> with
>>>>> Clang now, I wonder if it would be worth revisiting it in the
>>>>> name
>>>>> of
>>>>> consistency?
>>>>
>>>> If I understand correctly, I think the issue is
>>>>        r7 = *(u32 *)(r6 + 48)
>>>>        w7 >= 16
>>>>        w1 = w7
>>>>
>>>> after verifier, it is changed to
>>>>       r7 = *(u16 *)(r6 + <kernel offset>)
>>>>       w7 >= 16
>>>>       w1 = w7
>>>>
>>>> and the result after verifier rewrite is completely wrong.
>>>> Is it right?
>>>
>>> No, the verifier rewrite is correct.
>>> The sk_dst_port__load_word() part of the test succeeds.
>>>
>>> All these rewrites already work fine, they are correct and
>>> consistent.
>>> It's really just bpf_sock.dst_port that is special.
>>>
>>>> If this is the case, during verifier rewrite, if it is
>>>> big endian, say the user intends to load 4 bytes (from uapi
>>>> header)
>>>> while the kernel field is 2 bytes, in such cases, kernel
>>>> can still pretend to generate 4-byte load. For example,
>>>> for the above example, the code after verification could be:
>>>>       r7 = *(u16 *)(r6 + <kernel offset>)
>>>>       r7 <= 16
>>>>       w7 >= 16
>>>>       w1 = w7
>>>>
>>>> Hopefully, we won't have many such cases. Does this work?
>>>
>>> This would break the sk_dst_port__load_word() part of the test.
>>
>> This is a hack. This may work for this specific u16 case, but
>> yes, it won't work for u32 load case.
>>
>>>
>>>
>>>
>>> Above I asked whether we can resolve the inconsistency, but I
>>> thought
>>> about it and I don't see a way of doing it without breaking the
>>> ABI,
>>> which is at worst unacceptable, and at best a last resort measure.
>>>
>>> What do you think about marking bpf_sock.dst_port volatile?
>>> volatile
>>> should prevent tearing and similar optimizations, with which we
>>> have a
>>> problem here.
>>>
>>> We could also add a comment warning users not to cast away this
>>> volatile due to the quirk we have. And then we should adjust the
>>> test
>>> (making all casts volatile) to comply with this new warning.
>>
>> I did a little study on this. The main problem here for
>> static __noinline bool sk_dst_port__load_half(struct bpf_sock *sk)
>> {
>>           __u16 *half = (__u16 *)&sk->dst_port;
>>           return half[0] == bpf_htons(0xcafe);
>> }
>>
>> Through some cross-function optimization by ArgumentPromotion
>> optimization, the compiler does:
>>      /* the below shared by sk_dst_port__load_word
>>       * and sk_dst_port__load_half
>>       */
>>      __u32 *word = (__u32 *)&sk->dst_port;
>>      __u32 word_val = word[0];
>>
>>      /* the below is for sk_dst_port__load_half only */
>>      __u16 half_val = word_val >> 16;
>>
>>      ... half_val passed into sk_dst_port__load_half ...
>>      return half_val == bpf_htons(0xcafe);
>>
>> Here, 'word_val = word[0]' is replaced by 2-byte load
>> by the verifier and this caused the trouble for later
>> sk_dst_port__load_half().
>>
>> I don't have a good solution here. The issue is exposed
>> as we have both u16 and u32 load for &sk->dst_port and
>> the compiler did some optimization with this.
>>
>> I would say this is an extreme corner case and we can just
>> fix in the source code like below:
>>
>> diff --git a/tools/testing/selftests/bpf/progs/test_sock_fields.c
>> b/tools/testing/selftests/bpf/progs/test_sock_fields.c
>> index bbad3c2d9aa5..39c975786720 100644
>> --- a/tools/testing/selftests/bpf/progs/test_sock_fields.c
>> +++ b/tools/testing/selftests/bpf/progs/test_sock_fields.c
>> @@ -265,7 +265,10 @@ static __noinline bool
>> sk_dst_port__load_word(struct bpf_sock *sk)
>>
>>    static __noinline bool sk_dst_port__load_half(struct bpf_sock *sk)
>>    {
>> -       __u16 *half = (__u16 *)&sk->dst_port;
>> +       __u16 *half;
>> +
>> +       asm volatile ("");
>> +       half  = (__u16 *)&sk->dst_port;
>>           return half[0] == bpf_htons(0xcafe);
>>    }
>>
>> Could you try whether the above workaround works or not?
>> If we want the code to be future proof for potential
>> cross-func optimization for these noinline functions, we
>> can add similar asm codes to all of
>> bool sk_dst_port__load_{word, half, byte}.
> 
> Hi,
> 
> this makes the issue go away, thanks.
> 
> However, I'm still concerned, because this only inhibits a certain
> optimization and does not address the underlying fundamental problem:
> we promise to clang that the in-kernel implementation of the eBPF
> virtual machine is big-endian, while in reality it's not. As compiler
> optimizations get more aggressive, we will surely see more of this.
> 
> Why not do this instead?
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 1bb11a6ee667..3c9b535532ae 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -6102,7 +6102,7 @@ struct bpf_sock {
>   	__u32 src_ip4;
>   	__u32 src_ip6[4];
>   	__u32 src_port;		/* host byte order */
> -	__be16 dst_port;	/* network byte order */
> +	volatile __be16 dst_port;	/* network byte order */
>   	__u16 :16;		/* zero padding */
>   	__u32 dst_ip4;
>   	__u32 dst_ip6[4];
> diff --git a/tools/include/uapi/linux/bpf.h
> b/tools/include/uapi/linux/bpf.h
> index 1bb11a6ee667..3c9b535532ae 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -6102,7 +6102,7 @@ struct bpf_sock {
>   	__u32 src_ip4;
>   	__u32 src_ip6[4];
>   	__u32 src_port;		/* host byte order */
> -	__be16 dst_port;	/* network byte order */
> +	volatile __be16 dst_port;	/* network byte order */
>   	__u16 :16;		/* zero padding */
>   	__u32 dst_ip4;
>   	__u32 dst_ip6[4];
> diff --git a/tools/testing/selftests/bpf/progs/test_sock_fields.c
> b/tools/testing/selftests/bpf/progs/test_sock_fields.c
> index bbad3c2d9aa5..773ded84ac12 100644
> --- a/tools/testing/selftests/bpf/progs/test_sock_fields.c
> +++ b/tools/testing/selftests/bpf/progs/test_sock_fields.c
> @@ -259,19 +259,19 @@ int ingress_read_sock_fields(struct __sk_buff
> *skb)
>    */
>   static __noinline bool sk_dst_port__load_word(struct bpf_sock *sk)
>   {
> -	__u32 *word = (__u32 *)&sk->dst_port;
> +	volatile __u32 *word = (volatile __u32 *)&sk->dst_port;
>   	return word[0] == bpf_htons(0xcafe);
>   }
>   
>   static __noinline bool sk_dst_port__load_half(struct bpf_sock *sk)
>   {
> -	__u16 *half = (__u16 *)&sk->dst_port;
> +	volatile __u16 *half = (volatile __u16 *)&sk->dst_port;
>   	return half[0] == bpf_htons(0xcafe);
>   }
>   
>   static __noinline bool sk_dst_port__load_byte(struct bpf_sock *sk)
>   {
> -	__u8 *byte = (__u8 *)&sk->dst_port;
> +	volatile __u8 *byte = (volatile __u8 *)&sk->dst_port;
>   	return byte[0] == 0xca && byte[1] == 0xfe;
>   }
>   
> This also works, and as far as I'm concerned, this would be a proper
> fix for the underlying issue: we tell the compiler that it should never
> ever (with any of today's or future optimizations) try to be clever
> when accessing dst_port.

The above test_sock_fields.c change should work too.
I think the uapi change is not necessary. The key word 'volatile'
intends to avoid merging two or more identical loads together.
In this particular case, with only uapi changes, the
harmful transformation can still happen since the sk->dst_port
is indeed loaded only once.

The test_sock_fields.c change itself forces proper load
which is verifier friendly. So I suggest to have
test_sock_fields.c change only. My previously suggested
changes have the same effect, to preserve the verifier friendly
load.

> 
> Best regards,
> Ilya
> 
> 
>>>>> [1]
>>>>> https://lore.kernel.org/bpf/20220317113920.1068535-5-jakub@cloudflare.com
> 

  reply	other threads:[~2023-05-15 15:27 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-03 19:35 selftest sock_fields failed on s390x with latest llvm17 Yonghong Song
2023-05-03 19:46 ` Ilya Leoshkevich
2023-05-12 14:40   ` Ilya Leoshkevich
2023-05-13  4:13     ` Yonghong Song
2023-05-13  8:24       ` Ilya Leoshkevich
2023-05-14 16:58         ` Yonghong Song
2023-05-15  7:55           ` Ilya Leoshkevich
2023-05-15 15:27             ` Yonghong Song [this message]
2023-05-16  9:16               ` Ilya Leoshkevich
2023-05-16 15:20                 ` Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5a57547e-6865-1026-60db-bb6b2ca70e34@meta.com \
    --to=yhs@meta.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=chantr4@gmail.com \
    --cc=iii@linux.ibm.com \
    --cc=jakub@cloudflare.com \
    --cc=kuifeng@meta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox