* [mlmmj] Potential mail loss in postfix?
@ 2010-09-28 23:21 Robin H. Johnson
  2010-11-11  3:58 ` Ben Schmidt
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Robin H. Johnson @ 2010-09-28 23:21 UTC (permalink / raw)
  To: mlmmj

Hi

Noticed something, and unfortunately I don't have a testcase for it yet,
or a suitable setup to re-test on. Instead, here is my analysis of the
problem, which has now occurred twice.

- Using verp and postfix together first of all (string 'postfix' in the
  verp file, '100' in maxverprecips).
- Pick a list with a lot of subscribers.
  - This leads to a case where the mlmmj-send invocation takes several
    minutes to complete for a normal list mail.
- (optional) set postfix to hold incoming mail; you can then release it
  at just the right moment to watch it reach mlmmj-recieve.
- The postfix log will show delivery to mlmmj-recieve.
- mlmmj.operation.log will contain a line from mlmmj-process stating
  that the message was allowed (by your access rules).
- Now, while mlmmj-send is running, you're going to execute a normal
  shutdown of postfix: 'postfix stop' [1]
- The mail will now be lost completely. There is no record of it in the
  archive or in any of the queues :-(.

[1] The description for 'postfix stop': Stop the Postfix mail system in
an orderly fashion. If possible, running processes are allowed to
terminate at their earliest convenience.

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail     : robbat2@gentoo.org
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85



* Re: [mlmmj] Potential mail loss in postfix?
From: Ben Schmidt @ 2010-11-11  3:58 UTC (permalink / raw)
  To: mlmmj

I might have found this bug.

If init_sockfd() fails (e.g. because Postfix has shut down so there is no smtpd 
listening) it calls exit(). Mail would then fail to be archived or requeued. It 
will be in a queue file only until mlmmj-maintd cleans it up (which it will do as 
soon as it finds it, as it won't have accompanying .mailfrom etc. files).
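
To make the failure mode concrete, here is a minimal sketch (not the actual mlmmj source) of a connect helper that reports failure to its caller instead of calling exit(), so the caller gets a chance to leave the mail queued. The names try_connect and send_or_requeue are hypothetical, and the SMTP dialogue itself is omitted.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a connected socket, or -1 with errno set.
 * Unlike an exit()-on-failure helper, it lets the caller decide what to do. */
static int try_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        close(fd);
        errno = EINVAL;
        return -1;
    }

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        int saved = errno;  /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

enum outcome { SENT, REQUEUED };

/* Hypothetical caller: on connect failure the mail stays queued
 * rather than the process exiting and the mail being dropped. */
static enum outcome send_or_requeue(const char *ip, unsigned short port)
{
    int fd = try_connect(ip, port);

    if (fd < 0) {
        fprintf(stderr, "Could not connect to %s: %s; keeping mail queued\n",
                ip, strerror(errno));
        return REQUEUED;
    }
    /* The SMTP dialogue would go here. */
    close(fd);
    return SENT;
}
```

With this shape, a refused connection (for example while Postfix is shut down) becomes a return value that the sending code can act on, rather than an immediate process exit.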

Do you have logs from when this happened? Do you see "Could not get socket" or 
"Could not connect to %s, exiting..." (%s probably is 127.0.0.1) in them?

Ben.






* Re: [mlmmj] Potential mail loss in postfix?
From: Ben Schmidt @ 2010-11-11  4:55 UTC (permalink / raw)
  To: mlmmj

On 11/11/10 2:58 PM, Ben Schmidt wrote:
> I might have found this bug.
>
> If init_sockfd() fails (e.g. because Postfix has shut down so there is no smtpd
> listening) it calls exit(). Mail would then fail to be archived or requeued. It
> will be in a queue file only until mlmmj-maintd cleans it up (which it will do as
> soon as it finds it, as it won't have accompanying .mailfrom etc. files).
>
> Do you have logs from when this happened? Do you see "Could not get socket" or
> "Could not connect to %s, exiting..." (%s probably is 127.0.0.1) in them?

"Could not connect to %s, exiting ..."

Omitted a space before. Correcting myself, just in case you search for
that part of the string and don't find it because of my error. :-)




* Re: [mlmmj] Potential mail loss in postfix?
From: Robin H. Johnson @ 2010-11-11  5:12 UTC (permalink / raw)
  To: mlmmj


(No need to CC me, just send to the list)

On Thu, Nov 11, 2010 at 03:55:06PM +1100, Ben Schmidt wrote:
> On 11/11/10 2:58 PM, Ben Schmidt wrote:
> > I might have found this bug.
> >
> > If init_sockfd() fails (e.g. because Postfix has shut down so there is no smtpd
> > listening) it calls exit(). Mail would then fail to be archived or requeued. It
> > will be in a queue file only until mlmmj-maintd cleans it up (which it will do as
> > soon as it finds it, as it won't have accompanying .mailfrom etc. files).
> >
> > Do you have logs from when this happened? Do you see "Could not get socket" or
> > "Could not connect to %s, exiting..." (%s probably is 127.0.0.1) in them?
> 
> "Could not connect to %s, exiting ..."
> 
> Omitted a space before. Correcting myself, just in case you search for
> that part of the string and don't find it because of my error. :-)

I don't find it in the last month of syslog or mlmmj logfiles
(incidentally, would be really nice to have them go to syslog...).

I've left a much larger trawl of syslog data for that box running, I'll
check for any hits in the morning (~120GiB worth of logs takes a
while...).

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail     : robbat2@gentoo.org
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85



* Re: [mlmmj] Potential mail loss in postfix?
From: Ben Schmidt @ 2010-11-11 12:15 UTC (permalink / raw)
  To: mlmmj

On 11/11/10 4:12 PM, Robin H. Johnson wrote:
> (No need to CC me, just send to the list)

I'll try to remember that.

> I don't find it in the last month of syslog or mlmmj logfiles
> (incidentally, would be really nice to have them go to syslog...).

Yes. That'll be one of my highest priorities after getting 1.2.18 out.

> I've left a much larger trawl of syslog data for that box running, I'll
> check for any hits in the morning (~120GiB worth of logs takes a
> while...).

Ta.

Ben.






* Re: [mlmmj] Potential mail loss in postfix?
From: Robin H. Johnson @ 2010-11-11 21:13 UTC (permalink / raw)
  To: mlmmj


On Thu, Nov 11, 2010 at 11:15:06PM +1100, Ben Schmidt wrote:
> On 11/11/10 4:12 PM, Robin H. Johnson wrote:
> > (No need to CC me, just send to the list)
> 
> I'll try to remember that.
> 
> > I don't find it in the last month of syslog or mlmmj logfiles
> > (incidentally, would be really nice to have them go to syslog...).
> 
> Yes. That'll be one of my highest priorities after getting 1.2.18 out.
> 
> > I've left a much larger trawl of syslog data for that box running, I'll
> > check for any hits in the morning (~120GiB worth of logs takes a
> > while...).

Confirmed. Here's a line from around when I sent the first email in this
thread:

Sep 28 21:35:38 pigeon /usr/bin/mlmmj-send[6681]: init_sockfd.c:55: Could not connect to 127.0.0.1, exiting ... : Connection refused

It happened 1333 times, in the span of 21:35:31 to 21:35:38 (all UTC)
that day.
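
For anyone who wants to run a similar tally on their own logs, here is a minimal sketch of a line counter. The function name count_matches is hypothetical, and it assumes every log line fits in the 4096-byte buffer.

```c
#include <stdio.h>
#include <string.h>

/* Count the lines read from `in` that contain `needle`.
 * Assumption: each line is shorter than the buffer; a longer line is
 * read in pieces, so it could be counted more than once or missed. */
static long count_matches(FILE *in, const char *needle)
{
    char line[4096];
    long hits = 0;

    while (fgets(line, sizeof(line), in) != NULL) {
        if (strstr(line, needle) != NULL)
            hits++;
    }
    return hits;
}
```

Searching for the prefix "Could not connect to" rather than the full message avoids depending on the exact spacing before the "..." in the log string.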

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail     : robbat2@gentoo.org
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85


