* [mlmmj] Re: Mail delivery issues and requeue
2010-11-09 10:17 [mlmmj] Mail delivery issues and requeue Martin Koch Andersen
@ 2010-11-09 10:34 ` Martin Koch Andersen
2010-11-09 10:59 ` Martin Koch Andersen
` (8 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Martin Koch Andersen @ 2010-11-09 10:34 UTC (permalink / raw)
To: mlmmj
On 09/11/2010 11.17, Martin Koch Andersen wrote:
> Hi,
>
> Today I had a user contacting me, that is a normal subscriber to one of our mlmmj mailinglists.
> He was not getting any mails he said.
>
> I confirmed that he was indeed subscribed to the list, and other users are getting their mails.
> There is not a trace of his email addresses in the postfix maillog.
>
> But I found his address in requeue/x/subscribers.
>
> Can anyone explain to me what exactly requeue is?
> I don't understand why I can't find a delivery attempt in the maillog, but there is an entry in requeue?
>
> Is this normal? And why is the user not getting any mail (ending up in requeue)?
>
> I grep'ed on other addresses found in requeue/x/subscribers files, some of them were in maillog, but others, as the case mentioned above, were not.
>
> Maintenance job is of course running.
One more piece of information. I have "noarchive" enabled for all lists. So what happens to the subscribers ending up in requeue? How can mails be re-sent when the original mail is not stored anywhere? In other words, how can requeue be processed when noarchive is enabled?
And still, I don't get why subscribers end up in requeue in the first place - without a delivery attempt.
Thanks again.
--
Martin Koch Andersen
http://925.dk
^ permalink raw reply [flat|nested] 11+ messages in thread
* [mlmmj] Re: Mail delivery issues and requeue
2010-11-09 10:17 [mlmmj] Mail delivery issues and requeue Martin Koch Andersen
2010-11-09 10:34 ` [mlmmj] " Martin Koch Andersen
@ 2010-11-09 10:59 ` Martin Koch Andersen
2010-11-09 13:01 ` Ben Schmidt
` (7 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Martin Koch Andersen @ 2010-11-09 10:59 UTC (permalink / raw)
To: mlmmj
On 09/11/2010 11.34, Martin Koch Andersen wrote:
>
> One more piece of information. I have "noarchive" enabled for all lists. So what happens to the subscribers ending up in requeue? How can mails be re-sent when the original mail is not stored anywhere? In other words, how can requeue be processed when noarchive is enabled?
Sorry about the spamming. But I just looked through the mlmmj src/, and apparently with "noarchive", there is supposed to be a mailfile in "%s/requeue/%d/mailfile".
For this list, there is no such mailfile (for any index)! For the other lists we have, the mailfile is there. All the lists are configured exactly the same way, and permissions are the same etc. So it's a bit weird that the "%s/requeue/%d/mailfile" files are missing for the list in question.
--
Martin Koch Andersen
http://925.dk
* Re: [mlmmj] Re: Mail delivery issues and requeue
2010-11-09 10:17 [mlmmj] Mail delivery issues and requeue Martin Koch Andersen
2010-11-09 10:34 ` [mlmmj] " Martin Koch Andersen
2010-11-09 10:59 ` Martin Koch Andersen
@ 2010-11-09 13:01 ` Ben Schmidt
2010-11-10 9:46 ` Martin Koch Andersen
` (6 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Ben Schmidt @ 2010-11-09 13:01 UTC (permalink / raw)
To: mlmmj
Hmmm.
/requeue/ will be used whenever delivery fails. I think what will happen
is that if delivery to any recipient fails, it and all remaining
recipients will be requeued. So if you are getting a lot of requeues, it
may well be that there is a bad address on the list that is failing
every time and causing all later recipients to be requeued. Maybe check
your mail logs and see if there is any error, or at least have a look at
what the last address delivered in the first 'batch' for a post is, then
see in the mlmmj subscribers file what address comes next. Or perhaps
the first address in the /requeue/x/subscribers file is the one causing
the problem. Is there anything suspicious about it?
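As a rough illustration of the behaviour described above, here is a minimal sketch in C. It is not taken from mlmmj: `first_requeued` and `toy_deliver` are hypothetical names, and the rule that an address containing '?' is rejected is only a stand-in for a real SMTP rejection.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical delivery callback: returns 0 when the recipient is
 * accepted, non-zero when delivery to it fails. */
typedef int (*deliver_fn)(const char *addr);

/* Toy stand-in for an SMTP RCPT TO attempt (an assumption for this
 * sketch): any address containing '?' is treated as rejected. */
int toy_deliver(const char *addr)
{
    return strchr(addr, '?') != NULL;
}

/* Walk the recipients in order and stop at the first failure.
 * Returns the index of the first recipient that would be requeued;
 * that recipient and every later one end up in requeue. A return
 * value equal to count means nothing was requeued. */
size_t first_requeued(const char *const *addrs, size_t count,
                      deliver_fn deliver)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (deliver(addrs[i]) != 0)
            break;
    }
    return i;
}
```

With the recipients "a@example.org", "bad?@example.org", "c@example.org" this returns 1, so the bad address and the perfectly valid one after it would both be requeued, and the valid one would never show up in the mail log.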
There should always be a /requeue/x/mailfile, though. If that doesn't
exist, perhaps mlmmj is even crashing.
What OS are you using, and what filesystem is your mlmmj listdir on?
Maybe something system-dependent is coming up if rename() is failing.
Cheers,
Ben.
On 9/11/10 9:59 PM, Martin Koch Andersen wrote:
>
> On 09/11/2010 11.34, Martin Koch Andersen wrote:
>>
>> One more piece of information. I have "noarchive" enabled for all lists. So what happens to the subscribers ending up in requeue? How can mails be re-sent when the original mail is not stored anywhere? In other words, how can requeue be processed when noarchive is enabled?
>
> Sorry about the spamming. But I just looked through the mlmmj src/, and apparently with "noarchive", there is supposed to be a mailfile in "%s/requeue/%d/mailfile".
>
> For this list, there is no such mailfile (for any index)! For the other lists we have, the mailfile is there. All the lists are configured exactly the same way, and permissions are the same etc. So it's a bit weird that the "%s/requeue/%d/mailfile" files are missing for the list in question.
>
* Re: [mlmmj] Re: Mail delivery issues and requeue
2010-11-09 10:17 [mlmmj] Mail delivery issues and requeue Martin Koch Andersen
` (2 preceding siblings ...)
2010-11-09 13:01 ` Ben Schmidt
@ 2010-11-10 9:46 ` Martin Koch Andersen
2010-11-11 0:36 ` Ben Schmidt
` (5 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Martin Koch Andersen @ 2010-11-10 9:46 UTC (permalink / raw)
To: mlmmj
On 09/11/2010 14.01, Ben Schmidt wrote:
> Hmmm.
>
> /requeue/ will be used whenever delivery fails. I think what will happen
> is that if delivery to any recipient fails, it and all remaining
> recipients will be requeued. So if you are getting a lot of requeues, it
> may well be that there is a bad address on the list that is failing
> every time and causing all later recipients to be requeued. Maybe check
> your mail logs and see if there is any error, or at least have a look at
> what the last address delivered in the first 'batch' for a post is, then
> see in the mlmmj subscribers file what address comes next. Or perhaps
> the first address in the /requeue/x/subscribers file is the one causing
> the problem. Is there anything suspicious about it?
The last address in the requeue/subscribers file did look a bit weird.
And maillog had this:
Nov 10 09:34:30 goulding postfix/smtpd[57926]: warning: Illegal address syntax from localhost[127.0.0.1] in RCPT command: <xxx@hotmail.cp?>
Nov 10 09:34:30 goulding /usr/local/bin/mlmmj-send[58016]: mlmmj-send.c:289: Error in RCPT TO. Reply = [501 5.1.3 Bad recipient address syntax^M ]: No such file or directory
I've unsubscribed this address, and other syntax error ones, from the list now.
> There should always be a /requeue/x/mailfile, though. If that doesn't
> exist, perhaps mlmmj is even crashing.
Could this be a bug in mlmmj? The mailfile was not created today either for the list in question.
I'll perhaps see tomorrow, with the invalid addresses removed, if it is.
> What OS are you using, and what filesystem is your mlmmj listdir on?
> Maybe something system-dependent is coming up if rename() is failing.
FreeBSD and UFS.
--
Martin Koch Andersen
http://925.dk
* Re: [mlmmj] Re: Mail delivery issues and requeue
2010-11-09 10:17 [mlmmj] Mail delivery issues and requeue Martin Koch Andersen
` (3 preceding siblings ...)
2010-11-10 9:46 ` Martin Koch Andersen
@ 2010-11-11 0:36 ` Ben Schmidt
2010-11-11 0:41 ` Ben Schmidt
` (4 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Ben Schmidt @ 2010-11-11 0:36 UTC (permalink / raw)
To: mlmmj
> The last address in the requeue/subscribers file did look a bit weird.
> And maillog had this:
>
> Nov 10 09:34:30 goulding postfix/smtpd[57926]: warning: Illegal address syntax from localhost[127.0.0.1] in RCPT command: <xxx@hotmail.cp?>
> Nov 10 09:34:30 goulding /usr/local/bin/mlmmj-send[58016]: mlmmj-send.c:289: Error in RCPT TO. Reply = [501 5.1.3 Bad recipient address syntax^M ]: No such file or directory
Thank you. The "No such file or directory" message is bogus; fixing this
is on the to do list; the rest is relevant.
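One plausible explanation for the bogus suffix, which is an assumption and not confirmed in this thread, is that the logging helper appends strerror(errno) to every message, even when the failure is an SMTP reply and no system call has failed. Below is a minimal sketch of a formatter that appends the errno text only when the caller says it is relevant. `format_log_line` is a hypothetical name, not an mlmmj function.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Format a log line into buf (at most len bytes, always terminated
 * when len > 0). The errno description is appended only when err is
 * non-zero: callers reporting a failed system call would pass errno,
 * callers reporting a protocol-level failure such as a 5xx SMTP reply
 * would pass 0. Returns what snprintf() returns. */
int format_log_line(char *buf, size_t len, const char *msg, int err)
{
    if (err != 0)
        return snprintf(buf, len, "%s: %s", msg, strerror(err));
    return snprintf(buf, len, "%s", msg);
}
```

A rejected RCPT TO would then be logged as plain "Error in RCPT TO. Reply = [...]", while a failed rename() could still carry the system error text.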
Any idea how the address got there? I suspect it's come through
mlmmj-sub via a web interface or something? I know mlmmj-sub doesn't
have particularly adequate validation at present.
> I've unsubscribed this address, and other syntax error ones, from the list now.
>
>> There should always be a /requeue/x/mailfile, though. If that doesn't
>> exist, perhaps mlmmj is even crashing.
>
> Could this be a bug in mlmmj? The mailfile was not created today either for the list in question.
> I'll perhaps see tomorrow, with the invalid addresses removed, if it is.
It most certainly could be. I've had a look at the relevant code and
can't quickly spot a bug, though.
There isn't a followup error message after the Error in RCPT TO, is there? I don't
expect one, but it's worth checking!
The person is a regular subscriber, not digest, right?
>> What OS are you using, and what filesystem is your mlmmj listdir on?
>> Maybe something system-dependent is coming up if rename() is failing.
>
> FreeBSD and UFS.
OK. Good to know. Once we're closer to finding what's going on, this
might make a difference.
Ben.
* Re: [mlmmj] Re: Mail delivery issues and requeue
2010-11-09 10:17 [mlmmj] Mail delivery issues and requeue Martin Koch Andersen
` (4 preceding siblings ...)
2010-11-11 0:36 ` Ben Schmidt
@ 2010-11-11 0:41 ` Ben Schmidt
2010-11-11 10:06 ` Martin Koch Andersen
` (3 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Ben Schmidt @ 2010-11-11 0:41 UTC (permalink / raw)
To: mlmmj
Also, can you confirm which mlmmj version you are using? Thanks.
Ben.
On 11/11/10 11:36 AM, Ben Schmidt wrote:
>> The last address in the requeue/subscribers file did look a bit weird.
>> And maillog had this:
>>
>> Nov 10 09:34:30 goulding postfix/smtpd[57926]: warning: Illegal address syntax
>> from localhost[127.0.0.1] in RCPT command:<xxx@hotmail.cp?>
>> Nov 10 09:34:30 goulding /usr/local/bin/mlmmj-send[58016]: mlmmj-send.c:289:
>> Error in RCPT TO. Reply = [501 5.1.3 Bad recipient address syntax^M ]: No such
>> file or directory
>
> Thank you. The "No such file or directory" message is bogus; fixing this
> is on the to do list; the rest is relevant.
>
> Any idea how the address got there? I suspect it's come through
> mlmmj-sub via a web interface or something? I know mlmmj-sub doesn't
> have particularly adequate validation at present.
>
>> I've unsubscribed this address, and other syntax error ones, from the list now.
>>
>>> There should always be a /requeue/x/mailfile, though. If that doesn't
>>> exist, perhaps mlmmj is even crashing.
>>
>> Could this be a bug in mlmmj? The mailfile was not created today either for the
>> list in question.
>> I'll perhaps see tomorrow, with the invalid addresses removed, if it is.
>
> It most certainly could be. I've had a look at the relevant code and
> can't quickly spot a bug, though.
>
> There isn't a followup error message after the Error in RCPT TO, is there? I don't
> expect one, but it's worth checking!
>
> The person is a regular subscriber, not digest, right?
>
>>> What OS are you using, and what filesystem is your mlmmj listdir on?
>>> Maybe something system-dependent is coming up if rename() is failing.
>>
>> FreeBSD and UFS.
>
> OK. Good to know. Once we're closer to finding what's going on, this
> might make a difference.
>
> Ben.
* Re: [mlmmj] Re: Mail delivery issues and requeue
2010-11-09 10:17 [mlmmj] Mail delivery issues and requeue Martin Koch Andersen
` (5 preceding siblings ...)
2010-11-11 0:41 ` Ben Schmidt
@ 2010-11-11 10:06 ` Martin Koch Andersen
2010-11-11 12:33 ` Ben Schmidt
` (2 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Martin Koch Andersen @ 2010-11-11 10:06 UTC (permalink / raw)
To: mlmmj
Hi Ben & others,
On 11/11/2010 01.36, Ben Schmidt wrote:
> Any idea how the address got there? I suspect it's come through
> mlmmj-sub via a web interface or something? I know mlmmj-sub doesn't
> have particularly adequate validation at present.
Yah, addresses come from a variety of places. These lists have tens of thousands of subscribers. So a bad address will eventually sneak in once in a while.
>> I've unsubscribed this address, and other syntax error ones, from the list now.
>>
>>> There should always be a /requeue/x/mailfile, though. If that doesn't
>>> exist, perhaps mlmmj is even crashing.
>>
>> Could this be a bug in mlmmj? The mailfile was not created today either for the list in question.
>> I'll perhaps see tomorrow, with the invalid addresses removed, if it is.
>
> It most certainly could be. I've had a look at the relevant code and
> can't quickly spot a bug, though.
>
> There isn't a followup error message after the Error in RCPT TO, is there? I don't expect one, but it's worth checking!
Nope there is not. Today the same thing happened. No mailfile for that list in requeue/x/. Only thing in error log was:
Nov 11 09:52:22 goulding /usr/local/bin/mlmmj-send[89349]: mlmmj-send.c:267: Error in MAIL FROM. Reply = [501 5.1.7 Bad sender address syntax^M ]: No such file or directory
Again because of a bad address. But I don't think the bad address itself is the problem? I mean, that's what requeue is there for in the first place. There must be some other reason why the mailfile is never being created for this list on requeue. As I've mentioned, the other lists on the same server get their mailfiles created just fine. And all lists have the same setup/config (I triple checked that!).
I'm using mlmmj-1.2.17.1 from the FreeBSD ports.
I don't know what is going on, but it is a problem: without the mailfile, the subscribers in requeue will never get their mails.
Would it be possible to at least add a log_error() in the relevant place where the mailfile is supposed to be created? I think it's in mlmmj-send.c at the bottom of main():
...
    if (rename(mailfilename, requeuefilename) < 0)
        unlink(mailfilename);
- in the "noarchive" branch. The list is "noarchive".
I also wonder why, when rename() fails, the if(!ctrlarchive) { branch does not unlink the mailfilename, whereas with noarchive it is unlinked.
Kind regards,
--
Martin Koch Andersen
http://925.dk
* Re: [mlmmj] Re: Mail delivery issues and requeue
2010-11-09 10:17 [mlmmj] Mail delivery issues and requeue Martin Koch Andersen
` (6 preceding siblings ...)
2010-11-11 10:06 ` Martin Koch Andersen
@ 2010-11-11 12:33 ` Ben Schmidt
2010-11-11 13:05 ` Martin Koch Andersen
2010-11-11 16:07 ` Ben Schmidt
9 siblings, 0 replies; 11+ messages in thread
From: Ben Schmidt @ 2010-11-11 12:33 UTC (permalink / raw)
To: mlmmj
>> There isn't a followup error message after the Error in RCPT TO, is
>> there? I don't expect one, but it's worth checking!
>
> Nope there is not. Today the same thing happened. No mailfile for that
> list in requeue/x/. Only thing in error log was:
>
> Nov 11 09:52:22 goulding /usr/local/bin/mlmmj-send[89349]: mlmmj-send.c:267: Error in MAIL FROM. Reply = [501 5.1.7 Bad sender address syntax^M ]: No such file or directory
>
> Again because of a bad address. But I don't think the bad address
> itself is the problem? I mean, that's what requeue is there for in the
> first place. There must be some other reason why the mailfile is
> never being created for this list on requeue. As I've mentioned, the
> other lists on the same server get their mailfiles created just
> fine. And all lists have the same setup/config (I triple checked that!).
Yeah, exactly. Something very strange is going on. I doubt it's getting
to the relevant point in mlmmj-send where the mail is moved to mailfile.
It might be exit()ing for some undesired reason prior to that point
(like the potential init_sockfd() problem I just wrote about and found
while investigating this issue), or it might be segfaulting or
something.
Do you have a verp control file at all? What's in it if so?
Also, these are normal posts, not digests, aren't they?
> I'm using mlmmj-1.2.17.1 from the FreeBSD ports.
>
> I don't know what is going on, but it is a problem: without the
> mailfile, the subscribers in requeue will never get their mails.
>
> Would it be possible to at least add a log_error() in the relevant
> place where the mailfile is supposed to be created? I think it's in
> mlmmj-send.c at the bottom of main():
>
> ...
>     if (rename(mailfilename, requeuefilename) < 0)
>         unlink(mailfilename);
>
> - in the "noarchive" branch. The list is "noarchive".
>
> I also wonder why, when rename() fails, the if(!ctrlarchive) { branch
> does not unlink the mailfilename, whereas with noarchive it is
> unlinked.
Because rename() isn't just being used to rename. It's also being used
to check for the existence of the directory. If it fails, the idea is
that nothing was requeued, and since we're not archiving, we don't need
the file. It's not an error but a "we don't need this" condition. It
should have something more like this:
    len = strlen(listdir) + 9 + 20 + 9;
    requeuefilename = mymalloc(len);
    snprintf(requeuefilename, len, "%s/requeue/%d",
            listdir, mindex);
    if(stat(requeuefilename, &st) < 0) {
        /* Nothing was requeued and we don't keep
         * mail for a noarchive list. */
        unlink(mailfilename);
    } else {
        snprintf(requeuefilename, len,
                "%s/requeue/%d/mailfile",
                listdir, mindex);
        if (rename(mailfilename, requeuefilename) < 0) {
            log_error(LOG_ARGS,
                    "Could not rename(%s,%s);",
                    mailfilename,
                    requeuefilename);
        }
    }
    myfree(requeuefilename);
I've given it only a quick and superficial test, but it seems to work,
and it can't fail much worse for you than it already is anyway, so if
you're in a position to compile a patched mlmmj yourself, please do give
it a shot!
Ben.
* Re: [mlmmj] Re: Mail delivery issues and requeue
2010-11-09 10:17 [mlmmj] Mail delivery issues and requeue Martin Koch Andersen
` (7 preceding siblings ...)
2010-11-11 12:33 ` Ben Schmidt
@ 2010-11-11 13:05 ` Martin Koch Andersen
2010-11-11 16:07 ` Ben Schmidt
9 siblings, 0 replies; 11+ messages in thread
From: Martin Koch Andersen @ 2010-11-11 13:05 UTC (permalink / raw)
To: mlmmj
Hi,
On 11/11/2010 13.33, Ben Schmidt wrote:
> Yeah, exactly. Something very strange is going on. I doubt it's getting
> to the relevant point in mlmmj-send where the mail is moved to mailfile.
> It might be exit()ing for some undesired reason prior to that point
> (like the potential init_sockfd() problem I just wrote about and found
> while investigating this issue), or it might be segfaulting or
> something.
>
> Do you have a verp control file at all? What's in it if so?
Nope I don't.
> Also, these are normal posts, not digests, aren't they?
Normal yes.
> Because rename() isn't just being used to rename. It's also being used
> to check for the existence of the directory. If it fails, the idea is
> that nothing was requeued, and since we're not archiving, we don't need
> the file. It's not an error but a "we don't need this" condition. It
> should have something more like this:
I see.
>     len = strlen(listdir) + 9 + 20 + 9;
>     requeuefilename = mymalloc(len);
>     snprintf(requeuefilename, len, "%s/requeue/%d",
>             listdir, mindex);
>     if(stat(requeuefilename, &st) < 0) {
>         /* Nothing was requeued and we don't keep
>          * mail for a noarchive list. */
>         unlink(mailfilename);
>     } else {
>         snprintf(requeuefilename, len,
>                 "%s/requeue/%d/mailfile",
>                 listdir, mindex);
>         if (rename(mailfilename, requeuefilename) < 0) {
>             log_error(LOG_ARGS,
>                     "Could not rename(%s,%s);",
>                     mailfilename,
>                     requeuefilename);
>         }
>     }
>     myfree(requeuefilename);
>
> I've given it only a quick and superficial test, but it seems to work,
> and it can't fail much worse for you than it already is anyway, so if
> you're in a position to compile a patched mlmmj yourself, please do give
> it a shot!
The patch makes sense. I don't have a chance to test it right now, but it looks fine and will help us rule out rename() failing. I will try it when a new release is out. Until then I just have to keep an eye on the requeue I guess :(
Would it perhaps make sense, and not be too difficult to write the mailfile before sending out the mails - rather than after all is done?
E.g. before anything takes place (and then delete it, when all is done, if not needed) or on first requeue. That way, if things go bad (for reasons still unknown...), at least there is a mailfile for the subscribers that have already been requeued?
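The ordering suggested above could be sketched roughly as follows. This is only an illustration of the idea, not how mlmmj-send works: `stage_mailfile` and `finish_staging` are hypothetical names, the sketch assumes the requeue directory already exists before sending starts, and it uses a hard link, so both paths must be on the same filesystem.

```c
#include <unistd.h>

/* Step 1, before any delivery attempt: make the mail reachable under
 * the requeue path by hard-linking it. If the sender dies later on,
 * the requeued subscribers still have a mailfile.
 * Returns 0 on success, -1 on failure (errno set by link()). */
int stage_mailfile(const char *mailfilename, const char *requeuefilename)
{
    return link(mailfilename, requeuefilename);
}

/* Step 2, after all delivery attempts: keep the staged copy only if
 * at least one subscriber was requeued; otherwise remove it again.
 * Returns 0 on success, -1 if the unlink() failed. */
int finish_staging(const char *requeuefilename, int requeued_count)
{
    if (requeued_count > 0)
        return 0;
    return unlink(requeuefilename);
}
```

The trade-off is one extra link() and possibly one extra unlink() per post in exchange for never having requeued subscribers without a mailfile.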
--
Martin Koch Andersen
http://925.dk
* Re: [mlmmj] Re: Mail delivery issues and requeue
2010-11-09 10:17 [mlmmj] Mail delivery issues and requeue Martin Koch Andersen
` (8 preceding siblings ...)
2010-11-11 13:05 ` Martin Koch Andersen
@ 2010-11-11 16:07 ` Ben Schmidt
9 siblings, 0 replies; 11+ messages in thread
From: Ben Schmidt @ 2010-11-11 16:07 UTC (permalink / raw)
To: mlmmj
>> Yeah, exactly. Something very strange is going on. I doubt it's getting
>> to the relevant point in mlmmj-send where the mail is moved to mailfile.
>> It might be exit()ing for some undesired reason prior to that point
>> (like the potential init_sockfd() problem I just wrote about and found
>> while investigating this issue), or it might be segfaulting or
>> something.
I think I found a code path that could cause a segfault if the MTA
disconnects the client, so that could be the answer. It should probably
be logging another error message, though, which it isn't, so perhaps
that's not the problem.
>> Do you have a verp control file at all? What's in it if so?
>
> Nope I don't.
No maxverprecips either? That means the default 100 will be used. Even
if not using verp, this is how many mails will be sent in a single SMTP
transaction.
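To make the batching concrete, here is a small sketch of the arithmetic, assuming the limit is a cap on recipients per SMTP transaction. `smtp_transactions` is a hypothetical helper for illustration only, not part of mlmmj.

```c
#include <stddef.h>

/* Number of SMTP transactions needed to cover all recipients when at
 * most max_per_transaction recipients go into each one (ceiling
 * division). A limit of 0 is treated as invalid and yields 0. */
size_t smtp_transactions(size_t recipients, size_t max_per_transaction)
{
    if (max_per_transaction == 0)
        return 0;
    return recipients / max_per_transaction
         + (recipients % max_per_transaction != 0);
}
```

Under that assumption, a list of 250 subscribers with the default of 100 would be sent in 3 transactions.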
> Would it perhaps make sense, and not be too difficult to write the
> mailfile before sending out the mails - rather than after all is done?
> E.g. before anything takes place (and then delete it, when all is
> done, if not needed) or on first requeue. That way, if things go bad
> (for reasons still unknown...), at least there is a mailfile for the
> subscribers that have already been requeued?
Maybe. It could essentially be reorganised so that requeuing is the
norm. Everything in place as if requeuing, but then remove the requeue
file if there are no requeue subscribers. I'll probably leave it until
after 1.2.18, though, as it would be a biggish change and would need to
be carefully thought through.
And I'd prefer to find whatever this other bug/problem is, anyway. I
don't like it.
Ben.