Mail failure due to Cloudflare DNS
On the morning of Friday 20th November there were no emails from my FreeBSD server, opal. These are generated daily by the periodic system under FreeBSD: one email containing a system overview and one a security report for the previous 24 hours.
opal was up, so what was the cause of the non-appearance of the emails?
The clue was in /var/log/maillog:
Nov 20 03:17:42 opal sm-msp-queue[83606]: 0AK34MKC083509: to=root, ctladdr=root (0/0), delay=00:13:20, xdelay=00:00:10, mailer=relay, pri=120837, relay=[127.0.0.1] [127.0.0.1], dsn=4.0.0, reply=451 4.4.3 Temporary lookup failure of 127.0.0.1 at bl.spamcop.net, stat=Deferred: 451 4.4.3 Temporary lookup failure of 127.0.0.1 at bl.spamcop.net
Nov 20 03:17:42 opal sm-msp-queue[83606]: 0AK34dM3083585: to=root, ctladdr=root (0/0), delay=00:13:03, xdelay=00:00:00, mailer=relay, pri=123842, relay=[127.0.0.1], dsn=4.0.0, reply=451 4.4.3 Temporary lookup failure of 127.0.0.1 at bl.spamcop.net, stat=Deferred: 451 4.4.3 Temporary lookup failure of 127.0.0.1 at bl.spamcop.net
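The 451 reply is the result of a DNS blocklist check: the connecting address (here 127.0.0.1) is looked up under the blocklist zone, conventionally with its octets reversed. A minimal Python sketch of how such a query name is formed is below. The function name `dnsbl_query_name` is illustrative, and this shows the general DNSBL convention rather than sendmail's own code.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Return the DNS name a DNSBL client would look up for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


print(dnsbl_query_name("127.0.0.1", "bl.spamcop.net"))
# prints 1.0.0.127.bl.spamcop.net
```

If the resolver cannot answer that lookup at all, the mail server has no verdict to act on and defers the message with a temporary failure.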
I run my own DNS, forwarding queries to the Cloudflare DNS servers (1.1.1.1 and 1.0.0.1). Hmm, what did nslookup tell me?
> bl.spamcop.net
Server:         127.0.0.1
Address:        127.0.0.1#53
Non-authoritative answer:
Name:   bl.spamcop.net
Address: 184.94.240.110
** server can't find bl.spamcop.net: SERVFAIL
Bizarre that it returns the address, and then returns a SERVFAIL. I tried with Google DNS:
> server 8.8.8.8
Default server: 8.8.8.8
Address: 8.8.8.8#53
> bl.spamcop.net
Server:         8.8.8.8
Address:        8.8.8.8#53
Non-authoritative answer:
Name:   bl.spamcop.net
Address: 184.94.240.110
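To compare resolvers without nslookup's combined output, each one can be sent a single query and its response code read directly. The following is a minimal Python sketch using only the standard library. The function names (`build_query`, `rcode_of`, `ask`) are illustrative, and the idea that the SERVFAIL might belong to a second query (such as AAAA) issued after the A query is an assumption, not something the output above confirms.

```python
import socket
import struct

RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (class IN, recursion desired)."""
    # Header: id, flags (RD bit set), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def rcode_of(response: bytes) -> str:
    """Extract the response code from the low 4 bits of the flags field."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def ask(server: str, name: str, qtype: int = 1, timeout: float = 3.0) -> str:
    """Send one UDP query to `server` and return the response code name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, 53))
        response, _ = sock.recvfrom(4096)
    return rcode_of(response)
```

For example, `ask("1.1.1.1", "bl.spamcop.net", qtype=28)` would show how Cloudflare answers an AAAA query on its own, while `ask("8.8.8.8", "bl.spamcop.net")` repeats the A query against Google.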
This failure was affecting both incoming and outgoing mail:
# mailq -Ac
/var/spool/clientmqueue (4 requests)
-----Q-ID----- --Size-- -----Q-Time----- ------------Sender/Recipient-----------
0AK7HWmR084402     2501 Fri Nov 20 07:17 MAILER-DAEMON
                 (Deferred: 451 4.4.3 Temporary lookup failure of 127.0.0.1 at)
                                         root
0AK7HWmS084402     5506 Fri Nov 20 07:17 MAILER-DAEMON
                 (Deferred: 451 4.4.3 Temporary lookup failure of 127.0.0.1 at)
                                         root
0AK34MKC083509      787 Fri Nov 20 03:04 root
                 (Deferred: 451 4.4.3 Temporary lookup failure of 127.0.0.1 at)
                                         root
0AK34dM3083585     3801 Fri Nov 20 03:04 root
                 (Deferred: 451 4.4.3 Temporary lookup failure of 127.0.0.1 at)
                                         root
                Total requests: 4
I switched to OpenDNS servers as the forward target in the local DNS and mails started flowing again.
Stumbling over another mail error
While reading the sendmail log, I found an unrelated issue with dovecot:
Nov 20 10:14:37 opal dovecot[32592]: imap-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=192.168.0.253, lip=192.168.0.4, TLS handshaking: SSL_accept() failed: error:14094416:SSL routines:ssl3_read_bytes:sslv3 alert certificate unknown: SSL alert number 46, session=<+iujHIe0YJzAqAD9>
Nov 20 10:14:37 opal dovecot[32592]: imap-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=192.168.0.253, lip=192.168.0.4, TLS handshaking: SSL_accept() failed: error:14094416:SSL routines:ssl3_read_bytes:sslv3 alert certificate unknown: SSL alert number 46, session=<TASlHIe0YpzAqAD9>
These errors are caused by Gmail access from an Android phone. I remember a similar issue from a few months ago, which was cured by deleting the email account and re-creating it. So that's what I did this time, except that the setup process told me the outgoing mail server did not offer STARTTLS. That's not right.
However, Gmail was right...
[mark@opal:~]$ telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 opal.hydrus.org.uk ESMTP Sendmail 8.16.1/8.16.1; Fri, 20 Nov 2020 10:41:41 GMT
EHLO localhost
250-opal.hydrus.org.uk Hello localhost [127.0.0.1], pleased to meet you
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-8BITMIME
250-SIZE
250-DSN
250-ETRN
250-AUTH DIGEST-MD5 CRAM-MD5
250-DELIVERBY
250 HELP
QUIT
221 2.0.0 opal.hydrus.org.uk closing connection
Connection closed by foreign host.
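The same check can be scripted rather than typed into telnet. This is a rough Python sketch: `ehlo_extensions` (an illustrative name) pulls the extension keywords out of a captured EHLO reply, and `offers_starttls` (also illustrative) asks a live server through the standard `smtplib` module. The host, port and timeout values are placeholders.

```python
import smtplib


def ehlo_extensions(reply: str) -> set[str]:
    """Return the upper-cased extension keywords from a multi-line EHLO reply."""
    keywords = set()
    lines = [line for line in reply.splitlines() if line.startswith("250")]
    for line in lines[1:]:       # the first 250 line is only the greeting
        rest = line[4:].strip()  # drop the "250-" or "250 " prefix
        if rest:
            keywords.add(rest.split()[0].upper())
    return keywords


def offers_starttls(host: str, port: int = 25, timeout: float = 10.0) -> bool:
    """Connect, send EHLO, and report whether STARTTLS is advertised."""
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.ehlo()
        return smtp.has_extn("starttls")
```

Run against the session above, `ehlo_extensions` would return a set containing AUTH, PIPELINING and the rest, but not STARTTLS.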
No STARTTLS capability was being shown. By now I had restarted the sendmail service many times, so I finally managed to spot the error:
Nov 20 12:04:20 opal sm-mta[88853]: STARTTLS=server, error: SSL_CTX_check_private_key failed (PATHNAME ELIDED): 0
Sendmail was showing STARTTLS on crimson, the backup server, so I compared configurations. I found a key difference in the sendmail configuration file (<machine_name>.mc):
crimson
define(`confSERVER_CERT', `CERT_DIR/cert.pem')dnl
opal
define(`confSERVER_CERT', `CERT_DIR/chain.pem')dnl
Yet another foot-shooting incident. I must have changed this sometime in the past for reasons I can no longer remember and hadn't noticed the effect. Shows how often I send mail out from a remote client using hydrus.org.uk.
Now, the fullchain.pem file is used to supply the SERVER_CERT and all is good.
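The `SSL_CTX_check_private_key` failure means the first certificate in the configured file did not match the private key, presumably because chain.pem holds only the issuer's certificates and not the server's own. A small Python sketch that repeats the check before restarting sendmail is below. The function name `cert_matches_key` is illustrative, and the file paths are whatever the .mc file points at (not shown here).

```python
import ssl


def cert_matches_key(certfile: str, keyfile: str) -> bool:
    """Return True if the first certificate in certfile matches keyfile.

    load_cert_chain() raises ssl.SSLError when the certificate and private
    key do not belong together; a missing file raises a plain OSError,
    which is deliberately left to propagate.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    except ssl.SSLError:
        return False
    return True
```

Called once with the old certificate file and once with the new one, this should make a mismatch of the kind logged above visible without going through the mail server.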
At present, Cloudflare is resolving bl.spamcop.net without error, but there does seem to be a small delay after the address is returned in nslookup.