
We run Dovecot 2.3.16 with the Sieve duplicate extension enabled to detect and discard duplicates. Our Postfix virtual_aliases file contains overlapping routes, so some address lists expand to several copies for the same mailbox; Postfix handles this correctly.

However, Dovecot Sieve lets some duplicates through at random. Different recipients receive duplicates on each run of the same test. Detailed logging suggests that duplicates get through when the same Message-ID for the same mailbox is picked up by two different Dovecot processes. If both copies are handled by the same process, the duplicate is discarded correctly. Accordingly, if I reduce the lmtp process_limit to 1, we get zero duplicates in all cases.

Why is there a synchronization issue with the "duplicate mark"? Is there a synchronization setting I am missing? I would prefer NOT to set the lmtp process limit to 1.

Dovecot configuration:

service lmtp {
   process_limit = 1
}
shutdown_clients = no

Following the directions in the Server Fault post here, I have configured Sieve as below. The sieve script is executed properly and the duplicate is marked each time.

sieve = file:~/sieve;active=~/.dovecot.sieve
sieve_before = /etc/dovecot/conf.d/deduplicate.sieve
sieve_extensions = +duplicate +fileinto +imapflags +editheader

Deduplicate sieve script:

require "duplicate";
if duplicate {
  discard;
  stop;
}

Dovecot log entries for single email:

Oct 12 08:57:59 lmtp([email protected])<230008><IFAiHZclKGV5ggMAfXq90w>: Debug: sieve: msgid=<SA1PR22MB3146CDDB578F27162AAD7C29EFD3A@SA1PR22MB3146.namprd22.prod.outlook.com>: Finish implicit keep action
Oct 12 08:57:59 lmtp([email protected])<230008><IFAiHZclKGV5ggMAfXq90w>: Debug: sieve: msgid=<SA1PR22MB3146CDDB578F27162AAD7C29EFD3A@SA1PR22MB3146.namprd22.prod.outlook.com>: Finishing actions
Oct 12 08:57:59 lmtp([email protected])<230008><IFAiHZclKGV5ggMAfXq90w>: Debug: sieve: msgid=<SA1PR22MB3146CDDB578F27162AAD7C29EFD3A@SA1PR22MB3146.namprd22.prod.outlook.com>: Finish duplicate_mark action
Oct 12 08:57:59 lmtp([email protected])<230008><IFAiHZclKGV5ggMAfXq90w>: Debug: sieve: msgid=<SA1PR22MB3146CDDB578F27162AAD7C29EFD3A@SA1PR22MB3146.namprd22.prod.outlook.com>: Finished executing result (final, status=ok, keep=yes)
  • Check your logs: is your sieve script taking non-negligible time to finish processing? You may be able to make the problem mostly disappear by resolving whatever is causing this concurrency to occur.
    – anx
    Oct 12 at 22:17
  • I wonder if your duplication problem could be resolved at the root. No promises whatsoever, but I guess there is no harm in posting a separate question about your Postfix configuration and its alias lookups.
    – anx
    Oct 12 at 22:57
  • Hint: If your logs only look like that because of systemd, you can get it to print correct timestamps with explicit output formatting like journalctl -o short-iso-precise
    – anx
    Oct 12 at 23:04
  • @anx Sieve processing of all emails with the same Message-ID finishes within the same second, so a race condition is very likely. Postfix uses recursive lookups on the virtual alias database, which is the root cause of the duplicates. From what I have read, this is the expected way for Postfix to behave, and it gave an error when I tried to set the recursion limit to 1. I didn't see a good setting or possibility to remove duplicates in Postfix itself. Oct 13 at 13:03

1 Answer


RFC 7352 makes very clear which side to err on (emphasis mine):

Implementations MUST only update the internal duplicate-tracking list when the Sieve script execution finishes successfully. [..] However, deferring the definitive modification of the tracking list to the end of a successful Sieve script execution is not without problems. It can cause a race condition when a duplicate message is delivered in parallel before the tracking list is updated. This way, a duplicate message could be missed by the "duplicate" test. More complex implementations could use a locking mechanism to prevent this problem. But, irrespective of what implementation is chosen, situations in which the "duplicate" test erroneously yields "true" MUST be prevented.

That leaves you with one complete and two incomplete workarounds:

  1. Make Dovecot enforce per-user locking via the anvil service with lmtp_user_concurrency_limit=1. That locks LMTP delivery per user and defers the message back to Postfix if the lock is already taken. This should give you behaviour similar to your process_limit=1 workaround, but at a much lower performance penalty, provided you actually deliver mostly to different users.

    You may need to look at lmtp_connection_cache_time_limit et al. in Postfix, and at your deferred queue behaviour and connection reuse after Dovecot actually defers messages this way. Postfix should generally back off only very briefly.

  2. Make your sieve script finish faster, so there is less concurrency. You may need to move expensive checks out of post-queue calls from sieve and into before-queue milters. If you are inserting messages from other sources, having Dovecot update indexes in a recurring background job will reduce the outstanding work Dovecot might trigger on delivery.

  3. Make your submitting clients submit duplicates in the same transaction, so that Postfix can push them towards Dovecot together and Dovecot can detect duplicates without locking.
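
For the first workaround, a minimal configuration sketch might look like the following. It assumes the setting is placed at the top level of the Dovecot configuration (the exact file is a local choice) and that the process_limit = 1 override for service lmtp has been removed, so several LMTP processes can run again. Treat it as a starting point to test, not a verified fix.

```
# Sketch only. Assumption: top level of the Dovecot configuration,
# for example in a file under conf.d/ (file name is a local choice).
# Assumption: the earlier "process_limit = 1" override inside
# "service lmtp { ... }" has been removed.

# Allow at most one concurrent LMTP delivery per user. A second
# delivery for the same user is deferred back to Postfix while the
# first one is still in progress.
lmtp_user_concurrency_limit = 1
```

After reloading Dovecot, doveconf -n should list the setting if it was picked up. Whether the resulting Postfix retries are quick enough depends on the queue settings mentioned under workaround 1.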

  • Thanks for the RFC reference. I'm pretty sure that is my problem. 1. I tried lmtp_user_concurrency_limit = 1. It caused a large backlog in the Postfix queue, with deferred mail waiting until the next queue run. If the same user gets multiple emails within that time, deliveries for that user take even longer. I would prefer process_limit = 1 over this, since the delay is normally a few milliseconds to seconds and there is no gap between deliveries. 2. Our sieve script is very simple and currently only handles duplicates. All runs seem to finish within the same second. Oct 13 at 13:54
  • 3. I don't think I understood this solution. I cannot try it, as I don't have control over third-party Postfix clients. Oct 13 at 13:59
