[Imap-protocol] BODY.PEEK[section]<origin.size> FETCH response
tss at iki.fi
Tue Nov 1 00:26:30 PDT 2011
On 1.11.2011, at 9.00, Bron Gondwana wrote:
> On Tue, Nov 01, 2011 at 08:40:57AM +0200, Timo Sirainen wrote:
>> On 1.11.2011, at 8.31, Bron Gondwana wrote:
>>> On Tue, Nov 01, 2011 at 08:06:29AM +0200, Timo Sirainen wrote:
>>>> Dovecot also stores messages with LFs and has no trouble exporting them as if they were CRLFs. I think the only actual (performance) problem with it is .. well, actually the topic of this thread :) A partial fetch from a non-zero offset requires some scanning to find out the LF-only-offset. But luckily all clients just fetch the blocks in increasing order from zero offset, so this isn't such an important problem.
>>> How do you handle a message with a mix of LF and CRLF in the original?
>> "Correctly." :)
> Er - by which you mean that you always return the exact bytes you were given?
I don't think LF vs. CRLF has any special meaning in email data; they're both simply newlines. So Dovecot doesn't try to preserve which one was used. Both get converted on output anyway (to LFs or CRLFs depending on context). I did initially wonder about supporting binary message bodies, but never bothered with it.
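To illustrate the idea of treating LF and CRLF as the same newline, here is a minimal Python sketch. It is not Dovecot's code; the function names `to_crlf` and `to_lf` and the sample message are made up for this example. The only subtle point is that a CRLF already present in the input must not be expanded a second time.

```python
import re

# A LF that is not already preceded by a CR.
_BARE_LF = re.compile(rb"(?<!\r)\n")


def to_crlf(data: bytes) -> bytes:
    """Expand every bare LF to CRLF; existing CRLF pairs are left alone."""
    return _BARE_LF.sub(b"\r\n", data)


def to_lf(data: bytes) -> bytes:
    """Collapse every CRLF pair to a bare LF; lone CRs are left alone."""
    return data.replace(b"\r\n", b"\n")


# A message that mixes both newline styles (hypothetical sample data).
mixed = b"Subject: hi\r\n\nbody line\nlast line\r\n"

print(to_crlf(mixed))  # every line now ends in CRLF
print(to_lf(mixed))    # every line now ends in LF
```

Both functions are idempotent, so a message with mixed line endings ends up in the same form no matter how it arrived.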
>> Basically everywhere there are message (part) sizes, I store the "physical size" (exactly as it is stored on disk, with or without CRs) and the "virtual size" (all LFs converted to CRLFs). If the physical size equals the virtual size, I'll do some extra optimizations like being able to seek to the wanted offset immediately or use sendfile() to send the message.
> Sounds to me like that's enough benefit to store it all CRLFs in itself.
> 1/65 of storage space vs seek and sendfile.
Well, that's why it's an option :) But typically I've noticed that I/O is the problem, not CPU, so sendfile isn't all that useful. The seeking is more of a theoretical problem. Normally when clients fetch partial data they start from offset 0, so no seeking needed. The next block starts from where the previous block ended, which Dovecot remembers and continues again without seeking. And so on. So even if LFs save only a little disk space and disk I/O, I figured it's better than nothing.
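The seeking behaviour described above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not Dovecot's implementation: the class `LfStoredMessage`, its `fetch` method, and the `scanned` counter are all hypothetical names. The sketch keeps the physical and virtual sizes, slices directly when they are equal, and otherwise remembers where the previous fetch ended so that the next sequential block needs no rescanning.

```python
class LfStoredMessage:
    """Message stored as-is on disk, served as if every newline were CRLF."""

    def __init__(self, physical: bytes):
        self.physical = physical
        self.physical_size = len(physical)
        # Each bare LF gains one CR in the virtual (CRLF) view.
        self.virtual_size = self.physical_size + sum(
            self._is_bare_lf(p) for p in range(self.physical_size)
        )
        self.scanned = 0     # bytes examined while seeking (illustration only)
        self._mark = (0, 0)  # (virtual, physical) offset where the last fetch ended

    def _is_bare_lf(self, p: int) -> bool:
        d = self.physical
        return d[p] == 0x0A and (p == 0 or d[p - 1] != 0x0D)

    def _locate(self, target: int):
        """Find the physical byte that contains virtual offset `target`.

        Returns (v, p) where v is the virtual offset of physical byte p.
        v is `target`, or `target - 1` if the target falls between the
        inserted CR and its LF.
        """
        # Resume from the remembered position when moving forward.
        v, p = self._mark if target >= self._mark[0] else (0, 0)
        while p < self.physical_size:
            width = 2 if self._is_bare_lf(p) else 1
            if v + width > target:
                break
            v += width
            p += 1
            self.scanned += 1
        return v, p

    def fetch(self, offset: int, length: int) -> bytes:
        """Return virtual bytes [offset, offset + length)."""
        if self.physical_size == self.virtual_size:
            # No bare LFs at all: virtual offsets are physical offsets.
            return self.physical[offset:offset + length]
        v, p = self._locate(offset)
        skip = offset - v
        end = offset + length
        out = bytearray()
        mark = (v, p)
        while p < self.physical_size and v < end:
            piece = b"\r\n" if self._is_bare_lf(p) else self.physical[p:p + 1]
            out += piece
            v += len(piece)
            p += 1
            if v <= end:
                mark = (v, p)  # only remember whole-byte boundaries
        self._mark = mark
        return bytes(out[skip:skip + length])


msg = LfStoredMessage(b"ab\ncd\nef\n")
blocks = [msg.fetch(off, 5) for off in range(0, msg.virtual_size, 5)]
print(blocks, msg.scanned)
```

With blocks fetched in increasing order, `scanned` stays at zero because each fetch resumes from the remembered mark. A fetch that jumps straight to a non-zero offset on a fresh object has to scan from the start, which is the "theoretical problem" case.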
>> Although a mix of LFs and CRLFs in the same message shouldn't normally appear in mail files.
> Most often seen with headers, or between parts. The most ugly cases
> being differences between the mime-headers of a part, and the content
> of said part.
Coming from where? SMTP? IMAP APPENDs? I've never noticed, because Dovecot handles them silently.