Ticket #714 (new defect)

Opened 7 years ago

Last modified 11 months ago

Lines starting with "From " aren't escaped when saving from Maildir to mbox

Reported by: Phil Pennock <Phil.Pennock@…> Owned by: mutt-dev
Priority: major Milestone:
Component: mutt Version: 1.2.5i
Keywords: Cc:

Description (last modified by brendan)

Send a mail to an address which is saved into a Maildir folder; in the mail, include a line which starts "From". Start mutt on that folder. Read the message, save to a normal mbox folder. Look at the folder with a text-viewer, see that the line hasn't been escaped with a ">".

Mutt survives, thanks to the Content-Length: header. Mailers which don't understand that header get upset (as does Mutt if the header is deleted).

Since Mutt supports multiple mailbox formats, it needs to understand the body conversions necessary when moving between formats. I've just confirmed that ">From" isn't changed into "From" when saving _to_ Maildir format, but this is arguably more problematic.

Bug confirmed present in 1.3.19i.
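
The escaping this report expects can be sketched as follows. This is only an illustration of the traditional rule (prefix ">" to any body line beginning with "From "), not Mutt's code; the function name `escape_from_lines` is made up for this example.

```python
def escape_from_lines(body: str) -> str:
    """Prefix '>' to every body line that starts with 'From ' (traditional mbox rule)."""
    out = []
    for line in body.split("\n"):
        if line.startswith("From "):
            line = ">" + line
        out.append(line)
    return "\n".join(out)


body = "Hello,\nFrom now on this line looks like a separator.\nBye.\n"
print(escape_from_lines(body))
```

Only the body should be passed in; the real "From " separator line that starts each mbox entry must of course stay unescaped.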

Change History

  Changed 7 years ago by Thomas Roessler <roessler@…>

severity 714 grave

  Changed 5 years ago by Alain Bench <veronatif@…>

Hello Phil, thanks for your report. And hello ALL.

On Monday, July 30, 2001 at 9:45:28 AM +0000, Phil Pennock wrote:

> Send a mail to an address which is saved into a Maildir folder; in the
> mail, include a line which starts "From". Start mutt on that folder.
> Read the message, save to a normal mbox folder. Look at the folder
> with a text-viewer, see that the line hasn't been escaped with a ">".

   I can confirm this problem with 1.5.4, and add that, saving to mbox:

- A "Lines:" header is not added.
- An originally wrong "Lines:" count is preserved.
- An originally wrong "Content-Length:" count is preserved.


Bye!	Alain.

  Changed 5 years ago by Vincent Lefevre <vincent@…>

On 2003-09-16 14:40:17 +0200, Alain Bench wrote:
>  On Monday, July 30, 2001 at 9:45:28 AM +0000, Phil Pennock wrote:
> > Send a mail to an address which is saved into a Maildir folder; in the
> > mail, include a line which starts "From". Start mutt on that folder.
> > Read the message, save to a normal mbox folder. Look at the folder
> > with a text-viewer, see that the line hasn't been escaped with a ">".
> 
>     I can confirm this problem with 1.5.4,

This is not a problem, on the contrary. Modifying the body of a
message is a bad idea, and escaping the From is no longer needed
as the "Content-Length:" header indicates the size of the message.

> and add that, saving to mbox:
> 
>  - A "Lines:" header is not added.

This is not required (though it would be a good idea, ditto with the
maildir format -- there are places where one can't use procmail).

>  - An originally wrong "Lines:" count is preserved.
>  - An originally wrong "Content-Length:" count is preserved.

Well, it is not Mutt's goal to fix broken software. However,
checking these values would also be a good idea, IMHO. I think that
you can also remove them with procmail (if you can use it).

-- 
Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/> - 100%
validated (X)HTML - Acorn Risc PC, Yellow Pig 17, Championnat International
des Jeux Mathématiques et Logiques, TETRHEX, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

  Changed 5 years ago by Bob Bell <bbell@…>

On Wed, Sep 17, 2003 at 06:11:49PM +0200, Alain Bench <veronatif@free.fr> wrote:
>  On Tuesday, September 16, 2003 at 10:22:46 PM +0200, Vincent Lefèvre wrote:
> > escaping the From is no longer needed as the "Content-Length:" header
> > indicates the size of the message.
> 
>     Highway to potential disaster. Don't we have to interoperate with
> other mbox tools? Some such tools don't look at CL.
> 
>     I mean, one can /dislike/ body munging and From escaping, and feel
> unhappy about it. One can still want to use mbox format. But then, one
> has to take care to use *only* CL-mbox variant tools on his system. If
> it's perfectly possible to work like that, it's perhaps not really
> sensible to put such a requirement on everybody else.
> 
>     Between From escaping and mbox corruption risk, I'd choose...

If anyone changes this behavior, please make it configurable.
I specifically have procmail/formail insert Content-Length: header so
that mutt doesn't have to do any body munging and From escaping.
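
For readers unfamiliar with the Content-Length variant discussed here, the sketch below shows how a reader that trusts the header can find the end of each message without any From escaping. It is a simplified illustration, not Mutt's parser; `make_entry` and `read_bodies` are hypothetical names, and the code assumes every entry carries a correct Content-Length header.

```python
def make_entry(body: bytes) -> bytes:
    """Build one mbox entry with a Content-Length header and no From escaping."""
    header = (
        b"From sender@example.org Thu Jan  1 00:00:00 1970\n"
        b"Content-Length: " + str(len(body)).encode("ascii") + b"\n"
    )
    return header + b"\n" + body + b"\n"


def read_bodies(mbox: bytes) -> list:
    """Return the message bodies, using Content-Length to locate each body's end."""
    bodies = []
    pos = 0
    while pos < len(mbox):
        if mbox.startswith(b"\n", pos):  # blank line between entries
            pos += 1
            continue
        if not mbox.startswith(b"From ", pos):
            raise ValueError("expected a From_ separator line")
        header_end = mbox.index(b"\n\n", pos)
        length = None
        for line in mbox[pos:header_end].split(b"\n"):
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.decode("ascii").strip())
        if length is None:
            raise ValueError("entry has no Content-Length header")
        start = header_end + 2
        bodies.append(mbox[start:start + length])
        pos = start + length
    return bodies


first = b"Hi,\nFrom now on I use Maildir.\n"  # unescaped "From " line in the body
second = b"Second message.\n"
print(read_bodies(make_entry(first) + make_entry(second)) == [first, second])
```

A reader that ignores the header and splits on "From " lines would see three messages in the same file, which is the interoperability concern raised above.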

  Changed 5 years ago by Alain Bench <veronatif@…>

On Tuesday, September 16, 2003 at 10:22:46 PM +0200, Vincent Lefèvre wrote:

> escaping the From is no longer needed as the "Content-Length:" header
> indicates the size of the message.

   Highway to potential disaster. Don't we have to interoperate with
other mbox tools? Some such tools don't look at CL.

   I mean, one can /dislike/ body munging and From escaping, and feel
unhappy about it. One can still want to use mbox format. But then, one
has to take care to use *only* CL-mbox variant tools on his system. If
it's perfectly possible to work like that, it's perhaps not really
sensible to put such a requirement on everybody else.

   Between From escaping and mbox corruption risk, I'd choose...


Bye!	Alain.

  Changed 5 years ago by Vincent Lefevre <vincent@…>

On 2003-09-17 18:11:49 +0200, Alain Bench wrote:
>     Highway to potential disaster. Don't we have to interoperate with
> other mbox tools? Some such tools don't look at CL.

One may choose not to use such tools.


  Changed 5 years ago by Edmund GRIMLEY EVANS <edmundo@…>

Alain Bench <veronatif@free.fr>:

>     Highway to potential disaster. Don't we have to interoperate with
> other mbox tools? Some such tools don't look at CL.

Generally mbox is a disaster. There are at least three ways of writing
to an mbox and countless ways of reading from one depending on whether
you look at CL, how strictly you parse the From_ lines, how you
unescape "From" and how you fall back from one strategy to another
when you find something unexpected. Check the list archives for
further discussion and references.

  Changed 5 years ago by Alain Bench <veronatif@…>

On Thursday, September 18, 2003 at 8:46:55 AM +0100, Edmund Grimley-Evans wrote:

> There are at least three ways of writing to an mbox

   I count: Traditional BSD (no CL, From escaping) and CL-variant (CL,
no escaping). What's the third? Hybrid CL _and_ From escaping?


> Check the list archives for further discussion and references.

   There are also interesting elements on Jamie Zawinski's site at
<URL:http://www.jwz.org/doc/content-length.html>. Does anyone have more
sources?


Bye!	Alain.
-- 
« if you believe the Content-Length header, I've got a bridge to sell you. »

  Changed 3 years ago by ab

Deduppe despam dejunk devirus unform.

  Changed 20 months ago by brendan

  • priority changed from critical to major
  • description modified (diff)

  Changed 11 months ago by rtc

There are not three, but four principal ways of writing to an mbox, known as mboxo, mboxrd, mboxcl and mboxcl2, and mboxrd should always be your choice. See http://homepages.tesco.net/~J.deBoynePollard/FGA/mail-mbox-formats.html for details. See also #2976.

  Changed 11 months ago by vinc17

I've noticed that he avoids talking about the restrictions on line lengths: escaping a "From " line adds a character, which is bad if the maximum line length had already been reached.

RFC 2822 says:

2.1.1. Line Length Limits

   There are two limits that this standard places on the number of
   characters in a line. Each line of characters MUST be no more than
   998 characters, and SHOULD be no more than 78 characters, excluding
   the CRLF.
[...]

Though this doesn't apply to mailbox formats, I wouldn't be surprised if some/much mail software assumes that messages in mailboxes are in the RFC2822 format (this would be rather logical after all).
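
The line-length concern is simple arithmetic and can be checked directly. The snippet below is only an illustration; `quote_from` is a made-up helper that applies the usual ">" prefix.

```python
MAX_LINE = 998  # RFC 2822 limit on characters per line, excluding CRLF


def quote_from(line: str) -> str:
    """Apply the usual '>' prefix to a line that starts with 'From '."""
    return ">" + line if line.startswith("From ") else line


# A line that starts with "From " and is exactly at the limit.
line = "From " + "x" * (MAX_LINE - len("From "))
quoted = quote_from(line)
print(len(line), len(quoted))  # 998 999
```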

Also, if one assumes that mail software nowadays supports MIME, quoted-printable makes it possible to encode the "From" without any corruption.
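
A minimal sketch of the quoted-printable idea, assuming the body part is already transferred as quoted-printable so that "=46" decodes back to "F". The helper name `qp_protect_from` is hypothetical; `quopri.decodestring` is the Python standard library decoder.

```python
import quopri


def qp_protect_from(line: bytes) -> bytes:
    """Encode the leading 'F' as '=46' so the line no longer starts with 'From '."""
    if line.startswith(b"From "):
        return b"=46" + line[1:]
    return line


encoded = qp_protect_from(b"From here on, things change.\n")
decoded = quopri.decodestring(encoded)
print(encoded)
print(decoded)
```

Any MIME-aware reader decodes the line back to the original text, and the stored form never begins with "From ", so no ">" needs to be added.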

  Changed 11 months ago by rtc

Replying to vinc17:

> I've noticed that he avoids talking about the restrictions on line lengths: escaping a "From " line adds a character, which is bad if the maximum line length had already been reached. ... Though this doesn't apply to mailbox formats, I wouldn't be surprised if some/much mail software assumes that messages in mailboxes are in the RFC2822 format (this would be rather logical after all).

The RFC message format is a layer above the mailbox. The mailbox stores collections of messages transparently, that is, without assumptions about the message format, while it should be possible to store the messages in any mailbox format, without any assumptions about its structure. If you read from a mailbox, you must hence undo the From_ quoting before assuming that you can handle the message.

> Also, if one assumes that mail software nowadays supports MIME, quoted-printable makes it possible to encode the "From" without any corruption.

That would tamper with the message and violate the layer separation principle.

  Changed 11 months ago by vinc17

Replying to rtc:

> The RFC message format is a layer above the mailbox. The mailbox stores collections of messages transparently, that is, without assumptions about the message format,

This is not true. For instance, with the maildir format, the flags of the messages are stored in the filename.

> while it should be possible to store the messages in any mailbox format, without any assumptions about its structure. If you read from a mailbox, you must hence undo the From_ quoting before assuming that you can handle the message.

No, it is not possible to unquote the "From " unambiguously.

> > Also, if one assumes that mail software nowadays supports MIME, quoted-printable makes it possible to encode the "From" without any corruption.

> That would tamper with the message and violate the layer separation principle.

No more than From quoting.

  Changed 11 months ago by rtc

Replying to vinc17:

> Replying to rtc:

> > The RFC message format is a layer above the mailbox. The mailbox stores collections of messages transparently, that is, without assumptions about the message format,

> This is not true. For instance, with the maildir format, the flags of the messages are stored in the filename.

This does not contradict, but, on the contrary, is in exact agreement with what I said. The flags are not part of the message itself. Strictly speaking, mailers that save flags as part of the message header (admittedly the single most commonly seen solution) are violating the layer separation principle.

> > while it should be possible to store the messages in any mailbox format, without any assumptions about its structure. If you read from a mailbox, you must hence undo the From_ quoting before assuming that you can handle the message.

> No, it is not possible to unquote the "From " unambiguously.

I did not say that it is possible without preconditions. You should always assume mboxrd format when unquoting, and if the message was in fact written in mboxrd format, then unquoting will be unambiguous. If not, then the information was already lost when the mail was written, so it won't really be a problem. I use procmail, enhanced to write mboxrd instead of mboxo format messages.
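
The round-trip property claimed for mboxrd can be sketched like this. It assumes the commonly described mboxrd rule (add one ">" to every line matching `^>*From `, remove one when reading); the function names are illustrative only.

```python
import re


def mboxrd_quote(body: str) -> str:
    """Add one '>' in front of every line that matches ^>*From ."""
    return re.sub(r"^(>*From )", r">\1", body, flags=re.MULTILINE)


def mboxrd_unquote(stored: str) -> str:
    """Remove one '>' from every line that matches ^>+From ."""
    return re.sub(r"^>(>*From )", r"\1", stored, flags=re.MULTILINE)


body = "From the start\n>From a reply\n>>From deeper\nplain text\n"
stored = mboxrd_quote(body)
print(stored)
print(mboxrd_unquote(stored) == body)
```

Because every affected line gains exactly one ">" on writing and loses exactly one on reading, the two functions are inverses, provided the file really was written this way.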

> > > Also, if one assumes that mail software nowadays supports MIME, quoted-printable makes it possible to encode the "From" without any corruption.

> > That would tamper with the message and violate the layer separation principle.

> No more than From quoting.

If you do mboxrd style From_ quoting and unquoting, you will not violate it.

  Changed 11 months ago by vinc17

Replying to rtc:

> This does not contradict, but, on the contrary, is in exact agreement with what I said. The flags are not part of the message itself.

So, for the same reason, the "From " should not be quoted, and MTAs/MDAs should not add headers to messages (or modify the encoding of messages).

> You should always assume mboxrd format when unquoting,

No, this is not the format supported by most software. Procmail, Exim and Postfix assume mboxo.

> and if the message was in fact written in mboxrd format, then unquoting will be unambiguous. If not, then the information was already lost when the mail was written,

No information is lost when the mail is written. It is lost only when the mail is stored to some mbox format. But in general the mbox format is not the mboxrd format. So, it is better not to assume mboxrd when unquoting.

> If you do mboxrd style From_ quoting and unquoting, you will not violate it.

But the point is that most software does not use mboxrd.

  Changed 11 months ago by rtc

Replying to vinc17:

> So, for the same reason, the "From " should not be quoted, and MTAs/MDAs should not add headers to messages (or modify the encoding of messages).

From_ must be quoted in some way; if not, the end of the message cannot be uniquely determined anymore.

> > You should always assume mboxrd format when unquoting,

> No, this is not the format supported by most software. Procmail, Exim and Postfix assume mboxo.

To the best of my knowledge, procmail, exim and postfix never read from a mailbox (correct me if I am wrong), but only append to it, so they never have to assume a mailbox format, but only write messages in one. Assuming mboxrd format when reading a mail from an mboxo format mailbox causes no more problems than assuming mboxo format when reading a mail from an mboxo format mailbox.

> No information is lost when the mail is written. It is lost only when the mail is stored to some mbox format.

In fact, it is only lost when the mail is stored to some mbox format other than mboxrd.

> But in general the mbox format is not the mboxrd format. So, it is better not to assume mboxrd when unquoting.

There is no such thing as "the mbox format". It is always better to assume mboxrd. It has the clear advantage that the message is retrieved correctly at least for one case: The case that the file was written in mboxrd format.

> > If you do mboxrd style From_ quoting and unquoting, you will not violate it.

> But the point is that most software does not use mboxrd.

I am not exactly sure whether you have understood the problem that mboxrd solves. mboxo mailboxes always involve a loss of information. You cannot decode an mboxo mailbox correctly, even if you know that it was written in mboxo format. So there is no problem that was not already there if you assume mboxrd format.
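
The loss of information referred to here can be shown with two different bodies that end up stored identically. This sketch assumes the plain mboxo rule (quote only lines starting with "From "); `mboxo_quote` is an illustrative name.

```python
def mboxo_quote(body: str) -> str:
    """mboxo rule: prefix '>' only to lines that start with 'From '."""
    lines = body.split("\n")
    return "\n".join(">" + l if l.startswith("From ") else l for l in lines)


original_a = "From here on\n"   # gets a '>' added
original_b = ">From here on\n"  # already starts with '>', left untouched
print(mboxo_quote(original_a) == mboxo_quote(original_b))  # True
```

Since both inputs map to the same stored line, no reader can tell which one was originally sent.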

Reservations such as you have towards mboxrd are not really justified and have long been refuted. See http://groups.google.com/group/comp.mail.misc/msg/1c75d524a8957427 The best choice is short and simple: Always assume mboxrd.

  Changed 11 months ago by vinc17

Replying to rtc:

> From_ must be quoted in some way; if not, the end of the message cannot be uniquely determined anymore.

No, with some formats, it doesn't need to be quoted. Also, if a line starting with "From" doesn't have the expected format, it doesn't need to be quoted. Mutt does this check when reading a mailbox (if the Content-Length is either incorrect or missing).
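
To make "the expected format" concrete, here is a rough sketch of a stricter separator check. It is not a transcription of Mutt's own check; the regular expression assumes a classic "From address asctime-date" line and the name `looks_like_separator` is hypothetical. Real-world From_ lines vary more than this pattern allows.

```python
import re

# Assumed shape: "From <address> <Day> <Mon> <dd> <hh:mm:ss> <yyyy>"
STRICT_FROM = re.compile(
    r"^From \S+ +"
    r"(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +"
    r"\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$"
)


def looks_like_separator(line: str) -> bool:
    """True only if the line has the assumed From_ separator shape."""
    return STRICT_FROM.match(line.rstrip("\n")) is not None


print(looks_like_separator("From alice@example.org Thu Jan  1 00:00:00 1970\n"))
print(looks_like_separator("From now on, things change.\n"))
```

With a check like this, an ordinary sentence beginning with "From " is not mistaken for a separator, at the cost of rejecting separators written in an unexpected style.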

> To the best of my knowledge, procmail, exim and postfix never read from a mailbox (correct me if I am wrong), but only append to it, so they never have to assume a mailbox format, but only write messages in one.

To write a message, they need to assume a format. For instance, if some message contains a line starting with ">From ", it will have to be written as ">From " in mboxo (this is what procmail, exim and postfix do), but it will have to be written as ">>From " in mboxrd.
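
The difference between the two writers described in this paragraph, as a small sketch. The rules are the ones stated above; the function names are illustrative.

```python
import re


def write_mboxo(body: str) -> str:
    """mboxo: quote only lines that start with 'From '."""
    return re.sub(r"^(From )", r">\1", body, flags=re.MULTILINE)


def write_mboxrd(body: str) -> str:
    """mboxrd: quote lines that start with 'From ' after any number of '>'."""
    return re.sub(r"^(>*From )", r">\1", body, flags=re.MULTILINE)


body = ">From a quoted reply\n"
print(write_mboxo(body), end="")   # >From a quoted reply
print(write_mboxrd(body), end="")  # >>From a quoted reply
```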

> Assuming mboxrd format when reading a mail from an mboxo format mailbox causes no more problems than assuming mboxo format when reading a mail from an mboxo format mailbox.

This is incorrect. Assuming that in general, a line starting with "From " doesn't need to be quoted (see above), assuming mboxo format will lead to fewer message "corruptions".

> > No information is lost when the mail is written. It is lost only when the mail is stored to some mbox format.

> In fact, it is only lost when the mail is stored to some mbox format other than mboxrd.

No (see above).

> There is no such thing as "the mbox format". It is always better to assume mboxrd. It has the clear advantage that the message is retrieved correctly at least for one case: The case that the file was written in mboxrd format.

No, this is wrong (see above).

> I am not exactly sure whether you have understood the problem that mboxrd solves. mboxo mailboxes always involve a loss of information.

No, this is wrong. If you have no lines starting with "From " (or if the software is smart enough to detect that they don't need to be quoted, that is, most of the time), there is no loss of information.

  Changed 11 months ago by rtc

Replying to vinc17:

> Replying to rtc:

> > From_ must be quoted in some way; if not, the end of the message cannot be uniquely determined anymore.

> No, with some formats, it doesn't need to be quoted.

In fact, I know of exactly one mbox-based format, the mboxcl format, where this is the case. It has several severe flaws: Any reader that does not know this sick convention will wreak havoc in such mailboxes; the delivery program needs to take care of incoming mails that already have Content-Length headers; if your delivery program doesn't (which is common, you mentioned procmail, postfix and exim which don't take care of that, I suppose), you get serious problems if you get a remote message with tampered Content-Length headers etc. I fully agree with what the aforementioned http://www.jwz.org/doc/content-length.html has to say about that (despite its unfortunate rejection of mboxrd in favor of mboxo, which is however done with much better arguments than you have here).

> Also, if a line starting with "From" doesn't have the expected format, it doesn't need to be quoted.

There is no such thing as "the expected format" of a "line starting with 'From'". If your mbox variety requires From_ quoting, you should quote any line starting with "From ", and you should assume that any line starting with "From " is a message separator. You complicate matters endlessly and create many unnecessary new problems if you try to use such a seemingly more "sophisticated" approach.

> Mutt does this check when reading a mailbox (if the Content-Length is either incorrect or missing).

It shouldn't. If you ever have a message inside an mboxcl format mailbox that lacks an appropriate Content-Length, and it contains an unquoted From_, something went horribly wrong.

> To write a message, they need to assume a format. For instance, if some message contains a line starting with ">From ", it will have to be written as ">From " in mboxo (this is what procmail, exim and postfix do), but it will have to be written as ">>From " in mboxrd.

This is correct. However, your initial statement was "No, this is not the format supported by most software". What is problematic about mboxrd not being this format? Whether you choose to assume mboxo or mboxrd on a mailbox written in the format supported by most software (that writes mailboxes), and this format is mboxo, you will never in all cases be able to unambiguously reconstruct the original message.

And one major MTA, qmail, has switched to mboxrd, so if you assume mboxrd, you will at least have a clean solution for this case.

You are talking about the mboxcl format above. This is by far not the format supported by most software, either.

> > Assuming mboxrd format when reading a mail from an mboxo format mailbox causes no more problems than assuming mboxo format when reading a mail from an mboxo format mailbox.

> This is incorrect. Assuming that in general, a line starting with "From " doesn't need to be quoted (see above), assuming mboxo format will lead to fewer message "corruptions".

I don't understand what you mean. The first assumption cannot be combined with the second. They contradict each other.

A From_ always needs to be quoted except if you assume mboxcl format, in which case you need to assume mboxcl, not mboxo, on reading the mailbox, and you are then trading the From_ problems for many other, much more serious ones (see above).

Assuming mboxo format will never lead to fewer message corruptions compared to assuming the mboxrd format. Of course you can always construct specific examples where assuming the mboxrd format will corrupt the message worse than assuming mboxo would. But these cases are very rare in the real world and are never fatal.

> > > No information is lost when the mail is written. It is lost only when the mail is stored to some mbox format.

> > In fact, it is only lost when the mail is stored to some mbox format other than mboxrd.

> No (see above).

Information is lost in mboxcl format, too, because if you want to write a mail in that format, you need to overwrite already existing Content-Length headers, apart from the other problems I mentioned.

> > I am not exactly sure whether you have understood the problem that mboxrd solves. mboxo mailboxes always involve a loss of information.

> No, this is wrong. If you have no lines starting with "From " (or if the software is smart enough to detect that they don't need to be quoted, that is, most of the time), there is no loss of information.

If I am talking about corruption, I am talking about what happens to the infinity of possible messages, not any specific subset of them. If you have no lines starting with "From ", you obviously never have problems. Defining the problem to be non-existent is not a valid argument. I don't see that "the software [being] smart enough to detect that they don't need to be quoted" is in any way related to "most of the time".

  Changed 11 months ago by vinc17

Replying to rtc:

> In fact, I know of exactly one mbox-based format, the mboxcl format, where this is the case. It has several severe flaws: Any reader that does not know this sick convention will wreak havoc in such mailboxes;

This is not a flaw. Mutt supports the Content-Length header, so, no problem for me. At least this doesn't corrupt the message ("From " quoting is particularly evil when it occurs in an attachment).

> the delivery program needs to take care of incoming mails that already have Content-Length headers; if your delivery program doesn't (which is common, you mentioned procmail, postfix and exim which don't take care of that, I suppose), you get serious problems if you get a remote message with tampered Content-Length headers etc.

I've never seen any problem in practice (but I switched to maildir for incoming mail a few years ago). Also, nowadays most users use filters (mainly because of spam), and it's really easy to remove the Content-Length header. Ditto for the Status header.

Also, without Content-Length, the mailer would be very inefficient on large mailboxes.

> There is no such thing as "the expected format" of a "line starting with 'From'".

See the is_from function in from.c.

> If your mbox variety requires From_ quoting, you should quote any line starting with "From ", and you should assume that any line starting with "From " is a message separator. You complicate matters endlessly and create many unnecessary new problems if you try to use such a seemingly more "sophisticated" approach.

Instead of dealing with theory that doesn't match the practice, Mutt tries to solve practical problems. And it does it well.

> > Mutt does this check when reading a mailbox (if the Content-Length is either incorrect or missing).

> It shouldn't. If you ever have a message inside an mboxcl format mailbox that lacks an appropriate Content-Length, and it contains an unquoted From_, something went horribly wrong.

In any case, I think it is fine that Mutt has some recovery mechanisms.

> This is correct. However, your initial statement was "No, this is not the format supported by most software". What is problematic about mboxrd not being this format? Whether you choose to assume mboxo or mboxrd on a mailbox written in the format supported by most software (that writes mailboxes), and this format is mboxo, you will never in all cases be able to unambiguously reconstruct the original message.

But assuming mboxo leads to the correct result most of the time.

> And one major MTA, qmail, has switched to mboxrd, so if you assume mboxrd, you will at least have a clean solution for this case.

qmail uses maildir by default (IIRC, this was the format introduced by qmail).

> You are talking about the mboxcl format above. This is by far not the format supported by most software, either.

It is supported by Mutt, and this is sufficient for most Mutt users.

> A From_ always needs to be quoted except if you assume mboxcl format,

Not all of them.

> in which case you need to assume mboxcl, not mboxo, on reading the mailbox,

In practice, one can have mixed-format mailboxes.

> Assuming mboxo format will never lead to fewer message corruptions compared to assuming the mboxrd format. Of course you can always construct specific examples where assuming the mboxrd format will corrupt the message worse than assuming mboxo would. But these cases are very rare in the real world and are never fatal.

No, they are common and annoying.

> Information is lost in mboxcl format, too, because if you want to write a mail in that format, you need to overwrite already existing Content-Length headers, apart from the other problems I mentioned.

No information is lost. Content-Length is just meta-data.

> If I am talking about corruption, I am talking about what happens to the infinity of possible messages, not any specific subset of them.

I really don't care about your infinity of possible messages. Not all possible messages occur in practice. Your model is wrong, as well as your deductions.

  Changed 11 months ago by rtc

Replying to vinc17:

> Replying to rtc:

> > Any reader that does not know this sick convention will wreak havoc in such mailboxes;

> This is not a flaw.

Of course it is a flaw, and a serious one.

> Mutt supports the Content-Length header, so, no problem for me. At least this doesn't corrupt the message

That doesn't help you at all. Not mutt, but the delivery program writes the mail to your mailbox, and it will usually do so in mboxo format. Your message is hence corrupted before mutt even sees it.

> ("From " quoting is particularly evil when it occurs in an attachment).

I agree, but if you assume mboxrd, you will at least get back the original attachment if it was written by an mboxrd writer.

> I've never seen any problem in practice (but I switched to maildir for incoming mail a few years ago). Also, nowadays most users use filters (mainly because of spam), and it's really easy to remove the Content-Length header. Ditto for the Status header.

You are playing down the problem here and defining it to be non-existent again.

> Also, without Content-Length, the mailer would be very inefficient on large mailboxes.

It would be inefficient on mailboxes with large messages in them, though you overestimate the effect here.

> > There is no such thing as "the expected format" of a "line starting with 'From'".

> See the is_from function in from.c.

Did anyone ever have a mailbox where this would be necessary? A correct is_from would return 1 if the line starts with "From ", 0 in any other case.

> Instead of dealing with theory that doesn't match the practice, Mutt tries to solve practical problems. And it does it well. [...] In any case, I think it is fine that Mutt has some recovery mechanisms.

On the contrary, mutt seems to ignore practical problems and tries to solve theoretical ones that are not actually there, creating real practical ones by doing that. As jwz correctly says: "Some people will tell you that you should do stricter parsing on those lines: check for user names and dates and so on. They are wrong. The random crap that has traditionally been dumped into that line is without bound; comparing the first five characters is the only safe and portable thing to do."
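
The "first five characters" rule quoted above amounts to the following splitter. This is a sketch of that rule only, with a made-up function name, and it deliberately ignores Content-Length.

```python
def split_mbox(text: str) -> list:
    """Split an mbox into messages, treating every line that starts with 'From ' as a separator."""
    messages = []
    for line in text.splitlines(keepends=True):
        if line.startswith("From "):
            messages.append([])        # a new message begins here
        if messages:
            messages[-1].append(line)  # lines before the first separator are dropped
    return ["".join(m) for m in messages]


escaped = (
    "From a@example.org Thu Jan  1 00:00:00 1970\n"
    "Subject: one\n\n>From now on, escaped.\n\n"
    "From b@example.org Thu Jan  1 00:00:01 1970\n"
    "Subject: two\n\nBye.\n"
)
unescaped = escaped.replace(">From now on", "From now on")
print(len(split_mbox(escaped)), len(split_mbox(unescaped)))  # 2 3
```

The second count is the failure described in the original report: a reader of this kind sees an extra, bogus message when the body line was saved without the ">".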

> But assuming mboxo leads to the correct result most of the time.

Assuming mboxrd leads to the correct result most of the time, too.

> > You are talking about the mboxcl format above. This is by far not the format supported by most software, either.

> It is supported by Mutt, and this is sufficient for most Mutt users.

It doesn't help users at all, because the mail is initially written not by mutt, but by the delivery program, and the delivery program decides how to quote the message, not mutt.

> > A From_ always needs to be quoted except if you assume mboxcl format,

> Not all of them.

They do all need to be quoted and they are all quoted, by any delivery program known to me.

> > in which case you need to assume mboxcl, not mboxo, on reading the mailbox,

> In practice, one can have mixed-format mailboxes.

Yes, with mutt, you will usually end up with mboxcl2, because it adds Content-Length in addition to the quoting already done by the delivery program.

> > Assuming mboxo format will never lead to fewer message corruptions compared to assuming the mboxrd format. Of course you can always construct specific examples where assuming the mboxrd format will corrupt the message worse than assuming mboxo would. But these cases are very rare in the real world and are never fatal.

> No, they are common and annoying.

Yes, in the case where programs that do cryptographic signing already assume that the message will be written in mboxo and so already do the quoting on the sender side. If it is actually written in mboxo format, but you assume mboxrd on reading, you will inevitably get problems and the signature won't match. For this practical case, which is the result of a sick and ugly kludge (for which the inventors should be shot; using any character except '>' to place in front of 'From ' on the sender side would never have brought the problem into existence), mboxrd is indeed not the most practical choice.
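
The signing case can be sketched as follows. A SHA-256 digest stands in for signature verification here, which is an assumption made purely for illustration; the function names are hypothetical.

```python
import hashlib
import re


def write_mboxo(body: str) -> str:
    """mboxo writer: quote only lines that start with 'From '."""
    return re.sub(r"^(From )", r">\1", body, flags=re.MULTILINE)


def read_as_mboxrd(stored: str) -> str:
    """mboxrd reader: strip one '>' from every line matching ^>+From ."""
    return re.sub(r"^>(>*From )", r"\1", stored, flags=re.MULTILINE)


# The sender already quoted this line before signing.
signed_body = ">From the sender's side, already quoted before signing.\n"
digest_at_signing = hashlib.sha256(signed_body.encode()).hexdigest()

stored = write_mboxo(signed_body)   # the mboxo writer leaves the line unchanged
recovered = read_as_mboxrd(stored)  # the mboxrd reader removes the sender's '>'
digest_after_reading = hashlib.sha256(recovered.encode()).hexdigest()

print(digest_at_signing == digest_after_reading)  # False
```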

> No information is lost. Content-Length is just meta-data.

Strictly speaking, it is part of the header of the messages, not of the meta-data. The only meta-data that is present in mbox* files is the From_ lines.

> I really don't care about your infinity of possible messages. Not all possible messages occur in practice. Your model is wrong, as well as your deductions.

My model may not be perfect, but my deductions are not at all as wrong as you try to suggest. I don't deny that you have some points. You shouldn't completely deny my points, either. Please review my new patch at #2976 which leaves the user the option of whether to add mboxcl idiosyncrasies to his mailboxes and whether to save fccs in mboxrd or in mboxcl.
