Ticket #2802 (closed defect: fixed)

Opened 22 months ago

Last modified 21 months ago

Mutt *very* slow reading mails with long header lines

Reported by: Christoph Berg <cb@…> Owned by: mutt-dev
Priority: minor Milestone:
Component: mutt Version:
Keywords: Cc:

Description

The following was submitted as Debian bug #290701:

----- Forwarded message from Steve McIntyre <steve@einval.com> -----

Date: Sun, 16 Jan 2005 01:08:58 +0000
From: Steve McIntyre <steve@einval.com>
Reply-To: 290701@bugs.debian.org
To: submit@bugs.debian.org
Subject: Mutt *very* slow reading some mails

Package: mutt
Version: 1.5.6-20040907+2

Mutt stops responding to me when trying to open certain (spam)
messages. I've attached a gzipped example. strace shows large numbers
of mremap() calls going on, which suggests maybe lots of memory
allocations. The mail has a *very* long line in the Bcc: header, which
I'm guessing may be the cause.

Of course, I don't want to actually read this mail, but mutt should be
better behaved. After waiting a couple of minutes, I killed the mutt
process and had a look at this mail using less instead.

-- 
Steve McIntyre, Cambridge, UK.                                steve@einval.com
Is there anybody out there?

----- End forwarded message -----

The example message is available as
http://bugs.debian.org/cgi-bin/bugreport.cgi/example-mail.gz?bug=290701;msg=5;att=1

Christoph
-- 
cb@df7cb.de | http://www.df7cb.de/

>Fix:
Unknown

Change History

Changed 21 months ago by Christoph Berg <cb@…>

----- Forwarded message from Sami Farin <safari-mutt@safari.iki.fi> -----

Date: Sat, 10 Mar 2007 20:12:30 +0200
From: Sami Farin <safari-mutt@safari.iki.fi>
To: Mutt Development List <mutt-dev@mutt.org>
Subject: mutt Bug #2802 - stupid usage of strcat

mutt would benefit from replacing strcat and friends with
a better string library,...  but because that's too much
work for me, I only checked what it helps if I do
one basic optimization for mutt's usage of strcat.

http://bugs.mutt.org/cgi-bin/gnatsweb.pl?database=mutt&cmd=view%20audit-trail&pr=2802
mutt -f ./big.mbox
cpu usage before: 98180452048 cycles
cpu usage after:   4245703875 cycles
(speedup: 23.1x)

when Bcc is two times longer:
cpu usage before: 385111497988 cycles
cpu usage after:    8414844886 cycles
(speedup: 45.8x)

[ mutt-1.5.14-optimize_strcat_usage.patch ]

rfc822_write_address could as well return the length.
[ mutt-1.5.14-rfc822_write_address_return_int.patch ]

mutt_canonical_charset does stupid loop when searching
for correct entry, with this patch "mutt -f ./big.mbox"
is 2x faster when I use utf-8 charset
[ mutt-1.5.14-mutt_canonical_charset-optimize.patch]
Searching PreferredMIMENames should be converted to 
a better algorithm instead of applying this patch
to mutt, I'd say.  This was just for testing the speed
effect.  And it shouldn't need to call that function
/that/ often, charset doesn't change while parsing
one header field...?!



-- 
Do what you love because life is too short for anything else.


--- mutt-1.5.14/rfc822.c.bak	2006-05-18 20:34:09.000000000 +0300
+++ mutt-1.5.14/rfc822.c	2007-03-10 17:37:28.268829563 +0200
@@ -687,7 +687,7 @@ done:
 }
 
 /* note: it is assumed that `buf' is nul terminated! */
-void rfc822_write_address (char *buf, size_t buflen, ADDRESS *addr, int display)
+int rfc822_write_address (char *buf, size_t buflen, ADDRESS *addr, int display)
 {
   char *pbuf = buf;
   size_t len = mutt_strlen (buf);
@@ -739,6 +739,7 @@ void rfc822_write_address (char *buf, si
   }
 done:
   *pbuf = 0;
+  return pbuf - buf;
 }
 
 /* this should be rfc822_cpy_adr */
--- mutt-1.5.14/rfc822.h.bak	2005-09-17 23:18:23.000000000 +0300
+++ mutt-1.5.14/rfc822.h	2007-03-10 17:35:37.551182393 +0200
@@ -48,7 +48,7 @@ ADDRESS *rfc822_parse_adrlist (ADDRESS *
 ADDRESS *rfc822_cpy_adr (ADDRESS *addr);
 ADDRESS *rfc822_cpy_adr_real (ADDRESS *addr);
 ADDRESS *rfc822_append (ADDRESS **a, ADDRESS *b);
-void rfc822_write_address (char *, size_t, ADDRESS *, int);
+int rfc822_write_address (char *, size_t, ADDRESS *, int);
 void rfc822_write_address_single (char *, size_t, ADDRESS *, int);
 void rfc822_free_address (ADDRESS **addr);
 void rfc822_cat (char *, size_t, const char *, const char *);

--- mutt-1.5.14/charset.c.bak	2006-05-18 20:34:07.000000000 +0300
+++ mutt-1.5.14/charset.c	2007-03-10 15:48:32.311906317 +0200
@@ -245,6 +245,11 @@ void mutt_canonical_charset (char *dest,
   char *p;
   char scratch[LONG_STRING];
 
+  if (!ascii_strcasecmp (name, "utf-8")) {
+    strfcpy (dest, name, dlen);
+    goto found_utf8;
+  }
+
   /* catch some common iso-8859-something misspellings */
   if (!ascii_strncasecmp (name, "8859", 4) && name[4] != '-')
     snprintf (scratch, sizeof (scratch), "iso-8859-%s", name +4);
@@ -267,6 +272,7 @@ void mutt_canonical_charset (char *dest,
 
   strfcpy (dest, scratch, dlen);
 
+found_utf8:
   /* for cosmetics' sake, transform to lowercase. */
   for (p = dest; *p; p++)
     *p = ascii_tolower (*p);

--- mutt-1.5.14/copy.c.bak	2007-01-30 21:49:01.000000000 +0200
+++ mutt-1.5.14/copy.c	2007-03-10 19:14:29.169711344 +0200
@@ -49,13 +49,14 @@ mutt_copy_hdr (FILE *in, FILE *out, LOFF
   int from = 0;
   int this_is_from;
   int ignore = 0;
-  char buf[STRING]; /* should be long enough to get most fields in one pass */
+  char buf[LONG_STRING]; /* should be long enough to get most fields in one pass */
   char *nl;
   LIST *t;
   char **headers;
   int hdr_count;
   int x;
   char *this_one = NULL;
+  size_t this_one_len;
   int error;
 
   if (ftello (in) != off_start)
@@ -156,15 +157,17 @@ mutt_copy_hdr (FILE *in, FILE *out, LOFF
 	{
 	  if (!address_header_decode (&this_one))
 	    rfc2047_decode (&this_one);
+	  this_one_len = mutt_strlen (this_one);
 	}
-	
+
 	if (!headers[x])
 	  headers[x] = this_one;
 	else 
 	{
-	  safe_realloc (&headers[x], mutt_strlen (headers[x]) + 
-			mutt_strlen (this_one) + sizeof (char));
-	  strcat (headers[x], this_one); /* __STRCAT_CHECKED__ */
+	  int hlen = mutt_strlen (headers[x]);
+
+	  safe_realloc (&headers[x], hlen + this_one_len + sizeof (char));
+	  strcat (headers[x] + hlen, this_one); /* __STRCAT_CHECKED__ */
 	  FREE (&this_one);
 	}
 
@@ -231,13 +234,15 @@ mutt_copy_hdr (FILE *in, FILE *out, LOFF
     if (!ignore)
     {
       dprint (2, (debugfile, "Reorder: x = %d; hdr_count = %d\n", x, hdr_count));
-      if (!this_one)
+      if (!this_one) {
 	this_one = safe_strdup (buf);
-      else
-      {
-	safe_realloc (&this_one,
-		      mutt_strlen (this_one) + mutt_strlen (buf) + sizeof (char));
-	strcat (this_one, buf); /* __STRCAT_CHECKED__ */
+	this_one_len = mutt_strlen (this_one);
+      } else {
+	int blen = mutt_strlen (buf);
+
+	safe_realloc (&this_one, this_one_len + blen + sizeof (char));
+	strcat (this_one + this_one_len, buf); /* __STRCAT_CHECKED__ */
+	this_one_len += blen;
       }
     }
   } /* while (ftello (in) < off_end) */
@@ -255,9 +260,10 @@ mutt_copy_hdr (FILE *in, FILE *out, LOFF
       headers[x] = this_one;
     else 
     {
-      safe_realloc (&headers[x], mutt_strlen (headers[x]) + 
-		    mutt_strlen (this_one) + sizeof (char));
-      strcat (headers[x], this_one); /* __STRCAT_CHECKED__ */
+      int hlen = mutt_strlen (headers[x]);
+
+      safe_realloc (&headers[x], hlen + this_one_len + sizeof (char));
+      strcat (headers[x] + hlen, this_one); /* __STRCAT_CHECKED__ */
       FREE (&this_one);
     }
 
@@ -845,22 +851,22 @@ static void format_address_header (char 
   char buf[HUGE_STRING];
   char cbuf[STRING];
   char c2buf[STRING];
-  
-  int l, linelen, buflen, count;
+  char *p;
+  int l, linelen, buflen, count, cbuflen, c2buflen, plen;
+
   linelen = mutt_strlen (*h);
+  plen = linelen;
   buflen  = linelen + 3;
-  
-  
+
   safe_realloc (h, buflen);
   for (count = 0; a; a = a->next, count++)
   {
     ADDRESS *tmp = a->next;
     a->next = NULL;
     *buf = *cbuf = *c2buf = '\0';
-    rfc822_write_address (buf, sizeof (buf), a, 0);
+    l = rfc822_write_address (buf, sizeof (buf), a, 0);
     a->next = tmp;
     
-    l = mutt_strlen (buf);
     if (count && linelen + l > 74) 
     {
       strcpy (cbuf, "\n\t");  	/* __STRCPY_CHECKED__ */
@@ -881,16 +887,22 @@ static void format_address_header (char 
       buflen++;
       strcpy (c2buf, ",");	/* __STRCPY_CHECKED__ */
     }
-    
-    buflen += l + mutt_strlen (cbuf) + mutt_strlen (c2buf);
+
+    cbuflen = mutt_strlen (cbuf);
+    c2buflen = mutt_strlen (c2buf);
+    buflen += l + cbuflen + c2buflen;
     safe_realloc (h, buflen);
-    strcat (*h, cbuf);		/* __STRCAT_CHECKED__ */
-    strcat (*h, buf);		/* __STRCAT_CHECKED__ */
-    strcat (*h, c2buf);		/* __STRCAT_CHECKED__ */
+    p = *h;
+    strcat (p + plen, cbuf);		/* __STRCAT_CHECKED__ */
+    plen += cbuflen;
+    strcat (p + plen, buf);		/* __STRCAT_CHECKED__ */
+    plen += l;
+    strcat (p + plen, c2buf);		/* __STRCAT_CHECKED__ */
+    plen += c2buflen;
   }
   
   /* Space for this was allocated in the beginning of this function. */
-  strcat (*h, "\n");		/* __STRCAT_CHECKED__ */
+  strcat (p + plen, "\n");		/* __STRCAT_CHECKED__ */
 }
 
 static int address_header_decode (char **h)


----- End forwarded message -----

Christoph
-- 
cb@df7cb.de | http://www.df7cb.de/

Changed 21 months ago by Oswald Buddenhagen <ossi@…>

talking about stupid ...
using strcat instead of strcpy when the target string is known to be
empty is not too wise, but mostly irrelevant.
it is even less wise to use strcat/strcpy instead of memcpy when the
length of the source string is known in advance. but i guess this is
relatively insignificant compared to strcat()ing when the target string
is already quite long compared to the length of the source string.

Changed 21 months ago by cb

  • status changed from new to closed
  • resolution set to fixed
----- Forwarded message from Sami Farin <safari-mutt@safari.iki.fi> -----
 talking about stupid ...
Fixed in tip, http://dev.mutt.org/hg/mutt/rev/5407ad0c2b35
Note: See TracTickets for help on using tickets.