
## Revision as of 05:40, 20 September 2008

No Loafing!

With more and more Strong Bad Emails released on homestarrunner.com, it is hard to keep track of all the statistics, such as which computer was used the most, or how the length of emails has increased throughout the years. To correctly calculate those numbers, a few charts and graphs have been made for the ease of the people who like to know everything about Strong Bad and his emails.

## Strong Bad Email by length

This section involves data taken from the list Strong Bad Email By Length.

A scatter plot of chronological number vs. length, with outliers.
A scatter plot of chronological number vs. length, without outliers.

The scatter plot to the left illustrates the relationship between each e-mail's number and its length (the red points show the original length, and the black points include the Easter eggs). This relationship can be modeled with a power regression curve, which describes the trend in e-mail duration and can be used to predict the length of future e-mails. The R^2 value for each curve measures how closely the points follow the model: a value of 1 means that the model fits the data exactly, while a value of 0 means that the model explains none of the variation in the data. The equation for the black curve is y = 0.0002513x^0.4746 and the equation for the red curve is y = 0.0002467x^0.4593.
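
A power regression of the form y = a·x^b is commonly fitted by taking the logarithm of both variables and running an ordinary least-squares line through the result. The sketch below illustrates that approach only; it is not necessarily how the curves above were produced. The sample data are made up, the function names `fit_power` and `r_squared` are hypothetical, and computing R^2 in log space is an assumption.

```python
import math


def fit_power(xs, ys):
    """Fit y = a * x**b by least squares on (ln x, ln y). Hypothetical helper."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    s_xx = sum((u - mean_x) ** 2 for u in log_x)
    s_xy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = s_xy / s_xx                      # slope of the log-log line = exponent
    a = math.exp(mean_y - b * mean_x)    # intercept of the log-log line = ln(a)
    return a, b


def r_squared(xs, ys, a, b):
    """Coefficient of determination, 1 - SS_res / SS_tot, in log space (assumption)."""
    log_y = [math.log(y) for y in ys]
    predicted = [math.log(a) + b * math.log(x) for x in xs]
    mean_y = sum(log_y) / len(log_y)
    ss_res = sum((v - p) ** 2 for v, p in zip(log_y, predicted))
    ss_tot = sum((v - mean_y) ** 2 for v in log_y)
    return 1 - ss_res / ss_tot


# Made-up sample that follows y = 2 * x**0.5 exactly, so the fit should
# recover a = 2 and b = 0.5 with R^2 equal to 1 (up to rounding).
sample_x = [1, 10, 50, 100, 200]
sample_y = [2 * x ** 0.5 for x in sample_x]
a, b = fit_power(sample_x, sample_y)
r2 = r_squared(sample_x, sample_y, a, b)
print(a, b, r2)
```

Real data scatter around the curve, so R^2 falls below 1; the closer it is to 1, the more of the variation the curve accounts for.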

Certain e-mails, however, are much longer than the e-mails surrounding them; these are called outliers. Outliers can distort the model, and removing them can improve its accuracy. The graph on the right has the outliers removed, which improves the R^2 value for both curves. The black curve's equation becomes y = 0.0002512x^0.4717 and the red curve's equation becomes y = 0.0002466x^0.4565. These models are by no means a guaranteed prediction; e-mail 500, for example, is unlikely to be over six and a half minutes long, as this model predicts.
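
The prediction for e-mail 500 can be checked by substituting x = 500 into the outlier-free equations. The sketch below assumes that y is expressed as a fraction of a day (the way spreadsheets store time values), which this page does not state. Under that assumption the black curve gives a little under seven minutes, consistent with the "over six and a half minutes" figure. The name `predicted_minutes` is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def predicted_minutes(email_number, a, b):
    """Evaluate y = a * x**b and convert from days to minutes.

    Treating y as a fraction of a day is an assumption, not something
    the page confirms.
    """
    return a * email_number ** b * MINUTES_PER_DAY


# Coefficients of the outlier-free curves quoted above.
black_minutes = predicted_minutes(500, 0.0002512, 0.4717)  # includes Easter eggs
red_minutes = predicted_minutes(500, 0.0002466, 0.4565)    # original length

print(f"black curve: {black_minutes:.2f} minutes")
print(f"red curve:   {red_minutes:.2f} minutes")
```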

In the email theme song, Strong Bad tells his viewers that each email is about 3 to 5 minutes long.

## Strong Bad Email by era

The Compy 386 has thus far been the longest era.

This section involves data on the computer used to answer each email, or the "era" of the computer. The categories are Tandy 400, Broken Tandy, Compy 386, Lappy 486, and Other (Pom Pilot, Tangerine Dreams, and Corpy NT6).

## Total time spent using each computer

Total length of Strong Bad Emails per computer

This section involves data taken from the Strong Bad Email by Length page. The chart recognizes four categories of computers: Tandy 400 (includes Broken Tandy), Lappy 486, Compy 386, and Other.

## Strong Bad Emails featuring more than one email

Several Strong Bad Emails feature more than one email.

• credit card — After checking his email, Strong Bad sends an email to, and gets a reply from, Homestar Runner.
• spring cleaning — Strong Bad checks five emails and promptly deletes each one.
• sisters — Strong Bad accidentally deletes the first email he gets and later receives a poorly written one.
• 50 emails — Strong Bad checks two emails (and begins to check another before Homestar Runner arrives and "answers" another two).
• huttah! — The first five emails Strong Bad checks all show particular interest in The Cheat (he deletes them). The last two are both directed to Strong Bad, but he attempts to fool The Cheat into thinking they're for him.
• personal favorites — In addition to the main email, Strong Bad is seen answering 4 emails in fake flashbacks (two in the main toon, two in Easter eggs).
• 2 emails — Jimmy suggests Strong Bad check two emails a week and he does. He can also be seen checking a third email during the fast-forwarding.
• cheatday — After Strong Bad checks his email, he lets The Cheat check another three emails.
• other days — In addition to the main email, Strong Bad answers a Polish email (and a snail mail).
• bottom 10 — Strong Bad receives an email with large numbers of "Fwd:" and "Re:" in the subject line, as an example of #8 on his bottom 10.
• theme song — Strong Bad can be seen answering an email in one of the theme song montages.
• retirement — Strong Bad answers an email on each of his first two computers.
• the chair — Strong Bad answers a second email, but his new chair obstructs almost the entire screen of the Lappy while he does so.

## Intervals between Strong Bad Emails

A column graph showing the intervals between Strong Bad Emails.

Strong Bad Emails are released at varying frequencies. The graph on the right shows the number of days between each email's release and the previous one. Here is a summary of the data:

• Mean: 11.45 days
• Median: 7 days
• Mode: 7 days — 69 emails came out 7 days after the previous release
• Minimum: 1 day — There was 1 day between the releases of retirement A and retirement B.
• Maximum: 71 days — There were 71 days between the releases of halloweener and brianrietta.

Note: Data are not complete. Reliable dates are not available for homsar, butt IQ, homestar hair, making out, and depressio. These data are as of October 4, 2006.
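
A summary like the one above can be derived from a list of release dates by taking the difference between each consecutive pair. The sketch below shows one way to do that with Python's standard library; the dates are made up for illustration and are not the actual release dates.

```python
from datetime import date
from statistics import mean, median, mode

# Hypothetical release dates, in chronological order (not real data).
release_dates = [
    date(2006, 1, 2),
    date(2006, 1, 9),
    date(2006, 1, 16),
    date(2006, 2, 6),
    date(2006, 2, 7),
]

# Days between each release and the one before it.
intervals = [
    (later - earlier).days
    for earlier, later in zip(release_dates, release_dates[1:])
]

summary = {
    "mean": mean(intervals),
    "median": median(intervals),
    "mode": mode(intervals),
    "minimum": min(intervals),
    "maximum": max(intervals),
}
print(intervals)
print(summary)
```

Emails without a reliable release date would have to be left out of `release_dates`, and each such gap merges two real intervals into one longer one, which pushes the mean and maximum upward.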

## Other information

• 41% of all emails have no location given; 2% have no return sender at all.
• The average email is 1.81 sentences long; the average cartoon is 152.82 seconds long.
• The Brothers Chaps most frequently choose emails with sender names starting with J or S. Together these senders make up a whopping 26.75% of all emails. This may be an indicator of popular names in the world, not an indication of TBC preference.
• Only 18% of all emails are longer than two sentences. Only one email longer than four sentences has ever been used.
• There are eight substantiated claims of Strong Bad answering an email from a HRWiki or HRWiki forum user. These emails are montage (sent by Porplemontage), animal (sent by Kerrek slaya), portrait (sent by NachoMan), space program (sent by Ryan Sturmer), cliffhangers (sent by Cessna Man!), underlings (sent by PlasticDiverGuy), the paper (sent by Thewi2kbug), and slumber party (sent by ThomasO).