Strong Bad Email Statistics

From Homestar Runner Wiki

Revision as of 22:03, 27 March 2005 by Joshua (Talk | contribs)


Various statistics of interest involving Strong Bad Email data.


Strong Bad Email By Length

This section involves data taken from the list Strong Bad Email By Length.

A scatter plot of chronological number vs. length, with outliers.


  • The scatter plot shows a fairly strong positive correlation between Email Number and Email Length. With the outliers included, the r value between these two variables is .844.
    • An r value of 1 would indicate a perfect positive correlation, and a value of -1 a perfect negative correlation. Therefore, .844 indicates a fairly strong positive correlation.
  • The plot also shows a handful of clear outliers, which are likely affecting the correlation. In the plot below, the outliers have been removed and a Least Squares Regression Line (LSRL) has been added.
    • The outliers were defined as those emails with a residual value of 40 or greater, or -40 or less.
A scatter plot of chronological number vs. length, without outliers.
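As a rough illustration of the steps described above, the following Python sketch computes an r value, fits a least squares line, and drops points whose residual is 40 or greater in absolute value. The function names and the sample data are hypothetical and were made up for this example; they are not the actual email lengths, and this is not necessarily how the plots on this page were produced.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the correlation coefficient r between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def drop_outliers(xs, ys, threshold=40.0):
    """Keep only points whose residual from the fitted line is strictly
    between -threshold and +threshold."""
    slope, intercept = least_squares_line(xs, ys)
    kept = [
        (x, y)
        for x, y in zip(xs, ys)
        if abs(y - (slope * x + intercept)) < threshold
    ]
    return [x for x, _ in kept], [y for _, y in kept]


# Hypothetical sample: email number vs. length in seconds (NOT real data).
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
lengths = [50, 52, 55, 54, 58, 200, 60, 62, 63, 66]

r_with_outliers = pearson_r(numbers, lengths)
clean_numbers, clean_lengths = drop_outliers(numbers, lengths)
r_without_outliers = pearson_r(clean_numbers, clean_lengths)
```

In this made-up sample, the single unusually long entry is the only point removed, and the r value rises once it is gone, which mirrors the change from .844 to .946 reported for the real data.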



  • The LSRL can be used to extrapolate, or guess, the length of future emails. With the outliers removed, the r value is .946.
    • The equation for the LSRL is y = 1.3848x + 44.831, where y is the length in seconds and x is the email number.
  • This method of guessing is not 100% accurate. The line keeps rising indefinitely, yet it is unlikely the e-mails will ever be, say, 20 minutes long. The equation should not be considered a foolproof method for guessing the length of an e-mail.
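To make the extrapolation concrete, here is a minimal Python sketch that applies the LSRL equation given above. The slope and intercept come from this page; the function names are hypothetical and chosen only for this example.

```python
# Coefficients of the LSRL from this page: y = 1.3848x + 44.831
SLOPE = 1.3848      # seconds added per email number
INTERCEPT = 44.831  # seconds


def predicted_length(email_number):
    """Predicted email length in seconds for a given email number."""
    return SLOPE * email_number + INTERCEPT


def email_number_for_length(seconds):
    """Email number at which the line reaches the given length."""
    return (seconds - INTERCEPT) / SLOPE


# Example: where would the line reach 20 minutes (1200 seconds)?
twenty_minutes_at = email_number_for_length(20 * 60)
```

By this line, a 20-minute email would not arrive until roughly email number 834, far beyond the range of the data the line was fitted to. That is why the prediction should be treated as a guess.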

Strong Bad Email By Era

This section involves data on the computer used to answer each e-mail, or the "era" of the computer. The categories are Tandy 400, Broken Tandy, Compy 386, Lappy 486, and Other (Pom Pilot and Tangerine Dreams).

"The newer, the longer"
But Compy 386 can win the stupid competition.