Strong Bad Email Statistics
From Homestar Runner Wiki
(Difference between revisions)
(+2 images) |
(Added By Length section) |
||
Line 1: | Line 1: | ||
Various statistics of interest involving [[Strong Bad Email]] data. | Various statistics of interest involving [[Strong Bad Email]] data. | ||
- | == | + | ==Strong Bad Email By Length== |
- | + | This section involves data taken from the list [[Strong Bad Email By Length]]. | |
- | [[Image: | + | [[Image:SBEScatter.png|thumb|200px|left|A scatter plot of chronological number vs. length, with outliers.]] |
- | + | ||
- | [[Image: | + | |
+ | *The scatter plot shows a fairly strong positive correlation between email number and email length. With the outliers included, the correlation coefficient (r) between these two variables is .844. | ||
+ | **An r value of 1 would indicate a perfect positive correlation, and a value of -1 a perfect negative correlation. Therefore, .844 indicates a fairly strong positive correlation. | ||
+ | *This plot shows a handful of clear outliers, which are likely affecting the correlation. In the plot below, the outliers have been removed and a Least Squares Regression Line (LSRL) has been added. | ||
+ | **The outliers were defined as those emails with a residual of 40 or greater, or -40 or less. | ||
+ | |||
+ | |||
+ | |||
+ | [[Image:SBELinReg.png|thumb|200px|right|A scatter plot of chronological number vs. length, without outliers.]] | ||
+ | |||
+ | |||
+ | |||
+ | *The LSRL can be used to extrapolate, that is, to estimate the length of future emails. With the outliers removed, the r value is .946. | ||
+ | **The equation for the LSRL is y = 1.3848x + 44.831, where y is the length in seconds and x is the email number. | ||
+ | *This method of estimation is not fully reliable: the line keeps rising indefinitely, yet it is unlikely that the emails will ever be, say, 20 minutes long. The equation should not be considered a foolproof method for predicting the length of an email. |
Revision as of 08:02, 27 March 2005
Various statistics of interest involving Strong Bad Email data.
Strong Bad Email By Length
This section involves data taken from the list Strong Bad Email By Length.
- The scatter plot shows a fairly strong positive correlation between email number and email length. With the outliers included, the correlation coefficient (r) between these two variables is .844.
  - An r value of 1 would indicate a perfect positive correlation, and a value of -1 a perfect negative correlation. Therefore, .844 indicates a fairly strong positive correlation.
- This plot shows a handful of clear outliers, which are likely affecting the correlation. In the plot below, the outliers have been removed and a Least Squares Regression Line (LSRL) has been added.
  - The outliers were defined as those emails with a residual of 40 or greater, or -40 or less.
- The LSRL can be used to extrapolate, that is, to estimate the length of future emails. With the outliers removed, the r value is .946.
  - The equation for the LSRL is y = 1.3848x + 44.831, where y is the length in seconds and x is the email number.
- This method of estimation is not fully reliable: the line keeps rising indefinitely, yet it is unlikely that the emails will ever be, say, 20 minutes long. The equation should not be considered a foolproof method for predicting the length of an email.
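The steps described above (computing r, fitting the LSRL, dropping outliers by residual, and extrapolating) can be sketched in Python roughly as follows. This is only an illustration, not the method actually used to produce the plots: the function names are made up for this example, the sample data is invented and is not taken from Strong Bad Email By Length, and measuring residuals against a line fitted to the full data set is an assumption. Only the slope, intercept, and the cutoff of 40 come from the text.

```python
from math import sqrt

# Coefficients quoted in the text: y = 1.3848x + 44.831
SLOPE = 1.3848
INTERCEPT = 44.831


def predict_length(email_number):
    """Length in seconds that the quoted LSRL predicts for an email number."""
    return SLOPE * email_number + INTERCEPT


def email_number_for_length(seconds):
    """Email number at which the quoted LSRL reaches a given length."""
    return (seconds - INTERCEPT) / SLOPE


def pearson_r(xs, ys):
    """Correlation coefficient r between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the least squares regression line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def drop_outliers(xs, ys, threshold=40):
    """Keep points whose residual is strictly between -threshold and threshold.

    Assumption: residuals are measured against a line fitted to all points.
    """
    slope, intercept = least_squares(xs, ys)
    kept = [(x, y) for x, y in zip(xs, ys)
            if abs(y - (slope * x + intercept)) < threshold]
    return [x for x, _ in kept], [y for _, y in kept]


# Invented sample data (NOT the real email lengths); email 6 is a planted outlier.
sample_x = list(range(1, 11))
sample_y = [46, 48, 49, 50, 52, 150, 54, 56, 57, 59]

r_all = pearson_r(sample_x, sample_y)
kept_x, kept_y = drop_outliers(sample_x, sample_y)
r_kept = pearson_r(kept_x, kept_y)
```

Using the quoted equation, `predict_length(100)` gives about 183.3 seconds, and `email_number_for_length(1200)` shows that the line would not reach 20 minutes until roughly email 834, which illustrates why extrapolating far beyond the existing data is unreliable.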