Talk:Strong Bad Email Statistics

From Homestar Runner Wiki

Revision as of 18:19, 7 April 2008 by Seahen (Talk | contribs)
Ding! Strong Bad Email Statistics is a featured article, which means it showcases an important part of the Homestar Runner body of work and/or highlights the fine work of this wiki. We also might just think it's cool. If you see a way this page can be updated or improved without compromising previous work, feel free to contribute.



Comments on the Progress

This page is coming together quite nicely - I like what DMurphy has created thus far - I think the "No Loafing Pie Chart" was a great choice - it adds a perfect amount of humor to this page. The descriptions of the interpretations of the data are thorough and that's grood, er..good for anyone interested in the interesting field of statistics. I just hope the average fans of Strong Bad Emails will be as enthusiastic about the statistical information as those who have a subscription to Nerdular Nerdence! Nevertheless, I think everyone (ability in mathematics notwithstanding) can learn something from the data charts. I hope to see/make a pie chart soon! --The Paper 01:49, 27 Mar 2005 (MST)

...Yep, I have no clue what you all are saying. --Color Printer 06:42, 27 Mar 2005 (MST)

Database

Anyone who wishes to help with this project but doesn't want to undertake the tedious task of data entry can request a database from The Paper or me. The database is a very user-friendly Excel file which has data on Time, Location, Name, and various other categories for each of the 126 emails. Just e-mail me and I'll be happy to send you a copy of the file. --DMurphy 15:45, 27 Mar 2005 (MST)

Wow!

DMurphy and The Paper- you guys did a great job on this! It really makes this wiki look smart. This rocks! →FireBird 23:58, 27 Mar 2005 (MST)

This page is awesome and beautiful and beautifully awesome. Aurora Szalinski 11:56, 28 Mar 2005 (MST)

My god...

You guys have WAY too much time on your hands.

Yep, they do. --Color Printer 10:56, 28 Mar 2005 (MST)

Not really... any statistics major could put this together in 30 minutes or less. The hard part is data collection, which was already done before I got the idea to start this page. I know it looks like this would take days of work, but Excel makes the project easy to complete. Anyway, it's spring break for me, and I didn't have anything planned for the night I started this, so I decided to make a new page. --DMurphy 16:23, 29 Mar 2005 (MST)

bottom 10

Why do you say bottom 10 has 2 emails? The second one was just Fwd:'s and Re:'s.

Still, after the joke, I think we were supposed to assume that an email eventually popped up. — It's dot com 05:43, 30 Aug 2005 (UTC)

Nominated for featured article

This article has been nominated for article of the week and I for one would like to see that happen — it's a really great article. One thing that needs to happen first, though, is it needs a good opening summary paragraph. I might try my hand at writing one, but it would probably sound better coming from someone who really understands all this statistical stuff. Maybe summarize why these charts were created and what we can learn from studying the emails this way. — Joey (talk·edits) 04:48, 22 Aug 2005 (UTC)

You know what? I really like this page, so when I'm back from school I'll try my hand at writing one. Elcool (talk)(contribs) 05:16, 22 Aug 2005 (UTC)
I'd go for it if a good paragraph was written for it. Joey: a better place to talk about this would be HRWiki:Featured Article Selection. —BazookaJoe 12:39, 22 Aug 2005 (UTC)
So? What do you think? Is it good enough? Elcool (talk)(contribs) 15:17, 22 Aug 2005 (UTC)
I'm sure my English teacher would think this is great, E.L. --Ookelaylay 00:55, 31 Aug 2005 (UTC)
Thanks! Elcool (talk)(contribs) 05:21, 31 Aug 2005 (UTC)
E.L. Cool, that's a great intro paragraph. I'm glad to see this on the front page. Way to go! — Joey (talk·edits) 00:38, 2 Sep 2005 (UTC)
Why, thank you! Elcool (talk)(contribs) 04:17, 2 Sep 2005 (UTC)

Intro paragraph

I think the intro paragraph takes the purpose of the page in the wrong direction. It shouldn't be worded to sound like this is a place for making predictions, because that's not what I came to this page for. I came to look at the charts. (In other words, keep the focus in the past, and not the future.) —BazookaJoe 22:55, 22 Aug 2005 (UTC)

The only part that is about making predictions is "...or how the future emails will be". If you want to change it, go right ahead. Elcool (talk)(contribs) 05:16, 23 Aug 2005 (UTC)
I like what has been added to the opening paragraph. It reads much more smoothly now and I think the users of the wiki will notice. Thanks for the active interest in keeping this article up-to-date and looking smart. I think DMurphy may have left the project, but we're certainly keeping his creation in tip-top shape. Much appreciated. =) —THE PAPER PREEEOW 00:51, 24 Aug 2005 (UTC)

Block Computer

Did you guys count the Block Computer from "Other Days"? Technically, that has 2 emails.--Martin925 23:15, 29 Aug 2005 (UTC)

We have rather "strict" (read arbitrary) guidelines meaning that first computer (or device) that Strong Bad uses in each particular email is the one we consider "used". In other words, we are aware of Block but we do not consider it one of the "others". —THE PAPER PREEEOW 05:39, 30 Aug 2005 (UTC)

Time spent with...

Perhaps a new chart - time spent with each computer? Just a product of "Percentage of emails by computer era" and "Average length of emails by computer era".

Maybe even "Percentage of time Strong Bad spends physically in front of the computer," by email number or by era, but that would mean a lot of new data collection so probably not worth the effort. --phlip TC 04:16, 7 Oct 2005 (UTC)
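
The "product" idea above can be sketched in a few lines. This is only an illustration: the function name and all of the numbers are placeholder assumptions, not figures from the article. Each era's share of emails is multiplied by its average length, and the results are normalized so they sum to 1.

```python
def time_share_by_era(email_share, avg_length_sec):
    """Return each era's share of total running time.

    email_share: fraction of all emails answered on each computer
    avg_length_sec: average email length (seconds) for each computer
    """
    # Share of emails times average length is proportional to total time.
    weighted = {era: email_share[era] * avg_length_sec[era] for era in email_share}
    total = sum(weighted.values())
    return {era: value / total for era, value in weighted.items()}


# Placeholder inputs for illustration only (not real chart data).
email_share = {"Tandy": 0.25, "Compy": 0.50, "Lappy": 0.25}
avg_length_sec = {"Tandy": 80.0, "Compy": 160.0, "Lappy": 240.0}

shares = time_share_by_era(email_share, avg_length_sec)
for era, share in shares.items():
    print(f"{era}: {share:.1%} of total time")
```

With real values from the two existing charts plugged in, the output would be the data for a "time spent with each computer" pie chart.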

Scatterplots

I don't know about you, but on the next scatterplots, I would love to have Email titles with arrows pointing to all the outliers, above and below. —BazookaJoe 03:28, 12 October 2005 (UTC)

I will certainly take this into consideration when/if I make a new scatterplot graph. Thanks for the helpful suggestion. —THE PAPER PREEEOW 03:45, 12 October 2005 (UTC)

Back

Well, I stopped in to see how things are going today. I've been quite busy with school and other things, but I do have some time at the moment. I added a general statistics section. I'm also about to upload the Time Spent With Each Computer chart someone suggested. As far as adding arrows to the scatter plot goes, it's possible, but would require adding all the labels by hand. As there are about 15 outliers, not only would it be time consuming but it also might get a bit messy. Perhaps just a label for the outliers with the highest residual and lowest residual? Those would be Vacation and Colonization, respectively. --DMurphy 03:51, 12 October 2005 (UTC)

"Unreliable" Date?

I just deleted the note "sb_email 22 was released between Vacation Postcard #5 and invisibility". It's as "reliable" as any others from LiveJournals. Why was this in the "unreliable" section in the first place? Thunderbird 02:00, 17 December 2005 (UTC)

Updates

It would be nice if this page were updated again. This, of course, looks like a lot of work. It would be nice if there was a way to streamline this process and have some scripts that would automate the creation of some of the wiki text elements and charts (nothing too fancy) for inclusion into this page. My guess is that using Excel's string function capabilities would help. Anyway, speaking of Excel, the link to the original .xls file is gone. It seems strongfans has reorganized its site but I have no idea where to even begin looking for this file. Does anyone have a copy? It would be nice if the Excel file could be kept in the wiki vaults instead of offsite, since we depend on that data for this page so much. --Stux 15:45, 28 January 2006 (UTC)

  • I attempted to upload the file a while back, but the wiki doesn't allow it. I gave a copy to The Paper and InterruptorJones a while back, so they may have maintained the DB. Otherwise, I have a DB that's not updated (I think it goes through 125ish). It really doesn't take a lot to update the DB... the hard part is making all the pretty graphs and it's such a time-sensitive page that it's tough to keep it updated. I think it would be best to just wikify the data and put it on a separate page. No way I'm up for doing that though... heh. If you want to Wikify the graphics too, be my guest, but I have absolutely no idea how to do that. Email me at dolan.murphy@gmail.com for the db. -DMurphy 23:11, 31 January 2006 (UTC)
  • Also, I'm not entirely familiar with Wiki scripting, but I'm not sure it would be able to, for example, calculate the least squares regression line or the rank of the length, so it may be best served in an excel file and it's just a matter of (a) figuring out how to host the file on the wiki or (b) finding a dedicated host that will update it after every e-mail, but who also knows how to calculate the LSRL equation (as explained in the article). I think this article could be a LOT better if we figure out how to open the database to the community. But, I know I'm tired of spending time updating graphs, so if there's someway to have wiki calculate the LinReg graphs and other things, please, be my guest. And if it's too much to update the LSRL every week, we could just base it on the first 100 data points... that should be accurate enough anyway. -DMurphy 23:30, 31 January 2006 (UTC)
    Hi DMurphy! Again, thanks for the reply to my inquiry. I have emailed you about a copy of the Excel file. As for wiki-fying some of the graphics: at the very least one of the graphics can be wikified: SBESTimeSpentChart.png doesn't need all that text on a picture, and can instead be turned into text for this section with the static graphics used in the original picture. That is probably the easy part. I seriously doubt that this wiki has dynamic code generation implemented, for security reasons. My guess is that the best bet is to have a template of sorts from which a local program (such as a Perl script) can generate the appropriate wiki code, which can then be cut and pasted onto the real page. This would at least allow on-the-fly recreation of most of the data once a new email is released. The hard part is then making those graphs. However, I do think that there are command-line programs that would let you make them, but they'd likely look different from the Excel versions seen now. This would also flood the file history with new uploads (not necessarily a bad thing). It might be more feasible if this graph is updated every 1 or 2 months based on SBE frequency. Regardless, some local scripting would make the job a lot easier. Once I get the Excel files I'll play around with things and see what I can get. Thanks! --Stux 20:21, 1 February 2006 (UTC)
Sorry for the late reply. I will get the excel database to you as quickly as possible. As for the link, I will try to get it working in the next day or so. —THE PAPER PREEEOW 21:39, 6 February 2006 (UTC)
Ok cool! Let me know when it's ready. Also, have you got a chance to check out issues regarding my most recent problem with strongfans? --Stux 16:20, 8 February 2006 (UTC)
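
For anyone wondering what calculating the LSRL outside of Excel would involve, here is a minimal sketch of the standard least squares formulas, with email number as x and length as y. The function names and the five data points are made up for illustration and are not taken from the database.

```python
def least_squares_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of cross deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Observed length minus the length predicted by the fitted line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical data: email number vs. length in seconds.
email_numbers = [1, 2, 3, 4, 5]
lengths_sec = [60, 75, 70, 95, 100]

slope, intercept = least_squares_line(email_numbers, lengths_sec)
res = residuals(email_numbers, lengths_sec, slope, intercept)
print(f"length = {slope:.2f} * number + {intercept:.2f}")
print("residuals:", res)
```

The residuals list is also what one would sort to find the outliers with the highest and lowest residual, as discussed in the scatterplot thread above.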

Excel file

The current link to the Excel file is broken. It would be nice if we could a) enable uploads for .xls files so anyone could update the Excel file and re-upload it at will, and b) include all charts (and necessary data) inside the Excel file, so the person updating the data can also update the charts on this page. I'm working on an improved Excel file, but will need some help with the fancier charts.

Edit: Boo on me for not reading the above discussion before posting this. Still, I'd like to know if anybody's interested in my suggestions. — InterruptorJones 16:40, 17 April 2006 (UTC)

The link should once again be operational. I am still not sure why the file was deleted from the server in the first place. It does need to be updated as the latest email to be recorded on the file is #127 (best thing). My apologies to Stux et al. who were interested in updating the file but due to my procrastination were unable. —THE PAPER PREEEOW 11:15, 20 April 2006 (UTC)

Automation

It seems like all of these charts and graphs are produced by hand and then converted into PNGs. That seems like a really tedious process. Have you guys ever considered using some automated chart generator where you just have a stored set of data that is grabbed to dynamically generate charts and graphs? And based on a couple of Google searches, it seems like many different software packages to do this are available (and many are freeware). (Like this one, for instance.) Just a thought. --Soapergem 04:25, 23 April 2006 (UTC)

Shim Sham?

Since another edit brought up the topic of emails-answered-by-wiki-users, I checked the others. After a long, long search, my research on time capsule led to this conversation, which seems to trump the "assume good faith" re-revert which led to its inclusion up to now. —AbdiViklas 06:16, 2 May 2006 (UTC)

Well, unfortunately, due to the mass amounts of people who have claimed to have had Strong Bad answer one of their e-mails, I don't think that assuming good faith is enough in this case or in any other Strong-Bad-answered-my-e-mail cases (not that I agree, I'm just stating the facts). I could be wrong, but judging by Dot com's edit on the link you provided, I don't think so. This post is far too negative. It needs a Homsar quote.
DaAaAaAaAaA! "Hi, Wonder Mike! I'm Homsar!" - Super Sam 11:12, 14 May 2006 (UTC)

Data as wiki table

Thanks to a tool I found on Wikipedia:Wikipedia:Tools/Editing_tools#mwpush I have a wiki-syntax table of the raw data. Is there any interest in having a page for it here? — User:ACupOfCoffee@ 06:30, 7 July 2006 (UTC)
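
As a rough idea of what such a conversion does (this is not the mwpush tool itself, and the column names and rows are placeholders), a few lines of Python can turn rows of data into MediaWiki table syntax that can be pasted into a page:

```python
def to_wiki_table(headers, rows):
    """Build MediaWiki table markup from a header list and a list of rows."""
    lines = ['{| class="wikitable"']
    # Header cells on one line are separated by "!!".
    lines.append("! " + " !! ".join(headers))
    for row in rows:
        lines.append("|-")  # starts a new table row
        # Data cells on one line are separated by "||".
        lines.append("| " + " || ".join(str(cell) for cell in row))
    lines.append("|}")
    return "\n".join(lines)


# Placeholder rows for illustration only; the real data lives in the spreadsheet.
headers = ["Number", "Name", "Time"]
rows = [
    (1, "Example email A", "1:05"),
    (2, "Example email B", "1:12"),
]

wiki_text = to_wiki_table(headers, rows)
print(wiki_text)
```

Reading the rows from a CSV export of the spreadsheet instead of a hard-coded list would be the natural next step.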

Regular updating

Okay, I'll get right to the point: this page needs to be updated. Badly. It should also be updated every few emails, maybe every 10 or so. To do that, we would need a team of professionals to add to the Excel sheets and make new charts for each update. I have little to no experience in statistics, and will do what I can, but I will need help. — SamFisher (Come in, Lambert.) 16:13, 30 June 2007 (UTC)

The spreadsheet's only three emails behind. The scatter plot goes with the spreadsheet, but yeah, the rest of the graphs could use some updating. Who's better than bad at graphic design? — User:ACupOfCoffee@ 18:00, 30 June 2007 (UTC)

Strong Bad checking more than one email

The following was copied from Talk:Strong Bad checking more than one email when it was decided to merge the content of that article with this one:

Merge

This page is a more wordy duplicate of Strong Bad Email Statistics#Strong Bad Emails Featuring More Than One Email. It should be redirected and its content merged. Elcool (talk)(contribs) 15:58, 22 April 2007 (UTC)
This page should not be merged. We can just merge the Strong Bad Email Statistics#Strong Bad Emails Featuring More Than One Email onto this page. User talk:Sam the Man Sam the Man
Replace "checking" with "featuring" and capitalise all the words. We can provide a link to this article on Strong Bad Email Statistics. – The Chort 17:46, 22 April 2007 (UTC)
I agree with Elcool: merge and redirect. kai lyn 18:02, 22 April 2007 (UTC)
Knowing that we already have this info on the Statistics page, I have to agree with the merge. Has Matt? (talk) 18:06, 22 April 2007 (UTC)
I agree with The Chort. Maybe have a short segment and one of those "Main article" things... --Super Martyo boing! 02:07, 23 April 2007 (UTC)
Agree con El Chorto and Super Martyo Hermano. --DorianGray 02:11, 23 April 2007 (UTC)
Delete Merge and redirect for the exact same reason I voted to delete that article about Hrwiki people who got their emails on Strong Bad Email: This info can already be found on Strong Bad Email Statistics. Bad Bad Guy 23:55, 10 May 2007 (UTC)
I just updated that statistics section. Now it's even safer to redirect, imho. Bad Bad Guy 20:08, 23 May 2007 (UTC)
I vote for merge and redirect. DNA Evidence (Talk | contribs) 20:14, 23 May 2007 (UTC) (left unsigned)
Merge and redirect. Trey56 03:03, 9 June 2007 (UTC)
yeah yah, totally, yah Slipknot6477 02:30, 19 July 2007 (UTC)

Wi2k Bug

Could someone present me with the evidence that Wi2k Bug wrote the paper? Bad Bad Guy 20:26, 29 July 2007 (UTC)

"Substantiated Claims"

I'm assuming "substantiated claims" are those with enough evidence to make it more reasonable than not to assume the editor was the author of the email. It does not imply proof or certainty. Does anyone disagree with my take on this? Qermaq - (T/C) 02:22, 31 July 2007 (UTC)

Just thought I'd revive this with something to add. I believe this section (and all remarks related to it in the individual email pages) should be deleted, as proof can be highly questionable and easy to fake (not accusing anybody here, but it's possible). Also, having these people doesn't add to the encyclopedic content of the page and really just serves to give the users listed bragging rights. - Super Sam 15:13, 23 November 2007 (UTC)

Weird Facts

I did some research, and as of now (no differentiation from other computers; just Tandy, Compy, and Lappy):

Tandy Total: 53 min 33 sec Tandy Average: 1 min 20.325 seconds

Compy Total: 3 hours 22 min 37 sec Compy Average: 2 min 35.859 sec

Lappy Total: 4 hours 43 min 17 sec Lappy Average: 3 min 52.836 sec (/\ .03 sec)

This includes buried.

SBEmailnum314159: Some time, Some day and month 20XX.

Edited: Sbemailnum314159 20:10, 30 January 2008 (UTC), 22:26, 4 February 2008 (UTC),22:14, 4 March 2008 (UTC)
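
As a quick sanity check on the figures above, dividing each total by its average should give a whole number of emails. The counts that fall out of this sketch are inferred from the posted totals and averages; they are not stated anywhere in the post, and the helper name is just for illustration.

```python
def to_seconds(hours=0, minutes=0, seconds=0.0):
    """Convert an hours/minutes/seconds duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# (total running time, average length), both in seconds, as posted above.
figures = {
    "Tandy": (to_seconds(0, 53, 33), to_seconds(0, 1, 20.325)),
    "Compy": (to_seconds(3, 22, 37), to_seconds(0, 2, 35.859)),
    "Lappy": (to_seconds(4, 43, 17), to_seconds(0, 3, 52.836)),
}

# total / average = number of emails implied by the two figures.
implied = {name: total / avg for name, (total, avg) in figures.items()}

for name, count in implied.items():
    print(f"{name}: total/average = {count:.4f} (about {round(count)} emails)")
```

Each ratio lands within a rounding error of a whole number, so the totals and averages are consistent with each other.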

Length vs. date

Can we please get a scatter plot of e-mail lengths with date, rather than issue number, as the independent (x-axis) variable? I think that would be more useful as a predictor of how long future e-mails will be. Seahen 18:16, 7 April 2008 (UTC)
