why am i doing this
when i worked in an office that had a fantasy football league, i would only draft players who were smiling (and who walker recommended ┐(´ー`)┌). i have no idea where those photos on espn were from or under what context they were taken, but my assumption was that happy players would play better. i think i generally did okay the 3 or 4 years i played in my work league, but i also don't really care about football so i didn't pay much attention. anyway, i did most of a statistics MS so i guess it's time to revisit this question.
how did i do this
i wanted to do this with 2016 data since i was in an ff league that year, but espn updates the player photos every year and i don't know where the old ones go, so 2019 it is.
i used the ffanalytics R package to scrape CBS's 2019 fantasy data. i was really only interested in the season point totals for each player. during QA, my husband noticed that this was not actually 2019 data: regardless of the year input, the package was only scraping 2020's projections. so, uh, that was not helpful, but i did get to spend a bunch of time troubleshooting why R refused to install packages from github, so that counts as experience, right?
then i googled "nfl 2019 fantasy csv" and found fantasy football data pros' datasets. thanks y'all! i used their weekly 2019 data to compute season point totals for all players. this was surprisingly annoying because i always forget that data is always bad. i might put my R file on github, but there's nothing exciting in it, just distressed comments about how there are two different players named ryan griffin.
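the actual aggregation was done in R, but the idea is simple enough to sketch in python. everything here is made up for illustration (the column names, the toy rows); the one real gotcha from the data is baked in: if you total points by player name alone, the two ryan griffins get merged into one suspiciously versatile player.

```python
import pandas as pd

# hypothetical weekly data in roughly the shape of the ffdp csvs:
# one row per player per week (column names are assumptions, not the real schema)
weekly = pd.DataFrame({
    "player":   ["ryan griffin", "ryan griffin", "some receiver", "some receiver"],
    "position": ["QB", "TE", "WR", "WR"],
    "week":     [1, 1, 1, 2],
    "points":   [3.1, 7.4, 12.0, 8.5],
})

# grouping on name alone silently merges the two ryan griffins,
# so group on (player, position) -- or better, a stable player id if you have one
season = (weekly
          .groupby(["player", "position"], as_index=False)
          .agg(total_points=("points", "sum"),
               games=("week", "nunique")))
print(season)
```

grouping on (player, position) only works because the two ryan griffins happen to play different positions; a real player id column is the sturdier fix.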
then i manually determined if each player was smiling or not by going to espn's website and looking at their pictures with my eyes!!! this was pretty time consuming because i had 623 players' faces to examine, so about 60% of the way through, i removed the players from my data who had played fewer than 4 games.
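once the smile labels exist, attaching them and dropping the low-game players is a one-liner. again this is a python sketch with invented players and labels, not the real R code:

```python
import pandas as pd

# season totals from the previous step (values made up for illustration)
season = pd.DataFrame({
    "player":       ["a", "b", "c", "d"],
    "total_points": [150.0, 12.0, 98.0, 60.0],
    "games":        [16, 2, 10, 7],
})

# hand-coded smile labels (1 = smiling in the espn photo, 0 = not)
smiles = pd.DataFrame({
    "player":  ["a", "b", "c", "d"],
    "smiling": [1, 0, 1, 0],
})

# drop players with fewer than 4 games, then attach the labels
kept = season[season["games"] >= 4].merge(smiles, on="player", how="left")
print(kept)
```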
since i had a single categorical predictor (smiling or not) and a single continuous outcome (total season points), i ran an anova. i mean, i checked the normality of both groups with histograms and shapiro-wilk tests, ran an f test for homogeneity of variances, and THEN i ran the anova. i could've run a simple linear regression instead, since that ends up being the same thing as anova in this case, aaand since the predictor is binary, a two sample t-test (with equal variances assumed) would also give the same result.
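here's that whole checklist as a python/scipy sketch, on made-up data (the real analysis was in R, and these simulated numbers are not the actual point totals). scipy doesn't ship an exact equivalent of R's `var.test`, so the variance-ratio f test is built by hand. the last two lines also demonstrate the claim above: with two groups, the pooled t-test and the one-way anova are the same test (t² = F, identical p-values).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# made-up stand-ins for the two groups' season point totals
smiling = rng.gamma(shape=1.5, scale=70, size=273)
not_smiling = rng.gamma(shape=1.4, scale=68, size=136)

# 1. normality check per group (shapiro-wilk)
for name, g in [("smiling", smiling), ("not smiling", not_smiling)]:
    w, p = stats.shapiro(g)
    print(f"{name}: shapiro W={w:.3f}, p={p:.4f}")

# 2. f test for equal variances (hand-rolled, two-sided like R's var.test)
f = smiling.var(ddof=1) / not_smiling.var(ddof=1)
df1, df2 = len(smiling) - 1, len(not_smiling) - 1
p_var = 2 * min(stats.f.cdf(f, df1, df2), stats.f.sf(f, df1, df2))
print(f"variance ratio F={f:.3f}, p={p_var:.4f}")

# 3. one-way anova and the equivalent pooled two-sample t-test
f_anova, p_anova = stats.f_oneway(smiling, not_smiling)
t, p_t = stats.ttest_ind(smiling, not_smiling, equal_var=True)
# with exactly two groups, t^2 equals F and the p-values agree
print(f"anova F={f_anova:.3f} p={p_anova:.4f}; t^2={t**2:.3f} p={p_t:.4f}")
```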
what did i find out
summary

facial expression | n | %
---|---|---
smiling | 273 | 67%
not smiling | 136 | 33%
total | 409 | 100%

season fantasy points | mean | std dev
---|---|---
smiling | 111 | 89.8
not smiling | 95 | 78.4
total | 106 | 86.5
the groups were definitely not normally distributed, but anova is fairly robust to non-normality so i went ahead anyway. the f test comparing the variances of season fantasy points between the two facial expression groups was not significant, so i didn't have evidence that the variances differed. i ran a one way anova assuming equal variances, and it gave me a p-value of 0.06, so like barely not significant?
you can see the mean season fantasy points for the smiling group was higher by about 16 points, but we don't have enough evidence to conclude that this difference is statistically significant. we probably can say that this method of player selection won't hurt your ff league performance, and really, isn't it nicer to see a roster full of smiling faces every time you log into your fantasy account?
what i still want to try