DFS Lineup Mean Variance Optimization
by Robert Del Vicario
DFS sites with optimizers seem to be popping up left, right and center. However, what many players want isn't simply an optimized team, but rather a team that will generate an expected number of fantasy points for a given level of risk.
Fortunately, that is a problem that has been solved (to some extent) in finance. Mean variance optimization, developed by Harry Markowitz, allows an investor to maximize a portfolio's return for a given level of risk, or minimize a portfolio's risk for a given level of return. For many players, these are exactly the types of lineups they would like to optimize for. In a 50/50 you might want to construct a team that hits a target expected FP total while minimizing risk, whereas in a GPP you might actually want to maximize risk in an attempt to generate high-variance teams that give you a chance to place well.
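In DFS terms, the minimum-risk version of this problem can be written roughly as follows. The symbols are shorthand introduced here for illustration: $x_i = 1$ if player $i$ is in the lineup, $\Sigma$ is the player covariance matrix, $\mu$ the projected fantasy points, $s$ the salaries, $B$ the salary cap, $R$ a target points total, $p_k$ a 0/1 indicator vector for position $k$, and $r_k$ the number of roster slots at that position.

$$
\begin{aligned}
\min_{x \in \{0,1\}^n} \quad & x^\top \Sigma x \\
\text{subject to} \quad & \mu^\top x \ge R, \\
& s^\top x \le B, \\
& p_k^\top x = r_k \quad \text{for each position } k.
\end{aligned}
$$

The objective is the variance of the lineup's total score, since $\operatorname{Var}\left(\sum_i x_i P_i\right) = \sum_i \sum_j x_i x_j \Sigma_{ij}$ when $P_i$ is player $i$'s fantasy points. For a GPP-style lineup, the same constraints would apply with the objective maximized instead of minimized.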
So what do we need to accomplish this?
Player fantasy points projections
Player positions to set constraints and build our team
Player salaries to set constraints against
Player covariance matrix to understand how player performances covary
The hard part here is the player covariance matrix, and there are a few options for constructing it. The first is that each player covaries with every other player (unlikely), the second is that each player only covaries with the players he directly plays against (likely but hard to model), and the third is that each player covaries with his teammates and the opposing team (likely and easier to model). With this bit of knowledge we can construct a covariance matrix for the players on a given night. Players who aren't in the same game are expected to have a covariance of 0.
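As a rough illustration of the third option, here is a minimal sketch in Julia. Everything in it is an assumption made for the example: the function name `build_covar`, the per-player standard deviations, and the two correlation values (`rho_team` for teammates, `rho_opp` for opponents) are placeholders, not estimates from real data.

```julia
# Sketch: build a covariance matrix where a player covaries only with
# teammates and with players on the opposing team; all other pairs get 0.
# rho_team and rho_opp are hypothetical placeholder correlations.
function build_covar(team, opp, sd; rho_team = 0.1, rho_opp = -0.05)
    n = length(team)
    covar = zeros(n, n)
    for i in 1:n, j in 1:n
        if i == j
            covar[i, j] = sd[i]^2                    # own variance
        elseif team[i] == team[j]
            covar[i, j] = rho_team * sd[i] * sd[j]   # teammates
        elseif team[i] == opp[j]
            covar[i, j] = rho_opp * sd[i] * sd[j]    # opponents in the same game
        end
        # players in different games keep a covariance of 0
    end
    return covar
end

# Toy slate: two BOS players facing one NYK player, plus a LAL player
# whose opponent has nobody in the player pool.
team = ["BOS", "BOS", "NYK", "LAL"]
opp  = ["NYK", "NYK", "BOS", "GSW"]
sd   = [8.0, 6.0, 7.0, 9.0]   # made-up standard deviations of fantasy points

covar = build_covar(team, opp, sd)
println(covar)
```

In practice the correlations would be estimated from historical game logs rather than fixed by hand, and a matrix assembled this way is not guaranteed to be positive semidefinite, so it is worth checking before feeding it to a solver.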
Once we have created the above pieces we can begin the optimization process. I coded up the following in JuMP using IBM's CPLEX solver.
    using DataFrames, JuMP, Gadfly, AmplNLWriter, Cbc, CPLEX

    cd("C:\\Users\\blahblah\\Desktop\\Projects\\140427 IJulia Notebooks\\151119 Mean Variance Portfolio Optimization")

    # read in data for optimization
    covar = readcsv("covar.csv")    # player covariance matrix
    c     = readcsv("c.csv")        # position indicator columns, one row per player
    pg    = readcsv("pg.csv")
    pf    = readcsv("pf.csv")
    sf    = readcsv("sf.csv")
    sg    = readcsv("sg.csv")
    proj  = readcsv("proj_pts.csv") # projected points (loaded but not used in the objective below)
    sal   = readcsv("sal.csv")      # player salaries

    n = size(pf, 1)                 # number of players

    #m = Model(solver=CouenneNLSolver())
    #m = Model(solver=CbcSolver())
    m = Model(solver = CplexSolver())

    # x[i] = 1 if player i is in the lineup, 0 otherwise
    @defVar(m, x[1:n], Bin)

    # roster slots: 1 C, 2 PG, 2 PF, 2 SG, 2 SF
    @addConstraint(m, dot(x, c[1:n,1]) == 1)
    @addConstraint(m, dot(x, pg[1:n,1]) == 2)
    @addConstraint(m, dot(x, pf[1:n,1]) == 2)
    @addConstraint(m, dot(x, sg[1:n,1]) == 2)
    @addConstraint(m, dot(x, sf[1:n,1]) == 2)

    # salary cap
    @addConstraint(m, dot(x, sal[1:n,1]) <= 125)

    # minimize lineup variance: sum of covar[i,j] over all selected pairs
    @setObjective(m, Min, sum{covar[i,j] * x[i] * x[j], i = 1:n, j = 1:n})

    status = solve(m)
    println("Objective value: ", getObjectiveValue(m))
The upside is that the above runs and will solve the problem (I assume, at some point in time). However, I left it running on my box for 4 hours at 100% CPU utilization and it did little more than chew up 36 GB of RAM and run up my electrical bill. The issue lies in the binary constraint, which turns this optimization problem from an easy model to solve into a very hard one: with a quadratic objective and binary variables it becomes a mixed-integer quadratic program.
I don't think all is lost. Previously I posted some code that lets folks generate numerous lineups very quickly (~1k in 6 seconds) using a linear solver. What we can do is take the lineups output by the linear solver and calculate the risk of each one pretty quickly in R. This would allow us to create a large set of lineups while assessing the risk associated with each. One could then simply pick the best lineup for their needs (e.g., high risk for a GPP or low risk for a 50/50).
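The risk calculation itself is small. The sketch below shows the idea in Julia rather than R, to match the code above; it is not the lineup generator mentioned here, and the function names, the 4-player covariance matrix, the projections, and the 2-player "lineups" are all made up for illustration.

```julia
# Variance of a lineup's total fantasy points: the sum of covar[i, j]
# over every ordered pair of players (i, j) in the lineup. This equals
# x' * covar * x for the lineup's 0/1 selection vector x.
function lineup_variance(covar, lineup)
    total = 0.0
    for i in lineup, j in lineup
        total += covar[i, j]
    end
    return total
end

# Projected points of a lineup: the sum of its players' projections.
function lineup_projection(proj, lineup)
    total = 0.0
    for i in lineup
        total += proj[i]
    end
    return total
end

# Toy inputs (hypothetical numbers, not real player data).
covar = [64.0  4.8 -2.8  0.0;
          4.8 36.0 -2.1  0.0;
         -2.8 -2.1 49.0  0.0;
          0.0  0.0  0.0 81.0]
proj = [40.0, 30.0, 35.0, 45.0]

# Each lineup is a list of player indices, e.g. as produced by a linear solver.
lineups = [[1, 2], [1, 3], [3, 4]]

variances   = [lineup_variance(covar, l) for l in lineups]
projections = [lineup_projection(proj, l) for l in lineups]

# Rank lineups from lowest to highest risk.
order = sortperm(variances)
for k in order
    println(lineups[k], "  proj = ", projections[k], "  var = ", variances[k])
end
```

Reading the ranking from the top gives candidates for a 50/50, while reading it from the bottom gives the high-variance lineups a GPP player might prefer.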
Next time around I'll look to post some code for creating the lineups and assessing lineup risk.











