Game Spotlight - Blood Bowl (Part II)
*rides back with big bag of explanations*
Discussing the rather excellent Games Workshop game Blood Bowl, in the form of a contrived fictional narrative. I figured it was a more accessible format than dry section headings.
I'd already outlined the search structure I was planning on using for a hypothetical Blood Bowl AI (multi-agent distributed system solving unit subproblems, coordinated by a "team agent"), and now I've come back to fill in the admittedly gaping hole in my argument - namely, the magic evaluation function, which will return a single number representing how "good" a given game state is for the active player.
Yeah, that was super handwavey. You do realise that any search problem in AI can be solved by a sufficiently accurate evaluation function, right? All you do is say all states on the victory path are worth "1" and all states off the path are worth "0". It doesn't mean anything.
True - which is why I'm now going to outline an actual implementation.
The trick with evaluation functions is making them general enough that they're easy to compute while making them specific enough that your chosen search process can use them in place of searching a subgame tree. They're meant to be rough calculations that approximate how good a given state is.
There are two main ways to come up with these:
Learning
Given enough examples of good and bad gamestates (supervised learning) or enough playthroughs to completion (unsupervised learning), it's possible to learn an evaluation function. The easiest way to do this is to have your scoring algorithm already designed and just use the learning process to tighten weights on various components, but this requires you to design the scoring algorithm (a fair amount of work in itself). Alternatively, you can throw the problem at a more general algorithmic framework (say, a neural net), and use the learning process to adjust the computation as well as the weights - however, this process is very slow, and gets exponentially slower the more "resolution" you require from the function. For Blood Bowl, I'm guessing we'd require a lot.
Learning processes are also problematic in that it's usually very difficult to discern why they think a certain state is good - the evaluation function generated is typically quite opaque and relatively immune to tweaking. If after all this brute force offline computation you come up with something moronic, it's going to be hard to turn that into something good without restarting the entire process.
As an analogy, it might be helpful to consider biological evolution. It's a very powerful unguided optimisation process that can come up with really sophisticated organisms (most humans), but it takes millions of years to do so for a specific environmental niche (and billions of years from the dawn of life to where we are today). Learning algorithms are a step above this, but they're still slow. It would be helpful if there were an alternative...
Intelligent Design
Automated learning is attractive because (in theory) it saves you having to think about how an evaluation function should work - you barely even have to know anything about the domain you want it to evaluate. The obvious alternative is to use expert knowledge of the domain to come up with a good evaluation function yourself - this is both considerably more fun and considerably more illuminating, as you can see exactly why certain choices were made and adjust them on the fly dependant on performance. If we're ingenious enough, it should be faster to create a good function than to evolve one.
Because "point a learning algorithm at it" is almost as much of a non-answer as saying there's a "magic evaluation function", I'm going to try the latter approach.
So what should this "intelligently designed evaluation function" look like?
Crafting a good function for a complex game such as this is not going to be easy. To do it properly would require some combination of mathematical analysis and trial-and-error iterative improvement. I don't have the time or inclination to do either right now, so what follows is where I'd start.
(I should also warn you that I don't really have space to explain the rules of the game in detail, so I'll try to write in more general terms - it's the methods I'm using that are important. If you want to refer to the rules anyway, they're available here.)
I'll begin by observing that the problem isn't simple enough for us to pull a fully formed function out of thin air - so we should do what we always do in this situation, and decompose the problem into a number of subproblems, solve those, then recombine the answers in a way that seems sensible (perhaps as a linear combination, or possibly something more involved).
For Blood Bowl, the first decomposition that naturally presents itself is to consider the score of a single team rather than both teams at once. The overall evaluation function would then be the "team score" for the active player minus the "team score" for the opponent. Due to the symmetry of the game, we only have to construct one such function (which is team-independent), and we can explicitly ignore things which are good for our opponent when calculating what is good for us, as this will be contained fully in the other component.
With this in mind, my second-level subproblems (decomposing the "team score" problem) are these:
Overall Goal Satisfaction
I'll start with an easy one. Touchdowns should be worth a significant amount each, so that the AI is incentivised to make them if they're immediately available while not being tempted to perform risky actions beforehand.
Unit Score
What is a unit worth when active (upright)? What is the same unit worth when stunned or prone (temporarily incapacitated)? What is that unit worth when removed from the board? OK, that last one's easy - we'll say it's worth zero - but the first two questions aren't as clear cut.
Having a function that spits out a unit's current worth is useful because we can plug this value into other functions, allowing us to see which opposing units are worth blocking or which of our own units are worth protecting. In addition, the sum of all unit scores will probably be a valuable component in the overall evaluation function in its own right, as it (roughly) represents ability to act.
You could approach this from the complex end and say that this depends on a variety of factors including board position and proximity to other units. I'd prefer to keep things simple and incorporate the complex stuff as functions of a basic unit score, rather than the other way around. Let's say my initial guess will assign a score equal to the units cost, reduced by some scalar for stunned/prone states, and increased by some scalar for carrying the ball (as this is a state that's easily determined).
Combat Advantage
Half of Blood Bowl is about beating up the opposing team or preventing the opposing team from beating you up. Hitting units successfully incapacitates them either temporarily or permanently, which results in less options available for their controlling player.
It's worth examining what the simple "sum of unit scores" would incentivise with our current search model - we're not looking far enough ahead to chain together multiple unit actions, so the AI won't see that it can move units into positions where they can assist other units. All it would do is identify the best immediate block actions (or lack thereof).
A better model would take into account the quantity and quality of block actions available for the active team. Consider the set of all block actions available to all allied units on the board. These can have one of four real outcomes: {defender down, attacker down, both down, neither down} (I'll ignore pushbacks for now - assume they fall into the last category). Each outcome will have an easily calculable probability associated with it, depending on the relative strengths and skills of the participants - including assists granted to them by neighbouring players - and each outcome will have an easily calculable score based on the resulting standing-up-ness of each participant. We can blend these in the obvious linear combination to get a score for the block action itself, and we can sum these scores for all immediate block actions to get a heuristic for combat advantage.
This has immediate advantages because it considers the effects of assists on the block actions. Positioning is now important - units should see the advantage of moving to provide assists on otherwise unfavourable blocks. The AI would also be able to see the effect that removing an opposing unit would have on future combat advantage via simulation, which should allow it to order its block actions correctly (i.e. performing a series of favourable blocks, each removing critical opposing assists to make the next block more favourable).
Positional Defense
The primary factor restricting movement in Blood Bowl are what are known as "tackle zones" - essentially, every square adjacent to an upright unit. Opposing units in these squares suffer a penalty to perform agility-based actions, and they have a chance to be tackled (fall over and drop the ball) when moving out of them. (The secondary factor restricting movement is through blocking off certain routes with actual units, but due to the relatively few units compared to available board space this comes up less often)
Imagine that each square on the pitch has a value associated with it, representing how much "power" this team can safely move there (let's call this "influence", as in influence mapping, which is essentially what we're doing here). If such a map existed, then one way of determining defensive strength is to ensure that the opposing team doesn't have a large influence in your half of the pitch, as this would correspond to great freedom of movement here. Similarly, your team does want a large influence in your half of the field, as this means you can respond quickly to incursions by opposing units trying to score a cheeky touchdown. Units in high zones of opposing influence are more at risk of being surrounded and blitzed.
We can approximate this by using a diffusion process. For each unit, we'll iterate the following process a number of times equal to their movement stat:
Starting with their full influence value on the square we currently occupy, consider all adjacent squares - give these influence equal to the influence in the source square multiplied by the success rate of moving there.
For subsequent iterations, consider all squares adjacent to those already visited, and perform the same operation (choosing the maximum of the values for repeated visits).
The overall "influence map" would be a linear combination of these, weighted by (presumably) offensive power. Admittedly, this process is likely to be quite costly - further approximations may need to be made to actually get this usable - but I think it's the right principle.
As stated above, a good defense entails making sure your opponent doesn't have too much of an influence advantage in your half of the pitch. This can be reduced by keeping units deep in your half with complete freedom of movement, or by positioning and moving units such that the freedom of opposing units is reduced in some way (either by directly putting them in tackle zones or by creating a "wall" of tackle zones to dissuade movement along a certain path).
Positional Offense
Finally, how do we lead the AI in a direction that will result in it scoring touchdowns? The above "goal satisfaction" component will result in it taking immediate touchdown actions, but we need to incentivise moving towards such a state even if it's not visible in the immediate future.
We need a function that increases the closer the ball is to the opponent's endzone, and increases further if held by an allied player. From the influence map calculated above, we're aware of how risky certain squares are, so we should take this into account as well (for both loose ball position and held ball position).
In a vacuum, this should result in the AI picking up the ball and moving it closer to the opponent's endzone safely, by either fast movement or passing into zones of low opposing influence.
Together, I believe these components encompass the main strategic elements required to play an adequate game of Blood Bowl. All that's left is some way of combining them - my gut feeling says a linear combination of the above would work, with appropriate component weights.
Is that it? Are we done yet?
That concludes my general outline of constructing an AI to play Blood Bowl. I hope it was informative, and not too confusing.
I'll stress again that this was an entirely theoretical adventure, and that actual implementation and experimentation would be needed to refine the above into something actually good - but I'm still very open to suggestions on how to improve this or for people to point out obvious flaws in my reasoning.
Blood Bowl is a fairly nice self-contained coding project, so it's not inconceivable that I'll actually get around to trying this out at some point. If I do, I will definitely post about it here :)