¹Strictly, it can be modeled as a POMDP* for each unit independently, with S the states of all the other units (enemy and allied alike), which are known only through observations O via the conditional observation probabilities Ω; A the set of actions for the given unit; T the transition probabilities between states, which depend on the actions; and R the reward function, based on goal execution, unit survivability, and so on. It can also be viewed as a single (gigantic) POMDP* solving the problem for all controlled units at once: the advantage is that the states S of all allied units are known; the disadvantage is that the combinatorics of T and A make it intractable for useful problems.
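The intractability of the joint formulation can be made concrete with a rough count. The sketch below is only an illustration under simplifying assumptions that are not in the text: every unit is given the same number of local states and actions, the joint state is the product of the per-unit states, and T is stored as a dense table. The function name `joint_sizes` and all the numbers are hypothetical.

```python
def joint_sizes(n_units: int, states_per_unit: int, actions_per_unit: int) -> dict:
    """Count the entries of a flat joint model over n_units units.

    Assumption (illustrative only): units are homogeneous and the joint
    state/action spaces are plain Cartesian products of the per-unit ones.
    """
    joint_states = states_per_unit ** n_units
    joint_actions = actions_per_unit ** n_units
    # A dense table T(s' | s, a) has one entry per (s, a, s') triple.
    transition_entries = joint_states * joint_actions * joint_states
    return {
        "joint_states": joint_states,
        "joint_actions": joint_actions,
        "transition_entries": transition_entries,
    }


if __name__ == "__main__":
    # Hypothetical sizes: 10 local states and 5 actions per unit.
    for n in (1, 2, 4, 8):
        sizes = joint_sizes(n, states_per_unit=10, actions_per_unit=5)
        print(n, sizes["joint_actions"], sizes["transition_entries"])
```

Even with these deliberately small per-unit numbers, the size of T grows exponentially with the number of units, whereas the per-unit formulation keeps A fixed at the single unit's action set.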