This question arose from a discussion with njh about how best to run a greenhouse climate control system.
The goal is to hold the greenhouse at roughly some given temperature while using as little power as possible. A specific goal function captures this trade-off, and we wish to minimize the expected value of its integral over time.
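As a sketch only, the objective might be written as below. Every symbol here is an assumption introduced for illustration: T(t) is the greenhouse temperature, T* the target, u(t) the device settings, P(u) the power drawn, lambda a weight trading comfort against power, H the time horizon and pi the control policy. The quadratic penalty is just one plausible choice of goal function, not the one from the discussion.

```latex
\min_{\pi}\; \mathbb{E}\left[\int_{0}^{H} g\bigl(T(t),\,u(t)\bigr)\,dt\right],
\qquad
g(T, u) = \bigl(T - T^{*}\bigr)^{2} + \lambda\,P(u)
```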
The system monitors several quantities, such as the temperature in different parts of the system. It has direct control over a number of devices such as fans and pumps.
We believe that the system can be modelled by some function with a fixed number of unknown parameters (though there will be some amount of random noise which can't be modelled). We have some form of (fairly uninformative) prior belief about what the values of those parameters might be. Clearly, if we knew the values of the parameters, we would be able to optimize the expected integral of the goal function.
This is now a well-posed problem, which has an exact mathematical answer. Presumably the optimal strategy will be to initially play with the devices the system has direct control over, in order to learn the parameters of the model to some precise degree of inaccuracy. Buggered if I know how to solve it, though.
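To see why some initial play pays off, here is a minimal Python sketch. It rests on assumptions that are not part of the discussion above: a hypothetical one-room model in which each step's temperature change is a*(T_out - T) + b*u + noise, a Gaussian prior on (a, b), and a known noise level. It compares what is learned when the device setting u is never changed with what is learned when it is toggled. All names and numbers are made up for illustration.

```python
import numpy as np

def simulate_and_learn(policy, steps=200, seed=0):
    """Run a toy greenhouse under `policy`; return posterior mean and std of (a, b)."""
    rng = np.random.default_rng(seed)
    a_true, b_true = 0.1, 0.5          # hypothetical true parameters, hidden from the learner
    noise_std = 0.05                   # assumed known
    t_out, temp = 10.0, 20.0           # outside and initial inside temperature
    precision = np.eye(2) / 10.0**2    # prior on (a, b): N(0, 10^2 I), fairly uninformative
    shift = np.zeros(2)                # precision-weighted mean (zero for a zero-mean prior)
    for k in range(steps):
        u = policy(k)
        x = np.array([t_out - temp, u])                 # regressors for (a, b)
        delta = a_true * x[0] + b_true * x[1] + noise_std * rng.standard_normal()
        # Standard Bayesian linear-regression update with known noise variance.
        precision += np.outer(x, x) / noise_std**2
        shift += x * delta / noise_std**2
        temp += delta
    cov = np.linalg.inv(precision)
    return cov @ shift, np.sqrt(np.diag(cov))

def passive(k):
    """Never touch the device."""
    return 0.0

def probing(k):
    """Toggle the device every 10 steps: a crude form of 'play'."""
    return 1.0 if k % 20 < 10 else 0.0

if __name__ == "__main__":
    for name, policy in [("passive", passive), ("probing", probing)]:
        mean, std = simulate_and_learn(policy)
        print(name, "mean:", mean, "std:", std)
```

Under the passive policy the second regressor is always zero, so the data say nothing about b and its posterior standard deviation stays at the prior value. Toggling the device shrinks it sharply. This only illustrates the value of probing; it does not work out how much probing is optimal.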
2/2/2007: Play might also be useful for determining hidden state variables in the system. For example, if the temperature of a thermal mass used for heat/cold storage is unknown, turning on the radiator fan and observing the change in greenhouse temperature may give a hint as to the value of this temperature.
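The thermal-mass example can be sketched as a single Gaussian belief update. The assumptions here are mine, not from the discussion: with the fan on, the greenhouse temperature changes by k*(T_mass - T) plus noise in one step, with a known coupling k and no other heat flow during that step; with the fan off, the change carries no information about T_mass. The function name and all numbers are hypothetical.

```python
def update_mass_belief(mean, var, temp, delta, fan_on, k=0.2, noise_std=0.1):
    """One scalar Kalman-style update of the belief N(mean, var) about T_mass.

    temp   : greenhouse temperature before the step
    delta  : observed change in greenhouse temperature over the step
    fan_on : whether the radiator fan was running during the step
    """
    h = k if fan_on else 0.0                 # sensitivity of the observation to T_mass
    innovation = delta - h * (mean - temp)   # observed change minus predicted change
    s = h * h * var + noise_std**2           # variance of the innovation
    gain = var * h / s
    return mean + gain * innovation, (1.0 - gain * h) * var

if __name__ == "__main__":
    prior = (20.0, 25.0)                     # vague belief: T_mass ~ N(20, 5^2)
    # Fan off: the belief comes back unchanged.
    print(update_mass_belief(*prior, temp=18.0, delta=0.05, fan_on=False))
    # Fan on, with a change consistent with T_mass = 30: the belief tightens.
    print(update_mass_belief(*prior, temp=18.0, delta=2.4, fan_on=True))
```

The point of the sketch is that the control action decides whether the observation is informative at all, which is what makes play useful for state estimation as well as for parameter learning.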
4/3/2007: This problem is called "Dual Control".