A Universal Measure of Intelligence for Artificial Agents
I came across this interesting paper today: A Universal Measure of Intelligence for Artificial Agents.
The task they use to define intelligence is a generic reinforcement learning setting: an agent receives observations, part of each observation is a reward that depends on earlier actions, and the agent has a finite set of possible actions. The agent needs to maximize its total reward across all possible environments.
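To make the setting concrete, here is a minimal sketch of the agent-environment loop described above. The `Environment` and `Agent` classes are illustrative toys of my own (a world that rewards repeating the previous action, and an agent that always picks action 0), not anything from the paper:

```python
# Toy sketch of the generic RL loop: the agent receives an observation and a
# reward, picks one of a finite set of actions, and accumulates total reward.
ACTIONS = [0, 1]  # finite action set

class Environment:
    """Toy environment: pays reward 1 for repeating the previous action."""
    def __init__(self):
        self.last_action = None

    def step(self, action):
        reward = 1.0 if action == self.last_action else 0.0
        self.last_action = action
        observation = action  # here the observation just echoes the action
        return observation, reward

class Agent:
    """Toy agent: ignores its input and always picks action 0."""
    def act(self, observation, reward):
        return ACTIONS[0]

env, agent = Environment(), Agent()
obs, reward, total = None, 0.0, 0.0
for _ in range(10):
    action = agent.act(obs, reward)
    obs, reward = env.step(action)
    total += reward
print(total)  # 9.0: every step after the first repeats the previous action
```

The paper's measure would average such totals over many environments, not just one.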
Computing this total reward across all possible, or even across all computable, environments is not feasible, however, so the authors define a distribution weighted toward "simple environments" - rewarding agents for applying Occam's razor.
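My reading of this idea can be sketched as a truncated weighted sum: each environment is encoded as a bitstring program, and shorter programs get exponentially more weight. Everything here is a stand-in of my own - program length substitutes for Kolmogorov complexity (which is uncomputable), the `2 ** (-2 * length)` weighting is just a toy normalization so the weights sum below 1, and `value` is a dummy in place of the agent's expected reward in that environment:

```python
# Hedged toy: score an agent by summing its per-environment value, weighted so
# that simpler (shorter) environment programs count exponentially more.

def value(agent, program):
    """Placeholder for the agent's expected total reward in the environment
    encoded by `program`. Dummy: reward 1 iff the program starts with '0'."""
    return 1.0 if program and program[0] == "0" else 0.0

def universal_score(agent, max_len=8):
    total = 0.0
    for length in range(1, max_len + 1):
        for i in range(2 ** length):
            program = format(i, f"0{length}b")  # all bitstrings of this length
            # Toy weight: 2**-length shared among the 2**length programs of
            # that length, so the total weight stays below 1.
            total += 2.0 ** (-2 * length) * value(agent, program)
    return total

score = universal_score(None)
```

Truncating at `max_len` is what makes this computable at all; the tail of longer programs contributes exponentially little weight.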
This short summary obviously omits some details - read the paper for those.
The authors did not do this and don't even discuss the possibility - but I would assume it should be possible to apply similar ideas to actually evaluate learning algorithms. What I don't know is how an algorithm that actually creates samples from the distribution of simple environments would work (and how fast or slow it would be). I'm also not sure how much a completely unbiased learner could help us solve any real problems.
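One naive sampler I can imagine - my own guess, not anything from the paper - would draw a program length geometrically (so length l has probability 2^-l) and then fill it with uniform random bits. The genuinely hard part, interpreting that bitstring as a runnable environment (say, as a machine table), is left abstract here:

```python
import random

def sample_program(rng):
    """Sample a random bitstring with P(length = l) = 2**-l."""
    length = 1
    while rng.random() < 0.5:  # each extra bit of length halves the probability
        length += 1
    return "".join(rng.choice("01") for _ in range(length))

rng = random.Random(0)  # fixed seed for reproducibility
samples = [sample_program(rng) for _ in range(5)]
```

Short programs dominate such samples, which matches the Occam's-razor weighting - but whether the resulting environments are diverse enough to separate good learners from bad ones is exactly the open question.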