Quote:
Originally Posted by munchkin
For each data set, evaluate the hypothesis for that data and save it somewhere. When all of the data has been processed take each data point and calculate the "average" value of that data point evaluated against each of the previously generated hypotheses. That average value can be used to calculate the variance of the data set.

Just to clarify: after you have calculated the various hypotheses based on different data sets, you evaluate these hypotheses on a generic point $\mathbf{x}$ (not necessarily belonging to any data set used for training). The average value you get will be $\bar{g}(\mathbf{x})$, and the variance in the values you get will be $\mathrm{var}(\mathbf{x})$. The expected value of $\mathrm{var}(\mathbf{x})$ with respect to $\mathbf{x}$ (based on the uniform probability of generating $\mathbf{x}$) is the variance $\mathrm{var}$. Is this what you have calculated?
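
In case it helps to compare against your own numbers, here is a minimal sketch of that procedure in Python with NumPy. Everything specific in it is an assumption made for illustration and not taken from your setup: the target function sin(pi x), inputs uniform on [-1, 1], training sets of two points each, and a hypothesis set of lines a*x + b fitted through those two points. The names `g_bar`, `var_x` and `var` are just labels for the average hypothesis at a point, the variance at a point, and the overall variance.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Assumed target function, for illustration only.
    return np.sin(np.pi * x)

num_datasets = 2_000    # number of training sets generated
points_per_set = 2      # assumed size of each training set
num_test_points = 1_000

# Draw the training sets, with x uniform on [-1, 1] (assumed input distribution).
X = rng.uniform(-1.0, 1.0, size=(num_datasets, points_per_set))
Y = target(X)

# One hypothesis per data set: the line a*x + b through its two points.
a = (Y[:, 1] - Y[:, 0]) / (X[:, 1] - X[:, 0])
b = Y[:, 0] - a * X[:, 0]

# Generic test points, drawn independently of every training set.
x_test = rng.uniform(-1.0, 1.0, size=num_test_points)

# predictions[i, j] = value of hypothesis i at test point j
predictions = a[:, None] * x_test[None, :] + b[:, None]

g_bar = predictions.mean(axis=0)  # average hypothesis value at each test point
var_x = predictions.var(axis=0)   # variance of the hypotheses at each test point
var = var_x.mean()                # expected value of var_x over the test points

print(f"var = {var:.3f}")
```

Note that the averaging over hypotheses happens per point first (`axis=0`), and only then is the result averaged over the points. If your calculation averages in a different order or only over the training points, the number may not match.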