Working through some of your arguments on the XOR problem in two dimensions should help clarify things.
In particular, no axis-aligned hyperplane reduces entropy in your construction, yet (for instance) the hyperplane x=0 is locally a perfect classifier almost everywhere.
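
Here is a rough sketch of what I mean. It assumes a small symmetric grid labeled by the sign-XOR rule as a stand-in, which is not necessarily your construction, and `entropy` and `info_gain` are just illustrative helper names. It compares the information gain of the split x=0 over the whole grid with the gain of the same split restricted to the half-plane y>0.

```python
import numpy as np

def entropy(labels):
    """Binary Shannon entropy (in bits) of a boolean label array."""
    if labels.size == 0:
        return 0.0
    p = labels.mean()
    if p == 0.0 or p == 1.0:
        return 0.0
    return float(-(p * np.log2(p) + (1 - p) * np.log2(1 - p)))

def info_gain(X, y, axis, threshold):
    """Entropy reduction from the axis-aligned split X[:, axis] <= threshold."""
    left = X[:, axis] <= threshold
    n = y.size
    children = (left.sum() / n) * entropy(y[left]) \
             + ((~left).sum() / n) * entropy(y[~left])
    return entropy(y) - children

# Assumed toy data: a 4x4 grid symmetric about the origin, XOR of the signs.
coords = np.array([-1.5, -0.5, 0.5, 1.5])
X = np.array([(a, b) for a in coords for b in coords])
y = (X[:, 0] > 0) ^ (X[:, 1] > 0)

# Globally, the split x=0 leaves both children perfectly mixed.
global_gain = info_gain(X, y, axis=0, threshold=0.0)

# Restricted to y>0, the same split separates the two classes exactly.
upper = X[:, 1] > 0
local_gain = info_gain(X[upper], y[upper], axis=0, threshold=0.0)

# Largest gain over a few axis-aligned thresholds on either axis.
best_axis_aligned = max(
    info_gain(X, y, axis=a, threshold=t)
    for a in (0, 1) for t in (-1.0, 0.0, 1.0)
)

print(global_gain, local_gain, best_axis_aligned)
```

On this grid the global gain is zero for every axis-aligned threshold tried, while the gain within y>0 is a full bit, which is the gap between the greedy entropy criterion and local separability that I am pointing at.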