Theoretical and computational justification is given for improved generalization when the training set is learned with less accuracy. The model used for this investigation is a simple linear one. It is shown that learning a training set with a tolerance $\tau$ improves generalization over zero-tolerance training for any testing set that satisfies a certain closeness condition to the training set. These results, obtained via a mathematical programming approach to generalization, are placed in the context of some well-known machine learning results. Computational confirmation of improved generalization is given for linear systems, as well as for nonlinear systems such as neural networks, for which no theoretical results are available at present.
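The following is a minimal sketch of the tolerant-training idea summarized above, not the paper's mathematical programming formulation: the tolerance $\tau$ is modeled here as a deadzone on the pointwise training errors of a linear system (errors smaller than $\tau$ incur no cost), and the synthetic data, tolerance value, and subgradient optimizer are illustrative assumptions. The zero-tolerance baseline is simply the same loss with $\tau = 0$.

\begin{verbatim}
# Sketch: tau-tolerant fitting of a linear system A w ~ b versus
# zero-tolerance fitting, compared on a held-out test set.
# Assumptions: deadzone-linear loss, synthetic Gaussian data,
# plain subgradient descent; all choices are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training system A w_true ~ b with additive noise.
m, n = 60, 5
A = rng.normal(size=(m, n))
w_true = rng.normal(size=n)
b = A @ w_true + 0.3 * rng.normal(size=m)

# Testing set drawn from the same (hence "close") distribution.
A_test = rng.normal(size=(200, n))
b_test = A_test @ w_true + 0.3 * rng.normal(size=200)

def tolerant_fit(A, b, tau, steps=5000, lr=1e-2):
    """Minimize sum_i max(|A_i w - b_i| - tau, 0) by subgradient descent."""
    w = np.zeros(A.shape[1])
    for _ in range(steps):
        r = A @ w - b
        # Subgradient of the deadzone loss: sign(r) where |r| > tau, else 0.
        g = A.T @ (np.sign(r) * (np.abs(r) > tau))
        w -= lr * g / len(b)
    return w

w_zero = tolerant_fit(A, b, tau=0.0)  # zero-tolerance training
w_tol = tolerant_fit(A, b, tau=0.3)   # training with tolerance tau

mse = lambda w: np.mean((A_test @ w - b_test) ** 2)
print(f"test MSE, zero tolerance: {mse(w_zero):.4f}")
print(f"test MSE, tolerance 0.3 : {mse(w_tol):.4f}")
\end{verbatim}

The script only reports the two test errors; whether the tolerant fit generalizes better in a given run depends on the data and the chosen $\tau$, in line with the closeness condition discussed in the paper.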