With 1% or 10% mislabeled examples, SB still reaches test error rates similar to the traditional approach while giving us a considerable training speedup. With 20% mislabeled examples, however, SB performs worse than the traditional approach. SB favors high-loss examples, and mislabeled examples tend to have high loss. Since the retain rate of SB is 33% and 20% of the data is mislabeled, mislabeled examples can make up most of the examples we pick (up to 20/33 ≈ 61%). Thus, our model starts to overfit the mislabeled examples.
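The arithmetic behind this argument can be sketched in a few lines. This is a rough upper bound, not a description of SB's actual sampling: it assumes, as a worst case, that every mislabeled example has a higher loss than every correctly labeled one, and that selection is a hard top-k cut at the retain rate. The function name `worst_case_mislabeled_share` is hypothetical and introduced only for illustration.

```python
def worst_case_mislabeled_share(noise_rate: float, retain_rate: float) -> float:
    """Upper bound on the share of selected examples that are mislabeled.

    Assumption (worst case): all mislabeled examples rank above all clean
    examples by loss, and exactly the top `retain_rate` fraction is kept.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    if not 0.0 < retain_rate <= 1.0:
        raise ValueError("retain_rate must be in (0, 1]")
    # Mislabeled examples fill the retained slots first; the share is capped at 1.
    return min(noise_rate / retain_rate, 1.0)


# Noise levels from the experiments above, with the 33% retain rate.
for noise_rate in (0.01, 0.10, 0.20):
    share = worst_case_mislabeled_share(noise_rate, retain_rate=0.33)
    print(f"{noise_rate:.0%} noise -> up to {share:.0%} of selected examples mislabeled")
```

Under these assumptions the bound is roughly 3% at 1% noise, 30% at 10% noise, and 61% at 20% noise, so 20% is the only one of the three settings where mislabeled examples could form the majority of what the model trains on.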