
spurious_waffles t1_ivzflun wrote

You could try very small character-level perturbations of your input, such as deletions, repetitions, and character swaps. You just need to be careful not to change the semantic meaning of your input text.
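Something like this rough sketch is what I have in mind. The function name `perturb` and the `rate` and `seed` parameters are just made up for illustration, not from any library. It only touches letters and leaves digits, spaces, and punctuation alone, which is one cheap way to lower the odds of changing the meaning, but it doesn't guarantee it.

```python
import random

def perturb(text, rate=0.05, seed=None):
    """Add character-level noise: random deletions, repetitions, and adjacent swaps.

    `rate` is the per-letter probability of attempting a perturbation (hypothetical knob).
    """
    rng = random.Random(seed)
    out = []
    i = 0
    while i < len(text):
        c = text[i]
        # Only letters are eligible, so numbers and punctuation stay intact.
        if c.isalpha() and rng.random() < rate:
            op = rng.choice(["delete", "repeat", "swap"])
            if op == "delete":
                i += 1  # drop this character
                continue
            if op == "repeat":
                out.extend([c, c])  # duplicate this character
                i += 1
                continue
            # Swap with the next character, but only if it is also a letter.
            if i + 1 < len(text) and text[i + 1].isalpha():
                out.extend([text[i + 1], c])
                i += 2
                continue
            # Otherwise the swap is skipped and the character is kept as-is.
        out.append(c)
        i += 1
    return "".join(out)

if __name__ == "__main__":
    print(perturb("The quick brown fox jumps over the lazy dog.", rate=0.1, seed=0))
```

Keep `rate` low and eyeball a sample of the outputs before using them, since even a single edit can turn one valid word into another.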

There's some research out there showing that BERT-like models break down on standard benchmarks when the benchmark text contains a small amount of character-level noise.
