[ICLR 2023] Can BERT Refrain from Forgetting on Sequential Tasks? A Probing Study
We investigate the task-incremental learning ability of pre-trained language models like BERT. Our probing experiments reveal that, even without memory replay, BERT generates representations for instances of different classes within the same task in distinct, non-overlapping sub-spaces. However, without replay, representations for instances of different tasks fall into overlapping sub-spaces, which hurts the model's ability to distinguish which task an instance belongs to.
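Below is a minimal, illustrative sketch of this kind of probing analysis (not the paper's exact setup): it extracts frozen BERT representations for sentences from two hypothetical tasks and fits a linear probe to test how separable the tasks are in representation space. The model name, example sentences, and probe choice are placeholders for illustration.

```python
# Illustrative probing sketch: frozen BERT representations + linear task probe.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

# Placeholder data: sentences drawn from two different tasks.
texts = ["the movie was great", "terrible plot and acting",     # task 0 (sentiment)
         "what is the capital of France", "who wrote Hamlet"]   # task 1 (QA-style)
task_ids = [0, 0, 1, 1]

with torch.no_grad():
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    # Use the [CLS] token's hidden state as the sentence representation.
    reps = model(**enc).last_hidden_state[:, 0, :].numpy()

# A linear probe on task identity: high accuracy suggests the tasks occupy
# separable sub-spaces; near-chance accuracy suggests overlapping sub-spaces.
X_tr, X_te, y_tr, y_te = train_test_split(reps, task_ids, test_size=0.5,
                                          stratify=task_ids)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("task-probe accuracy:", probe.score(X_te, y_te))
```

In practice, a real probing study would use full task datasets and compare probe accuracy before and after sequential fine-tuning, with and without replay.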