Suppose I want to use the existing pre-trained model https://huggingface.co/Salesforce/grappa_large_jnt/ as the initial checkpoint for fine-tuning.
This GraPPa model has `max_position_embeddings` set to 514 in its config.json.
Now I want to extend the model from 514 to 1024 position embeddings: positions 0-513 are initialized from the pre-trained model, and the rest (514-1023) are randomly initialized.
How can I achieve this?
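One possible approach (a sketch, not an official API): load the checkpoint, build a larger `nn.Embedding` table, copy the pretrained rows into it, and re-initialize only the new rows. The helper below, `extend_position_embeddings`, is a hypothetical function name; it assumes the model is a `RobertaModel` (GraPPa is RoBERTa-based) from the `transformers` library. Note also that RoBERTa reserves two position slots for padding, so 514 rows correspond to 512 usable tokens; the sketch keeps the question's target of 1024 rows as-is. The demo below uses a tiny random config so it runs offline; in practice you would call `RobertaModel.from_pretrained("Salesforce/grappa_large_jnt")` instead.

```python
import torch
from transformers import RobertaConfig, RobertaModel

def extend_position_embeddings(model, new_max_pos):
    """Grow the position-embedding table to new_max_pos rows.

    Pretrained rows are copied over; the extra rows get a fresh
    normal(0, initializer_range) init, matching BERT/RoBERTa style.
    """
    embeddings = model.embeddings
    old_table = embeddings.position_embeddings
    old_max_pos, hidden = old_table.weight.shape

    new_table = torch.nn.Embedding(
        new_max_pos, hidden, padding_idx=old_table.padding_idx
    )
    # Randomly initialize the whole table first ...
    new_table.weight.data.normal_(mean=0.0, std=model.config.initializer_range)
    # ... then overwrite positions 0 .. old_max_pos-1 with pretrained weights.
    new_table.weight.data[:old_max_pos] = old_table.weight.data

    embeddings.position_embeddings = new_table
    # Keep the cached position_ids buffer (if this transformers version has
    # one) in sync with the new table size.
    if hasattr(embeddings, "position_ids"):
        embeddings.register_buffer(
            "position_ids",
            torch.arange(new_max_pos).expand((1, -1)),
            persistent=False,
        )
    model.config.max_position_embeddings = new_max_pos
    return model

# Offline demo on a tiny random RoBERTa; swap in
# RobertaModel.from_pretrained("Salesforce/grappa_large_jnt") for real use.
config = RobertaConfig(
    vocab_size=100, hidden_size=32, num_hidden_layers=1,
    num_attention_heads=2, intermediate_size=64,
    max_position_embeddings=514,
)
model = RobertaModel(config)
old_rows = model.embeddings.position_embeddings.weight.data.clone()

model = extend_position_embeddings(model, 1024)
```

After this, saving with `model.save_pretrained(...)` writes out the enlarged table together with the updated `max_position_embeddings`, so the checkpoint can be fine-tuned on longer sequences as usual (remember to also raise the tokenizer's `model_max_length`).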
