October 27, 2021

How to Use BERT to Process Long Text?

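BERT usually can process no more than 512 tokens per sequence. These are WordPiece subword tokens, not words, so a single long word may count as several tokens. If a sequence is longer than 512 tokens, how can BERT process it? There are three common truncation methods, each keeping 510 tokens so that two positions remain for the special [CLS] and [SEP] tokens:
(1) head-only: keep the first 510 tokens
(2) tail-only: keep the last 510 tokens
(3) head+tail: keep the first 128 and the last 382 tokens (128 + 382 = 510).
The experiment shows that head+tail has the best performance of the three.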
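
The three truncation methods can be sketched in a few lines of Python. This is only a minimal illustration, not code from the tutorial: the function name `truncate_tokens` and its parameters are hypothetical, and it assumes the text has already been split into tokens by a BERT tokenizer.

```python
from typing import List, Sequence

CLS, SEP = "[CLS]", "[SEP]"


def truncate_tokens(
    tokens: Sequence[str],
    strategy: str = "head+tail",
    max_len: int = 512,
    head_len: int = 128,
) -> List[str]:
    """Truncate a token sequence so it fits in max_len positions.

    Hypothetical helper: two positions are reserved for [CLS] and [SEP],
    which leaves 510 content tokens when max_len is 512.
    """
    budget = max_len - 2  # room left after [CLS] and [SEP]
    if budget < 1:
        raise ValueError("max_len must be at least 3")

    if len(tokens) <= budget:
        kept = list(tokens)  # short enough, nothing to cut
    elif strategy == "head":
        kept = list(tokens[:budget])  # first 510 tokens
    elif strategy == "tail":
        kept = list(tokens[-budget:])  # last 510 tokens
    elif strategy == "head+tail":
        if not 0 < head_len < budget:
            raise ValueError("head_len must be between 1 and budget - 1")
        tail_len = budget - head_len  # 382 when head_len is 128
        kept = list(tokens[:head_len]) + list(tokens[-tail_len:])
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    return [CLS] + kept + [SEP]


if __name__ == "__main__":
    # Placeholder tokens standing in for real WordPiece output.
    long_text = [f"tok{i}" for i in range(1000)]
    for name in ("head", "tail", "head+tail"):
        print(name, len(truncate_tokens(long_text, strategy=name)))
```

With a real tokenizer, the same slicing would be applied to the token IDs before the special tokens are added. The 128/382 split is the one given above; other splits are possible by changing `head_len`.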
