Extremely high logits

#9
by Thomas2419 - opened

Hello, I've found that this model produces extremely high logits, and because of that the loss on new tasks runs into the millions, compared to BERT base, RoBERTa, DeBERTa, and other models I tested identically to MobileBERT. Is this an intentional facet of MobileBERT? It seems to make finetuning new heads on top of the frozen model impossible due to instability?
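To illustrate the effect with made-up numbers: the sketch below is not actual MobileBERT output, and the feature values, head weights, and helper names are all hypothetical. It only shows that when the features fed to a new linear head are very large, the cross-entropy loss is roughly the gap between the largest logit and the target logit, so it scales with the feature magnitude. Standardizing the features before the head (a parameter-free LayerNorm, which is my assumption for a possible workaround, not something confirmed for this model) brings the loss back to a normal range.

```python
import math

def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single example."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]

def standardize(features, eps=1e-5):
    """Rescale a vector to zero mean and unit variance (LayerNorm without learned parameters)."""
    mean = sum(features) / len(features)
    var = sum((x - mean) ** 2 for x in features) / len(features)
    return [(x - mean) / math.sqrt(var + eps) for x in features]

def linear_head(features, weights):
    """Apply a bias-free linear classification head: one weight row per class."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

# Hypothetical values: stand-ins for a frozen encoder's pooled output
# and a freshly initialised 2-class head. Not measured from MobileBERT.
features = [2.0e6, -1.5e6, 5.0e5, -3.0e6]
weights = [[0.5, -0.2, 0.1, 0.3],
           [-0.4, 0.6, 0.2, -0.1]]
target = 1  # the class the untrained head happens to score lowest

raw_loss = cross_entropy(linear_head(features, weights), target)
norm_loss = cross_entropy(linear_head(standardize(features), weights), target)

print(f"loss on raw features:          {raw_loss:.4g}")
print(f"loss on standardized features: {norm_loss:.4g}")
```

If something like this is what is happening, inserting a normalization layer between the frozen model and the new head might be enough to stabilize training, but I have not confirmed that.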
