In a transformer, why use the query/key/value weight matrices?
In a transformer, what exactly is in the query/key/value weight matrices?
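As a point of reference for these two questions, here is a minimal single-head self-attention sketch. The names (`W_q`, `d_model`, the random initialization) are illustrative assumptions, not taken from any particular implementation; the point is only that the three weight matrices are learned projections that map each token embedding into a query, a key, and a value role.

```python
import numpy as np

d_model, seq_len = 8, 4
rng = np.random.default_rng(0)

X = rng.normal(size=(seq_len, d_model))    # input embeddings, one row per token

# Learned projection matrices (randomly initialized here for illustration):
W_q = rng.normal(size=(d_model, d_model))  # "what is this token looking for?"
W_k = rng.normal(size=(d_model, d_model))  # "what does this token contain?"
W_v = rng.normal(size=(d_model, d_model))  # "what does it pass on if attended to?"

Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Scaled dot-product attention: query/key similarity weights the values.
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # row-wise softmax
output = weights @ V                       # new contextualized representation per token
```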
Why are positional embeddings in a transformer summed with word embeddings instead of concatenated?
About the input embedding of a transformer
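For these two questions, here is a small sketch of a transformer input embedding layer in which token and positional embeddings are summed. The dimensions and the toy token ids are assumptions for illustration; the sinusoidal encoding follows the form used in "Attention Is All You Need".

```python
import numpy as np

vocab_size, d_model, max_len = 100, 8, 16
rng = np.random.default_rng(0)

token_embedding = rng.normal(size=(vocab_size, d_model))  # learned lookup table

# Fixed sinusoidal positional encodings.
pos = np.arange(max_len)[:, None]
i = np.arange(d_model)[None, :]
angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
pos_encoding = np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

token_ids = np.array([5, 42, 7])                          # a toy input sequence
# Summation rather than concatenation: both embeddings live in the same
# d_model-dimensional space, so the model width stays fixed.
x = token_embedding[token_ids] + pos_encoding[:len(token_ids)]
```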