[rwkv-x] v5 model memory scaling benchmark #175
PicoCreator started this conversation in Ideas
First, to outline the status quo: V5 has been shown to beat all existing V4 and V5+wavenet architectures at retaining memory at L24/D2048 (24 layers, embedding size 2048). The attachment shows the high-level performance.
The goal is to figure out how this scales with layer count, embedding size, head size, etc.
Follow along in this Discord thread: https://discord.com/channels/992359628979568762/1142397705314906182/1142397998853279744
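As a rough illustration of what such a scaling sweep could look like, here is a minimal sketch that enumerates candidate configurations. Everything in it is an assumption for illustration: the function name `build_grid`, the config keys, and all sweep values are hypothetical (only the L24/D2048 point comes from this post), and it assumes the embedding size must divide evenly by the head size. It does not run any benchmark; it only lists the combinations one might test.

```python
from itertools import product

# Hypothetical sweep values; only the L24/D2048 point comes from the post.
LAYERS = [6, 12, 24]
EMBED_SIZES = [512, 1024, 2048]
HEAD_SIZES = [32, 64, 128]


def build_grid(layers, embed_sizes, head_sizes):
    """Return one config dict per valid (layers, embedding, head size) combination."""
    grid = []
    for n_layer, n_embd, head_size in product(layers, embed_sizes, head_sizes):
        # Assumption: the embedding size is split evenly across heads,
        # so combinations that do not divide cleanly are skipped.
        if n_embd % head_size != 0:
            continue
        grid.append({
            "name": f"L{n_layer}-D{n_embd}-H{head_size}",
            "n_layer": n_layer,
            "n_embd": n_embd,
            "head_size": head_size,
            "n_head": n_embd // head_size,
        })
    return grid


grid = build_grid(LAYERS, EMBED_SIZES, HEAD_SIZES)
for cfg in grid:
    print(cfg["name"], "heads:", cfg["n_head"])
```

Varying one axis at a time from a fixed baseline (for example, holding D2048 and changing only the layer count) would make it easier to attribute any change in memory retention to a single factor; the full grid above is just the superset of such runs.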