anonymous351 9 hours ago

Speed testing Cerebras's insanely fast inference.

The first output totaled 847 lines (30,499 bytes) in 33.461 seconds, including time to first token (TTFT), so Cerebras produced just over 911 bytes per second on a task that involved reading and understanding two Go source files from the same package, totaling 939 lines and 25,315 bytes of input.

The second output totaled 3,130 lines (124,403 bytes) in 58.690 seconds, including TTFT, so Cerebras produced just over 2,119 bytes per second on a task that involved producing a second, longer version of an existing file, with 33.1k tokens of relevant information already present in the context window.
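
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the throughput from the figures above. The function name `bytes_per_second` and the `runs` list are my own, purely for illustration. Because the timings include TTFT, these are end-to-end rates, not pure generation speed.

```python
def bytes_per_second(total_bytes: int, elapsed_seconds: float) -> float:
    """End-to-end output rate: bytes produced divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_bytes / elapsed_seconds


# (label, output lines, output bytes, seconds including TTFT), taken from the two runs above
runs = [
    ("first", 847, 30_499, 33.461),
    ("second", 3_130, 124_403, 58.690),
]

for label, lines, size, seconds in runs:
    rate = bytes_per_second(size, seconds)
    print(f"{label}: {rate:.1f} bytes/s, {lines / seconds:.1f} lines/s")
```

Swapping in your own byte counts and timings gives a like-for-like comparison with other providers, as long as the timing is measured the same way.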

Needless to say, I'm thoroughly impressed, particularly with Cerebras' new "Cerebras Code" monthly subscription plans. The $50/mo subscription was used to perform this test. I am not sponsored by Cerebras, just a happy enthusiast.

This channel is not monetized, and I am not seeking any financial incentive by sharing this here. I just wanted to carefully evaluate Cerebras' bold performance claims with a realistic code comprehension, documentation, and modification workflow.