Success: the team got Llama 2 running. On the Pentium II, a 260K parameter model hit 39.31 tokens per second. Not exactly blistering, but on a 1997 rig, it's remarkable. A larger 15M parameter version ...