Table 7 of the paper lists the parameter count of LLaMA-7B as 3404.53M. How is this number calculated? Shouldn't it be roughly 7B instead?