Description
Slide 45/83, "4. Basic concepts II: Integral and floating-point types"
After reading Carl Burch's document, I found that the floating-point representation used on this slide is 8-bit (1+4+3: sign, exponent, mantissa). It would be better to state this on the slide; otherwise the numbers in binary notation are incomprehensible.
Furthermore, if I understand the 8-bit notation correctly, the binary number 00001111 (sign=0, exponent=0001, mantissa=111) evaluates as follows:
1.111 x 2^(-7+1) = 1.111 x 2^(-6) = 0.000001111 (binary)
= 1/64 + 1/128 + 1/256 + 1/512
= 1/64 x (1 + 1/2 + 1/4 + 1/8)
= 1/64 x ((8 + 4 + 2 + 1)/8)
= 1/64 x 15/8
That is 15/8//64, not 17/8//64 as written.
The same applies to 00000111, which should be 15/8//128.
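To make the arithmetic easy to check, here is a minimal Python sketch of the decoding as I read it. The function name `decode_normalized` and the constant `BIAS` are my own, and the bias of 7 is an assumption taken from the 2^(-7+1) step above. The sketch applies the normalized formula (1.mantissa x 2^(exponent - 7)) to every pattern, so it does not treat exponent 0000 as subnormal or exponent 1111 as infinity/NaN. Under IEEE 754-style subnormal rules, 00000111 would instead decode as 0.111 x 2^(-6) = 7/512.

```python
from fractions import Fraction

BIAS = 7  # assumed excess-7 exponent, matching the 2^(-7+1) step above


def decode_normalized(bits: str) -> Fraction:
    """Decode an 8-bit 1+4+3 pattern as (+/-) 1.mantissa x 2^(exponent - BIAS).

    Hypothetical helper: no special handling of exponent 0000 or 1111.
    """
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 binary digits")
    sign = -1 if bits[0] == "1" else 1
    exponent = int(bits[1:5], 2)              # 4-bit exponent field
    mantissa = Fraction(int(bits[5:], 2), 8)  # 3-bit fraction, in eighths
    return sign * (1 + mantissa) * Fraction(2) ** (exponent - BIAS)


print(decode_normalized("00001111"))  # 15/512, i.e. 15/8 divided by 64
print(decode_normalized("00000111"))  # 15/1024, i.e. 15/8 divided by 128
```

Using `Fraction` keeps the results exact, so they can be compared directly with the fractions shown on the slide.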