Posits offer a more efficient trade-off between precision and dynamic range than the various IEEE
floating-point formats.
Posits have many nice properties:
- Any N-bit value is a well-defined N-bit posit. Think of an N-bit posit as
an N-bit twos-complement integer. Although N is usually chosen to be a multiple of 8,
there is no requirement that this be so: if you need more precision or range than an 8-bit posit
provides, but less than a 16-bit posit provides, you could use, say, 11 bits.
- For a given N, there are only two exceptional values: zero and infinity. Zero is
represented by N zero bits, while infinity is represented by a one bit followed by N-1
zero bits.
- All other N-bit values have a 1:1 mapping with associated rational numbers.
- For a given N, you can trade off accuracy versus dynamic range with a separate
non-negative integer parameter, herein called M. A larger
M value gives a greater dynamic range in exchange for reduced accuracy. M is
implicit: its value is not specified as part of the N-bit value and must be known from
context. So the same N-bit value is associated with different rational numbers
depending on your chosen M. For example:

| N | M | posit | rational value | posit | rational value | posit | rational value | posit | rational value | posit | rational value |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 0 | $01 | 1/64 | $20 | 1/2 | $40 | 1 | $60 | 2 | $7F | 64 |
| 8 | 1 | $01 | 1/4096 | $20 | 1/4 | $40 | 1 | $60 | 4 | $7F | 4096 |
| 8 | 2 | $01 | 1/16777216 | $20 | 1/16 | $40 | 1 | $60 | 16 | $7F | 16777216 |
| 8 | 3 | $01 | 1/(2^48) | $20 | 1/256 | $40 | 1 | $60 | 256 | $7F | 2^48 |
| 8 | k | $01 | 1/(2^(3(2^(k+1)))) | $20 | 1/(2^(2^k)) | $40 | 1 | $60 | 2^(2^k) | $7F | 2^(3(2^(k+1))) |

- Other than infinity, all posits of a given N and M are well-ordered and may be
compared for signed less-than, equality, or greater-than as though they were N-bit
twos-complement integers. For example, if N is 16, the most-negative posit has the bit
pattern $8001, and the most-positive posit has the bit pattern $7FFF. This holds for
any value of M, though the magnitudes of the most-positive and most-negative values vary
(dramatically) with M; see the table above.
- Any N-bit posit may be losslessly converted to an (N+n)-bit posit using the
same M by shifting its value left by n bits. (For example, the posit8m1 $54, the
posit16m1 $5400, and the posit32m1 $54000000 all represent the number 2.5.)
- There are no denormal numbers. All values other than infinity in a given posit format are exact
integer multiples of that format's smallest positive value. That smallest positive value always
has the twos-complement representation 1.
- For any power-of-two that has an exact posit representation for a given N and M,
the reciprocal of that value also has an exact posit representation for the same N
and M. Since the largest non-infinite and smallest non-zero posits are always powers of two,
this means that the dynamic range of posits is always symmetrical. (Compare with IEEE
floating-point numbers, where the reciprocal of the smallest denormal numbers is too large to
represent even approximately.)
- For any rational value that has an exact posit representation for a given N and M,
the negation of that value also has an exact posit representation for the same N
and M.
- As with twos-complement integers, you can determine the sign using only the most-significant-bit
of the value.
- You can negate a posit in the same way you would negate a twos-complement integer.
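
The mapping from bit patterns to rational numbers can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the usual posit field layout (sign bit, run-length-encoded regime, up to M exponent bits, then fraction bits), which this list does not spell out, and the name `decode_posit` is hypothetical.

```python
from fractions import Fraction


def decode_posit(bits: int, n: int, m: int):
    """Return the exact value of an n-bit posit with parameter m as a Fraction.

    Returns None for the infinity pattern (a one bit followed by n-1 zero bits).
    Assumes the usual layout: sign, regime, up to m exponent bits, fraction.
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        return None  # infinity
    negative = bool(bits >> (n - 1))
    if negative:
        bits = (-bits) & mask  # twos-complement negation gives the magnitude
    # Bits after the sign bit, most significant first.
    body = [(bits >> i) & 1 for i in range(n - 2, -1, -1)]
    # Regime: a run of identical bits, ended by the opposite bit or the end of the word.
    first = body[0]
    run = 1
    while run < len(body) and body[run] == first:
        run += 1
    k = run - 1 if first == 1 else -run
    rest = body[run + 1:]  # skip the bit that terminates the run, if present
    # Exponent: up to m bits; bits cut off by the end of the word count as zero.
    exp_bits = rest[:m]
    exp_bits = exp_bits + [0] * (m - len(exp_bits))
    e = 0
    for b in exp_bits:
        e = e * 2 + b
    # Fraction: whatever bits remain, with an implicit leading 1.
    frac = Fraction(0)
    for i, b in enumerate(rest[m:], start=1):
        frac += Fraction(b, 1 << i)
    value = (1 + frac) * Fraction(2) ** (k * (1 << m) + e)
    return -value if negative else value


print(decode_posit(0x01, 8, 0))  # 1/64
print(decode_posit(0x60, 8, 2))  # 16
print(decode_posit(0x54, 8, 1))  # 5/2
```

Using `Fraction` keeps every result exact, so the values can be compared directly against the table entries.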
See here for a detailed overview of posits.
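
Several of the properties listed above reduce to plain integer operations on the bit pattern. The following Python sketch illustrates them under the assumption that posits are held as unsigned integers of width `n`; all function names are hypothetical, and callers are assumed to handle the infinity pattern separately where the list says it is excluded.

```python
def to_signed(bits: int, n: int) -> int:
    """Interpret an n-bit pattern as a twos-complement integer."""
    bits &= (1 << n) - 1
    return bits - (1 << n) if bits >> (n - 1) else bits


def posit_less_than(a: int, b: int, n: int) -> bool:
    """Compare two n-bit posits (same M) as signed integers.

    Not meaningful if either operand is the infinity pattern.
    """
    return to_signed(a, n) < to_signed(b, n)


def posit_negate(bits: int, n: int) -> int:
    """Negate a posit exactly as a twos-complement integer is negated."""
    return (-bits) & ((1 << n) - 1)


def posit_is_negative(bits: int, n: int) -> bool:
    """Read the sign from the most significant bit (infinity excluded by the caller)."""
    return bool((bits >> (n - 1)) & 1)


def posit_widen(bits: int, n: int, extra: int) -> int:
    """Losslessly convert an n-bit posit to n+extra bits (same M) by shifting left."""
    return (bits & ((1 << n) - 1)) << extra


print(hex(posit_widen(0x54, 8, 8)))             # 0x5400
print(hex(posit_negate(0x7FFF, 16)))            # 0x8001
print(posit_less_than(0x8001, 0x7FFF, 16))      # True
```

None of these helpers needs to know M, which matches the observation that ordering, negation, and widening work the same way for every value of M.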