BFloat16
For conceptual model usage and type mapping, see:
- class pyrogue.BFloat16(bitSize)
Model class for 16-bit Brain Float (BFloat16) numbers.
- Parameters:
bitSize (int) – Number of bits being represented. Must be 16.
args (Any)
kwargs (Any)
- Return type:
Any
Notes
Format: 1 sign bit, 8 exponent bits, 7 mantissa bits. The exponent width and bias (127) are the same as float32, so BFloat16 covers the same dynamic range at lower precision. Supports infinity and NaN. The maximum representable finite value is approximately 3.39e38. Supported in hardware by NVIDIA Ampere (A100), Hopper (H100), and Blackwell GPUs.
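The field layout above can be checked by decoding a 16-bit pattern by hand. The following is a minimal sketch, not part of pyrogue: the helper name decode_bfloat16_bits is hypothetical and the code relies only on the 1/8/7 bit split and bias of 127 stated in the notes.

```python
# Hypothetical helper (not a pyrogue API): decode a raw 16-bit BFloat16
# pattern using the 1 sign / 8 exponent / 7 mantissa layout, bias 127.
BIAS = 127


def decode_bfloat16_bits(bits: int) -> float:
    sign = (bits >> 15) & 0x1      # top bit
    exponent = (bits >> 7) & 0xFF  # next 8 bits
    mantissa = bits & 0x7F         # low 7 bits

    if exponent == 0xFF:
        # All-ones exponent encodes infinity (mantissa 0) or NaN (mantissa != 0).
        if mantissa:
            return float("nan")
        return float("-inf") if sign else float("inf")

    if exponent == 0:
        # Zero and subnormals: no implicit leading 1.
        magnitude = (mantissa / 128.0) * 2.0 ** (1 - BIAS)
    else:
        # Normal numbers: implicit leading 1 plus a 7-bit fraction.
        magnitude = (1.0 + mantissa / 128.0) * 2.0 ** (exponent - BIAS)

    return -magnitude if sign else magnitude


# Largest finite pattern: exponent 254, mantissa all ones.
max_finite = decode_bfloat16_bits(0x7F7F)
print(max_finite)
```

The pattern 0x7F7F evaluates to (2 - 2^-7) * 2^127, which is where the approximate maximum of 3.39e38 comes from.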
- toBytes(value)
Convert float to 2-byte BFloat16 encoding.
BFloat16 is the upper 16 bits of the float32 bit pattern. All special values (NaN, infinity, zero, subnormals) are preserved.
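As an illustration of the "upper 16 bits" relationship, the sketch below performs the conversion with the standard struct module. It is not pyrogue's implementation: the function name float_to_bfloat16_bytes is hypothetical, and little-endian byte order and plain truncation (no rounding) are assumptions made for the example.

```python
import struct


def float_to_bfloat16_bytes(value: float) -> bytes:
    """Hypothetical helper: keep the upper 16 bits of the float32 pattern.

    Assumptions: little-endian output, truncation rather than rounding.
    struct.pack raises OverflowError for finite values outside float32 range.
    """
    # Reinterpret the float32 encoding of `value` as an unsigned 32-bit integer.
    (bits32,) = struct.unpack("<I", struct.pack("<f", value))
    # Drop the low 16 mantissa bits; sign and exponent are untouched.
    bits16 = bits32 >> 16
    return struct.pack("<H", bits16)


print(float_to_bfloat16_bytes(1.0).hex())
print(float_to_bfloat16_bytes(float("inf")).hex())
```

Because only low mantissa bits are discarded, the sign and exponent fields of the float32 value carry over unchanged.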
- fromBytes(ba)
Decode 2-byte BFloat16 encoding to float.
Reconstructs float32 by shifting the 16-bit pattern left by 16.
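The left shift described here can be sketched with the standard struct module. This is an illustrative example rather than pyrogue's code: the function name bfloat16_bytes_to_float is hypothetical, and little-endian input is an assumption.

```python
import struct


def bfloat16_bytes_to_float(ba: bytes) -> float:
    """Hypothetical helper: rebuild a float32 from a 2-byte BFloat16 encoding.

    Assumption: the two bytes are in little-endian order.
    """
    (bits16,) = struct.unpack("<H", ba)
    # Place the 16 bits in the upper half of a float32 pattern; the low
    # 16 mantissa bits are filled with zeros.
    bits32 = bits16 << 16
    (value,) = struct.unpack("<f", struct.pack("<I", bits32))
    return value


print(bfloat16_bytes_to_float(b"\x80\x3f"))  # pattern 0x3F80
print(bfloat16_bytes_to_float(b"\x49\x40"))  # pattern 0x4049
```

The pattern 0x4049 decodes to 3.140625 rather than 3.14159..., which shows the precision available from a 7-bit mantissa.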