TensorFloat32
For conceptual model usage and type mapping, see:
- class pyrogue.TensorFloat32(bitSize)
Model class for 32-bit TensorFloat32 (NVIDIA TF32) numbers.
- Parameters:
bitSize (int) – Number of bits being represented. Must be 32.
args (Any)
kwargs (Any)
- Return type:
Any
Notes
Format: 1 sign bit, 8 exponent bits, 10 mantissa bits (1s/8e/10m). Bias = 127. Same exponent range as float32. Stored in a 4-byte word with the lower 13 mantissa bits zeroed. Supports infinity and NaN. Maximum representable finite value is approximately 3.40e38. Supported by NVIDIA Ampere (A100), Hopper (H100), and Blackwell GPUs.
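As a worked check of the layout described in the Notes, the sketch below builds the largest finite TF32 bit pattern by hand. It uses only the standard `struct` module, not pyrogue. The names `TF32_MASK` and `tf32_max_finite` are hypothetical and exist only for this illustration.

```python
import struct

# Hypothetical constant for illustration: keeps the sign bit, the 8 exponent
# bits and the top 10 mantissa bits of a float32 word (lower 13 bits cleared).
TF32_MASK = 0xFFFFE000


def tf32_max_finite() -> float:
    """Return the largest finite value with only 10 mantissa bits set."""
    # Exponent 0xFE is the largest non-special exponent (0xFF is inf/NaN).
    # The 10 retained mantissa bits occupy bits 22..13 of the float32 word.
    bits = (0xFE << 23) | (0x3FF << 13)
    # Reinterpret the 32-bit pattern as a float32 value.
    return struct.unpack('<f', struct.pack('<I', bits))[0]


print(tf32_max_finite())
```

The result equals (2 - 2^-10) * 2^127, which is consistent with the approximate maximum of 3.40e38 quoted in the Notes.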
- toBytes(value)
Convert float to 4-byte TensorFloat32 encoding.
TF32 zeros the lower 13 mantissa bits of the float32 bit pattern. All special values (NaN, infinity, zero, subnormals) are preserved.
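The masking step described above can be sketched with the standard `struct` module. This is a minimal illustration, not the pyrogue implementation: `tf32_encode` is a hypothetical helper, and the little-endian byte order is an assumption, since the text does not state one.

```python
import struct


def tf32_encode(value: float) -> bytes:
    """Hypothetical helper: truncate a float32 bit pattern to TF32 precision."""
    # Reinterpret the value as an unsigned 32-bit float32 bit pattern.
    (bits,) = struct.unpack('<I', struct.pack('<f', value))
    # Clear the lower 13 mantissa bits (truncation, not round-to-nearest).
    bits &= 0xFFFFE000
    # Little-endian output is an assumption made for this sketch.
    return struct.pack('<I', bits)


print(tf32_encode(0.1).hex())
```

In this sketch the float32 pattern for 0.1, 0x3DCCCCCD, becomes 0x3DCCC000, while patterns whose lower 13 bits are already zero, such as 1.0 and infinity, pass through unchanged.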
- fromBytes(ba)
Decode 4-byte TensorFloat32 encoding to float.
TF32 bit pattern is a valid float32 with lower 13 mantissa bits zeroed. Direct reinterpretation as float32 is sufficient.
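Because decoding is a plain reinterpretation, a sketch of it needs no bit manipulation at all. The helper name `tf32_decode` is hypothetical, and the little-endian byte order is again an assumption rather than something the text above confirms.

```python
import struct


def tf32_decode(ba: bytes) -> float:
    """Hypothetical helper: read 4 TF32 bytes as a float32 value."""
    # The TF32 pattern is already a valid float32, so unpack it directly.
    return struct.unpack('<f', ba)[0]


# 0x3DCCC000 is the float32 pattern for 0.1 with its lower 13 mantissa bits
# cleared, so the decoded value sits slightly below 0.1.
approx = tf32_decode(bytes.fromhex('00c0cc3d'))
print(approx)
```

The decoded value differs from 0.1 only in the discarded low mantissa bits, which illustrates the precision given up relative to float32 while the exponent range stays the same.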