Optimum documentation
Configuration
You are viewing version v1.17.1. A newer version, v1.25.2, is available.
QuantizationConfig
class optimum.amd.ryzenai.QuantizationConfig
( format: QuantFormat = <QuantFormat.QDQ: 1>, calibration_method: CalibrationMethod = <PowerOfTwoMethod.MinMSE: 1>, activations_dtype: QuantType = <QuantType.QUInt8: 1>, activations_symmetric: bool = True, weights_dtype: QuantType = <QuantType.QInt8: 0>, weights_symmetric: bool = True, enable_dpu: bool = True )
Parameters
- is_static (bool): Whether to apply static quantization or dynamic quantization.
- format (QuantFormat): Targeted RyzenAI quantization representation format. For the Operator Oriented (QOperator) format, all the quantized operators have their own ONNX definitions. For the Tensor Oriented (QDQ) format, the model is quantized by inserting QuantizeLinear / DequantizeLinear operators.
- calibration_method (CalibrationMethod): The method used to compute the activation quantization parameters from the calibration dataset.
- activations_dtype (QuantType, defaults to QuantType.QUInt8): The quantization data type to use for the activations.
- activations_symmetric (bool, defaults to True): Whether to apply symmetric quantization on the activations.
- weights_dtype (QuantType, defaults to QuantType.QInt8): The quantization data type to use for the weights.
- weights_symmetric (bool, defaults to True): Whether to apply symmetric quantization on the weights.
- enable_dpu (bool, defaults to True): Determines whether to generate a quantized model that is suitable for the DPU. If set to True, the quantization process will create a model that is optimized for DPU computations.
QuantizationConfig is the configuration class handling all the RyzenAI quantization parameters.
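To make the symmetric options and the QDQ format more concrete, here is a minimal plain-Python sketch of the general QuantizeLinear / DequantizeLinear arithmetic. It is not taken from optimum-amd or its quantization backend: the helpers `quant_params`, `quantize_linear`, and `dequantize_linear` are hypothetical names introduced for illustration, and the real quantizer may use different range and rounding conventions (for example, power-of-two scales).

```python
def quant_params(lo, hi, qmin, qmax, symmetric):
    """Return (scale, zero_point) mapping the float range [lo, hi] onto [qmin, qmax].

    Hypothetical helper for illustration only; not part of optimum-amd.
    """
    # Widen the range to include 0.0 so that zero is exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    if symmetric:
        # Symmetric: the float range is centred on zero, so the zero point
        # sits at the midpoint of the integer range (0 for int8, 128 for uint8).
        bound = max(-lo, hi)
        scale = 2 * bound / (qmax - qmin)
        zero_point = (qmin + qmax + 1) // 2
    else:
        # Asymmetric: use the full [lo, hi] range and shift it with a zero point.
        scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # constant all-zero tensor: any non-zero scale works
    if not symmetric:
        zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize_linear(x, scale, zero_point, qmin, qmax):
    """Quantize one float: round, shift by the zero point, then saturate."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize_linear(q, scale, zero_point):
    """Map a quantized integer back to an approximate float value."""
    return (q - zero_point) * scale


# Same float range, two configurations:
# asymmetric uint8 (0..255) and symmetric int8 (-128..127).
act_scale, act_zp = quant_params(-1.0, 3.0, 0, 255, symmetric=False)
w_scale, w_zp = quant_params(-1.0, 3.0, -128, 127, symmetric=True)
```

In this sketch the asymmetric case spends all 256 levels on the observed range, while the symmetric case keeps the zero point fixed at the cost of a coarser scale when the range is lopsided.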
AutoQuantizationConfig
RyzenAIConfig
class optimum.amd.ryzenai.RyzenAIConfig
( opset: Optional = None, quantization: Optional = None, **kwargs )
RyzenAIConfig is the configuration class handling all the VitisAI parameters related to the ONNX IR model export, together with the quantization parameters.