Configuration

QuantizationConfig

class optimum.amd.ryzenai.QuantizationConfig

( format: QuantFormat = <QuantFormat.QDQ: 1>, calibration_method: CalibrationMethod = <PowerOfTwoMethod.MinMSE: 1>, activations_dtype: QuantType = <QuantType.QUInt8: 1>, activations_symmetric: bool = True, weights_dtype: QuantType = <QuantType.QInt8: 0>, weights_symmetric: bool = True, enable_dpu: bool = True )

Parameters

is_static (bool) — Whether to apply static quantization or dynamic quantization.
format (QuantFormat) — Targeted RyzenAI quantization representation format. In the Operator Oriented (QOperator) format, every quantized operator has its own ONNX definition. In the Tensor Oriented (QDQ) format, the model is quantized by inserting QuantizeLinear / DequantizeLinear operators.
calibration_method (CalibrationMethod) — The method chosen to calculate the activations quantization parameters using the calibration dataset.
activations_dtype (QuantType, defaults to QuantType.QUInt8) — The quantization data type to use for the activations.
activations_symmetric (bool, defaults to True) — Whether to apply symmetric quantization on the activations.
weights_dtype (QuantType, defaults to QuantType.QInt8) — The quantization data type to use for the weights.
weights_symmetric (bool, defaults to True) — Whether to apply symmetric quantization on the weights.
enable_dpu (bool, defaults to True) — Whether to generate a quantized model suitable for the DPU. If set to True, the quantization process creates a model optimized for DPU computations.

QuantizationConfig is the configuration class handling all the RyzenAI quantization parameters.
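To make the defaults in the signature above concrete, here is a minimal sketch that mirrors them with a stand-in dataclass. It is not the library's implementation: `QuantizationConfigSketch` is a hypothetical name, the enums are local stand-ins built from the values shown in the signature, and `QOperator = 0` is an assumption borrowed from ONNX Runtime's `QuantFormat` enum.

```python
from dataclasses import dataclass, replace
from enum import Enum


# Local stand-ins for the enums named in the signature (not imported from the library).
class QuantFormat(Enum):
    QOperator = 0  # assumption: value taken from ONNX Runtime's QuantFormat
    QDQ = 1


class PowerOfTwoMethod(Enum):
    MinMSE = 1  # only the member shown in the signature is modeled here


class QuantType(Enum):
    QInt8 = 0
    QUInt8 = 1


@dataclass
class QuantizationConfigSketch:
    """Hypothetical mirror of the documented fields and their defaults."""

    format: QuantFormat = QuantFormat.QDQ
    calibration_method: PowerOfTwoMethod = PowerOfTwoMethod.MinMSE
    activations_dtype: QuantType = QuantType.QUInt8
    activations_symmetric: bool = True
    weights_dtype: QuantType = QuantType.QInt8
    weights_symmetric: bool = True
    enable_dpu: bool = True


# All defaults, as listed in the signature.
default_config = QuantizationConfigSketch()

# Override selected fields by keyword; the remaining fields keep their defaults.
custom_config = replace(default_config, enable_dpu=False, activations_symmetric=False)

print(default_config)
print(custom_config)
```

With the library installed, the same keyword arguments would presumably be passed to `optimum.amd.ryzenai.QuantizationConfig` directly; that call is not exercised here.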

AutoQuantizationConfig

class optimum.amd.ryzenai.AutoQuantizationConfig

( )

RyzenAIConfig

class optimum.amd.ryzenai.RyzenAIConfig