Describe the specifics of your model inputs here. This information is used to build the internal graph representation on which compression relies, and for exporting the compressed model to an executable format.
If this field is unspecified, NNCF will try to deduce the input shapes and tensor types for the graph building purposes based on dataloader objects that are passed to compression algorithms by the user.
Shape of the tensor expected as input to the model.
No additional items. Example:
[1, 3, 224, 224]
Data type of the model input tensor.
Determines what the tensor will be filled with when passed to the model during tracing and exporting.
Keyword to be used when passing the tensor to the model's 'forward' method - leave unspecified to pass the corresponding argument as a positional arg.
Shape of the tensor expected as input to the model.
No additional items. Example:
[1, 3, 224, 224]
Data type of the model input tensor.
Determines what the tensor will be filled with when passed to the model during tracing and exporting.
Keyword to be used when passing the tensor to the model's 'forward' method - leave unspecified to pass the corresponding argument as a positional arg.
The target device; its specifics will be taken into account while compressing in order to obtain the best performance for this type of device. The default 'ANY' means quantization compatible with any HW. Set this value to 'TRIAL' if you are going to use a custom quantization schema.
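For illustration, a minimal top-level config covering the fields above might look as follows - a sketch assuming the 'input_info', 'sample_size', 'type', 'keyword' and 'target_device' key names (comments are illustrative only):
{
    "input_info": {
        "sample_size": [1, 3, 224, 224],  // NCHW shape of the model input
        "type": "float",                  // data type of the input tensor
        "keyword": "input_tensor"         // omit to pass the input positionally
    },
    "target_device": "ANY"
}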
Applies quantization on top of the input model, simulating future low-precision execution specifics, and selects the quantization layout and parameters to strive for the best possible quantized model accuracy and performance.
See Quantization.md and the rest of this schema for more details and parameters.
"quantization"
Specifies the kind of pre-training initialization used for the quantization algorithm.
Some kind of initialization is usually required so that the trainable quantization parameters have a better chance to get fine-tuned to values that result in good accuracy.
This initializer is applied by default to utilize batch norm statistics adaptation to the current compression scenario. See documentation for more details.
No additional properties.
Number of samples from the training dataset to use for model inference during the BatchNorm statistics adaptation procedure for the compressed model. The actual number of samples will be the closest multiple of the batch size. Set this to 0 to disable BN adaptation.
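A minimal sketch of this initializer's configuration, assuming the 'batchnorm_adaptation' and 'num_bn_adaptation_samples' key names (comments are illustrative only):
"initializer": {
    "batchnorm_adaptation": {
        "num_bn_adaptation_samples": 2000  // 0 disables BN adaptation
    }
}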
This initializer performs forward runs of the model to be quantized using samples from a user-supplied data loader to gather activation and weight tensor statistics within the network and use these to set up initial range parameters for quantizers.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' ones do. Increasing the number of initialization samples for 'offline' initialization types will increase the RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will either be collected over the entire set of tensor values, or collected and applied separately for each per-channel value subset.
Minimum quantizer range initialized using the minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using the maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value: "mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value: "min_max"
Minimum quantizer range initialized using averages (across every initialization sample) of the per-sample minima of values in the tensor to be quantized; maximum quantizer range initialized analogously using the per-sample maxima. Offline.
Specific value: "mean_min_max"
Quantizer minimum and maximum ranges set to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value: "threesigma"
Quantizer minimum and maximum ranges set to the specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value: "percentile"
Quantizer minimum and maximum ranges set to the averaged (across every initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value: "mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
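As a sketch, a percentile-based range initializer could then be configured as follows, assuming the 'range', 'num_init_samples', 'type', 'params', 'min_percentile' and 'max_percentile' key names (comments are illustrative only):
"initializer": {
    "range": {
        "num_init_samples": 256,
        "type": "percentile",
        "params": {
            "min_percentile": 0.1,   // percentile for the quantizer input minimum
            "max_percentile": 99.9   // percentile for the quantizer input maximum
        }
    }
}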
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' ones do. Increasing the number of initialization samples for 'offline' initialization types will increase the RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will either be collected over the entire set of tensor values, or collected and applied separately for each per-channel value subset.
Minimum quantizer range initialized using the minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using the maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value: "mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value: "min_max"
Minimum quantizer range initialized using averages (across every initialization sample) of the per-sample minima of values in the tensor to be quantized; maximum quantizer range initialized analogously using the per-sample maxima. Offline.
Specific value: "mean_min_max"
Quantizer minimum and maximum ranges set to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value: "threesigma"
Quantizer minimum and maximum ranges set to the specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value: "percentile"
Quantizer minimum and maximum ranges set to the averaged (across every initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value: "mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No additional items. Examples:
"{re}conv.*"
[
    "LeNet/relu_0",
    "LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No additional items. Examples:
[
    "UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
The target group of quantizers for which the specified type of range initialization will be applied. If unspecified, then the range initialization of the given type will be applied to all quantizers.
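The range initializer can also be given as a list of such specifications, each scoped to a quantizer group. A sketch, assuming the 'target_quantizer_group' key name:
"initializer": {
    "range": [
        {
            "type": "min_max",
            "num_init_samples": 64,
            "target_quantizer_group": "weights"
        },
        {
            "type": "mean_min_max",
            "num_init_samples": 256,
            "target_quantizer_group": "activations"
        }
    ]
}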
This initializer performs an advanced selection of bitwidth for each quantizer location, trying to achieve the best tradeoff between performance and quality of the resulting model.
No additional properties.
Type of precision initialization.
Applies the HAWQ algorithm to determine the best bitwidths for each quantizer using a Hessian calculation approach. For more details see Quantization.md.
Specific value: "hawq"
Applies the AutoQ algorithm to determine the best bitwidths for each quantizer using reinforcement learning. For more details see Quantization.md.
Specific value: "autoq"
Allows manually specifying, via the following config options, the exact bitwidth for each quantizer location.
Specific value: "manual"
A list of bitwidths to choose from when performing precision initialization. Overrides the bitwidth constraints specified in the 'weights' and 'activations' sections. Example:
[4, 8]
Number of data points used to iteratively estimate the Hessian trace.
Maximum number of iterations of the Hutchinson algorithm used to estimate the Hessian trace.
Minimum relative tolerance for stopping the Hutchinson algorithm, calculated between the mean average trace of the previous iteration and that of the current one.
For the 'hawq' mode: the desired ratio between the bit complexity of a fully INT8 model and that of a mixed-precision lower-bit one. At the precision initialization stage, the HAWQ algorithm chooses the most accurate mixed-precision configuration with a ratio no less than the specified value. The bit complexity of the model is the sum of the bit complexities of each quantized layer, defined as the product of the layer's FLOPS and the number of bits for its quantization.
For the 'autoq' mode: the target model size after quantization, relative to the total parameter size in FP32. E.g. a uniform INT8-quantized model would have a 'compression_ratio' equal to 0.25, and a uniform INT4-quantized model would have a 'compression_ratio' equal to 0.125.
The desired fraction of the dataloader to be iterated during each search iteration of AutoQ precision initialization. Specifically, this ratio applies to the 'autoq_eval_loader' registered via 'register_default_init_args'.
The number of random policy steps at the beginning of AutoQ precision initialization used to populate the replay buffer with experiences. This key is meant for internal testing use; users need not configure it.
Manual settings for the quantizer bitwidths. Scopes are used to identify the quantizers.
No additional items.
A tuple of a bitwidth and a scope of the quantizer to assign the bitwidth to.
No additional items. Example:
[
    [2, "ResNet/NNCFConv2d[conv1]/conv2d_0|WEIGHT"],
    [8, "ResNet/ReLU[relu]/relu__0|OUTPUT"]
]
Path to a serialized PyTorch tensor with average Hessian traces per quantized module. It can be used to accelerate mixed-precision initialization by reusing average Hessian traces from a previous run of the HAWQ algorithm.
Whether to dump data related to the precision initialization algorithm. The HAWQ dump includes the bitwidth graph, average traces and various plots. The AutoQ dump includes the DDPG agent's learning trajectory in TensorBoard and mixed-precision environment metadata.
The mode for assigning bitwidths to activation quantizers. In the 'strict' mode, a group of quantizers that feed their outputs to one and the same set of modules as input (weight quantizers count as well) will have the same bitwidth; the 'liberal' mode allows different precisions within such a group.
Bitwidth is assigned based on hardware constraints. If multiple variants are possible, the minimal compatible bitwidth is chosen.
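A hedged sketch of a HAWQ precision initializer, using the 'bits' and 'compression_ratio' parameters described above plus assumed key names for the Hessian-estimation knobs:
"initializer": {
    "precision": {
        "type": "hawq",
        "bits": [4, 8],            // candidate bitwidths
        "num_data_points": 100,    // data points for Hessian trace estimation (assumed key name)
        "iter_number": 200,        // Hutchinson iteration cap (assumed key name)
        "tolerance": 1e-4,         // stopping tolerance (assumed key name)
        "compression_ratio": 1.5   // desired INT8-to-mixed bit-complexity ratio
    }
}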
The preset defines the quantization scheme for weights and activations. The 'performance' mode sets up symmetric weight and activation quantizers. The 'mixed' mode utilizes symmetric weight quantization and asymmetric activation quantization.
Whether the model inputs should be immediately quantized prior to any other model operations.
Whether the model outputs should be additionally quantized.
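Putting the top-level quantization options together, a sketch assuming the 'preset', 'quantize_inputs' and 'quantize_outputs' key names (comments are illustrative only):
"compression": {
    "algorithm": "quantization",
    "preset": "mixed",          // symmetric weights, asymmetric activations
    "quantize_inputs": true,    // quantize model inputs immediately
    "quantize_outputs": false   // do not additionally quantize outputs
}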
Constraints to be applied to model weights quantization only.
No additional properties.
Mode of quantization. See Quantization.md for more details.
Bitwidth to quantize to. It is intended for manual bitwidth setting. Can be overridden by the 'bits' parameter from the 'precision' initializer section. An error occurs if it doesn't match the corresponding bitwidth constraints from the hardware configuration.
Whether to use signed or unsigned input/output values for quantization. 'true' will force the quantization to support signed values, 'false' will force the quantization to only support input values of one and the same sign, and leaving this value unspecified (default) means relying on the initialization statistics to determine the best approach.
Note: if set to 'false', but the input values have differing signs during initialization, signed quantization will be performed instead.
Whether to quantize inputs of this quantizer per each channel of input tensor (per 0-th dimension for weight quantization, and per 1-st dimension for activation quantization).
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No additional items. Examples:
"{re}conv.*"
[
    "LeNet/relu_0",
    "LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No additional items. Examples:
[
    "UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
Whether to use log of scale as the optimization parameter instead of the scale itself. This serves as an optional regularization opportunity for training quantizer scales.
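A sketch of a weight-quantization constraints block, assuming the 'mode', 'bits', 'signed', 'per_channel' and 'logarithm_scale' key names:
"weights": {
    "mode": "symmetric",
    "bits": 8,
    "signed": true,        // force signed quantization
    "per_channel": true,   // per 0-th dimension for weights
    "logarithm_scale": false
}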
Constraints to be applied to model activations quantization only.
No additional properties.
Mode of quantization. See Quantization.md for more details.
Bitwidth to quantize to. It is intended for manual bitwidth setting. Can be overridden by the 'bits' parameter from the 'precision' initializer section. An error occurs if it doesn't match the corresponding bitwidth constraints from the hardware configuration.
Whether to use signed or unsigned input/output values for quantization. 'true' will force the quantization to support signed values, 'false' will force the quantization to only support input values of one and the same sign, and leaving this value unspecified (default) means relying on the initialization statistics to determine the best approach.
Note: if set to 'false', but the input values have differing signs during initialization, signed quantization will be performed instead.
Whether to quantize inputs of this quantizer per each channel of input tensor (per 0-th dimension for weight quantization, and per 1-st dimension for activation quantization).
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No additional items. Examples:
"{re}conv.*"
[
    "LeNet/relu_0",
    "LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No additional items. Examples:
[
    "UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
Whether to use log of scale as the optimization parameter instead of the scale itself. This serves as an optional regularization opportunity for training quantizer scales.
Specifies operations in the model which will share the same quantizer module for activations. This is helpful in case one and the same quantizer scale is required for each input of this operation. Each sub-array will define a group of model operation inputs that have to share a single actual quantization module, each entry in this subarray should correspond to exactly one node in the NNCF graph and the groups should not overlap. The final quantizer for each sub-array will be associated with the first element of this sub-array.
No additional items.
This option is used to specify overriding quantization constraints for specific scopes, e.g. in case you need to quantize a single operation differently than the rest of the model. Any other automatic or group-wise settings will be overridden.
No additional properties. Example:
{
    "weights": {
        "QuantizeOutputsTestModel/NNCFConv2d[conv5]/conv2d_0": {
            "mode": "asymmetric"
        }
    },
    "activations": {
        "{re}.*conv_first.*": {
            "mode": "asymmetric"
        },
        "{re}.*conv_second.*": {
            "mode": "symmetric"
        }
    }
}
All properties whose name matches the following regular expression must respect the following conditions.
Property name regular expression: .*
Mode of quantization. See Quantization.md for more details.
Bitwidth to quantize to. It is intended for manual bitwidth setting. Can be overridden by the 'bits' parameter from the 'precision' initializer section. An error occurs if it doesn't match the corresponding bitwidth constraints from the hardware configuration.
Whether to use signed or unsigned input/output values for quantization. 'true' will force the quantization to support signed values, 'false' will force the quantization to only support input values of one and the same sign, and leaving this value unspecified (default) means relying on the initialization statistics to determine the best approach.
Note: if set to 'false', but the input values have differing signs during initialization, signed quantization will be performed instead.
Whether to quantize inputs of this quantizer per each channel of input tensor (per 0-th dimension for weight quantization, and per 1-st dimension for activation quantization).
This initializer performs forward runs of the model to be quantized using samples from a user-supplied data loader to gather activation and weight tensor statistics within the network and use these to set up initial range parameters for quantizers.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' ones do. Increasing the number of initialization samples for 'offline' initialization types will increase the RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will either be collected over the entire set of tensor values, or collected and applied separately for each per-channel value subset.
Minimum quantizer range initialized using the minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using the maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value: "mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value: "min_max"
Minimum quantizer range initialized using averages (across every initialization sample) of the per-sample minima of values in the tensor to be quantized; maximum quantizer range initialized analogously using the per-sample maxima. Offline.
Specific value: "mean_min_max"
Quantizer minimum and maximum ranges set to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value: "threesigma"
Quantizer minimum and maximum ranges set to the specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value: "percentile"
Quantizer minimum and maximum ranges set to the averaged (across every initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value: "mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' ones do. Increasing the number of initialization samples for 'offline' initialization types will increase the RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will either be collected over the entire set of tensor values, or collected and applied separately for each per-channel value subset.
Minimum quantizer range initialized using the minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using the maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value: "mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value: "min_max"
Minimum quantizer range initialized using averages (across every initialization sample) of the per-sample minima of values in the tensor to be quantized; maximum quantizer range initialized analogously using the per-sample maxima. Offline.
Specific value: "mean_min_max"
Quantizer minimum and maximum ranges set to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value: "threesigma"
Quantizer minimum and maximum ranges set to the specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value: "percentile"
Quantizer minimum and maximum ranges set to the averaged (across every initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value: "mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No additional items. Examples:
"{re}conv.*"
[
    "LeNet/relu_0",
    "LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No additional items. Examples:
[
    "UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
The target group of quantizers for which the specified type of range initialization will be applied. If unspecified, then the range initialization of the given type will be applied to all quantizers.
All properties whose name matches the following regular expression must respect the following conditions.
Property name regular expression: .*
Mode of quantization. See Quantization.md for more details.
Bitwidth to quantize to. It is intended for manual bitwidth setting. Can be overridden by the 'bits' parameter from the 'precision' initializer section. An error occurs if it doesn't match the corresponding bitwidth constraints from the hardware configuration.
Whether to use signed or unsigned input/output values for quantization. 'true' will force the quantization to support signed values, 'false' will force the quantization to only support input values of one and the same sign, and leaving this value unspecified (default) means relying on the initialization statistics to determine the best approach.
Note: if set to 'false', but the input values have differing signs during initialization, signed quantization will be performed instead.
Whether to quantize inputs of this quantizer per each channel of input tensor (per 0-th dimension for weight quantization, and per 1-st dimension for activation quantization).
This initializer performs forward runs of the model to be quantized using samples from a user-supplied data loader to gather activation and weight tensor statistics within the network and use these to set up initial range parameters for quantizers.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' ones do. Increasing the number of initialization samples for 'offline' initialization types will increase the RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will either be collected over the entire set of tensor values, or collected and applied separately for each per-channel value subset.
Minimum quantizer range initialized using the minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using the maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value: "mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value: "min_max"
Minimum quantizer range initialized using averages (across every initialization sample) of the per-sample minima of values in the tensor to be quantized; maximum quantizer range initialized analogously using the per-sample maxima. Offline.
Specific value: "mean_min_max"
Quantizer minimum and maximum ranges set to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value: "threesigma"
Quantizer minimum and maximum ranges set to the specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value: "percentile"
Quantizer minimum and maximum ranges set to the averaged (across every initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value: "mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' ones do. Increasing the number of initialization samples for 'offline' initialization types will increase the RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will either be collected over the entire set of tensor values, or collected and applied separately for each per-channel value subset.
Minimum quantizer range initialized using the minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using the maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value: "mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value: "min_max"
Minimum quantizer range initialized using averages (across every initialization sample) of the per-sample minima of values in the tensor to be quantized; maximum quantizer range initialized analogously using the per-sample maxima. Offline.
Specific value: "mean_min_max"
Quantizer minimum and maximum ranges set to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value: "threesigma"
Quantizer minimum and maximum ranges set to the specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value: "percentile"
Quantizer minimum and maximum ranges set to the averaged (across every initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value: "mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No additional items. Examples:
"{re}conv.*"
[
    "LeNet/relu_0",
    "LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No additional items. Examples:
[
    "UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
The target group of quantizers for which the specified type of range initialization will be applied. If unspecified, then the range initialization of the given type will be applied to all quantizers.
[Deprecated] Determines how the additional quantization operations should be exported into the ONNX format. Set this to true to export to ONNX standard QuantizeLinear-DequantizeLinear node pairs (8-bit quantization only) or to false to export to OpenVINO-supported FakeQuantize ONNX (all quantization settings supported).
This option controls whether to apply the overflow issue fix for the appropriate NNCF config. If set to 'disable', the fix will not be applied. If set to 'enable' or 'first_layer_only', and appropriate target_devices are chosen, the fix will be applied to all layers or to the first convolutional layer, respectively.
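These two export-related options might be set together as in this sketch (comments are illustrative only):
"export_to_onnx_standard_ops": false,  // deprecated - export OpenVINO FakeQuantize nodes
"overflow_fix": "enable"               // apply the overflow fix to all layers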
Configures the staged quantization compression scheduler for the quantization algorithm. The quantizers will not be applied until a given epoch count is reached.
No additional properties.
Gradients will be accumulated for this number of batches before performing a 'backward' call. Increasing this may improve training quality, since binarized networks exhibit noisy gradients and their training requires larger batch sizes than could be accommodated by GPUs.
A zero-based index of the epoch, upon reaching which the activations will start to be quantized.
Epoch index upon which the weights will start to be quantized.
Epoch index upon which the learning rate will start to be dropped. If unspecified, learning rate will not be dropped.
Duration, in epochs, of the learning rate dropping process.
Epoch to disable weight decay in the optimizer. If unspecified, weight decay will not be disabled.
Initial value of learning rate.
Initial value of weight decay.
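A sketch of the staged scheduler parameters; all key names here are assumed from the descriptions above rather than confirmed by this schema excerpt:
"params": {
    "batch_multiplier": 1,                // accumulate gradients over this many batches
    "activations_quant_start_epoch": 1,   // epoch at which activation quantization starts
    "weights_quant_start_epoch": 2,       // epoch at which weight quantization starts
    "lr_poly_drop_start_epoch": 20,       // epoch at which the LR drop begins
    "lr_poly_drop_duration_epochs": 10,   // duration of the LR drop
    "disable_wd_start_epoch": 20,         // epoch at which weight decay is disabled
    "base_lr": 1e-4,
    "base_wd": 1e-5
}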
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No additional items. Examples:
"{re}conv.*"
[
    "LeNet/relu_0",
    "LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No additional items. Examples:
[
    "UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
PyTorch only - Used to increase/decrease gradients for compression algorithms' parameters. The gradients will be multiplied by the specified value. If unspecified, the gradients will not be adjusted.
Applies filter pruning during training of the model to effectively remove entire sub-dimensions of tensors in the original model from computation and therefore increase performance.
See Pruning.md and the rest of this schema for more details and parameters.
"filter_pruning"
This initializer is applied by default to utilize batch norm statistics adaptation to the current compression scenario. See documentation for more details.
No additional properties.
Number of samples from the training dataset to use for model inference during the BatchNorm statistics adaptation procedure for the compressed model. The actual number of samples will be the closest multiple of the batch size. Set this to 0 to disable BN adaptation.
Initial value of the pruning level applied to the prunable operations.
The type of filter importance metric.
Target value of the pruning level for the operations that can be pruned. The operations are determined by analysis of the model architecture during the pruning algorithm initialization stage.
Number of epochs during which the pruning level is increased from 'pruning_init' to 'pruning_target'.
Target value of the pruning level for model FLOPs.
The type of scheduling to use for adjusting the target pruning level.
Number of epochs for model pretraining before starting filter pruning.
The type of filter ranking across the layers.
Whether to prune layers independently (choose filters with the smallest importance in each layer separately) or not.
Whether to prune first convolutional layers or not. A 'first' convolutional layer is such a layer that the path from model input to this layer has no other convolution operations on it.
Whether to prune downsampling convolutional layers (with stride > 1) or not.
Whether to prune parameters of the Batch Norm layer that corresponds to pruned filters of the convolutional layer which feeds its output to this Batch Norm.
Describes parameters specific to the LeGR pruning algorithm. See Pruning.md for more details.
No additional properties.
Number of generations for the evolution algorithm.
Number of training steps to estimate pruned model accuracy.
Pruning level for the model to train the LeGR algorithm on. If the learned ranking will be used for multiple pruning levels, the highest one should be used as 'max_pruning'. If the model will be pruned with a single pruning level, that target should be used.
Random seed for LeGR coefficients generation.
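A sketch of a filter pruning configuration, using 'pruning_init' and 'pruning_target' from this schema and otherwise assumed key names and values:
"compression": {
    "algorithm": "filter_pruning",
    "pruning_init": 0.05,              // initial pruning level
    "params": {
        "schedule": "exponential",     // scheduling type (assumed key name)
        "pruning_target": 0.4,         // final pruning level
        "pruning_steps": 15,           // epochs to ramp pruning_init -> pruning_target (assumed)
        "filter_importance": "L2",     // filter importance metric (assumed value)
        "prune_first_conv": false,
        "prune_downsample_convs": false
    }
}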
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No additional items. Examples:
"{re}conv.*"
[
    "LeNet/relu_0",
    "LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No additional items. Examples:
[
    "UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
Applies sparsity on top of the current model. Each weight tensor value will be either kept as-is, or set to 0 based on its magnitude. For large sparsity levels, this will improve performance on hardware that can profit from it. See Sparsity.md and the rest of this schema for more details and parameters.
No Additional Properties"magnitude_sparsity"
Initial value of the sparsity level applied to the model.
This initializer is applied by default to utilize batch norm statistics adaptation to the current compression scenario. See documentation for more details.
No additional properties.
Number of samples from the training dataset to use for model inference during the BatchNorm statistics adaptation procedure for the compressed model. The actual number of samples will be the closest multiple of the batch size. Set this to 0 to disable BN adaptation.
The mode of sparsity level setting: 'global' - the sparsity level is calculated across all weight values in the network, across layers; 'local' - the sparsity level can be set per-layer, and within each layer it is computed with respect only to the weight values within that layer.
The type of scheduling to use for adjusting the target sparsity level. Default: exponential for 'rb_sparsity', polynomial otherwise.
Target sparsity level for the model, to be reached at the end of the compression schedule.
Index of the epoch upon which the sparsity level of the model is scheduled to become equal to 'sparsity_target'.
Index of the epoch upon which the sparsity mask will be frozen and no longer trained.
Whether the function-based sparsity level schedulers should update the sparsity level after each optimizer step instead of each epoch step.
Number of optimizer steps in one epoch. Required to start proper scheduling in the first training epoch if 'update_per_optimizer_step' is true.
A list of scheduler steps at which to transition to the next scheduled sparsity level (multistep scheduler only).
No additional items.
Multistep scheduler only - levels of sparsity to use at each step of the scheduler, as specified in the 'multistep_steps' attribute. The first sparsity level will be applied immediately, so the length of this list should be larger than the length of 'multistep_steps' by one. The last sparsity level will function as the ultimate sparsity target, overriding the 'sparsity_target' setting if it is present.
A conventional patience parameter for the scheduler, as for any other standard scheduler. Specified in units of scheduler steps.
For polynomial scheduler - determines the corresponding power value.
For polynomial scheduler - if 'true', the target sparsity level will be approached in a concave manner, and in a convex manner otherwise.
Determines the way in which the weight values will be sorted after being aggregated in order to determine the sparsity threshold corresponding to a specific sparsity level.
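A sketch of a multistep magnitude sparsity configuration, using the 'multistep_steps' name from this schema and an assumed 'multistep_sparsity_levels' key for the level list described above:
"compression": {
    "algorithm": "magnitude_sparsity",
    "sparsity_init": 0.05,
    "params": {
        "schedule": "multistep",
        "multistep_steps": [10, 20],
        "multistep_sparsity_levels": [0.2, 0.4, 0.6]  // one entry more than multistep_steps
    }
}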
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No additional items. Examples:
"{re}conv.*"
[
    "LeNet/relu_0",
    "LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No additional items. Examples:
[
    "UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
Applies sparsity on top of the current model. Each weight tensor value will be either kept as-is, or set to 0 based on its importance as determined by the regularization-based sparsity algorithm. For large sparsity levels, this will improve performance on hardware that can profit from it. See Sparsity.md and the rest of this schema for more details and parameters.
No Additional Properties"rb_sparsity"
Initial value of the sparsity level applied to the model
The mode of sparsity level setting: 'global' - the sparsity level is calculated across all weight values in the network, across layers; 'local' - the sparsity level can be set per-layer, and within each layer it is computed with respect only to the weight values within that layer.
The type of scheduling to use for adjusting the target sparsity level. Default: exponential for 'rb_sparsity', polynomial otherwise.
Target sparsity level for the model, to be reached at the end of the compression schedule.
Index of the epoch upon which the sparsity level of the model is scheduled to become equal to 'sparsity_target'.
Index of the epoch upon which the sparsity mask will be frozen and no longer trained.
Whether the function-based sparsity level schedulers should update the sparsity level after each optimizer step instead of each epoch step.
Number of optimizer steps in one epoch. Required to start proper scheduling in the first training epoch if 'update_per_optimizer_step' is true.
A list of scheduler steps at which to transition to the next scheduled sparsity level (multistep scheduler only).
No additional items.
Multistep scheduler only - levels of sparsity to use at each step of the scheduler, as specified in the 'multistep_steps' attribute. The first sparsity level will be applied immediately, so the length of this list should be larger than the length of 'multistep_steps' by one. The last sparsity level will function as the ultimate sparsity target, overriding the 'sparsity_target' setting if it is present.
A conventional patience parameter for the scheduler, as for any other standard scheduler. Specified in units of scheduler steps.
For polynomial scheduler - determines the corresponding power value.
For polynomial scheduler - if 'true', the target sparsity level will be approached in a concave manner, and in a convex manner otherwise.
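A sketch of an RB sparsity configuration with an exponential schedule (key names assumed from the descriptions above):
"compression": {
    "algorithm": "rb_sparsity",
    "sparsity_init": 0.02,
    "params": {
        "schedule": "exponential",
        "sparsity_target": 0.6,
        "sparsity_target_epoch": 50,   // epoch at which sparsity_target is reached
        "sparsity_freeze_epoch": 60    // epoch at which the mask is frozen
    }
}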
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No additional items. Examples:
"{re}conv.*"
[
    "LeNet/relu_0",
    "LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No additional items. Examples:
[
    "UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
PyTorch only - Used to increase/decrease gradients for compression algorithms' parameters. The gradients will be multiplied by the specified value. If unspecified, the gradients will not be adjusted.
This algorithm is only useful in combination with other compression algorithms and improves the end accuracy of the corresponding algorithm by calculating a knowledge distillation loss between the compressed model currently in training and its original, uncompressed counterpart. See KnowledgeDistillation.md and the rest of this schema for more details and parameters.
No additional properties.
"knowledge_distillation"
Type of Knowledge Distillation Loss.
Knowledge distillation loss value multiplier.
'softmax' type only - temperature for logits softening.
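A sketch of a knowledge distillation entry, assuming 'type', 'scale' and 'temperature' as the key names for the three parameters above:
"compression": {
    "algorithm": "knowledge_distillation",
    "type": "softmax",
    "scale": 1.0,        // loss multiplier
    "temperature": 5.0   // 'softmax' type only
}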
This algorithm takes no additional parameters and is used when you want to load a checkpoint trained with another sparsity algorithm and do other compression without changing the sparsity mask.
No Additional Properties"const_sparsity"
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No additional items. Examples:
"{re}conv.*"
[
    "LeNet/relu_0",
    "LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No additional items. Examples:
[
    "UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
Binarization is a particular case of the more general quantization algorithm.
See Binarization.md and the rest of this schema for more details and parameters.
"binarization"
Selects the mode of binarization - either 'xnor' for XNOR binarization, or 'dorefa' for DoReFa binarization.
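A minimal binarization entry might look like this sketch:
"compression": {
    "algorithm": "binarization",
    "mode": "xnor"
}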
Specifies the kind of pre-training initialization used for the quantization algorithm.
Some kind of initialization is usually required so that the trainable quantization parameters have a better chance to get fine-tuned to values that result in good accuracy.
This initializer is applied by default to utilize batch norm statistics adaptation to the current compression scenario. See documentation for more details.
No additional properties.
Number of samples from the training dataset to use for model inference during the BatchNorm statistics adaptation procedure for the compressed model. The actual number of samples will be the closest multiple of the batch size. Set this to 0 to disable BN adaptation.
This initializer performs forward runs of the model to be quantized using samples from a user-supplied data loader to gather activation and weight tensor statistics within the network and use these to set up initial range parameters for quantizers.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' ones do. Increasing the number of initialization samples for 'offline' initialization types will increase the RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will either be collected over the entire set of tensor values, or collected and applied separately for each per-channel value subset.
Minimum quantizer range initialized using the minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using the maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value: "mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value: "min_max"
Minimum quantizer range initialized using averages (across every initialization sample) of the per-sample minima of values in the tensor to be quantized; maximum quantizer range initialized analogously using the per-sample maxima. Offline.
Specific value: "mean_min_max"
Quantizer minimum and maximum ranges set to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value: "threesigma"
Quantizer minimum and maximum ranges set to the specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value: "percentile"
Quantizer minimum and maximum ranges set to the averaged (across every initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value: "mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' ones do. Increasing the number of initialization samples for 'offline' initialization types will increase the RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will either be collected over the entire set of tensor values, or collected and applied separately for each per-channel value subset.
Minimum quantizer range initialized using the minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using the maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value: "mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value: "min_max"
Minimum quantizer range initialized using averages (across every initialization sample) of the per-sample minima of values in the tensor to be quantized; maximum quantizer range initialized analogously using the per-sample maxima. Offline.
Specific value: "mean_min_max"
Quantizer minimum and maximum ranges set to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value: "threesigma"
Quantizer minimum and maximum ranges set to the specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value: "percentile"
Quantizer minimum and maximum ranges set to the averaged (across every initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value: "mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No additional items. Examples:
"{re}conv.*"
[
    "LeNet/relu_0",
    "LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No additional items. Examples:
[
    "UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
    "UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
The target group of quantizers for which the specified type of range initialization will be applied. If unspecified, then the range initialization of the given type will be applied to all quantizers.
This initializer performs an advanced selection of bitwidth for each quantizer location, trying to achieve the best tradeoff between performance and quality of the resulting model.
No additional properties.
Type of precision initialization.
Applies the HAWQ algorithm to determine the best bitwidths for each quantizer using a Hessian calculation approach. For more details see Quantization.md.
Specific value: "hawq"
Applies the AutoQ algorithm to determine the best bitwidths for each quantizer using reinforcement learning. For more details see Quantization.md.
Specific value: "autoq"
Allows manually specifying, via the following config options, the exact bitwidth for each quantizer location.
Specific value: "manual"
A list of bitwidths to choose from when performing precision initialization. Overrides the bitwidth constraints specified in the 'weights' and 'activations' sections. Example:
[4, 8]
Number of data points used to iteratively estimate the Hessian trace.
Maximum number of iterations of the Hutchinson algorithm used to estimate the Hessian trace.
Minimum relative tolerance for stopping the Hutchinson algorithm, calculated between the mean average trace of the previous iteration and that of the current one.
For the 'hawq' mode: the desired ratio between the bit complexity of a fully INT8 model and that of a mixed-precision lower-bit one. At the precision initialization stage, the HAWQ algorithm chooses the most accurate mixed-precision configuration with a ratio no less than the specified value. The bit complexity of the model is the sum of the bit complexities of each quantized layer, defined as the product of the layer's FLOPS and the number of bits for its quantization.
For the 'autoq' mode: the target model size after quantization, relative to the total parameter size in FP32. E.g. a uniform INT8-quantized model would have a 'compression_ratio' equal to 0.25, and a uniform INT4-quantized model would have a 'compression_ratio' equal to 0.125.
The desired fraction of the dataloader to be iterated during each search iteration of AutoQ precision initialization. Specifically, this ratio applies to the 'autoq_eval_loader' registered via 'register_default_init_args'.
The number of random policy steps at the beginning of AutoQ precision initialization used to populate the replay buffer with experiences. This key is meant for internal testing use; users need not configure it.
Manual settings for the quantizer bitwidths. Scopes are used to identify the quantizers.
No additional items.
A tuple of a bitwidth and a scope of the quantizer to assign the bitwidth to.
No additional items. Example:
[
    [2, "ResNet/NNCFConv2d[conv1]/conv2d_0|WEIGHT"],
    [8, "ResNet/ReLU[relu]/relu__0|OUTPUT"]
]
Path to a serialized PyTorch tensor with average Hessian traces per quantized module. It can be used to accelerate mixed-precision initialization by reusing average Hessian traces from a previous run of the HAWQ algorithm.
Whether to dump data related to Precision Initialization algorithm. HAWQ dump includes bitwidth graph, average traces and different plots. AutoQ dump includes DDPG agent learning trajectory in tensorboard and mixed-precision environment metadata.
The mode for assignment bitwidth to activation quantizers. In the 'strict' mode,a group of quantizers that feed their output to one and more same modules as input (weight quantizers count as well) will have the same bitwidth in the 'liberal' mode allows different precisions within the group.
Bitwidth is assigned based on hardware constraints. If multiple variants are possible, the minimal compatible bitwidth is chosen.
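For illustration, a minimal sketch of a precision initializer section combining the options above; key names such as type, bits, compression_ratio and dump_init_precision_data are taken from the descriptions in this schema, but the exact spelling should be verified against your NNCF version:
{
    "type": "hawq",
    "bits": [4, 8],
    "compression_ratio": 1.5,
    "dump_init_precision_data": true
}
With "type": "manual", the per-scope bitwidth list shown above would be supplied instead of compression_ratio.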
Configures the staged quantization compression scheduler for the quantization algorithm. The quantizers will not be applied until a given epoch count is reached.
No Additional PropertiesGradients will be accumulated for this number of batches before doing a 'backward' call. Increasing this may improve training quality, since binarized networks exhibit noisy gradients and their training requires larger batch sizes than could be accommodated by GPUs.
A zero-based index of the epoch, upon reaching which the activations will start to be quantized.
Epoch index upon which the weights will start to be quantized.
Epoch index upon which the learning rate will start to be dropped. If unspecified, learning rate will not be dropped.
Duration, in epochs, of the learning rate dropping process.
Epoch to disable weight decay in the optimizer. If unspecified, weight decay will not be disabled.
Initial value of learning rate.
Initial value of weight decay.
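As a sketch, a staged scheduler configuration using these fields might look like the following; the field names are assumptions patterned on typical NNCF staged-quantization configs and should be checked against the schema:
{
    "params": {
        "batch_multiplier": 1,
        "activations_quant_start_epoch": 1,
        "weights_quant_start_epoch": 2,
        "lr_poly_drop_start_epoch": 20,
        "lr_poly_drop_duration_epochs": 10,
        "disable_wd_start_epoch": 20,
        "base_lr": 3.1e-4,
        "base_wd": 1e-5
    }
}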
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No Additional Items"{re}conv.*"
[
"LeNet/relu_0",
"LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No Additional Items[
"UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
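Taken together, a hypothetical snippet using both scope lists with strict validation could look as follows; ignored_scopes, target_scopes and validate_scopes are the key names commonly used for these properties in NNCF configs, stated here as assumptions:
{
    "target_scopes": ["{re}UNet/ModuleList\\[down_path\\].*"],
    "ignored_scopes": ["UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"],
    "validate_scopes": true
}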
PyTorch only - Used to increase/decrease gradients for compression algorithms' parameters. The gradients will be multiplied by the specified value. If unspecified, the gradients will not be adjusted.
Applies quantization on top of the input model, simulating future low-precision execution specifics, and selects the quantization layout and parameters to strive for the best possible quantized model accuracy and performance.
See Quantization.md and the rest of this schema for more details and parameters.
"experimental_quantization"
Specifies the kind of pre-training initialization used for the quantization algorithm.
Some kind of initialization is usually required so that the trainable quantization parameters have a better chance to get fine-tuned to values that result in good accuracy.
This initializer is applied by default to utilize batch norm statistics adaptation to the current compression scenario. See documentation for more details.
No Additional PropertiesNumber of samples from the training dataset to use for model inference during the BatchNorm statistics adaptation procedure for the compressed model. The actual number of samples will be the closest multiple of the batch size. Set this to 0 to disable BN adaptation.
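For example, a minimal initializer section enabling only BN adaptation might look like this sketch; num_bn_adaptation_samples is an assumed spelling of the sample-count key described above:
{
    "initializer": {
        "batchnorm_adaptation": {
            "num_bn_adaptation_samples": 2000
        }
    }
}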
This initializer performs forward runs of the model to be quantized using samples from a user-supplied data loader to gather activation and weight tensor statistics within the network and use these to set up initial range parameters for quantizers.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' do. Increasing the number of initialization samples for 'offline' initialization types will increase RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will be collected either on the basis of the set of the entire tensor values, or these will be collected and applied separately for each channel value subset.
Minimum quantizer range initialized using minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value:"mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value:"min_max"
Minimum quantizer range initialized using averages (across every single initialization sample) of the minima of values in the tensor to be quantized, maximum quantizer range initialized using averages of the maxima, respectively. Offline.
Specific value:"mean_min_max"
Quantizer minimum and maximum ranges set to be equal to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value:"threesigma"
Quantizer minimum and maximum ranges set to be equal to specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value:"percentile"
Quantizer minimum and maximum ranges set to be equal to averaged (across every single initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value:"mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' do. Increasing the number of initialization samples for 'offline' initialization types will increase RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will be collected either on the basis of the set of the entire tensor values, or these will be collected and applied separately for each channel value subset.
Minimum quantizer range initialized using minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value:"mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value:"min_max"
Minimum quantizer range initialized using averages (across every single initialization sample) of the minima of values in the tensor to be quantized, maximum quantizer range initialized using averages of the maxima, respectively. Offline.
Specific value:"mean_min_max"
Quantizer minimum and maximum ranges set to be equal to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value:"threesigma"
Quantizer minimum and maximum ranges set to be equal to specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value:"percentile"
Quantizer minimum and maximum ranges set to be equal to averaged (across every single initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value:"mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No Additional Items"{re}conv.*"
[
"LeNet/relu_0",
"LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No Additional Items[
"UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
The target group of quantizers for which the specified type of range initialization will be applied. If unspecified, then the range initialization of the given type will be applied to all quantizers.
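As an illustrative sketch, range initialization can also be specified per quantizer group as a list of entries; the key names below (type, num_init_samples, params, target_quantizer_group) mirror the properties described above but should be treated as assumptions:
{
    "range": [
        {
            "type": "min_max",
            "num_init_samples": 256,
            "target_quantizer_group": "weights"
        },
        {
            "type": "percentile",
            "num_init_samples": 1024,
            "params": { "min_percentile": 0.1, "max_percentile": 99.9 },
            "target_quantizer_group": "activations"
        }
    ]
}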
This initializer performs advanced selection of bitwidth per each quantizer location, trying to achieve the best tradeoff between performance and quality of the resulting model.
No Additional PropertiesType of precision initialization.
Applies the HAWQ algorithm to determine the best bitwidths for each quantizer using a Hessian calculation approach. For more details see Quantization.md
Specific value:"hawq"
Applies the AutoQ algorithm to determine the best bitwidths for each quantizer using reinforcement learning. For more details see Quantization.md
Specific value:"autoq"
Allows the exact bitwidth for each quantizer location to be specified manually via the following config options.
Specific value:"manual"
A list of bitwidths to choose from when performing precision initialization. Overrides the bit constraints specified in the weight and activation sections.
[
4,
8
]
Number of data points to iteratively estimate Hessian trace.
Maximum number of iterations of the Hutchinson algorithm used to estimate the Hessian trace.
Minimum relative tolerance for stopping the Hutchinson algorithm. It is calculated between the mean average trace from the previous iteration and that from the current one.
For the hawq mode:
The desired ratio between the bit complexity of a fully INT8 model and a mixed-precision lower-bit one. At the precision initialization stage, the HAWQ algorithm chooses the most accurate mixed-precision configuration with a ratio no less than the specified one. The bit complexity of the model is the sum of the bit complexities of each quantized layer, each of which is the product of the layer's FLOPs and the number of bits used for its quantization.
For the autoq mode:
The target model size after quantization, relative to the total parameter size in FP32. E.g. a uniform INT8-quantized model would have a compression_ratio equal to 0.25, and a uniform INT4-quantized model would have a compression_ratio equal to 0.125.
The desired fraction of the dataloader to be iterated during each search iteration of AutoQ precision initialization. Specifically, this ratio applies to the autoq_eval_loader registered via register_default_init_args.
The number of random policy iterations at the beginning of AutoQ precision initialization, used to populate the replay buffer with experiences. This key is meant for internal testing use; users need not configure it.
Manual settings for the quantizer bitwidths. Scopes are used to identify the quantizers.
No Additional ItemsA tuple of a bitwidth and a scope of the quantizer to assign the bitwidth to.
No Additional Items[
[
2,
"ResNet/NNCFConv2d[conv1]/conv2d_0|WEIGHT"
],
[
8,
"ResNet/ReLU[relu]/relu__0|OUTPUT"
]
]
Path to a serialized PyTorch Tensor with average Hessian traces per quantized module. It can be used to accelerate mixed-precision initialization by reusing average Hessian traces from a previous run of the HAWQ algorithm.
Whether to dump data related to the precision initialization algorithm. The HAWQ dump includes the bitwidth graph, average traces and different plots. The AutoQ dump includes the DDPG agent learning trajectory in TensorBoard and mixed-precision environment metadata.
The mode for assigning bitwidths to activation quantizers. In the 'strict' mode, a group of quantizers that feed their outputs as inputs to one and the same set of modules (weight quantizers count as well) must share the same bitwidth; the 'liberal' mode allows different precisions within the group.
Bitwidth is assigned based on hardware constraints. If multiple variants are possible, the minimal compatible bitwidth is chosen.
The preset defines the quantization schema for weights and activations. The 'performance' mode sets up symmetric weight and activation quantizers. The 'mixed' mode utilizes symmetric weight quantization and asymmetric activation quantization.
Whether the model inputs should be immediately quantized prior to any other model operations.
Whether the model outputs should be additionally quantized.
Constraints to be applied to model weights quantization only.
No Additional PropertiesMode of quantization. See Quantization.md for more details.
Bitwidth to quantize to. It is intended for manual bitwidth setting. Can be overridden by the bits parameter from the precision initializer section. An error occurs if it doesn't match the corresponding bitwidth constraints from the hardware configuration.
Whether to use signed or unsigned input/output values for quantization. true will force the quantization to support signed values, false will force the quantization to only support input values with one and the same sign, and leaving this value unspecified (default) means relying on the initialization statistics to determine the best approach. Note: If set to false, but the input values have differing signs during initialization, signed quantization will be performed instead.
Whether to quantize inputs of this quantizer per each channel of input tensor (per 0-th dimension for weight quantization, and per 1-st dimension for activation quantization).
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No Additional Items"{re}conv.*"
[
"LeNet/relu_0",
"LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No Additional Items[
"UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
Whether to use log of scale as the optimization parameter instead of the scale itself. This serves as an optional regularization opportunity for training quantizer scales.
Constraints to be applied to model activations quantization only.
No Additional PropertiesMode of quantization. See Quantization.md for more details.
Bitwidth to quantize to. It is intended for manual bitwidth setting. Can be overridden by the bits parameter from the precision initializer section. An error occurs if it doesn't match the corresponding bitwidth constraints from the hardware configuration.
Whether to use signed or unsigned input/output values for quantization. true will force the quantization to support signed values, false will force the quantization to only support input values with one and the same sign, and leaving this value unspecified (default) means relying on the initialization statistics to determine the best approach. Note: If set to false, but the input values have differing signs during initialization, signed quantization will be performed instead.
Whether to quantize inputs of this quantizer per each channel of input tensor (per 0-th dimension for weight quantization, and per 1-st dimension for activation quantization).
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No Additional Items"{re}conv.*"
[
"LeNet/relu_0",
"LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No Additional Items[
"UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
Whether to use log of scale as the optimization parameter instead of the scale itself. This serves as an optional regularization opportunity for training quantizer scales.
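Combining the two constraint sections, a hypothetical quantization configuration might look like the sketch below; the key names follow the properties described above (mode, bits, per_channel), but treat them as assumptions to verify against the schema:
{
    "weights": {
        "mode": "symmetric",
        "bits": 8,
        "per_channel": true
    },
    "activations": {
        "mode": "asymmetric",
        "bits": 8,
        "per_channel": false
    }
}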
Specifies operations in the model which will share the same quantizer module for activations. This is helpful in case one and the same quantizer scale is required for each input of this operation. Each sub-array will define a group of model operation inputs that have to share a single actual quantization module, each entry in this subarray should correspond to exactly one node in the NNCF graph and the groups should not overlap. The final quantizer for each sub-array will be associated with the first element of this sub-array.
No Additional ItemsThis option is used to specify overriding quantization constraints for specific scope,e.g. in case you need to quantize a single operation differently than the rest of the model. Any other automatic or group-wise settings will be overridden.
No Additional Properties{
"weights": {
"QuantizeOutputsTestModel/NNCFConv2d[conv5]/conv2d_0": {
"mode": "asymmetric"
},
"activations": {
"{re}.*conv_first.*": {
"mode": "asymmetric"
},
"{re}.*conv_second.*": {
"mode": "symmetric"
}
}
}
}
All properties whose name matches the following regular expression must respect the following conditions
Property name regular expression:.*
Mode of quantization. See Quantization.md for more details.
Bitwidth to quantize to. It is intended for manual bitwidth setting. Can be overridden by the bits parameter from the precision initializer section. An error occurs if it doesn't match the corresponding bitwidth constraints from the hardware configuration.
Whether to use signed or unsigned input/output values for quantization. true will force the quantization to support signed values, false will force the quantization to only support input values with one and the same sign, and leaving this value unspecified (default) means relying on the initialization statistics to determine the best approach. Note: If set to false, but the input values have differing signs during initialization, signed quantization will be performed instead.
Whether to quantize inputs of this quantizer per each channel of input tensor (per 0-th dimension for weight quantization, and per 1-st dimension for activation quantization).
This initializer performs forward runs of the model to be quantized using samples from a user-supplied data loader to gather activation and weight tensor statistics within the network and use these to set up initial range parameters for quantizers.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' do. Increasing the number of initialization samples for 'offline' initialization types will increase RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will be collected either on the basis of the set of the entire tensor values, or these will be collected and applied separately for each channel value subset.
Minimum quantizer range initialized using minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value:"mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value:"min_max"
Minimum quantizer range initialized using averages (across every single initialization sample) of the minima of values in the tensor to be quantized, maximum quantizer range initialized using averages of the maxima, respectively. Offline.
Specific value:"mean_min_max"
Quantizer minimum and maximum ranges set to be equal to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value:"threesigma"
Quantizer minimum and maximum ranges set to be equal to specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value:"percentile"
Quantizer minimum and maximum ranges set to be equal to averaged (across every single initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value:"mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' do. Increasing the number of initialization samples for 'offline' initialization types will increase RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will be collected either on the basis of the set of the entire tensor values, or these will be collected and applied separately for each channel value subset.
Minimum quantizer range initialized using minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value:"mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value:"min_max"
Minimum quantizer range initialized using averages (across every single initialization sample) of the minima of values in the tensor to be quantized, maximum quantizer range initialized using averages of the maxima, respectively. Offline.
Specific value:"mean_min_max"
Quantizer minimum and maximum ranges set to be equal to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value:"threesigma"
Quantizer minimum and maximum ranges set to be equal to specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value:"percentile"
Quantizer minimum and maximum ranges set to be equal to averaged (across every single initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value:"mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No Additional Items"{re}conv.*"
[
"LeNet/relu_0",
"LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No Additional Items[
"UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
The target group of quantizers for which the specified type of range initialization will be applied. If unspecified, then the range initialization of the given type will be applied to all quantizers.
All properties whose name matches the following regular expression must respect the following conditions
Property name regular expression:.*
Mode of quantization. See Quantization.md for more details.
Bitwidth to quantize to. It is intended for manual bitwidth setting. Can be overridden by the bits parameter from the precision initializer section. An error occurs if it doesn't match the corresponding bitwidth constraints from the hardware configuration.
Whether to use signed or unsigned input/output values for quantization. true will force the quantization to support signed values, false will force the quantization to only support input values with one and the same sign, and leaving this value unspecified (default) means relying on the initialization statistics to determine the best approach. Note: If set to false, but the input values have differing signs during initialization, signed quantization will be performed instead.
Whether to quantize inputs of this quantizer per each channel of input tensor (per 0-th dimension for weight quantization, and per 1-st dimension for activation quantization).
This initializer performs forward runs of the model to be quantized using samples from a user-supplied data loader to gather activation and weight tensor statistics within the network and use these to set up initial range parameters for quantizers.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' do. Increasing the number of initialization samples for 'offline' initialization types will increase RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will be collected either on the basis of the set of the entire tensor values, or these will be collected and applied separately for each channel value subset.
Minimum quantizer range initialized using minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value:"mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value:"min_max"
Minimum quantizer range initialized using averages (across every single initialization sample) of the minima of values in the tensor to be quantized, maximum quantizer range initialized using averages of the maxima, respectively. Offline.
Specific value:"mean_min_max"
Quantizer minimum and maximum ranges set to be equal to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value:"threesigma"
Quantizer minimum and maximum ranges set to be equal to specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value:"percentile"
Quantizer minimum and maximum ranges set to be equal to averaged (across every single initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value:"mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
Number of samples from the training dataset to consume as sample model inputs for purposes of setting initial minimum and maximum quantization ranges.
Type of the initializer - determines which statistics gathered during initialization will be used to initialize the quantization ranges.
'Online' initializers do not have to store intermediate statistics in memory, while 'offline' do. Increasing the number of initialization samples for 'offline' initialization types will increase RAM overhead of applying NNCF to the model.
Depending on whether the quantizer is configured to be per-tensor or per-channel, the statistics will be collected either on the basis of the set of the entire tensor values, or these will be collected and applied separately for each channel value subset.
Minimum quantizer range initialized using minima of per-channel minima of the tensor to be quantized, maximum quantizer range initialized using maxima of per-channel maxima of the tensor to be quantized. Offline.
Specific value:"mixed_min_max"
Minimum quantizer range initialized using the global minimum of values in the tensor to be quantized, maximum quantizer range initialized using the global maximum of the same values. Online.
Specific value:"min_max"
Minimum quantizer range initialized using averages (across every single initialization sample) of the minima of values in the tensor to be quantized, maximum quantizer range initialized using averages of the maxima, respectively. Offline.
Specific value:"mean_min_max"
Quantizer minimum and maximum ranges set to be equal to ±3 median absolute deviations from the median of the observed values in the tensor to be quantized. Offline.
Specific value:"threesigma"
Quantizer minimum and maximum ranges set to be equal to specified percentiles of the observed values (across the entire initialization sample set) in the tensor to be quantized. Offline.
Specific value:"percentile"
Quantizer minimum and maximum ranges set to be equal to averaged (across every single initialization sample) specified percentiles of the observed values in the tensor to be quantized. Offline.
Specific value:"mean_percentile"
Type-specific parameters of the initializer.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input minimum.
For 'percentile' and 'mean_percentile' types - specify the percentile of input value histograms to be set as the initial value for the quantizer input maximum.
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No Additional Items"{re}conv.*"
[
"LeNet/relu_0",
"LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No Additional Items[
"UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
The target group of quantizers for which the specified type of range initialization will be applied. If unspecified, then the range initialization of the given type will be applied to all quantizers.
[Deprecated] Determines how the additional quantization operations should be exported into the ONNX format. Set this to true to export to ONNX-standard QuantizeLinear-DequantizeLinear node pairs (8-bit quantization only), or to false to export to OpenVINO-supported FakeQuantize ONNX nodes (all quantization settings supported).
This option controls whether to apply the overflow issue fix for the appropriate NNCF config. If set to disable, the fix will not be applied. If set to enable or first_layer_only, and an appropriate target_device is chosen, the fix will be applied to all layers or to the first convolutional layer, respectively.
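For instance, a config targeting OpenVINO-style export with the overflow fix limited to the first convolutional layer might contain the following; export_to_onnx_standard_ops and overflow_fix are the usual key names for these two properties, stated here as assumptions:
{
    "export_to_onnx_standard_ops": false,
    "overflow_fix": "first_layer_only"
}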
Configures the staged quantization compression scheduler for the quantization algorithm. The quantizers will not be applied until a given epoch count is reached.
No Additional PropertiesGradients will be accumulated for this number of batches before doing a 'backward' call. Increasing this may improve training quality, since binarized networks exhibit noisy gradients and their training requires larger batch sizes than could be accommodated by GPUs.
A zero-based index of the epoch, upon reaching which the activations will start to be quantized.
Epoch index upon which the weights will start to be quantized.
Epoch index upon which the learning rate will start to be dropped. If unspecified, learning rate will not be dropped.
Duration, in epochs, of the learning rate dropping process.
Epoch to disable weight decay in the optimizer. If unspecified, weight decay will not be disabled.
Initial value of learning rate.
Initial value of weight decay.
A list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No Additional Items"{re}conv.*"
[
"LeNet/relu_0",
"LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No Additional Items[
"UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
PyTorch only - Used to increase/decrease gradients for compression algorithms' parameters. The gradients will be multiplied by the specified value. If unspecified, the gradients will not be adjusted.
Defines the training strategy for tuning the supernet. By default, progressive shrinking.
Defines the order of adding a new elasticity dimension from stage to stage
No Additional Items[
"width",
"depth",
"kernel"
]
This initializer is applied by default to utilize batch norm statistics adaptation to the current compression scenario. See documentation for more details.
No Additional PropertiesNumber of samples from the training dataset to use for model inference during the BatchNorm statistics adaptation procedure for the compressed model. The actual number of samples will be the closest multiple of the batch size. Set this to 0 to disable BN adaptation.
List of parameters per each supernet training stage
No Additional ItemsDefines a supernet training stage: how many epochs it takes, which elasticities with which settings are enabled, and whether certain operations should happen at the beginning of the stage
No Additional PropertiesElasticity dimensions that are enabled for subnet sampling; the rest of the elastic dimensions are disabled
No Additional ItemsDuration of the training stage in epochs
Restricts the maximum number of blocks in each independent group that can be skipped. For example, ResNet-50 has four independent groups; each group consists of a specific number of Bottleneck layers [3, 4, 6, 3] that can potentially be skipped. If the depth indicator equals 1, only the last Bottleneck can be skipped in each group; if it equals 2, the last two, and so on. This allows implementing the progressive shrinking logic from the Once-for-All paper. The default value is 1.
Restricts the maximum number of width values in each elastic layer. For example, some conv2d with elastic width can vary its number of output channels over the following list: [8, 16, 32]. If the width indicator equals 1, it can only activate the maximum number of channels - 32. If it equals 2, then either of the last two values can be selected - 16 or 32.
If True, triggers reorganization of weights so that filters are sorted by importance (e.g. by L2 norm) at the beginning of the stage
If True, triggers batchnorm adaptation at the beginning of the stage
Initial learning rate for a stage. If specified in the stage descriptor, it will trigger a reset of the learning rate at the beginning of the stage.
Number of epochs to compute the adjustment of the learning rate.
Number of iterations to activate the random subnet. Default value is 1.
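A sketch of a progressive-shrinking stage list built from these fields follows; the field names (train_dims, epochs, depth_indicator, width_indicator, reorg_weights, bn_adapt) are assumptions patterned on the descriptions above and should be verified against the schema:
[
    { "train_dims": ["kernel"], "epochs": 10 },
    { "train_dims": ["kernel", "depth"], "epochs": 10, "depth_indicator": 1 },
    { "train_dims": ["kernel", "depth", "width"], "epochs": 15, "depth_indicator": 2, "width_indicator": 2, "reorg_weights": true, "bn_adapt": true }
]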
List of building blocks to be skipped. Each block is defined by the names of its start and end nodes. The end node is skipped, while the start node is executed; the start node produces a tensor that bypasses the skipped nodes and is passed to the node right after the end node.
No Additional Items[
[
"start_op_1",
"end_op_1"
],
[
"start_op_2",
"end_op_2"
]
]
Defines the minimal number of operations in a skipped block. This option is available for the auto mode only. The default value is 5
Defines the maximal number of operations in a block. This option is available for the auto mode only. The default value is 50
If True, the automatic block search will not place operations that are fused at inference time into different blocks for skipping. True by default
Minimal number of output channels that can be activated for each layer with elastic width. The default value is 32.
Restricts the total number of different elastic width values for each layer. The default value of -1 means that there are no restrictions.
Defines a step size for the generation of the elastic width search space - the list of all possible width values for each layer. The generation starts from the number of output channels in the original model and stops when it reaches either the min_width value or a number of generated width values equal to max_num_widths
Defines the elastic width search space via a list of multipliers. All possible width values are obtained by multiplying the original width value with the values in the given list.
No Additional ItemsThe type of filter importance metric. Can be one of L1, L2, geometric_median, or external. L2 by default.
Path to the custom external weight importance (a PyTorch tensor) per node that needs weight reordering. Valid only when filter_importance is external. The file should be loadable via the torch interface torch.load() and represent a dictionary that maps an NNCF node name to an importance tensor with the same shape as the weights in the node's module. For example, if node Model/NNCFLinear[fc1]/linear_0 has a 3x1 linear module with weights [0.2, 0.3, 0.9], then the entry {'Model/NNCFLinear[fc1]/linear_0': tensor([0.4, 0.01, 0.2])} in the dictionary represents the corresponding weight importance.
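An illustrative elastic width section using these options might read as follows; min_width, max_num_widths and filter_importance are named in the descriptions above, while width_step is an assumed spelling of the step-size key (the multiplier list is an alternative way to define the same search space):
{
    "min_width": 32,
    "max_num_widths": 3,
    "width_step": 32,
    "filter_importance": "L2"
}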
Restricts the total number of different elastic kernel values for each layer. The default value of -1 means that there are no restrictions.
Defines the available elasticity dimension for sampling subnets. By default, all elastic dimensions are available - [width, depth, kernel]
No Additional ItemsA list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No Additional ItemsA list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No Additional ItemsDefines a global learning rate scheduler. If these parameters are not set, a stage learning rate scheduler will be used.
Defines the number of samples used for each training epoch.
Defines the search algorithm. The default algorithm is NSGA-II.
This initializer is applied by default to utilize batch norm statistics adaptation to the current compression scenario. See documentation for more details.
No Additional PropertiesNumber of samples from the training dataset to use for model inference during the BatchNorm statistics adaptation procedure for the compressed model. The actual number of samples will be the closest multiple of the batch size. Set this to 0 to disable BN adaptation.
Defines the number of evaluations that will be used by the search algorithm.
Number of constraints in search problem.
Defines the population size when using an evolutionary search algorithm.
Crossover probability used by a genetic algorithm.
Crossover eta.
Mutation eta for genetic algorithm.
Mutation probability for genetic algorithm.
Defines the absolute difference in accuracy that is tolerated when looking for a subnetwork.
Defines the reference accuracy from the pre-trained model used to generate the super-network.
Information to indicate the preferred parts of the Pareto front
No Additional ItemsEpsilon distance of surviving solutions for RNSGA-II.
Weights used by RNSGA-II.
Find extreme points and use them as aspiration points.
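Putting these search options together, a hypothetical search section might look like the sketch below; algorithm, num_evals, population, ref_acc and acc_delta are assumed key names patterned on the descriptions above:
{
    "algorithm": "NSGA2",
    "num_evals": 1000,
    "population": 40,
    "ref_acc": 93.65,
    "acc_delta": 1.0
}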
This algorithm is only useful in combination with other compression algorithms and improves the end accuracy result of the corresponding algorithm by calculating knowledge distillation loss between the compressed model currently in training and its original, uncompressed counterpart. See KnowledgeDistillation.md and the rest of this schema for more details and parameters.
Same definition as compression_oneOf_i0_oneOf_i4This algorithm is only useful in combination with other compression algorithms and improves the end accuracy result of the corresponding algorithm by calculating knowledge distillation loss between the compressed model currently in training and its original, uncompressed counterpart. See KnowledgeDistillation.md and the rest of this schema for more details and parameters.
Same definition as compression_oneOf_i0_oneOf_i4"movement_sparsity"
Index of the starting epoch (inclusive) for the warmup stage.
Index of the end epoch (exclusive) for the warmup stage.
The regularization factor on weight importance scores. With a larger positive value, more model weights will be regarded as less important and thus be sparsified.
Whether to do structured mask resolution after the warmup stage. Currently only supports structured masking on multi-head self-attention blocks and feed-forward networks.
The power value of polynomial decay for threshold and regularization factor updates during the warmup stage.
The initial value of the importance threshold during the warmup stage. If not specified, this will be automatically decided during training so that the model has about 0.1% linear layer sparsity on the involved layers at the beginning of the warmup stage.
The final value of the importance threshold during the warmup stage.
Number of training steps in one epoch, used for proper threshold and regularization factor updates. Optional if warmup_start_epoch >= 1, since the step count can be determined during the first epoch; otherwise users have to specify it.
Describes how each supported layer will be sparsified.
No Additional ItemsDefines in which mode a supported layer will be sparsified.
The block shape for weights to sparsify. Required when mode="block".
The dimension for weights to sparsify. Required when mode="per_dim".
Model control flow graph node scopes to be considered in this mode.
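A movement sparsity section assembled from the parameters above might look like this sketch; warmup_start_epoch and warmup_end_epoch follow the descriptions above, while key names such as sparse_structure_by_scopes, sparse_factors and axis are assumptions to verify against the schema:
{
    "algorithm": "movement_sparsity",
    "params": {
        "warmup_start_epoch": 1,
        "warmup_end_epoch": 4,
        "importance_regularization_factor": 0.01,
        "enable_structured_masking": true
    },
    "sparse_structure_by_scopes": [
        { "mode": "block", "sparse_factors": [32, 32], "target_scopes": "{re}.*attention.*" },
        { "mode": "per_dim", "axis": 0, "target_scopes": "{re}.*intermediate.*" }
    ]
}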
No Additional ItemsA list of model control flow graph node scopes to be ignored for this operation - functions as a 'denylist'. Optional.
No Additional Items"{re}conv.*"
[
"LeNet/relu_0",
"LeNet/relu_1"
]
A list of model control flow graph node scopes to be considered for this operation - functions as an 'allowlist'. Optional.
No Additional Items[
"UNet/ModuleList[down_path]/UNetConvBlock[1]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[2]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[3]/Sequential[block]/Conv2d[0]",
"UNet/ModuleList[down_path]/UNetConvBlock[4]/Sequential[block]/Conv2d[0]"
]
"UNet/ModuleList\\[up_path\\].*"
If set to True, then a RuntimeError will be raised if the names of the ignored/target scopes do not match the names of the scopes in the model graph.
PyTorch only - Used to increase/decrease gradients for compression algorithms' parameters. The gradients will be multiplied by the specified value. If unspecified, the gradients will not be adjusted.
Applies quantization on top of the input model, simulating future low-precision execution specifics, and selects the quantization layout and parameters to strive for the best possible quantized model accuracy and performance.
See Quantization.md and the rest of this schema for more details and parameters.
Applies filter pruning during training of the model to effectively remove entire sub-dimensions of tensors in the original model from computation and therefore increase performance.
See Pruning.md and the rest of this schema for more details and parameters.
Applies sparsity on top of the current model. Each weight tensor value will be either kept as-is, or set to 0 based on its magnitude. For large sparsity levels, this will improve performance on hardware that can profit from it. See Sparsity.md and the rest of this schema for more details and parameters.
Same definition as compression_oneOf_i0_oneOf_i2Applies sparsity on top of the current model. Each weight tensor value will be either kept as-is, or set to 0 based on its importance as determined by the regularization-based sparsity algorithm. For large sparsity levels, this will improve performance on hardware that can profit from it. See Sparsity.md and the rest of this schema for more details and parameters.
Same definition as compression_oneOf_i0_oneOf_i3This algorithm is only useful in combination with other compression algorithms and improves the end accuracy result of the corresponding algorithm by calculating knowledge distillation loss between the compressed model currently in training and its original, uncompressed counterpart. See KnowledgeDistillation.md and the rest of this schema for more details and parameters.
Same definition as compression_oneOf_i0_oneOf_i4This algorithm takes no additional parameters and is used when you want to load a checkpoint trained with another sparsity algorithm and do other compression without changing the sparsity mask.
Same definition as compression_oneOf_i0_oneOf_i5Binarization is a particular case of the more general quantization algorithm.
See Binarization.md and the rest of this schema for more details and parameters.
Applies quantization on top of the input model, simulating future low-precision execution specifics, and selects the quantization layout and parameters to strive for the best possible quantized model accuracy and performance.
See Quantization.md and the rest of this schema for more details and parameters.
Options for the execution of the NNCF-powered 'Accuracy Aware' training pipeline. The 'mode' property determines the mode of the accuracy-aware training execution and further available parameters.
Early exit mode schema. See EarlyExitTraining.md for more general info on this mode.
No Additional Properties"early_exit"
Maximum allowed accuracy degradation of the model, in percent, relative to the original model accuracy.
Maximum allowed accuracy degradation of the model, in units of the absolute metric of the original model.
The maximal total fine-tuning epoch count. If the accuracy criteria are not met during fine-tuning, the most accurate model will be returned.
Adaptive compression level training mode schema. See AdaptiveCompressionLevelTraining.md for more general info on this mode.
No Additional Properties"adaptive_compression_level"
Maximum allowed accuracy degradation of the model, in percent, relative to the original model accuracy.
Maximum allowed accuracy degradation of the model, in units of the absolute metric of the original model.
Number of epochs to fine-tune during the initial training phase of the adaptive compression training loop.
Initial value for the compression rate increase/decrease training phase of the compression training loop.
Factor used to reduce the compression rate change step in the adaptive compression training loop.
Factor used to reduce the learning rate after the compression rate step is reduced.
The minimal compression rate change step value after which the training loop is terminated.
The number of epochs to fine-tune the model for a given compression rate after the initial training phase of the training loop.
The maximal total fine-tuning epoch count. If the epoch counter reaches this number, the fine-tuning process will stop and the model with the largest compression rate will be returned.
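For example, a minimal early-exit accuracy-aware section could look like the following sketch; maximal_relative_accuracy_degradation and maximal_total_epochs are assumed spellings of the properties described above:
{
    "accuracy_aware_training": {
        "mode": "early_exit",
        "params": {
            "maximal_relative_accuracy_degradation": 1.0,
            "maximal_total_epochs": 100
        }
    }
}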
PyTorch only - Used to increase/decrease gradients for compression algorithms' parameters. The gradients will be multiplied by the specified value. If unspecified, the gradients will not be adjusted.
[Deprecated] Whether to enable strict input tensor shape matching when building the internal graph representation of the model. Set this to false if your model inputs have any variable dimension other than the 0-th (batch) dimension, or if any non-batch dimension of the intermediate tensors in your model execution flow depends on the input dimension, otherwise the compression will most likely fail.
Log directory for NNCF-specific logging outputs.