TOSA op with "round half to even" (lowering PyTorch's quantize_per_tensor)


I’m trying to implement a lowering of PyTorch’s quantize_per_tensor [0] to TOSA.
quantize_per_tensor is specified [1] using round [2], which is specified to “round half to even”.
I can’t find a TOSA operator with “round half to even” behavior (CAST seems to be “round toward zero”), so I cannot get an accurate lowering.
Any ideas?

Best wishes,

[0] Sorry, can only put two links as a new user
[1] Quantization — PyTorch master documentation
[2] torch.round — PyTorch 1.13 documentation

Hi Matthias,

Often we’re dealing with fully quantized networks and use the RESCALE operator, which does include rounding, to convert between the different sizes. So if you’re working towards a fully quantized network, you would only need to adjust your scale and shift values based on your PyTorch scale (the zero point doesn’t change).

If you’re trying to replicate the floating point functionality, I believe you would need to add the rounding as a combination of ADD and then CAST, using the CAST round towards zero.


Hi Eric,

thanks for your fast answer.
My goal is to lower a mixed network that has mostly quantized but also a few floating point operators.

I don’t quite understand how I would emulate “round half to even” via ADD and CAST.
I think it would be implemented roughly like

function ROUND_HALF_TO_EVEN(X, type)
  C = FLOOR(X)
  FRAC = X - C
  TIE = EQUAL(FRAC, 0.5)
  UP = GREATER(FRAC, 0.5) || (TIE && IS_ODD(C))
  return CAST(SELECT(UP, C + 1, C), type)

with some extra logic to implement IS_ODD. Is there an easier way?
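
For illustration, here is a minimal Python sketch of the floor-and-select idea above. It is not a TOSA lowering: the function names are made up for this example, and building the parity test from a floor of c / 2 is only one assumed way to express IS_ODD with ops that have obvious TOSA counterparts. Python's built-in round() also rounds half to even, so it is used as a cross-check.

```python
import math


def is_odd(c: int) -> bool:
    # c is integer-valued; it is odd when halving, flooring and
    # re-doubling does not give c back.
    return math.floor(c / 2.0) * 2.0 != c


def round_half_to_even(x: float) -> int:
    c = math.floor(x)      # FLOOR
    frac = x - c           # fractional part, in [0, 1)
    tie = frac == 0.5      # EQUAL
    # Round up past the halfway point, or on a tie when the floor is odd.
    up = frac > 0.5 or (tie and is_odd(c))
    return int(c + 1 if up else c)   # SELECT, then CAST


# Cross-check against Python's round(), which rounds ties to even.
for v in (0.5, 1.5, 2.5, 3.5, -0.5, -1.5, -2.5, 3.49, 3.51):
    assert round_half_to_even(v) == round(v)
```

Note that the select has to round up for any fractional part above 0.5 as well, not only on an odd tie, otherwise the function degenerates to FLOOR away from ties.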

Would it be an option to extend the TOSA spec to (a) have an attribute on CAST that specifies the rounding mode or (b) an extra operator with “round half to even” behavior?

Best wishes,

Hi Matthias,

Thanks for your posting. I’d like to follow up on a couple of points from this thread.

The TOSA specification defines CAST as rounding to the nearest integer but does not specify the behavior for ties, i.e. values exactly halfway between two integers. In the TOSA reference model, 2.5 rounds to 3 using std::round(), which I think is round to nearest with half away from zero. Just wanted to check where you saw the round-towards-zero behavior?
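
To make the three modes mentioned in this thread concrete, here is a small Python sketch that compares them on tie values. The helper names are hypothetical, and the half-away-from-zero helper only imitates the std::round() convention; it is not taken from the TOSA reference model.

```python
import math


def truncate(x: float) -> int:
    # Round toward zero: drop the fractional part.
    return math.trunc(x)


def round_half_away_from_zero(x: float) -> int:
    # Imitates the std::round() convention: nearest integer,
    # with ties moving away from zero.
    t = math.trunc(x)
    if abs(x - t) >= 0.5:
        t += 1 if x > 0 else -1
    return t


for v in (2.5, 3.5, -2.5):
    # Python's built-in round() rounds ties to the nearest even integer.
    print(v, truncate(v), round_half_away_from_zero(v), round(v))
# 2.5 2 3 2
# 3.5 3 4 4
# -2.5 -2 -3 -2
```

The modes only disagree with each other on some inputs, so a test that happens to use 3.5 cannot tell half away from zero and half to even apart, while 2.5 can.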

For networks consisting of all integer operations, TOSA can define the result bit-exactly, because every integer TOSA operation defines its result bit-exactly. TOSA does not define bit-exactness for networks containing floating-point operations: floating-point results can vary with operation order and rounding behavior (e.g. handling of ties), and these aspects are left open to allow a range of implementations. As you say, there is nothing in TOSA to specify a round-to-nearest-even mode. Even if CAST were to specify a particular rounding mode (such as round to nearest even), I think that would still not allow for exact comparison in general, due to, for example, the operation order of other floating-point operations before the CAST.

Best regards,

Hi Dominic,

you are right, I’m also seeing “round to nearest with half away from zero” in the TOSA lowering implementation in LLVM.

I also understand that I cannot expect bit-accurate floating point operations (in any setting really),
but, usually, one can still get close results in terms of relative accuracy.
The effect of the rounding mode is much bigger: whether 3.5 is rounded to 3 or to 4 changes the result by roughly 30%.

So, allowing the rounding mode to be specified in CAST wouldn’t make floating point bit accurate (it never will be),
but it would allow a reasonable relative accuracy to be obtained.

Best wishes,

Hi Matthias,

Thanks for confirming that the rounding you see with the CAST operator matches the reference.

In terms of relative error, I think a small relative error before a CAST can become a large relative error after it, even if the rounding of ties is specified. For example, 3.49 and 3.51 will round to 3 and 4 respectively, regardless of tie mode. However, as you say, the maximum relative error of a float-to-integer conversion in isolation can be large if tie rounding is not specified.
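
As a quick check of the 3.49 / 3.51 point, here is a sketch using Python's decimal module, where ROUND_HALF_UP means ties away from zero and ROUND_HALF_EVEN means ties to even. The to_int helper is made up for this example, and 2.5 is added as an assumed tie input to show where the two modes do differ.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def to_int(value: str, mode: str) -> int:
    # Parse from a string so the input is exactly the decimal value written,
    # then round to a whole number with the requested tie-breaking rule.
    return int(Decimal(value).quantize(Decimal("1"), rounding=mode))


for value in ("3.49", "3.51", "2.5"):
    print(value, to_int(value, ROUND_HALF_UP), to_int(value, ROUND_HALF_EVEN))
# 3.49 3 3
# 3.51 4 4
# 2.5 3 2
```

The first two rows show that inputs on either side of the halfway point land on different integers under both modes; only the exact tie depends on the mode.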

We’re just entering the holiday season with several people out, so this thread may go quiet for a bit until the new year.

Best regards,

Happy new year :)

I’m thinking about the following alternatives:

  1. Adding an optional attribute “roundingMode” to the CAST op, which can take values “unspecified, round half away from zero, round half to even” with “unspecified” being the default.

  2. Adding a round op with attribute “roundingMode”, which can take values “unspecified, round half away from zero, round half to even” with “unspecified” being the default. This could also subsume the floor and ceil ops by extending roundingMode.

Would you consider this for inclusion into the standard?

I checked TensorFlow, and it also seems to define its default rounding mode as “round half to even” [0],
so that proposal would also help there.

Thank you!

[0] tf.math.round | TensorFlow v2.11.0

ONNX also defines its round function to round “half to even”: round in onnx/ · onnx/onnx

Hi Matthias,

Happy new year.

We prefer to avoid optional attributes, since options add complexity for an implementation.

We agree round to nearest even is quite a commonly used rounding type. We are looking at whether it would be possible to define the rounding mode for the CAST operator to be round-to-nearest with tie to even (rather than the current round to nearest but tie unspecified). There are a few things we need to check to see if this would cause any issues with current usage.

Best regards,

Hi Dominic,

this sounds great! Let me know if I can help you in that process.