Scientific computing with confidence using typed dimensions

97 pointsposted 2 months ago
by g0xA52A2A

47 Comments

cosmic_quanta

2 months ago

Author here, thanks for posting!

On the Haskell Discourse [0], someone posted about another Haskell library, units [1], which allows to define unit systems NOT limited to the 7 physical dimensions. For example, adding a base dimension for currency would be nice to model electricity prices in currency per energy.

[0]: https://discourse.haskell.org/t/blog-post-scientific-computi...

[1]: https://hackage.haskell.org/package/units

readthenotes1

2 months ago

Dimension analysis is a super power in applied math. I was first taught it in chemistry, but was able to use it to avoid tripping myself up on just about every real world math problem I did...

(And I too changed an implementation to add units of measure to a system. It took a few days, and caused some grief with downstream system developers. But we also avoided the Mars Climate Orbiter experience in production )

jonjojojon

2 months ago

The modeling of unit systems with types is fraught with all sorts of inconsistencies and design tradeoffs. There are quantities that have the same dimensions, but shouldn't be compared or allow arithmetic between them. Many issues with Temperature scales and how to interpret differences in these spaces.

I think a pretty well thought out approach to this is the mp-units library in c++ that uses the ISQ(International System of Quantities) [1].

There are some great blog posts on their site about modeling units systems and dimensions. Fingers crossed it may even be added to the standard library in C++29.

[1]: https://mpusz.github.io/mp-units/latest/

adrian_b

2 months ago

Many quantities that in the current official version of SI appear to have the same dimensions, in fact do not have the same dimensions.

The illusion of having the same dimensions has been caused by a vote of the General Conference on Weights and Measures, which has been exactly as stupid as deciding by vote that 2 + 2 = 5. It is really a shame that something like this has happened.

According to that vote, some quantities have been declared as adimensional, even if arbitrary units of measurements must be chosen for them, exactly like for any other physical quantity and their units enter in the definitions of many other units, so changing any such choice of a unit changes the values of many other SI units, which implies that any version of SI that does not explicitly include the fundamental units chosen for "adimensional" quantities is incomplete.

The system of units of measurement required for the physical quantities includes fundamental units a.k.a. base units, which are chosen arbitrarily, without any constraints, and also derived units, whose values are determined in such a way as to eliminate all constants from the formulae that relate the physical quantity for which the derived unit is determined to other physical quantities for which units have already been determined.

The fundamental units are of 3 kinds, which I shall call arithmetic units, geometric units and dynamic units.

The arithmetic units are for discrete quantities, i.e. quantities that are expressed in integer numbers, so they have natural units.

There are a huge number of such fundamental units for quantities that can be obtained by counting, e.g. the number of atoms of a certain kind, the number of ions of certain kind, the number of molecules of a certain kind, the number of objects of a certain kind, the number of apples, the number of oranges, the number of humans and so on.

While "1" is the natural unit for all such quantities, in physics and chemistry there are many such quantities for which the conventional SI unit is not "1", but the number of Avogadro, a.k.a. one mole of substance.

Besides numbers of various entities, if we neglect nuclear/"elementary" particle physics, there is one more discrete physical quantity, which is the electric charge a.k.a. the quantity of electricity. The natural unit for the electric charge is the elementary electric charge and the conventional SI unit, the coulomb, is defined as an exact multiple of the elementary electric charge.

Besides the units for discrete quantities, there are 6 fundamental units for continuous quantities, 3 geometric units & 3 dynamic units. The 3 geometric units are the units of logarithmic ratio, plane angle and solid angle (which are respectively associated with 1-dimensional, 2-dimensional and 3-dimensional spaces).

The 3 fundamental geometric units are chosen by a mathematical convention, but they are exactly as arbitrary as the fundamental dynamic units. This is proven by the fact that in practice many different choices are used for these 3 fundamental units. Like length can be measured in meters or in feet, logarithmic ratios can be measured in nepers or in decibels, plane angles can be measured in degrees or in right angles or in cycles or in radians, while solid angles can be measured in orthogonal trihedral angles, in whole spheres or in steradians. Changing the choice for any of these fundamental units requires value conversions for all physical quantities that have units derived from them, in the same way like when replacing Imperial units with metric units.

There are many physical quantities in SI that appear to have the same dimensional formulae, but this happens only because the units of logarithmic ratio or plane angle or solid angle are omitted from the formulae. This omission is especially frequent for the unit of plane angle, which enters in a much greater number of dimensional formulae than it is obvious. For instance, the magnetic flux also includes in its dimensional formula the unit of plane angle, though you will almost never see this taken into account.

The SI includes a greater number of conventional fundamental dynamic units than necessary, for historical reasons, which results in a greater number of "universal constants" than for an optimal system of units, which requires only 3 fundamental dynamic units.

While at the end of the 18th century, when the metric system has been designed, it was convenient to choose as fundamental dynamic units the units of length, time and mass, for which distinct physical devices were chosen to materialize the units, the current SI system is equivalent with choosing a single physical device to materialize all 3 fundamental dynamic units, which are the units of length, time and electric voltage (yes, now the unit of mass is derived from the unit of electric voltage, not vice-versa, despite confusing definitions that can be encountered in many texts).

The single physical device that can materialize all 3 fundamental dynamic units is an atomic clock. The electromagnetic wave corresponding to its output signal provides the unit of length in its wavelength, the unit of time in its wave period, while the Josephson voltage corresponding to its frequency provides the unit of electric voltage. The conventional units of SI are obtained from the 3 units provided by an atomic clock by multiplication with conventional exact constants, in order to ensure continuity with the older units of measurement.

A few of the so-called base units of SI are no longer true base units, but they are derived from the true base units by inserting exact conventional constants in the formulae relating physical quantities, e.g. the units of temperature and luminous intensity.

abdullahkhalids

2 months ago

Thanks for the explanation. When I taught physics, I was always annoyed by not being able to teach this correctly. Right from PHY101: Mechanics, as soon as you get to rotational motion, you run into these problems because of the units of plane angle issue. Common introductory texts get the units for rotational quantities wrong.

Do you have a reference for any undergrad text that does this correctly?

feoren

2 months ago

One thing I would stress is that there is not one "true" system of units that is correct in some absolute sense. Unit systems are to arithmetic as type systems are to code. It's OK to have different systems of units that are good at preventing different kinds of error. This means that when talking about

That being said, you can absolutely criticize any given unit system, just as you can criticize any given type system. And some of your criticisms* of SI are indeed valid. (*Out of those that I even understand at all.) It just means that when we compare unit systems, we should be focusing on factors like: what kinds of errors do they catch/prevent? How understandable are they? How reliable? You do that in some places, such as your excellent point about torque vs. energy. But it's less clear why we should be splitting out the 3 kinds of units or only ever specifying number of apples as integers (I can certainly have half an apple) -- how does this actually help us? I would also argue that if you're trying to say something like "1 Apple + 1 Orange != 2", then I'd argue for the same reason that "1 Apple + 1 Apple != 2", because different apples are different sizes, shapes, value, etc. Really I'm just not clear on almost any of your points about "discrete units".

And for the geometric units, why stop at 3? We obviously could have 4-dimensional angles as well (possibly not physically useful, but maybe for spacetime calculations ... and why should our system of units only apply to our particular universe?). You could have n^2 and n^3 units similar to logarithmic units (e.g. 1 squart = 1 foot, 2 squart = 4 feet, 3 squart = 9 feet). I'm not claiming any of those are useful, but I am claiming it shows that your examples are not "fundamental".

> The electromagnetic wave corresponding to its output signal provides the unit of length in its wavelength, the unit of time in its wave period.

But aren't the wavelength and period of an electromagnetic wave linearly correlated by the speed of light? Doesn't this make your 3 fundamental dynamic units only 2?

olddustytrail

2 months ago

And?

adrian_b

2 months ago

My point was that what the previous poster had mentioned "There are quantities that have the same dimensions, but shouldn't be compared or allow arithmetic between them" is caused by using incorrect dimensional formulae.

When you use correct dimensional formulae, most of these cases disappear, e.g. torque and energy do not have the same dimensions (the cases that remain are for quantities of the same kind that measure different things, like the average and the peak value of some quantity, for which some arithmetic operations that combine them may be meaningless). The problem is that there are plenty of textbooks with incomplete formulae, so one may need to analyze them, instead of having faith that they are correct.

Moreover, for the discrete quantities obtained by counting one must define different types depending on what is counted, i.e. one must not use the same type for counting horses and for counting boxes (or more realistically for the kind of quantities that can appear in complicated expressions, a molar quantity of some chemical substance must not have the same type as a molar quantity of a different chemical substance, or, when simulating semiconductors, the concentrations of electrons, holes, donors and acceptors must have different types).

fph

2 months ago

ziotom78

2 months ago

I am using Unitful.jl for some complex electromagnetics code I am developing. Although it makes the definition of types and constructors more complicated, it has helped me find several subtle bugs in the equations. Forcing the type system to keep track of measurement units has pros and cons, but so far, the advantages are more significant than the cons.

aithrowawaycomm

2 months ago

I have yet to see a language which does units better than F#: https://learn.microsoft.com/en-us/dotnet/fsharp/language-ref... It is one of the main reasons I use F# over other functional languages.

Doing it within an existing type system is more trouble than it’s worth: the algebra of units is simple, but it doesn’t apply to other types without stretching the compiler. It is far easier to have a distinct subsystem with measure types specifically identified.

semi-extrinsic

2 months ago

I will say that the OpenFOAM C++ library supports dimensions in a more user friendly way than this, with support for both vectors/matrices and for checking units when you take partial derivatives of your quantities.

There is even native support for reading dimensioned values from input files, so your user can specify "speed 12.7 [ft/s]" irrespective of what units you use internally in the code processing that input file. It just gets converted (or throws an error).

See e.g. this, from 4.2.6 onwards: https://doc.cfd.direct/openfoam/user-guide-v12/basic-file-fo...

cosmic_quanta

2 months ago

Interesting, I am intrigued by the support for partial derivatives with typed dimensions. I'll have to check out OpenFOAM

jcgrillo

2 months ago

I misread this title (the word "confidence" threw me) and was initially very confused when it turned out to be about dimensional analysis instead of uncertainty.

But why not both? A number system with dimensions and which automatically propagates measurement uncertainty through calculations (a la Taylor's train wreck book) doesn't seem totally infeasible?

I would particularly like this for expressing cad models as code instead of brep.

cosmic_quanta

2 months ago

There are Haskell packages for uncertainty (e.g. [0], based on automatic differentiation [1]). However, these packages don't support typed dimensions, because multiplication/division becomes more complex.

It is my goal to provide quantities with uncertainty AND typed dimensions one day.

[0]: https://hackage.haskell.org/package/uncertain

[1]: https://hackage.haskell.org/package/ad

jcgrillo

2 months ago

Very cool. Thanks for the links, I'm not very familiar with the Haskell ecosystem, I mostly work in Rust and Go these days. I'll definitely check them out.

ssivark

2 months ago

This is the kind of thing that would be pretty straight-forward in Julia, I imagine. Independent libraries for uncertainties and units could easily be remixed together if the unitful quantity accepts any number type, and the uncertainties just define a new number type. Multiple dispatch should generate consistent downstream behavior "for free".

ziotom78

2 months ago

Indeed:

    julia> using Unitful: m, s

    julia> using Measurements

    julia> Δx = (3.0 ± 0.1)m
    3.0 ± 0.1 m

    julia> t = (1.0 ± 0.2)s
    1.0 ± 0.2 s

    julia> v = Δx / t
    3.0 ± 0.61 m s^-1

zzbn00

2 months ago

There is a good dimensional analysis package `ezunits` in Maxima. Very useful for working through equations and then sometimes generating the c/fortran code directly from Maxima is fine.

simiones

2 months ago

Would this work for something like linear algebra? Could it support multiplying two 3x3 matrices where each cell can have a different dimension, and only work if they are compatible?

antononcube

2 months ago

Yes you can do that (easily) with Wolfram Language (aka Mathematica.)

Here is an example:

    mat1 = Table[
       RandomChoice[{
           Quantity[RandomReal[], "Meters"],
           Quantity[RandomReal[], "Seconds"],
           Quantity[RandomReal[], "Meters/Seconds"],
           Quantity[RandomReal[], "Meters/Seconds^2"]
         }] , 3, 3];
    mat1 // MatrixForm
    
    mat1 . Transpose[mat1]
See the corresponding screenshot: https://imgur.com/aP9Ugk2

librasteve

2 months ago

I whipped up this in raku:

  use Physics::Measure :ALL;

  sub infix:<·>(@x1, @x2) {
    die "Incompatible dimensions."
            unless @x1 == @x2[0] && @x1[0] == @x2;

    [for ^@x1 -> $m {
        [for ^@x1 -> $n {
            [+] @x1[$m;*] >>*<< @x2[*;$n]
        }]
    }]
  }

  my @m = [[1m,2m],[3m,4m]]; 

  say @m · [Z] @m;     #[[5m^2 11m^2] [11m^2 25m^2]]
Since Physics::Measure is strong on illegal combinations and since there are not many realistic random combinations of Units (s^2 anyone) I have not gone random for my example.

boscillator

2 months ago

This is something I always feel is missing from unit libraries.

cosmic_quanta

2 months ago

That's a great question.

Haskell has arrays and matrices of homogeneous types, so it wouldn't work by default.

If you needed this kind of functionality, you would have to create your own Matrix type which would look very similar to a 9-tuple, and then define matrix multiplication. It would then be possible to encode the dimensional constraints in the type signature, but it's already quite involved for 2x2 matrices:

    data Mat2 a b c d 
        = MkMatrix a b c d
    
    matmul :: ???? -- good luck writing this type signature
    matmul (MkMatrix a11 a12 a21 a22) (MkMatrix b11 b12 b21 b22)
        = MkMatrix ((a11 * b11) + (a12 * b21))
                   ((a11 * b12) + (a12 * b22))
                   ((a21 * b11) + (a22 * b21))
                   ((a21 * b12) + (a22 * b22))
I can't imagine for 3x3.

simiones

2 months ago

Thanks for the sample! I was asking because it seems to me this is always an interesting example of a very common and very well defined formal operation where nevertheless type systems are extremely cumbersome to use, but I'm always curious if there is some solution that I'm missing.

I wonder if there is any plausible approach that would work closer to how units are treated in physics calculations - not as types, but as special numerical values, more or less (values that you can multiply/divide by any other unit, but can only add/subtract with themselves).

cosmic_quanta

2 months ago

> ... not as types, but as special numerical values, more or less (values that you can multiply/divide by any other unit, but can only add/subtract with themselves).

What's so cool about `dimensional` is that the types work the way you describe!

Normally, in Haskell, the addition and multiplication functions are typed like so:

    (+) :: Double -> Double -> Double
    (*) :: Double -> Double -> Double
With `dimensional`, addition looks like:

    (+) :: Quantity d Double -> Quantity d Double -> Quantity d Double
which means you can't add quantities with incompatible dimensions, so far so good. But multiplication looks like:

    (*) :: Quantity d1 Double -> Quantity d2 Double -> Quantity (d1 * d2) Double
That is, the dimensions of the result of multiplication (d1 * d2) follow the physics meaning of multiplication. What can look confusing is that (d1 * d2) is a type-level calculation which is run at compile-time.

This means that `dimensional` isn't only about 'statically checking' dimensions and units, but also inferring them.

You can see this interactively. What is the type of 1 meter divided by 1 second:

    ghci> :t (1 *~ meter) / (1 *~ second)
    Quantity (Dim Pos1 Zero Neg1 Zero Zero Zero Zero) Double
where (Dim Pos1 Zero Neg1 Zero Zero Zero Zero) is a synonym for the dimension of velocity, which is not checked -- it is inferred at compile time.

simiones

2 months ago

Nice, I didn't realize this! Thank you for going into this level of detail.

ttoinou

2 months ago

   Two quantities with dimensions 
D1 and D2 can only be added or subtracted if (and only if) D1≡D2. For example, it does not make sense to add a length to a mass.

That’s not true though. Some units cannot be added together, for example all intensive units like speed, density, temperature etc.

cosmic_quanta

2 months ago

That's an excellent point, thanks! I will edit the post

chris_va

2 months ago

I am partial to astropy, eg

from astropy import units as u

density = np.linspace(0, 1, 10)*u.g/u.cm**3

mass = (density*1*u.m**3).to(u.kg)

librasteve

2 months ago

This is a great article and I like that you are highlighting the benefits of Dimensions as Types.

While Raku is less popular than Python, it does have deep roots in Haskell and strong types (the first Raku / Perl6 parser - PUGS - was written in Haskell and all the early devs were encouraged to learn Haskell first).

Similar concepts are used in these Raku modules... which provide dimensional analysis and marry types to dimensions.

- https://raku.land/zef:librasteve/Physics::Measure

- https://raku.land/zef:librasteve/Physics::Unit

I had some fun with making this example of using these Raku modules with Jupyter https://rakujourney.wordpress.com/wp-content/uploads/2023/08...

[disclaimer: I am the author of these modules]

Raku is also good at Slangs (Sublanguages) and unicode, so these tools can be used in a very intuitive way:

  #!/usr/bin/env raku
  use Physics::Constants;
  use Physics::Measure :ALL;

  say ~ℏ;     #1.054571817e-34 J.s

  my \λ = 2.5nm;
  say "Wavelength of photon (λ) is " ~λ;

  my \ν = c / λ;
  say "Frequency of photon (ν) is " ~ν.in('petahertz');

  my \Ep = ℎ * ν;
  say "Energy of photon (Ep) is " ~Ep.in('attojoules');

  Wavelength of photon (λ) is 2.5nm
  Frequency of photon (ν) is 119.92PHz
  Energy of photon (Ep) is 79.46aJ

antononcube

2 months ago

Wolfram Language (aka Mathematica) is the best for doing scientific computing with physical unit dimensions.

See: https://reference.wolfram.com/language/guide/Units.html

> The units framework [of Wolfram Language] integrates seamlessly with visualization, numeric and algebraic computation functions. It also supports dimensional analysis, as well as purely symbolic operations on quantities.

hexane360

2 months ago

Note that Mathematica units impose a very large runtime penalty, making them unsuitable for a lot of applications

antononcube

2 months ago

The applications that are WL-units suitable are much more than the WL-units unsuitable ones you refer to.

It is true that I (and others) avoid using Wolfram Language (WL) units. Or try to get the computation expressions units-free as quickly as possible. But that is also a normal procedure when crafting computational workflows.

amelius

2 months ago

Can it also do reference frames?

Like if I have a cartesian coordinates in one reference frame, can I use them in a type-safe way? E.g. not add two vectors in two reference frames? Same for polar coordinates?

Etc.

cosmic_quanta

2 months ago

The `dimensional` library doesn't provide this, no. However, it's easy to 'tag' data in Haskell using phantom types. For example, for vectors in reference frames:

    newtype Framed f a = MkFramed (Vector a)
    
    add :: Framed f a -> Framed f a -> Framed f a
    add (MkFramed vec1) (MkFramed vec2) = ... add vec1 and vec2
    
    data ReferenceFrame1
    data ReferenceFrame2
    
    v1 :: Framed ReferenceFrame1 Double
    v1 = ...
    
    v2 :: Framed ReferenceFrame2 Double
    v2 = ...
    
    main :: IO ()
    main = print $ v1 `add` v2 -- will not type check
   
There might already be a library to do this

amelius

2 months ago

Ok, but I'd also want to convert between polar and cartesian coordinates.

toxik

2 months ago

I was so disappointed that this turned from Python to Haskell, I would just _love_ to have something like this in Python. Even just type annotations.

A similar thing I have been thinking about doing in robotics contexts is to annotate vectors and transformation matrices with their reference frames. So you can only matrix-vector multiply between compatible types: `object_in_world = camera_to_world @ object_in_camera`. It can be done somewhat in C++.

kardos

2 months ago

Could this be done with an IDE plugin (for Pycharm or whatnot)? It is tedious to go though code and carefully annotate each variable and run so many compile test, read error, revise, cycles. It should be possible on a per function basis to statically check how each variable relates to the inputs and outputs, and just annotate those. Then the plugin could solve the units for each item, and only ask for an annotation where needed to resolve an ambiguous case.

Hizonner

2 months ago

I used that package once, to write a program to do US taxes on Canadian mutual funds. Yes, it seemed necessary.