Scientific computing with confidence using typed dimensions

83 pointsposted a day ago
by g0xA52A2A

42 Comments

cosmic_quanta

a day ago

Author here, thanks for posting!

On the Haskell Discourse [0], someone posted about another Haskell library, units [1], which allows to define unit systems NOT limited to the 7 physical dimensions. For example, adding a base dimension for currency would be nice to model electricity prices in currency per energy.

[0]: https://discourse.haskell.org/t/blog-post-scientific-computi...

[1]: https://hackage.haskell.org/package/units

readthenotes1

18 hours ago

Dimension analysis is a super power in applied math. I was first taught it in chemistry, but was able to use it to avoid tripping myself up on just about every real world math problem I did...

(And I too changed an implementation to add units of measure to a system. It took a few days, and caused some grief with downstream system developers. But we also avoided the Mars Climate Orbiter experience in production )

jonjojojon

20 hours ago

The modeling of unit systems with types is fraught with all sorts of inconsistencies and design tradeoffs. There are quantities that have the same dimensions, but shouldn't be compared or allow arithmetic between them. Many issues with Temperature scales and how to interpret differences in these spaces.

I think a pretty well thought out approach to this is the mp-units library in c++ that uses the ISQ(International System of Quantities) [1].

There are some great blog posts on their site about modeling units systems and dimensions. Fingers crossed it may even be added to the standard library in C++29.

[1]: https://mpusz.github.io/mp-units/latest/

adrian_b

18 hours ago

Many quantities that in the current official version of SI appear to have the same dimensions, in fact do not have the same dimensions.

The illusion of having the same dimensions has been caused by a vote of the General Conference on Weights and Measures, which has been exactly as stupid as deciding by vote that 2 + 2 = 5. It is really a shame that something like this has happened.

According to that vote, some quantities have been declared as adimensional, even if arbitrary units of measurements must be chosen for them, exactly like for any other physical quantity and their units enter in the definitions of many other units, so changing any such choice of a unit changes the values of many other SI units, which implies that any version of SI that does not explicitly include the fundamental units chosen for "adimensional" quantities is incomplete.

The system of units of measurement required for the physical quantities includes fundamental units a.k.a. base units, which are chosen arbitrarily, without any constraints, and also derived units, whose values are determined in such a way as to eliminate all constants from the formulae that relate the physical quantity for which the derived unit is determined to other physical quantities for which units have already been determined.

The fundamental units are of 3 kinds, which I shall call arithmetic units, geometric units and dynamic units.

The arithmetic units are for discrete quantities, i.e. quantities that are expressed in integer numbers, so they have natural units.

There are a huge number of such fundamental units for quantities that can be obtained by counting, e.g. the number of atoms of a certain kind, the number of ions of certain kind, the number of molecules of a certain kind, the number of objects of a certain kind, the number of apples, the number of oranges, the number of humans and so on.

While "1" is the natural unit for all such quantities, in physics and chemistry there are many such quantities for which the conventional SI unit is not "1", but the number of Avogadro, a.k.a. one mole of substance.

Besides numbers of various entities, if we neglect nuclear/"elementary" particle physics, there is one more discrete physical quantity, which is the electric charge a.k.a. the quantity of electricity. The natural unit for the electric charge is the elementary electric charge and the conventional SI unit, the coulomb, is defined as an exact multiple of the elementary electric charge.

Besides the units for discrete quantities, there are 6 fundamental units for continuous quantities, 3 geometric units & 3 dynamic units. The 3 geometric units are the units of logarithmic ratio, plane angle and solid angle (which are respectively associated with 1-dimensional, 2-dimensional and 3-dimensional spaces).

The 3 fundamental geometric units are chosen by a mathematical convention, but they are exactly as arbitrary as the fundamental dynamic units. This is proven by the fact that in practice many different choices are used for these 3 fundamental units. Like length can be measured in meters or in feet, logarithmic ratios can be measured in nepers or in decibels, plane angles can be measured in degrees or in right angles or in cycles or in radians, while solid angles can be measured in orthogonal trihedral angles, in whole spheres or in steradians. Changing the choice for any of these fundamental units requires value conversions for all physical quantities that have units derived from them, in the same way like when replacing Imperial units with metric units.

There are many physical quantities in SI that appear to have the same dimensional formulae, but this happens only because the units of logarithmic ratio or plane angle or solid angle are omitted from the formulae. This omission is especially frequent for the unit of plane angle, which enters in a much greater number of dimensional formulae than it is obvious. For instance, the magnetic flux also includes in its dimensional formula the unit of plane angle, though you will almost never see this taken into account.

The SI includes a greater number of conventional fundamental dynamic units than necessary, for historical reasons, which results in a greater number of "universal constants" than for an optimal system of units, which requires only 3 fundamental dynamic units.

While at the end of the 18th century, when the metric system has been designed, it was convenient to choose as fundamental dynamic units the units of length, time and mass, for which distinct physical devices were chosen to materialize the units, the current SI system is equivalent with choosing a single physical device to materialize all 3 fundamental dynamic units, which are the units of length, time and electric voltage (yes, now the unit of mass is derived from the unit of electric voltage, not vice-versa, despite confusing definitions that can be encountered in many texts).

The single physical device that can materialize all 3 fundamental dynamic units is an atomic clock. The electromagnetic wave corresponding to its output signal provides the unit of length in its wavelength, the unit of time in its wave period, while the Josephson voltage corresponding to its frequency provides the unit of electric voltage. The conventional units of SI are obtained from the 3 units provided by an atomic clock by multiplication with conventional exact constants, in order to ensure continuity with the older units of measurement.

A few of the so-called base units of SI are no longer true base units, but they are derived from the true base units by inserting exact conventional constants in the formulae relating physical quantities, e.g. the units of temperature and luminous intensity.

abdullahkhalids

13 hours ago

Thanks for the explanation. When I taught physics, I was always annoyed by not being able to teach this correctly. Right from PHY101: Mechanics, as soon as you get to rotational motion, you run into these problems because of the units of plane angle issue. Common introductory texts get the units for rotational quantities wrong.

Do you have a reference for any undergrad text that does this correctly?

olddustytrail

14 hours ago

And?

adrian_b

14 hours ago

My point was that what the previous poster had mentioned "There are quantities that have the same dimensions, but shouldn't be compared or allow arithmetic between them" is caused by using incorrect dimensional formulae.

When you use correct dimensional formulae, most of these cases disappear, e.g. torque and energy do not have the same dimensions (the cases that remain are for quantities of the same kind that measure different things, like the average and the peak value of some quantity, for which some arithmetic operations that combine them may be meaningless). The problem is that there are plenty of textbooks with incomplete formulae, so one may need to analyze them, instead of having faith that they are correct.

Moreover, for the discrete quantities obtained by counting one must define different types depending on what is counted, i.e. one must not use the same type for counting horses and for counting boxes (or more realistically for the kind of quantities that can appear in complicated expressions, a molar quantity of some chemical substance must not have the same type as a molar quantity of a different chemical substance, or, when simulating semiconductors, the concentrations of electrons, holes, donors and acceptors must have different types).

fph

a day ago

ziotom78

an hour ago

I am using Unitful.jl for some complex electromagnetics code I am developing. Although it makes the definition of types and constructors more complicated, it has helped me find several subtle bugs in the equations. Forcing the type system to keep track of measurement units has pros and cons, but so far, the advantages are more significant than the cons.

aithrowawaycomm

15 hours ago

I have yet to see a language which does units better than F#: https://learn.microsoft.com/en-us/dotnet/fsharp/language-ref... It is one of the main reasons I use F# over other functional languages.

Doing it within an existing type system is more trouble than it’s worth: the algebra of units is simple, but it doesn’t apply to other types without stretching the compiler. It is far easier to have a distinct subsystem with measure types specifically identified.

semi-extrinsic

14 hours ago

I will say that the OpenFOAM C++ library supports dimensions in a more user friendly way than this, with support for both vectors/matrices and for checking units when you take partial derivatives of your quantities.

There is even native support for reading dimensioned values from input files, so your user can specify "speed 12.7 [ft/s]" irrespective of what units you use internally in the code processing that input file. It just gets converted (or throws an error).

See e.g. this, from 4.2.6 onwards: https://doc.cfd.direct/openfoam/user-guide-v12/basic-file-fo...

cosmic_quanta

13 hours ago

Interesting, I am intrigued by the support for partial derivatives with typed dimensions. I'll have to check out OpenFOAM

jcgrillo

19 hours ago

I misread this title (the word "confidence" threw me) and was initially very confused when it turned out to be about dimensional analysis instead of uncertainty.

But why not both? A number system with dimensions and which automatically propagates measurement uncertainty through calculations (a la Taylor's train wreck book) doesn't seem totally infeasible?

I would particularly like this for expressing cad models as code instead of brep.

cosmic_quanta

18 hours ago

There are Haskell packages for uncertainty (e.g. [0], based on automatic differentiation [1]). However, these packages don't support typed dimensions, because multiplication/division becomes more complex.

It is my goal to provide quantities with uncertainty AND typed dimensions one day.

[0]: https://hackage.haskell.org/package/uncertain

[1]: https://hackage.haskell.org/package/ad

ssivark

15 hours ago

This is the kind of thing that would be pretty straight-forward in Julia, I imagine. Independent libraries for uncertainties and units could easily be remixed together if the unitful quantity accepts any number type, and the uncertainties just define a new number type. Multiple dispatch should generate consistent downstream behavior "for free".

ziotom78

an hour ago

Indeed:

    julia> using Unitful: m, s

    julia> using Measurements

    julia> Δx = (3.0 ± 0.1)m
    3.0 ± 0.1 m

    julia> t = (1.0 ± 0.2)s
    1.0 ± 0.2 s

    julia> v = Δx / t
    3.0 ± 0.61 m s^-1

jcgrillo

18 hours ago

Very cool. Thanks for the links, I'm not very familiar with the Haskell ecosystem, I mostly work in Rust and Go these days. I'll definitely check them out.

zzbn00

a day ago

There is a good dimensional analysis package `ezunits` in Maxima. Very useful for working through equations and then sometimes generating the c/fortran code directly from Maxima is fine.

chris_va

14 hours ago

I am partial to astropy, eg

from astropy import units as u

density = np.linspace(0, 1, 10)*u.g/u.cm**3

mass = (density*1*u.m**3).to(u.kg)

simiones

a day ago

Would this work for something like linear algebra? Could it support multiplying two 3x3 matrices where each cell can have a different dimension, and only work if they are compatible?

antononcube

14 hours ago

Yes you can do that (easily) with Wolfram Language (aka Mathematica.)

Here is an example:

    mat1 = Table[
       RandomChoice[{
           Quantity[RandomReal[], "Meters"],
           Quantity[RandomReal[], "Seconds"],
           Quantity[RandomReal[], "Meters/Seconds"],
           Quantity[RandomReal[], "Meters/Seconds^2"]
         }] , 3, 3];
    mat1 // MatrixForm
    
    mat1 . Transpose[mat1]
See the corresponding screenshot: https://imgur.com/aP9Ugk2

boscillator

21 hours ago

This is something I always feel is missing from unit libraries.

cosmic_quanta

a day ago

That's a great question.

Haskell has arrays and matrices of homogeneous types, so it wouldn't work by default.

If you needed this kind of functionality, you would have to create your own Matrix type which would look very similar to a 9-tuple, and then define matrix multiplication. It would then be possible to encode the dimensional constraints in the type signature, but it's already quite involved for 2x2 matrices:

    data Mat2 a b c d 
        = MkMatrix a b c d
    
    matmul :: ???? -- good luck writing this type signature
    matmul (MkMatrix a11 a12 a21 a22) (MkMatrix b11 b12 b21 b22)
        = MkMatrix ((a11 * b11) + (a12 * b21))
                   ((a11 * b12) + (a12 * b22))
                   ((a21 * b11) + (a22 * b21))
                   ((a21 * b12) + (a22 * b22))
I can't imagine for 3x3.

simiones

20 hours ago

Thanks for the sample! I was asking because it seems to me this is always an interesting example of a very common and very well defined formal operation where nevertheless type systems are extremely cumbersome to use, but I'm always curious if there is some solution that I'm missing.

I wonder if there is any plausible approach that would work closer to how units are treated in physics calculations - not as types, but as special numerical values, more or less (values that you can multiply/divide by any other unit, but can only add/subtract with themselves).

cosmic_quanta

20 hours ago

> ... not as types, but as special numerical values, more or less (values that you can multiply/divide by any other unit, but can only add/subtract with themselves).

What's so cool about `dimensional` is that the types work the way you describe!

Normally, in Haskell, the addition and multiplication functions are typed like so:

    (+) :: Double -> Double -> Double
    (*) :: Double -> Double -> Double
With `dimensional`, addition looks like:

    (+) :: Quantity d Double -> Quantity d Double -> Quantity d Double
which means you can't add quantities with incompatible dimensions, so far so good. But multiplication looks like:

    (*) :: Quantity d1 Double -> Quantity d2 Double -> Quantity (d1 * d2) Double
That is, the dimensions of the result of multiplication (d1 * d2) follow the physics meaning of multiplication. What can look confusing is that (d1 * d2) is a type-level calculation which is run at compile-time.

This means that `dimensional` isn't only about 'statically checking' dimensions and units, but also inferring them.

You can see this interactively. What is the type of 1 meter divided by 1 second:

    ghci> :t (1 *~ meter) / (1 *~ second)
    Quantity (Dim Pos1 Zero Neg1 Zero Zero Zero Zero) Double
where (Dim Pos1 Zero Neg1 Zero Zero Zero Zero) is a synonym for the dimension of velocity, which is not checked -- it is inferred at compile time.

simiones

20 hours ago

Nice, I didn't realize this! Thank you for going into this level of detail.

antononcube

18 hours ago

Wolfram Language (aka Mathematica) is the best for doing scientific computing with physical unit dimensions.

See: https://reference.wolfram.com/language/guide/Units.html

> The units framework [of Wolfram Language] integrates seamlessly with visualization, numeric and algebraic computation functions. It also supports dimensional analysis, as well as purely symbolic operations on quantities.

hexane360

14 hours ago

Note that Mathematica units impose a very large runtime penalty, making them unsuitable for a lot of applications

antononcube

13 hours ago

The applications that are WL-units suitable are much more than the WL-units unsuitable ones you refer to.

It is true that I (and others) avoid using Wolfram Language (WL) units. Or try to get the computation expressions units-free as quickly as possible. But that is also a normal procedure when crafting computational workflows.

librasteve

a day ago

This is a great article and I like that you are highlighting the benefits of Dimensions as Types.

While Raku is less popular than Python, it does have deep roots in Haskell and strong types (the first Raku / Perl6 parser - PUGS - was written in Haskell and all the early devs were encouraged to learn Haskell first).

Similar concepts are used in these Raku modules... which provide dimensional analysis and marry types to dimensions.

- https://raku.land/zef:librasteve/Physics::Measure

- https://raku.land/zef:librasteve/Physics::Unit

I had some fun with making this example of using these Raku modules with Jupyter https://rakujourney.wordpress.com/wp-content/uploads/2023/08...

[disclaimer: I am the author of these modules]

Raku is also good at Slangs (Sublanguages) and unicode, so these tools can be used in a very intuitive way:

  #!/usr/bin/env raku
  use Physics::Constants;
  use Physics::Measure :ALL;

  say ~ℏ;     #1.054571817e-34 J.s

  my \λ = 2.5nm;
  say "Wavelength of photon (λ) is " ~λ;

  my \ν = c / λ;
  say "Frequency of photon (ν) is " ~ν.in('petahertz');

  my \Ep = ℎ * ν;
  say "Energy of photon (Ep) is " ~Ep.in('attojoules');

  Wavelength of photon (λ) is 2.5nm
  Frequency of photon (ν) is 119.92PHz
  Energy of photon (Ep) is 79.46aJ

amelius

21 hours ago

Can it also do reference frames?

Like if I have a cartesian coordinates in one reference frame, can I use them in a type-safe way? E.g. not add two vectors in two reference frames? Same for polar coordinates?

Etc.

cosmic_quanta

21 hours ago

The `dimensional` library doesn't provide this, no. However, it's easy to 'tag' data in Haskell using phantom types. For example, for vectors in reference frames:

    newtype Framed f a = MkFramed (Vector a)
    
    add :: Framed f a -> Framed f a -> Framed f a
    add (MkFramed vec1) (MkFramed vec2) = ... add vec1 and vec2
    
    data ReferenceFrame1
    data ReferenceFrame2
    
    v1 :: Framed ReferenceFrame1 Double
    v1 = ...
    
    v2 :: Framed ReferenceFrame2 Double
    v2 = ...
    
    main :: IO ()
    main = print $ v1 `add` v2 -- will not type check
   
There might already be a library to do this

amelius

19 hours ago

Ok, but I'd also want to convert between polar and cartesian coordinates.

toxik

a day ago

I was so disappointed that this turned from Python to Haskell, I would just _love_ to have something like this in Python. Even just type annotations.

A similar thing I have been thinking about doing in robotics contexts is to annotate vectors and transformation matrices with their reference frames. So you can only matrix-vector multiply between compatible types: `object_in_world = camera_to_world @ object_in_camera`. It can be done somewhat in C++.

kardos

a day ago

Could this be done with an IDE plugin (for Pycharm or whatnot)? It is tedious to go though code and carefully annotate each variable and run so many compile test, read error, revise, cycles. It should be possible on a per function basis to statically check how each variable relates to the inputs and outputs, and just annotate those. Then the plugin could solve the units for each item, and only ask for an annotation where needed to resolve an ambiguous case.

Hizonner

a day ago

I used that package once, to write a program to do US taxes on Canadian mutual funds. Yes, it seemed necessary.