raju
10 hours ago
Let me start by saying (as someone who has written a few technical books of his own)—Congratulations!
I am sure you (assuming this is your first book) are learning that this is a labor of love, and I wish you the very best in this endeavor. You should be proud!
I was exposed to "data oriented programming" thanks to Clojure—wherein maps/sets are the constructs used to pass data (as plain data) around, with simple functions that work with the data, as opposed to the traditional OO (hello ORM) that mangles data to fit some weird hierarchy.
Java's recent innovations certainly make this a lot easier, and I am glad someone is looking at propagating a much needed message.
I will take a look at the book, but I wish you the very best.
mrbonner
10 hours ago
I am also very interested in how this works in practice. With OOP at least you know the shape of your data structure, as opposed to the hash map as a mere container type.
geophile
9 hours ago
I am an OOP programmer going back to the late 80s (including the cfront days of C++), and a serious user of Python since 2007.
In Python, I sometimes try data-oriented programming, using lists and dicts to structure data. And I find that it does not work well. Once I get two or more levels of nesting, I find it far too easy to get confused about which level I'm on, which is not helped by Python's lack of strong typing. In these situations, I often introduce objects that wrap the map or dict, and have methods that make sense for that level. In other words, the objects can be viewed as providing clear documentation for the whole nested structure, and how it can be navigated.
goostavos
9 hours ago
>Once I get two or more levels of nesting, I find it far too easy to get confused about which level I'm on
Author here, I agree with you. I have the working memory of a small pigeon.
The flavor of data orientation we cover in the book leverages strongly typed representations of data (as opposed to using hash maps everywhere). So you'll always know what shape it's in (and the compiler enforces it!). We spend a lot of time exploring the role that the type system can play in our programming and how we represent data.
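A minimal sketch of what "strongly typed instead of hash maps" might look like for nested data, assuming Java 16+ records. The `Address`, `Customer`, `Order`, and `OrderShapes` names are invented for illustration and are not taken from the book.

```java
import java.util.List;

// Hypothetical domain types: each level of nesting has its own named shape.
record Address(String city, String postcode) {}
record Customer(String name, Address address) {}
record Order(Customer customer, List<String> items) {}

class OrderShapes {
    // The compiler checks every step of the navigation. A typo such as
    // order.customer().adress() fails to compile instead of failing at runtime.
    static String shippingCity(Order order) {
        return order.customer().address().city();
    }
}
```

Compared with a map of maps, the level you are on is always visible in the type of the expression.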
joshlemer
9 hours ago
Given the strongly typed flavour of data oriented programming, I wonder if you have any thoughts on the "proliferation of types" problem. How do you avoid, especially in a nominally typed language like Java, an explosion of aggregate types for every context where there may be a slight change in what fields are present, what their types are, and which ones are optional? Basically, Rich Hickey's Maybe Not talk.
record Make(makeId, name)
record Model(modelId, name)
record Car(make, model, year)
record Car(makeId, modelId, year)
record Car(make, model)
record Car(makeId, modelId)
record Car(make, year)
record Car(makeId, year)
record Car(make, model, year, colour)
record Car(makeId, modelId, year, colour)
record Car(year, colour)
....
kentosi
5 hours ago
I haven't yet had the luxury to experiment with the latest version of Java, but this is one of the reasons why I wish Java introduced named parameters the same way Kotlin and Scala do.
Eg:
data class Make(val makeId: String, val name: String)
data class Model(val modelId: String, val name: String)
data class Car(val make: Make, val model: Model, val year: String, ...)
Now you can go ahead and order the params whichever way you wish so long as you're explicitly naming them:
val v1 = Car(make = myMake1, model = myModel1, year = "2023", ...)
val v2 = Car(model = myModel1, make = myMake1, year = "2023", ...)
goostavos
8 hours ago
I have a long convoluted answer to this.
I love that talk (and most of Rich's stuff). I consider myself a Clojure fanboy that got converted to the dark side of strong static typing.
I think, to some degree, he actually answers that question as part of his talk (in between beating up nominal types). Optionality often pops up in place of understanding (or representing) that data has a context. If you model your program so that it has "15 maybe sheep," then... you'll have 15 "maybe sheep" you've got to deal with.
The possible combinations of all data types that could be made is very different from the subset that actually express themselves in our programs. Meaning, the actual "explosion" is fairly constrained in practice because (most) businesses can't function under combinatorial pressures. There's some stuff that matters, and some stuff that doesn't. We only have to apply typing rigor to the stuff that matters.
Where I do find type explosions tedious and annoying is not in expressing every possible combination, but in trying to express the slow accretion of information. (I think he talks about this in one of his talks, too). Invoice, then InvoiceWithCustomer, then InvoiceWithCustomerAndId, etc... the world that microservices have doomed us to representing.
I don't know a good way to model that without intersection types or something like Rows in purescript. In Java, it's a pain point for sure.
jakjak123
8 hours ago
Hopefully your domain is sane enough that you can read nearly all the data you are going to use up front, then pass it on to your pure functions. Speaking from a Java perspective.
1propionyl
8 hours ago
My sense is that what's needed is a generalization of the kinds of features offered by TypeScript for mapping types to new types (e.g. Partial<T>) "arithmetically".
For example, what I often want to express directly is "T but minus/plus this field", with the transformations that attach or detach fields automated.
In an ideal world I would like to define what a "base" domain object is shaped like, and then express the differences from it I care about (optionalizing, adding, removing, etc).
For example, I might have a Widget that must always have an ID but when I am creating a new Widget I could just write "Widget - {.id}" rather than have to define an entire WidgetCreateDTO or some such.
glenjamin
7 hours ago
Do you mean in TypeScript or in another language?
In TS the `Omit<T, K>` type can be used to remove stuff, and intersection can be used to add stuff
piva00
8 hours ago
> For example, I might have a Widget that must always have an ID but when I am creating a new Widget I could just write "Widget - {.id}" rather than have to define an entire WidgetCreateDTO or some such.
In this case you're preferring terseness over a true representation of the meaning of the type. Assuming that a Widget needs an ID, having another type to express Widget creation data makes sense, it's more verbose but it does represent the actual functioning better. You pass data that will be used to create a valid Widget in its own type (your WidgetCreateDTO), getting a Widget as a result of the action.
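A rough sketch of that separation in Java, assuming records. `NewWidget` stands in for the WidgetCreateDTO mentioned above, and `Widgets.create` is a hypothetical function, not an existing API.

```java
import java.util.UUID;

// Hypothetical types: the input to creation has no id; a created Widget always has one.
record NewWidget(String name) {}
record Widget(UUID id, String name) {}

class Widgets {
    // Creation is the only place an id gets assigned, so a Widget
    // without an id is never handed to the rest of the program.
    static Widget create(NewWidget input) {
        return new Widget(UUID.randomUUID(), input.name());
    }
}
```

The cost is the second type; the benefit is that "not yet created" and "created" cannot be confused at a call site.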
1propionyl
5 hours ago
> Assuming that a Widget needs an ID, having another type to express Widget creation data makes sense, it's more verbose but it does represent the actual functioning better
I agree with this logically. The problem is that the proliferation of such types for various use cases is extremely detrimental to the development process (many more places need to be updated) and it's all too easy for a change to be improperly propagated.
What you're saying is correct and appropriate I think for mature codebases with "settled" domains and projects with mature testing and QA processes that are well into maintenance over exploration/iteration. But on the way there, the overhead induced by a single domain object whose exact definition is unstable potentially proliferating a dozen types is developmentally/procedurally toxic.
To put a finer point on it: be fully explicit when rate of change is expected to be slow, but when rate of change is expected to be high favor making changes easy.
PaulHoule
5 hours ago
Hickey is great at trash-talking other languages. In the case of Car you might build a set of builders where you write
Car.builder().make("Buick").model("LeSabre").build()
Or in a sane world code generate a bunch of constructors.
In the field of ontology (say OWL and RDF) there is a very different viewpoint about ‘Classes’, in that objects gain classes as they gain attributes. :Taylor_Swift is a :Person because she has a :birthDate, :birthPlace and such but was not initially a :Musician until she :playsInstrument, :recordedTrack, :performedConcert and such. Most languages have object systems like Java or C++ where a Person can’t start out as not a Musician but become one later, the way they can in real life.
Notably, in a system like that, the terrible asymmetry of where an attribute really belongs is resolved: as in real life, you don’t have to say whether it is primary that Taylor Swift recorded the album Fearless or that Fearless was recorded by Taylor Swift.
It’s a really fascinating question in my mind how you create a ‘meta object facility’ that puts a more powerful object system at your fingertips in a language like Java or Python. For instance, you can have something like
taylorSwift.as(Musician.class)
which returns something that implements the Musician.class interface if taylorSwift.isA(Musician.class)
where taylorSwift instanceof MetaObject
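One way such a facility could be sketched in plain Java, purely as an illustration: the `MetaObject` class, its `addRole` method, and the `Person`/`Musician` interfaces below are all hypothetical, and `as` returns an `Optional` here rather than the bare object so the "not a Musician yet" case stays explicit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical role interfaces.
interface Person { String birthPlace(); }
interface Musician { String instrument(); }

// A hypothetical MetaObject: an entity that can gain roles ("classes") over time.
final class MetaObject {
    private final Map<Class<?>, Object> roles = new HashMap<>();

    // Attach a role implementation, e.g. once the entity gains the relevant attributes.
    <T> MetaObject addRole(Class<T> role, T impl) {
        roles.put(role, impl);
        return this;
    }

    boolean isA(Class<?> role) {
        return roles.containsKey(role);
    }

    // Returns the role view if the entity currently has it; Class.cast keeps this type-safe.
    <T> Optional<T> as(Class<T> role) {
        return Optional.ofNullable(roles.get(role)).map(role::cast);
    }
}
```

With this sketch, `taylorSwift.as(Musician.class)` is empty until `addRole(Musician.class, ...)` has been called, which mirrors an object gaining a class later in its life.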
cbsmith
4 hours ago
That one is pretty simple. You have a Car object with four fields. The types of the fields are, respectively Optional<Make>, Optional<Model>, Optional<Year>, and Optional<Colour>.
Hickey makes it sound worse than it is.
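Roughly what that could look like, as a sketch only. `Make` and `Model` are given a single hypothetical `name` field, and year and colour are simplified to `Integer` and `String` instead of dedicated `Year` and `Colour` types.

```java
import java.util.Optional;

// Hypothetical field types; the thread only names them.
record Make(String name) {}
record Model(String name) {}

// One Car type; absence is explicit in each field's type.
record Car(Optional<Make> make, Optional<Model> model,
           Optional<Integer> year, Optional<String> colour) {

    // Callers have to decide what to do when a field is missing.
    String describe() {
        String makeName = make.map(Make::name).orElse("unknown make");
        String yearText = year.map(Object::toString).orElse("unknown year");
        return makeName + " (" + yearText + ")";
    }
}
```

Every consumer now handles absence for every field, which is the trade-off the rest of the thread is arguing about.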
geophile
8 hours ago
This discussion sounds like there is confusion about the Car abstraction.
Make and model vs. makeId and modelId: Pick one. Are Make and Model referenced by Cars or not? There seems a slight risk of the Banana/Monkey/Jungle problem here, so maybe stick with ids, and then rely on functions that look up makes and models given ids. I think it's workable either way.
As for all the optional stuff (color, year, ...): What exactly is the problem? If Cars don't always have all of these properties then it would be foolish of Car users to just do myCar.colour, for example. Test for presence of an optional property, or use something like Optional<T> (which amounts to a language supported testing for presence). Doesn't any solution work out pretty much the same? When I have had this problem, I have not done a proliferation of types (even in an inheritance hierarchy) -- that seems overly complicated and brittle.
nicoty
8 hours ago
I'm not familiar with Java. Does it have no notion of structural types at all? If it does, maybe you could wrap those fields in `Car` with `Maybe`/`Option` (I’m not sure what the equivalent is in Java) so you get something like `Car(Maybe Make, Maybe Model, Maybe Year, Maybe Colour)`?
vips7L
an hour ago
Records are structural types. Null restricted types are in draft: https://openjdk.org/jeps/8303099
spullara
8 hours ago
yes and it is called Optional (rather than Maybe)
js2
8 hours ago
> Python's lack of strong typing
I see people conflate strong/weak and static/dynamic quite often. Python is strong[1]/dynamic, with optional static typing through annotations and a type checker (mypy, pyright, etc).
Perhaps the easiest way to add static types to data is with pydantic. Here's an example of using pydantic to type-check data provided via an external yaml configuration file:
https://news.ycombinator.com/item?id=41508243
[1] strong/weak are not strictly defined, as compared to dynamic/static, but Python is absolutely on the strong end of the scale. You'll get a runtime TypeError if you try to add a number to a string, for example, compared to say JavaScript which will happily provide a typically meaningless "wat?"-style result.
jstimpfle
6 hours ago
In some significant ways, it's not strong at all. It's stronger than JavaScript but it's difficult not to be. Python is a duck typing language for the most part.
js2
5 hours ago
Duck typing is an aspect of it being dynamically typed, not whether it is strong/weak. But strong/weak is not formally defined, so if duck typing disqualifies it for you, so be it.
https://langdev.stackexchange.com/questions/3741/how-are-str...
cbsmith
4 hours ago
I always think of Python as having "fairly strong" typing, because you can override the type of objects by just assigning to __class__.
thelastparadise
8 hours ago
You're being pydantic =)
cbsmith
4 hours ago
Living this dream in Python right now (inherited a code base that used nasty nesting of lists & dicts). You don't strictly need to do OOP to solve the problem, but it really does help to have a data model. Using dataclasses to map out the data structures makes the code so much more readable, and the support for type hints in Python is good enough that you can even debug problems with the type system.
mejutoco
9 hours ago
I recommend you use pydantic for type annotations. Alternatively, dataclasses. Then you pair it with typeguard's @typechecked decorator and the types will be checked at runtime for each method/function. You can use mypy to check it at "compile time".
Having clear data types without oop is possible, even in python.
sodapopcan
9 hours ago
Python's not really built for that AFAIK, though. In languages built for it, you can type your dicts/hashes/maps/whatever and it's easier to see what they are/know where the functions that operate on them live. I'm most familiar with Elixir, which has structs, which are simply specialized maps (analogous to dict in Python) where their "type" is the name of the module they belong to. There can only be one struct per module. In this sense it is easy to know exactly where its functions live, and it is almost like a class, with the very key difference that modules are not stateful.
cbsmith
4 hours ago
> In languages built for it, you can type your dicts/hashes/maps/whatever and it's easier to see what they are/know where the functions that operate on them live.
I think I must be misunderstanding what you mean by that, because I can very much do that in Python.
ederamen
5 hours ago
Use Data Classes
FpUser
9 hours ago
>"In these situations, I often introduce objects that wrap the map or dict, and have methods that make sense for that level."
I've been doing the same thing since the end of the 80s as well starting with Turbo/Borland Pascal, C++, and later any other language that supports OOP.
kccqzy
9 hours ago
Clojure has spec. That allows you to know a specification of what the data structure contains.
akavi
9 hours ago
You can get strongly typed "shaped" data without objects[0], even in Java: Records[1].
~Unfortunately, I believe they're mutable (and cannot be made immutable).~ Edit: I was wrong, they're immutable.
[0]: I'm using "object" to mean "data bound to methods", since the concept of aggregate data in general long predates OOP (eg, C's structs)
[1]: https://docs.oracle.com/en/java/javase/17/language/records.h...
taftster
9 hours ago
Java Records are immutable (by the most common definition). They don't have any means to update the record (via setters, etc.) after construction. That doesn't mean, for example, you can't store a reference to a mutable type (for example, a List or Map) in your record.
The frustration I have with Records is there is no good way to prevent direct construction of them. That is, the constructor is public, which prevents an easy way of enforcing an invariant during construction.
For example, let's say that you have a record with a Date type. There's no good way to prevent a user from creating the record with an invalid date, one that is out of a needed date range. Or maybe enforcing a field cannot be null or some combination of fields must meet requirements as a group.
The benefit I get from the classic Builder pattern is defeated with Records. I can't enforce checking of my fields before the construction of the record object itself. Presumably I would need to verify the object after construction, which is unfortunate.
vips7L
9 hours ago
You can enforce some invariants during construction:
record Point(int x, int y) {
    Point {
        if (x < 0) throw new IllegalArgumentException();
    }
}
or if you want to assert something is not null:
record Person(String name) {
    Person {
        Objects.requireNonNull(name);
    }
}
tpmoney
6 hours ago
As mentioned by the other commenters, you should be able to run any validations or transformations on the data that you want in the canonical constructor, including re-assigning values (for example we've done defaults with `foo != null ? foo : new DefaultFoo()`). The only thing I think you can't do with a record is make the canonical constructor private and force users of your type to call factory methods that can return null instead of throwing an exception. You can provide those factory methods, but anyone can still call the constructor, so you have to do your hard checks in the constructor. On the other hand, no matter how many alternate constructors or factory methods you make, you're also guaranteed that every one of them eventually has to call the canonical constructor, so you don't need to spread your validation around either.
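A small sketch of the points above, assuming Java 16+ records. The `Booking` record, its `EARLIEST` bound, and the "anonymous" default are made up for illustration.

```java
import java.time.LocalDate;

// Hypothetical record: a booking whose date must not fall before an allowed bound.
record Booking(String guest, LocalDate date) {
    static final LocalDate EARLIEST = LocalDate.of(2000, 1, 1);

    // Compact canonical constructor: runs for every construction path.
    Booking {
        if (guest == null || guest.isBlank()) {
            guest = "anonymous"; // reassigning the parameter applies a default
        }
        if (date == null || date.isBefore(EARLIEST)) {
            throw new IllegalArgumentException("date out of range: " + date);
        }
    }

    // Alternate constructor: it has to delegate to the canonical one,
    // so the checks above still run.
    Booking(LocalDate date) {
        this(null, date);
    }
}
```

Here `new Booking(LocalDate.of(1999, 12, 31))` throws, and `new Booking(LocalDate.of(2024, 5, 1)).guest()` is "anonymous", even though neither call mentions validation.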
snmx999
8 hours ago
You can create dedicated, already verified objects to pass on to your record. E.g. AllowedDate (extends Date).
akavi
9 hours ago
Can you make the Record class private to a module, and only export a static function that constructs them?
(I know very little about Java)
kaba0
9 hours ago
To a degree, yes, that’s possible. But leaking a private type over module boundaries is bad form, so a better (though possibly over-engineered) solution would be to have a separate public interface, implemented by the private record type, and the static function would have that interface as return type.
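A single-file sketch of that shape, using a private nested record rather than real module boundaries (with the module system you would export only the interface's package). `Percentage`, `Percentages`, and `Impl` are hypothetical names.

```java
// Hypothetical public-facing type: callers only ever see this interface.
interface Percentage {
    int value();
}

final class Percentages {
    private Percentages() {}

    // The record is private, so code outside this class cannot call its constructor.
    private record Impl(int value) implements Percentage {}

    // The static factory is the only way in, so it can enforce the invariant.
    static Percentage of(int value) {
        if (value < 0 || value > 100) {
            throw new IllegalArgumentException("not a percentage: " + value);
        }
        return new Impl(value);
    }
}
```

The price is that callers lose direct access to the record type, for example for record patterns, since they only hold the interface.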
enugu
4 hours ago
Why is it bad form to expose a record type only via custom functions and not its field accessors? Isn't this just like exposing a more usual object with its public functions and private functions remain inaccessible?
bedatadriven
9 hours ago
A record's fields are final, so records are immutable (though they can include immutable pointers to mutable objects)
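To make the "immutable pointer to a mutable object" caveat concrete, here is a small sketch. `LeakyPlaylist` and `Playlist` are invented names, and the defensive copy is one common remedy, not the only one.

```java
import java.util.List;

// Shallowly immutable: the field cannot be reassigned,
// but the list it points to can still be changed by whoever holds it.
record LeakyPlaylist(List<String> tracks) {}

// A defensive copy in the compact constructor closes that gap
// (for elements that are themselves immutable, such as String).
record Playlist(List<String> tracks) {
    Playlist {
        tracks = List.copyOf(tracks); // unmodifiable snapshot; throws on null list or null elements
    }
}
```

If the caller later adds to the original list, a `LeakyPlaylist` built from it sees the change, while a `Playlist` does not.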
goostavos
10 hours ago
Thanks for the kind words :)
>learning that this is a labor of love
I underestimated both the amount of labor and the amount of love that would be involved. There were more than a few "throw everything out and start over" events along the way to this milestone.
Clojure definitely had a huge impact on how I think about software. Similarly, Haskell and Idris have rearranged my brain. However, I still let Java be Java. The humble object is really tough to beat for managing many kinds of runtime concerns. The book advocates for strongly typed data and leveraging the type system as a tool for thinking.
>Java's recent innovations certainly make this a lot easier
Yeah, it's an exciting time! Java has evolved so much. Algebraic types, pattern matching, `with` expressions -- all kinds of goodies for dealing with data.
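For anyone who hasn't seen the algebraic-type side of this yet, a tiny sketch assuming Java 17 (sealed interfaces, records, `instanceof` patterns). The `PaymentResult` hierarchy is a made-up example. On Java 21+ the same function can be written as a `switch` over the sealed type, where the compiler checks that every case is covered.

```java
// Hypothetical sum type: a payment result is exactly one of these cases.
sealed interface PaymentResult permits Approved, Declined, Pending {}
record Approved(String receiptId) implements PaymentResult {}
record Declined(String reason) implements PaymentResult {}
record Pending() implements PaymentResult {}

class Payments {
    // Each pattern test binds a typed variable, so no casts are needed.
    static String summarize(PaymentResult result) {
        if (result instanceof Approved approved) {
            return "approved, receipt " + approved.receiptId();
        }
        if (result instanceof Declined declined) {
            return "declined: " + declined.reason();
        }
        return "pending"; // Pending is the only remaining permitted case
    }
}
```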