cesarb
4 days ago
Part of this complexity comes from what IMO was a mistake in Java's design (which was AFAIK copied by C#): the base Object class does too much. It has equality comparison, string conversion, object hashing, and a per-object re-entrant lock. Other than equality comparison (which is also bad because it contributes to the perennial confusion between identity equality and value equality), these need extra storage for each and every object in the system (string conversion contains the object hash code as part of its default output). Some tricks are used to avoid most of the space overhead for the per-object lock, at the cost of extra complexity.
pulse7
9 hours ago
On the other hand: Smalltalk had many, many more methods on Object than Java has today...
ygra
9 hours ago
Isn't just the lock something that potentially needs space per object?
layer8
6 hours ago
Object::hashCode returns System.identityHashCode(Object) by default. Since GC can move objects around in memory, and the hash code of an object needs to be stable, this default hash code can’t be based on the memory address of the object, and thus needs to be stored per object.
Since System.identityHashCode() can be invoked (for example by IdentityHashMap) even for objects of classes that implement a custom hash code, it also can’t be optimized away even for such classes.
Conceivably it could be optimized away for an object unless or until System.identityHashCode() is invoked for it. It could thus be allocated on demand similarly to how the object locks are. Of course, this has all kinds of performance trade-offs.
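A minimal sketch of the point above, assuming nothing beyond the JDK (the class names IdentityHashDemo and Point are made up for illustration): a class can override hashCode, yet System.identityHashCode and IdentityHashMap still work on its instances, so the identity hash has to stay available and stable.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityHashDemo {
    // Hypothetical value class with its own hashCode; the identity hash still exists.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
        }
        @Override public int hashCode() { return 31 * x + y; }
    }

    // The identity hash must not change during the lifetime of the object.
    static boolean identityHashIsStable(Object o) {
        int first = System.identityHashCode(o);
        System.gc(); // only a hint; the collector may or may not move the object
        return first == System.identityHashCode(o);
    }

    // IdentityHashMap ignores equals/hashCode overrides and keys on identity.
    static int distinctIdentityKeys(Object a, Object b) {
        Map<Object, String> m = new IdentityHashMap<>();
        m.put(a, "a");
        m.put(b, "b");
        return m.size();
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2), q = new Point(1, 2);
        System.out.println(p.equals(q));                // equal by value
        System.out.println(distinctIdentityKeys(p, q)); // two keys by identity
        System.out.println(identityHashIsStable(p));
    }
}
```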
guipsp
9 hours ago
You can lock on any object - if you dynamically create these locks, then you need to coordinate creation among threads.
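To illustrate "lock on any object", a small sketch (the class AnyObjectLockDemo and its method names are invented here): any reference can serve as a monitor, and the same thread may re-enter it.

```java
class AnyObjectLockDemo {
    // Any object reference can be used as a monitor.
    static boolean locksOn(Object monitor) {
        synchronized (monitor) {
            return Thread.holdsLock(monitor);
        }
    }

    // Monitors are re-entrant: the owning thread can acquire the same one again.
    static boolean reentersOn(Object monitor) {
        synchronized (monitor) {
            synchronized (monitor) {
                return Thread.holdsLock(monitor);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(locksOn(new Object()));
        System.out.println(locksOn(new int[0])); // arrays are objects too
        System.out.println(reentersOn(new StringBuilder()));
    }
}
```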
masklinn
9 hours ago
They're saying that only the lock should need storage in the object header, everything else can be computed on the fly (obviously at a CPU cost rather than memory).
Equality is computed (and the default is trivial, it just compares addresses), hashcode is computed (and the default is trivial, it just returns the object's address), string conversion is computed (and the default is trivial, it just prints the class name and the hashcode IIRC).
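For reference, the documented defaults can be checked with a short sketch (DefaultObjectMethods is a made-up name). Note that the Javadoc specifies the string as class name, "@", and the hex hash code; it does not promise that the hash is a memory address.

```java
class DefaultObjectMethods {
    // Rebuilds the documented default Object.toString() output by hand.
    static String defaultToString(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Object a = new Object();
        Object b = new Object();
        System.out.println(a.toString().equals(defaultToString(a))); // true
        System.out.println(a.equals(b)); // false: default equals is reference comparison
        System.out.println(a.equals(a)); // true
    }
}
```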
greiskul
9 hours ago
I'm not sure you can just return the object address for hashcode, because with GC the object address can change, right? So if you are using it for hashcode, you do need to persist it somewhere after it's called the first time.
Nevermark
8 hours ago
Would an optional "Hashable" class interface be a good way to reduce the classes (and objects) that need to have a hash? Or is having a hash an unavoidable primitive feature?
Also, would an optional "Lockable" class interface be a good way to drastically reduce the classes (and objects) that need to maintain lock information?
I am religiously averse to unused but implemented "Positive-Cost Abstractions". I dream of a root class with no methods except "new", and instance methods delete() and isSelf(x).
Even "Polymorphism" could be an opt-in interface. Subclasses of non-polymorphic classes would inherit functionality (yay, reusability), but not be a subtype of their root class.
(With commutativity between subclassing & the polymorphic interface. I.e. If A is a non-polymorphic class, and Ap is a polymorphic version of class A, then all subclasses of Ap and all polymorphic subclasses of A, would be subtypes of Ap.)
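A rough sketch of what an opt-in "Hashable" could look like at the API level. Everything here (Hashable, TinySet, UserId) is hypothetical, and in real Java every object still carries hashCode; this only models the shape of the idea, where a hash-based container accepts only types that opted in.

```java
import java.util.ArrayList;
import java.util.List;

class OptInHashSketch {
    // Hypothetical opt-in capability; not part of the JDK.
    interface Hashable {
        int hash();
        boolean sameValue(Object other);
    }

    // A toy bucketed set that only accepts types implementing Hashable.
    static final class TinySet<T extends Hashable> {
        private final List<List<T>> buckets = new ArrayList<>();

        TinySet() {
            for (int i = 0; i < 16; i++) buckets.add(new ArrayList<>());
        }

        // Returns false if an item with the same value is already present.
        boolean add(T item) {
            List<T> bucket = buckets.get(Math.floorMod(item.hash(), 16));
            for (T existing : bucket) {
                if (existing.sameValue(item)) return false;
            }
            bucket.add(item);
            return true;
        }
    }

    // Hypothetical type that opts in to hashing.
    static final class UserId implements Hashable {
        final long id;
        UserId(long id) { this.id = id; }
        @Override public int hash() { return Long.hashCode(id); }
        @Override public boolean sameValue(Object o) {
            return o instanceof UserId && ((UserId) o).id == id;
        }
    }

    public static void main(String[] args) {
        TinySet<UserId> set = new TinySet<>();
        System.out.println(set.add(new UserId(7))); // true
        System.out.println(set.add(new UserId(7))); // false, same value
    }
}
```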
_old_dude_
9 hours ago
> the default is trivial, it just returns the object's address
This was trivial a long time ago. Now, all Java GCs move objects in memory.
SpaghettiCthulu
9 hours ago
The default `toString` implementation isn't cached, is it?
layer8
6 hours ago
Not the OP, but my beef with toString is that for some classes it is an essential part of the interface contract that requires a stable and documented string mapping (e.g. for value types like BigInteger or URI, and for String itself), whereas for other classes it just serves as a way to provide a debugging/logging representation that may change from one version to the next, and whose exact representation should not be relied on. These are really two separate purposes with a different interface contract.
It would have been better for Object to have a toDebugString method, and to restrict implicit string conversion (concatenation) to classes implementing a StringConvertible interface with a corresponding separate toString method.
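A sketch of how that split might look, under loud assumptions: DebugPrintable, StringConvertible, stringValue(), Money and Session are all invented for illustration. The stable method is called stringValue() here because an interface cannot supply its own toString() in today's Java, and concat() stands in for a concatenation operator restricted to opted-in types.

```java
class StringContractSketch {
    // Hypothetical: an unstable representation meant for debuggers and logs.
    interface DebugPrintable {
        default String toDebugString() {
            return getClass().getSimpleName() + "@"
                    + Integer.toHexString(System.identityHashCode(this));
        }
    }

    // Hypothetical: only types with a documented, stable string form opt in.
    interface StringConvertible {
        String stringValue();
    }

    // Has a defined value representation (assumes non-negative amounts).
    static final class Money implements DebugPrintable, StringConvertible {
        final long cents;
        Money(long cents) { this.cents = cents; }
        @Override public String stringValue() {
            long rest = cents % 100;
            return (cents / 100) + "." + (rest < 10 ? "0" : "") + rest;
        }
    }

    // Has only a debug representation.
    static final class Session implements DebugPrintable {}

    // Stand-in for concatenation that accepts only opted-in types.
    static String concat(StringConvertible... parts) {
        StringBuilder sb = new StringBuilder();
        for (StringConvertible p : parts) sb.append(p.stringValue());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat(new Money(1250)));       // 12.50
        System.out.println(new Session().toDebugString()); // Session@ plus a hex hash
        // concat(new Session()); // would not compile: no stable string form
    }
}
```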
mcdeltat
6 hours ago
Astounding how so many languages and programmers don't make the clear distinction between "debug string", "canonical string", "human readable string", etc. There is no such thing as a totally generic "to string" function for any nontrivial program.
The approach I'm most a fan of is functional languages where everything has a fixed canonical string representation (even cooler when you can convert the string directly back to code), and everything else you must explicitly create a function for.
josefx
4 hours ago
> These are really two separate purposes with a different interface contract.
This is a basic feature of inheritance in an object-oriented language: you can take an interface that guarantees "this returns some string" and offer a more concrete guarantee "this returns the object's value as decimal" in the implementation.
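As a small illustration of that refinement (ContractRefinementDemo and its methods are made-up names): code that only knows Object can rely on "some string", while code that knows it has a BigInteger can rely on the documented decimal form.

```java
import java.math.BigInteger;

class ContractRefinementDemo {
    // Only relies on the weak Object contract: toString() returns some string.
    static String describe(Object o) {
        return "value=" + o;
    }

    // Relies on BigInteger's documented decimal representation.
    static BigInteger roundTrip(BigInteger n) {
        return new BigInteger(n.toString());
    }

    public static void main(String[] args) {
        BigInteger n = new BigInteger("255");
        System.out.println(describe(n));            // value=255
        System.out.println(roundTrip(n).equals(n)); // true
        System.out.println(n.toString(16));         // ff
    }
}
```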
> and to restrict implicit string conversion (concatenation) to classes implementing a StringConvertible interface with a corresponding separate toString method.
So anyone wanting to make their code trivially loggable now has to implement StringConvertible by copy pasting String toString(){ return toDebugString(); } into every single class they are implementing? You managed to make Java more verbose for no gain at all, please collect your AbstractAwardInstanceFactoryBuilder on your way out.
layer8
3 hours ago
> So anyone wanting to make their code trivially loggable now has to implement StringConvertible by copy pasting String toString(){ return toDebugString(); } into every single class they are implementing?
If you actually want to output a debug representation, you’ll explicitly call toDebugString(). (And a debugger would call it by default.) This would also make the purpose explicit in the code. And you wouldn’t accidentally output a random debug representation (like the default "@xxxxxxxx") as part of regular string concatenation/interpolation, like on a user-level message, or as a JSON value or whatever. This is why it would be wrong to have a toString() forward to toDebugString().
Currently, for most classes I have to add javadoc for the toString() method saying something like: “Returns a debug representation of this object. WARNING: The exact representation may change in future versions, do not rely on it.” For some of these classes a reliable non-debug string representation would conceivably make sense, but I chose not to have one because there is no immediate need. However, callers need to know which it is, and therefore the documentation is needed.
Conversely, whenever I want to use the toString() of a third-party class, I have to check what kind of output it generates, but unfortunately it’s often not documented. And if testing it (or looking at the source) seems to produce a stable value representation, one has no choice but to hope that it will remain stable, instead of that being part of the contract.
Furthermore, for classes with a value representation it often makes sense to have a different debug representation (for example, with safely escaped control characters or additional meta-data). In current Java, it’s safer to have those in a different, non-standard method than toString() (because users expect the latter to provide the value representation), but then there’s the inconvenience that the debug representation won’t be picked up by debuggers by default, due to the non-standard method.
This is all a symptom of the same method being used for different purposes. And a debug representation always makes sense (as evidenced by the default implementation), while a value representation only sometimes makes sense, and might be absent even when it would make sense. But you generally can’t tell from the method.
Having different methods would solve those issues. With a toDebugString() method, one wouldn’t have to document anything, because the javadoc I paraphrased above would already be contained in the Object class. And the toString() method would only be present for classes that do provide a defined value representation that makes sense on the business/domain level of the class.
invalidname
9 hours ago
It is not, but strings are cached (interned) in Java, which is a different thing.
layer8
5 hours ago
String literals are interned, strings in general aren’t.
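A quick way to see the difference, as a minimal sketch (InternDemo and sameObject are made-up names): literals share one object, a string created with new is a distinct object, and intern() maps it back to the shared one.

```java
class InternDemo {
    // Reference comparison on purpose: this checks identity, not content.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "hello";
        String sameLiteral = "hello";
        String built = new String("hello"); // explicitly a new, non-interned object

        System.out.println(sameObject(literal, sameLiteral));    // true
        System.out.println(sameObject(literal, built));          // false
        System.out.println(sameObject(literal, built.intern())); // true
        System.out.println(literal.equals(built));               // true
    }
}
```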
specialist
8 hours ago
Ya. Hindsight's 20/20.
I've half expected the Java/JVM team to change Object to extend a new "NakedObject" and implement new interfaces Equalable, Hashable, Finalizable, and Waitable. (The current interface Cloneable is a goof, so maybe deprecate it and replace it with Copyable.)
Then "NakedObject" would only need getClass method, right?
Then values and records could also extend NakedObject, right?
Then equals and clone/copy could be generic, right?
--
Alternately, maybe prevent the gotchas with missing equals, hashCode, and toString by having the runtime autogenerate something reasonable.
layer8
6 hours ago
This would break the invariant that x instanceof Object is true for all non-primitive values x. This assumption is baked into too much code and too many APIs.
For example, you couldn’t add a NakedObject-but-not-Object to a java.util.List, because what Object would List::get(index) then return for it? (Note that the List’s type parameter doesn’t exist at runtime and may also not exist at compile time.)
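The List point can be seen in a short sketch (ErasureDemo and its methods are invented names): with a raw List there is no type parameter even at compile time, so get() can only hand back Object, and at runtime differently parameterized lists share one class.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // With a raw List, the only static type get() can return is Object.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Object firstOfRaw(Object element) {
        List raw = new ArrayList(); // no type parameter at compile time
        raw.add(element);
        return raw.get(0);
    }

    // Type arguments are erased: both lists have the same runtime class.
    static boolean sameRuntimeClass() {
        return new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass();
    }

    public static void main(String[] args) {
        System.out.println(firstOfRaw("x") instanceof Object); // true
        System.out.println(sameRuntimeClass());                // true
    }
}
```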
masklinn
8 hours ago
> maybe prevent the gotchas with missing equals, hashCode, and toString
There's no actual gotcha to them not existing. It works perfectly fine in Haskell or Rust, for instance.
That said, it's not a fundamentally useful change if objects for which a sensible equals/hashCode is trivial lose it while objects for which it isn't still have it. So unless you can reach back and stop those properties from being universal, I fail to see what the point would be.