On the proposal for Data Classes

There is a new draft proposal for Java ‘Data Classes’ being worked on in Project Amber - read about it here. In short, I think the main points are:

Design intent: clearly and unequivocally express the design intent of a class as a ‘DTO’ (even though the author never mentions DTO)
Boilerplate: let the compiler take care of proper implementations of typical DTO operations like equals() and hashCode()
Switch statements: the author briefly talks about enabling semantic features around these DTOs, e.g. the ability to use them as targets for switch statements

The draft also goes into some specifics like accessors not necessarily following JavaBeans conventions, opt-in mutability, and migration/compatibility concerns, among others.

Let me tell you why I think this proposal is a bad idea.

Design intent:

This is an honorable goal: it will immediately be obvious to a reader what the design intent of a data class is. The reader wouldn’t have to wade through countless lines of typical getters and setters, equals, hashCode, and toString implementations to be sure that the class is only a data class. Fine.

To see the problem with this line of thinking you must first step back and ask yourself: how many DTO-types have I defined in my project and will this new language construct simplify my code considerably? If your answer is “a lot” to both questions, you have a deeper problem that this new construct cannot solve: you may not be thinking in an object-oriented way. Many of the types you’ve defined are merely bags of data and you are most likely programming in an imperative style, not object-oriented as you should be with Java. That is the crux of my argument.

Let’s kill the obvious retort right away: “DTO is an antipattern, but pure OOP is not practical, so there’s no point in judging and getting upset about it.”

Yes, DTOs are unavoidable sometimes, especially when interfacing with remote APIs (e.g. JAX-WS). However, a couple of points:

These should exist at the periphery of your code
They should be encapsulated by your domain model, where operations like equals() and hashCode() truly matter

In essence, you must avoid anemic domain models. DTOs, if present, should only exist at the fringes of your project, and not form the backbone of your whole design philosophy. They should ideally only be present when auto-generated by framework/tools like wsimport and you should never have to touch them.
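To make the periphery idea concrete, here is a minimal sketch of one way to read it. `CustomerDto` and `Customer` are hypothetical names invented for this illustration: the first stands in for a class that a tool like wsimport might generate, the second for the domain object that takes over from it.

```java
// Hypothetical DTO, standing in for a generated class at the edge of the system.
class CustomerDto {
    public String firstName;
    public String lastName;
    public String email;
}

// Hypothetical domain object: it takes what it needs from the DTO at the
// boundary and exposes behavior rather than raw fields.
final class Customer {
    private final String firstName;
    private final String lastName;
    private final String email;

    Customer(CustomerDto dto) {
        if (dto.email == null || !dto.email.contains("@")) {
            throw new IllegalArgumentException("customer needs a valid email");
        }
        // Copy the values so later changes to the DTO cannot affect this object.
        this.firstName = dto.firstName;
        this.lastName = dto.lastName;
        this.email = dto.email;
    }

    String displayName() {
        return firstName + " " + lastName;
    }

    // Equality is a domain decision here (assumed for the sketch):
    // two customers are the same if their emails match.
    @Override
    public boolean equals(Object other) {
        return other instanceof Customer && email.equals(((Customer) other).email);
    }

    @Override
    public int hashCode() {
        return email.hashCode();
    }
}
```

Only the code that talks to the remote API ever sees `CustomerDto`; everything else works with `Customer`, which is where equals() and hashCode() carry actual meaning.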

Boilerplate:

If you accept the previous argument, then the getter/setter “boilerplate” becomes a non-issue: tools will automatically create them only where you absolutely need them. With regard to the proposal, that leaves us with just the other Object methods of interest: equals() and hashCode(). Here’s what I think:

They should be taken out of Object and made into their own interfaces.

Not all APIs want or even need all their DTOs to implement equals(), much less hashCode().
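As a rough sketch of what that could look like: none of the types below exist in the JDK. `Equality`, `Hashable`, `isEqualTo`, `hash`, `Money` and `Equalities` are all names made up for this illustration, and the method names deliberately differ from `equals`/`hashCode` so they do not collide with what `Object` already declares.

```java
// Hypothetical: types opt in to value equality explicitly.
interface Equality<T> {
    boolean isEqualTo(T other);
}

// Hypothetical: hashing is a separate capability, only needed by types
// that want to live in hash-based collections.
interface Hashable {
    int hash();
}

// A value type that opts in to both capabilities.
final class Money implements Equality<Money>, Hashable {
    private final long cents;
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    @Override
    public boolean isEqualTo(Money other) {
        return other != null && cents == other.cents && currency.equals(other.currency);
    }

    @Override
    public int hash() {
        return 31 * Long.hashCode(cents) + currency.hashCode();
    }
}

final class Equalities {
    // An API can demand equality in its signature instead of assuming
    // that every object supports it.
    static <T extends Equality<T>> boolean contains(T[] items, T needle) {
        for (T item : items) {
            if (item.isEqualTo(needle)) {
                return true;
            }
        }
        return false;
    }
}
```

A DTO that never needs to be compared simply implements neither interface, and the compiler rejects any attempt to pass it to `Equalities.contains`.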

Switch statements:

This is cool, but why not switch on any object using Object#equals (or on my proposed Equality interface)? For performance reasons? They already reached a compromise when they allowed switching on strings. And as an SO user so eloquently put it (source):

...technical obstacles shouldn’t drive language design. If there is no way to compile it to efficient 
code, it might still get compiled to the equivalent of if … else … clauses, and still have a win on 
source code brevity.
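To illustrate the quote, here is a sketch of the if/else chain that a switch over arbitrary objects could be compiled down to. The switch form shown in the comment is hypothetical syntax, not valid Java, and `Point` and `SwitchSketch` are names made up for the example.

```java
// A small value type with a conventional equals()/hashCode() pair.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Point && x == ((Point) other).x && y == ((Point) other).y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

final class SwitchSketch {
    static final Point ORIGIN = new Point(0, 0);
    static final Point UNIT_X = new Point(1, 0);

    // Hypothetical source form (not valid Java):
    //
    //   switch (p) {
    //       case ORIGIN: return "origin";
    //       case UNIT_X: return "unit x";
    //       default:     return "somewhere else";
    //   }
    //
    // A compiler could lower it to the equals() chain below.
    static String describe(Point p) {
        if (ORIGIN.equals(p)) {
            return "origin";
        } else if (UNIT_X.equals(p)) {
            return "unit x";
        } else {
            return "somewhere else";
        }
    }
}
```

The chain is linear in the number of cases, so it would not be fast for large switches, but the source would still be shorter and the intent clearer than a hand-written cascade.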

The bottom line

By promoting DTOs to first-class citizenship, the proposal will embolden novice programmers who don’t know better to keep abusing the DTO pattern. The other interest group - the implementers of “interface APIs” like Hibernate or jax-ws-ri - will gain some marginal benefit by having their templates reduced by a few lines.

Overall, this proposal will promote bad design with only marginal benefits.

Why not focus on other, good patterns?
