The JTA API, which was introduced back in 2002, hasn’t changed much in nearly a decade. That could be regarded as a good thing for a specification often referred to as mature, but it actually isn’t. At least, that’s my opinion (and not that of any other person or entity, like my current and previous employers, customers and so on), which I’m going to explain in detail in this post.
But what is so wrong with the current specification that I would want to see it revised? Well, I’d be lying if I didn’t say everything. This post is my attempt at describing in more detail what I don’t like about it and what I would like to see changed, along with opinions and hints about how I believe it could be improved.
The API from a user perspective
Let’s start by having a look at the API itself from a user perspective, i.e. someone willing to use an implementation. It’s contained in the javax.transaction package, as the javax.transaction.xa one is targeted at implementors.
One of the important aspects of transaction management is error management. Since transactions must guarantee ACIDity no matter what happens, solid error handling is a must. This is the first aspect on which this specification fails: a closer look at the exceptions listed in the API shows that the ones in the javax.transaction package don’t have a common ancestor, which makes them a pain to manage from the application’s perspective. Look for instance at the UserTransaction.commit() signature: it throws no fewer than six exceptions, including four checked ones. This is at best intimidating, but more importantly it is very confusing and of little help. What is one supposed to do with a HeuristicMixedException? This one is actually thrown when the transaction manager was asked to commit the transaction but some resources (resources being your database or JMS server) followed the commit command while others decided to roll back instead. What is an application supposed to do with such an exception, except report as much detail as possible about it, for instance the list of resources that did not manage to commit, so that a human can take corrective action? Tough luck: this may or may not be encoded in the reported exception message, depending on the implementation’s good will.
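To make the pain concrete, here is a minimal sketch of what calling code ends up looking like. Since javax.transaction isn’t part of the plain JDK, the exception classes and the one-method UserTransaction below are stand-ins I declared myself to mirror the shape of the real commit() signature; the outcome labels are also my own invention.

```java
/** Stand-ins mirroring the shape of javax.transaction; not the real classes. */
final class CommitHandlingSketch {
    static class RollbackException extends Exception {}
    static class HeuristicMixedException extends Exception {}
    static class HeuristicRollbackException extends Exception {}
    static class SystemException extends Exception {}

    /** Stand-in for UserTransaction, reduced to commit(). */
    interface UserTransaction {
        void commit() throws RollbackException, HeuristicMixedException,
                HeuristicRollbackException, SystemException,
                SecurityException, IllegalStateException;
    }

    /** Attempts the commit and reports the outcome as a short label. */
    static String commitAndDescribe(UserTransaction tx) {
        try {
            tx.commit();
            return "committed";
        } catch (RollbackException | HeuristicRollbackException e) {
            return "rolled back";
        } catch (HeuristicMixedException e) {
            // Nothing in the exception type says which resources diverged.
            return "mixed outcome, needs human attention";
        } catch (SystemException e) {
            return "transaction manager failure";
        } catch (SecurityException | IllegalStateException e) {
            return "commit not allowed on this thread";
        }
    }
}
```

With no common ancestor there is no single catch clause that means "the transaction failed", so every call site has to repeat some variation of this ladder.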
UserTransaction, TransactionManager and Transaction
These three interfaces are a great source of confusion, and of poor design. Let’s first discuss the first two: UserTransaction and TransactionManager. A quick investigation shows that the former interface is an exact copy of the latter minus three methods: suspend, resume and getTransaction. The javadoc explains that UserTransaction is meant for application programmers while TransactionManager is for internal application server use only.
I don’t get why there is such a separation between applications and app servers. Why wouldn’t programmers writing business applications be authorized to suspend and resume transactions? I guess the original intent was with EJB only in mind, but even then, EJBs could be configured with Bean Managed Transactions maybe since 1.0, certainly since 2.0. Why weren’t application programmers allowed to use this API, which is required to implement the REQUIRES_NEW transaction demarcation by suspending the active transaction before starting a new one?
The TransactionManager interface is also confusing, even when put in its original application-server-only context. It contains begin, getTransaction, commit, rollback and setRollbackOnly methods, the latter three being redundant with the exact same methods of the Transaction interface. I know TransactionManager is meant to be implemented as a singleton storing the transaction context (represented by the Transaction interface) in a thread-local or equivalent storage. But why make the API so confusing? It would have been so much simpler to only have a begin method returning a Transaction object in TransactionManager and leave the commit, rollback and setRollbackOnly methods in the Transaction object. Why this redundancy for no benefit but added confusion? It’s not even clear whether an application server is allowed to call Transaction.commit() or not, and some implementors and users disagree on what should happen in that case. The spec is silent on the subject.
Even if that separation made sense, I also wonder why the interfaces designed for application programmers and for app server programmers ended up in the same package. Why was no spi or equivalent package created for the sake of a clean separation of concerns?
As for how an application is supposed to get hold of an implementation of these interfaces, that’s easy: the spec defines nothing. Yes, the UserTransaction implementation is supposed to be available under the java:comp/UserTransaction JNDI name, but that’s defined in another specification. I know JTA was meant for application servers only, but how confusing is it to specify interfaces but not how to look up their implementations? Even application servers’ implementors must have been confused by this lack of coherence.
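For what it’s worth, the simpler shape I have in mind could be sketched like this. Every type below (SimpleTransaction, SimpleTransactionManager, RecordingTransactionManager) is hypothetical and exists nowhere in JTA; the toy implementation only records events so the example can run on its own, and how it treats a rollback-only commit is my simplification.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical redesign: the transaction itself owns its lifecycle methods. */
interface SimpleTransaction {
    void commit();
    void rollback();
    void setRollbackOnly();
}

/** Hypothetical manager whose only job is to start transactions. */
interface SimpleTransactionManager {
    SimpleTransaction begin();
}

/** Toy in-memory manager that only records what happened. */
final class RecordingTransactionManager implements SimpleTransactionManager {
    final List<String> events = new ArrayList<>();

    @Override
    public SimpleTransaction begin() {
        events.add("begin");
        return new SimpleTransaction() {
            private boolean rollbackOnly;

            @Override public void commit() {
                // A transaction marked rollback-only is rolled back instead.
                events.add(rollbackOnly ? "rollback" : "commit");
            }
            @Override public void rollback() { events.add("rollback"); }
            @Override public void setRollbackOnly() { rollbackOnly = true; }
        };
    }
}
```

With this shape there is exactly one place to call commit, so the question of who is allowed to call it on which object never comes up.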
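For reference, the lookup that other specifications define goes through plain JNDI. The sketch below is only an illustration: the UserTransactionLookup class is my own helper, and it returns the bound object as a plain Object because the javax.transaction types aren’t available on a bare JDK. Outside an application server there is usually no naming provider at all, in which case the helper simply returns an empty Optional.

```java
import java.util.Optional;
import javax.naming.InitialContext;
import javax.naming.NamingException;

final class UserTransactionLookup {
    /** JNDI name mentioned above; it is defined outside the JTA spec. */
    static final String JNDI_NAME = "java:comp/UserTransaction";

    /**
     * Returns whatever is bound under JNDI_NAME, or empty when no naming
     * provider is configured or nothing is bound (e.g. outside a container).
     */
    static Optional<Object> lookup() {
        try {
            InitialContext ctx = new InitialContext();
            try {
                return Optional.ofNullable(ctx.lookup(JNDI_NAME));
            } finally {
                ctx.close();
            }
        } catch (NamingException e) {
            return Optional.empty();
        }
    }
}
```

Inside a container the caller would then cast the result to UserTransaction, which is exactly the kind of knowledge the JTA spec itself never gives you.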
The API from a transaction manager implementor’s perspective
The implementor’s job isn’t any better, quite the contrary. The javax.transaction.xa package’s single declared exception, XAException, is a joke to any programmer decently versed in OO programming. It is the only exception used to report problems between an XA resource and a transaction manager, and it can mean absolutely everything: when this exception is thrown, the reason must be encoded as an integer, and the code which is going to catch it must use a switch block (or if/else statements) to figure out the exception’s meaning and take appropriate action. I’m not going to go into much more detail as to why this is a disastrous design in any OO language, but to add insult to injury, this exception has three constructors, only one of which accepts a value for that error code. During construction you basically have the choice between setting the error code or setting a human-readable error message, and if you decide to go with the message variant, the error code’s default value is 0 (zero), which is an invalid error code. How convenient is that? Some implementors may very well decide to leave the error message out, which drastically complicates debugging even simple problems. And some did.
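Here is a small sketch of both problems, using the real javax.transaction.xa.XAException that ships with Java SE. The XaErrors helper class and its two methods are my own invention, and the labels cover only a handful of the many error codes. The workaround for the constructor problem relies on errorCode being a public field.

```java
import javax.transaction.xa.XAException;

final class XaErrors {
    /** Builds an XAException carrying both an error code and a message. */
    static XAException withCodeAndMessage(int errorCode, String message) {
        XAException e = new XAException(message); // errorCode is 0 at this point
        e.errorCode = errorCode;                  // public field, set by hand
        return e;
    }

    /** Maps a few of the many error codes to a readable label. */
    static String describe(XAException e) {
        switch (e.errorCode) {
            case XAException.XA_HEURMIX:    return "heuristic mixed outcome";
            case XAException.XA_RBROLLBACK: return "rolled back by the resource";
            case XAException.XAER_NOTA:     return "unknown Xid";
            case XAException.XAER_RMFAIL:   return "resource manager unavailable";
            default: return "unhandled XA error code " + e.errorCode;
        }
    }
}
```

Every transaction manager ends up carrying some version of that switch, and every careful resource implementor some version of that two-step construction.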
Enlistment and XAResource
This undoubtedly is one of the ugliest parts of the specification. XAResource is an interface for which resources (again: databases, JMS servers…) supporting the XA protocol must provide an implementation in a completely undefined way. Actually, this is defined, but in other specifications: the JDBC one describes how this works for JDBC drivers and the JMS one for JMS servers. But there is not a single line in the JTA spec about how to get access to an XAResource implementation, and that complicates the big picture again. I’ll come back to that later.
The XAResource interface is a near-direct translation to Java of the X/Open XA specification, which was meant for the C language. Things that made sense in C look clunky in Java as, once again, OO design was not considered. That’s why we have methods like start(Xid xid, int flags) accepting a flags argument, or prepare(Xid xid) returning a flag which can be one of the many constants defined in the XAResource interface. You have to constantly refer to the javadoc to figure out which ones are legal for a particular method call and which ones aren’t. Even for someone with experience of the subject it’s near impossible to remember all the valid combinations, and you have to constantly go back and forth between the doc and your IDE. That’s not exactly what I call a well designed and self-documenting API. If the reason was to ease the implementation of XAResource by making the API as close as possible to the XA one, I’m sorry but that’s no good excuse: a look-alike but OO-designed API would have fit the bill more appropriately, and it’s not exactly rocket science to encapsulate lower-level API calls. For instance, start(xid, TMNOFLAGS) could have been start(xid), start(xid, TMJOIN) could have been join(xid) and start(xid, TMRESUME) could have been resume(xid). That would have made XAResource a lot less painful to work with. The same is true for other methods like end (same as start), prepare (why not a boolean return value?), commit (why a onePhase boolean instead of two methods, commit and commitOnePhase?) and recover (why not just return an array with all Xids directly?).
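To show how little it takes, here is a rough sketch of such a facade over the real javax.transaction.xa.XAResource. The FriendlyXAResource class and its method names are hypothetical, the constants it uses are the actual ones (XAResource.TMNOFLAGS, TMJOIN, TMRESUME, TMSTARTRSCAN, TMENDRSCAN), and asking for the whole recovery scan in a single call is just one simple way of doing it.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical facade; the method names are illustrative, not part of JTA. */
final class FriendlyXAResource {
    private final XAResource delegate;

    FriendlyXAResource(XAResource delegate) { this.delegate = delegate; }

    void start(Xid xid) throws XAException  { delegate.start(xid, XAResource.TMNOFLAGS); }
    void join(Xid xid) throws XAException   { delegate.start(xid, XAResource.TMJOIN); }
    void resume(Xid xid) throws XAException { delegate.start(xid, XAResource.TMRESUME); }

    /** Returns true when there is work to commit, false for a read-only branch. */
    boolean prepare(Xid xid) throws XAException {
        return delegate.prepare(xid) == XAResource.XA_OK;
    }

    void commit(Xid xid) throws XAException         { delegate.commit(xid, false); }
    void commitOnePhase(Xid xid) throws XAException { delegate.commit(xid, true); }

    /** Returns every in-doubt Xid in one call; never null. */
    Xid[] recoverAll() throws XAException {
        Xid[] xids = delegate.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        return xids == null ? new Xid[0] : xids;
    }
}
```

Nothing here changes what reaches the resource; it only moves the flag bookkeeping out of the caller’s head and into method names.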
Recovery and XAResource
Recovery is fundamental to all transactional systems. It’s a process that kicks in to clean up and finish incomplete work after a problem occurred, for instance after a crash or a power outage. How this is supposed to work is completely up to your imagination if you stick to the JTA spec: you have to read and fully understand the XA spec to understand how the mechanism works and should be implemented. And even then you’re left without answers regarding some important pieces of the puzzle.
The recovery logic is about calling XAResource.recover() to get a list of Xids (basically transaction IDs) which are in need of cleanup, reconciling that with the transaction manager’s activity logs (a.k.a. transaction logs) and figuring out how to clean them up, either by rolling back or by committing. But how do you get your hands on the XAResources which participated in the transaction? Those aren’t persistent and there is no documented mechanism for reconstructing or retrieving them; the spec merely explains that a JNDI lookup of some kind of XAResource factory could be used. That’s kind of a big gap, even if keeping track of those XAResource factories is, once again, supposedly the job of the application server: there is no documented way of matching an XAResource back to its factory (so the transaction manager can record which resources are participating in a transaction), nor even of getting an XAResource back from its factory (to call recover on it).
Then when is that supposed to run? During startup, after figuring out that the process running the transaction manager crashed? That’s without counting resource crashes, network outages and so on. It’s an extremely delicate process which, when implemented wrong, totally jeopardizes your data integrity while you’re paying the high two-phase commit price to keep it safe, and the spec remains silent about good practices, common caveats and years of experience on the subject. I hope reading this won’t prevent you from sleeping at night, especially since there is no reference implementation nor any JTA TCK which could help validate implementations. There are some tests in the J2EE TCK (so I’ve heard) but nothing which checks that data integrity can be guaranteed after a crash, not even an attempt at that.
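To give an idea of the reconciliation step, here is a deliberately naive sketch of one recovery pass over a single resource, assuming you somehow already hold its XAResource. The RecoverySketch class is mine, and the loggedAsCommitted predicate is a hypothetical stand-in for the transaction log lookup: it answers whether the manager recorded a commit decision for a given Xid. Anything without such a record is rolled back, which is the presumed abort rule from the XA world.

```java
import java.util.function.Predicate;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

final class RecoverySketch {
    /**
     * Runs one recovery pass against a single resource and returns the
     * number of in-doubt branches that were resolved.
     */
    static int recover(XAResource resource, Predicate<Xid> loggedAsCommitted)
            throws XAException {
        Xid[] inDoubt = resource.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        if (inDoubt == null) {
            return 0;
        }
        for (Xid xid : inDoubt) {
            if (loggedAsCommitted.test(xid)) {
                resource.commit(xid, false);  // finish the second phase
            } else {
                resource.rollback(xid);       // no commit record: presumed abort
            }
        }
        return inDoubt.length;
    }
}
```

A real implementation would at the very least have to skip Xids created by other transaction managers before rolling anything back, and to cope with XAExceptions thrown halfway through the loop; none of that is handled here.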
The API from a XAResource implementor’s perspective
Not only is the XAResource API bad, but it also assumes or asserts unrealistic requirements from the underlying resource. The spec also is silent on the subject and you have to refer to other ones to get resource-specific details.
If you want to implement an XAResource for a JDBC driver or a JMS client you can refer to those specs. What if you want to implement an XAResource for another kind of resource? That’s when the JCA (a.k.a. J2CA or Connector Architecture) comes into play. I’m not going to debate the merits or the failures of JCA, except for the fact that this specification is meant for application servers only. Apparently there’s no life outside application servers according to that spec, which was the common thinking back in 2003, but nowadays this is completely backwards and calls for a refresh.
Have you ever heard of the concept of transaction interleaving? That’s the capacity to execute two transactions in parallel using a single handle. With databases, that means being able to execute two different transactions on a single connection while keeping them properly isolated. This feature is extremely uncommon out there, as nearly all the most used databases don’t support that at their core: they’re only capable of running a single transaction per connection, which is perfectly fine 99.999% of the time as you’re free to get a second connection in case you really need to run two transactions in parallel. I’m so sorry to report that the JTA spec mandates XAResource candidates to fully support transaction interleaving (paragraph 3.4.6 of the JTA 1.0.1B spec for the curious ones) and that most XAResource implementations out there will simply fail when two transactions are interleaved on the same connection, with results ranging from reporting an error to screwing up your data integrity. It’s entirely possible to implement a 100% spec-compliant transaction manager which doesn’t work in most scenarios.
A transaction is bound to a thread, and all transactional resources accessed by that thread after a transaction has begun are supposed to be automatically enlisted and delisted by an unspecified mechanism. Yes, automatically enlisted and delisted, as enlistment and delistment can only be performed from the Transaction object, which is only accessible from the TransactionManager object, which an application developer isn’t supposed to have access to. This works with very convoluted designs in JDBC, JMS and JCA (check the XAResource chapter of these specs as well as some J2EE ones I forgot about to get a grasp of what it takes) but is horribly cumbersome. Did you know you have to use a special JTA-aware connection pool for your JDBC connection to be able to participate in a JTA transaction? Well, if you don’t use one then your JDBC work will just silently be auto-committed. Once again this relies on the fact that the app server is supposed to provide us with that. Okay, we can live with that, even if it cries out for a rework once again and even if some resources would love to be able to participate in JTA transactions without the burden of implementing a JCA connector: the Berkeley DB Java Edition and the Java Content Repositories come to mind. This loose mechanism turns from an extra burden into an extreme challenge for resources which aren’t connection oriented, like transactional caches, which need to jump through hoops to figure out when enlistment and delistment should happen as they don’t have an equivalent of the get connection / close connection cycle.
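Seen from the transaction manager’s side, interleaving boils down to a call sequence along these lines. This is my own illustrative reading of the scenario rather than a quote from the spec: the InterleavingSketch class is hypothetical, the flags and the one-phase commits are assumptions chosen for brevity, and only the XAResource and Xid types are the real Java SE ones.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

final class InterleavingSketch {
    /**
     * Drives two transaction branches through one XAResource, i.e. through
     * one underlying connection. The second branch starts before the first
     * one has been committed, which is what interleaving means here.
     */
    static void interleave(XAResource resource, Xid first, Xid second)
            throws XAException {
        resource.start(first, XAResource.TMNOFLAGS);
        // ... work done on behalf of the first transaction ...
        resource.end(first, XAResource.TMSUCCESS);

        resource.start(second, XAResource.TMNOFLAGS);
        // ... work for the second transaction, first one still uncommitted ...
        resource.end(second, XAResource.TMSUCCESS);

        // Each branch is then completed independently (one-phase for brevity).
        resource.commit(second, true);
        resource.commit(first, true);
    }
}
```

A resource that can only track one transaction per connection has no good answer to the second start call, and that is where the errors or the silent data corruption come from.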
The rest of the spec
After this JTA API review, there isn’t much left to say about the spec. Let’s quickly summarize it with the fact that the JTA spec relies on a perfect understanding of the X/Open XA spec by the reader. The presumed abort concept, critical to the implementation of the recovery process, isn’t even mentioned once. It’s horribly frustrating to constantly have to refer to both the JTA and XA specs to assert that a transaction manager implementation may be doing something wrong, with the extra mental exercise of translating the C concepts of the XA spec into Java.
I don’t want to blame anyone for the sad state of the JTA specification. It’s bad for whatever reason, maybe a lack of time as it needed to be part of the early J2EE spec, certainly because of a lack of love in the past 8 or so years.
The fact is that this specification definitely needs to be refreshed and deserves more love than it has ever had: transactions are too important a subject, one that will stick with us for the coming decades, and they will probably need a strong basis for the challenges they may face in the future, like the NoSQL move or the CAP theorem.
Please forgive any inaccuracy / mistake / pure lie you may find in this text and feel free to comment and publicly ridicule me if you find it necessary.