Why we need JTA 2.0

The JTA API, which was introduced back in 2002, hasn’t changed much in nearly a decade. What could be regarded as a good thing for a specification often referred to as mature actually isn’t. At least, that’s my opinion (and not that of any other person or entity, like my current and previous employers, customers and so on), which I’m going to explain in detail in this post.

But what is so wrong with the current specification that I would want to see it revised? Well, I’d be lying if I didn’t say everything. This post is my attempt at describing in more detail what I don’t like about it and what I would like to see changed, and at giving opinions and hints about how I believe it could be improved.

The API from a user perspective

Let’s start by having a look at the API itself from a user perspective, i.e. that of someone willing to use an implementation. It’s contained in the javax.transaction package, as the javax.transaction.xa one is targeted at implementors.

Exceptions

One of the important aspects of transaction management is error management. Since transactions must guarantee ACIDity no matter what happens, solid error handling is a must. This is the first aspect on which this specification fails: a closer look at the exceptions listed in the API shows that the ones in the javax.transaction package don’t have a common ancestor beyond java.lang.Exception itself, which makes them a pain to manage from the application’s perspective. Look for instance at the UserTransaction.commit() signature: it throws no less than six exceptions, including four checked ones. This is at best intimidating, but more importantly it is very confusing and of little help. What is one supposed to do with a HeuristicMixedException? This one is actually thrown when the transaction manager was asked to commit the transaction but some resources (resources being your database or JMS server) followed the commit command while others decided to roll back instead. What is an application supposed to do with such an exception, except report as much detail as possible about it, for instance the list of resources that did not manage to commit, so that a human can take corrective action? Tough luck: this may or may not be encoded in the reported exception message, depending on the implementation’s good will.
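
To make the complaint concrete, here is a minimal sketch of what a common ancestor could buy the application. All three class names (TransactionException, HeuristicMixedOutcomeException, HeuristicReportDemo) are hypothetical and exist nowhere in JTA; the sketch only assumes that the transaction manager knows which resources ended up on which side of a mixed outcome.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical common ancestor; nothing like it exists in javax.transaction.
class TransactionException extends Exception {
    TransactionException(String message) {
        super(message);
    }
}

// Hypothetical heuristic outcome that names the resources on each side.
class HeuristicMixedOutcomeException extends TransactionException {
    private final List<String> committed;
    private final List<String> rolledBack;

    HeuristicMixedOutcomeException(List<String> committed, List<String> rolledBack) {
        super("mixed outcome: committed=" + committed + ", rolled back=" + rolledBack);
        this.committed = Collections.unmodifiableList(new ArrayList<>(committed));
        this.rolledBack = Collections.unmodifiableList(new ArrayList<>(rolledBack));
    }

    List<String> committedResources() {
        return committed;
    }

    List<String> rolledBackResources() {
        return rolledBack;
    }
}

class HeuristicReportDemo {
    // A single parameter type covers every transaction failure, and the
    // heuristic case carries enough detail for a human to act on.
    static String report(TransactionException e) {
        if (e instanceof HeuristicMixedOutcomeException) {
            HeuristicMixedOutcomeException h = (HeuristicMixedOutcomeException) e;
            return "manual fix needed for " + h.rolledBackResources();
        }
        return "transaction failed: " + e.getMessage();
    }
}
```

With something along these lines, application code could catch one type and still tell an operator exactly which resources need attention, instead of hoping the detail made it into the message.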

UserTransaction, TransactionManager and Transaction

These three interfaces are a great source of confusion, and of poor design. Let’s first discuss the first two: UserTransaction and TransactionManager. A quick investigation shows that the former interface is an exact copy of the latter minus three methods: suspend, resume and getTransaction. The javadoc explains that UserTransaction is meant for application programmers while TransactionManager is for internal application server use only.

I don’t get why there is such a separation between applications and app servers. Why wouldn’t programmers writing business applications be authorized to suspend and resume transactions? I guess the original intent was with EJB only in mind, but even then, EJBs can be configured with Bean Managed Transactions, maybe since 1.0, certainly since 2.0. Why weren’t application programmers allowed to use this API, which is required to implement the REQUIRES_NEW transaction demarcation by suspending the active transaction before starting a new one?

The TransactionManager interface is also confusing, even when put in its original application-server-only context. It contains begin, getTransaction, commit, rollback and setRollbackOnly methods, the latter three being redundant with the exact same methods of the Transaction interface. I know TransactionManager is meant to be implemented as a singleton storing the transaction context (represented by the Transaction interface) in a thread-local or equivalent storage. But why make the API so confusing? It would have been so much simpler to only have a begin method returning a Transaction object in TransactionManager and leave the commit, rollback and setRollbackOnly methods in the Transaction object. Why this redundancy for no benefit but added confusion? It’s not even clear whether an application server is allowed to call Transaction.commit() or not, and some implementors and users disagree on what should happen in that case. The spec is silent on the subject.
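
As a rough illustration of that simpler shape, here is a minimal sketch in which the manager only starts transactions and the returned object owns their outcome. SimpleTransactionManager, SimpleTransaction and the toy InMemoryTransactionManager are hypothetical names made up for this example; they are not part of JTA, and the in-memory implementation only exists to make the sketch executable.

```java
// Hypothetical simplified API: the manager only starts transactions,
// and the returned object is the only place where they can be ended.
interface SimpleTransaction {
    void commit();
    void rollback();
    void setRollbackOnly();
    String status();
}

interface SimpleTransactionManager {
    SimpleTransaction begin();
}

// Toy in-memory implementation with no resources, logging or thread binding.
class InMemoryTransactionManager implements SimpleTransactionManager {
    @Override
    public SimpleTransaction begin() {
        return new SimpleTransaction() {
            private String status = "ACTIVE";
            private boolean rollbackOnly;

            @Override
            public void commit() {
                requireActive();
                // A transaction marked rollback-only can never commit;
                // this toy version rolls back silently instead of throwing.
                status = rollbackOnly ? "ROLLED_BACK" : "COMMITTED";
            }

            @Override
            public void rollback() {
                requireActive();
                status = "ROLLED_BACK";
            }

            @Override
            public void setRollbackOnly() {
                requireActive();
                rollbackOnly = true;
            }

            @Override
            public String status() {
                return status;
            }

            private void requireActive() {
                if (!status.equals("ACTIVE")) {
                    throw new IllegalStateException("transaction is " + status);
                }
            }
        };
    }
}
```

There is only one way to end a transaction here, so the question of who may call commit() on what simply doesn’t arise.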

Even if that separation made sense, I also wonder why the interfaces designed for application programmers and for app server programmers ended up in the same package. Why was no spi or equivalent package created for the sake of a clean separation of concerns?

Lookup

That’s easy: there is nothing. Yes, the UserTransaction implementation is supposed to be available under the java:comp/UserTransaction JNDI name, but that’s defined in another specification. I know JTA was meant for application servers only, but how confusing is it to specify interfaces but not how to look up their implementations? Even application server implementors must have been confused by this lack of coherence.

The API from a transaction manager implementor’s perspective

The javax.transaction.xa package is targeted at implementors but as we’ve seen earlier, the javax.transaction package should be taken into consideration as well.

Exceptions again

The implementor’s job isn’t any better, quite the contrary. The javax.transaction.xa package’s single declared exception, XAException, is a joke to any programmer decently versed in OO programming: this is the only exception used to report problems between an XA resource and a transaction manager, and it can mean absolutely anything. When this exception is thrown, the reason must be encoded as an integer, and the code that catches it must use a switch block (or if/else statements) to figure out the exception’s meaning and take appropriate action. I’m not going to go into much more detail as to why this is a disastrous design in any OO language but, to add insult to injury, this exception has three constructors, only one of which accepts a value for that error code. You basically have the choice during construction between setting the error code and setting a human-readable error message, and if you decide to go with the message variant, the error code’s default value is 0 (zero), which is an invalid error code. How convenient is that? Some implementors may very well decide to leave the error message out, which drastically complicates debugging even simple problems. And some did.
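
The following minimal sketch shows what that looks like on the receiving side. It uses the real javax.transaction.xa.XAException class and its public errorCode field; the XaErrorCodeDemo class, its describe method and the returned strings are made up for illustration, and only three of the many error codes are handled.

```java
import javax.transaction.xa.XAException;

class XaErrorCodeDemo {
    // The catching code has to decode the integer by hand.
    static String describe(XAException e) {
        switch (e.errorCode) {
            case XAException.XA_RBROLLBACK:
                return "resource rolled back the branch";
            case XAException.XAER_NOTA:
                return "unknown Xid";
            case XAException.XAER_RMFAIL:
                return "resource manager unavailable";
            case 0:
                // What you get from the no-arg and message-only constructors.
                return "no error code set: " + e.getMessage();
            default:
                return "XA error " + e.errorCode;
        }
    }
}
```

Calling describe(new XAException("connection lost")) ends up in the zero branch, while new XAException(XAException.XAER_RMFAIL) carries a code but a null message. Since errorCode is a public field, one workaround is to build the exception with a message and assign the field afterwards.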

Enlistment and XAResource

This is undoubtedly one of the ugliest parts of the specification. XAResource is an interface for which resources (again: databases, JMS servers…) supporting the XA protocol must provide an implementation in a completely undefined way. Actually, this is defined, but in other specifications: the JDBC one describes how this works for JDBC drivers and the JMS one for JMS servers. But there is not a single line in the JTA spec about how to get access to an XAResource implementation, and that complicates the big picture again. I’ll come back to that later.

Enlistment is the act of telling a transaction (via the Transaction interface) that a resource is going to participate in it. This is done by calling the Transaction.enlistResource(XAResource xar) method, which returns false when something went wrong with the enlistment. Figuring out what went wrong when this happens is left as an exercise for the API user, as there is no way to get more details about the problem, like an exception could provide… when properly designed. Under the hood, this should eventually make the transaction manager trigger an XAResource.start() call to tell the resource that any further work is going to be part of a global transaction.

The XAResource interface is a near-direct translation to Java of the X/Open XA specification, which was meant for the C language. Things that made sense in C look clunky in Java as, once again, OO design was not considered. That’s why we have methods like start(Xid xid, int flags) accepting a flags argument, or prepare(Xid xid) returning a flag which can be one of the many constants defined in the XAResource interface. You have to constantly refer to the javadoc to tell which ones are legal for a particular method call and which ones aren’t. Even for someone with experience of the subject it’s near impossible to remember all the valid combinations, and you have to constantly go back and forth between the doc and your IDE. That’s not exactly what I call a well-designed and self-documenting API. If the reason was to ease the implementation of XAResource by making the API as close as possible to the XA one, I’m sorry but that’s no good excuse: a look-alike but OO-designed API would have fit the bill more appropriately, and it’s not exactly rocket science to encapsulate lower-level API calls. For instance, start(xid, TMNOFLAGS) could have been start(xid), start(xid, TMJOIN) could have been join(xid) and start(xid, TMRESUME) could have been resume(xid). That would have made XAResource a lot less painful to work with. The same is true for other methods like end (same as start), prepare (why not a boolean return value?), commit (why a onePhase boolean instead of two methods, commit and commitOnePhase?) and recover (why not just return an array with all Xids directly?).
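
The encapsulation really is small. Here is a minimal sketch of such a facade over the real javax.transaction.xa.XAResource interface; FriendlyXAResource and its method names are hypothetical, and the boolean returned by prepare is one possible reading of the idea (true when the resource voted XA_OK, false when it voted XA_RDONLY).

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Hypothetical facade: one intention-revealing method per legal flag value.
class FriendlyXAResource {
    private final XAResource delegate;

    FriendlyXAResource(XAResource delegate) {
        this.delegate = delegate;
    }

    // Starts work on a new transaction branch.
    void start(Xid xid) throws XAException {
        delegate.start(xid, XAResource.TMNOFLAGS);
    }

    // Joins a branch already known to the resource manager.
    void join(Xid xid) throws XAException {
        delegate.start(xid, XAResource.TMJOIN);
    }

    // Resumes a previously suspended branch.
    void resume(Xid xid) throws XAException {
        delegate.start(xid, XAResource.TMRESUME);
    }

    // True when the branch must be committed in phase two,
    // false when the resource reported it as read-only.
    boolean prepare(Xid xid) throws XAException {
        return delegate.prepare(xid) == XAResource.XA_OK;
    }

    // Second phase of a two-phase commit.
    void commit(Xid xid) throws XAException {
        delegate.commit(xid, false);
    }

    // Single-phase commit, skipping prepare.
    void commitOnePhase(Xid xid) throws XAException {
        delegate.commit(xid, true);
    }
}
```

A transaction manager written against the facade would never have to look up which flag is legal for which call; the compiler does that job.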

Recovery and XAResource

Recovery is fundamental to all transactional systems: it’s a process that kicks in to clean up and finish incomplete work after a problem occurred, for instance after a crash or a power outage. How this is supposed to work is completely up to your imagination if you stick to the JTA spec; you have to read and fully understand the XA spec to understand how the mechanism works and should be implemented. And even then you’re left without answers regarding some important pieces of the puzzle.

The recovery logic is about calling XAResource.recover() to get a list of Xids (basically transaction IDs) which are in need of cleanup, reconciling that with the transaction manager’s activity logs (a.k.a. transaction logs) and figuring out how to clean them up, either by rolling back or by committing. But how do you get your hands on the XAResources which participated in the transaction? Those aren’t persistent and there is no documented mechanism to reconstruct or retrieve them; the spec merely explains that a JNDI lookup of some kind of XAResource factory could be used. That’s kind of a big gap, even if keeping track of those XAResource factories supposedly is, once again, the job of the application server: there is no documented way of matching an XAResource back to its factory (so the transaction manager can record what resources are participating in a transaction), nor even of getting an XAResource back from its factory (to call recover on it).
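
For what it’s worth, the core of that reconciliation can be sketched in a few lines once you do have an XAResource in hand. The sketch below is deliberately naive and rests on several assumptions: the transaction log is reduced to a set of keys for transactions recorded as committed, every Xid returned by the resource is assumed to belong to this transaction manager, and anything without a log record is rolled back (presumed abort). RecoverySketch and its key method are hypothetical names; the XAResource calls and flags are the real ones.

```java
import java.util.Arrays;
import java.util.Set;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

class RecoverySketch {
    // Key used to match an in-doubt Xid against the transaction log
    // (assumption: the global transaction id alone identifies a transaction).
    static String key(Xid xid) {
        return Arrays.toString(xid.getGlobalTransactionId());
    }

    // Resolves every in-doubt branch reported by the resource and returns
    // how many of them were committed. committedKeys holds the keys of the
    // transactions the log recorded as committed.
    static int recover(XAResource resource, Set<String> committedKeys) throws XAException {
        // Ask the resource for its complete list in a single scan.
        Xid[] inDoubt = resource.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        int committed = 0;
        for (Xid xid : inDoubt) {
            if (committedKeys.contains(key(xid))) {
                // The commit decision was logged: finish phase two.
                resource.commit(xid, false);
                committed++;
            } else {
                // No log record: presumed abort.
                resource.rollback(xid);
            }
        }
        return committed;
    }
}
```

A real implementation also has to filter out Xids created by other transaction managers, cope with resources that are unreachable at recovery time and decide when to run at all, which is exactly the part nobody documents.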

Then when is that supposed to run? During startup, after figuring out that the process running the transaction manager crashed? That’s without counting on resource crashes, network outages and so on. It’s an extremely delicate process that, when implemented wrong, totally jeopardizes your data integrity while you’re paying the high two-phase commit price to keep it safe, and the spec remains silent about good practices, common caveats and years of experience on the subject. I hope reading this won’t prevent you from sleeping at night, especially since there is no reference implementation nor any JTA TCK which could help validate implementations. There are some tests in the J2EE TCK (so I’ve heard) but nothing which checks that data integrity can be guaranteed after a crash, not even an attempt at that.

The API from an XAResource implementor’s perspective

Not only is the XAResource API bad, but it also imposes unrealistic requirements on the underlying resource. The spec is also silent on resource-specific details, and you have to refer to other specs to get them.

Resource-specific details

If you want to implement an XAResource for a JDBC driver or a JMS client you can refer to those specs. What if you want to implement an XAResource for another kind of resource? That’s when the JCA (a.k.a. J2CA or Connector Architecture) comes into play. I’m not going to debate the merits or the failures of JCA, except for the fact that this specification is meant for application servers only. Apparently there’s no life outside application servers according to that spec, which was the common thinking back in 2003, but nowadays this is completely backwards and calls for a refresh.

Resource requirements

Have you ever heard of the concept of transaction interleaving? That’s the capacity to execute two transactions in parallel using a single handle. With databases, that means being able to execute two different transactions on a single connection while keeping them properly isolated. This feature is extremely uncommon out there, as nearly all of the most used databases don’t support it at their core: they’re only capable of running a single transaction per connection, which is perfectly fine 99.999% of the time as you’re free to get a second connection in case you really need to run two transactions in parallel. I’m so sorry to report that the JTA spec mandates that XAResource candidates fully support transaction interleaving (paragraph 3.4.6 of the JTA 1.0.1B spec for the curious ones) and that most XAResource implementations out there will simply fail, with results ranging from reporting an error to screwing up your data integrity, when two transactions are interleaved on the same connection. It’s entirely possible to implement a 100% spec-compliant transaction manager which doesn’t work in most scenarios.

Loose assumptions

A transaction is bound to a thread, and all transactional resources accessed by that thread after a transaction has begun are supposed to be automatically enlisted and delisted by an unspecified mechanism. Yes, automatically enlisted and delisted, as enlistment and delistment can only be performed from the Transaction object, which is only accessible from the TransactionManager object, which an application developer isn’t supposed to have access to. This works with very convoluted designs in JDBC, JMS and JCA (check the XAResource chapter of these specs as well as some J2EE ones I forgot about to get a grasp of what it takes) but is horribly cumbersome. Did you know you have to use a special JTA-aware connection pool for your JDBC connection to be able to participate in a JTA transaction? Well, if you don’t use one then your JDBC work will just silently be auto-committed. Once again this relies on the fact that the app server is supposed to provide us with that. Okay, we can live with that, even if it cries out for a rework once again and even if some resources would love to be able to participate in JTA transactions without the burden of implementing a JCA connector: the Berkeley DB Java Edition and the Java Content Repositories come to mind. This loose mechanism turns from an extra burden into an extreme challenge for resources which aren’t connection-oriented, like transactional caches, which need to jump through hoops to figure out when enlistment and delistment should happen as they don’t have an equivalent of the get connection / close connection cycle.

The rest of the spec

After this JTA API review, there isn’t much left to say about the spec. Let’s quickly summarize it with the fact that the JTA spec relies on a perfect understanding of the X/Open XA spec by the reader. The presumed abort concept, critical to the implementation of the recovery process, isn’t even mentioned once. It’s horribly frustrating to constantly have to refer to both the JTA and XA specs to check whether a transaction manager implementation may be doing something wrong, with the extra mental exercise of translating the C concepts of the XA spec into Java.

Conclusion

I don’t want to blame anyone for the sad state of the JTA specification. It’s bad for whatever reason, maybe a lack of time as it needed to be part of the early J2EE spec, certainly because of a lack of love in the past 8 or so years.

The fact is that this specification definitely needs to be refreshed and deserves more love than it ever had, as transactions are too important a subject: they will stick with us for the coming decades and will probably need a strong basis for the challenges they may face in the future, like the NoSQL movement or the CAP theorem.

Please forgive any inaccuracy / mistake / pure lie you may find in this text and feel free to comment and publicly ridicule me if you find it necessary.


8 Responses to Why we need JTA 2.0

  1. Reza Rahman says:

    Ludo,

    As I mentioned here: http://agoncal.wordpress.com/2011/02/11/java-ee-7-i-have-a-few-dreams/, I am happy to bring this to the attention of the Java EE 7 EG (although to be clear I am not entirely convinced myself that these issues warrant a new JTA revision since it is such a low-level API these days).

    One thing that might be very helpful is if someone from outside Sun/Oracle led the specification, including creating the reference implementation (just as the CDI specification was created by JBoss/Red Hat, not Oracle). Is this something you would be interested in?

    Thanks,
    Reza

  2. Pingback: Tweets that mention Why we need JTA 2.0 | Ludo’s blog -- Topsy.com

  3. Ludovic says:

    Reza,

    JTA isn’t supposed to be a low-level API, quite the contrary. It’s the failure of the spec at defining a usable API and a clear contract for implementors which made it the perfect candidate for burial down in the deepest layers of modern applications. Some people got that (Christian Bauer of Seam for instance: https://issues.jboss.org/browse/JBSEAM-1144) but most don’t. I can’t blame anyone’s will to avoid such a horrible beast at all cost.

    I’d be more than happy and honored to take the lead of the JTA 2.0 spec, if it ever came into existence. Unfortunately I couldn’t afford to do that.

    Cheers,
    Ludovic

  4. Reza Rahman says:

    Ludo,

    That’s unfortunate, but let’s give it a spin anyways.

    One thing that would be good to do in an updated JTA spec is to clearly outline what to do with non-XA resources (e.g. the last resource gambit) and what happens when two-phase commit really isn’t required (e.g. the very common case when there is only one resource in the tx and local transactions can be used). This would clear up a lot of the misconceptions that the community has about Java EE transaction management. There is some verbiage around this in the EJB spec, but it is not nearly enough. For example, the EJB spec says:

    ============================================================
    Many applications will consist of one or several enterprise beans that all use a single resource manager (typically a relational database management system). The EJB container can make use of resource manager local transactions as an optimization technique for enterprise beans for which distributed transactions are not needed. A resource manager local transaction does not involve control or coordination by an external transaction manager. The container’s use of local transactions as an optimization technique for enterprise beans with either container-managed transaction demarcation or bean-managed transaction demarcation is not visible to the enterprise beans. For a discussion of the use of resource manager local transactions as a container optimization strategy, refer to [ 12 ] and [ 15 ].
    ============================================================

    The updated verbiage could look something like the Microsoft “promotable transactions” verbiage in .NET 2.0: http://msdn.microsoft.com/en-us/library/ms172070%28v=vs.80%29.aspx.

    It would be great to have some text like that in the JTA specification as well. As you alluded to, it would also be very good for JTA to be a pluggable API with a separate TCK so that it could be used in non-Java EE environments.

    Personally, I think this might be more valuable from an end-user perspective at this point given the most common uses of JTA/transaction APIs these days. Of course, cleaning up the API and adding more implementation detail around X/Open XA would not be so bad provided it can be done in a backwards compatible fashion.

    It might be possible that Resin can take a lead on a new JTA spec if resources become the make-or-break issue, I’ll have to check with my team…

    Cheers,
    Reza

  5. Sudhanshu says:

    Hi Ludovic,

    Really good blog. I had a similar problem when I first tried to understand JTA. The loose specification might provide some flexibility, but it leads to complex and confusing implementations and usage.

    I am currently working on some sort of transactional-memory-based JTA implementation. I am planning to use Ehcache as a benchmark for my JTA implementation, but I am having some trouble understanding the XA implementation in Ehcache. In Ehcache’s XATransactionStore implementation, the transaction enlists only the local cache as a resource (line 114, getorCreateXAResource), so how is JTA able to update the other distributed caches without any reference to them? Is it done through the Terracotta logical view or some other automatic reference passing? If it is something else, please describe it. But if Terracotta already supports distributed updates, why would someone use JTA for atomic updates? Please clarify that.

    I am sorry to ask this question on your blog, but I was not able to find any other way to communicate with you.

    I have already asked it as a question on stackoverflow: http://stackoverflow.com/questions/8245598/distributed-ehcache-working-using-jta

    Regards
    Sudhanshu

  6. /t says:

    Ludovic,

    Don’t waste your time. Create your own specification and release it into the wild, just like Hibernate did. Provide an implementation in BTM and see what happens. The spec should replace both X/Open and JTA.

    Thanks,

    /t

  7. Stephen says:

    I must say I love your work with JTA already, though I’m relatively new to this domain. I’m curious to understand how you would use BitronixTM in a Java web app in a web container without the JNDI noise and stuff.

    My concerns are:
    1. At what point in the application do you want to bootstrap the datasource? Do you do this in the servlet’s init, or in the action handler’s factory?
    2. At what point do you want to create a TM for a given request? Do you want to scope the TM inside the request? Do you want to scope it in the current thread?
    3. At what point do you want to shut down the TM? Is it OK to reuse the TM in multiple servlet requests, or do you want to close the TM with every servlet request and open a new one for the next servlet request?
    4. For a given servlet request, do you want to open a new TM for every SQL Connection request, or do you want to share the TM for all SQL Connections in that one servlet request?
    5. At what point do you want to shut down the datasource?
    6. Are there other concerns with the TM and its associated Transactions that I should look out for?

    I look forward to your response

  8. Ludovic says:

    Sorry for the delay coming back to you. Here are the answers to your questions:

    1) Usually, the datasource should be bootstrapped when your webapp starts up. Have a look at ServletContextListener as its contextInitialized() method is what you’re looking for here.

    2) The TM should have the same scope as the datasource, i.e. start it up when your app starts up, shut it down when your app shuts down.

    3) Shut down the TM and close your datasources during ServletContextListener.contextDestroyed(). The TM is completely thread-safe and actually is meant to be used by multiple threads in parallel. Just call TransactionManager.begin() when you want to start a transaction and TransactionManager.commit() or rollback() when you want to terminate it. Don’t worry about what other threads are doing to the TM, it will handle concurrent transactions just fine.

    4) You want to begin a new transaction for every servlet request that is going to access your transactional resources (ie: database(s), JMS server(s)…) and commit or rollback the transaction at the end of the request. You want one and only one transaction manager for your whole application.

    5) When your app shuts down, see (3).

    6) Nothing in particular. Make sure you configured your datasource(s) correctly and that you use them between TM.begin() and TM.commit() and you should be good to go. The rest is just common sense: make sure you always commit or roll back the transactions you start, and read any INFO, WARN and ERROR logs reported by the TM. My advice usually is to avoid tweaking any setting unless you really need to, i.e. leave as many settings as you can at their default value and only set the ones you really want to change. BTM has a sensible set of default values and automatically figures out everything that is needed to give you the full XA guarantee without any special config. If anything is wrong, it will tell you as soon as possible with a WARN or ERROR log.
