Jonas Hogstrom

Re: Re: Eco: Optimistic Locking

Before reading this, I suggest you start with Malcolm’s blog "Eco: Optimistic Locking", and then Krishnan’s follow up blog on the same topic. Both are well written pieces of information on optimistic locking in Eco, but neither of them comes from the horse’s mouth :-)

Hi by the way, and welcome to this first blog post. Here I’ll try to cover some of my responsibilities in the Eco II model powered framework. I spend most of my time working on the O/R mapper and all that is related to that or the OCL evaluator (or a combination of the two such as converting OCL queries to SQL queries). Anyway, back to the topic.

First I’ll explain a small detail that Malcolm got wrong regarding "TimeStamp locking". Each object in the database has a timestamp that is independent of every other object (it is stored along with the other attributes of the object). Every time the object is updated it will receive a new timestamp, and this timestamp will be the same for every object updated in the same transaction. You could think of the timestamp as a transaction number. When the timestamp is validated, Eco will fetch the timestamp for that object from the database, and compare with the timestamp that was retrieved when the object was loaded into memory in the first place. If the timestamps are not equal, it means that the object has been modified by someone else.

This means that if one client updates an Appointment object, this will not prevent another client from updating a Person object. If that had been the case, it would most certainly have been a bad thing since it would make it completely impossible use timestamp locking if you are writing a multi user application (and if you’re not writing a multi user application, there is little need for optimistic locking in the first place)

The sql-statement that Malcolm mentions: "UPDATE ECO_TIMESTAMP SET BOLD_TIME_STAMP = BOLD_TIME_STAMP + 1" is executed once at the start of the transaction (actually, just before, in a separate transaction) and simply allocates a new transaction number to be assigned to all the objects that are about to be updated. If the optimistic locking fails, the timestamp that was allocated will not be used. This could be regarded as a waste, but hey, it’s just an integer :-) If you use one transaction per second, 24/7, including Christmas Eve, you won’t run out of integers for 60 years.

Now on to Krishnan’s blog. After reading Malcolm’s blog he has a few concerns about the optimistic locking in Eco. The first concern is regarding efficiency. Eco currently validates the optimistic locking by issuing a SELECT statement to retrieve the database values and compare them to the values that were previously loaded. It is true that when you update one object, there will be two statements sent to the database. One to retrieve the old values, and if they are OK, Eco will then execute an UPDATE statement. However, if you update multiple objects, Eco will load the data for validation more efficiently, especially if you are using timestamp locking. Normally, the timestamp field is located in the same table for all objects, so the timestamp data for a whole batch of objects can be loaded were with a single SELECT. So as for performance and scalability, the price you pay for validation will always be smaller (and with timestamp locking, magnitudes smaller) than the price you will have to pay anyway for the actual update.

The suggested solution to get rid of the SELECT statement is to include the validation in a WHERE clause to the UPDATE statement. Yes, Eco has all the information available to append this WHERE clause, and I agree it would be wise to do this where this is appropriate. IDbCommand.ExecuteNonQuery (The ADO.Net function that Eco uses to send the update statements) will return the number of rows affected by an UPDATE statement, so it is possible to detect if the conditions were not met, and then roll back the transaction and throw an (optimistic locking failure) exception. [Note. Some databases supports batch updates, which would complicate the case a little, especially if this batch behaviour was hidden behind the scenes by Bdp (Borland Data Provider) for example. ]

However, the optimistic locking scheme in Eco is a little more generic that it might appear when you read Malcolm’s blog.

First of all, if an object is located in multiple tables (perhaps because of inheritance in the model), it is not certain that all tables will be modified when you save the object, but if the mode is "AllMembers", you would still want to validate the values in tables that you are not going to UPDATE. This could possibly be achieved with a little more complex WHERE clause where you nest an inner SELECT to compare data from the tables you are not going to update.

There are however even more complex scenarios… When the EcoSpace sends a batch of objects to be updated, it will send a block of data that should be validated. The data in the datablock will contain the "old" values of the objects that are about to be updated, but is not limited to this. It could contain values from any object that happens to be loaded in the EcoSpace and that for some reason or other needs to be validated (that it has not changed).

Imagine for example a model with a Customer class that has a credit limit, and a association to an Order class (0..*). When the credit limit is changed, you would want to make sure that no other client has added a new Order to the Customer (if the credit limit is lowered, the other client might have placed an Order that takes the Customer above the new credit limit). This can be achieved in Eco by specifying a "region" that covers the credit limit attribute, and the relationship to orders (the region concept is a big topic in itself, and will not be covered here). If the credit limit has been modified, the datablock that contains values for the optimistic locking will now contain not only the old values for the customer object, but also the value of the relation from Customer to Order. The value of the relationship is not stored in the Customer table. It is not even stored in a single row in the Order table. The region mechanisms will actually allow you to specify the dependency between objects that are further apart than one association; perhaps you would want to validate that no one has added an OrderItem to any of the Orders that the Customer already has, or changed the quantity of an OrderItem.

Also, as some reader commented to Krishnan’s blog, blob fields can not be used in the where-clause, but they can be fetched with a select and compared in memory (expensive, but possible if you for some reason don’t want to use timestamp locking).

The second main concern in Krishnan’s blog was regarding isolation levels. Indeed, READ COMMITTED is not enough to guarantee the optimistic locking when some data is validated in memory with a separate SELECT statement, actually, not even REPEATABLE READ is enough to validate an association like the one above between Customer and Order. In order to make sure that no one adds a new Order to a Customer after validating that the existing set of Orders has not changed, the isolation level must be SERIALIZABLE, but this would unfortunately prevent anyone from adding a new order to any other customer. Eco does not enforce any particular use of isolation levels, the developer will have to decide this and weigh the pros and the cons with each isolation level.

So to summarize: Eco could certainly benefit from using WHERE clauses in the UPDATE statements to validate the optimistic locking, but this will not handle all aspects of optimistic locking in Eco.

Posted by Jonas Hogstrom archive on October 22nd, 2004 under Eco |


Comments are closed.



Server Response from: blog1.codegear.com