BINARY_CHECKSUM and Database Concurrency

  • Jun 08, 2006 at 7:42 PM
  • Shawn Wildermuth
  • 1 Comment

Url: http://wildermuth.com/viewrant.aspx?id=2053

Recently I posted about Timestamps and CommandBuilders and I got a well informed reply by Luciano Evaristo Guerche concerning a related approach of using BINARY_CHECKSUM in SQL Server to do the same thing.  I think Luciano's response means to say that if you can't use Timestamp's in the database (like you don't have control over the schema) then BINARY_CHECKSUM is an improvement over the brute force concurrency that CommandBuilders do by default.

I thought Luciano was right, but I wanted to prove it out.  I ran some tests using a Typed DataSet and the Northwind Customer table:

  • I changed all 91 records of the customer table by changing a single field (Address).
  • I updated the table with the brute force method, the BINARY_CHECKSUM method and the timestamp method.
  • I ran it in batches of iterations to get some scale.  I changed the entire table in each batch. I ran it in batches of 5, 25, 100, 250, 500 and 1,000 iterations.

In general the results were what I thought.  The brute force performed worst, the BINARY_CHECKSUM was a performance boost, but not to a great extent; and the timestamp showed considerable improvement.  Here is a chart of my results.  The number of rows updated corresponds ot the batches (e.g. 5 iterations x 91 rows = 455, etc.) and I have calculated the row/second response time from each result:

Let me know if you have any questions...

 

Comments

Gravatar

Luciano Evaristo Guerche Thursday, June 8, 2006

Dear Shawn,

Loved the tests and the graphic. Excellent work. As you have said, timestamp is the best choice for sure because it is a field automatically inserted/updated everytime a record is inserted/updated. When timestamp is available, what we do is check whether the value got at reading time and the value got before inserting/updating are the same or not. But, if timestamp is not available and you have no control of the schema, CHECKSUM and BINARY_CHECKSUM might be considered, since they are hash functions, that is, receive parameters and compute a hash value, with slightly better performance than checking field by field. With CHECKSUM and BINARY_CHECKSUM, what we do is compare the value computed during reading time against the value computed at inserting/updating time.

An interesting fact I'd note here is that if somebody else runs an update in a record you are editing before you update it and set the field value to its own value, timestamp checking will trigger a collision, but checksum functions and field by field comparison will not.

--
Luciano Evaristo Guerche
Taboao da Serra, SP, Brazil



Leave a Comment

*
*
*