My Rants and Raves about technology, programming, everything else...
I just attended the second day of Chris Sells' and Tim Ewald's Web Services DevCon East and had a great time. Yasser Shohoud gave a wonderful talk on "The Right Way to Build Web Services". He echoed something I have been thinking about for some time. Sure, I didn't want to learn how to write WSDL. At the same time, I know that the WSDL generated by the '?wsdl' syntax of ASP.NET's .asmx files does not let me design the interface first. So I changed my mind and learned to write WSDL, and it really isn't too difficult. It is too bad that we cannot disable the ?wsdl syntax and just use a static WebService.WSDL URL to have our customers get our WSDL files.
My natural inclination is still influenced by my days developing COM components in ATL. I want to define the interface up front like we did with IDL. In the early days of ATL, I had been doing MFC work and did not want to hand-code my own IDL either. You would think I would have learned by now that starting with the interface is the better development model. By writing our own WSDL we can define our interfaces (both the calling convention and the schema of the messages) and run WSDL.exe to build a skeleton class in which we implement the service.
Unfortunately .NET just makes it much too simple to annotate the Web Service's methods with [WebMethod] and let the XML Serialization do all the rest. I am hoping we all remember the heartache we suffered the first time we did this in Visual Basic or MFC back in the COM days.
Why is everyone so down on using DataSets in .NET Web Services? Sure, I’ll admit that using DataSets directly as Web Service parameters is indeed a problem. But why throw the baby out with the bath water?
For the uninitiated, DataSets are a problem as Web Service parameters because the XML that is automatically generated for the parameter is a DiffGram of the DataSet. Unfortunately, DiffGrams are simply not interop-friendly. At the end of the day, the obvious use of DataSets in .NET Web Services is simply a bad idea.
But if we deal with DataSets as XML instead of as a class to be serialized, we can actually achieve some real benefits. If you have worked with DataSets, you know that you can specify an .xsd as the schema of the DataSet. What that means is that you can deliver the contents of the DataSet, with the relevant schema, as an XML document. Since the resulting XML document can refer to a specific schema, the consumers of the Web Service (whether they are using Java, WebSphere, or .NET) will receive a self-describing, strongly typed piece of information.
I was recently in a DevelopMentor course when I ran into a very interesting observation. The XmlSerializer serializes any class that derives from XmlNode (including XmlDocument, XmlElement, et al.) as plain XML. Prior to the RTM of the .NET Framework, these classes were serialized like any other class (all public properties and fields were serialized). To our amazement (Dan Sullivan's and mine), we realized that the XML classes serialized perfectly when run through the XmlSerializer class.
Ok, neato, but why do I care? As a developer using ADO.NET, I realized that by utilizing the XmlDataDocument and specifying an XSD for my DataSet, I could have my Web Services return XML as specified by the XSD without ever transforming the data on its way out of the Web Service.
This works because the .NET Web Services infrastructure uses the XmlSerializer to serialize objects. Normally, when specifying .NET types as arguments or return types for your Web Services, this would cause problems with interoperability with non-.NET platforms. So if I specify a WebMethod like:
When the XML revolution happened, I was surprised how quickly developers jumped on the coming tide. I have to admit, the first time I saw XML, I believed it was nothing more than structured storage. That is the magic of it, isn't it? It is just structured storage, but a structured storage format that is universally understood.
So I think we have reached a crossroads with XML. The toolsets have made it so easy to use that I think there is a bit of overuse of XML. After all, XML is just structured storage, but it is inefficient structured storage. When we use XML we are giving up performance and efficiency for the portability of the format and the richness of the tools that work with it (the tangential technologies of XPath, XSLT, SOAP, etc.). I like to think of XML as a way to enable enterprises to speak with each other in a common way. With that in mind, are we not (as developers) overusing XML? I think so.
How does this affect my daily work? I have come to realize that there has to be a better way of dealing with XML to improve the speed of development and simplify the code that uses it. Using SAX or a DOM model to navigate XML documents creates weakly typed spaghetti code. There has to be a better way of marrying the portability of XML with a simpler (preferably strongly typed) way of programming against XML. Luckily for those of us using .NET, the Framework's XmlSerializer class is a really useful tool in that it allows us to work with a set of classes as an object graph and only deal with the complexities and inefficiency of XML when we actually serialize it to an XML document. See http://msdn.microsoft.com for more information on how to use XmlSerializer.
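The same idea can be sketched outside of .NET. The snippet below is not the XmlSerializer; it is a minimal Python illustration, using only the standard library, of working against a typed object graph and touching XML only at the boundary. The `Order` and `OrderLine` types and the two helper functions are hypothetical names made up for the example.

```python
from dataclasses import dataclass, field
from typing import List
import xml.etree.ElementTree as ET


@dataclass
class OrderLine:  # hypothetical type, for illustration only
    sku: str
    quantity: int


@dataclass
class Order:  # hypothetical type, for illustration only
    order_id: int
    customer: str
    lines: List[OrderLine] = field(default_factory=list)

    def total_quantity(self) -> int:
        # Business logic works on typed fields, not on XML nodes.
        return sum(line.quantity for line in self.lines)


def order_to_xml(order: Order) -> str:
    """Convert the object graph to XML only when it has to leave the process."""
    root = ET.Element("Order", {"id": str(order.order_id)})
    ET.SubElement(root, "Customer").text = order.customer
    lines = ET.SubElement(root, "Lines")
    for line in order.lines:
        ET.SubElement(lines, "Line", {"sku": line.sku, "quantity": str(line.quantity)})
    return ET.tostring(root, encoding="unicode")


def order_from_xml(xml_text: str) -> Order:
    """Rebuild the typed object graph from the XML produced above."""
    root = ET.fromstring(xml_text)
    return Order(
        order_id=int(root.get("id")),
        customer=root.findtext("Customer"),
        lines=[
            OrderLine(sku=e.get("sku"), quantity=int(e.get("quantity")))
            for e in root.findall("Lines/Line")
        ],
    )
```

Everything between the two conversion functions deals with plain typed objects; the XML only exists at the point where the data is shipped or received.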
For those who do not know yet, the XML integration with the DataSet is very powerful. Most of the integration is about filling and getting XML from your DataSet. But the XmlDataDocument is really cool. Simply by assigning the DataSet to the XmlDataDocument, you can work with the DataSet data either relationally (through the DataSet) or hierarchically (through the XmlDocument). So, next time you need to transform the DataSet data or just run an XPath query, assign your DataSet to an XmlDataDocument and watch the magic begin...
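To make the "relational or hierarchical" point concrete without .NET, here is a rough Python sketch using only the standard library. It is an analogy, not the XmlDataDocument: the XML here is a one-way snapshot of the rows rather than a live, synchronized view, and the `customers` data and `rows_to_xml` helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows standing in for one table of a DataSet.
customers = [
    {"id": "1", "name": "Alice", "city": "Boston"},
    {"id": "2", "name": "Bob", "city": "Atlanta"},
    {"id": "3", "name": "Carol", "city": "Boston"},
]


def rows_to_xml(table_name, rows):
    """Build a one-way XML snapshot of the rows (not a live, synchronized view)."""
    root = ET.Element(table_name + "s")
    for row in rows:
        ET.SubElement(root, table_name, row)  # each column becomes an attribute
    return root


# Relational-style access: filter the rows directly.
boston_rows = [r["name"] for r in customers if r["city"] == "Boston"]

# Hierarchical access: the same question asked as an XPath-style query.
doc = rows_to_xml("Customer", customers)
boston_nodes = [e.get("name") for e in doc.findall("./Customer[@city='Boston']")]
```

Both views answer the same question; which one is more convenient depends on whether the next step is a join-style operation or a transform or path query.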
Too many times when I am asked to look at old ADO code, the recordsets are created with a slower cursor than they actually need. This is especially prevalent in ASP code. In almost every piece of ASP code, the job of the page is to report existing data. In that case you should always use an adOpenForwardOnly cursor. Remember, if you are only reading the data, the other cursor types use extra database resources and cause extra round trips to the database. If you are using other cursors just to be able to go backwards in the recordset, it is almost always better to use the adOpenForwardOnly cursor and cache the data locally to allow for reverse traversal.
Am I the only one that abhors this dreadful API? I understand the usefulness of Parameters.Refresh() during development. The problem lies in the fact that it is just too easy to leave the code in place. Including an extra network round trip in every call that uses it is simply a waste of time. Now I know that you are an intelligent programmer who would never leave that code in place; I am talking about all the other programmers who would. Most databases (ignoring Access) allow you to query the database for information about parameters. Since the database supports it, why does ADO have to? I don't think it does.
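The "read forward once, cache locally" pattern looks roughly like this. The sketch uses Python's built-in sqlite3 module as a stand-in, not ADO (its cursors only iterate forward in any case), and the `orders` table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database for the sketch
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany(
    "INSERT INTO orders (item) VALUES (?)",
    [("widget",), ("gadget",), ("gizmo",)],
)

# Read the result set once, front to back, and keep a local copy.
cursor = conn.execute("SELECT id, item FROM orders ORDER BY id")
cached_rows = list(cursor)
conn.close()  # the database is no longer involved from here on

# Reverse traversal now costs nothing on the database side.
newest_first = [item for _id, item in reversed(cached_rows)]
```

Once the rows are in a local list, going backwards (or jumping around) is just list indexing, with no scrollable cursor held open on the server.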
Anyone else remember the promised "In-Memory Database" (IMDB) that was to be part of COM+ some years back? Well, Microsoft has finally delivered a first version of it in ADO.NET's DataSet class.
For those who haven't taken a look at ADO.NET yet, don't make the mistake of assuming the DataSet is a replacement for the Recordset from ADO. DataSets contain one or more tables. Tables can be set up with relationships, keys, constraints, etc. Though a real SQL engine does not exist (yet), DataSets do allow you to re-create some of your databases in memory. If you are attempting to scale your database servers, please take a look at this very cool technology!
Ok, this pet peeve is a biggie. Over and over I have seen database schemas that simply defined the table structures and some stored procedures. Most modern database systems support advanced features for maintaining database integrity.
If you are going to go through the trouble to define a database schema, please finish the job. The database will help you do your job if you simply define Relationships, Constraints and Triggers. There are good tools for helping you do this. I really like ERwin or Visio for designing databases.
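As a small illustration of the database doing this work for you, the sketch below declares a relationship and a couple of constraints and then lets the engine reject bad rows. It uses SQLite through Python's standard library purely as a convenient stand-in; the `customers` and `orders` tables are hypothetical, and triggers are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    quantity    INTEGER NOT NULL CHECK (quantity > 0)
);
""")

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Contoso')")
conn.execute("INSERT INTO orders (customer_id, quantity) VALUES (1, 5)")


def try_insert(sql):
    """Return the integrity error message, or None if the row was accepted."""
    try:
        conn.execute(sql)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


# An order for a customer that does not exist is refused by the relationship.
orphan_error = try_insert("INSERT INTO orders (customer_id, quantity) VALUES (99, 1)")

# A non-positive quantity is refused by the CHECK constraint.
bad_quantity_error = try_insert("INSERT INTO orders (customer_id, quantity) VALUES (1, 0)")

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Neither bad row needed any application code to catch it; the schema itself held the line, which is the whole point of finishing the job.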
The most common error I see in poorly scaling database code is reckless use of the connection object. For all multi-user database programming (which accounts for most of the work these days), database connections are a limited resource. Don't let it be your code that is hanging on to its connection long after you are finished with it. I am *not* saying that all work can be done disconnected. I am simply asking you to keep in mind that Connections are precious things. Try to do these two things: