Data in .NET

In this paper I will introduce the way data is organized in a .NET application. After making a small comparison with Delphi datasources I will dive into the XML datasets which hold the data in a .NET application.

The Delphi datasource and dataset component

All data-aware Delphi components communicate with the data through a datasource component which provides a standardized connection to one set of tabulated data. The data-aware component has a datasource property and a datafield property, the latter identifies the field (column) in the datasource.

The datasource component has a dataset property which has to be set to a tDataSet object. Many Delphi components inherit from that base class. tTable reads and writes its data straight from a database table, tQuery uses a SQL query to get to the data and tStoredProc communicates directly with a view in the database itself. Many versions of the components are available:  to work with ADO data you have to use tADOtable and to work with Interbase data you even have the choice from multiple components from multiple vendors. A tDataset derived component can provide access to more than just database data, for instance the OfficePartner suite wraps up the Outlook address book in a tDataSet derived class.

The interface of a visual component is standardized using the DataSource component but when it comes to programmatically manipulating the data you have to program against the dataset itself, the DataSource class has very few members.

The XML dataset

In a .NET application data is always available as a XML dataset which has the following characteristics :

The dataset is self-describing as it contains it's own metadata. Together with the last point it makes XML datasets an ideal tool in a web application, using the HTTP protocol the description and the contents of a dataset can be easily transported over the internet in one go.

XML is written using tags, just like HTML. The big difference is that the set of HTML tags is predefined by W3C and most tags in an XML dataset are custom tags. Items are enclosed in paired tags. In HTML the text between a pair of <B></B> will be shown in a bold font. In the XML data the data between <MyField> </MyField> tags holds the contents of myfield. The XML schema is written using tags as well, for instance the data between a pair of <xs:sequence></> tags describes the fields in a table.

A dataset is an XMLdocument

An XML dataset is best described by Delphi's tXMLdocument class. An XMLdocument is best thought of a set of nodes connected in tree like structures. A node is a set of tags and what is contained between them. Amongst that are the name of the node, attributes, plain data or other nodes consisting of new sets of tags. Let's take a look of a part of a document describing customers, invoices and and invoice details.

All customer- and invoicedetails- rows are represented as nodes. Both have attributes like diffgr:id and msdata:roworder they are specific to a diffgram, more on that later. The fieldvalues from the database rows are enclosed in a set of tags with the fieldname. The Delphi tXMLdocument class has a lot of useful methods and properties to read and write these documents.

XSD : the schema describing the dataset

The schema is stored in an .XSD file but can be found in the result of a webmethod as well. Let's take a look at the schema of the customers dataset as returned by a webmethod named customers.

For the idea it is enough to read the following points from this schema :

XMLNS: the XML NameSpace

An often misunderstood attribute of the schema is the namespace. When I am naming something Customer in my schema I am describing my view on a customer, a very simple record. When somebody else would create a Customer there is no doubt that he or she will use quite different fields. To distinguish my Customer from your customer I place my customers in my own unique namespace. The value of this namespace,  http://www.Gekko-Software.nl/Schemas/DataSetCustomers.xsd, looks like an URL but is actually a Universal Resource Identifier. The Identifier is supposed to be unique, if you take the URL of your website as base for the namespace you will have a reasonable chance of having an unique namespace. Default the namespace of a new ASP.NET application is set to tempuri.org, that is the temporary namespace for applications under development. The URI does need need to be a valid URL, the XML parser will never send it a request. In the XML returned by the webservice you will see that all names which are not a field or tablename are prefixed, like diffgr:id = "Customers1". This prefix stands for the diffgram namespace. The Id attribute is defined in there, it is said to be qualified in the diffgram namespace.


Analyzing XML datasets

The customer-, invoice- and invoicedetails dataset will be further explored from a Delphi client. For the sake of demo this paper will dive deeper into the details of a more abstract dataset with just two tables. It's schema looks like this.

<?xml version="1.0" encoding="utf-8" ?>
<
xs:schema id="Dataset1" targetNamespace="http://www.Gekko-Software.nl/Dataset1.xsd"
          
elementFormDefault="qualified" attributeFormDefault="qualified"
          
xmlns="http://www.Gekko-Software.nl/Dataset1.xsd" xmlns:mstns="http://www.Gekko-Software.nl/Dataset1.xsd"
          
xmlns:xs="http://www.w3.org/2001/XMLSchema"
          
xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <
xs:element name="Dataset1" msdata:IsDataSet="true">
    <
xs:complexType>
      <
xs:choice maxOccurs="unbounded">
        <
xs:element name="GrootBoek">
          <
xs:complexType>
            <
xs:sequence>
              <
xs:element name="idGrootBoek" msdata:AutoIncrement="true" type="xs:int" />
              <
xs:element name="Omschrijf" type="xs:string" minOccurs="0" />
              <
xs:element name="Bezit" type="xs:boolean" minOccurs="0" />
              <
xs:element name="Liquide" type="xs:boolean" minOccurs="0" />
            </
xs:sequence>
          </
xs:complexType>
        </
xs:element>
        <
xs:element name="BoekingsRegel">
          <
xs:complexType>
            <
xs:sequence>
              <
xs:element name="idBoekingsRegel" msdata:AutoIncrement="true" type="xs:int" />
              <
xs:element name="idBoeking" type="xs:int" minOccurs="0" />
              <
xs:element name="Bon" type="xs:string" minOccurs="0" />
              <
xs:element name="Datum" type="xs:dateTime" minOccurs="0" />
              <
xs:element name="Links" type="xs:decimal" minOccurs="0" />
              <
xs:element name="Rechts" type="xs:decimal" minOccurs="0" />
              <
xs:element name="idGrootBoek" type="xs:int" minOccurs="0" />
           </
xs:sequence>
          </
xs:complexType>
        </
xs:element>
      </
xs:choice>
    </
xs:complexType>
    <
xs:unique name="Dataset1Key1" msdata:PrimaryKey="true">
      <
xs:selector xpath=".//mstns:GrootBoek" />
      <
xs:field xpath="mstns:idGrootBoek" />
    </
xs:unique>
    <xs:unique name="Dataset1Key2" msdata:PrimaryKey="true">
      <
xs:selector xpath=".//mstns:BoekingsRegel" />
      <
xs:field xpath="mstns:idBoekingsRegel" />
    </
xs:unique>
    <
xs:keyref name="GrootBoekBoekingsRegel" refer="Dataset1Key1" msdata:ConstraintOnly="true">
     <
xs:selector xpath=".//mstns:BoekingsRegel" />
     <
xs:field xpath="mstns:idGrootBoek" />
    </
xs:keyref>
  </
xs:element>
</
xs:schema>

When compared with a single table schema these differences can be noticed :

In the Visual studio schema editor the schema will be displayed in a graphical representation. By clicking items their values can be set in the property window. 

Data in the XML dataset

Data in an XML dataset is enclosed in a set of tags. An instance of the dataset whose schema we just saw will look like this :

<DataSet1 xmlns="http://www.Gekko-Software.nl/DataSet1.xsd">
<BoekingsRegel>
  <Bon>36</Bon>
  <Datum>2001-07-11T00:00:00.0000000+02:00</Datum>
  <idBoeking>84</idBoeking>
  <idBoekingsRegel>414</idBoekingsRegel>
  <idGrootBoek>23</idGrootBoek>
  <Links>50.42</Links>
  <Rechts>0</Rechts>
</BoekingsRegel>
<BoekingsRegel>
  <Bon>5</Bon>
  <Datum>2001-02-08T00:00:00.0000000+01:00</Datum>
  <idBoeking>19</idBoeking>
  <idBoekingsRegel>423</idBoekingsRegel>
  <idGrootBoek>23</idGrootBoek>
  <Links>39.75</Links>
  <Rechts>0</Rechts>
</BoekingsRegel>
<BoekingsRegel>
  <Bon>22</Bon>
  <Datum>2001-05-01T00:00:00.0000000+02:00</Datum>
  <idBoeking>50</idBoeking>
  <idBoekingsRegel>424</idBoekingsRegel>
  <idGrootBoek>23</idGrootBoek>
  <Links>294</Links>
  <Rechts>0</Rechts>
</BoekingsRegel>
<BoekingsRegel>
  <Bon>16</Bon>
  <Datum>2001-04-24T00:00:00.0000000+02:00</Datum>
  <idBoeking>41</idBoeking>
  <idBoekingsRegel>425</idBoekingsRegel>
  <idGrootBoek>23</idGrootBoek>
  <Links>4.08</Links>
  <Rechts>0</Rechts>
</BoekingsRegel>
<GrootBoek>
  <Bezit>false</Bezit>
  <idGrootBoek>23</idGrootBoek>
  <Liquide>false</Liquide>
  <Omschrijf>Onderhoud</Omschrijf>
</GrootBoek>
</DataSet1>

From this bunch of XML you can read the following


The dataset as a class

So far we have seen the textual representation of an XML dataset. That is the format in which it will come in and out of the application. But inside .NET everything is an object, including datasets. Visual Studio will translate the dataset schema into a class (type) definition. This class definition is based on the System.DataSet class. In the class viewer you can get a good overview of the generated class

Actually VS has generated a couple of classes, DataSet1 describes the dataset as a whole; BoekingsRegelDataTable and GrootBoekDataTable describe the tables in the dataset; BoekingsRegelDataTableRow and GrootBoekDataTableRow describe the rows of these tables. The table and row types are used inside the DataSet1 type. DataSet1 has a property tableBoekingsRegel of type BoekingsRegelDataTable and a property tableGrootBoek of type GrootBoekDataTable. The types describing a row have a property for every field in the row, GrootBoekRow has a property Bezit, idGrootBoek, Liquide and Omschrijf.

The fact that a dataset is strongly typed is very handy when coding. The compiler will validate that you are working with an existing field. All tables and fields show up in code completion :

DataSet1.GrootBoekRow gr = dataSet11.GrootBoek[0];
gr.Bezit = false;

I declare a row of the "grootboek" table and set it to the first row of the GrootBoek table in the dataSet11 object. The row has a property Bezit, corresponding to the "bezit" field, I can set the field's value from code.

VS has generated two overloaded methods in the table classes to add new rows to the dataset . One of them is extra handy, it lets you add a new row in one line of code.

The other version takes a little more coding.

DataSet1.GrootBoekRow gr = dataSet11.GrootBoek.NewRow() as DataSet1.GrootBoekRow;
gr.Bezit = true;
gr.Omschrijf = "This is a new row";
dataSet11.GrootBoek.AddGrootBoekRow(gr);

I have to create a new GrootBoekRow, sets its fields and can add the row object to the table.

Filling the dataset from the database using the DataAdapter Class

I have been changing and adding data in the dataset but did not do anything yet with the database where the data is stored. An XML dataset does not need a database at all, it is actually quite agnostic where the data came from and where it will go. In most applications the data will be stored in a database.

To communicate with databases .NET has two important components. One is the connection component, it makes the connection to a SQL server or an oleDB datasource. The second important component is the DataAdapter, it bundles a select, update, insert and delete SQL statement on a set of tabulated data. In Visual Studio you can drag a table or view from the server explorer on the design surface, the VS wizards will then generate both components including the connection string and the SQL statements .

Out of the box VS.NET can work with MS-SQL server data and with oleDB dataproviders. When you drop a table (or view) from SQL server VS will create a sqlConnection and a sqlDataAdapter component, when you drop data from an oleDB source VS will create a oleDbConnection and an oleDbDataAdapter component. OleDB (ADO) will cover almost every database. But there are more components available, for instance for ODBC. Version 1.1 of the .NET framework will contain these as well as native Oracle classes.

Using the DataAdapters VS can also generate the dataset by right clicking the adapter component. If you construct your dataset this way you will not even see any of the XML, the whole schema is generated behind the scenes and is addressed through the corresponding dataset class.

The two most important things a DataAdapter has to do is filling the dataset with data from the database and writing any changes in the dataset back to the database. To fill a dataset the adapter has the Fill method. The dataset in the example read its data from two tables in the database using parameterized SQL queries: 

try
{
   oleDbConnection1.Open();
   oleDbDataAdapter1.SelectCommand.Parameters[0].Value = 23;
   oleDbDataAdapter2.SelectCommand.Parameters[0].Value = 23;
   oleDbDataAdapter1.Fill(dataSet11.GrootBoek);
   oleDbDataAdapter2.Fill(dataSet11.BoekingsRegel);
}
finally {oleDbConnection1.Close();}

After opening the connection the parameters are set, for the sake of demo to a constant value of 23, in your own code you are welcome assign whatever value you consider appropriate. Executing the Fill method will fill the table in the dataset from the database. Here you can see that a dataSet can house more than one table and a dataAdapter just one, two calls on the same dataset fill it from two different dataAdapters.

Disconnected data

The DataAdapters Fill method is very efficient. All data has to be read just once, there is no need to create scrollable cursors on the server or to place locks on the data. Interacting with data this way also means that the resulting data in the dataset is always disconnected. In Delphi you can use a ClientDataset to work with disconnected data. But this class has only direct support for DataSnap (Midas) dataprovider. If you want to use it with other data you will have to quite some coding yourself.

To summarize the differences and similarities between classical Delphi and .NET : 

XML dataset Delphi Datasource
Always disconnected Using ClientDataset disconnected, otherwise connected
Multiple tables 1 table
Includes and validates constraints Constraints validated by the underlying database

Diffgram

We now have a dataset filled with database data and know how to change and add the data from code. To understand how to update the database I will first take a look at the internals of the dataset. The XMLdata I have shown before is in a simple XML format, it was generated by calling the GetXml method of the dataset

textBox1.Text = dataSet11.GetXml();

Internally the dataset uses the diffgram format. Diffgrams were introduced in MS SQL server, it is the format used to access SQL server using XML. You can see what it looks like with a few lines of code.

System.IO.StringWriter sw = new System.IO.StringWriter();
dataSet11.WriteXml(sw, System.Data.XmlWriteMode.DiffGram);
textBox1.Text = sw.ToString();

A dataset has a WriteXML method which is more flexible than GetXml. The StringWriter is a class used to construct complex strings and the DiffGram constant describes the desired XML contents. WriteXml uses the parameters to fill the StringWriter object after which the resulting XML can be read using  the ToString method.

To see what a diffgram looks like I have added one new row to the "BoekingsRegel" table, deleted another, changed a row and finally extracted the XML.

<diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
  <DataSet1 xmlns="http://www.Gekko-Software.nl/DataSet1.xsd">
    <BoekingsRegel diffgr:id="BoekingsRegel1" msdata:rowOrder="0">
      <Bon>36</Bon>
      <Datum>2001-07-11T00:00:00.0000000+02:00</Datum>
      <idBoeking>84</idBoeking>
      <idBoekingsRegel>414</idBoekingsRegel>
      <idGrootBoek>23</idGrootBoek>
      <Links>50.42</Links>
      <Rechts>0</Rechts>
    </BoekingsRegel>
    <BoekingsRegel diffgr:id="BoekingsRegel2" msdata:rowOrder="1" diffgr:hasChanges="modified">
      <Bon>New bon</Bon>
      <Datum>2001-02-08T00:00:00.0000000+01:00</Datum>
      <idBoeking>19</idBoeking>
      <idBoekingsRegel>423</idBoekingsRegel>
      <idGrootBoek>23</idGrootBoek>
      <Links>139.75</Links>
      <Rechts>0</Rechts>
    </BoekingsRegel>
    <BoekingsRegel diffgr:id="BoekingsRegel4" msdata:rowOrder="3">
      <Bon>16</Bon>
      <Datum>2001-04-24T00:00:00.0000000+02:00</Datum>
      <idBoeking>41</idBoeking>
      <idBoekingsRegel>425</idBoekingsRegel>
      <idGrootBoek>23</idGrootBoek>
      <Links>4.08</Links>
      <Rechts>0</Rechts>
    </BoekingsRegel>
    <BoekingsRegel diffgr:id="BoekingsRegel5" msdata:rowOrder="4" diffgr:hasChanges="inserted">
      <Bon>12345</Bon>
      <Datum>2002-08-23T00:00:00.0000000+02:00</Datum>
      <idBoeking>41</idBoeking>
      <idBoekingsRegel>426</idBoekingsRegel>
      <idGrootBoek>23</idGrootBoek>
      <Links>23</Links>
      <Rechts>0</Rechts>
    </BoekingsRegel>
    <GrootBoek diffgr:id="GrootBoek1" msdata:rowOrder="0">
      <Bezit>false</Bezit>
      <idGrootBoek>23</idGrootBoek>
      <Liquide>false</Liquide>
      <Omschrijf>Onderhoud</Omschrijf>
    </GrootBoek>
  </DataSet1>
  <diffgr:before>
    <BoekingsRegel diffgr:id="BoekingsRegel2" msdata:rowOrder="1" xmlns="http://www.Gekko-Software.nl/DataSet1.xsd">
      <Bon>5</Bon>
      <Datum>2001-02-08T00:00:00.0000000+01:00</Datum>
      <idBoeking>19</idBoeking>
      <idBoekingsRegel>423</idBoekingsRegel>
      <idGrootBoek>23</idGrootBoek>
      <Links>39.75</Links>
      <Rechts>0</Rechts>
    </BoekingsRegel>
    <BoekingsRegel diffgr:id="BoekingsRegel3" msdata:rowOrder="2" xmlns="http://www.Gekko-Software.nl/DataSet1.xsd">
      <Bon>22</Bon>
      <Datum>2001-05-01T00:00:00.0000000+02:00</Datum>
      <idBoeking>50</idBoeking>
      <idBoekingsRegel>424</idBoekingsRegel>
      <idGrootBoek>23</idGrootBoek>
      <Links>294</Links>
      <Rechts>0</Rechts>
    </BoekingsRegel>
  </diffgr:before>
</diffgr:diffgram>

You can read the following:

Using this bunch of XML the DataAdapter will write all changes to the database with the Update method:

oleDbDataAdapter2.Update(dataSet11.BoekingsRegel);

All information needed for an efficient update of the database is available as modified and inserted rows are marked as such. As all original content of the rows is also contained in the diffgram, the method can check if the row to be updated has been changed since it was read from the database. If you take a look at the generated update statement in the DataAdapter you will see that is exactly what the adapter will do:

UPDATE BoekingsRegel
SET Bon = ?, Datum = ?, idBoeking = ?, idGrootBoek = ?, Links = ?, Rechts = ?
WHERE (idBoekingsRegel = ?) AND (Bon = ? OR
? IS NULL AND Bon IS NULL) AND (Datum = ? OR
? IS NULL AND Datum IS NULL) AND (Links = ? OR
? IS NULL AND Links IS NULL) AND (Rechts = ? OR
? IS NULL AND Rechts IS NULL)

The SQL will use all original values to find the row to update. If a value has been changed the update command cannot find the row and will throw an exception. The result is like a scenario using optimistic locking, but the users do not have a connection to the database and no locks are issued on the database.


XML is everywhere in .NET

XML is used in every corner of .NET, not just for working with database data. A lot of settings in a .NET application can be configured at startup. All configuration files are in XML format, as an example a snippet describing users of a web application :

<authentication mode="Forms">
  <forms name="testapp" loginUrl="login.aspx">
    <credentials passwordFormat="Clear">
      <user name="Peter" password="p2"/>
      <user name="John" password="j2"/>
    </credentials>
  </forms>
</authentication>

Windows- or web-controls do not have special databound versions like Delphi controls. Instead individual properties of a control are marked as being bindable, this is done with an attribute in the code of the control. Most of the controls in .NET have large amounts of bindable properties, in the property windows there is a DataBindings property which will start the databindings dialog.

Using this dialog you could set the value of the backcolor in the grid to field in a XML dataset. This dataset could be filled from a configuration file but be part of the database as well. In the latter case the background will dynamically change as you navigate through the data.

Where are we ?

In this story I have only scratched the surface of the vast world of XML and the many ways .NET uses it. All scenarios described are just the defaults of the Visual Studio wizards, every scenario does have many alternatives. My main point is that .NET can produce XML in multiple formats but while coding you program plain objects.

What's next ?