Data in .NET
In this paper I will introduce the way data is organized in a .NET
application. After making a small comparison with Delphi datasources I will dive
into the XML datasets which hold the data in a .NET application.
The Delphi datasource and dataset component
All data-aware Delphi components communicate with the data through a
datasource component which provides a standardized connection to one set of
tabulated data. The data-aware component has a datasource property and a
datafield property, the latter identifies the field (column) in the datasource.
The datasource component has a dataset property which has to be set to a tDataSet
object. Many Delphi components inherit from that base class. tTable reads and
writes its data straight from a database table, tQuery uses a SQL query to get
to the data and tStoredProc communicates directly with a view in the database
itself. Many versions of the components are available: to work with ADO
data you have to use tADOtable and to work with Interbase data you even have the
choice from multiple components from multiple vendors. A tDataset derived
component can provide access to more than just database data, for instance the OfficePartner
suite wraps up the Outlook address book in a tDataSet derived class.
The interface of a visual component is standardized using the DataSource
component but when it comes to programmatically manipulating the data you have
to program against the dataset itself, the DataSource class has very few
members.
The XML dataset
In a .NET application data is always available as a XML dataset which has the
following characteristics :
- It can contain multiple tables of data.
- It has a schema which describes the tables and their fields.
- Constraints on the data are described in the schema.
- The relationships between the multiple tables are described in the schema
- Schema and data contained can be represented as one big string, which can
be transported using any protocol that will transport a stream of text.
The dataset is self-describing as it contains it's own metadata. Together
with the last point it makes XML datasets an ideal tool in a web application,
using the HTTP protocol the description and the contents of a dataset can be
easily transported over the internet in one go.
XML is written using tags, just like HTML. The big difference is that the set
of HTML tags is predefined by W3C and most tags in an XML dataset are custom
tags. Items are enclosed in paired tags. In HTML the text between a pair of <B></B>
will be shown in a bold font. In the XML data the data between <MyField> </MyField>
tags holds the contents of myfield. The XML schema is written using tags as
well, for instance the data between a pair of <xs:sequence></> tags
describes the fields in a table.
An XML dataset is best described by Delphi's tXMLdocument
class. An XMLdocument is best thought of a set of nodes connected in tree
like structures. A node is a set of tags and what is contained between them.
Amongst that are the name of the node, attributes, plain data or other nodes
consisting of new sets of tags. Let's take a
look of a part of a document describing customers, invoices and and invoice
details.

All customer- and invoicedetails- rows are represented as nodes. Both have attributes
like diffgr:id and msdata:roworder they are specific to a diffgram,
more on that later. The fieldvalues from the database
rows are enclosed
in a set of tags with the fieldname. The Delphi tXMLdocument class has a lot of useful methods and properties to read and write these documents.
XSD : the schema describing the dataset
The schema is stored in an .XSD file but can be found in the result of a
webmethod as well. Let's take a look at the schema of the customers dataset as returned
by a webmethod named customers.

For the idea it is enough to read the following points from this schema :
- The table has four fields : Addres1, Address2, idCustomer and Name.
- The idCustomer field is an integer autoincrement field.
- The other fields are string fields.
- The idCustomer field is the primary key.
An often misunderstood attribute of the schema is the namespace. When I am
naming something Customer in my schema I am describing my view on a customer, a very simple record. When somebody else would create a Customer there is no
doubt that he or she will use quite different fields. To distinguish my Customer
from your customer I place my customers in my own unique namespace. The value of
this namespace, http://www.Gekko-Software.nl/Schemas/DataSetCustomers.xsd,
looks like an URL but is actually a Universal Resource Identifier. The
Identifier is supposed to be unique, if you take the URL of your website as
base for the namespace you will have a reasonable chance of having an unique
namespace. Default the namespace of a new ASP.NET application is set to
tempuri.org, that is the temporary namespace for applications under development.
The URI does need need to be a valid URL, the XML parser will never
send it a request. In the XML returned by the webservice you will see that all
names which are not a field or tablename are prefixed, like diffgr:id =
"Customers1". This prefix stands for the diffgram namespace. The Id
attribute is
defined in there, it is said to be qualified in the diffgram namespace.
Analyzing XML datasets
The customer-, invoice- and invoicedetails dataset will be further explored
from a Delphi client. For the sake of demo this paper will dive deeper into
the details of a
more abstract dataset with just two tables. It's schema looks like this.
<?xml
version="1.0"
encoding="utf-8"
?>
<xs:schema
id="Dataset1"
targetNamespace="http://www.Gekko-Software.nl/Dataset1.xsd"
elementFormDefault="qualified"
attributeFormDefault="qualified"
xmlns="http://www.Gekko-Software.nl/Dataset1.xsd"
xmlns:mstns="http://www.Gekko-Software.nl/Dataset1.xsd"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
<xs:element
name="Dataset1"
msdata:IsDataSet="true">
<xs:complexType>
<xs:choice
maxOccurs="unbounded">
<xs:element
name="GrootBoek">
<xs:complexType>
<xs:sequence>
<xs:element
name="idGrootBoek"
msdata:AutoIncrement="true"
type="xs:int"
/>
<xs:element
name="Omschrijf"
type="xs:string"
minOccurs="0"
/>
<xs:element
name="Bezit"
type="xs:boolean"
minOccurs="0"
/>
<xs:element
name="Liquide"
type="xs:boolean"
minOccurs="0"
/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element
name="BoekingsRegel">
<xs:complexType>
<xs:sequence>
<xs:element name="idBoekingsRegel"
msdata:AutoIncrement="true"
type="xs:int"
/>
<xs:element name="idBoeking"
type="xs:int"
minOccurs="0"
/>
<xs:element name="Bon"
type="xs:string"
minOccurs="0"
/>
<xs:element name="Datum"
type="xs:dateTime"
minOccurs="0"
/>
<xs:element name="Links"
type="xs:decimal"
minOccurs="0"
/>
<xs:element name="Rechts"
type="xs:decimal"
minOccurs="0"
/>
<xs:element name="idGrootBoek"
type="xs:int"
minOccurs="0"
/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:choice>
</xs:complexType>
<xs:unique
name="Dataset1Key1"
msdata:PrimaryKey="true">
<xs:selector
xpath=".//mstns:GrootBoek"
/>
<xs:field
xpath="mstns:idGrootBoek"
/>
</xs:unique>
<xs:unique
name="Dataset1Key2"
msdata:PrimaryKey="true">
<xs:selector
xpath=".//mstns:BoekingsRegel"
/>
<xs:field
xpath="mstns:idBoekingsRegel"
/>
</xs:unique>
<xs:keyref
name="GrootBoekBoekingsRegel"
refer="Dataset1Key1"
msdata:ConstraintOnly="true">
<xs:selector
xpath=".//mstns:BoekingsRegel"
/>
<xs:field
xpath="mstns:idGrootBoek"
/>
</xs:keyref>
</xs:element>
</xs:schema>
When compared with a single table schema these differences can be noticed :
In the Visual studio schema editor the schema will be displayed in a
graphical representation. By clicking items their values can be set in the
property window.

Data in the XML dataset
Data in an XML dataset is enclosed
in a set of tags. An instance of the dataset whose schema we just saw will look like
this :
<DataSet1 xmlns="http://www.Gekko-Software.nl/DataSet1.xsd">
<BoekingsRegel>
<Bon>36</Bon>
<Datum>2001-07-11T00:00:00.0000000+02:00</Datum>
<idBoeking>84</idBoeking>
<idBoekingsRegel>414</idBoekingsRegel>
<idGrootBoek>23</idGrootBoek>
<Links>50.42</Links>
<Rechts>0</Rechts>
</BoekingsRegel>
<BoekingsRegel>
<Bon>5</Bon>
<Datum>2001-02-08T00:00:00.0000000+01:00</Datum>
<idBoeking>19</idBoeking>
<idBoekingsRegel>423</idBoekingsRegel>
<idGrootBoek>23</idGrootBoek>
<Links>39.75</Links>
<Rechts>0</Rechts>
</BoekingsRegel>
<BoekingsRegel>
<Bon>22</Bon>
<Datum>2001-05-01T00:00:00.0000000+02:00</Datum>
<idBoeking>50</idBoeking>
<idBoekingsRegel>424</idBoekingsRegel>
<idGrootBoek>23</idGrootBoek>
<Links>294</Links>
<Rechts>0</Rechts>
</BoekingsRegel>
<BoekingsRegel>
<Bon>16</Bon>
<Datum>2001-04-24T00:00:00.0000000+02:00</Datum>
<idBoeking>41</idBoeking>
<idBoekingsRegel>425</idBoekingsRegel>
<idGrootBoek>23</idGrootBoek>
<Links>4.08</Links>
<Rechts>0</Rechts>
</BoekingsRegel>
<GrootBoek>
<Bezit>false</Bezit>
<idGrootBoek>23</idGrootBoek>
<Liquide>false</Liquide>
<Omschrijf>Onderhoud</Omschrijf>
</GrootBoek>
</DataSet1>
From this bunch of XML you can read the following
The dataset as a class
So far we have seen the textual representation of an XML dataset. That is the
format in which it will come in and out of the application. But inside .NET
everything is an object, including datasets. Visual Studio will translate the
dataset schema into a class (type) definition. This class definition is based on
the System.DataSet class. In the class viewer you can get a good overview of the
generated class

Actually VS has generated a couple of classes, DataSet1 describes the dataset
as a whole; BoekingsRegelDataTable and GrootBoekDataTable describe
the tables in the dataset; BoekingsRegelDataTableRow and GrootBoekDataTableRow
describe the rows of these tables. The table and row types are used inside the DataSet1
type. DataSet1
has a property tableBoekingsRegel of type BoekingsRegelDataTable and a
property tableGrootBoek of type GrootBoekDataTable. The types describing
a row have a property for every field in the row, GrootBoekRow has a property Bezit,
idGrootBoek, Liquide and Omschrijf.
The fact that a dataset is strongly typed is very handy when coding. The
compiler will validate that you are working with an existing field. All tables and fields show up in code completion :
DataSet1.GrootBoekRow gr = dataSet11.GrootBoek[0];
gr.Bezit = false;
I declare a row of the "grootboek" table and set it to the first
row of the GrootBoek table in the dataSet11 object. The row has a
property Bezit, corresponding to the "bezit" field, I can set the field's value
from code.
VS has generated two overloaded methods in the
table classes to add new rows to the dataset . One of them is extra handy, it lets you add a new row in one line
of code.

The other version takes a little more coding.
DataSet1.GrootBoekRow gr = dataSet11.GrootBoek.NewRow() as
DataSet1.GrootBoekRow;
gr.Bezit = true;
gr.Omschrijf = "This is a new row";
dataSet11.GrootBoek.AddGrootBoekRow(gr);
I have to create a new GrootBoekRow, sets its fields and can add the
row object to the table.
Filling the dataset from the database using the DataAdapter Class
I have been changing and adding data in the dataset but did not do anything
yet with the database where the data is stored. An XML dataset does not need a database at
all, it is actually quite agnostic where the data came from and where it will
go. In most applications the data will be stored in a database.
To communicate with databases .NET has two important components. One is the
connection component, it makes the connection to a SQL server or an
oleDB datasource. The second important component is the DataAdapter,
it bundles a select, update, insert and delete SQL statement on a set of tabulated
data. In Visual Studio you can drag a table or view from the server explorer on
the design surface, the VS wizards will then generate both components including
the connection string and
the SQL statements .

Out of the box VS.NET can work with MS-SQL server data and with oleDB dataproviders.
When you drop a table (or view) from SQL server VS will create a sqlConnection
and a sqlDataAdapter component, when you drop data from an oleDB source VS will
create a oleDbConnection and an oleDbDataAdapter component. OleDB (ADO) will cover almost every database.
But there are more components
available, for instance for ODBC. Version 1.1 of the .NET framework will
contain these as well as native Oracle classes.
Using the DataAdapters VS can also generate the
dataset by right clicking the adapter component. If you construct your dataset
this way you will not even see any
of the XML, the whole schema is generated behind the scenes and is addressed
through the corresponding dataset class.
The two most important things a DataAdapter has to do is filling the dataset
with data from the database and writing any changes in the dataset back to the
database. To fill a dataset the adapter has the Fill method. The dataset in the example read its data from two tables in the database
using parameterized SQL queries:
try
{
oleDbConnection1.Open();
oleDbDataAdapter1.SelectCommand.Parameters[0].Value = 23;
oleDbDataAdapter2.SelectCommand.Parameters[0].Value = 23;
oleDbDataAdapter1.Fill(dataSet11.GrootBoek);
oleDbDataAdapter2.Fill(dataSet11.BoekingsRegel);
}
finally {oleDbConnection1.Close();}
After opening the connection the parameters are set, for the sake of demo to
a constant value of 23, in your own code you are welcome assign whatever value
you consider appropriate. Executing the Fill method will fill the table in the dataset
from the database. Here you can see that a dataSet can house more than one table
and a dataAdapter just one, two calls on the same dataset fill it from two
different dataAdapters.
Disconnected data
The DataAdapters Fill method is very efficient. All data has to be
read just once, there is no need to create scrollable cursors on the server or
to place locks on the data. Interacting with data this way also
means that the resulting data in the dataset is always disconnected. In
Delphi you can use a ClientDataset to work with disconnected data. But this
class has only direct support for DataSnap (Midas) dataprovider. If you want to
use it with other data you will have to quite some coding yourself.
To
summarize the differences and similarities between classical Delphi and .NET
:
| XML dataset |
Delphi Datasource |
| Always disconnected |
Using ClientDataset disconnected, otherwise connected |
| Multiple tables |
1 table |
| Includes and validates constraints |
Constraints validated by the underlying database |
We now have a dataset filled with database data and know how to change and
add the data from code. To understand how to update the database I will first take a
look at the internals of the dataset. The XMLdata I have shown before
is in a simple XML format, it was generated by calling the GetXml
method of the dataset
textBox1.Text = dataSet11.GetXml();
Internally the dataset uses the diffgram format. Diffgrams were introduced
in MS SQL server, it is the format used to access SQL server using XML. You can see what
it looks like with a few lines of code.
System.IO.StringWriter sw = new
System.IO.StringWriter();
dataSet11.WriteXml(sw, System.Data.XmlWriteMode.DiffGram);
textBox1.Text = sw.ToString();
A dataset has a WriteXML method which is more flexible than GetXml.
The StringWriter is a class used to construct complex strings and the DiffGram
constant describes the desired XML contents. WriteXml uses the parameters to
fill the StringWriter object after which the resulting XML can be read
using the ToString method.
To see what a diffgram looks like I have added one new row to the "BoekingsRegel" table, deleted another, changed
a row and finally extracted
the XML.
<diffgr:diffgram
xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
<DataSet1 xmlns="http://www.Gekko-Software.nl/DataSet1.xsd">
<BoekingsRegel diffgr:id="BoekingsRegel1" msdata:rowOrder="0">
<Bon>36</Bon>
<Datum>2001-07-11T00:00:00.0000000+02:00</Datum>
<idBoeking>84</idBoeking>
<idBoekingsRegel>414</idBoekingsRegel>
<idGrootBoek>23</idGrootBoek>
<Links>50.42</Links>
<Rechts>0</Rechts>
</BoekingsRegel>
<BoekingsRegel diffgr:id="BoekingsRegel2" msdata:rowOrder="1"
diffgr:hasChanges="modified">
<Bon>New bon</Bon>
<Datum>2001-02-08T00:00:00.0000000+01:00</Datum>
<idBoeking>19</idBoeking>
<idBoekingsRegel>423</idBoekingsRegel>
<idGrootBoek>23</idGrootBoek>
<Links>139.75</Links>
<Rechts>0</Rechts>
</BoekingsRegel>
<BoekingsRegel diffgr:id="BoekingsRegel4" msdata:rowOrder="3">
<Bon>16</Bon>
<Datum>2001-04-24T00:00:00.0000000+02:00</Datum>
<idBoeking>41</idBoeking>
<idBoekingsRegel>425</idBoekingsRegel>
<idGrootBoek>23</idGrootBoek>
<Links>4.08</Links>
<Rechts>0</Rechts>
</BoekingsRegel>
<BoekingsRegel diffgr:id="BoekingsRegel5" msdata:rowOrder="4" diffgr:hasChanges="inserted">
<Bon>12345</Bon>
<Datum>2002-08-23T00:00:00.0000000+02:00</Datum>
<idBoeking>41</idBoeking>
<idBoekingsRegel>426</idBoekingsRegel>
<idGrootBoek>23</idGrootBoek>
<Links>23</Links>
<Rechts>0</Rechts>
</BoekingsRegel>
<GrootBoek diffgr:id="GrootBoek1" msdata:rowOrder="0">
<Bezit>false</Bezit>
<idGrootBoek>23</idGrootBoek>
<Liquide>false</Liquide>
<Omschrijf>Onderhoud</Omschrijf>
</GrootBoek>
</DataSet1>
<diffgr:before>
<BoekingsRegel diffgr:id="BoekingsRegel2" msdata:rowOrder="1"
xmlns="http://www.Gekko-Software.nl/DataSet1.xsd">
<Bon>5</Bon>
<Datum>2001-02-08T00:00:00.0000000+01:00</Datum>
<idBoeking>19</idBoeking>
<idBoekingsRegel>423</idBoekingsRegel>
<idGrootBoek>23</idGrootBoek>
<Links>39.75</Links>
<Rechts>0</Rechts>
</BoekingsRegel>
<BoekingsRegel diffgr:id="BoekingsRegel3"
msdata:rowOrder="2" xmlns="http://www.Gekko-Software.nl/DataSet1.xsd">
<Bon>22</Bon>
<Datum>2001-05-01T00:00:00.0000000+02:00</Datum>
<idBoeking>50</idBoeking>
<idBoekingsRegel>424</idBoekingsRegel>
<idGrootBoek>23</idGrootBoek>
<Links>294</Links>
<Rechts>0</Rechts>
</BoekingsRegel>
</diffgr:before>
</diffgr:diffgram>
You can read the following:
Using this bunch of XML the DataAdapter will write all changes to the database
with the Update
method:
oleDbDataAdapter2.Update(dataSet11.BoekingsRegel);
All information needed for an efficient update of the database is available as
modified and inserted rows are marked as such. As all original content of the
rows is also contained in the diffgram, the method can check if the row to be
updated has been changed since it was read from the database. If you take a look
at the generated update statement in the DataAdapter you will see that is
exactly what the adapter will do:
UPDATE BoekingsRegel
SET Bon = ?, Datum = ?, idBoeking = ?, idGrootBoek = ?, Links = ?, Rechts = ?
WHERE (idBoekingsRegel = ?) AND (Bon = ? OR
? IS NULL AND Bon IS NULL) AND (Datum = ? OR
? IS NULL AND Datum IS NULL) AND (Links = ? OR
? IS NULL AND Links IS NULL) AND (Rechts = ? OR
? IS NULL AND Rechts IS NULL)
The SQL will use all original values to find the row to update. If a value
has been changed the update command cannot find the row and will throw an exception.
The result is like a scenario using optimistic locking, but the users do not
have a connection to the database and no locks are issued on the database.
XML is everywhere in .NET
XML is used in every corner of .NET, not just for working with database data.
A lot of settings in a .NET application can be configured at startup. All
configuration files are in XML format, as an example a snippet describing users
of a web application :
<authentication
mode="Forms">
<forms
name="testapp"
loginUrl="login.aspx">
<credentials
passwordFormat="Clear">
<user
name="Peter"
password="p2"/>
<user
name="John"
password="j2"/>
</credentials>
</forms>
</authentication>
Windows- or web-controls do not have special databound versions like
Delphi controls. Instead individual properties of a control are marked as being
bindable, this is done with an attribute in
the code of the control. Most of the controls in .NET have large amounts of bindable
properties, in the property windows there is a DataBindings property
which will start the databindings dialog.

Using this dialog you could set the value of the backcolor in the grid to
field in a XML dataset. This dataset could be filled from a configuration file
but be part of the database as well. In the latter case the background will
dynamically change as you navigate through the data.
Where are we ?
In this story I have only scratched the surface of the vast world of XML and
the many ways .NET uses it. All scenarios described are just the defaults of the
Visual Studio wizards, every scenario does have many alternatives. My main point
is that .NET can produce XML in multiple formats but while coding you program
plain objects.
What's next ?
|