DataReader vs. DataSet
Performance Characteristics
DataReaders are faster than DataSets populated via DataAdapters, right?
Well, yes --- and no.
The DataSet provides useful functionality, including:
- Master-Detail Relationships with integrity checking
- Client-side computed columns, and aggregations that keep up to date
when the underlying data changes
- Client-side filtering and sorting that doesn’t require round-trips
to the Database and stays up to date when the underlying data changes
- A broad choice of data binding options, including full two-way data
binding with Windows Forms, binding to the DataGrid, and other .NET
controls
- Integration with the DataAdapter for automatic database updates
- Ability to directly read and write XML representations of relational
data.
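To make those features concrete, here is a minimal, self-contained sketch (no database required; the table and column names are invented for illustration) showing a master-detail relation, a live aggregate column, and client-side filtering:

```csharp
using System;
using System.Data;

// Build a small in-memory master-detail DataSet.
DataSet ds = new DataSet();

DataTable orders = ds.Tables.Add("Orders");
orders.Columns.Add("OrderID", typeof(int));
orders.Columns.Add("Customer", typeof(string));

DataTable items = ds.Tables.Add("Items");
items.Columns.Add("OrderID", typeof(int));
items.Columns.Add("Amount", typeof(decimal));

// Master-detail relationship with integrity checking.
ds.Relations.Add("OrderItems",
    orders.Columns["OrderID"], items.Columns["OrderID"]);

// Client-side aggregate column that stays up to date as child rows change.
orders.Columns.Add("Total", typeof(decimal), "Sum(Child(OrderItems).Amount)");

orders.Rows.Add(1, "ALFKI");
items.Rows.Add(1, 10m);
items.Rows.Add(1, 5m);

Console.WriteLine(orders.Rows[0]["Total"]); // recomputed automatically: 15

// Client-side filtering and sorting, no round-trip to the database.
DataView view = new DataView(items, "Amount > 6", "Amount DESC",
    DataViewRowState.CurrentRows);
Console.WriteLine(view.Count); // 1
```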
At some point, however (if it hasn't already), the controversy will
arise that DataReaders are faster than DataAdapters/DataSets and that
you should wean yourself away from all those features and use the lean,
mean DataReader instead. After all, the DataAdapter.Fill() method actually
uses a DataReader internally, so the theory goes that it can only
hope to approach the performance of a DataReader alone.
However, there are other issues involved here that may not immediately
come into focus on first pass. In reality, on multi-user
systems, a DataReader holds the connection -- and any locks -- open throughout
the duration of your looping logic (e.g., while (myReader.Read()) { ... }).
Conversely, DataAdapters/DataSets release their connections
and locks as soon as the Fill() call completes. Using
DataReaders can therefore lead to increased database contention, slower
performance, and potentially poor scalability, depending on concurrent traffic
under load. Of course, the only way to get a realistic handle on all this
is to test your code in a simulated production environment.
Improving DataAdapter Performance
You can, however, improve the performance of the DataAdapter so that it
approaches the speed of a DataReader. This is done by temporarily turning
off some of the DataSet's advanced default functionality during processing.
The biggest performance hits during DataSet retrieval are the integrity
constraints and indexing that the DataSet uses to maintain its data internally.
The code listing below uses a DataAdapter to fill a DataSet that has two
DataTables, Customers and Employees, from the Northwind database.
DataSet ds = new DataSet();
string strConn = "Server=(local);database=Northwind;user id=sa;password=;";
SqlConnection cn = new SqlConnection(strConn);
cn.Open();
SqlDataAdapter da = new SqlDataAdapter("select * from Customers", cn);
ds.Tables.Add("Customers");
ds.Tables.Add("Employees");
// Turn off constraint checking for the whole DataSet while loading
ds.EnforceConstraints = false;
// Suspend notifications and index maintenance around each fill
ds.Tables["Customers"].BeginLoadData();
da.Fill(ds.Tables["Customers"]);
ds.Tables["Customers"].EndLoadData();
// Fill(DataTable) only consumes the first resultset, so point the
// adapter at the second query before filling the Employees table
da.SelectCommand.CommandText = "select * from Employees";
ds.Tables["Employees"].BeginLoadData();
da.Fill(ds.Tables["Employees"]);
ds.Tables["Employees"].EndLoadData();
dataGrid1.DataSource = ds.Tables["Customers"];
dataGrid2.DataSource = ds.Tables["Employees"];
cn.Close();
Note that the first key line of code sets the EnforceConstraints
property of the DataSet to false. This turns off constraint checking and
can offer faster overall processing. The setting can be turned back on
after the data is retrieved, inside a try/catch block that specifically
catches a ConstraintException, so we can still check for
constraint violations.
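A minimal in-memory sketch of that pattern (the table here is hypothetical, standing in for a DataSet filled from the database):

```csharp
using System;
using System.Data;

DataSet ds = new DataSet();
DataTable t = ds.Tables.Add("Customers");
DataColumn id = t.Columns.Add("CustomerID", typeof(string));
t.Constraints.Add(new UniqueConstraint(id));

// Turn off constraint checking while loading...
ds.EnforceConstraints = false;
t.Rows.Add("ALFKI");
t.Rows.Add("ALFKI"); // duplicate slips in unchecked

// ...then turn it back on inside a try/catch to surface any violations.
bool violationFound = false;
try
{
    ds.EnforceConstraints = true;
}
catch (ConstraintException)
{
    violationFound = true; // inspect or repair the offending rows here
}
Console.WriteLine(violationFound); // True
```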
By the way, establishing a primary key as part of setting
up your DataSet schema can add a significant performance overhead.
We can also temporarily turn off internal DataSet indexing and notifications.
Before and after each fill, call the BeginLoadData
and EndLoadData methods (shown above), respectively. These
methods are members of the DataTable class, so you need to call them
on the particular DataTable that you are filling. BeginLoadData
turns off notifications, index maintenance, and constraints while
loading data. This ain't "rocket science"; it's right there
in the MS documentation for the DataTable class.
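The effect is easy to demonstrate entirely in memory. This sketch (row counts here are illustrative, not from the article's test harness) loads the same rows into two keyed tables, one of them wrapped in BeginLoadData/EndLoadData:

```csharp
using System;
using System.Data;
using System.Diagnostics;

DataTable MakeTable()
{
    DataTable t = new DataTable();
    DataColumn id = t.Columns.Add("ID", typeof(int));
    t.PrimaryKey = new DataColumn[] { id }; // forces index maintenance per row
    return t;
}

// Plain load: every Rows.Add fires notifications and updates the index.
DataTable plain = MakeTable();
Stopwatch sw1 = Stopwatch.StartNew();
for (int i = 0; i < 100000; i++) plain.Rows.Add(i);
sw1.Stop();

// Bulk load: notifications, indexing, and constraints suspended.
DataTable bulk = MakeTable();
Stopwatch sw2 = Stopwatch.StartNew();
bulk.BeginLoadData();
for (int i = 0; i < 100000; i++) bulk.LoadDataRow(new object[] { i }, true);
bulk.EndLoadData(); // re-enables everything and validates the data
sw2.Stop();

Console.WriteLine("Plain: {0} ms, BeginLoadData: {1} ms",
    sw1.ElapsedMilliseconds, sw2.ElapsedMilliseconds);
```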
The performance improvement you will achieve with these techniques
will vary depending upon the size of the data being retrieved and the
number of constraints within your DataSet. On the simplest multi-table
DataSet, however, I saw an average performance improvement of some
31 percent, and I suspect that with complex master-detail relational
DataSets the improvement could be significantly higher.
Another test I did (just for comparison purposes) was to put together
a utility class with two methods: one to convert a DataReader to a DataTable,
and a second to convert a DataReader (one with more than one resultset,
of course) to a DataSet. The idea was to see whether bringing back a DataReader
for speed, and then converting it on the client side into the required DataTable
or DataSet for binding, say, to a DataGrid, would be worth it. Sadly, the
performance was actually slower than the regular DataAdapter.Fill() methods.
However, the class still has a number of uses, so I reproduce the code
below and also include it with the test-harness download at the
bottom of this article.
using System;
using System.Data;
using System.Data.SqlClient;

namespace PAB.Data.Utils
{
    public class DataReaderConverters
    {
        /// <summary>
        /// Converts one or more resultsets returned in a SqlDataReader
        /// to a DataSet.
        /// </summary>
        /// <param name="reader">SqlDataReader</param>
        /// <returns>System.Data.DataSet</returns>
        public static DataSet ConvertDataReaderToDataSet(SqlDataReader reader)
        {
            DataSet dataSet = new DataSet();
            do
            {
                // Create a new data table from the reader's schema
                DataTable schemaTable = reader.GetSchemaTable();
                DataTable dataTable = new DataTable();
                if (schemaTable != null)
                {
                    for (int i = 0; i < schemaTable.Rows.Count; i++)
                    {
                        DataRow dataRow = schemaTable.Rows[i];
                        string columnName = (string)dataRow["ColumnName"];
                        // Add the column definition to the data table
                        DataColumn column =
                            new DataColumn(columnName, (Type)dataRow["DataType"]);
                        dataTable.Columns.Add(column);
                    }
                    dataSet.Tables.Add(dataTable);
                    while (reader.Read())
                    {
                        DataRow dataRow = dataTable.NewRow();
                        for (int i = 0; i < reader.FieldCount; i++)
                            dataRow[i] = reader.GetValue(i);
                        dataTable.Rows.Add(dataRow);
                    }
                }
                else
                {
                    // No records returned; report the rows affected instead
                    DataColumn column = new DataColumn("RowsAffected");
                    dataTable.Columns.Add(column);
                    dataSet.Tables.Add(dataTable);
                    DataRow dataRow = dataTable.NewRow();
                    dataRow[0] = reader.RecordsAffected;
                    dataTable.Rows.Add(dataRow);
                }
            }
            while (reader.NextResult());
            return dataSet;
        }

        /// <summary>
        /// Converts a SqlDataReader to a DataTable.
        /// </summary>
        /// <param name="reader">SqlDataReader</param>
        /// <returns>System.Data.DataTable</returns>
        public static DataTable ConvertDataReaderToDataTable(SqlDataReader reader)
        {
            DataTable schemaTable = reader.GetSchemaTable();
            DataTable dt = new DataTable();
            System.Collections.ArrayList al = new System.Collections.ArrayList();
            for (int i = 0; i < schemaTable.Rows.Count; i++)
            {
                string columnName = schemaTable.Rows[i]["ColumnName"].ToString();
                // Guard against duplicate column names in the resultset
                if (!dt.Columns.Contains(columnName))
                {
                    DataColumn dc = new DataColumn();
                    dc.ColumnName = columnName;
                    dc.Unique = Convert.ToBoolean(schemaTable.Rows[i]["IsUnique"]);
                    dc.AllowDBNull = Convert.ToBoolean(schemaTable.Rows[i]["AllowDBNull"]);
                    dc.ReadOnly = Convert.ToBoolean(schemaTable.Rows[i]["IsReadOnly"]);
                    al.Add(dc.ColumnName);
                    dt.Columns.Add(dc);
                }
            }
            while (reader.Read())
            {
                DataRow row = dt.NewRow();
                for (int i = 0; i < al.Count; i++)
                    row[(string)al[i]] = reader[(string)al[i]];
                dt.Rows.Add(row);
            }
            return dt;
        }
    } // end class
} // end namespace
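As a footnote, if you are on a later framework version (2.0 or above), DataTable.Load() accepts any IDataReader and performs this conversion for you. A quick sketch, using an in-memory DataTableReader as a stand-in for a SqlDataReader:

```csharp
using System;
using System.Data;

// Stand-in for database results.
DataTable source = new DataTable("Customers");
source.Columns.Add("CustomerID", typeof(string));
source.Rows.Add("ALFKI");
source.Rows.Add("ANATR");

DataTable target = new DataTable();
using (DataTableReader reader = source.CreateDataReader())
{
    target.Load(reader); // schema and rows copied in one call
}
Console.WriteLine(target.Rows.Count); // 2
```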
Reader Paul Abraham notes that there didn't seem to be any significant
performance improvement from turning off constraints when using a Typed
DataSet. Out of curiosity, I added a Typed DataSet option to the downloadable
solution. As he noted, even with constraint checking turned off (XSD-generated
classes have EnforceConstraints set to "true" by default), there was no
performance improvement. In fact, loading the Typed DataSet was among the
slowest of all the choices.
Here are the results of my tests with 200 iterations (the average of
the three best times for each case). You can download the entire solution
complete with a WinForms test harness app, ready to go, and play with
it if you like. If you find any additional performance enhancements or
perhaps errors in my code, feel free to post your comments in our article
discussions area at our forums!
DataAdapter to DataSet:   44.93 seconds
DataAdapter (Enhanced):   30.76 seconds
DataReader Converter:     46.81 seconds
Download
the Code that accompanies this article
Peter Bromberg is a C# MVP, MCP, and .NET consultant who has worked in the banking and financial industry for 20 years. He has architected and developed web-based corporate distributed application solutions since 1995, and focuses exclusively on the .NET Platform.