DataReaders, DataSets, and performance
By Peter A. Bromberg, Ph.D.

DataReader vs. DataSet Performance Characteristics

DataReaders are faster than DataSets populated via DataAdapters, right? Well, yes, and no.

The DataSet provides useful functionality, including the following (a short code sketch of a few of these features appears after the list):

  • Master-detail relationships with integrity checking
  • Client-side computed columns and aggregations that stay up to date as the underlying data changes
  • Client-side filtering and sorting that doesn't require round trips to the database and stays up to date as the underlying data changes
  • A broad choice of data-binding options, including full two-way data binding with Windows Forms, the DataGrid, and other .NET controls
  • Integration with the DataAdapter for automatic database updates
  • The ability to read and write XML representations of relational data directly
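
As a quick illustration, here is a sketch of a few of these features in code. It assumes a DataSet (ds) that has already been filled with the Northwind Customers and Orders tables; the column and grid names are just placeholders.

// ds: a DataSet assumed to already contain filled Customers and Orders tables

// Master-detail relationship with integrity checking
ds.Relations.Add("CustOrders",
    ds.Tables["Customers"].Columns["CustomerID"],
    ds.Tables["Orders"].Columns["CustomerID"]);

// Client-side computed column; stays current as the underlying rows change
ds.Tables["Orders"].Columns.Add("DoubleFreight", typeof(decimal), "Freight * 2");

// Client-side filtering and sorting with no round trip to the database
DataView view = new DataView(ds.Tables["Customers"],
    "Country = 'Germany'", "CompanyName ASC", DataViewRowState.CurrentRows);

// Data binding: point a grid at the filtered, sorted view
dataGrid1.DataSource = view;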




At some point, however (if it hasn't already), the controversy will arise that DataReaders are faster than DataAdapters/DataSets and that you should wean yourself away from all those features and use the lean, mean DataReader instead. After all, the DataAdapter.Fill() method actually uses a DataReader internally, so in theory it can only hope to approach the performance of a DataReader alone.

However, there are other issues involved that may not come into focus on first pass. In reality, on a multi-user system, a DataReader holds its connection (and any locks) open for the entire duration of your looping logic (e.g., while (myReader.Read()) { ... }). Conversely, a DataAdapter/DataSet releases the connection and locks as soon as the Fill() call completes. Using DataReaders can therefore lead to increased database contention, slower performance, and poor scalability, depending on concurrent traffic under load. Of course, the only way to get a realistic handle on all this is to test your code in a simulated production environment.
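
To make the contrast concrete, here is a rough sketch of the two access patterns (the connection string and query are placeholders, and the per-row work is elided):

string strConn = "Server=(local);database=Northwind;Integrated Security=SSPI;";

// DataReader: the connection (and any locks) is held open for as long
// as the Read() loop runs.
using (SqlConnection cn = new SqlConnection(strConn))
{
    cn.Open();
    SqlCommand cmd = new SqlCommand("select * from Customers", cn);
    SqlDataReader rdr = cmd.ExecuteReader();
    while (rdr.Read())
    {
        // per-row processing happens while the connection is still busy
    }
    rdr.Close();
}

// DataAdapter/DataSet: the connection is needed only while Fill() runs;
// it is released before any client-side processing begins.
DataTable customers = new DataTable("Customers");
using (SqlConnection cn2 = new SqlConnection(strConn))
{
    SqlDataAdapter da = new SqlDataAdapter("select * from Customers", cn2);
    da.Fill(customers);   // Fill opens and closes the connection itself
}
// work with 'customers' here, with no connection held open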

Improving DataAdapter Performance

You can, however, improve the performance of the DataAdapter so that it approaches the speed of a DataReader. This is done by temporarily turning off some of the DataSet's advanced default functionality during processing. The biggest performance hits during DataSet retrieval are the integrity constraints and the indexing that the DataSet uses to maintain its data internally.

The code listing below fills a DataSet containing two DataTables, Customers and Employees, from the Northwind database via a DataAdapter.

 

DataSet ds = new DataSet();
string strConn = "Server=(local);database=Northwind;user id=sa;password=;";
SqlConnection cn = new SqlConnection(strConn);
cn.Open();
SqlDataAdapter da = new SqlDataAdapter("select * from Customers", cn);
ds.Tables.Add("Customers");
ds.Tables.Add("Employees");
// Turn off constraint checking while the tables load
ds.EnforceConstraints = false;
// Suspend notifications and index maintenance on each table during its fill
ds.Tables["Customers"].BeginLoadData();
da.Fill(ds.Tables["Customers"]);
ds.Tables["Customers"].EndLoadData();
// Fill(DataTable) only reads the first result set, so point the
// adapter at the Employees query before the second fill
da.SelectCommand.CommandText = "select * from Employees";
ds.Tables["Employees"].BeginLoadData();
da.Fill(ds.Tables["Employees"]);
ds.Tables["Employees"].EndLoadData();
dataGrid1.DataSource = ds.Tables["Customers"];
dataGrid2.DataSource = ds.Tables["Employees"];
cn.Close();


Note that the first key line of code sets the DataSet's EnforceConstraints property to false. This turns off constraint checking and can make overall processing faster. After the data is retrieved, the setting can be turned back on inside a try/catch block that specifically catches a ConstraintException, so we can still check for constraint violations.
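
For example, a minimal sketch of re-enabling the constraints after the fills in the listing above might look like this:

try
{
    // Turning constraints back on validates all loaded rows; if any row
    // violates a constraint, a ConstraintException is thrown.
    ds.EnforceConstraints = true;
}
catch (ConstraintException)
{
    // Inspect the offending rows, e.g. ds.Tables["Customers"].GetErrors()
}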

By the way, establishing a primary key as part of setting up your DataSet schema can add significant performance overhead.

We can also temporarily turn off the DataSet's internal indexing and notifications. Before and after each fill, call the BeginLoadData and EndLoadData methods (shown above), respectively. These are members of the DataTable class, so you need to call them on the particular DataTable you are filling. BeginLoadData turns off notifications, index maintenance, and constraint checking while data loads. This ain't rocket science; it's right there in the MS documentation for the DataTable class.

The performance improvements that you will achieve with these techniques will vary depending on the size of the data being retrieved and the number of constraints within your DataSet. On the simplest multi-table DataSet, however, I saw an average performance improvement of some 31 percent, and I suspect that with complex master-detail relational DataSets the improvement could be significantly higher.

Another test I did (just for comparison purposes) was to put together a utility class with two methods: one to convert a DataReader to a DataTable, and a second to convert a DataReader (one with more than one resultset, of course) to a DataSet. The idea was to see whether bringing back a DataReader for speed, and then converting it on the client side into the required DataTable or DataSet for binding (say, to a DataGrid), would be worth it. Sadly, the performance was actually slower than the regular DataAdapter.Fill() approach. However, the class still has a number of uses, so I reproduce the code below and also include it in the test-harness download at the bottom of this article.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Reflection;

namespace PAB.Data.Utils
{
    public class DataReaderConverters
    {
        /// <summary>
        /// [static] PAB.Data.Utils.DataReaderConverters.ConvertDataReaderToDataSet:
        /// Converts one or more resultsets returned in a SqlDataReader to a DataSet
        /// </summary>
        /// <param name="reader">SqlDataReader</param>
        /// <returns>System.Data.DataSet</returns>
        public static DataSet ConvertDataReaderToDataSet(SqlDataReader reader)
        {
            DataSet dataSet = new DataSet();
            do
            {
                // Create a new data table from the schema of the current result set
                DataTable schemaTable = reader.GetSchemaTable();
                DataTable dataTable = new DataTable();

                if (schemaTable != null)
                {
                    for (int i = 0; i < schemaTable.Rows.Count; i++)
                    {
                        DataRow dataRow = schemaTable.Rows[i];
                        // Get the column name from the schema row
                        string columnName = (string)dataRow["ColumnName"];
                        // Add the column definition to the data table
                        DataColumn column = new DataColumn(columnName, (Type)dataRow["DataType"]);
                        dataTable.Columns.Add(column);
                    }

                    dataSet.Tables.Add(dataTable);

                    while (reader.Read())
                    {
                        DataRow dataRow = dataTable.NewRow();

                        for (int i = 0; i < reader.FieldCount; i++)
                            dataRow[i] = reader.GetValue(i);

                        dataTable.Rows.Add(dataRow);
                    }
                }
                else
                {
                    // No records returned: report the rows-affected count instead
                    DataColumn column = new DataColumn("RowsAffected");
                    dataTable.Columns.Add(column);
                    dataSet.Tables.Add(dataTable);
                    DataRow dataRow = dataTable.NewRow();
                    dataRow[0] = reader.RecordsAffected;
                    dataTable.Rows.Add(dataRow);
                }
            }
            while (reader.NextResult());
            return dataSet;
        }

        /// <summary>
        /// [static] PAB.Data.Utils.DataReaderConverters.ConvertDataReaderToDataTable:
        /// Converts the first resultset of a SqlDataReader to a DataTable
        /// </summary>
        /// <param name="reader">SqlDataReader</param>
        /// <returns>System.Data.DataTable</returns>
        public static DataTable ConvertDataReaderToDataTable(SqlDataReader reader)
        {
            System.Data.DataTable table = reader.GetSchemaTable();
            System.Data.DataTable dt = new System.Data.DataTable();
            System.Data.DataColumn dc;
            System.Data.DataRow row;
            System.Collections.ArrayList al = new System.Collections.ArrayList();

            // Build the DataTable's columns from the reader's schema table
            for (int i = 0; i < table.Rows.Count; i++)
            {
                dc = new System.Data.DataColumn();
                if (!dt.Columns.Contains(table.Rows[i]["ColumnName"].ToString()))
                {
                    dc.ColumnName = table.Rows[i]["ColumnName"].ToString();
                    dc.Unique = Convert.ToBoolean(table.Rows[i]["IsUnique"]);
                    dc.AllowDBNull = Convert.ToBoolean(table.Rows[i]["AllowDBNull"]);
                    dc.ReadOnly = Convert.ToBoolean(table.Rows[i]["IsReadOnly"]);
                    al.Add(dc.ColumnName);
                    dt.Columns.Add(dc);
                }
            }
            // Copy each row from the reader into the DataTable
            while (reader.Read())
            {
                row = dt.NewRow();
                for (int i = 0; i < al.Count; i++)
                {
                    row[(System.String)al[i]] = reader[(System.String)al[i]];
                }
                dt.Rows.Add(row);
            }
            return dt;
        }
    } // end class
} // end namespace
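
As a usage sketch (the connection string is a placeholder), the converters can be called like this:

string strConn = "Server=(local);database=Northwind;Integrated Security=SSPI;";
SqlConnection cn = new SqlConnection(strConn);
cn.Open();
SqlCommand cmd = new SqlCommand("select * from Customers;select * from Employees", cn);
SqlDataReader reader = cmd.ExecuteReader();
// Two result sets come back, so convert the reader to a two-table DataSet
DataSet ds = PAB.Data.Utils.DataReaderConverters.ConvertDataReaderToDataSet(reader);
reader.Close();
cn.Close();
dataGrid1.DataSource = ds.Tables[0];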

 

Reader Paul Abraham notes that there didn't seem to be any significant performance improvement when turning off constraints in a Typed DataSet. Out of curiosity, I added a Typed DataSet option to the downloadable solution. As he noted, even with constraint checking turned off (XSD-generated classes set EnforceConstraints to true by default), there was no performance improvement. In fact, loading the Typed DataSet was among the slowest of all the choices.
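
For reference, turning constraints off on a Typed DataSet looks just like the untyped case; NorthwindDataSet below is only a stand-in for whatever class name your XSD generates:

// NorthwindDataSet is a hypothetical XSD-generated Typed DataSet; its
// constructor normally sets EnforceConstraints to true.
NorthwindDataSet tds = new NorthwindDataSet();
tds.EnforceConstraints = false;
tds.Customers.BeginLoadData();
da.Fill(tds.Customers);   // da: a SqlDataAdapter selecting from Customers
tds.Customers.EndLoadData();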

Here are the results of my tests with 200 iterations (the average of the three best times for each case). You can download the entire solution, complete with a ready-to-go WinForms test-harness app, and play with it if you like. If you find any additional performance enhancements, or perhaps errors in my code, feel free to post your comments in the article discussion area at our forums!

 

DataAdapter to DataSet    DataAdapter (Enhanced)    DataReader Converter
44.93 seconds             30.76 seconds             46.81 seconds

Download the Code that accompanies this article


Peter Bromberg is a C# MVP, MCP, and .NET consultant who has worked in the banking and financial industry for 20 years. He has architected and developed web-based corporate distributed application solutions since 1995, and focuses exclusively on the .NET Platform.