A Good Solution for "Magic String" Data

Dealing with vendor data (or your own) in the form of "codes" can pose significant challenges. You must ensure that your source code remains readable, that data are properly validated, and that data can be displayed as user-friendly descriptions. The built-in solutions (named constants and enums) help, but they have some significant shortcomings. If you derive a class of named constants from the MagicStringTranslator class, though, you can vanquish all 3 challenges in one fell swoop!

Most software systems have to deal with "magic strings."  A magic string is a code that represents, rather than describes, the state of some entity. Allow me to illustrate with an example: many insurance companies obtain (with a consumer's fully informed consent) some personal data about the insured so that they can more accurately quantify the risk of underwriting a policy (and thus set the appropriate premiums).  A vendor might return the marital status of the insured as a code from the following list:

Marital Status Codes

Code    Description
M       Married
D       Divorced
X       Separated
S       Single
W       Widowed
U       Unknown

The code "M" would be the magic string that represents the "Married" status, and so forth.   

Three Problems Associated With Magic Strings

The first problem associated with magic strings is how to make the source code that handles them easy to understand. Woe to the lazy programmer who uses those magic strings in source code like this: 

switch (maritalStatus) // maritalStatus is a one-character string we got from the vendor and stored in our DB
{
    case "X" :
        DoA();
        break;
    case "S" :
        DoB();
        break;
    // etc.
}

Could you understand this code at a glance?  Probably not.  Does "X" signify "Unknown" or "Divorced" (as in "My kid is with my ex this weekend") or "Separated"?  Does "S" signify "Single" or "Separated"?  If you have to read the vendor documentation in order to understand source code, you're in a bad place.  (And good luck finding the vendor documentation!)

Data validation is also a problem when you handle magic strings.  You might want, for example, to verify at compile time that you are passing appropriate string data to a method.  If you want to update an instance of the Insured class with the marital status from the vendor, you might write a method like this:

public class Insured
{
    // object construction
    public static Insured CreateInstance() { return new Insured(); }
    protected Insured() { }

    public void UpdateStatus(string maritalStatus)
    {
        // implementation goes here...
    }
}

You would hope that by naming the string parameter appropriately ("maritalStatus") you would keep the careful colleague from entering inappropriate data.  However, from the compiler's perspective it would be completely legal to misuse the method like this:

Insured ins = Insured.CreateInstance();
ins.UpdateStatus("Widowed");

Those of you who have unit tests with 100% code coverage and rich boundary-checking would catch this quickly.  The other 99% of you (and this unfortunately includes me too often) would, at best, waste a lot of time debugging this error in a test environment.  And who knows, it could even move into production by accident--oh, the horror!

Worse yet, you could misuse the method in a way that is very difficult to detect.  If you have a set of codes for workflow status, and they happen to resemble the codes for marital status, you could plug them in--and create a truly insidious bug.

Insured ins = Insured.CreateInstance();
string status = myWorkflow.GetStatus(); // S = started, W = work-in-process, U = unknown, D = deleted, X = completed
ins.UpdateStatus(status);

This code would compile and never throw an exception.  You could only pray to catch this one before it hits production.

And of course you will probably want to validate the data a vendor sends you at the point at which you receive it, as well.  If you are receiving data in XML format and it is required to conform to a comprehensive XML schema, you can simply validate the data against the schema.  Unfortunately, even in this age of XML web services, your vendors may not publish an XML schema--and they may not even use XML.  Certainly ours do not.
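
When no schema is available, a check at the point of receipt might look like the following minimal sketch. The names VendorCodeCheck, KnownMaritalCodes, and IsKnownMaritalCode are hypothetical (they do not come from any vendor API), and the code list is simply copied from the Marital Status Codes table above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: rejects any code that is not in the documented
// marital status list as soon as the data arrives.
public static class VendorCodeCheck
{
    // Codes copied from the Marital Status Codes table; comparison is case-sensitive.
    private static readonly HashSet<string> KnownMaritalCodes =
        new HashSet<string>(StringComparer.Ordinal) { "M", "D", "X", "S", "W", "U" };

    public static bool IsKnownMaritalCode(string code)
    {
        // A missing value is never a valid code.
        return code != null && KnownMaritalCodes.Contains(code);
    }
}
```

A check like this rejects "Widowed", but it cannot tell a marital "S" from a workflow "S"; it validates the value, not its origin.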

A third problem associated with data encoded as a magic string is that your system somehow has to translate it into an understandable format in order to display it to a user.  Twenty years ago, green screen systems could get away with displaying something like "Marital Status: X."  The cost of training users to understand the meaning of 'X' was regarded as a cost of doing business.  Today businesses know that they do not have to accept such inferior application design; they expect systems to be easy to learn and use.

The Standard Solutions

A typical recommendation is to create (and use) named constants to represent the magic strings.  Ideally, you would group related codes into a common class, so in C# you might create the following class:

public static class MaritalStatus
{
    public const string Married = "M";
    public const string Divorced = "D";
    public const string Separated = "X";
    public const string Single = "S";
    public const string Widowed = "W";
    public const string Unknown = "U";
}

Then you would use the named constants in place of the magic strings to make your code more readable:

switch (maritalStatus) // maritalStatus is a one-character string we got from the vendor and stored in our DB
{
    case MaritalStatus.Separated :
        DoA();
        break;
    case MaritalStatus.Single :
        DoB();
        break;
    // etc.
}

While named constants solve the readability problem, they do nothing to help us validate data or provide a human readable description.  So let's continue our search for a solution by taking a look at a datatype built into the .NET Framework: enums.  You could define an enum...

public enum MaritalStatus
{
  Married = 1,
  Divorced = 2,
  //....
}

...then store enum values in your database.  When you want to display a description, the enum can help there as well; you can obtain the name associated with an enum's value by simply calling its ToString() method.  The following code will write "Married" to the console:

MaritalStatus ms = MaritalStatus.Married;
Console.WriteLine(ms.ToString());

An enum still has some drawbacks, however.  You must write some additional logic that translates between the magic string and the appropriate enum value, for one thing: 

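public class VendorCodeTranslator
{
  public static MaritalStatus MaritalStatusVendorToOurs(string vendorCode)
  {
    switch (vendorCode)
    {
      case "D": return MaritalStatus.Divorced;
      case "S": return MaritalStatus.Single;
      // and so forth
      default:
        // required: without it, not all code paths return a value and the method will not compile
        throw new ArgumentException("Unrecognized vendor code: " + vendorCode, "vendorCode");
    }
  }
}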

In addition, if you want your description to include a blank, or a character (such as "&" or "*") which is illegal in a programming token, you will not be able to get what you want from an enum.

Kill Three Birds With One Stone

As a programmer, you want to be able to define a class that associates a set of user-friendly names to a set of magic strings, and then let the class handle the 3 responsibilities of code clarity, data validation, and user-friendly display.  In fact, this is possible if your class inherits from the MagicStringTranslator class that I am about to define.  

    using System;
    using System.Collections.Generic;
    using System.Reflection;
    using System.Runtime.Serialization;

    [Serializable()]
    public abstract class MagicStringTranslator : ISerializable
    {
        #region fields

        private string m_magicCode;

        // one lookup dictionary per subclass, keyed by the subclass name
        private static readonly Dictionary<string, Dictionary<string, string>> m_dictTable =
            new Dictionary<string, Dictionary<string, string>>();

        // static, so that all instances lock on the same object while the shared table is initialized
        private static readonly object m_syncRoot = new object();

        #endregion

        #region construction / initialization

        public MagicStringTranslator(string code) : this(code, false) { }

        public MagicStringTranslator(string initString, bool parmIsDescription)
        {
            EnsureDictionary();

            // set the magic code for this instance
            this.m_magicCode = null;
            if (!parmIsDescription)
            {
                // verify that the code is one of the codes in our dictionary
                if (!LookupDict.ContainsKey(initString))
                    throw new MagicStringBadValueException(initString);
                this.m_magicCode = initString;
            }
            else
            {
                // find the matching description in our list, then set m_magicCode to the description's corresponding code
                foreach (KeyValuePair<string, string> kvp in LookupDict)
                {
                    if (initString == kvp.Value)
                    {
                        this.m_magicCode = kvp.Key;
                        break;
                    }
                }
                if (this.m_magicCode == null)
                    throw new MagicStringBadDescException(initString);
            }
        }

        // initialize the lookup dictionary using the double-checked locking pattern
        // (reads of the shared table outside the lock are not synchronized)
        private void EnsureDictionary()
        {
            if (LookupDict == null)
            {
                lock (m_syncRoot)
                {
                    if (LookupDict == null)
                    {
                        // instantiate the lookup dictionary
                        LookupDict = new Dictionary<string, string>();
                        InitializeDictionary();
                    }
                }
            }
        }

        protected virtual void InitializeDictionary()
        {
            // populate the dictionary with key-value pairs.  Value = name of public const string var, Key = string assigned to the var
            FieldInfo[] fields = this.GetType().GetFields();
            foreach (FieldInfo fi in fields)
            {
                string key = (string)fi.GetValue(this);
                string val;
                object[] attribs = fi.GetCustomAttributes(typeof(CustomDescriptionAttribute), false);

                // use the CustomDescription attribute if it exists
                if (attribs.Length != 0)
                {
                    val = ((CustomDescriptionAttribute)attribs[0]).Description;
                }
                // else use the name of the field
                else
                    val = fi.Name.Replace("_", " "); // Substitute a blank for an underscore in order to improve readability
                LookupDict.Add(key, val);
            }
        }

        #endregion

        #region properties

        public string Value
        {
            get { return this.m_magicCode; }
        }

        public string Description
        {
            get { return LookupDict[this.m_magicCode]; }
        }

        protected virtual Dictionary<string, string> LookupDict
        {
            get
            {
                Dictionary<string, string> myDict;
                m_dictTable.TryGetValue(this.MyKey, out myDict);
                return myDict;
            }
            set
            {
                m_dictTable.Add(this.MyKey, value);
            }
        }

        private string MyKey { get { return this.GetType().Name; } }

        #endregion

        #region ISerializable Members

        protected MagicStringTranslator(SerializationInfo info, StreamingContext context)
        {
            // a deserialized instance may be the first of its type in this process,
            // so make sure the dictionary exists before Description is used
            EnsureDictionary();
            this.m_magicCode = info.GetString("magicCode");
        }

        void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context)
        {
            info.AddValue("magicCode", this.m_magicCode);
        }

        #endregion

        #region overrides

        public override string ToString()
        {
            return this.Description;
        }

        #endregion
    }

When you subclass MagicStringTranslator, you declare a set of named constants, similar to the constants we saw in the MaritalStatus class.  The base class uses reflection to discover the named constants and values, and populates a dictionary of keys (magic strings) and values (descriptions) with them.  The base class then uses the dictionary to validate data (by verifying that they are in the dictionary) and to map the relationship between magic strings and descriptions.  In this fashion, MagicStringTranslator allows you to address all three of the magic string responsibilities (code clarity, data validation, and user-friendly display) simply by declaring a set of named constants.  Let's take a look at some sample code to see how this works.
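
Before turning to the subclass itself, the reflection step described above can be sketched in isolation. This is a simplified illustration, not the base class's actual method: DemoCodes and ConstantScanner are hypothetical names, the sketch ignores CustomDescriptionAttribute, and it reads constants with FieldInfo.GetRawConstantValue() rather than GetValue().

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for a subclass's named constants.
public static class DemoCodes
{
    public const string Married = "M";
    public const string Unknown_Status = "U";
}

public static class ConstantScanner
{
    // Builds a code -> description map from the public const string fields of a type.
    public static Dictionary<string, string> Scan(Type type)
    {
        Dictionary<string, string> map = new Dictionary<string, string>();
        foreach (FieldInfo fi in type.GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            // IsLiteral is true only for const fields; skip anything else
            if (!fi.IsLiteral || fi.FieldType != typeof(string))
                continue;

            string code = (string)fi.GetRawConstantValue();
            // the field name, with underscores turned into blanks, serves as the description
            map.Add(code, fi.Name.Replace("_", " "));
        }
        return map;
    }
}
```

Calling ConstantScanner.Scan(typeof(DemoCodes)) should yield a dictionary that maps "M" to "Married" and "U" to "Unknown Status", which is the same shape of data the base class consults for validation and display.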

Sample Code

First, define the MaritalStatus class as a subclass of MagicStringTranslator:

    public class MaritalStatus : MagicStringTranslator
    {
        [CustomDescription("Living happily ever after")]
        public const string Married = "M";
        public const string Divorced = "D";
        public const string Separated = "X";
        public const string Single = "S";
        public const string Widowed = "W";
        public const string Unknown_Status = "U";

        public MaritalStatus(string code) : base(code) { }
    }

You just need to remove the static qualifier on the previous MaritalStatus class,  make the class inherit from MagicStringTranslator, and define a constructor.  Note that we have decorated the Married name with a CustomDescriptionAttribute; this means that the description associated with the magic string "M" is "Living happily ever after" rather than the default (the constant name ["Married"]).  In addition, the description associated with magic string "U" is "Unknown Status," since MagicStringTranslator replaces any underscore in the constant name with a blank.

This example assumes that the vendor is passing data as codes, not descriptions.  If you receive data in the form of descriptions (believe it or not, we do), you would need to provide a second constructor with an extra boolean parameter that indicates whether the first parameter is a description or a value.

Now you can strongly type the parameter of your UpdateStatus method by defining it as an instance of class MaritalStatus:

public class Insured
{
    // object construction
    public static Insured CreateInstance() { return new Insured(); }
    protected Insured() { }

    public void UpdateStatus(MaritalStatus status)
    {
        // implementation goes here...
    }
}

And the code that calls the method can hardly go wrong:

Insured ins = Insured.CreateInstance();
MaritalStatus status = new MaritalStatus(MaritalStatus.Widowed);
ins.UpdateStatus(status);

Bear in mind that we haven't lost the ability to use the named constants for the sake of code clarity.  This code still works fine:

switch (maritalStatus.Value) // maritalStatus is an instance of the MaritalStatus class
{
    case MaritalStatus.Separated :
        DoA();
        break;
    case MaritalStatus.Single :
        DoB();
        break;
    // etc.
}
// Trace.WriteLine has no composite-format overload, so format the message first
Trace.WriteLine(string.Format("Magic string value is {0} and description is {1}", maritalStatus.Value, maritalStatus.Description));

The last line of code shows that we are also able to use the built-in capabilities of MagicStringTranslator to display a user-friendly description in place of a magic string, without having to write any additional logic.

As a bonus, the MagicStringTranslator class has other useful capabilities:

  • MagicStringTranslator implements the ISerializable interface in such a way that only the code/magic string is passed.  This makes using a subclass as a remote method's parameter type both simple and efficient.
  • The class overrides object.ToString() to return the description associated with the magic string used to initialize the instance.  This proves very useful when the polymorphic ToString() method is used to display the descriptions of an object's properties and one or more of those properties is typed to a subclass of MagicStringTranslator.

Make Wrong Code Look Wrong

So we have achieved code clarity, data validation, and the ability to translate a magic string into a user-friendly description.  Of course, it is still possible to fall asleep at the wheel and write a bug:

Insured ins = Insured.CreateInstance();
MaritalStatus status = new MaritalStatus(myWorkflow.GetStatus());
ins.UpdateStatus(status);

Yes, this will compile.  However, the mix-up between marital status and workflow status should hit you right between the eyes; this source code obeys the "Make Wrong Code Look Wrong" principle propounded by my favorite software development blogger, Joel Spolsky.  A developer who would write this bug needs to take a long vacation, for sure.

Implementation Options

Performance Improvements.  The base MagicStringTranslator class manages all the dictionaries of subclasses via an internal dictionary of dictionaries.  This allows the subclass to be coded as simply and clearly as possible.  If you are trying to squeeze every last cycle out of your CPU, though, your subclass can override the default LookupDict property by referencing its own static dictionary:

    public class MaritalStatus : MagicStringTranslator
    {
        public const string Married = "M";
        public const string Divorced = "D";
        public const string Separated = "X";
        public const string Single = "S";
        public const string Widowed = "W";
        public const string Unknown = "U";

        public MaritalStatus(string code) : base(code) { }

        // we are maintaining our own Dictionary<string, string> to increase processing efficiency
        private static Dictionary<string, string> myDict;

        // overrides
        protected override Dictionary<string, string> LookupDict
        {
            get { return myDict; }
            set { myDict = value; }
        }
    }

At run-time, the subclass' key/value dictionary is a simple return statement away; the base class' dictionary of dictionaries (and its reliance on discovering the name of the subclass by reflection) is bypassed.  In my testing, this has improved the performance of using the subclass' Value or Description property by over 100%.  However, the performance of the default implementation is already quite good; on my modest desktop box, test code can instantiate and use the properties of the MaritalStatus class (in a loop) 1000 times per millisecond.  Increasing this performance to 2400 times per millisecond does not justify the increased code complexity, in my opinion.  Of course, your situation may differ, so I offer the option just in case.
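
If you want to take a similar measurement on your own hardware, a rough timing loop might look like the sketch below. LoopTimer and PerMillisecond are hypothetical names; this is not the harness that produced the figures quoted above, and a simple Stopwatch loop gives only a coarse estimate.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper for coarse throughput comparisons.
public static class LoopTimer
{
    // Runs the action the given number of times and returns iterations per millisecond.
    public static double PerMillisecond(Action action, int iterations)
    {
        // warm-up call, so that one-time work (JIT compilation, dictionary setup) is not timed
        action();

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        // floating-point division: a zero elapsed time yields infinity rather than an exception
        return iterations / sw.Elapsed.TotalMilliseconds;
    }
}
```

You would pass it a delegate that constructs a MaritalStatus and reads its Value and Description properties, then compare the result for the default subclass against the one that overrides LookupDict.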

Dictionary Re-use.  You may also want to re-use a subclass' Dictionary to populate a drop-down list with descriptions and values.  Why should you duplicate any effort, anywhere in your application?  Since the responsibility for knowing how to populate lists would complicate the design of a MagicStringTranslator subclass, you would write a helper method in a utility class instead:

public class UiHelper
{
    /// <summary>
    /// Uses a Dictionary of key/value pairs as the source of ListItems for a drop-down list
    /// </summary>
    public static void PopulateDropDownList(DropDownList ddl, Dictionary<string, string> dict)
    {
        foreach (KeyValuePair<string, string> kvp in dict)
        {
            ListItem li = new ListItem(kvp.Value, kvp.Key);
            ddl.Items.Add(li);
        }
    }
}

Then add a static property to your subclass that furnishes a copy of its Dictionary.  (Of course you wouldn't hand the original to a stranger, which might add or remove key/value pairs at its own whim.)

    public class MaritalStatus : MagicStringTranslator
    {
        public const string Married = "M";
        public const string Divorced = "D";
        public const string Separated = "X";
        public const string Single = "S";
        public const string Widowed = "W";
        public const string Unknown = "U";

        public MaritalStatus(string code) : base(code) { }

        public static Dictionary<string, string> KeyValuePairs
        {
            get
            {
                MaritalStatus ms = new MaritalStatus(MaritalStatus.Married);
                Dictionary<string, string> result = new Dictionary<string, string>(ms.LookupDict.Count);
                foreach (KeyValuePair<string, string> kvp in ms.LookupDict)
                {
                    result.Add(kvp.Key, kvp.Value);
                }
                return result;
            }
        }
    }

Dictionary Initialization.  You may choose to override the InitializeDictionary method, perhaps by consulting a lookup table in the database.  However, you would forfeit the ability to define and use named constants if you follow this route.

Conclusion

Dealing with magic strings can pose a significant challenge when developing a .NET application.  The built-in solutions (named constants and enums) certainly help, but they have some shortcomings. If you derive a class of named constants from the MagicStringTranslator abstract class, though, you can get data validation and the mapping of magic string data to user-friendly descriptions at essentially no extra cost.

Appendix: Source Code for Helper Classes

    #region CustomDescriptionAttribute

    [AttributeUsage(AttributeTargets.Field)]
    public class CustomDescriptionAttribute : Attribute
    {
        private string desc;

        public CustomDescriptionAttribute(string desc)
        {
            this.desc = desc;
        }

        public string Description
        {
            get { return desc; }
        }
    }

    #endregion

    #region Exceptions

    public class MagicStringException : Exception
    {
        public MagicStringException(string data) : base(data + " passed to MagicStringTranslator class") { }
    }

    public class MagicStringBadValueException : MagicStringException
    {
        public MagicStringBadValueException(string val) : base("Unrecognized value (" + val + ")") { }
    }

    public class MagicStringBadDescException : MagicStringException
    {
        public MagicStringBadDescException(string desc) : base("Unrecognized description (" + desc + ")") { }
    }

    #endregion

Acknowledgements

I would like to thank the members of the Lean Programming Yahoo Group who helped me sharpen my analysis by offering feedback on an earlier draft of this article that I posted to the group.  I encourage anyone who is interested in improving his or her ability to write good code in today's business environment to come join us.
By Chris Falter
Biography - Chris Falter
Chris has spent about 14 years in software development, most recently serving as a lead developer for an insurer in South Carolina. He previously served as an Escalation Engineer and as an Application Development Consultant at Microsoft. A graduate of Princeton University's class of 1983, Chris has also taught high-school students, baked bagels, and lived among the West African poor.