Working with XML Data using PowerShell

In general, the main operations you might want to perform on arbitrary data in a data store are: parsing specific data out of the store, adding new data to the store, changing existing data in the store, traversing or displaying all the data in the store, and removing specified data from the store. For example, if you are working with SQL, these operations correspond roughly to SELECT, INSERT, UPDATE, and DELETE statements. The ability to work with XML data is a fundamental skill for virtually all software development, testing, and management roles.
You can take two slightly different approaches when working with XML data using PowerShell. First, PowerShell has an intrinsic [xml] type alias which, combined with the built-in PowerShell cmdlets, gives you a PowerShell-centric way of working with XML data. Second, because PowerShell can access the .NET Framework, you can take a .NET Framework-centric approach. The two approaches are quite similar, and you can mix and match them. Let me illustrate with a few examples.
In a .NET environment, the most common technique for working with XML is to go through the XmlDocument type. You can think of an XmlDocument object as an in-memory representation of an XML document. Suppose you have this XML document saved as a file named testCases.xml:
<?xml version="1.0" ?>
 <testCase id="001">
 <testCase id="002">
One way to read this document into memory as an XmlDocument object is:
[xml] $xd = get-content .\testCases.xml
If you issue a $xd.GetType() command you’ll see the object $xd has type XmlDocument. An equivalent way to read data into memory is:
[System.Xml.XmlDocument] $xd = get-content .\testCases.xml
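To make the equivalence of the two load statements concrete, here is a minimal sketch that runs both against the same file and compares the resulting types. The file contents, the temp-file name, and the variable names $xd1 and $xd2 are assumptions for illustration, not the original testCases.xml.

```powershell
# Hypothetical sample file; the root element and file name are assumed for this sketch.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'testCases-type-demo.xml'
@'
<?xml version="1.0" ?>
<testCases>
  <testCase id="001" />
  <testCase id="002" />
</testCases>
'@ | Set-Content -Path $path

# Load the same file with each of the two type annotations.
[xml] $xd1 = Get-Content $path
[System.Xml.XmlDocument] $xd2 = Get-Content $path

# Both variables hold the same .NET type.
$type1 = $xd1.GetType().FullName
$type2 = $xd2.GetType().FullName
Write-Host $type1
Write-Host $type2
```

Both Write-Host calls should print System.Xml.XmlDocument, since [xml] is shorthand for that type.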
I prefer the first approach for a subtle technical reason. In both cases if you issue a $xd | get-member command you will see that PowerShell does expose the underlying methods of the $xd object but does not expose the underlying properties. So, the second approach above is ever so slightly misleading in the sense that the $xd object is not 100% equivalent to the .NET XmlDocument type. In most situations you don’t need the underlying XmlDocument properties, but if you do need the XmlDocument properties you can get the full .NET XmlDocument type by using the PsBase property along the lines of:
[xml] $xd = get-content .\testCases.xml
$xdfull = $xd.psbase
write-host $xdfull.outerxml
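The difference between the two views can be sketched side by side. In the snippet below, the XML string and the variable names are made up for illustration. The first access goes through the PowerShell view, where elements and attributes appear as properties, and the other two go through psbase to reach .NET XmlDocument properties.

```powershell
# Hypothetical in-memory document; element names are assumed for this sketch.
[xml] $xd = '<testCases><testCase id="001" /><testCase id="002" /></testCases>'

# PowerShell view: child elements and attributes are reachable with dot notation.
$firstId = $xd.testCases.testCase[0].id

# .NET view: psbase reaches the underlying XmlDocument properties.
$rootName = $xd.psbase.DocumentElement.Name
$markup = $xd.psbase.OuterXml

Write-Host $firstId
Write-Host $rootName
Write-Host $markup
```

In practice the dot-notation view is usually enough for reading values, and psbase is only needed when a specific XmlDocument property is required.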
In my examples so far I have used the get-content cmdlet to read XML data into an XmlDocument object. An alternative, more .NET-like approach is:
[xml] $xd = new-object System.Xml.XmlDocument
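On its own, the line above constructs an empty document, so a load step is still needed. The following is one possible sketch using the XmlDocument Load method. The temp-file path and sample contents are assumptions for illustration. Note that .NET methods resolve relative paths against the process working directory rather than the current PowerShell location, so passing a full path is the safer choice.

```powershell
# Hypothetical sample file; name and contents are assumed for this sketch.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'testCases-load-demo.xml'
'<testCases><testCase id="001" /><testCase id="002" /></testCases>' | Set-Content -Path $path

# Construct an empty XmlDocument, then fill it from the file.
$xd = New-Object System.Xml.XmlDocument
$xd.Load($path)   # full path, so the process working directory does not matter

# Count the test cases to confirm the document was read.
$count = $xd.SelectNodes('/testCases/testCase').Count
Write-Host $count
```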
However, this approach is again mildly misleading because $xd does not have its XML properties exposed. To summarize, when loading XML data into memory as an XmlDocument object, I recommend staying as close as possible to a purely PowerShell-style approach:
[xml] $xd = get-content .\testCases.xml
Your code is shorter, and you don’t risk slightly misleading code reviewers, because the intrinsic PowerShell [xml] type is not 100% equivalent to the .NET [System.Xml.XmlDocument] type.
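As a rough illustration of the five data-store operations listed at the start, here is a sketch that applies each one to an [xml] object. The document contents, the expected child element, and all variable names are hypothetical. The sketch uses only dot-notation properties and XmlDocument methods, so it does not depend on psbase.

```powershell
# Hypothetical document; the root element and <expected> children are assumed.
[xml] $xd = @'
<testCases>
  <testCase id="001"><expected>alpha</expected></testCase>
  <testCase id="002"><expected>beta</expected></testCase>
</testCases>
'@
$root = $xd.SelectSingleNode('/testCases')

# Select: pick out one test case by its id attribute.
$tc = $xd.testCases.testCase | Where-Object { $_.id -eq '002' }
$selected = $tc.expected

# Insert: create a new element and append it to the root.
$new = $xd.CreateElement('testCase')
$new.SetAttribute('id', '003')
[void] $root.AppendChild($new)

# Update: change the text of an existing element.
$tc.expected = 'gamma'

# Traverse: collect the id of every test case.
$ids = $xd.testCases.testCase | ForEach-Object { $_.id }

# Delete: remove the test case with id 001.
$first = $xd.SelectSingleNode("/testCases/testCase[@id='001']")
[void] $root.RemoveChild($first)
```

The select, insert, update, and delete steps line up with the SQL analogy, and the traversal step is simply a pipeline over the element collection.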