Kirx' Blog - kirxblog.wordpress.com

Server-free Reporting with App-V 5 (Part 2) | May 21, 2014


In the first part of this series we talked about how to collect App-V 5 Reporting data with just a simple file share. In this second part we’ll focus on how to prepare the collected data for further processing.

At the end of the first part about App-V 5 Reporting I emphasized that the centrally collected .XML files alone are not immediately useful for generating a graphical representation. Let’s recap what that first article said about the .XML files:

  • They contain binary information that makes XML processing practically impossible.
  • Each file contains all the information, whereas the information is nicely separated into different tables in the App-V Reporting Server and SQL database scenario.
  • Data from one and the same client may be spread across several .XML files.
  • We still don’t have any method to present the data afterwards.

Malformed XML files with binary data

When you open one of the centrally stored XML files in an XML viewer, you will probably see nothing at all or receive an error message. Internet Explorer shows nothing.

XML_IE

 

When you open it with Notepad++, there is that [NUL] placeholder at the end of the XML file.

XML_NPP

When you view the file with a hex editor you will notice that there isn’t only a 00 [hex] value at the end, but also two more non-printable characters.

XML_NPP_Hex
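The same check works without a hex editor by dumping the last bytes of a file from PowerShell. This is a minimal sketch: the sample file built here is made up and only mimics the pattern described above, under the assumption that the file is UTF-16 text ending in the characters CR, LF and NUL. Real reporting files may differ.

```powershell
# Made-up sample file: UTF-16 LE text followed by the characters CR, LF and NUL.
$samplePath = Join-Path ([System.IO.Path]::GetTempPath()) 'appv-sample-tail.xml'
$sampleText = '<CLIENT_DATA Host="PC01"></CLIENT_DATA>' + [char]0x0d + [char]0x0a + [char]0x00
[System.IO.File]::WriteAllText($samplePath, $sampleText, [System.Text.Encoding]::Unicode)

# Read all bytes and show the last six as hex (each UTF-16 character takes two bytes).
$bytes = [System.IO.File]::ReadAllBytes($samplePath)
$tail = $bytes[($bytes.Length - 6)..($bytes.Length - 1)]
$tailHex = ($tail | ForEach-Object { $_.ToString('x2') }) -join ' '
$tailHex
```

For this sample the three trailing characters show up as six bytes, because every character is stored with a second 00 byte.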

The fancy thing about these three little characters is that they confuse PowerShell about how to treat the file. I won’t tell you about all the pain and hassle I went through trying to do anything with such a file. Of course any attempt to load it as XML right from the start failed, as these files aren’t well-formed XML at all. Essentially, ‘plain text’ imports and exports from and to various encodings (Unicode, UTF, ANSI with their variations) either resulted in unreadable content (looking like Asian characters) or in text where each readable character is followed by a non-printable 00 [hex], looking like ‘< C L I E N T _ D A T A …’, which is unusable too.

Finally I stumbled upon the -Raw switch of Get-Content, which returns the whole file as a single string. (And, well, no: the encoding feature is not well documented for the Get-Content cmdlet, only for various cmdlets that save data into files.)

$tTextFileContentArray = Get-Content $tReplacementSourceFilePath -Encoding Unicode -Raw -ErrorAction Stop

Now that the content was finally available, replacing the binary data after the actual XML content was quite easy, as it always appears to be the same data: 0d 0a 00 [hex]. A simple conversion of that hex data into a (non-printable) string, followed by a string replacement, did the trick. In the following snippet, the $tHexInput passed to the function is the string ‘0d 0a 00’:

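Function _Convert-HexToString ([string]$tHexInput) {
   # Split the space-separated hex values, e.g. '0d 0a 00'; Trim() guards against a trailing space
   $tHexDataList = $tHexInput.Trim().Split(' ')
   $tSearchString = ''
   FOREACH ($tHexDataValue IN $tHexDataList) {
         # Convert each hex value to its character and append it to the search string
         $tSearchString += [char]([Convert]::ToInt16($tHexDataValue, 16))
   }
   return $tSearchString
}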
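Put together, reading, cleaning and parsing might look like the following. This is a minimal sketch, not the downloadable script: the sample file, its path and the Host value are made up, and the file is assumed to be UTF-16 text that ends in the characters CR, LF and NUL.

```powershell
# Made-up sample: UTF-16 LE XML text followed by the characters CR, LF and NUL.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'appv-sample-raw.xml'
$text = '<CLIENT_DATA Host="PC01"><APP_RECORDS /></CLIENT_DATA>' + [char]13 + [char]10 + [char]0
[System.IO.File]::WriteAllText($path, $text, [System.Text.Encoding]::Unicode)

# Build the search string from the hex values (same idea as the function above).
$search = -join ('0d 0a 00'.Split(' ') | ForEach-Object { [char][Convert]::ToInt32($_, 16) })

# Read the whole file as one string, strip the trailing characters, then parse as XML.
$raw = Get-Content -LiteralPath $path -Encoding Unicode -Raw -ErrorAction Stop
$clean = $raw.Replace($search, '')
$xml = [xml]$clean
$xml.CLIENT_DATA.Host
```

Without the Replace step the [xml] cast fails on this sample, because a NUL character is not allowed in XML content.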

In the full code for this script you may notice that originally these better-formed XML files were saved to a new location, but for the final script this wasn’t necessary (as the data is processed in RAM). If you want to have those files, just un-comment the corresponding lines.

# write the non-binary XML content into files, just in case someone needs them
     # $DestXmlFilePath = $DestFolder + 'NoBin\' + $DestFolderXmlFileName
     # Out-File -FilePath $DestXmlFilePath -InputObject $TempFileContent -Force

Also note that I don’t call it ‘well-formed’ XML yet, as the files still don’t (and never will) have that <?xml version="1.0" …?> declaration.

So, after that adjustment we have a bunch of files that don’t only have an .XML file extension, but really follow some basic formatting rules. Now we can unleash the Might and Magic of PowerShell for XML data processing.
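The XML declaration is optional in XML 1.0, so PowerShell’s [xml] type accelerator parses such content without it. Here is a minimal sketch with a made-up CLIENT_DATA element (the Host value is hypothetical). It also shows how a declaration could be added if a downstream tool insists on one:

```powershell
# Made-up sample content without an XML declaration.
$doc = [xml]'<CLIENT_DATA Host="PC01" />'
$hostName = $doc.CLIENT_DATA.Host

# Optional: add a declaration in front of the root element.
$declaration = $doc.CreateXmlDeclaration('1.0', 'utf-8', $null)
$null = $doc.InsertBefore($declaration, $doc.DocumentElement)
$doc.OuterXml
```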

 

Mandatory XML Data Separation and Merging

When you look at the XML files (now with a decent XML viewer!) you may identify two challenges. One is that each file contains three different information sections:

  • One section with client information (name, OS version and some others)
  • One section with package information (name, PackageGUID + VersionGUID and others)
  • One section with the actual application usage information (name, version, user name, launch time, end time and others)

XML_File_Content

This separation (or ‘normalization’, as it is called in relational database models) is useful for instance with SQL Server, as there are nice methods to re-join the data later on. An Application Usage table could hold the info about the application launches, but only ID-based cross-references to another table holding the package info, a third table with the client info, a fourth one with application info and so on. The advantage of such data normalization is the reduction of redundant information: a client’s information (name, OS version, App-V client version) is not stored thousands of times in each application usage record, but only once.

Anyway, in an XML-based reporting scenario you may want to have the data as plain as possible, without sections of differently structured data in each and every file, because re-merging data that spans various sections or even files is not that easy. Because we are mainly interested in the application launch data, it might be a good idea to separate that, potentially de-normalizing it.

The second challenge is that a single client may generate several XML files on the file server, whereas the file name doesn’t give any indication about the actual client name. Because the client name is sort-of interesting (you do want to know which of your Remote Session hosts is used most, right?) we have to get the client name information and process it somewhere.

Finally, there is some data that you potentially just don’t care about.

For this article, I wanted to generate some application usage reports, therefore I defined the scope as follows.

Relevant information
  • Client Name
  • User Name
  • Application Name (+Version)
  • Application Start Time
Unimportant information
  • Client information (OS Version, App-V Client Version… in fact something that could be interesting)
  • All Package info (as there is no relation between app info and package info that can be extracted from the reporting data anyway)
  • Application close time, app status (always the same), report data upload server name

So, the XML file should contain the following mandatory information:

  • Client Name
  • User Name
  • Application Name (+Version)
  • Application Start Time (+Date)

And a single file should contain the information about all clients.

Achieving this is not complicated. After every input .XML file has been liberated from the binary data, the following function can be called. It reads each individual APP_RECORD from the original file and adds the ClientName attribute to that element. Note that we also chop the entire launch-time value down to just the launch date (day) and add this as an attribute to the element.

FUNCTION _Prepare-XmlContent ($tXmlContent, $tFileNameGuid) {
      $tFileGuid = $tFileNameGuid
      $tClientName = $tXmlContent.CLIENT_DATA.Host
      $tAppRecordXmlDocument = New-Object XML
      $tAppRecordXmlElement = $tAppRecordXmlDocument.CreateElement('APP_RECORDS')
      $tAppRecordXmlElement.SetAttribute('Status', 'Added ClientName and LaunchDate')
      # === Application Records Files ===
      $tXmlAppRecords = $($tXmlContent.CLIENT_DATA.APP_RECORDS.APP_RECORD)
      FOREACH ($tXmlAppRecord in $tXmlAppRecords) {
            # Keep only the date part of a value like 2014-04-30T14:25:17.123
            $tLaunchDateString = [string]$tXmlAppRecord.Launched
            $tLaunchDateArray = $tLaunchDateString.Split('T')
            $tLaunchDate = $tLaunchDateArray[0]
            $tXmlAppRecord.SetAttribute('ClientName', $tClientName)
            $tXmlAppRecord.SetAttribute('LaunchDate', $tLaunchDate)
            # Append the record to the script-scoped collecting element
            # ($script:AppRecordXmlElement must already exist; it is not created in this excerpt)
            $script:AppRecordXmlElement.AppendChild($script:AppRecordXmlElement.OwnerDocument.ImportNode($tXmlAppRecord, $true)) | Out-Null
      }
      Return
}
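How the records of several clients end up in one document can be sketched as follows. This is a simplified, self-contained illustration and not the downloadable script: the two input documents, the host names and the Name attribute are made up, and only the Host and Launched attributes are taken from the function above.

```powershell
# Made-up input: two already-cleaned reporting documents from different clients.
$inputs = @(
    '<CLIENT_DATA Host="PC01"><APP_RECORDS><APP_RECORD Name="AppA" Launched="2014-04-30T14:25:17.123" /></APP_RECORDS></CLIENT_DATA>',
    '<CLIENT_DATA Host="PC02"><APP_RECORDS><APP_RECORD Name="AppB" Launched="2014-05-01T08:02:00.000" /></APP_RECORDS></CLIENT_DATA>'
)

# Target document with a single APP_RECORDS root that collects everything.
$merged = New-Object System.Xml.XmlDocument
$root = $merged.AppendChild($merged.CreateElement('APP_RECORDS'))

foreach ($text in $inputs) {
    $doc = [xml]$text
    $clientName = $doc.CLIENT_DATA.Host
    foreach ($record in $doc.SelectNodes('/CLIENT_DATA/APP_RECORDS/APP_RECORD')) {
        $record.SetAttribute('ClientName', $clientName)
        $record.SetAttribute('LaunchDate', $record.GetAttribute('Launched').Split('T')[0])
        # ImportNode copies the element into the target document before it is appended.
        $null = $root.AppendChild($merged.ImportNode($record, $true))
    }
}
$merged.OuterXml
```

ImportNode is needed because an XML node belongs to the document that created it and cannot be appended to another document directly.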

Optional XML Data Preparation

One of the challenges with the time stamps in the data is that they appeared too detailed for me (hey, what’s this – a German complaining about über-precise data?!).

In the reports you probably don’t want a million data points on a per-millisecond basis. Instead you’d probably like to see the application launches grouped by day, by hour, or by some finer or coarser period.

In fact most reporting solutions allow you to group individual data points, but as you may have experienced yourself, there are only a few IT people who really know how to get the most out of reporting solutions (this is somewhat different when you go to your accounting or controlling department: there you find reporting masters). So, because of this lack of knowledge in data analysis, I tried to prepare some of the groupings for the targeted application usage report.

What can be done is to use PowerShell to condense date and time information from something like ‘2014-04-30T14:25:17.123’ to a single day (2014-04-30), a full hour (2014-04-30T14:00) or smaller pieces down to 5-minute chunks (2014-04-30T14:25), and then write this artificial launch time group back to the XML file’s application record.

These calculations take their time, and of course they increase the size of the overall XML file, but as they make presenting the data easier, I think it’s worth considering.

So, let’s start with calculating these artificial groupings:

In the upcoming snippet, $tGroupPeriod is the significant value. If the data should be grouped to full hours, this value is set to 60. If data should be grouped to a quarter of an hour, it has to be set to 15 (poor English speaking folks. Over here, we have a single word for that, ‘Viertelstunde’… but we could tell an entire novel in just one word). The function always rounds the actual value down to the beginning of a period. Every time between 8:15:00 and 8:29:59 will be melted down to 8:15 in a 15-minute group. The $tInputTimeString value has an expected format of [hh:mm:ss.milli].

function _Group-Time { param ($tInputTimeString, $tGroupPeriod)
      $tInputTimeArray = $tInputTimeString.Split(':')
      $tInputHr = $tInputTimeArray[0]
      $tInputMin = $tInputTimeArray[1]
      # $tInputSecAndMillisec = $tInputTimeArray[2] # not used at all
      # Round the minutes down to the start of the group period
      $tCalculatedMinutes = ([Math]::Truncate($tInputMin / $tGroupPeriod) * $tGroupPeriod)
      IF ($tCalculatedMinutes -lt 10) {[string]$tCalculatedMinutesStr = '0' + [string]$tCalculatedMinutes} # add leading zero if 0..9
            ELSE {[string]$tCalculatedMinutesStr = $tCalculatedMinutes}
      IF ($tCalculatedMinutesStr.Length -lt 2) {$tCalculatedMinutesStr = '0' + $tCalculatedMinutesStr}
      IF (([string]$tInputHr).Length -eq 1) {[string]$tInputHr = '0' + $tInputHr}      # add leading 0 if length = 1
            ELSE {[string]$tInputHr = $tInputHr}
      $tFormattedTimeString = $tInputHr + ':' + $tCalculatedMinutesStr
      return $tFormattedTimeString
}

I admit that especially the way of adding leading zeroes isn’t super smart, but hey, I’ve been working for a consulting company with an HQ in the Netherlands, and those people taught me that achieving a goal is sometimes sufficient, even if the way there is not super-perfect ;-)

The core of this snippet is the line

$tCalculatedMinutes= ([Math]::Truncate($tInputMin/$tGroupPeriod) * $tGroupPeriod)

The Truncate function of the .NET Math class returns the integral part of a number, i.e. the part before the decimal point (so 0.789 returns 0), and then we multiply it by the grouping factor again. In case of the 27th minute and a 15-minute period, (27/15) returns 1.8. This truncates to 1. Then it’s multiplied by 15, so the result is 15, and this is what we wanted to group it to. Remember that this 15 means ‘from x:15 up to, but not including, x:30’.
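The arithmetic is easy to check for a handful of minute values. A small sketch with a 15-minute period follows; the sample minutes are arbitrary, and the explicit [double] cast is only there to make the division unambiguous:

```powershell
$groupPeriod = 15
$results = foreach ($minute in 0, 14, 15, 27, 44, 59) {
    # Divide, cut off the fraction, then scale back up to the start of the period.
    [int]([Math]::Truncate([double]$minute / $groupPeriod) * $groupPeriod)
}
$results -join ', '   # 0, 0, 15, 15, 30, 45
```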

The call to that grouping function, and the code that adds its result to the record, belongs in the FOREACH loop of the _Prepare-XmlContent function. This has not been done in the above snippet of that function, but it is included in the downloadable script.

Again, every additional computation consumes time and every additional grouping value increases the size of the resulting XML, so don’t overdo it. In most cases, grouping by day and hour (60) is sufficient.
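The zero-padding in the grouping function can also be left to PowerShell’s format operator. Here is a minimal sketch, assuming the same hh:mm:ss.milli input; Get-TimeGroup is a hypothetical helper that is not part of the downloadable script, and it uses the remainder operator instead of Truncate to get the same rounding-down effect:

```powershell
# Hypothetical helper, shown only as an alternative to the string handling above.
function Get-TimeGroup {
    param ([string]$InputTimeString, [int]$GroupPeriod)
    $parts = $InputTimeString.Split(':')
    $hour = [int]$parts[0]
    $minute = [int]$parts[1]
    # Subtracting the remainder lands on the start of the period.
    $groupedMinute = $minute - ($minute % $GroupPeriod)
    # {0:d2} pads an integer to two digits with a leading zero.
    '{0:d2}:{1:d2}' -f $hour, $groupedMinute
}

Get-TimeGroup '8:17:42.123' 15    # 08:15
Get-TimeGroup '14:25:17.123' 60   # 14:00
```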

About the script performance

Running the script against about 300 XML files with about 5’000 app records took just a few seconds on my machine (which is not the newest either). Running the same script against 10’000 files with 150’000 lines, on the other hand, took about 45 minutes.

Mixing and Merging it together

You can find the link to the entire script just below. Running it (with some adjustments) creates a large XML file that contains all the application records including the client names, user names, launch time and date (optionally grouped) and of course the application name.

To reduce the data overhead you may decide to remove attributes from each APP_RECORD element (like the status, the server that the reporting data was uploaded to, maybe the shutdown time, the unformatted launch time and others). This hasn’t been done in the script.
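Removing attributes could be sketched like this. The merged document and the attribute names Name and Status are made up for illustration; which attributes actually exist depends on the reporting data:

```powershell
# Made-up merged document; Name and Status are assumed attribute names.
$merged = [xml]'<APP_RECORDS><APP_RECORD Name="AppA" Launched="2014-04-30T14:25:17.123" LaunchDate="2014-04-30" ClientName="PC01" Status="0" /></APP_RECORDS>'
$unwanted = 'Status', 'Launched'

foreach ($record in $merged.SelectNodes('/APP_RECORDS/APP_RECORD')) {
    foreach ($name in $unwanted) {
        # RemoveAttribute does nothing if the attribute is absent.
        $record.RemoveAttribute($name)
    }
}
$merged.OuterXml
```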

I left you some room for improvement, like proper error handling, command-line parameters or a progress bar.
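A progress bar and basic error handling could be wrapped around the per-file loop roughly like this. This is only a sketch: the file names are made up, and the per-file work is replaced by a placeholder that simulates one broken file:

```powershell
# Made-up file list; in a real run this would come from Get-ChildItem on the share.
$files = 'a.xml', 'b.xml', 'c.xml'
$processed = 0
$failed = @()

foreach ($file in $files) {
    $processed++
    $percent = [int](100 * $processed / $files.Count)
    Write-Progress -Activity 'Preparing reporting data' -Status $file -PercentComplete $percent
    try {
        # Placeholder for the real per-file work; 'b.xml' simulates a broken file.
        if ($file -eq 'b.xml') { throw "cannot parse $file" }
    }
    catch {
        # Remember the file and carry on with the next one.
        $failed += $file
    }
}
Write-Progress -Activity 'Preparing reporting data' -Completed
"Processed $processed files, $($failed.Count) failed"
```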

 

Download Streamline-XML from kirx.org

Validation

Do you remember the list of tasks that we defined at the end of the first part of this short series? If not, it was repeated at the beginning of this post, too.

We addressed all the data preparation and processing challenges that were highlighted there; only one minor thing is left:

  • We still don’t have any method to present the data afterwards.

It’s easy to guess that this won’t be covered in this second part, as your browser’s scroll bar has already reached the bottom of its track. I’m sorry to tell you this (well, actually I’m not ;-) ) but you’ll have to wait for the next post to get something that you can use. As a short motivation, let me tell you this: there is no need for full-blown MS SQL Reporting Services to generate meaningful reports. You just need Excel. And you can prepare Excel reports not only for XML, but also for data from the App-V Reporting SQL database.

Posted in App-V, Tools
