So, you went ahead and did it. You designed a driving protocol, designed the roadway environment, tested the drive to ensure it worked as intended, allowed a bunch of test subjects to experience all your hard work, and finally ended up with a whole bunch of data files.
Now what? What are you supposed to do with all this data??!
Don’t worry, you’re not the first, and won’t be the last, to have this thought. Fortunately, there are tools available that can help turn this daunting chore into a manageable task.
Scripting: Turning Data Into Gold
The approach to data can be summed up in one word: scripting. An automated process greatly simplifies the data reduction task (not counting the frustration of writing and debugging your script) for several key reasons:
- You can handle huge data sets easily. Once the script has been fully vetted, you can feed all your data files into it and within minutes have a populated data set.
- It is an ideal environment for playing “what if” games, especially if your metric uses thresholds. This is because you can easily change the threshold and create a brand-new reduced data set in minutes.
- It allows you to try implementing more complex data-reduction techniques (wavelets or steering entropy anyone?).
- Many times, previous scripts can be reused with modifications to reduce your current data set. This means that the initial pain of coming up with the script happens once and then you spawn new versions of it with minimal modifications to handle different data.
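To make the "what if" point concrete, here is a minimal Python sketch of a threshold-based metric. Everything in it is an assumption made for illustration: the `fraction_beyond` function, the `lane_offset` values, and the thresholds are invented, and none of it comes from an actual STISIM Drive data file.

```python
def fraction_beyond(samples, threshold):
    """Return the fraction of samples whose magnitude exceeds the threshold."""
    if not samples:
        return 0.0
    beyond = sum(1 for value in samples if abs(value) > threshold)
    return beyond / len(samples)


# Made-up lateral offsets from lane centre (hypothetical units and values).
lane_offset = [0.1, 0.4, 0.9, 1.2, 0.7, 0.2, -0.8, -1.1, -0.3, 0.5]

# Playing "what if": rerun the same metric with different thresholds.
for threshold in (0.5, 0.75, 1.0):
    share = fraction_beyond(lane_offset, threshold)
    print(f"threshold {threshold}: {share:.0%} of samples beyond it")
```

Changing the tuple of thresholds and rerunning is all it takes to produce a new reduced data set, which is the appeal of keeping the threshold as a parameter rather than hard-coding it.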
Scripting: What’s It Good For, Really?
Okay, you get it, scripting is good, but what exactly is it? This is where it gets a tad more complicated because you will actually have to write some code (or find someone to write the code for you) in a scripting or programming language (Python, Visual Basic, Visual Basic for Applications (VBA), MATLAB, etc.). The code that is written will do the following things:
- Open and parse a specific STISIM Drive data file (which you learned about in our previous blog) into its constituent parts.
- Search through the parts for the data required (so you don’t have to!).
- Using mathematics, logic, some Voodoo, or even black magic, use the data found in the previous step to compute your desired metric.
- Save the newly computed metric into a cell of some matrix structure that tracks the metric and the driver.
- After the entire matrix has been populated, export it to your preferred statistical analysis package for further analysis.
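The steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the real STISIM Drive file layout is described in the Data Manual, whereas this sketch pretends each file is plain whitespace-delimited text with time in column 0 and speed in column 1. The function names, driver IDs, and sample values are all hypothetical.

```python
import csv
import io
import statistics


def parse_data_file(text):
    """Split raw file text into rows of floats (ASSUMED layout: one sample per line)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            rows.append([float(field) for field in fields])
    return rows


def mean_speed(rows, speed_column=1):
    """Compute the metric: mean of the ASSUMED speed column."""
    return statistics.mean(row[speed_column] for row in rows)


def reduce_all(files):
    """Build the driver-by-metric table from a dict of driver ID -> raw file text."""
    return [(driver, mean_speed(parse_data_file(text))) for driver, text in files.items()]


def export_csv(table, stream):
    """Write the table as CSV so a statistics package can import it."""
    writer = csv.writer(stream)
    writer.writerow(["driver", "mean_speed"])
    writer.writerows(table)


# Made-up stand-ins for two drivers' data files.
demo_files = {
    "S01": "0.0 10.0\n0.5 20.0\n1.0 30.0\n",
    "S02": "0.0 12.0\n0.5 18.0\n",
}

table = reduce_all(demo_files)
buffer = io.StringIO()
export_csv(table, buffer)
print(buffer.getvalue())
```

In practice you would read each file from disk and write the CSV to a real file instead of an in-memory buffer. The shape stays the same: parse, search, compute, store, export.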
My personal choice for raw data manipulation (I don’t do much with statistical analysis) is to write macros using the VBA that is built into Microsoft Excel. In my humble opinion, there are numerous good reasons to take this approach:
- Excel and VBA have been around a long time, and the language has not changed very much, so it is stable.
- It is easy to learn and there are millions of examples available on the web.
- Excel is a spreadsheet program that, ahem, excels at matrix organization with its cell structure and the ability to create and populate multiple pages in the same spreadsheet.
- We provide some example spreadsheets that demonstrate the most efficient ways to parse and handle data. These can be found in the STISIM Drive Tools folder. There is also a discussion of how to use these example spreadsheets in the Data Manual found in the STISIM Drive Help folder.
While this process may seem daunting, especially the first time you dive into it, learning to write scripts (in whatever language you are most comfortable in) is definitely worth your time. It will open doors for you, helping you become a master data manipulator, which can only help you as your career progresses.