Monday, December 28, 2009

Data, Data Everywhere and Not An Expert in Sight

If you've built a few simulations, it's a good bet that you've been through projects where you just didn't have access to all the data the textbooks said you should have. In such cases, the enterprising simulationist usually has no choice but to interview appropriate subject matter experts (SMEs) and make what amount to expert guesses.

On other occasions, just the opposite is true. On a recent consulting engagement, we were delighted to find that we had all the data we could possibly ask for. What we did not have, unfortunately, was adequate access to anyone who could help describe the process that produced the data.

In this case, the lack of an appropriate SME was due to current organizational realities within the client company. You may also find yourself in this situation when you are charged with simulating a process that is not readily observable, but for which data can be automatically collected, such as a data network. And let's not forget the case where you may have plenty of volunteers who are willing to act as SMEs, but whose information may be conflicting and/or less than perfectly reliable.

However it happens, the 'plenty of data, limited experts' situation brings with it a number of unique challenges and opportunities. At times like these, some good old-fashioned sleuthing is in order. If you listen hard enough, the data will often tell you what you need to know.

In our recent project, we found ourselves charged with simulating a very complex manufacturing system on a very tight timeline. The client was in the midst of a major reorganization and had several important decisions to make regarding the consolidation of several of its 14 plants. Because the company had an excellent data capturing process in place, we had all the data we could want. Unfortunately, getting some quality time with someone who could describe how the process actually worked was another story.

We needed to know which machines could produce which kinds of orders. How fast were they producing them? How often did they break down? How were scheduling decisions being made in the plant?

Let the Data do the Talking

With an important decision looming and no time to waste, we turned to the data to help us fill in the blanks. The data were not only useful in fitting the standard input distributions, but also proved to be invaluable in determining the very structure of our simulation model.

As you approach your next simulation project, gather up all the data you can get your hands on and keep in mind the many ways that you can use it to your advantage. Here are just a few suggestions to get you started.

Fit distributions to input data. This one's obligatory, so we'll get it out of the way early. Examples include arrival distributions, the time it takes to complete a task, or the time between machine failures.
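As a minimal sketch of what fitting can look like, the snippet below estimates the rate of an exponential distribution from times between machine failures. The exponential choice, the function name fit_exponential_rate, and the sample values are all assumptions made for illustration; real failure data may call for a different distribution and a goodness-of-fit check.

```python
from statistics import mean


def fit_exponential_rate(samples):
    """Maximum-likelihood rate of an exponential distribution: 1 / sample mean."""
    if not samples:
        raise ValueError("need at least one observation")
    if any(x < 0 for x in samples):
        raise ValueError("durations must be non-negative")
    average = mean(samples)
    if average == 0:
        raise ValueError("sample mean is zero; rate is undefined")
    return 1.0 / average


# Hypothetical times between failures, in hours (not from the project).
times_between_failures = [12.0, 30.5, 7.5, 22.0, 28.0]

failure_rate = fit_exponential_rate(times_between_failures)  # failures per hour
mean_time_between_failures = 1.0 / failure_rate              # hours
```

The fitted mean time between failures can then be entered as the parameter of the breakdown distribution in whatever simulation package you use.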

Explore the process through the data. A little bit of upfront data exploration can go a long way toward clarifying the structure and objectives of your simulation. This in turn will save you time, trouble and aggravation down the road. While data analysis can be quite sophisticated, some very basic tabulation may be all it takes to push your simulation ahead. Understand in a general way how one factor is related to another. Look for any obvious patterns across time. Get a feel for the range of values that you should expect to see your simulation model produce. After having put in your time with some basic exploration, you'll be in an excellent position to identify problems or inconsistencies in the simulation model. In short, there's a lot to be said for being familiar with your data set.

Find out what is important and what isn't. If it turns out that shrink-wrapping is used on only one half of one percent of the orders in the process you are currently simulating, then it probably isn't all that important to worry about the precise timing of breakdowns on the shrink-wrapping machine.
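A quick tabulation is often enough to answer the "does this step matter?" question. The sketch below computes the share of orders that pass through each process step. The step names, the step_usage_share helper, and the order list are hypothetical and chosen only to mirror the shrink-wrapping example.

```python
from collections import Counter


def step_usage_share(order_steps):
    """Return the fraction of orders that use each step at least once."""
    total = len(order_steps)
    if total == 0:
        raise ValueError("need at least one order")
    counts = Counter()
    for steps in order_steps:
        counts.update(set(steps))  # count each step once per order
    return {step: n / total for step, n in counts.items()}


# Hypothetical data: 200 orders, only one of which is shrink-wrapped.
orders = [["cut", "pack"]] * 199 + [["cut", "shrink_wrap", "pack"]]

shares = step_usage_share(orders)
rarely_used = sorted(step for step, share in shares.items() if share < 0.01)
```

Steps that land in rarely_used are candidates for simplified treatment in the model, subject to a sanity check that they are not bottlenecks for the few orders that do use them.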

Decipher undocumented routing or scheduling rules. In our recent project, we needed to know which machines were capable of running jobs of various specifications. Without complete specifications from the manufacturers or access to a knowledgeable expert, we turned again to the data. We used detailed production transaction data to map out which machines actually handled which jobs. We were then able to feed this information directly into the logic of our simulation.
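The mapping described above can be sketched in a few lines. This is an illustration under assumed inputs, not the code used on the project: the (machine, job type) record layout, the machine and job names, and the build_capability_map helper are all hypothetical.

```python
from collections import defaultdict


def build_capability_map(transactions):
    """Map each machine to the set of job types it has actually processed."""
    capability = defaultdict(set)
    for machine, job_type in transactions:
        capability[machine].add(job_type)
    return dict(capability)


# Hypothetical production transactions: (machine, job type).
transactions = [
    ("M1", "A"),
    ("M1", "B"),
    ("M2", "B"),
    ("M1", "A"),
    ("M3", "C"),
]

capability_map = build_capability_map(transactions)


def can_run(machine, job_type):
    """Routing check a simulation could call before assigning a job."""
    return job_type in capability_map.get(machine, set())
```

Keep in mind that a map built this way only records what machines did during the period covered by the data. A machine may be capable of jobs it never happened to run.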

Put disagreements to rest. What happens when the plant manager and the VP of Operations disagree on how internal routing decisions are made on the floor? Or, on how often a machine really breaks down? It's probably all in the data. Let it answer the questions for you.

Verify the simulation. Understand the routings that various entities in the simulation may take. Do large orders really ever make it to department C? If not, why are they ending up there in the simulation? If you've already explored the process through the data, you'll be much more likely to catch problems early, when they are more easily fixed.
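One simple way to run this check is to compare the routes taken in the simulation against the routes seen in the data. The sketch below assumes both can be exported as (order size, department) pairs; that layout, the sample values, and the unexpected_routes helper are assumptions for illustration.

```python
def unexpected_routes(simulated, observed):
    """Return routes that occur in the simulation but never in the real data."""
    return sorted(set(simulated) - set(observed))


# Hypothetical (order size, department) pairs.
observed_routes = [("small", "A"), ("small", "C"), ("large", "A"), ("large", "B")]
simulated_routes = [("small", "A"), ("large", "C"), ("large", "B")]

suspicious = unexpected_routes(simulated_routes, observed_routes)
```

Here the large order reaching department C would be flagged for a closer look at the routing logic.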

Validate results. Is your simulation producing a throughput that is statistically 'similar' to the throughput produced by the real system under similar conditions? Compare the information produced by the simulation with the information contained in your data set to find out.
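A rough version of this comparison is sketched below: build a confidence interval around the mean throughput of several simulation replications and see whether the observed figure falls inside it. The numbers are hypothetical, and the interval uses a normal approximation because that is what the Python standard library offers. With only a handful of replications, a t-based interval would be somewhat wider and is the more usual choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def approx_confidence_interval(replications, confidence=0.95):
    """Normal-approximation confidence interval for the mean of the replications."""
    n = len(replications)
    if n < 2:
        raise ValueError("need at least two replications")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(replications) / sqrt(n)
    center = mean(replications)
    return center - half_width, center + half_width


# Hypothetical daily throughput from eight simulation replications.
simulated_throughput = [98, 102, 101, 97, 103, 99, 100, 100]
# Hypothetical average daily throughput taken from the real system's data.
observed_throughput = 100.5

low, high = approx_confidence_interval(simulated_throughput)
consistent = low <= observed_throughput <= high
```

An observed value outside the interval does not prove the model is wrong, but it is a good prompt to revisit inputs and logic before trusting the results.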

Amaze everyone with your wisdom. Who said data analysis doesn't have its perks? Some good solid familiarity with the data can put you in a position to speak very intelligently about a process you knew absolutely nothing about a few short days before. On one recent project, exploring the data led us to make some observations. It turns out that the simple phenomenon that seemed obvious in the data was something that the plant manager had been trying to convince his superiors of for months. When we were able to provide proof in the form of a clear chart, we made the plant manager's day. This particular observation didn't have a huge impact on the development of the simulation, but it did go a long way toward establishing our credibility with the plant manager. Whether you are acting as an internal or external consultant, credibility is key to getting the simulation results actually used.

The moral of this story is that, while finding the average time it takes to process some task is an excellent use for data, it certainly isn't the only one. When you've gone to all the trouble to gather a good data set (and you really should, by the way), why not put it to good use?

 
