Simulation Tip: Data Driven vs Hard Coding

While simulation tools have improved over time, most of these improvements have focused on removing technical hurdles, such as importing data or making the user interface easier to understand. Breaking down a complicated, real-world process into concrete steps that can be modeled in a simulation language continues to be the most difficult part of using any package.


When developing simulation models, there are two ends of the spectrum in terms of approach. At one end is what we call the “brute force” method: the model is created as quickly as possible and the logic is designed only for the process flow as currently understood. Input data like processing times, failure rates, and downtimes may still be flexible by referencing an external data spreadsheet, but the process flows outlined in the logic are fixed in the model. This method is faster, but any changes to how the process works require redoing large chunks of the model or even throwing the first version out and starting over, depending on the significance of the changes.

The other end of the spectrum is an extremely flexible model. This kind of model makes no assumptions about how processes work and references data inputs for almost all the decisions in the model. This flexibility comes at a cost: it usually takes significantly longer to create logic that evaluates and makes decisions than to hard-code a flow of parts through a system. The other impact is on the complexity of the data; an extremely flexible process flow requires that the data be presented in a very specific format, with multiple references to other data sets, so that the model can look up what steps need to be done next. While some of this data structuring could be simplified and the simulation model could take on more of the burden of searching through the data, it is almost always faster and cleaner to provide the simulation with complete data.

In most cases, it is best to pick an approach somewhere between the two extremes, but sometimes the extremely flexible solution is the right choice. In one particular case, we chose to stay toward that end of the spectrum because it was clear that many aspects of the process were still being finalized. The operations were going to take place along multiple conveyor lines, but the location of stations along the conveyors was undecided, as was which parts would need to undergo which operations. Some stations were unique and others had duplicates on several conveyor lines.

For most of the station operations, modeling was done at a high level, using input data to specify the processing time, labor group needed, and equipment required, in order to capture the impact of the process without adding extraneous levels of detail that would need to be updated later. This allowed the model to use the same section of logic to represent 95% of the stations in the model, since all parts would undergo the same steps before conveying to their next assigned station. Some operations did necessitate more specific logic, but they could be defined using unique logic for that particular station.

The stations that each part would visit were also determined by the input data. Parts that did not need a given operation would still travel through the station along the conveyor, but would not delay for any processes in it. At the end of each conveyor, the part would determine whether it needed to travel to another conveyor for further operations based on its assigned path, or whether it was complete and could travel to the packaging area for final processing. This approach enabled the user to quickly change which stations each part visited and the time spent in each station, but required that the data be organized in the right way to feed the model.
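To make the data-driven idea concrete, here is a minimal Python sketch of how routing like this could be driven entirely by input tables. It is not the logic of the actual model or of any simulation package; the station names, part types, times, and the `route_part` helper are all hypothetical, chosen only to illustrate one generic piece of logic serving every station.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StationSpec:
    """High-level description of a station (all values are illustrative)."""
    process_time: float  # minutes
    labor_group: str
    equipment: str


# Hypothetical input data: in practice this would come from a spreadsheet.
STATIONS = {
    "drill": StationSpec(4.0, "machinists", "drill_press"),
    "paint": StationSpec(6.5, "finishers", "spray_booth"),
    "inspect": StationSpec(2.0, "quality", "gauge"),
}

# Order of stations along each conveyor line (assumed layout).
CONVEYOR_LAYOUT = {
    "line_a": ["drill", "paint"],
    "line_b": ["inspect"],
}

# Assigned path per part type: conveyors to travel and operations needed.
PART_PATHS = {
    "bracket": {"conveyors": ["line_a", "line_b"],
                "operations": {"drill", "inspect"}},
    "panel": {"conveyors": ["line_a"], "operations": {"paint"}},
}


def route_part(part_type):
    """Return (station, delay) pairs for a part, ending at packaging.

    The same generic logic handles every station: a part passes through
    each station on its conveyors, but only delays where its path data
    says the operation is required.
    """
    path = PART_PATHS[part_type]
    events = []
    for conveyor in path["conveyors"]:
        for station in CONVEYOR_LAYOUT[conveyor]:
            needed = station in path["operations"]
            delay = STATIONS[station].process_time if needed else 0.0
            events.append((station, delay))
    events.append(("packaging", 0.0))
    return events
```

In this sketch, moving a station to a different conveyor or changing which parts need an operation means editing the tables, not the function. The trade-off described above also shows up here: the function only works if the three tables reference each other consistently.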

Once a simulation model has been run, the analysis process is often an iterative one. Because simulation is typically used to analyze complex processes, there is not one simple answer to look at in the outputs. Most analyses will start by looking at wait times combined with resource utilizations to find which areas are causing bottlenecks. Once those bottlenecks have been identified, the next step is often to test increasing capacity or making other changes to the model to understand the impact of the removal of that bottleneck.
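As a rough illustration of that first pass, the Python sketch below ranks resources by average wait and notes whether each is also heavily utilized. The output format, the `rank_bottlenecks` helper, the sample numbers, and both thresholds are assumptions made for the example, not values from any particular model or tool.

```python
def rank_bottlenecks(stats, wait_threshold=5.0, busy_threshold=0.85):
    """Rank resources with long waits, labeling each by utilization level.

    stats maps a resource name to a dict with "utilization" (0 to 1) and
    "avg_wait" (minutes). Both thresholds are illustrative assumptions.
    """
    flagged = []
    for name, s in stats.items():
        if s["avg_wait"] < wait_threshold:
            continue  # waits are short enough to ignore on a first pass
        if s["utilization"] >= busy_threshold:
            label = "high utilization"
        else:
            label = "low utilization"
        flagged.append((name, s["avg_wait"], label))
    # Longest average wait first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical output statistics from one simulation run.
SAMPLE_STATS = {
    "drill": {"utilization": 0.93, "avg_wait": 12.0},
    "paint": {"utilization": 0.40, "avg_wait": 18.5},
    "inspect": {"utilization": 0.55, "avg_wait": 1.2},
}
```

A resource flagged with long waits and high utilization is a natural candidate for a capacity-increase scenario; one with long waits and low utilization needs a closer look at how demand arrives.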

Sometimes information may appear contradictory: for example, an underutilized piece of equipment may also be generating long waits for parts needing it. This typically indicates bursts of demand for the equipment, separated by long periods of idle time. Depending on the operation, it may be possible to smooth out demand for the equipment by altering the schedule to reduce the batching of part arrivals to that area.
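The seemingly contradictory case can be reproduced with a very small Python sketch of a single first-in, first-out server with a fixed service time. The arrival schedules, the 4-minute service time, and the 200-minute horizon are made-up numbers for illustration only.

```python
def simulate_single_server(arrival_times, service_time, horizon):
    """Return (average wait, utilization) for one FIFO server.

    Assumes a fixed service time and that all work finishes within
    the horizon; both are simplifications for illustration.
    """
    server_free_at = 0.0
    total_wait = 0.0
    for arrival in sorted(arrival_times):
        start = max(arrival, server_free_at)  # wait if the server is busy
        total_wait += start - arrival
        server_free_at = start + service_time
    busy_time = service_time * len(arrival_times)
    return total_wait / len(arrival_times), busy_time / horizon


# Same 20 parts over 200 minutes, delivered two different ways.
batched = [0.0] * 10 + [100.0] * 10          # two batches of 10 parts
smoothed = [10.0 * i for i in range(20)]     # one part every 10 minutes

batched_wait, batched_util = simulate_single_server(batched, 4.0, 200.0)
smoothed_wait, smoothed_util = simulate_single_server(smoothed, 4.0, 200.0)
```

Under these assumptions both schedules leave the equipment busy 40% of the time, but the batched schedule produces an average wait of 18 minutes while the smoothed schedule produces none.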

Ultimately, the analysis technique usually follows a process of identifying the problem in the system, and then testing other scenarios to see which option best alleviates the issue. It also requires having enough knowledge of the system being modeled to conduct the scenario analysis effectively; testing solutions that have no possibility of implementation is rarely helpful. Because of the complex interactive behavior in most simulation models, it’s very likely that solving an issue in one section of the system will generate a new problem downstream or elsewhere in the system. These impacts can be unintuitive and may require further analysis to understand exactly how the effect occurred. While time consuming, the ability to conduct this type of system analysis is one of the greatest benefits of simulation, since it can be hard to capture these effects in spreadsheet analysis.

Arena Newsletter and Blog
