Tuesday, December 3, 2024

Lecture M (2024-12-03): Final Exam Review

In this lecture, we prepare for the final exam and give a brief review of all topics from the course. Students are encouraged to bring their own questions so that the focus of the class is on the topics that students feel they need the most help with.



Tuesday, November 26, 2024

Lecture L (2024-11-26): Course Wrap-Up

In this lecture, we wrap up the course content in IEE 475. We first do a quick overview of the four variance reduction techniques (VRTs) covered in Unit K: common random numbers (CRNs), antithetic variates (AVs), importance sampling, and control variates. We then revisit some general comments about the goal of modeling and commonalities seen across simulation platforms (as well as the different types of simulation platforms in general).



Thursday, November 21, 2024

Lecture K2 (2024-11-21): Variance Reduction Techniques, Part 2 (AVs and Importance Sampling)

In this lecture, we review four different variance reduction techniques (VRTs). Namely, we discuss common random numbers (CRNs), control variates, antithetic variates (AVs), and importance sampling. Each of these is a different approach to reducing the variance in the estimation of relative or absolute performance of a simulation model. Variance reduction is an alternative way to increase the power of a simulation study that is hopefully less costly than increasing the number of replications.
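For readers who want a concrete picture of one of these techniques, here is a minimal antithetic-variates sketch in Python (an illustrative toy, not tied to any particular simulation platform; the integrand exp(U) and the function name are our own choices):

```python
import math
import random

def antithetic_estimate(n_pairs, seed=42):
    """Estimate E[exp(U)] for U ~ Uniform(0,1) using antithetic variates:
    each draw u is paired with its "mirror" 1 - u, and the pair's two
    responses are averaged. Because exp() is monotone, the paired
    responses are negatively correlated, which shrinks the variance of
    the estimator relative to plain Monte Carlo sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        u = rng.random()
        total += 0.5 * (math.exp(u) + math.exp(1.0 - u))  # antithetic pair
    return total / n_pairs
```

The true value is e − 1 ≈ 1.718, and even modest sample sizes land close to it because most of the sampling noise cancels within each pair.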



Tuesday, November 19, 2024

Lecture K1 (2024-11-19): Variance Reduction Techniques, Part 1 (CRNs and Control Variates)

In this lecture, we start by reviewing approaches for absolute and relative performance estimation in stochastic simulation. This begins with a reminder of the use of confidence intervals for estimation of performance for a single simulation model. We then move to different ways to use confidence intervals on mean DIFFERENCES to compare two different simulation models. We then move to the ranking and selection problem for three or more different simulation models, which allows us to talk about analysis of variance (ANOVA) and post hoc tests (like the Tukey HSD or Fisher's LSD). After that review, we move on to introducing variance reduction techniques (VRTs) which reduce the size of confidence intervals by experimentally controlling/accounting for alternative sources of variance (and thus reducing the observed variance in response variables). We discuss Common Random Numbers (CRNs), which use a paired/blocked design to reduce the variance caused by different random-number streams. We start to discuss control variates (CVs), but that discussion will be picked up at the start of the next lecture.
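A minimal Python sketch of the CRN idea follows; the two exponential "models" and their means are made-up stand-ins for two competing simulation models:

```python
import math
import random
import statistics

def crn_mean_difference(n_reps, seed=1):
    """Common random numbers: each replication feeds the SAME uniform
    draw to both hypothetical models, and we analyze PAIRED differences.
    The shared stream-to-stream noise cancels in the difference, so the
    paired standard deviation is far smaller than it would be if the two
    models were driven by independent streams."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_reps):
        u = rng.random()                  # common random number
        y1 = -10.0 * math.log(1.0 - u)    # "model 1": Exp(mean 10) response
        y2 = -12.0 * math.log(1.0 - u)    # "model 2": Exp(mean 12) response
        diffs.append(y1 - y2)
    return statistics.mean(diffs), statistics.stdev(diffs)
```

Under independent streams, the standard deviation of y1 − y2 would be about √(10² + 12²) ≈ 15.6; with common random numbers it collapses to about 2, so far fewer replications are needed to detect the mean difference of 2.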



Thursday, November 14, 2024

Lecture J4 (2024-11-14): Estimation of Relative Performance

In this lecture, we review what we have learned about one-sample confidence intervals (i.e., how to use them as graphical versions of one-sample t-tests) for absolute performance estimation in order to motivate the problem of relative performance estimation. We introduce two-sample confidence intervals (i.e., confidence intervals on DIFFERENCES based on different two-sample t-tests) that are tested against a null hypothesis of 0. This means covering confidence interval half widths for the paired-difference t-test, the equal-variance (pooled) t-test, and Welch's unequal variance t-test. Each of these different experimental conditions sets up a different standard error of the mean formula and formula for degrees of freedom that are used to define the actual confidence interval half widths (centered on the difference in sample means in the pairwise comparison of systems). We then generalize to the case of more than 2 systems, particularly for "ranking and selection (R&S)." This lets us review the multiple-comparisons problem (and Bonferroni correction) and how post hoc tests (after an ANOVA) are more statistically powerful ways to do comparisons.
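As one concrete instance of these half-width formulas, here is a Python sketch of the Welch (unequal-variance) case; the critical value t_crit is assumed to be looked up externally (e.g., from a t-table) at the Welch–Satterthwaite degrees of freedom:

```python
import math
import statistics

def welch_ci(x, y, t_crit):
    """Confidence interval on a mean DIFFERENCE via Welch's unequal-
    variance t procedure: returns (difference in sample means, CI half
    width, Welch-Satterthwaite degrees of freedom). The caller supplies
    t_crit, the critical value for those degrees of freedom."""
    n1, n2 = len(x), len(y)
    s1, s2 = statistics.variance(x), statistics.variance(y)
    se = math.sqrt(s1 / n1 + s2 / n2)  # standard error of the difference
    # Welch-Satterthwaite approximate degrees of freedom
    df = (s1 / n1 + s2 / n2) ** 2 / (
        (s1 / n1) ** 2 / (n1 - 1) + (s2 / n2) ** 2 / (n2 - 1))
    diff = statistics.mean(x) - statistics.mean(y)
    return diff, t_crit * se, df
```

If the interval diff ± half width excludes 0, the two systems differ significantly at the chosen level.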



Tuesday, November 12, 2024

Lecture J3 (2024-11-12): Estimation of Absolute Performance, Part III: Non-Terminating Systems/Steady-State Simulations

In this lecture, we start by further reviewing confidence intervals (where they come from and what they mean) and prediction intervals and then use them to motivate a simpler way to determine how many replications are needed in a simulation study (focusing first on transient simulations of terminating systems). We then shift our attention to steady-state simulations of non-terminating systems and the issue of initialization bias. We discuss different methods of "warming up" a steady-state simulation to reduce initialization bias and then merge that discussion with the prior discussion on how to choose the number of replications. In the next lecture, we'll finish up with a discussion of the method of "batch means" in steady-state simulations.
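The deletion ("warm-up") approach mentioned above can be sketched in a few lines of Python (a toy helper of our own, not a particular textbook routine):

```python
def truncated_mean(observations, warmup):
    """Deletion approach to initialization bias: drop the first `warmup`
    within-run observations (the transient warm-up period) and average
    only the remaining, approximately steady-state observations."""
    if warmup >= len(observations):
        raise ValueError("warm-up period consumes the entire run")
    steady = observations[warmup:]
    return sum(steady) / len(steady)
```

For example, a run [0.0, 2.0, 10.0, 10.0, 10.0] that starts empty-and-idle biases the overall mean downward; deleting the first two observations yields the steady-state estimate 10.0.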



Thursday, November 7, 2024

Lecture J2 (2024-11-07): Estimation of Absolute Performance, Part II: Terminating Systems/Transient Simulations

In this lecture, we review estimating absolute performance from simulation, with focus on choosing the number of necessary replications of transient simulations of terminating systems. The lecture starts by overviewing point estimation, bias, and different types of point estimators. This includes an overview of quantile estimation and how to use quantile estimation to turn simulations into null-hypothesis-prediction generators. We then introduce interval estimation with confidence intervals and prediction intervals. Confidence intervals, which can be viewed as graphical versions of t-tests, provide an alternative way to choose the number of required replications without doing a formal power analysis.
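The half-width-based approach to choosing a replication count can be sketched as follows (a back-of-envelope helper; the pilot standard deviation and critical value are inputs the analyst supplies):

```python
import math

def replications_needed(pilot_sd, target_half_width, t_crit):
    """Solve t * s0 / sqrt(R) <= epsilon for the replication count R,
    where s0 is a pilot-run sample standard deviation, epsilon is the
    desired confidence-interval half width, and t_crit is a
    (conservative) critical value. Rounds up to a whole replication."""
    return math.ceil((t_crit * pilot_sd / target_half_width) ** 2)
```

For example, a pilot standard deviation of 4.0, a target half width of 1.0, and t ≈ 2.0 suggest about 64 replications.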



Tuesday, November 5, 2024

Lecture J1 (2024-11-05): Estimation of Absolute Performance, Part I: Introduction to Point and Interval Estimation

In this lecture, we introduce the estimation of absolute performance measures in simulation – effectively shifting our focus from validating input models to validating and making inferences about simulation outputs. Most of this lecture is a review of statistics and the reasons for the assumptions behind various parametric and non-exact non-parametric methods. We also introduce a few more advanced statistical topics, such as non-parametric methods and special high-power tests for normality. We then switch to focusing on simulations and their outputs, starting with the definition of terminating and non-terminating systems as well as the related transient and steady-state simulations. We will pick up next time with details related to performance measures (and methods) for transient simulations, and we will turn to steady-state simulations after that. Our goal was also to discuss the difference between point estimation and interval estimation for simulation, but we will hold off on that topic until the next lecture.




Thursday, October 31, 2024

Lecture I (2024-10-31): Statistical Reflections

In this lecture, we review statistical fundamentals – such as the origins of the t-test, the meaning of type-I and type-II errors (and alternative terminology for both, such as false positive rate and false negative rate), and the connection to statistical power (sensitivity). We review the Receiver Operating Characteristic (ROC) curve and give a qualitative description of where it gets its shape in a hypothesis test. We close with a validation example (from Lecture H) where we use a power analysis on a one-sample t-test to help justify whether we have gathered enough data to trust that a simulation model is a good match for reality when it has a similar mean output performance to the real system.



Tuesday, October 29, 2024

Lecture H (2024-10-29): Verification, Validation, and Calibration of Simulation Models

During this lecture slot, we start with slides from Lecture G3 (on goodness of fit) that were missed during the previous lecture due to timing. In particular, we review hypothesis testing fundamentals (type-I error, type-II error, statistical power, sensitivity, false positive rate, true negative rate, receiver operating characteristic, ROC, alpha, beta) and then go into examples of using Chi-squared and Kolmogorov–Smirnov tests for goodness of fit for arbitrary distributions. We also introduce Anderson–Darling (for flexibility and higher power) and Shapiro–Wilk (for high-powered normality testing).

We close with where we originally intended to start – with definitions of testing, verification, validation, and calibration. We will pick up from here next time.



Thursday, October 24, 2024

Lecture G3 (2024-10-24): Input Modeling, Part 3: Parameter Estimation and Goodness of Fit

In this lecture, we (nearly) finish our coverage of Input Modeling, where the focus of this lecture is on parameter estimation and assessing goodness of fit. We review input modeling in general and then briefly review fundamentals of hypothesis testing. We discuss type-I error, p-values, type-II error, effect sizes, and statistical power. We discuss the dangers of using p-values at very large sample sizes (where small p-values are not meaningful) and at very small sample sizes (where large p-values are not meaningful). We give some examples of this applied to best-of-7 sports tournaments and voting. We then discuss different shape parameters (including location, scale, and rate), and then introduce summary statistics (sample mean and sample variance) and maximum likelihood estimation (MLE), with an example for a point estimate of the rate of an exponential. We introduce the chi-squared (lower power) and Kolmogorov–Smirnov (KS, high power) tests for goodness of fit, but we will go into them in more detail at the start of the next lecture.
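The lecture's MLE example for the rate of an exponential reduces to a one-line estimator; here is an illustrative Python sketch (the helper name is ours):

```python
def exp_rate_mle(samples):
    """Maximum-likelihood estimate of an exponential rate parameter:
    maximizing L(lam) = lam**n * exp(-lam * sum(x)) over lam gives
    lam_hat = n / sum(x), i.e., the reciprocal of the sample mean."""
    if not samples or min(samples) <= 0:
        raise ValueError("exponential data must be positive")
    return len(samples) / sum(samples)
```

So, for example, interarrival times averaging 4 minutes yield a rate estimate of 0.25 arrivals per minute.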



Tuesday, October 22, 2024

Lecture G2 (2024-10-22): Input Modeling, Part 2: Selection of Model Structure

In this lecture, we continue discussing the choice of input models in stochastic simulation. Here, we pivot from talking about data collection to selection of the broad family of probabilistic distributions that may be a good fit for data. We start with an example where a histogram leads us to introduce additional input models into a flow chart. The rest of the lecture is about choosing models based on physical intuition and the shape of the sampled data (e.g., the shape of histograms). We close with a discussion of probability plots – Q-Q plots and P-P plots, as are used with "fat-pencil tests" – as a good tool for justifying the choice of a family for a certain data set. The next lecture will go over the actual estimation of the parameters for the chosen families and how to quantitatively assess goodness of fit.
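The Q-Q plot construction can be sketched numerically. This helper (our own illustrative code, using an exponential reference family) pairs each sorted observation with the theoretical quantile at probability (i − 0.5)/n:

```python
import math

def qq_points_exponential(samples, rate):
    """Coordinates for a Q-Q plot against an Exponential(rate) reference:
    the i-th sorted observation is paired with the theoretical quantile
    F^-1((i - 0.5)/n) = -ln(1 - (i - 0.5)/n) / rate. If the resulting
    points hug the line y = x, the family passes the informal
    "fat-pencil test" described above."""
    xs = sorted(samples)
    n = len(xs)
    theory = [-math.log(1.0 - (i - 0.5) / n) / rate for i in range(1, n + 1)]
    return list(zip(theory, xs))
```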



Friday, October 18, 2024

Lecture G1 (2024-10-17): Input Modeling, Part 1: Data Collection

In this lecture, we introduce the detailed process of input modeling. Input models are probabilistic models that introduce variation in simulation models of systems. Those input models must be chosen to match statistical distributions in data. Over this unit, we cover collection of data for this process, choice of probabilistic families to fit to these data, and then optimized parameter choice within those families along with evaluation of goodness of fit. In this lecture, we discuss issues related to data collection.



Thursday, October 3, 2024

Lecture F (2024-10-03): Midterm Review

During this lecture, we review the topics covered up to this point in the course as preparation for the upcoming midterm exam. Students are encouraged to bring their own questions to class so that we can focus on the topics that students feel like they need the most help with.



Tuesday, October 1, 2024

Lecture E2 (2024-10-01): Random-Variate Generation

In this lecture, we review pseudo-random number generation and then introduce random-variate generation by way of inverse-transform sampling. In particular, we start with a review of the two most important properties of a pseudo-random number generator (PRNG), uniformity and independence, and discuss statistically rigorous methods for testing for these two properties. For uniformity, we focus on a chi-squared test for larger numbers of samples and a Kolmogorov–Smirnov (KS) test for smaller numbers of samples. For independence, we discuss autocorrelation tests and runs tests, and then we demonstrate a runs-above-and-below-the-mean test. We then shift to discussing inverse-transform sampling for continuous random variates and discrete random variates and how the resulting random-variate generators might be implemented in a tool like Rockwell Automation's Arena.
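Both flavors of inverse-transform sampling can be sketched in Python (illustrative helpers of our own, independent of Arena's implementations):

```python
import math
import random

def exponential_variate(rate, rng):
    """Continuous inverse transform: solve F(x) = 1 - exp(-rate*x) = u
    for x, turning a Uniform(0,1) draw into an Exponential(rate) draw."""
    u = rng.random()
    return -math.log(1.0 - u) / rate

def discrete_variate(values, probs, rng):
    """Discrete inverse transform: walk the cumulative distribution
    until it first reaches the Uniform(0,1) draw."""
    u = rng.random()
    cumulative = 0.0
    for v, p in zip(values, probs):
        cumulative += p
        if u <= cumulative:
            return v
    return values[-1]  # guard against floating-point round-off
```

For instance, `exponential_variate(2.0, random.Random(0))` draws a service time with mean 1/2, and `discrete_variate(["a", "b"], [0.3, 0.7], rng)` returns "b" about 70% of the time.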



Thursday, September 26, 2024

Lecture E1 (2024-09-26): Random-Number Generation




Tuesday, September 24, 2024

Lecture D2 (2024-09-24): Probabilistic Models

In this lecture, we review basic probability fundamentals (measure spaces, probability measures, random variables, probability density functions, probability mass functions, cumulative distribution functions, moments, mean/expected value/center of mass, standard deviation, variance), and then we start to build a vocabulary of different probabilistic models that are used in different modeling contexts. These include uniform, triangular, normal, exponential, Erlang-k, Weibull, and Poisson variables. If we do not have time to do so during this lecture, we will finish the discussion in the next lecture with the Bernoulli-based discrete variables and Poisson processes.



Thursday, September 19, 2024

Lecture D1 (2024-09-19): Probability and Random Variables

In this lecture, we introduce the measure-theoretic concept of a random variable (which is neither random nor a variable) and related terms, such as outcomes, events, probability measures, moments, means, etc. Throughout the lecture, we use the metaphor of probability as mass (and thus probability density as mass density, and a mean as a center of mass). This allows us to discuss the "statistical leverage" of outliers in a distribution (i.e., although they happen infrequently, they still have the ability to shift the mean significantly, as in physical leverage). This sets us up to talk about random processes and particular random variables in the next lecture.



Tuesday, September 17, 2024

Lecture C2 (2024-09-17): Beyond DES – SDM, ABM, and NetLogo

This lecture (slides embedded below) provides some historical background and motivation for System Dynamics Modeling (SDM) and Agent-Based Modeling (ABM), two other simulation modeling approaches that contrast with Discrete Event System (DES) simulation.

In particular, in this lecture, we briefly introduce System Dynamics Modeling (SDM) and Agent-Based/Individual-Based Modeling (ABM/IBM) as the two ends of the simulation modeling spectrum (from low resolution to high resolution). The introduction of ABM describes applications in life sciences, social sciences, and engineering (Multi-Agent Systems, MAS)/operations research. NetLogo is introduced (as part of preparation for Lab 4), and it is used to present examples of running ABMs as well as the code behind them. This lecture is also coupled with notes discussing the Lab 3 (Monte Carlo simulation) results and general experience. These comments focus on interval estimation (which is right 95% of the time, as opposed to point estimation, which is right 0% of the time) and the role of non-trivial distributions of random variables (as opposed to just their means).



Thursday, September 12, 2024

Lecture C1 (2024-09-12): Basic Simulation Tools and Techniques

This lecture covers content related to implementing simulations with spreadsheets and the motivations for the use of special-purpose Discrete Event System Simulation tools. In particular, we discuss different approaches to implementing Discrete Event System (DES) simulations (DESS) with simple spreadsheets (e.g., Microsoft Excel, Google Sheets, Apple Numbers, etc.). We cover inventory management problems (such as the newsvendor model) as well as Monte Carlo sampling and stochastic activity networks (SANs). Although we show that spreadsheets can be very powerful for this kind of work, we highlight that this approach becomes cumbersome as system complexity grows. This motivates the use of more sophisticated tools built specifically for simulation (but perhaps not so great for data analysis by themselves), like Arena, FlexSim, Simio, and NetLogo.
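As a flavor of what the spreadsheet versions compute, here is a Monte Carlo sketch of the newsvendor model in Python (the prices, costs, and demand model in the usage example are placeholders, and no salvage value is assumed):

```python
import random

def newsvendor_profit(order_qty, price, cost, demand_sampler, n_reps, seed=7):
    """Monte Carlo estimate of expected newsvendor profit: each
    replication samples a demand, sells min(demand, order_qty) units at
    `price`, and pays `cost` for every unit ordered (unsold units are
    discarded). Averaging across replications estimates expected profit."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        demand = demand_sampler(rng)
        total += price * min(demand, order_qty) - cost * order_qty
    return total / n_reps
```

Usage: `newsvendor_profit(40, 2.0, 1.0, lambda rng: rng.randint(20, 80), 10000)` estimates the expected profit of ordering 40 units under uniform integer demand.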

This lecture was recorded by Theodore Pavlic as part of IEE 475 (Simulating Stochastic Systems) at Arizona State University.



Tuesday, September 10, 2024

Lecture B3 (2024-09-10): Discrete-Event Simulation Examples, Part II

In this lecture, we close out our review of DES fundamentals and hand simulation. After going through a hand-simulation example one last time, we show how to implement a Discrete Event System (DES) simulation using a spreadsheet tool like Microsoft Excel without any "macros" (VBA, etc.). This involves defining relationships ACROSS TIME that allow the spreadsheet to (in a declarative fashion) reconstruct the trajectory that is the output of the simulation.

We then pivot to discussing the preceding Lab 2 ("Muffin Oven Simulation"), which lets us introduce common random numbers (CRNs), statistical blocking, the requirements of two-sample and paired t-tests, and more sophisticated statistical methods that better characterize PRACTICAL significance (and take into account the multiple-comparisons problem). Thus, the post-Lab 2 reflections are largely a preview of future topics in the course.



Thursday, September 5, 2024

Lecture B2 (2024-09-05): Discrete-Event Simulation Examples, Part I

In this lecture, we review fundamentals of Discrete Event System (DES) simulation (e.g., entities, resources, activities, processes, delays, attributes) and we run through a number of DES modeling examples. These examples show how different research/operations questions can lead to different choices of entities/resources/etc. We close with a hand-simulation example of a single-channel, single-server queue with provided interarrival times and service times.



Tuesday, September 3, 2024

Lecture B1 (2024-09-03): Fundamental Concepts of Discrete-Event Simulation (DES)

In this lecture, we cover fundamentals of discrete-event system (DES) simulation (DESS). This involves reviewing basic simulation concepts (entities, resources, attributes, events, activities, delays) and introducing the event-scheduling world view, which provides a causality framework on which an automated simulation of a DES can be built. We also briefly discuss how the stochastic modeling inherent to DESS means that outputs will be variable and thus will require rigorous statistics to make sense of.
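The event-scheduling world view can be illustrated with a minimal future-event-list simulation in Python (a toy single-server queue of our own construction, not code from the course):

```python
import heapq

def run_single_server(interarrival_times, service_times):
    """Event-scheduling sketch of a single-channel, single-server queue.
    A future-event list (a priority queue of (time, kind) pairs) advances
    the simulation clock from one event to the next. Returns the number
    of completed services and the final clock time. `service_times` must
    supply one entry per arrival."""
    fel = []                                # future-event list
    arrivals = iter(interarrival_times)
    services = iter(service_times)
    busy, waiting, completed, clock = False, 0, 0, 0.0
    first = next(arrivals, None)
    if first is not None:
        heapq.heappush(fel, (first, "arrival"))
    while fel:
        clock, kind = heapq.heappop(fel)
        if kind == "arrival":
            gap = next(arrivals, None)      # schedule the next arrival
            if gap is not None:
                heapq.heappush(fel, (clock + gap, "arrival"))
            if busy:
                waiting += 1                # join the queue
            else:
                busy = True                 # seize the idle server
                heapq.heappush(fel, (clock + next(services), "departure"))
        else:                               # departure event
            completed += 1
            if waiting:
                waiting -= 1                # next customer seizes the server
                heapq.heappush(fel, (clock + next(services), "departure"))
            else:
                busy = False                # server goes idle
    return completed, clock
```

With interarrival times [1.0, 1.0, 1.0] and service times [0.5, 0.5, 0.5], no customer ever waits and the run ends at time 3.5 after three completions.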



Thursday, August 29, 2024

Lecture A2 (2024-08-29): Introduction to Simulation Modeling

In this lecture, we introduce the three different simulation methodologies (agent-based modeling, system dynamics modeling, and discrete event system simulation) and then focus on how stochastic modeling is used within discrete-event system simulation. In particular, we define terms such as system, dynamic system, state, state variable, activity, delay, resource, entity, and the notion of "input modeling."



Tuesday, August 27, 2024

Lecture A1 (2024-08-27): Introduction to Modeling

This lecture introduces the topic of modeling with particular focus on the role of quantitative modeling in industrial engineering and operations research. This is an introduction to a course on stochastic simulation.



Thursday, August 22, 2024

Lecture 0 (2024-08-22): Introduction to the Course and Its Policies

In this lecture, we outline the structure and purpose of IEE 475 (Simulating Stochastic Systems) for the Fall 2024 semester at Arizona State University. We go over topics covered in the syllabus and on the course learning management system website.


