In this lecture, we review statistical fundamentals – such as the origins of the t-test, the meaning of type-I and type-II error (and alternative terminology for both, such as false positive rate and false negative rate) and the connection to statistical power (sensitivity). We review the Receiver Operating Characteristic (ROC) curve and give a qualitative description of where it gets its shape in a hypothesis test. We close with a validation example (from the previous lecture) where we use a power analysis on a one-sample t-test to help justify whether we have gathered enough data to trust that a simulation model is a good match for reality when it has a similar mean output performance to the real system.
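As a rough illustration of the kind of power analysis mentioned above, the sketch below approximates the power of a two-sided one-sample test using a normal approximation to the t distribution. This is an assumption made for simplicity, not necessarily the calculation used in the lecture: an exact analysis would use the noncentral t distribution. The function names `approx_power` and `min_sample_size` and the standardized effect size of 0.5 are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist


def approx_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample test.

    effect_size is the standardized difference (true mean - hypothesized
    mean) / standard deviation. Uses a normal approximation to the t
    distribution, which is an assumption that is reasonable only when n
    is not too small.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)      # two-sided critical value
    shift = effect_size * math.sqrt(n)     # mean of the test statistic under H1
    # Probability that the statistic falls in the upper or lower rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


def min_sample_size(effect_size, target_power=0.8, alpha=0.05, n_max=100_000):
    """Smallest n (up to n_max) whose approximate power reaches target_power."""
    for n in range(2, n_max + 1):
        if approx_power(effect_size, n, alpha) >= target_power:
            return n
    return None


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, round(approx_power(0.5, n), 3))
    print("minimum n for 80% power:", min_sample_size(0.5))
```

When the effect size is zero, the computed "power" collapses to alpha, which reflects the link between the type-I error rate and the rejection region.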
Archived lectures from an undergraduate course on stochastic simulation given at Arizona State University by Ted Pavlic
Friday, October 28, 2022
Tuesday, October 25, 2022
Lecture H (2022-10-25): Verification, Validation, and Calibration of Simulation Models (plus some Lecture G3 slides)
In this lecture, we mostly cover slides from Lecture G3 (on goodness of fit) that were missed during the previous lecture. In particular, we review hypothesis testing fundamentals (type-I error, type-II error, statistical power, sensitivity, false positive rate, true negative rate, receiver operating characteristic, ROC, alpha, beta) and then go into examples of using Chi-squared and Kolmogorov–Smirnov tests for goodness of fit for arbitrary distributions. We also introduce Anderson–Darling (for flexibility and higher power) and Shapiro–Wilk (for high-powered normality testing). We close with where we originally intended to start – with definitions of testing, verification, validation, and calibration. We will pick up from here next time.
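To make the Kolmogorov–Smirnov idea concrete, here is a minimal sketch that computes the KS statistic (the largest gap between the empirical CDF and a fully specified hypothesized CDF) for a sample. The helper name `ks_statistic` and the exponential example are illustrative assumptions. The large-sample 5% critical value of about 1.36/sqrt(n) applies only when the hypothesized distribution's parameters are not estimated from the same data.

```python
import math
import random


def ks_statistic(sample, cdf):
    """Largest absolute gap between the empirical CDF and a hypothesized CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at x, so check both sides
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def exponential_cdf(rate):
    """CDF of an exponential distribution with the given rate."""
    return lambda x: 1 - math.exp(-rate * x) if x >= 0 else 0.0


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    data = [rng.expovariate(2.0) for _ in range(200)]
    d = ks_statistic(data, exponential_cdf(2.0))
    # Approximate large-sample critical value at alpha = 0.05
    critical = 1.36 / math.sqrt(len(data))
    print("D =", round(d, 4), "critical =", round(critical, 4))
```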
Thursday, October 20, 2022
Lecture G3 (2022-10-20): Input Modeling, Part 3 (Parameter Estimation and Goodness of Fit)
In this lecture, we (nearly) finish our coverage of Input Modeling, where the focus of this lecture is on parameter estimation and assessing goodness of fit. We review input modeling in general and then briefly review fundamentals of hypothesis testing. We discuss type-I error, p-values, type-II error, effect sizes, and statistical power. We discuss the dangers of using p-values at very large sample sizes (where small p-values are not meaningful) and at very small sample sizes (where large p-values are not meaningful). We give some examples of this applied to best-of-7 sports tournaments and voting. We then discuss different types of distribution parameters (including location, scale, and rate), and then introduce summary statistics (sample mean and sample variance) and maximum likelihood estimation (MLE), with an example for a point estimate of the rate of an exponential. We introduce the chi-squared (lower power) and Kolmogorov–Smirnov (KS, higher power) tests for goodness of fit, but we will go into them in more detail at the start of the next lecture.
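The exponential-rate example mentioned above can be sketched in a few lines. For independent exponential observations, the maximum likelihood estimate of the rate is the number of observations divided by their sum (that is, the reciprocal of the sample mean). The function name `exponential_rate_mle` and the simulated data are assumptions for illustration only.

```python
import random


def exponential_rate_mle(observations):
    """MLE of the rate of an exponential distribution: n / sum(x)."""
    if not observations:
        raise ValueError("need at least one observation")
    return len(observations) / sum(observations)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the sketch is repeatable
    true_rate = 2.0         # hypothetical "true" rate used to generate data
    data = [rng.expovariate(true_rate) for _ in range(10_000)]
    print("estimated rate:", round(exponential_rate_mle(data), 3))
```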
Tuesday, October 18, 2022
Lecture G2 (2022-10-18): Input Modeling, Part 2 (Selection of Model Structure)
In this lecture, we continue discussing the choice of input models in stochastic simulation. Here, we pivot from talking about data collection to selection of the broad family of probabilistic distributions that may be a good fit for data. We start with an example where a histogram leads us to introduce additional input models into a flow chart. The rest of the lecture is about choosing models based on physical intuition and the shape of the sampled data (e.g., the shape of histograms). We close with a discussion of probability plots – Q-Q plots and P-P plots, as are used with "fat-pencil tests" – as a good tool for justifying the choice of a family for a certain data set. The next lecture will go over the actual estimation of the parameters for the chosen families and how to quantitatively assess goodness of fit.
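As a small sketch of how the points on a Q-Q plot can be generated, the code below pairs each sorted observation with the corresponding quantile of a normal distribution fitted by the sample mean and standard deviation. The plotting position (i - 0.5)/n is one common convention and is an assumption here, as is the helper name `normal_qq_points`. If the normal family is a good fit, the points should fall close to the line y = x.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(sample):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q plot."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))  # fit by sample mean and std. dev.
    # Assumed plotting position: (i - 0.5) / n for the i-th smallest value
    return [(fitted.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]


if __name__ == "__main__":
    data = [4.1, 5.3, 4.8, 6.0, 5.1, 4.6, 5.7, 5.0]  # made-up illustrative data
    for theoretical, observed in normal_qq_points(data):
        print(round(theoretical, 3), observed)
```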
Friday, October 14, 2022
Lecture G1 (2022-10-13): Input Modeling, Part 1 (Data Collection)
In this lecture, we introduce the detailed process of input modeling. Input models are probabilistic models that introduce variation in simulation models of systems. Those input models must be chosen to match statistical distributions in data. Over this unit, we cover collection of data for this process, choice of probabilistic families to fit to these data, and then optimized parameter choice within those families and evaluation of the resulting fit using goodness-of-fit tests. In this lecture, we discuss issues related to data collection.