Tuesday, November 12, 2024

Lecture J3 (2024-11-12): Estimation of Absolute Performance, Part III: Non-Terminating Systems/Steady-State Simulations

In this lecture, we start by further reviewing confidence intervals (where they come from and what they mean) and prediction intervals and then use them to motivate a simpler way to determine how many replications are needed in a simulation study (focusing first on transient simulations of terminating systems). We then shift our attention to steady-state simulations of non-terminating systems and the issue of initialization bias. We discuss different methods of "warming up" a steady-state simulation to reduce initialization bias and then merge that discussion with the prior discussion on how to choose the number of replications. In the next lecture, we'll finish up with a discussion of the method of "batch means" in steady-state simulations.
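
As a rough sketch of the replication-count idea above (not taken from the lecture itself), the code below estimates how many replications are needed for a confidence interval to reach a target half-width, given the outputs of a small pilot study. The function names `required_replications` and `truncated_mean` are hypothetical. The critical value comes from a normal approximation rather than a t-distribution, which is an assumption made to keep the example within the Python standard library.

```python
import math
from statistics import NormalDist, mean, stdev


def required_replications(pilot_outputs, half_width, confidence=0.95):
    """Estimate replications needed for a CI half-width of at most `half_width`."""
    # Assumption: normal-approximation critical value. A t-quantile would be
    # slightly larger, so treat the result as an optimistic lower bound.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    s0 = stdev(pilot_outputs)  # sample standard deviation across pilot replications
    needed = math.ceil((z * s0 / half_width) ** 2)
    # Never report fewer replications than the pilot study already ran.
    return max(needed, len(pilot_outputs))


def truncated_mean(series, warmup):
    """Average one replication's observations after deleting the warm-up period."""
    # Dropping the first `warmup` observations is one simple way to reduce
    # initialization bias in a steady-state simulation.
    return mean(series[warmup:])
```

For a steady-state study, each replication's output could first be reduced with `truncated_mean`, and the resulting per-replication averages passed to `required_replications` as the pilot outputs.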



Lecture J2 (2024-11-07): Estimation of Absolute Performance, Part II: Terminating Systems/Transient Simulations

In this lecture, we review estimating absolute performance from simulation, with a focus on choosing the number of necessary replications of transient simulations of terminating systems. The lecture starts with an overview of point estimation, bias, and different types of point estimators. This includes an overview of quantile estimation and how quantile estimates let simulations serve as null-hypothesis-prediction generators. We then introduce interval estimation with confidence intervals and prediction intervals. Confidence intervals, which are visualizations of t-tests, provide an alternative way to choose the number of required replications without doing a formal power analysis.



Tuesday, November 5, 2024

Lecture J1 (2024-11-05): Estimation of Absolute Performance, Part I: Introduction to Point and Interval Estimation

In this lecture, we introduce the estimation of absolute performance measures in simulation – effectively shifting our focus from validating input models to validating and making inferences about simulation outputs. Most of this lecture is a review of statistics and reasons for the assumptions for various parametric and non-exact non-parametric methods. We also introduce a few more advanced statistical topics, such as non-parametric methods and special high-power tests for normality. We then switch to focusing on simulations and their outputs, starting with the definition of terminating and non-terminating systems as well as the related transient and steady-state simulations. We will pick up next time with details related to performance measures (and methods) for transient simulations, followed by steady-state simulations after that. Our goal was to discuss the difference between point estimation and interval estimation for simulation, but we will hold off on that topic until the next lecture.




Lecture I (2024-10-31): Statistical Reflections

In this lecture, we review statistical fundamentals – such as the origins of the t-test, the meaning of type-I and type-II error (and alternative terminology for both, such as false positive rate and false negative rate) and the connection to statistical power (sensitivity). We review the Receiver Operating Characteristic (ROC) curve and give a qualitative description of where it gets its shape in a hypothesis test. We close with a validation example (from Lecture H) where we use a power analysis on a one-sample t-test to help justify whether we have gathered enough data to trust that a simulation model is a good match for reality when it has a similar mean output performance to the real system.
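
To make the power-analysis idea concrete, here is a minimal sketch (not the lecture's own example) of approximate power for a two-sided one-sample test as a function of standardized effect size and sample size. It assumes a normal approximation to the t-test, so it is slightly optimistic for small samples. The names `approx_power` and `smallest_n_for_power` are hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample test (normal approximation)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = effect_size * math.sqrt(n)          # mean of the test statistic under H1
    # Probability that the statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def smallest_n_for_power(effect_size, target_power=0.8, alpha=0.05, n_max=100_000):
    """Smallest sample size whose approximate power reaches `target_power`."""
    for n in range(2, n_max + 1):
        if approx_power(effect_size, n, alpha) >= target_power:
            return n
    return None  # target not reachable within n_max samples
```

With an effect size of zero, the computed power collapses to the type-I error rate alpha, which matches the ROC-curve intuition that sensitivity and false positive rate coincide when the null and alternative are indistinguishable.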



Tuesday, October 29, 2024

Lecture H (2024-10-29): Verification, Validation, and Calibration of Simulation Models

During this lecture slot, we start with slides from Lecture G3 (on goodness of fit) that were missed during the previous lecture due to timing. In particular, we review hypothesis testing fundamentals (type-I error, type-II error, statistical power, sensitivity, false positive rate, true negative rate, receiver operating characteristic, ROC, alpha, beta) and then go into examples of using Chi-squared and Kolmogorov–Smirnov tests for goodness of fit for arbitrary distributions. We also introduce Anderson–Darling (for flexibility and higher power) and Shapiro–Wilk (for high-powered normality testing).

We close with where we originally intended to start – with definitions of testing, verification, validation, and calibration. We will pick up from here next time.
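
For readers who want to see what the Kolmogorov–Smirnov test from this lecture actually measures, the following sketch computes the KS statistic: the largest gap between the empirical CDF of a sample and a hypothesized CDF. It is an illustration under assumptions, not the lecture's code. The names `ks_statistic` and `exponential_cdf` are hypothetical, and turning the statistic into a p-value is left to a statistics library.

```python
import math


def ks_statistic(sample, cdf):
    """Largest vertical gap between the sample's empirical CDF and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x, so check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def exponential_cdf(rate):
    """Return the CDF of an exponential distribution with the given rate."""
    return lambda x: 1.0 - math.exp(-rate * x) if x > 0 else 0.0
```

A call such as `ks_statistic(observed_times, exponential_cdf(2.0))` would then measure how far a list of observed times (here a hypothetical `observed_times`) sits from an exponential model with rate 2.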



Thursday, October 24, 2024

Lecture G3 (2024-10-24): Input Modeling, Part 3: Parameter Estimation and Goodness of Fit

In this lecture, we (nearly) finish our coverage of Input Modeling, where the focus of this lecture is on parameter estimation and assessing goodness of fit. We review input modeling in general and then briefly review fundamentals of hypothesis testing. We discuss type-I error, p-values, type-II error, effect sizes, and statistical power. We discuss the dangers of using p-values at very large sample sizes (where small p-values are not meaningful) and at very small sample sizes (where large p-values are not meaningful). We give some examples of this applied to best-of-7 sports tournaments and voting. We then discuss different types of distribution parameters (including location, scale, and rate), and then introduce summary statistics (sample mean and sample variance) and maximum likelihood estimation (MLE), with an example for a point estimate of the rate of an exponential. We introduce the chi-squared (lower power) and Kolmogorov–Smirnov (KS, higher power) tests for goodness of fit, but we will go into them in more detail at the start of the next lecture.
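
The exponential-rate example mentioned above has a closed-form answer: maximizing the log-likelihood n·log(λ) − λ·Σx gives the estimate λ = n / Σx, the reciprocal of the sample mean. A minimal sketch follows, with `exponential_rate_mle` as a hypothetical function name rather than anything from the lecture.

```python
def exponential_rate_mle(samples):
    """Maximum likelihood estimate of an exponential rate: n / sum(samples)."""
    if not samples or any(x < 0 for x in samples):
        raise ValueError("need a non-empty list of non-negative observations")
    total = sum(samples)
    if total == 0:
        raise ValueError("rate is unbounded when every observation is zero")
    # Setting the derivative of the log-likelihood to zero gives n / sum(x).
    return len(samples) / total
```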



Tuesday, October 22, 2024

Lecture G2 (2024-10-22): Input Modeling, Part 2: Selection of Model Structure

In this lecture, we continue discussing the choice of input models in stochastic simulation. Here, we pivot from talking about data collection to selection of the broad family of probabilistic distributions that may be a good fit for data. We start with an example where a histogram leads us to introduce additional input models into a flow chart. The rest of the lecture is about choosing models based on physical intuition and the shape of the sampled data (e.g., the shape of histograms). We close with a discussion of probability plots – Q-Q plots and P-P plots, as are used with "fat-pencil tests" – as a good tool for justifying the choice of a family for a certain data set. The next lecture will go over the actual estimation of the parameters for the chosen families and how to quantitatively assess goodness of fit.


