Have you ever run a statistical test to determine whether data are normally distributed? If so, you have probably used Kolmogorov's D statistic. Kolmogorov's D statistic (also called the Kolmogorov-Smirnov statistic) enables you to test whether the empirical distribution of data is different from a reference distribution. The reference distribution

Tags: data analysis, Statistical Programming, Uncategorized

Posted in SAS | Comments Off on What is Kolmogorov’s D statistic?
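
As a rough illustration of the idea in the excerpt above, the D statistic can be computed as the largest vertical gap between the empirical CDF of a sample and the reference CDF. The following is a minimal Python sketch, not the SAS code from the post; the function names `normal_cdf` and `kolmogorov_d` are hypothetical, and the reference distribution is assumed to be fully specified (no parameters estimated from the data).

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, used here as the assumed reference distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def kolmogorov_d(sample, cdf):
    """Largest vertical distance between the empirical CDF of `sample` and `cdf`.

    The empirical CDF jumps at each sorted data value, so the gap is checked
    just after the jump (i/n) and just before it ((i-1)/n).
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical data, for illustration only
data = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.7]
print(kolmogorov_d(data, normal_cdf))
```

A large value of D suggests that the sample does not follow the reference distribution; turning D into a p-value requires the sampling distribution of the statistic, which this sketch does not cover.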

At SAS Global Forum 2019, Daymond Ling presented an interesting discussion of binary classifiers in the financial industry. The discussion is motivated by a practical question: If you deploy a predictive model, how can you assess whether the model is no longer working well and needs to be replaced? Daymond

Tags: Bootstrap and Resampling, data analysis, Statistical Thinking, Uncategorized

Posted in SAS | Comments Off on Discrimination, accuracy, and stability in binary classifiers

The CUSUM test has many incarnations. Different areas of statistics use different assumptions and test different hypotheses. This article presents a brief overview of CUSUM tests and gives an example of using the CUSUM test in PROC AUTOREG for autoregressive models in SAS. A CUSUM test uses the cumulative

Tags: data analysis, Uncategorized

Posted in SAS | Comments Off on A CUSUM test for autoregressive models

Many statistical tests use a CUSUM statistic as part of the test. It can be confusing when a researcher refers to "the CUSUM test" without providing details about exactly which CUSUM test is being used. This article describes a CUSUM test for the randomness of a binary sequence. You start

Tags: data analysis, Uncategorized, vectorization

Posted in SAS | Comments Off on The CUSUM test for randomness of a binary sequence

I've previously written about how to deal with nonconvergence when fitting generalized linear regression models. Most generalized linear and mixed models use an iterative optimization process, such as maximum likelihood estimation, to fit parameters. The optimization might not converge, either because the initial guess is poor or because the model

Tags: data analysis, Tips and Techniques, Uncategorized

Posted in SAS | Comments Off on Convergence in mixed models: When the estimated G matrix is not positive definite

Many SAS procedures support the BY statement, which enables you to perform an analysis for subgroups of the data set. Although the SAS/IML language does not have a built-in "BY statement," there are various techniques that enable you to perform a BY-group analysis. The two I use most often are

Tags: Bootstrap and Resampling, data analysis, Statistical Programming, Uncategorized

Posted in SAS | Comments Off on Matrix operations and BY groups

An important concept in multivariate statistical analysis is the Mahalanobis distance. The Mahalanobis distance provides a way to measure how far away an observation is from the center of a sample while accounting for correlations in the data. The Mahalanobis distance is a good way to detect outliers in multivariate

Tags: data analysis, Statistical Thinking, Uncategorized

Posted in SAS | Comments Off on The geometry of multivariate versus univariate outliers
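
To make the idea in the excerpt above concrete, here is a minimal Python sketch of the Mahalanobis distance for two variables. It is an illustration under stated assumptions rather than the code from the post: the function name `mahalanobis_2d` is hypothetical, the center and covariance matrix are assumed to be known, and the covariance values are made up for the example.

```python
import math

def mahalanobis_2d(point, center, cov):
    """Mahalanobis distance of a 2-D point from `center`, given a 2x2 covariance matrix.

    Computes sqrt(d' * inverse(cov) * d), where d = point - center, using the
    closed-form inverse of a 2x2 matrix.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    # Quadratic form with inverse(cov) = (1/det) * [[d, -b], [-c, a]]
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)

# Hypothetical strongly correlated data: correlation 0.9, unit variances
cov = [[1.0, 0.9], [0.9, 1.0]]
center = (0.0, 0.0)

along = mahalanobis_2d((1.0, 1.0), center, cov)     # follows the correlation
against = mahalanobis_2d((1.0, -1.0), center, cov)  # goes against it
print(along, against)
```

Both points are the same Euclidean distance from the center, but the point that goes against the correlation has a much larger Mahalanobis distance, which is why the statistic can flag multivariate outliers that look unremarkable in each variable separately.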

An analyst was using SAS to analyze some data from an experiment. He noticed that the response variable is always positive (such as volume, size, or weight), but his statistical model predicts some negative responses. He posted the data and asked if it is possible to modify the graph so

Tags: data analysis, Statistical Graphics, Uncategorized

Posted in SAS | Comments Off on Truncate response surfaces