gwenhallberg
Level III

How to simulate process data with some degree of autocorrelation?

I work in a manufacturing environment and often see data that is relatively normally distributed but not necessarily randomly distributed. In process capability calculations, the Overall Sigma (standard deviation) is often larger than the Within Sigma (control chart sigma). This can occur for a variety of reasons, such as raw material lot switches and periodic instrument calibrations. I am trying to evaluate some proposed control strategies, but I can't figure out how to simulate my process accurately. If I just use Random Normal(), I don't get any of the autocorrelation I usually see in the actual process data. Any ideas on how to simulate data where the stability index is not 1? This is not a traditional time series problem where the data pattern has a predictable period. Thank you all in advance for your assistance.

10 REPLIES

Re: How to simulate process data with some degree of autocorrelation?

There is a function for this purpose: Random Multivariate Normal().

[Image: multivar normal.PNG]

gwenhallberg
Level III

Re: How to simulate process data with some degree of autocorrelation?

Thanks, @Mark_Bailey for the suggestion.  This seems like it should be exactly what I need, but I'm afraid I'm having trouble figuring out how to properly implement it.  To start with, I would just like to generate one simulated column in a data table.  I know the mean, overall process standard deviation, within sigma (control chart sigma - which in this situation is based on the moving range), and the "Autocorrelation" value (from the summary statistics in the distribution platform).  I can also get a Correlation value for the process parameter against the Lag() of itself (by one row) from Fit Y by X - and I could repeat the correlation analysis for different row lags.  Any chance you could help me understand how I should structure the formula given the information I have (or let me know if I am just missing the boat entirely)?
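
One possible way to connect those statistics, sketched under the assumption (not confirmed anywhere in this thread) that the process behaves like a stationary first-order autoregressive, AR(1), series with normal innovations. The symbols are illustrative: $\phi$ is the lag-1 autocorrelation, $\sigma_\varepsilon$ is the standard deviation of the random innovations, and $\overline{MR}$ is the average moving range of span 2.

\[
\begin{aligned}
x_t &= \mu + \phi\,(x_{t-1} - \mu) + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma_\varepsilon^2), \quad |\phi| < 1 \\
\sigma_{\text{overall}}^2 &= \frac{\sigma_\varepsilon^2}{1 - \phi^2}, \qquad \operatorname{Corr}(x_t, x_{t-k}) = \phi^{k} \\
\operatorname{Var}(x_t - x_{t-1}) &= 2\,\sigma_{\text{overall}}^2\,(1 - \phi) \\
\sigma_{\text{within}} &= \frac{\overline{MR}}{1.128} \approx \sigma_{\text{overall}}\,\sqrt{1 - \phi} \\
\frac{\sigma_{\text{overall}}}{\sigma_{\text{within}}} &\approx \frac{1}{\sqrt{1 - \phi}}
\end{aligned}
\]

Under this assumption, a lag-1 autocorrelation of 0.5 would correspond to an overall-to-within sigma ratio of about 1.41, and the innovation sigma to feed a simulation would be $\sigma_{\text{overall}}\sqrt{1 - \phi^2}$. Shifts caused by lot switches or calibrations are not necessarily AR(1), so this is a rough guide rather than a model of the actual process.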

ih
Super User (Alumni)

Re: How to simulate process data with some degree of autocorrelation?

After some investigation, some notes:

  • Some noise is added to the values; it seems to have a standard deviation near 1.
  • You need to define a mean vector and a symmetric covariance matrix whose dimensions match the mean vector (k means require a k × k matrix).

Some examples:

Names Default To Here( 1 );

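//Make a data table from a mean vector (mv), covariance matrix (cm), and number of rows (nr), then draw a scatterplot matrix.
//Arguments are positional: the "mv =" and "cv =" labels in the calls below are ordinary assignments whose values are passed along.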
Make Correlated Table = Function({mv, cm, nr, name = "Data Table"}, {dt},
	randmvnMat = Random Multivariate Normal( mv, cm, nr );
	dt = New Table( name );
	dt << Set Matrix(randmvnMat);
	dt << Scatterplot Matrix( Y( dt << Get Column References ), Matrix Format( "Lower Triangular" ), Density Ellipses( 1 ) );
	dt
);

dt1 = Make Correlated Table(
	mv = [0 1 1 0],
	cv = [
		1 0 0 0, 
		0 1 0 0,
		0 0 1 0,
		0 0 0 1
	],
	nr = 100,
	name = "No Correlation"
);

dt2 = Make Correlated Table(
	mv = [0 10 20],
	cv = [
		1 1 0, 
		1 1 0, 
		0 0 1
	],
	nr = 20,
	name = "1-2 Correlated"
);

dt3 = Make Correlated Table(
	mv = [0 10 20],
	cv = [
		1 1 1, 
		1 1 1, 
		1 1 1
	],
	nr = 20,
	name = "1-2-3 Correlated"
);

dt4 = Make Correlated Table(
	mv = [0 10 20],
	cv = [
		 1.0  0.8 -0.5, 
		 0.8  1.0  0.0, 
		-0.5  0.0  1.0
	],
	nr = 20,
	name = "Some Correlation"
);
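
The same function can also be pointed at a single autocorrelated column by treating n consecutive observations as one draw from an n-dimensional normal whose covariance decays with the distance between rows. The following is only a minimal sketch, assuming an AR(1)-style correlation of rho ^ |i - j|; the values of mu, sigma, rho, and n are hypothetical and not taken from the thread.

```jsl
Names Default To Here( 1 );

// Hypothetical settings (assumptions for illustration only)
n = 100;      // number of consecutive observations
mu = 100;     // process mean
sigma = 2;    // overall standard deviation
rho = 0.7;    // lag-1 autocorrelation

// Covariance between rows i and j decays as rho ^ |i - j|
cm = J( n, n, 0 );
For( i = 1, i <= n, i++,
	For( j = 1, j <= n, j++,
		cm[i, j] = sigma ^ 2 * rho ^ Abs( i - j )
	)
);

// One draw of length n is one simulated series (a 1 x n matrix)
series = Random Multivariate Normal( J( 1, n, mu ), cm, 1 );

dt = New Table( "Autocorrelated via MVN",
	New Column( "Y", Numeric, "Continuous", Set Values( series ) )
);
```

The covariance matrix grows as n squared, so this approach suits short series; a recursive simulation is likely a better fit for long ones.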
gwenhallberg
Level III

Re: How to simulate process data with some degree of autocorrelation?

Thanks, @ih - Your nifty script made it so I could understand how the Random Multivariate Normal() feature was intended to work.  This will be great to simulate multiple correlated factors - and is not something I was aware of before.

Re: How to simulate process data with some degree of autocorrelation?

I apologize. I was wrong. You do not have multiple correlation (i.e., more than one variable). You have a time series. Let me look into it more.

gwenhallberg
Level III

Re: How to simulate process data with some degree of autocorrelation?

Thanks, Mark.  If it helps, at all - here is an example of the type of data I am trying to simulate.

Control Chart.jpg

ian_jmp
Staff

Re: How to simulate process data with some degree of autocorrelation?

If you don't mind messing with JSL, then you can play the simulation game indefinitely. Please find attached a couple of old scripts that might give you some ideas. The first builds on the reply from @peng_liu and allows you to simulate output from an AR(2) process and see how the estimated parameters match up with the model parameters used. Stating the obvious perhaps, but not every process can be usefully modelled as autoregressive no matter how many parameters are used. I always found the book by Box and Luceño very useful in the industrial context. The second script simulates what they call a 'sticky innovation process'.
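
The attached scripts are not reproduced in this thread. As a stand-in, here is a minimal sketch of how an AR(2) series might be simulated in JSL; it is not the attached script, and every parameter value (mu, phi1, phi2, sigmaE, n, burn) is a hypothetical choice made for illustration.

```jsl
Names Default To Here( 1 );

Random Reset( 12345 );  // fixed seed so the run is repeatable

// Hypothetical parameters (assumptions, not taken from the thread)
mu = 100;      // process mean
phi1 = 0.6;    // lag-1 autoregressive coefficient
phi2 = 0.2;    // lag-2 autoregressive coefficient
sigmaE = 1;    // standard deviation of the random innovations
n = 200;       // rows to keep
burn = 100;    // warm-up rows discarded so the zero start values wash out

// Deviations from the mean: z[t] = phi1*z[t-1] + phi2*z[t-2] + e[t]
z = J( n + burn, 1, 0 );
For( t = 3, t <= n + burn, t++,
	z[t] = phi1 * z[t - 1] + phi2 * z[t - 2] + Random Normal( 0, sigmaE )
);

// Drop the warm-up rows and add the mean back
x = J( n, 1, 0 );
For( t = 1, t <= n, t++,
	x[t] = mu + z[burn + t]
);

dt = New Table( "Simulated AR(2)",
	New Column( "Y", Numeric, "Continuous", Set Values( x ) )
);
```

Setting phi2 to 0 reduces this to an AR(1) series, in which case an innovation sigma of (overall sigma) * Sqrt( 1 - phi1 ^ 2 ) reproduces a chosen overall standard deviation. The resulting column can then be put on an individuals and moving range chart to compare its Overall and Within Sigma with the real process.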

ih
Super User (Alumni)

Re: How to simulate process data with some degree of autocorrelation?

Edit: I also misread and gave a suggestion for multi-correlation instead of autocorrelation. Whoops!

I haven't tried @Mark_Bailey's suggestion; I think that might be an easier way to do the same thing.

You might try using principal components: create random values for the latent variables, then calculate your simulated variables from those latent variables and add univariate error.

[Image: ih_0-1664554728405.png]

Using this method you can even make data that is similar to your own process by finding your own loadings, coefficients, variances, and errors, and then simulating data just like it.

Here is an example:

New Table( "Example Correlated Data",
	Add Rows( 100 ),
	New Column( "Prin 1", Numeric, "Continuous", Format( "Best", 12 ),
		Formula( Random Normal( 0, 1 ) )
	),
	New Column( "Prin 2", Numeric, "Continuous", Format( "Best", 12 ),
		Formula( Random Normal Mixture( [-1, 2], [0.3, 0.6], [0.25, 0.75] ) )
	),
	New Column( "X1", Numeric, "Continuous", Format( "Best", 12 ),
		Formula( :Prin 1 * 20 + :Prin 2 * 1 + 5 + Random Normal( 0, 3 ) )
	),
	New Column( "X2", Numeric, "Continuous", Format( "Best", 12 ),
		Formula( :Prin 1 * 2 + :Prin 2 * 1 + 3 + Random Normal( 0, 2 ) )
	),
	New Column( "X3", Numeric, "Continuous", Format( "Best", 12 ),
		Formula( :Prin 1 * -1 + :Prin 2 * 0.3 + 200 + Random Normal( 0, 0.2 ) )
	)
)

gwenhallberg
Level III

Re: How to simulate process data with some degree of autocorrelation?

Thanks very much, @ih, for the suggestion!  I'll give both this and Mark's idea a try.