
Accelerating Innovation with Space Filling Mixture Designs, Neural Networks and SVEM (2021-US-45MP-807)

Level: Intermediate

 

Philip Ramsey, Data Scientist and Professor, North Haven Group and University of New Hampshire
Marie Gaudard, Data Scientist, Predictum
Wayne Levin, Owner, Predictum

 

Although critical to process/product development and improvement, constrained mixture and mixture process designs are challenging to generate and analyze. Classical mixture and mixture process designs typically do not provide adequate coverage of the design space, assigning too many design points to boundary areas. Also, classical mixture process designs are often too large to be feasible. Furthermore, the large amount of partial aliasing inherent in constrained mixture designs renders them difficult to analyze, especially for predictive modeling purposes.

In this talk, we propose new classes of efficient constrained mixture and mixture process designs based upon the fast flexible space filling designs available in JMP.  These designs provide much improved coverage of the design space and are generated without the requirement to pre-specify a model.

Through simulation and case studies we show how these new designs, coupled with self-validating ensemble modeling (SVEM) to fit neural networks, result in effective predictive models. These models provide a basis for process characterization and optimization, thereby ensuring better scale-up and scale-down model performance.  We demonstrate how to generate the designs and conduct the SVEM analyses using JMP 16.

 

 

Auto-generated transcript...

 


Speaker

Transcript

Philip J. Ramsey Hello, this is Phil Ramsey from the University of New Hampshire and Predictum. And along with my co-presenters, Wayne Levin of Predictum and Marie Gaudard,
  Professor Emerita at the University of New Hampshire and with Predictum, we're going to present a talk on modern mixture designs.
  And the agenda for the talk is: we're going to talk about machine learning and design of experiments, briefly introduce what mixture designs are,
  talk about a new machine learning method for DOE called auto-validation, which is combined with ensemble modeling to create something we call self-validating ensemble modeling methodology.
  And we'll then talk about SVEM, which is self-validating ensemble modeling, and we'll do a demonstration using a mixture experiment and then we're going to talk about
  the SVEM method and space-filling designs. And Wayne Levin will be talking about an add-in that Predictum has developed to do SVEM and he'll also be doing the demonstration of the urea experiment.
  And I want to point out that recently, in fact this year alone, we've already had three, and will soon have four, papers published in journals on the topic of
  DOE and machine learning, so people have discovered that machine learning methods traditionally applied to big data
  may have a lot of applicability to DOE, and we certainly agree. And we're going to show how they can be used effectively with mixture experiments. But one of the traditional limitations in the application of so-called machine learning
  to smaller data sets like DOE is the belief that one had to have very large data sets. This is actually not entirely true.
  And as you're going to see, the SVEM method that has been developed by a number of people, including myself and Dr. Chris Gotwalt from JMP,
  actually makes it quite feasible to apply these traditional big data methods to the much smaller data sets we find with design of experiments.
  And also, machine learning is really all about prediction. So we divide statistical modeling into two approaches. One is called explanatory, which is the classic approach; it focuses on hypothesis testing, p-values, and parameters,
  and that usually does not lead to good predictive models. Machine learning, by contrast, focuses entirely on prediction.
  So that is our point of view, and that is what we're going to use today and apply it to mixture experiments. We're going to use machine learning
  for prediction. Our focus is not on explanation and, as you will see, we are going to be able to apply these predictive machine learning methods to the small design of experiments
  data sets. Well, traditionally the way we evaluate predictive models is by having a portion of the data used to train, so we call it the training set, a very common method.
  And then we have another set that in some way we've held back. There are a number of strategies. We'll only talk about one, called hold back.
  We set it aside, we fit the model to the training set and then the hold back set, or validation set, allows us to evaluate the prediction capability. In other words, we're predicting data that's not being used to fit the model.
  However, this strategy won't work with design of experiments. Designed experiments are very efficient and information-centric,
  and there typically are not sufficient resources or budget to develop validation sets. Rarely can people do additional runs that could be used for validation.
  So, this would appear to be a barrier to using machine learning for predictive modeling but, as we'll show, this SVEM method gets us around this constraint.
  And I just, before we get into SVEM, want to quickly give you a discussion of mixture designs. I know some of you are familiar with them, some are not.
  These are a special type of experimental design. They're not as common as they should be, because people don't know about them, but they refer to experimentation with a formulation or recipe, and the components of these formulations can literally be liquids, gases, or solids.
  And one of the important concepts is that the impact on the responses depends only on the relative proportions of the components.
  So we're very much concerned about the proportions of our components, as you would be, say, in a recipe. We are not focused on amounts. If the total amount present were important, that would lead to another type of experiment called a mixture-amount design, which is outside of today's scope.
  But because these are components of a formulation, there are constraints on these factors. In other words, the total amount has to be constrained; sometimes that happens naturally.
  But also the proportions have to be constrained. So in a mixture experiment, unlike a factorial experiment, you cannot independently change the factors.
  So in traditional DOE that many people are familiar with, there are no constraints on the joint settings of the factors. In some cases there are, but typically there are not.
  Mixture experiments, on the other hand, since the factors are ingredients of some formulation, require the total to sum to 1, or 100%. So if I have Q factors, then the settings of those Q experimental factors for every trial have to sum to 1. Furthermore,
  the setting for any one individual component has to be somewhere between 0 and 1. I'll also point out, and we will not have time to discuss this today,
  an important variant on mixture designs is the mixture process experiment, in which we combine mixture factors with process factors.
  We do this because the behavior of mixtures (think of something like a three-part adhesive) changes dramatically with the processing conditions. So we might look at things like pressure and temperature for curing an adhesive, and the optimal
  recipe typically changes as a function of those process factors. That is out of scope for today's talk, but it is an important aspect of mixtures.
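To make the simplex constraint just described concrete, here is a minimal Python sketch, not part of the talk or the JMP workflow, that checks whether a candidate blend satisfies the mixture constraints. The three-component bounds are illustrative assumptions, not the actual constraints of any study discussed here.

```python
import numpy as np

def is_valid_blend(x, lower, upper, tol=1e-8):
    """Mixture constraint check: proportions sum to 1 and each
    component lies within its (assumed) lower/upper bound."""
    x = np.asarray(x, dtype=float)
    sums_to_one = abs(x.sum() - 1.0) <= tol
    within_bounds = np.all((x >= lower - tol) & (x <= upper + tol))
    return bool(sums_to_one and within_bounds)

# Hypothetical bounds for a three-component blend (illustrative only).
lower = np.array([0.40, 0.10, 0.05])
upper = np.array([0.70, 0.40, 0.30])
print(is_valid_blend([0.52, 0.29, 0.19], lower, upper))  # True: sums to 1, within bounds
print(is_valid_blend([0.60, 0.60, 0.10], lower, upper))  # False: sums to 1.3 and violates bounds
```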
  So, historically, when people analyzed mixtures, and even created them, the focus has been on explanatory goals, that is,
  hypothesis testing on parameters, p-values, and confidence intervals. However, for mixtures in general, this has always been problematic because mixtures inherently have a lot of
  multicollinearity, or correlation among the factors and effects. They have to, because the total amount is constrained and the proportions have to sum to 1, so they cannot be independent by design. However,
  we disagree with this traditional focus, and with the extensive experience that we have with mixture experiments, the goal is invariably, frankly, prediction.
  It's not explanation. We're trying to predict performance of mixtures or recipes and that's what the scientists or engineers are interested in.
  So there has been a big disconnect between what the scientists need and how we've analyzed mixtures. Also a lot of the traditional designs are what we would call boundary point designs.
  If you're familiar with the term D-optimal, they tend to be very D-optimal, that is,
  most of the points are assigned around the boundaries, so we have little interior information. There are very few points, sometimes none,
  on the interior of the design region. Why is this important? Well, mixture systems, especially in chemical and biological applications,
  tend to be very complex. There are very complex kinetics going on, and these response surfaces have very complicated shapes.
  And if we don't have a good deal of information on the interior of the design region, then very likely we are not going to come up with models that really represent the behavior of the system.
  And this has actually happened quite frequently in the history of the use of mixture designs. Also again, we won't have time to talk about it today,
  a classic approach to analysis or modeling in the past is what are called Scheffé polynomials,
  and these are a type of polynomial designed to be used with mixtures. There are many variants of them, but these polynomials, especially in their classic forms, are frequently inadequate to model these response surfaces.
  So how do we approach the problem? Well, we believe the way we need to approach mixture designs is from the point of view of space filling designs.
  And why space filling designs? Again that could be a talk by itself, but these are designs that are created to cover a design space as best one can for a given number of design points.
  And the number of design points
  is an option for the user. This can actually allow us to use smaller designs than
  classic designs, and they cover the design region. This, as a result, can give us more accurate and useful predictive models. We have better information over the whole design region.
  And our approach to mixtures is all based upon the use of space filling designs, and this is a trend we've noticed in industry in recent years.
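As a rough illustration of the space filling idea, the sketch below samples many candidate blends on the constrained simplex and greedily keeps a maximin subset. This is a simplified stand-in of my own, not the fast flexible filling algorithm JMP actually uses, and the bounds and run count are assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

def space_filling_mixture(n_runs, lower, upper, n_candidates=5000):
    """Greedy maximin selection of mixture runs from Dirichlet candidates;
    a conceptual stand-in for JMP's fast flexible filling designs."""
    q = len(lower)
    cands = rng.dirichlet(np.ones(q), size=n_candidates)     # points on the simplex
    feasible = np.all((cands >= lower) & (cands <= upper), axis=1)
    cands = cands[feasible]                                   # apply the component bounds
    design = [cands[0]]                                       # seed with any feasible point
    for _ in range(n_runs - 1):
        dists = np.linalg.norm(cands[:, None, :] - np.array(design)[None, :, :], axis=2)
        nearest = dists.min(axis=1)                           # distance to closest design point
        design.append(cands[nearest.argmax()])                # keep the farthest-from-design candidate
    return np.array(design)

# 15 runs over illustrative three-component bounds (assumed, not from the talk).
design = space_filling_mixture(15, lower=np.array([0.40, 0.10, 0.05]),
                               upper=np.array([0.70, 0.40, 0.30]))
print(design.round(3), design.sum(axis=1).round(3))
```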
  I now want to shift gears and talk about...well, given we've created the mixture, we've run it, well, how do we analyze it?
  And I mentioned earlier, one of the problems with machine learning applied to DOE is we don't really have a natural validation set. Well, Gotwalt and Ramsey, that would be me and Dr Chris Gotwalt, actually proposed a method of validation that we've referred to as auto-validation. And
  the first talk on this was actually given at a Discovery Europe conference some years ago.
  And this will not seem intuitive, but hear me out. We can use the original data, the original experiment for training and for validation. In other words, we take the original data and create a copy or clone of it,
  and we refer to this as the auto-validation set. So every observation in the training set has a twin in the auto-validation set itself.
  Now by itself, this would seem nonsensical. Since the auto-validation set has the same observations as the training set, how does that supply an independent assessment of prediction capability? Well, it can't by itself.
  The key to the idea of auto-validation is that we apply a weighting scheme to the observations, and the scheme is based upon generating weights from a gamma distribution.
  By the way, the gamma distribution is commonly used for weighting in statistical applications. The gamma weights have a number of nice properties that, again, we will not get into today.
  And what we do is we generate a set of weights, but we do it in a special way. If an observation in the training set gets a high value, a high weight,
  the observation in the auto-validation set, the twin, gets a low weight. Why do we do this? This drives anti-correlation between the two data sets.
  In other words, we're trying to uncorrelate the sets using a weighting scheme, and a good deal of research has been done on this in the last few years. It indicates this approach is very effective.
  And there will be a paper coming out shortly in which all these results will be made available. So the idea is to use this weighting scheme: we take the
  training set and assign the gamma weights, and in the auto-validation set we assign corresponding weights to each of the twins, so if an observation in the training set has a high weight,
  its twin in the auto-validation set gets a low weight, and this drives the anti-correlation behavior. And then we simply repeat this strategy some number of times, to be specified by the user. So in the next slide, I show an example of a simple experiment. We have three factors. Notice
  runs 1-7 are the training set, the original. Runs 8-14 are the clone or the copy. Notice the observations are exactly the same values for the two sets and then notice the column called
  Paired FWB Weights. I'll explain FWB in a moment.
  And notice if an observation, like observation 1, gets a high weight, its twin, observation 8, gets a low weight. So observation 8 is in the validation set.
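For readers who want to see the pairing mechanically, here is one simple way to generate anti-correlated paired weights in Python. The exponential (gamma with shape 1) transform of a shared uniform draw is an illustrative assumption on my part; it is not necessarily the exact weighting scheme used by Gotwalt and Ramsey or by the Predictum add-in.

```python
import numpy as np

rng = np.random.default_rng(7)

def paired_fwb_weights(n_runs):
    """Anti-correlated fractional weights for a training run and its
    auto-validation twin, derived from one shared uniform draw per run.
    The Gamma(1, 1) / exponential transform is an illustrative choice."""
    u = rng.uniform(size=n_runs)
    w_train = -np.log(u)        # large when u is small ...
    w_valid = -np.log(1.0 - u)  # ... so the twin's weight is then small
    return w_train, w_valid

w_train, w_valid = paired_fwb_weights(7)
for i, (wt, wv) in enumerate(zip(w_train, w_valid), start=1):
    print(f"run {i}: training weight {wt:.2f}, auto-validation weight {wv:.2f}")
```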
  So this idea of fractional weighting is important, so the analysis is done applying the fractional weights to the response. We then repeat this strategy
  some number of times (and the number of times will depend upon the user and the scenario) and we fit the model each time. In other words, we're basically doing what we call fractionally weighted bootstrapping.
  And what is fractionally weighted bootstrapping? It's basically a form of bootstrapping, also known as generalized bootstrapping,
  in which we do not bootstrap the observations. We bootstrap weights applied to observations. This is a well known methodology and we're applying it.
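To see that difference concretely, here is a tiny sketch contrasting the classical bootstrap with the weight-based (generalized) bootstrap described here, using a simple mean as the statistic; the Gamma(1, 1) weights and the toy data are illustrative choices of mine.

```python
import numpy as np

rng = np.random.default_rng(11)
y = np.array([12.0, 15.0, 9.0, 14.0, 11.0])

# Classical bootstrap: resample the observations themselves with replacement.
idx = rng.integers(0, len(y), size=len(y))
classical_mean = y[idx].mean()

# Generalized (fractionally weighted) bootstrap: keep every observation,
# draw a random weight for each one, and compute a weighted statistic.
w = rng.gamma(shape=1.0, scale=1.0, size=len(y))
weighted_mean = np.sum(w * y) / np.sum(w)

print(classical_mean, weighted_mean)
```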
  What's different is, we apply this repetitively, as I said, it could be hundreds of times, thousands of times, it's really up to the user.
  Each time we fit a model, and then at the end, what we do is take an average, or ensemble. And this is where we get the term
  self-validating ensemble modeling: the auto-validation is the self-validating part, and the averaging is the ensemble part.
  And ensemble modeling, by the way, is very common in machine learning and deep learning because it tends to lead to more stable and better predictive modeling.
  So, SVEM does fractionally weighted bootstrapping to generate a family of models and then uses model averaging to come up with a final predictive model.
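Putting the pieces together, below is a compact sketch of the SVEM loop: draw anti-correlated fractional weights, fit candidate models under the training weights, keep the candidate that does best under the auto-validation weights, and average the predictions over many passes. The weighted ridge fits and the weight transform are illustrative stand-ins for the weighted neural networks and the exact scheme discussed in the talk.

```python
import numpy as np

rng = np.random.default_rng(3)

def weighted_ridge(X, y, w, lam):
    """Weighted ridge fit: a simple stand-in for the weighted
    neural-network fits used in the actual SVEM workflow."""
    XtW = X.T * w
    return np.linalg.solve(XtW @ X + lam * np.eye(X.shape[1]), XtW @ y)

def svem_predict(X, y, X_new, n_boot=200, lams=(0.01, 0.1, 1.0)):
    """SVEM-style ensemble: each pass draws anti-correlated fractional
    weights, fits candidate models under the training weights, keeps the
    one with the lowest error under the auto-validation weights, and the
    final prediction is the average across all passes."""
    preds = []
    for _ in range(n_boot):
        u = rng.uniform(size=len(y))
        w_train, w_valid = -np.log(u), -np.log(1.0 - u)     # anti-correlated pair
        fits = [weighted_ridge(X, y, w_train, lam) for lam in lams]
        sse = [np.sum(w_valid * (y - X @ b) ** 2) for b in fits]
        preds.append(X_new @ fits[int(np.argmin(sse))])     # best by auto-validation
    return np.mean(preds, axis=0)                           # the ensemble model

# Tiny illustrative example: intercept plus two mixture proportions as predictors.
X = np.column_stack([np.ones(7), rng.dirichlet(np.ones(3), size=7)[:, :2]])
y = 100 + 20 * X[:, 1] - 15 * X[:, 2] + rng.normal(0, 2, size=7)
print(svem_predict(X, y, X[:3]).round(1))
```

In the actual workflow described in the talk, the candidate models would be neural networks and the number of passes is chosen by the user, typically in the dozens to hundreds.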
  So, how does the method work? Well, you pick an algorithm for your predictive modeling. In our case, we're going to focus on neural networks for
  mixture designs, but you can pick other methods. SVEM is rather agnostic about what algorithm it is used with, and it's very amenable to many algorithms. So we create the training and auto-validation set. We assign our fractional weights.
  And then we fit the model, and then we bootstrap this some number of times, as defined by the user, and then we take an average. Okay, so at this point I'm going to turn it over to my colleague, Wayne Levin, who's going to give a demonstration of SVEM and talk about an add-in to do it.
Wayne I'm just doing the switch here
  and share screen.
  And just going to switch over the PowerPoint.
  Thanks very much, Phil. We're going to use this example here, this urea data, so the purpose of the experiment is to optimize a solution for a cleansing agent.
  And the three components of this mixture are water, alcohol and urea. And so those are these three columns over here. And we're just going to focus on one response, which is viscosity. The target we're trying to hit is 100 but, as we can see down here,
  we want to be between 95 and 105 in terms of viscosity. So with that, I'm going to switch over to JMP.
  Okay, so here is our urea data. Just opened it up here, and just note that there's only 15 runs, and so I'm just going through go to the next step. And the next step in the
  process is to create the auto-validation table, so when I click on that you'll see a couple of things happen. First, we have a new table, and we have the first 15 runs, which is the exact same as what we had here. And then for each of the runs...so if I focus, for example, on this first row (52, 29, 19),
  you can see it's repeated down below here, so 52, 29, 19.
  And likewise the second row will be 57, 23, 19. I'm obviously not going to repeat through all of these, but I think you get the idea. It's simply repeating the combinations in this...
  in the second set down here, and the first set is the training...what's used for training and then the second, as indicated here, is what we get for validation. Now in between is the paired fractional weighting column that Phil was talking about. So
  we tend to see that when there are high numbers up here, we'll get corresponding lower numbers down below, and it's very much following the...
  well, the gamma distribution that Phil was talking about. So why don't I just show that. I'm just going to plot the weights over here, just kind of
  position that on the screen. So every time we go through the iterations, we refresh the weights. So just have a look at the weights over here and just watch the
  column over here, this column right here. So I'm just going to run it so you can see. So each iteration, we get a fresh set of
  fractionally weighted pairs across both the training and the validation sets. Okay. So next I'm going to
  do a fit model. So I'm going to do, if you will, one iteration, you might say.
  And so viscosity is the response. Just notice down here that the frequency role has this paired fractionally weighted bootstrapping weight, so that's how it gets factored into the modeling. So there we go, and here is, if you will, one iteration,
  okay, based on these particular weights. Well,
  if we want to have an ensemble model, we've got to do that repeatedly. So what I'm going to do now is I'm going to run this with new weights and, if you would,
  keep an eye...you have to keep an eye on kind of three different places at once, which could be a bit of a challenge. I'm just going to try and position these things, alrighty.
  So what's going to happen is, as I click here run with new weights, you're going to see the weights here change, a model will be created and the predictive model will be saved
  up here. I'm going to do one here. Let's give it a go. Bang, so it changed the weights and produced the model.
  Save the model here. This column is also added; this is the ensemble that Phil was talking about, this is the average. Of course I've only done one iteration, so the average is just
  the same. So I'm just going to do another iteration here and we get a new set of weights, a new model, and a new model has been saved, and now the ensemble model over here is the average of these two
  over here. So I'm just going to do it, oh I don't know, a few more times. It's not hard to do, and we can see these things
  change as we're going along. So I've got six of them now, so why don't we have a quick look at these six
  models, the individual models, as well as the average model, the ensemble, and that's shown here in the profiler. And this rather shows the instability that
  Phil was alluding to earlier, because if you were to only do this once...now I know there's the fractional weighting that's going on in here, but as Phil mentioned, you know, a movement of any point can really
  disturb the model, if you will, and you'll end up with features that don't really...
  perhaps don't really belong there. They're just artifacts of particular design points. But just have a look here, you can see how
  each of these individual models looks...well, they have some different features here. But down below, the average model is, in a sense...well, it's the average of all of the models above.
  And you can more or less see that that big kink over here kind of gets smoothed out. You see a little bit of it down here.
  Now this is only with six iterations and with the add-in, we recommend actually doing 50 iterations.
  But what I'll do as well, let's have a look at the actual versus predicted. Now this one came out pretty...
  pretty consistent. The lines are all pretty much a slope of one anyway. Often when we do this, we do see more variation across this, but what you will note in the
  average model, it's a little hard to see here in this particular example, but it is a solid line, basically going through the mass of the other lines there. So I'm just going to move these aside and what I'll do next is
  I'm going to actually do a bunch of runs here. In this demonstration, it's going to do 25 iterations. You can see the model change, you can see the weights
  changing over here. It's not adding anything new up here. So we're just going through. Bang, it's done and now we'll have a look at the report. So I'm just going to open that up.
  Initially we end up with an actual versus predicted over here and a residual plot. It might be nice to look at over here the
  profiler.
  And I love the profiler. It's
  very appropriate for mixtures, because you can see, as I change the water, as I increase it, the other two go down. That's the
  inherent nature of mixtures, of course; they all have to add up to 1. Okay, so this is, if you will, the average model.
  The ensemble model across the 25 fits that have taken place over here, alright. So if I may, I'm going to
  now just have a quick look at the mixture profiler.
  We get this look up here.
  Actually, if I may, what you can see from this is...
  notice the points. Do you see how the points are fitting in very nicely? This is the space filling nature of the design, and you'll also notice that this design actually has constraints in it.
  So we're not covering by any means the entire space of water 0-1 or urea 0-1, and so on, so we have these constraints up here.
  And what I'm going to do now is just throw in the limits, just so I can visualize that 95 to 105. So that's showing me in the white space here just where I can...
  what combinations, if you will, satisfy those conditions...the spec limits up here.
  Okay, so that's the demonstration
  of the SVEM method, the methodology, if you will, and
  if I may, I'll just move on.
  So, SVEM and space filling designs. So again, one of the big motivations for this is that single predictive models fit to these small data sets are inherently unstable.
  So small changes in the data can lead to big changes in the predictions, and a single predictive model, of course, can be unstable as a result. You know, you end up with those kinks in it, and that's just, you know,
  usually quite disturbing to look at, because that's often just not the way the kinetics happen.
  So we need stability, and the ensembling of multiple models, based on the fractionally weighted bootstrapping, gives us that.
  The other big thing about this is the multicollinearity that's inherently present in these situations. Mixtures are complex systems.
  They're not the sum of the components' behavior. They are really the product of the interactions, and that's why, again, I love the profiler, because it depicts
  those interactions beautifully. Those dynamics just come across really nicely. So because we've got all this multicollinearity, SVEM kind of mitigates that and gives stable, more accurate predictive models. We've been running this now...
  actually it reaches back a few years but
  since we created the SVEM add-in, which I'll talk about in just a moment, we've run it numerous times now, since earlier this year, and it's done a really nice job.
  Our clients are very pleased with it. So like I said, the model averaging using SVEM is not automated in JMP standard or JMP Pro, but we do have an add-in, and if you go to our website at Predictum.com, you can learn more about it and how to get it.
  So yeah, there's a particular page on SVEM. We also have a modern mixture design course, and you can see that's with Marie and Phil and myself.
  And that, you can see on the training page on the Predictum site. We do cover a section on the classical mixture designs, but two-thirds to three-quarters of the course features SVEM.
  And the SVEM add-in is available as part of the course. So in this talk we've seen how building a predictive model from DOE
  data is limited by the feasibility of conducting validation trials to control overfitting. You know, in a classical designed experiment, we just don't have the runs to do
  both training and validation, so we suggest using space filling designs, especially for mixture DOEs, but actually more generally for other DOE situations as well. And we described SVEM, which is based on auto-validation and fractionally weighted bootstrapping.
  And we discussed how SVEM enables predictive modeling for DOE data without, again, a separate set of validation runs.
  And we demonstrated SVEM using the urea experiment data.
  And I do want you to know that if you download the slides, you'll find a whole bunch of references. So here's a few here and a few more, and one of the things I love about this is that
  it actually reaches all the way back to 1996. So this is something that has been brewing for some time, and I would encourage you to have a look at these references.
  And with that we'll take questions.