Functional Design of Experiments

See how to design and analyze an experiment where the response variable is a function (e.g., size or temperature over time) instead of a single point measurement.

 

For a quick overview of functional data analysis in JMP:

 

To go more in depth on functional data analysis:

 

Full Transcript (Automatically Generated)


It's Ross here, systems engineer, coming to you from Tampa, Florida. And as Julian said, in this segment we're going to talk about functional design of experiments, which, generally, is the situation in which you're conducting an experiment and your response variable takes the form of a curve instead of a single value.

 

So let's unpack that idea a little bit. In essence, when we're doing functional design of experiments, or FDOE, we're asking how factors of interest affect the features of a curve that we care about. For example, we might have a load-deflection curve: here on the x axis we have the load placed on a surface, and on the y axis we have the deflection that we measure. We see three different curves for three different widths of the material, and we might be interested in how the material's width, or some other property of the material, actually affects the shape of this curve. Here we have a reflectance curve for various types of mirror coatings: on the x axis we now have the wavelength of light, and on the y axis we have the percent of light reflected at those varying wavelengths. We might be interested in how different aspects of mirror coatings affect this curve, maybe because we're trying to achieve a certain ideal curve. Or, a third example, one actually from research that I've done in the past as a cognitive scientist using sensor data, in particular EEG, or brainwaves: voltages measured over time, generated by the brain. We might be interested in how the type of stimulus we show someone actually affects the shape of the EEG curve we measure.

 

And, you know, sensor data is probably a really prominent application area for this type of thing. A lot of us deal with sensor data, whether it be voltage over time, like we have here, or temperature over time, pH over time, and so forth. But in all these examples, we're just interested in how some factor of interest actually affects this curve and its shape or its features.

 

So the example we're going to talk about right now is milling pigment particles for manufacturing LCD screens. We start with our raw pigment and we put it into a bead mill, depicted on the left here, and we have various factors that we can control with respect to the milling process, for example the flow rate or the temperature. And we're interested in how these factors affect the shape of the curves that we see on the right. In these curves, where we have time on the x axis and the size of the pigment particles on the y axis, we see, generally speaking, a pretty rapid decrease in particle size at first once we start milling, and then it starts to level off a bit. In green, we see the specification range, so that's where we want to stop. Now, we really care about the shape of the curves here. You know, we don't want to spend too much time milling, so we want to see a steep drop-off at first so that we can get into that green range as fast as we can. But we also don't want to overmill, so we want to see a nice steady asymptote within that green range, so that if we happen to leave the mill running a little bit longer, we don't actually overshoot and end up with particles that are too small. So this is a great time to use DOE, right?

 

We want to know how we can basically optimize this response variable, these curves, by manipulating our factors. So we would first use JMP's DOE tools to figure out how to systematically manipulate the factors, then we would actually record the curves across multiple runs or batches of pigment, and then finally we would model the relationship between the two. All of this is just the basic DOE process.

So here we have an equation that looks much like the one Elisa just showed us. This is just capturing the relationship between temperature and the curve: we have some intercept, plus temperature times some slope coefficient, equals... and if you're with me, the thing on the other side of that equals sign now has to be, you know, something about the curve's shape, right? But how do we get the curve's shape into this equation? That's kind of the big challenge that we're dealing with here. We need the measure here to be a single value. So the question is, what do we measure?

Well, if we're really concerned about just getting to spec as fast as we can, maybe we just measure the time at which we hit spec. But that doesn't really tell us whether we get that nice asymptote to make sure we stay in spec. So maybe instead we record what the size of the particles is once the decrease in size has leveled off, whatever "leveled off" means; we would have to operationalize that. Those are maybe okay measures, you know, somewhat imperfect, and they certainly don't capture the shape of this curve. Maybe we actually instead regress size on time using something like a quadratic model, and then enter those coefficients in as the quote-unquote curve shape. Maybe that would do better, or maybe not. You can see none of these solutions are ideal. So what we really want is a better way to actually capture the shape of the curve in a single value.
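To make those candidate scalar summaries concrete, here is a minimal sketch, assuming a synthetic particle-size curve and a made-up spec limit; it is not JMP's method, just an illustration of a time-to-spec value and quadratic-fit coefficients as stand-in "curve" measures.

```python
# Scalar summaries of a size-over-time curve (synthetic data, hypothetical spec limit).
import numpy as np

time = np.arange(0, 80, 10.0)                 # minutes of milling (hypothetical)
size = 10.0 * np.exp(-time / 20.0) + 1.5      # particle size, arbitrary units
spec_upper = 3.0                              # hypothetical upper spec limit

# Option 1: time at which the curve first drops into spec
in_spec = np.where(size <= spec_upper)[0]
time_to_spec = time[in_spec[0]] if in_spec.size else np.nan

# Option 2: coefficients of a quadratic fit, entered as the "curve shape"
quad_coeffs = np.polyfit(time, size, deg=2)   # [a, b, c] for a*t^2 + b*t + c

print(time_to_spec, quad_coeffs)
```

Neither summary preserves the full shape, which is exactly the limitation the transcript goes on to address.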

 

And that's what JMP Pro's Functional Data Explorer can do for us. In essence, it finds the primary features, or shapes, in a curve and translates them into single values. The basic idea is this: we take our data, we fit a model to the data, and then that model represents each curve as some average shape, in the left center there, plus or minus some amount of some primary shape feature, or, in technical terms, a functional principal component.

 

From that model, we then pull out these shape feature scores, or, where it says FPC 1, these functional principal component scores. These capture, in this case for the batches in our experiment, how much of that primary shape feature we would add in or take out to approximate each batch's shape. And so in this way, we can actually stay true to the shapes of our curves and still enter them into analysis techniques like linear regression.
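For readers who want to see the "mean shape plus score times principal shape" idea in code, here is a rough sketch using ordinary PCA on curves sampled at a common set of time points. Functional Data Explorer works from spline fits rather than raw samples, so this is only the concept, on synthetic data.

```python
# Mean shape + score * principal shape component, via SVD on synthetic curves.
import numpy as np

rng = np.random.default_rng(0)
time = np.linspace(0, 60, 25)
# Simulate a handful of "batches": exponential decay with varying rate
rates = rng.uniform(0.03, 0.12, size=8)
curves = np.array([10 * np.exp(-r * time) + 2 for r in rates])  # (batches, times)

mean_curve = curves.mean(axis=0)
centered = curves - mean_curve

# SVD gives the principal shape components (rows of Vt); projection gives the scores
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
fpc1 = Vt[0]                                   # primary shape feature
scores = centered @ fpc1                       # one FPC score per batch

explained = s[0] ** 2 / np.sum(s ** 2)         # share of variability captured by FPC 1
approx = mean_curve + scores[:, None] * fpc1   # each curve ~ mean + score * shape
print(f"FPC 1 explains {explained:.0%} of the variation")
```

Each batch's complex curve is now summarized by a single score, which is what lets it enter a regression model later on.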

 

Now, in static images this is kind of hard to understand or visualize, especially if you've never encountered the idea before. So I want to switch over to JMP to show you how it all works. Let's bring a data set in. Here's a data set from the experiment where we manipulated our factors, once again these four variables here, and then we measured, across multiple batches, the size at various times. So the first eight rows here represent the measurements taken from batch 2887.

 

Just to show you, I've also included down at the bottom here an ideal curve. If you remember, I said we kind of wanted to get down to spec really fast, but then just stay there. This curve, as we're about to see, captures that ideal state for us. What I'm going to do now is show you the end result of the analysis, just so that you can get an idea of where we're going, and then I'll run back through and show you how to produce it using tools in JMP.

So here's the output of the analysis in JMP's Functional Data Explorer; this is in JMP Pro. Let me walk you through the basic elements of what we have here. Up on the left side we have each of our batches, you can see them numbered there, and JMP has fit a flexible curve to each one, shown in red. It's using something called a spline model, which you can find more information on on the right; for example, we have a one-knot cubic model, as you can see here. We then use that model to perform something called functional principal components analysis, which is what actually captures the average shape and this primary shape component. So we see the average shape of those curves here.

 

And this one component shape actually explains 99% of the variability in all the shapes of our curves across all of our batches. Down here in the scatterplot, we can see what individual batches look like when they either have a lot of that shape added in, on the right side, or a lot of that shape subtracted out, on the left. Let's take 2899; I'll just select it here, and you can see it has a component score of 77, so we're adding in a fair bit of this shape. And you can see it doesn't look so great. It's not exactly what we would want to see: we don't have a really steep drop-off, and it's just kind of a level trend down that doesn't asymptote like we would want.

 

So we can explore that a little further down here, where we have a profiler that allows us to see, in the left panel, the shape of the curve that we get out of the model given a certain value of the shape component on the right side. So let's go ahead and add some of that shape component in. You can see that the shape on the left is changing, and as we get to a pretty high value, the model's shape starts to look an awful lot like 2899. In fact, 2899 has a component score of 77, so I could go ahead and just enter that in there. And now we can see that the model's shape looks very much like the shape we observed.
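What the profiler is doing can be sketched in a few lines: slide a single score and redraw mean shape plus score times component. The mean shape and component below are illustrative stand-ins, not the actual fit from the demo.

```python
# Sweep an FPC score and plot the reconstructed curve, profiler-style.
import numpy as np
import matplotlib.pyplot as plt

time = np.linspace(0, 60, 100)
mean_shape = 6.0 * np.exp(-time / 15.0) + 2.0                      # hypothetical average curve
component = 0.04 * (np.exp(-time / 40.0) - np.exp(-time / 8.0))    # hypothetical FPC 1

for score in (-83, 0, 77):                    # scores like the ones discussed in the demo
    plt.plot(time, mean_shape + score * component, label=f"score = {score}")

plt.xlabel("time")
plt.ylabel("particle size")
plt.legend()
plt.show()
```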

 

Conversely, if we go all the way down to batch 2887, this is one that has a pretty negative shape score, negative 83, and you can see the shape is a fair bit different. It actually looks a lot better: it drops off fast and then kind of levels off. If we enter negative 83 as our score, you can see that once again the model has captured that; now we have a shape that looks an awful lot like what we observed for 2887. So these scores, the 77 and the negative 83, are single values that capture the complex variation in the shapes we saw across all the batches that we ran.

 

So now we have single scores that we can enter into an analysis, in a way that allows us to model the shape of our curves with a greater degree of fidelity. And that's what we do down here where it says Functional DOE Analysis. Remember, our goal is to see how changes in our factors affect the shape of the curve.

 

JMP has used JMP Pro's Generalized Regression platform to automatically fit a model for us that relates our factors to the FPC scores, those shape scores we were looking at. And we can see, for example, that as I increase the percent beads, we start to get a shape that looks a little more like the one we're after. If I increase the next factor here, again we start to see a slight improvement in the shape. Just like you might have seen elsewhere in JMP, we can even ask it to maximize the desirability, that is, to find the factor settings that get us as close to our target as possible. So here we can see that these factor settings down below give us a curve that starts to look pretty good: we have a nice steep drop-off, and then it levels off right here.

So that's functional DOE in a nutshell. You have your factors of interest, you let JMP extract the shape features, and then you just relate your factors of interest to those shape features, and even optimize by finding the set of factor settings that yields the shape closest to the one you want. So now, in the last few minutes, I just want to walk you through how all this works, so that if you want to try this out yourself, you know where to go.
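As a simplified stand-in for the Functional DOE step, the sketch below regresses FPC scores on the experimental factors and then searches for settings whose predicted score is closest to a target score. JMP Pro uses Generalized Regression and desirability functions; this uses plain least squares and a grid search, on synthetic data with made-up factor names.

```python
# Relate FPC scores to factors, then find settings closest to a target score.
import numpy as np

rng = np.random.default_rng(1)
n_batches = 12
factors = rng.uniform(-1, 1, size=(n_batches, 2))   # e.g. scaled temperature, percent beads
true_beta = np.array([30.0, -45.0])
fpc_scores = factors @ true_beta + rng.normal(0, 5, n_batches)

# Fit score = b0 + b1*x1 + b2*x2 by least squares
X = np.column_stack([np.ones(n_batches), factors])
beta, *_ = np.linalg.lstsq(X, fpc_scores, rcond=None)

# Grid search for the settings whose predicted score is closest to the target's score
target_score = -40.0                                  # hypothetical score of the ideal curve
grid = np.linspace(-1, 1, 41)
g1, g2 = np.meshgrid(grid, grid)
pred = beta[0] + beta[1] * g1 + beta[2] * g2
best = np.unravel_index(np.argmin(np.abs(pred - target_score)), pred.shape)
print("best settings:", g1[best], g2[best], "predicted score:", pred[best])
```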

 

Let me close this down. First, you may be wondering, well, how did we actually design the experiment? These experimental data are actually from a definitive screening design, and this is fairly straightforward. Under the DOE menu you'll find our normal DOE tools, so I'll go to Definitive Screening here. In JMP Pro 15, under the responses, you're now going to see a functional response type, so I'll go ahead and choose that. I'll actually remove this other response, this non-functional response. You can see we can enter the name just like we would otherwise.

 

We can set the number of measures per run; I'll just leave it at five for now. And this is flexible: you can add or remove measures per run as you go along in conducting your experiment. Then we enter our factors as we normally would. I have a table representing those factor settings just to speed things up, so I'll go ahead and load those factors in, click Continue, and then keep the standard design options for the DSD and just click Make Design. We have our design, so now we'll make the table. This is a JMP DOE table, and we see places to record size; I only specified five measures, so I get five of them. You can see each row here represents one function, but you can go ahead, if you want, and maybe use Tables > Stack, as we had done, to put this in a tall format, which would then allow you to create a time column and keep track of which time you took each measurement, and so forth. You'll also see, just like with any other DOE table that you get out of JMP, that you have your built-in scripts to pull up the design window again or conduct your analyses and so forth. So nothing too big there.
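The wide-to-tall restructuring done with Tables > Stack can also be sketched outside JMP; here is one way with pandas, using hypothetical column names (Batch, Temperature, Size 1..Size 5) as stand-ins for the columns in the DOE table.

```python
# Stack wide "Size 1..Size 5" columns into a tall format with a derived Time column.
import pandas as pd

wide = pd.DataFrame({
    "Batch": [2887, 2899],
    "Temperature": [30, 45],
    "Size 1": [9.8, 10.1],
    "Size 2": [6.2, 8.7],
    "Size 3": [4.1, 7.5],
    "Size 4": [3.0, 6.9],
    "Size 5": [2.8, 6.4],
})

tall = wide.melt(
    id_vars=["Batch", "Temperature"],                 # keep ID and factors on every row
    value_vars=[f"Size {i}" for i in range(1, 6)],
    var_name="Measurement",
    value_name="Size",
)
# Derive a Time column from the measurement label (assuming equally spaced readings)
tall["Time"] = tall["Measurement"].str.extract(r"(\d+)", expand=False).astype(int) * 10
print(tall.head())
```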

 

You just make sure to specify your functional response type. Now let's take a look at the Functional Data Explorer platform and how we use it to analyze the data. You'll find Functional Data Explorer under Specialized Modeling. I'm in the stacked, or tall, format, though you can enter data in other formats; for example, Rows as Functions would correspond to the format that we got out of the DSD platform. These first two fields here, X and Y, just refer to the variables that define our function. Ours was size across time, so I'll put size as Y and time as X.

 

Because we're in this stacked format, JMP needs to know which row belongs to which experimental run, or which batch, so I'll go ahead and put batch in as the ID. Then I'll grab our experimental factors and put them in the Z, Supplementary role so that they get passed along to Functional Data Explorer.

 

Here's our window. You can see it looks similar to what we had before; our data have all been loaded in, and the green curve here is actually the target function that I had loaded previously, which you can see down here with the nice steep drop-off and then the asymptote.

 

Right now JMP is treating that as actual data, which it's not, so we want to make sure to tell JMP that that's actually the target function. I'll click Load and then say that the one I've labeled Target is our target. You can see it's removed here from the plot and kind of broken out down below.
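Conceptually, once the target curve sits on the same time grid as the fitted curves, a "target score" falls out by projecting the target (minus the mean shape) onto the principal shape component. This isn't necessarily how FDE handles the target internally; the mean shape and component below are illustrative.

```python
# Project a target curve onto a principal shape component to get a target FPC score.
import numpy as np

time = np.linspace(0, 60, 50)
mean_shape = 6.0 * np.exp(-time / 15.0) + 2.0              # hypothetical average curve
component = np.exp(-time / 40.0) - np.exp(-time / 8.0)     # hypothetical FPC 1
component = component / np.linalg.norm(component)          # unit-norm component

target = 8.0 * np.exp(-time / 6.0) + 2.5                   # steep drop, then flat asymptote
target_score = (target - mean_shape) @ component
print(f"target FPC score: {target_score:.1f}")
```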

 

There are a lot of options, you're probably noticing, on the right here for processing the data: cleanup, transformation, and so forth. We don't have any of that to do here, so what we'll do next is actually fit the curves to the data, that is, fit the spline model. Here we have a few different classes of models we can use; I'll choose B-splines.
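For a sense of what a B-spline fit looks like outside the platform, here is a minimal example of fitting a cubic B-spline to one batch's size-over-time data with scipy. FDE chooses the spline family and knots for you; here a single interior knot is picked by hand and the data are synthetic.

```python
# Least-squares cubic B-spline fit to one batch's (noisy, synthetic) size-over-time data.
import numpy as np
from scipy.interpolate import make_lsq_spline

time = np.linspace(0, 60, 20)
size = 10 * np.exp(-time / 12.0) + 2 + np.random.default_rng(2).normal(0, 0.2, time.size)

k = 3                                        # cubic
interior_knots = [15.0]                      # one interior knot, chosen by hand
t = np.r_[[time[0]] * (k + 1), interior_knots, [time[-1]] * (k + 1)]  # clamped knot vector

spline = make_lsq_spline(time, size, t, k=k)
smooth = spline(time)                        # fitted curve on the original time grid
print(np.round(smooth[:5], 2))
```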

 

We have curves fit to each of our experimental runs, and the functional PCA that extracts that shape component has been done for us. Once again, just this one shape component explains 99% of the variability, so we don't really need another shape component, but in other data sets you might get multiple shape components out. We have, again, our score plot and our profiler, which lets us see how changes in these FPC scores affect the shape. When I'm ready to go, I'll just go next to the model title here and request the Functional DOE Analysis. And now, down below, I have the profiler that relates our factor settings to the shape of the curve. Once again, this is done using JMP Pro's Generalized Regression, and I can modify the model or look into the model's details further right here. So I can go ahead and open this up and, for example, maybe launch another type of model, or, if I care to, even view some of the reports, for example the parameter estimates. So you're not stuck with whichever model is first chosen; you can, in fact, work with the model to get one that you're satisfied with.
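The parameter-estimate report mentioned above has a rough analogue outside JMP. The sketch below is not Generalized Regression, but it shows the same kind of per-factor estimates for a score-versus-factors model, using statsmodels on synthetic data with made-up factor names.

```python
# Inspect per-factor parameter estimates for an FPC-score model (synthetic data).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
df = pd.DataFrame({
    "Temperature": rng.uniform(25, 55, 12),
    "PercentBeads": rng.uniform(60, 90, 12),
})
df["FPC1"] = -2.0 * df["Temperature"] + 1.1 * df["PercentBeads"] + rng.normal(0, 5, 12)

X = sm.add_constant(df[["Temperature", "PercentBeads"]])
fit = sm.OLS(df["FPC1"], X).fit()
print(fit.summary())                 # parameter estimates, standard errors, p-values
```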

 

That's functional DOE in 15 minutes. The take-home message: you have factors that you want to relate to the shape of a curve. Under the DOE menu, you can create your design with a functional response type, and then analyze your data in Functional Data Explorer by first fitting curves to the data, then performing the functional principal components analysis, and then finally requesting the Functional DOE analysis.

 

So, a pretty sophisticated technique here, available in a nice, easy-to-use package.

 

You know, this is a deep topic, though. So if you're looking for more information, I encourage you to head on over to the JMP On Air space on the User Community, where on the page for this presentation you'll find links to some more helpful info. So that's all for me. Thanks for your attention, and Julian, I'll pass it back to you.