Cluster Analysis to Improve Local Government in Massachusetts (2022-US-30MP-1154)

Robert Carver, Professor Emeritus, Stonehill College

The town of Sharon, Massachusetts, created a Governance Study Committee to recommend changes to municipal by-laws and governance within the town, particularly with an eye to elevating civic engagement among residents. I am a member of that committee, and in one phase of our work we sought to confer with officials in similar communities across the state to learn from best practices elsewhere.

There are 351 cities and towns in the Commonwealth of Massachusetts, and we had limited time and no budget for comprehensive research. We quickly confronted the issue of how to best identify a modest group of communities closely comparable to Sharon, which in turn raised questions about which characteristics are most relevant to citizen participation in local governance.

Using JMP with publicly available data, I conducted a two-stage project to select key variables and then used those variables to run cluster analysis to identify other communities for our research. Because we are an all-volunteer, appointed public body, the research had to be presentable in public forums, comprehensible by a lay audience. Several visualization features of JMP 16 were particularly valuable in that regard.

This talk walks through the analysis, as well as my strategy to make clustering understandable.

Resources

Academic Case Study on this topic. Description to use with the data attached here.

Hello.

My name is Rob Carver, and today I want to share a story

about a project I've been working on in my small town in Massachusetts.

At the outset, I'll point out that the slides

and the JMP data table are up on the discovery website

and there's a new academic case study on this very topic that will be posted.

It's not already posted. It will be posted very soon.

What I'm hoping to do in 30 minutes is spend most of our time with a JMP demo,

but you're going to need some context and background.

I want to provide a little bit of a scenario,

give you a sense of the problem that I'm trying to solve,

and talk about the research strategy

and then get into the demo and wrap up with some conclusions.

I live in a town called Sharon, which is an archetypal New England town.

Here you see a picture of our town center.

It was incorporated in 1765, so an old community.

Like many New England communities, the legislative function of the town

is managed by the annual town meeting in which anyone can come and speak.

Memorialized in the Norman Rockwell images.

From the start, we have used an open-town meeting

and the executive function is carried out by a three-member board

known as The Select Board.

But since Norman Rockwell's day, municipal government has become

technologically, financially, legally more complex,

even for the most fundamental services that a town provides.

Attendance at the town meeting has really dwindled.

About a year and a half ago,

The Select Board created a governance study committee, of which I'm a member.

We are doctors and lawyers

and accountants, and teachers,

and marketing people, local business people.

I'm the resident stats guy.

Over time, the population of the town has evolved; it's grown.

It's more diverse than it was 100 years ago.

We've gone from an agrarian manufacturing community

to a bedroom community for the city of Boston.

Lots of professionals working in hospitals and universities, law firms in the city.

People tend not to live and work in the town,

and that has impacts on participation in town governance.

The charge to the governance study committee

is to find ways to boost citizen engagement.

We've been doing our due diligence.

We've been researching, we've surveyed residents,

we've read the literature, we've interviewed town officials.

One part of our research, and that's what this talk is about,

is we wanted to reach out to towns like Sharon

to find out what are they doing, what's their experience.

There's some comparative research.

There are 351 cities and towns in Massachusetts.

We have time constraints,

and so we're looking for a way to identify a smallish number of communities

that are similar to us.

We didn't want to reinvent the wheel,

but we thought that modernizing it some would be a good idea.

The driving question covered in this research

is which towns are similar to this town.

A little bit about Sharon.

We sit in South-eastern Massachusetts.

We are not too far from Plymouth,

which is where the 1620 Mayflower landing happened.

This community was originally populated by Wampanoag peoples.

Europeans arrived in 1637.

We're about halfway between Boston and Providence.

For the sports fans out there,

we are next door to where the New England Patriots play football.

Population about 18,500, which is quite average in Massachusetts.

A great percentage of the population are registered voters.

Yet out of all those people, we get 2% for a town meeting.

Most recently in May of 2022.

This was the scene and a lot of that is COVID related.

There were social distancing rules in effect, but turnout is low,

partly because of COVID,

partly because of factors that we don't fully understand.

One task for the governance study committee is to consider

other alternatives to town meeting,

or tweaks and enhancements to town meeting.

Under state law in an open-town meeting

to participate, you have to be in the room.

It's broadcast on local television,

but you have to be present to speak or to vote.

State law also says there's three ways to run local government.

74% of the communities, the large majority do what Sharon does.

Open-town meeting once or twice a year.

A small number have what's called representative town meeting in which

voters elect their neighbors, maybe a few hundred of them,

to participate and vote in town meeting.

Traditionally cities have had small councils

with a mayor or administrator of some kind.

Increasingly that's being adopted by towns, and so we're looking into that.

For this talk, the task is to identify peer towns

that we could then interview, consult with, and reach out to.

I mentioned some of the state legal constraints.

One other constraint is the town boards,

like a governance study committee, have to have open meetings.

Anything we do and decide and deliberate about has to be in public,

which is a good thing.

We have no budget.

We have some wonderful staff in the town hall, but they are

busy doing other things as well.

Data availability was a mixed story.

Plenty of data available about characteristics of communities.

We're really interested in how many people participate in local government

and there's no centralized data about that,

so we needed to hunt for proxies.

We also had no ability to compel folks in other towns

to meet with us, advise us, or share data with us.

We're operating in a topic area that is heavily governed by tradition.

People really cleave to that Norman Rockwell image.

We came up with a three-stage plan.

As a committee, we brainstormed variables: why do people participate?

Why don't people participate?

Why are different towns different?

I then grabbed some data from voter turnout in a recent state-wide election

to use as a proxy for citizen engagement.

Ran some models in JMP

to identify those variables that seem to have predictive value.

The committee then discussed and added some more variables

that they thought were important on the town meeting dimension.

That generated 20 predictor columns,

which I knew was far more than I wanted to deal with.

I consulted my brain trust, some academic colleagues. Special thanks go to

Mia Stephens and Ruth Hummel at JMP,

who advised me on principal components analysis,

which, I'll note at the outset, was not part of my comfort zone,

so I'll talk a little bit about that.

Then I ran cluster analysis.

That's the main event today.

People on the committee understood

that we probably want to be talking to towns of comparable size,

but there's more to similarity than size.

There's more to similarity than being a geographic neighbor.

Part of the work involved

instructing the committee a little bit on cluster analysis.

Just in case anybody watching doesn't have much background in this,

here's how I did it.

I said well,

we can look at population and something else at the same time.

And maybe that something else has an impact on participation.

In this case, the Y axis was single-family property tax bills.

You can see that there's a bunch of towns similar in size to Sharon,

but which might have very different tax impacts.

The idea in cluster analysis,

if you are going to work in two dimensions, is to

choose two attributes that you think are relevant to your query,

spread the towns out on those two dimensions,

and then identify a reasonable number of towns

that are reasonably similar to Sharon.

That's a big idea in cluster analysis.

Fortunately we're not limited to two attributes or two dimensions.

We can have more than that.
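
To make the two-attribute idea concrete, here is a minimal Python sketch. The towns and figures are invented for illustration (they are not the committee's data), and the helper names are hypothetical. It standardizes each attribute so that population and tax bill sit on comparable scales, then ranks towns by straight-line distance from Sharon. The same `distance` function works unchanged with more than two attributes.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical towns: (population, average single-family tax bill in dollars).
# These numbers are invented for illustration only.
towns = {
    "Sharon": (18500, 11000),
    "Town A": (18000, 6000),
    "Town B": (19500, 10500),
    "Town C": (5000, 11200),
    "Town D": (40000, 7000),
}

def standardize(rows):
    """Convert each column to z-scores (mean 0, standard deviation 1)."""
    columns = list(zip(*rows.values()))
    centers = [mean(col) for col in columns]
    spreads = [pstdev(col) for col in columns]
    return {
        name: tuple((v - c) / s for v, c, s in zip(values, centers, spreads))
        for name, values in rows.items()
    }

def distance(a, b):
    """Euclidean distance between two points with any number of attributes."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

z = standardize(towns)

# Rank every other town by its distance from Sharon, most similar first.
ranked = sorted(
    (name for name in z if name != "Sharon"),
    key=lambda name: distance(z[name], z["Sharon"]),
)
print(ranked)
```

In this made-up example a town of similar size but a much lower tax bill ranks as less similar than a town that is close on both attributes, which is the point of looking at more than size alone.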

With that, I think you now know enough to follow the demo.

Walking into this demo,

I had gathered data from a variety of state and publicly available sources,

used Query Builder to build a large data table,

and inspected for outliers and missing data.

The one real outlier is the city of Boston,

which is just unique.

That's excluded from all the analysis.

A little bit of missingness, but nothing terrible.

I'm going to be showing you a JMP project.

Let me switch gears, move into the demo and I hope that I do this correctly.

What we're looking at is my data table of 351 cities and towns.

The first several columns are identification,

size of The Select Board,

their legislative option, name of community.

The next 20 columns are our predictors.

Just to ground us a bit,

if we look at some basic descriptives of the communities,

towns in Massachusetts tend to be on the small size.

The median is only 10,000 people.

Sharon is quite near the mean community size.

In terms of legislative function, 74% use open-town meetings,

so we are in good company,

and in terms of the size of the Select Board, which is

another thing the governance committee is looking at, it's just about 50/50.

Half of the towns with a Select Board have three members, half have five.

We've got these 20 predictors.

One issue that comes up pretty early in the analysis is collinearity.

Here I have five

variables that all speak to the size of the electorate, the size of the town.

You can see that there are some very strong correlations.

We, generally speaking,

don't want to deal with so much collinearity.
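
As a small illustration of what collinearity among size-related columns looks like, the following Python sketch computes pairwise Pearson correlations. The three columns and their values are hypothetical stand-ins, not the five variables shown in the demo.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical size-related columns for six towns (invented values).
columns = {
    "population": [5000, 10000, 18500, 25000, 40000, 60000],
    "registered_voters": [3600, 7100, 13000, 17000, 27500, 40000],
    "housing_units": [2100, 4000, 6900, 10200, 16500, 25000],
}

# Print the correlation for every pair of columns.
for a, b in combinations(columns, 2):
    print(f"{a} vs {b}: r = {pearson(columns[a], columns[b]):.3f}")
```

When several columns correlate this strongly, they are largely repeating the same message about town size, which is the redundancy that motivates reducing them to fewer components.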

One way out is principal components analysis.

At this point, not quite ready to jump into clustering,

but want to take those 20 columns and distil them down,

conserve as much information as possible,

but reduce the redundancy and collinearity across columns.

To do that,

principal components analysis is an excellent option.

I don't have the ability today to give a full crash course

in principal components analysis,

but we can see that we have variables that seem to be

overlapping in terms of their message.

We also can see that

when you give a PCA 20 columns, it initially comes up with 20 components.

The first few of which seem to capture most of the variability.

We have to make a decision about

how many principal components to use and what they represent.

For this, the scree plot is helpful

and we're looking for a kink or an elbow in the plot.

That seems to happen somewhere down here, around 4, 5, 6 components.
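
For readers who want to see the numbers behind a scree plot, here is a minimal sketch using NumPy. It runs PCA on correlations for simulated data (six columns driven by two hidden factors plus one unrelated column), so the data, column count, and variable names are all assumptions for illustration rather than the 20 predictors from the talk. It is not a reproduction of JMP's report.

```python
import numpy as np

rng = np.random.default_rng(0)
n_towns = 350

# Two hidden factors standing in for "size" and "affluence" (simulated).
size = rng.normal(size=n_towns)
affluence = rng.normal(size=n_towns)

def noisy(factor):
    """Return a column that mostly repeats the given factor, plus noise."""
    return factor + 0.3 * rng.normal(size=n_towns)

X = np.column_stack([
    noisy(size), noisy(size), noisy(size),   # three overlapping size columns
    noisy(affluence), noisy(affluence),      # two overlapping affluence columns
    rng.normal(size=n_towns),                # one unrelated column
])

# PCA on correlations: eigen-decomposition of the correlation matrix.
corr = np.corrcoef(X, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)  # returned in ascending order
eigenvalues = eigenvalues[::-1]
eigenvectors = eigenvectors[:, ::-1]

# The scree plot graphs these eigenvalues against component number.
share = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(share)
for k, (ev, cum) in enumerate(zip(eigenvalues, cumulative), start=1):
    print(f"PC{k}: eigenvalue {ev:.2f}, cumulative share {cum:.0%}")

# Loadings: correlation between each original column and each component.
loadings = eigenvectors * np.sqrt(eigenvalues)
```

Six columns in, six components out, but the eigenvalues drop off sharply after the first few. That drop is the elbow one looks for, and the loadings are what one reads to attach a meaning to each retained component.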

If we consult the

loading matrix to see how

different variables load onto different components,

we can begin to subjectively assign meaning to the components.

I'll cut to the chase.

We selected six principal components

as being informative for the purposes of cluster analysis,

and came up with some interpretations that made sense to us.

Things like how big is the town?

How affluent is the town?

How fast is it growing?

Now we're ready for clustering.

Back to JMP.

There are two basic approaches to clustering.

You might think of them as a top-down and a bottom-up.

Both of them take the raw data, standardize it,

and then compute Euclidean distances for each pair of rows in the data table

for each pair of communities.

Taking into account six factors,

six principal components,

size, affluence, education, things like that,

which ones are similar to Sharon?

In hierarchical clustering,

the report starts us off with a dazzling graph, but with 350 rows,

this is hard to interpret.

We'll come back to it.

Let's begin with something that's a little easier to interpret,

which is the cluster summary.

In the hierarchical method, JMP has found for us 16 clusters.

I can tell you because I peeked

that Sharon turns out to be in cluster 15.

Sharon and 23 other towns.

For example,

if you scan down the affluence column, we see that,

again these are standardized scores,

these are the most affluent towns in the state.

If we come over here to the growth column,

and this is largely growth in the housing units and population.

The least growth, in fact, some negative growth.
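
A cluster summary of this kind is essentially a table of per-cluster averages of the standardized component scores. The Python sketch below shows the computation on a handful of invented rows; the cluster numbers, scores, and field names are hypothetical and chosen only to echo the pattern described (high affluence, low growth).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical standardized component scores: (cluster, affluence, growth).
scores = [
    (15, 1.8, -0.9),
    (15, 2.1, -1.1),
    (15, 1.6, -0.7),
    (3, -0.4, 0.2),
    (3, -0.6, 0.5),
    (8, 0.1, 1.9),
    (8, 0.3, 1.5),
]

# Group the rows by cluster.
by_cluster = defaultdict(list)
for cluster, affluence, growth in scores:
    by_cluster[cluster].append((affluence, growth))

# One summary row per cluster: member count and mean of each component.
summary = {
    cluster: {
        "count": len(rows),
        "affluence": mean(r[0] for r in rows),
        "growth": mean(r[1] for r in rows),
    }
    for cluster, rows in by_cluster.items()
}

most_affluent = max(summary, key=lambda c: summary[c]["affluence"])
least_growth = min(summary, key=lambda c: summary[c]["growth"])
print(most_affluent, least_growth)
```

Because the scores are standardized, a cluster mean well above zero means "more of this component than the typical town," and a mean well below zero means less.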

If we come back up here.

Now, having looked at the cluster summary,

all of the clusters have been identified and colored.

JMP gives us a cut point.

If we zoom in on cluster 15.

Let's make that a little bigger.

Sharon is here in the center.

Its nearest Euclidean neighbor is Winchester,

which is about an hour's drive.

We now have a provisional list of towns to consult with.

All right, so that's a crash course in the hierarchical clustering.

I'll move on to K-Means.

Hierarchical is bottom up.

We start with 350 individual towns as clusters.

We interrogate the distance matrix, all the pairwise distances,

find those two towns that are nearest to each other,

they form a cluster.

We take the mean distance of that cluster.

Now we're either looking for the next two nearest towns

or the next town that's closest to that cluster,

and iteratively proceed up the tree until

we have one gigantic cluster of 350 towns.
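
The bottom-up process just described can be sketched in a few lines of Python. This is a simplified illustration, not JMP's implementation: it measures the distance between clusters by their centroids (means), which is only one of several possible linkage rules and may differ from the one used in the demo. The points are invented.

```python
from math import dist  # available in Python 3.8+

def centroid(points):
    """Mean position of a list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def agglomerate(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[p] for p in points]  # start: every point is its own cluster
    while len(clusters) > n_clusters:
        best = None
        # Scan every pair of clusters for the smallest centroid distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
    return clusters

# Invented two-dimensional standardized scores for six towns.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.4), (3.0, 3.1), (3.2, 2.9), (8.0, 0.0)]
result = agglomerate(points, 3)
print(result)
```

Stopping at `n_clusters` plays the role of the cut point on the tree; letting the loop run down to one cluster would trace out the whole hierarchy.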

With K-Means clustering,

we flip the process.

We start with 350 towns in one cluster

and then begin slicing and dividing in multiple dimensions.

In this approach, same Euclidean distances, same distance matrix,

we end up with Sharon being in cluster number 4,

with a full complement of 33 towns.

We automatically get a cluster means picture.

Again, very affluent low growth,

not necessarily the lowest, but low growth again.

We get slightly different results.

I think in the interest of time, I will show you one other graph.

There's various things to look at,

but let's look at the parallel coordinate plots.

What is this tool?

We have 16 clusters,

and by the way in K-Means,

it's up to the user to specify the number of clusters.

I chose 16 as a starting point because that's what hierarchical gave us.
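
To show what it means for the user to specify the number of clusters, here is a minimal Python sketch of the textbook k-means iteration (often called Lloyd's algorithm): assign each point to its nearest center, recompute the centers, and repeat. This is a generic version under simplifying assumptions, not necessarily what JMP does internally. In particular, it seeds the centers with the first `k` points to stay deterministic, and the data are invented.

```python
from math import dist  # available in Python 3.8+

def k_means(points, k, iterations=100):
    """Textbook k-means with the first k points as starting centers."""
    centers = list(points[:k])
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers

# Invented two-dimensional standardized scores for six towns.
points = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.1), (0.1, 0.3), (5.1, 4.9), (4.8, 5.2)]
labels, centers = k_means(points, k=2)
print(labels)
```

Changing `k` changes the answer, which is why borrowing 16 from the hierarchical result is a reasonable starting point rather than a fixed truth.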

Here we are, cluster four.

The dark brown line is Sharon.

Here we see the six characteristics, the six principal components.

For example, if we compare,

how is cluster 4 different from cluster 3, let's say, or cluster 5?

Maybe similar in size.

Cluster five, less affluent.

Property values are a little lower.

Permanent population refers to this:

there are communities with universities, hospitals, prisons, so forth,

vacation homes, snowbirds who leave for the winter.

Towns differ in terms of their permanent populations. In cluster three, it's much lower.

Here's where we find our university towns.

I just popped up the town of Shirley, Massachusetts,

which has the state's largest

maximum security prison.

I don't know if we consider these folks permanent residents or not, but in any event.

We've done two different clustering methods.

Let's take a look at how the results compare.

I saved within each clustering method,

the cluster assignments for each town, created binary variables.

Are you in the same cluster as Sharon or are you in a different cluster?
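
The comparison step can be sketched in Python as follows. The cluster assignments and town names are invented placeholders; the idea is simply to turn each method's saved cluster number into a yes/no flag ("same cluster as Sharon?") and then see which towns both methods agree on.

```python
# Hypothetical cluster assignments saved from each method (invented values).
hierarchical = {"Sharon": 15, "Town A": 15, "Town B": 15, "Town C": 7, "Town D": 2}
k_means_ids = {"Sharon": 4, "Town A": 4, "Town B": 9, "Town C": 4, "Town D": 1}

def same_cluster_as(assignments, focal="Sharon"):
    """Flag each other town: is it in the focal town's cluster?"""
    target = assignments[focal]
    return {
        town: cluster == target
        for town, cluster in assignments.items()
        if town != focal
    }

in_h = same_cluster_as(hierarchical)
in_k = same_cluster_as(k_means_ids)

# Towns flagged by both methods, and towns flagged by at least one.
both = sorted(t for t in in_h if in_h[t] and in_k[t])
either = sorted(t for t in in_h if in_h[t] or in_k[t])
print("Both methods:", both)
print("Either method:", either)
```

The cluster numbers themselves are arbitrary labels (15 in one method, 4 in the other), so comparing flags relative to a focal town is what makes the two results comparable.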

I also, just as an aside,

JMP has lots of wonderful built-in geographic maps.

It does not have a built-in map showing

municipality borders within the state of Massachusetts.

But it turns out that with JMP

it's fairly easy to create a new geographic map.

I was able to do this without very much work at all.

Here are the results of hierarchical clustering, cluster 15.

Sharon is here.

Its similar towns are in blue.

I was pleased to see that my little tiny hometown of Marblehead

is similar to the place I moved to.

Hierarchical cluster 15 gives us these 24 towns.

K-Means clustering adds some more towns, but there's an awful lot of overlap.

I also, just out of curiosity,

looked to identify the 33 towns, about 10% of the state,

that are most similar to Sharon.

This is a larger group.

Again, an awful lot of repeats.

A lot of repeats.

That last approach also gave us some other advantages.

I want to now shift back to PowerPoint and talk about some of those.

One last point before I finish the demo.

We were also curious to ask...

Mostly our goal was, who shall we interview?

Who shall we call in to meet publicly with our committee?

But while we're at it,

let's see what our peers do in terms of governance.

State-wide,

open-town meeting (OTM) dominates at 74%, and there is no dominant board size.

If we look at who's in our cluster,

let me use K-Means, because it's a little bit larger group.

That 74%,

jumps up to 85% with town meeting.

By a two to one margin,

towns have five-member Select Boards.

Now, this isn't definitive. To channel my mother:

If all the other towns jumped off the Empire State Building,

we wouldn't necessarily want to jump off.

But it's interesting to note that the towns most similar to Sharon

favor the five-member Board

and are even more inclined to open-town meeting.

With that, let me

get back to some conclusions.

So what did we learn?

One thing we learned was that geographic proximity is uninformative. In those maps,

none of the abutting towns came up blue.

Our most similar communities are not our next door neighbors.

As I just noted,

open-town meeting and five-member boards really predominate.

So what did we do?

This work actually happened several months ago.

We were able to prioritize our outreach,

begin contacting those towns most similar to us.

Many were extremely cooperative and shared a lot of information and data.

We also didn't want to assume that open-town meeting

was the only option to consider,

so we wanted to talk to people in towns with representative town meeting or councils.

Those Euclidean distances became instructive in terms of,

okay, none of our immediate neighbors, closest neighbors

use town council or representative town meeting.

But which RTM town, which council town is most like us?

And we contacted those folks as well.
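
Finding the most similar town within each form of government is a nearest-neighbor search with a filter. The Python sketch below illustrates it with invented scores and hypothetical town names; the form labels follow the talk (OTM for open-town meeting, RTM for representative town meeting, Council for town council).

```python
from math import dist  # available in Python 3.8+

# Hypothetical standardized scores and governance forms (invented values).
towns = {
    "Sharon": ((0.0, 0.0), "OTM"),
    "Town A": ((0.2, 0.1), "OTM"),
    "Town B": ((1.0, 0.5), "RTM"),
    "Town C": ((2.5, 2.0), "RTM"),
    "Town D": ((0.9, -1.2), "Council"),
    "Town E": ((3.0, 0.0), "Council"),
}

def nearest_by_form(towns, focal="Sharon"):
    """For each governance form, find the town closest to the focal town."""
    focal_scores = towns[focal][0]
    best = {}
    for name, (scores, form) in towns.items():
        if name == focal:
            continue
        d = dist(scores, focal_scores)
        # Keep this town if it is the closest seen so far for its form.
        if form not in best or d < best[form][1]:
            best[form] = (name, d)
    return best

closest = nearest_by_form(towns)
for form, (name, d) in closest.items():
    print(f"Closest {form} town: {name} (distance {d:.2f})")
```

The closest RTM or council town may be fairly far away in absolute terms, so it is worth looking at the distance itself and not only the ranking.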

We went from having to contemplate outreach to 350 towns

in a limited amount of time and with no money and staff,

to a focused sampling method.

Then because town officials talk to one another

and they are professionally active,

that led us to other interviewees.

With that, I think that's about my time.

I hope this has been interesting and constructive.

Thank you for coming

and I hope you enjoy the rest of the program.

TCM · ‎10-18-2022

Hi Ron (or anyone who wishes to answer),

I previously sent a PM, when I remembered PM's are discouraged.

I am following the same process as yours: I performed PCA on 90 responses and selected the first 23 PC's (I have inspected and "labeled" my PC's) as inputs to a k-means clustering --> 7 clusters were deemed optimal. Now I have to describe and label my clusters.

Q1: I should use the cluster means from the first n PC's (I choose the first 10 of the 23) to describe my clusters, correct?

Q2: If yes to Q1, does a high negative cluster mean value signify that the cluster is strongly "anti" that particular PC? By the same logic, a high + cluster mean value would signify a strong feature for that cluster.

My cluster member counts are pretty balanced btw.

Thank you for your guidance.

rcarver · ‎10-19-2022

Hi TCM,

Thanks for asking.
Interpretation of the clusters is where you bring your expert domain knowledge to bear. You do want to inspect the 7 sets of Cluster Means and look for meaningful characterizations of the 10 PCs. So, we might imagine a consumer cluster of people who often spend on horror films, classical music, and cotton sweaters (assuming that you could identify the PCs as related to those items). Don't be shocked if it's a challenge to give plausible meanings to all 10 PCs and/or all 7 clusters.

As for the super-high mean of PC1 in one cluster, others might have a different take on this. Even though the data are standardized in creating the PCs and the Clusters, you can still certainly have some observations with high values. Whatever PC1 represents, the consumers in that cluster seem to have a lot of it! Does that make sense?

You mention that the cluster memberships are balanced, but does that cluster happen to have a relatively small n?

I'm eager to see what others have to say. Meanwhile, I hope this is helpful.

--Rob

TCM · ‎10-20-2022

Hi Ron,

Thank you for your speedy response!

Even though the data are standardized in creating the PCs and the Clusters, you can still certainly have some observations with high values. Whatever PC1 represents, the consumers in that cluster seem to have a lot of it! Does that make sense? My response: Yes, that is what I took it to mean... I will look at the top loadings for PC1 and use that to characterize the cluster with the high mean value to it. In addition, it would seem that because this cluster is heavy on PC1 (which is the most informative of all PC's), this cluster is probably very important in our study.

You mention that the cluster memberships are balanced, but does that cluster happen to have a relatively small n? My response: No, the number of members is pretty high and comparable to the number of members of the other clusters... I take the similarity in membership counts among the clusters to be a positive sign.

I am also curious to hear of others' comments.