Cluster Analysis to Improve Local Government in Massachusetts (2022-US-30MP-1154)

Robert Carver, Professor Emeritus, Stonehill College

 

The town of Sharon, Massachusetts, created a Governance Study Committee to recommend changes to municipal by-laws and governance within the town, particularly with an eye to elevating civic engagement among residents. I am a member of that committee, and in one phase of our work we sought to confer with officials in similar communities across the state to learn from best practices elsewhere.

 

There are 351 cities and towns in the Commonwealth of Massachusetts, and we had limited time and no budget for comprehensive research. We quickly confronted the issue of how to best identify a modest group of communities closely comparable to Sharon, which in turn raised questions about which characteristics are most relevant to citizen participation in local governance. 

 

Using JMP with publicly available data, I conducted a two-stage project to select key variables and then used those variables to run cluster analysis to identify other communities for our research. Because we are an all-volunteer, appointed public body, the research had to be presentable in public forums, comprehensible by a lay audience. Several visualization features of JMP 16 were particularly valuable in that regard.

 

This talk walks through the analysis, as well as my strategy to make clustering understandable.

 

 


 

Hello.

My  name  is  Rob  Carver, and  today  I  want  to  share  a  story

about  a  project  I've  been  working on  in  my  small  town  in  Massachusetts.

At  the  outset, I'll  point  out  that  the  slides

and  the  JMP  data  table are  up  on  the  discovery  website

and  there's  a  new  academic  case  study on  this  very  topic  that  will  be  posted.

If it's not already posted, it will be posted very soon.

What  I'm  hoping  to  do  in  30  minutes is  spend  most  of  our  time  with  a  JMP  demo,

but  you're  going  to  need some  context  and  background.

I  want  to  provide a  little  bit  of  a  scenario,

give  you  a  sense  of  the  problem that  I'm  trying  to  solve,

and  talk  about  the  research  strategy

and  then  get  into  the  demo and  wrap  up  with  some  conclusions.

I  live  in  a  town  called  Sharon, which  is  an  archetypal  New  England  town.

Here you see a picture of our town center.

It  was  incorporated  in  1765, so an  old  community.

Like many New England communities, the legislative function of the town

is managed by the annual town meeting, in which anyone can come and speak,

memorialized in the Norman Rockwell images.

From  the  start, we  have  used  an   open-town  meeting

and  the  executive  function  is  carried  out by  a   three-member  board

known  as   The Select Board.

But  since  Norman  Rockwell's  day, municipal  government  has  become

technologically,  financially, legally  more  complex,

even  for  the  most fundamental  services  that  a  town  provides.

Attendance  at  the  town meeting  has  really  dwindled.

About  a  year  and  a  half  ago,

The Select Board  created  a  governance study  committee,  of  which  I'm  a  member.

We are doctors and lawyers

and  accountants,  and  teachers,

and  marketing  people, local  business  people.

I'm  the  resident  stats  guy.

Over time, the population of the town has evolved. It's grown.

It's  more  diverse  than it  was  100  years  ago.

We've  gone  from an  agrarian  manufacturing  community

to  a  bedroom  community for  the  city  of  Boston.

Lots  of  professionals  working  in  hospitals and  universities,  law  firms  in  the  city.

People  tend  not  to  live and  work  in  the  town,

and that has impacts on participation in town governance.

The  charge  to the  governance  study  committee

is to find ways to boost citizen engagement.

We've been doing our due diligence.

We've  been  researching, we've  surveyed  residents,

we've  read  the  literature, we've  interviewed  town  officials.

One  part  of  our  research, and  that's  what  this  talk  is  about,

is  we  wanted  to  reach  out to  towns  like  Sharon

to  find  out  what  are  they  doing, what's  their  experience.

There's  some  comparative  research.

There  are  350  towns  in  Massachusetts.

We  have  time  constraints,

and  so  we're  looking  for  a  way  to  identify a  smallish  number  of  communities

that  are  similar  to  us.

We  didn't  want  to  reinvent  the  wheel,

but  we  thought  that  modernizing it  some  would  be  a  good  idea.

The  driving  question covered  in  this  research

is  which  towns  are similar  to  this  town.

A  little  bit  about  Sharon.

We sit in southeastern Massachusetts.

We  are  not  too  far  from  Plymouth,

which is where the 1620 Mayflower landing happened.

This  community  was  originally  populated by  Wampanoag  peoples.

Europeans arrived in 1637.

We're  about  halfway  between Boston  and  Providence.

For  the  sports  fans  out  there,

we  are  next  door  to  where the  New  England  Patriots  play  football.

Population  about  18,500, which  is  quite  average  in  Massachusetts.

A great percentage of the population are registered voters.

Yet  out  of  all  those  people, we  get  2%  for  a  town  meeting.

Most  recently  in  May  of  2022.

This  was  the  scene and  a  lot  of  that  is  COVID  related.

There were social distancing rules in effect, but turnout is low,

partly  because  of  COVID,

partly  because  of  factors that  we  don't  fully  understand.

One task for the governance study committee is to consider

other  alternatives  to  town  meeting,

or  tweaks  and  enhancements to town meeting.

Under  state  law  in  an   open-town  meeting

to  participate, you  have  to  be  in  the  room.

It's  broadcast  on  local  television,

but  you  have  to  be  present to  speak  or  to  vote.

State  law  also  says  there's  three ways to  run  local  government.

74%  of  the  communities, the  large  majority  do  what  Sharon  does.

Open-town  meeting  once  or  twice  a  year.

A  small  number  have  what's  called representative  town  meeting  in  which

voters  elect  their  neighbors, maybe  a  few  hundred  of  them,

to  participate  and  vote  in  town  meeting.

Traditionally  cities have  had  small  councils

with  a  mayor or  administrator  of  some  kind.

Increasingly  that's  being  adopted by  towns,  and  so  we're  looking  into  that.

For this talk, the task is to identify peer towns

that we could then interview, consult with, and reach out to.

I  mentioned  some  of  the  state legal  constraints.

One other constraint is that town boards,

like a governance study committee, have to have open meetings.

Anything  we  do  and  decide and  deliberate  about  has  to  be  in  public,

which  is  a  good  thing.

We  have  no  budget.

We  have  some  wonderful  staff in  the  town  hall,  but  they  are

busy  doing  other  things  as  well.

Data  availability  was  a  mixed  story.

Plenty  of  data  available  about characteristics  of  communities.

We're  really  interested  in  how  many  people participate  in  local  government

and  there's  no centralized  data  about  that,

so we  needed  to  hunt  for  proxies.

We  also  had  no  ability to  compel  folks  in  other  towns

to  meet  with  us,  advise  us, or  share  data  with  us.

We're  operating  in  a  topic  area that  is  heavily  governed  by  tradition.

People really cleave to that Norman Rockwell image.

We came up with a three-stage plan.

As a committee, we brainstormed variables: why do people participate?

Why  don't  people  participate?

Why  are  different  towns  different?

I  then  grabbed  some  data  from  voter turnout  in  a  recent  state-wide  election

to  use  as  a  proxy for  citizen  engagement.

Ran  some  models  in  JMP

to  identify  those  variables that  seem  to  have  predictive  value.

The  committee  then  discussed and  added some more variables

that  they  thought  were  important on  the  town  meeting  dimension.

That  generated  20  predictor  columns,

which  I  knew  was  far  more than  I  wanted  to  deal  with.

I consulted my brain trust, some academic colleagues. Special thanks go to

Mia Stephens and Ruth Hummel at JMP,

who advised me on principal components analysis,

which I'll note at the outset was not part of my comfort zone,

so I'll talk a little bit about that.

Then  I  ran  cluster  analysis.

That's the  main  event  today.

People  on  the  committee  understood

that  we  probably  want  to  be  talking to  towns  of  comparable  size,

but  there's  more  to  similarity  than  size.

There's  more  to  similarity  than being  a  geographic  neighbor.

Part  of  the  work  involved

instructing  the  committee a  little  bit  on  cluster  analysis.

Just  in  case  anybody  watching doesn't  have  much  background  in  this,

here's  how  I  did  it.

I  said  well,

we  can  look  at  population and  something  else  at  the  same  time.

And maybe that something else has an impact on participation.

In this case, the Y axis was single-family property tax bills.

You  can  see  that  there's  a  bunch of  towns  similar  in  size  to  Sharon,

but  which  might  have  very different  tax  impacts.

The idea in cluster analysis,

if you are going to work in two dimensions, is to

choose two attributes that you think are relevant to your query,

spread the towns out on those two dimensions,

and then identify a reasonable number of towns

that are reasonably similar to Sharon.

That's   a  big  idea  in  cluster  analysis.

Fortunately  we're  not  limited  to  two attributes  or  two  dimensions.

We  can  have  more  than  that.
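The two-attribute idea can be sketched in a few lines of Python. This is only an illustration, not the JMP workflow from the talk: the town names and figures below are invented, and `standardize` and `nearest` are hypothetical helper names. Each attribute is converted to z-scores so that population and tax bill are on a comparable scale, and towns are then ranked by straight-line distance from the target.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical towns: (population, average single-family tax bill in dollars).
# These figures are invented for illustration, not taken from the talk's data.
towns = {
    "Town A": (18500, 11000),
    "Town B": (17900, 10500),
    "Town C": (19200, 6200),
    "Town D": (52000, 7000),
    "Town E": (4300, 4100),
}

def standardize(values):
    """Convert raw values to z-scores so both attributes share a scale."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

names = list(towns)
z_pop = standardize([towns[n][0] for n in names])
z_tax = standardize([towns[n][1] for n in names])
points = {n: (zp, zt) for n, zp, zt in zip(names, z_pop, z_tax)}

def nearest(target, k):
    """Return the k towns closest to target by Euclidean distance."""
    tx, ty = points[target]
    others = [n for n in names if n != target]
    others.sort(key=lambda n: sqrt((points[n][0] - tx) ** 2 + (points[n][1] - ty) ** 2))
    return others[:k]

print(nearest("Town A", 2))
```

In this made-up data, Town C is close to Town A in population but ranks well behind Town B once the tax bill is considered, which is the point of looking at more than size alone.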

With that, I think you now know enough to follow the demo.

As we're walking into this demo,

I had gathered data from a variety of state and publicly available sources,

used Query Builder to build a large data table,

and inspected for outliers and missing data.

The  one real  outlier is  the  city  of  Boston,

which  is  just  unique.

That's excluded from all the analysis.

A little bit of missingness, but nothing terrible.

I'm  going  to  be  showing you  a  JMP  project.

Let  me  switch  gears,  move  into  the  demo and  I  hope  that  I  do  this  correctly.

What  we're  looking  at  is  my  data table  of  351  cities  and  towns.

The  first  several  columns are  identification,

size  of   The Select Board,

their legislative option, name of community.

The  next  20  columns  are  our  predictors.

Just to ground us a bit,

if we look at some basic descriptives of the communities,

towns in Massachusetts tend to be on the small side.

The median is only 10,000 people.

Sharon  is  quite  near the  mean  community  size.

In terms of legislative function, 74% use open-town meetings,

so we are in good company,

and in terms of the size of the Select Board, which is

another thing the governance committee is looking at, it's just about 50/50.

Half of the towns with a Select Board have three members, half have five.

We've  got  these  20  predictors.

One issue that comes up pretty early in the analysis is collinearity.

Here I have five

variables that all speak to the size of the electorate, the size of the town.

You can see that there are some very strong correlations.

We, generally speaking,

don't want to deal with so much collinearity.
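A quick way to see this kind of redundancy outside JMP is to compute pairwise Pearson correlations and flag the pairs above some cutoff. The sketch below makes several assumptions: the column names and values are invented, the 0.9 threshold is arbitrary, and `pearson` and `high_correlation_pairs` are hypothetical helpers.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical size-related columns for five towns (invented values).
columns = {
    "population": [4300, 10200, 18500, 31000, 52000],
    "registered_voters": [3100, 7400, 13500, 21800, 35500],
    "housing_units": [1900, 4100, 6900, 12500, 21000],
}

def high_correlation_pairs(cols, threshold=0.9):
    """List column pairs whose absolute correlation meets the threshold."""
    names = list(cols)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(cols[a], cols[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs

for a, b, r in high_correlation_pairs(columns):
    print(f"{a} ~ {b}: r = {r}")
```

When several columns all flag one another like this, they are largely carrying the same message, which is the motivation for condensing them.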

One  way  out is  principal  components  analysis.

At  this  point,  not  quite  ready to  jump  into  clustering,

but  want  to  take  those 20  columns  and  distil  them  down,

conserve  as  much  information  as  possible,

but  reduce  the  redundancy and  collinearity  across  columns.

To  do  that,

principal components  analysis is  an  excellent  option.

I  don't  have  the  ability  today to  give  a  full  crash  course

in  principal  components  analysis,

but  we  can  see  that we  have  variables  that  seem  to  be

overlapping  in  terms  of  their  message.

We  also  can  see  that

when  you  give  a  PCA  20  columns, it  initially  comes  up  with  20  components.

The  first  few  of  which  seem  to  capture most  of  the  variability.

We  have  to  make  a  decision  about

how  many  principal  components to  use  and  what  they  represent.

For this, the scree plot is helpful

and  we're  looking  for  a  kink or  an  elbow  in  the  plot.

That  seems  to  happen  somewhere  down here,  around   4, 5, 6  components.
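To make the scree idea concrete, here is a small sketch that turns a list of eigenvalues into each component's share of the variance and a running total. The eigenvalues are invented for illustration (they are not the values from the talk), and the function names are hypothetical. They were chosen to sum to 20 because the eigenvalues of a correlation matrix of 20 standardized columns sum to 20.

```python
# Hypothetical eigenvalues for 20 standardized columns (invented for
# illustration; they sum to 20, as eigenvalues of a correlation matrix do).
eigenvalues = [6.0, 3.5, 2.5, 1.8, 1.4, 1.0, 0.7, 0.6, 0.5, 0.4,
               0.35, 0.3, 0.25, 0.2, 0.15, 0.12, 0.1, 0.06, 0.04, 0.03]

def variance_shares(eigs):
    """Share of total variance captured by each component."""
    total = sum(eigs)
    return [e / total for e in eigs]

def cumulative(shares):
    """Running total of the variance shares."""
    running, out = 0.0, []
    for s in shares:
        running += s
        out.append(running)
    return out

shares = variance_shares(eigenvalues)
cum = cumulative(shares)

# A text-only scree plot: one bar per component, plus the cumulative share.
for i, (s, c) in enumerate(zip(shares, cum), start=1):
    print(f"PC{i:>2} {'#' * round(s * 100):<30} {s:6.1%}  cum {c:6.1%}")
```

Reading down the bars, you look for the point where they stop shrinking quickly. The choice of where to cut remains a judgment call, as it was in the talk.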

If  we  consult  the

loading  matrix  to  see  how

different variables load onto different components,

we  can  begin  to  subjectively assign  meaning  to  the  components.

I'll  cut  to  the  chase.

We  selected  six  principal  components

as  being  informative for  the  purposes  of  cluster  analysis,

and  came  up  with  some interpretations  that  made  sense  to  us.

Things  like  how  big  is  the  town?

How  affluent  is  the  town?

How  fast  is  it  growing?

Now  we're  ready  for  clustering.

Back to  JMP.

There  are  two  basic approaches  to  clustering.

You  might  think  of  them as  a  top- down  and  a  bottom- up.

Both  of  them  take  the  raw  data, standardize  it,

and  then  compute  Euclidean  distances for  each  pair  of  rows  in  the  data  table

for  each pair  of  communities.

Taking into account six factors,

six principal components

(size, affluence, education, things like that),

which ones are similar to Sharon?

In hierarchical clustering,

the report starts us off with a dazzling graph that, with 350 rows,

is hard to interpret.

We'll come back to it.

Let's begin with something that's a little easier to interpret,

which  is  the  cluster  summary.

In  the  hierarchical  method, JMP  has  found  for  us  16  clusters.

I can tell you, because I peeked,

that  Sharon  turns  out  to  be  in  cluster  15.

Sharon  and  23  other  towns.

For  example,

if  you  scan  down  the  affluence  column, we  see  that,

again  these  are  standardized  scores,

these  are  the  most  affluent towns  in  the  state.

If  we  come  over  here  to  the  growth  column,

and  this  is  largely  growth in  the  housing  units  and  population.

The  least  growth, in  fact,  some  negative  growth.

If  we  come  back  up  here.

Now,  having  looked  at  the  cluster  summary,

all  of  the  clusters have  been  identified  and  colored.

JMP  gives  us  a  cut  point.

If  we  zoom  in  on  cluster  15.

Let's make that a little bigger.

Sharon  is  here  in  the  center.

Its nearest Euclidean neighbor is Winchester,

which  is  about  an  hour's  drive.

We  now  have  a  provisional  list of  towns  to  consult  with.

All right, so that's a crash course in hierarchical clustering.

I'll move on to K-Means.

Hierarchical  is  bottom  up.

We  start  with 350  individual  towns  as  clusters.

We interrogate the distance matrix, all the pairwise distances,

find  those  two  towns that  are  nearest  to  each  other,

they  form  a  cluster.

We  take  the  mean  distance  of  that  cluster.

Now  we're  either  looking for  the  next  two  nearest  towns

or  the  next  town that's  closest  to  that  cluster,

and iteratively progress up the tree until

we  have  one  gigantic  cluster  of  350 towns.
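The bottom-up procedure just described can be sketched as follows. This is a simplified illustration, not JMP's implementation: it merges the two clusters whose centroids (mean positions) are closest, which follows the spoken description, whereas JMP offers several linkage methods and the talk does not say which was used. The six points are invented and `agglomerate` is a hypothetical function name.

```python
from math import dist  # available in Python 3.8+

def agglomerate(points, n_clusters):
    """Bottom-up clustering: repeatedly merge the two clusters whose
    centroids are closest, until only n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # every row starts alone

    def centroid(members):
        dims = len(points[0])
        return [sum(points[i][d] for i in members) / len(members) for d in range(dims)]

    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist(centroid(clusters[a]), centroid(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return [sorted(c) for c in clusters]

# Invented, already-standardized points in two dimensions.
points = [(0.0, 0.0), (0.3, 0.0), (0.1, 0.5), (3.0, 3.0), (3.4, 3.0), (-2.5, 2.0)]
print(agglomerate(points, 3))
```

Stopping at a chosen number of clusters corresponds to cutting the tree at a particular height. Letting the loop run to `n_clusters=1` reproduces the single all-inclusive cluster at the top of the tree.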

With K-Means clustering,

we flip the process.

We  start  with  350 towns  in  one  cluster

and  then  begin  slicing  and  dividing in  multiple  dimensions.

In this approach, same Euclidean distances, same distance matrix,

we end up with Sharon being in cluster number 4,

with a full complement of 33 towns.

We  automatically get  a  cluster  means  picture.

Again, very affluent, low growth,

not  necessarily  the  lowest, but  low  growth  again.

We  get  slightly  different  results.

I  think  in  the  interest  of  time, I  will  show  you  one  other  graph.

There's  various  things  to  look  at,

but let's look at the parallel coordinate plots.

What  is  this tool?

We  have  16  clusters,

and  by  the  way  in  K-Means,

it's  up  to  the  user to  specify  the  number  of  clusters.

I  chose  16  as  a  starting  point because  that's  what  hierarchical  gave  us.
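For readers who have not seen K-Means written out, here is a minimal sketch of the standard iteration (often called Lloyd's algorithm), in which the user supplies the number of clusters `k`. It is an illustration under stated assumptions, not a description of JMP's internals: the points are invented, and the starting centers are simply the first `k` rows, a naive choice made here only to keep the example deterministic.

```python
from math import dist  # available in Python 3.8+

def k_means(points, k, max_iter=100):
    """Assign each point to its nearest center, recompute the centers as
    cluster means, and repeat until the assignments stop changing."""
    centers = [list(p) for p in points[:k]]  # naive deterministic start
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's center where it was
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Invented, already-standardized points in two dimensions.
points = [(0.0, 0.0), (3.0, 3.0), (-2.5, 2.0), (0.3, 0.0), (3.4, 3.0), (0.1, 0.5)]
labels, centers = k_means(points, k=3)
print(labels)
```

Because the result depends on `k` and on the starting centers, it is reasonable to try more than one value, using the hierarchical result as a starting point as was done here.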

Here we are, cluster 4.

The  dark  brown  line  is  Sharon.

Here  we  see  the  six  characteristics, the  six  principal  components.

For example, if we compare,

how is cluster 4 different from cluster 3, let's say, or cluster 5?

Maybe similar in size.

Cluster 5, less affluent.

Property values are a little lower.

Permanent population refers to this:

there are communities with universities, hospitals, prisons, and so forth,

vacation homes, snowbirds who leave for the winter.

Towns differ in terms of their permanent populations. In cluster 3, much lower.

Here's where we find our university towns.

I just popped up the town of Shirley, Massachusetts.

It has the state's largest

maximum security prison.

I don't know if we consider these folks permanent residents or not, but in any event.

We've  done  two  different clustering  methods.

Let's  take  a  look at  how  the  results  compare.

I  saved  within  each  clustering  method,

the  cluster  assignments  for  each town, created  binary  variables.

Are  you  in  the  same  cluster  as  Sharon or  are  you  in  a  different  cluster?
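The comparison step can be sketched with plain sets. Everything here other than the name Sharon and its two cluster numbers is invented: the other towns, their labels, and the helper `same_cluster_as` are hypothetical. Cluster numbers are arbitrary labels, so the two methods are compared by membership, not by number.

```python
def same_cluster_as(assignments, target):
    """Names of towns sharing the target's cluster label (target excluded)."""
    label = assignments[target]
    return {t for t, c in assignments.items() if c == label and t != target}

# Hypothetical saved cluster assignments from the two methods.
hierarchical = {"Sharon": 15, "Town A": 15, "Town B": 15, "Town C": 3, "Town D": 7}
k_means_labels = {"Sharon": 4, "Town A": 4, "Town B": 4, "Town C": 4, "Town D": 9}

peers_h = same_cluster_as(hierarchical, "Sharon")
peers_k = same_cluster_as(k_means_labels, "Sharon")

both = peers_h & peers_k                    # flagged by both methods
only_k = peers_k - peers_h                  # flagged by K-Means only
overlap = len(both) / len(peers_h | peers_k)  # Jaccard index, 0 to 1

print(sorted(both), sorted(only_k), round(overlap, 2))
```

A high overlap suggests the peer list is not very sensitive to the choice of clustering method.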

I  also,  just  as  an  aside,

JMP has lots of wonderful built-in geographic maps.

It does not have a built-in map showing

municipality  borders within  the  state  of  Massachusetts.

But  it  turns  out  that  with  JMP

it's  fairly  easy to  create  a  new  geographic  map.

I  was  able  to  do  this without  very  much  work  at  all.

Here  are  the  results of  hierarchical  clustering,  cluster  15.

Sharon  is  here.

Its similar towns are in blue.

I was pleased to see that my little tiny hometown of Marblehead

is  similar  to  the  place  I  moved  to.

Hierarchical cluster 15 gives us these 24 towns.

K-Means clustering adds some more towns, but there's an awful lot of overlap.

I also, just out of curiosity,

looked to identify the 33 towns, about 10% of the state,

that are most similar to Sharon.

This  is  a  larger  group.

Again,  an  awful  lot  of  repeats.

A  lot  of  repeats.

That  last  approach  also gave  us  some  other  advantages.

I  want  to  now  shift  back  to  PowerPoint and  talk  about  some  of  those.

One last point before I finish the demo.

We  were  also  curious  to  ask...

Mostly  our  goal  was, who  shall  we  interview?

Who  shall  we  call  in  to  meet publicly  with  our  committee?

But  while  we're  at  it,

let's  see  what  our peers  do  in  terms  of  governance.

State-wide,

open-town meeting (OTM) dominates at 74%, and there is no dominant board size.

If  we  look  at  who's  in  our  cluster,

let  me  use  K-Means, because  it's  a  little  bit  larger  group.

That  74%,

jumps up to 85% with town meeting.

By  a  two  to  one  margin,

towns  have   five-member  Select  Boards.

Now, this isn't definitive. To channel my mother:

If  all  the  other  towns  jumped  off the  Empire  State  Building,

we  wouldn't  necessarily  want  to  jump  off.

But  it's  interesting  to  note that  the  towns  most  similar  to  Sharon

favor  the   five-member  Board

and are even more inclined to open-town meeting.

With  that,  let  me

get  back  to  some  conclusions.

So what  did  we  learn?

One thing we learned was that geographic proximity is uninformative. In those maps,

none of the abutting towns came up blue.

Our  most  similar  communities are  not  our  next  door  neighbors.

As  I  just  noted,

open-town meeting and five-member boards really predominate.

So  what  did  we  do?

This  work  actually  happened several  months  ago.

We  were  able  to  prioritize  our  outreach,

begin  contacting  those towns  most  similar  to  us.

Many  were  extremely  cooperative and  shared  a  lot  of  information  and  data.

We  also  didn't  want  to  assume that  open-town  meeting

was  the  only  way  to  consider,

so we wanted to talk to people with representative town meetings or councils.

Those  Euclidean  distances became  instructive  in  terms  of,

okay,  none  of  our  immediate  neighbors, closest  neighbors

use  town  council or  representative  town  meeting.

But  which  RTM  town, which  council  town  is  most  like  us?

And we  contacted  those  folks  as  well.

We went from having to contemplate outreach to 350 towns

in a limited amount of time and with no money and staff,

to a focused sampling method.

Then  because  town  officials talk  to  one  another

and  they  are  professionally  active,

that  led  us  to  other  interviewees.

With  that,  I  think  that's  about  my  time.

I  hope  this  has  been interesting  and  constructive.

Thank  you  for  coming

and  I  hope  you  enjoy the  rest  of  the  program.

Comments
TCM

Hi Rob (or anyone who wishes to answer),

I previously sent a PM, then remembered PMs are discouraged.

 

I am following the same process as yours:  I performed PCA on 90 responses and selected the first 23 PC's (I have inspected and "labeled" my PC's) as inputs to a k-means clustering --> 7 clusters were deemed optimal.  Now I have to describe and label my clusters. 

Q1: I should use the cluster means from the first n PC's (I choose the first 10 of the 23) to describe my clusters, correct?

Q2:  If yes to Q1, does a high negative cluster mean value signify that the cluster is strongly "anti" that particular PC?  By the same logic, a high + cluster mean value would signify a strong feature for that cluster.

 

My cluster member counts are pretty balanced btw.

 

Thank you for your guidance.

rcarver

Hi TCM,

Thanks for asking.  
Interpretation of the clusters is where you bring your expert domain knowledge to bear. You do want to inspect the 7 sets of Cluster Means and look for meaningful characterizations of the 10 PCs. So, we might imagine a consumer cluster of people who often spend on horror films, classical music, and cotton sweaters (assuming that you could identify the PCs as related to those items). Don't be shocked if it's a challenge to give plausible meanings to all 10 PCs and/or all 7 clusters.

 

As for the super-high mean of PC1 in one cluster, others might have a different take on this. Even though the data are standardized in creating the PCs and the Clusters, you can still certainly have some observations with high values. Whatever PC1 represents, the consumers in that cluster seem to have a lot of it! Does that make sense?

You mention that the cluster memberships are balanced, but does that cluster happen to have a relatively small n?

I'm eager to see what others have to say. Meanwhile, I hope this is helpful.

--Rob

 
 
TCM

Hi Rob,

Thank you for your speedy response!

 

"Even though the data are standardized in creating the PCs and the Clusters, you can still certainly have some observations with high values. Whatever PC1 represents, the consumers in that cluster seem to have a lot of it! Does that make sense?"

My response: Yes, that is what I took it to mean... I will look at the top loadings for PC1 and use that to characterize the cluster with the high mean value to it. In addition, it would seem that because this cluster is heavy on PC1 (which is the most informative of all PC's), this cluster is probably very important in our study.

 

"You mention that the cluster memberships are balanced, but does that cluster happen to have a relatively small n?"

My response: No, the number of members is pretty high and comparable to the number of members of the other clusters... I take the similarity in membership counts among the clusters to be a positive sign.

 

I am also curious to hear of others' comments.