ColdFusion in Context: Grouping Families for Visits

Suppose you're responsible for assigning families who could use a visit to families who can perform visits. It would be nice to arrange things so the helpers doing the visiting aren't driving all over town to do this. Here's one way that ColdFusion can help.

Overview

Here's the short story. Once you use map coordinates to indicate where these families live, this tip will assign helpers to the families farthest away from the center of your map first and will assign the closests available helpers to those families. Additional code preserves existing pairings where you wish and will consider families having special needs prior to considering the general population.

Map Coordinates. Manually enter map grid designators (which consist of letters and numbers) into the database. This code will convert the letters to numbers so you can treat the map grid as XY coordinates.

Relative Distance from Center. The code will determine the distance between each helper and every family. When it sums these distances by family, the families at the edges of your map will have the largest sums. Think of this sum as a measure of the family's isolation from the rest of the group. Sequencing an assignment query in descending order by isolation (largest first) will cause the code to assign the families that have the fewest helpers nearby first. Families near the center have many helpers to pick from; so, their assignment can be handled with the helpers that are left over.

Closest Helper. At this point, the code knows the distance between all helpers and all families. After isolation, it sub-sequences the query in ascending order by distance (smallest first). The query that does the work will specify "order by Isolation desc, Isolation asc". This way, if families are equally isolated, it will assign the best helper match first.

Firm Helpers. Through experience, you have already set some of these relationships and don't want to change them. So, you'll manually mark them "Firm" in the Family table so this code will use these choices "as is" in the final product.

Bookkeeping. To avoid overloading helpers and to avoid assigning the same family twice, you'll need to keep track of each helper's Load and note if a family is already Taken.

Special Needs. Perhaps you would like to pair helpers having special attributes with families having special needs. For example, there might be a language or background that helpers and families have in common (HasX and NeedsX respectively). Handle these families before the rest.

Data

Here's some fanciful sample data in a table named Family3 (referred to in descriptions as simply "Family". Columns: XLtr (map column letter), YNr (map row number), Family (text), Helps (1 means yes), Needs (1 means yes), HasX (a special attribute), NeedsX (a special attribute), Helper (starts empty except for firm pairs), Firm (1 means don't change). The code will modify the Helper column each time and will post the remaining columns each time. FamilyLtr will label a family; HelperLtr will be shorthand for a helper, and Distance will be the distance between them in miles.

FAMILY: headings followed by data...
XLtr, YNr, Family, Helps, Needs, HasX, NeedsX, Helper, Firm, XNr, FamilyLtr, HelperLtr, Distance
P,19,A.Turner,0,1,,,,,,,,
S,20,B.Turner,1,1,,,,,,,,
N,23,C.Turner,0,1,S,,,,,,,
S,20,D.Turner,0,1,,,,,,,,
M,25,E.Turner,0,1,,,,,,,,
L,22,F.Turner,0,1,,,,,,,,
W,20,G.Turner,0,1,,,,,,,,
P,18,H.Turner,0,1,,,,,,,,
Q,22,I.Turner,0,1,,,,,,,,
P,19,J.Turner,0,1,,,,,,,,
Q,18,K.Turner,0,1,M,,,,,,,
Q,18,L.Turner,0,1,,,B.Turner,1,,,,
Q,18,M.Turner,0,1,,S,,,,,,
P,19,N.Turner,1,1,,,,,,,,
Z,21,O.Turner,1,1,,,,,,,,
,,P.Turner,0,1,,,,,,,,
M,21,Q.Turner,1,1,M,M,,,,,,
P,21,R.Turner,0,1,,,,,,,,
P,21,S.Turner,0,1,,,,,,,,
S,20,T.Turner,0,1,,,,,,,,
Q,20,U.Turner,0,1,,,,,,,,
X,19,V.Turner,0,1,S,,,,,,,
O,21,W.Turner,0,1,,,,,,,,
S,20,X.Turner,0,1,,,,,,,,
W,18,Y.Turner,1,0,,,,,,,,
P,21,Z.Turner,0,1,,,N.Turner,1,,,,

You'll also need a Label table that relates map column grid letters to X coordinate numbers. The table will have two columns: Label (the letter at the top of the map column) and XNr (the X Number, the X coordinate to which it corresponds).

In the Label table, A-Z correspond to 1-26 as you would expect. If you have additional letters (such as AA, AB, etc.), enter them with the appropriate values (such as 27, 28, etc.).

Action

Put all the code for this tip in visit.cfm. Post the X coordinates to the Family table by matching the column letter (XLtr) in the Family table to the Label in the Label table.

<!--- Post numeric X coordinate to Family table --->
<cfquery name="setXNr" datasource="context">
update Family3, Label
set Family3.XNr =  Label.XNr
where Family3.XLtr = Label.Label
</cfquery>

Number Families to make it easier to use a graphic display (outside the scope of this tip).

<!--- Number families; remove Helper letter; zero distance --->
<cfquery name="getAllFamilies" datasource="context">
select Family
from Family3
order by Family
</cfquery>
<cfset Row=0>
<cfloop query="getAllFamilies">
<cfset Row=Row+1>
<cfquery name="setFamilyLtr" datasource="context">
update Family3 set
FamilyLtr = '#Row#',
HelperLtr = '',
Distance = 0
where Family = '#getAllFamilies.Family#'
</cfquery>
</cfloop>

Join the Family table to itself to put all useful combinations of Helpers and Families needing a visit into a Distance table. You need to drop that table before each run. (Because it won't exist the first time, make a dummy table or comment out the drop action the very first time you run it.) The formula for distance between two points takes the difference between X coordinates, squares it, adds it to the square of the difference between the Y coordinates, gets the square root of the result (an exponent of .5), and multiplies the whole thing by the grid size in miles (.857 in my case). Give the result an Isolation column into which you'll later post the isolation for the family being visited. The join is between Helpers (a.Helps=1) and Families needing a visit (b.Needs=1). Because you can't recognize your own problems as well as someone else can, a combination where you would visit yourself is excluded (a.Family <> b.Family). Also, if you don't have valid coordinates for the Helper or Family needing a visit, you can't compute the distance; so, exclude those pairs also. (This code just checks the X coordinate, assuming that both X and Y are present or both are absent.)

<!--- Build a fresh Distance table --->
<cfquery name="dropDistance" datasource="context">
drop table Distance3
</cfquery>
<cfquery name="makeDistance" datasource="context">
select a.Family AS Helper, a.XNr AS HelperX,
a.YNr AS HelperY, b.Family AS Family,
b.XNr AS FamilyX, b.YNr AS FamilyY,
((a.XNr-b.XNr)^2+(a.YNr-b.YNr)^2)^.5*.857 AS Distance,
0 as Isolation, 0 as Taken, 0 as Firm,
a.Helps as Helps, b.Needs as Needs,
a.HasX as HasX, b.NeedsX as NeedsX
into Distance3
from Family3 AS a, Family3 AS b
where a.Helps = 1
and b.Needs = 1
and a.Family <> b.Family
and a.XNr <> NULL
and b.XNr <> NULL
</cfquery>

Compute and store the Isolation for each Family in a Isolation table. To do this, drop the previous Isolation table - skip this step or make a fake table for the very first run - and produce a table having the sum of each Family's distance from all Helpers. The group by clause is needed for this select statement to work.

<!--- Build a fresh Isolation Table --->
<cfquery name="dropIsolation" datasource="context">
drop table Isolation3
</cfquery>
<cfquery name="makeIsolation" datasource="context">
select Family, sum(Distance) as Isolation
into Isolation3
from Distance3
group by Family
order by Family
</cfquery>

Post the Isolation to the Distance table so a query to do the work of finding reasonable groups to visit will have this information avaiable.

<!--- Post Isolation to Distance table --->
<cfquery name="postIsolationToDistance" datasource="context">
update Distance3, Isolation3
set Distance3.Isolation = Isolation3.Isolation
where Distance3.Family = Isolation3.Family
</cfquery>

To preserve "Firm" relationships, post them to the Distance table.

<!--- Post Firm to Distance table --->
<cfquery name="postFirm" datasource="context">
update Family3, Distance3
set Distance3.Firm = 1
where Family3.Firm = 1
and Distance3.Helper = Family3.Helper
and Distance3.Family = Family3.Family
</cfquery>

Build a table in which to post workload by helper (Helps) and a table in which to note if a family to be visited (Needs) is taken.

<!--- Build a fresh Helps table to manage load --->
<cfquery name="dropHelps" datasource="context">
drop table Helps3
</cfquery>
<cfquery name="makeHelps" datasource="context">
select Family,
0 as Load, FamilyLtr as HelperLtr
into Helps3
from Family3
where Helps = 1
</cfquery>

<!--- Build a fresh Needs table to mark Families as taken --->
<cfquery name="dropNeeds" datasource="context">
drop table Needs3
</cfquery>
<cfquery name="makeNeeds" datasource="context">
select Family,
0 as Taken
into Needs3
from Family3
where Needs = 1
</cfquery>

Determine the average number of families per helper, add one to that, and use this figure for the maximum workload. (See "Discussion" later.)

<!--- Use more than the ratio of needs to helps for max load --->
<cfquery name="getHelps" datasource="context">
select count(*) as HelpsCount
from Helps3
</cfquery>
<cfquery name="getNeeds" datasource="context">
select count (*) as NeedsCount
from Needs3
</cfquery>
<cfset MaxLoad=(getNeeds.NeedsCount/getHelps.HelpsCount)+1>

Assign the "Firm" relationships, posting the "Load" they provide and marking their families as "Taken". Check the load as you go in case you've manually preassigned too many (in which case some assignments will not be honored.)

<!--- Pair firm (pre-assigned) pairs --->
<cfquery name="tryFirm" datasource="context">
select * from Distance3
where Firm = 1
order by Isolation desc, Distance asc
</cfquery>
<cfloop query="tryFirm">
<cfquery name="getHelper" datasource="context">
select *
from Helps3
where Family = '#tryFirm.Helper#'
</cfquery>
<cfquery name="getNeeder" datasource="context">
select *
from Needs3
where Family = '#tryFirm.Family#'
</cfquery>
<cfif (getNeeder.taken lt 1) and (getHelper.Load lt MaxLoad)>
  <cfquery name="upLoad" datasource="context">
  update Helps3
  set Load = Load + 1
  where Family = '#tryFirm.Helper#'
  </cfquery>
  <cfquery name="doTake" datasource="context">
  update Needs3
  set Taken = 1
  where Family = '#tryFirm.Family#'
  </cfquery>
  <cfquery name="doPost" datasource="context">
  update Family3
  set Helper = '#tryFirm.Helper#',
  HelperLtr = '#getHelper.HelperLtr#',
  Distance = #tryFirm.Distance#
  where Family = '#tryFirm.Family#'
  </cfquery>
</cfif>
</cfloop>

Perhaps you would like to pair helpers having special attributes with families having special needs. For example, there might be a language or background that helpers and families have in common (HasX and NeedsX respectively).

This pass is a bit different. This time, exclude pairs that have already been taken. Also, this time, sequence matters. Order primarily by Isolation (descending) and then by Distance (ascending). Post "Load" and "Taken" as before, checking the load as you go.

<!--- Pair those with special attributes --->
<cfquery name="tryX" datasource="context">
select * from Distance3, Needs3
where Needs3.Taken < 1
and Needs3.Family = Distance3.Family
and HasX <> ''
and HasX = NeedsX
order by Isolation desc, Distance asc
</cfquery>
<cfloop query="tryX">
<cfquery name="getHelper" datasource="context">
select *
from Helps3
where Family = '#tryX.Helper#'
</cfquery>
<cfquery name="getNeeder" datasource="context">
select *
from Needs3
where Family = '#tryX.Family#'
</cfquery>
<cfif (getNeeder.taken lt 1) and (getHelper.Load lt MaxLoad)>
  <cfquery name="upLoad" datasource="context">
  update Helps3
  set Load = Load + 1
  where Family = '#tryX.Helper#'
  </cfquery>
  <cfquery name="doTake" datasource="context">
  update Needs3
  set Taken = 1
  where Family = '#tryX.Family#'
  </cfquery>
  <cfquery name="doPost" datasource="context">
  update Family3
  set Helper = '#tryX.Helper#',
  HelperLtr = '#getHelper.HelperLtr#',
  Distance = #tryX.Distance#
  where Family = '#tryX.Family#'
  </cfquery>
</cfif>
</cfloop>

Now assign the remaining families. Do this in two passes, first remaining one below your previously set limit and then working at your previously assigned limit. (See "Discussion" later.)

<!--- Remainder --->
<cfloop from="#evaluate(MaxLoad-1)#" to="#MaxLoad#" index="PassLoad">
<cfquery name="tryAll" datasource="context">
select * from Distance3, Needs3
where Needs3.Taken < 1
and Needs3.Family = Distance3.Family
order by Isolation desc, Distance asc
</cfquery>
<cfloop query="tryAll">
<cfquery name="getHelper" datasource="context">
select *
from Helps3
where Family = '#tryAll.Helper#'
</cfquery>
<cfquery name="getNeeder" datasource="context">
select *
from Needs3
where Family = '#tryAll.Family#'
</cfquery>
<cfif (getNeeder.taken lt 1) and (getHelper.Load lt PassLoad)>
  <cfquery name="upLoad" datasource="context">
  update Helps3
  set Load = Load + 1
  where Family = '#tryAll.Helper#'
  </cfquery>
  <cfquery name="doTake" datasource="context">
  update Needs3
  set Taken = 1
  where Family = '#tryAll.Family#'
  </cfquery>
  <cfquery name="doPost" datasource="context">
  update Family3
  set Helper = '#tryAll.Helper#',
  HelperLtr = '#getHelper.HelperLtr#',
  Distance = #tryAll.Distance#
  where Family = '#tryAll.Family#'
  </cfquery>
</cfif>
</cfloop>
</cfloop>

There will eventually be some helpers left over from previous runs as you re-work your routes. Remove the helper from pairs where the Needs value is not 1.

<cfquery name="removeOld" datasource="context">
update Family3
set Helper = ''
where Needs <> 1
</cfquery>

Build a simple list to see the result.

<cfquery name="getRoutes" datasource="context">
select Helper, Family, XLtr, YNr, Needs, 
Distance, Firm
from Family3
order by Helper, Family
</cfquery>

HELPER, FAMILY, GRID, MILES<br>
<cfoutput query="getRoutes">
#Helper#, #Family#, #Xltr##YNr#,
<cfif len(trim(Helper))>
  #round(val(Distance))#
  <cfif val(Firm)>
    *
  </cfif>
<cfelse>
  <cfif val(Needs) and (Needs gt 0)>
    Needs Helper
  </cfif>
</cfif>
<br>
</cfoutput>

Browse visit.cfm - it will take a minute or two - and consider the results. One family that needed a helper didn't get one; because, grid coordinates were missing. Another family that doesn't need a helper is listed without a helper. The remainder of the list consists of helper-family relationships. An asterisk indicates where the relationship is "firm".

Discussion

This code does a reasonable job of suggesting assignments. It avoids creating pairs where folks drive halfway across town unnecessarily. Instead, helpers visit neighbors. The only drawback some might perceive is that they might not consider the workload distribution to be "fair".

"Fairness" is an odd term. Equal isn't aways "fair". If I need medicine, I hope you don't deny it because Johnny next to me, who doesn't need it, isn't getting it. Manually examine the Helps table. You'll notice that not all helpers are visiting the same number of families.

The reason isn't the maximum workload: it's the way the workload is built. Building the workload in layers, adding one family to each helper and repeating the process, would force the number of families seen by each helper to be close to equal but would increase the travel distances. To see this effect, change the "from" attribute of the "from..to" loop to something lower (perhaps half the Max Load) and let it step up from that point. (Caution; this also increases processing time substantially.)

Instead of doing this, the code (ignoring the firm and special needs passes for a moment) makes only two passes: one to distribute a rough average workload and one to clean up leftovers (if any). It could use a single pass with a ceiling high enough to handle everyone, but using two passes seems to produce better results. You can see this more clearly when you're working with triple the number of families used in this example.

It uses one plus the raw average as the maximum load; because, this is a quick way to handle the fractional portion of the family/helper ratio. As noted above, the assignment is more sensitive to the manner of selection than to this maximum limit.

A "perfect" solution would still cause the load to vary by 1 (when dividing families by helpers yields a remainder), and it would result in longer average distances. The code "as is" produces good real-world results.

You might want to consider faster ways to reach this result. Perhaps you can combine queries. Perhaps you can use structures for better performance. Perhaps you can better handle the sitation where a Helper is on the edge of the map. Work on this problem and tell us what you've learned. =Marty=