ColdFusion in Context: From Calling Pairs to Calling Tree

Suppose you, after long labor, had finally compiled a list that shows what pages or functions call what other pages and functions in your application. How could you represent it in a hierarchical outline that looks like a Web site map?

You may have seen algorithms for a depth-first search or breadth-first search whose purpose is to visit each link exactly once. That's not quite what you want in order to print a calling tree for your application. The idea is, as far as possible, to show the hierarchy of program calls for your application. Because your utility functions will be called by multiple programs, they'll show up multiple times in the hierarchy.

This demonstration shows a method for printing this hierarchy. Along the way, you'll use lists, use a two-dimensional ColdFusion array, and consider how to handle cross-linked applications when diagramming a Web site. When you're done, you can use this technique to improve your understanding of your application, or expand the technique to automatically generate a site map.

Links You Shouldn't Follow

Beginning with a specified starting page within the site, you want to get the list of links leaving that page, follow almost every one, and repeat the process until every almost every path leaving the starting page has been exhausted. As in much of life, it's the "almost" that takes some thought. There are two reasons you would not want to follow and expand a particular link.

One reason you would not to follow and expand a particular link is that the link connects with a page that's earlier in the path you're following now. It doesn't matter if you've already visited the page. The key problem is whether it currently sits between you and the starting point. Following a link like that would put you in an endless loop. The technique you'll see here uses a stack to dynamically keep track of the current trail (list of pages) that leads back from where you are right now to the starting point.

The other reason for not following a link has to do with the manner in which a table of calling pairs is generated. Usually, the table lists every possible link that leaves a given page, regardless of the decision logic on the page. If you're trying to document the paths that might be taken by a normal user, you won't want to follow a branch that's only open to users who have administrative privileges. You would in this situation run the tree twice: once with the admin path blocked and once with the normal path blocked so you can clearly see the difference. The technique you'll see here reads pages to skip into a list in memory that you compare with each link before you follow it.

Navigation Array

The technique shown in this tip uses a two-dimensional ColdFusion array to keep track of its progress as it walks through the calling tree. It stores the starting point in position /1,1/ [level 1, first element]. It then uses an SQL statement to copy down all the links called by that page and deletes the reference (/1,1/). It stores the first link from the starting point in /2,1/ [level 2, first element]. It pushes the next link from the starting point into that same position (/2,1/), displacing its former contents into /2,2/ [level 2, second element], and so on until every page called by /1,1/ is represented in /2,1..N/. It then runs the query again using the first element on this level (/2,1/), removes it, and pushes its links in the same fashion into /3,1..N/. When it can't chase the first element on a level any farther, it removes it, shfting the contents of that level back one position so that /level,2/ becomes the new /level,1/, and the process continues. When all the elements on a level vanish, the process continues with the previous level. The array grows and shrinks, grows and shrinks, until it finally vanishes and the process stops.

Start with Calling Pairs

You'll need a list showing each relationship as Caller and Callee [a page called by Caller]. Here's an list of pairs you can read into a database. Name the table Tree. Give it elements named Caller and Callee. Define each of these to handle a string of perhaps 200 bytes (in case you have long paths).

CALLER,CALLEE
maze/index.htm,maze/page2.cfm
maze/index.htm,maze/page3.cfm
maze/index.htm,maze/page2.cfm
maze/page1.cfm,maze/basement/page1.cfm
maze/page1.cfm,maze/attic/page1.cfm
maze/page2.cfm,maze/attic/page2.cfm
maze/page2.cfm,maze/attic/page1.cfm
maze/page3.cfm,maze/basement/page2.cfm
maze/page3.cfm,maze/attic/page2.cfm
maze/attic/page1.cfm,maze/attic/page2.cfm
maze/attic/page1.cfm,maze/index.htm
maze/attic/page2.cfm,maze/page2.cfm
maze/attic/page2.cfm,maze/page3.cfm
maze/basement/page1.cfm,maze/basement/page2.cfm
maze/basement/page1.cfm,maze/page2.cfm
maze/basement/page2.cfm,maze/basement/page2.cfm
maze/basement/page2.cfm,maze/basement/page3.cfm

Define a Skip List

For applications more complex than this one, you'll want to avoid expanding certain pages during certain runs because you want your tree to concentrate on a given logical subset of the site (such as the adminstrative side, for example). Create a table named Treeskip. Give it elements named Description, Caller, and StopYN. Configure Caller to hold at least 200 bytes. Make StopYN a numeric field.

You can block as many pages as you need to. For this demonstration, add the following row to the table. Because StopYN is set to 1, this page's children won't be included in the tree:

DESCRIPTION,CALLER,STOPYN
Special Permission Needed,maze/attic/page2.cfm,1

Lay the Groundwork

Put all the code in treewalk.cfm. You'll begin with cosmetics, set defaults, and read the skip list from the database.

The first consideration is cosmetic. The Web server normally renders white space around ColdFusion statements, and in this situation, the white space will interfere with the appearance of the tree if it isn't blocked. After specifying preformatted text, immediately override all output except that found between cfoutput tags. Toss in a paragraph marker to get a fresh start.

<pre><cfsetting enablecfoutputonly="yes">
<cfoutput><p></cfoutput>

Then set defaults. The trace will stop when the number of steps it takes exceeds the Limit. The Stack is a string used as both a string and a list. The navigation array Nav is two-dimensional. A subset of the Indent string precedes each page to visually emphasize its level in the hierarchy. Navigation will begin with the first level (Lev). If your site is so complex that you wish to trace the top portion and then trace major branches in separate runs, change the Offset here so that if a branch begins on say, the fifth level relative to the main trace, you can change the Offset to 5 to make the printed trace of that branch appear to start at level 5. (The trace will be performed as usual; only the display will be affected.) Finally, set Nav[Lev] (which at this time points to level 1, element 1) to be your starting point: the Caller from which you wish to trace.

<cfset Limit=3000>
<cfset Stack="">
<cfset Nav=ArrayNew(2)>
<cfset Indent=":::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::">
<cfset Lev=1>
<!--- !!!Adjust offset so sub-lists merge with a top-level list!!! --->
<cfset Offset=0>
<!--- Adjust starting point --->
<cfset dummy=arrayPrepend(Nav[Lev],"maze/index.htm")>
<cfif not len(trim(Nav[Lev][1]))>
  Starting point not properly set
  <cfabort>
</cfif>

Now read the skip list you've entered manually into the Treeskip table. If StopYN is not zero, you'll skip that particular Caller. (Conversely, if you want to lift the restriction, you can simply change the value of this element in the database instead of removing the entire entry.) The function valueList stuffs the contents of the Caller column of the query into a list of items separated by a comma (by default). The navigation logic will subsequently check the list rather than the query.

<!--- Change this table in the database to skip files that interfere with listing --->
<!--- Just set the StopYN element to 1 if you don't want to follow the link --->
<cfquery name="Skips" datasource="context">
select Description, Caller from Treeskip
where StopYN <> 0
</cfquery>
<cfset Treeskip=ValueList(Skips.Caller)>

Begin Looping, and Back Up as Needed

The master loop continues until too many steps have been taken or the program breaks out of the master loop. To back up, use a secondary loop. In the secondary loop, when a level is not empty, break out of the secondary loop. If the level is empty, pop the Stack (if it's not empty), back up one level, and if you've reached level zero, you're done. The listRest function drops the first element from a list, leaving the rest. The simplicity of dropping an element in this manner is why subsequent code adds items to the beginning of the list (using the listPrepend function) instead of adding items to the end of the list (with the listAppend function). The test for level zero after the secondary loop is needed because breaking out of the secondary loop leaves you still inside the master loop.

<!--- Loop until done --->
<cfset Step=0>
<cfloop condition="Step lt Limit">

<!--- Find good item or quit --->
<cfloop condition="Step lt Limit">
<cfset Step=Step+1>
<cfif arrayLen(Nav[Lev])>
  <!--- Valid value ready for use --->
  <cfbreak>
<cfelse>
  <!--- Back up one level in the navigation array  --->
  <cfif listLen(Stack)>
    <cfset Stack=listRest(Stack)>
  </cfif>
  <cfset Lev=Lev-1>
  <cfif Lev lt 1>
    <!--- Done --->
    <cfbreak>
  </cfif>
</cfif>
</cfloop>
<!--- Stop if all done --->
<cfif Lev lt 1>
  <cfbreak>
</cfif>

Print the Item

Because comments may be added to the printed line later, the line begins with a break. Next comes a string of as many colons as the level number (plus the offset). It then prints the level number (plus the offset). It then prints the item itself: always element one on this level. A sample line might look like this: "::: 3 mypage". Here's the code.

<!--- Print level and item --->
<cfoutput><br>#left(Indent,evaluate(Offset+Lev))# #evaluate(Offset+Lev)# #Nav[Lev][1]#</cfoutput>

Build the Stack, Adjust the Array, and Close the Master Loop

If the item is in the list of items to be skipped, add an appropriate comment to the line and delete this item from the array.

<cfif ListFind(Treeskip,Nav[Lev][1]) gt 0>
  <cfoutput>[on skip list]</cfoutput>
  <cfset dummy=arrayDeleteAt(Nav[Lev],1)>

Otherwise, if the item is NOT on the stack [the path directly back to the starting point], add it to the stack and run a query to try to populate the next level of the array. Because the last row of the query will wind up in the first element of a level, the query is specified as DESC (descending) so that the last row of the query is the first item you want to work with.

<cfelseif ListFind(Stack,Nav[Lev][1]) lt 1>
  <!--- If children, fill next level and put 1st item in stack --->
  <cfquery name="Callees" datasource="context">
  select Caller, Callee
  from Tree
  where Caller = '#Nav[Lev][1]#'
  order by Callee desc
  </cfquery>

If the query returned rows, this caller is a parent. Add the parent to the stack and drop the parent from the array. Add one to the current level, and loop through the query, repetitively stuffing the array from the front (/level,1/). If there are no useful children, don't add to the stack; just add an appropriate comment to the printed line and drop the item from the array.

  <cfif Callees.recordcount>
    <cfset Stack=listPrepend(Stack,Nav[Lev][1])>
    <cfset dummy=arrayDeleteAt(Nav[Lev],1)>
    <!--- Go to and fill the next level --->
    <cfset Lev=Lev+1>
    <cfloop query="Callees">
    <cfif len(trim(Callee))>
      <cfset dummy=arrayPrepend(Nav[Lev],Callee)>
    </cfif>
    </cfloop>
  <cfelse>
    <!--- No children --->
    <cfoutput>[no kids]</cfoutput>
    <cfset dummy=arrayDeleteAt(Nav[Lev],1)>
  </cfif>

If the item was not on the list to be skipped but WAS on the stack, add an appropriate comment to the printed line.

<cfelse>
  <!--- Comment, and remove this item from this level --->
  <cfoutput>[in stack]</cfoutput>
  <cfset dummy=arrayDeleteAt(Nav[Lev],1)>
</cfif>

Finally, increment the step counter and close the master loop.

<cfset Step=Step+1>
</cfloop>

Cosmetics and Caveats

Now it's time for cosmetics and caveats. Turn off the white space blocker. Indicate if the run completed normally or the step limit was exceeded. Although it will be obvious from the top of the tree, as a formality, specify the amount of offset used. Finally, list descriptions and names of files that the process was told to skip.

<!--- Close --->
<cfsetting enablecfoutputonly="no"></pre><br>
<p>
<cfif Limit ge Step>
  The run completed successfully.
<cfelse>
  The run was too long; use the database to skip links
  or raise Limit in the code.
</cfif>
<br>
Offset: <cfoutput>#Offset#</cfoutput><br>
Files deliberately not expanded in this tree...<br>
<table border="1">
<cfoutput query="Skips">
<tr><td>#Caller#</td><td>#Description#</td></tr>
</cfoutput>
</table>

Try It Out

Browse treewalk.cfm and glance at the result (the bulk of which is reproduced below). If you're wondering why maze/page2 is expanded twice, notice that maze/index.htm calls it twice in the list of calling pairs above. Once glance at this list has highlighted a potential problem.

: 1 maze/index.htm 
:: 2 maze/page2.cfm 
::: 3 maze/attic/page1.cfm 
:::: 4 maze/attic/page2.cfm [on skip list] 
:::: 4 maze/index.htm [in stack]
::: 3 maze/attic/page2.cfm [on skip list] 
:: 2 maze/page2.cfm 
::: 3 maze/attic/page1.cfm 
:::: 4 maze/attic/page2.cfm [on skip list] 
:::: 4 maze/index.htm [in stack]
::: 3 maze/attic/page2.cfm [on skip list] 
:: 2 maze/page3.cfm 
::: 3 maze/attic/page2.cfm [on skip list] 
::: 3 maze/basement/page2.cfm 
:::: 4 maze/basement/page2.cfm [in stack]
:::: 4 maze/basement/page3.cfm [no kids]

In the database, change StopYN for the blocked page to zero, unblocking it, and run treewalk.cfm again. Now you can see that maze/basement/page3.cfm can be at the seventh level: a place where it takes six clicks from the home page to reach it.

: 1 maze/index.htm 
:: 2 maze/page2.cfm 
::: 3 maze/attic/page1.cfm 
:::: 4 maze/attic/page2.cfm 
::::: 5 maze/page2.cfm [in stack]
::::: 5 maze/page3.cfm 
:::::: 6 maze/attic/page2.cfm [in stack]
:::::: 6 maze/basement/page2.cfm 
::::::: 7 maze/basement/page2.cfm [in stack]
::::::: 7 maze/basement/page3.cfm [no kids]
:::: 4 maze/index.htm [in stack]
::: 3 maze/attic/page2.cfm 
:::: 4 maze/page2.cfm [in stack]
:::: 4 maze/page3.cfm 
::::: 5 maze/attic/page2.cfm [in stack]
::::: 5 maze/basement/page2.cfm 
:::::: 6 maze/basement/page2.cfm [in stack]
:::::: 6 maze/basement/page3.cfm [no kids]
:: 2 maze/page2.cfm 
::: 3 maze/attic/page1.cfm 
:::: 4 maze/attic/page2.cfm 
::::: 5 maze/page2.cfm [in stack]
::::: 5 maze/page3.cfm 
:::::: 6 maze/attic/page2.cfm [in stack]
:::::: 6 maze/basement/page2.cfm 
::::::: 7 maze/basement/page2.cfm [in stack]
::::::: 7 maze/basement/page3.cfm [no kids]
:::: 4 maze/index.htm [in stack]
::: 3 maze/attic/page2.cfm 
:::: 4 maze/page2.cfm [in stack]
:::: 4 maze/page3.cfm 
::::: 5 maze/attic/page2.cfm [in stack]
::::: 5 maze/basement/page2.cfm 
:::::: 6 maze/basement/page2.cfm [in stack]
:::::: 6 maze/basement/page3.cfm [no kids]
:: 2 maze/page3.cfm 
::: 3 maze/attic/page2.cfm 
:::: 4 maze/page2.cfm 
::::: 5 maze/attic/page1.cfm 
:::::: 6 maze/attic/page2.cfm [in stack]
:::::: 6 maze/index.htm [in stack]
::::: 5 maze/attic/page2.cfm [in stack]
:::: 4 maze/page3.cfm [in stack]
::: 3 maze/basement/page2.cfm 
:::: 4 maze/basement/page2.cfm [in stack]
:::: 4 maze/basement/page3.cfm [no kids]

Now suppose you had a description of each page and could build a link to descriptions (or to the source code itself) from this outline. Extend the concept and share your lessons learned. =Marty= [The on-line demonstration unblocks the "blocked page" to start with; so, only the second run will be displayed. It refers to treewalk.cfm by name rather than by pronoun for the second run in order to display the name in a link for the on-line demonstration. The reason the header introducing the list of blocked files has nothing following it is that the list is empty.]