GA-CCRi Analytical Development Services

Destructuring in Mathematica

A technique that I have found particularly useful in Lisp-like languages such as Mathematica and Clojure is destructuring. Destructuring is a mechanism for extracting parts of an expression, and the Lisp "code as data" paradigm lends itself to it.

I recently used destructuring to programmatically modify some graphics I was developing to visualize recursive partitioning techniques. The graphics object that represents a recursive partitioning of a dataset is a dendrogram. Mathematica provides a Dendrogram function that visualizes nested clusters, but it does not provide a way to label the branches of the dendrogram. The original dendrogram looks like the following.


In Mathematica, this graphic can also be manipulated in its underlying form, which is an ordinary expression. Written out in full, with every head shown explicitly, it looks like this:

Graphics[List[List[RGBColor[0, 1, 0], List[]], 
     Line[List[List[1, 0], List[1, 2], List[2, 2], List[2, 0]]], 
     Line[List[List[3, 0], List[3, 2], List[4, 2], List[4, 0]]], 
     Line[List[List[1.5`, 2], List[1.5`, 4], List[3.5`, 4], 
       List[3.5`, 2]]], 
  List[Text[Text[Style["B", RGBColor[0, 0, 1]]], 
    Offset[List[0, -4], List[1, 0]], List[0, 1]], 
   Text[Text[Style["A", RGBColor[1, 0, 0]]], 
    Offset[List[0, -4], List[2, 0]], List[0, 1]], 
   Text[Text[Style["A", RGBColor[1, 0, 0]]], 
    Offset[List[0, -4], List[3, 0]], List[0, 1]], 
   Text[Text[Style["B", RGBColor[0, 0, 1]]], 
    Offset[List[0, -4], List[4, 0]], List[0, 1]]]], 
 List[Rule[PlotRange, All], 
  Rule[AspectRatio, Power[GoldenRatio, -1]]]]
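
To make the "graphic as expression" idea concrete, here is a minimal sketch on a small hand-built graphic. The name g and its single Line are illustrative stand-ins, not the dendrogram from this post; FullForm, Head, and Part are standard built-in functions.

```mathematica
(* g is a hypothetical stand-in for the dendrogram: one branch, no labels *)
g = Graphics[{Line[{{1, 0}, {1, 2}, {2, 2}, {2, 0}}]}, PlotRange -> All];

(* FullForm shows the nested heads that are normally rendered as a picture *)
FullForm[g]

(* Head and Part treat the graphic like any other expression *)
Head[g]      (* Graphics *)
g[[1, 1]]    (* the Line primitive *)

(* A pattern with named parts destructures the Line into its middle vertices *)
g[[1, 1]] /. Line[{p1_, p2_, p3_, p4_}] :> {p2, p3}
(* {{1, 2}, {2, 2}} *)
```

Because the graphic is just nested heads and lists, the same pattern language that works on any list also works on the picture.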

In the dendrogram above, each path represents a region in space that would be classified with the label at the leaf node. Each branch node represents a rule that decides which branch of the tree should be followed to classify new data points. (See the demonstration at the end of the post for details.) In order to get the appropriate text onto the branch nodes, I had to destructure the graphics object and construct a parallel graphics object containing the textual elements of the tree. Mathematica has a Position function that takes any Mathematica expression and locates the parts of it that match a pattern, using Mathematica's rich pattern-matching language. Getting the coordinates of the branch nodes is done as follows:

   Extract[FullForm[d], Position[FullForm[d], Line[__]]] /. {Line[{_, h_, i_, _}] -> {h, i}}

{{{1, 2}, {2, 2}}, {{3, 2}, {4, 2}}, {{1.5, 4}, {3.5, 4}}}

In the expression above, the call to Position uses the pattern Line[__] to match Line objects at any level of nesting in the graphics object. The matching Line expressions are then extracted from the full form of the dendrogram graphic, and the replacement rule pulls the middle two vertices out of each one. These two vertices are the endpoints of the horizontal segment at each branch node. We can use their coordinates to place the text objects appropriately. The following tree is the final result.
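
For readers who want to try the labeling step, here is a minimal sketch of how it might be done. It is not the code behind the figure in this post: the names tree, bars, rules, and labels, the rule strings, and the 8-point offset are all assumptions made for illustration. It also uses Cases, which combines the search and the destructuring in a single call, as an alternative to the Position-based expression above.

```mathematica
(* Hypothetical stand-in for the dendrogram: three branch lines, no leaf labels *)
tree = Graphics[{
    Line[{{1, 0}, {1, 2}, {2, 2}, {2, 0}}],
    Line[{{3, 0}, {3, 2}, {4, 2}, {4, 0}}],
    Line[{{1.5, 2}, {1.5, 4}, {3.5, 4}, {3.5, 2}}]},
   PlotRange -> All];

(* Cases searches every level (Infinity) and destructures each Line as it goes *)
bars = Cases[tree, Line[{_, h_, i_, _}] :> {h, i}, Infinity];
(* {{{1, 2}, {2, 2}}, {{3, 2}, {4, 2}}, {{1.5, 4}, {3.5, 4}}} *)

(* Placeholder rule text, one string per branch node *)
rules = {"x < 1", "x > 3", "y < 2"};

(* Center each label on the midpoint of its horizontal bar, shifted 8 points up *)
labels = MapThread[
   Text[#2, Offset[{0, 8}, Mean[#1]]] &,
   {bars, rules}];

(* Overlay the labels on the original tree *)
Show[tree, Graphics[labels]]
```

Building the labels as a separate Graphics object and combining the two with Show leaves the original dendrogram expression untouched, which matches the "parallel graphics object" approach described earlier.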

