Visualizing Data Structures With Dot. Part 2: Rendering C Data Structures

Summary

this is a continuation of the previous post: Visualizing Data Structures With Dot. Part 1: Some Advanced Features of the Dot Language.

In this post, we write C code to output a graphviz file to render a quadtree data structure.

The Data Strucure Definition and a Caveat

The quadtree data structure from the last post was defined with two C structs.

// Each terminal node in a quadtree
// contains a list of data items.
// In this case, the data are simple strings.
typedef struct _QTREE_DATA_ITEM QTREE_DATA_ITEM;
struct _QTREE_DATA_ITEM {
  double x, y;
  char *data;
  QTREE_DATA_ITEM *next;
};

typedef struct _QTREE_NODE QTREE_NODE;
struct _QTREE_NODE {
  // Rectangle that this node covers.
  int upper_left_x, upper_left_y,
         lower_right_x, lower_right_y;
  QTREE_NODE *ne_child, *se_child,
             *sw_child, *nw_child;
  QTREE_DATA_ITEM *data_list;
};

Since C has no built-in facilities for determining members of an unknown structure at run-time we will have to write a function which is specialized for QTREE_NODEs.

GCCXML can generate XML files describing the layout, members, and data types of C structures but arrays in C are problematic since they can declared explicitly or as pointers

int a[10];
int *b = malloc(sizeof(int)*10);

Without extra information, we have no way of knowing if a pointer member is an array or a link to another data object.

Ideally, we would like to write a single function which can accept a pointer to any C struct and write dot code to render its members and any other data objects which it refers to.

The Code

Our program will create a text file of dot source code which can then be run through dot program to produce a graphical version of any instance of a quadtree as defined above.

graphviz does have a C api but it is somewhat more cumbersome to use than the simple and direct method of writing text files and then passing them through dot.

Starting Point

The process is started with the function below

void print_to_dot(char *output_file_name, QTREE_NODE *qt)
{
  FILE *fout;
  if (!(fout = fopen(output_file_name, "w"))) {
    fprintf(stderr, "Failed to open: %s\n", output_file_name);
    exit(1);
  }
  fprintf(fout, "digraph quadtree {\n");
  print_to_dot_aux(fout, qt);
  fprintf(fout, "}\n");
  fclose(fout);
}

qt is a pointer to the node to output dot code for.

print_to_dot() writes the first line of dot code

diagraph quadtree {

then calls print_to_dot_aux() (auxiliary) to do the real work. Finally, the closing } is written and the output file is closed.

Printing a node and its children.

print_to_dot_aux() is declared as

void print_to_dot_aux(FILE *fout, QTREE_NODE *qt)

First, we must decide how a parent can refer to its children in the dot file. Since a pointer uniquely identifies a data object in C, we can label nodes with: N0xp where p is the hexadecimal value of the pointer to a node. The statement

printf("N%p", some_pointer);

can accomplish this.

All nodes will have a shape of record. The statement below will be executed for both leaf and nonleaf nodes

fprintf(fout, "  N%p [shape=record, label=\"", (void *) qt);

Leaf and nonleaf nodes also "cover" a rectangle so that will be printed in both cases as well

fprintf(fout, "{Rectangle:%d,%d,%d,%d|", qt->upper_left_x, qt->upper_left_y,
        qt->lower_right_x, qt->lower_right_y);

As an example, if the node that we are currently visiting is at address 0x10 (say) and its recangle has upper-left and lower-right corners (-128, 128) -- (-128,128) then what is printed is

  N0x10 [shape=record, label="{Rectangle:-128,128,-128,128|

Printing dot Code for Interior Nodes

If qt is a nonleaf node, then at least one its child pointers: ne_child, se_child, sw_child, and nw_child will be non-NULL. The if statement below will output the dot code for interior nodes:

  if (qt->ne_child || qt->se_child || qt->sw_child || qt->nw_child) {
    fprintf(fout, "{<ne>ne|<se>se|<sw>sw|<nw>nw}}\"];\n");
    fprintf(fout, "  N%p:ne -> N%p;\n", (void *) qt, (void *) qt->ne_child);
    fprintf(fout, "  N%p:se -> N%p;\n", (void *) qt, (void *) qt->se_child);
    fprintf(fout, "  N%p:sw -> N%p;\n", (void *) qt, (void *) qt->sw_child);
    fprintf(fout, "  N%p:nw -> N%p;\n", (void *) qt, (void *) qt->nw_child);
    print_to_dot_aux(fout, qt->ne_child);
    print_to_dot_aux(fout, qt->se_child);
    print_to_dot_aux(fout, qt->sw_child);
    print_to_dot_aux(fout, qt->nw_child);
  }

The last four lines of the if statement above will print the children of qt to the output file.

If the Northeast, Southeast, Southwest, and Northwest children of qt have addresses 0x111, 0x222, 0x333, and 0x444 respectively, then the the total output would be

N0x10 [shape=record, label="{Rectangle:-128,128,-128,128|{<ne>ne|<se>se|<sw>sw|<nw>nw}}"];
N0x10:ne -> N0x111;
N0x10:se -> N0x222;
N0x10:sw -> N0x333;
N0x10:nw -> N0x444;

Printing dot Code for Leaf Nodes

The data_list member of QTREE_NODE points to the first element of a linked-list of data item stored in leaf nodes. The C code to handle leaf nodes is

if (!qt->data_list) {
  fprintf(fout, "{No Data}}\"];\n");
} else {
  QTREE_DATA_ITEM *lp;
  fprintf(fout, "{");
  for (lp = qt->data_list; lp; lp = lp->next) {
    fprintf(fout, "\\\"%s\\\" (%d, %d)%s", lp->data, lp->x, lp->y,
            lp->next ? "|" : "");
  }
  fprintf(fout, "}}\"];\n");
}

If qt is a leaf node containing the data

data x y
"X3" -64 -64
"X1" -64 -64
"X0" -64 -64
"X2" -64 -64

Then what is printed is

N0x10 [shape=record, label="{Rectangle:-128,-1,-1,-128|
       {\"X3\" (-64, -64)|
        \"X1\" (-64, -64)|
        \"X0\" (-64, -64)|
        \"X2\" (-64, -64)}}"];

The long line was broken for clarity.

Complete Function

The full code for print_to_dot_aux() is

void print_to_dot_aux(FILE *fout, QTREE_NODE *qt)
{
  fprintf(fout, "  N%p [shape=record, label=\"", (void *) qt);
  fprintf(fout, "{Rectangle:%d,%d,%d,%d|", qt->upper_left_x, qt->upper_left_y,
          qt->lower_right_x, qt->lower_right_y);
  if (qt->ne_child || qt->se_child || qt->sw_child || qt->nw_child) {
    fprintf(fout, "{<ne>ne|<se>se|<sw>sw|<nw>nw}}\"];\n");
    fprintf(fout, "  N%p:ne -> N%p;\n", (void *) qt, (void *) qt->ne_child);
    fprintf(fout, "  N%p:se -> N%p;\n", (void *) qt, (void *) qt->se_child);
    fprintf(fout, "  N%p:sw -> N%p;\n", (void *) qt, (void *) qt->sw_child);
    fprintf(fout, "  N%p:nw -> N%p;\n", (void *) qt, (void *) qt->nw_child);
    print_to_dot_aux(fout, qt->ne_child);
    print_to_dot_aux(fout, qt->se_child);
    print_to_dot_aux(fout, qt->sw_child);
    print_to_dot_aux(fout, qt->nw_child);
  } else {
    if (!qt->data_list) {
      fprintf(fout, "{No Data}}\"];\n");
    } else {
      QTREE_DATA_ITEM *lp;
      fprintf(fout, "{");
      for (lp = qt->data_list; lp; lp = lp->next) {
        fprintf(fout, "\\\"%s\\\" (%d, %d)%s", lp->data, lp->x, lp->y,
                lp->next ? "|" : "");
      }
      fprintf(fout, "}}\"];\n");
    }
  }
}

The complete program to create and search quatree structures can be found in My github repo for this blog

Example output

If we have a quadtree given by

Addr Rectangle NE Child SE Child SW Child NW Child Data (x,y)
0x010 -128, 128, 128, -128 0x0d0 0x110 0x150 0x190
0x0d0 0, 128, 128, 0
0x110 0, -1, 128, -128 0x2d0 0x310 0x350 0x390
0x2d0 64, -1, 128, -64
0x310 64, -65, 128, -128 "C" ( 96, -96)
0x350 0, -65, 63, -128
0x390 0, -1, 63, -64 "B" ( 32, -32)
0x150 -128, -1, -1, -128 "X3" (-64, -64), "X1" (-64, -64), "X0" (-64, -64), "X2" (-64, -64)
0x190 -128, 128, -1, 0 "A" (-64, 64)

The generated dot code is

digraph quadtree {
  N0x010 [shape=record,
          label="{Rectangle:-128,128,128,-128|{<ne>ne|<se>se|<sw>sw|<nw>nw}}"];
  N0x010:ne -> N0x0d0;
  N0x010:se -> N0x110;
  N0x010:sw -> N0x150;
  N0x010:nw -> N0x190;
  N0x0d0 [shape=record, label="{Rectangle:0,128,128,0|{No Data}}"];
  N0x110 [shape=record,
          label="{Rectangle:0,-1,128,-128|{<ne>ne|<se>se|<sw>sw|<nw>nw}}"];
  N0x110:ne -> N0x2d0;
  N0x110:se -> N0x310;
  N0x110:sw -> N0x350;
  N0x110:nw -> N0x390;
  N0x2d0 [shape=record, label="{Rectangle:64,-1,128,-64|{No Data}}"];
  N0x310 [shape=record, label="{Rectangle:64,-65,128,-128|{\"C\" (96, -96)}}"];
  N0x350 [shape=record, label="{Rectangle:0,-65,63,-128|{No Data}}"];
  N0x390 [shape=record, label="{Rectangle:0,-1,63,-64|{\"B\" (32, -32)}}"];
  N0x150 [shape=record, label="{Rectangle:-128,-1,-1,-128|
                                {\"X3\" (-64, -64)|
                                 \"X1\" (-64, -64)|
                                 \"X0\" (-64, -64)|
                                 \"X2\" (-64, -64)}}"];
  N0x190 [shape=record, label="{Rectangle:-128,128,-1,0|{\"A\" (-64, 64)}}"];
}

and the rendered version is dot-c-test.png

Next

In the final post, we will explore some techniques for generating dot files from Java data structures. Java has a Class and Field objects which permit us to write generic code to render most data structures.

Visualizing Data Structures With Dot. Part 1: Some Advanced Features of the Dot Lanugage

Summary

This is a continuation of the previous post: Visualizing Data Structures With Dot. Part 0: The Dot Language.

In this post, we will explore some features of the dot language that make it particularly suited to drawing data structures. Our goal is to write a program to generate dot code to draw quadtree data structures

Problem : Several Fields in an Object or Structure With Outgoing Links.

In many cases, a drawing of the outgoing links of a data structure might not be so obvious as they were with the binary tree of the last post. For example, suppose that we have have a quadtree where nodes are defined (in C) as

// Each terminal node in a quadtree
// contains a list of data items.
// In this case, the data are simple strings.
typedef struct _QTREE_DATA_ITEM QTREE_DATA_ITEM;
struct _QTREE_DATA_ITEM {
  double x, y;
  char *data;
  QTREE_DATA_ITEM *next;
};

typedef struct _QTREE_NODE QTREE_NODE;
struct _QTREE_NODE {
  // Rectangle that this node covers.
  double upper_left_x, upper_left_y,
         lower_right_x, lower_right_y;
  QTREE_NODE *ne_child, *se_child,
             *sw_child, *nw_child;
  QTREE_DATA_ITEM *data_list;
};

And we have the data on a 256 ⨉ 256 grid which is centered at (0,0).

advanced-dot-qtree-grid-shadow.png

Addr Rectangle Node Type ne se sw nw data
0x01 -128,128,128,-128 nonleaf 0x02 0x03 0x04 0x05
0x02 0,128,128,0 leaf no data
0x03 0,0,128,-128 nonleaf 0x06 0x07 0x08 0x09
0x04 -128,0,0,-128 leaf w/data "D"
0x05 -128,128,0,0 leaf w/data "A"
0x06 64,0,128,-64 leaf no data
0x07 64,-64,128,-128 leaf w/data "C"
0x08 0,-64,128,-128 leaf no data
0x09 0,0,64,-64 leaf w/data "B"

How can dot be used to display the internal data structure in a useful way?

Solution : Records and Ports

We would like our nodes to be rendered by dot in a fashion similar to what one would find in a data structures textbook. For example

advanced-dot-node-shadow.png

There are three elements that we need to cover:

  • Labels
  • Records
  • Ports

Labels

Labels are simply text which is displayed inside of a node. For example

digraph smalltree {
  ordering=out;
  0 [label="parent"];
  1 [label="child one"];
  2 [label="child two"];
  0 -> 1;
  0 -> 2;
}

If no label is supplied, then dot will label a node with the node name used to build the graph. In the tree above, the node names are: 0, 1, and 2.

advanced-dot-small-tree-shadow.png

Records

Records are a special node shape that allow labels to be separated by vertical and horizontal lines. When records are used, the labels typically represent fields in some structure. Here is an example

digraph record {
  node0 [shape=record,label="field0|field1|field2"];
  node0 -> node1;
}

advanced-dot-record0-shadow.png

Fields can be stacked vertically by surrounding them with curly brackets.

digraph record {
  node0 [shape=record,label="{field0|field1|field2}"];
  node0 -> node1;
}

advanced-dot-record1-shadow.png

If you wish for some fields to be stacked vertically and others to appear next to each other horizontally, use nested curly brackets.

digraph record {
  node0 [shape=record,label="{field0|{field1|field2}}"];
  node0 -> node1;
}

advanced-dot-record2-shadow.png

The curly brackets should be taken to mean: "toggle between vertical and horizontal placement".

digraph record {
  node0 [shape=record,label="{field0|{field1|field2|{field4|field5}}}"];
  node0 -> node1;
}

advanced-dot-record3-shadow.png

Ports

If a particular field contains an outgoing link, then we would like dot to draw the link at the same location as the field. The code below demonstrates the problem that ports solve

digraph port {
  ordering=out;
  root [shape=record,label="link0|link1|link2|link3|link4"];
  root -> child0;
  root -> child1;
  root -> child2;
  root -> child3;
  root -> child4;
}

advanced-dot-port0-shadow.png

In this drawing, it is not obvious which fields contain which links.

One solution is to label the outgoing links themselves

digraph port {
  ordering=out;
  root -> child0 [label="link0"];
  root -> child1 [label="link1"];
  root -> child2 [label="link2"];
  root -> child3 [label="link3"];
  root -> child4 [label="link4"];
}

advanced-dot-port1-shadow.png

Ports are labeled fields which can be used to specify an origin point for a link. Ports are defined inside of a node label and are written in angle brackets. Ports can be referenced in links with the syntax: sourceNode:port -> targetNode

digraph port {
  ordering=out;
  root [shape=record,label="<L0>link0|<L1>link1|<L2>link2|<L3>link3|<L4>link4"];
  root:L0 -> child0;
  root:L1 -> child1;
  root:L2 -> child2;
  root:L3 -> child3;
  root:L4 -> child4;
}

advanced-dot-port2-shadow.png

Example

Armed with records and ports, we can now tackle the quadtree from the beginning of this post. Although this code is monotonous our ultimate goal is to have it generated by a program.

digraph quadtree {
  N0x01 [shape=record,
         label="{Rectangle:-128,128,128,-128|{<ne>ne|<se>se|<sw>sw|<nw>ne}}"];
  N0x01:ne -> N0x02;
  N0x01:se -> N0x03;
  N0x01:sw -> N0x04;
  N0x01:nw -> N0x05;
  N0x02 [shape=record,
         label="{Rectangle:0,128,128,0|No Data}"];
  N0x03 [shape=record,
         label="{Rectangle:0,0,128,-128|{<ne>ne|<se>se|<sw>sw|<nw>ne}}"];
  N0x03:ne -> N0x06;
  N0x03:se -> N0x07;
  N0x03:sw -> N0x08;
  N0x03:nw -> N0x09;
  N0x04 [shape=record,
         label="{Rectangle:-128,0,0,-128|{Data:\"D\"|Data (x,y):-64,-64}}"];
  N0x05 [shape=record,
         label="{Rectangle:-128,128,0,0|{Data:\"A\"|Data (x,y):-64,64}}"];
  N0x06 [shape=record,
         label="{Rectangle:64,0,128,-64|No Data}"];
  N0x07 [shape=record,
         label="{Rectangle:64,-64,128,-128|{Data:\"C\"|Data (x,y):96,-96}}"];
  N0x08 [shape=record,
         label="{Rectangle:0,-64,128,-128|No Data}"];
  N0x09 [shape=record,
         label="{Rectangle:0,0,64,-64|{Data:\"B\"|Data (x,y):32,-32}}"];
}

advanced-dot-quadtree2-shadow.png

Next

In the next post, we will explore some techniques for generating dot files in C.

Visualizing Data Structures With Dot. Part 0: The Dot Language

The Problem

Visualizing data structures can be a great aid in debugging. Unfortunately most development environments don't have the capability to display linked data in graphical form. One notable exception is the GNU Data Display Debugger.

A Solution

In this four-part series, we will explore a few simple techniques for drawing data structures using the dot language.

  • Part 0: The dot Language.
  • Part 1: Some Advanced Features of the dot Language.
  • Part 2: Visualizing Data Structures in C.
  • Part 3: Visualizing Data Structures in Java.

dot is part of the popular graphviz suite of tools.

Basics of the dot Language

dot is used to draw graphs which have a (mostly) hierarchical structure. The language is easily understood by example. A binary tree might be specified as

digraph bintree {
  10 -> 5;
  10 -> 15;
  5  -> 2;
  5  -> 8;
  2  -> 1;
  2  -> 3;
  8  -> 6;
  8  -> 7;
  15 -> 13;
  15 -> 20;
}

Edges are given with: startNode -> endNode.

Use the dot command to render this to an SVG file

dot -Tsvg input.dot -ooutput.svg

The result can be viewed in your browser

the-dot-language-example0.png

dot can also handle general graph structures. Here is a non-hierarchical graph

digraph cycle {
  1 -> 2 -> 3 -> 4 -> 1;
}

the-dot-language-cycle.png

Care must be used if the ordering of children is important. Had the two children of node 8 been reversed in the source file as in

digraph bintree {
  10 -> 5;
  10 -> 15;
  5  -> 2;
  5  -> 8;
  2  -> 1;
  2  -> 3;
  8  -> 7;
  8  -> 6;
  15 -> 13;
  15 -> 20;
}

The result would not be a correctly rendered binary tree

the-dot-language-example1.png

To remedy this, set the ordering to match the order listed in the source file

digraph bintree {
  ordering=out;
  /* ... */
}

and list the children of each node from lowest to highest.

Node Shapes

The default shapes of nodes in dot are ellipses. Differently-shaped nodes can be drawn by writing the node name followed by the shape as indicated by the red lines

digraph bintree {
  ordering=out;
  1  [shape=rectangle];
  3  [shape=rectangle];
  6  [shape=rectangle];
  7  [shape=rectangle];
  13 [shape=rectangle];
  20 [shape=rectangle];
  10 -> 5;
  10 -> 15;
  5  -> 2;
  5  -> 8;
  2  -> 1;
  2  -> 3;
  8  -> 6;
  8  -> 7;
  15 -> 13;
  15 -> 20;
}

The tree now looks like

the-dot-language-example2.png

The location of the [shape = rectangle] lines in the source file is unimportant.

Colors

Nodes can be filled or outlined with a particular color. Let's outline the interior nodes in our tree brown and fill in the leaves light green. To do this, set the color attributes of each node. For outline colors write: color = brown. For color fills write: color = palegreen,style = filled. Let's color the edges in the tree brown as well. Colored edges should be followed with [color = brown]

digraph bintree {
  ordering=out;
  1        [shape=rectangle, color=palegreen, style=filled];
  3        [shape=rectangle, color=palegreen, style=filled];
  6        [shape=rectangle, color=palegreen, style=filled];
  7        [shape=rectangle, color=palegreen, style=filled];
  13       [shape=rectangle, color=palegreen, style=filled];
  20       [shape=rectangle, color=palegreen, style=filled];
  10       [color=brown];
  5        [color=brown];
  2        [color=brown];
  8        [color=brown];
  15       [color=brown];
  10 -> 5  [color=brown];
  10 -> 15 [color=brown];
  5  -> 2  [color=brown];
  5  -> 8  [color=brown];
  2  -> 1  [color=brown];
  2  -> 3  [color=brown];
  8  -> 6  [color=brown];
  8  -> 7  [color=brown];
  15 -> 13 [color=brown];
  15 -> 20 [color=brown];
}

the-dot-language-example3.png

Next

In the next part of this series, we will explore some more advanced features of dot:

  • Record structures with labeled fields.
  • Subgraphs.

GitHub – toddkfisher

Todd Fisher

San Mateo, CA

Civil engineer. Ex programmer.