But we usually use layouts for more complex visualizations—drawing a force-directed graph or a tree, for instance. In these cases, layouts help us to separate calculating coordinates from putting pixels on a page. This not only makes our code cleaner, but it also lets us reuse the same layouts for vastly different visualizations.
Theory is boring, so let's dig in.
By default, d3 comes with 12 built-in layouts that cover most common visualizations. They can be split roughly into normal and hierarchical layouts. The normal layouts are as follows:
histogram
pie
stack
chord
force
The hierarchical layouts are as follows:
partition
tree
cluster
pack
treemap
To see how they behave, we're going to make an example for each type. We'll start with the humble pie chart and histogram, then progress to force-directed graphs and fancy trees. We're using the same dataset for all examples, so that we can get a feel for how different presentations affect the perception of data.
These are the last examples in this book, so we're going to make them particularly magnificent. That's going to create a lot of code, so every time we come up with something reusable, we'll put it in a helpers.js file as a function.
Let's create an empty helpers.js file:
We're going to add functions as members of this global object. Add the following line to the HTML right before including the normal code.
Let's also agree that all examples start with a drawing area and fetching the data.
Example code will go in the d3.json load listener.
The dataset we'll be playing with has been scraped from my favorite IRC channel's log going back to late 2011. The channel's special feature is the karma bot.
When someone does something we like, we give them karma with nick++ and the bot counts it as a vote for that person. Just like on Reddit, karma is supposed to measure how much the community likes someone, but it's really just about who is most active.
The karma is what we're interested in.
You can get the dataset at https://raw.github.com/Swizec/d3.js-book-examples/master/ch5/data/karma_matrix.json. The dataset consists of objects representing instances of giving karma. Each looks like the following code:
Every object tells us at what time (time) somebody (from) gave karma to (to) somebody else. To deal with the cruft often tacked onto nicknames—for instance, smotko is smotko-nexus from his phone—only the first four letters of the nickname were considered when scraping the dataset.
This creates a clean dataset for us to work with. You can think of it as a list of edges in a graph, where users are nodes and to and from create a directed edge.
Time to draw!
Using the histogram layout
We are going to use the histogram layout to create a bar chart of the karma people have received. The layout itself will handle everything from collecting values into bins, to calculating heights, widths, and the positions of the bars.
Histograms usually represent a probability distribution over a continuous numerical domain, but nicknames are ordinal. To bend the histogram layout to our will, we have to turn nicknames into numbers—we'll use a scale.
Since it feels like this could be useful in other examples, we'll put the code in helpers.js:
These are two simple functions. uniques goes through the data and returns a list of unique nicknames. We help it with the nick accessor. nick_id creates an ordinal scale we'll be using to convert nicknames into numbers.
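To make the idea concrete, here is a minimal d3-free sketch of what two such helpers could look like. The names uniqueNicks and nickIndex, and the made-up records in sampleKarma, are assumptions for illustration; they are not the book's helpers.js code, and the real nick_id is an ordinal scale rather than a plain lookup.

```javascript
// Sketch only: plain-JavaScript stand-ins, not the book's helpers.js code.

// Collect each nickname once, in order of first appearance.
function uniqueNicks(data, nick) {
  var result = [];
  data.forEach(function (d) {
    var name = nick(d);
    if (result.indexOf(name) === -1) {
      result.push(name);
    }
  });
  return result;
}

// Return a function that maps a datum to the index of its nickname,
// which is roughly the job an ordinal scale with a numeric range does.
function nickIndex(data, nick) {
  var names = uniqueNicks(data, nick);
  return function (d) { return names.indexOf(nick(d)); };
}

// Made-up records in the dataset's {time, from, to} shape.
var sampleKarma = [
  {time: '2012-01-25 15:32:15', from: 'alic', to: 'bobb'},
  {time: '2012-01-26 09:10:00', from: 'caro', to: 'bobb'},
  {time: '2012-01-27 11:45:30', from: 'bobb', to: 'alic'}
];

var toId = nickIndex(sampleKarma, function (d) { return d.to; });
console.log(toId(sampleKarma[2]));
```

The accessor argument is what keeps the helpers reusable: pass a function that reads .from instead of .to and the same code numbers the givers.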
Now we can tell the histogram how to handle our data with nick_id.
Using d3.layout.histogram() we create a new histogram and use .bins() to define the thresholds that separate the bins. An array of n + 1 thresholds defines n bins: given [1,2,3], values below 2 go in the first bin and values of 2 or more go in the second.
The .value() accessor tells the histogram how to find values in our dataset.
Another way to specify bins is by specifying the number of bins you want and letting the histogram uniformly divide a continuous numerical input domain into bins. For such domains, you can even make probability histograms by setting .frequency() to false. You can limit the range of considered bins with .range().
Finally, we used the layout as a function on our data to get an array of objects like this:
Bin width is in the dx property, x is the horizontal position and y is the height. We access elements in bins with normal array functions.
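If the shape of that output feels abstract, the following sketch builds something similar by hand. It is a simplified stand-in, not d3's implementation: binValues is a hypothetical name, and values outside the outermost thresholds are simply dropped here, which may differ from what the real layout does.

```javascript
// Sketch only: a simplified stand-in for a histogram layout.
// thresholds holds n + 1 ascending boundaries that define n bins.
function binValues(values, thresholds) {
  var bins = [];
  for (var i = 0; i < thresholds.length - 1; i++) {
    var bin = [];                                // each bin is an array of its members
    bin.x = thresholds[i];                       // left edge of the bin
    bin.dx = thresholds[i + 1] - thresholds[i];  // bin width
    bin.y = 0;                                   // number of members
    bins.push(bin);
  }
  values.forEach(function (v) {
    for (var j = 0; j < bins.length; j++) {
      var upper = bins[j].x + bins[j].dx;
      var isLast = j === bins.length - 1;
      // Bins are closed on the left; only the last bin includes its upper edge.
      if (v >= bins[j].x && (v < upper || (isLast && v === upper))) {
        bins[j].push(v);
        bins[j].y += 1;
        break;
      }
    }
  });
  return bins;
}

var exampleBins = binValues([0, 0, 1, 2, 2, 2, 3], [0, 1, 2, 3]);
console.log(exampleBins.map(function (b) { return b.y; }));
```

Because every bin is a real array with x, dx, and y tacked on, we can still call .length, .map(), or index into it to reach the original data.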
Using this data to draw a bar chart should be easy by now. We'll define a scale for each dimension, label both axes, and place some rectangles for bars.
To make things easier, we begin with some margins. Remember, all this code goes in the data load listener we defined earlier:
And two scales.
Using a log scale for the vertical axis will make the graph easier to read despite the huge karma variations.
Next, put a vertical axis on the left:
We create a grouping element for every bar and its label:
Moving the group into position, as shown in the following code, means less work when positioning the bar and its label:
Because the group is in place, we can put the bar a pixel from the group's edge. All bars will be histogram[0].dx wide and we'll calculate heights using the y position of each datum and the total graph height. Lastly, we create the labels:
We move labels to the bottom of the graph, rotate them by 60 degrees to avoid overlap, and set their text to the .to property of the datum.
Add some CSS styling to the HTML:
Our bar chart looks like this:
Well, the whole graph wouldn't fit in the book. Run the example.
The previous bar chart reveals that HairyFotr has the most karma by far. Let's find out who's making him so popular.
We are going to use the pie chart layout to cut the karma of HairyFotr into slices, showing how much karma he gets from the others. After filtering the dataset for karma going to HairyFotr, we have to categorize entries by givers, and finally feed them into the pie chart layout to generate a pie chart.
We can use the histogram layout to put data into bins depending on the .from property. Let's add a function to helpers.js:
Similar to the uniques and nick_id functions, bin_per_nick takes the data and a nick accessor, and returns histogram data.
We can now do this in the pie chart's data listener:
Entries in the per_nick variable will tell us exactly how much karma HairyFotr got from someone.
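The filter-then-group step can be sketched without d3 as follows. This is an illustration under assumptions: countPerGiver is a hypothetical name, the records are made up, and the book's version uses the histogram layout for the grouping instead of a plain object.

```javascript
// Sketch only: count how much karma one recipient got from each giver.
function countPerGiver(data, recipient) {
  var counts = {};
  data
    .filter(function (d) { return d.to === recipient; })
    .forEach(function (d) {
      counts[d.from] = (counts[d.from] || 0) + 1;
    });
  // Turn the lookup object into an array that a layout could consume.
  return Object.keys(counts).map(function (nick) {
    return {from: nick, count: counts[nick]};
  });
}

// Made-up records in the dataset's shape (time omitted for brevity).
var karmaSample = [
  {from: 'alic', to: 'bobb'},
  {from: 'caro', to: 'bobb'},
  {from: 'alic', to: 'bobb'},
  {from: 'bobb', to: 'alic'}
];

console.log(countPerGiver(karmaSample, 'bobb'));
```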
To bake a pie, we call the pie layout and give it a value accessor:
The pie layout is now full of slice objects, each holding the startAngle and endAngle values and the original value.
Entries look like this:
We could have specified a .sort() function to change how slices are organized and a .startAngle() or .endAngle() function to limit the pie's size.
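To see where startAngle and endAngle come from, here is a rough sketch of the arithmetic behind a pie layout. sliceAngles is a hypothetical helper; it keeps the input order and always spans the full circle, so it ignores the sorting and angle-limiting options mentioned above.

```javascript
// Sketch only: give every value a share of the full circle (2 * PI radians).
function sliceAngles(values) {
  var total = values.reduce(function (a, b) { return a + b; }, 0);
  var angle = 0;
  return values.map(function (v) {
    var slice = {
      value: v,
      startAngle: angle,
      endAngle: angle + v / total * 2 * Math.PI
    };
    angle = slice.endAngle; // the next slice starts where this one ends
    return slice;
  });
}

console.log(sliceAngles([1, 1, 2]));
```

Each slice starts where the previous one ended, so the last endAngle lands on a full turn.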
All that's left to do now is to draw the pie chart. We'll need an arc generator, just like the ones in Chapter 2, A Primer on DOM, SVG, and CSS, and some color to tell slices apart.
Finding 24 distinct colors that look great together is hard; lucky for us, @ponywithhiccups jumped to the challenge and made the pick. Thank you!
Let's add these colors to helpers.js:
The color scale is an ordinal scale without a domain. To make sure nicknames always get the same color, a function in helpers.js will help us fixate the domain, as shown in the following code:
Now, we can define the arc generator and fixate the colors:
A group element will hold each arc and its label as shown in the following code:
To make positioning simpler, we move every group to the center of the pie chart. Creating slices works the same as in Chapter 2, A Primer on DOM, SVG, and CSS:
We get the color for a slice with d.data[0].from—the original dataset is in .data and all the .from properties in it are the same. That's what we grouped by.
Labels take a bit more work. They need to be rotated into place and sometimes flipped so that they don't appear upside-down.
Labeling an arc will be handy later as well, so let's make a general function in helpers.js:
We're using partial application to generate a function operating on a d3 selection. This means we can use it with .call(), while still defining our own parameters.
We'll give arc_labels a text accessor and a radius accessor, and it will return a function we can use with .call() on a selection to make labels appear in just the right places. The meaty part appends a text element, tweaks its text-anchor attribute depending on whether we're going to flip it, and rotates the element into position with the help of a tickAngle function.
Let's add the contents of the tickAngle function:
helpers.tickAngle calculates the middle angle between d.startAngle and d.endAngle and transforms the result from radians to degrees so that SVG can understand it.
This is basic trigonometry, so I won't go into details, but your favorite high schooler should be able to explain the math.
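For readers without a high schooler at hand, the calculation might look like the sketch below. The function names are hypothetical, and two details are assumptions on my part: the 90-degree offset (arc angles start at twelve o'clock, while an SVG rotation of zero points along the x axis) and the rule that labels rotated past 90 degrees need flipping.

```javascript
// Sketch only: middle of an arc, converted from radians to degrees.
function midAngleDegrees(d) {
  var mid = (d.startAngle + d.endAngle) / 2; // still in radians
  // Assumption: subtract 90 so that 0 radians (12 o'clock) points up.
  return mid * 180 / Math.PI - 90;
}

// Assumption: text rotated past 90 degrees would read upside-down.
function needsFlip(d) {
  return midAngleDegrees(d) > 90;
}

var rightHalf = {startAngle: 0, endAngle: Math.PI};
var leftHalf = {startAngle: Math.PI, endAngle: 2 * Math.PI};
console.log(midAngleDegrees(rightHalf), needsFlip(leftHalf));
```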
We use arc_labels back in the load listener:
And our delicious pie is done as shown in the following screenshot:
Clearly, the smallest values could do with some grouping under other, but you can play around with that on your own.
Showing popularity through time with stack
D3's official docs say:
"The stack layout takes a two-dimensional array of data and computes a baseline; the baseline is then propagated to the above layers, so as to produce a stacked graph."
Not clear at all, but I am hard pressed to come up with better. The stack layout calculates where one layer ends and another begins. An example should help.
We're going to make a layered timeline of karma, stretching as far back as 2011, with the width of each layer telling us how much karma went to a user at a certain time. This timeline is called a streamgraph.
To label layers, we're going to create a mouseover behavior that highlights a layer and shows a tooltip with the user's nickname. By fiddling until the graph looked pretty, I discovered that we should bin data into 12-day slots.
Let's begin the binning:
To parse timestamps into date objects, we specified a format for strings like 2012-01-25 15:32:15. Then, we used this format to find the earliest and latest time with d3.extent. Telling d3.time.days() to go from start to finish with a step of 12 days creates a list of bins.
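As a rough illustration of the binning idea, the sketch below parses the same timestamp format by hand and produces evenly spaced 12-day boundaries. It is not how d3.time.days picks its boundaries, and both function names are hypothetical; timestamps are treated as UTC to keep the arithmetic simple.

```javascript
// Sketch only: parse strings like "2012-01-25 15:32:15" (treated as UTC here).
function parseTimestamp(s) {
  var m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/.exec(s);
  return new Date(Date.UTC(+m[1], m[2] - 1, +m[3], +m[4], +m[5], +m[6]));
}

// Evenly spaced boundaries from the earliest to the latest timestamp.
function dayBins(timestamps, stepDays) {
  var times = timestamps.map(function (s) { return +parseTimestamp(s); });
  var start = Math.min.apply(null, times);
  var end = Math.max.apply(null, times);
  var step = stepDays * 24 * 60 * 60 * 1000; // days to milliseconds
  var bins = [];
  for (var t = start; t <= end; t += step) {
    bins.push(new Date(t));
  }
  return bins;
}

var slots = dayBins(['2012-01-01 00:00:00', '2012-01-30 00:00:00'], 12);
console.log(slots.map(function (d) { return d.toISOString(); }));
```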
We use the histogram layout to munge our dataset into a more useful form:
You already know what helpers.bin_per_nick does.
To bin data into time slots, we mapped over each per-nick layer and turned it into a two-property object. The .to property tells us whom the layer represents, and .values is a histogram of time slots whose entries tell us how much karma the user got in a certain 12-day period.
Time for a stack layout:
d3.layout.stack() creates a new stack layout. We told it how to order layers with .order('inside-out') (you should also try default and reverse) and decided how the final graph looks with .offset('wiggle'). wiggle minimizes change in slope. Other options include silhouette, zero, and expand. Try them.
Once again, we told the layout how to find values with the .values() accessor.
Our layers array is now filled with objects like this:
values is an array of arrays. Entries in the outer array are time bins that look like this:
The important parts of this array are as follows:
x is the horizontal position, y is the thickness, and y0 is the baseline. d3.layout.stack will always return these.
To start drawing, we need some margins and two scales:
The tricky thing was finding the vertical scale's domain. We found it by going through each value of every layer, looking for the maximum d.y0+d.y value—baseline plus thickness.
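Baseline plus thickness is easier to see in code. The sketch below stacks layers on a flat zero baseline, which corresponds to the simplest offset rather than the wiggle offset used in this example, and then looks for the highest point. stackZero and maxTop are hypothetical names.

```javascript
// Sketch only: stack layers on a flat baseline.
// layers is an array of arrays of {x, y}, all of the same length.
function stackZero(layers) {
  return layers.map(function (layer, i) {
    return layer.map(function (point, j) {
      var below = 0;
      // The baseline is the total thickness of all layers underneath.
      for (var k = 0; k < i; k++) {
        below += layers[k][j].y;
      }
      return {x: point.x, y: point.y, y0: below};
    });
  });
}

// Highest point of the whole graph: the largest baseline plus thickness.
function maxTop(stacked) {
  var max = 0;
  stacked.forEach(function (layer) {
    layer.forEach(function (d) {
      max = Math.max(max, d.y0 + d.y);
    });
  });
  return max;
}

var stackedDemo = stackZero([
  [{x: 0, y: 1}, {x: 1, y: 2}],
  [{x: 0, y: 3}, {x: 1, y: 1}]
]);
console.log(maxTop(stackedDemo));
```

The value returned by maxTop is what the upper end of the vertical scale's domain needs to cover.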
We'll use an area path generator for the layers:
Nothing too fancy: the baselines define the bottom edges, and adding the thickness gives the top edges. Fiddling determined that both should be pushed down by 100 pixels.
Let's draw an axis first:
Same as usual—we defined an axis, called it on a selection, and let d3 do its thing. We only made it prettier with a custom .tickFormat() function and used .ticks() to say we want a new tick every two months.
Ok, now for the streamgraph, add the following code:
Not much is going on. We used the area generator to draw each layer, defined colors with helpers.color, and called a tooltip function, which we'll define in helpers.js later.
The graph looks like this:
It looks pretty, but it is useless. Let's add that tooltip function to helpers.js:
We defined event listeners with a .tooltip namespace so that we can define multiple listeners on the same events.
The mouseover function will highlight streams and create tooltips, mousemove will move tooltips, and mouseout will put everything back to normal.
Let's put the three listeners inside the inner function:
That's the simple part of mouseover. It selects the current area and changes its class to highlighted. That will make it lighter and add a red outline.
In the same function, add the meaty part:
It is longer and with a dash of magic, but not scary at all!
First we find the mouse's position, then create a group element, and position it down and to the right of the mouse. We add a text element to the group and call SVG's getBBox() function on its node. This gives us the text element's bounding box and helps us size the background rectangle.
Finally, we remove the text because it's covered by the background and add it again. We might be able to avoid all this trouble by using divs, but I wanted to show you pure SVG tooltips. Hence, consider the following code:
The mousemove listener in the following code is much simpler. It just finds the #nicktool element and moves it to follow the cursor.
The mouseout function selects the current path, removes its highlighted styling, and removes the tooltip.
Voilà! Tooltips!
Very rudimentary—they don't understand edges and they won't break any hearts with their looks, but they get the job done. Let's add some CSS to the HTML:
And now we have a potentially useful streamgraph on our hands.
Highlighting friends with chord
We've seen how much karma people have and when they got it, but there's another gem hiding in the data—connections. We can visualize who is a friend of whom using the chord layout.
We're going to draw a chord diagram—a circular diagram of connections between users. Chord diagrams are often used in genetics and have even appeared on covers of magazines (http://circos.ca/intro/published_images/).
Ours is going to have an outer ring showing how much karma users give out and chords showing where that karma is going.
First, we need a matrix of connections for the chord diagram, and then we'll go the familiar route of path generators and adding elements. The matrix code will be useful later, so let's put it in helpers.js:
We begin with the familiar uniques list and the nick_id scale, then create a zero matrix, and loop through the data to increase connection counts in cells. Rows are from whom, columns are to whom; if the fifth cell in the first row holds 10, the first user has given ten karma to the fifth user. This is called an adjacency matrix.
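The steps just described can be sketched in plain JavaScript as follows. connectionMatrix is a hypothetical name, the records are made up, and an array lookup stands in for the nick_id scale.

```javascript
// Sketch only: build an adjacency matrix from {from, to} records.
function connectionMatrix(data) {
  // Unique nicknames, in order of first appearance.
  var nicks = [];
  data.forEach(function (d) {
    [d.from, d.to].forEach(function (n) {
      if (nicks.indexOf(n) === -1) {
        nicks.push(n);
      }
    });
  });

  // A square matrix of zeros, one row and one column per nickname.
  var matrix = nicks.map(function () {
    return nicks.map(function () { return 0; });
  });

  // Rows are givers, columns are receivers.
  data.forEach(function (d) {
    matrix[nicks.indexOf(d.from)][nicks.indexOf(d.to)] += 1;
  });

  return {nicks: nicks, matrix: matrix};
}

var connections = connectionMatrix([
  {from: 'alic', to: 'bobb'},
  {from: 'caro', to: 'bobb'},
  {from: 'alic', to: 'bobb'},
  {from: 'bobb', to: 'alic'}
]);
console.log(connections.matrix);
```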
Back in the load listener, we can do this:
We're going to need uniques for labels and it would be nice to have the innerRadius and outerRadius variables handy:
Time to make the chord layout do our bidding:
It is a little different from others. The chord layout takes data via the .matrix() method and can't be called as a function.
We started with d3.layout.chord() and put some .padding() between groups, which improves readability. To improve readability further, everything is sorted: .sortGroups sorts groups on the edge, .sortSubgroups sorts chord attachments in groups, and .sortChords sorts chord drawing order so that smaller chords overlap bigger ones.
In the end, we feed data into the layout with .matrix():
We add a centered group element so that all our coordinates are relative to the center from now on.
Drawing the diagram happens in three steps—arcs, labels, and chords, as shown in the following code:
This creates the outer ring. We used chord.groups to get group data from the layout, created a new grouping element for every chord group, and then added an arc. We use arc_labels from the pie example to add the labels:
Even though the radius is constant, we have to define it as a function using the following code because we didn't make arc_labels flexible enough for constants. Shame on us!
We got chord data from chord.chords and used a chord path generator to draw the chords. We pick colors with d.target.index because the graph looks better, but chord colors are not informative.
We add some CSS to make chords easier to follow:
And our diagram looks perfect:
It looks pretty but unintuitive. We spent hours bickering on IRC before we figured it out.
First of all, chord colors don't mean anything! They just make it easier to distinguish chords. Furthermore, this graph shows how much karma everyone is giving.
From my arc's size you can see I've given about 30 percent of the karma ever given on this channel. I might be too generous.
The width of chords touching my arc tells you how much of that karma is going to whom.
At the other end of each chord, it's exactly the same. Chord width tells you how much karma that user has given me. Chords are bidirectional connections between users.
The force layout is the most complicated of the non-hierarchical layouts. It lets you draw complex graphs using physical simulations—force-directed graphs if you will. Everything you draw will have built-in animation.
We're going to draw a graph of connections between users. Every user will be a node, the size of which will correspond to the user's karma. Links between nodes will tell us who is giving karma to whom.
To make things clearer, we're going to add tooltips and make sure mousing over a node highlights the connected nodes.
Let's begin.
As in the chord example, we begin with a matrix of connections. We aren't going to feed this directly to the force layout, but we will use it to create the kind of data it enjoys:
The force layout expects an array of nodes and links. Let's make them:
We're defining the bare minimum of what we need, and the layout will calculate all the hard stuff.
Each node tells us whom it represents, and each link connects a source to a target using indexes into the nodes array; the layout will turn them into proper references as shown in the following code. Every link also contains a count property that we'll use to define its strength.
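Turning a connection matrix into that kind of input could look like the sketch below. graphFromMatrix is a hypothetical name and the property names simply follow the description above; the book's own code may differ in the details.

```javascript
// Sketch only: derive nodes and links from an adjacency matrix.
function graphFromMatrix(nicks, matrix) {
  var nodes = nicks.map(function (nick) {
    return {nick: nick};
  });

  var links = [];
  matrix.forEach(function (row, i) {
    row.forEach(function (count, j) {
      // Only create a link where some karma actually changed hands.
      if (count > 0) {
        links.push({source: i, target: j, count: count});
      }
    });
  });

  return {nodes: nodes, links: links};
}

var graph = graphFromMatrix(['alic', 'bobb'], [[0, 2], [1, 0]]);
console.log(graph.links);
```

Here source and target are plain indexes into nodes, which is the form the text says the layout later replaces with object references.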
We create a new force layout with d3.layout.force(); just like the chord layout, it isn't a function either. We feed in the data with .nodes() and .links().
Gravity pulls the graph towards the center of the image; we defined its strength with .gravity(). We tell the force layout the size of our picture with .size().
No calculation happens until force.start() is called, but we need the results to define a few scales for later.
There are a few more parameters to play with: overall .friction(), the .linkDistance() that linked nodes settle at, .linkStrength() for link stretchiness, and .charge() for attraction or repulsion between nodes. Play with them.
nodes members look like this now:
weight tells us how many links connect with this node, px and py are its previous positions, and x and y are the current position.
links members are a lot simpler:
Both source and target objects are a direct reference to the correct node.
Now that the layout made its first calculation step, we have the data to define some scales:
We're going to use the weight scale for node sizes, distance for link lengths, and given to scale nodes for the highlighting effect:
We use .linkDistance() to dynamically define link lengths according to the .count property. To put the change in effect, we restart the layout with force.start().
Finally! Time to put some ink on paper—well, pixels on screen:
Links are simple—go through the list of links and draw a line.
Draw a circle for every node and give it the right size and color. The strange nick_ class will help us with the highlighting we're doing in the two mouse event listeners:
We add tooltips with the familiar helpers.tooltip function and force.drag will automatically make the nodes draggable:
After all that work, we still have to do the updating on every tick of the force layout animation:
On a tick event, we move every link endpoint and node to its new position. Simple.
Time to define the two highlighting functions we mentioned earlier:
The highlight function will grow all connected nodes according to how much karma they've gotten from the node we're touching with the mouse. It starts by setting the given scale's domain, then goes through the uniques list, uses nick_id to find the corresponding nodes, and resizes them using the given scale.
The current node is left alone.
dehighlight will remove all the shenanigans we caused:
Add some styling to the HTML:
And voilà! We get a force-directed graph of user connections.
Running this example looks silly because it spins around a lot before settling down. But once it stabilizes, the graph looks something like this:
The graph would be more interesting if not all nodes were connected, but hovering over one of the smaller nodes will reveal interesting connections.
We should have added some code to print names next to the highlighted nodes, but the example was long enough. Let's say that's left as an exercise for the reader.
We will now move towards hierarchical layouts!
All hierarchical layouts are based on an abstract hierarchy layout designed for representing hierarchical data—data within data within data within data within.... You get the idea.
All the common code for the partition, tree, cluster, pack, and treemap layouts is defined in d3.layout.hierarchy() and they all follow similar design patterns. The layouts are so similar that the official documentation very obviously copy-pastes most of its explanations. Let's avoid that by looking at the common stuff first, and then we will focus on the differences.
First of all, we need some hierarchical data. I spent an afternoon trying to make our karma dataset hierarchical. The result was a scheme that works well with three of the layouts and looks contrived for the other two. Sorry about that.
It's simple really, we kill the Batman.
We'll have a root node called karma, which will contain the 24 users who have ever given karma. For the tree and cluster layouts, each of those will contain nodes for everyone they have given karma to. For the partition, pack, and treemap layouts, children nodes will tell us who contributed to the parent's karma.
The final data structure will look like this:
While it could potentially go on forever, that wouldn't make sense in our case.
The default accessor expects a .children property, but we could easily have done something crazy like dynamically generating a fractal structure in a custom accessor.
As usual, there's a .value() accessor that helps layouts to find data in a node. We'll use it for the .count property—to check how much karma a user's got.
To run a hierarchical layout, we call .nodes() with our dataset. This immediately returns a list of nodes; the layout doesn't keep it around, so hold on to the return value. For a list of connections, we call .links() with a list of our nodes. Nodes in the returned list will have some extra properties calculated by the layout. Most layouts tell us where to put something with .x and .y, then use .dx and .dy to tell us how big it should be.
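Leaving positions aside, the nodes-then-links pattern boils down to flattening a nested structure. The following sketch shows that part only; flattenTree is a hypothetical name, and a real layout would also compute coordinates and other properties for every node.

```javascript
// Sketch only: flatten nested data into a node list and a link list.
function flattenTree(root) {
  var nodes = [];
  var links = [];

  function visit(node, parent) {
    nodes.push(node);
    if (parent) {
      links.push({source: parent, target: node});
    }
    // The default convention: children live in a .children array.
    (node.children || []).forEach(function (child) {
      visit(child, node);
    });
  }

  visit(root, null);
  return {nodes: nodes, links: links};
}

var flat = flattenTree({
  nick: 'karma',
  children: [
    {nick: 'alic', children: [{nick: 'bobb'}]},
    {nick: 'caro'}
  ]
});
console.log(flat.nodes.length, flat.links.length);
```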
All hierarchical layouts also support sorting with .sort(), which takes a comparator function in the style of d3.ascending or d3.descending.
Enough theory, let's add a data munging function to helpers.js:
Wow, there's a lot going on here. We avoided recursion because we know our data will never nest more than two levels deep.
tree holds an empty root node at first. We use helpers.uniques to get a list of nicknames, then map through the array and define the children of the root node by counting everyone's karma and using helpers.bin_per_nick to get an array of children.
The code is wibbly-wobbly because we use filter1, filter2, nick1, and nick2 for data accessors, but making this function flexible makes it useful in all hierarchical examples.
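A stripped-down version of the idea, without the configurable filters and accessors, might look like the sketch below. It hard-codes the variant used for the tree and cluster layouts (givers on the first level, receivers below them). karmaTree is a hypothetical name, and using the amount of karma given as each giver's count is my assumption, not necessarily what the book's make_tree does.

```javascript
// Sketch only: a fixed two-level hierarchy of givers and their receivers.
function karmaTree(data) {
  var givers = [];
  data.forEach(function (d) {
    if (givers.indexOf(d.from) === -1) {
      givers.push(d.from);
    }
  });

  return {
    nick: 'karma',
    children: givers.map(function (giver) {
      var given = data.filter(function (d) { return d.from === giver; });

      // Count how much karma this giver handed to each receiver.
      var perReceiver = {};
      given.forEach(function (d) {
        perReceiver[d.to] = (perReceiver[d.to] || 0) + 1;
      });

      return {
        nick: giver,
        count: given.length, // assumption: total karma given
        children: Object.keys(perReceiver).map(function (nick) {
          return {nick: nick, count: perReceiver[nick], children: []};
        })
      };
    })
  };
}

var demoTree = karmaTree([
  {from: 'alic', to: 'bobb'},
  {from: 'caro', to: 'bobb'},
  {from: 'alic', to: 'bobb'},
  {from: 'bobb', to: 'alic'}
]);
console.log(JSON.stringify(demoTree, null, 2));
```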
The tree layout displays data in a tree using the tidy Reingold-Tilford algorithm. We'll use it to display our dataset in a large circular tree with every node connected to its parent by a curvy line.
We begin the load listener by fixating colors, turning data into a tree, and defining a way to draw curvy lines:
You know fixate_colors from before, we defined make_tree not a page ago, and we've talked about the diagonal generator in Chapter 2, A Primer on DOM, SVG, and CSS.
We create a new tree layout by calling d3.layout.tree(), define its size with .size(), and execute it with .nodes(). .size() tells the layout how much room it's got; in this case, we're using x as an angle (360 degrees) and y as a radius, though the layout itself doesn't really care about that.
To avoid worrying about centering later on, we put a grouping element center stage:
First we are going to draw the links, then the nodes and their labels:
You should be familiar with this by now; go through the data and append new paths shaped with the diagonal generator:
For every node in the data, we create a new grouping element and move it into place using rotate for angles and translate for radius positions.
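The rotate-then-translate trick is worth a quick sketch, since it is what turns an angle and a radius into a spot on the page. Both function names are hypothetical, and the 90-degree offset (so that an angle of zero points straight up) is an assumption rather than something taken from the book's code.

```javascript
// Sketch only: place a node whose x is an angle in degrees and y is a radius.
function radialTransform(d) {
  // Rotate the group's coordinate system first, then move outwards along it.
  return 'rotate(' + (d.x - 90) + ')translate(' + d.y + ')';
}

// The same position in plain Cartesian coordinates, relative to the center.
function toCartesian(d) {
  var a = (d.x - 90) * Math.PI / 180;
  return {x: d.y * Math.cos(a), y: d.y * Math.sin(a)};
}

console.log(radialTransform({x: 90, y: 100}));
console.log(toCartesian({x: 0, y: 100}));
```

Using a transform string means SVG does the trigonometry for us; toCartesian is only there to show what the browser ends up computing.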
Now it's just a matter of adding a circle and a label:
Every node is colored with the user's native color and the text is transformed similarly to the earlier pie and chord examples. Finally, we made leaf nodes' text smaller to avoid overlap.
After this, we will add some styling:
Our tree looks like this:
It's rather big, so you should try it out in the browser. Just remember, the inner ring is users giving karma to the outer ring.
The cluster layout is the same as the tree layout, except that leaf nodes line up.
Do you see that the hoi user is hanging out in the inner ring of the tree example? With the cluster layout they end up on the outside with the other leaf nodes.
Codewise this example is the same as the last, so we won't go through it again. Really, the only difference is that we don't have to flip labels at certain angles. You can look at the code in the GitHub examples repository at https://github.com/Swizec/d3.js-book-examples/blob/master/ch5/cluster.js.
We end up with a very tall graph that looks something like this:
Now we're getting somewhere! The next three layouts fit our data perfectly—we're taking three looks at how our core users' karma is structured.
The partition layout creates adjacency diagrams, where you don't draw nodes with links between them, but next to each other so that it looks like the children partition the parent.
We are going to draw a two-layer donut chart. Users will go on the first layer and the layer on top will show us where the karma is coming from.
We begin by munging the dataset and fixating colors:
Then use the partition layout:
We used .value() to tell the layout we care about the .count values, and we'll get a better picture if we .sort() the output. Similar to the tree layout, x will represent angles (this time in radians) and y will be radii.
We need an arc generator as well, as shown in the following code:
The generator will use each node's .y property for the inner radius and add .dy for the outer radius. Fiddling shows the outer layer should be thinner, hence we are dividing it by the tree depth.
Notice that there's no accessor for .startAngle and .endAngle, which are stored as .x and .dx. It's easier to just fix the data:
It is as simple as mapping the data and defining angle properties, then filtering the data to make sure the root isn't drawn.
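That data fix might be sketched like this. toArcData is a hypothetical name, and the sketch assumes every node carries a depth property with the root at depth 0.

```javascript
// Sketch only: add the angle properties an arc generator looks for.
function toArcData(nodes) {
  return nodes
    .map(function (d) {
      // Copy the node so that the layout's output stays untouched.
      var copy = {};
      Object.keys(d).forEach(function (key) { copy[key] = d[key]; });
      copy.startAngle = d.x;
      copy.endAngle = d.x + d.dx;
      return copy;
    })
    // Assumption: the root has depth 0 and should not be drawn.
    .filter(function (d) { return d.depth > 0; });
}

var arcData = toArcData([
  {nick: 'karma', depth: 0, x: 0, dx: 6},
  {nick: 'alic', depth: 1, x: 0, dx: 2},
  {nick: 'bobb', depth: 1, x: 2, dx: 4}
]);
console.log(arcData);
```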
We use the familiar grouping trick to center our diagram.
Preparation work is done. It's drawing time:
An arc is drawn for every node, and its color is chosen as usual:
We add labels and tooltips with the functions prepared in earlier examples. We avoid adding labels for very thin slices so that they don't overlap and make a mess. Sprinkle some CSS:
The adjacency diagram looks like this:
Packing circles into circles
The pack layout uses packing to visually represent hierarchies. It stuffs children nodes into their parents, trying to conserve space and sizing each node so that it's the cumulative size of its children.
Conceptually it's very similar to the treemap layout, so I'm going to skip all the code and just show you the picture. You can still see the code on GitHub at https://github.com/Swizec/d3.js-book-examples/blob/master/ch5/pack.js.
Note
The code is rather familiar—generate a tree, fixate colors, create layout, tweak a few parameters, get computed nodes, draw nodes, and add tooltips. Simple.
It looks very pretty, but not too informative. Adding labels wouldn't help much because most nodes are too small.
The treemap layout subdivides nodes with horizontal and vertical slices, essentially packing children into their parents just like the pack layout, but using rectangles. As a result, node sizes on every level can be compared directly, making this one of the best layouts for analyzing cumulative effects of subdivisions.
We are going to have some fun with this example. Tooltips will name the parent—parents are almost completely obscured by the children—and mousing over a node will make unrelated nodes become lighter, making the graph less confusing (at least in theory).
It's also a cool effect and a great way to end this chapter on layouts.
We begin with the boring stuff: prepare data and fixate colors:
Creating the treemap layout follows familiar patterns:
We added some padding with .padding() to give nodes room to breathe.
Every node will become a group element holding a rectangle. The leaves will also hold a label:
Now for the first fun bit. Let's fit labels into as many nodes as they can possibly go:
Finally! That was some interesting code!
We found all the leaves and started adding text. To fit labels into nodes, we get their size with this.getBBox(), then move them to the middle of the node, and check for fit.
If the label is too wide but fits vertically, we rotate it; otherwise, we remove the label after checking again that it doesn't fit. Making sure of the height is important because some nodes are very thin.
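Stripped of the DOM calls, the fitting decision is a small piece of logic. The sketch below assumes we already have the label's bounding box and the node's size as {width, height} objects; labelPlacement and its return values are hypothetical.

```javascript
// Sketch only: decide what to do with a label, given two {width, height} boxes.
function labelPlacement(label, node) {
  // Fits as it is.
  if (label.width <= node.width && label.height <= node.height) {
    return 'horizontal';
  }
  // Too wide, but turned by 90 degrees it fits the node's height.
  if (label.width <= node.height && label.height <= node.width) {
    return 'rotated';
  }
  // Doesn't fit either way, so the label has to go.
  return 'remove';
}

console.log(labelPlacement({width: 40, height: 12}, {width: 20, height: 60}));
```

In the real example the label box would come from this.getBBox() and the node size from the layout's .dx and .dy values.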
We add tooltips with helpers.tooltip:
Another fun bit—partially hiding nodes from different parents:
We used two mouse event listeners: one creates the effect, another removes it. The mouseover listener goes through all the nodes and lightens those with a different parent or that aren't the parent (d.parent.nick and d.nick are different). The mouseout listener removes all changes.
After this, add some CSS:
The end result looks like an abstract painting:
Touching an area with your mouse restores some sanity as shown in the following screenshot:
Although, not as much sanity as we hoped.