D3 Tips and Tricks v4

Saturday, 10 August 2013

Observations of the impact of publishing the D3 Tips and Tricks book as a web page on Leanpub.

First of all apologies to those readers who are expecting a pure d3.js experience from reading this post and an apology to those readers who are expecting a pure Leanpub experience. This is a bit of both :-)

For the benefit of both groups, d3.js is an awesome JavaScript library / framework for visualizing data in a web page and Leanpub is a remarkable publishing option for authors who want to distribute their work in a slightly different (but cool) and very open way.

On the bright side, those interested in d3 might discover some interesting books to read or a way to publish their own work, and Leanpub devotees might find that there is a way for them to analyse how their book is being received in the World :-)

What's this blog post about?

Well, a while back (28 May 2013), Scott Patten of Leanpub announced on the Leanpub Google groups site that;
You can now put the content of your book on the web automatically every time you hit the publish button. This page will be available to everyone on the internet, not just people who have purchased your book.
So, in addition to providing a framework for publishing and distributing an author's work, Leanpub added the ability to have the entirety of a book published as a single web page, available online for anyone to view.

Now, that may seem to be slightly strange thinking, but as Scott explained;
... your biggest risk as an author is obscurity, not security. Don't worry about people stealing your book. Worry about people not knowing about your book. Putting your book on the web will allow Google (and other search engines) to index it, and people will be able to find you by searching for all of your content, not just the small amount you can put on your book's landing page.
At the time I thought this was an excellent idea (and (spoiler alert) I still do). So I immediately selected this as an option for D3 Tips and Tricks, so that people could not only download the book in a format suitable for reading as pdf, epub or mobi file types, but they could also discover the content online.

Now, at the time of this feature going live, there was an obvious debate over what sort of impact this feature would have on downloads of a book. I therefore offered to share what (if anything) I discovered during this process. This blog post presents my observations on what I've seen thus far with D3 Tips and Tricks and it incorporates using a d3.js script (via dc.js) to help :-).

Did using this feature result in additional interest for D3 Tips and Tricks?

The first piece of data I'll present is from Google Analytics (I'd like to point out at this time that I'm no expert in using Google Analytics. Please prepare yourself for disappointment if you think I have any special insights or advice to give on its use).

Here we can see the total number of pageviews for the D3 Tips and Tricks site on Leanpub. The cursor is presented at about the time that I added the `read as a web page` feature and it is evident from the graph that after that time, the number of page views increased. As there were no major alterations I made to the book at that time (other than adding the `read as a web page` feature), I am making the presumption that this increase is as a result of implementing this feature.

(The slight increase in the last two weeks would most likely be as a result of a significant addition to the book in the form of information on crossfilter and dc.js)

It should be remembered that this is only representative of people who visited the site, not people who downloaded the book.

What proportion of visits was attributable to adding the `read as a web page` feature?

Luckily we can select individual pages for comparison and the following graph shows as a proportion, the number of visits to the /D3-Tips-and-Tricks/read page and the overall number of visits;

The number of visitors who went to the read page (orange line) has remained fairly constant and by the roughest of evaluation methods (just looking at the graph) it would appear that the increase in the overall pageviews is due in most part to the additional traffic going to the read page.

So, did using this feature result in more downloads of D3 Tips and Tricks?

This will be a question that more Leanpub authors will be interested in. Did the access to the content on the web influence the number of people downloading the book?

This is where things start to get a little more complicated. Because D3 Tips and Tricks (the book) can be downloaded for free, there would be no barrier to people reading the content online (for free) to also download the book if they wanted to (for free). With this in mind, I don't believe that the results that I have would be a firm indicator for the results that would occur with a book that cost money to download.
Nevertheless... Here is an indication (in green) of the number of people who also visited the 'purchases' portion of the site.

There does not appear to be any data from before approximately the middle of May (I'm not sure of the reason, but it may be due to a change in the way Leanpub was running the site). However, there are two clear weeks of data from before the change, and to my eye there isn't an easily discernible difference in the number of visits to the 'purchases' page between the period before the end of May and any time afterwards.

That does surprise me a little as I would have expected some degree of difference, but it is possible that the numbers are too low to show anything other than a REALLY significant change. It is also possible that this is as a result of the pricing of D3 Tips and Tricks. If it had been a non-free book, I wouldn't feel comfortable predicting what any impact would have been.

Is there any other impact from the introduction of this feature?

Good question, because this gives me a reason to crank up some d3.js graphs!

So I took this as an opportunity to build a dc.js / crossfilter graph that showed the relationship between the number of downloads of the book per day, the number of sales per day (where a sale is someone kindly donating money for the book; remember, it can be downloaded for free, so this metric covers readers who decide to donate when downloading), the amount of royalties accrued per day (I still find it very heart warming that people donate (it's like a pat on the back) and I've been more than happy to enjoy the occasional beer while thinking kindly of them :-)), the number of downloads per day of the week, the amount of royalties per day of the week and the total number of sales per day of the week.

Now, I've deliberately removed the values from some of the axes here for privacy reasons, so the graphs are indicative of differences, not amounts. Likewise, I have obfuscated the graphs slightly by applying a smoothing interpolation (this was done internally to the dc.js files for those interested).


From this we can confirm the earlier observation that the number of downloads of the book does not appear to have changed from before to after the end of May.
By the same token, there does not appear to have been any significant change in either the number of people who choose to donate when downloading the book or the amount that they donate (again, it is possible that the sample population for this comparison is a little small to support this conclusion, so take it with a grain of salt). If I was to REALLY squint hard when looking at the graphs, I might think that the number of sales has reduced, but the overall amount donated is pretty much the same.

The row graphs are pretty interesting. Obviously, Saturday and Sunday see a lower turnover than the weekdays, but interestingly, while the number of downloads and number of sales roughly correspond to each other, it would appear that the amount of royalties reduces on a Wednesday (this would indicate that while the number of sales stays the same, the amount paid (per donation) is less on a Wednesday than a Thursday or Friday). I have no idea what to make of that.
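To make that per-donation reasoning concrete, here is a minimal sketch in plain JavaScript (no d3.js or dc.js required) that divides the royalties for each day of the week by the number of sales on that day. The figures and the names `perDay` and `averageDonation` are made up purely for illustration; they are not the real numbers for the book.

```javascript
// Hypothetical totals per day of the week (NOT the real figures for the book).
var perDay = [
  { day: "Wed", sales: 10, royalties: 30 },
  { day: "Thu", sales: 10, royalties: 50 },
  { day: "Fri", sales: 10, royalties: 45 }
];

// Average donation per sale = royalties / sales (0 when there were no sales).
function averageDonation(d) {
  return d.sales === 0 ? 0 : d.royalties / d.sales;
}

var averages = perDay.map(function (d) {
  return { day: d.day, average: averageDonation(d) };
});

console.log(averages);
```

With the same number of sales on each day, a lower royalties total can only come from a lower average donation, which is the pattern described above for Wednesday.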

More interestingly (because this is a dc.js visualization) we can start selecting different parameters and discovering other things.

For instance, if I select all the records for donations prior to the end of May, there is a distinct increase in the relative amount of royalties received on a Thursday!


And if I flip the selection to all downloads after the end of May, it looks like Tuesday is a leader for royalties.


That in itself is weird.

Wrap up.

What have we learned?
The introduction of the `read as a web page` feature appeared to have the effect of increasing traffic to the book's content.
There did not appear to be an attributable increase or decrease in the number of downloads of the book as a result (although with the caveat that D3 Tips and Tricks represents a subset of books which can be downloaded for free which would be expected to skew the figures).

Future studies? 

It would be very interesting to get a similar interpretation of figures from other books, especially ones that have a minimum purchase price that is not $0.

Can you implement the same sort of graphs for your own Leanpub book? 

Sure! You'll need to find your own way with Google Analytics, but if you want to use d3.js, dc.js and crossfilter, they are all open source projects which are available for use.
A copy of the file that I created to generate the graphs is here, but if you're unfamiliar with some of the JavaScript libraries involved, you may need to go through D3 Tips and Tricks first :-)

Tuesday, 6 August 2013

Add a Pie Chart in dc.js

The following post is a portion of the D3 Tips and Tricks book which is free to download. To use this post in context, consider it with the others in the blog or just download the book as a pdf / epub or mobi.
----------------------------------------------------------
The pie chart provides a useful way of presenting and filtering on discrete values or identifiers, similar to a row chart.
The pie chart that we'll create will be a representation of which island the earthquakes occurred in. For those of you unfamiliar with the stunning landscape of New Zealand, there are two main islands creatively named North Island and South Island (stunning and practical!). The determination of what constitutes the North and South Island has been decided in a completely unscientific way (by me) by designating any area South of latitude -40.555907 and West of longitude 174.590607 as the South Island and anything else is the North Island.

The pie graph should end up looking a bit like this.
Good news! The pie chart shares the same cool feature as the row chart...
Click on one of the pie segments...
... and everything dynamically reflects the selection.
Just as with the previous chart examples, we'll work through adding the chart in the following stages.
  1. Position the chart
  2. Assign type
  3. Dimension and Group
  4. Configure chart parameters

Position the pie chart

We are going to position our pie chart above our data table (and below the line chart) in the same row as the row chart, in one of the blank span4's.
The code that sets up that row should now look like this;
  <div class='row'>
    <div class='span4' id='dc-dayweek-chart'>
      <h4>Day of the Week</h4>
    </div>
    <div class='span4' id='dc-island-chart'>
      <h4>North or South Island</h4>
    </div>   
    <div class='span4' id='blank2'>
      <h4>Blank 2</h4>
    </div> 
  </div>
We've given it an ID selector of dc-island-chart. So when we assign our chart that selector, it will automatically appear in that position. We've also put another simple title in place (<h4>North or South Island</h4>).
The last span4 is still blank.

Assign the pie chart type

Here we give our chart its name (islandChart), assign it a dc.js chart type (in this case pieChart) and assign it to the ID selector (dc-island-chart).
Under the row that assigns the dayOfWeekChart chart...
  var dayOfWeekChart = dc.rowChart("#dc-dayweek-chart");
... add in the equivalent for our pie chart.
  var islandChart = dc.pieChart("#dc-island-chart");

Dimension and group the pie chart data

We'll put the code between the dimension and group of the row chart and the data table dimension (this is just to try and keep the code in the same order as the graphs on the page).
When adding our dimension for our islands we want to provide an appropriate label so our code does the figuring out based on the latitude and longitude that we had established as the boundary between North and South.
  var islands = facts.dimension(function (d) {
    if (d.lat <= -40.555907 && d.long <= 174.590607)
      return "South";
    else
      return "North";
  });
This dimension (islands) uses the same facts data, but when we return our key values we are going to return them as either 'North' or 'South'. To do this we employ a simple if statement with a little logic. These are the only two 'slices' for our pie chart.
Then we want to group the data by using the default action of the .group() function to count the number of events for each island.
  var islandsGroup = islands.group();
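If you want to see what the dimension and group are working out before wiring them into the chart, the following is a minimal sketch in plain JavaScript that mimics the result without crossfilter. The sample points and the helper names (`islandOf`, `countBy`) are invented for illustration; the real page gets its data from the `facts` object.

```javascript
// Boundary used in the post: south of this latitude AND west of this longitude is 'South'.
var BOUNDARY_LAT = -40.555907;
var BOUNDARY_LONG = 174.590607;

// Same test as the dimension function above.
function islandOf(d) {
  return (d.lat <= BOUNDARY_LAT && d.long <= BOUNDARY_LONG) ? "South" : "North";
}

// Hypothetical sample points (roughly Christchurch, Auckland, Wellington, Queenstown).
var sample = [
  { lat: -43.53, long: 172.63 },
  { lat: -36.85, long: 174.76 },
  { lat: -41.29, long: 174.78 },
  { lat: -45.03, long: 168.66 }
];

// Count records per key, returning {key, value} pairs sorted by key,
// which is the shape a crossfilter group hands to the chart.
function countBy(records, keyFn) {
  var counts = {};
  records.forEach(function (d) {
    var k = keyFn(d);
    counts[k] = (counts[k] || 0) + 1;
  });
  return Object.keys(counts).sort().map(function (k) {
    return { key: k, value: counts[k] };
  });
}

var islandCounts = countBy(sample, islandOf);
console.log(islandCounts);
```

Note that the Wellington-ish point lands in 'North' even though it is south of the boundary latitude, because it is east of the boundary longitude. Both conditions have to hold for a point to count as 'South'.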

Configure the pie chart parameters

There are fewer parameters that can be configured for pie charts, but we'll still take the time to go through the options used here.
This code should go just before the block that configures the dataTable (again, this is just to try and keep everything in the same order as the graphs on the page).
  islandChart.width(250)
    .height(220)
    .radius(100)
    .innerRadius(30)
    .dimension(islands)
    .group(islandsGroup)
    .title(function(d){return d.value;});
That should get the chart working. With the addition of this portion of the code, you should have a functioning visualization that can be filtered dynamically by clicking on the appropriate island in your pie chart. Just check to make sure that everything is working properly and we'll go through some of the configuration options to see what they do.
To start with, your page should look something like this;
The configuration options start by declaring the name of the chart (islandChart) and setting the height and width of the chart.
  islandChart.width(250)
    .height(220)
In the case of our example I have selected the width based on the default size for a span4 grid segment in bootstrap and adjusted the height to make it look suitable alongside the row chart.
Then we set up our inner and outer radii for our pie.
    .radius(100)
    .innerRadius(30)
This is fairly self explanatory, but by all means adjust away to make sure the chart suits your visualization.
Then we define which dimension and grouping we will use.
    .dimension(islands)
    .group(islandsGroup)
For a pie chart, the `.dimension` declaration is the discrete values that make up each segment of the pie and the `.group` declaration is the size of each segment.
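As a rough illustration of how the group values translate into segment sizes, here is a small sketch in plain JavaScript that converts counts into a fraction of the pie and an angle. The counts and the function name `sliceShares` are hypothetical, and this is only the underlying arithmetic, not how dc.js itself is written.

```javascript
// Hypothetical group output: one {key, value} pair per island.
var groupValues = [
  { key: "North", value: 30 },
  { key: "South", value: 10 }
];

// Work out each segment's fraction of the whole pie and its angle in degrees.
function sliceShares(values) {
  var total = values.reduce(function (sum, d) { return sum + d.value; }, 0);
  return values.map(function (d) {
    var fraction = total === 0 ? 0 : d.value / total;
    return { key: d.key, fraction: fraction, degrees: fraction * 360 };
  });
}

var shares = sliceShares(groupValues);
console.log(shares);
```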

The final line in the configuration adds a tool tip to our pie chart using the value when the mouse hovers over the appropriate slice.
    .title(function(d){return d.value;});

The description above (and heaps of other stuff) is in the D3 Tips and Tricks book that can be downloaded for free (or donate if you really want to :-)).

Sunday, 4 August 2013

Add a row chart in dc.js

The following post is a portion of the D3 Tips and Tricks book which is free to download. To use this post in context, consider it with the others in the blog or just download the book as a pdf / epub or mobi.
----------------------------------------------------------
The row chart provides an excellent mechanism for presenting and filtering on discrete values or identifiers.
The row chart that we'll create will be a representation of the number of earthquake events that occur on a particular day of the week. As such it doesn't represent any logical reason for selecting a Saturday over a Wednesday, and it is used here solely because the data makes a nice row chart :-). In this respect, what we are expecting to see is the number of events on the x axis and the individual days on the y axis.
It should end up looking a bit like this.
Now for a super cool feature with row charts...
Click on one of the rows...
How about that!
You can select an individual row from your chart and all the other rows reflect the selection. Go ahead and select other combinations of more than one row if you want. Welcome to data immersion!
Just as with the previous chart examples, we'll work through adding the chart in the following stages.
  1. Position the chart
  2. Assign type
  3. Dimension and Group
  4. Configure chart parameters

Position the row chart

We are going to position our row chart above our data table (and below the line chart) and we'll divide the row that it sits in into 3 equally sized spans of span4. The additional two spans we'll leave blank for future use.
Just under the row of code that defined the containers for the line graph;
  <div class='row'>
    <div class='span12' id='dc-time-chart'>
      <h4>Events per hour</h4>
    </div>
  </div>
We add in a new row that has our three span4's.
  <div class='row'>
    <div class='span4' id='dc-dayweek-chart'>
      <h4>Day of the Week</h4>
    </div>
    <div class='span4' id='blank1'>
      <h4>Blank 1</h4>
    </div>   
    <div class='span4' id='blank2'>
      <h4>Blank 2</h4>
    </div> 
  </div>
We've given it an ID selector of dc-dayweek-chart. So when we assign our chart that selector, it will automatically appear in that position. We've also put another simple title in place (<h4>Day of the Week</h4>).
The additional two span4's have been left blank.

Assign the row chart type

Here we give our chart its name (dayOfWeekChart), assign it a dc.js chart type (in this case rowChart) and assign it to the ID selector (dc-dayweek-chart).
Under the row that assigns the depthChart chart...
  var depthChart = dc.barChart("#dc-depth-chart");
... add in the equivalent for our row chart.
  var dayOfWeekChart = dc.rowChart("#dc-dayweek-chart");

Dimension and group the row chart data

We'll put the code between the dimension and group of the line (time) chart and the data table dimension (this is just to try and keep the code in the same order as the graphs on the page).
When adding our dimension for our day of the week we want to provide an appropriate label so our code does something extra.
  var dayOfWeek = facts.dimension(function (d) {
    var day = d.dtg.getDay();
    switch (day) {
      case 0:
        return "0.Sun";
      case 1:
        return "1.Mon";
      case 2:
        return "2.Tue";
      case 3:
        return "3.Wed";
      case 4:
        return "4.Thu";
      case 5:
        return "5.Fri";
      case 6:
        return "6.Sat";
    }
  });
This dimension (dayOfWeek) uses the same facts data, but when we return our key values we are going to return them as a combination of their numerical order (0 = Sunday etc) and their abbreviation (Sun = Sunday etc). This is essentially defining the categories of the values on the y axis for our row chart.
The code snippet looks a little strange, but think of it as extracting the numerical representation of the day of the week from our data (var day = d.dtg.getDay();) and then matching each number with an appropriate label (0 = '0.Sun', 1 = '1.Mon' etc). It's these labels that are now our key values in our dimension.
Then we want to group the data by using the default action of the .group() function to count the number of events for each day of the week.
  var dayOfWeekGroup = dayOfWeek.group();
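For anyone who finds the switch statement a little long-winded, the same labelling can be sketched with a lookup array. The following plain JavaScript example (no crossfilter needed) uses a handful of made-up records and the hypothetical names `DAY_LABELS` and `dayLabel` to show the labels and the counts that the default grouping would arrive at.

```javascript
// Labels indexed by the value returned from Date.getDay() (0 = Sunday ... 6 = Saturday).
var DAY_LABELS = ["0.Sun", "1.Mon", "2.Tue", "3.Wed", "4.Thu", "5.Fri", "6.Sat"];

// Equivalent to the switch statement in the dimension above.
function dayLabel(date) {
  return DAY_LABELS[date.getDay()];
}

// Hypothetical records built in local time; months are zero-based, so 7 = August.
var records = [
  { dtg: new Date(2013, 7, 4, 9, 30) },   // Sunday
  { dtg: new Date(2013, 7, 6, 14, 0) },   // Tuesday
  { dtg: new Date(2013, 7, 10, 23, 15) }, // Saturday
  { dtg: new Date(2013, 7, 11, 1, 5) }    // Sunday
];

// Count the events for each label, much as the default .group() would.
var dayCounts = {};
records.forEach(function (d) {
  var key = dayLabel(d.dtg);
  dayCounts[key] = (dayCounts[key] || 0) + 1;
});

console.log(dayCounts);
```

Bear in mind that getDay() reports the day in the local time zone of the browser, so an event close to midnight could fall on a different day for a reader in another part of the world.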

Configure the row chart parameters

As with the previous charts, there are plenty of parameters that can be configured. The best way to learn what they do is still to have a play with them. So here is the block of code for configuring the row chart. Once you are happy that it works on your system, take some time and go through the settings in conjunction with the information from the demo page and the api reference.
This should go just before the block that configures the dataTable (again, this is just to try and keep the code in the same order as the graphs on the page).
  // row chart day of week
  dayOfWeekChart.width(300)
    .height(220)
    .margins({top: 5, left: 10, right: 10, bottom: 20})
    .dimension(dayOfWeek)
    .group(dayOfWeekGroup)
    .colors(d3.scale.category10())
    .label(function (d){
       return d.key.split(".")[1];
    })
    .title(function(d){return d.value;})
    .elasticX(true)
    .xAxis().ticks(4);
That should get you working. With the addition of this portion of the code, you should have a functioning visualization that can be filtered dynamically by clicking on the appropriate day of the week in your row chart. Just check to make sure that everything is working properly and we'll go through some of the configuration options to see what they do. To start with, your page should look something like this;

The configuration options start by declaring the name of the chart (dayOfWeekChart) and setting the height and width of the chart.
  dayOfWeekChart.width(300)
    .height(220)
In the case of our example I have selected the width based on the default size for a span4 grid segment in bootstrap and adjusted the height to make it look suitable.
Then we have our margins set up.
    .margins({top: 5, left: 10, right: 10, bottom: 20})
Nothing too surprising there, although I did reduce the top margin slightly more than I thought I would need. You can be the judge for your own charts.
Then we define which dimension and grouping we will use.
    .dimension(dayOfWeek)
    .group(dayOfWeekGroup)
For a row chart, think of the .dimension declaration being the y axis and the .group declaration being the x axis (the opposite to the previous charts).
We can set the range of colours to use one of the standard palettes.
    .colors(d3.scale.category10())
Then we add the labels to our categories by splitting the key values (remember '0.Sun', '1.Mon' etc) at the full stop and returning the second part of the split value (which is the 'Sun', 'Mon' part) as the label.
    .label(function (d){
       return d.key.split(".")[1];
    })
The end result produces...
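A quick way to check the split outside of the chart is to run the accessor over a few keys in plain JavaScript. The entries below are made up for illustration and `labelFromKey` is just a hypothetical name for the function passed to .label().

```javascript
// Same accessor as used for .label(): keep the part after the full stop.
function labelFromKey(d) {
  return d.key.split(".")[1];
}

// Hypothetical group entries, listed in the order their keys sort.
var entries = [
  { key: "0.Sun", value: 12 },
  { key: "1.Mon", value: 20 },
  { key: "6.Sat", value: 9 }
];

var labels = entries.map(labelFromKey);
console.log(labels);
```

The numeric prefix is presumably only there so that the keys sort in weekday order rather than alphabetically; the label accessor then hides it from the reader.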
The next line in the configuration adds a tool tip to our row chart using the value when the mouse hovers over the appropriate bar.
    .title(function(d){return d.value;})

We can set the x axis to dynamically adjust when the events are filtered by selections on any of the other charts using the following configuration line.
    .elasticX(true)
For instance if we select a subset of the earthquakes using our time / line chart, our row chart will have a corresponding selection of the appropriate days and the x axis will alter accordingly.
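To get a feel for why the axis needs to change, here is a conceptual sketch in plain JavaScript. It is an illustration of the idea only, not dc.js's actual implementation, and the events and the function name `maxCountPerDay` are hypothetical. It works out the largest per-day count before and after a filter is applied, which is the value an elastic x axis would have to reach.

```javascript
// Hypothetical events: a day-of-week label plus the hour they occurred.
var events = [
  { day: "1.Mon", hour: 2 }, { day: "1.Mon", hour: 14 }, { day: "1.Mon", hour: 15 },
  { day: "2.Tue", hour: 3 }, { day: "2.Tue", hour: 16 },
  { day: "3.Wed", hour: 4 }
];

// Largest per-day count among the events that pass the filter.
// An elastic x axis would need to stretch from 0 to (at least) this value.
function maxCountPerDay(records, keep) {
  var counts = {};
  records.filter(keep).forEach(function (d) {
    counts[d.day] = (counts[d.day] || 0) + 1;
  });
  return Object.keys(counts).reduce(function (max, k) {
    return Math.max(max, counts[k]);
  }, 0);
}

var unfilteredMax = maxCountPerDay(events, function () { return true; });
var afternoonMax = maxCountPerDay(events, function (d) { return d.hour >= 12; });

console.log(unfilteredMax, afternoonMax);
```

Without an elastic axis the bars for a small selection would be drawn against the original, larger scale and could end up too short to compare easily.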

Lastly we set up our x axis with 4 ticks.
    .xAxis().ticks(4);

The description above (and heaps of other stuff) is in the D3 Tips and Tricks book that can be downloaded for free (or donate if you really want to :-)).