Raspberry Pi Pico Tips and Tricks

Friday 22 February 2013

Formatting data for Sankey diagrams in d3.js


The following post is a portion of the D3 Tips and Tricks document which is free to download. To use this post in context, consider it with the others in the blog or just download the PDF and/or the examples from the downloads page :-)
-------------------------------------------------------

The following is a follow on from the previous posts on generating Sankey diagrams in d3.js and goes over some mechanisms for ingesting data.

From a JSON file with numeric link values

As explained in the previous section, data to form a Sankey diagram needs to be a combination of nodes and links.
{
"nodes":[
{"node":0,"name":"node0"},
{"node":1,"name":"node1"},
{"node":2,"name":"node2"},
{"node":3,"name":"node3"},
{"node":4,"name":"node4"}
],
"links":[
{"source":0,"target":2,"value":2},
{"source":1,"target":2,"value":2},
{"source":1,"target":3,"value":2},
{"source":0,"target":4,"value":2},
{"source":2,"target":3,"value":2},
{"source":2,"target":4,"value":2},
{"source":3,"target":4,"value":4}
]}
As we also noted earlier, the `"node"` entries in the `"nodes"` section of the JSON file are superfluous and are really only there for our benefit since D3 will automatically index the nodes starting at zero. As a test to check this out we can change our data to the following;
{
"nodes":[
{"name":"Barry"},
{"name":"Frodo"},
{"name":"Elvis"},
{"name":"Sarah"},
{"name":"Alice"}
],
"links":[
{"source":0,"target":2,"value":2},
{"source":1,"target":2,"value":2},
{"source":1,"target":3,"value":2},
{"source":0,"target":4,"value":2},
{"source":2,"target":3,"value":2},
{"source":2,"target":4,"value":2},
{"source":3,"target":4,"value":4}
]}
(for reference this file is saved as sankey-formatted-names-and-numbers.json and the html file is Sankey-formatted-names-and-numbers.html)

This will produce the following graph;
As you can see, essentially the same, but with easier to understand names.
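
To see why the `"node"` numbers can be dropped, here is a small stand-alone sketch (plain JavaScript, no D3 required) showing that a numeric `source` or `target` is simply a position in the `nodes` array. The `describeLink` helper is hypothetical and exists only for this illustration; it is not part of D3 or the Sankey plugin.

```javascript
// A cut-down copy of the data: nodes carry names only, links carry positions.
var graph = {
  nodes: [
    {name: "Barry"}, {name: "Frodo"}, {name: "Elvis"},
    {name: "Sarah"}, {name: "Alice"}
  ],
  links: [
    {source: 0, target: 2, value: 2},
    {source: 3, target: 4, value: 4}
  ]
};

// Hypothetical helper: look each end of the link up by its array position.
function describeLink(link) {
  return graph.nodes[link.source].name + " -> " +
         graph.nodes[link.target].name + " (" + link.value + ")";
}

var descriptions = graph.links.map(describeLink);
console.log(descriptions);
```

So link `0 -> 2` is Barry to Elvis purely because of where those two entries sit in the array, which also means re-ordering the nodes would silently change the diagram.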

As you can imagine, while the end result is great, the creation of the JSON file manually would be painful at best and doing something similar but with a greater number of nodes / links would be a nightmare.

So let's see if we can make the process a bit easier and more flexible.

From a JSON file with links as names

It would make things much easier, if you are building the data by hand, to have nodes with names and to have the 'source' and 'target' links use those same names as identifiers.

In other words a list of unique names for the nodes (and perhaps some details) and a list of the links between those nodes using the names for the nodes.

So, something like this;
{
"nodes":[
{"name":"Barry"},
{"name":"Frodo"},
{"name":"Elvis"},
{"name":"Sarah"},
{"name":"Alice"}
],
"links":[
{"source":"Barry","target":"Elvis","value":2},
{"source":"Frodo","target":"Elvis","value":2},
{"source":"Frodo","target":"Sarah","value":2},
{"source":"Barry","target":"Alice","value":2},
{"source":"Elvis","target":"Sarah","value":2},
{"source":"Elvis","target":"Alice","value":2},
{"source":"Sarah","target":"Alice","value":4}
]}
Once again, D3 to the rescue!

The little piece of code that can do this for us is here;
    var nodeMap = {};
    graph.nodes.forEach(function(x) { nodeMap[x.name] = x; });
    graph.links = graph.links.map(function(x) {
      return {
        source: nodeMap[x.source],
        target: nodeMap[x.target],
        value: x.value
      };
    });
This elegant solution comes from here; http://stackoverflow.com/questions/14629853/json-representation-for-d3-networks and was provided by Chris Pettitt (nice job).

So if we sneak this piece of code into here...  
d3.json("data/sankey-formatted.json", function(error, graph) {

            //  <= Put the code here.

  sankey
      .nodes(graph.nodes)
      .links(graph.links)
      .layout(32);
… and this time we use our JSON file with just names (sankey-formatted-names.json) and our new html file (sankey-formatted-names.html) we find our Sankey diagram working perfectly!
Looking at our new piece of code...
    var nodeMap = {};
    graph.nodes.forEach(function(x) { nodeMap[x.name] = x; });
… the first thing it does is create an object called `nodeMap` (The difference between an array and an object in JavaScript is one that is still a little blurry to me and judging from online comments, I am not alone).

Then for each of the `graph.nodes` entries (where `x` is each node object in turn, from the first to the last), we store that node in `nodeMap` under its name, so that a node can later be looked up by name.

Then in the next piece of code...
    graph.links = graph.links.map(function(x) {
      return {
        source: nodeMap[x.source],
        target: nodeMap[x.target],
        value: x.value
      };
    });

… we go through all the links we have and for each link we swap the `source` and `target` names for the matching node objects from `nodeMap`, keeping the `value` as it is.

Very clever.
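
If you want to convince yourself of what the mapping does, the same snippet can be run on its own against a cut-down copy of the data (three nodes and two links, made up for this illustration). This is only a sketch for poking at in the browser console; nothing here needs D3.

```javascript
var graph = {
  nodes: [{name: "Barry"}, {name: "Elvis"}, {name: "Alice"}],
  links: [
    {source: "Barry", target: "Elvis", value: 2},
    {source: "Elvis", target: "Alice", value: 2}
  ]
};

// Build a lookup table: node name -> node object.
var nodeMap = {};
graph.nodes.forEach(function(x) { nodeMap[x.name] = x; });

// Replace the names in each link with the node objects they refer to.
graph.links = graph.links.map(function(x) {
  return {
    source: nodeMap[x.source],
    target: nodeMap[x.target],
    value: x.value
  };
});

// Each link end is now the very same object that sits in graph.nodes.
console.log(graph.links[0].source === graph.nodes[0]);
console.log(graph.links[1].target.name);
```

One thing worth knowing: a name in a link that does not exactly match a node name (a typo or a stray space, for instance) is not in `nodeMap`, so that end of the link comes back as `undefined`.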

From a CSV with 'source', 'target' and 'value' info only

 In the first iteration of this section I had no solution to creating a Sankey diagram using a csv file as the source of the data.

But cometh the hour, cometh the man. Enter @timelyportfolio who, while claiming no expertise in D3 or JavaScript was able to demonstrate a [solution](http://bl.ocks.org/timelyportfolio/5052095) to exactly the problem I was facing! Well done Sir! I salute you and name the technique the timelyportfolio csv method!

So here's the cleverness that @timelyportfolio demonstrated;

Using a csv file (in this case called `sankey.csv`) that looks like this;

source,target,value
Barry,Elvis,2
Frodo,Elvis,2
Frodo,Sarah,2
Barry,Alice,2
Elvis,Sarah,2
Elvis,Alice,2
Sarah,Alice,4


 We take this single line from our original Sankey diagram code;
d3.json("data/sankey-formatted.json", function(error, graph) {
 And replace it with the following block;
d3.csv("data/sankey.csv", function(error, data) {

  //set up graph in same style as original example but empty
  graph = {"nodes" : [], "links" : []};

    data.forEach(function (d) {
      graph.nodes.push({ "name": d.source });
      graph.nodes.push({ "name": d.target });
      graph.links.push({ "source": d.source,
                         "target": d.target,
                         "value": +d.value });
     });

     // return only the distinct / unique nodes
     graph.nodes = d3.keys(d3.nest()
       .key(function (d) { return d.name; })
       .map(graph.nodes));

     // loop through each link replacing the text with its index from node
     graph.links.forEach(function (d, i) {
       graph.links[i].source = graph.nodes.indexOf(graph.links[i].source);
       graph.links[i].target = graph.nodes.indexOf(graph.links[i].target);
     });

     //now loop through each nodes to make nodes an array of objects
     // rather than an array of strings
     graph.nodes.forEach(function (d, i) {
       graph.nodes[i] = { "name": d };
     });
 The comments in the code (and they are fuller in @timelyportfolio's [original gist solution](http://bl.ocks.org/timelyportfolio/5052095)) explain the operation;
d3.csv("data/sankey.csv", function(error, data) {
 … Loads the csv file from the data directory.
  graph = {"nodes" : [], "links" : []};
 … Declares `graph` to consist of two empty arrays called `nodes` and `links`.
      data.forEach(function (d) {
      graph.nodes.push({ "name": d.source });
      graph.nodes.push({ "name": d.target });
      graph.links.push({ "source": d.source,
                         "target": d.target,
                         "value": +d.value });
     });
 … Takes the `data` loaded from the csv file and, for each row, pushes the `source` and `target` names onto the `nodes` array, then pushes the `source`, `target` and `value` onto the `links` array (the `+` in front of `d.value` converts the text read from the csv file into a number).
     graph.nodes = d3.keys(d3.nest()
       .key(function (d) { return d.name; })
       .map(graph.nodes));
 … Is a routine that Mike Bostock described on [Google Groups](https://groups.google.com/forum/#!msg/d3-js/pl297cFtIQk/Eso4q_eBu1IJ) that (as I understand it) nests each node name as a key so that it returns with only unique nodes.
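
For anyone who finds the nesting trick hard to picture, the effect can be sketched in plain JavaScript without D3. This is a stand-in that I believe gives the same result for ordinary names, not the routine itself: each name is used as a key on an object, and because an object can only hold a key once, the duplicates collapse. The `seen` and `uniqueNames` variables are names made up for this example.

```javascript
// The nodes array as it looks after the first loop: one entry per row end,
// so names are repeated.
var nodes = [
  {name: "Barry"}, {name: "Elvis"},
  {name: "Frodo"}, {name: "Elvis"},
  {name: "Barry"}, {name: "Alice"}
];

// Use each name as a key; a repeated name just overwrites the same key.
var seen = {};
nodes.forEach(function(d) { seen[d.name] = true; });

// The keys are the distinct names, as an array of strings.
var uniqueNames = Object.keys(seen);
console.log(uniqueNames);
```

Note that the result is an array of plain strings rather than an array of `{ "name": ... }` objects, which matters for the steps that come next.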
     graph.links.forEach(function (d, i) {
       graph.links[i].source = graph.nodes.indexOf(graph.links[i].source);
       graph.links[i].target = graph.nodes.indexOf(graph.links[i].target);
     });
 … Goes through each `link` entry and for each `source` and `target`, it finds the unique index number of that name in the nodes array and assigns the link source and target an appropriate number.

And finally...
     graph.nodes.forEach(function (d, i) {
       graph.nodes[i] = { "name": d };
     });
 … Goes through each node and (in the words of @timelyportfolio) “*make nodes an array of objects rather than an array of strings*” (I don't really know what that means :-(. I just know it works :-).)
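
One way to read that comment (this is my understanding rather than anything stated in the gist): after the de-duplication step `graph.nodes` is a list of plain strings, which is exactly what lets `indexOf` find a name's position. The Sankey layout, as far as I can tell, wants each node to be an object so that it can attach its own properties (position, size and so on) to it, so the last loop wraps every string back up as `{ "name": ... }`. A cut-down run of the last two steps, with made-up data:

```javascript
var graph = {
  nodes: ["Barry", "Elvis", "Alice"],   // an array of strings at this point
  links: [{source: "Barry", target: "Alice", value: 2}]
};

// Swap each name in the links for its position in the array of strings.
graph.links.forEach(function(d) {
  d.source = graph.nodes.indexOf(d.source);
  d.target = graph.nodes.indexOf(d.target);
});

// Wrap each string up as an object, giving an array of objects.
graph.nodes.forEach(function(d, i) {
  graph.nodes[i] = {name: d};
});

console.log(JSON.stringify(graph));
```

The result has the same shape as the hand-written JSON file with numeric link values from the start of this post.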

There you have it. A Sankey diagram from a csv file. Well played @timelyportfolio!

Both the html file for the diagram (`Sankey.formatted-csv.html`) and the data file (`sankey.csv`) can be found in the downloads section of d3noob.org.

From MySQL as link information only automatically.

So, here we are. Faced with a dilemma of trying to get my csv formatted links into a Sankey diagram. In theory we then need to go through our file, identify all the unique nodes and format the entire blob into JSON for use.

There must be a better way!

Well, I'm not going to claim that this is any better since it's a little like cracking a walnut with a sledgehammer. But to a man with just a sledgehammer, everything’s a walnut.

So, let's use our newly developed MySQL and PHP skills to solve our problem. In fact, let's make it slightly harder for ourselves. Let's imagine that we don't even have a value associated with our data, just a big line of source and target links. Something like this;

source,target
Barry,Elvis
Barry,Elvis
Frodo,Elvis
Frodo,Elvis
Frodo,Sarah
Frodo,Sarah
Barry,Alice
Barry,Alice
Elvis,Sarah
Elvis,Sarah
Elvis,Alice
Elvis,Alice
Sarah,Alice
Sarah,Alice
Sarah,Alice
Sarah,Alice

First things first, just as we did in the example on using MySQL, import your csv file into a MySQL table which we'll call `sankey1` in database `homedb`.

Now we want to write a query that pulls out all the DISTINCT names that appear in the 'source' and 'target' columns. This will form our 'nodes' portion of the JSON data.
SELECT DISTINCT(`source`) AS name FROM `sankey1`
UNION
SELECT DISTINCT(`target`) AS name FROM `sankey1`
GROUP BY name
This query actually mashes two separate queries together where each returns DISTINCT instances of each `source` and `target` from the source and target columns. By default, the UNION operator eliminates duplicate rows from the result which means we have a list of each node in the table.
Exxxeellennt....... (channelling Mr Burns)

Now we run a separate query that pulls out each distinct 'source' and 'target' combination and the number of times (COUNT(*)) that it occurs.
SELECT `source` AS source, `target`  as target, COUNT(*) as value 
FROM `sankey1`
GROUP BY source, target 
This query gets all the sources and all the targets and groups them by first the source and then the target. Each line is therefore unique and the `COUNT(*)` counts the number of times that each unique combination occurs.
That was surprisingly easy wasn't it?
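
If it helps to see what the grouping and counting amounts to, here is the same idea sketched in plain JavaScript on a handful of rows. This is only an illustration of what the query computes, under the assumption that the rows have already been read into an array of objects; the `counts` and `links` variables are names made up for the example.

```javascript
// A few source/target rows with repeats, as they might come out of the file.
var rows = [
  {source: "Barry", target: "Elvis"},
  {source: "Barry", target: "Elvis"},
  {source: "Sarah", target: "Alice"},
  {source: "Sarah", target: "Alice"},
  {source: "Sarah", target: "Alice"},
  {source: "Sarah", target: "Alice"}
];

var counts = {};   // one entry per distinct source/target pair
var links = [];    // the same entries, in first-seen order

rows.forEach(function(d) {
  // Build a key from both columns, like GROUP BY source, target.
  var key = JSON.stringify([d.source, d.target]);
  if (!counts[key]) {
    counts[key] = {source: d.source, target: d.target, value: 0};
    links.push(counts[key]);
  }
  counts[key].value += 1;   // like COUNT(*)
});

console.log(links);
```

Each distinct pair ends up as one link whose `value` is the number of times it appeared.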

MySQL is good like that for the simple jobs, but of course we're a long way from finished since at this stage all we have is what looks like two tables in a spreadsheet.

So now we turn to PHP.

Remember from our previous exposure, we described PHP as the glue that could connect parts of web pages together. In this case we will use it to glue our MySQL database to our JavaScript.

What we need it to do is to carry out our queries and return the information in a format that d3.js can understand. In this instance we will select JSON as it's probably the most ubiquitous and it suits the format of our original manual data.

Let's cut to the chase and look at the code that we'll use.
<?php
    $username = "homedbuser"; 
    $password = "homedbuser";   
    $host = "localhost";
    $database="homedb";
    
    $server = mysql_connect($host, $username, $password);
    $connection = mysql_select_db($database, $server);

    $myquery = "
SELECT DISTINCT(`source`) AS name FROM `sankey1`
UNION
SELECT DISTINCT(`target`) AS name FROM `sankey1`
GROUP BY name
";
    $query = mysql_query($myquery);
    
    if ( ! $query ) {
        echo mysql_error();
        die;
    }
    
    $nodes = array();
    
    for ($x = 0; $x < mysql_num_rows($query); $x++) {
        $nodes[] = mysql_fetch_assoc($query);
    }

    $myquery = "
SELECT `source` AS source, `target`  as target, COUNT(*) as value 
FROM `sankey1`
GROUP BY source, target 
";
    $query = mysql_query($myquery);
    
    if ( ! $query ) {
        echo mysql_error();
       die;
    }
    
    $links = array();
    
    for ($x = 0; $x < mysql_num_rows($query); $x++) {
        $links[] = mysql_fetch_assoc($query);
    }

echo "{";
echo '"links": ', json_encode($links), "\n";
echo ',"nodes": ', json_encode($nodes), "\n";
echo "}";

    mysql_close($server);
?>
Astute readers will recognise that this is very similar to the script that we used to extract data from the MySQL database for generating a simple line graph. If you haven't checked it out, and you're unfamiliar with PHP, you will want to read that section first.

We declare all the appropriate variables that we will then use to connect to the database, then we connect to the database and run our query.

After that we store the nodes data in an array called `$nodes`.

Then we run our second query (we don't close the connection to the database since we're not finished with it yet).

The second query returns the link results into a second array called `$links` (pretty imaginative).

Now we come to a part that's a bit different. We still need to echo out the data in the same way that was required for our line graph, but in this case we need to add the data together with the associated `links` and `nodes` identifiers.  
echo "{";
echo '"links": ', json_encode($links), "\n";
echo ',"nodes": ', json_encode($nodes), "\n";
echo "}";
(if you look closely, the syntax will produce JSON formatted output)
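
To check the shape of that output without a database, the sketch below parses a hand-typed sample of what I would expect the script to echo for a single link. Two assumptions to flag: the sample text is made up for the illustration, and I have written the `value` in quotes because, as far as I know, the old `mysql_fetch_assoc` function hands every column back as a string. The `+` conversion is there to turn it back into a number, just as the csv example does.

```javascript
// Sample text in the shape the echo statements produce: links first, then nodes.
var text =
  '{"links": [{"source":"Barry","target":"Elvis","value":"2"}]\n' +
  ',"nodes": [{"name":"Barry"},{"name":"Elvis"}]\n' +
  '}';

var graph = JSON.parse(text);

// Convert any text values into numbers.
graph.links.forEach(function(d) { d.value = +d.value; });

console.log(graph.nodes.length, graph.links[0].value);
```

Note that the links here identify their nodes by name, so the name mapping code from the 'links as names' section is still needed before the data is handed to the Sankey layout.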

So lastly, we need to call this PHP script from our html file in the same way that we did for the line graph. So amend the html file to change the loading of the JSON data to be from our PHP file thusly;
d3.json("php/sankey.php", function(error, graph) {
And there you have it! So many ways to get the data.

Both the PHP file (sankey.php) and the html file (sankey-mysql-import.html) are available in the downloads section on d3noob.org.

Sankey diagram case study

So armed with all this new found knowledge on building Sankey diagrams, what can you do?

Well, I suppose it all depends on your data set, but remember, Sankey diagrams are good at flows, but they won't do loops / cycles easily (although there has been some good work done in this direction here  and here).

So let's choose a flow.

In this case we've selected the flow of data that represents a view of global, anthropogenic greenhouse gas (GHG) emissions. The diagram is a re-drawing of the excellent diagram from the World Resources Institute and as such my version pales in comparison to theirs.

However, the aim is to play with the technique, not to emulate :-).

So starting with the data presented in the original diagram, we have to capture the links into a csv file. I did this the hard way (since there didn't appear to be an electronic version of the data) by reading the graph and entering the figures into a csv file. From here we import it into our MySQL database and then convert it into sankey formatted JSON by using our PHP script that we played with in the example of extracting information from a MySQL database. In this case instead of needing to perform a `COUNT(*)` on the data, it's slightly easier since the value is already present.

Then, because we want this diagram to be hosted on Gist and accessible on bl.ocks.org, we run the PHP file directly into the browser so that it just shows the JSON data on the screen. We then save this file with the suffix `.json` and we have our data (in this case the file is named `sankeygreenhouse.json`).
Then we amend our html file to look at our new `.json` file and voila!
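
Since a hand-entered data set is easy to get subtly wrong, a quick sanity check on the saved file can save some head scratching. The sketch below is one possible check, not something from the original code: `checkGraph` is a hypothetical helper, and it assumes the links refer to nodes by name (as our PHP output does). It looks for links that name a node that isn't in the list, and for links whose source and target are the same, which the comments below report sends sankey.js into an infinite loop.

```javascript
// Hypothetical helper: returns a list of problems found (empty if none).
function checkGraph(graph) {
  if (!graph || !Array.isArray(graph.nodes) || !Array.isArray(graph.links)) {
    return ["graph needs both a nodes array and a links array"];
  }
  var problems = [];
  var names = Object.create(null);   // set of known node names
  graph.nodes.forEach(function(n) { names[n.name] = true; });

  graph.links.forEach(function(l, i) {
    if (!names[l.source]) problems.push("link " + i + ": unknown source " + l.source);
    if (!names[l.target]) problems.push("link " + i + ": unknown target " + l.target);
    if (l.source === l.target) problems.push("link " + i + ": source and target are the same");
  });
  return problems;
}

// Made-up data with two deliberate mistakes.
var sample = {
  nodes: [{name: "A"}, {name: "B"}],
  links: [
    {source: "A", target: "A", value: 1},   // self link
    {source: "A", target: "C", value: 1}    // "C" is not a node
  ]
};
var problems = checkGraph(sample);
console.log(problems);
```

It would be called on the parsed data before the name mapping and layout steps, while the links still hold names.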



Sankeytastic!

You can find this as a live example and with all the code and data on bl.ocks.org.


The above description (and heaps of other stuff) is in the D3 Tips and Tricks document that can be accessed from the downloads page of d3noob.org (Hey! It's free. Why not?)

43 comments:

  1. You have just inspired me to give Sankey a fresh new face in Dex. Sankey is a bit more complex than most D3 visuals, so I had shuddered at thinking about creating a more flexible view.

    Thanks!

    Pat

    Replies
    1. Good show! Dex is seriously awesome by the way. Really nice work.

  2. Just found this as I was working on a d3 force-directed layout from .csv. I am not a javascript or d3 expert, but I think I might have a method for filling in your blank. I'll keep you updated on progress.

    Thanks so much for your site. It is very helpful.

  3. Here is my ugly but functioning code http://bl.ocks.org/timelyportfolio/5052095. Of course it needs more substantial tests than the simple .csv provided, but I think it will work.

    Just as another note, I discovered that the sankey.js code gets stuck in an infinite loop if source and target are the same. Maybe this will help someone in the future as the try these out.

    Replies
    1. Fantastic! Great work.
      I shall have to have a play.
      Yeah, I understand about the loop business, but there are a couple of folks who have worked on solutions. I haven't played with either yet.
      (see http://bl.ocks.org/cfergus/3956043) and (http://bl.ocks.org/kunalb/4658510)

      2. Well. I've had a play and I like it a lot. There are parts where I don't know exactly how it works, but it sure does work!
      I've updated the post above to include a description of the code and I've updated the book D3 Tips and Tricks as well (with what I hope is appropriate attribution :-).
      Many thanks @timelyportfolio. The d3.js community salutes you!

  4. First of all, thank you for doing this. I've been looking for something like this for a while. I tried it with a simple data set that I put together manually, and everything seemed to work great. However, when I tried it on a much larger sample set that was compiled from information pulled from my existing database, my browser started throwing script time outs. Do you have any tips and/or tricks when dealing with a lot of data set?

    Replies
    1. I'm glad you found it useful.
      I've found similar problems to what you describe when trying to use very large data sets.
      For force diagrams, I think the browser is having to do a lot of work (all those interacting forces seem like a lot of maths), and the more nodes you have, the bigger the problem. However, I think that the following might help.
      1. Use Google Chrome. Of the three browsers I have used for D3 (Chrome, Firefox and IE (don't laugh)) Chrome is streets ahead in rendering speed IMHO. I also know that this is out of your control if you're creating a graph for others to use, but if you're doing it for yourself, it's an option.
      2. Don't use opacity on objects in the graph. I know it sounds trite, but again, that's a lot of work for the browser to carry out.
      Neither of these is a perfect solution, but for really large data sets, you're fighting a difficult battle for any visualization. I'd be interested if any other readers have an opinion.
      Alternatively, if you're able to post your code into a question on Stack Overflow, there are some very smart people in that forum who may be able to help.
      Good luck.

  5. Thanks for your quick response. I tried using different browsers thinking the same thing as you that Chrome might be faster than the others, but it still timed out. I'll try out the opacity thing to see if it helps. I'll let you know if I find something that works.

  6. Great article! I'm somewhat new to D3 but loving it.

    Question: Is there a relatively straightforward way to display the values by default? (as opposed to hover-only)

    Replies
    1. Good question. I have no doubt it would be possible, but I don't know how easy it would be. There is also the very real possibility that because of the crossing nature of the links, the labels would most likely end up interfering considerably. I'd be interested in seeing an example :-)

  7. Thank you for these instructions and explanations. I have a working D3 Sankey utilizing php and mysql. The php script takes a date range input and queries the mysql database and produces the correct json output. When the date parameters are hard coded in the php call, the webpage comes up displaying the correct sankey. I would like to enter a date range in a form field and have it trigger the php to re-query the database and have the sankey update automatically. However I cannot figure out how to get the sankey to update. I reviewed "data-load-revert-button.html" but I am having a hard time correlating the updateData section of the graph to a Sankey. Currently it just creates another Sankey on top of the original one. Can you give me any tips on how to redraw? Is this even possible?

    Replies
    1. Thanks Cindy. This should be possible, but I'm afraid that it's beyond my experience. I would commend you to the following pages that should steer you in the right direction;
      http://mbostock.github.io/d3/tutorial/circle.html
      http://knowledgestockpile.blogspot.co.nz/2012/01/understanding-selectall-data-enter.html
      https://github.com/mbostock/d3/wiki/Selections
      Good luck!

  8. "I have to admit that I don’t know what the sort line (.sort(function(a, b) { return b.dy - a.dy; });) is supposed to achieve. Again, I’d be interested know from any more knowledgeable readers. I’ve tried changing the values to no apparent affect."

    The sort function makes sure the link for which the target has the highest y coordinate departs first out of the rectangle. Meaning if you have flows of 30,40,50 out of node 1, heading towards nodes 2, 3 and 4, with node 3 located above node 2 and that above node 4, the outflow order from node 1 will be 40,50,30. This makes sure there are as few crossings of flows as possible.

    Cheers,

    Thanks for your work man, it inspired me to use in my research!

    Replies
    1. Hey! Thanks for the explanation! I'll edit the entry for the book .Many thanks and I'm glad it was useful in your research :-)

    2. Thanks again. Just to let you know that I've included your explanation in the book (with appropriate credit). Cheers

  9. These examples have helped tremendously, but I'm stuck as I have data formatted in the final example (columns with names but no values) but do not have access to php or mysql. Would it be possible to solve the final example using purely JavaScript, as you did in the others?

    Replies
    1. Hmm.... Although I'm not 100% sure, I think you might be able to use the d3.nest function to let JavaScript do the hard work for you. Check that out for a start. Apologies for the late reply.

  10. Check out my HTML5 D3 Sankey Diagram Generator - complete with self-loops and all :) http://sankey.csaladen.es

    Replies
    1. Awesome! That is some clever work. Congrats!

    2. Thanks! Using your tutorial, people can also add "fill", "layer" and "value" attributes on nodes list dictionary to each entry in my app, to set the node color, its placement in the x domain or to have a fixed valued other than the max sum of inflows or outflows.

  12. How to edit .js file to display two different values on sankey?

    Replies
    1. Probably the best to check out the example for adding tool tips here https://leanpub.com/D3-Tips-and-Tricks/read#leanpub-auto-adding-tooltips.

  13. Can this code be converted to CSS and represent online? I want to code diagrams to the web :)

    Thanks
    Regards,
    Creately

    Replies
    1. I know that CSS is really powerful, but I have never really played with it. My instinct says that it probably could, but I have no experience of it sorry.

  14. Instead of the above, can we load data from HTML forms instead or a simple, interactive Sankey?

    Replies
    1. An earlier commenter has something that I think you're hinting at here http://sankey.csaladen.es/ It's very well done and worth a look. Failing that check out the range of data request options here https://github.com/d3/d3/wiki/Requests

  15. Hi there, I have questions about the data, when I tried to use sankeygreenhouse.json it gave me the following error:

    (index):60 Uncaught TypeError: Cannot read property 'nodes' of undefined

    I tried to use that data for sankeygreenhouse.json with 5 nodes and 5 links, and it works fine. However, when I tried to copy all the data, it gave me that following error.. Do you know why?
    does this relate to the amount of dataset? thank you!

    Replies
    1. Hmmm.... Hard to say what might be causing the problem, but I doubt that it would be as a result of the amount in the data-set. That is fairly trivial. The first thing that springs to mind would be a problem with either the data (was there some corruption in the file) or with the variables used in the code (did they match the expected nomenclature in the data file (source, target, etc).

  16. Hello,
    I am trying to generate a interactive Sankey diagram using the data from Microsoft SQL Server database. I am trying to create a page where a user can select an input from a drop down and this input will be passed to the SQL server and the corresponding data for the input would be returned.This data which is returned should be used to generate the Sankey diagram.Could you please show me an example or guide me of how to achieve the same.I appreciate your time and effort and thanks in advance for helping me out.
    Best regards

    Replies
    1. Wow! That sounds like an interesting project. I don't have an example that I could point to, but you are combining a range of the technologies demonstrated in the book, so you are starting in the right direction. However for the level of detail you describe, I can't think of a project that would come close sorry.

    2. Good morning,
      I am happy to say that I am done with the project.I generated a Sankey diagram with data from SQL server DB. The key to pull a dynamic sankey is to pass the json in a variable.In my case I wrote a php program to store the data from SQL server 2012 in a variable and I passed this variable to the 'Graph' variable. I removed the d3.json function as the role of the function is to just parse the json ,so if we have a fully formatted json, there is no need of this d3.json function so i removed this d3.json function and passed a variable directly and voila!! it worked like charm.I have to thank you for giving a very informative tutorial of Sankey.
      There are other features that I have incorporated in the sankey:
      1. pulling the position of the node from the database
      2. pulling the color from the database
      3. made the nodes and links clickable and navigate to other URLs

      I loaded the json in the Mydata variable and directly passed it to Graph and below is the code
      // load the data--Getting the data from the 'Mydata' Variable from the php code and using it in our graph variable
      var graph = <?php echo ($Mydata);?>;

      Best regards,
      Sathappan Ramanathan

  17. Please can someone tell me how to arrange the data in Excel sheet for Sankey diagram?

    Replies
    1. OK, so going straight from an Excel spreadsheet to a Sankey diagram following this example might be difficult. But what you could do is export the data from a spreadsheet into a CSV format and then manipulate it in line with this section https://leanpub.com/D3-Tips-and-Tricks/read#leanpub-auto-from-a-csv-with-source-target-and-value-info-only

  18. @D3noob : What do I have to do if I want to implement something similar to this in a PHP Laravel project? I tried using the code from index.html in a blade.php file but I'm not sure, as I am getting a blank page

    Replies
    1. Wow! That sounds interesting, but I have no experience with laravel I'm afraid. Sorry

  19. I know this is an old post but I'm seeing the most frustrating problem with the second type of data: where you have csv of source,target,value. In the part of the code which builds the nodes and links array, my source and target are somehow becoming undefined. If I log the EXACT same object to the console, it's fine, but when I push the object to the links array and then log THAT to the console, the 'source' and 'target' fields are undefined while the value field remains the same. I know this is a long shot, but do you have any insight on why this could be happening? Any direction you could point me on this would be greatly appreciated.

    Replies
    1. I feel your pain, but the good news is that this will most likely be something simple that has escaped your notice with the data. I say this because I have fallen into the same pit more than once myself. What I suggest you do is test with the exact code and especially the data that is presented from the downloads with the book or from the bl.ocks.org page. Assuming that this works, swap in your data and see what happens. I expect that there will be something simple and infuriating in there. If this doesn't work, post the code and data into a question on stack overflow. The clever minds there will find the solution I am sure. Best of luck
