D3 Tips and Tricks v4

Thursday, 13 December 2012

Getting the Data

We're going to jump forward a little bit here to the bit of the JavaScript code that loads the data for the graph.

I'm going to go out of the sequence of the code here, because if you know what the data is that you're using, it will make explaining some of the other functions that are coming up much easier.

So the piece that grabs the data is this bit.

d3.tsv("data/data.tsv", function(error, data) {
    data.forEach(function(d) {
        d.date = parseDate(d.date);
        d.close = +d.close;
    });

And in fact it's a combination of a few bits and another bit that isn't shown! But let's take it one step at a time :-)

Ok... There are lots of different ways that we can get data into our web page to turn into graphics. The way that you'll want to use will probably depend more on the format that the data is in than on the mechanism you want to use for importing it.

For instance, if it's only a few points of data we could include the information directly in the JavaScript.

That would make it look something like;

var data = [
    {date:"1-May-12",close:"58.13"},
    {date:"30-Apr-12",close:"53.98"},
    {date:"27-Apr-12",close:"67.00"},
    {date:"26-Apr-12",close:"89.70"},
    {date:"25-Apr-12",close:"99.00"}
];

The format of the data shown above closely follows JSON (JavaScript Object Notation; strict JSON would also wrap the property names such as date and close in double quotes) and it's a great way to include data since it's easy for humans to read what's in there and it's easy for computers to parse the data out.
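As a small illustration of how easily a computer can parse this kind of data, the sketch below runs JavaScript's built-in JSON.parse over a JSON string holding two of the points above. The variable names (jsonText and points) are made up for the example, and the property names are double-quoted because strict JSON expects that.

```javascript
// A JSON string: in strict JSON the property names are double-quoted too.
var jsonText = '[{"date":"1-May-12","close":"58.13"},' +
               '{"date":"30-Apr-12","close":"53.98"}]';

// JSON.parse is built into JavaScript; it returns an array of objects.
var points = JSON.parse(jsonText);

console.log(points.length);    // the number of data points
console.log(points[0].date);   // the first point's date, still a string
console.log(points[0].close);  // the first point's close, still a string
```

Note that both values come back as text, which is why some tidying up is still needed before they can be graphed.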

But if you've got a fair bit of data or if the data you want to include is dynamic and could be changing from one moment to the next, you'll want to load it from an external source. That's when we call on D3's 'Request' functions.

A 'Request' is a function that instructs the browser to reach out and grab some data from somewhere. It could be stored locally (on the web server) or somewhere out on the Internet.

There are different types of requests depending on the type of data you want to ingest. Each type of data is formatted with different rules, so the different requests interpret those rules to make sure that the data is returned to the D3 processing in a format that it understands. You could therefore think of the different 'Requests' as translators and the different data formats as being foreign languages.

The different types of data that can be requested by D3 are;
  • text - A plain old piece of text that can optionally be encoded in a particular way (see the D3 API).
  • json - This is the aforementioned JavaScript Object Notation.
  • xml - Extensible Markup Language is a language that is widely used for encoding documents in a human readable form.
  • html - HyperText Markup Language is the language used for displaying web pages.
  • csv - Comma Separated Values is a widely used format for storing data where plain text information is separated by (wait for it) commas.
  • tsv - Tab Separated Values is a widely used format for storing data where plain text information is separated by a tab-stop character.

Details on these ingestion methods and the formats for the requests are explained well on the D3 Wiki page here - https://github.com/mbostock/d3/wiki/Requests. In this particular script we will look at the tsv request method.

Now, it is important to note here that this is not an exhaustive list of what can be ingested. If you've got some funky data in a weird format, you can still get it in, but you will most likely need to stand up a small amount of code somewhere else in your page to do the conversion (we will look at this process when describing getting data from a MySQL database).

So, back to our request...

d3.tsv("data/data.tsv", function(error, data) {
    data.forEach(function(d) {
        d.date = parseDate(d.date);
        d.close = +d.close;
    });

The first line of that piece of code invokes the d3.tsv request (d3.tsv) and points it at the data file that should be loaded ("data/data.tsv"). This is referred to as the 'url' (uniform resource locator) of the file. In this case the file is stored locally, but the url could just as easily point to a file somewhere on the Internet.

The format of the data in the data.tsv file looks a bit like this;

date    close
1-May-12    58.13
30-Apr-12   53.98
27-Apr-12   67.00
26-Apr-12   89.70
25-Apr-12   99.00

(although the file is longer, with about 26 data points). The 'date' and the 'close' heading labels are separated by a tab, as is each subsequent date and number. Hence the 'tab separated values' :-).
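To make that structure concrete, here is a rough sketch of how tab separated text can be turned into an array of row objects keyed by the heading labels. This is not D3's own code, just a hand-rolled stand-in, and parseTsv is a hypothetical helper named for the example only.

```javascript
// Hypothetical helper (not part of D3): turn tab separated text into
// an array of objects, using the first line as the property names.
function parseTsv(text) {
    var lines = text.trim().split("\n");
    var headings = lines[0].split("\t");
    return lines.slice(1).map(function(line) {
        var values = line.split("\t");
        var row = {};
        headings.forEach(function(heading, i) {
            row[heading] = values[i];   // every value is still a string
        });
        return row;
    });
}

// Two rows of the sample data, with "\t" standing in for the tab character.
var sample = "date\tclose\n1-May-12\t58.13\n30-Apr-12\t53.98\n";
var rows = parseTsv(sample);

console.log(rows.length);    // two rows of data
console.log(rows[0].date);   // "1-May-12"
console.log(rows[0].close);  // "58.13" (a string, not yet a number)
```

The point to take away is the shape of the result: one object per line of the file, with a 'date' and a 'close' property on each.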

The next part is part of the coolness of JavaScript. With the request made and the file requested the script is told to carry out a function on the data (which will now be called 'data').

function(error, data) {

Now, there are actually more things that get acted on as part of the function call, but the ones we will consider here are the following lines;

    data.forEach(function(d) {
        d.date = parseDate(d.date);
        d.close = +d.close;
    });

This block of code simply ensures that all the values that are pulled out of the tsv file are set and formatted correctly. The first line takes the data variable that is being dealt with (called, slightly confusingly, 'data') and tells the block of code that for each row within the 'data' array it should carry out a function. Each row is handed to that function under the name 'd'.

data.forEach(function(d) { 

The information in the array can be considered as being stored in rows with each row consisting of two values. One value for 'date' and another value for 'close'.

So the function is pulling out values of 'date' and 'close' one row at a time.
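If the 'one row at a time' idea is new, the following sketch may help. It uses a two-row array written directly into the script (rather than loaded from a file) and a made-up 'seen' array to record what the function is handed on each pass.

```javascript
// A small stand-in for the loaded data: one object per row.
var data = [
    {date: "1-May-12", close: "58.13"},
    {date: "30-Apr-12", close: "53.98"}
];

var seen = [];   // made-up array, only here to record each visit

data.forEach(function(d) {
    // 'd' is the current row, so d.date and d.close belong to that row.
    seen.push(d.date + " -> " + d.close);
});

console.log(seen);
```

Each entry in 'seen' corresponds to one call of the function, in the same order as the rows in the array.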

Each time it gets a value of 'date' and 'close' it carries out the following operations;

d.date = parseDate(d.date);

This converts the specific value of date being looked at (d.date) into a date format that D3 can process and do stuff with, via a separate function called 'parseDate'. Now, the 'parseDate' function is defined in a separate part of the script, and we will examine that later. So for the moment, just be satisfied that it takes the raw date information from the tsv file in a specific row and converts it into a format that D3 can then process. That value is then re-saved in the same variable space.
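Until we get to the real 'parseDate', it may help to see the kind of job it does. The sketch below is a hand-written stand-in that does not use D3 at all; the name parseDateSketch, the monthNames array and the assumption that every year falls between 2000 and 2099 are all mine, chosen to suit dates shaped like "1-May-12".

```javascript
// Hypothetical stand-in for parseDate, written without D3.
var monthNames = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

function parseDateSketch(text) {
    var parts = text.split("-");                // e.g. ["1", "May", "12"]
    var day = +parts[0];
    var month = monthNames.indexOf(parts[1]);   // 0 for Jan ... 11 for Dec
    var year = 2000 + (+parts[2]);              // assumes years 2000-2099
    return new Date(year, month, day);          // midnight, local time
}

var row = {date: "1-May-12", close: "58.13"};
row.date = parseDateSketch(row.date);   // re-saved in the same place

console.log(row.date instanceof Date);
console.log(row.date.getFullYear(), row.date.getMonth(), row.date.getDate());
```

After the call, row.date is no longer a piece of text but a proper JavaScript Date object, which is the sort of value a time axis needs.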

The next line then sets the 'close' value to a numeric value (if it isn't already) using the '+' operator.

d.close = +d.close;

This is good practice because d3.tsv hands every value over as a string of text, so a number pulled out of the data will not automagically be recognised as a number. The '+' operator will ensure that it is.
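A quick sketch of why this matters: while a value is still a string, '+ 1' joins text together instead of adding. The object below is a made-up single row, using the 99.00 value from the sample data.

```javascript
// A made-up row whose 'close' value is still text, as loaded from a file.
var row = {close: "99.00"};

var before = row.close + 1;     // text is joined: "99.001"

row.close = +row.close;         // the '+' operator converts it to a number

var after = row.close + 1;      // real arithmetic: 100

console.log(typeof row.close);  // "number"
console.log(before, after);

// Text that isn't a number at all converts to NaN ('not a number').
console.log(isNaN(+"abc"));     // true
```

So if a graph ever looks wildly wrong, checking that the values really are numbers is a good first step.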

So, at the end of that section of code, we have gone out and picked up a file with data in it of a particular type (tab separated values) and ensured that it is formatted in a way that the rest of the script can use it correctly.

Now, the astute amongst you will have noticed that in the first line of that block of code (d3.tsv("data/data.tsv", function(error, data) {) we opened a normal bracket ( ( ) and a curly bracket ( { ), but we never closed them. That's because they stay open until the very end of the file. That means that all those blocks that occur after the d3.tsv bit are referenced to the 'data' array. Or put another way, it uses 'data' to draw stuff!
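There is a practical reason for keeping everything inside those brackets: the request takes time to come back, and the rest of the script carries on in the meantime. The sketch below imitates that with a hypothetical fakeTsv function (a stand-in built on setTimeout, not D3's code) and a made-up 'order' array that records what ran when.

```javascript
var order = [];   // made-up array recording the order things happen in

// Hypothetical stand-in for d3.tsv: hands the rows to the callback later,
// the way a real request does once the file has arrived.
function fakeTsv(url, callback) {
    setTimeout(function() {
        callback(null, [{date: "1-May-12", close: "58.13"}]);
    }, 0);
}

fakeTsv("data/data.tsv", function(error, data) {
    // 'data' only exists in here, so the drawing code lives in here too.
    order.push("inside the callback with " + data.length + " row(s)");
    console.log(order);
});

// This line runs first, before any data has arrived.
order.push("code after the request");
```

When it runs, "code after the request" is recorded before the callback's entry, which is why anything that needs 'data' has to sit inside the function.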

But anyway, let's get back to figuring out what the code is doing by jumping back to the end of the margins block.

16 comments:

  1. please give us a link where we can download data.tsv

  2. Sure thing. As this is a continuation of previous posts describing the generation of the script, the link appears on a previous page (sorry about that). You can get it directly from the zipped file here https://dl.dropbox.com/u/101577503/d3noob.zip (or from the link on the download page here http://www.d3noob.org/p/d3noob-downloads.html ) Enjoy!

  3. Hi, I was wondering if you could help me to call back a unique data set for a scatterplot visualisation? I am very close to completing the visualisation but am having difficulty calling back my data. Thanks

    Replies
    1. I'm afraid that I would have difficulty in offering to help too much at the moment (time pressures, sorry) but I can thoroughly recommend posting your question onto Stack Overflow (http://stackoverflow.com/questions/tagged/d3.js). There are plenty of clever people there only too eager to help out. Post the link to the question back here if you submit one, and that may help drum up some additional assistance. Good luck.

  4. Hey, I actually have a question about exporting data to a file. How do I send data back from js to a text file, Excel or CSV? Thanks for the response in advance

    Replies
    1. Really good question. I've never done that before. I would start with some of the stack overflow examples and go from there. (http://stackoverflow.com/questions/14964035/how-to-export-javascript-array-info-to-csv-on-client-side, http://stackoverflow.com/questions/921037/jquery-table-to-csv-export). I'd be interested to hear how you get on. Good luck.

  5. Hello I have problem i have a tsv file on server side and i have to plot the d3 widget on client side how i can do so using jsp and apache.
    I am new to all these

    Replies
    1. Sorry, I'm unfamiliar with jsp, but d3 shouldn't have any problem accessing a tsv file on the server side. If you can provide a sample of the problem you're having on Stack Overflow, someone should be able to help who is familiar with the technologies you're using. Good luck.

  6. Hello,
    Can I retrieve data from localStorage using d3? I have tried to do it but I couldn't.
    Please, do you have any idea how I could retrieve data from local storage and show it in a chart?

    thanks in advance

    Replies
    1. The best advice I would give is to ask a well worded question on stack overflow outlining the problem you're seeing. However, there will be some basics you will need to check over first. Start with a simple example that someone has posted online. Are you running a local web server (not completely essential, but very close to it) and when you post your question, take your time when writing it and provide the code you're using. Good luck.

  7. Hi,
    I have a time format like "2015-02-16T11:26:51+00:00" in my json data file. How can I retrieve the time from it?

    Thanks
    -Kamal

    Replies
    1. Thanks Kamal. Apologies for the late reply. I assume that you have most likely moved on from the problem, but for future readers, you have raised a good question.
      The line in the 'forEach' loop that looks like this; d.date = parseDate(d.date);, goes to a function called 'parseDate'. This function takes the presented variable and converts it into a time value that d3 understands.
      In your case the 'parseDate' function would look a little like; var parseDate = d3.time.format("%Y-%m-%dT%H:%M:%S+00:00").parse;
      Check the examples in the following section https://leanpub.com/D3-Tips-and-Tricks/read#leanpub-auto-formatting-the-date--time which also includes an explanation of the time formatters.

  8. Hai,
    is it mandatory to use parseDate in forEach.
    can we do it without using parseDate

    Replies
    1. Good question. Sorry for the delay in replying (I know that being over a year late, 'sorry' probably doesn't cut it, but please accept my apology anyway). It's not mandatory, but if you want to use a variable and have it represented as time rather than an ordinal value it would be strongly advised.

  9. I have a tsv file that is called data but I have three variables per line; shape, coordinate and name. I want to read this code on D3.js. I wrote a piece of code of following yours. Would it work for what I want to do?
    Thanks in advance!

    Code:
    data.tsv("desktop/data.tsv", function(error, date){
    data.forEach(function(d){
    d.shape = parseShape(d.shape);
    d.coordinate = parseCo(d.coordinate);
    d.name = parseName (d.name);
    d.total = d.shape + d. coordinate + d.name
    }
    };

    Replies
    1. Sorry for the delayed reply. The first line where you have ' function(error, date)' is ok, but you probably mean to have ' function(error, data)' (data instead of date). d3.js will read in the information and give it the variable name data and then the 'data.forEach' process will go through each grouping in 'data' and carry out the functions that you have applied there. So first things first. You will need to change 'date' to 'data' in that first line. Good luck
