One of the things I wanted to play around with was visualizing this site's content with d3. But first I needed to create something that would generate data from the site (any domain site, actually). It's easy enough to modify for non-domain sites, but I'm starting with domain sites, since that's what I have.
To do this we'll use a couple of script services.
Ultimately this data will be used for visualization; I'll cover that in a separate section. First I'm going to scrape the site, looking for and counting occurrences of specific tags and reporting them. That way we can generate some visualizations showing which topics are related and where to find them. The web app - tagsite - takes these URL arguments
Let's take an example (it does take a while to run - there's a lot of content). This will create some relationship data for each page on the site for the given tags, and return straight JSON.
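As a rough sketch of what calling the web app looks like from Apps Script: you assemble the query string from the URL arguments and fetch it. The base URL and parameter names below are placeholders, not the actual tagsite arguments.

```javascript
// Build a web app URL from a parameter object.
// The base URL and parameter names here are illustrative only.
function buildTagsiteUrl(base, params) {
  var qs = Object.keys(params).map(function (k) {
    return encodeURIComponent(k) + '=' + encodeURIComponent(params[k]);
  }).join('&');
  return base + '?' + qs;
}

// In Apps Script you would then fetch and parse the JSON result:
// var url = buildTagsiteUrl('https://script.google.com/.../exec', { tags: 'd3,gas' });
// var result = JSON.parse(UrlFetchApp.fetch(url).getContentText());
```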
You are going to get back an array, one item for each page in the web site, the first element of which looks like this. The counts are the number of times each synonym is encountered on a given page.
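Once you have that array, a typical first step is to aggregate the per-page counts into site-wide totals per tag. This is a minimal sketch; it assumes each array item carries a `counts` object keyed by tag name, which may differ from the actual property names in the returned JSON.

```javascript
// Sum the per-page tag counts across the whole result array.
// Assumes each page item has a counts object: { tagName: occurrences, ... }
function totalTagCounts(pages) {
  return pages.reduce(function (totals, page) {
    Object.keys(page.counts || {}).forEach(function (tag) {
      totals[tag] = (totals[tag] || 0) + page.counts[tag];
    });
    return totals;
  }, {});
}
```

Totals like these are a convenient starting point for the visualization, since they show at a glance which tags dominate the site.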
In this case, we want to do the same thing, but this time write the result to gDrive.
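Writing the result to Drive from Apps Script is straightforward: serialize the data and create a file. This is a sketch only; the file name is illustrative, and the actual tagsite code may organize this differently.

```javascript
// Sketch: serialize the result and write it to Drive as a JSON file.
// The file name 'tagsite-result.json' is just an example.
function writeResultToDrive(data) {
  var json = JSON.stringify(data, null, 2);
  // In Apps Script:
  // DriveApp.createFile('tagsite-result.json', json, MimeType.PLAIN_TEXT);
  return json;
}
```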
What gets returned is a description of the Drive file. The "hosted" property is a link to the created JSON file, and is the one you should use for getting data into your web app. Here's the link to the live data.
Normally I reference a shared library for GAS stuff (see Using the mcpher library in your code), but this is very straightforward and all the code is below. No library references are needed.
Now let's do something with the data - see Site data to sheets.
For help and more information join our forum, follow the blog, or follow me on Twitter.