An extension to rename downloads in Google Chrome

Project Gutenberg is an online library where you can download e-books for free. But every time I download one, the file has a stupid filename like abjx0349.pdf, so I have to manually rename it to the book's title.

I decided to do something about it and create a plugin that automatically renames the file to the book's title followed by the author's name. The goal is to have a filename that is useful for Windows' search function.

In this article, I'll walk you through the development stages and the plugin's code. This is my first Google Chrome plugin and I'm not a JavaScript developer, so suggestions for improvement are welcome.

On a side note, the extension is a lot shorter and simpler than I expected.

Demo

Here is a video of what the plugin achieves. In the video, I download a book from Project Gutenberg. The book appears in my downloads/ folder with a gibberish filename. Then, I activate the plugin before re-downloading the same book. The book is automatically downloaded inside the downloads/EBOOKS/ subfolder and the file is named according to the book's title :-)

Turn on the subtitles for a real-time text description of what is happening.

Our task

When downloading a book from Project Gutenberg, the usual flow is as follows:

  1. user lands on https://gutenberg.org and searches for a book
  2. user clicks on a book and lands on the book's page, for instance http://www.gutenberg.org/ebooks/1112
  3. user clicks the download link and a file with a non-semantic filename is downloaded
  4. user leaves the book's page

Our task is to:

  • inspect the book's page at step 2 to find the book's title and store it
  • rename the downloaded file automatically at step 3
  • free up the storage space at step 4.

To do so, we will create a Chrome extension.

Creating an extension

To create an extension, we must create a new folder with a JSON file called manifest.json. This manifest file is the entry point that lets Chrome recognize and load the extension.

mkdir extension
vim extension/manifest.json

The minimal content of manifest.json for Chrome to recognize the folder as an extension is the following:

// file: manifest.json
{
    "name": "Gutenberg Ebooks Downloader",
    "description": "Automatically rename ebooks downloaded from gutenberg.org",
    "version": "1.0",
    "author": "Julien Harbulot",
    "manifest_version": 2
}

To load the extension into Chrome, proceed as follows:

  1. Open the Extension Management page by navigating to chrome://extensions;
  2. Enable Developer Mode by clicking the toggle switch next to Developer mode;
  3. Click the LOAD UNPACKED button and select the extension directory to load the extension into Chrome.


Scraping the book's title

First, we need a way to get the book's title. To do so, we will parse the download page with a content script. A content script is JavaScript code that executes when a given page is loaded. It can read details of the page (i.e. its HTML), make changes to it and pass information to the parent extension.

We need to update the manifest to ask for permission to run the extension on the Gutenberg URL and to let Chrome know about our content script. To be able to store the book's title, we also need access to the storage API:

// file: manifest.json
{
    ...,
    "content_scripts": [
        {
        "matches": ["http://www.gutenberg.org/ebooks/*"],
        "js": ["content_script.js"]
        }
    ],

    "permissions": [
        "http://www.gutenberg.org/*",
        "storage"
    ]
}

Now, let's create our content script in the file extension/content_script.js.

If we inspect the source code of a book's download page, for instance http://www.gutenberg.org/ebooks/1112, we can see that the title is in an <h1> heading with itemprop="name":

// file: http://www.gutenberg.org/ebooks/1112

<div class="header">
<h1 itemprop="name">The Tragedy of Romeo and Juliet by William Shakespeare</h1>
</div>

We can fetch it easily in JavaScript using document.querySelector. Then, we need to store it so that our extension can access it later. For that, we will use the storage API and store the book's title under the page URL:

// file: content_script.js

// Get the h1 heading that holds the title (it may be absent on pages that are not book pages)
var heading = document.querySelector('[itemprop=name]');

if (heading) {
    // Set everything to lower case, remove special characters and standardize format
    var nameProp = heading.textContent.toLowerCase().replace(/[^a-z0-9 ]/g, '');
    var filename = nameProp.replace(' by ', ' - ');

    // Use the storage API, with the page URL as the key
    chrome.storage.local.set({[document.URL]: filename}, function() {
        console.log('Book filename is stored as: ' + filename);
    });
}
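The title clean-up can also be written as a small standalone function, which makes it easy to try outside the browser. This is only a sketch: toFilename is a hypothetical helper name, and splitting on the last " by " (rather than the first) is my assumption about how to handle titles that themselves contain the word "by".

```javascript
// Hypothetical helper: turn a heading such as "Title by Author"
// into a filename-friendly "title - author" string.
function toFilename(heading) {
    var cleaned = heading
        .toLowerCase()
        .replace(/[^a-z0-9 ]/g, '')   // drop everything except letters, digits and spaces
        .replace(/ +/g, ' ')          // collapse runs of spaces left by removed characters
        .trim();

    // Assumption: the author follows the LAST " by " in the heading.
    var idx = cleaned.lastIndexOf(' by ');
    if (idx === -1) {
        return cleaned;               // no author part found, keep the title as-is
    }
    return cleaned.slice(0, idx) + ' - ' + cleaned.slice(idx + 4);
}

console.log(toFilename('The Tragedy of Romeo and Juliet by William Shakespeare'));
// the tragedy of romeo and juliet - william shakespeare
```

Keeping this logic in a pure function means the content script only has to read the heading and pass its text along.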

We need to ensure that the filename is removed from storage at some point. The best option I could think of is to remove the entry when we leave the page: we usually leave the page after clicking the download link, so the entry is no longer required. The code is plain JavaScript along with another call to the storage API:

// file: content_script.js

// Remove when page closed
window.addEventListener("unload", function() {
    chrome.storage.local.remove([document.URL], function() {
        console.log('Removed ' + filename);
    });
});
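To make the life cycle of a storage entry concrete (store on page load, look up at download time, remove on unload), here is a minimal sketch that runs outside the browser. makeFakeStorage is a hypothetical in-memory stand-in that only mimics the callback shape of chrome.storage.local; unlike the real API, it calls its callbacks synchronously.

```javascript
// Hypothetical stand-in for chrome.storage.local, for illustration only.
function makeFakeStorage() {
    var data = {};
    return {
        set: function(items, callback) {
            Object.assign(data, items);
            if (callback) callback();
        },
        get: function(keys, callback) {
            var result = {};
            keys.forEach(function(key) {
                if (key in data) result[key] = data[key];
            });
            callback(result);
        },
        remove: function(keys, callback) {
            keys.forEach(function(key) { delete data[key]; });
            if (callback) callback();
        }
    };
}

var storage = makeFakeStorage();
var pageUrl = 'http://www.gutenberg.org/ebooks/1112';

// Step 2: the content script stores the filename under the page URL.
storage.set({[pageUrl]: 'the tragedy of romeo and juliet - william shakespeare'});

// Step 3: the entry is looked up by the same URL when the download starts.
var found;
storage.get([pageUrl], function(result) { found = result[pageUrl]; });

// Step 4: the entry is removed when the page is left.
storage.remove([pageUrl]);
var afterRemove;
storage.get([pageUrl], function(result) { afterRemove = result[pageUrl]; });

console.log(found);        // the tragedy of romeo and juliet - william shakespeare
console.log(afterRemove);  // undefined
```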

And that's it! Let's see how to intercept the download and rename the file with this newly constructed filename.

Intercepting downloads

To intercept downloads, we will use a background script. A background script is used to monitor events and let our extension react to them. Here, we will use it to react to the downloading of a file.

First, we need to update our manifest to ask for permission to access the downloads API and to register our background script:

// file: manifest.json
{
   ...,

   "background": {
        "scripts": ["background.js"],
        "persistent": false
    },

    "permissions": [
        ...,
        "downloads"
    ]
}

Background scripts should almost always be set to persistent: false. As the documentation indicates:

The only occasion to keep a background script persistently active is if the extension uses chrome.webRequest.

There are many events that we can listen to in the downloads API. The one of interest here is the onDeterminingFilename event.

According to the documentation, each listener must call suggest exactly once to change the filename. If the listener calls suggest asynchronously, then it must return true. The template is as follows:

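chrome.downloads.onDeterminingFilename.addListener(function(item, suggest) {
    // setTimeout stands in for any asynchronous work (e.g. a storage lookup)
    setTimeout(function() {
        var newFilename = item.filename; // placeholder: compute the new name here
        suggest({filename: newFilename});
    }, 0);
    // suggest is called asynchronously, so the listener must return true
    return true;
});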
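Since suggest must be called exactly once, a listener with several code paths can accidentally call it twice. One way to guard against that is to wrap the callback. This is only a sketch: callOnce is a hypothetical helper, not part of the Chrome API, and it does not protect against the opposite mistake of never calling suggest at all.

```javascript
// Hypothetical helper: returns a wrapper that forwards only the first call to fn.
function callOnce(fn) {
    var called = false;
    return function() {
        if (called) {
            return;           // ignore any later call
        }
        called = true;
        return fn.apply(this, arguments);
    };
}

// Stand-in for the suggest callback that Chrome would pass to the listener.
var received = [];
var suggest = callOnce(function(suggestion) {
    received.push(suggestion.filename);
});

suggest({filename: 'first.epub'});
suggest({filename: 'second.epub'});  // ignored by the guard

console.log(received.length);  // 1
console.log(received[0]);      // first.epub
```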

We can use the item argument to get context about the downloaded file. We will use it to check that the file comes from Project Gutenberg, and use its referrer URL to look up the book's title in storage. Here's the full code:

// file: background.js

chrome.downloads.onDeterminingFilename.addListener(function(item, suggest) {
    // indexOf does a plain substring match (search would treat "." as a regex wildcard)
    if (item.referrer.indexOf("gutenberg.org") == -1) {
        // If the file does not come from gutenberg.org, suggest nothing new.
        suggest({filename: item.filename});
    } else {
        // Otherwise, fetch the book's title in storage...
        chrome.storage.local.get([item.referrer], function(result) {
            if (result[item.referrer] == null) {
                // ...and if we don't find it, suggest nothing new.
                suggest({filename: item.filename});
                console.log('Nothing done.');
            } else {
                // ...if we find it, suggest it.
                var fileExt = item.filename.split('.').pop();
                var newFilename = "gutenberg/" + result[item.referrer] + "." + fileExt;
                suggest({filename: newFilename});
                console.log('New filename: ' + newFilename);
            }
        });
        // Storage API is asynchronous so we need to return true
        return true;
    }
});
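The two decisions made in the listener (does the download come from Gutenberg, and what should the file be called) can be pulled out into pure functions and tested without Chrome. The sketch below is one possible variant, not the extension's actual code: isGutenbergUrl and buildFilename are hypothetical names, the hostname comparison is a stricter check than the substring match above, and keeping files without an extension extension-less is my own choice.

```javascript
// Hypothetical helper: true if the URL's host is gutenberg.org or a subdomain of it.
function isGutenbergUrl(url) {
    try {
        var host = new URL(url).hostname;
        return host === 'gutenberg.org' || host.endsWith('.gutenberg.org');
    } catch (e) {
        return false;  // empty or malformed URL
    }
}

// Hypothetical helper: build the suggested path from the original filename
// and the title found in storage (null/undefined when nothing was stored).
function buildFilename(originalFilename, storedTitle) {
    if (storedTitle == null) {
        return originalFilename;  // nothing stored, keep the original name
    }
    var dot = originalFilename.lastIndexOf('.');
    var ext = dot > 0 ? originalFilename.slice(dot) : '';  // includes the leading dot
    return 'gutenberg/' + storedTitle + ext;
}

console.log(isGutenbergUrl('http://www.gutenberg.org/ebooks/1112'));       // true
console.log(isGutenbergUrl('https://example.com/?q=gutenberg.org'));       // false
console.log(buildFilename('abjx0349.pdf', 'romeo and juliet'));            // gutenberg/romeo and juliet.pdf
```

With helpers like these, the listener itself only needs to read item.referrer and item.filename, query storage, and pass the result to suggest.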

An icon for our extension

On the chrome://extensions page, click the reload icon in the extension's pane. Try downloading an e-book: it should be automatically renamed and put in the gutenberg/ subfolder. Yay!

The last step is to provide a nice icon for the extension. Download one somewhere on the internet, put it in the extension folder and specify its name in the manifest as follows:

// file: manifest.json
{
    ...,
    "icons": {"128": "icon128.png"}
}

You can use different icon sizes. For more information, see the Chrome extension documentation on manifest icons.