Scraping Open Graph Metadata With Node

At Converge SE this weekend, I had the opportunity to attend a workshop on programming social web applications taught by Jonathan LeBlanc, Principal Software Engineer at Yahoo.

One thing Jonathan covered was the Open Graph Protocol. Open Graph makes it easy for any developer to add metadata to their web pages. The data is primarily used to control the title, description, image source, and URL displayed when those pages are shared on Facebook, but anybody can read and use it.

Jonathan gave us a couple of homework assignments during the presentation, one of which was to build an Open Graph metadata parser. Granted, Facebook already provides one of these, but it proved to be a fun exercise in Node.js. Using Express, I was able to build the parser in half an hour and then deploy it to Heroku’s new Cedar stack in a few minutes.

The app grabs the URL passed to it, renders the DOM on the server, and extracts the needed meta tags.
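
To give a rough idea of the extraction step, here is a minimal sketch. It is not the code from the app: instead of rendering the DOM on the server, it uses a regular expression so that it runs on plain Node.js with no dependencies, and it skips the Express route and the page fetch entirely. The names `sampleHtml`, `getAttribute`, and `extractOpenGraph` are made up for this example, and the sample page values are invented.

```javascript
// Sketch only: a regex-based stand-in for the DOM-based extraction described
// above. A real page can defeat it (for example, a ">" inside an attribute
// value), which is one reason to prefer a proper DOM or HTML parser.

// Hypothetical sample page; the values are invented for illustration.
const sampleHtml = `
<html>
  <head>
    <title>Example</title>
    <meta property="og:title" content="Example Page" />
    <meta property="og:description" content="A short summary of the page." />
    <meta property="og:image" content="https://example.com/image.png" />
    <meta property="og:url" content="https://example.com/" />
    <meta name="viewport" content="width=device-width" />
  </head>
  <body></body>
</html>`;

// Read one attribute value (double- or single-quoted) from a tag string.
// Returns null when the attribute is not present.
function getAttribute(tag, name) {
  const pattern = new RegExp(`\\s${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)')`, 'i');
  const match = tag.match(pattern);
  if (!match) return null;
  return match[1] !== undefined ? match[1] : match[2];
}

// Collect every <meta property="og:..."> tag into a plain object,
// keyed by the property name without its "og:" prefix.
function extractOpenGraph(html) {
  const result = {};
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const property = getAttribute(tag, 'property');
    const content = getAttribute(tag, 'content');
    if (property === null || content === null) continue;
    const key = property.toLowerCase();
    if (key.startsWith('og:')) {
      result[key.slice(3)] = content;
    }
  }
  return result;
}

console.log(extractOpenGraph(sampleHtml));
```

In a setup like the one described here, a function along these lines would be called on the HTML fetched for the requested URL, and the resulting object sent back as the response. Tags without an `og:` property, such as the viewport tag in the sample, are ignored.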

The source is available on GitHub and a demo is live on Heroku. Go ahead, give it a shot! I recommend looking up MailChimp.