RSS Parser

This is a quick little script I wrote for use with GeekTool on my Mac. Run it from the command line with PHP, pass it an RSS feed URL (and, optionally, the number of stories to show), and it will parse the feed and print the headlines.

Examples:

dave@server:~$ php rss.php
Usage: php rss.php <rss feed URL> [# stories]
dave@server:~$ php rss.php http://www.daveeddy.com/feed/
Dave Eddy
>> GeekTool For Mac (revisited) (23 minutes ago)
>> Viridian 1.2 Released (1 week ago)
>> Introduction to Unix Tutorial (3 weeks ago)
>> RITLUG (3 weeks ago)
>> Ampache Now Playing WordPress… Now with Ajax/jQuery (1 month ago)
dave@server:~$ php rss.php http://www.daveeddy.com/feed/ 2 # only show 2 stories
Dave Eddy
>> GeekTool For Mac (revisited) (23 minutes ago)
>> Viridian 1.2 Released (1 week ago)

Download

rss.php.zip

The code

<?php
/**
 * Copyright (C) 2011 Dave Eddy <dave@daveeddy.com>
 * This program is free software: you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2, as published
 * by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranties of
 * MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
 * PURPOSE.  See the GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program.  If not, see <http://www.gnu.org/licenses/>.
 *
 * rss.php
 * Parse RSS feeds supplied over the command line
 * Written by Dave Eddy <dave@daveeddy.com>
 *
 * Usage Example: php rss.php http://www.daveeddy.com/feed/
 */

/* Config Variables */
date_default_timezone_set('America/New_York'); // set this to your locale

/* No need to edit below this line */

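// Read the arguments defensively so a missing one does not raise a notice
$site    = isset($_SERVER['argv'][1]) ? $_SERVER['argv'][1] : null;
$stories = isset($_SERVER['argv'][2]) ? (int) $_SERVER['argv'][2] : null;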

/**
 * Converts a Unix time stamp to how long ago it was relative to now
 * Taken from stackoverflow
 * http://stackoverflow.com/questions/2981602/twitter-rss-pubdate-format
 */
function how_long_ago($unixTime) {
    $chunks = array(
        array(60 * 60 * 24 * 365 , 'year'),
        array(60 * 60 * 24 * 30 , 'month'),
        array(60 * 60 * 24 * 7, 'week'),
        array(60 * 60 * 24 , 'day'),
        array(60 * 60 , 'hour'),
        array(60 , 'minute'),
    );
    $today = time();
    $since = $today - $unixTime;
    for ($i = 0, $j = count($chunks); $i < $j; $i++) {
        $seconds = $chunks[$i][0];
        $name = $chunks[$i][1];
        if (($count = floor($since / $seconds)) != 0) {
            break;
        }
    }
    return $count == 1 ? '1 '.$name : "$count {$name}s";
}

if (!isset($site)) {
	die('Usage: php '.basename(__FILE__).' <rss feed URL> [# stories]'."\n");
}
if (!isset($stories)) {
	$stories = 5;
}

$dom = new DOMDocument();
if (!@$dom->load($site)) {
	exit(1); // exit with 1 if it failed to load
}

$items = $dom->getElementsByTagName('item');

echo $dom->getElementsByTagName('title')->item(0)->nodeValue."\n";
for ($i = 0; $i < $stories; $i++) {
	if ($i >= $items->length) {
		break; // you wanted more stories than the RSS had
	}
	$item = $items->item($i);
	$age = how_long_ago(strtotime($item->getElementsByTagName('pubDate')->item(0)->nodeValue));
	echo ">> " . $item->getElementsByTagName('title')->item(0)->nodeValue." ($age ago)\n";
	// uncomment to display the content section of the feed
	//echo $item->getElementsByTagName('description')->item(0)->nodeValue."\n";
}
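
The loop above assumes every item carries a pubDate element. Feeds that keep the date elsewhere, for example in a Dublin Core dc:date element, end up with an empty timestamp and a nonsensical age. Here is a minimal sketch of a fallback, not part of the original script: item_timestamp() is a hypothetical helper name, and the feed below is made-up data used only to exercise it.

```php
<?php
/**
 * Hypothetical helper (not part of rss.php): returns the item's publication
 * time as a Unix timestamp, or null when no usable date element is found.
 */
function item_timestamp(DOMElement $item) {
    // Standard RSS 2.0 date element
    $nodes = $item->getElementsByTagName('pubDate');
    if ($nodes->length === 0) {
        // Fall back to the Dublin Core <dc:date> element
        $nodes = $item->getElementsByTagNameNS('http://purl.org/dc/elements/1.1/', 'date');
    }
    if ($nodes->length === 0) {
        return null; // no date at all
    }
    $time = strtotime(trim($nodes->item(0)->nodeValue));
    return $time === false ? null : $time;
}

// Tiny in-memory feed (made-up data): one item per date style, one with no date
$xml = '<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel>'
     . '<title>Example</title>'
     . '<item><title>A</title><pubDate>Tue, 01 Mar 2011 12:00:00 +0000</pubDate></item>'
     . '<item><title>B</title><dc:date>2011-03-01T12:00:00+00:00</dc:date></item>'
     . '<item><title>C</title></item>'
     . '</channel></rss>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$stamps = array();
foreach ($doc->getElementsByTagName('item') as $item) {
    $time     = item_timestamp($item);
    $stamps[] = $time;
    $title    = $item->getElementsByTagName('title')->item(0)->nodeValue;
    // Print the date in UTC, or a placeholder when the item has none
    echo $title . ': ' . ($time === null ? 'unknown date' : gmdate('Y-m-d H:i', $time)) . "\n";
}
```

In rss.php this would stand in for the strtotime(...) call on the $age line, and the story could be printed without an age whenever the helper returns null.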

11 thoughts on “RSS Parser”

  1. Pingback: Dave Eddy » Blog Archive » GeekTool For Mac (revisited) | All dropped packets go to heaven

  2. What does the geeklet actually look like? I see that it runs fine from the terminal, but for the life of me I cannot get it to execute in the GeekTool shell.

  3. Hi.

    Thanks for this great PHP script; it solved nearly all my issues with embedding RSS on the desktop. However, I do have one small problem. Whenever I embed the Slashdot feed on the desktop, it prints “41 years ago” as the published date. This happens only with Slashdot and isn’t reproducible in Google Reader, which displays the dates properly.

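  4. @rantom I’m glad this script is working for you! The “41 years ago” issue occurs because the Slashdot RSS feed doesn’t contain a tag called “pubDate”; instead, the date is stored in an element from a different XML namespace (Dublin Core). With no date found, the timestamp falls back to the Unix epoch (1970), which was 41 years ago.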

    Long story short, change line 64 (the line that sets $age) to this:

    $age = how_long_ago(strtotime($item->getElementsByTagNameNS('http://purl.org/dc/elements/1.1/', 'date')->item(0)->nodeValue));

  5. @rantom Oh yeah, that modified code is specific to Slashdot: use it just for Slashdot, and the original code for any other feed.

  6. @Dave:

    Yeah, I did it like that. Thanks again, it works like a charm. However, I have one last question. Since Slashdot also uses HTML codes such as italic tags, can those be shown with the script as well? So far it doesn’t render them as their corresponding styling; it only prints the codes.

    The same goes for entities such as &auml;

  7. @rantom

    So far it is impossible to render that stuff on the desktop with GeekTool. The only way to deal with it would be to strip those things out of the output.
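
The stripping suggested in the last comment can be sketched with PHP's built-in strip_tags() and html_entity_decode(). This is only an illustration: plain_text() is a hypothetical helper name that does not appear in rss.php, and the sample string is made up. Decoding to UTF-8 assumes the terminal or GeekTool displays UTF-8 text.

```php
<?php
/**
 * Hypothetical helper (not part of rss.php): reduce a snippet of HTML to
 * plain text suitable for a terminal or a GeekTool shell geeklet.
 */
function plain_text($html) {
    // Drop tags such as <i>...</i>, keeping the text between them
    $text = strip_tags($html);
    // Turn entities such as &auml; or &quot; into the characters they stand for
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace left behind by removed markup
    return trim(preg_replace('/\s+/u', ' ', $text));
}

// Made-up sample text containing an italic tag and two kinds of entities
echo plain_text('An <i>italic</i> word and an entity: &quot;&auml;&quot;') . "\n";
```

In the script this would wrap the nodeValue strings before they are echoed, for example the title or the optional description line.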
