PHPBuilder - Reading RSS feeds in PHP: Part 2




Reading RSS feeds in PHP: Part 2

by: Ian Gilfillan | November 2, 2005

This month's article continues where we left off last time. Please read part one first if you haven't yet done so, as the code examples in this tutorial build upon what we created there. To begin with, let's create two new RSS feeds to read. We're going to combine them into one feed and display the articles together on one page, in the correct date order. These two sample feeds are loosely based on some of the feeds supplied by Independent Online. Note that unlike last month's simplified example, each article item has a publication date. We'll call the two feeds africa.rss and southafrica.rss.
africa.rss
<?xml version="1.0" encoding="utf-8" ?><?xml-stylesheet type="text/xsl" href="rss-style.xsl" ?>
<rss version="0.91">
 <channel>
 <title>IOL: Africa</title>
 <link>http://www.iol.co.za/index.php?...truns</link>
 <description>IOL: Africa</description>
 <language>en-gb</language>

 <image>
  <title>IOL</title>
  <url>http://www.iol.co.za/images/rss/iol.gif</url>
  <link>http://www.iol.co.za</link>
 </image>

 <item>
  <title>Mwanawasa won't pass new Zambian constitution</title>
  <link>http://www.iol.co.za/widgets/rss_redirect.php?...truns</link>
  <description>Zambian President Levy Mwanawasa says there is not 
enough time to implement the country's new constitution before 
elections in 2006.</description>
  <pubDate>2005-10-27 12:40:41</pubDate>
 </item>

 <item>
  <title>Museveni's rival back in Uganda as poll nears</title>
  <link>http://www.iol.co.za/widgets/rss_redirect.php?...truns</link>
  <description>An opposition politician expected to be the 
main challenger to President Yoweri Museveni in the upcoming 
elections has ended his four years in exile.</description>
  <pubDate>2005-10-27 08:40:38</pubDate>
 </item>
 </channel>
</rss>

southafrica.rss
<?xml version="1.0" encoding="utf-8" ?><?xml-stylesheet type="text/xsl" href="rss-style.xsl" ?>
<rss version="0.91">
 <channel>
 <title>IOL: South Africa</title>
 <link>http://www.iol.co.za/index.php?...truns</link>
 <description>IOL: South Africa</description>
 <language>en-gb</language>

 <image>
  <title>IOL</title>
  <url>http://www.iol.co.za/images/rss/iol.gif</url>
  <link>http://www.iol.co.za</link>
 </image>

 <item>
  <title>Train crash could have been 'far worse'</title>
  <link>http://www.iol.co.za/widgets/rss_redirect.php?...truns</link>
  <description>Frans Pritchard was fast asleep in bed on 
board the Trans Karoo. Suddenly he was on the floor, paralysed. 
For close to an hour, the 65-year-old man lay on the floor of 
the train, waiting for medical help - one of dozens of passengers 
hurt in the crash between the Blue Train and the Trans Karoo express.</description>
  <pubDate>2005-10-27 12:35:25</pubDate>
 </item>

 <item>
  <title>Hit-and-run family finds help amid despair</title>
  <link>http://www.iol.co.za/widgets/rss_redirect.php?...truns</link>
  <description>The man who has lost his mother, and then 
his wife and baby son in a tragic hit-and-run on a Durban highway, 
will be able to bury his loved ones because of kind-hearted Daily 
News readers.</description>
  <pubDate>2005-10-27 13:05:34</pubDate>
 </item>
 </channel>
</rss>
Starting with the code we ended up with last month, let's use the new feeds. Change the line populating the array:
$rssFeeds = array ('phpbuilder.rss');
to make use of the new feeds, as follows:
$rssFeeds = array ('southafrica.rss','africa.rss');
Run this script, and give some thought to what order the results will appear in. We get the following output:
Title: Train crash could have been 'far worse'
Description: Frans Pritchard was fast asleep in bed on 
board the Trans Karoo. Suddenly he was on the floor, paralysed. 
For close to an hour, the 65-year-old man lay on the floor of 
the train, waiting for medical help - one of dozens of passengers 
hurt in the crash between the Blue Train and the Trans Karoo 
express.
Link: http://www.iol.co.za/widgets/rss_redirect.php?...truns
Pubdate: 2005-10-27 12:35:25

Title: Hit-and-run family finds help amid despair
Description: The man who has lost his mother, and then 
his wife and baby son in a tragic hit-and-run on a Durban highway, 
will be able to bury his loved ones because of kind-hearted 
Daily News readers.
Link: http://www.iol.co.za/widgets/rss_redirect.php?...truns
Pubdate: 2005-10-27 13:05:34

Title: Mwanawasa won't pass new Zambian constitution
Description: Zambian President Levy Mwanawasa says there 
is not enough time to implement the country's new constitution 
before elections in 2006.
Link: http://www.iol.co.za/widgets/rss_redirect.php?...truns
Pubdate: 2005-10-27 12:40:41

Title: Museveni's rival back in Uganda as poll nears
Description: An opposition politician expected to be the 
main challenger to President Yoweri Museveni in the upcoming 
elections has ended his four years in exile.
Link: http://www.iol.co.za/widgets/rss_redirect.php?...truns
Pubdate: 2005-10-27 08:40:38 
Notice that, as you probably expected, the articles are not yet in date order. They appear in feed order, from first to last. The easiest way to order the articles would be to put them in an array and sort it, but that's not a particularly scalable way of doing things. I'm still scarred by an example of 'professional' coding that was delivered to me by an outsourced company a few years ago, which read a set of records from a database, put everything into an array, and only then started processing. It did not hold up when faced with real-world volumes of hundreds of thousands of subscribers. So, let me not teach any bad habits here.
We'll put the records in a database, and then sort the results as we read them back. For ease of use, I'm going to use the ADOdb database libraries with MySQL. If you find anything that follows unfamiliar, you can have a look at the articles An introduction to the ADOdb class library for PHP, Part 1 and An introduction to the ADOdb class library for PHP, Part 2.
Create the following table, which we'll use to store the feeds:
CREATE TABLE `rss_feeds` (
  `id` int(11) NOT NULL auto_increment,
  `title` varchar(50) NOT NULL default '',
  `description` text NOT NULL,
  `link` varchar(100) NOT NULL default '',
  `pubdate` datetime NOT NULL default '0000-00-00 00:00:00',
  PRIMARY KEY  (`id`)
);
Now, by adding an INSERT statement to endElement(), the feed items will be inserted into the database. Our sample RSS feeds have the date nicely formatted for MySQL's DATETIME field. Many of the feeds you encounter will use other date formats, though, and you'll need to convert those before inserting them. You can read the article on date manipulation if that's unfamiliar to you. We're also going to make sure that each feed item is only added once, by checking whether the link already exists in the database and only inserting a new record if it doesn't. Here's the new endElement() function, with the database code added:
function endElement($xp,$name) {
 global $item,$currentElement,$title,$description,$link,$pubdate,$conn;
 if ($name == 'ITEM') {
  echo "<b>Title:</b> $title<br>";
  echo "<b>Description:</b> $description<br>";
  echo "<b>Link:</b> $link<br>";
  echo "<b>Pubdate:</b> $pubdate<br><br>";
  // Escape the values before placing them in the SQL strings
  $ins_title = addslashes($title);
  $ins_desc = addslashes($description);
  $ins_link = addslashes($link);
  $ins_pubdate = addslashes($pubdate);
  // Only insert the item if its link is not already in the table
  $sql = "SELECT COUNT(link) as cn FROM rss_feeds WHERE link='$ins_link'";
  $rs = $conn->Execute($sql);
  if ($rs->fields['cn'] == 0) {
   $sql = "INSERT INTO rss_feeds (title, description, link, pubdate) 
VALUES('$ins_title','$ins_desc','$ins_link','$ins_pubdate')";
   if (!($conn->Execute($sql))) {
    print 'Error inserting: '.$conn->ErrorMsg().'<br>';
   }
  }
  // Reset the element variables, ready for the next item
  $title = '';
  $description = '';
  $link = '';
  $pubdate = '';
  $item = false;
 }
}
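The sample feeds already use MySQL's DATETIME format, but feeds in the wild often carry RFC 822 style dates such as "Thu, 27 Oct 2005 12:40:41 +0200". As a minimal sketch, a helper along the following lines could normalise the value before it reaches the INSERT. The function name rssDateToMysql() is hypothetical (it is not part of the original script), and storing the result in UTC is an assumption you may want to change.

```php
<?php
// Hypothetical helper, not part of the original script.
// Converts a feed date string to MySQL DATETIME format (in UTC),
// or returns null when the string cannot be parsed.
function rssDateToMysql($pubDate) {
    $timestamp = strtotime(trim($pubDate));
    if ($timestamp === false) {
        return null;
    }
    // gmdate() formats the timestamp in UTC, whatever the server timezone is
    return gmdate('Y-m-d H:i:s', $timestamp);
}

$converted = rssDateToMysql('Thu, 27 Oct 2005 12:40:41 +0200');
$invalid = rssDateToMysql('');
```

In endElement(), $ins_pubdate could then be built from rssDateToMysql($pubdate), skipping or logging the item when the helper returns null. Note that a date string without a timezone is interpreted in the server's default timezone before being converted.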
Don't forget to include the database connection details. We do so at the top of the script.
include "db_ado_vars_test.inc.php"; // your database connection details
include "$includes_path/adodb/tohtml.inc.php";
include "$includes_path/adodb/adodb.inc.php";
$conn = ADONewConnection('mysql');
$conn->PConnect($host,$user,$pass,$db_name);
$rssFeeds = array ('southafrica.rss','africa.rss');
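With a connection in place, the check-then-insert logic in endElement() can also be written with bound parameters instead of addslashes(), which leaves the quoting to the database driver. The sketch below is an illustration under stated assumptions, not the article's method: it uses PDO with an in-memory SQLite database as a stand-in for ADOdb and MySQL so that it runs without a server, the table definition is simplified accordingly, and storeItem() and the example.com link are hypothetical. ADOdb's Execute() similarly accepts an array of bind values as its second argument.

```php
<?php
// Sketch only: PDO with in-memory SQLite stands in for ADOdb with MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE rss_feeds (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    title TEXT NOT NULL,
    description TEXT NOT NULL,
    link TEXT NOT NULL,
    pubdate TEXT NOT NULL
)');

// Hypothetical helper: stores an item only if its link is not stored yet.
// Returns true when a row was inserted, false when the link already existed.
function storeItem(PDO $pdo, $title, $description, $link, $pubdate) {
    $check = $pdo->prepare('SELECT COUNT(*) FROM rss_feeds WHERE link = ?');
    $check->execute(array($link));
    if ((int) $check->fetchColumn() > 0) {
        return false;
    }
    $insert = $pdo->prepare(
        'INSERT INTO rss_feeds (title, description, link, pubdate) VALUES (?, ?, ?, ?)'
    );
    $insert->execute(array($title, $description, $link, $pubdate));
    return true;
}

// The apostrophe in the title needs no manual escaping
$first = storeItem($pdo, "Museveni's rival back in Uganda as poll nears",
    'Sample description', 'http://www.example.com/item-1', '2005-10-27 08:40:38');
// Same link again: the duplicate is skipped
$second = storeItem($pdo, "Museveni's rival back in Uganda as poll nears",
    'Sample description', 'http://www.example.com/item-1', '2005-10-27 08:40:38');
$count = (int) $pdo->query('SELECT COUNT(*) FROM rss_feeds')->fetchColumn();
```

The values travel separately from the SQL text, so quotes in titles or descriptions cannot change the statement.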
After you've run this script, your records should be safely ensconced in the database - you should see the following if you query the rss_feeds table:
mysql> SELECT * FROM rss_feeds\G
*************************** 1. row ***************************
id: 1
title: Train crash could have been 'far worse'
description: Frans Pritchard was fast asleep in bed on board the 
Trans Karoo. Suddenly he was on the floor, paralysed. For close 
to an hour, the 65-year-old man lay on the floor of the train, 
waiting for medical help - one of dozens of passengers hurt in 
the crash between the Blue Train and the Trans Karoo express.
link: http://www.iol.co.za/widgets/rss_redirect.php?...truns
pubdate: 2005-10-27 12:35:25
*************************** 2. row ***************************
id: 2
title: Hit-and-run family finds help amid despair
description: The man who has lost his mother, and then his wife 
and baby son in a tragic hit-and-run on a Durban highway, will 
be able to bury his loved ones because of kind-hearted Daily 
News readers.
link: http://www.iol.co.za/widgets/rss_redirect.php?...truns
pubdate: 2005-10-27 13:05:34
*************************** 3. row ***************************
id: 3
title: Mwanawasa won't pass new Zambian constitution
description: Zambian President Levy Mwanawasa says there is not 
enough time to implement the country's new constitution before 
elections in 2006.
link: http://www.iol.co.za/widgets/rss_redirect.php?...truns
pubdate: 2005-10-27 12:40:41
*************************** 4. row ***************************
id: 4
title: Museveni's rival back in Uganda as poll nears
description: An opposition politician expected to be the main 
challenger to President Yoweri Museveni in the upcoming elections 
has ended his four years in exile.
link: http://www.iol.co.za/widgets/rss_redirect.php?...truns
pubdate: 2005-10-27 08:40:38
4 rows in set (0.00 sec)
Now, let's display them in date order, from most recent to least recent. Add the query and the display loop shown below just after looping through the array:
//Loop through the array, reading the feeds one by one
foreach ($rssFeeds as $feed) {
 readFeeds($feed);
}
$sql = "SELECT title, description, link, pubdate FROM rss_feeds 
ORDER BY pubdate DESC LIMIT 100";
$rs = $conn->Execute($sql);
while (!$rs->EOF) {
 echo '<a href="'.$rs->fields['link'].'">'.
  $rs->fields['title'].'</a> '.$rs->fields['pubdate'].'<br>';
 echo $rs->fields['description'].'<br>';
 $rs->MoveNext();  //  Moves to the next row
}
The formatting is quite simple: the headline is the link, the publication date appears next to the headline on the same line, and the description appears on the following line. We've also selected the articles in descending date order, and limited the number of articles displayed to 100. You should easily be able to amend these elements for your own purposes.
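Because titles and descriptions come from third-party feeds, it is worth escaping them before echoing them into the page. The following is a minimal sketch of one way to do that. The helper name renderItem() and the sample row are made up for illustration; the array keys match the column names selected in the query.

```php
<?php
// Hypothetical helper, not part of the original script.
// Builds the HTML for one row, escaping every value taken from the feed.
function renderItem(array $row) {
    $link = htmlspecialchars($row['link'], ENT_QUOTES, 'UTF-8');
    $title = htmlspecialchars($row['title'], ENT_QUOTES, 'UTF-8');
    $pubdate = htmlspecialchars($row['pubdate'], ENT_QUOTES, 'UTF-8');
    $description = htmlspecialchars($row['description'], ENT_QUOTES, 'UTF-8');
    // Headline as the link, date on the same line, description on the next
    return '<a href="' . $link . '">' . $title . '</a> ' . $pubdate . "<br>\n"
        . $description . "<br>\n";
}

// Made-up sample row containing characters that are special in HTML
$html = renderItem(array(
    'link' => 'http://www.example.com/story?id=1&ref=rss',
    'title' => 'Tom & Jerry <b>reunited</b>',
    'description' => 'A "quoted" summary.',
    'pubdate' => '2005-10-27 13:05:34',
));
echo $html;
```

Inside the while loop, the two echo statements could then be replaced by echo renderItem($rs->fields);.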

