At the end of my last article, we had SOAP running with PHP and were using Google as a glorified spell-checker. Seeing as it is rather well known as a search engine, this week we'll look into conducting web searches in a bit more detail. We'll also talk about how to deal with any errors that occur.


Searching with Google

Instead of the simple string returned by doSpellingSuggestion, when calling doGoogleSearch we get a nested structure of objects inside objects. We also need to supply a little more information. Take a look at this, and enter it into a new page in the same folder as the WSDL file.


<dl>
<?php
    // Your Google Web APIs license key
    $key = '...';
    require_once 'SOAP/Client.php';

    // Build a proxy object from the WSDL file in this folder
    $wsdl = new SOAP_WSDL('GoogleSearch.wsdl');
    $googleProxy = $wsdl->getProxy();

    // Search for 'PHP': first 10 results, filtering and SafeSearch on, no restricts
    $queryResponse = $googleProxy->doGoogleSearch($key, 'PHP', 0, 10, true, '', true, '', '', '');

    // Print each result as a linked title (or the URL if there is no title) plus its snippet
    foreach ($queryResponse->resultElements as $result)
    {
        echo '<dt><a href="' . $result->URL . '">' . ($result->title ? $result->title : $result->URL) . '</a></dt>';
        echo '<dd>' . $result->snippet . '</dd>';
    }
?>
</dl>
Now load the page into your web browser. It might take a few seconds for PHP to contact Google, but you should get a list of titled hyperlinks showing the same results as a Google search for 'PHP' (specifically, that's a search with filtering and SafeSearch turned on, and no restrictions in place).


The first half of the code is the same as we saw last time, so let's consider the last part. We call doGoogleSearch to actually do the work and return a results object. The first two arguments are similar to doSpellingSuggestion: a license key and the search query, respectively. There are also a lot of extra parameters on this call that give us the ability to customise the search (apart from the last two, that is, which are just there for decoration). The documentation included with the Google Web APIs developer's kit explains what they do fairly well, so I won't bother repeating it here. One thing to watch for is the data types of the parameters: SOAP is much less forgiving than PHP when it comes to automatically casting types.
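
To make the point about data types concrete, here is a minimal sketch of a helper that casts every argument before it goes anywhere near SOAP. The function name buildSearchArgs and the option names (start, maxResults, filter, restrict, safeSearch, lr, ie, oe) are my own labels for the ten positional arguments, chosen to follow the Google documentation as I understand it; treat them as assumptions rather than part of the API.

```php
<?php
// Hypothetical helper (not part of PEAR SOAP or the Google API):
// returns the ten doGoogleSearch arguments, in order, with explicit casts.
function buildSearchArgs($key, $query, array $options = array())
{
    // Defaults mirror the call used in this article.
    $defaults = array(
        'start'      => 0,
        'maxResults' => 10,
        'filter'     => true,
        'restrict'   => '',
        'safeSearch' => true,
        'lr'         => '',
        'ie'         => '',
        'oe'         => '',
    );
    $o = array_merge($defaults, $options);

    // Cast each value so that, say, a '10' read from $_GET becomes a real integer.
    return array(
        (string) $key,
        (string) $query,
        (int)    $o['start'],
        (int)    $o['maxResults'],
        (bool)   $o['filter'],
        (string) $o['restrict'],
        (bool)   $o['safeSearch'],
        (string) $o['lr'],
        (string) $o['ie'],
        (string) $o['oe'],
    );
}
```

You could then pass the result to the proxy with call_user_func_array(array($googleProxy, 'doGoogleSearch'), buildSearchArgs($key, 'PHP', array('start' => '10'))), which should behave like the hand-written call but with the second page of results.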


Once we've got some search results we will normally want to check for errors, but that step will be added later in this tutorial. For now, we just loop through the main result array contained in the response object and print out a list of links. Aside from a tiny subtlety to deal with the fact that some results might not have titles, that's all we need to do.


Simple error handling

Without proper error handling, however, the script is fragile. Change your license key slightly and try again: doesn't look good, does it? Too many PHP developers only code for the normal case and don't take these kinds of error conditions into account, but with SOAP it is particularly important to check for errors. You can usually rely on operations on the local server, but when a remote server is involved all sorts of things could go wrong (Google might update the API and remove those two unnecessary arguments to doGoogleSearch, for instance). We can handle most errors simply by inserting this after the doGoogleSearch call:


    if (is_a($queryResponse, 'SOAP_fault'))
        die('SOAP fault: ' . $queryResponse->message);

    if ($queryResponse->estimatedTotalResultsCount == 0)
        die('No results were returned.');


Now we can search Google and cope with any errors that arise (not in a particularly nice fashion, but I'm sure you can improve that). All we have to do now is decide what to do with this new toy.
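
As one possible improvement, the checks can be moved into a function that returns a message instead of calling die, leaving the page free to decide how to present the problem. This is only a sketch: the name searchProblem and its $faultClass parameter are hypothetical, and the parameter exists purely so the function can be tried out without PEAR SOAP installed.

```php
<?php
// Hypothetical helper: returns an error message for a doGoogleSearch
// response, or null if the response looks usable.
// $faultClass defaults to the class name checked earlier in this article.
function searchProblem($response, $faultClass = 'SOAP_fault')
{
    // A fault object carries the error text in its message property.
    if (is_a($response, $faultClass)) {
        return 'SOAP fault: ' . $response->message;
    }

    // A valid response with nothing in it is still worth reporting.
    if ((int) $response->estimatedTotalResultsCount === 0) {
        return 'No results were returned.';
    }

    return null;
}
```

After the doGoogleSearch call you might write $problem = searchProblem($queryResponse); and, if it is not null, print it inside a paragraph and skip the results loop rather than killing the whole page.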

How would you like your Google?

As an example of the kind of thing you can do with the code demonstrated here, I've prepared Google to RSS, a simple PHP script that searches Google for a given term and presents the results as an RSS feed. You could use this script to keep an eye on the top 10 search results for a topic you are interested in, or include them in a website. Feel free to experiment with the code.


If you're not familiar with RSS, a simple understanding of XML or HTML should be enough to follow the basics of this script. It follows the same structure as the snippet given previously: perform a SOAP query with the desired parameters, then loop through the resulting array and generate the output. Notice that it uses the UTF-8 character set, and your scripts should probably do so as well since this is the encoding that Google uses.
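
To give a feel for that structure, here is a rough sketch of turning a result array into an RSS 2.0 document. It is not the Google to RSS script itself: the function names are made up for this illustration, the channel title, link and description are placeholders, and I'm assuming that titles and snippets may contain HTML markup that needs stripping before it goes into the feed.

```php
<?php
// Hypothetical helper: escape text for use inside an XML element.
function rssEscape($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: reduce an HTML fragment to plain text.
function rssPlainText($html)
{
    return html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
}

// Sketch: build an RSS 2.0 feed from objects with URL, title and snippet
// properties, like the entries in $queryResponse->resultElements.
function resultsToRss($query, array $results)
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<rss version="2.0">' . "\n<channel>\n";
    $xml .= '<title>' . rssEscape('Google results for ' . $query) . "</title>\n";
    $xml .= '<link>' . rssEscape('http://www.google.com/search?q=' . urlencode($query)) . "</link>\n";
    $xml .= '<description>' . rssEscape('Top results for ' . $query) . "</description>\n";

    foreach ($results as $result) {
        // Same fallback as before: use the URL when a result has no title.
        $title = $result->title ? rssPlainText($result->title) : $result->URL;
        $xml .= "<item>\n";
        $xml .= '<title>' . rssEscape($title) . "</title>\n";
        $xml .= '<link>' . rssEscape($result->URL) . "</link>\n";
        $xml .= '<description>' . rssEscape(rssPlainText($result->snippet)) . "</description>\n";
        $xml .= "</item>\n";
    }

    return $xml . "</channel>\n</rss>\n";
}
```

To serve the feed, send a header such as Content-Type: application/rss+xml; charset=UTF-8 and then echo resultsToRss('PHP', $queryResponse->resultElements).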

Advanced searching: restricts

When calling doGoogleSearch we had to supply a lot of extra arguments to control a wide range of features such as the number of results we want and whether SafeSearch should be enabled. Most of them are fairly easy to use and explained well in the Google Web APIs documentation, but restricts are a slightly more complex topic.


Restricts let you search a subset of Google's database from PHP, so you could (for example) display only sites from a certain country, in a certain language or about a particular topic. Here's an example (replace the doGoogleSearch call in the code given above to try it out):
$queryResponse = $googleProxy->doGoogleSearch($key, 'PHP', 0, 10, true, 'countryUK', true, 'lang_it', '', '');
Access the page again, and while this is still a search for 'PHP', the results will be significantly different. Specifically, this only includes pages in Italian (that's the 'lang_it' bit) that are hosted on servers in the United Kingdom (the 'countryUK' setting). This illustrates the difference between a language restrict and a country restrict. The former returns pages written in the desired language, wherever they are in the world; the latter returns pages from the specified country, regardless of the language they are written in.


There are lists of the available language and country restricts in the Google Web APIs documentation. Other than the two types demonstrated here, there are also topic restricts: try replacing 'countryUK' with 'linux' to get Italian pages about using PHP with Linux. Unfortunately, there aren't many topics available yet - four at the time of writing - so this will probably be less useful.


Finally, we can combine multiple restricts using several operators. Prepend a minus sign (-) to remove all results that match the given restrict. Join two restricts with a period (.) to only include results that match both, or use a vertical bar (|) to include results that match either side. You can also use parentheses to group restricts for more complex expressions. Ever wanted to search for pages from Switzerland or Hungary that are related to Linux or the Apple Mac? No, me neither, but you can do it with the restrict '(countryCH|countryHU).(mac|linux)'.
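
If you build restrict expressions often, a few tiny helpers can save you from counting parentheses. The sketch below simply assembles strings using the operators just described; the function names restrictAll, restrictAny and restrictNot are my own invention, not something Google or PEAR provides.

```php
<?php
// Hypothetical helpers for composing restrict expressions as strings.

// Match every restrict in the list: join with a period.
function restrictAll(array $parts)
{
    return implode('.', $parts);
}

// Match any restrict in the list: join with a vertical bar and
// wrap in parentheses so the group can be combined safely.
function restrictAny(array $parts)
{
    if (count($parts) === 1) {
        return reset($parts); // a single restrict needs no grouping
    }
    return '(' . implode('|', $parts) . ')';
}

// Exclude results matching the restrict: prepend a minus sign.
function restrictNot($part)
{
    return '-' . $part;
}

// Rebuilds the Switzerland-or-Hungary, Mac-or-Linux example.
$restrict = restrictAll(array(
    restrictAny(array('countryCH', 'countryHU')),
    restrictAny(array('mac', 'linux')),
));
```

The resulting string goes wherever 'countryUK' was in the doGoogleSearch call.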

Moving on...

We've studied the basics of how to consume a web service with PHP, examining in detail the main features provided by Google's Web API. There's lots more we could do with Google, and many more web services available across the Internet. You can find some through XMethods, or even create your own - check out the SOAP_Server class, for example. Good luck, and have fun!