Website mapper


piersk
07-07-2004, 10:11 AM
I don't know if anything like this exists, but I'm looking for software that, given the URL of a site, will map out all the links and pages throughout the site without wandering off to any external sites (i.e. it stays within the same domain), and will then display the results in a viewable form.

I think I'm clutching at straws, but if anyone knows of something like this, please let me know.

dar-k
07-07-2004, 10:46 AM
I've seen sites with this functionality...years ago...

Sadly, I've no idea what they were, so google away.

drawmack
07-07-2004, 10:47 AM
You could write that very easily if you used Snoopy.

http://www.hotscripts.com/Detailed/3240.html
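
For anyone wondering what "write it yourself" would involve, here is a rough sketch of the idea in Python rather than PHP/Snoopy, using only the standard library. It is not Snoopy's API and not a finished tool. The names map_site, fetch_html and LinkExtractor are made up for illustration, and the max_pages limit and 10-second timeout are arbitrary assumptions. It does a breadth-first crawl, keeps only links whose host matches the start URL, and returns a dictionary of page to internal links.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Default fetcher: return the page body as text, or "" on failure."""
    try:
        with urlopen(url, timeout=10) as response:
            # Skip images, PDFs and other non-HTML responses.
            if "html" not in response.headers.get_content_type():
                return ""
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def map_site(start_url, fetch=fetch_html, max_pages=200):
    """Breadth-first crawl that never leaves start_url's host.

    Returns {page_url: sorted list of same-host links found on that page}.
    """
    host = urlparse(start_url).netloc
    site_map = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue and len(site_map) < max_pages:
        page = queue.popleft()
        parser = LinkExtractor()
        parser.feed(fetch(page))
        internal = set()
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute, _fragment = urldefrag(urljoin(page, href))
            parts = urlparse(absolute)
            if parts.scheme in ("http", "https") and parts.netloc == host:
                internal.add(absolute)
        site_map[page] = sorted(internal)
        for link in site_map[page]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return site_map
```

Calling map_site("http://domain.example/") would crawl that one host only. The fetch parameter is there so the crawl logic can be tried against canned pages without touching the network. Note that this sketch ignores robots.txt and does no rate limiting, so a real version should add both.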

AstroTeg
07-07-2004, 10:50 AM
Well, this probably isn't exactly what you want, but maybe it'll spark some ideas:

http://www.cis.hut.fi/kaip/linkkuri-0.01.02.pl

It's Perl. The beauty of it is that it uses RobotUA for bot handling and good behavior. That can be disabled, though (see the code comments). I believe you might use it with a command such as:

Find all files in your website:
linkkuri -c http://domain.example/ > mysite.txt

That should find all the files on the site and save the links to mysite.txt.

The only trick left to solve is formatting and viewing. One approach might be to use this Perl script as is and have PHP launch it and parse the output file...
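
As a rough illustration of the formatting step, here is a small sketch in Python (rather than PHP) that turns a list of URLs into an indented tree. It assumes the output file holds one URL per line, which I have not verified against the script. The function names build_tree, render_tree and load_urls are hypothetical.

```python
from urllib.parse import urlparse


def build_tree(urls):
    """Nest URL path segments into dicts, e.g. {"docs": {"intro.html": {}}}."""
    tree = {}
    for url in urls:
        node = tree
        for segment in urlparse(url).path.split("/"):
            if segment:  # skip the empty pieces around leading/trailing "/"
                node = node.setdefault(segment, {})
    return tree


def render_tree(tree, depth=0):
    """Return the tree as indented lines, with children sorted by name."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * depth + name)
        lines.extend(render_tree(tree[name], depth + 1))
    return lines


def load_urls(path):
    """Read one URL per line, skipping blank lines (assumed file format)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

Something like print("\n".join(render_tree(build_tree(load_urls("mysite.txt"))))) would then print the site as a directory-style outline. Grouping by path is only one possible view; it shows where pages live, not which page links to which.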

weekender
07-09-2004, 06:49 PM
My housemate is building a web spider for his MSc thesis. It's written in Java, and one of the stages is to map local, then remote, URLs and display them.

It's not due until September though, so don't hold your breath!

adam

bubblenut
07-09-2004, 06:53 PM
I'm sure I (and firefox's search facility) must be blind because I haven't seen wget (http://www.gnu.org/software/wget/wget.html) mentioned yet.
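
A minimal sketch of how that might be scripted, assuming a reasonably recent wget on the PATH. The helper names wget_spider_command and run_spider, the log file name and the depth of 5 are all made up for illustration. The flags are standard wget options: --spider checks pages without saving them, --recursive follows links, and recursive mode stays on the starting host unless --span-hosts is given.

```python
import subprocess


def wget_spider_command(start_url, log_path="wget-spider.log", depth=5):
    """Build the argument list for a recursive, download-free wget run."""
    return [
        "wget",
        "--spider",                   # check that pages exist, save nothing
        "--recursive",                # follow links found in each page
        f"--level={depth}",           # limit how deep the recursion goes
        "--no-verbose",               # one short log line per URL
        f"--output-file={log_path}",  # write the log here, not to stderr
        start_url,
    ]


def run_spider(start_url, log_path="wget-spider.log", depth=5):
    """Run wget and return its exit code; the visited URLs end up in the log."""
    completed = subprocess.run(
        wget_spider_command(start_url, log_path, depth),
        check=False,  # wget exits non-zero when any link is broken
    )
    return completed.returncode
```

The log file still needs parsing to pull out the URLs, and its exact layout varies between wget versions, so treat that part as an exercise.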