The power of php


drawmack
07-05-2004, 11:20 AM
A nice little two day project I did

The specs:
Write a cron job that will spider about 1,000 sites. Count the occurrences of each letter, digram & trigram in their text. Then calculate the percentage usage of each and draw graphs of them. Make the counts cumulative.
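
The counting step in these specs can be sketched roughly as follows. This is an illustration in Python, not drawmack's PHP code, which is not shown in the thread. The function names (ngram_counts, percentages, add_page) are hypothetical, and the sketch assumes that only the ASCII letters a-z are counted and that digrams and trigrams do not span word boundaries. Spidering and graph drawing are left out.

```python
from collections import Counter
import re


def ngram_counts(text, n):
    """Count letter n-grams within each word (n=1 letters, 2 digrams, 3 trigrams)."""
    counts = Counter()
    # Assumption: only a-z counts as a letter; case is ignored.
    for word in re.findall(r"[a-z]+", text.lower()):
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


def percentages(counts):
    """Turn raw counts into percentage usage (empty input gives an empty dict)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: 100.0 * c / total for gram, c in counts.items()}


def add_page(totals, text):
    """Fold one page's counts into the running (cumulative) totals."""
    for n in (1, 2, 3):
        totals.setdefault(n, Counter()).update(ngram_counts(text, n))
    return totals


if __name__ == "__main__":
    totals = {}
    add_page(totals, "The power of php")
    add_page(totals, "A nice little two day project")
    print(percentages(totals[1]))
```

Keeping the totals as plain counts and deriving the percentages on demand is what makes the "cumulative" requirement cheap: each new page only adds to the counters.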

Write a php script that will display this information to the user.

Here is the user page:
http://www.enderswebdev.com/test/letter_frequencies.php

Everything to generate that page was done in php.

onion2k
07-05-2004, 11:52 AM
Very cool.

Weedpacket
07-05-2004, 06:30 PM
OOooohh ... I'd like to see that done in ColdFusion :)

But why did you use JPEG instead of PNG for the graphs? Wouldn't PNG look better?

I'm sure everyone here could suggest other things to put in the mix: a row listing the letters in frequency order, for example.

Merve
07-05-2004, 09:22 PM
ColdFusion...Macromedia, you have been horribly misguided

Love the script drawmack :)

Mordecai
07-06-2004, 03:29 AM
Originally posted by Weedpacket
OOooohh ... I'd like to see that done in ColdFusion :)
Eww... why?

Weedpacket
07-06-2004, 05:20 AM
Originally posted by Mordecai
Eww... why?
To see if someone can....?

www.cfforums.com:
I want to do a page like
http://www.enderswebdev.com/test/letter_frequencies.php for a client, except it has to be in ColdFusion. Is it possible?

drawmack
07-06-2004, 09:21 AM
Originally posted by Weedpacket
OOooohh ... I'd like to see that done in ColdFusion :)

I hate cf

But why did you use JPEG instead of PNG for the graphs? Wouldn't PNG look better?
Older browsers have trouble with PNG, so I've never switched from JPEG to PNG, but maybe it is time to.

I'm sure everyone here could suggest other things to put in the mix: a row listing the letters in frequency order, for example.
Actually I'm putting together a cryptography site and I was going to put a couple of things on there using this data.
1) A page that will spider your site and tell you how close to standard English frequencies your characters are.
2) A page that will take some text encrypted with a monoalphabetic/polyalphabetic cipher and make a best guess about the key.

Maybe a couple of other little toys as well. Since the cron job stores everything in an ini file, there is almost no overhead for the UI scripts to use the info.
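
As a rough illustration of the ini-file idea, here is a sketch in Python using the standard configparser module. The actual file layout is not shown in the thread, so the section names (letters, digrams) and the helper names dump_counts and load_counts are assumptions made for the example.

```python
import configparser
import io


def dump_counts(counts_by_section):
    """Serialize {section: {gram: count}} to ini text (what a cron job might write)."""
    parser = configparser.ConfigParser()
    for section, counts in counts_by_section.items():
        # ini values are strings, so the integer counts are converted here.
        parser[section] = {gram: str(n) for gram, n in counts.items()}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


def load_counts(ini_text):
    """Parse ini text back into {section: {gram: count}} (what a UI script might read)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {
        section: {gram: int(value) for gram, value in parser[section].items()}
        for section in parser.sections()
    }


if __name__ == "__main__":
    # Hypothetical section names and counts, for illustration only.
    ini_text = dump_counts({"letters": {"e": 12, "t": 9}, "digrams": {"th": 4}})
    print(ini_text)
    print(load_counts(ini_text))
```

The point is that a reader only has to parse one small text file, so any number of front-end pages can share the data without re-spidering anything.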

Mordecai
07-06-2004, 09:58 AM
Hmm.. assuming there is a 1:1 ratio of characters (input:output), then it will work best on very long strings.. in fact, the longer, the better, probably... that'll be interesting to see how it comes out.

drawmack
07-06-2004, 10:33 AM
Actually, the chi-squared method of cracking needs about 15-20 English characters to crack a monoalphabetic cipher.
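
A minimal sketch of the chi-squared idea, in Python rather than PHP. It assumes the simplest monoalphabetic case, a Caesar shift, so there are only 26 candidate keys to score; a general substitution cipher would need a search on top of this. The frequency table holds commonly cited approximate percentages for English, not the figures from drawmack's spider, and the function names are hypothetical.

```python
import string

# Commonly cited approximate English letter frequencies (percent), a through z.
# Assumption: these stand in for the frequencies the spider would have collected.
ENGLISH_PERCENT = dict(zip(string.ascii_lowercase, [
    8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015, 6.094, 6.966,
    0.153, 0.772, 4.025, 2.406, 6.749, 7.507, 1.929, 0.095, 5.987,
    6.327, 9.056, 2.758, 0.978, 2.360, 0.150, 1.974, 0.074,
]))


def chi_squared(text):
    """Chi-squared distance between the text's letter counts and English."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    if not letters:
        return float("inf")
    score = 0.0
    for letter, percent in ENGLISH_PERCENT.items():
        expected = len(letters) * percent / 100.0
        observed = letters.count(letter)
        score += (observed - expected) ** 2 / expected
    return score


def shift_text(text, shift):
    """Shift a-z and A-Z by `shift` places, leaving other characters alone."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - 97 + shift) % 26 + 97))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - 65 + shift) % 26 + 65))
        else:
            result.append(ch)
    return "".join(result)


def best_caesar_shift(ciphertext):
    """Return the shift whose decryption looks most like English (lowest score)."""
    return min(range(26), key=lambda k: chi_squared(shift_text(ciphertext, -k)))


if __name__ == "__main__":
    secret = shift_text("It was the best of times, it was the worst of times", 3)
    key = best_caesar_shift(secret)
    print(key, shift_text(secret, -key))
```

The lower the score, the closer the candidate decryption is to English letter frequencies, so the key guess is simply the candidate with the minimum score.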