Loads of Stuff -- Need Input Please! :)


Jason@AToM
11-15-2006, 12:01 AM
As a brief summary and an FYI, I'm about to launch a new search engine that, I feel at least, is like NOTHING out there yet! While the site most similar to it is probably ask.com, it FAR surpasses the flexibility of what can be "asked" and the simplicity with which answers are returned. At any rate, I've got about 45 days remaining before my TARGET public debut of this little devil and have some things I could use some suggestions on. So... here goes.

1. What is the MOST SIMPLE yet 99.9% FOOLPROOF way to ensure any and all searches originate from MY homepage (www.*******.com) as opposed to via an offsite link? Now, before I get totally flamed here, the intention is NOT to totally prevent offsite linking/searches, but rather to prevent ABUSE of them. Please don't ask for a detailed explanation right now... just please help out if you can/will with the question.... Thanks in advance... :)

2. Bandwidth savings --- well, gzip compression for one... but I want to be a little more innovative than that. The browser's cache will do a lot of it for me as well, as the page itself, though somewhat dynamically generated, is 100% static unless "logged in" (admin/debug mode basically)... but again, let's take that a step further. Here's a TOTALLY off the wall idea.... let's toss some JS in a cookie, have a small JS script load and eval the contents of said cookie, which document.write's the main content right there... in theory, would that not take the load of serving at least SOME data off the server and place it on the client? I know, that's TOTALLY "crazy", but someone give me some pros and cons of that. Obviously security is one issue, but let's say I'll send in the JS "eval" program a dynamically generated xor decrypt key/hash signature of some sort..... perhaps I'm utilizing more resources already than I'm saving? Bandwidth savings is my "focus" with this question / proposal. Caveats.. plenty... feedback anyone?

I've got more, but I think this is plenty to start a stir :p Again, these are VERY PRELIMINARY IDEAS that I have not researched in the slightest. I just want to get some general feedback as to just how psychotic the community here thinks I am! :p

Jason

zabmilenko
11-15-2006, 01:28 AM
2. Newer browsers can use XSLT to build almost all of the HTML on the client side (from an XSL stylesheet), so the server only has to feed them data as plain XML.
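
To make the XML-plus-stylesheet suggestion concrete, here is a minimal sketch of what the server side might emit. The element names (`results`, `hit`), the function name and the `/results.xsl` path are all made up for illustration; the only fixed piece is the standard `xml-stylesheet` processing instruction that tells the browser which stylesheet to apply. The stylesheet itself is a static file the browser can cache.

```python
import xml.etree.ElementTree as ET


def results_xml(query, hits, stylesheet_href="/results.xsl"):
    """Serialize search hits as bare XML that points at a cacheable XSL stylesheet.

    `hits` is a list of (title, url) pairs. Element and attribute names are
    hypothetical, chosen only for this sketch.
    """
    root = ET.Element("results", {"query": query})
    for title, url in hits:
        hit = ET.SubElement(root, "hit", {"url": url})
        hit.text = title
    # ElementTree escapes &, < and > in text and attribute values for us.
    body = ET.tostring(root, encoding="unicode")
    # The processing instruction tells the browser which stylesheet to apply.
    pi = f'<?xml-stylesheet type="text/xsl" href="{stylesheet_href}"?>'
    return f'<?xml version="1.0" encoding="UTF-8"?>\n{pi}\n{body}'


if __name__ == "__main__":
    print(results_xml("jazz", [("Tom & Jerry", "http://example.com/?a=1&b=2")]))
```

Only the data crosses the wire on each search; the page layout lives in the stylesheet, which is fetched once.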

Jason@AToM
11-15-2006, 05:44 AM
Much cool! There's a good start! (and yet another set of languages I have to learn in 45 days! :p) Thank you!! :) -- Jason

MarkR
11-15-2006, 06:13 AM
As a brief summary and an FYI, I'm about to launch a new search engine that, I feel at least, is like NOTHING out there yet!


Good luck with your project; I hope you understand the magnitude of this.


1. What is the MOST SIMPLE yet 99.9% FOOLPROOF way to ensure any and all searches originate from MY homepage


Allocate the users a single (or limited number of) use token which can only be obtained via a hidden field in the home page.

However even doing this won't stop someone writing a robot which leeches this token to display their own search page.

Do what the other engines do, and don't care whether your searches originate from your home page anyway.
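
For anyone who wants to try the single-use token idea anyway, a rough sketch follows. It assumes an HMAC-signed token with a timestamp and a random nonce; the names (`issue_token`, `redeem_token`, `SECRET_KEY`, `TOKEN_TTL`) and the in-memory set of used nonces are hypothetical choices for illustration, not something from this thread. As noted above, a robot that first fetches the home page can still harvest a token, so this only raises the cost of abuse.

```python
import hashlib
import hmac
import secrets
import time

SECRET_KEY = secrets.token_bytes(32)  # hypothetical server-side secret
TOKEN_TTL = 600                       # assumed lifetime of a token, in seconds
_used_nonces = set()                  # in-memory only; a real site needs shared storage


def _sign(payload):
    return hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(now=None):
    """Create a token to embed in a hidden field on the home page."""
    issued = int(now if now is not None else time.time())
    payload = f"{issued}:{secrets.token_hex(8)}"
    return f"{payload}:{_sign(payload)}"


def redeem_token(token, now=None):
    """Return True exactly once for a valid, unexpired token; False otherwise."""
    try:
        issued_s, nonce, sig = token.split(":")
        issued = int(issued_s)
    except ValueError:
        return False  # wrong number of fields or a non-numeric timestamp
    expected = _sign(f"{issued_s}:{nonce}")
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return False
    current = now if now is not None else time.time()
    if current - issued > TOKEN_TTL or nonce in _used_nonces:
        return False  # expired or already spent
    _used_nonces.add(nonce)
    return True
```

The search handler would call `redeem_token` on the submitted hidden field and refuse (or rate-limit) requests where it returns False.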


Please don't ask for a detailed explanation right now... just please help out if you can/will with the question.... Thanks in advance... :)


What is the rationale? Your search engine's home page contains nothing interesting for either your users or your own server; serving it is just a waste of bandwidth.


2. Bandwidth savings --- well, gzip compression for one... but I want to be a little more innovative than that. The browser's cache will do a lot of it for me as well...


Most bandwidth will be used on the results pages, which can't be cached by the browser because (presumably) it's never seen them before. So unless lots of users are doing the same search behind a caching proxy, caching will only decrease bandwidth slightly here.

(I am assuming your search page will use the GET method and not put any unique user IDs in the query string, etc.)
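
A small sketch of that assumption: if every user running the same search ends up on the same GET URL, a shared caching proxy has a chance of serving the repeat. The parameter names (`q`, `page`, `sid`, `uid`) and the helper name are hypothetical, and lowercasing the query is only valid if the engine treats searches as case-insensitive.

```python
from urllib.parse import urlencode


def canonical_search_url(base, params, drop=("sid", "uid")):
    """Build one stable URL per distinct search so caches can match repeats.

    Per-user parameters listed in `drop` (hypothetical names) are removed,
    the rest are sorted, and the query text is lowercased with its
    whitespace collapsed.
    """
    kept = sorted((k, v) for k, v in params.items() if k not in drop)
    normalized = [
        (k, " ".join(v.lower().split())) if k == "q" else (k, v)
        for k, v in kept
    ]
    return f"{base}?{urlencode(normalized)}"


if __name__ == "__main__":
    print(canonical_search_url("/search", {"q": "  Cheap  Flights ", "sid": "abc", "page": "2"}))
```

Two visitors with different session IDs then request an identical URL, which is the case where proxy caching can actually help.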


, as the page itself, though somewhat dynamically generated, is 100% static unless "logged in" (admin/debug mode basically)... but again, let's take that a step further. Here's a TOTALLY off the wall idea.... let's toss some JS in a cookie, have a small JS script load and eval the contents of said cookie, which document.write's the main content right there... in theory, would that not take the load of serving at least SOME data off the server and place it on the client? I know, that's TOTALLY "crazy", but someone give me some pros and cons of that. Obviously security is one issue, but let's say I'll send in the JS "eval" program a dynamically generated xor decrypt key/hash signature of some sort..... perhaps I'm utilizing more resources already than I'm saving? Bandwidth savings is my "focus" with this question / proposal. Caveats.. plenty... feedback anyone?


Oh yes, for sure you could use an AJAX-style system to generate the entire results page on the client. This would reduce the bandwidth requirement on the server, but cause severe usability problems, e.g. users can't bookmark search results, and non-JS browsers can't use it at all.

Sending back the response as a JSON object would probably be the most efficient way of doing it (obviously this could still be gzipped)
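
As a rough illustration of the JSON-plus-gzip idea, the sketch below serializes a result set compactly and compresses it. The field names (`q`, `hits`, `t`, `u`, `s`) and the function name are invented for this example; actual savings depend entirely on the real result data.

```python
import gzip
import json


def encode_results(query, hits):
    """Return (raw_json_bytes, gzipped_bytes) for a list of (title, url, snippet).

    Short keys and compact separators keep the uncompressed payload small;
    gzip then removes most of the remaining repetition.
    """
    payload = {
        "q": query,
        "hits": [{"t": title, "u": url, "s": snippet} for title, url, snippet in hits],
    }
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return raw, gzip.compress(raw)


if __name__ == "__main__":
    sample = [
        (f"Result {i}", f"http://example.com/page/{i}", "Some snippet text about the result.")
        for i in range(20)
    ]
    raw, packed = encode_results("example", sample)
    print(len(raw), "bytes raw,", len(packed), "bytes gzipped")
```

The client-side script would parse this object and build the result list in the page, with the usual caveat that the response should be sent with a gzip Content-Encoding header so the browser decompresses it transparently.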

Cheers
Mark

PS: I'm not going to define any terms above, as if you have even 1% of the ability required for this project, you either know them or can find them out easily.