SEO test for your pages using wget
Wget can be used in several ways to test your web server for problems, particularly with dynamic sites that have load balancers.
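1) Simple test: what is my web server doing when Google stops by?
Command: wget --user-agent=googlebot http://www.aaronshear.com
Note that the option takes two plain ASCII hyphens. In the session below it was entered with an en dash (–), so wget treated it as a hostname, which is why the first lookup fails before the real request goes out.
Operation: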
bash-3.00$ wget –user-agent=googlebot http://www.aaronshear.com
--09:37:56-- http://%E2%80%93user-agent=googlebot/
=> `index.html.41'
Resolving \342\200\223user-agent=googlebot... failed: Name or service not known.
--09:37:56-- http://www.aaronshear.com/
=> `index.html.41'
Resolving www.aaronshear.com... 68.178.211.42
Connecting to www.aaronshear.com|68.178.211.42|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.aaronshear.com/blog/ [following]
--09:37:57-- http://www.aaronshear.com/blog/
=> `index.html.41'
Connecting to www.aaronshear.com|68.178.211.42|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
[ <=> ] 37,426 156.90K/s
09:37:57 (156.52 KB/s) - `index.html.41' saved [37426]
FINISHED --09:37:57--
Downloaded: 37,426 bytes in 1 files
This output shows that my web server answers any request for /, the root of my site, with a 301 Moved Permanently redirect to /blog/.
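To check a redirect like this without saving a copy of the page each time, one option is to ask wget for the server's response headers and keep only the status and Location lines. The following is a minimal sketch, assuming a reasonably recent GNU wget that supports --spider and --server-response; the function names redirect_chain and check_redirects are made up for this example.

```shell
#!/bin/sh
# Sketch only: redirect_chain and check_redirects are illustrative names,
# not part of wget.

# Read wget output on stdin and keep the indented server header lines that
# carry an HTTP status or a Location header, with the indentation removed.
# wget's own un-indented "Location: ... [following]" log line is skipped,
# so each hop is listed once.
redirect_chain() {
  grep -iE '^[[:space:]]+(HTTP/[0-9.]+ [0-9]{3}|Location:)' |
    sed -E 's/^[[:space:]]+//'
}

# Request a URL as "googlebot" without saving the body and print each hop.
# wget writes its log and the server headers to stderr, hence 2>&1.
check_redirects() {
  wget --spider --server-response --user-agent=googlebot "$1" 2>&1 |
    redirect_chain
}
```

After loading these functions into a shell, running check_redirects http://www.aaronshear.com should list one status line per hop, with a Location line wherever the server redirects.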
I also wanted to show you a simple way to see a load balancer in action on a very popular site. Instead, I can identify a variation of cloaking. Cloaking is by no means always spam! This example only shows that the web server is looking for Googlebot and doing something with it.
bash-3.00$ wget –user-agent=googlebot http://www.buy.com
--09:39:40-- http://%E2%80%93user-agent=googlebot/
=> `index.html.42'
Resolving \342\200\223user-agent=googlebot... failed: Name or service not known.
--09:39:40-- http://www.buy.com/
=> `index.html.42'
Resolving www.buy.com... 80.67.74.11, 80.67.74.233
Connecting to www.buy.com|80.67.74.11|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
[ <=> ] 123,224 --.--K/s
09:39:40 (7.67 MB/s) - `index.html.42' saved [123224]
FINISHED --09:39:40--
Downloaded: 123,224 bytes in 1 files
If I run the same request for www.buy.com with wget's stock user agent, I get the following.
bash-3.00$ wget http://www.buy.com
--09:42:56-- http://www.buy.com/
=> `index.html.44'
Resolving www.buy.com... 80.67.74.233, 80.67.74.11
Connecting to www.buy.com|80.67.74.233|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
[ <=> ] 123,224 --.--K/s
09:42:56 (3.13 MB/s) - `index.html.44' saved [123224]
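Comparing the two runs by eye only tells you the byte counts, and two pages of the same length are not necessarily the same page. A rough way to compare the actual content is to save both responses and diff them byte for byte. This is a sketch under assumptions: same_body and compare_agents are names invented here, and the --quiet and --output-document options are assumed to be available in your wget.

```shell
#!/bin/sh
# Sketch only: same_body and compare_agents are illustrative names.

# Print "identical" if the two files match byte for byte, else "different".
same_body() {
  if cmp -s "$1" "$2"; then
    echo "identical"
  else
    echo "different"
  fi
}

# Fetch one URL twice, once as "googlebot" and once with wget's stock
# user agent, then compare the saved bodies and clean up.
compare_agents() {
  url=$1
  tmp=$(mktemp -d) || return 1
  if wget --quiet --user-agent=googlebot --output-document="$tmp/bot.html" "$url" &&
     wget --quiet --output-document="$tmp/stock.html" "$url"; then
    same_body "$tmp/bot.html" "$tmp/stock.html"
  else
    echo "fetch failed" >&2
  fi
  rm -rf "$tmp"
}
```

With the functions loaded, compare_agents http://www.buy.com prints one word. Keep in mind that a dynamic page can differ between any two requests (timestamps, session IDs, rotating ads), so "different" is a reason to look closer at the two files, not proof of cloaking.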