Showing posts with label SEO. Show all posts

Monday, September 14, 2009

Heuristic Search Algorithms : StumbleUpon.com

Amazon.com > Books > Computers & Internet > Computer Science > Artificial Intelligence > Heuristic & Constrained Search >
"Good decisions are born of experience. Experience is born of bad decisions."-- Unknown

Don't let the techie title throw you -- heuristic is just a ten-dollar word for learning, and algorithms are just collections of computer commands that accomplish a certain task, much like a recipe. In fact, computer algorithm books are often called cookbooks. So what we're talking about here are computer searches that improve as you use them.

As an example, take Amazon.com's recommendation feature. (If you've read this blog before, you probably saw that coming!) Amazon tracks your browsing behavior on their site using magic browser cookies, and based on the information they collect, guesses what other pages might interest you. BTW, the recommendations I get aren't particularly focussed, since I crawl all over Amazon looking for niche market products. The heuristic algorithm therefore assumes that I'm interested in just about anything. Come to think of it, that's exactly what it's supposed to do.

Another familiar example is the Yahoo! Search suggestions feature. If, for example, you enter the search term "router," a box of suggestions will appear after a brief delay. These might include "woodworking routers" or "ethernet routers" or "Netgear routers". I'm not using actual examples from Yahoo! since those are subject to change, but these examples should be sufficient to illustrate the point. I like the fact that Yahoo! merely suggests possible refinements -- Microsoft has an irritating tendency to assume that they know what you want better than you do.
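
For the curious, here is a minimal sketch of how a suggestion box like this might work. It is not Yahoo!'s actual method, which isn't public as far as I know. The query log, the `suggest` function, and the ranking rule (most frequently searched first) are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical log of past searches; a real engine would have millions.
QUERY_LOG = [
    "woodworking routers",
    "ethernet routers",
    "netgear routers",
    "ethernet routers",
    "router table plans",
    "netgear routers",
    "ethernet routers",
    "garden hoses",
]


def suggest(term, log=QUERY_LOG, limit=3):
    """Return up to `limit` past queries containing `term`, most frequent first."""
    term = term.lower().strip()
    if not term:
        return []
    # Count how often each matching query appears in the log.
    counts = Counter(q for q in log if term in q.lower())
    # Sort by descending frequency, then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [query for query, _ in ranked[:limit]]


print(suggest("router"))
```

The "learning" part is simply that every new search gets appended to the log, so the suggestions drift toward whatever people actually look for.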

StumbleUpon.com is a social-networking site and search engine that allows you to select up to 127 interests that will be used to customize your search results. As you use the service, you can click on one of two buttons: (Thumbs Up) "I like this" or (Thumbs Down). Simple. If you're ambivalent, you don't have to rate a page. I've seen a lot of rating systems, but this is probably my favorite because you're not ruining somebody's day by giving them the thumbs down -- and they can't ruin yours. All you are doing is demoting the page and others like it in your own search results. People who think Dick Cheney and Donald Rumsfeld are great American patriots are unaffected.
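
To make the "your ratings only affect your own results" idea concrete, here is a rough sketch. StumbleUpon's real algorithm isn't described here, so the `StumblerProfile` class, the topic tags, and the plus-one/minus-one scoring are all hypothetical.

```python
from collections import defaultdict


class StumblerProfile:
    """One user's private ratings; nobody else's results are touched."""

    def __init__(self):
        # Running score per topic, starting at zero for unseen topics.
        self.topic_scores = defaultdict(int)

    def rate(self, page_topics, thumbs_up):
        """Nudge every topic on the rated page up (+1) or down (-1)."""
        delta = 1 if thumbs_up else -1
        for topic in page_topics:
            self.topic_scores[topic] += delta

    def score(self, page_topics):
        """Sum this user's topic scores for a page; unrated topics count as 0."""
        return sum(self.topic_scores.get(topic, 0) for topic in page_topics)

    def rank(self, pages):
        """pages maps url -> list of topics. Best-scoring pages come first."""
        return sorted(pages, key=lambda url: (-self.score(pages[url]), url))


# Hypothetical pages and topics, purely for illustration.
pages = {
    "a.example/politics-rant": ["politics"],
    "b.example/router-jig": ["woodworking"],
    "c.example/cats": ["pets"],
}

me = StumblerProfile()
me.rate(["politics"], thumbs_up=False)
me.rate(["woodworking", "diy"], thumbs_up=True)
print(me.rank(pages))
```

A second `StumblerProfile` with no ratings would still see the pages in their default order, which is the whole point: a thumbs down demotes similar pages for you and nobody else.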

I looked at this service a couple of years ago and wasn't very impressed, but I don't recall the search feature being present then. I may have overlooked it, but I think you just had to "stumble" from page to page. That might be alright for casual surfing, but it's not very focussed. The database is now a lot bigger too. To try Stumbling, go to the Getting Started page, where you will learn all you need to know. If you have trouble adding pages with the toolbar because of your firewall settings, old or weird browser, or whatever, you can use the form below:

Submit Page to StumbleUpon.com

http://

Last, but certainly not least, your StumbleUpon history generates your own personal StumbleUpon blog, which is visible to the major search engines if you're just looking for backlinks, and to other stumblers, of course, if you are more interested in the social networking aspects of the site. You are cordially invited to subscribe to my StumbleUpon blog at any time.

Monday, August 24, 2009

Lijit Search

I've added a new Lijit Search form to this blog which I hope will make it easier to find content from a variety of places where I publish. I've been struggling to be found by the well-known search engines for some time now, but this service is a new one to me. Apparently it allows you to build a custom search engine using your social bookmarks.

It's a very different kind of crawler, and I'm still trying to sort out exactly how it works. It seems to rely primarily on RSS data, another technology I need to learn more about. I did an initial set-up about a week ago, and it's probably too soon to expect much from the search results, but it is possible to get search hits for specific keywords known to be near the top of the queue -- such as "vaporizer", for example. So I know it's working; I'm just not sure how well.
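
Since RSS keeps coming up, here is a small sketch of what "relying on RSS data" could look like in practice: pull the title and link out of each item in a feed. This is only a guess at the general idea, not Lijit's actual crawler. The sample feed, its URLs, and the `feed_items` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny hypothetical RSS 2.0 feed; the URLs are placeholders.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://blog.example.com/</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>Lijit Search</title>
      <link>http://blog.example.com/lijit-search.html</link>
    </item>
    <item>
      <title>Alexa Toolbar</title>
      <link>http://blog.example.com/alexa-toolbar.html</link>
    </item>
  </channel>
</rss>"""


def feed_items(rss_text):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


for title, link in feed_items(SAMPLE_RSS):
    print(title, "->", link)
```

A crawler built this way only ever sees what the feed publishes, which may explain why results take a while to fill in.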

Additionally, I set up the embedded search results feature today, which requires a bit of JavaScript running on an assigned landing page. If you're reading this from a feed, that should explain the previous cryptic post. You have to be at the actual Whole Ed Cata-Blog site and enter a search term into the Lijit Search form in the blog template for results to be displayed.

I don't care much for blogs about nothing ("I'm sitting at my computer now, writing a post about the post I'm writing for my blog...") Therefore, I'll cut this short. I really just wanted to explain that last post to subscribers, and explain why the Lijit search results are as bad as they are right now. It just takes some time for me to figure out how to set it up efficiently, and to crawl the network.

Thursday, April 03, 2008

Where's Search Heading?

Contrarian investment strategy holds that when everybody is jumping on a particular bandwagon, it's time to sell. Shares of Google Inc. spent the last year trading in a range roughly between 450 and 550, with some modest but steady growth. Suddenly, about the time Merriam-Webster's dictionary recognized Google as a verb, shares shot up to over 700 ... briefly. It's not my intention to dispense investment advice; I just find that spurt of ill-founded exuberance interesting in light of the developments in the search-engine world.

I've been watching Google since their early days (at least since Nov 1999), and initially I was a big fan. The original PageRank algorithm was groundbreaking when it was introduced, and the name Google -- derived from googol, the name for the number 1 followed by 100 zeros -- was an acknowledgment of the enormity of the task of indexing the World Wide Web (considerably less enormous then). They even had a cool motto: "Don't be evil."

That Google is ancient history. Since their IPO in 2004, the professional management team at Google, Inc. has focussed far less on resisting evil and far more on short-term corporate profits. They are basically corporate raiders, buying up promising technology companies as opposed to fostering innovation. They bought Blogspot (which hosts these pages) and FeedBurner (which I use a lot) without much noticeable effect either way. Some of their other acquisitions and antics have received less favorable reviews.

Google is Microsoft's chief rival in developing (or acquiring) web-based applications, which both companies hope will be the "next big thing" in computing. In their efforts to broaden and deepen their reach, it seems that Google has let their flagship search engine go by the wayside. This blog was originally established primarily to counter Google's "thin affiliate" offensive (ca. June, 2005), wherein the search giant decided that merely organizing links in a logically significant fashion was unworthy of listing in their search engine. (Wait a minute! Isn't that what Google professes to do?)

Over the past three years, Google has repeatedly made their search narrower and shallower -- ostensibly improving relevance, but arguably just making their revenue-producing AdWords program more competitive with their free search results. After about a year of this, according to data from Alexa, Google's reach peaked and began to go downhill. Their response was to continue crippling (or "improving") their search function in new and more aggressive ways. The end result is that the once-powerful Google search engine now produces pretty lackluster results.

A few years ago, search engines engaged in size wars -- each claiming to index more pages than the other -- until critics began to make the extraordinary claim that size didn't matter. Of course size matters: you can't retrieve pages that aren't indexed. What the critics should have said was that size alone was a poor indicator of search engine quality. Let's look at the approximate number of search results the three main search engines produce for some relevant websites as of today (2 APR 08):


Site                   Google     Live Search   Yahoo
Amazon.com             43.9 M     75.1 M        448.5 M
Amazon aStores         424 K      1.96 M        19.6 M
Wikipedia (English)    3.8 M      31.2 M        191.4 M
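
To put those counts in perspective, this short sketch works out how many times more results each rival returned than Google for the same site. The numbers are the ones listed above (with 424 K written as 0.424 million); the `RESULTS` dictionary and the `ratio_to_google` helper are just illustrative names.

```python
# Approximate result counts from the figures above, in millions of pages.
RESULTS = {
    "Amazon.com": {"Google": 43.9, "Live Search": 75.1, "Yahoo": 448.5},
    "Amazon aStores": {"Google": 0.424, "Live Search": 1.96, "Yahoo": 19.6},
    "Wikipedia (English)": {"Google": 3.8, "Live Search": 31.2, "Yahoo": 191.4},
}


def ratio_to_google(site):
    """Return each rival's result count as a multiple of Google's, to 1 decimal."""
    counts = RESULTS[site]
    return {
        engine: round(count / counts["Google"], 1)
        for engine, count in counts.items()
        if engine != "Google"
    }


for site in RESULTS:
    print(site, ratio_to_google(site))
```

By this arithmetic Yahoo reported roughly 10 times as many Amazon.com pages as Google, and roughly 50 times as many English Wikipedia pages. Raw counts say nothing about relevance, of course.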


Search relevance is harder to evaluate. To begin with, the very definition of relevance is elusive. But assuming that all the major search engines make a reasonable good-faith effort to organize their output based on commonly accepted industry practices, which search engine would you choose? According to a frequently cited "recent" survey, 58% choose Google. I guess it just goes to show, "You can fool some of the people all of the time."

An even greater irony is the fact that the current buzz in the search engine community is all about Yahoo! (See Where's Search Heading? Ask Yahoo's Chief Scientist) Is it any wonder Microsoft is trying to buy Yahoo outright?

Tuesday, March 25, 2008

Popular Automotive Items @ Amazon.com

This is one of a series of quick articles on popular products from Amazon.com. Of course, the most obvious measure of popularity is Amazon sales rank, the default listing order of Amazon's aStores. That order is already represented in the search engine listings, at least in theory. These items are ones that I have actually sold.

The idea here is that these items are popular enough that someone would buy them, but not so popular that they are widely listed elsewhere. Though this approach may seem wildly random, it is very likely that if someone bought these before, someone else might want to buy them in the future. Perhaps they are just what you're looking for!

Popular Apparel Items @ Amazon.com

This is the first of several quick articles on popular products from Amazon.com. Of course, the most obvious measure of popularity is Amazon sales rank, the default listing order of Amazon's aStores. That order is already represented in the search engine listings, at least in theory. These items are ones that I have actually sold.

The idea here is that these items are popular enough that someone would buy them, but not so popular that they are widely listed elsewhere. Though this approach may seem wildly random, it is very likely that if someone bought these before, someone else might want to buy them in the future. Perhaps they are just what you're looking for!

Thursday, November 22, 2007

Amazon.com's Most Popular Gifts: Books

It's time to address the 900-pound Gorilla in the room: Amazon books. As "Earth's Biggest Bookstore," Amazon.com is synonymous with books in the minds of many consumers, who may not necessarily even think about Amazon for other purchases. The task of adequately indexing Amazon's thousands of book categories through their aStore program is complicated by their 999-category limit for individual stores. There is simply no alternative to splitting our book selections into several more-or-less self-explanatory subdivisions.

Heretofore we have pointed our links to our Browse Amazon Books website, hosted on space provided by a certain ISP who now threatens to discontinue service to anyone who (truthfully) criticizes their woefully inadequate service. That, coupled with Google's headlong rush into corporatism since their acquisition from founders Page and Brin, has pretty well rendered that site irrelevant, although MSN Live Search and Yahoo Search still adequately index it.

We have already blogged about our new "Author Mania" store, which features several (mostly fiction) genres that Amazon.com indexes by author. Although building this store has proven to be a rather painstaking process, other projects are well enough in hand now to continue expanding this resource with some regularity. We've also started Technical Bookmania, which attempts to bring Amazon's Technical and Professional categories to light. The specialized books in this store are not individually big sellers, and Amazon doesn't do much to make them easy to find, burying them in an unnecessarily deep and abstruse hierarchy. Finally, we are introducing Book Mania, which is currently an "everything else" catch-all, but should be evolving into our primary book source, where the most popular categories are featured with considerable depth.

In a related story, Amazon just released its new Kindle e-book reader. Those who have had access to the device in advance for review purposes give it high marks, although the few Amazon customers who have reviewed it thus far are somewhat cooler on the device. It seems a little pricey, until you consider that it includes a connection to Amazon's own Whispernet wireless telephone network. With all the hoopla that surrounded the Apple iPhone, it's nice to see a truly useful product introduced without quite so much fanfare.

Amazon.com publishes a series of gift guides, which change over time and may even disappear altogether. These snapshots are an excellent indication of what products are popular on any given day. While the content of the guide may vary, these Books should be around for a long time...

  1. Be a Real Estate Millionaire: Secret Strategies for Lifetime Wealth Today
  2. Walking in Your Own Shoes: Discover God's Direction for Your Life
  3. Results That Last: Hardwiring Behaviors That Will Take Your Company to the Top
  4. Deceptively Delicious: Simple Secrets to Get Your Kids Eating Good Food
  5. You: Staying Young: The Owner's Manual for Extending Your Warranty (You)
  6. Eat, Pray, Love: One Woman's Search for Everything Across Italy, India and Indonesia
  7. I Am America (And So Can You!)
  8. The Dangerous Book for Boys
  9. Rescuing Sprite: A Dog Lover's Story of Joy and Anguish
  10. Stop the 401(k) Rip-off!: Eliminate Costly Hidden Fees to Improve Your Life
  11. The Daring Book for Girls
  12. A Thousand Splendid Suns
  13. Clapton: The Autobiography
  14. Water for Elephants: A Novel
  15. Harry Potter and the Deathly Hallows (Book 7)
  16. The Age of Turbulence: Adventures in a New World
  17. World Without End
  18. Harry Potter Boxset Books 1-7
  19. An Inconvenient Book: Real Solutions to the World's Biggest Problems
  20. Boom!: Voices of the Sixties Personal Reflections on the '60s and Today
  21. The Pillars of the Earth (Deluxe Edition) (Oprah's Book Club)
  22. Musicophilia: Tales of Music and the Brain
  23. Double Cross (Alex Cross)
  24. Lone Survivor: The Eyewitness Account of Operation Redwing and the Lost Heroes of SEAL Team 10
  25. Love in the Time of Cholera (Oprah's Book Club)
  26. Three Cups of Tea: One Man's Mission to Promote Peace . . . One School at a Time
  27. Our Dumb World: The Onion's Atlas of the Planet Earth, 73rd Edition
  28. A Lifetime of Secrets: A PostSecret Book
  29. Become a Better You
  30. Quiet Strength: The Principles, Practices, & Priorities of a Winning Life
  31. Star Wars: A Pop-Up Guide to the Galaxy
  32. The War: An Intimate History, 1941-1945
  33. The Nine: Inside the Secret World of the Supreme Court
  34. The Kite Runner
  35. Eclipse (Twilight, Book 3)
  36. Rhett Butler's People
  37. The Secret
  38. The Alphabet from A to Y With Bonus Letter Z!
  39. His Dark Materials Trilogy (The Golden Compass; The Subtle Knife; The Amber Spyglass)
  40. Home to Holly Springs (Father Tim, Book 1)
  41. The Choice
  42. New Moon (Twilight, Book 2)
  43. My Grandfather's Son: A Memoir
  44. Harry Potter Paperback Box Set (Books 1-6)
  45. Playing For Pizza: A Novel
  46. Peek-A-Who?: Board book
  47. The 4-Hour Workweek: Escape 9-5, Live Anywhere, and Join the New Rich
  48. The Wisdom of Menopause: Creating Physical and Emotional Health and Healing During the Change, 2nd Edition

Tuesday, October 16, 2007

Alexa Toolbar

I've been working on feeds (RSS & Atom) lately, anticipating a flood of traffic whenever Windows Vista, with its built-in support for feed technology, becomes popular. I'm still waiting for that, since even Microsoft seems to concede that Windows is once again the operating system so advanced you need a new computer to run it, focussing their marketing efforts on OEMs rather than the upgrade crowd.

Amidst the hoopla, I've neglected search engine submissions. All the major search engines claim that they'll find most web pages on their own, so it seemed more productive to concentrate on creating new pages. Unfortunately, the major search engines' ability to find pages on their own seems to be rooted in geological time!

Just to illustrate, I did an Alexa search for "thewholeedcatalo" (my Amazon associates I.D.) Gack! 34 results. Not much of a showing for nine years online! Well, what exactly did Alexa find?

Number one, it found "tag" pages for my Squidoo "lenses". Not too surprising, given that a Google search for "Squidoo SPAM" yields ten times as many results as the number of "lenses" Squidoo claims. The rest of the results were about evenly divided between random "real" pages and dead links from spammy bush league "search engines," that are basically fronts for Google AdSense "content."

I did find one useful page. There, still chugging away after years of neglect, was my Alexa Toolbar Page, although there was no mention of the new Firefox Toolbar. I'd forgotten about it.

Alexa doesn't really spider the web, but relies mostly on input from -- I don't know -- perhaps dozens of Alexa Toolbar users. The toolbar communicates with Alexa about your web-browsing behavior, which has led many anti-spyware programs to incorrectly target it. The toolbar isn't spyware, since it operates with the user's permission -- but many potential users are scared spitless at the idea that their security is being "compromised."

So why use Alexa's toolbar, especially given that there are competing products out there that don't annoy your anti-spyware utilities? Honestly, unless your sense of civic duty compels you to contribute your two cents' worth to the make-up of the web, there's no particularly good reason -- unless you're a webmaster / blogger.

While most search engines jealously guard their databases, Alexa has taken a different approach. They make their data available to other search engines on a contract basis, as their robots.txt file indicates. Google's "Similar Pages" links are based on Alexa data, for example. (For more information see Search Engine Watch's Spider Spotting Chart.)
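
If you've never looked inside a robots.txt file, this sketch shows how one can be read with Python's standard `urllib.robotparser` module. The sample file below is entirely made up and is not Alexa's actual robots.txt. The `allowed` helper is a hypothetical name, and `ia_archiver` is used here as the user-agent name commonly associated with Alexa's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler gets in, everyone else is barred.
SAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /private/

User-agent: *
Disallow: /
"""


def allowed(user_agent, url, robots_txt=SAMPLE_ROBOTS_TXT):
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("ia_archiver", "http://www.example.com/index.html"))
print(allowed("SomeOtherBot", "http://www.example.com/index.html"))
```

A file shaped like this one is how a site can grant access to selected crawlers while shutting out the rest.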

Because Alexa shares its data, it is a "one-stop shopping" solution for search engine submission. You can submit sites to Alexa via their secret "crawl site" form, but to submit larger numbers of pages, including traffic and linking data, the toolbar is a better choice.

The downside of the toolbar (there's always a downside) is that the chatting between your browser and Alexa.com consumes some bandwidth. Back in the days when I was using 56 Kbps dial-up, I found this performance "hit" to be prohibitive. On a broadband connection, it is barely noticeable.

Try Alexa Toolbar for Internet Explorer

Try Alexa Toolbar for Mozilla Firefox