Recently, Vanilla launched a new advanced search for enterprise customers. This new feature is the result of over a year of feedback and research into what makes a great community search. We spent a long time thinking about search and really believe that our new advanced search strikes the right balance between power and simplicity. It's been a fun process developing this so I wanted to share some of the inside scoop on this new feature.
Why is Search Hard?
One of the more common complaints on any forum platform is the search. In fact you could go so far as to say that search isn't that great on most any blogging, publishing, social bookmarking, or any other platform that isn't a dedicated search engine coming from the likes of Google or Bing. Why is this the case though? The search engines have to search the entire Internet while these other platforms only need to search one site. Searching one site should be easier than searching all sites.
Well, the main reason I think search engine sites are better is that they make search their central business. Other platforms treat search as a secondary concern. In a way, this must be the case. However, there are more specific reasons why search is hard.
- Storing data in a way that is easy to insert and modify makes it hard to search. There is a fundamental difference in the way that data needs to be indexed for full text search. If you don't use a special-purpose search engine for a large community then even a single search can take down the whole system. As it turns out, we've done a lot of migrations precisely for this reason. User searches, site crashes, call Vanilla.
- A dedicated search engine takes a purpose-built infrastructure. Out-of-the-box software that you install yourself is built to run on the lowest common denominator of web hosts. That means that a good search engine is not something that is likely to be supported. And if a search engine is supported it's most definitely not going to be easy to install, configure and maintain. There is a whole variety of software out there nowadays that makes scaling sites possible, but none of it runs on GoDaddy (to pick on just one of many companies).
- A community is composed mostly of user-generated content. When you get a massive amount of user-generated content you end up getting a bad signal-to-noise ratio for that content. I want to stress that this isn't a bad thing, but it makes search more difficult. You want each and every piece of that noisy content to be a candidate for search, but you really want to see the best content first. In a way, this is where the search engines that crawl the entirety of the Internet have an easier time. They can be brutal editors when determining what is a good piece of content but we have fewer tools to curate a single site.
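To make the first point a little more concrete, here is a minimal sketch of an inverted index, the kind of structure full text search relies on. This is an illustration only, not how Vanilla's search is actually built: the function names and sample posts are made up, and real search engines add stemming, ranking, and on-disk storage on top of this idea.

```python
from collections import defaultdict


def build_inverted_index(posts):
    """Map each lowercase word to the set of post ids that contain it.

    `posts` is a hypothetical {post_id: body_text} dict.
    """
    index = defaultdict(set)
    for post_id, body in posts.items():
        for word in body.lower().split():
            index[word].add(post_id)
    return index


def search(index, query):
    """Return the ids of posts that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the posts for the first word, then intersect the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Made-up sample data for illustration.
posts = {
    1: "How do I reset my password",
    2: "Password reset email never arrives",
    3: "Favorite keyboard shortcuts",
}
index = build_inverted_index(posts)
matches = search(index, "password reset")
```

A table of posts is organized by post, which is great for inserting and editing. The index above is organized by word, which is great for lookups but has to be updated every time a post changes. That tension is why search usually ends up in its own system.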
Making Search Smarter
We put in a lot of work under the hood to make our search smarter. Ideally, when you type some words in a search box you want to find something that exactly matches what you had in mind when you started typing. In reality, all we can do is find stuff that contains the same words or word forms that you typed. Where the search gets smart is in ranking those search results. The smartest search will put the best results near the top. But what determines the "bestness" of a search result? Well, there are a few factors that go into search ranking.
- How many of the words you typed are in the search result? Most search engines incorporate this criterion when ranking your search results. The algorithms involved take a combination of the number of words, the rarity of words, and the order of the words you typed. All these factors are put into a match quality ranking.
- How new or old is the content being searched? Forums generate content at an incredibly fast pace. That content can get stale pretty quickly too. We found that sorting on keyword match quality only would show too many stale, useless results. We found that it's absolutely necessary to take into account how new or old content is when doing community search. You might be interested to know that this was our only sort criterion in the previous iteration of our search.
- How good is the content? As I said in the previous point: forums generate a lot of content and not all of it is good. Well, with our reactions system we provide users with a way of curating the content in your community. When users react positively to a post then it increases in score. When users react negatively to a post then it decreases in score. This score contributes to a post's search rank.
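Here is a rough sketch of how the three factors above could be folded into a single rank. To be clear, this is not our actual ranking formula: the function names, the 30-day half-life, and the reaction weight are all assumptions chosen for illustration.

```python
import math


def reaction_factor(reaction_score, weight=0.2):
    """Turn a post's net reaction score into a multiplier around 1.0.

    Positive scores push the multiplier above 1, negative scores below.
    The log keeps heavily reacted posts from dominating, and the floor
    keeps the multiplier positive. `weight` is an assumed tuning knob.
    """
    signed = math.copysign(math.log1p(abs(reaction_score)), reaction_score)
    return max(0.1, 1.0 + weight * signed)


def rank_score(match_quality, age_days, reaction_score, half_life_days=30.0):
    """Combine keyword match quality, freshness, and reactions.

    `match_quality` is assumed to come from the search engine's own
    keyword scoring. Freshness halves every `half_life_days` days.
    """
    recency = 0.5 ** (age_days / half_life_days)
    return match_quality * recency * reaction_factor(reaction_score)


# Same keyword match, different age and reactions.
fresh_liked = rank_score(match_quality=1.0, age_days=2, reaction_score=12)
stale_plain = rank_score(match_quality=1.0, age_days=400, reaction_score=0)
```

Multiplying the factors means a stale post needs a much stronger keyword match to outrank a fresh one, which is the behavior described above. An additive blend would be another reasonable design.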
Giving You More Search Options (but not too many)
We've known for a long time that we needed to offer more search options, but we wanted to get it right. I suppose in the beginning we were idealistic in providing just one freeform search box. We thought it would make search easier. However, we've found that users get frustrated when they can't find the content that they want even though they know more information such as the author or when they posted.
We knew we needed to add more search options, but wanted to strike the right balance. When looking at other advanced search systems we noticed that a lot of them went too far in the "kitchen sink" direction. They offered so many search options that it was incredibly difficult for us to find the one option we wanted to search on. Internally, we call this the "knobs and dials" approach to user-interface design and that's not a term of endearment.
We think that offering too many search options is an even bigger mistake than offering too few. First off, it makes scanning the user-interface stressful. Secondly, it can result in users entering too many search options and getting no search results at all. Remember that you want to come up with a list of search results to choose from.
In the end, I think we came up with a good mix of search options without cluttering the user experience. You may notice that our advanced search bears a striking resemblance to the advanced search in Gmail. This is no coincidence. We drew inspiration from a lot of different sources, and it's incredibly hard to ignore the king of search itself.
Nice Little Touches
After the initial design of our advanced search we went through a refinement phase and added a few nice touches. I think this is an important part of crafting good software and I want to illustrate a few of them here:
- Simplified search excerpts. Before, we showed entire posts in search results; formatting, images and all. We found that this looked cluttered and made it difficult to scan through the search results so we simplified the output.
- Grouped search results. We now group search results by the discussion thread. The search result that is shown is the best match within the group, but you can click to drill down and see the rest of the matches. This prevents too many posts in one discussion from clogging up the search results. Sometimes you want to find an individual comment, but most of the time you want to find a discussion thread. We support both use-cases.
- Pictures and video in search results. If a search result contains a picture or a video then we display it nicely tucked away to the right of the search results. If you click play on a video it resizes before playing.
- Quickly search a category or discussion. You'll notice a little magnifying glass icon beside some of the page lists. If you click this you'll get a quick search dialog within the category or discussion that you are currently looking at.
- Autocomplete. On some of the search boxes you'll see a drop down of search results appearing below the box as you type. This adds a quick way of searching without going to the full search page and gives you a glance at the top search results.
- Tools for power users. We've added a few ways of refining your search. Here are some pro-tips:
  - You can put a plus sign (+) before a search term to say that it must be in the search result.
  - You can put a minus sign (-) before a search term to say that it must not be in the search result.
  - You can surround two or more words in "double quotes" to search for an exact phrase.
- One-word searches. If you search for just one word then the search is automatically sorted by most recent date alone. We found that a lot of people search for just one word to find out what is being said about that word right now. This type of one-word search is very common with people searching for their own usernames. It's affectionately called an "ego search" and it lets people see what is being said about them.
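For the curious, here is a small sketch of how the power user operators and the one-word rule above might be handled. It is a simplified illustration and not our production parser: the names `ParsedQuery`, `parse_query`, and `choose_sort` are hypothetical, and so are the "date" and "relevance" labels.

```python
import re
from dataclasses import dataclass, field

# An optional +/- prefix followed by a "quoted phrase" or a bare word.
TERM_PATTERN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')


@dataclass
class ParsedQuery:
    required: list = field(default_factory=list)  # +term
    excluded: list = field(default_factory=list)  # -term
    optional: list = field(default_factory=list)  # plain terms and phrases


def parse_query(query):
    """Split a raw search string into required, excluded, and optional terms."""
    parsed = ParsedQuery()
    for prefix, phrase, word in TERM_PATTERN.findall(query):
        term = phrase or word
        if not term:
            continue  # skip empty quotes
        if prefix == "+":
            parsed.required.append(term)
        elif prefix == "-":
            parsed.excluded.append(term)
        else:
            parsed.optional.append(term)
    return parsed


def choose_sort(parsed):
    """Sort a single-word search by date alone, anything else by relevance."""
    terms = parsed.required + parsed.optional
    one_word = len(terms) == 1 and not parsed.excluded and " " not in terms[0]
    return "date" if one_word else "relevance"


parsed = parse_query('+vanilla -spam "advanced search" theme')
```

In this sketch a quoted phrase counts as more than one word, so it still gets the full relevance ranking.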
We Hope You Like it!
We put a lot of effort into our new advanced search and really hope you like its new incarnation. But are we done with search yet? Of course not. We will most likely continue to tweak and improve our ranking algorithm over the coming months based on feedback. Even the famous (and secretive) Google PageRank is being constantly updated. With search, we want to come up with an ever better answer to the age-old question: "What are you looking for?"