As the owner of a company that builds Web applications, I appreciate the need for a good search facility built into the application. This is important for all the applications we build, but particularly for e-commerce applications, where failing to find relevant products for a search request can result in a significant loss of revenue. I am sure all of you have experienced the frustration of entering a search term into a Web site's search box only to receive results that are not relevant, or no results at all. Over the last few years a number of open source search solutions have been developed that begin to address the complexities of search. Examples include Lucene and Solr. These systems are excellent, but using them requires a high degree of knowledge, not only of the programming but also of their installation and administration.
With this background, I was really excited to hear that Amazon have launched a new Cloud Web Service known as CloudSearch. Have you ever searched on Amazon’s Web site and been impressed by the accuracy and thoroughness of the results? If so, you have been using the technology behind the new CloudSearch service. So what is this service and what does it provide?
Amazon CloudSearch is a fully managed search service in the cloud. It lets developers concentrate on building applications by providing the search functionality they need without the complexity normally associated with it. The service is easy to use but incredibly powerful. Using it starts with creating a search domain. Data is uploaded to the search domain in either XML or JSON, and must conform to Amazon’s Search Document Format (SDF); Amazon then indexes it and makes it available for complex searches within seconds. Each domain has an endpoint URL to which searches are sent, and results are returned in JSON by default, or in XML if specifically requested. To prevent unauthorised access to the search domain endpoints, security settings allow access to be restricted to individual IP addresses or ranges of addresses. These will typically be the addresses of the machines running the Web application that has the search facility built into it.
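To make the upload-then-search flow a little more concrete, here is a minimal Python sketch of the two pieces of data involved: a JSON batch of documents and a search URL. It makes no network calls. The product fields (`sku`, `title`, `description`), the helper names, the endpoint host name and the API version string in the path are all my own placeholders for illustration, and the exact keys SDF requires depend on the API version you are using (some versions expect extra attributes on each document, such as a version number), so treat this as a rough outline and check the CloudSearch documentation before relying on it.

```python
import json
from urllib.parse import urlencode


def build_sdf_batch(products):
    """Turn a list of product dicts into a JSON batch of 'add' operations.

    The field names used here are hypothetical; a real search domain
    defines its own index fields.
    """
    batch = []
    for product in products:
        batch.append({
            "type": "add",            # operation to perform on the document
            "id": product["sku"],     # unique document id within the domain
            "fields": {
                "title": product["title"],
                "description": product["description"],
            },
        })
    return json.dumps(batch)


def build_search_url(endpoint, query, api_version="2013-01-01"):
    """Build a search URL for a domain's search endpoint.

    `endpoint` is the host name shown for your search domain; the
    api_version path segment is an assumption and must match your domain.
    """
    return f"https://{endpoint}/{api_version}/search?" + urlencode({"q": query})


if __name__ == "__main__":
    products = [
        {"sku": "shoe-001", "title": "Red running shoes",
         "description": "Lightweight shoes for road running."},
    ]
    print(build_sdf_batch(products))
    # Placeholder host name, not a real endpoint.
    print(build_search_url("search-example.invalid", "red shoes"))
```

The batch would be posted to the domain's document endpoint and the URL requested from the search endpoint, both from a machine whose IP address has been granted access in the domain's security settings.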
The pricing model for CloudSearch is based on the number of machine instances running to support your search needs, and there are three instance sizes to choose from initially. On top of this there is a small charge for bulk data upload to the service, plus the normal charges for data transfer out of the cloud that apply to all AWS data. Given the sophistication of the service, together with the simplicity of its use, this is an incredibly attractive proposition that Amazon have again provided as part of the AWS portfolio. I was impressed to find myself using the service within two hours of reading the documentation. If you require search in your applications, I urge you to take a look at this service.