Blogs

Enterprise Search Implementation, Step 3: The Proof of Concept

By John Gillies posted 12-01-2011 12:04

  

This is the third in a series of postings about the process of choosing and implementing an enterprise search engine in a law firm. The first addressed Establishing the Business Requirements, while the second looked at Picking the Right Search Engine.

In my previous posting, I talked about the process that should have resulted in your choosing the search engine that (ideally) best meets your needs. The next stage, the proof of concept, is where you put it to the test and ensure that it performs as expected in your environment.

The following is the definition of “proof of concept” from Technology.org: “A proof of concept [PoC] is a test of an idea made by building a prototype of the application. It is an innovative, scaled-down version of the system you intend to develop. In order to create a prototype, you require tools, skills, knowledge, and design specifications.” (I don’t know if the two terms are related, but for me “proof of concept” brings to mind the often misquoted phrase, “the proof of the pudding is in the eating.”)

Essentially, in the PoC you and your IT colleagues are looking to see whether the engine not only does what the vendor has promised, but whether it also does what you need it to do, in the way you need it done, and does so properly in your technical environment.

Generally, this will involve loading the search software in a test environment, indexing a small percentage of the documents that it will be searching, and running tests to determine how it responds.

While your due diligence up to this point will have included confirmation that the engine should perform in your IT environment, now is the time that the nitty gritty testing will take place. (Since every firm’s IT infrastructure is unique, it’s important to do this testing before proceeding any further.)

You will have identified, in your business requirements document, the various document repositories that you will be indexing (such as the DMS, your accounting system, the intranet, etc.). Once the initial technical testing is done, you will take an appropriately sized “slice” of each of those repositories to index. Ideally, a representative mix of document types, sizes, and security profiles would be included. This will provide the raw data that will be used for the rest of your PoC testing as well as for pilot testing. You will want to make sure that the resulting data subset will go together to provide meaningful results. You may wish, for example, to index documents published within the last 30 days. You will need to be mindful that, as a result, there may be content that pilot testers might expect to find but is not in fact in the PoC database, so it will be important to manage expectations at that time.

Once your various data repositories have been crawled and indexed, you’ll need to set up the user interface so that it displays properly, and then set up and test the security modules. One of the first questions that lawyers will ask is whether the search engine respects the permission walls erected around sensitive information. This should be straightforward to confirm for DMS documents, but if you are including other repositories, particularly for your accounting information, you will want to pay special attention to this issue. Nothing will sink acceptance of your search engine faster than the discovery that users are suddenly able to access documents that should be hidden from them.

This will be the point where you should determine when you are going to conduct your database cleanup. While I will deal with this issue in more detail in my next posting, it’s important to note that you will find that there are a number of sensitive documents that currently exist but are essentially hidden because your current search tools are inadequate to reveal them. (This is referred to as “security through obscurity”.) Since you will have to conduct this cleanup before launch, your question is whether to do it now, before the pilot, or afterwards.

Now you will finally be at the point where you can start some serious testing. Your colleagues in IT will need to carry out their technical tests while you use your business requirements document to test the four key aspects of the search engine, namely

  • relevance
  • responsiveness (i.e., speed)
  • consistency
  • proper working of the key functions

You should know that responsiveness is difficult to test in the PoC, because it’s not really until a full index of your data sources is performed, and released into production, that you’ll know how fast it is. The other three aspects, however, are what you should be focusing on.

As to the third bullet, the search engine should consistently apply the established rules for weighting, ranking, and security criteria on a user by user basis. This may result in different results, but it demonstrates a consistency in the expected user experience regardless of how the user is connecting to the engine.

It will be very helpful if you develop use cases to test these aspects. Develop various types of typical searches that you expect different users to conduct, then carry out those searches and record the results. It’s useful in this context to develop personas (e.g., first year associate who knows nothing, experienced senior associate managing a deal, partner in a litigation matter, assistant doing a search on behalf of his or her lawyer, etc.). With your knowledge of the business requirements, you should also develop test cases that highlight what a specific type of user should “not” find (due to data source, or document security). You will want to keep these use cases for more testing during the pilot phase.

Assuming everything has gone well in your PoC, you will be ready to accept the software, engage in the database cleanup (if you haven’t done so already), and proceed to your pilot testing.



#KnowledgeManagementandSearch
0 comments
18 views