- Full-text search that looks at the Title, Code, and Notes fields of a snippet
- Has to work on Azure
- Has to be fairly easy to set up and tweak
- Will need to be able to search tags
- Has to support more granular parameters for handling Advanced Search
So I followed this tutorial to set up Lucene: http://chriskirby.net/getting-full-text-search-up-and-running-in-azure/ and everything went pretty smoothly.
Permission filtering preparation
So I set up field indexing like this:
private static Document MakeDocumentFromSnippet(
    SearchIndexSnippetModel snippet) {
    var doc = new Document();
    doc.Add(new Field(FIELD_ID,
                      snippet.Id.ToString(),
                      Field.Store.YES,
                      Field.Index.NOT_ANALYZED,
                      Field.TermVector.NO));
    doc.Add(new Field(FIELD_USER_ID,
                      snippet.UserId,
                      Field.Store.YES,
                      Field.Index.NOT_ANALYZED,
                      Field.TermVector.NO));
    doc.Add(new Field(FIELD_IS_PUBLIC,
                      snippet.IsPublic.ToString(),
                      Field.Store.YES,
                      Field.Index.NOT_ANALYZED,
                      Field.TermVector.NO));
    // indexing the Title, Text, and Notes below is omitted
    ....
    return doc;
}
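A quick experiment illustrates what Field.Index.NOT_ANALYZED changes for exact-match fields like these. This is only a sketch, assuming Lucene.Net 3.0.3 and an in-memory RAMDirectory in place of the Azure directory; the field name "IsPublic" and the class AnalyzedVsNotAnalyzedDemo are made up for illustration. A TermQuery looks up the term exactly as given, while StandardAnalyzer lowercases tokens at index time, so the value "True" should only be found when the field is not analyzed.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

public static class AnalyzedVsNotAnalyzedDemo
{
    // Indexes one document whose hypothetical "IsPublic" field holds "True",
    // then counts how many documents an exact TermQuery for "True" finds.
    public static int CountExactMatches(Field.Index indexMode)
    {
        var directory = new RAMDirectory(); // in-memory stand-in for the Azure directory
        var analyzer = new StandardAnalyzer(Version.LUCENE_30);

        using (var writer = new IndexWriter(directory, analyzer,
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            var doc = new Document();
            doc.Add(new Field("IsPublic", true.ToString(),
                              Field.Store.YES, indexMode));
            writer.AddDocument(doc);
        } // disposing the writer commits the document

        using (var searcher = new IndexSearcher(directory, true))
        {
            var query = new TermQuery(new Term("IsPublic", true.ToString()));
            return searcher.Search(query, 10).TotalHits;
        }
    }
}
```

Calling CountExactMatches(Field.Index.NOT_ANALYZED) should find the document, while CountExactMatches(Field.Index.ANALYZED) should find nothing, because the analyzed field holds the term "true" rather than "True".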
Note the Field.Index.NOT_ANALYZED specified for the FIELD_USER_ID and FIELD_IS_PUBLIC fields. We will use these fields when we filter out the data the current user is not allowed to see.
Indexing and Updating
So I added some code that loops through all the existing snippets, and indexes each one like this:
public void AddSnippetToIndex(SearchIndexSnippetModel snippet) {
    using (var writer = MakeIndexWriter()) {
        var doc = MakeDocumentFromSnippet(snippet);
        writer.AddDocument(doc);
    }
}
And every time a snippet is updated, I update the index like this:
public void UpdateSnippetInIndex(SearchIndexSnippetModel snippet) {
    using (var writer = MakeIndexWriter()) {
        var doc = MakeDocumentFromSnippet(snippet);
        writer.UpdateDocument(new Term(FIELD_ID, snippet.Id.ToString()), doc);
    }
}
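How UpdateDocument picks the document to replace can be checked with a small experiment. This is a sketch assuming Lucene.Net 3.0.3 with an in-memory RAMDirectory; the field names "Id" and "Title" and the class UpdateByTermDemo are hypothetical. It adds one document, calls UpdateDocument with a term on the ID field, and counts the live documents afterwards.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

public static class UpdateByTermDemo
{
    // Adds a document, "updates" it by ID term, and returns the number of
    // live documents left in the index.
    public static int UpdateAndCount(Field.Index idIndexMode)
    {
        var directory = new RAMDirectory(); // in-memory stand-in for the Azure directory
        var analyzer = new StandardAnalyzer(Version.LUCENE_30);

        using (var writer = new IndexWriter(directory, analyzer,
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            writer.AddDocument(MakeDoc("42", "first version", idIndexMode));
            // Deletes every document matching the term, then adds the new one.
            writer.UpdateDocument(new Term("Id", "42"),
                                  MakeDoc("42", "second version", idIndexMode));
        }

        using (var searcher = new IndexSearcher(directory, true))
        {
            return searcher.Search(new MatchAllDocsQuery(), 10).TotalHits;
        }
    }

    private static Document MakeDoc(string id, string title, Field.Index idIndexMode)
    {
        var doc = new Document();
        doc.Add(new Field("Id", id, Field.Store.YES, idIndexMode));
        doc.Add(new Field("Title", title, Field.Store.YES, Field.Index.ANALYZED));
        return doc;
    }
}
```

With Field.Index.NOT_ANALYZED the index should end up with one document. With Field.Index.NO the ID is stored but not searchable, so the term matches nothing, the old document is never deleted, and the index should end up with two.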
To make this work correctly, you have to tell the writer which document to update. Since Lucene doesn't support actual updating, it will delete all documents that match the term you provided and add the new document doc. It wasn't finding the document to update at first because I was using Field.Index.NO for the FIELD_ID field during indexing. Switching to Field.Index.NOT_ANALYZED fixed the problem.
Searching and permission filtering
I have a single input for the user to type their search query. The results returned from the query need to be filtered to only show snippets the current user has access to. To achieve this, we need to add additional conditions to the query the user types. At this point, the user has access to their own snippets and to any public snippets.
Here is an excerpt from the Search method where I prepare the queries:
// prepare the searcher and parser
var searcher = new IndexSearcher(m_azureDirectory);
var parser = new MultiFieldQueryParser(Version.LUCENE_30,
                                       textSearchFields,
                                       new StandardAnalyzer(Version.LUCENE_30));

// parse the user query
var userQuery = parser.Parse(query);

// filter out results that don't belong to the current user and
// that are not public
var onlyThisUser = new TermQuery(new Term(FIELD_USER_ID,
                                          BizSession.CurrentState.UserId));
var onlyPublic = new TermQuery(new Term(FIELD_IS_PUBLIC, true.ToString()));
var onlyThisUserOrPublic = new BooleanQuery
{
    { onlyThisUser, Occur.SHOULD },
    { onlyPublic, Occur.SHOULD }
};
var finalQuery = new BooleanQuery
{
    { onlyThisUserOrPublic, Occur.MUST },
    { userQuery, Occur.MUST }
};

// do the search
var totalToRequest = (criteria.PageNumber + 1) * criteria.PageSize;
var results = searcher.Search(finalQuery, totalToRequest);
So I create additional term queries for matching the current user's id, and for matching public snippets. Then I add both of those term queries to the BooleanQuery onlyThisUserOrPublic, using Occur.SHOULD to tell it that at least one of the two conditions has to match. This is like saying, "either the User Id matches the current user's id, or the snippet is public".
Then I add both my new permission query and the user query into another BooleanQuery, telling it this time that both conditions MUST occur. This gives us a final query of (matches user input AND (snippet's user id == current user id OR snippet is public)).
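To check that the combined query behaves as described, here is a minimal end-to-end sketch. It assumes Lucene.Net 3.0.3 and an in-memory RAMDirectory; the field names, the three sample snippets, and the class PermissionFilterDemo are all invented for the example and are not the ones used in the real index.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

public static class PermissionFilterDemo
{
    // Hypothetical field names, standing in for the FIELD_* constants.
    const string Id = "Id", UserId = "UserId", IsPublic = "IsPublic", Title = "Title";

    // Returns the sorted IDs of the sample snippets visible to currentUserId
    // that also match the text the user typed.
    public static List<string> Search(string currentUserId, string userInput)
    {
        var directory = new RAMDirectory();
        var analyzer = new StandardAnalyzer(Version.LUCENE_30);
        using (var writer = new IndexWriter(directory, analyzer,
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            writer.AddDocument(MakeDoc("1", "alice", false, "lucene notes"));   // alice, private
            writer.AddDocument(MakeDoc("2", "bob", true, "lucene tips"));       // bob, public
            writer.AddDocument(MakeDoc("3", "bob", false, "lucene secrets"));   // bob, private
        }

        var parser = new MultiFieldQueryParser(Version.LUCENE_30, new[] { Title }, analyzer);
        var permission = new BooleanQuery
        {
            { new TermQuery(new Term(UserId, currentUserId)), Occur.SHOULD },
            { new TermQuery(new Term(IsPublic, true.ToString())), Occur.SHOULD }
        };
        var finalQuery = new BooleanQuery
        {
            { permission, Occur.MUST },
            { parser.Parse(userInput), Occur.MUST }
        };

        var ids = new List<string>();
        using (var searcher = new IndexSearcher(directory, true))
        {
            foreach (var hit in searcher.Search(finalQuery, 10).ScoreDocs)
                ids.Add(searcher.Doc(hit.Doc).Get(Id));
        }
        ids.Sort();
        return ids;
    }

    private static Document MakeDoc(string id, string userId, bool isPublic, string title)
    {
        var doc = new Document();
        doc.Add(new Field(Id, id, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field(UserId, userId, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field(IsPublic, isPublic.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.Add(new Field(Title, title, Field.Store.YES, Field.Index.ANALYZED));
        return doc;
    }
}
```

Searching for "lucene" as alice should return snippets 1 and 2 (her own private snippet plus bob's public one), and the same search as bob should return snippets 2 and 3.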
And then we loop through the results (there is some paging code there) and make the resulting snippet list, which I omitted for brevity.
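For readers who want something to start from, here is one possible shape for that paging step. It is a sketch, not the code used on the site: the helper Paging.PageSlice is hypothetical, and it assumes the page number is zero-based, which is what the (PageNumber + 1) * PageSize request size suggests.

```csharp
using System;
using System.Collections.Generic;

public static class Paging
{
    // Given the top (pageNumber + 1) * pageSize hits, returns only the hits
    // that belong to the requested zero-based page.
    public static List<T> PageSlice<T>(IList<T> topHits, int pageNumber, int pageSize)
    {
        var page = new List<T>();
        int start = pageNumber * pageSize;                     // first hit of the page
        int end = Math.Min(start + pageSize, topHits.Count);   // the last page may be short
        for (int i = start; i < end; i++)
        {
            page.Add(topHits[i]);
        }
        return page;
    }
}
```

With Lucene.Net 3.0.3 this could be called as Paging.PageSlice(results.ScoreDocs, criteria.PageNumber, criteria.PageSize), and each returned hit could then be loaded with searcher.Doc(hit.Doc) to read the stored fields.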
To Be Continued...
The tags are not included in the search at this time. Once tag search is done, I will write part 2 of the post concentrating just on that.
Please leave thoughts and comments below.