Commit 14786e37 authored by Vishesh Handa

Optimize Contact Search Job Queries

Some of the contact search job queries are pathological, and would
result in Virtuoso consuming a lot of CPU for long periods of time.
I've optimized the queries by doing the following:

* Not adding a "?r a nco:Contact" term. This is unnecessary as it
  results in an extra property being matched. Considering that Nepomuk
  data almost always follows the ontologies (the exception is legacy
  data), just using properties which have a domain of nco:Contact should
  guarantee the correct results.

* Avoid unions - Virtuoso cannot optimize unions that well. Instead
  we use a FILTER(?p in (..)).

* Avoiding regex based search - Regex based search will always be
  terribly slow. It literally applies the regular expression on each
  candidate in order to filter them out. It's a lot better to use the
  full text index. This is done using 'bif:contains'. We do lose a
  little bit of accuracy, and we cannot match word boundaries. But I
  think having a good user experience trumps a little bit of accuracy.
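
For illustration, the three changes above could combine into a query shaped
roughly like the one this sketch builds. It is not the code from this commit:
the function name, the choice of nco:fullname and nco:nickname as example
properties, and the quote-stripping sanitizer are all assumptions, and the
nco: prefix is assumed to be declared wherever the query is executed.

```python
def contact_search_query(term, properties=("nco:fullname", "nco:nickname")):
    """Build an illustrative contact search query (hypothetical helper).

    - No "?r a nco:Contact" term: the listed properties are assumed to
      have a domain of nco:Contact, so matching them is enough.
    - No UNION: the candidate properties go into a single FILTER(?p IN (...)).
    - No regex: the value is matched through Virtuoso's full text index
      using bif:contains with a trailing wildcard.
    """
    # Simplistic sanitizing (assumption): drop quotes and backslashes so the
    # term cannot break out of the quoted full text expression.
    cleaned = "".join(ch for ch in term if ch not in "'\"\\")
    props = ", ".join(properties)
    return (
        "SELECT DISTINCT ?r WHERE { "
        "?r ?p ?v . "
        f"FILTER(?p IN ({props})) . "
        f"?v bif:contains \"'{cleaned}*'\" . "
        "}"
    )


if __name__ == "__main__":
    print(contact_search_query("vish"))
```

The wildcard makes this a prefix match on indexed words, which is where the
small loss of accuracy compared to a regex comes from.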

REVIEW: 107065
parent fe86e5a9