Identity Matching: Keeping On Top of High Data Volumes – What Are the Options?


Given the sheer volume of data that organisations now have to process for name screening, sophisticated filtering systems are essential. Without them, organisations risk being overwhelmed, either drowning in false positive alerts or, worse still, missing relevant hits.

Organisations can choose from several approaches, with varying levels of sophistication.

Let’s start with the more generic options, all designed to reduce the number of irrelevant hits businesses have to wade through, only to then discard them.

Rule-based qualification applies customizable rules in a post-filtering step. The rules automatically discard hits or flag them as requiring specific investigation and monitoring.
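To make the idea concrete, here is a minimal sketch of a post-filtering step. The `Hit` fields, the two example rules and the 0.80 threshold are all assumptions chosen for illustration; real rules would reflect an organisation's own policy.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hit:
    customer_name: str
    listed_name: str
    score: float      # hypothetical match score from the screening engine, 0.0 to 1.0
    list_type: str    # hypothetical label, e.g. "sanctions" or "pep"


# A rule returns "discard", "investigate", or None when it does not apply.
Rule = Callable[[Hit], Optional[str]]


def investigate_sanctions(hit: Hit) -> Optional[str]:
    # Illustrative rule: every sanctions hit goes to an investigator.
    return "investigate" if hit.list_type == "sanctions" else None


def discard_weak_pep(hit: Hit) -> Optional[str]:
    # Illustrative rule: weak PEP matches are dropped automatically.
    return "discard" if hit.list_type == "pep" and hit.score < 0.80 else None


RULES: List[Rule] = [investigate_sanctions, discard_weak_pep]


def qualify(hit: Hit, rules: List[Rule] = RULES) -> str:
    """Apply the rules in order; the first rule that fires decides the outcome."""
    for rule in rules:
        outcome = rule(hit)
        if outcome is not None:
            return outcome
    return "review"  # no rule fired: leave the hit for normal manual review
```

Because the rules are plain functions in an ordered list, customizing the behaviour means adding, removing or reordering entries rather than changing the filter itself.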

Whitelist application suppresses recurrent hits related to certain persons. For example, a whitelist entry might be a reliable customer whose name happens to resemble that of a listed person.
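One possible way to implement this is sketched below. It assumes that each hit carries a customer identifier and a list-entry identifier (the field names `customer_id` and `listed_id` are hypothetical) and that the whitelist stores cleared pairs rather than bare names, so that a cleared customer still raises an alert if they match a different list entry.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical whitelist: (customer_id, listed_id) pairs already cleared by an analyst.
Whitelist = Set[Tuple[str, str]]


def apply_whitelist(hits: List[Dict[str, str]], whitelist: Whitelist) -> List[Dict[str, str]]:
    """Drop hits whose customer/list-entry pair has already been cleared."""
    return [
        hit for hit in hits
        if (hit["customer_id"], hit["listed_id"]) not in whitelist
    ]
```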

Score-based categorization attributes risk scores to hits based on rules, which again can be customized. It is another way of automating false positive reduction and of prioritizing hits for decision making.
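As a rough illustration, the sketch below adds up points from a few rules and maps the total to a priority band. The point values, band thresholds and field names (`list_type`, `name_similarity`, `dob_match`) are assumptions made up for this example, not recommended settings.

```python
from typing import Any, Dict, List


def risk_score(hit: Dict[str, Any]) -> int:
    """Sum the points of every rule that applies to the hit (illustrative weights)."""
    score = 0
    if hit.get("list_type") == "sanctions":
        score += 50
    if hit.get("name_similarity", 0.0) >= 0.95:
        score += 30
    if hit.get("dob_match"):
        score += 20
    return score


def categorize(hit: Dict[str, Any]) -> str:
    """Map the risk score to a priority band (illustrative thresholds)."""
    score = risk_score(hit)
    if score >= 80:
        return "high"
    if score >= 40:
        return "medium"
    return "low"


def prioritize(hits: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Order hits so that the riskiest are reviewed first."""
    return sorted(hits, key=risk_score, reverse=True)
```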

Those are the more basic options. Now on to the more advanced methods, which are designed to further limit the number of false positive alerts.

The first approach relies on well-structured data. To follow it, customer databases have to contain correctly spelt, properly structured and complete data. Hit reduction is then achieved through dedicated, very focused search logic: searches run through a limited number of permutations and name combinations, and a reduced tolerance is applied in fuzzy matching.
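The contrast between focused and broad search logic can be sketched as follows. This example uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure; the function names, record fields (`given`, `surname`) and thresholds are assumptions for illustration only, and a production screening engine would use its own matching algorithms.

```python
from difflib import SequenceMatcher
from itertools import permutations
from typing import Dict


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def structured_match(customer: Dict[str, str], listed: Dict[str, str],
                     threshold: float = 0.9) -> bool:
    """Focused logic for clean data: compare like with like, with low tolerance."""
    return (
        similarity(customer["given"], listed["given"]) >= threshold
        and similarity(customer["surname"], listed["surname"]) >= threshold
    )


def loose_match(customer_full: str, listed_full: str, threshold: float = 0.75) -> bool:
    """Broad logic for unstructured data: try every token order, with high tolerance."""
    tokens = customer_full.split()
    return any(
        similarity(" ".join(order), listed_full) >= threshold
        for order in permutations(tokens)
    )
```

When the given name and surname are reliably stored in separate fields, the focused variant never has to consider swapped or reordered names, which is where many irrelevant hits come from.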

Another technique makes use of additional information held in a client’s database to help distinguish between false alerts and true alerts during the detection process. This additional information includes entity type, date of birth, geographical information, nationality and gender.
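A simple way to picture this is a check for contradictions between the customer record and the list entry. In the sketch below, the field names and the policy of ignoring missing values are assumptions; how much weight each attribute should carry is a decision for each organisation.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical field names for the secondary attributes mentioned above.
SECONDARY_FIELDS = ("entity_type", "date_of_birth", "country", "nationality", "gender")


def conflicts(customer: Dict[str, Optional[str]], listed: Dict[str, Optional[str]],
              fields: Sequence[str] = SECONDARY_FIELDS) -> List[str]:
    """Return the fields on which both records have a value and the values differ.

    Missing data is not treated as a conflict, since it neither confirms
    nor rules out a match.
    """
    return [
        field for field in fields
        if customer.get(field) and listed.get(field)
        and customer[field] != listed[field]
    ]


def is_likely_false_alert(customer: Dict[str, Optional[str]],
                          listed: Dict[str, Optional[str]]) -> bool:
    """Illustrative policy: any contradicting attribute marks the name hit as likely false."""
    return bool(conflicts(customer, listed))
```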

The last technique is based on optimized processes. Here, client systems are set up so that they do not keep re-screening unchanged data, i.e. screening against the full Politically Exposed Persons (PEP) or sanctions list each time. Instead, an organisation carries out one all-important first run that establishes which alerts are considered important across all customers. Subsequent runs focus purely on the delta list, raising only new alerts or reapplying memorized decisions when alerts reappear. This approach can be costly and time-consuming initially, but organisations should reap the rewards in cost and time saved later on.
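The delta approach might look something like the sketch below. It assumes that watchlist versions are available as dictionaries keyed by entry identifier and that earlier analyst decisions are stored per customer and list-entry pair; `exact_name_match` is only a placeholder for a real matching function, and all names here are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Entry = Dict[str, str]
Pair = Tuple[str, str]  # (customer_id, listed_id)


def compute_delta(previous: Dict[str, Entry], current: Dict[str, Entry]) -> Dict[str, Entry]:
    """Entries of the current list that are new or changed since the previous version."""
    return {key: entry for key, entry in current.items() if previous.get(key) != entry}


def exact_name_match(customer: Entry, entry: Entry) -> bool:
    # Placeholder matcher for the example; a real system would use fuzzy matching.
    return customer["name"].lower() == entry["name"].lower()


def delta_run(customers: List[Entry], delta: Dict[str, Entry],
              decisions: Dict[Pair, str],
              match: Callable[[Entry, Entry], bool] = exact_name_match):
    """Screen customers against the delta only, reusing memorized decisions."""
    new_alerts: List[Pair] = []
    reapplied: List[Tuple[Pair, str]] = []
    for customer in customers:
        for listed_id, entry in delta.items():
            if not match(customer, entry):
                continue
            pair = (customer["id"], listed_id)
            if pair in decisions:
                reapplied.append((pair, decisions[pair]))  # alert seen before
            else:
                new_alerts.append(pair)  # genuinely new alert for an analyst
    return new_alerts, reapplied
```

The same idea applies in the other direction: when the customer base changes, only new or modified customer records need to be screened against the full list.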

What next? Next week’s post looks at real-time business functions and the value of screening hubs.

Also in our series on identity matching:

  1. What is name screening and why do we need it?
  2. What’s in a name?
  3. Getting through the numbers while keeping the hits right