Effective URL: https://kili-technology.com/data-labeling/training-an-id-information-extraction-algorithm-with-kili-technology-the-story-of-lcl
Deep Dive


TRAINING INFORMATION EXTRACTION MODELS: THE STORY OF LCL

Banks are amongst the most regulated establishments in the world. Discover how
LCL built a powerful ID information extraction algorithm using Kili Technology
to classify the IDs of their customers.

Axel Cypel, AI Expert at LCL

Table of Contents

 * Why information extraction? Another issue...
 * Where AI & extracting information comes into action
 * Working with Kili Technology for information extraction: the right solution
 * Back to the topic at hand
 * That’s not the end of the story


WHY INFORMATION EXTRACTION? ANOTHER ISSUE...

Banking is one of the most regulated sectors in advanced economies. The Basel
Committee and the European Central Bank act as supervisors of the commercial
banking system, which has the power to create money through credit. With such
responsibility, banks are subject to national and international regulations.
Regulations from influential jurisdictions such as the United States or the EC
also apply to our own institutions through extraterritorial laws (for instance,
when using USD, or when complying with embargoes).

One of the pillars of good management for a bank is the well-known KYC: Know
Your Customer. A massive amount of data needs to be collected to ensure a
suitable KYC, and banks’ customers are perfectly aware of this obligation. One
unavoidable step (a passage obligé) in building a robust KYC is the collection
of identity documents. To comply with international regulations, banking
institutions are required to hold a recent copy of an identity document for
each of their customers. This copy should be of sufficient quality to be used
for verification or control purposes. The issue begins when you need to parse
your entire client base, a few million customers, to verify this constraint
within a short time frame. This, of course, cannot be done manually.

Another key pillar of good management is Corporate Social Responsibility (CSR).
As banks are a major transmission belt in the economy, policies that drive
ecological change also apply to the finance industry. Because banks are the
natural ally of citizens buying real estate, it is important to set conditions
on the energy performance score when a customer acquires a new construction or
carries out renovations. The “DPE” (Energy Performance Diagnostic) is now a
mandatory document whenever a real estate loan is signed. This document
contains the data that allow banks to produce the regulatory extra-financial
reporting requested by the regulators.


WHERE AI & EXTRACTING INFORMATION COMES INTO ACTION

Many formats of ID documents (national ID cards, passports, residence permits)
and DPEs, for which there is no unique template, are collected and dumped in
the LCL Electronic Document Management system. For each document, a dozen text
fields must be extracted to ensure that the proper information is registered
either in the bank’s CRM or in the appropriate reporting. And this information
extraction must be executed successfully millions of times, for our millions of
customers (we do not complain!).

To recognize, categorize and extract structured data from these documents, we
have two options: build an army of labelers to annotate 100% of the raw data
manually or use state-of-the-art recognition algorithms to develop AI models.
The former is not feasible in real life, while the latter is more than an
option: it is the solution. And LCL is equipped for this challenge: there is
already a team in LCL specialized in the business of creating AI products to
process images, text files, voice samples and run named entity recognition or
natural language processing techniques on legal documents.

Document categorization & natural language processing are particularly well
served by using supervised machine learning. The challenge resides in the high
expectations from the business units (Compliance Department, CSR Management): we
cannot afford any mistakes given the importance of our two missions. But that’s
the way our business works!

Even with our choice to use artificial intelligence and information extraction,
building a labeling team is still a challenging task. As we said, we did not
plan to hire an extensive team of annotators. But when using supervised
learning, it is well known that to train our model, we need to obtain an example
database, containing thousands of labeled and annotated images. This is where we
need software to handle our document extractions and pre-processing to simplify
the job of our five dedicated labelers handling the extracted data. 


WORKING WITH KILI TECHNOLOGY FOR INFORMATION EXTRACTION: THE RIGHT SOLUTION

To be able to label our existing data, we chose to work with Kili Technology,
the labeling platform to build high-quality training data from structured and
unstructured data.

Having extracted tens of thousands of images from our Enterprise Data
Management, we stored them on our on-premises servers. A bridge with the Kili
Technology software, also installed on-premises, allows our remote team of
labelers to work on entity identification, classification and global annotation
tasks (e.g. the letter grade from the energy diagnostic, or the expiry date of
a national identity card). The Kili Technology SDK allows us to use our custom
models to run OCR on regions of interest, extract information, and prepare
these documents for manual annotation.
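
As a rough illustration of the pre-processing step described above, the sketch
below turns the output of a custom OCR model into a pre-annotation record that
a labeler could then confirm or correct. It is a minimal sketch under stated
assumptions, not LCL's pipeline and not the Kili Technology SDK: the names
FieldPrediction, build_pre_annotation and REVIEW_THRESHOLD, the field names and
the confidence threshold are all hypothetical.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass

# Hypothetical threshold: fields read with lower confidence are flagged
# so that a human labeler looks at them first.
REVIEW_THRESHOLD = 0.90


@dataclass
class FieldPrediction:
    name: str          # illustrative field name, e.g. "expiry_date"
    text: str          # text read by the (assumed) OCR model
    confidence: float  # model confidence in [0, 1]


def build_pre_annotation(document_id: str,
                         predictions: list[FieldPrediction]) -> dict:
    """Turn raw model output into a record a labeling tool could display."""
    fields = []
    for prediction in predictions:
        entry = asdict(prediction)
        entry["needs_review"] = prediction.confidence < REVIEW_THRESHOLD
        fields.append(entry)
    return {"document_id": document_id, "fields": fields}


# Stand-in for the output of a custom OCR model (all values are made up).
example = build_pre_annotation(
    "doc-001",
    [
        FieldPrediction("expiry_date", "2031-05-14", 0.97),
        FieldPrediction("surname", "MARTIN", 0.72),
    ],
)
```

Flagging low-confidence fields rather than discarding them keeps the labeler in
control: the model proposes a value and the human validates it.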

Labelers, but also experts from our business units, enjoy working on Kili
Technology because they can focus on the task at hand and finish it on time.
This contrasts with our former constant fear of losing work, or of nightmarish
file management across a dozen Excel files. From our perspective, not having to
bother with data transfer and backups is a great relief: Kili Technology was
installed in line with our data safety standards. After some use, both hired
and volunteer labelers found Kili Technology very comfortable to work with.
People from our business units wanted to be involved in building AI: engaging
many people as a workforce helped unite the company around AI.


In the end, Kili Technology provides strong labeling software focused on
dataset quality and an easy-to-use interface, but there is also a team behind
the scenes. Our counterparts at Kili Technology responded very quickly whenever
we encountered an issue. Our dedicated customer success manager pays close
attention to any pain point that arises and will organize meetings with
technical staff, should we need customization or training on certain
functionalities of the application.

On the one hand, all the features needed for labeling are present, and we use a
few of them often. On the other hand, we sometimes need a feature that is not
(yet) developed but that can be added to the roadmap. As a large corporation
collaborating with a start-up, our paces sometimes differ. But even with these
differences in working model, Kili Technology remains very attentive to our
challenges.


BACK TO THE TOPIC AT HAND

For each of our two labeling campaigns, we used Kili Technology’s labeling
platform. It allowed us to push, label and retrieve the data that will feed our
machine learning algorithms. Deep learning is a big consumer of data if the
output model is to meet the business requirement with a very small error rate.

A standard annotation campaign typically covers 5,000 documents to be annotated
in 2 or 3 weeks. Obtaining them requires significant data preparation work
upstream. Because our AI infrastructure now includes Kili Technology, people
trained on the platform can use the tool for all kinds of projects. Simply by
requesting that the relevant raw data be loaded in, LCL teams can drastically
accelerate the creation of their training datasets, which is a significant
improvement for all the parties involved.
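
To give a sense of the pace these figures imply, the short calculation below
combines the 5,000 documents and the 2-to-3-week window with the five dedicated
labelers mentioned earlier. It is only a back-of-the-envelope sketch: the
function name, the five working days per week, and the assumption that all five
labelers work on the campaign full time (with no volunteers) are ours, not
figures from LCL.

```python
def docs_per_labeler_per_day(documents: int,
                             weeks: int,
                             labelers: int = 5,
                             working_days_per_week: int = 5) -> float:
    """Average number of documents one labeler must annotate per working day."""
    return documents / (labelers * weeks * working_days_per_week)


# 5,000 documents in 2 weeks: 5000 / (5 * 2 * 5) = 100 documents per day.
two_week_rate = docs_per_labeler_per_day(5_000, weeks=2)

# 5,000 documents in 3 weeks: 5000 / (5 * 3 * 5), roughly 67 documents per day.
three_week_rate = docs_per_labeler_per_day(5_000, weeks=3)
```

Under these assumptions, each labeler handles on the order of 70 to 100
documents a day.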

Once our models are ready, they are integrated by our IT team and pushed to
production at scale. See for yourself: more than 13 million documents were run
in batches to check whether every single KYC was complete with its readable ID
document. The data extracted from every scanned copy fed a compliance tool that
allowed the retail network of advisors to update documents with their clients.
All of that is done algorithmically, with a training dataset built on Kili
Technology.
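
The batch check described above can be pictured roughly as follows. This is a
minimal sketch, not LCL's production code: the ExtractedId record, the helper
names, the batch size, and the rule that an unreadable, undated or expired
document needs an update are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from itertools import islice
from typing import Iterable, Iterator


@dataclass
class ExtractedId:
    customer_id: str
    readable: bool            # did the model manage to read the document?
    expiry_date: date | None  # None when the field could not be extracted


def batched(items: Iterable[ExtractedId], size: int) -> Iterator[list[ExtractedId]]:
    """Yield successive lists of at most `size` records."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def needs_update(record: ExtractedId, today: date) -> bool:
    """Assumed rule: unreadable, undated or expired documents need an update."""
    return (not record.readable
            or record.expiry_date is None
            or record.expiry_date < today)


def customers_to_contact(records: Iterable[ExtractedId],
                         today: date,
                         batch_size: int = 1000) -> list[str]:
    """Go through the records batch by batch and collect the flagged customers."""
    flagged: list[str] = []
    for batch in batched(records, batch_size):
        flagged.extend(r.customer_id for r in batch if needs_update(r, today))
    return flagged
```

Processing in fixed-size batches keeps memory use bounded, which matters when
the input is millions of documents rather than a short list.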

As for the extra-financial reporting, our algorithms will capture the relevant
information contained in the DPEs during the credit process. That is, once
again, something that cannot be done manually except at great cost. And for all
these services rendered by AI, there is a need for a labeled database. Without
a tool such as Kili Technology, fulfilling the requirements of the regulators
would take months and make us miss the compliance schedule. Regulators do not
wait!


THAT’S NOT THE END OF THE STORY

As you may have experienced, banks collect data from customers all the time:
images, e-mails, phone calls, contracts, etc. We can even process voice
recordings, where speech-to-text algorithms make it possible to apply NLP to
live conversations. We will undoubtedly use Kili Technology to annotate
additional assets, all with the goal of improving our security and customer
satisfaction. Regulators never sleep!

To learn more about information extraction, optical character recognition,
natural language processing, automatic annotation and information retrieval,
check out our webinars and other articles!

Kili Technology © 2023
