CASE STUDY

Custom CV Model Improves the Accuracy of Image Theft Detection Solution from 65% to 88%

Expertise Delivered

Backend
Data Science
QA

Industry

Legal

Our client, PhotoStat, is an online platform for photographers and artists to find out where and how their images are being used online. PhotoStat enables artists to find and resolve cases of unauthorized usage of their images.

PhotoStat has helped more than 100,000 artists and agencies uncover and handle copyright infringement claims worldwide through tech and legal support.

Disclaimer: The name PhotoStat is a placeholder, as there is an NDA signed between both parties. 

Challenge

As the internet has made images easier to access, image theft has risen sharply over the years. Whether out of sheer unawareness or willful malice, many people use photographers' intellectual property without permission or credit.

One such instance happened in 2012, when John (name changed to maintain confidentiality), a photographer, was battling online image theft. The experience led him to build PhotoStat, a platform where artists and photographers upload their portfolios and check whether their work has been used elsewhere on the internet without authorization.

PhotoStat then performs image matching, searching millions of images on the internet for similar ones. The challenge was that the platform produced a high rate of false positives and false negatives, which confused users. This also hurt the company's brand image and customer base.

The platform was based on a logistic regression model with a precision of less than 65%. It returned many incorrect results, which the team then had to analyze and classify manually.
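As a reminder of what the precision figure measures, here is a minimal sketch of the arithmetic. The counts are hypothetical and chosen only for illustration; they are not PhotoStat's data.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of reported matches that are genuine matches."""
    reported = true_positives + false_positives
    if reported == 0:
        return 0.0
    return true_positives / reported


def recall(true_positives: int, false_negatives: int) -> float:
    """Share of genuine matches that the system actually reported."""
    actual = true_positives + false_negatives
    if actual == 0:
        return 0.0
    return true_positives / actual


# Hypothetical counts, used only to illustrate the calculation.
example_precision = precision(true_positives=65, false_positives=35)
example_recall = recall(true_positives=65, false_negatives=15)
```

At a precision of 0.65, roughly one in three reported matches is a false positive that someone has to review by hand.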

To fix this problem, the client wanted a solution that measures the degree of visual similarity between images instead of simply classifying them as positive or negative results. To allow fast searching and indexing in the future, the platform also needed to hash (fingerprint) the images being compared. Additionally, the client wanted the platform to support different image file formats, namely GIF, PNG, JPG, JPEG, TIFF, WEBP, and PDF.
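One common way to handle several input formats is to identify each upload by its leading "magic bytes" rather than by its file extension. The sketch below is an illustration under that assumption; the case study does not describe how the platform detects formats, and `detect_format` is a hypothetical helper name.

```python
from typing import Optional

# Leading byte signatures for the formats listed above.
# JPG and JPEG are the same format and share one signature.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),  # little-endian TIFF
    (b"MM\x00*", "TIFF"),  # big-endian TIFF
    (b"%PDF-", "PDF"),
]


def detect_format(header: bytes) -> Optional[str]:
    """Return the format name for the first bytes of a file, or None if unknown."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    # WEBP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "WEBP"
    return None
```

Reading the first 12 bytes of a file is enough for every signature in this table.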

Solution

The founders of PhotoStat were looking for companies specializing in Computer Vision services via Clutch and stumbled upon Maruti Techlabs' profile. A couple of calls and meetings later, it was clear that both companies aligned well, and Maruti Techlabs was signed as their Computer Vision Services partner.

These were some of the reasons Maruti Techlabs stood out to the founders:

• Maruti Techlabs' expertise and experience in Image Recognition and Segmentation
• Our work in Object Recognition, OCR, and Process Automation
• Our expertise in refining and sorting different datasets
• The high degree of accuracy of the AI and ML models we have built
• Reviews from our clients on the Maruti Techlabs Clutch profile

1. Feasibility Study: 
After thoroughly understanding the client's use case and requirements, we first conducted a four-week feasibility study, in which our AI experts defined the scope of the solution and analyzed the platform's current state in detail.

During the feasibility study, our AI experts worked with a sample of the image datasets (provided by the client) to determine the feasibility of the desired outcome. After creating the training dataset from the client's database, we filtered, organized, and labeled it to make it search-friendly. The labeled dataset then underwent meticulous quality checks such as adding or removing pixels, removing noise, and sorting misclassified data.

After the data was processed, our engineers augmented the dataset using techniques such as flipping, cropping, blurring, zooming, and compressing as required. In short, our AI experts studied, preprocessed, and matched the sample data, and defined the approaches they would take to build the search engine.
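The augmentation techniques named above can be sketched on a tiny grayscale image stored as nested lists. This is a simplified illustration in plain Python, not the project's code; a production pipeline would normally use an image library, and the function names here are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, row-major, values 0-255


def flip_horizontal(img: Image) -> Image:
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Keep a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def zoom(img: Image, factor: int) -> Image:
    """Enlarge by an integer factor using nearest-neighbour repetition."""
    out = []
    for row in img:
        wide = [px for px in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def box_blur(img: Image) -> Image:
    """Replace each pixel with the integer mean of its 3x3 neighbourhood."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        new_row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            new_row.append(sum(window) // len(window))
        out.append(new_row)
    return out


sample = [[0, 50], [100, 200]]
variants = [
    flip_horizontal(sample),
    crop(sample, 0, 0, 1, 2),
    zoom(sample, 2),
    box_blur(sample),
]
```

Each variant is a plausible way a stolen copy might differ from the original, so training on them helps the model treat those changes as the same picture.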

2. Development:

Once we completed the feasibility study and refined the training datasets, the actual development work began. Using the training datasets, we designed a computer vision search engine that returns images similar to a given input image (uploaded by the artist).

Along with the similar images, the search engine also shows how similar each result is, expressed as a similarity distance. For easy interpretation, the search engine sorts the results into three buckets based on the similarity distance score:

• Exact Match
• Similar Match
• No Match
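The bucketing step can be sketched as a pair of thresholds on a normalized distance. The threshold values and the distance-to-percentage conversion below are assumptions made for illustration; the case study does not state the actual values or formula.

```python
EXACT_MAX = 0.10    # hypothetical: distances at or below this count as "Exact Match"
SIMILAR_MAX = 0.40  # hypothetical: distances at or below this count as "Similar Match"


def bucket(distance: float) -> str:
    """Map a normalized similarity distance (0 = identical, 1 = unrelated) to a bucket."""
    if not 0.0 <= distance <= 1.0:
        raise ValueError("distance must be in [0, 1]")
    if distance <= EXACT_MAX:
        return "Exact Match"
    if distance <= SIMILAR_MAX:
        return "Similar Match"
    return "No Match"


def similarity_percent(distance: float) -> float:
    """Express a normalized distance as a similarity percentage (assumed conversion)."""
    return round((1.0 - distance) * 100.0, 1)
```

For example, `bucket(0.25)` returns `"Similar Match"`, and the same distance corresponds to a similarity of 75.0% under this conversion.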

We built the image search engine as a 5-layered architecture wherein: 

1. The first layer preprocesses the images for noise removal, normalization, filtration, resizing, grayscale conversion, etc.

2. The second layer acts as the feature extractor and feature hash generator. It extracts the features of the images being compared. For the feature hash generator, we built a hash generation algorithm based on the features extracted from the images.

3. We built an index tree for the third layer to enable faster searching of images based on their feature vectors.

4. The fourth layer buckets the incoming image based on a similarity distance calculation. If the initial distance score from feature hash matching does not give a confident result (that is, the distance score exceeds a certain threshold value), this layer matches the feature vectors directly. It then calculates the percentage of the user image's features that match in the public image, which accommodates picture-in-picture similarity scoring.

5. An additional fifth layer was built for logo detection wherein a separate model was built and trained on labeled data of logo images.
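To make the hashing and indexing layers more concrete, here is a minimal sketch that combines a fingerprint, a distance function, and a tree index. It uses an average hash and a BK-tree as stand-ins: the case study does not name the hash algorithm or the tree structure that was actually built, so both are assumptions, as are all names in the code.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels, row-major


def average_hash(img: Image) -> int:
    """Fingerprint: one bit per pixel, set when the pixel is above the mean.

    Assumes the image was already resized to a small fixed grid (e.g. 8x8).
    """
    pixels = [px for row in img for px in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for px in pixels:
        bits = (bits << 1) | (1 if px > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


class BKTree:
    """Metric tree over fingerprints; children are keyed by distance to the parent."""

    def __init__(self) -> None:
        self.root = None  # node layout: (hash, {distance: child_node})

    def add(self, h: int) -> None:
        if self.root is None:
            self.root = (h, {})
            return
        node = self.root
        while True:
            d = hamming(h, node[0])
            if d == 0:
                return  # fingerprint already indexed
            child = node[1].get(d)
            if child is None:
                node[1][d] = (h, {})
                return
            node = child

    def search(self, h: int, max_dist: int) -> List[Tuple[int, int]]:
        """Return (distance, hash) pairs within max_dist, nearest first."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            value, children = stack.pop()
            d = hamming(h, value)
            if d <= max_dist:
                results.append((d, value))
            # Triangle inequality: only subtrees keyed within
            # [d - max_dist, d + max_dist] can contain matches.
            for key, child in children.items():
                if d - max_dist <= key <= d + max_dist:
                    stack.append(child)
        return sorted(results)


original = [[10, 200], [220, 30]]
brightened = [[30, 220], [240, 50]]  # same picture, uniformly brighter
different = [[200, 10], [30, 220]]

tree = BKTree()
for img in (original, different):
    tree.add(average_hash(img))

matches = tree.search(average_hash(brightened), max_dist=1)
```

The brightened copy produces the same fingerprint as the original, so the search finds it at distance 0 while skipping the unrelated image's subtree. In a layered design like the one described above, a query whose best hash distance is too large would fall through to the slower feature-vector comparison.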
 


Don't take our word for it, take theirs!

"Maruti Techlabs helped solve one of our most pressing challenges. With their experience in image recognition and computer vision, they helped make our platform much more efficient. We have seen a significant decrease in manual analysis by 66%, which has worked out great for our team and our customers. Our platform is now able to process complex image similarity scenarios like logo detection, detecting resolution changes, and picture-in-picture on its own with minimal manual intervention."

- Chief Technical Officer

Technology Stack

[Image: technology stack for building an image search engine using Python]

Result

✔️23 percentage points greater precision of image similarity detection

✔️66% decrease in manual analysis

✔️Faster and more accurate image theft discovery

✔️Reduced TAT on case resolution

✔️Increased customer satisfaction

Not only did the computer vision-based search engine reduce costs for PhotoStat in the long run, but it also elevated the entire customer experience of their platform. Greater precision and less manual effort made the platform better and more efficient overall.

Following the success of the collaboration, PhotoStat decided to partner with Maruti Techlabs as their product development partner. Both teams now work in parallel toward PhotoStat's product roadmap and enhanced service delivery.

Our Development Process

We follow Agile, Lean, & DevOps best practices to create a superior prototype that brings your users’ ideas to fruition through collaboration & rapid execution. Our top priority is quick reaction time & accessibility.

We really want to be your extended team, so apart from the regular meetings, you can be sure that each of our team members is one phone call, email, or message away.

How We Work
Acquiring Image Datasets
The first step we carry out is the analysis of business goals and the creation of a database of images extracted from multiple sources. Structured, relevant, and quality data is prepared to serve as a guideline for future comparison.
Labeling Datasets
After structuring the image datasets, we label the images to make the database more search-friendly. Filtering similar patterns and making object comparisons become more efficient with this method. We use variables like color, contour, intensity, and size to create labels and organize the data.
Processing the Data
We then test the labeled dataset against training data by processing it through meticulous quality checks. We run a series of automated processes to enhance the images, like adding / removing pixels, removing noise, sorting misclassified data, & so on.
Data Augmentation
To improve the training data, we modify the images with a variety of techniques like flipping (horizontally or vertically), cropping, blurring, zooming, and compression. This way, we train the model for more accurate image recognition results.
Understanding the Image
In the final stage, we ensure that the model is able to interpret and categorize the object identified correctly. The software is now adequately trained to recognize images from new input sources. With the iterative process, we ensure that the model continues to enhance its capabilities over time.

AI Readiness Audit

Finding the right AI partner is no easy task.

Building an AI product that delivers on value is even more challenging.

We help you get started with a slightly different approach. Before we get into the trenches and kickstart development, we take a top-down approach with an AI Readiness Audit.

This involves validating the idea through qualitative and quantitative analysis of your datasets, identifying the best-fit approach to model development, and putting together an implementation roadmap.

All this happens before writing a single line of code or investing heavily in the idea.

Achieve more with less. 


More social proof in case you're still on the fence

Why Choose Maruti Techlabs?
14+ years experience
Start as quickly as a week
Recurring cost of training and benefits - $0
4.8/5 NPS on Clutch
Certified PMs & delivery teams
Rapid deployment & on-time delivery
Complete transparency
Robust communication across shared channels
Agile & lean startup methodology
Experience across 16 industries

Services
  • Software Product Development
  • Artificial Intelligence
  • Data Engineering
  • DevOps
  • UI/UX
  • Product Strategy
Case Study
  • DelightfulHomes (Product Development)
  • Sage Data (Product Development)
  • PhotoStat (Computer Vision)
  • UKHealth (Chatbot)
  • A20 Motors (Data Analytics)
  • Acme Corporation (Product Development)
Technologies
  • React
  • Python
  • Nodejs
  • Staff Augmentation
  • IT Outsourcing
Company
  • About Us
  • WotNot
  • Careers
  • Blog
  • Contact Us
  • Privacy Policy
Our Offices

USA 
5900 Balcones Dr Suite 100 
Austin, TX 78731, USA

India
10th Floor The Ridge
Opp. Novotel, Iscon Cross Road
Ahmedabad, Gujarat - 380060

©2024 Maruti TechLabs Pvt Ltd . All rights reserved.
