Fingerprints have been central to solving crimes since the trial of Thomas Jennings in Chicago in 1911. Jennings shot and killed a homeowner while breaking into the victim's house, and fingerprints found at the scene led to his conviction. Since then, fingerprint evidence has figured in many cases.
Recent research, however, has shown that fingerprint examination can produce inaccurate results. A 2009 report from the National Academy of Sciences found that results are not repeatable from examiner to examiner, and experienced examiners may even disagree with their own past conclusions when they re-examine the same prints at a later date. Such errors can lead to innocent people being wrongly accused and to the guilty going free.
Scientists have been working to reduce human error in fingerprint analysis. This week, scientists from the National Institute of Standards and Technology (NIST) and Michigan State University developed an algorithm that automates a key step in the fingerprint analysis process.
"We know that when humans analyze a crime scene fingerprint, the process is inherently subjective," said Elham Tabassi, a computer engineer at NIST and a co-author of the study. "By reducing the human subjectivity, we can make fingerprint analysis more reliable and more efficient."
There would be no problem if every fingerprint found at a crime scene were high quality. Computers can easily match two sets of rolled fingerprints, which are collected under controlled conditions, such as when a subject rolls 10 fingers onto a fingerprint card or scanner.
The fingerprints left at a crime scene, called latent prints, are often partial, distorted and smudged. If a print is left on a surface with a confusing background pattern, such as a dollar bill, it can be difficult to separate the print from the background.
When an examiner receives latent prints from a crime scene, the first step is to judge how much useful information they can extract.
"This first step is standard practice in the forensic community," said Anil Jain, a computer scientist at Michigan State University and co-author of the study. "This is the step we automated.”
If the print contains sufficient and usable information, it can be submitted to an automated fingerprint identification system (AFIS). AFIS then searches its database and returns a list of potential matches, and the examiner looks for a conclusive match. The initial decision on fingerprint quality is critical.
"If you submit a print to AFIS that does not have sufficient information, you're more likely to get erroneous matches," Tabassi said. On the other hand, "if you don't submit a print that actually does have sufficient information, the perpetrator gets off the hook."
The current process of judging print quality is subjective, and different examiners come to different conclusions. Automating that step makes the results consistent.
“That means we will be able to study the errors and find ways to fix them over time,” Tabassi said.
Automating this step will also allow fingerprint examiners to process evidence more efficiently. That would help them reduce backlogs, solve crimes more quickly and spend more time on the most challenging prints.
Researchers used machine learning to build their algorithm. With traditional programming, the programmer writes out explicit instructions for a computer to follow. In machine learning, the user trains the computer to recognize patterns by showing it examples.
To build the training set, the researchers had 31 fingerprint experts each analyze 100 latent prints, scoring every print's quality on a scale of 1 to 5. These prints and their scores were then used to train the algorithm to judge how much usable information a latent print contains.
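The study's code is not reproduced here, but the general pattern of supervised training on expert-scored examples can be sketched in a few lines. The sketch below is a hypothetical illustration, not the authors' implementation: the random stand-in features, the random forest model and the dataset sizes are all assumptions, whereas the real algorithm works from image-based measurements of the latent prints and the examiners' 1-to-5 scores.

```python
# Minimal sketch of supervised quality-score training (hypothetical, not the
# authors' code). Each latent print is represented by a feature vector, and
# the target is the 1-5 quality score assigned by an expert examiner.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in data: in the real study, features would be computed from latent
# print images and labels would come from the 31 examiners' scores.
n_prints, n_features = 3100, 64
X = rng.normal(size=(n_prints, n_features))           # hypothetical print features
y = rng.integers(1, 6, size=n_prints).astype(float)   # expert quality scores, 1-5

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train a regressor to predict the expert quality score from the features.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Predicted scores for unseen prints; in practice such a score would help
# decide whether a latent print carries enough information to submit to AFIS.
predicted_quality = model.predict(X_test)
print(predicted_quality[:5])
```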
After the training was completed, researchers tested the algorithm's performance by having it score a new series of latent prints. They then submitted the scored prints to AFIS software connected to a database of over 250,000 rolled prints.
The testing scenario differed from real casework in one important way: the researchers knew the correct match for each latent print. If the scoring algorithm worked correctly, the ability of AFIS to find that correct match should correlate with the quality score. Prints scored as high quality should be more likely to produce the correct match, and prints scored as low quality should be more likely to produce erroneous results. That is why it is so important not to inadvertently submit low-quality prints to AFIS in real casework.
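One way to picture that evaluation is to group the test prints by their predicted quality score and compare how often AFIS ranked the true match first in each group. The snippet below is a simplified, hypothetical sketch of that idea; the simulated scores and hit rates are invented for illustration and do not come from the study.

```python
# Sketch of the evaluation idea (hypothetical data): for each test print we
# have the algorithm's quality score and whether AFIS ranked the true mate
# first. A useful score should track the hit rate.
import numpy as np

rng = np.random.default_rng(1)
scores = rng.integers(1, 6, size=1000)          # predicted quality scores, 1-5
# Simulated outcome: higher-quality prints are more likely to be matched.
hit = rng.random(1000) < (0.2 + 0.15 * scores)  # True if AFIS found the true mate

for s in range(1, 6):
    mask = scores == s
    print(f"quality {s}: rank-1 hit rate = {hit[mask].mean():.2f} "
          f"({mask.sum()} prints)")
```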
Based on this metric, the scoring algorithm performed slightly better than the average of the human examiners involved in the study.
The availability of a large dataset of latent prints is what made the breakthrough possible. Machine learning algorithms need large datasets for training, but until now, large collections of latent fingerprints have not been available to researchers because of privacy concerns. The Michigan State Police provided the researchers with the testing dataset after first stripping it of all identifying information.
The next step for the researchers is to use an even larger dataset, which would allow them to improve the algorithm’s performance and more accurately measure its error rate.
A paper on this research was published in IEEE Transactions on Information Forensics and Security.