Video recorded during laparoscopic surgeries — in which a fiber-optic camera is inserted into a patient’s abdominal cavity to provide a video feed that guides the surgeon through a minimally invasive procedure — could be of value in training both medical providers and computer systems that would aid with surgery. But with many hours of video documented for each case, reviewing it is time consuming.
Now a new system devised by researchers from MIT and Massachusetts General Hospital can efficiently search through hundreds of hours of video for events and visual features that correspond to a few training examples.
The researchers trained their system to recognize different stages of an operation, such as biopsy, tissue removal, stapling and wound cleansing. But the technique could be applied to any analytical question that doctors deem worthwhile. For example, the system could be trained to predict when particular medical instruments — such as additional staple cartridges — should be prepared for the surgeon’s use, or it could sound an alert if a surgeon encounters rare, aberrant anatomy.
Previously, the team investigated “coresets,” or subsets of much larger data sets that preserve their salient statistical characteristics. These coresets have been used to perform tasks such as deducing the topics of Wikipedia articles or recording the routes traversed by GPS-connected cars.
In this case, the coreset consists of a couple hundred or so short segments of video — just a few frames each. Each segment is selected because it offers a good approximation of the dozens or even hundreds of frames surrounding it. The coreset thus winnows a video file down to only about one-tenth its initial size, while still preserving most of its vital information.
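The article does not spell out how the segments are chosen, but the idea — keep a few frames per stretch of video only if they approximate all the frames around them — can be sketched in a toy form. Everything below (the `window`, `seg_len`, and `eps` parameters, and the per-frame feature vectors) is a hypothetical illustration, not the researchers’ actual construction:

```python
# Toy coreset sketch: for each window of frames, keep a short candidate
# segment only if every frame in the window is within `eps` of some frame
# in that segment. All parameters here are illustrative assumptions.

def frame_distance(a, b):
    """Euclidean distance between two frames, treated as feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def build_coreset(frames, window=10, seg_len=2, eps=1.0):
    """Return (start_index, segment) pairs, one candidate per window."""
    coreset = []
    for start in range(0, len(frames), window):
        win = frames[start:start + window]
        # Candidate segment: the middle seg_len frames of the window.
        mid = max(0, len(win) // 2 - seg_len // 2)
        segment = win[mid:mid + seg_len]
        # Worst-case error if the segment stands in for the whole window.
        err = max(min(frame_distance(f, s) for s in segment) for f in win)
        if err <= eps:
            coreset.append((start, segment))
    return coreset
```

With, say, two frames kept per ten-frame window, the coreset holds about one-fifth of the frames; the real system’s roughly ten-to-one reduction depends on choices this sketch only gestures at.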
For this research, surgeons identified seven distinct stages in a procedure for removing part of the stomach, and the researchers tagged the beginning of each stage in eight laparoscopic videos. Those videos were used to train a machine-learning system, which was in turn applied to the coresets of four laparoscopic videos it hadn’t previously seen. The system assigned each short video snippet in those coresets to the correct stage of surgery with 93 percent accuracy.
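The article does not say what model the system uses, but the workflow — learn from labeled snippets, then assign unseen coreset snippets to a stage — can be illustrated with a minimal nearest-centroid classifier. The stage labels below are the four named in the article, and the feature vectors are purely synthetic stand-ins:

```python
# Toy stand-in for the trained system: average the feature vectors of the
# labeled training snippets per stage, then assign a new snippet to the
# stage with the nearest centroid. Features and model choice are
# illustrative assumptions, not the researchers' method.

def train_centroids(snippets, labels):
    """Compute one mean feature vector (centroid) per stage label."""
    sums, counts = {}, {}
    for vec, label in zip(snippets, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}

def classify(snippet, centroids):
    """Return the stage label whose centroid is closest to the snippet."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lbl: sq_dist(snippet, centroids[lbl]))
```

In this framing, "93 percent accuracy" would mean 93 percent of the held-out coreset snippets receive the label a surgeon would assign.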