Have you ever heard a strange sound coming from your car? You instinctively feel something is wrong, but what next? You call the dealer and tell them, for example, that the car makes a noise when you tap the brakes. Then you need to arrange an appointment, get to the appointment itself, and so on.
Human experts are quick to decode information from sound and determine what is happening around them. Often, that annoying drip, click, rattle, or screech indicates something that is wrong or about to go wrong. An expert car mechanic can often infer what is wrong from the sound alone.
This is the challenge that the IEEE Signal Processing Society DCASE (Detection and Classification of Acoustic Scenes and Events) 2023 Challenge aimed to address. It focuses on the potential of accurately decoding specific sounds in context and drives research in several areas of interest to service and asset maintenance organizations:
- AI sound models that can be trained on one type of machine and then used to detect anomalies on a completely different type of machine.
- The ability to use spatial audio to accurately detect where a sound is coming from.
- Models that can run on small, low-power edge devices such as phones.
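To make the first research area concrete: anomalous sound detection is commonly framed as modeling what "normal" sounds like and scoring how far a new sound deviates from that model. Below is a minimal sketch of the idea using synthetic signals and a simple per-frequency Gaussian over magnitude spectra; real DCASE systems typically use log-mel spectrograms and learned embeddings, so every signal and parameter here is illustrative:

```python
import numpy as np

def spectral_features(signal, n_fft=256):
    """Average magnitude spectrum over short frames."""
    frames = signal[: len(signal) // n_fft * n_fft].reshape(-1, n_fft)
    return np.abs(np.fft.rfft(frames, axis=1)).mean(axis=0)

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000, endpoint=False)

# "Normal" machine sound: a steady 440 Hz hum with mild noise.
normal = [np.sin(2 * np.pi * 440 * t) + 0.05 * rng.standard_normal(t.size)
          for _ in range(20)]

# Model "normal" as a per-frequency Gaussian over the training features.
feats = np.stack([spectral_features(s) for s in normal])
mu, sigma = feats.mean(axis=0), feats.std(axis=0) + 1e-6

def anomaly_score(signal):
    """Mean squared z-score of the signal's spectrum vs. the normal model."""
    z = (spectral_features(signal) - mu) / sigma
    return float(np.mean(z ** 2))

# An anomalous sound: the same hum plus a rattle component at 1300 Hz.
rattle = (np.sin(2 * np.pi * 440 * t)
          + 0.5 * np.sin(2 * np.pi * 1300 * t)
          + 0.05 * rng.standard_normal(t.size))

print(anomaly_score(normal[0]))  # small (this sound matches the model)
print(anomaly_score(rattle))     # orders of magnitude larger
```

The same shape, fit on normal sounds only, then score deviations, underlies autoencoder- and embedding-based approaches; only the feature extractor and the density model become more sophisticated.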
Solving these three challenges unlocks a host of use cases in which non-invasive sound replaces machine telemetry in both brownfield and greenfield implementations, even when the machine does not provide built-in sensors. Below are some examples that describe the potential of audio quality detection:
- A model could run on a phone. For example, manufacturers can develop an app for washing machines that detects malfunctions and suggests possible solutions based on sound, thereby enabling customers to fix the issue without having to call a technician.
- Low-cost microphones could be placed in central locations in factories that can detect bad belts or bearings and correctly mark on a map which machines need maintenance. With flexible generic models not specific to a machine type, a single microphone (sensor) could cover many machines in a large area of a facility, reducing implementation complexity in both brownfield and greenfield factories.
- In the food industry, a bottling or packing line might shut itself down automatically when it hears a bottle or package breaking, so that glass and other dangerous fragments are removed from the line before it restarts and less product is wasted.
- On larger sites (e.g., construction or shipbuilding), microphones with GPS and spatial audio could be attached to robots, cranes, or vehicles to detect sounds that shouldn’t exist, like a crash or collapse, determine where they occurred, and send help.
- An isolated worker shouting for help could be automatically located, and help called in, using audio quality detection. In this case, the worker could be trapped away from a phone or other device and still get help.
- Leaks in storage containers could be detected and located faster by microphones with spatial audio listening for the tell-tale hiss or drip, potentially reducing environmental contamination, scrap, and cleanup costs.
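Several of the examples above rely on locating a sound source. A classic building block for that is estimating the time difference of arrival (TDOA) of a sound at two microphones via cross-correlation, then converting the delay into a bearing. A minimal sketch with synthetic signals (the sample rate, microphone spacing, and delay are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 16000          # sample rate (Hz), illustrative
c = 343.0           # speed of sound (m/s)
mic_spacing = 0.5   # distance between the two microphones (m)

# A broadband "event" sound (e.g., a crash), padded with silence.
event = rng.standard_normal(1024)
signal = np.concatenate([np.zeros(2000), event, np.zeros(2000)])

# The sound reaches mic 2 a few samples later than mic 1.
true_delay = 12  # samples
mic1 = signal
mic2 = np.concatenate([np.zeros(true_delay), signal[:-true_delay]])

# Estimate the delay from the peak of the cross-correlation.
corr = np.correlate(mic2, mic1, mode="full")
est_delay = int(np.argmax(corr)) - (len(mic1) - 1)

# Convert the delay to a bearing: sin(theta) = delay * c / (fs * d).
sin_theta = est_delay * c / (fs * mic_spacing)
angle_deg = np.degrees(np.arcsin(np.clip(sin_theta, -1.0, 1.0)))

print(est_delay)  # 12
print(angle_deg)  # about 31 degrees for this geometry
```

Adding a third microphone yields a second independent delay, which is enough to triangulate a position rather than just a bearing.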
Returning to the noisy car scenario we started with: imagine having an app on your phone that would “listen” to the car and provide a pre-diagnosis, with an option to schedule the appointment.
Sound-based AI models have the potential to enable new troubleshooting techniques for service and maintenance organizations. While most IoT-based AI models rely on modern machines with sensor telemetry, or on retrofitting older machines with sensors, sound-based AI models could work in any industry on both new and old machines. This gives organizations another tool to increase FTR (First Time Resolution), reduce analysis times, and offer faster resolution. Ideally, it would empower onsite operators at both brownfield and greenfield locations to self-diagnose based on precursor sounds and fix issues before they become serious.