Jihwan Lee, a leading researcher at the University of Southern California's Ming Hsieh Department of Electrical and Computer Engineering and USC’s Signal Analysis and Interpretation Laboratory (SAIL), is at the forefront of developing non-invasive methods to decode speech directly from brain signals. This development could transform the lives of individuals with speech and communication disorders.
Traditionally, decoding complex brain functions like speech has required invasive procedures such as electrocorticography (ECoG), which involves surgically implanting electrodes. However, Lee and his team are exploring how non-invasive electroencephalography (EEG) can be used for the same purpose.
“Interpreting brain signals through non-invasive methods like EEG has traditionally been limited to relatively simple classification tasks. But recent research, including our work, is beginning to explore how complex, challenging tasks such as speech decoding can be achieved using non-invasive EEG signals,” Lee explained.
This shift is particularly significant because it would eliminate the need for surgery, making brain-computer interface (BCI) tools more accessible. With a wearable EEG cap, users could one day convert their brain signals into speech without any physical movement, a possibility that stands to benefit millions of people living with conditions like ALS, stroke, or paralysis.
At the heart of this innovation are recent advancements in signal processing and artificial intelligence. According to Lee, “AI technology has made it possible to decode speech from brain signals based solely on the intention to speak or partial movements of the speech articulators. This means that even those who cannot vocalize sounds may soon have a way to communicate by harnessing their internal intentions and subtle neuromuscular cues.”
While the vision is promising, Lee acknowledges that most current high-accuracy speech BCIs still rely on invasive methods. “This poses a significant barrier for individuals who are hesitant or unable to undergo such surgery. That’s why we are actively exploring ways to achieve similar capabilities through non-invasive approaches like EEG. While promising, these methods are still in the early stages and not yet ready for broad deployment,” he noted.
Lee highlighted the importance of international partnerships, particularly with institutions in India such as IIT-Madras and the Indian Institute of Science.
“International collaboration allows research to cover diverse demographics and align with different regional guidelines. The Workshop on Mapping Brain-Body-Behaviour Signal Dynamics in Human Speech Production and Interaction, which our lab recently conducted in Hyderabad, India, was a key step in deepening this research dialogue across borders,” he said.
What truly sets USC’s approach apart is a novel architecture that allows EEG-decoded speech to be rendered as both waveforms and phoneme sequences simultaneously, bypassing traditional sequential processing. “We integrated two speech modules in parallel. The dual structure enhances performance by reducing latency and error accumulation, which are common pitfalls of traditional sequential models,” Lee explained.
This real-time, parallel processing design is a potential game changer. The inclusion of a phoneme predictor improves the clarity of generated speech sounds, while a simultaneous waveform generator increases textual decoding accuracy, both working synergistically. The result is a more efficient, accurate, and user-friendly system that holds immense promise for future clinical applications.
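The parallel layout described above can be illustrated with a minimal sketch. It is not the USC team's model: the layer sizes, the random weights standing in for trained parameters, and the function names (`encode`, `phoneme_head`, `waveform_head`, `decode`) are all assumptions made for illustration. The only point it shows is that both outputs are computed from the same EEG features, so neither head has to wait for, or inherit errors from, the other.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_CHANNELS = 64          # EEG electrodes
N_FRAMES = 50            # time frames in one EEG segment
HIDDEN = 32              # shared feature size
N_PHONEMES = 40          # phoneme classes
SAMPLES_PER_FRAME = 160  # audio samples generated per frame

# Randomly initialised weights stand in for trained parameters.
W_enc = rng.normal(scale=0.1, size=(N_CHANNELS, HIDDEN))
W_phon = rng.normal(scale=0.1, size=(HIDDEN, N_PHONEMES))
W_wave = rng.normal(scale=0.1, size=(HIDDEN, SAMPLES_PER_FRAME))


def encode(eeg):
    """Shared encoder: map (frames, channels) EEG to (frames, hidden) features."""
    return np.tanh(eeg @ W_enc)


def phoneme_head(features):
    """Per-frame probabilities over phoneme classes (softmax)."""
    logits = features @ W_phon
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def waveform_head(features):
    """Per-frame audio samples, concatenated into one 1-D waveform."""
    return np.tanh(features @ W_wave).reshape(-1)


def decode(eeg):
    """Run both heads on the same features; neither consumes the other's output."""
    features = encode(eeg)
    return phoneme_head(features), waveform_head(features)


# Synthetic EEG segment used purely as placeholder input.
eeg_segment = rng.normal(size=(N_FRAMES, N_CHANNELS))
phoneme_probs, waveform = decode(eeg_segment)
print(phoneme_probs.shape, waveform.shape)
```

In a sequential design, by contrast, one stage would take the other's output as its input, so any mistake in the first stage would carry into the second.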
“The paper, ‘Enhancing Listened Speech Decoding from EEG via Parallel Phoneme Sequence Prediction,’ demonstrates a method that improves the performance of models that decipher what you are listening to from EEG brain signals. This new and improved tool, which removes noise or errors that could distort the accuracy of the model, is twice as effective as a previous version developed within USC SAIL last year,” said Shrikanth Narayanan, USC University Professor, Nikias Chair in Engineering, and Director of SAIL, as well as the senior author on the study.
“The incredible collective and converging technological advances in biosensing, signal processing and AI are enabling new insights into the brain-body-behaviour connections underlying human speech and language processing and generation, including, critically, when some of these systems are affected by injury, disease or disorder. This, in turn, opens up new avenues for creating technologies that can support and enhance human communication and interaction,” explained Narayanan.