The challenge of teaching sign language to computers and machines
Computers and technology have been rapidly changing and adapting to the ever-increasing demands of the human race, and people have witnessed this digital revolution unfold over the last decade. At first we had to bash away at keyboards full of noisy keys; then came taps on touch-sensitive screens. Now using a digital device can be as simple as speaking a command aloud. Today there are more than 100 million devices running Alexa, Amazon's voice assistant, accepting voice commands from millions of people, and Siri, Apple's voice assistant, processes as many as 25 billion requests per month globally. It is anticipated that by 2025 this voice-technology market could be worth more than $27 billion. Sounds bountiful, right?
However, there is a blind spot in this growing tech world: one group has been left behind amid all these developments, the users of sign language. According to the World Health Organisation (WHO), 430 million people are deaf or have partial hearing loss, and many of them rely on sign languages to communicate with the people around them. They are at risk of being excluded from the digitisation taking over our everyday lives, simply because computers and the tech world are not yet equipped to understand sign languages.
Although we still do not have a computer that fully understands sign languages, it would be wrong to say that the field has never received attention. Plenty of people have tried their best to teach sign language to computers, and there have been several claims of success in recent years. New and innovative solutions have emerged, ranging from haptic gloves to hand-shape-detecting software, with many of them winning acclaim.
Yet, as Mark Wheatley, the executive director of the European Union of the Deaf (EUD), puts it: "The value for us basically is zero." Why? It is easy to see. Gloves, like body-worn cameras, are intrusive and still need to be adapted to the needs of the people using them. And even though hand-shape recognition can prove useful, it cannot handle the full complexity of sign languages on its own, because sign languages are not just a function of the hands: they also rely on facial expressions and the movements of other body parts.
Other solutions aim to offer cheap alternatives to human interpreters, especially in public settings such as police stations, hospitals and classrooms. But there is a high risk here: the cost of even a minuscule error could be disastrous. Things are improving, however, as research groups that include deaf scientists work out how technology can best serve deaf interests.
Students of sign languages are compiling databases of recorded examples of how the languages are used, and programmers are, in turn, trying to turn this data into useful products. Sign languages, of which there are several hundred, have their own grammars, idioms and dialects, just like spoken languages. And, as with spoken languages, everyday usage does not follow the hard-and-fast rules of the grammar books.
Teaching computers to interpret all of this correctly is therefore much harder than teaching it to another human being; it is more intricate than you might think. But converting the data is not even the primary problem: the first hurdle is generating that data at all. Research published in 2019, led by Microsoft, a big computing firm, estimated that a typical publicly available corpus of a spoken language contains around a billion words from more than 1,000 different speakers. An equivalent data set for a sign language, by contrast, typically contains fewer than 100,000 signs from a sample of just ten people. That comparison alone highlights the magnitude of the problem at hand. And beyond sheer volume, it is essential that a good data set has variety. By variety we mean the following (a sketch of how such variety might be recorded appears after the list):
- Conversations between native signers of diverse backgrounds
- Different dialects
- Different levels of fluency
- Different degrees of fluidity of movement
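To make that concrete, here is a minimal sketch, in Python, of the kind of metadata a recording in such a corpus might carry so that this variety can be tracked and balanced. The field names are hypothetical illustrations, not taken from any real corpus.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class RecordingMetadata:
    """Metadata for one recorded conversation in a sign-language corpus (illustrative only)."""
    recording_id: str
    signer_ids: List[str]          # native signers taking part in the conversation
    signer_backgrounds: List[str]  # e.g. region, age group, deaf or hearing family
    dialect: str                   # regional dialect of the sign language
    fluency_level: str             # e.g. "native", "fluent", "learner"
    movement_fluidity: str         # e.g. "high", "moderate"
    duration_seconds: float

def dialect_coverage(corpus: List[RecordingMetadata]) -> Dict[str, float]:
    """Hours of footage per dialect, a quick check that no single dialect dominates."""
    hours: Dict[str, float] = {}
    for rec in corpus:
        hours[rec.dialect] = hours.get(rec.dialect, 0.0) + rec.duration_seconds / 3600
    return hours
```

A helper like `dialect_coverage` is just one way a research team might check that no single dialect, background or fluency level dominates the data.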
Thomas Hanke, a researcher at the University of Hamburg, has assembled a sign-language library containing nearly 560 hours of conversations. One problem he and his team faced during data collection was that many volunteers started incorporating local signs, which skewed the data. Yet although collecting data is complex, it is still the easy bit compared with teaching computers. They are slow learners and need to be told explicitly what each example means, with everything annotated, from the signers' movements to their facial expressions. Producing such exhaustive annotations takes time and effort, lots of it: after eight years, Dr Hanke still has only 50 hours of correctly annotated video. Another particular concern is privacy, because collecting sign-language data means recording participants' faces, not just their voices as with audio data.
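As a hint of why annotation is so slow, the sketch below shows the sort of record an annotator might have to fill in for every short segment of video. The scheme and field names are illustrative assumptions, not Dr Hanke's actual annotation format.

```python
from dataclasses import dataclass

@dataclass
class SignAnnotation:
    """One annotated segment of sign-language video (illustrative only)."""
    start_ms: int            # where the segment begins within the recording
    end_ms: int              # where it ends
    gloss: str               # written label for the sign, e.g. "HOUSE"
    hand_shape: str          # configuration of the dominant hand
    movement: str            # path, direction and speed of the hands
    facial_expression: str   # raised eyebrows, mouthing, and so on
    body_posture: str        # shoulder or torso movement, if relevant
```

Filling in half a dozen such fields for every second or two of footage goes some way to explaining why only a fraction of the recorded hours have been annotated so far.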
Machine learning and artificial intelligence should be able to achieve impressive results if enough data can be collected and researchers with a good understanding of deaf culture are put to work on it. SignAll, a Hungarian firm, has a 25-person team that includes three deaf people. It is one of the biggest groups working in this field and holds a proprietary database of 300,000 annotated videos of 100 users. SignAll's software can recognise American Sign Language (ASL), one of the most widespread sign languages, though not yet at the speeds at which native signers actually communicate. Its current product, SignAll 1.0, has enough data behind it to translate signs into written English, which means a hearing interlocutor can respond using speech-to-text software. However, this comes with the significant burden of pointing three cameras at a signer who must wear gloves.
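To make the division of labour in such a set-up clearer, here is a minimal sketch of what one turn of a mixed signed-and-spoken conversation might look like in code. The function and parameter names are hypothetical placeholders, not SignAll's actual interface, and the recognition models themselves are assumed to exist elsewhere.

```python
from typing import Callable, List, Tuple

def conversation_turn(
    frames: List[bytes],                         # camera frames captured while the deaf user signs
    audio: bytes,                                # the hearing user's spoken reply
    sign_to_text: Callable[[List[bytes]], str],  # hypothetical trained sign-recognition model
    speech_to_text: Callable[[bytes], str],      # any off-the-shelf speech recogniser
) -> Tuple[str, str]:
    """One exchange in a mixed conversation, with both halves rendered as text."""
    signed_message = sign_to_text(frames)   # displayed to the hearing participant
    spoken_reply = speech_to_text(audio)    # displayed to the deaf participant
    return signed_message, spoken_reply
```

The hard part, of course, is the `sign_to_text` model itself, which is exactly where the data problems described above come in.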
This burden might soon be lifted, says Zsolt Robotka, SignAll's boss. The firm is hoping to offer a glove-free option and is also working on a product that uses a single smartphone camera. If the technology succeeds and is integrated with other apps, deaf people would be able to use their phones for everyday tasks such as searching for directions or looking up the meanings of unfamiliar signs.