Speech-to-text service from Spinvox used humans and broke its privacy promises, according to the BBC and rival vendors
A British service that claims to turn voice messages into text automatically actually uses human transcribers and breaches users’ privacy in the process, according to the BBC and voice transcription experts.
Spinvox offers a speech-to-text service, which it claims uses “a Voice Message Conversion System, known as D2 (the Brain)” that can “call on human experts for assistance”. The BBC claims that most of the work is actually done by humans in call-centres in Africa and the Philippines, in breach of the company’s stated policy – and goes on to report that these staff set up a Facebook account where private customer information was posted.
The system must use a bank of humans, says the BBC’s technology correspondent Rory Cellan-Jones, because it transcribed the same message five different ways. Also, an operator in an Egyptian call-centre formerly used by Spinvox told the BBC: “The machine doesn’t understand anything. You have to start typing when you hear the message.”
Spinvox’s entry in the UK’s Data Protection Registry says that it does not transfer data outside the European Economic Area, so sending messages to call-centres in Africa and the Philippines would appear to breach that declaration. Meanwhile, the operators’ Facebook account has been shown on TechCrunch, and appears to include potentially commercial information from transcribed calls.
Spinvox has expanded swiftly since its foundation in 2004, using technology developed at Cambridge University and $200m (£120m) in investment. It first offered a consumer service and then sold services to network operators.
Rivals such as the established Nuance Communications say that the Spinvox service, transcribing short messages from multiple people over mobile phones, could never have worked as described: “In Nuance’s view, this task will never be able to be totally automated in the near future. You cannot control the person leaving your voicemail, or the environmental factors,” said John West, solutions architect in Nuance’s mobile group. “Spinvox is offering something that is impossible to deliver now.”
Nuance Communications, the company behind the well-known Dragon PC transcription system, has various contracts with operators, and traces its lineage through Xerox back to voice recognition and AI pioneer Ray Kurzweil in the 1970s.
In a statement to the BBC, Spinvox admitted that all speech systems use human intervention, and promised that when it does this, the messages are anonymised so they cannot be connected with a particular person. It would not reveal the proportion of its traffic that is treated in this way, or where it is sent.
Spinvox denies that the Egyptian call centre ever handled live calls, saying it was weeded out during training sessions because it wasn’t up to scratch – a claim the BBC says has been contradicted by staff there.
Experts are bemused that people object to the service on the grounds that human transcribers infringe privacy, when normal voicemail systems are susceptible to eavesdropping, and even an automated transcription service would produce easily searchable text.
“Come on people,” said analyst James Governor of RedMonk on Twitter. “Everyone knows Spinvox is a Mechanical Turk [a concealed human presented as an automaton]. That’s what makes it so excellent.”
While speech recognition in the cloud has been evolving, on-device speech recognition has also been progressing, in products such as the iPhone, devices using Intel’s Atom, and the Tesla Motors electric car.