comments on 29 April 2003 : 21:56, justin sez:

Actually, considering what it might cost a conference host to supply a stenocaptioner for each session, it's probably better if we work on training existing laptop-toting information freeks to type faster and more accurately, for the public good. At least for the good of the attention-span disordered visiting a chat room during a lecture.

comments on 29 April 2003 : 23:20, Rudy sez:

I am totally for the idea but /nod on the cost problem. Seems a good idea relegated to realization by improvements in voice recognition. I suppose current IP issues might be potentially prohibitive, so one more reason to fix IP law.

Group response seems quite nice for identifying dialog pieces of "significance" but transcripts easily allow one to work backwards and determine context. I would also venture real-time transcription would improve audience comprehension markedly. The ability to look back quickly over what was said seems like a thing I would quickly wonder how I ever did without. Text seems superior to video/voice for these purposes since most people can consume it far faster than they could with the primary source.

comments on 30 April 2003 : 02:01, Tibor sez:


comments on 30 April 2003 : 11:03, misuba sez:

For a lot of conferences, you could probably find volunteers to do captioning, in exchange for discounted admission.

comments on 30 April 2003 : 12:46, mnickel sez:

Or, even better, throw some serious community brainpower at making reliable speech-to-text systems.

(Edit: Ooooo or if computational power is difficult, maybe a P2P network / Grid Computing application that would sit on all conference members' laptops and provide computing power to the speech-to-text systems.)

This would be a much lower-cost solution than having ppl in meatspace typing away.

I have no idea of the capabilities of current speech-to-text technology. I don't know if something like this is even feasible today, given the time necessary for training applications like "NaturallySpeaking".

I know that text-to-speech software is pretty good, both commercial and open source. I'm just not sure about the capabilities of speech-to-text.

It would be a very interesting experiment to feed an MP3 recording of a speaker into an existing speech-to-text system and see if the resulting text would be understandable.

Hehehe... I can only imagine the next logical step: each Presenter/Speaker is wired up with a HUD that allows them to monitor the IRC chatroom, blogs, *and* the physical groove of the room as they speak. Now that's some *serious* multi-tasking ability.


comments on 1 May 2003 : 11:24, may sez:

what a fab idea! There would also be the added benefit of making conferences more accessible to the deaf. I've dug up some info on it over here and here. btw, it was nice meeting you at the Play Time BoF last week. Good luck on whatever it is you decide to do with the whole gaming-journalism-niche!

comments on 1 May 2003 : 11:36, Joan sez:

Intel just released some open source speech recognition software that seems to be aimed at getting hardware developers to build special-purpose systems. The method crossbreeds a high-quality audio signal, even in a noisy environment (using arrayed microphones), and lip reading (using a video feed).

As this gets perfected and hardware gets smaller, I think it will be integrated into a standard lectern or transparent video prompt screen so that all speech can be simultaneously transcribed and transmitted with high accuracy.

Audio-Visual Speech Recognition (AVSR) info

In my dreams, anyway...

comments on 4 May 2003 : 14:47, Stewart Butterfield sez:

Hey Justin - thanks for the suggestions. Still not sure if we will do anything further with Confab in the near term (since we still have a game to finish), but I think it could be a very useful tool, particularly with some more enhancements.

Side note: I once gave a talk at a conference in Paris and, being first up, did not know that there was going to be simultaneous translation into French going on. So, while I sat at one end of the stage, a woman in a booth sat on the other end of the stage and the experience was horrible -- a lot like talking on the phone when you get that one second delay. (Further, the talk was about the *metaphor* of navigation, and a lot of the points relied on idiomatic English -- I was constantly second-guessing the translation and trying to French-ify on the fly. That sucked.)

comments on 5 May 2003 : 01:00, justin sez:

Translation is a huge problem I hadn't really considered. In stenocaptioning as I'd craved it, the playing field was written English. But most people don't speak in written English. So there's a transformation happening there. At least with a text transcription, the presenter doesn't necessarily see or hear it happening. That would be very confusing, especially if you have any command of French.

