
jorgenmodin.net - Blog

Testing out deepspeech for speech to text recognition

Posted by admin |

Contents: 16 minutes of heavily compressed, "mushy" speech in British-ish English. I needed a text file of it in order to write a script for redoing the audio. I tried deepspeech 0.4.1 with its supplied model.

./bin/deepspeech --model models/output_graph.pb \
--alphabet models/alphabet.txt \
--trie models/trie --lm models/lm.binary \
--audio resampled-mono.wav > 16monoklr.txt
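Before feeding a file to the command above, it can help to check what format the WAV file actually has. Here is a minimal sketch using only Python's standard `wave` module. The target format of 16 kHz, mono, 16-bit is an assumption on my part about what the supplied model expects, and the function names are made up for illustration.

```python
import wave


def wav_format(path):
    """Return (channels, sample_rate_hz, bits_per_sample) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels(), wav.getframerate(), wav.getsampwidth() * 8


def format_problems(path, rate=16000, channels=1, bits=16):
    """List how a WAV file differs from the assumed model input format.

    The defaults (16 kHz, mono, 16-bit) are an assumption, not something
    read from the model files.
    """
    got_channels, got_rate, got_bits = wav_format(path)
    problems = []
    if got_channels != channels:
        problems.append(f"{got_channels} channels, expected {channels}")
    if got_rate != rate:
        problems.append(f"{got_rate} Hz, expected {rate} Hz")
    if got_bits != bits:
        problems.append(f"{got_bits}-bit samples, expected {bits}-bit")
    return problems
```

Calling `format_problems("resampled-mono.wav")` returns an empty list when the file matches the assumed format, and a list of human-readable mismatches otherwise.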

For comparison here's what Google transcribes from the same file:

this is the Swedish Land Registry logjam products contract blockchain system a Croma way running on s flex so I'd open a number of tabs here for each participant into the smart contract seller buyer sellers and buyers Bank the Land Registry itself and another server which in this case could be the demo…

First try, on a 48 kHz stereo WAV file. The result is close to garbage and way too short:

his is the way should under his dream look him potomac changes and i conteanyng on spikes so i oftener of catechism in to the smart contract celeritate sang biting the understreak so i don't know server wichitas be the democrats not stage she told some pecorino is…

Second try, on a 16 kHz stereo version of the same file. This is the entire contents of the result file:

ohhhhhhhhh ohhhhhhhhh hhhhhhhhow hhhhhhhhow hhhhhhhhow here

Third try, on a 16 kHz mono version of the same file. The result is close to garbage and way too short:

this is these we should under destrem look him or he come look changed and a craney running on specs so i don't get a number of caterers to the smart contract selerier said or sankirtan the understreak self and an observer which he is caste democrat…
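To put a number on "close to garbage", one option is a word error rate: the word-level edit distance between a reference transcript and the output, divided by the number of reference words. The sketch below is my own illustration, not something I ran for this post, and `word_error_rate` is a made-up name. Note also that the Google transcript is not ground truth, so using it as the reference only gives a rough indication.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # previous[j] = edit distance between the reference words consumed so far
    # and the first j hypothesis words
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from output
            insertion = current[j - 1] + 1  # extra word in the output
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

A value of 0 means the word sequences are identical. Values can exceed 1 when the output contains many more words than the reference.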


 

 

Feb 05, 2019 02:05

What's the best microphone for a screencast?

Posted by admin |

I want excellent sound quality from a microphone that neither occupies my hands nor sits on the table; it needs to be close to my mouth.

The best microphone I own for this purpose is a T-bone TWS One (~€60), a wireless headset that transmits on 865 MHz to a wireless receiver. It's actually really good. I do not need anything better. But it is a bit of work to set up:

  • the headset itself
  • the transmitter with batteries
  • the receiver, with power adapter
  • A 6.3mm output from the receiver that needs to go somewhere

So is there more convenient stuff out there?

What is not better

I have a lavalier, an Aputure A.lav. It is emphatically worse in sound quality than the T-bone. Its thin microphone cable is prone to picking up hum, and it makes a lot of scratching noises when cables move around. Maybe my copy shouldn't have passed quality control.

Also worse is a Sony MDR-XB950B1 headset over Bluetooth. If you post process with normalization, equalization and compression you can get up to the level of the sound of a depressed airline pilot over the plane's PA.

Here are the microphones I am looking at right now

Plantronics Voyager 4220 UC bluetooth headset €200

Has good reviews. A bit expensive though, but might be a good replacement for my Sony MDR-XB950B1. About €200.

Samson Go Mic USB, USB microphone €40

USB means there is very little analog cable length to pick up hum. It is also reasonably priced and very small, with a built-in clip and mic mount. About €50.

ModMic 4/5, clip-on analog microphone for headphones

Good reputation, but it does have a long analog cable. Maybe it is well shielded, though. Analog means it can be plugged into an analog mixer. However, if you run a Linux distribution with Jack, such as MediBuntu, you ought to be able to record simultaneously from multiple sound cards, including USB microphones.

ModMic Wireless, clip-on wireless microphone for headphones €100

ModMic Wireless, from Antlion Audio. A bit expensive.

Overviews

 

Feb 02, 2019 01:55

Flowblade— a so far so good video editor on Linux

Posted by admin |

This morning I opened my KDEnlive project that I spent 4 hours on yesterday, and found it had scrambled all my clips over three video timelines, changed durations and lost some clips.

So today I switched. I have tested many video editors, but never Flowblade before. So far, so good! It took me an hour to recreate yesterday's work (since I now knew the source material better) and lo and behold, it hasn't corrupted my work yet. Truth be told, neither had KDEnlive before yesterday.

I remember when I had to build a watchdog for the Linux Abiword word processor, a watchdog that reacted to every file change Abiword made and checked that into a local git repository. That version of Abiword had the habit of corrupting its document files. I thought I'd now need to do the same with KDEnlive. But I think I will stick with Flowblade now. It has crashed on me 5 times today, but so far not corrupted anything.
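For the curious, a watchdog of that kind can be sketched roughly as below. This is not the script I used back then, just a minimal polling version under some assumptions: the project directory is already a git repository, git is on the PATH, and the names `snapshot`, `commit_all` and `watch` are made up for illustration.

```python
import os
import subprocess
import time


def snapshot(directory):
    """Map each file under directory (skipping .git) to its (mtime, size)."""
    state = {}
    for root, dirs, files in os.walk(directory):
        dirs[:] = [d for d in dirs if d != ".git"]  # do not descend into .git
        for name in files:
            path = os.path.join(root, name)
            try:
                info = os.stat(path)
            except FileNotFoundError:
                continue  # file vanished between listing and stat
            key = os.path.relpath(path, directory)
            state[key] = (info.st_mtime_ns, info.st_size)
    return state


def commit_all(directory):
    """Stage and commit everything; assumes `git init` was already run there."""
    subprocess.run(["git", "add", "-A"], cwd=directory, check=True)
    message = "autosave " + time.strftime("%Y-%m-%d %H:%M:%S")
    # git commit exits non-zero when there is nothing to commit, so no check=True
    subprocess.run(["git", "commit", "-q", "-m", message], cwd=directory)


def watch(directory, interval=2.0, commit=commit_all, rounds=None):
    """Poll directory and call commit(directory) whenever its snapshot changes.

    rounds=None polls forever; a number limits how many polls are made.
    """
    last = snapshot(directory)
    done = 0
    while rounds is None or done < rounds:
        time.sleep(interval)
        current = snapshot(directory)
        if current != last:
            commit(directory)
            last = current
        done += 1
```

Polling is crude compared to file system notifications, but it needs nothing outside the standard library, and every saved state ends up in the git history where it can be recovered.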

I found Flowblade to have a logical UI and you can name clips! But, you cannot cut soundtracks. That could be a dealbreaker. Maybe you can if you merge them with video tracks?

Ok, I have found a way to cut an audio clip.

  • Make sure all tracks are inactive
  • That includes the track the audio clip is on; if that track is active, it won't work
  • Select the audio you want to make a cut in
  • Press x, or select Edit→Cut Clip
Jan 31, 2019 11:50

Annotate screenshots in Ubuntu when Shutter is gone—Use Krita

Posted by admin |

You can also use Flameshot if you use that tool to screenshot. Krita is better than Gimp, because you can draw ellipses directly without using selection tools and such. No arrows though as far as I have found. There might be Gimp plugins that do more stuff though.

 

Obviously you can use a vector editor such as Inkscape, but then you have to think about preserving resolutions.

Jan 30, 2019 03:50

Impressions of libre MarkDown note-taking apps: Joplin & Boostnote.

Posted by admin |

Impressions of libre MarkDown note-taking apps: Joplin & Boostnote. Joplin less cluttered, obvious how to make notes, has a menu bar. Boostnote UI more like VS Code editor put together at random 🙂 Actual usage may shift opinion…

Jan 22, 2019 11:46

On the French yellow vest movement & similar movements

Posted by admin |

This interesting article goes into some causes, but admits to not understanding the movement.

https://www.the-american-interest.com/2019/01/21/the-problem-with-no-name/

I think there's a reason for yellow vests & such: Disorientation. Unclear road ahead, imaginations run on fumes. Society hasn't integrated the power of the Internet; it helped wipe out the old French parties. Monetary policy adds to our present confusion. Old extreme left/right baggage floats up.

The road ahead must bring clarity to a computerized & networked world. Now its effect is centrifugal & ppl trip on fear or hubris. I hope blockchains can help, bring clarity to what money is, & coordinate people to act politically. For the latter, digital identities are important.

Jan 22, 2019 11:55

Link to comparison of lots of programming fonts — and my two favorites

Posted by admin |

http://app.programmingfonts.org

 

My two favorites:

Bitstream Vera Sans Mono and Hack.

Jan 22, 2019 12:55

Getting OpenVPN to work when ipv6 support is missing

Posted by admin |

You may get error messages such as:

"GDG6: NLMSG_ERROR: error Operation not supported"

"GDG6: remote_host_ipv6=n/a"

 

If OpenVPN cannot get an IPv6 connection, add these lines to your config file:

 

pull-filter ignore "route-ipv6"
pull-filter ignore "ifconfig-ipv6"
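If you have several config files to patch, the two lines can be appended with a small helper. This is only a sketch: `add_ipv6_filters` is a made-up name, and it assumes the directives are written exactly as above, one per line.

```python
IPV6_FILTERS = (
    'pull-filter ignore "route-ipv6"',
    'pull-filter ignore "ifconfig-ipv6"',
)


def add_ipv6_filters(config_text):
    """Return config_text with any missing IPv6 pull-filter lines appended."""
    present = {line.strip() for line in config_text.splitlines()}
    missing = [line for line in IPV6_FILTERS if line not in present]
    if not missing:
        return config_text  # already patched, leave untouched
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + "\n".join(missing) + "\n"
```

The function works on text rather than on files, so you decide yourself how to read and write the config. Running it twice on the same text changes nothing the second time.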
Jan 07, 2019 02:39

Hints to software to build your own Google Home

Posted by admin |

Some guys are trying, however they are held back by production and fitting problems of PCB and LCD. Would've been easier if they had just used standard hardware I guess. But it's harder to charge for that.

https://www.kickstarter.com/projects/aiforeveryone/mycroft-mark-ii-the-open-voice-assistant

On the page they do mention a number of speech and command software packages, which could be worth a look:

 

  • Pocket-sphinx
  • precise
  • Mozilla DeepSpeech
  • Adapt
  • Padatious
  • Mimic
  • CMU flite

 

So that's probably where you should start looking!

 

Dec 17, 2018 03:05