
How-tos


Linux ocr for getting text from a screenshot


 

Summary: For a 72dpi screenshot, gocr returned something intelligible, tesseract returned nothing and ocrad returned gibberish.

Enlarging the image by a factor of 4 in each direction and interpolating got tesseract and ocrad to output text at all, but they were still not superior to gocr.

 

These OCR programs are probably not calibrated for making text out of pixel-perfect low-resolution screenshots, but for high-resolution, somewhat noisy scans of different typefaces on paper. Doing OCR from a screenshot ought to be quite easy: each letter is pixel perfect and every occurrence looks exactly the same, and there are no problems with slanting text or other distortions. In fact, writing your own OCR program is a distinct possibility for this.

I had received a 72dpi screenshot by mail that I wanted to get the text from. The top part looked like this:

 salsatext.png

The top of the screenshot

Tesseract, which is a program that is highly recommended on the web, returned nothing when run on this screenshot. At first, after reading a discussion about this, I thought it had to do with my tif possibly having a layer of transparency, but making sure there was none did not change anything.

According to the same discussion, it seems like tesseract wants to have a high resolution image (see tests on that further down).

 

Now the ocrad program returned this:

Al Po_ Po_ _ Al _|_o _|_o _ M_|_o_hl_o
lollob_IOldo _|_o_do l_m_o
A_ _ol__lo _ _|_o_do l_m_o
__ _o _| Colmo_ _ ____ _ ___ T__o_
OIOo Ml__ __o C_o_o_o_ O_O____o
llo_o_do__ _ l_|_ __llO_ Co__ol__
_| Mo_|___o O_O____o A_oOo_
A_ Amo_ C_o_do Woblo_ lo_ Ml_odo_ _ Alb___o Bo__o_ _| Tl_o_ D
_o_ldo B___lol _ _|_hl_ _o_ b Bobb_ c___
lo_ Tomoll_o_ D_ OIOo O_O____o A_oOo_ _o__
A_ld _o_ Bo_____o
_o_ Wo_ Wo_ B___o _|__o C_bo_ Plo____

 ...and so on.

Gocr returned this:

y 9 999)        9       9     _J   ypp yyy    y
AI P a n P a n Y AI Vin o Vin o m eIc o c hita
L oIIo brigid a Ric ard o L e m v o
A V aIeria Ric ard o L e m v o
Se Va EICaiman fru ko Y Sus Tesos
Oiga mire Vea Guayacan Orquesta
LIora n d ote L uis F eIi e G o n z aIe z
EImanisero Orquesta Aragȯn
A Amor Cuando HabIan Las miradas AIberto Barros''EITitan D.
S o nid o B e stiaI Ric hie R a 6 B o b b C ru z
Los TamaIitos De OIga Orquesta Aragȯn,Josė
A cid R a y B arretto
Ran Kan Kan Buena Vista Cuban PIayers
Undanta Bo Kas ers Orkester

...and so on, which given the non-outputting competition, must be deemed fantastic. Still, it has trouble with characters extending below the baseline (p, g and y for example), and every lowercase l is interpreted as a capital I.

Increasing the pixel density of the image

This turned out to be non-trivial with the tools I had at hand. I finally got resampling working with the program pnmenlarge, part of the netpbm suite of unixish command line image processing tools:

cat salsatext.pnm | pnmenlarge 4 > enlarged.pnm

This replaced each pixel with a 4x4 block of identical pixels, and now tesseract magically started working!

(The enlarged image has to be converted to tif first.)

Fil Pen Pen 'ii Fil '·.·'inp '·.·'inp et`} i···1el¤:p·:hite
Lpllplznrigide Riterdp Lem·-rp
.·!·.·-,# '·.·'elerie et`} Riterdp Lem·-rp
5e '·.·'e El Ceimen et`} Frulce 'ii 5us Tesps
Ciige i···1ire '·.·'ee Gueyeten Ordueste
Llprendpte et`} Luis Felipe Gpneelee
El i···1eniserp Ordueste ifiregdn
.·!·.·-,# Famer, Cuendp Hel:·len Les i···1iredes et`} .·!·.ll:¤ertp Eerrps "El Titen D. ..
Epnidp Eestiel et`} Richie Re·-,# El Epl:·l:·3r Crue
Lps Temelitps De Cilge Ordueste ifiregdn, _|pse
.·!·.¤:id Re·-,# Eerrettp
Ren Ken Ken Euene '·.·'iste ·Zul:·en F‘le3··ers
Llndenteg et`} Ep iiespers Orltester

Well, it does at least produce output, but the quality is at the point that you can barely guess which line it is trying to decode.

Let's try switching to Spanish as language:

.ü.| Pan Pan "x‛ .ü.| '·.·'ina '·.·'ina —l=*.`Š f'·'1a|·:·:··:|'•i|:a
La||a|:·ri·;|i·:|a F‘xi·:ar·:|a Lam'­.«a
.ü.'-,« '·.·'a|aria —l=*.`Š F‘xi·:ar·:|a Lam'­.«a
Sa '·.·'a El Caiman —l=*.`Š FrukJ:· "x‛ 5uS TaSaS
Diga Mira '·.·'aa Cuaşracan Dr·:]uaS|:a
Llarandata —l=*.`Š LuiS Falipa Ganzalaz
El f'·'1aniSara Dr·:]uaS|:a Aragón
.ü.'-,« Fumar, Cuanda Ha|:·|an LaS f'·'1ira·:|aS —l=*.`Š .ü.||:·ar|:·:· EarraS "EI Titan D. ..
Sanida EaS|:ia| —l=*.`Š F‘xi·:|'•ia Ra'-; Ex Ea|:·|:·ş» Cruz
LaS Tama|i|:aS Da Diga Dr·:]uaS|:a Aragón, _|aSa
.ü.·:i·:| Ra'-; Earratta
F‘xan Kan Kan Euana '·.·'iSta Cu|:·an F‘|aş·'arS
L|n·:|an|:a·; —l=*.`Š Ba kaS|:·arS DrkaStar

 

That was not good. Maybe the enlargement needs to be smoother?

pamstretch, also from the netpbm package, increases the pixel count too, but additionally smooths the output by interpolating between pixels.

Like many unixish tools, pamstretch takes data from stdin and outputs it to stdout:

cat salsatext.pnm | pamstretch 4 > stretched.pnm

Tesseract needs tif format, handled here by ImageMagick's convert command:

convert  stretched.pnm  stretched.tif

Run tesseract on it, in this case with -l spa, which selects Spanish as the language:

tesseract stretched.tif str -l spa

The result:

AI Pan Pan 'l" AI Vino Vino —.?•.`$ Molcochita
Lollobrigida Ricardo Lomyo
Ay 'o‘aloria —.?•.`$ Ricardo Lomyo
5o 'o‘a El Caiman —.?•.`$ Fruko 'l" Sus Tosos
Diga Miro 'o‘oa Guayacan ûrquosta
Llorandoto —.?•.`$ Luis Folipo Gonzalo:
El Manisoro ûrquosta Aragon
Ay Amor, Cuando Hablan Las Miradas —.?•.`$ Alborto Barros "El Titan D. ..
5onido Eostial —.?•.`$ Richio Ray En Bobby Cruz
Los Tamalitos Do Olga ûrquosta Aragon, José
Acid Ray Earrotto
Ran Kan Kan Euona 'liista Cuban Playors
Undantag —.?•.`$ Bo Iäaspors ûrkostor

...better. Let's try English:

AI Pan Pan 'i" AI 'a'inu 'a'inu 3} Ms|cuchita
Lu||ubrigic|a Ricarclu Lsmyu
Ay 'a'a|sria 3} Ricarclu Lsmyu
5s 'a'a El Caiman 3} Fruku 'i" 5us Tssus
Diga Mirs 'a'sa Guayacan Drqussta
L|uranduts 3} Luis Fs|ips Gun:a|s:
El Manissru Drqussta Aragun
Ay Amur, Cuanclu Hab|an Las Miradas 3} Albsrtu Earrus "El Titan D. ..
5unic|u Esstia| 3} Richis Ray E: Eubby Cru:
Lus Tama|itus Ds D|ga Drqussta Aragun, juss
Acid Ray Earrsttu
Ran Iian Iian Eusna 'a'ista Cuban Playsrs
Unclantag 3} Eu Iiaspsrs Drksstsr

That is worse.

How does ocrad perform?

Al Pan Pan Y Al Vino Vino __ Melcochila
Lollobrigida Ricardo Lemvo
Ay Valeria __ Ricardo Lemvo
Se Va El Caiman __ FrukD Y Sus Tesos
Oiga Mire Vea Guayacan Orquesla
Llorandole __ Luis Felipe Gonzalez
El Manisero Orquesla Arag�n
Ay Amor, Cuando Nablan Las Miradas __ Alberlo Barros "El Tilan D,,,
Sonido Beslial __ Richie Ray bBobby Cruz
Los Tamalilos De Olga Orquesla Arag�n, los� , , ,
Acid Ray Barrello
Ran Kan Kan Buena Visla Cuban Players
Undanlag __ Bo Kaspers OrkPsler

 

A lot better than the line noise seen before. With the image enlarged but not interpolated:

Al Pan Pan Y Al Vino Vino __ Melcochi_a
Lollobrigida Ricardo Lemvo
Ay Valeria __ Ricardo Lemvo
Se Va El Caiman __ FrukoYSusTesos
Oiga Mire Vea Cuayacan Orques_a
Llorando_e __ Luis Felipe Conzalez
El Manisero Orques_a Arag�n
Ay Amor, Cuando Hablan Las Miradas __ Alber_o Barros "El Ti_an D,,,
Sonido Bes_ial __ Richie Ray b Bobby Cruz
Los Tamali_os De Olga Orques_a Arag�n, _os� ,,,
Acid Ray Barre__o
Ran Kan Kan Buena Vis_a Cuban Players

That's worse.

So, tesseract and ocrad need the input to be "scannified" by multiplying the pixel count and interpolating to get a bit of smoothness, but they still do not clearly beat gocr.
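
The "scannify" steps used above for tesseract can be collected into one small shell function. This is only a sketch: the function name ocr_screenshot and the "-stretched" file names are made up for illustration, and it assumes that pamstretch (netpbm), ImageMagick's convert and tesseract are installed.

```shell
# Sketch: enlarge a PNM screenshot 4x with interpolation, convert it
# to tif and run tesseract on it. ocr_screenshot is a hypothetical name.
ocr_screenshot() {
    in=$1              # input image in PNM format, e.g. salsatext.pnm
    lang=${2:-spa}     # tesseract language code, Spanish by default
    base=${in%.*}      # input file name without its extension

    # pamstretch reads the named file and writes the enlarged image to stdout
    pamstretch 4 "$in" > "$base-stretched.pnm" &&
    # tesseract wanted tif input in the tests above
    convert "$base-stretched.pnm" "$base-stretched.tif" &&
    # the recognised text ends up in "$base.txt"
    tesseract "$base-stretched.tif" "$base" -l "$lang"
}
```

Calling ocr_screenshot salsatext.pnm should leave the recognised text in salsatext.txt; passing eng as a second argument tries English instead of Spanish.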

For scanned-in documents the ranking seems to be reversed.

 Peter Selinger: Review of Linux OCR software:

Of course, it must be stressed that the test results reported here are derived from only two scanned pages. It is possible that for other inputs, the programs rank differently. However, based on the tests reported on this page, here is a summary of my conclusions:
* Tesseract gives extremely good output at a reasonable speed. It is the clear overall winner of the test. The only caveat is that one absolutely must convert the input to bitonal.
* Ocrad gives reasonable output at extremely high speed. It can be useful in applications where speed is more important than accuracy.
* GOCR gives poor output at a slow speed.

 

 

 

 

 

"svnadmin load" loads into an existing repository; it does not create one


  • svnadmin dump repository > repository.svndump
  • svnadmin create repository
    svnadmin load repository < repository.svndump

I needed to downgrade a couple of repositories from format version 3 to 2 in order to work with RHEL, and it took me a while to realise that svnadmin load loads into an existing repository; it does not create a new one. So first create the repository with svnadmin create, then load the dump into it.

Run a plone zexp imported into a fresh Data.fs


Summary: You can't, not right away. The server will give an error. The trick is to create a bogus site in the new Zope server. This somehow modifies Zope; from what I have read on the Internet, it is the acl_users folder in the Zope root that gets changed.

One of my hobby projects was not on our backup bandwagon, so when I accidentally corrupted Data.fs by overwriting it (the tar command, it turns out, is very picky about not mixing up source and destination file names) I was out of remedies. Luckily I had made a "site.zexp" a few days earlier by exporting the site from within the Zope ZMI.

So, just delete the old Data.fs completely, start up the server and import the zexp. That works, but you cannot view any pages; you get AttributeError: getGroups. Googling, I found a posting by Andreas Jung. Jung does not spell out the fix, but writes:

Problem solved. The behavior is caused by the stupid expectation of Plone
that the root acl_users folder having been replaced with its own
implementation while creating a new site

...and from there it was possible to deduce that creating a new bogus site from within the ZMI should do the trick. And this can be done after the import.

Make rdiff-backup use a different port for ssh


Summary:

rdiff-backup --remote-schema "ssh -C -p9222 %s rdiff-backup --server" username@remoteserver::/path_to/filestobackup /path_to/backedupfiles

...worked for me, to back up a remote server via a non-standard ssh port (9222 in the above example). Note the double quotes around the string following --remote-schema. All examples I could find on the Internet used single quotes, and with rdiff-backup 1.2.8 between two CentOS 5 machines, that did not work.

Get Spotify working with PulseAudio on Ubuntu Linux


I had problems getting Spotify to work under Ubuntu and Wine, with a Microsoft LifeChat LX-3000 headset. The sound chopped 2-3 times per second.

Using OSS and normal Wine worked fine with the internal sound card of the laptop, but not with the USB headset.

I found this discussion thread and tried different remedies. The one that worked was Neil Wilson's fork of Wine, WinePulse, with support for PulseAudio.

You can update your system with unsupported packages from this untrusted PPA by adding ppa:neil-aldur/ppa to your system's Software Sources.


Read more: Release Packages : Neil Wilson

HD Video from Canon HF200 on Ubuntu Linux - convert and play


Notes to self on how to play videos from the camera on my Linux computer.

The files that the Canon HD video camera outputs have the suffix MTS. These can be played by Videolan client on my Dell Celeron-equipped laptop. Well, kind of: It plays the first two frames or so, then chokes on the video and keeps playing the sound.

The mts files can be converted to other formats with ffmpeg. The video from the Canon camera seems to be interlaced. If you use ffmpeg straight off the bat like so:

ffmpeg -i canonvideo.mts -sameq video.mp4

lines will be all wavy because the canon format is interlaced. Use the -deinterlace option like this:

ffmpeg -deinterlace  -i canonvideo.mts -sameq video.mp4

The mp4 file then plays effortlessly with vlc on the computer.
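
On more recent ffmpeg versions the -sameq option has been removed and -deinterlace has, as far as I know, been dropped in favour of the yadif filter. A rough equivalent, assuming an ffmpeg build that includes yadif, could look like the sketch below. The function name deinterlace_mts is invented for this example, and the encoder settings are left at ffmpeg's defaults rather than trying to mimic -sameq.

```shell
# Sketch for newer ffmpeg builds: deinterlace with the yadif video filter.
# deinterlace_mts is a hypothetical helper name.
deinterlace_mts() {
    in=$1                     # e.g. canonvideo.mts
    out=${2:-${in%.*}.mp4}    # default: same name with an .mp4 suffix
    ffmpeg -i "$in" -vf yadif "$out"
}
```

For example, deinterlace_mts canonvideo.mts video.mp4 corresponds to the -deinterlace command above.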

It should be possible to make the output interlaced as well, with the ilme flag.

However, this:

ffmpeg  -i 00002.MTS -sameq -flags ilme video.mp4

still creates wavy lines. There are a number of idioms with ilme in them floating around the Internet, and I am not sure how to use it.

Put together a video split in parts

The Canon HF200 splits long videos into separate files, each part about 2GB in size. These files cannot be converted as individual videos! Well, the first one can, but each of the following ones depends on the one before. This is because the camera, in order to save space, writes incomplete frames to the files, frames that only contain the changes compared to previous frames. Some frames, however, are complete on their own and are usually called key frames. When the camera splits the video into files, it does not take care to do this at key frames.

Before conversion, the video therefore needs to be put together into one large file. On Linux this can be done with the cat command:

cat 00001.MTS 00002.MTS 00003.MTS > whole-video.MTS
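
Since a missing or misordered part ruins the result, it can be worth wrapping the cat call in a small check. The following is a sketch, with join_mts as a made-up name; it only verifies that the joined file is as large as the parts together, not that the video is playable.

```shell
# Sketch: join the parts of a split recording, in the order given,
# and check that no bytes went missing. join_mts is a hypothetical name.
join_mts() {
    out=$1          # name of the joined file
    shift           # the remaining arguments are the parts, in order
    cat "$@" > "$out" || return 1
    # $(( )) strips any padding that wc may put around the number
    expected=$(( $(cat "$@" | wc -c) ))
    actual=$(( $(wc -c < "$out") ))
    [ "$expected" -eq "$actual" ]
}
```

Usage would be join_mts whole-video.MTS 00001.MTS 00002.MTS 00003.MTS. A glob such as 0000?.MTS also works for the parts, since the shell expands it in sorted order.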

Slow motion a video and save to file


This guide shows you, on Linux,  how to make a slow motion video of an mp4 video (in this case downloaded from Youtube), with the pitch of the sound intact. There is probably a one-line command to do this in mencoder, ffmpeg or vlc, if so please enlighten me. This guide starts with a terse summary, and then continues with a more verbose explanation.

Summary

You need:

  • mencoder (part of mplayer)
  • ffmpeg
  • sox

All are open source and freely downloadable.

Assume the video is called "normal.mp4" and should be made into a slow motion video called "slomo.mp4", with the pitch intact so we do not get those growling noises.

First to slow down the video to half speed, use mencoder, part of mplayer:

mencoder -ovc copy -oac mp3lame -speed 0.5  normal.mp4 -o slow.mp4

Extract the sound with ffmpeg:

ffmpeg -vn  -i slow.mp4 slow.wav

You may now discard the slow.mp4 file.

Pitch it up:

sox slow.wav slow_but_pitched_up.wav pitch 1200

Make a slow version of normal.mp4 with no sound:

mencoder -ovc copy -nosound -speed 0.5  normal.mp4 -o slow_no_sound.mp4

You can put sound and video together with mencoder:

mencoder -ovc copy -audiofile slow_but_pitched_up.wav -oac faac slow_no_sound.mp4 -o slomo.mp4

or use ffmpeg:

ffmpeg -i slow_no_sound.mp4 -i slow_but_pitched_up.wav -map 0.0 -map 1.0 slomo.asf

(I think the sound needs to be compressed in the command above.)

If you can make an mp4 file instead of an asf file, so much better. On my machine it complained about not having the correct codec for sound; I am still looking into that.

Slow motion an mp4 video on Ubuntu Linux 9.10

--longer explanation and screenshots --

If you watch a video in vlc, you can slow it down; the sound gets slower but stays at the original pitch, which is neat. I was unable to find an "OK, good, now play through this and save it to a file" setting in vlc, so below is the road I took to finally convert a video file into a slow motion video file, with pitch intact.

First to slow down the video to half speed, use mencoder, part of mplayer:

mencoder -ovc copy -oac mp3lame -speed 0.5  normal.mp4 -o slow.mp4

This will slow the video down to half speed, but unfortunately it also lowers the pitch of the sound. Mplayer has a switch for affecting the pitch, but mencoder does not pick it up.

The -speed flag above sets the speed, with 0.5 being half speed.

So far I have been unable to make mencoder preserve the pitch (i.e. pitch shift it back), but sox can pitch shift. However, it only operates on sound files. So the sound of the mp4 file needs to be separated out, then sox can operate on it, and then we combine the sound and video again.

Separating out the sound

On my Ubuntu 9.10 I had to install "libavcodec-extra" to get it to work. The command looks like this:

ffmpeg -vn  -i slow.mp4 slow.wav

Now sox can operate on it. AFAICT sox should be able to read mp3 too, but not on my machine, despite library installations and hand waving.

Pitching up the sound

Now sox can pitch it up

sox slow.wav slow_but_pitched_up.wav pitch 1200

Sox is special in that it wants the input and output files first, and the effect with its arguments after them. Sox has an effect called "pitch", which takes a shift in cents, where 100 cents is one semitone and hence 1200 is an octave. We want an octave shift since we slowed down the video to 50%, which halves the frequency, and an octave up is a doubling of frequency (pitch).
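
For other speed factors than 0.5, the pitch shift needed to compensate is 1200 × log2(1/speed) cents. A small sketch using awk, with cents_for_speed as a name made up for this example:

```shell
# Sketch: pitch correction in cents for a given playback speed factor.
# Slowing down to "speed" lowers the pitch by log2(1/speed) octaves,
# and one octave is 1200 cents.
cents_for_speed() {
    awk -v speed="$1" 'BEGIN { printf "%.0f\n", 1200 * log(1 / speed) / log(2) }'
}
```

With this, sox slow.wav slow_but_pitched_up.wav pitch "$(cents_for_speed 0.5)" is the same as the command above, since half speed works out to 1200 cents.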

Combining slow motion video and pitched up sound

Now we need to combine the sound and the video.

You can put sound and video together with mencoder, using the soundless slow video slow_no_sound.mp4 (made with the -nosound command further down):

mencoder -ovc copy -audiofile slow_but_pitched_up.wav -oac faac slow_no_sound.mp4 -o slomo.mp4

There is some problem with that file though since ffmpeg reports:

Seems stream 0 codec frame rate differs from container frame rate: 29.97 (30000/1001) -> 14.99 (15000/1001)

It plays fine in vlc, though

You can also use ffmpeg, like so:

ffmpeg -i slow_no_sound.mp4 -i slow_but_pitched_up.wav -map 0.0 -map 1.0 slomo.asf

 

If you can make an mp4 file instead of an asf file, so much better. On my machine it complained about not having the correct codecs with ffmpeg.

vlc has a gui for combining sound and video for different files. First, for it to work I had to produce a slow version of the video with no sound, so rerun the command from the beginning of the guide, but make a video file with no sound:

mencoder -ovc copy -nosound -speed 0.5  normal.mp4 -o slow_no_sound.mp4

Then the GUI in vlc can combine them. Start vlc and choose "Convert/Save" from the "File" menu.

In the dialogue, select your slow motion file with no sound. Tick "Show more options" and "Play another media synchronously", click "Browse" to add extra media, and select the sound file there.

Click "Convert/Save" in that dialogue, and you get a second dialogue.

 

Here you have to experiment a little to select a profile that uses codecs that

  • you have on your system
  • vlc realises you have on the system

Happy slow motioning!

There is this option in vlc; I wonder if it could be used for something:

--audio-time-stretch, --no-audio-time-stretch
Enable time stretching audio (default enabled)
This allows to play audio at lower or higher speed without affecting
the audio pitch (default enabled)

How to use the reverse-i-search in bash


A quicker alternative to hitting the up arrow to get back an old command that you typed some time ago is to hit Ctrl-r, and then type a substring of the command you're looking for.

For example, if you want to type "ssh -p 1022 username@ahost.domain" and have typed it before, you can just hit Ctrl-r followed by 1022.

If you do not get the command you are looking for, hit Ctrl-r again and it will find the next line in your command history that fits the pattern.


If you want to go to some other command, press ctrl+r again to move backwards. This will speed up your whole process.


Read more: Just another Programmer: reverse i search for linux users

Move gobs of small files quickly with ssh and "|" in one go


If you need to copy a lot of files to another computer in Linux, you can use scp with the "r" and "C" flags.

"r" makes it copy recursively and "C" makes it compress the files before sending them, which saves a bit of bandwidth. However, the overhead of using scp for copying many small files is huge: on my machine, transporting a couple of thousand files took at least an order of magnitude (10 times) longer, maybe two. One solution is to use tar to first create a tarball, and then copy that over.

However, sometimes you may not have space on the source device to make a tarball. In that case you can tar it "on the fly" and pipe it not to scp, but to ssh. It turns out that you can hand ssh a command to run right at login:

tar zcf - SOURCEDIR | ssh user1@remotehost "cat > /DESTDIR/DESTFILE.tar.gz"

If you are not already authorized with keys on the remote host, it will prompt you for a password.

I pretty much stole the command above from:
Lâmôlabs » Pushing & Pulling Files Around Using tar, ssh, scp, & rsync

I just omitted the "v" flag, since even though you can still supply your password, the prompt gets buried in verbose output with the "v" flag.

 

And if you do not want a tarball as the end result on the remote computer, you can unpack it on the fly:

tar zcf - SOURCEDIR | ssh user1@remotehost tar zxf -

Again taken from the Lâmôlabs page, again omitting the "v" flag.
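
If the files should end up in a specific directory on the remote machine rather than in the login directory, tar's -C option can be added on the receiving side. The following is a sketch only: push_dir and its argument order are invented for this example, and it assumes that the destination directory already exists and has no single quotes in its name.

```shell
# Sketch: stream a directory to a remote host and unpack it there.
# push_dir is a hypothetical helper, not part of tar or ssh.
push_dir() {
    src=$1       # local directory to copy, e.g. SOURCEDIR
    remote=$2    # ssh target, e.g. user1@remotehost
    dest=$3      # existing directory on the remote side

    # pack relative to the parent directory, so that only the last
    # path component is recreated under $dest on the remote side
    tar czf - -C "$(dirname "$src")" "$(basename "$src")" |
        ssh "$remote" "tar xzf - -C '$dest'"
}
```

For example, push_dir SOURCEDIR user1@remotehost /DESTDIR would create /DESTDIR/SOURCEDIR on the remote host.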

Getting hard disks up to DMA speed on Proliant ML110/CentOS Linux


 

Summary: Editing the menu.lst file in the grub directory of the /boot directory to include "ide0=noprobe ide1=noprobe" as parameters for my default boot option worked for me. Your mileage may vary, of course.

 

A machine that I use as a home server, an HP Proliant ML110 G5, had been taking very long to copy large files. A 5GB file seemed to take about 3 hours. This is of course wholly unreasonable. Something had to be wrong, but where? I suspected it wasn't using DMA mode, which is a quicker mode of data transfer. Googling around, I found that hdparm should be able to check DMA mode and set it if not enabled. Well, hdparm reported that it was not set and furthermore that it could not be set. Another hint that something was wrong was that the drives were identified as /dev/hd*, not /dev/sd*. Further googling suggested it could be a BIOS setting. I looked into the BIOS but everything was OK there. Eventually I found this page, which says that the correct kernel driver may not be loaded, and that a generic, very conservative driver is loaded instead:

Google suggests booting with ide0=noprobe ide1=noprobe to make sure the ata-piix driver is used. If you don't want to reinstall then make sure initrd contains the ata-piix driver and that references to /dev/hd* are replaced with /dev/sd* in fstab etc.

Read more: [CentOS] Re: DMA mode

 

So the question is: do I have the ata-piix driver on my system? It is probably not a good idea to disable one driver without having the one you want to use instead installed. It turns out the file you should be looking for on the file system (at least on my CentOS) is called "ata_piix", not "ata-piix".
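
One way to check for the driver is to search the module directory of the running kernel. The helper below is a sketch: has_module is an invented name, and the default path /lib/modules/$(uname -r) is an assumption that should hold on CentOS and most other distributions. It will not find a driver that is compiled into the kernel itself.

```shell
# Sketch: succeed if a kernel module file with the given name exists.
# has_module is a hypothetical helper name.
has_module() {
    name=$1                                # e.g. ata_piix
    dir=${2:-/lib/modules/$(uname -r)}     # module tree to search
    # module files are called NAME.ko, possibly with a compression suffix
    find "$dir" -name "$name.ko*" 2>/dev/null | grep -q .
}
```

For example: has_module ata_piix && echo "driver present". Note the underscore in the name, as described above.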

I edited the menu.lst file in the grub directory of the /boot directory to include "ide0=noprobe ide1=noprobe" as parameters (You may want to make a new boot option for this so you can easily go back). I then rebooted and edited fstab to mount the new /dev/sd* devices after checking in /dev what they were, and rebooted again.

Before and after

Before, the performance according to hdparm was like this:

/dev/hda:
 Timing cached reads:   5052 MB in  2.00 seconds = 2526.92 MB/sec
 Timing buffered disk reads:   10 MB in  3.71 seconds =   2.69 MB/sec
[root@firefly ~]# hdparm -tT /dev/hdc

/dev/hdc:
Timing cached reads:   4920 MB in  2.00 seconds = 2460.83 MB/sec
 Timing buffered disk reads:   10 MB in  3.71 seconds =   2.70 MB/sec
[root@firefly ~]# hdparm -d1 /dev/hdc

 

And now it is like this:

 

[root@firefly ~]# hdparm -tT /dev/sda
/dev/sda:
 Timing cached reads:   4948 MB in  2.00 seconds = 2475.64 MB/sec
 Timing buffered disk reads:  180 MB in  3.02 seconds =  59.53 MB/sec
[root@firefly ~]# hdparm -tT /dev/sdb
/dev/sdb:
 Timing cached reads:   4840 MB in  2.00 seconds = 2421.36 MB/sec
 Timing buffered disk reads:  286 MB in  3.02 seconds =  94.67 MB/sec

 

So a neat improvement on buffered disk reads by a factor of 35, or 3500%, on my second disk, and by a factor of 22, or 2200%, on the slightly older drive. Copying the 5GB file now seems to take around 2 minutes, which is an even bigger difference in speed.
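
The factors quoted above are simply the ratio of the buffered disk read figures after and before the change. As a quick check with awk (speedup is a made-up helper name):

```shell
# Sketch: ratio between two throughput figures, rounded to one decimal.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", after / before }'
}
```

speedup 2.70 94.67 comes out at roughly 35, and speedup 2.69 59.53 at roughly 22, matching the hdparm numbers listed above.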

 

Here is some background information on SATA in Linux and drivers.
