Archives 2005 - 2019

Link - Neural networks for extracting useful text from HTML

published Feb 10, 2011 01:04   by admin ( last modified Feb 10, 2011 01:04 )

 

You’ve finally got your hands on the diverse collection of HTML documents you needed. But the content you’re interested in is hidden amidst adverts, layout tables or formatting markup, and other various links. Even worse, there’s visible text in the menus, headers and footers that you want to filter out. If you don’t want to write a complex scraping program for each type of HTML file, there is a solution.



Read more: The Easy Way to Extract Useful Text from Arbitrary HTML - AI Depot


Cheaper virtual hosting

published Jan 23, 2011 05:39   by admin ( last modified Jan 23, 2011 05:39 )

For some hobby projects I have been looking for cheaper virtual hosting. I have not tested these two; I am just noting them here for future reference:


Virpus | Unmanaged VPS, Cheap VPS, VPS Hosting, Cloud Hosting, Cheapest VPS, Virtual Private Server Hosting, VPS Hosts

 

 x10VPS - Features & Information


Link - Robotshop - when you want to hack the physical world

published Jan 21, 2011 03:05   by admin ( last modified Jan 21, 2011 03:05 )

I have not used them, but I am putting this here as a bookmark. They seem to have robots, servos, Arduinos and more.

RobotShop, the World's Leading Robot Store for Personal and Professional Robot Technology. Here you will find personal robots, professional robots, robot toys, robot kits and robot parts for building your own robots. If you are looking for robot pet care, robot floor cleaners, robot vacuums, robot pool cleaners or robot mowers, to do your household chores, this is the site for you. We also bring robots back to life™ via our Robot Repair Center.



Read more: RobotShop | Robot Store | Robots | Robot Parts | Robot Kits | Robot Toys


Many scientific studies not so scientific

published Jan 19, 2011 11:51   by admin ( last modified Jan 19, 2011 11:51 )

An increasingly noticed phenomenon is that it is becoming harder to replicate results from earlier scientific studies. This applies to psychology and medicine, among other fields. A reasonable explanation is that the studies were not that scientific to begin with. It could, for example, be because:

  • The measured data were subjectively misjudged
  • What counts as significant (less than 5% probability that the result is due to chance) was decided after the study had started. For example, you set out to look for a correlation between hair length and birth weight, but instead "discover" a relationship between hair length and length at birth. That is, you shop around among variables until you find those that happen to be significant by chance. The significance criterion can only hold if you have decided in advance (a priori) what you are looking for
  • Different experiments are repeated until significance is reached
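The "shopping around among variables" problem can be put in numbers. The sketch below is an illustration only, not taken from any of the studies discussed: it assumes the tests are independent and that there is no real effect at all, and computes how likely it is that at least one test still comes out "significant" at the 5% level. The function name is made up for this example.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests on pure
    noise comes out 'significant' at level alpha (assumes independence)."""
    return 1.0 - (1.0 - alpha) ** num_tests


# The more variables you try, the more likely a spurious "discovery" becomes.
for n in (1, 5, 20, 100):
    print(f"{n:3d} variables tried: "
          f"{chance_of_false_positive(n):.0%} chance of a spurious finding")
```

With 20 variables tried, the chance of at least one spurious hit is already about 64%, which is why the criterion has to be fixed before looking at the data.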


A discussion of the vanishing significance in repeated experiments can be found in The New Yorker:

This suggests that the decline effect is actually a decline of illusion. While Karl Popper imagined falsification occurring with a single, definitive experiment—Galileo refuted Aristotelian mechanics in an afternoon—the process turns out to be much messier than that. Many scientific theories continue to be considered true even after failing numerous experimental tests. Verbal overshadowing might exhibit the decline effect, but it remains extensively relied upon within the field. The same holds for any number of phenomena, from the disappearing benefits of second-generation antipsychotics to the weak coupling ratio exhibited by decaying neutrons, which appears to have fallen by more than ten standard deviations between 1969 and 2001.

Read more: The decline effect and the scientific method : The New Yorker

There is of course also the possibility that the world gets tired of the significance and adjusts itself, but that is what one would call an "extraordinary claim".


Getting keyring-less wireless network connection on Ubuntu

published Jan 11, 2011 02:43   by admin ( last modified Jan 11, 2011 02:43 )

Make wireless network available to all users.

This had been a minor annoyance since a re-install of 9.04. I had never noticed that little "available to all users" checkbox before.



Read more: Howto: Get Network Manager to stop asking you for your keyring password (pam_keyring) - Page 14 - Ubuntu Forums


Porsche 918 - can one ask for an ad-financed one?

published Jan 10, 2011 07:49   by admin ( last modified Jan 10, 2011 07:49 )

It is strange how some car designs just feel "right". Porsche's new 918 is one of them. It is a so-called retro design; I immediately think of race cars from the early 70s such as the Porsche 917. I did not find an image with a good license for the coupé version, the RSR, so here is an image of the convertible, the Spyder:

Porsche 918 Spyder. Source: BeverlyHillsPorsche

As for the other version, the RSR, it is shown with a lot of advertising stickers, perhaps again to evoke early 70s racing. So I hereby have a proposal for Porsche: could one imagine that, with enough advertising decals on the car, one could get one for free? I promise to drive carefully but not timidly, and not to remove any decals.

Porsche 917C, winner of Le Mans 1970. The image is in the public domain.

 

Read more: Porsche løfter sløret for ny ekstrem sportsvogn - Politiken.dk
Read more: Nytt ljus på bilmässan i Detroit - DN.SE

Porsche, which dropped Detroit four years ago, is now back and showed a race car in hybrid version: the Porsche 918 RSR hybrid.

Read more: Detroit – försiktigt positivt | Näringslivsnyheter | SvD

 

 


Moving specific files with find, perl and xargs

published Jan 09, 2011 02:33   by admin ( last modified Jan 09, 2011 02:33 )

Summary:

find . -print0 | perl -n0 -e 'print if m/\(\d+\)/' | xargs -0 -I xxx mv xxx copies/

I needed to find all files that had a parenthesized number in them, such as  "filename(3).txt", and move them to a copies folder.

It is a good idea to use -print0 to cater for complex filenames on the find side, and in perl you tell perl to go into null terminated mode with the "-0" switch as seen above.

If you then pipe it further to xargs, you use the "-0" switch to tell xargs that the file names are null terminated. xargs will assume you want the file names as the last part of the command given to xargs, but for mv you do not want this, since the last part of the command should be the destination folder. The -I switch allows you to specify a placeholder identifier for the input to xargs. Above I have chosen "xxx" for this. Hence in "mv xxx copies/", xxx is replaced with the file name.
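For comparison, roughly the same job can be sketched in Python, where file names never pass through a shell and so need no null termination. This is a sketch under stated assumptions, not a drop-in replacement: the function name is made up, it matches on the file name only (the perl one-liner matches on the whole path), and it skips anything already inside the destination folder.

```python
import re
import shutil
from pathlib import Path

# Same pattern as in the perl one-liner: a parenthesized number.
NUMBERED = re.compile(r"\(\d+\)")


def move_numbered_copies(root, dest):
    """Move files under root whose names contain e.g. "(3)" into dest.

    Hypothetical helper for illustration. Files with the same name in
    different subfolders would collide in dest; that is not handled here.
    """
    root, dest = Path(root), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # Materialize the listing first so that moving files does not disturb the walk.
    for path in sorted(root.rglob("*")):
        if not path.is_file() or dest in path.parents:
            continue
        if NUMBERED.search(path.name):
            shutil.move(str(path), str(dest / path.name))
            moved.append(path.name)
    return moved
```

Calling move_numbered_copies(".", "copies") would then correspond to the pipeline in the summary.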


When typing "d" minimizes all your windows with a remote desktop

published Dec 31, 2010 05:51   by admin ( last modified Dec 31, 2010 05:51 )

Of all the strange things that happen to a user in Linuxland, having all windows minimize whenever you type the letter "d" while in a VNC session has been one of the strangest.

A surprising number of Linux commands and programs, python code and Swedish and English words contain the letter "d", so the situation quickly gets untenable.

It turns out that on Ubuntu 10.10 the keyboard shortcut for minimizing all windows is spuriously set to plain "d" when accessing the desktop over VNC. A theory is that this happens because a key that is normally used in combination with d (meta) to minimize the windows is not found over VNC, and the desktop simply sets the shortcut to "d".

The remedy is to rebind the minimize all windows action to another key combination in System->Preferences->Keyboard Shortcuts .

I chose Windows + m.

See: Comment #3 : Bug #655886 : Bugs : “tsclient” package : Ubuntu

 

 


What does (yield) do in python?

published Dec 21, 2010 07:14   by admin ( last modified Dec 21, 2010 07:14 )

I browsed through the documentation for the Python package m4us and found a "simple example". In it, there are yield statements.

@coroutine(lazy=False)
def lines_producer(file_):
    """Emit lines from a file as messages."""
    inbox, message = (yield)
    for line in file_:
        if is_shutdown(inbox, message):
            yield 'signal', message
            break
        inbox, message = (yield 'outbox', line)
    (yield 'signal', ProducerFinished())

Ok, got that, kind of. Yield is followed by some arguments that it, well, yields. But in the code I found this:

inbox, message = (yield)

Uhm, what is that? It yields something into the method we are in. That's like the other way around. But from where?

I've made a test script that sheds some light:

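def a_generator():
    # Each yield hands its argument to the caller, then waits for the next send
    message = yield('return value of first yield')
    message2 = yield(message)
    message3 = yield(message2)

g = a_generator()
# The first send must be None: no yield is waiting to receive a value yet
print("first send:", g.send(None))
print("second send:", g.send('Second send'))
print("third send:", g.send('third send'))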

First, a function is defined. It has the name "a_generator", but it could be anything. However the fact that it contains at least one yield statement means that python will treat the function as a generator.

Now, on the first send, the method is executed until it reaches the first yield. That yield then returns to the caller, in this case with the string "return value of first yield". But it is only half the yield that is executed by the first send! For the next send, the execution is resumed in such a way that the first yield is the input point for the argument of the second send. The string "Second send" is assigned to the variable "message" on the second send.

So the first send used the first yield to get something back. But there is more life left in that yield: its ability to inject something into the method has not been used up yet, and the second send will do that. The python documentation on yield talks about suspending and resuming execution; I did not at first realize that it suspends and resumes right in the middle of the yield word, so to speak.

Since the first yield returns, the first call to send or next can feel a bit futile if you intend to send stuff into the generator. Your first send cannot contain any useful parameters. One way of dealing with this is to wrap the function inside a decorator that takes care of calling next or send the first time. See David Beazley's script here.
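A minimal sketch of such a priming decorator might look like this. It is written from scratch for illustration and is not Beazley's script; the names primed and collector are made up.

```python
import functools


def primed(func):
    """Hypothetical helper: create the generator and advance it to its first yield."""
    @functools.wraps(func)
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # the otherwise "futile" first call happens here
        return gen
    return start


@primed
def collector(received):
    """Store every value sent in, and yield back how many have been stored."""
    while True:
        item = yield len(received)
        received.append(item)


received = []
c = collector(received)
count = c.send("first")  # the very first send already carries a useful value
```

After the decorator has done the first step, every send from the caller both delivers a value and gets something back.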

Reading further on Beazley's site, it turns out he has made an excellent presentation on the topic. It turns out that yield combined with send is used not so much with generators as with something called coroutines. My guess is that the code at the top of this blog posting is coroutine-oriented, so my use of the term generator in this post may be a bit out of place.


novnc: VNC screen sharing in the web browser

published Nov 29, 2010 02:20   by admin ( last modified Nov 09, 2012 11:51 )

Sometimes, when you want to do screen sharing, it can be hard to get the client in place on the other person's computer. They may have trouble installing it, may not have the rights to install it, or feel it is just too much trouble.

Fear not, because nowadays one can use a regular web browser as a VNC client. All the person at the other end needs to have on their computer is a modern web browser, such as Google's Chrome.

 

 

Screen detail: A Chrome browser running noVNC. The upper half shows the web browser that noVNC is running in; further down is part of the VNC session that is running inside the browser (Gnome on Ubuntu 10.04).

The new generation of web browsers supports the canvas element of HTML 5. The canvas element allows the browser to display a bitmap in a rectangle, and this bitmap can be continuously manipulated pixel by pixel through javascript. Again, having a modern browser with a super fast javascript engine makes it realistic to use the canvas element for sharing a remote computer in real time.

 

I have found three implementations of VNC through the web browser:

ThinVNC works on Windows only on the server side as far as I can tell, and I run Linux, so that one is out. Guacamole needs a java servlet container (e.g. Apache + Tomcat), and I do not want to install that, so the choice fell on noVNC, and so far it works like a charm!

I have not been able to make it run on Firefox (v 3.6.12), but it does run on Chrome (v 8.0.552), both tested on Ubuntu Linux. A customer failed to connect with a Chrome 7 series, running on OSX, so my guess is Chrome 8 or better is a good bet. noVNC also works with Chromium (an open source version of Chrome) 9.0.565 on CentOS 5, but not with Firefox 3.6.4 on CentOS 5 in my tests.

All three programs, or at least noVNC and Guacamole, which are the ones I have paid attention to, require three components to run:

  • The VNC server of your choice,
  • A server that translates (proxies) the VNC server communication to a web browser friendly language (called WebSockets),
  • A javascript running in the web browser that takes care of painting the remote screen, and handles mouse and keyboard events in case one is not merely viewing but also controlling the remote machine.

 

What you need on the server side

I liked the way noVNC gets you up and running really quickly. First and foremost you need a VNC server. I have tested noVNC with Gnome's desktop sharing (on Ubuntu 10.04). After downloading and unpacking noVNC, you can start it right away in the terminal, with the command indicated in the readme file. This will start a WebSockets proxy server. The one used in noVNC is written in Python and hence needs no Java (in contrast to Guacamole). You need to indicate what port your VNC server is running on, so the WebSockets server knows where the VNC server is that it is proxying. VNC servers typically run on port 5900, or 5901, or 5902 or thereabouts. For desktop sharing it is often 5900.
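If you are unsure which port the VNC server is on, one way to check is to connect and look at the greeting. A VNC (RFB) server starts the conversation with a 12-byte version banner such as "RFB 003.008". The sketch below relies on that; the function name and the defaults are made up for illustration, and this is not part of noVNC.

```python
import socket


def probe_vnc(host="localhost", port=5900, timeout=2.0):
    """Return the RFB version banner (e.g. 'RFB 003.008'), or None.

    Hypothetical helper: None means nothing answered on that port, or
    whatever answered did not greet like a VNC server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            banner = b""
            # The banner is exactly 12 bytes: "RFB xxx.yyy\n"
            while len(banner) < 12:
                chunk = sock.recv(12 - len(banner))
                if not chunk:
                    break
                banner += chunk
    except OSError:
        return None
    if banner.startswith(b"RFB "):
        return banner.decode("ascii").strip()
    return None
```

Trying probe_vnc(port=5900), then 5901 and so on, shows which port to hand to the proxy.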

The command line also starts a web server that the remote client should connect to. You can vary the ports that the Websockets proxy and web server run on, to avoid conflicts with other server processes on your machine.

After having started noVNC, simply point the browser to the URL that the command returned in the terminal. This will load a web page and the javascripts needed to connect.

The web server is just a simple server to serve the html page and javascripts to the browser. If you already have a web server somewhere you can serve the files from there instead.

Encrypted connections

When using plain old VNC, it is a good idea to tunnel the session through ssh. With noVNC you can have the connection encrypted directly in the browser. The noVNC readme explains how: There is no need to connect to a different encrypted port; the proxy automatically recognises that the connection is encrypted. The proxy can be started with a --ssl-only flag to only accept encrypted connections, and encryption must then always be switched on in the web browser page, either under the settings menu (that looks like a button) or by supplying "encrypt=1" as an extra CGI parameter when loading the page. I have tested with Google Chrome (v 8.0.552), on Ubuntu Linux 10.10 and it worked fine. Ok, I didn't intercept the traffic between the browser and server to really verify it was encrypted, but I take in good faith that it was.
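As a small illustration of the "encrypt=1" parameter, here is a sketch that adds it to a page URL. Only the encrypt=1 parameter comes from the text above; the example URL, its page name and its host and port parameters are hypothetical placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_encryption(url):
    """Return url with encrypt=1 added to (or overriding) its query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["encrypt"] = "1"
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical URL, for illustration only.
secure_url = with_encryption("http://example.com:6080/vnc.html?host=example.com&port=6900")
```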

Installing the proxy server as a service

Once I had played around with noVNC for a while, I started to look into installing it with a bit more permanence. Now the HTML and the javascript for the web browser can be served from any old web server, and the VNC server is built into Gnome, but what about the WebSockets proxy, which translates the VNC to something the browser can understand? noVNC ships with no fewer than three alternative WebSockets proxies: one written in Python, one written in (server side) Javascript and one written in C. The Javascript version does not support encryption. The Python based proxy server can be run in the foreground with the -f flag, and as a daemon otherwise. But it would be nice to have it always running, and automatically started when the server reboots.

I decided to put it under the control of supervisord, a framework for running processes and keeping them alive, that I am familiar with from developing with Plone and buildout. Supervisord does not manage other daemonized processes. Instead it wants to run processes that think they are running in the foreground, but are in fact connected to supervisord. Supervisord in this way is tightly coupled with the processes it runs. It can restart and otherwise manage its child processes, without the need for pid files or other such things. It is also trivial to run multiple copies of a program.

On Ubuntu 10.04, you can install supervisord easily; it is one of the packages in the usual repositories. After it has been installed, it has an entry in the /etc/init.d directory, and in the /etc/supervisor/ directory you can add directions for it to run wsproxy.py as a foreground process. On install, supervisord is configured to start when the server boots. Here is what I use. This is an entry in the /etc/supervisor/supervisord.conf file on an Ubuntu 10.04 server:

 

[program:wsproxy]
command=/usr/local/bin/wsproxy/wsproxy.py -f 6900 --cert=self.pem --ssl-only localhost:5900
process_name=%(program_name)s
numprocs=1
directory=/usr/local/bin/wsproxy
stopsignal=TERM
user=a_user_name

It runs wsproxy.py on server startup, and will restart wsproxy if it ever goes down. wsproxy is configured above to proxy a VNC server running on port 5900 and serve it out as a websocket service on port 6900 for noVNC. wsproxy is configured above to require SSL encryption.
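Before reloading supervisord it can be handy to double check what such an entry actually says. The sketch below reads the entry with Python's configparser and pulls out the ports. It is an illustration only: the function name is made up, and it assumes the command line has exactly the shape shown above (listen port first, then host:port of the VNC server).

```python
import configparser
import shlex

ENTRY = """
[program:wsproxy]
command=/usr/local/bin/wsproxy/wsproxy.py -f 6900 --cert=self.pem --ssl-only localhost:5900
process_name=%(program_name)s
numprocs=1
directory=/usr/local/bin/wsproxy
stopsignal=TERM
user=a_user_name
"""


def describe_wsproxy(text):
    """Hypothetical helper: summarize the wsproxy entry of a supervisord config."""
    # supervisord does its own %(...)s expansion, so turn configparser's off.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    args = shlex.split(parser["program:wsproxy"]["command"])
    # Everything after the program that is not a flag: listen port, then target.
    listen_port, target = [a for a in args[1:] if not a.startswith("-")]
    host, _, vnc_port = target.partition(":")
    return {
        "listen_port": int(listen_port),
        "vnc_host": host,
        "vnc_port": int(vnc_port),
        "ssl_only": "--ssl-only" in args,
    }


summary = describe_wsproxy(ENTRY)
```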

Update:

There seems to be a bit of trouble with the websocket protocol that noVNC uses, but according to Wikipedia, Google Chrome should still support it in all versions. The crux is here:

http://hacks.mozilla.org/2010/12/websockets-disabled-in-firefox-4/

Check:

http://en.wikipedia.org/wiki/WebSockets#Browsers_supporting_WebSocket

for what browsers to use. However, the developer of noVNC, Joel Martin, points out in a comment to this blog post that noVNC has a built-in Flash-based WebSockets proxy, so with noVNC one is actually not dependent on WebSockets support in the web browser.

So, right now it seems like Firefox will not support WebSockets out of the box until they have fixed the issue. It seems, though, that you can turn it on again in about:config. Opera is turning WebSockets off by default from what I can see. However, I guess the browser versions mentioned above as working (Chrome v 8.0.552 and Chromium 9.0.565) will continue to work since they are already out, and it seems all Chrome versions support it.


js-test-driver, an alternative to webdriver and windmill

published Nov 09, 2010 09:13   by admin ( last modified Nov 09, 2010 09:13 )

Some quick notes for myself:

Two frameworks that allow you to test your web code at the end point, i.e. from a browser's point of view, are Webdriver (part of Selenium now) and Windmill.

A third option has emerged, js-test-driver. The tests in Windmill seem to be in Python (I have toyed briefly with it) while the tests in Webdriver are in Java (I use this regularly for a few test cases). js-test-driver seems to use Javascript.


  • easily integrates with continuous builds systems and
  • allows running tests on multiple browsers quickly to ease TDD style development.


Read more: js-test-driver - Project Hosting on Google Code


Problems with DAAP on Ubuntu 10.10

published Oct 31, 2010 06:23   by admin ( last modified Oct 31, 2010 06:23 )

Rhythmbox will not connect to a firefly (mt-daapd) server if it is running on the same machine on Ubuntu 10.10 Maverick Meerkat. Banshee will connect to it and show the playlists, but will not be able to play the songs. The servers work just fine though if you connect to them with Rhythmbox or Banshee from another Ubuntu 10.10 machine.

To make sure that this was not just a borked 10.10 installation on one machine, I connected to the second machine's server from the first, which worked fine. Connecting the second machine's clients to its own server did not work.

Problem temporarily handled by downgrading Rhythmbox to the version for 10.04.


Berglin doesn't understand how the economy works either

published Oct 31, 2010 05:27   by admin ( last modified Oct 31, 2010 05:27 )

The cartoonist Berglin often makes very funny and clever cartoons. But today you choke on your coffee when he arrives at the following conclusion:

 

No, our prosperity is not built on people in another corner of the world having damned low wages. At least not compared with what they would have if we did not buy from them at all. Then they would starve to death or be forced into desperate and destructive ways of life. The fact that we buy from them makes them richer: it makes it possible for them to put their children in school, buy a house, pay for an operation and so on. In other words: free trade and globalization help them live.

I don't really know what Berglin was thinking here...

 

Read more: Berglins | Serier | SvD


Firesheep reveals enormous and hard-to-plug security holes on the web: protection

published Oct 28, 2010 12:24   by admin ( last modified Oct 28, 2010 12:24 )

Most websites with a login authenticate the user by means of a small cookie. If someone can eavesdrop on the traffic between a visitor and the website, which is often relatively easy to do, that person can copy the cookie and suddenly gets access to the visitor's account, for example on Facebook.

This is not entirely easy to protect yourself against. The best protection is to use https (SSL/TLS) for all communication with sites you are logged in to. The cookies are then encrypted and cannot be intercepted and used. The problem is that a cookie the browser has received from a domain can, unless specifically stated otherwise, also be sent over http connections to the site. http connections can show up on an otherwise SSL-protected site for several reasons:

  • Some resources do not use SSL/TLS (images, videos etc.), because it gets slow since they:
    • are not cached on the browser side
    • and/or take processing power on the server side
  • Someone has forgotten to protect certain resources
  • The site lets its resources be part of other websites (such as Facebook's "Like" buttons) and has not protected these with https (SSL/TLS)

Since many social sites allow their resources to be used on other sites (as web bugs, for example), it is very hard for a web user to know which websites one's browser is communicating with at any given moment.

How do you protect yourself?

  • The best option seems to be to make sure that your browser only accepts cookies from SSL addresses that have the secure flag set on the cookies. Such cookies are never sent by the browser over an unencrypted link. I have not seen any plugin that ensures this, though. A variant is that the browser itself sets the secure flag on cookies that come from an https address.
  • Work over a VPN or another encrypted tunnel, and then make a risk assessment of whether someone can eavesdrop past the end of the tunnel
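The variant where the secure flag is added to cookies arriving over https can be sketched in a few lines. This is an illustration of the idea only, not code from any existing plugin. It works on the value of a single Set-Cookie header, and the function name is made up.

```python
def force_secure(set_cookie_value):
    """Return the Set-Cookie value with the Secure attribute guaranteed.

    Hypothetical helper: the first ';'-separated part is name=value, the
    rest are attributes such as Path, HttpOnly or Secure.
    """
    attributes = [part.strip() for part in set_cookie_value.split(";")]
    if any(attr.lower() == "secure" for attr in attributes[1:]):
        return set_cookie_value  # already marked, leave untouched
    return set_cookie_value + "; Secure"
```

For example, force_secure("sid=abc123; Path=/") gives "sid=abc123; Path=/; Secure", after which the browser would refuse to send that cookie over plain http.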

Plugins that at least do something today are, for example:

 

I just found an old Firefox plugin, Secure Cookies, which among other things contains the following code (in securecookies.js):

if(topic == "http-on-examine-response" )
{
    if(this.secure){
        var httpChannel = subject
            .QueryInterface(Components.interfaces.nsIHttpChannel);
        var cookie = httpChannel.getResponseHeader("Set-Cookie");
        //is there a Secure Cookie in the Header
        if(cookie.search(/secure/i) != -1)
        {
            //cookieheader = all not secure cookies
            var cookieheader = this.setCookiefromString(cookie, subject.URI);
            httpChannel.setResponseHeader("Set-Cookie", cookieheader, false);
            cookie = httpChannel.getResponseHeader("Set-Cookie");
        }
    }
}

Here there should be opportunities to insert a mandatory secure criterion. But the plugin has not been updated for newer Firefox versions, so some work is probably needed.

You can’t simply avoid visiting the sites that are being attacked here. There’s an enormous amount of mixed content on the web today, such as the Facebook “Like” button, Digg’s “Digg It” button, twitter widgets, and even embedded images that are hosted on Flickr or other photo sharing sites. Every time you access any web page that includes any of this content, your browser also sends any authentication cookies you have with the request to pull down the widget. TechCrunch is a great example of this, every article has lots of little widgets to share it on numerous social sites.



Read more: Firesheep, a day later - codebutler


Getting the network applet back in Ubuntu 10.04

published Oct 27, 2010 12:37   by admin ( last modified Oct 27, 2010 12:37 )

Suddenly the little icon that shows your network connection status was gone. This was a bit of a problem since I could not configure the computer to connect to a wireless network anymore.

It turns out that the applet behind the missing icon is called nm-applet. Trying to start it with:

nm-applet

did not work since it was reported to be already running. Well, we can do something about that:

killall nm-applet
nm-applet

and it turned up again.


Alt-dot not working in Ubuntu 10.10 - workaround

published Oct 25, 2010 08:07   by admin ( last modified Oct 25, 2010 08:07 )

In Gnome Terminal in my installation of Ubuntu 10.10 (Maverick Meerkat), the trusty old alt + "." does not work anymore. It usually prints the last argument used in bash, but now it just prints a dot ".".

The workaround seems to be to use Esc + "." instead, which does the same thing.

Found it here:

Equivalent/Alternatives for Alt+Dot in Mac
One thing i miss from the Linux shell is the Alt+Dot shortcut. What it does is insert the last argument of the previous command. It appears to be a trivial technique, but it's really useful.



Read more: 1 bash history scrolling - Gooduser.info


There is a plugin for Rhythmbox for browsing directories

published Oct 24, 2010 02:52   by admin ( last modified Oct 24, 2010 02:52 )

One annoying thing with the otherwise capable Rhythmbox for Gnome is that your music library becomes one big soup of songs. However, there is a plugin for Rhythmbox that allows you to browse your directories. It is called folderview and allows you to subselect from your music collection based on path, which means that you can browse in a tree to a directory within your music repository, and folderview will display and play all songs in that directory and its subdirectories.

When I say "will play", that is exactly what it does: it will immediately switch to and start playing the first song it finds, if it detects that Rhythmbox is already playing something. Besides this minor quirk it is a very useful plugin. Kudos to the author and to the architects of the Rhythmbox plugin system!

If you want completely separate entities under Rhythmbox, with separate playlists, take a look at the mt-daapd (Firefly) DAAP server in conjunction with the Rhythmbox DAAP plugin.

Rhythmbox folder view plugin



Read more: folderview - Project Hosting on Google Code


Finding the trashcan in Ubuntu 10.04

published Oct 20, 2010 03:23   by admin ( last modified Oct 20, 2010 03:23 )

In a freshly upgraded 10.04 I eventually found the trash in ~/.local/share/Trash

I could not find it in the GUI.
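With the GUI out of the picture, the contents can be listed from a script. The sketch below assumes the layout from the freedesktop.org trash specification, where the trashed items sit in a "files" subdirectory of the trash folder. The function name is made up for illustration.

```python
from pathlib import Path


def list_trash(trash_dir=None):
    """Return the names of trashed items, sorted; empty list if there are none.

    Hypothetical helper. trash_dir defaults to ~/.local/share/Trash, and
    the items are assumed to live in its "files" subdirectory.
    """
    trash_dir = Path(trash_dir) if trash_dir else Path.home() / ".local/share/Trash"
    files_dir = trash_dir / "files"
    if not files_dir.is_dir():
        return []
    return sorted(entry.name for entry in files_dir.iterdir())
```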


Make your own budget at The Guardian

published Oct 19, 2010 04:31   by admin ( last modified Oct 19, 2010 04:31 )

The Guardian is one of the newspapers that has really understood the Internet. Now they have made a Flash application in which you can decide for yourself where to make cuts in the British national budget.

One ought to hook this up so that you can see other people's budgets too.

The coalition says it must slash billions from public spending to tackle the UK's growing budget deficit. George Osborne's comprehensive spending review will reveal where the axe will fall. But should he cut as deep? And is he cutting the right things?



Read more: Comprehensive spending review interactive: you make the cuts | Politics | guardian.co.uk


lastmatch.py - command that identifies mp3s

published Oct 12, 2010 09:53   by admin ( last modified Oct 12, 2010 09:53 )

lastmatch.py is a part of pylastfp, a python module that allows you to:

  • fingerprint your mp3s and hence find other rips of the same music
  • or find out the correct name and title for a given mp3.

The latter is achieved by the lastmatch.py script searching the Last.fm database over the net for matching audio fingerprints and returning the metadata for those files. Metadata is limited to title and artist.


This is a Python interface to Last.fm's acoustic fingerprinting library (called fplib) and its related API services. It performs fingerprint extraction, fingerprint ID lookup, and track metadata lookup. It also comes with some helpers for decoding audio files.


Read more: Python Package Index : pylastfp 0.2