Discogs.com is one of my favorite websites. I discovered it 15 years ago and have used it ever since to research records, music and such. I recently discovered that they have an API, so I used it to search for each record in my collection and, where it found results, saved a link to the Discogs page and a link to the thumbnail image in my database.

The process was rather convoluted. To start I just did a search for the data as it was in my database - artist, title, label and catalog number. But Discogs often has multiple entries for each record - maybe it was released in different countries, or re-released, or has separate entries for promos, test pressings and white labels. So my starting algorithm was as follows:

  • Send the full data from my database (artist, title, label, catalog number) as a search request to the Discogs API.
  • If one and only one result is returned, take that one and link it.
  • If more than one result is returned, filter out the entries for promos, mispresses, white labels, etc.
  • If exactly one result remains, use that one. Otherwise mark the number of results returned in the database and move on to the next record. (A rough sketch in code follows.)

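Here's a rough sketch of that first pass in PHP, using Guzzle. The endpoint and the query parameters (artist, release_title, label, catno, token) come from the Discogs database search API as I understand it; the function name, the User-Agent string and the format-based promo filter are my own simplifications, not my exact code:

    <?php

    use GuzzleHttp\Client;

    // Run one search pass against Discogs and weed out the noise.
    function searchDiscogs(Client $client, array $query, string $token)
    {
        $response = $client->get('https://api.discogs.com/database/search', [
            'headers' => ['User-Agent' => 'MyRecordDb/1.0'], // made-up app name
            'query'   => $query + ['token' => $token],
        ]);
        $results = json_decode((string) $response->getBody(), true)['results'];

        if (count($results) > 1) {
            // Filter out promos, test pressings, white labels, etc., using
            // the "format" list that comes back with each search result.
            $results = array_values(array_filter($results, function ($r) {
                $formats = array_map('strtolower', $r['format'] ?? []);
                return !array_intersect($formats, ['promo', 'test pressing', 'white label']);
            }));
        }

        // The caller links the release if exactly one result remains, and
        // otherwise stores count($results) in the database and moves on.
        return $results;
    }
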
This matched a couple hundred of the couple thousand records in my database. Most of my records got zero matches on Discogs, and some still had multiple matches - anywhere from 2 to 35. So I started reviewing the ones with multiple matches by hand, and I realized that for most of the records with under 5 matches, the candidates were pretty much equivalent. So for those I just took the first match and assigned it, which matched another couple hundred records.

Now I put aside the few remaining records with 5 to 35 potential matches and focused on the thousand or so that had no matches at all. Reviewing some of them manually, I found that many of the misses were due to typos in my database. So my next step was to omit the artist field and search on just the title, label and catalog number, which got me another couple hundred matches. Then I went further and searched using only the catalog number. This matched about half of the remaining unmatched records - but I had to manually verify each match because catalog numbers are not unique.
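
Reusing that hypothetical searchDiscogs() helper, the fallback passes amount to retrying the same search with progressively fewer fields, roughly like this ($record here holds a record's fields keyed by the Discogs parameter names):

    // Each pass drops a field. The catalog-number-only pass needs manual
    // verification afterwards, since catalog numbers are not unique.
    $passes = [
        ['artist', 'release_title', 'label', 'catno'],
        ['release_title', 'label', 'catno'], // drop artist to sidestep typos
        ['catno'],                           // last resort
    ];

    foreach ($passes as $fields) {
        $results = searchDiscogs($client, array_intersect_key($record, array_flip($fields)), $token);
        if (count($results) === 1) {
            // Link $results[0] to the record and stop.
            break;
        }
    }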

Unfortunately I do not have a catalog number for every record in my collection, and as of now about a third of the records in my database are still unmatched. For those that are matched, the record information page now shows a link to the discogs.com page for that record, as well as a thumbnail pulled from discogs.com if one is available. For anyone interested in collecting records, I highly recommend discogs.com; it is by far the most comprehensive database of music releases I know of.

Labels: personal, music

Redis

Feb. 28, 2017, 5:05 p.m.

I've been messing around with Redis for a little while now, and I'm using it in a couple of places on this site. The first thing I did was start caching some DB queries that get run a lot, like the main blog page, the blog archives menu and the list of recent posts on the home page. Laravel's Cache facade makes caching really easy, and you can switch between the default cache driver, which caches to files, and Redis without changing any of the code. For some pages I use the Laravel Cache facade, and for others I use the Redis facade to write directly to Redis, just for some variety. In general it is probably much better to use the Cache facade than the Redis facade, because then you can switch to a different caching mechanism with one change to the .env file instead of having to rewrite code.
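
Here is roughly what the two approaches look like side by side, assuming a Post model (the keys and queries are made up for illustration; note that in Laravel 5.x the Cache TTL is given in minutes):

    <?php

    use Illuminate\Support\Facades\Cache;
    use Illuminate\Support\Facades\Redis;

    // Cache facade: backend-agnostic. Whether this hits files or Redis
    // depends on CACHE_DRIVER in .env. Cached for 60 minutes.
    $recentPosts = Cache::remember('recent_posts', 60, function () {
        return \App\Post::latest()->take(5)->get();
    });

    // Redis facade: talks to Redis directly, so switching to another cache
    // backend later means rewriting this code. Predis returns null for a
    // missing key; the TTL for SETEX is in seconds.
    $archives = Redis::get('blog_archives');
    if ($archives === null) {
        $archives = \App\Post::selectRaw('YEAR(created_at) AS year, COUNT(*) AS posts')
            ->groupBy('year')
            ->get()
            ->toJson();
        Redis::setex('blog_archives', 3600, $archives);
    }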

I also use Redis to queue some tasks which don't need to be done synchronously, but that's another post.

I never used Memcached because I didn't like the fact that everything is stored only in memory, so if the server goes down you lose all of the data in it. Redis, on the other hand, persists data to disk by default, so it provides the speed of keeping data in memory with a very low risk of losing it. In my case I store the data in the DB and cache it in Redis, but if I were starting from scratch (and had plenty of RAM on my server) I would probably keep a lot more data in Redis.
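
For the curious, the persistence I'm relying on is Redis's default RDB snapshotting. The stock redis.conf (in the 3.x releases current as I write this) ships with save points along these lines, and you can turn on the append-only log if you want stronger durability:

    # Snapshot to dump.rdb if at least N keys changed within M seconds
    save 900 1        # after 900 sec if at least 1 key changed
    save 300 10       # after 300 sec if at least 10 keys changed
    save 60 10000     # after 60 sec if at least 10000 keys changed

    # The append-only log is off by default
    appendonly no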

So, to summarize, I think Redis is awesome and I will definitely make more use of it in my stack in the future.

Labels: coding

Database Change to My Record Collection

Feb. 28, 2017, 5:04 p.m.

In my records table I used to have the full text of the label in each row, but I ended up with small typos in label names that screwed things up: the same label spelled two slightly different ways would show up as two different labels. So I separated the labels out into their own table and added a foreign key to link the two tables. In my records admin, instead of having a text field for the label, I used a drop-down populated from the labels table. But this caused more problems than it solved: it greatly complicated the code for searching and updating records, and I had to write extra methods to add new labels to the labels table.

So I decided to get rid of the extra table and just put the full text back into the records table. I kept the drop-down menu in the admin section, but populated it with data pulled from the records table, which takes care of my original typo problem. To my Record model I added a public static function that selects the distinct labels, with the text as both the key and the value, to pass to the drop-down menu, and I was then able to greatly simplify a lot of my code.
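
A minimal sketch of what that looks like in the model (the method name is mine, and I'm assuming the column is simply called label):

    <?php

    namespace App;

    use Illuminate\Database\Eloquent\Model;

    class Record extends Model
    {
        // Distinct labels keyed by themselves, ready for the drop-down:
        // ['4AD' => '4AD', 'Warp' => 'Warp', ...]
        public static function labelOptions()
        {
            return static::query()
                ->distinct()
                ->orderBy('label')
                ->pluck('label', 'label');
        }
    }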

I just pushed this up a few minutes ago, and so far haven't found any problems, but that doesn't mean they aren't there. If anyone finds any errors in the Record Collection page or the API please let me know.

As a side note, having PHPUnit tests for everything already set up made this a whole lot easier. I didn't have to manually check every possible combination of search criteria, or test adding and editing and all of that, because the tests were already written. I used to rely on a QA department to regression test everything; with PHPUnit, instead of waiting days for QA people to check everything, I can do it myself in under a minute.
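
To give an idea of how cheap those checks are, a feature test along these lines (the route and the label are made up for illustration) covers a whole search case:

    <?php

    namespace Tests\Feature;

    use Tests\TestCase;

    class RecordSearchTest extends TestCase
    {
        // One of many search cases; the whole suite runs in under a minute.
        public function testSearchByLabel()
        {
            $response = $this->get('/records?label=Nervous');

            $response->assertStatus(200);
            $response->assertSee('Nervous');
        }
    }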

Labels: coding, music

SSL and Let's Encrypt

Feb. 28, 2017, 5:03 p.m.

The first time I ever tried to install an SSL certificate on a web server was probably around 1999. At the time there were only a couple of options - Verisign and Thawte, if I remember correctly - the certificates cost a couple hundred dollars each, and getting one approved was a lengthy, complicated process: it involved compiling a lot of documentation (I remember being asked for a Dun & Bradstreet number, for one thing), multiple phone calls, and a couple of weeks of waiting. Once the certificate was finally approved and issued, installing it on the server was almost as complicated.

How times have changed. Yesterday I installed a certificate on this server. It was free and took about 15 minutes, most of which was spent trying to find the documentation. At first I was just messing around and decided to install a self-signed certificate, which was quick and easy, but having to click through the page that says "this site is insecure" was nerve-wracking, even knowing that it doesn't really mean anything. A quick Google search turned up Let's Encrypt, which offers free SSL certificates that are recognized by most browsers.

As easy as installing an SSL certificate for Apache already is, I then found Certbot, which makes it even easier. The main page has instructions for different OSes and servers. For Ubuntu I just installed the certbot package and ran it; it asked which domains I wanted the certificate to cover and for my email address, and then generated it.
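
For reference, the whole process on Ubuntu boils down to something like this (package names vary by Ubuntu release, so treat it as a sketch):

    # Install Certbot and its Apache plugin
    sudo apt-get install certbot python-certbot-apache

    # Get a certificate and let Certbot update the Apache config too
    sudo certbot --apache

    # Or just get the certificate and do the Apache config yourself
    sudo certbot certonly --apache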

I was a bit wary of letting Certbot change my Apache config, so on this server I just had it generate the certificates and did the config myself. After that went smoothly, on my other server I let Certbot do the config as well and had no problems at all. When it was done SSL just worked; I didn't have to touch the config or even restart Apache, much less provide a DUNS number. I'd like to thank the EFF and Let's Encrypt for Certbot and for making this so easy.

Labels: coding
