Archive for the ‘Uncategorized’ Category

Wait, really? Last post was in 2017?

Anyway, not dead yet, more to follow eventually.

I’ve been digging a bit into web-and-phone-app-suitable media formats for audio and “quasi-audio” (where the media is really “audio” but is formatted as a “video” because everything has to be a friggin’ video these days).

Also finally got around to figuring out why the API wasn’t working for this site. I kind of want to dabble in making an “app” for podcast episodes that get posted here.

Watch this space…

Too many things going on lately, I’ve been slacking on the blogs. I haven’t been idle, though:

I’ve been playing with the APIs for Jamendo and Freesound.org to better automate finding, fetching, and properly tagging (including the applicable copyright license terms!) local copies of sounds for use in podcasts. Having the appropriate information in real audio metadata makes it a lot easier to find compatibly-licensed sound and to find the necessary attribution information when the time comes.

I’m also quasi-professionally doing podcast audio editing and production on the side. As a moderately-skilled amateur rather than a “serious professional” I’m not charging much, but I am technically getting paid to do it, so I say it counts. Currently my sole client is a podcast being done by a couple of independent authors who chat about their current projects, then take a random “writing prompt” and try to whip up a short piece of fiction based on it, and then they each read out what they’ve come up with. If that sounds like something you may be interested in, that podcast is at http://impromptucast.com, its RSS feed for your podcatcher is at http://www.andivan.com/impromptucast-rss.xml, and for those of you stuck on stinky old iTunes, their iTunes page is at https://itunes.apple.com/us/podcast/impromptu/id1280197831?mt=2. In addition to a basic intro and “outro”, they also asked me to come up with some sort of spoof-commercial to insert between the two halves of each episode, which has been fun. I’ll post those somewhere here when I get a chance – I’ve been meaning to put together a portfolio online.

Anyway, they each record their local audio as they chat online, then at the end they send the raw audio to me, and I synchronise it, clean it up and edit it, blend in intro/outro/“commercial break”, then export to mp3 (and flac for archiving and generation of opus and/or Ogg Vorbis audio as needed later) with complete metadata so they can post it. Then I generate updated RSS for their feed. If anybody else is interested in hiring some inexpensive help getting their podcast together, feel free to comment!

Working on that is what got me going on the API projects. I also whipped up a “complete metadata tagger for podcast files” to go with it – in case it’s not obvious, I hate when important information (e.g. licensing, source URLs, etc.) is left off of audio files.

(EDITS: I made a couple of quick additions to this post. Turns out there is at least one person who cares whether I try out this plugin again, and it’s the developer. One nice thing I can definitely say about the plugin is that the developer has always been pretty engaged with the plugin’s users. I vaguely recall having some communication with him online half-a-decade ago when I first started looking into the plugin. See the comments on the post…)

I actually installed the “Blubrry Powerpress” plugin on here way back at the beginning when I noticed it had some (at the time) rudimentary .ogg [vorbis] audio support. I played with it a little and then never did much with it. (EDIT for clarity: this was almost half-a-decade ago – back in August 2010. If you go all the way back to that first audio post [the “TuberculosisBurgers” episode of “Stir-Fried Stochasticity”], that one was posted via the “Blubrry PowerPress” plugin. Quite a bit of development has gone on with the plugin since then, but I’ve not gotten around to trying it again…yet.)

Since I’m switching to .opus for everything anyway, I’d been thinking about just removing the plugin (after I go back and find the one or two posts I made using it, to fix them to not require it), but I just noticed that the new 6.0 version has finally added support for .opus, according to a line buried in the changelog.

Still not sure if I’ll keep it and try it out some more or just purge it. The homepage for blubrry.com has adopted a style that looks like a child produced by Windows 8 and an iPod advertisement after a drug-fuelled orgy. Ick. Big flat ugly blocks of text and bland graphics, and almost “mystery-meat navigation”. If that’s indicative of where the plugin is heading, I should probably just purge it (I get the impression that the plugin is more focussed on iTunes™ and “Search Engine Optimization” than any other features, but I’m not really concerned about Apple’s mandates on this blog). (EDIT: What I’m getting at here is that it’s not clear how much of the plugin I’ll actually get any use from if I’m not on iTunes, essentially. Is it going to be like having Microsoft Excel installed when I only need to do some basic math?)

(Another post-posting edit for clarity: the “dumbed-down-to-an-iTunes-ad” interface that Windows 8 went all in for is the second-worst trend in the name of “mobile-friendliness” in my personal opinion – second only to websites that now pop up as a giant background graphic filling the screen, which you then have to scroll down past to actually see the “content” you clicked on the link to get to in the first place…and keep scrolling because it’s giant “easy-reader” text, I assume so it’s legible on tiny phone screens. Yes, I’m looking at YOU, medium.com, among others. The only concern related to the plugin is that it suggests a doubling-down on iTunes-and-other-proprietary-services-related features and a moving away from the simple self-hosting I’m interested in as development goes on.)

Since the closest thing I have to a formal “New Year’s Resolution” is to do a lot more web audio production, I could probably use the plugin, but I certainly don’t NEED it. I’ll consider it while I work on getting some audio produced.

Does anybody still reading this have an opinion either way?

The crappy-old-mp3 standard has now been around for more than a quarter-century.


UPDATED: this page might be interesting for historical purposes, but as far as I can tell the last of the possible .mp3 patents finally expired by 2018 or so. You may now legally use this ancient format however you please. I still say the quality is poor by modern standards, and it’s still not suitable for low-latency audio, but it does have the advantage that virtually everybody supports it and it’s generally “good enough”.


Most kinds of audio players and web browsers have supported better, legally-free formats for a while now, but as usual Microsoft and (most prominently) Apple are stuck with only formats that you have to pay a metaphorical “poll tax” for permission to use.

.mp3 is one of them, of course. By modern standards, mp3 is pretty poor. It’s high-latency, so it’s not suitable for interactive or live uses (e.g. VoIP), and the quality is lacking at all but the highest bitrates (so you either have low-quality audio or huge files to transmit and store). It’s also weighed down by a bunch of patents, of course, so you can’t even legally make or use .mp3 files without somebody paying protection money to some lawyers for permission…

…or can you?

Personally, I’d rather just never touch the stuff (patents or not, the audio format is lacking, and I really don’t like the fussy, limited little “id3” standards for its metadata, either), but I grudgingly concur that having a basic “fallback” format that Microsoft/Apple/fifteen-year-old-media-player owners could use until they realize they can upgrade to something better by using a different browser/media-playing app is sometimes helpful.

It occurs to me that with the original specification for .mp3 being published in 1993 or so, more than 20 years ago, and given that patents aren’t supposed to last more than 20 years, it seems like a reasonable assumption that at least some if not all of the still-threatening patents (the last of which still doesn’t expire until 2017!) are optional “optimizations” or techniques that don’t necessarily have to be applied to generate a valid .mp3 file that ancient media players (or new media players from ancient companies…) can at least play back, even if the files are not “optimal”.

For example (DISCLAIMER: IANAL), this Big List of MP3 Patents shows that there are 6 patents left (as of May 2015) keeping mp3 locked up. However, at least two of those are specifically patents on ways of encoding two or more channels of audio (i.e. stereo, surround-sound, etc.), so a single-channel (mono) audio stream encoded to .mp3 should definitely not trespass on those two. One more appears to be specific to techniques for encoding “low sampling rate” audio (i.e. NOT the usual 44.1kHz or 48kHz), so a typical 44.1kHz or 48kHz audio source encoded to .mp3 at that rate wouldn’t trespass, either. (Note: Since this post was originally written, a few of the encumbering patents have finally matured into the public domain, hence the edits…)

That leaves 3 remaining US patents to be tiptoed around to generate legal .mp3 files:

Trying to read those things makes my head hurt, but it kind of looks to my definitely-non-expert eye like 5579430 has something to do with “average bit rate” encoding (not “constant bit rate”), so possibly a true “constant bitrate” encoding wouldn’t infringe?

5742735 looks kind of like it’s both a specific way of designing an encoder (with a “controller” and multiple “multi-signal processors”), AND possibly involves application of particular and/or multiple “psychoacoustic models” (as I understand it, more or less an algorithm for deciding when some part of the sound can’t be noticed anyway and can be degraded or thrown out entirely, to save extra room for detail in more important parts of the sound). I’m at a loss as to whether an encoder like LAME actually falls under this patent at all. Anybody know? Even if LAME’s architecture might trespass on the special “multi-signal processor” stuff, might it be possible that running it at “-q 9” (“disables almost all algorithms including psy-model.”) would avoid this patent? (Furthermore, this patent has finally officially run out and no longer even applies!)

5924060 is also a bit beyond me, but I notice claim 7 specifically calls for application of a psychoacoustic model, so perhaps “lame -q 9” might avoid it entirely? Beyond that, I can’t tell how avoidable the techniques described are.

6009399 seems to specifically claim using multiple psychoacoustic models at the same time (and then deciding which one works best for a particular sample and using that one, if I’m interpreting that correctly). Seems like not applying a psychoacoustic model to optimize the encoding would bypass this one, or for that matter even only using one model for the entire encoding process.

So then…am I correct in thinking that there’s a good chance that “lame -q 9 -m m --cbr”, given input audio with 44.1kHz or 48kHz sampling rate, would avoid the last remaining patent threats still hovering over mp3?

(And, yes, I’m aware that the result would sound even worse than usual mp3 files, but the point is merely to generate “usable” mp3 data as a fallback for old/recalcitrant audio players. I’ve got .opus, .ogg [vorbis], and .flac for high-quality audio.)
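
Purely as an illustration of the command being asked about (and not an answer to the patent question), here is a minimal Python sketch that assembles that LAME invocation and refuses input at anything other than the two "usual" sample rates. The function name, the file names, and the 64kbps default are hypothetical; only `-q 9`, `-m m`, `--cbr`, and `-b` come from LAME itself.

```python
ALLOWED_RATES_HZ = (44100, 48000)  # the "usual" rates named in the post

def lame_fallback_command(wav_path, mp3_path, sample_rate_hz, bitrate_kbps=64):
    """Return the argv for a mono, constant-bitrate, "-q 9" LAME encode.

    ASSUMPTION: bitrate_kbps=64 is an arbitrary illustrative default;
    the post does not pick a bitrate.
    """
    if sample_rate_hz not in ALLOWED_RATES_HZ:
        raise ValueError(f"unexpected sample rate: {sample_rate_hz} Hz")
    return [
        "lame",
        "-q", "9",                  # lowest-effort quality setting
        "-m", "m",                  # mono
        "--cbr",                    # constant bitrate
        "-b", str(bitrate_kbps),
        wav_path, mp3_path,
    ]

# Hypothetical file names, for illustration only.
print(" ".join(lame_fallback_command("episode.wav", "episode.mp3", 44100)))
```

The list can be handed to `subprocess.run()` as-is on a machine where `lame` is installed.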

I’m working on earning my “Subversive Radio Host” merit badge with my RaspberryPi. Once I’ve got the whole system worked out I’ll likely be doing a Hacker Public Radio episode about it. In the meantime, though, for testing purposes I have a feed from the local NOAA Weather Radio which I am feeding as 10kbps opus audio to an instance of icecast2.

I figured I’d post this since there’s a piece of information I found today regarding having a continuous stream, and someone else may find it useful.

I was finding that my stream to the icecast server, using the command line suggested by the opusenc man page, was spontaneously dying at almost precisely 6 hours, 12 minutes, and 50 seconds. For reference, here’s what the man page says (or at least has for quite some time now and did when I last checked it today) as a suggestion for live-streaming recorded audio in realtime:

arecord -c 2 -r 48000 -twav - | opusenc --bitrate 96 - - | oggfwd icecast.somewhere.org 8000 password /stream.opus

I was using the same string of commands, with minor changes (only one channel, --bitrate 12 [or 10], different address and credentials for oggfwd).

To cut the drama short, the problem turns out to be arecord. I still am not sure whether arecord was hitting a “maximum number of bytes” or “maximum run time” problem, but either way, it turns out you can just use sox (symlinked as “rec”) in its place, which is nice because I had been thinking about playing with having sox apply some filtering to remove noise, etc.
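
As a plausibility check on the “maximum number of bytes” theory, here is a small Python sketch of the arithmetic. It assumes the capture was 16-bit, single-channel audio at 48kHz; the post does not say which sample format arecord was actually given, so treat that as a guess. WAV headers store their sizes in 32-bit fields, and a signed 32-bit byte counter tops out just under 2 GiB.

```python
# How long until a raw PCM stream reaches 2 GiB?
# ASSUMPTION: 48 kHz, one channel, 16-bit samples (not stated in the post).
SAMPLE_RATE_HZ = 48_000
CHANNELS = 1
BYTES_PER_SAMPLE = 2

def seconds_until(limit_bytes, rate=SAMPLE_RATE_HZ,
                  channels=CHANNELS, width=BYTES_PER_SAMPLE):
    """Seconds of audio that fit in limit_bytes of raw PCM."""
    return limit_bytes / (rate * channels * width)

def as_hms(seconds):
    """Format a duration as H:MM:SS, truncating fractional seconds."""
    whole = int(seconds)
    return f"{whole // 3600}:{whole % 3600 // 60:02d}:{whole % 60:02d}"

limit = 2**31                        # 2 GiB, one past the largest signed 32-bit value
observed = 6 * 3600 + 12 * 60 + 50   # the 6h 12m 50s reported above

print(as_hms(seconds_until(limit)))          # 6:12:49
print(observed - seconds_until(limit) < 1)   # True
```

Under those assumptions the stream would cross 2 GiB within a second of the observed failure time, which is consistent with a byte-count limit rather than a run-time limit, though it does not prove it.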

Here’s a suggested replacement command line for this purpose:
rec -c 1 -t wav - | opusenc --bitrate 10 - - | oggfwd icecast.somewhere.org 8000 password /stream.opus
(with the parameters adjusted to your own needs, of course).

It’s been going for over 7 hours now, so I’m assuming my problem is resolved. In the long run it won’t matter so much – my intended use-case in the end involves playing pre-recorded files (and would work perfectly with ices2’s features if ices2 supported opus…), but for times when I may want to live-stream something, this may come in handy. With opus at 10-12kbps being still decent quality for voice, one ought to be able to feasibly live-stream audio even over a really slow pre-“3G” cellphone data link or dial-up modem in realtime.
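
To put rough numbers on that, here is a minimal Python sketch of the data-rate arithmetic. It counts only the Opus payload (Ogg, HTTP, and TCP overhead are ignored), and the 28.8kbps link speed and the 50% headroom figure are hypothetical round numbers chosen for illustration, not measurements.

```python
# Rough data-rate arithmetic for a low-bitrate voice stream.
# ASSUMPTION: payload only; container and network overhead are ignored.

def megabytes_per_hour(kbps):
    """Payload volume for one hour of audio (1 MB = 10**6 bytes)."""
    return kbps * 1000 / 8 * 3600 / 1_000_000

def fits(stream_kbps, link_kbps, headroom=0.5):
    """True if the stream uses no more than `headroom` of the link's capacity."""
    return stream_kbps <= link_kbps * headroom

for kbps in (10, 12):
    print(kbps, "kbps is about", megabytes_per_hour(kbps), "MB per hour")

# Hypothetical slow link: 28.8 kbps, keeping half of it free.
print(fits(12, 28.8))
```

At these bitrates an hour of audio comes to only a few megabytes of payload, which is what makes a slow link plausible.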

Oh, speaking of “Subversive Radio Host” – using the clever pifm software turns the Raspberry Pi into a remarkably powerful transmitter. I think I’ve got my antenna trimmed down enough to make it a legal unlicensed transmitter (my original test was with a 20″ or so piece of alligator-clip wire that happened to be handy, and got the signal out to about three blocks away. Definitely too much power for legal unlicensed use.) I suspect with an ideal-length wire as an antenna you could cover a whole town, if you placed it well, and if you didn’t mind getting the FCC (or whoever your local regulatory agency is) very annoyed with you.

Old Time Radio voice actor (That was the sponsor of the ‘Old Time Radio’ show)

I’ve decided that while I’m experimenting with encoding opus audio for playback on the web and portable media players and smartphones,

Recently I started listening to an “Old Time Radio” series I found on archive.org, and since I’m playing mostly with voice audio I thought it would be a good test subject. Episode 1 of “Chandu, the Magician” (“The Return of Frank Chandler”) actually compressed very well all the way down to 14kbps with little noticeable loss of quality compared to the original. Admittedly, the original was a lossy digitization of an old wire recording, but still, what quality was there was preserved well.

Metadata in Ogg files is nice and simple compared to the bizarre mess that is id3v2, which is the metadata format used for crappy-old-mp3.

Where mp3 has a clutter of special little pre-specified data-structures to pick from, Vorbis and Opus in Ogg files use a nice, simple “(fieldname)=(text)” format. Well…except for one thing.

“Album art” isn’t text. It’s not sound, either. Now, the most-technically-correct way to deal with this sort of thing in an Ogg file would be to have the “album art” images as their own separate media “stream”, sort of like a “movie” file where the “video” is one or more still images. This is an inconvenient kind of thing for audio-only players to deal with, I guess, and is different from how the more well-known mp3 did it, so a workaround was devised.

First, let’s get this out of the way: NO, do NOT just base64-encode a jpeg file and cram it into the metadata. Early on, doing this in a field named “COVERART” was one way people tried to cram album art into audio-only Ogg files. However, few if any media players ever bothered to implement using that field, and in addition this method is even less capable than the “APIC” structure that mp3 uses for album art. The correct way to include “album art” images in an Ogg Vorbis (or Opus) file is with the somewhat unintuitively-named METADATA_BLOCK_PICTURE field.

The reason for this strange name is that it’s what the “album art” structure in FLAC files is called, and this precise same binary structure is what’s used for “vorbiscomments” (again, Ogg Vorbis AND “.opus” [Opus in Ogg]) album art. Since the binary structure is exactly the same, that means any media player that can handle “album art” in .flac files can use exactly the same pre-existing code to also handle album art in Ogg files.

The difficulty is of course that this is still “binary” (non-text) data, so this whole structure (not just the contents of the image file) gets base64 encoded. Annoyingly, I’ve had a hard time finding anything that properly supports generating this structure. The encoder utility for flac actually has a built-in “--picture” option that lets you specify picture files (along with the rest of the data structure, where it might not be auto-detectable or where you don’t want the defaults) easily when making the file in the first place, but neither the oggenc encoder for Ogg Vorbis nor opusenc for Ogg Opus had this option, nor did the “vorbiscomment” utility used for adding or modifying metadata in pre-made Ogg Vorbis files. There are a few graphical utilities (such as kid3) that can properly encode a single piece of “album art” for a file, but only with some hard-coded defaults. Also, they still don’t support Opus files (pending a long-delayed release of taglib version 1.9). This makes embedding album-art during automated processing a problem, as well as the issue of including multiple embedded images, which is also perfectly valid (not only in .flac, .ogg, and .opus but in .mp3 as well) with a couple of minor restrictions.

Some help with this is now available, though. It turns out that the next release of opusenc will actually also include a flac-like “--picture” function, which will make that easier, but still doesn’t solve the case for Ogg Vorbis. Also, I was able to find a perl implementation of generating the METADATA_BLOCK_PICTURE structure and then base64-encoding it, which at a glance looks like it ought to work. Hardcore pythonistas could probably code up something equivalent using mutagen as long as you don’t mind being stuck on Python 2.x. And now…I have a correctly working PHP implementation, and a script that can be run either from a web server (using a form to upload the picture and fill in the data) or from a command-line prompt. Anybody want it?
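
For anyone who wants to see the shape of the thing, here is a minimal Python sketch (standard library only, and not the PHP or perl code mentioned above) of building a METADATA_BLOCK_PICTURE value and placing it in a “(fieldname)=(text)” comment list. The picture fields follow the FLAC PICTURE block layout with big-endian 32-bit integers; the comment list uses little-endian 32-bit lengths. The function names, the default colour depth of 24, and all example values are assumptions for illustration. The sketch does not inspect the image, so the caller supplies width and height, and wrapping the result in the actual Vorbis or OpusTags header packet and Ogg pages is left out.

```python
import base64
import struct

def picture_block(image_bytes, mime, width, height,
                  depth=24, colors=0, picture_type=3, description=""):
    """Return the base64 text for a METADATA_BLOCK_PICTURE field.

    picture_type 3 means "front cover". depth=24 and colors=0 are assumed
    defaults for a typical non-indexed image, not values read from the file.
    """
    mime_bytes = mime.encode("ascii")
    desc_bytes = description.encode("utf-8")
    block = b"".join([
        struct.pack(">I", picture_type),
        struct.pack(">I", len(mime_bytes)), mime_bytes,
        struct.pack(">I", len(desc_bytes)), desc_bytes,
        struct.pack(">IIII", width, height, depth, colors),
        struct.pack(">I", len(image_bytes)), image_bytes,
    ])
    return base64.b64encode(block).decode("ascii")

def comment_list(vendor, fields):
    """Serialize a vendor string and (name, value) pairs as a comment list."""
    vendor_bytes = vendor.encode("utf-8")
    parts = [struct.pack("<I", len(vendor_bytes)), vendor_bytes,
             struct.pack("<I", len(fields))]
    for name, value in fields:
        entry = f"{name}={value}".encode("utf-8")
        parts.append(struct.pack("<I", len(entry)) + entry)
    return b"".join(parts)

# Hypothetical example values, not taken from any real file.
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image"
fields = [
    ("TITLE", "Example Episode"),
    ("LICENSE", "https://creativecommons.org/licenses/by-sa/4.0/"),
    ("METADATA_BLOCK_PICTURE", picture_block(fake_jpeg, "image/jpeg", 600, 600)),
]
body = comment_list("example-tagger 0.1", fields)
print(len(body), "bytes of comment data")
```

Since multiple images are valid, adding a second picture is just another METADATA_BLOCK_PICTURE entry in the list with a different `picture_type`.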

Not this site or anything, but specifically the “www.dogphilosophy.net” address. This particular site will still be accessible at “http://dogphilosophy.net”. (The http://hpr.dogphilosophy.net site, which has been more active than this one lately, will also remain up.)

I’m just getting tired of the buttnuggets in Indonesia (I’m looking at YOU, http://pemudaindonesiabaru.blogspot.com, among others) who insist on using an old “blogspot” theme that hotlinks for no good reason to a no-longer-existing image file on www.dogphilosophy.net, thereby clogging the crap out of my webserver logs. The “www” is kind of redundant these days anyway, so I may as well dump it. That said, if you’re watching the RSS feed or just remember typing in “www.dogphilosophy.net” to get here, update your links to just “http://dogphilosophy.net” instead.

Meanwhile, just another note to mention that my latest audio endeavor finally popped up at Hacker Public Radio, and you can download it or listen directly at the hpr.dogphilosophy.net site if you’re not a listener at Hacker Public Radio.

This latest episode is a review of gameplay for Google’s new geolocation based game (“Ingress”). I’m working on a followup episode to this one, then I can finally do the mysterious geotagging episode I’ve been talking about for a couple of years now.

Oh, also over on the hpr.dogphilosophy.net site, I’ve started a list of topics I’m either actively working on or thinking about working on. I strongly encourage/beg anyone who’s interested to check out the list of Potentially Upcoming Shows and leave opinions on which topics look interesting (or suggest additional topics!).

It’s worth noting that I’ve decided that the Stir-Fried Stochasticity shows are probably “of interest to Hackers” and therefore I’m currently planning to fold those topics into the hpr.dogphilosophy.net site as well, which is why you’ll see references to scientific-paper topics (and the Gram Stain!) on the “Potentially Upcoming Shows” list.

A minor bit of news in conclusion: I finally managed to work out what my last remaining problem with it was (much thanks to “derf” on the #opus channel of freenode IRC!) and I now have a working implementation of the funky “METADATA_BLOCK_PICTURE” (“album art”) structure that needs to be generated for Ogg (e.g. vorbis or opus) audio files in PHP. I’ve gotten back to work on the web-based converter project that I mentioned way back in Hacker Public Radio episode #1033. I now have the core purpose of the project (taking an uploaded source file and “album art”, prompting in a user-friendly manner for metadata [“title”, “artist”, “genre”, etc.] and encoding settings [bitrate/final file size, etc.], and then generating a valid Opus audio file) completely working as far as I can tell. Of course, right now it stops abruptly at that point as there’s still plenty of interface work and additional features to add, but it actually does something useful for me now – hooray! I’ll do a separate post about this sometime later. Yes, source code will be available, most likely under the AGPL by default (other terms by negotiation, if anybody actually wants it that much).

Opus Codec logo

There’s a shiny new awesome, high-quality, and legally-free audio standard out now called Opus. Opus audio quality is even better than the already-very-good legally-free Vorbis codec that is widely supported (if not widely promoted) these days, and seems to also beat the proprietary “HE-AAC” codec. Needless to say, it makes ancient mp3 shrivel up with shame.

Having only just been finalized, it’s already supported in Firefox 15 on all platforms, including Android. It appears to also be supported in most recent browsers on Linux if they use the “gstreamer” framework for multimedia, and rumor has it that full support on other operating systems will be appearing for Opera and Google Chrome in the relatively near future. Support should be showing up in the next version (probably 2.0.4, I’m guessing) of VLC on all platforms, in the next release of Rockbox for various media players and (hypothetically) Android devices, the next release of the Mumble voice chat system, and probably quite a few others very quickly. Heck, even Microsoft (or at least their Skype division) has been involved in the development of Opus, and the group working out the “WebRTC” standard for web-based voice chat (including Microsoft, apparently) voted to support Opus as “Mandatory to Implement”, so anything that ends up supporting the WebRTC standard will support Opus. There’s even a chance we might see a rare case of Microsoft Internet Exploder actually supporting a really good media format that everyone is allowed to use sometime relatively soon.

Anyway, the point is that Opus is friggin’ awesome especially for audio downloaded from the internet and everyone should be using it. Well, that’s ONE point – the other is that I plan to do it here, too.

I’ve got a couple of bits of audio that I’ll actually be ready to record and post pretty soon: a year-overdue contribution for Hacker Public Radio (Opus version to be posted to http://hpr.dogphilosophy.net which I set up specifically for my Hacker Public Radio efforts) which just happens to be about media – especially audio – on the internet, and a bit about New England’s “You can’t get there from here” schtick and how it maligns the Booming Metropolis of Millinocket, Maine. (Hey, everybody knows that “all trails lead to Millinocket”, right?) I’ve also got three topics queued up for my “Stir-Fried Stochasticity” science-paper audio project (an 11-paper science monstrosity show on the topic of the Gram Stain, a show on several papers discussing “shinrin-yoku” in honor of the new arboreally-enhanced location here where the Asylum for the Sufficiently Nerdy has moved, and one on a couple of garbage papers). All will be posted in high-quality Opus format, along with modern Ogg Vorbis and possibly crappy-old-mp3 for “legacy” purposes for now.

Incidentally, I updated my HTML5 Web Browser Audio Test page so that it now also has FLAC and, of course, Opus audio samples, so if you go there you can test which audio formats your browser supports (and which one it selects by default if it supports more than one format). As a bonus, the audio samples are all explanations of their formats (for example, the .mp3 format audio sample is a bit of audio talking about the .mp3 format), so it’s educational and stuff, too.

Anyway, test your browser there, and please start leaving comments pestering me – my regular duties in my new profession keep me pretty busy, but I can make more time for audio projects as long as people are interested (and the more interest I hear, the more time I’ll set aside for it and start getting things posted).

Also, I’ve whipped up an HTML5 <audio> tag test for your browsers. I’m trying to figure out exactly how good (or bad) and widespread support for the <audio> tag is these days.

The HTML5 <audio> test page linked above is one that I’ve put together that includes examples of four major audio file formats currently in use (Ogg Vorbis, MP3, WebM Audio, and .wav). The page will report what your browser software reports regarding its compatibility with different audio formats, and provides buttons to push to switch to the different files to try listening to them (regardless of whether your browser says it works or not – I know of at least one case where the browser outright claims not to support a file format when it does…).

On a whim, I added a field that reports whether your browser claims to support FLAC, though I don’t yet have a sample file up for testing. I’ll eventually add a sample Opus file as well, since that looks to me like a hugely useful format once it’s ready.

If you have time, give it a try. Pretty much everybody who isn’t stuck on an old version of the Internet Explorer browser on Windows should be able to use it. If you’re willing, the form has fields where you can specify whether a file format really worked or not, and a button at the bottom to submit the report, which will make a note of which browser you’re using and what worked. Note that you don’t HAVE to do this to use the page for testing your audio, if for some reason you don’t want to report what you find out – the audio on the page isn’t dependent on whether you submit results. Eventually I’ll have a fairly complete picture of what supports what one way or another.

Also: pester me – I really will be posting audio again.