(You’ll understand the title of this post if you listen…)
I’ve decided that while I’m experimenting with encoding opus audio for playback on the web and portable media players and smartphones, I might as well post some of my experiments on opuscast.com until I finally get around to building something more complete there.
Recently I started listening to an “Old Time Radio” series I found on archive.org, and since I’m playing mostly with voice audio I thought it would be a good test subject. Episode 1 of “Chandu, the Magician” (“The Return of Frank Chandler”) may be heard (with an appropriately recent and freedom-loving web browser) or downloaded to listen to in Opus-supporting media players right now at opuscast.com. If you give it a listen, you’re welcome to post your observations here in the comments.
Metadata in Ogg files is nice and simple compared to the bizarre mess that is id3v2, which is the metadata format used for crappy-old-mp3.
Where mp3 has a clutter of special little pre-specified data-structures to pick from, Vorbis and Opus in Ogg files use a nice, simple “(fieldname)=(text)” format. Well…except for one thing.
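To make that concrete, here is a minimal Python sketch of the “(fieldname)=(text)” idea. The helper names and the sample values are made up for illustration. The only things borrowed from the Vorbis comment format itself are that the field name ends at the first “=” and that field names are treated as case-insensitive (upper-casing them here is just a convention, not a requirement).

```python
def format_comment(field, text):
    """Join a field name and its text into one 'FIELD=text' comment string."""
    if "=" in field:
        raise ValueError("field names may not contain '='")
    return f"{field.upper()}={text}"


def parse_comment(comment):
    """Split at the FIRST '=' only, since the text part may itself contain '='."""
    field, _, text = comment.partition("=")
    return field.upper(), text


# Sample values for illustration only.
comments = [
    format_comment("title", "The Return of Frank Chandler"),
    format_comment("album", "Chandu, the Magician"),
    format_comment("comment", "bitrate=32kbps"),  # hypothetical text containing '='
]
```

Note that this only covers the text of each comment, not how the comments are packed into the Ogg file.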
“Album art” isn’t text. It’s not sound, either. Now, the most-technically-correct way to deal with this sort of thing in an Ogg file would be to have the “album art” images as their own separate media “stream”, sort of like a “movie” file where the “video” is one or more still images. I guess this is an inconvenient thing for audio-only players to deal with, and it’s different from how the more well-known mp3 did it, so a workaround was devised.
First, let’s get this out of the way: NO, do NOT just base64-encode a jpeg file and cram it into the metadata. Early on, doing this in a field named “COVERART” was one way people tried to cram album art into audio-only Ogg files. However, few if any media players ever bothered to support that field, and in addition this method is even less capable than the “APIC” structure that mp3 uses for album art. The correct way to include “album art” images in an Ogg Vorbis (or Opus) file is with the somewhat unintuitively-named METADATA_BLOCK_PICTURE field.
The reason for this strange name is that it’s what the “album art” structure in FLAC files is called, and this precise same binary structure is what’s used for “vorbiscomments” (again, Ogg Vorbis AND “.opus” [Opus in Ogg]) album art. Since the binary structure is exactly the same, any media player that can handle “album art” in .flac files can use exactly the same pre-existing code to also handle album art in Ogg files.
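For anyone curious what that shared structure looks like, here is a rough Python sketch of a reader for it. The layout is the FLAC picture block as I understand it from the FLAC format documentation: a run of big-endian 32-bit numbers with three variable-length pieces (MIME type, description, image data) in between. The function name and dictionary keys are my own invention, and this is not code from any actual media player.

```python
import struct


def parse_picture_block(block):
    """Unpack a raw (not base64-encoded) FLAC-style picture structure."""
    offset = 0

    def read_u32():
        # Every numeric field is an unsigned 32-bit big-endian integer.
        nonlocal offset
        (value,) = struct.unpack_from(">I", block, offset)
        offset += 4
        return value

    def read_bytes(length):
        nonlocal offset
        data = block[offset:offset + length]
        if len(data) != length:
            raise ValueError("picture block is truncated")
        offset += length
        return data

    picture = {"type": read_u32()}              # 3 means "front cover"
    picture["mime"] = read_bytes(read_u32()).decode("ascii")
    picture["description"] = read_bytes(read_u32()).decode("utf-8")
    picture["width"] = read_u32()               # pixels
    picture["height"] = read_u32()              # pixels
    picture["depth"] = read_u32()               # bits per pixel
    picture["indexed_colors"] = read_u32()      # 0 unless the image uses a palette
    picture["data"] = read_bytes(read_u32())    # the image file itself
    return picture
```

Because each variable-length piece is preceded by its length, the reader never has to guess where the image data starts.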
The difficulty is of course that this is still “binary” (non-text) data, so this whole structure (not just the contents of the image file) gets base64 encoded. Annoyingly, I’ve had a hard time finding anything that properly supports generating this structure. The encoder utility for flac actually has a built-in “--picture” option that lets you specify picture files (along with the rest of the data structure, where it might not be auto-detectable or where you don’t want the defaults) easily when making the file in the first place, but neither the oggenc encoder for Ogg Vorbis nor opusenc for Ogg Opus has this option, nor does the “vorbiscomment” utility used for adding or modifying metadata in pre-made Ogg Vorbis files. There are a few graphical utilities (such as kid3) that can properly encode a single piece of “album art” for a file, but only with some hard-coded defaults. Also, they still don’t support Opus files (pending a long-delayed release of taglib version 1.9). This makes embedding album art during automated processing a problem, and the same goes for including multiple embedded images, which is also perfectly valid (not only in .flac, .ogg, and .opus but in .mp3 as well) with a couple of minor restrictions.
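To show what “base64-encode the whole structure” means in practice, here is a minimal Python sketch. It assumes the FLAC picture layout (big-endian 32-bit fields for picture type, MIME length, description length, width, height, color depth, indexed-color count, and data length) and leaves it to the caller to supply the image dimensions, since it does not inspect the image itself. The function name, parameter names, and the fake image bytes are all hypothetical, and this is not the PHP script or the perl implementation mentioned in this post.

```python
import base64
import struct


def build_picture_comment(image_bytes, mime, width, height, depth,
                          picture_type=3, description="", indexed_colors=0):
    """Return a 'METADATA_BLOCK_PICTURE=...' comment for one image.

    picture_type 3 is "front cover" in the FLAC picture type list.
    """
    mime_bytes = mime.encode("ascii")
    description_bytes = description.encode("utf-8")
    block = b"".join([
        struct.pack(">I", picture_type),
        struct.pack(">I", len(mime_bytes)), mime_bytes,
        struct.pack(">I", len(description_bytes)), description_bytes,
        struct.pack(">IIII", width, height, depth, indexed_colors),
        struct.pack(">I", len(image_bytes)), image_bytes,
    ])
    # The ENTIRE structure is base64-encoded, not just the image data.
    encoded = base64.b64encode(block).decode("ascii")
    return "METADATA_BLOCK_PICTURE=" + encoded


# Stand-in bytes for illustration; a real call would read an actual image file.
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image"
comment = build_picture_comment(fake_jpeg, "image/jpeg", 600, 600, 24)
```

For multiple images, the same function would simply be called once per picture, giving one METADATA_BLOCK_PICTURE comment each.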
Some help with this is now available, though. It turns out that the next release of opusenc will actually also include a flac-like “--picture” function, which will make that easier, but still doesn’t solve the case for Ogg Vorbis. Also, I was able to find a perl implementation of generating the METADATA_BLOCK_PICTURE structure and then base64-encoding it, which at a glance looks like it ought to work. Hardcore pythonistas could probably code up something equivalent using mutagen as long as they don’t mind being stuck on Python 2.x. And now…I have a correctly working PHP implementation, and a script that can be run either from a web server (using a form to upload the picture and fill in the data) or from a command-line prompt. Anybody want it?
Not this site or anything, but specifically the “www.dogphilosophy.net” address. This particular site will still be accessible at “http://dogphilosophy.net”. (The http://hpr.dogphilosophy.net site, which has been more active than this one lately, will also remain up.)
I’m just getting tired of the buttnuggets in Indonesia (I’m looking at YOU, http://pemudaindonesiabaru.blogspot.com, among others) who insist on using an old “blogspot” theme that hotlinks for no good reason to a no-longer-existing image file on www.dogphilosophy.net, thereby clogging the crap out of my webserver logs. The “www” is kind of redundant these days anyway, so I may as well dump it. That said, if you’re watching the RSS feed or just remember typing in “www.dogphilosophy.net” to get here, update your links to just “http://dogphilosophy.net” instead.
Meanwhile, just another note to mention that my latest audio endeavor finally popped up at Hacker Public Radio, and you can download it or listen directly at the hpr.dogphilosophy.net site if you’re not a listener at Hacker Public Radio.
This latest episode is a review of gameplay for Google’s new geolocation based game (“Ingress”). I’m working on a followup episode to this one, then I can finally do the mysterious geotagging episode I’ve been talking about for a couple of years now.
Oh, also over on the hpr.dogphilosophy.net site, I’ve started a list of topics I’m either actively working on or thinking about working on. I strongly encourage/beg anyone who’s interested to check out the list of Potentially Upcoming Shows and leave opinions on which topics look interesting (or suggest additional topics!).
It’s worth noting that I’ve decided that the Stir-Fried Stochasticity shows are probably “of interest to Hackers” and therefore I’m currently planning to fold those topics into the hpr.dogphilosophy.net site as well, which is why you’ll see references to scientific-paper topics (and the Gram Stain!) on the “Potentially Upcoming Shows” list.
A minor bit of news in conclusion: I finally managed to work out what my last remaining problem with it was (many thanks to “derf” on the #opus channel of freenode IRC!) and I now have a working PHP implementation of the funky “METADATA_BLOCK_PICTURE” (“album art”) structure that needs to be generated for Ogg (e.g. Vorbis or Opus) audio files. I’ve gotten back to work on the web-based converter project that I mentioned way back in Hacker Public Radio episode #1033. I now have the core purpose of the project completely working as far as I can tell: taking an uploaded source file and “album art”, prompting in a user-friendly manner for metadata (“title”, “artist”, “genre”, etc.) and encoding settings (bitrate/final file size, etc.), and then generating a valid Opus audio file. Of course, right now it stops abruptly at that point as there’s still plenty of interface work and additional features to add, but it actually does something useful for me now – hooray! I’ll do a separate post about this sometime later. Yes, source code will be available, most likely under the AGPL by default (other terms by negotiation, if anybody actually wants it that much).
There’s a shiny new awesome, high-quality, and legally-free audio standard out now called Opus. Opus audio quality is even better than the already-very-good legally-free Vorbis codec that is widely supported (if not widely promoted) these days, and seems to also beat the proprietary “HE-AAC” codec. Needless to say, it makes ancient mp3 shrivel up with shame.
Although it has only just been finalized, it’s already supported in Firefox 15 on all platforms, including Android. It appears to also be supported in most recent browsers on Linux if they use the “gstreamer” framework for multimedia, and rumor has it that full support on other operating systems will be appearing for Opera and Google Chrome in the relatively near future. Support should be showing up in the next version (probably 2.0.4, I’m guessing) of VLC on all platforms, in the next release of Rockbox for various media players and (hypothetically) Android devices, the next release of the Mumble voice chat system, and probably quite a few others very quickly. Heck, even Microsoft (or at least their Skype division) has been involved in the development of Opus. The group working out the “WebRTC” standard for web-based voice chat (including Microsoft, apparently) voted to support Opus as “Mandatory to Implement”, so anything that ends up supporting the WebRTC standard will support Opus. That means there’s even a chance we might see a rare case of Microsoft Internet Exploder actually supporting a really good media format that everyone is allowed to use sometime relatively soon.
Anyway, the point is that Opus is friggin’ awesome especially for audio downloaded from the internet and everyone should be using it. Well, that’s ONE point – the other is that I plan to do it here, too.
I’ve got a couple of bits of audio that I’ll actually be ready to record and post pretty soon: a year-overdue contribution for Hacker Public Radio (Opus version to be posted to http://hpr.dogphilosophy.net which I set up specifically for my Hacker Public Radio efforts) which just happens to be about media – especially audio – on the internet, and a bit about New England’s “You can’t get there from here” schtick and how it maligns the Booming Metropolis of Millinocket, Maine. (Hey, everybody knows that “all trails lead to Millinocket“, right?) I’ve also got three topics queued up for my “Stir-Fried Stochasticity” science-paper audio project (an 11-paper science monstrosity show on the topic of the Gram Stain, a show on several papers discussing “shinrin-yoku” in honor of the new arboreally-enhanced location here where the Asylum for the Sufficiently Nerdy has moved, and one on a couple of garbage papers). All will be posted in high-quality Opus format, along with modern Ogg Vorbis and possibly crappy-old-mp3 for “legacy” purposes for now.
Incidentally, I updated my HTML5 Web Browser Audio Test page so that it now also has FLAC and, of course, Opus audio samples, so if you go there you can test which audio formats your browser supports (and which one it selects by default if it supports more than one format). As a bonus, the audio samples are all explanations of their formats (for example, the .mp3 format audio sample is a bit of audio talking about the .mp3 format), so it’s educational and stuff, too.
Anyway, test your browser there, and please start leaving comments pestering me – my regular duties in my new profession keep me pretty busy, but I can make more time for audio projects as long as people are interested (and the more interest I hear, the more time I’ll set aside for it and start getting things posted).
Also, I’ve whipped up an HTML5 <audio> tag test for your browsers. I’m trying to figure out exactly how good (or bad) and widespread support for the <audio> tag is these days.
The HTML5 <audio> test page linked above is one that I’ve put together that includes examples of four major audio file formats currently in use (Ogg Vorbis, MP3, WebM Audio, and .wav). The page will report what your browser software reports regarding its compatibility with different audio formats, and provides buttons to push to switch to the different files to try listening to them (regardless whether your browser says it works or not – I know of at least one case where the browser outright claims not to support a file format when it does…).
On a whim, I added a field that reports whether your browser claims to support FLAC, though I don’t yet have a sample file up for testing. I’ll eventually add a sample Opus file as well, since that looks to me like a hugely useful format once it’s ready.
If you have time, give it a try. Pretty much everybody who isn’t stuck on an old version of the Internet Explorer browser on Windows should be able to use it. If you’re willing, the form has fields where you can specify whether a file format really worked or not, and a button at the bottom to submit the report, which will make a note of which browser you’re using and what worked. Note that you don’t HAVE to do this to use the page for testing your audio, if for some reason you don’t want to report what you find out – the audio on the page isn’t dependent on whether you submit results. Eventually I’ll have a fairly complete picture of what supports what one way or another.
Also: pester me – I really will be posting audio again.
Do me a favor and harass me online until I start getting audio out again.
The relocation and career change has been an ongoing distraction, but it’s long past time for me to get back to work on getting Stir-Fried Stochasticity and other projects back to life.
I even got some nice new equipment to record with, thanks to a generous grant from my Producers (the Mom and Dad foundation) of a Zoom H1 four-channel digital recorder. I really need to start making more use of it.
Therefore, anyone who is interested in my little audio projects is encouraged to harass me online here, or via Google Talk, or on my Google+ page, or whatever, until I start coughing up new episodes.
Not literally, of course, I mean, you probably don’t really want to hear “coughing up” noises, but you know what I mean.
I’m still here…but I’m no longer there. (And perhaps, no longer “all there”, but that’s a separate issue.)
An unexpected but hopefully fortuitous attack of Life (the concept, not the cereal) has kept me away from my audio projects for months now. The secret location of our Asylum for the Sufficiently Nerdy is being moved thousands of miles (literally – this is not an exaggeration) in pursuit of a promising project. Fortunately, we’re almost done with the move. We’ll still be pretty busy for a while yet, but I think things will settle down enough for me to shave a few hours per week to get back to audio.
Indeed, I very much hope so – not only do I want to get back to the “Stir-Fried Stochasticity” oggcast project and my HPR offerings, but I also started listening, during the seemingly endless drive from our old Asylum to the new one, to old 1930s-1940s radio serials I found on archive.org, and now I find myself itching to try to do one myself.
I will try to put out an episode of Stir-Fried Stochasticity here soon, explaining my ongoing battle with Zombie Hans Christian Gram, which I think I can put together more quickly than the episode I’d started working on previously, which is garbage (also literally). Other projects to follow as time permits and interest is expressed…
Thanks for not giving up on me (if there’s anyone left out there who hasn’t already).
Incidentally, if you’ve been getting frustrated by the “Bad Gateway” errors, so am I and I promise I’ll get that fixed somehow…
(This is an earlier audio bit, reposted here from elsewhere just to consolidate onto this site with the other audio. Meanwhile, I am working on the next episode of Stir-Fried Stochasticity, as well as the next episode of “Thoughtkindness” for Hacker Public Radio.)
Gather around the campfire, boys and girls and everyone else. It’s story time.
(This is both an attempt to entertain AND a technical test – I’d be most appreciative if any or all of you left me a comment letting me know how this works for you. I’ll put some technical information at the end of the post.)
This story concerns a certain location in Mount Rainier National Park…
After you hear this harrowing tale, if you can’t make it out to Mount Rainier National Park to verify the story for yourself, you can see a picture of the monument online. Click or scan the QRCode image to the right to see it after you’ve heard the story.
Feedback is welcome and encouraged. For those who are interested, here’s what this post is supposed to do, technically:
If you are viewing this post in a modern (HTML5-supporting) browser, the “native” audio player in your browser should appear above, allowing you to press “play” and listen to the story. Of all the modern HTML5-supporting browsers, most support the high-quality (and legally free to use) “Ogg Vorbis” audio format and will play that version. If you are in the minority of the HTML5-browser-using population (Safari or IE9), an MP3 version should play instead. (The problem with Safari is that Apple doesn’t include a Quicktime component for Ogg media formats out of the box. Personally, I would recommend going ahead and installing the Free Quicktime Components, which will enable Ogg media formats for Safari, iTunes, and all other Quicktime-using programs, including enabling Apple platform applications to create files of these types so you can participate, too.)
If you are NOT using a modern, HTML5-supporting browser at all (or are perhaps using one I’ve never heard of that supports neither higher-quality Ogg Vorbis nor MP3), which mainly means Microsoft’s previous “Internet Explorer” browsers and really old versions of Firefox or Opera that may still be in use, then as long as you have Java installed, a Java-based Ogg Vorbis player should appear instead, allowing you to play the higher-quality audio anyway.
If your browser doesn’t support HTML5 AND doesn’t support Java, a link to an Adobe Flash-based MP3 player should appear. Click on that, and you SHOULD have a window pop up that will play the lower-quality MP3 version of the audio.
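The order of those fallbacks can be summed up in a few lines of Python. To be clear, this is only an illustration of the decision order described above, not how the page itself is built (the browser makes these choices from the page markup), and the function name, argument names, and labels are all hypothetical.

```python
def choose_player(html5_formats, has_java):
    """Pick a (player, format) pair using the fallback order from this post.

    html5_formats: set of formats the browser's native HTML5 audio player
    can handle, e.g. {"ogg-vorbis", "mp3"} (labels invented for this sketch).
    has_java: whether a Java plugin is available.
    """
    if "ogg-vorbis" in html5_formats:
        return ("html5-audio", "ogg-vorbis")   # preferred: native player, Ogg Vorbis
    if "mp3" in html5_formats:
        return ("html5-audio", "mp3")          # native player, MP3 (Safari, IE9)
    if has_java:
        return ("java-applet", "ogg-vorbis")   # no usable HTML5 audio, but Java works
    return ("flash-link", "mp3")               # last resort: link to Flash MP3 player
```

For example, `choose_player(set(), True)` lands on the Java-based Ogg Vorbis player, while `choose_player(set(), False)` falls all the way through to the Flash link.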
In short, nearly everyone should be able to play the audio if I’ve done all of this correctly. Please let me know.
I decided that it would be good to offer the computer-nerd-related audio I do to Hacker Public Radio, so I put together an initial episode describing my motivations and offering a review of my shiny new full-powered laptop (which, I must say, is somewhat easier to edit and process audio on) and the vendor I got it from. (SPOILER: I like them…)
Comments so far (both of them…) are substantially positive, so I’m planning to do new ones monthly.
If you’re interested in such things, you can either go straight over to Hacker Public Radio, or the corner of this server that I set up for discussion at http://hpr.dogphilosophy.net where I have the audio linked for direct listening from the webpage if you are using a modern browser.
I have recently been informed that, apparently, there is at least one person who is not just a member of my immediate family humoring me but who is, nonetheless, subscribed to the RSS feed here and potentially interested in my currently-intermittent episodes of Stir-Fried Stochasticity…
Is this true?
(tap)(tap)(tap) is this thing on?…
Seriously, though, the main reason I’ve been so slow to get around to the next one is that I didn’t think anyone had been listening to the ones I’d done so far, and considering the labor involved in putting an episode together, it’s hard to justify the time if I’m largely just talking to myself.
If there are people out there besides a couple of members of my immediate family* who want to hear more, please let me know, and I can certainly call the Ninjaologists back from furlough and start up negotiations with the Science Pirates to get things rolling again…
*(Again, not that I dislike members of my immediate family or anything, but I can talk to them more or less whenever I want…)