The crappy-old-mp3 standard has now been around for nearly a quarter-century. Most kinds of audio players and web browsers have supported better, legally-free formats for a while now, but as usual Microsoft and (most prominently) Apple are stuck with only formats that you have to pay a metaphorical “poll tax” for permission to use.

.mp3 is one of them, of course. By modern standards, mp3 is pretty poor. It’s high-latency, so it’s not suitable for interactive or live uses (e.g. VoIP), and the quality is lacking at all but the highest bitrates (so you either have low-quality audio or huge files to transmit and store). It’s also weighed down by a bunch of patents, of course, so you can’t even legally make or use .mp3 files without somebody paying protection money to some lawyers for permission…

…or can you?

Personally, I’d rather just never touch the stuff (patents or not, the audio format is lacking, and I really don’t like the fussy, limited little “id3” standards for its metadata, either), but I grudgingly concur that it would be useful to have a basic “fallback” format that Microsoft/Apple/fifteen-year-old-media-player owners could use until they realize they can upgrade to something better by using a different browser/media-playing app.

It occurs to me that the original specification for .mp3 was published in 1993 or so, more than 20 years ago, and patents aren’t supposed to last more than 20 years. It therefore seems like a reasonable assumption that at least some, if not all, of the still-threatening patents (the last of which doesn’t expire until 2017!) cover optional “optimizations” or techniques that don’t necessarily have to be applied to generate a valid .mp3 file that ancient media players (or new media players from ancient companies…) can at least play back, even if the files are not “optimal”.

For example (DISCLAIMER: IANAL), this Big List of MP3 Patents shows that there are 9 patents left (as of 2014) keeping mp3 locked up. However, at least four of those are specifically patents on ways of encoding two or more channels of audio (i.e. stereo, surround-sound, etc.), so a single-channel (mono) audio stream encoded to .mp3 should not trespass on those four, which is nearly half of the list. One more appears to be specific to techniques for encoding “low sampling rate” audio (i.e. NOT the usual 44.1kHz or 48kHz), so a typical 44.1kHz or 48kHz audio source encoded to .mp3 at that rate wouldn’t trespass, either.

That leaves 4 remaining US patents to be tiptoed around to generate legal .mp3 files: 5579430, 5742735, 5924060, and 6009399.

Trying to read those things makes my head hurt, but it kind of looks to my definitely-non-expert eye like 5579430 has something to do with “average bit rate” encoding (not “constant bit rate”), so possibly a true “constant bitrate” encoding wouldn’t infringe?

5742735 looks kind of like it’s both a specific way of designing an encoder (with a “controller” and multiple “multi-signal processors”), AND possibly involves application of particular and/or multiple “psychoacoustic models” (as I understand it, more or less an algorithm for deciding when some part of the sound can’t be noticed anyway and can be degraded or thrown out entirely, to save extra room for detail in more important parts of the sound). I’m at a loss as to whether an encoder like LAME actually falls under this patent at all. Anybody know? Even if LAME’s architecture might trespass on the special “multi-signal processor” stuff, might it be possible that running it at “-q 9” (“disables almost all algorithms including psy-model.”) would avoid this patent?

5924060 is also a bit beyond me, but I notice claim 7 specifically calls for application of a psychoacoustic model, so perhaps “lame -q 9” might avoid it entirely? Beyond that, I can’t tell how avoidable the techniques described are.

6009399 seems to specifically claim using multiple psychoacoustic models at the same time (and then deciding which one works best for a particular sample and using that one, if I’m interpreting that correctly). Seems like not applying a psychoacoustic model to optimize the encoding would bypass this one, or for that matter even only using one model for the entire encoding process.

So then…am I correct in thinking that there’s a good chance that “lame -q 9 -m m --cbr”, given input audio with 44.1kHz or 48kHz sampling rate, would avoid the last remaining patent threats still hovering over mp3?
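For anyone who wants to script that experiment, here is a minimal sketch that only assembles the command line in question. The file names and the 64 kbps bitrate are placeholders I made up for illustration, and nothing here says anything about whether the resulting file actually avoids any patent.

```python
import shlex


def build_lame_command(source_wav, target_mp3, bitrate_kbps=64):
    """Assemble the conservative LAME invocation discussed above as an argument list."""
    return [
        "lame",
        "-q", "9",                 # lowest algorithm-quality setting
        "-m", "m",                 # mono: one audio channel only
        "--cbr",                   # enforce constant bitrate
        "-b", str(bitrate_kbps),   # bitrate in kbps (64 is an arbitrary example value)
        source_wav,
        target_mp3,
    ]


if __name__ == "__main__":
    # Placeholder file names; the command is printed rather than executed.
    print(shlex.join(build_lame_command("input.wav", "fallback.mp3")))
```

The list can be handed to `subprocess.run` as-is once the input really is 44.1kHz or 48kHz audio.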

(And, yes, I’m aware that the result would sound even worse than usual mp3 files, but the point is merely to generate “usable” mp3 data as a fallback for old/recalcitrant audio players. I’ve got .opus, .ogg [vorbis], and .flac for high-quality audio.)

This content is published under the Attribution-Share Alike 3.0 Unported license.

I’m working on earning my “Subversive Radio Host” merit badge with my RaspberryPi. Once I’ve got the whole system worked out I’ll likely be doing a Hacker Public Radio episode about it. In the meantime, though, for testing purposes I have a feed from the local NOAA Weather Radio which I am feeding as 10kbps opus audio to an instance of icecast2.

I figured I’d post this since there’s a piece of information I found today regarding having a continuous stream, and someone else may find it useful.

I was finding that my stream to the icecast server, using the command line suggested by the opusenc man page, was spontaneously dying at almost precisely 6 hours, 12 minutes, and 50 seconds. For reference, here’s what the man page suggests (it has for quite some time now, and still did when I checked it today) for live-streaming recorded audio in realtime:

arecord -c 2 -r 48000 -twav - | opusenc --bitrate 96 - - | oggfwd icecast.somewhere.org 8000 password /stream.opus

I was using the same string of commands, with minor changes (only one channel, --bitrate 12 [or 10], different address and credentials for oggfwd).

To cut the drama short, the problem turns out to be arecord. I still am not sure whether arecord was hitting a “maximum number of bytes” or “maximum run time” problem, but either way, it turns out you can just use sox (symlinked as “rec”) in its place, which is nice because I had been thinking about playing with having sox apply some filtering to remove noise, etc.
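One hypothesis that fits the timing is a byte-count ceiling rather than a clock. The sketch below is only arithmetic: it ASSUMES a 2 GiB (2^31 byte) cap on the WAV stream and 16-bit samples, neither of which I have confirmed for this setup, and computes how long one channel at 48kHz takes to reach that cap.

```python
def seconds_until_limit(limit_bytes, sample_rate_hz, channels, bytes_per_sample):
    """Time in seconds for an uncompressed PCM stream to reach limit_bytes."""
    bytes_per_second = sample_rate_hz * channels * bytes_per_sample
    return limit_bytes / bytes_per_second


def format_hms(total_seconds):
    """Render a duration as hours, minutes, and seconds (rounded to the nearest second)."""
    whole = int(round(total_seconds))
    hours, remainder = divmod(whole, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {seconds:02d}s"


if __name__ == "__main__":
    assumed_limit = 2 ** 31  # ASSUMPTION: a 2 GiB ceiling on the WAV data
    # ASSUMPTION: 16-bit (2-byte) samples; one channel at 48kHz as in my setup.
    elapsed = seconds_until_limit(assumed_limit, 48000, 1, 2)
    print(format_hms(elapsed))  # prints "6h 12m 50s"
```

That lines up with the observed failure time, which points toward a "maximum number of bytes" explanation, but it is a coincidence-level argument and not proof.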

Here’s a suggested replacement command line for this purpose:
rec -c 1 -t wav - | opusenc --bitrate 10 - - | oggfwd icecast.somewhere.org 8000 password /stream.opus
(with the parameters adjusted to your own needs, of course).

It’s been going for over 7 hours now, so I’m assuming my problem is resolved. In the long run it won’t matter so much – my intended use-case in the end involves playing pre-recorded files (and would work perfectly with ices2’s features if ices2 supported opus…), but for times when I may want to live-stream something, this may come in handy. With opus at 10-12kbps being still decent quality for voice, one ought to be able to feasibly live-stream audio even over a really slow pre-“3G” cellphone data link or dial-up modem in realtime.

Oh, speaking of “Subversive Radio Host” – using the clever pifm software turns the Raspberry Pi into a remarkably powerful transmitter. I think I’ve got my antenna trimmed down enough to make it a legal unlicensed transmitter. (My original test was with a 20″ or so piece of alligator-clip wire that happened to be handy, and it got the signal out to about three blocks away. Definitely too much power for legal unlicensed use.) I suspect with an ideal-length wire as an antenna you could cover a whole town, if you placed it well, and if you didn’t mind getting the FCC (or whoever your local regulatory agency is) very annoyed with you.

(You’ll understand the title of this post if you listen…)

I’ve decided that while I’m experimenting with encoding opus audio for playback on the web and portable media players and smartphones, I might as well post some of my experiments on opuscast.com until I finally get around to building something more complete there.

Recently I started listening to an “Old Time Radio” series I found on archive.org, and since I’m playing mostly with voice audio I thought it would be a good test subject. Episode 1 of “Chandu, the Magician” (“The Return of Frank Chandler”) may be heard (with an appropriately recent and freedom-loving web browser) or downloaded to listen to in Opus-supporting media players right now at opuscast.com. If you give it a listen, you’re welcome to post your observations here in the comments.

Metadata in Ogg files is nice and simple compared to the bizarre mess that is id3v2, which is the metadata format used for crappy-old-mp3.

Where mp3 has a clutter of special little pre-specified data-structures to pick from, Vorbis and Opus in Ogg files use a nice, simple “(fieldname)=(text)” format. Well…except for one thing.

“Album art” isn’t text. It’s not sound, either. Now, the most-technically-correct way to deal with this sort of thing in an Ogg file would be to have the “album art” images as their own separate media “stream”, sort of like a “movie” file where the “video” is one or more still images. I guess this is inconvenient for audio-only players to deal with, and it is different from how the more well-known mp3 did it, so a workaround was devised.

First, let’s get this out of the way: NO, do NOT just base64-encode a jpeg file and cram it into the metadata. Early on, doing this in a field named “COVERART” was one way people tried to cram album art into audio-only Ogg files. However, few if any media players ever bothered to implement using that field, and in addition this method is even less capable than the “APIC” structure that mp3 uses for album art. The correct way to include “album art” images in an Ogg Vorbis (or Opus) file is with the somewhat unintuitively-named METADATA_BLOCK_PICTURE field.

The reason for this strange name is that it’s what the “album art” structure in FLAC files is called, and this precise same binary structure is what’s used for album art in “vorbiscomments” (again, Ogg Vorbis AND “.opus” [Opus in Ogg]). Since the binary structure is exactly the same, any media player that can handle “album art” in .flac files can use exactly the same pre-existing code to also handle album art in Ogg files.

The difficulty is of course that this is still “binary” (non-text) data, so this whole structure (not just the contents of the image file) gets base64 encoded. Annoyingly, I’ve had a hard time finding anything that properly supports generating this structure. The encoder utility for flac actually has a built-in “--picture” option that lets you specify picture files (along with the rest of the data structure, where it might not be auto-detectable or where you don’t want the defaults) easily when making the file in the first place, but neither the oggenc encoder for Ogg Vorbis nor opusenc for Ogg Opus had this option, nor did the “vorbiscomment” utility used for adding or modifying metadata in pre-made Ogg Vorbis files. There are a few graphical utilities (such as kid3) that can properly encode a single piece of “album art” for a file, but only with some hard-coded defaults. Also, they still don’t support Opus files (pending a long-delayed release of taglib version 1.9). This makes embedding album art during automated processing a problem, and the same goes for including multiple embedded images, which is also perfectly valid (not only in .flac, .ogg, and .opus but in .mp3 as well) with a couple of minor restrictions.
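To make the structure concrete, here is a minimal standard-library Python sketch of packing a FLAC-style picture block and base64-encoding the whole thing into a comment field. The function names are my own invention for illustration, and the caller is ASSUMED to already know the image’s width, height, and color depth, since this sketch does not parse the image file.

```python
import base64
import struct


def build_picture_block(image_bytes, mime_type, width, height, color_depth,
                        description="", picture_type=3, indexed_colors=0):
    """Pack the FLAC picture structure; every integer is 32-bit big-endian."""
    mime = mime_type.encode("ascii")
    desc = description.encode("utf-8")
    return b"".join([
        struct.pack(">I", picture_type),   # 3 means "front cover"
        struct.pack(">I", len(mime)), mime,
        struct.pack(">I", len(desc)), desc,
        struct.pack(">IIII", width, height, color_depth, indexed_colors),
        struct.pack(">I", len(image_bytes)), image_bytes,
    ])


def picture_comment(image_bytes, mime_type, width, height, color_depth, **extra):
    """Return the complete 'METADATA_BLOCK_PICTURE=...' comment string."""
    block = build_picture_block(image_bytes, mime_type, width, height,
                                color_depth, **extra)
    # The whole structure is base64-encoded, not just the image contents.
    return "METADATA_BLOCK_PICTURE=" + base64.b64encode(block).decode("ascii")


if __name__ == "__main__":
    # Three stand-in bytes instead of a real JPEG, purely for demonstration.
    print(picture_comment(b"\xff\xd8\xff", "image/jpeg", 500, 500, 24))
```

Multiple images would simply mean multiple METADATA_BLOCK_PICTURE fields, each built the same way with its own picture type.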

Some help with this is now available, though. It turns out that the next release of opusenc will actually also include a flac-like “--picture” function, which will make that easier, but still doesn’t solve the case for Ogg Vorbis. Also, I was able to find a perl implementation of generating the METADATA_BLOCK_PICTURE structure and then base64-encoding it, which at a glance looks like it ought to work. Hardcore pythonistas could probably code up something equivalent using mutagen as long as you don’t mind being stuck on Python 2.x. And now…I have a correctly working PHP implementation, and a script that can be run either from a web server (using a form to upload the picture and fill in the data) or from a command-line prompt. Anybody want it?

Not this site or anything, but specifically the “www.dogphilosophy.net” address. This particular site will still be accessible at “http://dogphilosophy.net”. (The http://hpr.dogphilosophy.net site, which has been more active than this one lately, will also remain up.)

I’m just getting tired of the buttnuggets in Indonesia (I’m looking at YOU, http://pemudaindonesiabaru.blogspot.com, among others) who insist on using an old “blogspot” theme that hotlinks for no good reason to a no-longer-existing image file on www.dogphilosophy.net, thereby clogging the crap out of my webserver logs. The “www” is kind of redundant these days anyway, so I may as well dump it. That said, if you’re watching the RSS feed or just remember typing in “www.dogphilosophy.net” to get here, update your links to just “http://dogphilosophy.net” instead.

Meanwhile, just another note to mention that my latest audio endeavor finally popped up at Hacker Public Radio, and you can download it or listen directly at the hpr.dogphilosophy.net site if you’re not a listener at Hacker Public Radio.

This latest episode is a review of gameplay for Google’s new geolocation based game (“Ingress”). I’m working on a followup episode to this one, then I can finally do the mysterious geotagging episode I’ve been talking about for a couple of years now.

Oh, also over on the hpr.dogphilosophy.net site, I’ve started a list of topics I’m either actively working on or thinking about working on. I strongly encourage/beg anyone who’s interested to check out the list of Potentially Upcoming Shows and leave opinions on which topics look interesting (or suggest additional topics!).

It’s worth noting that I’ve decided that the Stir-Fried Stochasticity shows are probably “of interest to Hackers” and therefore I’m currently planning to fold those topics into the hpr.dogphilosophy.net site as well, which is why you’ll see references to scientific-paper topics (and the Gram Stain!) on the “Potentially Upcoming Shows” list.

A minor bit of news in conclusion: I finally managed to work out what my last remaining problem with it was (much thanks to “derf” on the #opus channel of freenode IRC!) and I now have a working PHP implementation of the funky “METADATA_BLOCK_PICTURE” (“album art”) structure that needs to be generated for Ogg (e.g. vorbis or opus) audio files. I’ve gotten back to work on the web-based converter project that I mentioned way back in Hacker Public Radio episode #1033. I now have the core purpose of the project completely working as far as I can tell: taking an uploaded source file and “album art”, prompting in a user-friendly manner for metadata (“title”, “artist”, “genre”, etc.) and encoding settings (bitrate/final file size, etc.), and then generating a valid Opus audio file. Of course, right now it stops abruptly at that point as there’s still plenty of interface work and additional features to add, but it actually does something useful for me now – hooray! I’ll do a separate post about this sometime later. Yes, source code will be available, most likely under the AGPL by default (other terms by negotiation, if anybody actually wants it that much).

There’s a shiny new awesome, high-quality, and legally-free audio standard out now called Opus. Opus audio quality is even better than the already-very-good legally-free Vorbis codec that is widely supported (if not widely promoted) these days, and seems to also beat the proprietary “HE-AAC” codec. Needless to say, it makes ancient mp3 shrivel up with shame.

Although it has only just been finalized, it’s already supported in Firefox 15 on all platforms, including Android. It appears to also be supported in most recent browsers on Linux if they use the “gstreamer” framework for multimedia, and rumor has it that full support on other operating systems will be appearing for Opera and Google Chrome in the relatively near future. Support should be showing up in the next version (probably 2.0.4, I’m guessing) of VLC on all platforms, in the next release of Rockbox for various media players and (hypothetically) Android devices, the next release of the Mumble voice chat system, and probably quite a few others very quickly. Heck, even Microsoft (or at least their Skype division) has been involved in the development of Opus, and the group working out the “WebRTC” standard for web-based voice chat (including Microsoft, apparently) voted to support Opus as “Mandatory to Implement”, so anything that ends up supporting the WebRTC standard will support Opus. There’s even a chance we might see a rare case of Microsoft Internet Exploder actually supporting a really good media format that everyone is allowed to use sometime relatively soon.

Anyway, the point is that Opus is friggin’ awesome especially for audio downloaded from the internet and everyone should be using it. Well, that’s ONE point – the other is that I plan to do it here, too.

I’ve got a couple of bits of audio that I’ll actually be ready to record and post pretty soon: a year-overdue contribution for Hacker Public Radio (Opus version to be posted to http://hpr.dogphilosophy.net which I set up specifically for my Hacker Public Radio efforts) which just happens to be about media – especially audio – on the internet, and a bit about New England’s “You can’t get there from here” schtick and how it maligns the Booming Metropolis of Millinocket, Maine. (Hey, everybody knows that “all trails lead to Millinocket“, right?) I’ve also got three topics queued up for my “Stir-Fried Stochasticity” science-paper audio project (an 11-paper science monstrosity show on the topic of the Gram Stain, a show on several papers discussing “shinrin-yoku” in honor of the new arboreally-enhanced location here where the Asylum for the Sufficiently Nerdy has moved, and one on a couple of garbage papers). All will be posted in high-quality Opus format, along with modern Ogg Vorbis and possibly crappy-old-mp3 for “legacy” purposes for now.

Incidentally, I updated my HTML5 Web Browser Audio Test page so that it now also has FLAC and, of course, Opus audio samples, so if you go there you can test which audio formats your browser supports (and which one it selects by default if it supports more than one format). As a bonus, the audio samples are all explanations of their formats (for example, the .mp3 format audio sample is a bit of audio talking about the .mp3 format), so it’s educational and stuff, too.

Anyway, test your browser there, and please start leaving comments pestering me – my regular duties in my new profession keep me pretty busy, but I can make more time for audio projects as long as people are interested (and the more interest I hear, the more time I’ll set aside for it and start getting things posted).

Posted from Millinocket, Maine, United States.

Also, I’ve whipped up an HTML5 <audio> tag test for your browsers. I’m trying to figure out exactly how good (or bad) and widespread support for the <audio> tag is these days.

The HTML5 <audio> test page linked above is one that I’ve put together that includes examples of four major audio file formats currently in use (Ogg Vorbis, MP3, WebM Audio, and .wav). The page will report what your browser software reports regarding its compatibility with different audio formats, and provides buttons to push to switch to the different files to try listening to them (regardless whether your browser says it works or not – I know of at least one case where the browser outright claims not to support a file format when it does…).

On a whim, I added a field that reports whether your browser claims to support FLAC, though I don’t yet have a sample file up for testing. I’ll eventually add a sample Opus file as well, since that looks to me like a hugely useful format once it’s ready.

If you have time, give it a try. Pretty much everybody who isn’t stuck on an old version of the Internet Explorer browser on Windows should be able to use it. If you’re willing, the form has fields where you can specify whether a file format really worked or not, and a button at the bottom to submit the report, which will make a note of which browser you’re using and what worked. Note that you don’t HAVE to do this to use the page for testing your audio, if for some reason you don’t want to report what you find out – the audio on the page isn’t dependent on whether you submit results. Eventually I’ll have a fairly complete picture of what supports what one way or another.

Also: pester me – I really will be posting audio again.

Do me a favor and harass me online until I start getting audio out again.

The relocation and career change have been an ongoing distraction, but it’s long past time for me to get back to work on bringing Stir-Fried Stochasticity and other projects back to life.

I even got some nice new equipment to record with, thanks to a generous grant from my Producers (the Mom and Dad foundation) of a Zoom H1 four-channel digital recorder. I really need to start making more use of it.

Therefore, anyone who is interested in my little audio projects is encouraged to harass me online here, or via Google Talk, or on my Google+ page, or whatever, until I start coughing up new episodes.

Not literally, of course. I mean, you probably don’t really want to hear “coughing up” noises, but you know what I mean.

I’m still here…but I’m no longer there. (And perhaps, no longer “all there”, but that’s a separate issue.)

An unexpected but hopefully fortuitous attack of Life (the concept, not the cereal) has kept me away from my audio projects for months now. The secret location of our Asylum for the Sufficiently Nerdy is being moved thousands of miles (literally – this is not an exaggeration) in pursuit of a promising project. Fortunately, we’re almost done with the move. We’ll still be pretty busy for a while yet, but I think things will settle down enough for me to shave a few hours per week to get back to audio.

Indeed, I very much hope so – not only do I want to get back to the “Stir-Fried Stochasticity” oggcast project and my HPR offerings, but I also started listening, during the seemingly endless drive from our old Asylum to the new one, to old 1930s-1940s radio serials I found on archive.org, and now I find myself itching to try to do one myself.

I will try to put out an episode of Stir-Fried Stochasticity here soon, explaining my ongoing battle with Zombie Hans Christian Gram, which I think I can put together more quickly than the episode I’d started working on previously, which is garbage (also literally). Other projects to follow as time permits and interest is expressed…

Thanks for not giving up on me (if there’s anyone left out there who hasn’t already). 

Incidentally, if you’ve been getting frustrated by the “Bad Gateway” errors, so am I and I promise I’ll get that fixed somehow…

(This is an earlier audio bit, reposted here from elsewhere just to consolidate onto this site with the other audio. Meanwhile, I am working on the next episode of Stir-Fried Stochasticity, as well as the next episode of “Thoughtkindness” for Hacker Public Radio.)

Gather around the campfire, boys and girls and everyone else. It’s story time.

(This is both an attempt to entertain AND a technical test – I’d be most appreciative if any or all of you left me a comment letting me know how this works for you. I’ll put some technical information at the end of the post.)

This story concerns a certain location in Mount Rainier National Park…

After you hear this harrowing tale, if you can’t make it out to Mount Rainier National Park to verify the story for yourself, you can see a picture of the monument online. Click or scan the QRCode image to the right to see it after you’ve heard the story (http://www.panoramio.com/photo/38235159).

Feedback is welcome and encouraged. For those who are interested, here’s what this post is supposed to do, technically:

If you are viewing this post in a modern (HTML5-supporting) browser, the “native” audio player in your browser should appear above, allowing you to press “play” and listen to the story. Of all the modern HTML5-supporting browsers, most support the high-quality (and legally free to use) “Ogg Vorbis” audio format and will play that version. If you are in the minority of the HTML5-browser-using population (Safari or IE9), an MP3 version should play instead. (The problem with Safari is that Apple doesn’t include a Quicktime component for Ogg media formats out of the box. Personally, I would recommend going ahead and installing the Free Quicktime Components, which will enable Ogg media formats for Safari, iTunes, and all other Quicktime-using programs, including enabling Apple platform applications to create files of these types so you can participate, too.)

If you are NOT using a modern, HTML5-supporting browser at all (or are perhaps using one I’ve never heard of that supports neither higher-quality Ogg Vorbis nor MP3) – mainly Microsoft’s previous “Internet Explorer” browsers and really old versions of Firefox or Opera that may still be in use – if you have Java installed, a Java-based Ogg Vorbis player should appear instead, allowing you to play the higher-quality audio anyway.

If your browser doesn’t support HTML5 AND doesn’t support Java, a link to an Adobe Flash-based MP3 player should appear. Click on that, and you SHOULD have a window pop up that will play the lower-quality MP3 version of the audio.

In short, nearly everyone should be able to play the audio if I’ve done all of this correctly. Please let me know.
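The order of fallbacks described above can be written out as a small decision function. This is only a sketch of the intended logic, not the markup the post actually uses; the function name and the format labels are made up for illustration.

```python
def choose_player(html5_formats=(), has_java=False):
    """Pick the playback path in the order described in this post.

    html5_formats lists what the browser's native <audio> support can play,
    using the illustrative labels "ogg-vorbis" and "mp3". An empty list
    means the browser has no usable HTML5 audio support.
    """
    if "ogg-vorbis" in html5_formats:
        return "native HTML5 player (Ogg Vorbis)"
    if "mp3" in html5_formats:
        return "native HTML5 player (MP3)"
    if has_java:
        return "Java-based Ogg Vorbis player"
    # Last resort: the link to the Flash-based MP3 player.
    return "link to Flash-based MP3 player"


if __name__ == "__main__":
    print(choose_player(["ogg-vorbis", "mp3"]))   # most HTML5 browsers
    print(choose_player(["mp3"]))                 # e.g. Safari or IE9
    print(choose_player([], has_java=True))       # older browser with Java
    print(choose_player([]))                      # neither HTML5 nor Java
```

Ogg Vorbis is checked first on purpose, so a browser that supports both formats gets the higher-quality file.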



Posted from PACKWOOD, Washington, United States.
