Wait, really? Last post was in 2017?

Anyway, not dead yet, more to follow eventually.

I’ve been digging a bit into web-and-phone-app-suitable media formats for audio and “quasi-audio” (where the media is really “audio” but is formatted as a “video” because everything has to be a friggin’ video these days).

Also finally got around to figuring out why the API wasn’t working for this site. I kind of want to dabble in making an “app” for podcast episodes that get posted here.

Watch this space…

This content is published under the Attribution-Share Alike 3.0 Unported license.

Episode 3 of “ImPROMPTu” had something to do with bees and religion, so this was the “commercial” for that episode, in the style of a church-sponsored religious message.


Here’s a spoof-commercial I wrote, recorded, and produced for ImPROMPTuCast, back in episode 0005 for Halloween, for “Papa Dread’s Pizza”. I tried to somewhat imitate the style and cadence of the “Papa John’s” CEO. It’s certainly not an exact match, but close enough for spoof purposes.


Here’s a spoof-commercial I wrote, recorded, and produced for ImPROMPTuCast, back in episode 0002: Pumpkin Spice Chlorpromazine!


Too many things going on lately, I’ve been slacking on the blogs. I haven’t been idle, though:

I’ve been playing with the APIs for Jamendo and Freesound.org to try to better automate finding, fetching, and properly tagging (including applicable copyright license terms!) local copies of sounds for use in podcasts. Having the appropriate information in real audio metadata makes it a lot easier to find compatibly-licensed sound and to find the necessary attribution information when the time comes.

I’m also quasi-professionally doing podcast audio editing and production on the side. As a moderately-skilled amateur rather than a “serious professional” I’m not charging much, but I am technically getting paid to do it, so I say it counts. Currently my sole client is a podcast being done by a couple of independent authors who chat about their current projects, then take a random “writing prompt” and try to whip up a short piece of fiction based on it, and then they each read out what they’ve come up with. If that sounds like something you may be interested in, that podcast is at http://impromptucast.com, its RSS feed for your podcatcher is at http://www.andivan.com/impromptucast-rss.xml, and for those of you stuck on stinky old iTunes, their iTunes page is at https://itunes.apple.com/us/podcast/impromptu/id1280197831?mt=2. In addition to a basic intro and “outro”, they also asked me to come up with some sort of spoof-commercial to insert between the two halves of each episode, which has been fun. I’ll post those somewhere here when I get a chance – I’ve been meaning to put together a portfolio online.

Anyway, they each record their local audio as they chat online, then at the end they send the raw audio to me, and I synchronise it, clean it up and edit it, blend in intro/outro/“commercial break”, then export to mp3 (and flac for archiving and generation of opus and/or Ogg Vorbis audio as needed later) with complete metadata so they can post it. Then I generate updated RSS for their feed. If anybody else is interested in hiring some inexpensive help getting their podcast together, feel free to comment!

Working on that is what got me going on the API projects. I also whipped up a “complete metadata tagger for podcast files” to go with it – in case it’s not obvious, I hate when important information (e.g. licensing, source URLs, etc.) is left off of audio files.


(EDITS: I made a couple of quick additions to this post. Turns out there is at least one person who cares whether I try out this plugin again, and it’s the developer. One nice thing I can definitely say about the plugin is that the developer has always been pretty engaged with the plugin’s users. I vaguely recall having some communication with him online half-a-decade ago when I first started looking into the plugin. See the comments on the post…)

I actually installed the “Blubrry Powerpress” plugin on here way back at the beginning when I noticed it had some (at the time) rudimentary .ogg [vorbis] audio support. I played with it a little and then never did much with it. (EDIT for clarity: this was almost half-a-decade ago – back in August 2010. If you go all the way back to that first audio post [the “TuberculosisBurgers” episode of “Stir-Fried Stochasticity”], that one was posted via the “Blubrry PowerPress” plugin. Quite a bit of development has gone on with the plugin since then, but I’ve not gotten around to trying to use it again…yet.)

Since I’m switching to .opus for everything anyway, I’d been thinking about just removing the plugin (after I go back and find the one or two posts I made using it, to fix them to not require it), but I just noticed that the new 6.0 version has finally added support for .opus, according to a line buried in the changelog.

Still not sure if I’ll keep it and try it out some more or just purge it. The homepage for blubrry.com has adopted a style that looks like a child produced by Windows 8 and an iPod advertisement after a drug-fuelled orgy. Ick. Big flat ugly blocks of text and bland graphics, and almost “mystery-meat navigation”. If that’s indicative of where the plugin is heading, I should probably just purge it. (I get the impression that the plugin is more focussed on iTunes™ and “Search Engine Optimization” than any other features, but I’m not really concerned about Apple’s mandates on this blog.) (EDIT: What I’m getting at here is that it’s not clear how much of the plugin I’ll actually get any use from if I’m not on iTunes, essentially. Is it going to be like having Microsoft Excel installed when I only need to do some basic math?)

(Another post-posting edit for clarity: the “dumbed-down-to-an-iTunes-ad” interface that Windows 8 went all in for is the second-worst trend in the name of “mobile-friendliness” in my personal opinion – second only to websites that now pop up as a giant background graphic filling the screen, which you then have to scroll down past to actually see the “content” you clicked on the link to get to in the first place…and keep scrolling because it’s giant “easy-reader” text, I assume so it’s legible on tiny phone screens. Yes, I’m looking at YOU, medium.com, among others. The only concern related to the plugin is that it suggests a doubling-down on iTunes-and-other-proprietary-services-related features and a moving away from the simple self-hosting I’m interested in as development goes on.)

Since the closest thing to a formal “New Year’s Resolution” is to do a lot more web audio production, I could probably use the plugin, but I certainly don’t NEED it. I’ll consider it while I work on getting some audio produced.

Does anybody still reading this have an opinion either way?


The crappy-old-mp3 standard has now been around for more than a quarter-century.


UPDATED: this page might be interesting for historical purposes, but as far as I can tell the last of the possible .mp3 patents finally expired by 2018 or so. You may now legally use this ancient format however you please. I still say the quality is poor by modern standards, and it’s still not suitable for low-latency audio, but it does have the advantage that virtually everybody supports it and it’s generally “good enough”.


Most kinds of audio players and web browsers have supported better, legally-free formats for a while now, but as usual Microsoft and (most prominently) Apple are stuck with only formats that you have to pay a metaphorical “poll tax” for permission to use.

.mp3 is one of them, of course. By modern standards, mp3 is pretty poor. It’s high-latency, so it’s not suitable for interactive or live uses (e.g. VoIP), and the quality is lacking at all but the highest bitrates (so you either have low-quality audio or huge files to transmit and store). It’s also weighed down by a bunch of patents, of course, so you can’t even legally make or use .mp3 files without somebody paying protection money to some lawyers for permission…

…or can you?

Personally, I’d rather just never touch the stuff (patents or not, the audio format is lacking, and I really don’t like the fussy, limited little “id3” standards for its metadata, either), but I grudgingly concur that having a basic “fallback” format that Microsoft/Apple/fifteen-year-old-media-player owners could use until they realize they can upgrade to something better by using a different browser/media-playing app is sometimes helpful.

It occurs to me that with the original specification for .mp3 being published in 1993 or so, more than 20 years ago, and given that patents aren’t supposed to last more than 20 years, it seems like a reasonable assumption that at least some if not all of the still-threatening patents (the last of which still doesn’t expire until 2017!) are optional “optimizations” or techniques that don’t necessarily have to be applied to generate a valid .mp3 file that ancient media players (or new media players from ancient companies…) can at least play back, even if the files are not “optimal”.

For example (DISCLAIMER: IANAL), This Big List of MP3 Patents shows that there are 6 patents left (as of May 2015) keeping mp3 locked up. However, at least two of those are specifically patents on ways of encoding two or more channels of audio (i.e. stereo, surround-sound, etc.), so a single-channel (mono) audio stream encoded to .mp3 should definitely not trespass on those patents. One more appears to be specific to techniques for encoding “low sampling rate” audio (i.e. NOT the usual 44.1kHz or 48kHz), so a typical 44.1kHz or 48kHz audio source encoded to .mp3 at that rate wouldn’t trespass, either. (Note: Since this post was originally written, a few of the encumbering patents have finally matured into the public domain, hence the edits…)

That leaves 3 remaining US patents to be tiptoed around to generate legal .mp3 files:

Trying to read those things makes my head hurt, but it kind of looks to my definitely-non-expert eye like 5579430 has something to do with “average bit rate” encoding (not “constant bit rate”), so possibly a true “constant bitrate” encoding wouldn’t infringe?

5742735 looks kind of like it’s both a specific way of designing an encoder (with a “controller” and multiple “multi-signal processors”), AND possibly involves application of particular and/or multiple “psychoacoustic models” (as I understand it, more or less an algorithm for deciding when some part of the sound can’t be noticed anyway and can be degraded or thrown out entirely, to save extra room for detail in more important parts of the sound). I’m at a loss as to whether an encoder like LAME actually falls under this patent at all. Anybody know? Even if LAME’s architecture might trespass on the special “multi-signal processor” stuff, might it be possible that running it at “-q 9” (“disables almost all algorithms including psy-model.”) would avoid this patent? (Furthermore, this patent has finally officially run out and no longer even applies!)

5924060 is also a bit beyond me, but I notice claim 7 specifically calls for application of a psychoacoustic model, so perhaps “lame -q 9” might avoid it entirely? Beyond that, I can’t tell how avoidable the techniques described are.

6009399 seems to specifically claim using multiple psychoacoustic models at the same time (and then deciding which one works best for a particular sample and using that one, if I’m interpreting that correctly). Seems like not applying a psychoacoustic model to optimize the encoding would bypass this one, or for that matter even only using one model for the entire encoding process.

So then…am I correct in thinking that there’s a good chance that “lame -q 9 -m m --cbr”, given input audio with 44.1kHz or 48kHz sampling rate, would avoid the last remaining patent threats still hovering over mp3?

(And, yes, I’m aware that the result would sound even worse than usual mp3 files, but the point is merely to generate “usable” mp3 data as a fallback for old/recalcitrant audio players. I’ve got .opus, .ogg [vorbis], and .flac for high-quality audio.)
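
For anyone who wants to script that, here is a minimal Python sketch of the idea. It only checks the WAV header's sampling rate and assembles the command line quoted above. The function names and the "refuse anything that isn't 44.1kHz or 48kHz" policy are my own illustration, not part of LAME, and nothing here says anything about whether the resulting file is actually patent-safe.

```python
import wave

# Flags exactly as quoted in the post: lowest-effort quality setting,
# mono output, constant bitrate.
LAME_FLAGS = ["-q", "9", "-m", "m", "--cbr"]

# The two "usual" sampling rates the post restricts itself to.
ALLOWED_RATES = (44100, 48000)


def wav_sample_rate(path):
    """Return the sampling rate (in Hz) stored in a WAV file's header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def fallback_mp3_command(wav_path, mp3_path):
    """Build the argument list for the fallback encode.

    Hypothetical helper: raises ValueError if the input is not at one of
    the usual sampling rates, otherwise returns a list suitable for
    subprocess.run().
    """
    rate = wav_sample_rate(wav_path)
    if rate not in ALLOWED_RATES:
        raise ValueError(f"{wav_path}: {rate} Hz is not 44100 or 48000 Hz")
    return ["lame", *LAME_FLAGS, wav_path, mp3_path]
```

Passing the returned list to subprocess.run(..., check=True) would do the actual encode, assuming lame is installed and on the PATH.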


I’m working on earning my “Subversive Radio Host” merit badge with my RaspberryPi. Once I’ve got the whole system worked out I’ll likely be doing a Hacker Public Radio episode about it. In the meantime, though, for testing purposes I have a feed from the local NOAA Weather Radio which I am feeding as 10kbps opus audio to an instance of icecast2.

I figured I’d post this since there’s a piece of information I found today regarding having a continuous stream, and someone else may find it useful.

I was finding that my stream to the icecast server, using the command line suggested by the opusenc man page, was spontaneously dying at almost precisely 6 hours, 12 minutes, and 50 seconds. For reference, here’s what the man page says (or at least has for quite some time now and did when I last checked it today) as a suggestion for live-streaming recorded audio in realtime:

arecord -c 2 -r 48000 -twav - | opusenc --bitrate 96 - - | oggfwd icecast.somewhere.org 8000 password /stream.opus

I was using the same string of commands, with minor changes (only one channel, --bitrate 12 [or 10], different address and credentials for oggfwd).

To cut the drama short, the problem turns out to be arecord. I still am not sure whether arecord was hitting a “maximum number of bytes” or “maximum run time” problem, but either way, it turns out you can just use sox (symlinked as “rec”) in its place, which is nice because I had been thinking about playing with having sox apply some filtering to remove noise, etc.
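
A quick back-of-the-envelope check on the “maximum number of bytes” guess, as a sketch only. The post doesn't say which sample format was being captured, so the 96,000 bytes per second below is an ASSUMPTION (it is what you get from one channel of 16-bit samples at 48kHz, or two channels of 8-bit samples at 48kHz). The 2^31-byte limit is likewise a guess at where a signed 32-bit size counter would run out, not something confirmed about arecord.

```python
def seconds_to_fill(byte_limit, sample_rate_hz, channels, bytes_per_sample):
    """How long raw PCM at this rate takes to produce byte_limit bytes."""
    return byte_limit / (sample_rate_hz * channels * bytes_per_sample)


def as_hms(seconds):
    """Split a duration into whole (hours, minutes, seconds)."""
    whole = int(seconds)
    return whole // 3600, (whole % 3600) // 60, whole % 60


# ASSUMPTION: 48 kHz, one channel, 2 bytes per sample = 96,000 bytes/s.
# ASSUMPTION: the limit is 2**31 bytes (a signed 32-bit byte count).
elapsed = seconds_to_fill(2**31, 48000, 1, 2)
print(as_hms(elapsed))  # (6, 12, 49)
```

Under those assumptions the counter runs out at about 6 hours, 12 minutes, and 49.6 seconds, which is suspiciously close to the observed 6:12:50. That would point at a byte limit rather than a run-time limit, but it is a coincidence-level argument, not a diagnosis.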

Here’s a suggested replacement command line for this purpose:
rec -c 1 -t wav - | opusenc --bitrate 10 - - | oggfwd icecast.somewhere.org 8000 password /stream.opus
(with the parameters adjusted to your own needs, of course).
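
If you would rather launch that pipeline from a script than from a shell, something like the following Python sketch should work. The three command lines are the ones given above; the function names, the default values, and the way the processes are chained are my own illustration, and rec, opusenc, and oggfwd are assumed to be installed.

```python
import subprocess


def stream_commands(host, port, password, mount, bitrate_kbps=10, channels=1):
    """Return the three stages of the pipeline as argument lists."""
    return [
        ["rec", "-c", str(channels), "-t", "wav", "-"],
        ["opusenc", "--bitrate", str(bitrate_kbps), "-", "-"],
        ["oggfwd", host, str(port), password, mount],
    ]


def start_pipeline(commands):
    """Start each command with its stdout piped into the next one's stdin."""
    processes = []
    previous_stdout = None
    for index, argv in enumerate(commands):
        is_last = index == len(commands) - 1
        proc = subprocess.Popen(
            argv,
            stdin=previous_stdout,
            stdout=None if is_last else subprocess.PIPE,
        )
        if previous_stdout is not None:
            # Drop our copy of the pipe so the upstream process notices
            # if the downstream one exits.
            previous_stdout.close()
        previous_stdout = proc.stdout
        processes.append(proc)
    return processes
```

Calling start_pipeline(stream_commands("icecast.somewhere.org", 8000, "password", "/stream.opus")) starts the stream; waiting on the last process in the returned list blocks until oggfwd exits.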

It’s been going for over 7 hours now, so I’m assuming my problem is resolved. In the long run it won’t matter so much – my intended use-case in the end involves playing pre-recorded files (and would work perfectly with ices2’s features if ices2 supported opus…), but for times when I may want to live-stream something, this may come in handy. With opus at 10-12kbps being still decent quality for voice, one ought to be able to feasibly live-stream audio even over a really slow pre-“3G” cellphone data link or dial-up modem in realtime.

Oh, speaking of “Subversive Radio Host” – using the clever pifm software turns the Raspberry Pi into a remarkably powerful transmitter. I think I’ve got my antenna trimmed down enough to make it a legal unlicensed transmitter (my original test was with a 20″ or so piece of alligator-clip wire that happened to be handy, and got the signal out to about three blocks away. Definitely too much power for legal unlicensed use.) I suspect with an ideal-length wire as an antenna you could cover a whole town, if you placed it well, and if you didn’t mind getting the FCC (or whoever your local regulatory agency is) very annoyed with you.


(That was the sponsor of the ‘Old Time Radio’ show)

I’ve decided that while I’m experimenting with encoding opus audio for playback on the web and portable media players and smartphones,

Recently I started listening to an “Old Time Radio” series I found on archive.org, and since I’m playing mostly with voice audio I thought it would be a good test subject. Episode 1 of “Chandu, the Magician” (“The Return of Frank Chandler”) actually compressed very well all the way down to 14kbps with little noticeable loss of quality compared to the original. Admittedly, the original was a lossy digitization of an old wire recording, but still, what quality was there was preserved well.


Metadata in Ogg files is nice and simple compared to the bizarre mess that is id3v2, which is the metadata format used for crappy-old-mp3.

Where mp3 has a clutter of special little pre-specified data-structures to pick from, Vorbis and Opus in Ogg files use a nice, simple “(fieldname)=(text)” format. Well…except for one thing.

“Album art” isn’t text. It’s not sound, either. Now, the most-technically-correct way to deal with this sort of thing in an Ogg file would be to have the “album art” images as their own separate media “stream”, sort of like a “movie” file where the “video” is one or more still images. This is kind of an inconvenient kind of thing for audio-only players to deal with I guess, and is different from how the more well-known mp3 did it, so a workaround was devised.

First, let’s get this out of the way: NO, do NOT just base64-encode a jpeg file and cram it into the metadata. Early on, doing this in a field named “COVERART” was one way people tried to cram album art into audio-only Ogg files. However, few if any media players ever bothered to implement using that field, and in addition this method is even less capable than the “APIC” structure that mp3 uses for album art. The correct way to include “album art” images in an Ogg Vorbis (or Opus) file is with the somewhat unintuitively-named METADATA_BLOCK_PICTURE field.

The reason for this strange name is that it’s what the “album art” structure in FLAC files is called, and this precise same binary structure is what’s used for “vorbiscomments” (again, Ogg Vorbis AND “.opus” [Opus in Ogg]) album-art. Since the binary structure is exactly the same, that means any media player that can handle “album art” in .flac files can use exactly the same pre-existing code to also handle album art in Ogg files.
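
To give an idea of how little code the reading side needs, here is a minimal Python sketch that unpacks the value of a METADATA_BLOCK_PICTURE field. The field layout (big-endian 32-bit picture type, length-prefixed MIME type and description, width, height, colour depth, number of indexed colours, then length-prefixed image data) is the FLAC picture block layout; the function name and the returned dictionary keys are my own choices for illustration, and there is no error handling for truncated or malformed input.

```python
import base64
import struct


def parse_picture_block(encoded):
    """Decode a base64 METADATA_BLOCK_PICTURE value into its fields."""
    raw = base64.b64decode(encoded)
    offset = 0

    def read_u32():
        # Every number in the structure is an unsigned 32-bit big-endian int.
        nonlocal offset
        (value,) = struct.unpack_from(">I", raw, offset)
        offset += 4
        return value

    def read_bytes(count):
        nonlocal offset
        chunk = raw[offset:offset + count]
        offset += count
        return chunk

    picture_type = read_u32()
    mime = read_bytes(read_u32()).decode("ascii")
    description = read_bytes(read_u32()).decode("utf-8")
    width, height, depth, colors = [read_u32() for _ in range(4)]
    data = read_bytes(read_u32())
    return {
        "type": picture_type,      # 3 means "front cover"
        "mime": mime,
        "description": description,
        "width": width,
        "height": height,
        "depth": depth,
        "colors": colors,          # 0 for non-indexed images
        "data": data,              # the raw image file bytes
    }
```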

The difficulty is of course that this is still “binary” (non-text) data, so this whole structure (not just the contents of the image file) gets base64 encoded. Annoyingly, I’ve had a hard time finding anything that properly supports generating this structure. The encoder utility for flac actually has a built-in “--picture” option that lets you specify picture files (along with the rest of the data structure, where it might not be auto-detectable or where you don’t want the defaults) easily when making the file in the first place, but neither the oggenc encoder for Ogg Vorbis nor opusenc for Ogg Opus had this option, nor did the “vorbiscomment” utility used for adding or modifying metadata in pre-made Ogg Vorbis files. There are a few graphical utilities (such as kid3) that can properly encode a single piece of “album art” for a file, but only with some hard-coded defaults. Also, they still don’t support Opus files (pending a long-delayed release of taglib version 1.9). This makes embedding album-art during automated processing a problem, as well as the issue of including multiple embedded images, which is also perfectly valid (not only in .flac, .ogg, and .opus but in .mp3 as well) with a couple of minor restrictions.

Some help with this is now available, though. It turns out that the next release of opusenc will actually also include a flac-like “--picture” function, which will make that easier, but still doesn’t solve the case for Ogg Vorbis. Also, I was able to find a perl implementation of generating the METADATA_BLOCK_PICTURE structure and then base64-encoding it, which at a glance looks like it ought to work. Hardcore pythonistas could probably code up something equivalent using mutagen as long as you don’t mind being stuck on Python 2.x. And now…I have a correctly working PHP implementation, and a script that can be run either from a web server (using a form to upload the picture and fill in the data) or from a command-line prompt. Anybody want it?
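
For the curious, the structure itself is small enough to build by hand. What follows is a minimal Python sketch using only the standard library; it is NOT the PHP script mentioned above, just an illustration of the same idea. The function names and defaults are my own, and the width, height, and colour depth are NOT auto-detected: the caller has to supply them (the zeros used as defaults are placeholders, not a recommendation).

```python
import base64
import struct


def build_picture_block(image_data, mime="image/jpeg", description="",
                        picture_type=3, width=0, height=0, depth=0, colors=0):
    """Pack image bytes into the binary FLAC-style picture structure.

    picture_type 3 is "front cover". All integers are unsigned 32-bit
    big-endian; the MIME type and description are length-prefixed.
    """
    mime_bytes = mime.encode("ascii")
    desc_bytes = description.encode("utf-8")
    block = struct.pack(">II", picture_type, len(mime_bytes)) + mime_bytes
    block += struct.pack(">I", len(desc_bytes)) + desc_bytes
    block += struct.pack(">IIIII", width, height, depth, colors, len(image_data))
    return block + image_data


def picture_comment(image_data, **fields):
    """Return a complete "(fieldname)=(text)" comment for the picture.

    The WHOLE structure is base64-encoded, not just the image bytes.
    """
    block = build_picture_block(image_data, **fields)
    return "METADATA_BLOCK_PICTURE=" + base64.b64encode(block).decode("ascii")
```

Reading a cover image with open("cover.jpg", "rb") and passing its bytes to picture_comment() along with the real dimensions gives a string that can be handed to whatever tool writes the comments into the file. The filename here is just an example.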
