Web MIDI API - Current Status, use cases
So, obviously, it's a thing.
Google's own awesome Chris Wilson is working on the spec as well as on a first implementation in Chrome. But he also did another, quite awesome thing: implementing the Web MIDI API with the help of the Jazz browser plugin. In essence, that means that if you can convince users of your application to install that plugin, you can start using the Web MIDI API already.
Well, first of all, for those that don't know, what exactly is MIDI?
MIDI was developed in 1982 (!) as a simple, reliable and fast way of sending musical data over simple cables. It defines a simple serial protocol and was, in the 1980s, mostly used as a hardware-to-hardware protocol.
Despite the fact that the protocol is so simple and so arcane by modern standards, it is still, more or less, the de facto standard for communication between musical devices. If you buy a musical controller, regardless of whether it is something simple like a keyboard or something as mindbogglingly complex as the Ableton Push, chances are that the communication works via MIDI.
So, now that the Web Audio API has basically become the standard way for browsers to output realtime audio, this is the next building block to enable more complex musical things in the browser.
(Please note: I'm using Chris' shim here. The current documented API (as I've linked to) has slightly different signatures. I kind of hope that this will be unified again pretty soon, though.)
While the Web Audio API currently ignores the fact that computer systems may have more than one audio interface (which is a bummer, really), this would be a silly thing to do with MIDI: Each instrument usually defines its own interface, and it's important to be able to distinguish between those, because there is no other way of addressing them (this is only partly true, but for now, let's just pretend). So, in order to work with a MIDI interface (in the MIDI world, this is also often called a "port"; remember, it used to be a serial port), we first need to find out which ports exist.
To be able to enumerate the ports, we need to request access to the MIDI system first (in the final API, this will likely result in a permission request dialog). The API uses promises to make working with the asynchronous API easier:
navigator.requestMIDIAccess().then( onsuccesscallback, onerrorcallback );
We can now (in the onsuccesscallback function, of course) access the list of input and output devices. Note here that MIDI ports are unidirectional, so they are either input or output ports:
function onsuccesscallback(access) {
  access.inputs().forEach(function(input) {
    console.log(input.name);
  });
}
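Since the shim and the documented API differ a bit here, a small helper can hide the difference. This is only a sketch: getInputs is a made-up name, and it assumes that the other API shape exposes inputs as a Map-like collection instead of a function.

```javascript
// Hypothetical helper: returns the input ports as a plain array,
// whichever of the two API shapes is present.
function getInputs(access) {
  var inputs = access.inputs;
  if (typeof inputs === "function") {
    // Shim style, as used in this post: inputs() returns an array of ports.
    return inputs.call(access);
  }
  // Assumption: otherwise inputs is a Map-like collection of ports.
  return Array.from(inputs.values());
}

// Hypothetical helper: collects the names of all input ports.
function getInputNames(access) {
  return getInputs(access).map(function (input) {
    return input.name;
  });
}
```

Inside onsuccesscallback you could then call getInputNames(access) and log the result.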
To receive MIDI data, we can now simply add an event listener to the input:
input.onmidimessage = function receiveMidi(message) { console.log(message.data); };
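Put together, the steps so far might look roughly like this. It's a sketch based on the shim-style calls used in this post; listenToMidi and its parameters are names I made up, and the check for requestMIDIAccess is my own addition for browsers without the API or the plugin.

```javascript
// Hypothetical helper: requests MIDI access and forwards every incoming
// message from every input port to onMessage(portName, data).
function listenToMidi(nav, onMessage, onError) {
  if (!nav || typeof nav.requestMIDIAccess !== "function") {
    // Neither a native implementation nor the shim is available.
    onError(new Error("Web MIDI API is not available"));
    return;
  }
  nav.requestMIDIAccess().then(function (access) {
    // Shim-style enumeration: inputs() returns an array of input ports.
    access.inputs().forEach(function (input) {
      input.onmidimessage = function (message) {
        onMessage(input.name, message.data);
      };
    });
  }, onError);
}
```

In a page you would pass in the real navigator object, a function that logs the port name and the data, and something like console.error for the error case.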
Now, interpreting these MIDI messages correctly is a little complex, so let's keep it to a simple example. Let's assume that a note-on message is coming in, which, for example, happens when you press down a key on your MIDI keyboard. You might see something like this in the console:
So, the first byte is a number bigger than 127. This is important, because it means that this byte identifies the command. 144 is the note-on command. But that's only half of the truth. The first half of the byte (the most significant nybble) is the command, in our case 0x9 (which makes the whole byte 0x90, or 144). The other half is the so-called MIDI channel, which ranges from 0-15 but is usually counted 1-based, so our example uses channel 1. (So there actually is a way of addressing MIDI devices, but only for a certain class of messages, the so-called channel messages.)
The following two bytes only use 7 bits each. The first one is the note number (in our case, a middle C). The second one is velocity. Velocity is, with keyboards, a measure of how hard you hit the key. This is used to make sounds more expressive, to change the timbre or the volume of a sound while playing.
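In code, picking such a message apart is mostly bit masking. Here is a minimal sketch that only has three-byte note messages in mind; decodeNoteMessage is a made-up helper and the velocity of 100 in the comment is just an example value, not something the API hands you.

```javascript
// Hypothetical helper: splits a note message into its parts.
// Returns null if the first byte is not a channel message status byte.
function decodeNoteMessage(data) {
  var status = data[0];
  if (status < 0x80 || status >= 0xf0) {
    // Below 0x80 it is a data byte, from 0xF0 upwards a system message.
    return null;
  }
  return {
    command: status & 0xf0,        // high nybble, e.g. 0x90 for note-on
    channel: (status & 0x0f) + 1,  // low nybble, counted 1-based
    note: data[1] & 0x7f,          // 7 bit note number, 60 is middle C
    velocity: data[2] & 0x7f       // 7 bit velocity
  };
}

// Example with a made-up velocity of 100:
// decodeNoteMessage([144, 60, 100])
// gives { command: 144, channel: 1, note: 60, velocity: 100 }
```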
As soon as you release the key, you'll probably see something like this:
As you might guess, 0x80 is the note-off command. But this might not actually be what you are seeing. Another possibility is this:
Because the spec allows for note-off messages to be sent as note-on messages with 0 as velocity.
Don't ask. Just remember that the spec was written in 1982, and the microcontrollers being added to synthesizers at the time had a fraction of the processing power and memory of what's used today.
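In practice this means that a message handler has to treat both forms the same way. A small sketch of how that check could look (isNoteOn and isNoteOff are hypothetical helper names, not part of the API):

```javascript
var NOTE_OFF = 0x80;
var NOTE_ON = 0x90;

// True for a real note-off and for a note-on with velocity 0.
function isNoteOff(data) {
  var command = data[0] & 0xf0; // strip the channel nybble
  return command === NOTE_OFF || (command === NOTE_ON && data[2] === 0);
}

// True only for a note-on that actually starts a note.
function isNoteOn(data) {
  return (data[0] & 0xf0) === NOTE_ON && data[2] > 0;
}
```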
To send messages, you can simply take an output (enumeration works exactly the same) and use the send method:
output.send([144, 60, 127])
Optionally, you can also pass a timestamp as the second argument, which will send the message at the given time, to allow precision scheduling.
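As a sketch of how the timestamp can be used: the made-up helper below sends a note-on right away and schedules the matching note-off for later. It assumes that timestamps are in milliseconds on the same clock as performance.now(), so check that against the implementation you are using.

```javascript
// Hypothetical helper: plays a note on channel 1 for durationMs milliseconds.
// "now" is the current time on the clock the output uses for timestamps
// (assumed here to be performance.now()).
function playNote(output, note, velocity, durationMs, now) {
  // Note-on without a timestamp: sent immediately.
  output.send([0x90, note, velocity]);
  // Note-off with a timestamp: sent once that time is reached.
  output.send([0x80, note, 0], now + durationMs);
}
```

With a real output port, playNote(output, 60, 127, 500, performance.now()) should play a middle C for half a second.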
Music programs: Being able to use a real keyboard to play sounds in the browser is/would be awesome. This is probably by far the most compelling use case, as music demands a certain class of input devices.
But of course, there's more. Since the Web MIDI API aims to support the full MIDI standard, we can start building sequencers that maybe even mix Web Audio API usage with MIDI usage.
And since almost every operating system nowadays exposes a simple software synthesizer that allows you to play GM sounds (General MIDI, a later standard that specifies a set of common sounds such as pianos, strings, guitars and drums to allow for simple playback of songs with a common set of instruments), you could even use this as a lightweight way of playing songs in a game. We're kind of beyond that, really, but it's better than nothing.
Sky's the limit. As usual.
There's also a thing called OSC (Open Sound Control), which is an IP (as in Internet Protocol) based protocol that allows sending arbitrary high-resolution data. It would be awesome to have that directly in the browser as well, but due to the way this protocol is built I would not hold my breath. If you want to use it for experiments in the browser, you can use tools like OSC-web that use WebSockets and a simple node.js-based proxy to get data in and out of the browser.