> Sine and triangle were a bit more subdued but if you don't fade the sound out properly, it makes a really unpleasant kind of cutting sound due to how your ear reacts when a wave gets cut off.
The clicking sound happens when a wave is stopped while it's not at 0. You do need to fade the oscillator's gain down before stopping it.
> As I was figuring everything out, I was experimenting with making new AudioContext for every note because I was trying to fade out the sound, but then I kept realizing that after 50 notes, the app would stop working. Fifty is apparently the limit for the browser, so it's better to just make one AudioContext for the entire app.
Yeah, the reason the app would stop working when doing things this way is that even though the gain of the oscillator is zero, and the oscillator therefore isn't audible, it's still there! If you put the gain back up you can hear it again. You can't just fade out and be done with it.
So the appropriate approach (not explicitly explained in the article IMHO) is to
1/ set the gain to zero, and
2/ then kill the oscillator with stop(), so that it doesn't consume resources anymore:
o.stop(context.currentTime + 0.5)
(g being the gain and o the oscillator, and context the AudioContext).
Note: "then" means the two actions need to happen one after the other, and that is decided by parameters of the two functions; the functions themselves aren't chained and can be written in any order.
This is a terrific Web Audio tutorial! Real-time sound synth in the browser is here. There are already a few DAWs like GridSound out there. But it's still the Wild West ;)
The issue with the raw Web Audio API is that when there is heavy stuff on the main thread (in your case the visual feedback), the audio may glitch.
WASM + AudioWorklet is the SOTA solution for this, and Glicol's tech stack is Rust->WASM->AudioWorklet + SharedArrayBuffer. I'm also porting the audio engine of Glicol to an NPM package:
> when there is heavy stuff on the main thread the audio may glitch
Where have you seen that? TFA is just using oscillators, not ScriptProcessor or anything, so one would expect all the heavy audio work to be happening outside the main thread.
Personally I use webaudio to do dynamic music for a game, and even though the game is reasonably heavy (3D, physics, etc) I've not noticed any particular glitching.
You are right! I should say ScriptProcessor there.
But for heavy use of audio, especially when there are lots of interactions (often you want your own abstraction rather than the built-in nodes) and sample-level timing has to be taken into account, WASM+AudioWorklet should definitely be the first choice:
https://youtu.be/GfC2WTStW8M?t=1084
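For anyone who hasn't tried it, the general shape of the AudioWorklet approach is roughly this (the 'noise-processor' name and file are placeholders, and the WASM/SharedArrayBuffer plumbing that Glicol uses is left out):

  // noise-processor.js: runs on the audio rendering thread, off the main thread
  class NoiseProcessor extends AudioWorkletProcessor {
    process(inputs, outputs) {
      for (const channel of outputs[0]) {
        for (let i = 0; i < channel.length; i++) {
          channel[i] = Math.random() * 2 - 1;  // fill each 128-sample render quantum with white noise
        }
      }
      return true;  // keep the processor alive
    }
  }
  registerProcessor('noise-processor', NoiseProcessor);

  // main thread (inside a module or async function)
  const context = new AudioContext();
  await context.audioWorklet.addModule('noise-processor.js');
  const node = new AudioWorkletNode(context, 'noise-processor');
  node.connect(context.destination);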
Diatonic button accordion (or melodeon as we call it here in the UK) player here. I wasn't expecting to see my, rather niche, instrument on the front page of HN!
Neat app for experimentation. The bellows control is always the sticking point on electronic DBAs but this works well enough to play a simple tune. I don't think I could do tune, bellows and bass at once on this though!
On the one hand: really great write up of a great little project that is great at helping people experiment with synthesis and software instrument design.
On the other hand: another example of a development platform ("web browsers") being utilized for something it is fundamentally not designed for, just because it leverages the skills someone gained along the way.
I mostly wish this would stop. People are going to build all kinds of awesome toys using web audio APIs, and every single one of those toys is going to be less performant (latency, CPU/DSP load, interface responsiveness, sensor interactivity) than its native equivalent. That means that its use will face limitations that are a function of the development platform rather than the designer or performer.
On the other hand, how can anyone be against people learning more about synthesis and instrument design?
Maybe the web was initially not designed for doing these kinds of things, but on the other hand it is basically the only platform that enables people to click a link and immediately start using whatever app is behind the URL. That's fundamentally why the web has grown as much as it has, and why there is huge incentive for people to build more APIs that let people do more with that platform.
I suppose you could in theory build a similar kind of platform that allows you to do the same things in better ways. There may even be a lot of money there. But it hasn't happened, and so it makes a lot of sense IMO to build these kinds of apps.
I think it goes way beyond toys. Here's my use case: I'm building an audio sample pipeline. And I want people to be able to preview combinations of loops and samples in the browser. Asking someone to download a native app just to preview is not awesome: https://signalsandsorcery.org
> As I post in the comments below, we now have WASM, so C++ and Rust can all run in browsers. This can provide a near-native audio performance.
Native audio apps require realtime scheduling and memory locking, things you cannot do/control in the browser.
The problem is that you can likely get 70-85% of "it" done in the browser, but when the user/performer needs the remaining 15-30% for whatever reason, what do they do then?
Very rarely do you get that level of control even on modern systems. People aren't making music on ASIO-capable sound cards, they're using ASIO4All which implements ASIO on top of your boring MME/WASAPI audio stack. On any modern OS, you don't write directly to the sound card, you write to a buffer which the system mixer does the FX graph on, then that goes to the underlying sound system. CoreAudio, WASAPI and PipeWire are all built like this.
The realtime-thread problem is addressed by AudioWorklet, which allows a high-priority thread.
We probably won't get to the level of 100% assembly programming, but we can get close, and I think that's fine.
Where I get upset at Web Audio is the amount of time spent in all the oscillators and FX nodes; I think you're better off ignoring those just because of how limiting and awkward they are.
> People aren't making music on ASIO-capable sound cards,
sorry what :) I don't know anyone who does even a bit of remotely serious music making who doesn't have at least some USB Focusrite or something like that
> On any modern OS, you don't write directly to the sound card, you write to a buffer which the system mixer does the FX graph on, then that goes to the underlying sound system. CoreAudio, WASAPI and PipeWire are all built like this.
Paul Davis is the author of JACK and Ardour fyi, I think they have a good idea of how things work :p Buffers are unavoidable, but I don't see webaudio allowing me to use a 64-frame buffer size and still be able to put in some effects and play with some softsynths like I can right now, or even being able to run isolcpus and tweak DMAs to entirely devote specific CPU cores to audio processing.
For "people" read amateurs who haven't even heard of Focusrite (I hadn't.)
I expect many people fooling around with GarageBand or VCV Rack don't have any special sound card. Also, it doesn't seem that uncommon for musicians to make recordings using a cell phone?
Nobody makes finished recordings for distribution with a cell phone. Field recordings for use as samples? Sure. Something to get the overall feel of a performance? Sure. A dedicated mic on an instrument because nothing else was available? Sure.
But recording when it matters is dependent on the microphone, the pre-amp and the A/D converter, none of which are of suitable quality in a cell phone to be the basis for a "serious" recording.
This is not what I see on YouTube. Professional recordings are outnumbered by amateur ones. They are as "finished" as they're going to be and sometimes they get a big audience.
Sure, maybe most professionals don't do that but you said nobody does that, as if amateurs don't exist.
Well, I guess there are also recordings with a line-in from a mixing console, with the mics & pre-amps on the other side of that, and sure, that's a thing, and I ignored that in my comments. That setup gives you just-about-good-enough quality (certainly way better than ye olde "direct from desk" bootlegs of times past).
I was thinking more of setups where you connect a mic/pre directly into the phone.
Most of the time I've seen people doing this, however, they are using "native" recording apps on their phones, not a browser.
> Very rarely do you get that level of control even on modern systems.
Windows, macOS, Linux all offer this control.
AudioWorklet does not offer much control of thread priority (read the source code of any DAW to see how much is actually done), and also isn't clearly appropriate for parallelization of DAW processing, which is a standard architecture in any current DAW.
It's not really about asm programming. It's about the ability to interact with the host OS in all the appropriate and required ways. Browsers are extremely unlikely to permit this, for a variety of very sensible reasons.
It's a similar dilemma to the one we have with general-purpose browsers. To push things to the limit, we can only resort to bare metal or a Xenomai Linux platform like Daisy or Bela. But for the same reason we use Windows and macOS, we trade off some audio performance for a better overall experience.
Most of the problems with the lower limit on audio latency these days come down to hardware (motherboard/PCI bus level) rather than the OS. Both macOS and Linux can routinely handle values significantly lower than the 10msec mentioned in the paper you cited, given the right hardware (more or less guaranteed for macOS, much harder to ensure for Linux).
Yep. Pure Data and SuperCollider can also run in the browser.
But there's a difference between C/C++/Rust audio apps compiled to WASM on the one hand and writing audio code in JS on the other hand.
Then again, I think it's perfectly ok for people to experiment with audio in browser. "Native" audio desktop applications will always be superior, so there's nothing to be afraid of.
Great post! I'm very enthusiastic about the Web Audio API. I'm currently building an audio sample pipeline and enabling users to preview combinations of loops via the API!