Audial Interfaces

Background

First, a few notes about why, generally, there is an opportunity in this space right now (or soon).

  1. Digital media consumption has skewed strongly towards our eyes. One reason behind this is that the technology for information distribution via our eyes has been ahead of the technology for our other senses (and has been for a very long time).
  2. As computers get stronger and smarter, we’re opening up new opportunities for inputs and outputs with computers. This will create unique opportunities for new companies and services.
  3. Audio is undergoing the most significant transformation, led by a few factors: ubiquitous microphones and speakers (AirPods and many others), NLP starting to work, and handheld computing power.
  4. There is a lot of public experimentation in this space, but not much is really sticking. We are still looking for the killer apps.

Experiments

As a hobby for the past few years, I’ve been immersing myself in this space, trying to understand it better at a personal level and to learn what technology is available. The following list covers themes of utility that I feel are emerging and of interest:

Music / Song capture

Use Case

I’m a musician. Often the act of writing a song keeps my hands busy while I need to capture an audio moment that’s evolving organically. You never really know when the “good stuff” will happen, but you need to be prepared for it. I started using Voice Memos to capture this process and to try to extract the moment in which the song comes to life (as an aside, this is very important to capture: usually the song itself is trying to re-create the energy of this moment).

Challenges

Ideas

  1. Ability to annotate the timeline via realtime input: after doing this for a while, I wanted to be able to flag or annotate the audio timeline via the audio I was capturing. This could take the form of notes to myself during the recording, but could also be used to edit without a computer. For instance, “delete the last few minutes of silence” could be used after leaving the room for a few minutes and realizing I had left the recording on, or “split audio file here” could split the file in two.
  2. Ability to combine multiple microphones in a room for better quality audio: AirPods, multiple iPhones, and an iPad would be present, but I could only record using one of them. It would be really great if I were able to record on all of them at the same time and then stitch the files back together to increase the fidelity.
  3. Smart editing: Voice Memos should follow Google’s lead and provide transcription and tools to easily cut silence or minimize noise.
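
As a rough illustration of the “delete the last few minutes of silence” and “split audio file here” ideas, here is a minimal Python sketch. It assumes the recording is already available as a list of samples normalised to the range -1.0 to 1.0; the function names, the threshold, and the frame size are all hypothetical choices for illustration, not part of any existing tool.

```python
from typing import List, Tuple


def trim_trailing_silence(
    samples: List[float],
    threshold: float = 0.02,   # assumed amplitude below which audio counts as silence
    frame_size: int = 4,       # assumed number of samples inspected per step
) -> List[float]:
    """Drop frames from the end of the recording while they stay below the threshold."""
    end = len(samples)
    while end > 0:
        start = max(0, end - frame_size)
        frame = samples[start:end]
        # Stop as soon as a frame contains something louder than the threshold.
        if max(abs(s) for s in frame) >= threshold:
            break
        end = start
    return samples[:end]


def split_at(samples: List[float], index: int) -> Tuple[List[float], List[float]]:
    """Split one recording into two at the given sample index."""
    return samples[:index], samples[index:]


recording = [0.5, -0.4, 0.3, 0.2, 0.0, 0.01, 0.0, 0.0, 0.0, 0.0, 0.005, 0.0]
trimmed = trim_trailing_silence(recording)
first_half, second_half = split_at(trimmed, 2)
```

A real recording would use much larger frames (for example, a fraction of a second of samples), and the trim happens at frame boundaries, so a little silence may remain at the end.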

Audio Journalling

Use Case

I like to talk, but I type all day. I know journalling is important, but I’ve always had a hard time incorporating it into a daily practice. That is, until I started audio journalling. The act of talking to myself as a form of self-reflection feels really natural. It took a moment, but once I got over the initial hurdle of the stigma of talking to myself, I found this to be a great way to explore thoughts and feelings.

Challenges

Again, similar to music capture, management of audio is labour-intensive. So while I am technically capturing these journal entries, it is challenging to extract any value from them unless I am as diligent in editing as I am in journalling. Also, just like with music, there may be an epiphany or a discrete snippet of value, but I have no way to easily isolate that value while recording.

And, when it comes to editing, written text is much more mutable than audio (with our current toolset). Descript is trying to change that, but the workflow still feels limiting.

Ideas

  1. Ability to execute ‘programs’ while journalling: the music capture use case was often set-it-and-forget-it (my focus would be mostly on the music), whereas while journalling I’m much more aware of the tool. This would let me not only annotate the timeline, but also execute programs while speaking. For instance, I’d like to be able to add something to a task list while journalling. Or, I’d like to capture an idea in written form in my notes. Or, perhaps I’d want to capture a few links to Google searches I’d like to perform at a future date. Or, I’d like to track (rate) my mood, sleep, and / or how my body feels at the beginning or middle of a journal entry. All of this could be done if the audio capture service could recognize keywords.
  2. Text-based editing like Descript’s is a great feature, but it should be the default editing mode of an audio capture tool rather than an entire workflow.
  3. It would be interesting to integrate this type of journalling feature into a health-based app, something like Headspace.
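
To make the keyword idea from the first item a little more concrete, here is a small Python sketch. It assumes a speech-to-text step has already turned the journal entry into a list of sentences; the trigger phrases (“add task”, “note to self”, “search later for”, “rate my mood”) and the `Action` type are hypothetical, chosen only to illustrate how spoken keywords could map to actions.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Action:
    kind: str      # e.g. "task", "note", "search", "mood"
    payload: str   # the spoken content attached to the keyword


# Hypothetical trigger phrases; a real service would need a far more forgiving matcher.
PATTERNS = [
    ("task", re.compile(r"^add task[:,]?\s+(.+)$", re.IGNORECASE)),
    ("note", re.compile(r"^note to self[:,]?\s+(.+)$", re.IGNORECASE)),
    ("search", re.compile(r"^search later for\s+(.+)$", re.IGNORECASE)),
    ("mood", re.compile(r"^rate my mood\s+(\d+)$", re.IGNORECASE)),
]


def extract_actions(sentences: List[str]) -> List[Action]:
    """Return one Action per sentence that starts with a known trigger phrase."""
    actions: List[Action] = []
    for sentence in sentences:
        text = sentence.strip().rstrip(".")
        for kind, pattern in PATTERNS:
            match = pattern.match(text)
            if match:
                actions.append(Action(kind, match.group(1).strip()))
                break  # at most one action per sentence
    return actions


entry = [
    "I slept badly last night.",
    "Add task: call the plumber.",
    "Rate my mood 6.",
    "Search later for plywood cladding.",
]
actions = extract_actions(entry)
```

Sentences that match no trigger phrase are simply left in the journal entry, so ordinary speech is never interpreted as a command.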

Short term memory

Use Case

As I have been immersing myself in the design space of audio, I have become very sensitive to “pressing the red button.” There are two effects to describe. 1) If you are in a scenario and you wish you were recording what was happening (conversation, music, personal train of thought), the act of initializing recording disrupts the scenario, often taking away from whatever flow was happening. This phenomenon is not exclusive to audio: any time we reach for our phones to do something, we are taking away from the thing that instigated that action. 2) Knowledge of being recorded often changes your actions. It takes some time to build comfort with the idea that every sound you are making is being archived. This, too, can get in the way of flow, because some part of your mental energy is being spent on self-awareness that could be spent on the mental task at hand.

So, I asked myself the question: what if we could remove the red button? What if capture was a retroactive process? If we were capturing everything, we could just go back to the moment we wanted to archive and do something to “commit it to long term memory.” I started recording more and longer sessions. In meetings, or in conversations with my staff or friends, I would start recording at the beginning and let it run.
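
One way to picture retroactive capture is a rolling buffer that keeps only recent audio and lets you pull out the last stretch on request. The Python sketch below is illustrative only: the `RollingRecorder` class, its method names, and the retention window are assumptions, and the audio is represented as timestamped byte chunks rather than a real microphone stream.

```python
from collections import deque
from typing import Deque, Tuple


class RollingRecorder:
    """Keeps only the most recent audio; older chunks are forgotten automatically."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention_seconds = retention_seconds
        self._chunks: Deque[Tuple[float, bytes]] = deque()

    def push(self, timestamp: float, chunk: bytes) -> None:
        """Store a new chunk and drop anything older than the retention window."""
        self._chunks.append((timestamp, chunk))
        cutoff = timestamp - self.retention_seconds
        while self._chunks and self._chunks[0][0] < cutoff:
            self._chunks.popleft()

    def commit_last(self, now: float, seconds: float) -> bytes:
        """Return the audio from the last `seconds`, to be saved permanently."""
        start = now - seconds
        return b"".join(chunk for ts, chunk in self._chunks if ts >= start)


# Simulate 15 one-second chunks with a 10-second short term memory.
recorder = RollingRecorder(retention_seconds=10)
for t in range(15):
    recorder.push(float(t), bytes([t]))

saved = recorder.commit_last(now=14.0, seconds=3)
```

Nothing is archived until `commit_last` is called, so the “red button” becomes a decision made after the moment has happened rather than before it.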

Challenges

Ideas


Long form audio messages

Send long form messages to others (or yourself).

Challenges

Ideas


Meeting notes

Recording meetings to add accessible functionality to the meeting.

Challenges

Ideas

Various overarching technical considerations

  1. Audio Monitoring
  2. Audio Ducking
  3. Multi-microphone recording
  4. Noise reduction

Various audio functions

“Hey Alfred, save and transcribe the last 30 mins of audio.” “Would you like me to tag it with anything?” “Take the tags from this sentence: Conversation with Kristina about our secondary dwelling.”

“Hey Alfred, let me listen to audio from last Thursday.” “I’m sorry, it doesn’t seem that you’ve saved anything from last Thursday. Please keep in mind, my memory only lasts 24 hours.”

“Tag—idea: clad the inside of the studio space in plywood. This will enable modularity in what we do inside. End tag.” No response
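
The spoken tag in the last example could be pulled out of a transcript with a simple pattern. This is a minimal Python sketch under the assumption that transcription renders the markers as the literal words “Tag”, a label, a colon, and “End tag”; the `extract_tags` function and the pattern are hypothetical, and real transcripts would likely need fuzzier matching.

```python
import re
from typing import List, Tuple

# "Tag", any punctuation or space, a one-word label, a colon, the body, then "End tag".
TAG_SPAN = re.compile(
    r"\btag\W+(?P<label>\w+)\s*:\s*(?P<body>.*?)\s*\bend tag\b",
    re.IGNORECASE | re.DOTALL,
)


def extract_tags(transcript: str) -> List[Tuple[str, str]]:
    """Return (label, body) pairs for every spoken tag span in the transcript."""
    return [
        (m.group("label").lower(), m.group("body"))
        for m in TAG_SPAN.finditer(transcript)
    ]


transcript = (
    "We walked around the studio. "
    "Tag, idea: clad the inside of the studio space in plywood. End tag. "
    "Then we had lunch."
)
tags = extract_tags(transcript)
```

Because the span is delimited on both sides, the assistant can stay silent (as in the example) and file the tagged text away without interrupting the speaker.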

Reference Points