Fun post, but SoundHound has done this pretty much flawlessly for 10 years. It's also built into Google Search and all Android phones. In that context "Shazam for Singing" is an unfortunate title, since it implies that nothing like that exists and that Shazam is SOTA, when it's the other way around.
No idea how Shazam can be so far behind and still be synonymous with song identification in everyday speech.
Shazam works with a signal processing technique ("fingerprinting" - https://patents.justia.com/assignee/shazam-entertainment-ltd) that aims to re-recognize the song as originally recorded - so its design goal was "Which song is playing here?" or more precisely "Which song from my database is played here (again)?" rather than "Which is the closest match from my database to this new song rendition/'cover'?".
You can imagine it as a sequence of hash codes, or shingles (to borrow a term from Web page similarity, useful here for modeling gaps/pauses), computed over successive parts of the song.
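To make the "sequence of hash codes or shingles" picture concrete, here is a minimal sketch in Python. It is not Shazam's actual algorithm: it assumes each song has already been reduced to a list of dominant frequency bins, one per time frame (the toy numbers below are made up), and the names `shingles`, `build_index` and `identify` are hypothetical. It only illustrates why a re-recording of the original matches while a sung, transposed rendition does not.

```python
from collections import Counter, defaultdict

def shingles(peaks, k=3):
    """Yield (key, offset) for every run of k consecutive peak bins."""
    for i in range(len(peaks) - k + 1):
        yield tuple(peaks[i:i + k]), i

def build_index(songs, k=3):
    """Map each shingle to the (song, offset) places where it occurs."""
    index = defaultdict(list)
    for name, peaks in songs.items():
        for key, offset in shingles(peaks, k):
            index[key].append((name, offset))
    return index

def identify(index, query, k=3):
    """Vote for (song, time shift) pairs; consistent shifts pile up."""
    votes = Counter()
    for key, q_offset in shingles(query, k):
        for name, s_offset in index.get(key, []):
            votes[(name, s_offset - q_offset)] += 1
    if not votes:
        return None
    (name, _shift), _count = votes.most_common(1)[0]
    return name

# Toy data (assumption): one dominant frequency bin per frame.
songs = {
    "song_a": [5, 9, 2, 7, 7, 3, 8, 1, 4, 6],
    "song_b": [1, 1, 4, 9, 6, 2, 5, 3, 8, 0],
}
index = build_index(songs)

excerpt = songs["song_a"][3:8]           # the original recording, mid-song
transposed = [p + 2 for p in excerpt]    # same tune, sung in another key

print(identify(index, excerpt))      # matches song_a
print(identify(index, transposed))   # no exact shingle survives: None
```

Exact-match hashing is what makes this kind of lookup fast and robust for the original recording, and also what makes it blind to a cover in a different key or tempo.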
Notably, Shazam does not aim to transcribe the lyrics, so the OP's approach may claim some novelty here. In any case, this experiment shows how well large pre-trained neural language models lend themselves to rapid prototyping - perhaps to test feasibility before attempting to develop something better and more bespoke.
This is a novel, (somewhat) hand-rolled approach of the kind I like to see on HN. As others mentioned, Google's "Hum to Search" exists, but where's the fun in using Google Wheel™ when I could reinvent it? :)
Your next challenge: make it work with humming. (Hard mode: humming by someone with as terrible pitch as me.)
I think the main problem is that you often don't know the lyrics if you're trying to find a song. Recently a friend of mine hummed a song to Shazam and it did actually work!
That's funny. Reminds me of the time I had auto-Shazam going on my phone for some reason when I walked into 7-11, and this guy was singing "Because I Got High" by Afroman. He sounded enough like the actual song that Shazam triggered off it.
Recently a friend of mine hummed a song to Shazam and it did actually work!
I seem to recall that Apple announced or demoed this a couple of WWDCs ago. I didn't realize it was already deployed. I assumed it was going to be part of Siri/HomePod.
That reminds me: what ever happened to Shazam live lyrics?
You used to be able to Shazam the middle of a song, and it would start displaying the lyrics in real time. Apple even demoed it in a keynote. I haven’t seen that feature in a while.
I hummed a line from the middle of the song Jaljayo by Twice (I don't speak Korean) and still got a 12% match. Interestingly I also got two other matches at 6% and 5% but I've never heard these songs...
Good job! I am still searching for a good solution that can identify songs from singing or humming the melody. There are sites (e.g. SoundHound), but they don't work all the time. I think SoundHound works without machine learning. With the power of AI today, we could create something that works better. Like Shazam for music.
Google Search does this already. My wife and I were 404-ing on the name of a song which featured a melody on pan pipes (not because we liked it, just because we remembered it from our childhood). I was pretty sure I knew the artist, but couldn't find it on Spotify. She sang the melody into Google Search, and bingo! Track found.
It's called query by humming, and there used to be academic competitions for it a decade ago. And for this, SoundHound has always been ahead of Shazam, which focuses more on music, remixes and covers.
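For anyone wondering what query by humming looks like in its simplest textbook form, here is a rough Python sketch. It is not how SoundHound or Google do it (I don't know their internals). It assumes the hummed audio has already been turned into MIDI-style note numbers, compares pitch intervals rather than absolute pitches so the key doesn't matter, and uses plain edit distance to tolerate a few wrong notes. The function names and the two toy melodies are my own.

```python
def intervals(notes):
    """Semitone steps between consecutive notes (key-independent)."""
    return [b - a for a, b in zip(notes, notes[1:])]

def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,             # deletion
                           cur[j - 1] + 1,          # insertion
                           prev[j - 1] + (x != y))) # substitution
        prev = cur
    return prev[-1]

def best_match(melodies, hummed):
    """Return the name of the melody whose interval sequence is closest."""
    q = intervals(hummed)
    return min(melodies,
               key=lambda name: edit_distance(intervals(melodies[name]), q))

# Toy database (assumption): opening notes as MIDI note numbers.
melodies = {
    "twinkle": [60, 60, 67, 67, 69, 69, 67],
    "ode_to_joy": [64, 64, 65, 67, 67, 65, 64],
}

# Hummed three semitones higher, with one note off pitch.
hummed = [63, 63, 70, 70, 72, 71, 70]
print(best_match(melodies, hummed))
```

Real systems also have to cope with tempo changes, pitch tracking errors and people like the commenter above with terrible pitch, which is where the hard part starts.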
I thought this was how Shazam worked as well. I hummed a tune and was super disappointed it didn't identify it. Sounds like a business opportunity to me :)
pedestrian brain: ChatGPT 4, what services can be used to identify a song by singing into a phone's microphone?
gigabrain: What python script can I write that posts an audio file I recorded to an AI I found from a Google Search, but via REST, returns the text, and then posts the texts as lyrics to ChatGPT 4 which can't tell anything after the cut off date