Shazam for singing (toasterthoughts.eu)
122 points by przadka on Aug 17, 2023 | 32 comments


Fun post, but SoundHound has done this pretty much flawlessly for 10 years. It's also built into Google Search and all Android phones. In that context "Shazam for Singing" is an unfortunate title, since it implies that nothing like that exists and that Shazam is SOTA, when it's the other way around.

No idea how Shazam can be so far behind and still be synonymous with song identification in everyday speech.


Thank you, I appreciate your comment. The main goal here was to impress my kids with my own engineering skills :)


You made me sing Baby Shark to test Google's "What's This Song". It didn't work for me, but it recognized the song from your daughter's singing.


I was impressed by SoundHound’s ability to correctly identify even my rather ordinary singing.

I’ve assumed Shazam must simply be better. Never had a need to find out.

Now I know!


Just tested SoundHound and it got the first song wrong. I've never known Shazam to do that.


Shazam works with a signal processing technique ("fingerprinting" - https://patents.justia.com/assignee/shazam-entertainment-ltd) that aims to re-recognize the song as originally recorded - so its design goal was "Which song is playing here?" or more precisely "Which song from my database is played here (again)?" rather than "Which is the closest match from my database to this new song rendition/'cover'?".

You can think of it as a sequence of hash codes, or shingles (to borrow a term from web page similarity, useful for modeling gaps/pauses), computed over successive parts of the song.
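
To make the hash-code/shingle picture concrete, here is a minimal Python sketch. It is a toy illustration of that analogy, not Shazam's actual algorithm: the inputs are assumed to be already-quantized per-frame values (say, the dominant frequency bin in each short window), and the names `shingles`, `build_index`, and `identify` are made up for this example.

```python
from collections import Counter


def shingles(frames, k=3):
    """Return (hash, position) pairs, one per run of k consecutive frame values."""
    return [(hash(tuple(frames[i:i + k])), i) for i in range(len(frames) - k + 1)]


def build_index(songs, k=3):
    """Map each shingle hash to the (song name, position) pairs where it occurs."""
    index = {}
    for name, frames in songs.items():
        for h, pos in shingles(frames, k):
            index.setdefault(h, []).append((name, pos))
    return index


def identify(index, clip, k=3):
    """Vote per (song, offset); a real match piles its votes on one offset."""
    votes = Counter()
    for h, pos in shingles(clip, k):
        for name, song_pos in index.get(h, []):
            votes[(name, song_pos - pos)] += 1
    if not votes:
        return None
    (name, offset), count = votes.most_common(1)[0]
    return name, offset, count


# Assumed toy data: each song is a list of quantized per-frame values.
songs = {
    "song_a": [1, 5, 3, 7, 2, 8, 4, 6],
    "song_b": [9, 9, 1, 2, 3, 4, 5, 5],
}
index = build_index(songs)
match = identify(index, [3, 7, 2, 8])  # a short excerpt of song_a
```

Because the lookup needs exact shingle matches, a sung rendition whose frame values differ from the original recording would get few or no hits, which fits the "same recording, played again" design goal.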

Notably, Shazam does not aim to transcribe the lyrics, so the OP's approach may potentially claim some novelty here. In any case, this experiment shows how well large pre-trained neural language models lend themselves to rapid prototyping, perhaps to test feasibility before attempting to develop something better and more bespoke.

On a related note, Apple may be working on a similar service; at least they filed a related patent application: https://patents.google.com/patent/US6990453B2/en


Apple acquired Shazam in 2018 and it's integrated into iOS.


This is a novel, (somewhat) hand-rolled approach that I like to see on HN. As others mentioned Google's "Hum to Search" exists, but where's the fun in using Google Wheel™ when I could reinvent it? :)

Your next challenge: make it work with humming. (Hard mode: humming by someone with as terrible pitch as me.)


I think the main problem is that you often don't know the lyrics if you're trying to find a song. Recently a friend of mine hummed a song to Shazam and it did actually work!


That's funny. Reminds me of the time I had auto-Shazam going on my phone for some reason when I walked into 7-11, and this guy was singing "Because I Got High" by Afroman. He sounded enough like the actual song that Shazam triggered off it.

https://www.youtube.com/watch?v=WeYsTmIzjkw


> Recently a friend of mine hummed a song to Shazam and it did actually work!

I seem to recall that Apple announced || demoed this a couple of WWDCs ago. I didn't realize it was already deployed. I assumed it was going to be part of Siri/HomePod.


That reminds me: what ever happened to Shazam live lyrics?

You used to be able to Shazam the middle of a song, and it would start displaying the lyrics in real time. Apple even demoed it in a keynote. I haven’t seen that feature in a while.


I've seen a similar feature with the Music app on iOS and tvOS, but it only seems to work with music from Apple Music.

Using it with Shazam would be great for when you can't understand one or two lines of a song.


There are variations where you tap the beat also. I've had that work at times.


Google search does it already: https://blog.google/products/search/hum-to-search. Worked decently enough the few times I tried it.


I had more luck playing the tune on a guitar than singing it. That can work too if you're not a great singer.


I hummed a line from the middle of the song Jaljayo by Twice (I don't speak Korean) and still got a 12% match. Interestingly I also got two other matches at 6% and 5% but I've never heard these songs...


I suppose when the next GPT has audio input, you could ask it directly to identify songs.

It also makes me wonder how well ChatGPT can directly tell you lyrics verbatim, and how that would be yet another legal issue.


Good job! I am still searching for a good solution that can identify songs by singing or humming the melody. There are sites (e.g. SoundHound), but these don't work all the time. I think SoundHound works without machine learning. With the power of AI today, we could create something that works better. Like Shazam for music.


Google search does this already. My wife and I were 404-ing on the name of a song which featured a melody on pan pipes (not because we liked it, just because we remembered it from our childhood). I was pretty sure I knew the artist, but couldn't find it on Spotify. She sang the melody into Google search, and bingo! Track found.


Thanks!


SoundHound advertises singing as a primary identification vector.

https://www.soundhound.com/soundhound/


It's called query by humming, and there used to be academic competitions for it a decade ago. For this, SoundHound has always been ahead of Shazam, which focuses more on music, remixes and covers.


I thought this was how Shazam worked as well. I hummed a tune and was super disappointed it didn't identify it. Sounds like a business opportunity to me :)


Midomi.com is another service tailored to identify your renditions of a song.


Midomi is the OG in this field, I think. It eventually rebranded as SoundHound so you can still hum into any SoundHound app.


What about: “Ken Liiii, a dibba dibba dow doogh”?


For those who don't know the reference: https://youtu.be/koTCXbV0jEw


pedestrian brain: ChatGPT 4, what services can be used to identify a song by singing into a phone's microphone?

gigabrain: What python script can I write that posts an audio file I recorded to an AI I found from a Google Search, but via REST, returns the text, and then posts the texts as lyrics to ChatGPT 4 which can't tell anything after the cut off date

highly regarded


Why that long intro with that unnecessary narrative?

This could have been summarized in a tweet

"Shazam for singing: Used whisper to transcribe what I sing, then used Google to get the name of the song"
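
The pipeline in that one-line summary can be sketched roughly as follows. This is a hedged sketch, not the author's code: `transcribe` and `search` are hypothetical callables you would supply yourself (for example, wrappers around a speech-to-text model and a web search), and no real Whisper or Google API is called here.

```python
from typing import Callable, List, Optional


def identify_by_lyrics(
    audio_path: str,
    transcribe: Callable[[str], str],
    search: Callable[[str], List[str]],
) -> Optional[str]:
    """Transcribe sung audio to text, then search for that text as lyrics."""
    text = transcribe(audio_path).strip()
    if not text:
        # Nothing was recognized, so there is nothing to search for.
        return None
    results = search(f"{text} lyrics")
    # Return the top search hit, or None if the search came back empty.
    return results[0] if results else None


# Stand-in callables for illustration only (hypothetical, no network access).
queries = []


def fake_transcribe(path: str) -> str:
    return " baby shark doo doo "


def fake_search(query: str) -> List[str]:
    queries.append(query)
    return ["Baby Shark"]


title = identify_by_lyrics("recording.wav", fake_transcribe, fake_search)
```

Passing the two steps in as functions keeps the sketch independent of any particular transcription or search service.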


haha, i know but I like telling stories. thanks for the comment!


It's the tech way of recipe blogs



