It's about fourteen kinds of ridiculous, as summarized in other threads. No rhythms, no meter, no tempo, melodies are longer than 12 notes, it's diatonic, single octave, no concept of underlying harmony, the headline is literally false, etc.
Some of the copyright lawsuits are dumb and this is effective satire or performance art but that's all it is.
It's definitely satire, but it's satire in the face of comical law. That's the point. If copyright lawyers want to argue originality based on an arrangement of notes in a 12 tone scale, and in a limited number of bars, then this is a completely valid argument against such a weak argument.
The reality is that many number one songs can be tonally compared to many classical pieces, or even pieces from the last 40 years. The current state of music copyright law is an absolute joke, and deserves to be "disrupted" (destroyed).
You're not gonna get away with copyrighting a 4-letter sequence, but the recent Katy Perry vs Flame lawsuit established that 4 notes is all it takes to have a copyrightable melody.
Usually short phrases aren’t supposed to be protectable under copyright. However, when a defendant blatantly appropriates a well-known literary phrase for a commercial purpose like selling unlicensed merchandise, courts may make an exception.
Yep. One of the arguments no sane person should countenance is that Katy Perry's songwriters happened to use a four note descending synth arpeggio in an intentional attempt to cash in on the fact a four note descending arpeggio in a different key was a motif used on one of the sixteen tracks of an album which hit number five in the Gospel Charts four years earlier. For similar reasons, Universal isn't going after most of the 3.8m websites using the phrase 'phone home', and you're probably OK using three stripes in artwork unless you're drawing them on the shoulders of sportswear or sides of shoes to make it look like Adidas.
[there actually are musicians that specialise in recording backing tracks intended to resemble a particular popular recording which aren't that recording for use in commercial products, but they tend not to get sued...]
Copyright law considers the importance of a sample to the work as a whole, in addition to just the size. Any arbitrary three-word phrase from the middle of a script is probably not copyright infringement. The most memorable line from the entire movie probably is.
Related: The Supreme Court ruled that excerpting a single paragraph from a 454-page book can be copyright infringement. The book was Gerald Ford's memoirs, the one paragraph was his reasoning for pardoning Nixon. The Court's reasoning was more-or-less that nobody cared about anything else Ford did, so excerpting the one paragraph was as good as giving the whole book away for free.
Wouldn’t that be considered something more like an implicitly-created trademark? It’s essentially the equivalent of a company motto for the movie’s SPV company.
I would note that 4 minutes and 33 seconds of silence is copyrighted by John Cage as " 4'33" ", and a single chord sustained for 20 minutes, followed by 20 minutes of silence, is copyrighted by Yves Klein as "The Monotone-Silence Symphony".
Cage's case is more complicated and less a violation than it may sound at first. It's not 4'33" of "mathematically true silence", that is, you don't violate it per se just by having the right number of zeros in your .wav file. It's 4'33" of performed silence, where the performance is actually the incidental noise of the auditorium it is in. Having a copyright on this particular piece specifically in a performance context may still be an interesting thought experiment, but it doesn't break the system as a whole.
In the latter case, there's also no real risk of accidentally stomping on that.
The claim in this particular case is that they really have generated the entire possible melody space. Legally I think it's likely to fail on multiple levels if it is ever challenged, but part of the point is that some of those failures should also be applied to some real copyright suits that have been won.
(It is somewhat ironic that the music industry continues to be so upset about copyright even as they appear to be converging on The One True Pop Song at speed. Maybe if they acted less like some sort of bizarrely over-trained AI and cranked up the exploration constant, they'd stomp on each other less.)
>It's 4'33" of performed silence, where the performance is actually the incidental noise of the auditorium it is in.
That's an incorrect way to view it actually. The copyrightable essence of 4'33" is actually represented by the active production of its scoring. I.e. nothing. The background sound is not what makes it copyrightable. You can go ahead and sit at a piano for the length of the composition all you want, wherever you want, and you'll still be publicly performing 4'33".
The rather humorous outcome, if one asks me, is that anyone who writes 4 beats of silence into a score should be violating copyright if we're going to be consistent.
Some smartass artists have actually written their silent compositions as rhythmically structured rests. That is, something like "silence in 6/4 time, as three quarter rests and three eighth-triplet rests". It's intentionally absurd, but as the minimum-information notation would be the full-measure rest glyph and the number of measures of duration, the deviation from this is undoubtedly creative content that is copyrightable on paper.
That kind of thing is completely unenforceable with respect to performances, but in written musical notation, copying the specific notation pattern could be infringement. If you write "4/4, tempo 80, 91-measure rest", that's maybe violating the 4'33" copyright. If you write a score for a full band or orchestra that shows rests in each measure for each instrument, with key changes and tempo changes and such, you're just retelling the same joke in a different way.
> Having a copyright on this particular piece specifically in a performance context may still be an interesting thought experiment
Not even a thought experiment: it’s essentially a 4’33”-long ambient acoustic sample. There are plenty of these (though not usually that long) in sample libraries, recording e.g. traffic sounds, or diner conversation, or crickets in a marsh in summer, etc. And those are certainly copyrighted, unable to be used without license.
>Not even a thought experiment: it’s essentially a 4’33”-long ambient acoustic sample.
I would question any legal professional's authoritative standing to even advise on copyright of a work of music if they miscategorize a recording of ambient sound as a performance of a musical scoring consisting entirely of silence. The copyright doesn't apply to the ambient sound, but to the long quiescence of an artist at their instrument.
It demonstrates a complete blindness of the negative space of music, and a positivistic bias that has no place being enshrined in our legal system.
Exactly! What they're doing would otherwise seem pointless, but given how far off the rails the courts went on that lawsuit, copyright jurisprudence needs this kind of sanity check, because courts clearly aren't using enough criteria to call a melody "copied".
The whole concept of IP is absurd, and there are many absurd consequences you can derive from it.
But there are degrees of absurdity: it's one thing to do that when there are 26^100000 possible combinations, and another when there are just 12^100 (and if you only care about melody that's an overestimate, since most songs will use a much smaller subset of that).
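To put rough numbers on that gap, here is a back-of-the-envelope sketch in Python. The 26^100000 and 12^100 figures are the ones from the comment above; the helper name `decimal_digits` is made up for illustration, and it only measures how many decimal digits each count has.

```python
import math

def decimal_digits(base: int, length: int) -> int:
    """Number of decimal digits in base**length, computed via logarithms
    so the huge integer never has to be printed or stored."""
    return math.floor(length * math.log10(base)) + 1

text_digits = decimal_digits(26, 100_000)   # letter sequences, per the comment
melody_digits = decimal_digits(12, 100)     # 12-tone sequences, per the comment

print("digits in 26^100000:", text_digits)
print("digits in 12^100:   ", melody_digits)
```

Both counts are far beyond anything you could enumerate, so the comparison is about relative scale only; the parenthetical point that real melodies occupy a much smaller subset is what actually brings collisions within reach.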
It's not like this absurdity was unknown to the framers of copyright. Thomas Jefferson, writing regarding patents, wrote "other nations have thought that these monopolies produce more embarrassment than advantage to society..." And yet, they persist (and in the modern era, proliferate) because it's generally agreed society derives benefit from paying people for their purely-idea creative work.
... but that doesn't mean copyright and patent aren't a perpetual battle against the "natural" arrangement of ideas, and absurdities are extremely possible when the law is misinterpreted or mis-structured.
It's weird -- legal arguments aren't protected under any sort of intellectual property regime. In fact, in common law jurisdictions, you're encouraged to base your work off that of previous practitioners. I always wonder how different our intellectual property regime would be if lawyers demanded to be paid royalties for others citing cases which they had won.
Copyright as an institution is grounded on arguments of net societal benefit, and it doesn't take much argument to demonstrate that there is net societal harm to intentionally diminishing people's ability to comply with the law.
Somewhat tangentially, in some fields and jurisdictions, the legal code which is enforced is copyrighted and paywalled by third parties or the Gov. itself.
> And yet, they persist (and in the modern era, proliferate) because it's generally agreed society derives benefit from paying people for their purely-idea creative work.
Who "generally agree"s on this? The existence of laws doesn't indicate the mood of society.
> The existence of laws doesn't indicate the mood of society.
But if laws are proliferating and regularizing instead of standing still or being abolished, and one assumes that elected representatives are acting on the will of the people, it probably does.
There's lots of controversy over how to improve copyright / patent law, but not very many people in governments in the EU, US, China, Japan, Australia, &c are talking seriously about just burning the whole copyright / patent system to the ground. At least a subset of the countries in the groups listed are generally understood to have representative governments.
Representative governments are well known for supporting powerful groups over the collective benefit of society. Just look at any of those countries' tax codes and you will find a multitude of special exceptions.
Patents are of limited duration because the tradeoffs of unlimited patents are so horrific. If we accept a billion dollar drug must enter the public domain, clearly copyright should also be limited just as it was proposed in the US constitution. However, because a tiny minority has a huge benefit and society does not really notice the difference you get the modern mess of unending copyright.
"A tiny minority has a huge benefit and society does not really notice the difference" seems like a utilitarian argument that the system is working as intended; there is net positive benefit.
If it costs 350 million people 1 cent to hand me 1 million dollars they don’t notice, yet that’s a net loss. Critically, even when you include those who benefit it’s still a loss.
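Spelling out that arithmetic as a tiny Python sketch, under one reading of the comment: each of 350 million people pays 1 cent, and a single recipient ends up with 1 million dollars. The constant names are invented here, and amounts are kept in integer cents to avoid rounding.

```python
PEOPLE = 350_000_000
COST_PER_PERSON_CENTS = 1
PAYOUT_CENTS = 1_000_000 * 100   # 1 million dollars, expressed in cents

# Total amount taken from everyone, in cents.
total_cost_cents = PEOPLE * COST_PER_PERSON_CENTS

# Net change for society as a whole, counting the recipient's gain.
net_cents = PAYOUT_CENTS - total_cost_cents

print(f"total cost: ${total_cost_cents / 100:,.2f}")
print(f"payout:     ${PAYOUT_CENTS / 100:,.2f}")
print(f"net:        ${net_cents / 100:,.2f}")
```

On this reading the collective cost is 3.5 million dollars for a 1 million dollar benefit, which is the "still a loss even when you include those who benefit" point.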
Only if you consider money completely fungible and use money as your utility function for determining loss here.
You haven't diminished the ability for 350 million people to do things, practically, by shaving 1 cent off of them. But adding the ability a million dollars provides one individual to do something cool with 1 million they couldn't do before has increased the overall capabilities of everyone.
In essence, you've just described Kickstarter's business model.
> "A tiny minority has a huge benefit and society does not really notice the difference" seems like a utilitarian argument that the system is working as intended; there is net positive benefit.
If tiny minority + 'society' is the entirety of the system, then that's true, but there are also plenty of players who lose out due to restrictive IP regimes—and it's hard to quantify the extent of those losses. (Whether or not the benefits they would reap from looser IP are appropriate or fair is beside the question for utilitarian computations.)
> Patents are of limited duration because the tradeoffs of unlimited patents are so horrific.
What would be the point of a patent system with unlimited duration? If we wanted that, we could just have companies not reveal their inventions in the first place.
I think IP is required to monetize valuable ideas, and monetization leads to social availability.
On the personal level, I would not like it if publishers could freely print the works of new authors, or engineering solutions I spend years on could be copy and pasted.
Some people prefer Creative Commons, and they are free to publish their work that way. Others need or want financial compensation.
> I generally agree that society benefits from IP.
I certainly didn't mean to say that no-one agrees with it, but your personal agreement doesn't evince the general agreement that shadowgovt (an interesting username, in this context …) suggested. To be fair, neither does my skepticism provide any evidence against it.
I think we can align on the lack of evidence presented. There is also the question of what "agreed to" means in this sense.
In terms of public opinion, it would be interesting to know what studies have been done. I imagine you would get broad support for an author copyrighting a book, and less for patenting a pre-existing genetic sequence.
As another unsubstantiated claim, I think if you sat down with the general public and the criteria for patents, they would mostly agree.
The challenge has to do with implementing them and the legal process around them.
If a crappy patent is issued to a large corporation, it is incredibly expensive to challenge.
They persist because they allow some people to create extreme monetization strategies, which feeds into lobbying Congress for further expansion of copyright.
That doesn't explain why copyright regimes similar to the US persist in countries around the world, or why countries are finding their way to treaties that standardize international copyright enforcement that look more like the US regime than other countries' regimes.
The US has a lot of clout in the world and exports its laws and culture abroad pretty heavily. Countries that have tried to outlaw Coke or Cigarettes are usually sued into the ground by large US corporations and, when that fails, the US has used sanctions to back up the accessibility of US products.
Basically, America's crazy overreach forces our laws onto other nations. This is actually one thing that really frustrates me about corporate tax loopholes: that overreach could trivially be used to force better international standards for corporate VAT taxes, but there just isn't the political will (due to lobbying) to get it done.
At the risk of getting stuck in a loop: "No rhythms, no meter, no tempo, melodies are longer than 12 notes, it's diatonic, single octave, no concept of underlying harmony"
Some people would tell you it's an ill-posed question.
Some might say lumping trademark, copyright, patent and trade secret laws (historically and in practice very different things) under one heading called "intellectual property" is an intentional strategy to muddy the waters and cloud any argument.
And interesting to include in a study like that how much people in each field actually understood about copyright law. My guess is the general level of understanding is pretty low.
So I can write a program which can generate your name and your sexual preference (among a lot of garbage data). Does that mean this can no longer be considered private information subject to privacy laws?
You can use a ridiculous argument for many things.
The garbage data makes all of the difference. Said accidental generation would make it legal as the only way to pick out the data is to know it already. Otherwise what separates your real name from "Name: Seymoure Butts Sexual Orientation: Mayonnaise"?
The fact there is no garbage data at all shows that they are doing judgement on a nakedly wrong level in music - even by the standards of copyright.
You can't get away with selling "Harry Potter but with the capitalization inverted" to dodge book copyright, nor even with "this key and this very long block of data which happen to decrypt to the complete works of JK Rowling, don't decrypt because that would be infringement wink wink nudge nudge".
I would presume the vast majority of the automatically generated music is basically garbage?
If we say that music (and really any information) is just numbers which can be enumerated automatically, then surely the creative action is finding and picking a number which is actually interesting out of the infinite sea of random garbage.
My point is that framing music as "just numbers" does not disprove that producing a song is a unique creative work. There may be valid arguments against copyright, but this one isn't.
Yes, of course it means that. I might randomly generate data to test with. Doesn't mean it's private and needs to be treated as such if I accidentally and unknowingly stumble across real info.
This actually reveals an important truth about the nature of information: Information is often better understood as exclusionary, rather than somehow "creative". If I have a "thing", you don't know what color it is. If I now tell you it is "red", you still don't know the exact shade, but I have excluded a lot of possibilities. How informative my statement is depends on how much is excluded. If I name an exact Pantone color, I am being much more informative.
In some sense, looking at information as being exclusionary and as being inclusive are the same thing, but there's a lot of ways in which the former actually makes more sense as a thought framework.
And in this particular context we can see how that plays out... a list of all possible melodies of a given nature actually has very little information in it, because it doesn't exclude enough. It may superficially seem to our human senses that a lot of stuff has been included/constructed, but in reality, the 'list of every possible melody' is a vapor. There's not actually anything there. It is the act of exclusion of possibilities that leads to interesting information. Such information as this list has is contained in its specification of what a "melody" is. Counterintuitively (to a lot of people's understanding), if they widened the specifications, while they would end up with a bigger list they'd end up with less information in the result.
The act of creating a song isn't a matter of creating the possibilities from the raw nothingness, it's a matter of carving them out of the exponentially-large space of possibilities and finding something there useful. The exponentially-large space is so large that it is very easy to not see it that way, because, I mean, it's huge. It doesn't feel like "removing" possibilities the way carving a 3D stone does ("I remove everything that doesn't look like my desired statue"), because the exponential space is so inexpressibly larger, and we need fundamentally different tools to address such a space, but in the end, it's the same thing.
While this isn't necessarily what the law was written for, the "creativity" requirement could very easily be pressed into service here. They've expressed very little creativity/exclusion on this list and it would be easy to argue it falls far below the threshold necessary for copyright. As a literary criticism of the system, it is successful and thought provoking... as a legal criticism of the system it would fail completely.
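One rough way to make the exclusion idea concrete is to count bits: if every candidate is treated as equally likely, narrowing N possibilities down to k conveys log2(N/k) bits. The sketch below assumes that uniform model, which is a simplification, and borrows the 8-pitch, 12-note melody space mentioned elsewhere in the thread purely as an illustrative size. `bits_from_exclusion` is a made-up helper name.

```python
import math

def bits_from_exclusion(total: int, remaining: int) -> float:
    """Bits conveyed by narrowing `total` equally likely options
    down to `remaining` of them (uniform-probability assumption)."""
    if not 0 < remaining <= total:
        raise ValueError("remaining must be between 1 and total")
    return math.log2(total / remaining)

SPACE = 8 ** 12  # illustrative size: 8 pitches, 12 notes

full_list_bits = bits_from_exclusion(SPACE, SPACE)  # nothing excluded
single_pick_bits = bits_from_exclusion(SPACE, 1)    # one melody singled out

print("complete list:", full_list_bits, "bits")
print("single melody:", single_pick_bits, "bits")
```

Under these assumptions the exhaustive list says nothing about which melody matters, while choosing one melody out of the space is where all the information sits.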
> ... as a legal criticism of the system it would fail completely.
The legal criticism would be that there just aren't that many unique melodies, as demonstrated by the fact that they were able to enumerate them all, so the mere fact that two songs use the same melody is not sufficient to show that one is a copy of the other. The set of melodies that are compatible with human aesthetics is even smaller. They don't actually need these auto-generated melodies to qualify for copyright for the project to succeed. It works equally well if similarity in melody is not considered sufficient evidence of copyright infringement.
Even just having the database around so that one can say that they copied the melody from here rather than from some other source might be enough. After all, unlike patents, independently producing something similar to a copyrighted work is not infringement; you have to have actually copied from the other work. If you're a musician perhaps you should listen to a few randomly-selected melodies from this program each day. Maybe it will spark something, but even if it doesn't it will at least make it harder to argue that whatever melody you come up with could only have been "subconsciously copied" from some other composer's song you may have heard decades ago.
> The legal criticism would be that there just aren't that many unique melodies—as demonstrated by the fact that they were able to enumerate them all
How does that follow? You can enumerate any finite number. And the article doesn't say how big. Is it a thousand or a trillion? "Riehl says the algorithm works at a rate of 300,000 melodies per second." The article doesn't say how many seconds it took to generate all melodies though.
Not within a fixed time period in the real world. You're limited by the matter and energy available, and by the speed of light. However we're not talking about the theoretical ideal limits of computation. The upper bound would be 300k melodies for each second since the program was written—68.7 billion in all, according to the Adam Neely interview linked from the Press page of the project site. Which is a lot, but then there are hundreds of millions of known songs, each of which is likely to contain multiple melodies, some of which are much more likely to be chosen than others. Accidental duplication is thus quite likely.
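For a sense of scale, here is a quick Python estimate of how long that rate would need to cover the whole set. It assumes the 68.7 billion figure is 8^12 (8 pitches, 12 notes), which matches the number numerically but is still an assumption, and it takes the 300,000 melodies per second quoted from the article at face value.

```python
RATE_PER_SECOND = 300_000   # rate quoted from the article
TOTAL_MELODIES = 8 ** 12    # assumption: 8 pitches, 12 notes per melody

seconds = TOTAL_MELODIES / RATE_PER_SECOND
hours = seconds / 3_600
days = seconds / 86_400

print(f"melodies: {TOTAL_MELODIES:,}")
print(f"seconds:  {seconds:,.0f}")
print(f"hours:    {hours:,.1f}")
print(f"days:     {days:,.2f}")
```

So under these assumptions the full enumeration is a matter of days on one machine rather than anything near physical limits.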
Ha! It does not work however, as a person's sexual preference is a fact, and a statement that either guesses or spits out all recognized variants is not a source of factual information -- no-one is any the wiser afterwards than before.
On the other hand if you offered it up as a fact with reckless disregard for its truthfulness...
Again we see programmers trying to understand the law in terms of ‘how can a piece of data be illegal?’ while the law is quite happily focusing on making specific actions illegal.
‘You can’t arrest me, gold bars aren’t illegal!’
‘Yes, but carrying them out of the federal reserve vault without permission is.’
"I invented this later but independently" isn't a valid defense, even if you can prove it. So it's not actions (like copying or plagiarism) that is prohibited, it's the result.
That's the problem with IP law.
In effect you give people a monopoly on numbers. When the numbers are big nobody is bothered by this, because the chance of arriving at the exact same one is effectively zero. But for songs the numbers are pretty small (depending on the encoding used to compare the songs), and the absurdity is evident.
That's not correct in the case of copyright. Independent invention can be used as a defense to a copyright infringement claim.
> Generally, a plaintiff proves copying through circumstantial evidence, showing that the defendant had access to the copyrighted work [...]
>
> [...] unlike in patent law, if a defendant independently creates the substantially similar work, he is not liable to the copyright holder.
Your example is barely plausible, much less demonstrable.
The connection between "original works" using an extremely limited set of notes, and your right to privacy using some theoretical predictive algorithm is not at all obvious.
I agree. I wanted to point out the absurdity of the argument used in the article. The argument is that music is just numbers and numbers are not copyrightable. But any kind of information is "just numbers" which can be enumerated given enough time.
For example, there are a number of songs that include a rendition of Tom's Diner within a larger melody. Also popular to add to a song is the Arabian riff melody (which is so old it's not copyrightable, but you get my point). I don't know of any lawsuits around the song Tom's Diner, but this is an example of the types of ridiculous lawsuits out there over copyright.
The code actually can produce every possible melody in MIDI. They simply have not stored every possible melody explicitly (uncompressed) on a hard drive (which is impossible, as the size is infinity).
However, if you interpret the program itself as a self-extracting compressed archive, they actually have stored every possible melody (in a compressed way).
So the question reduces to how much the type of compression matters here (is ZIP allowed? is TAR allowed? what about more sophisticated like PAQ? and what about this Rust code?). This is what I/we discussed here: https://news.ycombinator.com/item?id=22441328
> So the question reduces to how much the type of compression matters here
If you compress, you can copyright the compressed bytes.
If you don't compress, you can copyright the uncompressed bytes.
As far as that copyright extending to derivations, e.g. decompressions, the answer is indeed situation-dependent. For example, converting a copyrighted font from TTF to WOFF does not remove the copyright. But converting a copyrighted font from TTF to screen pixels to WOFF removes the copyright. (Sorry I don't have a reference; probably findable.)
The "self-extracting zip" derivation would probably fall into the latter category; that is, the copyright would not transfer.
---
But even if the copyright were maintained during your advanced decompression, one could argue that editing down to a very specific portion of that extremely large body of work was a substantive/transformational derivation, which they could then copyright themselves.
Transformative works are very common in art. The most famous example is Duchamp simply adding a mustache to a print of da Vinci's Mona Lisa, and copyrighting that. [1]
It comes down to this: Typefaces/glyphs are not copyrightable. The font code that produces those glyphs is.
> Typefaces cannot be protected by copyright in the United States (Code of Federal Regulations, Ch 37, Sec. 202.1(e); Eltra Corp. vs. Ringer)...However, there is a distinction between a font and a typeface. The machine code used to display a stylized typeface (called a font) is protectable as copyright. [1]
In software, a similar "black-box" derivation process has happened many times, e.g. UNIX/GNU. Copyright applies to software source code, but not to software functionality.
Determining what is the "essential, creative work" in each case in a nuanced way is a matter for courts and armies of lawyers: Apple round corners, Oracle Java APIs, etc.
This is true of fonts, as per the Copyright Office's "Policy decision on copyrightability of digitized typefaces".
However the example here, and the situation with fonts seem different. It is the case that font data is viewed as utilitarian and uncopyrightable. So we have a copyrightable program, producing uncopyrightable data. The argument here seems that we have a copyrightable program producing copyrightable data.
Every possible combination of 8 notes in 12 beats is not infinite. Assuming all quarter notes it's just (8^12)-(8-1)^12 = 54,878,189,535. If you factor in rhythmic variances, the number is much much larger, but still not infinite.
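A quick check of that arithmetic in Python. One way to read the formula (my interpretation, not something the comment states) is: all 12-note sequences over 8 pitches, minus those that never use one particular pitch. The function names are made up, and the brute-force check runs only on a tiny case.

```python
from itertools import product

def count_all(pitches: int, length: int) -> int:
    """Every sequence of `length` notes drawn from `pitches` options."""
    return pitches ** length

def count_using_required_pitch(pitches: int, length: int) -> int:
    """Sequences that use one designated pitch at least once:
    all sequences minus those that avoid it entirely."""
    return pitches ** length - (pitches - 1) ** length

def brute_force(pitches: int, length: int, required: int = 0) -> int:
    """Same count by explicit enumeration; only feasible for small inputs."""
    return sum(1 for seq in product(range(pitches), repeat=length)
               if required in seq)

print(f"8^12        = {count_all(8, 12):,}")
print(f"8^12 - 7^12 = {count_using_required_pitch(8, 12):,}")
print("small-case check:", brute_force(3, 4) == count_using_required_pitch(3, 4))
```

Either way the total is in the tens of billions: large, but plainly finite.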
More correctly, it can produce every possible monophonic sequence of tones given infinite time, which in the article was further limited to monophonic 12-tone sequences across a single octave (any other limitations I missed?).
A melody contains more than a sequence of tones. The most heartless definition would at the very least include rhythm. For every sequence of tones they output, they only produce one out of hundreds of possible melodies for that sequence.
It is of course an interesting thought to consider the definition of decompression, but on the other hand, we should also limit the contributed idiocy to the bare minimum required to break the relevant idiotic rules.
Part of me wonders if the next step with this is some sort of DL model. I wonder if, trained on one set of melodies (defined in the intuitive sense) it would generate existing copyrighted melodies not in the training set.
By that logic, a program that simply counts up from 0 (with bignums, as long as the machine has enough memory), is actually a compressed form of every single piece of data or information that has ever, will ever or can ever be created.
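The "counter as archive" idea can be sketched in a few lines of Python: treat each integer as a base-8 number whose digits are the notes. This is only an illustration under assumed parameters (8 pitches, 12 notes); it is not the project's actual code, and `melody_from_index` is a hypothetical name.

```python
def melody_from_index(index: int, pitches: int = 8, length: int = 12) -> list:
    """Decode an integer into a fixed-length note sequence by reading
    it as a base-`pitches` number, most significant note first."""
    if not 0 <= index < pitches ** length:
        raise ValueError("index outside the melody space")
    notes = []
    for _ in range(length):
        index, note = divmod(index, pitches)  # peel off the lowest digit
        notes.append(note)
    return notes[::-1]

def all_melodies(pitches: int = 8, length: int = 12):
    """Lazily 'decompress' the whole space by counting up from zero."""
    for index in range(pitches ** length):
        yield melody_from_index(index, pitches, length)

print(melody_from_index(0))
print(melody_from_index(8 ** 12 - 1))
print(sum(1 for _ in all_melodies(pitches=2, length=3)))
```

The whole "archive" is the loop bound plus the decoding rule, which is the point of the comment: the program holds almost no information of its own.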
Music lawsuits can be based on about as much information as they're conveying. I think that's their point, too. I don't believe they were trying to be satirical, they wanted to prove a point about the nature of music itself that could be used in defending musicians against lawsuits.
One of the items they were trying to point out, often abused in lawsuits for pop music, is the idea of "Access". If you came up with an idea all by yourself, but a similar song exists that is popular enough, the court argues that just by there being the possibility that you heard it, you therefore definitely heard it and then copied it.
If this music set exists, and is freely available, shouldn't it be considered that you had reasonable access to it and therefore stole it? No, of course not, that would be a ridiculous assumption and so is the current outlook of a song being popular being enough proof that you stole the idea.
Not to mention, music is extremely formulaic. Chord progressions have a natural tendency to certain forms, with centuries of prior art, rhythm within genres of music is often the same, even melodies have a trend toward particular combinations (leading tones over chord progressions bring about lots of similar sounding solos).
Any musician trying to claim copyright for their music should remember that their song only exists on the back of centuries of musical exploration. Consider how much of the song you can say is truly novel, it's going to be nearly nothing.
The combination of lyrics + chords + melody is in my opinion, the absolute minimum you need to claim a song has been copied. Lyrics are derivative, melodies are derivative, chord progressions are derivative, but together they have the chance to be a unique combination.
>I don't believe they were trying to be satirical, they wanted to prove a point about the nature of music itself that could be used in defending musicians against lawsuits.
Sufficiently advanced “proving a point via absurdity to make a more general argument” is indistinguishable from satire.
> No rhythms, no meter, no tempo, melodies are longer than 12 notes
I am not a musician, but which of these are copied in the Tom Petty / Sam Smith case that motivated this exercise? To my untrained ear, I do hear some similarities in the relative lengths of the notes (meter?).
I don't have a great ear, but I think it's a similar melody, albeit at a different tempo and key (?). If you can't hear it, try changing the video speed to 1.5x during the Sam Smith part.
If these two songs are similar enough, then I think it could be argued that a MIDI sequence has been copied, since in both cases it requires a significant change of tempo and key. A lot of commenters seem to be missing this point: yes, the generated sequences sound different from real songs, but so do the songs involved in the ridiculous court cases. Radiohead and Ed Sheeran were sued for chord progressions, Katy Perry for a melody. The songs involved were altered about as much as the MIDI sequences would need to be to show the similarities.
I'm not incredibly familiar with the court cases, nor that of Coldplay/Satriani, but I have a hard time believing that their decisions are algorithmically binding. Like, yes, they might be similar along those particular axes but that doesn't mean those similarities are the sole reason for the court decisions. There's also matters such as - was songwriter #2 exposed to song from songwriter #1? Does the similarity in arrangement imply intent to copy? Etc.
Exactly. Harmonic progression is probably the most important part of this, as it imparts context on the melody. The same melody over I-V-IV-vi and vi-V-IV-V is not the same thing.
Courts don't care about harmony. Most US courts follow a guideline that the copyrightable parts of a composition are melody and lyrics. Nobody has ever successfully sued for stolen chord progression.
Not according to the US court system. American case law defines chord progressions as insufficiently creative for copyright purposes. Usually rhythms are too. The only copyrightable parts of a composition in precedential cases of most US courts are the melody and lyrics.
And apparently arpeggios, because a US District Court in California ruled in Flame vs Katy Perry that arpeggios are "melodic enough" for copyright protection.
Melodies are insufficient on their own in my opinion; the court rules differently, I guess. Melodies are just as likely to be formulaic, and are built using similar foundational knowledge as a chord progression (only a subset of notes works in a given progression, for example, and conventions lead you toward certain notes of that subset).
A combination of chords, melody, and rhythm is, I think, the only reasonable measure that a song has been copied.
It matters a lot to the sound of the song, but does it matter to the court?
If I take the entire melody of a Beatles song, including the verse and chorus, but set it to an entirely different chord progression, would the court recognize that as an original song? What if I lifted all of the lyrics as well?
Lyrics and melody? I think it's reasonable to consider that an infringement. Melody over a new chord progression? I do think that should be considered a new work, just to limit the scope of copyright. Even if it's clear you copied the melody, I think melody alone is insufficient to call it a song, unless the original song was entirely melody. A melody uses all the same musical building blocks as the chord progression does, so why does it get special treatment?
Octave - a doubling in the frequency. Generally speaking, in twelve tone equally tempered classical music, melodies and harmonies are drawn from a scale, a subset of these twelve notes. The most notable of those scales, the diatonic scale has 7 unique notes, and the eighth note wraps back around to the beginning of the scale. Thus, moving from a note to a note double the frequency took 8 notes - hence, the octave. Diatonic is both the name of the most used of the 7-note scales mentioned above, and also a term meaning 'within the scale', i.e., 'diatonic to a (given) scale', depending on context.
Melody is the horizontal arrangement of notes for an individual voice or instrument over time. Harmony is the vertical arrangement of notes sounding at the same time, and how those transform horizontally over time as a group. Tempo is the speed in beats per minute of the background 'pulse' of the music.
Now, asking for a music theorist to give you an algorithmic definition of how to make music with any of the above? Good luck ;)
If you're looking for a book, Music: A Mathematical Offering https://www.amazon.com/Music-Mathematical-Offering-Dave-Bens... is pretty good. You're nearly at the end of your recursion, though. Wikipedia ought to take you the rest of the way to answering those basic questions. More complex questions like why certain combinations sound good when next to one another, on the other hand...
I found this video to be an extremely helpful and concise explanation of music theory basics:
https://youtu.be/rgaTLrZGlk0
"Learn music theory in half an hour" is obviously an exaggeration, but it really comes astoundingly close to fulfilling that promise. It contains a lot of information and each part builds on the previous parts, so it requires focus and maybe a few repetitions to 'get it', but I think the approach is fantastic for showing how many ideas of music theory are deeply connected.
Not a book as you requested, but hopefully you'll find the other reply useful. As for some of your specific questions:
Frequency is the same concept as radio frequency, but in this case refers to something we can directly sense. Radios transmit electromagnetic waves, which oscillate at a certain rate, measured in hertz (cycles per second). Sound frequency refers to pressure waves moving through the air, so a closer analogy than radio waves is the ripples in a pond when a rock is thrown in. Human ears are sensitive to frequencies between roughly 20 Hz and 20,000 Hz, and any sound you hear is a combination of frequencies in that range. Natural language is helpful here, since higher frequencies sound 'higher' and lower frequencies sound 'lower'.
A tone is a sound at a specific frequency, also known as a note. For example, 440 Hz is designated as the note A4 by the ISO 16 standard, and this is what most instrument tunings are based on.
Twelve tone equal temperament is the tuning system nearly all modern Western music uses. Certain ratios of frequencies sound pleasant, especially ratios with small numbers, such as 1:2, 2:3, and 3:4. So if we know 440 Hz is a note in the system, it would be nice to also have 880 Hz, 660 Hz, and 587 Hz. However, these frequencies will only really sound good when played with that original 440 Hz, not necessarily with each other. So instead of using them exactly, we approximate them in a useful way. The 1:2 ratio, the octave, is generally considered to be the most important, so that ratio is kept, but otherwise the notes are equally spaced on a logarithmic scale (human hearing is logarithmic), or equally tempered. The most popular tuning system has twelve tones. There's no note at 660 Hz, but there's one at 659 Hz, which is pretty close, and there happens to be one at 587 Hz. Other ratios are also represented reasonably well.
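If you want to check those numbers yourself, here is a small Python sketch. It assumes A4 = 440 Hz as the reference and uses the standard equal-temperament rule that each semitone multiplies the frequency by the twelfth root of 2; the function and variable names are just mine for illustration.

```python
A4 = 440.0  # reference pitch in Hz (assumed, as in the comment above)


def equal_tempered(semitones_from_a4):
    """Frequency of the note n semitones above (or below) A4 in 12-tone equal temperament."""
    return A4 * 2 ** (semitones_from_a4 / 12)


# "Pure" small-number ratios against 440 Hz, paired with the number of
# semitone steps that approximates each one in equal temperament.
targets = [
    ("3:4 (fourth)", A4 * 4 / 3, 5),
    ("2:3 (fifth)", A4 * 3 / 2, 7),
    ("1:2 (octave)", A4 * 2, 12),
]

for name, pure, steps in targets:
    tempered = equal_tempered(steps)
    print(f"{name}: pure {pure:.2f} Hz, equal-tempered {tempered:.2f} Hz")
```

The octave comes out exact, while the fourth and fifth land within about 1 Hz of the pure ratios, which is the "pretty close" approximation described above.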
A key is a collection of notes that sound good together, based on the ratios of their frequencies. The alternative would be chromatic composition, where all 12 notes are used and none is obviously 'more important'. Most music is in a specific key, but uses some chromatic notes to make the melody more interesting.
Yes, the video goes into some depth about scales/keys.
I actually had to look up what the difference between a key and a scale is, as I thought the terms were pretty much interchangeable. I've edited the last part of my other comment to reflect this:
A scale is actually an ordered set of notes belonging to a key. A key is just an unordered collection of notes. I got this wrong earlier.
So playing all the notes belonging to C major in ascending order is playing a scale, and playing the notes in any order is playing in the key of C.
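In programming terms, that distinction is roughly list versus set. A minimal Python sketch, using note names as plain labels and made-up helper names, under the simplified definitions given above:

```python
# C major written out in ascending order: a scale (order matters).
C_MAJOR_SCALE = ["C", "D", "E", "F", "G", "A", "B"]
# The same notes with the order thrown away: a key (membership only).
C_MAJOR_KEY = frozenset(C_MAJOR_SCALE)


def in_key(melody, key):
    """True if every note of the melody belongs to the key, in any order."""
    return set(melody) <= key


def plays_scale(melody, scale):
    """True only if the melody is exactly the scale's notes in scale order."""
    return list(melody) == list(scale)


tune = ["E", "C", "G", "A", "G", "E", "D", "C"]  # made-up melody

print(in_key(tune, C_MAJOR_KEY))          # notes all come from C major
print(plays_scale(tune, C_MAJOR_SCALE))   # but they are not the scale itself
print(in_key(["C", "F#"], C_MAJOR_KEY))   # F# is outside the key
```

The made-up tune is in the key of C without being the C major scale, and adding an F# takes it out of the key.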