Anna's Archive Quietly 'Releases' Millions of Spotify Tracks, Despite Legal Pushback * TorrentFreak
submitted by
ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86
Share on Mastodon
Deleted by moderator
not sure if it's very related but apparently spotify is killing their api. hopefully annas releases proper dumps that can replace it..
https://old.reddit.com/r/spotifyapi/comments/1qxv9wm/spotify_api_changes_were_doomed/ (screenshot attached)
I believe it is very related, either as the cause or the effect.
Just an excuse to bring it over the finish line what some at management wanted to do but had no justification to begin it.
At least IMO.
It is also possible that Anna's Archive moved through with their project because someone learnt about Spotify plans in advance.
I doubt it Anna’s Archive has bountys for scraping like this on loads of stuff.
I believe they have a 1 million USD bounty for scraping google books.
Fair.
What font are you using? It's nice 🙂
Gulim (default font for korean windows before vista, so similar to Tahoma) with font smoothing disabled :)
Of course it can't replace it, it's a point-in-time archive.
yeah sure, but it should cover up the vast majority of stuff except new releases
Hope Anna's has a good Onion site setup... Cause they are gonna probably have to rely on that soon enough.
I don't believe they have one, it's long overdue however
I'm always astonished of how underused it's the dark net for these kind of projects. Most torrent sites doesn't have a dark net mirror despite how easily they get blocked in the clearnet.
Because they want people to actually use them.
More people should use the anonymous internet using tools like I2P and Tor. Hopefully that'll be a silver lining with all the governments around the world cracking down on freedom of speech and increasing censorship.
I'd like to see more of this as well, I only know of 3 and one of those isn't onion it's i2p.
I don’t think they do, and this will probably be what changes that. I’m fully expecting the site to lose their remaining TLDs as a result of this.
Deleted by moderator
K now someone make a script that compares your Spotify library to the torrent and downloads all songs please...
This oh this, please
That's actually what i am waiting for
Sure but, again, the files are in a relatively low quality.
You could probably vibe code this.
Deleted by author
The stuff on YouTube is already lossy and it becomes more lossy when you transcode it.
While their intentions are good, this will unfortunately probably lead to them losing their last two domain names.
I don't understand the concern, domain names are cheap and easy to get, they can just keep using new ones. Why does it matter if they lose the ones they have?
Piratebay used to do the domain dance all the time back in the day (and maybe still do).
It's more about users being able to find them again. If they lose all domain names, it becomes difficult to figure out which are the new ones.
Just look up on wikipedia
Fmhy.net usually has a working domain of AA
So does Wikipedia
This.
It's factual, public, relevant info, it can and should be on Wikipedia.
Deleted by author
Maybe this will prompt some people to learn to use Tor
Is TOR not completely owned by the feds? I remember even back in the silk road days people were saying the FBI owns every endpoint. Is TOR still practical? I truly don't know I'm asking for input.
If enough people set endpoints, then the feds will own a fewer proportion of the total. AKA: we have to be the change we want to see in the world.
Yeah but even if you could get it down to like 50% why would anyone want to take that risk? Idk I might be misunderstanding something about how TOR works but it seems no more anonymous than the clearweb from what I've heard.
I'm not sure you are fully aware of the Tor threat model. The exit node is not supposed to be specifically trusted.
TOR is not to be trusted. Everyone else has been working on strengthening their security while TOR has been caught numerous times weakening theirs.
Use i2P and sneakernet. Fuck tor.
Oh i2P is definitively a better option where you can have it. However, Tor is somewhat more approachable as of current: we have for example Tor Browser for the masses, whereas I at least haven't heard of any sort of "I2P Browser" at least for Firefox. Of the three options, Tor is the only one I know that is "portable" (you don't need to be able to make admin-level network changes to use it).
I2p doesn’t have a browser setup extension or something last I checked. TOR browser has been caught lying and degrading security repeatedly. A big part of the problem with tor is the damn browser.
In this scenario it wouldn't matter because the idea is to use it as a way to access a website that would otherwise be accessed over clearnet but has become inaccessible. But if they made an onion site endpoints wouldn't be used anyway afaik since the traffic doesn't leave the network. Now that I'm thinking about it there might be some issues with practicality doing it this way if they have a big volume of traffic, but there are options for routing around censorship that don't involve DNS.
I don't understand this comment, can you elaborate? Why wouldn't the endpoints be used? This is probably my ignorance but I thought all traffic was routed through the onion network and then eventually to the end device, but all that extra routing can't help you if the Feds control the last stop before whatever server you're trying to contact.. are you saying that if a site is entirely hosted on TOR then no information makes it to an endpoint?
Basically yeah. My understanding is that exit nodes are special and using them is a vulnerability, but you only use exit nodes to access clearnet sites from Tor, and you are less vulnerable if you aren't doing that and rather going to sites with .onion urls. Which, unfortunately I can't find one for this website, but I'm thinking they'd probably consider making one if they can't maintain any clearnet domains anymore.
I don't think that's true and a very cursory google suggests (to me at least) that im right and I don't have time to parse a bunch of sources right now. So idk if anyone else could chime in with specific technical details or a source id appreciate it.
Deleted by moderator
I mean the same reason we don't have the full Epstein files for one? They just don't care until they have to. But also just because they aren't prosecuting shit right now doesn't mean they aren't collecting all that data to feed into their latest shitty chatbotm
Deleted by moderator
Talking about the propensity of the current (US mostly but other places too) administration to just hoard data and rely on shitty 'AI' tools to compile and sort through it/analyze it. Idk apparently I was misinformed about the extent to which TOR is actually compromised though.
Edit: the m was a typo its supposed to say chatbot if you didn't get that.
Feds: "CIA Alexa, show me the users who downloaded illegal things."
CIA Alexa: "Yes daddy."
(I assume Chatbotm is a submissive bottom in responses.)
Then it just picks random people and puts them on a list because its shitty and fed off of Reddit nonsense.
It’s a honeypot. The goal was never about stopping CP it’s for blackmail.
The pirate bay is still able to find domains so ¯\_(ツ)_/¯
Anna's Archive when they find out piracy is illegal 🙀
I2P can do torrents. Magnets are easy to host on Tor
Sounds fun 😈
What are they gonna do? Punish them with trillions more in debt?
Meh, whenever I go through spotify looking for the albums I like, about 40% of them aren't even on there. You're better off not bothering with their shite.
The internet needs a better way to share stuff than a fixed list of files. It should be easy to simply browse through a shared folder and decide to participate in storing and hosting that file.
Having to split huge archives like this into multiple torrents is such a terrible workaround. It requires those with huge storage to host the torrents. People who just require a subset can't properly participate.
Such a pity IPFS is so crap. It should be been the solution to this, but alas...
The torrent protocol is quite happy for you to only download and seed some files within a torrent, it's just that the most popular client for it isn't very good at managing very large archives.
I'm guessing there's probably an alternative client that is better at this, can anyone tell me what it is? If there isn't one I'll make one, but I don't want to burn a weekend duplicating something that already exists...
qBittorrent handles selections of individual files quite well. The only downside is a side effect of the protocol: If a data block spans two files (because their size is not an exact multiple) it will create a "partial" file with a strange name next to it - which you need to keep it complete/seeding.
What's crap about IPFS? I've never used it, but have always been intrigued.
It's a resource hog and quite unstable. A major gripe I have with it is that it makes accessing what you downloaded very difficult because the documentation is terrible. It should be possible to mount all your downloaded stuff into a folder, but I have yet to figure out how.
And despite all its resource usage, it is very slow.
The idea is amazing (peer-to-peer, content-addressed storage), but the implementation is extremely lacking.
It's got lots of great ideas (combining what's essentially a giant git repo with bit torrent), but in practice it's pretty slow to do anything
Deleted by author
I really don't see this as being useful for anyone outside of hard archivest. The bitrates are pretty trash. I guess if you just a setup with an incredible amount of music no matter what, this is for you. IMHO the meta data is worth more than these lower quality sound files although we have meta data for what's out there now.
Outside of that here is what and how they are going to release. I'm guessing this drop was their "popular" track drop. From their site:
For popularity>0, we got close to all tracks on the platform. The quality is the original OGG Vorbis at 160kbit/s. Metadata was added without reencoding the audio (and an archive of diff files is available to reconstruct the original files from Spotify, as well as a metadata file with original hashes and checksums).
For popularity=0, we got files representing about half the number of listens (either original or a copy with the same ISRC). The audio is reencoded to OGG Opus at 75kbit/s — sounding the same to most people, but noticeable to an expert.
The point of AA is the archiving.
Anyways, from a listening perspective, 160kbit vorbis is audibly lossless I think, and there are many songs on here that are not possible to find elsewhere. For popular songs you want, yeah, just download the Flac elsewhere.
Most of my mp3s from back in the day are 128kbit, so 160 is an upgrade for me.
I can only speak to what a 75kbps mp3 sounds like, but unless Opus is like 3x+ better at compression, it's going to sound like complete dogshit.
Opus @ 160 kbps is like MP3 at 320 kbps IIRC.
I don't know what 75 kbps opus sounds like, but I can tell you how 32 kbps sounds. Versus mp3 at that bitrate, it sounds actually listenable, while mp3 sounds like you're underwater.
All things considered, the Spotify songs probably sound fine at 75.
I encode my music to opus 96kbps for my phone, and it sounds great. I can't tell the difference between that and the original on Sony xm3s.
https://listening-test.coresv.net/results.htm#list10 via https://opus-codec.org/comparison/
It is that much better.
Where? I checked the torrents JSON mentioned there and there's no text match on 'spotify'... did it get removed or am I looking at the wrong JSON?
I am seeing the same thing, even within an hour of the article being posted.