• purgemasterv2 joined the channel
      • purgemaster has quit
      • adhawkins_ joined the channel
      • danoply_ joined the channel
      • BenOckmo1 joined the channel
      • BenOckmore has quit
      • adhawkins has quit
      • danoply has quit
      • adhawkins_ is now known as adhawkins
      • danoply_ is now known as danoply
      • davic has quit
      • fading_epilogue has quit
      • Glassjoe has quit
      • fading_epilogue joined the channel
      • adhi001 joined the channel
      • sublim20 joined the channel
      • Vacuity_ joined the channel
      • Vacuity has quit
      • SothoTalKer_ joined the channel
      • SothoTalKer has quit
      • SothoTalKer_ has quit
      • SothoTalKer joined the channel
      • RikkoM joined the channel
      • Rohan_Pillai joined the channel
      • Rohan_Pillai has quit
      • farn has quit
      • dave_uy has quit
      • dave_uy joined the channel
      • Kevlar_Noir joined the channel
      • calcmandan has quit
      • calcmandan joined the channel
      • RikkoM has quit
      • Rohan_Pillai joined the channel
      • reosarevok
        Updating beta
      • Vacuity joined the channel
      • Vacuity_ has quit
      • Updating prod
      • Update done
      • TOPIC: MusicBrainz Community | See #metabrainz for development and the other *Brainz’s | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Latest Release: https://blog.metabrainz.org/?p=8719 | Picard 2.6 Beta 2 released! https://picard.musicbrainz.org
      • evelyn has quit
      • evelyn joined the channel
      • akashgp09 joined the channel
      • akashgp09 has quit
      • Rohan_Pillai has quit
      • Glassjoe joined the channel
      • chaban
        And so it begins. WhatGear links are removed: https://musicbrainz.org/edit/78092748
      • reosarevok
        Are these links useful? Should we be storing them but under otherdbs or something?
      • Or is this just added by someone promoting their own site and are not worth storing?
      • chaban
      • On one hand it looks like some SEO site, on the other hand the info looks truthful. There is often proof even.
      • Glassjoe has quit
      • Looks like I just found out what the eb_ suffix means in page source, e.g. eb_twitter on https://whatgear.com/pro/skrillex
      • https://equipboard.com/pros/skrillex that site and its URL structure look strikingly similar to WhatGear
      • Experimental new site? Copycat? ¯\_(ツ)_/¯
      • RikkoM joined the channel
      • Speaking of gear: https://musicbrainz.org/edit/78065634 that information is from Sound Credit
      • Discovered it a few days ago and added it to the wiki: https://wiki.musicbrainz.org/index.php?title=Ot...
      • At the same time I learned that you can now get ISNIs for free: https://blog.soundcredit.com/post/Music-Industr...
      • rxrog joined the channel
      • rxrog has quit
      • rxrog joined the channel
      • nifemi joined the channel
      • rxrog
        Hi everyone
      • RikkoM has quit
      • ROpdebee joined the channel
      • ROpdebee
      • chaban
        You made it. Welcome to the 80s
      • ROpdebee
      • i've been meaning to join for a while now, but never found the time for it :P
      • anyways, I've just finished some talks with ArchiveTeam, they're going to queue up all URLs from the latest MB dump later today into one of their projects, and set up a live data feed to archive new URLs as they are added
      • those will eventually be injected into the wayback machine
      • BrainzBot
        MBS-9009: Every time a Homepage/Blog/Discography/Biography URL is submitted to MB, it should also be submitted to the Wayback Machine
      • reosarevok
      • ROpdebee
        edit notes aren't included in the live data feed though, so those won't be archived automatically (yet). reosarevok: Is there any way we could get those in a feed too?
      • RikkoM joined the channel
      • rxrog has quit
      • reosarevok
        ROpdebee: probably not, since they're not meant to be public-public (they require login)
      • rxrog joined the channel
      • nifemi has quit
      • rxrog has quit
      • rxrog joined the channel
      • MRiddickW joined the channel
      • rxrog has quit
      • ROpdebee
        ah I see, that makes sense
      • URL entities are now being added: https://tracker.archiveteam.org/urls/
      • darwin
        pretty happy about this, for like when things get removed from beatport or bandcamp
      • useful to have an archive to be able to refer back to
      • musicfan joined the channel
      • musicfan
        I am attempting to add a band named Soraia (https://www.soraia.com/). There's already another unrelated entry with that name, so I'm attempting to disambiguate, but the disambiguation field remains red no matter what, which does not allow me to submit. Is this a known issue or am I doing something wrong?
      • SirPHOENiX17 joined the channel
      • SirPHOENiX1 has quit
      • SirPHOENiX17 is now known as SirPHOENiX1
      • finalsummer
        whatgear looks like a SEO spam site and should be blacklisted. equipboard looks legitimate and has actual user contributions, whatgear looks like just outdated scraped(?) info from the former
      • https://musicbrainz.org/edit/77426700 okay this is 100% a SEO spam account, ban? (ping reosarevok)
      • crism
        Yeah. Added a ton of “official home page” links which clearly aren’t. When called out on one, said, “Oops!”
      • User reported.
      • CatQuest
        [14:46] <ROpdebee> anyways, I've just finished some talks with ArchiveTeam, they're going to queue up all URLs from the latest MB dump later today into one of their projects, and set up a live data feed to archive new URLs as they are added
      • wait so urls in the edit notes or what?
      • because omg
      • BUT url entities automatically being put in the IA?
      • 👏 🎉 👏
      • ROpdebee: whooooo
      • [14:47] <ROpdebee> those will eventually be injected into the wayback machine
      • I'm very excited about this!!!!
      • (even if it's not edit note urls today)
      • musicfan: hi! can you give a screenshot of how you're doing it?
      • ROpdebee
        Well, currently most of the URL entities in the 2021-03-13 mbdump have been grabbed by archiveteam. they'll be uploaded to IA and injected into the wayback machine sometime soon. new or updated URL entities should be grabbed via the live data feed, I've been told that'll be set up later today
      • as for edit notes and annotations, those will likely be done every three days with new data from the dumps, but i'm still working on extracting the URLs as i'd like to replicate the way the MB server does it to make sure we're parsing them consistently
      • CatQuest
        like reo said I'm not sure edit notes are possible :/ but annotation ones should be
      • but this is *such* a help! so many old urls are gone and they were the proof or input for many releases. some releases you can't even find any more
      • I wonder 🤔 would it be possible to get a simple list of urls that no longer resolve + weren't already in the ia?
      • or rather maybe not a list but to see the number of them
      • ROpdebee
        probably not, the effort that would take would be equivalent to the actual archival
      • CatQuest
        hah, alright
      • ROpdebee
        or maybe a bit less, but would still take as many requests (>6M for URL entities alone)
      • CatQuest
        yea not viable
      • anyway this is excellent news! this exact thing is something I've always worried about. it should have been in function ages ago <3
      • but from now on new urls should be caught so no *new* urls are "lost"
      • oh. btw maybe you should also do this with BookBrainz.org
      • still fairly undeveloped but should also eventually be able to link to all kinds of things
      • would be great to have the url archiving from the get-go
      • ROpdebee
        what could be useful though, is a periodic dump of all recently "used" URLs on MB
      • CatQuest
        used how?
      • entered?
      • ROpdebee
        say once a day a file is uploaded to some FTP with URLs that have been entered into edit notes, annotations, URL entities which have been edited (or their ARs added/removed/edited)
      • CatQuest
      • reosarevok: ?
      • ROpdebee
        that file could be injected into AT's queue immediately with little effort, and you'd get a snapshot of the URLs in the exact state as when they were used
      • CatQuest
        reosarevok: how hard would it be to create a dump of the urls entered into edit notes?
      • i personally don't know the data (just a long standing editor+BBstylecat and MB Instrument Inserter), you'd wanna talk to reo, yvanzo, zas, etc
      • ROpdebee
        yeah i'm just throwing out ideas, now we'd have to download and process fairly large DB dumps to get just a couple thousand new links every three days
      • to be clear though, we can still get the urls in edit notes from one of the dump files
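
The idea discussed above (pull URLs out of edit-note text, then hand only the not-yet-seen ones to the archive queue) could be sketched roughly as follows. This is a minimal illustration under assumptions: the regex is a simplification and is not the way the MB server parses URLs, and `extract_urls` / `new_urls` are hypothetical helper names, not part of any MusicBrainz or ArchiveTeam tooling.

```python
import re

# Simplified pattern (assumption): the MB server's own URL detection may differ.
URL_RE = re.compile(r"https?://[^\s<>\"']+")
# Punctuation that usually belongs to the sentence rather than the URL.
# Note: stripping ")" will truncate URLs that legitimately end in a parenthesis.
TRAILING = ".,;:!?)]}"


def extract_urls(text):
    """Return URLs found in free text, in order, without trailing punctuation."""
    urls = []
    for match in URL_RE.finditer(text):
        url = match.group(0).rstrip(TRAILING)
        if url:
            urls.append(url)
    return urls


def new_urls(notes, already_seen):
    """Return URLs from `notes` that are not in `already_seen`.

    The result is deduplicated and keeps first-seen order, so it could be
    written out as a small periodic file instead of reprocessing a full dump.
    """
    seen = set(already_seen)
    fresh = []
    for note in notes:
        for url in extract_urls(note):
            if url not in seen:
                seen.add(url)
                fresh.append(url)
    return fresh
```

Calling `new_urls(notes, already_seen)` with the notes from the latest dump and the set of URLs queued so far would give just the handful of links that still need archiving.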
      • reosarevok
        bitmap: ^ does this seem like something that could be done kinda like with the json dumps?
      • RikkoM has quit
      • ROpdebee
        also, couple of caveats: The project it's being inserted into currently doesn't archive page requisites (images, css, js, etc) but it's better than nothing I guess
      • and Spotify links probably aren't useful, since it loads data dynamically from the API, and those responses aren't captured either. so you'll just get broken pages :( I told them about this, but it's a wontfix situation
      • musicfan has quit
      • KingJ has quit
      • KingJ joined the channel
      • DarthKnight joined the channel
      • bitmap
        reosarevok: sounds very plausible yes
      • ROpdebee has quit
      • ROpdebee joined the channel
      • RikkoM joined the channel
      • DarthKnight has quit
      • yvanzo
        ROpdebee: awesome, archiving URLs is something we've wanted for a long time, it's even one of the ideas for GSoC: https://musicbrainz.org/doc/Development/Summer_...
      • ravndal has quit
      • Toast joined the channel