#metabrainz

      • thomasross has quit
      • MRiddickW joined the channel
      • sumedh joined the channel
      • sumedh has quit
      • sumedh joined the channel
      • d4rkie has quit
      • d4rkie joined the channel
      • d4rkie has quit
      • d4rkie joined the channel
      • sumedh has quit
      • reosarevok
        yvanzo: when around, could you take a quick look at the comment on https://tickets.metabrainz.org/browse/MBS-9790?... ?
      • BrainzBot
        MBS-9790: ISE when annotation search results contain a release group
      • nelgin
        So I'm trying to download the search indexes and it's doing the same thing.
      • sudo docker-compose run --rm musicbrainz fetch-dump.sh --base-ftp-url ftp.musicbrainz.org/pub/musicbrainz search
      • Saving to: ‘/media/searchdump/picard-setup-2.6.0b3.exe.md5’
      • Saving to: ‘/media/searchdump/picard-setup-2.6.1.exe’
      • Saving to: ‘/media/searchdump/picard-setup-2.6.exe’
      • Saving to: ‘/media/searchdump/picard-setup-2.6.exe.md5’
      • Why would it download and save all these?
      • iliekcomputers
        moin!
      • nelgin
        'sup
      • iliekcomputers has quit
      • shivam-kapila has quit
      • iliekcomputers joined the channel
      • wb
      • shivam-kapila joined the channel
      • reosarevok
        nelgin: maybe yvanzo or zas will know, if they appear
      • nelgin
        Maybe.
      • alastairp
        reosarevok: hi, as part of this project that I'm doing at UPF we have links for about 50k items from https://www.muziekweb.nl/ -> MB and -> IMSLP
      • nelgin
        I'm sure it is stuck in a loop downloading stuff.
      • ruaok
        moin!
      • alastairp
        is this interesting to import?
      • nelgin
        Erm eh?
      • alastairp
        we can provide a csv or similar, I guess it's the kind of thing that could be imported with a bot? Muziekweb are using this data already in their db, so I guess they trust it, but we could also do some kind of spot-check of its quality
      • nelgin
        It's downloaded replication data files from 2009. Picard files. This is stupid.
      • changed-ids since 2013
      • alastairp
        nelgin: reosarevok: oh, interesting. I wonder if this is related to the new http download server that we added. and curl is doing a recursive download, following the ".." links
      • reosarevok
        alastairp: Hmm, not sure. Or well: I'm sure we're interested, not sure what's the best way to deal with it :)
      • alastairp
        reosarevok: one of the aims of the project was to "contribute back to the open resources that we use", so this is us trying to contribute back :)
      • in any case, I'm happy to open a ticket and attach a csv to it
      • nelgin
        Did you test it?
      • Cos whatever it is, it ain't working.
      • reosarevok
        alastairp: that'd be neat, then we can open a forum post
      • alastairp
        ok, we're planning on building up this list over the next few weeks, so I'll do that
      • reosarevok
        nelgin: maybe alastairp is right, and that change had unexpected consequences for the docker scripts. Then we really could use yvanzo...
      • nelgin
        Not good policy to rely on one person for such a project?
      • outsidecontext
        nelgin: not sure, but shouldn't the base FTP URL be set to http://ftp.musicbrainz.org/pub/musicbrainz/search/ instead of just http://ftp.musicbrainz.org/pub/musicbrainz/ ?
      • nelgin
        No idea, there are no instructions on it.
      • If that's in the file, then I would say no, that's not what it should be.
      • I'm using the US rather than EU servers.
      • Personally, I'd like to see BASE_FTP_URL be supported in .env and sourced by any scripts, rather than blindly assuming people want to download from Europe.
      • outsidecontext
        ah, maybe that's the issue here. because AFAIK there was no change to the US servers, they had been accessible via HTTP all the time. Only change was that ftp.eu.metabrainz.org is now also accessible via HTTP
      • nelgin
        Maybe it's because I didn't use ftp:// then?
      • Since I'm a nice guy, I can try it again and report back when I wake up in 5.5 hrs
      • outsidecontext
        yes, probably.
      • nelgin
        I guess it depends if wget defaults to ftp or http
      • Reusing existing connection to ftp.musicbrainz.org:80.
      • HTTP request sent, awaiting response... 200 OK
      • Ah there you go.
      • So the variable is misleading, it shouldn't be BASE_FTP_URL if it's not assuming ftp.
      • outsidecontext
        that could mean the same would happen to the eu servers now as well, since they are now accessible via HTTP. So someone needs to fix this script to work as expected (either force using ftp:// or properly handle the HTTP case)
      • nelgin
      • => ‘/media/searchdump/MD5SUMS.asc’
      • ==> CWD not required.
      • ==> PASV ... done. ==> RETR MD5SUMS.asc ... done.
      • OK, now fetching with ftp
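
[Editor's sketch] The fix outsidecontext suggests above (force ftp:// when the base URL has no scheme, or handle the HTTP case explicitly) could look roughly like this. It is a minimal illustration in Python, not the logic of fetch-dump.sh; the function name `normalize_base_url` and the `default_scheme` parameter are hypothetical.

```python
from urllib.parse import urlsplit


def normalize_base_url(base_url: str, default_scheme: str = "ftp") -> str:
    """Return base_url with an explicit scheme and no trailing slash."""
    # A bare value such as "ftp.musicbrainz.org/pub/musicbrainz" carries no
    # scheme, so the downloader is left to pick a protocol on its own.
    if "://" not in base_url:
        base_url = f"{default_scheme}://{base_url}"

    scheme = urlsplit(base_url).scheme
    if scheme not in ("ftp", "http", "https"):
        raise ValueError(f"unsupported scheme: {scheme!r}")

    return base_url.rstrip("/")


print(normalize_base_url("ftp.musicbrainz.org/pub/musicbrainz"))
# prints: ftp://ftp.musicbrainz.org/pub/musicbrainz
```

With an explicit scheme in place, a caller can branch on it and choose a non-recursive fetch for HTTP mirrors instead of relying on whatever the download tool guesses.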
      • MRiddickW has quit
      • MRiddickW joined the channel
      • atj
        alastairp, ruaok: when you have time could you quickly review and merge https://github.com/metabrainz/listenbrainz-serv... before it ends up with conflicts and needs rebasing?
      • is the new dump script in production now?
      • ruaok is on it.
      • ruaok
        yes, it is.
      • atj
        did it work? :)
      • ruaok
      • last incremental dump was the day of the release, before the release.
      • ok, this and one other issue will be the first thing I release today.
      • atj
        well, there should be some decent log messages now
      • BrainzGit
        [listenbrainz-server] mayhem merged pull request #1403 (master…minor-create-dumps-fix): create-dumps.sh: Minor fixes for security and consistency https://github.com/metabrainz/listenbrainz-serv...
      • atj
        thanks
      • ruaok
        will deploy after I fix one other cron entry...
      • atj
        now we find that my changes broke it!
      • ruaok: do you have a log server like Graylog?
      • ruaok
        no, I wish.
      • our traffic level would require us to have 2-4 machines alone in order to process the traffic. that is if we measured all the MB API traffic.
      • nelgin
        recording.tar.zst 99%[================================================================> ] 21.40G --.-KB/s in 32m 37s
      • 2021-04-26 09:35:26 (11.2 MB/s) - Data connection: No such file or directory;
      • I'm about ready to give up on this
      • atj
        what if you didn't log all the API traffic and just used it for Docker logs, errors etc?
        Docker supports GELF natively IIRC
      • ruaok
        yeah, we could. but we also don't have a solution for the log shipping to get logs from all the various places into something like Graylog. and once we do, we don't have a lot of use for these statistics. it's nice to know what browsers people are using and identifying abuse, but putting API keys in place is far more important than analytics...
      • nelgin
        Hmm, the ftp site has 3 ip addresses, one is 65ms the other 2 are 30-40ms. Makes a big difference.
      • _lucifer
        alastairp: could you take a look at https://github.com/metabrainz/listenbrainz-serv... ? I have tested on beta and fixed the bugs I found.
      • atj
        ruaok: sorry, I don't have much idea about your infra, but from what I've seen on this channel, it feels like a centralised logging setup would make it a lot easier for people to track down and debug issues
      • nelgin
        It would be nice if the original install script would ping US and EU servers and determine the fastest IP.
      • While nothing is perfect, it can mean the difference between 6MB/s and 20MB/s to me.
      • atj
        nelgin: latency != bandwidth
      • nelgin
        No, but an overloaded pipe is going to show latency which will slow down bandwidth.
      • and I bet you I get a higher ping to the EU servers which will likely mean lower transfer speeds.
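
[Editor's sketch] nelgin's idea of probing the mirrors and picking the quickest one could be sketched as below. It times a TCP connect rather than an ICMP ping, and, as atj notes, connect latency is only a rough proxy for transfer speed. The function names, the port, and the example hostnames are assumptions for illustration; nothing like this exists in the install script as far as this log shows.

```python
import socket
import time
from typing import Callable, Iterable, Optional


def connect_time(host: str, port: int = 21, timeout: float = 3.0) -> Optional[float]:
    """Seconds needed to open a TCP connection, or None if it failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def fastest_host(
    hosts: Iterable[str],
    measure: Callable[[str], Optional[float]] = connect_time,
) -> Optional[str]:
    """Return the reachable host with the lowest measured time, if any."""
    timings = {host: measure(host) for host in hosts}
    reachable = {host: t for host, t in timings.items() if t is not None}
    if not reachable:
        return None
    return min(reachable, key=lambda host: reachable[host])


# Deterministic demo with made-up timings instead of real network probes.
fake_timings = {"us.example.org": 0.035, "eu.example.org": 0.120}
print(fastest_host(fake_timings, measure=fake_timings.get))
# prints: us.example.org
```

Passing the measuring function in as a parameter keeps the selection logic testable offline; a short timed download of a small file would be a better bandwidth estimate than connect time.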
      • alastairp
        _lucifer: great, I've just finished some calls. Looking at reviews again now
      • _lucifer
        Should I merge #1404 into this one? I have kept the two separate for easier review but the two will be deployed together.
      • alastairp
        let me tell you in 20 minutes :)
      • _lucifer
        👍
      • ruaok
        atj: yep, agreed.
      • alastairp: iliekcomputers _lucifer : Freso points out that some (most) spammers in LB do not have an email address set.
      • alastairp
        ruaok: yeah, we opened a ticket about that
      • saying that we should require MB accounts to have an email before allowing them to create an LB account
      • ruaok
        do we have consensus that we should be requiring that? what about existing accounts?
      • yep, that makes sense to me.
      • alastairp
        requiring going forward: definitely. existing accounts: unsure
      • _lucifer
        yeah, I agree for future accounts we should have emails.
      • alastairp
        happy to add a banner saying "it'd be great if you added one so that we can do things", but unsure about preventing them from logging in until they add one
      • ruaok
        maybe we need to make a report for accounts that are submitting listens that do not have an email
      • and then eventually we will flush out the bad actors.
      • alastairp
        any thoughts on preventing submissions from email-less accounts (in the far future, once we get people to add them?)
      • ruaok
        I think we should. the sign up process should make it clear that we will not record listens if we have no email address.
      • alastairp
        yeah, that's easy to do from now on (we can stop the signup process if MB doesn't send us an email). I was asking more about current accounts which don't have an email
      • I think we should show a banner to users who sign in who don't have an email, asking them to add one to MB [or directly to LB? probably better in MB]
      • but I don't know if we should start imposing limits on current LB accounts that don't have an email set
      • ruaok
        yes, to the banner. and maybe say that 6 months after implementing this, we will stop recording listens for accounts without emails.
      • alastairp
        yeah, that sounds reasonable
      • put it in tweets/blogs/etc
      • nelgin
        I agree.
      • Tho it's got nothing to do with me :)
      • ruaok
        alastairp: I'll flesh out the ticket and make it actionable.
      • alastairp
        LB-849
      • BrainzBot
        LB-849: Should we require users to have a confirmed email before creating an LB account? https://tickets.metabrainz.org/browse/LB-849
      • ruaok is editing it now
      • ruaok
        Ok, we'll require email as of Nov 1, 2021.
      • (to record listens)
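
[Editor's sketch] The policy agreed above (show a banner to accounts without an email, and stop recording their listens from Nov 1, 2021) can be expressed as two small checks. This is only an illustration of the rule as stated in the chat; the function names are hypothetical and do not come from the ListenBrainz codebase.

```python
from datetime import date
from typing import Optional

# Cutoff stated in the discussion: email required as of Nov 1, 2021.
EMAIL_REQUIRED_FROM = date(2021, 11, 1)


def has_email(email: Optional[str]) -> bool:
    """True if the account has a non-blank email address."""
    return bool(email and email.strip())


def needs_email_banner(email: Optional[str]) -> bool:
    """Accounts without an email should be asked to add one."""
    return not has_email(email)


def should_record_listens(email: Optional[str], today: date) -> bool:
    """Before the cutoff everyone is recorded; afterwards only accounts with an email."""
    if today < EMAIL_REQUIRED_FROM:
        return True
    return has_email(email)


print(should_record_listens(None, date(2021, 10, 31)))  # prints: True
print(should_record_listens(None, date(2021, 11, 1)))   # prints: False
```

Keeping the cutoff in a single constant makes it easy to adjust if the announced date changes.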
      • nelgin
        This sucks.
      • ==> PASV ... done. ==> RETR recording.tar.zst ... done.
      • Length: 22979676478 (21G)
      • recording.tar.zst 99%[================================================================> ] 21.40G --.-KB/s in 31m 46s
      • 2021-04-26 10:11:43 (11.5 MB/s) - Data connection: No such file or directory;
      • Twice now on the same file, so I know it's not me.
      • alastairp
        nelgin: thanks for your reports, it's clear that something is wrong, but yvanzo knows this much better than the rest of us. when he arrives I'm sure he'll be able to get a fix for you
      • I guess we haven't tested this workflow with alternate mirrors as often as we test the default workflow
      • nelgin
        I'm going to try from a single ip rather than using the ftp alias which is an alias of an alias. I'll see how it goes, but I'm not waiting up. I'll report back later. Night all.
      • CatQuest
        Mr_Monkey: is it possible at all to get some progress on https://tickets.metabrainz.org/browse/BB-543 ?? it's something that'd speed up entry immensely
      • BrainzBot
        BB-543: Pressing enter in a field does nothing
      • CatQuest
        (or any other BB person. gsoc-ers?)
      • ruaok
        alastairp: sanity check LB-849 plz?
      • BrainzBot
        LB-849: Should we require users to have a confirmed email before creating an LB account? https://tickets.metabrainz.org/browse/LB-849
      • CatQuest
        i'm not alastair but I don't see why not tbh
      • same as editing/voting in mb right?
      • ruaok
        exactly