#metabrainz

      • bukwurm
        As consumers are also generic, the present consumer design, validation and db interaction can be used for all the future data sources :)
      • Apart from this, I designed the basic `queue object`, which is the format the producers of any data source must produce
      • I also built LB and poked around.
      • I'd like to work with d3 in future there!
      • Freso
        CatQuest: Ask after meeting.
      • bukwurm
        That's for me. Leo__Verto ?
      • Leo__Verto
        Thanks!
      • CatQuest
        hi Leo__Verto !
      • bukwurm
        Hi!
      • dragonzeron has quit
      • Leo__Verto
        Spent thursday to sunday at a festival and the rest of the week messing around with pandas and plotly to get my spam data visualizations to a point where I'm happy with them
      • dragonzeron joined the channel
      • dragonzeron
        sorry internet went off
      • Leo__Verto
        as promised, here's a graph on bio lengths from last wednesday https://github.com/metabrainz/spambrainz/blob/m...
      • CatQuest
        :O
      • Leo__Verto
        interestingly enough there seems to be almost a gaussian distribution around bio lengths of 3.5k with some outliers to the shorter side
      • more stuff to come soon!
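A minimal sketch of the kind of summary described above: it measures the length of each bio, then reports the mean, the spread, and any unusually short entries. It is not Leo__Verto's actual notebook, which used pandas and plotly; this version uses only the standard library, and the function name, the two-standard-deviation cutoff, and the placeholder input in the usage note are all assumptions made for illustration.

```python
from statistics import mean, stdev


def bio_length_summary(bios, cutoff=2.0):
    """Summarise bio lengths and list those far below the mean.

    `cutoff` is a hypothetical threshold: a length counts as a short
    outlier when it is more than `cutoff` standard deviations below
    the mean length.
    """
    lengths = [len(bio) for bio in bios]
    mu = mean(lengths)
    # stdev() needs at least two data points.
    sigma = stdev(lengths) if len(lengths) > 1 else 0.0
    short_outliers = [n for n in lengths if sigma and n < mu - cutoff * sigma]
    return {"mean": mu, "stdev": sigma, "short_outliers": short_outliers}
```

For example, `bio_length_summary(["x" * 3500] * 10 + ["x" * 100])` would flag the single 100-character placeholder bio as a short outlier. Real input would be the bio text from the spam dataset, which is not reproduced here.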
      • CatQuest is stoked!
      • CatQuest
        this is very interesting
      • Leo__Verto
        dragonzeron, go!
      • dragonzeron
        Hi
      • bukwurm
        Leo__Verto: Interesting!
      • Freso
        Leo__Verto: Did you compare with length of non-spam editors?
      • dragonzeron
        So I have been working on some Japanese artists by adding works to the recordings, and I am also merging recordings that I hadn't merged yet on artists I already cleaned up
      • Leo__Verto
        Freso, I'm working on how to do that without having access to the prod DB myself but that's the end goal!
      • dragonzeron
        I was also wanting to know if someone can possibly look into the Amazon glitch where the image of the album does not show up
      • I put a request into bugs on metabrainz
      • CatQuest would also be interested in that comparison
      • CatQuest
        dragonzeron: that's mostly solved with "upload to CAA" though
      • dragonzeron
        true but even then it kinda becomes a hassle
      • when I add multiple albums at a time, and the Amazon thing kinda helps shorten the space and whatnot
      • CatQuest
        such issues with Amazon were one of the reasons CAA came to be
      • Freso
        I think it's more likely we'll stop using Amazon for showing covers entirely.
      • dragonzeron
        ok
      • CatQuest
        yep
      • dragonzeron
        understandable
      • CatQuest
        it's a nice backup so i wouldn't remove it asap. but
      • dragonzeron
        I am also planning on putting in a couple of requests about updating the favicons of some of the other databases we added so that they would show up on the artist page
      • CatQuest
        fixes on the bugs in it will probably not be prioritised at all
      • Freso
        Anyway, reviews is not for extended discussion. :)
      • CatQuest
        yes
      • dragonzeron
        if that is what I am supposed to say. Either way, arigatou gozaimasu
      • who is left
      • Freso
        No one, you were the last. :)
      • dragonzeron
        ok
      • Freso
        And no one else has spoken up.
      • So thanks for your reviews, dragonzeron and everyone!
      • And thanks for your time!
      • </BANG>
      • dragonzeron
        thank you for having me
      • CatQuest
        thank you freso!
      • rdswift
        Leo__Verto, the missing part of the #musicbrainz log is at https://pastebin.com/GSC55YC5 and the missing part of the #metabrainz log is at https://pastebin.com/UMNV2PUk (times are UTC-6). If you can point me to the desired format, I'll slam together a conversion script for the future.
      • dragonzeron
        I also forgot to mention I celebrate my 1 year anniversary of working on here today
      • TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | GSoC https://goo.gl/7jsjG2 | Meeting agenda: Reviews, Stop using Amazon for cover art (Freso)
      • CatQuest
        didn't you mention it earlier?
      • aw man
      • Freso
        dragonzeron: Congrats (again). :)
      • ruaok
        kartikeyaSh: iliekcomputers tests now pass -- thanks. I'm ok with PR #36.
      • CatQuest
        i really don't want amazon to go away that soon
      • ruaok
        iliekcomputers : merge when you're happy, please.
      • TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | GSoC https://goo.gl/7jsjG2 | Meeting agenda: Reviews, Amazon and cover art (Freso)
      • Leo__Verto
        rdswift, thank you, I think I still have a modified import script from last time I got logs from you!
      • CatQuest
        Freso: what is an "exit" url?
      • kartikeyaSh
        finally :)
      • rdswift
        Glad to help.
      • CatQuest
        Leo__Verto: rarg you've been accepting logs from others than me?
      • /jk
      • Leo__Verto
        :P
      • github joined the channel
      • github
        [messybrainz-server] paramsingh closed pull request #40: Don't create database in test.sh (master...tests-fix) https://git.io/vhVXy
      • github has left the channel
      • Freso
        Leo__Verto: I just looked at my local db mirror, and it seems that indeed bio is one of the things being sanitised in the dumps. Ah well. :/
      • CatQuest
        well since it would maybe help leo's spambrainz thing, should we prepare a special dp dump for him?
      • github joined the channel
      • github
        [messybrainz-server] paramsingh closed pull request #36: LB-352: Add script to fetch artist MBIDs provided recording MBIDs from MusicBrainz. (master...mba) https://git.io/vproH
      • github has left the channel
      • CatQuest
        db*
      • Freso
        Looks like we don't have languages either, so no analysing how much editors stick to languages they know or not. :/
      • Ah well.
      • CatQuest
        param is on a roll! ᕕ(ᐛ)ᕗ
      • Leo__Verto
        yvanzo, at this point I'd love to hand you a jupyter notebook to run against the main db but I'm not sure if that'd be GDPR compliant :/
      • rdswift
        Leo__Verto, not that there's a line or two overlap with the existing logs at either end.
      • *note*
      • Freso
        CatQuest: I'm guessing Leo__Verto can prepare a query and yvanzo can provide the results of it.
      • CatQuest
        rdswift: doing the good !
      • Leo__Verto
        rdswift, yeah that's okay, the importer should be able to detect duplicates
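A minimal sketch of how an importer could cope with the overlap and the time offset mentioned above: incoming lines stamped six hours behind UTC are converted to UTC, and any line already in the archive is skipped. This is not the actual import script. The `(timestamp, nick, text)` record shape, the function names, and the assumption that the archive stores UTC are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The pasted logs are stated to be six hours behind UTC.
SOURCE_TZ = timezone(timedelta(hours=-6))


def to_utc(naive_local):
    """Attach the source offset to a naive timestamp and convert it to UTC."""
    return naive_local.replace(tzinfo=SOURCE_TZ).astimezone(timezone.utc)


def merge_without_duplicates(existing, incoming):
    """Merge incoming log lines into existing ones, skipping overlap.

    `existing` holds (utc_datetime, nick, text) tuples; `incoming` holds
    the same shape but with naive timestamps in the source timezone.
    """
    seen = set(existing)
    merged = list(existing)
    for timestamp, nick, text in incoming:
        entry = (to_utc(timestamp), nick, text)
        if entry not in seen:  # an overlapping line is already archived
            seen.add(entry)
            merged.append(entry)
    # Keep the archive in chronological order.
    return sorted(merged, key=lambda entry: entry[0])
```

Matching on the full tuple means two people saying the same thing in the same second are still kept apart, while a line repeated at the edge of the overlap is dropped.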
      • CatQuest
        oohhh
      • Freso
        That way Leo__Verto can get the data in an anonymised form. But I reckon Leo__Verto and yvanzo will figure something out. They're smart peeps. :)
      • CatQuest: Aaanyway.
      • CatQuest
        the few times I did logs I also did that, to help find the right spot
      • Freso
        "Exit URLs" is a term I kind-of coined (together with yvanzo) for describing URLs that are meant to just redirect you elsewhere.
      • CatQuest
        Freso hi :)
      • Freso
        Let me find an example…
      • rdswift
        Oh yeah, I also removed the [off] lines from the logs posted.
      • CatQuest
        soo. "redirect" urls?
      • Freso
        !m rdswift
      • BrainzBot
        You're doing good work, rdswift!
      • yvanzo
        Leo__Verto: correct guess ;)
      • CatQuest
        oh I didn't do that, I expected that the importer would do that automatically
      • well done rdswift
      • rdswift
        Freso, I remember you mentioning that to me last time.
      • Freso
        :)
      • HSOWA joined the channel
      • CatQuest
        so that's why "exit" and not "redirect"
      • Freso
        (`exit.sc/?url=` is SoundCloud's "exit URL".)
      • CatQuest
        wait mediafire is still a thing?
      • ooh "leaving this domain" type urls
      • yea those are hella annoying
      • !m Freso(Bot) for removing those!
      • BrainzBot
        You're doing good work, Freso(Bot) for removing those!!
      • KassOtsimine has quit
      • UmkaDK has quit
      • thomasross has quit
      • CatQuest
        ..don't know how I feel about direct download links in general though.. I like links to pages where you can click a dl link better (not that that's your doing but the original person's, mind)
      • Freso
        Basically, they're addresses that websites/services use between themselves as the end-point, either to keep better stats for where people leave to, for helping to prevent fraud and/or try to make people not leave their site (e.g., Facebook warns people that they're leaving Facebook using `facebook.com/l.php`), or to obscure the HTTP referrer to make it harder for the target site to tell where the visitor comes from (e.g., I think maybe DDG
      • uses/can use something like this, to prevent sites being able to tell what search words were used on DDG to find the target site).
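A minimal sketch of the "first pass" idea for exit URLs: if a link points at a known exit endpoint, pull the real target out of its query string, otherwise leave the link alone. This is not FresoBot's code. The lookup table and function name are hypothetical, `exit.sc/?url=` is taken from the discussion above, and the `u` parameter for `facebook.com/l.php` is an assumption.

```python
from urllib.parse import parse_qs, urlsplit

# (host, path) -> query parameter carrying the real target.
# Hypothetical table; only the exit.sc entry comes from the discussion.
EXIT_URL_PARAMS = {
    ("exit.sc", "/"): "url",
    ("facebook.com", "/l.php"): "u",  # assumption: target is passed as `u`
}


def unwrap_exit_url(url):
    """Return the target of a known exit URL, or the URL unchanged."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    param = EXIT_URL_PARAMS.get((host, parts.path or "/"))
    if param is None:
        return url  # not a known exit URL
    # parse_qs also percent-decodes the embedded target.
    targets = parse_qs(parts.query).get(param)
    return targets[0] if targets else url
```

For example, `unwrap_exit_url("https://exit.sc/?url=https%3A%2F%2Fexample.com%2Fband")` returns `https://example.com/band`. The unwrapped target may still need ordinary URL cleanup afterwards.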
      • CatQuest
        try to make people not leave their site (e.g., Facebook warns people that they're leaving Facebook
      • yea I know that type of url, I find them seriously annoying so I am very happy about your bot removing those :D
      • Freso
        CatQuest: Like I wrote in the forum, many of the exit URLs will need additional cleaning of various sorts, this is just a "first pass" of cleaning them.
      • CatQuest
        👍 yurp
      • Freso
        There's definitely no reason for them to be in MB, but they sometimes sneak in if you for example right click links and "copy URL to clipboard".
      • CatQuest
        exactly
      • oh god. googledocs does this
      • i HATE it
      • it makes copy pasting urls into mb from google doc liner notes (yep that's a thing) really annoying
      • Freso
      • CatQuest
        thanks. totes voting yes on all the ones that are not wrong :V
      • Freso
        (All the "exit URLs" FresoBot has handled so far.)
      • CatQuest
        lmao, the first one is youtube redirecting to failbook and instagram, and then there is an instagram redirecting to youtube
      • >_< XD
      • !m Freso(Bot) for removing those!
      • BrainzBot
        You're doing good work, Freso(Bot) for removing those!!
      • CatQuest
        again, because fantastic!
      • dragonzeron has quit
      • would also be great if we could incorporate the most common ones in our URL cleanup script!
      • Freso
        Yeah. I think the issue with that would be that they need two passes, but I haven't really looked at URLCleanup.js much since yvanzo's overhaul of it, so maybe it is (more) feasible now.
      • (I've also been talking with yvanzo about making FresoBot use URLCleanup.js directly to clean up URLs and fix up their relationships too.)
      • yvanzo
        Freso: It has to be done ahead of URLCleanup.js not inside.
      • Freso
        yvanzo: So that part hasn't changed. :)
      • samj1912 is super tired of the 100th power cut of the week -_-
      • Monkey_ has quit
      • Dr-Flay joined the channel
      • samj1912
        ruaok:
      • ruaok
        not a lot of context to evaluate what is going on...
      • samj1912
        ah, replayed live logs
      • ruaok
        I get that.
      • how many requests? what was the server load?