#metabrainz

      • vogen joined the channel
      • vogen has quit
      • MusicbrainzB0T has quit
      • MusicbrainzB0T joined the channel
      • thomasross has quit
      • reosarevok
        CatQuest: yeah, that's all it does. Now that's all sortname guess does too :p
      • _lucifer
        ruaok, i tried to test again on michael but it seems not to be picking up requests from the queue again.
      • however pinging prince on the 62673 succeeds.
      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #2066 (beta…MBS-11583): MBS-11583: Use sanitized context in hydrated component https://github.com/metabrainz/musicbrainz-serve...
      • [musicbrainz-server] reosarevok merged pull request #2066 (beta…MBS-11583): MBS-11583: Use sanitized context in hydrated component https://github.com/metabrainz/musicbrainz-serve...
      • [musicbrainz-server] reosarevok merged pull request #2033 (master…MBS-11542): MBS-11542 / MBS-11552: Allow and cleanup new Classical Archives links + add validation https://github.com/metabrainz/musicbrainz-serve...
      • _lucifer
        ok i figured due to some reason yarn is trying to connect to marlon instead of worker-marlon. i think that might be the case if it tries the public ip instead of the internal ip.
      • interesting hdfs cli works as expected though reports the internal ips and worker-*
      • ok. the ip thing is fixed now but it now fails with user application exited with exit code 1.
      • alastairp
        CatQuest: thanks! blank spaces is OK, the only problem we might have is an empty string. Anything that is actually represented by characters is fine 👍
      • sumedh joined the channel
      • reosarevok
        bitmap, yvanzo: for MBS-1658, I was thinking at least one of the places to add a comment to the entry should be from the list itself
      • BrainzBot
        MBS-1658: My Collection: add free text comment field https://tickets.metabrainz.org/browse/MBS-1658
      • reosarevok
      • CatQuest
        alastairp: iirc there was an issue with an artist credit that included https://beta.musicbrainz.org/artist/3f0bdf7f-3f... some time back..
      • reosarevok
        bitmap, yvanzo: So that last column should have like an edit icon somehow, and I guess allow inline editing that would get sent to the DB? Do you know if we do anything like that anywhere else?
      • Or if you think that's a bad idea, how would you do it?
      • alastairp
        CatQuest: hah, nice
      • that's not a problem either though, but I can see how it could be a problem
      • CatQuest
        it was a nice bobby tables thing
      • .. niche
      • damn englich
      • sumedh has quit
      • D4RK joined the channel
      • D4RK-PH0ENiX has quit
      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #2067 (master…MBS-10899): MBS-10899: Report for releases with catnos that look like ISRCs https://github.com/metabrainz/musicbrainz-serve...
      • ruaok
        moooin!
      • Mr_Monkey: alastairp : so after more experimenting last night, I'm able to get rid of the min/max ts cont aggs by simply creating a 5 days cont agg with compound index on user/listened_at. That's 18M rows less for starters.
      • and I think we can replace those with month and year cont aggs for the graphs you two would like.
      • alastairp
        right. get all data from the same table?
      • ruaok
        yeah, it was already there. just the index was missing to make it faster.
      • alastairp
        sweet, if a month and year aggregate is possible then that sounds like it should be perfect
      • great
      • ruaok
        basically we swapped doing a table scan on the DB with an index scan. not sure we can do much better than that -- but with increased cache times, this should work well.
      • alastairp
        _lucifer: ^ remember how I told you to add indexes to tables where you want to select some data?
      • _lucifer
        yup, i'll keep it in mind :D
      • ruaok
        it's the rookie mistake that keeps on giving. #going25yearsstrong
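The table-scan versus index-scan point above can be illustrated with a small sketch. It uses SQLite as a stand-in, since the chat does not show the real database or schema; the table name `listen`, its columns, and the index name `listen_user_ts` are all hypothetical. It prints the query plan for the same per-user min/max query before and after adding a compound index on user and timestamp.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Hypothetical schema: one row per listen, keyed by user and timestamp.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listen (user_name TEXT, listened_at INTEGER, track TEXT)")
conn.executemany(
    "INSERT INTO listen VALUES (?, ?, ?)",
    [(f"user{i % 50}", 1_600_000_000 + i, f"track{i}") for i in range(5000)],
)

sql = "SELECT min(listened_at), max(listened_at) FROM listen WHERE user_name = ?"

# Without an index the planner has to read every row of the table.
plan_before = query_plan(conn, sql, ("user7",))

# Compound index on (user, timestamp), as described in the chat.
conn.execute("CREATE INDEX listen_user_ts ON listen (user_name, listened_at)")
plan_after = query_plan(conn, sql, ("user7",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions and will differ again on another database engine, so treat the printed text as illustrative only.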
      • sumedh joined the channel
      • sumedh has quit
      • _lucifer
        ruaok, apparently `0.0.0.0` is causing hostname resolution errors. 0.0.0.0 is resolving to the server's name michael instead of leader.
      • the same error was happening on workers, leading to resolving as tito instead of worker-tito and so on.
      • sumedh joined the channel
      • i fixed that by changing the configurations of various files here https://github.com/metabrainz/hadoop-cluster-do...
      • but for michael it seems it is picking up some default we didn't use to define in the earlier setup. any guesses which one it could be?
      • ruaok
        _lucifer: I suspect that is because the canonical name of the machine is michael and has its reverse DNS set like that.
      • so, for config purposes you should always use michael. leader is just a shorthand/convention for us to log into the cluster.
      • _lucifer
        i think what is happening is that michael is used when it tries to bind an interface on 0.0.0.0.
      • ruaok
        I would change the /etc/hosts and change leader to michael.
      • _lucifer
        makes sense. i'll try that.
      • ruaok
        not 100% sure that will work. the last paste -- which container is that from? is it up?
      • _lucifer
        no it goes down after that.
      • ruaok
        from the inside of that container can you ping michael:31171 ?
      • try the bash trick again, get the container up and then see what you can or cannot connect to.
      • _lucifer
        just tried that fails with unknown host
      • ping michael works but with the port doesn't
      • ruaok
        sorry wget michael:31171
      • _lucifer
        Aaannd it worked!
      • ruaok
        !m _lucifer
      • BrainzBot
        You're doing good work, _lucifer!
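As the exchange above shows, `ping` cannot take a port, so a reachable host says nothing about whether a given TCP port accepts connections. A minimal sketch of a port check follows; the helper name `can_connect` is hypothetical, and the host and port in the usage comment are simply the ones mentioned in the chat.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


# Usage from inside the container, with the values from the chat:
#   can_connect("michael", 31171)
```

A `False` result does not distinguish between a name that fails to resolve and a port that refuses the connection; splitting those cases would need separate handling of `socket.gaierror`.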
      • _lucifer
      • ruaok
        yisss!
      • _lucifer
        changing to michael didn't work
      • but changing a spark default to the vlan ip did
      • ruaok
        so, let's do all the loading of data (mapping, incrementals) then we can fire off some jobs.
      • that makes sense.
      • very very good.
      • _lucifer
        two things left to do. one is define memory defaults and second is update the new configuration in syswiki.
      • monitoring this cluster is easier than the docker one. one tunnel is sufficient
      • ruaok
        that was exactly the goal.
      • and each server is being monitored by all of zas' magic.
      • _lucifer
        :D
      • alastairp: available to talk about the GH actions PR?
      • ruaok, zas: do you know if any service we run on j5 might listen on port 5666?
      • zas
        nagios
      • well, its agent
      • _lucifer
        👍 thanks!
      • sumedh has quit
      • sumedh joined the channel
      • zas, i saw a commit in syswiki renaming germaine to jermaine so wanted to let you know that i noticed /etc/hosts on jermaine still contains a couple of entries referring to germaine.
      • zas
        oh, ok, I'll fix it
      • https://data.musicbrainz.org (ftp / williams over http(s|2))
      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #2068 (master…MBS-10711): MBS-10711: Convert report lists to react-table [WIP] https://github.com/metabrainz/musicbrainz-serve...
      • reosarevok
        bitmap, yvanzo ^ would really appreciate some feedback on whether the way I'm approaching this seems sensible, improvements etc, before I keep working on other lists
      • zas
      • BrainzGit
        [bookbrainz-site] akashgp09 opened pull request #601 (master…browser-compatibility): FIX(BB-615): Copy/Paste annotation text in FireFox <= 60 https://github.com/bookbrainz/bookbrainz-site/p...
      • scory joined the channel
      • scory
        Hello everyone. I would like to ask about Lucene Search syntax of the musicbrainz database. What kind of instance is it running on? If i would like to have a musicbrainz database (mbdata) with ElasticSearch instance, how do you recommend to integrate these two. I just started looking into ElasticSearch but what I found out i would need some kind of
      • data set to import it to ElasticSearch (*.json for example). Do you have some kind of method to import musicbrainz database into a Lucene Search instance, a data set I can use, or should i generate it myself? How do you keep it updated?
      • ruaok
        hi scory!
      • why must it be elasticsearch?
      • because we have a perfectly working search infrastructure that you can use without having to reinvent the wheel.
      • scory
        That infrastructure currently doesn't support what I need, last time I was here that was the conclusion for me. That's the reason I am currently running a mbdata server locally, and can run graphql queries against it, with batching. But I would like to implement an ElasticSearch instance on graphql. But currently i am just investigating.
      • ruaok
        you could look at the denormalized JSON dumps we have: ftp://ftp.eu.metabrainz.org/pub/musicbrainz/dat...
      • those are a good fit for importing into a document store.
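One way the import suggested above might look is sketched here. It assumes each dump file holds one JSON document per line and that every document carries an `id` field; neither is confirmed in the chat. The function name `to_bulk_lines` and the index name `artist` are hypothetical. The output follows the newline-delimited layout of the Elasticsearch bulk API: an action line followed by the document itself.

```python
import json
from typing import Iterable, Iterator


def to_bulk_lines(dump_lines: Iterable[str], index: str) -> Iterator[str]:
    """Turn JSON-lines documents into bulk action/source line pairs."""
    for line in dump_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the dump
        doc = json.loads(line)
        # Assumption: the document's "id" field is used as the document id.
        yield json.dumps({"index": {"_index": index, "_id": doc["id"]}})
        yield json.dumps(doc)


# Made-up sample records standing in for lines of a dump file.
sample = [
    '{"id": "mbid-1", "name": "Example Artist"}',
    "",
    '{"id": "mbid-2", "name": "Another Artist"}',
]

# The bulk body must end with a trailing newline.
bulk_body = "\n".join(to_bulk_lines(sample, "artist")) + "\n"
print(bulk_body)
```

Keeping the index updated is a separate question that this sketch does not address.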
      • scory
        Thank you very much.
      • scory has quit
      • sumedh has quit
      • bitmap
        reosarevok: I can't think of anything else like that offhand, but doesn't seem like a bad idea. we could add a small endpoint to /ws/js for it
      • sumedh joined the channel
      • vardan has quit
      • adhi001 joined the channel
      • adhi001
        Sorry ruaok , I was sick the last week and was not able to submit a proposal for GSoC. Still part of the community :)
      • ruaok
        oh, bummer. that sucks. at least you're better, right?
      • adhi001
        yeah
      • Thank you for your concern
      • alastairp
        _lucifer: hi, sorry - had a hectic day. still around?
      • _lucifer
        alastairp: hi! no worries. yup, i am available.
      • alastairp
        so I was suggesting using test.sh in the actions?
      • _lucifer
        yes
      • alastairp
        so we already have things like `./test.sh -b` to build, and `test.sh -u` to bring up containers
      • test.sh fe to run frontend tests
      • _lucifer
        yes there's also test.sh spark
      • alastairp
        great, so it sounds like it's probably a good fit that we can use the actions files for specifying the order in which to run things, but reusing test.sh for the actual commands allows us to share 100% test code between local development and CI, right?
      • _lucifer
        should we use separate build steps? like there's a ./test.sh -u to just bring up supporting containers. or should we just do ./test.sh which does it in one go
      • alastairp
        but we need to separate pull / cache / build / run, in CI, right?
      • _lucifer
        yes, mostly. we'll still have to pull manually
      • that step won't change
      • alastairp
        one question - if a test generates files (e.g. the junit xml), will it be cached? Or does the cache action only cache docker layers?
      • _lucifer
        i'll need to check that.
      • i expect docker layers only but we can confirm it by generating some files and looking at the actions output
      • alastairp
        it seems that satackey/action-docker-layer-caching works explicitly on layers (looking at the output, it uses docker commands to generate the archives)
      • so yeah, ./test.sh pull; restore cache; run test.sh; save cache
      • if that works, then I'm all for it!
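The ordering proposed above (pull, restore cache, run tests, save cache) could be sketched as a small step runner. The `./test.sh` invocations are written as they appear in the chat; the two cache steps are placeholder `echo` commands, since the real caching would be done by the CI action, and `run_steps` is a hypothetical helper, not part of test.sh.

```python
import subprocess
from typing import Callable, List, Optional, Sequence, Tuple

Step = Tuple[str, Sequence[str]]

# Order taken from the chat. The cache entries are placeholders only.
STEPS: List[Step] = [
    ("pull", ["./test.sh", "pull"]),
    ("restore cache", ["echo", "restore cache (placeholder)"]),
    ("run tests", ["./test.sh"]),
    ("save cache", ["echo", "save cache (placeholder)"]),
]


def run_steps(
    steps: Sequence[Step],
    run: Optional[Callable[[Sequence[str]], int]] = None,
) -> List[str]:
    """Run steps in order, stop at the first failure, return completed names."""
    if run is None:
        # Default runner: execute the command and report its exit code.
        run = lambda argv: subprocess.run(list(argv)).returncode
    done: List[str] = []
    for name, argv in steps:
        if run(argv) != 0:
            break
        done.append(name)
    return done
```

Passing a custom `run` callable makes the ordering checkable without Docker or test.sh being present. Note that in this sketch a failing test run also skips the save-cache step, which may or may not be what is wanted in CI.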
      • _lucifer
        makes sense. i'll try that.
      • the junit action works fine but as i mentioned it might comment excessively
      • alastairp
        neat. did you see what happens if one fails? does it only update the comment or does it also add an annotation to the failing test?
      • and it'll make a new comment on every push (i.e. even if they all pass?)
      • _lucifer
        no i haven't, let me do that right now.
      • yes
      • alastairp
        or only if the results of the test run change?
      • interesting
      • you're right that this could get a bit annoying
      • _lucifer
        it'll hide the existing one and add a new one
      • for LB there are going to be 4 comments on each push
      • alastairp
        oh, that's quite annoying. merging tests together would help (e.g. get it down to 2), but I suspect that this might be too much
      • one thing that ruaok was suggesting back on jenkins was that it seems stupid to run tests on _every_ push, perhaps there could be a way to run them less often. once a day? on request based on a comment? just before merge?
      • ruaok
        anything, really.
      • alastairp
        let's not spend too much more time on this, but perhaps there is an action or a flag for `on:` that lets us decide to run them less often
      • _lucifer
        i'll look into that. should be possible i think
      • alastairp
        on: comment: contains: "test please"
      • that'd be great
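The `on: comment: contains:` line above is shorthand rather than real workflow syntax. GitHub does deliver an `issue_comment` event for comments on pull requests, so the "contains" check would have to be done on the event payload. A minimal sketch of that check follows; the function name `should_run_tests` is hypothetical, and the trigger phrase is the one from the chat.

```python
def should_run_tests(event_name: str, payload: dict, phrase: str = "test please") -> bool:
    """Decide whether a webhook event is a new PR comment asking for tests."""
    if event_name != "issue_comment":
        return False
    if payload.get("action") != "created":
        return False  # ignore edited or deleted comments
    issue = payload.get("issue") or {}
    # Comments on plain issues lack the "pull_request" key.
    if "pull_request" not in issue:
        return False
    body = (payload.get("comment") or {}).get("body") or ""
    return phrase.lower() in body.lower()


# Made-up payload with only the fields the check reads.
example = {
    "action": "created",
    "issue": {"number": 1, "pull_request": {"url": "..."}},
    "comment": {"body": "Test please!"},
}
print(should_run_tests("issue_comment", example))
```

The match is case-insensitive by choice here; a stricter rule, such as requiring the phrase on its own line or restricting who may trigger a run, would be a small change.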