#metabrainz

      • samj1912
        ooh
      • 2018-06-01 15255, 2018

      • ruaok
        so, transferwise is winning on the INR front.
      • 2018-06-01 15209, 2018

      • ruaok
        and it will win for france as well and possibly sweden.
      • 2018-06-01 15250, 2018

      • samj1912
        nice!
      • 2018-06-01 15219, 2018

      • ruaok runs off to get divorced
      • 2018-06-01 15241, 2018

      • samj1912
        zas: what's the current search req/s?
      • 2018-06-01 15258, 2018

      • samj1912
        zas: I guess we will also need to enable https://lucene.apache.org/solr/guide/6_6/distribu… ?
      • 2018-06-01 15221, 2018

      • rsh7 joined the channel
      • 2018-06-01 15209, 2018

      • zas
      • 2018-06-01 15203, 2018

      • zas
        mean peak at ~130 q/s (hitting upstreams), low around ~80
      • 2018-06-01 15238, 2018

      • zas
        for all 3 servers
      • 2018-06-01 15217, 2018

      • zas
        so to be safe we need at least 2 times the max, around 250 q/s, and it's a min
      • 2018-06-01 15251, 2018

      • KassOtsimine joined the channel
      • 2018-06-01 15252, 2018

      • zas
        on 3 nodes that's around 80q/s, 3 times what we get now
      • 2018-06-01 15205, 2018
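
The sizing arithmetic zas describes can be sketched as below. This is only an illustration of the reasoning in the messages above; the function name is made up for this note, and the inputs are the figures quoted in the chat (mean peak of ~130 q/s across 3 servers, a 2x safety margin).

```python
def per_node_target(peak_qps: float, safety_factor: float, nodes: int) -> float:
    """Queries per second each node must sustain so that the cluster
    absorbs safety_factor times the observed peak."""
    return peak_qps * safety_factor / nodes


if __name__ == "__main__":
    peak = 130                      # mean peak for all 3 servers, as quoted above
    total = peak * 2                # 260 q/s, rounded to ~250 in the discussion
    print(total, round(per_node_target(peak, 2, 3), 1))
```

With the rounded 250 q/s figure the per-node target comes out at roughly 83 q/s, which matches the "around 80q/s" estimate.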

      • samj1912
        Hmm
      • 2018-06-01 15214, 2018

      • zas
        but i'm pretty sure the current setup is underperforming
      • 2018-06-01 15259, 2018

      • zas
        what is the current caching config ?
      • 2018-06-01 15215, 2018

      • samj1912
        zas, for load balancing have you taken a look at https://lucene.apache.org/solr/6_6_0//solr-solrj/…
      • 2018-06-01 15224, 2018

      • samj1912
        zas, not sure
      • 2018-06-01 15237, 2018

      • samj1912
        I will have to check
      • 2018-06-01 15209, 2018

      • zas
        samj1912: that's not for our case, that's for client lib
      • 2018-06-01 15223, 2018

      • HSOWA has quit
      • 2018-06-01 15226, 2018

      • zas
        we need a frontend like nginx reverse proxy or haproxy
      • 2018-06-01 15246, 2018

      • zas
        but we don't need to set it up yet, let's focus on the performance of one node first
      • 2018-06-01 15249, 2018

      • samj1912
        Oh cool
      • 2018-06-01 15209, 2018

      • samj1912
        Can you conduct your own benchmarks once?
      • 2018-06-01 15219, 2018

      • samj1912
        I am not sure if I did it correctly?
      • 2018-06-01 15242, 2018

      • zas
        can you document how you did it ? i'll use it as a basis
      • 2018-06-01 15209, 2018

      • zas
        my feeling is that the current configuration is totally "untuned"
      • 2018-06-01 15253, 2018

      • samj1912
        I just varied a couple of params for ab: concurrency, the number of rows returned (25, 100, 300), and the output type (mbjson vs mbxml)
      • 2018-06-01 15212, 2018
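
A parameter sweep like the one samj1912 describes could be scripted roughly as follows. This is a sketch under assumptions, not the script that was actually used: the base URL, the `advanced` handler path, the concurrency values and the `wt` format names are placeholders chosen for illustration. Only the `ab` flags `-n` (total requests) and `-c` (concurrency) and the Solr parameters `q`, `rows` and `wt` are standard.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical host, core and handler; adjust to the real Solr node.
BASE_URL = "http://localhost:8983/solr/release/advanced"


def ab_commands(base_url, query, concurrencies, rows_values, formats, requests=1000):
    """Build one `ab` command line per (concurrency, rows, format) combination."""
    commands = []
    for concurrency, rows, wt in product(concurrencies, rows_values, formats):
        params = urlencode({"q": query, "rows": rows, "wt": wt})
        url = f"{base_url}?{params}"
        commands.append(["ab", "-n", str(requests), "-c", str(concurrency), url])
    return commands


if __name__ == "__main__":
    # Row counts come from the chat; the other values are assumed.
    for cmd in ab_commands(BASE_URL, "*:*", [1, 15, 50], [25, 100, 300],
                           ["mbjson", "mbxml"]):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run` as-is, which avoids shell quoting issues with the query string.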

      • Darkloke
        Hi2All. What is this type of error in picard - "E: 13:39:19 Fingerprint calculator failed exit code = 2, exit status = 0, error = Unknown error" ?
      • 2018-06-01 15236, 2018

      • zas
        how did you generate queries ? can you show url schema ?
      • 2018-06-01 15243, 2018

      • samj1912
        I ran the default query under the advanced end point
      • 2018-06-01 15205, 2018

      • samj1912
      • 2018-06-01 15205, 2018

      • zas
        did you limit number of results or anything?
      • 2018-06-01 15212, 2018

      • samj1912
        See above
      • 2018-06-01 15226, 2018

      • samj1912
        As I said, varied between 25, 100 and 300
      • 2018-06-01 15240, 2018

      • zas
        but this query returns all releases ?
      • 2018-06-01 15252, 2018

      • samj1912
        Yup
      • 2018-06-01 15221, 2018

      • samj1912
        But shows the top 30 results
      • 2018-06-01 15258, 2018

      • zas
        do you want me to extract a bunch of real queries from current logs ? and we use same params to mimic actual traffic
      • 2018-06-01 15216, 2018

      • samj1912
        Sure
      • 2018-06-01 15225, 2018

      • samj1912
        That will work better
      • 2018-06-01 15253, 2018

      • samj1912
        But we will have to alter the params a bit
      • 2018-06-01 15211, 2018

      • samj1912
        Solr has different params than ws
      • 2018-06-01 15241, 2018
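
The parameter translation mentioned here (web service requests replayed against Solr) might look like the sketch below. The ws/2 search parameters `query`, `limit` and `offset` and the Solr parameters `q`, `rows` and `start` are the standard names; the Solr base URL, the one-core-per-entity layout and the `select` handler are assumptions made for this example, and the `fmt` to `wt` mapping is left out because the chat does not spell it out.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# ws/2 search parameter -> Solr parameter
PARAM_MAP = {"query": "q", "limit": "rows", "offset": "start"}


def ws_to_solr(ws_url, solr_base="http://localhost:8983/solr"):
    """Translate a logged ws/2 search URL into a Solr URL.

    Returns None for requests without a `query` parameter (lookups, browses).
    solr_base and the /<entity>/select layout are hypothetical.
    """
    parts = urlsplit(ws_url)
    entity = parts.path.rstrip("/").split("/")[-1]  # "release" in /ws/2/release
    params = parse_qs(parts.query)
    solr_params = {PARAM_MAP[k]: v[0] for k, v in params.items() if k in PARAM_MAP}
    if "q" not in solr_params:
        return None
    solr_params.setdefault("rows", "25")  # ws/2 search defaults to 25 results
    return f"{solr_base}/{entity}/select?{urlencode(solr_params)}"


if __name__ == "__main__":
    print(ws_to_solr("https://musicbrainz.org/ws/2/release/?query=artist:foo&limit=100"))
```

Filtering out non-search requests at this step keeps the replay file limited to traffic that actually reaches the search servers.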

      • zas
        yup, i'll first extract requests
      • 2018-06-01 15232, 2018

      • zas
        let's work on releases and recordings ok ?
      • 2018-06-01 15241, 2018

      • samj1912
        Okay
      • 2018-06-01 15254, 2018

      • samj1912
        Btw, recordings are still being indexed
      • 2018-06-01 15228, 2018

      • samj1912
        Or not
      • 2018-06-01 15239, 2018

      • samj1912
        Looks like it died again
      • 2018-06-01 15242, 2018

      • samj1912
        Sigh
      • 2018-06-01 15256, 2018

      • samj1912
        zas, on further investigation it seems like we should have more than 1 shard minimum
      • 2018-06-01 15206, 2018

      • samj1912
      • 2018-06-01 15255, 2018

      • samj1912
        Also should we increase the number of threads?
      • 2018-06-01 15217, 2018

      • zas
        dunno, we need to measure things first
      • 2018-06-01 15219, 2018

      • Leo__Verto joined the channel
      • 2018-06-01 15200, 2018

      • iliekcomputers
        Hi
      • 2018-06-01 15227, 2018

      • iliekcomputers
        I have access to internet after what seems like a lifetime 😳
      • 2018-06-01 15245, 2018

      • Leo__Verto has quit
      • 2018-06-01 15256, 2018

      • drsaunder has quit
      • 2018-06-01 15202, 2018

      • samj1912
        zas:
      • 2018-06-01 15205, 2018

      • samj1912
      • 2018-06-01 15213, 2018

      • samj1912
        from random URLs after parsing our logs
      • 2018-06-01 15252, 2018

      • samj1912
        what's the current ws global limit?
      • 2018-06-01 15205, 2018

      • drsaunder joined the channel
      • 2018-06-01 15247, 2018

      • samj1912
        hmm, solr pretty much shits itself when I increase concurrency to 50
      • 2018-06-01 15230, 2018

      • Leo__Verto joined the channel
      • 2018-06-01 15256, 2018

      • samj1912
        or when I try to benchmark it at 15
      • 2018-06-01 15215, 2018

      • samj1912
        but it hits up to 110 reqs/s under a real load from logs
      • 2018-06-01 15230, 2018

      • samj1912
        most prolly coz it returns a blank for most of them since recordings aren't indexed
      • 2018-06-01 15202, 2018

      • samj1912
        zas: I also made a list of urls evenly distributed amongst all 3 solr nodes
      • 2018-06-01 15225, 2018

      • samj1912
      • 2018-06-01 15201, 2018
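
One simple way to spread a URL list evenly over the 3 Solr nodes is round-robin assignment, sketched below. The node addresses are placeholders, and this is not necessarily how the files on kiki were generated.

```python
from itertools import cycle

# Hypothetical node addresses; replace with the real Solr hosts.
NODES = ["http://solr1:8983", "http://solr2:8983", "http://solr3:8983"]


def distribute(paths, nodes):
    """Prefix each request path with a node address, cycling through the nodes."""
    node_cycle = cycle(nodes)
    return [f"{next(node_cycle)}{path}" for path in paths]


if __name__ == "__main__":
    sample = [f"/solr/release/select?q=test{i}" for i in range(6)]
    for url in distribute(sample, NODES):
        print(url)
```

Round-robin keeps the per-node counts within one request of each other, so a benchmark run over the resulting list loads all nodes about equally.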

      • ruaok
        hiya iliekcomputers!
      • 2018-06-01 15221, 2018

      • ruaok
        we've got loads to do for when you're back. :)
      • 2018-06-01 15258, 2018

      • rsh7 has quit
      • 2018-06-01 15212, 2018

      • samj1912
        zas: all the test files are on my home folder on kiki including the generation script
      • 2018-06-01 15253, 2018

      • samj1912
        I'm off to the ophthalmologist
      • 2018-06-01 15241, 2018

      • D4RK-PH0ENiX has quit
      • 2018-06-01 15218, 2018

      • D4RK-PH0ENiX joined the channel
      • 2018-06-01 15219, 2018

      • yvanzo
        GeneralDiscourse: Current schema is 23 (for one year and to last about six more months). How old is your database? Does https://musicbrainz.org/doc/MusicBrainz_Server/Se… help?
      • 2018-06-01 15228, 2018

      • D4RK-PH0ENiX has quit
      • 2018-06-01 15242, 2018

      • Darkloke has quit
      • 2018-06-01 15241, 2018

      • D4RK-PH0ENiX joined the channel
      • 2018-06-01 15240, 2018

      • iliekcomputers
        ruaok: woo!
      • 2018-06-01 15225, 2018

      • ruaok
        gah, that sucks.
      • 2018-06-01 15245, 2018

      • bukwurm joined the channel
      • 2018-06-01 15213, 2018

      • iliekcomputers
        Good thing is I don't have to go outside 😬😬
      • 2018-06-01 15246, 2018

      • ruaok
        between the heat and the air, enough to kill ya.
      • 2018-06-01 15210, 2018

      • ruaok
        at least the food doesn't kill you.
      • 2018-06-01 15207, 2018

      • iliekcomputers
        Yea
      • 2018-06-01 15213, 2018

      • iliekcomputers
        There's a sandstorm out there rn
      • 2018-06-01 15204, 2018

      • samj1912
        Soooo. Apparently I am not able to see nearby objects for at least 12 hrs. I'll be afk for a while now.
      • 2018-06-01 15214, 2018

      • bukwurm_
        LordSputnik: Oh, sorry wasn't able to see your message yesterday. I was on matrix connection, and it didn't come across. 😅
      • 2018-06-01 15245, 2018

      • iliekcomputers
        samj1912: lasik?
      • 2018-06-01 15221, 2018

      • samj1912
        Nope, just some random eye solution for an eye check up.
      • 2018-06-01 15214, 2018

      • samj1912
        Using my grandma's bifocals to type 😂
      • 2018-06-01 15232, 2018

      • samj1912
        So I'll be off now o/
      • 2018-06-01 15257, 2018

      • ruaok
        lol, I wish I could see that.
      • 2018-06-01 15235, 2018

      • ruaok is imagining https://goo.gl/images/nRKTC4
      • 2018-06-01 15234, 2018

      • UmkaDK_ has quit
      • 2018-06-01 15244, 2018

      • sentriz has quit
      • 2018-06-01 15240, 2018

      • Leo__Verto has quit
      • 2018-06-01 15225, 2018

      • Sophist-UK joined the channel
      • 2018-06-01 15239, 2018

      • Sophist_UK has quit
      • 2018-06-01 15205, 2018

      • drsaunder has quit
      • 2018-06-01 15213, 2018

      • D4RK-PH0ENiX has quit
      • 2018-06-01 15239, 2018

      • D4RK-PH0ENiX joined the channel
      • 2018-06-01 15236, 2018

      • drsaunder joined the channel
      • 2018-06-01 15218, 2018

      • zas
        samj1912: around?
      • 2018-06-01 15244, 2018

      • zas
        is it sir loading williams ? still indexing recordings ?
      • 2018-06-01 15216, 2018

      • thomasross joined the channel
      • 2018-06-01 15258, 2018

      • zas
        bitmap, yvanzo: ping
      • 2018-06-01 15213, 2018

      • zas
        /var/lib/docker/overlay2/9d84776d88faeff81707abc8ff4e46a2160528d3efad8ece4384811d44a2ac58/merged/home/musicbrainz on cage is using a lot of diskspace, in musicbrainz-server (363M) .cpanm (333M) and .cache (320M)
      • 2018-06-01 15218, 2018

      • zas
        is it expected ?
      • 2018-06-01 15240, 2018

      • zas
        there are significant increases of diskspace usage on many machines since 8am GMT
      • 2018-06-01 15257, 2018

      • bitmap has quit
      • 2018-06-01 15204, 2018

      • bitmap joined the channel
      • 2018-06-01 15206, 2018

      • zas
      • 2018-06-01 15213, 2018

      • zas
        oh, ignore this, those are mainly on md1 (that's /boot)
      • 2018-06-01 15223, 2018

      • zas
        yvanzo, bitmap: i was debugging Bandcamp importer issues related to https & Bandcamp changes. I noticed that https://musicbrainz.org/label/1d81d8ca-588a-4837-… has https Bandcamp on right side, but http in relationships
      • 2018-06-01 15250, 2018

      • zas
        importer looks for urls to find out if entity exists, but in this case https://suiciderecordsswe.bandcamp.com/album/vall… doesn't show anything, the importer looks for https url, but doesn't find it because it is stored as http
      • 2018-06-01 15254, 2018

      • zas
      • 2018-06-01 15219, 2018

      • zas
        that's a bit messy
      • 2018-06-01 15228, 2018

      • zas
        all Bandcamp urls should be converted to https imho, since Bandcamp now redirects all http to https (though it still gives http url in DOM data, the importer will be modified to convert them to https)
      • 2018-06-01 15236, 2018

      • zas
        what do you think?
      • 2018-06-01 15201, 2018

      • yvanzo
        zas: We have to normalize in the database URLs which are normalized in the code.
      • 2018-06-01 15237, 2018

      • zas
        There are also instagram urls : with and without www. i found some artists having both
      • 2018-06-01 15240, 2018

      • yvanzo
        http/https conversions can be done with a direct db change from a cron task
      • 2018-06-01 15209, 2018

      • yvanzo
        more advanced normalizations should be done using URLCleanup.js directly
      • 2018-06-01 15217, 2018

      • yvanzo
        with a bot
      • 2018-06-01 15219, 2018

      • bukwurm has quit
      • 2018-06-01 15213, 2018

      • zas
        i can fix the importer to use https URLs, but since there are http and https Bandcamp URLs in the db atm, it means a proper fix would be to query for both forms :(
      • 2018-06-01 15219, 2018

      • zas
        actually it'd be great to be able to query ignoring the protocol part
      • 2018-06-01 15247, 2018
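
The two options zas weighs here, querying for both forms or comparing while ignoring the protocol, can be sketched as follows. The helper names and the example URL are hypothetical and this is not the importer's actual code; it only uses the standard `urllib.parse` functions.

```python
from urllib.parse import urlsplit, urlunsplit


def scheme_variants(url):
    """Return the https and http forms of a URL, https first."""
    parts = urlsplit(url)
    return [urlunsplit(parts._replace(scheme=s)) for s in ("https", "http")]


def same_ignoring_scheme(a, b):
    """True when two URLs differ at most in their scheme (and host case)."""
    pa, pb = urlsplit(a), urlsplit(b)
    return (pa.netloc.lower(), pa.path, pa.query) == (pb.netloc.lower(), pb.path, pb.query)


if __name__ == "__main__":
    stored = "http://example.bandcamp.com/album/some-album"    # made-up URL
    seen = "https://example.bandcamp.com/album/some-album"
    print(scheme_variants(seen))
    print(same_ignoring_scheme(stored, seen))
```

Querying both variants doubles the lookups, which is the cost zas wants to avoid; once all stored Bandcamp URLs are https, only the first variant is needed.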

      • zas
        yvanzo: when do you think Bandcamp URLs can be fixed in the db ?
      • 2018-06-01 15204, 2018

      • zas
      • 2018-06-01 15216, 2018

      • zas
      • 2018-06-01 15235, 2018

      • zas
        but on mouseover it displays https url ...
      • 2018-06-01 15256, 2018

      • zas
        I'll fix the importer to play well with https urls, but i'll not double the queries, it shouldn't be a problem if bandcamp urls are all converted in the db
      • 2018-06-01 15247, 2018

      • HSOWA joined the channel
      • 2018-06-01 15247, 2018

      • HSOWA has quit
      • 2018-06-01 15247, 2018

      • HSOWA joined the channel
      • 2018-06-01 15257, 2018

      • KassOtsimine has quit
      • 2018-06-01 15250, 2018

      • zas
      • 2018-06-01 15242, 2018

      • Leo__Verto joined the channel
      • 2018-06-01 15226, 2018

      • KassOtsimine joined the channel
      • 2018-06-01 15240, 2018

      • HSOWA has quit
      • 2018-06-01 15219, 2018

      • Nyanko-sensei joined the channel
      • 2018-06-01 15251, 2018

      • D4RK-PH0ENiX has quit
      • 2018-06-01 15235, 2018

      • antlarr has quit