#metabrainz

      • agentsim has quit
      • ZaphodBeeblebrox
        \o heisan gcilou
      • ZaphodBeeblebrox is now known as CatQuest
      • CatQuest
        sorry that of all the scandinavian countries you had to go to sweden :D
      • CatQuest looks out on the pretty sun here >:D
      • ruaok
        our usual gelateria makes the cut of the best in barcelona: http://www.lavanguardia.com/comer/sitios/201706...
      • gcilou
        Lol the sun is finally peeking out
      • But man, these trains are difficult to figure out
      • ruaok
        the t-banen?
      • or trying to figure out why the swedish trains all look fugly?
      • gcilou
        All of them. Especially with my non-existent swedish
      • CatQuest
        I'm sure swedes speak english
      • gcilou
        Their signs don't
      • ruaok
        these trains are never going to mate: https://www.seat61.com/images/Sweden-oresund-tr... (partly because they are actually danish)
      • do you have a working mobile phone? google translate is a life saver.
      • CatQuest
        rob: SothoTalKer linked something interesting earlier: https://www.bleepingcomputer.com/news/security/...
      • ruaok
        but you can stop any swede and ask. not only will they speak english, but they will be glad to help.
      • or take you to ikea, which may not be what you want.
      • CatQuest
        stay away from ikea! :O
      • (although the meatballs are yum)
      • gcilou
        True. The train and bus buying system is weird
      • zag joined the channel
      • ruaok
        CatQuest: good thing we don't operate a hadoop cluster
      • CatQuest
        well I read the word "elasticsearch" in there so I figured.. better
      • share it just in case?
      • it might be interesting anyway
      • hmm gcilou I don't know if swedish systems are that different from norwegian ones.. what's weird about them? :D
      • ruaok
        they run on time? there are so many trains in a day?
      • iliekcomputers
        hmm, trains running on time, I wonder how that works
      • :)
      • zag
        just for info, ever since musicbrainz-mirror went down last month, the main MB web service has also been unusable. I think a few big users switched back and it's overloading again
      • ruaok
        it even works in spain. :)
      • Guest29341 has quit
      • zag: we're aware and working on it.
      • zag
        thx, any news about api keys or usage stats? I know that was a hope one day
      • ruaok
        not yet, no
      • zag
        k thx
      • gcilou
        Lol I guess just buying tickets for a ride rather than a certain number of days is different than in the US
      • Matthew__ joined the channel
      • Matthew__ is now known as Guest79527
      • Guest79527
        Hi, all. We have an issue where replication is permanently stuck in 'Continuing a previously aborted load' but no statements are executed and dbmirror_pending does not get cleared. This started happening at some time between 1600 and 1700 BST yesterday. Do you have any advice as to how we might 'reset' replication? Can I safely truncate the dbmirror_pending table, for example?
      • Guest79527 is now known as MatthewGlubb
      • reosarevok
        bitmap, ruaok ^
      • ruaok
        hmmm, I don't know what might cause that.
      • but truncating the table would surely cause the replication to fail, which it may already have.
      • do you have logs going back that you could look at and see what happened?
      • SothoTalKer
        wow, sweden looks ugly '-'
      • MatthewGlubb
        It happened across all of our replication instances, production and pre-production databases - there was no user input. What I did see was that the previous packet did not complete applying all of its statements.
      • bitmap
        I saw someone filed MBS-9366 just now, which sounds the same. but I don't know what could've caused this yet
      • BrainzBot
        MBS-9366: duplicate key value violates unique constraint "artist_alias_idx_primary" https://tickets.metabrainz.org/browse/MBS-9366
      • MatthewGlubb
        Thanks. That certainly looks like it could be the same problem as it matches the time the error started.
      • (for us)
      • bitmap
        especially in the artist_alias table
      • 104949 would be the problematic packet
      • MatthewGlubb
        Yes
      • "artist_alias_idx_primary" UNIQUE, btree (artist, locale) WHERE primary_for_locale = true AND locale IS NOT NULL
      • So it seems like the artist/locale is being modified but it's trying to modify it to an existing artist/locale
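The conflict described above can be reproduced in miniature. This is a minimal sketch, assuming a simplified artist_alias table and using SQLite's in-memory database in place of PostgreSQL (both support partial unique indexes). The column set, the row values, and the move_alias helper are illustrative assumptions, not the MusicBrainz schema or the replication code.

```python
import sqlite3

# Assumption: a cut-down artist_alias table, just enough to carry the index.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE artist_alias (
        id INTEGER PRIMARY KEY,
        artist INTEGER NOT NULL,
        locale TEXT,
        primary_for_locale INTEGER NOT NULL DEFAULT 0
    )
""")

# Same shape as the index quoted in the log: unique on (artist, locale),
# but only for rows that are the primary alias for a non-NULL locale.
conn.execute("""
    CREATE UNIQUE INDEX artist_alias_idx_primary
    ON artist_alias (artist, locale)
    WHERE primary_for_locale = 1 AND locale IS NOT NULL
""")

# Two hypothetical artists, each with a primary "en" alias.
conn.executemany(
    "INSERT INTO artist_alias (id, artist, locale, primary_for_locale) "
    "VALUES (?, ?, ?, ?)",
    [(1, 100, "en", 1), (2, 200, "en", 1)],
)


def move_alias(alias_id, new_artist):
    """Point an alias at another artist; return False on a unique violation."""
    try:
        conn.execute(
            "UPDATE artist_alias SET artist = ? WHERE id = ?",
            (new_artist, alias_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# Moving alias 2 onto artist 100 would create a second primary "en" alias
# for that artist, so the partial unique index rejects the update.
collided = not move_alias(2, 100)
```

An artist merge has to move aliases in roughly this way, which is one plausible reason an update in a replication packet could trip the index; the log itself does not establish the actual cause.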
      • bitmap
      • to81 joined the channel
      • that's in the dbmirror_pendingdata for that packet, the first and last sequences there are certain to conflict
      • MatthewGlubb
        Certainly looks that way
      • bitmap
        what's bizarre is that the artist IDs are different in the 'key' rows
      • never seen anything like that
      • ruaok
        fun. :(
      • bitmap
        yeah
      • there was an artist merge with those other ids https://musicbrainz.org/edit/45678247
      • reosarevok
        jesus2099, you broke everything!
      • to81 has quit
      • Got another supporter @ support asking about the same issue. Let me know as soon as we have a fix so I can tell them :)
      • bitmap
        but how the hell did dbmirror see both ids for the same sequence
      • reosarevok tweets in the meantime
      • D4RK-PH0ENiX has quit
      • kyan has quit
      • D4RK-PH0ENiX joined the channel
      • agentsim joined the channel
      • ZarkBit has quit
      • nvm, those artist IDs are different because they're update operations, sorry, I was thinking they were inserts
      • still alarming, but a lot less so :P
      • drsaunders joined the channel
      • thanks coffee
      • ruaok
        lol
      • zas
        more IPs blocked, if someone complains, ask for IP first, we may have false positives on this batch
      • SothoTalKer
        wow, almost 6000 hits to /collection/create
      • and ~250k useragents from yahoo and bing
      • to81 joined the channel
      • ruaok
        Gah. I knew crawlers would be a problem for us back in the day.
      • I didn't think they'd still be a problem.
      • to81 has quit
      • to81 joined the channel
      • agentsim has quit
      • SothoTalKer
        well, crawl-delay is now set to 2 seconds for any bot except IA. that should make the hits per day go to ~43200
      • depending how many bots are run simultaneously
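The ~43200 figure follows from dividing the seconds in a day by the crawl delay. A small sketch of that arithmetic, assuming each bot issues at most one request per delay interval; the function name and the concurrent_bots parameter are illustrative, not anything from the server configuration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def max_hits_per_day(crawl_delay_seconds, concurrent_bots=1):
    """Upper bound on daily hits if every bot honours the crawl delay."""
    return concurrent_bots * (SECONDS_PER_DAY // crawl_delay_seconds)


# One well-behaved bot with a 2-second delay: 86400 / 2 = 43200 hits per day.
single_bot = max_hits_per_day(2)
```

The bound scales linearly with the number of bots crawling at the same time, which is the caveat in the message above.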
      • ruaok
        someone is working at hetzner today: > We have received your order and shall inform you once we have activated your request.
      • SothoTalKer
        yes, they are called "auto response bot" :D
      • ruaok
        one that takes 45 minutes to respond?
      • reosarevok
        German efficiency! :p
      • ruaok
        I can see the thinking.
      • we'll have one bot respond right away. and another that responds in 45 minutes.
      • yeah, people will think we work on holidays.
      • SothoTalKer
        45 mins is actually a very short response time.
      • reosarevok
        Devious!
      • SothoTalKer: for a human yes, for a bot probably not :)
      • SothoTalKer
        their automatic response system is slightly overloaded because they got so many requests
      • reosarevok
        What is this, our ws?
      • :p
      • SothoTalKer
        i bet it is just some poor person who has to work on holidays but can't actually do anything other than route your request to the right people.
      • ruaok
        haters are gonna hate, sheesh.
      • iliekcomputers
        new servers?
      • Rotab
        moar!!
      • SothoTalKer
        i think you should use a server farm and not try to run websites like this from home! :X
      • Freso
        What if you live on a farm?
      • SothoTalKer
        animals don't count ^-^
      • CatQuest
        too many spam users @_@
      • SothoTalKer
        you could report them
      • CatQuest
        no, see yesterday's conversation
      • hmph, why is crazy webcrawler still in this list :/
      • SothoTalKer
        because it takes up to 2 weeks until they read the new robots.txt
      • CatQuest
        hmmm
      • SothoTalKer
        same with all the other disallowed bots
      • CatQuest
        http://crazywebcrawler.com/ "click here if you don't want us to crawl your website"
      • SothoTalKer
        indeed an admin could write them a mail and request an instant stop
      • CatQuest
        hmmm
      • "Blocking our web crawler by IP address will not work. "
      • ruaok
      • nice, that last set of blockings increased throughput by another ~1,000 requests/min
      • SothoTalKer
        if you write them a mail, they put your site on a blacklist the bot ignores.
      • CatQuest
        hey maybe we should?
      • they talk about "if we crawl your site too quickly"
      • SothoTalKer
        we blocked them totally via robots.txt so hits should gradually decline
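How a well-behaved crawler would read such a policy can be sketched with Python's standard urllib.robotparser. The robots.txt content below is a made-up stand-in (one exempt bot, one fully blocked bot, a 2-second crawl delay for everyone else), and the user-agent tokens ExampleArchiveBot and ExampleBlockedBot are hypothetical; this is not the actual MusicBrainz robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumption: an illustrative policy, not the real file served by the site.
ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Disallow:

User-agent: ExampleBlockedBot
Disallow: /

User-agent: *
Crawl-delay: 2
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def policy_for(user_agent, url="https://musicbrainz.org/artist"):
    """Return (may_fetch, crawl_delay_seconds) as a compliant bot would see it."""
    return parser.can_fetch(user_agent, url), parser.crawl_delay(user_agent)


archive_policy = policy_for("ExampleArchiveBot")   # exempt from the delay
blocked_policy = policy_for("ExampleBlockedBot")   # disallowed everywhere
default_policy = policy_for("SomeOtherBot")        # falls through to "*"
```

None of this is enforced by the server. A crawler only changes behaviour once it re-fetches robots.txt and chooses to honour it, which is why the hits are expected to decline gradually.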
      • CatQuest
        i already know that, i read the robots.txt
      • ruaok
        I wonder if we should only allow: Google, IA and then common crawl. if someone wants our data for their search engine, use common crawl
      • SothoTalKer
        if we still see the same amount of hits after 2 weeks, we could ask them nicely :)
      • CatQuest
        what is common crawl?
      • also I wonder what the "-" useragent/referrer is
      • also top request "/"
      • SothoTalKer
        "/" is the main site
      • CatQuest
        basically "no user agent/referrer" ?
      • SothoTalKer
        yes
      • reosarevok
        ruaok: I don't think we should block stuff like yandex completely (which seems to have most of the market share in Russia)
      • ruaok
        common crawl is a data set of crawled web pages. http://commoncrawl.org/
      • reosarevok
        Slowing them should be good enough
      • CatQuest
        ah so / is internal links (someone going from some link inside mb to another page)
      • SothoTalKer
        some people might use referrer blockers or come from sites that link through those or come directly to MB
      • CatQuest agrees with reo
      • ruaok
        reosarevok: I'm not denying them access to our data. I'm suggesting they get it via common crawl
      • reosarevok
        Sure, but is there any chance they're going to use that?
      • Freso
        CatQuest: "/" is someone going from the front page of MB to somewhere else on MB.
      • ruaok
        reosarevok: depends on how much they care to have our data.
      • Freso
        CatQuest: "-" is when there's no referrer info.