#musicbrainz-devel

      • jessew joined the channel
      • ruaok will just leave this here http://bcnftw.es/
      • kepstin-laptop
        ... ruaok*any* excuse to have a party, eh :)
      • ... that lost some words.
      • ruaok
        life is a party, no? :)
      • Freso joined the channel
      • ianmcorvidae: you wouldn't happen to be near a computer, would you?
      • kepstin joined the channel
      • ianmcorvidae
        ruaok: I am now, heh
      • oh, geez, hm
      • ruaok: clearly you should have tried to get bcnftw.cat :)
      • harder to pull off though, I suppose
      • ianweller
        there should be a non-profit that can purchase cctld domains for you by proxy by following whatever weird rules are required
      • ianmcorvidae
        .cat is quite restricted as I understand it
      • ianweller
        iirc, you either have to show that it will have content relating to the catalan language, or you have to know a guy
      • ianmcorvidae
        ah
      • ianweller
        hence why nyan.cat has had a catalan language option
      • ianmcorvidae
        yeah, have content in catalan published online already, access to a special code, "develop activities (in any language) to promote the Catalan culture and language" or are endorsed by 3 people who already have .cat domain names
      • hah
      • ianmcorvidae wonders how crypto.cat got away with it, probably the same way
      • heh, yeah, catalan is the second language, right under english :P
      • Freso joined the channel
      • kepstin
        so as long as ruaok finds someone to translate his blog into catalan, he's probably good :)
      • i mean, the content's definitely relevant :)
      • (as a side note, 'bcn' always makes me think 'bacon'. I guess I'm too used to unix command names dropping vowels)
      • night199uk joined the channel
      • night199uk joined the channel
      • night199uk joined the channel
      • Leftmost joined the channel
      • luks
        can somebody please create https://github.com/metabrainz/libdiscid and give me access to it?
      • jessew joined the channel
      • ianmcorvidae
        luks: just an empty repo?
      • luks
        yes
      • ianmcorvidae
        okay, should be there and you should have access
      • luks
        thanks
      • it's quite embarrassing that we have libdiscid fixes committed in 2009 and never released :/
      • ianmcorvidae
        heh
      • I don't know that anyone in particular has been keeping tabs on that project, I guess that suggests nobody was :)
      • ocharles
        Explosions eh
      • nikki
        are you here to fix our replication?
      • ocharles
        not entirely
      • maybe if i catch up enough to understand the problem better
      • ianmcorvidae
        problem isn't really well-understood generally, I think right now we're just going for getting replication back on track
      • (though potentially still paused -- just have correct packets up to the current replication sequence on production)
      • but until we know what's actually causing the problem we'll have the potential for getting in trouble again :/
      • ocharles
        i see more scare noise about locks though
      • 1500 locks is nothing to be alarmed or happy about - it's just a number
      • do we know what type of locks are held, and where?
      • ianmcorvidae
        no
      • which is what I want to test for next time statistics run
      • (which is when this happened as well -- I think it may be related)
      • (people have been getting 502s trying to edit... anything, when stats are running)
      • ocharles
        yea, it does sound related
      • are our 5xx graphs shining any light on correlation?
      • ianmcorvidae
        (which makes no sense, it has almost no locks other than a bunch of access share on various tables and an exclusive lock on the stats table (but one that allows access share))
      • for the statistics problem, it's definitely correlated -- we moved stats an hour later to test exactly this and the problem moved with it
      • I don't know if this is related or how, but it did happen at the right time
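For reference, the lock modes mentioned above can be checked against PostgreSQL's documented table-level conflict matrix. The sketch below encodes that matrix as a plain Python dict; the names `CONFLICTS` and `conflicts` are made up for illustration and are not part of any MusicBrainz code. It only covers table-level modes, so it says nothing about row locks or long-running transactions.

```python
# Table-level lock conflicts, as listed in the PostgreSQL explicit-locking docs.
# Each key is a held mode; the value is the set of requested modes it blocks.
ALL_MODES = {
    "ACCESS SHARE", "ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
    "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE",
}

CONFLICTS = {
    "ACCESS SHARE": {"ACCESS EXCLUSIVE"},
    "ROW SHARE": {"EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "ROW EXCLUSIVE": {"SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                      "ACCESS EXCLUSIVE"},
    "SHARE UPDATE EXCLUSIVE": {"SHARE UPDATE EXCLUSIVE", "SHARE",
                               "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                               "ACCESS EXCLUSIVE"},
    "SHARE": {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
              "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "SHARE ROW EXCLUSIVE": {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE",
                            "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                            "ACCESS EXCLUSIVE"},
    "EXCLUSIVE": ALL_MODES - {"ACCESS SHARE"},
    "ACCESS EXCLUSIVE": set(ALL_MODES),
}


def conflicts(held, requested):
    """Return True if a lock in mode `held` blocks a request for `requested`."""
    return requested in CONFLICTS[held]


# The case described above: an EXCLUSIVE lock on the stats table still lets
# plain readers (ACCESS SHARE) through, but blocks writers (ROW EXCLUSIVE).
print(conflicts("EXCLUSIVE", "ACCESS SHARE"))   # readers of the stats table
print(conflicts("EXCLUSIVE", "ROW EXCLUSIVE"))  # writers to the stats table
print(conflicts("ACCESS SHARE", "ROW EXCLUSIVE"))  # writers to tables stats only reads
```

If the statistics job really holds nothing beyond these modes, the matrix suggests that ordinary edits on other tables should not be blocked by it, which matches the "makes no sense" observation.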
      • ocharles
        hmm
      • ianmcorvidae
        basically I just want more information about the problem I know the most about that looks like it might be related :)
      • ocharles
        same
      • ianmcorvidae
        of course, that can't happen until twelve hours from now
      • ocharles
        unless you collect stats again and throw away current stats
      • (well, back them up, run, and then restore)
      • ianmcorvidae
        yeah, we could dump and then delete today's stats, run it, then delete and reimport
      • ocharles
        right
      • ianmcorvidae
        my plan while running it was to trigger what would be a 502 -- i.e. try to submit and edit -- and just dump all of pg_locks to a file while it's timing out
      • an*
      • and then it's "just" a matter of looking at everything that's waiting for a lock to figure out *why* it's waiting when statistics shouldn't need such a thing
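The "look at everything that's waiting" step could be sketched roughly as follows, assuming the `pg_locks` dump has already been loaded into a list of dicts keyed by column name (`locktype`, `relation`, `transactionid`, `pid`, `mode`, `granted`). The function names and the sample rows are made up for illustration; this is not an existing tool, and it only matches locks on relations and transaction IDs.

```python
from collections import defaultdict


def lock_target(row):
    """Simplified identity of the locked object: a relation or a transaction ID."""
    return (row["locktype"], row.get("relation"), row.get("transactionid"))


def find_blocked(rows):
    """Map each waiting pid to the (pid, mode) pairs holding a lock on the same target."""
    holders = defaultdict(list)
    for row in rows:
        if row["granted"]:
            holders[lock_target(row)].append((row["pid"], row["mode"]))

    blocked = {}
    for row in rows:
        if not row["granted"]:
            # Candidate blockers: other backends with a granted lock on the target.
            candidates = [h for h in holders[lock_target(row)] if h[0] != row["pid"]]
            blocked.setdefault(row["pid"], []).extend(candidates)
    return blocked


# Made-up rows: pid 101 holds its own transaction ID lock, and pid 202 is
# waiting on that transaction, which is how a row-lock wait such as
# SELECT ... FOR UPDATE typically appears in pg_locks.
sample = [
    {"locktype": "relation", "relation": 16401, "transactionid": None,
     "pid": 101, "mode": "AccessShareLock", "granted": True},
    {"locktype": "transactionid", "relation": None, "transactionid": 9001,
     "pid": 101, "mode": "ExclusiveLock", "granted": True},
    {"locktype": "transactionid", "relation": None, "transactionid": 9001,
     "pid": 202, "mode": "ShareLock", "granted": False},
]

print(find_blocked(sample))
```

The holders it reports are only candidates: it does not check whether the two modes actually conflict, so the pids would still need to be looked up in `pg_stat_activity` to see which query each one is running.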
      • ocharles
        we could also change the time out killer thingy to log the query that was executing at timeout
      • but that should be in the serverlog
      • (pg)
      • ianmcorvidae
        hm
      • I may not know where that log is
      • ocharles
        /var/log/postgres/serverlog
      • i need to shut this 'unexpected eof' thing up
      • in fact, that sorta implies that queries aren't getting aborted and are running for long periods of time
      • ianmcorvidae
        hm
      • yeah
      • rob was theorizing that something was causing locks -- by which he may have meant transactions holding locks -- to remain open
      • DBDefs, perhaps?
      • did the timeout make it through the DBDefs changes
      • ocharles
        we still don't know if it's a locking problem, really
      • iirc, postgresql is set up to log if stuff takes ages to acquire a lock
      • and i'm not seeing those messages
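One way to check for those messages, sketched under the assumption that `log_lock_waits` is enabled and that the server writes its usual "still waiting for ... on ... after ... ms" line. The regex, function name, and sample log lines (including their timestamps) are made up for illustration.

```python
import re

# Matches the message PostgreSQL emits when a lock wait exceeds
# deadlock_timeout and log_lock_waits is on.
WAIT_RE = re.compile(
    r"process (\d+) still waiting for (\w+) on (.+?) after (\d+\.\d+) ms"
)


def lock_waits(lines):
    """Extract pid, lock mode, lock target and wait time from lock-wait log lines."""
    waits = []
    for line in lines:
        match = WAIT_RE.search(line)
        if match:
            waits.append({
                "pid": int(match.group(1)),
                "mode": match.group(2),
                "target": match.group(3),
                "waited_ms": float(match.group(4)),
            })
    return waits


# Made-up log lines for illustration only.
sample_log = [
    "2013-01-01 01:30:05 UTC LOG:  process 202 still waiting for ShareLock "
    "on transaction 9001 after 1000.123 ms",
    "2013-01-01 01:30:06 UTC LOG:  unexpected EOF on client connection",
]

for wait in lock_waits(sample_log):
    print(wait)
```

Against the real log this would be fed the lines of /var/log/postgres/serverlog. An empty result would fit the observation that no lock-wait messages are showing up, provided the setting really is enabled.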
      • ianmcorvidae
        just brainstorming things to check :)
      • ocharles
      • somewhat interesting
      • ianmcorvidae
        I was wondering if it was that
      • that's the only explicit lock we're getting (the select from editor for update) that looked probable
      • ocharles
        they do always crop up at ~1:30
      • ianmcorvidae
        yeah, that's the statistics time
      • (since it got moved an hour later for diagnosing this)
      • what I don't understand is why statistics would have a lock on an editor table that conflicts there
      • ocharles
        i'm not sure it does, i just wonder if the amount of writes it does causes stuff to slow down
      • but that's a whopping slow down
      • ianmcorvidae
        yeah
      • ocharles
        how is replication broke?
      • ianmcorvidae
        aborted in the middle of doing a packet
      • ocharles
        is there a log i can see?
      • ianmcorvidae
        probably, it'd be in email
      • did you read through rob's email?
      • ianmcorvidae looks for the relevant email, anyway
      • ocharles
        yea
      • i'm not finding what i want in emails
      • ianmcorvidae
        I'm not really sure why it aborted, rob seems to have an idea why
      • ocharles
        the only abort i see is that the next hour rolled around and an existing job was running
      • ianmcorvidae
        yeah, that's all I'm seeing in email
      • ocharles
        i'm going to guess that rob killed that job with SIGTERM/SIGKILL
      • ianmcorvidae
        however, as rob outlines, it stopped dumping one packet and the next hour (whichever hour that was) included some of the same sequence IDs
      • possibly
      • hoping he can provide insight on this topic in a few hours
      • ocharles
        mmmm
      • well, i need to get out of bed, have a shower and get some breakfast then
      • it's a bit of a lazy day :P
      • ianmcorvidae
        seems reasonable
      • if that editor lock is in fact the thing that's failing, btw, we might limit the grabbing of that lock to autoedits, which is the only place where it *should* be required (but we're doing it for every edit)
      • I think that's not the main issue though
      • ocharles
        i mostly think that might be a symptom, but not the problem
      • ianmcorvidae
        yeah, agreed
      • ocharles
        ok, gonna get up then, bbiab
      • Freso
        ianmcorvidae: Just replied to your comment on CR.
      • And, uh, sorry for it being a blurb of text. I just got up and couldn't manage figuring out a good place to insert linebreaks. :|
      • ianmcorvidae
        you need to publish the comment
      • it's not there :P
      • Freso
        Oh, right.
      • Silly CR.
      • Done.
      • LordSputnik joined the channel
      • reosarevok joined the channel
      • reosarevok joined the channel
      • sezuan joined the channel
      • voiceinsideyou joined the channel
      • voiceinsideyou1 joined the channel
      • kepstin-laptop joined the channel
      • ruaok joined the channel
      • ruaok
        ianmcorvidae: ping?
      • nikki
        I imagine he's still asleep. he didn't go to bed until late
      • not even four hours ago :P
      • ruaok
        I figured that. :) I got the last email from him about 5 hours ago.
      • I'll wait for him to wake to try and patch things back up.
      • I will, however, get the search indexes building again.
      • damn nagios. not sending me emails.
      • reosarevok
        Supposedly he did that?
      • ruaok
        who did what?
      • nikki
        ian apparently got search indexes updating again
      • reosarevok
        that
      • ruaok
        oh, whoops.
      • he didn't mail me about that.
      • ah looks like we got one set out and I killed the next run that's been going for about an hour
      • LordSputnik has left the channel