#musicbrainz-devel

      • jessew joined the channel
      • 2013-01-01 00145, 2013

      • ruaok will just leave this here http://bcnftw.es/
      • 2013-01-01 00153, 2013

      • kepstin-laptop
        ... ruaok*any* excuse to have a party, eh :)
      • 2013-01-01 00103, 2013

      • kepstin-laptop
        ... that lost some words.
      • 2013-01-01 00146, 2013

      • ruaok
        life is a party, no? :)
      • 2013-01-01 00134, 2013

      • Freso joined the channel
      • 2013-01-01 00121, 2013

      • ruaok
        ianmcorvidae: you wouldn't happen to be near a computer, would you?
      • 2013-01-01 00133, 2013

      • kepstin joined the channel
      • 2013-01-01 00142, 2013

      • ianmcorvidae
        ruaok: I am now, heh
      • 2013-01-01 00104, 2013

      • ianmcorvidae
        oh, geez, hm
      • 2013-01-01 00121, 2013

      • ianmcorvidae
        ruaok: clearly you should have tried to get bcnftw.cat :)
      • 2013-01-01 00128, 2013

      • ianmcorvidae
        harder to pull off though, I suppose
      • 2013-01-01 00128, 2013

      • ianweller
        there should be a non-profit that can purchase cctld domains for you by proxy by following whatever weird rules are required
      • 2013-01-01 00103, 2013

      • ianmcorvidae
        .cat is quite restricted as I understand it
      • 2013-01-01 00153, 2013

      • ianweller
        iirc, you either have to show that it will have content relating to the catalan language, or you have to know a guy
      • 2013-01-01 00100, 2013

      • ianmcorvidae
        ah
      • 2013-01-01 00152, 2013

      • ianweller
        hence why nyan.cat has had a catalan language option
      • 2013-01-01 00130, 2013

      • ianmcorvidae
        yeah, have content in catalan published online already, access to a special code, "develop activities (in any language) to promote the Catalan culture and language" or are endorsed by 3 people who already have .cat domain names
      • 2013-01-01 00134, 2013

      • ianmcorvidae
        hah
      • 2013-01-01 00155, 2013

      • ianmcorvidae wonders how crypto.cat got away with it, probably the same way
      • 2013-01-01 00108, 2013

      • ianmcorvidae
        heh, yeah, catalan is the second language, right under english :P
      • 2013-01-01 00149, 2013

      • Freso joined the channel
      • 2013-01-01 00107, 2013

      • kepstin
        so as long as ruaok finds someone to translate his blog into catalan, he's probably good :)
      • 2013-01-01 00122, 2013

      • kepstin
        i mean, the content's definitely relevant :)
      • 2013-01-01 00153, 2013

      • kepstin
        (as a side note, 'bcn' always makes me think 'bacon'. I guess I'm too used to unix command names dropping vowels)
      • 2013-01-01 00149, 2013

      • night199uk joined the channel
      • 2013-01-01 00144, 2013

      • night199uk joined the channel
      • 2013-01-01 00143, 2013

      • night199uk joined the channel
      • 2013-01-01 00147, 2013

      • Leftmost joined the channel
      • 2013-01-01 00142, 2013

      • luks
        can somebody please create https://github.com/metabrainz/libdiscid and give me access to it?
      • 2013-01-01 00112, 2013

      • jessew joined the channel
      • 2013-01-01 00129, 2013

      • ianmcorvidae
        luks: just an empty repo?
      • 2013-01-01 00134, 2013

      • luks
        yes
      • 2013-01-01 00102, 2013

      • ianmcorvidae
        okay, should be there and you should have access
      • 2013-01-01 00112, 2013

      • luks
        thanks
      • 2013-01-01 00143, 2013

      • luks
        it's quite embarrassing that we have libdiscid fixes committed in 2009 and never released :/
      • 2013-01-01 00107, 2013

      • ianmcorvidae
        heh
      • 2013-01-01 00126, 2013

      • ianmcorvidae
        I don't know that anyone in particular has been keeping tabs on that project, I guess that suggests nobody was :)
      • 2013-01-01 00158, 2013

      • ocharles
        Explosions eh
      • 2013-01-01 00140, 2013

      • nikki
        are you here to fix our replication?
      • 2013-01-01 00155, 2013

      • ocharles
        not entirely
      • 2013-01-01 00112, 2013

      • ocharles
        maybe if i catch up enough to understand the problem better
      • 2013-01-01 00136, 2013

      • ianmcorvidae
        problem isn't really well-understood generally, I think right now we're just going for getting replication back on track
      • 2013-01-01 00111, 2013

      • ianmcorvidae
        (though potentially still paused -- just have correct packets up to the current replication sequence on production)
      • 2013-01-01 00127, 2013

      • ianmcorvidae
        but until we know what's actually causing the problem we'll have the potential for getting in trouble again :/
      • 2013-01-01 00147, 2013

      • ocharles
        i see more scare noise about locks though
      • 2013-01-01 00157, 2013

      • ocharles
        1500 locks is nothing to be alarmed or happy about - it's just a number
      • 2013-01-01 00104, 2013

      • ocharles
        do we know what type of locks are held, and where?
      • 2013-01-01 00111, 2013

      • ianmcorvidae
        no
      • 2013-01-01 00119, 2013

      • ianmcorvidae
        which is what I want to test for next time statistics run
      • 2013-01-01 00128, 2013

      • ianmcorvidae
        (which is when this happened as well -- I think it may be related)
      • 2013-01-01 00144, 2013

      • ianmcorvidae
        (people have been getting 502s trying to edit... anything, when stats are running)
      • 2013-01-01 00101, 2013

      • ocharles
        yea, it does sound related
      • 2013-01-01 00116, 2013

      • ocharles
        are our 5xx graphs shining any light on correlation?
      • 2013-01-01 00120, 2013

      • ianmcorvidae
        (which makes no sense, it has almost no locks other than a bunch of access share on various tables and an exclusive lock on the stats table (but one that allows access share))
      • 2013-01-01 00141, 2013

      • ianmcorvidae
        for the statistics problem, it's definitely correlated -- we moved stats an hour later to test exactly this and the problem moved with it
      • 2013-01-01 00113, 2013

      • ianmcorvidae
        I don't know if this is related or how, but it did happen at the right time
      • 2013-01-01 00123, 2013

      • ocharles
        hmm
      • 2013-01-01 00125, 2013

      • ianmcorvidae
        basically I just want more information about the problem I know the most about that looks like it might be related :)
      • 2013-01-01 00101, 2013

      • ocharles
        same
      • 2013-01-01 00119, 2013

      • ianmcorvidae
        of course, that can't happen until twelve hours from now
      • 2013-01-01 00103, 2013

      • ocharles
        unless you collect stats again and throw away current stats
      • 2013-01-01 00112, 2013

      • ocharles
        (well, back them up, run, and then restore)
      • 2013-01-01 00137, 2013

      • ianmcorvidae
        yeah, we could dump and then delete today's stats, run it, then delete and reimport
      • 2013-01-01 00104, 2013
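
The dump, delete, run, delete, reimport sequence described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database so it runs anywhere, and the `statistic` table layout, the `rerun_stats_preserving` helper, and the `collect` callback are all hypothetical stand-ins, not the real MusicBrainz schema or statistics collector.

```python
import sqlite3

def rerun_stats_preserving(conn, day, collect):
    """Re-run a stats collector for `day`, then put the original rows back.

    `statistic(name, day, value)` is a hypothetical table layout and
    `collect(conn, day)` a hypothetical collector callback.
    """
    # 1. Dump today's rows so they can be restored afterwards.
    saved = conn.execute(
        "SELECT name, day, value FROM statistic WHERE day = ?", (day,)
    ).fetchall()
    # 2. Delete them so the collector has a clean slate for this day.
    conn.execute("DELETE FROM statistic WHERE day = ?", (day,))
    conn.commit()
    try:
        # 3. The diagnostic run; this is when locks would be inspected.
        collect(conn, day)
    finally:
        # 4. Throw away the test run's output and reimport the saved rows.
        conn.execute("DELETE FROM statistic WHERE day = ?", (day,))
        conn.executemany(
            "INSERT INTO statistic (name, day, value) VALUES (?, ?, ?)", saved
        )
        conn.commit()
    return saved
```

The restore step sits in a `finally` block so the original rows come back even if the diagnostic run fails partway through.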

      • ocharles
        right
      • 2013-01-01 00107, 2013

      • ianmcorvidae
        my plan while running it was to trigger what would be a 502 -- i.e. try to submit and edit -- and just dump all of pg_locks to a file while it's timing out
      • 2013-01-01 00113, 2013

      • ianmcorvidae
        an*
      • 2013-01-01 00148, 2013

      • ianmcorvidae
        and then it's "just" a matter of looking at everything that's waiting for a lock to figure out *why* it's waiting when statistics shouldn't need such a thing
      • 2013-01-01 00133, 2013
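
One way to go through such a pg_locks dump is sketched below, assuming the rows have already been loaded as dictionaries keyed by the view's column names. The helper names `lock_target` and `waiters_and_holders` are made up for illustration. The sketch only pairs each waiting entry (`granted` is false) with other backends holding a granted lock on the same target; it does not check whether the lock modes actually conflict, so the output is a list of candidates rather than a verdict.

```python
from collections import defaultdict

# pg_locks columns that together identify the object being locked.
TARGET_COLUMNS = (
    "locktype", "database", "relation", "page", "tuple",
    "virtualxid", "transactionid", "classid", "objid", "objsubid",
)

def lock_target(row):
    """Return a hashable key for the thing this pg_locks row refers to."""
    return tuple(row.get(col) for col in TARGET_COLUMNS)

def waiters_and_holders(rows):
    """Pair every ungranted lock with granted locks on the same target.

    Returns a list of (waiting_row, [holding_rows]) tuples. Lock mode
    compatibility is deliberately not evaluated here.
    """
    holders = defaultdict(list)
    for row in rows:
        if row["granted"]:
            holders[lock_target(row)].append(row)

    report = []
    for row in rows:
        if not row["granted"]:
            others = [h for h in holders[lock_target(row)]
                      if h["pid"] != row["pid"]]
            report.append((row, others))
    return report
```

A session blocked on a row lock (for example behind a `SELECT ... FOR UPDATE`) typically shows up as waiting on a `transactionid` lock, so the holder's `pid` is the one to look up next.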

      • ocharles
        we could also change the time out killer thingy to log the query that was executing at timeout
      • 2013-01-01 00136, 2013

      • ocharles
        but that should be in the serverlog
      • 2013-01-01 00138, 2013

      • ocharles
        (pg)
      • 2013-01-01 00151, 2013

      • ianmcorvidae
        hm
      • 2013-01-01 00154, 2013

      • ianmcorvidae
        I may not know where that log is
      • 2013-01-01 00124, 2013

      • ocharles
        /var/log/postgres/serverlog
      • 2013-01-01 00141, 2013

      • ocharles
        i need to shut this 'unexpected eof' thing up
      • 2013-01-01 00144, 2013

      • ocharles
        in fact, that sorta implies that queries aren't getting aborted and are running for long periods of time
      • 2013-01-01 00144, 2013

      • ianmcorvidae
        hm
      • 2013-01-01 00154, 2013

      • ianmcorvidae
        yeah
      • 2013-01-01 00121, 2013

      • ianmcorvidae
        rob was theorizing that something was causing locks -- by which he may have meant transactions holding locks -- to remain open
      • 2013-01-01 00128, 2013

      • ianmcorvidae
        DBDefs, perhaps?
      • 2013-01-01 00140, 2013

      • ianmcorvidae
        did the timeout make it through the DBDefs changes
      • 2013-01-01 00102, 2013

      • ocharles
        we still don't know if it's a locking problem, really
      • 2013-01-01 00131, 2013

      • ocharles
        iirc, postgresql is set up to log if stuff takes ages to acquire a lock
      • 2013-01-01 00134, 2013

      • ocharles
        and i'm not seeing those messages
      • 2013-01-01 00115, 2013
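
When lock-wait logging is enabled (the `log_lock_waits` setting), PostgreSQL writes "still waiting for" lines to its server log. A small sketch for pulling those out of a log file follows. It assumes the standard English message text; whatever precedes the message depends on `log_line_prefix`, so the pattern ignores it. The function name `find_lock_waits` is made up for illustration.

```python
import re

# Matches the body of a lock-wait log message, ignoring any line prefix.
LOCK_WAIT = re.compile(
    r"process (?P<pid>\d+) still waiting for (?P<mode>\w+) "
    r"on (?P<target>.+?) after (?P<ms>\d+(?:\.\d+)?) ms"
)

def find_lock_waits(lines):
    """Return (pid, lock mode, lock target, wait in ms) for each match."""
    waits = []
    for line in lines:
        m = LOCK_WAIT.search(line)
        if m:
            waits.append(
                (int(m["pid"]), m["mode"], m["target"], float(m["ms"]))
            )
    return waits
```

An empty result for the window in which the 502s occur would support the observation that no lock-wait messages are being logged, provided the setting is actually on.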

      • ianmcorvidae
        just brainstorming things to check :)
      • 2013-01-01 00110, 2013

      • ocharles
      • 2013-01-01 00114, 2013

      • ocharles
        somewhat interesting
      • 2013-01-01 00136, 2013

      • ianmcorvidae
        I was wondering if it was that
      • 2013-01-01 00157, 2013

      • ianmcorvidae
        that's the only explicit lock we're getting (the select from editor for update) that looked probable
      • 2013-01-01 00100, 2013

      • ocharles
        they do always crop up at ~1:30
      • 2013-01-01 00107, 2013

      • ianmcorvidae
        yeah, that's the statistics time
      • 2013-01-01 00117, 2013

      • ianmcorvidae
        (since it got moved an hour later for diagnosing this)
      • 2013-01-01 00131, 2013

      • ianmcorvidae
        what I don't understand is why statistics would have a lock on an editor table that conflicts there
      • 2013-01-01 00156, 2013

      • ocharles
        i'm not sure it does, i just wonder if the amount of writes it does causes stuff to slow down
      • 2013-01-01 00101, 2013

      • ocharles
        but that's a whopping slow down
      • 2013-01-01 00128, 2013

      • ianmcorvidae
        yeah
      • 2013-01-01 00102, 2013

      • ocharles
        how is replication broke?
      • 2013-01-01 00114, 2013

      • ianmcorvidae
        aborted in the middle of doing a packet
      • 2013-01-01 00134, 2013

      • ocharles
        is there a log i can see?
      • 2013-01-01 00149, 2013

      • ianmcorvidae
        probably, it'd be in email
      • 2013-01-01 00155, 2013

      • ianmcorvidae
        did you read through rob's email?
      • 2013-01-01 00115, 2013

      • ianmcorvidae looks for the relevant email, anyway
      • 2013-01-01 00115, 2013

      • ocharles
        yea
      • 2013-01-01 00122, 2013

      • ocharles
        i'm not finding what i want in emails
      • 2013-01-01 00132, 2013

      • ianmcorvidae
        I'm not really sure why it aborted, rob seems to have an idea why
      • 2013-01-01 00155, 2013

      • ocharles
        the only abort i see is that the next hour rolled around and an existing job was running
      • 2013-01-01 00117, 2013

      • ianmcorvidae
        yeah, that's all I'm seeing in email
      • 2013-01-01 00149, 2013

      • ocharles
        i'm going to guess that rob killed that job with SIGTERM/SIGKILL
      • 2013-01-01 00150, 2013

      • ianmcorvidae
        however, as rob outlines, it stopped dumping one packet and the next hour (whichever hour that was) included some of the same sequence IDs
      • 2013-01-01 00117, 2013

      • ianmcorvidae
        possibly
      • 2013-01-01 00128, 2013

      • ianmcorvidae
        hoping he can provide insight on this topic in a few hours
      • 2013-01-01 00134, 2013

      • ocharles
        mmmm
      • 2013-01-01 00145, 2013

      • ocharles
        well, i need to get out of bed, have a shower and get some breakfast then
      • 2013-01-01 00149, 2013

      • ocharles
        it's a bit of a lazy day :P
      • 2013-01-01 00151, 2013

      • ianmcorvidae
        seems reasonable
      • 2013-01-01 00144, 2013

      • ianmcorvidae
        if that editor lock is in fact the thing that's failing, btw, we might limit the grabbing of that lock to autoedits, which is the only place where it *should* be required (but we're doing it for every edit)
      • 2013-01-01 00104, 2013

      • ianmcorvidae
        I think that's not the main issue though
      • 2013-01-01 00132, 2013

      • ocharles
        i mostly think that might be a symptom, but not the problem
      • 2013-01-01 00139, 2013

      • ianmcorvidae
        yeah, agreed
      • 2013-01-01 00103, 2013

      • ocharles
        ok, gonna get up then, bbiab
      • 2013-01-01 00152, 2013

      • Freso
        ianmcorvidae: Just replied to your comment on CR.
      • 2013-01-01 00135, 2013

      • Freso
        And, uh, sorry for being a blurb of text. I just got up and couldn't manage figuring out a good place to insert linebreaks. :|
      • 2013-01-01 00157, 2013

      • ianmcorvidae
        you need to publish the comment
      • 2013-01-01 00100, 2013

      • ianmcorvidae
        it's not there :P
      • 2013-01-01 00119, 2013

      • Freso
        Oh, right.
      • 2013-01-01 00122, 2013

      • Freso
        Silly CR.
      • 2013-01-01 00124, 2013

      • Freso
        Done.
      • 2013-01-01 00115, 2013

      • LordSputnik joined the channel
      • 2013-01-01 00120, 2013

      • reosarevok joined the channel
      • 2013-01-01 00121, 2013

      • reosarevok joined the channel
      • 2013-01-01 00153, 2013

      • sezuan joined the channel
      • 2013-01-01 00135, 2013

      • voiceinsideyou joined the channel
      • 2013-01-01 00121, 2013

      • voiceinsideyou1 joined the channel
      • 2013-01-01 00137, 2013

      • kepstin-laptop joined the channel
      • 2013-01-01 00158, 2013

      • ruaok joined the channel
      • 2013-01-01 00141, 2013

      • ruaok
        ianmcorvidae: ping?
      • 2013-01-01 00114, 2013

      • nikki
        I imagine he's still asleep. he didn't go to bed until late
      • 2013-01-01 00135, 2013

      • nikki
        not even four hours ago :P
      • 2013-01-01 00136, 2013

      • ruaok
        I figured that. :) I got the last email from him about 5 hours ago.
      • 2013-01-01 00153, 2013

      • ruaok
        I'll wait for him to wake to try and patch things back up.
      • 2013-01-01 00107, 2013

      • ruaok
        I will, however, get the search indexes building again.
      • 2013-01-01 00113, 2013

      • ruaok
        damn nagios. not sending me emails.
      • 2013-01-01 00119, 2013

      • reosarevok
        Supposedly he did that?
      • 2013-01-01 00154, 2013

      • ruaok
        who did what?
      • 2013-01-01 00103, 2013

      • nikki
        ian apparently got search indexes updating again
      • 2013-01-01 00110, 2013

      • reosarevok
        that
      • 2013-01-01 00129, 2013

      • ruaok
        oh, whoops.
      • 2013-01-01 00133, 2013

      • ruaok
        he didn't mail me about that.
      • 2013-01-01 00113, 2013

      • ruaok
        ah looks like we got one set out and I killed the next run that's been going for about an hour
      • 2013-01-01 00145, 2013

      • LordSputnik has left the channel