#musicbrainz


      • abelcheung
        might make sense to categorize web access by network block to check for offenders
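A rough sketch of that idea -- counting access-log hits per /24 network block -- might look like this (the log format and field position are assumptions based on nginx's default "combined" format, not the actual MusicBrainz config):

```python
from collections import Counter
from ipaddress import ip_network

def top_network_blocks(log_lines, prefix=24, n=5):
    """Count requests per network block from access-log lines.

    Assumes the client IP is the first whitespace-separated field,
    as in the default nginx "combined" log format.
    """
    counts = Counter()
    for line in log_lines:
        ip = line.split()[0]
        try:
            block = ip_network(f"{ip}/{prefix}", strict=False)
        except ValueError:
            continue  # skip malformed lines
        counts[str(block)] += 1
    return counts.most_common(n)
```

An unusually hot block at the top of the output would be a candidate offender to rate-limit or investigate further.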
      • ianmcorvidae
        what's strange is that it just seems to be that everything takes more network than usual
      • it's not like there's one connection that's suddenly eating up 95Mbps
      • abelcheung
        heard that this 50x thing already existed a few months before, just not as bad?
      • ianmcorvidae
        yeah, we've always had a certain number of them
      • there are some things that just don't return very fast
      • abelcheung
        the surge pattern is too regular
      • ianmcorvidae
        yeah, this is clearly something that's happening at a specific time
      • I'm poking through crontabs
      • nikki
        stats again?
      • the_metalgamer joined the channel
      • ianmcorvidae
        it's running at a time that doesn't make any sense though, it's two hours *before* the main daily crontab
      • pardon, two hours before yesterday, 3.5 today
      • abelcheung
        could be external access instead of internal ones
      • ianmcorvidae
        yeah, but the access patterns don't look different than usual
      • it does seem like that though, since it appears to be load-balanced across the three backend servers
      • probably should look at access logs or something
      • what's suspicious is that this started right after we released a new server version
      • abelcheung
        that's why i suggested checking out the web log and dividing by network blocks before
      • ianmcorvidae
        but there's nothing that seems related
      • abelcheung
        new server version...... just before easter, isn't it?
      • ianmcorvidae
        there was one yesterday (my time -- 2013-04-08)
      • there was also one march 25th, but :)
      • this started yesterday
      • hm
      • abelcheung
        it is as bad as during easter
      • nikki
        where is the data going to? just to asterix/astro/pingu?
      • ianmcorvidae
        yeah, pretty much
      • abelcheung: then you're thinking of something unrelated
      • abelcheung
        ianmcorvidae: well, i'm just saying the 50x rate is just as bad; I don't mean the reason behind it
      • ianmcorvidae
        I guarantee it's not just as bad: http://stats.musicbrainz.org/webstats/nginx-rrd... :)
      • abelcheung
        oh well.... :)
      • ianmcorvidae
        but hopefully we can figure something out, anyway :/
      • nikki
        could it be something to do with cache stuff suddenly expiring?
      • ianmcorvidae
        hm?
      • abelcheung
        DB cache?
      • nikki
        well, we cache data, right? perhaps it all suddenly gets deleted and then has to be refetched from the db and totoro is like "wtf guys not all at once D:"
      • nikki is just thinking of things that sound reasonably plausible :P
      • uptown1 joined the channel
      • ianmcorvidae
        yeah, I dunno, that doesn't seem to explain it going for six hours or whatever :/
      • abelcheung
        I would think about arrogant spiders or mirrors first
      • judging from the currently limited info
      • ianmcorvidae
        a HA
      • I see a pattern :)
      • oh, wait, that's response time
      • damn
      • drsaunders joined the channel
      • abelcheung
      • ianmcorvidae
        sure, those are shooting up *after* the problem stops
      • abelcheung
        oh.
        so.... searches peak at 6pm every day, yet db processes spawn a lot more from 8-9pm, which coincides with the lagging response time
      • ianmcorvidae
        yeah, the number of processes jumps up, can't determine why
      • abelcheung
        cronjob?
      • block read jumps too, which means those extra processes are active too
      • ianmcorvidae
        if it's a cronjob it must be cron.daily, it's not at a consistent time (changed by two hours between the two)
      • abelcheung
        wait. is the concurrent db transaction number peaking at 4096 every day?
      • ianmcorvidae
        not as of the 7th, though?
      • abelcheung
        for 6th and 7th, running like 4.5k
      • but that's the peak too
      • and.... i suppose that's an old version of the db which is not in active use
      • ianmcorvidae
        no, 20110516 is the current DB
      • if that's what you mean
      • abelcheung
        ouch
      • ianmcorvidae
        that's when NGS was released, heh
      • abelcheung
        some ulimit or pg limit might be at work too
      • ianmcorvidae
        yeah, was looking for things in config
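A flat ceiling like 4096 often points to a configured cap rather than organic load. One hedged way to check, given per-connection start/end timestamps (a hypothetical extract for illustration, not an actual MusicBrainz log format), is to compute the peak number of simultaneously open connections and see whether it repeatedly hits the same round number:

```python
def peak_concurrency(intervals):
    """Return the maximum number of simultaneously open intervals.

    `intervals` is a list of (start, end) timestamps. A peak that
    repeatedly lands on the same round number (e.g. 4096) hints at a
    configured ceiling such as max_connections or a ulimit.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # connection opens
        events.append((end, -1))    # connection closes
    # at equal timestamps, close (-1) sorts before open (+1),
    # so back-to-back connections are not double-counted
    events.sort(key=lambda e: (e[0], e[1]))
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```

If the daily peak is pinned at the limit, requests beyond it would queue or fail, which would fit the regular surge of 50x errors.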
      • abelcheung
        before Jan it went up to 6k, but never above 4-5k ever since -- and it's like a steady rhythm, that's too unnatural
      • ianmcorvidae
        we upgraded pg around 17UTC on the 5th
      • abelcheung
        that is, less than a week ago?
      • ianmcorvidae
        yes
      • there was a major security vulnerability, hence the upgrade
      • abelcheung
        We could probably get some hints by enabling the slow query log on pgsql. But this is the production DB....
      • ianmcorvidae
        things time out and get cancelled anyway, I'm not sure it's slow queries -- it's apparently queries that are returning -- and returning a *lot*
      • but apparently not returning it the whole way to the client, just internally
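For reference, the slow-log suggestion above amounts to a couple of lines in postgresql.conf; the threshold here is an assumption, and on a production box it should be set high enough not to flood the log:

```
# log statements that run longer than 500 ms (illustrative value --
# tune so the production log isn't flooded)
log_min_duration_statement = 500ms
log_line_prefix = '%t [%p] %u@%d '   # timestamp, pid, user@database
```

Both settings can be changed with a reload rather than a restart, which makes this relatively safe to try during a quiet period.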
      • abelcheung
        There could be multiple problems. For example, while you guys are saying the lagging is cured, I am still constantly seeing 502s
      • ianmcorvidae
        well, there's multiple problems
      • trying to figure out the one that's having the biggest effect first
      • (e.g. some large releases have always 502'd repeatedly)
      • abelcheung
        not just large releases :)
      • I'd say there's a bigger problem when even single-CD edits result in 502s repeatedly
      • ianmcorvidae
        that is not a well-reported phenomenon; if it's a specific set of releases, rather than a cross-section, it's most likely related to those releases specifically, unless it's during the blocks of time where we have known other problems
      • you've seen the 5xx errors graph, so you can see what is an anomaly and what isn't
      • abelcheung
        it's easier when reading graphs, but from personal experience that's harder to tell
      • but the slow query log can still help a bit too, at least to trace the offending source
      • because the sudden surge causes exploding I/O, dragging everything down
      • especially if the surge turns out to be unrelated to any increase in external requests
      • ianmcorvidae
        hm
      • it seems it is, at least, swapping during these periods
      • abelcheung
        swapping is one of the worst things for DB
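Whether the box is actually swapping during the slow windows can be checked from the kernel's `pswpin`/`pswpout` counters in `/proc/vmstat` (standard on Linux). A minimal sketch, taking two snapshots some seconds apart:

```python
def swap_rates(vmstat_before, vmstat_after, seconds):
    """Compute pages swapped in/out per second from two /proc/vmstat snapshots.

    Each argument is the raw text of /proc/vmstat; sustained nonzero
    rates during the slow periods would confirm the DB host is swapping.
    """
    def parse(text):
        fields = dict(line.split() for line in text.strip().splitlines())
        return int(fields["pswpin"]), int(fields["pswpout"])
    in0, out0 = parse(vmstat_before)
    in1, out1 = parse(vmstat_after)
    return (in1 - in0) / seconds, (out1 - out0) / seconds
```

The same numbers are what `vmstat`'s `si`/`so` columns report, so a plain `vmstat 5` on the host would do as well.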
      • ianmcorvidae
        yeah
      • still unclear what's causing any of this though
      • we're just getting an ever-better description of why it's bad :P
      • abelcheung
        yup
      • back to square one: why the sudden increase in no. of processes
      • from less than 100 on average, jumping to max'ed 256
      • ianmcorvidae
        hm
      • what it *looks* like is as though it's not using pgbouncer, for some undefined reason
      • looking at the yearly graph (we instituted pgbouncer last august IIRC)
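The pgbouncer theory fits the symptom: with pooling, the backend process count stays pinned near the pool size regardless of client load, whereas direct connections climb with it. A minimal pgbouncer.ini looks roughly like the following -- every name and number here is illustrative, not the actual MusicBrainz configuration:

```
; illustrative pgbouncer.ini fragment -- hostnames and sizes are assumptions
[databases]
musicbrainz_db = host=totoro port=5432 dbname=musicbrainz_db

[pgbouncer]
pool_mode = transaction
default_pool_size = 50      ; backend connections per database/user pair
max_client_conn = 1000      ; extra clients queue instead of spawning backends
```

If the web servers were somehow connecting straight to PostgreSQL instead of through the pooler, a jump from under 100 processes to a hard 256 is exactly what you'd expect.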
      • abelcheung
        the graph for no. of db transactions?
      • Hmm, DB block hit is much lower since 7th
      • Urgh, tried a few single CDs with >25 tracks; editing all tracks together also results in a guaranteed 502 :-(
      • danoply joined the channel
      • DarkAudit joined the channel
      • v6lur joined the channel
      • gmk1 joined the channel
      • outsidecontext joined the channel
      • SultS joined the channel
      • nioncode joined the channel
      • jacobbrett joined the channel
      • jesus2099 joined the channel
      • jesus2099
        it seems that editing an existing tracklist through TRACK PARSER doesn’t work any more… :/
      • nioncode joined the channel
      • Leftmost
        I have a few CDs of field recordings of traditional music. They generally only have one or two tracks with any performer information, located in the liner notes. Should I just Various Artists it and [unknown] most of the artists?
      • Ben\Sput joined the channel
      • ijabz joined the channel
      • Jozo
      • ijabz joined the channel
      • v6lur joined the channel
      • JonnyJD
        Jozo: how on earth?
      • Jozo
        JonnyJD: ?
      • JonnyJD
        the bobrowski edits
      • writing a note right now. Not sure at all how one would get the idea that merging these RGs would be the thing to do..
      • Jozo
        JonnyJD: (I have no idea what to say, so I didn't write any note)
      • JonnyJD
        Jozo: I fear the editor doesn't speak english anyways
      • nikki
        they claim to in their profile
      • JonnyJD
        hm, just the edit notes don't look like that :-P And german wouldn't be a problem for me.
      • Jozo
        ianmcorvidae: http://ci.musicbrainz.org/job/full-import/ - What does this do?
      • nikki
        a full import :P
      • Jozo
        nikki: Yeh. I just tried to find yet another thing that could cause db traffic.... (dates on jenkins and in the logs are a bit weird... different timezones)
      • nikki
        that shouldn't touch the db
      • jesus2099 joined the channel
      • jesus2099
        Leftmost: Hi :) It really sounds like [unknown] to me
      • this TOC is strangely shared by Santana and Die großen Erfolge now
      • JonnyJD
        jesus2099: the fun part. That video has a disc ID.
      • Jozo
        jesus2099: compare the discId with another one attached... (Please remove it :)
      • jesus2099
        I guess I’m doing no harm by removing this TOC from Santana
      • UmkaDK joined the channel
      • Jozo
        http://musicbrainz.org/release/c162e9d3-b9f8-38... - Am I getting the wrong release or are cover art edits fucked... :/
      • vikyath joined the channel
      • nioncode joined the channel
      • reosarevok joined the channel
      • v6lur joined the channel
      • outsidecontext joined the channel
      • v6lur joined the channel