#musicbrainz-devel

      • nikki
        these seem to be coming from beta
      • ocharles
        ah yes
      • I didn't update beta
      • nikki
        BEGIN failed--compilation aborted at line 4.
      • CONTEXT: compilation of PL/Perl function "extract_path_value"
      • ocharles
        nikki: beta and that JSON::XS problem should be fixed
      • ruaok
        load down to 50.
      • ocharles
        tail -F /etc/service/mb_server-fastcgi/log/main/current | grep -A20 '\[error\]' | tai64nlocal # on asterix, pingu and astro is looking quiet
      • ruaok
        good
      • ocharles
        two "expected" time outs
      • (subscriptions)
      • ruaok
        totoro should be ready for me to futz with, right?
      • the parts for pingu are out for delivery.
      • I will run the memory test on totoro while I am out.
      • ocharles
        best to wait until I leave work in 20 minutes
      • ruaok
        oh, won't be yet.
      • ocharles
        ok
      • ruaok
        I'm not ready to leave the house.
      • ocharles
        i'm still pulling bits and bobs off totoro as I bring baron up
      • ruaok
        ok, np.
      • it's amazing watching baron struggle with this load.
      • totoro is a nice fucking machine.
      • ocharles
        yea
      • warp
        ruaok: postgres-rrd should be pointed at baron now. how did you see it was unhappy? I should verify that my changes worked :)
      • ocharles
        i do hope baron can get down to under count(cpu_cores) soon though
      • ruaok: I think we really need to get on setting up a virtual ip for our db though
      • or a 'master-db' hostname
      • ruaok
        warp: it was a nagios email alert.
      • ocharles: fer sure.
      • ocharles
        I'm going to go through these logs tomorrow and enumerate the various ways all of this went wrong
      • ruaok
        also, the dump locks script should be turned off.
      • ocharles
        various bits of uncommitted configuration, non-scripted deployments, etc
      • warp
        hm.
      • ocharles
        ok, looks like pgbouncer is up
      • so can someone drive proxies while I point front ends to talk to that?
      • ruaok
        gimme 2 minutes
      • ocharles
        ok, i'll just do it
      • i have to leave in 10 minutes, but I'll be mostly above ground
      • ruaok
        back
      • still want help?
      • ocharles
        all done
      • ruaok
        sorry. :(
      • ocharles
        np
      • warp
        hm, that's quite some load on baron.
      • ruaok has learned to not be worried about a high load when a system is still responsive.
      • warp nods.
      • so we're ok for now?
      • ruaok
        postgres-rrd is still complaining.
      • I just forwarded the email to you warp.
      • warp
        hrm, odd that you get the complaints and I don't.
      • ruaok
        ha. the load is down to 15
      • ruaok is even less worried.
      • warp
        oh, cron. not nagios.
      • ok, I get those.
      • ocharles
        bbiab
      • ruaok
        ok
      • kepstin-work joined the channel
      • baron is at 6. nice. :)
      • warp
        :)
      • andreasmandel joined the channel
      • reosarevok
        :/ 502
      • Hope it's an isolated one
      • warp
        hm. timezones.
      • reosarevok
        500...
      • ruaok
        load is definitely higher now.
      • reosarevok
        Oh well, I guess I should go back to writing minipatches
      • nikki gets a bunch of errors about connections timing out to baron.mb
      • ruaok
        the CPU is pegged.
      • we should lower our global connection limit.
      • nikki: fewer errors now?
      • pingu's mobo arrived.
      • nikki
        I'm not seeing any right now
      • reosarevok keeps trying to get the relationship editor to submit stuff
      • warp
        ruaok: the postgresql timezone on baron is different, which is breaking dave's scripts.
      • ruaok
        ah. go ahead and fix the timezone then.
      • warp
        ruaok: I set the server timezone to UTC, so I think that will fix itself after a postgres restart. in the meantime, I patched the scripts.
      • ruaok
        k, thx
      • warp
        (I will have to unpatch them later :)
      • ruaok
        I lowered the global limit to 3000, from 3500.
      • Let's see if that helps a bit.
      • warp
        so does that affect website visitors, web service visitors, or both?
      • ruaok
        everyone, IIRC
      • nikki
        btw, search doesn't seem to be updating on beta. label results are over a day old
      • warp grmbls.
      • warp
        nikki: I'll take a look.
      • yup, there are some null pointer exceptions in the log for ijabz to look at.
      • nikki: I will shut down the updaters for other (other is everything but releases and recordings) and do a full re-index of those.
      • then re-enable them.
      • nikki: have you made a SEARCH ticket for this? (if so, I will attach the logs there, if not... we need to create one :)
      • reosarevok
        Whatever that affects, my edits still aren't going through :(
      • (the global limit thingy)
      • warp
        oh wait
      • the exceptions I'm seeing are just beta search not being able to connect to totoro.
      • ruaok
        reosarevok: I lowered the limit to 2500.
      • not sure that is going to help you. :(
      • warp
        nikki: alright, I'm not going to touch this until totoro is back, I will point beta.musicbrainz.org at the real search servers for the time being.
      • ruaok
        baron is just plain CPU bound. :(
      • I suspect that using something like the new pingu as a temporary DB replacement is going to give better results.
      • it has twice the cores baron does
      • warp
        nikki: beta.musicbrainz.org should be using the production search servers now.
      • nikki
        thanks
      • ruaok
        ocharles: when you return.. I think we may want to make lolo a READ ONLY DB slave, so we can spread the load a little.
      • pgbouncer can help with that right?
      • Prophet5 joined the channel
      • ruaok joined the channel
      • ocharles: ping
      • ocharles
        ruaok: pong, free in ten minutes just on the phone atm
      • ruaok
        k
      • ocharles
        but I don't think pgbouncer can help with that
      • ruaok
        I know people use this configuration. how do we do that?
      • if you can look into that, that would be awesome.
      • I will head to DWNI to start on pingu and totoro.
      • nikki
        ocharles: apparently the caa.org/beta/ urls don't work
      • ocharles
        ruaok: you need your code to know when it's only doing read only work
      • ruaok
        I wonder if we take the most popular ws/2 call and point it elsewhere? would that help?
      • ocharles: think about it.
      • I'm heading to DWNI
      • if you need me, use gtalk mayem@ plz
      • ocharles
        ruaok: i have thought about it, it's a hard problem :)
      • warp
        nikki: fixed.
      • nikki
        warp: thanks
      • warp
        the easiest would be to run a separate musicbrainz-server, point that at a read-only DB, and do the routing in nginx.
      • (if we can identify a certain class of webservice requests which are always read only)
      • ocharles: I guess an authenticated webservice request will need to write last_login_date
      • kepstin-work
        iirc, there's really only a couple ws calls that aren't read-only, right?
      • nikki
        wouldn't all GETs be read-only?
      • warp
        (or however the column is called)
      • kepstin-work
        stuff like the isrc submission, etc.
      • warp
        nikki: quite possibly not if they're authenticated.
      • kepstin-work
        I mean, it's not like we have an editable webservice :)
      • nikki
        I thought ian said we're not using last_login_date now?
      • kepstin-work
        oh, right, most queries can be authenticated to get user tags :/
      • warp
        kepstin-work: they can even be authenticated if the authentication isn't used for anything.
      • kepstin-work
        i suppose picard probably just authenticates all requests if you give it a user account, eh
      • warp
        kepstin-work: so just checking for inc=user-tags isn't enough.
      • if the last_login thing isn't updated, it shouldn't matter.
      • kepstin-work
        can you detect any headers added for authentication, maybe?
      • nikki
        when he checked the other day, it hasn't been updating it since we released ngs
      • kepstin-work
        since all write calls need authentication
      • nikki
        and if it's taken 2 years for anyone to notice, it doesn't seem very useful anyway :P
      • warp
        yes, I think we can route based on /ws/2 + no authentication headers. in theory.
      • nikki: it was just put to use in the whole password reset fuss.
      • nikki
        I said "not very useful", not "not at all useful" :P
      • ocharles
        last_login_date is updated since the password problems
      • warp
        I do not acknowledge this theory of usefulness not being binary ;)
      • ocharles
        anyway, yes, it's obviously easy to say "just make non-authenticated GETS load balance" but I'd like a stricter heuristic
      • nikki
        warp: that's your problem then :P
      • ocharles
        I'd /really/ like to have decent effect typing in the data layer to say "this is a read only transaction" which would fail at compile time if you try and write inside it
      • warp
        ocharles: right. but mb server is not written in haskell :)
      • ocharles
        warp: and that is why I haven't done any work on this :)
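
The loose heuristic discussed above (send /ws/2 GETs that carry no authentication headers to a read-only backend) could be sketched roughly as follows. This is only an illustration in Python, not code from musicbrainz-server or its nginx config: the function name, the `AUTH_HEADERS` set, and the choice to treat `Cookie` as an authentication header are all assumptions. It is also the "non-authenticated GETs" rule that ocharles says he would like to tighten, so it should be read as a starting point rather than a safe routing policy.

```python
from typing import Mapping

# Hypothetical: header names treated as "the request is authenticated".
# Which headers clients actually send is an assumption here.
AUTH_HEADERS = {"authorization", "cookie"}


def can_use_read_only_backend(method: str, path: str,
                              headers: Mapping[str, str]) -> bool:
    """Return True if a request looks safe to route to a read-only DB slave.

    The rule is deliberately conservative: anything that is not a plain,
    unauthenticated GET/HEAD under /ws/2 stays on the master-backed servers.
    """
    # Write calls (e.g. ISRC submission) use other methods.
    if method.upper() not in ("GET", "HEAD"):
        return False

    # Only web service v2 paths are candidates.
    if not (path == "/ws/2" or path.startswith("/ws/2/")):
        return False

    # An authenticated request might update something such as a
    # last-login column, so keep it on the master.
    present = {name.lower() for name in headers}
    return not (present & AUTH_HEADERS)
```

For example, `can_use_read_only_backend("GET", "/ws/2/artist/some-id", {"Accept": "application/xml"})` is accepted, while the same request with an `Authorization` header is not, even if the credentials end up unused.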