but I think EXCEPT may be the way to go, so i'll have a look into that
that can kill the entire loop
ianmcorvidae
switching that to <> it does seem to be slower than the 1s you're getting
so that seems right
yeah, 15711.849ms
ocharles
nice work postgres!
ianmcorvidae
though that's still wrong, I guess
so do your thing and I'll stop pretending I know things :P
ocharles
wrong?
ianmcorvidae
I'm selecting everything that has edits, but where the edits are non-open
ocharles
and don't be so self-deprecating :P
ianmcorvidae
where what we want is that or has no edits
though I guess every entity at least has a create edit so maybe that's fine
ocharles
ah, I see what you mean
ianmcorvidae
and it's not entirely wrong, I didn't know EXCEPT existed
ocharles
it gets a bit worse as you start to tack more things on
4s with half the relationships and artist credits
though with an upper bound of two hours, anything is an improvement really
ianmcorvidae
haha
I don't know how the loop ends up getting implemented, but I guess the next most likely to be optimal is to do a single except with the most selective condition and then do the rest as part of the loop
I don't know why that'd work out better though, except postgresql tends to regularly prove itself smarter than I expect, so :P
ocharles
except'ing everything gives me 29 rows in 6s
as in except one at a time
so that seems pretty good, and about the right number!
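To make the difference discussed above concrete, here is a minimal sketch in Python using an in-memory SQLite database. The log is about PostgreSQL; SQLite is swapped in only so the example runs on its own. The tables `entity`, `edit`, and `relationship`, their columns, and the sample rows are hypothetical and are not the musicbrainz-server schema. It contrasts the join with `<>` (which only sees entities that have at least one edit) with chaining one `EXCEPT` per condition.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE entity (id INTEGER PRIMARY KEY);
        CREATE TABLE edit (entity_id INTEGER, status TEXT);
        CREATE TABLE relationship (entity_id INTEGER);

        INSERT INTO entity VALUES (1), (2), (3), (4), (5);
        -- entity 4 has no edits at all; entity 3 has both kinds
        INSERT INTO edit VALUES
            (1, 'open'), (2, 'applied'), (3, 'applied'),
            (3, 'open'), (5, 'applied');
        INSERT INTO relationship VALUES (5);
        """
    )
    return conn


def via_join(conn):
    """Entities with at least one non-open edit (the '<>' approach).

    Misses entities with no edits, and keeps entities that also
    have an open edit.
    """
    rows = conn.execute(
        """
        SELECT DISTINCT e.id
        FROM entity e
        JOIN edit ed ON ed.entity_id = e.id
        WHERE ed.status <> 'open'
        ORDER BY e.id
        """
    )
    return [r[0] for r in rows]


def via_except(conn):
    """Start from every entity and subtract one condition at a time."""
    rows = conn.execute(
        """
        SELECT id FROM entity
        EXCEPT SELECT entity_id FROM edit WHERE status = 'open'
        EXCEPT SELECT entity_id FROM relationship
        ORDER BY 1
        """
    )
    return [r[0] for r in rows]
```

With the sample rows above, `via_join(build_db())` returns `[2, 3, 5]` and `via_except(build_db())` returns `[2, 4]`: the `EXCEPT` form drops entity 3 (it still has an open edit) and entity 5 (it has a relationship), and keeps entity 4 even though it has no edits.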
ianmcorvidae
that's pretty good, yeah
slightly high number of results, possibly
ocharles
well, it didn't run last night
and 15 a day is on par, i think
ianmcorvidae
yeah, suppose so
ocharles
i'll do a bit of manual poking though
our site seems to think they are empty too
gonna make some coffee and then i'll submit a review
ianmcorvidae
cool
ocharles
should probably get this onto astro today so it's ready for the weekend
ianmcorvidae
if we can get it out before tonight's scripts that'd be ideal
weekly things happen tonight, so there's more subscriptions going out than usual
so doing them closer to on-time seems good :)
though I guess there's also a dump so it's not like they'd be on time anyway
no need to be on the astro branch of musicbrainz-server
ianmcorvidae
haha
ocharles
alright, fixed
ianmcorvidae
okay, all of those look good
ocharles
yay
bah, missed one thing that astro has
daily.sh is not meant to be run with perl
CalculateRelatedTags ran fine
ianmcorvidae
k
I'm not actually sure what that does
I realize :P
ocharles
same :p
ianmcorvidae
regenerates the tag_relation table, apparently
wonder if we or anyone else uses that
we don't, at least
MBJenkins
* Oliver Charles: Remove 'carton' from hourly and daily cron jobs
* Oliver Charles: Rewriting all the empty_ sql functions to run considerably faster
ocharles
._.
ianmcorvidae
ocharles: entirely non-pressing unrelated question that popped into my head: how close are we to being able to add statsd instrumentation to musicbrainz-server? (i.e., is there any reason we can't yet other than we haven't written it)
ocharles
we're ready for it
i did have a branch that added some stats, but I probably don't have that now
it's mostly just a case of writing code
any stats you have in mind?
also, seeing a lot of '<Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message><Resource /><RequestId>e556fa4c-8396-4e6d-b62e-443a67c0f433</RequestId></Error>' at the moment...
as in, over 200 of them
that's for metadata.xml it seems
oh, it seems quite random
there are other entries in the log that are ok
hopefully i can get gamekeeper to build in the ppa soon, and we will have monitoring for rabbitmq
MBJenkins
* Oliver Charles: Fix PgTAP tests for empty_* checks
ianmcorvidae
ocharles: well, I was thinking of just some simple things like edits inserted by type
ocharles: nothing hugely pressing or anything, I was just wondering where that all stood
ocharles
right
makes sense!
ultimately, now that I learn more about how this stuff all works, we may consider moving stats there
ianmcorvidae
I'm not sure that all of them make sense there, but possibly
well, I mean, I guess we could just change the calculation script to store them there, if nothing else
ocharles
yea, that's mostly what I'm saying
but haven't given it much thought
ianmcorvidae
I'm not sure how our setup would deal with things that only get updated once a day, though
ocharles
in what sense?
you can have variable retention rates
so statistics would just have 1 point per day
ianmcorvidae
ideally we'd not have it be a "stairstep" type thing like if we just used statsd in the most naive way, I guess
with a statsd gauge it could, ostensibly, end up only updating once a day but storing it every $whatever_period
but as long as it works out to have different periods basically arbitrarily it should be fine :)
ocharles
not quite following what you're suggesting with statsd
ianmcorvidae
well, gauges in statsd
which would probably be the most straightforward implementation here
if you send them data less frequently than whatever period statsd uses to store them in carbon, then it'll just repeat the same value
but I guess this is solved either by pushing it to carbon directly, ourselves, or possibly by configuring statsd such that it doesn't update that particular stat as often (or only when it's given a new value)
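The two options in that last message can be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: the metric name used in the usage note, the hosts, and the helper function names are all hypothetical, and the ports are just the conventional defaults (8125 for statsd over UDP, 2003 for carbon's plaintext listener). The wire formats themselves are the standard ones: `name:value|g` for a statsd gauge, and `path value timestamp` followed by a newline for carbon.

```python
import socket
import time


def statsd_gauge_line(name, value):
    """Format a statsd gauge packet: '<name>:<value>|g'."""
    return f"{name}:{value}|g"


def carbon_line(path, value, timestamp):
    """Format one carbon plaintext line: '<path> <value> <unix time>\\n'."""
    return f"{path} {value} {int(timestamp)}\n"


def send_statsd_gauge(name, value, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send to statsd (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(statsd_gauge_line(name, value).encode("ascii"), (host, port))


def send_to_carbon(path, value, timestamp=None, host="127.0.0.1", port=2003):
    """Write a single datapoint straight to carbon, bypassing statsd.

    The caller controls the timestamp, so a once-a-day statistic
    produces exactly one point per day.
    """
    ts = time.time() if timestamp is None else timestamp
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(carbon_line(path, value, ts).encode("ascii"))
```

A daily statistics script could call something like `send_to_carbon("musicbrainz.stats.count_artist", 42)` once per run; going through `send_statsd_gauge` instead would leave it to statsd's flush interval to decide how often the last value is written out.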