#musicbrainz-devel

/

      • nikki
        that's harder because it's closed edits too
      • ianmcorvidae
        yeah; I'd looked into that but I couldn't figure out how to fix that one
      • navap
        :(
      • ianmcorvidae
        we may need to kick up the statistics target for some columns of the edit table, I dunno
      • nikki
        it's really weird though, because plenty of people with plenty more edits work fine
      • and plenty of people with plenty less edits also work fine
      • ianmcorvidae
        it's choosing a query plan that's sort of silly
      • so
      • basically the thing is
      • nikki
        it's stupid? :P
      • navap
        It's out to get me
      • ianmcorvidae
        above about 20% of the table being returned a sequential scan is best
      • navap
        ANd jeroenbot
      • ianmcorvidae
        below that an index scan usually is
      • however right around that boundary it's often sort of a toss-up
      • whihc is where navap and jeroenbot are
      • it's choosing a sequential scan but probably should be doing an index in those cases
      • but I'm not sure how to convince it of that :)
      • navap
        I don't understand what you mean by 20% of the table is being returned with a sequential scan
      • Can you elaborate?
      • nikki
        20% of 20 million is 4 million edits...
      • ianmcorvidae
        er, maybe it's not 20%
      • looking at the ticket it's a frequency of about .0025 or higher that it's choosing a sequential scan instead of using edit_idx_editor
      • navap
        Isn't that 10,000? I've got 40,000+ edits
      • ianmcorvidae
        this applies to people with more edits, e.g. nikki/jesus as well, but for them it's the correct choice
      • nikki
        it loads in about 5 seconds for me
      • ianmcorvidae
        yeah, you're near the upper end of the people having trouble
      • navap
        jeroenbot has 70,000
      • ianmcorvidae
        basically the issue is, sequential reads are easier than random reads
      • indexes are random reads
      • so it'll chose a sequential scan if it thinks it'll be touching enough of the table that the benefits of sequential over random reads will be better
      • it's wrong about where that boundary is though
      • and I don't know how to make it smarter about that
      • it may have something to do with the recency of edits, too -- the plan it's choosing is a backwards sequential scan
      • databases are hard :P
      • or perhaps more simply: I got lucky yesterday figuring out how to fix the /edit/open thing :)
      • navap
        heh
      • I notice that rika doesn't have a problem
      • ianmcorvidae
        hm, true
      • ianmcorvidae pokes at that again
      • (I think I'd looked at that before and decided why, but I forget, so)
      • ijabz joined the channel
      • navap
        jeroenbot takes quite long on rika though
      • ALthough it does eventually load
      • ianmcorvidae
        the query is SELECT edit.id, edit.editor, edit.open_time, edit.expire_time, edit.close_time, edit.data, edit.language, edit.type, edit.yes_votes, edit.no_votes, edit.autoedit, edit.status, edit.quality FROM edit WHERE (editor = ?) ORDER BY id DESC OFFSET ? LIMIT 500 btw
      • and yeah, it's fast on my system, but it's using the index scan
      • navap shrugs
      • hm
      • navap
        Thankfully (unfortunately) I don't edit much these days so I don't have to refer to my edits as much
      • ianmcorvidae
        what's in common here is that rika and my thing both have the default default_statistics_target (of 100), while totoro has 300
      • so I guess that's probably why
      • navap
        But today I introduced someone to MB and to give them an example of how people edit I tried to show them my edit queue. They weren't impressed :p
      • ianmcorvidae
        haha
      • navap
        I also showed them the IA today and tried to relate it by talking about the wayback machine.
      • They tried to view an old google page and kept getting 50x ater 50x
      • Not a good demo
      • ianmcorvidae
        heh
      • navap
        It all started from the 10PB sticker on my laptop :)
      • Freso
        Anyone know if there's a way to attach a branch on GitHub to an issue?
      • ianmcorvidae
        it's annoying, as I recall
      • but possible
      • I don't remember specifics other than you have to use curl to their API, you can't do it through their web interface
      • Freso
        I'll DDG it.
      • SO to the rescue!
      • ianmcorvidae
        ah, yeah, that
      • ianmcorvidae waits for this query to run forever on totoro :/
      • heh
      • navap, you haven't edited much lately
      • it's having to scan the whole way back to edit 15496502 evidently
      • mb-chat-logger
      • navap
        Oh, fancy linking
      • ianmcorvidae
        (locally, that's 4,918,231 edits from current)
      • navap
        uh, what? I've edited much more recently
      • ianmcorvidae
        hm
      • that's supposedly your 500th-most-recent edit
      • navap
        That's possible
      • My editing numbers went down drastically around the time NGS released
      • ianmcorvidae
        (by contrast: nikki's is from the third of january :P)
      • nikki
        :P
      • ianmcorvidae checks jeroenbot
      • navap
        It hasn't edited since 2010
      • ianmcorvidae
        ah
      • yeah; the 'bad' plan it's choosing is doing an index scan backwards on edit_pkey
      • i.e., it's looking at every single edit in descending order by ID
      • Ben\Sput has left the channel
      • of course I still have no idea why it's choosing that plan :/
      • navap
        There must be several users who have the same problem as mine, I wonder why no one else has said anything
      • ianmcorvidae
        I guess it seems like this is most likely to affect people who used to edit a lot but don't edit as much now -- which is a perfect category for not noticing it's broken, I guess, if people are on a downswing in participation
      • Freso
        ianmcorvidae: If it's any help, CatCat's edit page always 50x's on me.
      • (To give you another data point to compare with, if needed, etc.)
      • ianmcorvidae
        well, it certainly gets the same query plan
      • could you comment on the ticket to add that note though? easy to forget the list :)
      • Freso
        Quick link to the ticket? :)
      • ianmcorvidae
        MBS-3793
      • mb-chat-logger
      • ianmcorvidae
        heh, once again it uses the index locally
      • Freso
        But, hey, even if you don't know *how* to fix it, it seems you've found *what* needs fixing.
      • That's a big step.
      • ianmcorvidae
        well, not really
      • I know the way to solve it for right now given the data we have
      • (other than actually how to make postgres do it, but you know what I mean)
      • but I don't really know what needs fixing to make it not just break again when we get another million edits, or something
      • and yeah, catcat's edits are getting the same problematic query plan, which results in postgres having to scan through more than a million edits before it gets all of the most recent 500 :/
      • zas
        anyone knows who is "mwiencek" (on github) here ? i'm still trying to match MB/github/irc nicks ;)
      • ianmcorvidae
        zas: that's bitmap
      • zas
        ty ;)
      • ianmcorvidae
        'lalinsky' on github is 'luks' here, and I forget what Mineo's github name is
      • Freso is "Freso" on GitHub. And MB. But on IRC, I'm "Freso".
      • zas
        i'm zas, everywhere.
      • :)
      • ianmcorvidae
        ah, mineo is mineo on github
      • that's good
      • I've likewise chosen the completely different name 'ianmcorvidae' for github ;)
      • Freso
        caller#6 isn't "caller#6" anywhere, because most places don't allow hash in the username. :|
      • ianmcorvidae
        yeah, he's usually callerNo6 or something
      • but that's understandable enough
      • Freso
        ianmcorvidae: You should do something to really mindfuck with people and call yourself ianncorvidae or iamcorvidae etc.
      • ianmcorvidae
        hah
      • Freso
        ianmcorvidae: People would recognise it being you, but they'd never be able to spell it right. >:D
      • ianmcorvidae
        haha
      • navap
        hrglgrgl comes to mind
      • ianmcorvidae
        hrglgrmpf, yes :P
      • zas
        btw, i'm trying to work on Picard, but it seems it takes ages to have anything get in, or even few reviews on PRs. It just doesn't work. Not that my PRs are good or anything, but it is very hard to have feedback or help to improve them.
      • navap
        :)
      • ianmcorvidae
        ianmcørvıdæ clearly
      • Freso
        navap: Only people already cannot spell his name.
      • ianmcorvidae
        yeah, we don't have many people working on picard :/
      • Freso
        navap: Ian's name is easy enough to spell, so add or remove or change a single character, and it'll mess with people.
      • ianmcorvidae
        ianmcovridae
      • that one would be fun :P
      • navap
        heh
      • zas
        about it (Picard), it doesn't support https, but it seems MB will require it soon
      • Freso
        zas: Poke Mineo and bitmap and luks on IRC until they give sound. Wait a day or two. Then repeat.
      • ianmcorvidae: :D
      • ianmcorvidae
        nah, requiring HTTPS will be a ways off
      • zas
        Freso: then i need a bot ;)
      • navap
        ianmcorvidae: What about user authentication?
      • ianmcorvidae
        it's required for oauth, but considering that luks is the one who wrote MB's (pending) oauth support in the first place...
      • navap: for the /login page, sure, but that's all
      • for the webservice digest auth is still totally fine
      • Freso
        zas: I doubt it'll be hard to add HTTPS support though.
      • ianmcorvidae
        picard certainly doesn't use mb.org/login
      • Freso
        And OAuth still isn't merged into beta, let alone test or anywhere.
      • navap
        So no immediate plans to move to digest+ssl? Or is that not really required?
      • zas
        Freso: i agree, but since no one seems to work on this software.... it may be takes longer ;)
      • Freso
        But Python supports HTTPS easily enough, and I'm sure the Qt stack that I've read several people praise wouldn't make the built-in support worse.
      • zas: Like ianmcorvidae said, luks is the principal maintainer of Picard, and he's also the one who wrote the OAuth patch - partially becuase he wanted OAuth for Picard.
      • zas
        well, what i meant is that Picard wasn't designed with https in mind, looking at hardcoded "http://..." string concat in it
      • Freso
        zas: I don't think a Picard patch for OAuth support is far off once the patch actually moves closer to beta/master of musicbrainz-server.