#bookbrainz-devel

      • LordSputnik
        No, I don't either, I'd like to have something more like Wikipedia's talk pages, but that's more complex, so I've left it out for now
      • Leftmost
        I'm still a little concerned that an annotations table would balloon, and I think we should treat disambig like other data since it should be smallish.
      • LordSputnik
        Storing them separately should reduce the amount of space they use, rather than increase it
      • Leftmost
        Yeah.
      • LordSputnik
        The size of those tables will almost certainly be smaller than the size of the Entity table, since most entities won't have them
      • Leftmost
        My current line of thinking is annotation as ID, disambig as inline data, encourage people to think of annotations as a stopgap for storing information our schema can't yet incorporate and that may be lost down the road if we move to a better system.
      • LordSputnik
        (and the ids are set to null in the EntityTree if there's no disambig/annotation)
      • Leftmost
        I'm reluctant to commit to storing an annotations table forever, as something about it makes me sad inside.
      • Does that seem reasonable?
      • LordSputnik
        I still don't see the benefit in storing disambiguations inline, I'm afraid
      • I don't think we gain from it, and we increase the amount of memory needed
      • Leftmost
        I see it as a rehash of the names table in MB, which proved to be a pain to maintain and turned out to be hard to define semantically. I think we should be storing names inline and I don't think that disambig is materially different from a name as it relates to storage.
      • Gentlecat has left the channel
      • We're committing to a certain amount of denormalization with the data table as it stands.
      • LordSputnik
        How so?
      • I thought things were reasonably normalized
      • Leftmost
        Multiple data tables can contain substantially similar data referring to the same entity with only minor changes, and that's intentional.
      • LordSputnik
        Well, it's not redundant
      • The information is in the combinations of aliases/annotation/disambiguation/data
      • There's no way of storing that which is more efficient than foreign keys to separate tables, and no way we can reduce the duplication without losing the information
      • ocharles_: kuno: ping, if you're around to give some database-fu
      • Leftmost
        LordSputnik, it also creates a separate join to a generic table which can have an arbitrarily large set of data, which has caused its own set of problems with edits on MB.
      • The table will never be as large as the edits table, but it carries a performance cost and stores entity-specific information in a generic table.
      • LordSputnik
        The EntityData table?
      • Leftmost
        No, the disambiguation table.
      • kuno
        graph databases!
      • Leftmost
        We don't store end dates in a separate table referenced by ID because end dates are small and it creates overhead to do a lookup.
      • Disambig is larger but still relatively small, should still have an upper limit on size, and would also create lookup overhead.
      • Storing per-entity information in a generic table seems worse to me than small duplications of data.
      • LordSputnik
        Oh, so you're suggesting we merge the Disambiguation and EntityData tables
      • Leftmost
        Yes.
      • LordSputnik
        That makes more sense
      • I thought you meant the Disambiguation and EntityTree tables
      • Leftmost
        Oh, sorry. I think I was being really unclear about that.
      • LordSputnik
        To be honest, I'm not 100% sure we need to separate EntityTree and EntityData, but I wanted to hear from ocharles_ before I thought any more about that
      • Since EntityTree is just an entity-type independent way of storing pointers to other tables, and there's no reason EntityData couldn't do that
      • Leftmost
        Here's what I'm thinking, high-level, in terms of data structures:
      • LordSputnik
        So, I'd suggest we keep that as it is for now, until I've discussed that with ocharles_
      • Oh go on
      • Leo_Verto
        oshit
      • Leftmost
        An Entity is a structure with an ID, a type, a rooted tree, and a node pointer associated with it. Each node of the rooted tree is a revision, which contains an edit note, a date, a pointer to a parent node, and a pointer to the data which makes up that revision. The node pointer on the entity points to the current master revision node.
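A minimal sketch of the structure Leftmost describes here, written as plain Python dataclasses rather than database tables. The class and field names (`Revision`, `Entity`, `note`, `root`, `master`, `history`) are assumptions made for illustration and do not come from the BookBrainz schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class Revision:
    """One node of an entity's rooted revision tree (hypothetical names)."""
    note: str                            # edit note
    data: Any                            # pointer to the data making up this revision
    parent: Optional[Revision] = None    # None only for the root revision
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class Entity:
    entity_id: int
    entity_type: str
    root: Revision      # root of the revision tree
    master: Revision    # node pointer: the current master revision

    def history(self) -> list[Revision]:
        """Walk parent pointers from the master revision back to the root."""
        node: Optional[Revision] = self.master
        out: list[Revision] = []
        while node is not None:
            out.append(node)
            node = node.parent
        return out


root = Revision(note="create", data={"name": "Dune"})
edit = Revision(note="expand name", data={"name": "Dune (novel)"}, parent=root)
book = Entity(entity_id=1, entity_type="Publication", root=root, master=root)

# Applying the edit only moves the master pointer; no revision is modified.
book.master = edit
```

With no parallel editing every revision has a single child, so the tree degenerates into a list, which is the list-like behaviour mentioned later in the discussion.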
      • Leo_Verto
        restarting the entire network right now
      • LordSputnik
        Leo_Verto: yeah I've seen :P
      • Leftmost: let me draw that
      • mb-chat-logger joined the channel
      • Leftmost: ok, where are aliases?
      • Leftmost
        Not sure offhand. That was a very high-level picture and I'm still not used to the idea of putting aliases in a separate table.
      • LordSputnik
        Ok, so, it doesn't group revisions together at all
      • Leftmost
        No, only by parent.
      • LordSputnik
        But if we're not voting I guess we don't really need edits to group things?
      • Leftmost
        Application of an edit is accomplished by moving the master revision pointer.
      • LordSputnik
        Oh, I mean, in the current schema, edits are groups of revisions which get applied together
      • Leftmost
        Right, I meant in my schema.
      • LordSputnik
        Ok, I think this would work, but we still have the issue of storing different data for different entity types
      • Leftmost
        Not necessarily. Our entity structure can store type without using a typed tree.
      • Leo_Verto
        bb.org back up
      • Leftmost
        Each EntityData structure is associated with only a single Entity, so everything up to that point can be generic.
      • LordSputnik
        Well, what would the EntityData for each revision contain?
      • Leftmost
        Any type-agnostic revision information and a pointer to a type-specific data struct.
      • LordSputnik
        So, like an EntityTree is now?
      • Leftmost
        Yes, similar.
      • LordSputnik
        So, the main difference is in the organisation of revisions, then
      • (and also presumably storing annotations and disambiguations inline in the entity-type agnostic data)
      • Leftmost
        Yeah.
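One way to picture the split agreed on above: a type-agnostic record per revision that holds the disambiguation and annotation inline, plus a pointer to a type-specific struct. This is only a sketch; `EntityData`, `CreatorData`, `PublicationData` and all of their fields are hypothetical names, not the actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class CreatorData:
    """Type-specific data for a creator (fields are illustrative only)."""
    begin_date: Optional[str] = None
    end_date: Optional[str] = None


@dataclass(frozen=True)
class PublicationData:
    """Type-specific data for a publication (fields are illustrative only)."""
    publication_type: Optional[str] = None


@dataclass(frozen=True)
class EntityData:
    """Type-agnostic revision data; everything up to here can be generic."""
    disambiguation: Optional[str]    # stored inline, None when absent
    annotation: Optional[str]        # stored inline, None when absent
    specific: Union[CreatorData, PublicationData]


herbert = EntityData(
    disambiguation="science fiction author",
    annotation=None,
    specific=CreatorData(begin_date="1920", end_date="1986"),
)
dune = EntityData(
    disambiguation=None,
    annotation=None,
    specific=PublicationData(publication_type="Book"),
)
```

Code that only needs the generic fields never has to inspect `specific`, which is the sense in which the entity type can be stored without a typed tree.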
      • LordSputnik
        Right, I'll think about that
      • Let's go onto the final few topics
      • SSH
      • Leo_Verto
        yeah
      • LordSputnik
        Leftmost: in order to set you up on the bb.org sandbox, I need the public SSH keys for the PCs you want to connect from (there's no password authentication afaik)
      • So I guess the best way to do that would be in an email
      • Ok, next is Search (who wanted to talk about that?)
      • Leftmost
        I did.
      • LordSputnik
        btw Leo_Verto: if you wanted SSH access to the bookbrainz server, we can probably sort that out too (although I should probably check with ruaok first)
      • Leftmost
        I've looked into it and it seems that choosing between solr and elasticsearch largely comes down to preference. I've got some experience with elasticsearch. It's dead easy to use and dead easy to get running. Any objection to moving forward with an ES search implementation?
      • LordSputnik
        No objection here
      • Leo_Verto
        probably not essential but could be useful in certain situations
      • LordSputnik
        Ok, I'll ask him next Monday, when he's not so busy (hopefully)
      • Leftmost: either way I'd have to learn one or the other, and if it's easy to set up, we can probably switch between them without too much fuss if we really need to
      • Leftmost
        LordSputnik, public key sent.
      • Leo_Verto
        *decrypting transmission*
      • LordSputnik
        Leftmost: ok, I'll see what I can do when I have a proper internet connection again
      • Finally, Setup (also you, Leftmost?) :)
      • Leftmost
        Yep.
      • Just a general note that we should work on making setup easier. I don't know if I mentioned that yesterday.
      • LordSputnik
        nope
      • What bits particularly?
      • I thought it was fairly straightforward (especially compared to MB)
      • Leo_Verto
        mhm
      • Leftmost
        It wasn't difficult or confusing, but I think in general it could be smoother.
      • Leo_Verto
        the frontend setup is pretty well documented by now
      • LordSputnik
        so, currently, you have to install bbschema, install postgres and redis, then clone bbws
      • Install dependencies with pip, then launch the ws
      • Leo_Verto
        oh yeah, installing redis is entirely undocumented
      • Leftmost
        The READMEs for each component should include a short setup howto and any system reqs, I'd say, and -schema or -ws should be able to set up the database with only a user/password to its name.
      • LordSputnik
        Clone the site, npm install the dependencies, compile the javascript, then launch the site
      • Ok, currently database setup is done in a separate script in bbschema
      • Leftmost
        I got it set up without difficulty, just things that may be worth keeping in mind.
      • LordSputnik
        we could move that somewhere else, maybe, or try to integrate it with the ws config file
      • Leo_Verto
        if we get the config system done, we could provide configs for having a whole local setup or just working on the site using the bb.org ws
      • LordSputnik
        (or have a separate bbschema config file containing the database settings?)
      • Leftmost
        It probably belongs in -ws if -schema is just a lib for interacting purely with the schema instead of making database calls.
      • LordSputnik
        Well, that's the thing...
      • I've just started moving some of the editing logic into schema
      • There's a blurry line between them
      • Leftmost
        Hmm. That may be worth some thought, then.
      • LordSputnik
        Ok, anything else for discussion today?
      • Leftmost
        Not that I can think of. Do you wanna think over the -ws/-schema split and talk about it next meeting, maybe?
      • LordSputnik
        Ok, should we aim for 9:30 next time?
      • Leo_Verto
        I can do the publisher/edition forms provided the ws endpoint exists
      • Leftmost
        Sounds good to me. Same bat time (plus ten minutes), same bat channel?
      • LordSputnik
        Leo_Verto: yes, it does - same endpoint as the creator/publisher ones
      • Leo_Verto
        unless we want to hold that off until after the new call model
      • LordSputnik
        Leftmost: haha yeah
      • Leo_Verto
        ah good
      • LordSputnik
        Leftmost: do you still have some time now?
      • Leftmost
        I do.
      • LordSputnik
        Ok, I think we can resolve the differences between your new proposed schema and our current one
      • Leftmost
        Leo_Verto, up to you. It may be next meeting before I even have PRs up for discussion on frontend stuff, and new changes should be relatively easy to integrate.
      • LordSputnik
        So, your proposal has revisions organised in a tree for each entity, which is optimal for parallel revisions being merged, right?
      • Leo_Verto
        oh, one more thing
      • I want to start working on better accessibility of the site
      • stuff like adding title texts for icons
      • LordSputnik
        Go for it, that's always welcome
      • Leftmost
        LordSputnik, right.
      • LordSputnik
        labels for screenreaders are also helpful, but I've been leaving them out up until now :(
      • Leftmost
        Leo_Verto, please do. I even have an accessibility consultant if you want feedback.
      • Leo_Verto
        is there a proper way to implement those labels?
      • Leftmost
        (My father is blind.)
      • LordSputnik
        Leftmost: is better support for merging parallel work the only reason for using a tree rather than a list of revisions?
      • Leftmost
        There is. I can get you more information by tomorrow if you want.
      • LordSputnik, off the top of my head, yeah.
      • LordSputnik
        Leo_Verto: there's something like an "aria-*" attribute - it's documented in bootstrap, I think
      • Leo_Verto
        Leftmost, you should join our mailing list https://groups.io/org/groupsio/bookbrainz-devel...
      • Leftmost
        There's also form labelling stuff baked into HTML, I believe.
      • Okay.
      • LordSputnik
        Leftmost: ok, so which one we use should depend on whether we need that level of detail in the revision history
      • Leo_Verto
        oh and I also want to add a nice button to the editor page to send a message, the problem here is vertically centering the button
      • Leftmost
        LordSputnik, I think it's a useful concept. It's a very simple data structure, it's list-like if there's no parallel editing, and it makes it easier for multiple changes to be made and merged.
      • It's also very adaptable in terms of how we choose to display this to the end user.
      • LordSputnik
        Ok
      • Leftmost
        I've been doing a lot of data structures reading lately, so my mind may still be there. :)
      • LordSputnik
        Second thing is: do we really want edits/merge requests *and* revisions?
      • Leftmost
        I don't see the need myself.
      • LordSputnik
        Assuming the primary way of peer review is through verifying data, rather than reviewing edits, I don't think so either
      • So, we have a tree of revisions, no edits, and I'd assume that revisions can be reverted easily
      • Leftmost
        Yep. A revert would basically just be a new revision with an old dataset.
      • LordSputnik
        Leftmost: wouldn't it just be setting the master_revision further down the tree?
      • OH
      • * Oh, no I guess you'd want a note too
      • rather than silently changing history
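The two revert options discussed here can be contrasted in a short sketch. Moving the master pointer back would leave no trace, whereas adding a new revision that reuses the old dataset keeps the history intact and carries a note. The `Revision` class and the `revert` helper below are hypothetical names used only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Revision:
    """A revision node: edit note, data pointer, parent pointer (hypothetical)."""
    note: str
    data: Any
    parent: Optional[Revision] = None


def revert(master: Revision, target: Revision, note: str) -> Revision:
    """Build a new revision on top of the current master that reuses the
    data of an older revision, so the revert itself is recorded."""
    return Revision(note=note, data=target.data, parent=master)


v1 = Revision(note="create", data={"name": "Dune"})
v2 = Revision(note="rename", data={"name": "Dnue"}, parent=v1)

# The revert becomes the new master; v2 stays in the history.
v3 = revert(master=v2, target=v1, note="revert to v1: typo in name")
```

Because `v3` points at the same data as `v1`, the revert stores no new dataset, only a new node with its own note and parent.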