#metabrainz

      • Leftmost
        LordSputnik, are there any open questions about the database still?
      • LordSputnik
        Leftmost: only planning how to do merging
      • I could maybe blast off a procedure for that tonight
      • Leftmost: if you feel like it, add a couple more of those port jade -> react tasks to GCI
      • I've done one but leaving it unpublished for the time being
      • Leftmost
        Oh, right. I'll keep thinking about how we can deal with revision parenting, since I think that addresses most of the merging problems except the actual property selection.
      • LordSputnik
        (revision and entity display might be easy ones for beginners)
      • Leftmost: ah yeah, I'll think about it now, maybe I'll come up with something
      • I read up on git internals but that hasn't helped much, apart from getting me to think about NES again and that leading to the new model for relationships
      • Leftmost
        The obvious way is having a revision_parent table, but I'm not sure if that's overkill.
      • Yeah, I came up with about the same schema for that driving home from the store.
      • LordSputnik
        So, when we merge, we have two revisions - those are parents to the merge revision, correct?
      • Then, if we revert the merge (split), we create two new revisions, with the previous data, and with the merge revision as the parent?
      • Leftmost
        Yes.
      • (Can we call this CRUMB? Create, Read, Update, Merge, Belete? :))
      • LordSputnik
        Haha, that or DRUMC?
      • Leftmost
        Additionally, a create relationship revision would modify the relationship sets of two different entities. Ideally, the entities would share the same revision and so their master revisions would be the parents. Reverting an add would be single-parent single-child, deleting a relationship would be double-parent (or potentially single-parent) single-child.
      • LordSputnik
        So I think we may need a separate table to keep track of revision relations
      • Lotheric has quit
      • With PK (parent_revision_id, child_revision_id)
      • Freso
        alastairp: Seems like we have a student from New Zealand :)
      • Lotheric joined the channel
      • LordSputnik
        (effectively storing the edges of the directed acyclic graph of revisions)
      • drsaunders has quit
      • Leftmost
        Yeah, that sounds right.
      • drsaunders joined the channel
      • (Man, I love graphs.)
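
The parent/child table discussed above can be sketched as follows. This is only an illustration under stated assumptions: the `revision` table, its `note` column, and the helper names `make_db`, `add_revision`, `parents_of` and `children_of` are hypothetical; only the `revision_parent` name and its `(parent_revision_id, child_revision_id)` primary key come from the conversation. It uses an in-memory SQLite database rather than whatever database the project actually runs on.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding revisions and the DAG edges between them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE revision (id INTEGER PRIMARY KEY, note TEXT);
        -- One row per edge of the revision graph (assumed layout).
        CREATE TABLE revision_parent (
            parent_revision_id INTEGER NOT NULL REFERENCES revision(id),
            child_revision_id  INTEGER NOT NULL REFERENCES revision(id),
            PRIMARY KEY (parent_revision_id, child_revision_id)
        );
    """)
    return conn


def add_revision(conn, note, parents=()):
    """Insert a revision and one edge per parent; return the new revision id."""
    rev_id = conn.execute(
        "INSERT INTO revision (note) VALUES (?)", (note,)
    ).lastrowid
    conn.executemany(
        "INSERT INTO revision_parent VALUES (?, ?)",
        [(parent, rev_id) for parent in parents],
    )
    return rev_id


def parents_of(conn, rev_id):
    rows = conn.execute(
        "SELECT parent_revision_id FROM revision_parent "
        "WHERE child_revision_id = ? ORDER BY parent_revision_id",
        (rev_id,),
    )
    return [row[0] for row in rows]


def children_of(conn, rev_id):
    rows = conn.execute(
        "SELECT child_revision_id FROM revision_parent "
        "WHERE parent_revision_id = ? ORDER BY child_revision_id",
        (rev_id,),
    )
    return [row[0] for row in rows]


conn = make_db()
rev_a = add_revision(conn, "entity A")
rev_b = add_revision(conn, "entity B")
# A merge revision has both merged revisions as parents.
merge = add_revision(conn, "merge A + B", parents=(rev_a, rev_b))
# Reverting the merge (a split) creates two revisions whose parent is the merge.
split_a = add_revision(conn, "restore A", parents=(merge,))
split_b = add_revision(conn, "restore B", parents=(merge,))
```

Here `parents_of(conn, merge)` returns the two merged revisions and `children_of(conn, merge)` returns the two split revisions, which is the two-parent merge and single-parent split shape described in the exchange.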
      • LordSputnik
        That may be the most CS sentence I've ever put together :P
      • Leftmost
        It wasn't CS, it was math!
      • LordSputnik
        True. CS is just applied math anyway
      • MBJenkins has quit
      • Leftmost
        Yep.
      • Obligatory xkcd link: https://xkcd.com/435/
      • LordSputnik
        *maths
      • Freso
        Is that the purity level one?
      • Leftmost
        Freso, you know it.
      • Freso
        Yes. Yes it was.
      • darwin
        500 question purity test
      • LordSputnik
        I'm slowly learning more xkcd through shared links and random strip sprees :P
      • Leftmost
        "Have you ever thought about how your research will be useful?"
      • Freso
        Man. The xkcd guy must feel really accomplished, bringing a kind of second language in its own to so many people.
      • Leftmost
        "Have you ever used a real-world analogue to illustrate a finding?"
      • LordSputnik
      • Freso: did you see the side-scrolling xkcd game the other day?
      • Freso
        Nope.
      • Leftmost
        LordSputnik, I got lost in that for over an hour...
      • Freso
        But I do kind of want the thing explainer.
      • LordSputnik
        Leftmost: I went left, got stuck in the volcano, then went right all the way
      • I also explored the star destroyer a little, but not fully :P
      • Leftmost
        You got stuck in the volcano? But there's more past there!
      • Hmm. Storing source and target entities in the relationship table is now technically denormalized.
      • LordSputnik
        Leftmost: Oh I went back again after reloading :P
      • Leftmost: oh yeah, the relationship table is pretty much just (id, type) now
      • To get the entity BBIDs (hopefully not often needed), you'd join through relationship_set__relationship, relationship_set, <entity>_data, <entity>_revision, <entity>_header :P
      • Although we *will* need them every time we display a relationship... hmmm
      • Nyanko-sensei has quit
      • LordSputnik is now imagining a society of sysadmins like https://www.youtube.com/watch?v=UOs-4J6rr-w
      • I think that could actually save the world :P
      • Leftmost
        You'd need to determine the relationship type, the type of the other entity (which I think becomes non-trivial when we use the flag, as we need to test which entity type column we need to check), then select the entity of that type which has a master revision with data containing a relationship set containing that relationship.
      • Has to be a better way than that. :-P
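
To make the length of that lookup concrete, here is a rough sketch of the join chain for one entity type, again on an in-memory SQLite database. Every column name below is an assumption made for illustration (`set_id`, `relationship_id`, `relationship_set_id`, `data_id`, `master_revision_id`, and `bbid` on the revision table); only the table names follow the ones mentioned in the conversation, with `edition` standing in for `<entity>`. The function name `editions_for_relationship` is likewise hypothetical.

```python
import sqlite3

# Assumed, simplified schema; real column names may differ.
SCHEMA = """
CREATE TABLE relationship (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE relationship_set (id INTEGER PRIMARY KEY);
CREATE TABLE relationship_set__relationship (
    set_id INTEGER, relationship_id INTEGER,
    PRIMARY KEY (set_id, relationship_id)
);
CREATE TABLE edition_data (id INTEGER PRIMARY KEY, relationship_set_id INTEGER);
CREATE TABLE edition_revision (id INTEGER PRIMARY KEY, bbid TEXT, data_id INTEGER);
CREATE TABLE edition_header (bbid TEXT PRIMARY KEY, master_revision_id INTEGER);
"""

# Walk from a relationship back to the editions whose *master* revision
# currently carries a relationship set containing it.
QUERY = """
SELECT h.bbid
FROM relationship_set__relationship AS sr
JOIN relationship_set AS s ON s.id = sr.set_id
JOIN edition_data     AS d ON d.relationship_set_id = s.id
JOIN edition_revision AS r ON r.data_id = d.id
JOIN edition_header   AS h ON h.master_revision_id = r.id
WHERE sr.relationship_id = ?
"""


def editions_for_relationship(conn, relationship_id):
    """Return the BBIDs of editions whose master revision includes the relationship."""
    rows = conn.execute(QUERY, (relationship_id,))
    return sorted(row[0] for row in rows)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executescript("""
INSERT INTO relationship VALUES (1, 'publisher');
INSERT INTO relationship_set VALUES (10), (11);
INSERT INTO relationship_set__relationship VALUES (10, 1);
INSERT INTO edition_data VALUES (100, 10), (101, 11);
-- bbid-x: master revision uses data 100, whose set contains relationship 1.
-- bbid-y: an old revision used data 100, but its master uses data 101.
INSERT INTO edition_revision VALUES
    (1000, 'bbid-x', 100), (1001, 'bbid-y', 100), (1002, 'bbid-y', 101);
INSERT INTO edition_header VALUES ('bbid-x', 1000), ('bbid-y', 1002);
""")
```

With this sample data, `editions_for_relationship(conn, 1)` finds only `bbid-x`, because the query goes through the header's master revision. The same query would have to be repeated per entity type, which is the non-trivial part raised above.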
      • qu3ntin02 joined the channel
      • qu3ntin02
        Hey guys
      • Does bookbrainz have an API?
      • Freso
        This question sure is popular.
      • Leftmost
        If we don't store isSource at the relationship set level (shouldn't really be relevant when only looking at the set, right?), then storing source and target BBIDs in the relationship table isn't denormalized and simplifies everything, heh.
      • LordSputnik
        qu3ntin02: https://bookbrainz.org/ws/ - but there's currently little (no) documentation
      • Leftmost
        qu3ntin02, we do have a web service, but it's worth noting that its output format isn't stable.
      • LordSputnik
        If you know python, though, you can check our ws repository at https://github.com/bookbrainz/bookbrainz-ws
      • qu3ntin02
        Ok I see, thanks a lot
      • mildused
        LordSputnik, What is the difference between the two?
      • LordSputnik
        mildused: which two? :)
      • Leftmost
        mildused, one is the production instance, one is the source for it.
      • mildused
        got it
      • Leftmost
        LordSputnik, can you think of an instance in which we want to know whether something is source or target but don't want to join against the relationship table? That seems like information for the relationship data itself.
      • mildused
        Why not Node.js?
      • darwin
        why not zoidberg
      • LordSputnik
        Leftmost: but. You could also have an entity A with entity_data with a relationship_set pointing to a relationship set, and also have entity B with different entity_data pointing to the same relationship_set, meaning that the info stored in the relationship was wrong
      • Leftmost
        Because python was the direction we initially chose to look. That's changing, though.
      • LordSputnik
        mildused: yeah, WS 1.0 will be node.js :)
      • mildused
        So... how far off is that?
      • LordSputnik
        What we're discussing right now is the reason we haven't started on that yet
      • (ie. schema 1.0)
      • mildused
        Ohh haha
      • Leftmost
        LordSputnik, I don't follow. Why would the info stored in the relationship be wrong?
      • mildused
        still using Redis?
      • LordSputnik
        Leftmost: because two different entities could be the source for a relationship through their relationship_sets
      • mildused: yup
      • Because relationship carries so little information
      • Leftmost
        LordSputnik, how? The relationship set wouldn't know anything about whether something is a source.
      • LordSputnik
        I need to think about the relationships a bit more, something is definitely not good with the way they are in that diagram
      • Leftmost
        Could you draw up an example of why http://yuml.me/edit/446257be would be wrong?
      • Nevermind.
      • One sec.
      • LordSputnik
        My numbers are the wrong way around for the sets
      • Leftmost
      • qu3ntin02 has quit
      • LordSputnik
        source and target are BBIDs?
      • Leftmost
        Yes.
      • LordSputnik
        Then the whole graph needs to change to connect relationship and entity
      • Bookzombie joined the channel
      • But that still wouldn't work if you keep relationship_set
      • Leftmost
        I'm not sure I see why.
      • LordSputnik
        Because any number of entities can share the relationship set, meaning that the relationship BBIDs could easily be invalidated
      • Consider an entity with relationships being merged into an entity with no relationships
      • The resulting entity has a different BBID, but the same set of relationships as entity A
      • mildused
        So tasks should be done for the python web service?
      • LordSputnik
        So that will cause problems, but I don't know exactly what problems until we have a rough merge process worked out
      • mildused: yup, exactly
      • Stuff like testing for the current WS will be useful for when we come to write tests for the new WS in a few months' time
      • Nyanko-sensei joined the channel
      • Leftmost
        LordSputnik, the procedure would be the same, I think, as if you merged two entities both with relationships: new relationship data will have to be created pointing to the correct entity and a new relationship set created.
      • LordSputnik
        Leftmost: OK, but you'd also have to duplicate the relationships themselves, which is why I'm not sure how much RelationshipSet makes sense when we have BBIDs on the relationship table
      • Need to think some more
      • Leftmost
        Okay.
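
The concern in this exchange can be illustrated with a small in-memory model. It is a sketch, not the project's schema: the `Relationship` and `Entity` classes, their field names, and the three helper functions are all hypothetical. It assumes relationships store source and target BBIDs and that a merge produces an entity with a new BBID, as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    source_bbid: str
    target_bbid: str
    type: str


@dataclass
class Entity:
    bbid: str
    relationship_set: frozenset


def merge_sharing_set(a, b, new_bbid):
    """Naive merge: the new entity reuses the existing relationship rows unchanged."""
    return Entity(new_bbid, a.relationship_set | b.relationship_set)


def merge_rewriting(a, b, new_bbid):
    """Merge that creates new relationship rows pointing at the merged entity."""
    old_bbids = {a.bbid, b.bbid}

    def fix(bbid):
        return new_bbid if bbid in old_bbids else bbid

    rewritten = frozenset(
        Relationship(fix(r.source_bbid), fix(r.target_bbid), r.type)
        for r in a.relationship_set | b.relationship_set
    )
    return Entity(new_bbid, rewritten)


def stale_relationships(entity):
    """Relationships in the entity's set that mention neither end as this entity."""
    return [
        r for r in entity.relationship_set
        if entity.bbid not in (r.source_bbid, r.target_bbid)
    ]


# Entity A has one relationship, entity B has none; merging yields BBID "C".
entity_a = Entity("A", frozenset({Relationship("A", "P", "publisher")}))
entity_b = Entity("B", frozenset())

naive = merge_sharing_set(entity_a, entity_b, "C")
fixed = merge_rewriting(entity_a, entity_b, "C")
```

In the naive merge, `stale_relationships(naive)` still contains the row whose source is `A`, even though the set now belongs to `C`. The rewriting merge avoids that, at the cost of duplicating every relationship row, which is the trade-off being weighed against keeping a shared relationship set.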
      • LordSputnik
        Another problem. If you have 5 relationships between an entity and 5 other entities, then you have to make 6 revisions to change them all
      • If you revert only one of those revisions, the relationships become out of sync
      • So, I don't even think we can handle relationships in the same versioning system as entity data
      • identifiers work because they only involve one entity (the other is some external identifier)
      • I'm going to leave it at http://yuml.me/edit/1f859c14 for tonight
      • natta, all! :)
      • d4rkie joined the channel
      • Nyanko-sensei has quit
      • Leftmost
        LordSputnik, we wouldn't need to have more than one revision, I think.
      • With the change from revisions being single-parent, we can change arbitrarily many entities with a single revision.
      • I'm not sure if it's a good thing to change related entities, though. That does seem to balloon pretty fast.
      • darwin
        (DBA suggests cascading updates are often bad)
      • Leftmost
        I'm not willing to give up yet, though. I think putting relationships into the same versioning system can still work, just need to figure out how.
      • Rayna joined the channel
      • Freso
        LordSputnik | Another problem. If you have 5 relationships between an entity and 5 other entities, then you have to make 6 revisions to change them all -- or 10 revisions; 5 revisions on the single entity and 1 revision on each of the five. If a revision is reverted, you "just" "undo" the related revision on the other end too.
      • s/;/:/
      • Leftmost
        Ontologically-speaking, a relationship is a link between two entities. If the identity of one of the entities changes, we want to change the link but not the other entity.
      • This does create practical problems in finding out what that other entity is, though.
      • Rayna has quit
      • LordSputnik, okay, thinking about it: to fetch an edition with a publisher rel, we go EditionHeader -> EditionRevision -> EditionData -> RelationshipSet -> RelationshipSet_Relationship -> Relationship -> RelationshipType already, just to display the edition side. We already know the other side is a publisher, so we just have to do the other side without the last two steps, which is only one join more than we'd have to do to get the publisher side of it anyhow.
      • We could possibly even eliminate the PublisherHeader part of it, since PublisherRevision knows the BBID.
      • Well, no, we probably don't want to do that. We still want to get the master revision if we're fetching the edition's master revision.
      • Leo_Verto has quit
      • Lotheric has quit
      • Lotheric joined the channel
      • D4RK-PH0ENiX has quit
      • opatel99 joined the channel
      • opatel99
        Hello again everybody.
      • MajorLurker joined the channel
      • ariscop has left the channel
      • D4RK-PH0ENiX joined the channel
      • opatel99 has quit
      • Bookzombie
        New pull request "Login and Register pages converted into react.js" by anniezhou301: https://github.com/bookbrainz/bookbrainz-site/p...
      • D4RK-PH0ENiX has quit