LordSputnik, are there any open questions about the database still?
2015-12-09 34355, 2015
LordSputnik
Leftmost: only planning how to do merging
2015-12-09 34307, 2015
LordSputnik
I could maybe blast off a procedure for that tonight
2015-12-09 34329, 2015
LordSputnik
Leftmost: if you feel like it, add a couple more of those port jade -> react tasks to GCI
2015-12-09 34357, 2015
LordSputnik
I've done one but leaving it unpublished for the time being
2015-12-09 34335, 2015
Leftmost
Oh, right. I'll keep thinking about how we can deal with revision parenting, since I think that addresses most of the merging problems except the actual property selection.
2015-12-09 34352, 2015
LordSputnik
(revision and entity display might be easy ones for beginners)
2015-12-09 34311, 2015
LordSputnik
Leftmost: ah yeah, I'll think about it now, maybe I'll come up with something
2015-12-09 34338, 2015
LordSputnik
I read up on git internals but that hasn't helped much, apart from getting me to think about NES again and that leading to the new model for relationships
2015-12-09 34308, 2015
Leftmost
The obvious way is having a revision_parent table, but I'm not sure if that's overkill.
2015-12-09 34337, 2015
Leftmost
Yeah, I came up with about the same schema for that driving home from the store.
2015-12-09 34335, 2015
LordSputnik
So, when we merge, we have two revisions - those are parents to the merge revision, correct?
2015-12-09 34324, 2015
LordSputnik
Then, if we revert the merge (split), we create two new revisions, with the previous data, and with the merge revision as the parent?
2015-12-09 34350, 2015
Leftmost
Yes.
2015-12-09 34324, 2015
Leftmost
(Can we call this CRUMB? Create, Read, Update, Merge, Belete? :))
2015-12-09 34354, 2015
LordSputnik
Haha, that or DRUMC?
2015-12-09 34342, 2015
Leftmost
Additionally, a create relationship revision would modify the relationship sets of two different entities. Ideally, the entities would share the same revision and so their master revisions would be the parents. Reverting an add would be single-parent single-child, deleting a relationship would be double-parent (or potentially single-parent) single-child.
2015-12-09 34348, 2015
LordSputnik
So I think we may need a separate table to keep track of revision relations
2015-12-09 34348, 2015
Lotheric has quit
2015-12-09 34348, 2015
LordSputnik
With PK (parent_revision_id, child_revision_id)
2015-12-09 34349, 2015
Freso
alastairp: Seems like we have a student from New Zealand :)
2015-12-09 34307, 2015
Lotheric joined the channel
2015-12-09 34349, 2015
LordSputnik
(effectively storing the edges of the directed acyclic graph of revisions)
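A minimal sketch of the revision_parent idea above, assuming SQLite through Python's sqlite3 module purely for illustration. Only the key (parent_revision_id, child_revision_id) comes from the conversation; the revision table, its note column, and the helpers add_revision, parents_of and children_of are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE revision (id INTEGER PRIMARY KEY, note TEXT);
-- One row per edge of the revision graph (parent -> child).
CREATE TABLE revision_parent (
    parent_revision_id INTEGER NOT NULL REFERENCES revision(id),
    child_revision_id  INTEGER NOT NULL REFERENCES revision(id),
    PRIMARY KEY (parent_revision_id, child_revision_id)
);
""")

def add_revision(note, parents=()):
    """Insert a revision and one edge per parent; return the new id."""
    rev_id = conn.execute(
        "INSERT INTO revision (note) VALUES (?)", (note,)
    ).lastrowid
    conn.executemany(
        "INSERT INTO revision_parent VALUES (?, ?)",
        [(parent, rev_id) for parent in parents],
    )
    return rev_id

def parents_of(rev_id):
    rows = conn.execute(
        "SELECT parent_revision_id FROM revision_parent "
        "WHERE child_revision_id = ? ORDER BY parent_revision_id",
        (rev_id,),
    )
    return [row[0] for row in rows]

def children_of(rev_id):
    rows = conn.execute(
        "SELECT child_revision_id FROM revision_parent "
        "WHERE parent_revision_id = ? ORDER BY child_revision_id",
        (rev_id,),
    )
    return [row[0] for row in rows]

# Merge: two existing revisions become parents of the merge revision.
a = add_revision("entity A")
b = add_revision("entity B")
merge = add_revision("merge A and B", parents=[a, b])

# Split (reverting the merge): two new revisions, each with the merge as parent.
split_a = add_revision("split: restore A data", parents=[merge])
split_b = add_revision("split: restore B data", parents=[merge])
```

Here parents_of(merge) lists both merged revisions and children_of(merge) lists both split revisions, which is the two-parent / two-child shape described in the merge and split messages.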
2015-12-09 34303, 2015
drsaunders has quit
2015-12-09 34307, 2015
Leftmost
Yeah, that sounds right.
2015-12-09 34312, 2015
drsaunders joined the channel
2015-12-09 34316, 2015
Leftmost
(Man, I love graphs.)
2015-12-09 34318, 2015
LordSputnik
That may be the most CS sentence I've ever put together :P
Freso: did you see the side-scrolling xkcd game the other day?
2015-12-09 34321, 2015
Freso
Nope.
2015-12-09 34322, 2015
Leftmost
LordSputnik, I got lost in that for over an hour...
2015-12-09 34329, 2015
Freso
But I do kind of want the thing explainer.
2015-12-09 34346, 2015
LordSputnik
Leftmost: I went left, got stuck in the volcano, then went right all the way
2015-12-09 34356, 2015
LordSputnik
I also explored the star destroyer a little, but not fully :P
2015-12-09 34306, 2015
Leftmost
You got stuck in the volcano? But there's more past there!
2015-12-09 34349, 2015
Leftmost
Hmm. Storing source and target entities in the relationship table is now technically denormalized.
2015-12-09 34349, 2015
LordSputnik
Leftmost: Oh I went back again after reloading :P
2015-12-09 34312, 2015
LordSputnik
Leftmost: oh yeah, the relationship table is pretty much just (id, type) now
2015-12-09 34324, 2015
LordSputnik
To get the entity BBIDs (hopefully not often needed), you'd join through relationship_set__relationship, relationship_set, <entity>_data, <entity>_revision, <entity>_header :P
2015-12-09 34325, 2015
LordSputnik
Although we *will* need them every time we display a relationship... hmmm
2015-12-09 34332, 2015
Nyanko-sensei has quit
2015-12-09 34318, 2015
LordSputnik is now imagining a society of sysadmins like https://www.youtube.com/watch?v=UOs-4J6rr-w
2015-12-09 34348, 2015
LordSputnik
I think that could actually save the world :P
2015-12-09 34301, 2015
Leftmost
You'd need to determine the relationship type, the type of the other entity (which I think becomes non-trivial when we use the flag, as we need to test which entity type column we need to check), then select the entity of that type which has a master revision with data containing a relationship set containing that relationship.
2015-12-09 34317, 2015
Leftmost
Has to be a better way than that. :-P
2015-12-09 34308, 2015
qu3ntin02 joined the channel
2015-12-09 34320, 2015
qu3ntin02
Hey guys
2015-12-09 34338, 2015
qu3ntin02
Does bookbrainz have an API?
2015-12-09 34359, 2015
Freso
This question sure is popular.
2015-12-09 34311, 2015
Leftmost
If we don't store isSource at the relationship set level (shouldn't really be relevant when only looking at the set, right?), then storing source and target BBIDs in the relationship table isn't denormalized and simplifies everything, heh.
LordSputnik, What is the difference between the two?
2015-12-09 34350, 2015
LordSputnik
mildused: which two? :)
2015-12-09 34356, 2015
Leftmost
mildused, one is the production instance, one is the source for it.
2015-12-09 34313, 2015
mildused
got it
2015-12-09 34328, 2015
Leftmost
LordSputnik, can you think of an instance in which we want to know whether something is source or target but don't want to join against the relationship table? That seems like information for the relationship data itself.
2015-12-09 34329, 2015
mildused
Why not Node.js?
2015-12-09 34339, 2015
darwin
why not zoidberg
2015-12-09 34304, 2015
LordSputnik
Leftmost: but. You could also have an entity A with entity_data with a relationship_set pointing to a relationship set, and also have entity B with different entity_data pointing to the same relationship_set, meaning that the info stored in the relationship was wrong
2015-12-09 34305, 2015
Leftmost
Because python was the direction we initially chose to look. That's changing, though.
2015-12-09 34321, 2015
LordSputnik
mildused: yeah, WS 1.0 will be node.js :)
2015-12-09 34329, 2015
mildused
So... how far off is that?
2015-12-09 34343, 2015
LordSputnik
What we're discussing right now is the reason we haven't started on that yet
2015-12-09 34352, 2015
LordSputnik
(ie. schema 1.0)
2015-12-09 34301, 2015
mildused
Ohh haha
2015-12-09 34308, 2015
Leftmost
LordSputnik, I don't follow. Why would the info stored in the relationship be wrong?
2015-12-09 34338, 2015
mildused
still using Redis?
2015-12-09 34343, 2015
LordSputnik
Leftmost: because two different entities could be the source for a relationship through their relationship_sets
2015-12-09 34351, 2015
LordSputnik
mildused: yup
2015-12-09 34310, 2015
LordSputnik
Because relationship carries so little information
2015-12-09 34315, 2015
Leftmost
LordSputnik, how? The relationship set wouldn't know anything about whether something is a source.
2015-12-09 34349, 2015
LordSputnik
I need to think about the relationships a bit more, something is definitely not good with the way they are in that diagram
Then the whole graph needs to change to connect relationship and entity
2015-12-09 34337, 2015
Bookzombie joined the channel
2015-12-09 34302, 2015
LordSputnik
But that still wouldn't work if you keep relationship_set
2015-12-09 34324, 2015
Leftmost
I'm not sure I see why.
2015-12-09 34335, 2015
LordSputnik
Because any number of entities can share the relationship set, meaning that the relationship BBIDs could easily be invalidated
2015-12-09 34309, 2015
LordSputnik
Consider an entity with relationships being merged into an entity with no relationships
2015-12-09 34324, 2015
LordSputnik
The resulting entity has a different BBID, but the same set of relationships as entity A
2015-12-09 34354, 2015
mildused
So tasks should be done for the python web service?
2015-12-09 34310, 2015
LordSputnik
So that will cause problems, but I don't know exactly what problems until we have a rough merge process worked out
2015-12-09 34315, 2015
LordSputnik
mildused: yup, exactly
2015-12-09 34354, 2015
LordSputnik
Stuff like testing for the current WS will be useful for when we come to write tests for the new WS in a few months' time
2015-12-09 34301, 2015
Nyanko-sensei joined the channel
2015-12-09 34327, 2015
Leftmost
LordSputnik, the procedure would be the same, I think, as if you merged two entities both with relationships: new relationship data will have to be created pointing to the correct entity and a new relationship set created.
2015-12-09 34312, 2015
LordSputnik
Leftmost: OK, but you'd also have to duplicate the relationships themselves, which is why I'm not sure how much RelationshipSet makes sense when we have BBIDs on the relationship table
2015-12-09 34318, 2015
LordSputnik
Need to think some more
2015-12-09 34332, 2015
Leftmost
Okay.
2015-12-09 34315, 2015
LordSputnik
Another problem. If you have 5 relationships between an entity and 5 other entities, then you have to make 6 revisions to change them all
2015-12-09 34344, 2015
LordSputnik
If you revert only one of those revisions, the relationships become out of sync
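A toy model of the out-of-sync problem just described, assuming each entity's revision simply stores the set of relationship ids it takes part in. The names history, revise, revert and dangling are hypothetical and not part of any BookBrainz code.

```python
from collections import Counter

# history[entity] holds one relationship set per revision; the last is the master.
history = {name: [frozenset()] for name in ["hub", "e1", "e2", "e3", "e4", "e5"]}

def revise(entity, rel_ids):
    """Add a revision giving the entity a new relationship set."""
    history[entity].append(frozenset(rel_ids))

def revert(entity):
    """Revert by appending a new revision that carries the previous data."""
    history[entity].append(history[entity][-2])

def dangling():
    """Relationship ids not present on exactly two master revisions."""
    counts = Counter(rel for revs in history.values() for rel in revs[-1])
    return sorted(rel for rel, n in counts.items() if n != 2)

# Five relationships between "hub" and five other entities.
rels = [f"rel{i}" for i in range(1, 6)]
revise("hub", rels)
for i, rel in enumerate(rels, start=1):
    revise(f"e{i}", [rel])

# One revision on the hub plus one on each of the five others.
revisions_made = sum(len(revs) - 1 for revs in history.values())
in_sync_before = dangling()

# Reverting only the hub's revision leaves the other five entities
# still pointing at relationships the hub no longer lists.
revert("hub")
out_of_sync = dangling()
```

In this model revisions_made counts the six revisions mentioned above, and after the single revert every one of the five relationships is listed on only one side.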
2015-12-09 34316, 2015
LordSputnik
So, I don't even think we can handle relationships in the same versioning system as entity data
2015-12-09 34357, 2015
LordSputnik
identifiers work because they only involve one entity (the other is some external identifier)
LordSputnik, we wouldn't need to have more than one revision, I think.
2015-12-09 34354, 2015
Leftmost
With the change away from single-parent revisions, we can change arbitrarily many entities with a single revision.
2015-12-09 34324, 2015
Leftmost
I'm not sure if it's a good thing to change related entities, though. That does seem to balloon pretty fast.
2015-12-09 34344, 2015
darwin
(DBA suggests cascading updates are often bad)
2015-12-09 34313, 2015
Leftmost
I'm not willing to give up yet, though. I think putting relationships into the same versioning system can still work, just need to figure out how.
2015-12-09 34340, 2015
Rayna joined the channel
2015-12-09 34347, 2015
Freso
LordSputnik | Another problem. If you have 5 relationships between an entity and 5 other entities, then you have to make 6 revisions to change them all -- or 10 revisions; 5 revisions on the single entity and 1 revision on each of the five. If a revision is reverted, you "just" "undo" the related revision on the other end too.
2015-12-09 34306, 2015
Freso
s/;/:/
2015-12-09 34326, 2015
Leftmost
Ontologically-speaking, a relationship is a link between two entities. If the identity of one of the entities changes, we want to change the link but not the other entity.
2015-12-09 34347, 2015
Leftmost
This does create practical problems in finding out what that other entity is, though.
2015-12-09 34357, 2015
Rayna has quit
2015-12-09 34337, 2015
Leftmost
LordSputnik, okay, thinking about it: to fetch an edition with a publisher rel, we go EditionHeader -> EditionRevision -> EditionData -> RelationshipSet -> RelationshipSet_Relationship -> Relationship -> RelationshipType already, just to display the edition side. We already know the other side is a publisher, so we just have to do the other side without the last two steps, which is only one join more than we'd have to do to get the
2015-12-09 34338, 2015
Leftmost
publisher side of it anyhow.
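A rough sketch of that join chain, assuming SQLite through Python's sqlite3 module. The table names follow the ones mentioned in the chat (snake_cased), but every column name, the sample rows, and the helper publisher_rels are assumptions made up for illustration, not the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE relationship_type (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE relationship (id INTEGER PRIMARY KEY, type_id INTEGER);
CREATE TABLE relationship_set (id INTEGER PRIMARY KEY);
CREATE TABLE relationship_set__relationship (set_id INTEGER, relationship_id INTEGER);
CREATE TABLE edition_data (id INTEGER PRIMARY KEY, relationship_set_id INTEGER);
CREATE TABLE edition_revision (id INTEGER PRIMARY KEY, bbid TEXT, data_id INTEGER);
CREATE TABLE edition_header (bbid TEXT PRIMARY KEY, master_revision_id INTEGER);
CREATE TABLE publisher_data (id INTEGER PRIMARY KEY, relationship_set_id INTEGER);
CREATE TABLE publisher_revision (id INTEGER PRIMARY KEY, bbid TEXT, data_id INTEGER);
CREATE TABLE publisher_header (bbid TEXT PRIMARY KEY, master_revision_id INTEGER);

-- Sample data: one edition and one publisher sharing relationship 1.
INSERT INTO relationship_type VALUES (1, 'published by');
INSERT INTO relationship VALUES (1, 1);
INSERT INTO relationship_set VALUES (10), (20);
INSERT INTO relationship_set__relationship VALUES (10, 1), (20, 1);
INSERT INTO edition_data VALUES (100, 10);
INSERT INTO edition_revision VALUES (1000, 'ed-1', 100);
INSERT INTO edition_header VALUES ('ed-1', 1000);
INSERT INTO publisher_data VALUES (200, 20);
INSERT INTO publisher_revision VALUES (2000, 'pub-1', 200);
INSERT INTO publisher_header VALUES ('pub-1', 2000);
""")

QUERY = """
SELECT rt.label, ph.bbid
FROM edition_header eh
-- Edition side: header -> master revision -> data -> set -> relationship -> type.
JOIN edition_revision er ON er.id = eh.master_revision_id
JOIN edition_data ed ON ed.id = er.data_id
JOIN relationship_set__relationship esr ON esr.set_id = ed.relationship_set_id
JOIN relationship r ON r.id = esr.relationship_id
JOIN relationship_type rt ON rt.id = r.type_id
-- Publisher side: walk back from the same relationship to a master revision.
JOIN relationship_set__relationship psr ON psr.relationship_id = r.id
JOIN publisher_data pd ON pd.relationship_set_id = psr.set_id
JOIN publisher_revision pr ON pr.data_id = pd.id
JOIN publisher_header ph ON ph.master_revision_id = pr.id
WHERE eh.bbid = ?
"""

def publisher_rels(edition_bbid):
    """Return (relationship label, publisher BBID) pairs for an edition."""
    return conn.execute(QUERY, (edition_bbid,)).fetchall()
```

The sketch joins through the link table directly rather than through relationship_set itself, and it keeps the publisher_header join so that only the publisher's master revision is matched.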
2015-12-09 34315, 2015
Leftmost
We could possibly even eliminate the PublisherHeader part of it, since PublisherRevision knows the BBID.
2015-12-09 34318, 2015
Leftmost
Well, no, we probably don't want to do that. We still want to get the master revision if we're fetching the edition's master revision.