No, I don't either, I'd like to have something more like Wikipedia's talk pages, but that's more complex, so I've left it out for now
2015-03-06 06531, 2015
Leftmost
I'm still a little concerned that an annotations table would balloon, and I think we should treat disambig like other data since it should be smallish.
2015-03-06 06500, 2015
LordSputnik
Storing them separately should reduce the amount of space they use, rather than increase it
2015-03-06 06517, 2015
Leftmost
Yeah.
2015-03-06 06556, 2015
LordSputnik
The size of those tables will almost certainly be smaller than the size of the Entity table, since most entities won't have them
2015-03-06 06505, 2015
Leftmost
My current line of thinking is: annotation as ID, disambig as inline data, and encourage people to think of annotations as a stopgap for storing information our schema can't yet incorporate, which may be lost down the road if we move to a better system.
2015-03-06 06510, 2015
LordSputnik
(and the ids are set to null in the EntityTree if there's no disambig/annotation)
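[Editor's sketch] The separate-table layout LordSputnik describes, with a nullable ID on the tree row, might look roughly like the following. This is only an illustration: the table and column names are assumptions, and SQLite (from the Python standard library) stands in for the project's Postgres so that the snippet is self-contained.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE disambiguation (
    id INTEGER PRIMARY KEY,
    comment TEXT NOT NULL
);
CREATE TABLE entity_tree (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    -- NULL when the entity has no disambiguation
    disambiguation_id INTEGER REFERENCES disambiguation(id)
);
INSERT INTO disambiguation (id, comment) VALUES (1, 'British novelist');
INSERT INTO entity_tree (id, name, disambiguation_id) VALUES
    (1, 'John Smith', 1),
    (2, 'Unique Name', NULL);
""")

# A LEFT JOIN keeps entities without a disambiguation; their comment is None.
rows = conn.execute("""
    SELECT t.name, d.comment
    FROM entity_tree AS t
    LEFT JOIN disambiguation AS d ON d.id = t.disambiguation_id
    ORDER BY t.id
""").fetchall()
```

The join is the lookup overhead raised later in the discussion; storing the text inline on the data row would avoid it at the cost of repeating the string across revisions.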
2015-03-06 06552, 2015
Leftmost
I'm reluctant to commit to storing an annotations table forever, as something about it makes me sad inside.
2015-03-06 06506, 2015
Leftmost
Does that seem reasonable?
2015-03-06 06528, 2015
LordSputnik
I still don't see the benefit in storing disambiguations inline, I'm afraid
2015-03-06 06547, 2015
LordSputnik
I don't think we gain from it, and we increase the amount of memory needed
2015-03-06 06526, 2015
Leftmost
I see it as a rehash of the names table in MB, which proved to be a pain to maintain and turned out to be hard to define semantically. I think we should be storing names inline and I don't think that disambig is materially different from a name as it relates to storage.
2015-03-06 06555, 2015
Gentlecat has left the channel
2015-03-06 06505, 2015
Leftmost
We're committing to a certain amount of denormalization with the data table as it stands.
2015-03-06 06520, 2015
LordSputnik
How so?
2015-03-06 06549, 2015
LordSputnik
I thought things were reasonably normalized
2015-03-06 06502, 2015
Leftmost
Multiple data tables can contain substantially similar data referring to the same entity with only minor changes, and that's intentional.
2015-03-06 06534, 2015
LordSputnik
Well, it's not redundant
2015-03-06 06500, 2015
LordSputnik
The information is in the combinations of aliases/annotation/disambiguation/data
2015-03-06 06552, 2015
LordSputnik
There's no way of storing that which is more efficient than foreign keys to separate tables, and no way we can reduce the duplication without losing the information
2015-03-06 06532, 2015
LordSputnik
ocharles_: kuno: ping, if you're around to give some database-fu
2015-03-06 06510, 2015
Leftmost
LordSputnik, it also creates a separate join to a generic table which can have an arbitrarily large set of data, which has caused its own set of problems with edits on MB.
2015-03-06 06541, 2015
Leftmost
The table will never be as large as the edits table, but it carries a performance cost and stores entity-specific information in a generic table.
2015-03-06 06514, 2015
LordSputnik
The EntityData table?
2015-03-06 06527, 2015
Leftmost
No, the disambiguation table.
2015-03-06 06555, 2015
kuno
graph databases!
2015-03-06 06501, 2015
Leftmost
We don't store end dates in a separate table referenced by ID because end dates are small and it creates overhead to do a lookup.
2015-03-06 06511, 2015
Leftmost
Disambig is larger but still relatively small, should still have an upper limit on size, and would also create lookup overhead.
2015-03-06 06537, 2015
Leftmost
Storing per-entity information in a generic table seems worse to me than small duplications of data.
2015-03-06 06538, 2015
LordSputnik
Oh, so you're suggesting we merge the Disambiguation and EntityData tables
2015-03-06 06542, 2015
Leftmost
Yes.
2015-03-06 06500, 2015
LordSputnik
That makes more sense
2015-03-06 06510, 2015
LordSputnik
I thought you meant the Disambiguation and EntityTree tables
2015-03-06 06541, 2015
Leftmost
Oh, sorry. I think I was being really unclear about that.
2015-03-06 06552, 2015
LordSputnik
To be honest, I'm not 100% sure we need to separate EntityTree and EntityData, but I wanted to hear from ocharles_ before I thought any more about that
2015-03-06 06523, 2015
LordSputnik
Since EntityTree is just an entity-type independent way of storing pointers to other tables, and there's no reason EntityData couldn't do that
2015-03-06 06556, 2015
Leftmost
Here's what I'm thinking, high-level, in terms of data structures:
2015-03-06 06556, 2015
LordSputnik
So, I'd suggest we keep that as it is for now, until I've discussed that with ocharles_
2015-03-06 06502, 2015
LordSputnik
Oh go on
2015-03-06 06506, 2015
Leo_Verto
oshit
2015-03-06 06506, 2015
Leftmost
An Entity is a structure with an ID, a type, a rooted tree, and a node pointer associated with it. Each node of the rooted tree is a revision, which contains an edit note, a date, a pointer to a parent node, and a pointer to the data which makes up that revision. The node pointer on the entity points to the current master revision node.
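[Editor's sketch] One possible reading of the structure Leftmost describes, written as plain Python classes. The class, field, and method names are assumptions made for illustration and are not taken from bbschema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional


@dataclass(eq=False)
class Revision:
    """One node of an entity's rooted revision tree."""
    note: str                    # edit note
    created: datetime            # date of the revision
    parent: Optional[Revision]   # None only for the root node
    data: Any                    # the data that makes up this revision


@dataclass(eq=False)
class Entity:
    """An ID, a type, a revision tree, and a pointer to the master node."""
    entity_id: int
    entity_type: str
    revisions: list[Revision] = field(default_factory=list)
    master: Optional[Revision] = None

    def add_revision(self, note: str, data: Any,
                     parent: Optional[Revision] = None) -> Revision:
        """Attach a new node under `parent` without touching master."""
        rev = Revision(note, datetime.now(), parent, data)
        self.revisions.append(rev)
        return rev

    def history(self) -> list[Revision]:
        """Walk parent pointers from master back to the root."""
        out, node = [], self.master
        while node is not None:
            out.append(node)
            node = node.parent
        return out


# Two parallel revisions branch from the same root; master picks one of them.
book = Entity(1, "publication")
root = book.add_revision("create", {"name": "Dune"})
fix = book.add_revision("fix name", {"name": "Dune (novel)"}, parent=root)
year = book.add_revision("add year", {"name": "Dune", "year": 1965}, parent=root)
book.master = fix
```

With no parallel edits every node has a single child, so `history()` is simply the full list of revisions in reverse order.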
2015-03-06 06540, 2015
Leo_Verto
restarting the entire network right now
2015-03-06 06551, 2015
LordSputnik
Leo_Verto: yeah I've seen :P
2015-03-06 06516, 2015
LordSputnik
Leftmost: let me draw that
2015-03-06 06535, 2015
mb-chat-logger joined the channel
2015-03-06 06557, 2015
LordSputnik
Leftmost: ok, where are aliases?
2015-03-06 06538, 2015
Leftmost
Not sure offhand. That was a very high-level picture and I'm still not used to the idea of putting aliases in a separate table.
2015-03-06 06507, 2015
LordSputnik
Ok, so, it doesn't group revisions together at all
2015-03-06 06517, 2015
Leftmost
No, only by parent.
2015-03-06 06532, 2015
LordSputnik
But if we're not voting I guess we don't really need edits to group things?
2015-03-06 06550, 2015
Leftmost
Application of an edit is accomplished by moving the master revision pointer.
2015-03-06 06521, 2015
LordSputnik
Oh, I mean, in the current schema, edits are groups of revisions which get applied together
2015-03-06 06533, 2015
Leftmost
Right, I meant in my schema.
2015-03-06 06516, 2015
LordSputnik
Ok, I think this would work, but we still have the issue of storing different data for different entity types
2015-03-06 06556, 2015
Leftmost
Not necessarily. Our entity structure can store type without using a typed tree.
Each EntityData structure is associated with only a single Entity, so everything up to that point can be generic.
2015-03-06 06543, 2015
LordSputnik
Well, what would the EntityData for each revision contain?
2015-03-06 06512, 2015
Leftmost
Any type-agnostic revision information and a pointer to a type-specific data struct.
2015-03-06 06545, 2015
LordSputnik
So, like an EntityTree is now?
2015-03-06 06502, 2015
Leftmost
Yes, similar.
2015-03-06 06559, 2015
LordSputnik
So, the main difference is in the organisation of revisions, then
2015-03-06 06525, 2015
LordSputnik
(and also presumably storing annotations and disambiguations inline in the entity-type agnostic data)
2015-03-06 06514, 2015
Leftmost
Yeah.
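[Editor's sketch] A rough picture of an EntityData that holds the type-agnostic fields inline and points at a type-specific struct. All class and field names here are assumptions for illustration; only the idea of inline disambiguation/annotation plus a pointer to typed data comes from the discussion.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class CreatorData:
    """Hypothetical type-specific data for a creator."""
    begin_date: Optional[str] = None
    end_date: Optional[str] = None


@dataclass(frozen=True)
class PublisherData:
    """Hypothetical type-specific data for a publisher."""
    begin_date: Optional[str] = None


@dataclass(frozen=True)
class EntityData:
    """Type-agnostic revision data plus a pointer to the typed part."""
    specific: Union[CreatorData, PublisherData]
    disambiguation: Optional[str] = None   # stored inline, may be absent
    annotation: Optional[str] = None       # stored inline, may be absent


def describe(data: EntityData) -> str:
    """Generic code can work on EntityData without knowing the entity type."""
    kind = type(data.specific).__name__
    suffix = f" ({data.disambiguation})" if data.disambiguation else ""
    return f"{kind}{suffix}"


creator = EntityData(CreatorData(end_date="1986"), disambiguation="novelist")
publisher = EntityData(PublisherData())
```

Everything above the `specific` pointer stays generic, which matches the point that the tree itself does not need to be typed.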
2015-03-06 06557, 2015
LordSputnik
Right, I'll think about that
2015-03-06 06512, 2015
LordSputnik
Let's go onto the final few topics
2015-03-06 06519, 2015
LordSputnik
SSH
2015-03-06 06538, 2015
Leo_Verto
yeah
2015-03-06 06519, 2015
LordSputnik
Leftmost: in order to set you up on the bb.org sandbox, I need the public SSH keys for the PCs you want to connect from (there's no password authentication afaik)
2015-03-06 06543, 2015
LordSputnik
So I guess the best way to do that would be in an email
2015-03-06 06510, 2015
LordSputnik
Ok, next is Search (who wanted to talk about that?)
2015-03-06 06542, 2015
Leftmost
I did.
2015-03-06 06540, 2015
LordSputnik
btw Leo_Verto: if you wanted SSH access to the bookbrainz server, we can probably sort that out too (although I should probably check with ruaok first)
2015-03-06 06541, 2015
Leftmost
I've looked into it and it seems that the choice between solr and elasticsearch largely comes down to preference. I've got some experience with elasticsearch. It's dead easy to use and dead easy to get running. Any objection to moving forward with an ES search implementation?
2015-03-06 06502, 2015
LordSputnik
No objection here
2015-03-06 06519, 2015
Leo_Verto
probably not essential but could be useful in certain situations
2015-03-06 06551, 2015
LordSputnik
Ok, I'll ask him next Monday, when he's not so busy (hopefully)
2015-03-06 06540, 2015
LordSputnik
Leftmost: either way I'd have to learn one or the other, and if it's easy to set up, we can probably switch between them without too much fuss if we really need to
2015-03-06 06506, 2015
Leftmost
LordSputnik, public key sent.
2015-03-06 06525, 2015
Leo_Verto
*decrypting transmission*
2015-03-06 06534, 2015
LordSputnik
Leftmost: ok, I'll see what I can do when I have a proper internet connection again
2015-03-06 06557, 2015
LordSputnik
Finally, Setup (also you, Leftmost?) :)
2015-03-06 06527, 2015
Leftmost
Yep.
2015-03-06 06553, 2015
Leftmost
Just a general note that we should work on making setup easier. I don't know if I mentioned that yesterday.
2015-03-06 06507, 2015
LordSputnik
nope
2015-03-06 06522, 2015
LordSputnik
What bits particularly?
2015-03-06 06538, 2015
LordSputnik
I thought it was fairly straightforward (especially compared to MB)
2015-03-06 06557, 2015
Leo_Verto
mhm
2015-03-06 06502, 2015
Leftmost
It wasn't difficult or confusing, but I think in general it could be smoother.
2015-03-06 06513, 2015
Leo_Verto
the frontend setup is pretty well documented by now
2015-03-06 06555, 2015
LordSputnik
so, currently, you have to install bbschema, install postgres and redis, then clone bbws
2015-03-06 06510, 2015
LordSputnik
Install dependencies with pip, then launch the ws
2015-03-06 06525, 2015
Leo_Verto
oh yeah, installing redis is entirely undocumented
2015-03-06 06535, 2015
Leftmost
The READMEs for each component should include a short setup howto and any system reqs, I'd say, and -schema or -ws should be able to set up the database with only a user/password to its name.
2015-03-06 06538, 2015
LordSputnik
Clone the site, npm install the dependencies, compile the javascript, then launch the site
2015-03-06 06515, 2015
LordSputnik
Ok, currently database setup is done in a separate script in bbschema
2015-03-06 06518, 2015
Leftmost
I got it set up without difficulty, just things that may be worth keeping in mind.
2015-03-06 06540, 2015
LordSputnik
we could move that somewhere else, maybe, or try to integrate it with the ws config file
2015-03-06 06547, 2015
Leo_Verto
if we get the config system done, we could provide configs for having a whole local setup or just working on the site using the bb.org ws
2015-03-06 06556, 2015
LordSputnik
(or have a separate bbschema config file containing the database settings?)
2015-03-06 06557, 2015
Leftmost
It probably belongs in -ws if -schema is just a lib for interacting purely with the schema instead of making database calls.
2015-03-06 06510, 2015
LordSputnik
Well, that's the thing...
2015-03-06 06519, 2015
LordSputnik
I've just started moving some of the editing logic into schema
2015-03-06 06527, 2015
LordSputnik
There's a blurry line between them
2015-03-06 06555, 2015
Leftmost
Hmm. That may be worth some thought, then.
2015-03-06 06513, 2015
LordSputnik
Ok, anything else for discussion today?
2015-03-06 06538, 2015
Leftmost
Not that I can think of. Do you wanna think over the -ws/-schema split and talk about it next meeting, maybe?
2015-03-06 06519, 2015
LordSputnik
Ok, should we aim for 9:30 next time?
2015-03-06 06521, 2015
Leo_Verto
I can do the publisher/edition forms provided the ws endpoint exists
2015-03-06 06540, 2015
Leftmost
Sounds good to me. Same bat time (plus ten minutes), same bat channel?
2015-03-06 06543, 2015
LordSputnik
Leo_Verto: yes, it does - same endpoint as the creator/publisher ones
2015-03-06 06552, 2015
Leo_Verto
unless we want to hold that off until after the new call model
2015-03-06 06558, 2015
LordSputnik
Leftmost: haha yeah
2015-03-06 06559, 2015
Leo_Verto
ah good
2015-03-06 06520, 2015
LordSputnik
Leftmost: do you still have some time now?
2015-03-06 06524, 2015
Leftmost
I do.
2015-03-06 06549, 2015
LordSputnik
Ok, I think we can resolve the differences between your new proposed schema and our current one
2015-03-06 06552, 2015
Leftmost
Leo_Verto, up to you. It may be next meeting before I even have PRs up for discussion on frontend stuff, and new changes should be relatively easy to integrate.
2015-03-06 06535, 2015
LordSputnik
So, your proposal has revisions organised in a tree for each entity, which is optimal for parallel revisions being merged, right?
2015-03-06 06513, 2015
Leo_Verto
oh, one more thing
2015-03-06 06533, 2015
Leo_Verto
I want to start working on better accessibility of the site
2015-03-06 06556, 2015
Leo_Verto
stuff like adding title texts for icons
2015-03-06 06511, 2015
LordSputnik
Go for it, that's always welcome
2015-03-06 06528, 2015
Leftmost
LordSputnik, right.
2015-03-06 06545, 2015
LordSputnik
labels for screenreaders are also helpful, but I've been leaving them out up until now :(
2015-03-06 06507, 2015
Leftmost
Leo_Verto, please do. I even have an accessibility consultant if you want feedback.
2015-03-06 06509, 2015
Leo_Verto
is there a proper way to implement those labels?
2015-03-06 06512, 2015
Leftmost
(My father is blind.)
2015-03-06 06525, 2015
LordSputnik
Leftmost: is better support for merging parallel work the only reason for using a tree rather than a list of revisions?
2015-03-06 06528, 2015
Leftmost
There is. I can get you more information by tomorrow if you want.
2015-03-06 06548, 2015
Leftmost
LordSputnik, off the top of my head, yeah.
2015-03-06 06550, 2015
LordSputnik
Leo_Verto: there's something like an "aria-*" attribute - it's documented in bootstrap, I think
There's also form labelling stuff baked into HTML, I believe.
2015-03-06 06544, 2015
Leftmost
Okay.
2015-03-06 06558, 2015
LordSputnik
Leftmost: ok, so which one we use should depend on whether we need that level of detail in the revision history
2015-03-06 06511, 2015
Leo_Verto
oh and I also want to add a nice button to the editor page to send a message, the problem here is vertically centering the button
2015-03-06 06517, 2015
Leftmost
LordSputnik, I think it's a useful concept. It's a very simple data structure, it's list-like if there's no parallel editing, and it makes it easier for multiple changes to be made and merged.
2015-03-06 06538, 2015
Leftmost
It's also very adaptable in terms of how we choose to display this to the end user.
2015-03-06 06503, 2015
LordSputnik
Ok
2015-03-06 06538, 2015
Leftmost
I've been doing a lot of data structures reading lately, so my mind may still be there. :)
2015-03-06 06505, 2015
LordSputnik
Second thing is: do we really want edits/merge requests *and* revisions?
2015-03-06 06553, 2015
Leftmost
I don't see the need myself.
2015-03-06 06509, 2015
LordSputnik
Assuming the primary way of peer review is through verifying data, rather than reviewing edits, I don't think so either
2015-03-06 06553, 2015
LordSputnik
So, we have a tree of revisions, no edits, and I'd assume that revisions can be reverted easily
2015-03-06 06543, 2015
Leftmost
Yep. A revert would basically just be a new revision with an old dataset.
2015-03-06 06507, 2015
LordSputnik
Leftmost: wouldn't it just be setting the master_revision further down the tree?
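[Editor's sketch] The two ways of reverting raised here can be compared side by side. This is a minimal sketch with hypothetical names, assuming that applying an edit means adding a child of the current master and moving the master pointer to it.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(eq=False)
class Revision:
    data: Any
    parent: Optional["Revision"] = None


@dataclass(eq=False)
class Entity:
    master: Optional[Revision] = None

    def apply(self, data: Any) -> Revision:
        """Apply an edit: add a child of master and move the pointer to it."""
        self.master = Revision(data, parent=self.master)
        return self.master

    def revert_by_new_revision(self, target: Revision) -> Revision:
        """Revert as a new revision that reuses the old dataset."""
        return self.apply(target.data)

    def revert_by_moving_pointer(self, target: Revision) -> Revision:
        """Revert by pointing master at an existing, older node."""
        self.master = target
        return self.master


def depth(rev: Optional[Revision]) -> int:
    """Number of nodes from `rev` back to the root."""
    n = 0
    while rev is not None:
        n += 1
        rev = rev.parent
    return n


# Option 1: the revert shows up as a third node in master's history.
a = Entity()
a1 = a.apply({"name": "Dune"})
a2 = a.apply({"name": "Dnue"})
a3 = a.revert_by_new_revision(a1)

# Option 2: master moves back; the bad revision remains as an off-master node.
b = Entity()
b1 = b.apply({"name": "Dune"})
b2 = b.apply({"name": "Dnue"})
b.revert_by_moving_pointer(b1)
```

In the first option the revert itself is recorded in the line of history leading to master; in the second the history from master no longer shows that the bad revision was ever applied, although the node still exists in the tree.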