#bookbrainz-devel

22:27 PM
LordSputnik

No, I don't either, I'd like to have something more like Wikipedia's talk pages, but that's more complex, so I've left it out for now

2015-03-06 06531, 2015

22:27 PM
Leftmost

I'm still a little concerned that an annotations table would balloon, and I think we should treat disambig like other data since it should be smallish.

2015-03-06 06500, 2015

22:28 PM
LordSputnik

Storing them separately should reduce the amount of space they use, rather than increase it

2015-03-06 06517, 2015

22:28 PM
Leftmost

Yeah.

2015-03-06 06556, 2015

22:28 PM
LordSputnik

The size of those tables will almost certainly be smaller than the size of the Entity table, since most entities won't have them

2015-03-06 06505, 2015

22:29 PM
Leftmost

My current line of thinking is: annotation as ID, disambig as inline data, and encourage people to think of annotations as a stopgap for storing information our schema can't yet incorporate and that may be lost down the road if we move to a better system.

2015-03-06 06510, 2015

22:29 PM
LordSputnik

(and the ids are set to null in the EntityTree if there's no disambig/annotation)

2015-03-06 06552, 2015

22:29 PM
Leftmost

I'm reluctant to commit to storing an annotations table forever, as something about it makes me sad inside.

2015-03-06 06506, 2015

22:30 PM
Leftmost

Does that seem reasonable?

2015-03-06 06528, 2015

22:30 PM
LordSputnik

I still don't see the benefit in storing disambiguations inline, I'm afraid

2015-03-06 06547, 2015

22:30 PM
LordSputnik

I don't think we gain from it, and we increase the amount of memory needed

2015-03-06 06526, 2015

22:32 PM
Leftmost

I see it as a rehash of the names table in MB, which proved to be a pain to maintain and turned out to be hard to define semantically. I think we should be storing names inline and I don't think that disambig is materially different from a name as it relates to storage.

2015-03-06 06555, 2015

22:33 PM
Gentlecat has left the channel

2015-03-06 06505, 2015

22:34 PM
Leftmost

We're committing to a certain amount of denormalization with the data table as it stands.

2015-03-06 06520, 2015

22:34 PM
LordSputnik

How so?

2015-03-06 06549, 2015

22:34 PM
LordSputnik

I thought things were reasonably normalized

2015-03-06 06502, 2015

22:35 PM
Leftmost

Multiple data tables can contain substantially similar data referring to the same entity with only minor changes, and that's intentional.

2015-03-06 06534, 2015

22:36 PM
LordSputnik

Well, it's not redundant

2015-03-06 06500, 2015

22:37 PM
LordSputnik

The information is in the combinations of aliases/annotation/disambiguation/data

2015-03-06 06552, 2015

22:37 PM
LordSputnik

There's no way of storing that which is more efficient than foreign keys to separate tables, and no way we can reduce the duplication without losing the information

2015-03-06 06532, 2015

22:38 PM
LordSputnik

ocharles_: kuno: ping, if you're around to give some database-fu

2015-03-06 06510, 2015

22:39 PM
Leftmost

LordSputnik, it also creates a separate join to a generic table which can have an arbitrarily large set of data, which has caused its own set of problems with edits on MB.

2015-03-06 06541, 2015

22:39 PM
Leftmost

The table will never be as large as the edits table, but it carries a performance cost and stores entity-specific information in a generic table.

2015-03-06 06514, 2015

22:40 PM
LordSputnik

The EntityData table?

2015-03-06 06527, 2015

22:40 PM
Leftmost

No, the disambiguation table.

2015-03-06 06555, 2015

22:40 PM
kuno

graph databases!

2015-03-06 06501, 2015

22:41 PM
Leftmost

We don't store end dates in a separate table referenced by ID because end dates are small and it creates overhead to do a lookup.

2015-03-06 06511, 2015

22:42 PM
Leftmost

Disambig is larger but still relatively small, should still have an upper limit on size, and would also create lookup overhead.

2015-03-06 06537, 2015

22:42 PM
Leftmost

Storing per-entity information in a generic table seems worse to me than small duplications of data.

2015-03-06 06538, 2015

22:42 PM
LordSputnik

Oh, so you're suggesting we merge the Disambiguation and EntityData tables

2015-03-06 06542, 2015

22:42 PM
Leftmost

Yes.

2015-03-06 06500, 2015
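
The two storage layouts being compared here can be sketched roughly as follows. This is only an illustration using an in-memory SQLite database; the table and column names (`disambiguation`, `entity_data_a`, `entity_data_b`, `comment`) and the sample rows are assumptions made for the example, not the actual BookBrainz schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Layout A (hypothetical): disambiguation lives in a separate generic
    -- table and the data row references it by ID (NULL when absent).
    CREATE TABLE disambiguation (
        id INTEGER PRIMARY KEY,
        comment TEXT NOT NULL
    );
    CREATE TABLE entity_data_a (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        disambiguation_id INTEGER REFERENCES disambiguation(id)
    );

    -- Layout B (hypothetical): disambiguation is stored inline on the
    -- data row, the same way an end date would be.
    CREATE TABLE entity_data_b (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        disambiguation TEXT
    );
""")

conn.execute("INSERT INTO disambiguation (id, comment) VALUES (1, 'British novelist')")
conn.executemany(
    "INSERT INTO entity_data_a (id, name, disambiguation_id) VALUES (?, ?, ?)",
    [(1, "John Smith", 1), (2, "Jane Doe", None)],
)
conn.executemany(
    "INSERT INTO entity_data_b (id, name, disambiguation) VALUES (?, ?, ?)",
    [(1, "John Smith", "British novelist"), (2, "Jane Doe", None)],
)


def fetch_separate(db):
    """Layout A needs a join to resolve the disambiguation text."""
    return db.execute(
        "SELECT d.name, x.comment FROM entity_data_a AS d "
        "LEFT JOIN disambiguation AS x ON x.id = d.disambiguation_id "
        "ORDER BY d.id"
    ).fetchall()


def fetch_inline(db):
    """Layout B reads the disambiguation straight off the data row."""
    return db.execute(
        "SELECT name, disambiguation FROM entity_data_b ORDER BY id"
    ).fetchall()
```

Both functions return the same rows; the difference is the extra lookup in layout A versus the possible duplication of short strings across data rows in layout B.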

22:43 PM
LordSputnik

That makes more sense

2015-03-06 06510, 2015

22:43 PM
LordSputnik

I thought you meant the Disambiguation and EntityTree tables

2015-03-06 06541, 2015

22:43 PM
Leftmost

Oh, sorry. I think I was being really unclear about that.

2015-03-06 06552, 2015

22:43 PM
LordSputnik

To be honest, I'm not 100% sure we need to separate EntityTree and EntityData, but I wanted to hear from ocharles_ before I thought any more about that

2015-03-06 06523, 2015

22:44 PM
LordSputnik

Since EntityTree is just an entity-type independent way of storing pointers to other tables, and there's no reason EntityData couldn't do that

2015-03-06 06556, 2015

22:47 PM
Leftmost

Here's what I'm thinking, high-level, in terms of data structures:

2015-03-06 06556, 2015

22:47 PM
LordSputnik

So, I'd suggest we keep that as it is for now, until I've discussed that with ocharles_

2015-03-06 06502, 2015

22:48 PM
LordSputnik

Oh go on

2015-03-06 06506, 2015

22:51 PM
Leo_Verto

oshit

2015-03-06 06506, 2015

22:51 PM
Leftmost

An Entity is a structure with an ID, a type, a rooted tree, and a node pointer associated with it. Each node of the rooted tree is a revision, which contains an edit note, a date, a pointer to a parent node, and a pointer to the data which makes up that revision. The node pointer on the entity points to the current master revision node.

2015-03-06 06540, 2015
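
A minimal sketch of the structure described above, written as plain Python dataclasses. The class and field names (`Revision`, `Entity`, `root`, `master`) and the sample values are assumptions chosen for illustration; the log does not say how this would be mapped to tables.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, List, Optional


@dataclass
class Revision:
    """One node of an entity's rooted revision tree."""
    note: str                      # edit note
    created: datetime              # date of the revision
    parent: Optional["Revision"]   # None only for the root node
    data: Any                      # pointer to the data making up this revision


@dataclass
class Entity:
    id: int
    type: str                          # entity type, stored on the entity itself
    root: Optional[Revision] = None    # root of the revision tree
    master: Optional[Revision] = None  # current master revision node

    def history(self) -> List[Revision]:
        """Walk parent pointers from master back to the root, newest first."""
        nodes = []
        node = self.master
        while node is not None:
            nodes.append(node)
            node = node.parent
        return nodes


# Hypothetical sample data: two revisions in a straight line.
now = datetime.now(timezone.utc)
first = Revision("create", now, None, {"name": "Example"})
second = Revision("fix name", now, first, {"name": "Example Press"})
entity = Entity(id=1, type="publisher", root=first, master=second)
```

Because each node only knows its parent, several revisions can share one parent, which is what makes the structure a tree rather than a list.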

22:51 PM
Leo_Verto

restarting the entire network right now

2015-03-06 06551, 2015

22:51 PM
LordSputnik

Leo_Verto: yeah I've seen :P

2015-03-06 06516, 2015

22:52 PM
LordSputnik

Leftmost: let me draw that

2015-03-06 06535, 2015

22:54 PM
mb-chat-logger joined the channel

2015-03-06 06557, 2015

22:54 PM
LordSputnik

Leftmost: ok, where are aliases?

2015-03-06 06538, 2015

22:55 PM
Leftmost

Not sure offhand. That was a very high-level picture and I'm still not used to the idea of putting aliases in a separate table.

2015-03-06 06507, 2015

22:56 PM
LordSputnik

Ok, so, it doesn't group revisions together at all

2015-03-06 06517, 2015

22:56 PM
Leftmost

No, only by parent.

2015-03-06 06532, 2015

22:56 PM
LordSputnik

But if we're not voting I guess we don't really need edits to group things?

2015-03-06 06550, 2015

22:56 PM
Leftmost

Application of an edit is accomplished by moving the master revision pointer.

2015-03-06 06521, 2015

22:57 PM
LordSputnik

Oh, I mean, in the current schema, edits are groups of revisions which get applied together

2015-03-06 06533, 2015

22:57 PM
Leftmost

Right, I meant in my schema.

2015-03-06 06516, 2015

22:58 PM
LordSputnik

Ok, I think this would work, but we still have the issue of storing different data for different entity types

2015-03-06 06556, 2015

22:58 PM
Leftmost

Not necessarily. Our entity structure can store type without using a typed tree.

2015-03-06 06515, 2015

22:59 PM
Leo_Verto

bb.org back up

2015-03-06 06529, 2015

23:00 PM
Leftmost

Each EntityData structure is associated with only a single Entity, so everything up to that point can be generic.

2015-03-06 06543, 2015

23:00 PM
LordSputnik

Well, what would the EntityData for each revision contain?

2015-03-06 06512, 2015

23:02 PM
Leftmost

Any type-agnostic revision information and a pointer to a type-specific data struct.

2015-03-06 06545, 2015

23:02 PM
LordSputnik

So, like an EntityTree is now?

2015-03-06 06502, 2015

23:03 PM
Leftmost

Yes, similar.

2015-03-06 06559, 2015

23:04 PM
LordSputnik

So, the main difference is in the organisation of revisions, then

2015-03-06 06525, 2015

23:05 PM
LordSputnik

(and also presumably storing annotations and disambiguations inline in the entity-type agnostic data)

2015-03-06 06514, 2015

23:06 PM
Leftmost

Yeah.

2015-03-06 06557, 2015
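
One way to picture the EntityData being discussed: a type-agnostic record holding the inline disambiguation and annotation plus a pointer to a type-specific struct. The sketch below assumes two hypothetical type-specific structs (`CreatorData`, `PublisherData`) with made-up fields; none of these names come from the actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class CreatorData:
    """Hypothetical type-specific data for a creator."""
    begin_date: Optional[str] = None
    end_date: Optional[str] = None


@dataclass
class PublisherData:
    """Hypothetical type-specific data for a publisher."""
    founded: Optional[str] = None


@dataclass
class EntityData:
    """Type-agnostic part of a revision's data."""
    disambiguation: Optional[str]                # stored inline, None if absent
    annotation: Optional[str]                    # stored inline, None if absent
    specific: Union[CreatorData, PublisherData]  # pointer to type-specific data


def describe(data: EntityData) -> str:
    """Generic code can work on EntityData without knowing the entity type."""
    kind = type(data.specific).__name__
    suffix = f" ({data.disambiguation})" if data.disambiguation else ""
    return f"{kind}{suffix}"
```

Everything up to `specific` is shared by all entity types, so revision handling can stay generic and only the final pointer is typed.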

23:06 PM
LordSputnik

Right, I'll think about that

2015-03-06 06512, 2015

23:07 PM
LordSputnik

Let's go onto the final few topics

2015-03-06 06519, 2015

23:07 PM
LordSputnik

SSH

2015-03-06 06538, 2015

23:07 PM
Leo_Verto

yeah

2015-03-06 06519, 2015

23:08 PM
LordSputnik

Leftmost: in order to set you up on the bb.org sandbox, I need the public SSH keys for the PCs you want to connect from (there's no password authentication afaik)

2015-03-06 06543, 2015

23:09 PM
LordSputnik

So I guess the best way to do that would be in an email

2015-03-06 06510, 2015

23:11 PM
LordSputnik

Ok, next is Search (who wanted to talk about that?)

2015-03-06 06542, 2015

23:11 PM
Leftmost

I did.

2015-03-06 06540, 2015

23:12 PM
LordSputnik

btw Leo_Verto: if you wanted SSH access to the bookbrainz server, we can probably sort that out too (although I should probably check with ruaok first)

2015-03-06 06541, 2015

23:12 PM
Leftmost

I've looked into it and it seems that choosing between Solr and Elasticsearch largely comes down to preference. I've got some experience with Elasticsearch. It's dead easy to use and dead easy to get running. Any objection to moving forward with an ES search implementation?

2015-03-06 06502, 2015

23:13 PM
LordSputnik

No objection here

2015-03-06 06519, 2015

23:13 PM
Leo_Verto

probably not essential but could be useful in certain situations

2015-03-06 06551, 2015

23:13 PM
LordSputnik

Ok, I'll ask him next Monday, when he's not so busy (hopefully)

2015-03-06 06540, 2015

23:14 PM
LordSputnik

Leftmost: either way I'd have to learn one or the other, and if it's easy to set up, we can probably switch between them without too much fuss if we really need to

2015-03-06 06506, 2015

23:15 PM
Leftmost

LordSputnik, public key sent.

2015-03-06 06525, 2015

23:15 PM
Leo_Verto

*decrypting transmission*

2015-03-06 06534, 2015

23:15 PM
LordSputnik

Leftmost: ok, I'll see what I can do when I have a proper internet connection again

2015-03-06 06557, 2015

23:15 PM
LordSputnik

Finally, Setup (also you, Leftmost?) :)

2015-03-06 06527, 2015

23:16 PM
Leftmost

Yep.

2015-03-06 06553, 2015

23:16 PM
Leftmost

Just a general note that we should work on making setup easier. I don't know if I mentioned that yesterday.

2015-03-06 06507, 2015

23:17 PM
LordSputnik

nope

2015-03-06 06522, 2015

23:17 PM
LordSputnik

What bits particularly?

2015-03-06 06538, 2015

23:17 PM
LordSputnik

I thought it was fairly straightforward (especially compared to MB)

2015-03-06 06557, 2015

23:17 PM
Leo_Verto

mhm

2015-03-06 06502, 2015

23:18 PM
Leftmost

It wasn't difficult or confusing, but I think in general it could be smoother.

2015-03-06 06513, 2015

23:18 PM
Leo_Verto

the frontend setup is pretty well documented by now

2015-03-06 06555, 2015

23:19 PM
LordSputnik

so, currently, you have to install bbschema, install postgres and redis, then clone bbws

2015-03-06 06510, 2015

23:20 PM
LordSputnik

Install dependencies with pip, then launch the ws

2015-03-06 06525, 2015

23:20 PM
Leo_Verto

oh yeah, installing redis is entirely undocumented

2015-03-06 06535, 2015

23:20 PM
Leftmost

The READMEs for each component should include a short setup howto and any system reqs, I'd say, and -schema or -ws should be able to set up the database with only a user/password to its name.

2015-03-06 06538, 2015

23:20 PM
LordSputnik

Clone the site, npm install the dependencies, compile the javascript, then launch the site

2015-03-06 06515, 2015

23:21 PM
LordSputnik

Ok, currently database setup is done in a separate script in bbschema

2015-03-06 06518, 2015

23:21 PM
Leftmost

I got it set up without difficulty, just things that may be worth keeping in mind.

2015-03-06 06540, 2015

23:21 PM
LordSputnik

we could move that somewhere else, maybe, or try to integrate it with the ws config file

2015-03-06 06547, 2015

23:21 PM
Leo_Verto

if we get the config system done, we could provide configs for having a whole local setup or just working on the site using the bb.org ws

2015-03-06 06556, 2015

23:21 PM
LordSputnik

(or have a separate bbschema config file containing the database settings?)

2015-03-06 06557, 2015

23:22 PM
Leftmost

It probably belongs in -ws if -schema is just a lib for interacting purely with the schema instead of making database calls.

2015-03-06 06510, 2015

23:23 PM
LordSputnik

Well, that's the thing...

2015-03-06 06519, 2015

23:23 PM
LordSputnik

I've just started moving some of the editing logic into schema

2015-03-06 06527, 2015

23:23 PM
LordSputnik

There's a blurry line between them

2015-03-06 06555, 2015

23:23 PM
Leftmost

Hmm. That may be worth some thought, then.

2015-03-06 06513, 2015

23:24 PM
LordSputnik

Ok, anything else for discussion today?

2015-03-06 06538, 2015

23:24 PM
Leftmost

Not that I can think of. Do you wanna think over the -ws/-schema split and talk about it next meeting, maybe?

2015-03-06 06519, 2015

23:25 PM
LordSputnik

Ok, should we aim for 9:30 next time?

2015-03-06 06521, 2015

23:25 PM
Leo_Verto

I can do the publisher/edition forms provided the ws endpoint exists

2015-03-06 06540, 2015

23:25 PM
Leftmost

Sounds good to me. Same bat time (plus ten minutes), same bat channel?

2015-03-06 06543, 2015

23:25 PM
LordSputnik

Leo_Verto: yes, it does - same endpoint as the creator/publisher ones

2015-03-06 06552, 2015

23:25 PM
Leo_Verto

unless we want to hold that off until after the new call model

2015-03-06 06558, 2015

23:25 PM
LordSputnik

Leftmost: haha yeah

2015-03-06 06559, 2015

23:25 PM
Leo_Verto

ah good

2015-03-06 06520, 2015

23:26 PM
LordSputnik

Leftmost: do you still have some time now?

2015-03-06 06524, 2015

23:26 PM
Leftmost

I do.

2015-03-06 06549, 2015

23:26 PM
LordSputnik

Ok, I think we can resolve the differences between your new proposed schema and our current one

2015-03-06 06552, 2015

23:26 PM
Leftmost

Leo_Verto, up to you. It may be next meeting before I even have PRs up for discussion on frontend stuff, and new changes should be relatively easy to integrate.

2015-03-06 06535, 2015

23:27 PM
LordSputnik

So, your proposal has revisions organised in a tree for each entity, which is optimal for parallel revisions being merged, right?

2015-03-06 06513, 2015

23:28 PM
Leo_Verto

oh, one more thing

2015-03-06 06533, 2015

23:28 PM
Leo_Verto

I want to start working on better accessibility of the site

2015-03-06 06556, 2015

23:28 PM
Leo_Verto

stuff like adding title texts for icons

2015-03-06 06511, 2015

23:29 PM
LordSputnik

Go for it, that's always welcome

2015-03-06 06528, 2015

23:29 PM
Leftmost

LordSputnik, right.

2015-03-06 06545, 2015

23:29 PM
LordSputnik

labels for screenreaders are also helpful, but I've been leaving them out up until now :(

2015-03-06 06507, 2015

23:30 PM
Leftmost

Leo_Verto, please do. I even have an accessibility consultant if you want feedback.

2015-03-06 06509, 2015

23:30 PM
Leo_Verto

is there a proper way to implement those labels?

2015-03-06 06512, 2015

23:30 PM
Leftmost

(My father is blind.)

2015-03-06 06525, 2015

23:30 PM
LordSputnik

Leftmost: is better support for merging parallel work the only reason for using a tree rather than a list of revisions?

2015-03-06 06528, 2015

23:30 PM
Leftmost

There is. I can get you more information by tomorrow if you want.

2015-03-06 06548, 2015

23:30 PM
Leftmost

LordSputnik, off the top of my head, yeah.

2015-03-06 06550, 2015

23:30 PM
LordSputnik

Leo_Verto: there's something like an "aria-*" attribute - it's documented in bootstrap, I think

2015-03-06 06504, 2015

23:31 PM
Leo_Verto

Leftmost, you should join our mailing list https://groups.io/org/groupsio/bookbrainz-develop…

2015-03-06 06516, 2015

23:31 PM
Leftmost

There's also form labelling stuff baked into HTML, I believe.

2015-03-06 06544, 2015

23:31 PM
Leftmost

Okay.

2015-03-06 06558, 2015

23:31 PM
LordSputnik

Leftmost: ok, so which one we use should depend on whether we need that level of detail in the revision history

2015-03-06 06511, 2015

23:33 PM
Leo_Verto

oh and I also want to add a nice button to the editor page to send a message, the problem here is vertically centering the button

2015-03-06 06517, 2015

23:33 PM
Leftmost

LordSputnik, I think it's a useful concept. It's a very simple data structure, it's list-like if there's no parallel editing, and it makes it easier for multiple changes to be made and merged.

2015-03-06 06538, 2015

23:33 PM
Leftmost

It's also very adaptable in terms of how we choose to display this to the end user.

2015-03-06 06503, 2015

23:34 PM
LordSputnik

Ok

2015-03-06 06538, 2015

23:34 PM
Leftmost

I've been doing a lot of data structures reading lately, so my mind may still be there. :)

2015-03-06 06505, 2015

23:35 PM
LordSputnik

Second thing is: do we really want edits/merge requests *and* revisions?

2015-03-06 06553, 2015

23:35 PM
Leftmost

I don't see the need myself.

2015-03-06 06509, 2015

23:36 PM
LordSputnik

Assuming the primary way of peer review is through verifying data, rather than reviewing edits, I don't think so either

2015-03-06 06553, 2015

23:37 PM
LordSputnik

So, we have a tree of revisions, no edits, and I'd assume that revisions can be reverted easily

2015-03-06 06543, 2015

23:38 PM
Leftmost

Yep. A revert would basically just be a new revision with an old dataset.

2015-03-06 06507, 2015

23:39 PM
LordSputnik

Leftmost: wouldn't it just be setting the master_revision further down the tree?

2015-03-06 06513, 2015

23:39 PM
LordSputnik

OH

2015-03-06 06522, 2015

23:39 PM
LordSputnik

* Oh, no I guess you'd want a note too

2015-03-06 06546, 2015

23:39 PM
LordSputnik

rather than silently changing history
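
The revert behaviour agreed on here can be sketched as below: a revert is a new revision that reuses an older revision's data and carries its own note, so the history keeps a record of what happened. The names `commit` and `revert` and the sample data are hypothetical, and the classes are a simplified stand-in for the proposed structure.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Revision:
    note: str                      # edit note explaining the change
    parent: Optional["Revision"]   # previous revision, None for the first one
    data: Any                      # data making up this revision


@dataclass
class Entity:
    master: Optional[Revision] = None  # current master revision node


def commit(entity: Entity, note: str, data: Any) -> Revision:
    """Add a revision on top of master and move the master pointer to it."""
    revision = Revision(note=note, parent=entity.master, data=data)
    entity.master = revision
    return revision


def revert(entity: Entity, target: Revision, note: str) -> Revision:
    """Revert by committing a new revision that reuses the target's data."""
    return commit(entity, note, target.data)


# Hypothetical sample history: a good revision, a bad one, then a revert.
entity = Entity()
good = commit(entity, "create", {"name": "Example"})
bad = commit(entity, "vandalism", {"name": "xxx"})
undo = revert(entity, good, "revert vandalism")
```

The bad revision stays in the history as the parent of the revert, which is the difference from simply pointing master back at an older node.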