let me read it carefully, I'm actually reading now!
2016-07-04 18627, 2016
hellska_
alastairp: yes, I think there's enough to start working on! I think it's worth making a specific plan for this part; as gentlecat suggested in the midterm feedback, having a good plan is a very good thing :)
2016-07-04 18659, 2016
Gentlecat
we already had it iirc though
2016-07-04 18630, 2016
Gentlecat
and you started working on implementing non-mbid submissions
2016-07-04 18645, 2016
hellska_
yes, I was thinking that I need a plan to integrate MessyBrainz (which I haven't checked yet!)
2016-07-04 18654, 2016
hellska_
so basically the idea is to call MessyBrainz to generate the IDs while the rest of the project remains the same right?!
2016-07-04 18610, 2016
Gentlecat
alastairp: ^
2016-07-04 18626, 2016
Gentlecat
(I need to read it too, didn't have a chance yet)
2016-07-04 18655, 2016
alastairp
right, I didn't originally suggest integrating MessyBrainz because I thought we could start basic, just generating our own IDs
2016-07-04 18652, 2016
alastairp
but after talking with ruaok he convinced me that adding MessyBrainz in now is a good idea - especially because I think it's a good idea to make these identifiers stable as soon as we can
2016-07-04 18625, 2016
hellska_
alastairp: agreed!
2016-07-04 18630, 2016
alastairp
I mean to say, if we generate our own ids now, then in a few months add messybrainz support in, we will have a handful of ids which are neither mbids nor messybrainz ids
2016-07-04 18612, 2016
alastairp
we could use our enum `gid_type` column to show this, but then we'd have to support 3 id types in acousticbrainz
2016-07-04 18630, 2016
Gentlecat
it shouldn't be hard to make them messybrainz ids though, right?
2016-07-04 18641, 2016
alastairp
we could manually import them into messybrainz
2016-07-04 18648, 2016
Gentlecat
right, that's what I mean
2016-07-04 18650, 2016
alastairp
but I don't want to do that
2016-07-04 18602, 2016
alastairp
e.g., what if this combination of artist/title has already been added to messybrainz?
2016-07-04 18615, 2016
Gentlecat
might be easier to just implement sending a query to messybrainz now
2016-07-04 18619, 2016
alastairp
right
2016-07-04 18625, 2016
alastairp
it's 1 http call :)
2016-07-04 18635, 2016
alastairp
zas: do we have an https certificate for messybrainz.org?
2016-07-04 18637, 2016
alastairp
on babar
2016-07-04 18629, 2016
hellska_
gentlecat: alastairp: I really have to look into messybrainz, guys! But if I understand correctly, we will have 3 types: MBIDs, messybrainz IDs, and dataset item IDs (generated in messybrainz)
2016-07-04 18603, 2016
alastairp
nope, the last 2 are the same thing
2016-07-04 18615, 2016
alastairp
we'll have 2 types, mbid, messyid
2016-07-04 18638, 2016
hellska_
ok, so what's the third type you mentioned?
2016-07-04 18651, 2016
alastairp
oh, I meant a hypothetical situation
2016-07-04 18618, 2016
alastairp
where we would have mbids, our custom dataset ids, and then messyids when we add support for messybrainz
2016-07-04 18637, 2016
alastairp
but we should add support for messybrainz now, and the second format (custom dataset ids) will stop existing
2016-07-04 18600, 2016
alastairp
messybrainz has an api which takes {"artist": "artistname", "title": "tracktitle"} [and some extra metadata if you have it] and returns you a messybrainz id
2016-07-04 18619, 2016
alastairp
if the same text artist name and track title exist in the database, you will get the same id back
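[Editor's sketch] The behaviour described above (the same artist/title text always maps back to the same ID) can be illustrated with a small in-memory stand-in. This is not the MessyBrainz API: `MessyIdRegistry` and `submit` are hypothetical names, and random UUIDs are used only as an assumed ID format for illustration.

```python
import uuid


class MessyIdRegistry:
    """In-memory stand-in for an ID service (hypothetical, not the MessyBrainz API)."""

    def __init__(self):
        # Maps the exact (artist, title) text to the ID already issued for it.
        self._ids = {}

    def submit(self, artist, title):
        """Return the ID for this artist/title text, issuing a new one if unseen."""
        key = (artist, title)
        if key not in self._ids:
            # Assumption: IDs are random UUID strings.
            self._ids[key] = str(uuid.uuid4())
        return self._ids[key]
```

Submitting the same artist and title twice returns the same ID, which is also why a later manual import could collide with entries that already exist.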
2016-07-04 18638, 2016
hellska_
what about datasets without artist/title?!
2016-07-04 18602, 2016
hellska_
we should not accept these items?!
2016-07-04 18610, 2016
alastairp
correct
2016-07-04 18619, 2016
alastairp
we need to set a limit somewhere
2016-07-04 18630, 2016
alastairp
our limit should be that we need at least those two items
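[Editor's sketch] The limit discussed above (a dataset item needs at least an artist and a title) could be checked roughly like this. The item is assumed to be a plain dict, and `REQUIRED_FIELDS`, `missing_metadata` and `is_acceptable` are hypothetical names, not part of the AcousticBrainz code.

```python
# Assumption: these two keys are the minimum metadata for a dataset item.
REQUIRED_FIELDS = ("artist", "title")


def missing_metadata(item):
    """Return the required fields that are absent or blank in a dataset item."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(item.get(field) or "").strip()
    ]


def is_acceptable(item):
    """An item is acceptable only when no required field is missing."""
    return not missing_metadata(item)
```

An item that carries only a filename, as in the ballroom case mentioned later, would be rejected by this check.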
2016-07-04 18633, 2016
mildused joined the channel
2016-07-04 18637, 2016
hellska_
Ok! that's fine!
2016-07-04 18657, 2016
alastairp
this makes sense in the context of dataset building - we should know what is in our dataset!
2016-07-04 18607, 2016
hellska_
yes I know, I was just thinking of real datasets like the ballroom one, which has no artist and sometimes not even a title, but uses the filename :(
2016-07-04 18618, 2016
alastairp
hrm
2016-07-04 18634, 2016
hellska_
anyway, we can set the standard behaviour and then try to find a way to integrate more complex situations
2016-07-04 18651, 2016
alastairp
I guess that makes sense, you don't need to know who it's by, just what the style is
2016-07-04 18607, 2016
Gentlecat
I think that might be going too far
2016-07-04 18645, 2016
Gentlecat
how many datasets without artist and title are there? and how useful are they actually going to be?
2016-07-04 18611, 2016
hellska_
good point!
2016-07-04 18641, 2016
Gentlecat
this seems to be getting a bit too complex for the project we have here
2016-07-04 18607, 2016
Gentlecat
and we aren't even half-done with what was planned
2016-07-04 18633, 2016
hellska_
yeah you're right! I'll make a plan as simple as possible and I'll share it with you so you can comment.
2016-07-04 18623, 2016
Gentlecat
maybe focus on implementing it too
2016-07-04 18624, 2016
hellska_
I'm just thinking in bullet points! Not a complex document ;)
2016-07-04 18658, 2016
hellska_
so then I can start working on the code! For the schema change I can already start changing the code!
2016-07-04 18603, 2016
blozo joined the channel
2016-07-04 18651, 2016
blozo has quit
2016-07-04 18651, 2016
Mineo joined the channel
2016-07-04 18652, 2016
yeeeargh joined the channel
2016-07-04 18618, 2016
Zialus has quit
2016-07-04 18621, 2016
Zialus joined the channel
2016-07-04 18623, 2016
kartikgupta0909 has quit
2016-07-04 18615, 2016
alastairp
me and dima and hellska_ just had a chat about this
2016-07-04 18632, 2016
alastairp
hellska_ will add a comment to the jira ticket about it
2016-07-04 18601, 2016
alastairp
but the general gist is that we'll only accept submissions of datasets with at least that metadata and use messybrainz
2016-07-04 18650, 2016
ruaok
Freso: dockerizing all of the services hosted at DWNI