most of which is starting with 2/3 which are fairly related
2014-01-08 00832, 2014
ianmcorvidae
I haven't quite figured out what amount of API needs to exist, but I think that geordi having access to MB's data that isn't through our crappy webservice is prerequisite for useful things like search by MBID
2014-01-08 00842, 2014
ianmcorvidae
which is sort of why I'm starting with 3
2014-01-08 00817, 2014
ianmcorvidae
but I'm sure you can see why it's feeling like a morass, if 2 and 1 in that thing depend on 3
2014-01-08 00828, 2014
ianmcorvidae
anyway, I think that if I ignore other things (other than shipping some patches periodically, since I think I'm the one who has the appropriate access for that) I can devote time to working on it
so, queryable, but that doesn't include displaying as anything but JSON -- does it need to render things, or can I put that off at first and just dump pretty-printed JSON on a page?
2014-01-08 00847, 2014
ruaok_
that still makes it easy for us to import new data without much hassle
2014-01-08 00801, 2014
ruaok_
PP json is fine.
2014-01-08 00821, 2014
ianmcorvidae
okay. that means I can ignore mapping things at first
2014-01-08 00843, 2014
ruaok_
sure.
2014-01-08 00856, 2014
ianmcorvidae
how queryable/matchable do you care about? is it important to have links between things in geordi and between geordi and MB, or are isolated documents okay?
2014-01-08 00830, 2014
ruaok_
so then a priority list includes: 1. new DB, 2. Improved (non-PP json) display 3. mappings 4. import. Does that sound sane?
2014-01-08 00856, 2014
ianmcorvidae
yes, but I'm wondering if we can't trim down/split up 1. a bit first
2014-01-08 00817, 2014
ianmcorvidae
and 2/3 on that are the same, you need to map something before you can display it as anything but raw JSON :)
2014-01-08 00826, 2014
ruaok_
I think linking inside geordi is less important than just being able to import a release.
2014-01-08 00834, 2014
reosarevok assumed "mappings" meant to MB
2014-01-08 00839, 2014
ianmcorvidae
heh, sorry
2014-01-08 00839, 2014
reosarevok
But maybe not :)
2014-01-08 00851, 2014
ianmcorvidae
mappings = from fields in geordi to fields in some sort of common structure used for display
2014-01-08 00856, 2014
ianmcorvidae
matchings = geordi to MB
2014-01-08 00858, 2014
ianmcorvidae
is the way I use the terms
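To make the two terms concrete, here is a minimal Python sketch of the distinction as described above: a mapping turns source-specific fields into a common structure used for display, while a matching records that a geordi item is the same thing as an MB entity. Every name in it (the field names, `EXAMPLE_MAPPING`, `apply_mapping`, `Matching`, the sample item and its all-zero MBID) is a hypothetical illustration, not geordi's actual schema or code.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Hypothetical mapping table: source field name -> common display field name.
EXAMPLE_MAPPING: Dict[str, str] = {
    "album_title": "title",
    "catalogue_no": "catalog_number",
    "label_name": "label",
}


def apply_mapping(raw: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Copy the mapped fields of a raw item into the common display structure.

    Source fields with no mapping entry are ignored; mapped fields missing
    from the raw item are skipped.
    """
    return {common: raw[source] for source, common in mapping.items() if source in raw}


@dataclass(frozen=True)
class Matching:
    """A statement that one geordi item is the same thing as one MB entity."""

    index: str       # which data set the item came from, e.g. "discogs"
    item_id: str     # id of the item within that index
    mbid: str        # MusicBrainz identifier of the matched entity
    automatic: bool  # False when a person made the match by hand


# Made-up sample data, for illustration only.
raw_item = {"album_title": "Example Album", "catalogue_no": "ZEN000", "extra": 1}
display = apply_mapping(raw_item, EXAMPLE_MAPPING)
match = Matching("ninjatune", "item-1", "00000000-0000-0000-0000-000000000000", automatic=False)
```

In this sketch the mapping is all that display or import would need, and the matching is a separate record that can be added later, which fits mappings being a prerequisite for both.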
2014-01-08 00805, 2014
ruaok_
ah
2014-01-08 00817, 2014
ianmcorvidae
I have an old document that explained that but I forget that others aren't as over-their-heads in this stuff as I am :)
2014-01-08 00818, 2014
rvedotrc joined the channel
2014-01-08 00828, 2014
ruaok_
db, mappings/display, matchings, import. that order?
2014-01-08 00843, 2014
ianmcorvidae
unless you think matchings should go even lower than import
2014-01-08 00852, 2014
reosarevok
IMO yes, since without matching you can't indicate you've imported
2014-01-08 00857, 2014
ianmcorvidae
(note that mappings are also prerequisite for import, so those are certainly second)
2014-01-08 00821, 2014
ruaok_
maybe I am not thinking of matchings in the right context.
2014-01-08 00840, 2014
ruaok_
as in automated matchings between a geordi data store and MB?
2014-01-08 00846, 2014
ruaok_
I think that ought to be last.
2014-01-08 00849, 2014
ianmcorvidae
matchings are marking that something in geordi is the same as something in MB
2014-01-08 00852, 2014
ianmcorvidae
whether manual or automatic
2014-01-08 00806, 2014
ruaok_
ok, then that should be near the bottom
2014-01-08 00809, 2014
ianmcorvidae
the big complication with those is that some matchings are only in geordi (wcd) and some are really derived from MB
2014-01-08 00814, 2014
nikki
what needs to change with the display?
2014-01-08 00814, 2014
reosarevok
So if I import something, I want to be able to tell geordi "this is here in MB"
2014-01-08 00821, 2014
reosarevok
Since that lets others not import it
2014-01-08 00830, 2014
reosarevok
(and not worry about it basically)
2014-01-08 00842, 2014
reosarevok
Without that, I'm likely to forget *myself* what I've added and what I haven't
2014-01-08 00844, 2014
ianmcorvidae
(where derived-from-MB is things like having a discogs URL in MB)
2014-01-08 00844, 2014
reosarevok
:/
2014-01-08 00801, 2014
ianmcorvidae
(or with ninjatune, it would presumably be something with the right label/catno)
2014-01-08 00815, 2014
ruaok_
would it be enough to keep a paper trail for now and then improve the history of what has been imported later?
2014-01-08 00842, 2014
ianmcorvidae
that would be writing manual matching stuff
2014-01-08 00848, 2014
ianmcorvidae
but ignoring the MB-side matches
2014-01-08 00854, 2014
reosarevok
That'd be fine for me
2014-01-08 00808, 2014
reosarevok
(what ian said, not just throwing it at a wiki :p)
2014-01-08 00840, 2014
ianmcorvidae
that's essentially what current geordi does, except I'd throw out the idea of automatic matches until we can do them right
2014-01-08 00858, 2014
ruaok_ nods at the automatic matches
2014-01-08 00842, 2014
reosarevok
Seems sensible
2014-01-08 00844, 2014
ianmcorvidae
my plan for newgeordi has actually always been to throw out the external automatic match thing anyway, and have any automatic process be part of geordi
2014-01-08 00854, 2014
ianmcorvidae
but the so-called "MB-side" matches are intermediate
2014-01-08 00801, 2014
ianmcorvidae
but they can still wait until later
2014-01-08 00822, 2014
ruaok_
not sure I quite grok the "mb-side" match stuff. what does that entail?
2014-01-08 00825, 2014
ruaok_
MB knowing about geordi?
2014-01-08 00830, 2014
ianmcorvidae
so: db, mappings, display of extracted info, basic manual matching, import of extracted info, reconvene
2014-01-08 00833, 2014
ianmcorvidae
no
2014-01-08 00836, 2014
ianmcorvidae
so
2014-01-08 00843, 2014
ianmcorvidae
say with discogs
2014-01-08 00807, 2014
ianmcorvidae
what a match in geordi from the discogs index is saying is, id XYZ in discogs is MBID ZYX in MB
2014-01-08 00825, 2014
ianmcorvidae
however, in MB we have a relationship that says MBID ZYX in MB is id XYZ in discogs (via a URL relationship)
2014-01-08 00843, 2014
ianmcorvidae
in current geordi we synchronize these by a really ridiculously hacky script that uses geordi's automatic matching system
2014-01-08 00859, 2014
ianmcorvidae
the better way to do it would be to have a replicated DB and just query them :P
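As a rough illustration of the "just query them" idea, the sketch below derives discogs-to-MBID matches from URL relationships instead of storing them separately. It assumes the replicated DB can return `(mbid, url)` pairs and that discogs release URLs contain `/release/<digits>`. The regular expression, the function names, and the row shape are all assumptions made for this example, not the real MB schema or geordi code.

```python
import re

# Assumed URL shape: http(s)://[www.]discogs.com/[optional/path/]release/<digits>
DISCOGS_RELEASE_URL = re.compile(
    r"^https?://(?:www\.)?discogs\.com/(?:[^/]+/)*release/(\d+)"
)


def discogs_release_id(url):
    """Return the discogs release id embedded in a URL, or None if there is none."""
    m = DISCOGS_RELEASE_URL.match(url)
    return m.group(1) if m else None


def matches_from_url_relationships(rows):
    """Build {discogs_release_id: {mbid, ...}} from (mbid, url) pairs.

    `rows` stands in for the result of a query against a replicated DB;
    URLs that are not discogs release URLs are ignored.
    """
    matches = {}
    for mbid, url in rows:
        release_id = discogs_release_id(url)
        if release_id is not None:
            matches.setdefault(release_id, set()).add(mbid)
    return matches
```

Keeping a set of MBIDs per discogs id is a deliberate choice in this sketch, since nothing guarantees that a given URL is attached to only one entity.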
2014-01-08 00809, 2014
reosarevok
mhmh
2014-01-08 00828, 2014
ianmcorvidae
since the ultimate goal of geordi is to get info into MB, the notion here is that it's better to store as many matchings (in the geordi sense) as relationships in MB as we can
2014-01-08 00842, 2014
ianmcorvidae
rather than storing them in geordi where they're not much use to anyone who isn't trying to import things from geordi
2014-01-08 00858, 2014
ruaok_
yeah, that should be considered during the reconvene.
2014-01-08 00810, 2014
ianmcorvidae
yeah.
2014-01-08 00820, 2014
ianmcorvidae
the other thing I'll do is ignore the wcd index at first, I think
2014-01-08 00826, 2014
ruaok_ nods at ianmcorvidae
2014-01-08 00838, 2014
ruaok_
agreed.
2014-01-08 00842, 2014
ianmcorvidae
since it has complications of embedded matches from the IA's matching process, and is weird data in general
2014-01-08 00850, 2014
ruaok_
personally I think ninjatune ought to be your first data set to work with.
2014-01-08 00854, 2014
ianmcorvidae
and, well, designing based on the wcd data is basically what got us where we are
2014-01-08 00804, 2014
ianmcorvidae
discogs. because it has concepts of more than one type of entity
2014-01-08 00806, 2014
ruaok_
oh, ouch.
2014-01-08 00818, 2014
reosarevok
ruaok_: ninjatune and discogs IMO (as in, the design should ensure it works fine for both kinds of data)
2014-01-08 00825, 2014
ianmcorvidae
I mean, the 'designing' (i.e. me) is more at fault than the data, but still :P
2014-01-08 00835, 2014
reosarevok
Label info and discogs are probably going to be the two main sources of data after all
2014-01-08 00835, 2014
ruaok_
reosarevok: yeah, agreed. let's make sure it works for both of those.
2014-01-08 00835, 2014
ianmcorvidae
I think both is the right answer anyway, yeah
2014-01-08 00850, 2014
ruaok_
but I say ninjatune because this is an industry relationship that we're lagging on.
2014-01-08 00815, 2014
ianmcorvidae
the lucky thing here is that ninjatune is super easy when we exclude the mb-side matches part :)
2014-01-08 00823, 2014
ianmcorvidae
which we have
2014-01-08 00839, 2014
reosarevok
:)
2014-01-08 00842, 2014
ruaok_
super easy? even with needing a new DB?
2014-01-08 00848, 2014
ruaok_
or are you talking about the overall matching process?
2014-01-08 00800, 2014
ruaok_
granted, there is not much data and we have large chunks of it.
2014-01-08 00805, 2014
ianmcorvidae
super easy in terms of if it works for discogs stuff it'll work for ninjatune
2014-01-08 00811, 2014
ruaok_
I see it more from a political perspective.
2014-01-08 00817, 2014
ruaok_
got it.
2014-01-08 00828, 2014
ianmcorvidae
where if we were doing the mb-side matches those would require ninjatune-specific implementation
2014-01-08 00853, 2014
ruaok_
ok, I think we have a rough idea where we want to go from here.
2014-01-08 00803, 2014
ruaok_
let's see how much of this stuff we can get done before chicago.