I should do those graphs, lossy stuff has been growing a lot more lately
2014-10-15 28848, 2014
kepstin-laptop
my lossless stuff hasn't finished yet
2014-10-15 28800, 2014
kepstin-laptop
this data set's gonna be a bit more weighted towards japanese pop than most, i think ;)
2014-10-15 28817, 2014
ianmcorvidae
haha
2014-10-15 28846, 2014
ianmcorvidae
probably got more estonian hip-hop than the average dataset just by my 6 CDs worth :P
2014-10-15 28836, 2014
kepstin-laptop
well, it can only improve the results, right?
2014-10-15 28800, 2014
ianmcorvidae
yup!
2014-10-15 28809, 2014
ianmcorvidae
I was thinking I should write a crappy recommender to kick us off
2014-10-15 28823, 2014
ianmcorvidae
something obviously terrible like levenshtein distance of the JSON :P
2014-10-15 28850, 2014
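[Editor's sketch] The "levenshtein distance of the JSON" idea can be written down in a few lines. This is only an illustration of the joke, not anything the project shipped; the function names `levenshtein`, `json_distance` and `recommend`, and the shape of the input documents, are all assumptions.

```python
import json


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def json_distance(doc_a: dict, doc_b: dict) -> int:
    """Edit distance between the serialized forms of two documents."""
    # sort_keys makes the serialization independent of key order.
    return levenshtein(json.dumps(doc_a, sort_keys=True),
                       json.dumps(doc_b, sort_keys=True))


def recommend(seed: dict, candidates: dict) -> str:
    """Return the name of the candidate whose JSON is closest to the seed."""
    return min(candidates, key=lambda name: json_distance(seed, candidates[name]))
```

As intended, this compares text rather than music, so two documents count as "similar" whenever their numbers happen to share digits.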
kepstin-laptop
hmm, something with just the low-level data? Could do something silly like just match bpm and key
2014-10-15 28830, 2014
kepstin-laptop
you like this song in C# major at 140bpm, so you'll obviously like this other one too!
2014-10-15 28829, 2014
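[Editor's sketch] The "match bpm and key" recommender described above might look roughly like this. The `Track` fields and the `bpm_tolerance` parameter are hypothetical; the real low-level data has its own field names, which are not shown in this log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    # Hypothetical fields, chosen only for this illustration.
    title: str
    bpm: float
    key: str    # e.g. "C#"
    scale: str  # e.g. "major"


def is_match(liked: Track, other: Track, bpm_tolerance: float = 2.0) -> bool:
    """Same key and scale, and tempo within a small tolerance."""
    return (liked.key == other.key
            and liked.scale == other.scale
            and abs(liked.bpm - other.bpm) <= bpm_tolerance)


def recommend(liked: Track, library: list, bpm_tolerance: float = 2.0) -> list:
    """Every other track in the library that matches the liked one."""
    return [t for t in library
            if t is not liked and is_match(liked, t, bpm_tolerance)]
```

A tolerance is used instead of exact equality because estimated tempos are rarely identical to the decimal.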
ianmcorvidae
that's far more sophisticated than I was thinking XD
2014-10-15 28856, 2014
ianmcorvidae
I mean, I'm really thinking in the vein of making a truly terrible recommender that anyone can do better than, because I want to goad them into doing so :P
2014-10-15 28802, 2014
CallerNo6
listeners who like songs with "satan" in the title will probably like other songs with "satan" in the title?
2014-10-15 28822, 2014
kepstin-laptop wonders if there's something really silly and easy you could do which would on average perform worse than random matching.
2014-10-15 28842, 2014
ianmcorvidae
hah
2014-10-15 28813, 2014
CallerNo6
I've been assured that nobody's smart enough to be wrong all the time. But it can't hurt to try?
2014-10-15 28858, 2014
kepstin-laptop
doesn't have to be all the time
2014-10-15 28803, 2014
kepstin-laptop
just on average :)
2014-10-15 28836, 2014
kepstin-laptop
(if you actually got it wrong all the time, you could presumably just flip your rating and get something actually useful)
alastairp: do you have a sec to talk about jesus christ your lord and saviour?
2014-10-15 28853, 2014
ruaok
er wait.
2014-10-15 28804, 2014
ruaok
how about the schema for the highlevel table? :)
2014-10-15 28819, 2014
alastairp
I can see how you might confuse them
2014-10-15 28824, 2014
ruaok
in particular I'm thinking of what version info we should track.
2014-10-15 28824, 2014
alastairp
they're both world-changing
2014-10-15 28831, 2014
ruaok
heh. :)
2014-10-15 28851, 2014
alastairp
are you at the lab, or will we do it here?
2014-10-15 28808, 2014
ruaok
here. mom is in town and I only have half days while aleta baby-sits mom.
2014-10-15 28820, 2014
ruaok wishes he was in the lab
2014-10-15 28831, 2014
alastairp
I don't know what features or algorithms high-level will be in the output
2014-10-15 28846, 2014
ruaok
yeah, that too.
2014-10-15 28813, 2014
ruaok
so, my inclination is to store: json, timestamp and essentia_git_sha
2014-10-15 28827, 2014
ruaok
since, I am thinking that only the AB server should ever calculate high level stuff.
2014-10-15 28839, 2014
ruaok
is that even a reasonable assumption?
2014-10-15 28842, 2014
alastairp
split per algorithm?
2014-10-15 28804, 2014
ruaok
ideally, but I just don't know if the essentia codebase is really ready for that.
2014-10-15 28817, 2014
ruaok
I think we may just need to start with one version and get a move on.
2014-10-15 28827, 2014
ruaok
the good thing is that we can re-calculate this at any time.
2014-10-15 28842, 2014
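[Editor's sketch] The proposed highlevel table (json, timestamp, essentia_git_sha) and the "we can re-calculate this at any time" point can be illustrated as below. This uses an in-memory SQLite database purely so the example runs on its own; the `id` and `mbid` columns, the column names `data` and `submitted`, and all function names are assumptions, not the schema that was actually written.

```python
import json
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE highlevel (
    id               INTEGER PRIMARY KEY,  -- assumed surrogate key
    mbid             TEXT NOT NULL,        -- assumed link to the recording
    data             TEXT NOT NULL,        -- the high-level JSON document
    submitted        TEXT NOT NULL,        -- when the row was calculated
    essentia_git_sha TEXT NOT NULL         -- build that produced the data
)
"""


def create_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def store_highlevel(conn, mbid: str, data: dict, essentia_git_sha: str) -> None:
    conn.execute(
        "INSERT INTO highlevel (mbid, data, submitted, essentia_git_sha) "
        "VALUES (?, ?, ?, ?)",
        (mbid, json.dumps(data),
         datetime.now(timezone.utc).isoformat(), essentia_git_sha),
    )


def mbids_from_other_builds(conn, current_sha: str) -> list:
    """Recordings with rows produced by a build other than current_sha."""
    rows = conn.execute(
        "SELECT DISTINCT mbid FROM highlevel "
        "WHERE essentia_git_sha != ? ORDER BY mbid",
        (current_sha,),
    )
    return [row[0] for row in rows]
```

Recording the git sha per row is what makes recalculation cheap to plan: after a new build, a single query lists the rows worth redoing.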
alastairp
right. that'd be a good start then
2014-10-15 28858, 2014
ruaok
ok, I'll get moving on that.
2014-10-15 28801, 2014
ruaok
any signs of dima?
2014-10-15 28821, 2014
alastairp
if there are many algorithms, there's no difference between 1 binary that spits out lots of bits of json, and many binaries that each spit out their own
2014-10-15 28827, 2014
alastairp
no, but he normally does afternoons, I think
2014-10-15 28800, 2014
alastairp
I'll try and grab him as soon as I can
2014-10-15 28852, 2014
ruaok returns from a mom interruption
2014-10-15 28800, 2014
alastairp
I have to put out some ssl fires on freesound first, but back to this asap
do you want to do anything about highlevel_json / raw_json table names?
2014-10-15 28807, 2014
ruaok
unsure.
2014-10-15 28824, 2014
ruaok
we are not likely to need the split and view as we do for the lowlevel stuff.
2014-10-15 28842, 2014
ruaok
first question is if ianmcorvidae intended for all the json to go into one table.
2014-10-15 28852, 2014
ruaok
my gut instinct says to use two tables.
2014-10-15 28800, 2014
ruaok
for scalability.
2014-10-15 28809, 2014
ruaok
and then deciding on the names.
2014-10-15 28809, 2014
alastairp
right
2014-10-15 28837, 2014
ruaok
but ianmcorvidae is sleeping, right now.
2014-10-15 28857, 2014
ruaok
but assuming you're ok with the columns in said tables, I'll press on for now.
2014-10-15 28806, 2014
ruaok
changing table names during the review phase is easy.
2014-10-15 28845, 2014
ruaok
combining tables less so, but I think having two tables is desirable.
2014-10-15 28800, 2014
ruaok
we're not losing anything having separate tables.
2014-10-15 28839, 2014
alastairp
yes, I think 2 is a good idea
2014-10-15 28841, 2014
alastairp
otherwise, fine
2014-10-15 28852, 2014
ruaok
ok, I'll keep moving then.
2014-10-15 28803, 2014
ruaok
not sure I can get a PR up for the high level stuff today, but I'll try.
2014-10-15 28820, 2014
ruaok
hm.
2014-10-15 28849, 2014
ruaok
I'll build no locking support into the highlevel stuff.
2014-10-15 28854, 2014
ruaok
I'm going to assume that there will be one master program that looks at the DB, determines which highlevel data needs to be calculated, fires off a thread that will then calculate the highlevel data.
2014-10-15 28811, 2014
ruaok
it then takes ending threads and stores the data into the DB.
2014-10-15 28837, 2014
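[Editor's sketch] The single-master design described above, where one program hands work to threads and alone writes the results, might be sketched as follows. `calculate_highlevel` is a stand-in, not Essentia, and `run_master`, `pending` and `store` are hypothetical names. Because only the master calls `store`, the storage side needs no locking, which matches the stated intent.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def calculate_highlevel(lowlevel: dict) -> dict:
    """Placeholder for the real high-level computation (assumption)."""
    return {"fast": lowlevel["bpm"] >= 120}


def run_master(pending: dict, store, max_workers: int = 4) -> None:
    """Fan work out to worker threads; write results from this thread only.

    pending maps a recording id to its low-level data.
    store(recording_id, result) is called only by the master thread.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(calculate_highlevel, lowlevel): recording_id
            for recording_id, lowlevel in pending.items()
        }
        # Collect each worker's result as it finishes and store it here,
        # in the master, so workers never touch the database.
        for future in as_completed(futures):
            store(futures[future], future.result())
```

For example, `run_master({"a": {"bpm": 140}}, results.__setitem__)` with `results = {}` leaves `results` holding one entry for `"a"`.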
Nyanko-sensei joined the channel
2014-10-15 28840, 2014
ardoRic
does the vm update the musicbrainz-server code automatically, or should I check it out again ?
2014-10-15 28805, 2014
ruaok
just do a git pull on it.
2014-10-15 28809, 2014
ruaok
it doesn't update automatically
2014-10-15 28841, 2014
KillDaBOB_ joined the channel
2014-10-15 28825, 2014
chirlu` joined the channel
2014-10-15 28806, 2014
KillDaBOB joined the channel
2014-10-15 28841, 2014
Nyanko-sensei joined the channel
2014-10-15 28827, 2014
ijabz1 joined the channel
2014-10-15 28849, 2014
kepstin-laptop
so, >100k recordings now :)
2014-10-15 28831, 2014
alastairp
this is great. 10% of our target in 5 days
2014-10-15 28824, 2014
alastairp
at this rate that'll be ~400k by the end of the month, so if we get more people running it in the coming week I think 500k or more is really doable
2014-10-15 28806, 2014
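[Editor's sketch] The estimate above is a straight-line projection. The arithmetic below assumes a constant submission rate and takes "today" to be 2014-10-15, the date on this log; `project_total` is a hypothetical helper.

```python
from datetime import date


def project_total(current_total: int, days_elapsed: int,
                  today: date, end: date) -> float:
    """Linear projection: keep the average daily rate until the end date."""
    rate_per_day = current_total / days_elapsed
    days_remaining = (end - today).days
    return current_total + rate_per_day * days_remaining


# 100k recordings in 5 days is 20k per day; 16 days remain in October.
print(project_total(100_000, 5, date(2014, 10, 15), date(2014, 10, 31)))  # 420000.0
```

That lands close to the "~400k" quoted above, and it only holds if the rate stays up, which is the concern raised in the following messages.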
kepstin-laptop
I've just about hit all the music I have now, though.
2014-10-15 28845, 2014
kepstin-laptop
keeping the rate up probably really requires getting more people to run the tool :)
2014-10-15 28855, 2014
alastairp
right, but the only reason we've not opened this up wider is that the tools still have problems
2014-10-15 28823, 2014
alastairp
rob is confident, and I agree with him, that we can dump this tool on 2-4x as many people immediately
2014-10-15 28842, 2014
alastairp
which will keep up our submission speed
2014-10-15 28815, 2014
kepstin-laptop has started to run it on the stuff he only has in lossy formats now
2014-10-15 28826, 2014
kepstin-laptop
(which is a bunch of touhou arranges, mostly)
2014-10-15 28813, 2014
Nyanko-sensei joined the channel
2014-10-15 28847, 2014
ruaok
in fact, I think we should start tapping people on the shoulders quietly and ask them to jump in.
2014-10-15 28802, 2014
alastairp
right
2014-10-15 28811, 2014
ruaok
we need to get derwin in on this.
2014-10-15 28814, 2014
nikki is still working on her stuff
2014-10-15 28845, 2014
nikki
although when I'll be able to actually run it on *all* of my music is another question
2014-10-15 28823, 2014
ijabz1
if we can get either an osx or windows version available soon, it will be a lot easier to get more users
2014-10-15 28825, 2014
nikki
(right now I can't do korean stuff, because apparently linux has a bug in its support for korean filenames on hfs filesystems)
2014-10-15 28841, 2014
JesseW joined the channel
2014-10-15 28841, 2014
ruaok
ijabz1: that is our goal for friday, if at all possible
2014-10-15 28820, 2014
ijabz1
great
2014-10-15 28818, 2014
jesus2099_ joined the channel
2014-10-15 28859, 2014
alastairp
i wish
2014-10-15 28830, 2014
LordSputnik
btw, have about 12k lossless tracks for scanning - are there instructions anywhere? :)