I'm interested in changing this so that we load 100 at a time or so, which should be much faster
2016-07-18 20058
alastairp
but we still need to make a decision on how to transmit the data
2016-07-18 20015
alastairp
ideally we should use the result of your investigation on data types
2016-07-18 20034
alastairp
did you finish writing a report of each of the types and their sizes/conversion times?
2016-07-18 20039
kartikgupta0909
I think for now we can continue with json; in future, if we change our minds, we can add an external deserialisation script.
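A minimal sketch of the pluggable deserialisation idea being floated here, with hypothetical names throughout: default to JSON, and let an external script register an alternative decoder later if the format ever changes.

```python
import json

# Registry of decoders; only JSON for now, as agreed in the discussion.
DESERIALIZERS = {
    "json": json.loads,
}

def deserialize(payload, fmt="json"):
    """Decode a payload fetched from the server using the named format."""
    try:
        decode = DESERIALIZERS[fmt]
    except KeyError:
        raise ValueError("no deserializer registered for %r" % fmt)
    return decode(payload)

# If the format changes later, an external deserialisation script only needs
# to register itself here, e.g. DESERIALIZERS["msgpack"] = msgpack.unpackb
```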
2016-07-18 20057
kartikgupta0909
I didn't write the report, I was waiting for your comment on the ticket.
2016-07-18 20041
kartikgupta0909
could you have a look at the data I have posted on the ticket and let me know if it's fine
2016-07-18 20048
kartikgupta0909
and I'll make the report
2016-07-18 20036
alastairp
did you see the comment that Ulrich made?
2016-07-18 20019
alastairp
other than that, what you've done looks good. You should start to write that up. It can just be a markdown file in the same repository that you were using for the code
2016-07-18 20027
alastairp
yes, we can continue using json for now anyway
2016-07-18 20020
alastairp
for me, the next question is how we move the data. We could have the client make a single request for all of the data, and the server loads it from the database, compresses it, and sends it back
but I think for a large dataset (say 1000 items) this will leave the http connection open for a long time
2016-07-18 20048
kartikgupta0909
yes, I would suggest that too
2016-07-18 20007
kartikgupta0909
yes it would
2016-07-18 20027
alastairp
then we need to think about what to do if the connection is broken, etc. how do we continue?
2016-07-18 20034
alastairp
My preference is actually to have a background task
2016-07-18 20053
alastairp
so the client could tell the server "I want this data" and the server can say "OK, come back soon and I'll have it ready"
2016-07-18 20054
kartikgupta0909
If the connection breaks then we will have to resend the entire data, if we use a single compressed archive
2016-07-18 20007
alastairp
then we can start a background task on the server to extract the data and create an archive
2016-07-18 20011
kartikgupta0909
but if we send in batches we could log and restart from where it stopped
2016-07-18 20025
kartikgupta0909
ah that's fine too
2016-07-18 20030
alastairp
the client can continue polling to see if the archive is ready, and when it is, download it then tell the server that it has it
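A sketch of the client side of this background-task flow, under stated assumptions: the endpoint paths, JSON fields, and polling interval are all hypothetical, not a real API.

```python
import time
import requests

BASE = "https://example.org/api"  # hypothetical server

def fetch_dataset_archive(dataset_id, out_path):
    # "I want this data": ask the server to start building an archive
    job = requests.post("%s/datasets/%s/archive" % (BASE, dataset_id)).json()

    # "come back soon": poll until the background task has finished
    while True:
        status = requests.get("%s/jobs/%s" % (BASE, job["job_id"])).json()
        if status["state"] == "done":
            break
        time.sleep(5)

    # download the archive, then tell the server we have it
    with requests.get(status["download_url"], stream=True) as r:
        r.raise_for_status()
        with open(out_path, "wb") as f:
            for chunk in r.iter_content(chunk_size=8192):
                f.write(chunk)
    requests.post("%s/jobs/%s/ack" % (BASE, job["job_id"]))
```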
2016-07-18 20038
alastairp
how do you think batches could work?
2016-07-18 20052
alastairp
something would need to keep state
2016-07-18 20006
kartikgupta0909
maybe 10 songs at a time
2016-07-18 20007
alastairp
that could work - if the client asks the server for a list of mbids
2016-07-18 20022
alastairp
and then the client sends a request to download 10 items (in your example)
2016-07-18 20033
alastairp
and we keep the state on the client
2016-07-18 20041
kartikgupta0909
yes I think it should. The first step should be to get the dataset info including the recording ids
2016-07-18 20056
kartikgupta0909
then get those recordings one by one or in batches
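A sketch of the batched alternative with the state kept on the client, as described above: fetch the dataset's recording MBIDs first, then download in small batches, recording progress so a broken connection can restart from where it stopped. The endpoints and the store_locally helper are hypothetical.

```python
import json
import os
import requests

BASE = "https://example.org/api"  # hypothetical server
BATCH_SIZE = 10                   # "maybe 10 songs at a time"

def store_locally(items):
    # hypothetical helper: persist each item, assuming a dict of mbid -> data
    for mbid, data in items.items():
        with open("%s.json" % mbid, "w") as f:
            json.dump(data, f)

def download_in_batches(dataset_id, state_file="progress.json"):
    # step 1: get the dataset info, including the recording ids
    info = requests.get("%s/datasets/%s" % (BASE, dataset_id)).json()
    mbids = info["recordings"]

    # the client keeps the state: which mbids have already been fetched
    done = set()
    if os.path.exists(state_file):
        with open(state_file) as f:
            done = set(json.load(f))

    todo = [m for m in mbids if m not in done]
    for i in range(0, len(todo), BATCH_SIZE):
        batch = todo[i:i + BATCH_SIZE]
        # step 2: download this batch of recordings
        resp = requests.get("%s/recordings" % BASE, params={"ids": ";".join(batch)})
        resp.raise_for_status()
        store_locally(resp.json())
        done.update(batch)
        with open(state_file, "w") as f:
            json.dump(sorted(done), f)  # resume point if the connection breaks
```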
2016-07-18 20009
Gentlecat
consider that you might need to evaluate the same or a very similar dataset multiple times
2016-07-18 20012
alastairp
cool. you can make a start on that then
2016-07-18 20026
Gentlecat
is there any point in actually loading the same data multiple times?
2016-07-18 20028
alastairp
hmm
2016-07-18 20040
kartikgupta0909
I get the point
2016-07-18 20002
kartikgupta0909
but in any case we will have to pass the data again and again until we store it on the user's local machine
2016-07-18 20005
alastairp
good point, but I'm not sure if it's optimising for the right thing
2016-07-18 20018
kartikgupta0909
but that would consume a lot of memory on the user's machine, which is not desirable
2016-07-18 20025
alastairp
so the client could keep a cache of items
2016-07-18 20031
alastairp
kartikgupta0909: do you mean disk space?
2016-07-18 20037
kartikgupta0909
yes
2016-07-18 20043
kartikgupta0909
if there are 1000 files
2016-07-18 20048
alastairp
I don't think that's a problem. especially if, as Gentlecat says, they are evaluating lots of things
2016-07-18 20051
alastairp
1000 files is small
2016-07-18 20053
kartikgupta0909
I guess it would take around 60 MB
2016-07-18 20002
alastairp
it's only going to be a problem once they have 100,000
2016-07-18 20013
alastairp
it's an interesting proposal
2016-07-18 20022
alastairp
it does make the client more complex, though
2016-07-18 20023
kartikgupta0909
they might have that too, although in music IR that's rarely the case
2016-07-18 20007
Gentlecat
just noting that by sending all data in one archive you are constraining this thing
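A sketch of the client-side cache idea from above: keep each downloaded item on disk, keyed by MBID, so evaluating the same or a similar dataset again does not re-transfer data the client already has. The cache location and file layout are assumptions; at roughly 60 KB per item, 1000 cached files would match the ~60 MB estimated above.

```python
import json
import os

CACHE_DIR = os.path.expanduser("~/.cache/ab-client")  # hypothetical location

def cache_path(mbid):
    return os.path.join(CACHE_DIR, "%s.json" % mbid)

def get_recording(mbid, fetch):
    """Return data for mbid from the disk cache, calling fetch(mbid) on a miss."""
    path = cache_path(mbid)
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    data = fetch(mbid)  # e.g. one item of a batched download
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(path, "w") as f:
        json.dump(data, f)
    return data
```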
2016-07-18 20000
Freso
People up for reviews tonight: Freso, ruaok, reosarevok, bitmap, Gentlecat, zas, LordSputnik, Leftmost, Leo_Verto, alastairp, CatQuest, rahulr, QuoraUK, armalcolite, hellska, kartikgupta0909 - let me know if you want on/off.
2016-07-18 20009
Freso
(Meeting in ~35 minutes.)
2016-07-18 20012
Gentlecat
Monday again!
2016-07-18 20029
Gentlecat
I need some kind of time slowdown device here
2016-07-18 20057
alastairp
kartikgupta0909: I already have a plan this week to look at loading this data in bulk during the dataset evaluation stage
2016-07-18 20007
reosarevok
same
2016-07-18 20021
mihaitish joined the channel
2016-07-18 20027
alastairp
so I will try and do that tomorrow and on Wednesday. I'm interested in seeing how long it takes to load 1000 items 1 at a time, 10 at a time, or 100 at a time
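A minimal sketch of that timing comparison, assuming a hypothetical fetch_batch function that retrieves one batch of items from the server:

```python
import time

def benchmark(mbids, fetch_batch, batch_sizes=(1, 10, 100)):
    # compare wall-clock time to load the same items in different batch sizes
    for size in batch_sizes:
        start = time.time()
        for i in range(0, len(mbids), size):
            fetch_batch(mbids[i:i + size])
        print("batch size %3d: %.1fs" % (size, time.time() - start))
```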
armalcolite: from what I understand, the workflow is like this:
2016-07-18 20012
alastairp
application generates a random token and directs the user to open a url containing api_key and token
2016-07-18 20021
kahu joined the channel
2016-07-18 20025
alastairp
user goes to the page and approves access
2016-07-18 20043
alastairp
client makes another query using this token (plus api key and signature) to retrieve a real access token
2016-07-18 20017
alastairp
client uses this real access token in all queries to the scrobble api. this token identifies the user (because the user was logged in when they gave access to the original token)
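A sketch of the token flow described above. The endpoints, parameter names, and signature scheme are assumptions in the style of Last.fm-like web auth, not a confirmed API; real services differ in details such as who issues the initial token.

```python
import hashlib
import uuid
import requests

API_ROOT = "https://example.org"  # hypothetical service
API_KEY = "..."                   # issued to the application
API_SECRET = "..."                # shared secret used for signing

def sign(params):
    # signature over the sorted parameters plus the shared secret
    base = "".join("%s%s" % (k, params[k]) for k in sorted(params)) + API_SECRET
    return hashlib.md5(base.encode("utf-8")).hexdigest()

def get_access_token():
    # 1. application generates a random token and directs the user to a URL
    #    containing api_key and token
    token = uuid.uuid4().hex
    print("Open and approve: %s/auth?api_key=%s&token=%s" % (API_ROOT, API_KEY, token))
    input("Press enter once access has been approved... ")

    # 2. exchange the approved token (plus api key and signature) for a real
    #    access token that identifies the user
    params = {"api_key": API_KEY, "token": token}
    params["signature"] = sign(params)
    resp = requests.get("%s/auth/session" % API_ROOT, params=params)
    resp.raise_for_status()
    return resp.json()["access_token"]  # used in all scrobble API queries
```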