[listenbrainz-server] mayhem opened pull request #1293 (add-mbid-mapping-to-labs-api…add-recording-search-to-labs-api): Add user facing recording search https://github.com/metabrainz/listenbrainz-serv...
ruaok
phew. these two PRs have been on my mind for 2 months now. finally out!
alastairp
!m ruaok
BrainzBot
You're doing good work, ruaok!
alastairp
ruaok: just as a heads-up, LB is currently in an undeployable state: yvanzo's last patch requires consul-template 0.18, which is coming in #1237. I hope that _lucifer and I can deploy this tomorrow
_lucifer
i am available currently also if we want to do it today :)
ruaok
ok, good to know. I don't think my PRs have a snowball's chance in texas to get approved before you two fix the LB codebase, so....
alastairp
_lucifer: I don't have any time today, but you can have a look if you want
ruaok
alastairp: is all the code ready to go, so I can help _lucifer with the release? I have some time left in my day.
alastairp
no, I don't think it's ready
ruaok
ok, then I'll leave you to it.
alastairp
I rebased 1237, but I didn't realise that your 1282 was a PR onto that branch, so now we've got major conflicts between the 2 PRs
ruaok
_lucifer: if you're idle for a moment, we can chat about the spark work I am proposing for next week
alastairp
it's going to take some git-fu to unwind it
_lucifer
ruaok: sure
ruaok
doh. I was really hoping to make it easier by making a branch off a branch.
not a good idea, or does it need more highlighting in the docs?
BTW: I've done that same thing again, with my two PRs for today.
alastairp
yeah, I marked my PR as WIP (at least in the title), because the initial changes that I did were just experiments, so I had planned to rebase it until I had a good solution
perhaps we could be clearer about this. I think it was right that you made your one base off of mine, but perhaps we should have merged it sooner
ruaok
let me leave a comment in the future to make this more clear.
_lucifer
if i understand correctly, we want to rebase 1282 over 1237?
ruaok personally avoids rebasing these days.
ruaok
merging master into my branch does mostly what I need.
but that doesn't answer your question.
:)
_lucifer: 1282 was supposed to be merged into 1237 before reviewing 1237 and then merging both into master at the same time.
the user similarity feature is what I think you and I should tackle next week
the core of the feature is explained in "The actual calculation of the similarities is a statistical problem. Each user can be represented as a vector where the nth element of the vector is the number of times the user has listened to the nth artist. Once each user is represented as a vector, Spark provides an API to calculate the Correlation matrix for a list of vectors where matrix[i][j] = the correlation between the ith and jth element."
and that is clearly the right approach for this problem.
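The quoted description can be illustrated with a minimal sketch. It uses plain Python rather than Spark, assumes Pearson correlation as the measure, and the listen counts and function names (`pearson`, `correlation_matrix`) are made up for illustration; it is not the code from the PRs discussed here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return cov / norm if norm else 0.0


def correlation_matrix(vectors):
    """matrix[i][j] is the correlation between the ith and jth user vectors."""
    return [[pearson(u, v) for v in vectors] for u in vectors]


# Hypothetical data: one row per user, one column per artist (listen counts).
listens = [
    [10, 0, 3, 0],  # user 0
    [20, 0, 6, 0],  # user 1: same artists as user 0, twice the volume
    [0, 5, 0, 8],   # user 2: listens only to the other artists
]

matrix = correlation_matrix(listens)
```

In this toy data, users 0 and 1 correlate strongly while user 2 correlates negatively with both, which is the kind of signal the feature would surface.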
_lucifer
lol
do we want to (A) allow users to see similarity with another user, or (B) just show similar users to them?
iliekcomputers
i would prefer A
ruaok
so that any user can see the similar user list for any other user?
_lucifer
yes
spark supports B out of the box
ruaok
we should allow that.
iliekcomputers
if i'm logged in and I go to ruaok's user page, i should see our "compatibility score"
ruaok
is that a function of spark or just showing the right data?
iliekcomputers
then we can show follower's compatibility and such along with most compatible users or something.
ruaok
do we need to calculate different data for the two approaches? I don't see that....
yvanzo
atj: No problem, can you please explain how you resolved the issue so we can add it to a troubleshooting section for other macOS users?
iliekcomputers
ruaok: if you're just showing the user other similar users, you can get away with just getting the similarity for the top X users.
but for A, you need the entire correlation matrix.
ruaok
ah, it's the same algorithm, but the number of results we keep differs. is that right?
_lucifer
yes, A would allow seeing the similarity between any two users but B would only show between the top X similar ones
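The difference between the two options comes down to how much of the matrix is kept. A rough sketch, assuming the similarities are already computed and held in a small in-memory matrix; the values and the helper names `top_similar` and `compatibility` are hypothetical:

```python
import heapq


def top_similar(matrix, user, x):
    """Option B: the x highest-scoring (other_user, score) pairs for `user`."""
    scores = [(other, s) for other, s in enumerate(matrix[user]) if other != user]
    return heapq.nlargest(x, scores, key=lambda pair: pair[1])


def compatibility(matrix, a, b):
    """Option A: needs the full matrix so that any pair can be looked up."""
    return matrix[a][b]


# Hypothetical, already-computed similarity matrix for four users.
matrix = [
    [1.0, 0.9, -0.2, 0.4],
    [0.9, 1.0, 0.1, 0.3],
    [-0.2, 0.1, 1.0, 0.7],
    [0.4, 0.3, 0.7, 1.0],
]

# Option B keeps only X entries per user, here X = 2.
top_2 = {user: top_similar(matrix, user, 2) for user in range(len(matrix))}
```

With only `top_2` stored, the score between users 0 and 2 is no longer available, whereas `compatibility(matrix, 0, 2)` can still answer it from the full matrix.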
ruaok
ok, that makes sense.
but, that matrix will be VERY sparse.
iliekcomputers
yeah.
ruaok
and 90% of the users will relate to each other poorly which is useless info.
which makes me think we should have a cut-off where we don't keep the users. it doesn't scale otherwise.
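One way to picture such a cut-off is to store only the pairs that score above a threshold. This is a sketch under assumptions: the threshold of 0.5, the sample matrix, and the names `pairs_above_cutoff` and `lookup` are illustrative, not values anyone in the channel proposed.

```python
def pairs_above_cutoff(matrix, cutoff):
    """Keep only user pairs (i < j) whose similarity is at least `cutoff`."""
    kept = {}
    for i, row in enumerate(matrix):
        for j in range(i + 1, len(row)):
            if row[j] >= cutoff:
                kept[(i, j)] = row[j]
    return kept


def lookup(kept, a, b):
    """Return the stored score for a pair, or None if it fell below the cut-off."""
    return kept.get((min(a, b), max(a, b)))


# Hypothetical similarity matrix for four users.
matrix = [
    [1.0, 0.9, -0.2, 0.4],
    [0.9, 1.0, 0.1, 0.3],
    [-0.2, 0.1, 1.0, 0.7],
    [0.4, 0.3, 0.7, 1.0],
]

kept = pairs_above_cutoff(matrix, cutoff=0.5)  # 0.5 is an arbitrary example value
```

Of the six distinct pairs in this example only two survive, so storage grows with the number of well-matched pairs rather than with the square of the user count.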
iliekcomputers
so it really only matters on how we want to show the data to users.
ruaok
exactly.
iliekcomputers
if it's just compatibility scores, i assume we can just do scales of 1 to 5 and get away with it
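A 1 to 5 scale could be derived from a raw similarity score roughly like this. The sketch assumes scores lie in [-1, 1], treats negative correlation as the lowest level, and uses equal-width buckets; all of these are assumptions, and `compatibility_level` is a hypothetical name.

```python
def compatibility_level(score, levels=5):
    """Map a similarity score in [-1, 1] to an integer level in 1..levels."""
    # Assumption: anything at or below zero counts as the lowest level.
    clamped = max(0.0, min(1.0, score))
    # Equal-width buckets; a score of exactly 1.0 is capped at the top level.
    return min(levels, int(clamped * levels) + 1)


for score in (-0.3, 0.0, 0.5, 0.9, 1.0):
    print(score, compatibility_level(score))
```

Storing only the level instead of the raw score would also make a cut-off simple to express, since the lowest level could just be dropped.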