[listenbrainz-server] mayhem opened pull request #1293 (add-mbid-mapping-to-labs-api…add-recording-search-to-labs-api): Add user facing recording search https://github.com/metabrainz/listenbrainz-server…
2021-02-24 05507, 2021
ruaok
phew. these two PRs have been on my mind for 2 months now. finally out!
2021-02-24 05559, 2021
alastairp
!m ruaok
2021-02-24 05559, 2021
BrainzBot
You're doing good work, ruaok!
2021-02-24 05500, 2021
alastairp
ruaok: just as a heads-up, LB is currently in an undeployable state. yvanzo's last patch requires consul-template 0.18, which is coming in #1237. I hope that _lucifer and I can deploy this tomorrow
2021-02-24 05541, 2021
_lucifer
i am available currently also if we want to do it today :)
2021-02-24 05542, 2021
ruaok
ok, good to know. I don't think my PRs have a snowball's chance in texas to get approved before you two fix the LB codebase, so....
2021-02-24 05506, 2021
alastairp
_lucifer: I don't have any time today, but you can have a look if you want
2021-02-24 05536, 2021
ruaok
alastairp: is all the code ready to go, so I can help _lucifer with the release? I have some time left in my day.
2021-02-24 05545, 2021
alastairp
no, I don't think it's ready
2021-02-24 05500, 2021
ruaok
ok, then I'll leave you to it.
2021-02-24 05518, 2021
alastairp
I rebased 1237, but I didn't realise that your 1282 was a PR onto that branch, so now we've got major conflicts between the 2 PRs
2021-02-24 05518, 2021
ruaok
_lucifer: if you're idle for a moment, we can chat about the spark work I am proposing for next week
2021-02-24 05527, 2021
alastairp
it's going to take some git-fu to unwind it
2021-02-24 05529, 2021
_lucifer
ruaok: sure
2021-02-24 05542, 2021
ruaok
doh. I was really hoping to make it easier by making a branch off a branch.
2021-02-24 05554, 2021
ruaok
not a good idea or does it need more highlighting in the docs.
2021-02-24 05505, 2021
ruaok
BTW: I've done that same thing again, with my two PRs for today.
2021-02-24 05535, 2021
ruaok
"not a good idea or does it need more highlighting in the docs." +?
2021-02-24 05547, 2021
alastairp
yeah, I marked my PR as WIP (at least in the title), because the initial changes that I did were just experiments, so I had planned to rebase it until I had a good solution
2021-02-24 05513, 2021
alastairp
perhaps we could be clearer about this. I think it was right that you made your one base off of mine, but perhaps we should have merged it sooner
2021-02-24 05540, 2021
ruaok
let me leave a comment in the future to make this more clear.
2021-02-24 05551, 2021
_lucifer
if i understand correctly, we want to rebase 1282 over 1237?
2021-02-24 05537, 2021
ruaok personally avoids rebasing these days.
2021-02-24 05550, 2021
ruaok
merge master into my branch does what I mostly need.
2021-02-24 05556, 2021
ruaok
but that doesn't answer your question.
2021-02-24 05558, 2021
ruaok
:)
2021-02-24 05501, 2021
ruaok
_lucifer: 1282 was supposed to be merged into 1237 before reviewing 1237 and then merging both into master at the same time.
the user similarity feature is what I think you and I should tackle next week
2021-02-24 05543, 2021
ruaok
the core of the feature is explained in "The actual calculation of the similarities is a statistical problem. Each user can be represented as a vector where the nth element of the vector is the number of times the user has listened to the nth artist. Once each user is represented as a vector, Spark provides an API to calculate the Correlation matrix for a list of vectors where matrix[i][j] = the correlation between the ith and jth element."
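For readers following along, the quoted idea can be sketched in a few lines. This is a minimal illustration in plain Python, not the Spark API mentioned in the quote; the user names, listen counts, and the helper names `pearson` and `correlation_matrix` are all hypothetical, and Pearson correlation is assumed as the correlation measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length listen-count vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant vector has no defined correlation; treat it as 0 here.
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_matrix(vectors):
    """matrix[i][j] is the correlation between the ith and jth user vectors."""
    return [[pearson(u, v) for v in vectors] for u in vectors]


# Hypothetical data: the nth element is the listen count for the nth artist.
listen_counts = {
    "user_a": [10, 0, 3, 0],
    "user_b": [20, 0, 6, 0],
    "user_c": [0, 5, 0, 8],
}
users = list(listen_counts)
matrix = correlation_matrix([listen_counts[u] for u in users])
```

In this made-up data, user_a and user_b listen to the same artists in the same proportions, so they correlate strongly, while user_c listens to entirely different artists and correlates negatively with both.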
2021-02-24 05557, 2021
ruaok
and that is clearly the right approach for this problem.
2021-02-24 05534, 2021
_lucifer
lol
2021-02-24 05536, 2021
_lucifer
do we want to allow users to see similarity with another user or just show similar users to them?
2021-02-24 05501, 2021
iliekcomputers
i would prefer A
2021-02-24 05522, 2021
ruaok
so that any user can see the similar user list for any other user?
2021-02-24 05531, 2021
_lucifer
yes
2021-02-24 05539, 2021
_lucifer
spark supports B out of the box
2021-02-24 05544, 2021
ruaok
we should allow that.
2021-02-24 05553, 2021
iliekcomputers
if i'm logged in and I go to ruaok's user page, i should see our "compatibility score"
2021-02-24 05556, 2021
ruaok
is that a function of spark or just showing the right data?
2021-02-24 05519, 2021
iliekcomputers
then we can show followers' compatibility and such along with most compatible users or something.
2021-02-24 05500, 2021
ruaok
do we need to calculate different data for the two approaches? I don't see that....
2021-02-24 05502, 2021
yvanzo
atj: No problem, can you please explain how you did resolve the issue so we can add it to a troubleshooting section for other macOS users?
2021-02-24 05546, 2021
iliekcomputers
ruaok: if you're just showing the user other similar users, you can get away with just getting the similarity for the top X users.
2021-02-24 05500, 2021
iliekcomputers
but for A, you need the entire correlation matrix.
2021-02-24 05546, 2021
ruaok
ah, it's the same algorithm, but the number of results we keep differs. is that right?
2021-02-24 05541, 2021
_lucifer
yes, A would allow seeing the similarity between any two users but B would only show between the top X similar ones
2021-02-24 05553, 2021
ruaok
ok, that makes sense.
2021-02-24 05504, 2021
ruaok
but, that matrix will be VERY sparse.
2021-02-24 05516, 2021
iliekcomputers
yeah.
2021-02-24 05522, 2021
ruaok
and 90% of the users will relate to each other poorly which is useless info.
2021-02-24 05505, 2021
ruaok
which makes me think we should have a cut-off where we don't keep the users. it doesn't scale otherwise.
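The cut-off and "top X" ideas discussed above could be combined roughly as follows. This is only a sketch under assumptions: the function name `top_similar_users`, the `top_x` and `cutoff` parameters, and the sample matrix values are hypothetical and do not come from the ListenBrainz codebase.

```python
def top_similar_users(users, matrix, top_x=2, cutoff=0.5):
    """Keep, per user, at most top_x other users whose similarity is >= cutoff."""
    result = {}
    for i, user in enumerate(users):
        candidates = [
            (other, matrix[i][j])
            for j, other in enumerate(users)
            if j != i and matrix[i][j] >= cutoff
        ]
        # Highest similarity first, then truncate to the top X entries.
        candidates.sort(key=lambda pair: pair[1], reverse=True)
        result[user] = candidates[:top_x]
    return result


# Hypothetical, made-up similarity matrix for four users.
users = ["a", "b", "c", "d"]
matrix = [
    [1.0, 0.9, 0.1, 0.6],
    [0.9, 1.0, 0.2, 0.4],
    [0.1, 0.2, 1.0, 0.05],
    [0.6, 0.4, 0.05, 1.0],
]
similar = top_similar_users(users, matrix, top_x=1, cutoff=0.5)
```

With these values, user "c" ends up with an empty list because nobody clears the cut-off, which is the kind of low-value pairing the cut-off is meant to drop before storage.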
2021-02-24 05517, 2021
iliekcomputers
so it really only matters on how we want to show the data to users.
2021-02-24 05525, 2021
ruaok
exactly.
2021-02-24 05538, 2021
iliekcomputers
if it's just compatibility scores, i assume we can just do scales of 1 to 5 and get away with it
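One possible reading of the "scales of 1 to 5" suggestion is to bucket a correlation value into five equal bands. The sketch below assumes that negative correlations are treated as the lowest score and that the bands are equal width; both choices, and the name `compatibility_score`, are hypothetical and not taken from the discussion.

```python
def compatibility_score(correlation, buckets=5):
    """Map a correlation in [-1, 1] to an integer score from 1 to buckets."""
    # Assumption: anything at or below zero counts as the lowest score.
    clamped = max(0.0, min(1.0, correlation))
    # Equal-width bands; a perfect 1.0 is capped at the top bucket.
    return min(buckets, int(clamped * buckets) + 1)


scores = [compatibility_score(c) for c in (-0.3, 0.1, 0.5, 0.95, 1.0)]
```

A coarse score like this only needs the stored similarity for the pair being displayed, so it works with either a full matrix or a pruned one, as long as missing pairs are given a default value.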