#metabrainz

      • alastairp
        oh, the other thing that popped up around this time was https://github.com/spotify/annoy
      • ruaok
        is the theory behind it solid enough?
      • alastairp
        which we didn't try but looks great
      • what do you mean solid?
      • does the similarity work? yes
      • ruaok
        that.
      • alastairp
        yeah, we had to filter out duplicate submissions
      • because it kept on saying that they had really high similarity
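The duplicate filtering mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code that was actually used: the `mbid` key and the list-of-dicts layout are assumptions made for the example.

```python
def dedupe_submissions(submissions):
    """Keep only the first submission seen for each recording.

    `submissions` is assumed to be a list of dicts with a hypothetical
    "mbid" key identifying the recording; the real schema may differ.
    """
    seen = set()
    unique = []
    for sub in submissions:
        if sub["mbid"] not in seen:
            seen.add(sub["mbid"])
            unique.append(sub)
    return unique
```

Running the similarity step on the deduplicated list avoids the case where two submissions of the same recording dominate each other's results.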
      • ruaok
        can we frame this in terms of "this was done, we solved the theoretical aspects, but now we need to scale it"?
      • alastairp
        absolutely
      • ruaok
        that sounds like a perfect gsoc project, no?
      • alastairp
        scalability is a potential issue
      • ruaok
        it is *the* issue, no?
      • alastairp
        we had no machine here which was fast enough to really get an idea about how difficult the issue was
      • we were running with about 25% of the database, and it was taking ages to do stuff (but our infrastructure isn't as powerful as the hetzner dedicated servers)
      • ruaok
        I suspect that we'll need to solve this using spark.
      • rsh7
        iliekcomputers: got time to rebase the integration branch?
      • alastairp
        right, so it depends on what our goals with this are, and how we want to query it
      • if we can wait a few seconds or tens of seconds, BQ or spark are ideal candidates
      • ruaok
        or we rent a stupidly big cloud instance and run it on that periodically.
      • my goal is to feed more data into training recommendation engines.
      • so batches are fine.
      • alastairp
        and it's possible that spark may even be better, because we're not constrained by cube's requirement that the distance metrics are linear
      • right, so, it also depends on what the usecase is
      • clustering? or selection of similarity from a single example instance?
      • ruaok
        this is the usecase I want to solve.
      • I'm unsure of how to get there, but I am sure that I want a track similarity mapping in the end
      • alastairp
        I'm not sure if it's responsible to select every item in the database and independently calculate its similarity with every other track
      • ruaok
        not feasible.
      • alastairp
        right
      • but if it's just clustering, then stuff like annoy or t-sne (https://lvdmaaten.github.io/tsne/) might be really nice
      • ruaok
        hence me suggesting some estimation function that we can use to reduce the number of comparisons we need to make
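Comparing every track with every other track means n(n-1)/2 comparisons, which is why an estimation step helps. One possible shape for such a step is random-hyperplane hashing, sketched below. This is not annoy's or t-SNE's implementation and not something the channel settled on; the function names, the feature vectors, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def make_hyperplanes(dim, n_planes, seed=0):
    """Random Gaussian hyperplanes through the origin."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]


def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes)


def candidate_pairs(features, planes):
    """Pair up only the tracks that share a signature; skip all other pairs."""
    buckets = defaultdict(list)
    for track_id, vec in features.items():
        buckets[signature(vec, planes)].append(track_id)
    pairs = set()
    for ids in buckets.values():
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                pairs.add((ids[i], ids[j]))
    return pairs


def similar_tracks(features, n_planes=4, seed=0):
    """Exact cosine similarity, computed for candidate pairs only."""
    dim = len(next(iter(features.values())))
    planes = make_hyperplanes(dim, n_planes, seed)
    return {
        (a, b): cosine(features[a], features[b])
        for a, b in candidate_pairs(features, planes)
    }
```

More hyperplanes give smaller buckets and fewer comparisons, at the cost of missing some similar pairs that land in different buckets; since batches are fine here, that trade-off could be tuned offline.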
      • iliekcomputers
        rsh7: hey, hi! Yes, today. I completely forgot :(
      • ruaok
        let me read the thesis this afternoon and digest that. I'll look at t-SNE too.
      • pristine--: can you please do the same?
      • rsh7
        iliekcomputers: wokay, no problem
      • reosarevok
        ruaok probably knows everything about annoy already, right?
      • ruaok
        look who is talking!
      • reosarevok
        ❤️
      • Heh
      • Everyone is angry with these SensCritique people
      • Because they're no longer getting MB updates
      • ruaok btw, did you answer the Finns? I forgot
      • ruaok
        what's with SC?
      • pristine--
        ruaok: sure :)
      • ruaok
        yes, they are willing to give us data, but we need to be ready for it.
      • pristine--
        ruaok: thesis and T-sne, right?
      • alastairp
        pristine--: and annoy
      • reosarevok
        Twitter complaints about how new stuff added to MB isn't showing up on their page, ruaok
      • (and they seem to basically be saying "soon TM" and ignoring people)
      • alastairp
        pristine--: the thesis should give you a good overview about how we consider acoustic similarity
      • iliekcomputers: tell me when we should deploy. it's up to you
      • pristine--
        alastairp: okay:)
      • iliekcomputers
        alastairp: 9PM my time today?
      • alastairp
        no problem
      • iliekcomputers
        Cool, thanks!
      • BrainzGit
        [bookbrainz-site] MonkeyDo merged pull request #254 (master…test-subdomain): "fixes BB-309" docs: Add a section about the "test" subdomain in README https://github.com/bookbrainz/bookbrainz-site/p...
      • travis-ci joined the channel
      • travis-ci
        Project bookbrainz-site build #2034: passed in 3 min 43 sec: https://travis-ci.org/bookbrainz/bookbrainz-sit...
      • travis-ci has left the channel
      • ruaok
        alastairp: another favor please... can you recommend some papers that combine user behavioral data with acoustic data in order to build recommendation engines?
      • didn't dimi do his PhD on that?
      • alastairp
        mmm, good question
      • yeah, I think you're right
      • Gabriel did stuff on behaviour data I think, I'm not sure if he included acoustic data
      • ruaok
        that is the next thing we need to understand.
      • yea, we have the CF in spark to do that. but dimi taught me that we need AB as well.
      • but, what are the algs that are going to scale?
      • alastairp
      • that's his thesis
      • ahmedkrmn_ joined the channel
      • ruaok
        pristine--: iliekcomputers ^^
      • alastairp
        but I'll try and grab him this week and get him to write a handful of notes with a more distilled focus
      • ruaok
        that would be excellent, thank you!
      • perhaps next week the three of us can go to lunch?
      • ahmedkrmn has quit
      • ahmedkrmn_ is now known as ahmedkrmn
      • pristine--
        ruaok: should I also come along. Lol
      • And got the paper :)
      • ruaok
        that would be nice, but the commute is a killer.
      • iliekcomputers
        Where's the visa
      • :) :)
      • ruaok
        gaaaaaaaaaah!
      • ruaok starts twitching madly
      • iliekcomputers
        The commute involves landslides :D
      • And civilians axing trees to free themselves
      • heisthepirate joined the channel
      • amCap1712 joined the channel
      • ruaok
        I've told that story to several friends already. to make a contrast between europeans and indians.
      • Mr_Monkey
        iliekcomputers, ruaok: I'm seeing duplicates in my listens after I relinked my Spotify account yesterday. Is that expected?
      • ruaok
        I've seen it before and we should consider that a bug.
      • file a ticket, Mr_Monkey ?
      • Mr_Monkey
        Will file
      • :)
      • CatQuest
        pristine--: morn morn is the norwegian equivalent of "moin"
      • it's commonly said as "morn morn" twice not just "morn" once
      • code_master5 joined the channel
      • pristine--
        CatQuest: oh. I see.
      • ruaok: yeah. Visa. Lol
      • CatQuest
        oh no not the visas again :C
      • :(
      • heisthepirate has quit
      • that could be clearer in the edit
      • travis-ci joined the channel
      • travis-ci
        metabrainz/picard#4404 (master - 9a7b323 : Laurent Monin): The build passed.
      • travis-ci has left the channel
      • djinni`_ has quit
      • reosarevok registered for the "do you know Estonian laws" exam
      • reosarevok
        One step closer to being a citizen, yay
      • CatQuest
        reosarevok: even more school though.
      • perpetual student you
      • reosarevok
        Nah
      • djinni` joined the channel
      • CatQuest
        but yay!
      • reosarevok
        You basically just go there and are given the Constitution and stuff and just need to show you can figure it out :p
      • So I don't think it actually requires any studying
      • Might like read it once ahead of time just in case, but
      • CatQuest
        yea beter be safe thna srry :P
      • better sorry*
      • gr0uch0mars joined the channel
      • amCap1712
        hi gr0uch0mars
      • Can you explain the way you referred in the PR comment to organize code
      • gr0uch0mars
        hi. yes I was going to look for a good post to link here, but meanwhile I referred to organizing certain files into features
      • like all files of the presentation-layer of Artist together: viewModel, activity, adapters, fragments…
      • that way, it's easier to get a quick overview of what the app offers (something related to Artist), and there's “only” one place to look if you have to touch the code
      • Here is a link about Clean Architecture (way beyond simply “grouping features”) that, although difficult to implement in its totality, is worth reading about: https://fernandocejas.com/2018/05/07/architecti...
      • ruaok
        the supporters page is getting pretty long! https://metabrainz.org/supporters
      • gr0uch0mars
        amCap1712: take a look at the post and share your thoughts. Working with a good architecture is as important as making code work (although not urgent)
      • amCap1712
        thanks gr0uch0mars
      • ahmedkrmn has quit
      • gr0uch0mars
        amCap1712: another question I was thinking about yesterday. Is there a design for the app UI? Or can we work on an improved design?
      • amCap1712
        gr0uch0mars: we can work on improved design
      • gr0uch0mars
        great. Let me think of some ideas and I'll share them. Meanwhile we can work on presenting the data retrieved from the API in an “ordered” manner, like you are doing for Artists
      • Freso
        Hm. Does it make sense to continue to list AcousticBrainz on /supporters with it also being listed on /projects?
      • amCap1712
        ok great gr0uch0mars
      • Freso
        It feels a bit like "oh, hey, we support ourselves!", no?
      • UPF is already listed on their own.
      • alastairp
        Freso: I think that "supporters" is directly linked to "has an API key to download the database"
      • which is why AB is on supporters
      • Freso
        Could be. Just looks a bit odd and self‐congratulatory to me is all. :)
      • alastairp
        maybe we can hide the account from the page if you really want, but I'm not sure if it's needed
      • Freso
        Nah, not if it's something that takes effort.
      • alastairp
        I think it's possible
      • D4RK-PH0ENiX has quit
      • gr0uch0m_ joined the channel
      • gr0uch0mars has quit
      • ruaok