#metabrainz

      • PrathameshG34
        That's basically what I am trying to achieve in my projects as well, just at a lower scale :)
      • mayhem
        what we may need to do is one extra step:
      • - Convert the artist MBID to text and the recording MBID to text, then look up the text in our mapping and output that.
      • that is likely the best cleanup that can be done on that data.
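The cleanup step described above (MBID to text, then text to mapping) could be sketched roughly as follows. This is a minimal illustration only: the lookup tables, the placeholder IDs, and the `normalize` and `clean_pair` helpers are all hypothetical and are not taken from the MetaBrainz codebase.

```python
from typing import Dict, Optional, Tuple

# Hypothetical lookup tables with placeholder IDs; real data would come from MusicBrainz.
ARTIST_NAMES: Dict[str, str] = {"artist-mbid-1": "Some Artist"}
RECORDING_NAMES: Dict[str, str] = {"recording-mbid-1": "Some  Recording"}

# Hypothetical mapping keyed on normalized (artist text, recording text).
MAPPING: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("some artist", "some recording"): ("canonical-artist-mbid", "canonical-recording-mbid"),
}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so text lookups are less brittle."""
    return " ".join(text.lower().split())


def clean_pair(artist_mbid: str, recording_mbid: str) -> Optional[Tuple[str, str]]:
    """Convert both MBIDs to text, then look the text up in the mapping."""
    artist = ARTIST_NAMES.get(artist_mbid)
    recording = RECORDING_NAMES.get(recording_mbid)
    if artist is None or recording is None:
        return None  # one of the MBIDs is unknown, nothing to look up
    return MAPPING.get((normalize(artist), normalize(recording)))
```

Rows that come back as `None` could then be set aside for manual review rather than silently dropped.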
      • PrathameshG34
        interestingly enough, I think I might have already written some Python code for that
      • mayhem
        PrathameshG34: good. I think it would make sense to carry forth with your projects using MB/LB data, rather than last.fm data. but that is just my take. :)
      • PrathameshG34
        Oh yes, definitely!
      • That's exactly what I want to do. Lastfm data is pretty poor, so I am trying to aggregate data from as many feature-rich sources as possible (including MB metadata, and Spotify data for stuff that's not on MB)
      • mayhem
        what is the goal of your project?
      • and will the results be open source?
      • PrathameshG34
        Yes, I'll try my best to put it all back into the MB database.
      • mayhem
        cool, but what is your desired data outcome?
      • PrathameshG34
        I listen to music like a maniac, so I wanted to analyze everything about it.
      • Lastfm was the obvious choice because it combined my streaming history from all sources. Then the data was pathetic, so I looked up MB, etc. Now I am just trying to aggregate all that stuff together and give back to this community in the process since it's exactly what I wished to create when I didn't know about it.
      • My desired outcome is to create a process that takes in just a few fields about a stream (title, artist, album, and MBID). Then crawl the web to find as much metadata about it as possible
      • mayhem
        ok, sounds like our goals align well.
      • PrathameshG34
        I wish to later use that data for advanced analytics and stuff. Provide insights that no other platform currently provides, interesting connections and visualizations (like unofficial collabs between artists in form of production credits, etc).
      • Just a LOT of metadata about everything, in one convenient place
      • mayhem
        though adding scraped data to MB might be tricky -- that likely won't be able to be done in an automated manner...
      • PrathameshG34
        mayhem: 🤝
      • lucifer
        mayhem: i had tried 10, 15, 25. it selected 10 for iterations, which is what we are using already.
      • mayhem
        PrathameshG34: yep, those are our goals with listenbrainz.
      • lucifer: ok, then I think we should set the range 5 - 15 so that its current best lies in the middle of the range.
      • let me re-read about the alpha factor again.
      • lucifer
        👍
      • PrathameshG34
        mayhem: right, I am aware that the metabrainz data addition goes through a lot of scrutiny for obvious reasons, so I am not expecting much from it at this very moment, but hopefully we could create pipelines for it further down the line :))
      • mayhem
        good good
      • "alpha is a parameter applicable to the implicit feedback variant of ALS that governs the baseline confidence in preference observations (defaults to 1.0)."
      • sigh
      • PrathameshG34
        BTW, I'd love to get started with the MLHD and do some EDA with it. However, I am facing some problems downloading it. I'll try again and let you guys know if I face any issues with it again 👍
      • I'll be right back
      • mayhem
        k.
      • we also have a common dev machine in a data center that we could probably give you an account on.
      • it has gobs of bandwidth
      • lucifer: let's just repeat our process for the alpha parameter. look at what is picked, pick a new range that gives it more space. re-run, evaluate, adjust.
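The "look at what is picked, pick a new range, re-run, evaluate, adjust" loop can be sketched as below. This is only an illustration under stated assumptions: `centered_candidates` and `tune` are hypothetical helpers, the `evaluate` callback stands in for a full training and evaluation run, and a lower score is assumed to be better. It is not the ListenBrainz tuning code.

```python
from typing import Callable, List, Tuple


def centered_candidates(best: int, half_width: int, step: int) -> List[int]:
    """Build a candidate grid so that `best` sits in the middle of the range."""
    low = max(step, best - half_width)  # keep candidates positive
    return list(range(low, best + half_width + 1, step))


def tune(
    evaluate: Callable[[int], float],  # hypothetical: trains a model, returns an error (lower is better)
    candidates: List[int],
    half_width: int,
    step: int,
    max_rounds: int = 5,
) -> Tuple[int, List[int]]:
    """Re-center the range on the current best until the best is an interior point."""
    best = min(candidates, key=evaluate)
    for _ in range(max_rounds):
        if candidates[0] < best < candidates[-1]:
            break  # best is no longer on the edge of the range
        candidates = centered_candidates(best, half_width, step)
        best = min(candidates, key=evaluate)
    return best, candidates
```

With the iteration values mentioned earlier (10, 15, 25, with 10 picked), `centered_candidates(10, 5, 5)` gives the 5 to 15 range suggested above. The same idea would apply to alpha with a floating-point grid.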
      • lucifer
        mayhem: it appears unlike other params we can't request multiple alpha in one job. i'll modify the code tomorrow so that we can test it.
      • PrathameshG34
        sheesh
      • PrathameshG34 has quit
      • PrathameshG joined the channel
      • mayhem
        lucifer: sounds good.
      • lucifer
        👍
      • PrathameshG
      • I hope waiting for some time will solve this issue?
      • mayhem
        I wonder if this has been ongoing for a while.
      • PrathameshG
        Yea, I wasn't able to access the dataset yesterday either
      • Are there any mirrors for this?
      • BrainzGit
        [bookbrainz-site] dependabot[bot] opened pull request #806 (master…dependabot/npm_and_yarn/babel/runtime-7.17.7): chore(deps): bump @babel/runtime from 7.16.3 to 7.17.7 https://github.com/metabrainz/bookbrainz-site/p...
      • [bookbrainz-site] dependabot[bot] closed pull request #795 (master…dependabot/npm_and_yarn/babel/runtime-7.17.2): chore(deps): bump @babel/runtime from 7.16.3 to 7.17.2 https://github.com/metabrainz/bookbrainz-site/p...
      • alastairp
        PrathameshG: I emailed the author of the dataset (an old workmate of mine) to see if he knows what the error means. If it doesn't get resolved, I think I have a full copy somewhere, I'll have a look
      • PrathameshG
        alastairp: Thanks a lot! Really appreciated.
      • BrainzGit
        [bookbrainz-site] MonkeyDo merged pull request #800 (master…edition-initial-search-matching-EG): fix(entity-editor): Edition: search for EditionGroups with same name https://github.com/metabrainz/bookbrainz-site/p...
      • [bookbrainz-site] MonkeyDo merged pull request #801 (master…search-initial-pre-filled-name): feat(entity-editor): search for duplicates when pre-filling name https://github.com/metabrainz/bookbrainz-site/p...
      • Shubh has quit
      • [musicbrainz-server] reosarevok opened pull request #2454 (master…MBS-12252): MBS-12252 / MBS-12253 / MBS-12254 / MBS-12255: Genre-related schema additions for consistency https://github.com/metabrainz/musicbrainz-serve...
      • mayhem
      • now with query timings for exact and fuzzy searches
      • dammn, postgres. 495us for a query. you crazy.
      • I smell a max 3 query solution for this. and with postgres, we can scale in a predictable way.
      • one sec, moving to bono
      • PrathameshG has quit
      • skelly37 has quit
      • yellowhatpro
        Hello akshaaatt , I worked on the user profile feature. For now I just implemented basic view pager, with nothing inside. Have a look :
      • If you like it, I would like to work on the design of each fragment on Figma.
      • v6lur has quit
      • zas
        yvanzo: can you have a look at SIR, queue is growing, not sure why
      • mayhem
      • lucifer: ^^
      • :facepalm: I have no idea why this hasn't screwed up more items in the mapping. 🤯
      • anyways, I am ready to start working on mapping stuffs. the good news appears: not a lot of serious stuff to fix
      • zas
        sir-prod was reporting this error https://www.irccloud.com/pastebin/oqGIA64i/
      • I restarted it, but it doesn't seem to ingest new messages
      • yvanzo: ^^
      • Messages are ingested again (10 minutes after the restart) https://www.irccloud.com/pastebin/3O04228W/
      • I stopped sir-prod again, the load on pink increased a lot and messages ingestion is somehow stuck, number of the messages in the queue is increasing again
      • 2 minutes to connect to rabbitmq, is this normal? https://www.irccloud.com/pastebin/jOpHPO3h/
      • texke`_ joined the channel
      • texke` has quit
      • yvanzo: I tried again, sir ingests a few messages, then it gets stuck, and load increases. I stopped it, have a look when you can.
      • aerozol
        Been getting a bug lately where the edit note screen no longer gives a summary/preview of all the edits. Seems to happen more with importers (e.g. atisket, discogs). Anyone else?
      • I could test by editing without any userscripts enabled for a while... but that sounds awful :D