#metabrainz

      • shivam-kapila
        Lol. Time to test the nonsense script. 😂
      • 2020-05-14 13520, 2020

      • ruaok
        it is nonsense!
      • 2020-05-14 13515, 2020

      • ruaok
        yvanzo: the MB VM we have at azure for your VM work -- do you still need that?
      • 2020-05-14 13511, 2020

      • yvanzo
        I still use it at least, should it be shut down?
      • 2020-05-14 13550, 2020

      • ruaok
        well, we're not really paying for it, but we're using credits.
      • 2020-05-14 13557, 2020

      • ruaok
        if you use it frequently, then keep it
      • 2020-05-14 13508, 2020

      • ruaok
        but if you use it once a month, then let's shut it down between runs.
      • 2020-05-14 13538, 2020

      • yvanzo
        Ok, I’m using it these days to test pg12 at least.
      • 2020-05-14 13520, 2020

      • ruaok
        I'm glad it is getting use. when that drops off, please please remember to ping me to shut it down.
      • 2020-05-14 13501, 2020

      • yvanzo
        Will do, I don’t use it regularly, but I did use it a lot for the past weeks.
      • 2020-05-14 13500, 2020

      • shivam-kapila
        > for your UI mockup, I'd love to see the ability to add a tag (genre) as part of the UI.
      • 2020-05-14 13500, 2020

      • shivam-kapila
        ruaok: Since it's something associated to track metadata do you think "Edit a listen" should be implemented?
      • 2020-05-14 13520, 2020

      • v6lur has quit
      • 2020-05-14 13518, 2020

      • v6lur joined the channel
      • 2020-05-14 13528, 2020

      • ruaok
        what do you see being edited there?
      • 2020-05-14 13524, 2020

      • shivam-kapila
        We want an option to add tags for the listens right?
      • 2020-05-14 13543, 2020

      • ruaok
        actually, this was to add tags to *recordings*.
      • 2020-05-14 13512, 2020

      • shivam-kapila
        Yes sorry my bad.
      • 2020-05-14 13538, 2020

      • shivam-kapila
        The tags field we sotre in track_metadata
      • 2020-05-14 13542, 2020

      • BrainzGit
        [bookbrainz-site] MonkeyDo merged pull request #421 (master…delete-duplicate-relationships): BB-292 fix(sql): Script to delete duplicate relationships https://github.com/bookbrainz/bookbrainz-site/pul…
      • 2020-05-14 13543, 2020

      • shivam-kapila
        store*
      • 2020-05-14 13543, 2020

      • BrainzBot
        BB-292: Remove duplicate relationships https://tickets.metabrainz.org/browse/BB-292
      • 2020-05-14 13512, 2020

      • ruaok
        shivam-kapila: I want to be able to add a tag to *musicbrainz*, not listenbrainz from that page.
      • 2020-05-14 13506, 2020

      • shivam-kapila
        Oh I get it now. I was thinking that it would only affect data in LB. Thanks for clarifying.
      • 2020-05-14 13521, 2020

      • ZaphodBeeblebrox has left the channel
      • 2020-05-14 13500, 2020

      • CatQuest joined the channel
      • 2020-05-14 13500, 2020

      • CatQuest has quit
      • 2020-05-14 13500, 2020

      • CatQuest joined the channel
      • 2020-05-14 13511, 2020

      • CatQuest
        <3 reosarevok
      • 2020-05-14 13525, 2020

      • ruaok
        iliekcomputers: "You're on the GitHub Sponsored Organizations waitlist!"
      • 2020-05-14 13523, 2020

      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #1517 (master…MBS-10825): MBS-10825: Fix Muziekweb cleanup, add work support https://github.com/metabrainz/musicbrainz-server/…
      • 2020-05-14 13524, 2020

      • BrainzBot
        MBS-10825: Normalization of some Muziekweb links make them invalid https://tickets.metabrainz.org/browse/MBS-10825
      • 2020-05-14 13507, 2020

      • ishaanshah[m]
        removing artist_msid and artist_mbid from the GROUP_BY clause should fix LB-547 right?
      • 2020-05-14 13547, 2020

      • loujine joined the channel
      • 2020-05-14 13527, 2020

      • shivam-kapila
        ishaanshah[m]: I suggested the same fix to him some days ago. IIRC he said we should solve this problem in the frontend.
      • 2020-05-14 13525, 2020

      • jmp_music has quit
      • 2020-05-14 13527, 2020

      • ephemer0l_ has quit
      • 2020-05-14 13503, 2020

      • ruaok
        zas: I've added the DNS record for bono to metabrainz.org
      • 2020-05-14 13512, 2020

      • alastairp
        ruaok: dime
      • 2020-05-14 13535, 2020

      • ruaok
        it was clear that that extra machine we set up for the summit was superfluous, so I nuked it.
      • 2020-05-14 13558, 2020

      • ruaok
        musicbrainz-docker is running on bono. bono can now be reached with bono.metabrainz.org .
      • 2020-05-14 13559, 2020

      • ruaok
        fin.
      • 2020-05-14 13502, 2020

      • alastairp
        right - the one hosting odessey?
      • 2020-05-14 13507, 2020

      • ruaok
        yes.
      • 2020-05-14 13513, 2020

      • alastairp
        I thought that it was hosted on bono too. guess not
      • 2020-05-14 13515, 2020

      • alastairp
        problem solved
      • 2020-05-14 13529, 2020

      • ruaok
        we have bono, let's concentrate on hosting all things related on bono.
      • 2020-05-14 13537, 2020

      • alastairp
        done/done
      • 2020-05-14 13538, 2020

      • alastairp
        thanks
      • 2020-05-14 13554, 2020

      • ruaok
        I'm working to get the artist relations production-ish ready so I can deploy that on bono
      • 2020-05-14 13537, 2020

      • shivam-kapila
        Is bono.metabrainz.org supposed to be open to access?
      • 2020-05-14 13556, 2020

      • alastairp
        no, it's just a domain name that allows us to manage our servers
      • 2020-05-14 13505, 2020

      • alastairp
        you probably want something like similarity.acousticbrainz.org
      • 2020-05-14 13515, 2020

      • alastairp
        (or, you will once it's up and running)
      • 2020-05-14 13535, 2020

      • shivam-kapila
        Hm. Okay okay. I was able to access the url so I asked. Thanks
      • 2020-05-14 13516, 2020

      • alastairp
        sure, I guess our webserver accepts requests to all domains
      • 2020-05-14 13525, 2020

      • shivam-kapila
        Got it
      • 2020-05-14 13541, 2020

      • ruaok
        yvanzo: I'm reading through the updated docs for musicbrainz-docker and I've got two questions:
      • 2020-05-14 13516, 2020

      • ruaok
        1. how do I expose the port for PG only? Can I set an alternate port number other than 5432?
      • 2020-05-14 13544, 2020

      • ruaok
        2. How do I tune PG? I need to set more shared_buffers.
      • 2020-05-14 13505, 2020

      • yvanzo
        ruaok: for 1. create a file local/compose/publish-db-port.yml (see compose/publishing-all-ports.yml for example) and run admin/configure add local/compose/publish-db-port.yml; docker-compose up -d
      • 2020-05-14 13533, 2020

      • yvanzo
        2. cannot be done without building your own image for db service, should be added.
      • 2020-05-14 13541, 2020

      • ruaok
        thanks for #1.
      • 2020-05-14 13523, 2020

      • ruaok
        #2 could be a drastic issue we need to address. stock PG is dog slow. :(
      • 2020-05-14 13545, 2020

      • ruaok
        shared_buffers = 512MB
      • 2020-05-14 13507, 2020

      • ruaok
        if we're recommending a machine of 16GB, should we set that to 4096MB as default?
      • 2020-05-14 13536, 2020

      • yvanzo
        Maybe but 16GB was mainly required for live indexing.
      • 2020-05-14 13508, 2020

      • ruaok
        well, we can argue over defaults and what-not, but we need a way to tweak the settings.
      • 2020-05-14 13536, 2020

      • ruaok
        because a noob will not know that they need to tune DB and will get the impression that our shit is dog slow.
      • 2020-05-14 13539, 2020

      • yvanzo
        Working on a patch, should be quick.
      • 2020-05-14 13545, 2020

      • ruaok
        <3
      • 2020-05-14 13502, 2020

      • ruaok
        no rush. I've fixed my setup.
      • 2020-05-14 13509, 2020

      • ruaok
        (but I know how)
      • 2020-05-14 13522, 2020

      • ruaok
        the admin/configure stuff is really nice too.
      • 2020-05-14 13519, 2020

      • yvanzo
        the idea is to have a file like default/indexer.ini that could be appended to db configuration.
      • 2020-05-14 13529, 2020

      • ruaok
        +1
      • 2020-05-14 13535, 2020

      • ishaanshah[m]
        iliekcomputers: I had some questions about pyspark and sql
      • 2020-05-14 13518, 2020

      • ishaanshah[m]
        Please ping me when you are up
      • 2020-05-14 13536, 2020

      • iliekcomputers
        hi
      • 2020-05-14 13551, 2020

      • ishaanshah[m]
        hey
      • 2020-05-14 13508, 2020

      • CatQuest
        oh man. Mr_Monkey userscript that can pre-fill language twice on https://bookbrainz.org/work/create?author=23f213e… pls (I need to add 28+ works for small short stories and it means I have to type "norw" and select from a drop-down 56 times)
      • 2020-05-14 13509, 2020

      • ishaanshah[m]
        If we remove artist_msid from group_by I think we can fix a part of LB-547
      • 2020-05-14 13510, 2020

      • iliekcomputers
        hmmm.
      • 2020-05-14 13544, 2020

      • iliekcomputers
        i'm not completely sure we want to do that calculation on names
      • 2020-05-14 13552, 2020

      • iliekcomputers
        different artists can have same names
      • 2020-05-14 13558, 2020

      • ishaanshah[m]
        Another part is duplicates because of different names
      • 2020-05-14 13510, 2020

      • iliekcomputers
        ideally the bug would be fixed by messybrainz
      • 2020-05-14 13520, 2020

      • iliekcomputers
        i'd not worry about it tbh.
      • 2020-05-14 13547, 2020

      • ishaanshah[m]
        Ok
      • 2020-05-14 13517, 2020

      • ishaanshah[m]
        For range queries we should compare the timestamp right
      • 2020-05-14 13544, 2020

      • iliekcomputers
        so if you look at hdfs
      • 2020-05-14 13546, 2020

      • ishaanshah[m]
        Like, WHERE listen.timestamp > min_ts AND listen.timestamp < max_ts
      • 2020-05-14 13549, 2020

      • pristine__
        ishaanshah[m]: that bug will be better handled once we have the mapping right. Artist with two different msids can have same mbid and it leads to weird results. The ideal way is to use mbid everywhere once we have the mapping :)
      • 2020-05-14 13502, 2020

      • iliekcomputers
        the structure of the data is data/year/month.parquet
      • 2020-05-14 13513, 2020

      • iliekcomputers
        where month.parquet contains that month's listens
      • 2020-05-14 13526, 2020

      • iliekcomputers
        for a month, you'd just load the month's data and query on that.
      • 2020-05-14 13531, 2020

      • iliekcomputers
        won't need a where
      • 2020-05-14 13541, 2020

      • iliekcomputers
        but for week, yes, the where would make sense.
      • 2020-05-14 13544, 2020

      • ruaok
        pristine__: I'm working on the aa relations right now. once that is built up, I'm going to do the same for the msb mapping
      • 2020-05-14 13557, 2020

      • ishaanshah[m]
        But that won't work for week
      • 2020-05-14 13504, 2020

      • iliekcomputers
        even for week
      • 2020-05-14 13512, 2020

      • pristine__
        ruaok: no hurry :)
      • 2020-05-14 13513, 2020

      • iliekcomputers
        you should load just the month's data and put the where on that.
      • 2020-05-14 13549, 2020
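
The range-query idea above can be sketched in plain Python. This is only an illustration under stated assumptions: timestamps are taken to be epoch seconds in UTC, the file name is assumed to use an unpadded month number under `data/<year>/`, and `week_bounds`, `month_paths` and `filter_listens` are hypothetical helper names, not ListenBrainz code. The sketch uses a half-open range, whereas the `WHERE` clause quoted in the chat uses strict `>` and `<`. It also lists two files when a week straddles a month boundary, which goes beyond what was said in the channel.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def week_bounds(day: datetime) -> tuple[int, int]:
    """Return (min_ts, max_ts) in epoch seconds for the Monday-to-Monday
    week containing `day`. `day` must be timezone-aware (UTC assumed)."""
    start = (day - timedelta(days=day.weekday())).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    end = start + timedelta(days=7)
    return int(start.timestamp()), int(end.timestamp())


def month_paths(min_ts: int, max_ts: int, root: str = "data") -> list[str]:
    """List the <root>/<year>/<month>.parquet files overlapping [min_ts, max_ts).
    The unpadded month number in the file name is an assumption."""
    first = datetime.fromtimestamp(min_ts, tz=timezone.utc)
    last = datetime.fromtimestamp(max_ts - 1, tz=timezone.utc)
    paths = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        paths.append(f"{root}/{year}/{month}.parquet")
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return paths


def filter_listens(listens: list[dict], min_ts: int, max_ts: int) -> list[dict]:
    """Keep listens whose timestamp falls in the half-open range [min_ts, max_ts)."""
    return [l for l in listens if min_ts <= l["timestamp"] < max_ts]
```

For a whole month no filter is needed, since one file already holds exactly that month's listens. For a week, the bounds select a subset of whatever month data was loaded.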

      • ishaanshah[m]
        Ohk
      • 2020-05-14 13501, 2020

      • ishaanshah[m]
        Now suppose I import a month's data
      • 2020-05-14 13535, 2020

      • ishaanshah[m]
        and create a temporary view
      • 2020-05-14 13524, 2020

      • ishaanshah[m]
        Then I should first filter out the week's listens
      • 2020-05-14 13530, 2020

      • pristine__
        ruaok: I see that we want to send recordings to lemmy that means we don't need a tar.
      • 2020-05-14 13540, 2020

      • ruaok
        iliekcomputers: thoughts on where this code should live? it uses the MB database to calculate relationship info that will be used in LB recommendation stuff. I'm inclined to make a new top-level dir in listenbrainz-server called `relations` and put the code there.
      • 2020-05-14 13549, 2020

      • ruaok
        pristine__: correct!
      • 2020-05-14 13556, 2020

      • iliekcomputers
        ruaok: yes, that would be ideal.
      • 2020-05-14 13500, 2020

      • ruaok
      • 2020-05-14 13501, 2020

      • iliekcomputers
        imo
      • 2020-05-14 13503, 2020

      • ruaok
        the code in question. :)
      • 2020-05-14 13528, 2020

      • iliekcomputers
        that sounds perfect.
      • 2020-05-14 13532, 2020

      • ruaok
        k
      • 2020-05-14 13544, 2020

      • ruaok
        also makes it much easier to make a PR for it.
      • 2020-05-14 13502, 2020

      • iliekcomputers
        ishaanshah[m]: yeah, that sounds reasonable.
      • 2020-05-14 13503, 2020

      • CatQuest has left the channel
      • 2020-05-14 13531, 2020

      • ishaanshah[m]
        Now will my table update accordingly
      • 2020-05-14 13547, 2020

      • ishaanshah[m]
        Or I should collect and again create a temp view
      • 2020-05-14 13516, 2020

      • iliekcomputers
        collects are very expensive
      • 2020-05-14 13523, 2020

      • ishaanshah[m]
        And then run the GROUP BY query
      • 2020-05-14 13545, 2020

      • iliekcomputers
        load the month's data once, and then create different views from it as you need.
      • 2020-05-14 13520, 2020

      • CatQuest joined the channel
      • 2020-05-14 13520, 2020

      • CatQuest has quit
      • 2020-05-14 13520, 2020

      • CatQuest joined the channel
      • 2020-05-14 13546, 2020

      • ishaanshah[m]
        No, I got that, but I can't select timestamp in group by query right
      • 2020-05-14 13501, 2020

      • iliekcomputers
        oh
      • 2020-05-14 13524, 2020

      • ishaanshah[m]
        so first I will have to filter according to timestamp using WHERE query
      • 2020-05-14 13524, 2020

      • iliekcomputers
        you can create a new dataframe from the month dataframe
      • 2020-05-14 13529, 2020

      • iliekcomputers
        and then query on that
      • 2020-05-14 13541, 2020

      • iliekcomputers
        month_df = load_month_data()
      • 2020-05-14 13554, 2020

      • iliekcomputers
        week_df = month_df.filter(blahblhablha)
      • 2020-05-14 13554, 2020

      • ishaanshah[m]
        So basically something like this -
      • 2020-05-14 13559, 2020

      • iliekcomputers
        query on week_df now
      • 2020-05-14 13541, 2020

      • ishaanshah[m]
        Ohk, got it
      • 2020-05-14 13541, 2020

      • CatQuest
        T_T
      • 2020-05-14 13508, 2020

      • ishaanshah[m]
        So i can't run 2 sql queries on a temporary view
      • 2020-05-14 13518, 2020

      • iliekcomputers
        i'm pretty sure you can.
      • 2020-05-14 13546, 2020
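
A rough check of the claim that a temporary view can be queried more than once. The sketch uses Python's built-in `sqlite3` as a stand-in for Spark SQL, so it shows only the general SQL behaviour, not anything specific to Spark. The table name, view name, column set and sample rows are made up for illustration. It also shows that `timestamp` can be used in `WHERE` without appearing in the `SELECT` list of a `GROUP BY` query, so the filter and the aggregation can share one statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listens (user_name TEXT, artist_msid TEXT, timestamp INTEGER)"
)
conn.executemany(
    "INSERT INTO listens VALUES (?, ?, ?)",
    [
        ("alice", "msid-a", 100),
        ("alice", "msid-a", 150),
        ("alice", "msid-b", 160),
        ("bob", "msid-a", 900),  # falls outside the range used below
    ],
)
# Temporary view over the table, playing the role of the temp view from the chat.
conn.execute("CREATE TEMP VIEW listens_view AS SELECT * FROM listens")

min_ts, max_ts = 50, 500

# First query on the view: plain range filter.
in_range = conn.execute(
    "SELECT * FROM listens_view WHERE timestamp > ? AND timestamp < ?",
    (min_ts, max_ts),
).fetchall()

# Second query on the same view: filter and aggregate in one statement.
# WHERE is applied before GROUP BY, so timestamp need not be selected.
counts = conn.execute(
    """
    SELECT user_name, artist_msid, COUNT(*) AS listen_count
    FROM listens_view
    WHERE timestamp > ? AND timestamp < ?
    GROUP BY user_name, artist_msid
    ORDER BY user_name, artist_msid
    """,
    (min_ts, max_ts),
).fetchall()
```

Running the first query does not change the view, so the second query still sees every row of the underlying table.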

      • ishaanshah[m]
        so suppose i create a temp view from df
      • 2020-05-14 13555, 2020

      • ishaanshah[m]
        Run
      • 2020-05-14 13511, 2020

      • ishaanshah[m]
        SELECT * FROM view WHERE ...
      • 2020-05-14 13516, 2020

      • ishaanshah[m]
        and then run
      • 2020-05-14 13534, 2020

      • ishaanshah[m]
        SELECT user_name, artist_msid...