#metabrainz

      • Mr_Monkey
        Great :)
      • 2020-06-25 17728, 2020

      • shivam-kapila
        We can work on that tomorrow if you want
      • 2020-06-25 17735, 2020

      • Mr_Monkey
        That sounds good
      • 2020-06-25 17748, 2020

      • travis-ci joined the channel
      • 2020-06-25 17748, 2020

      • travis-ci
        Project bookbrainz-site build #3165: passed in 3 min 33 sec: https://travis-ci.org/bookbrainz/bookbrainz-site/…
      • 2020-06-25 17748, 2020

      • travis-ci has left the channel
      • 2020-06-25 17718, 2020

      • shivam-kapila
        Just wanted to let you know, I wasn't very active for the past 2 days.
      • 2020-06-25 17727, 2020

      • shivam-kapila
        Also hi everyone :)
      • 2020-06-25 17745, 2020

      • Mr_Monkey
        No issue. Design takes time, and you needed to investigate the possibilities for this tricky issue
      • 2020-06-25 17751, 2020

      • ruaok
        sorry for not being on top of this (sucky energy in the past few days) but which is the tricky issue you're trying to solve?
      • 2020-06-25 17722, 2020

      • Mr_Monkey
        Sidebar navigation works great on desktop, but where do we put that in mobile and small screens?
      • 2020-06-25 17750, 2020

      • ruaok
        got a link to a recent design so I can see?
      • 2020-06-25 17737, 2020

      • Mr_Monkey
      • 2020-06-25 17751, 2020

      • ruaok
        thx
      • 2020-06-25 17720, 2020

      • ruaok
        sidebar in this context is the left sidebar that contains the player, right?
      • 2020-06-25 17742, 2020

      • alastairp
        jmp_music: what computer are you using? mac or linux? how much memory do you have, and if you're on a mac, how much memory have you assigned to docker?
      • 2020-06-25 17745, 2020

      • Mr_Monkey
        Yes. And more importantly, the dashboard secondary navigation
      • 2020-06-25 17750, 2020

      • ruaok
        I like the mini player, and I have seen it in other apps too.
      • 2020-06-25 17712, 2020

      • Mr_Monkey
        (the player is easier to solve, see the mini player at the bottom of the mobile view)
      • 2020-06-25 17725, 2020

      • ruaok
        history | playlists | stats <- secondary navigation?
      • 2020-06-25 17730, 2020

      • Lotheric joined the channel
      • 2020-06-25 17741, 2020

      • ruaok
        ok, kewl. yeah, that looks good to me.
      • 2020-06-25 17741, 2020

      • Mr_Monkey
        Yes. With sub-items
      • 2020-06-25 17728, 2020

      • ruaok
        ok, I see the difficulty. the most obvious solution is the "collapsed top nav" which I also dislike from bootstrap's menu.
      • 2020-06-25 17706, 2020

      • Mr_Monkey
        Yeah. I suggested combining the two navs in mobile into that collapsed bootstrap menu thing
      • 2020-06-25 17707, 2020

      • jmp_music
        alastairp: At home I use a mac. The memory assigned to docker is 2GB.
      • 2020-06-25 17717, 2020

      • jmp_music
        I'll assign more to it
      • 2020-06-25 17722, 2020

      • alastairp
        jmp_music: and how many cores?
      • 2020-06-25 17730, 2020

      • jmp_music
        4
      • 2020-06-25 17741, 2020

      • alastairp
        right, we have a known memory leak in this process (https://github.com/MTG/gaia/issues/96)
      • 2020-06-25 17755, 2020

      • Mr_Monkey
        Might be very tricky though, considering the top navbar is rendered by flask
      • 2020-06-25 17725, 2020

      • alastairp
        what happens is that if you do too many iterations, the thread will run out of memory and stop accepting new training iterations
      • 2020-06-25 17736, 2020

      • alastairp
        what Job did you get up to?
      • 2020-06-25 17748, 2020

      • alastairp
        x/1728?
      • 2020-06-25 17734, 2020

      • jmp_music
        Oh ok! Yeap, x/1728
      • 2020-06-25 17748, 2020

      • alastairp
        how many did it complete before it stopped responding?
      • 2020-06-25 17736, 2020

      • alastairp
        oh - now I remember
      • 2020-06-25 17752, 2020

      • alastairp
        edit your project_danceability.yaml file, and set `clusterMode: True`
      • 2020-06-25 17756, 2020

      • alastairp
        then run it again
      • 2020-06-25 17732, 2020

      • alastairp
        this will start a separate process for each combination, and so the memory will be freed at the end of each. that should fix the issue of it hanging, you won't need to increase memory if you don't have it available
      • 2020-06-25 17739, 2020
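
The per-combination isolation described above can be sketched roughly as follows. This is only an illustration of the general idea, not gaia's implementation: `WORKER`, `run_isolated`, and the parameter names `C` and `gamma` are hypothetical, and the worker merely simulates a leak instead of training anything.

```python
import json
import os
import subprocess
import sys

# Hypothetical worker script: stands in for one training iteration.
WORKER = """
import json, os, sys
params = json.loads(sys.argv[1])
leaked = [0] * 100000  # simulated leak; the OS reclaims it when this process exits
print(json.dumps({"params": params, "pid": os.getpid()}))
"""


def run_isolated(params):
    """Run one parameter combination in a fresh Python process."""
    done = subprocess.run(
        [sys.executable, "-c", WORKER, json.dumps(params)],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(done.stdout)


if __name__ == "__main__":
    # Hypothetical parameter grid; a real grid would come from the project file.
    combos = [{"C": c, "gamma": g} for c in (1, 10) for g in (0.1, 1.0)]
    for combo in combos:
        result = run_isolated(combo)
        print(result["params"], "ran in pid", result["pid"], "parent is", os.getpid())
```

Because every combination runs in its own short-lived process, whatever memory it leaks is returned to the operating system when that process exits, so the total never accumulates across iterations.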

      • jmp_music
        Oh ok!
      • 2020-06-25 17757, 2020

      • jmp_music
        because I didn't remember on which process it stopped
      • 2020-06-25 17707, 2020

      • jmp_music
        at the previous run
      • 2020-06-25 17745, 2020

      • jmp_music
        What is your suggestion about the memory I should assign to docker?
      • 2020-06-25 17755, 2020

      • jmp_music
        4gb would be ok?
      • 2020-06-25 17756, 2020

      • alastairp
        sure. I asked because if you set it higher, it will process more items before running out of memory ;) I was trying to see if we can set it high enough to be able to finish all combinations before the memory leak gets too big
      • 2020-06-25 17715, 2020

      • alastairp
        however clusterMode: True means that we no longer have to consider this problem
      • 2020-06-25 17755, 2020

      • alastairp
        it's not a problem if you can't remember where it stopped. make these changes and try again
      • 2020-06-25 17703, 2020

      • jmp_music
        yeap, I made the change to the yaml file, and I assigned more memory to docker.
      • 2020-06-25 17721, 2020

      • alastairp
        let's see if that works
      • 2020-06-25 17740, 2020

      • alastairp
        you can see how complicated it is to get gaia working... this is why we want to replace it!
      • 2020-06-25 17738, 2020

      • ruaok chuckles
      • 2020-06-25 17711, 2020

      • jmp_music
        True that! While it is a great ML system, I understand these reasons.
      • 2020-06-25 17745, 2020

      • jmp_music
        with `clusterMode: True`, exceptions kept showing up in my latest runs, but with the larger memory assignment, the process finished
      • 2020-06-25 17742, 2020

      • jmp_music
        and the related .html was created
      • 2020-06-25 17723, 2020

      • jmp_music
        I tested the run with `className: danceability`
      • 2020-06-25 17745, 2020

      • jmp_music
        not `testdanceability`, and it finished the process
      • 2020-06-25 17717, 2020

      • jmp_music
      • 2020-06-25 17727, 2020

      • alastairp
        perfect!
      • 2020-06-25 17748, 2020

      • alastairp
        so it should also output the parameters for the combination that got the highest accuracy
      • 2020-06-25 17718, 2020

      • alastairp
        can you show an example of the exceptions that you get with clusterMode: True? That doesn't sound good
      • 2020-06-25 17722, 2020

      • jmp_music
        here are the parameters gaia chose for the danceability classification: https://usercontent.irccloud-cdn.com/file/xwznEsf…
      • 2020-06-25 17728, 2020

      • alastairp
        there it is!
      • 2020-06-25 17740, 2020

      • jmp_music
        yes of course. Wait a sec while I run it
      • 2020-06-25 17732, 2020

      • alastairp
        so, now that we have those parameters, the question is: can we put those parameters into sklearn, and get the same 92.8% accuracy?
      • 2020-06-25 17719, 2020

      • jmp_music
        i think it could be done
      • 2020-06-25 17703, 2020

      • jmp_music
        I'll try it
      • 2020-06-25 17743, 2020

      • jmp_music
        here is the traceback when clusterMode is True
      • 2020-06-25 17745, 2020

      • jmp_music
      • 2020-06-25 17702, 2020

      • jmp_music
        but the process does not stop running
      • 2020-06-25 17741, 2020

      • alastairp
        oh right. I wonder if this is the same problem with the name
      • 2020-06-25 17733, 2020

      • alastairp
        I recommend that we go back a step - delete your results directory, and edit the groundtruth file and the profile file to change the className
      • 2020-06-25 17747, 2020

      • alastairp
        on my version I just called it `testdanceability`
      • 2020-06-25 17754, 2020

      • alastairp
        then run it again and see if it completes
      • 2020-06-25 17747, 2020

      • jmp_music
        by profile file, do you mean the project yaml that is created by the template?
      • 2020-06-25 17709, 2020

      • jmp_music
        basically, this is what you meant! I just ran it and it worked
      • 2020-06-25 17717, 2020

      • jmp_music
        '=D :')
      • 2020-06-25 17711, 2020

      • jmp_music
        I'm going to eat something, and I'll be back in a few minutes
      • 2020-06-25 17748, 2020

      • ruaok
        alastairp: the battery holders are on my desk in the office.
      • 2020-06-25 17719, 2020

      • alastairp
        ruaok: thanks
      • 2020-06-25 17740, 2020

      • alastairp
        jmp_music: right, again I told you to do it like this because there are some complexities....
      • 2020-06-25 17714, 2020

      • alastairp
        you could edit your script to call train_model(... cluster_mode=True)
      • 2020-06-25 17735, 2020

      • alastairp
        and then delete the results/ directory, delete the project file, edit groundtruth.yaml to change className, and then run it again
      • 2020-06-25 17754, 2020

      • sumedh has quit
      • 2020-06-25 17738, 2020

      • jmp_music
        alastairp: ok! Finally, now the training process finished correctly without errors and exceptions.
      • 2020-06-25 17702, 2020

      • jmp_music
        Loading all results...[1728/1728] (100% done)... number of results: 1728
      • 2020-06-25 17705, 2020

      • sumedh joined the channel
      • 2020-06-25 17721, 2020

      • jmp_music
        and not 576/576 as it did before
      • 2020-06-25 17714, 2020

      • alastairp
        excellent
      • 2020-06-25 17743, 2020

      • jmp_music
        I saw you also merged the PR. Thanks
      • 2020-06-25 17717, 2020

      • JoshDi joined the channel
      • 2020-06-25 17754, 2020

      • JoshDi
        does anyone have any optimized indexer.ini settings for Live Indexing? I noticed that my server can run a full reindex in about 2.5 hrs but live indexing takes almost 24 hours
      • 2020-06-25 17725, 2020

      • ruaok waves at JoshDi
      • 2020-06-25 17742, 2020

      • ruaok
        yvanzo would be the person who could help you, JoshDi
      • 2020-06-25 17731, 2020

      • JoshDi
        thank you ruaok
      • 2020-06-25 17753, 2020

      • ruaok
        and now is the time of day when yvanzo might appear too
      • 2020-06-25 17753, 2020

      • JoshDi
        I have been working with yvanzo via the bug report on git: https://github.com/metabrainz/musicbrainz-docker/…
      • 2020-06-25 17705, 2020

      • JoshDi
        :) why I am posting these questions right about now
      • 2020-06-25 17724, 2020

      • ruaok
        heh.
      • 2020-06-25 17722, 2020

      • JoshDi
        yvanzo can you share a doc or a brief description of what all the settings in the indexer.ini do? here is my latest config: https://github.com/metabrainz/musicbrainz-docker/…
      • 2020-06-25 17748, 2020

      • JoshDi
        I have 128gb of ram and 24 cores / 48 threads
      • 2020-06-25 17717, 2020

      • ruaok
        daaaaamn.
      • 2020-06-25 17740, 2020

      • ruaok
        most people who show up here are like "I got loads and loads of ram on my VM. I gave it 2GB!"
      • 2020-06-25 17702, 2020

      • JoshDi
        lol yea
      • 2020-06-25 17714, 2020

      • chaban
      • 2020-06-25 17716, 2020

      • JoshDi
        I used an intel QL1K with 4x 32gb Samsung DDR 4 ECC ram. I also have a 70TB RAID6 with cachecade in front, and a 1TB bcache as well
      • 2020-06-25 17743, 2020

      • JoshDi
        pretty proud of this machine. built the whole thing (without the 70TB) for about 1K
      • 2020-06-25 17744, 2020

      • ruaok
        bigger than anything we have in production, lol
      • 2020-06-25 17750, 2020

      • JoshDi
        same with my actual job lol
      • 2020-06-25 17706, 2020

      • JoshDi
        these intel ES processors are great if you know how to modify bios files to get them to work
      • 2020-06-25 17751, 2020

      • JoshDi
        if anyone is interested, I can help you build one. this processor as an ES version is only about 200-300 on ebay
      • 2020-06-25 17735, 2020

      • JoshDi
        ruaok ah didn't realize until now - hi Rob! good speaking with you this morning
      • 2020-06-25 17745, 2020

      • ruaok
        hehehehehehehe. :)
      • 2020-06-25 17706, 2020

      • JoshDi
        was I pulling a lot of data via my token?
      • 2020-06-25 17733, 2020

      • JoshDi
        how often can I set my cron to pull replication packets? I was thinking about changing it to every 3 hours instead of every 24 hours
      • 2020-06-25 17742, 2020

      • ruaok
        naw. the only thing that got me interested was your data usage description "for replication" or somesuch.
      • 2020-06-25 17758, 2020

      • ruaok
        as you see fit, it matters not to us.
      • 2020-06-25 17736, 2020

      • JoshDi
        understandable :)
      • 2020-06-25 17743, 2020

      • chaban
      • 2020-06-25 17716, 2020

      • sumedh has quit
      • 2020-06-25 17729, 2020

      • ruaok
        iliekcomputers: ping
      • 2020-06-25 17703, 2020

      • sumedh joined the channel
      • 2020-06-25 17710, 2020

      • ruaok
        I've written the mogrifier and it writes one large file with all the spark formatted listens.
      • 2020-06-25 17723, 2020

      • ruaok
        the previous dump breaks them into users. is that still needed?
      • 2020-06-25 17737, 2020

      • iliekcomputers
        ruaok: the user grouping isn't necessary. The directory structure (year/month.listens) is important though
      • 2020-06-25 17755, 2020
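
A rough sketch of the `year/month.listens` layout mentioned above. The details are assumptions for illustration, not the actual ListenBrainz dump code: `listens_path` and `write_listens` are hypothetical helpers, each listen is assumed to be a dict with a Unix `listened_at` timestamp, files are assumed to hold one JSON document per line, and the month is assumed not to be zero-padded.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def listens_path(base, listened_at):
    """Map a Unix timestamp to <base>/<year>/<month>.listens (UTC)."""
    when = datetime.fromtimestamp(listened_at, tz=timezone.utc)
    return Path(base) / str(when.year) / f"{when.month}.listens"


def write_listens(base, listens):
    """Append each listen as one JSON line to its year/month file."""
    written = set()
    for listen in listens:
        path = listens_path(base, listen["listened_at"])
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(listen) + "\n")
        written.add(path)
    return sorted(written)


if __name__ == "__main__":
    # Made-up sample listens, used only to show the resulting layout.
    sample = [
        {"listened_at": 1593043200, "track_name": "example one"},
        {"listened_at": 1577836799, "track_name": "example two"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_listens(tmp, sample):
            print(path.relative_to(tmp))
```

Note that nothing here groups by user: every listen lands in the file for its year and month, whoever submitted it.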

      • ruaok
        great.
      • 2020-06-25 17713, 2020

      • ruaok
        that is how the dumps are structured now, I'll just follow that method then.
      • 2020-06-25 17714, 2020

      • reosarevok
        yvanzo: maybe we should still hotfix? :p https://community.metabrainz.org/t/edits-for-your…
      • 2020-06-25 17720, 2020

      • reosarevok
        (linked by chaban above)
      • 2020-06-25 17734, 2020

      • ruaok
        also, are we able to generate a dump without a spark dump yet?
      • 2020-06-25 17741, 2020

      • ruaok
        I'm now blocked on that.
      • 2020-06-25 17727, 2020

      • ruaok
      • 2020-06-25 17731, 2020

      • ruaok
        like dat?
      • 2020-06-25 17755, 2020

      • ruaok
        written to the filesystem ok? or do you want tar/tar.xz?
      • 2020-06-25 17719, 2020

      • iliekcomputers
        Tar.xz
      • 2020-06-25 17727, 2020

      • ruaok
        k
      • 2020-06-25 17738, 2020

      • ruaok
        wait, really?
      • 2020-06-25 17759, 2020

      • ruaok
        xz is dog slow and I anticipate running this script *right* before an import to spark.
      • 2020-06-25 17726, 2020

      • ruaok
        just tar would make one file without the wait for compression/decompression.
      • 2020-06-25 17701, 2020
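
The plain tar versus tar.xz trade-off can be tried out with Python's standard `tarfile` module. This is a small sketch under assumptions: `archive` is a hypothetical helper, and the sample directory and its contents are made up for the demonstration.

```python
import tarfile
import tempfile
from pathlib import Path


def archive(source_dir, target, compress=False):
    """Pack source_dir into target; xz-compress only when asked."""
    mode = "w:xz" if compress else "w"  # "w" writes an uncompressed tar
    with tarfile.open(str(target), mode) as tar:
        tar.add(str(source_dir), arcname=Path(source_dir).name)
    return Path(target)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Made-up dump directory with highly repetitive content.
        dump = Path(tmp) / "dump"
        (dump / "2020").mkdir(parents=True)
        (dump / "2020" / "6.listens").write_text('{"listened_at": 1593043200}\n' * 1000)
        plain = archive(dump, Path(tmp) / "dump.tar")
        packed = archive(dump, Path(tmp) / "dump.tar.xz", compress=True)
        print("tar bytes:", plain.stat().st_size, "tar.xz bytes:", packed.stat().st_size)
```

The uncompressed mode still produces a single file, but skips the compression pass on write and the decompression pass on read; the cost is a larger archive.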

      • iliekcomputers
        We could work with just filesystem tbh
      • 2020-06-25 17714, 2020

      • iliekcomputers
        Would just need some changes on the importer, which I'm happy to take
      • 2020-06-25 17740, 2020

      • iliekcomputers
        For just dumping the full dump, need to remove this line: https://github.com/metabrainz/listenbrainz-server…
      • 2020-06-25 17742, 2020

      • ruaok
        I think that makes sense to me -- a little bit of easy programming in exchange for speed.
      • 2020-06-25 17703, 2020

      • iliekcomputers
        Sgtm too
      • 2020-06-25 17724, 2020

      • iliekcomputers
        ishaanshah: I'll have to push our meeting by 30 minutes today.
      • 2020-06-25 17743, 2020

      • ruaok
        given that you've been leading on managing deployment, can you please trigger a dump for me?
      • 2020-06-25 17703, 2020

      • ruaok
        that will be the last sanity check. if everything checks out we can do the first attempt at a real migration.
      • 2020-06-25 17736, 2020

      • iliekcomputers
        Is it possible to use one of the more recent ones?
      • 2020-06-25 17747, 2020

      • iliekcomputers
        We have one from this Sunday on ftp
      • 2020-06-25 17759, 2020

      • ruaok
        they won't have inserted_timestamps, will they?