- 
    
        
    
        
        SothoTalKer has quit 
2020-10-01 27517, 2020
    - 
    
        
    
        
        davic joined the channel 
2020-10-01 27525, 2020
    - 
    
        
    
        
        kori has quit 
2020-10-01 27556, 2020
    - 
    
        
    
        
        d4rkie has quit 
2020-10-01 27539, 2020
    - 
    
        
    
        
        Nyanko-sensei joined the channel 
2020-10-01 27520, 2020
    - 
    
        
    
        
        kori joined the channel 
2020-10-01 27550, 2020
    - 
    
        
    
                    Lotheric 
2020-10-01 27558, 2020
    - 
    
        
    
        
        kori has quit 
2020-10-01 27527, 2020
    - 
    
        
    
        
        kori joined the channel 
2020-10-01 27547, 2020
    - 
    
        
    
        
        kori has quit 
2020-10-01 27551, 2020
    - 
    
        
    
        
        kori joined the channel 
2020-10-01 27516, 2020
    - 
    
        
    
        
        kori has quit 
2020-10-01 27521, 2020
    - 
    
        
    
        
        kori joined the channel 
2020-10-01 27504, 2020
    - 
    
        
    
        
        kori has quit 
2020-10-01 27529, 2020
    - 
    
        
    
        
        kori joined the channel 
2020-10-01 27503, 2020
    - 
    
        
    
        
        kori has quit 
2020-10-01 27540, 2020
    - 
    
        
    
        
        kori joined the channel 
2020-10-01 27513, 2020
    - 
    
        
    
        
        thomasross has quit 
2020-10-01 27502, 2020
    - 
    
        
    
        
        kori has quit 
2020-10-01 27504, 2020
    - 
    
        
    
        
        MajorLurker has quit 
2020-10-01 27550, 2020
    - 
    
        
    
                    _lucifer pristine___: ping 
2020-10-01 27523, 2020
    - 
    
        
    
        
        supersandro2000 has quit 
2020-10-01 27555, 2020
    - 
    
        
    
        
        kori joined the channel 
2020-10-01 27547, 2020
    - 
    
        
    
                    pristine___ _lucifer: pong 
2020-10-01 27520, 2020
    - 
    
        
    
                    _lucifer pristine___: i am getting java oom errors. what should i set driver memory for spark as? 
2020-10-01 27529, 2020
    - 
    
        
    
        
        kori has quit 
2020-10-01 27552, 2020
    - 
    
        
    
                    pristine___ 
2020-10-01 27552, 2020
    - 
    
        
    
                    pristine___ This might help. 
2020-10-01 27547, 2020
    - 
    
        
    
                    pristine___ You will have to calculate driver memory, executor memory and other configs based on your machine. Fun maths :p 
2020-10-01 27515, 2020
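The "fun maths" pristine___ refers to could look roughly like the sketch below. It is only an illustration, not the project's actual configuration: the function names, the 2 GB reserve for the operating system and the 50/50 driver/executor split are all assumptions made here. Only the property names `spark.driver.memory` and `spark.executor.memory` and the `spark-submit` options `--driver-memory` / `--executor-memory` are standard Spark.

```python
def suggest_spark_memory(total_ram_gb, reserve_gb=2.0, driver_fraction=0.5):
    """Hypothetical heuristic: split the RAM left after an OS reserve
    between the Spark driver and a single executor."""
    if total_ram_gb <= reserve_gb:
        raise ValueError("not enough RAM left after the OS reserve")
    usable_gb = total_ram_gb - reserve_gb
    # Round down to whole gigabytes, but never suggest less than 1g.
    driver_gb = max(int(usable_gb * driver_fraction), 1)
    executor_gb = max(int(usable_gb - driver_gb), 1)
    return {
        "spark.driver.memory": f"{driver_gb}g",
        "spark.executor.memory": f"{executor_gb}g",
    }


def as_submit_flags(conf):
    """Render the settings as spark-submit command-line options."""
    return (f"--driver-memory {conf['spark.driver.memory']} "
            f"--executor-memory {conf['spark.executor.memory']}")


if __name__ == "__main__":
    conf = suggest_spark_memory(8)  # the 8 GB laptops mentioned in this log
    print(as_submit_flags(conf))
```

The reserve and the split are knobs to tune per machine; the right values depend on what else is running and on how Spark is deployed.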
    - 
    
        
    
                    _lucifer lol ok :) 
2020-10-01 27525, 2020
    - 
    
        
    
                    _lucifer what are these value for your machine btw? 
2020-10-01 27525, 2020
    - 
    
        
    
                    pristine___ I use few MBs of data so doesn't matter :p 
2020-10-01 27547, 2020
    - 
    
        
    
                    _lucifer yeah right :| 
2020-10-01 27521, 2020
    - 
    
        
    
                    pristine___ You can ask ishaanshah, I guess he's also using full dumps. 
2020-10-01 27551, 2020
    - 
    
        
    
                    _lucifer ishaanshah: ping :) 
2020-10-01 27537, 2020
    - 
    
        
    
        
        kori joined the channel 
2020-10-01 27523, 2020
    - 
    
        
    
                    _lucifer pristine___: btw, have you tried out google colab? 
2020-10-01 27512, 2020
    - 
    
        
    
                    pristine___ Not yet 
2020-10-01 27530, 2020
    - 
    
        
    
                    pristine___ But say? What's on your mind? 
2020-10-01 27509, 2020
    - 
    
        
    
                    _lucifer i was thinking if we could set up a jupyter notebook for quickly experimenting with reca 
2020-10-01 27514, 2020
    - 
    
        
    
                    _lucifer recs. 
2020-10-01 27554, 2020
    - 
    
        
    
                    _lucifer using colab, we may be able to run workloads using k80 gpus so speed and memory will be less of an issue 
2020-10-01 27525, 2020
    - 
    
        
    
                    ishaanshah _lucifer: pong! 
2020-10-01 27559, 2020
    - 
    
        
    
                    _lucifer ishaanshah: hi, do you use full dumps or incremental dumps locally while working with spark? 
2020-10-01 27518, 2020
    - 
    
        
    
                    ishaanshah multiple incremental dumps, not full 
2020-10-01 27532, 2020
    - 
    
        
    
                    _lucifer ah ok 
2020-10-01 27552, 2020
    - 
    
        
    
                    _lucifer i am using that for listens too 
2020-10-01 27502, 2020
    - 
    
        
    
                    _lucifer but for the mapping a full dump 
2020-10-01 27512, 2020
    - 
    
        
    
                    ishaanshah yeah I used full for mapping too 
2020-10-01 27516, 2020
    - 
    
        
    
                    ishaanshah but got OOM 
2020-10-01 27527, 2020
    - 
    
        
    
                    _lucifer yeah same here 
2020-10-01 27541, 2020
    - 
    
        
    
                    _lucifer were you able to tweak the config to get it working? 
2020-10-01 27550, 2020
    - 
    
        
    
                    ishaanshah I have an 8G laptop 
2020-10-01 27558, 2020
    - 
    
        
    
                    ishaanshah and the mapping is 11G 
2020-10-01 27508, 2020
    - 
    
        
    
        
        kori has quit 
2020-10-01 27516, 2020
    - 
    
        
    
                    _lucifer i too have a 8 gig one 
2020-10-01 27536, 2020
    - 
    
        
    
                    ishaanshah so I dont think theres any way we can fix it, using a smaller dump would be better 
2020-10-01 27557, 2020
    - 
    
        
    
                    _lucifer yeah right that would certainly fix this 
2020-10-01 27558, 2020
    - 
    
        
    
                    pristine___ _lucifer: can do, next week when I am back in town 
2020-10-01 27509, 2020
    - 
    
        
    
                    _lucifer great! 
2020-10-01 27520, 2020
    - 
    
        
    
                    pristine___ _lucifer: told ya to not use full dump mapping :p 
2020-10-01 27535, 2020
    - 
    
        
    
                    _lucifer yeah you were right :) 
2020-10-01 27557, 2020
    - 
    
        
    
                    pristine___ Though I think we really need to have smaller dumps for dev 
2020-10-01 27521, 2020
    - 
    
        
    
                    _lucifer i have run it over and am monitoring to see where it fails 
2020-10-01 27535, 2020
    - 
    
        
    
                    pristine___ For better user experience. 
2020-10-01 27555, 2020
    - 
    
        
    
                    ishaanshah _lucifer: How much memory do we get for free on colab? 
2020-10-01 27500, 2020
    - 
    
        
    
                    pristine___ Can you open a ticket for smaller dumps _lucifer ? 
2020-10-01 27517, 2020
    - 
    
        
    
                    _lucifer ishaanshah: i was trying to find the same 
2020-10-01 27526, 2020
    - 
    
        
    
                    _lucifer pristine___: yeah sure will do that 
2020-10-01 27548, 2020
    - 
    
        
    
                    ishaanshah I am interested in having some kind of cloud testing env for spark 
2020-10-01 27502, 2020
    - 
    
        
    
                    _lucifer 12gig 
2020-10-01 27514, 2020
    - 
    
        
    
                    ishaanshah I used databricks, but it has limitations for free accounts 
2020-10-01 27520, 2020
    - 
    
        
    
                    ishaanshah and doesnt work for full dumps 
2020-10-01 27531, 2020
    - 
    
        
    
                    _lucifer yeah right 
2020-10-01 27510, 2020
    - 
    
        
    
                    ishaanshah > 12gig 
2020-10-01 27512, 2020
    - 
    
        
    
                    ishaanshah :( 
2020-10-01 27553, 2020
    - 
    
        
    
                    _lucifer how much do you get on databricks? 
2020-10-01 27502, 2020
    - 
    
        
    
                    ishaanshah 15G 
2020-10-01 27510, 2020
    - 
    
        
    
                    ishaanshah but limited storage 
2020-10-01 27515, 2020
    - 
    
        
    
                    ishaanshah so cant download the dumps 
2020-10-01 27535, 2020
    - 
    
        
    
                    _lucifer okay but that 12 does not include gpu 
2020-10-01 27524, 2020
    - 
    
        
    
                    _lucifer i'll see if i can find another alternative 
2020-10-01 27538, 2020
    - 
    
        
    
                    _lucifer ishaanshah: what about kaggle, 16 g + 30h gpu/week 
2020-10-01 27542, 2020
    - 
    
        
    
        
        kori joined the channel 
2020-10-01 27556, 2020
    - 
    
        
    
                    ishaanshah _lucifer: Hmm, looks promising, I haven't personally used kaggle though 
2020-10-01 27500, 2020
    - 
    
        
    
                    ishaanshah does it support spark? 
2020-10-01 27506, 2020
    - 
    
        
    
                    shivam-kapila GCP gives up to 26g on demand 
2020-10-01 27546, 2020
    - 
    
        
    
                    _lucifer ishaanshah: yup, pyspark is just like any other mllib. i just installed pyspark on pc and experimented with using python console 
2020-10-01 27505, 2020
    - 
    
        
    
                    _lucifer lol, gcp banned my account 
2020-10-01 27553, 2020
    - 
    
        
    
                    shivam-kapila Good 
2020-10-01 27500, 2020
    - 
    
        
    
                    ishaanshah _lucifer: ooh nice, let me know if you are able to run mapping on kaggle 
2020-10-01 27524, 2020
    - 
    
        
    
                    _lucifer yeah, will try and let you know :D 
2020-10-01 27546, 2020
    - 
    
        
    
                    ishaanshah I like to experiment with the queries before writing them for production, notebook-like environments are good for this 
2020-10-01 27555, 2020
    - 
    
        
    
                    shivam-kapila ishaanshah: you mentioned Zeppelin once 
2020-10-01 27513, 2020
    - 
    
        
    
                    shivam-kapila Wont it solve the issue if we have a Zeppelin layer in prod 
2020-10-01 27518, 2020
    - 
    
        
    
                    ishaanshah shivam-kapila: yes but it requires you to use your own PC 
2020-10-01 27531, 2020
    - 
    
        
    
                    shivam-kapila Ouch 
2020-10-01 27541, 2020
    - 
    
        
    
                    ishaanshah I dont have a powerful enough pc to join huge datasets 
2020-10-01 27551, 2020
    - 
    
        
    
                    ishaanshah we can add it to prod but its not an easy task 
2020-10-01 27552, 2020
    - 
    
        
    
                    shivam-kapila Mine is slower than yours 
2020-10-01 27555, 2020
    - 
    
        
    
                    _lucifer thats true for almost all of us 
2020-10-01 27510, 2020
    - 
    
        
    
                    shivam-kapila Yeah I saw Zeppelin integration 
2020-10-01 27518, 2020
    - 
    
        
    
                    shivam-kapila Its somewhat tedious 
2020-10-01 27531, 2020
    - 
    
        
    
                    shivam-kapila Anyways I think a smaller mapping is needed 
2020-10-01 27552, 2020
    - 
    
        
    
                    _lucifer also a listen dataset for that 
2020-10-01 27502, 2020
    - 
    
        
    
                    shivam-kapila Cloud isnt as flex 
2020-10-01 27520, 2020
    - 
    
        
    
                    _lucifer so that the listens are actually in the mapping and we can get meaningful results 
2020-10-01 27554, 2020
    - 
    
        
    
                    shivam-kapila yes that 
2020-10-01 27532, 2020
    - 
    
        
    
                    shivam-kapila ideally we can pick the latest 5 inc dumps and have corresponding mapping 
2020-10-01 27556, 2020
    - 
    
        
    
                    shivam-kapila IG thats enough 
2020-10-01 27556, 2020
    - 
    
        
    
                    _lucifer yeah makes sense 
2020-10-01 27527, 2020
    - 
    
        
    
                    ishaanshah 
2020-10-01 27532, 2020
    - 
    
        
    
                    _lucifer but getting that corresponding mapping can be hard 
2020-10-01 27556, 2020
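One way to picture the "corresponding mapping" discussed above is as a filter: keep only the mapping rows whose key shows up in the chosen listens. The sketch below uses plain Python lists of dicts to stay self-contained. The function name `shrink_mapping` and the field names `recording_msid` and `recording_mbid` are placeholders assumed for illustration, not the project's confirmed schema, and a real dump would need the same idea expressed as a Spark join rather than an in-memory set.

```python
def shrink_mapping(mapping_rows, listens, key="recording_msid"):
    """Keep only the mapping rows whose key occurs in at least one listen.

    `key` is a hypothetical field name shared by both datasets.
    """
    # Collect the distinct key values that actually occur in the listens.
    wanted = {listen[key] for listen in listens if listen.get(key) is not None}
    # Preserve the original order of the mapping rows.
    return [row for row in mapping_rows if row.get(key) in wanted]


if __name__ == "__main__":
    listens = [
        {"recording_msid": "a"},
        {"recording_msid": "b"},
        {"recording_msid": "a"},
    ]
    mapping = [
        {"recording_msid": "a", "recording_mbid": "x"},
        {"recording_msid": "c", "recording_mbid": "y"},
    ]
    print(shrink_mapping(mapping, listens))
```

Filtering this way guarantees every kept mapping row is referenced by some listen, though listens without a mapping entry (like `b` in the demo) stay unmatched.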
    - 
    
        
    
                    shivam-kapila dunno think o 
2020-10-01 27512, 2020
    - 
    
        
    
                    _lucifer ishaanshah: thanks, i was just going to write these myself. a lot of time saved :D 
2020-10-01 27526, 2020
    - 
    
        
    
                    ishaanshah :D 
2020-10-01 27538, 2020
    - 
    
        
    
                    shivam-kapila theres a dedicated spark extension for jupyter notebook 
2020-10-01 27536, 2020
    - 
    
        
    
                    _lucifer nice! 
2020-10-01 27535, 2020
    - 
    
        
    
        
        Nyanko-sensei has quit 
2020-10-01 27535, 2020
    - 
    
        
    
        
        _lucifer has quit 
2020-10-01 27535, 2020
    - 
    
        
    
        
        leonardo has quit 
2020-10-01 27535, 2020
    - 
    
        
    
        
        imdeni has quit 
2020-10-01 27535, 2020
    - 
    
        
    
        
        mruszczyk has quit 
2020-10-01 27535, 2020
    - 
    
        
    
        
        diru1100 has quit 
2020-10-01 27535, 2020
    - 
    
        
    
        
        reg[m] has quit 
2020-10-01 27535, 2020
    - 
    
        
    
        
        joshuaboniface has quit 
2020-10-01 27535, 2020
    - 
    
        
    
        
        djinni` has quit 
2020-10-01 27555, 2020
    - 
    
        
    
        
        rdswift_ joined the channel 
2020-10-01 27546, 2020
    - 
    
        
    
        
        rdswift has quit 
2020-10-01 27550, 2020
    - 
    
        
    
        
        rdswift_ is now known as rdswift 
2020-10-01 27526, 2020
    - 
    
        
    
        
        testfreenode joined the channel 
2020-10-01 27504, 2020
    - 
    
        
    
        
        _lucifer joined the channel 
2020-10-01 27544, 2020
    - 
    
        
    
                    _lucifer pristine___: ishaanshah took one hour but request dataframes completed successfully so issue is not with the mapping 
2020-10-01 27504, 2020
    - 
    
        
    
        
        testfreenode has quit 
2020-10-01 27526, 2020
    - 
    
        
    
                    pristine___ Dataframes created in an hour? 
2020-10-01 27537, 2020
    - 
    
        
    
                    _lucifer yeah 
2020-10-01 27512, 2020
    - 
    
        
    
                    ishaanshah _lucifer: on kaggle or on local dev? 
2020-10-01 27525, 2020
    - 
    
        
    
                    _lucifer local 
2020-10-01 27537, 2020
    - 
    
        
    
                    _lucifer ok my bad it was 2 hours 
2020-10-01 27550, 2020
    - 
    
        
    
                    _lucifer but successful 
2020-10-01 27552, 2020
    - 
    
        
    
                    ishaanshah Oh the full mapping worked? 
2020-10-01 27506, 2020
    - 
    
        
    
                    ishaanshah What changes did you make? 
2020-10-01 27510, 2020
    - 
    
        
    
                    ishaanshah To the config 
2020-10-01 27511, 2020
    - 
    
        
    
                    _lucifer none 
2020-10-01 27538, 2020
    - 
    
        
    
                    ishaanshah You said you got an OOM at first right? 
2020-10-01 27540, 2020
    - 
    
        
    
                    _lucifer i too had thought the issue was mapping but i had issued all commands the last time 
2020-10-01 27553, 2020
    - 
    
        
    
                    _lucifer this time i am running all commands one by one as they complete 
2020-10-01 27507, 2020
    - 
    
        
    
                    _lucifer there are three left one of which should be the culprit 
2020-10-01 27516, 2020
    - 
    
        
    
                    ishaanshah Oh 
2020-10-01 27534, 2020
    - 
    
        
    
                    ishaanshah I dont know why it ran out of memory when I did it