#metabrainz

      • D4RK-PH0ENiX has quit
      • 2019-10-22 29541, 2019

      • Sophist_UK joined the channel
      • 2019-10-22 29557, 2019

      • Sophist-UK has quit
      • 2019-10-22 29508, 2019

      • BrainzGit
        [musicbrainz-server] mwiencek opened pull request #1245 (master…pot-git-ls-files): Get po/Makefile prereqs with git-ls-files https://github.com/metabrainz/musicbrainz-server/…
      • 2019-10-22 29545, 2019

      • D4RK-PH0ENiX joined the channel
      • 2019-10-22 29550, 2019

      • weirditude joined the channel
      • 2019-10-22 29558, 2019

      • Sophist-UK joined the channel
      • 2019-10-22 29520, 2019

      • Sophist_UK has quit
      • 2019-10-22 29544, 2019

      • chhavi_ joined the channel
      • 2019-10-22 29542, 2019

      • yvanzo
        bitmap: iirc, there was an error with Tags message being duplicated.
      • 2019-10-22 29557, 2019

      • yvanzo
      • 2019-10-22 29512, 2019

      • travis-ci joined the channel
      • 2019-10-22 29512, 2019

      • travis-ci
        Project bookbrainz-data-js build #1282: passed in 2 min 15 sec: https://travis-ci.org/bookbrainz/bookbrainz-data-…
      • 2019-10-22 29512, 2019

      • travis-ci has left the channel
      • 2019-10-22 29558, 2019

      • jvoisin joined the channel
      • 2019-10-22 29513, 2019

      • jvoisin
        https://github.com/airsonic/airsonic/pull/1224 → Listenbrainz integration with airsonic :)
      • 2019-10-22 29503, 2019

      • weirditude has quit
      • 2019-10-22 29513, 2019

      • CatQuest
        guuyysss
      • 2019-10-22 29515, 2019

      • CatQuest
        ?
      • 2019-10-22 29530, 2019

      • CatQuest
        nvm carry on
      • 2019-10-22 29550, 2019

      • chaban joined the channel
      • 2019-10-22 29554, 2019

      • CatQuest has left the channel
      • 2019-10-22 29519, 2019

      • weirditude joined the channel
      • 2019-10-22 29503, 2019

      • CatQuest joined the channel
      • 2019-10-22 29508, 2019

      • weirditude has quit
      • 2019-10-22 29548, 2019

      • ruaok
        jvoisin: great. :)
      • 2019-10-22 29542, 2019

      • weirditude joined the channel
      • 2019-10-22 29538, 2019

      • weirditude has quit
      • 2019-10-22 29504, 2019

      • ruaok
        Freso: please fill out the doodle for the board meeting.
      • 2019-10-22 29545, 2019

      • weirditude joined the channel
      • 2019-10-22 29504, 2019

      • ruaok
        if anyone cares, the notes for my "beyond copyright" talk at the GSoC Mentor summit...
      • 2019-10-22 29557, 2019

      • weirditude has quit
      • 2019-10-22 29508, 2019

      • Pac23 has quit
      • 2019-10-22 29551, 2019

      • BrainzGit
        [musicbrainz-server] reosarevok merged pull request #1241 (master…MBS-10431): MBS-10431: Remove 'yourmusic' from 7digital URLs https://github.com/metabrainz/musicbrainz-server/…
      • 2019-10-22 29553, 2019

      • BrainzBot
        MBS-10431: Remove 'yourmusic' from 7digital URLs https://tickets.metabrainz.org/browse/MBS-10431
      • 2019-10-22 29545, 2019

      • Pac23 joined the channel
      • 2019-10-22 29525, 2019

      • chhavi_ has quit
      • 2019-10-22 29513, 2019

      • pristine__
        Moin
      • 2019-10-22 29549, 2019

      • ruaok
        moin!
      • 2019-10-22 29519, 2019

      • pristine__
        PR-O-doom coming soon...
      • 2019-10-22 29524, 2019

      • pristine__
        😂
      • 2019-10-22 29538, 2019

      • weirditude joined the channel
      • 2019-10-22 29556, 2019

      • weirditude has quit
      • 2019-10-22 29501, 2019

      • Gazooo has quit
      • 2019-10-22 29540, 2019

      • Gazooo joined the channel
      • 2019-10-22 29537, 2019

      • CatQuest
        pristine_: actually. I have a request/help from you
      • 2019-10-22 29548, 2019

      • CatQuest
        remember when you all linked music videos
      • 2019-10-22 29556, 2019

      • CatQuest
        I need a clip of a veena related music (video) that's a bit silly and/or/funny, especially if short and sweet^^HH silly
      • 2019-10-22 29502, 2019

      • CatQuest
        (other indians too of course)
      • 2019-10-22 29502, 2019

      • CatQuest
        AND, it needs to be free to use/link to
      • 2019-10-22 29527, 2019

      • Matthew_ joined the channel
      • 2019-10-22 29555, 2019

      • weirditude joined the channel
      • 2019-10-22 29522, 2019

      • pristine__
        Not sure if free :(
      • 2019-10-22 29532, 2019

      • Matthew_
        Hi. We are running a MusicBrainz Slave and have finally got around to replacing the Lucene search server with Solr and realtime indexing. We have started testing realtime indexing rates and we're not seeing a huge amount of throughput in terms of clearing the backlog on our queue. Can anyone give me any guidance as to what settings you have in SIR's config.ini in production? I did find this but I'm not sure it's up to date?
      • 2019-10-22 29532, 2019

      • Matthew_
      • 2019-10-22 29557, 2019

      • weirditude has quit
      • 2019-10-22 29527, 2019

      • ruaok
        Matthew_: hi!
      • 2019-10-22 29537, 2019

      • Matthew_
        Hi ruaok :)
      • 2019-10-22 29543, 2019

      • ruaok
        yvanzo: is the person who can help you.
      • 2019-10-22 29550, 2019

      • Matthew_
        Thanks!
      • 2019-10-22 29559, 2019

      • ruaok
        there are known problems with speed right now ... :(
      • 2019-10-22 29504, 2019

      • Matthew_
        Ah!
      • 2019-10-22 29521, 2019

      • CatQuest
        pristine__: this is beautiful!
      • 2019-10-22 29537, 2019

      • pristine__
        Thanks :(
      • 2019-10-22 29542, 2019

      • pristine__
        :) *
      • 2019-10-22 29507, 2019

      • Matthew_
        FWIW, we aren't able to use amqp functions in our Postgres instance as AWS doesn't allow them on RDS instances. We've forked SIR to push changes onto a database (a simple table) queue, which a SIR amqp publisher then pulls from the table and publishes to RabbitMQ. If yvanzo et al are agreeable to it, I may send a pull request your way: https://github.com/madebykite/sir
      • 2019-10-22 29542, 2019
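
The table-backed queue Matthew_ describes could be sketched roughly as follows. This is not the code from the fork: `sqlite3` stands in for Postgres, the `index_queue` table, its columns, and the `publish` callback are hypothetical names, and the real publisher would talk to RabbitMQ rather than call a Python function.

```python
import json
import sqlite3

# Hypothetical schema: sqlite3 stands in for Postgres, and the table and
# column names are invented for this sketch.
def create_queue(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS index_queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " routing_key TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )

def enqueue(conn, routing_key, payload):
    """Record a change in the queue table instead of calling an AMQP function."""
    conn.execute(
        "INSERT INTO index_queue (routing_key, body) VALUES (?, ?)",
        (routing_key, json.dumps(payload)),
    )
    conn.commit()

def drain(conn, publish, batch_size=100):
    """Publish queued rows in insertion order, deleting each one after it is sent."""
    rows = conn.execute(
        "SELECT id, routing_key, body FROM index_queue ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for row_id, routing_key, body in rows:
        publish(routing_key, body)  # stand-in for a RabbitMQ publish call
        conn.execute("DELETE FROM index_queue WHERE id = ?", (row_id,))
    conn.commit()
    return len(rows)
```

In this arrangement the database only needs ordinary tables, so no AMQP extension has to be installed on the managed instance; a separate process calls something like `drain` on a schedule.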

      • ruaok
        woah. why isn't it allowed? is there a good technical reason?
      • 2019-10-22 29527, 2019

      • CatQuest
        hmm though I don't know.. do you know of something a little. out of the ordinary.. i am looking for something maybe humorous. maybe silly
      • 2019-10-22 29535, 2019

      • Matthew_
        There are lots of extensions that AWS RDS (and other cloud providers) don't allow as they consider them security risks
      • 2019-10-22 29500, 2019

      • Matthew_
        For example, we can't install MB's collation extensions
      • 2019-10-22 29502, 2019

      • yvanzo
        Matthew_: here are our settings for SIR in production: https://gist.github.com/yvanzo/320510e798a84549af…
      • 2019-10-22 29512, 2019

      • Matthew_
        Thanks, yvanzo!
      • 2019-10-22 29523, 2019

      • ruaok
        I wonder how many "security risks" are actually "that would hurt our bottom line $$$" in reality.
      • 2019-10-22 29517, 2019

      • CatQuest
        welcome to not being allowed drink bottles on planes but you can buy them at the airport for excessive $
      • 2019-10-22 29523, 2019

      • Matthew_
        Possibly, although in the case of things like the perl extension, etc, I can understand why they don't want random code running on managed servers.
      • 2019-10-22 29529, 2019

      • CatQuest
        *allowed to bring drink bottles
      • 2019-10-22 29533, 2019

      • Matthew_
        Thanks for the settings, yvanzo. I'm seeing about 20 messages a second being processed now - a great improvement.
      • 2019-10-22 29511, 2019

      • weirditude joined the channel
      • 2019-10-22 29533, 2019

      • weirditude has quit
      • 2019-10-22 29516, 2019

      • BrainzGit
        [musicbrainz-server] yvanzo opened pull request #1246 (master…mbs-10439-react-collections-list): MBS-10439: Convert user collections list to React https://github.com/metabrainz/musicbrainz-server/…
      • 2019-10-22 29516, 2019

      • BrainzBot
        MBS-10439: Convert user collections list to React https://tickets.metabrainz.org/browse/MBS-10439
      • 2019-10-22 29520, 2019

      • weirditude joined the channel
      • 2019-10-22 29501, 2019

      • weirditude has quit
      • 2019-10-22 29537, 2019

      • weirditude joined the channel
      • 2019-10-22 29537, 2019

      • pristine__
        Hey ruaok if any time you prepare any file to store in HDFS, JSON is a good idea.... Other formats are a pain..... The commas in the list of artist mbids were difficult to process.
      • 2019-10-22 29541, 2019

      • pristine__
        Just a suggestion :)
      • 2019-10-22 29516, 2019

      • ruaok
        oh, duh. sorry.
      • 2019-10-22 29529, 2019

      • ruaok
        make me a lb-labs ticket please to improve that?
      • 2019-10-22 29540, 2019

      • pristine__
        Yeah. Sure.
      • 2019-10-22 29511, 2019

      • ruaok
        I started with CSV as a proof of concept and then forgot to switch to json. should be easy to fix.
      • 2019-10-22 29521, 2019

      • jvoisin has left the channel
      • 2019-10-22 29506, 2019

      • D4RK-PH0ENiX has quit
      • 2019-10-22 29546, 2019

      • pristine__
        ruaok: that's fine. I managed artist mapping with this
      • 2019-10-22 29547, 2019

      • pristine__
      • 2019-10-22 29559, 2019

      • ruaok
        pristine__: what would be better? one JSON doc per line or one document per file?
      • 2019-10-22 29500, 2019

      • pristine__
        recording mappings were first to process
      • 2019-10-22 29511, 2019

      • pristine__
        perfect to process*
      • 2019-10-22 29530, 2019

      • pristine__
        JSON doc per line if you mean one dict { } per line
      • 2019-10-22 29533, 2019

      • pristine__
        sorry
      • 2019-10-22 29501, 2019

      • pristine__
        also, spark does not have array datatype for csv :(
      • 2019-10-22 29520, 2019

      • ruaok
        ok, one JSON doc per line. coming up.
      • 2019-10-22 29528, 2019
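
To illustrate why the comma-separated list of artist MBIDs is awkward in CSV and why one JSON document per line avoids the problem, here is a small sketch using only the Python standard library. The record, its field names, and the MBID values are placeholders, not the actual mapping files.

```python
import csv
import io
import json

# Hypothetical record: the field names and MBID values are placeholders.
record = {
    "artist_name": "Example Artist",
    "artist_mbids": ["mbid-1", "mbid-2"],
}

# CSV has no list type, so the MBIDs must be flattened into one string,
# and the reader has to know to split it again.
csv_buf = io.StringIO()
csv.writer(csv_buf).writerow(
    [record["artist_name"], ",".join(record["artist_mbids"])]
)
csv_row = next(csv.reader(io.StringIO(csv_buf.getvalue())))
mbids_from_csv = csv_row[1].split(",")  # manual step, easy to forget

# JSON Lines: one dict per line, and the list survives the round trip.
jsonl_line = json.dumps(record)
mbids_from_json = json.loads(jsonl_line)["artist_mbids"]
```

With the JSON form the list comes back as a list without any format-specific splitting convention shared between writer and reader.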

      • pristine__
        No hurry :)
      • 2019-10-22 29553, 2019

      • pristine__
        I have uploaded both the mappings in HDFS. but good for future :)
      • 2019-10-22 29510, 2019

      • pristine__
        so ruaok we have a sql folder in labs
      • 2019-10-22 29516, 2019

      • ruaok
        I better do it now, get it out of the way.
      • 2019-10-22 29540, 2019

      • pristine__
        which has all the sql queries used in the four main scripts
      • 2019-10-22 29509, 2019

      • pristine__
        I will slowly delete that folder and use pyspark sql functions instead
      • 2019-10-22 29521, 2019

      • pristine__
        iliekcomputers had a different view then
      • 2019-10-22 29531, 2019

      • pristine__
        but it is best to use the functions
      • 2019-10-22 29540, 2019

      • pristine__
        It will reduce the amount of code
      • 2019-10-22 29520, 2019

      • pristine__
        It will reduce the amount of code delete, alter in spark
      • 2019-10-22 29531, 2019

      • pristine__
        because dfs are immutable
      • 2019-10-22 29507, 2019

      • pristine__
        so we have to use the functions at some point and it will be weird to use the functions half of the time and queries the remaining half. So yes...
      • 2019-10-22 29511, 2019

      • ruaok
        > iliekcomputers had a different view then
      • 2019-10-22 29516, 2019

      • ruaok
        what do you mean by that?
      • 2019-10-22 29507, 2019

      • pristine__
        He wanted to be consistent like and use sql queries everywhere, like we do in LB-server
      • 2019-10-22 29537, 2019

      • pristine__
        but that everywhere is not possible in spark
      • 2019-10-22 29557, 2019

      • pristine__
      • 2019-10-22 29504, 2019

      • pristine__
        the api is nice to use imo
      • 2019-10-22 29537, 2019

      • ruaok
        I see his point, but using the method native to the DB that is in use seems to make more sense to me.
      • 2019-10-22 29525, 2019

      • D4RK-PH0ENiX joined the channel
      • 2019-10-22 29540, 2019

      • pristine__
        which
      • 2019-10-22 29544, 2019

      • pristine__
        DB?
      • 2019-10-22 29548, 2019

      • pristine__
        postgres?
      • 2019-10-22 29512, 2019

      • ruaok
        storing text queries makes sense for postgres, but if it doesn't make sense for spark, let's use the spark native format.
      • 2019-10-22 29537, 2019

      • pristine__
        I understand but my point is that it won't be consistent. write queries sometimes and use the api sometimes which is difficult to follow
      • 2019-10-22 29538, 2019

      • pristine__
        also
      • 2019-10-22 29545, 2019

      • pristine__
        look at the steps in sql query
      • 2019-10-22 29556, 2019

      • pristine__
        1. create the data frame
      • 2019-10-22 29504, 2019

      • pristine__
        2. register the dataframe
      • 2019-10-22 29509, 2019

      • ruaok
        I think that non-consistency is fine, which is what I am saying.
      • 2019-10-22 29522, 2019

      • pristine__
        3. write query
      • 2019-10-22 29537, 2019

      • pristine__
        4. get the dataframe
      • 2019-10-22 29549, 2019

      • pristine__
        additional steps 3,4
      • 2019-10-22 29551, 2019

      • pristine__
        okay
      • 2019-10-22 29505, 2019

      • pristine__
        but I personally would favour using the API
      • 2019-10-22 29541, 2019

      • ruaok
        whichever you prefer -- just please document how to use it.
      • 2019-10-22 29550, 2019

      • pristine__
        sure :)
      • 2019-10-22 29551, 2019

      • D4RK-PH0ENiX has quit
      • 2019-10-22 29541, 2019

      • pristine__
        having said that ruaok you can totally ignore https://github.com/metabrainz/listenbrainz-labs/p…
      • 2019-10-22 29500, 2019

      • pristine__
        I will close it soon. No queries so no need to track view names
      • 2019-10-22 29529, 2019

      • ruaok
        ok
      • 2019-10-22 29553, 2019

      • pristine__
        I just want that anyone who comes after me to carry on should not jump here and there because they are not able to understand stuff and code:)
      • 2019-10-22 29504, 2019

      • ruaok
        <3
      • 2019-10-22 29501, 2019

      • alastairp
        I was looking at this PR recently, I was really surprised to see that the spark sql api has no positional parameter support :( on the internet it's really difficult to determine if this is an injection attack vector or not
      • 2019-10-22 29518, 2019

      • alastairp
        using native query api sounds like a good idea to reduce this problem
      • 2019-10-22 29535, 2019
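
The general concern behind alastairp's remark can be illustrated outside Spark. The sketch below uses Python's `sqlite3` purely as a stand-in, since it does support positional parameters; it says nothing about how Spark SQL itself behaves. The `listens` table, its rows, and the hostile input are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listens (user_name TEXT, track TEXT)")
conn.executemany(
    "INSERT INTO listens VALUES (?, ?)",
    [("alice", "Track A"), ("bob", "Track B")],
)

user_input = "nobody' OR '1'='1"  # hostile value supplied by a caller

# String formatting splices the input into the SQL text, so the OR clause
# becomes part of the query and every row matches.
unsafe_sql = "SELECT track FROM listens WHERE user_name = '%s'" % user_input
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# A positional parameter keeps the input as a plain value, so nothing matches.
safe_rows = conn.execute(
    "SELECT track FROM listens WHERE user_name = ?", (user_input,)
).fetchall()
```

When a query layer offers no parameter binding, queries end up being built by string formatting as in the first case, which is why a function-based API that takes values as arguments may reduce the exposure.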

      • iliekcomputers
        Sorry for the trouble! SQL was just simpler to write and understand at the time.
      • 2019-10-22 29548, 2019

      • D4RK-PH0ENiX joined the channel
      • 2019-10-22 29525, 2019

      • alastairp
        it was worth trying, no problem
      • 2019-10-22 29520, 2019

      • ruaok
        Mr_Monkey: you online today at all?