#metabrainz

      • lucifer[m]
        mayhem: after many hours debugging issues with building popularity data, the problem turns out to be a casing issue in the _mbid field. spark doesn't have a uuid type, so it treats them as strings and compares them case-sensitively. i think we should automatically lowercase all UUIDs when accepting listens in LB; alternatively, we need to lowercase them at dump time or in each spark query.
      • mayhem[m]
        but we store them in PG which has a UUID type, which I would assume doesn't have a case, right?
      • lucifer[m]
        also, need to fix it for existing listens which is not fun but oh well
      • we store user submitted listen data (additional_info) as json
      • mayhem[m]
        oh, this is in the JSON field, which doesn't have a UUID type.
      • I think we should do both: convert to lower case when ingesting, but also do so when we use them for stats work.
      • lucifer[m]
        doing it when using them for stats work would likely make all queries slower.
      • if we do it at ingestion time, i think we should be fine, provided we also fix the listens already ingested.
      • mayhem[m]
        fun. well, we need to add unique ids to the listen table, so might as well do it then
      • lucifer[m]
        yeah, makes sense.
      • i'll update the popularity queries for now. open a ticket for the rest of the stuff.
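
A minimal sketch of the ingestion-time fix discussed above, not the actual ListenBrainz code. It first shows why the casing matters: two spellings of the same UUID differ as plain strings (which is how Spark sees them) but are equal once parsed as UUIDs. It then lowercases MBID values in a listen's additional_info dict. The function name `lowercase_mbids`, the key conventions (keys ending in `_mbid` hold a string, keys ending in `_mbids` hold a list of strings) and the sample values are assumptions made for illustration.

```python
import uuid

# Hypothetical MBID submitted in upper case, and its lower-case spelling.
upper = "8F3471B5-7E6A-48DA-86A9-C1C07A0F47AE"
lower = upper.lower()

# Compared as plain strings, the two spellings do not match.
print(upper == lower)                        # False
# Parsed as UUIDs, they are the same value.
print(uuid.UUID(upper) == uuid.UUID(lower))  # True


def lowercase_mbids(additional_info):
    """Return a copy of additional_info with MBID values lowercased.

    Assumption: keys ending in "_mbid" hold a single UUID string and
    keys ending in "_mbids" hold a list of UUID strings. Everything
    else is copied through unchanged.
    """
    cleaned = dict(additional_info)
    for key, value in additional_info.items():
        if key.endswith("_mbid") and isinstance(value, str):
            cleaned[key] = value.lower()
        elif key.endswith("_mbids") and isinstance(value, list):
            cleaned[key] = [
                item.lower() if isinstance(item, str) else item
                for item in value
            ]
    return cleaned
```

The same function could in principle be run over already-ingested listens as a one-off backfill, which is the second half of what is proposed in the conversation; how that would be batched against the listen table is not covered here.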
      • TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged and not empty as it is bridged to IRC; see https://musicbrainz.org/doc/ChatBrainz for details | Agenda: Reviews, Hetzner mainboard repl. (zas)