#metabrainz

      • reosarevok
        I guess one for each might make sense
      • alastairp
        I would even consider doing all of the above
      • reosarevok
        It's just that "folk metal" being tagged "rock" feels odd
      • _lucifer
        white_shadow: yes, we are, but currently there are fewer ui ones. also, the ui ones need to be run locally on an emulator or a physical device, as they cannot run on ci. so it's still a work in progress
      • alastairp
        here's a sample data row:
      • reosarevok
        And I can assure you grime being tagged "hip hop" will get downvoted by grime people :D
      • alastairp
        8d1ae2a5-e698-41ae-aff5-cb114be0dce6    b6009057-70a9-3ffe-b74e-744e83ca5ede    pop    pop---ballad    pop---vocal    rock    rock---pop rock    stage & screen    stage & screen---soundtrack
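A minimal sketch of how a row like this could be split into genres and genre/subgenre pairs. It assumes the fields are tab-separated (the files are referred to as tsv later in the log), that the first two fields are MBIDs, and that `---` separates a genre from its subgenre. The name `parse_row` and the key `releasegroup_mbid` are hypothetical; the log does not say what the second ID refers to.

```python
GENRE_SEP = "---"  # assumed separator between genre and subgenre


def parse_row(line):
    """Split one dataset row into its IDs, genres, and (genre, subgenre) pairs."""
    fields = line.rstrip("\n").split("\t")
    recording_mbid, releasegroup_mbid, *labels = fields
    genres = []
    subgenres = []
    for label in labels:
        if GENRE_SEP in label:
            genre, subgenre = label.split(GENRE_SEP, 1)
            subgenres.append((genre, subgenre))
        else:
            genres.append(label)
    return {
        "recording_mbid": recording_mbid,
        # Assumption: the meaning of the second ID is a guess.
        "releasegroup_mbid": releasegroup_mbid,
        "genres": genres,
        "subgenres": subgenres,
    }


row = "\t".join([
    "8d1ae2a5-e698-41ae-aff5-cb114be0dce6",
    "b6009057-70a9-3ffe-b74e-744e83ca5ede",
    "pop", "pop---ballad", "pop---vocal",
    "rock", "rock---pop rock",
    "stage & screen", "stage & screen---soundtrack",
])
parsed = parse_row(row)
print(parsed["genres"])
print(parsed["subgenres"])
```

Only labels that are explicitly present in the row are returned; no parent genre is inferred from a subgenre.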
      • yeah, right
      • so, we have a really neat algorithm for automatically inferring genre hierarchy
      • reosarevok
        But I guess that is what up and downvote is for, too
      • alastairp
        it looks at tag cooccurrences, if pop always occurs with rock, but rock doesn't always occur with pop, you can infer that pop is a subgenre of rock
      • (ok, bad example, because many people think that they're distinct, but you get the point hopefully)
      • white_shadow
        ok great!
      • reosarevok
        Sure
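The co-occurrence idea alastairp describes can be sketched roughly as follows. This is not the actual algorithm used for the datasets: the function name `infer_subgenres`, the thresholds, and the toy tag sets are all assumptions made for illustration. A tag A is treated as a subgenre of B when B appears on nearly every recording tagged A, while A appears on only some of the recordings tagged B.

```python
from collections import Counter
from itertools import combinations


def infer_subgenres(tag_sets, min_conf=0.9, max_reverse=0.5, min_count=2):
    """Return sorted (child, parent) pairs inferred from tag co-occurrence.

    min_conf, max_reverse and min_count are illustrative thresholds,
    not values taken from the real pipeline.
    """
    tag_count = Counter()
    pair_count = Counter()
    for tags in tag_sets:
        tags = set(tags)
        tag_count.update(tags)
        for a, b in combinations(sorted(tags), 2):
            pair_count[(a, b)] += 1

    pairs = []
    for (a, b), both in pair_count.items():
        for child, parent in ((a, b), (b, a)):
            if tag_count[child] < min_count:
                continue  # too few examples to trust
            p_parent_given_child = both / tag_count[child]
            p_child_given_parent = both / tag_count[parent]
            if p_parent_given_child >= min_conf and p_child_given_parent <= max_reverse:
                pairs.append((child, parent))
    return sorted(pairs)


# Made-up tag sets, one per recording.
recordings = [
    {"rock", "folk metal"},
    {"rock", "folk metal"},
    {"rock", "pop rock"},
    {"rock", "pop rock"},
    {"rock"},
    {"rock"},
]
print(infer_subgenres(recordings))
```

Whether an inferred relationship reflects a real genre hierarchy is a separate question, which is where voting comes in.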
      • alastairp
        in that case, having something like hiphop is "useful", because it helps to define structure
      • e.g. people are welcome to use the most specific or least specific tag that they can
      • and as you say, voting is useful
      • reosarevok
        Yes, the question is whether this structure is real :)
      • Hmm
      • "*** mb genre: instrumental other genre hip hop/instrumental (100)"
      • alastairp
        yeah, that was an interesting one that came up a number of times
      • https://www.discogs.com/genre/hip+hop -> Related Styles of Music
      • reosarevok
        Also I would like to avoid a tag "folk, world, & country", although I guess you could only add the parent if it itself matches a genre
      • alastairp
        right, again this comes with the explicit grouping of categories
      • so it sounds like you think that some main categories aren't really "genres" as such
      • reosarevok
        Well, instrumental there seems to be explicitly about instrumental hip hop, heh
      • So it probably should match to that more specifically
      • alastairp
        yes, it is explicitly about that, but as it's part of their schema, it makes sense to talk about it in that case
      • I guess the additional question is what's the purpose of this matching process - e.g. imagine if I just went and added these tags. could we just rely on voting to "fix" them?
      • reosarevok
        We are currently missing that as a genre but we should probably add it
      • We kinda sorta could
      • alastairp
        are we interested in changing a few tags (e.g. the bop/bebop thing) when it makes sense?
      • reosarevok
        But I would try to make sure that to some degree we do follow the intent of the other source
      • alastairp
        so, some additional information here:
      • reosarevok
        (to avoid the bot user claiming that say discogs claims X is instrumental when they claim instrumental hip hop)
      • alastairp
        the lastfm and tagtraum sources are folksonomy tags too, we inferred the hierarchy - as I described above
      • the discogs one has this explicit genre/style split
      • right, got it
      • so for lfm and tagtraum, both genre/subgenre were originally tags by someone
      • reosarevok
        Mhm
      • alastairp
        perhaps we could special-case discogs in a few cases where needed
      • reosarevok
        Yeah, I think that seems sensible
      • alastairp
        let me run this matching with the other datasources and see if that's more interesting/useful
      • thanks for the ideas
      • reosarevok
        And for lfm and tagtraum, the genre will only be there if someone had tagged it?
      • Or will it sometimes be assumed from the subgenre?
      • alastairp
        ah, great question
      • reosarevok
        If a) then I guess that would be sensible
      • To just accept it for now and let voting cope
      • alastairp
        so, it's possible that we could infer a relationship between a genre and a subgenre tag using a bunch of tracks, but then for some other track, someone only tagged it with the subgenre
      • in that case, I believe that our dataset only lists the subgenre
      • however, when we did the recognition contest, we accepted the genre as a correct estimate
      • reosarevok
        In that case I would feel more or less comfortable letting it be added from the dataset
      • alastairp
        oh, one other thing - discogs annotations are per album, others are per track
      • reosarevok
        In general, given each bot gets only one vote, it is a pretty safe option
      • alastairp
        for the dataset, we propagate the same tags to all recordings in the releasegroup. but since we can do release/rg tags in mb, maybe we should just do that?
      • reosarevok
        Even if we mess something up it is unlikely that we would have a big problem because we would need to get several datasets wrong for it to be hard to downvote
      • Yeah, absolutely. Ideally, we could also do it so that if the same genre is applied to all (or most?) tracks in a release, then we apply it to the release
      • For when we have per track info
      • alastairp
        can you add that to the ticket? :)
      • reosarevok
        (I expect release/rg genres to be more useful and visible than the recording level ones)
      • Sure
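reosarevok's suggestion of applying a genre to the release when all (or most) of its tracks carry it could look something like this. The function name `release_level_tags` and the `min_fraction` threshold are hypothetical; the log leaves open what "most" should mean.

```python
from collections import Counter


def release_level_tags(track_tags, min_fraction=0.8):
    """Return tags present on at least min_fraction of a release's tracks.

    track_tags is a list with one collection of tags per track.
    min_fraction=1.0 means "all tracks"; lower values mean "most".
    """
    if not track_tags:
        return []
    counts = Counter()
    for tags in track_tags:
        counts.update(set(tags))  # count each tag once per track
    total = len(track_tags)
    return sorted(tag for tag, n in counts.items() if n / total >= min_fraction)


tracks = [
    {"rock", "pop rock"},
    {"rock"},
    {"rock", "pop rock"},
    {"rock"},
    {"rock", "soundtrack"},
]
print(release_level_tags(tracks))
print(release_level_tags(tracks, min_fraction=0.3))
```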
      • alastairp
        OK, what I'm going to do is: 1) perform this matching with lastfm and tagtraum, assuming that the matches will be a bit better because they were all originally tags, 2) unless there are any major weird results from the matching, apply both the genre and subgenre as a tag directly to the recording
      • reosarevok
        If they are on the dataset?
      • alastairp
        3) take a bit more of a look at discogs, to see if the more "umbrella genres" should be removed, and if some subgenre/styles should be disambiguated
      • what do you mean?
      • reosarevok
        You said sometimes you only have subgenre but not genre if they originally were only tagged subgenre
      • ishaanshah
        iliekcomputers do we have our daily meet today?
      • reosarevok
        That is what I meant there
      • ishaanshah
        Daily->weekly
      • reosarevok
        Also, one exception that might be fun is the tag "romantic" - if you considered that a genre of classical
      • alastairp
        oh yes, right. will only apply explicit tags from the dataset, not implicit tags
      • reosarevok
        But I guess maybe you did not
      • iliekcomputers
        ishaanshah: I won't be able to make it, let's do it tomorrow.
      • alastairp
        ack romantic acousticbrainz-mediaeval-discogs-*tsv | grep -v classical | wc -l -> 0
      • lastfm -> rock---newromantic
      • but otherwise, there are no other 'romantic' genres in any dataset that aren't part of classical
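The same check as the `ack ... | grep -v classical | wc -l` pipeline can be written in Python. This sketch works on a list of lines rather than on the real files, and the sample lines and IDs are invented; `count_without_parent` is a hypothetical name.

```python
def count_without_parent(lines, needle="romantic", parent="classical"):
    """Count lines that mention needle but never mention parent."""
    return sum(1 for line in lines if needle in line and parent not in line)


# Invented sample lines in the genre---subgenre style.
sample = [
    "id-1\tclassical\tclassical---romantic",
    "id-2\trock\trock---newromantic",
    "id-3\tpop\tpop---ballad",
]
print(count_without_parent(sample))
```

Like the shell pipeline, this is a plain substring match, so `newromantic` counts as a hit for `romantic`.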
      • btw: 2.3 million recordings, potentially more than 1 tag per recording :)
      • reosarevok
        But did "romantic" only get accepted when "classical" was also there?
      • If not, you risk having Kiss from a Rose tagged there :p
      • alastairp
        yeah, that might have been the case. we also did a bunch of filtering
      • ishaanshah
        iliekcomputers: ok sure
      • reosarevok
        (even if yes, you risk having classical music that people consider romantic but unconnected with romantic classical there)
      • alastairp
        especially when there weren't many instances of it, or the inferred genre/subgenre relationship wasn't very strong
      • reosarevok
        Did I ever mention I hate words?
      • I hate words :D
      • alastairp
        yeah, I see where you're coming from
      • it's easier to just have the opinion that everyone else is wrong
      • that helps
      • reosarevok
        I mean, I have the opinion that everyone is wrong, me included
      • At least past me! Sometimes present me too
      • Not sure that helps, but :D
      • ruaok
        shivam-kapila: how goes?
      • I'm dying to see what you've got.
      • BrainzGit
        [musicbrainz-android] SomalRudra opened pull request #51 (master…fixes_before_release): tag icon generated https://github.com/metabrainz/musicbrainz-andro...
      • ruaok
        reosarevok: I think I just found the date of "1970-11-31" in the table for a release. :/
      • reosarevok
        hah
      • Which one?
      • ruaok
        I can't find it. must be my code then. :p
      • > psycopg2.errors.DatetimeFieldOverflow: date/time field value out of range: "1970-11-31"
      • oh, look!
      • I made an assumption about a date. ha!
      • dumbass. wrong again. #shouldknowbetter
      • reosarevok: does the MB codebase have something like this, but without bugs?
      • > to_date(coalesce(date_year, 9999)::TEXT || '-' || coalesce(date_month, 12)::TEXT || '-' || coalesce(date_day, 31)::TEXT, 'YYYY-MM-DD')
      • that begs the question, what was I trying to do?
      • reosarevok
        We do have a similar case:
      • '"begin_date":{"month":' || COALESCE(begin_date_month::text, 'null'::text) || ',"day":' || COALESCE(begin_date_day::text, 'null'::text) || ',"year":' || COALESCE(begin_date_year::text, 'null'::text) || '},' ||
      • That builds an object, but
      • I guess that wouldn't work with to_date
      • ruaok
        changing from end of unspecified element to begin of unspecified element should amount to the same in this case.
      • so, I've got an easy out. thanks!
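The bug above comes from defaulting a missing day to 31, which is invalid for months such as November. A rough Python sketch of both options follows: clamping to the real last day of the month, or defaulting missing parts to the beginning, which is always valid. The function names and the default year of 1 for the "begin" case are assumptions; only the 9999/12/31 defaults come from the query quoted in the log.

```python
import calendar
from datetime import date


def partial_date_end(year=None, month=None, day=None):
    """Latest date consistent with a partial (year, month, day)."""
    y = year if year is not None else 9999
    m = month if month is not None else 12
    # monthrange returns (weekday of the 1st, number of days in the month).
    d = day if day is not None else calendar.monthrange(y, m)[1]
    return date(y, m, d)


def partial_date_begin(year=None, month=None, day=None, default_year=1):
    """Earliest date consistent with a partial date; day 1 always exists."""
    y = year if year is not None else default_year  # default_year is an assumption
    m = month if month is not None else 1
    d = day if day is not None else 1
    return date(y, m, d)


print(partial_date_end(1970, 11))
print(partial_date_begin(1970, 11))
```

If the stored day itself is invalid for its month, `date()` still raises `ValueError`, so bad source data is surfaced rather than silently corrected.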
      • > psycopg2.errors.DiskFull: could not resize shared memory segment "/PostgreSQL.511541481" to 134217728 bytes: No space left on device
      • sigh.
      • bitmap: would it be possible for me to restart postgres-williams ?
      • bitmap
        ruaok: that's fine, lemme just check if a json dump is running
      • nope, ok to restart then
      • ruaok
        thx, not sure it will help, but I'll try. :)
      • bitmap
        perhaps we need to bump shm-size on the container like we did on floyd?
      • ruaok
        what was it bumped to?
      • but that may need to be the case.
      • but now that I restarted the DB, I cannot connect to it again. 2 step...
      • bitmap
        on floyd it was bumped to 8GB, but I think it's 1GB here
      • ruaok
        yeah, I just did that. but... I wonder if pg-bouncer is now somehow unhappy?
      • if I enter the PG container on williams I can access the DB.
      • but my container connects, but gets DB not found error.
      • any idea what needs to be kicked?
      • bitmap
        hrm, sometimes consul-template doesn't render pgbouncer.ini correctly :\
      • I just did `sv restart pgbouncer`, does it work now?
      • ruaok
        it magically started doing its thing!
      • <3 bitmap
      • bitmap
        yay
      • ruaok
        I just increased the shared amount to like 1100MB because 1100MB is > 1024MB, which is what the SHM was trying to do.
      • let's see if the query succeeds.
      • I think the shm size should be configurable from the per node script -- what do you think?
      • (pass in one more arg)
      • yeah, no that wasn't it. it just needs more space.