#metabrainz

      • Shubh joined the channel
      • monotux has quit
      • monotux joined the channel
      • BrainzGit
        [musicbrainz-server] yvanzo merged pull request #2458 (master…mbs-12258-update-if-needed): MBS-12258: Skip updating up-to-date containers https://github.com/metabrainz/musicbrainz-serve...
      • reosarevok
        yvanzo: I was thinking about MBS-12273
      • BrainzBot
        MBS-12273: Improve misleading collection edits header https://tickets.metabrainz.org/browse/MBS-12273
      • reosarevok
        And I get the feeling the best option would actually be to just rename the collection type names from "Artist" to "Artist collection", etc
      • To match what we do with series, which is already "Recording series", "Release series", etc.
      • Would that cause a mess in sir? Do we currently index collections at all?
      • I mean, I could also append (collection) to the headers, but just renaming seems simpler and better to me tbh :)
      • cuanim joined the channel
      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #2464 (master…MBS-9234): MBS-9234: Don't select a default type when adding new series https://github.com/metabrainz/musicbrainz-serve...
      • yyoung[m] has quit
      • mayhem
        moooin!
      • BrainzGit
        [musicbrainz-server] reosarevok merged pull request #2440 (master…MBS-12227): MBS-12227: Don’t include spammer editors in "valid" statistics https://github.com/metabrainz/musicbrainz-serve...
      • mayhem
        alastairp: ping me when you're ready to work on the metadata viewer stuff. turns out I have to solve the following problems today: canonical recordings (✅), canonical releases (easy), recording to canonical release (not too hard). Then I can add release-focused stuff to the metadata cache.
      • nothing like cranking out 3 useful datasets so I can provide freaking coverart to the viewer.
      • cuanim has quit
      • cuanim joined the channel
      • monkey
        Wee !
      • To be fair we do need other recording info, and the viewer would not look right without cover art :)
      • mayhem
        not complaining -- these things need to get done and yes no cover art is a total deal breaker.
      • milosh has quit
      • texke has quit
      • canonical releases data now computing.
      • texke joined the channel
      • alastairp
        mayhem: morning. I'm here today
      • alastairp opens metadata doc
      • mayhem
        moin!
      • given where I am at and a sizable project that needs doing, it might be good for you to work on the auto update feature of the mb_metadata cache.
      • alastairp
        monkey: https://sebastienlorber.com/records-and-tuples-... this was an interesting read
      • mayhem
        I'm going to be busy generating the release data today and likely tomorrow.
      • alastairp
        mayhem: right - reading the replication packets?
      • mayhem
        yes, but not that low level. :)
      • the process, as I envision it:
      • 1. When the data gets generated, save a last-checked timestamp.
      • 2. Then for an hourly cron job, wake up, read last checked timestamp.
      • 3. Fetch MBIDs for artists, recordings and releases that have changed since the last run.
      • 4. Mark these MBIDs as dirty.
      • 5. Fetch all rows that are marked as dirty and re-fetch the data, unsetting the dirty flag.
      • 6. Save an updated last-checked timestamp.
      • 7. Go Back to 2.
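
The seven steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows live in memory rather than in PostgreSQL, and `CacheRow`, `mark_dirty`, `refresh_dirty`, `run_update_cycle` and the two callback parameters are hypothetical names, not code from the mb-metadata-cache branch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class CacheRow:
    """One cached row with the three MBID columns discussed in the chat."""
    recording_mbid: str
    artist_mbids: List[str]
    release_mbid: str
    data: dict
    dirty: bool = False


def mark_dirty(rows, changed_mbids):
    """Step 4: flag every row that references at least one changed MBID."""
    changed = set(changed_mbids)
    count = 0
    for row in rows:
        mbids = {row.recording_mbid, row.release_mbid, *row.artist_mbids}
        if mbids & changed:
            row.dirty = True
            count += 1
    return count


def refresh_dirty(rows, fetch_metadata):
    """Step 5: re-fetch data for dirty rows and unset the dirty flag."""
    refreshed = 0
    for row in rows:
        if row.dirty:
            row.data = fetch_metadata(row.recording_mbid)
            row.dirty = False
            refreshed += 1
    return refreshed


def run_update_cycle(rows, last_checked, fetch_changed_mbids, fetch_metadata, now=None):
    """Steps 2 to 6: one pass of the hourly job; returns the new timestamp."""
    # Take the timestamp before querying, so changes that land while this
    # pass is running are picked up by the next pass instead of being lost.
    now = now or datetime.now(timezone.utc)
    changed = fetch_changed_mbids(last_checked)  # step 3
    mark_dirty(rows, changed)                    # step 4
    refresh_dirty(rows, fetch_metadata)          # step 5
    return now                                   # step 6: caller saves this
```

A cron job would call `run_update_cycle` once per hour and persist the returned timestamp (step 7 is simply the next invocation). Keeping the dirty flag as a separate step means a crashed run can resume from step 5 without recomputing which MBIDs changed.
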
      • alastairp
        so to confirm - you have code for generating the data, both from scratch and given a set of candidate mbids (mbids of what?)
      • mayhem
        yes.
      • alastairp
        is there a db with this generated data somewhere so I can look at its structure?
      • mayhem
        bono
      • and gaga.
      • mb_metadata_cache on gaga.
      • it is not up to date, however. I've already created a new column in the table called "artist_mbids UUID[] NOT NULL".
      • and soon there will be a "release_mbid UUID NOT NULL" column.
      • alastairp
        right - that's just what I was going to ask. I think I saw you mention this last night. each row will have a column containing the recording mbid, artist mbids, and release mbids that this row affects?
      • mayhem
        and there is a recording_mbid column. those are your three mbid columns that if one of those MBIDs changes, mark the row dirty.
      • alastairp
        so we don't have to walk through the MB database to find these relations
      • perfect
      • mayhem
        I think I just answered your question, yes?
      • I've not added the GIN index on artist_mbids yet. that still needs to be done.
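
For illustration, the index and the "mark dirty" query it would speed up might look roughly like this. The table name `mb_metadata_cache`, the index name and the `dirty` column are assumptions; the chat only confirms the `recording_mbid`, `artist_mbids` and `release_mbid` columns. The SQL is held in Python strings with psycopg2-style named placeholders, and `mark_dirty_params` is a hypothetical helper.

```python
# Assumed table, index and flag-column names; only the three MBID columns
# are confirmed by the discussion.
CREATE_GIN_INDEX = """
CREATE INDEX IF NOT EXISTS artist_mbids_gin_idx
    ON mb_metadata_cache USING GIN (artist_mbids)
"""

# && is PostgreSQL's array "overlap" operator; a GIN index on the array
# column can serve it without scanning every row.
MARK_DIRTY = """
UPDATE mb_metadata_cache
   SET dirty = TRUE
 WHERE recording_mbid = ANY(%(mbids)s::uuid[])
    OR release_mbid = ANY(%(mbids)s::uuid[])
    OR artist_mbids && %(mbids)s::uuid[]
"""


def mark_dirty_params(changed_mbids):
    """Build the parameter dict for MARK_DIRTY, dropping duplicate MBIDs."""
    return {"mbids": sorted(set(changed_mbids))}
```

With a database cursor this would be run as `cursor.execute(MARK_DIRTY, mark_dirty_params(changed))`. The two scalar columns would be served by ordinary B-tree indexes, while the array column is the one that needs GIN.
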
      • alastairp
        yeah, that was the last bit that I wasn't sure about
      • mayhem
        I'm now also making changes to the canonical-recordings branch in order to calculate the precursor data sets for the release stuff. not sure yet how best to handle those branches....
      • alastairp
        the metadata mapping is the mb-metadata-cache branch?
      • mayhem
        yes
      • canonical recordings and canonical releases are being calculated right now.
      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #2465 (master…MBS-12275): MBS-12275 / MBS-12276: YouTube playlist cleanup / validation improvements https://github.com/metabrainz/musicbrainz-serve...
      • Freso
        My brain cannot retain attention in anything for more than 30 seconds today. Going to go for a walk and not look at a screen for a while or try to force my brain. I’ll post notes from last night’s meeting tomorrow. :\
      • CatQuest
        heeey guys, is everything fish?
      • fish.
      • !m freso, take care of yourself first.
      • BrainzBot
        You're doing good work, freso, take care of yourself first.!
      • CatQuest
        mayhem: not that I've put it in the doc, because I don't even think I could access it if I tried (though I might attempt later if there is a link).
      • but Internet Archive and/or Wikidata are my open source <3
      • can I also say I love this idea of giving back? great 10++
      • 👏
      • ah i can acess the doc, yay
      • yvanzo
        reosarevok: Collections are not indexed at all; See https://github.com/metabrainz/mbsssss
      • atj
        personally, I would be looking at software libraries which you rely on / have used for a long period of time
      • those are often maintained by one person with little recognition
      • mayhem
        atj: agreed. once you've picked yours, enter them into the spreadsheet.
      • CatQuest
        oh, ok
      • alastairp
      • CatQuest
        !recall dependency
      • BrainzBot
      • mayhem
        atj: my first suggestions are ones that make a huge difference to us (spark, typesense), but I guess apache doesn't really need our money so much.
      • maybe I'll pick something else.
      • hmm. should that XKCD be our inspiration? ideally we only support these random thanklessly maintained projects?
      • atj
        I'm not best placed to know which Python or Perl libraries you rely on
      • mayhem
        atj: that isn't the idea either. which sysadmin tools are worth supporting?
      • let everyone speak to their corner of open source.
      • speak of?
      • asymmentric joined the channel
      • alastairp
        I'm currently thinking about the difference between dependencies that we use in our projects and the tooling that we use to make them
      • mayhem
        does there need to be a distinction?
      • atj
        I'll have a think. I was just conveying my thoughts on the approach that I think would make the most difference.
      • alastairp
        not in terms of actually making a donation, but I initially thought about things like flask, but then I realised that I use something like iterm or tmux just as much if not more
      • mayhem
        I'm fully open to feedback on how to do this better. but I really like the focus on the small projects.
      • alastairp: yes, when you start looking at it, there are gobs and gobs of projects we use and never think of.
      • lucifer
        mayhem: afaik, the money directly donated to apache doesn't go to maintainers of various projects. so it might be better to donate directly to the maintainers of projects (if the info is available; usually it is, but probably not always).
      • mayhem
        lucifer: yes. US based non-profits cannot receive donations that are earmarked for a specific purpose.
      • so, yeah, better not apache.
      • lucifer
        ah right, i had forgotten that.
      • asymmentric has quit
      • yyoung[m] joined the channel
      • mayhem: mb-metadata-cache is the branch for this sprint?
      • Ansh
        lucifer: To show the releases for labels, events for places, I need to add some functions. So should I add them in CB or in BU ?
      • lucifer
        Ansh, whats the purpose of those functions?
      • alastairp
        Ansh: this is to retrieve data from the musicbrainz database? I think it should go in the BU methods
      • lucifer
        mayhem: or better question, which branch has the mb metadata endpoints?
      • mayhem
        mb-metadata-cache
      • Ansh
        alastairp: Yes I need to get data from mb database.
      • lucifer
        👍
      • mayhem
        lucifer: but I am not sure if it makes sense to put the other metadata work into that.
      • alastairp
        Ansh: see for example, the musicbrainz API has a parameter to get releases for labels: https://musicbrainz.org/doc/MusicBrainz_API#Sub...
      • mayhem
        I think the UI might best be in a different branch. backend in this one? what do you think?
      • alastairp
      • lucifer
        i am adding the mbid lookup endpoint so probably a different branch targeting this one so that it can be reviewed separately but i can use the same blueprint etc.
      • alastairp
        so it makes sense to modify https://github.com/metabrainz/brainzutils-pytho... to allow an include parameter for 'releases'
      • lucifer
        ui in a different branch based on this one makes sense.
      • mayhem
        lucifer: sure, just branch off my branch then.
      • lucifer
        👍
      • Ansh
        alastairp: Got it. Also, after adding it to BU, can it be used directly in CB? Because the versions we are using are different.
      • alastairp
        Ansh: look at how CB loads the BU dependency: https://github.com/metabrainz/critiquebrainz/bl...
      • you can temporarily modify this to point directly to your branch and then rebuild CB in order to install it
      • Ansh
        Understood
      • lucifer
        mayhem: the listenbrainz.db.metadata seems to be missing in that branch. forgot to commit?
      • mayhem
      • its there.
      • lucifer
        huh its missing here locally. probably issue on my end.
      • ah, i see it now. sorry for the false alarm 😓
      • mayhem: oh, i think we forgot to discuss the msid situation again. artist name, recording name lookup endpoint won't have msids so how do we record the results in the mapping tables. should we generate a msid based on artist name and recording name on the fly?
      • the potential downside is that the usual listens endpoint considers some extra fields while generating msids, so there can be dupes. but the mapper is great at handling those, so I don't think it should be an issue.
      • alternatively, we look up each time and do not store the result of a match.
      • mayhem
        my plan was not to store the results.
      • because the existing mapping may do a better job.
      • lucifer
        i see makes sense.
      • mayhem
        and if we're really happy with the lighter lookup endpoint, we can add the saving later.
      • lucifer
        yeah, it may also help that once the now playing stuff turns into actual listens, some enhancements in the mapper can have it consider temporal relations, detect albums and so on.
      • 👍
      • mayhem
        exactly.
      • alastairp
        lucifer: hi, remind me - I think that we have a constraint on some postgres array fields to ensure that they are the correct shape, is that right?
      • mayhem
        we do.
      • or we did. looking.
      • alastairp
        that's the one I was thinking of, thanks.
      • mayhem: do you see a value in adding such a constraint to artist_mbids?
      • (in the metadata table)
      • mayhem
        it can't hurt, but there will be such limited access to the table (from a few code bits) that it might not really be needed.
      • alastairp
        yeah, right. I thought the same thing
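
As a rough idea of what such a constraint on `artist_mbids` could look like, here is a hypothetical CHECK that requires a flat, non-empty array, plus a Python-side validator for code that writes to the table. The existing constraint referred to above is not shown in the log, so the constraint name, the table name `mb_metadata_cache` and the exact rule are assumptions.

```python
import uuid

# Hypothetical constraint. array_ndims() returns NULL for an empty array,
# so cardinality() is what actually rejects empty arrays here.
ADD_SHAPE_CHECK = """
ALTER TABLE mb_metadata_cache
  ADD CONSTRAINT artist_mbids_is_flat_nonempty
  CHECK (array_ndims(artist_mbids) = 1 AND cardinality(artist_mbids) >= 1)
"""


def is_valid_artist_mbids(value):
    """Return True for a non-empty, flat list whose items all parse as UUIDs."""
    if not isinstance(value, list) or not value:
        return False
    for item in value:
        if not isinstance(item, str):
            return False  # nested lists and non-string items are rejected
        try:
            uuid.UUID(item)
        except ValueError:
            return False
    return True
```

The Python check is slightly stricter than the SQL one, since in the database the `UUID[]` column type already guarantees that each element is a UUID. As noted in the chat, with only a few code paths writing to the table, either check may be more than is needed.
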
      • mayhem
        ok, bono now has the `mapping.recording_canonical_release` table
      • which could also be used to add cover art to the playlists in LB.
      • monkey
        I agree with you atj, I'm also looking for OSS projects that make my life easier AND don't have a lot of visibility or contributions already
      • agatzk has quit
      • agatzk joined the channel