#metabrainz

      • alastairp
        so if for example it was `None` or `{}`, then the `[0]` would cause an error
      • even if it was `[]` we'd get an error
      • Pratha-Fish: right. I was looking at the other one. that's good then
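As an illustration of the point about `[0]` above: `None`, `{}` and `[]` each fail on `value[0]`, with `TypeError`, `KeyError` and `IndexError` respectively. A minimal sketch of a guard follows; the helper name `first_or_none` is hypothetical and does not come from the project code.

```python
from typing import Any, Optional


def first_or_none(value: Any) -> Optional[Any]:
    """Return the first element of a non-empty list or tuple, else None.

    Hypothetical helper: None[0] raises TypeError, {}[0] raises KeyError
    and [][0] raises IndexError, so check type and length before indexing.
    """
    if isinstance(value, (list, tuple)) and len(value) > 0:
        return value[0]
    return None


# Try the guard on the problem values and on one well-formed list.
for candidate in (None, {}, [], [{"name": "example"}]):
    print(repr(candidate), "->", repr(first_or_none(candidate)))
```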
      • ansh
        The edition group's `author_credits` field is an array of objects, so if no author credits exist, the database returns `[None]`
      • alastairp
        how many items did you look up to result in just these 2?
      • ansh: is this the direct result from the view?
      • ansh
        yes
      • I added a table in the views for the author credits
      • alastairp
        right, that's a bit strange. give me a moment, I want to check this query myself
      • Pratha-Fish: that first one is really weird... is this a uuid being used as an artist name? or a bug in the generation of the table?
      • ansh
        Sure
      • Pratha-Fish
        alastairp: I rechecked that one, apparently, it's a uuid that's being used as artist name!
      • Sophist-UK has quit
      • No bugs in table generation afaik
      • Sophist-UK joined the channel
      • alastairp
        what's the original recording mbid, then?
      • Pratha-Fish: oh, I think I worked out the unicode issue
      • we never actually made this an html document
      • check out the last section - "The Complete HTML5 Boilerplate"
      • Sophist_UK joined the channel
      • we should use something like that. we don't need any of the `<meta property=>` or `<link>` or `<script>` tags, but copy the rest
      • and put the table inside <body>
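A rough sketch of the suggestion above, assuming the table is already available as an HTML string and that the unicode problem came from the output lacking a charset declaration (the log does not confirm this). The function name `wrap_table_in_html` and the default title are made up for illustration.

```python
def wrap_table_in_html(table_html: str, title: str = "Results") -> str:
    """Wrap an HTML <table> fragment in a minimal HTML5 document.

    Hypothetical helper. Only the skeleton is kept: no <meta property=>,
    <link> or <script> tags, and the table goes inside <body>.
    """
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head>\n"
        '  <meta charset="utf-8">\n'
        f"  <title>{title}</title>\n"
        "</head>\n"
        "<body>\n"
        f"{table_html}\n"
        "</body>\n"
        "</html>\n"
    )


# Example fragment containing a non-ASCII artist name.
fragment = "<table><tr><td>Björk</td></tr></table>"
document = wrap_table_in_html(fragment)
print(document)
```

When saving the result, opening the file with `encoding="utf-8"` keeps the bytes on disk consistent with the declared charset.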
      • Pratha-Fish
        alastairp: The original recording MBID is the same as the artist_name, recording_name, and the canonical_mbid!
      • alastairp
        Pratha-Fish: that's why I think it's a bug
      • because when I click on the link for the canonical mbid, musicbrainz tells me that it doesn't exist
      • but that's almost certain to be impossible, all canonical mbids should exist
      • Sophist-UK has quit
      • Pratha-Fish
        alastairp: exactly! That ID doesn't exist in the musicbrainz table. I just rechecked the specific row in jupyter, and here's what I've found
      • This is before running the mbc.
      • mlhd_recording_mbid is from MLHD
      • mlhd_canonical_mbid is what I found by looking up the mlhd_recording_mbid in redirects and then canonical table
      • The mlhd_canonical_mbid is then used with an artist_credit query
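The lookup chain described above (recording MBID, then redirects, then the canonical table) could be sketched roughly as below. The tables are plain dicts with placeholder keys, and passing an MBID through unchanged when it has no redirect or canonical entry is an assumption, not something the log confirms about the real script.

```python
from typing import Dict


def resolve_canonical_mbid(
    recording_mbid: str,
    redirects: Dict[str, str],
    canonical: Dict[str, str],
) -> str:
    """Hypothetical two-step lookup: redirects first, then canonical."""
    # Step 1: follow a redirect if one exists (assumed pass-through otherwise).
    redirected = redirects.get(recording_mbid, recording_mbid)
    # Step 2: map to the canonical MBID if one exists (same assumption).
    return canonical.get(redirected, redirected)


# Placeholder identifiers, not real MBIDs.
redirects = {"old-id": "new-id"}
canonical = {"new-id": "canonical-id"}

print(resolve_canonical_mbid("old-id", redirects, canonical))
print(resolve_canonical_mbid("unknown-id", redirects, canonical))
```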
      • alastairp
        where do you get rec_name from?
      • (btw, this is another great reason to have our code in scripts rather than a notebook, because then you could link me to the exact line in github)
      • Pratha-Fish
        alastairp: The rec_name and artist_credit are fetched using a SQL query
      • So I am guessing it's a fault in the musicbrainz db
      • alastairp
        the fact that rec_name is the same as mlhd_recording_mbid makes me think that there's a bug with this sql query
      • no, almost certainly not an issue with the database, probably a problem with the way that we are reading it
      • Pratha-Fish
        alastairp: thankfully I have all this data in a script as well :D
      • alastairp
        great! where's the line, then?
      • Sophist-UK joined the channel
      • Pratha-Fish
      • alastairp: The row I shared earlier is from this particular table. All other rows have their specific rec_name and artist_credit data in place
      • https://github.com/Prathamesh-Ghatole/MLHD/blob... => This script generates everything from scratch
      • https://github.com/Prathamesh-Ghatole/MLHD/blob... => This library has all the helper functions
      • alastairp
        what line makes this dataframe? (please make it easy for me to follow along)
      • you can click on a line number to get a URL that links to the line
      • Pratha-Fish
        right, sorry for the confusion
      • Sophist_UK has quit
      • Sophist_UK joined the channel
      • Sophist_UK has quit
      • Sophist-UK has quit
      • This line fetches rec_name and artist_credit using the artist_credit_list query that you shared last week
      • alastairp
        Pratha-Fish: I'd look in detail at your replace_multi method
      • Pratha-Fish
        right
      • Sophist-UK joined the channel
      • https://github.com/Prathamesh-Ghatole/MLHD/blob... => Here's the replace_multi function
      • alastairp
        immediately, I suspect that the item (recording mbid) that you're trying to look up doesn't exist in the musicbrainz database any more, and it's using the lookup key as the result rather than returning "none" or something
      • reading the code, I can't confirm if that is the case, but I suspect so
      • Pratha-Fish
        Hmm that seems likely
      • Here's what the replace_multi function uses under the hood.
      • Aaand you were exactly right!
      • alastairp
        yes, I saw that. but I don't know pandas enough to understand that behaviour just by reading the code
      • Pratha-Fish
      • alastairp: ^ The above code just looks up the input value (in this case canonical_mbid) in the specified table (the artist_credit one)
      • alastairp
        oh right. I missed the 'except KeyError'
      • Sophist-UK has quit
      • Sophist_UK joined the channel
      • Pratha-Fish
        yep :)
      • i.e. in this particular case the "canonical_mbid" simply doesn't exist in the table
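To illustrate the behaviour found above, here is a minimal sketch of the pattern, not the actual `replace_multi` code: a lookup that catches `KeyError` and returns the key itself makes a missing MBID look like a valid name. Both function names and the sample table are hypothetical.

```python
from typing import Dict, Optional


def lookup_with_key_fallback(key: str, table: Dict[str, str]) -> str:
    """Return table[key], or the key itself when it is missing.

    This mirrors the problematic pattern: a missing entry is silently
    replaced by the lookup key, so an MBID ends up in the name column.
    """
    try:
        return table[key]
    except KeyError:
        return key


def lookup_or_none(key: str, table: Dict[str, str]) -> Optional[str]:
    """Return table[key], or None so that missing entries stay visible."""
    return table.get(key)


# Placeholder table: canonical MBID -> recording name.
recording_names = {"known-mbid": "Some Recording"}

print(lookup_with_key_fallback("missing-mbid", recording_names))
print(lookup_or_none("missing-mbid", recording_names))
```

Returning `None` (or a NaN in a dataframe) keeps rows with missing lookups easy to count and filter, instead of leaving a UUID in the `rec_name` column.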
      • alastairp
        ansh: do you have an example of an edition group whose author list is [None] ?
      • Pratha-Fish: sorry, I have to run now. I have an event in an hour and need to go and get lunch first
      • do you have enough to work on?
      • Pratha-Fish
        alastairp: It's alright! I think I have enough to work on right now.
      • alastairp
        finding this issue + doing fast lookups for mbc + fixing html table output sounds like it should keep you busy
      • Pratha-Fish
        that's right
      • alastairp
        great, tomorrow morning I'm giving a talk, but I should be online with my laptop through the afternoon
      • ansh
        alastairp: yep '02ae4cfc-6412-4693-93b1-e24dce5e31f9'
      • alastairp
        ansh: thanks, I'll check this when I'm finished with my other tasks today
      • Pratha-Fish
        alastairp: great, I'll ping you only when required.
      • ansh
        I feel we should move the `DEFAULT_CACHE_EXPIRATION` variable, which is used in many places, to our config file. It would be really useful: we could disable the cache with a single change :)
      • Sophist_UK has quit
      • alastairp
        ansh: go ahead and open a PR for that!
      • ansh: normally when I do this, I open a redis shell and just run FLUSHALL
      • ansh: one really small thing too - I was just looking at the sql query in `fetch_multiple_edition_groups` and it has a trailing ;
      • this isn't needed when running sql from python, can you remove it?
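A possible shape for the config idea above, as a sketch only. `DEFAULT_CACHE_EXPIRATION` is the name from the discussion; the `CACHE_ENABLED` switch, the fallback value and the helper name are assumptions made for illustration and are not taken from the project.

```python
from typing import Any, Mapping, Optional

# Assumed fallback (12 hours); the project's real default is not shown in the log.
FALLBACK_CACHE_EXPIRATION = 12 * 60 * 60


def get_cache_expiration(config: Mapping[str, Any]) -> Optional[int]:
    """Return the cache TTL in seconds, or None when caching is disabled.

    CACHE_ENABLED is a hypothetical switch; callers would skip the cache
    entirely when this function returns None.
    """
    if not config.get("CACHE_ENABLED", True):
        return None
    return int(config.get("DEFAULT_CACHE_EXPIRATION", FALLBACK_CACHE_EXPIRATION))


print(get_cache_expiration({"DEFAULT_CACHE_EXPIRATION": 600}))
print(get_cache_expiration({"CACHE_ENABLED": False}))
```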
      • ansh
        alastairp: I'll remove the semicolon
      • alastairp
        ansh: I ran this query with the bbid, and I see the result. to me this seems a bit odd, and it might be better to try and modify the aggregate to make the query return [] instead of [null]
      • alastairp gone
      • ansh
        alastairp: Okay, I'll try to modify it.
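One way the `[null]` to `[]` change might look, sketched under assumptions: the SQL fragment supposes the view builds the list with PostgreSQL's `json_agg` over a LEFT JOIN, and the column name `author_credit` is hypothetical. The fragment is not executed here; the Python function is a fallback that normalises the value on the client side.

```python
from typing import Any, List, Optional

# Hypothetical, unexecuted PostgreSQL fragment: FILTER drops the NULL rows,
# and COALESCE turns the resulting NULL aggregate into an empty JSON array.
AUTHOR_CREDITS_AGGREGATE = """
COALESCE(
    json_agg(author_credit) FILTER (WHERE author_credit IS NOT NULL),
    '[]'::json
)
"""


def drop_null_credits(author_credits: Optional[List[Any]]) -> List[Any]:
    """Client-side fallback: turn None or [None] into an empty list."""
    if not author_credits:
        return []
    return [credit for credit in author_credits if credit is not None]


print(drop_null_credits([None]))
print(drop_null_credits([{"name": "example author"}]))
```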
      • s1b1 has quit
      • s1b1 joined the channel
      • yvanzo
        hi reosarevok: my dev env is currently broken, do you plan to work on MBS-12512 again today?
      • BrainzBot
        MBS-12512: genre_alias_type not being dumped in SampleDataDump https://tickets.metabrainz.org/browse/MBS-12512
      • reosarevok
        That's supposed to be fixed with the latest sample dump we generated a few days ago
      • Have you tried it?
      • Hmm, checking
      • yvanzo
        I tried it, see the comments.
      • Have you tried it?
      • reosarevok
        No, because I got a report that it worked fine now
      • Let's see
      • yvanzo
        zas: I have access, thanks.
      • reosarevok
        yvanzo: it's fixed, but the fix was merged into production
      • So you need to also import from the production branch
      • (I tested, importing from master skips genre_alias_type, but from prod it does not)
      • yvanzo
        Alright, thank you, will test.
      • reosarevok
        I'm still finishing an import in prod, you can wait 5 min if you want me to confirm :)
      • But it didn't skip that, at least
      • yvanzo: confirmed, worked in the production branch
      • yvanzo
        It worked here too.
      • reosarevok
        Yay
      • Sorry for not checking that first and causing the annoyance
      • yvanzo
        No problem, thanks for the quick resolution.
      • PetrCBRCZ
        yvanzo: do you have another tip for how to clear the cache on a slave MB installation?
      • yvanzo
        PetrCBRCZ: that was the reason I was trying to refresh my dev env, still on it.
      • PetrCBRCZ
        yvanzo: ahh ... ok ...thx ;-)
      • yvanzo
        PetrCBRCZ: It returns "OK" here.
      • Is the message you copied returned by the following command? `sudo docker-compose exec redis redis-cli FLUSHALL`
      • PetrCBRCZ
        yes ... it returns "(error) READONLY You can't write against a read only slave."
      • yvanzo
        You should be able to check the number of items in Redis database with: sudo docker-compose exec redis redis-cli DBSIZE
      • PetrCBRCZ
        (integer) 388069
      • yvanzo
        It is unclear why it would be set to READONLY.
      • PetrCBRCZ: Try this instead: `sudo docker-compose exec redis redis-cli FLUSHDB`
      • PetrCBRCZ
        same error
      • yvanzo
        There might be an issue with your Redis instance.
      • INFO should provide more details for debugging.
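When reading `INFO` output for a READONLY error, the `role` and `slave_read_only` fields of the Replication section are the ones worth checking, since a Redis instance configured as a read-only replica rejects writes such as FLUSHALL. Below is a small sketch that parses `INFO` text; the sample text is illustrative and is not PetrCBRCZ's actual output, and the function names are hypothetical.

```python
from typing import Dict


def parse_redis_info(info_text: str) -> Dict[str, str]:
    """Parse 'key:value' lines from Redis INFO output into a dict.

    Blank lines and section headers (starting with '#') are skipped.
    """
    fields: Dict[str, str] = {}
    for raw_line in info_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition(":")
        if separator:
            fields[key] = value
    return fields


def is_read_only_replica(fields: Dict[str, str]) -> bool:
    """True when INFO reports a replica that refuses writes."""
    return fields.get("role") == "slave" and fields.get("slave_read_only") == "1"


# Illustrative sample, not taken from the log.
sample_info = """# Server
redis_version:3.2.12
# Replication
role:slave
slave_read_only:1
"""

print(is_read_only_replica(parse_redis_info(sample_info)))
```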
      • PetrCBRCZ
        # Server
      • redis_version:3.2.12
      • redis_git_sha1:00000000
      • redis_git_dirty:0
      • redis_build_id:b9a4cd86ce8027d3
      • redis_mode:standalone
      • os:Linux 5.4.0-121-generic x86_64
      • arch_bits:64
      • multiplexing_api:epoll
      • gcc_version:6.4.0
      • process_id:1
      • run_id:8fc13aa160b0d4e660ae3f0de2c91bd4b5d6ca90
      • tcp_port:6379
      • uptime_in_seconds:1649137
      • uptime_in_days:19
      • hz:10
      • lru_clock:14759047
      • executable:/data/redis-server
      • mem_allocator:jemalloc-4.0.3
      • # Persistence