#metabrainz

      • kuno joined the channel
      • Freso__ joined the channel
      • lucifer_ joined the channel
      • Freso__ is now known as Freso
      • lucifer has quit
      • lucifer_ is now known as lucifer
      • yyoung joined the channel
      • leonardo has quit
      • leonardo joined the channel
      • antlarr_away has quit
      • antlarr joined the channel
      • yyoung
        yvanzo: I've fixed the tests and would like to hear your advice on my code :)
      • thomasross has quit
      • akashgp0971 joined the channel
      • akashgp09_ joined the channel
      • akashgp0971 has quit
      • bitmap
        yyoung: hey, did you manage to get selenium tests running under docker?
      • yyoung
        bitmap: Not yet, I just use Jenkins, maybe I'll take a look at it another day
      • bitmap
        ah ok. if you need help with it let me know
      • akashgp09_ has quit
      • akashgp09_ joined the channel
      • akashgp09_ has quit
      • akashgp09_ joined the channel
      • akashgp09_ is now known as akashgp09
      • akashgp09 has quit
      • akashgp09_ joined the channel
      • akashgp0973 joined the channel
      • akashgp0973 has left the channel
      • BrainzGit
        [listenbrainz-server] amCap1712 opened pull request #1481 (master…spark-log): Log all requests received in spark https://github.com/metabrainz/listenbrainz-serv...
      • lucifer
        ruaok: the dump was successful but due to some reason spark didn't import it. `request_import_incremental` without supplying `--id` seems to be borked on spark side.
      • i think it would be simpler if we have a file like `LATEST` which contains the latest dump id, as MB has https://data.musicbrainz.org/pub/musicbrainz/da...
      • I see the issue, the current code assumes that all dump ids are sequential. it starts searching for incremental dumps with ids after the last full dump, checking whether each exists; if one doesn't, it stops checking further. in this case 431 is the id of the full dump, but incremental dumps with ids 435 and 436 are missing, so it believes there are no other dumps to import and returns.
      • shivam-kapila
        maybe try for next 5 IDs and then gracefully stop.
      • lucifer
        that sounds brittle. but it also depends on why those particular ids are missing.
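The stop-at-first-gap behaviour lucifer describes can be sketched roughly like this (a minimal illustration; the function name and set-of-ids representation are assumptions, not the actual listenbrainz-server code):

```python
def find_incremental_dumps(available_ids, full_dump_id):
    """Mimic the current (buggy) behaviour: walk ids upward from the
    full dump and stop at the first missing one, so anything after a
    gap is silently never imported."""
    to_import = []
    candidate = full_dump_id + 1
    while candidate in available_ids:  # stops at the first gap
        to_import.append(candidate)
        candidate += 1
    return to_import

# 435 and 436 are missing, so 437 and 438 are never considered:
available = {432, 433, 434, 437, 438}
print(find_incremental_dumps(available, 431))  # → [432, 433, 434]
```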
      • monkey has quit
      • outsidecontext_ joined the channel
      • outsidecontext_ has quit
      • outsidecontext_ joined the channel
      • monkey joined the channel
      • agatzk has quit
      • agatzk joined the channel
      • monkey has quit
      • ruaok
        > I see the issue, the current code assumes that all dump ids are sequential. it starts searching incremental dumps with ids since the last full dump and checks whether it exists. if not, it stops further checking. in this case 431 is the id of the full dump. but incremental dumps with id 435 and 436 are missing so it believes there are no other dumps to import and returns.
      • outsidecontext has quit
      • monkey joined the channel
      • lucifer: this is what I was saying last week -- this requirement is not really needed for us -- the referential integrity of the dumps is not compromised by skipping a dump. there will just be a gap in the data.
      • outsidecontext joined the channel
      • we should do two things: 1) go ahead and import dumps anyway 2) complain to high heaven that a dump is missing.
      • outsidecontext_ has quit
      • outsidecontext_ joined the channel
      • outsidecontext_ has quit
      • outsidecontext_ joined the channel
      • outsidecontext has quit
      • flamingspinach has quit
      • param has quit
      • slush has quit
      • outsidecontext joined the channel
      • outsidecontext_ has quit
      • param joined the channel
      • slush joined the channel
      • lucifer
        ruaok: right, the issue here is the code considers the missing dump as a signal to stop looking further. if we start ignoring it, we need another way to check what the latest dump is.
      • hence, the suggestion about the latest file.
      • ruaok
        timestamps.
      • but, if you prefer, a LATEST file. I hate implementing those. dunno why
      • lucifer
        i see, like the check_ftp_age command.
      • that's doable. i suggested a LATEST file because MB does it that way, they seem to have way less trouble with dumps than us.
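Combining the `LATEST` file idea with ruaok's "import anyway, complain loudly" could look roughly like this (a sketch under the assumption that a `LATEST` file supplies the newest dump id, in the style of MB's data.musicbrainz.org layout; names are made up):

```python
def dumps_to_import(available_ids, last_imported_id, latest_id):
    """With LATEST telling us the newest dump id, import everything we
    have up to it, and report gaps instead of silently stopping at the
    first missing id."""
    to_import, missing = [], []
    for dump_id in range(last_imported_id + 1, latest_id + 1):
        if dump_id in available_ids:
            to_import.append(dump_id)
        else:
            missing.append(dump_id)  # these should trigger loud alerts
    return to_import, missing

available = {432, 433, 434, 437, 438}
imports, gaps = dumps_to_import(available, 431, 438)
print(imports)  # → [432, 433, 434, 437, 438]
print(gaps)     # → [435, 436]
```

Referential integrity is preserved either way, as ruaok notes: skipping a dump just leaves a gap in the data, so the missing ids only need monitoring, not a hard stop.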
      • ruaok
        lucifer: that's because I had all this trouble and more with them over 15 years ago.
      • lucifer: the import command can be a lot simpler really. our dump file names sort nicely, by design.
      • just import to the end of the list and be done.
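Since the dump file names are designed to sort chronologically, "import to the end of the list" reduces to a sort and a filter (a sketch; the file names below are illustrative, not the exact ListenBrainz naming scheme, and this relies on ids being zero-padded or same-width so lexicographic order matches numeric order):

```python
def newer_dumps(filenames, last_imported):
    """Everything that sorts after the last imported file name is, by
    construction of the naming scheme, everything left to import."""
    return [name for name in sorted(filenames) if name > last_imported]

files = [
    "listenbrainz-dump-434-20210523-incremental.tar.xz",
    "listenbrainz-dump-432-20210521-incremental.tar.xz",
    "listenbrainz-dump-433-20210522-incremental.tar.xz",
]
print(newer_dumps(files, "listenbrainz-dump-432-20210521-incremental.tar.xz"))
```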
      • lucifer
        oh! makes sense now why MB dumps are more stable.
      • indeed, it can be a lot simpler.
      • we have some additional complexity introduced on the spark side. we store the id and import timestamp of all the imported dumps in hdfs. then while importing we do some non-trivial stuff to find which dumps to import.
      • ruaok
        nuke it all.
      • that logic is fitting for some contexts, but not really for our own spark context.
      • lucifer
        storing metadata is probably helpful during debugging but isn't needed at all during import.
      • ruaok
        debugging and process monitoring.
      • if we miss a dump or dumps are out of date, we need to scream.
      • lucifer
        indeed.
      • ruaok
        monkey: can you please work on determining why our LB reports are broken? I'd like to deploy a hotfix ASAP.
      • monkey
        Sure, let me look
      • ruaok
        thx
      • monkey
        Yeah, I see a `TypeError: Cannot read property 'from_ts' of undefined`
      • I'll investigate
      • lucifer
      • this is the relevant error in sentry
      • monkey
        Yeah, the code in static/js/src/stats/UserListeningActivity.tsx is quite fragile. We didn't see any of the errors until a month ago when I set it up to catch and report errors.
      • I'll see if I can make the whole component more robust
      • I'm not very familiar with that component at all, but I suppose if we don't have all the data needed for the current week (which seems to be what currently breaks the page), I should just show an error message for that range?
      • lucifer
        monkey: i think we should just display 0, in that case, for the days for which data is missing.
      • monkey
        Shouldn't the API return 0 for days without listens, in normal cases?
      • Doesn't the complete absence of data indicate that there's an error?
      • lucifer
        i am not sure how it works, i.e. whether we coalesce the missing data on the frontend or in the api.
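Wherever the coalescing ends up living, lucifer's "display 0 for missing days" amounts to zero-filling the week's range before the chart sees it. A hypothetical Python sketch (the `listen_count` / date-keyed shapes are assumptions modelled on the discussion, not the actual stats API schema):

```python
from datetime import date, timedelta

def zero_fill_week(listens_by_day, week_start):
    """Return one entry per day of the week, defaulting days that have
    no data to 0 listens so the frontend never sees a missing point."""
    return [
        {
            "date": (week_start + timedelta(days=i)).isoformat(),
            "listen_count": listens_by_day.get(week_start + timedelta(days=i), 0),
        }
        for i in range(7)
    ]

week = date(2021, 5, 24)  # a Monday, for illustration
partial = {date(2021, 5, 24): 12, date(2021, 5, 25): 7}
for row in zero_fill_week(partial, week):
    print(row)
```

Doing this in the API keeps every client simple; doing it in the frontend keeps the payload honest about which days truly had data. Either choice avoids the `from_ts`-of-undefined class of crash.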
      • monkey
        Here's what I have for this week, for example: https://usercontent.irccloud-cdn.com/file/4cxEB...
      • Notice that there's only data for last week
      • lucifer
        missing data is indeed an error but we used to display outdated reports instead of erroring.
      • monkey
        The code sort of relies on there being *some* data so that it can display the start and end dates of the range to the user, but if there's no data at all there's no point anyway
      • Ah?
      • lucifer
        i think there is some bug in week calculation somewhere.
      • monkey
        Outdated reports… Well, the code hasn't changed recently, we're now just catching JS errors. I'm not sure how the mechanism of showing older reports worked
      • lucifer
        yeah, me neither.
      • imo, ideally the api should send the data of the last two weeks.
      • monkey
        That would work, and display things correctly.
      • lucifer
        whichever last two weeks it has, doesn't matter if current or not.
      • i think we might be able to fix it for now by generating new stats.
      • i had imported yesterday's data into spark this morning, if i request new stats now there will be data for the current week.
      • that'll give us some time to think about how to solve this correctly.
      • monkey
        Wait before you do that please
        It's helpful for me for testing to have it in a broken state
      • lucifer
        sure
      • monkey
        So in summary there are two separate issues:
      • 1. The API isn't returning any data for the current week
      • 2. As a result, some parts of the JS page break (we weren't previously aware of this as we didn't have sentry reporting it)
      • lucifer
        yes, right.
      • monkey
        As we discussed I'll make sure the front-end recovers from missing data it's expecting, that'll be a start.
      • Which means maybe there's nothing special to do for error #1 other than actually calculating the data (meaning we don't need to ensure we return a provisional 0 listens for each day of the current week up until then)
      • Or do we want to ensure the API calls always return some form of data for the current week?
      • (Or returns an error if we somehow don't have the data?)
      • lucifer
        i see the API does return a 204 if there is no data.
      • monkey
        But it isn't currently returning a 204 as there is data from last week, even though there is no data for this week
      • (I'm talking about the 'week' range here)
      • lucifer
        right, according to my understanding it sends data for both weeks in a single response.
      • monkey
        So it should probably return a 204 if there's only data for the last week but not for the current week
      • lucifer
        i think the data of a single week should still display a report. we use data of two weeks so that we can compare the last two weeks but if that data is not there then displaying just one week's chart is fine.
      • monkey
        I see. Let me try that then
      • Yes, that makes a lot more sense, thanks for the help
      • Yeah, it's not perfect but it sure is better than a broken page (note the presence of the orange square in the caption, but without dates next to it)
      • BrainzGit
        [metabrainz.org] mayhem merged pull request #360 (master…irc): Update IRC links to point to Libera.Chat https://github.com/metabrainz/metabrainz.org/pu...
      • lucifer
        indeed. I think this is fine for our use case.
      • BrainzGit
        [metabrainz.org] mayhem merged pull request #361 (master…nicks): Update nicks of team members as on Libera.Chat https://github.com/metabrainz/metabrainz.org/pu...
      • [listenbrainz-server] MonkeyDo opened pull request #1482 (master…monkey-sentry-119844): Make user reports JS component more robust https://github.com/metabrainz/listenbrainz-serv...
      • lucifer
        monkey: is this the only component that was causing issues? do other charts work fine?
      • monkey
      • lucifer
        right, sorry i meant other reports on this page. heatmap and world map etc.
      • monkey
        Yes, they work fine. It's really just the way we tried to access the timestamps that was throwing an error
      • lucifer
        ah! makes sense.
      • atj
        I like the bold/coloured text support, makes those messages much more readable
      • is that an ircd thing?
      • monkey
        lucifer: Oh, and we can ignore the failing front-end test, it's unrelated. I'm working on a separate PR for that
      • lucifer
        sure, i'll merge and deploy then?
      • monkey
        (another one of those "your snapshot contains a date string and each time you run the tests your date string will change, failing the test" kind of deal)
      • shivam-kapila
        atj: I get the option in android app but not on desktop client
      • atj
        shivam-kapila: use your powers for good :P
      • BrainzGit
        [listenbrainz-server] MonkeyDo merged pull request #1482 (master…monkey-sentry-119844): Make user reports JS component more robust https://github.com/metabrainz/listenbrainz-serv...
      • monkey
        boop
      • atj
        that's working for me in weechat, but it didn't work on freenode
      • BrainzGit
        [listenbrainz-server] release v-2021-05-25.0 has been published by github-actions[bot]: https://github.com/metabrainz/listenbrainz-serv...
      • Rotab
        atj: the channel had +c on freenode
      • atj
        rotab: I've forgotten how all these channel modes work, I assume that prevents colours