ruaok: the dump was successful but for some reason spark didn't import it. `request_import_incremental` without supplying `--id` seems to be borked on the spark side.
I see the issue: the current code assumes that all dump ids are sequential. it starts searching for incremental dumps with ids after the last full dump and checks whether each exists; if one doesn't, it stops looking further. in this case 431 is the id of the full dump, but the incremental dumps with ids 435 and 436 are missing, so it believes there are no more dumps to import and returns.
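A minimal sketch of that failure mode in Python; the function name and the set of available dump ids are illustrative stand-ins, not the actual ListenBrainz code:

```python
# Hypothetical set of available incremental dump ids; 435 and 436 are missing.
AVAILABLE_DUMPS = {432, 433, 434, 437, 438}

def find_dumps_to_import(last_full_dump_id: int) -> list[int]:
    """Mimic the current logic: walk ids upward from the last full dump
    and stop at the first id that does not exist."""
    to_import = []
    dump_id = last_full_dump_id + 1
    while dump_id in AVAILABLE_DUMPS:  # the first gap ends the search
        to_import.append(dump_id)
        dump_id += 1
    return to_import

print(find_dumps_to_import(431))  # [432, 433, 434] -- 437 and 438 are never imported
```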
2021-05-25 14517, 2021
shivam-kapila
maybe try the next 5 IDs and then gracefully stop.
2021-05-25 14557, 2021
lucifer
that sounds brittle. but it also depends on why those particular ids are missing.
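For reference, the lookahead idea would look roughly like this sketch (the window size and the `AVAILABLE_DUMPS` set are assumptions); as lucifer says, it only helps when gaps never exceed the window:

```python
AVAILABLE_DUMPS = {432, 433, 434, 437, 438}  # hypothetical; 435 and 436 missing
LOOKAHEAD = 5  # give up only after 5 consecutive missing ids

def find_dumps_with_lookahead(last_full_dump_id: int) -> list[int]:
    to_import = []
    dump_id = last_full_dump_id + 1
    misses = 0
    while misses < LOOKAHEAD:
        if dump_id in AVAILABLE_DUMPS:
            to_import.append(dump_id)
            misses = 0   # reset the window on every hit
        else:
            misses += 1  # a gap, but keep probing
        dump_id += 1
    return to_import

print(find_dumps_with_lookahead(431))  # [432, 433, 434, 437, 438]
```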
2021-05-25 14525, 2021
ruaok
> I see the issue: the current code assumes that all dump ids are sequential. it starts searching for incremental dumps with ids after the last full dump and checks whether each exists; if one doesn't, it stops looking further. in this case 431 is the id of the full dump, but the incremental dumps with ids 435 and 436 are missing, so it believes there are no more dumps to import and returns.
2021-05-25 14517, 2021
ruaok
lucifer: this is what I was saying last week -- this requirement is not really needed for us -- the referential integrity of the dumps is not compromised by skipping a dump. there will just be a gap in the data.
2021-05-25 14541, 2021
ruaok
we should do two things: 1) go ahead and import the dumps anyway 2) complain to high heaven that a dump is missing.
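A sketch of those two behaviours together, assuming the latest dump id is already known from somewhere (which is exactly what the next messages turn to); the helpers here are hypothetical:

```python
import logging

logger = logging.getLogger(__name__)
AVAILABLE_DUMPS = {432, 433, 434, 437, 438}  # hypothetical

def import_all_available(last_full_dump_id: int, latest_dump_id: int) -> None:
    for dump_id in range(last_full_dump_id + 1, latest_dump_id + 1):
        if dump_id in AVAILABLE_DUMPS:
            print(f"importing dump {dump_id}")  # stand-in for the real import
        else:
            # skipping a dump doesn't break referential integrity, it just
            # leaves a gap in the data -- so scream, but keep going
            logger.error("incremental dump %d is missing!", dump_id)

import_all_available(431, 438)
```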
2021-05-25 14555, 2021
lucifer
ruaok: right, the issue here is that the code treats a missing dump as a signal to stop looking further. if we start ignoring gaps, we need another way to find out which dump is the latest.
2021-05-25 14510, 2021
lucifer
hence, the suggestion about the latest file.
2021-05-25 14547, 2021
ruaok
timestamps.
2021-05-25 14515, 2021
ruaok
but, if you prefer, a LATEST file. I hate implementing those. dunno why
2021-05-25 14512, 2021
lucifer
i see, like the check_ftp_age command.
2021-05-25 14515, 2021
lucifer
that's doable. i suggested a LATEST file because MB does it that way, and they seem to have way less trouble with dumps than us.
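The MB-style approach could be as small as the sketch below, assuming the dump producer writes the newest dump id to a well-known file (the URL is made up for illustration):

```python
from urllib.request import urlopen

# Assumed location; the real dump server layout may differ.
LATEST_URL = "http://ftp.example.org/pub/listenbrainz/incremental/LATEST"

def get_latest_dump_id() -> int:
    """Read back the id of the newest incremental dump."""
    with urlopen(LATEST_URL) as response:
        return int(response.read().decode("utf-8").strip())
```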
2021-05-25 14550, 2021
ruaok
lucifer: that's because I had all this trouble and more with them over 15 years ago.
2021-05-25 14532, 2021
ruaok
lucifer: the import command can be a lot simpler really. our dump file names sort nicely, by design.
2021-05-25 14551, 2021
ruaok
just import to the end of the list and be done.
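Something along these lines; the file names are invented for illustration, the point being that a plain lexicographic sort already puts them in dump order:

```python
# Hypothetical directory listing of incremental dumps.
files = [
    "listenbrainz-dump-433-20210519-incremental.tar.xz",
    "listenbrainz-dump-432-20210518-incremental.tar.xz",
    "listenbrainz-dump-437-20210523-incremental.tar.xz",
]
last_imported = "listenbrainz-dump-432-20210518-incremental.tar.xz"

# Sort, then import everything that comes after the last imported name.
for name in sorted(files):
    if name > last_imported:
        print(f"importing {name}")  # stand-in for the real import
```

(This relies on the ids and dates in the names being zero-padded to a fixed width, so string order matches numeric order.)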
2021-05-25 14500, 2021
lucifer
oh! makes sense now why MB dumps are more stable.
2021-05-25 14513, 2021
lucifer
indeed, it can be a lot simpler.
2021-05-25 14529, 2021
lucifer
we have some additional complexity on the spark side. we store the id and import timestamp of all the imported dumps in hdfs. then while importing we do some non-trivial stuff to figure out which dumps to import.
2021-05-25 14553, 2021
ruaok
nuke it all.
2021-05-25 14510, 2021
ruaok
that logic is fitting for some contexts, but not really for our own spark context.
2021-05-25 14511, 2021
lucifer
storing metadata is probably helpful for debugging but isn't needed at all during import.
2021-05-25 14537, 2021
ruaok
debugging and process monitoring.
2021-05-25 14551, 2021
ruaok
if we miss a dump or dumps are out of date, we need to scream.
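The screaming half could be a simple freshness check in the monitoring path; a sketch under an assumed daily cadence, with a plain exception standing in for real alerting:

```python
from datetime import datetime, timedelta, timezone

MAX_DUMP_AGE = timedelta(days=2)  # assumed: incremental dumps normally arrive daily

def check_dump_freshness(last_import_time: datetime) -> None:
    """Raise loudly if the newest imported dump is older than expected."""
    age = datetime.now(timezone.utc) - last_import_time
    if age > MAX_DUMP_AGE:
        # stand-in for real alerting (email, IRC bot, pager, ...)
        raise RuntimeError(f"dumps are out of date: last import was {age} ago")
```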
2021-05-25 14527, 2021
lucifer
indeed.
2021-05-25 14551, 2021
ruaok
monkey: can you please work on determining why our LB reports are broken? I'd like to deploy a hotfix ASAP.
2021-05-25 14516, 2021
monkey
Sure, let me look
2021-05-25 14533, 2021
ruaok
thx
2021-05-25 14543, 2021
monkey
Yeah, I see a `TypeError: Cannot read property 'from_ts' of undefined`
Yeah, the code in static/js/src/stats/UserListeningActivity.tsx is quite fragile. We didn't see any of the errors until a month ago when I set it up to catch and report errors.
2021-05-25 14523, 2021
monkey
I'll see if I can make the whole component more robust
2021-05-25 14549, 2021
monkey
I'm not very familiar with that component at all, but I suppose if we don't have all the data needed for the current week (which seems to be what currently breaks the page), I should just show an error message for that range?
2021-05-25 14542, 2021
lucifer
monkey: in that case i think we should just display 0 for the days for which data is missing.
2021-05-25 14504, 2021
monkey
Shouldn't the API return 0 for days without listens, in normal cases?
2021-05-25 14522, 2021
monkey
Doesn't the complete absence of data indicate that there's an error?
2021-05-25 14557, 2021
lucifer
i am not sure how it works, i.e. whether we coalesce the missing data on the frontend or in the api.
missing data is indeed an error but we used to display outdated reports instead of erroring.
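Whichever side ends up doing it, the coalescing itself is just densifying a sparse series; a sketch with an invented data shape:

```python
from datetime import date, timedelta

def zero_fill(counts: dict[date, int], start: date, end: date) -> list[dict]:
    """Expand sparse per-day listen counts so missing days come back as 0."""
    series = []
    day = start
    while day <= end:
        series.append({"date": day.isoformat(), "listen_count": counts.get(day, 0)})
        day += timedelta(days=1)
    return series

sparse = {date(2021, 5, 24): 12}  # only one day has listens
print(zero_fill(sparse, date(2021, 5, 24), date(2021, 5, 26)))
# [{'date': '2021-05-24', 'listen_count': 12},
#  {'date': '2021-05-25', 'listen_count': 0},
#  {'date': '2021-05-26', 'listen_count': 0}]
```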
2021-05-25 14530, 2021
monkey
The code sort of relies on there being *some* data so that it can display the start and end dates of the range to the user, but if there's no data at all there's no point anyway
2021-05-25 14535, 2021
monkey
Ah?
2021-05-25 14514, 2021
lucifer
i think there is some bug in week calculation somewhere.
2021-05-25 14518, 2021
monkey
Outdated reports… Well, the code hasn't changed recently; we're now just catching JS errors. I'm not sure how the mechanism of showing older reports worked
2021-05-25 14530, 2021
lucifer
yeah, me neither.
2021-05-25 14550, 2021
lucifer
imo, ideally the api should send the data of the last two weeks.
2021-05-25 14511, 2021
monkey
That would work, and display things correctly.
2021-05-25 14542, 2021
lucifer
whichever last two weeks it has, doesn't matter whether they're current or not.
2021-05-25 14532, 2021
lucifer
i think we might be able to fix it for now by generating new stats.
2021-05-25 14534, 2021
lucifer
i imported yesterday's data into spark this morning; if i request new stats now there will be data for the current week.
2021-05-25 14551, 2021
lucifer
that'll give us some time to think about how to solve this correctly.
2021-05-25 14510, 2021
monkey
Wait before you do that please
2021-05-25 14523, 2021
monkey
It's helpful for me for testing to have it in broken state
2021-05-25 14546, 2021
lucifer
sure
2021-05-25 14535, 2021
monkey
So in summary there are two separate issues:
2021-05-25 14535, 2021
monkey
1. The API isn't returning any data for the current week
2021-05-25 14535, 2021
monkey
2. As a result, some parts of the JS page break (we weren't previously aware of this as we didn't have Sentry reporting it)
2021-05-25 14502, 2021
lucifer
yes, right.
2021-05-25 14523, 2021
monkey
As we discussed, I'll make sure the front-end recovers when data it expects is missing; that'll be a start.
2021-05-25 14523, 2021
monkey
Which means maybe there's nothing special to do for error #1 other than actually calculating the data (meaning we don't need to ensure we return a provisional 0 listens for each day of the current week up until then)
2021-05-25 14557, 2021
monkey
Or do we want to ensure the API calls always return some form of data for the current week?
2021-05-25 14525, 2021
monkey
(Or returns an error if we somehow don't have the data?)
2021-05-25 14516, 2021
lucifer
i see the API does return a 204 if there is no data.
2021-05-25 14554, 2021
monkey
But it isn't currently returning a 204 as there is data from last week, even though there is no data for this week
2021-05-25 14504, 2021
monkey
(I'm talking about the 'week' range here)
2021-05-25 14525, 2021
lucifer
right, according to my understanding it sends data for both weeks in a single response.
2021-05-25 14556, 2021
monkey
So it should probably return a 204 if there's only data for the last week but not for the current week
2021-05-25 14521, 2021
lucifer
i think a single week's data should still display a report. we use two weeks of data so that we can compare the last two weeks, but if that data is not there then displaying just one week's chart is fine.
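So the handler shape being described is roughly the Flask-style sketch below (the route, payload shape, and `load_weeks` helper are assumptions, not the actual ListenBrainz API): 204 only when there is nothing at all, otherwise return whatever weeks exist and let the frontend draw one or two.

```python
from flask import Flask, jsonify

app = Flask(__name__)

def load_weeks(user_name: str) -> list[dict]:
    # Stand-in for the real stats lookup; may yield 0, 1 or 2 weeks of data.
    return [{"week": "last", "total_listens": 123}]

@app.route("/1/stats/user/<user_name>/listening-activity")
def listening_activity(user_name: str):
    weeks = load_weeks(user_name)
    if not weeks:
        return "", 204  # no data at all
    # One week is enough to draw a chart; two enables the week-over-week comparison.
    return jsonify({"payload": {"listening_activity": weeks}})
```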
2021-05-25 14519, 2021
monkey
I see. Let me try that then
2021-05-25 14559, 2021
monkey
Yes, that makes a lot more sense, thanks for the help
2021-05-25 14557, 2021
monkey
Yeah, it's not perfect but it sure is better than a broken page (note the presence of the orange square in the caption, but without dates next to it)
lucifer
right, sorry i meant the other reports on this page, heatmap and world map etc.
2021-05-25 14529, 2021
monkey
Yes, they work fine. It's really just the way we tried to access the timestamps that was throwing an error
2021-05-25 14546, 2021
lucifer
ah! makes sense.
2021-05-25 14558, 2021
atj
I like the bold/coloured text support, makes those messages much more readable
2021-05-25 14503, 2021
atj
is that an ircd thing?
2021-05-25 14546, 2021
monkey
lucifer: Oh, and we can ignore the failing front-end test, it's unrelated. I'm working on a separate PR for that
2021-05-25 14518, 2021
lucifer
sure, i'll merge and deploy then?
2021-05-25 14537, 2021
monkey
(another one of those "your snapshot contains a date string and each time you run the tests your date string will change, failing the test" kind of deal)
2021-05-25 14554, 2021
shivam-kapila
atj: I get the option in the android app but not on the desktop client