alastairp: the timeout thing might be doable not completely sure though. so like store the latest message in ts writer, if it times out and is redelivered do something about it? or do you mean if rabbitmq could automatically do this for us?
ok, I'm forwarding a schema change question to support@ then
2022-05-25 14524, 2022
reosarevok
Works for me
2022-05-25 14532, 2022
mayhem
atj: akshaaatt yvanzo bitmap alastairp monkey outsidecontext and anyone with a @meb email address. If you'd like a metabrainz dropbox account with loads of storage, go ahead and use your @meb email address to sign up for the dropbox account and you'll be automatically added to the team.
2022-05-25 14541, 2022
akshaaatt
Sounds cool mayhem!
2022-05-25 14558, 2022
mayhem
yeah, go for it. our friend at dropbox gave us a free business account.
2022-05-25 14547, 2022
alastairp
thanks mayhem
2022-05-25 14554, 2022
mayhem
np!
2022-05-25 14513, 2022
alastairp
lucifer: hmm, adding some extra checks via tswriter sounds like trouble and complexity
2022-05-25 14542, 2022
alastairp
I was just thinking out loud that it'd be great if rmq had a "you're _about_ to be disconnected" callback, rather than just "you've been disconnected"
2022-05-25 14524, 2022
alastairp
any thoughts on the comment that I added on LB#2009?
the documentation about max listen payload and the check that we do are different
2022-05-25 14545, 2022
alastairp
so we should update one or the other
2022-05-25 14511, 2022
alastairp
any thoughts on how we can select the largest listen payload from the database? can you do strlen on a json column in pg?
2022-05-25 14534, 2022
mayhem
alastairp: slowly, yes.
2022-05-25 14557, 2022
atj
so is my understanding correct, in that the issue last night was someone submitting a request with 70k listens which effectively DoS'd the application?
we have a 10kb per listen upper limit, but our check is that the average size of all listens in the payload is less than 10kb, not the full message
2022-05-25 14534, 2022
lucifer
re size of data, yeah pg_column_size may work, probably use it to find the largest listen and then do actual length on it because pg might be compressing jsonb and the size might be less than actual then.
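A rough illustration of why a stored size can be smaller than the text length, as suggested above. This is only a sketch: zlib stands in for whatever compression PostgreSQL applies, jsonb is a binary format rather than compressed text, and the listen document here is made up, so the numbers are not comparable to real `pg_column_size` output.

```python
import json
import zlib

# Hypothetical listen-like document with repetitive content, which compresses well.
listen = {
    "track_metadata": {
        "artist_name": "Example Artist",
        "track_name": "Example Track",
        "additional_info": {"tags": ["example"] * 500},
    }
}

text = json.dumps(listen)

# Roughly what length(data::text) would report: characters in the text form.
text_length = len(text)

# Stand-in for a compressed on-disk size (assumption: zlib, not PostgreSQL's own method).
stored_size = len(zlib.compress(text.encode("utf-8")))

print(text_length, stored_size)
```

Ordering by the stored size can therefore only shortlist candidates; the actual length still has to be measured on the shortlisted rows.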
2022-05-25 14525, 2022
lucifer
regarding the check, i am not sure why it was added. is it there to prevent overwhelming something api side or db side?
2022-05-25 14531, 2022
alastairp
or maybe we can estimate
2022-05-25 14532, 2022
atj
seems a weird check
2022-05-25 14538, 2022
atj
"the average size of all listens in the payload is less than 10kb"
2022-05-25 14554, 2022
alastairp
I think this may have come about because of our changes in technologies
2022-05-25 14500, 2022
lucifer
if the former then a per document size makes sense but if the latter then a per listen size limit is sensible.
2022-05-25 14522, 2022
alastairp
that check may have been added before we started using rabbitmq (and so before we could have multiple listens per message)
I guess it's just a simpler way of doing the check
2022-05-25 14505, 2022
alastairp
of "each listen < 10k in size"
2022-05-25 14535, 2022
alastairp
because otherwise you'd need to convert body json -> py dict, then iterate through it, then for each item convert back to json to check its size in bytes
so multiple listens existed that time too. probably that way for easier checking indeed
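The difference between the two checks discussed above can be sketched as follows. This is a minimal sketch, not the ListenBrainz code: the function names, the `payload` key, and the exact value of the limit are assumptions made for illustration.

```python
import json
from typing import List

MAX_LISTEN_SIZE = 10240  # assumed byte value for the "10kb" limit


def average_size_ok(raw_body: bytes, listen_count: int) -> bool:
    """Average check: total body size divided by the number of listens."""
    return len(raw_body) / listen_count < MAX_LISTEN_SIZE


def oversized_listens(raw_body: bytes) -> List[int]:
    """Per-listen check: parse the body, re-serialise each listen, measure its bytes."""
    payload = json.loads(raw_body)
    return [
        index
        for index, listen in enumerate(payload["payload"])
        if len(json.dumps(listen).encode("utf-8")) >= MAX_LISTEN_SIZE
    ]


# One very large listen hidden among many small ones (made-up data).
small = {"track_metadata": {"artist_name": "a", "track_name": "t"}}
large = {"track_metadata": {"artist_name": "a", "track_name": "x" * 20000}}
body = json.dumps({"payload": [large] + [small] * 99}).encode("utf-8")

print(average_size_ok(body, 100))  # the average check passes
print(oversized_listens(body))     # the per-listen check flags index 0
```

The average check only needs the raw body length, which is why it is cheaper, but a single oversized listen slips through when it is padded out with small ones.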
2022-05-25 14541, 2022
lucifer
alastairp: `select *, pg_column_size(data) AS size from listen ORDER BY size DESC LIMIT 5;` to find largest listen. then `select length(data::text) from listen where listened_at = 1651504402 AND user_id = 8741;` for its length. length is 9331.
2022-05-25 14524, 2022
alastairp
commit message by mayhem: "That is probably enough sanity checking for this minute."
2022-05-25 14530, 2022
alastairp
I mean, at the time it probably was
2022-05-25 14513, 2022
alastairp
lucifer: yes, multiple listens in an API payload existed. but I was talking about what happens after the API endpoint
2022-05-25 14540, 2022
alastairp
I think that at that time we may have split the payload up into messages of 1 listen each
alastairp: that's what the SO answer from yesterday did?
2022-05-25 14523, 2022
alastairp
hi jesus2099!
2022-05-25 14507, 2022
alastairp
lucifer: yes, the answer from yesterday suggested a new format for source dependencies (PEP 508 I understand), but that's still different from what requirements.txt needs
2022-05-25 14540, 2022
lucifer
i see, makes sense to rewrite then
2022-05-25 14556, 2022
alastairp
so... we either need to manage these lists manually, convert automatically, or switch to poetry/pyproject.toml for local development so that we can reuse the same file for local dev and remote installs
2022-05-25 14555, 2022
alastairp
agreed that automatically converting when inserting into setup.py is probably the easiest idea for now - and it's only temporary until new mbdata is released, I guess
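A possible shape for the automatic conversion, as a sketch only. It assumes the requirements file lists source dependencies in the `<vcs url>#egg=<name>` form and that setup.py wants the PEP 508 `name @ url` form; the function names and the example URL are hypothetical, and editable (`-e`) lines or extra URL fragments are not handled.

```python
from typing import Iterable, List


def to_pep508(requirement_line: str) -> str:
    """Turn '<url>#egg=<name>' into '<name> @ <url>'; leave other lines unchanged."""
    line = requirement_line.strip()
    if "://" in line and "#egg=" in line:
        url, _, name = line.partition("#egg=")
        return f"{name} @ {url}"
    return line


def convert_requirements(lines: Iterable[str]) -> List[str]:
    """Convert requirement lines, skipping blank lines and comments."""
    converted = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        converted.append(to_pep508(line))
    return converted


example_lines = [
    "# pinned dependencies",
    "requests>=2.0",
    "git+https://example.org/some/repo.git@v1.0#egg=somepkg",  # hypothetical URL
]
print(convert_requirements(example_lines))
```

The result could then be passed as `install_requires` in setup.py, so that only one list has to be maintained by hand.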
2022-05-25 14551, 2022
lucifer
if poetry/pyproject.toml handle this better then makes sense to migrate to those in future.
2022-05-25 14514, 2022
CatQuest
..
2022-05-25 14520, 2022
lucifer
indeed, the current situation should be temporary
2022-05-25 14538, 2022
CatQuest
no. i refuse. you can't call a foss project "poetry" that's just .. how even
2022-05-25 14546, 2022
CatQuest
sigh
2022-05-25 14526, 2022
alastairp
yeah, though I'm just worried that we'll move to a new dependency tool just as a new "better" one gets released
2022-05-25 14527, 2022
alastairp
lol
2022-05-25 14554, 2022
skelly37
outsidecontext, zas: FIFO seems to work for unix, you might want to take a look and review the bare protocol and ideas behind it: https://github.com/skelly37/pipethon. In the evening I'll take care of doing the same but with Windows API and then it's ready to be implemented in Picard.
2022-05-25 14520, 2022
alastairp
skelly37: great work getting this far so soon!
2022-05-25 14542, 2022
alastairp
of course, as they say the first 90% is easy, it's the second 90% that takes most of the time :)
2022-05-25 14549, 2022
atj
pipes work on windows?
2022-05-25 14502, 2022
skelly37
alastairp: thanks :)
2022-05-25 14503, 2022
alastairp
atj: "but with windows API" - I guess not
2022-05-25 14521, 2022
alastairp
ansh: by the way, I didn't confirm with you the other day, but as mayhem pointed out starting early is fine. let me know when you want to discuss this. I was chatting with monkey and we think that it makes sense that I work with you directly when you're working on CB parts, and he works with you when doing BB parts. perhaps the 3 of us could get together in the next week or so and go over your plan again
2022-05-25 14517, 2022
skelly37
atj: It requires pywin32 module, os.mkfifo() is unfortunately unix only.
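For reference, a minimal sketch of the Unix side using `os.mkfifo()`. It is not the pipethon protocol, only an illustration of a named pipe round trip; the helper names and the pipe file name are made up, and on Windows the guard below raises instead of creating a pipe.

```python
import os
import tempfile
import threading


def send(path: str, message: str) -> None:
    # Opening a FIFO for writing blocks until a reader opens the other end.
    with open(path, "w") as pipe:
        pipe.write(message)


def receive(path: str) -> str:
    # read() returns once the writer has closed its end of the pipe.
    with open(path) as pipe:
        return pipe.read()


def roundtrip(message: str) -> str:
    if not hasattr(os, "mkfifo"):
        raise NotImplementedError("os.mkfifo is only available on Unix")
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example_pipe")  # hypothetical pipe name
        os.mkfifo(path)
        # The writer runs in a thread so that both ends can be opened at once.
        writer = threading.Thread(target=send, args=(path, message))
        writer.start()
        try:
            return receive(path)
        finally:
            writer.join()


print(roundtrip("hello"))
```

In a real setup the reader and writer would be separate processes, for example a running instance and a newly launched one passing it arguments.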