#metabrainz


      • minimal has quit
      • minimal joined the channel
      • saumon has quit
      • saum0n joined the channel
      • coolasf998[m] has quit
      • minimal has quit
      • derat[m]
        Bob Swift: i'm seeing the same thing. 302 redirect from https://translations.metabrainz.org/ to https://translations.metabrainz.org/accounts/lo..., but then it stalls
      • vardhan__ joined the channel
      • krishnacosmic[m] has quit
      • vardhan__ has quit
      • HemangMishra[m] has quit
      • vardhan__ joined the channel
      • outsidecontext[m joined the channel
      • outsidecontext[m
        Bob Swift, derat I had this also yesterday. I don't know exactly what I did, but I reloaded, navigated back to translations.metabrainz.org or some such, and eventually I got in.
      • But weblate has become rather slow. So probably they have some technical trouble.
      • Kladky joined the channel
      • Kladky has quit
      • Sophist-UK has quit
      • Kladky joined the channel
      • BrainzGit
        [bookbrainz-site] 14KasukabeDefenceForce opened pull request #1164 (03master…tabview): BB-542: Create a tab view for entities https://github.com/metabrainz/bookbrainz-site/p...
      • Kladky has quit
      • jasje[m] has quit
      • void09_ has quit
      • void09 joined the channel
      • yano has quit
      • yano joined the channel
      • Kladky joined the channel
      • julian45[m] has quit
      • kellnerd[m] has quit
      • BobSwift[m]
        <outsidecontext[m> "But weblate has become rather..." <- That could have been it. I just tried now and was able to get in okay.
      • dvirtz[m] has quit
      • Jade[m] joined the channel
      • Jade[m]
        Hey all! Assignments are done, half term has started and I now have much more free time!
      • I'm writing up my GSoC proposal right now and wanted to ask which [BrainzBot plugins](https://github.com/metabrainz/brainzbot-core/tree/master/botbot_plugins/plugins) people use?
      • The ones I see the most are GitHub, Jira and !m
      • minimal joined the channel
      • mayhem[m] waves at Jade
      • mayhem[m]
        lucifer: ^^
      • lucifer[m]
        Jade: not sure. but i think the three you mentioned are probably it.
      • fwiw if this is about the Matrix archiver, I think you won't need to implement any plugins. the archiver itself would be enough.
      • Jade[m]
        Yeah, I've just posted my proposal, sans-timeline.
      • <lucifer[m]> "fwiw if this is about the Matrix..." <- The Jira and `!m` plugins are pretty simple! `!m` is about 6 LoC. There's a pre-existing GitHub plugin. Might as well go for a complete replacement, given that!
      • And maubot is pretty easy to host, too. It uses sqlite, and has a web interface you can use to upload plugins and manage bot accounts.
      • Of course, a secondary priority compared to the rest of the project :)
      • vardhan__ has quit
      • lucifer[m]
        Jade: I think doing this project in rust is a bad idea. There is no benefit to it over doing it in python, and doing it in rust makes maintaining it in the future hard.
      • Flask + React for frontend is the preferred tech stack for this project.
      • as that's the common tech stack between most of our projects.
      • julian45[m] joined the channel
      • julian45[m]
        <lucifer[m]> "Jade: I think doing this project..." <- i would counter that there is at least some benefit to doing it in rust: as far as sdks from the matrix foundation itself go, `matrix-rust-sdk` is [considered production-ready](https://github.com/matrix-org/matrix-rust-sdk?tab=readme-ov-file#status), esp. as it backs prominent client implementations; on the other hand, `matrix-python-sdk` seems to be [effectively unmaintained](https://github.com/matrix-org/matrix-python-sdk?tab=readme-ov-file#project-status), though there is a [separate library referenced in the readme](https://github.com/matrix-nio/matrix-nio) which seems to see more maintenance
      • Jade[m]
        <lucifer[m]> "Jade: I think doing this project..." <- I went into more detail on the justification in the pre-proposal, but julian45 has the main justification for not using python
      • julian45[m]: (One person I know working with matrix-nio is very much regretting it, too)
      • julian45[m]
        may i ask how so?
      • because i do think lucifer's point that python is already used across a lot of our projects' stacks is important to consider
      • Jade[m]
        julian45[m]: It's also relatively unmaintained and has multiple bugs with handling state. Apparently the sans-io design makes it very difficult to maintain and fix these bugs, too
      • julian45[m]: Absolutely
      • lucifer[m]
        julian45: Jade: we can use the http api directly if the python sdks are unmaintained
      • Jade[m]
        lucifer[m]: That's... a bad idea. The Matrix API is complex and changes steadily over time in both significant and subtle ways.
      • I didn't really consider React, because the aim was to create static HTML files. React's not really intended for that use case, although it can do it
      • But the idea with picking Rust is it would be very low maintenance, like the mail service has been
      • lucifer[m]
        if static HTML + jinja2 suffices then it's fine
      • Jade[m]
        MeB also has quite a few Rust projects, and that + Rust having a far better SDK situation + me personally knowing the Rust ecosystem better was enough for me to choose it
      • lucifer[m]: Yeah, that would have been what I chose
      • The most maintained Python SDK is the one maubot uses
      • So mautrix-python + jinja2 + sqlite might be a viable stack for this?
      • lucifer[m]
        FWIW, the matrix devs at FOSDEM suggested I just use the HTTP API if it works for us, but if there is a maintained sdk then that makes sense of course.
      • sqlite sounds good but I am not familiar with how good its full text search is. if there are no special reasons, postgres makes more sense to me.
      • suvid[m] has quit
      • Jade[m]: in my opinion, yes. but others might differ so let's hear from them as well.
      • julian45[m]
        lucifer[m]: postgres must run as a separate service; as jade pointed out in her [pre-proposal](https://community.metabrainz.org/t/gsoc-pre-proposal-matrix-archiver/748262/6), that means "a bunch of data that's separate, easy to forget to backup, and not so easily accessible to the public anymore... chances are that a whole bunch of state ends up stored there."
      • lucifer[m]
        julian45: if we store the data in the main postgres cluster, all the data will be backed up properly.
      • a different database in the same postgres cluster.
      • (sorry i haven't read the pre-proposal yet so don't have all the context.)
      • Jade[m]
        <Jade[m]> "https://github.com/mautrix/..." <- More context on this one is that it powers Beeper's telegram and google chat bridges. However, Beeper/Mautrix have been rewriting their bridges in Go for a while now, and the Telegram rewrite is almost done. That would leave the SDK as community supported, and there's a fair chance it could fall into the same state as the other Python SDKs
      • julian45[m]
        lucifer[m]: i think this could be decently viable, especially considering that the secondary element of the proposed project (maubot implementation) is within the same development family as mautrix-python, so could result in increased reuse of common fundamental components, _but_ at the same time that would need to be weighed against the risk/reward tradeoffs of the proposed rust-based approach
      • Jade[m]: ah, good to know
      • lucifer[m]
        Jade: while generating HTML pages is fine, i also think we should be storing all of the data in the database.
      • Jade[m]
        Basically the most supported and complete SDKs are the Rust SDK (Matrix Foundation / Element), Go (Mautrix/beeper), and the Javascript SDK (Matrix Foundation / Element).
      • lucifer[m]
        so that if we want to regenerate the html pages at any time with a new UI or fix, we can use the data from the database. also it would help in adding all sorts of filtering.
      • Jade[m]
        lucifer[m]: Regenerating from the database is a pretty good reason. I kind of ignored this because the Rust SDK maintains a transparent event cache that avoids calling back to the server when the data is already known
      • lucifer[m]
        is it an in memory cache?
      • Jade[m]
        It would be much more important with the python SDK, as that doesn't have a cache AFAICT
      • A SQLite cache on disk
      • lucifer[m]
        i see, is that the sqlite db in question?
      • Jade[m]
        Jade[m]: It also uses that to maintain encryption state, although that's less relevant for bots
      • lucifer[m]: Which do you mean?
      • lucifer[m]
        yeah, i think encryption is not an issue for us since all of the data is public anyway.
      • Jade[m]
        <lucifer[m]> "sqlite sounds good but I am..." <- To answer the first part of this, it's... OK? Not as good as dedicated search, but not bad
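For context on the SQLite full-text search question, here is a minimal sketch using SQLite's FTS5 extension from Python's standard `sqlite3` module. It assumes the local SQLite build has FTS5 enabled; the table name, column names, and sample messages are hypothetical and not taken from any proposal.

```python
import sqlite3


def build_index(messages):
    """Create an in-memory FTS5 index from (sender, body) pairs."""
    conn = sqlite3.connect(":memory:")
    # FTS5 virtual table (assumes the SQLite build ships FTS5).
    # `sender` is stored but not tokenised; only `body` is searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE messages USING fts5(sender UNINDEXED, body)"
    )
    conn.executemany(
        "INSERT INTO messages (sender, body) VALUES (?, ?)", messages
    )
    return conn


def search(conn, query):
    """Return (sender, body) rows matching an FTS5 query, best match first."""
    return conn.execute(
        "SELECT sender, body FROM messages WHERE messages MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()


# Hypothetical sample data.
conn = build_index([
    ("outsidecontext", "But weblate has become rather slow"),
    ("lucifer", "postgres makes more sense to me"),
])
hits = search(conn, "weblate")
```

This only shows plain term matching with the default tokenizer; a dedicated search engine would add more on top.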
      • lucifer[m]
        Jade[m]: i mean is this the only SQLite db being used in this project or is this an internal sdk one and there will be another one used by the archiving service?
      • bitmap[m]
        <Jade[m]> "I didn't really consider React..." <- the latest release kinda has that use case fully baked in with [server components](https://react.dev/reference/rsc/server-components), though if you're using rust or python then bringing node into the mix doesn't seem worth it
      • Jade[m]
        So there are three kind of distinct points:... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/...>)
      • bitmap[m]: Yep. But as you said, it doesn't seem worth it - it's a tonne of extra complexity compared to jinja templates or similar
      • bitmap[m]
        +1 to keeping the tech stack simple and avoiding additional services unless necessary
      • lucifer[m]
        I see, makes sense. Will any matrix messages data be stored in a database outside the sdk if we use the rust one?
      • Jade[m]
        The search index would have the full text, although not in the matrix format. Each event's JSON would be stored in the HTML, but otherwise there isn't much need to store it anywhere else
      • lucifer[m]
        I don't like the idea of storing the data just in the HTML files.
      • It makes future improvements to the project much harder.
      • Jade[m]
        There would be no need to read from the HTML files in any normal situation
      • lucifer[m]
        I think regardless of whether the rust or python sdk is used, the data should be stored in an external database.
      • Jade[m]: Yes but it would make it harder to make any future changes to the UI or fixes once the files have been generated.
      • Jade[m]
        As long as any matrix server is running, the data can be pulled from the matrix server. During normal running of the server, it would be continuously regenerating old files as things like edits, redactions and reactions come in
      • It's not a one-and-done generation
      • So that shouldn't be a factor at all
      • lucifer[m]
        I am not fully sure how the search will work with tantivy but think of a case where you want to filter out messages by a single user across a month.
      • Jade[m]
        my assumption is if chatbrainz is shut down we won't be updating this project much anymore 😅
      • lucifer[m]
        Jade[m]: Yes but that is additional indexing work that needs to be done for every change. if you store a copy of the data then any changes to the UI can be made directly without reindexing all the data.
      • Jade[m]
        lucifer[m]: `user:jade@ellis.link search term`, and it does have indexes for date and time that we'd be filling out for jump-to-date anyway
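To illustrate the `user:... search term` query shape mentioned above, here is a small sketch that splits such a string into filters and free-text terms. This is not tantivy's own query parser; `ParsedQuery`, `KNOWN_FILTERS`, and `parse_query` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field

# Hypothetical set of recognised filter prefixes.
KNOWN_FILTERS = {"user"}


@dataclass
class ParsedQuery:
    filters: dict = field(default_factory=dict)
    terms: list = field(default_factory=list)


def parse_query(raw: str) -> ParsedQuery:
    """Split a query into `key:value` filters and plain search terms."""
    parsed = ParsedQuery()
    for token in raw.split():
        key, sep, value = token.partition(":")
        if sep and value and key in KNOWN_FILTERS:
            parsed.filters[key] = value
        else:
            # Anything else (including unknown prefixes) is a search term.
            parsed.terms.append(token)
    return parsed


query = parse_query("user:jade@ellis.link search term")
```

The filters could then be applied as exact-match constraints and the remaining terms passed to the full-text index.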
      • lucifer[m]
        ah cool, that's handy.
      • Jade[m]
        lucifer[m]: So effectively acting as a cache? Which would be handled by the Rust SDK for us, although not by the Python SDK
      • lucifer[m]
        but if the tantivy index needs to be rebuilt, you would need to query chatbrainz for messages again?
      • Jade[m]
        We'd need to query the SDK, and if the messages aren't cached by that it would grab them from chatbrainz
      • Jade[m]: But this would be an assumption
      • lucifer[m]
        yes but i don't have a good feeling about relying on the sdk cache. i would much rather that we have full control of the data's copy.
      • Jade[m]
        That's understandable
      • lucifer[m]
        from experience working with sir and listens data, both of which have faced issues when changing the data format or something downstream of it, reindexing is one of the biggest pains.
      • Jade[m]
        I guess my feeling is that it would be redundant fitting it in front of the Rust SDK as a cache, so writing it to a database would have pretty much the same effect as writing it to the HTML in practice
      • lucifer[m]
        i personally would trust a database we can control and backup more than the sdk's cache.
      • Jade[m]
        Aside from being more convenient to access
      • lucifer[m]
        it can be redundant yes but i would suggest that the data be stored in the database even if your implementation doesn't end up using it.
      • Jade[m]
        lucifer[m]: Absolutely for key data
      • I think that works, then
      • So writing all events to Postgres with the intention that the HTML can be completely rebuilt from that data
      • lucifer[m]
        yes.
      • Jade[m]
        Yeah, that makes sense
      • Honestly, that's a simple enough interface that it could be made modular, rather than tying it to postgres in particular
      • lucifer[m]
        sure if a generic approach works, sounds good.
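The "modular storage, rebuildable HTML" idea agreed on above could be sketched roughly as follows. This is an assumption-laden illustration, not the proposal's design: `EventStore`, `InMemoryStore`, and `render_room` are hypothetical names, and the in-memory store merely stands in for Postgres or any other backend. The event fields used (`event_id`, `room_id`, `sender`, `origin_server_ts`, `content.body`) follow the usual Matrix event shape.

```python
import html
from typing import Iterable, Protocol


class EventStore(Protocol):
    """Hypothetical storage interface; any backend could implement it."""

    def save(self, event: dict) -> None: ...
    def events_for_room(self, room_id: str) -> Iterable[dict]: ...


class InMemoryStore:
    """Stand-in backend keyed by event ID."""

    def __init__(self) -> None:
        self._events = {}

    def save(self, event: dict) -> None:
        self._events[event["event_id"]] = event

    def events_for_room(self, room_id: str) -> list:
        rows = [e for e in self._events.values() if e["room_id"] == room_id]
        return sorted(rows, key=lambda e: e["origin_server_ts"])


def render_room(store: EventStore, room_id: str) -> str:
    """Rebuild a room's HTML purely from stored events."""
    items = [
        f"<li><b>{html.escape(e['sender'])}</b>: "
        f"{html.escape(e['content'].get('body', ''))}</li>"
        for e in store.events_for_room(room_id)
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


store = InMemoryStore()
store.save({"event_id": "$2", "room_id": "!r", "sender": "@b:x",
            "origin_server_ts": 2, "content": {"body": "a < b"}})
store.save({"event_id": "$1", "room_id": "!r", "sender": "@a:x",
            "origin_server_ts": 1, "content": {"body": "hello"}})
page = render_room(store, "!r")
```

Because rendering only depends on the interface, a template change means re-running `render_room` rather than re-fetching anything from the homeserver.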
      • Jade[m]
        Reading back would be more difficult because of redactions, etc, but well within the realms of possibility
      • lucifer[m]
        off the top of my head, i would go with one row for one event/message and have a column to mark it redacted.
      • or delete the row.
      • Jade[m]
        I was thinking of having one row per event message and handling redactions/edits/reactions on ingest
      • lucifer[m]
        that works yes.
      • Jade[m]
        lucifer[m]: The way Matrix handles redacted events is by sending an event that instructs servers and clients to delete all but a certain subset of the fields
      • lucifer[m]
        yup makes sense.
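A rough sketch of the "one row per event, redactions handled on ingest" approach discussed above, using SQLite as a stand-in for Postgres. The schema and the `open_store` / `ingest` helpers are hypothetical. As a simplification, a redaction here empties the stored content and sets a flag, rather than applying Matrix's exact per-event-type rules for which fields survive.

```python
import json
import sqlite3


def open_store():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE events (
               event_id TEXT PRIMARY KEY,
               sender   TEXT NOT NULL,
               ts       INTEGER NOT NULL,
               content  TEXT NOT NULL,   -- JSON-encoded event content
               redacted INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return conn


def ingest(conn, event):
    """Store one event, or apply it to its target if it is a redaction."""
    if event["type"] == "m.room.redaction":
        # Accept `redacts` at the top level or inside `content`, since
        # its location differs between room versions.
        target = event.get("redacts") or event.get("content", {}).get("redacts")
        conn.execute(
            "UPDATE events SET content = '{}', redacted = 1 WHERE event_id = ?",
            (target,),
        )
    else:
        conn.execute(
            "INSERT OR REPLACE INTO events (event_id, sender, ts, content) "
            "VALUES (?, ?, ?, ?)",
            (event["event_id"], event["sender"], event["origin_server_ts"],
             json.dumps(event["content"])),
        )


conn = open_store()
ingest(conn, {"type": "m.room.message", "event_id": "$1", "sender": "@a:x",
              "origin_server_ts": 1, "content": {"body": "oops"}})
ingest(conn, {"type": "m.room.redaction", "event_id": "$2", "sender": "@a:x",
              "origin_server_ts": 2, "redacts": "$1", "content": {}})
row = conn.execute(
    "SELECT content, redacted FROM events WHERE event_id = '$1'"
).fetchone()
```

Edits and reactions could be handled in the same `ingest` function; they are left out to keep the sketch short.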
      • Jade[m]
        My preference is still to use Rust, on the balance
      • Are you on board with that?
      • lucifer[m]
        my vote is in favor of python. but if most of the other team members prefer rust too i guess it's fine.
      • Jade[m]
        Does this work?
      • I'll wait for the others to have their chance to read through before I do anything
      • <Jade[m]> "My preference is still to use..." <- To be clear, I'm not against python, I just think it would require more code and carry more long-term maintenance risk
      • But I'm happy either way