hmm. logging in on both prod and beta is very slow right now, is this a known thing or
2025-11-12 31619, 2025
monkey[m]
Probably related to overall MetaBrainz issues, if I had to guess. I've had login issues on multiple projects in the past week
2025-11-12 31655, 2025
ApeKattQuest
hm how do we do underlined text in the markup again?
2025-11-12 31638, 2025
ApeKattQuest
waait I just realised I could possibly fix BB-870 myself, https://bookbrainz.org/identifier-type/23/edit what should I edit the Detection RegEx and Validation RegEx with? (can replace just "authors" with "*" work?)
OK, I fixed the regex, it needed a "non-capturing group" around authors|works
2025-11-12 31643, 2025
ApeKattQuest
the A is for A at the end of the entity
2025-11-12 31648, 2025
monkey[m]
openlibrary\.org\/(?:authors|works)\/(OL\d+?A)
2025-11-12 31658, 2025
ApeKattQuest
what does non capturing mean?
2025-11-12 31616, 2025
ApeKattQuest
i need to learn regex properly orz
2025-11-12 31647, 2025
monkey[m]
ApeKattQuest: See in the top right "explanation" panel, it explains each token:
2025-11-12 31647, 2025
monkey[m]
`\d matches a digit (equivalent to [0-9])`
2025-11-12 31647, 2025
monkey[m]
`+? matches the previous token between one and unlimited times, as few times as possible, expanding as needed (lazy)`
2025-11-12 31659, 2025
ApeKattQuest
what part of the regex makes it ignore the bit after the OL2165055A bit?
2025-11-12 31620, 2025
ApeKattQuest
I'll look at this now
2025-11-12 31639, 2025
ApeKattQuest
so this will let me test regular expressions so i can make them work before pasting? oohh neat
2025-11-12 31602, 2025
monkey[m]
a capturing group means at the end you extract whatever is in the capturing group (in our case we have the OL ID captured).
2025-11-12 31602, 2025
monkey[m]
A non capturing group allows you to say, in this case, it should be one of "authors" or "works", but don't extract that value cause I'm not interested in it
2025-11-12 31648, 2025
ApeKattQuest
aha, since we already specify it in the display template
2025-11-12 31611, 2025
monkey[m]
ApeKattQuest: Basically, the regex extracts what matches the capturing group, doesn't capture the rest
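A minimal Python sketch of that behaviour, assuming the detection regex is applied as an unanchored search (how the site actually runs it is not stated here, and the trailing slug in the sample URL is made up for illustration):

```python
import re

# Detection pattern quoted earlier in the chat.
DETECTION = re.compile(r"openlibrary\.org\/(?:authors|works)\/(OL\d+?A)")

# Hypothetical pasted URL; the trailing slug is only there for illustration.
pasted = "https://openlibrary.org/authors/OL2165055A/Some_Author"

match = DETECTION.search(pasted)
whole_match = match.group(0)  # everything the pattern consumed
extracted = match.group(1)    # only the capturing group, i.e. the OL ID

print(whole_match)
print(extracted)
```

Nothing in the pattern describes the text after the ID, so the match simply stops there, and only the parenthesised part ends up in `extracted`.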
2025-11-12 31644, 2025
ApeKattQuest
why does "works" work tho ?
2025-11-12 31602, 2025
ApeKattQuest
oh because it was second?
2025-11-12 31613, 2025
ApeKattQuest
openlibrary\.org\/works|authors\/(OL\d+?A) seems to work on authors but not works
2025-11-12 31639, 2025
monkey[m]
Because of the construction of the regex, it can translate as: either match "openlibrary\.org\/authors" OR match "works\/(OL\d+?A)"
2025-11-12 31603, 2025
ApeKattQuest
yes, that's what I wanted
2025-11-12 31616, 2025
monkey[m]
Adding the non-capturing group ensures the "OR" part only applies to "authors" or "works"
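The difference can be checked with a short Python sketch. The two URLs are illustrative and follow the "OL.....A" shape used in this conversation; `extract` is a hypothetical helper, not site code:

```python
import re

# Without a group, "|" splits the whole pattern into two alternatives:
#   openlibrary\.org\/works   OR   authors\/(OL\d+?A)
UNGROUPED = re.compile(r"openlibrary\.org\/works|authors\/(OL\d+?A)")
# With a non-capturing group, "|" only chooses between the two words.
GROUPED = re.compile(r"openlibrary\.org\/(?:authors|works)\/(OL\d+?A)")

author_url = "https://openlibrary.org/authors/OL2165055A"
work_url = "https://openlibrary.org/works/OL2165055A"

def extract(pattern, text):
    """Return the captured ID, or None if nothing was captured."""
    m = pattern.search(text)
    return m.group(1) if m else None

print(extract(UNGROUPED, author_url), extract(UNGROUPED, work_url))
print(extract(GROUPED, author_url), extract(GROUPED, work_url))
```

With the ungrouped pattern the works URL still matches, but through the first alternative, which contains no capturing group, so no ID comes out.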
2025-11-12 31625, 2025
ApeKattQuest
oh!
2025-11-12 31634, 2025
ApeKattQuest
so not "author or works or OL### or
2025-11-12 31636, 2025
monkey[m]
You got it ? Difference is subtle
2025-11-12 31615, 2025
ApeKattQuest
so a non-capturing group is (?:foobar|foobas) and a capturing group is (blup|blup)
2025-11-12 31619, 2025
monkey[m]
What you wanted is either match "openlibrary.org/authors/OL.....A" OR match "openlibrary.org/works/OL.....A"
2025-11-12 31655, 2025
monkey[m]
ApeKattQuest: Almost, the bar in the middle is not part of the concept of groups, that's the "OR" part
2025-11-12 31605, 2025
ApeKattQuest
yea
2025-11-12 31623, 2025
ApeKattQuest
whats the sign for *and*
2025-11-12 31638, 2025
monkey[m]
But "(?:something)" will match but not capture "something", while "(something)" will match and capture "something"
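As a quick illustration in Python (the shortened patterns are assumptions for the demo, not the ones stored on the site), the only change between the two searches is the `?:` at the start of the first group:

```python
import re

text = "openlibrary.org/authors/OL2165055A"

# Both patterns match the same text; they differ only in what is captured.
capturing = re.search(r"(authors|works)\/(OL\d+?A)", text)
non_capturing = re.search(r"(?:authors|works)\/(OL\d+?A)", text)

capturing_groups = capturing.groups()          # two captured values
non_capturing_groups = non_capturing.groups()  # only the OL ID

print(capturing_groups)
print(non_capturing_groups)
```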
2025-11-12 31654, 2025
ApeKattQuest
nothing this!
2025-11-12 31600, 2025
ApeKattQuest
noteing*
2025-11-12 31639, 2025
monkey[m]
There is also a "Quick reference" panel on the right at the bottom which is absolutely necessary for people like me who use regexp once in a while and not everyday
2025-11-12 31612, 2025
monkey[m]
Let me tell you, learning regexps is a journey...
2025-11-12 31628, 2025
ApeKattQuest
lol
2025-11-12 31639, 2025
monkey[m]
But you feel powerful once you manage to craft the esoteric incantation that does the magic you wanted it to do
2025-11-12 31644, 2025
ApeKattQuest
thanks for this site I'll test regexes here before employing in the future
2025-11-12 31621, 2025
monkey[m]
👌 Love seeing you look more at coding skills :)
2025-11-12 31643, 2025
ApeKattQuest
:D
2025-11-12 31606, 2025
monkey[m]
In this case, that was absolutely the correct (and easy) solution to this issue
2025-11-12 31622, 2025
ApeKattQuest
yea, I just didnt know abt capture groups
2025-11-12 31627, 2025
monkey[m]
Tested on test.BB and the regexp works as expected with the non-capturing group:
2025-11-12 31628, 2025
monkey[m] uploaded an image: (28KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/ZNtkZKKSoQARhSafeqMPOiCw/image.png >
2025-11-12 31652, 2025
ApeKattQuest
huh. maybe update codebase of test?
2025-11-12 31607, 2025
monkey[m]
?
2025-11-12 31617, 2025
monkey[m]
Everything works :)
2025-11-12 31624, 2025
monkey[m]
I updated the regex on test and on prod
2025-11-12 31606, 2025
ApeKattQuest
wait.. maybe i can fix my irritation with barcodes not being found when you write/paste eg "9 788205 478022" (with spaces), as this is the most common format for barcodes... maybe we can display them like that too.. i'm going to experiment on test and see how it looks/works, then get back to you
BB-650: Barcode (and ISBN) insertion and storage improvement
2025-11-12 31649, 2025
ApeKattQuest
would be good to display the format for barcodes as # ###### ###### imho
2025-11-12 31640, 2025
monkey[m]
As mentioned in the comments "modern ISBN / barcode coupling is pretty common", however that is not a guarantee and older books might not have an isbn format, in which case it's an arbitrary number of digits in no specific format, which....
2025-11-12 31645, 2025
ApeKattQuest
what's the difference between validation regex and detection regex? detection is the url that is pasted
2025-11-12 31613, 2025
ApeKattQuest
yep. especially older books will have completely different isbn and barcode
2025-11-12 31621, 2025
monkey[m]
Yes, the detection one takes what the user pastes and extracts the value.
2025-11-12 31621, 2025
monkey[m]
The validation takes a value and makes sure it fits a pattern
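A rough Python sketch of the two roles. `DETECTION` is the pattern from earlier in this conversation; `VALIDATION`, `detect` and `is_valid` are hypothetical stand-ins, since the stored validation regex is not quoted here:

```python
import re

DETECTION = re.compile(r"openlibrary\.org\/(?:authors|works)\/(OL\d+?A)")
# Hypothetical counterpart, not the pattern actually stored on the site.
VALIDATION = re.compile(r"OL\d+A")

def detect(pasted):
    """Pull the identifier value out of whatever the user pasted."""
    m = DETECTION.search(pasted)
    return m.group(1) if m else None

def is_valid(value):
    """Check that a bare value fits the expected shape from start to end."""
    return VALIDATION.fullmatch(value) is not None

value = detect("https://openlibrary.org/authors/OL2165055A")
print(value, is_valid(value))
```

Detection searches inside a larger string, while validation has to cover the whole value, so a full URL does not validate and a bare ID is not detected.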
2025-11-12 31600, 2025
ApeKattQuest
also the "barcodelookup" url is always bogus
2025-11-12 31639, 2025
monkey[m]
Actually "arbitrary number of digits" is not correct, I'm wrong. The current barcode validation regex expects 11 or 12 digits.
2025-11-12 31609, 2025
monkey[m]
We can adapt it to accept those same 11 or 12 digits, with optional spaces in between them
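One way this could look, as a Python sketch. The pattern and the `normalise` helper are assumptions; only the 11-or-12 digit count comes from the message above:

```python
import re

# Hypothetical validation pattern: 11 or 12 digits, each digit optionally
# followed by a single space (so 10 or 11 "digit + optional space" units,
# then one final digit).
BARCODE_WITH_SPACES = re.compile(r"(?:\d ?){10,11}\d")

def normalise(value):
    """Return the bare digits if the value validates, else None."""
    if BARCODE_WITH_SPACES.fullmatch(value) is None:
        return None
    return value.replace(" ", "")

print(normalise("0 12345 67890 5"))
print(normalise("012345678905"))
```

With these counts a 13-digit value such as "9 788205 478022" would not pass; the quantifier range is the part to adjust if those should be accepted.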
2025-11-12 31628, 2025
ApeKattQuest
yes, often a barcode will also have an additional shorter one of 5-6 digits after it (which we cannot yet add)
2025-11-12 31632, 2025
monkey[m]
Or tighten it down more to accept spaces in the format you mentioned
i'd not do the bit where it matches 9788205478022 or 978 82 05 47802 2/978-82-05-47802-2 since those match ISBN and can create a tangle with those.
2025-11-12 31653, 2025
ApeKattQuest
for no spaces at all I think usually it's barcode tbh
2025-11-12 31605, 2025
ApeKattQuest
but the page can't know if it's barcode or isbn-13
2025-11-12 31644, 2025
ApeKattQuest
but format 978 82 05 47802 2 is always isbn and format 9 78820 5478022 is always barcode
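A small Python sketch of telling the two apart by spacing alone. Both patterns and `classify` are assumptions: the barcode one uses the `# ###### ######` layout suggested earlier, the ISBN one accepts five space- or hyphen-separated groups starting with three digits, and neither checks digit totals or checksums:

```python
import re

# Hypothetical patterns based on the spacings mentioned in the chat.
BARCODE_SPACING = re.compile(r"\d \d{6} \d{6}")            # 9 788205 478022
ISBN13_SPACING = re.compile(r"\d{3}(?:[ -]\d+){3}[ -]\d")  # 978 82 05 47802 2

def classify(value):
    """Guess the identifier type from the grouping of the digits."""
    if BARCODE_SPACING.fullmatch(value):
        return "barcode"
    if ISBN13_SPACING.fullmatch(value):
        return "isbn"
    return "unknown"

print(classify("9 788205 478022"))
print(classify("978 82 05 47802 2"))
print(classify("9788205478022"))
```

A value with no separators at all stays "unknown" here, which is exactly the ambiguous case.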
2025-11-12 31613, 2025
ApeKattQuest
also if the pasted bit actually includes the word isbn/upc/ean it should be easy to know if it's barcode or isbn
2025-11-12 31642, 2025
monkey[m]
That's true
2025-11-12 31607, 2025
ApeKattQuest
in the future we could create a report for "when barcodes and isbn don't match at all" for our interest
2025-11-12 31615, 2025
monkey[m]
There is still some work to get a proper regex, but this should be a good base to match and validate some of the common formats: https://regex101.com/r/QNQnnY/4
2025-11-12 31603, 2025
ApeKattQuest
and it doesnt trigger with the isbn format!
2025-11-12 31620, 2025
ApeKattQuest
but what will happen when both isbn format and barcode match on 9788205478022 ?
2025-11-12 31625, 2025
monkey[m]
For detection, I like the idea of detecting ean/upc at the beginning. I think those don't usually have the optional 5 digits at the end, so ignore how the last 4 lines match the 5 digits as a separate group.
2025-11-12 31613, 2025
ApeKattQuest
yea. I think the last 5 is perhaps a separate barcode? like I think it's a us/canada thing
2025-11-12 31648, 2025
monkey[m]
ApeKattQuest: So I think this is why there is no detection regex for barcode. I propose that we add detection only if it starts with the words upc/ean/barcode, because in that case we know the detection will be correct. Otherwise no detection.
2025-11-12 31648, 2025
monkey[m]
And the validation on the other side can allow the spaces.
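A sketch of that proposal in Python. The pattern and `detect_barcode` are hypothetical, and this is not the regex from the regex101 link above:

```python
import re

# Hypothetical detection pattern: only fires when the pasted text starts with
# one of the keywords, then captures digits with optional single spaces.
BARCODE_DETECTION = re.compile(
    r"^\s*(?:upc|ean|barcode)\b[\s:]*(\d(?: ?\d)+)", re.IGNORECASE
)

def detect_barcode(pasted):
    """Return the bare digits when a keyword prefix is present, else None."""
    m = BARCODE_DETECTION.search(pasted)
    if m is None:
        return None
    return m.group(1).replace(" ", "")

print(detect_barcode("EAN 9 788205 478022"))
print(detect_barcode("9 788205 478022"))
```

Without a keyword there is no detection at all, so a bare number is left for other identifier types, such as ISBN, to claim.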
2025-11-12 31655, 2025
ApeKattQuest
also it should be possible to also have an identifier for price (something that's printed *on* some books)
2025-11-12 31609, 2025
monkey[m]
ApeKattQuest: Those 5 digits are pricing information
2025-11-12 31643, 2025
ApeKattQuest
> [15:19] monkey[m] ApeKattQuest: So I think this is why there is no detection regex for barcode. I propose that we add detection only if it starts with the words upc/ean/barcode, because in that case we
sounds good. but in reverse: if it starts with "isbn" assume isbn, if no isbn OR ean/barcode/upc then assume barcode
2025-11-12 31649, 2025
ApeKattQuest
> [15:20] monkey[m] ApeKattQuest: Those 5 digits are pricing information
yes
(now would have been great if we had bookcover because )
2025-11-12 31612, 2025
monkey[m]
> "if no isbn OR ean/barcode/upc thne assume barcode"
2025-11-12 31612, 2025
monkey[m]
I disagree with this (unless I misunderstood): I think "9788205478022" should be detected as an ISBN
2025-11-12 31634, 2025
ApeKattQuest
I mean the problem is that it's just as likely to be barcode as isbn
2025-11-12 31646, 2025
ApeKattQuest
neither is more likely if pasted
2025-11-12 31603, 2025
monkey[m]
I think it's more important to detect ISBNs than barcodes, especially if they end up being the same in a lot of cases
2025-11-12 31615, 2025
monkey[m]
That's why I think it should be the default
2025-11-12 31654, 2025
ApeKattQuest
I don't exactly agree, but it's not a hill I want to die on. we can do as you say and if we at some point can make it easier/better/get more input/i change my mind/etc
2025-11-12 31657, 2025
ApeKattQuest
at this point any improvement to barcode is an improvement :D
2025-11-12 31639, 2025
ApeKattQuest
how would the last 5 digits work then? are we storing those or just dropping them
2025-11-12 31637, 2025
monkey[m]
I think it makes sense to store them, but the validation regex definitely needs to be updated.
2025-11-12 31637, 2025
monkey[m]
And as said above, if we detect upc/ean then we would probably discard those 5 digits I guess? Assuming EAN/UPC don't have them, only "ISBN barcodes" do, which could make for a confusing experience for users
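If the extra digits are kept, one option is to capture them separately so they can be stored or dropped later. A Python sketch, where the pattern, the digit counts and `split_barcode` are all placeholders and the add-on value "51999" is made up:

```python
import re

# Hypothetical pattern: a main code of 11 to 13 digits, optionally followed by
# whitespace and a separate 5-digit add-on (the pricing digits).
WITH_ADDON = re.compile(r"(\d{11,13})(?:\s+(\d{5}))?")

def split_barcode(value):
    """Return (main_code, addon_or_None), or None if the value does not fit."""
    m = WITH_ADDON.fullmatch(value.strip())
    if m is None:
        return None
    return m.group(1), m.group(2)

print(split_barcode("9788205478022 51999"))
print(split_barcode("9788205478022"))
```

Keeping the add-on in its own group leaves both choices open: discard it for EAN/UPC, or save it alongside the main code.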
2025-11-12 31607, 2025
ApeKattQuest
i honestly think it might be better to have those as a separate field
hmm apparently isbn with 10 doesn't work. if i paste isbn 82-03-10581-5 it doesn't connect until i remove "isbn"
2025-11-12 31627, 2025
ApeKattQuest
also here I run into a problem. this book consists of many stories, poems and other things from other places. each has illustrations, many of which are specific for this version. but those are not made by the same ! they are individually made by specific illustrators