#bookbrainz

      • G0d joined the channel
      • 2025-11-12 31623, 2025

      • ApeKattQuest
        hmm. logging in on both prod and beta is very slow right now, is this a known thing or
      • 2025-11-12 31619, 2025

      • monkey[m]
        Probably related to overall MetaBrainz issues, if I had to guess. I've had login issues on multiple projects in the past week
      • 2025-11-12 31655, 2025

      • ApeKattQuest
        hm how do we do underlined text in the markup again?
      • 2025-11-12 31657, 2025

      • shoesNsocks joined the channel
      • 2025-11-12 31638, 2025

      • ApeKattQuest
        waait I just realised I could possibly fix BB-870 myself, https://bookbrainz.org/identifier-type/23/edit what should I edit the Detection RegEx and Validation RegEx with? (can replace just "authors" with "*" work?)
      • 2025-11-12 31639, 2025

      • BrainzBot
        BB-870: Issue with linking openlibrary author https://tickets.metabrainz.org/browse/BB-870
      • 2025-11-12 31643, 2025

      • ApeKattQuest
        wait let me test on test
      • 2025-11-12 31610, 2025

      • ApeKattQuest
        wait I think I did it!
      • 2025-11-12 31657, 2025

      • ApeKattQuest
        ok so it WORKS
      • 2025-11-12 31659, 2025

      • ApeKattQuest
        hehoo
      • 2025-11-12 31600, 2025

      • monkey[m]
        Nice ApeKattQuest (IRC) !
      • 2025-11-12 31603, 2025

      • monkey[m]
        Although...
      • 2025-11-12 31620, 2025

      • ApeKattQuest
        ops
      • 2025-11-12 31629, 2025

      • ApeKattQuest
        there's probably better ways lol
      • 2025-11-12 31631, 2025

      • monkey[m]
        I think now pasting an OL author link doesn't work anymore
      • 2025-11-12 31639, 2025

      • ApeKattQuest
        huuuhhh
      • 2025-11-12 31641, 2025

      • ApeKattQuest
        oh no
      • 2025-11-12 31643, 2025

      • monkey[m]
        No no, I think you are absolutely correct with the solution
      • 2025-11-12 31653, 2025

      • monkey[m]
        Let me check the regex
      • 2025-11-12 31658, 2025

      • ApeKattQuest
        lol
      • 2025-11-12 31601, 2025

      • ApeKattQuest
        shait that shouldn't break it, because a|b should be "match a or b" according to https://regexr.com/
      • 2025-11-12 31633, 2025

      • monkey[m]
        For reference, I usually use this to test regexps, and always test with all examples I have at hand because regexps are HARD
      • 2025-11-12 31633, 2025

      • monkey[m]
      • 2025-11-12 31606, 2025

      • ApeKattQuest
        yea uh I don't know this
      • 2025-11-12 31624, 2025

      • ApeKattQuest
        what is d+?
      • 2025-11-12 31642, 2025

      • monkey[m]
        OK, I fixed the regex, it needed a "non capturing group" around authors|work
      • 2025-11-12 31643, 2025

      • ApeKattQuest
        the A is for A at the end of the entity
      • 2025-11-12 31648, 2025

      • monkey[m]
        openlibrary\.org\/(?:authors|works)\/(OL\d+?A)
      • 2025-11-12 31658, 2025

      • ApeKattQuest
        what does non capturing mean?
      • 2025-11-12 31616, 2025

      • ApeKattQuest
        i need to learn regex properly orz
      • 2025-11-12 31647, 2025

      • monkey[m]
        ApeKattQuest: See in the top right "explanation" panel, it explains each token:
      • 2025-11-12 31647, 2025

      • monkey[m]
        `\d matches a digit (equivalent to [0-9])`
      • 2025-11-12 31647, 2025

      • monkey[m]
        `+? matches the previous token between one and unlimited times,`
      • 2025-11-12 31659, 2025

      • ApeKattQuest
        what part of the regex makes it ignore the bit after the OL2165055A bit?
      • 2025-11-12 31620, 2025

      • ApeKattQuest
        I'll look at this now
      • 2025-11-12 31639, 2025

      • ApeKattQuest
        so this will let me test regular expressions so I can make them work before pasting? oohh neat
      • 2025-11-12 31602, 2025

      • monkey[m]
        a capturing group means at the end you extract whatever is in the capturing group (in our case we have the OL ID captured).
      • 2025-11-12 31602, 2025

      • monkey[m]
        A non capturing group allows you to say, in this case, it should be one of "authors" or "works", but don't extract that value cause I'm not interested in it
      • 2025-11-12 31648, 2025
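A minimal Python sketch of the capturing vs non-capturing difference described here. The regex is the one from the chat; the URL is an illustrative example, and Python's `re` module is an assumption standing in for whatever regex engine BookBrainz actually runs.

```python
import re

# Illustrative URL, built from the OL ID mentioned in the chat.
url = "https://openlibrary.org/authors/OL2165055A"

# Non-capturing group around the path segment: only the OL ID is extracted.
non_capturing = re.search(r"openlibrary\.org\/(?:authors|works)\/(OL\d+?A)", url)

# Capturing group around the path segment: "authors" is extracted too.
capturing = re.search(r"openlibrary\.org\/(authors|works)\/(OL\d+?A)", url)

non_capturing_groups = non_capturing.groups()
capturing_groups = capturing.groups()
print(non_capturing_groups)
print(capturing_groups)
```

With the non-capturing form the only extracted value is the ID, which is what an identifier field wants.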

      • ApeKattQuest
        aha, since we already specify it in the display template
      • 2025-11-12 31611, 2025

      • monkey[m]
        ApeKattQuest: Basically, the regex extracts what matches the capturing group, doesn't capture the rest
      • 2025-11-12 31644, 2025
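As a rough illustration of why the text after the ID is ignored: the pattern is not anchored to the end of the input, and the capturing group stops at the first `A` after the digits. The trailing `/Some_Author_Name` part of the URL is a made-up example, and Python's `re` is assumed here only for demonstration.

```python
import re

pattern = re.compile(r"openlibrary\.org\/(?:authors|works)\/(OL\d+?A)")

# Hypothetical pasted link with extra text after the ID.
pasted = "https://openlibrary.org/authors/OL2165055A/Some_Author_Name"

match = pattern.search(pasted)
matched_text = match.group(0)   # everything the whole pattern consumed
extracted_id = match.group(1)   # only what the capturing group consumed
print(matched_text)
print(extracted_id)
```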

      • ApeKattQuest
        why does "works" work tho ?
      • 2025-11-12 31602, 2025

      • ApeKattQuest
        oh becasue it was second?
      • 2025-11-12 31613, 2025

      • ApeKattQuest
        openlibrary\.org\/works|authors\/(OL\d+?A) seems to work on authors but not works
      • 2025-11-12 31639, 2025

      • monkey[m]
        Because of the construction of the regex, it can translate as: either match "openlibrary\.org\/authors" OR match "works\/(OL\d+?A)"
      • 2025-11-12 31603, 2025

      • ApeKattQuest
        yes, that's what I wanted
      • 2025-11-12 31616, 2025

      • monkey[m]
        Adding the non-capturing group ensures the "OR" part only applies to "authors" or "works"
      • 2025-11-12 31625, 2025
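The precedence issue can be reproduced with a small sketch. It assumes Python's `re` module and uses illustrative URLs; the two patterns are the ungrouped attempt quoted in the chat and the grouped fix.

```python
import re

# Ungrouped: "|" splits the WHOLE pattern into two alternatives,
#   1) openlibrary\.org\/works      (no capturing group at all)
#   2) authors\/(OL\d+?A)
UNGROUPED = re.compile(r"openlibrary\.org\/works|authors\/(OL\d+?A)")

# Grouped: "|" only chooses between "authors" and "works".
GROUPED = re.compile(r"openlibrary\.org\/(?:authors|works)\/(OL\d+?A)")

def extract(pattern, text):
    """Return the captured OL ID, or None if nothing was captured."""
    match = pattern.search(text)
    return match.group(1) if match else None

# Illustrative URLs, not taken from the ticket.
author_url = "https://openlibrary.org/authors/OL2165055A"
works_url = "https://openlibrary.org/works/OL2165055A"

ungrouped_author = extract(UNGROUPED, author_url)
ungrouped_works = extract(UNGROUPED, works_url)
grouped_author = extract(GROUPED, author_url)
grouped_works = extract(GROUPED, works_url)
print(ungrouped_author, ungrouped_works, grouped_author, grouped_works)
```

The ungrouped pattern still "matches" the works URL, but through the first alternative, which captures nothing, so no ID comes out.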

      • ApeKattQuest
        oh!
      • 2025-11-12 31634, 2025

      • ApeKattQuest
        so not "author or works or OL### or
      • 2025-11-12 31636, 2025

      • monkey[m]
        You got it ? Difference is subtle
      • 2025-11-12 31615, 2025

      • ApeKattQuest
        so a non-capturing group is (?:foobar|foobas) and a capturing group is (blup|blup)
      • 2025-11-12 31619, 2025

      • monkey[m]
        What you wanted is either match "openlibrary.org/authors/OL.....A" OR match "openlibrary.org/works/OL.....A"
      • 2025-11-12 31655, 2025

      • monkey[m]
        ApeKattQuest: Almost, the bar in the middle is not part of the concept of groups, that's the "OR" part
      • 2025-11-12 31605, 2025

      • ApeKattQuest
        yea
      • 2025-11-12 31623, 2025

      • ApeKattQuest
        whats the sign for *and*
      • 2025-11-12 31638, 2025

      • monkey[m]
        But "(?:something)" will match but not capture "something", while "(something)" will match and capture "something"
      • 2025-11-12 31654, 2025

      • ApeKattQuest
        nothing this!
      • 2025-11-12 31600, 2025

      • ApeKattQuest
        noteing*
      • 2025-11-12 31639, 2025

      • monkey[m]
        There is also a "Quick reference" panel on the right at the bottom which is absolutely necessary for people like me who use regexp once in a while and not everyday
      • 2025-11-12 31612, 2025

      • monkey[m]
        Let me tell you, learning regexps is a journey...
      • 2025-11-12 31628, 2025

      • ApeKattQuest
        lol
      • 2025-11-12 31639, 2025

      • monkey[m]
        But you feel powerful once you manage to craft the esoteric incantation that does the magic you wanted it to do
      • 2025-11-12 31644, 2025

      • ApeKattQuest
        thanks for this site I'll test regexes here before employing in the future
      • 2025-11-12 31621, 2025

      • monkey[m]
        👌 Love seeing you look more at coding skills :)
      • 2025-11-12 31643, 2025

      • ApeKattQuest
        :D
      • 2025-11-12 31606, 2025

      • monkey[m]
        In this case, that was absolutely the correct (and easy) solution to this issue
      • 2025-11-12 31622, 2025

      • ApeKattQuest
        yea, I just didnt know abt capture groups
      • 2025-11-12 31627, 2025

      • monkey[m]
        Tested on test.BB and the regexp works as expected with the non-capturing group:
      • 2025-11-12 31628, 2025

      • monkey[m] uploaded an image: (28KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/ZNtkZKKSoQARhSafeqMPOiCw/image.png >
      • 2025-11-12 31652, 2025

      • ApeKattQuest
        huh. maybe update codebase of test?
      • 2025-11-12 31607, 2025

      • monkey[m]
        ?
      • 2025-11-12 31617, 2025

      • monkey[m]
        Everything works :)
      • 2025-11-12 31624, 2025

      • monkey[m]
        I updated the regex on test and on prod
      • 2025-11-12 31606, 2025

      • ApeKattQuest
        wait.. maybe I can fix my irritation with barcodes not being found when you write/paste eg "9 788205 478022" (with spaces), as this is the most common format for barcodes... maybe we can display them like that too.. I'm going to experiment on test and see how it looks/works, then get back to you
      • 2025-11-12 31634, 2025

      • monkey[m]
        Experiment on regex101 :)
      • 2025-11-12 31642, 2025

      • ApeKattQuest
        yes that's what I mean
      • 2025-11-12 31653, 2025

      • monkey[m]
        Neat
      • 2025-11-12 31623, 2025

      • ApeKattQuest
      • 2025-11-12 31639, 2025

      • ApeKattQuest
        that's our current isbn regex
      • 2025-11-12 31658, 2025

      • ApeKattQuest
        it's fine to not let it collect the common barcode pattern
      • 2025-11-12 31635, 2025

      • ApeKattQuest
        .. the barcode doesn't have a detection regex https://test.bookbrainz.org/identifier-type/11/ed…
      • 2025-11-12 31641, 2025

      • monkey[m]
        Yes I think barcodes are going to be tricky
      • 2025-11-12 31645, 2025

      • ApeKattQuest
      • 2025-11-12 31645, 2025

      • BrainzBot
        BB-650: Barcode (and ISBN) insertion and storage improvement
      • 2025-11-12 31649, 2025

      • ApeKattQuest
        would be good to display the format for barcodes as # ###### ###### imho
      • 2025-11-12 31640, 2025
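A small sketch of the proposed `# ###### ######` display grouping. The function name `format_barcode` is hypothetical, and the 13-digit length is an assumption based on the "9 788205 478022" example from earlier in the chat.

```python
def format_barcode(digits: str) -> str:
    """Group a 13-digit barcode as '# ###### ######' for display."""
    compact = digits.replace(" ", "")
    # Assumption: only plain 13-digit barcodes get this grouping.
    if len(compact) != 13 or not (compact.isascii() and compact.isdigit()):
        raise ValueError("expected exactly 13 digits")
    return f"{compact[0]} {compact[1:7]} {compact[7:]}"

formatted = format_barcode("9788205478022")
print(formatted)
```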

      • monkey[m]
        As mentioned in the comments "modern ISBN / barcode coupling is pretty common", however that is not a guarantee and older books might not have an isbn format, in which case it's an arbitrary number of digits in no specific format, which....
      • 2025-11-12 31645, 2025

      • ApeKattQuest
        what's the difference between validation regex and detection regex, detection is the url that is pasted
      • 2025-11-12 31613, 2025

      • ApeKattQuest
        yep. especially older books will have completely different isbn and barcode
      • 2025-11-12 31621, 2025

      • monkey[m]
        Yes, the detection one takes what the user pastes and extracts the value.
      • 2025-11-12 31621, 2025

      • monkey[m]
        The validation takes a value and makes sure it fits a pattern
      • 2025-11-12 31600, 2025
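To make the two roles concrete, here is a sketch using the Open Library regex from earlier in the chat as the detection pattern. The validation pattern `OL\d+A` is an assumption made up for this example, not the value stored in BookBrainz.

```python
import re

# Detection: finds the identifier inside whatever the user pasted.
DETECTION = re.compile(r"openlibrary\.org\/(?:authors|works)\/(OL\d+?A)")

# Validation (assumed pattern): the stored value must match in full.
VALIDATION = re.compile(r"OL\d+A")

def detect(pasted: str):
    """Extract the identifier value from pasted text, or None."""
    match = DETECTION.search(pasted)
    return match.group(1) if match else None

def is_valid(value: str) -> bool:
    """Check that a bare value fits the expected shape."""
    return VALIDATION.fullmatch(value) is not None

value = detect("https://openlibrary.org/authors/OL2165055A")
print(value, is_valid(value))
```

Detection is lenient about surrounding text; validation is strict about the value itself.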

      • ApeKattQuest
        but also the "barcodelookup" url is always bogus
      • 2025-11-12 31639, 2025

      • monkey[m]
        Actually "arbitrary number of digits" is not correct, I'm wrong. The current barcode validation regex expects 11 or 12 digits.
      • 2025-11-12 31609, 2025

      • monkey[m]
        We can adapt it to accept those same 11 or 12 digits, with optional spaces in between them
      • 2025-11-12 31628, 2025
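One possible shape for such a validation pattern, as a sketch only: it takes the "11 or 12 digits" figure from the message above at face value and allows a single optional space between digits. The actual stored regex is not shown in the chat, so the pattern and the name `is_valid_barcode` are assumptions; the `{10,11}` quantifier is the knob to change if other lengths should pass.

```python
import re

# One digit, then 10 or 11 more digits, each optionally preceded by one space.
BARCODE_VALIDATION = re.compile(r"\d(?: ?\d){10,11}")

def is_valid_barcode(value: str) -> bool:
    """True for 11 or 12 digits with optional single spaces in between."""
    return BARCODE_VALIDATION.fullmatch(value) is not None

print(is_valid_barcode("123456789012"))
print(is_valid_barcode("1 23456 78901 2"))
```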

      • ApeKattQuest
        yes, often a barcode will also have an additional shorter one (5-6 digits) after it (which we cannot yet add)
      • 2025-11-12 31632, 2025

      • monkey[m]
        Or tighten it down more to accept spaces in the format you mentioned
      • 2025-11-12 31606, 2025

      • ApeKattQuest
        perhaps a second barcode identifier?
      • 2025-11-12 31631, 2025

      • monkey[m]
        How about this for supporting your format? https://regex101.com/r/Vx7Anf/1
      • 2025-11-12 31637, 2025

      • ApeKattQuest
      • 2025-11-12 31637, 2025

      • ApeKattQuest
        has printed 9 781569711019 51495>
      • 2025-11-12 31625, 2025

      • ApeKattQuest
        I'd not do the bit where it matches 9788205478022 or 978 82 05 47802 2/978-82-05-47802-2, since those match ISBN and can create a tangle with those.
      • 2025-11-12 31653, 2025

      • ApeKattQuest
        for no spaces at all I think usually it's barcode tbh
      • 2025-11-12 31605, 2025

      • ApeKattQuest
        but the page can't know if it's barcode or isbn-13
      • 2025-11-12 31644, 2025

      • ApeKattQuest
        but format 978 82 05 47802 2 is always isbn and format 9 78820 5478022 is always barcode
      • 2025-11-12 31613, 2025

      • ApeKattQuest
        also if the pasted bit actually includes the word isbn/upc/ean it should be easy to know if it's barcode or isbn
      • 2025-11-12 31642, 2025

      • monkey[m]
        That's true
      • 2025-11-12 31607, 2025

      • ApeKattQuest
        in the future we could create a report for "when barcodes and isbn don't match at all" for our interest
      • 2025-11-12 31615, 2025

      • monkey[m]
        There is still some work to get a proper regex, but this should be a good base to match and validate some of the common formats: https://regex101.com/r/QNQnnY/4
      • 2025-11-12 31603, 2025

      • ApeKattQuest
        and it doesnt trigger with the isbn format!
      • 2025-11-12 31620, 2025

      • ApeKattQuest
        but what will happen when both the isbn format and barcode match on 9788205478022 ?
      • 2025-11-12 31625, 2025

      • monkey[m]
        For detection, I like the idea of detecting ean/upc at the beginning. I think those don't usually have the optional 5 digits at the end, so ignore how the last 4 lines match the 5 digits as a separate group.
      • 2025-11-12 31613, 2025

      • ApeKattQuest
        yea. I think the last 5 is perhaps a separate barcode? like I think it's a US/Canada thing
      • 2025-11-12 31648, 2025

      • monkey[m]
        ApeKattQuest: So I think this is why there is no detection regex for barcode. I propose that we add detection only if it starts with the words upc/ean/barcode, because in that case we know the detection will be correct. Otherwise no detection.
      • 2025-11-12 31648, 2025

      • monkey[m]
        And the validation on the other side can allow the spaces.
      • 2025-11-12 31655, 2025
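A sketch of what prefix-gated detection could look like. The exact pattern, the accepted separators, and the name `detect_barcode` are assumptions for illustration; the regex101 drafts linked in the chat may differ.

```python
import re

# Detect only when the pasted text starts with upc/ean/barcode.
PREFIXED_BARCODE = re.compile(
    r"^\s*(?:upc|ean|barcode)\b[\s:]*(\d(?: ?\d)*)",
    re.IGNORECASE,
)

def detect_barcode(pasted: str):
    """Return the digits (spaces removed) if a prefix is present, else None."""
    match = PREFIXED_BARCODE.search(pasted)
    return match.group(1).replace(" ", "") if match else None

print(detect_barcode("EAN: 9 788205 478022"))
print(detect_barcode("9788205478022"))
```

A bare number yields no detection, which leaves it free to be picked up by the ISBN pattern instead.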

      • ApeKattQuest
        also it should be possible to have an identifier for price (something that's printed *on* some books)
      • 2025-11-12 31609, 2025

      • monkey[m]
        ApeKattQuest: Those 5 digits are pricing information
      • 2025-11-12 31643, 2025

      • ApeKattQuest
        > [15:19] monkey[m] ApeKattQuest: So I think this is why there is no detection regex for barcode. I propose that we add detection only if it starts with the words upc/ean/barcode, because in that case we
        sounds good. but in reverse: if it starts with "isbn" assume isbn, if no isbn OR ean/barcode/upc then assume barcode
      • 2025-11-12 31649, 2025

      • ApeKattQuest
        > [15:20] monkey[m] ApeKattQuest: Those 5 digits are pricing information
        yes
      • 2025-11-12 31603, 2025

      • ApeKattQuest
        (but I meant as a separate)
      • 2025-11-12 31620, 2025

      • ApeKattQuest
        eg https://bookbrainz.org/edition/72fddc3a-11de-419d… has both the extra 5 numbers *and* prices
      • 2025-11-12 31626, 2025

      • ApeKattQuest
        see the annotation
      • 2025-11-12 31629, 2025

      • monkey[m]
        Makes sense
      • 2025-11-12 31644, 2025

      • ApeKattQuest
        (now would have been great if we had bookcover because )
      • 2025-11-12 31612, 2025

      • monkey[m]
        > "if no isbn OR ean/barcode/upc then assume barcode"
      • 2025-11-12 31612, 2025

      • monkey[m]
        I disagree with this (unless I misunderstood): I think "9788205478022" should be detected as an ISBN
      • 2025-11-12 31634, 2025

      • ApeKattQuest
        I mean the problem is that it's just as likely to be barcode as isbn
      • 2025-11-12 31646, 2025

      • ApeKattQuest
        neither is more likely if pasted
      • 2025-11-12 31603, 2025

      • monkey[m]
        I think it's more important to detect ISBNs than barcodes, especially if they end up being the same in a lot of cases
      • 2025-11-12 31615, 2025

      • monkey[m]
        That's why I think it should be the default
      • 2025-11-12 31654, 2025

      • ApeKattQuest
        I don't exactly agree, but it's not a hill I want to die on. we can do as you say, and revisit if at some point we can make it easier/better/get more input/I change my mind/etc
      • 2025-11-12 31657, 2025
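The rule the two settle on could be sketched roughly as follows: an explicit upc/ean/barcode prefix means barcode, an explicit isbn prefix means ISBN, and a bare number defaults to ISBN. The function name and the return labels are hypothetical; this is not BookBrainz code.

```python
import re

def guess_identifier_type(pasted: str):
    """Guess 'barcode' or 'isbn' from pasted text, or None if unclear."""
    text = pasted.strip().lower()
    if re.match(r"(?:upc|ean|barcode)\b", text):
        return "barcode"
    if re.match(r"isbn\b", text):
        return "isbn"
    # No prefix: digits, spaces and hyphens only default to ISBN.
    if re.fullmatch(r"[\d -]+", text):
        return "isbn"
    return None

print(guess_identifier_type("EAN 9788205478022"))
print(guess_identifier_type("9788205478022"))
```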

      • ApeKattQuest
        at this point any improvement to barcode is an improvement :D
      • 2025-11-12 31639, 2025

      • ApeKattQuest
        how would the last 5 digits work then? are we storing those or just dropping them
      • 2025-11-12 31637, 2025

      • monkey[m]
        I think it makes sense to store them, but the validation regex definitely needs to be updated.
      • 2025-11-12 31637, 2025

      • monkey[m]
        And as said above, if we detect upc/ean then we would probably discard those 5 digits I guess? Assuming EAN/UPC don't have them, only "ISBN barcodes" do, which could make for a confusing experience for users
      • 2025-11-12 31607, 2025

      • ApeKattQuest
        i honestly think it might be better to have those as a separate field
      • 2025-11-12 31619, 2025

      • monkey[m]
        EAN-5 , they're called
      • 2025-11-12 31628, 2025
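If the add-on were stored as a separate field, splitting a printed code such as the "9 781569711019 51495>" example from earlier might look like the sketch below. The pattern, the assumption of a 13-digit main code, and the name `split_addon` are all illustrative.

```python
import re

# 13 digits (single spaces allowed), a space, a 5-digit add-on, optional ">".
WITH_ADDON = re.compile(r"(\d(?: ?\d){12}) (\d{5})>?")

def split_addon(printed: str):
    """Return (main_barcode, ean5_addon), or None if there is no add-on."""
    match = WITH_ADDON.fullmatch(printed.strip())
    if match is None:
        return None
    return match.group(1).replace(" ", ""), match.group(2)

parts = split_addon("9 781569711019 51495>")
print(parts)
```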

      • ApeKattQuest
        yea
      • 2025-11-12 31637, 2025

      • ApeKattQuest
        ok so should we put https://regex101.com/r/QNQnnY/4 into barcode? 🤔
      • 2025-11-12 31649, 2025

      • ApeKattQuest
      • 2025-11-12 31649, 2025

      • BrainzBot
        BB-467: Add Bookland Add-On Code (EAN-5) identifier
      • 2025-11-12 31610, 2025

      • ApeKattQuest
        oh and here is a note: https://tickets.metabrainz.org/browse/BB-508 (so if the barcode is selected then it shouldn't be overridden by new matching)
      • 2025-11-12 31610, 2025

      • BrainzBot
        BB-508: Identifier type selection issue