#bookbrainz-devel

      • chrysn joined the channel
      • 2015-04-16 10611, 2015

      • chrysn
        is there any documentation yet on the high-level schema used on bookbrainz? (ie. what is an edition / a publication, as opposed to the database-level documentation of what is an entity)
      • 2015-04-16 10622, 2015

      • chrysn
        i'm a little confused that the publication i've entered is, in the relationship to the book it publishes, called an edition.
      • 2015-04-16 10635, 2015

      • ruaok joined the channel
      • 2015-04-16 10635, 2015

      • Freso joined the channel
      • 2015-04-16 10617, 2015

      • Freso joined the channel
      • 2015-04-16 10609, 2015

      • Freso joined the channel
      • 2015-04-16 10614, 2015

      • chrysn
        (picking up after "it's just been released two weeks ago from #musicbrainz")
      • 2015-04-16 10615, 2015

      • Freso
        chrysn: :)
      • 2015-04-16 10648, 2015

      • chrysn
        have there been approaches to model bookbrainz in rdf so far? i'm a big fan of linked data, currently modelling rdf at unrelated projects, but i figure bookbrainz could integrate greatly there.
      • 2015-04-16 10636, 2015

      • Freso
        kuno and Leftmost had some chatter about JSON-LD yesterday.
      • 2015-04-16 10600, 2015

      • chrysn goes to the archive to read it up
      • 2015-04-16 10645, 2015

      • Freso
        The beginning of it wasn't caught by MBChatLogger apparently though. :(
      • 2015-04-16 10616, 2015

      • Freso
        And I can't remember where mb-chat-logger archives at.
      • 2015-04-16 10634, 2015

      • Freso
        Anyway, the main part of the conversation is there.
      • 2015-04-16 10607, 2015

      • chrysn has to get used to not searching for "rdf" any more these days but for "json-ld"
      • 2015-04-16 10637, 2015

      • Freso
        :)
      • 2015-04-16 10653, 2015

      • chrysn
        to throw in a little of my ideas on yesterday's linked data chat:
      • 2015-04-16 10646, 2015

      • chrysn
        i feel that the main value in linked data comes from the ability to explicitly (or implicitly via statements that are marked accordingly) state identity.
      • 2015-04-16 10649, 2015

      • chrysn
        matching terry pratchett by (name, gender, birth day) is cumbersome, but if we state that our terry pratchett foaf:isPrimaryTopicOf http://en.wikipedia.org/wiki/Terry_Pratchett, those databases can be linked together unambiguously, and so can others by transitivity.
      • 2015-04-16 10615, 2015
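
A minimal sketch of the "linked by transitivity" point above, not anything BookBrainz implements. It assumes each database publishes pairs of (its own record, page the record is the primary topic of) and treats a shared page as evidence that two records describe the same thing. The identifiers `bb:terry-pratchett` and `otherdb:author/42` are hypothetical; only the Wikipedia URL comes from the conversation.

```python
from collections import defaultdict


def identity_clusters(links):
    """Group identifiers that are connected, directly or transitively, by links."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return list(clusters.values())


WIKIPEDIA_PAGE = "http://en.wikipedia.org/wiki/Terry_Pratchett"

# Hypothetical record identifiers from two unrelated databases.
links = [
    ("bb:terry-pratchett", WIKIPEDIA_PAGE),
    ("otherdb:author/42", WIKIPEDIA_PAGE),
]

clusters = identity_clusters(links)
for cluster in clusters:
    print(sorted(cluster))
```

Neither database mentions the other, yet both records land in one cluster because they point at the same page. No matching on name, gender, or birth date is involved.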

      • chrysn
        in the case of bookbrainz, my (as a linked data user) primary expectation would be the clean modelling (as i know it from musicbrainz' ngs): when i describe my library, bb's modelling should give me a way to state "i have a copy of the 1st edition of the iliad" vs "i have a copy of a book that is some edition of the iliad".
      • 2015-04-16 10606, 2015

      • chrysn
        that pretty much boils down to my above question -- is there already a description of the semantics of publisher / edition / publication / work etc?
      • 2015-04-16 10605, 2015

      • ruaok joined the channel
      • 2015-04-16 10614, 2015

      • Freso
        chrysn: That's part of what http://chatlogs.musicbrainz.org/bookbrainz-devel/… is about. :)
      • 2015-04-16 10639, 2015

      • Freso
        There's some general idea about the concepts for the different entity types, but no formal description of them yet.
      • 2015-04-16 10601, 2015

      • Freso
        But I think the idea is to match MB as much as possible.
      • 2015-04-16 10626, 2015

      • Freso
        So publication is probably ≃ release group and edition ≃ release.
      • 2015-04-16 10652, 2015

      • Freso
        (E.g., "some edition of the Iliad" == publication, "this edition of the Iliad" == edition)
      • 2015-04-16 10644, 2015

      • ruaok_t joined the channel
      • 2015-04-16 10619, 2015

      • CatQuest immediately when looking at https://bookbrainz.org/publication/7d7c5f63-7bf3-41e6-8335-ff50498cb4e4 wants to put "2003 Gollancz paperback" into the disambiguation field :)
      • 2015-04-16 10644, 2015

      • CatQuest
        also, like mb had the freedb importer in the old days, should we have a "some other database" importer? initially the point of the freedb importer was to populate the rather barren database
      • 2015-04-16 10602, 2015

      • CatQuest
        (which isn't the issue nowadays, which is why it's not done much today)
      • 2015-04-16 10621, 2015

      • loujine joined the channel
      • 2015-04-16 10633, 2015

      • chrysn
        CatQuest: granted, i should have put that there in the first place :-/
      • 2015-04-16 10606, 2015

      • Freso joined the channel
      • 2015-04-16 10650, 2015

      • chrysn
        looking at ontologies in the field (eg. bibo: which is used by schema.org, or the far more generic dcterms:), they seem all pretty loose on the distinctions between the work per se, revised editions and unmodified reprints. ("the iliad is a book, so is this very edition of it, and the latter is derived from the former").
      • 2015-04-16 10622, 2015

      • chrysn
        is there a pre-existing ontology that has usable criteria to tell what is a work, what's an edition and what is a publication, anything that can be taken as a guideline?
      • 2015-04-16 10656, 2015

      • Freso
        Probably not. :)
      • 2015-04-16 10642, 2015

      • chrysn
        i'm currently digging through libraries, they should be the experts on that, but i'm afraid they're not experts on modern modelling :-/
      • 2015-04-16 10618, 2015

      • Leo_Verto
        hey chrysn
      • 2015-04-16 10605, 2015

      • chrysn
        hi
      • 2015-04-16 10629, 2015

      • Leo_Verto
        If you have any criticism or ideas what could be improved on the frontend, feel free to ping me
      • 2015-04-16 10648, 2015

      • Leo_Verto
        CatQuest, afaik the freedb import aftermath was a huge mess
      • 2015-04-16 10614, 2015

      • Leo_Verto
        and so far, we've decided to handle imports via userscripts so there is always a user responsible for the data
      • 2015-04-16 10625, 2015

      • chrysn
        Leo_Verto: so far, i'm trying to wrap my head around the backend model
      • 2015-04-16 10611, 2015

      • Leo_Verto
        we are still kinda working on creating proper definitions for the entities (see the agenda in the channel topic)
      • 2015-04-16 10613, 2015

      • chrysn
        so is the modelling still in flux / up to discussion?
      • 2015-04-16 10615, 2015

      • Leo_Verto
        the actual schema is pretty much done for now, minor changes are possible though
      • 2015-04-16 10637, 2015

      • Leo_Verto
        the way we define entities (especially how to properly name them) is still a huge topic
      • 2015-04-16 10647, 2015

      • Leo_Verto
        and what kind of database BB actually is :P
      • 2015-04-16 10613, 2015

      • chrysn
        so the concept of what's an entity is fixed, but which kinds of entities there are isn't?
      • 2015-04-16 10623, 2015

      • chrysn
        "which kind": what do you mean?
      • 2015-04-16 10658, 2015

      • Leo_Verto
        not really, mostly just how to name entities and for example what classifies as a publication/an edition
      • 2015-04-16 10618, 2015

      • chrysn
        sorry, but i don't fully understand: how can the concepts of having a book, a publication and an edition be fixed, when there are no criteria for what is what?
      • 2015-04-16 10634, 2015

      • chrysn
        (ie. if you don't know what it is, how can you know they are distinct concepts?)
      • 2015-04-16 10649, 2015

      • Leo_Verto
        oh, sorry if I explained that poorly, the concepts of different entities are pretty final. we're just working on coming up with understandable definitions and preventing edge cases
      • 2015-04-16 10659, 2015

      • chrysn
        i see. my impression is roughly that a book is the most abstract concept (and might, if the author changed his mind between editions, refer to completely different texts), while...
      • 2015-04-16 10646, 2015

      • chrysn
        ... an edition describes all instances of a book that share the same "body" (they'd be derived from the same plain-text file), and...
      • 2015-04-16 10655, 2015

      • Leo_Verto
        I think the entities can be expressed as a chain, becoming less and less abstract: work > publication > edition
      • 2015-04-16 10604, 2015

      • chrysn
        ... a publication is all books that are exact clones modulo serial numbers?
      • 2015-04-16 10615, 2015

      • chrysn
        ok, i even got them the wrong way round :-/
      • 2015-04-16 10637, 2015

      • Leo_Verto
        one idea is that a publication can contain multiple work
      • 2015-04-16 10649, 2015

      • Leo_Verto
        e.g an anthology
      • 2015-04-16 10651, 2015

      • Leo_Verto
        *works
      • 2015-04-16 10641, 2015
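
The chain described above (work > publication > edition, with a publication able to contain several works) can be sketched as plain data classes. This is an illustration under stated assumptions, not the actual BookBrainz schema: the class and field names are hypothetical, and so are the anthology, its contents, and its editions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Work:
    """Most abstract level: the creative work itself."""
    title: str


@dataclass
class Edition:
    """Least abstract level: one concrete edition of a publication."""
    label: str
    year: Optional[int] = None


@dataclass
class Publication:
    """Middle level: groups editions and may contain several works."""
    title: str
    works: List[Work] = field(default_factory=list)
    editions: List[Edition] = field(default_factory=list)


# Hypothetical anthology containing two works, issued in two editions.
anthology = Publication(
    title="Collected Stories",
    works=[Work("First Story"), Work("Second Story")],
    editions=[Edition("1st edition", 1990), Edition("2nd edition", 1995)],
)

print(len(anthology.works), "works,", len(anthology.editions), "editions")
```

Keeping the works list on the publication rather than on the edition is one possible reading of the anthology remark; attaching works per edition would be an equally plausible design.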

      • chrysn
        but in terms of "simple" books, when a re-print is issued with some typos fixed, what would be shared and what would be new?
      • 2015-04-16 10609, 2015

      • Leo_Verto
        the entity names are a bit confusing but that is because we want bb to store not only books but also other kinds of publications like (quoting CatQuest): "also comcs, magazines. books, ebox, audiobooks, pamplets, papyrus, stone tablets? (hey mb as waxrolls so why not)"
      • 2015-04-16 10638, 2015

      • Leo_Verto
        the re-print would usually be a new edition of an existing publication
      • 2015-04-16 10602, 2015

      • chrysn
        given i've earlier used the example of the iliad, that's a good idea (afaik that was not written at all in its first 400 years)
      • 2015-04-16 10617, 2015

      • Leo_Verto
        sometimes you find "Nth" edition in books, which makes the distinction very clear, unfortunately not all publishers do that
      • 2015-04-16 10632, 2015

      • Leo_Verto
        the Iliad itself would be a work
      • 2015-04-16 10644, 2015

      • Leo_Verto
        publications of it might exist in different forms from different times
      • 2015-04-16 10646, 2015

      • chrysn
        at least you usually get a printing year even if there is no "Nth edition"
      • 2015-04-16 10655, 2015

      • Leo_Verto
        yeah
      • 2015-04-16 10621, 2015

      • kuno
        chrysn: "is there a pre-existing ontology that has usable criteria to tell what is a work", FRBR is the closest I guess. http://vocab.org/frbr/core.html
      • 2015-04-16 10626, 2015

      • chrysn
        an unmodified reprint of a work by another publisher would be a new publication or a new release?
      • 2015-04-16 10630, 2015

      • kuno
        (sorry if that was already mentioned, I didn't read the entire backscroll)
      • 2015-04-16 10634, 2015

      • Leo_Verto
        a new publication
      • 2015-04-16 10610, 2015

      • chrysn
        Leo_Verto: ok, that gives me an idea i should be able to apply
      • 2015-04-16 10639, 2015

      • chrysn
        kuno: that looks good, thanks :-)
      • 2015-04-16 10626, 2015

      • chrysn
        so the whole area of "A is a re-narration of B" and similar can be dealt with as work-work relationships
      • 2015-04-16 10602, 2015

      • chrysn
        and translations would be different works, again with work-work relationships
      • 2015-04-16 10631, 2015

      • chrysn
        how about books that heavily change their content through editions? (typical of educational or scientific books; their 40th edition barely shares a word with their 1st) -- different works?
      • 2015-04-16 10619, 2015

      • Leo_Verto
        well, if you were to split one of those books into different works, where would you make that cut? :P
      • 2015-04-16 10619, 2015

      • Leo_Verto
        Seems like a case of Theseus' paradox to me
      • 2015-04-16 10637, 2015

      • chrysn
        if i were to draw the line, i'd use different works when changes were made that are not obvious to be required for correctness as viewed by a reader of the older version without further knowledge of the topic of the book.
      • 2015-04-16 10626, 2015

      • chrysn
        ie. typos -> same book (required for correctness from general knowledge), clarification of self-contradictions -> same book (because visible to a reader without further knowledge),
      • 2015-04-16 10654, 2015

      • chrysn
        corrected factual statement -> new book (because not evident without knowledge of the topic)
      • 2015-04-16 10613, 2015
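
The rule of thumb chrysn proposes in the last three messages can be written down as a small predicate. This is only a sketch of that one proposal, not an agreed BookBrainz guideline; the `Change` categories and the function name `is_new_work` are hypothetical labels for the three cases named in the chat.

```python
from enum import Enum


class Change(Enum):
    TYPO_FIX = "typo fix"
    SELF_CONTRADICTION_FIX = "clarified self-contradiction"
    FACTUAL_CORRECTION = "corrected factual statement"


# Changes a reader of the older version could recognise as necessary
# without any further knowledge of the book's topic.
EVIDENT_WITHOUT_TOPIC_KNOWLEDGE = {
    Change.TYPO_FIX,
    Change.SELF_CONTRADICTION_FIX,
}


def is_new_work(changes):
    """Return True if any change goes beyond what such a reader could see is needed."""
    return any(change not in EVIDENT_WITHOUT_TOPIC_KNOWLEDGE for change in changes)


print(is_new_work([Change.TYPO_FIX]))
print(is_new_work([Change.TYPO_FIX, Change.FACTUAL_CORRECTION]))
```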

      • chrysn
        yes, it shares problems of theseus' ship, but just because there is a paradox we shouldn't treat all ships as one -- the line may not be 100% clear all the time, but it should be possible to have a rough indication for practical purposes.
      • 2015-04-16 10607, 2015

      • chrysn just went through librarything, looking for a catch given they advertise their cc-by-sa-3.0-equivalent common knowledge license, only to find out that just small parts of their database are what they call common knowledge
      • 2015-04-16 10654, 2015

      • CallerNo6 refers often to the frbr work spectrum thingy @ https://pantherfile.uwm.edu/kipp/public/courses/511/511notes-bibstructs_html_m42e2c6da.png
      • 2015-04-16 10627, 2015

      • CallerNo6
        (best part being the explicit spectrum-ness of it)
      • 2015-04-16 10635, 2015

      • chrysn
        CallerNo6: that is well illustrative. would you consider the cataloging rules cut-off point indicated there suitable for the distinction between bb:Work instances?
      • 2015-04-16 10658, 2015

      • Leo_Verto
        Hmm, we do generally handle translations as new works though
      • 2015-04-16 10614, 2015

      • Leo_Verto
        But that might come in handy tomorrow
      • 2015-04-16 10601, 2015

      • chrysn
        what is tomorrow?
      • 2015-04-16 10649, 2015

      • Leo_Verto
        our weekly meeting
      • 2015-04-16 10635, 2015

      • LordSputnik joined the channel
      • 2015-04-16 10643, 2015

      • CallerNo6
        Leo_Verto, I guess "translation" vs "free translation" might be an important distinction?
      • 2015-04-16 10630, 2015

      • Leo_Verto
        hey Ben :D
      • 2015-04-16 10649, 2015

      • Leo_Verto
        well, what exactly is a free translation?
      • 2015-04-16 10608, 2015

      • CallerNo6
        I don't know :-)
      • 2015-04-16 10620, 2015

      • chrysn
        my earlier tour through the libraries has shown me various cut-off points they're using (checked "Die Blechtrommel" in austrian and german nat'l libraries and vienna locals), but translations were always handled differently
      • 2015-04-16 10634, 2015

      • chrysn
        (in the sense of "right of the cut-off point")
      • 2015-04-16 10641, 2015

      • chrysn
        concerning free translation: sounds like re-narration in another language from the location on the chart
      • 2015-04-16 10614, 2015

      • Leo_Verto
        mhm
      • 2015-04-16 10626, 2015

      • Leo_Verto
        Leo_Verto has changed the topic to: http://bookbrainz.org | https://github.com/bookbrainz | Want to help? Grab a task from http://tickets.musicbrainz.org/browse/BB | http://bit.ly/1EDx7Lb | Agenda: Work/Pub/Edition definitions, relationship schema, entity subtypes
      • 2015-04-16 10628, 2015

      • CallerNo6
        My initial thought is that a literal translation is not a new work. Like, if you run your novel through translate.google
      • 2015-04-16 10640, 2015

      • CallerNo6
        (although it might unintentionally be a comedy?)
      • 2015-04-16 10608, 2015

      • CallerNo6
        But a literary translation is more "free".
      • 2015-04-16 10634, 2015

      • chrysn
        i think there is at least one good reason to keep translations as a "work": to cleanly distinguish different translations. (where would that happen, otherwise?)
      • 2015-04-16 10608, 2015

      • CallerNo6
        Agreed, from a db-point-of-view, a translation needs to be a new thingy.
      • 2015-04-16 10646, 2015

      • chrysn
        different widely published translations are a commonplace thing. (at some point in time, someone will want to do the honorable task of entering the lord of the rings trilogy & co)
      • 2015-04-16 10601, 2015

      • CallerNo6
        How work-y is that new thingy? I dunno. It's on the spectrum. Like most of my favorite people.
      • 2015-04-16 10622, 2015

      • chrysn
        with my rdf-modelling hat on, i'd advocate allowing the work to be as fine-grained as the data provider intends to be, but having transitive properties between them to cover practical questions
      • 2015-04-16 10636, 2015
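
One way to read the suggestion above: keep works as fine-grained as the data provider likes, and answer coarse questions by following a work-work relationship transitively. The sketch below assumes a single hypothetical "derived from" relationship and made-up work identifiers; it is not an existing BookBrainz relationship type.

```python
from collections import deque


def all_sources(work, derived_from):
    """Return every work that `work` is derived from, directly or transitively."""
    seen = set()
    queue = deque([work])
    while queue:
        current = queue.popleft()
        for source in derived_from.get(current, ()):
            if source not in seen:
                seen.add(source)
                queue.append(source)
    return seen


# Hypothetical fine-grained works: a translation and a re-narration of it.
derived_from = {
    "iliad-english-translation": ["iliad"],
    "iliad-retold-for-children": ["iliad-english-translation"],
}

print(sorted(all_sources("iliad-retold-for-children", derived_from)))
```

With this, a practical question such as "is this some form of the Iliad?" becomes a membership test on the result, even though the re-narration is only linked directly to the translation.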

      • CallerNo6
        I agree to a point. At some point, I'm less comfortable using the word "work" if it's going to stray too far from common usage.
      • 2015-04-16 10659, 2015

      • chrysn
        my impression is that "work" is already a pretty generic term (as opposed to "book")
      • 2015-04-16 10604, 2015

      • CallerNo6
        (but by all means, fine-grain all the thingies)
      • 2015-04-16 10619, 2015

      • Leo_Verto
        talking about "book"
      • 2015-04-16 10639, 2015

      • Leo_Verto
        we still don't know exactly how to describe BookBrainz :D
      • 2015-04-16 10649, 2015

      • chrysn
        "A community for collecting well-structured metadata about textual publications"?
      • 2015-04-16 10626, 2015

      • Leo_Verto
        hm, that sounds a lot better than what we were working with
      • 2015-04-16 10640, 2015

      • Leo_Verto
        our main problem is covering all the sources we want BB to include
      • 2015-04-16 10608, 2015

      • chrysn
        ... without grabbing too far and becoming an ontology of every creative work
      • 2015-04-16 10614, 2015

      • Leo_Verto
        the obvious solution would be "books", another suggestion is "literature"
      • 2015-04-16 10642, 2015

      • Leo_Verto
        which Leftmost suggested might be understood differently by most native English speakers
      • 2015-04-16 10614, 2015

      • chrysn
        what's bad about "books"? after all, i don't see bookbrainz include scientific papers right now.
      • 2015-04-16 10633, 2015

      • Leo_Verto
        well, we are planning to include a lot more work types than just books
      • 2015-04-16 10608, 2015

      • Leo_Verto
        oh yeah
      • 2015-04-16 10641, 2015

      • Leo_Verto
        publication #1 :D
      • 2015-04-16 10604, 2015

      • chrysn
        i appreciate that, given i've spent a lot of time with bad metadata about scientific publications (half my bachelor thesis, actually...)
      • 2015-04-16 10615, 2015

      • kuno
        the focus is books, doesn't mean other stuff isn't allowed. (see musicbrainz and having audiobooks and language learning audio and such in the DB)
      • 2015-04-16 10645, 2015

      • chrysn
        i figure the audiobooks are in for an excellent interlinking between those projects
      • 2015-04-16 10655, 2015

      • LordSputnik
        chrysn: me too - referencing is a pain when you have to look in every paper to find other relevant papers ;)
      • 2015-04-16 10612, 2015

      • Leo_Verto
        audiobooks are actually a topic we need to talk about with the MB people
      • 2015-04-16 10634, 2015

      • chrysn
        yeah i've partially automated the process to find what i'd call the topic relevance of a paper, with actually good results, but not enough time yet to actually write about it
      • 2015-04-16 10640, 2015

      • ruaok joined the channel
      • 2015-04-16 10635, 2015

      • CallerNo6
        rather than 'becoming an ontology of every creative work', why not adopt an off-the-shelf ontology of every creative work?
      • 2015-04-16 10615, 2015

      • yeeeargh joined the channel
      • 2015-04-16 10642, 2015

      • CallerNo6 feels like sometimes the development of ontologies is too implementation-oriented
      • 2015-04-16 10652, 2015

      • chrysn
        i don't think it's the implementation that drives the constraining of ontologies, it's the quality assurance that needs it
      • 2015-04-16 10626, 2015

      • chrysn
        it's hard to review someone's free-form thinking, but once you share a structure, you can argue on that
      • 2015-04-16 10631, 2015

      • chrysn
        *based on that
      • 2015-04-16 10617, 2015

      • CallerNo6
        well, what I mean is, when you say "keep translations as a "work": to cleanly distinguish different translations"...
      • 2015-04-16 10602, 2015

      • CallerNo6
        ...what I hear is "let's define 'work' based (in part) on our existing implementation"
      • 2015-04-16 10610, 2015

      • CallerNo6
        ... where I might have said "keep translations as separate *entities*"