is there any documentation yet on the high-level schema used on bookbrainz? (ie. what is an edition / a publication, as opposed to the database-level documentation of what is an entity)
2015-04-16 10622, 2015
chrysn
i'm a little confused that the publication i've entered is, in its relationship to the book it publishes, called an edition.
(picking up after "it's just been released two weeks ago from #musicbrainz")
2015-04-16 10615, 2015
Freso
chrysn: :)
2015-04-16 10648, 2015
chrysn
have there been approaches to model bookbrainz in rdf so far? i'm a big fan of linked data, currently modelling rdf at unrelated projects, but i figure bookbrainz could integrate greatly there.
2015-04-16 10636, 2015
Freso
kuno and Leftmost had some chatter about JSON-LD yesterday.
The beginning of it wasn't caught by MBChatLogger apparently though. :(
2015-04-16 10616, 2015
Freso
And I can't remember where mb-chat-logger archives at.
2015-04-16 10634, 2015
Freso
Anyway, the main part of the conversation is there.
2015-04-16 10607, 2015
chrysn has to get used to not searching for "rdf" any more these days but for "json-ld"
2015-04-16 10637, 2015
Freso
:)
2015-04-16 10653, 2015
chrysn
to throw in a little of my ideas on yesterday's linked data chat:
2015-04-16 10646, 2015
chrysn
i feel that the main value in linked data comes from the ability to explicitly (or implicitly via statements that are marked accordingly) state identity.
2015-04-16 10649, 2015
chrysn
matching terry pratchett by (name, gender, birth day) is cumbersome, but if we state that our terry pratchett foaf:isPrimaryTopicOf http://en.wikipedia.org/wiki/Terry_Pratchett, those databases can be linked together unambiguously, and so can others by transitivity.
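[editor's sketch] The linking-by-identity idea in the message above can be illustrated roughly as follows. This is only a minimal sketch: the record ids (`bb:pratchett` etc.) and the function name are made up for illustration, and it assumes that two records naming the same page via foaf:isPrimaryTopicOf may be treated as the same entity. Merges chain transitively through a small union-find.

```python
from collections import defaultdict


def link_by_primary_topic(statements):
    """Group record ids that can be treated as the same entity.

    `statements` is an iterable of (record_id, page_url) pairs, each read as
    "record_id foaf:isPrimaryTopicOf page_url". Records naming the same page
    are merged, and merges chain transitively.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_record_for_page = {}
    for record, page in statements:
        find(record)  # register the record even if it never gets merged
        if page in first_record_for_page:
            union(first_record_for_page[page], record)
        else:
            first_record_for_page[page] = record

    groups = defaultdict(set)
    for record in list(parent):
        groups[find(record)].add(record)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical ids; only the Wikipedia URL comes from the chat.
statements = [
    ("bb:pratchett", "http://en.wikipedia.org/wiki/Terry_Pratchett"),
    ("mb:pratchett", "http://en.wikipedia.org/wiki/Terry_Pratchett"),
    ("mb:pratchett", "example:other-page-about-pratchett"),
    ("lib:pratchett", "example:other-page-about-pratchett"),
]
print(link_by_primary_topic(statements))
```

Here `bb:pratchett` and `lib:pratchett` never name the same page directly; they end up in one group only because `mb:pratchett` links to both pages.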
2015-04-16 10615, 2015
chrysn
in the case of bookbrainz, my (as a linked data user) primary expectation would be the clean modelling (as i know it from musicbrainz' ngs): when i describe my library, bb's modelling should give me a way to state "i have a copy of the 1st edition of the iliad" vs "i have a copy of a book that is some edition of the iliad".
2015-04-16 10606, 2015
chrysn
that pretty much boils down to my above question -- is there already a description of the semantics of publisher / edition / publication / work etc?
There's some general idea about the concepts for the different entity types, but no formal description of them yet.
2015-04-16 10601, 2015
Freso
But I think the idea is to match MB as much as possible.
2015-04-16 10626, 2015
Freso
So publication is probably ≃ release group and edition ≃ release.
2015-04-16 10652, 2015
Freso
(E.g., "some edition of the Iliad" == publication, "this edition of the Iliad" == edition)
2015-04-16 10644, 2015
ruaok_t joined the channel
2015-04-16 10619, 2015
CatQuest immediately when looking at https://bookbrainz.org/publication/7d7c5f63-7bf3-41e6-8335-ff50498cb4e4 wants to put "2003 Gollancz paperback" in to the disambiguation field :)
2015-04-16 10644, 2015
CatQuest
also, like mb had the freedb importer in the old days, should we have a "some other database" importer? initially the thing with the freedb importer was to populate the rather barren database
2015-04-16 10602, 2015
CatQuest
(which isn't the issue nowadays, which is why it's not a very done thing today)
2015-04-16 10621, 2015
loujine joined the channel
2015-04-16 10633, 2015
chrysn
CatQuest: granted, i should have put that there in the first place :-/
2015-04-16 10606, 2015
Freso joined the channel
2015-04-16 10650, 2015
chrysn
looking at ontologies in the field (eg. bibo: which is used by schema.org, or the far more generic dcterms:), they seem all pretty loose on the distinctions between the work per se, revised editions and unmodified reprints. ("the iliad is a book, so is this very edition of it, and the latter is derived from the former").
2015-04-16 10622, 2015
chrysn
is there a pre-existing ontology that has usable criteria to tell what is a work, what's an edition and what is a publication, anything that can be taken as a guideline?
2015-04-16 10656, 2015
Freso
Probably not. :)
2015-04-16 10642, 2015
chrysn
i'm currently digging through libraries, they should be the experts on that, but i'm afraid they're not experts on modern modelling :-/
2015-04-16 10618, 2015
Leo_Verto
hey chrysn
2015-04-16 10605, 2015
chrysn
hi
2015-04-16 10629, 2015
Leo_Verto
If you have any criticism or ideas what could be improved on the frontend, feel free to ping me
2015-04-16 10648, 2015
Leo_Verto
CatQuest, afaik the freedb import aftermath was a huge mess
2015-04-16 10614, 2015
Leo_Verto
and so far, we've decided to handle imports via userscripts so there is always a user responsible for the data
2015-04-16 10625, 2015
chrysn
Leo_Verto: so far, i'm trying to wrap my head around the backend model
2015-04-16 10611, 2015
Leo_Verto
we are still kinda working on creating proper definitions for the entities (see the agenda in the channel topic=
2015-04-16 10613, 2015
Leo_Verto
)
2015-04-16 10643, 2015
chrysn
so is the modelling still in flux / up to discussion?
2015-04-16 10615, 2015
Leo_Verto
the actual schema is pretty much done for now, minor changes are possible though
2015-04-16 10637, 2015
Leo_Verto
the way we define entities (especially how to properly name them) is still a huge topic
2015-04-16 10647, 2015
Leo_Verto
and what kind of database BB actually is :P
2015-04-16 10613, 2015
chrysn
so the concept of what's an entity is fixed, but which kinds of entities there are isn't?
2015-04-16 10623, 2015
chrysn
"which kind": what do you mean?
2015-04-16 10658, 2015
Leo_Verto
not really, mostly just how to name entities and for example what classifies as a publication/an edition
2015-04-16 10618, 2015
chrysn
sorry, but i don't fully understand: how can the concepts of having a book, a publication and an edition be fixed, when there are no criteria for what is what?
2015-04-16 10634, 2015
chrysn
(ie. if you don't know what it is, how can you know they are distinct concepts?)
2015-04-16 10649, 2015
Leo_Verto
oh, sorry if I explained that poorly, the concepts of different entities are pretty final. we're just working on coming up with understandable definitions and preventing edge cases
2015-04-16 10659, 2015
chrysn
i see. my impression is roughly that a book is the most abstract concept (and might, if the author changed his mind between editions, refer to completely different texts), while...
2015-04-16 10646, 2015
chrysn
... an edition describes all instances of a book that share the same "body" (they'd be derived from the same plain-text file), and...
2015-04-16 10655, 2015
Leo_Verto
I think the entities can be expressed as a chain, becoming less and less abstract: work > publication > edition
2015-04-16 10604, 2015
chrysn
... a publication is all books that are exact clones modulo serial numbers?
2015-04-16 10615, 2015
chrysn
ok, i even got them the wrong way round :-/
2015-04-16 10637, 2015
Leo_Verto
one idea is that a publication can contain multiple work
2015-04-16 10649, 2015
Leo_Verto
e.g an anthology
2015-04-16 10651, 2015
Leo_Verto
*works
2015-04-16 10641, 2015
chrysn
but in terms of "simple" books, when a re-print is issued with some typos fixed, what would be shared and what would be new?
2015-04-16 10609, 2015
Leo_Verto
the entity names are a bit confusing but that is because we want bb to store not only books but also other kinds of publications like (quoting CatQuest): "also comcs, magazines. books, ebox, audiobooks, pamplets, papyrus, stone tablets? (hey mb as waxrolls so why not)"
2015-04-16 10638, 2015
Leo_Verto
the re-print would usually be a new edition of an existing publication
2015-04-16 10602, 2015
chrysn
given i've earlier used the example of the iliad, that's a good idea (afaik that was not written at all in its first 400 years)
2015-04-16 10617, 2015
Leo_Verto
sometimes you find "Nth" edition in books, which makes the distinction very clear, unfortunately not all publishers do that
2015-04-16 10632, 2015
Leo_Verto
the iliad itself would be a work
2015-04-16 10644, 2015
Leo_Verto
publications of it might exist in different forms from different times
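[editor's sketch] The chain described in this exchange (work > publication > edition, with a publication possibly carrying several works) could be modelled roughly like this. It is not the actual BookBrainz schema: all class, field and function names are assumptions for illustration, and the edition labels are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Work:
    """The abstract creation, e.g. the Iliad itself."""
    title: str


@dataclass
class Edition:
    """One concrete issue of a publication, e.g. a reprint with typos fixed."""
    label: str
    year: Optional[int] = None  # printing year, if the book states one


@dataclass
class Publication:
    """Groups editions; may carry several works (e.g. an anthology)."""
    title: str
    works: List[Work] = field(default_factory=list)
    editions: List[Edition] = field(default_factory=list)

    def add_edition(self, label: str, year: Optional[int] = None) -> Edition:
        edition = Edition(label, year)
        self.editions.append(edition)
        return edition


def owns_some_edition(shelf: List[Edition], publication: Publication) -> bool:
    """True if any edition object on the shelf belongs to the publication."""
    return any(mine is known for mine in shelf for known in publication.editions)


iliad = Work("Iliad")
publication = Publication("Iliad", works=[iliad])
first = publication.add_edition("1st edition")
reprint = publication.add_edition("reprint with typos fixed")

my_shelf = [reprint]
print(owns_some_edition(my_shelf, publication))  # "some edition of the iliad"
print(first in my_shelf)                         # "the 1st edition of the iliad"
```

The two final checks correspond to the two statements asked for earlier in the log: owning some edition of a publication versus owning one specific edition.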
2015-04-16 10646, 2015
chrysn
at least you usually get a printing year even if there is no "Nth edition"
2015-04-16 10655, 2015
Leo_Verto
yeah
2015-04-16 10621, 2015
kuno
chrysn: "is there a pre-existing ontology that has usable criteria to tell what is a work", FRBR is the closest I guess. http://vocab.org/frbr/core.html
2015-04-16 10626, 2015
chrysn
an unmodified reprint of a work by another publisher would be a new publication or a new release?
2015-04-16 10630, 2015
kuno
(sorry if that was already mentioned, I didn't read the entire backscroll)
Leo_Verto: ok, that gives me an idea i should be able to apply
2015-04-16 10639, 2015
chrysn
kuno: that looks good, thanks :-)
2015-04-16 10626, 2015
chrysn
so the whole area of "A is a re-narration of B" and similar can be dealt with as work-work relationships
2015-04-16 10602, 2015
chrysn
and translations would be different works, again with work-work relationships
2015-04-16 10631, 2015
chrysn
how about books that heavily change their content through editions? (typical of educational or scientific books; their 40th edition barely shares a word with their 1st) -- different works?
2015-04-16 10619, 2015
Leo_Verto
well, if you were to split one of those books into different works, where would you make that cut? :P
2015-04-16 10619, 2015
Leo_Verto
Seems like a case of Theseus' paradox to me
2015-04-16 10637, 2015
chrysn
if i were to draw the line, i'd use different works when changes were made that are not obvious to be required for correctness as viewed by a reader of the older version without further knowledge of the topic of the book.
2015-04-16 10626, 2015
chrysn
ie. typos -> same book (required for correctness from general knowledge), clarification of self-contradictions -> same book (because visible to a reader without further knowledge),
2015-04-16 10654, 2015
chrysn
corrected factual statement -> new book (because not evident without knowledge of the topic)
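[editor's sketch] The rule of thumb proposed in the three messages above can be written down as a tiny classifier. This is only a sketch of chrysn's suggestion, not an agreed BookBrainz guideline; the enum members and function name are invented for illustration.

```python
from enum import Enum


class Change(Enum):
    TYPO_FIX = "typo fix"
    SELF_CONTRADICTION_FIX = "clarification of a self-contradiction"
    FACTUAL_CORRECTION = "corrected factual statement"


# Changes a reader of the older version could recognise as necessary
# without any further knowledge of the book's topic.
EVIDENT_WITHOUT_TOPIC_KNOWLEDGE = {
    Change.TYPO_FIX,
    Change.SELF_CONTRADICTION_FIX,
}


def is_new_work(changes):
    """True if at least one change is not evident from the old text alone."""
    return any(c not in EVIDENT_WITHOUT_TOPIC_KNOWLEDGE for c in changes)


print(is_new_work([Change.TYPO_FIX, Change.SELF_CONTRADICTION_FIX]))
print(is_new_work([Change.TYPO_FIX, Change.FACTUAL_CORRECTION]))
```

The hard part in practice is deciding which category a given change falls into, which is where the line stops being 100% clear.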
2015-04-16 10613, 2015
chrysn
yes, it shares problems of theseus' ship, but just because there is a paradox we shouldn't treat all ships as one -- the line may not be 100% clear all the time, but it should be possible to have a rough indication for practical purposes.
2015-04-16 10607, 2015
chrysn just went through librarything, looking for a catch given they advertise their cc-by-sa-3.0-equivalent common knowledge license, only to find out that just small parts of their database are what they call common knowledge
2015-04-16 10654, 2015
CallerNo6 refers often to the frbr work spectrum thingy @ https://pantherfile.uwm.edu/kipp/public/courses/511/511notes-bibstructs_html_m42e2c6da.png
2015-04-16 10627, 2015
CallerNo6
(best part being the explicit spectrum-ness of it)
2015-04-16 10635, 2015
chrysn
CallerNo6: that is well illustrative. would you consider the cataloging rules cut-off point indicated there suitable for the distinction between bb:Work instances?
2015-04-16 10658, 2015
Leo_Verto
Hmm, we do generally handle translations as new works though
2015-04-16 10614, 2015
Leo_Verto
But that might come in handy tomorrow
2015-04-16 10601, 2015
chrysn
what is tomorrow?
2015-04-16 10649, 2015
Leo_Verto
our weekly meeting
2015-04-16 10635, 2015
LordSputnik joined the channel
2015-04-16 10643, 2015
CallerNo6
Leo_Verto, I guess "translation" vs "free translation" might be an important distinction?
2015-04-16 10630, 2015
Leo_Verto
hey Ben :D
2015-04-16 10649, 2015
Leo_Verto
well, what exactly is a free translation?
2015-04-16 10608, 2015
CallerNo6
I don't know :-)
2015-04-16 10620, 2015
chrysn
my earlier tour through the libraries has shown me various cut-off points they're using (checked "Die Blechtrommel" in austrian and german nat'l libraries and vienna locals), but translations were always handled differently
2015-04-16 10634, 2015
chrysn
(in the sense of "right of the cut-off point")
2015-04-16 10641, 2015
chrysn
concerning free translation: sounds like re-narration in another language from the location on the chart
My initial thought is that a literal translation is not a new work. Like, if you run your novel through translate.google
2015-04-16 10640, 2015
CallerNo6
(although it might unintentionally be a comedy?)
2015-04-16 10608, 2015
CallerNo6
But a literary translation is more "free".
2015-04-16 10634, 2015
chrysn
i think there is at least one good reason to keep translations as a "work": to cleanly distinguish different translations. (where would that happen, otherwise?)
2015-04-16 10608, 2015
CallerNo6
Agreed, from a db-point-of-view, a translation needs to be a new thingy.
2015-04-16 10646, 2015
chrysn
different widely published translations are a commonplace thing. (at some point in time, someone will want to do the honorable task of entering the lord of the rings trilogy & co)
2015-04-16 10601, 2015
CallerNo6
How work-y is that new thingy? I dunno. It's on the spectrum. Like most of my favorite people.
2015-04-16 10622, 2015
chrysn
with my rdf-modelling hat on, i'd advocate allowing the work to be as fine-grained as the data provider intends to be, but having transitive properties between them to cover practical questions
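[editor's sketch] One way to read "fine-grained works plus transitive properties" is sketched below: each translation or revision is its own work, and a transitive walk over work-work relationships answers the practical question "are these two forms of the same thing?". The relation names (`translation_of`, `revision_of`), the function names and the "translation A/B" labels are placeholders, not BookBrainz relationship types.

```python
def derived_from(work, relations):
    """Return every work reachable from `work` by following relations transitively.

    `relations` maps a work to a list of (relation_kind, source_work) pairs.
    """
    seen = set()
    stack = [work]
    while stack:
        current = stack.pop()
        for _kind, source in relations.get(current, []):
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return seen


def same_family(a, b, relations):
    """True if the two works are equal or share any ancestor work."""
    family_a = derived_from(a, relations) | {a}
    family_b = derived_from(b, relations) | {b}
    return bool(family_a & family_b)


relations = {
    "translation A": [("translation_of", "Die Blechtrommel")],
    "translation B": [("translation_of", "Die Blechtrommel")],
    "revised translation A": [("revision_of", "translation A")],
}

print(sorted(derived_from("revised translation A", relations)))
print(same_family("revised translation A", "translation B", relations))
```

The two translations stay cleanly distinguishable as separate works, while the shared-ancestor check still lets a data consumer treat them as the same book when that coarser view is enough.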
2015-04-16 10636, 2015
CallerNo6
I agree to a point. At some point, I'm less comfortable using the word "work" if it's going to stray too far from common usage.
2015-04-16 10659, 2015
chrysn
my impression is that "work" is already a pretty generic term (as opposed to "book")
2015-04-16 10604, 2015
CallerNo6
(but by all means, fine-grain all the thingies)
2015-04-16 10619, 2015
Leo_Verto
talking about "book"
2015-04-16 10639, 2015
Leo_Verto
we still don't know exactly how to describe BookBrainz :D
2015-04-16 10649, 2015
chrysn
"A community for collecting well-structured metadata about textual publications"?
2015-04-16 10626, 2015
Leo_Verto
hm, that sounds a lot better than what we were working with
2015-04-16 10640, 2015
Leo_Verto
our main problem is covering all the sources we want BB to include
2015-04-16 10608, 2015
chrysn
... without grabbing too far and becoming an ontology of every creative work
2015-04-16 10614, 2015
Leo_Verto
the obvious solution would be "books", another suggestion is "literature"
2015-04-16 10642, 2015
Leo_Verto
which Leftmost suggested might be understood differently by most native English speakers
2015-04-16 10614, 2015
chrysn
what's bad about "books"? after all, i don't see bookbrainz include scientific papers right now.
2015-04-16 10633, 2015
Leo_Verto
well, we are planning to include a lot more work types than just books
i appreciate that, given i've spent a lot of time with bad metadata about scientific publications (half my bachelor thesis, actually...)
2015-04-16 10615, 2015
kuno
the focus is books, doesn't mean other stuff isn't allowed. (see musicbrainz and having audiobooks and language learning audio and such in the DB)
2015-04-16 10645, 2015
chrysn
i figure the audiobooks are in for an excellent interlinking between those projects
2015-04-16 10655, 2015
LordSputnik
chrysn: me too - referencing is a pain when you have to look in every paper to find other relevant papers ;)
2015-04-16 10612, 2015
Leo_Verto
audiobooks are actually a topic we need to talk about with the MB people
2015-04-16 10634, 2015
chrysn
yeah i've partially automated the process to find what i'd call the topic relevance of a paper, with actually good results, but not enough time yet to actually write about it
2015-04-16 10640, 2015
ruaok joined the channel
2015-04-16 10635, 2015
CallerNo6
rather than 'becoming an ontology of every creative work', why not adopt an off-the-shelf ontology of every creative work?
2015-04-16 10615, 2015
yeeeargh joined the channel
2015-04-16 10642, 2015
CallerNo6 feels like sometimes the development of ontologies is too implementation-oriented
2015-04-16 10652, 2015
chrysn
i don't think it's the implementation that drives the constraining of ontologies, it's the quality assurance that needs it
2015-04-16 10626, 2015
chrysn
it's hard to review someone's free-form thinking, but once you share a structure, you can argue on that
2015-04-16 10631, 2015
chrysn
*based on that
2015-04-16 10617, 2015
CallerNo6
well, what I mean is, when you say "keep translations as a "work": to cleanly distinguish different translations"...
2015-04-16 10602, 2015
CallerNo6
...what I hear is "let's define 'work' based (in part) on our existing implementation"
2015-04-16 10610, 2015
CallerNo6
... where I might have said "keep translations as separate *entities*"