djce (~davide@195.60.9.122) has joined #musicbrainz
2003-05-01 12104, 2003
ruaok
hey.
2003-05-01 12128, 2003
ruaok
so, I've got a q for you.
2003-05-01 12146, 2003
djce
ok
2003-05-01 12118, 2003
ruaok
In the tunepimp library I am creating a nice OO model for background processing the TRM generation, the TRM lookup, and optionally the filelookup if the TRM lookup yields nothing.
2003-05-01 12140, 2003
ruaok
I've got a pipeline set up that moves files through the process nicely.
2003-05-01 12140, 2003
ruaok
However, if a user inserts 1000 files into a TP-based application and lets it churn through all of them, it will do file lookups for each of the files that were not found.
2003-05-01 12156, 2003
djce
so far so good...
2003-05-01 12159, 2003
ruaok
In the current tagger, the file lookups only happen when a user requests them.
2003-05-01 12111, 2003
djce
ah. ok.
2003-05-01 12121, 2003
ruaok
In this new model, they would ALWAYS be done when the TRM doesn't return a match.
2003-05-01 12154, 2003
ruaok
Not optimal, but I would like to not have the user wait for these things when they can be done in the background.
It would be nice if, when a file was not found, the application had already done one lookup so the user can take the next step right away.
2003-05-01 12133, 2003
ruaok
That presents more traffic against the server.
2003-05-01 12139, 2003
ruaok
Which I don't like.
2003-05-01 12109, 2003
djce
Because of load on the server, or because of network speed for the client?
2003-05-01 12115, 2003
ruaok
The alternative is to give the application writer some guidelines on using heuristics to reduce the number of needless lookups, but that makes working with the TP more complicated.
2003-05-01 12121, 2003
ruaok
Both, really.
2003-05-01 12128, 2003
ruaok
More the server, though.
2003-05-01 12104, 2003
ruaok
Scalability always nags me in the back of my mind, and while this is a clean architectural decision, it has potentially severe ramifications on the server.
2003-05-01 12112, 2003
djce
part of that could be saved by combining it into one request-response.
2003-05-01 12122, 2003
ruaok
And I don't like wasting our precious bandwidth.
2003-05-01 12132, 2003
ruaok
I had thought about that as well.
2003-05-01 12154, 2003
djce
you could even combine it with the TRM generation, if the MB_SERVER is set to zim.
2003-05-01 12101, 2003
ruaok
Maybe the TRM lookup should be skipped and we should just do a filelookup for each file.
2003-05-01 12102, 2003
djce
If you're serious about saving network traffic, you should consider a single request-response that handles multiple files.
2003-05-01 12133, 2003
djce
so maybe one sigserver lookup per file,
2003-05-01 12155, 2003
djce
but then a combined TRM-lookup-plus-file-lookup for /n/ files.
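To illustrate the batching idea, here is a minimal sketch in Python, not tunepimp code. The helper name `batches` and the batch size are assumptions made for illustration; the only point shown is that /n/ files can share one request-response instead of needing one each.

```python
def batches(items, n):
    """Split a list of files into groups of at most n (hypothetical helper)."""
    if n < 1:
        raise ValueError("batch size must be at least 1")
    return [items[i:i + n] for i in range(0, len(items), n)]


files = ["a.mp3", "b.mp3", "c.mp3", "d.mp3", "e.mp3"]
# With n = 2, five files need three combined requests instead of five.
requests = batches(files, 2)
```

Batching of this kind reduces the number of round trips, but the lookup work done on the server per file stays the same.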
2003-05-01 12159, 2003
ruaok
And what do we need to conserve more?
2003-05-01 12108, 2003
ruaok
network bandwidth or server capacity?
2003-05-01 12126, 2003
djce
mmmm.
2003-05-01 12142, 2003
ruaok
Your approach saves bandwidth, which is not too drastic in this case, IMHO.
2003-05-01 12152, 2003
djce
ideally the library should accept some sort of "hint" from the calling app
2003-05-01 12152, 2003
ruaok
But the server capacity is the same in the end.
2003-05-01 12108, 2003
djce
to suggest which files to look up, and which not to do yet.
2003-05-01 12115, 2003
ruaok
I'm just trying to avoid making calls to the server that the user may never need.
2003-05-01 12124, 2003
djce
example: often when I use the tagger, I load in many files
2003-05-01 12133, 2003
ruaok
hints: yes, that is what I was hinting at. :-)
2003-05-01 12134, 2003
djce
but then never follow it through and throw half the results away,
2003-05-01 12137, 2003
ruaok groans
2003-05-01 12144, 2003
ruaok
Ding.
2003-05-01 12156, 2003
djce
well, from the TP lib point of view, that's easy.
2003-05-01 12103, 2003
djce
It just moves the hard work to the calling app.
2003-05-01 12114, 2003
djce
Unless the app is lazy and either
2003-05-01 12118, 2003
ruaok
Ding, and the whole idea behind TP is to make the calling app a snap.
2003-05-01 12123, 2003
djce
a) always does the file lookup immediately or
2003-05-01 12136, 2003
djce
b) always defers the file lookup until the last minute.
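A rough sketch of how such a per-file "hint" might look, written in Python for brevity. `LookupHint` and `plan_file_lookups` are hypothetical names, not part of the tunepimp API; a lazy app would simply never pass hints and take the default, which corresponds to option (a) or (b).

```python
from enum import Enum


class LookupHint(Enum):
    """Hypothetical hint values supplied by the calling app."""
    IMMEDIATE = "immediate"  # option (a): file lookup right after a TRM miss
    DEFERRED = "deferred"    # option (b): wait until the user asks for it


def plan_file_lookups(trm_misses, hints, default=LookupHint.DEFERRED):
    """Split files with no TRM match into 'look up now' and 'look up later'."""
    now, later = [], []
    for path in trm_misses:
        hint = hints.get(path, default)
        (now if hint is LookupHint.IMMEDIATE else later).append(path)
    return now, later


# Only b.mp3 is hinted as interesting; the others fall back to the default.
now, later = plan_file_lookups(
    ["a.mp3", "b.mp3", "c.mp3"],
    {"b.mp3": LookupHint.IMMEDIATE},
)
```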
2003-05-01 12157, 2003
djce
The calling app can still be easy,
2003-05-01 12104, 2003
djce
but to be /slick/ requires some effort.
2003-05-01 12117, 2003
djce
I think that's the best you can do with any lib.
2003-05-01 12137, 2003
djce
Using a lib is never just a matter of "plug it in"; you always need to understand the best way to use the tool.
2003-05-01 12138, 2003
ruaok
You're right on the money.
2003-05-01 12101, 2003
ruaok
At the same time, lots of people never learn the proper way to do things.
2003-05-01 12111, 2003
djce
Right on.
2003-05-01 12121, 2003
ruaok
And providing an easy, almost foolproof way of doing things is the best way to avoid that.
2003-05-01 12139, 2003
ruaok
So, I'm debating back and forth on this issue, and have come to no good conclusions.
2003-05-01 12142, 2003
djce
Easy, accurate, it will work. But suboptimal.
2003-05-01 12156, 2003
djce
Optimising the app is the work of the app writer, not you.
2003-05-01 12105, 2003
djce
unless you write the app too of course :-)
2003-05-01 12107, 2003
djce
$accounts{'djce'}->debit("0.02") :-)
2003-05-01 12110, 2003
ruaok
And I do plan to do that, but I won't be the only one doing this.
2003-05-01 12119, 2003
ruaok
:-)
2003-05-01 12136, 2003
djce has to stop quoting in Perl
2003-05-01 12113, 2003
ruaok
My other thought is to simply say that the user experience is important and that such lookup queries can be replicated to mirror servers.
2003-05-01 12145, 2003
djce
also true
2003-05-01 12112, 2003
ruaok
Perhaps there is some way the TP lib could delay the lookups if the user does not appear to be using them.
2003-05-01 12131, 2003
ruaok
But that is fraught with peril -- it seems a sketchy proposition at best.
2003-05-01 12135, 2003
ruaok
So, if you had to make this call, what call would you make?
2003-05-01 12136, 2003
djce
given that the logic for "intelligently" pre-looking-up some files is both tricky and ill-defined,
2003-05-01 12150, 2003
djce
I currently would have to opt for no pre-lookups at all.
2003-05-01 12110, 2003
djce
But also think about single-RDF calls to handle several files at once.
2003-05-01 12130, 2003
ruaok
Hmmmm. Lemme go read some server code real quick.
2003-05-01 12142, 2003
ruaok
Go check out QS.pm, and look for TrackInfoFromTRMId
2003-05-01 12155, 2003
ruaok
I think I have the answer for this one already. :-)
2003-05-01 12129, 2003
ruaok
If the TRM lookup fails it does a filelookup already.
2003-05-01 12131, 2003
djce
I see it, but I don't pretend to understand it :-(
2003-05-01 12107, 2003
djce
Oh. Now I see :-)
2003-05-01 12118, 2003
djce
Well, that answers one question.
2003-05-01 12133, 2003
ruaok
It passes all the known info to the FileLookup and if the lookup is 90% or better it returns a match.
2003-05-01 12141, 2003
ruaok
If not the results are discarded.
2003-05-01 12115, 2003
ruaok
So, we should create a new function that combines both and returns meaningful results if there was no TRM match.
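The combined function proposed here could be sketched roughly as follows. This is an illustrative Python outline, not the Perl in QS.pm: `combined_lookup`, the result dictionary, and the stand-in `trm_index` and `file_lookup` arguments are all assumptions. Only the 90% threshold and the TRM-then-filelookup order come from the discussion.

```python
MATCH_THRESHOLD = 0.90  # "90% or better", as described in the discussion


def combined_lookup(trm_id, file_info, trm_index, file_lookup):
    """Try the TRM first; on a miss, fall back to a file lookup.

    trm_index:   dict mapping TRM id -> track (stand-in for the database)
    file_lookup: callable returning a list of (track, similarity) pairs
    """
    track = trm_index.get(trm_id)
    if track is not None:
        return {"status": "trm_match", "track": track, "candidates": []}

    # Best candidate first.
    candidates = sorted(file_lookup(file_info), key=lambda c: c[1], reverse=True)
    if candidates and candidates[0][1] >= MATCH_THRESHOLD:
        return {"status": "file_match", "track": candidates[0][0],
                "candidates": candidates}
    # Keep the weaker candidates instead of discarding them, so the
    # client gets meaningful results even without a TRM match.
    return {"status": "no_match", "track": None, "candidates": candidates}
```

Returning the candidates on a miss is what would let the client skip a second round trip for the file lookup.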
2003-05-01 12135, 2003
djce
Right.
2003-05-01 12151, 2003
djce
So close already. Should be a simple change (to the server, anyway)
2003-05-01 12102, 2003
ruaok
Thus we would actually be more efficient than we are now if we do pre-lookups on all the tracks.
2003-05-01 12131, 2003
ruaok
The RDF will be muddled since the outcome of the query could return vastly different info.
2003-05-01 12149, 2003
ruaok
I think I will create a completely new lookup function and slowly phase out the old one.
2003-05-01 12152, 2003
ruaok
Ick.
2003-05-01 12108, 2003
ruaok
But it's the best way to go, and it will be faster.
2003-05-01 12124, 2003
djce
indeed so.
2003-05-01 12134, 2003
djce
Make the new query handle multiple files too?
2003-05-01 12135, 2003
ruaok
Ok, I have a plan of attack then.
2003-05-01 12143, 2003
ruaok
hmmm.
2003-05-01 12114, 2003
djce
You don't have to use it, yet. Just allow for it.
2003-05-01 12127, 2003
ruaok
That is an idea....
2003-05-01 12147, 2003
ruaok
OK, I will consider it when I design the new RDF query/response.
2003-05-01 12153, 2003
djce
Specify as "... a list of track info (where currently the list must contain exactly one track)"
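One way to read that specification, sketched in Python with a plain dictionary standing in for the RDF query: the request is list-shaped from day one, and a limit enforces "exactly one track" for now. `build_lookup_request`, `MAX_TRACKS`, and the field names are hypothetical.

```python
MAX_TRACKS = 1  # raise this later to allow multi-file queries


def build_lookup_request(tracks):
    """Wrap per-file info in a list-shaped request (hypothetical format)."""
    if not 1 <= len(tracks) <= MAX_TRACKS:
        raise ValueError(f"expected 1..{MAX_TRACKS} tracks, got {len(tracks)}")
    return {"tracks": [dict(t) for t in tracks]}


request = build_lookup_request([{"trm": "some-trm-id", "filename": "a.mp3"}])
```

Because clients already send a list, lifting the limit later would not change the shape of the request.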