Getting faster?
Posted on: Tue, 10/21/2008 - 19:41
Is there any way to get the answer faster?
Suppose I do not actually need the whole answer from Calais, just part of it. Can I obtain the output faster?
If it is not possible, just say "no".


Comments
Hello,
I finally posted the last bug I was mentioning:
http://ks36587.kimsufi.com/calaisbug/bug4.html
First the input,
then the simple output.
Well, the output is not so big, so I don't believe that compression will change anything.
My goal is more or less to have reasonable speed on queries.
At the moment I send OpenCalais the abstracts from Yahoo Boss for 50 results,
and it takes about 10 seconds on average to analyse them.
So, no, I can't really cut the input.
But what I can try is sending 50 queries to Calais
instead of one big query.
But I fear it will take longer and will inevitably be less accurate,
since a lot of information may be extracted from the abstracts taken together.
If you want to see an example of the input:
http://ks36587.kimsufi.com/calaisbug/
Besides, I managed to produce another strange bug,
where almost the whole document was considered an industry name
(a very odd bug indeed).
I'll try to post it.
The headline page that you submit to Open Calais is already, by its nature, a built-in summary. It contains many partial sentences, unrelated to each other. Calais won't add much value to that. If you parsed each sentence (line) and sent it separately, it would be faster.
As for the bug in which Calais treated the whole document as an industry name, could you send us the pertinent files or text?
Regards,
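
For anyone who wants to try the line-by-line approach, here is a minimal Python sketch. It only shows the splitting and the concurrent dispatch. The `analyze` callable is a hypothetical placeholder for whatever function sends one piece of text to Calais and returns its answer, and `fake_analyze` is a stand-in so the sketch runs without any network access. Neither is part of the Calais API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_into_lines(document: str) -> List[str]:
    """Return the non-empty lines of the document, stripped of whitespace."""
    return [line.strip() for line in document.splitlines() if line.strip()]


def analyze_lines(
    document: str,
    analyze: Callable[[str], str],
    max_workers: int = 8,
) -> List[str]:
    """Send each line to `analyze` concurrently.

    `analyze` is an assumption: any function that takes one piece of text
    and returns the service's answer. Results come back in input order.
    """
    lines = split_into_lines(document)
    if not lines:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(analyze, lines))


def fake_analyze(text: str) -> str:
    # Hypothetical stand-in for a real request; it just upper-cases the text.
    return text.upper()


if __name__ == "__main__":
    sample = "First headline\n\nSecond headline\nThird headline"
    print(analyze_lines(sample, fake_analyze))
```

Whether this ends up faster than one big request depends on network latency and on how many requests you are allowed to make in parallel, so it is worth timing both. Very short fragments may also be analysed less accurately than the full text.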
There are two main things that will impact the speed of a response from Calais: 1) the size and complexity of the text you send us to process, and 2) network transmission and latency effects. For smaller text sizes, the Calais processing itself is extraordinarily fast.
So:
1) As above - send us only the text you need analyzed - but within reason. If you just send us a sentence we're not going to do a very good job.
2) Consider using the Simple output format - this removes a lot of the overhead associated with RDF (but loses a lot of the functionality).
3) The current Technology Preview version supports HTTP compression. That will cut your "over the wire" transaction size down a lot.
4) The Technology Preview has a new command to tell Calais not to echo the original text back to you - again, less content over the wire.
Let us know how it works.
Regards,
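
To get a rough idea of how much compression could save before changing anything, you can gzip a saved response locally and compare sizes. This is a minimal Python sketch using only the standard library. It does not call Calais, and the sample payload is made-up repetitive markup rather than real Calais output.

```python
import gzip
from typing import Tuple


def compressed_sizes(payload: str) -> Tuple[int, int]:
    """Return (raw_bytes, gzipped_bytes) for a text payload."""
    raw = payload.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


if __name__ == "__main__":
    # Made-up, repetitive markup used only for illustration.
    sample = "<rdf:Description>example</rdf:Description>\n" * 200
    raw_size, gz_size = compressed_sizes(sample)
    print(f"raw: {raw_size} bytes, gzipped: {gz_size} bytes")
```

Repetitive markup such as RDF tends to compress well, while a response that is already small gains little, so measuring on your own output shows whether compression is worth enabling.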