I'm sure somebody's already playing around with it. However, remember that this system is used by millions of people every day to send emails, log on, etc., and the typing cannot be automated, so the vast majority (>99%) will still be correct readings (actually, I would think that the main source of mistakes is simply that not even people can read the degraded text). And these are not fundamental texts that need to be transcribed precisely. The project is to digitize _any_ existing book to preserve it, so a few mistakes here and there (often easy to figure out when reading whole sentences) won't be a problem. By comparison, OCR systems apparently fail to read, or misread, up to 20% of the words in these old books.
cacio
ysearch code words are nuts
-
I just read an article in Science that puts a positive spin on the new ysearch security system (which is used not just by ysearch but also by hundreds of other websites).
Apparently, somebody had the clever idea to use all our efforts to help digitize old books! It works this way. One of the two words is the word actually used for security purposes. The second word is one that an OCR system could not read, and we are interpreting it for the system.
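For anyone curious, here is a rough Python sketch of the two-word idea described above. It is only an illustration under my own assumptions, not the actual reCAPTCHA code: the function names, the case-insensitive comparison, and the `min_votes` threshold of 3 are all made up for the example.

```python
from collections import Counter


def grade_challenge(control_answer, typed_control, typed_unknown, votes):
    """Check the known (control) word; if it matches, count the user's
    reading of the unknown word as one vote. Returns True if the user passes."""
    passed = typed_control.strip().lower() == control_answer.strip().lower()
    if passed:
        # Only users who got the control word right are trusted
        # to contribute a reading of the word OCR could not handle.
        votes[typed_unknown.strip().lower()] += 1
    return passed


def accepted_reading(votes, min_votes=3):
    """Return the most common reading once it has at least min_votes
    votes (hypothetical threshold), otherwise None."""
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None
```

The point of the sketch is that the security check depends only on the control word, while the unknown word is settled by agreement among several people who passed the check.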
There are now initiatives to digitize old books, which may otherwise get lost. The problem, though, is that old books are difficult for OCR systems to read, since they are yellowed and often damaged. Humans can interpret such text very well, but of course it would be impossibly expensive to pay humans to do the reading.
By linking the security system to this initiative, the creators of the system have managed to use millions of our small collective efforts to help digitize books. In the article, the authors say that the accuracy is > 99%.
In fact, I had not noticed, but the system is called reCAPTCHA. CAPTCHA is the name of the original security system, and "re" stands for reading. Also, the icon says: Stop spam, read books. Now the "read books" part makes sense.
So I guess every time we spend some time typing in those words, we can think that we are actually helping save old books!
cacio
-
Originally posted by vinnie
As I've stated in the E-M35 forum, these code words are nothing but frustrating to bona fide users and will do nothing that I'm aware of to stop someone from "harvesting" e-mail addresses for unintended purposes. I can see having us type the passwords once when we first start a search, but having to do it each time we want to compare the GD or haplotypes of other users from our initial search is ridiculous.
-
Agreed. Even aside from the unsettling nuttiness of the words themselves, I recently tried a search and my code words were blurred and illegible. I had to click refresh to get words that I could even read.
-
Originally posted by Jim Denning
ysearch codewords are nuts
Do you mean the graphics text?
This keeps people (computers) from downloading everyone's information.
I'm sure this goes on all the time, the hacking of other people's data.
Darroll
-
Originally posted by Jim Denning
ysearch codewords are nuts
I guess I could go either way on that. But, still, some of them are pretty amusing.
Jack
-
ysearch code words are nuts
ysearch codewords are nuts