ysearch code words are nuts


  • cacio
    replied
    I'm sure somebody's already playing around. However, remember that this system is used extensively by millions of people daily to send emails, log on, etc., and the typing cannot be automated, so the vast majority (>99%) will still be appropriate readings (actually, I would think that the main source of mistakes is simply that not even people can read the degraded text). And these are not fundamental texts that need to be scanned precisely. The project is to digitize _any_ existing book to preserve it, so a few mistakes (often easily understandable when reading whole sentences) here and there won't be a problem. By contrast, OCR systems apparently have rates of nonreading or mistakes of up to 20% with these old books.

    cacio



  • Nagelfar
    replied
    Originally posted by cacio
    I just read an article in Science that puts a positive spin on the new ysearch security system (which is present not just in ysearch but also in hundreds of other websites).

    Apparently, somebody had the clever idea to use all our efforts to help digitize old books! It works this way. One of the two words is the word actually used for security purposes. The second word is a word that an OCR system did not understand, and we are interpreting it for the system.

    There are now initiatives to digitize old books, which may otherwise get lost. The problem, though, is that old books are difficult for OCR systems to read, since they are yellowed and often damaged. Humans can interpret things very well, but of course it would be impossibly expensive to have humans do the reading.

    By linking the security system to this initiative, the creators of the system have managed to use our (millions of) collective little efforts to help digitize books. In the article, the authors say that the accuracy is > 99%.

    In fact, I had not noticed, but the system is called reCAPTCHA. CAPTCHA is the name of the original security system, re stands for reading. Also, the icon says: Stop spam, read books. Now the "read books" part makes sense.

    So I guess every time we spend some time inputting those words, we can think that we are actually helping save old books!

    cacio
    If this is true, and the word gets out, a bunch of clowns are only going to type in one of the words correctly and something else ironic for the other one and see if it is accepted, screwing with hundreds of years of preserved knowledge and vandalizing these works in under-the-radar ways that may never be noticed.



  • FredSpringer
    replied
    Originally posted by cacio
    I just read an article in Science that puts a positive spin on the new ysearch security system (which is present not just in ysearch but also in hundreds of other websites).

    Apparently, somebody had the clever idea to use all our efforts to help digitize old books! It works this way. One of the two words is the word actually used for security purposes. The second word is a word that an OCR system did not understand, and we are interpreting it for the system.

    There are now initiatives to digitize old books, which may otherwise get lost. The problem, though, is that old books are difficult for OCR systems to read, since they are yellowed and often damaged. Humans can interpret things very well, but of course it would be impossibly expensive to have humans do the reading.

    By linking the security system to this initiative, the creators of the system have managed to use our (millions of) collective little efforts to help digitize books. In the article, the authors say that the accuracy is > 99%.

    In fact, I had not noticed, but the system is called reCAPTCHA. CAPTCHA is the name of the original security system, re stands for reading. Also, the icon says: Stop spam, read books. Now the "read books" part makes sense.

    So I guess every time we spend some time inputting those words, we can think that we are actually helping save old books!

    cacio
    How much are we getting paid?



  • cacio
    replied
    I just read an article in Science that puts a positive spin on the new ysearch security system (which is present not just in ysearch but also in hundreds of other websites).

    Apparently, somebody had the clever idea to use all our efforts to help digitize old books! It works this way. One of the two words is the word actually used for security purposes. The second word is a word that an OCR system did not understand, and we are interpreting it for the system.

    There are now initiatives to digitize old books, which may otherwise get lost. The problem, though, is that old books are difficult for OCR systems to read, since they are yellowed and often damaged. Humans can interpret things very well, but of course it would be impossibly expensive to have humans do the reading.

    By linking the security system to this initiative, the creators of the system have managed to use our (millions of) collective little efforts to help digitize books. In the article, the authors say that the accuracy is > 99%.

    In fact, I had not noticed, but the system is called reCAPTCHA. CAPTCHA is the name of the original security system, re stands for reading. Also, the icon says: Stop spam, read books. Now the "read books" part makes sense.

    So I guess every time we spend some time inputting those words, we can think that we are actually helping save old books!

    cacio
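
The two-word scheme described in the post above can be sketched roughly as follows. This is only an illustration of the idea, not how reCAPTCHA is actually implemented: the function names, the vote store, and the agreement threshold of 3 are all assumptions made for the example. The one thing taken from the post is the split between a control word (used for security) and an unknown word (one the OCR system could not read).

```python
from collections import Counter, defaultdict

# Hypothetical threshold: how many matching human readings we want before
# trusting a transcription. The thread does not say what the real system uses.
AGREEMENT_THRESHOLD = 3

# votes[word_id] counts each reading that humans have typed for an unknown word.
votes = defaultdict(Counter)


def normalize(text):
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return text.strip().lower()


def check_challenge(control_answer, typed_control, unknown_id, typed_unknown):
    """Return True if the user passes the security check.

    Only the control word decides pass/fail. The reading of the unknown word
    is recorded as a vote, and only when the control word was typed correctly.
    """
    if normalize(typed_control) != normalize(control_answer):
        return False
    votes[unknown_id][normalize(typed_unknown)] += 1
    return True


def accepted_reading(unknown_id):
    """Return the agreed reading of an unknown word, or None if undecided."""
    if not votes[unknown_id]:
        return None
    reading, count = votes[unknown_id].most_common(1)[0]
    return reading if count >= AGREEMENT_THRESHOLD else None
```

Under these assumptions, a single joke answer for the unknown word is simply outvoted: if three people type "quill" and one types "banana", `accepted_reading` returns "quill", and until enough readings agree it returns `None`.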



  • DKF
    replied
    Originally posted by vinnie
    As I've stated in the E-M35 forum, these code words are nothing but frustrating to bona fide users and will do nothing that I'm aware of to stop someone from "harvesting" e-mail addresses to use for unintended purposes. I can see having us type the passwords once when we first start a search, but to have to do it each time we want to compare the gd or haplotypes of other users from our initial search is ridiculous.
    I am pretty sure that FTDNA would never do anything to frustrate genetic genealogy hobbyists unless they had experienced some sort of horrendous problem. Hence, this "inconvenience" (and yes, I curse under my breath every time I have to do it) must be a necessity to avoid the misuse of the database and allow us to have the access we want / need. Harvesters come in many guises, and I suspect that unethical competitors, since this is a "public" database, could grab all of the data, make it their own, and not even have the decency to cite where they obtained the data. If the words seem a bit bizarre to English speakers, what must they seem like to those in, say, Europe? The specific configuration of letters is irrelevant, but the consonants and vowels are in an orderly sequence, so the problem is attenuated somewhat. I suspect that if only "regular" words were used, a program could be devised to bypass the present system just by using pattern recognition software along with a dictionary database. A small price to pay, folks.



  • vinnie
    replied
    As I've stated in the E-M35 forum, these code words are nothing but frustrating to bona fide users and will do nothing that I'm aware of to stop someone from "harvesting" e-mail addresses to use for unintended purposes. I can see having us type the passwords once when we first start a search, but to have to do it each time we want to compare the gd or haplotypes of other users from our initial search is ridiculous.



  • ylgitn
    replied
    Agree... even aside from the unsettling nuttiness of the words themselves, I recently tried a search and my code words were something blurred and illegible...I had to click refresh to get words that I could even read.



  • darroll
    replied
    Originally posted by Jim Denning
    ysearch codewords are nuts
    Jim,
    Do you mean the graphics text?
    This keeps people (computers) from downloading everyone's information.
    I'm sure this goes on all the time, the hacking of other people's data.

    Darroll



  • Clochaire
    replied
    Originally posted by Jim Denning
    ysearch codewords are nuts
    In the sense of being nonsensical? Or merely an annoying hindrance?

    I guess I could go either way on that. But, still, some of them are pretty amusing.

    Jack



  • Jim Denning
    started a topic ysearch code words are nuts


    ysearch codewords are nuts