 Post subject: Optical Character Recognition (OCR)
PostPosted: Sun Feb 01, 2009 8:01 pm 
Regular member
Posts: 75
Favourite Bot: ...
EDIT /Indiana/: Check out this codingTheWheel article for a short introduction to the topic.

For screen scraping, I need OCR. I can capture the screen (with a screen blit in C++). Now I want to be able to recognise characters. I know the font they use, so that should help. I'd like to write code that still recognises the characters after I resize the poker window. For that reason, capturing every letter and then comparing pixel by pixel will not work.

Does anybody have any advice regarding OCR when the font is known? (I guess it should be easier!)


Post subject: Re: Optical Character Recognition
PostPosted: Sun Feb 01, 2009 9:55 pm
PokerAI fellow
Posts: 1306
Location: Finland
Favourite Bot: Self-made
robertledoux wrote:
For screen scraping, I need OCR. I can capture the screen (with a screen blit in C++). Now I want to be able to recognise characters. I know the font they use, so that should help. I'd like to write code that still recognises the characters after I resize the poker window. For that reason, capturing every letter and then comparing pixel by pixel will not work.

Does anybody have any advice regarding OCR when the font is known? (I guess it should be easier!)

I was looking at the same thing and found this: http://www.pixel-technology.com/freeware/tessnet2/

_________________
Opinions expressed are my own, your mileage may vary... ;)
Warning: If I spot an opportunity to give sarcastic replies, I will take it. Nothing personal. I don't even know you.


Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 12:18 pm
PokerAI fellow
Posts: 745
Favourite Bot: my bot
Tesseract was reported on this forum as being too slow. Many people just match up pixels without doing full-blown OCR.
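
A minimal sketch of that kind of pixel matching, assuming the text has already been captured and binarized into rows of '#' (text) and '.' (background). The `GLYPHS` table, the 3x5 digit shapes and the function names are made up for illustration; real glyphs would have to be captured from the client's own font at one fixed window size.

```python
# Hypothetical 3x5 glyph table: each glyph is a tuple of row strings,
# '#' = text pixel, '.' = background. Real shapes come from the client's font.
GLYPHS = {
    ("###", "#.#", "#.#", "#.#", "###"): "0",
    (".#.", "##.", ".#.", ".#.", "###"): "1",
    ("###", "..#", "###", "#..", "###"): "2",
}


def split_columns(rows):
    """Split a binarized text line into glyphs at fully blank columns."""
    width = len(rows[0])
    glyphs, start = [], None
    for x in range(width + 1):
        # Treat the position past the right edge as a blank column
        # so the last glyph is flushed as well.
        blank = x == width or all(row[x] == "." for row in rows)
        if not blank and start is None:
            start = x
        elif blank and start is not None:
            glyphs.append(tuple(row[start:x] for row in rows))
            start = None
    return glyphs


def read_line(rows):
    """Look up each glyph exactly; unknown shapes come back as '?'."""
    return "".join(GLYPHS.get(glyph, "?") for glyph in split_columns(rows))
```

Exact lookup like this is fast, but it only works at the size the glyphs were captured at, which is the limitation raised in the first post.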


Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 12:53 pm
PokerAI fellow
Posts: 1306
Location: Finland
Favourite Bot: Self-made
spears wrote:
Tesseract was reported on this forum as being too slow. Many people just match up pixels without doing full-blown OCR.

I think I found the same post that you are referring to, but it was about dealer chat OCR, right? I'm not doing that, I'm just trying to get player names and stack sizes at the beginning of the hand. And the player names are OCR'd only when I detect, with a much faster and simpler check, that a new player has taken a seat.

Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 2:26 pm
Regular member
Posts: 75
Favourite Bot: ...
TooMuchCoffee wrote:
Was looking at the same thing and found this: http://www.pixel-technology.com/freeware/tessnet2/


Thanks for your answers. I knew about Tesseract, but I'm also trying to OCR the chat, so I need something fast. (And I usually prefer to do things myself.)

Spears, I agree with you, I want to do pixel matching. But does anybody have an idea for pixel matching that is smart enough to still work after the window has been resized?


Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 2:31 pm
Regular member
Posts: 75
Favourite Bot: ...
BTW, I was planning on parsing the chat because I've read somewhere that it's quite difficult to parse the table directly (there will always be a little something that you might sometimes miss).

What do you guys think?


Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 3:35 pm
PokerAI fellow
Posts: 1415
Favourite Bot: none
Parsing the chat with only OCR will be very tough.

Just an example:
You are playing full ring and are on the button, all players in front of you tick auto-fold, and the instant the first player folds it's your turn to act. Now you have 6 new messages in the chat in a fraction of a second. If you are multi-tabling with, let's say, 4 or more tables, your chat box will almost certainly not be big enough to show all 6 lines, so you will miss some of them.

If you can ensure that you will always see all new lines when it's your turn, then go ahead with that approach. If not: try something else.

Cheers.

_________________
Cheers.


Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 3:38 pm
Junior member
Posts: 15
Favourite Bot: Bender from futurama
I think I could do this without the chatbox altogether. I currently have a script that scrapes EVERYTHING except each player's chip stack at a certain poker site, without using anything in the chatbox, and it does this in real time without lag.

PM me for more info.

_________________
David.

Because someone here HAS to know less than everyone else!


Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 3:44 pm
PokerAI fellow
Posts: 1415
Favourite Bot: none
autoitmad wrote:
I think I could do this without the chatbox altogether. I currently have a script that scrapes EVERYTHING except each player's chip stack at a certain poker site, without using anything in the chatbox, and it does this in real time without lag.

PM me for more info.


Don't get too excited, you are not the first one to have achieved this...
If you are so eager to show everyone how cool your stuff is, why don't you publish it in these forums?

Cheers.

Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 4:03 pm
Junior member
Posts: 15
Favourite Bot: Bender from futurama
I'm not excited, I just think it can be done in an easier way than the one you describe.

I am very keen to share my work, but until I get to see the content of the restricted forums I will keep my cards close to my chest. I realise that my skills are small compared to some of you guys, but if I play my hand now I'm just asking to be taken for a ride.

Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 4:43 pm
PokerAI fellow
Posts: 1306
Location: Finland
Favourite Bot: Self-made
Report: Tried tessnet2.dll along with RGB->TIFF conversion and the results were very poor. Nothing got recognized. Well, sure, each seat had '~' sitting on it... ;) I suspect that the simple 1-pixel-wide bitmap characters were just too, er..., simple for Tesseract.

So I reverted to my old NN-recognition and it does work 100%. The reason why I wanted to replace my NN-recognition with Tesseract (or some other more generic OCR) is that the NNs are a bit tedious to teach, and the fonts differ enough that I have to create new NNs for each font/size, and that is just plain boring. But I guess a man's got to do etc...

Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 5:15 pm
Senior member
Posts: 136
Location: Paris - France
Favourite Bot: ZBot
I tried to use OCR too when I began coding my bot, and I was never able to get 100% accuracy. There were always unrecognized characters (OCR engines often have trouble telling 0 from O or 1 from l), and player names can't be checked against a dictionary.

Moreover, you can miss some information using OCR (for example, on the poker client I'm botting atm, the opponents' names are replaced by their action when they act), meaning that you'll sometimes have to capture the window several times to get everything you need. The more table windows you have on your screen, the smaller the characters will be, leading to less accurate OCR.

I don't really like OCR as the primary method to extract data from the tables (just imagine how big the issue is if your OCR is sometimes wrong on the player stacks or bets). But I needed it for little things where the set of possible texts was limited (like 3 or 4 different words), meaning that medium accuracy was enough to recognize the exact text.
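
When only a handful of strings can appear, a noisy OCR result can be snapped to the nearest known one. The sketch below uses Python's standard `difflib.get_close_matches`; the label list and the `snap_to_label` name are assumptions, since the post does not say which words were involved.

```python
import difflib

# Hypothetical label set; the post only mentions "3 or 4 different words".
KNOWN_LABELS = ["Fold", "Check", "Call", "Raise"]


def snap_to_label(ocr_text, labels=KNOWN_LABELS, cutoff=0.5):
    """Return the known label closest to a noisy OCR string, or None."""
    lowered = {label.lower(): label for label in labels}
    hits = difflib.get_close_matches(
        ocr_text.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[hits[0]] if hits else None
```

With a small closed vocabulary, a 0/O or 1/l mix-up still lands on the right word, which is why medium accuracy can be enough there but not for free-form player names.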


Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 6:34 pm
PokerAI fellow
Posts: 1306
Location: Finland
Favourite Bot: Self-made
Zoobie wrote:
I tried to use OCR too when I began coding my bot, and I was never able to get 100% accuracy. There were always unrecognized characters (OCR engines often have trouble telling 0 from O or 1 from l), and player names can't be checked against a dictionary.

Moreover, you can miss some information using OCR (for example, on the poker client I'm botting atm, the opponents' names are replaced by their action when they act), meaning that you'll sometimes have to capture the window several times to get everything you need. The more table windows you have on your screen, the smaller the characters will be, leading to less accurate OCR.

I don't really like OCR as the primary method to extract data from the tables (just imagine how big the issue is if your OCR is sometimes wrong on the player stacks or bets). But I needed it for little things where the set of possible texts was limited (like 3 or 4 different words), meaning that medium accuracy was enough to recognize the exact text.

Well, let me say how I'm doing it: I have a background timer that TRIES to OCR the name from each such seat where a) someone is sitting and b) I don't have a valid name yet. And when there is a name associated to a particular seat, I'm not doing the OCR again until I notice that there is a new name on the seat. My background thread runs every 100ms, iterating through the seats so that the whole table gets covered in 1s.
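
The polling scheme described above could be sketched roughly like this. It is not the poster's code: `is_occupied` and `ocr_name` are hypothetical callbacks standing in for the cheap seat check and the OCR routine, and ten seats are assumed so that one seat per 100 ms tick covers the table in about a second.

```python
class SeatNameScanner:
    """Visit one seat per tick; OCR only occupied seats that still lack a name."""

    def __init__(self, seat_count, is_occupied, ocr_name):
        self.names = [None] * seat_count  # None = no valid name yet
        self.is_occupied = is_occupied    # hypothetical cheap check per seat
        self.ocr_name = ocr_name          # hypothetical OCR call: str or None
        self.cursor = 0

    def tick(self):
        """Call roughly every 100 ms, e.g. from a timer or worker thread."""
        seat = self.cursor
        self.cursor = (self.cursor + 1) % len(self.names)
        if not self.is_occupied(seat):
            self.names[seat] = None       # seat is empty: forget the old name
        elif self.names[seat] is None:
            # A failed OCR leaves None, so the seat is retried on the next pass.
            self.names[seat] = self.ocr_name(seat)

    def invalidate(self, seat):
        """Call when the cheap check notices a new player on the seat."""
        self.names[seat] = None
```

Because seats with a valid name are skipped, the expensive OCR call runs only when something has changed.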

But that's just in the beginning (very first time the table is opened): Then I hook the chat box also and "double-check" the names based on the dealer position, dealt cards (which seats got cards) and the sequence of players acting.

So: When I see a table for 1 second so that the names are in plain view, I'll have "unconfirmed" names associated with the seats. When I see the player's name in the chat box for the first time (and after that see the dealer's name, as this particular site announces the dealer in the chat box at the beginning of the hand), I mark that name as confirmed, and IF the observed chat name differs from the OCR'd name, I replace the OCR'd name with the one observed from the chat box, as it definitely is more reliable.
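
A rough sketch of that confirm-and-replace rule, assuming the chat name has already been mapped to a seat (the mapping via dealer position, dealt cards and acting order is not shown). `SeatName` and `reconcile` are illustrative names, not taken from any real bot.

```python
from dataclasses import dataclass


@dataclass
class SeatName:
    name: str
    confirmed: bool = False  # True once the chat box has shown this name


def reconcile(seat_name, chat_name):
    """Chat text wins over OCR: confirm the seat and overwrite on mismatch."""
    if seat_name is None or seat_name.name != chat_name:
        return SeatName(chat_name, confirmed=True)
    seat_name.confirmed = True
    return seat_name
```

For example, `reconcile(SeatName("J0hn"), "John")` returns a confirmed `SeatName("John")`, so an OCR slip on 0/O is corrected the first time the player shows up in the chat.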

So, why not just do everything from the chat box? Well, I did that earlier, but noticed that there is a problem a) with the very first hand that my bot is sitting at the table, and b) when someone behind my bot leaves and a new player sits in, I don't get to see any actions from him before my bot acts and don't know who he is, etc. These all cause problems for my bot, as there are often uncertain situations at the table when I couldn't say who is there.

So, a combination of background OCR + detecting when a name changes on a seat + confirming the name from the chat pretty much does it. And note that my OCR'ing isn't a continuous strain on the CPU, it's running only when required.

Oh, about the stacks or bets: I don't scrape those, I'm not really interested in those as I'm only doing FL.

But the problem with actions & names -> Why on earth would you OCR the names at such a moment when there can be an action ruining the NAME... ;) Why not be a little bit smarter about WHEN to perform the OCR'ing.

Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 7:16 pm
Senior member
Posts: 136
Location: Paris - France
Favourite Bot: ZBot
Quote:
When I see a table for 1 second so that the names are in plain view, I'll have "unconfirmed" names associated with the seats. When I see the player's name in the chat box for the first time (and after that see the dealer's name, as this particular site announces the dealer in the chat box at the beginning of the hand), I mark that name as confirmed, and IF the observed chat name differs from the OCR'd name, I replace the OCR'd name with the one observed from the chat box, as it definitely is more reliable.

Well, I didn't want that because of the opponent modelling. I wanted to be sure I could find the player in my database as soon as the game starts.

Quote:
So, why not just do everything from the chat box? Well, I did that earlier, but noticed that there is a problem a) with the very first hand that my bot is sitting at the table

Right, I still have that issue when my bot is not in the hand ;) I guess OCR is the only solution in that case, but I preferred to spend some time on other parts of my bot (it's still very young). Atm I only check the RGB values of one pixel to see if some cards are dealt to me, to start the threads on the table. I also check one pixel on each seat to see if a player is away when my first game starts.
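
A single-pixel check of this kind might look like the following. The image is assumed to be a list of rows of (R, G, B) tuples; the coordinates, the colour and the tolerance are placeholders that depend entirely on the poker client.

```python
# Hypothetical location (x, y) and colour of a pixel that only looks like
# this when hole cards have been dealt to the bot's own seat.
HERO_CARD_PIXEL = (2, 1)
HERO_CARD_RGB = (200, 30, 30)


def pixel_matches(image, x, y, expected_rgb, tolerance=8):
    """True if the pixel at (x, y) is within `tolerance` per channel."""
    actual = image[y][x]  # rows first, then columns
    return all(abs(a - e) <= tolerance for a, e in zip(actual, expected_rgb))


def hero_has_cards(image):
    """Cheap trigger: inspect one pixel instead of running OCR."""
    x, y = HERO_CARD_PIXEL
    return pixel_matches(image, x, y, HERO_CARD_RGB)
```

The small tolerance is there in case the captured colour is not bit-exact; whether it is needed depends on how the screen is grabbed.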

Quote:
b) when someone behind my bot leaves and a new player sits in, I don't get to see any actions from him before my bot acts and don't know who he is, etc. These all cause problems for my bot, as there are often uncertain situations at the table when I couldn't say who is there.

I just got this information from memory; it's probably specific to my poker client. I suppose Pokerstars requires real disassembling. In fact, I pretty much get all the data from memory.

Quote:
But the problem with actions & names -> Why on earth would you OCR the names at such a moment when there can be an action ruining the NAME... ;) Why not be a little bit smarter about WHEN to perform the OCR'ing.

The only moment I had to do the screenshot was when dealing cards, but the animations sometimes covered the name of some players. I insist that the bot was quite young and that two screenshots with a 300ms delay would have probably corrected the problem, but I wanted to be sure to get some perfect data, and OCR is not the best for that imo. It probably is good with some tweaking, but I was quite lazy ;)


Post subject: Re: Optical Character Recognition
PostPosted: Mon Feb 02, 2009 7:32 pm
PokerAI fellow
Posts: 1306
Location: Finland
Favourite Bot: Self-made
Zoobie wrote:
Well, I didn't want that because of the opponent modelling. I wanted to be sure I could find the player in my database as soon as the game starts.

So... you never play with anyone that is not in your database? What happens if such a player sits at your table? Your bot sits out for, say, 50 hands and then you have enough data on him? :drink ;)

Quote:
I just got this information from memory; it's probably specific to my poker client. I suppose Pokerstars requires real disassembling. In fact, I pretty much get all the data from memory.

Right, that sounds good and I'll do that also on some site.

Quote:
The only moment I had to do the screenshot was when dealing cards, but the animations sometimes covered the name of some players. I insist that the bot was quite young and that two screenshots with a 300ms delay would have probably corrected the problem, but I wanted to be sure to get some perfect data, and OCR is not the best for that imo. It probably is good with some tweaking, but I was quite lazy ;)

Yep, as I said: It works easily when you do it in the background (on demand) and don't rely on one specific moment. I also had a hard time finding a good single moment tied to some chat action where the names were in plain sight; there was also some "Big Blind" or some such nonsense in the player info box. But having discovered the "always try in the background when required" idea, the thing started working like a charm.

In fact, I'm just testing my code (while writing this) and when a new table is opened, I have a 100% success rate on each player's name (using my self-made NN-OCR) within 1-2 seconds of the table opening. Can't ask much more than that... ;)

Post subject: Re: Optical Character Recognition
PostPosted: Tue Feb 03, 2009 7:04 am
PokerAI fellow
Posts: 664
Favourite Bot: Johnny #5
TooMuchCoffee wrote:
Was looking at the same thing and found this: http://www.pixel-technology.com/freeware/tessnet2/


Thanks for posting that TMC. With the proper setup, it works like a charm!


Post subject: Re: Optical Character Recognition
PostPosted: Tue Feb 03, 2009 7:28 am
PokerAI fellow
Posts: 2158
Favourite Bot: My next one
c2008 wrote:
Thanks for posting that TMC. With the proper setup, it works like a charm!

Care to elaborate? TMC seemed to have some issues with it.


Post subject: Re: Optical Character Recognition
PostPosted: Tue Feb 03, 2009 12:21 pm
PokerAI fellow
Posts: 7514
Favourite Bot: V12
I tried this one (http://www.pixel-technology.com/freeware/tessnet2/) and it seems to work poorly.
E.g. if I write random text in pbrush and try to OCR it, it makes mistakes on text which otherwise seems simple.

Also it fails to OCR anything meaningful on the attached example. Can anyone get better results for it?

Attachment:
test3.bmp [ 30.68 KB | Viewed 3560 times ]

_________________
indiana


Post subject: Re: Optical Character Recognition
PostPosted: Tue Feb 03, 2009 8:09 pm
Regular member
Posts: 75
Favourite Bot: ...
Thanks for all your answers.
I'm gonna be more specific though.

Like I said, I'm interested in pixel matching, but I want it to keep working even after resizing.

Here is my first thought:
1. Delimit each letter (should be easy: the chat is black and white).
2. Divide each letter into 4 parts (NW, NE, SW, SE). Count the number of black pixels in each part (pixels of the letter rather than of the background).
3. Compute the ratio of the number of pixels in each part to the number of pixels in the whole letter.

If these ratios remain the same even after resizing, then it's okay: I have a pixel matching algorithm that will keep working even after the window is resized.
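
The three steps above can be sketched as follows, assuming each letter has already been cut out as rows of '#' (letter) and '.' (background). One adjustment that is not in the steps above: a pixel that straddles the middle row or column is shared between the two halves, otherwise glyphs with an odd width or height would not give the same ratios after scaling. `scale_up` only imitates a clean integer resize; a real client may smooth or anti-alias the font, in which case the ratios would match only approximately.

```python
def quadrant_ratios(glyph):
    """Return the [NW, NE, SW, SE] share of letter pixels for one glyph."""
    height, width = len(glyph), len(glyph[0])
    counts = [0.0, 0.0, 0.0, 0.0]  # NW, NE, SW, SE
    for y, row in enumerate(glyph):
        # Fraction of this pixel row that lies in the top half (0, 0.5 or 1).
        top = min(1.0, max(0.0, height / 2 - y))
        for x, pixel in enumerate(row):
            if pixel != "#":
                continue
            # Fraction of this pixel column that lies in the left half.
            left = min(1.0, max(0.0, width / 2 - x))
            counts[0] += top * left
            counts[1] += top * (1 - left)
            counts[2] += (1 - top) * left
            counts[3] += (1 - top) * (1 - left)
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def scale_up(glyph, factor):
    """Nearest-neighbour enlargement by an integer factor (idealised resize)."""
    return ["".join(ch * factor for ch in row) for row in glyph for _ in range(factor)]
```

Four numbers per letter are unlikely to separate every pair of characters, so in practice this would probably be one feature among several (finer grids, aspect ratio) rather than a complete classifier.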

I'm not saying this works. This is the kind of idea I'm looking for.
Has anybody tried this sort of thing?
What do you guys think?


Post subject: Re: Optical Character Recognition
PostPosted: Tue Feb 03, 2009 8:15 pm
Regular member
Posts: 75
Favourite Bot: ...
Coffee4tw wrote:
Parsing the chat with only OCR will be very tough.

Just an example:
You are playing full ring and are on the button, all players in front of you tick auto-fold, and the instant the first player folds it's your turn to act. Now you have 6 new messages in the chat in a fraction of a second. If you are multi-tabling with, let's say, 4 or more tables, your chat box will almost certainly not be big enough to show all 6 lines, so you will miss some of them.

If you can ensure that you will always see all new lines when it's your turn, then go ahead with that approach. If not: try something else.

Cheers.


True, but it's not complicated to handle.

You can easily see when this situation happens, and in that case you just have to make your bot scroll up in the chat... You'll lose a little time, but since it is rare, it's OK.

