OpenHoldem scraper

From PokerAI

Scraper engine

This chapter is written in the context of OpenHoldem 1.1, which is not yet released. Specifically, generic OCR capability ("R" transform) is not available in the 1.0 code tree.

The scraper engine is called by the heartbeat thread once per cycle. It is a very linear function, and proceeds as follows:

  1. Take a snapshot of the poker client window. If it is identical to the previous snapshot, return immediately without scraping; otherwise continue:
  2. Scrape common cards
  3. For each seat:
    1. Scrape player cards
    2. Scrape seated status
    3. Scrape active status
    4. Scrape dealer button
    5. Scrape name
    6. Scrape balance
    7. Scrape current bet
  4. Scrape buttons
  5. Scrape pots
  6. Scrape limits

For each "scrape" type, the results are stored in the CScraper class (defined in scraper.h).
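
The ordering above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the OpenHoldem C++ code: the ScrapeLog class, its run_cycle method, and the num_seats parameter are all hypothetical names, and each "scrape" is reduced to recording a label.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Per-seat scrapes, in the order listed above.
SEAT_SCRAPES = ("player_cards", "seated", "active", "dealer",
                "name", "balance", "bet")


@dataclass
class ScrapeLog:
    """Hypothetical stand-in for CScraper that only records call order."""
    last_snapshot: Optional[bytes] = None
    calls: List[str] = field(default_factory=list)

    def run_cycle(self, snapshot: bytes, num_seats: int) -> bool:
        # Step 1: an unchanged window means there is nothing to scrape.
        if snapshot == self.last_snapshot:
            return False
        self.last_snapshot = snapshot

        self.calls = ["common_cards"]                # step 2
        for seat in range(num_seats):                # step 3
            for what in SEAT_SCRAPES:
                self.calls.append(f"{what}[{seat}]")
        self.calls += ["buttons", "pots", "limits"]  # steps 4 to 6
        return True
```

The return value tells the caller whether a scrape actually ran, so a heartbeat loop could skip downstream work when the window has not changed.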

Scrape subroutines

Here are details on the specific scrape subroutines. Refer to the latest code and in-code documentation to be 100% sure of what each method actually does.

CScraper::scrape_common_cards

 for i=0 to 4 
    call process_region, call do_transform 
    if do_transform returns a match, then set common card [i] to the return value, otherwise 
       set common card [i] to CARD_NOCARD 
 end for 
   

CScraper::scrape_player_cards

 for i=0 to 1 
   [try r$u region first, if it is defined] 
     call process_region, call do_transform 
     if do_transform returns a match, then set player card [i] to the return value 
   [no luck with r$u? Then try r$p region next, if it is defined] 
     call process_region, call do_transform 
     if do_transform returns a match, then set player card [i] to the return value 
   [no luck with r$p? Then try cardback region next, if it is defined] 
     call process_region, call do_transform 
     if do_transform returns a match, then set player card [i] to CARD_BACK 
 end for 
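
The "try one region, then fall back to the next" pattern used here (and in scrape_seated, scrape_active, scrape_name and scrape_balance) can be sketched as follows. This is a hedged Python illustration: the function name, the dictionary of regions, and the transform callback are assumptions, and CARD_BACK is a placeholder value rather than the real constant. The pseudocode above does not say what is stored when nothing matches, so the sketch returns None in that case.

```python
CARD_BACK = "cardback"  # placeholder, not the real OpenHoldem constant


def scrape_player_card(regions, transform):
    """Return the first successful result from the u, p, cardback chain.

    regions   -- dict mapping "u", "p", "cardback" to region data;
                 a missing key means the region is not defined.
    transform -- callable returning a card value, or None for no match.
    """
    for key in ("u", "p"):
        if key in regions:
            result = transform(regions[key])
            if result is not None:
                return result
    # The cardback region only tells us that a face-down card is present.
    if "cardback" in regions and transform(regions["cardback"]) is not None:
        return CARD_BACK
    return None  # nothing matched; the source does not specify this case
```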
   

CScraper::scrape_seated

 Start by assuming the seat is NOT seated 
 [try r$u region first, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, then set player seated to true 
 [no luck with r$u? Then try r$p region next, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, then set player seated to true 


CScraper::scrape_active

 Start by assuming the seat is NOT active 
 [try r$u region first, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, then set player active to true 
 [no luck with r$u? Then try r$p region next, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, then set player active to true 
   

CScraper::scrape_dealer

 call process_region, call do_transform 
 if do_transform returns a match, then set player dealer to true, otherwise set to false 
   

CScraper::scrape_name

 set got_new_scrape to false 
 [try r$uname region first, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, then set got_new_scrape to true 
 [no luck with r$uname? Then try r$uXname region next, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, then set got_new_scrape to true 
 [no luck with r$uXname? Then try r$pXname region next, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, then set got_new_scrape to true 

 if got_new_scrape is true, then 
   if we don't yet have a name for this seat, then set it to the scraped value 
   else keep count of the number of times we have scraped this name for this seat, if it 
      equals or exceeds the number set in preferences, then set the seat name to the scraped name 
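
The counting rule can be sketched as below. The pseudocode does not say exactly how the count is kept, so this Python sketch assumes it counts consecutive identical scrapes of a new candidate value; the StableValue class and its threshold argument (standing in for the number set in preferences) are hypothetical. scrape_balance describes the same rule with 0 as the "no value yet" marker.

```python
class StableValue:
    """Accept a changed scrape only after it repeats `threshold` times."""

    def __init__(self, threshold, empty=""):
        self.threshold = threshold
        self.empty = empty        # marker for "no value yet"
        self.value = empty
        self.candidate = None
        self.count = 0

    def update(self, scraped):
        # No value yet: accept the first scrape immediately.
        if self.value == self.empty:
            self.value = scraped
            return self.value
        # Otherwise count repeats of the same candidate (assumed to be
        # consecutive repeats; the source is not explicit about this).
        if scraped == self.candidate:
            self.count += 1
        else:
            self.candidate, self.count = scraped, 1
        if self.count >= self.threshold:
            self.value = scraped
        return self.value
```

The point of the counter is that a single misscrape (for example, a name partly covered by an animation) does not replace a value that was already read correctly.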
   

CScraper::scrape_balance

 set got_new_scrape to false 
 [try r$ubalance region first, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, then set got_new_scrape to true 
 [no luck with r$ubalance? Then try r$uXbalance region next, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, then set got_new_scrape to true 
 [no luck with r$uXbalance? Then try r$pXbalance region next, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, then set got_new_scrape to true 
 
 if got_new_scrape is true, then 
   if this seat's balance is 0, then set it to the scraped value 
   else keep count of the number of times we have scraped the balance for this seat, if it 
      equals or exceeds the number set in preferences, then set the seat balance to the scraped 
      balance 
   

CScraper::scrape_bet

 set bet for this chair to zero 
 [try r$pXbet region first, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, then set bet for this chair to the scraped amount 
 [no luck with r$pXbet? Then try r$pXchip region next, if it is defined] 
    call process_region, call do_chip_scrape 
   

CScraper::scrape_buttons

 set buttonlabel0-3 to fold, call, raise, allin, respectively 
 for i=0 to 9 
    [process r$iXstate region, if it is defined] 
       call process_region, call do_transform 
       if do_transform returns a match, set state of button [i] to the return value from do_transform 
    [process r$i86Xstate region, if it is defined] 
       call process_region, call do_transform 
       if do_transform returns a match, set state of i86 button [i] to the return value from do_transform 
    [process r$iXlabel region, if it is defined] 
       call process_region, call do_transform 
       if do_transform returns a match, set label of button [i] to the return value from do_transform 
 end for 
 [process r$i86state region, if it is defined] 
    call process_region, call do_transform 
    if do_transform returns a match, set i86buttonstate to the return value from do_transform 
   

CScraper::scrape_pots

 for i=0 to 4 
   set pot [i] to 0 
   [process r$c0potX region, if it is defined] 
      call process_region, call do_transform 
      if do_transform returns a match, then set pot [i] to the scraped amount 
   [no luck with r$c0potX? Then try r$c0potXchipYZ region next, if it is defined] 
      call process_region, call do_chip_scrape for pot [i] 
 end for 


CScraper::scrape_limits

 [process r$c0istournament region, if it is defined] 
   call process_region, call do_transform 
   if do_transform returns a match, then set istournament to true, otherwise set to false 
 [process r$c0handnumber region, if it is defined] 
   call process_region, call do_transform 
   if do_transform returns a match, then set handnumber to the scraped value, otherwise zero 
 for i=0 to 9 
   [process r$c0handnumberX region, if it is defined] 
      call process_region, call do_transform 
      if do_transform returns a match, then set handnumber to the scraped value, otherwise zero 
 end for 
   
 if blinds are not locked, then 
    [process s$ttlimits, if it is defined] 
       get title bar text, and call parse_string_bsl with the pattern defined in s$ttlimits 
    for i=0 to 9 
       [process s$ttlimitsX, if it is defined] 
         get title bar text, and call parse_string_bsl with the pattern defined in s$ttlimitsX 
    end for    
    [process r$c0limits, s$c0limits, if they are both defined] 
       call process_region, call do_transform  (for r$c0limits) 
       if do_transform returns a match, then get title bar text, and call parse_string_bsl with 
          the pattern defined in s$c0limits 
    for i=0 to 9 
       [process r$c0limitsX, s$c0limitsX, if they are both defined] 
          call process_region, call do_transform  (for r$c0limitsX) 
          if do_transform returns a match, then get title bar text, and call parse_string_bsl with 
             the pattern defined in s$c0limitsX 
    end for 
    [process r$c0sblind region, if it is defined] 
       call process_region, call do_transform 
       if do_transform returns a match, then set sblind to the scraped value 
    [process r$c0bblind region, if it is defined] 
       call process_region, call do_transform 
       if do_transform returns a match, then set bblind to the scraped value 
    [process r$c0bigbet region, if it is defined] 
       call process_region, call do_transform 
       if do_transform returns a match, then set bbet to the scraped value 
    [process r$c0ante region, if it is defined] 
       call process_region, call do_transform 
       if do_transform returns a match, then set ante to the scraped value 
 
 end if (blinds are not locked)    
   

Finally, scrape_limits sets sblind and bblind using fairly complex logic based on the game type and the results of all the scraping above (see the code for details).

Common subroutines

CScraper::process_region

Moves the portion of the poker window (as specified for the region in the profile) into a bitmap for use by the transform engine. It also returns a true/false if this portion of the window has changed, so we can decide if we want to transform it or not.
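
A minimal Python sketch of this behaviour follows. The RegionCache class, the list-of-rows window representation, and the inclusive left/top/right/bottom coordinates are assumptions made for illustration; the real function works on Windows bitmaps.

```python
class RegionCache:
    """Copies a rectangle out of the window and reports whether it changed."""

    def __init__(self):
        self._last = {}  # region name -> bitmap seen on the previous call

    def process_region(self, window, name, left, top, right, bottom):
        # Copy the rectangle (coordinates assumed inclusive) into its own
        # immutable bitmap so later window updates cannot alter it.
        bitmap = tuple(tuple(row[left:right + 1])
                       for row in window[top:bottom + 1])
        changed = self._last.get(name) != bitmap
        self._last[name] = bitmap
        return bitmap, changed
```

Returning the changed flag alongside the bitmap lets the caller skip the transform for regions that look exactly as they did last cycle.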


CScraper::parse_string_bsl

This function takes as parameters the text to be parsed and a parse format string. It populates the handnumber, sblind, bblind, bbet, sb_bb, bb_BB, and ante variables if those tokens are found as matches in the input text, based on the layout in the format string. It sets the handnumber and sblind variables immediately if they are found, but only sets bblind, bbet, sb_bb, bb_BB, and ante if both of the following are true:

  • The value is currently zero for this variable
  • The "found" flag for this variable is set to false

The found flag is set to false when a new hand is encountered, but as you can see, we do not routinely reset these variables. This information is too important to accidentally lose on a misscrape, so it is carried forward and only overwritten as required.
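
The update rule (not the string parsing itself) can be sketched as follows. The Limits class and its methods are hypothetical, and the sketch assumes that the found flag is set to true at the moment a guarded variable is written, which the text above implies but does not state.

```python
IMMEDIATE = ("handnumber", "sblind")
GUARDED = ("bblind", "bbet", "sb_bb", "bb_BB", "ante")


class Limits:
    def __init__(self):
        self.values = {name: 0 for name in IMMEDIATE + GUARDED}
        self.found = {name: False for name in GUARDED}

    def new_hand(self):
        # Only the flags are reset; the values are carried forward.
        for name in self.found:
            self.found[name] = False

    def apply(self, parsed):
        """parsed: dict of token name -> value extracted from the text."""
        for name, value in parsed.items():
            if name in IMMEDIATE:
                self.values[name] = value
            elif (name in GUARDED and self.values[name] == 0
                  and not self.found[name]):
                self.values[name] = value
                self.found[name] = True  # assumption: flag set on write
```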


CScraper::do_transform

This function is a simple switch/case that calls the actual appropriate transformation code based on the transformation code for the region in the profile.

  • case C, call c_transform
  • case I, call i_transform
  • case H, call h_transform
  • case T, call t_transform
  • case N, return (do nothing)
  • case R, call r_transform
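
The same dispatch, sketched in Python with a dictionary in place of the switch/case. The handlers argument and the error raised for an unknown code are assumptions for illustration; the source does not say how an unrecognized code is handled.

```python
def do_transform(code, region, handlers):
    """Dispatch on the region's transform code.

    handlers -- dict mapping "C", "I", "H", "T", "R" to callables
                that take the region and return the transform result.
    """
    if code == "N":
        return None  # "N" means do nothing
    handler = handlers.get(code)
    if handler is None:
        # Assumption: treat an unknown code as a profile error.
        raise ValueError(f"unknown transform code: {code!r}")
    return handler(region)
```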


CScraper::do_chip_scrape

This is a long function that steps through each chip stack, then through each chip in that stack, and attempts to find a hash match for the value of that chip. All those hash matches are summed up, and the total value of the stacks is returned. The code is a series of nested loops and should be fairly self-explanatory.
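
Reduced to its core, the nested loop looks roughly like this Python sketch. The chip_total name, the chip_values lookup table, and the hash_fn parameter are hypothetical; the sketch also assumes that an unrecognized chip is simply skipped, which the description above does not confirm.

```python
def chip_total(stacks, chip_values, hash_fn):
    """Sum the values of all recognized chips across all stacks.

    stacks      -- list of stacks, each a list of chip images
    chip_values -- dict mapping chip hash -> chip value
    hash_fn     -- function that hashes one chip image
    """
    total = 0
    for stack in stacks:
        for chip in stack:
            value = chip_values.get(hash_fn(chip))
            if value is not None:  # unrecognized chips add nothing
                total += value
    return total
```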


Transformation routines

c_transform

This function calculates the simple average of each red, green, and blue component in the region. The WH documentation calls this "a 3d center of mass calculation", but in truth it is a simple average.

That red, green, and blue point is then plotted on XYZ axes. A sphere is defined using the "color" parameter of the region (again, RGB plotted on XYZ axes) and the absolute value of the "radius" parameter of the region. If the radius is not negative, the region matches when the average color point falls within the sphere. If the radius is negative, the test is inverted: the average must fall outside the sphere for a match.
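
The test can be sketched in Python as below. The pixel representation (a list of (r, g, b) tuples) is an assumption, as is treating a point exactly on the sphere's surface as inside.

```python
def c_transform(pixels, color, radius):
    """Return True if the region's average color passes the sphere test.

    pixels -- non-empty list of (r, g, b) tuples
    color  -- (r, g, b) center of the sphere
    radius -- sphere radius; a negative value inverts the test
    """
    n = len(pixels)
    # Simple per-channel average of the region.
    avg = tuple(sum(p[i] for p in pixels) / n for i in range(3))
    # Compare squared distances to avoid a square root.
    dist_sq = sum((a - c) ** 2 for a, c in zip(avg, color))
    inside = dist_sq <= radius * radius
    return inside if radius >= 0 else not inside
```

For example, two pixels (10, 10, 10) and (20, 20, 20) average to (15, 15, 15), which lies 3 units from the center (15, 15, 18): a radius of 5 matches, a radius of 2 does not, and a radius of -2 matches because the point is outside the smaller sphere.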

i_transform

Not yet implemented. Once the OpenHoldem development team sees this transform used in the real world, we will implement this function.

h_transform

This function first calculates a hash of the pixels in the region. If this is a type "0" region, it uses all the pixels to create the hash. If it is a type "1", "2", or "3" region, it uses only the selected pixels identified by the "P" parameters in the profile that match the type.

The hash is calculated using the hashword function, and then a matching hash is searched for in the h$ lines of the profile, and the index to the correct h$ is returned if found.
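
A hedged Python sketch of the lookup follows. It substitutes zlib.crc32 for the hashword function, so the hash values will not match real profiles; the flat pixel list, the selected index list (standing in for the "P" parameters), and the hash_table dictionary (standing in for the h$ lines) are all illustrative assumptions.

```python
import zlib


def region_hash(pixels, selected=None):
    """Hash all pixels (type "0"), or only the selected indices.

    pixels   -- flat list of 32-bit pixel values
    selected -- list of indices into pixels, or None to use every pixel
    """
    chosen = pixels if selected is None else [pixels[i] for i in selected]
    data = b"".join(p.to_bytes(4, "little") for p in chosen)
    return zlib.crc32(data)  # stand-in for hashword


def h_transform(pixels, hash_table, selected=None):
    """Return the hash_table entry for this region, or None if no match."""
    return hash_table.get(region_hash(pixels, selected))
```

Hashing only selected pixels makes the match tolerant of changes elsewhere in the region, at the cost of having to choose those pixels when the profile is built.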

t_transform

This function first loads only the foreground pixels into an in-memory structure. Foreground pixels are identified with colors/radii/spheres as in the "C" transform.

The in-memory pixel map is then shifted left and down (virtually) so that there are no blank lines to the left or on the bottom.

In order to find a match, the algorithm first tries to scan left to right in the pixmap until it reaches a column of pixels that are all background. When that happens, we then create a hash of those pixels and search for a match.

If a match is not found, then the algorithm does a right-to-left scan, searching for the largest match of a hash value.

If a match is found in either case, then the pointer is moved over to the right an amount equal to the size of the found character+1, and then the process continues, searching for the next character.

Once the algorithm runs out of foreground pixels, it returns the recognized string. Note that if the algorithm encounters a set of pixels that does not match any character hash, the return value will be an empty string. This indicates that something is wrong with the definition of the font in the profile.
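
The character-matching loop can be approximated with the simplified Python sketch below. It is not the OpenHoldem algorithm: glyphs are looked up directly by their column pattern rather than by hash, the left/down shift is omitted, the right-to-left fallback is modeled by shrinking the candidate from the right until the widest known glyph is found, and blank columns are skipped instead of advancing by the character width plus one. The rows-of-strings input and the font dictionary are hypothetical.

```python
def t_transform(rows, font):
    """Recognize text in a small foreground/background pixel map.

    rows -- list of equal-length strings, "#" marks a foreground pixel
    font -- dict mapping a tuple of columns (each a tuple of bools,
            top to bottom) to the character it represents
    """
    width = len(rows[0]) if rows else 0
    cols = [tuple(r[x] == "#" for r in rows) for x in range(width)]
    text = ""
    x = 0
    while x < width:
        if not any(cols[x]):        # background column: skip it
            x += 1
            continue
        # Left-to-right scan up to the next all-background column.
        end = x
        while end < width and any(cols[end]):
            end += 1
        # Try the whole run first, then ever narrower prefixes.
        for w in range(end - x, 0, -1):
            glyph = tuple(cols[x:x + w])
            if glyph in font:
                text += font[glyph]
                x += w
                break
        else:
            return ""               # unknown pixels: font definition problem
    return text
```

With a two-pixel-high font in which a single solid column is "I" and a solid column followed by a bottom-only column is "L", the rows "#.#." and "#.##" are read as "IL".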

r_transform

This OCR transform uses the Tesseract OCR library to perform a traditional OCR pass on the region. It first resizes the region by an amount equal to the parameter specified in the profile, as Tesseract requires a vertical height of at least 20 pixels in order to have satisfactory OCR results. The ImageMagick library is called upon to do the resize.

The Tesseract OCR engine is then called, and the resultant string is returned.


Miscellaneous utilities

The bottom of scraper.cpp contains a number of utility functions to recognize strings (or signals, as the WH documentation refers to them). These are collected in one place for ease of maintenance and future extension.

