Screen Scraper Tricks (Extracting Data From Difficult Websites) Defcon 17
Description:
This is the video of the presentation titled "Screen Scraper Tricks: Extracting Data from Difficult Websites" given at Defcon 17 by Michael Schrenk.
Abstract: Screen scrapers and data mining bots often encounter problems when extracting data from modern websites. Obstacles like AJAX discourage many bot writers from completing screen scraping projects. The good news is that you can overcome most challenges if you learn a few tricks.
This session describes the (sometimes mind-numbing) roadblocks that can come between you and your ability to apply a screen scraper to a website. You'll discover simple techniques for extracting data from websites that freely employ DHTML, AJAX, complex cookie management, and other techniques. You will also learn how "agencies" create large-scale CAPTCHA solutions.
Author Bio: Michael Schrenk is a webbot developer and the author of "Webbots, Spiders, and Screen Scrapers" (2007, No Starch Press). He has also written for ComputerWorld, php|architect and Web Techniques magazines. Mike has also given presentations at DEF CON X, XI and XV. He works for a wide range of clients across North America as well as in Russia, Spain and The Netherlands. Stop by www.schrenk.com and say hello.
Tags: basics
Disclaimer: We are an infosec video aggregator and this video is linked from an external website. The original author may be different from the user re-posting or linking it here. Please do not assume the authors are the same without verifying.