  1. #1
    Holmesdr's Avatar
    Join Date
    Nov 2020
    Gender
    male
    Location
    Amsterdam
    Posts
    6
    Reputation
    10
    Thanks
    0

    What are the rules associated with web scraping?

    I've started learning Beautiful Soup and want to test myself by doing some projects, but I found that not all websites allow web scraping, and I read something about robots.txt. What are the legal issues associated with it? Can anyone advise me on what I should do, or suggest some projects?
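
    Since robots.txt came up: it is a convention by which a site states which paths crawlers may visit. Checking it is a reasonable first step, though it does not settle the legal question by itself; the site's terms of service and local law matter too. Below is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `my-learning-bot` user agent name, and the example.com URLs are made up for illustration, and the rules are parsed from a string so the example runs offline.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "my-learning-bot" is an assumed name for your own scraper.
allowed_public = parser.can_fetch("my-learning-bot", "https://example.com/articles/1")
allowed_private = parser.can_fetch("my-learning-bot", "https://example.com/private/data")
delay = parser.crawl_delay("my-learning-bot")

print(allowed_public, allowed_private, delay)
```

    Against a real site you would call `parser.set_url(...)` with the site's robots.txt address and then `parser.read()` instead of `parse`, and wait `delay` seconds between requests when a crawl delay is given.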

  2. #2
    Holmesdr's Avatar
    Join Date
    Nov 2020
    Gender
    male
    Location
    Amsterdam
    Posts
    6
    Reputation
    10
    Thanks
    0
    I have found a solution for testing my project: https://mydataprovider.com/solutions/web-scraping/. It provides any type of data extraction configuration I need.

  3. #3
    Matthew's Avatar
    Join Date
    Mar 2017
    Gender
    male
    Posts
    5,330
    Reputation
    1162
    Thanks
    1,156
    You could scrape all the external links that the site gives you (does it redirect you to a third party, etc.) or be boring and scrape the image links that are embedded in the site's code. I'm not too brushed up on web scraping.
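
    The two project ideas above (external links and image links) could be sketched roughly as follows. This version uses only the standard library's `html.parser` so it runs without installing anything; Beautiful Soup's `find_all("a")` and `find_all("img")` would do the same job. The `LinkCollector` class, the sample HTML, and the example.com base URL are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect external <a href> targets and all <img src> targets."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.base_host = urlparse(base_url).netloc
        self.external_links = []
        self.image_links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            url = urljoin(self.base_url, attributes["href"])
            host = urlparse(url).netloc
            # A link is "external" when it points at a different host.
            if host and host != self.base_host:
                self.external_links.append(url)
        elif tag == "img" and attributes.get("src"):
            # Resolve relative image paths against the page URL.
            self.image_links.append(urljoin(self.base_url, attributes["src"]))


# Made-up page content standing in for a downloaded page.
SAMPLE_HTML = """
<a href="/about">About</a>
<a href="https://other.example.org/page">Partner</a>
<img src="/static/logo.png" alt="logo">
"""

collector = LinkCollector("https://example.com/")
collector.feed(SAMPLE_HTML)
print(collector.external_links)
print(collector.image_links)
```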

  4. #4
    Bababam's Avatar
    Join Date
    Nov 2020
    Gender
    female
    Posts
    6
    Reputation
    10
    Thanks
    1
    Are you getting a specific error/stack trace?

    You should be able to open a standard HTTPS connection to the webpage you want to scrape. In principle, the server can't tell your program from a user's web browser. If you are having problems in this area, either check that you are setting the correct spoofed HTTP headers (device type, etc.), or change your implementation to use an existing browser in headless mode for the HTTPS connection.
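
    As a rough illustration of setting request headers, here is a sketch with the standard library's `urllib.request`. The header values (including the browser-like User-Agent string) and the `build_request` helper are assumptions for the example, not values any particular site requires. The request is only constructed here, not sent.

```python
from urllib.request import Request

# Illustrative header values; adjust them to whatever the target site expects.
BROWSER_LIKE_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url: str) -> Request:
    """Create a GET request that carries browser-like headers."""
    return Request(url, headers=BROWSER_LIKE_HEADERS)


request = build_request("https://example.com/")

# urllib stores header names capitalized as "User-agent", "Accept-language".
print(request.get_header("User-agent"))

# To actually send it (requires network access):
# from urllib.request import urlopen
# with urlopen(request, timeout=10) as response:
#     html = response.read().decode("utf-8", errors="replace")
```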

  5. #5
    JackGambler's Avatar
    Join Date
    Jul 2022
    Gender
    male
    Posts
    1
    Reputation
    10
    Thanks
    0

    Sport

    i cant read and

  6. #6
    PeterBrodbeck's Avatar
    Join Date
    Apr 2023
    Gender
    male
    Posts
    1
    Reputation
    10
    Thanks
    0
    It is also possible to scrape web pages with a headless browser, driven by a tool such as Puppeteer or Selenium. This approach can provide greater flexibility and control over the scraping process, as it allows you to interact with the page as if you were using a real browser.
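
    A minimal sketch of the headless approach with Selenium's Python bindings might look like this. It assumes Selenium 4 and a local Chrome installation; the `fetch_rendered_html` helper name is made up for the example. Selenium is imported inside the function so the file still loads on a machine where it is not installed.

```python
def fetch_rendered_html(url: str) -> str:
    """Load a page in headless Chrome and return the rendered HTML."""
    # Imported lazily: these lines only run when the function is called.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    # Run Chrome without opening a visible window.
    options.add_argument("--headless=new")

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # page_source reflects the DOM after the browser has loaded the page.
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()


# Example call (needs Selenium, Chrome, and network access):
# html = fetch_rendered_html("https://example.com/")
```

    The returned HTML can then be passed to Beautiful Soup in the usual way.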
