  1. #1
    Alroundeath's Avatar
    Join Date
    Sep 2009
    Gender
    male
    Posts
    331
    Reputation
    8
    Thanks
    29
    My Mood
    Amused

    Grab Data From a Site

    I'd like to know where I should look to find out how to get data from a website.

    Example: Posts on a forum from a certain page.

    Or... Reputation of that person from a page
    Or... Posts per day
    etc.

    Just wondering what method I would use to do this, or where I could learn such magical prowess xP

  2. #2
    Hassan's Avatar
    Join Date
    May 2010
    Gender
    male
    Location
    System.Threading.Tasks
    Posts
    4,764
    Reputation
    495
    Thanks
    2,133
    My Mood
    Dead
    First you need to download the HTML source of the page. To do this, add the following function to your program:

    Code:
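    ' Downloads the HTML source of pageUrl. Returns "" if the request fails.
    Function GetPage(ByVal pageUrl As String) As String
        Dim s As String = ""
        Try
            ' WebRequest.Create returns a WebRequest, so cast it to the HTTP type.
            Dim request As HttpWebRequest = DirectCast(WebRequest.Create(pageUrl), HttpWebRequest)
            ' Using blocks close the response and the reader even if reading fails.
            Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
                Using reader As New StreamReader(response.GetResponseStream())
                    s = reader.ReadToEnd()
                End Using
            End Using
        Catch ex As Exception
            MsgBox("Error getting source code: " & vbCrLf & ex.Message)
        End Try
        Return s
    End Function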
    Also add the following imports at the top of the file:

    Code:
    Imports System.Net
    Imports System.IO
    Imports System.Text
    Usage:

    Code:
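    RichTextBox1.Text = GetPage("https://www.mpgh.net")
    This puts the HTML source of the MPGH index page into the RichTextBox. The URL must include the scheme (http:// or https://); without it, WebRequest.Create throws a UriFormatException.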

    Now for the parsing (extracting specific parts of the page, such as reputation), we will use a "read between" function that returns the text between two known strings:


    Code:
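    ' Returns the text between the first str1 and the next str2,
    ' or "" if either marker is not found.
    Public Function rb(ByVal readfrom As String, ByVal str1 As String, ByVal str2 As String) As String
        Dim i As Integer = readfrom.IndexOf(str1, StringComparison.Ordinal)
        If i < 0 Then Return ""

        Dim x1 As Integer = i + str1.Length                                        ' first character after str1
        Dim x2 As Integer = readfrom.IndexOf(str2, x1, StringComparison.Ordinal)   ' start of str2
        If x2 < 0 Then Return ""

        Return readfrom.Substring(x1, x2 - x1)
    End Function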
    Values such as reputation or posts per day usually sit inside a tag, for example:

    Code:
    <Post user="FlameSaber">11</Post>
    Here is how to extract the 11 from the tag above using the two functions:

    Code:
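    Sub ExtractPosts(ByVal url As String)
        ' Download the page, then read the text between the opening and closing tag.
        Dim html As String = GetPage(url)
        Dim PostsofFlameSaber As String = rb(html, "<Post user=""FlameSaber"">", "</Post>")

        ' Shown as text: CInt would throw if the tag was not found and the result is empty.
        MsgBox("Posts per day of FlameSaber are: " & PostsofFlameSaber)
    End Sub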
    Now you can call this method from anywhere, for example when the form loads:

    Code:
    Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        ExtractPosts("https://www.mpgh.net/forum/members/553108-flamesaber.html")
    End Sub
    The tag "<Post user=""FlameSaber"">" is just an example (the doubled quotes are how VB writes a quote inside a string). Real pages use their own markup, so view the page's source code, find the text that surrounds the value you want, and pass that text as str1 and str2.
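
    To try the read-between idea without hitting a live site, here is a rough sketch as a console program that works on a hard-coded string. The names ReadBetween and ReadBetweenDemo and the sample HTML are made up for illustration, and it assumes the value you want is a whole number. It returns Nothing when a marker is missing and uses Integer.TryParse, so a changed page layout gives a message instead of an exception.

```vbnet
Imports System

Module ReadBetweenDemo
    ' Hypothetical helper: returns the text between startTag and endTag,
    ' or Nothing when either marker is missing.
    Function ReadBetween(ByVal source As String, ByVal startTag As String, ByVal endTag As String) As String
        Dim tagPos As Integer = source.IndexOf(startTag, StringComparison.Ordinal)
        If tagPos < 0 Then Return Nothing

        ' The value starts right after the opening marker.
        Dim valueStart As Integer = tagPos + startTag.Length
        Dim valueEnd As Integer = source.IndexOf(endTag, valueStart, StringComparison.Ordinal)
        If valueEnd < 0 Then Return Nothing

        Return source.Substring(valueStart, valueEnd - valueStart)
    End Function

    Sub Main()
        ' Made-up HTML standing in for a downloaded page.
        Dim html As String = "<html><Post user=""FlameSaber"">11</Post></html>"

        Dim raw As String = ReadBetween(html, "<Post user=""FlameSaber"">", "</Post>")

        ' TryParse reports failure instead of throwing on non-numeric text.
        Dim posts As Integer
        If raw IsNot Nothing AndAlso Integer.TryParse(raw.Trim(), posts) Then
            Console.WriteLine("Posts per day: " & posts.ToString())
        Else
            Console.WriteLine("Tag not found or value is not a number")
        End If
    End Sub
End Module
```

    With the sample string above this prints "Posts per day: 11". Swap the hard-coded string for the text you downloaded once the markers match the real page.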

    Hope this helps!

  3. The Following 2 Users Say Thank You to Hassan For This Useful Post:

    Alroundeath (07-11-2010),Jason (07-10-2010)

  4. #3
    Alroundeath's Avatar
    You deserve a bit of recognition for what you just typed out. Why not make your own thread and have it added to the tutorials list? x)

  5. #4
    Hassan's Avatar
    Thanks. I've thought about writing an extensive parser tutorial many times, but I've been too lazy to do it. I think I'll write it within 3-4 days. I'll cover everything so you can understand string manipulation.

  6. #5
    Alroundeath's Avatar
    You don't understand how helpful that would be x)

    Because what you posted above sort of confused me the first time I read through it xP


    Once again, thanks xP