Grab Data From a Site

**Alroundeath** · 07-10-2010

I'd like to know where I should look to learn how to get data from a website.

Example: Posts on a forum from a certain page.

Or... Reputation of that person from a page
Or... Posts per day
etc.

Just wondering what method I would use to do this, or where I could learn such magical prowess xP

**Hassan** · 07-10-2010

First you will need to capture the source code of the web page. To do this, add the following function to your program:

Code:

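Function GetPage(ByVal pageUrl As String) As String
    Dim s As String = ""
    Try
        ' The URL must include the scheme (http:// or https://).
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(pageUrl), HttpWebRequest)
        ' Using blocks close the response and the reader even if reading fails.
        Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(response.GetResponseStream())
                s = reader.ReadToEnd()
            End Using
        End Using
    Catch ex As Exception
        MsgBox("Error getting source code: " & vbCrLf & ex.Message)
    End Try
    ' An empty string is returned if the download failed.
    Return s
End Function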

Also add the following namespaces:

Code:

Imports System.Net
Imports System.IO
Imports System.Text

Usage:

Code:

RichTextBox1.Text = GetPage("https://www.mpgh.net")

^^ This will put the source code of the MPGH index page into the RichTextBox.

Now for the parsing (extracting specific parts of the web page, like reputation etc.), we will use a "read between" function:

Code:

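Public Function rb(ByVal readfrom As String, ByVal str1 As String, ByVal str2 As String) As String
    ' Find str1; the wanted text starts right after it.
    Dim x1 As Integer = readfrom.IndexOf(str1)
    If x1 < 0 Then Return ""
    x1 += str1.Length

    ' Find the first str2 that follows; the wanted text ends right before it.
    Dim x2 As Integer = readfrom.IndexOf(str2, x1)
    If x2 < 0 Then Return ""

    Return readfrom.Substring(x1, x2 - x1)
End Function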

Values such as reputation and posts per day usually sit inside tags, like:

Code:

<Post user="FlameSaber">11</Post>

Now I'll show an example of how to extract 11 from the above tag using the two functions above.

Code:

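Sub ExtractPosts(ByVal url As String)
    ' Download the page, then read the text between the opening and closing tag.
    Dim n As String = GetPage(url)
    Dim PostsOfFlameSaber As String = rb(n, "<Post user=""FlameSaber"">", "</Post>")

    ' TryParse avoids an exception when the tag is missing or not a number.
    Dim posts As Integer
    If Integer.TryParse(PostsOfFlameSaber, posts) Then
        MsgBox("Posts per day of FlameSaber are: " & posts)
    Else
        MsgBox("Could not read the posts per day from the page.")
    End If
End Sub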

Now, I just call this method from anywhere I want:

Code:

Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    ExtractPosts("https://www.mpgh.net/forum/members/553108-flamesaber.html")
End Sub

The tag "<Post user=""FlameSaber"">" is just an example. Real pages may use this form or something different, so change the two strings to match whatever surrounds the value you want. To figure that out, view the page's source code and look at the text just before and just after the value.
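
If you want to try the parsing step on its own before pointing it at a live page, something like the console sketch below might help. It is only a rough illustration: the sample string is made up to stand in for a downloaded page, and `ReadBetween` is a hypothetical stand-alone helper written for this example, not a built-in .NET function.

```vb
Imports System

Module ReadBetweenDemo
    ' Hypothetical helper: returns the text between startTag and endTag,
    ' or an empty string when either marker is missing.
    Function ReadBetween(ByVal source As String, ByVal startTag As String, ByVal endTag As String) As String
        Dim startIndex As Integer = source.IndexOf(startTag)
        If startIndex < 0 Then Return ""
        startIndex += startTag.Length

        Dim endIndex As Integer = source.IndexOf(endTag, startIndex)
        If endIndex < 0 Then Return ""

        Return source.Substring(startIndex, endIndex - startIndex)
    End Function

    Sub Main()
        ' Made-up sample markup standing in for a downloaded page.
        Dim html As String = "<div><Post user=""FlameSaber"">11</Post></div>"
        Dim raw As String = ReadBetween(html, "<Post user=""FlameSaber"">", "</Post>")

        ' Convert only if the extracted text really is a number.
        Dim posts As Integer
        If Integer.TryParse(raw, posts) Then
            Console.WriteLine("Posts per day: " & posts)
        Else
            Console.WriteLine("Tag not found or value is not a number.")
        End If
    End Sub
End Module
```

Once the strings match what you see in the real page source, the same two markers can be passed to the download-and-extract code above.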

Hope this helps!

**Alroundeath** · 07-11-2010

You deserve a bit of recognition for what you just typed out. Why not make your own thread and have it added to the tutorials list? x)

**Hassan** · 07-11-2010

Thanks. I've thought about writing an extensive parser tutorial many times, but I've been too lazy to do it. I think I'll write it within 3-4 days. I'll cover everything so you can understand string manipulation.

**Alroundeath** · 07-12-2010

You don't understand how helpful that would be x)

Because what you posted above sort of confused me the first time I read through it xP

Once again, thanks xP
