Page 1 of 2
Results 1 to 15 of 16
  1. #1
    kibbles18

    Parsing links from a webpage.

    So I made an email bot (a program that reads email addresses stored in a txt file and emails them), and now I want to make a program that can open a URL as a string, search it for all domains/text, and visit all the links on the page. How should I approach this?

  2. #2
    Lyoto Machida
    Use a WebBrowser control, navigate it to that site, and then:

    Dim Links As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")

    To see all the links:

    For Each link As HtmlElement In Links
        ListBox1.Items.Add(link.GetAttribute("href")) ' adds each link to a list box
    Next

    It's something like that; I haven't coded VB in a long time.

  3. #3
    kibbles18
    So I need to use a WebBrowser and navigate to a page, and then what?

    Here's my code so far (it doesn't work):
    Code:
    Public Class Form1
    
        Dim email As HtmlElement
    
        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
    
        End Sub
        Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
            Dim i As Integer
            WebBrowser1.Navigate(TextBox1.Text)
            Dim emails As System.Windows.Forms.HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName(TextBox2.Text)
            Dim count As Integer = WebBrowser1.Document.GetElementsByTagName(TextBox2.Text).Count
            For i = 0 To count
                ListBox1.Items.Add(email.GetAttribute("href").ToString)
            Next
        End Sub
    End Class
    Last edited by kibbles18; 06-03-2011 at 08:54 PM.

  4. #4
    Lyoto Machida
    Code:
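    Public Class Form1

        Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
            ' Only start the navigation here; the document is not loaded yet when Navigate returns.
            WebBrowser1.Navigate(TextBox1.Text)
        End Sub

        ' Runs once the page has finished loading.
        Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
            Dim emails As System.Windows.Forms.HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName(TextBox2.Text)

            For Each email As HtmlElement In emails
                ListBox1.Items.Add(email.GetAttribute("href"))
            Next
        End Sub

    End Class

    But that only gets the links. Is that what you want?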

  5. #5
    kibbles18
    i want to find all emails with @gmail.com or @hotmail.com first and display them in the list box
    Last edited by kibbles18; 06-03-2011 at 09:54 PM.

  6. #6
    Lyoto Machida
    Button 1:
    Code:
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        WebBrowser1.Navigate(TextBox1.Text)
    End Sub
    WebBrowser1_DocumentCompleted event:

    Code:
    Dim emails As System.Windows.Forms.HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName(TextBox2.Text)

    For Each email As HtmlElement In emails
        ' Keep only elements whose href contains an "@" (e.g. mailto: links).
        If email.GetAttribute("href").Contains("@") Then
            ListBox1.Items.Add(email.GetAttribute("href"))
        End If
    Next
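
    To narrow this down to the Gmail and Hotmail addresses asked about above, the href could be stripped of its mailto: prefix and checked against a list of domains. This is only a minimal sketch, assuming the addresses appear as mailto: links; MailtoFilter, AddressFromHref and WantedDomains are made-up names, not part of the WebBrowser API.

    ```vbnet
    Imports System

    Module MailtoFilter
        ' Hypothetical list of domains to keep.
        Private ReadOnly WantedDomains As String() = New String() {"@gmail.com", "@hotmail.com"}

        ' Returns the address inside a mailto: href if it ends with a wanted domain,
        ' otherwise Nothing.
        Function AddressFromHref(ByVal href As String) As String
            If String.IsNullOrEmpty(href) Then Return Nothing
            If Not href.StartsWith("mailto:", StringComparison.OrdinalIgnoreCase) Then Return Nothing

            ' Drop the scheme and any ?subject=... query part.
            Dim address As String = href.Substring("mailto:".Length)
            Dim queryStart As Integer = address.IndexOf("?"c)
            If queryStart >= 0 Then address = address.Substring(0, queryStart)
            address = address.Trim()

            For Each domain As String In WantedDomains
                If address.EndsWith(domain, StringComparison.OrdinalIgnoreCase) Then Return address
            Next
            Return Nothing
        End Function
    End Module
    ```

    Inside the For Each loop, the Contains("@") check could then be swapped for a call to AddressFromHref(email.GetAttribute("href")), adding the result to ListBox1 only when it is not Nothing.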

  7. #7
    kibbles18
    If I use "a" in TextBox2 it works. How can I make it find emails that aren't links, i.e. plain text on the page? And can you explain "href" and "a" to me? Why do we use them?
    Last edited by kibbles18; 06-03-2011 at 11:45 PM.

  8. #8
    freedompeace
    Use the following regex:
    Code:
    <([a-zA-Z0-9]+)\s[^>]*?(href|onclick)="([^"']+)"[^>]*>.*?</\1>
    Verify that the data in group 3 is an HTTP URI, or evaluate the JavaScript to see if it's a link to something.
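
    A rough sketch of how that regex approach could look in VB.NET, assuming the page source is already in a string (for example from WebBrowser1.DocumentText). The pattern here is a simplified stand-in that only captures double-quoted href values on "a" tags, and LinkExtractor / ExtractHrefs are made-up names. Regexes are fragile on real-world HTML, so treat this as a starting point only.

    ```vbnet
    Imports System
    Imports System.Collections.Generic
    Imports System.Text.RegularExpressions

    Module LinkExtractor
        ' Simplified pattern: an <a ...> tag with a double-quoted href; group 1 is the URL.
        Private ReadOnly HrefPattern As New Regex("<a\s[^>]*?href=""([^""]+)""", RegexOptions.IgnoreCase)

        ' Returns only the href values that parse as absolute http/https URIs.
        Function ExtractHrefs(ByVal html As String) As List(Of String)
            Dim links As New List(Of String)
            For Each m As Match In HrefPattern.Matches(html)
                Dim candidate As String = m.Groups(1).Value
                Dim parsed As Uri = Nothing
                If Uri.TryCreate(candidate, UriKind.Absolute, parsed) Then
                    If parsed.Scheme = Uri.UriSchemeHttp OrElse parsed.Scheme = Uri.UriSchemeHttps Then
                        links.Add(candidate)
                    End If
                End If
            Next
            Return links
        End Function
    End Module
    ```

    Relative links and mailto: links are skipped by the URI check, so they would need their own handling if they matter.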

  9. #9
    Lyoto Machida
    Quote Originally Posted by freedompeace View Post
    Use the following regex:
    Code:
    <([a-zA-Z0-9]+)\s[^>]*?(href|onclick)="([^"']+)"[^>]*>.*?</\1>
    Verify that the data in group 3 is an HTTP URI, or evaluate the JavaScript to see if it's a link to something.
    Yeah, or just make some checks. Don't you know how to use an If statement?

    If TextBox1.Text.Length > 5 Then MsgBox("The textbox has more than 5 characters")

    Make your own checks o.O

  10. #10
    kibbles18
    But sometimes the emails on a website aren't links, so how do I get those? They are valid emails as well.
    Last edited by kibbles18; 06-04-2011 at 10:00 AM.

  11. #11
    sythe179
    this work?
    Code:
    For Each str As String In WebBrowser1.Document.All
                If str = ("*" & "@" & "*" & "." & "*") Then
                    listbox1.items.add(str)
                End If
            Next

  12. #12
    kibbles18
    Like, if it was in plain text and wasn't a link to anywhere, would it still work? And thanks *****, I'll try that when I get back.
    Last edited by kibbles18; 06-04-2011 at 09:59 AM.

  13. #13
    Blubb1337
    Quote Originally Posted by *****179 View Post
    this work?
    Code:
    For Each str As String In WebBrowser1.Document.All
                If str = ("*" & "@" & "*" & "." & "*") Then
                    listbox1.items.add(str)
                End If
            Next
    That should not work at all.
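
    The loop fails because Document.All holds HtmlElement objects rather than strings, and = compares exact text, so "*" is not treated as a wildcard. One way to get what it is aiming for is to run a regular expression over the page's visible text. A minimal sketch, assuming the text comes from WebBrowser1.Document.Body.InnerText after the page has loaded; the pattern is deliberately loose, and EmailFinder / FindEmails are made-up names.

    ```vbnet
    Imports System
    Imports System.Collections.Generic
    Imports System.Text.RegularExpressions

    Module EmailFinder
        ' Loose pattern: something@something.tld. Not a full address validator.
        Private ReadOnly EmailPattern As New Regex("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

        ' Returns each distinct address found in the text, in order of first appearance.
        Function FindEmails(ByVal text As String) As List(Of String)
            Dim found As New List(Of String)
            If String.IsNullOrEmpty(text) Then Return found

            For Each m As Match In EmailPattern.Matches(text)
                If Not found.Contains(m.Value) Then found.Add(m.Value)
            Next
            Return found
        End Function
    End Module
    ```

    In the DocumentCompleted handler, looping over FindEmails(WebBrowser1.Document.Body.InnerText) and adding each item to ListBox1 should then pick up addresses whether or not they are links.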



  14. #14
    sythe179
    Quote Originally Posted by Blubb1337 View Post
    That should not work at all.
    ahh....i see why...
    whoah...realy need some sleep....47 hours....yup..time to sleep..
    hey tablets...HERE I COME!....

  15. #15
    kibbles18
    so i can use
    Code:
    WebBrowser1.Document.All
    and then search for a string?

    It's not working. I'm trying to get it to find an email that's not a link, just text. Here's my code:
    Code:
            For Each emailstr As String In WebBrowser1.Document.All.ToString
                If emailstr.Contains("@") & emailstr.Contains(".") Then
                    ListBox1.Items.Add(emailstr)
                End If
            Next
    Last edited by kibbles18; 06-04-2011 at 08:01 PM.
