import re

def url_finder(data):
    # Match http:// or https:// followed by a run of URL characters or %XX escapes.
    urls = re.findall(r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+", data)
    for url in urls:
        # Strip any surrounding quotes; the trailing "\n" leaves a blank line between URLs.
        outpt = url.strip('"').strip("'") + "\n"
        print(outpt)

inpt = "aaaaaaaaaaaaaa http://www.google.com bbbbbbbbb http://example010.blogspot.com ccccccccc http://google.com dddd http://a.b/a/a/a/index.html"
url_finder(inpt)
This code simply finds every URL in a string using a regular expression and prints each one on its own line.
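If the URLs are needed for further processing rather than just printed, a variant that returns them as a list may be handier. The following is only a sketch under that assumption: it reuses the same pattern, and the names `URL_PATTERN`, `find_urls` and the sample string are made up for illustration.

```python
import re

# Same pattern as in the post, compiled once and written as a raw string
# so the backslashes reach the regex engine unchanged.
URL_PATTERN = re.compile(
    r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"
)

def find_urls(text):
    """Return the URLs found in text, in order of appearance (hypothetical helper)."""
    # All groups in the pattern are non-capturing, so findall yields full matches.
    return [match.strip('"').strip("'") for match in URL_PATTERN.findall(text)]

sample = "see http://www.google.com and https://example.com/a%20b today"
urls = find_urls(sample)
print(urls)
```

Note that the character classes in this pattern do not include `#` or `~`, so a URL containing either one is cut off at that character.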
Wednesday, January 28, 2009