CSS selector tips: https://saucelabs.com/resources/articles/selenium-tips-css-selectors
- The Selenium package can interact with different browsers (Chrome, Firefox, Safari, etc.). A driver provides the bridge between Selenium and the browser. (We use the driver for Chrome.)
from selenium import webdriver
# requires: pip install selenium
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"  # raw string keeps the backslashes literal; on Mac there is no .exe
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://www.amazon.com/Cuckoo-CRP-P0609S-Cooker-10-10-11-60/dp/B01JRTZVVM/ref=sr_1_4?qid=1611738512&sr=8-4")
driver.close()  # closes a single window
# driver.quit()  # quits the entire browser, regardless of how many tabs are open
- BeautifulSoup (BS) has its limits when the website's content is rendered with JavaScript, Angular, etc.
driver.get("https://www.amazon.com/Cuckoo-CRP-P0609S-Cooker-10-10-11-60/dp/B01JRTZVVM/ref=sr_1_4?qid=1611738512&sr=8-4")
price = driver.find_element_by_id("priceblock_ourprice")
print(price.text)
driver.quit()  # quits the entire browser, regardless of how many tabs are open
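A note on Selenium versions: the find_element_by_* helpers used in these notes come from Selenium 3. As far as I know they were removed in later Selenium 4 releases, where the call is find_element(strategy, value) instead. Below is a minimal sketch of that call style. read_text_by_id and FakeDriver are hypothetical names made up for illustration, and the price string is made-up sample data. The strategy string "id" is the value of By.ID in selenium.webdriver.common.by.

```python
from types import SimpleNamespace

# Locator strategy as a plain string. In real code this is usually written as
# By.ID after `from selenium.webdriver.common.by import By` (By.ID == "id").
BY_ID = "id"


def read_text_by_id(driver, element_id):
    """Return the visible text of the element with the given id."""
    return driver.find_element(BY_ID, element_id).text


class FakeDriver:
    """Hypothetical stand-in for a WebDriver so the sketch runs without a browser."""

    def __init__(self, elements):
        self._elements = elements  # maps (strategy, value) -> text

    def find_element(self, by, value):
        return SimpleNamespace(text=self._elements[(by, value)])


# Made-up sample data, not a real price.
fake = FakeDriver({("id", "priceblock_ourprice"): "$99.99"})
price_text = read_text_by_id(fake, "priceblock_ourprice")
print(price_text)
```

With a real browser session, the same function would be called as read_text_by_id(driver, "priceblock_ourprice").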
search bar
logo
from selenium import webdriver
# requires: pip install selenium
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"  # raw string keeps the backslashes literal
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://www.python.org/")
#search_bar
search_bar = driver.find_element_by_name("q")
print(search_bar.tag_name) #print: input
print(search_bar.get_attribute("placeholder")) #print: search
#logo
logo = driver.find_element_by_class_name("python-logo")
print(logo.size) #print: {'height': 72, 'width': 255}
driver.quit()  # quits the entire browser, regardless of how many tabs are open
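- Note the dot before the class name in (".documentation-widget a"). The selector works even though the element's full class attribute is "small-widget documentation-widget", because a class selector matches any one of an element's classes.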
#selector
driver.get("https://www.python.org/")
documentation_link = driver.find_element_by_css_selector(".documentation-widget a")
print(documentation_link.text) #print: docs.python.org
driver.close()
- XPath can be used to navigate through elements and attributes in an XML document.
- Right-click the element in the browser's inspector and copy its XPath. Change the double quotes inside the XPath to single quotes; otherwise they clash with the double quotes that delimit the Python string.
driver.get("https://www.python.org/")
bug_link = driver.find_element_by_xpath("//*[@id='site-map']/div[2]/div/ul/li[3]/a")
print(bug_link.text) #print: Submit Website Bug
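The quoting rule above can be checked in plain Python without a browser. This is a small sketch with variable names chosen only for illustration: either switch the inner quotes to single quotes, or keep the copied double quotes and wrap the whole XPath in single quotes.

```python
# Option 1: switch the inner quotes to single quotes, wrap in double quotes.
inner_single = "//*[@id='site-map']/div[2]/div/ul/li[3]/a"

# Option 2: keep the copied double quotes and wrap the string in single quotes.
outer_single = '//*[@id="site-map"]/div[2]/div/ul/li[3]/a'

# Both are valid Python strings; they differ only in the quote character used
# around the attribute value.
print(inner_single)
print(outer_single)
same_after_swap = outer_single.replace('"', "'") == inner_single
print(same_after_swap)
```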
find_elements: returns all matches as a list
Article: Locating strategy
events = driver.find_elements_by_xpath('//*[@id="content"]/div/section/div[2]/div[2]/div/ul')
print(events)
# -> print object in a list
[<selenium.webdriver.remote.webelement.WebElement
(session="0621dbfca2d047fd85c6dba14c554b6c", element="fac0caf1-8b8a-4ee0-9f2d-5d347cb3df72")>]
print(type(events))
# -> <class 'list'>
for i in events:
    print(i.text)
Use .text.splitlines() to put each item in a list when there are multiple li items under the ul element.
event = driver.find_element_by_xpath('//*[@id="content"]/div/section/div[2]/div[2]/div/ul')
print(event)
#-> object
<selenium.webdriver.remote.webelement.WebElement
(session="5107d054b8b39ed4b87b64ddb660c292", element="4c2f8ec2-f025-4fab-a2e3-2b6f99ef154e")>
print(type(event))
#-> <class 'selenium.webdriver.remote.webelement.WebElement'>
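What .text.splitlines() does can be seen on a plain string. A minimal sketch: ul_text below is sample text shaped like the .text of the events list (values taken from the output further down in these notes), not something fetched live.

```python
# Sample text shaped like the .text of the <ul> element:
# each time and each event name sits on its own line.
ul_text = "01-30\nBelPy 2021\n01-30\nPyCamp Leipzig"

# splitlines() splits on line breaks and drops the "\n" characters.
items = ul_text.splitlines()
print(items)
```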
In my code, range(1, 6) is not scalable because the number of events is hard-coded.
Angela used CSS selectors after inspecting the structure of the website.
Choosing the right method only comes with experience and knowledge of HTML and CSS.
<MY CODE>
from selenium import webdriver
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"  # raw string keeps the backslashes literal
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://www.python.org/")
events = {}
for i in range(1, 6):
    time = driver.find_element_by_xpath(f"//*[@id='content']/div/section/div[2]/div[2]/div/ul/li[{i}]/time")
    name = driver.find_element_by_xpath(f"//*[@id='content']/div/section/div[2]/div[2]/div/ul/li[{i}]/a")
    events[i - 1] = {
        'time': f"2021-{time.text}",
        'name': name.text
    }
print(events)
#print: {0: {'time': '2021-01-30', 'name': 'BelPy 2021'},
#        1: {'time': '2021-01-30', 'name': 'PyCamp Leipzig'},
#        2: {'time': '2021-02-19', 'name': 'PyCascades 2021'},
#        3: {'time': '2021-03-18', 'name': 'PyCon Cameroon 2021'},
#        4: {'time': '2021-04-22', 'name': 'GeoPython 2021'}}
driver.quit()
<Angela's code>
event_times = driver.find_elements_by_css_selector(".event-widget time")
event_names = driver.find_elements_by_css_selector(".event-widget li a")
print(event_times)  # -> list of WebElement objects
events = {}
for n in range(len(event_times)):
    events[n] = {
        "time": event_times[n].text,
        "name": event_names[n].text
    }
print(events)
#print: {0: {'time': '01-30', 'name': 'BelPy 2021'},
#        1: {'time': '01-30', 'name': 'PyCamp Leipzig'},
#        2: {'time': '02-19', 'name': 'PyCascades 2021'},
#        3: {'time': '03-18', 'name': 'PyCon Cameroon 2021'},
#        4: {'time': '04-22', 'name': 'GeoPython 2021'}}
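The index loop can also be written with zip and enumerate, which pairs each time with its name directly. A sketch, assuming both lists have the same length and order; FakeElement is a hypothetical stand-in for a Selenium WebElement so that it runs without a browser.

```python
from collections import namedtuple

# Hypothetical stand-in for WebElement: only the .text attribute is used here.
FakeElement = namedtuple("FakeElement", ["text"])

event_times = [FakeElement("01-30"), FakeElement("02-19")]
event_names = [FakeElement("BelPy 2021"), FakeElement("PyCascades 2021")]

# zip pairs the n-th time with the n-th name; enumerate supplies the key.
events = {
    n: {"time": t.text, "name": name.text}
    for n, (t, name) in enumerate(zip(event_times, event_names))
}
print(events)
```

With real elements, event_times and event_names would be the two lists returned by the find_elements calls above.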
Using a dictionary comprehension: something I initially tried and failed at.
.splitlines(): splits the text into a list of single items
range(0, len(events), 2): scalable way! From zero to the end of the list, in steps of 2
events = driver.find_element_by_xpath(
'//*[@id="content"]/div/section/div[2]/div[2]/div/ul').text.splitlines()
print(events)
#['01-30', 'BelPy 2021', '01-30', 'PyCamp Leipzig',
'02-19', 'PyCascades 2021', '03-18', 'PyCon Cameroon 2021', '04-22', 'GeoPython 2021']
dictionary = {i: {'time': events[i], 'name': events[i + 1]} for i in range(0, len(events), 2)}
print(dictionary)
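One thing to watch with this comprehension: because i steps by 2, the keys come out as 0, 2, 4, ... instead of 0, 1, 2, ... If consecutive keys are wanted, integer division by 2 is one way to get them. A minimal sketch using a shortened copy of the list printed above:

```python
events = ["01-30", "BelPy 2021", "01-30", "PyCamp Leipzig", "02-19", "PyCascades 2021"]

# Original form: keys follow the list index, so they skip every other number.
by_index = {
    i: {"time": events[i], "name": events[i + 1]}
    for i in range(0, len(events), 2)
}
print(list(by_index))

# i // 2 turns the indexes 0, 2, 4 into the keys 0, 1, 2.
consecutive = {
    i // 2: {"time": events[i], "name": events[i + 1]}
    for i in range(0, len(events), 2)
}
print(list(consecutive))
```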
- The year is hidden by CSS if the window is too small.
When printing the text inside the "time" tag, only the text that is visible (i.e. text where the CSS property "visibility" is equal to "visible") will be printed.
However, on python.org the CSS property "visibility" of the year component of the event is set to "hidden" when the browser window is resized to a certain width.
This is why, when the browser window has a specific width, the year component of the event is not displayed.
Example 1:
driver.set_window_size(width=100, height=200)
driver.get("https://www.python.org/")
event_times = driver.find_elements_by_css_selector(".event-widget time")
for time in event_times:
    print(time.text)
Example 2:
driver.maximize_window()
driver.get("https://www.python.org/")
event_times = driver.find_elements_by_css_selector(".event-widget time")
for time in event_times:
    print(time.text)
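If the year is needed regardless of window size, one option is to read the element's textContent instead of .text, since .text only returns what is rendered as visible. A sketch, assuming the hidden year is still present in the page's HTML; full_text and FakeTime are hypothetical names, and the sample string is made up.

```python
def full_text(element):
    """Return an element's text including parts hidden by CSS."""
    # get_attribute("textContent") reads the DOM text, visible or not.
    return element.get_attribute("textContent").strip()


class FakeTime:
    """Hypothetical stand-in for a WebElement so the sketch runs without a browser."""

    text = "01-30"  # what .text would give when the year is hidden

    def get_attribute(self, name):
        # Made-up sample value for illustration only.
        return {"textContent": " 2021-01-30 "}.get(name)


element = FakeTime()
print(element.text)
print(full_text(element))
```

With a real driver, the same helper could be applied to each item of event_times in place of .text.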
from selenium import webdriver
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"  # raw string keeps the backslashes literal
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://en.wikipedia.org/wiki/Main_Page")
num = driver.find_element_by_id("articlecount")
print(num.text) #print 6,237,906 articles in English
num2 = driver.find_element_by_css_selector("#articlecount a")
print(num2.text) #print 6,237,906
driver.quit()
with Selenium
A link's text sits between the opening and closing anchor tags (<a>...</a>).
from selenium import webdriver
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"  # raw string keeps the backslashes literal
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://en.wikipedia.org/wiki/Main_Page")
count = driver.find_element_by_css_selector("#articlecount a")
# count.click()
all_portals = driver.find_element_by_link_text("All portals")
all_portals.click()
# driver.quit()
We need to import Keys, which holds constants for special keys such as ENTER, SHIFT, ALT, etc.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
chrome_driver_path = r"C:\dayeon2020\chromedriver.exe"  # raw string keeps the backslashes literal
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get("https://en.wikipedia.org/wiki/Main_Page")
search = driver.find_element_by_name("search")
search.send_keys("python")
search.send_keys(Keys.ENTER)
#### Task: filling in the signup form
top = driver.find_element_by_class_name("top")
top.send_keys("python")
middle = driver.find_element_by_class_name("middle")
middle.send_keys("lee")
bottom = driver.find_element_by_class_name("bottom")
bottom.send_keys("pythonlee@mail.com")
# click = driver.find_element_by_class_name("btn-block")
# click.send_keys(Keys.ENTER)
time()
Function time.time returns the current time in seconds since 1st Jan 1970. The value is in floating point, so you can even use it with sub-second precision. In the beginning the value t_end is calculated to be "now" + 15 minutes. The loop will run until the current time exceeds this preset ending time.
Try this:
import time
t_end = time.time() + 60 * 15
while time.time() < t_end:
    # do whatever you do
    pass
This will run for 15 min x 60 s = 900 seconds.
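The same pattern wrapped in a small function, as a sketch. run_for is a hypothetical helper name. It uses time.monotonic() rather than time.time(), a swap on my part so the loop is unaffected if the system clock is changed, and it adds a short sleep so the loop does not spin the CPU.

```python
import time


def run_for(seconds, action, pause=0.01):
    """Call action() repeatedly until `seconds` have elapsed; return the call count."""
    t_end = time.monotonic() + seconds
    calls = 0
    while time.monotonic() < t_end:
        action()
        calls += 1
        time.sleep(pause)  # short pause between iterations
    return calls


# Short demo run: append to a list for about 0.05 seconds.
clicks = []
count = run_for(0.05, lambda: clicks.append(1))
print(count == len(clicks))
```

For the 15-minute case from the text, the call would be run_for(60 * 15, some_action), where some_action is whatever should be repeated.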
word
on a similar vein