Commit 9c992fe

GUI App, Scraping
1 parent efde95e commit 9c992fe

2 files changed: +43 -43 lines changed
README.md

Lines changed: 21 additions & 20 deletions
@@ -2458,7 +2458,7 @@ if __name__ == '__main__':
 
 GUI App
 -------
-#### A weight converter GUI application:
+#### Runs a desktop app for converting weights from metric units into pounds:
 
 ```python
 # $ pip3 install PySimpleGUI
@@ -2467,8 +2467,7 @@ import PySimpleGUI as sg
 text_box = sg.Input(default_text='100', enable_events=True, key='QUANTITY')
 dropdown = sg.InputCombo(['g', 'kg', 't'], 'kg', readonly=True, enable_events=True, k='UNIT')
 label = sg.Text('100 kg is 220.462 lbs.', key='OUTPUT')
-button = sg.Button('Close')
-window = sg.Window('Weight Converter', [[text_box, dropdown], [label], [button]])
+window = sg.Window('Weight Converter', [[text_box, dropdown], [label], [sg.Button('Close')]])
 
 while True:
     event, values = window.read()
@@ -2479,8 +2478,7 @@ while True:
     except ValueError:
         continue
     unit = values['UNIT']
-    factors = {'g': 0.001, 'kg': 1, 't': 1000}
-    lbs = quantity * factors[unit] / 0.45359237
+    lbs = quantity * {'g': 0.001, 'kg': 1, 't': 1000}[unit] / 0.45359237
     window['OUTPUT'].update(value=f'{quantity} {unit} is {lbs:g} lbs.')
 window.close()
 ```
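
The conversion expression that this hunk inlines can be exercised without opening a window. The following is a minimal sketch, assuming two hypothetical helpers, `to_pounds` and `describe`, that are not part of the commit and only wrap the same arithmetic and the same f-string:

```python
# Hypothetical helpers, not part of the commit: they isolate the conversion
# expression from the event loop so it can be checked without a GUI.
KG_PER_LB = 0.45359237  # Same constant the diff divides by.


def to_pounds(quantity: float, unit: str) -> float:
    """Converts a metric weight ('g', 'kg' or 't') into pounds."""
    return quantity * {'g': 0.001, 'kg': 1, 't': 1000}[unit] / KG_PER_LB


def describe(quantity: float, unit: str) -> str:
    """Formats the result with the same f-string the OUTPUT label receives."""
    lbs = to_pounds(quantity, unit)
    return f'{quantity} {unit} is {lbs:g} lbs.'


print(describe(100.0, 'kg'))
```

The `:g` format keeps six significant digits, which is why the default label in the diff reads `220.462`. An unknown unit would raise `KeyError` here; in the app the read-only dropdown prevents that case.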
@@ -2493,7 +2491,8 @@ Scraping
 # $ pip3 install requests beautifulsoup4
 import requests, bs4, os
 
-response = requests.get('https://en.wikipedia.org/wiki/Python_(programming_language)')
+url = 'https://en.wikipedia.org/wiki/Python_(programming_language)'
+response = requests.get(url, headers={'User-Agent': 'cpc-bot'})
 document = bs4.BeautifulSoup(response.text, 'html.parser')
 table = document.find('table', class_='infobox vevent')
 python_url = table.find('th', text='Website').next_sibling.a['href']
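
This hunk adds an explicit `User-Agent` header to the request. The commit does not say why; one plausible reason is that some servers reject a library's default client identifier. The sketch below shows how such a header travels with a request object. It swaps in the standard library's `urllib.request.Request` for `requests.get()` (an assumption made only so that nothing is sent over the network), and reuses the URL and header value from the diff:

```python
from urllib.request import Request

url = 'https://en.wikipedia.org/wiki/Python_(programming_language)'

# Stand-in for requests.get(url, headers=...): constructing a Request object
# only stores the URL and headers, it does not contact the server.
request = Request(url, headers={'User-Agent': 'cpc-bot'})

# urllib normalizes header names with str.capitalize(), hence 'User-agent'.
print(request.get_header('User-agent'))
print(request.get_method(), request.full_url)
```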
@@ -2509,25 +2508,27 @@ print(f'{python_url}, file://{os.path.abspath(filename)}')
 ```python
 # $ pip3 install selenium
 from selenium import webdriver
+```
 
-<WebDrv> = webdriver.Chrome/Firefox/Safari/Edge() # Opens a browser. Also <WebDrv>.quit().
-<WebDrv>.get('<url>') # Also <WebDrv>.implicitly_wait(seconds).
-<str> = <WebDrv>.page_source # Returns HTML of fully rendered page.
-<El> = <WebDrv/El>.find_element('css selector', …) # '<tag>#<id>.<class>[<attr>="<val>"]…'.
-<list> = <WebDrv/El>.find_elements('xpath', …) # '//<tag>[@<attr>="<val>"]…'. See XPath.
-<str> = <El>.get_attribute(<str>) # Property if exists. Also <El>.text.
-<El>.click/clear() # Also <El>.send_keys(<str>).
+```python
+<Drv> = webdriver.Chrome/Firefox/Safari/Edge() # Opens the browser. Also <Driver>.quit().
+<Drv>.implicitly_wait(seconds) # Sets timeout for find_element/s() methods.
+<Drv>.get('<url>') # Blocks until browser fires the load event.
+<str> = <Drv>.page_source # Returns HTML of the page's current state.
+<El> = <Drv/El>.find_element('xpath', <str>) # Accepts '//<tag>[@<attr_name>="<val>"]…'.
+<str> = <El>.get_attribute('<name>') # Returns attribute or property if exists.
+<El>.click/clear() # Also <El>.text and <El>.send_keys(<str>).
 ```
 
 #### XPath — also available in lxml, Scrapy, and browser's console via `'$x("<xpath>")'`:
 ```python
-<xpath> = //<element>[/ or // <element>] # /<child>, //<descendant>, /../<sibling>
-<xpath> = //<element>/following::<element> # Next element. Also preceding/parent/…
-<element> = <tag><conditions><index> # `<tag> = */a/…`, `<index> = [1/2/…]`.
-<condition> = [<sub_cond> [and/or <sub_cond>]] # For negation use `not(<sub_cond>)`.
-<sub_cond> = @<attr>[="<val>"] # `text()=`, `.=` match (complete) text.
-<sub_cond> = contains(@<attr>, "<val>") # Is <val> a substring of attr's value?
-<sub_cond> = [//]<element> # Has matching child? Descendant if //.
+<xpath> = //<element>[/ or // <element>] # E.g. …/child, …//descendant, …/../sibling.
+<xpath> = //<element>/following::<element> # Next element. Also preceding::, parent::.
+<element> = <tag><conditions><index> # Tag accepts */a/…. Use [1/2/…] for index.
+<condition> = [<sub_cond> [and/or <sub_cond>]] # Use not(<sub_cond>) to negate condition.
+<sub_cond> = @<attr>[="<val>"] # `text()=` and `.=` match (complete) text.
+<sub_cond> = contains(@<attr>, "<val>") # Is <val> a substring of attribute's value?
+<sub_cond> = [//]<element> # Has matching child? Descendant if //<el>.
 ```
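
A few of the XPath forms listed in this hunk can be tried without a browser. The sketch below uses the standard library's `xml.etree.ElementTree`, which implements only a subset of XPath, so it is a stand-in for Selenium or lxml and not what the README itself uses. The markup in `PAGE` is made up for illustration and is deliberately well-formed XML:

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed snippet; real pages usually need an HTML parser first.
PAGE = '''
<html><body>
  <table class="infobox">
    <tr><th>Website</th><td><a href="https://www.python.org">python.org</a></td></tr>
    <tr><th>License</th><td>PSF</td></tr>
  </table>
</body></html>
'''.strip()

root = ET.fromstring(PAGE)

# //<tag>[@<attr>="<val>"]: descendant with a matching attribute value.
# ElementTree paths are relative to an element, hence the leading dot.
table = root.find('.//table[@class="infobox"]')

# <tag>[<child>="<val>"]: rows whose <th> child has exactly this text.
row = table.find('./tr[th="Website"]')

# [@<attr>]: the attribute only has to exist.
link = row.find('.//a[@href]')

# <tag>[<index>]: indices start at 1, so this is the second row.
second = table.find('./tr[2]')

print(link.get('href'), second.find('th').text)
```

Forms such as `contains(…)` and the `following::` axis are not supported by ElementTree; they need a full XPath engine such as the ones the heading mentions.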
index.html

Lines changed: 22 additions & 23 deletions
@@ -56,7 +56,7 @@
 
 <body>
 <header>
-<aside>September 5, 2025</aside>
+<aside>September 7, 2025</aside>
 <a href="https://gto76.github.io" rel="author">Jure Šorn</a>
 </header>
 
@@ -2025,14 +2025,13 @@ <h3 id="format-2">Format</h3><div><h4 id="forstandardtypesizesandmanualalignment
 </code></pre></div></div>
 
 
-<div><h2 id="guiapp"><a href="#guiapp" name="guiapp">#</a>GUI App</h2><div><h4 id="aweightconverterguiapplication">A weight converter GUI application:</h4><pre><code class="python language-python hljs"><span class="hljs-comment"># $ pip3 install PySimpleGUI</span>
+<div><h2 id="guiapp"><a href="#guiapp" name="guiapp">#</a>GUI App</h2><div><h4 id="runsadesktopappforconvertingweightsfrommetricunitsintopounds">Runs a desktop app for converting weights from metric units into pounds:</h4><pre><code class="python language-python hljs"><span class="hljs-comment"># $ pip3 install PySimpleGUI</span>
 <span class="hljs-keyword">import</span> PySimpleGUI <span class="hljs-keyword">as</span> sg
 
 text_box = sg.Input(default_text=<span class="hljs-string">'100'</span>, enable_events=<span class="hljs-keyword">True</span>, key=<span class="hljs-string">'QUANTITY'</span>)
 dropdown = sg.InputCombo([<span class="hljs-string">'g'</span>, <span class="hljs-string">'kg'</span>, <span class="hljs-string">'t'</span>], <span class="hljs-string">'kg'</span>, readonly=<span class="hljs-keyword">True</span>, enable_events=<span class="hljs-keyword">True</span>, k=<span class="hljs-string">'UNIT'</span>)
 label = sg.Text(<span class="hljs-string">'100 kg is 220.462 lbs.'</span>, key=<span class="hljs-string">'OUTPUT'</span>)
-button = sg.Button(<span class="hljs-string">'Close'</span>)
-window = sg.Window(<span class="hljs-string">'Weight Converter'</span>, [[text_box, dropdown], [label], [button]])
+window = sg.Window(<span class="hljs-string">'Weight Converter'</span>, [[text_box, dropdown], [label], [sg.Button(<span class="hljs-string">'Close'</span>)]])
 
 <span class="hljs-keyword">while</span> <span class="hljs-keyword">True</span>:
     event, values = window.read()
@@ -2043,8 +2042,7 @@ <h3 id="format-2">Format</h3><div><h4 id="forstandardtypesizesandmanualalignment
     <span class="hljs-keyword">except</span> ValueError:
         <span class="hljs-keyword">continue</span>
     unit = values[<span class="hljs-string">'UNIT'</span>]
-    factors = {<span class="hljs-string">'g'</span>: <span class="hljs-number">0.001</span>, <span class="hljs-string">'kg'</span>: <span class="hljs-number">1</span>, <span class="hljs-string">'t'</span>: <span class="hljs-number">1000</span>}
-    lbs = quantity * factors[unit] / <span class="hljs-number">0.45359237</span>
+    lbs = quantity * {<span class="hljs-string">'g'</span>: <span class="hljs-number">0.001</span>, <span class="hljs-string">'kg'</span>: <span class="hljs-number">1</span>, <span class="hljs-string">'t'</span>: <span class="hljs-number">1000</span>}[unit] / <span class="hljs-number">0.45359237</span>
     window[<span class="hljs-string">'OUTPUT'</span>].update(value=<span class="hljs-string">f'<span class="hljs-subst">{quantity}</span> <span class="hljs-subst">{unit}</span> is <span class="hljs-subst">{lbs:g}</span> lbs.'</span>)
 window.close()
 </code></pre></div></div>
@@ -2053,7 +2051,8 @@ <h3 id="format-2">Format</h3><div><h4 id="forstandardtypesizesandmanualalignment
 <div><h2 id="scraping"><a href="#scraping" name="scraping">#</a>Scraping</h2><div><h4 id="scrapespythonsurlandlogofromitswikipediapage">Scrapes Python's URL and logo from its Wikipedia page:</h4><pre><code class="python language-python hljs"><span class="hljs-comment"># $ pip3 install requests beautifulsoup4</span>
 <span class="hljs-keyword">import</span> requests, bs4, os
 
-response = requests.get(<span class="hljs-string">'https://en.wikipedia.org/wiki/Python_(programming_language)'</span>)
+url = <span class="hljs-string">'https://en.wikipedia.org/wiki/Python_(programming_language)'</span>
+response = requests.get(url, headers={<span class="hljs-string">'User-Agent'</span>: <span class="hljs-string">'cpc-bot'</span>})
 document = bs4.BeautifulSoup(response.text, <span class="hljs-string">'html.parser'</span>)
 table = document.find(<span class="hljs-string">'table'</span>, class_=<span class="hljs-string">'infobox vevent'</span>)
 python_url = table.find(<span class="hljs-string">'th'</span>, text=<span class="hljs-string">'Website'</span>).next_sibling.a[<span class="hljs-string">'href'</span>]
@@ -2067,24 +2066,24 @@ <h3 id="format-2">Format</h3><div><h4 id="forstandardtypesizesandmanualalignment
 
 <div><h3 id="selenium">Selenium</h3><p><strong>Library for scraping websites with dynamic content.</strong></p><pre><code class="python language-python hljs"><span class="hljs-comment"># $ pip3 install selenium</span>
 <span class="hljs-keyword">from</span> selenium <span class="hljs-keyword">import</span> webdriver
-
-&lt;WebDrv&gt; = webdriver.Chrome/Firefox/Safari/Edge() <span class="hljs-comment"># Opens a browser. Also &lt;WebDrv&gt;.quit().</span>
-&lt;WebDrv&gt;.get(<span class="hljs-string">'&lt;url&gt;'</span>) <span class="hljs-comment"># Also &lt;WebDrv&gt;.implicitly_wait(seconds).</span>
-&lt;str&gt; = &lt;WebDrv&gt;.page_source <span class="hljs-comment"># Returns HTML of fully rendered page.</span>
-&lt;El&gt; = &lt;WebDrv/El&gt;.find_element(<span class="hljs-string">'css selector'</span>, …) <span class="hljs-comment"># '&lt;tag&gt;#&lt;id&gt;.&lt;class&gt;[&lt;attr&gt;="&lt;val&gt;"]…'.</span>
-&lt;list&gt; = &lt;WebDrv/El&gt;.find_elements(<span class="hljs-string">'xpath'</span>, …) <span class="hljs-comment"># '//&lt;tag&gt;[@&lt;attr&gt;="&lt;val&gt;"]…'. See XPath.</span>
-&lt;str&gt; = &lt;El&gt;.get_attribute(&lt;str&gt;) <span class="hljs-comment"># Property if exists. Also &lt;El&gt;.text.</span>
-&lt;El&gt;.click/clear() <span class="hljs-comment"># Also &lt;El&gt;.send_keys(&lt;str&gt;).</span>
 </code></pre></div>
 
 
-<div><h4 id="xpathalsoavailableinlxmlscrapyandbrowsersconsoleviadxxpath">XPath — also available in lxml, Scrapy, and browser's console via <code class="python hljs"><span class="hljs-string">'$x("&lt;xpath&gt;")'</span></code>:</h4><pre><code class="python language-python hljs">&lt;xpath&gt; = //&lt;element&gt;[/ <span class="hljs-keyword">or</span> // &lt;element&gt;] <span class="hljs-comment"># /&lt;child&gt;, //&lt;descendant&gt;, /../&lt;sibling&gt;</span>
-&lt;xpath&gt; = //&lt;element&gt;/following::&lt;element&gt; <span class="hljs-comment"># Next element. Also preceding/parent/…</span>
-&lt;element&gt; = &lt;tag&gt;&lt;conditions&gt;&lt;index&gt; <span class="hljs-comment"># `&lt;tag&gt; = */a/…`, `&lt;index&gt; = [1/2/…]`.</span>
-&lt;condition&gt; = [&lt;sub_cond&gt; [<span class="hljs-keyword">and</span>/<span class="hljs-keyword">or</span> &lt;sub_cond&gt;]] <span class="hljs-comment"># For negation use `not(&lt;sub_cond&gt;)`.</span>
-&lt;sub_cond&gt; = @&lt;attr&gt;[=<span class="hljs-string">"&lt;val&gt;"</span>] <span class="hljs-comment"># `text()=`, `.=` match (complete) text.</span>
-&lt;sub_cond&gt; = contains(@&lt;attr&gt;, <span class="hljs-string">"&lt;val&gt;"</span>) <span class="hljs-comment"># Is &lt;val&gt; a substring of attr's value?</span>
-&lt;sub_cond&gt; = [//]&lt;element&gt; <span class="hljs-comment"># Has matching child? Descendant if //.</span>
+<pre><code class="python language-python hljs">&lt;Drv&gt; = webdriver.Chrome/Firefox/Safari/Edge() <span class="hljs-comment"># Opens the browser. Also &lt;Driver&gt;.quit().</span>
+&lt;Drv&gt;.implicitly_wait(seconds) <span class="hljs-comment"># Sets timeout for find_element/s() methods.</span>
+&lt;Drv&gt;.get(<span class="hljs-string">'&lt;url&gt;'</span>) <span class="hljs-comment"># Blocks until browser fires the load event.</span>
+&lt;str&gt; = &lt;Drv&gt;.page_source <span class="hljs-comment"># Returns HTML of the page's current state.</span>
+&lt;El&gt; = &lt;Drv/El&gt;.find_element(<span class="hljs-string">'xpath'</span>, &lt;str&gt;) <span class="hljs-comment"># Accepts '//&lt;tag&gt;[@&lt;attr_name&gt;="&lt;val&gt;"]…'.</span>
+&lt;str&gt; = &lt;El&gt;.get_attribute(<span class="hljs-string">'&lt;name&gt;'</span>) <span class="hljs-comment"># Returns attribute or property if exists.</span>
+&lt;El&gt;.click/clear() <span class="hljs-comment"># Also &lt;El&gt;.text and &lt;El&gt;.send_keys(&lt;str&gt;).</span>
+</code></pre>
+<div><h4 id="xpathalsoavailableinlxmlscrapyandbrowsersconsoleviadxxpath">XPath — also available in lxml, Scrapy, and browser's console via <code class="python hljs"><span class="hljs-string">'$x("&lt;xpath&gt;")'</span></code>:</h4><pre><code class="python language-python hljs">&lt;xpath&gt; = //&lt;element&gt;[/ <span class="hljs-keyword">or</span> // &lt;element&gt;] <span class="hljs-comment"># E.g. …/child, …//descendant, …/../sibling.</span>
+&lt;xpath&gt; = //&lt;element&gt;/following::&lt;element&gt; <span class="hljs-comment"># Next element. Also preceding::, parent::.</span>
+&lt;element&gt; = &lt;tag&gt;&lt;conditions&gt;&lt;index&gt; <span class="hljs-comment"># Tag accepts */a/…. Use [1/2/…] for index.</span>
+&lt;condition&gt; = [&lt;sub_cond&gt; [<span class="hljs-keyword">and</span>/<span class="hljs-keyword">or</span> &lt;sub_cond&gt;]] <span class="hljs-comment"># Use not(&lt;sub_cond&gt;) to negate condition.</span>
+&lt;sub_cond&gt; = @&lt;attr&gt;[=<span class="hljs-string">"&lt;val&gt;"</span>] <span class="hljs-comment"># `text()=` and `.=` match (complete) text.</span>
+&lt;sub_cond&gt; = contains(@&lt;attr&gt;, <span class="hljs-string">"&lt;val&gt;"</span>) <span class="hljs-comment"># Is &lt;val&gt; a substring of attribute's value?</span>
+&lt;sub_cond&gt; = [//]&lt;element&gt; <span class="hljs-comment"># Has matching child? Descendant if //&lt;el&gt;.</span>
 </code></pre></div>
 
 <div><h2 id="webapp"><a href="#webapp" name="webapp">#</a>Web App</h2><p><strong>Flask is a micro web framework/server. If you just want to open a html file in a web browser use <code class="python hljs"><span class="hljs-string">'webbrowser.open(&lt;path&gt;)'</span></code> instead.</strong></p><pre><code class="python language-python hljs"><span class="hljs-comment"># $ pip3 install flask</span>
@@ -2934,7 +2933,7 @@ <h3 id="format-2">Format</h3><div><h4 id="forstandardtypesizesandmanualalignment
 
 
 <footer>
-<aside>September 5, 2025</aside>
+<aside>September 7, 2025</aside>
 <a href="https://gto76.github.io" rel="author">Jure Šorn</a>
 </footer>
 